CPExpert 2010

An Expert System designed to evaluate IBM z/OS systems
©Copyright 1998-2010, Computer Management Sciences, Inc., Alexandria, VA
www.cpexpert.com
Product Overview
• Helps analyze performance of z/OS systems.
• Written in SAS (only SAS/BASE is required).
• Runs as a batch job on mainframe (or on PC).
• Processes data in a standard performance data base (MXG, SAS/ITRM, or MICS); a minimal sketch follows this list.
• Produces narrative reports presenting the results of the analysis.
• Product is updated every six months
• 45-day trial is available (see license agreement
for details).
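A minimal SAS/BASE sketch of reading such a performance data base follows (referenced from the list above). This is not CPExpert code; the PDB path and the MXG data set name TYPE72GO are illustrative assumptions:

   * Hypothetical sketch only -- not CPExpert source code ;
   * The PDB path and the MXG data set name TYPE72GO are illustrative ;
   libname pdb 'YOUR.MXG.PDB' access=readonly;

   proc contents data=pdb.type72go;   /* RMF 72.3 service class period data */
   run;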
Components Delivered
• SRM Component *   (March 1991)
• TSO Component *   (April 1991)
• MVS Component *   (June 1991)
  * These legacy components apply only to Compatibility Mode
• DASD Component    (October 1991)
• CICS Component    (May 1992)
• WLM Component     (April 1995)
• DB2 Component     (October 1999)
• WMQ Component     (June 2004)
Product Documentation
Each component has an extensive User Manual, available in hard copy,
on CD, or in web-enabled form
• Describes the likely impact of each finding
• Discusses the performance issues associated
with each finding
• Suggests ways to improve performance and
describes alternative solutions
• Provides specific references to IBM or other
documents relating to the findings
• More than 4,000 pages for all components
WLM Component
• Checks for problems in service definition
• Identifies reasons performance goals were missed
• Analyzes general system problems:
• Coupling facility/XCF
• Paging subsystem
• System logger
• WLM-managed initiators
• Excessive CPU use by SYSTEM or SYSSTC
• IFA/zAAP, zIIP, and IOP/SAP processors
• PR/SM, LPAR, and HiperDispatch problems
• Intelligent Resource Director (IRD) problems
WLM Component - sample report
RULE WLM103: SERVICE CLASS DID NOT ACHIEVE VELOCITY GOAL
DB2HIGH (Period 1): Service class did not achieve its velocity goal
during the measurement intervals shown below. The velocity goal was
50% execution velocity, with an importance level of 2. The '% USING'
and '%TOTAL DELAY' percentages are computed as a function of the average
address space ACTIVE time. The 'PRIMARY,SECONDARY CAUSES OF DELAY'
are computed as a function of the execution delay samples on the local
system.
                             ----------LOCAL SYSTEM----------
                           %     % TOTAL   EXEC   PERF   PLEX   PRIMARY,SECONDARY
MEASUREMENT INTERVAL     USING    DELAY    VELOC  INDX    PI    CAUSES OF DELAY
21:15-21:30,08SEP1998     16.6     83.4     17%   3.02   2.36   DASD DELAY(99%)
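For clarity, the execution velocity and performance index shown above follow directly from the using and delay percentages; the sketch below reproduces them with the standard WLM arithmetic (this is illustrative, not CPExpert code):

   data _null_;
      using = 16.6;                              /* '% USING' from the report               */
      delay = 83.4;                              /* '% TOTAL DELAY' from the report         */
      goal  = 50;                                /* velocity goal for DB2HIGH               */
      velocity = 100 * using / (using + delay);  /* = 16.6, reported (rounded) as 17%       */
      pi = goal / velocity;                      /* = 3.01, vs 3.02 from unrounded samples  */
      put velocity= pi=;
   run;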
RULE WLM361: NON-PAGING DASD I/O ACTIVITY CAUSED SIGNIFICANT DELAYS
DB2HIGH (Period 1): A significant part of the delay to the service
class can be attributed to non-paging DASD I/O delay. The below data
shows intervals when non-paging DASD delay caused DB2HIGH to miss its
performance goal:
                        AVG DASD   AVG DASD   ---AVERAGE DASD I/O TIMES---
MEASUREMENT INTERVAL    I/O RATE   USING/SEC   RESP    WAIT    DISC    CONN
21:15-21:30,08SEP1998      31        1.405    0.010   0.003   0.004   0.002
WLM Component - sample report
RULE WLM601: TRANSPORT CLASS MAY NEED TO BE SPLIT
You should consider whether the DEFAULT transport class should be split.
A large percentage of the messages were too small, while a significant
percentage of messages were too large. Storage is wasted when buffers
are used by messages that are too small, while unnecessary overhead is
incurred when XCF must expand the buffers to fit a message. The CLASSLEN
parameter establishes the size of each message buffer, and the CLASSLEN
parameter was specified as 16,316 for this transport class.
This finding applies to the following RMF measurement intervals:
                        SENT    SMALL     MESSAGES   MESSAGES    TOTAL
MEASUREMENT INTERVAL     TO    MESSAGES   THAT FIT   TOO BIG    MESSAGES
10:00-10:30,26MAR1996   JA0      4,296        0          57       4,353
12:00-12:30,26MAR1996   Z0       2,653        6         762       3,421
12:30-13:00,26MAR1996   Z0       2,017        0         109       2,126
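As a rough illustration of why the finding is signalled, the percentages behind the first interval above can be reproduced as follows (a sketch, not CPExpert code; the small/too-big classification relative to CLASSLEN is as described in the report text):

   data _null_;
      classlen = 16316;                     /* CLASSLEN for the DEFAULT transport class */
      small = 4296; fit = 0; toobig = 57;   /* first interval, messages sent to JA0     */
      total = small + fit + toobig;         /* = 4,353                                  */
      pct_small  = 100 * small  / total;    /* about 98.7% waste buffer space           */
      pct_toobig = 100 * toobig / total;    /* about 1.3% force XCF buffer expansion    */
      put pct_small= pct_toobig=;
   run;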
RULE WLM316: PEAK BLOCKED WORK WAS MORE THAN GUIDANCE
The SMF statistics showed that blocked workload waited longer than
specified by the BLWLINTHD parameter in IEAOPTxx. More than 2 address
spaces and enclaves were concurrently blocked at the peak during
the interval.
                        BLWLINTHD   BLWLTRPCT   --BLOCKED WORKLOAD--
MEASUREMENT INTERVAL    IN IEAOPT   IN IEAOPT    AVERAGE      PEAK
7:14- 7:29,01OCT2010        20          5         0.002        63
7:29- 7:44,01OCT2010        20          5         0.000        22
7:44- 7:59,01OCT2010        20          5         0.001        49
7:59- 8:14,01OCT2010        20          5         0.001        63
8:14- 8:29,01OCT2010        20          5         0.002        62
WLM Component - sample report
RULE WLM893: LOGICAL PROCESSORS IN LPAR HAD SKEWED ACCESS TO CAPACITY
LPAR SYSC: HiperDispatch was specified for one or more LPARs in
this CPC, and at least one LPAR used one or more high polarity
central processors. LPAR SYSC was not operating in HiperDispatch
Management Mode, and experienced skewed access to physical processors
because of the high polarity and medium polarity processors used by
LPARs running in HiperDispatch Management Mode. The information below
shows the number of logical processors assigned to LPAR SYSC and each
logical processor's share of a physical processor. The CPU activity
skew is shown for each RMF interval, giving the minimum, average, and
maximum CPU busy for the logical processors assigned to LPAR SYSC.
                        LOGICAL CPUS                % PHYSICAL CPU ACTIVITY SKEW
MEASUREMENT INTERVAL      ASSIGNED      CPU SHARE     MIN       AVG       MAX
13:59-14:14,15SEP2009         2            45.5       28.2      43.3      58.4
RULE WLM537: ZAAP-ELIGIBLE WORK HAD HIGH GOAL IMPORTANCE
Rule WLM530 or Rule WLM535 was produced for this system, indicating
that a relatively large amount of zAAP-eligible work was processed on a
central processor. One possible cause of this situation is that the
zAAP-eligible work was assigned a relatively high Goal Importance (the
Goal Importance was either Importance 1 or Importance 2). Please see
the discussion in the WLM Component User Manual for an explanation of
this issue.
DB2 Component
• Analyzes standard DB2 interval statistics
• Applies analysis from DB2 Administration Guide
and DB2 Performance Guide (with DB2 9.1)
• Analyzes DB2 Versions 3, 4, 5, 6, 7, 8, and 9
• Evaluates overall DB2 constraints, buffer pools,
EDM pool, RID list processing, Lock Manager,
Log Manager, DDF, and data sharing
• All analysis can be tailored to your site!
DB2 Component
Typical DB2 local buffer constraints
• There might be insufficient buffers for work files
• There were insufficient buffers for work files in merge passes
• Buffer pool was full
• Hiperpool read requests failed (pages stolen by system)
• Hiperpool write requests failed (expanded storage not available)
• Buffer pool page fault rate was high
• Data Management Threshold (DMTH) was reached
• DWQT and VDWQT might be too large
• DWQT, VDWQT, or VPSEQT might be too small
DB2 Component
Typical DB2 I/O prefetch constraints
• Sequential prefetch was disabled, buffer shortage
• Sequential prefetch was disabled, unavailable read engine
• Sequential prefetch not scheduled, prefetch quantity = 0
• Synchronous read I/O and sequential prefetch was high
• Dynamic sequential prefetch was high (before DB2 8.1)
• Synchronous read I/O was high
DB2 Component
Typical DB2 parallel processing constraints
• Parallel groups fell back to sequential mode
• Parallel groups reduced due to buffer shortage
• Prefetch quantity reduced to one-half of normal
• Prefetch quantity reduced to one-quarter of normal
• Prefetch I/O streams were denied, shortage of buffers
• Page requested for a parallel query was unavailable
DB2 Component
Typical DB2 EDM pool constraints
• Failures were caused by full EDM pool
• Low percent of DBDs found in EDM pool
• Low percent of CT Sections found in EDM pool
• Low percent of PT Sections found in EDM pool
• Size of EDM pool could be reduced
• Excessive Class 24 (EDM LRU) latch contention
DB2 Component
Typical DB2 Lock Manager constraints
• Work was suspended because of lock conflict
• Locks were escalated to shared mode
• Locks were escalated to exclusive mode
• Lock escalation was not effective
• Work was suspended for longer than time-out value
• Deadlocks were detected
DB2 Component
Typical DB2 Log Manager constraints
• Archive log read allocations exceeded guidance
• Archive log write allocations exceeded guidance
• Waits were caused by unavailable output log buffer
• Log reads satisfied from active log data set
• Log reads were satisfied from archive log data set
• Failed look-ahead tape mounts
DB2 Component
Typical DB2 Data Sharing constraints
• Group buffer pool is too small
• Incorrect directory entry/data entry ratio
• Directory reclaims resulting in cross-invalidations
• Castout processing occurring in “spurts”
• Excessive lock contention or false lock contention
• GBPCACHE ALL inappropriately specified
• GBPCACHE CHANGED inappropriately specified
• Conflicts between applications
DB2 Component - sample report
RULE DB2-208: VIRTUAL BUFFER POOL WAS FULL
Buffer Pool 2: A usable buffer could not be located in virtual Buffer
Pool 2, because the virtual buffer pool was full. This condition
should not normally occur, as there should be ample buffers. You
should consider using the -ALTER BUFFERPOOL command to increase the
virtual buffer pool size (VPSIZE) for the virtual buffer pool. This
situation occurred during the intervals shown below:
                           BUFFERS     NUMBER OF TIMES
MEASUREMENT INTERVAL      ALLOCATED    POOL WAS FULL
10:54-11:24, 15SEP1999       100             12
11:24-11:54, 15SEP1999       100             13
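The command the report refers to is the DB2 -ALTER BUFFERPOOL command; a hedged example follows (the buffer pool name matches the report, but the VPSIZE value is purely illustrative and should come from your own analysis):

   -ALTER BUFFERPOOL(BP2) VPSIZE(500)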
RULE DB2-216: BUFFER POOLS MIGHT BE TOO LARGE
Buffer Pool 1: The page fault rates for read and write I/O indicated
that the buffer pools might be too large for the available processor
storage. This situation occurred for Buffer Pool 1 during the intervals
shown below:
                           BUFFERS    PAGE-IN FOR   PAGE-IN FOR   PAGE
MEASUREMENT INTERVAL      ALLOCATED    READ I/O      WRITE I/O    RATE
11:15-11:45, 16SEP1999      25,000       36,904          195      41.2
11:45-12:15, 16SEP1999      25,000       30,892          563      35.0
12:45-13:15, 16SEP1999      25,000       23,890          170      26.7
DB2 Component - sample report
RULE DB2-230: SEQUENTIAL PREFETCH WAS DISABLED - BUFFER SHORTAGE
Buffer Pool BP1: Sequential prefetch is disabled when there is a buffer
shortage, as controlled by the Sequential Prefetch Threshold (SPTH).
Ideally, sequential prefetch should not be disabled, since performance
is adversely affected. If sequential prefetch is disabled a large
number of times, the buffer pool size might be too small. The
sequential prefetch threshold was reached for Buffer Pool BP1 during
the intervals shown below.
                           BUFFERS    TIMES SEQUENTIAL PREFETCH
MEASUREMENT INTERVAL      ALLOCATED   DISABLED (BUFFER SHORTAGE)
5:00- 5:15, 15MAY2009      268,000          125   BP1
5:15- 5:30, 15MAY2009      268,000        1,533   BP1
RULE DB2-234: WRITE ENGINES WERE NOT AVAILABLE FOR ASYNCHRONOUS I/O
Buffer Pool BP13: DB2 has 600 deferred write engines available for
asynchronous I/O operations. When all 600 write engines are used,
synchronous writes are performed. The application is suspended during
synchronous writes, and performance is adversely affected. This
situation occurred for Buffer Pool BP13 during the intervals shown below:
                           BUFFERS    TIMES WRITE ENGINES
MEASUREMENT INTERVAL      ALLOCATED   WERE NOT AVAILABLE
5:45- 6:00, 15MAY2009       12,800          44   BP13
DB2 Component - sample report
RULE DB2-423: DATABASE ACCESS THREAD WAS QUEUED, ZPARM LIMIT WAS REACHED
Database access threads were queued because the ZPARM maximum for active
remote threads was reached. You should consider increasing the maximum
number of database access threads allowed. This situation occurred
during the intervals shown below:
                          DATABASE ACCESS THREADS QUEUED,
MEASUREMENT INTERVAL          ZPARM LIMIT REACHED
11:24-11:54, 01OCT2010                 9
RULE DB2-512: LOG READS WERE SATISFIED FROM ACTIVE LOG DATA SET
The DB2 Log Manager statistics revealed that more than 25% of the
log reads were satisfied from the active log data set. It is
preferable that the data be in the output buffer, but this is not
always possible with an active DB2 environment. However, if a large
percent of reads are satisfied from the active log, you should ensure
that the output buffer is as large as possible. This finding occurred
during the intervals shown below:
                          TOTAL LOG    LOG READS FROM
MEASUREMENT INTERVAL        READS    ACTIVE LOG DATA SET   PERCENT
14:24-14:54, 01OCT2010      6,554          4,678             71.4
14:54-15:24, 01OCT2010      7,274          3,695             50.8
DB2 Component - sample report
RULE DB2-601: COUPLING FACILITY READ REQUESTS COULD NOT COMPLETE
Group Buffer Pool 6: Coupling facility read requests could not be
completed because of a lack of coupling facility storage resources.
This situation occurred for Group Buffer Pool 6 during the intervals
shown below:
                          GROUP BUFFER POOL      TIMES CF READ
MEASUREMENT INTERVAL       ALLOCATED SIZE     REQUESTS NOT COMPLETE
11:01-11:31, 14OCT1999           38M                  130
RULE DB2-610: GBPCACHE(NO) OR GBPCACHE NONE MIGHT BE APPROPRIATE
Group Buffer Pool 4: This buffer pool had a very small amount of read
activity relative to write activity. Pages read were less than 1% of
the pages written. Since so few pages were read from this group buffer
pool, you should consider specifying GBPCACHE(NO) for the group buffer
pool or specifying GBPCACHE NONE for the page sets using the group
buffer pool. This situation occurred for Group Buffer Pool 4 during
the intervals shown below:
                          GROUP BUFFER POOL    PAGES    PAGES     READ
MEASUREMENT INTERVAL       ALLOCATED SIZE       READ   WRITTEN   PERCENT
10:34-11:04, 14OCT1999           38M             14     18,268    0.07%
CICS Component
• Processes CICS Interval Statistics contained in
MXG Performance Data Base (standard SMF 110)
• Analyzes all releases of CICS (CICS/ESA,
CICS/TS for OS/390, and CICS/TS for z/OS)
• Applies most analysis techniques contained in
IBM’s CICS Performance Guides
• Produces specific suggestions for improving
CICS performance
CICS Component
(Major areas analyzed)
• Virtual and real storage (MXT/AMXT/TCLASS)
• VSAM and File Control (NSR and LSR pools)
• Database management (DL/I, IMS, DB2)
• Journaling (System and User journals)
• Network and VTAM (RAPOOL, RAMAX)
• CICS Facilities (temp storage, transient data)
• ISC/IRC (MRO, LU6.1, LU6.2 modegroups)
• System logger
• Temporary Storage
• Coupling Facility Data Tables (CFDT)
• CICS-DB2 Interface
• Open TCB pools
• TCP/IP and SSL
CICS Component - sample report
RULE CIC101: CICS REACHED MAXIMUM TASKS TOO OFTEN
The CICS statistics revealed that the number of attached tasks was
restricted by the MXT operand, but storage did not appear to be
constrained. CPExpert suggests that you consider increasing the MXT
value in the System Initialization Table (SIT) for this region.
This finding applies to the following CICS statistics intervals:
STATISTICS                      MXT   -PEAK TASKS-   TIMES    PEAK       TIME
COLLECTION TIME     APPLID     VALUE  TOTAL   USER  MAXTASK  MAXTASK   WAITING
                                                    REACHED   QUEUE    MAXTASK
0:00,01OCT2010      CICSIDG.     20     46     20      36       8     0:02:29.0
RULE CIC140: THE NUMBER OF TRANSACTION ERRORS IS HIGH
The CICS statistics revealed that more than 5 transaction errors
were related to terminals. These transaction errors may indicate
that there is an attempted security breach, there may be problems
with the terminal, or perhaps additional operator training is
indicated. This finding applies to the following CICS statistics intervals:
STATISTICS
COLLECTION TIME      APPLID      TERMINAL   NUMBER OF ERRORS
0:00,01OCT2010       CICSPROD    T2M1             348
0:00,01OCT2010       CICSPROD    T2M2              60
0:00,01OCT2010       CICSPROD    T2M6             348
CICS Component - sample report
RULE CIC170: MORE THAN ONE STRING SPECIFIED FOR WRITE-ONLY ESDS FILE
More than one string was specified for a VSAM ESDS file that was used
exclusively for write operations. Specifying more than one string can
significantly degrade performance because of the exclusive control conflicts
that can occur. If this finding occurs for all normal CICS processing,
you should consider specifying only one string in the ESDS file
definition.
STATISTICS                                      NUMBER OF
COLLECTION TIME     APPLID    VSAM FILE     WRITE OPERATIONS
0:00,16MAR2010      CICSYA    LNTEMSTR           431,436
CICS Component - sample report
RULE CIC267: INSUFFICIENT SESSIONS MAY HAVE BEEN DEFINED
CPExpert believes that an insufficient number of sessions may have been
defined for the CICS DAL1 connection, or the application system could
have been issuing ALLOCATE requests too often. The number of ALLOCATE
requests returned was greater than the value specified for the ALLOCQ
guidance variable in USOURCE(CICGUIDE). CPExpert suggests you consider
increasing the number of sessions defined for the connection, or you
should increase the ALLOCQ guidance variable to cause CPExpert to signal
a potential problem only when you view the problem as serious. For APPC
modegroups, this finding applies only to generic ALLOCATE requests.
This finding applies to the following CICS statistics intervals:
STATISTICS                        ALLOCATE REQUESTS
COLLECTION TIME      APPLID       RETURNED TO USERS
10:00,26MAR2008      CICSDTL1            335
11:00,26MAR2008      CICSDTL1             12
12:00,26MAR2008      CICSDTL1             27
CICS Component - sample report
RULE CIC307: FREQUENT LOG STREAM DASD-SHIFTS OCCURRED
CICS75.A075CICS.DFHLOG: More than 1 log stream DASD-shift was initiated
for this log stream during the intervals shown below. A DASD-shift
event occurs when system logger determines that a log stream must stop
writing to one log data set and start writing to a different data set.
You normally should allocate sufficiently large log data sets so that
a DASD-shift occurs infrequently.
                      ---NUMBER OF DASD LOG SHIFTS---
SMF INTERVAL          DURING INTERVAL   DURING PAST HOUR
14:45,16MAR2010              1                  2
RULE CIC650: CICS EVENT PROCESSING WAS DISABLED IN CICS EVENTBINDING
Event Processing was disabled in EVENTBINDING, with the result that
events defined in the EVENTBINDING were not captured by CICS Event
Processing. You should investigate the Event Binding to determine
whether the Binding should be enabled or disabled for the region.
This finding applies to the following CICS statistics intervals:
STATISTICS
COLLECTION TIME
0:00,12MAR2009
3:00,12MAR2009
6:00,12MAR2009
DASD Component
• Processes SMF Type 70-series records to automatically
build a model of your I/O configuration.
• Identifies performance problems with the devices
that have the most potential for improvement.
• PEND delays
• Disconnect delays
• Connect delays
• IOSQ delays
• Shared DASD conflicts
• Analyzes SMF Type 42(DS) and Type 64 to
identify VSAM performance problems.
DASD Component - sample report
RULE DAS100: VOLUME WITH WORST OVERALL PERFORMANCE
VOLSER DB2327 (device 2A1F) had the worst overall performance during
the entire measurement period (10:00, 16FEB2001 to 11:00, 16FEB2001).
This volume had an overall average of 56.8 I/O operations per second,
was busy processing I/O for an average of 361% of the time, and had I/O
operations queued for an average of 1% of the time. Please note that
percentages greater than 100% and Average Per Second Delays greater
than 1 indicate that multiple I/O operations were concurrently delayed.
This can happen, for example, if multiple I/O operations were queued or
if multiple I/O operations were PENDing. The following summarizes
significant performance characteristics of VOLSER DB2327:
                          I/O    ----- AVERAGE PER SECOND DELAYS -----    MAJOR
MEASUREMENT INTERVAL      RATE    RESP    CONN    DISC    PEND    IOSQ    PROBLEM
10:00-10:30,16FEB2001     59.1   1.308   0.316   0.004   0.988   0.000   PEND TIME
10:30-11:00,16FEB2001     57.2   3.792   0.300   0.004   3.483   0.006   PEND TIME
11:00-11:30,16FEB2001     54.2   5.769   0.279   0.004   5.464   0.023   PEND TIME
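As a cross-check on the table above, RESP is simply the sum of the four delay components; the sketch below reproduces the first interval (standard RMF arithmetic, not CPExpert code):

   data _null_;
      conn = 0.316; disc = 0.004; pend = 0.988; iosq = 0.000;   /* 10:00-10:30 interval */
      resp = conn + disc + pend + iosq;                         /* = 1.308, as reported */
      put resp=;
   run;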
DASD Component - sample report
RULE DAS130: PEND TIME WAS MAJOR CAUSE OF I/O DELAY
A major cause of the I/O delay with VOLSER DB2327 was PEND time. The
average per-second PEND delay for I/O is shown below:
                         PEND      PEND      PEND     PEND     PEND    TOTAL
MEASUREMENT INTERVAL     CHAN    DIR PORT   CONTROL  DEVICE    OTHER   PEND
10:00-10:30,16FEB2001    0.492    0.000      0.000    0.000    0.495   0.988
10:30-11:00,16FEB2001    1.927    0.000      0.000    0.000    1.556   3.483
11:00-11:30,16FEB2001    2.840    0.000      0.000    0.000    2.624   5.464
RULE DAS160: DISCONNECT TIME WAS MAJOR CAUSE OF I/O DELAY
A major cause of the I/O delay with VOLSER DB26380 was DISCONNECT time.
DISC time for modern systems is a result of cache read miss operations,
potentially back-end staging delay for cache write operations,
peer-to-peer remote copy (PPRC) operations, and other miscellaneous
reasons.
                                        --PERCENT CACHE--    DASD TO   CACHE TO
MEASUREMENT INTERVAL    READS   WRITES  READ HITS WRITE HITS  CACHE      DASD    PPRC   BPCR   ICLR
8:30- 8:45,22OCT2001    14615     932     19.2      100.0     11825       903      0      0      0
8:45- 9:00,22OCT2001    14570     921     20.7      100.0     11567       907      0      0      0
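One plausible reading of the cache columns above (an assumption for illustration, not documented CPExpert logic) is that read hits are approximately total reads minus the pages staged from DASD to cache:

   data _null_;
      reads  = 14615;                                   /* 8:30-8:45 interval              */
      staged = 11825;                                   /* DASD TO CACHE                   */
      read_hit_pct = 100 * (reads - staged) / reads;    /* about 19.1%, vs 19.2% reported  */
      put read_hit_pct=;
   run;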
DASD Component - sample report
RULE DAS300: PERHAPS SHARED DASD CONFLICTS CAUSED PERFORMANCE PROBLEMS
Accessing conflicts caused by sharing VOLSER DB2700 between systems
might have caused performance problems for the device during the
measurement intervals shown below. Conflicting systems had the
indicated I/O rate, average CONN time per second, average DISC time
per second, average PEND time per second, and average RESERVE time
to the device. Even moderate CONN, DISC, or RESERVE can cause delays
to shared devices.
..
                       I/O    MAJOR     OTHER    -------OTHER SYSTEM DATA-------
MEASUREMENT INTERVAL   RATE   PROBLEM   SYSTEM   I/O RATE   CONN    DISC    PEND    RESV
8:30- 8:45,22OCT2001   31.3   QUEUING   SY1        35.0    0.041   0.001   0.455   0.000
                                        SY2        88.2    0.100   0.003   0.714   0.000
                                        SY3       109.0    0.123   0.003   0.723   0.000
                                        TOTAL     232.2    0.264   0.006   1.892   0.000
8:45- 9:00,22OCT2001   25.7   QUEUING   SY1        46.4    0.054   0.001   0.565   0.000
                                        SY2        98.2    0.112   0.003   0.836   0.000
                                        SY3       119.0    0.136   0.003   0.846   0.000
                                        TOTAL     263.5    0.303   0.007   2.247   0.000
DASD Component - sample report
RULE DAS607: VSAM DATA SET IS CLOSE TO MAXIMUM NUMBER OF EXTENTS
VOLSER: RLS003. More than 225 extents were allocated for the VSAM data
sets listed below. The VSAM data sets are approaching the maximum
number of extents allowed. The below shows the number of extents
and the primary and secondary space allocation:
                                                                          TOTAL    EXTENTS   ---ALLOCATIONS---
SMF TIME STAMP   JOB NAME  VSAM DATA SET                                 EXTENTS  THIS OPEN  PRIMARY  SECONDARY
..
10:30,11MAR2002  CICS2ABA  RLSADSW.VF01D.DATAENDB.DATA.................    229        4      30 CYL    1 CYL
RULE DAS625: NSR WAS USED, BUT LARGE PERCENT OF ACCESS WAS DIRECT
VOLSER: MVS902. Non-Shared resources (NSR) was specified as the
buffering technique for the below VSAM data sets, but more than 75%
of the I/O activity was direct access. NSR is not designed for direct
access, and many of the advantages of NSR are not available for direct
access. You should consider Local Shared Resources (LSR) for the below
VSAM data sets (perhaps using System Managed Buffers to facilitate the
use of LSR). The I/O RATE is for the time the data set was open. The
SMF TIME STAMP and JOB NAME are from the last record for the data set.
                                                                          I/O    OPEN      --ACCESS TYPE (PCT)--
SMF TIME STAMP   JOB NAME  VSAM DATA SET                                  RATE   DURATION  SEQUENTIAL   DIRECT
..
13:19,19SEP2002  NRXX807.  SDPDPA.PK.MVSP.RT.NDMGIX.DATA...............    8.4   0:07:08      0.0       100.0
13:19,19SEP2002  NRXX807.  SDPDPA.PR.MVSP.RT.NDMGIXD.DATA..............   11.2   0:06:42      0.0       100.0
13:33,19SEP2002  TSJHM...  SDPDPA.PR.MVSP.RT.NDMRQFDA.DATA.............    0.3   2:21:58      0.0       100.0
13:33,19SEP2002  TSJHM...  SDPDPA.PR.MVSP.RT.NDMRQF.DATA...............    2.8   3:37:53      0.0       100.0
13:33,19SEP2002  TSJHM...  SDPDPA.PK.MVSP.RT.NDMTCF.DATA...............   11.1   6:24:10      0.1        99.9
DASD Component
(Application Analysis)
• Requires simple modification to MXG or MICS
• Modification collects job step data while
processing SMF Type 30 (Interval) records
• Typically requires less than 10 cylinders
• Data is correlated with Type 74 information
• CPExpert associates performance problems with
specific applications (jobs and job steps)
• CPExpert can perform “Loved one” analysis of
DASD performance problems
WMQ Component
Analyzes SMF Type 115 statistics, as processed
by MXG or MICS and placed into performance
data base.
• MQMLOG   - Log manager statistics
• MQMMSGDM - Message/data manager statistics
• MQMBUFER - Buffer manager statistics
• MQMCFMGR - Coupling Facility manager statistics
Type 115 records should be synchronized with the SMF recording interval.
IBM says the overhead to collect these statistics is negligible.
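A minimal sketch of pulling one of the MXG data sets named above out of the performance data base follows; this is not CPExpert code, and the library path is an illustrative assumption:

   * Hypothetical sketch only -- not CPExpert source code ;
   libname pdb 'YOUR.MXG.PDB' access=readonly;

   proc print data=pdb.mqmbufer (obs=10);   /* browse buffer manager statistics */
   run;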
WMQ Component
Optionally analyzes SMF Type 116 accounting
data, as processed by MXG or MICS and
placed into performance data base.
• MQMACCTQ - Thread-level accounting data
• MQMQUEUE - Queue-level accounting data
Type 116 records should be synchronized with the SMF recording interval.
IBM says the overhead to collect accounting data is 5-10%.
WebSphere MQ
Typical queue manager problems
Assignment of queues to page sets
Assignment of page sets to buffer pools
Queue manager parameters
Index characteristics of queues
Characteristics of messages in queues
Characteristics of MQ calls
CPExpert analysis uses SMF Type 116 records
WebSphere MQ
Typical buffer manager problems
Buffer thresholds exceeded for pool
Buffers assigned per pool (too few/too many)
Message traffic
Message characteristics
Application design
CPExpert analysis uses SMF Type 115 records
WebSphere MQ
Typical log manager problems
Log buffers assigned
Active log use characteristics
Archive log use characteristics
Tasks backing out
System paging of log buffers
Excessive checkpoints taken
CPExpert analysis uses SMF Type 115 records
WebSphere MQ
Typical DB2-interface problems
Thread delays
DB2 server processing delays
Server requests queued
Server tasks experienced ABENDs
Deadlocks in DB2
Maximum request queue depth was too large
CPExpert analysis uses SMF Type 115 records
WebSphere MQ
Typical Shared queue problems
Structure was full
Large number of application structures defined
MINSIZE is less than SIZE for CSQ.ADMIN
SIZE is more than double MINSIZE
ALLOWAUTOALT(YES) not specified
FULLTHRESHOLD value might be incorrect
CPExpert analysis uses SMF Type 115 records
and Type 74 (Coupling Facility) records
WebSphere MQ – sample report
RULE WMQ100: MESSAGES WERE WRITTEN TO PAGE SET ZERO
More than 0 messages were written to Page Set Zero during the intervals
shown below. Messages should not be written to Page Set Zero, since
serious WebSphere MQ system problems could occur if Page Set Zero
should become full. This finding relates to queue
SYSTEM.COMMAND.INPUT
                          MESSAGES WRITTEN
STATISTICS INTERVAL       TO PAGE SET ZERO
13:16-14:45, 28AUG2003          624
RULE WMQ122: DEAD.LETTER QUEUE IS INAPPROPRIATE FOR PAGE SET ZERO
Buffer Pool 0. The DEAD.LETTER queue was assigned to Page Set Zero.
A dead-letter queue stores messages that cannot be routed to their
correct destinations. If the DEAD-LETTER queue grows large unexpectedly,
Page Set Zero can become full, and WebSphere MQ can enter a serious
stress condition. You should redefine the DEAD.LETTER queue to a page
set other than Page Set Zero. This finding relates to queue
SYSTEM.DEAD.LETTER.QUEUE
WebSphere MQ – sample report
RULE WMQ110: EXPYRINT VALUE IS OFF OR TOO SMALL
Buffer Pool 3. There were more than 25 expired messages skipped when
scanning a queue for a specific message. Processing expired messages
adds both CPU time and elapsed time to the message processing. With
WebSphere 5.3, the EXPYRINT keyword was introduced to allow the queue
manager to automatically determine whether queues contained expired
messages and to eliminate expired messages at the interval specified
by the EXPYRINT value. This finding applies to queue:
DPS.REPLYTO.RCB.IVR04
                             GET        BROWSE     EXPIRED MESSAGES
STATISTICS INTERVAL        SPECIFIC    SPECIFIC       PROCESSED
13:41-13:41, 03JUL2003         0           0             313
RULE WMQ320: APPLICATIONS WERE SUSPENDED FOR LOG WRITE BUFFERS
Applications were suspended while in-storage log buffers were being
written to the active log. This finding normally means that too
few log buffers were assigned. However, the finding could mean
that there is an I/O configuration problem and the log buffer writes
to the active log are delayed for I/O reasons. This finding applies
to the following statistics intervals.
                           NUMBER OF SUSPENSIONS
STATISTICS INTERVAL       WAITING ON OUTPUT BUFFERS
14:19-14:44, 12SEP2003             139
WebSphere MQ – sample report
RULE WMQ201: BUFFER POOL ENCOUNTERED SYNCHRONOUS (5%) THRESHOLD
Buffer Pool 0. This buffer pool encountered the Synchronous Write
threshold (less than 5% of the pages in the buffer pool were "stealable"
or more than 95% of the pages were on the Deferred Write queue). While
the Synchronous Page Writer is executing, updates to any page cause the
page to be written immediately to the page set (the page is not placed
on the Deferred Write Queue, but is written immediately to the page set
as a synchronous write operation). This situation harms performance of
applications, and is an indicator that the buffer pool is in danger of
encountering a Short on Storage condition.
                           BUFFERS     TIMES AT       IMMEDIATE
STATISTICS INTERVAL       ASSIGNED    5% THRESHOLD     WRITES
17:08-17:09, 07OCT2003      1,050          19             19
RULE WMQ205: HIGH I/O RATE TO PAGE SETS WITH SHORT-LIVED MESSAGES
Buffer Pool 0. This buffer pool had short-lived messages assigned.
The total I/O rate (read and write activity) to page sets for the
short-lived messages was more than 0.5 pages per second. Writing
pages to the page set and subsequently reading the pages from the
page set cause I/O overhead and delay to the application. This
finding applies to the following intervals:
                           BUFFERS     PAGES     PAGES    I/O RATE
STATISTICS INTERVAL       ASSIGNED    WRITTEN    READ     WITH DASD
11:32-11:32, 24JUL2006      50,000       101        0        50.5
WebSphere MQ – sample report
RULE WMQ300: ARCHIVE LOGS WERE USED FOR BACKOUT
WebSphere MQ applications issued log reads to the archive log
file for backout more than 0 times during the WebSphere MQ
statistics intervals shown below. Most log read requests should
come from the output buffer or the active log. Using archive
logs for backout purposes often indicates that either the active
log files were too small or long-running applications were backing
out work.
                           NUMBER OF LOG READS
STATISTICS INTERVAL         FROM ARCHIVE LOG
4:30- 5:00, 12SEP2003              192
RULE WMQ611: LARGE NUMBER OF APPLICATION STRUCTURES WERE DEFINED
SMF TYPE74 (Structure) statistics showed that more than 5 application
structures were defined to a coupling facility. IBM suggests that you
should have as few application structures as possible. Having multiple
application structures in a coupling facility can degrade performance.
                        WEBSPHERE MQ
COUPLING FACILITY     STRUCTURES DEFINED
CF1                           8
CF2                           9
CF3                           8
CPExpert Release 18.1
(Issued April 2008)
Major enhancements with this update:
• Provided support for z10 server
• Provided analysis of HiperDispatch problems
• Provided new reports to help analysis of DB2 buffer
pool problems
• Expanded the CPExpert email feature to the DASD
Component
• Provided additional analysis features for the
WebSphere MQ Component
CPExpert Release 18.2
(Issued October 2008)
Major enhancements with this update:
• Provided support for z/OS Version 1, Release 10
• Provided additional analysis of z/OS performance
problems (in WLM Component), including reduced
CPU speed caused by cooling unit failure
• Provided new reporting of rules based on History
information kept by CPExpert (applies to all
components except DB2 Component)
• Added masking technique to select CICS regions
(by region Group), DASD volumes (including SMS
Storage Groups), and WebSphere MQ subsystems
CPExpert Release 19.1
(Issued April 2009)
Major enhancements with this update:
• Enhanced WLM Component with analysis of more
z/OS performance problems, including Enqueue
Promoted Dispatching Priority analysis
• Projected the amount of zAAP-eligible work that could
be offloaded to a zAAP processor, if a zAAP
processor were assigned to the LPAR
• Provided more analysis of CICS temporary storage
in CICS Component
• Added Resource Enqueue analysis to DASD
Component
CPExpert Release 19.2
(Issued October 2009)
Major enhancements with this update:
• Provided support for z/OS Version 1, Release 11
• Provided support for CICS/TS Release 4.1
• Added analysis of Resource Enqueue contention
between different levels of Goal Importance to WLM
Component
• Added analysis of CICS Event Processing to the
CICS Component (applicable to CICS/TS 4.1)
• Allowed users to specify narrative descriptions of
individual DB2 buffer pools in CPExpert reports
CPExpert Release 20.1
(Issued April 2010)
Major enhancements with this update:
• Enhanced WLM Component with analysis of SMF
buffer specifications and other SMF performance
constraints
• Supported analysis of VSAM performance problems
when analyzing a MICS performance data base, but
using MXG TYPE42DS and MXG TYPE64 files
• Allowed selection of up to 20 unique DB2 subsystems
while analyzing performance problems with DB2
subsystems, and added logic to handle the case where
an installation has multiple identical DB2 subsystem
names defined in z/OS images
CPExpert Release 20.2
(Issued October 2010)
Major enhancements with this update:
• Provided support for z/OS Version 1, Release 12
• Provided support for the zEnterprise System (z196)
• Enhanced WLM Component to provide analysis of
dropped SMF records and analysis of SMF flood
facility (available with z/OS V1R12)
• Enhanced WLM Component to provide Management
Overview of CPExpert findings, with web-enabled
documentation links
• Enhanced the WebSphere MQ Component to provide
analysis of a non-indexed request/reply-to queue
License fees
(Site license)
Components                     First Year    Additional Year
WLM Component                     7,500           5,000
DB2 Component                     7,500           5,000
CICS Component (see note)         5,000           3,000
WMQ Component                     5,000           3,000
DASD Component                    3,000           1,500
Note: Fees shown for the CICS Component are for analyzing no more than 50 CICS regions.
Summary
• The major objective is to share solutions and
provide insight into new z/OS features.
• CPExpert is updated every six months; support for
new versions of z/OS has been available within 30
days after General Availability of the new z/OS
release.
• CPExpert is offered at a low cost (affordable by all
z/OS shops).
• 45-day no-obligation trial is available (see license
agreement for details).
• Free no-obligation performance analysis is available
For more information, please contact
Don Deese
Computer Management Sciences, Inc.
634 Lakeview Drive
Hartfield, VA 23071-3113
Phone: (804) 776-7109
Fax:   (804) 776-7139
email: Don_Deese@cpexpert.com
Visit www.cpexpert.com for more information, to
review sample output, to review documentation in
SAS ODS “point-and-click” format, to download
license agreements in .pdf “form” mode, etc.