Diagnose Utility
Patrick Gannon
IBM Level 2 Support
Session Code: A08
Wed. May 4, 2:45 – 3:45 | Platform: z/OS

Diagnose Utility
• The DIAGNOSE online utility generates information that is useful in diagnosing problems.
• Authorization – must include one of the following:
  • REPAIR privilege for the database
  • DBADM or DBCTRL authority for the database. If the object on which the utility operates is in an implicitly created database, DBADM authority on the implicitly created database or DSNDB04 is required.
  • SYSCTRL or SYSADM authority
• An ID with installation SYSOPR authority can execute the DIAGNOSE utility on a table space in the DSNDB01 or DSNDB06 database.
• An ID with installation SYSADM authority can execute the DIAGNOSE utility with the WAIT statement option on any table space.

Diagnose Utility
• Output goes to SYSPRINT
• DIAGNOSE can run concurrently on the same target object with any SQL operation or utility, except a utility that is running on DSNDB01.SYSUTILX.
• You can terminate and restart the DIAGNOSE utility.
• Restart starts from the beginning

Diagnose Utility
• Standard Utility input
    //SYSIN DD *
      DIAGNOSE …….
      REORG…..

Diagnose Keywords
• TYPE
• ALLDUMPS
• NODUMPS
• DISPLAY OBD
• DISPLAY SYSUTIL
• DISPLAY MEPL
• DISPLAY AVAILABLE
• DISPLAY DBET
• WAIT
• ABEND

Diagnose DISPLAY OBD
• Formats the object descriptor (OBD) of the tablespace
• Ex. DIAGNOSE DISPLAY OBD database.tablespace TABLES
• May be used to compare the catalog information with the OBD.
• Not very useful unless you are directed by Level 2

Diagnose DISPLAY SYSUTIL
• Formats every record from SYSIBM.SYSUTILX
• SYSUTILX stores information about all utility jobs.
    USUUID  =DP954086REORG   USUJOBNM=INH2180C       USUUTNAM=RUNSTATS
    USUAUID =INH2180         USUSTATU=A              USUTREQ =
    USUDBOB ='0000'X         USUPSID ='0000'X        USUPSDD ='0000'X
    USUDBNAM=                USUSPNAM=               USUREL  =810
    USUNDONE='00000000'X     USUPOS  ='00000019'X    USUCMPOK=N
    USURDATE='00000000'X     USUCKSUM='0213A224'X    USUMEMBR=
    USUOCATD=N               USUOWLNK=N              USUOLOG1=N
    USUDSENV=N               USULSIZE='00000000'X    USULCUR ='00000000'X
    DSNU868I - DSNUDISS - DISPLAY SYSUTIL UTILITY DEPENDANT AREA

Diagnose DISPLAY SYSUTIL
• You CAN end up having orphaned SYSUTIL records
• Best solution is usually to delete / define the tablespace
• PM08223 – V8/V9 reduce contention on SYSUTILX data page locks and UTSERIAL lock
• Requires delete / define

Diagnose DISPLAY MEPL
• Dumps the module entry point lists (i.e., a DB2 load module and its latest PTF) to SYSPRINT
• DB2 Support will often request this if a dump is not available
    DSNUBBFC04/19/11UK66918
    DSNUBBOP09/16/10UK60511
    DSNUBBRD09/16/10UK60511
    DSNUBBSC09/30/10UK60954
    DSNUBBUM03/31/11UK66364
    DSNUBCDA09/16/10UK60511

Diagnose DISPLAY MEPL
• DB2 Support can check for any PEs that are applied, as well as any inconsistent applies or HIPER PTFs that are missing
• Open a PMR and ask to have your MEPL checked for missing HIPERs, applied PEs, or inconsistencies.

Diagnose DISPLAY AVAILABLE
• Displays the utilities that are installed on this subsystem in both bitmap and readable form
• This was helpful because the Utilities Suite has become a separately ordered product
• Users may not have installed it properly
• Whether you purchase the Utilities Suite or not, all IBM utilities do work against the catalog and directory

Diagnose DISPLAY DBET
• Dumps the contents of a database exception table to SYSPRINT
• The DBET and SYSUTILX should be in sync.
• If you cannot terminate a utility, it is likely that the DBET entry is no longer there
• This will result in having to reinitialize SYSUTILX

Diagnose DISPLAY DBET
• Conversely, you could have a DBET entry without a SYSUTILX entry (orphaned)
• This may result in 00C2010D
    DIAGNOSE DISPLAY DBET TABLESPACE DSNDB06.SYSCOPY
    0000 12340060 D9C5D7E3 00060010 00000002 612B1D4C 0000C4E2 D5C4C2F0 F640E2E8 *...-REPT......../..<..DSNDB06 SY*
• REPAIR SET DSNDB06.SYSCOPY RESET

Diagnose WAIT
• Suspends utility execution (message or trace ID)
• Issues message to the console
• Resumes:
  • Reply to message
  • The utility job times out
  • Utility job is cancelled
• Allows events to be synchronized
• If the utility message or trace ID is NOT encountered, processing continues

Diagnose ABEND
• Forces an abend during utility execution
• Message or Trace ID
• Needed by DB2 support if abend is being suppressed
    DIAGNOSE ABEND MESSAGE U1450
    LOAD DATA RESUME NO SHRLEVEL NONE REPLACE LOG NO NOCOPYPEND
      WORKDDN(SYSUT1, SORTOUT) INDDN(SYSREC)
      INTO TABLE IIS1.TWDAC02
    DSNU1450I -P8L1 DSNUBINS - CHARACTER CONVERSION FROM CCSID 930 TO 37 FAILED WITH ERROR CODE 24 FOR TABLE 0006.002E COLUMN 1

Diagnose TYPE
• Specifies one or more types of diagnose that you want to perform.
• Syntax: DIAGNOSE TYPE(integer,….)
• integer is the number of types of diagnoses.
• Example
    //DSNUPROC.SYSIN DD *
      DIAGNOSE TYPE(100,101,102)
      LISTDEF TSPART
      .
      .

Diagnose TYPE
• ATTENTION…DISCLAIMER…and other WARNINGS!
• Diagnose Types are used by IBM support to provide diagnostics to further debug issues
• They are not subject to customer requirements
• They are not supported and can be changed or deprecated at any time without notice
• The Diagnose utility should be run with the guidance of IBM support

Diagnose TYPE(100,101,102)
• Prints performance-related output for all utilities.
• Each utility has its own phases and subphases
• Pinpoints the area where the utility is spending most of its time.
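One way to use this output is to rank the reported intervals by elapsed time. The following is a minimal Python sketch, not an IBM tool: it assumes the two-line "INTERVAL = … CPU (SEC) = …" / "LEVEL = … ELAPSED TIME (SEC) = …" layout seen in this deck's REORG sample, and the function name `rank_intervals` is hypothetical. The sample values are taken from that REORG example.

```python
import re

# Assumed layout (taken from the deck's sample, not from an IBM specification):
#   INTERVAL = <name>    CPU (SEC) = <seconds>
#   LEVEL = <level>      ELAPSED TIME (SEC) = <seconds>
INTERVAL_RE = re.compile(
    r"INTERVAL\s*=\s*(?P<interval>.+?)\s+CPU \(SEC\)\s*=\s*(?P<cpu>[\d.]+)\s+"
    r"LEVEL\s*=\s*(?P<level>.+?)\s+ELAPSED TIME \(SEC\)\s*=\s*(?P<elapsed>[\d.]+)"
)


def rank_intervals(sysprint_text):
    """Return (interval, level, cpu, elapsed) tuples, longest elapsed time first."""
    rows = [
        (m["interval"], m["level"], float(m["cpu"]), float(m["elapsed"]))
        for m in INTERVAL_RE.finditer(sysprint_text)
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Sample text copied from the REORG partial example in this deck.
SAMPLE = """\
INTERVAL = UPDATE HEADER PAGE        CPU (SEC) = 0.262410
LEVEL = SWITCH SUBPROCESS            ELAPSED TIME (SEC) = 66.212 <==
-----------------------------------------------------------------
INTERVAL = REORG SYSCOPY             CPU (SEC) = 0.021081
LEVEL = SWITCH SUBPROCESS            ELAPSED TIME (SEC) = 1.097
-----------------------------------------------------------------
INTERVAL = INLINE COPY SYSCOPY       CPU (SEC) = 0.292745
LEVEL = SWITCH SUBPROCESS            ELAPSED TIME (SEC) = 61.260 <==
-----------------------------------------------------------------
INTERVAL = UPDATE REALTIMESTATS      CPU (SEC) = 0.000067
LEVEL = SWITCH SUBPROCESS            ELAPSED TIME (SEC) = 0.017
"""

if __name__ == "__main__":
    for interval, level, cpu, elapsed in rank_intervals(SAMPLE):
        print(f"{elapsed:10.3f} s elapsed  {cpu:10.6f} s CPU  {interval} ({level})")
```

A large gap between elapsed and CPU time, as in the two intervals flagged with <==, suggests the utility is waiting rather than working in that interval.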

Diagnose TYPE(100,101,102)
• REORG (partial example)
    INTERVAL = UPDATE HEADER PAGE        CPU (SEC) = 0.262410
    LEVEL = SWITCH SUBPROCESS            ELAPSED TIME (SEC) = 66.212 <==
    -----------------------------------------------------------------
    INTERVAL = REORG SYSCOPY             CPU (SEC) = 0.021081
    LEVEL = SWITCH SUBPROCESS            ELAPSED TIME (SEC) = 1.097
    -----------------------------------------------------------------
    INTERVAL = INLINE COPY SYSCOPY       CPU (SEC) = 0.292745
    LEVEL = SWITCH SUBPROCESS            ELAPSED TIME (SEC) = 61.260 <==
    -----------------------------------------------------------------
    INTERVAL = UPDATE REALTIMESTATS      CPU (SEC) = 0.000067
    LEVEL = SWITCH SUBPROCESS            ELAPSED TIME (SEC) = 0.017

Diagnose TYPE(100,101,102) - Bufferpool Section
    INTERVAL = REORG                     CPU (SEC) = 6.238221
    LEVEL = UTILITY                      ELAPSED TIME (SEC) = 27.896

    BUF POOL   GETPAGES     SYS SETW     SYNC READS   SYNC WRITE
    --------   ----------   ----------   ----------   ----------
    BP0             83692        64039           90           20
    BP8K0              91           38
    BP32K             135           94
    TOTAL           83918        64171           90           20

    DDNAME     DS OPEN      DS CLOSE     READ I/O     WRITE I/O
    --------   ----------   ----------   ----------   ----------
    SYS00002            2            2         4031         4002
    SYS00001            1            1                      4585
    TOTAL               3            3         4031         8587

Diagnose TYPE(100,101,102)
• COPY (partial example)
    INTERVAL = COPYR 0001 COPY Dat       CPU (SEC) = 0.104408
    LEVEL = SUBPHASE                     ELAPSED TIME (SEC) = 4.635

    BUF POOL   GETPAGES     SYS SETW     SYNC READS   SYNC WRITE
    --------   ----------   ----------   ----------   ----------
    BP0               985           55           32           39
    BP32K             148          125            1            1
    BP8K0              16            3            9            2
    BP8K1              62            8            4            5
    BP32K              43           18            7           12
    BP16K1              7            3            1            2
    TOTAL            1261          212           54           61

Diagnose TYPE(2)
• Prints Diagnostic information for Reorg
• The dreaded 00E40005
• Three basic reasons for failure
  • FILSZ is wrong (i.e., row estimate)
  • AVGRLEN is wrong
  • Not enough space allocated

Diagnose TYPE(2)
• Zparm UTSORTAL=YES
• Zparm IGNSORTN=YES
• DFSORT IGNWKDD = YES
• RTS is a better solution for ensuring that REORG has the best information on sort allocation.
• DSN1COPY will not update RTS!!
• PM31463 (April 2011) – Loop because OEM software is reducing allocations with UTSORTAL=YES

Diagnose TYPE(2)
• Find ICE046A
    ICE046A 0 SORT CAPACITY EXCEEDED - RECORD COUNT 63743600
    ICE253I 0 RECORDS SORTED - PROCESSED: 63743600, EXPECTED: 72734160
    ICE098I 0 AVERAGE RECORD LENGTH - PROCESSED: 1420, EXPECTED: 1421
    > DYNAMIC ALLOCATION FAILED 1 yRC ADDR=X'1CC438' LENGTH=X'04' TYPE=81
    0000 00000004 *.... *
    .
    S99ERROR ADDR=X'1C7044' LENGTH=X'02' TYPE=81
    0000 970C *P. *

Diagnose TYPE(2)
• Reorg calls DFSORT or new DB2 SORT
    MSGPRT=ALL,MSGDDN=UTPRINT,AVGRLEN=04160,MAINSIZE=064793K
• User running REORG has a very large AVGRLEN
• Tablespace is compressed, so why is AVGRLEN so large?
• DIAGNOSE TYPE(2) shows
  • lclflags ADDR=X'69448953' LENGTH=X'01'
    0000 8C
  • which means the average record length is being decompressed because a column has been altered
  • URFAVGRC ADDR=X'4C4BD076' LENGTH=X'02'
    0000 0FE5 = 4069

Diagnose TYPE(2)
• TURN OFF the ALT-ADD-COL
• Run a successful Reorg to materialize rows
• Take Image Copy
• Run Modify to delete older SYSCOPY / SYSLGRNX

Diagnose TYPE(66)
• Prints information on parallelism (Reorg, Copy, Load, Rebuild)
    NumIndexes ADDR=X'18B20994' LENGTH=X'02'
    0000 0008 *..
    MaxTaskSets ADDR=X'18B208BC' LENGTH=X'04'
    0000 0000000B *..
    BelowSortMem ADDR=X'18B208B8' LENGTH=X'04'
    0000 00094000 *..
    ActualTaskSets ADDR=X'18B208C0' LENGTH=X'04'
    0000 00000008 *..

Diagnose TYPE(130)
• Real Time Statistics information
• Need to have UTSORTAL=YES
• DSNU3343I ) 119 23:01:39.71 DSNURFIT - REAL-TIME STATISTICS INFORMATION MISSING
    xPartNo ADDR=X'7E515712_00000000' LENGTH=X'02' TYPE=82
    0000 0000 *..
    RsiTotalRowsIsNull(xPartNo) ADDR=X'7E54EBD0_00000000' LENGTH=
    0000 68 *.
    RsiTotalRows(xPartNo) ADDR=X'7E54EBD8_00000000' LENGTH=X'08'
    0000 00000000 00000000 *........

Diagnose TYPE(130)
• Real Time Statistics information (cont)…
    xPartNo ADDR=X'7E515712_00000000' LENGTH=X'02' TYPE=82
    0000 0000 *..
    RsiTotalRowsIsNull(xPartNo) ADDR=X'7E54EBD0_00000000' LENGT
    0000 00 *.
    RsiTotalRows(xPartNo) ADDR=X'7E54EBD8_00000000' LENGTH=X'08
    0000 00000000 000000BD *........
• Run problem Reorgs with DIAGNOSE TYPE(2,130)

Diagnose TYPE(329)
• Prevents lock contention on SYSDBASE (EPOCH update)
• EPOCH is used to record the number of utility operations that change the location of rows
• May be less of an issue with V10 catalog restructure
• Symptom - 00E4070D - SYSTABLEPART ACCESS FAILED OR
• DSNT500I +DB2B DSNUGRAR RESOURCE UNAVAILABLE 475
    12:33:28 REASON 00C90088
    12:33:28 TYPE 00000302
    12:33:28 NAME DSNDB06 .SYSDBASE.X'000AAC'

Diagnose TYPE(386)
• Reorg Rebalance diagnostic information
    USURUNLD ADDR=X'7EB77198_C6F4E2C1' LENGTH=X'08'
    0000 00000000 00119566 *......n.
    AvgRecordSize ADDR=X'7E441E80_C6F4E2C1' LENGTH=X'04'
    0000 0000005F *...¬
    AvgRecordsPerPage ADDR=X'7E441E84_C6F4E2C1' LENGTH=X'04
    0000 0000002A *....
    AvgPagesUsed ADDR=X'7E441E88_C6F4E2C1' LENGTH=X'04'
    0000 00006B2E *..,.
    AvgTotalPages ADDR=X'7E441E8C_C6F4E2C1' LENGTH=X'04'
    0000 00006B2E *..,.

Diagnose TYPE(689)
• No RI checking for PIT recovery
• By design, the RECOVER utility accesses SYSRELS to get RI parent/dependent information, thus locking SYSDBASE pages.
• Multiple jobs (e.g., DR) will cause deadlocks/timeouts
• This only occurs for RECOVER TORBA and RECOVER TOCOPY of objects with RI
• DB2 will not set check pending for dependent objects

Diagnose TYPE(689)
• Row in SYSCOPY for PIT (ICTYPE = 'P')
• Either STYPE = 'J' or STYPE = 'K' to indicate "ENFORCE(NO)"
• STYPE = 'J' indicates a point-in-time recover with the LOGONLY option.
• STYPE = 'K' indicates a point-in-time recover without the LOGONLY option.
• V10 – PM31932 (end of May) - ENFORCE YES/NO
• Specifies that CHKP and ACHKP pending states are not set for a point-in-time recovery when only a subset of the related objects (BASE, LOB, XML, and RI) have been recovered to a point in time

Diagnose TYPE(690)
• COPY utility (SHRLEVEL CHANGE or REFERENCE) writes a SYSLGRNX record for each object being copied
• MODIFY can only remove the entries if there is a subsequent closed SYSLGRNX record that is also being deleted for the same dbid, psid, part, and mem-id.
• Open SYSLGRNX records can make RECOVER and REORG take a long time in the LOG APPLY phase
• Use the MODIFY utility to remove old records
• REORG SYSLGRNX

Diagnose TYPE(690)
• There are a number of reasons why DB2 wouldn't close off the SYSLGRNX entry, including:
  • Pageset is in GRECP
  • SYSLGRNX update failed
  • Close is in response to a failure during open
  • Pageset is not flagged as having an open SYSLGRNX entry
  • Header page is in LPL (PQ83372)
  • Some failure updating the header page level-ID
  • SYSLGRNX close process encounters a lock timeout

Diagnose TYPE(820)
• Turns off the intermediate commit processing while an object is being copied.
• Could improve performance if the number of pages is large (10000)

Miscellaneous
• A DIAGNOSE TYPE does have the possibility to make it to the general code.
• Former DIAGNOSE TYPE became APAR PM23786 in Jan. 2011
• V8/V9/V10 - COPY UTILITY PERFORMANCE ENHANCEMENT for TAPE output

Questions
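The hex dumps shown in the DISPLAY DBET and TYPE(2) slides can be checked by hand or with a few lines of code. This is a minimal Python sketch, not part of DB2: the helper name `printable_ebcdic` is hypothetical, and EBCDIC code page 037 is an assumption (your subsystem's CCSID may differ). The input values are copied from the slides.

```python
def printable_ebcdic(hex_words):
    """Decode space-separated hex words as EBCDIC and show unprintable
    bytes as '.', in the style of the dump's right-hand column.

    Code page 037 is an assumption, not something the dump itself states.
    """
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    text = raw.decode("cp037")
    return "".join(ch if ch.isprintable() else "." for ch in text)


# Hex words copied from the DISPLAY DBET slide (DSNDB06.SYSCOPY).
DBET_WORDS = "12340060 D9C5D7E3 00060010 00000002 612B1D4C 0000C4E2 D5C4C2F0 F640E2E8"

# Halfword copied from the TYPE(2) slide (URFAVGRC).
URFAVGRC_HEX = "0FE5"

if __name__ == "__main__":
    print(printable_ebcdic(DBET_WORDS))
    print(int(URFAVGRC_HEX, 16))
```

The first line printed should match the character column of the DBET dump (the REPT marker followed by DSNDB06 SY), and the second reproduces the 4069 shown next to URFAVGRC.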