Session: G10 Identifying and Tuning Suboptimal SQL Philip K. Gunning Gunning Technology Solutions, LLC May 21, 2008 • 10:00 a.m. – 11:00 a.m. Platform: DB2 LUW Objectives Learn How to Identify suboptimal SQL Identifying the Top 10 SQL Ways you can rewrite or tune the SQL Index solutions, DB2 Explain, Design Advisor Implementing and Verifying the solution 2 Outline The Need to Identify and Tune suboptimal SQL Suboptimal SQL impacts the business Poor response time Lost customers If web-facing, customers don’t come back Lost business opportunities Lost enterprise-wide productivity Lost revenue Need for more resources DBAs, CPU and IO resources, increased network bandwidth IDENTIFY and TUNE 3 Outline Several means available for identifying and capturing suboptimal SQL Convenience Views I.E. LONG_RUNNING_SQL DB2 Snapshots Application snapshots Dynamic SQL snapshots List applications show detail Administrative Routines 4 Outline db2pd (lowest overhead of all monitoring methods)* Event monitors Provided with DB2 Third Party Vendor Tools Many good tools now available 5 Outline OS monitoring tools PS command with correlation to list applications output TOPAS (PID correlated with list applications show detail command and application snapshot on the associated agentid) TOP on other platforms NMON 6 Characteristics of Suboptimal SQL Join predicates missing or not indexed Local predicates (those in the select list) not indexed for potential index-only access Order by predicates not indexed or indexes not created with “ALLOW REVERSE SCANS” Note “ALLOW REVERSE SCANS” now default in DB2 9.5 Foreign key indexes not defined Note that EXPLAIN enhanced in DB2 9.5 to show use of FK (RI) Misunderstanding of IXSCAN operator 7 Characteristics of Suboptimal SQL DB2 built-in functions such as UCASE causing IXSCAN of entire index Generated column Company culture does not allow time for explain of SQL before it goes into production Nowadays, this is very prevalent Developers not aware of explain capabilities and options Design Advisor not used or misinterpreted 8 Classes of Predicates Range Delimiting Index SARGable Data SARGable Residual 9 Predicate Example Index For the following predicate rule examples, assume that an index has been created on Col A, Col B, and Col C Asc as follows: ACCT_INDX: Col A Col B Col C 10 Predicates Range Delimiting Used to bracket an index scan Uses start and stop predicates Evaluated by the Index Manager 11 Range Delimiting Example Col A = 3 and Col B = 6 and Col C = 8 In this case the equality predicates on all the columns of the index can be applied as start-stop keys and they are all range delimiting Col A Col B Col C 12 Predicates Index SARGable Are not used to bracket an index scan Can be evaluated from the index if one is chosen Evaluated by the Index Manager 13 Index SARGable Example Col A = 9 and Col C = 4 Col A can be used as a range delimiting (start-stop) predicate. Col C can be used as an Index SARGable predicate, it cannot be used as a range delimiting since there is no predicate on Col B. Starting with columns in the index, from left to right, the first inequality predicate stops the column matching. Col A Col B Col C 14 Predicates Data SARGable Cannot be evaluated by the Index Manager Evaluated by Data Management Services Requires the access of individual rows from the base table 15 Data SARGable Example Col A = 3 and Col B <= 6 and Col D = 9 Col A is used as a start-stop predicate, Col B is used as a stop predicate, and Col D which is not present in the index is applied as a Data SARGable predicate during the FETCH from the table Col A Col B Col C 16 Residual Predicates Residual Predicates Cannot be evaluated by the Index Manager Cannot be evaluated by Data Management Services Require IO beyond accessing the base table Predicates such as those using quantified subqueries (ANY, ALL, SOME, or IN), LONG VARCHAR, or LOB data which is stored separately from the table Correlated Sub-queries Are evaluated by Relational Data Services and are the most expensive type of predicates 17 Residual Predicate Example Col B = 4 and UDF with external action(Col D) In this case the leading Col A does not have a predicate. Col B can only be used as an Index SARGable predicate (where the whole index is scanned). Col D involves a user defined function which will be applied as a residual predicate 18 Predicate Best Practice Use Range Delimiting predicates whenever possible Verify via Explain 19 DEFAULT FILTER FACTORS 20 Platform: DB2 for Linux, UNIX, and Windows SQL Coding Best Practices for Developers Phil Gunning Principal Consultant, Gunning Technology Solutions, LLC Session: G2 May 23, 2005 12:30 – 1:40 21 Outline • • • • Best Practices Classes of Predicates • Index SARGable • Range Delimiting • Data SARGable Predicate Best Practices • Local, Order By, Join Predicates Restricting Results • Restrict before joining • Selectivity • DB2 Catalog Queries/Explain 22 Outline • • • • Index Design • Local, Order By, Join predicates • Include Columns • Uniqueness DB2 Visual Explain/db2exfmt/Design Advisor Monitor and Evaluate Summary 23 Best Practices 1. Use Range Delimiting and Index SARGable Predicates wherever possible 2. Understand DB2 predicate rules 3. Specify most restrictive predicates first 4. Select only columns that are needed 5. Adhere to proper index design techniques 6. Understand inputs to the Optimizer 7. Developers and DBAs collaborate to design proper indexes 8. Evaluate all SQL using Visual Explain/db2exfmt 9. Use Design Advisor to tune SQL/SQL Workloads 10. Consistently monitor and review application performance 24 Application COST Relational Data Services Data Management Services Index Manager Residual predicates Data SARGable predicates Range Delimiting Index SARGable Data 25 RULE#1 • Use Range Delimiting and Index SARGable predicates whenever possible 26 Query Rewrite • • • • The DB2 for Linux, UNIX and Windows optimizer contains significant query rewrite capability Still important to write predicates following the local, order by, join rule Query rewrite will take care of most transformations that need to be made However, if predicates are missing or indexes are not available to support the access paths, your SQL will not be able to take advantage of query rewrite 27 Query Rewrite • • • • The DB2 for Linux, UNIX and Windows optimizer contains significant query rewrite capability Still important to write predicates following the local, order by, join rule Query rewrite will take care of most transformations that need to be made However, if predicates are missing or indexes are not available to support the access paths, your SQL will not be able to take advantage of query rewrite 28 Index Review • An index is a data structure that contains column values and a pointer to the table data • Primary key – Unique Index • If a primary key is defined, DB2 automatically creates a unique index to enforce the PK constraint • Secondary Index • Created to support access to frequently referenced columns • Indexes provide efficient access (in terms of CPU and IO) to columns found in the table • Just like an index entry in a book, an index in a database enables rapid lookup of associated table entries 29 Index Characteristics • • • • Index entries are usually much smaller (subset) of all table columns Can fit more index entries on a page Allows for more efficient use of buffer pool Separate index buffer pool • Enables often used index pages to remain in the buffer pool longer • More logical IO than physical IO 30 A Word About Index Structures • • • B+ -tree used to store index entries Provides for a tree structure that is balanced to a constant depth from the root to the leaf blocks along every branch Usually more efficient (less costly) than a table scan 31 B+ - Tree Index Example ROOT Level 0 Level 1 Level 2 Intermediate NODE 1 Leaf NODE L1.1 Intermediate NODE 2 Leaf NODE L1.300 Leaf NODE L2.1 Intermediate NODE 3 Leaf NODE L2.600 Leaf NODE L3.1 Intermediate NODE 4 Leaf NODE L3.900 Leaf NODE L4.1 Leaf NODE L4.1200 32 Index-Only Access ROOT Level 0 Level 1 Level 2 Leaf NODE L1.1 Leaf NODE L1.300 Leaf NODE L2.1 Intermediate NODE 4 Intermediate NODE 3 Intermediate NODE 2 Intermediate NODE 1 Leaf NODE L2.600 Leaf NODE L3.1 Leaf NODE L3.900 Leaf NODE L4.1 Leaf NODE L4.1200 33 Range Delimiting (Start – Stop Predicates) ROOT Level 0 Level 1 Level 2 Intermediate NODE 1 Leaf NODE L1.1 Intermediate NODE 2 Leaf NODE L1.300 Leaf NODE L2.1 Intermediate NODE 3 Leaf NODE L2.600 Leaf NODE L3.1 Intermediate NODE 4 Leaf NODE L3.900 Leaf NODE L4.1 Leaf NODE L4.1200 Base Table 34 Select deptnumb, deptname from db2admin.org Where deptnumb < 20 35 Range Delimiting Example 36 Full Table Scan Query …. Index ACCT_IND1 37 Table Scan Example Created this index and ran this SQL CREATE INDEX "DB2ADMIN". "YYZZ" ON "DB2ADMIN"."ORG" ("DEPTNUMB" ASC, "DEPTNAME" ASC, "DIVISION" ASC) Select deptnumb, deptname from db2admin.org Where deptnumb =20 and deptname like 'b%' or division = 'midwest' and manager = 88 or location like 'bo%' 38 Table Scan Example 39 Table Scan Rules of Thumb • If > 20-25% of the rows will be read, good likelihood of table scan • If 0.5 – 20% of the rows are read, likely index access but this can vary depending on numerous factors • Exact formulas used are complex and not very useful for practical purposes 40 Rule #2 • Understand and apply DB2 predicate rules 41 WITH DEPT_MGR AS ( SELECT DEPTNO, DEPTNAME, EMPNO, LASTNAME, FIRSTNME, PHONENO FROM DEPARTMENT D, EMPLOYEE E WHERE D.MGRNO=E.EMPNO AND E.JOB='MANAGER' ), DEPT_NO_MGR AS ( SELECT DEPTNO, DEPTNAME, MGRNO AS EMPNO FROM DEPARTMENT EXCEPT ALL SELECT DEPTNO, DEPTNAME, EMPNO FROM DEPT_MGR ), MGR_NO_DEPT (DEPTNO, EMPNO, LASTNAME, FIRSTNME, PHONENO) AS ( SELECT WORKDEPT, EMPNO, LASTNAME, FIRSTNME, PHONENO FROM EMPLOYEE WHERE JOB='MANAGER' EXCEPT ALL SELECT DEPTNO,EMPNO, LASTNAME, FIRSTNME, PHONENO FROM DEPT_MGR ) SELECT DEPTNO, DEPTNAME, EMPNO, LASTNAME, FIRSTNME, PHONENO FROM DEPT_MGR UNION ALL SELECT DEPTNO, DEPTNAME, EMPNO, CAST(NULL AS VARCHAR(15)) AS LASTNAME, CAST(NULL AS VARCHAR(12)) AS FIRSTNME, CAST(NULL AS CHAR(4)) AS PHONENO FROM DEPT_NO_MGR UNION ALL SELECT DEPTNO, CAST(NULL AS VARCHAR(29)) AS DEPTNAME, EMPNO, LASTNAME, FIRSTNME, PHONENO FROM MGR_NO_DEPT ORDER BY 4 42 A More “Complicated” Example 43 Table Scan Example 44 Created Two Indexes CREATE INDEX "DB2ADMIN"."AABB" ON "DB2ADMIN"."DEPARTMENT" ("DEPTNO" ASC, "DEPTNAME" ASC, "MGRNO" ASC) PCTFREE 10 CLUSTER MINPCTUSED 10 ALLOW REVERSE SCANS; CREATE INDEX "DB2ADMIN"."CCDD" ON "DB2ADMIN"."EMPLOYEE" ("EMPNO" ASC, "FIRSTNME" ASC, "MIDINIT" ASC, "LASTNAME" ASC, "WORKDEPT" ASC, "PHONENO" ASC) PCTFREE 10 MINPCTUSED 10 ALLOW REVERSE SCANS; 45 Index Scan Example Index Scan on AABB index 46 Full Index Scan Index Scan of entire index then fetch from table 47 Index on: DEPTNAME,DEPTNO MGRNO Index Scan ROOT Level 0 Level 1 Level 2 Intermediate NODE 1 Leaf NODE L1.1 Intermediate NODE 2 Leaf NODE L1.300 Leaf NODE L2.1 Intermediate NODE 3 Leaf NODE L2.600 Leaf NODE L3.1 Intermediate NODE 4 Leaf NODE L3.900 Leaf NODE L4.1 Leaf NODE L4.1200 RIDS Base Table 48 Selectivity Catalog Queries • SELECT INDNAME, NPAGES, CARD, FIRSTKEYCARD AS FIRSTK, FIRST2KEYCARD AS F2KEY, FIRST3KEYCARD AS F3KEY, FIRST4KEYCARD AS F4KEY, FULLKEYCARD AS FULLKEY, NLEAF, NLEVELS AS NLEV, CLUSTERRATIO AS CR, CLUSTERFACTOR AS CF, UNIQUERULE AS U, T.COLCOUNT AS TBCOL, I.COLCOUNT AS IXCOL FROM SYSCAT.TABLES T, SYSCAT.INDEXES I WHERE T.TABSCHEMA = I.TABSCHEMA AND T.TABSCHEMA = ‘PGUNNING' AND T.TABNAME = I.TABNAME AND CARD >20000 ORDER BY CARD DESC, 1; 49 INDNAME NPAGES CARD FIRSTK F2KEY F3KEY F4KEY FULLKEY NLEAF NLEV PSAITEM_DST516893 5682718 3 9 21 7339 7339 6886 PSBITEM_DST516893 5682718 1207 -1 -1 -1 1207 7412 PS_ITEM_DST516893 5682718 3 14947 1313902 1313906 5682718 136107 PS_PENDING_DST 516882 5682569 3 146890 148686 380119 5682569 172775 PSAITEM_ACTIVITY 577065 2884167 3 2959 -1 -1 2959 3667 PSBITEM_ACTIVITY 577065 2884167 -1 -1 -1 -1 -1 -1 PS_ITEM_ACTIVITY 577065 2884167 3 14947 1313902 1313906 2884167 50034 PSAJRNL_LN119531 1433651 543 1865 1865 -1 1865 1755 PSBJRNL_LN119531 1433651 56152 60820 65131 65268 65268 2184 PSDJRNL_LN119531 1433651 12831 14473 116514 -1 116514 2538 PSFJRNL_LN119531 1433651 1318 3654 3705 65268 65270 2223 CF U TBCOL IXCOL 8.90E-01NLEVELS D 53 9.00E-01 D > 3 53 3 3 5 8.33E-01 U 5 9.18E-01 U 3 7.77E-01 D -1 -1.00E+00 D 4 7.15E-01 U 3 7.70E-01 D 3 9.86E-01 D 3 8.58E-01 D 3 9.68E-01 D 53 49 91 91 91 44 44 44 44 COLNAMES 5 +APPL_JRNL_ID+ 1 +PROCESS_INSTA 8 +BUSINESS_UNIT 10 +GROUP_BU+GROU 2 +BUSINESS_UNIT 4 +DEPOSIT_BU+DE 5 +BUSINESS_UNIT 3 +ACCOUNT+BUSIN 4 +JOURNAL_ID+JO 3 +PROCESS_INSTA 5 +JOURNAL_DATE+ 50 XBOOKING1 Selectivity = Number of Distinct Values / CARD 1229/389151 = .003 Meets our rule for selectivity < .10 XBOOKING2 Selectivity = Number of Distinct Values / CARD 111217/389151 = .285 Does not meet our rule for selectivity < .10 51 Data Specification • Specify the most restrictive predicates first • Select only those columns needed • Use business sense when developing reports for end users • They should not be so voluminous that the average end user will not be able to use them anyway • Haven’t we all seen these monster reports that consume lots of CPU and IO and never get looked at? 52 Restriction • Restrict before you join • Example: “Select * from acct, dept where acct.nbr = dept.acct_id and acct.loc = 5” • In this example, acct.loc = 5 is a restrictive expression • It will be applied before the join thus the number of rows joined decreases • With just the join expression, the number of rows 53 Fast Retrieval • OPTIMIZE FOR N ROWS CLAUSE • Can guide the optimizer to use an access path to quickly return N Rows • Also effects the size of the number of rows blocked in the communications buffer • Useful when the number of rows you want is significantly less than total number of rows that could be returned • Can slow performance if most of the rows are going to be processed 54 Fetch First • FETCH FIRST N ROWS ONLY CLAUSE • Used to restrict fetching to only N rows regardless of number of rows that there may have been in the result set if not specified • FOR FETCH ONLY CLAUSE • Use when no updates are planned • Query can take advantage of row blocking • Only S locks taken on rows retrieved • Improved concurrency 55 Rule #3 & 4 • • Specify most restrictive predicates first Select only those columns needed 56 Selectivity • • Selectivity of an index column indicates the number of rows that will satisfy the predicate condition Formula: • Selectivity = number of distinct values / number of rows in the table • Selectivity of predicates should be < .10, that is will return less than 10% of the table rows to the requesting application or to the intermediate result set if more than a two-way join 57 Index Design • Indexes should be created on local, order by, join predicates and foreign keys • Primary key unique index created by default • Many times the primary key will be used by the optimizer as the driving index due to uniqueness and selectivity • Frequently accessed columns with good selectivity • Avoid Redundant Indexes as they are typically not used or offer additional unnecessary choices to the optimizer • Number of Indexes • Determined by business rules • OLTP • 3 indexes • Fewer indexes offer fewer choices to the optimizer • Mixed (ERP/CRM/SCM) • 3-5 indexes • DW • 5 or more 58 Rule#5 • Adhere to proper index design techniques 59 DB2 Optimizer • • What inputs does the Optimizer consider/analyze during statement optimization? Important to know as some of these inputs can cause suboptimal access paths if not current • RUNSTATS not current • Buffer pool changes • Configuration parameter changes 60 Buffer pool size (npages) To determine how much of the buffer pool may be available for tables/indexes involved. SORTHEAP DB CFG parameter To determine if a piped sort can be used. LOCKLIST To determine amount of memory available for storing locks for this access plan. CPU Speed Speed of CPUs available. PREFETCHSIZE To determine I/O costs. Value of INTRA_PARALLEL DBM CFG Parameter To determine if parallelism may be used. Type of table space and number of containers To determine I/O costs and degree of I/O parallelism. SHEAPTHRES Determine maximum amount of shared SORTHEAP available. DISK Speed To estimate I/O costs. 61 Degree of clustering To determine effectiveness of prefetching and to determine how clustered data is. Indexes Available To determine if index access cost. DFT_DEGREE Default degree of parallelism. AVG_APPLS To determine amount of buffer pool space available for a query. MAXLOCKS Percent of LOCKLIST used by a single application before lock escalation occurs. LOCKLIST Size of memory area reserved for locks. DFT_QUERYOPT The default optimization class to be used. STMTHEAP Size can effect amount of optimization conducted. COMM_BANDWITH Used for partitioned databases. 62 REOPT Bind Option • • Can be used to enable query reoptimization for dynamic and static SQL that have host variables, parameter markers or special registers Can set the REOPT option to one of three values • None – No reoptimization will take place, the default behavior • Once – the access plan will use real values the first time and the plan will be cached in the package cache • Always – the access path will always be compiled and reoptimized using the values of the parameter markers, host variables, or special registers known at each execution time 63 Lock Wait Mode • • • Application can specify individual lock wait mode strategy Take one of the following actions when it cannot obtain a lock: • Return and SQLCODE or SQLSTATE • Wait indefinitely for a lock • Wait a specified amount of time for a lock • Use value of locktimeout DB CFG parameter SET CURRENT LOCK TIMEOUT statement • Specifies number of seconds to wait for a lock • Applies to row, table, index key, and MDC block locks 64 KEEP UPDATE LOCKS • • • A lock type can be specified for queries that perform updates Allows FOR UPDATE cursors to take advantage of row blocking RR or RS can be used when querying a read only results table • Allows positioned cursor updates to succeed 65 Rule#6 • Understand inputs to the DB2 Optimizer 66 DB2 Visual Explain • • • • • Use it as part of testing and development process Developers can use any type of DB2 explain or other SQL Analysis tool but it must be integrated into the development process DBAs also use all types of explain in support of application development testing and in fixing production problems Evaluate all SQL using Visual Explain/db2exfmt or some type of explain tool Monitor on a recurring basis 67 Rule #7 • Developers and DBAs collaborate to develop applications that perform when implemented in production 68 Design Advisor • • • DBAs work with developers using Design Advisor to evaluate individual SQL statements and workloads to identify possible index solutions, clustering indexes, MDC indexes, and MQT recommendations DBAs use the package cache option to look for high cost SQL in the package cache Best used as part of physical design process but use is ongoing 69 db2advis -d gunprd -I wildsortsql.txt > wildsqlixadvout.txt execution started at timestamp 2004-08-12-10.25.44.141157 found [1] SQL statements from the input file Calculating initial cost (without recommmended indexes) [23866.660156] timerons Initial set of proposed indexes is ready. Found maximum set of [1] recommended indexes Cost of workload with all indexes included [75.079346] timerons total disk space needed for initial set [ 4.747] MB total disk space constrained to [ -1.000] MB 1 indexes in current solution [23866.6602] timerons (without indexes) [ 75.0793] timerons (with current solution) [%99.69] improvement Trying variations of the solution set.--- execution finished at timestamp 2004-08-12-10.25.45.932376--- LIST OF RECOMMENDED INDEXES -- =========================== -- index[1], 4.747MB CREATE INDEX WIZ1 ON "PUSER "."T_FILE" ("FILE_STATUS_NUM" DESC) ; -- ===========================-Design Advisor tool is finished. 70 Rule #8 • • Evaluate all SQL using Visual Explain/db2exfmt or some type of explain tool Use Design Advisor to tune SQL statements and workloads 71 RUNSTATS • • • • • RUNSTATS should be run on a regular schedule After a reorg, change in prefetchsize, static SQL change, growth in data What is a regular schedule? Nightly or Weekly depending on many things • Data changes by 10% or more • Indexes changed or added Use SAMPLING in V8.1+, and the WITH DISTRIBUTION clause • • • If simple and straight forward queries and no skewed data then don’t use WITH DISTRIBUTION In database with all dynamic SQL like most ERP, CRM, SCM packages today, RUNSTATS may be needed nightly if data changes as noted above Most shops that get consistent performance schedule RUNSTATS either nightly or weekly 72 Monitoring • • If you don’t know how performance was, you can’t tell why it is suddenly bad Application developers can monitor the performance of applications under their control by: • • • • • Monitoring service levels Periodic explains in Production Checking with end users/help desk Querying the performance repository DBAs should implement and maintain a continuous monitoring program using snapshots, snapshot repository, and event monitors when needed 73 Monitoring • • • • Compute performance metrics on an hourly/daily basis and track and evaluate over time With such a system established, you will be able to answer such questions as “What was this SQL running like yesterday, last week, and last month?” Were there any database problems today? Why has this query suddenly gone from running in 30 seconds to 30 hrs? 74 Rule#10 • • • Implement a monitoring solution that provides both real-time and historical performance data Build canned reports to identify top 10 SQL statements, in terms of User CPU, System CPU, rows read, sorts, and sort time Implement a “closed loop” system where problems are tracked until they are resolved 75 76 SQL Coding Best Practices for Developers Session: G2 THANK YOU! Phil Gunning Gunning Technology Solutions, LLC pgunning@gunningts.com 77 Identify Use TOPAS, VMSTAT, NMON, PS to identify processes consuming CPU and IO resources PS command with correlation to list applications ouput TOPAS (PID correlated with list applications show detail command and application snapshot on the associated agentid) TOP on other platforms NMON 78 Identify DB2 Snapshots and Commands Step 1 -- Use TOPAS to see what db2 process is using high CPU Step 2 -- “list applications show detail” Produces a list of connected applications, their agentids and PIDS Step 3 – Issue “db2 get snapshot for application on <agentid>” 79 Identify Step 4 – Review the SQL contained in the application or dynamic sql snapshot and look for signs of trouble Number of Executions Hash Loops Rows Read vs Rows Selected Rows Read Sorts and sort overflows Logical and physical IO User CPU System CPU Locks Help Timerons 80 Convenience Views TOP_DYNAMIC_SQL 81 Convenience Views LONG_RUNNING_SQL 82 Convenience Views SNAPAPPL 83 Convenience Views SNAPAPPL_INFO 84 Convenience Views SNAPSTMT 85 Top 10 SQL Query with t (snap_ts, rows_read, num_exec, sys_time, usr_time, exec_time, n_rr, n_ne, n_st, n_ut, n_te, stmt_text) as ( select snapshot_timestamp, rows_read, num_ executions, total_sys_cpu_time, total_usr_cpu_time, total_exec_time , row_number() over (order by rows_read desc) , row_number() over (order by num_execution s desc) , row_number() over (order by total_sys_cpu_time desc) , row_number() over (order by total_usr_cpu_time desc) , row_number() over (order by total_exe c_time desc) , substr(stmt_text,1,300) from table(sysproc.snapshot_dyn_sql('fnprd',-1)) as t2 ) select * from t where n_rr < 11 or n_ne < 11 or n_st < 11 or n_ut < 11 or n_te < 11 86 Top 10 SQL Query Output NUMBER 3 TOPSQL ==========>>>>>> 2008-03-0613.08.19.941746 55531716 43249 0 1 30 3 56 51 51 23 SELECT A.DESCR FROM PS_ITM_CAT_TBL A WHERE A.SETID=? AND A.CATEGORY_CD=? AND A.EFFDT=(SELECT MAX(C.EFFDT) FROM PS_ITM_CAT_TBL C WHERE C.SETID=A.SETID AND C.CATEGORY_CD=A.CATEGORY_CD AND C.EFFDT<=?) 87 Dynamic SQL Snapshot Number of executions =3 Number of compilations =1 Worst preparation time (ms) =7 Best preparation time (ms) =7 Internal rows deleted =0 Internal rows inserted =0 Rows read = 6519642 Internal rows updated =0 Rows written =0 Statement sorts =0 Statement sort overflows =0 Total sort time =0 Buffer pool data logical reads = 521368 Buffer pool data physical reads = 0 88 Dynamic SQL Snapshot Total execution time (sec.ms) = 9.662979 Total user cpu time (sec.ms) = 9.328125 Total system cpu time (sec.ms) = 0.187500 Statement text = select client_acc.acc_num from client_acc,client where substr(client_acc.acc_num,1,2) != ? and substr(client_acc.acc_num,1,2) in ('SD','SF') and client_acc.client_id = client.client_id and UCASE(rtrim(client.email)) = (select UCASE(rtrim(client.email)) from client_acc,client where client_acc.client_id=client.client_id and client_acc.acc_num = ? ) 89 Total Cost: 383958 Query Degree: 1 Rows RETURN ( 1) Cost I/O | 2680.75 NLJOIN ( 2) 383958 165532 /------------------+------------------\ 1 2680.75 NLJOIN FILTER ( 3) ( 8) 57.0887 383901 5.00152 165527 /----------+---------\ | 1 1 67018.7 FETCH FETCH HSJOIN ( 4) ( 6) ( 9) 34.2397 34.2503 383845 3.00022 3.0013 165527 /---+---\ /---+---\ /-----+-----\ 1 1.08125e+006 1 1.0826e+006 1.0826e+006 67018.7 IXSCAN TABLE: DB2ADMIN IXSCAN TABLE: DB2ADMIN TBSCAN TBSCAN ( 5) CLIENT_ACC ( 7) CLIENT ( 10) ( 11) 22.8315 22.8299 253909 129811 2 2 108480 57047 | | | | 1.08125e+006 1.0826e+006 1.0826e+006 1.08125e+006 INDEX: DB2ADMIN INDEX: DB2ADMIN TABLE: DB2ADMIN TABLE: DB2ADMIN CLIENT_ACC_PK CLIENT_PK CLIENT CLIENT_ACC 90 SQL Rewrite select client_acc.acc_num from client_acc, client where substr(client_acc.acc_num,1,2) != ? and substr(client_acc.acc_num,1,2) in ('SD','SF') and client_acc.client_id = client.client_id and (client.client_id, UCASE(rtrim(client.email))) in (select y.client_id, UCASE(rtrim(client.email)) from client_acc x, client y where client_acc.client_id=client.client_id and client_acc.acc_num = ? ) ; 91 Total Cost: 79.9153 Query Degree: 1 After db2exfmt Rows RETURN ( 1) Cost I/O | 0.00384 NLJOIN ( 2) 79.9153 11.5685 /------------+------------\ 0.096 0.04 TBSCAN NLJOIN ( 3) ( 4) 0.000836767 79.9145 0 11.5685 | /----------+---------\ 1 0.04 1.08125e+006 TABFNC: SYSIBM NLJOIN IXSCAN GENROW ( 5) ( 10) 57.0895 9105.25 5.00152 2619.7 /----------+---------\ | 1 0.04 1.08125e+006 FETCH FETCH INDEX: DB2ADMIN ( 6) ( 8) CLIENT_ACC_IDX3 34.2397 34.2512 3.00022 3.0013 /---+---\ /---+---\ 1 1.08125e+006 1 1.0826e+006 IXSCAN TABLE: DB2ADMIN IXSCAN TABLE: DB2ADMIN ( 7) CLIENT_ACC ( 9) CLIENT 22.8315 22.8299 2 2 | | 1.08125e+006 1.0826e+006 INDEX: DB2ADMIN INDEX: DB2ADMIN CLIENT_ACC_PK CLIENT_PK 92 Budget Check 93 Budget Check • Two portions of this identified as Top CPU consumers using GTS provided script • Delete Statement • Update Statement • Correlated Sub-queries 94 Non-Correlated Subquery • • • • Runs independently of the outer query Executes one time No correlation between inner and outer query Can normally be rewritten as a join • Most optimizers more efficient with joins 95 Correlated Subquery • Inner query relies on outer query to provide input to inner query • Inner query is correlated to outer query by reference to outer table in the inner query • Order of operation is Outer, Inner, Outer • Non-correlated is only inner • For every row in the outer query the inner query is run • Number of rows read is outer x inner • Can result in high cpu with in-memory scans and billions of rows read 96 Correlated Subquery • Problem here in your environment is that some tables have grown to 1 -10 million rows • Correlated sub-queries do not scale well • Not easily re-written • Hard to optimize • Time-Admin and Budget Check 97 TOPSQL SCRIPT 2008-04-29-14.42.41.035178 28883292121 1 770 41630 43701 1 8662 2 1 2 UPDATE PS_BP_PST1_TAO13 SET KK_PROC_INSTANCE = 1834315+ 1000000000 WHERE PROCESS_INSTANCE=? AND NOT EX ISTS ( SELECT 'X' FROM PS_LEDGER_KK WHERE PS_LEDGER_KK.BUSINESS_UNIT = PS_BP_PST1_TAO13.BUSINESS_UNIT AND PS_LEDG ER_KK.LEDGER = PS_BP_PST1_TAO13.LEDGER AND PS_LEDGER_KK.ACCOUNT = PS_BP_PST1_TAO13.ACCOUNT AND PS_LEDGER_KK.DEPTI D = PS_BP_PST1_TAO13.DEPTID AND PS_LEDGER_KK.OPERATING_UNIT = PS_BP_PST1_TAO13.OPERATING_UNIT AND PS_LEDGER_KK.PR ODUCT = PS_BP_PST1_TAO13.PRODUCT AND PS_LEDGER_KK.FUND_CODE = PS_BP_PST1_TAO13.FUND_CODE AND PS_LEDGER_KK.CLASS_F LD = PS_BP_PST1_TAO13.CLASS_FLD AND PS_LEDGER_KK.PROGRAM_CODE = PS_BP_PST1_TAO13.PROGRAM_CODE AND PS_LEDGER_KK.BU DGET_REF = PS_BP_PST1_TAO13.BUDGE 2008-04-29-14.42.41.035178 573198673 4470 1517 8032 70807 2 151 1 2 1 SELECT A.PROCESS_INSTANCE, A.BUSINESS_UNIT, A.INVOICE, A.SEQ_NUM, A.BUSINESS_UNIT_PC, A.PROJECT_ID, A. LINE_SEQ_NUM, A.PRINT_SEQ_NBR, A.LANGUAGE_CD, A.PROJ_DESCR, A.NAME, A.ACTIVITY_DESCR, A.RES_TYPE_DESCR, A.RES_CAT _DESCR, A.RES_SUB_CAT_DESCR, MIN( A.CHARGE_FROM_DT), MAX( A.CHARGE_TO_DT), SUM( A.QTY), A.UNIT_OF_MEASURE, SUM( A .UNIT_AMT), SUM( A.GROSS_EXTENDED_AMT), SUM( A.NET_EXTENDED_AMT), A.VAT_TREATMENT, A.DESCR FROM PS_BI_EXTRCT_PROJ A WHERE A.INVOICE_FORM_ID = 'PS/PSA01' GROUP BY A.PROCESS_INSTANCE, A.BUSINESS_UNIT, A.INVOICE, A.SEQ_NUM, A.BUS INESS_UNIT_PC, A.PROJECT_ID, A.LINE_SEQ_NUM, A.PRINT_SEQ_NBR, A.LANGUAGE_CD, A.PROJ_DESCR, A.NAME, A.ACTIVITY_DES CR, A.RES_TYPE_DESCR, A.RES_CAT_D 2008-04-29-14.42.41.035178 145652212 472 5 365 1312 3 431 5 4 5 DELETE FROM PS_LEDGER_KK WHERE EXISTS ( SELECT 'X' FROM PS_BP_PST1_TAO17 WHERE PS_LEDGER_KK.BUSINESS_U NIT = PS_BP_PST1_TAO17.BUSINESS_UNIT AND PS_LEDGER_KK.LEDGER = PS_BP_PST1_TAO17.LEDGER AND PS_LEDGER_KK.ACCOUNT = PS_BP_PST1_TAO17.ACCOUNT AND PS_LEDGER_KK.DEPTID = PS_BP_PST1_TAO17.DEPTID AND PS_LEDGER_KK.OPERATING_UNIT = PS_ BP_PST1_TAO17.OPERATING_UNIT AND PS_LEDGER_KK.PRODUCT = PS_BP_PST1_TAO17.PRODUCT AND PS_LEDGER_KK.FUND_CODE = PS_ BP_PST1_TAO17.FUND_CODE AND PS_LEDGER_KK.CLASS_FLD = PS_BP_PST1_TAO17.CLASS_FLD AND PS_LEDGER_KK.PROGRAM_CODE =P S_BP_PST1_TAO17.PROGRAM_CODE AND PS_LEDGER_KK.BUDGET_REF = PS_BP_PST1_TAO17.BUDGET_REF AND PS_LEDGER_KK.AFFILIATE = PS_BP_PST1_TAO17.AFFILIATE AND 98 BUDGET CHECK DELETE 99 BUDGET CHECK DELETE DB2 temporary table sort Table scan of temp table…. 10 0 NEW BUDGET CHECK DELETE DELETE FROM accessfn.PS_LEDGER_KK ACCESSFN_PS_LEDGER_KK1 WHERE EXISTS (SELECT 'X' FROM accessfn.PS_LEDGER_KK ACCESSFN_PS_LEDGER_KK2, accessfn.PS_BP_PST1_TAO13 WHERE ACCESSFN_PS_LEDGER_KK2.BUSINESS_UNIT = accessfn.PS_BP_PST1_TAO13.BUSINESS_UNIT AND ACCESSFN_PS_LEDGER_KK2.LEDGER = accessfn.PS_BP_PST1_TAO13.LEDGER AND ACCESSFN_PS_LEDGER_KK2.ACCOUNT = accessfn.PS_BP_PST1_TAO13.ACCOUNT AND ACCESSFN_PS_LEDGER_KK2.DEPTID = accessfn.PS_BP_PST1_TAO13.DEPTID AND ACCESSFN_PS_LEDGER_KK2.OPERATING_UNIT = accessfn.PS_BP_PST1_TAO13.OPERATING_UNIT AND ACCESSFN_PS_LEDGER_KK2.PRODUCT = accessfn.PS_BP_PST1_TAO13.PRODUCT AND ACCESSFN_PS_LEDGER_KK2.FUND_CODE = accessfn.PS_BP_PST1_TAO13.FUND_CODE AND ACCESSFN_PS_LEDGER_KK2.CLASS_FLD = accessfn.PS_BP_PST1_TAO13.CLASS_FLD AND ACCESSFN_PS_LEDGER_KK2.PROGRAM_CODE = accessfn.PS_BP_PST1_TAO13.PROGRAM_CODE AND ACCESSFN_PS_LEDGER_KK2.BUDGET_REF = accessfn.PS_BP_PST1_TAO13.BUDGET_REF AND ACCESSFN_PS_LEDGER_KK2.AFFILIATE = accessfn.PS_BP_PST1_TAO13.AFFILIATE AND ACCESSFN_PS_LEDGER_KK2.AFFILIATE_INTRA1 = accessfn.PS_BP_PST1_TAO13.AFFILIATE_INTRA1 AND ACCESSFN_PS_LEDGER_KK2.AFFILIATE_INTRA2 = accessfn.PS_BP_PST1_TAO13.AFFILIATE_INTRA2 AND ACCESSFN_PS_LEDGER_KK2.CHARTFIELD1 = accessfn.PS_BP_PST1_TAO13.CHARTFIELD1 AND ACCESSFN_PS_LEDGER_KK2.CHARTFIELD2 = accessfn.PS_BP_PST1_TAO13.CHARTFIELD2 AND ACCESSFN_PS_LEDGER_KK2.CHARTFIELD3 = accessfn.PS_BP_PST1_TAO13.CHARTFIELD3 AND ACCESSFN_PS_LEDGER_KK2.BUSINESS_UNIT_PC = accessfn.PS_BP_PST1_TAO13.BUSINESS_UNIT_PC AND ACCESSFN_PS_LEDGER_KK2.PROJECT_ID = accessfn.PS_BP_PST1_TAO13.PROJECT_ID AND ACCESSFN_PS_LEDGER_KK2.ACTIVITY_ID = accessfn.PS_BP_PST1_TAO13.ACTIVITY_ID AND ACCESSFN_PS_LEDGER_KK2.RESOURCE_TYPE = accessfn.PS_BP_PST1_TAO13.RESOURCE_TYPE AND ACCESSFN_PS_LEDGER_KK2.BUDGET_PERIOD = accessfn.PS_BP_PST1_TAO13.BUDGET_PERIOD AND ACCESSFN_PS_LEDGER_KK2.CURRENCY_CD = accessfn.PS_BP_PST1_TAO13.CURRENCY_CD AND ACCESSFN_PS_LEDGER_KK2.STATISTICS_CODE = accessfn.PS_BP_PST1_TAO13.STATISTICS_CODE AND ACCESSFN_PS_LEDGER_KK2.FISCAL_YEAR = accessfn.PS_BP_PST1_TAO13.FISCAL_YEAR AND ACCESSFN_PS_LEDGER_KK2.ACCOUNTING_PERIOD = accessfn.PS_BP_PST1_TAO13.ACCOUNTING_PERIOD AND ACCESSFN_PS_LEDGER_KK2.KK_BUDG_TRANS_TYPE = accessfn.PS_BP_PST1_TAO13.KK_BUDG_TRANS_TYPE AND accessfn.PS_BP_PST1_TAO13.KK_PROC_INSTANCE = 1628469 + 1000000000 AND ACCESSFN_PS_LEDGER_KK2.DBXCONCATCOL = ACCESSFN_PS_LEDGER_KK1.DBXCONCATCOL AND ACCESSFN_PS_LEDGER_KK2.BUSINESS_UNIT = COALESCE(ACCESSFN_PS_LEDGER_KK2.BUSINESS_UNIT, ACCESSFN_PS_LEDGER_KK2.BUSINESS_UNIT) AND ACCESSFN_PS_LEDGER_KK2.PROGRAM_CODE = COALESCE(ACCESSFN_PS_LEDGER_KK2.PROGRAM_CODE, ACCESSFN_PS_LEDGER_KK2.PROGRAM_CODE) AND ACCESSFN_PS_LEDGER_KK2.BUDGET_PERIOD = COALESCE(ACCESSFN_PS_LEDGER_KK2.BUDGET_PERIOD, ACCESSFN_PS_LEDGER_KK2.BUDGET_PERIOD) AND ACCESSFN_PS_LEDGER_KK2.DBXCONCATCOL = COALESCE(ACCESSFN_PS_LEDGER_KK2.DBXCONCATCOL, ACCESSFN_PS_LEDGER_KK2.DBXCONCATCOL)) 10 1 NEW BUDGET CHECK DELETE 10 2 Budget Check -- UPDATE UPDATE accessfn.PS_BP_PST1_TAO16 SET KK_PROC_INSTANCE = 1628583+ 1000000000 WHERE PROCESS_INSTANCE= 1628583 AND not EXISTS ( SELECT 'X' FROM accessfn.PS_LEDGER_KK where PS_LEDGER_KK.BUSINESS_UNIT = PS_BP_PST1_TAO16.BUSINESS_UNIT AND PS_LEDGER_KK .LEDGER = PS_BP_PST1_TAO16.LEDGER AND PS_LEDGER_KK.ACCOUNT = PS_BP_PST1_TAO16.ACCOUNT AND PS_LEDGER_KK.DEPTID = PS_BP_PST1_TAO16.DEPTID AND PS_LEDGER_KK.OPERATING_UNIT = PS_BP_PST1_TAO16.OPERATING_UNIT AND PS_LEDGER_KK.PRODUCT = PS_BP_PST1_TAO16.PRODUCT AND PS_LEDGER_KK.FUND_CODE = PS_BP_PST1_TAO16.FUND_CODE AND PS_LEDGER_KK.CLASS_FLD = PS_BP_PST1_TAO16. CLASS_FLD AND PS_LEDGER_KK.PROGRAM_CODE = PS_BP_PST1_TAO16.PROGRAM_CODE AND PS_LEDGER_KK.BUDGET_REF = PS_BP_PST1_TAO16.BUDGET_REF AND PS_LEDGER_KK. AFFILIATE = PS_BP_PST1_TAO16.AFFILIATE AND PS_LEDGER_KK.AFFILIATE_INTRA1 = PS_BP_PST1_TAO16.AFFILIATE_INTRA1 AND PS_LEDGER_KK.AFFILIATE_INTRA2 = PS_BP_PST1_TAO16.AFFILIATE_INTRA2 AND PS_LEDGER_KK.CHARTFIELD1 = PS_BP_PST1_TAO16.CHARTFIELD1 AND PS_LEDGER_KK.CHARTFIELD2 = PS_BP_PST1_TAO16 .CHARTFIELD2 AND PS_LEDGER_KK.CHARTFIELD3 = PS_BP_PST1_TAO16.CHARTFIELD3 AND PS_LEDGER_KK.BUSINESS_UNIT_PC = PS_BP_PST1_TAO16.BUSINESS_UNIT_PC AND PS_LEDGER_KK.PROJECT_ID = PS_BP_PST1_TAO16.PROJECT_ID AND PS_LEDGER_KK. ACTIVITY_ID = PS_BP_PST1_TAO16.ACTIVITY_ID AND PS_LEDGER_KK.RESOURCE_TYPE = PS_BP_PST1_TAO16.RESOURCE_TYPE AND PS_LEDGER_KK.BUDGET_PERIOD = PS_BP_PST1_TAO16.BUDGET_PERIOD AND PS_LEDGER_KK.CURRENCY_CD = PS_BP_PST1_TAO16.CURRENCY_CD AND PS_LEDGER_KK.STATISTICS_CODE = PS_BP_PST1_TAO16.STATISTICS_CODE AND PS_LEDGER_KK.FISCAL_YEAR = PS_BP_PST1_TAO16.FISCAL_YEAR AND PS_LEDGER_KK.ACCOUNTING_PERIOD = PS_BP_PST1_TAO16.ACCOUNTING_PERIOD AND PS_LEDGER_KK.KK_BUDG_TRANS_TYPE = PS_BP_PST1_TAO16.KK_BUDG_TRANS_TYPE ) 10 3 Current Budget Check 10 4 Current Budget Check 10 5 Budget Check • Initial investigation revealed low cost in terms of timerons due to temp tables with 0 rows • DB2 develops access path at runtime based on current statistics, which indicated empty table • DB2 may not optimize access plan correctly because of this • Low timeron cost adds to difficulty in identifying and tuning this SQL 10 6 Budget Check • Other problems were PeopleSoft temporary tables, such as PS_BP_PST1_TAO16 were being scanned • Marked as volatile to prevent table scan • DB2 optimizer not using index in many cases due to table statistics • Choosing table scan because optimizer sees either empty table or few rows 10 7 Budget Check • What seems to be working is the following: • Mark temporary work tables as VOLATILE • Tells DB2 to use an index if possible • Do not run runstats on temporary tables, not even once, DB2 will then use intelligent defaults which favor index access • Consider nightly reorg of temporary tables – Still being evaluated, work in progress 10 8 UPDATE accessfn.PS_BP_PST1_TAO16 SET KK_PROC_INSTANCE = 1783296 + 1000000000 NEW BUDGET CHECK WHERE PROCESS_INSTANCE = 1787057 AND 0= (SELECT count(*) FROM accessfn.PS_LEDGER_KK WHERE accessfn.PS_LEDGER_KK.BUSINESS_UNIT = accessfn.PS_BP_PST1_TAO16.BUSINESS_UNIT AND accessfn.PS_LEDGER_KK.LEDGER = accessfn.PS_BP_PST1_TAO16.LEDGER AND accessfn.PS_LEDGER_KK.ACCOUNT = accessfn.PS_BP_PST1_TAO16.ACCOUNT AND accessfn.PS_LEDGER_KK.DEPTID = accessfn.PS_BP_PST1_TAO16.DEPTID AND accessfn.PS_LEDGER_KK.OPERATING_UNIT =accessfn.PS_BP_PST1_TAO16.OPERATING_UNIT AND accessfn.PS_LEDGER_KK.PRODUCT = accessfn.PS_BP_PST1_TAO16.PRODUCT AND accessfn.PS_LEDGER_KK.FUND_CODE = accessfn.PS_BP_PST1_TAO16.FUND_CODE AND accessfn.PS_LEDGER_KK.CLASS_FLD = accessfn.PS_BP_PST1_TAO16.CLASS_FLD AND accessfn.PS_LEDGER_KK.PROGRAM_CODE =accessfn.PS_BP_PST1_TAO16.PROGRAM_CODE AND accessfn.PS_LEDGER_KK.BUDGET_REF = accessfn.PS_BP_PST1_TAO16.BUDGET_REF AND accessfn.PS_LEDGER_KK.AFFILIATE = accessfn.PS_BP_PST1_TAO16.AFFILIATE AND accessfn.PS_LEDGER_KK.AFFILIATE_INTRA1 =accessfn.PS_BP_PST1_TAO16.AFFILIATE_INTRA1 AND accessfn.PS_LEDGER_KK.AFFILIATE_INTRA2 =accessfn.PS_BP_PST1_TAO16.AFFILIATE_INTRA2 AND accessfn.PS_LEDGER_KK.CHARTFIELD1 =accessfn.PS_BP_PST1_TAO16.CHARTFIELD1 AND accessfn.PS_LEDGER_KK.CHARTFIELD2 =accessfn.PS_BP_PST1_TAO16.CHARTFIELD2 AND accessfn.PS_LEDGER_KK.CHARTFIELD3 =accessfn.PS_BP_PST1_TAO16.CHARTFIELD3 AND accessfn.PS_LEDGER_KK.BUSINESS_UNIT_PC =accessfn.PS_BP_PST1_TAO16.BUSINESS_UNIT_PC AND accessfn.PS_LEDGER_KK.PROJECT_ID = accessfn.PS_BP_PST1_TAO16.PROJECT_ID AND accessfn.PS_LEDGER_KK.ACTIVITY_ID =accessfn.PS_BP_PST1_TAO16.ACTIVITY_ID AND accessfn.PS_LEDGER_KK.RESOURCE_TYPE =accessfn.PS_BP_PST1_TAO16.RESOURCE_TYPE AND accessfn.PS_LEDGER_KK.BUDGET_PERIOD =accessfn.PS_BP_PST1_TAO16.BUDGET_PERIOD AND accessfn.PS_LEDGER_KK.CURRENCY_CD =accessfn.PS_BP_PST1_TAO16.CURRENCY_CD AND accessfn.PS_LEDGER_KK.STATISTICS_CODE =accessfn.PS_BP_PST1_TAO16.STATISTICS_CODE AND accessfn.PS_LEDGER_KK.FISCAL_YEAR =accessfn.PS_BP_PST1_TAO16.FISCAL_YEAR AND accessfn.PS_LEDGER_KK.ACCOUNTING_PERIOD =accessfn.PS_BP_PST1_TAO16.ACCOUNTING_PERIOD AND accessfn.PS_LEDGER_KK.KK_BUDG_TRANS_TYPE =accessfn.PS_BP_PST1_TAO16.KK_BUDG_TRANS_TYPE) AND PROCESS_INSTANCE = PROCESS_PROCESS_INSTANCE 10 9 Correlated Subquery Alternatives • Left Outer Join • Problem here is joining many columns (almost half the table) and lack of good join column • Use of intermediate DB2 temporary table 11 0 Use of IN and EXISTS • EXISTS -- checks for first row that qualifies, this existence checking enables “early out” and can be seen in explain output • Doesn’t compare values so not subject to problem with nulls • IN -- used to determine whether a given value matches any of the values in a list, needs to examine all values, no early out 11 1 Suboptimal SQL (Date Function) SELECT acc_num, REAL_CHIPS, PROMO_CHIPS, CASH_AMT FROM DB2ADMIN.acc_balance WHERE date(timestamp) = '2007-1-01' ; Timeron Cost: 550,846 11 2 Suboptimal SQL (Date Function) Solution SELECT acc_num, REAL_CHIPS, PROMO_CHIPS, CASH_AMT FROM DB2ADMIN.acc_balance WHERE timestamp between '2007-01-0100.00.00' and '2007-01-01-23.59.59.999999' ; New Timeron Cost: 54,675 11 3 Common Table Expression High Cost WITH INFO AS (SELECT PLAYERNAME AS PLAYERNAME, ACC_NUM AS ACC_NUM,STAKE AS STAKE, PLAY_COUNT AS PLAY_COUNT,GAME_ID AS GAME_ID,NUM_QUALIFY AS NUM_QUALIFY , REAL_PRIZE_PAID AS REAL_PRIZE_PAID, REBUY_COUNT AS REBUY_COUNT, PROMO_PRIZE_PAID AS PROMO_PRIZE_PAID , RANK() OVER (ORDER BY STAKE DESC) AS PLAYER_RANK FROM db2admin.CT_PLAYER WHERE TOURNAMENT_ID = ? AND NUM_QUALIFY=0 UNION SELECT PLAYERNAME AS PLAYERNAME, ACC_NUM AS ACC_NUM,STAKE AS STAKE, PLAY_COUNT AS PLAY_COUNT, GAME_ID AS GAME_ID, NUM_QUALIFY AS NUM_QUALIFY , REAL_PRIZE_PAID AS REAL_PRIZE_PAID, REBUY_COUNT AS REBUY_COUNT, PROMO_PRIZE_PAID AS PROMO_PRIZE_PAID, 0 AS PLAYER_RANK FROM db2admin.CT_PLAYER WHERE TOURNAMENT_ID = ? AND ACC_NUM IN ('EH0144300844','GP0805174740', 'GP0280683162','SL0763326234','GP0806937257','SL0410586631', 'SL0871800961','GP0002320186','GP0006520691','SD0580234716', 'SL0919369066','SL0673693302','EH0131748166','HT0232729921', 'GP0550097653','GP0695261884','EP0939931413','EF0273763788', 'GP0035242171','GP0994999656','SL0237577932','EH0109845675')) SELECT * FROM INFO WHERE ACC_NUM IN ('EH0144300844','GP0805174740','GP0280683162','SL0763326234', 'GP0806937257','SL0410586631','SL0871800961','GP0002320186', 'GP0006520691','SD0580234716','SL0919369066','SL0673693302', 'EH0131748166','HT0232729921','GP0550097653','GP0695261884', 'EP0939931413','EF0273763788','GP0035242171','GP0994999656', 'SL0237577932','EH0109845675') OR PLAYER_RANK<=10 ORDER BY PLAYERNAME,PLAYER_RANK 11 4 Total Cost: 5105.9 Query Degree: Rows RETURN ( 1) Cost I/O | 818.703 FILTER ( 2) 5105.9 803.1 | 2046.76 TBSCAN ( 3) 5105.44 803.1 | 2046.76 SORT ( 4) 5105.34 803.1 | 2046.76 UNION ( 5) 5102 803.1 /-----------+-----------\ 1 22 2024.76 NLJOIN TBSCAN ( 6) ( 10) 1650.19 3451.58 66 737.1 /------+------\ | 22 1 2024.76 TBSCAN FETCH SORT ( 7) ( 8) ( 11) 0.000141703 75.0168 3451.48 0 3 737.1 | /---+--\ | 22 1 95556 2024.76 TABFNC: SYSIBM IXSCAN TABLE: DB2ADMIN FETCH GENROW ( 9) CT_PLAYER ( 12) 50.014 3450.64 2 737.1 | /---+---\ 95556 2582.59 95556 INDEX: DB2ADMIN RIDSCN TABLE: DB2ADMIN CT_PLAYER_PK ( 13) CT_PLAYER 664.375 26.5135 | 2582.59 SORT ( 14) 664.375 26.5135 | 2582.59 IXSCAN ( 15) 663.752 26.5135 | 95556 INDEX: DB2ADMIN CT_PLAYER_PK 11 5 More Suboptimal SQL Number of executions = 365257 Number of compilations =1 Worst preparation time (ms) = 40 Best preparation time (ms) =1 Internal rows deleted =0 Internal rows inserted =0 Rows read = 7022812035 Internal rows updated =0 Rows written = 3502999515 Statement sorts =0 Statement sort overflows = 243499 Total sort time =0 Buffer pool data logical reads = Not Collected Buffer pool data physical reads = Not Collected Buffer pool temporary data logical reads = Not Collected Buffer pool temporary data physical reads = Not Collected Buffer pool index logical reads = Not Collected Buffer pool index physical reads = Not Collected Buffer pool temporary index logical reads = Not Collected Buffer pool temporary index physical reads = Not Collected Total execution time (sec.ms) = 7728.817467 Total user cpu time (sec.ms) = 6918.140625 Total system cpu time (sec.ms) = 348.734375 Statement text = select sum(rebuy_count) from ct_player where tournament_id = ? 11 6 Access Plan: ----------- Total Cost: Query Degree: Rows RETURN ( 1) Cost I/O | 1 GRPBY ( 2) 3450.42 737.1 | 2582.59 FETCH ( 3) 3450.3 737.1 3450.42 1 /---+---\ 2582.59 95556 RIDSCN TABLE: DB2ADMIN ( 4) CT_PLAYER 664.375 26.5135 | 2582.59 SORT ( 5) 664.375 26.5135 | 2582.59 IXSCAN ( 6) 663.752 26.5135 | 95556 INDEX: DB2ADMIN CT_PLAYER_PK 11 7 CREATE TABLE "DB2ADMIN"."CT_PLAYER" ( "TOURNAMENT_ID" INTEGER NOT NULL , "ACC_NUM" CHAR(12) NOT NULL , "STAKE" DECIMAL(11,2) NOT NULL , "PLAY_COUNT" INTEGER NOT NULL , "FINAL_POSITION" INTEGER , "REAL_PRIZE_PAID" DECIMAL(11,2) , "PROMO_PRIZE_PAID" DECIMAL(11,2) , "LOCK" TIMESTAMP NOT NULL WITH DEFAULT CURRENT TIMESTAMP , "REBUY_COUNT" INTEGER NOT NULL WITH DEFAULT 0 , "REBUY" CHAR(1) NOT NULL WITH DEFAULT 'F' , "TOKEN_ID" VARCHAR(25) , "BUYIN_TYPE" CHAR(1) NOT NULL WITH DEFAULT 'R' , "GAME_ID" INTEGER NOT NULL WITH DEFAULT 0 , "NUM_QUALIFY" INTEGER WITH DEFAULT -1 , "PLAYERNAME" VARCHAR(20) , "UPDATE_TS" TIMESTAMP NOT NULL WITH DEFAULT ) IN "TSD_SIN" INDEX IN "TSI_SIN" ORGANIZE BY ( ( "TOURNAMENT_ID" ) ) Create DDL for MDC Table 11 8 MDC Table Created Fixed Two Costly SQL Statements WITH INFO AS (SELECT PLAYERNAME AS PLAYERNAME, ACC_NUM AS ACC_NUM,STAKE AS STAKE, PLAY_COUNT AS PLAY_COUNT,GAME_ID AS GAME_ID,NUM_QUALIFY AS NUM_QUALIFY , REAL_PRIZE_PAID AS REAL_PRIZE_PAID, REBUY_COUNT AS REBUY_COUNT, PROMO_PRIZE_PAID AS PROMO_PRIZE_PAID , RANK() OVER (ORDER BY STAKE DESC) AS PLAYER_RANK FROM db2admin.CT_PLAYER WHERE TOURNAMENT_ID = ? AND NUM_QUALIFY=0 UNION SELECT PLAYERNAME AS PLAYERNAME, ACC_NUM AS ACC_NUM,STAKE AS STAKE, PLAY_COUNT AS PLAY_COUNT, GAME_ID AS GAME_ID, NUM_QUALIFY AS NUM_QUALIFY , REAL_PRIZE_PAID AS REAL_PRIZE_PAID, REBUY_COUNT AS REBUY_COUNT, PROMO_PRIZE_PAID AS PROMO_PRIZE_PAID, 0 AS PLAYER_RANK FROM db2admin.CT_PLAYER WHERE TOURNAMENT_ID = ? AND ACC_NUM IN ('EH0144300844','GP0805174740', 'GP0280683162','SL0763326234','GP0806937257','SL0410586631', 'SL0871800961','GP0002320186','GP0006520691','SD0580234716', 'SL0919369066','SL0673693302','EH0131748166','HT0232729921', 'GP0550097653','GP0695261884','EP0939931413','EF0273763788', 'GP0035242171','GP0994999656','SL0237577932','EH0109845675')) SELECT * FROM INFO WHERE ACC_NUM IN ('EH0144300844','GP0805174740','GP0280683162','SL0763326234', 'GP0806937257','SL0410586631','SL0871800961','GP0002320186', 'GP0006520691','SD0580234716','SL0919369066','SL0673693302', 'EH0131748166','HT0232729921','GP0550097653','GP0695261884', 'EP0939931413','EF0273763788','GP0035242171','GP0994999656', 'SL0237577932','EH0109845675') OR PLAYER_RANK<=10 ORDER BY PLAYERNAME,PLAYER_RANK 11 9 Total Cost: 765.726 85.01% Query Degree: Improvement Rows RETURN ( 1) Cost I/O | 800.583 FILTER ( 2) 765.726 192 | 2001.46 TBSCAN ( 3) 765.274 192 | 2001.46 SORT ( 4) 765.179 192 | 2001.46 UNION ( 5) 761.921 192 Block Index Used! List prefetch eliminated! /-------------+-------------\ 22 1979.46 MSJOIN TBSCAN ( 6) ( 15) 381.049 380.646 96 96 /--------+-------\ | 2324.66 0.00946376 1979.46 TBSCAN FILTER SORT ( 7) ( 11) ( 16) 380.844 0.0078653 380.552 96 0 96 | | | 2324.66 22 1979.46 SORT TBSCAN FETCH ( 8) ( 12) ( 17) 380.844 0.0078653 379.727 96 0 96 | | /---+---\ 2324.66 22 2.45714 81363 FETCH SORT IXSCAN TABLE: DB2ADMIN ( 9) ( 13) ( 18) MDC_CT_PLAYER 380.141 0.00653574 2.95247 96 0 0 /---+---\ | | 2.45714 81363 22 81363 IXSCAN TABLE: DB2ADMIN TBSCAN INDEX: SYSIBM ( 10) MDC_CT_PLAYER ( 14) SQL0612201328275 2.95247 0.000141703 0 0 | | 81363 22 INDEX: SYSIBM TABFNC: SYSIBM SQL0612201328275 GENROW 12 0 Second Suboptimal Query MDC Solution select sum(rebuy_count) from ct_player where tournament_id =? 12 1 Original Statement: -----------------select sum(rebuy_count) from db2admin.mdc_ct_player where tournament_id = ? Access Plan: ----------Total Cost: Query Degree: Rows RETURN ( 1) Cost I/O | 1 GRPBY ( 2) 379.384 96 | 2324.66 FETCH ( 3) 379.274 96 /---+---\ 2.45714 81363 IXSCAN TABLE: DB2ADMIN ( 4) MDC_CT_PLAYER 2.95247 0 | 81363 INDEX: SYSIBM SQL0612201328275 89.01% Improvement 379.384 1 Cost of Query After MDC Table Created Block Index Used! List prefetch eliminated! 12 2 Another Suboptimal SQL Query Original SQL: IDENTIFIED via SELECT COUNT(*) Top 10 SQL Query FROM ACCOUNT_MACHINE, CLIENT_ACC WHERE ACCOUNT_MACHINE.ACC_NUM = CLIENT_ACC.ACC_NUM AND HEX(MACHINE_ID) =? AND CLIENT_ACC.CASINO_ID = ? 12 3 Visual Explain shows high cost of IXSCAN (no start-stop predicates) on ACCOUNT_MACHINE 12 4 Generated Column Solution Modifications to the ACCOUNT_MACHINE table: 1. SET INTEGRITY FOR DB2ADMIN.ACCOUNT_MACHINE OFF; 2. ALTER TABLE DB2ADMIN.ACCOUNT_MACHINE ADD COLUMN MACHINE_HEX_ID CHARACTER (127) NOT NULL GENERATED ALWAYS AS (HEX(MACHINE_ID)); 3. SET INTEGRITY FOR DB2ADMIN.ACCOUNT_MACHINE IMMEDIATE CHECKED FORCE GENERATED; 4. CREATE INDEX DB2ADMIN.XH_ACCOUNT_MACHINE ON DB2ADMIN.ACCOUNT_MACHINE (MACHINE_HEX_ID, ACC_NUM) ALLOW REVERSE SCANS; 12 5 Revised SQL -- Generated Column Solution Revised SQL SELECT COUNT(*) FROM ACCOUNT_MACHINE, CLIENT_ACC WHERE ACCOUNT_MACHINE.ACC_NUM = CLIENT_ACC.ACC_NUM AND MACHINE_HEX_ID =? AND CLIENT_ACC.CASINO_ID = ? 12 6 12 7 db2pd db2pd First available in DB2 V8.1 db2pd reads from memory and does not take any locks Low Overhead makes it a good choice for monitoring Data provided not the same as from snapshots and snapshots still needed for complete picture db2pd –tcbstats option shows update, delete and insert activity on tables Good way to back into suboptimal SQL by focusing on tables with high UDI activity db2pd continues to be enhanced in subsequent releases 12 8 -applications option Command: db2pd –db db2mon –applications Applications: Address AppHandl [nod-index] NumAgents CoorTid Status C-StmtUID L-AnchID L-StmtUID Appid 0x02B9B120 1086 [000-01086] 1 3148 UOW-Waiting 79 180 *LOCAL.DB2.050901041938 0x01B686D0 599 [000-00599] 1 1664 UOW-Waiting 84 216 *LOCAL.DB2.050901022803 0x0196FB80 591 [000-00591] 1 1984 UOW-Waiting 0 0 *LOCAL.DB2.050901022720 0x003FBF30 589 [000-00589] 1 2928 UOW-Waiting 0 0 *LOCAL.DB2.050901022718 0x01B6BD70 588 [000-00588] 1 2520 UOW-Waiting 0 0 *LOCAL.DB2.050901022717 0x003FB490 9 [000-00009] 1 2172 UOW-Waiting 0 *LOCAL.DB2.050830232506 0x0196F520 8 [000-00008] 1 2244 UOW-Waiting 0 *LOCAL.DB2.050830232351 C-AnchID 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 12 9 -activestatements option Command: db2pd –db db2mon –activestatements Database Partition 0 -- Database DB2MON -- Active -- Up 1 days 04:56:27 Active Statement List: Address AppHandl [nod-index] UOW-ID StartTime LastRefTime 0x02D279E0 1086 [000-01086] 3 2005 Thu Sep 01 00:19:54 2005 Snippet from –dynamic option Dynamic SQL Statements: Address AnchID StmtUID x03F2D9B0 79 180 sysibm.systables StmtID 1 NumEnv 0 AnchID StmtUID 79 NumVar 0 180 NumRef EffISO 1 EffLockTOut EffDegree -2 NumExe 1 1 0 Thu Sep 01 00:19:54 Text select * from 13 0 Putting it All Together with db2pd Locks: Address TranHdl Lockname 0x0485E3B0 3 0x0485C998 2 Type Mode Sts Owner 02000300270000000000000052 Row 02000300000000000000000054 Table Dur HldCnt Att .NS W 2 .IX G 2 ReleaseFlg 1 0 0x0000 0x00000001 1 0 0x0000 0x40000000 Transactions Address AppHandl [nod-index] TranHdl Locks State Tflag Tflag2 Firstlsn Lastlsn LogSpace SpaceReserved TID AxRegCnt GXID 0x045B1580 236 [000-00236] 2 8 WRITE 0x00000000 0x00000000 0x000003A9800C 0x000003AA27B6 234 572 0x0000000014DD 1 0 0x045B2000 425 [000-00425] 3 0x000000001716 1 0 0x045B2A80 339 [000-00339] 4 0 0x00000000168C 1 0 5 READ READ 0x00000000 0x00000000 0x000000000000 0x000000000000 0 0x00000000 0x00000000 0x000000000000 0x000000000000 0 0 0 Applications: Address AppHandl [nod-index] NumAgents CoorTid 0x01B19950 425 0x0090BE70 236 [000-00425] 1 [000-00236] 1 Status 3652 2452 C-AnchID C-StmtUID L-AnchID L-StmtUID Appid Lock-wait UOW-Waiting 123 0 1 0 0 0 149 1 *LOCAL.DB2.050920034220 *LOCAL.DB2.050920021447 Dynamic SQL Statements: Address AnchID StmtUID NumEnv NumVar NumRef NumExe Text 0x05E289C0 123 1 1 1 1 1 select * from staff 0x05E26FB0 129 1 1 1 1 1 SELECT SCHEMA, NAME, STATS_DETAIL, STATS_STATE, STATS_TIME FROM SYSTOOLS.HMON_ATM_INFO WHERE (STATS_STATE = 2 OR STATS_STATE = 6) AND STATS_FLAG = 'Y' 0x05E27E60 149 1 1 1 1 1 insert into department values('1b', 'Gunning','4', 'C00', NULL) 13 1 Summary In this presentation we have concentrated on Identifying suboptimal SQL Predicate coding rules presented Key convenience views, snapshots and Administrative routines were identified Tips and Techniques for identifying suboptimal SQL was presented A TOP 10 SQL Query was presented and discussed Suboptimal SQL examples were provided and solutions were identified and discussed Problems and solutions were based on actual problems in the field 13 2 Session G10 Identifying and Tuning Suboptimal SQL Philip K. Gunning Gunning Technology Solutions, LLC pgunning@gunningts.com 13 3