Deep Understanding of DB2 Snapshot Monitoring and Administrative Views
Sharad D. Pawar
ACI Payment system
Agenda
1. How to capture DB2 Snapshots
2. Database snapshot interpretations
3. Snapshot Analysis
4. DB2 Administrative views
How to Capture DB2 snapshots
Capture the snapshots for the scenario from production.
• Verify that the monitor switches are turned on:
db2 "get monitor switches"
db2 "get dbm cfg" | grep -i "DFT_MON"
• Sometimes a bounce of the database server will be required.
The database manager configuration for the default monitor switches is as follows:
Buffer pool    (DFT_MON_BUFPOOL)   = ON
Lock           (DFT_MON_LOCK)      = ON
Sort           (DFT_MON_SORT)      = ON
Statement      (DFT_MON_STMT)      = ON
Table          (DFT_MON_TABLE)     = ON
Timestamp      (DFT_MON_TIMESTAMP) = ON
Unit of work   (DFT_MON_UOW)       = ON
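The switch check can also be scripted. The following is a minimal Python sketch, assuming the output of `db2 "get dbm cfg"` has been saved as text with one `(DFT_MON_...) = VALUE` entry per line. The function name `switches_off` and the sample text are illustrative and not part of DB2.

```python
import re

def switches_off(dbm_cfg_text):
    """Return the DFT_MON_* switch names whose value is not ON."""
    pattern = re.compile(r"\((DFT_MON_\w+)\)\s*=\s*(\w+)")
    return [name for name, value in pattern.findall(dbm_cfg_text)
            if value.upper() != "ON"]

# Hypothetical excerpt of saved "get dbm cfg" output.
sample = """
 Buffer pool                         (DFT_MON_BUFPOOL) = ON
 Lock                                   (DFT_MON_LOCK) = OFF
 Sort                                   (DFT_MON_SORT) = ON
"""
print(switches_off(sample))  # ['DFT_MON_LOCK']
```

An empty list means every default monitor switch found in the text is ON.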
• Basic command to get the snapshots:
db2 "get snapshot for all on <DBNAME>" > outfile.txt
• General syntax:
GET SNAPSHOT FOR {DATABASE MANAGER | ALL [DCS] DATABASES |
ALL [DCS] APPLICATIONS | ALL BUFFERPOOLS | [DCS] APPLICATION
{APPLID appl-id | AGENTID appl-handle} | FCM FOR ALL DBPARTITIONNUMS |
LOCKS FOR APPLICATION {APPLID appl-id | AGENTID appl-handle} |
{ALL | [DCS] DATABASE | [DCS] APPLICATIONS | TABLES |
TABLESPACES | LOCKS | BUFFERPOOLS | DYNAMIC SQL [write to file]}
ON database-alias} [AT DBPARTITIONNUM db-partition-number | GLOBAL]
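Once a snapshot has been written to a file, its counters can be loaded for analysis. The sketch below is one possible approach in Python, assuming the snapshot text consists of `name = value` lines. The function name `parse_snapshot` is hypothetical. A full "for all" snapshot repeats some counter names across sections, so this sketch keeps only the first occurrence of each name.

```python
def parse_snapshot(text):
    """Collect 'name = value' lines into a dict, keeping the first occurrence."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        name, value = name.strip(), value.strip()
        if not sep or not name or name in counters:
            continue
        # Keep whole numbers as int; leave everything else as text.
        counters[name] = int(value) if value.isdigit() else value
    return counters

# Hypothetical excerpt of a database snapshot.
sample = """
Database name                              = SAMPLE
Commit statements attempted                = 900
Rollback statements attempted              = 100
Rows selected                              = 5000
"""
snap = parse_snapshot(sample)
print(snap["Rows selected"])  # 5000
```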
Snapshots can be collected for the following monitoring areas:
• Database
• Tables
• Tablespaces
• Locks
• Bufferpools
• Application
• Dynamic SQL
Database Snapshot Interpretations
Snapshots can be interpreted in the following ways:
• The general health of the database can be determined from the database snapshot.
• Based on the same counters, application-level behavior can be assessed as well.
Snapshot Analysis
The following metrics measure database performance:
• ARSS (Average Result Set Size)
• IRE (Index Read Efficiency)
• SRP (Synchronous Read Percentage)
• TXCNT (Number of Transactions Completed)
• SELTX (Number of Selects per Transaction)
• DMLTX (DML per Transaction)
• SRTX (Sorts per Transaction)
• SOPT (Sort Overflows per Transaction)
• RRTX (Rows Read per Transaction)
• FETTX (Rows Fetched per Transaction)
• BPLRTX (Bufferpool Logical Reads per Transaction)
• BPLITX (Bufferpool Logical Index Reads per Transaction)
ARSS (Average Result Set Size)
Rule of Thumb : If ARSS is less than or equal to 10, then the
database is behaving like an OLTP database. If the ARSS is
greater than 10, then your database is behaving like a Data
Warehouse database. If the ARSS is just a little bit greater
than 10, then you may have an OLTP database with some
concurrent decision support (DW) queries running.
How to Calculate ARSS :
ARSS = ROWS_SELECTED / SELECT_SQL_STMTS
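The ARSS formula and its rule of thumb can be sketched in Python as follows. The function names and the counter values are illustrative assumptions; the threshold of 10 comes from the rule of thumb above.

```python
def arss(rows_selected, select_stmts):
    """Average Result Set Size; None when no SELECTs were executed."""
    if select_stmts == 0:
        return None
    return rows_selected / select_stmts

def workload_type(value, threshold=10):
    """Rule of thumb: up to 10 looks like OLTP, above 10 like a Data Warehouse."""
    return "OLTP" if value <= threshold else "Data Warehouse"

value = arss(rows_selected=45000, select_stmts=9000)
print(value, workload_type(value))  # 5.0 OLTP
```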
IRE (Index Read Efficiency)
Rule of Thumb : An IRE of less than or equal to 10 is desirable for an OLTP database. If you have an OLTP database, an IRE higher than ten is cause for concern. It indicates that indexes providing sufficient filtration quality are not available. DB2 may be performing scans or using inefficient indexes as a poor substitute.
The IRE for this database was 180, meaning that DB2 picks up and evaluates 180 rows to return just one row on average. This is indicative of scans, and scans in an OLTP database are bad.
How to Calculate IRE :
IRE = Rows Read / Rows Selected
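As a quick illustration, the ratio can be computed in Python like this. The counter values are invented so that the result matches the figure of 180 quoted above; they are not taken from a real snapshot.

```python
def ire(rows_read, rows_selected):
    """Index Read Efficiency: rows read for every row returned."""
    if rows_selected == 0:
        return None
    return rows_read / rows_selected

# Hypothetical counters: 1,800,000 rows read to return 10,000 rows.
print(ire(rows_read=1_800_000, rows_selected=10_000))  # 180.0
```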
SRP (Synchronous Read Percentage)
SRP guidelines for OLTP databases
Rule of Thumb :
* The SRP should be greater than 90%, which means the database is making good use of high-quality synchronous I/O to retrieve the required result sets.
* If the SRP is in the range of 80-90%, this is good, but there may be tuning opportunities for improvement.
* If the SRP is in the range of 50-80%, the database's performance may be marginal at best. There are definitely physical design opportunities for improvement.
* If the SRP is less than 50%, this is highly undesirable. The DBA will have a lot of opportunity to improve performance.
How to Calculate SRP :
SRP = 100 - (((Asynchronous pool data page reads + Asynchronous pool index page reads) * 100) / (Buffer pool data physical reads + Buffer pool index physical reads))
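A minimal Python sketch of the SRP calculation follows. The parameter names are shortened forms of the snapshot counter names, and the sample values are invented for illustration.

```python
def srp(async_data_reads, async_index_reads,
        data_physical_reads, index_physical_reads):
    """Synchronous Read Percentage; None when there were no physical reads."""
    physical = data_physical_reads + index_physical_reads
    if physical == 0:
        return None
    asynchronous = async_data_reads + async_index_reads
    return 100 - (asynchronous * 100) / physical

# 200 of 2000 physical reads were asynchronous (prefetch).
print(srp(150, 50, 1500, 500))  # 90.0
```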
SRP guidelines for Data Warehouse (DW) databases
Rule of Thumb :
* If the SRP is greater than 50%, this is very good for a Data Warehouse, where queries tend to do a significant amount of data scanning to return larger result sets.
* If the SRP is in the range of 25-50%, this is good, but there may be tuning opportunities.
* If the SRP is anywhere less than 20%, the database has a lot of opportunities for tuning.
TXCNT (Number of Transactions Completed)
TXCNT is the sum of the commit statements and rollback statements attempted.
How to Calculate TXCNT :
TXCNT = Commit statements attempted + Rollback statements attempted
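In Python this could look like the sketch below, which assumes the snapshot counters are already available in a dictionary keyed by their snapshot labels. The dictionary and its values are hypothetical.

```python
def txcnt(counters):
    """Transactions completed = commits attempted + rollbacks attempted."""
    return (counters["Commit statements attempted"]
            + counters["Rollback statements attempted"])

# Hypothetical counters taken from a database snapshot.
snapshot = {
    "Commit statements attempted": 900,
    "Rollback statements attempted": 100,
}
print(txcnt(snapshot))  # 1000
```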
SELTX (Number of Selects per Transaction)
Rule of Thumb : For an OLTP database, the normal range is 3-15.
SELTX indicates how much data retrieval work is being done for each transaction. A value less than 10 is common and desirable.
How to Calculate SELTX :
SELTX = "Select SQL statements executed" / TXCNT
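A small Python sketch of SELTX with the OLTP range check is shown here. The helper names and sample counters are assumptions made for illustration.

```python
def seltx(select_stmts, commits, rollbacks):
    """Selects per transaction; None when no transactions completed."""
    tx = commits + rollbacks
    return select_stmts / tx if tx else None

def in_oltp_range(value, low=3, high=15):
    """Check the value against the 3-15 range quoted for OLTP databases."""
    return low <= value <= high

value = seltx(select_stmts=4000, commits=900, rollbacks=100)
print(value, in_oltp_range(value))  # 4.0 True
```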
DMLTX (DML per Transaction)
Rule of Thumb : The worldwide normal range is 0.5 to 4.
In this example, 3-4 Select statements are accompanied by 1-2 Insert/Update/Delete statements on average. This is good because the units of work are small.
DMLTX indicates how much data change activity is being performed for each transaction. A value less than 4 is common and desirable.
As DMLTX increases, so does the need to increase the DB CFG parameter LOGBUFSZ. The risk of lock contention also increases along with DMLTX.
How to Calculate DMLTX :
DMLTX = "Update/Insert/Delete statements executed" / TXCNT
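The same calculation can be sketched in Python. The function names and counter values are illustrative only; the 0.5 to 4 range is the one quoted above.

```python
def dmltx(uid_stmts, commits, rollbacks):
    """Update/Insert/Delete statements per transaction."""
    tx = commits + rollbacks
    return uid_stmts / tx if tx else None

def in_normal_range(value, low=0.5, high=4):
    """Check the value against the quoted normal range of 0.5 to 4."""
    return low <= value <= high

value = dmltx(uid_stmts=1500, commits=900, rollbacks=100)
print(value, in_normal_range(value))  # 1.5 True
```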
SRTX (Sorts per Transaction)
Rule of Thumb : Removing sorts from your transactions will measurably improve transaction response times AND lower CPU consumption.
How to Calculate SRTX :
SRTX = Total sorts / TXCNT
SOPT (Sort Overflows per Transaction)
Rule of Thumb : Sorts consume CPU cycles, so this value should be kept low for each transaction. Ideally, it should be less than 1 or 2.
How to Calculate SOPT :
SOPT = (Sort overflows * 100) / Total sorts
(As written, the formula gives the percentage of sorts that overflowed.)
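The two sort metrics, SRTX and SOPT, can be sketched together in Python. The function names mirror the metric abbreviations, and the sample counters are invented.

```python
def srtx(total_sorts, commits, rollbacks):
    """Sorts per transaction."""
    tx = commits + rollbacks
    return total_sorts / tx if tx else None

def sopt(sort_overflows, total_sorts):
    """Sort overflows as a percentage of total sorts, per the formula above."""
    return (sort_overflows * 100) / total_sorts if total_sorts else None

print(srtx(total_sorts=250, commits=900, rollbacks=100))  # 0.25
print(sopt(sort_overflows=5, total_sorts=250))            # 2.0
```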
RRTX (Rows Read per Transaction)
Rule of Thumb : Rows Read per Transaction should be less than 10. A higher number of rows read per transaction causes high CPU consumption on the database server.
How to Calculate RRTX :
RRTX = Rows Read / TXCNT
FETTX (Rows Fetched per Transaction)
Rule of Thumb : Rows Fetched per Transaction should be less than 10. A higher number of rows fetched per transaction causes high CPU consumption on the database server.
How to Calculate FETTX :
FETTX = Rows Selected / TXCNT, where TXCNT = Commit statements attempted + Rollback statements attempted
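RRTX and FETTX share the same denominator, so they are convenient to compute side by side. The Python sketch below uses a hypothetical helper `per_tx` and invented counter values.

```python
def per_tx(counter, commits, rollbacks):
    """Divide a snapshot counter by TXCNT (commits + rollbacks attempted)."""
    tx = commits + rollbacks
    return counter / tx if tx else None

def rrtx(rows_read, commits, rollbacks):
    """Rows Read per Transaction."""
    return per_tx(rows_read, commits, rollbacks)

def fettx(rows_selected, commits, rollbacks):
    """Rows Fetched per Transaction, based on the Rows Selected counter."""
    return per_tx(rows_selected, commits, rollbacks)

print(rrtx(8000, 900, 100))   # 8.0
print(fettx(5000, 900, 100))  # 5.0
```

Comparing the two values per transaction gives the same picture as IRE: the further RRTX is above FETTX, the more rows are read without being returned.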
BPLRTX (Bufferpool Logical Reads per Transaction)
This is a cost metric, and one of the best cost measurements for verifying your tuning success. Bufferpool logical page reads equate in direct proportion to CPU time consumed, so this value should be low.
Just remember: Logical Reads = CPU Consumption
When DB2 wants to access data (either index pages or data pages), a Logical Read is performed against the bufferpool. If the requested data is not present in the bufferpool, then a Physical Read must be performed to disk to retrieve the page that was logically requested. If the logically requested data is already in the bufferpool, then the physical disk read is avoided. So a request for data typically begins with a logical request, which may or may not result in a physical request.
Formula for calculation :
BPLRTX = (Buffer pool data logical reads + Buffer pool index logical reads) / (Commit statements attempted + Rollback statements attempted)
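The BPLRTX calculation can be sketched in Python as follows. The parameter names and sample values are illustrative assumptions.

```python
def bplrtx(data_logical_reads, index_logical_reads, commits, rollbacks):
    """Bufferpool logical reads (data + index) per transaction."""
    tx = commits + rollbacks
    if tx == 0:
        return None
    return (data_logical_reads + index_logical_reads) / tx

# 50,000 logical reads spread over 1,000 transactions.
print(bplrtx(40000, 10000, 900, 100))  # 50.0
```

Computing the value before and after a tuning change gives a simple way to compare the cost per transaction.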
Overall Read Time (ms) (ORMS)
ORMS tells us the average time for DB2 to complete a physical read. It should be computed for the database and for each tablespace. The DBA should compare the ORMS for the database against the ORMS for each tablespace.
If any tablespaces have read times significantly higher than the average for the database, then it is important to determine why and attempt to improve the performance of the slowest tablespaces.
In our example, ORMS is 0.
Overall Write Time (ms) (OWMS)
The average time to perform a physical write for this database was 1.84 ms. This is good. 97.29% of writes are being performed asynchronously. This is also good.
OWMS at the database level tells us the average time for the database to perform a write (any write, whether synchronous or asynchronous). OWMS at the tablespace level tells us the average write time for each tablespace. If the OWMS for a tablespace is significantly higher than the OWMS for the database, then you have likely uncovered a "performance opportunity for improvement".
This situation needs to be investigated. For the tablespaces with the slowest write times, carefully examine their definitions, containers, and placement of containers.
Bufferpool Write I/O ms per Transaction (BPWIOTX)
Bufferpool write time is just one important component of understanding where transaction time goes. We'll also need to look at direct I/O times, lock times, sort times, and CPU times.
Once we know where time is spent inside the database, we can focus on the resource that is the greatest bottleneck to optimized performance.
We'll also look at determining average transaction times, and how much time, and what percent of time, is spent inside the database and out.
BPWIOTX = cast(TOTBPWRITETM as decimal(18,0)) / cast((COMMITSTMTATTMPTD + ROLLBCKSTMTATTMPTD) as decimal(18,0))
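A rough Python equivalent of the BPWIOTX expression is sketched below. It uses `Decimal` to mirror the decimal casts in the formula; the parameter names and sample values are assumptions for illustration.

```python
from decimal import Decimal

def bpwiotx(total_bp_write_time_ms, commits, rollbacks):
    """Bufferpool write I/O milliseconds per transaction."""
    tx = Decimal(commits) + Decimal(rollbacks)
    if tx == 0:
        return None
    return Decimal(total_bp_write_time_ms) / tx

# 2,500 ms of bufferpool write time over 1,000 transactions.
print(bpwiotx(2500, 900, 100))  # 2.5
```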
Bufferpool Metrics
Database Bufferpool Index Hit Ratio (DB-BPIHR)
Database Bufferpool Overall Hit Ratio (DB-BPOHR)
The bufferpool index hit ratio was 98.58% and the overall bufferpool hit ratio was 86.52%. While these hit ratios look good, remember that bufferpool hit ratios can be very misleading (giving you a false sense of security and tuning success) when scans are occurring in the bufferpools.
To improve hit ratios, DBAs will commonly throw more and more memory at the bufferpools until they can't get the hit ratios to go any higher.
Various Bufferpool Metrics
• Buffer pool hit ratio per bufferpool (should be around 95% and above)
• Buffer pool data reads (logical + physical)
• Buffer pool index reads (logical + physical)
• Buffer pool total read time
• Buffer pool total write time
• Number of victim buffers available
• Direct reads/writes
• Time spent on direct reads/writes
Lock-related metrics
LCKMS - The Average Lock Wait Time
Not every lock times out. Some locks just experience temporary delays while they wait for required resources to become available.
The LCKMS formula will tell you the average lock wait time. LCKMS should not be greater than LOCKTIMEOUT, but it could be equal to LOCKTIMEOUT if ALL of your locks time out. Remember, too, that LOCKTIMEOUT is configured in seconds and this formula computes milliseconds (another one of the wonderful consistencies within DB2).
LCKTX - The Average Lock Wait Time per Transaction
LCKTX = Time database waited on locks (ms) / TXCNT
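The lock metrics can be sketched in Python as shown below. LCKTX follows the formula above. The LCKMS formula is not spelled out in the text, so the version here (total lock wait time divided by the number of lock waits) is an assumption, as are the function names and sample values. The helper `timeout_ms` only converts LOCKTIMEOUT from seconds to milliseconds so the two can be compared.

```python
def lcktx(lock_wait_ms, commits, rollbacks):
    """Average lock wait time (ms) per transaction."""
    tx = commits + rollbacks
    return lock_wait_ms / tx if tx else None

def lckms(lock_wait_ms, lock_waits):
    """Assumed formula: average wait (ms) = total lock wait time / lock waits."""
    return lock_wait_ms / lock_waits if lock_waits else None

def timeout_ms(locktimeout_seconds):
    """LOCKTIMEOUT is configured in seconds; convert it to milliseconds."""
    return locktimeout_seconds * 1000

print(lcktx(6000, 900, 100))                # 6.0
print(lckms(6000, 40))                      # 150.0
print(lckms(6000, 40) <= timeout_ms(30))    # True
```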
What the different lock metrics indicate:
• Number of locks held
• Total lock wait time
• Snapshot timestamp
• Application ID
• Application status
• Status change time
• Look for an "Application status =" value that indicates lock waiting.
Problem Analysis for the Query
• Number of rows read per execution
• Execution time per execution
• System CPU time per execution
• Rows returned per execution
• Look for buffer pool data reads (logical + physical)
• Look for buffer pool index reads (logical + physical)
• Look for temporary XML reads (logical + physical)
• Look for sort time / number of sorts / sort overflows
Catalog Cache Analysis :
The Catalog Cache Hit Ratio (CATHR)
CATHR = 100 - ( Catalog cache inserts * 100 / Catalog cache
lookups )
The Catalog Cache Hit Ratio should generally be at least 95%, and
most shops are able to achieve this rather easily.
If you find that your CATHR is less than 95%, then you will want to
increase DB CFG parameter CATALOGCACHE_SZ in gradual 5%
increments, or 16 4K pages, whichever is greater, until such time as
you successfully achieve the 95% goal.
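The hit ratio and the suggested sizing step can be sketched in Python. The function names and sample values are illustrative. The step size follows the guidance above (5% or 16 pages, whichever is greater) and is rounded up to whole pages, which is an assumption of this sketch.

```python
def cathr(cache_inserts, cache_lookups):
    """Catalog Cache Hit Ratio in percent; None when there were no lookups."""
    if cache_lookups == 0:
        return None
    return 100 - (cache_inserts * 100) / cache_lookups

def next_catalogcache_sz(current_pages):
    """Increase by 5% (rounded up) or by 16 4K pages, whichever is greater."""
    five_percent = (current_pages * 5 + 99) // 100  # integer ceiling of 5%
    return current_pages + max(five_percent, 16)

print(cathr(cache_inserts=30, cache_lookups=1000))  # 97.0
print(next_catalogcache_sz(260))                    # 276
print(next_catalogcache_sz(1000))                   # 1050
```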
What the different application metrics indicate:
• Application status
• Status change time
• Application idle time (for example, 2 minutes 22 seconds)
• Look for the lock section in the transaction
• Locks held by application
• Lock waits since connect
• Time application waited on locks (ms)
• Deadlocks detected
• Lock escalations
• Exclusive lock escalations
• Number of lock timeouts since connected
• Total time UOW waited on locks (ms)
• Sort-related counters (total sorts / total sort time / overflows)
• Rows deleted / inserted / updated
• Rows selected (if this value is high, the application is reading too much data)
• Rows written
DB2 Administrative View
• The system-defined routines and views provide a primary, easy-to-use programmatic interface to administer and use DB2® through SQL. They encompass a collection of built-in views, table functions, procedures, and scalar functions for performing a variety of DB2 tasks.
• Refer to http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.sql.rtn.doc/doc/c0022652.html
• SYSIBMADM (administrative views) and SYSPROC (table functions)
db2 "list tables for schema sysibmadm" | grep -i "mon_"
• MON_DB_SUMMARY
• MON_GET_APPL_LOCKWAIT
• MON_GET_BUFFERPOOL
• MON_GET_INDEX
• MON_GET_LOCKS
• MON_GET_TABLE
• MON_GET_TABLESPACE
• MON_GET_PKG_CACHE_STMT
Summary
• How to use database-level metrics to assess the overall performance of the system
• How to relate the same metrics to the administrative views
Questions