Optimizing Database Performance: The Example of Tuning an Oracle Database

By Willard Baird, Progress Telecom
wwbaird@tampabay.rr.com

Tuning a database means using computing resources so that database transactions are processed efficiently and effectively. The tuning process determines the amount and kind of activity that a given database will generate. This activity tends to be specific to each database type and, to a lesser extent, to each individual database. For example, a transaction-oriented database uses different computing resources from a database that supports data warehouse queries. The resources to be tuned span the entire computing architecture, especially the network, the application server, and the database server. CPU, memory, and disk devices must also be used efficiently. We use the Oracle DBMS as an example because it has many ‘knobs’ that can be adjusted to make the most of operating system resources.

The tuning process consists of collecting and analyzing performance statistics for the current database computing environment and then making any necessary changes to the database or the computing architecture. Historically, tuning has been a mix of manual and automated activity and has been mostly reactive in nature. Recent tuning support and products have enabled DBAs to take a more proactive approach to performance tuning.

Consider the following scenario: An end user experiences a slowdown on a database system and calls the help desk to complain about response time. The help desk calls the database administrator (DBA). The DBA gathers statistics related to the problem and analyzes them. As the DBA identifies the cause or causes of the slowdown, the necessary changes are made to the database. If the issue is not database-centric, the DBA involves other information technology professionals.
When a database application slows down, all parts of the application architecture must be reviewed to determine the cause. The reasons for a database application slowdown may be as varied as the reasons for a highway traffic jam. After the cause of a traffic jam is removed, traffic resumes normal speed.

The Oracle DBMS aids the DBA in performance tuning by collecting performance statistics that are exposed in views whose names begin with V$ (for example, V$SYSSTAT). The V$ views contain dynamic tuning statistics and are therefore reactive in nature. When a DBA encounters a performance problem, these onboard statistics can be analyzed, but they show the DBA only the current state of the database. While they are often helpful in determining the cause of a performance problem, they do not support trend analysis of performance over time. Trend analysis may well reveal causes of slowdowns that were not apparent from the snapshot data provided by the V$ views. And if a database has already returned to its normal processing speed by the time the DBA starts looking for the problem, determining its cause is a challenge.

A proactive tuning tool called STATSPACK was introduced with Oracle 8i. With it, Oracle DBAs collect statistics from in-memory structures and store the data in STATSPACK tables. STATSPACK collects data at time intervals set by the DBA. This data is used for trend analysis, allowing the DBA to develop a capacity plan that ensures the Oracle databases have enough operating system resources to meet the response time requirements of the end user. Figure 1 contains an excerpt from a forty-nine-page STATSPACK report showing Buffer Pool Statistics.
Buffer Pool Statistics for DB: TEST-DB  Instance: TEST-DB  Snaps: 28 -29
-> Standard block size Pools  D: default, K: keep, R: recycle
-> Default Pools for other block sizes: 2k, 4k, 8k, 16k, 32k

                                                           Free    Write Buffer
     Number of Cache      Buffer    Physical   Physical  Buffer Complete   Busy
P      Buffers Hit %        Gets       Reads     Writes   Waits    Waits  Waits
--- ---------- ----- ----------- ----------- ---------- ------- -------- ------
D       28,028  96.3     926,754      33,911        774       0        0      1
-------------------------------------------------------------

Figure 1: Sample Buffer Pool Statistics

TEST-DB has 28,028 buffers of 8 KB each assigned to the data buffer cache. This is operating system memory that is being used by TEST-DB and is configured with the DB_CACHE_SIZE parameter. The Cache Hit % is the percentage of requests for which the data was found in memory. When a SELECT statement is processed, the DBMS looks in the buffer cache first; if the data is not found there, it retrieves the data from disk, causing a physical read. If the Cache Hit % is less than ninety percent, further investigation is needed. It is never a good idea simply to increase the number of data buffers, and with it the size of the buffer cache, without knowing why the Cache Hit % fell below an acceptable level. By analyzing the STATSPACK report, the DBA determines whether more memory buffers are needed or whether a SQL statement needs to be tuned to reduce the number of I/Os it generates.

Determining how many buffers to allocate, and whether to increase or decrease that number, is an important part of tuning an Oracle instance. STATSPACK has a Buffer Pool Advisory section that shows the relationship between the number of buffers in the buffer cache and the estimated physical read factor. For example, TEST-DB currently has 28,028 buffers allocated to it, making the buffer cache 224 megabytes, with a read factor of one and 4,844,540 physical reads (see Figure 2).
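Returning to Figure 1, the cache hit percentage can be reproduced from the Buffer Gets and Physical Reads columns. The sketch below assumes the figure is computed as the share of buffer gets that did not require a physical read; this formula matches the numbers in Figure 1 but is stated here as an assumption rather than taken from the report itself. The function name and the `needs_review` flag are illustrative only, and the ninety percent threshold is the one given in the text.

```python
def cache_hit_pct(buffer_gets: int, physical_reads: int) -> float:
    """Percentage of buffer gets satisfied from memory.

    Assumed formula: 100 * (1 - physical_reads / buffer_gets).
    """
    if buffer_gets <= 0:
        raise ValueError("buffer_gets must be positive")
    return 100.0 * (1.0 - physical_reads / buffer_gets)


# Values taken from the default (D) pool row in Figure 1.
hit = cache_hit_pct(buffer_gets=926_754, physical_reads=33_911)

# Threshold from the text: below ninety percent calls for investigation.
needs_review = hit < 90.0

print(f"Cache hit %: {hit:.1f}")       # Cache hit %: 96.3
print(f"Needs review: {needs_review}")  # Needs review: False
```

A value below the threshold is only a prompt to investigate; as noted above, it does not by itself say whether the remedy is more buffers or a tuned SQL statement.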
Physical reads can be reduced by over a million (from 4,844,540 to an estimated 3,608,977) if the number of buffers is increased from 28,028 to 36,036. This reduction will require an additional sixty-four megabytes of operating system memory, but it may be well worth the investment of system resources.

Buffer Pool Advisory for DB: TEST-DB  Instance: TEST-DB  End Snap: 29
-> Only rows with estimated physical reads >0 are displayed
-> ordered by Block Size, Buffers For Estimate (default block size first)

        Size for  Size      Buffers for  Est Physical          Estimated
P   Estimate (M) Factr         Estimate   Read Factor     Physical Reads
--- ------------ ----- ---------------- ------------- ------------------
D             16    .1            2,002          4.96         24,033,162
D             32    .1            4,004          3.38         16,378,836
D             48    .2            6,006          2.77         13,438,409
D             64    .3            8,008          2.45         11,862,445
D             80    .4           10,010          2.24         10,866,859
D             96    .4           12,012          2.12         10,258,807
D            112    .5           14,014          2.04          9,903,576
D            128    .6           16,016          2.00          9,670,701
D            144    .6           18,018          1.89          9,133,869
D            160    .7           20,020          1.48          7,184,155
D            176    .8           22,022          1.24          6,026,566
D            192    .9           24,024          1.14          5,520,784
D            208    .9           26,026          1.07          5,179,825
D            224   1.0           28,028          1.00          4,844,540
D            240   1.1           30,030          0.94          4,577,377
D            256   1.1           32,032          0.90          4,352,578
D            272   1.2           34,034          0.84          4,085,693
D            288   1.3           36,036          0.75          3,608,977
D            304   1.4           38,038          0.71          3,422,851
D            320   1.4           40,040          0.69          3,327,195
-------------------------------------------------------------

Figure 2: A Buffer Pool Advisory for the TEST-DB

Response time is defined as the total amount of time a transaction consumes:

    Response Time = Service Time + Wait Time

where service time is the time a transaction spends executing its function, and wait time is the time the transaction spends waiting to execute. Thus, a transaction may execute very efficiently from a database perspective and still give the end user a very slow response time. A slow response time may be due to the wait time the transaction encountered and not to the actual service time of the transaction within the database.
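The response time relationship can be illustrated with a small sketch. All of the timings below are hypothetical values chosen for illustration, as are the class name and the wait categories; nothing here is measured from TEST-DB. The point is only that a transaction with a short service time can still show a long response time when waits dominate.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TransactionTiming:
    """Timing breakdown for one transaction, in milliseconds (hypothetical)."""
    service_ms: int
    waits_ms: Dict[str, int] = field(default_factory=dict)

    @property
    def wait_ms(self) -> int:
        # Total wait time is the sum of all individual wait components.
        return sum(self.waits_ms.values())

    @property
    def response_ms(self) -> int:
        # Response Time = Service Time + Wait Time
        return self.service_ms + self.wait_ms

    @property
    def wait_share(self) -> float:
        # Fraction of the response time spent waiting rather than executing.
        return self.wait_ms / self.response_ms if self.response_ms else 0.0


# Illustrative numbers only: fast execution, slow overall response.
txn = TransactionTiming(
    service_ms=400,
    waits_ms={"network": 300, "cpu_queue": 200, "disk_io": 1100},
)

print(f"Response time: {txn.response_ms} ms")  # Response time: 2000 ms
print(f"Wait share: {txn.wait_share:.0%}")     # Wait share: 80%
```

In this made-up case, tuning the SQL itself could recover at most the 400 ms of service time, while the remaining 1,600 ms lies outside the transaction's execution.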
Long wait times may be related to network traffic, an overloaded CPU, or disk I/O contention. If the transaction originates from an application server, the operating resources of the application server must also be factored into the overall tuning equation.

DBAs are expected to manage the computing resources that are used by the database. This takes us back to the tuning process of gathering data about the way the database is currently consuming resources. The data must then be analyzed, and if an opportunity for tuning exists, the tuning solution must be implemented.

To give an idea of current tuning assistance, we present the example of Oracle 10g. With Oracle 10g the tuning process has become a “Performance Diagnosis” process. This process is based on the Oracle 10g Intelligent Self-Management Infrastructure, which is part of the database kernel. The goal of the Self-Management Infrastructure is to help the database learn about its operational environment and make dynamic adjustments to create an optimal database environment. The database attempts to take corrective action on behalf of a slow-running transaction, taking into account the computing resources currently in use when deciding on the best remedy.

The key to the database being self-learning and self-managing is the Automatic Workload Repository (AWR). By default, the 10g database takes a snapshot of the current workload and statistics every hour and stores it in AWR. AWR keeps seven days' worth of data, but the DBA can change the defaults to retain more or less data and to modify the snapshot interval. STATSPACK and AWR provide information at the system level but not at the transaction or session level. Detailed session information is provided to the DBA by Active Session History (ASH).
The 10g database samples all active sessions and puts the samples into a circular buffer called the ASH buffers. Samples from the ASH buffers are written to AWR. Because only samples are written to AWR, the overhead of ASH is minimal.

The Automatic Database Diagnostic Monitor (ADDM) is built into the database kernel and uses the data collected in AWR to perform a diagnostic check on the database to ensure that it is running in an optimal fashion. ADDM identifies potential tuning issues and proposes solutions based on the statistics gathered in AWR. This means that the corrective action is tailored to the problem identified in the AWR statistics, without the DBA having to perform the analysis step of the tuning process. ADDM uses a classification tree designed to identify the root cause of a transaction slowdown, rather than only reviewing database performance statistics.

DBAs must still perform performance triage when a database experiences a system-wide slowdown or when a critical transaction does not complete in a timely manner. To support this type of reactive tuning, all of the statistics in AWR are integrated into Oracle's Enterprise Manager (EM) Database (DB) Control interface. EM DB Control is a powerful tool that the DBA can use to drill down into the performance data. Numerous advisors (wizards) help in understanding the cause of a database transaction slowdown.

In summary, the DBA's goal is to provide the end user with a database environment that meets or exceeds business processing requirements. The tuning process consists of both reactive and proactive elements, and the DBA must know how to use the tools for each effectively so that the end user can accomplish business activity in a timely manner. Current developments in DBMS packages are allowing DBAs to take more proactive tuning steps to keep their databases performing to expectations.