Tuning Oracle PL/SQL

advertisement
Tuning Oracle PL/SQL
PL/SQL - Oracle's procedural extensions to the SQL language - can be used to implement
complex application or data retrieval logic which cannot easily be expressed in single
SQL statements. PL/SQL plays a big part in many applications - particularly client-server
applications where PL/SQL can be used to implement application logic within the
database server itself.
PL/SQL has some advantages over standard SQL:




It can be stored in compiled form within the database. This reduces the overhead
of preparing an SQL statement for execution (parsing).
It operates completely within the server environment, eliminating or reducing
client-server and network traffic.
It allows the programmer to fully specify the processing logic and to implement
data access paths which might not be possible using standard SQL.
Using PL/SQL you can write triggers, which fire automatically when database
changes are made, and user-defined functions which can be embedded in your
SQL code.
Having decided to implement some processing in PL/SQL, whether as a replacement for
standard SQL or to implement some complex application logic, we have some
opportunities for improving the performance of that PL/SQL:



The procedural constructs in PL/SQL (loops, branches, etc.) are subject to many
of the same optimization techniques appropriate for other procedural languages
Storing PL/SQL routines in the database as a stored program reduces the overhead
of parsing PL/SQL.
There are some PL/SQL specific techniques which can be used to improve
database access (for instance, explicit cursors, CURRENT OF CURSOR, PL/SQL
tables).
Code Optimization
Usually, we think of PL/SQL as a database access language and concentrate on
optimizing the SQL within the PL/SQL program. But as a procedural language, PL/SQL
is subject to many of the same principles of optimization as other languages. There are
circumstances in which PL/SQL itself - without any database accesses - can consume
CPU resources at an excessive rate. Some of the basic principles for optimizing PL/SQL
code (and indeed other languages) are:



Optimize or minimize loop processing.
Optimize conditional statements (such as IF).
Avoid recursion.
Optimize loop processing
The loop - end loop clauses are used to execute statements within the loop repeatedly. A
badly constructed loop can have a drastic effect on performance. Two important
principles in the optimization of loops are:

Try to minimize the number of iterations in the loop. Each iteration will consume
CPU, so if you are finished in the loop, use the EXIT statement to move out of the
loop.
Make sure that there are no statements inside the loop which could be located
outside of the loop. If a statement doesn't reference a loop variable it's possible it
could execute outside the loop - and perhaps execute only once, rather than many
times.

The following code fragment illustrates a poorly designed loop:
FOR counter1 in 1..500 LOOP
FOR counter2 in 1..500 LOOP
modcounter1:=MOD(counter1,10);
-- will execute 250,000
times
modcounter2:=MOD(counter2,10);
sqrt1:=SQRT(counter1);
outside of this loop
sqrt2:=SQRT(counter2);
IF modcounter1=0 THEN
-- Could be located
IF modcounter2=0 THEN
-- DO something with sqrts
END IF;
END IF;
END LOOP;
END LOOP;
Some of the problems with this piece of code are:


Although we only want to process numbers divisible by 10 (the mod function
specifies that only multiples of 10 get processed) we are actually looping though
every value from 1 to 500. The FOR statement in PL/SQL doesn't allow you to
iterate in multiples of 10, so it's pretty easy to make this mistake. However, by
using FOR, we've actually executed the statements inside the each loop 10 times
more often than necessary.
The SQRT and MOD functions for counter1 are included within the second
(inner) loop. This means that they are executed once for every iteration of the
inner loop - even though the value of counter1 does not change. As a result they
must be executed 500 times more often than necessary.
The following code fragment implements the same logic as the previous example, but
with optimized loop processing:
WHILE counter1 = 500 LOOP
sqrt1:=SQRT(counter1);
-- Executed 50 times
WHILE counter2 =500 LOOP
sqrt2:=SQRT(counter2);
-- DO something....
counter2:=counter2+10;
-- increment by 10's
END LOOP;
counter1:=counter1+10;
END LOOP;
In this example, we use the WHILE clause and manually increment the loop counter by
10. Consequently we execute the inner loop only 50 times, rather than 500. The modulus
calculations (mod function) are no longer necessary and the Square root calculation on
counter1 (SQRT function) has been moved out of the inner loop to the outer loop,
reducing executions of this from 250,000 executions in the first example to only 50
executions in this example.
The second example greatly outperforms the first example. The first example took 111
seconds to execute - almost two minutes. The second example takes only .02 seconds to
execute - almost instantaneous! So we can see that optimizing loop processing can lead to
large improvements.
Figure 1 Elapsed time for a PL/SQL block with de-optimized loops and for an optimized
equivalent.
When possible, reduce the number of iterations of a loop. Each loop consumes CPU, so
exit the loop if there is no need to continue. Also reduce processing within the loop by
moving statements outside of the loop if possible.
Optimize IF statements
When processing an IF statement with multiple conditions, PL/SQL will consider each
condition specified in turn until a match is found. Once the match is found, PL/SQL
doesn't have to evaluate any of the subsequent conditions. It therefore follows that if the
most likely condition comes first in the IF block, then the average amount of processing
that PL/SQL will have to do is reduced.
The following code fragment illustrates a de-optimized IF statement. The first condition
(counter<10) will be true only 9 times out of 10,000 loops. The final else condition will
be true for 9,910 of the 10,000 loops, but the preceding 9 comparisons will be needlessly
evaluated for each of these iterations.
FOR counter1 in 1..10000 LOOP
IF counter1 10 THEN
-- Do some processing;
ELSIF counter1 20 THEN
-- Do some processing;
ELSIF counter1 30 THEN
-- Do some processing;
ELSIF counter1 40 THEN
-- Do some processing;
ELSIF counter1 50 THEN
-- Do some processing;
ELSIF counter1 60 THEN
-- Do some processing;
ELSIF counter1 70 THEN
-- Do some processing;
ELSIF counter1 80 THEN
-- Do some processing;
ELSIF counter1 90 THEN
-- Do some processing;
ELSE
-- counter above 91
-- Do some processing;
END IF;
END LOOP;
The next example shows the IF block optimized. Now the most commonly satisfied
expression is first in the IF structure. For most iterations, this first evaluation is the only
one that needs to be performed:
FOR counter1 in 1..10000 LOOP
IF counter1 >90 THEN
-- Do some processing;
ELSIF counter1 10 THEN
-- Do some processing;
ELSIF counter1 20 THEN
-- Do some processing;
ELSIF counter1 30 THEN
-- Do some processing;
ELSIF counter1 40 THEN
-- Do some processing;
ELSIF counter1 50 THEN
-- Do some processing;
ELSIF counter1 60 THEN
-- Do some processing;
ELSIF counter1 70 THEN
-- Do some processing;
ELSIF counter1 80 THEN
-- Do some processing;
ELSIF counter1 90 THEN
-- Do some processing;
END IF;
END LOOP;
Optimizing the IF statement reduced execution time from 1.22 seconds to .21 seconds.
Figure 2 Execution times for optimized and un-optimized IF statements.
When constructing a PL/SQL IF block with multiple conditions, place the conditions in
decreasing order of probability. Make sure that the condition most likely to be satisfied is
the first condition tested in the block.
Recursion
A recursive routine is one that invokes itself. Recursive routines often offer elegant
solutions to complex programming problems but tend to consume large amounts of
memory and tend to be less efficient than a non-recursive alternative.
Many recursive algorithms can be re-formulated using non-recursive techniques. Where
possible, use non-recursive solutions to improve performance.
Use stored procedures instead of anonymous blocks
A PL/SQL block which is not contained in a procedure or function is referred to as an
anonymous block. When submitted, anonymous blocks go through much the same
parsing process as stand alone SQL statements. A search of the shared pool is performed
for a matching PL/SQL block and if not found the block is parsed and stored in the
shared pool. Parsing PL/SQL blocks is typically a more expensive operation than parsing
an SQL statement, since SQL statements contained within the PL/SQL must also be
parsed.
Parsing PL/SQL blocks is therefore a time consuming process and one that we want to
avoid where possible. Because stored procedures and functions are stored in the database
in compiled form, they normally do not need to be compiled when executed, only when
created.
For instance the following anonymous PL/SQL block executes in 0.19 seconds:
DECLARE
CURSOR get_dept_csr(cp_dept_id number) is
SELECT department_name
FROM departments
WHERE department_id=cp_dept_id;
BEGIN
OPEN get_dept_csr(3);
FETCH get_dept_csr into :dept_name;
CLOSE get_dept_csr;
END;
When converted to a stored function, execution time reduced to 0.12 seconds. Although
this is a reduction of only .07 seconds, it does represent a reduction of 37% which could
be very significant if the procedure were being executed frequently in a OLTP
environment.Figure 3 Execution times for a stored procedure and for a equivalent
anonymous PL/SQL block.
Use PL/SQL stored programs in preference to anonymous PL/SQL blocks. Stored
programs are stored in compiled form and can be executed without parse overhead.
Using packages
A PL/SQL package allows related procedures, functions, cursor and variable definitions
to be stored together. Packages are useful from a programming point of view, since they
allow related program units to be grouped and allow encapsulation of local variables and
subroutines. In an application which contains a complex of inter-related procedures,
using packages can also reduce dynamic recompilation.
Dynamic recompilation occurs when a PL/SQL program which is referred to by other
PL/SQL programs is changed. Oracle keeps track of dependencies between PL/SQL
programs and other objects and will automatically mark dependent objects as needing
recompilation when a PL/SQL program changes. PL/SQL packages consist of a package
header and a body. Providing the package header remains constant, the package body can
be changed without requiring recompilation of dependent programs.
Pinning packages in the shared pool
PL/SQL stored programs are stored within the Oracle shared memory area known as the
shared pool. The shared pool contains a number of other objects, including standard SQL
statements, object definitions and sometimes private session objects. When the shared
pool memory is exhausted, objects are "aged out" of the pool using a "least recently used"
(LRU) algorithm. In some circumstances, the aging out of stored packages and
procedures may result in a performance degradation when the package is next executed.
You can prevent PL/SQL programs from being aged out of the shared pool by using the
dbms_sharedpool package. For instance, the following command (from SQL*PLUS)
keeps the COMPLEX_FUNCTION pinned in the shared pool:
SQL> exec sys.dbms_shared_pool.keep('COMPLEX_FUNCTION','P')
Your DBA will need to install the dbms_shared_pool package (using the dbmspool.sql
script). If you want to use the package as a non-DBA, your DBA will need to grant
execute permission on the package.
Take advantage of PL/SQL packages to reduce dynamic recompilation of stored sub
programs. Consider pinning large or performance critical packages in the shared pool.
Triggers vs. stored procedures
Database triggers are stored PL/SQL programs which fire when INSERTs, UPDATEs or
DELETEs occur against nominated tables.
In earlier versions of Oracle7 (prior to 7.3), database triggers are stored in the database in
un-compiled format. Consequently these trigger need to be parsed when first executed by
an Oracle session. It therefore makes sense in these versions of Oracle to move as much
code as possible from the trigger into a stored procedure and call the stored procedure
from the trigger. The parse requirements for the trigger will then be substantially reduced.
If using a version of PL/SQL prior to 2.3 (Oracle 7.3) minimize the amount of code in
database triggers since triggers are not stored in compiled format. Put the program code
in a stored procedure and call it from the trigger.
Using the UPDATE OF and WHEN clauses in triggers
The UPDATE OF clause of the CREATE TRIGGER statement allows a FOR UPDATE
trigger to fire only when the nominated columns are updated. In a similar fashion, the
WHEN clause can be used to prevent the execution of the trigger unless a logical
condition is met.
These clauses help to prevent the trigger from executing unnecessarily and can improve
performance of DML operations on the table on which the trigger is based.
For example, the following trigger fires whenever any column in the EMPLOYEES table
is updated:
CREATE or replace trigger employee_upd
before update
or insert
on employees
for each row
BEGIN
IF :new.salary > 100000 THEN
:new.adjusted_salary:=complex_function(:new.salary);
END IF;
END;
The following trigger is more efficient, because it only fires when the SALARY column
is updated, and only when the new value of SALARY is greater than $100,000:
CREATE or replace trigger employee_upd
before update of salary
or insert
on employees
for each row
when (new.salary > 100000)
BEGIN
:new.adjusted_salary:=complex_function(:new.salary);
END;
The optimized trigger only fires when the salary column is updated. This will improve
the performance of updates which don't update the salary clause.
Figure 4 Using the WHEN and UPDATE OF trigger clauses to reduce trigger overhead.
Make use of the OF COLUMNS and WHEN clauses of the CREATE TRIGGER
statement to ensure that your trigger only fires when necessary.
Using explicit cursors
PL/SQL allows an SQL statement to be included in a block without being explicitly
associated with a cursor. For instance, the following block is legal:
begin
select
into
from
where
and
and
end;
employee_id
:employee_id
customers
contact_surname=:surname
contact_firstname=:firstname
date_of_birth=:dob;
A cursor is in fact associated with such a statement, but is created automatically by
PL/SQL. Such a cursor is referred to as an implicit cursor.
Although implicit cursors can be convenient when programming, they can impose some
execution overhead. Implicit cursors may only return a single row and Oracle must check
to make sure that only one row is returned.
f the retrieval is based on an index, then the check for additional rows can be done with a
single additional I/O. Typically, this increases the number of reads required from 3 to 4 which may still be significant if the procedure is executed very frequently. If the retrieval
is based on a table scan then Oracle will continue scanning until it finds an eligible row
or reaches the end of the table. On average, this will result in twice as many block reads.
Figure 5 I/O requirements for returning a single row or table scan in PL/SQL using
implicit and explicit cursors.
Using an explicit cursor - as in the code fragment below - avoids the overhead involved
in the implicit cursor processing and is therefore the more efficient approach.
DECLARE
CURSOR get_cust_csr is
SELECT employee_id
FROM customers
WHERE contact_surname=:surname
AND contact_firstname=:firstname
AND date_of_birth=:dob;
BEGIN
OPEN get_cust_csr ;
FETCH get_cust_csr into :employee_id;
CLOSE get_cust_csr;
END;
Implicit cursors in PL/SQL involve additional processing overheads. Explicit cursors will
usually provide better performance.
The "WHERE CURRENT OF" clause
Frequently, you will find yourself opening a cursor against a table, performing some
complex processing upon the data retrieved, and then updating the table row in question.
For instance, the following fragment illustrates such a processing loop:
DECLARE
l_newsal number;
CURSOR emp_csr is
SELECT * from employees
WHERE salary >50000;
BEGIN
FOR emp_row in emp_csr LOOP
-- Perform some complex processing
-- to work out the new salary
l_newsal:=complex_function(emp_row.salary);
UPDATE employees
-- Use primary key to find employee
SET salary=l_newsal
WHERE employee_id=emp_row.employee_id;
END LOOP;
END;
The update statement within the loop uses the primary key of the table, and is therefore a
reasonably efficient index lookup. However, since we just fetched the row in question,
why should we need to perform an index lookup? Shouldn't we already know where the
row is?
In fact, the second index lookup is unnecessary. PL/SQL (and the programmatic
interfaces) can refer to the current row selected by the cursor using the clause "WHERE
CURRENT OF cursor_name". Using this notation, PL/SQL can use the row address
(ROWID) stored in the cursor structure to locate the row without an index lookup. Using
this method, our example now looks like this:
DECLARE
l_newsal number;
CURSOR emp_csr is
SELECT * from employees
where salary >50000
for update;
-- Need FOR UPDATE since using
-- WHERE CURRENT OF
BEGIN
FOR emp_row in emp_csr LOOP
-- Perform some complex processing
-- to work out the new salary
l_newsal:=complex_function(emp_row.salary);
UPDATE employees
set salary=l_newsal
where current of emp_csr;
END LOOP;
END;
The WHERE CURRENT OF clause eliminates the I/Os involved with an index lookup
and does improve the performance of the update statement. However, to use the WHERE
CURRENT OF notation you must first lock the rows involved using the FOR UPDATE
clause in the cursor's SELECT statement. The FOR UPDATE clause has the following
side-effects:


All rows selected by the query must be locked before the first row can be
returned. If you only intend to process a subset of rows, or if response time for the
first row is your primary concern, then you probably don't want to use FOR
UPDATE.
Locking the rows requires a transaction entry be made in every block to be
updated. This involves a considerable I/O overhead and is another reason why
you may not want to use the FOR UPDATE clause.
If you want to optimize the update statement by avoiding the unnecessary index read but
don't wish to endure the locking overhead of the FOR UPDATE clause, you can keep
track of the ROWID yourself and use it in subsequent updates. The following PL/SQL
block illustrates this technique:
DECLARE
l_newsal number;
CURSOR emp_csr is
SELECT e.rowid employee_rowid, -- Give the ROWID an alias
e.*
from employees e
where salary >50000 ;
BEGIN
FOR emp_row in emp_csr LOOP
-- Perform some complex processing
-- to work out the new salary
l_newsal:=complex_function(emp_row.salary);
UPDATE employees
SET salary=l_newsal
WHERE rowid=emp_row.employee_rowid; -- Refer to the ROWID by
alias
END LOOP;
END;
By using this technique we avoid both the overhead of locking rows and the overhead of
the needless index lookup.
Figure 6 compares I/O requirements for the three approaches. Using the WHERE
CURRENT OF cursor technique reduced the I/Os required for the update statement, but
imposed an additional I/O requirement because rows had to be locked by the FOR
UPDATE clause. Fetching the row's ROWID, and specifying this in the WHERE clause
gave a "best of both world's" solution.
Figure 6 I/O requirements for an update within a cursor loop using various lookup
techniques.
A word of caution: Although the FOR UPDATE clause imposes a processing overhead, it
does prevent another user from updating the row between the time you open the cursor
and the time you issue your update. If there is any chance that such an event may render
your update invalid or that you will overwrite another session's update you should do one
of the following:



Use the FOR UPDATE clause and accept the performance consequences. This is
referred to as the pessimistic locking strategy.
Check that the row has not changed by including a check in the WHERE clause of
the UPDATE statement. For instance, adding the condition
"SAL=EMP_ROW.SAL" to the update statement in our example might be
sufficient. This is the optimistic locking strategy.
If you are processing all or most of the rows in the table you might consider
issuing a LOCK TABLE statement to lock the entire table. This has less overhead
than locking each row individually but will prevent any concurrent updates - even
of rows not being accessed by our procedure.
Make use of the WHERE CURRENT OF CURSOR clause to update a row fetched from
a cursor. If you don't want to lock the rows, store and use the ROWID column to speed
up the update.
Caching with PL/SQL tables
PL/SQL tables are analogous to arrays in other languages. Like arrays, PL./SQL tables
can be used to remember, or to cache, frequently accessed codes or values. Using this
technique, we can minimize our database accesses.
The following example illustrates the technique. A PL/SQL table is used to hold the
normal value for a product with a given product id. Only if the normal value is not found
in the table will it be necessary to query the PRODUCTS table.
CREATE or replace package body demo is
-- Declare a PL/SQL table to hold product values.
type product_val_tab_typ is table of products.normal_value%TYPE
index by binary_integer;
product_val_table product_val_tab_typ;
PROCEDURE make_sale(p_customer_id number,
p_product_id number,
p_quantity number ) is
-- Cursor to get the normal value for a product
CURSOR product_csr(cp_product_id number) is
SELECT normal_value
from products
where product_id=cp_product_id;
l_product_normal_value
products.normal_value%TYPE;
BEGIN
-- Look in the PL/SQL table for the
product_id in question
BEGIN
l_product_normal_value:=product_val_table(p_product_id);
EXCEPTION
WHEN no_data_found THEN
-- Not found in the PL/SQL table, so fetch from
-- the database
and add to the table
OPEN product_csr(p_product_id);
FETCH product_csr into l_product_normal_value;
CLOSE product_csr;
product_val_table(p_product_id):=l_product_normal_value;
END;
INSERT into sales(customer_id,product_id,sale_date,
quantity,sale_value)
values (p_customer_id,p_product_id,SYSDATE,
p_quantity, p_quantity*l_product_normal_value);
COMMIT;
END;
END;
Figure 7 shows the improvement which resulted from implementing the PL/SQL cache
table for 400 executions of our function. The PL/SQL cache table is especially effective
when the same PL/SQL function will be executed many times within a session. Because
the cache table is destroyed when you disconnect from Oracle, you won't see much
improvement if you execute the function only a couple of times in each session.
Figure 7 Using a PL/SQL table to cache frequently accessed data reduces I/O
requirements.
Since Oracle caches recently accessed data values within it's shared memory (the buffer
cache within the SGA) it might occur to you to wonder if it's wasteful to maintain a cache
within your programs memory. In fact, caching data within a program does yield
substantial performance improvements even if the data is held within Oracle's shared
memory. Issuing SQL to retrieve data from Oracle requires that parse and execute calls
be issued to Oracle. The overhead of issuing these calls will be greater than the overhead
of examining the cache table. If you implement some caching mechanism in a clientserver environment then the local cache will be even more effective, since it will reduce
network traffic.
Consider using PL/SQL tables to cache frequently accessed values and reduce
unnecessary database I/O.
Summary
PL/SQL allows us to perform many activities not available to standard SQL. If you use
PL/SQL in your application, consider some of the following measures to improve the
performance of your PL/SQL code:





Use normal code optimization techniques. In particular, avoid unnecessary loops
and place the more frequently satisfied conditions first in IF clauses.
Take advantage of stored procedures and functions. Storing PL/SQL in the
database avoids parsing overhead.
Use explicit cursors in PL/SQL. An implicit cursor will often perform additional
fetches.
Use the WHERE CURRENT OF CURSOR clause to quickly access a row
currently open in a cursor. Alternately, fetch and use the ROWID column if you
want to avoid the overhead of the FOR UPDATE statement.
Cache frequently accessed data values in PL/SQL tables.
Download