Performance and Tuning ORACLE SQL
GUIDE for Developers and DBAs
Kanagaraj Velusamy
15 December 2018
Workshop Objective
Who Tunes?
Introduction to SQL Tuning
An important facet of database system performance tuning is the tuning of SQL
statements. SQL tuning involves three basic steps:
 Identifying high load or top SQL statements that are responsible for a large share
of the application workload and system resources, by reviewing past SQL
execution history available in the system.
 Verifying that the execution plans produced by the query optimizer for these
statements perform reasonably.
 Implementing corrective actions to generate better execution plans for poorly
performing SQL statements.
These three steps are repeated until the system performance reaches a satisfactory
level or no more statements can be tuned.
Goals for Tuning
The objective of tuning a system is to reduce the response time for end users of
the system and reduce the resources used to process the same work.
You can accomplish both of these objectives in several ways:
 Reduce the Workload
 Balance the Workload
 Parallelize the Workload
Reduce the workload
SQL tuning commonly involves finding more efficient ways to process the same
workload. It is often possible to reduce resource consumption by changing the
execution plan of a statement without altering its functionality.
Two examples of how resource usage can be reduced are:
1. If a commonly executed query needs to access a small percentage of data in the
table, then it can be executed more efficiently by using an index. By creating such an
index, you reduce the amount of resources used.
2. If a user is looking at the first twenty rows of the 10,000 rows returned in a specific
sort order, and if the query (and sort order) can be satisfied by an index, then the user
does not need to access and sort the 10,000 rows to see the first 20 rows.
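As a minimal sketch of the first example (table, column, and index names here are hypothetical):

-- The query touches a small percentage of a large table:
SELECT order_id, status FROM orders WHERE customer_id = :cust;

-- An index on the filter column lets Oracle read a few index and table
-- blocks instead of scanning the whole table:
CREATE INDEX orders_customer_idx ON orders (customer_id);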
Balance the Workload
Systems often tend to have peak usage in the daytime when real users are
connected to the system, and low usage in the nighttime. If noncritical reports and
batch jobs can be scheduled to run at night and their concurrency during the
daytime reduced, then it frees up resources for the more critical programs in the day.
Parallelize the Workload
Queries that access large amounts of data (typical data warehouse queries) often
can be parallelized. This is extremely useful for reducing the response time in a
low-concurrency data warehouse. However, for OLTP environments, which tend to be
high concurrency, this can adversely impact other users by increasing the overall
resource usage of the program.
Understanding the Query Optimizer
The query optimizer determines which execution plan is most efficient by
considering available access paths and by factoring in information based
on statistics for the schema objects (tables or indexes) accessed by the
SQL statement.
The query optimizer also considers hints, which are optimization
instructions placed in a comment in the statement.
The query optimizer performs the following steps
 The optimizer generates a set of potential plans for the SQL statement based
on available access paths and hints.
 The optimizer estimates the cost of each plan based on statistics in the data
dictionary for the data distribution and storage characteristics of the tables,
indexes, and partitions accessed by the statement.
 The cost is an estimated value proportional to the expected resource use
needed to execute the statement with a particular plan. The optimizer calculates
the cost of access paths and join orders based on the estimated computer
resources, which includes I/O, CPU, and memory.
 Serial plans with higher costs take more time to execute than those with
smaller costs. When using a parallel plan, however, resource use is not directly
related to elapsed time.
 The optimizer compares the costs of the plans and chooses the one with the
lowest cost.
SQL Processing Architecture
Query Optimizer Components
The main objective of the query transformer is to determine whether it is
advantageous to change the form of the query so that it enables generation of a
better query plan.
The estimator then computes three measures to predict the cost of a plan:
 Selectivity
 Cardinality
 Cost
The SQL Optimizers
Whenever you execute a SQL statement, a component of the database known as
the optimizer must decide how best to access the data operated on by that
statement.
Oracle supports two optimizers: the rule-based optimizer (which was the original),
and the cost-based optimizer.
To figure out the optimal execution path for a statement, the optimizers consider the
following:
 The syntax you've specified for the statement
 Any conditions that the data must satisfy (the WHERE clauses)
 The database tables your statement will need to access
 All possible indexes that can be used in retrieving data from the table
 The Oracle RDBMS version
 The current optimizer mode
 SQL statement hints
 All available object statistics (generated via the ANALYZE command)
 The physical table location (distributed SQL)
 INIT.ORA settings (parallel query, async I/O, etc.)
Not so good things about RBO...
 Released with Oracle 6.
 Uses an ordered list of access methods and join methods based on the relative cost
of each operation.
 Normally, it chooses the access path by working from right to left in the FROM clause.
 Has a very limited input in determining access paths.
 The RBO has a small number of possible access methods (it does not recognize IOTs,
bitmap indexes, hash joins, and so on).
 It processes the tables based on how they are ordered in the query (which can be
good, but most of the time is not).
 It always ranks execution plans based on the relative cost in its list, regardless of the
data stored in the table: an index scan always ranks better than a table scan, which is
not always true.
 Coding for the RBO has halted; all new features require the CBO.
Understanding the Cost-Based Optimizer
 The cost-based optimizer is a more sophisticated facility than the rule-based
optimizer.
 To determine the best execution path for a statement, it uses database
information such as table size, number of rows, key spread, and so forth, rather
than rigid rules.
 The information required by the cost-based optimizer is available once a table has
been analyzed via the ANALYZE command, or via the DBMS_STATS facility.
 If a table has not been analyzed, the cost-based optimizer can use only rule-based logic to select the best access path.
 The ANALYZE command and the DBMS_STATS functions collect statistics about
tables, clusters, and indexes, and store those statistics in the data dictionary.
Understanding Access Paths for the Query Optimizer
Full Table Scans:
 Reads all rows from a table and filters out those that do not meet the selection
criteria.
 All blocks in the table that are under the high water mark are scanned.
 The high water mark indicates the amount of used space, or space that had been
formatted to receive data. Each row is examined to determine whether it satisfies
the statement's WHERE clause
Rowid Scans:
 The rowid of a row specifies the data file and data block containing the row and
the location of the row in that block.
 Locating a row by specifying its rowid is the fastest way to retrieve a single row,
because the exact location of the row in the database is specified.
 To access a table by rowid, Oracle first obtains the rowids of the selected rows,
either from the statement's WHERE clause or through an index scan of one or
more of the table's indexes.
 Oracle then locates each selected row in the table based on its rowid.
Understanding Access Paths for the Query Optimizer
Index Scans:
 In this method, a row is retrieved by traversing the index, using the indexed
column values specified by the statement.
 An index scan retrieves data from an index based on the value of one or more
columns in the index. To perform an index scan, Oracle searches the index for the
indexed column values accessed by the statement. If the statement accesses only
columns of the index, then Oracle reads the indexed column values directly from the
index, rather than from the table.
 The index contains not only the indexed value, but also the rowids of rows
in the table having that value. Therefore, if the statement accesses other
columns in addition to the indexed columns, then Oracle can find the rows in the
table by using either a table access by rowid or a cluster scan.
Understanding Access Paths for the Query Optimizer
Cluster Access:
 A cluster scan is used to retrieve, from a table stored in an indexed cluster, all
rows that have the same cluster key value.
 In an indexed cluster, all rows with the same cluster key value are stored in the
same data block.
 To perform a cluster scan, Oracle first obtains the rowid of one of the selected
rows by scanning the cluster index.
 Oracle then locates the rows based on this rowid.
Hash Access:
 A hash scan is used to locate rows in a hash cluster, based on a hash value.
 In a hash cluster, all rows with the same hash value are stored in the same data
block.
 To perform a hash scan, Oracle first obtains the hash value by applying a hash
function to a cluster key value specified by the statement.
 Oracle then scans the data blocks containing rows with that hash value.
Understanding Joins
Joins are statements that retrieve data from more than one table. A join is
characterized by multiple tables in the FROM clause, and the relationship between
the tables is defined through the existence of a join condition in the WHERE clause.
In a join, one row set is called inner, and the other is called outer.
How the Query Optimizer Executes Join Statements
 Access Paths
As for simple statements, the optimizer must choose an access path to retrieve
data from each table in the join statement.
 Join Method
To join each pair of row sources, Oracle must perform a join operation. Join
methods include nested loop, sort merge, Cartesian, and hash joins.
 Join Order
To execute a statement that joins more than two tables, Oracle joins two of the
tables and then joins the resulting row source to the next table. This process is
continued until all tables are joined into the result.
How the Query Optimizer Chooses Execution Plans
for Joins
The query optimizer considers the following when choosing an execution plan:
 The optimizer first determines whether joining two or more tables definitely results
in a row source containing at most one row.
 The optimizer recognizes such situations based on UNIQUE and PRIMARY KEY
constraints on the tables.
 If such a situation exists, then the optimizer places these tables first in the join
order.
 The optimizer then optimizes the join of the remaining set of tables.
 With the query optimizer, the optimizer generates a set of execution plans,
according to possible join orders, join methods, and available access paths.
 The optimizer then estimates the cost of each plan and chooses the one with the
lowest cost.
Understanding Joins
Nested loop joins are useful when small subsets of data are being joined and
the join condition is an efficient way of accessing the second table.
 It is very important to ensure that the inner table is driven from (dependent
on) the outer table.
 If the inner table's access path is independent of the outer table, then the same
rows are retrieved for every iteration of the outer loop, degrading performance
considerably.
 In such cases, hash joins joining the two independent row sources perform better.
A nested loop join involves the following steps:
 The optimizer determines the driving table and designates it as the outer table.
 The other table is designated as the inner table.
 For every row in the outer table, Oracle accesses all the rows in the inner table.
The outer loop is for every row in outer table and the inner loop is for every row in
the inner table. The outer loop appears before the inner loop in the execution plan,
as follows:
NESTED LOOPS
outer_loop
inner_loop
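As a hedged sketch of how this join can be requested explicitly (using the classic emp/dept demo tables and assuming an index on emp.deptno):

SELECT /*+ LEADING(d) USE_NL(e) */ d.dname, e.ename
FROM   dept d, emp e
WHERE  e.deptno = d.deptno
AND    d.deptno = 10;
-- dept is the driving (outer) table; for each dept row, the inner table
-- emp is probed through its index on deptno.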
Understanding Joins
Hash Joins
 Hash joins are used for joining large data sets.
 The optimizer uses the smaller of two tables or data sources to build a hash table
on the join key in memory. It then scans the larger table, probing the hash table to
find the joined rows.
 This method is best used when the smaller table fits in available memory.
 The cost is then limited to a single read pass over the data for the two tables.
When the Optimizer Uses Hash Joins
The optimizer uses a hash join to join two tables if they are joined with an equijoin
and if either of the following conditions is true:
 A large amount of data needs to be joined.
 A large fraction of a small table needs to be joined.
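A minimal sketch (customers and sales are hypothetical tables, with sales much larger than customers):

SELECT /*+ USE_HASH(s) */ c.cust_name, s.amount
FROM   customers c, sales s
WHERE  c.cust_id = s.cust_id;   -- equijoin, as a hash join requires
-- The optimizer builds an in-memory hash table on the smaller row source
-- (customers) and probes it once per row while scanning sales.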
Understanding Joins
Sort Merge Joins
 Sort merge joins can be used to join rows from two independent sources.
 Hash joins generally perform better than sort merge joins. On the other hand,
sort merge joins can perform better than hash joins if both of the following
conditions exist:
1. The row sources are sorted already.
2. A sort operation does not have to be done.
However, if a sort merge join involves choosing a slower access method (an index
scan as opposed to a full table scan), then the benefit of using a sort merge might be
lost. Sort merge joins are useful when the join condition between two tables is an
inequality condition (but not a nonequality) like <, <=, >, or >=. Sort merge joins
perform better than nested loop joins for large data sets. You cannot use hash joins
unless there is an equality condition. In a merge join, there is no concept of a driving
table.
The join consists of two steps:
 Sort join operation: Both the inputs are sorted on the join key.
 Merge join operation: The sorted lists are merged together.
http://oracle-online-help.blogspot.com/2007/03/nested-loops-hash-join-and-sort-merge.html
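A minimal sketch of a join that suits sort merge (orders and shipments are hypothetical tables; the inequality join condition rules out a hash join):

SELECT /*+ USE_MERGE(a b) */ a.ord_id, b.ship_date
FROM   orders a, shipments b
WHERE  a.ord_id <= b.ord_id;
-- Both row sources are sorted on ord_id, then merged in a single pass.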
Index Management
B-tree indexes: This is the standard tree index that Oracle has been using since the
earliest releases.
An INDEX is one of the more complex structures in an Oracle database.
Some important points about indexes:
 An index is similar to a table: it stores information and occupies space, and it is
linked internally to its table.
 An index stores ROWIDs and the indexed column values from the table.
 Don't rush into creating an index. Do an impact analysis first: is it possible to
achieve the same goal without creating the index? Will the index be used
permanently?
 Avoid creating indexes temporarily, because you will forget to drop an index
that is no longer really used, which leads to more storage consumption and
decreased DML performance on your table. Drop any index that is not really used.
 Column ordering is very important when creating a composite (multi-column)
index.
 Periodic index rebuilding may be required if the table has frequent DML.
Index Management
Ordering the Columns & Cardinality
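The slide illustrated this with a diagram; as a minimal sketch of the principle (table, column, and index names are hypothetical), put the most selective, most frequently filtered column first:

-- customer_id has hundreds of thousands of distinct values; status has 3.
CREATE INDEX orders_cust_status_idx ON orders (customer_id, status);

-- Both of these can use the index (leading column present):
SELECT * FROM orders WHERE customer_id = 1001 AND status = 'OPEN';
SELECT * FROM orders WHERE customer_id = 1001;

-- This one cannot use a normal range scan (leading column absent):
SELECT * FROM orders WHERE status = 'OPEN';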
Bitmap Index
 Bitmap indexes are used where an index column has a relatively small number
of distinct values (low cardinality).
 These are super-fast for read-only databases, but are not suitable for systems
with frequent updates
You will want a bitmap index when:
 Table column is low cardinality - As a ROUGH guide, consider a bitmap for
any index with less than 100 distinct values
 The table has LOW DML - You must have low insert/update/delete activity.
Updating bitmapped indexes takes a lot of resources, and bitmapped indexes are
best for largely read-only tables and tables that are batch updated nightly.
 Multiple columns - Your SQL queries reference multiple low-cardinality values
in their WHERE clause. The Oracle cost-based SQL optimizer (CBO) will scream when
you have bitmap indexes on multiple low-cardinality columns.
Multiple columns BIT Map INDEX Example
For example, assume there is a motor vehicle database with numerous
low-cardinality columns such as car_color, car_make, car_model, and car_year. Each
column contains fewer than 100 distinct values by itself, and a b-tree index
would be fairly useless in a database of 20 million vehicles. However, combining
these indexes together in a query can provide blistering response times, a lot faster
than the traditional method of reading each one of the 20 million rows in the base
table. For example, assume we wanted to find old blue Toyota Corollas
manufactured in 1981:
SELECT license_plate_nbr FROM vehicle WHERE color = 'blue' AND make = 'toyota' AND year = 1981;
Oracle uses a specialized optimizer method called a BITMAPPED INDEX
MERGE to service this query. In a bitmapped index merge, each Row-ID, or RID,
list is built independently by using the bitmaps, and a special merge routine is used
in order to compare the RID lists and find the intersecting values. Using this
methodology, Oracle can provide sub-second response time when working against
multiple low-cardinality columns
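For the merge to be possible, each low-cardinality column carries its own bitmap index; a minimal sketch (index names are hypothetical, columns follow the query above):

CREATE BITMAP INDEX vehicle_color_bix ON vehicle (color);
CREATE BITMAP INDEX vehicle_make_bix  ON vehicle (make);
CREATE BITMAP INDEX vehicle_year_bix  ON vehicle (year);
-- With these in place, the optimizer can AND the three bitmaps together
-- and convert only the intersecting bits into ROWIDs.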
Troubleshooting Oracle bitmap indexes:
Some of the most common problems when implementing bitmap indexes include:
Small table - The CBO may force a full-table scan if your table is small!
Bad statistics - Make sure you always analyze the bitmap index with dbms_stats
right after creation:
CREATE BITMAP INDEX emp_bitmap_idx ON index_demo (gender);
EXEC dbms_stats.gather_index_stats(ownname=>'SCOTT', indname=>'EMP_BITMAP_IDX');
Test with a hint - To force the use of your new bitmap index, just use an Oracle
INDEX hint:
SELECT /*+ index(emp emp_bitmap_idx) */ COUNT(*)
FROM emp, dept
WHERE emp.deptno = dept.deptno;
Bitmap Join Index (BJI)
This is an index structure whereby data columns from other tables appear in a
multi-column index on a junction table. It is the only CREATE INDEX syntax to employ
a SQL-like FROM clause and WHERE clause.
Oracle9i has added the bitmap join index to its mind-boggling array of table join
methods. This new table access method requires that you create an index that
performs the join at index creation time and that creates a bitmap index of the keys
used in the join. But unlike most relational database indexes, the indexed columns
don't reside in the table. Oracle has revolutionized index creation by allowing a
WHERE clause to be included in the index creation syntax. This feature
revolutionizes the way relational tables are accessed via SQL.
The bitmap join index is extremely useful for table joins that involve low-cardinality
columns (e.g., columns with fewer than 300 distinct values). However, bitmap join
indexes aren't useful in all cases. You shouldn't use them for OLTP databases
because of the high overhead associated with updating bitmap indexes. Let's take a
closer look at how this type of index works.
How Bitmap Join Indexes work
To illustrate bitmap join indexes, I'll use a simple example: a many-to-many
relationship where we have parts and suppliers, with an inventory table serving as
the junction for the many-to-many relationship. Each part has many suppliers, and
each supplier provides many parts.
We create an index on the inventory table using columns contained in the supplier
and part tables. The idea behind a bitmap join index is to pre-join the low-cardinality
columns, making the overall join faster. Before this feature, the technique had never
been employed in cases where the low-cardinality columns reside in a foreign table.
How Bitmap Join Indexes work
To create a bitmap join index, issue the following Oracle DDL:

CREATE BITMAP INDEX part_suppliers_state
ON inventory (p.part_type, s.state)
FROM inventory i, parts p, supplier s
WHERE i.part_id = p.part_id
AND i.supplier_id = s.supplier_id;

Bitmap join indexes in action
To see how bitmap join indexes work, look at this example of a SQL query. Let's
suppose you want a list of all suppliers of pistons in North Carolina. To get that list,
you would use this query:

SELECT supplier_name
FROM parts
NATURAL JOIN inventory
NATURAL JOIN suppliers
WHERE part_type = 'piston' AND state = 'nc';
How Bitmap Join Indexes work
Prior to Oracle9i, this SQL query would be serviced by a nested loop join or hash join
of all three tables.
With a bitmap join index, the index has pre-joined the tables, and the query can
quickly retrieve a row ID list of matching table rows in all three tables.
Note that this bitmap join index specifies the join criteria for the three tables and
creates a bitmap index on the junction table (inventory) with the part_type and state
keys.
Oracle benchmarks claim that bitmap join indexes can run a query more than eight
times faster than traditional indexing methods. However, this speed improvement is
dependent upon many factors, and the bitmap join is not a panacea.
How Bitmap Join Indexes work
Restrictions on using the bitmap join index include:
• The indexed columns must be of low cardinality (usually fewer than 300
distinct values).
• The query must not have any references in the WHERE clause to data columns
that are not contained in the index.
• The overhead when updating bitmap join indexes is substantial. In practical use,
bitmap join indexes are dropped and rebuilt each evening around the daily batch
load jobs. This means that bitmap join indexes are useful only for Oracle data
warehouses that remain read-only during the processing day.
Remember: Bitmap join indexes can tremendously speed up specific data
warehouse queries but at the expense of pre-joining the tables at bitmap index
creation time. You must also be concerned about high-volume updates.
Bitmap indexes are notoriously slow to change when the table data changes,
and this can severely slow down INSERT and UPDATE DML against the
target tables.
How bitmap join indexes work
There are also restrictions on when the SQL optimizer is allowed to invoke a
bitmap join index. For queries that have additional criteria in the WHERE clause
that do not appear in the bitmap join index, Oracle9i will be unable to use this
index to service the query. For example, the following query will not use the bitmap
join index:

SELECT supplier_name
FROM parts
NATURAL JOIN inventory
NATURAL JOIN suppliers
WHERE part_type = 'piston'
AND state = 'nc'
AND part_color = 'yellow';  -- part_color is not part of the BJI columns
FUNCTION based Index
• A function-based index allows you to match any WHERE clause in an SQL
statement and remove unnecessary large-table full-table scans with super-fast
index range scans.
• This capability allows you to have case-insensitive searches or sorts, search on
complex equations, and extend the SQL language efficiently by implementing
your own functions and operators and then searching on them.
• Why use this feature: It's easy and provides immediate value. It can be used
to speed up existing applications without changing any of their logic or queries.
It can be used to supply additional functionality to applications at very little cost.
Example 1:
CREATE INDEX emp_upper_idx ON emp (UPPER(ename));
SELECT ename, empno, sal FROM emp WHERE UPPER(ename) = 'KING';
Example 2:
CREATE INDEX sales_margin_inx ON sales (revenue - cost);
SELECT ordid FROM sales WHERE (revenue - cost) > 1000;
How to enable Function Based Indexes
The following is a list of what needs to be done to use function-based indexes:
• You must have the system privilege QUERY REWRITE to create function-based
indexes on tables in your own schema.
• For the optimizer to use function-based indexes, the following parameters
must be set at the system or session level:
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;
ALTER SESSION SET QUERY_REWRITE_INTEGRITY = TRUSTED;
or by setting them in the init.ora parameter file.
The meaning of query_rewrite_enabled is to allow the optimizer to rewrite the query
so it can use the function-based index. The meaning of query_rewrite_integrity is to
tell the optimizer to "trust" that the code marked DETERMINISTIC by the programmer
is in fact deterministic. If the code is in fact not deterministic (that is, it returns
different output given the same inputs), the resulting rows from the index may be
incorrect.
Function-based indexes are only visible to the cost-based optimizer and will never
be used by the rule-based optimizer.
Tips : using ANALYZE command
The way that you analyze your tables can have a dramatic effect on your SQL performance.
 If your DBA forgets to analyze tables or indexes after a table re-build, the impact on
performance can be devastating.
 If your DBA analyzes each weekend, a new threshold may be reached and Oracle may
change its execution plan.
 If you do want to analyze frequently, use DBMS_STATS.EXPORT_SCHEMA_STATS to
back up the existing statistics prior to re-analyzing. This gives you the ability to revert back
to the previous statistics if things screw up.
 When you analyze, you can have Oracle look at all rows in a table (ANALYZE COMPUTE)
or at a sampling of rows (ANALYZE ESTIMATE).
 Typically, use ANALYZE ESTIMATE for very large tables (1,000,000 rows or more), and
ANALYZE COMPUTE for small to medium tables.
 ORACLE strongly recommends that you analyze FOR ALL INDEXED COLUMNS for any
table that can have severe data skew.
For example, if a large percentage of rows in a table has the same value in a given
column, that represents skew. The FOR ALL INDEXED COLUMNS option makes the
cost-based optimizer aware of the skew of a column's data in addition to the
cardinality (number of distinct values) of that data.
Tips : using ANALYZE command
 When a table is analyzed using ANALYZE, all associated indexes are analyzed as well.
 If an index is subsequently dropped and recreated, it must be re-analyzed.
 Be aware that the procedures DBMS_STATS.GATHER_SCHEMA_STATS and
GATHER_TABLE_STATS analyze only tables by default, not their indexes. When using
those procedures, you must specify the CASCADE=>TRUE option for indexes to be
analyzed as well.
 Following are some sample ANALYZE statements:
ANALYZE TABLE EMP ESTIMATE STATISTICS SAMPLE 5 PERCENT FOR ALL INDEXED COLUMNS;
ANALYZE INDEX EMP_NDX1 ESTIMATE STATISTICS SAMPLE 5 PERCENT FOR ALL INDEXED COLUMNS;
ANALYZE TABLE EMP COMPUTE STATISTICS FOR ALL INDEXED COLUMNS;
 If you analyze a table by mistake, you can delete the statistics.
For example:
ANALYZE TABLE EMP DELETE STATISTICS;
 Analyzing can take an excessive amount of time if you use the COMPUTE option on
large objects.
 We find that on almost every occasion, ANALYZE ESTIMATE 5 PERCENT on a large table
forces the optimizer to make the same decision as ANALYZE COMPUTE.
Index Rebuilding
There are many myths and legends surrounding the use of Oracle indexes, especially
the ongoing passionate debate about rebuilding of indexes for improving
performance.
Some experts claim that periodic rebuilding of Oracle b-tree indexes greatly improves
space usage and access speed, while other experts maintain that Oracle indexes
should “rarely” be rebuilt. Interestingly, Oracle reports that the new Oracle10g
Automatic Maintenance Tasks (AMT) will automatically detect indexes that are in
need of re-building.
Here are the pros and cons of this highly emotional issue:
 Arguments against Index Rebuilding
Some Oracle in-house experts maintain that Oracle indexes are super-efficient at
space re-use and access speed and that a b-tree index rarely needs rebuilding. They
claim that a reduction in Logical I/O (LIO) should be measurable, and if there were
any benefit to index rebuilding, someone would have come up with “provable” rules
Index Rebuilding
 Arguments for Index Rebuilding
Many Oracle shops schedule periodic index rebuilding, and report measurable speed
improvements after they rebuild their Oracle b-tree indexes. In an OracleWorld 2003
presentation titled Oracle Database 10g: The Self-Managing Database by Sushil
Kumar of Oracle Corporation, Kumar states that the Automatic Maintenance Tasks
(AMT) Oracle10g feature will automatically detect and rebuild sub-optimal indexes.
“AWR provides the Oracle Database 10g a very good 'knowledge' of how it is being
used. By analyzing the information stored in Automatic Workload Repository (AWR) ,
the database can identify the need of performing routine maintenance tasks, such as
optimizer statistics refresh, rebuilding indexes, etc. The Automated Maintenance
Tasks infrastructure enables the Oracle Database to automatically perform those
operations.”
Index Rebuilding
Where are the index details?
Most Oracle professionals are aware of the dba_indexes view that is populated with
index statistics when indexes are analyzed. The dba_indexes view contains a great
deal of important information for the SQL optimizer, but there is still more to
see. Oracle provides an ANALYZE INDEX index_name VALIDATE STRUCTURE command
that populates additional statistics into a temporary table called INDEX_STATS.
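As a minimal sketch (the index name is hypothetical), the statistics are produced and read like this:

ANALYZE INDEX emp_ndx1 VALIDATE STRUCTURE;

-- INDEX_STATS holds exactly one row, describing the index just validated:
SELECT height, lf_rows, del_lf_rows,
       ROUND(del_lf_rows * 100 / DECODE(lf_rows, 0, 1, lf_rows), 2) pct_deleted
FROM   index_stats;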
The important Index statistics for index rebuilding decision
The following INDEX_STATS columns are especially useful:
 HEIGHT refers to the maximum number of levels encountered within the
index. An index could have 90 percent of its nodes at three levels, but excessive
splitting and spawning in one area of the index under heavy DML operations could
cause nodes in that area to have more than three levels.
As an index accepts new rows, the index blocks split. Once the index nodes have
split to a predetermined maximum level, the index will "spawn" into a new level.
 LF_ROWS refers to the total number of leaf rows (index entries) in the index.
 DEL_LF_ROWS refers to the number of leaf rows that have been marked deleted
as a result of table DELETEs.
 CLUSTERING_FACTOR – This is one of the most important index statistics
because it indicates how well sequenced the index columns are to the table rows. If
clustering_factor is low (about the same as the number of dba_segments.blocks in
the table segment) then the index key is in the same order as the table rows and
index range scans will be very efficient, with minimal disk I/O. As clustering_factor
increases (up to dba_tables.num_rows), the index key is increasingly out of
sequence with the table rows. Oracle’s cost-based SQL optimizer relies heavily upon
clustering_factor to decide whether to use the index to access the table.
 BLOCKS – This is the number of blocks consumed by the index. This is
dependent on the db_block_size. In Oracle9i and beyond, many DBAs create b-tree
indexes in very large blocksizes (db_32k_cache_size) because the index will spawn
less. Robin Schumacher notes in his book Oracle Performance Troubleshooting: "As
you can see, the amount of logical reads has been reduced by half simply by using
the new 16K tablespace and accompanying 16K data cache. Clearly, the benefits of
properly using the new data caches and multi-block tablespace feature of Oracle9i
and above are worth your investigation and trials in your own database."
Method 1:
Method 2:
1. Create a table index_details with all the columns from dba_indexes and index_stats.
2. Insert into index_details (listing all the dba_indexes columns): select * from dba_indexes
where owner not like 'SYS%' (or filter only for your schema, for example 'SCOTT').
Now that we have gathered the index details from dba_indexes, we must loop through
iterations of the ANALYZE INDEX index_name VALIDATE STRUCTURE command to populate
our table with the remaining statistics, as sketched below.
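Here is a sketch of such a loop (the index_details table and its columns are assumptions following the steps above):

BEGIN
  FOR idx IN (SELECT owner, index_name
              FROM   dba_indexes
              WHERE  owner = 'SCOTT')
  LOOP
    -- Validate the index; this refreshes the single row in INDEX_STATS
    EXECUTE IMMEDIATE
      'ANALYZE INDEX "' || idx.owner || '"."' || idx.index_name ||
      '" VALIDATE STRUCTURE';
    -- Copy the freshly gathered statistics into our working table
    UPDATE index_details d
    SET    (height, lf_rows, del_lf_rows, pct_used) =
           (SELECT height, lf_rows, del_lf_rows, pct_used FROM index_stats)
    WHERE  d.owner = idx.owner
    AND    d.index_name = idx.index_name;
  END LOOP;
  COMMIT;
END;
/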
Is there a criterion for index rebuilding?
For example, here are the criteria used by a fellow Oracle DBA who swears that rebuilding
indexes with these criteria has a positive effect on his system performance:
-- Only consider an index when the space used is more than 1 block
btree_space > 8192
-- and the number of index levels is > 3
and ( height > 3
-- or the % of space being used is < 75%
      or pct_used < 75
-- or more than 20% of the leaf rows are deleted
      or (del_lf_rows / (decode(lf_rows, 0, 1, lf_rows)) * 100) > 20 )
Best approach for tuning a Query
Query text with good style and understandable format:

SELECT a.empno
      ,a.empname
  --- ,a.salary        -- Removed by SRZVE2 on 12/OCT/07
      ,b.deptname
      ,c.salary
FROM   emp a
      ,dept b
      ,( SELECT a.empno
               ,SUM(a.salary * b.tax) salary
         FROM   wage a
               ,tax b
         WHERE  a.tax_code = b.tax_code
         GROUP BY a.empno ) c
WHERE  a.empno   = 200
AND    a.salary  = 40000
AND    a.empname LIKE '%RAJ%'
AND    a.deptno  = b.deptno
AND    a.empno   = c.empno

The same query written without formatting is much harder to read and maintain:

SELECT a.empno,a.empname
---,a.salary
--Removed by SRZVE2 on ---12/OCT/07
,b.deptname,c.salary FROM emp a ,dept b ,
( SELECT a.empno ,SUM(a.salary*b.tax)
salary FROM wage a ,tax b WHERE
a.tax_code=b.tax_code GROUP BY a.empno ) c
WHERE a.empno = 200 AND a.salary
= 40000 AND a.empname LIKE '%RAJ%'
AND a.deptno = b.deptno AND a.empno =
c.empno
Tuning Tips
TIP 1 (Best Tip): SQL cannot be shared within Oracle unless it is absolutely
identical. Statements must match exactly in case, whitespace, and
underlying schema objects to be shared within Oracle's memory. Oracle avoids
the parsing step for each subsequent use of an identical statement.
• Use SQL standards within an application. Rules like the following are easy to
implement and will allow more sharing within Oracle's memory.
- Using a single case for all SQL verbs
- Beginning all SQL verbs on a new line
- Right or left aligning verbs within the initial SQL verb
- Separating all words with a single space
• Use a standard approach to table aliases. If two identical SQL statements
vary because an identical table has two different aliases, then the SQL is different
and will not be shared.
• Use table aliases and prefix all column names by their aliases when more
than one table is involved in a query. This reduces parse time AND prevents future
syntax errors if someone adds a column to one of the tables with the same name as
a column in another table. (ORA-00918: COLUMN AMBIGUOUSLY DEFINED)
TIP 2
Beware of WHERE clauses which do not use indexes at all. Certain predicate
constructs prevent index use: even if there is an index over a column referenced
by such a WHERE clause, Oracle will ignore the index, as shown in the sketch below.
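A minimal sketch of the usual offenders (assume indexes exist on the referenced emp columns):

SELECT * FROM emp WHERE UPPER(ename) = 'KING';  -- function on the column (needs a function-based index)
SELECT * FROM emp WHERE salary + 0 = 50000;     -- expression on the column
SELECT * FROM emp WHERE ename LIKE '%ING';      -- leading wildcard
SELECT * FROM emp WHERE ename != 'KING';        -- inequality
SELECT * FROM emp WHERE comm IS NULL;           -- NULLs are not stored in b-tree indexes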
TIP 3: Don't forget to tune views. Views are SELECT statements and can be
tuned in just the same way as any other type of SELECT statement can be. All
tuning applicable to any SQL statement are equally applicable to views.
TIP 4: Avoid including a HAVING clause in SELECT statements. The HAVING
clause filters selected rows only after all rows have been fetched. Using a WHERE
clause helps reduce overheads in sorting, summing, etc. HAVING clauses should
only be used when columns with summary operations applied to them are restricted
by the clause.
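As an illustration (a sketch using the classic emp demo table):

-- Inefficient: every department is aggregated, then most groups are discarded
SELECT deptno, AVG(sal) FROM emp GROUP BY deptno HAVING deptno IN (10, 20);

-- Better: rows are filtered before the group/sort work is done
SELECT deptno, AVG(sal) FROM emp WHERE deptno IN (10, 20) GROUP BY deptno;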
TIP 5:
Minimize the number of table lookups (subquery blocks) in queries,
particularly if your statements include subquery SELECTs or multicolumn UPDATEs.
TIP 6
Avoid joins that require the DISTINCT qualifier on the SELECT list in
queries which are used to determine information at the owner end of a one-to-many
relationship. The DISTINCT operator causes Oracle to fetch all rows satisfying the
table join and then sort and filter out duplicate values. EXISTS is a faster
alternative, because the Oracle optimizer realizes when the subquery has been
satisfied once, there is no need to proceed further and the next matching row can
be fetched. (Note: This query returns all department numbers and names which have at least one employee)
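The query referred to in the note was shown on the slide; a hedged reconstruction against the classic dept/emp schema:

-- Slower: join plus a sort to remove the duplicates
SELECT DISTINCT d.deptno, d.dname
FROM   dept d, emp e
WHERE  d.deptno = e.deptno;

-- Faster: probing of emp stops at the first match per department
SELECT d.deptno, d.dname
FROM   dept d
WHERE  EXISTS (SELECT 1 FROM emp e WHERE e.deptno = d.deptno);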
TIP 7
Consider whether a UNION ALL will suffice in place of a UNION. The
UNION clause forces all rows returned by each portion of the UNION to be sorted
and merged and duplicates to be filtered before the first row is returned. A UNION
ALL simply returns all rows including duplicates and does not have to perform any
sort, merge or filter. If your tables are mutually exclusive (include no duplicate
records), or you don't care if duplicates are returned, the UNION ALL is much more
efficient.
TIP 8:
Consider using DECODE to avoid having to scan the same rows
repetitively or join the same table repetitively. Note, DECODE is not necessarily
faster as it depends on your data and the complexity of the resulting query. Also,
using DECODE requires you to change your code when new values are allowed in
the field.
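As a sketch of the idea (classic emp demo table), DECODE can collapse several scans of the same table into one; COUNT ignores the NULLs that DECODE returns for non-matches:

SELECT COUNT(DECODE(deptno, 10, 1)) dept10_cnt,
       COUNT(DECODE(deptno, 20, 1)) dept20_cnt,
       COUNT(DECODE(deptno, 30, 1)) dept30_cnt
FROM   emp;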
TIP 9:
Oracle automatically performs simple column type conversions (or
casting) when it compares columns of different types. Depending on the type of
conversion, indexes may not be used. Make sure you declare your program
variables as the same type as your Oracle columns, if the type is supported in the
programming language you are using.
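A minimal sketch of the trap (assume emp_type is an indexed VARCHAR2 column):

-- Oracle implicitly applies TO_NUMBER(emp_type) = 123, suppressing the index:
SELECT * FROM emp WHERE emp_type = 123;

-- Compare like with like and the index remains usable:
SELECT * FROM emp WHERE emp_type = '123';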
TIP 10: Specify the leading index columns in WHERE clauses.
For a composite index, a query will use the index as long as the leading column
of the index is specified in the WHERE clause. Of the following two queries, only the
first would use the composite index based on the primary key constraint on the
PART_ID and PRODUCT_ID columns; the second omits the leading column:
SELECT * FROM PARTS WHERE PART_ID = 100;
SELECT * FROM PARTS WHERE PRODUCT_ID = 1111;
The second request can be rewritten to take advantage of the index. In this query, it
is assumed that the PART_ID column will always have a value greater than zero:
SELECT * FROM PARTS WHERE PART_ID > 0 AND PRODUCT_ID = 1111;
TIP 11: Evaluate index scan vs. full table scan.
If a query selects more than roughly 20 percent of the rows from a table, a full table
scan is usually faster than an index access path. In such cases, write your SQL so
that it uses a full table scan.
The following statements would not use index scans even if an index is created on
the SALARY column. In the first SQL, the FULL hint forces Oracle to employ a full
table scan. When using an index does more harm than good, you can use these
techniques to suppress the use of the index.
SELECT /*+ FULL(EMP) */ * FROM EMP WHERE SALARY = 50000;
SELECT * FROM EMP WHERE SALARY+0 = 50000;
TIP 12: Use ORDER BY for index scan.
Oracle's optimizer will use an index scan if the ORDER BY clause is on an
indexed column. The following query illustrates this point. This query would use the
index available on EMPID column even though the column is not specified in the
WHERE clause. The query would retrieve ROWID for each row from the index and
access the table using the ROWID.
SELECT SALARY FROM EMP ORDER BY EMPID;
If this query performs poorly, you can try an alternative by rewriting the same
query using the FULL hint.
TIP 13: Know the data.
You have to know your data intimately. For example, say
you have a table called BOXER containing two columns, BOXER_NAME and SEX,
with a nonunique index on column SEX. If there are equal numbers of male and
female boxers (1,000 records each), the following query will run faster if Oracle
performs a full table scan:
SELECT BOXER_NAME FROM BOXER WHERE SEX = 'F';
You can ensure the query performs a full table scan by rewriting it as:
SELECT /*+ FULL(BOXER) */ BOXER_NAME FROM BOXER WHERE SEX = 'F';
If the table contains 980 male boxers and only 20 female boxers, this query would
be faster because it results in an index scan:
SELECT /*+ INDEX(BOXER BOXER_SEX) */ BOXER_NAME FROM BOXER WHERE SEX = 'F';
TIP 14: You can reach the same destination in different ways.
In many cases, more than one SQL statement can get you the same desired results.
Each SQL may use a different access path and may perform differently. For
example, the MINUS operator can be much faster than using WHERE NOT IN (SELECT ...)
or WHERE NOT EXISTS.
Let's say we have an index on a STATE column and another index on an
AREA_CODE column. Despite the availability of indexes, the following statement will
require a full table scan due to the usage of the NOT IN predicate:
SELECT CUSTOMER_ID FROM CUSTOMERS WHERE STATE IN ('VA', 'DC', 'MD') AND
AREA_CODE NOT IN (804, 410);
However, if the same query is rewritten as the following, it will result in index scans:
SELECT CUSTOMER_ID FROM CUSTOMERS WHERE STATE IN ('VA', 'DC', 'MD')
MINUS
SELECT CUSTOMER_ID FROM CUSTOMERS WHERE AREA_CODE IN (804, 410);
If a SQL involves OR in the WHERE clause, it can also be rewritten by substituting
UNION for OR in the WHERE clause. You must carefully evaluate execution plans of
all SQLs before selecting one to satisfy the information request. You can use Explain
Plan and TKPROF tools for this process.
TIP 15: Use the special columns.
Take advantage of the ROWID and ROWNUM pseudocolumns. Remember, a ROWID search is
the fastest. Here's an example of an UPDATE using a ROWID scan (fetching the ROWID
first, as in PL/SQL):
SELECT ROWID, SALARY INTO TEMP_ROWID, TEMP_SALARY FROM EMPLOYEE;
UPDATE EMPLOYEE SET SALARY = TEMP_SALARY * 1.5 WHERE ROWID = TEMP_ROWID;
A ROWID value is not constant in a database, so don't hard-code a ROWID value in
your SQLs and applications.
Use the ROWNUM pseudocolumn to limit the number of rows returned. If you're not sure
how many rows a SELECT statement will return, use ROWNUM to restrict the number of
rows returned. The following statement would not return more than 100 rows:
SELECT EMPLOYEE.SS#, DEPARTMENT.DEPT_NAME
FROM EMPLOYEE, DEPARTMENT
WHERE EMPLOYEE.DEPT_ID = DEPARTMENT.DEPT_ID AND ROWNUM < 100;
Additional Tips
 Do not use the set operator UNION if the objective can be achieved with a
UNION ALL. UNION incurs an extra sort operation.
 Select ONLY those columns in a query which are required. Extra columns which
are not actually used incur more I/O on the database and increase network
traffic.
 Do not use the keyword DISTINCT if the objective can be achieved otherwise.
DISTINCT incurs an extra sort operation.
 If it is required to use a composite index, try to use the leading column in the
WHERE clause. Though an index skip scan is possible, it incurs extra cost in
creating virtual indexes and may not always be possible, depending on the
cardinality of the leading columns.
 There should not be any Cartesian product in the query unless there is a
definite requirement.
Additional Tips
 It is always better to write separate SQL statements for different tasks, but if you
must use one SQL statement, then you can make a very complex statement
slightly less complex by using the UNION ALL operator
 Joins to complex views are not recommended, particularly joins from one
complex view to another. Often this results in the entire view being instantiated,
and then the query is run against the view data
 Querying from a view requires all tables from the view to be accessed for the
data to be returned. If that is not required, then do not use the view. Instead, use
the base table(s), or if necessary, define a new view.
 While querying on a partitioned table try to use the partition key in the “WHERE”
clause if possible. This will ensure partition pruning.
 Avoid doing an ORDER BY on a large data set especially if the response time is
important.
Additional Tips
 Use CASE statements instead of DECODE (especially where nested DECODEs
are involved) because they increase the readability of the query immensely.
 Do not use HINTS unless the performance gains are clear.
 Check if the statistics for the objects used in the query are up to date. If not, use
the DBMS_STATS package to collect them.
 It is always good to understand the data, both functionally and in its diversity and
volume, in order to tune the query. Selectivity (predicates) and cardinality (skew)
have a big impact on the query plan. Use of statistics and histograms can
drive the query towards a better plan.
 Read the explain plan and try to make the largest restriction (filter) the driving
step of the query, followed by the next largest; this will minimize the time spent on
I/O and execution in subsequent phases of the plan.
Additional Tips
 Queries tend to perform worse as they age due to volume increase, structural
changes in the database and application, upgrades etc. Use Automatic Workload
Repository (AWR) and Automatic Database Diagnostic Monitor (ADDM) to better
understand change in execution plan and throughput of top queries over a period
of time.
 SQL Tuning Advisor and SQL Access Advisor can be used for system advice on
tuning specific SQL and their join and access paths; however, advice generated
by these tools may not always be applicable.
Disclaimer: The points listed above are only pointers and may not work under all
circumstances. This checklist can be used as a reference while fixing performance
problems in the Oracle Database.
Suggested further readings
• Materialized Views
• Advanced Replication
• Change Data Capture (Asynchronous)
• Automatic Workload Repository (AWR) and Automatic Database Diagnostic Monitor (ADDM)
• Partitioning strategies
How to use Explain Plan
Before Tuning: Two minutes
After Tuning: Few Seconds
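The before/after plans were shown as screenshots; as a minimal sketch of the mechanics (the query is a placeholder), a plan is generated and displayed like this:

EXPLAIN PLAN FOR
SELECT e.ename, d.dname
FROM   emp e, dept d
WHERE  e.deptno = d.deptno;

-- DBMS_XPLAN is available from Oracle9i Release 2 onwards:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);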
(Execution plan screenshots: before tuning and after tuning.)
Introduction to HINTS
Optimizer hints can be used with SQL statements to alter execution plans for better
performance.
Types of Hints
Hints can be of the following general types:
 Single-table
Single-table hints are specified on one table. INDEX and
USE_NL are examples of single-table hints.
 Multi-table
Multi-table hints are like single-table hints, except that the hint can
specify one or more tables or views. LEADING is an example of a multi-table hint. Note
that USE_NL(table1 table2) is not considered a multi-table hint because it is actually a
shortcut for USE_NL(table1) and USE_NL(table2).
 Query block
Query block hints operate on single query blocks.
STAR_TRANSFORMATION and UNNEST are examples of query block hints.
 Statement
Statement hints apply to the entire SQL statement. ALL_ROWS is an
example of a statement hint.
Hints by Category
Optimizer hints are grouped into the following categories:
 Hints for Optimization Approaches and Goals
 Hints for Access Paths
 Hints for Query Transformations
 Hints for Join Orders
 Hints for Join Operations
 Hints for Parallel Execution
 Additional Hints
Hints for Optimization Approaches and Goals
The following hints let you choose between optimization approaches and goals:
ALL_ROWS, FIRST_ROWS(n)
If a SQL statement has a hint specifying an optimization approach and goal, then the
optimizer uses the specified approach regardless of the presence or absence of
statistics, the value of the OPTIMIZER_MODE initialization parameter, and the
OPTIMIZER_MODE parameter of the ALTER SESSION statement.
The optimizer goal applies only to queries submitted directly. Use hints to specify the
access path for any SQL statements submitted from within PL/SQL. The ALTER
SESSION... SET OPTIMIZER_MODE statement does not affect SQL that is run from
within PL/SQL.
ALL_ROWS, FIRST_ROWS(n)
If you specify either the ALL_ROWS or the FIRST_ROWS(n) hint in a SQL
statement, and if the data dictionary does not have statistics about tables accessed
by the statement, then the optimizer uses default statistical values, such as allocated
storage for such tables, to estimate the missing statistics and to subsequently
choose an execution plan. These estimates might not be as accurate as those
gathered by the DBMS_STATS package, so you should use the DBMS_STATS
package to gather statistics.
If you specify hints for access paths or join operations along with either the
ALL_ROWS or FIRST_ROWS(n) hint, then the optimizer gives precedence to the
access paths and join operations specified by the hints.
Example ALL_ROWS, FIRST_ROWS(n)
Assume that we have a simple query that selects 1,000,000 rows from the
customer table, and orders the result by customer name:
select cust_name from customer order by cust_name;
Let's also assume that we have an index on the cust_name column.
The SQL optimizer has a choice of methods to produce the result set:
Choice 1 - The database can use the cust_name index to retrieve the customer table
rows. This will alleviate the need for sorting the result set at the end of the query, but
using the index has the downside of causing additional I/O within the database as the
index nodes are accessed.
Choice 2 - The database can perform a parallel full table scan against the table and
then sort the result set on disk. This execution plan will generally use fewer
overall disk I/O resources than using the index, but the downside to this optimization
technique is that no rows from the query will be available until the entire query has
completed. For a giant query, this could take several minutes.
Example ALL_ROWS, FIRST_ROWS(n)
Hence, we see two general approaches to SQL query optimization.
The use of indexes to avoid sorting has been codified within Oracle as the
first_rows optimization technique. Under first_rows optimization, the optimizer
goal is to begin returning rows to the query as quickly as possible, even if it
means extra disk I/O.
It gives preference to an index scan over a full scan (even when the index scan is
not the cheaper choice). It prefers nested loops over hash joins, because a nested
loop returns rows as they are selected. The cost of the query is not the only criterion
for choosing the execution plan; the optimizer chooses the plan that helps fetch the
first rows fast.
This mode may be good for an interactive client-server model. In most
OLTP systems, where users want to see data on their screens fast, this optimizer
mode is very handy.
Example ALL_ROWS, FIRST_ROWS(n)
The all_rows optimizer goal is designed to minimize overall machine
resources. Under all_rows optimization the goal is to minimize the amount of
machine resources and disk I/O for the query. Hence, the all_rows optimizer
mode tends to favor full table scans, and is generally used in large data
warehouses where immediate response time is not required.
Important facts about ALL_ROWS
• ALL_ROWS considers both index scan and full scan and based on their
contribution to the overall query, it uses them. If Selectivity of a column is low,
optimizer may use index to fetch the data (for example ‘where
employee_code=7712’), but if selectivity of column is quite high ('where deptno=10'),
optimizer may consider doing Full table scan. With ALL_ROWS, optimizer has more
freedom to its job at its best.
• Good for OLAP system, where work happens in batches/procedures. (While some
of the report may still use FIRST_ROWS depending upon the anxiety level of report
reviewers)
Hints for Access Paths
Each of the following hints instructs the optimizer to use a specific access path for
a table:
FULL
CLUSTER
HASH
INDEX
NO_INDEX
INDEX_ASC
INDEX_COMBINE
INDEX_JOIN
INDEX_DESC
INDEX_FFS
NO_INDEX_FFS
INDEX_SS
INDEX_SS_ASC
INDEX_SS_DESC
NO_INDEX_SS
Hints for Access Paths
FULL Hint
SELECT /*+ FULL(e) */ employee_id, last_name FROM hr.employees e WHERE
last_name LIKE :b1;
Oracle Database performs a full table scan on the employees table to execute this
statement, even if there is an index on the last_name column that is made available
by the condition in the WHERE clause. The employees table has alias e in the FROM
clause, so the hint must refer to the table by its alias rather than by its name. Do not
specify schema names in the hint even if they are specified in the FROM clause.
Hints for Access Paths
NO_INDEX Hint
SELECT /*+ NO_INDEX(employees emp_empid) */ employee_id FROM employees
WHERE employee_id > 200;
Each parameter serves the same purpose as in "INDEX Hint" with the following
modifications:
If this hint specifies a single available index, then the optimizer does not consider a
scan on this index. Other indexes not specified are still considered.
If this hint specifies a list of available indexes, then the optimizer does not consider a
scan on any of the specified indexes. Other indexes not specified in the list are still
considered.
If this hint specifies no indexes, then the optimizer does not consider a scan on any
index on the table.
Hints for Query Transformations
• FACT - The FACT hint is used in the context of the star transformation to
indicate to the transformation that the hinted table should be considered as a
fact table.
• MERGE
• NO_EXPAND
• NO_EXPAND_GSET_TO_UNION
• NO_FACT
• NO_MERGE
• NOREWRITE
• REWRITE
• STAR_TRANSFORMATION
• USE_CONCAT
Hints for Join Orders
• LEADING
• ORDERED
The ORDERED hint instructs Oracle to join tables in the order in which they appear
in the FROM clause. Oracle recommends that you use the LEADING hint, which is
more versatile than the ORDERED hint.
When you omit the ORDERED hint from a SQL statement requiring a join, the
optimizer chooses the order in which to join the tables. You might want to use the
ORDERED hint to specify a join order if you know something that the optimizer does
not know about the number of rows selected from each table. Such information lets
you choose an inner and outer table better than the optimizer could.
The following query is an example of the use of the ORDERED hint:
SELECT /*+ORDERED */ o.order_id, c.customer_id, l.unit_price * l.quantity
FROM customers c, order_items l, orders o
WHERE c.cust_last_name = :b1 AND
o.customer_id = c.customer_id AND
o.order_id = l.order_id;
Hints for Parallel Execution
Large queries (SELECT statements) can be split into smaller tasks and executed
in parallel by multiple slave processes in order to reduce the overall elapsed time.
The task of scanning a large table, for example, can be performed in parallel by
multiple slave processes. Each process scans a part of the table, and the results
are merged together at the end. Oracle's parallel query feature can significantly
improve the performance of large queries and is very useful in decision support
applications, as well as in other environments with large reporting requirements.
• NOPARALLEL
• PARALLEL
• NOPARALLEL_INDEX
• PARALLEL_INDEX
• PQ_DISTRIBUTE
SELECT .........
FROM (
    SELECT *
    FROM ( SELECT /*+ PARALLEL(bo_daily_business_ctrl_fact) */ *
           FROM bo_daily_business_ctrl_fact
           WHERE .............. ( Complex Sub queries ) ) cy_m
    WHERE ...............
    UNION ALL
    SELECT *
    FROM ( SELECT /*+ PARALLEL(bo_daily_business_ctrl_fact) */ *
           FROM bo_daily_business_ctrl_fact
           WHERE .............. ( Complex Sub queries ) ) py_m
    WHERE ..............
)
Weekly Business Review - Rolling 14 days (Data Provider query) - Analysis

S.No | Market      | Week id | Old Version query         | New Version query
     |             |         | Execution Time (Sec)      | Execution Time (Sec)
     |             |         | Without Parallel          | With Parallel
 1   | Netherlands | 200510  |  98.69                    |  45.12
 2   | Netherlands | 200515  | 172.76                    |  87.55
 3   | Netherlands | 200520  | 354.75                    |  21.49
 4   | Netherlands | 200530  | 214.96                    |  35.89
 5   | Netherlands | 200535  | 202.79                    |  38.28
Hints for Join Operations
• USE_NL
• NO_USE_NL
• USE_NL_WITH_INDEX
• USE_MERGE
• NO_USE_MERGE
• USE_HASH
• NO_USE_HASH
Improving Query Performance with the WITH Clause
 Oracle9i significantly enhances both the functionality and performance of SQL to
address the requirements of business intelligence queries.
 The SELECT statement’s WITH clause, introduced in Oracle9i, provides powerful
new syntax for enhancing query performance.
 It optimizes query speed by eliminating redundant processing in complex queries.
• Consider a lengthy query which has multiple references to a single subquery block.
Processing subquery blocks can be costly, so recomputing a block every time it is
referenced in the SELECT statement is highly inefficient.
• The WITH clause enables a SELECT statement to define the subquery block at the
start of the query, process the block just once, label the results, and then refer to
the results multiple times.
• The WITH clause, formally known as the subquery factoring clause, is part of the
SQL-99 standard. The clause precedes the SELECT statement of a query and
starts with the keyword WITH, followed by the subquery definition and a label for
the result set. The query below shows a basic example of the clause.
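The slide's example itself is not preserved in this extract; a minimal sketch in the same spirit (hr-style table and column names are assumptions):

WITH dept_costs AS (
  SELECT d.department_name, SUM(e.salary) dept_total
  FROM   employees e, departments d
  WHERE  e.department_id = d.department_id
  GROUP BY d.department_name
)
SELECT department_name, dept_total
FROM   dept_costs
WHERE  dept_total > (SELECT AVG(dept_total) FROM dept_costs)
ORDER BY department_name;
-- dept_costs is computed once, then referenced twice.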
CASE 1
Execution time: using WITH clause, 2 minutes; without WITH clause, 6 to 7 minutes.
(Screenshot: WITH clause defined with label vcexp; main query using WITH clause label vcexp.)
CASE 2
Execution time: using WITH clause, 2 minutes; without WITH clause, 3.5 to 4 minutes.
(Screenshot: WITH clause defined with label wuser; main query using WITH clause label wuser.)
15 December 2018
80
CASE 3
Execution time:
  Without WITH clause: 2 minutes
  Using WITH clause:   1.07 minutes
This query uses the WITH clause to calculate the total value of each financial set and
labels the results wfinan. It then checks each financial set's total to see whether any
set's total value is greater than one fourth of the total of all financial values. Using the
new clause, the wfinan data is calculated just once, avoiding an extra scan through the
large financial table.
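A sketch of what such a query might look like; the financial table and its column names are assumptions based on the description above:
WITH wfinan AS (
  SELECT financial_set_id, SUM(fin_value) AS total_value
  FROM financial
  GROUP BY financial_set_id
)
SELECT financial_set_id, total_value
FROM wfinan
WHERE total_value > (SELECT SUM(total_value) / 4 FROM wfinan);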
Although the primary purpose of the WITH clause is performance improvement, it also makes
queries easier to read, write and maintain. Rather than duplicating a large block repeatedly
through a SELECT statement, the block is localized at the very start of the query. Note that the
clause can define multiple subquery blocks at the start of a SELECT statement: when several
blocks are defined at the start, the query text is greatly simplified and its speed vastly improved.
The SQL WITH clause in Oracle9i significantly improves performance for complex business
intelligence queries. Together with the many other SQL enhancements in Oracle9i, the WITH
clause extends Oracle's leadership in business intelligence.
15 December 2018
81
Working with Merge Statement
Oracle9i introduces a new set of
server functionality especially
beneficial for the ETL
(Extraction, Transformation, and
Loading) part of any Business
Intelligence process flow,
addressing all the needs of
highly scalable data
transformation inside the
database.
One of the most exciting new features
addressing the needs of ETL is the SQL
statement MERGE. It combines the
sequence of conditional INSERT and
UPDATE commands into a single atomic
statement, choosing the action based on
the existence of a matching record. This
operation is commonly known as upsert
functionality.
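A minimal sketch of the statement; the stage and fact table names and columns here are illustrative assumptions:
MERGE INTO sales_fact f
USING daily_sales_stage s
ON (f.sale_id = s.sale_id)
WHEN MATCHED THEN
  UPDATE SET f.amount = s.amount
WHEN NOT MATCHED THEN
  INSERT (sale_id, amount)
  VALUES (s.sale_id, s.amount);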
Execution time with a simple PL/SQL block: 90 minutes
15 December 2018
82
Execution time with a single MERGE statement: 7 to 8 minutes
15 December 2018
83
How Oracle Analytical function works
"Analytic Functions are an important feature of the Oracle database that allows
the users to enhance SQL's analytical processing capabilities.
These functions enable the user to calculate rankings and percentiles, moving
window calculations, lag/lead analysis, top-bottom analysis, linear regression
analytics and other similar calculation-intense data processing "
Analytic functions compute an aggregate value based on a group of rows.
They differ from aggregate functions in that they return multiple rows for each
group. The group of rows is called a window and is defined by the analytic clause.
For each row, a "sliding" window of rows is defined.
The window determines the range of rows used to perform the calculations for the
"current row". Window sizes can be based on either a physical number of rows or a
logical interval such as time.
Analytic functions are the last set of operations performed in a query except for the
final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING
clauses are completed before the analytic functions are processed. Therefore,
analytic functions can appear only in the select list or ORDER BY clause.
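For example, a minimal sketch of a sliding-window calculation (the orders table and column names are illustrative assumptions): each row receives the average of its own order_total and the two preceding rows, without collapsing the result set the way GROUP BY would.
SELECT order_id,
       order_total,
       AVG(order_total)
         OVER (ORDER BY order_date
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_3
FROM orders;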
15 December 2018
84
Business case : Tuning a query with Analytical
Function
Business Requirement for BI report. Name: Current year - Previous year comparison,
Rolling 14 days (WBR)
Business Inputs: Year, week number, Country, Number of years
Business Output: For a given year, week number and country, data (sales, number of
units) should be captured with the average of the previous 14 days for each day, from
the first day (Monday) of the given week back to the last day (Sunday) of the previous
week in the (year - number of years)th year.
Input Example: 2005, 14, 'Netherlands', 2
Output Example:
Data range is from 2005 14th week (first day) to 2004 13th week (last day)
Data range is from 2004 14th week (first day) to 2003 13th week (last day)
15 December 2018
85
One part of the main query
select to_date(cy_m.timebyday_id,'yyyymmdd') dat,
       'cy' current_year,
       substr(cy_m.timebyday_id,1,4),
       cy_m.cy_sales,
       cy_m.cy_py_sales,
       cy_m.cy_units,
       cy_m.cy_py_units,
       (select week_no from bo_time_by_day
         where timebyday_id = cy_m.timebyday_id) week_no
from
( select cy.timebyday_id,
         avg( sum( cy.actual_net_sales_amt + cy.non_product_amt ) )
           over (order by cy.timebyday_id desc
                 rows between 1 following and 14 following) cy_sales,
         avg( sum( cy.prev_year_actual_net_sales_amt ) )
           over (order by cy.timebyday_id desc
                 rows between 1 following and 14 following) cy_py_sales,
         avg( sum( cy.actual_trans_count ) )
           over (order by cy.timebyday_id desc
                 rows between 1 following and 14 following) cy_units,
         avg( sum( cy.prev_year_actual_trans_count ) )
           over (order by cy.timebyday_id desc
                 rows between 1 following and 14 following) cy_py_units
  from ( select /*+ parallel(bo_daily_business_ctrl_fact) */ *
         from bo_daily_business_ctrl_fact
         where timebyday_id >= (select to_number(to_char(max(add_months(time_by_day_date,-12)) - 14,'yyyymmdd'))
                                from bo_time_by_day
                                where year || lpad(week_no,2,0) in (200535))
           and timebyday_id <= (select to_number(to_char(max(time_by_day_date),'yyyymmdd'))
                                from bo_time_by_day
                                where year || lpad(week_no,2,0) in (200535))
           and market_id in (select distinct market_id from bo_region_dim
                             where market_name in ('Netherlands'))
           and sales_comp_flag = 1
           and decode(nvl(bo_daily_business_ctrl_fact.actual_trans_count,0),0,0,
                      decode(nvl(bo_daily_business_ctrl_fact.prev_year_actual_trans_count,0),0,0,1)) = 1
       ) cy
  group by timebyday_id
  order by 1 desc
) cy_m
Weekly Business Review - Rolling 14 days (Data Provider query) - Analysis

S.No  Market       Week Id  Old Version Query -    Old Version Query -  New Version Query -
                            Without Parallel &     With Analytical      With Parallel &
                            Analytical Function    Function             Analytical Function
1     Netherlands  200510   21.00 Minutes          98.69 Seconds        45.12 Seconds
2     Netherlands  200515   18.41 Minutes          172.76 Seconds       87.55 Seconds
3     Netherlands  200520   30.20 Minutes          354.75 Seconds       21.49 Seconds
4     Netherlands  200530   25.48 Minutes          214.96 Seconds       35.89 Seconds
5     Netherlands  200535   22.17 Minutes          202.79 Seconds       38.28 Seconds
15 December 2018
86
Reduce I/O with Oracle cluster tables
Disk I/O is expensive because when Oracle retrieves a block from a data file on
disk, the reading process must wait for the physical I/O operation to complete.
For queries that access common rows within a table (e.g. get all items in order 123),
unordered tables can experience huge I/O because the index retrieves a separate data
block for each row requested.
If we group like rows together (as measured by the clustering_factor in
dba_indexes), we can get all of the rows with a single block read because the rows
are stored together. You can use 10g hash cluster tables, single-table clusters, or
manual row re-sequencing to achieve this goal.
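A quick way to check how well a table's rows are ordered relative to its indexes (the table name here is illustrative); the closer clustering_factor is to the number of table blocks rather than the number of rows, the better clustered the data:
SELECT index_name, clustering_factor, num_rows
FROM dba_indexes
WHERE table_name = 'ORDER_ITEMS';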
15 December 2018
87
Using Clusters for Performance
Clusters are groups of one or more tables that are physically stored together
because they share common columns and usually are used together. Because
related rows are physically stored together, disk access time improves.
• Cluster tables that are accessed frequently by the application in join statements.
• Do not cluster tables if the application joins them only occasionally or modifies
their common column values frequently. Modifying a row's cluster key value takes
longer than modifying the value in an unclustered table, because Oracle might
need to migrate the modified row to another block to maintain the cluster.
• Do not cluster tables if the application often performs full table scans of only one
of the tables. A full table scan of a clustered table can take longer than a full table
scan of an unclustered table. Oracle is likely to read more blocks, because the
tables are stored together.
15 December 2018
88
Using Clusters for Performance
• Cluster master-detail tables if you often select a master record and then the
corresponding detail records. Detail records are stored in the same data block(s)
as the master record, so they are likely still to be in memory when you select
them, requiring Oracle to perform less I/O.
• Store a detail table alone in a cluster if you often select many detail records of
the same master. This measure improves the performance of queries that select
detail records of the same master, but does not decrease the performance of a
full table scan on the master table. An alternative is to use an index organized
table.
• Do not cluster tables if the data from all tables with the same cluster key value
exceeds one or two Oracle blocks. To access a row in a clustered table, Oracle
reads all blocks containing rows with that value. If these rows take up multiple
blocks, then accessing a single row could require more reads than accessing the
same row in an unclustered table.
• Do not cluster tables when the number of rows for each cluster key value varies
significantly. This wastes space for the low cardinality key values and causes
collisions for the high cardinality key values. Collisions degrade performance.
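A minimal sketch of an index cluster for the master-detail case above (all names are illustrative assumptions; note that an index cluster requires a cluster index before rows can be inserted):
CREATE CLUSTER orders_cluster (order_id NUMBER)
  SIZE 512 TABLESPACE users;

CREATE INDEX orders_cluster_idx ON CLUSTER orders_cluster;

-- Master and detail rows sharing an order_id are stored in the same block(s).
CREATE TABLE orders_c (
  order_id   NUMBER PRIMARY KEY,
  order_date DATE
) CLUSTER orders_cluster (order_id);

CREATE TABLE order_items_c (
  order_id NUMBER,
  item_no  NUMBER,
  quantity NUMBER
) CLUSTER orders_cluster (order_id);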
15 December 2018
89
Using Hash Clusters for Performance
Hash clusters group table data by applying a hash function to each row's cluster key
value. All rows with the same cluster key value are stored together on disk. Consider
the benefits and drawbacks of hash clusters with respect to the needs of the
application. You might want to experiment and compare processing times with a
particular table as it is stored in a hash cluster, and as it is stored alone with an
index.
Follow these guidelines for choosing when to use hash clusters:
• Use hash clusters to store tables accessed frequently by SQL statements with
WHERE clauses, if the WHERE clauses contain equality conditions that use the
same column or combination of columns. Designate this column or combination of
columns as the cluster key.
• Store a table in a hash cluster if you can determine how much space is
required to hold all rows with a given cluster key value, including rows to be inserted
immediately as well as rows to be inserted in the future.
15 December 2018
90
Using Hash Clusters for Performance
• Use sorted hash clusters, where rows corresponding to each value of the hash
function are sorted on specific columns in ascending order, when response time
can be improved on operations with this sorted clustered data.
• Do not store a table in a hash cluster if the application often performs full table
scans and if you must allocate a great deal of space to the hash cluster in
anticipation of the table growing. Such full table scans must read all blocks
allocated to the hash cluster, even though some blocks might contain few rows.
Storing the table alone reduces the number of blocks read by full table scans.
• Do not store a table in a hash cluster if the application frequently modifies the
cluster key values. Modifying a row's cluster key value can take longer than
modifying the value in an unclustered table, because Oracle might need to migrate
the modified row to another block to maintain the cluster.
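A minimal sketch (illustrative names); unlike an index cluster, a hash cluster needs no cluster index, but HASHKEYS and SIZE should be estimated up front because space is pre-allocated:
CREATE CLUSTER emp_dept_cluster (deptno NUMBER(2))
  SIZE 1024 HASHKEYS 100;

-- Rows with the same deptno hash to the same block(s),
-- so an equality lookup on deptno needs no index access.
CREATE TABLE employees_hc (
  empno  NUMBER PRIMARY KEY,
  ename  VARCHAR2(30),
  deptno NUMBER(2)
) CLUSTER emp_dept_cluster (deptno);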
15 December 2018
91
Index Organized Table (IOT)
Index Organized Tables (IOT) have their primary key data and non-key column data
stored within the same B-tree structure. Effectively, the data is stored within the
primary key index. There are several reasons to use this type of table:
Why Use Index Organized Tables
Faster Index Access
Index-organized tables provide faster access to table rows by the primary key. Also,
since rows are stored in primary key order, range access by the primary key involves
minimum block accesses. To allow even faster access to frequently accessed
columns, the row overflow storage option can be used to push infrequently accessed
non-key columns out of the B-tree leaf block to an optional overflow storage area.
This limits the size and content of the row portion actually stored in the B-tree leaf
block, resulting in a smaller B-tree and faster access.
15 December 2018
92
Index Organized Table (IOT)
Reduced Storage
Index-organized tables maintain a single storage structure: the B-tree index.
Primary key column values are stored only in the B-tree index and not duplicated in
the table and index as happens in a conventional heap-organized table. Because
rows of an index-organized table are stored in primary key order, a significant
amount of additional storage space savings can be obtained through the use of key
compression.
Increased 24x7 Availability
Index-organized tables identify rows using logical ROWIDs based on the primary
key. The use of logical ROWIDs enables online reorganization and also does not
affect the secondary indexes which remain valid and usable after the
reorganization. This capability reduces or eliminates the downtime for reorganization
of secondary indexes, making index-organized tables beneficial for 24x7
applications.
15 December 2018
93
Where are Index-Organized Tables Used?
Electronic order processing - An index-organized table is an ideal storage structure
for the orders table when the query and DML activity is predominantly primary-key
based. The heavy volume of DML operations occurring in this type of application
usually fragments the table, requiring frequent table reorganization. An
index-organized table can be reorganized without invalidating its secondary
indexes, and the reorganization can be performed online, thus reducing or even
eliminating downtime for the orders table.
Electronic catalogs - An index-organized table can be used to store all types of
manufacturing and retail catalogs. Manufacturing catalogs are usually indexed by
product attributes based on a primary key, and a retailer's catalog may have a
multicolumn primary key matching the hierarchy of products offered. Both types
benefit from using index-organized tables. Key compression can be used on these
index-organized tables to avoid column value repetition, increasing performance
and reducing storage.
Internet searches - These applications maintain lists of keywords, users, or URLs,
suitable for storage in an index-organized table, where each row holds a primary key
with some additional information. An index-organized table storing URLs and their
associated links can considerably speed up access time.
15 December 2018
94
Web portals and auction sites - A prevailing feature of these application types is
databases of user names with a subset of the available user information accessed
more frequently than the rest. The flexible column placement within index-organized
tables provides options for increasing the performance of these applications.
Data Warehousing - Index-organized tables support parallel features for loading,
index creation, and scans required for handling large volumes of data. Partitioned
index-organized tables are also supported, so that each partition can be loaded
concurrently. Data warehousing applications using star schemas can also gain
performance and scalability by implementing "fact" tables as index-organized tables
for efficient execution of star queries. All these features make index-organized tables
suitable for handling large scale data.
Creation Of Index Organized Tables
• Specify the primary key using a column or table constraint.
• Use the ORGANIZATION INDEX clause.
CREATE TABLE locations (
  id          NUMBER(10) NOT NULL,
  description VARCHAR2(50) NOT NULL,
  map         BLOB,
  CONSTRAINT pk_locations PRIMARY KEY (id)
)
ORGANIZATION INDEX TABLESPACE iot_tablespace
PCTTHRESHOLD 20
INCLUDING description
OVERFLOW TABLESPACE overflow_tablespace;
15 December 2018
95
Working with Partitioned Tables and Indexes
Modern enterprises frequently run mission-critical databases containing upwards of
several hundred gigabytes and, in many cases, several terabytes of data.
These enterprises are challenged by the support and maintenance requirements of
very large databases (VLDB), and must devise methods to meet those challenges.
One way to meet VLDB demands is to create and use partitioned tables and
indexes. Partitioned tables allow your data to be broken down into smaller, more
manageable pieces called partitions, or even subpartitions.
Indexes can be partitioned in similar fashion. Each partition is stored in its own
segment and can be managed individually. It can function independently of the
other partitions, thus providing a structure that can be better tuned for availability
and performance.
15 December 2018
96
Working with Partitioned Tables and Indexes
If you are using parallel execution, partitions provide another means of
parallelization. Operations on partitioned tables and indexes are performed in
parallel by assigning different parallel execution servers to different partitions of the
table or index. Partitions and subpartitions of a table or index all share the same
logical attributes. For example, all partitions (or subpartitions) in a table share the
same column and constraint definitions, and all partitions (or subpartitions) of an
index share the same index options. They can, however, have different physical
attributes (such as TABLESPACE).
Although you are not required to keep each table or index partition (or subpartition)
in a separate tablespace, it is to your advantage to do so.
Storing partitions in separate tablespaces enables you to:
• Reduce the possibility of data corruption in multiple partitions
• Back up and recover each partition independently
• Control the mapping of partitions to disk drives (important for balancing I/O load)
• Improve manageability, availability, and performance
Partitioning is transparent to existing applications, and standard DML statements run
unchanged against partitioned tables. However, an application can be programmed
to take advantage of partitioning by using partition-extended table or index names in
DML, as in the sketch below.
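A minimal sketch of partition-extended syntax, reusing the range-partitioned sales table created later in this section:
-- Count only the rows stored in the first-quarter partition.
SELECT COUNT(*)
FROM sales PARTITION (sales_q1);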
15 December 2018
97
Partitioning Methods
There are several partitioning methods offered by Oracle Database:
• Range partitioning
• Hash partitioning
• List partitioning
• Composite range-hash partitioning
• Composite range-list partitioning
15 December 2018
98
When to use Range partitioning
Use range partitioning to map rows to partitions based on ranges of column values.
This type of partitioning is useful when dealing with data that has logical ranges into
which it can be distributed; for example, months of the year. Performance is best
when the data evenly distributes across the range. If partitioning by range causes
partitions to vary dramatically in size because of unequal distribution, you may want
to consider one of the other methods of partitioning.
The example below creates a table of four partitions, one for each quarter of sales.
The columns sale_year, sale_month, and sale_day are the partitioning columns,
while their values constitute the partitioning key of a specific row. The VALUES
LESS THAN clause determines the partition bound: rows with partitioning key
values that compare less than the ordered list of values specified by the clause are
stored in the partition. Each partition is given a name (sales_q1, sales_q2, ...), and
each partition is contained in a separate tablespace (tsa, tsb, ...).
15 December 2018
99
When to use Range partitioning
CREATE TABLE sales ( invoice_no NUMBER,
sale_year INT NOT NULL,
sale_month INT NOT NULL,
sale_day INT NOT NULL )
PARTITION BY RANGE (sale_year, sale_month, sale_day)
( PARTITION sales_q1 VALUES LESS THAN (1999, 04, 01) TABLESPACE tsa,
PARTITION sales_q2 VALUES LESS THAN (1999, 07, 01) TABLESPACE tsb,
PARTITION sales_q3 VALUES LESS THAN (1999, 10, 01) TABLESPACE tsc,
PARTITION sales_q4 VALUES LESS THAN (2000, 01, 01) TABLESPACE tsd
);
15 December 2018
100
When to use Hash partitioning
Use hash partitioning if your data does not easily lend itself to range partitioning, but
you would like to partition for performance and manageability reasons. Hash
partitioning provides a method of evenly distributing data across a specified number
of partitions. Rows are mapped into partitions based on a hash value of the
partitioning key. Creating and using hash partitions gives you a highly tunable
method of data placement, because you can influence availability and performance
by spreading these evenly sized partitions across I/O devices (striping).
To create hash partitions you specify the following:
Partitioning method: hash
Partitioning column(s)
Number of partitions or individual partition descriptions
The following example creates a hash-partitioned table. The partitioning column is
id, four partitions are created and assigned system generated names, and they are
placed in four named tablespaces (gear1, gear2, ...).
CREATE TABLE scubagear (id NUMBER,
name VARCHAR2 (60))
PARTITION BY HASH (id) PARTITIONS 4 STORE IN (gear1, gear2, gear3, gear4);
15 December 2018
101
When to use list partitioning
Use list partitioning when you require explicit control over how rows map to
partitions. You can specify a list of discrete values for the partitioning column in the
description for each partition. This is different from range partitioning, where a range
of values is associated with a partition, and from hash partitioning, where the user
has no control of the row to partition mapping.
The list partitioning method is specifically designed for modeling data distributions
that follow discrete values. This cannot be easily done by range or hash partitioning
because:
Range partitioning assumes a natural range of values for the partitioning column. It
is not possible to group out-of-range values together into partitions.
Hash partitioning allows no control over the distribution of data because the data is
distributed over the various partitions using the system hash function. Again, this
makes it impossible to logically group together discrete values for the partitioning
columns into partitions.
Further, list partitioning allows unordered and unrelated sets of data to be grouped
and organized together very naturally.
15 December 2018
102
When to use list partitioning
Unlike the range and hash partitioning methods, multicolumn partitioning keys are
not supported for list partitioning. If a table is partitioned by list, the partitioning key
can consist of only a single column of the table. Otherwise, any column that can
serve as a partitioning key for the range or hash methods can also serve as the key
for the list partitioning method.
The following example creates a list-partitioned table. It creates table
q1_sales_by_region which is partitioned by regions consisting of groups of states.
CREATE TABLE q1_sales_by_region (deptno number, deptname varchar2(20),
quarterly_sales number(10, 2), state varchar2(2))
PARTITION BY LIST (state)
(PARTITION q1_northwest VALUES ('OR', 'WA'),
PARTITION q1_southwest VALUES ('AZ', 'UT', 'NM'),
PARTITION q1_northeast VALUES ('NY', 'VM', 'NJ'),
PARTITION q1_southeast VALUES ('FL', 'GA'),
PARTITION q1_northcentral VALUES ('SD', 'WI'),
PARTITION q1_southcentral VALUES ('OK', 'TX'));
15 December 2018
103
When to use Composite Range-Hash Partitioning
Range-hash partitioning partitions data using the range method, and within each
partition, subpartitions it using the hash method. These composite partitions are
ideal for both historical data and striping, and provide improved manageability of
range partitioning and data placement, as well as the parallelism advantages of
hash partitioning.
The following statement creates a range-hash partitioned table. In this example,
three range partitions are created, each containing eight subpartitions. Because the
subpartitions are not named, system generated names are assigned,
but the STORE IN clause distributes them across the 4
specified tablespaces (ts1, ...,ts4).
15 December 2018
104
When to use Composite Range-Hash Partitioning
CREATE TABLE scubagear (equipno NUMBER,
equipname VARCHAR(32),
price NUMBER)
PARTITION BY RANGE (equipno)
SUBPARTITION BY HASH(equipname)
SUBPARTITIONS 8 STORE IN (ts1, ts2, ts3, ts4)
(PARTITION p1 VALUES LESS THAN (1000),
PARTITION p2 VALUES LESS THAN (2000),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);
The partitions of a range-hash partitioned table are logical structures only, as their
data is stored in the segments of their subpartitions. As with partitions, these
subpartitions share the same logical attributes. Unlike range partitions in a
range-partitioned table, the subpartitions cannot have different physical attributes
from the owning partition, although they are not required to reside in the same
tablespace.
15 December 2018
105
When to use Composite Range-List Partitioning
Like the composite range-hash partitioning method, the composite range-list
partitioning method provides for partitioning based on a two level hierarchy. The first
level of partitioning is based on a range of values, as for range partitioning; the
second level is based on discrete values, as for list partitioning. This form of
composite partitioning is well suited for historical data, but lets you further group the
rows of data based on unordered or unrelated column values.
The following example illustrates how range-list partitioning might be used. The
example tracks sales data of products by quarters and within each quarter, groups it
by specified states.
15 December 2018
106
When to use Composite Range-List Partitioning
CREATE TABLE quarterly_regional_sales (
  deptno     NUMBER,
  item_no    VARCHAR2(20),
  txn_date   DATE,
  txn_amount NUMBER,
  state      VARCHAR2(2))
TABLESPACE ts4
PARTITION BY RANGE (txn_date)
SUBPARTITION BY LIST (state)
(PARTITION q1_1999 VALUES LESS THAN (TO_DATE('1-APR-1999','DD-MON-YYYY'))
   (SUBPARTITION q1_1999_northwest VALUES ('OR', 'WA'),
    SUBPARTITION q1_1999_southwest VALUES ('AZ', 'UT', 'NM')),
 PARTITION q2_1999 VALUES LESS THAN (TO_DATE('1-JUL-1999','DD-MON-YYYY'))
   (SUBPARTITION q2_1999_northwest VALUES ('OR', 'WA'),
    SUBPARTITION q2_1999_southwest VALUES ('AZ', 'UT', 'NM')),
 PARTITION q3_1999 VALUES LESS THAN (TO_DATE('1-OCT-1999','DD-MON-YYYY'))
   (SUBPARTITION q3_1999_northwest VALUES ('OR', 'WA'),
    SUBPARTITION q3_1999_southwest VALUES ('AZ', 'UT', 'NM')),
 PARTITION q4_1999 VALUES LESS THAN (TO_DATE('1-JAN-2000','DD-MON-YYYY'))
   (SUBPARTITION q4_1999_northwest VALUES ('OR', 'WA'),
    SUBPARTITION q4_1999_southwest VALUES ('AZ', 'UT', 'NM')));
A row is mapped to a partition by checking whether the value of the partitioning
column for a row falls within a specific partition range. The row is then mapped to a
subpartition within that partition by identifying the subpartition whose descriptor
value list contains a value matching the subpartition column value.
For example, a sample row inserted as (10, 4532130, '23-Jan-1999', 8934.10, 'WA')
maps to subpartition q1_1999_northwest.
15 December 2018
107
Performance and Tuning ORACLE SQL
Kanagaraj_velusamy@rcomext.com, Kanagaraj.Velusamy@yahoo.com
15 December 2018