bitmap key iteration

advertisement
Star Transformations
Tony Hasler, UKOUG Birmingham
2012
Tony Hasler, Anvil Computer Services Ltd.
Who is Tony Hasler?
Google Tony Hasler!
My blog contains all the material from this presentation on the front
page
Acknowledgments
SQL Tuning by Dan Tow
Optimizer team blog:
https://blogs.oracle.com/optimizer/entry/star_transformation
(Sunil Chakkappen 2010)
Examples
Based on11.2.0.3 Enterprise Edition
Tony Hasler http://tonyhasler.wordpress.com
Problem statement: Distributed Filters
Generally, we want to access very large tables using an index on
the most selective filter we can find.
− small (ish) tables may be accessed by a full-table scan using
multi-block reads
− Partitioning by column X is an alternative to creating an index on
X (and a local index on Y is an alternative to a multi-column index
on X,Y)
This is because we want to avoid reading data from a table only to
reject it later on.
The most selective filter may or may not be a join condition
However, sometimes a combination of filters provides much
stronger selectivity than any one filter.
When combinations of filters involve join conditions they are called
Distributed Filters (by Dan Tow).
Tony Hasler http://tonyhasler.wordpress.com
Query using SH example schema
SELECT prod_name,cust_first_name,
time_id, amount_sold
FROM customers c, products p, sales s
WHERE
s.cust_id = c.cust_id
AND s.prod_id = p.prod_id
AND c.cust_last_name = 'Everett'
AND p.prod_category = 'Electronics';
NOTE: For convenience I will refer to the SALES table as a FACT
table and the CUSTOMERS and PRODUCTS tables as
DIMENSION tables but star transformations are NOT restricted to
star schemas!
Tony Hasler http://tonyhasler.wordpress.com
What join order do we want?
Product
(13)
Sales
(116,267
/918,843 )
Customers
(80)
Sales (740
Customers
(80)
Product
(1,040)
/918,843 )
They are all bad!
Tony Hasler http://tonyhasler.wordpress.com
Customers
(115)
Product
(115)
Sales (115
/918,843 )
Wouldn’t it be nice if….
SALES (115/918,843 )
Merged bitmap
(115/918,843)
SALES_PROD_IDX
(116,267 /918,843 )
SALES_CUST_IDX
(740 /918,843 )
Product
(13)
Customers
(115)
Tony Hasler http://tonyhasler.wordpress.com
Rewrite attempt 1:
SELECT time_id, amount_sold
FROM sales
WHERE cust_id = 4117 AND prod_id = 20;
This query returns too few rows because we are only picking one customer and
one product
This query returns too few columns because we don’t have the details from the
dimension tables.
Tony Hasler http://tonyhasler.wordpress.com
Enterprise edition execution plan
------------------------------------------------------------| Id | Operation
| Name
|
------------------------------------------------------------|
0 | SELECT STATEMENT
|
|
|
1 | PARTITION RANGE ALL
|
|
|
2 |
TABLE ACCESS BY LOCAL INDEX ROWID| SALES
|
|
3 |
BITMAP CONVERSION TO ROWIDS
|
|
|
4 |
BITMAP AND
|
|
|* 5 |
BITMAP INDEX SINGLE VALUE
| SALES_CUST_BIX |
|* 6 |
BITMAP INDEX SINGLE VALUE
| SALES_PROD_BIX |
-------------------------------------------------------------
Here we see selection filters being combined in a straightforward
way.
Tony Hasler http://tonyhasler.wordpress.com
Rewrite attempt 2:
SELECT time_id, amount_sold
FROM sales s
WHERE
s.cust_id IN (SELECT
FROM
WHERE
AND s.prod_id IN (SELECT
FROM
WHERE
cust_id
customers
cust_last_name = 'Everett')
prod_id
products
prod_category = 'Electronics');
This query returns fewer columns than the original query because the columns
from the dimension tables are missing
This query returns the same number of rows as the original query providing
that CUST_ID is a unique key in CUSTOMERS and PROD_ID is unique within
PRODUCTS.
Tony Hasler http://tonyhasler.wordpress.com
Enterprise edition execution plan
--------------------------------------------------------------| Id | Operation
| Name
|
--------------------------------------------------------------|
0 | SELECT STATEMENT
|
|
|
1 | VIEW
| VW_ST_EABF22F5
|
|
2 |
NESTED LOOPS
|
|
|
3 |
PARTITION RANGE ALL
|
|
|
4 |
BITMAP CONVERSION TO ROWIDS|
|
|
5 |
BITMAP AND
|
|
|
6 |
BITMAP MERGE
|
|
|
7 |
BITMAP KEY ITERATION
|
|
|
8 |
BUFFER SORT
|
|
|* 9 |
TABLE ACCESS FULL
| CUSTOMERS
|
|* 10 |
BITMAP INDEX RANGE SCAN| SALES_CUST_BIX
|
| 11 |
BITMAP MERGE
|
|
| 12 |
BITMAP KEY ITERATION
|
|
| 13 |
BUFFER SORT
|
|
|* 14 |
VIEW
| index$_join$_051
|
|* 15 |
HASH JOIN
|
|
|* 16 |
INDEX RANGE SCAN
| PRODUCTS_PROD_CAT_IX |
| 17 |
INDEX FAST FULL SCAN| PRODUCTS_PK
|
|* 18 |
BITMAP INDEX RANGE SCAN| SALES_PROD_BIX
|
| 19 |
TABLE ACCESS BY USER ROWID | SALES
|
--------------------------------------------------------------Tony Hasler http://tonyhasler.wordpress.com
Star transformations make an appearance
Multi-column indexes, and the INDEX_COMBINE (and the deprecated
AND_EQUAL) hints can be used to deal with combined selection filters
Star transformations are used to solve the distributed filter problem when
at least one of the filter predicates is a join condition with another table
Requires enterprise edition and the initialisation parameter set to TRUE or
TEMP_DISABLE
They are frequently used with “star” or “snowflake” schemas, hence the
name, but can be used elsewhere
BITMAP KEY ITERATION is the signature of a star transformation
BITMAP KEY ITERATION runs a loop performing lookups on indexes
(usually but not necessarily bitmap indexes)
The BITMAP MERGE operation merges the output of the loop
In the example, the BITMAP AND operation combines the results of two
BITMAP KEY ITERATION operations
Tony Hasler http://tonyhasler.wordpress.com
Rewrite 3: Getting our missing columns back
WITH q1
AS (SELECT /*+ no_merge */
time_id, amount_sold, prod_id, cust_id
FROM sales s
WHERE
s.cust_id IN (SELECT cust_id
FROM customers
WHERE cust_last_name = 'Everett')
AND s.prod_id IN (SELECT prod_id
FROM products
WHERE prod_category = 'Electronics'))
SELECT /*+ leading(p s c) use_hash(c) use_hash(p)
swap_join_inputs(c) no_swap_join_inputs(s) */
prod_name,cust_first_name,time_id,amount_sold
FROM customers c, products p, q1 s
WHERE
s.cust_id = c.cust_id
AND s.prod_id = p.prod_id
AND p.prod_category = 'Electronics';
Note: The hints and the last predicate are for demonstration purposes only
Tony Hasler http://tonyhasler.wordpress.com
Execution plan for rewrite 3:
-----------------------------------------------------------------| Id | Operation
| Name
|
-----------------------------------------------------------------|
0 | SELECT STATEMENT
|
|
|* 1 | HASH JOIN
|
|
|
2 |
TABLE ACCESS FULL
| CUSTOMERS
|
|* 3 |
HASH JOIN
|
|
|
4 |
TABLE ACCESS BY INDEX ROWID
| PRODUCTS
|
|* 5 |
INDEX RANGE SCAN
| PRODUCTS_PROD_CAT_IX |
|
6 |
VIEW
|
|
|
7 |
VIEW
| VW_ST_EABF22F5
|
|
8 |
NESTED LOOPS
|
|
|
9 |
PARTITION RANGE ALL
|
|
| 10 |
BITMAP CONVERSION TO ROWIDS|
|
| 11 |
BITMAP AND
|
|
| 12 |
BITMAP MERGE
|
|
| 13 |
BITMAP KEY ITERATION
|
|
| 14 |
BUFFER SORT
|
|
|* 15 |
TABLE ACCESS FULL
| CUSTOMERS
|
|* 16 |
BITMAP INDEX RANGE SCAN| SALES_CUST_BIX
|
| 17 |
BITMAP MERGE
|
|
| 18 |
BITMAP KEY ITERATION
|
|
| 19 |
BUFFER SORT
|
|
|* 20 |
VIEW
| index$_join$_060
|
|* 21 |
HASH JOIN
|
|
|* 22 |
INDEX RANGE SCAN
| PRODUCTS_PROD_CAT_IX |
| 23 |
INDEX FAST FULL SCAN| PRODUCTS_PK
|
|* 24 |
BITMAP INDEX RANGE SCAN| SALES_PROD_BIX
|
| 25 |
TABLE ACCESS BY USER ROWID | SALES
|
-----------------------------------------------------------------Tony Hasler http://tonyhasler.wordpress.com
Same as
rewrite 2
Thoughts so far?
Do we really need to rewrite our simple query in such a complex
way?
Answer: If you set
STAR_TRANSFORMATION_ENABLED=TEMP_DISABLE then the
un-hinted execution plan for the original query is (almost) identical
to that of rewrite 3!!!
But why do we need to access the CUSTOMERS table twice?
Couldn’t we cache the columns from the select list first time round?
Answer: You will cache if
STAR_TRANSFORMATION_ENABLED=TRUE
Tony Hasler http://tonyhasler.wordpress.com
Execution plan for original query when
STAR_TRANSFORMATION =TRUE
----------------------------------------------------------------------| Id | Operation
| Name
|
----------------------------------------------------------------------|
0 | SELECT STATEMENT
|
|
|
1 | TEMP TABLE TRANSFORMATION
|
|
|
2 |
LOAD AS SELECT
| SYS_TEMP_0FD9D6609_38DC9C |
|* 3 |
TABLE ACCESS FULL
| CUSTOMERS
|
|* 4 |
HASH JOIN
|
|
|
5 |
TABLE ACCESS FULL
| SYS_TEMP_0FD9D6609_38DC9C |
|* 6 |
HASH JOIN
|
|
|
7 |
TABLE ACCESS BY INDEX ROWID
| PRODUCTS
|
|* 8 |
INDEX RANGE SCAN
| PRODUCTS_PROD_CAT_IX
|
|
9 |
VIEW
| VW_ST_62EEF96F
|
| 10 |
NESTED LOOPS
|
|
| 11 |
PARTITION RANGE ALL
|
|
| 12 |
BITMAP CONVERSION TO ROWIDS|
|
| 13 |
BITMAP AND
|
|
| 14 |
BITMAP MERGE
|
|
| 15 |
BITMAP KEY ITERATION
|
|
| 16 |
BUFFER SORT
|
|
| 17 |
TABLE ACCESS FULL
| SYS_TEMP_0FD9D6609_38DC9C |
|* 18 |
BITMAP INDEX RANGE SCAN| SALES_CUST_BIX
|
| 19 |
BITMAP MERGE
|
|
| 20 |
BITMAP KEY ITERATION
|
|
| 21 |
BUFFER SORT
|
|
|* 22 |
VIEW
| index$_join$_051
|
|* 23 |
HASH JOIN
|
|
|* 24 |
INDEX RANGE SCAN
| PRODUCTS_PROD_CAT_IX
|
| 25 |
INDEX FAST FULL SCAN| PRODUCTS_PK
|
|* 26 |
BITMAP INDEX RANGE SCAN| SALES_PROD_BIX
|
| 27 |
TABLE ACCESS BY USER ROWID | SALES
|
-----------------------------------------------------------------------
Tony Hasler http://tonyhasler.wordpress.com
So is that it?
Star transformations are only a good idea if the selectivity
improvement outweighs the cost of the bitmap operations.
Star transformations may occasionally be used in mixed OLTP
environments by converting b-tree indexes to bitmaps
But execution plans using star transformations in a data
warehouse can almost always be optimized further:
− Denormalize: e.g. adding columns from CUSTOMERS and
PRODUCTS to the SALES table. You can then create multicolumn indexes or use INDEX_COMBINE operations
− Or if you don’t like wasting disk space….
Tony Hasler http://tonyhasler.wordpress.com
Bitmap join indexes
ALTER TABLE products MODIFY CONSTRAINT products_pk VALIDATE;
ALTER TABLE customers MODIFY CONSTRAINT customers_pk VALIDATE;
CREATE BITMAP INDEX sales_prod_category_bjx
ON sales (p.prod_category)
FROM products p, sales s
WHERE s.prod_id = p.prod_id
LOCAL;
CREATE BITMAP INDEX sales_cust_ln_bjx
ON sales (c.cust_last_name)
FROM customers c, sales s
WHERE c.cust_id = s.cust_id
LOCAL;
Tony Hasler http://tonyhasler.wordpress.com
Execution plan with bitmap join indexes
------------------------------------------------------------------------| Id | Operation
| Name
|
------------------------------------------------------------------------|
0 | SELECT STATEMENT
|
|
|
1 | NESTED LOOPS
|
|
|
2 |
NESTED LOOPS
|
|
|
3 |
HASH JOIN
|
|
|
4 |
TABLE ACCESS BY INDEX ROWID
| PRODUCTS
|
|
5 |
INDEX RANGE SCAN
| PRODUCTS_PROD_CAT_IX
|
|
6 |
PARTITION RANGE ALL
|
|
|
7 |
TABLE ACCESS BY LOCAL INDEX ROWID| SALES
|
|
8 |
BITMAP CONVERSION TO ROWIDS
|
|
|
9 |
BITMAP AND
|
|
| 10 |
BITMAP INDEX SINGLE VALUE
| SALES_CUST_LN_BJX
|
| 11 |
BITMAP INDEX SINGLE VALUE
| SALES_PROD_CATEGORY_BJX |
| 12 |
INDEX UNIQUE SCAN
| CUSTOMERS_PK
|
| 13 |
TABLE ACCESS BY INDEX ROWID
| CUSTOMERS
|
-------------------------------------------------------------------------
Tony Hasler http://tonyhasler.wordpress.com
What about standard edition users????
CREATE TABLE sales2
AS
SELECT * FROM sales;
CREATE INDEX sales2_cust_id_idx
ON sales2 (cust_id);
CREATE INDEX sales2_prod_id_idx
ON sales2 (prod_id);
Tony Hasler http://tonyhasler.wordpress.com
Standard edition alternative
WITH q1
AS (SELECT /*+ no_merge */
s.ROWID rid
FROM customers c, sales2 s
WHERE s.cust_id = c.cust_id AND c.cust_last_name = 'Everett')
,q2
AS (SELECT /*+ no_merge */
s.ROWID rid
FROM products p, sales2 s
WHERE s.prod_id = p.prod_id AND prod_category = 'Electronics')
SELECT /*+ no_star_transformation leading(q1 q2 s)
use_nl(s) use_nl(p) use_nl(c) */
prod_name,cust_first_name,time_id,amount_sold
FROM q1,q2,sales2 s,products p,customers c
WHERE
q1.rid = q2.rid AND q1.rid = s.ROWID
AND s.cust_id = c.cust_id AND s.prod_id = p.prod_id ;
Tony Hasler http://tonyhasler.wordpress.com
Standard edition execution plan (simulated)
--------------------------------------------------------------| Id | Operation
| Name
|
--------------------------------------------------------------|
0 | SELECT STATEMENT
|
|
|
1 | NESTED LOOPS
|
|
|
2 |
NESTED LOOPS
|
|
|
3 |
NESTED LOOPS
|
|
|
4 |
NESTED LOOPS
|
|
|* 5 |
HASH JOIN
|
|
|
6 |
VIEW
|
|
|
7 |
NESTED LOOPS
|
|
|* 8 |
TABLE ACCESS FULL
| CUSTOMERS
|
|* 9 |
INDEX RANGE SCAN
| SALES2_CUST_ID_IDX
|
| 10 |
VIEW
|
|
| 11 |
NESTED LOOPS
|
|
|* 12 |
VIEW
| index$_join$_003
|
|* 13 |
HASH JOIN
|
|
|* 14 |
INDEX RANGE SCAN
| PRODUCTS_PROD_CAT_IX |
| 15 |
INDEX FAST FULL SCAN | PRODUCTS_PK
|
|* 16 |
INDEX RANGE SCAN
| SALES2_PROD_ID_IDX
|
| 17 |
TABLE ACCESS BY USER ROWID| SALES2
|
|* 18 |
TABLE ACCESS BY INDEX ROWID| PRODUCTS
|
|* 19 |
INDEX UNIQUE SCAN
| PRODUCTS_PK
|
|* 20 |
INDEX UNIQUE SCAN
| CUSTOMERS_PK
|
| 21 |
TABLE ACCESS BY INDEX ROWID | CUSTOMERS
|
--------------------------------------------------------------Tony Hasler http://tonyhasler.wordpress.com
Some final pieces of trivia..
The deprecated STAR hint has nothing to do with star
transformations….DON’T USE IT.
STAR_TRANSFORMATION and NO_STAR_TRANSFORMATION
hints can be used to control star transformations.
The FACT hint can be used to specify the table to which the star
transformation is applied (although it is usually obvious).
The fact table can also be supplied as an argument to the
STAR_TRANSFORMATION hint (undocumented, unsupported
etc.)
Enabled integrity constraints are not strictly required on the
dimension table(s). However, if they are absent an extra join may
be seen to ensure the row count is accurate.
Enabled and validated Integrity constraints are required on
dimension tables used in bitmap join indexes.
Bitmaps from BITMAP KEY ITERATION operations can be
combined with any other bitmaps including those from bitmap join
indexes
Tony Hasler http://tonyhasler.wordpress.com
Summary
Star transformations are used to solve the problem of distributed
filters when at least one of the filters is a join predicate
Star transformations can be recognised by the presence of the
BITMAP KEY ITERATION operation in an execution plan.
Star transformations are usually found using bitmap indexes in
“star” schemas in a data warehouse but can occasionally be useful
in mixed OLTP environments with traditional b-tree indexes and in
non-star schemas.
Star transformations, like all bitmap combination operations, relies
on the fact that the cost of the operation is outweighed by
increased selectivity.
STAR_TRANSFORMATION_ENABLED=TEMP_DISABLE is to
avoid bugs but should rarely be needed these days.
Performance of queries using star transformations in a data
warehouse can almost always be improved by the use of bitmap
join indexes.
Tony Hasler http://tonyhasler.wordpress.com
Questions
Tony Hasler http://tonyhasler.wordpress.com
Download