Partitioning with Oracle 11G
Bert Scalzo, Domain Expert, Oracle Solutions
Bert.Scalzo@Quest.com
Copyright © 2006 Quest Software

About the Author …
Domain Expert & Product Architect for Quest Software

Oracle Background:
• Worked with Oracle databases for over two decades (starting with version 4)
• Work history includes time at both “Oracle Education” and “Oracle Consulting”

Academic Background:
• Several Oracle Masters certifications
• BS, MS and PhD in Computer Science
• MBA (general business)
• Several insurance industry designations

Key Interests:
• Data Modeling
• Database Benchmarking
• Database Tuning & Optimization
• "Star Schema" Data Warehouses
• Oracle on Linux – and specifically: RAC on Linux

Articles for:
• Oracle’s Technology Network (OTN)
• Oracle Magazine
• Oracle Informant
• PC Week (eWeek)
• Dell Power Solutions Magazine
• The Linux Journal
• www.linux.com
• www.orafaq.com
This presentation draws heavily on these areas.

Books by Author … Coming in 2008 …

Agenda
• Partitioning Benefits
• Partitioning History
• Partitioning Options
• Partitioning Advisor (if you’re licensed)
• Typical Data Warehousing Environment
• TPC-H “Data Warehouse” Benchmark
• TPC-H Results with Various Partitioning Strategies
• What about OLTP Environments and the TPC-C/E?
• Lessons Learned (and their relevance/application)
• Questions & Answers

Partitioning Benefits: Facts
• Manageability
– Classic “divide and conquer” technique
– More granular storage allocation options
– Keeps otherwise time-consuming operations viable
• Availability
– More granular online/offline options
– More granular rebuild/reorganization options
– More granular object-level backup/restore options
• Capacity Management
– Enables a “tiered storage architecture” approach
– More granular storage cost management decision points
• Performance
– Partition pruning
– Partition-wise joins

Partitioning Benefits: Opinion (Mine)
• Manageability – 40%
• Availability – 20%
• Capacity Management – 20%
• Performance – 20%

Why to Partition
• Don’t over-sell/over-expect the performance aspect
• Need to experiment to find the best approach for a given database
• Better to take longer at the start to get it right, because very often it’s far too expensive to change afterwards
• Examples demonstrate very positive performance, but better to be conservative and err on the side of caution – then be very pleasantly surprised…

Partition Pruning (Restriction Based)
• From the docs: In partition pruning, the optimizer analyzes FROM and WHERE clauses in SQL statements to eliminate unneeded partitions when building the partition access list. This enables Oracle Database to perform operations only on those partitions that are relevant ...
• “Divide and conquer” for performance
– Sometimes can yield an order-of-magnitude improvement
– But once again, best not to oversell and/or over-expect
• Some potential issues to be aware of:
– SQL*Plus autotrace can sometimes miss partition pruning
– “Old style” explain plans via a simple SELECT have issues too
– Best to always use DBMS_XPLAN and/or SQL_TRACE
Note: trace file analysis is much easier these days – SQL Developer + the free Hotsos plug-in, Metalink trace analysis scripts, Quest Toad DBA

Partition-Wise Joins (Multi-Object Based)
• From the docs: Partition-wise joins reduce query response time by minimizing the amount of data exchanged among parallel execution servers when joins execute in parallel. This significantly reduces response time and improves the use of both CPU and memory resources.
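The DBMS_XPLAN check recommended above can be sketched as follows. This is a minimal illustration, not the benchmark schema used later in the deck – the table, column and partition names are all hypothetical:

```sql
-- Hypothetical range-partitioned table (names are illustrative only)
CREATE TABLE sales (
  sale_id    NUMBER,
  sale_date  DATE,
  amount     NUMBER
)
PARTITION BY RANGE (sale_date) (
  PARTITION p2007 VALUES LESS THAN (DATE '2008-01-01'),
  PARTITION p2008 VALUES LESS THAN (DATE '2009-01-01')
);

-- Verify pruning with DBMS_XPLAN rather than autotrace
EXPLAIN PLAN FOR
  SELECT SUM(amount)
  FROM   sales
  WHERE  sale_date >= DATE '2008-06-01';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
-- In the output, look for PARTITION RANGE SINGLE (or ITERATOR) and
-- check the Pstart/Pstop columns: a single partition confirms pruning.
```

The same DBMS_XPLAN output also exposes partition-wise join behavior, which autotrace and “old style” explain plans can misreport.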
• Different flavors:
– Full – Single to Single
– Full – Composite to Single
– Full – Composite to Composite
– Partial – Single
– Partial – Composite
All of these affect the explain plan
• Indexing strategy counts:
– Local – Prefixed/Non-Prefixed
– Global

Picture Worth 1000 Words (from Docs)
Simple mantra: subdivide the work into equally paired chunks, then perform all that work using many parallel processes.
Make sure not to over-allocate CPUs – remember there will also be concurrent workload.

Partitioning History (from Oracle 11G training+)
• Oracle 5 – Before tablespaces, we had partitions
• Oracle 7 – Partition views – really more of a cheat

Partitioning Options – Part 1
IOTs can be partitioned as well in later versions of Oracle, so the basic choices are even more complex than this…

Partitioning Options – Part 2
Prior to 11G (source: Oracle white paper, 2007: Partitioning in Oracle Database 11g)

Partitioning Options – Part 3
Post 11G (source: Oracle white paper, 2007: Partitioning in Oracle Database 11g)
Very exciting new options…

Partitioning Advisor (if you’re licensed)
Advisor Central -> SQL Advisors -> SQL Access Advisor

Typical Data Warehouse Architecture
[Diagram: OLTP applications (VSAM, Sybase, Oracle, ISAM) flow through a staging area and ETL into an ODS and an Enterprise DW (Oracle), which in turn feed data marts DM 1 and DM 2; the TPC-H schema corresponds to this warehouse layer]

Typical Environments
                 OLTP               ODS                    OLAP            DM/DW
Business Focus   Operational        Operational Tactical   Tactical        Tactical Strategic
End User Tools   Client Server Web  Client Server Web      Client Server   Client Server Web
DB Technology    Relational         Relational             Cubic           Relational
Trans Count      Large              Medium                 Small           Small
Trans Size       Small              Medium                 Medium          Large
Trans Time       Short              Medium                 Long            Long
Size in Gigs     10 – 200           50 – 400               50 – 400        400 – 4000
Normalization    3NF                3NF                    N/A             0NF
Data Modeling    Traditional ER     Traditional ER         N/A             Dimensional
We’ll come back to this picture.

TPC-H Benchmark
• Industry-standard “data warehouse” benchmark
• URL: www.tpc.org/tpch
• Spec: http://tpc.org/tpch/spec/tpch2.7.0.pdf
• 8 tables
• 22
queries (answering complex business questions)
• Database scaling:
– Factor = 1, 10, 30, 100, 300, 1000, 3000, 10000, 30000, 100000
– Size in GB = 1, 10, 30, 100, 300, 1000, 3000, 10000, 30000, 100000

TPC-H Data Model
[Diagram: TPC-H schema with row counts per scale factor (SF) – PART SF × 200,000; LINEITEM SF × 6,000,000; PARTSUPP SF × 800,000; SUPPLIER SF × 10,000; CUSTOMER SF × 150,000; ORDERS SF × 1,500,000; NATION 25 rows; REGION 5 rows – with partitions and sub-partitions indicated]

TPC-H Permits Partitioning
… But what to do, what to do???

Disclosure Reports
http://tpc.org/tpch/results/tpch_perf_results.asp

Disclosure Report – Lots of Info
This is where people document exactly which advanced database features and storage parameters they used – this info is invaluable.

Disclosure Report – Appendix B

Sample Expensive Query

Example Explain Plan
Explain complete.
Plan hash value: 2545634784

----------------------------------------------------------------------------------------------
| Id  | Operation               | Name       | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |            | 42533 |  5648K|       |   641K  (1)| 01:57:41 |
|   1 |  SORT GROUP BY          |            | 42533 |  5648K|   105M|   641K  (1)| 01:57:41 |
|*  2 |   HASH JOIN             |            |   715K|    92M|       |   631K  (1)| 01:55:51 |
|   3 |    TABLE ACCESS FULL    | H_NATION   |    25 |   725 |       |     3   (0)| 00:00:01 |
|*  4 |    HASH JOIN            |            |   715K|    72M|       |   631K  (1)| 01:55:51 |
|   5 |     TABLE ACCESS FULL   | H_SUPPLIER |   100K|   781K|       |   646   (1)| 00:00:08 |
|*  6 |     HASH JOIN           |            |   720K|    68M|    68M|   631K  (1)| 01:55:44 |
|*  7 |      HASH JOIN          |            |   751K|    59M|   232M|   589K  (1)| 01:48:10 |
|*  8 |       HASH JOIN         |            |  3004K|   197M|  4984K|   485K  (1)| 01:28:57 |
|*  9 |        TABLE ACCESS FULL| H_PART     |   100K|  3808K|       | 11805   (1)| 00:02:10 |
|  10 |        TABLE ACCESS FULL| H_LINEITEM |   60M |  1716M|       |   342K  (1)| 01:02:52 |
|  11 |       TABLE ACCESS FULL | H_ORDER    |   15M |   200M|       | 72122   (1)| 00:13:14 |
|  12 |      TABLE ACCESS FULL  | H_PARTSUPP |  8000K|   122M|       | 25945   (1)| 00:04:46 |
----------------------------------------------------------------------------------------------

Predicate
Information (identified by operation id):
---------------------------------------------------
   2 - access("S_NATIONKEY"="N_NATIONKEY")
   4 - access("S_SUPPKEY"="L_SUPPKEY")
   6 - access("PS_SUPPKEY"="L_SUPPKEY" AND "PS_PARTKEY"="L_PARTKEY")
   7 - access("O_ORDERKEY"="L_ORDERKEY")
   8 - access("P_PARTKEY"="L_PARTKEY")
   9 - filter("P_NAME" LIKE :SYS_B_2)

Method of Attack
• Since many data warehouses are used for data mining, we can’t always know every possible query likely to run – thus an aggregate measure of success is needed
• So we’ll compare the benchmark’s weighted performance scores for the TPC-H using various partitioning schemes (all within spec, of course…)
• The goal will be to find the best overall partitioning…
• Then we’ll examine some specific explain plans…

10G Sample Test Cases
• 10G Simple Approach (just the huge tables)
– Range: ORDER (order date)
– Hash: LINEITEM (order key)
• 10G Basic Approach (single-level partitions)
– Range: ORDER (order date)
– Hash: LINEITEM (order key)
– List: CUSTOMER (nation key)
– Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)
• 10G Complex Approach (composite partitions)
– Range-Hash: ORDER (order date & cust key)
– Multi-Hash: LINEITEM (part, supp & order keys)
– List: CUSTOMER (nation key)
– Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)

11G Sample Test Cases
• 11G Simple Approach (+Interval)
– Interval-Hash: ORDER (order date & cust key)
– Multi-Hash: LINEITEM (part, supp & order keys)
– List: CUSTOMER (nation key)
– Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)
• 11G Basic Approach (+Virtual)
– Interval-Hash: ORDER (virtualized order date & cust key)
– Multi-Hash: LINEITEM (part, supp & order keys)
– List: CUSTOMER (nation key)
– Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)
• 11G Complex Approach (+REF)
– Interval-Hash: ORDER (virtualized order date & cust key)
– REF: LINEITEM (order key)
– List: CUSTOMER (nation key)
– Hash: PART, SUPPLIER and PARTSUPP (part & supp keys)

Is that It?
• No – just six fairly obvious high-level scenarios
• Your selections and actual mileage will vary
• Experimentation usually yields the best results
• Always trust “empirical results” over conjecture
• So improved response time beats a better-looking explain plan
• Remember, DWs usually have unpredictable queries
• So don’t tune for just a few queries; look for the best overall and/or most generic performance solution

Intermediate Results can be Misleading
[Charts: TPC-H Power and Throughput scores for test cases bmf1 – bmf5]
The TPC-H Power score seemingly implies that every partitioning scheme is incrementally better.
The TPC-H Throughput score seems to show that non-partitioned is equal to the best.

Final Results tell the Real Truth
[Charts: TPC-H QphH and $/QphH scores for test cases bmf1 – bmf5]
The TPC-H Query-per-Hour score shows that some partitioning schemes are better, and some are not.
The TPC-H $/Query-per-Hour score confirms the inverse in terms of dollars per unit of work.

Why such seemingly Opposite Results???
• Run times and explain plans apply to a single measurable operation
• Even aggregated & averaged run times don’t relate the entire truth!
• Need an answer based upon sound mathematics (reliable & repeatable)

TPC-C Benchmark
• Historical industry-standard “OLTP” benchmark
• URL: www.tpc.org/tpcc
• Spec: http://tpc.org/tpcc/spec/tpcc_current.pdf
• Probably the most used & widely quoted benchmark
• But suffers from overly simplistic design & code logic
• Generally considered unreliable with modern RDBMSs
• But still a decent rough “sounding board” for many…
• Being replaced by the newer TPC-E (later slides)

TPC-C Data Model
[Diagram: TPC-C schema]
Base scaling unit: # terminals/warehouse (i.e.
concurrent users)
[The diagram also indicated clustered tables, partitions and sub-partitions]

TPC-E Benchmark
• Emerging industry-standard “OLTP” benchmark
• URL: www.tpc.org/tpce
• Spec: http://tpc.org/tpce/spec/TPCE-v1.5.1.pdf
• Very new and still evolving – but highly promising
• Not too many published TPC-E results as of yet…
• Design not compromised by RDBMS features
• Much more realistic (i.e. real-world) in nature
• Nowhere near as easy as the old TPC-C test

TPC-E Data Model

TPS is Moot, Average Response Time is King
[Chart: tpmC across test configurations – 584, 582, 579, 578]
But wait: adding clustering & partitioning yields a negative result – why???
Look to Statspack, AWR and ADDM reports to investigate…

Single block reads 32 ms!
The DISTRICT table needs to be clustered. Suggest fixing one major item per test iteration, so the choice was made to address this first.

Single block read 32 ms
Clustering worked; it made SQL the #1 performance issue – as was expected.

Single block read 5 ms
Partitioning did not shine through just yet – possibly skewed by the first issue.

Switch MEMORY_TARGET to SGA/PGA_TARGETS: -13%
Notice that manual memory management resulted in a 13% gain!
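The memory switch just described might look like the following sketch for an spfile-managed 11g instance. The sizes shown are purely illustrative – real values should come from your own AWR/ADDM evidence, and the instance must be restarted for the MEMORY_MAX_TARGET change to take effect:

```sql
-- Disable 11g automatic memory management (illustrative sizes only)
ALTER SYSTEM SET memory_target = 0 SCOPE = SPFILE;
ALTER SYSTEM SET memory_max_target = 0 SCOPE = SPFILE;

-- Set explicit SGA/PGA floors instead, so memory allocation
-- stays predictable under a known, well-defined workload
ALTER SYSTEM SET sga_target = 2G SCOPE = SPFILE;
ALTER SYSTEM SET pga_aggregate_target = 1G SCOPE = SPFILE;

-- Then: SHUTDOWN IMMEDIATE / STARTUP to activate the new settings
```

This trades the convenience of a single MEMORY_TARGET for manual control – worthwhile here only because the workload’s memory profile was well understood.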
But wait, there’s more (isn’t that almost always the case)…

Now it’s all just SQL, so time for the SQL Tuning Advisor & SQL Tuning Sets

SQL Tuning Sets & 11G Result/Client Caches: -9%
Going to stop – we have quickly reached the “good enough” point.

Architecture Findings
                 OLTP                                  DM/DW
Business Focus   Operational                           Tactical Strategic
End User Tools   Client Server Web                     Client Server Web
DB Technology    Relational                            Relational
Trans Count      Large                                 Small
Trans Size       Small                                 Large
Trans Time       Short                                 Long
Size in Gigs     10 – 200                              400 – 4000
Normalization    3NF                                   0NF
Data Modeling    Traditional ER                        Dimensional
Partitioning     Mostly partition elimination –        Mostly partition-wise joins –
                 design to eliminate per object        design to parallelize across objects

Other Interesting Findings
• 64-bit scales much more reliably than 32-bit, even on the same hardware and using the <= 4GB memory model
• If you know the application code’s nature and it’s well defined, manual memory management may be better
• Using manual SGA/PGA targets with floors yields much more scalable results and also more predictable patterns
• Partitioning is not an automatic bonus; you must experiment to identify the optimal partitioning scheme per situation
• Don’t forget older technologies like clusters; they can still add a positive to the overall equation in certain cases
• Don’t forget SQL Tuning Sets & the SQL Tuning Advisor, or the “explain plans” may not in fact be the best obtainable
• Don’t forget 11G’s Result and Client Caches

Questions and Answers … Thank you
Presenter: Bert Scalzo
E-mail: Bert.Scalzo@Quest.com
Note: these slides should be available on the Open World web site, but I’ll also make sure to post them on my company’s web site:
www.toadworld.com/Experts/BertScalzosToadFanaticism/tabid/318/Default.aspx