Judicious use of Histograms for Oracle Applications Tuning John Kanagaraj Cisco Systems Inc. ora_apps_dba_y@yahoo.com Learning Objectives • As a result of this presentation, you will be able to know: – What Histograms are – How Histograms influence SQL performance – How EBS R12/11i uses Histograms – How to configure Histogram collection in EBS – A real life example in an EBS environment Speaker’s Qualifications • IT Architect @ Cisco Systems Inc • Executive Editor for IOUG’s SELECT Journal • Oracle ACE • Author and Technical Editor – Oracle Database 10g Insider Solutions – Oracle Wait Interface – Oracle RAC Handbook • Skilled in Oracle Database and Applications Tuning • Link up with me on LinkedIn Presentation Agenda • • • • • • • Introduction to Optimizer and Histograms Histograms’ influence on SQL Execution Caveats and overheads for Histograms Applicability to EBS R12/11i Collection methods in EBS R12/11i A real-life example Wrap up, Q & A The Oracle Optimizer • • • • • The Optimizer twins - CBO and RBO CBO available since 7.0 RBO de-supported in Oracle Db 10g EBS11i first to use CBO Optimizer determines data access path for SQL statements – CBO uses Cost of access (Table/Index Stats) – RBO uses a set of rules (Set of 15 Rules) CBO • CBO characteristics – Requires Table, Index, Column statistics – Histograms introduced in Oracle 7.3 (more later) – Caters to new object types and accesses – Flexible access path – Traced via the 10053 event – Statistics needs to be kept up to date – Sensitive to Optimizer parameters and algorithms – “When she is good, she is very, very good..” What are Histograms? • Histograms describe data skew – Columns with non-unique, repeating keys – Few distinct values forming sizeable portion of the row count • Typically on secondary keys such as Organization IDs, Type IDs • Histograms quantify spread of distinct values • Described in terms of buckets – Height-balanced or Width-balanced How are Histograms used? • Simplest example – FTS vs Indexed read – Large table indexed by few distinct values – Histogram enables choice of FTS vs Index – CBO considers total cost of I/O • Complex example – Multi-table join – Histograms help determine Table join order – Join producing least number of rows first • Histograms greatly influence Selectivity • Helps tune SQL without code change Histograms and the DD • Collected using DBMS_STATS (all levels) • Exposed via the following views – DBA_TAB_HISTOGRAMS – DBA_PART_HISTOGRAMS (partitioned tab) – Equivalent ALL_ and USER_ views • Based on HISTGRM$ and HIST_HEAD$ • Uses memory in the Data Dictionary Cache (SHARED_POOL_SIZE) • Affects parse time (minimal) Caveats • Version specifics – Histograms considered only when hard-coded values used in Oracle 8i – Histograms considered at first parse even if bind variables are used in versions > Oracle 9i • Bind variable peeking in Oracle 9i/10g – – – – – – Initial parse determines future paths! Widely varying and unpredictable performance Performance could vary across RAC nodes Controlled by “_optim_peek_user_binds” 10g increased this problem (METHOD_OPT default) “Fixed” in Oracle 11g using Adaptive Cursor Sharing Histograms and EBS • Histograms are used in EBS R12/11i • But… we collect them differently! (good) • Stats gathering: ‘Gather <Level> Statistics’ – Calls FND_STATS which calls DBMS_STATS – Guided collection of Histograms • Seeded table FND_HISTOGRAM_COLS – Contains Application ID, Table, Column and Maximum Number of Histograms (254) – Inserted into by (internal use only) FND_STATS.LOAD_HISTOGRAM_COLS – Initial seed data; additions via patches Histogram candidates • Use not-so-well documented FND_STATS.CHECK_HISTOGRAM_COLS • Algorithm from code – Checks leading cols in non-unique indexes – Single value occupies >=1/75th of sample select decode(floor(sum(tot)/(max(cnt)*75)),0, 'YES','NO') HIST from (select count(col) cnt , count(*) tot from tab sample (S) where col is not null group by col); – Count of at least 3000 recommended A real life example • Inventory Report Performance – Widely varying run-times – 1.5 hours for some orgs, 2-3 minute for others – Differentiating parameter – Org ID – Complex SQL handling various parameters – Top CPU consumer when run – Large tables involved (CST_ITEM_COSTS, RCV_SHIPMENT_LINES, MTL_ONHAND_QUANTITIES_*) A real life example (contd.) • The investigation – – – – – Large amount of PIO and LIO (V$SESSTAT) Used ‘10046 Level 12’ extended SQL Trace Level 12 (8+4) shows both bind (4) and wait events (8) Determine objects accessed and count TKPROF results call cnt ----- --Parse 2 Execute 1 Fetch 1 ----- --total 4 cpu ------0.04 0.01 5102.81 ------5102.86 elapsed ------0.06 0.01 5425.30 ------5425.37 disk query current rows ----- ------ ------- ----0 0 0 0 0 0 0 0 37506 226144503 5 47 ----- ---------- --- -----37506 226144503 5 47 A real life example (contd.) • The investigation (continued) – Extended trace shows exact fetches and values – Refer “Optimizing Oracle” (Cary Millsap) and the “OWI” book by Kirti Deshpande and others – Metalink notes 39817.1 and 171647.1 BINDS #18: bind 0: dty=1 mxl=128(40) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 size=128 offset=0 bfp=01b16e70 bln=128 avl=03 flg=05 value="558" <snipped> WAIT #18: nam='file open' ela= 0 p1=0 p2=0 p3=0 WAIT #18: nam='db file sequential read' ela= 4 p1=432 p2=169056 p3=1 WAIT #18: nam='db file sequential read' ela= 1 p1=72 p2=195769 p3=1 WAIT #18: nam='db file sequential read' ela= 1 p1=73 p2=197743 p3=1 WAIT #18: nam='db file scattered read' ela= 2 p1=65 p2=95737 p3=8 FETCH #18:c=510281,e=542530,p=37506,cr=226144503,cu=5, mis=0,r=47,dep=0,og=4,tim=2464590823 A real life example (contd.) • The investigation (continued) – SQL to show segments accessed select segment_type, segment_name from dba_extents where file_id = 432 and 169056 between block_id and block_id + blocks – 1; – Indexed read of MTL_ONHAND_QUANTITIES – Majority was indexed reads of the 2.6 Gb CST_ITEM_COSTS table – Indexed read is good, but not when it is excessive – This was probably due to wrong access path for that ORG_ID (confirmed via Execution plan) i.e. skew in data patterns for ORG_IDs A real life example (contd.) • The Solution – Connecting the dots (patterns – Data, parameter, I/O) – Ran FND_STATS.CHECK_HISTOGRAM_COLS execute fnd_stats.check_histogram_cols('PO.RCV_SHIPMENT_LINES, INV.MTL_ONHAND_QUANTITIES_DETAIL,BOM.CST_ITEM_COSTS', factor=>75,percent=>99,degree=>4); – – – – Showed need for Histograms on some of the columns Added rows into FND_HISTOGRAM_COLS Reran “Gather Table Stats” for changed tables Modified Report to use Literal values (ok since this is parsed only once) – No functional or SQL structural change A real life example (contd.) • The Result call cnt cpu elapsed disk ---- ---- ---- ------- ----Parse 2 0.04 0.02 3 Execute 1 0.00 0.01 0 Fetch 1 30.71 101.28 42423 ---- ---- ---- ------- ----total 4 30.75 101.31 42426 query current rows ------ ------- ---10 0 0 0 0 0 913573 32 47 ------ ------- ---913583 32 47 – Dramatic reduction in CPU, LIO and runtime – Increased PIO; Most of it was FTS to large tables – Final Runtime: 1 ½ to 3 minutes for all Orgs!!! Conclusion • • • • • • • • Knowledge of CBO and Histograms Business knowledge (data patterns, usage) Use of tracing tools Connecting the dots Knowledge of sparsely documented features Test, Test, Test! (And Validate!!) Judicious use of Histograms Validate results, again using tracing Q&A