40166 Oracle Database 10g SQL Model Clause Andy Witkowski, Architect Thomas Kyte, VP Oracle Corporation What’s now in SQL for Modeling Aggregation Enhancements – Cube, Rollup, Grouping Sets – New aggregates: Inverse Distribution, FIRST/LAST,etc Analytic Functions – Window Functions: Rank, Moving, Cumulative – Statistical Functions: Correlation, Linear Regression,etc Old tools still have more modeling power than SQL SQL Model enhances SQL with modeling power – Spreadsheets, MOLAP engines Case Study – Modeling with Excel Excel fits well at the personal scale – – – UI and Formatting Calculations (build-in functions, formulas) What-If analysis Excel fits poorly at corporate scale for modeling Cryptic row-column addressing – No metadata, No standards, No mathematical model – 100s of spreadsheets and consolidation by hand – Does not scale (1000’s formulas, TB of data) Replace Excel Modeling with SQL Modeling – Perpetual data exchange: databases->Excel – Modeling with SQL Model Language: Spreadsheet-like calculations in SQL – – – – – – – Inter-row calculation. Treats relations as an N-Dim array Symbolic references to cells and their ranges Multiple Formulas over N-Dim arrays Automatic Formula Ordering Recursive Model Solving Model is a relation & can be processed further in SQL Multiple arrays with different dimensionality in one query Performance – – – Parallel Processing in partitioning & formulas Multiple-self joins with one data access structure Multiple UNIONs with one data access structure Why Better? – – – Automatic Consolidation (models as views – combine using SQL) Self Adjusting (as database changes no need to re-define) One version of truth (calc directly over data base, no exchange) SQL Model Concepts prod time vcr dvd 2001 2001 Define Relation as Array s 9 0 Relation Array SELECT prod, time, s FROM sales 1 2 3 4 1999 5 6 7 8 2000 9 0 1 2 2001 dvd tv pc vcr prod DIMENSION BY (prod, time) MEASURES (s) prod time vcr dvd 2001 2001 Define Business Rules s 9 0 Relation Array SELECT prod, time, s FROM sales 1 2 3 4 1999 5 6 7 8 2000 9 0 1 2 2001 dvd tv pc vcr DIMENSION BY (prod, time) MEASURES (s) RULES UPSERT ( Sales in 2000 2x of previous year s[ANY, 2000] = s[CV(prod), CV(time) - 1] * 2, Predict vcr sales in 2002 s[vcr, 2002] = s[vcr, 2001] + s[vcr, 2000], Predict dvd sales in 2002 s[dvd, 2002] =AVG(s) [CV(prod), time<2001] ) prod prod time vcr dvd 2001 2001 Evaluate Formulas – 1st s 9 0 Relation Array SELECT prod, time, s FROM sales 1 2 3 4 1999 2 4 6 8 2000 9 0 1 2 2001 dvd tv pc vcr DIMENSION BY (prod, time) MEASURES (s) RULES UPSERT ( Sales in 2000 2x of previous year s[ANY, 2000] = s[CV(prod), CV(time) - 1] * 2, Predict vcr sales in 2002 s[vcr, 2002] = s[vcr, 2001] + s[vcr, 2000], Predict dvd sales in 2002 s[dvd, 2002] = AVG(s) [CV(prod), time<2001] ) prod time vcr dvd 2001 2001 Evaluate Formulas – 2nd s 9 0 Relation SELECT prod, time, s FROM sales 1 2 3 4 1999 2 4 6 8 2000 9 0 1 2 2001 DIMENSION BY (prod, time) MEASURES (s) 2002 11 RULES UPSERT ( Sales in 2000 2x of previous year s[ANY, 2000] = s[CV(prod), CV(time) - 1] * 2, Predict vcr sales in 2002 s[vcr, 2002] = s[vcr, 2001] + s[vcr, 2000], Predict dvd sales in 2002 s[dvd, 2002] = AVG(s) [CV(prod), time<2001] ) vcr dvd tv pc prod time vcr dvd 2001 2001 Evaluate Formulas – 3rd s 9 0 Relation SELECT prod, time, s FROM sales 1 2 3 4 1999 2 4 6 8 2000 9 0 1 2 2001 DIMENSION BY (prod, time) MEASURES (s) 2002 11 3 RULES UPSERT ( Sales in 2000 2x of previous year s[ANY, 2000] = s[CV(prod), CV(time) - 1] * 2, Predict vcr sales in 2002 s[vcr, 2002] = s[vcr, 2001] + s[vcr, 2000], Predict dvd sales in 2002 s[dvd, 2002] = AVG(s) [CV(prod), time<2001] ) vcr dvd tv pc prod time vcr dvd 2001 2001 Return as Relation s 9 0 Relation SELECT prod, time, s FROM sales 1 2 3 4 1999 2 4 6 8 2000 9 0 1 2 2001 2002 11 3 vcr dvd DIMENSION BY (prod, time) MEASURES (s) tv pc Self-join. join + UNION join + UNION RULES UPSERT ( s[ANY, 2000] = s[CV(prod), CV(time) - 1] * 2, s[vcr, 2002] = s[vcr, 2001] + s[vcr, 2000], s[dvd, 2002] = AVG(s) [CV(prod), time<2001] ) Relation again vcr dvd 2001 2001 9 0 vcr dvd 2002 2002 11 3 Rows updated & inserted by the Model clause Model Clause – Components Model clause Partitioning SELECT region, prod, time, s Dims of array FROM sales GROUP BY region, prod, time MODEL PARTITION BY (region) DIMENSION BY (prod, time) MEASURES (sum(sales) s, count(sales) c) RULES ITERATE ( ) UNTIL ( ) Model options ( s[ANY, 2000] = s[CV(prod), CV(time) - 1], s[dvd, 2003] = s[dvd, 2002] + s[dvd, 2001], Formulas UPSERT s[vcr, 2003] = AVG(s) [vcr, time < 2001] ) ORDER BY region, product, time, s; Formula Options Key Concepts (1) New SQL Model Clause: – – – Data as N-dim arrays with DIMENSIONS & MEASURES Data can be PARTITION-ed - creates an array per partition Formulas defined over the arrays express a (business) model Formulas within a Model: – – – – – Use symbolic addressing using familiar array notation Can be ordered automatically based on dependency between cells Can be recursive with a convergence condition – recursive models Can UPDATE or UPSERT cells Support most SQL functions including aggregates Key Concepts (2) Result of a SQL Model is a relation – – Can participate further in processing via joins, etc. Can define views containing Model computations SQL Model is the last query clause – – Executed after joins, aggregation, window functions Before ORDER BY Main Model and Reference Models – Can relate models of different dimensionality Formula Fundamentals (1) Formulas: SQL expressions over cells with aggs, functions, etc. Formula has a left and right side and represents assignment – – s[‘vcr’, 2002] = s[‘vcr’, 2001] + s[‘vcr’, 2000] s[‘vcr’, 2002] = AVG(s)[‘vcr’, t<2002] – single ref – multi ref on right Left side can qualify multiple cells – – – s[p IN (‘vcr’,’dvd’), t<2002] = 1000 – multi ref on left s[ANY, t=2002] = 2 * s[CV(p), CV(t)-1] – left-right correlation s[p IN (SELECT prod FROM prod_tb), 2000] = 1000 Formula can operate in update or upsert mode – – update s[‘vcr’, 2002] = s[‘vcr’, 2001] + s[‘vcr’, 2000] upsert s[‘vcr’, 2002] = s[‘vcr’, 2001] + s[‘vcr’, 2000] Formula Fundamentals (2) Function CV(dimension) propagates values from left to the right side. In example, products in 2002 are sum of two previous years. s[ANY, 2002] = s[CV(p), CV(t)-1] +s[CV(p), CV(t) – 2] Formula Fundamentals (2) Function CV(dimension) propagates values from left to the right side. In example, products in 2002 are sum of two previous years. s[ANY, 2002] = s[CV(p), CV(t) -1] + s[CV(p), CV(t) – 2] Formula result can depend on processing order. Can specify order in each formula. E.g., shift by time: s[‘vcr’, ANY] ORDER BY t = s[‘vcr’, CV(t) - 1] Formula Fundamentals (2) Function CV(dimension) propagates values from left to the right side. In example, products in 2002 are sum of two previous years. s[ANY, 2002] = s[CV(p), CV(t) -1] + s[CV(p), CV(t) – 2] Formula result can depend on processing order. Can specify order in each formula. E.g., shift by time: s[‘vcr’, ANY] ORDER BY t = s[‘vcr’, CV(t) - 1] vcr vcr vcr vcr vcr 2001 2002 2003 2004 2005 300.00 0 350.00 300.00 400.00 450.00 500.00 ORDER BY t Formula Fundamentals (2) Function CV(dimension) propagates values from left to the right side. E.g, products in 2002 are sum of two previous years s[ANY, 2002] = s[CV(p), CV(t) -1] + s[CV(p), CV(t) – 2] Formula result can depend on processing order. Can specify order in each formula. E.g., shift by time: s[‘vcr’, ANY] ORDER BY t = s[‘vcr’, CV(t) - 1] vcr vcr vcr vcr vcr 2001 2002 2003 2004 2005 300.00 0 350.00 300.00 400.00 350.00 450.00 500.00 ORDER BY t Formula Fundamentals (2) Function CV(dimension) propagates values from left to the right side. E.g, products in 2002 are sum of two previous years s[ANY, 2002] = s[CV(p), CV(t) -1] + s[CV(p), CV(t) – 2] Formula result can depend on processing order. Can specify order in each formula. E.g., shift by time: s[‘vcr’, ANY] ORDER BY t = s[‘vcr’, CV(t) - 1] vcr vcr vcr vcr vcr 2001 2002 2003 2004 2005 300.00 350.00 400.00 450.00 500.00 ORDER BY t 0 300.00 350.00 400.00 450.00 Model Options – Fundamentals global options MODEL [ UNIQUE DIMENSIONS | UNIQUE SINGLE REFERENCE ] [ IGNORE NAV | KEEP NAV ] PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES [ UPDATE | UPSERT ] rule [ AUTOMATIC ORDER | SEQUENTIAL ORDER ] ( s[ANY, 2002] = 1.2 * s[CV(product), 2002], s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001], s[‘vcr’, 2003] = AVG(s) [‘vcr’, t BETWEEN 1995 AND 2000], s[‘video’, 2003] = s[‘dvd’, 2003], s[‘vcr’, 2003] ) options NAV Options: Handling Sparse Data West dvd 2001 300.00 West tv West vcr West vcr 2002 2001 2002 500.00 200.00 400.00 MODEL KEEP NAV PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPSERT ( s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] s[‘tv’ ,2003] = sum(s) [‘tv’, t BETWEEN 2001 AND 2002] ) keep nav 2001 ? West West West West West West dvd tv dvd tv vcr vcr 2001 2002 2003 2003 2001 2002 300.00 500.00 500.00 200.00 400.00 NAV Options: Handling Sparse Data West dvd 2001 300.00 West tv West vcr West vcr 2002 2001 2002 500.00 200.00 400.00 MODEL INGNORE NAV PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPSERT ( s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] s[‘tv’ ,2003] = sum(s) [‘tv’, t BETWEEN 2001 AND 2002] ) ignore nav assume 0 West West West West West West dvd tv dvd tv vcr vcr 2001 2002 2003 2003 2001 2002 300.00 500.00 300.00 500.00 200.00 400.00 NAV Options: Handling Sparse Data West dvd 2001 300.00 West tv West vcr West vcr 2002 2001 2002 500.00 200.00 400.00 MODEL KEEP NAV PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPSERT ( s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] s[‘tv’ ,2003] = sum(s) [‘tv’, t BETWEEN 2001 AND 2002] ) keep nav ignore nav assume 0 2001 ? West West West West West West dvd tv dvd tv vcr vcr 2001 2002 2003 2003 2001 2002 300.00 500.00 500.00 200.00 400.00 West West West West West West dvd tv dvd tv vcr vcr 2001 2002 2003 2003 2001 2002 300.00 500.00 300.00 500.00 200.00 400.00 Automatic Formula Ordering MODEL PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPDATE AUTOMATIC ORDER ( F1: s[‘vcr’, 2003] = AVG(s) [‘vcr’, t BETWEEN 1995 AND 2002], F2: s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001], F3: s[ANY, 2002] = s[CV(p), 2001] + s[CV(p), 1999] ) Automatic Formula Ordering MODEL PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPDATE AUTOMATIC ORDER ( F1: s[‘vcr’, 2003] = AVG(s) [‘vcr’, t BETWEEN 1995 AND 2002], F2: s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001], F3: s[ANY, 2002] = s[CV(p), 2001] + s[CV(p), 1999] ) F1 depends on F3 and F2 depends on F3, thus F3 automatically first: Automatic Formula Ordering MODEL PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPDATE AUTOMATIC ORDER ( F1: s[‘vcr’, 2003] = AVG(s) [‘vcr’, t BETWEEN 1995 AND 2002], F2: s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001], F3: s[ANY, 2002] = s[CV(p), 2001] + s[CV(p), 1999] ) F1 depends on F3 and F2 depends on F3, thus F3 automatically first: RULES UPDATE AUTOMATIC ORDER ( F3: s[ANY, 2002] = s[CV(p), 2001] + s[CV(p), 1999] F2: s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001], F1: s[‘vcr’, 2003] = AVG(s) [‘vcr’, t BETWEEN 1995 AND 2000] ) UPDATE, UPSERT & Partitions Region East East East West Product Time dvd 2001 dvd 2002 vcr 2002 dvd 2001 s 100 150 100 200 MODEL PARTITION BY (r) DIMENSION BY (p,t) MEASURES (s) RULES UPDATE IGNORE NAV ( UPDATE s[‘vcr’, 2002] = s[‘vcr’, 2002] * 1.2, UPSERT s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] ) UPDATE, UPSERT & Partitions Region East East East West Product Time dvd 2001 dvd 2002 vcr 2002 dvd 2001 s 100 150 100 200 MODEL PARTITION BY (r) DIMENSION BY (p,t) MEASURES (s) RULES UPDATE IGNORE NAV ( UPDATE s[‘vcr’, 2002] = s[‘vcr’, 2002] * 1.2, UPSERT s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] ) Region East East East Product Time dvd 2001 dvd 2002 vcr 2002 West dvd 2001 Old s 100 150 100 200 New s 100 100 120 200 updated UPDATE, UPSERT & Partitions Region East East East West Product Time dvd 2001 dvd 2002 vcr 2002 dvd 2001 s 100 150 100 200 MODEL PARTITION BY (r) DIMENSION BY (p,t) MEASURES (s) RULES UPDATE IGNORE NAV ( UPDATE s[‘vcr’, 2002] = s[‘vcr’, 2002] * 1.2, UPSERT s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] ) Region East East East East West West Product Time dvd 2001 dvd 2002 vcr 2002 dvd 2003 dvd 2001 dvd 2003 Old s 100 150 100 200 - New s 100 100 120 250 200 200 updated upserted Different dimensions: Reference Relate Models with different dimensions. Represent each as ndimensional array: one main, others as reference or lookup arrays. Sales Table c p USA dvd USA tv Poland vcr France vcr Conv table converts currency to $ t 2001 2001 2001 2001 s 300.00 $ 500.00 $ 200.00 zl 100.00 fr c USA Poland France ratio 1 0.24 0.12 SELECT c, p, t, s FROM sales MODEL REFERENCE convert ON (SELECT c, ratio FROM conv) DBY (c) MEASURES(r) MAIN DIMENSION BY (c,p,t) MEASURES (s) RULES UPSERT ( s[ANY, ANY, ANY] = r[CV(c)] * s[CV(c), CV(p), CV(t)] ) Different dimensions: Reference Sales Table c p USA dvd USA tv Poland vcr France vcr t 2001 2001 2001 2001 Conv table converts currency to $ c ratio USA 1 Poland 0.24 France 0.12 s 300.00 $ 500.00 $ 200.00 zl 100.00 fr SELECT c, p, t, s FROM sales MODEL REFERENCE convert ON (SELECT c, ratio FROM conv) DBY (c) MEASURES(r) MAIN DIMENSION BY (c,p,t) MEASURES (s) RULES UPSERT ( s[ANY, ANY, ANY] = r[CV(c)] * s[CV(c), CV(p), CV(t)] ) USA dvd USA tv Poland vcr France vcr 2001 2001 2001 2001 300.00 $ 500.00 $ 48.00 $ 12.00 $ Converted values Recursive Model Solving Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (8) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 128 4 64 5 32 6 16 7 8 8 4 Recursive Model Solving Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (8) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 128 4 64 5 32 6 16 7 8 8 4 Recursive Model Solving Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (8) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 128 4 64 5 32 6 16 7 8 8 4 Recursive Model Solving Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (8) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 128 4 64 5 32 6 16 7 8 8 4 Recursive Model Solving with Until Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (10000) UNTIL (PREVIOUS(s[1]) – s[1] <= 1) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 256 previous(s[1]) - s[1] = 512 4 128 5 64 6 32 7 16 8 8 9 4 10 2 Recursive Model Solving with Until Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (10000) UNTIL (PREVIOUS(s[1]) – s[1] <= 1) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 256 previous(s[1]) - s[1] = 256 4 128 5 64 6 32 7 16 8 8 9 4 10 2 Recursive Model Solving with Until Model can contain cyclic (recursive) formulas. They are automatically detected, and error is reported. Unless cycles are intentional which is indicated with ITERATE option Use ITERATE clause to specify # of iterations or Use UNTIL to specify convergence conditions. Stop if true. SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (10000) UNTIL (PREVIOUS(s[1]) – s[1] <= 1) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 256 4 128 5 64 6 32 7 16 8 8 9 4 10 2 previous(s[1]) - s[1] = 4 SQL Model Business Examples Time Series Calculation (1) Compute the ratio of current month sales of each product to sales one year ago, one quarter ago and one month ago. Assume: Sales cube with product sales per year, quarter, and month & a time table mapping periods to prior year, quarter and month time table: maps t to y_ago, q_ago, m_ago Sales cube: prod sales per y, q, m t y_ago q_ago m_ago t product sales 1999m01 1998m01 1998m10 1998m12 1999m01 vcr 100.00 1999m02 1998m02 1998m11 1999m01 1999m02 vcr 120.00 … … … … … … … 1999q01 1998q01 1998q04 NULL 1999q01 vcr 360.00 … … … … … … … Time Series Calculation (2) • Reference model with Time table acts like look-up table • CV carries values from the left side to the right side • Without Model, you need 3 outer joins and a regular join SELECT product, sales, r_y_ago, r_q_ago, r_m_ago FROM sales_cube MODEL REFERENCE r ON (SELECT * from time) DIMENSION BY (t) MEASURES (y_ago, q_ago, m_ago) MAIN PARTITION BY (product) DIMENSION BY (t) MEASURES (sales, 0 r_y_ago, 0 r_q_ago, 0 r_m_ago) RULES ( r_y_ago[ANY] = s[CV(t)] / s[ y_ago[CV(t)] ], -- year ago r_q_ago[ANY] = s[CV(t)] / s[ q_ago[CV(t)] ], -- quarter ago r_m_ago[ANY] = s[CV(t)] / s[ m_ago[CV(t)] ] -- month ago ); Time Series Calculation (3) Compute the ratio of current period sales of each product to sales a year ago, quarter ago and a month ago. For each row, we use the reference Model to find 3 other rows. Sales cube: prod sales per y, q, m t product sales r_y_ago r_q_ago a_m_ago 1999m01 vcr 100.00 0.050 0.280 0.830 1999m02 vcr 120.00 … … … 1999q01 vcr 360.00 1998q04 vcr 370.00 … … … Time Series Calculation (3) Compute the ratio of current period sales of each product to sales a year ago, quarter ago and a month ago. For each row, we use the reference Model to find 3 other rows. Sales cube: prod sales per y, q, m t product sales r_y_ago r_q_ago a_m_ago 1999m01 vcr 100.00 0.050 0.280 0.830 1999m02 vcr 120.00 0.055 0.330 … … … … 1999q01 vcr 360.00 0.160 0.970 null 1998q04 vcr 370.00 … … … Time Series Calculation (3) Compute the ratio of current period sales of each product to sales a year ago, quarter ago and a month ago. For each row, we use the reference Model to find 3 other rows. Sales cube: prod sales per y, q, m t product sales r_y_ago r_q_ago a_m_ago 1999m01 vcr 100.00 0.050 0.280 0.830 1999m02 vcr 120.00 0.055 0.330 ... … … … 1999q01 vcr 360.00 0.160 0.970 null 1998q04 vcr 370.00 … … null … … … Recursive Model Solving: Ledger (1) In my ledger, I have accounts: Net income, Interest, Taxes, etc. •I want to have 30 % of my Net income as Interest (F1) •My Net income is Salary minus Interest, minus Tax (F2) •Taxes are 38% of Gross (salary–interest) and 28% of Capital_gain (F3) SELECT account, b FROM ledger MODEL IGNORE NAV DIMENSION (account) MEASURES (balance b) RULES ITERATE (..) UNTIL .. ( b[‘interest’] = b[‘net’] * 0.30, --F1 b[‘net’] = b[‘salary’] – b[‘interest’] – b[‘tax’], --F2 b[‘tax’] = (b[‘salary’] – b[‘interest’]) * 0.38 + b[‘capital_gain’] *0.28 --F3 ) Recursive Model Solving: Ledger (1) In my ledger, I have accounts: Net income, Interest, Taxes, etc. •I want to have 30 % of my Net income as Interest (F1) •My Net income is Salary minus Interest, minus Tax (F2) •Taxes are 38% of Gross (salary–interest) and 28% of Capital_gain (F3) SELECT account, b FROM ledger MODEL IGNORE NAV DIMENSION (account) MEASURES (balance b) RULES ITERATE (..) UNTIL .. ( b[‘interest’] = b[‘net’] * 0.30, --F1 b[‘net’] = b[‘salary’] – b[‘interest’] – b[‘tax’], --F2 b[‘tax’] = (b[‘salary’] – b[‘interest’]) * 0.38 + b[‘capital_gain’] *0.28 --F3 ) net F1 interest interest F2 tax F3 two cycles in the formulas Recursive Model Solving: Ledger (2) In my ledger, I know Salary & Capital_gains. What are my Net income, Interest expense & Taxes? SELECT account, b FROM ledger MODEL IGNORE NAV DIMENSION BY (account) MEASURES (balance b) RULES ITERATE (10000) UNTIL (ABS(b[‘net’] - PREVIOUS(b[‘net’])) < 0.01) ( b[‘interest’] = b[‘net’] * 0.30, b[‘net’] = b[‘salary’] – b[‘interest’] – b[‘tax’], b[‘tax’] = (b[‘salary’] – b[‘interest’]) * 0.38 + b[‘capital_gain’] *0.28 ) Input Ledger Account salary capital_gains net tax interest Balance 100,000 15,000 0 0 0 Output Iterate till accuracy of .01 after 1st iteration Account salary capital_gains net tax interest Balance 100,000 15,000 100,000 42,220 30,000 Recursive Model Solving: Ledger (2) In my ledger, I know Salary & Capital_gains. What is my Net & Taxes? SELECT account, b FROM ledger MODEL IGNORE NAV DIMENSION BY (account) MEASURES (balance b) RULES ITERATE (10000) UNTIL (ABS(b[‘net’] - PREVIOUS(b[‘net’])) < 0.01) ( b[‘interest’] = b[‘net’] * 0.30, b[‘net’] = b[‘salary’] – b[‘interest’] – b[‘tax’], b[‘tax’] = (b[‘salary’] – b[‘interest’]) * 0.38 + b[‘capital_gain’] *0.28 ) Input Ledger Account salary capital_gains net tax interest Balance 100,000 15,000 0 0 0 Output Iterate till accuracy of .01 after 2nd iteration Account salary capital_gains net tax interest Balance 100,000 15,000 27,800 30,800 8,340 Recursive Model Solving: Ledger (2) In my ledger, I know Salary & Capital_gains. What is my Net & Taxes? SELECT account, b FROM ledger MODEL IGNORE NAV DIMENSION BY (account) MEASURES (balance b) RULES ITERATE (10000) UNTIL (ABS(b[‘net’] - PREVIOUS(b[‘net’])) < 0.01) ( b[‘interest’] = b[‘net’] * 0.30, b[‘net’] = b[‘salary’] – b[‘interest’] – b[‘tax’], b[‘tax’] = (b[‘salary’] – b[‘interest’]) * 0.38 + b[‘capital_gain’] *0.28 ) Input Ledger Account salary capital_gains net tax interest Balance 100,000 15,000 0 0 0 Output Iterate till accuracy of .01 after reaching accuracy (26 iterations) Account salary capital_gains net tax interest Balance 100,000 15,000 48,735 36,644 14,620 Financial Functions: NPV NPV – net present value of a series of periodic cash flows. Cash_Flow table year i prod amoun t 1999 0 vcr 100.00 2000 1 vcr 12.00 2001 2 vcr 10.00 2002 3 vcr 20.00 1999 0 dvd 200.00 2000 1 dvd 22.00 2001 2 dvd 12.00 npv values i (1 rate) i amount i (1 rate) i Financial Functions: NPV NPV – net present value of a series of periodic cash flows. values i amount i (1 rate) i (1 rate) i Cash_Flow table year i prod amoun t 1999 0 vcr 100.00 2000 1 vcr 12.00 2001 2 vcr 10.00 2002 3 vcr 20.00 1999 0 dvd 200.00 2000 1 dvd 22.00 2001 2 dvd 12.00 npv amount[1]/power(1+rate,1) + npv[1-1] Financial Functions: NPV NPV – net present value of a series of periodic cash flows. values i amount i (1 rate) i (1 rate) i Cash_Flow table year i prod amoun t 1999 0 vcr 100.00 2000 1 vcr 12.00 2001 2 vcr 10.00 2002 3 vcr 20.00 1999 0 dvd 200.00 2000 1 dvd 22.00 2001 2 dvd 12.00 2002 3 dvd 14.00 npv amount[2]/power(1+rate,2) + npv[2-1] Financial Functions: NPV NPV – net present value of a series of periodic cash flows. values i amount i (1 rate) i (1 rate) i Cash_Flow table year i prod amoun t 1999 0 vcr 100.00 2000 1 vcr 12.00 2001 2 vcr 10.00 2002 3 vcr 20.00 1999 0 dvd 200.00 2000 1 dvd 22.00 2001 2 dvd 12.00 npv amount[3]/power(1+rate,3) + npv[3-1] Financial Functions: NPV NPV – net present value of a series of periodic cash flows. values i amount i (1 rate) i (1 rate) i Cash_Flow table year i prod amoun t 1999 0 vcr 100.00 2000 1 vcr 12.00 2001 2 vcr 10.00 2002 3 vcr 20.00 1999 0 dvd 200.00 2000 1 dvd 22.00 2001 2 dvd 12.00 2002 3 dvd 14.00 npv amount[i]/power(1+rate, i) + npv[i-1] npv[ANY] ORDER BY i = amount[ CV(i) ] / power(1+rate, CV(i)) + npv[CV(i) – 1] Financial Functions: NPV (2) NPV – Net present value of a series of periodic cash flows. amount i (1 rate) i Cash_Flow table and npv for rate = 0.14 year i prod amoun t npv 1999 0 vcr 100.00 100.00 2000 1 vcr 12.00 -89.47 2001 2 vcr 10.00 -81.78 2002 3 vcr 20.00 -68.28 1999 0 dvd 200.00 200.00 2000 1 dvd 22.00 180.70 2001 2 dvd 12.00 171.47 SELECT year, i, prod, amount, npv FROM cash_flow MODEL PARTITION BY (prod) DIMENSION BY (i) MEASURES (amount, 0 npv, year) RULES ( npv[ 0] = amount[0], npv[i !=0] ORDER BY i = amount[ CV() ] / POWER(1.14, CV() ) + npv[CV() - 1] ) SQL Model Performance SQL Model – Time Series Earlier example: ratio of sales to year, quarter and month ago SELECT product, sales, r_y_ago, r_q_ago, r_m_ago FROM sales_cube MODEL REFERENCE r ON (SELECT * from time) DIMENSION BY (t) MEASURES (y_ago, q_ago, m_ago) MAIN PARTITION BY (product) DIMENSION BY (t) MEASURES (sales, 0 r_y_ago, 0 r_q_ago, 0 r_m_ago) RULES ( r_y_ago[ANY] = s[CV(t)] / s[ y_ago[CV(t)] ], -- year ago r_q_ago[ANY] = s[CV(t)] / s[ q_ago[CV(t)] ], -- quarter ago r_m_ago[ANY] = s[CV(t)] / s[ m_ago[CV(t)] ] -- month ago ); • ANSI SQL version needs outer join for each formula plus a join for reference model. • N formulas, M reference models N+M joins 4 joins in this example: sales_cube time sales_cube sales_cube sales_cube SQL Model vs. ANSI Joins 400 ANSI joins 350 300 250 200 SQL Model 150 100 50 Number of rules or joins 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Summary New facility for spreadsheet-like computations in SQL High Performance – – – Replaces multiple joins, unions Scalable in size and parallel processing Powerful optimizations Collaborative analysis Move external processing such as spreadsheets into RDBMs for manageability and consolidation Next Steps…. Demonstration at Oracle DEMOgrounds – – Exhibit hall, Booth 1326, Database Area Monday: 5:00 PM - 8:00, Tuesday: 10:30 - 1:00, 3:00 6:00, Wednesday: 11:00 - 4:30, Thursday: 10:30 - 2:00 Hands-on Lab – – – Marriott Hotel - Golden Gate B1 Lab Section: Use Information from your Data Warehouse Lesson 1: Using the SQL Model clause Monday: 10:30 - 5:00, Tuesday: 8:30 - 12:30, 3:00 5:00, Wednesday: 8:30 - 4:30, Thursday: 8:30 - 2:30 Reminder – please complete the OracleWorld online session survey Thank you. Q U E S T I O N S A N S W E R S