SQL/OLAP

Sang-Won Lee
Let's e-Wha!
Email: swlee@ewha.ac.kr
URL: http://home.ewha.ac.kr/~swlee
Jul. 12th, 2001
SQL/OLAP ISO/IEC JTC1 SC32


Contents

Introduction to OLAP and SQL Issues
Current OLAP Solutions
SQL/OLAP
Future OLAP Trends


OLAP

On-Line Analytical Processing
  - E.F. Codd coined the term "OLAP" ([1])
  - Multi-dimensional data model
  - vs. On-Line Transaction Processing
  - vs. Data warehouse


Multi-dimensional Data Model

Sales(prod_id, store_id, time_id, qty, amt)
Dimension: Product, Store, Time
Hierarchy:
  - Product -> Category -> Industry
  - Store -> City -> State -> Country
  - Date -> Month -> Quarter -> Year


Multi-dimensional Data Model (2)

Operations
  - roll-up/drill-down
  - slice/dice
  - pivot
  - ranking
  - comparisons
  - drill-across
  - etc.
Examples
  - For each state, show me the top 10 products based on total sales
  - What is the percentage growth of Jan-99 total sales over Jan-98 total sales?
  - For each product, show me the quantity shipped and sold


OLAP Operations

Many business operations were hard or impossible to express in SQL
  - multiple aggregations
  - comparisons (with aggregation)
  - reporting features
Be prepared for a serious performance penalty
Client and middleware tools provide the necessary functionality
  - OLAP server: ROLAP vs. MOLAP


Multiple Aggregations

Create a 2-dimensional spreadsheet that shows the sum of sales by color as well as by make of car
Each subtotal requires a separate aggregate query

Cross Tab
             Chevy    Ford    By Color
  RED
  WHITE
  BLUE
  By Make                     Sum

    -- Each branch pads the column it does not group by with NULL,
    -- so that all branches of the UNION have the same three columns.
    SELECT color, make, SUM(amt) FROM sales GROUP BY color, make
    UNION
    SELECT color, NULL, SUM(amt) FROM sales GROUP BY color
    UNION
    SELECT NULL, make, SUM(amt) FROM sales GROUP BY make
    UNION
    SELECT NULL, NULL, SUM(amt) FROM sales;


Comparisons

Examples:
  - last year's sales vs. this year's sales for each product
Requires a self-join

VIEW:
    CREATE OR REPLACE VIEW v_sales AS
    SELECT prod_id, year, SUM(qty) AS sale_sum
    FROM sales
    GROUP BY prod_id, year;

QUERY:
    -- Pair each product's row for a year with its row for the previous year.
    SELECT curr.prod_id,
           curr.year     AS cur_year,
           curr.sale_sum AS cur_sales,
           last.sale_sum AS last_sales
    FROM v_sales curr, v_sales last
    WHERE curr.prod_id = last.prod_id
      AND curr.year = last.year + 1;


The Data CUBE Relational Operator Generalizes Group By and Aggregates

[Figure: four panels. Aggregate (Sum). Group By (with total): By Color (RED, WHITE, BLUE), Sum. Cross Tab: Chevy, Ford; By Color (RED, WHITE, BLUE); By Make; Sum. The Data Cube and The Sub-Space Aggregates: By Year, By Make, By Make & Year, By Color & Year, By Make & Color, By Color, Sum. Source: [6]]


Getting Sub-totals: ROLLUP Operation

    SELECT year, brand, SUM(qty)
    FROM sales
    GROUP BY ROLLUP (year, brand);

YEAR    BRAND     SUM(qty)
1996    Ford      250
1996    Honda     300
1996    Toyota    450
1996              1000
1997    Ford      300
...
1997              1200
                  2200


Getting Cross-tabs: CUBE Operation

    SELECT year, brand, SUM(amount)
    FROM sales
    GROUP BY CUBE (year, brand);

YEAR    BRAND     SUM(AMOUNT)
1996    Ford      250
...
1996    Toyota    450
1997    Ford      300
...
1997              1200
        Ford      550
        Honda     650
        Toyota    1000
                  2200


Flexible Grouping: GROUPING SETS Operator

    SELECT year, brand, color, SUM(qty)
    FROM sales
    GROUP BY GROUPING SETS ((year, brand), (brand, color), ());

YEAR    BRAND     COLOR    SUM(QTY)
1996    Ford               250         Year, Brand
1996    Honda              300
1996    Toyota             450
1997    Ford               300
1997    Honda              350
1997    Toyota             550
        Ford      Blue     400         Brand, Color
        Ford      Red      150
        Honda     Blue     650
        Toyota    Red      700
        Toyota    White    300
                           2200        Grand total


LAG Operator

    -- A column alias cannot be referenced in the same select list,
    -- so the LAG expression is repeated to compute the change.
    SELECT timekey, sales,
           LAG(sales, 12) OVER (ORDER BY timekey) AS sales_last_year,
           sales - LAG(sales, 12) OVER (ORDER BY timekey) AS sales_change
    FROM sales;

TIMEKEY    SALES    SALES_LAST_YEAR    SALES_CHANGE
98-1       1100
...        ...      ...                ...
99-1       1200     1100               100
99-2       1500     1450               50
99-3       1700     1350               250
99-4       1600     1700               -100
99-5       1800     1600               200
99-6       1500     1450               50
99-7       1300     1250               50
99-8       1400     1200               200


MOVING Average

    SELECT time_id,
           AVG(SUM(qty)) OVER (ORDER BY time_id
                               RANGE INTERVAL '2' DAY PRECEDING) AS mvg_avg_sales
    FROM sales
    GROUP BY time_id;
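

Worked Example (sketch)

The LAG and moving-average queries above can be tried on a small in-memory database. The following is a minimal sketch, not part of the original slides. It assumes Python's standard sqlite3 module with SQLite 3.25 or later (the first release with window functions). The table contents are invented toy data, and timekey is a hypothetical integer month index rather than the 98-1 style keys. SQLite has no INTERVAL type, so the moving average uses a ROWS frame over the two preceding rows in place of RANGE INTERVAL '2' DAY PRECEDING. The self-join query is included to show that it returns the same year-over-year pairs as LAG.

```python
import sqlite3

# Toy data (an assumption for illustration): 24 months of steadily growing sales.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (timekey INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(m, 1000 + 10 * m) for m in range(1, 25)],
)

# Year-over-year comparison with LAG: look back 12 rows in timekey order.
lag_rows = conn.execute(
    """
    SELECT timekey, sales,
           LAG(sales, 12) OVER (ORDER BY timekey) AS sales_last_year,
           sales - LAG(sales, 12) OVER (ORDER BY timekey) AS sales_change
    FROM sales
    ORDER BY timekey
    """
).fetchall()

# The same comparison written as a self-join, as on the Comparisons slide.
join_rows = conn.execute(
    """
    SELECT curr.timekey, curr.sales, prev.sales, curr.sales - prev.sales
    FROM sales curr JOIN sales prev ON curr.timekey = prev.timekey + 12
    ORDER BY curr.timekey
    """
).fetchall()

# Moving average over the current row and the two preceding rows.
mvg_rows = conn.execute(
    """
    SELECT timekey,
           AVG(sales) OVER (ORDER BY timekey
                            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS mvg_avg_sales
    FROM sales
    ORDER BY timekey
    """
).fetchall()

# LAG yields NULL (None) for the first 12 months; the self-join simply omits them.
lag_with_history = [r for r in lag_rows if r[2] is not None]
print(lag_with_history == join_rows)
print(mvg_rows[:3])
```

The LAG version scans the table once and keeps the months without a prior year as rows with NULL values, whereas the self-join drops them, which is one reason the window form is convenient for reporting.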