Ulrike Fischer
Processing and Optimization of Forecast Queries
© Prof. Dr.-Ing. Wolfgang Lehner

> Motivation
Time series data appears in many domains
  Sales and inventory
  Renewable energy resources
High accuracy possible
  Sophisticated models
  Sophisticated estimators
Runtime restrictions
  Large number of time series
  Short amount of time available
Two optimization dimensions: accuracy and runtime

© Ulrike Fischer | Processing and Optimization of Forecast Queries | 2

> Outline
Motivation
Integration of Forecasting inside a DBMS
Processing of Forecast Queries
Optimization of Forecast Queries in Hierarchies
Summary

> Model-based Time Series Forecasting
Forecasting Model
  Triple Exponential Smoothing
1. Model Creation !
  Model Identification
  Parameter Estimation
2. Model Usage
3. Model Maintenance
  Model Evaluation: threshold-based, time-based, …
  Model Adaptation !
  Parameter Re-estimation

> Time Series Forecasting in DBMS
Transparency and Efficiency
[Figure: diagram with models (M) and the labels "export" and "SQL"]
Reuse of models and results

> Project Overview
EU FP7 project
Forecast query:
  SELECT date, quantity FROM sales WHERE … FORECAST …
Result:
  date   quantity
  2012   34,000
  2013   38,000
  …      …
[Figure: Scheduling, Forecasting, Aggregation, FlexOffers, DWH, Supply, Demand]

> Overview F2DB
[Figure: F2DB architecture. Components shown: Forecast Queries, Inserts, Query Interface, Model Usage, Model Maintenance, Query Processing & Optimization, On-Demand Estimation, QP in Hierarchies, Hybrid Maintenance, Publish Subscribe Queries, Model Index, Model Pool, Model Creation, Ensemble Models, Physical Design, Time Series, Base Tables]

> Outline
Motivation
Integration of Forecasting inside a DBMS
Processing of Forecast Queries
Optimization of Forecast Queries in Hierarchies
Summary

> Forecast Query Processing
Extension of SQL language
  Horizon, measure and time column, model type and parameters, …

  SELECT   date, SUM(quantity)
  FROM     sales
  WHERE    product = 'HTC'
  GROUP BY date
  FORECAST 3

Forecast operator Ψ
Logical query plan
  [Figure: operators Ψ_{k=3}, π_{date, quantity}, γ_{date: AGG(sales)}, σ_{product = 'HTC'} over table sales]
Physical query plan
  [Figure: operators BuildModel, Forecast, Aggregate, Scan sales; model M_HTC]

> Advanced Forecast Query Processing
Data warehouse contains multidimensional data
[Figure: hierarchy with nodes Mobiles, Nokia, HTC, HD2, Smart; models M_Mobiles, M_HD2, M_Smart; operators Forecast, Aggregation, DisAgg; Key]

  SELECT   date, SUM(quantity)
  FROM     sales
  WHERE    product = 'HTC'
  GROUP BY date
  FORECAST 3 days

1. Direct
2. Aggregation
3. Disaggregation

> Aggregation vs. Disaggregation
Top-Down (Disaggregation)
Bottom-Up (Aggregation)
Complete (Direct)
Efficiency
Accuracy
Model creation easier
  Edwards and Orcutt (1969), Schwarzkopf et al. (1988), Hubrich (2005), …
No information loss
  Grunfeld and Griliches (1960), Gross and Sohl (1990), Zellner and Tobias (2000), …
Depends on data set, quality of forecast model, correlation, …

> Outline
Motivation
Integration of Forecasting inside a DBMS
Processing of Forecast Queries
Optimization of Forecast Queries in Hierarchies
Summary

> Configuration Advisor
[Figure: Updates, Forecast Queries, Workload W, Preference α, Query Interface, Model Advisor, DWH, Model Pool]
Model Advisor
  Analyze: Cost B_W + Error E_W
  Create: Configuration C_W
  Configuration + Strategy
Problem: exponential search space
Greedy algorithm (monotonic maintenance costs)
  Start with one model at the top, add models step by step

> Performance Comparison
Complete (C): all models, only direct forecasts
Bottom-Up (B): only models at level one, others use aggregation
Top-Down (T): only one model for the top element, others use disaggregation
Greedy (G)

> Extensions
Observation: aggregation (bottom-up) hardly used in real data sets
Reason: large number of child time series
Group Design
  Relax fixed aggregation groups
  Virtual Group
Sample Aggregation
  Use sample of child models
  Aggregation + estimation
  Estimate using historical proportion
  Weighted sampling
Support of disjunctive queries

> Outline
Motivation
Integration of Forecasting inside a DBMS
Processing of Forecast Queries
Optimization of Forecast Queries in Hierarchies
Summary

> Summary
DBMS Integration
  Sophisticated models are computationally expensive
  DBMS integration for reuse, transparency and optimization
Forecast Queries
  New query type with forecast horizon
  Face two optimization dimensions
Hierarchical Forecasting
  Reduce maintenance costs with derivation schemes
  Possible increase of accuracy
  Large search space
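
The three derivation schemes (direct, aggregation, disaggregation by historical proportion) and the greedy idea of starting with one model at the top and adding models step by step could be sketched roughly as follows. This is an illustration under stated assumptions, not the F2DB implementation: the numbers are made up, `drift_forecast` is a stand-in for a real model such as triple exponential smoothing, the objective `alpha * error + (1 - alpha) * cost` is one possible reading of the preference α, and the model count serves as a hypothetical proxy for maintenance cost.

```python
# Toy hierarchy Mobiles -> {Nokia, HTC}; all numbers are made up for illustration.
history = {
    "Nokia": [10.0, 12.0, 14.0, 16.0, 18.0],
    "HTC": [30.0, 28.0, 26.0, 24.0, 22.0],
}
history["Mobiles"] = [n + h for n, h in zip(history["Nokia"], history["HTC"])]
children = {"Mobiles": ["Nokia", "HTC"], "Nokia": [], "HTC": []}
parent = {"Nokia": "Mobiles", "HTC": "Mobiles"}

# Hold out the last observation to measure the forecast error.
train = {node: series[:-1] for node, series in history.items()}
actual = {node: series[-1] for node, series in history.items()}


def drift_forecast(series):
    """Stand-in model (assumption): last value plus the average historical step."""
    step = (series[-1] - series[0]) / (len(series) - 1)
    return series[-1] + step


def derive(node, models):
    """One-step forecast for `node`, given the set of nodes that own a model."""
    if node in models:  # 1. direct: the node has its own model
        return drift_forecast(train[node])
    kids = children[node]
    if kids and all(k in models for k in kids):  # 2. aggregation (bottom-up)
        return sum(drift_forecast(train[k]) for k in kids)
    top = parent.get(node)  # 3. disaggregation (top-down)
    if top is None:
        return None
    top_forecast = derive(top, models)
    if top_forecast is None:
        return None
    share = sum(train[node]) / sum(train[top])  # historical proportion
    return top_forecast * share


def objective(models, alpha):
    """Assumed objective: alpha weights error, (1 - alpha) weights model count."""
    errors = []
    for node in history:
        forecast = derive(node, models)
        if forecast is None:
            return float("inf")  # some node cannot be answered at all
        errors.append(abs(forecast - actual[node]))
    error = sum(errors) / len(errors)
    cost = len(models)  # grows monotonically with every added model
    return alpha * error + (1 - alpha) * cost


def greedy(alpha):
    """Start with one model at the top, add the best model while it helps."""
    models = {"Mobiles"}
    best = objective(models, alpha)
    while True:
        scored = [(objective(models | {c}, alpha), c)
                  for c in history if c not in models]
        if not scored:
            break
        score, pick = min(scored)
        if score >= best:
            break
        models, best = models | {pick}, score
    return models, best


if __name__ == "__main__":
    for alpha in (0.1, 0.5):
        print(alpha, greedy(alpha))
```

With these toy numbers, a cost-focused preference (α = 0.1) keeps only the top model and answers the other nodes by disaggregation, while α = 0.5 ends with one model per node. Each greedy step evaluates only the remaining candidate nodes, which sidesteps the exponential number of possible configurations at the price of possibly missing the optimum.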