Forecasting
V Bandi and R Lahdelma

Forecasting?
• Decision-making deals with future problems
  - Thus data describing the future is needed
• A forecast is a representation of what will occur in the future
[Figure: strategic, tactical and operative decisions placed along a time axis: hours, days, weeks, months, year, 5 years, 20-50 years]

Time horizons of forecasts
• Depending on the purpose, the time horizon may differ
  - Operational planning
    • Day to week level
  - Tactical planning
    • Week to month to year
  - Strategic planning
    • Year to 10 years to 50 years

Requirements of a forecasting model
• Sufficient accuracy
  - Depends on the purpose of the forecast
    • Operative decisions require a high degree of accuracy
• Availability of the necessary input data
  - Having access to real data is always a challenge
• The model must be easy to update and maintain
  - when the system changes
  - Not overly complex and specialized

Different approaches to forecasting
• Theory-oriented
  - The laws of physics determine how the system behaves; therefore the model is formed based on theoretical laws
    • Example: Heat is transferred through radiation, conduction and convection...
• Data-oriented
  - History data is analyzed in order to find dependencies
    • Requires applied mathematical techniques

Different approaches to forecasting
• In practice it is wise to use both (theoretical and data-oriented) approaches together
  - The forecast model structure is planned based on theory, but the parameters are estimated from history data
  - Sometimes observing the data can reveal dependencies that are otherwise missed in theoretical analyses
  - Understanding the laws of physics allows making the model more generic and accurate

Let us try some simple forecast models

Forecasting demand for cars
• The demand for Toyota cars over the first six months in the Helsinki region is summarized in the following table.
Forecast the demand for cars over the next 6 months.

Month   Number of units
Jan     46
Feb     56
Mar     43
Apr     43
May     60
Jun     72

Forecast demand for cars
• Simple modeling techniques
  - Based on averages and weighted averages
• In the example, the dependency between month and number of units sold is hard to identify
  - It is very difficult to forecast accurately

Forecasting applications in engineering
• Planning and optimization
  - Example: coordination of cogeneration
• Simulation
  - Planning new systems
  - Improving existing systems
  - Understanding the behavior of systems

Forecasting methods
• Based on averages
  - Moving averages
• Smoothing techniques
• Regression
  - Linear regression
    • In simplest form: $Y = aX + b$
    • Y dependent variable, X independent variable
  - Non-linear regression
  - Dynamic regression
• Neural networks and many more

Regression analysis
• A regression analysis is for forecasting one variable from another
  - We must decide which variable will be the independent variable X and which the dependent variable Y
  - This choice is usually motivated by a theory or hypothesis of causality
    • The alleged “cause” is X and the alleged “effect” is Y

What regression does
• A linear regression analysis produces a straight line that estimates the average value of Y at any specific value of X
  - Example: Heat demand forecast over a year based on outdoor temperature
    $y_t = a_0 + a_1 x_t$
    • $a_0 = 261$ MW
    • $a_1 = -11.3$ MW/°C
  - The line fits badly at high temperatures
    • therefore it is misaligned also for cold temperatures

Forecast regression model
• The model aims to explain the behavior of the unknown quantity y in terms of known quantities x, parameters a and random noise e
  - $y = f(x, a) + e$
• The structure of the model (shape of function f()) can be determined based on theory, based on intuition or by exploring history data
  - The parameters a are estimated from history data so that the noise e is minimized
• When the model has a good structure, e is white noise
• Forecasting models can be classified according to the shape of function f

Linear regression model based on one dependent and one independent variable
• A model where a single dependent variable y is explained by a single independent variable x is fitted to history data
  - $y_t = a_0 + a_1 x_t$, where $t = 1, \dots, T$
• This is a linear equation system with T equations and two unknowns ($a_0$ and $a_1$)
  - The system can be solved in the least squares sense (2-norm)
  - To solve it we augment it with an error variable $e_t$

Linear regression model, determining parameters
• We seek parameter values a that minimize the sum of squares of the error variables
  $\min \; e_1^2 + e_2^2 + \dots + e_T^2$
  $\text{s.t.} \; e_t = a_0 + a_1 x_t - y_t, \quad t = 1, \dots, T$
• If we introduce the vector/matrix notations
  $a = \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ \vdots \\ e_T \end{bmatrix}, \quad y = \begin{bmatrix} y_1 \\ \vdots \\ y_T \end{bmatrix}, \quad X = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_T \end{bmatrix}$

Linear regression model, determining parameters
• The problem in vector/matrix format
  $\min \; e^T e$
  $\text{s.t.} \; e = Xa - y$
• Substituting e into the objective function yields an unconstrained optimization problem
  $\min \; (Xa - y)^T (Xa - y) = a^T X^T X a - 2 a^T X^T y + y^T y$
• Setting the derivative with respect to a to zero gives the solution
  $2 X^T X a - 2 X^T y = 0$
  $a = (X^T X)^{-1} X^T y$

Generalizations of linear regression: multiple independent (explaining) variables
• Linear regression model with multiple explaining variables $x_i$
  $y_t = a_0 + a_1 x_{1,t} + a_2 x_{2,t} + \dots + a_n x_{n,t}$, where $t = 1, \dots, T$
• Now there are more unknown parameters a and the X-matrix becomes wider
  $a = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ \vdots \\ e_T \end{bmatrix}, \quad y = \begin{bmatrix} y_1 \\ \vdots \\ y_T \end{bmatrix}, \quad X = \begin{bmatrix} 1 & x_{1,1} & \cdots & x_{n,1} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1,T} & \cdots & x_{n,T} \end{bmatrix}$
• The matrix formulas and solution remain the same
  $\min \; e^T e$
  $\text{s.t.} \; e = Xa - y$
  $a = (X^T X)^{-1} X^T y$

Heat demand forecast
• Heat demand depends on
  - Weather
    • Outside temperature, wind, solar radiation, seasons
  - Building properties
  - Residents' behavior
• Forecasting requires identification of the independent variables

Heat demand forecast
• Accurate heat demand forecast
  - Weather, resident behavior and building properties can be considered as independent variables
  - Forecast modelling with all independent variables requires data
    • Obtaining data is challenging
• According to previous studies, outside temperature has the most influence on heat demand

Heat demand forecast using regression based on outside temperature
• Dependent variable
  - Heat consumption (historical data)
• Independent variable
  - Outside temperature (historical data)
• Forecasting model
  $y_t = a_0 + a_1 x_t$
• The line fits badly at high temperatures, therefore it is misaligned also for cold temperatures

Standard deviation (SD) or RMSE (Root-Mean-Squared-Error)
• The square root of the mean of the squared errors
  - The use of SD or RMSE is very common and it makes an excellent general-purpose error criterion for forecasts
• $\mathrm{stdev}(e) = \sqrt{e^T e / T}$

Forecast based on outdoor temperature
[Figure: forecast vs actual for a sample week]
• The forecast is good on average, but not quite satisfactory
• RMSE (Root-Mean-Squared-Error) or standard deviation for the annual forecast is 20%

Forecast based on outdoor temperature
[Figure: forecast vs actual for a sample week]
• RMSE = 20% (of average demand)
  - Not a good forecast
• Reason for low accuracy
  - Outside temperature alone cannot explain heat consumption completely.
    This can be explained by the correlation coefficient between outside temperature and heat consumption

Correlation coefficient
• The correlation coefficient r is a number between -1 and 1 that indicates the strength of the linear relationship between two variables
  - r ≈ 1: Very strong positive linear relationship between X and Y.
  - r ≈ 0: No linear relationship between X and Y. Y does not tend to increase or decrease as X increases.
  - r ≈ -1: Very strong negative linear relationship between X and Y. Y decreases as X increases.
• The sign of r (+ or -) indicates the direction of the relationship between X and Y. The magnitude of r (how far away from zero it is) indicates the strength of the relationship.

Correlation coefficient
[Figure: illustration of the correlation coefficient]

Correlation between outside temperature and heat consumption for a single building
• Correlation coefficient for a building: r = -0.956
  - Strong negative relationship
  - The model would have been more accurate if r were -1

Residents' behavior in a building
• People's behavior usually has a rhythm (a strong, regular, repeated pattern)
• Let us hypothesize that residents' behavior follows a similar rhythm on weekdays (Monday to Friday) and on weekends (Saturday and Sunday)
• Let us modify the forecast model using these weekly rhythms

Modified forecast model
• Original forecast model
  $y_t = a_0 + a_1 x_t$
• The y-intercept $a_0$ also has a negative effect on accuracy: being a constant, it influences the forecast in the same way at every hour
• Modified forecast model
  $y_t = a_{h(t)} + a_1 x_t$
  where $a_{h(t)}$ is a social component based on the weekly rhythm
  $a_{h(t)} = a_0 - \mathrm{average}(e_t \mid h = h(t)), \quad h = 1, 2, \dots, 168$
  (168 = 7 × 24 is the number of hours in a week)

Forecast using weekly rhythm
• RMSE (Root-Mean-Squared-Error) or standard deviation for the annual forecast is 13%

Improving accuracy of the model
• The weekly rhythm model does not consider that some weeks and days are different
  - E.g. during holiday seasons, religious holidays etc. the demand is different from that of a normal weekday
• The days can be classified, e.g. working day, Saturday, holiday
• It is possible to include more independent variables
  - Solar radiation, wind speed and direction, cloudiness, ...
    • In general these affect the precision only a little
• History data from multiple years
  - Weighted regression: recent history can be given more weight
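
The steps described in the slides (least-squares fit of y_t = a_0 + a_1 x_t, RMSE, correlation coefficient, and the weekly-rhythm correction a_h(t)) can be sketched in Python roughly as follows. This is only an illustrative sketch, not the authors' implementation. All data are synthetic: the temperature range, the shape of the weekly rhythm and the noise level are made-up assumptions, and the values 261 and -11.3 are borrowed from the slides only to generate plausible-looking numbers. The names `fit_linear`, `rmse` and `HOURS_PER_WEEK` are hypothetical helpers introduced for this example.

```python
import numpy as np

HOURS_PER_WEEK = 168  # 7 days x 24 hours


def fit_linear(x, y):
    """Least-squares fit of y_t = a0 + a1*x_t via a = (X^T X)^-1 X^T y."""
    X = np.column_stack([np.ones(len(x)), x])
    a = np.linalg.solve(X.T @ X, X.T @ y)  # solves the normal equations
    return a, X


def rmse(e):
    """stdev(e) = sqrt(e^T e / T)."""
    return float(np.sqrt(e @ e / e.size))


# --- Synthetic data (assumption: these are NOT measurements from the slides) ---
rng = np.random.default_rng(0)
T = HOURS_PER_WEEK * 20                        # 20 weeks of hourly samples
hour_of_week = np.arange(T) % HOURS_PER_WEEK   # 0..167 (the slides index 1..168)
x = rng.uniform(-25.0, 25.0, size=T)           # outdoor temperature, deg C

# Made-up social component: a daily cycle plus a lower level on the weekend.
hours = np.arange(HOURS_PER_WEEK)
rhythm = 20.0 * np.sin(2 * np.pi * hours / 24) - 15.0 * (hours >= 120)
y = 261.0 - 11.3 * x + rhythm[hour_of_week] + rng.normal(0.0, 10.0, size=T)

# --- Step 1: temperature-only model ---
a, X = fit_linear(x, y)
e = X @ a - y                  # error with the slides' sign convention e = Xa - y
r = np.corrcoef(x, y)[0, 1]    # correlation between temperature and demand

# --- Step 2: weekly rhythm, a_h = a0 - average(e_t | h = h(t)) ---
counts = np.bincount(hour_of_week, minlength=HOURS_PER_WEEK)
mean_e = np.bincount(hour_of_week, weights=e, minlength=HOURS_PER_WEEK) / counts
a_h = a[0] - mean_e
e_weekly = (a_h[hour_of_week] + a[1] * x) - y

print("a0, a1:", a)
print("correlation r:", r)
print("RMSE, temperature only (% of mean demand):", 100 * rmse(e) / y.mean())
print("RMSE, weekly rhythm    (% of mean demand):", 100 * rmse(e_weekly) / y.mean())
```

Note that the weekly correction here is estimated and evaluated on the same synthetic history, so the printed improvement is an in-sample figure; a real study would check it on data not used for fitting.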