Tutorial for solution of Assignment week 40

“Forecasting monthly values of Consumer Price Index. Data set: Swedish Consumer Price Index” (sparetime)

“Construct a time series graph for the monthly values of Consumer Price Index (Konsumentprisindex (KPI) in Swedish) for spare time occupation, amusement and culture (fritid, nöje och kultur in Swedish) (in file ‘sparetime.txt’).”

[Figure: Time Series Plot of CPI(group); y-axis: CPI(group); x-axis: month and year, ticks from Jan 1980 to Jan 1998]

“Then estimate the autocorrelations and display them in a graph.”

[Figure: Autocorrelation Function for CPI(group), with 5% significance limits for the autocorrelations; x-axis: lag]

“Is there any obvious upward or downward trend?”

Yes, upward, but turning at the end of the series.

“Are there any signs of long-time oscillations in the time series?”

No!

“Are there any signs of seasonal variation in the series?”

Not visible!

“Do the autocorrelations die out quickly?”

No!

“Judge the need for differencing according to $u_t = y_t - y_{t-1}$ or $v_t = y_t - y_{t-12}$ to get a time series that is suitable for forecasting with ARMA-models. Construct new graphs for the series obtained by differencing and estimate the autocorrelations for these series.”

$u_t = y_t - y_{t-1}$

[Figure: Time Series Plot of u] Not convincingly stationary!
[Figure: Autocorrelation Function for u, with 5% significance limits for the autocorrelations] Diffuse pattern!

$v_t = y_t - y_{t-12}$

[Figure: Time Series Plot of v]
[Figure: Autocorrelation Function for v, with 5% significance limits for the autocorrelations] Definitely nonstationary!

“E.2. Fitting different ARMA-models
Try different combinations of ARMA-models and differencing to forecast the Consumer Price Index. Which model seems to give the best forecasts in this case?”

From E.1: it seems best to use first-order non-seasonal differences.

Chosen “design”:
AR(1), AR(2)
MA(1), MA(2)
ARMA(1,1), ARMA(2,1), ARMA(1,2), ARMA(2,2)

The order of differencing is fixed to 1 in all models!
The AR and MA orders are altered from model to model.

AR(1):
Type       Coef      SE Coef   T      P
AR 1       0.1170    0.0671    1.75   0.082
Constant   0.38522   0.04779   8.06   0.000
…
MS = 0.512   DF = 222
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag        12      24      36      48
…
P-Value    0.000   0.000   0.000   0.000
2 months forecasts:
Forecast   Lower     Upper
195.472    194.070   196.875
195.925    193.823   198.027

AR(2):
Type       Coef      SE Coef   T      P
AR 1       0.0770    0.0648    1.19   0.236
AR 2       0.3012    0.0655    4.60   0.000
Constant   0.27053   0.04576   5.91   0.000
…
MS = 0.469   DF = 221
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag        12      24      36      48
…
P-Value    0.007   0.003   0.001   0.003
2 months forecasts:
Forecast   Lower     Upper
194.962    193.620   196.305
195.720    193.747   197.693

MA(1):
Type       Coef      SE Coef   T       P
MA 1       -0.0741   0.0675    -1.10   0.273
Constant   0.43605   0.05146   8.47    0.000
…
MS = 0.514   DF = 222
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag        12      24      36      48
…
P-Value    0.000   0.000   0.000   0.000
2 months forecasts:
Forecast   Lower     Upper
195.430    194.024   196.835
195.866    193.803   197.929

MA(2):
Type       Coef      SE Coef   T       P
MA 1       -0.0592   0.0668    -0.89   0.376
MA 2       -0.2533   0.0670    -3.78   0.000
Constant   0.43664   0.06071   7.19    0.000
…
MS = 0.479   DF = 221
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag        12      24      36      48
…
P-Value    0.000   0.000   0.000   0.000
2 months forecasts:
Forecast   Lower     Upper
195.146    193.789   196.503
196.032    194.056   198.009

ARMA(1,1):
* WARNING * Back forecasts not dying out rapidly
Type       Coef         SE Coef      T         P
AR 1       1.0186       0.0238       42.85     0.000
MA 1       0.9769       0.0006       1560.10   0.000
Constant   -0.0117678   -0.0013602   8.65      0.000
…
MS = 0.458   DF = 221
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag        12      24      36      48
…
P-Value    0.000   0.000   0.000   0.000
2 months forecasts:
Forecast   Lower     Upper
194.516    193.190   195.843
194.114    192.199   196.029

ARMA(2,1):
Type       Coef      SE Coef   T      P
AR 1       0.3311    0.2045    1.62   0.107
AR 2       0.2711    0.0764    3.55   0.000
MA 1       0.2821    0.2129    1.33   0.186
Constant   0.17191   0.03287   5.23   0.000
…
MS = 0.469   DF = 220
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag        12      24      36      48
P-Value    0.004   0.001   0.000   0.001
2 months forecasts:
Forecast   Lower     Upper
194.746    193.403   196.088
195.300    193.355   197.246

ARMA(1,2):
Type       Coef      SE Coef   T       P
AR 1       0.6136    0.1635    3.75    0.000
MA 1       0.5577    0.1679    3.32    0.001
MA 2       -0.2202   0.0763    -2.89   0.004
Constant   0.16753   0.03043   5.51    0.000
…
MS = 0.472   DF = 220
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag        12      24      36      48
…
P-Value    0.002   0.001   0.000   0.000
2 months forecasts:
Forecast   Lower     Upper
194.728    193.382   196.075
195.214    193.256   197.173

ARMA(2,2):
* ERROR * Model cannot be estimated with these data.

None of the models is satisfactory in goodness-of-fit (every Ljung-Box p-value is below 0.01), and the prediction intervals are quite similar (slightly narrower for the more complex models).

Maybe second-order non-seasonal differences would work?

$w_t = u_t - u_{t-1} = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2y_{t-1} + y_{t-2}$

[Figure: Time Series Plot of w]
[Figure: Autocorrelation Function for w, with 5% significance limits for the autocorrelations] Clear seasonal correlation and close to non-stationary.

How about first-order seasonal differences of the first-order non-seasonal differences?

$z_t = u_t - u_{t-12} = (y_t - y_{t-1}) - (y_{t-12} - y_{t-13})$

[Figure: Time Series Plot of z] Looks much more stationary!
[Figure: Autocorrelation Function for z, with 5% significance limits for the autocorrelations] Tricky to identify the correct model. Clearly a seasonal model must be used, most probably with at least one MA term. The non-seasonal part is more difficult. ARMA(1,1)?
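The differencing steps and autocorrelation estimates discussed so far can be reproduced outside Minitab. The following is a minimal sketch in plain Python, not the procedure Minitab uses internally. The function names `difference` and `sample_acf` are illustrative, and the series `y` is a synthetic stand-in (linear trend plus a period-12 pattern), not the data in ‘sparetime.txt’.

```python
import math

def difference(y, lag=1):
    """Return the differenced series y[t] - y[t - lag] for t = lag, ..., len(y) - 1."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag, with r_k = c_k / c_0."""
    n = len(x)
    mean = sum(x) / n
    dev = [value - mean for value in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

# Synthetic stand-in for the CPI series (an assumption, NOT the real data):
# 225 monthly values with a linear trend and a seasonal pattern of period 12.
y = [100 + 0.4 * t + 2.0 * math.sin(2 * math.pi * t / 12) for t in range(225)]

u = difference(y, 1)    # u_t = y_t - y_{t-1}
v = difference(y, 12)   # v_t = y_t - y_{t-12}
w = difference(u, 1)    # w_t = y_t - 2 y_{t-1} + y_{t-2}
z = difference(u, 12)   # z_t = (y_t - y_{t-1}) - (y_{t-12} - y_{t-13})

print(len(u), len(v), len(w), len(z))   # 224 213 223 212

acf_y = sample_acf(y, 24)   # trending series: autocorrelations die out slowly
acf_u = sample_acf(u, 24)   # seasonal pattern left in u shows up at lag 12
print(round(acf_y[0], 3), round(acf_u[11], 3))
```

Each regular difference costs one observation and each seasonal difference costs twelve, so a series of 225 values leaves 212 values for z. On real data, the autocorrelations returned by `sample_acf` would be plotted against the lag and compared with significance limits, as in the Minitab graphs.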
[Figure: Partial Autocorrelation Function for z, with 5% significance limits for the partial autocorrelations; x-axis: lag]

Try ARIMA(1,1,1,0,1,1)12

Type       Coef       SE Coef   T       P
AR 1       -0.6368    1.2524    -0.51   0.612
MA 1       -0.6085    1.2902    -0.47   0.638
SMA 12     0.8961     0.0484    18.51   0.000
Constant   -0.07528   0.01129   -6.67   0.000

Differencing: 1 regular, 1 seasonal of order 12
Number of observations: Original series 225, after differencing 212
Residuals: SS = 77.9325 (backforecasts excluded)
           MS = 0.3747   DF = 208

Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag          12      24      36      48
Chi-Square   12.9    22.8    28.9    36.0
DF           8       20      32      44
P-Value      0.115   0.300   0.626   0.800

2 months forecasts:
Forecast   Lower     Upper
194.971    193.771   196.171
195.580    193.907   197.253

Compare with ARIMA(0,1,0,0,1,1)12

Type       Coef        SE Coef    T       P
SMA 12     0.9039      0.0472     19.15   0.000
Constant   -0.045964   0.006893   -6.67   0.000

Differencing: 1 regular, 1 seasonal of order 12
Number of observations: Original series 225, after differencing 212
Residuals: SS = 78.0539 (backforecasts excluded)
           MS = 0.3717   DF = 210
Slightly smaller MS!

Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag          12      24      36      48
Chi-Square   12.5    21.8    28.0    34.8
DF           10      22      34      46
P-Value      0.254   0.475   0.757   0.887

2 months forecasts:
Forecast   Lower     Upper
194.987    193.792   196.182
195.587    193.897   197.277

“E.3. Residual analysis
Construct a graph for the residuals (the one-step-ahead prediction errors) and examine visually if there is any pattern in the residuals indicating that the selected forecasting model is not optimal.”

Residual plots for
ARIMA(1,1,1,0,0,0)
ARIMA(1,1,1,0,1,1)12
ARIMA(0,1,0,0,1,1)12

ARIMA(1,1,1,0,0,0):
[Figure: Residual Plots for CPI(group); panels: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, Residuals Versus the Order of the Data]
[Figure: ACF of Residuals for CPI(group), with 5% significance limits for the autocorrelations]
[Figure: PACF of Residuals for CPI(group), with 5% significance limits for the partial autocorrelations]
Not satisfactory!

ARIMA(1,1,1,0,1,1)12:
[Figure: Residual Plots for CPI(group); panels: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, Residuals Versus the Order of the Data]
[Figure: ACF of Residuals for CPI(group), with 5% significance limits for the autocorrelations]
[Figure: PACF of Residuals for CPI(group), with 5% significance limits for the partial autocorrelations]
Satisfactory!

ARIMA(0,1,0,0,1,1)12:
[Figure: Residual Plots for CPI(group); panels: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, Residuals Versus the Order of the Data]
[Figure: ACF of Residuals for CPI(group), with 5% significance limits for the autocorrelations]
[Figure: PACF of Residuals for CPI(group), with 5% significance limits for the partial autocorrelations]
Satisfactory!

“F. ARMA-models and exponential smoothing
Data set: The Dollar-Danish Crowns Exchange rates
Consider the time series of monthly exchange rates US$/DKK.”

“At first, calculate forecasts by using exponential smoothing and note the prediction formula.”

[Figure: Time Series Plot of Exchange rate; x-axis: month and year, ticks from Jan 1991 to Jan 1998] Change the scale so that the y-axis starts at 0 (and ends at 10).
[Figure: Time Series Plot of Exchange rate, y-axis from 0 to 10] Single exponential smoothing will probably work well.
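Single exponential smoothing keeps one smoothed level, updates it with each new observation, and uses the latest level as the forecast for every future period. The sketch below illustrates the idea under stated assumptions: the level is initialized with the first observation, the smoothing constant is chosen by a simple grid search on the one-step-ahead squared errors, and the `rates` values are hypothetical. None of this is claimed to match Minitab's initialization or optimization.

```python
def ses_one_step_fits(y, alpha):
    """Return (one-step-ahead fits, final level) for single exponential smoothing."""
    level = y[0]   # assumption: start the level at the first observation
    fits = []
    for obs in y:
        fits.append(level)                            # forecast made before seeing obs
        level = alpha * obs + (1.0 - alpha) * level   # update the smoothed level
    return fits, level

def sse(y, alpha):
    """Sum of squared one-step-ahead prediction errors for a given alpha."""
    fits, _ = ses_one_step_fits(y, alpha)
    return sum((obs - fit) ** 2 for obs, fit in zip(y, fits))

def best_alpha(y, grid):
    """Pick the smoothing constant in `grid` with the smallest SSE."""
    return min(grid, key=lambda a: sse(y, a))

# Hypothetical monthly exchange rates (NOT the assignment data).
rates = [6.40, 6.35, 6.52, 6.48, 6.61, 6.55, 6.43, 6.31]
grid = [i / 100 for i in range(1, 101)]   # candidate alphas 0.01, ..., 1.00

alpha = best_alpha(rates, grid)
_, level = ses_one_step_fits(rates, alpha)
forecasts = [level] * 6   # the same value for all 6 months ahead
print(alpha, forecasts)
```

A smoothing constant close to 1 means the forecast is almost equal to the last observed value, which is what one expects for a series that behaves like a random walk.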
Optimize the smoothing constant and calculate forecasts for 6 months (an arbitrarily chosen value).

[Figure: Single Exponential Smoothing Plot for Exchange rate; series: Actual, Fits, Forecasts, 95.0% PI; x-axis: Index]

Smoothing Constant: Alpha 0.995540
Accuracy Measures: MAPE 2.41996, MAD 0.14983, MSD 0.03784

Forecasts
Period   Forecast   Lower     Upper
96       6.31118    5.94410   6.67825
97       6.31118    5.94410   6.67825
98       6.31118    5.94410   6.67825
99       6.31118    5.94410   6.67825
100      6.31118    5.94410   6.67825
101      6.31118    5.94410   6.67825

Prediction formula, where $\ell_T$ denotes the smoothed level at time $T$:

$\hat{y}_{T+\tau} = \ell_T = \alpha\, y_T + (1 - \alpha)\, \ell_{T-1} = 0.9955\, y_T + 0.0045\, \ell_{T-1}$

“Then calculate forecasts by fitting a MA(1)-model to first differences of the original series (i.e. you must difference the series once).”

[Figure: Time Series Plot for Exchange rate, with forecasts and their 95% confidence limits; x-axis: Time]

Final Estimates of Parameters
Type       Coef      SE Coef
MA 1       0.0052    0.1043
Constant   0.00537   0.02027

Forecasts from period 95 (95 percent limits)
Period   Forecast   Lower     Upper
96       6.31652    5.92916   6.70387
97       6.32189    5.77550   6.86828
98       6.32726    5.65864   6.99587
99       6.33263    5.56091   7.10434
100      6.33800    5.47542   7.20057
101      6.34336    5.39862   7.28811

“What does the prediction formula look like in this case?”

“How do the forecasts differ between the two different methods of forecasting?”

         SES                              ARIMA(0,1,1)
Period   Forecast   Lower     Upper      Forecast   Lower     Upper
96       6.31118    5.94410   6.67825    6.31652    5.92916   6.70387
97       6.31118    5.94410   6.67825    6.32189    5.77550   6.86828
98       6.31118    5.94410   6.67825    6.32726    5.65864   6.99587
99       6.31118    5.94410   6.67825    6.33263    5.56091   7.10434
100      6.31118    5.94410   6.67825    6.33800    5.47542   7.20057
101      6.31118    5.94410   6.67825    6.34336    5.39862   7.28811

The SES forecasts are constant at 6.31118 with limits of constant width, whereas the ARIMA(0,1,1) forecasts rise by about 0.0054 per month and their limits widen with the forecast horizon.
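The relation between the two methods can be made explicit with a small sketch. It assumes the sign convention in which the fitted model for the first differences is $y_t - y_{t-1} = \delta + a_t - \theta a_{t-1}$, so that the one-step forecast is $\hat{y}_{T+1} = y_T + \delta - \theta \hat{a}_T$ and each further step adds $\delta$. The function name `arima011_forecast`, the starting value of zero for the first error, and the `rates` values are assumptions for illustration; only $\theta = 0.0052$ and $\delta = 0.00537$ are taken from the output above.

```python
def arima011_forecast(y, theta, delta, horizon):
    """Forecasts for y_t - y_{t-1} = delta + a_t - theta * a_{t-1}."""
    err = 0.0   # assumption: the error before the first observation is set to 0
    for t in range(1, len(y)):
        pred = y[t - 1] + delta - theta * err   # one-step-ahead prediction of y[t]
        err = y[t] - pred                       # one-step-ahead prediction error
    last = y[-1] + delta - theta * err          # forecast one period ahead
    forecasts = [last]
    for _ in range(horizon - 1):
        last = last + delta                     # later forecasts only add the constant
        forecasts.append(last)
    return forecasts

# Hypothetical monthly exchange rates (NOT the assignment data).
rates = [6.40, 6.35, 6.52, 6.48, 6.61, 6.55, 6.43, 6.31]

theta, delta = 0.0052, 0.00537   # estimates reported in the Minitab output
forecasts = arima011_forecast(rates, theta, delta, 6)
steps = [b - a for a, b in zip(forecasts, forecasts[1:])]
print(forecasts)
print(steps)   # every step equals delta, up to rounding
```

With $\delta = 0$ the recursion reduces to single exponential smoothing with $\alpha = 1 - \theta$; here $1 - 0.0052 = 0.9948$, which is close to the optimized alpha of 0.995540. The nonzero constant is what makes the ARIMA(0,1,1) forecasts drift upward by about 0.00537 per month while the SES forecasts stay flat.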