Tutorial for solution of Assignment week 39

“A. Time series without seasonal variation
Use the data in the file 'dollar.txt'.”

“Construct a time series graph of the fluctuations of the dollar exchange rate, y_t, for the period 1994-1998.”

[Figure: Time Series Plot of $US/SEK, Jan 3, 1994 - Nov 3, 1998. y-axis: $US/SEK, x-axis: Index]

Note! The time scale is best set to Index here, as the days in the time series are not consecutive (Saturdays, Sundays and other holidays are not present).

“Construct also a point plot for all pairs (y_(t-1), y_t) and try to visually estimate how strong the correlation between two consecutive observations is (=autocorrelation).”

[Figure: Scatterplot of $US/SEK (y_t vs y_t-1), Jan 3, 1994 - Nov 3, 1998]

Strong positive autocorrelation!

“How do the estimated autocorrelations change with increasing time lags between observations?”

To estimate the autocorrelation function, copy the relevant rows (data for 1994-1998) of column $US/SEK to a new column and run the autocorrelation function estimation on that column.

[Figure: Autocorrelation Function for $US/SEK_1 (with 5% significance limits for the autocorrelations), lags 1-80]

As was deduced from the scatter plot, the autocorrelations are strongly positive. The autocorrelations do not change very much with increasing time lags. Note that this is what we see when the time series is non-stationary (has a trend).

“Construct a time series graph of the changes z_t = y_t - y_(t-1) of the dollar exchange rate. Then try to judge upon how the estimated autocorrelations for the series z_t change with the time lag between observations and check your judgement by estimating the autocorrelations.”

The changes are already present in the column Difference.
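The contrast between the levels y_t and the changes z_t can also be reproduced outside Minitab. The following is a minimal sketch, not the tutorial's own procedure: it uses a synthetic random walk as a stand-in for the exchange rate (the file 'dollar.txt' is not read), and the function name sample_acf, the starting value 7.5 and the step size 0.045 are illustrative assumptions.

```python
import random

def sample_acf(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    numer = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return numer / denom

# ASSUMPTION: synthetic random walk, NOT the real $US/SEK data.
rng = random.Random(0)
y = [7.5]
for _ in range(1228):
    y.append(y[-1] + rng.gauss(0.0, 0.045))

# First differences z_t = y_t - y_(t-1)
z = [y[t] - y[t - 1] for t in range(1, len(y))]

lags = (1, 10, 40, 80)
acf_levels = [sample_acf(y, k) for k in lags]
acf_changes = [sample_acf(z, k) for k in lags]
print("levels :", [round(r, 3) for r in acf_levels])
print("changes:", [round(r, 3) for r in acf_changes])
```

For a series of this kind, the autocorrelations of the levels typically stay high over many lags, while those of the changes scatter around zero, which is the pattern described for the exchange rate and its differences.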
The analogous procedures are applied to this column to produce the time series graph and the estimated ACF plot, i.e. by including only values where column Year is 1994 or later.

[Figure: Time Series Plot of Difference, Jan 3, 1994 - Nov 3, 1998. y-axis: Difference, x-axis: Index]

Noisy plot. As previously, plot z_t vs. z_(t-1).

[Figure: Scatterplot of Difference vs z_t-1, Jan 3, 1994 - Nov 3, 1998]

Seems to be no autocorrelation at all.

[Figure: Autocorrelation Function for Difference_1 (with 5% significance limits for the autocorrelations), lags 1-80]

Our conclusions are verified!

“B. Time series with seasonal variation
Use the time series of monthly discharge in the lake Hjälmaren (‘Hjalmarenmonth.txt’), which you have used in the assignment for week 36. Compute the autocorrelation function (Minitab: Stat > Time Series > Autocorrelation…) for the variable Discharge.m.”

[Figure: Autocorrelation Function for Discharge.m (with 5% significance limits for the autocorrelations), lags 1-80]

“Deseasonalise the time series and make a new graph of the seasonally adjusted values. Try to visually estimate how the autocorrelations look like and check your judgement by computing the autocorrelation function.”

[Figure: Time Series Plot of Discharge.m. y-axis: Discharge.m, x-axis: Month/Year]

Additive model for deseasonalization seems best!

[Figure: Time Series Plot of DESE1. y-axis: DESE1, x-axis: Month/Year]

Plot DESE1(t) vs. DESE1(t-1):

[Figure: Scatterplot of DESE1 vs DESE1_1]

Indicates positive autocorrelation.

[Figure: Autocorrelation Function for DESE1 (with 5% significance limits for the autocorrelations), lags 1-80]

Indication confirmed!

“C. Forecasting with autoregressive models
Data set: The Dollar Exchange rates
Consider again the time series of dollar exchange rates for the period 1994-1998. Then use the Minitab time series module ARIMA (see further below) to estimate the parameters in an AR(1)-model (1 nonseasonal autoregressive parameter) and plot the observed values together with forecasts for a period of 20 days after the last observed time-point.”

Use the already created column of $US/SEK exchange rates from 1994-1998 (Minitab’s ARIMA module cannot analyze just a subset of a column, as the graphing modules can).

[Screenshots: ARIMA dialog boxes, with the following annotations]
  Forecasts for a 20-day period are requested. (The Origin field is left blank, analogously to previous modules.)
  Three new columns should be entered here!
  Must be checked (not default).
  Should always be checked for diagnostic purposes.

Final Estimates of Parameters

Type       Coef      SE Coef   T       P
AR 1       0.9971    0.0026    385.44  0.000    Significant!
Constant   0.021782  0.001280  17.02   0.000
Mean       7.4405    0.4371

Number of observations: 1229
Residuals: SS = 2.45718 (backforecasts excluded)
           MS = 0.00200  DF = 1227              Keep in mind for comparison with next model

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag          12     24     36     48
Chi-Square   9.0    22.9   33.3   38.2
DF           10     22     34     46
P-Value      0.529  0.410  0.504  0.786         OK!
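The reported AR(1) estimates determine how forecasts and their 95% limits develop with the forecast horizon. The sketch below applies the textbook AR(1) formulas to the rounded estimates from the output: the point forecast decays geometrically towards the estimated mean, and the interval widens as error variance accumulates. The last observed value y_last = 7.80 is a hypothetical value chosen for illustration, the function names are made up, and Minitab's internal computation may differ in detail.

```python
import math

# Estimates as reported in the AR(1) output (rounded values).
phi = 0.9971     # AR(1) coefficient
mean = 7.4405    # estimated process mean
ms = 0.00200     # residual mean square, used as the error variance estimate

# ASSUMPTION: hypothetical last observed value, for illustration only.
y_last = 7.80

def ar1_forecast(h):
    """Point forecast h steps ahead: mean + phi^h * (y_last - mean)."""
    return mean + phi ** h * (y_last - mean)

def ar1_half_width(h, z=1.96):
    """Half-width of the approximate 95% prediction interval at horizon h."""
    var_h = ms * sum(phi ** (2 * j) for j in range(h))
    return z * math.sqrt(var_h)

forecasts = [ar1_forecast(h) for h in range(1, 21)]
widths = [2 * ar1_half_width(h) for h in range(1, 21)]

for h in (1, 5, 20):
    print(h, round(forecasts[h - 1], 4), round(widths[h - 1], 4))
```

Because phi is close to 1, the forecasts move only very slowly towards the mean, while the interval width grows steadily over the 20-day horizon.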
Forecasts from period 1229

                    95 Percent Limits
Period   Forecast   Lower     Upper     Actual
1230     7.79895    7.71122   7.88668
1231     7.79790    7.67401   7.92178
1232     7.79685    7.64535   7.94836
1233     7.79581    7.62112   7.97050
1234     7.79477    7.59974   7.98979
1235     7.79373    7.58040   8.00706
1236     7.79270    7.56261   8.02278
1237     7.79167    7.54605   8.03728
1238     7.79064    7.53051   8.05077
1239     7.78961    7.51581   8.06342
1240     7.78859    7.50184   8.07534
1241     7.78757    7.48850   8.08664
1242     7.78655    7.47572   8.09739
1243     7.78554    7.46344   8.10764
1244     7.78453    7.45161   8.11745
1245     7.78352    7.44018   8.12687
1246     7.78252    7.42912   8.13592
1247     7.78152    7.41839   8.14464
1248     7.78052    7.40798   8.15306
1249     7.77952    7.39786   8.16119

These forecasts and prediction limits are stored in columns C12, C13 and C14 (as entered in the dialog box).

[Figure: Time Series Plot for $US/SEK_1 (with forecasts and their 95% confidence limits). y-axis: $US/SEK_1, x-axis: Time]

[Figure: ACF of Residuals for $US/SEK_1 (with 5% significance limits for the autocorrelations) and PACF of Residuals for $US/SEK_1 (with 5% significance limits for the partial autocorrelations), lags 1-78]

Seems to be OK (as was confirmed by the Ljung-Box statistic).

Use the stored prediction limits to calculate the widths of the prediction intervals. The column widths_1 (C15) will later be compared with the widths from another model.

“Investigate also if the forecasts can improve by instead using an AR(2)-model.”

Don’t forget to enter new columns here!

Final Estimates of Parameters

Type       Coef      SE Coef   T      P
AR 1       1.0107    0.0286    35.35  0.000
AR 2      -0.0138    0.0285    -0.48  0.629    Non-significant!
Constant   0.023161  0.001280  18.09  0.000
Mean       7.4372    0.4110

Number of observations: 1229
Residuals: SS = 2.45873 (backforecasts excluded)
           MS = 0.00201  DF = 1226             Slightly larger than in AR(1)-model

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag          12     24     36     48
Chi-Square   10.2   24.0   34.3   39.3
DF           9      21     33     45
P-Value      0.337  0.292  0.403  0.710        Still OK!

[Figure: Time Series Plot for $US/SEK_1 (with forecasts and their 95% confidence limits). y-axis: $US/SEK_1, x-axis: Time]

[Figure: ACF of Residuals for $US/SEK_1 (with 5% significance limits for the autocorrelations) and PACF of Residuals for $US/SEK_1 (with 5% significance limits for the partial autocorrelations), lags 1-78]

Calculate the widths of the new prediction intervals. Make a time series plot of the interval widths from the two analyses.

[Figure: Time Series Plot of widths_1; widths_2. x-axis: Index 1-20]

Slightly wider prediction intervals with the AR(2)-model (widths_2). Forecasts do not improve with the AR(2)-model.

“Finally perform a residual analysis of the errors in the one-step-ahead forecasts (can be asked for under the “Graph” button in the dialog box. By residuals we mean here the errors in the one-step-ahead forecasts).
Are there any signs of serial correlations in the residuals?”

[Figure: AR(1): ACF of Residuals for $US/SEK_1 (with 5% significance limits for the autocorrelations) and PACF of Residuals for $US/SEK_1 (with 5% significance limits for the partial autocorrelations), lags 1-78]

[Figure: AR(2): ACF of Residuals for $US/SEK_1 (with 5% significance limits for the autocorrelations) and PACF of Residuals for $US/SEK_1 (with 5% significance limits for the partial autocorrelations), lags 1-78]

No signs of serial correlations in the residuals in either of the models.

“D. ARIMA models and differentiation
In this task you will first have to judge upon whether you need to differentiate the current time series (z_t = y_t - y_(t-1)) before forecasting with an ARMA-model can be applied. Then you shall try different models with a number of parameters to find the model that gives the least one-step-ahead prediction errors on the average. Finally you shall make some residual plots to investigate if the selected model of forecasting can be improved.”

“Forecasting monthly dollar exchange rates in Danish crowns (DKK)
Data set: The Dollar-Danish Crowns Exchange rates”

“D.1. The need for differentiation
Construct a time series graph for the monthly means of dollar exchange rates in Danish crowns (file ‘DKK.txt’). Then estimate the autocorrelations and display them in a graph. Does the time series show any obvious upward or downward trend?”

[Figure: Time Series Plot of Exchange rate. y-axis: Exchange rate, x-axis: Month/Year]

Note that the y-axis does not start at zero! A slight upward trend may be concluded.

“Are there any signs of long-time oscillations in the time series (that can be seen from the time series graph)?”

Yes, there seems to be a cyclical variation with cycle periods longer than a year.

[Figure: Autocorrelation Function for Exchange rate (with 5% significance limits for the autocorrelations), lags 1-24]

“Is there a fast cancel-out in the autocorrelations?”

No, the cancel-out is not fast (although the spikes quickly fall within the red significance limits).

“Is there need for differentiation to get a time series suitable for ARMA-modelling?”

Probably, but not certainly!

“D.2 Fitting different ARMA-models
Calculate the estimated autocorrelations possibly after differentiation of the original series and display these estimates in a graph.”

Without differentiation:

[Figure: Autocorrelation Function for Exchange rate (with 5% significance limits for the autocorrelations) and Partial Autocorrelation Function for Exchange rate (with 5% significance limits for the partial autocorrelations), lags 1-24]

(Slowly) decreasing positive autocorrelations. One positive spike (at lag 1) in the SPAC (sample partial autocorrelation function). Either this is a non-stationary time series or an AR(1) time series with an autoregressive parameter close to 1.
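Why these two explanations are hard to tell apart can be seen from the theoretical correlation structure of a stationary AR(1) process. The block below is a standard textbook result, not something taken from the Minitab output. The symbols delta (constant), phi (autoregressive parameter), a_t (white-noise error), rho_k (autocorrelation at lag k) and phi_kk (partial autocorrelation at lag k) are notation introduced here.

```latex
\begin{align*}
y_t &= \delta + \phi\, y_{t-1} + a_t, \qquad |\phi| < 1, \\
\rho_k &= \operatorname{Corr}(y_t, y_{t-k}) = \phi^{k}, \qquad k = 0, 1, 2, \dots \\
\phi_{kk} &=
  \begin{cases}
    \phi, & k = 1, \\
    0,    & k \ge 2.
  \end{cases}
\end{align*}
```

With an illustrative value such as phi = 0.97, the autocorrelation is still about 0.97^12 ≈ 0.69 at lag 12 and about 0.97^24 ≈ 0.48 at lag 24, so the theoretical ACF dies out slowly while the PACF has a single spike at lag 1. A random walk (phi = 1) is non-stationary, but in a short sample its estimated ACF and PACF tend to look much the same, which is why the plots alone do not settle the question.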
With first-order differentiation (use the ready series of differences):

[Figure: Autocorrelation Function for Difference in exchange rate (with 5% significance limits for the autocorrelations) and Partial Autocorrelation Function for Difference in exchange rate (with 5% significance limits for the partial autocorrelations), lags 1-24]

No obvious pattern in either of these two plots. The differentiated series may be an ARMA-series.

“Then try to predict the dollar exchange rate by combining differentiation with ARMA-models of different orders.”

Strategy:
  On the original series, try AR(1).
  On the differentiated series, try AR(1), AR(2), MA(1), MA(2), ARMA(1,1), ARMA(1,2), ARMA(2,1) and ARMA(2,2).
  Compare the values of MS from each model. This measure reflects the average size of the one-step-ahead prediction errors.

Series           Model       MS
Original         AR(1)       0.03682
Differentiated   AR(1)       0.03904
                 AR(2)       0.03914
                 MA(1)       0.03904
                 MA(2)       0.03916
                 ARMA(1,1)   0.03905
                 ARMA(2,1)   0.03889
                 ARMA(1,2)   0.03869
                 ARMA(2,2)   0.03807

None of the models on the differentiated series produces a better MS value than the AR(1) on the original series, but MS seems to decrease with larger complexity.

“What happens if one tries to fit a very complex model with a lot of parameters to the observations?”

Study e.g. ARMA(3,3) and ARMA(4,4) on the differentiated series:

ARMA(3,3)

Final Estimates of Parameters

Type       Coef      SE Coef   T      P
AR 1      -0.1113    0.3369    -0.33  0.742
AR 2       0.4786    0.2274    2.10   0.038
AR 3       0.3689    0.3237    1.14   0.258
MA 1      -0.1098    0.2941    -0.37  0.710
MA 2       0.4351    0.2136    2.04   0.045
MA 3       0.6165    0.2846    2.17   0.033
Constant   0.000924  0.001931  0.48   0.634

Differencing: 1 regular difference
Number of observations: Original series 95, after differencing 94
Residuals: SS = 3.25649 (backforecasts excluded)
           MS = 0.03743  DF = 87               Even lower than in ARMA(2,2)

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag          12     24     36     48
Chi-Square   4.9    17.3   26.4   39.5
DF           5      17     29     41
P-Value      0.425  0.431  0.606  0.537

No severe problems, but not all parameters are significant!

[Figure: ACF of Residuals for Exchange rate (with 5% significance limits for the autocorrelations), lags 1-22]

[Figure: PACF of Residuals for Exchange rate (with 5% significance limits for the partial autocorrelations), lags 1-22]

No severe problems here either, but the spikes seem to increase with lag!

ARMA(4,4)

Minitab message: "Unable to reduce sum of squares any further". Estimation problems!

Final Estimates of Parameters

Type       Coef      SE Coef   T      P
AR 1       0.4196    2.3514    0.18   0.859
AR 2       0.4329    0.4304    1.01   0.317
AR 3       0.0536    1.2079    0.04   0.965
AR 4      -0.0652    0.7425    -0.09  0.930
MA 1       0.4119    2.3452    0.18   0.861
MA 2       0.3871    0.4030    0.96   0.340
MA 3       0.3397    1.0707    0.32   0.752
MA 4      -0.1736    1.2715    -0.14  0.892
Constant   0.000597  0.001779  0.34   0.738

None of the parameters is significant!

Differencing: 1 regular difference
Number of observations: Original series 95, after differencing 94
Residuals: SS = 3.26434 (backforecasts excluded)
           MS = 0.03840  DF = 85               Increased!

Estimation problems and an increase in MS.

The conclusion must be that an AR(1)-model on the original data seems to be the best.

“D.3. Residual analysis
Construct a graph for the residuals (the one-step-ahead prediction errors) and examine visually if anything points to a possible improvement of the model.”

[Figure: ACF of Residuals for Exchange rate (with 5% significance limits for the autocorrelations) and PACF of Residuals for Exchange rate (with 5% significance limits for the partial autocorrelations), lags 1-24]

SAC and SPAC of the residuals do not indicate that another ARIMA-model should be used.

[Figure: Residual Plots for Exchange rate. Panels: Normal Probability Plot of the Residuals; Residuals Versus the Fitted Values; Histogram of the Residuals; Residuals Versus the Order of the Data]

There do not seem to be any violations of the assumptions of normal distribution and constant variance either.
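The residual checks used throughout rest on the modified Box-Pierce (Ljung-Box) statistic, Q = n(n+2) * sum over k = 1..K of r_k^2 / (n - k), where r_k is the sample autocorrelation of the residuals at lag k. The sketch below computes Q for two synthetic series, not for the tutorial's residuals: white noise, standing in for residuals from an adequate model, and an AR(1) series with coefficient 0.8, standing in for residuals that still contain serial correlation. The function names, sample size and coefficient are illustrative assumptions.

```python
import random

def sample_acf(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    numer = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return numer / denom

def ljung_box(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..K} r_k^2 / (n - k)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        sample_acf(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

rng = random.Random(1)

# ASSUMPTION: synthetic series, used only to show how Q behaves.
white = [rng.gauss(0.0, 0.2) for _ in range(500)]

correlated = [0.0]
for _ in range(499):
    correlated.append(0.8 * correlated[-1] + rng.gauss(0.0, 0.2))

q_white = ljung_box(white, 12)
q_corr = ljung_box(correlated, 12)
print("Q, white noise      :", round(q_white, 1))
print("Q, correlated series:", round(q_corr, 1))
```

For a raw series with no fitted parameters, Q at lag 12 is compared with a chi-square distribution with 12 degrees of freedom, whose 5% critical value is about 21.03. When Q is computed on model residuals, the degrees of freedom are reduced by the number of estimated parameters, which is consistent with the DF rows in the outputs (for example 12 - 2 = 10 for the AR(1) model with a constant).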