In-class Exercise 1 : Fitting Temperature Data with ARIMA
(due Fri 3/06/2015)
Name:
Sample Solution
Use this file as a template for your assignment. Submit your code and comments together with
(selected) output from the R console. Your comments must be BOLD FACED.
First, load the global temperature data from the class web site using the R code below.
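library(forecast)            # provides auto.arima(), Arima(), forecast()
D <- read.csv("http://gozips.uakron.edu/~nmimoto/pages/datasets/gtemp.txt")
X <- ts(D, start=c(1880), freq=1)
plot(X, type='o')            # original series (d=0)
plot(diff(X), type='o')      # first difference (d=1)
Fit1 <- auto.arima(X)
Fit1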
1. Use auto.arima and select the best ARIMA(p,d,q) model for this dataset. How did
auto.arima come to the final model? Briefly explain.
ARIMA(0,1,2) with drift
          ma1      ma2   drift
      -0.5134  -0.1915  0.0065
s.e.   0.0833   0.0783  0.0026
sigma^2 estimated as 0.009181: log likelihood=118.78
AIC=-229.56 AICc=-229.23 BIC=-218.12
auto.arima() chose d=1 using the KPSS stationarity test. It then searched stepwise
over ARIMA(p,1,q) models and kept the one with the lowest AICc.
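The AICc that the search minimizes can be reproduced from the reported log likelihood. The sketch below is only an arithmetic check, not part of auto.arima(): the parameter count k = 4 (ma1, ma2, drift, sigma^2) and the effective sample size n = 129 are assumptions, with n inferred from the reported BIC rather than read from the data.

```r
# Reported log likelihood for ARIMA(0,1,2) with drift
loglik <- 118.78
k <- 4     # assumption: ma1, ma2, drift, plus sigma^2
n <- 129   # assumption: observations left after one difference (inferred from BIC)

aic  <- -2 * loglik + 2 * k
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction to AIC
bic  <- -2 * loglik + k * log(n)

print(round(c(AIC = aic, AICc = aicc, BIC = bic), 2))
```

The values should agree with the printed AIC, AICc, and BIC up to rounding of the log likelihood.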
2. Determine the adequacy of the model fit by residual analysis.
The Ljung-Box and McLeod-Li randomness tests of the residuals all have high p-values, so the residuals look uncorrelated and the model seems adequate. The Jarque-Bera test also has a high p-value, which is consistent with a normal residual distribution.
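The residual checks named above can be sketched in base R. This is a minimal illustration on simulated data, not the class diagnostic function: the series x is a made-up stand-in for the temperature data, the McLeod-Li test is computed as a Ljung-Box test on squared residuals, and the Jarque-Bera statistic is computed by hand from sample skewness and kurtosis.

```r
set.seed(42)
# Simulated ARIMA(0,1,2) series (assumption: stand-in for the real data)
x <- cumsum(arima.sim(list(ma = c(-0.5, -0.2)), n = 130, sd = 0.1))
fit <- arima(x, order = c(0, 1, 2))
r <- residuals(fit)[-1]   # drop the first residual, which is affected by differencing

# Ljung-Box on residuals; fitdf = 2 accounts for the two fitted MA coefficients
lb <- Box.test(r, lag = 10, type = "Ljung-Box", fitdf = 2)
# McLeod-Li: Ljung-Box applied to squared residuals
ml <- Box.test(r^2, lag = 10, type = "Ljung-Box")

# Jarque-Bera: n/6 * (skewness^2 + (kurtosis - 3)^2 / 4), chi-square with 2 df
n  <- length(r)
m2 <- mean((r - mean(r))^2)
skew <- mean((r - mean(r))^3) / m2^1.5
kurt <- mean((r - mean(r))^4) / m2^2
jb   <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
jb_p <- pchisq(jb, df = 2, lower.tail = FALSE)

print(c(LjungBox = lb$p.value, McLeodLi = ml$p.value, JarqueBera = jb_p))
```

High p-values in all three tests would support the same conclusion as in the answer above.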
3. Check for parameter significance of the current model (trusting the asymptotic s.e.
from auto.arima()).
Using the standard errors in the output, all model parameters are significant
(i.e., the interval estimate ± 1.96 × s.e. does not contain 0 for any parameter).
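The interval check can be written out with the reported estimates and standard errors. The variable names below are illustrative, and the 1.96 multiplier assumes the asymptotic normal approximation mentioned in the question.

```r
# Estimates and standard errors copied from the auto.arima() output
est <- c(ma1 = -0.5134, ma2 = -0.1915, drift = 0.0065)
se  <- c(ma1 =  0.0833, ma2 =  0.0783, drift = 0.0026)

# Approximate 95% intervals: estimate +/- 1.96 * s.e.
ci <- cbind(lower = est - 1.96 * se, upper = est + 1.96 * se)

# A parameter is significant when its interval excludes 0
significant <- ci[, "lower"] > 0 | ci[, "upper"] < 0
print(cbind(round(ci, 4), significant))
```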
4. Check the adequacy of the value of d selected by auto.arima(). Which method is used by
default? Use Stationarity.tests() from the class website and check d=0, d=1, d=2 for
(non)stationarity. Do you agree with the choice made by auto.arima()?
Stationarity.tests(X)
        KPSS    ADF    PP
p-val:  0.01  0.706  0.01
Stationarity.tests(diff(X))
        KPSS    ADF    PP
p-val:   0.1   0.01  0.01
Stationarity.tests(diff(diff(X)))
        KPSS    ADF    PP
p-val:   0.1   0.01  0.01
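auto.arima() uses the KPSS test by default. When d=0, the KPSS and ADF tests both
indicate non-stationarity (KPSS rejects its null of stationarity, and ADF fails to
reject its null of a unit root), which is also obvious from the plot (see top). When
d=1, all three tests indicate stationarity, and the plot does look stationary.
Since d=1 is stationary, d=2 looks stationary as well. The smallest d that gives a
stationary series is d=1, so we agree with the choice made by auto.arima().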
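For readers without the class helper, a comparable table of p-values can be sketched with the tseries package. This is not the class function Stationarity.tests(); the helper name stationarity_pvals is hypothetical, the data are a simulated random walk rather than the temperature series, and the code returns NULL when tseries is not installed. The tseries tests report p-values only within a tabulated range, which is why values such as 0.01 and 0.1 appear as bounds.

```r
set.seed(7)
x <- cumsum(rnorm(130))   # simulated random walk (assumption: not the real data)

# Hypothetical helper: p-values of KPSS, ADF and PP tests for one series
stationarity_pvals <- function(x) {
  if (!requireNamespace("tseries", quietly = TRUE)) return(NULL)
  suppressWarnings(c(
    KPSS = tseries::kpss.test(x)$p.value,  # null: series is stationary
    ADF  = tseries::adf.test(x)$p.value,   # null: series has a unit root
    PP   = tseries::pp.test(x)$p.value     # null: series has a unit root
  ))
}

pv0 <- stationarity_pvals(x)         # d = 0
pv1 <- stationarity_pvals(diff(x))   # d = 1
print(rbind(d0 = pv0, d1 = pv1))
```

A small KPSS p-value points to non-stationarity, while small ADF and PP p-values point to stationarity.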
5. Look for signs of under-difference (d too low) and over-difference (d too high).
What are the signs? Do you see any from your fit?
Mod( polyroot( c( 1, -0.5134, -0.1915)) )
[1] 1.30883 3.98977
If the AR polynomial has a root close to the unit circle, it is a sign of under-differencing.
However, since the model does not have an AR part, we cannot look for that sign
here. We did check stationarity for d=1 in part (4), so we are not
worried about under-differencing.
For over-differencing, we check whether the MA polynomial has a root close to the unit circle.
Note that auto.arima() uses the (1 + theta B) sign convention for the MA polynomial. The polyroot()
output shows that the roots are not too close to the unit circle.
From a different point of view, d=0 is clearly non-stationary. Therefore, d=1
cannot be over-differencing.
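The root check can be repeated step by step. The sketch below assumes the (1 + theta1 B + theta2 B^2) convention stated above and uses the reported MA estimates; the variable names are illustrative.

```r
# MA coefficients from the fitted ARIMA(0,1,2) with drift
theta <- c(ma1 = -0.5134, ma2 = -0.1915)

# Roots of 1 + theta1 * z + theta2 * z^2, coefficients in increasing order
ma_roots <- polyroot(c(1, theta))
mods <- sort(Mod(ma_roots))
print(round(mods, 5))

# All moduli well above 1: no root near the unit circle, so no sign of over-differencing
all_outside <- all(mods > 1)
print(all_outside)
```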
6. State your final model using equation(s) with parameter values.
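ARIMA(0,1,2) with drift:
$\nabla X_t = 0.0065 + Y_t$
$Y_t = e_t - 0.5134\, e_{t-1} - 0.1915\, e_{t-2}$, where $e_t$ is white noise with variance $\sigma^2 = 0.009181$.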
7. Use the trace=TRUE option in auto.arima() to see the models with the second- and third-lowest
AICc from part (1). Use Arima() and repeat parts (2) and (3) for those models. Is there any
reason that they are better than the model chosen in part (1)?
auto.arima(X, trace=TRUE)
ARIMA(1,1,3) with drift : -226.4521
ARIMA(0,1,3) with drift : -224.9137
Arima(X, order=c(0,1,3), include.drift=TRUE)
Coefficients:
          ma1      ma2      ma3   drift
      -0.5048  -0.1875  -0.0164  0.0065
s.e.   0.0960   0.0816   0.0877  0.0025
sigma^2 estimated as 0.009179: log likelihood=118.8
AIC=-227.59 AICc=-227.1 BIC=-213.29
Arima(X, order=c(1,1,3), include.drift=TRUE)
Coefficients:
          ar1     ma1      ma2      ma3   drift
      -0.9376  0.4846  -0.6341  -0.2857  0.0065
s.e.   0.0950  0.1169   0.0959   0.0869  0.0025
sigma^2 estimated as 0.008824: log likelihood=121.23
AIC=-230.45 AICc=-229.76 BIC=-213.29
With the trace=TRUE option, the models with the next two lowest AICc values are as above.
ARIMA(0,1,3) contains one non-significant parameter (ma3), and in
ARIMA(1,1,3) phi1 is not significantly different from -1, which
is a sign of under-differencing. However, from part (4), we are convinced
that the d=1 series is stationary, so this is not a good model. Neither model
is better than the one chosen in part (1).
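Both claims rest on whether an approximate 95% interval covers a reference value (0 for ma3, -1 for the AR coefficient). A small sketch using the reported numbers follows; the helper name covers is hypothetical.

```r
# TRUE when `value` lies inside estimate +/- z * s.e.
covers <- function(est, se, value, z = 1.96) abs(est - value) <= z * se

# ma3 in the ARIMA(0,1,3) fit: does the interval contain 0?
ma3_covers_zero <- covers(est = -0.0164, se = 0.0877, value = 0)

# ar1 in the fit with an AR term: does the interval contain -1?
ar1_covers_minus1 <- covers(est = -0.9376, se = 0.0950, value = -1)

print(c(ma3_covers_zero = ma3_covers_zero, ar1_covers_minus1 = ar1_covers_minus1))
```

Both checks return TRUE, which matches the statements that ma3 is not significant and that the AR coefficient cannot be distinguished from -1.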
8. Perform 5-step prediction of global temperature using your final model.
X.hat <- forecast(Fit1, h=5)        # 5-step-ahead prediction
plot(X.hat)
plot(X.hat, xlim=c(2000, 2015))     # zoom in on the forecast period
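The same kind of 5-step prediction can be sketched with base R only, for readers who want to see what the forecast is built from. This is an illustration under stated assumptions, not the forecast package's implementation: the series is simulated rather than downloaded, the drift is modeled by passing a linear time index as xreg to arima(), and the 95% limits use the normal approximation.

```r
set.seed(1)
# Simulated stand-in for the annual temperature series (assumption)
x <- ts(cumsum(0.0065 + arima.sim(list(ma = c(-0.5, -0.2)), n = 130, sd = 0.1)),
        start = 1880)
n <- length(x)

# ARIMA(0,1,2) with drift: a linear time index as regressor gives the drift term
drift <- seq_len(n)
fit <- arima(x, order = c(0, 1, 2), xreg = drift)

# 5-step-ahead point forecasts and standard errors
h  <- 5
fc <- predict(fit, n.ahead = h, newxreg = (n + 1):(n + h))
lower <- fc$pred - 1.96 * fc$se
upper <- fc$pred + 1.96 * fc$se
print(cbind(forecast = fc$pred, lower = lower, upper = upper))
```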