Computer lesson III

This computer lesson will be used to learn how to estimate the parameters of an AR-model and an ARMA-model.

Kilpisjärvi data (spring)

We are going to use the yearly Trap index values for the rodent Clethrionomys rufocanus at Kilpisjärvi, Finland, in the spring. The data start in 1953.

Download the data “kilpissp.txt” from the home page of the course: http://biologi.uio.no/undervisning/bio360/.
Create a time series and plot the data.
What special features are there in these data? How should you deal with this?
Fit an AR-model to the data. This can e.g. be done by
    kilpis.ar <- ar(kilpis.spring, order.max=9)
order.max sets the maximum order the routine will try to fit to the data. The default is to use AIC to choose the best model (“aic=T”).
Use the object browser (or summary()) to find out what sort of output is created.
What is the order of the best model? Why? Are there other ways to confirm this?
Is your model satisfactory? Why/why not?
Why not choose a model with four lags?
What is your model? Can you express it with letters?

Kilpisjärvi data (all data)

We will now use the whole data set (both spring and fall data). You downloaded this data set in the first computer lesson (“kilpis.txt” from http://www.uio.no/studier/emner/matnat/biologi/BIO4040/h03/pensumliste.xml).
Can you fit an AR-model to these data? Why not?
Prepare the data for analysis (hint: regular features, cf. the second computer lesson).
Use arima.mle to fit ARMA-models to the (preprocessed) data, and use AIC to choose an appropriate model. You should not fit models at random.
You can fit an ARMA(2,2) model with the following command:
    kilpis.arima22 <- arima.mle(kilpis.stl$rem - mean(kilpis.stl$rem), model = list(order=c(2,0,2)), n.cond=10)
model: Here you specify the ARIMA-model of the form ARIMA(p,d,q). We will restrict ourselves to ARMA-models, hence d=0.
n.cond: The maximum likelihood must be conditioned on some values (more ideally, initial values should be specified.
Some other programs do that.). In order to compare different models, the ML must be based on the same number of values. You should thus choose a value larger than the order of the most complicated model (n.cond >> p+q).
NB: arima.mle expects a series with expectation 0. Thus you must remove the mean before the analysis.
Can you program a function that calculates the AICC? (You can find it on the homepage of the course.)
Use arima.diag( ) to evaluate your model(s). How well is the model fitted to the data?
What is your model? Can you express it with letters?
What is your biological interpretation of the model?

__________________________________________
D:\687301211.doc, 05.02.16, page 1 of 9

tmp <- read.table("e:\\Kyrre\\Studier\\drgrad\\kurs\\Timeseries\\kilpis.txt")
# tmp <- read.table("c:\\Kyrre\\Studier\\drgrad\\kurs\\Timeseries\\kilpis.txt")
# Two observations per year (spring and fall), starting in 1953
kilpis <- ts(tmp[,1], start=1953, frequency=2)
length(kilpis)
[1] 89

# Spring values are the odd-numbered observations
kilpis.spring <- matrix(0, 45, 1)
for (i in 1:45) {kilpis.spring[i] <- kilpis[(i*2)-1]}
kilpis.spring <- ts(kilpis.spring, start=1953)
# Fall values are the even-numbered observations
kilpis.fall <- matrix(0, 44, 1)
for (i in 1:44) {kilpis.fall[i] <- kilpis[i*2]}
kilpis.fall <- ts(kilpis.fall, start=1953)

par(mfrow=c(3,3))
ts.plot(kilpis, main="Clethrionomys rufocanus at Kilpisjarvi", ylab="Trap index", xlab="Year")
ts.points(kilpis)
ts.plot(kilpis.spring, main="C. rufocanus in spring", xlab="Year")
ts.points(kilpis.spring)
ts.plot(kilpis.fall, main="C. rufocanus in fall", xlab="Year")
ts.points(kilpis.fall)
acf(kilpis, type="correlation", plot=T)
acf(kilpis.spring, type="correlation", plot=T)
acf(kilpis.fall, type="correlation", plot=T)
acf(kilpis, type="partial", plot=T)
acf(kilpis.spring, type="partial", plot=T)
acf(kilpis.fall, type="partial", plot=T)

[Figure: 3 x 3 panel plot. Top row: trap index against year for the full, spring and fall series. Middle row: ACF against lag for each series. Bottom row: partial ACF against lag for each series.]

kilpis.ar <- ar(kilpis.spring, order.max=9)
kilpis.ar
$order:
[1] 2

$ar:
, , 1
         [,1]
[1,]  0.31903
[2,] -0.45581

$var.pred:
       [,1]
[1,] 17.401

$aic:
 [1]  8.69482  8.48007  0.00000  1.99762  0.32233  1.96094  3.24463  4.84671  6.83923
[10]  8.49921 10.12308 12.11896 12.49896

$n.used:
[1] 45

$order.max:
[1] 12

$partialacf:
, , 1
           [,1]
[1,]  0.2191459
[2,] -0.4558068
[3,]  0.0072625
[4,]  0.2800484
[5,]  0.0894347
[6,] -0.1256669

$resid:
Series 1
1953       NA
1954       NA
1955  8.48208
1956 -4.75243
1957  1.41131
1958  1.12721
:
1995 -3.23340
1996 -3.26534
1997 -3.37557
start deltat frequency
 1953      1         1

$method:
[1] "yule-walker"

$series:
[1] "kilpis.spring"

par(mfrow=c(2,2))
acf.plot(kilpis.ar, conf=T, main=("Partial ACF for data"))
# The AIC vector starts at order 0, hence start=0
ts.plot(ts(kilpis.ar$aic, start=0), main=("AIC values. NB: Startpoint"))
ts.plot(kilpis.ar$resid, main=("Residuals of AR(2) model"))
acf.plot(tmp <- acf(kilpis.ar$resid, plot=F), main="ACF of residuals")

[Figure: 2 x 2 panel plot. Partial ACF for data (against lag); AIC values (against order, starting at 0); Residuals of AR(2) model (against time); ACF of residuals (against lag).]

kilpis.fit <- ar(kilpis.spring, aic=F, order=3) # -> This gives the identical results.
par(mfrow=c(2,2))
ts.plot(kilpis.fit$resid, main=("Residuals of AR(3) model"))
ts.points(kilpis.fit$resid, col=8)
acf.plot(tmp <- acf(kilpis.fit$resid, plot=F), main="ACF of residuals")
acf.plot(tmp <- acf(kilpis.fit$resid, plot=F), add=T)

[Figure: Residuals of AR(2) model (against time); ACF of residuals (against lag); ACF of the series kilpis.fit$resid (against lag).]

tsp(kilpis)
[1] 1953 1997 2
# Decompose the half-yearly series into a periodic seasonal part and a remainder
kilpis.stl <- stl(kilpis, "periodic")
# plot.stl(kilpis.stl)
par(mfrow=c(1,1))
ts.plot(kilpis, main="Kilpisjärvi data", xlab="Year", ylab="Trap index", ylim=c(-2, 30))
ts.points(kilpis, pch=28, col=8)
ts.lines(kilpis.stl$seas, col=4)
ts.lines(kilpis.stl$rem, lty=1, col=3)
legend(locator(1), legend=c("Data", "Seasonal effects", "remainder"), lty=1, col=c(1,4,3))

[Figure: Kilpisjärvi data, trap index against year, with the data, the seasonal effects and the remainder overlaid.]

par(mfrow=c(2,2))
ts.plot(kilpis, main="Kilpisjärvi data", xlab="Year", ylab="Trap index")
ts.points(kilpis, pch=28, col=8)
ts.plot(kilpis.stl$seas, main="Seasonal components of Kilpisjärvi data", xlab="Year", ylab="Trap index", ylim=c(-3, 3))
ts.lines(kilpis.stl$seas, col=4)
ts.plot(kilpis.stl$rem, main="Remaining series (Kilpisjärvi)", xlab="Year", ylab="Trap index")
ts.lines(kilpis.stl$rem, lty=1, col=3)
acf(kilpis.stl$rem)

[Figure: 2 x 2 panel plot. Kilpisjärvi data; Seasonal components of Kilpisjärvi data; Remaining series (Kilpisjärvi), each against year; ACF of the remaining series against lag.]

# arima.mle expects a zero-mean series, so the mean is subtracted first
kilpis.arima22 <- arima.mle(kilpis.stl$rem - mean(kilpis.stl$rem), model = list(order=c(2,0,2)), n.cond=10)

aicc <- function(loglik, p, q, n){
  # loglik is the $loglik value of the model (estimated by arima.mle).
  #   Judging from the output below, where aic = loglik + 2*(p+q),
  #   this value is -2 times the log likelihood.
  # p is the order of the AR-component
  # q is the order of the MA-component
  # n is the length of the time series
  a <- loglik + ((2 * (p + q + 1) * n)/(n - p - q - 2))
  return(a)
}

kilpis.arima22$aic
# [1] 451.4363
# Without subtracting mean: [1] 472.1974
#
length(kilpis)
# [1] 89
aicc(kilpis.arima22$loglik, p=2, q=2, n=89)
# [1] 454.1592
# Without subtracting mean: [1] 474.9203
#
par(mfrow=c(1,2))
arima.diag(kilpis.arima22)

[Figure: two arima.diag panels of ARIMA model diagnostics for ARIMA(2,0,2), each showing the standardized residuals against year, the ACF and PACF of the residuals against lag, and the p-values of the Ljung-Box statistics against lag.]

# Including a covariate
tmp <- read.table("C:\\Kyrre\\data\\nao1864_1999.txt")
# Rows 90 to 134 give the 45 values for 1953 to 1997
nao5397 <- as.matrix(tmp[90:134,])
nao5397.ts <- ts(nao5397, start=1953)
tsp(nao5397.ts)
[1] 1953 1997 1
tsp(kilpis.spring)

# With a covariate (notice that the covariate must not be a time series)
kilpis.arima11 <- arima.mle(kilpis.spring, model = list(order=c(1,0,1)), n.cond=10, xreg=nao5397)
kilpis.arima11
Call: arima.mle(x = kilpis.spring, model = list(order = c(1, 0, 1)), n.cond = 10, xreg = nao5397)
Method: Maximum Likelihood
Model : 1 0 1
Coefficients:
AR : 0.2389
MA : -0.41193
Variance-Covariance Matrix:
        ar(1)   ma(1)
ar(1) 0.07674 0.05801
ma(1) 0.05801 0.06757
Coeffficients for regressor(s): nao5397
[1] 0.08591
Optimizer has converged
Convergence Type: relative function convergence
AIC: 216.96855

# Without a covariate
kilpis.arima11 <- arima.mle(kilpis.spring, model = list(order=c(1,0,1)), n.cond=10)
kilpis.arima11
Call: arima.mle(x = kilpis.spring, model = list(order = c(1, 0, 1)), n.cond = 10)
Method: Maximum Likelihood
Model : 1 0 1
Coefficients:
AR : 0.2477
MA : -0.40534
Variance-Covariance Matrix:
        ar(1)   ma(1)
ar(1) 0.07615 0.05783
ma(1) 0.05783 0.06780
Optimizer has converged
Convergence Type: relative function convergence
AIC: 215.01605
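
As a sketch of how the question “Can you express it with letters?” might be answered for the spring series: the ar() output earlier in the transcript reports order 2, coefficients 0.31903 and -0.45581, and a prediction variance of 17.401. Here x_t denotes the spring trap index in year t. The mean \mu is an assumption: the output does not show it, and the sketch supposes that ar() fits the model to the series after subtracting its mean. The last two lines use the standard result that an AR(2) model with complex characteristic roots has a damped, cyclic autocorrelation function. They are an illustration, not part of the original solution.

```latex
\begin{align*}
x_t - \mu &= \phi_1 (x_{t-1} - \mu) + \phi_2 (x_{t-2} - \mu) + \varepsilon_t,
  \qquad \operatorname{Var}(\varepsilon_t) = \sigma^2 \\
\hat\phi_1 &= 0.31903, \qquad \hat\phi_2 = -0.45581, \qquad \hat\sigma^2 = 17.401 \\
\hat\phi_1^2 + 4\hat\phi_2 &= 0.10178 - 1.82324 = -1.72146 < 0
  \quad\text{(complex roots, hence cyclic behaviour)} \\
\cos\theta &= \frac{\hat\phi_1}{2\sqrt{-\hat\phi_2}} \approx 0.2363,
  \qquad \frac{2\pi}{\theta} \approx \frac{6.2832}{1.3323} \approx 4.7 \text{ years}
\end{align*}
```

Under these assumptions the fitted model implies a quasi-period of a little under five years in the spring trap index.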
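
The aicc() function defined earlier in the transcript can be checked by hand against the printed values. The sketch below assumes that $loglik holds -2 times the log likelihood and that the reported AIC adds a penalty of 2(p+q). Neither is stated in the transcript, but both are consistent with the two printed numbers (451.4363 and 454.1592) for the ARMA(2,2) model with n = 89.

```latex
\begin{align*}
\mathrm{AICC} &= -2\ln L + \frac{2(p+q+1)\,n}{n-p-q-2} \\
\frac{2(2+2+1)\cdot 89}{89-2-2-2} &= \frac{890}{83} \approx 10.7229 \\
-2\ln L &= 454.1592 - 10.7229 = 443.4363 \\
\mathrm{AIC} &= -2\ln L + 2(p+q) = 443.4363 + 8 = 451.4363
\end{align*}
```

The same arithmetic reproduces the values without mean subtraction: 474.9203 - 10.7229 = 464.1974, and 464.1974 + 8 = 472.1974.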