Computer lesson III

In this computer lesson you will learn how to estimate the parameters of an AR-model and
an ARMA-model.
Kilpisjärvi data (spring)
We will use yearly trap index values for the rodent Clethrionomys rufocanus at Kilpisjärvi,
Finland, recorded in the spring. The series starts in 1953.
- Download the data “kilpissp.txt” from the home page of the course:
http://biologi.uio.no/undervisning/bio360/.
- Create a time series and plot the data.
What special features are there in these data? How should you deal with them?
- Fit an AR-model to the data. This can e.g. be done by
kilpis.ar <- ar(kilpis.spring, order.max=9)
order.max sets the maximum order that the routine will try to fit to the data. The default
is to use AIC to choose the best model (aic=T).
- Use the object browser (or summary()) to find out what sort of output is created.
What is the order of the best model? Why? Are there other ways to confirm this?
Is your model satisfactory? Why/why not?
Why not choose a model with four lags?
What is your model? Can you express it with letters?
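
The order selection in this exercise can be tried out without the Kilpisjärvi file. The following is a minimal sketch in R (not S-PLUS), assuming a simulated AR(2) series as a stand-in for the spring data; the names x, fit and best are made up for the illustration. R's ar() reports AIC as differences from the best model, so the chosen order has the value 0.

```r
# Simulated stand-in for the spring series (NOT the Kilpisjarvi data).
set.seed(1)
x <- arima.sim(model = list(ar = c(0.3, -0.45)), n = 200)

# Yule-Walker fit; AIC chooses among orders 0..9 (aic = TRUE is the default).
fit <- ar(x, order.max = 9, aic = TRUE, method = "yule-walker")

fit$order   # order chosen by AIC
fit$ar      # estimated AR coefficients
fit$aic     # AIC differences, one per order 0..9; the best order has 0

# The first element of fit$aic belongs to order 0, hence the "- 1".
best <- unname(which.min(fit$aic)) - 1

# The partial ACF gives an independent check on the order.
pacf(x, lag.max = 9, plot = FALSE)
```

An order whose AIC difference is close to 0 is nearly as well supported as the chosen one, which is why the whole fit$aic vector is worth inspecting and not only fit$order.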
Kilpisjärvi data (all data)
We will now use the whole data set (both spring and fall data). You downloaded this data set
in the first computer lesson (“kilpis.txt” from
http://www.uio.no/studier/emner/matnat/biologi/BIO4040/h03/pensumliste.xml).
Can you fit an AR-model to these data? Why not?
- Prepare the data for analysis (hint: regular features, cf. the second computer lesson).
- Use arima.mle to fit ARMA-models to the (preprocessed) data, and use AIC to choose an
appropriate model. You should not fit models randomly.
You can fit an ARMA(2,2) model by the following command:
kilpis.arima22 <- arima.mle(kilpis.stl$rem - mean(kilpis.stl$rem),
model = list(order=c(2,0,2)), n.cond=10)
model: Here you specify the ARIMA-model in the form ARIMA(p,d,q). We will restrict
ourselves to ARMA-models, hence d=0.
n.cond: The maximum likelihood must be conditioned on some values (ideally, initial
values should be specified, as some other programs do). In order to compare different
models, the likelihood must be based on the same number of values. You should thus choose
a value larger than the order of the most complicated model (n.cond >> p+q).
NB: arima.mle expects a series with expectation of 0. Thus you must remove the mean before
the analysis.
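
A systematic search over small orders can be sketched as follows. This is an illustration in R, not S-PLUS: arima.mle does not exist in R, so the sketch assumes R's arima() as a stand-in, and the series is simulated. With method = "ML", arima() uses the full likelihood, so n.cond plays no role here. The mean is removed by hand and include.mean = FALSE is set, to mirror the zero-mean requirement above.

```r
# Simulated stand-in series with a non-zero mean (NOT the Kilpisjarvi data).
set.seed(2)
x <- arima.sim(model = list(ar = 0.6, ma = 0.4), n = 200) + 5

# Remove the mean before fitting, as the handout requires for arima.mle.
x0 <- x - mean(x)

# Fit every ARMA(p, q) with p, q in 0..2 and record the AIC.
orders <- expand.grid(p = 0:2, q = 0:2)
orders$aic <- NA_real_
for (i in seq_len(nrow(orders))) {
  fit <- tryCatch(
    arima(x0, order = c(orders$p[i], 0, orders$q[i]),
          include.mean = FALSE, method = "ML"),
    error = function(e) NULL   # skip orders where the optimiser fails
  )
  if (!is.null(fit)) orders$aic[i] <- fit$aic
}

# Models sorted from lowest (best) to highest AIC.
orders[order(orders$aic), ]
best <- orders[which.min(orders$aic), ]
```

Stepping through a small grid like this is one way to avoid fitting models at random; models within about two AIC units of the best are usually treated as close competitors.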
Can you program a function that calculates the AICC? (You can find it on the homepage of
the course.)
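
As a hedged sketch of such a function in R: the version below takes the ordinary log-likelihood (as returned by logLik() in R) and multiplies it by -2 itself. The function name aicc_from_loglik is made up for this example. The check at the end uses the ARMA(2,2) figures from this handout, assuming that the reported AIC of 451.4363 equals -2 log L + 2(p+q), so that -2 log L = 443.4363.

```r
# AICC = -2 log L + 2 (p + q + 1) n / (n - p - q - 2)
# loglik: ordinary log-likelihood (NOT multiplied by -2)
# p, q  : orders of the AR and MA components
# n     : length of the time series
aicc_from_loglik <- function(loglik, p, q, n) {
  k <- p + q + 1                       # ARMA coefficients plus the variance
  -2 * loglik + 2 * k * n / (n - p - q - 2)
}

# Check against the handout's ARMA(2,2) fit with n = 89:
# -2 log L = 451.4363 - 2 * (2 + 2) = 443.4363 (assumption stated above).
val <- aicc_from_loglik(-443.4363 / 2, p = 2, q = 2, n = 89)
val   # about 454.159
```

The correction term grows as p+q approaches n, so AICC penalises large models on short series more heavily than AIC does.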
- Use arima.diag( ) to evaluate your model(s).
How well is the model fitted to the data?
What is your model? Can you express it with letters?
What is your biological interpretation of the model?
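
The kind of checks that arima.diag( ) produces can be approximated in R, where that function is not available. The sketch below is an assumption-laden stand-in: it fits an ARMA(1,1) to simulated data with arima() and then looks at the standardized residuals, their ACF and a Ljung-Box test. The lag of 10 is an arbitrary choice for the illustration.

```r
# Simulated stand-in series (NOT the Kilpisjarvi data).
set.seed(3)
x <- arima.sim(model = list(ar = 0.6, ma = 0.4), n = 200)

fit <- arima(x, order = c(1, 0, 1), include.mean = FALSE, method = "ML")

# Standardized residuals: should look like white noise with unit variance.
res <- residuals(fit)
std_res <- res / sqrt(fit$sigma2)

# Residual autocorrelations (set plot = TRUE to draw them).
res_acf <- acf(res, lag.max = 10, plot = FALSE)

# Ljung-Box test on the first 10 lags; fitdf = p + q = 2 adjusts the
# degrees of freedom for the estimated ARMA coefficients.
lb <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 2)
lb$p.value
```

A small p-value indicates autocorrelation left in the residuals, i.e. a model that does not describe the dependence in the data well. In an interactive R session, tsdiag(fit) draws a comparable set of panels.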
__________________________________________
D:\687301211.doc, 05.02.16, page 1 of 9
tmp <- read.table("e:\\Kyrre\\Studier\\drgrad\\kurs\\Timeseries\\kilpis.txt")
# tmp <- read.table("c:\\Kyrre\\Studier\\drgrad\\kurs\\Timeseries\\kilpis.txt")
kilpis <- ts(tmp[,1], start=1953, frequency=2)
length(kilpis)
[1] 89
# Spring values are the odd-numbered observations (45 values)
kilpis.spring <- ts(kilpis[seq(1, 89, by=2)], start=1953)
# Fall values are the even-numbered observations (44 values)
kilpis.fall <- ts(kilpis[seq(2, 88, by=2)], start=1953)
par(mfrow=c(3,3))
ts.plot(kilpis, main="Clethrionomys rufocanus at Kilpisjarvi", ylab="Trap index",
xlab="Year")
ts.points(kilpis)
ts.plot(kilpis.spring, main="C. rufocanus in spring", xlab="Year")
ts.points(kilpis.spring)
ts.plot(kilpis.fall, main="C. rufocanus in fall", xlab="Year")
ts.points(kilpis.fall)
acf(kilpis, type="correlation", plot=T)
acf(kilpis.spring, type="correlation", plot=T)
acf(kilpis.fall, type="correlation", plot=T)
acf(kilpis, type="partial", plot=T)
acf(kilpis.spring, type="partial", plot=T)
acf(kilpis.fall, type="partial", plot=T)
[Figure: 3 x 3 panel plot. Top row: trap index against year for the full series, the spring
series and the fall series. Middle row: ACF against lag for each of the three series. Bottom
row: partial ACF against lag for each of the three series.]
kilpis.ar <- ar(kilpis.spring, order.max=9)
kilpis.ar
$order:
[1] 2
$ar:
, , 1
[,1]
[1,] 0.31903
[2,] -0.45581
$var.pred:
[,1]
[1,] 17.401
$aic:
 [1]  8.69482  8.48007  0.00000  1.99762  0.32233  1.96094  3.24463  4.84671  6.83923
[10]  8.49921 10.12308 12.11896 12.49896
$n.used:
[1] 45
$order.max:
[1] 12
$partialacf:
, , 1
[,1]
[1,] 0.2191459
[2,] -0.4558068
[3,] 0.0072625
[4,] 0.2800484
[5,] 0.0894347
[6,] -0.1256669
$resid:
     Series 1
1953       NA
1954       NA
1955  8.48208
1956 -4.75243
1957  1.41131
1958  1.12721
:
1995 -3.23340
1996 -3.26534
1997 -3.37557
start deltat frequency
 1953      1         1
$method:
[1] "yule-walker"
$series:
[1] "kilpis.spring"
par(mfrow=c(2,2))
acf.plot(kilpis.ar, conf=T, main="Partial ACF for data")
ts.plot(ts(kilpis.ar$aic, start=0), main="AIC values. NB: Startpoint")
ts.plot(kilpis.ar$resid, main="Residuals of AR(2) model")
acf.plot(tmp <- acf(kilpis.ar$resid, plot=F), main="ACF of residuals")
[Figure: 2 x 2 panel plot. Partial ACF for the data (against lag), AIC values (against order,
starting at 0), residuals of the AR(2) model (against time), and ACF of the residuals
(against lag).]
kilpis.fit <- ar(kilpis.spring, aic=F, order=3)
# -> This gives the identical results.
par(mfrow=c(2,2))
ts.plot(kilpis.fit$resid, main="Residuals of AR(3) model")
ts.points(kilpis.fit$resid, col=8)
acf.plot(tmp <- acf(kilpis.fit$resid, plot=F), main="ACF of residuals")
acf.plot(tmp <- acf(kilpis.fit$resid, plot=F), add=T)
[Figure: panels titled "Residuals of AR(2) model" (against time), "ACF of residuals" and
"Series : kilpis.fit$resid" (ACF against lag).]
tsp(kilpis)
[1] 1953 1997    2
kilpis.stl <- stl(kilpis, "periodic")
# plot.stl(kilpis.stl)
par(mfrow=c(1,1))
ts.plot(kilpis, main="Kilpisjärvi data", xlab="Year", ylab="Trap index",
ylim=c(-2, 30))
ts.points(kilpis, pch=28, col=8)
ts.lines(kilpis.stl$seas, col=4)
ts.lines(kilpis.stl$rem, lty=1, col=3)
legend(locator(1), legend=c("Data", "Seasonal effects", "remainder"),
lty=1, col=c(1,4,3))
[Figure: "Kilpisjärvi data", trap index against year, with the seasonal effects and the
remainder overlaid. Legend: Data, Seasonal effects, remainder.]
par(mfrow=c(2,2))
ts.plot(kilpis, main="Kilpisjärvi data", xlab="Year", ylab="Trap index")
ts.points(kilpis, pch=28, col=8)
ts.plot(kilpis.stl$seas, main="Seasonal components of Kilpisjärvi data",
xlab="Year", ylab="Trap index", ylim=c(-3, 3))
ts.lines(kilpis.stl$seas, col=4)
ts.plot(kilpis.stl$rem, main="Remaining series (Kilpisjärvi)", xlab="Year",
ylab="Trap index")
ts.lines(kilpis.stl$rem, lty=1, col=3)
acf(kilpis.stl$rem)
[Figure: 2 x 2 panel plot. Kilpisjärvi data, seasonal components of the Kilpisjärvi data and
the remaining series (all against year), and the ACF of the remaining series.]
kilpis.arima22 <- arima.mle(kilpis.stl$rem - mean(kilpis.stl$rem), model =
list(order=c(2,0,2)), n.cond=10)
aicc <- function(loglik, p, q, n){
# loglik is the $loglik component of the fitted model (estimated by
# arima.mle); the formula below treats it as -2 times the log likelihood
# p is the order of the AR-component
# q is the order of the MA-component
# n is the length of the time series
a <- loglik + ((2 * (p + q + 1) * n)/(n - p - q - 2))
return(a)
}
kilpis.arima22$aic
# [1] 451.4363
# Without subtracting mean: [1] 472.1974
# length(kilpis)
# [1] 89
aicc(kilpis.arima22$loglik, p=2, q=2, n=89)
# [1] 454.1592
# Without subtracting mean: [1] 474.9203
# par(mfrow=c(1,2))
arima.diag(kilpis.arima22)
[Figure: ARIMA model diagnostics for the ARIMA(2,0,2) model. Panels: plot of standardized
residuals (against year), ACF plot of residuals, PACF plot of residuals, and p-values of the
Ljung-Box statistics (against lag).]
# Including a covariate
tmp <- read.table("C:\\Kyrre\\data\\nao1864_1999.txt")
nao5397 <- as.matrix(tmp[90:134,])
nao5397.ts <- ts(nao5397, start=1953)
tsp(nao5397.ts)
[1] 1953 1997    1
tsp(kilpis.spring)
# With a covariate (notice that the covariate must not be a time series)
kilpis.arima11 <- arima.mle(kilpis.spring, model = list(order=c(1,0,1)),
n.cond=10, xreg=nao5397)
kilpis.arima11
Call: arima.mle(x = kilpis.spring, model = list(order = c(1, 0, 1)), n.cond = 10,
xreg = nao5397)
Method: Maximum Likelihood
Model : 1 0 1
Coefficients:
AR : 0.2389
MA : -0.41193
Variance-Covariance Matrix:
        ar(1)   ma(1)
ar(1) 0.07674 0.05801
ma(1) 0.05801 0.06757
Coeffficients for regressor(s): nao5397
[1] 0.08591
Optimizer has converged
Convergence Type: relative function convergence
AIC: 216.96855
# Without a covariate
kilpis.arima11 <- arima.mle(kilpis.spring, model = list(order=c(1,0,1)),
n.cond=10)
kilpis.arima11
Call: arima.mle(x = kilpis.spring, model = list(order = c(1, 0, 1)), n.cond = 10)
Method: Maximum Likelihood
Model : 1 0 1
Coefficients:
AR : 0.2477
MA : -0.40534
Variance-Covariance Matrix:
        ar(1)   ma(1)
ar(1) 0.07615 0.05783
ma(1) 0.05783 0.06780
Optimizer has converged
Convergence Type: relative function convergence
AIC: 215.01605
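
The comparison above (AIC 216.97 with the covariate against 215.02 without it, so the covariate does not improve the model) can be reproduced in outline in R. This is only a sketch under stated assumptions: R's arima() replaces arima.mle, the response y and the covariate z are simulated and merely play the roles of kilpis.spring and nao5397, and the effect size 0.5 is invented.

```r
# Simulated stand-ins (NOT the Kilpisjarvi or NAO data).
set.seed(4)
n <- 45
z <- rnorm(n)                                   # hypothetical covariate
noise <- as.numeric(arima.sim(model = list(ar = 0.5, ma = 0.3), n = n))
y <- ts(0.5 * z + noise, start = 1953)          # hypothetical response

# ARMA(1,1) with and without the covariate, both by maximum likelihood.
fit_with    <- arima(y, order = c(1, 0, 1), xreg = z, method = "ML")
fit_without <- arima(y, order = c(1, 0, 1), method = "ML")

coef(fit_with)                                  # ar1, ma1, intercept, z
c(with = fit_with$aic, without = fit_without$aic)
```

The covariate is worth keeping only if it lowers the AIC; its coefficient costs one extra parameter.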