Chapter 3: Box-Jenkins Seasonal Modelling
3.1 Stationarity Transformation

• “Pre-differencing transformation” is often used to stabilize the seasonal variation of the time series. A common transformation is of the form:
$y_t^* = \ln(y_t)$
• “Differencing transformation”:
1) $Z_t = y_t^* - y_{t-1}^*$
(first non-seasonal difference)
2) $Z_t = y_t^* - y_{t-L}^*$
(first seasonal difference, where L is the number of seasons in a year)
3) $Z_t = y_t^* - y_{t-1}^* - y_{t-L}^* + y_{t-L-1}^*$
(first seasonal and first non-seasonal difference)
Of course, one can also obtain second and higher-order differences by applying the same rule again.
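The three differencing rules above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names (`log_transform`, `difference`, `seasonal_and_regular_difference`) are hypothetical and not part of the original notes.

```python
import math


def log_transform(y):
    """Pre-differencing transformation y*_t = ln(y_t)."""
    return [math.log(v) for v in y]


def difference(series, lag=1):
    """Return Z_t = y_t - y_{t-lag}; the first `lag` observations are lost."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def seasonal_and_regular_difference(series, season_length):
    """Apply one seasonal difference, then one non-seasonal difference.

    The result equals y_t - y_{t-1} - y_{t-L} + y_{t-L-1}.
    """
    return difference(difference(series, season_length), 1)
```

For a monthly series of 132 observations (L = 12), `difference(y, 1)` returns 131 values, `difference(y, 12)` returns 120, and `seasonal_and_regular_difference(y, 12)` returns 119. A second-order difference is obtained by calling `difference` twice.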
3.2 Autocorrelation and Partial Autocorrelation

To determine if the data are stationary, we examine the
behaviour of the autocorrelation and partial autocorrelation
of the series at both the seasonal and non-seasonal level.

The behaviour of the SAC and SPAC functions at lags 1 to
L-3 is often considered as the behaviour of these functions
at the non-seasonal level.

A spike (significant memory) is said to exist at a given lag if the
absolute value of the corresponding SAC or SPAC is greater than twice
its standard error.

The time series is considered to be stationary if the SAC of
the series cuts off or dies down reasonably quickly at both
the seasonal & non-seasonal levels.
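The spike rule can be sketched as follows. The standard errors here use Bartlett's large-sample approximation, $\sqrt{(1 + 2\sum_{j<k} r_j^2)/n}$. It is an assumption, not something stated in these notes, that this matches the two-standard-error limits marked in the SAS listings. The function names are hypothetical.

```python
import math


def sample_acf(z, max_lag):
    """Sample autocorrelations r_0..r_max_lag, using divisor n at every lag."""
    n = len(z)
    mean = sum(z) / n
    dev = [v - mean for v in z]
    c0 = sum(d * d for d in dev) / n
    acf = []
    for k in range(max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


def bartlett_se(acf, n):
    """Approximate standard error of r_k for k >= 1 (entry 0 is unused)."""
    se = [0.0]
    cumulative = 0.0  # running sum of r_j^2 for j < k
    for k in range(1, len(acf)):
        se.append(math.sqrt((1.0 + 2.0 * cumulative) / n))
        cumulative += acf[k] ** 2
    return se


def spikes(acf, se):
    """Lags whose autocorrelation exceeds twice its standard error."""
    return [k for k in range(1, len(acf)) if abs(acf[k]) > 2.0 * se[k]]
```

At lag 1 the bound reduces to $2/\sqrt{n}$. With n = 165, for example, it is about 0.156, so a lag-1 autocorrelation of 0.18377 counts as a spike under this approximation.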
Example 3.1
• Figure 3.1 shows the monthly passenger totals ($y_t$), in thousands of passengers, from 1949 to 1959. The plot reveals a pattern of increasing seasonal variation.
• Figure 3.2 shows $y_t^* = \ln(y_t)$, which seems to have equalized the seasonal variation.
Figure 3.1
Monthly total international airline passengers (in thousands), 1949-1959
[Time plot; vertical axis: no. of passengers, 0 to 600; horizontal axis: Dec-48 to Dec-59.]
Figure 3.2
Natural logarithms of monthly total international airline passengers, 1949-1959
[Time plot; vertical axis labelled "no. of passengers", 4.5 to 6.5; horizontal axis: Sep-48 to Sep-59.]
• The following SAS output shows the SACs of $y_t^*$, of its first difference at the non-seasonal level, at the seasonal level, and at both the non-seasonal and seasonal levels.
• On the basis of the SACs, it appears that first differencing at the seasonal level, or at both the seasonal and non-seasonal levels, is necessary to ensure the stationarity of the data.
ARIMA Procedure
Name of variable = LY.
Mean of working series = 5.486478
Standard deviation     = 0.414728
Number of observations = 132

Autocorrelations
Lag   Covariance  Correlation
  0     0.171999      1.00000
  1     0.163124      0.94840
  2     0.152803      0.88839
  3     0.143954      0.83694
  4     0.136137      0.79150
  5     0.130741      0.76013
  6     0.126696      0.73661
  7     0.123230      0.71646
  8     0.121237      0.70487
  9     0.122719      0.71349
 10     0.124451      0.72355
 11     0.127306      0.74015
 12     0.128377      0.74638
 13     0.120171      0.69867
 14     0.110539      0.64267
 15     0.102490      0.59587
 16     0.094860      0.55151
 17     0.089022      0.51757
 18     0.084737      0.49266
 19     0.081216      0.47219
 20     0.079499      0.46220
 21     0.080921      0.47047
 22     0.082292      0.47845
 23     0.084129      0.48913
 24     0.084738      0.49267
ARIMA Procedure
Name of variable = LY.
Period(s) of Differencing = 1.
Mean of working series = 0.009812
Standard deviation     = 0.106038
Number of observations = 131
NOTE: The first observation was eliminated by differencing.

Autocorrelations
Lag   Covariance  Correlation
  0     0.011244      1.00000
  1    0.0021211      0.18864
  2   -0.0014190     -0.12620
  3   -0.0017381     -0.15458
  4   -0.0036763     -0.32696
  5    -0.000749     -0.06661
  6   0.00045338      0.04032
  7   -0.0011063     -0.09839
  8   -0.0038510     -0.34250
  9   -0.0012310     -0.10948
 10   -0.0013408     -0.11925
 11    0.0022435      0.19953
 12    0.0093677      0.83312
 13    0.0022267      0.19803
 14   -0.0015966     -0.14200
 15   -0.0012365     -0.10996
 16   -0.0032543     -0.28942
 17   -0.0005262     -0.04680
 18   0.00039747      0.03535
 19   -0.0011731     -0.10433
 20   -0.0035000     -0.31128
 21   -0.0012046     -0.10713
 22    -0.000954     -0.08485
 23    0.0020942      0.18625
 24    0.0080211      0.71337
ARIMA Procedure
Name of variable = LY.
Period(s) of Differencing = 12.
Mean of working series = 0.121282
Standard deviation     = 0.063215
Number of observations = 120
NOTE: The first 12 observations were eliminated by differencing.

Autocorrelations
Lag   Covariance  Correlation
  0    0.0039962      1.00000
  1    0.0029424      0.73631
  2    0.0025646      0.64176
  3    0.0019980      0.49997
  4    0.0018314      0.45830
  5    0.0015802      0.39543
  6    0.0013155      0.32920
  7    0.0010092      0.25255
  8   0.00079972      0.20012
  9   0.00058932      0.14747
 10   0.00003062      0.00766
 11   -0.0004257     -0.10653
 12   -0.0009502     -0.23779
 13   -0.0005842     -0.14618
 14   -0.0005817     -0.14556
 15   -0.0004511     -0.11287
 16   -0.0006197     -0.15507
 17   -0.0004318     -0.10805
 18   -0.0005272     -0.13193
 19   -0.0005622     -0.14069
 20   -0.0006994     -0.17501
 21   -0.0005544     -0.13872
 22    -0.000448     -0.11211
 23   -0.0001579     -0.03950
 24   -0.0003788     -0.09480
ARIMA Procedure
Name of variable = LY.
Period(s) of Differencing = 1,12.
Mean of working series = 0.001322
Standard deviation     = 0.044889
Number of observations = 119
NOTE: The first 13 observations were eliminated by differencing.

Autocorrelations
Lag   Covariance  Correlation
  0    0.0020150      1.00000
  1   -0.0006363     -0.31578
  2   0.00021167      0.10505
  3   -0.0004326     -0.21469
  4   0.00009253      0.04592
  5   0.00006174      0.03064
  6   0.00008917      0.04425
  7   -0.0001315     -0.06527
  8    0.0000188      0.00933
  9   0.00032781      0.16268
 10   -0.0000966     -0.04794
 11   0.00014118      0.07007
 12   -0.0008188     -0.40633
 13   0.00031546      0.15655
 14   -0.0000898     -0.04457
 15   0.00028601      0.14194
 16    -0.000294     -0.14590
 17   0.00018672      0.09266
 18   -0.0000634     -0.03145
 19   0.00010845      0.05382
 20   -0.0002755     -0.13673
 21   0.00006769      0.03359
 22   -0.0001636     -0.08119
 23   0.00044341      0.22005
 24   -0.0000687     -0.03409
Example 3.2
• Figure 3.3 shows the monthly number of people ($X_t$) in Wisconsin employed in trade from 1961 to 1975. No pre-differencing transformation appears to be necessary.
Figure 3.3
Number of employees (in thousands), 1961-1975
[Time plot; vertical axis: no. of employees, 220 to 400; horizontal axis: Mar-60 to Aug-76.]
• Next, let us examine the SACs of $X_t$, of its first difference at the non-seasonal level, at the seasonal level, and at both the seasonal and non-seasonal levels.
ARIMA Procedure
Name of variable = X.
Mean of working series = 307.5584
Standard deviation     = 46.62852
Number of observations = 178

Autocorrelations
Lag   Covariance  Correlation
  0     2174.219      1.00000
  1     2111.301      0.97106
  2     2046.143      0.94109
  3     1990.467      0.91549
  4     1953.651      0.89855
  5     1923.082      0.88449
  6     1894.387      0.87130
  7     1857.165      0.85418
  8     1822.990      0.83846
  9     1795.368      0.82575
 10     1781.604      0.81942
 11     1766.588      0.81252
 12     1754.960      0.80717
 13     1689.253      0.77695
 14     1622.604      0.74629
 15     1565.605      0.72008
 16     1526.444      0.70207
 17     1493.548      0.68694
 18     1462.579      0.67269
 19     1424.437      0.65515
 20     1390.875      0.63971
 21     1363.633      0.62718
 22     1347.737      0.61987
 23     1328.662      0.61110
 24     1312.463      0.60365
ARIMA Procedure
Name of variable = X.
Period(s) of Differencing = 1.
Mean of working series = 0.902825
Standard deviation     = 7.210001
Number of observations = 177
NOTE: The first observation was eliminated by differencing.

Autocorrelations
Lag   Covariance  Correlation
  0    51.984116      1.00000
  1     1.341360      0.02580
  2   -10.104648     -0.19438
  3   -16.397040     -0.31542
  4    -6.537721     -0.12576
  5     0.720104      0.01385
  6    11.646511      0.22404
  7     0.382655      0.00736
  8    -5.583873     -0.10741
  9   -15.804044     -0.30402
 10    -9.291756     -0.17874
 11     2.139864      0.04116
 12    46.868231      0.90159
 13     0.801322      0.01541
 14    -9.690318     -0.18641
 15   -15.285807     -0.29405
 16    -6.236594     -0.11997
 17     0.881801      0.01696
 18    10.680823      0.20546
 19     0.496121      0.00954
 20    -4.968756     -0.09558
 21   -14.320935     -0.27549
 22    -8.286359     -0.15940
 23     1.685671      0.03243
 24    42.361435      0.81489
ARIMA Procedure
Name of variable = X.
Period(s) of Differencing = 12.
Mean of working series = 10.3759
Standard deviation     = 5.005722
Number of observations = 166
NOTE: The first 12 observations were eliminated by differencing.

Autocorrelations
Lag   Covariance  Correlation
  0    25.057251      1.00000
  1    23.551046      0.93989
  2    21.750363      0.86803
  3    19.984942      0.79757
  4    18.383410      0.73366
  5    17.031926      0.67972
  6    15.647808      0.62448
  7    14.141135      0.56435
  8    12.707374      0.50713
  9    11.123315      0.44392
 10     9.421701      0.37601
 11     7.755107      0.30950
 12     6.024674      0.24044
 13     5.018099      0.20027
 14     4.119250      0.16439
 15     3.165849      0.12634
 16     2.245328      0.08961
 17     1.057665      0.04221
 18    -0.103884     -0.00415
 19    -0.936067     -0.03736
 20    -1.623877     -0.06481
 21    -2.257332     -0.09009
 22    -2.941722     -0.11740
 23    -3.670260     -0.14647
 24    -4.472118     -0.17848
ARIMA Procedure
Name of variable = X.
Period(s) of Differencing = 1,12.
Mean of working series = 0.087273
Standard deviation     = 1.438735
Number of observations = 165
NOTE: The first 13 observations were eliminated by differencing.

Autocorrelations
Lag   Covariance  Correlation
  0     2.069959      1.00000
  1     0.380397      0.18377
  2    -0.056837     -0.02746
  3    -0.021478     -0.01038
  4    -0.290834     -0.14050
  5   -0.0045074     -0.00218
  6     0.200142      0.09669
  7     0.041474      0.02004
  8     0.187094      0.09039
  9     0.197702      0.09551
 10    0.0004563      0.00022
 11    -0.144889     -0.07000
 12    -0.572732     -0.27669
 13    -0.200208     -0.09672
 14     0.056730      0.02741
 15    0.0061858      0.00299
 16     0.287759      0.13902
 17     0.049923      0.02412
 18    -0.209991     -0.10145
 19    -0.198252     -0.09578
 20    -0.113819     -0.05499
 21    -0.039443     -0.01906
 22    -0.039793     -0.01922
 23     0.106062      0.05124
 24    -0.165247     -0.07983
Notations
Now, suppose that $y_t^*$ is a pre-differencing transformed series. The general stationarity transformation is:
$Z_t = \nabla_L^D \nabla^d y_t^* = (1 - B^L)^D (1 - B)^d y_t^*$
where B is the lag (backward shift) operator,
D is the degree of seasonal differencing and
d is the degree of non-seasonal differencing.
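As a worked check of this notation, setting d = D = 1 and using $B^k y_t^* = y_{t-k}^*$ recovers the combined first seasonal and first non-seasonal difference of Section 3.1:

```latex
\begin{aligned}
Z_t &= (1 - B^{L})(1 - B)\, y_t^* \\
    &= (1 - B - B^{L} + B^{L+1})\, y_t^* \\
    &= y_t^* - y_{t-1}^* - y_{t-L}^* + y_{t-L-1}^* .
\end{aligned}
```

For monthly data (L = 12) the last term is $y_{t-13}^*$, which is consistent with the SAS note that the first 13 observations are eliminated when differencing at periods 1 and 12.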
3.3 Estimation and Diagnostic Checking
The general seasonal Box-Jenkins model can be written in the form
$\phi_p(B)\,\Phi_P(B^L)\,Z_t = \delta + \theta_q(B)\,\Theta_Q(B^L)\,\varepsilon_t$
where
$\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p$
is the non-seasonal autoregressive operator of order p,
$\Phi_P(B^L) = 1 - \Phi_{1,L} B^L - \Phi_{2,L} B^{2L} - \dots - \Phi_{P,L} B^{PL}$
is the seasonal autoregressive operator of order P,
$\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \dots - \theta_q B^q$
is the non-seasonal moving average operator of order q,
$\Theta_Q(B^L) = 1 - \Theta_{1,L} B^L - \Theta_{2,L} B^{2L} - \dots - \Theta_{Q,L} B^{QL}$
is the seasonal moving average operator of order Q, and
$\delta = \mu\,\phi_p(B)\,\Phi_P(B^L)$, where $\mu$ is the mean of $Z_t$.
The ARIMA notation is usually written as ARIMA(p, d, q)(P, D, Q)_L.
Identification of the orders p, q, P and Q is basically the same as in non-seasonal Box-Jenkins models. The following table provides some guidelines
for choosing non-seasonal and seasonal operators.
• Estimation is usually carried out
using maximum likelihood, as in the
case of non-seasonal Box-Jenkins
analysis.
• As an example, consider the SPAC of the time series of Example 3.2, after first differencing at both the seasonal and non-seasonal levels.
Partial Autocorrelations
Lag  Correlation
  1      0.18377
  2     -0.06337
  3      0.00689
  4     -0.14706
  5      0.05599
  6      0.07694
  7     -0.00961
  8      0.08102
  9      0.07084
 10      0.00110
 11     -0.07361
 12     -0.25948
 13      0.01555
 14      0.00615
 15     -0.03042
 16      0.09184
 17     -0.01900
 18     -0.04354
 19     -0.08056
 20      0.03107
 21      0.03687
 22     -0.06723
 23      0.03476
 24     -0.18999
• At the non-seasonal level, both the SAC and SPAC appear to have a significant spike at lag 1 and to cut off after lag 1.
• One can tentatively identify an AR(1), MA(1) or ARMA(1, 1) model for the non-seasonal part of the series.
• At the seasonal level, the SPAC appears to be dying down, while the SAC cuts off after lag 12. Hence a seasonal MA(1) model is identified.
• Combining both the seasonal and non-seasonal levels, we have the following tentative models:
ARIMA(1, 1, 0)(0, 1, 1)_12,
ARIMA(0, 1, 1)(0, 1, 1)_12,
ARIMA(1, 1, 1)(0, 1, 1)_12
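To make the notation concrete, the second tentative model can be written out in full. In this sketch $\varepsilon_t$ denotes the random shock (an assumed symbol), $Z_t = (1 - B)(1 - B^{12}) X_t$ is the differenced series, and the moving average operators follow the sign convention $(1 - \theta B)$:

```latex
\begin{aligned}
Z_t &= \delta + (1 - \theta_1 B)(1 - \Theta_{1,12} B^{12})\,\varepsilon_t \\
    &= \delta + \varepsilon_t - \theta_1 \varepsilon_{t-1}
       - \Theta_{1,12}\,\varepsilon_{t-12}
       + \theta_1 \Theta_{1,12}\,\varepsilon_{t-13} .
\end{aligned}
```

The multiplicative form therefore implies a shock term at lag 13 whose coefficient is the product of the non-seasonal and seasonal moving average parameters, so no extra parameter is estimated for it.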
• The SAS program for estimating these models is as follows:
data employ;
input @7 x;
cards;
etc.
239.6
236.4
236.8
241.5
;
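proc arima data=employ;
  /* difference x at lag 1 and lag 12 before identification */
  identify var=x(1,12);
  /* ARIMA(1,1,0)(0,1,1)12 */
  estimate p=1 q=(12) printall plot method=ml;
  /* ARIMA(0,1,1)(0,1,1)12 */
  estimate q=(1)(12) printall plot method=ml;
  /* ARIMA(1,1,1)(0,1,1)12 */
  estimate p=1 q=(1)(12) printall plot method=ml;
run;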
Estimation results of an ARIMA(1,1,0)(0,1,1)12 model

Maximum Likelihood Estimation
Parameter   Estimate   Approx. Std Error   T Ratio   Lag
MU           0.07694             0.07647      1.01     0
MA1,1        0.41307             0.07493      5.51    12
AR1,1        0.16005             0.07702      2.08     1

Constant Estimate   = 0.0646288
Variance Estimate   = 1.80012263
Std Error Estimate  = 1.34168649
AIC                 = 570.489073
SBC                 = 579.80691
Number of Residuals = 165

Autocorrelation Check of Residuals
To Lag   Chi Square   DF    Prob   Autocorrelations
     6         4.42    4   0.352    0.011 -0.048  0.039 -0.103 -0.001  0.106
    12         7.61   10   0.667   -0.059  0.024  0.100 -0.020  0.013  0.059
    18        11.92   16   0.750   -0.085  0.050 -0.035  0.082  0.041 -0.063
    24        19.71   22   0.601   -0.146 -0.046  0.027 -0.020  0.103 -0.075
    30        24.38   28   0.662   -0.090  0.061 -0.064 -0.044  0.068 -0.030
Estimation results of an ARIMA(0,1,1)(0,1,1)12 model

Maximum Likelihood Estimation
Parameter   Estimate   Approx. Std Error   T Ratio   Lag
MU           0.07723             0.07570      1.02     0
MA1,1       -0.17261             0.07695     -2.24     1
MA2,1        0.40941             0.07502      5.46    12

Constant Estimate   = 0.07723315
Variance Estimate   = 1.79723527
Std Error Estimate  = 1.34061004
AIC                 = 570.185014
SBC                 = 579.50285
Number of Residuals = 165

Autocorrelation Check of Residuals
To Lag   Chi Square   DF    Prob   Autocorrelations
     6         4.04    4   0.400    0.000 -0.025  0.040 -0.101 -0.002  0.105
    12         7.15   10   0.711   -0.058  0.026  0.098 -0.020  0.013  0.058
    18        11.57   16   0.773   -0.087  0.055 -0.039  0.083  0.036 -0.061
    24        19.20   22   0.633   -0.142 -0.046  0.027 -0.026  0.102 -0.076
    30        23.68   28   0.698   -0.089  0.059 -0.065 -0.043  0.063 -0.031
Estimation results of an ARIMA(1,1,1)(0,1,1)12 model

Maximum Likelihood Estimation
Parameter   Estimate   Approx. Std Error   T Ratio   Lag
MU           0.07778             0.07374      1.05     0
MA1,1       -0.45907             0.37080     -1.24     1
MA2,1        0.40276             0.07533      5.35    12
AR1,1       -0.29315             0.39756     -0.74     1

Constant Estimate   = 0.10058242
Variance Estimate   = 1.80600827
Std Error Estimate  = 1.34387807
AIC                 = 571.896463
SBC                 = 584.320245
Number of Residuals = 165

Autocorrelation Check of Residuals
To Lag   Chi Square   DF    Prob   Autocorrelations
     6         3.19    3   0.364    0.010  0.019  0.015 -0.087 -0.005  0.101
    12         5.88    9   0.752   -0.050  0.031  0.093 -0.011  0.009  0.054
    18        10.55   15   0.784   -0.089  0.059 -0.042  0.089  0.027 -0.060
    24        18.00   21   0.649   -0.140 -0.053  0.025 -0.028  0.097 -0.076
    30        21.80   27   0.747   -0.085  0.048 -0.064 -0.039  0.052 -0.033
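Two columns of the estimation output can be checked by hand. The sketch below assumes the constant is the mean term multiplied by the autoregressive operators evaluated at B = 1, and that the T ratio is the estimate divided by its approximate standard error. Both function names are hypothetical, and small discrepancies are expected because the printed estimates are rounded.

```python
def constant_term(mu, ar_coeffs=(), seasonal_ar_coeffs=()):
    """delta = mu * (1 - sum(phi_i)) * (1 - sum(Phi_i))."""
    return mu * (1.0 - sum(ar_coeffs)) * (1.0 - sum(seasonal_ar_coeffs))


def t_ratio(estimate, std_error):
    """Ratio of a parameter estimate to its approximate standard error."""
    return estimate / std_error


# ARIMA(1,1,0)(0,1,1)12: MU = 0.07694, AR1,1 = 0.16005
delta_model_1 = constant_term(0.07694, ar_coeffs=[0.16005])

# ARIMA(1,1,1)(0,1,1)12: MU = 0.07778, AR1,1 = -0.29315
delta_model_3 = constant_term(0.07778, ar_coeffs=[-0.29315])

# T ratio of the non-seasonal AR term in the ARIMA(1,1,1)(0,1,1)12 fit
ar_t_model_3 = t_ratio(-0.29315, 0.39756)
```

The two constants agree with the reported values 0.0646288 and 0.10058242 to about five decimal places. In the model without an autoregressive term the constant equals MU itself.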
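Diagnostic checking is conducted using the Ljung-Box-Pierce statistic
$Q^* = n(n+2) \sum_{\ell=1}^{k} (n - \ell)^{-1}\, r_\ell^2(e)$
where $r_\ell(e)$ is the lag-$\ell$ sample autocorrelation of the residuals, k is the number of lags included, and n is the number of observations available after differencing.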
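The statistic is straightforward to compute from a list of residual autocorrelations. The following is a minimal sketch with a hypothetical function name. It reproduces the "To Lag 6" chi-square of the ARIMA(0,1,1)(0,1,1)12 fit up to the rounding of the printed autocorrelations.

```python
def ljung_box(residual_acf, n):
    """Ljung-Box-Pierce Q* for lags 1..k.

    residual_acf[l - 1] holds the lag-l residual autocorrelation and
    n is the number of observations available after differencing.
    """
    total = sum(r * r / (n - lag)
                for lag, r in enumerate(residual_acf, start=1))
    return n * (n + 2) * total


# First six residual autocorrelations of the ARIMA(0,1,1)(0,1,1)12 fit
r_model_2 = [0.000, -0.025, 0.040, -0.101, -0.002, 0.105]
q_star = ljung_box(r_model_2, n=165)  # close to the reported 4.04
```

In the SAS listings the degrees of freedom equal the number of lags minus the number of estimated autoregressive and moving average parameters (for example 6 - 2 = 4 here). A large p-value gives no evidence of remaining autocorrelation in the residuals.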