A comparison of some of pattern identification methods (Wai-Sum Chan)

Statistics & Probability Letters 42 (1999) 69–79

A comparison of some of pattern identification methods for order determination of mixed ARMA models

Wai-Sum Chan ∗
Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam Road, Hong Kong

Received February 1998; received in revised form June 1998

∗ Tel.: +852 2859 2466; fax: +852 2858 9041; e-mail: chanws@hku.hk.

Abstract

Model identification is a crucial step in time-series modelling. The orthodox Box–Jenkins (BJ) identification examines the patterns of the sample autocorrelation function (SACF) and the sample partial autocorrelation function (SPACF). However, for mixed ARMA processes, the SACF and SPACF often exhibit similar behaviour, which makes identification much more difficult. Recently, identification methods using the patterns of some functions of the autocorrelations have been proposed to supplement the BJ methods. This paper studies some of these proposed procedures. Their performance in order selection for a mixed ARMA process is compared with that of an expert system in a simulation study. Comments on each individual identification method are also given. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Corner method; ESACF table; Model identification; Outliers; SCAN table; Time-series ARMA models

1. Introduction

Suppose that a time series $Y_t$ has the stationary autoregressive-moving average representation

$$\phi(B) Y_t = \theta(B) a_t, \tag{1}$$

where $B$ is the backshift operator such that $B^s Y_t = Y_{t-s}$, $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$, $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, $\phi(B)$ has all its roots outside the unit circle, and $a_t$ is white noise with zero mean and constant variance $\sigma_a^2 < \infty$. We further assume that the AR and MA polynomials in Eq. (1) have no common factors. For all integers $k$, defining the autocovariance $\gamma_k$ at lag $k$ by $\mathrm{Cov}[Y_t, Y_{t-k}]$, we have

$$\gamma_k = E[Y_t Y_{t-k}]. \tag{2}$$

In particular, $\gamma_0 = \mathrm{Var}(Y_t)$. The general process in Eq. (1) can be fully characterized by the parameter vector

$$\boldsymbol{\beta} = (\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q, \sigma_a^2). \tag{3}$$
0167-7152/99/$ – see front matter. PII: S0167-7152(98)00195-3

Alternatively, the process can be represented by $\{\gamma_0, \gamma_1, \ldots, \gamma_{p+q}\}$. Defining the autocorrelation at lag $k$ by

$$\rho_k = \frac{\gamma_k}{\gamma_0}, \tag{4}$$

precisely the same information as in the parameter vector of Eq. (3) is contained in $\gamma_0$ and $\{\rho_1, \rho_2, \ldots, \rho_{p+q}\}$. The complete set of autocorrelations $\rho_1, \rho_2, \ldots$ is termed the autocorrelation function (ACF). The partial autocorrelation function (PACF) discussed in Box and Jenkins (1976, p. 64) can be obtained from the ACF through the Yule–Walker equations.

The orthodox Box–Jenkins (BJ) identification method examines the patterns of the ACF and the PACF. If $\{Y_t\}$ is an MA($q$) process, then the ACF cuts off at lag $q$. On the other hand, the PACF cuts off at lag $p$ for an AR($p$) model. However, the BJ identification method is not very useful when dealing with mixed ARMA processes, because both the ACF and the PACF then tail off. Simple inspection of the graphs of the ACF and the PACF would not, in general, give clear values of $p$ and $q$ for mixed models. The difficulty is compounded when the ACF and the PACF are replaced by their estimates.

Since an ARMA process can be completely characterized by its ACF, identification methods using the patterns of some functions of the autocorrelations have been proposed to supplement the BJ method for identification of mixed ARMA processes. These are often classified as pattern identification methods in the time series literature. They include the R and S array method (Gray et al., 1978), the Corner method (Beguin et al., 1980), the GPAC method (Woodward and Gray, 1981), the ESACF method (Tsay and Tiao, 1984), the SCAN method (Tsay and Tiao, 1985), and many others. Choi (1992, Ch. 5) provided a comprehensive review of pattern identification methods.

Pattern identification methods have received mixed responses from time series analysts.
Some, like Liu and Hanssens (1982), Lii (1985), and Rezayat and Anandalingam (1988), favour the Corner method. Others, like de Gooijer and Heuts (1981) and Petruccelli and Davies (1984), are less optimistic. Davies and Petruccelli (1984) presented an argument against the use of the GPAC method. De Gooijer and Heuts (1981) mentioned that the complexity and theoretical difficulty of the R and S array method had kept them from using it.

This article concentrates on three selected pattern identification methods: the Corner method, the ESACF method, and the SCAN method. These methods are selected on the criteria that they are: (i) relatively simple to use; (ii) available in some time-series computer packages; and (iii) less criticized by other time-series analysts. A Monte Carlo study is performed to compare the identification power of these methods for a mixed ARMA model.

2. Some selected pattern identification methods

2.1. The Corner method

Beguin et al. (1980) considered the determinant

$$\Delta(r, s) = \begin{vmatrix} \rho_r & \rho_{r-1} & \cdots & \rho_{r-s+1} \\ \rho_{r+1} & \rho_r & \cdots & \rho_{r-s+2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{r+s-1} & \rho_{r+s-2} & \cdots & \rho_r \end{vmatrix} \tag{5}$$

for $r \geq 1$ and $s \geq 1$. The Corner table $C$ is defined by

$$C(i, j) = \Delta(j + 1, i + 1) \tag{6}$$

for $i, j = 0, 1, \ldots, K$, where $K$ is an arbitrary but large integer. The asymptotic pattern of the Corner table $C$ for an ARMA($p$, $q$) model is given in Table 1.

Table 1
The asymptotic pattern of the Corner table and the SCAN table for an ARMA(p, q) model

                     MA order (j)
AR order (i)   0    ···  q−1   q    q+1  q+2  ···  K
0              X    ···  X     X    X    X    ···  X
⋮              ⋮         ⋮     ⋮    ⋮    ⋮         ⋮
p−1            X    ···  X     X    X    X    ···  X
p              X    ···  X     O    O    O    ···  O
p+1            X    ···  X     O    O    O    ···  O
p+2            X    ···  X     O    O    O    ···  O
⋮              ⋮         ⋮     ⋮    ⋮    ⋮         ⋮
K              X    ···  X     O    O    O    ···  O

Note: X represents nonzero; O represents zero.

It is expected that the Corner table produces a large south-east rectangular sub-matrix whose elements are all zeros.
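The zero rectangle can be checked numerically on theoretical autocorrelations. The following is only a minimal sketch under stated assumptions: it uses the ARMA(2,1) coefficients of the model studied later in Eq. (17), approximates the autocorrelations by truncating the MA(∞) weights at 2000 terms, and treats any determinant smaller than 1e-8 in absolute value as zero. The function names `arma_acf` and `corner_table`, the truncation length, and the threshold are illustrative choices, not part of the original method.

```python
import numpy as np

def arma_acf(phi, theta, max_lag, n_psi=2000):
    """Approximate theoretical autocorrelations rho_0..rho_max_lag.

    Sign convention as in Eq. (1): phi(B) = 1 - phi_1 B - ...,
    theta(B) = 1 - theta_1 B - ...  The MA(infinity) weights psi_k are
    truncated at n_psi terms (an assumption of this sketch).
    """
    psi = np.zeros(n_psi)
    psi[0] = 1.0
    for k in range(1, n_psi):
        acc = -theta[k - 1] if k <= len(theta) else 0.0
        for r, ph in enumerate(phi, start=1):
            if k - r >= 0:
                acc += ph * psi[k - r]
        psi[k] = acc
    gamma = np.array([psi[: n_psi - k] @ psi[k:] for k in range(max_lag + 1)])
    return gamma / gamma[0]

def corner_table(rho, K):
    """C(i, j) = Delta(j + 1, i + 1), with Delta(r, s) the determinant in Eq. (5)."""
    C = np.empty((K + 1, K + 1))
    for i in range(K + 1):
        for j in range(K + 1):
            r, s = j + 1, i + 1
            # Entry (a, b) of the s x s matrix is rho_{r + a - b}; rho_{-k} = rho_k.
            M = np.array([[rho[abs(r + a - b)] for b in range(s)] for a in range(s)])
            C[i, j] = np.linalg.det(M)
    return C

K = 4
# (1 - 0.8B)(1 - 0.5B) = 1 - 1.3B + 0.4B^2 and theta(B) = 1 + 0.5B
rho = arma_acf(phi=[1.3, -0.4], theta=[-0.5], max_lag=2 * K + 1)
C = corner_table(rho, K)
simplified = np.where(np.abs(C) < 1e-8, "O", "X")  # rows: AR order i, columns: MA order j
print(simplified)
```

With exact autocorrelations no standard errors are involved, so the printed symbols reflect only the algebraic pattern; with estimated autocorrelations the entries would have to be judged against their sampling variability.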
The coordinates of the northwest corner of this zero-rectangle are $(p, q)$, which provides a strong clue for identifying the order of the underlying process.

In practice, we have only a finite number of observations, so the autocorrelations in Eq. (5) have to be estimated and the Corner table is calculated from the estimated determinants $\hat{\Delta}(r, s)$. De Gooijer and Heuts (1981) complained that it is difficult to locate the possible values of $p$ and $q$ by visual inspection of the numerical elements inside the table. Following Beguin et al. (1980) and Tsay and Tiao (1984), we simplify the Corner table using indicator symbols. The Simplified Corner table is defined as

$$C^*(i, j) = \begin{cases} \text{O}, & \text{if } \left| \dfrac{\hat{\Delta}(j + 1, i + 1)}{\mathrm{SE}(\hat{\Delta}(j + 1, i + 1))} \right| < 2, \\[2ex] \text{X}, & \text{otherwise}, \end{cases} \tag{7}$$

for $i, j = 0, 1, \ldots, K$, where "O" is an indicator symbol representing an element whose value is not significantly different from zero and "X" represents a nonzero element. The standard error of any element $\hat{\Delta}$ in the estimated Corner table is given by

$$\mathrm{SE}(\hat{\Delta}) = \sqrt{\frac{A G A'}{n}}, \tag{8}$$

where $A$ is a $(1 \times h)$ vector with elements $a(j) = \partial \Delta / \partial \rho_j$; $h$ is the maximal lag among all the autocorrelations in Eq. (5); $n$ is the sample size; and $G$ is an $(h \times h)$ matrix whose $(i, j)$ element is

$$\sum_{k=-\infty}^{\infty} \left\{ \rho_k \rho_{k-i+j} + \rho_{k+j} \rho_{k-i} - 2 \rho_k \rho_j \rho_{k-i} - 2 \rho_i \rho_k \rho_{k-j} + 2 \rho_i \rho_j \rho_k^2 \right\}. \tag{9}$$

The estimated matrices $\hat{A}$ and $\hat{G}$ can be obtained by replacing the autocorrelations in Eqs. (5) and (9) by their corresponding estimates. In practice, we also have to approximate the summation in Eq. (9) using a finite number of terms (say, $-50 \leq k \leq 50$). The Simplified Corner table with indicator symbols is easier to compare with its asymptotic pattern in Table 1. However, it should be noted that the indicator symbols only provide a crude guide and are not meant to be formal significance tests.

2.2. The ESACF method

Tsay and Tiao (1984) proposed the extended sample autocorrelation function (ESACF) based on the concept of iterated least-squares (ILS) regression.
Let $\{y_1, \ldots, y_n\}$ be a realization from an ARMA($p$, $q$) model. The $l$th iterated AR($k$) regression is defined as

$$y_t = \sum_{r=1}^{k} \phi^{(l)}_{k,r}\, y_{t-r} - \sum_{s=1}^{l} \theta^{(l)}_{k,s}\, \hat{a}^{(l-s)}_{k,t-s} + a^{(l)}_{k,t} \tag{10}$$

for $t = (k + l + 1), \ldots, n$; $k = 1, 2, \ldots$; and $l = 0, 1, \ldots$, where

$$\hat{a}^{(l)}_{k,t} = y_t - \sum_{r=1}^{k} \hat{\phi}^{(l)}_{k,r}\, y_{t-r} + \sum_{s=1}^{l} \hat{\theta}^{(l)}_{k,s}\, \hat{a}^{(l-s)}_{k,t-s} \tag{11}$$

is the estimated residual of the $l$th iterated AR($k$) regression, and $\{\hat{\phi}^{(l)}_{k,1}, \ldots, \hat{\phi}^{(l)}_{k,k}\}$ and $\{\hat{\theta}^{(l)}_{k,1}, \ldots, \hat{\theta}^{(l)}_{k,l}\}$ are the ordinary least-squares estimates obtained from the regression in Eq. (10). Let

$$\varepsilon^{(l)}_{k,t} = y_t - \sum_{r=1}^{k} \hat{\phi}^{(l)}_{k,r}\, y_{t-r} \tag{12}$$

for $k = 0, 1, \ldots$. The extended sample autocorrelation $\{r_{k,l}\}$ is defined as the sample autocorrelation at lag $l$ of $\{\varepsilon^{(l)}_{k,t}\}$. It should be noted that $\{r_{0,l}\}$ is simply the standard SACF ($\hat{\rho}_l$) of $\{y_t\}$. We can form a two-way table from the ESACF. The ESACF table, $E$, is defined as

$$E(i, j) = r_{i,j+1} \tag{13}$$

for $i, j = 0, 1, \ldots, K$, where $K$ is an arbitrary but large integer. The asymptotic ESACF pattern for an ARMA($p$, $q$) model is tabulated in Table 2. There is a remarkable zero-triangle in the table whose vertex is in the $(p, q)$ position. Hence, the ESACF can be a useful tool in model identification, particularly for a mixed ARMA process.

Table 2
The asymptotic pattern of the ESACF table for an ARMA(p, q) model

                     MA order (j)
AR order (i)   0    ···  q−1   q    q+1  q+2  ···  K
0              X    ···  X     X    X    X    ···  X
⋮              ⋮         ⋮     ⋮    ⋮    ⋮         ⋮
p−1            X    ···  X     X    X    X    ···  X
p              X    ···  X     O    O    O    ···  O
p+1            X    ···  X     X    O    O    ···  O
p+2            X    ···  X     X    X    O    ···  O
⋮              ⋮         ⋮     ⋮    ⋮    ⋮         ⋮
K              X    ···  X     X    X    X    ···  O

Note: X represents nonzero; O represents zero.

Tsay and Tiao (1984) proposed using a simplified ESACF table for finite-sample situations. The original ESACF table of values can be summarized in a condensed form by replacing those values that are within two standard errors of zero by an "O", and by an "X" otherwise. The standard errors of $\{r_{k,l}\}$ can be approximated by $1/\sqrt{n - k - l}$.

2.3. The SCAN method

Tsay and Tiao (1985) proposed a smallest canonical correlation (SCAN) approach for tentative order determination in building ARMA models. Let $\{y_1, \ldots, y_n\}$ be a realization from an ARMA($p$, $q$) process. The steps for calculating the SCAN table are summarized as follows:

1. For $0 \leq i \leq K$ and $0 \leq j \leq K$ ($K$ is an arbitrary but large integer), we compute the smallest eigenvalue $\hat{\lambda}(i, j)$ of the matrix

$$\hat{\Psi}(i, j) = \hat{A}(i, j)\, \hat{B}(i, j), \tag{14}$$

where

$$\hat{A}(i, j) = \Big( \sum_t Z_{i,t} Z'_{i,t} \Big)^{-1} \Big( \sum_t Z_{i,t} Z'_{i,t-j-1} \Big), \qquad \hat{B}(i, j) = \Big( \sum_t Z_{i,t-j-1} Z'_{i,t-j-1} \Big)^{-1} \Big( \sum_t Z_{i,t-j-1} Z'_{i,t} \Big)$$

are $(i + 1) \times (i + 1)$ matrices with the summation over $t$ running from $(i + j + 2)$ to $n$, and $Z_{i,t} = (y_t, \ldots, y_{t-i})'$.

2. For each $i$ and $j$, a statistic $\tau(i, j)$ is computed:

$$\tau(i, j) = -(n - i - j) \ln \left( 1 - \frac{\hat{\lambda}(i, j)}{\delta(i, j)} \right), \tag{15}$$

where $\delta(i, 0) = 1$,

$$\delta(i, j) = 1 + 2 \sum_{k=1}^{j} \hat{\rho}_k(\omega),$$

and $\hat{\rho}_k(\omega)$ denotes the sample autocorrelation of the transformed series $\{\omega_{i,t}\}$ at lag $k$. The transformed series can be obtained by $\omega_{0,t} = y_t$ and

$$\omega_{i,t} = y_t - \hat{\phi}^{(j)}_1 y_{t-1} - \cdots - \hat{\phi}^{(j)}_i y_{t-i},$$

where $(1, -\hat{\phi}^{(j)}_1, \ldots, -\hat{\phi}^{(j)}_i)$ is a normalized eigenvector of $\hat{\Psi}(i, j)$ corresponding to $\hat{\lambda}(i, j)$.

3. A two-way SCAN table $S$ can be arranged by

$$S(i, j) = \tau(i, j) \quad \text{for } i, j = 0, 1, \ldots, K. \tag{16}$$

Table 3
A hypothetical simplified pattern identification table

                     MA order (j)
AR order (i)   0   1   2   3   4   5   6   7
0              X   X   X   X   X   O   O   O
1              X   X   X   X   X   O   O   O
2              X   O   O   O   O   O   O   O
3              O   O   O   O   O   O   O   O
4              O   O   O   O   O   O   O   O

Tsay and Tiao (1985) showed that $\tau(i, j)$ is asymptotically a $\chi^2_1$ random variable when $i = p$ and $j \geq q$, or when $i \geq p$ and $j = q$. The asymptotic pattern of the SCAN table is given in Table 1. It has a pattern similar to that of the Corner method: there is a large zero-rectangle with $(p, q)$ as the upper-left vertex inside the table.
As in the case of the ESACF, a simplified SCAN table is commonly used in practice. Liu and Hudak (1994, p. 545) suggested displaying the symbol "X" at a position where the SCAN statistic is significant at the one per cent level, and the symbol "O" otherwise. Therefore, we can determine possible values for $p$ and $q$ by searching for a corner in the simplified SCAN table.

3. Monte Carlo study

Computations for constructing the pattern identification tables described in Section 2 are relatively simple. Mareschal and Melard (1988) provided FORTRAN codes for the Corner method. The Corner table can also be efficiently calculated using matrix-based computer languages (e.g., Splus, GAUSS, MATLAB, SCA, etc.). The ESACF and the SCAN methods are incorporated in the SCA system (Liu and Hudak, 1994).

Despite the availability of efficient algorithms for the selected pattern identification methods, there are only a few empirical studies on them. De Gooijer and Heuts (1981) examined the identification power of the Corner method based on 15 simulation runs. Rezayat and Anandalingam (1988) compared the ESACF and the Corner method in a Monte Carlo experiment with 50 replications. The mean and standard deviation of the elements in the SCAN table were studied by Tsay and Tiao (1985) using a simulation with 400 runs. Unfortunately, the identification power of the SCAN method was not examined in their paper.

One of the major barriers to studying the order selection power of a pattern identification method is the visual search for the corner (vertex) inside the simplified table. It is subjective and too time consuming for a large-scale simulation experiment. In this section, we compare the identification power of the selected methods through a larger-scale (1000 runs) simulation. Since most time series analysts (especially beginners) rely heavily on the simplified tables, our study is based only on the patterns of the symbols "X" and "O" inside each table.
We propose some objective rules for the computer to locate the potential corners/vertices automatically. Following Rezayat and Anandalingam (1988), we allow each method to select a set of orders instead of a single identification. The computer is instructed to include $(i, j)$ in the order set if and only if
1. the symbol in the $(i - 1, j)$ position is an "X"; and
2. the symbol in the $(i, j - 1)$ position is an "X"; and
3. the symbol in the $(i, j)$ position is an "O".
A check is deemed passed if $(i - 1) < 0$ or $(j - 1) < 0$. For example, in Table 3, orders (0,5), (2,1) and (3,0) are selected into the order set. These rules might not be perfect, but we find that they are quite helpful in specifying low-order ARMA models.

Table 4
Number of selected combinations (p, q) in 1000 simulations from the ARMA(2,1) model in Eq. (17) for various pattern identification methods and the SCA Expert System

          CORNER                    ESACF                     SCAN                      EXPERT
(p, q)    100   200   400   1000    100   200   400   1000    100   200   400   1000    100   200   400   1000
(0, 0)      0     0     0      0      0     0     0      0      0     0     0      0      1     0     0      0
(1, 0)    102     0     0      0      0     0     0      0      0     0     0      0     13     8     0      0
(2, 0)    891   945   154      0     98     0     0      0    575   268   247    215    292   148    49      3
(3, 0)      7    53   805    174    126    58     9      0    400   687   704    754    204   456   717    883
(4, 0)      4    11    39    793    233   165    94     22     52    64    55     40     65   111   140    108
(0, 1)      0     0     0      0      0     0     0      0      0     0     0      0      1     1     0      0
(0, 2)     13     0     0      0     31     0     0      0    109     0     0      0      1     0     1      0
(0, 3)    210     9     0      0    354    19     0      0    574    72     0      0      4     0     0      0
(0, 4)    344   149     2      0    376   207     2      0    276   418    36      0      0     0     0      0
(1, 1)    675   410    48      0    421   117     5      0    687   331    46      0    166   120    36      2
(1, 2)    163   475   673    314    485   628   541    192    227   591   788    476      7     4     1      0
(1, 3)     27    74   196    514     38   207   340    529      4    40   140    436      3     3     0      0
(2, 1)      5    45   799    981    342   553   577    591     30   266   463    502     44    48    41      4
(2, 2)      2     5     9     14     55   153   167    160      7    12     8     25     21    22     6      0
(2, 3)      0     5     9     15      0     8    36    115      4     3     5      6      7     3     0      0
(3, 1)      1     0     6     14     73    79    83     53     15    27    19     14      0     1     0      0
(3, 2)      0     0     1      5      2    13    58    103     22    14    12     10      0     4     0      0
NS (a)      –     –     –      –      –     –     –      –      –     –     –      –    171    71     9      0
Total    2444  2181  2741   2824   2634  2207  1912   1765   2982  2793  2523   2478   1000  1000  1000   1000

(a) NS represents non-stationary models. Column headings under each method are the sample sizes n.

The following ARMA(2,1) process is considered:

$$(1 - 0.8B)(1 - 0.5B) Y_t = (1 + 0.5B) a_t. \tag{17}$$

The parameters are chosen to reduce possible cancellation effects.

In the first experiment, we consider time series generated from model (17). The first 100 observations are discarded to reduce the effects of starting values. We consider sample sizes $n = 100, 200, 400, 1000$; $\sigma_a^2 = 1$; and the greatest possible order $K = 5$. Simplified tables for the Corner method, the ESACF method and the SCAN method are obtained by the SCA system (Liu and Hudak, 1994). Using our proposed rules, the selected order set for each method is recorded. The experiment is repeated 1000 times and the numbers of selected combinations $(p, q)$ for the different methods are summarized in Table 4. For comparison purposes, order selection results from the SCA Expert System are also reported. This expert system is described in detail by Liu (1993).

Outliers or unusual observations could easily affect the performance of the identification methods in a real application. In the second experiment, a few outlying observations are introduced into the data. Let $Y_t$ be the outlier-free time series generated from model (17). The contaminated time series $Z_t$ is obtained by

$$Z_t = Y_t + \sum_{s=1}^{m} \omega I_t^{(T_s)}, \tag{18}$$

where $m$ is the number of outliers, $\omega$ represents the magnitude of the outliers, and

$$I_t^{(T_s)} = \begin{cases} 1, & t = T_s, \\ 0, & t \neq T_s, \end{cases} \tag{19}$$

is the indicator variable representing the presence or absence of an outlier at time $T_s$. Time series are generated from model (18) with the same basic setup as in the first experiment. Three outliers are introduced at $T_1 = 0.25n$, $T_2 = 0.5n$, $T_3 = 0.75n$ with magnitude $\omega = 0, 3, 5, 7, 9$ and $\infty$. The case of $\omega \to \infty$ is approximated by setting $\omega = 10\,000$. The experiment is repeated 1000 times.

From a practical point of view, under-specification may be a more serious problem than minor over-specification. For example, since the true model is ARMA(2,1), an AR(2) specification is a more "serious mistake" than an AR(3) or an ARMA(1,2) specification. Consequently, for the second experiment, the rates of under-specification (in percent) for each identification method are reported in Table 5.

Table 5
Rates of under-specification (in percent) for various methods in the presence of outliers

                    ω
Method    n         0       3       5       7       9       ∞
CORNER    100       68.8    66.0    61.1    62.6    63.7    100.0
          200       65.1    63.5    61.0    55.8    63.6    100.0
          400        7.4    40.6    54.3    47.4    34.9    100.0
          1000       0.0     0.0     1.9    47.4    49.1    100.0
ESACF     100       20.9    37.5    29.9    43.9    51.7     82.4
          200        5.3    33.9    18.6    16.7    39.0     79.9
          400        0.3     2.8     4.4     7.8     2.9     82.2
          1000       0.0     0.2     0.7     7.3     2.8     87.3
SCAN      100       46.0    63.0    54.5    55.9    59.6     99.4
          200       21.4    54.3    56.0    50.8    53.4     99.9
          400       11.6    25.9    52.3    49.2    39.2     99.9
          1000       8.7     9.3     8.7    45.7    49.2    100.0
EXPERT    100       60.8    95.4    92.6    96.7    93.8    100.0
          200       30.7    93.2    94.9    93.3    96.4    100.0
          400        9.4    95.5    97.6    98.9    99.0    100.0
          1000       0.5    68.9    99.8    88.8    98.8    100.0

4. Discussions

For the first experiment, the number of correct identifications for each method in 1000 simulations is shown in the (2, 1) row of Table 4. A "correct identification" means that the true order has been selected into the identified order set by the method. We observe that the number of correct identifications increases with the sample size for each pattern identification method. For example, for the Corner method, the number of correct identifications is 5 for $n = 100$, 45 for $n = 200$, 799 for $n = 400$, and it climbs further to 981 for $n = 1000$. This agrees with the asymptotic results for each method.

The Corner method works well in the large-sample situations ($n = 400, 1000$).
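The three order-set rules proposed in Section 3 are mechanical enough to express in a few lines. The sketch below is an illustrative reading of those rules, not the author's program: the function name `select_orders` and the encoding of a simplified table as a list of strings are assumptions, and the input is the hypothetical table shown in Table 3.

```python
def select_orders(table):
    """Return the cells (i, j) that satisfy the three rules of Section 3.

    table[i][j] is 'X' or 'O' for AR order i and MA order j. A neighbour that
    falls outside the table (i - 1 < 0 or j - 1 < 0) counts as a passed check.
    """
    selected = []
    for i, row in enumerate(table):
        for j, symbol in enumerate(row):
            above_ok = i == 0 or table[i - 1][j] == "X"  # rule 1
            left_ok = j == 0 or table[i][j - 1] == "X"   # rule 2
            if symbol == "O" and above_ok and left_ok:    # rule 3
                selected.append((i, j))
    return selected

# Table 3: rows are AR orders 0..4, columns are MA orders 0..7.
table3 = [
    "XXXXXOOO",
    "XXXXXOOO",
    "XOOOOOOO",
    "OOOOOOOO",
    "OOOOOOOO",
]
print(select_orders(table3))
```

For Table 3 this scan returns (0, 5), (2, 1) and (3, 0), the same order set quoted in Section 3, which suggests the encoding above matches the intended reading of the rules.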
However, its performance for the cases with $n \leq 200$ is very discouraging. The method includes the true order in only 0.5% ($n = 100$) and 4.5% ($n = 200$) of the 1000 simulation runs. When the sample size is small, it erroneously favours the AR(2) identification. Out of the 1000 simulations, the method selects $(p, q) = (2, 0)$ into the order set in 891 runs for $n = 100$, and in 945 runs for $n = 200$.

The performance of the ESACF method is quite robust to the sample size. The number of correct identifications is 342 for $n = 100$, 553 for $n = 200$, 577 for $n = 400$, and 591 for $n = 1000$. The average number of elements in the selected order set decreases quickly with the sample size: it is 2.634 for $n = 100$, 2.207 for $n = 200$, 1.912 for $n = 400$, and 1.765 for $n = 1000$. This implies that the ESACF method usually makes a more concrete identification for time series with larger sample sizes.

The results of the SCAN method are less encouraging. Its most favoured combination is (1, 1) for $n = 100$, (3, 0) for $n = 200$, (1, 2) for $n = 400$, and (3, 0) for $n = 1000$. None of them is the true order (2, 1) of model (17).

Table 6
Identification ratios (in percent) for various methods

n        CORNER   ESACF   SCAN   EXPERT
100        0.2     13.0    1.0     4.4
200        2.1     25.1    9.5     4.8
400       29.1     30.2   18.4     4.1
1000      34.7     33.5   20.3     0.4

For comparison purposes, automatic identification results from the SCA Expert System (Liu, 1993) are also reported in Table 4. It should be noted that the expert system is only allowed to make a single identification in each simulation run. It is, therefore, not fair to compare the numbers for the expert system directly with the results of the other pattern identification methods. However, the performance of the expert system is rather disappointing even for a sample size as large as $n = 1000$. It clearly favours an ARMA(3,0) model for the simulated time series.
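How closely a low-order AR model can mimic model (17) can be gauged from its AR(∞) weights, that is, the coefficients of $\phi(B)/\theta(B)$. The following minimal sketch expands that ratio by polynomial long division; the helper name `pi_weights` and the convention of passing full coefficient lists in ascending powers of $B$ are assumptions made for illustration.

```python
import numpy as np

def pi_weights(phi, theta, n_terms):
    """Coefficients of pi(B) = phi(B) / theta(B) up to B^(n_terms - 1).

    phi and theta are full coefficient lists in ascending powers of B,
    e.g. 1 - 1.3B + 0.4B^2 is [1.0, -1.3, 0.4]. Solves theta(B) pi(B) = phi(B)
    term by term.
    """
    pi = np.zeros(n_terms)
    for k in range(n_terms):
        acc = phi[k] if k < len(phi) else 0.0
        for s in range(1, min(k, len(theta) - 1) + 1):
            acc -= theta[s] * pi[k - s]
        pi[k] = acc / theta[0]
    return pi

# Model (17): (1 - 0.8B)(1 - 0.5B) Y_t = (1 + 0.5B) a_t
phi = np.convolve([1.0, -0.8], [1.0, -0.5])  # coefficients of 1 - 1.3B + 0.4B^2
theta = [1.0, 0.5]
pi = pi_weights(phi, theta, n_terms=7)
print(np.round(pi, 4))
```

The weights alternate in sign and roughly halve in magnitude from lag 3 onwards, so the terms beyond lag 3 are comparatively small.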
Since model (17) can be written as

$$Y_t - 1.8Y_{t-1} + 1.3Y_{t-2} - 0.65Y_{t-3} + 0.33Y_{t-4} - 0.16Y_{t-5} + 0.08Y_{t-6} - \cdots = a_t, \tag{20}$$

the frequent choice of an ARMA(3,0) model by the expert system is not totally unreasonable. Our limited experience in this study shows that there is still plenty of room for the expert system to be improved.

In order to have a criterion for evaluating the performance of all the identification methods, we compute the identification ratio

$$R = \frac{\text{Number of correct identifications in the 1000 runs}}{\text{Total number of identifications in the 1000 runs}}. \tag{21}$$

The results are summarised in Table 6. Based on this criterion, the ESACF method outperforms the others. The Corner method has a reasonable performance in large-sample situations. On the other hand, our simulation study suggests that the SCAN method and the SCA Expert System do not have adequate identification power in many cases.

The performance of the identification methods in the presence of time series outliers is summarized in Table 5. All methods are adversely influenced by the outlying observations. Only the ESACF method provides some resistance to extreme observations when the sample size is large.

Finally, some other characteristics of the identification methods are compared in Table 7. The ESACF and the SCAN methods can handle nonstationary processes directly. On the other hand, we are not able to obtain the ACF from a nonstationary time series, and without proper autocorrelations the Corner table in Eq. (6) is not defined. Furthermore, the Corner method can easily under-specify the order if one of the AR roots is close to, but not on, the unit circle. For example, consider the model used in this study. If $(1 - 0.8B)$ in the AR polynomial is replaced by $(1 - 0.98B)$, the estimated serial correlations in the determinant of Eq. (5) are all very close to one, and the Corner method can easily identify the model as an AR(1).

The multivariate extension of the ESACF approach was proposed by Tiao and Tsay (1983).
The SCAN method was also generalised by Tiao and Tsay (1989) for identifying vector ARMA models. Liu and Hanssens (1982) discussed the use of the Corner table to select the orders in a rational transfer function model. Unfortunately, none of the selected pattern identification methods extends directly to the identification of seasonal time series models. We also record in Table 7 the CPU time (in seconds) required by each identification method ($n = 1000$) using the SCA package on an SGI PowerChallenge (running IRIX 6.2) system at the National University of Singapore.

Table 7
Some other characteristics of the identification methods

                                 CORNER   ESACF   SCAN   EXPERT
Extensions to:
1. Nonstationary time series     No       Yes     Yes    Yes
2. Seasonal time series          No       No      No     Yes
3. Transfer function models      Yes      No      No     Yes
4. Vector time series            No       Yes     Yes    No
CPU time (in seconds)            4        7       80     15

5. Conclusions

For outlier-free time series, the general conclusions drawn from Table 6 are: (a) the Corner method works well when the sample size is large, but it fares poorly for small to moderate sample sizes; (b) the ESACF method is more robust with respect to sample size and works reasonably well; and (c) the automatic method fares poorly for all sample sizes used. However, as noted in Section 4, the automatic method provides only a single model, whereas multiple models are allowed for the other methods. Therefore, direct comparison of the identification power of the expert system with that of the other methods should be avoided.

For time series contaminated with a few outlying observations, the conclusions drawn from Table 5 are: (a) no identification method in this study can resist extremely large outliers; (b) the adverse effects of outliers on model identification can generally be diluted by increasing the sample size; and (c) only the ESACF method provides some resistance to moderate outliers when the sample size is large.
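The identification ratios behind conclusions (a) to (c) for outlier-free series can be recomputed from Eq. (21). The sketch below is a small arithmetic check, assuming the counts are read from Table 4 as the (2, 1) row (correct identifications) and the Total row, for $n = 100, 200, 400, 1000$; the dictionary layout and the function name `identification_ratio` are illustrative.

```python
# Counts taken from Table 4: the (2, 1) row (true order) and the Total row,
# listed for n = 100, 200, 400, 1000.
correct = {
    "CORNER": [5, 45, 799, 981],
    "ESACF": [342, 553, 577, 591],
    "SCAN": [30, 266, 463, 502],
    "EXPERT": [44, 48, 41, 4],
}
total = {
    "CORNER": [2444, 2181, 2741, 2824],
    "ESACF": [2634, 2207, 1912, 1765],
    "SCAN": [2982, 2793, 2523, 2478],
    "EXPERT": [1000, 1000, 1000, 1000],
}

def identification_ratio(n_correct, n_total):
    """Eq. (21), expressed in percent and rounded to one decimal place."""
    return round(100.0 * n_correct / n_total, 1)

ratios = {
    method: [identification_ratio(c, t) for c, t in zip(correct[method], total[method])]
    for method in correct
}
for method, values in ratios.items():
    print(method, values)
```

Under this reading the computed percentages agree with the entries of Table 6, for example 5/2444 ≈ 0.2% for the Corner method at $n = 100$ and 591/1765 ≈ 33.5% for the ESACF method at $n = 1000$.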
Finally, we should emphasize that our discussions are confined to the ARMA(2,1) model we have studied. Extrapolation to other situations should be done with caution.

Acknowledgements

The author is grateful to the anonymous referee for his/her helpful comments on an earlier version of this paper.

References

Beguin, J.M., Gourieroux, C., Monfort, A., 1980. Identification of a mixed autoregressive-moving average process: the corner method. In: Anderson, O.D. (Ed.), Time Series. North-Holland, Amsterdam.
Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis, Forecasting and Control, 2nd ed. Holden-Day, San Francisco.
Choi, B., 1992. ARMA Model Identification. Springer, New York.
Davies, N., Petruccelli, J.D., 1984. On the use of the general partial autocorrelation function for order determination in ARMA(p, q) processes. J. Amer. Statist. Assoc. 79, 374–377.
de Gooijer, J.G., Heuts, R.M.J., 1981. The corner method: an investigation of an order determination procedure for general ARMA processes. J. Oper. Res. Soc. 32, 1039–1046.
Gray, H.L., Kelley, G.D., McIntire, D.D., 1978. A new approach to ARMA modelling. Commun. Statist. B7, 1–77.
Lii, K.S., 1985. Transfer function model order and parameter estimation. J. Time Series Anal. 6, 153–169.
Liu, L.M., 1993. A new expert system for time series modelling and forecasting. ASA Proc. Business and Economic Statistics Section, 424–429.
Liu, L.M., Hanssens, D.M., 1982. Identification of multiple-input transfer function models. Commun. Statist. A11, 297–314.
Liu, L.M., Hudak, G.B., 1994. Forecasting and Time Series Analysis using the SCA Statistical System. Scientific Computing Associates Corp., P.O. Box 4692, Oak Brook, Illinois 60522.
Mareschal, B., Melard, G., 1988. The corner method for identifying autoregressive moving average models. Appl. Statist. 37, 301–316.
Petruccelli, J.D., Davies, N., 1984. Some restrictions on the use of corner method hypothesis tests. Commun. Statist. A13, 543–551.
Rezayat, F., Anandalingam, G., 1988. Using instrumental variables for selecting the order of ARMA models. Commun. Statist. A17, 3029–3065.
Tiao, G.C., Tsay, R.S., 1983. Multiple time series modelling and extended sample cross-correlation. J. Bus. Econom. Statist. 1, 43–56.
Tiao, G.C., Tsay, R.S., 1989. Model specification in multivariate time series. J. Roy. Statist. Soc. B51, 157–213.
Tsay, R.S., Tiao, G.C., 1984. Consistent estimates of autoregressive parameters and extended sample autocorrelation function for stationary and nonstationary ARMA models. J. Amer. Statist. Assoc. 79, 84–96.
Tsay, R.S., Tiao, G.C., 1985. Use of canonical analysis in time series model identification. Biometrika 72, 299–315.
Woodward, W.A., Gray, H.L., 1981. On the relationship between the S array and the Box-Jenkins method of ARMA model identification. J. Amer. Statist. Assoc. 76, 579–587.
