SCHOOL OF STATISTICS
UNIVERSITY OF THE PHILIPPINES DILIMAN
WORKING PAPER SERIES

Robust Estimation of a Time Series Model with Structural Change

by Wendell Q. Campano and Erniel B. Barrios (correspondence)

UPSS Working Paper No. 2009-10
July 2009

School of Statistics, Ramon Magsaysay Avenue, U.P. Diliman, Quezon City
Telefax: 928-08-81
Email: updstat@yahoo.com

Robust Estimation of a Time Series Model with Structural Change

Wendell Q. Campano
University of the Philippines Diliman
Magsaysay Ave., Diliman, Quezon City, Philippines
E-mail: wqcampano@up.edu.ph

Erniel B. Barrios (correspondence)
University of the Philippines Diliman
Magsaysay Ave., Diliman, Quezon City, Philippines
E-mail: ebbarrios@up.edu.ph
Tel. No.: (63 2) 9280881

Abstract

A procedure for estimating a time series model with structural change is proposed. A nonparametric bootstrap (block bootstrap or AR-sieve) is applied to a series of estimates obtained through a modified forward search algorithm. The forward search algorithm is implemented with overlapping and independent blocks of time points. The procedure can mitigate the difficulty in estimation when there is a temporary structural change. A simulation study indicated robustness of the estimates from the proposed method when temporary structural change is introduced into the model, provided that the time series is fairly long. We also provide a procedure for detecting structural change and for subsequently adjusting the overall model if a structural change is indeed present.

Keywords: nonparametric bootstrap, ARIMA model, structural change, forward search
AMS Classification Codes: 62G05, 62G09, 62G35

1. Introduction

Advances in computing facilities and methods have a profound impact on the way we analyze and investigate time series data. Implementation of complex, iterative, nonparametric procedures and simulations is now simpler. Detection of outliers and structural breaks in time series, mostly iterative in nature, has become an integral part of model diagnostics (Tsay, 2000).

The occurrence of unusual shocks can result in time series outliers or, if prolonged a little longer, can create a temporary structural break or sometimes a permanent structural change. This paper explores two such computing-intensive methods, the forward search and the nonparametric bootstrap, in the analysis of time series that are influenced by random disturbances that temporarily alter the model behavior.

The forward search (FS) is a powerful algorithm introduced initially for detecting atypical observations and their effects on models fitted to data. Atkinson and Riani (2000) noted that this method was originally developed for models that assume independent observations in various modeling frameworks, e.g., linear and nonlinear regression, generalized linear models, and multivariate analysis, among others. The FS has been effective in detecting aberrant observations and hidden data structure even in the presence of masking caused by the contamination due to unusual observations. The method is also successful in detecting clustered observations for both continuous and categorical data (Cerioli et al., 2007).

The forward search starts by fitting a model to an initial subset of observations considered outlier-free, then progresses with the search by adding an observation or a set of observations according to their 'closeness' to the fitted model, measured by some similarity or distance measure, e.g., residuals.
Thus, the forward search is made up of three phases: choosing an initial subset, progressing (forward) in the search, and diagnostic monitoring (Riani, 2004).

Riani (2004) extended the forward search to time series data, still focusing on outlier detection. The initial subset is robustly chosen among k blocks of contiguous observations of fixed dimension b. The idea of block sampling is to conserve the dependence structure of the observations in the series. The search then progresses by moving to a higher dimension, say b+1, using the least squared standardized prediction residual, and continues until the highest possible dimension is reached.

This paper further modifies the forward search to account for the dependence structure in time series data. Instead of searching by moving to a higher dimension, the search is done using blocks of the same length. The main idea of maintaining the length of the blocks is to isolate certain perturbations present in some segments of the series. This is subsequently expected to reveal the underlying behavior of the time series. The forward search algorithm is applied to independent (non-overlapping) and overlapping blocks of data. Independent blocks can address longer time series, while overlapping blocks can address the problems usually associated with short time series.

The focus of this study is to develop a procedure that produces robust estimates of model parameters in the presence of structural change. These structural perturbations can be considered as contiguous outliers or persistent shocks because the affected observations exhibit some structure of their own, different from the bulk of the data. Thus, the presence of these "irregular" segments may result in an inadequate or biased time series model (Chen and Liu, 1993).

2. Estimation with Data Perturbation

Data perturbations such as structural changes and other outlying observations are common in time series. These perturbations are sets of observations which are in some way different from the bulk of the data and may have a structure of their own (Konis and Laurini, 2007). Tsay (1986) noted that these aberrant observations or structures could seriously affect statistics calculated from the data, such as the autocorrelation functions SACF, SPACF, and ESACF used in ARMA modeling. Hence, it is necessary to correctly, or at least robustly, estimate a model in the presence of these observations.

The general approach to dealing with these atypical segments in the series is to first identify where these segments occur, and then measure the effect of these observations on the specified model. Tsay (1986) proposed an iterative procedure to model time series in the presence of outliers using a linear regression technique. Two classes of single outlier models were considered: the innovational outlier (IO) and the additive outlier (AO). Once an outlier is detected, an iterative method based on the ESACF is used to specify tentative models for the outlier-contaminated series, and the effect of the atypical observations is then removed from the model.

2.3 The Bootstrap Method

The nonparametric bootstrap is a computing-intensive method that involves repeated "resampling" in order to generate an empirical estimate of the distribution of a statistic (Mooney and Duval, 1993). The generated empirical distribution function (EDF) $\hat{F}$ is used to estimate the unknown cumulative distribution function (CDF) $F$, and $\hat{F}$ is then used just as one would use a parametric model (Davison and Hinkley, 1997).
Mooney and Duval (1993) noted that both bootstrap and parametric inference have the same underlying purpose: making an inference about a parameter $\theta$ using a statistic $\hat{\theta}$. The only difference is how they obtain the sampling distribution of $\hat{\theta}$. Parametric inference imposes distributional assumptions on $\hat{\theta}$, while the nonparametric bootstrap first approximates this distribution empirically before making any inferences. The basic steps of the nonparametric bootstrap, as described by Mooney and Duval (1993), are as follows:

1. Construct an empirical probability distribution $\hat{F}(x)$ from the sample by placing probability $1/n$ at each point $x_1, \ldots, x_n$. This is the EDF of $x$, which is the nonparametric maximum likelihood estimate (MLE) of the population distribution function $F(x)$.
2. From the EDF $\hat{F}(x)$, draw a simple random sample of size $n$ with replacement. This is a "resample", $x_b^*$.
3. Calculate the statistic of interest $\hat{\theta}$ from this resample, yielding $\hat{\theta}_b^*$.
4. Repeat steps 2 and 3 $B$ times, where $B$ is a large number.
5. Construct a probability distribution from the $B$ values $\hat{\theta}_1^*, \hat{\theta}_2^*, \ldots, \hat{\theta}_B^*$ by placing probability $1/B$ at each point. This distribution, $\hat{F}^*(\hat{\theta}^*)$, is the bootstrap estimate of the sampling distribution of $\hat{\theta}$.

We used the percentile method of constructing the confidence interval since it does not require the parametric assumptions that are otherwise needed in the normal approximation and BC methods. The percentile method takes literally the notion that $\hat{F}^*(\hat{\theta}^*)$ approximates $F(\hat{\theta})$. A $(1-\alpha)$-level confidence interval includes all values of $\hat{\theta}^*$ between the $\alpha/2$ and $1-\alpha/2$ percentiles of the $\hat{F}^*(\hat{\theta}^*)$ distribution (Efron, 1982 and Stine, 1990, as cited by Mooney and Duval, 1993). DiCiccio and Romano (1988) noted that none of these bootstrap confidence interval methods offers the best confidence interval in general, since the criteria for judging the quality of their results vary widely.

2.4 Bootstraps for Time Series

Davison and Hinkley (1997) discussed two approaches to resampling in time series: model-based resampling and block resampling. In model-based resampling, the idea is to fit a suitable model to the data, compute the residuals from the fitted model, and then generate new series by incorporating random samples from the residuals into the fitted model. Typically, the residuals are centered to have the same mean as the innovations of the model. This type of resampling is based on applying the model equation(s) of the series to innovations resampled from the residuals. To illustrate, suppose an AR(1) model

$$Y_t = \phi Y_{t-1} + \varepsilon_t, \quad t \in \mathbb{Z}, \quad |\phi| < 1 \qquad (1)$$

is fitted to the realizations $Y_1, \ldots, Y_n$, giving the estimated AR coefficient $\hat{\phi}$ and innovations

$$\hat{\varepsilon}_t = Y_t - \hat{\phi} Y_{t-1}, \quad t = 2, \ldots, n. \qquad (2)$$

Note that $\hat{\varepsilon}_1$ is unobtainable because $Y_0$ is unknown. Model-based resampling then proceeds by sampling with replacement from the centered residuals $\hat{\varepsilon}_2, \ldots, \hat{\varepsilon}_n$ to obtain simulated innovations $\varepsilon_0^*, \ldots, \varepsilon_n^*$, and then setting $Y_0^* = \varepsilon_0^*$ and

$$Y_t^* = \hat{\phi} Y_{t-1}^* + \varepsilon_t^*, \quad t = 1, \ldots, n. \qquad (3)$$

The major drawback of model-based resampling is that the parameters of a model and its structure must be identified from the data. If the chosen structure is not appropriate, the resampled series are generated from a wrong model and hence will not have the same statistical properties as the original data. The second approach is block resampling, which involves resampling blocks of consecutive observations.
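Before turning to block methods, the following is a minimal numpy sketch of the model-based resampling scheme in (1)-(3). It is illustrative only: the conditional least squares estimator of $\phi$, the function names, and the choice of B are assumptions made for the example, not prescriptions from the original sources.

```python
import numpy as np

rng = np.random.default_rng(123)

def fit_ar1(y):
    """Estimate phi in Y_t = phi * Y_{t-1} + e_t by conditional least squares."""
    y0, y1 = y[:-1], y[1:]
    phi_hat = np.dot(y0, y1) / np.dot(y0, y0)
    resid = y1 - phi_hat * y0            # innovations for t = 2, ..., n, as in (2)
    return phi_hat, resid

def ar1_model_based_resample(y, rng):
    """Generate one bootstrap series Y* via equations (1)-(3)."""
    phi_hat, resid = fit_ar1(y)
    centered = resid - resid.mean()      # center the residuals
    eps_star = rng.choice(centered, size=len(y) + 1, replace=True)
    y_star = np.empty(len(y) + 1)
    y_star[0] = eps_star[0]              # Y_0* = eps_0*
    for t in range(1, len(y) + 1):       # Y_t* = phi_hat * Y_{t-1}* + eps_t*
        y_star[t] = phi_hat * y_star[t - 1] + eps_star[t]
    return y_star[1:]

# Example: bootstrap distribution of phi_hat for a simulated zero-mean AR(1) series.
n, phi = 300, 0.5
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.standard_normal()

B = 500
phi_boot = np.array([fit_ar1(ar1_model_based_resample(y, rng))[0] for _ in range(B)])
print("phi_hat:", round(fit_ar1(y)[0], 3),
      "bootstrap 95% CI:", np.round(np.percentile(phi_boot, [2.5, 97.5]), 3))
```

The percentile interval printed at the end is the percentile method of Section 2.3 applied to the bootstrapped AR coefficient.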
The block bootstrap tries to mimic the behavior of an estimator by independent and identically distributed resampling of blocks of consecutive observations. The simplest version of this approach is to divide the data into b non-overlapping blocks of the same length l (i.e., n = bl). The procedure is to take a bootstrap sample from the blocks with equal probabilities, and then paste the selected blocks end-to-end to form a new series. The idea is to preserve the original time series structure within a block: if the blocks are long enough, enough of the original dependence structure is preserved in the resampled series. This method works best if the dependence is weak and the blocks are as long as possible. However, with the block bootstrap the dependence between blocks is neglected, and the bootstrap sample is not (conditionally) stationary and exhibits artifacts caused by linking the randomly selected blocks (Bühlmann, 1997).

Bühlmann (1997) proposed the sieve bootstrap to address these issues. The sieve bootstrap first fits a parametric model, chosen for instance with the Akaike information criterion, and then resamples the residuals. An infinite-dimensional nonparametric model is approximated by a sequence of finite-dimensional parametric models, a strategy known as the method of sieves. Specifically, the true underlying stationary process is approximated by an autoregressive model of order p [AR(p)], where $p = p(n)$ is a function of the sample size with $p(n) \to \infty$, $p(n) = o(n)$ as $n \to \infty$. The sieve bootstrap relies heavily on the crucial assumption that the data $X_1, \ldots, X_n$ are a finite realization of an AR($\infty$) process

$$\sum_{j=0}^{\infty} \phi_j (X_{t-j} - \mu) = \varepsilon_t, \quad \phi_0 = 1, \qquad (4)$$

with $\sum_{j=0}^{\infty} \phi_j^2 < \infty$. The AR($\infty$) representation (4) includes the important class of ARMA(p,q) models

$$X_t - \sum_{j=1}^{p} \phi_j X_{t-j} = \varepsilon_t + \sum_{k=1}^{q} \theta_k \varepsilon_{t-k}, \quad t \in \mathbb{Z}, \qquad (5)$$

with invertible generating MA-polynomial, i.e., $\theta(z) = 1 + \sum_{k=1}^{q} \theta_k z^k$, $z \in \mathbb{C}$, has its roots outside the unit disk $\{z \in \mathbb{C} : |z| \le 1\}$ (Bühlmann, 2002). The definition of the sieve bootstrap, as given in Bühlmann (2002), is as follows. Let $\{X_t, t \in \mathbb{Z}\}$ be a real-valued stationary process with $E[X_t] = \mu$. Represent $X_t$ as a one-sided infinite-order AR process as in (4) and denote by $X_1, \ldots, X_n$ a sample from the process.

1. Fit an autoregressive process with increasing order $p(n)$ as the sample size n increases, and estimate the coefficients $\hat{\phi}_{1,n}, \ldots, \hat{\phi}_{p,n}$ corresponding to model (4). This yields the residuals

$$\hat{\varepsilon}_{t,n} = \sum_{j=0}^{p(n)} \hat{\phi}_{j,n}(X_{t-j} - \bar{X}), \quad \hat{\phi}_{0,n} = 1 \quad (t = p+1, \ldots, n). \qquad (6)$$

2. Construct the resampling based on this autoregressive approximation. Center the residuals,

$$\tilde{\varepsilon}_{t,n} = \hat{\varepsilon}_{t,n} - (n-p)^{-1} \sum_{t=p+1}^{n} \hat{\varepsilon}_{t,n} \quad (t = p+1, \ldots, n), \qquad (7)$$

and denote the empirical cumulative distribution function of $\{\tilde{\varepsilon}_{t,n}\}_{t=p+1}^{n}$ by

$$\hat{F}_{\varepsilon,n}(\cdot) = (n-p)^{-1} \sum_{t=p+1}^{n} 1[\tilde{\varepsilon}_{t,n} \le \cdot\,]. \qquad (8)$$

3. Resample, for any t,

$$\varepsilon_t^* \ \text{i.i.d.} \sim \hat{F}_{\varepsilon,n}. \qquad (9)$$

4. Define $\{X_t^*, t \in \mathbb{Z}\}$ by the recursion

$$\sum_{j=0}^{p(n)} \hat{\phi}_{j,n}(X_{t-j}^* - \bar{X}) = \varepsilon_t^*. \qquad (10)$$

In practice, the sieve bootstrap sample $X_1^*, \ldots, X_n^*$, as described in Bühlmann (2002), is constructed as follows:

1. Choose starting values, e.g., equal to zero.
2. Generate an AR($p(n)$) process according to (10) until 'stationarity' is reached, and then discard the first $p(n)$ generated values.
3. Consider any statistic $T_n = T_n(X_1, \ldots, X_n)$, where $T_n$ is a measurable function of the n observations. The bootstrapped statistic $T_n^*$ is defined as $T_n^* = T_n(X_1^*, \ldots, X_n^*)$.
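The following is a minimal sketch of the sieve bootstrap in steps (6)-(10). It assumes the AR order is selected by a Gaussian AIC over a small grid and the coefficients are fitted by conditional least squares; the helper names, the burn-in length, and the toy series are illustrative choices, not part of Bühlmann's definition.

```python
import numpy as np

def fit_ar_ols(x, p):
    """Conditional least squares fit of a mean-centered AR(p); returns coefficients and residuals."""
    xc = x - x.mean()
    n = len(xc)
    X = np.column_stack([xc[p - j - 1:n - j - 1] for j in range(p)])  # lag-(j+1) columns
    y = xc[p:]
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, resid   # note: coef are regression coefficients, i.e. -phi_j in the sign convention of (4)

def select_order_aic(x, max_p=10):
    """Pick the AR order minimizing a Gaussian AIC (illustrative order selection)."""
    best_p, best_aic = 1, np.inf
    for p in range(1, max_p + 1):
        _, resid = fit_ar_ols(x, p)
        aic = len(resid) * np.log(resid.var()) + 2 * p
        if aic < best_aic:
            best_p, best_aic = p, aic
    return best_p

def sieve_bootstrap(x, rng, burn_in=200):
    """Generate one AR-sieve bootstrap replicate of the series x (eqs. 6-10)."""
    p = select_order_aic(x)
    coef, resid = fit_ar_ols(x, p)
    centered = resid - resid.mean()                          # eq. (7)
    m = burn_in + len(x)
    eps_star = rng.choice(centered, size=m, replace=True)    # draws from F_hat in (8)-(9)
    x_star = np.zeros(m)                                     # starting values equal to zero
    for t in range(p, m):                                    # recursion (10) on the centered scale
        x_star[t] = coef @ x_star[t - p:t][::-1] + eps_star[t]
    return x_star[burn_in:] + x.mean()                       # discard burn-in, restore the mean

# Toy example: a stationary AR(2)-type series with mean 10.
rng = np.random.default_rng(1)
x = np.zeros(400)
for t in range(2, 400):
    x[t] = 0.6 * x[t - 1] - 0.2 * x[t - 2] + rng.standard_normal()
x += 10.0

x_star = sieve_bootstrap(x, rng)
print(len(x_star), round(float(x_star.mean()), 2))
```

Any statistic $T_n$ computed on `x` can then be recomputed on `x_star` to obtain a bootstrapped value $T_n^*$, as in step 3 of the practical construction above.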
The sieve bootstrap yields a (conditionally) stationary bootstrap sample and does not exhibit artifacts in the dependence structure, unlike the block bootstrap, where the dependence between blocks is neglected. The method does not require 'pre-vectorizing' the original observations. Also, the sieve bootstrap sample is not a subset of the original sample. It has been shown that this method outperforms the more general block bootstrap within the class of linear, invertible time series. Details of the comparison between the sieve bootstrap and other time series bootstrap methods are given in Bühlmann (2002).

3. Statistical Inference in the Presence of Temporary Structural Change

Statistical inference for models fitted to data containing structural change can be done in two phases. In Phase I, a robust model estimate reflecting the underlying overall behavior of the series is obtained. Without loss of generality, assume that the time series $y_t$, $t = 1, \ldots, n$, is stationary. Suppose $y_t$ follows an autoregressive moving average [ARMA(p,q)] model,

$$\phi(B) y_t = \theta(B) a_t, \qquad (11)$$

where $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$ are polynomials in $B$, $B$ is the backshift operator, and $a_t$ is a white noise process. Suppose further that there are k contiguous segments, each of dimension b, denoted by $s^{(i)}$, $i = 1, \ldots, k$, defined from the series. For independent segments, $s^{(1)} = \{y_1, \ldots, y_b\}$, $s^{(2)} = \{y_{b+1}, \ldots, y_{2b}\}$, ..., $s^{(k)} = \{y_{(k-1)b+1}, \ldots, y_{kb}\}$. For overlapping segments, for example the first b time points and then adding and deleting one time point at a time, the segments are formed as $s^{(1)} = \{y_1, \ldots, y_b\}$, $s^{(2)} = \{y_2, \ldots, y_{b+1}\}$, ..., $s^{(k)} = \{y_k, \ldots, y_{b+k-1}\}$.

3.1 Phase I: The Estimation Procedure

We describe a method of estimating the overall behavior of a time series contaminated with temporary structural change. The procedure consists of a modified forward search algorithm followed by a nonparametric bootstrap. We first present the steps that apply regardless of the nature of the blocks (overlapping or independent); the steps specific to each type of block are presented afterwards.

Modified Forward Search

Step 1: For each of the k segments, fit model (11) and obtain the estimates of each parameter. The following matrices of estimates are obtained:

$$\hat{\Phi}_{FS} = \begin{pmatrix} \hat{\phi}_{11} & \hat{\phi}_{21} & \cdots & \hat{\phi}_{p1} \\ \hat{\phi}_{12} & \hat{\phi}_{22} & \cdots & \hat{\phi}_{p2} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{\phi}_{1k} & \hat{\phi}_{2k} & \cdots & \hat{\phi}_{pk} \end{pmatrix} \quad \text{and} \quad \hat{\Theta}_{FS} = \begin{pmatrix} \hat{\theta}_{11} & \hat{\theta}_{21} & \cdots & \hat{\theta}_{q1} \\ \hat{\theta}_{12} & \hat{\theta}_{22} & \cdots & \hat{\theta}_{q2} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{\theta}_{1k} & \hat{\theta}_{2k} & \cdots & \hat{\theta}_{qk} \end{pmatrix}.$$

The estimates from Step 1 are called the series of "forward searched" parameter estimates of model (11), a total of (p + q) series. When the segments are not independent (overlapping), each series of forward search estimates is composed of estimates obtained using the AR-sieve.

The goal of modifying the forward search algorithm in this way is to reveal the overall picture of the time series by partitioning the series into blocks of constant length. Any temporary structural change is therefore localized in specific segments only and will not necessarily contaminate the entire time series. In contrast, the purpose of the FS for time series discussed by Riani (2004) is to order the observations according to their closeness to the fitted model, and hence it focuses on outlier detection.

Nonparametric Bootstrap Estimation

The estimation routine continues from the estimates obtained in Step 1 above. This time, a nonparametric bootstrap (block bootstrap or AR-sieve) is used to revise the estimates from the modified forward search algorithm.
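Before describing Step 2, here is a small sketch of the segmentation and per-block fitting in Step 1, under simplifying assumptions: the segments are built as overlapping or non-overlapping blocks of constant length b, and, to keep the example self-contained, a mean-plus-AR(1) model rather than a general ARMA(p,q) is fitted to each block by conditional least squares. The function names are hypothetical.

```python
import numpy as np

def make_blocks(y, b, overlapping=True):
    """Constant-length segments s(1), ..., s(k) of the series y (Step 1 segmentation)."""
    n = len(y)
    if overlapping:
        return [y[i:i + b] for i in range(n - b + 1)]        # s(i) = {y_i, ..., y_{i+b-1}}
    return [y[i:i + b] for i in range(0, n - b + 1, b)]       # disjoint blocks of length b

def fit_ar1_cls(block):
    """Conditional least squares estimates (mu_hat, phi_hat) of a mean + AR(1) model."""
    mu_hat = block.mean()
    z = block - mu_hat
    phi_hat = np.dot(z[:-1], z[1:]) / np.dot(z[:-1], z[:-1])
    return mu_hat, phi_hat

def forward_search_estimates(y, b, overlapping=True):
    """Series of 'forward searched' parameter estimates, one row per segment."""
    return np.array([fit_ar1_cls(s) for s in make_blocks(y, b, overlapping)])

# Toy illustration: a long AR(1) series around a mean of 10, cut into 10 non-overlapping blocks.
rng = np.random.default_rng(7)
n, mu, phi = 1200, 10.0, 0.5
y = np.full(n, mu)
for t in range(1, n):
    y[t] = mu + phi * (y[t - 1] - mu) + rng.standard_normal()

fs_est = forward_search_estimates(y, b=120, overlapping=False)
print(fs_est.shape)              # (k, 2): columns hold (mu_hat, phi_hat) per segment
print(np.round(fs_est[:3], 3))
```

Each column of the resulting array corresponds to one series of forward search estimates, which is the input to the bootstrap of Step 2 below.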
Step 2: Using the nonparametric bootstrap, estimate each parameter in model (11) from the series of forward search estimates obtained in Step 1. That is, the bootstrap estimate of $\phi_i$ in (11), denoted by $\hat{\phi}_i^{BS}$, is computed from the series $\hat{\phi}_i^{FS} = (\hat{\phi}_{i1}, \hat{\phi}_{i2}, \ldots, \hat{\phi}_{ik})$, $i = 1, \ldots, p$, and the bootstrap estimate of $\theta_j$ in (11), denoted by $\hat{\theta}_j^{BS}$, is computed from the series $\hat{\theta}_j^{FS} = (\hat{\theta}_{j1}, \hat{\theta}_{j2}, \ldots, \hat{\theta}_{jk})$, $j = 1, \ldots, q$. Also compute the Monte Carlo variance and a $(1-\alpha)100\%$ bootstrap confidence interval (BCI) for each parameter in model (11). When the segments are independent blocks, the block bootstrap is used; if the segments are overlapping, the AR-sieve is used.

When data perturbations or structural changes are localized in certain blocks of observations, the forward search part should be able to identify the underlying model structure from the majority of the blocks. Unlike in the independent-data case, where the sample size increases during the forward search algorithm, here the block size is kept constant and the search is applied to the blocks. If the block size were progressively increased under the premise that a structural change has occurred, erratic parameter estimates of the ARIMA model could be expected. It is advantageous to maintain a constant block size so that the specific segments where structural change occurred can be isolated from the rest of the segments; hence, robustness can be achieved.

The application of the nonparametric bootstrap helps filter out the effect of structural change in certain localized segments. A temporary structural change should not prominently influence the "majority" of the blocks, and the bootstrap can therefore produce a more stable version of the series of forward searched estimates.

Bootstrap Procedure for Independent Blocks

The bootstrap procedure for independent blocks uses the series of estimates obtained from the forward search. For each series of FS estimates, let $\theta$ denote the parameter being estimated (generically, one of the parameters of model (11)).

(i) Resample with replacement from the series of estimates obtained for the blocks.
(ii) Compute the mean of the resampled estimates, $\hat{\theta}^{(j)} = \frac{1}{k}\sum_{l=1}^{k}\hat{\theta}_l^{*}$.
(iii) Repeat steps (i) and (ii) B (replication size) times, where B is large.
(iv) Compute the bootstrap estimate $\hat{\theta}^{BS} = \frac{1}{B}\sum_{j=1}^{B}\hat{\theta}^{(j)}$, the Monte Carlo variance $\frac{1}{B-1}\sum_{j=1}^{B}\left(\hat{\theta}^{(j)} - \hat{\theta}^{BS}\right)^2$, and the BCI.

Bootstrap Procedure for Overlapping Blocks (AR-Sieve)

For overlapping blocks there is dependence between blocks, and hence the forward search estimates obtained from these blocks are also not independent. The nonparametric bootstrap relies heavily on the assumption that the values being resampled are independent. Thus, in order to apply the nonparametric bootstrap to the series of forward search estimates from overlapping blocks, the dependencies between the estimates should first be removed. The AR-sieve is implemented to obtain a new series of forward search estimates that are independent. The AR-sieve procedure follows. For each block:

(i) Estimate the parameters in model (11) simultaneously and obtain the residuals.
(ii) Generate a new residual series $a_t^*$ with mean 0 and variance equal to the mean square error (MSE) of the residuals in (i).
(iii) Choose starting values, e.g., equal to zero, and generate a new series $y_t^*$ based on the estimates in (i) and the new residual series $a_t^*$ in (ii).
(iv) Estimate the model using the generated series $y_t^*$ in (iii) and store the parameter estimates.
(v) Repeat steps (iii) and (iv) B times, where B is a large number.
(vi) Compute the mean of each parameter estimate over the B fitted models; this is the forward search estimate for the block.
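As a companion to Step 2, the following is a minimal sketch of the bootstrap applied to one series of forward search estimates, i.e., steps (i)-(iv) for independent blocks, reporting the percentile BCI of Section 2.3. The function name, the default B, and the example values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def bootstrap_fs_estimates(fs_series, B=1000, alpha=0.05, rng=None):
    """Nonparametric bootstrap of a series of forward search estimates (independent blocks).

    Returns the bootstrap estimate, its Monte Carlo variance, and the
    (1 - alpha)100% percentile bootstrap confidence interval (BCI).
    """
    rng = np.random.default_rng() if rng is None else rng
    fs_series = np.asarray(fs_series)
    k = len(fs_series)
    means = np.empty(B)
    for j in range(B):
        resample = rng.choice(fs_series, size=k, replace=True)   # step (i)
        means[j] = resample.mean()                               # step (ii)
    estimate = means.mean()                                      # step (iv)
    mc_var = means.var(ddof=1)                                   # Monte Carlo variance
    bci = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return estimate, mc_var, bci

# Example with a hypothetical series of per-block phi_1 estimates (two contaminated blocks).
phi_fs = [0.52, 0.47, 0.55, 0.93, 0.49, 0.51, 0.46, 0.50, 0.88, 0.53]
est, var, (lo, hi) = bootstrap_fs_estimates(phi_fs, rng=np.random.default_rng(0))
print(f"phi_BS = {est:.3f}, MC variance = {var:.4f}, 95% BCI = ({lo:.3f}, {hi:.3f})")
```

The same routine is reused for overlapping segments once the per-block AR-sieve smoothing above has produced an (approximately) independent series of forward search estimates.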
After exhausting all the blocks, the nonparametric bootstrap procedure used for independent blocks is applied to the forward search estimates obtained via the AR-sieve.

3.2 Phase II: A Procedure for Identifying and Modeling Structural Change

In this section, a procedure for identifying and modeling the segments with structural change is presented. The procedure uses the BCI to detect segments with perturbation. Once these segments are identified, the model estimated in Phase I is adjusted so that the resulting model represents the underlying structure as well as the structural changes in the series. The basic steps of the procedure follow.

Step 1: Using the BCI for $\phi_i$ obtained in Step 2 of Section 3.1, compare the individual elements $\hat{\phi}_{il}$, $l = 1, \ldots, k$, of the vector $\hat{\phi}_i^{FS} = (\hat{\phi}_{i1}, \hat{\phi}_{i2}, \ldots, \hat{\phi}_{ik})$ with the BCI. Identify the segments where $\hat{\phi}_{il}$ falls outside the BCI and, for each such segment, compute the difference $\delta_{im} = \hat{\phi}_{im} - BCI_l$, where m indexes a segment with $\hat{\phi}_{im}$ outside the BCI for $\phi_i$ and $BCI_l$ is the BCI limit nearest to $\hat{\phi}_{im}$. The idea is that a segment with estimate(s) outside the BCI indicates the presence of a perturbation in that particular segment. Thus, this step uses the BCI as a device for detecting structural changes in the time series.

Step 2: The adjusted estimate of $\phi_i$, denoted by $\tilde{\phi}_i$, is $\tilde{\phi}_i = \hat{\phi}_i^{BS} + \sum_m \delta_{im} I_t(t_m)$, where $I_t(t_m) = 1$ if time point t belongs to segment m and 0 otherwise. Apply this procedure to all the parameters in model (11) to generate the adjusted parameter estimates

$$\tilde{\phi}_i = \hat{\phi}_i^{BS} + \sum_m \delta_{im} I_t(t_m) \quad \text{for } \phi_i,\ i = 1, \ldots, p,$$

and

$$\tilde{\theta}_j = \hat{\theta}_j^{BS} + \sum_m \delta_{jm} I_t(t_m) \quad \text{for } \theta_j,\ j = 1, \ldots, q.$$

These parameter estimates yield the adjusted model, which represents both the general structure of the time series and the perturbations in the series.

4. Simulation Studies

The proposed procedure was illustrated and evaluated through a simulation study. Simulated data were generated from AR(1), MA(1), and ARMA(1,1) processes with structural change embedded at the following locations in the series: start; middle; end; start and middle; start and end; middle and end; and start, middle, and end.

Structural change was embedded by simply replacing the selected segments (start, middle, or end) of the uncontaminated series with observations generated from a model of the supposed structural change. For example, consider a simulated series $X_t$, $t = 1, \ldots, 100$, and suppose a 5-time-point structural change is to be embedded at the start and end of the series. To incorporate the perturbation, generate $X_1^*, \ldots, X_5^*$ and $X_{96}^*, \ldots, X_{100}^*$ from the structural change model, then replace $X_1, \ldots, X_5$ and $X_{96}, \ldots, X_{100}$ with $X_1^*, \ldots, X_5^*$ and $X_{96}^*, \ldots, X_{100}^*$, respectively. The new series, treated as the realization, is thus $X_1^*, \ldots, X_5^*, X_6, \ldots, X_{95}, X_{96}^*, \ldots, X_{100}^*$ (a short sketch of this embedding scheme is given after the list of segmentations below).

Short and long series were considered in the evaluation. For the long series, different lengths of embedded perturbation were used to monitor the effect of these occasional perturbations on the model parameter estimates. A comparison between bootstrap estimates using moving (overlapping) blocks and independent blocks was made. The forward search was implemented for the following segmentations:

(i) Long series (130 years of monthly data)
a. 10-year independent data segments
b. the first 10 years, then adding and deleting 5 years at a time
c. the first 10 years, then adding and deleting one year at a time
(ii) Short series (5 years of monthly data)
a. 1-year independent data segments
b. the first year, then adding and deleting one month at a time

Thus, two cases of overlapping segments were compared: one moves the segment by a relatively short period and the other by a relatively long block.
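The following is a minimal sketch of the embedding scheme described above, assuming an AR(1) base model and an AR(1) structural-change model of the kind listed in Tables 1 and 2; the function names and the positions of the contaminated segments are illustrative.

```python
import numpy as np

def simulate_ar1(n, mu, phi, rng, sigma=1.0):
    """Simulate (1 - phi*B)(Y_t - mu) = a_t with a_t ~ N(0, sigma^2)."""
    y = np.full(n, mu)
    for t in range(1, n):
        y[t] = mu + phi * (y[t - 1] - mu) + sigma * rng.standard_normal()
    return y

def embed_structural_change(base, change_segments, mu_c, phi_c, rng):
    """Replace the given index ranges of the base series with realizations
    from the structural-change model (temporary structural change)."""
    y = base.copy()
    for start, stop in change_segments:
        y[start:stop] = simulate_ar1(stop - start, mu_c, phi_c, rng)
    return y

rng = np.random.default_rng(2009)
n = 1560                                              # 130 years of monthly data
base = simulate_ar1(n, mu=10.0, phi=0.5, rng=rng)     # underlying stationary AR(1)
# illustrative five-year (60-point) contaminations near the start, middle, and end
segments = [(0, 60), (750, 810), (1500, 1560)]
contaminated = embed_structural_change(base, segments, mu_c=13.0, phi_c=0.8, rng=rng)
print(contaminated.shape, round(base.mean(), 2), round(contaminated.mean(), 2))
```

The structural-change parameters mu_c = 13 and phi_c = 0.8 mirror the AR(1) structural change model of Table 2; outside the chosen segments the realization is exactly the base series.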
A span of 1 year, 5 years, and 10 years of structural perturbation was embedded in the long series, while a span of 3 months was considered in the short series.

The evaluation of the proposed procedure involved examining the robustness of the model estimates vis-à-vis conditional least squares under the following scenarios: (i) near nonstationarity and near noninvertibility, (ii) location and length of the perturbation, (iii) length of the series, and (iv) bootstrap method. A model is considered robust if the parameter estimates are sensitive to the overall memory pattern of the time series but insensitive to occasional outliers (Chen and Liu, 1993). Robustness means that the parameter estimates should be close to the underlying overall model parameters even in the presence of perturbations.

The long series consists of 1560 values (130 years of monthly data, or less than 5 years of daily data) and the short series of 60 values (5 years of monthly data, or approximately 3 months of daily data). The models from which data were simulated are summarized in Table 1.

[Table 1 Here]

All the processes are stationary and invertible. However, investigations were also done near the boundaries of nonstationarity and noninvertibility. Table 2 shows the structural change embedded in the series generated from each model in Table 1. After embedding these perturbations, the (conditional) least squares (CLS) estimates already showed significant deviations from the true model parameters even with just a few contaminated points, illustrating the data-sensitive character of ARIMA models.

[Table 2 Here]

4.1 Parameter Estimates

Table 3 summarizes the parameter estimates of the AR(1) models $(1 - 0.5B)(Y_t - 10) = a_t$ (stationary) and $(1 - 0.95B)(Y_t - 10) = a_t$ (near nonstationary), each with temporary structural change at the start, middle, and end of the series. For the long stationary series, the bootstrap estimates (BS1, BS2, and BS3) of $\phi$ were much closer to the true value than the CLS estimates, while the estimates of the mean $\mu$ did not differ much between the two procedures. However, for the near nonstationary long series, almost all the CLS estimates of the mean and of $\phi$ were nearer the true values. This is because slicing the time series into blocks can highlight the near nonstationary behavior, or even result in nonstationarity, within each block.

For the short series, the estimates from the proposed procedure were evidently not reliable because of the high absolute percent differences (Table 4). The bootstrap estimate of $\phi$ was off by a large amount, which is explained by the fact that the proposed algorithm segments the time series into blocks. With a short time series, the overall dependence structure is hardly passed on to the individual blocks of even shorter length. Thus, with the introduction of perturbations, the blocks produced from a short time series can give an erratic picture of the dependence structure of the series.

[Table 3 Here]
[Table 4 Here]

Similar results were obtained for the MA(1) models. For the invertible long series [$(Y_t - 10) = (1 - 0.5B)a_t$], the estimates of $\theta$ from the new procedure yield smaller (absolute) percent differences compared to those obtained from CLS, even with many points of contamination.
The estimates using the proposed procedure and the CLS were similar for the near noninvertible long series [ (Yt 10) (1 0.95B)a t ], see Table 5 for details. The estimates 19 were relatively close to the true value for both procedures. Like in the case of AR(1), the proposed procedure did not work well for short series (Table 6 ) [Table 5 Here] [Table 6 Here] The parameter estimates in Table 7 proved further the advantage of the proposed procedure with stationary and invertible ARMA(1,1) [ (1 0.4 B)(Yt 10) (1 0.5)a t ]. The estimates of the mean, and were closer to the true values compared to the CLS estimates. However, in the case of near nonstationary and near noninvertible long series [ (1 0.95 B)(Yt 10) (1 0.95)at ], the CLS method performed better, see Table 7 for details. Like in the case of AR(1) and MA(1), the estimates for the short ARMA(1,1) series were also not reliable (Table 8). [Table 7 Here] [Table 8 Here] 4.2 Effect of Near Nonstationarity and Near Noninvertibility In the case of AR(1), being near nonstationarity caused the estimates of the mean to be unstable using the proposed procedure. The estimates of however, were still robust (Table 3). Near nonstationarity has greater effect on the estimates of the mean than of for AR(1) models, both for long and short time series. Majority of the blocks of near nonstationary series can not represent the overall picture of the time series and hence, the bootstrap estimates failed to capture the underlying structure. For the MA(1) models shown in Tables 5 and 6, the degree of invertibility does not really affect the estimates produced using the proposed procedure. Even if the series is near 20 noninvertible, the estimates are quite robust. The independent block method is capable further of filtering the perturbations localized in each segment, that could have blended into the noninvertibility character of MA(1). The results for the ARMA(1,1) presented in Table 7 showed that near nonstationary and almost noninvertible series produced unstable estimates of the mean with the proposed procedure. However, the estimates of and remain robust. Near nonstationarity of the time series can greatly influence the optimality of ARIMA modeling which relies heavily on the dependence structure of the data. Blocks defined from near nonstationary series can highlight more the nonstationary behavior of the time series. Some, if not majority of the blocks can even exhibit nonstationarity. Thus, bootstrapping estimates from these blocks will not give optimal results. Stationarity implies that any time series segment of the same length should exhibit similar dependence structure regardless of the section of the time series it is extracted from. The proposed method benefits from this since the algorithm explicitly partitions the time series into blocks of constant length and estimates of the model parameters are generated from each block. Furthermore, any temporary structural change occurring in the time series can be localized only in some blocks. Other blocks that are not affected by these perturbations should be able to clearly characterize the dependence structure of the time series. 4.3 Effect of Location and Length of Structural Change The location and length of the perturbation significantly affected estimates obtained from CLS. Longer length of contamination can produce greater deviation of the estimated model from the true model and occasional perturbations can introduce bias in model-fitting. 
This was not the 21 case for the estimates obtained using the proposed procedure, the parameter estimates were relatively closer to the true model even with longer contamination length compared to the CLS estimates. This confirms the robustness of the estimates obtained from the blended resamplingforward search estimates. This is true only for stationary and relatively longer time series. Moreover, the location of the contamination did not really matter when the proposed estimation procedure is used. The estimates obtained were robust whether the contaminations were at the start, middle, end, or a combination. 4.4 Overlapping Blocks vs. Independent Blocks The bootstrap procedure using overlapping and independent segments (BS1, BS2, and BS3) produced similar estimates for the long series, in general. Results from overlapping segments moving by short blocks or by relatively long blocks were also similar. However, note that there seemed to be an advantage of using non-overlapping segments against overlapping segments when the series is near nonstationary [ (1 0.95B )(Yt 10) a t and (1 0.95B)(Yt 10) (1 0.95)a t ]. The estimates produced by bootstrapping independent blocks (BS3), especially of the mean, were more robust. There are only few blocks that can be formed if they are non-overlapping compared to overlapping one. Thus, logically, if the series is near nonstationary, fewer blocks can emphasize the near nonstationarity (or even nonstationarity) condition of the complete series. 4.5 Comparison between CLS and the Proposed Method The estimates obtained using the proposed procedure were more robust compared to the least square estimates that uses the entire time series at once when the series was long and stationary/invertible. The new procedure was able to produce better estimates than CLS in terms 22 of sensitivity to the overall structure and insensitivity to occasional structural changes provided that the series is long and stationary/invertible. Simulation results for the series from AR(1), MA(1), and ARMA(1,1) showed that CLS estimates were more sensitive to structural changes in time series compared to bootstrap estimates. In the case of short time series data, the implementation of the proposed procedure did not yield desirable results. The estimates obtained were neither robust nor stable, percent difference of estimates from true values and the standard errors were relatively large. CLS estimates were better than the bootstrap estimates for short series. The CLS took advantage of using the whole time series in estimating an ARIMA model at once. The poor estimates obtained using the proposed procedure can be attributed to the very small number of observations involved in the estimation per block. The dependence structure in a block can be entirely different from the global dependence structure estimated in an ARIMA model. Using few data points yield poor estimates in terms of robustness and stability, in general. Another reason is that, altering or contaminating a short series can completely destroy the general structure of the data unlike in the case of the long series where the general structure is preserved. Forecast accuracy measured by MAPE was also analyzed. Both estimation procedures yield comparable results. Despite the robustness of the bootstrap estimates over the CLS estimates, there was no significant gain in forecasting performance. In fact, there were many cases wherein the MAPE of the CLS models were lower than those of the proposed procedure. 
One reason for this is the insensitivity of the bootstrap estimates to the perturbations introduced in the series. The bootstrap estimates can be highly penalized in the segments where structural change occurred. Forecast based on CLS models outperformed those coming from the models 23 estimated using the proposed procedure in the segments with structural change. However, in the segments without structural change, the proposed procedure outperformed the CLS, especially for the stationary (and invertible) long time series. 4.6 Identifying and Modeling Structural Change In this section, the Phase II procedure is illustrated using the simulated series with five years perturbation. A 95% BCI for the mean is (9.96, 10.50) and a 95% BCI for is (0.46, 0.68). The Forward Search estimates are presented in Table 9 below. [Table 9 Here] This leads to the following adjusted estimates of the model parameters: For : 10.20 + 1.121I (1,...,120 ) (t ) - 0.03 I ( 241,..., 360 ) (t ) - 0.24 I ( 361,..., 480 ) (t ) - 0.38 I ( 721,..., 840 ) (t ) - 0.13 I (841,..., 960 ) (t ) + 0.51 I (1441,...,1560 ) (t ) For : 0.57 + 0.131I (1,...,120 ) (t ) - 0.11 I ( 241,..., 360 ) (t ) - 0.05 I ( 481,..., 600 ) (t ) - 0.08 I ( 601,..., 720 ) (t ) + 0.28 I ( 721,..., 840 ) (t ) - 0.06 I (1201,...,1320 ) (t ) + 0.25 I (1441,...,1560 ) (t ) Note that the true structural change occurred at time points 1-120, 721-840, and 1441-1560. The procedure did detect the occurrence of the perturbation. However, there were also points that were misclassified. While the method is highly sensitive to detect structural change, it is not specific because of the confounding effect of the temporary shocks on the general behavior of the time series. Even after the implementation of the procedure, there was still a need to subjectively classify whether there was an occurrence of structural change in the series or none. For example, in Table 9, if only the top 3 highest distance from BCI were considered, then the method had correctly identified the contaminations in the series. 24 Tables 10-12 present a closer examination of the sensitivity of the BCI in identifying structural changes in the selected series from the simulation study. Recall that when the FS estimate(s) for a particular segment is outside a BCI, it was proposed that structural change should have occurred in that segment. The method of detecting structural change using a BCI is highly sensitive to the structural change induced in the simulated data. With 95% coverage probability, all the segments were considered as having structural change. Note that only a small fraction of the series was embedded with perturbations but the method indicated otherwise. Relaxing the coverage probability to 99%, BCI can reduce the number of misclassified segments; however, the overall sensitivity is still very high. The extreme sensitivity of the method is a consequence of the high level of accuracy exhibited by bootstrap confidence intervals, i.e., the bootstrap confidence intervals have narrow width. Thus, even estimates from segments without structural change are very vulnerable of falling outside a BCI. [Table 10 Here] [Table 11 Here] [Table 12 Here] 5. Conclusions The simulation study confirmed the bias in estimating parameters of a time series model when different perturbations are introduced to the parameters. The perturbations resemble temporary structural change in the model that will revert back to the original behavior after a short period. 
These contaminations can conceal the underlying general structure of the time series and may result to poor estimation. Bootstrapping of estimates obtained through forward search algorithm is proven to reveal the underlying model. 25 The estimation procedure yields robust estimates of ARIMA models with temporary structural change generated from AR(1), MA(1), and ARMA(1,1) processes. The estimates obtained from the nonparametric bootstrap of a series of estimates from the modified forward search algorithm are superior over the estimates obtained using the conditional least squares in terms of robustness and capturing the overall structure of the data provided that the time series is relatively long, stationary, and invertible. The length of the times series is an important prerequisite for the optimality of the proposed method because of the segmentation into blocks of equal length. The global pattern of the time series is easily passed on to the local behavior of the blocks when the time series is longer. Stationarity is also important since it implies that the overall state of dependence structure is preserved within each block. High-frequency data that is vulnerable to temporary shocks and short-term structural changes can benefit from the proposed estimation procedure. If the interest is to unveil the underlying model, free from any temporary shocks, then a forward search-nonparametric bootstrap algorithm proposed in this paper should be able to produce robust and stable estimates of parameters of such models. The proposed procedure performed poorly when the series is relatively short. Segmenting the short time series into blocks will tend to aggregate the bias brought about by the temporary structural change. Because of the data-sensitivity of ARIMA modeling procedures, short time series cut into even shorter blocks will become highly vulnerable to any data point that deviates from the underlying model. Similar arguments can be made on the poor performance of the procedure on near nonstationary models. If the time series (whether short or long) is cut into shorter blocks, near nonstationarity can further be highlighted in each block. When there is already near nonstationary behavior in the global scenario, the localized nonstationarity can be easily observed. Thus, estimation of an ARIMA model will suffer at the block level. 26 The forecast accuracy of the new procedure is comparable to the CLS. The models obtained using the new procedure can outperform the CLS models in the uncontaminated segments. Thus, if the interest is forecasting future values that would represent the true underlying model, then the proposed procedure can provide more accurate results. References Atkinson, A.C., Riani, M. (2000). Robust Diagnostic Regression Analysis. New York : SpringerVerlag. Bühlmann, P. (1997). Sieve bootstrap for time series. Bernoulli 3:123-148. Bühlmann, P. (2002). Bootstraps for Time Series, Statistical Science 17:52-72. Cerioli, A., Riani, M., Atkinson, A. C. (2007). Clustering Contiguous or Categorical Data with the Forward Search. Bull. of the 56th Meeting of the ISI, International Statistical Institute, Portugal. Chen, C. and Liu, L. (1993). Joint Estimation of Model Parameters and Outlier Effects in Time Series. J. of the American Stat. Assoc. 88:284-297. Davison A. C., Hinkley, D. V. (1997). Bootstrap methods and their application. Cambridge U.K.: Cambridge University Press. Diciccio, T. J., and Romano, J. P. (1988). A review of boostrap confidence intervals (with discussion). J. 
of the Royal Stat. Soc. Ser. B 50: 338-370. Efron, B. (1982). The Jacknife, the Bootstrap, and Other Resampling Plans. Philadelphia: Society for Industrial and Applied Mathematics. Konis, K., Laurini, M. (2007). Fitting a Forward Search in Linear Regression. Bull. of the 56th Meeting of the ISI, International Statistical Institute, Portugal. Mooney, C.Z. and R.D. Duval (1993). Bootstrapping: A Nonparametric Approach to Statistical Inference. Sage Publications. Riani, M. (2004). Extensions of the Forward Search to Time Series. Linear and Nonlinear Dynamics in Time Series 8(Article 2). Stine, R. A. (1990). An introduction to bootstrap methods. Soc. Methods and Res. 18:243-291. 27 Tsay, R. S. (1986). Time Series Model Specification in the Presence of Outliers. J. of the American Stat. Assoc. 81:132-141. Tsay, R. S. (2000). Time Series and Forecasting: Brief History and Future Research. J. of the American Stat. Assoc. 95:638-643. Table 1 Models Considered in the Simulation. Process AR(1) MA(1) Model Nature of the Process (1 0.5 B)(Yt 10) a t Stationary (1 0.95B )(Yt 10) a t Near Nonstationary (Yt 10) (1 0.5B)a t Invertible (Yt 10) (1 0.95B)a t Near Noninvertible 28 ARMA(1,1) (1 0.4 B)(Yt 10) (1 0.5 B)a t (1 0.95B )(Yt 10) (1 0.95B)a t Stationary and Invertible Near Nonstationary and Near Noninvertible Note: a t ~ N (0,1) for all the models. Table 2 Structural Change Models Process Structural Change Model AR(1) (1 0.8 B)(Yt 13) a t MA(1) (Yt 13) (1 0.8B )a t ARMA(1,1) (1 0.8 B)(Yt 13) (1 0.8 B)a t Note: a t ~ N (0,1) for all the models. Table 3 Parameter Estimates for the Long AR(1) Series With Temporary Structural Change (S.C) at the Start, Middle, and End of the Series. (1 0.5 B)(Yt 10) a t Method mean (s.e.) % diff phi1 (s.e.) % diff (1 0.95B)(Yt 10) a t MAPE MSE 20.59 8.01 1.01 0.44 8.05 1.03 3.16 8.05 1.02 mean (s.e.) % diff phi1 (s.e.) % diff MAPE MSE 0.03 9.84 0.99 2.84 10.21 1.24 3.26 9.98 1.11 (a) 1 year temporary structural change CLS BS1 BS2 10.16 (0.064) 10.10 (0.017) 10.14 (0.041) 1.59 0.98 1.37 0.60 (0.02) 0.50 (0.008) 0.52 (0.018) 10.15 (0.452) 4.08 (3.012) 6.21 (3.353) 1.48 59.19 37.88 0.95 (0.008) 0.92 (0.006) 0.92 (0.014) 29 BS3 10.18 (0.057) 1.77 0.55 (0.03) 9.90 8.05 1.01 10.42 (0.861) 4.22 0.94 (0.017) 1.11 9.95 0.99 57.38 8.38 1.13 4.78 9.07 1.11 8.61 1.32 4.71 9.11 1.11 7.13 8.60 1.32 4.90 9.14 1.11 13.48 8.48 1.28 0.93 (0.01) 0.91 (0.006) 0.90 (0.011) 0.91 (0.016) 2.34 6.68 10.48 (0.35) 10.06 (0.216) 10.21 (0.395) 9.87 (0.614) 4.21 9.07 1.11 61.42 8.38 1.21 16.90 9.30 0.98 8.92 1.47 2.67 9.56 0.99 3.14 9.11 1.51 3.92 9.75 1.00 4.83 9.08 1.49 0.95 (0.008) 0.92 (0.006) 0.91 (0.015) 0.92 (0.025) 0.29 7.75 11.69 (0.471) 11.49 (0.301) 11.85 (0.888) 10.59 (1.195) 3.37 9.53 1.00 (b) 5 years temporary structural change CLS BS1 BS2 BS3 10.50 (0.126) 10.36 (0.093) 10.34 (0.188) 10.20 (0.145) 5.00 3.62 3.38 2.02 0.79 (0.016) 0.53 (0.017) 0.54 (0.04) 0.57 (0.054) 0.57 2.11 1.27 (c) 10 years temporary structural change CLS BS1 BS2 BS3 10.76 (0.143) 10.51 (0.101) 10.64 (0.223) 10.67 (0.36) 7.61 5.06 6.42 6.68 0.81 (0.015) 0.54 (0.018) 0.52 (0.037) 0.52 (0.043) 14.87 18.53 5.94 Note: CLS – Conditional Least Squares, BS1 – overlapping segments moving a length of 1 year BS2 – overlapping segments moving a length of 5 years, BS3 – non-overlapping 10-year block Table 4 Parameter Estimates for the Short AR(1) Series Simulated with Temporary Structural Change (S.C) at the Start, Middle, and End of the Series. (1 0.5 B)(Yt 10) at (1 0.95B )(Yt 10) a t Method ̂ (s.e.) 
CLS BS1 BS2 10.26 (0.239) 9.92 (0.217) 10.32 (0.206) % diff 2.60 0.83 3.22 ˆ % diff MAPE MSE (s.e.) 0.33 (0.124) 0.09 (0.027) 0.12 (0.19) 34.93 9.10 1.54 82.64 9.11 1.72 75.68 9.64 1.61 ̂ (s.e.) 9.74 (0.842) 16.23 (2.12) 10.25 (0.559) % diff ˆ % diff MAPE MSE 7.12 8.53 1.22 52.74 37.16 13.43 36.32 10.81 1.58 (s.e.) 2.62 62.32 2.49 0.88 (0.065) 0.45 (0.031) 0.60 (0.127) Note: CLS – Conditional Least Squares, BS1 – overlapping segments moving a length of 1 month, BS2 – nonoverlapping 1-year block 30 Table 5 Parameter Estimates for the Long MA(1) Series Simulated with Temporary Structural Change (S.C) at the Start, Middle, and End of the Series. (Yt 10) (1 0.5B)a t (Yt 10) (1 .95 B)a t Method mean (s.e.) % diff theta1 (s.e.) % diff MAPE MSE 1.31 8.60 1.22 7.37 8.57 1.22 5.74 8.58 1.22 2.46 8.59 1.22 29.82 9.33 1.47 10.67 9.23 1.51 11.67 9.26 1.50 13.74 9.36 1.49 38.44 9.86 1.68 17.24 9.79 1.74 17.91 9.85 1.73 14.81 9.99 1.73 mean (s.e.) % diff theta1 (s.e.) 10.07 (0.052) 10.01 (0.019) 10.03 (0.037) 10.08 (0.045) 0.67 10.35 (0.055) 10.17 (0.047) 10.23 (0.111) 10.35 (0.179) 3.45 10.75 (0.059) 10.57 (0.083) 10.65 (0.215) 10.73 (0.361) 7.52 % diff MAPE MSE -0.91 (0.011) -0.88 (0.005) -0.88 (0.012) -0.89 (0.02) 4.60 8.62 1.17 7.27 8.63 1.17 7.68 8.65 1.17 6.80 8.67 1.17 -0.93 (0.01) -0.88 (0.003) -0.88 (0.007) -0.91 (0.015) 2.43 8.73 1.27 7.35 8.72 1.29 7.11 8.74 1.28 4.54 8.77 1.27 -0.90 (0.011) -0.87 (0.005) -0.88 (0.012) -0.90 (0.02) 5.64 9.31 1.51 7.99 9.23 1.52 7.55 9.27 1.51 5.30 9.29 1.51 (a) 1 year temporary structural change CLS BS1 BS2 BS3 10.13 (0.042) 10.10 (0.015) 10.12 (0.038) 10.13 (0.058) 1.32 1.00 1.20 1.31 -0.51 (0.022) -0.46 (0.008) -0.47 (0.018) -0.49 (0.027) 0.11 0.35 0.81 (b) 5 years temporary structural change CLS BS1 BS2 BS3 10.25 (0.051) 10.10 (0.044) 10.15 (0.106) 10.25 (0.185) 2.53 1.00 1.47 2.49 -0.65 (0.019) -0.55 (0.011) -0.56 (0.028) -0.57 (0.048) 1.68 2.25 3.46 (c) 10 years temporary structural change CLS BS1 BS2 BS3 10.70 (0.056) 10.49 (0.079) 10.58 (0.197) 10.69 (0.341) 6.98 4.93 5.80 6.91 -0.69 (0.018) -0.59 (0.011) -0.59 (0.024) -0.57 (0.036) 5.71 6.51 7.30 Note: CLS – Conditional Least Squares, BS1 – overlapping segments moving a length of 1 year BS2 – overlapping segments moving a length of 5 years, BS3 – non-overlapping 10-year block Table 6 Parameter Estimates for the Short MA(1) Series Simulated with Temporary Structural Change (S.C) at the Start, Middle, and End of the Series. (Yt 10) (1 0.5B)a t (Yt 10) (1 .95 B)a t Method ̂ (s.e.) CLS BS1 BS2 10.36 (0.289) 10.04 (0.071) 10.35 (0.272) % diff 3.63 0.44 3.47 ˆ % diff MAPE MSE (s.e.) -0.41 (0.122) -0.25 (0.039) -0.42 (0.163) 17.66 12.36 2.46 49.86 12.34 2.62 16.72 12.33 2.46 ̂ (s.e.) 10.54 (0.268) 10.43 (0.082) 10.72 (0.317) % diff 5.44 4.26 7.18 ˆ % diff MAPE MSE 41.08 10.07 1.75 42.74 10.02 1.76 29.07 10.01 1.80 (s.e.) -0.56 (0.109) -0.54 (0.03) -0.67 (0.137) 31 Table 7 Parameter Estimates for the Long ARMA(1,1) Series Simulated with Temporary Structural Change (S.C) at the Start, Middle, and End of the Series. (1 0.4 B)(Yt 10) (1 0.5)a t Method mean (s.e.) phi1 (s.e.) % diff % diff theta1 (s.e.) 
% diff MAPE1 MAPE2 -0.52 (0.028) -0.53 (0.01) -0.54 (0.023) -0.55 (0.039) 4.88 13.53 8.04 6.27 14.03 8.01 7.59 13.85 8.03 9.55 13.63 8.04 -0.41 (0.028) -0.50 (0.01) -0.49 (0.02) -0.49 (0.026) 17.78 8.07 8.44 0.91 10.03 8.21 1.52 10.07 8.18 2.76 9.74 8.25 -0.40 (0.026) -0.52 (0.012) -0.55 (0.029) -0.56 (0.045) 20.04 7.24 8.66 4.05 9.13 8.47 9.89 8.58 8.84 12.49 7.83 9.45 (a) 1 year temporary structural change CLS BS1 BS2 BS3 10.01 (0.067) 9.96 (0.017) 9.99 (0.044) 10.02 (0.068) 0.06 0.42 (0.03) 0.35 (0.012) 0.37 (0.023) 0.39 (0.039) 0.42 0.08 0.22 5.79 11.45 8.19 3.27 (b) 5 years temporary structural change CLS BS1 BS2 BS3 10.37 (0.109) 10.21 (0.069) 10.16 (0.099) 10.28 (0.172) 3.70 0.65 (0.023) 0.42 (0.017) 0.43 (0.035) 0.44 (0.052) 2.13 1.56 2.83 63.14 5.81 7.30 11.02 (c) 10 years temporary structural change CLS BS1 BS2 BS3 10.85 (0.15) 10.55 (0.13) 10.87 (0.405) 11.35 (0.73) 8.47 0.74 (0.02) 0.45 (0.02) 0.45 (0.049) 0.47 (0.066) 5.52 8.69 13.47 84.55 12.13 11.85 18.59 Note: CLS – Conditional Least Squares, BS1 – overlapping segments moving a length of 1 year BS2 – overlapping segments moving a length of 5 years, BS3 – non-overlapping 10-year block Table 7 (Cont.) Parameter Estimates for the Long ARMA(1,1) Series Simulated with Temporary Structural Change (S.C) at the Start, Middle, and End of the Series. (1 0.95B)(Yt 10) (1 0.95)a t Method mean (s.e.) % diff phi1 (s.e.) % diff theta1 (s.e.) % diff MAPE1 MAPE2 31.11 16.51 91.22 2.11 36.69 99.71 6.34 30.93 96.43 (a) 1 year temporary structural change CLS BS1 BS2 11.23 (0.617) 13.40 (1.572) 10.40 (1.002) 12.33 34.05 3.98 0.94 (0.009) 0.95 (0.002) 0.95 (0.004) 1.16 0.25 0.27 -0.65 (0.019) -0.93 (0.013) -0.89 (0.042) 32 BS3 10.51 (1.37) 5.10 0.94 (0.009) 0.56 -0.84 (0.072) 11.45 25.72 97.03 -0.61 (0.02) -0.88 (0.02) -0.88 (0.043) -0.89 (0.059) 35.78 7.13 27.58 7.59 8.26 42.55 7.45 8.66 42.66 6.17 8.89 43.54 -0.66 (0.019) -0.86 (0.019) -0.89 (0.036) -0.92 (0.023) 30.30 7.34 54.39 9.37 8.13 58.09 5.99 8.48 55.78 2.68 8.86 50.68 (b) 5 years temporary structural change CLS BS1 BS2 BS3 12.24 (0.678) 13.56 (1.844) 11.06 (0.957) 10.73 (1.132) 22.43 35.56 10.58 7.31 0.95 (0.008) 0.95 (0.002) 0.95 (0.005) 0.94 (0.013) 0.38 0.11 0.13 1.14 (c) 10 years temporary structural change CLS BS1 BS2 BS3 10.55 (0.595) 8.93 (0.6) 9.09 (1.079) 9.89 (1.558) 5.54 10.69 9.11 1.10 0.94 (0.009) 0.94 (0.006) 0.91 (0.022) 0.86 (0.038) 1.54 1.54 3.83 8.95 Note: CLS – Conditional Least Squares, BS1 – overlapping segments moving a length of 1 year BS2 – overlapping segments moving a length of 5 years, BS3 – non-overlapping 10-year block Table 8 Parameter Estimates for the Short ARMA(1,1) Series Simulated with Temporary Structural Change (S.C) at the Start, Middle, and End of the Series. (1 0.4 B)(Yt 10) (1 0.5)a t Method ̂ (s.e.) CLS BS1 BS2 10.70 (0.47) 9.88 (0.255) 11.03 (0.452) % diff ˆ % diff (s.e.) 6.98 1.20 10.26 0.55 (0.166) 0.29 (0.03) 0.73 (0.148) 37.27 27.44 81.55 ˆ (s.e.) -0.45 (0.155) -0.32 (0.031) -0.13 (0.317) % diff MAPE MSE 9.08 8.55 1.30 35.99 9.03 1.57 74.18 8.68 1.38 33 (1 0.95B)(Yt 10) (1 0.95)a t Method ̂ (s.e.) CLS BS1 BS2 6.55 (1.681) 10.78 (4.311) 5.32 (1.374) % diff ˆ % diff (s.e.) 34.47 7.83 46.78 0.74 (0.118) 0.43 (0.04) 0.11 (0.245) ˆ % diff MAPE MSE 83.54 94.05 9.83 46.72 190.85 15.72 43.37 145.00 13.00 (s.e.) 
22.55 55.24 88.29 -0.16 (0.165) -0.51 (0.046) -0.54 (0.192) Note: CLS – Conditional Least Squares, BS1 – overlapping segments moving a length of 1 month, BS2 – nonoverlapping 1-year block 34 Table 9 Forward Search Estimates of the Series From (1 0.5 B)(Yt 10) a t with 5 Years Contamination at the Start, Middle, and End of the Series Time Points ̂ FS Distance from CI ˆFS Distance from CI 1-120 121-240 241-360 361-480 481-600 601-720 721-840 841-960 961-1080 1081-1200 1201-1320 1321-1440 1441-1560 11.63 10.09 9.93 9.72 10.37 10.17 9.58 9.83 10.04 10.24 10.16 9.96 11.01 1.12 0.81 0.50 0.35 0.61 0.41 0.38 0.96 0.50 0.48 0.50 0.40 0.56 0.93 0.13 -0.03 -0.24 -0.38 -0.13 0.51 -0.11 -0.05 -0.08 0.28 -0.06 0.25 Table 10 Bootstrap Confidence Intervals for the Parameters of the Series Generated from (1 0.5 B)(Yt 10) a t with 10-year S.C. at the Start, Middle, and End of the Series. Method BS1 BS2 BS3 Method BS1 BS2 BS3 95% BCI for 10.3170 10.1927 10.0259 10.7174 11.1016 11.4186 99% BCI for 10.2615 10.1208 9.9272 10.8047 11.3090 11.7415 95% BCI for 0.5026 0.4507 0.4464 0.5727 0.5891 0.6176 99% BCI for 0.4926 0.4287 0.4282 0.5786 0.6139 0.6355 Proportion of segments w/ estimate(s) outside the BCI 120/121 25/25 10/13 (99.2%) (100.0%) (76.9%) Proportion of segments w/ estimate(s) outside the BCI 120/121 25/25 8/13 (99.2%) (100.0%) (61.5%) 35 Table 11 Bootstrap Confidence Intervals for the Parameters of the Series Generated from (Yt 10) (1 0.5B)a t with 10-year S.C. at the Start, Middle, and End of the Series. Method BS1 BS2 BS3 Method BS1 BS2 BS3 95% BCI for 10.3318 10.2193 10.0126 10.6532 11.0058 11.4009 99% BCI for 10.3066 10.0979 9.9606 95% BCI for -0.6103 -0.6367 -0.6554 10.6963 11.1565 11.6474 Proportion of segments w/ estimate(s) outside the BCI -0.5643 -0.5427 -0.5117 99% BCI for -0.6134 -0.6490 -0.6757 120/121 24/25 9/13 (99.2%) (96.0%) (69.2%) Proportion of segments w/ estimate(s) outside the BCI -0.5592 -0.5318 -0.4980 118/121 22/25 9/13 (97.5%) (88.0%) (69.2%) Table 12 Bootstrap Confidence Intervals for the Parameters of the Series Generated from (1 0.4 B)(Yt 10) (1 0.5 B)at with 10-year S.C. at the Start, Middle, and End of the Series. Method BS1 BS2 BS3 Method BS1 BS2 BS3 95% BCI for 10.3364 10.2274 9.9758 10.8172 11.7590 12.9525 99% BCI for 9.9178 10.0733 9.9758 13.5397 11.8561 12.9525 95% BCI for 0.4128 0.3552 0.3544 0.4891 0.5457 0.6057 99% BCI for 0.3241 0.3163 0.3544 0.6392 0.5871 0.6057 95% BCI for -0.5450 -0.6075 -0.6507 -0.4992 -0.4969 -0.4817 99% BCI for -0.6889 -0.6258 -0.6507 -0.4434 -0.4876 -0.4817 Proportion of segments w/ estimate(s) outside the BCI 121/121 25/25 12/13 (100.0%) (100.0%) (92.3%) Proportion of segments w/ estimate(s) outside the BCI 121/121 24/25 12/13 (100.0%) (96.0%) (92.3%) 36