Appendix: Finite Sample Properties of our Tests

To investigate the finite sample properties of our tests, we use four examples motivated by our empirical results. The results reported here extend existing Monte Carlo work on structural change in four ways: 1) the case where all coefficients in the VAR are allowed to break, 2) the case where additional exogenous variables are included in the regression, 3) the case where all coefficients, including those on the exogenous variables, are allowed to break, and 4) bivariate models. The four specific examples we highlight here are intended to be illustrative; the reader is referred to Bekaert, Harvey and Lumsdaine (2000) for a more systematic, comprehensive Monte Carlo investigation. The four cases that we consider (along with the reasons why they were selected) are:

Based on the empirical results, we begin by choosing Mexico for the univariate results, since Mexico is the longest series and the univariate evidence for Mexico was mixed. Recall that the mean-break-only test found no evidence of a structural break; when the test allowed all parameters to break, the no-break null was rejected; and this rejection was somewhat weakened when the world instruments were included in the specification. Thus it is important to check whether these mixed results might be the result of badly sized or low-power tests.

Using coefficients from the data generating process for Mexico, an AR(1) with no break, we investigate the size of the tests. This has been studied extensively by BLS1 and BLS2 in the case of a test for a mean break only. Sample size is 241 observations.

Using coefficients from the data generating process for Chile, an AR(1) with a mean break in July 1980, we use this example to investigate the power of our tests in the case of a simple mean break.
When the test allows for a mean break only, the results from this simulation complement those reported in BLS for this case; furthermore, this case provides an extension to testing for a break in all coefficients, which lets us investigate how much power is lost when all coefficients are allowed to break. Sample size is 239 observations.

Using coefficients from the data generating process for Colombia, an AR(1) with breaks in the mean and AR(1) coefficients in February 1992, we use this example to investigate power in three cases: a) when the test allows for a break in all coefficients, b) when the world variables are included in the regression and the test allows all noninstrument coefficients to have a break, and c) when the world variables are included in the regression and the test allows for all coefficients, including the world ones, to break. This country is the one where the influence of the world variables was the largest. Sample size is 131 observations.

Using coefficients from the data generating process for Thailand, a bivariate AR(1) with breaks in a number of coefficients including the mean in January 1980 but without significance of the world variables, we use this example to consider the finite sample properties of the tests in the case of a bivariate system. There was no evidence of a structural break in the univariate equations for Thailand. BLS2 contains extensive results concerning the case of a mean break where the test allows for only a mean break; here we extend these to the case where all coefficients are allowed to break.

All simulations are based on 5,000 replications.

Table A.1: Size (all under the null of no break) for a 5% test

                                         Univariate                Bivariate
                                    Mexico   Chile   Colombia      Thailand
No World Instruments
  Mean Break                        0.063    0.081   -             0.047
  All Coeff. Break                  0.081    0.109   0.092         0.106
World Instruments
  Mean Break                        0.080    -       -             -
  All non-instrum. coeffic. Break   0.108    -       0.149         -
  All coeff. Break                  0.145    -       0.164         -

Table A.1 gives size results at the 5 percent nominal level. While the tests are well sized when only a mean break is allowed, there is evidence of some size distortion when the test allows more coefficients to break and when world instruments are included in the regression. This size distortion disappears with larger samples but is nonetheless important to consider in our empirical results. We therefore turn to simulations under the alternative of a break for the three cases where our empirical findings suggested there was evidence of a break. Size-adjusted power, computed using the finite sample distributions underlying Table A.1, is reported in Table A.2.

Table A.2: Size-adjusted Power for a 5% test

                                                            Univariate        Bivariate
                                                         Chile   Colombia     Thailand
No World Instruments
  Mean Break; Test for Mean Break                        0.731   -            1.00
  Mean Break; Test for All Coeff. Break                  0.519   -            1.00
  Break in all Coeffic.; Test for All Coeff. Break       -       0.073        -
World Instruments
  Mean Break; Test for Mean Break                        -       0.091        -
  All coeff. Break; Test for all coeff. Break
    (including world)                                    -       0.986        -

Table A.2 demonstrates that the power of the test statistics is quite high when the model is specified correctly. Recall that the data generating process for Colombia included the world variables; hence it is not surprising that there is a substantial loss of power when these are not included and when the test only allows for a mean break (both situations correspond to model misspecification). In all other cases, the size-adjusted power is quite high, exceeding 51 percent. The loss of power in the univariate case from allowing all coefficients (in this case, just one additional one) to break when the underlying DGP only has a mean break is about 30 percent in relative terms (from 0.731 to 0.519). A similar loss of power does not seem to occur in the bivariate case.
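The logic of a size-adjusted power calculation can be illustrated with a minimal Python sketch. It is not the procedure used to produce Tables A.1 and A.2: it assumes a univariate AR(1) with standard normal errors, a sup-F statistic for a one-time intercept shift with 15% trimming, and hypothetical parameter values (rho, delta, k0, and a small number of replications). All function names are illustrative.

```python
import numpy as np

def simulate_ar1(T, mu, rho, delta=0.0, k0=None, rng=None, burn=50):
    """Simulate y_t = mu + delta*1{t >= k0} + rho*y_{t-1} + e_t, e_t ~ N(0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    e = rng.standard_normal(T + burn)
    y = np.empty(T + burn)
    prev = mu / (1.0 - rho)  # start at the pre-break unconditional mean
    for t in range(T + burn):
        shift = delta if (k0 is not None and t - burn >= k0) else 0.0
        prev = mu + shift + rho * prev + e[t]
        y[t] = prev
    return y[burn:]  # drop the burn-in observations

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def sup_f_mean_break(y, trim=0.15):
    """Sup-F statistic for an intercept shift at an unknown date (illustrative)."""
    dep, lag = y[1:], y[:-1]
    n = dep.size
    ssr0 = ssr(np.column_stack([np.ones(n), lag]), dep)  # no-break model
    best = -np.inf
    for k in range(int(trim * n), int((1 - trim) * n)):
        step = (np.arange(n) >= k).astype(float)  # post-break dummy
        ssr1 = ssr(np.column_stack([np.ones(n), step, lag]), dep)
        best = max(best, (ssr0 - ssr1) / (ssr1 / (n - 3)))
    return best

def size_adjusted_power(T, rho, delta, k0, reps=200, level=0.05, seed=0):
    """Empirical critical value under the null, then rejection rate under a break."""
    rng = np.random.default_rng(seed)
    null_stats = np.array(
        [sup_f_mean_break(simulate_ar1(T, 0.0, rho, rng=rng)) for _ in range(reps)]
    )
    cv = float(np.quantile(null_stats, 1.0 - level))  # finite sample critical value
    alt_stats = np.array(
        [sup_f_mean_break(simulate_ar1(T, 0.0, rho, delta, k0, rng=rng))
         for _ in range(reps)]
    )
    return cv, float(np.mean(alt_stats > cv))

if __name__ == "__main__":
    # Hypothetical settings, chosen only to keep the run short.
    cv, power = size_adjusted_power(T=100, rho=0.3, delta=2.0, k0=50, reps=100, seed=1)
    print(f"empirical 5% critical value: {cv:.2f}, size-adjusted power: {power:.3f}")
```

Evaluating the null rejection rate at an asymptotic critical value instead of the empirical one would give the finite sample size, which is the quantity reported in Table A.1.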
In general, power loss will depend on the magnitude of the break relative to the additional dimensionality of the test statistic when more coefficients are allowed to break.

Finally, we also report finite sample coverage rates for the Chile simulations, comparing coverage when the test allows for only a mean break to the case when all coefficients are allowed to break. The break in the data generating process occurs 23% of the way through the sample, at observation 55 in a sample size of 239. In the chart below, ‘p’ refers to the number of lags (the true number is 1), BIC refers to lags being chosen by the BIC, the column marked ‘90%’ refers to the coverage of the 90% confidence interval, ‘median’ refers to the median break date across all replications, and ‘90% range’ refers to the range over which 90% of the break dates from the Monte Carlo simulations fell. Thus a median of 57 and a 90% range of 96 means that 90% of the break dates fell between observations 11 and 103.

Finite sample coverage probabilities -- Chile

         Mean break only                 All coefficients allowed to break
p        90%     median   90% range      90%     median   90% range
1        .864    57       96             .814    57       95
BIC      .841    57       96             .806    57       95

In both cases, coverage is high and similar to the coverage rates reported in BLS2 for the mean-break-only case. The 90% range is large, so the confidence intervals for this case are wide. This may reflect a small break magnitude and again emphasizes that there is no direct link between rejection of the test statistic and the width of the confidence intervals around the estimated break date. The median of 57, against a true break at observation 55, shows that there is essentially no median bias.

Finite sample coverage probabilities -- Thailand

         Mean break only                 All coefficients allowed to break
p        90%     median   90% range      90%     median   90% range
1        .897    160      22             .935    160      22
BIC      .852    160      22             .936    160      22

The coverage probabilities for Thailand are better than those for Chile. In addition, the 90% range is much tighter.
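The summary quantities in these coverage tables can be computed from Monte Carlo output along the following lines. This is a minimal sketch under stated assumptions: the inputs (one estimated break date and one confidence interval per replication) are toy values, the function name is hypothetical, and the ‘90% range’ is taken to be the distance between the 5th and 95th percentiles of the estimated break dates, which may differ from the exact convention used in the tables.

```python
import numpy as np

def summarize_break_dates(estimates, ci_lower, ci_upper, true_break):
    """Return (coverage rate, median break date, width of the central 90% range)."""
    est = np.asarray(estimates, dtype=float)
    lo = np.asarray(ci_lower, dtype=float)
    hi = np.asarray(ci_upper, dtype=float)
    # Share of replications whose interval contains the true break date.
    coverage = float(np.mean((lo <= true_break) & (true_break <= hi)))
    median = float(np.median(est))
    # Assumption: 90% range = 95th percentile minus 5th percentile.
    q05, q95 = np.percentile(est, [5, 95])
    return coverage, median, float(q95 - q05)

if __name__ == "__main__":
    # Toy replications (not taken from the simulations in this appendix).
    est = np.array([50, 55, 57, 60, 100])
    cov, med, rng90 = summarize_break_dates(est, est - 10, est + 10, true_break=55)
    print(cov, med, rng90)
```

In a full simulation, the three input arrays would each hold one entry per replication.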
Finite sample coverage probabilities -- Colombia

DGP:     Break in all coeffs        World vars, mean break     World vars, mean break
Test:    Break in all coeffs        Mean break only            All coeffs break
p        90%    med    range        90%    med    range        90%    med    range
1        .602   67     92           .546   86     85           .957   87     1
BIC      .549   67     92           .530   86     85           .955   87     1

For the first two examples the range is large. In addition, there is evidence of median bias in the first case: the true break (2/92) occurs 65% of the way through the 131-observation sample (observation number 85), whereas the median estimated break date is observation 67. The inclusion of world variables improves the coverage rates and substantially tightens the 90% range, narrowing it to only one month. This explains the very tight confidence intervals that we find in all of our empirical results.
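The break locations and median comparisons quoted in this appendix can be checked with a few lines of arithmetic. The sketch below uses only the observation numbers and medians reported above; the helper names are hypothetical, and measuring median bias as the median estimate minus the true break observation is an assumption made for illustration.

```python
def break_fraction(break_obs, sample_size):
    """Position of the break as a share of the sample."""
    return break_obs / sample_size

def median_bias(median_estimate, true_break_obs):
    """Assumed definition: median estimated break date minus true break date."""
    return median_estimate - true_break_obs

if __name__ == "__main__":
    # Chile: true break at observation 55 of 239, median estimate 57.
    print(round(break_fraction(55, 239), 2), median_bias(57, 55))
    # Colombia: true break at observation 85 of 131; medians 67, 86, 87
    # for the three specifications in the coverage table.
    print(round(break_fraction(85, 131), 2), [median_bias(m, 85) for m in (67, 86, 87)])
```

Under this definition the first Colombia specification has a median that falls 18 observations before the true break, while the other medians lie within two observations of it.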