A Critical Assessment of Simulated Critical Values

By
John T. Cuddington* and William Navidi**
December 29, 2010
Abstract
There is a wide variety of statistical problems (e.g., unit root and cointegration tests) where
hypothesis testing involves the use of simulated rather than theoretical critical values. We
argue that in practice the number of replications used to simulate critical values is often
insufficient to provide the degree of precision that is implied. In particular, the number
of replications needed is greatest for values in the tails of the distribution. We provide
recommendations for approximating the number of replications needed to achieve a desired
degree of precision.
Keywords: Monte Carlo simulations, simulated critical values, unit root tests, cointegration
tests.
Running Head: Simulated Critical Values
*Division of Economics and Business and **Department of Mathematical and Computer
Sciences, Colorado School of Mines, Golden, CO 80421.
**To whom correspondence should be addressed.
1 Introduction
There is a wide variety of statistical problems where hypothesis testing involves the use
of simulated rather than theoretical critical values. In some cases finite sample critical
values are required; in other cases, asymptotic critical values must be simulated. Important
examples include the simulated critical values needed to carry out Dickey-Fuller and Phillips-Perron unit root tests, the Engle-Granger cointegration test, and unit root tests in the presence
of breaks at known or unknown dates (Perron, Zivot-Andrews). Authors devising these new
tests typically provide simulated critical values (CVs), often reporting two or even more
significant digits for various sample sizes. Reporting CVs to two or more digits implies
something about the purported precision of the underlying calculations, which is determined
largely by the number of replications used in the Monte Carlo simulations.
The objective of a typical simulation exercise is to obtain one or more simulated critical
values or quantiles for a test statistic calculated for a variety of sample sizes. For example,
one might wish to obtain estimates of the 1%, 5%, and 10% critical values for a test statistic
calculated from sample size of T = 25, 50, 100, and 250. For each sample size Ti , the
following five steps are carried out.
1. Specify the data generating process (DGP) under the null hypothesis.
2. Using the DGP, generate a sample of length Ti . For some dynamic models, it may be
necessary to sample Ti + k observations and discard the first k to eliminate the effects
of initial conditions.
3. Calculate the statistic of interest using the sample of length Ti. We denote this statistic S(Ti).
4. Repeat steps 2 and 3 r times to obtain i.i.d. observations on the test statistic of interest. These observations from the r replications, S1(Ti), ..., Sr(Ti), are an estimate of the sampling distribution of the finite sample statistic S(Ti).¹

5. Use the rα-th order statistic for α = 0.01, 0.05, and 0.10, say, to estimate the desired critical value or quantile for the finite sample statistic S(Ti).
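The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the DGP and statistic (the standardized sample mean, whose null distribution is exactly N(0, 1)) are stand-ins chosen so that the estimated CVs can be checked against known quantiles, and the function name is our own.

```python
import numpy as np

def simulate_critical_values(T, r, alphas=(0.01, 0.05, 0.10), seed=0):
    """Steps 1-5 with a toy DGP: i.i.d. N(0,1) data, and the
    standardized sample mean standing in for the statistic S(T)."""
    rng = np.random.default_rng(seed)
    stats = np.empty(r)
    for j in range(r):                       # step 4: repeat r times
        x = rng.standard_normal(T)           # steps 1-2: DGP and sample
        stats[j] = np.sqrt(T) * x.mean()     # step 3: compute S(T)
    stats.sort()                             # step 5: order statistics
    return {a: stats[int(round(r * a)) - 1] for a in alphas}

cvs = simulate_critical_values(T=100, r=100_000)
```

With r = 100,000 the estimated CVs land close to the exact N(0, 1) quantiles (−2.326, −1.645, −1.282); how close, and how quickly that improves with r, is the subject of this paper.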
Even when simulation results are reported for a range of sample sizes, researchers attempting to apply them typically have sample sizes that differ from those that were simulated, necessitating some form of interpolation (at least implicitly). In a series of papers, MacKinnon has proposed a method in which response surface regressions are used to accurately estimate critical values of the asymptotic distribution of S(T) (as the sample size T goes to infinity), and to estimate critical values for finite sample sizes by interpolation. The method involves simulating a critical value, say the α quantile, for a range of sample sizes T1, T2, ..., Tn. Let q^α(Ti) be the estimated α quantile for sample size Ti. Each q^α(Ti) is obtained from a simulation
with r replications. The response surface regression corresponding to the α quantile is:

q^α(Ti) = θ^α_∞ + θ^α_1 Ti^(−1) + θ^α_2 Ti^(−2) + θ^α_3 Ti^(−3) + εi

In this equation, θ^α_∞ represents the α quantile of the asymptotic distribution, and the remaining terms reflect the rate of convergence (1/T) of the quantile of the finite-sample
distributions to the quantile of the asymptotic distribution. One can substitute any chosen
value of T into the RHS of the estimated response surface regression to obtain the estimated
critical value for sample size T . This interpolation method is increasingly being used to get
accurate finite sample critical values for complicated time series estimators.
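A response surface fit of this form is straightforward to sketch. The function names below are our own, and the simulated quantiles q^α(Ti) are assumed to be given; the fit is ordinary least squares on the regressors (1, Ti^(−1), Ti^(−2), Ti^(−3)).

```python
import numpy as np

def fit_response_surface(T_values, q_hat):
    """OLS fit of q(T_i) = theta_inf + theta1/T_i + theta2/T_i^2 + theta3/T_i^3.
    theta[0] estimates the asymptotic (T -> infinity) critical value."""
    T = np.asarray(T_values, dtype=float)
    X = np.column_stack([np.ones_like(T), T**-1, T**-2, T**-3])
    theta, *_ = np.linalg.lstsq(X, np.asarray(q_hat, dtype=float), rcond=None)
    return theta

def interpolated_cv(theta, T):
    """Critical value for any sample size T from the fitted surface."""
    return theta[0] + theta[1] / T + theta[2] / T**2 + theta[3] / T**3

# Check on synthetic quantiles that follow the surface exactly:
T = np.array([20.0, 30.0, 50.0, 100.0, 250.0, 500.0])
theta_true = np.array([-1.96, 3.0, -5.0, 10.0])
q = theta_true[0] + theta_true[1]/T + theta_true[2]/T**2 + theta_true[3]/T**3
theta = fit_response_surface(T, q)
```

In practice each q^α(Ti) carries simulation noise, so the fit averages that noise across sample sizes rather than recovering the coefficients exactly as in this synthetic check.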
Since the early 1990s, MacKinnon has argued forcefully that large numbers of replications are
required in simulation experiments in order to obtain accurate critical values. For example,
¹ It is important that r be large enough to obtain sufficient granularity in the CDF of S(Ti) to obtain the rα-th order statistic. For example, if r = 20, it would not be possible to determine the 1% quantile! One would need at least 100 replications.
his 1991 paper uses 25,000 replications for each experiment. Additionally, he repeats each
experiment 40 times so that the total number of replications is equal to 1,000,000. In the
2010 reissue and extension of his original 1991 working paper, MacKinnon uses 500 sets of
experiments with 200,000 replications in each experiment for a grand total of 100 million
replications (for each sample size considered)!
Note that in terms of total replications, MacKinnon recommends a “divide and conquer”
approach. That is, he runs 500 experiments with 200,000 replications rather than one experiment with 100 million. In MacKinnon (2000) he lists several reasons for this, including (1)
the observed variation among the estimates from the 500 experiments provides a very easy
way to measure the experimental randomness in the estimated quantiles, (2) less computer
memory is required, (3) it reduces the sorting cost inherent in quantile calculations (as “it
is cheaper to sort N numbers M times than to sort MN numbers at once”), and (4) it allows
the simulation experiments to be easily divided among a number of computers for parallel
processing and reduced vulnerability to power failures.
In spite of MacKinnon’s admonitions, researchers simulating critical values have often based
their results on woefully inadequate numbers of replications. For example, MacKinnon (2010,
page 2) points out that “Engle and Granger (1987), Engle and Yoo (1987), Yoo (1987),
and Phillips and Ouliaris (1990) all provide tables for one or more versions of the Engle-Granger cointegration test. But these tables are based on at most 10,000 observations, which
means that they are quite inaccurate.” In spite of tremendous gains in computing power
over time, there is, if anything, a tendency towards smaller rather than larger numbers
of replications in critical value simulation exercises. MacKinnon’s papers with various
coauthors are laudable exceptions. Here are a few other examples of simulation papers, with the number of replications each used: Dickey-Fuller (1979, 4,000 reps), Dickey-Fuller (1981, 50,000 reps), Schwert (1989, 10,000 reps), Zivot-Andrews (1992, 1,000 or 5,000 reps), Lumsdaine-Papell (1997, 500 reps), Nunes-Newbold-Kuan (1997, 5,000 reps), Harvey-Leybourne-Newbold (2001, 10,000 reps), Lee-Strazicich (2001, 5,000 reps), and Lee-Strazicich (2003, 2,000 or 5,000 reps).
The purpose of this note is twofold. First, we argue that, in practice, the number of replications used to simulate critical values is often insufficient to provide the degree of precision that is implied. Second, we provide some recommendations for approximating the number of replications needed to achieve a desired degree of precision. It is important to note that larger numbers of replications are needed to obtain precise critical values in the tails of the distribution.
In a typical simulation to obtain critical values, the true distribution of S(T ) cannot be
calculated analytically. In this paper we begin by discussing the estimation of quantiles
in some situations where the true distribution is known. By examining the asymptotic
distribution of the sample quantiles, we show that for some typical distributions, the number
of replications needed to obtain precision to two or three significant digits is much larger than
is typically used. Then we provide some suggestions for determining a number of replications
that will provide a desired degree of precision in cases where the true distribution is unknown.
2 A simple example
Consider the most basic of simulation experiments. Suppose we take r independent observations of a random variable X that has the standard normal distribution. We might ask: what are the upper and lower bounds of the 95% confidence interval based on these r replications? In this trivial example, we know that the theoretical CVs are (−1.960, 1.960). The simulated CVs are found by calculating the 2.5% and 97.5% quantiles of the sampling distribution.
For a slightly more complex example, consider drawing a random sample of size n from the
standard normal distribution N (0, 1). Using these n observations, we run the simplest of all
regressions, where the model contains only an intercept term:
Xi = α + εi

It is, of course, well known that the OLS estimator of α is just the sample mean, so in this simple case

S = (Σ_{i=1}^{n} Xi) / n,

which has the distribution S ~ N(0, 1/n). For convenience, we will consider the standardized sample mean √n·S, which has a standard normal distribution. Following the five-step procedure above, it is straightforward to replicate this procedure r times to obtain the empirical sampling distribution of this statistic.
Table 1 below shows the results of this exercise for estimating CVs of the standard normal distribution using r = 100 and r = 1,000,000, along with results for estimating CVs of the distribution of the sample mean of 100 i.i.d. standard normal random variables using r = 10,000. For r = 100, the estimated CVs are quite far from the true values. Undoubtedly, the reader's response to this finding is that the chosen r is much too small. Surprisingly, the estimated CVs for r = 1,000,000 and for the sample mean with r = 10,000 still differ somewhat from the theoretical values.
These examples raise the question: just how large does r have to be in this simplest of
experiments to get the precision of the 0.025 quantile to be 0.001? We can address this
question by examining the asymptotic distribution of a sample quantile.
Table 1: True Quantiles and Estimated Quantiles for the Standard Normal Distribution

Quantile                            0.01     0.025    0.05     0.10     0.50    0.90    0.95    0.975   0.99
True Value                         −2.326   −1.960   −1.645   −1.282   0.000   1.282   1.645   1.960   2.326
Estimated (n = 1, r = 100)         −2.550   −2.227   −1.988   −1.326  −0.172   1.519   1.941   2.150   2.425
Estimated (n = 1, r = 1,000,000)   −2.322   −1.955   −1.641   −1.281   0.000   1.278   1.640   1.957   2.321
Estimated (n = 100, r = 10,000)    −2.377   −1.954   −1.651   −1.281  −0.002   1.290   1.658   1.975   2.294
3 The asymptotic distribution of a sample quantile
Let F(x) be any absolutely continuous, strictly increasing cumulative distribution function, and let f(x) = F′(x) be the corresponding probability density function. In the context of the discussion above, F(x) is the cdf of the statistic S(T).
Let 0 < p < 1, and let zp = F⁻¹(p) be the pth quantile of F. The goal is to estimate the critical value zp and find its asymptotic distribution. Let S1, ..., Sr be i.i.d. with cdf F(x), and let ẑp be the estimated critical value, for which the proportion of Si that are less than ẑp is p. It is known that for r sufficiently large, ẑp is approximately normally distributed with E(ẑp) = zp and V(ẑp) = p(1−p) / [r·f(zp)²] [see, e.g., Walker (1968)]. We provide a brief derivation of this result.
To start with, let p̂ be the proportion of the Si that are less than zp. Then by the Central Limit Theorem, p̂ is approximately normally distributed with mean p and variance p(1−p)/r. Using the delta method, it follows that F⁻¹(p̂) is approximately normal with mean F⁻¹(p) and variance

[p(1−p)/r] · [(d/dp) F⁻¹(p)]².

Now since F⁻¹(p) = zp and (d/dp) F⁻¹(p) = 1/f(F⁻¹(p)) = 1/f(zp), F⁻¹(p̂) is approximately normal with mean zp and variance p(1−p) / [r·f(zp)²].
It can be shown using a stochastic equicontinuity argument (e.g., Andrews, 1994) that ẑp ≈ F⁻¹(p̂); in fact, the difference between them is of order 1/r in probability. Since this difference is of higher order than the standard deviation of F⁻¹(p̂), the asymptotic distribution of ẑp is the same as that of F⁻¹(p̂). Therefore ẑp is asymptotically normal with E(ẑp) = zp and V(ẑp) = p(1−p) / [r·f(zp)²].
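This formula is easy to check by simulation. The sketch below (our own illustration, using NumPy) compares the empirical standard deviation of the 2.5% sample quantile across m independent simulations against the asymptotic value √(p(1−p)/r)/f(zp) for the standard normal case.

```python
import numpy as np
from math import sqrt, pi, exp

p, r, m = 0.025, 2_000, 400          # quantile, replications per simulation, repeats
rng = np.random.default_rng(1)

# m independent simulations, each estimating the p-th quantile from r draws
q_hats = np.quantile(rng.standard_normal((m, r)), p, axis=1)
empirical_sd = q_hats.std(ddof=1)

z_p = -1.96                          # true 2.5% quantile of N(0, 1)
phi = exp(-z_p**2 / 2) / sqrt(2 * pi)
asymptotic_sd = sqrt(p * (1 - p) / (r * phi**2))
```

With these settings the two standard deviations should agree to within a few percent, the remaining gap reflecting both finite r and the sampling error in estimating a standard deviation from m repeats.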
Some general observations follow: For a fixed number of replications, the variance is a
function of p. The numerator is maximized for p = 0.5, and decreases quadratically as p
tends toward 0 or 1. For values of zp where the density is large, the denominator is larger, so
the variance is smaller. Intuitively, the reason that such values of zp can be estimated more
precisely is that there will be comparatively many observations in neighborhoods of values
where the density is large. One needs more replications to get precise CVs for small values
of p because relatively few of the replications produce information regarding the tails of the
sampling distribution.
Relationship between V(ẑp) and p
For the normal distribution, the density f(zp) is maximized for p = 0.5, and decreases exponentially as p tends toward 0 or 1. As a result, the variance is minimized at p = 0.5 and increases as p approaches 0 or 1. In contrast, the uniform distribution on [0, 1] has constant density f(x) = 1 on [0, 1]. In this case the variance is maximized at p = 0.5 and decreases quadratically as p approaches 0 or 1. Figure 1 presents a plot of the standard deviation of ẑp (multiplied by √r) as a function of p for both the standard normal distribution and the uniform distribution on [0, 1].
The behavior observed for the normal distribution in Figure 1 is typical of distributions supported on the whole line. To see this, write p = F(zp), so that V(ẑp) = F(zp)(1−F(zp)) / [r·f(zp)²]. Applying L'Hôpital's rule, we find lim_{p→0} r·V(ẑp) = lim_{p→0} 1/[2f′(zp)]. Similarly, lim_{p→1} r·V(ẑp) = lim_{p→1} −1/[2f′(zp)]. For distributions supported on the whole
Figure 1: Standard deviation (multiplied by √r) of ẑp as a function of p for the standard normal distribution (left) and the uniform distribution on [0, 1] (right).
line, zp → ∞ as p → 1 and zp → −∞ as p → 0. Assuming f′ does not have infinitely many sign changes, f′(zp) < 0 for zp sufficiently large, and lim_{zp→∞} f′(zp) = 0. It follows that lim_{p→1} V(ẑp) = ∞. Similarly, lim_{p→0} V(ẑp) = ∞ as well.

For distributions supported on a finite interval, the dependence of V(ẑp) on p may be quite different. A striking example is the uniform distribution on [0, 1]. Here f(zp) = 1 for all p, so V(ẑp) = p(1−p)/r. Now the variance is maximized at p = 0.5, and decreases to 0 as p tends toward 0 or 1 (see Figure 1).
An example
We'll consider the standard normal distribution, with density f(x) = φ(x) = (2π)^(−1/2)·e^(−x²/2).
The most commonly used quantiles are p = 0.025 and p = 0.975, which are used to construct
95% confidence intervals. The true critical values are ±1.9600 to four decimal places. In
particular, when r = 1, 000, 000 and p = 0.025 or 0.975, then zp = ±1.96 and φ(zp ) = 0.058.
The standard deviation of ẑp turns out to be about 0.003.
One can use the formula for the variance to compute the number of replications needed to estimate a quantile to a given level of precision. For example, to estimate the 0.025 quantile of the normal distribution to within ±0.001 with 95% confidence, one must find the value of r that makes the standard deviation equal to 0.001/1.96. It turns out that r ≈ 27,400,000! Figure 2 presents a plot of the approximate number of replications needed to estimate zp to within ±0.001 with 95% confidence for the standard normal distribution. The approximation is based on estimating the true distribution of the sample quantile with its asymptotic normal distribution.
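The r ≈ 27,400,000 figure follows directly from the variance formula. A quick sketch of the calculation (our own; the function name is hypothetical), using the 95% confidence factor 1.96 as in the text:

```python
from math import sqrt, pi, exp, ceil

def replications_needed(p, z_p, half_width, conf_z=1.96):
    """Smallest r for which the quantile estimate lies within
    +/- half_width of z_p with ~95% confidence (conf_z = 1.96)."""
    phi = exp(-z_p**2 / 2) / sqrt(2 * pi)    # normal density at the quantile
    sd_target = half_width / conf_z           # required standard error
    return ceil(p * (1 - p) / (sd_target**2 * phi**2))

r_needed = replications_needed(0.025, -1.96, 0.001)
```

This evaluates to roughly 27 million, in line with the value quoted above.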
Figure 2: Number of replications needed to estimate zp with a precision of ±0.001 with 95%
confidence for the standard normal distribution.
The standard deviation of the estimate is inversely proportional to the square root of the number of replications, so the required number of replications grows with the inverse square of the desired precision. For example, if a precision of only ±0.01 were needed, the number of replications needed would be 1/100 of that shown in Figure 2.
The effect of the sample size and number of replications on precision
Let S̄n represent the sample mean of n i.i.d. observations from the standard normal distribution, and consider estimating the pth quantile of the distribution of S̄n on the basis of r replications. The density of S̄n is √n·φ(√n·x), and the pth quantile is zp/√n, where zp is the pth quantile of N(0, 1). It follows that the asymptotic variance of the pth sample quantile is

p(1−p) / [r·n·φ(zp)²].
We see that the asymptotic variance is inversely proportional to the product rn. Therefore,
when estimating a normal quantile, it does not matter whether one takes r samples of size
n, or one sample of size rn. For sample means of other distributions, if n is large enough,
the density of the sample mean is approximately normal, so the asymptotic variance again
should be approximately proportional to rn.
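A one-line check of this rn equivalence (our own sketch): the asymptotic standard deviation computed from the formula is identical for r samples of size n and for rn samples of size 1, and for r = 1,000,000 it reproduces the ≈ 0.003 figure from the earlier example.

```python
from math import sqrt, pi, exp

def quantile_sd(p, z_p, r, n=1):
    """Asymptotic sd of the p-th sample quantile of the mean of n i.i.d.
    standard normals, from r replications: sqrt(p(1-p) / (r n phi(z_p)^2))."""
    phi = exp(-z_p**2 / 2) / sqrt(2 * pi)
    return sqrt(p * (1 - p) / (r * n * phi**2))

a = quantile_sd(0.025, -1.96, r=10_000, n=100)   # r samples of size n
b = quantile_sd(0.025, -1.96, r=1_000_000, n=1)  # rn samples of size 1
```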
4 CV simulation recommendations
It is important to assess the precision of simulated critical values. When the density of a
test statistic S is known, the asymptotic distribution for a particular critical value can be
used to compute the uncertainty. For the normal distribution (see Figure 1), there is greater
uncertainty associated with the critical values for extreme quantiles. Therefore it takes more
replications to estimate the 1% critical value with precision 0.001 than to estimate the 50%
critical value with the same precision. Suppose that we wish to estimate critical values for
the 2.5% and 97.5% quantiles of a standard normal distribution with a precision of 0.01
(equivalent to a standard error of 0.005). Our analysis suggests that this precision can be
attained with approximately 285,000 replications.
Table 2 shows other critical values and the recommended number of replications needed for precisions of 0.100, 0.010, and 0.001. If the researcher is content with a precision of 0.100 for the 2.5/97.5% quantile, for example, only 2,850 replications are needed. On the other hand, for a precision of 0.001 on this quantile, roughly 28,500,000 replications are needed. This may be impractical if many simulations need to be run for a collection of model specifications. For the more extreme 1.0/99.0% quantile, even more replications are required.
For researchers who are using only 5,000 replications, say, to simulate critical values, it seems reasonable to produce tables of critical values with only one digit (rather than two or three), so that all reported digits are in fact significant.
Table 2: Recommended Number of Replications to Produce Simulated Critical Values for
Various Quantiles of the Standard Normal Distribution with Desired Precision

                        Desired Precision
Quantiles           0.100     0.010       0.001
10.0 or 90.0%       1,170     117,000     11,700,000
5.0 or 95.0%        1,790     179,000     17,900,000
2.5 or 97.5%        2,850     285,000     28,500,000
1.0 or 99.0%        5,570     557,000     55,700,000
In all cases, the number of recommended replications r is obtained from the following formula:

r = p(1−p) / [σ²·φ(zp)²],

where σ is the standard error of ẑp, equal to one-half the desired precision.
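Table 2 can be reproduced from this formula. In the sketch below (our own; the function name is hypothetical), σ is taken as precision/2 as in the text, and zp is the standard normal quantile for each row.

```python
from math import sqrt, pi, exp

def recommended_reps(p, z_p, precision):
    """r = p(1-p) / (sigma^2 phi(z_p)^2) with sigma = precision / 2."""
    phi = exp(-z_p**2 / 2) / sqrt(2 * pi)
    sigma = precision / 2
    return p * (1 - p) / (sigma**2 * phi**2)

# One cell of Table 2: the 2.5/97.5% quantile at precision 0.010
r_cell = recommended_reps(0.025, 1.96, 0.010)
```

Evaluating the remaining (p, zp, precision) combinations reproduces the other entries of Table 2 up to rounding.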
In many practical applications, the density of S will not be known. In these cases one
might proceed by estimating the density with a standard smoothing technique, and then use
that estimate to approximate the asymptotic distribution of the sample quantile. Here we
describe a nonparametric bootstrap approach. Choose a value R, and generate R replications
S1 , ..., SR . The value R should be large enough to provide a reasonable approximation in the
bootstrap procedure we will describe, but need not be large enough to provide the desired
degree of precision.
Partition the R values of S into m subsamples of size r = R/m. (It should be feasible to
take, for example, R = 100, 000, m = 100 and r = 1000; smaller values may suffice in some
situations.) Then compute the sample quantile ẑp from each subsample, obtaining estimates
ẑp(1) , ..., ẑp(m) . Next compute the sample standard deviation sr of these estimates. The value
of sr is an estimate of the standard deviation of the sample quantile based on r replications.²
For any large number of replications M , we can then approximate the distribution of the
sample quantile based on M replications with N (zp , rs2r /M ).
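The subsampling recipe above is simple to implement. In this sketch (our own; the statistics S here are standard normal draws purely so the answer can be checked against the known density), R = 100,000 values are split into m = 100 subsamples of r = 1,000 each.

```python
import numpy as np

def subsample_quantile_sd(S, m, p):
    """Split the R simulated statistics into m subsamples of size r = R // m,
    estimate the p-th quantile in each, and return (r, s_r)."""
    S = np.asarray(S)
    r = len(S) // m
    quantiles = np.quantile(S[: m * r].reshape(m, r), p, axis=1)
    return r, quantiles.std(ddof=1)

rng = np.random.default_rng(2)
S = rng.standard_normal(100_000)     # stand-in for R simulated statistics
r, s_r = subsample_quantile_sd(S, m=100, p=0.025)

# Approximate sd of the quantile based on M = 1,000,000 replications:
sd_at_M = np.sqrt(r / 1_000_000) * s_r
```

Here s_r estimates the standard deviation of the 2.5% sample quantile from r = 1,000 replications, and the final line rescales it to any larger replication count M via the 1/√M law.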
² This recommendation amounts to MacKinnon's "divide and conquer" approach discussed in the introduction.
References
Andrews, D. (1994). Empirical Process Methods in Econometrics. In Handbook of Econometrics, R.F. Engle and D.L. McFadden, eds. Elsevier Science B.V.
Dickey, David A. and Wayne A. Fuller (1979). Distribution of the Estimators for Autoregressive Time Series with a Unit Root. Journal of the American Statistical Association,
74:427–431.
Dickey, David A. and Wayne A. Fuller (1981). Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root. Econometrica, 49:1057–1072.
Engle, Robert F. and C. W. J. Granger (1987). Cointegration and Error Correction: Representation, Estimation, and Testing. Econometrica, 55:251–276.
Engle, R.F. and B.S. Yoo (1991). Cointegrated Economic Time Series: An Overview with New Results. In Long-Run Economic Relationships: Readings in Cointegration, R.F. Engle and C.W.J. Granger (eds). Oxford: Oxford University Press.
Harvey, D., Leybourne, S. and P. Newbold (2001). Innovational Outlier Unit Root Tests with an Endogenously Determined Break in Level. Oxford Bulletin of Economics and Statistics, 63:559–575.
Lee, Junsoo and Mark C. Strazicich (2001). Break Point Estimation and Spurious Rejections
with Endogenous Unit Root Tests. Oxford Bulletin of Economics and Statistics 63:535–558.
Lee, Junsoo and Mark C. Strazicich (2003). Minimum LM Unit Root Test with Two Structural Breaks. Review of Economics and Statistics 85:1082–1089.
Lumsdaine, R. and D. Papell (1997). Multiple Trend Breaks and the Unit Root Hypothesis.
Review of Economics and Statistics, 79:212–218.
MacKinnon, James G. (1991). Critical Values for Cointegration Tests. Chapter 13 in R. F.
Engle and C. W. J. Granger (eds.), Long-run Economic Relationships: Readings in Cointegration, Oxford: Oxford University Press.
MacKinnon, James G. (1996). Numerical Distribution Functions for Unit Root and Cointegration Tests. Journal of Applied Econometrics, 11:601–618.
MacKinnon, James G. (2000). Computing Numerical Distribution Functions in Econometrics. In High Performance Computing Systems and Applications, A. Pollard. D. Mewhort,
and D. Weaver, eds. Amsterdam, Kluwer, 455–470
MacKinnon, James G. (2010). Critical Values for Cointegration Tests. Queen's University Working Paper No. 1227. [This paper updates and extends MacKinnon (1991).]
Nunes, L., P. Newbold, and C. Kuan (1997). Testing for Unit Roots with Breaks: Evidence on the Great Crash and the Unit Root Hypothesis Reconsidered. Oxford Bulletin of
Economics and Statistics, 59:435–448.
Perron, Pierre (1989). The Great Crash, the Oil Price Shock, and the Unit Root Hypothesis. Econometrica, 57:1361–1401.
Phillips, P.C.B. and S. Ouliaris (1990). Asymptotic Properties of Residual-Based Tests for Cointegration. Econometrica, 58:165–193.
Phillips, P.C.B. and P. Perron (1988). Testing for a Unit Root in Time Series Regression.
Biometrika, 75:335–346.
Schwert, G. William (1989). Tests for Unit Roots: A Monte Carlo Investigation. Journal of
Business and Economic Statistics, 7:147–159.
Walker, A. (1968). A Note on the Asymptotic Distribution of Sample Quantiles. Journal of the Royal Statistical Society, Series B, 30:570–575.
Yoo, B.S. (1987). Co-integrated Time Series: Structure, Forecasting and Testing. Unpublished Ph.D. dissertation, University of California, San Diego.
Zivot, E. and D. Andrews (1992). Further Evidence on the Great Crash, the Oil-Price Shock, and the Unit-Root Hypothesis. Journal of Business and Economic Statistics, 10:251–270.