Files\courses\MonteCarlo\MonteCarlo1.doc
Revision 11.06.97
Marc Nerlove 1997
NOTES ON
MONTE CARLO, BOOTSTRAPPING AND ESTIMATION BY SIMULATION
Lest men suspect your tale untrue,
Keep probability in view. John Gay, Fables, I, 1727
1. Random Number Generation
Anyone who considers arithmetical methods of
producing random digits, is, of course, in a state of
sin.
John von Neumann, 1951
Uniform and Normal Deviates
"It may seem perverse to use a computer, that most precise and deterministic of all machines
conceived by the human mind, to produce 'random' numbers. More than perverse, it may seem to be a
conceptual impossibility. Any program, after all, will produce output that is entirely predictable, hence, not
truly 'random.'"(Press, et al., p.191) There is, nonetheless, a large literature on how one uses a computer to
"simulate" a sequence of random numbers. 1 The basic building block of all "random" number generation is
the uniform random deviate. These are random numbers which lie within an interval, usually (0,1), and for
which any number in this interval is as likely as any other. All other sorts of random deviates (GAUSS lists
commands for beta, RNDBETA, gamma, RNDGAM, negative binomial, RNDNB, and Poisson, RNDP, as
well as uniform, RNDU and RNDUS, and normal, RNDN and RNDNS 2) are generated by transforming
uniform deviates, sometimes simply and sometimes by elaborate algorithms. 3 Standard normal deviates
(zero mean, variance 1) may also be produced from uniform deviates by such transformations.4 For
example, Box and Muller suggest the following transformation which produces pairs of independent
normal deviates:
(1)   y1 = (−2 ln u1)^(1/2) cos(2π u2)
      y2 = (−2 ln u1)^(1/2) sin(2π u2),
1 Beginning with John von Neumann's "Various Techniques Used in Connection with Random Digits," Bureau of Standards Applied Mathematics Series 12, pp. 36-38, 1951.
2 In the case of RNDU and RNDN, the so-called "seed" is supplied from the computer clock; in the case of RNDUS and RNDNS, the seed is user-supplied. For beta, gamma, negative binomial, and Poisson, the code available in the GAUSS file random.src is not very informative, but for normal and uniform deviates the code is concealed in the GAUSS executive program and considered proprietary. References to the literature in the GAUSS manual on these routines indicate that the method used to generate uniformly distributed random variates is (2) below. However, (1) is not used to generate normal deviates, but rather a method called the "fast acceptance-rejection algorithm" proposed in A. J. Kinderman and J. R. Ramage, "Computer Generation of Normal Random Numbers," Jour. Amer. Stat. Assoc., 71(356): 893-896 (1976). A brief discussion of the acceptance-rejection method is given below.
3 GAUSS also has an algorithm, called by RNDVM and not listed in the manual, for generating deviates from the von Mises distribution, usually called the circular normal distribution.
4 The definitive work on this subject is Luc Devroye, Non-Uniform Random Variate Generation, New York: Springer-Verlag, 1986. I discuss further examples below.
where u1 and u2 are consecutive uniform deviates in the interval (0,1).5 At that point it is easy to generate
vectors of numbers having a multivariate normal distribution with specified mean vector and variance
covariance matrix through the use of an appropriate matrix transformation.
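As a concrete illustration, the Box-Muller transformation (1) can be sketched in a few lines of Python (GAUSS is the language used elsewhere in these notes; this stand-alone sketch, with function names of my own choosing, only shows the mechanics):

```python
import math
import random

def box_muller(u1, u2):
    """Map two independent uniform deviates to a pair of
    independent standard normal deviates via equation (1)."""
    r = math.sqrt(-2.0 * math.log(u1))      # radial part, (-2 ln u1)^(1/2)
    y1 = r * math.cos(2.0 * math.pi * u2)
    y2 = r * math.sin(2.0 * math.pi * u2)
    return y1, y2

# Generate 10,000 normal deviates and check the first two moments.
random.seed(12345)
draws = []
for _ in range(5000):
    u1 = 1.0 - random.random()   # shift [0,1) to (0,1] so log(u1) is defined
    y1, y2 = box_muller(u1, random.random())
    draws.extend([y1, y2])

mean = sum(draws) / len(draws)
var = sum((y - mean) ** 2 for y in draws) / len(draws)
```

The sample mean and variance should be close to 0 and 1 respectively, as the theorem in Exercise 1 asserts.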
Uniform deviates are usually generated by so-called congruential generators. The first to be
suggested (by Lehmer in 1951) was the multiplicative generator:
ui 1  aui (mod m) ,
where the notation indicates that
u i 1 is the remainder when aui is divided by m. Such generators have to
be started by an initial value called the seed. The numbers produced by such a recurrence relation will
eventually start to repeat. The maximum period possible is m, but the multiplicative generator will
generally have a period much shorter than this. A recursive relation with better properties is the linear
congruential generator:
(2)   ui+1 = (a·ui + c) (mod m),
where m is called the modulus, and a and c are positive integers called the multiplier and the increment
respectively. The recurrence (2) will eventually repeat itself, but if m, a and c are properly chosen the
period of the generator will be m, the maximum possible. The number m is usually chosen to be about the
length of a machine word, making the generator machine dependent. 6 "Portable" random number
generators (RNG) are generally much less efficient. Moreover, unless m, a and c are chosen very carefully
in relation to one another, there may be a lot of serial correlation among values generated. The rules are
complicated: (a) c and m may have no common divisor, (b) a ≡ 1 (mod p) for every prime factor p of m, and
(c) a ≡ 1 (mod 4) if m is a multiple of 4. The seed for each successive call of the generator is the last integer
ui+1 returned. Usually this value is saved for the next use of the procedure. In the case of the GAUSS
generators RNDU and RNDUS, the ending value of the seed is not available directly to the user, but could
be computed from the last random number returned. RNDU takes its seed from the computer clock when
GAUSS is first started up. To obtain a number in the interval [0,1), ui+1/m is usually returned rather than
ui+1.7
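The recurrence (2) is short enough to sketch directly; here is a minimal Python rendering using the GAUSS default constants quoted in footnote 7 and the example seed from footnote 6 (the function name and the Python formulation are mine):

```python
def lcg(seed, a=397204094, c=0, m=2147483647):
    """Linear congruential generator (2).  The default a, c, m are the
    GAUSS constants quoted in footnote 7; with c = 0 this reduces to a
    multiplicative generator of Lehmer's type."""
    u = seed
    while True:
        u = (a * u + c) % m      # u_{i+1} = (a u_i + c) mod m
        yield u / m              # return u_{i+1}/m, a number in [0,1)

gen = lcg(3937841)               # the example seed from footnote 6
sample = [next(gen) for _ in range(1000)]
```

Restarting the generator with the same seed reproduces the sequence exactly, which is what makes "random" numbers from a deterministic program reusable in simulation work.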
What can go wrong with the RNGs in common use? Clearly, the greatest problem, in situations in which
the specifics of the RNG are not known, is that the choice of m, a and c may be botched. Press, et al.,
pp. 194-195, suggest a kind of shuffling procedure to alleviate this problem. But randomness, like beauty, is
in the eye of the beholder. "...the deterministic program that produces the random sequence should be
different from, and – in all measurable respects – statistically uncorrelated with, the computer program that
uses its output. In other words, any two different random number generators ought to produce statistically
the same results when coupled to your particular applications program." (p.191) More generally, Knuth
(1969, pp. 34-100) gives a large number of tests for randomness, the most powerful of which is called the
spectral test. Understanding and implementing this test requires knowledge of time series analysis, but
Exercise 2 of this section suggests a couple of simpler tests which might be implemented in GAUSS: Chi-square and Kolmogorov-Smirnov. GAUSS's generators RNDU and RNDN pass these tests.
To obtain a series of vectors which can be considered to have come from a multivariate normal
distribution with mean vector μ and variance-covariance matrix Σ, we can make use of the Cholesky
5 G. E. P. Box and M. E. Muller, "A Note on the Generation of Random Normal Deviates," Annals of Mathematical Statistics, 29: 610-611 (1958). In fact, the Box-Muller method is not a very efficient way to generate normal deviates because it involves evaluation of logarithmic and trigonometric functions which require series expansions inside almost all computers (i.e., these functions are not "hard-wired"). This is why GAUSS uses an alternative method (Kinderman and Ramage, op. cit.).
6 The GAUSS manual remarks that the generator assumes a 32-bit machine (p. 1525), which means m should be about 2^32. This is consistent with the restriction in RNDUS that the user-supplied seed must be in the range 0 < seed < 2^31 − 1 (p. 1523). The example used there chooses 3937841 as the seed. The default value for m is in fact 2^31 − 1 = 2147483647 (p. 1519).
7 The default values are c = 0, a = 397204094 and m = 2147483647. But you can set your own parameters and starting seed with RNDCON, RNDMOD, and RNDSEED (p. 1519).
decomposition of the matrix Σ and the characterization of the multivariate normal discussed in
Econometrics for Applied Economists, Notes for AREC 624, Spring 1995, vol. 3, pp. APP/11 - 17.8 Let u
be a vector of iid normal variables with mean zero and variance 1; then,

(3)   v = μ + T'u

has a multivariate normal distribution with mean vector μ and variance-covariance matrix T'T = Σ.
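The construction in (3) can be sketched in Python; this is an illustrative translation of what CHOL plus a matrix product accomplish in GAUSS, using the covariance matrix of Exercise 3 as a test case (the function names and seed are my own choices):

```python
import math
import random

def cholesky_lower(S):
    """Lower-triangular L with L L' = S, so that T = L' is the upper
    triangular factor with T'T = S, as in equation (3)."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(S[i][i] - s)
            else:
                L[i][j] = (S[i][j] - s) / L[j][j]
    return L

def mvn_draw(mu, L, rng):
    """v = mu + T'u, where u is a vector of iid N(0,1) deviates."""
    u = [rng.gauss(0.0, 1.0) for _ in mu]
    return [mu[i] + sum(L[i][k] * u[k] for k in range(i + 1))
            for i in range(len(mu))]

Sigma = [[6.250, 1.875, 0.625],      # the matrix from Exercise 3
         [1.875, 2.8125, 0.9375],
         [0.625, 0.9375, 1.3125]]
rng = random.Random(42)
L = cholesky_lower(Sigma)
draws = [mvn_draw([1.0, 2.0, 3.0], L, rng) for _ in range(20000)]
```

The sample means and variances of the generated vectors should approximate the specified μ and the diagonal of Σ.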
Nonuniform Deviates in General9
One of the principal problems in doing Monte Carlo simulations of all kinds is the generation of
nonnormal deviates. For example, in regression analysis, tests of significance are based on the assumption
of an underlying normal distribution for the disturbances; if these are not normal, potentially serious errors
can result in the assessment of the significance of the regression coefficients. Similarly, maximum-likelihood estimation is generally based on the assumption of a normal distribution, and heavy reliance is
placed on the desirable asymptotic properties of the ML estimates; while many of these asymptotic
properties hold for many nonnormal cases, we typically deal with samples which are not so large that we
can be confident that the asymptotic properties are good approximations; moreover, methods of likelihood
inference more generally depend crucially on the distributional assumptions made.
As remarked, all random number generation, including that for normal deviates, begins with the
generation of uniform deviates. The problem is to get from uniform deviates to the deviates we want. 10
There are essentially three basic methods for generating nonuniform deviates: (1) the inverse transform
method; (2) the composition method; and (3) the acceptance-rejection method. Fishman (1996, pp. 179-187)
deals with two other methods, as well as with refinements of the basic three.
(1) The inverse transform method is conceptually the easiest to explain and can be applied to
discrete as well as continuously distributed variates: Since we know how to generate variates which are
uniform on the interval [0,1), simply regard these as probabilities from a cumulative distribution function
corresponding to the distribution we want and find the value of the variate corresponding to the
"probability" turned up by the uniform generator. Thus, to obtain continuously distributed variates with the
cumulative F(·), for which we have F⁻¹(·) in closed form, first generate the uniform (0,1) variable, u, then
compute z = F⁻¹(u). The z's so generated could have come from the distribution with density f(·). The
applicability of this method depends on our ability to specify F⁻¹(·), and its efficiency is problematic since
we often cannot avoid evaluating exponential, logarithmic or trigonometric functions. Fishman (1996,
p. 151) gives a useful table of common continuous distributions and their analytical inverses. Included are
the following: uniform, beta, exponential, logistic, noncentral Cauchy, normal, Pareto, and Weibull. David
Baird has a series of programs, available at the GAUSS site at American University,
http://gurukul.ucc.american.edu/econ/gaussres/GAUSSIDX.HTM,
for generating nonuniform random variates, in which he also gives programs for the following inverse
cumulatives: Normal; t; F; chi-square; and Gamma.
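For instance, the standard exponential has the closed-form inverse F⁻¹(u) = −ln(1 − u), so the inverse transform method takes only a few lines. A minimal Python sketch (the function name is mine):

```python
import math
import random

def exp_inverse(u):
    """Inverse transform for the standard exponential:
    F(z) = 1 - exp(-z), so F^{-1}(u) = -ln(1 - u)."""
    return -math.log(1.0 - u)

# Treat each uniform deviate as a "probability" and invert the cumulative.
random.seed(7)
z = [exp_inverse(random.random()) for _ in range(20000)]
mean = sum(z) / len(z)   # the standard exponential has mean 1
```

Note that, as the text warns, even this simple case requires a logarithm evaluation per deviate, which is exactly the efficiency cost the inverse transform method can incur.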
For a discrete distribution the inverse transform method is a bit more difficult because the
cumulative cannot generally be obtained in closed analytical form, although the principle is the same. The
essential idea is as follows: Suppose we want to generate a discrete random variate, z , which can take on
8 GAUSS contains a command, CHOL, for finding the Cholesky decomposition of a positive definite symmetric matrix. This decomposition finds an upper triangular matrix T such that T'T = Σ.
9 As remarked in footnote 5, the definitive reference on this subject is Devroye, op. cit., but Fishman (1996, Chapter 3, "Generating Samples," pp. 145-254) contains an excellent discussion. Mooney (1997, pp. 14-46) is good as far as it goes, but somewhat incomplete.
10 GAUSS contains commands for generating deviates, as remarked, for beta, RNDBETA, gamma, RNDGAM, negative binomial, RNDNB, and Poisson, RNDP, as well as normal, RNDN. But you need to understand in general how this is done and to be able to generate variables from distributions not covered by GAUSS, such as the t-distribution.
values a, a+1, ..., b, coming from the cumulative distribution with probabilities {0 < qa < qa+1 < ... < qb =
1}, where qi is the probability that z ≤ i. Set z = a. Generate a uniform variate u; as long as u ≤ qa, z remains at a,
but when u > qa, augment z by 1 and continue with qz. If this seems difficult to follow, the example of
Bernoulli trials discussed by Mooney (1997, pp. 31-33) is instructive: A Bernoulli trial is just a "one-shot"
binomial; z takes on the value a with probability q and not a with probability 1-q; thus the cumulative
distribution is {0 < q < 1}. Here is a little GAUSS program to generate n Bernoulli RVs with this
distribution.
q = 0.4;          /* for example */
y = RNDU(n,1);
p = q*ones(n,1);
z = y .le p;      /* .le is the element-by-element logical operator, returning 1 if true, 0 if false */
z is an n by 1 vector of RVs from the distribution {0 < 0.4 < 1}. Another example involving only a two-point cumulative is the binomial distribution with t trials and probability p of success. Here

q = Σ_{i=1}^{t} C(t,i) p^i (1−p)^(t−i),

where C(t,i) is the binomial coefficient.
Obviously, this method applies in general to any univariate discrete distribution, although the
probabilities qi can often be specified in terms of a few underlying parameters and the method makes no use of
this to increase efficiency. When the number of possible values is more than 2, the number of comparisons
involved can be large. Fishman (1996, pp. 153-155) gives a method for improving efficiency and reducing
the mean number of comparisons required to generate a random variable with the specified cumulative.
GAUSS has commands for generating RVs for the negative binomial and the Poisson distributions, and
Baird includes a program for the binomial. Fortunately, many common discrete distributions can be built up
as the composition of Bernoulli variables.
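The comparison scheme just described can be sketched in Python (an illustrative translation of the idea, not GAUSS code; the helper name discrete_inverse is mine), here applied to the Bernoulli case of the GAUSS snippet above:

```python
import random

def discrete_inverse(u, values, cumprobs):
    """Step through the cumulative probabilities q_a < ... < q_b = 1
    and return the first value z whose cumulative q_z covers u."""
    for z, q in zip(values, cumprobs):
        if u <= q:
            return z
    return values[-1]

# Bernoulli with q = 0.4: value 1 with probability 0.4, else 0,
# matching the GAUSS example above.
random.seed(1)
draws = [discrete_inverse(random.random(), [1, 0], [0.4, 1.0])
         for _ in range(10000)]
share = sum(draws) / len(draws)   # should be near 0.4
```

With more than two support points the same loop works unchanged; only the lists of values and cumulative probabilities grow, which is exactly where the comparison count Fishman worries about comes from.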
(2) The composition method is the lazy approach, when it can be implemented, and is generally
even more inefficient than the inverse transform method. We have already had an example of this approach
in the Box-Muller method for generating random normal deviates from uniform deviates by transforming
the latter. IID normal 0-1 variables can also be transformed to univariate normals with nonzero mean and
variance different from one, as well as into nonindependent multivariate normals. Mooney (1997, pp. 18-23) suggests obtaining RVs from the lognormal, Chi-square and t distributions this way: lognormal by
exponentiating a normal variable; Chi-square with df degrees of freedom by summing df independent
normals squared; and t with df degrees of freedom as the ratio of a normal and the sqrt of an independent
Chi-square with df degrees of freedom. The binomial, geometric and negative binomial distributions can
also be treated in this way (Mooney, 1997, pp. 35-42). A standard exponential variable can be generated
from a uniform [0,1) variable.11 Notwithstanding its potential inefficiency, the composition method
provides a very powerful tool for handling distributions that cannot be represented in terms of one of the
standard distributions. The idea is to "mix" one or more distributions you already know something about.
The key parameters in the implementation of this method are the mixing proportions. For example, suppose
we wanted to mix only two distributions for which we know how to create RVs easily, e.g., two normal
distributions with different means and variances. We create a vector from one distribution of length np and
one from the other of length n(1-p) where p is the mixing proportion; concatenate the two vectors so
obtained vertically; then randomize the elements using the row index and the GAUSS commands RNDU
and CEIL. Mooney (1997, pp. 24-25) gives the following GAUSS example for mixing two normal
distributions:
11 It is interesting to note that von Neumann's 1951 paper suggested obtaining an exponentially distributed random variable by the acceptance-rejection method described below, rather than by the composition method. His sketch of the proof appears to have been the basis for the development of the AR method.
y1 = RNDN(((1-p)*n),1); mean = a1; variance = b1; y1 = mean + (sqrt(variance)*y1);
y2 = RNDN((p*n),1);     mean = a2; variance = b2; y2 = mean + (sqrt(variance)*y2);
y = y1|y2;
index = CEIL(RNDU(n,1)*n);    /* See GAUSS manual, p. 1075. */
x = SUBMAT(y, index', 0);     /* See GAUSS manual, p. 1599. Note that index has been transposed. */
x is the desired vector of RVs having a distribution which is a mixture in proportion p of the two normal
distributions specified.
(3) The acceptance-rejection method (AR method) is not only the most generally applicable but
the basis for the most efficient and refined algorithms. It works when the inverse cumulative is intractable
and when the distribution of the RV we want to generate is not a simple function of one or more
distributions we know how to generate. The AR method is based on a theorem on conditional probability
stated by von Neumann (1951, op. cit.):
Theorem: Let f(z) be a pdf defined on the interval [a,b], such that

f(z) = c·g(z)·h(z),

where

h(z) ≥ 0,  ∫_a^b h(z) dz = 1,  c = sup_z [f(z)/h(z)],  and  0 ≤ g(z) ≤ 1.

Let Z be the RV with pdf h(z) and U be uniformly distributed on [0,1); then, conditional on U ≤ g(Z), Z has the pdf f(z).
This theorem suggests the following procedure for generating a vector, x, n by 1, of RVs from a
distribution having the pdf f(z): Generate two independent RVs, distributed uniformly on [0,1), say z and u.
Calculate f(z); if u ≤ f(z), insert the value z as the next element in x; if not, try again. Here g(z) and h(z) are
both the uniform density. Go on until you have filled the vector x.12 Here is a short GAUSS program to
generate a vector of 10000 RVs coming from the exponential distribution using the von Neumann method:
SAMPLESZ = 10000;
i = 1; x = zeros(samplesz,1);
do while i le samplesz;
    T = 0;
    again:
    u = rndu(1,1); sum = 0; N = 2;
    do while sum < u;
        v = rndu(1,1); sum = sum + v; N = N + 1;
    endo;
    if (N % 2 eq 0);
        T = T + 1;
        goto again;
12 The proof is as follows (Fishman, 1996, p. 172): U and Z have the joint pdf

f_{U,Z}(u,z) = h(z),  0 ≤ u ≤ 1,  a ≤ z ≤ b.

Then Z has the conditional pdf

h_Z(z | U ≤ g(Z)) = [∫_0^{g(z)} f_{U,Z}(u,z) du] / prob(U ≤ g(Z)),

where

prob(U ≤ g(Z)) = ∫_a^b h(z) g(z) dz = 1/c,

and

h_Z(z | U ≤ g(Z)) = c·g(z)·h(z) = f(z).  QED.
    endif;
    x[i,1] = T + u;
    i = i + 1;
endo;
The algorithm used by GAUSS to generate normal 0-1 deviates is of the AR type (Kinderman and
Ramage, op. cit.). The AR method can be made quite efficient by clever programming and is the basis for
many of the algorithms which are used in practice. The method, however, has problems with "thick-tailed"
distributions, since, for such distributions, the ratio of rejections to acceptances will generally be high,
leading to the necessity of generating a very large number of uniform RNs in order to achieve the desired
final number. And, as is the case in general, formulations which require the evaluation of exponential,
logarithmic, trigonometric, or other special functions may be inefficient simply because these evaluations
are themselves time consuming.
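To fix ideas, here is a Python sketch of the AR method for a simple bounded target density, f(z) = 6z(1−z) on [0,1] (the choice of target density, the names, and the seed are mine, not drawn from the text; h(z) is uniform on [0,1), so c = sup f = 1.5 and g(z) = f(z)/c):

```python
import random

def ar_sample(n, rng):
    """Acceptance-rejection with h(z) uniform on [0,1):
    target f(z) = 6 z (1 - z), c = sup f = 1.5, g(z) = f(z)/c.
    Accept the candidate z whenever u <= g(z)."""
    out = []
    while len(out) < n:
        z = rng.random()                      # candidate from h
        u = rng.random()                      # uniform for the accept test
        if u <= 6.0 * z * (1.0 - z) / 1.5:    # u <= g(z)
            out.append(z)
    return out

rng = random.Random(9)
x = ar_sample(20000, rng)
mean = sum(x) / len(x)   # this symmetric density has mean 1/2
```

The acceptance probability per trial is 1/c = 2/3 here; for a thick-tailed target the envelope constant c would be much larger, which is exactly the rejection-rate problem described above.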
--Continued next file: MonteCarlo2.doc
Exercises
Part 1: Random Number Generation
1. Show that the transformation (1) of two variables each independently distributed on the unit interval
yields two independently identically distributed normal variates with unit variances and zero means, with
positive probability on the whole real plane. Follow the development of Hogg and Craig, 4th edition,
section 4.3, "Transformations of Variables of the Continuous Type." Example 8 is exactly this problem.
The variables (u1, u2) are distributed with positive probability density on the unit square with corners [0,0], [0,1], [1,0],
[1,1], whereas the variables (y1, y2) are distributed with positive probability density on the whole real plane.
Pay particular attention as to how the region of positive probability for the uniformly distributed variates is
transformed by (1) into the region of positive probability for the normal variates. (See H&C, sec. 4.3,
Examples 3-7.)
2. Write a GAUSS program to generate 1000 random numbers in the interval [0,1) using RNDU. Divide
the interval [0,1) into 10 equal parts and find the frequency of numbers falling in each category. Compare
this with the theoretical frequency for each category of 1/10 using a standard Chi-square test to assess the
statistical significance of the deviations you find. Use the same random numbers to implement a
Kolmogorov-Smirnov test. The K-S test is unfortunately not referenced in the 4th and later editions of
H&C, but you can find a good discussion in A. M. Mood, F. A. Graybill, and D. C. Boes, Introduction to
the Theory of Statistics, 3rd edition, New York: McGraw-Hill, 1974, pp. 508-510. Knuth, pp. 41-51, and
Press, et al., pp. 472-475, also have extended discussions. Critical values for the test are tabulated in
CRC Standard Probability and Statistics Tables and Formulae, ed. by W. H. Beyer, Boca Raton: CRC
Press, 1991, p. 334. How would you modify these tests to check whether RNDN produces variates which
are independent standard normal?
3. Write a GAUSS program to generate 1000 3 by 1 vectors having a trivariate normal distribution with
mean vector (1, 2, 3)' and variance-covariance matrix:
6.250    1.875    0.625
1.875    2.8125   0.9375
0.625    0.9375   1.3125
How would you modify your program to produce random normal deviates with a specified correlation
matrix?
4. Write a GAUSS program to generate 1000 random nonuniform numbers using the GAUSS commands
RNDN (normal), RNDBETA (beta, choose shape parameters both = 2), RNDGAM (gamma, choose
parameter alpha = 2), and RNDNB (negative binomial, choose parameters = 1.5 and 0.5; note that this is a
discrete distribution). The Poisson distribution is treated in Exercise 6 below. Compute the Kolmogorov-Smirnov test statistic against the null of the true distribution, and graph the empirical against the theoretical
distribution. Do the same for the t-distribution for which there is no GAUSS generator, with degrees of
freedom = 2. (See Baird's programs, available at the GAUSS site at American University,
http://gurukul.ucc.american.edu/econ/gaussres/GAUSSIDX.HTM.
Note that the cumulative of the t-distribution is required and that you can use the GAUSS routine CDFTC.)
GAUSS does not supply all of the other cumulatives you need; you will have to write these yourself or get
them from Baird. Note that the Poisson distribution is a discrete distribution and that the Kolmogorov-Smirnov test, based on a continuous cumulative distribution, will give you weird results. See Exercise 6,
below.
5. Write a GAUSS program to generate 1000 RVs from a binomial distribution with 20 trials and
probability 0.75 of success. Use a Chi-square test to check whether the variates you generated could have
come from a binomial with probability 0.5 of success.
6. The Poisson distribution, which is widely used in the study of event data in econometrics (Greene, 2nd
ed., pp. 676-679), has the pdf
m x e m
f ( x) 
, x  0,1,2,....,
x!
 0, elsewhere, m  0.
(Hogg and Craig, 4th ed., pp. 99-102.) This discrete distribution has, among others, the following
properties, which may be useful in formulating computer algorithms to generate Poisson variates from
uniform variates (Johnson, Kotz and Kemp, Univariate Discrete Distributions, 2nd. ed., Wiley, 1992, pp.
158-162):
(a) If X1, X2, X3, ... are iid RVs from an exponential distribution with parameter 1 (pdf e^(−x), x ≥ 0)
and Z is the smallest integer ≥ 0 such that X1 + X2 + ... + X_{Z+1} > θ, then Z has a Poisson
distribution with parameter m = θ.
(b) If U1, U2, ... are iid uniform [0,1) variates and Z is the smallest nonnegative integer such that
U1·U2· ... ·U_{Z+1} ≤ e^(−θ), then Z has a Poisson distribution with parameter m = θ.
Write two short GAUSS programs to generate 10,000 Poisson RVs with parameter θ = 0.5 based on method
(a) and on method (b). In the case of method (a), based on exponentially distributed RVs, use the algorithm
used in the text to illustrate the AR method. Show that z = −ln(u), where u is distributed uniform [0,1), has a
standard exponential distribution and use this as well to generate Poisson variables by the composition
method. Compare the frequencies obtained with the theoretical frequencies f(x), m = 0.5, x = 0, 1, 2, ..., 10.
Compare with the RVs produced by the GAUSS command RNDP. What do you conclude about the
efficacy of the different methods to simulate Poisson RVs and about the probable nature of the method
underlying RNDP?
7. The (central) Cauchy distribution (Hogg and Craig, 2nd. ed., p 142, Exercise 4.22) has the pdf
f(x) = 1/[π(1 + x²)],  −∞ < x < ∞.
Show (a) that f(x) is the marginal distribution of X1 in the joint distribution of (X1 = Y1/Y2 , X2 = Y2) where
Y1 and Y2 are iid normal 0-1 variables; and (b) the cumulative Cauchy is
F(x) = 1/2 + (1/π) arctan x.
Write two short GAUSS programs to generate 1000 RVs from a Cauchy distribution with pdf f(x) using
respectively the composition method and the inverse transform method. Compare the empirical cumulatives
with the theoretical cumulative and test whether the differences are significant using a Kolmogorov-Smirnov test.
References
1. Random Number Generation:
Press, W. H., B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes: the Art
of Scientific Computing. New York: Cambridge University Press, 1986. Chapter 7, "Random
Numbers," pp.191-225.
Knuth, D. E., The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Reading, MA:
Addison-Wesley Pub. Co., 1969. Chapter 3, "Random Numbers," pp. 1-160.
Hammersley, J. M., and D. C. Handscomb, Monte Carlo Methods. London: Methuen & Co. Ltd., 1964.
Chapter 3, "Random, Pseudorandom, and Quasirandom Numbers," pp. 25-42.
Mooney, C. Z., Monte Carlo Simulation, Sage Publications, Series No. 07-116, 1997.
Fishman, G. S., Monte Carlo: Concepts, Algorithms and Applications, New York: Springer-Verlag, 1996.