Ch.3 Random number generators

Chapter 3: Random number generators
 Random number generators
 Generating arbitrary distributions
 The transformation method
 The rejection method
 MATLAB functions for common distributions
ASTM21
Random number generators
Generating (pseudo-) random numbers X ~ U(0,1) is basic to all experimental statistics,
simulation, experiment design, and data analysis.
Computer languages often contain a system-supplied random number generator, say
x = rand()
or
x = rand(seed)
where seed is an integer used to "seed" the generator.
Designing good random number generators is an art that has only recently matured. Do
not try to design one yourself, or use code that is more than 10 years old. Some quotes
from Numerical Recipes (3rd ed., Ch. 7):
Be cautious about any source earlier than about 1995, since the field has progressed enormously
in the following decade. [...] The greatest lurking danger for a user today is that many out-of-date
and inferior methods remain in general use. [...] If all scientific papers whose results are in doubt
because of [bad random number generators] were to disappear from library shelves, there would
be a gap on each shelf about as big as your fist.
Numerical Recipes contains examples of good, portable random number generators.
But the simplest is to use generators that are part of a well-reputed software package.
The so-called Mersenne Twister (Matsumoto & Nishimura 1997) used by MATLAB
[mt19937ar] since about 2008 is considered to be of high quality - it passes a number of
stringent tests for randomness, including the ‘Diehard’ test suite (Marsaglia 1998).
Random number generators: Use of seed
When MATLAB is started, and you ask for (say) three random numbers using rand(),
you get:
>> rand(3,1)
ans =
0.814723686393179
0.905791937075619
0.126986816293506
The next one is:
>> rand
ans =
0.913375856139019
If you quit MATLAB and start again, you get for example:
>> rand(2,2)
ans =
   0.814723686393179   0.126986816293506
   0.905791937075619   0.913375856139019
Each time MATLAB is started, the default random stream is initialized with the same
seed (= 0).
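The restart behaviour can be imitated within a single session. The following is a minimal sketch, assuming a MATLAB release that provides rng; rng('default') resets the global stream to the Mersenne Twister with seed 0, which is the state at start-up. The variable names a, b and c are chosen for illustration only.

```matlab
% Reset the global stream to its start-up state (Mersenne Twister, seed 0)
rng('default');
a = rand(3,1);      % first three numbers of the stream
b = rand;           % the fourth number

% "Restart": reset again and draw the same four numbers as a 2x2 matrix
rng('default');
c = rand(2,2);      % filled column by column

% The two runs give the same sequence
same = isequal([a; b], c(:));
disp(same)
```

Because rand fills matrices column by column, c(:) lists the four numbers in the order in which they were generated.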
To check how the default random stream was initialized, use:
>> RandStream.getGlobalStream
ans =
mt19937ar random stream (current global stream)
Seed: 0
NormalTransform: Ziggurat
In simulation experiments it is often desirable to start each experiment (or batch of
experiments) from a different, but well-defined state, so that a certain experiment can be
reproduced exactly. (Why?) This is done by initializing the default random stream with
different seeds for each batch.
>> stream = RandStream.create('mt19937ar','seed',100)
stream =
mt19937ar random stream
Seed: 100
NormalTransform: Ziggurat
>> RandStream.setGlobalStream(stream)
>> rand(3,1)
ans =
0.543404941790965
0.278369385093796
0.424517590749133
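Seeding each batch separately might look as follows. This is only a sketch: the seed values and the "experiment" (three uniform numbers per batch) are hypothetical, and rng(seed,'twister') is used as a shorthand for creating an mt19937ar stream with that seed and making it the global stream.

```matlab
seeds   = [100 101 102];              % hypothetical seeds, one per batch
results = zeros(3, numel(seeds));     % one column of results per batch

for k = 1:numel(seeds)
    rng(seeds(k), 'twister');         % well-defined starting state for batch k
    results(:,k) = rand(3,1);         % stand-in for the real experiment
end

% Reproduce batch 2 exactly, without rerunning batch 1
rng(seeds(2), 'twister');
rerun = rand(3,1);
disp(isequal(rerun, results(:,2)))
```

Recording the seed together with the results is what makes it possible to rerun a single batch later.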
Random number generators: Some caveats
• Do not re-initialise unnecessarily: for example, to repeat an experiment 1000 times in
  sequence, do not re-initialise in between, unless each experiment takes a really long
  time. (Why?)
• If your investigation depends very critically on the quality of the random number
  generator, try another one and check that the results are (statistically) the same.
• Warning: In MATLAB, never use rand('seed',1), rand('state',2), randn('seed',3), etc.
  to set the seed. These are obsolete forms which cause MATLAB to switch to a
  very old random number generator, as can be seen by checking the default stream:
>> rand('seed',100)
>> RandStream.getGlobalStream
ans =
legacy random stream (current global stream)
RAND algorithm: V4 (Congruential)
RANDN algorithm: V5 (Ziggurat)
The congruential RAND algorithm from V4 (1992) is definitely not a good one!
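The second caveat (try another generator and compare) could be checked along these lines. This is a sketch under stated assumptions: the test quantity E[x^2] = 1/3 for x ~ U(0,1), the sample size and the seeds are chosen only for illustration, and 'mrg32k3a' is one of the alternative generator types offered by RandStream.

```matlab
n  = 1e5;                                   % sample size (illustrative)
s1 = RandStream('mt19937ar', 'Seed', 1);    % Mersenne Twister
s2 = RandStream('mrg32k3a',  'Seed', 1);    % a different generator type

% Monte Carlo estimates of E[x^2] = 1/3 from each generator
m1 = mean(rand(s1, n, 1).^2);
m2 = mean(rand(s2, n, 1).^2);

% Standard error of each estimate: var(x^2) = 1/5 - 1/9 = 4/45
se = sqrt(4/45/n);

fprintf('mt19937ar: %.5f  mrg32k3a: %.5f  (se = %.5f)\n', m1, m2, se);
```

The two estimates should agree within a few standard errors; a larger discrepancy would be a reason to distrust one of the generators for this particular problem.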
Generating arbitrary distributions: The transformation method
To generate a univariate (pseudo-) random variable y with given pdf p(y), there are a few
basic techniques that can be used, and some nice tricks for special distributions (like the
Gaussian). They all start with the generation of one or several uniform variates x ~ U(0,1).
The transformation method requires that you can compute (without too much difficulty) the
inverse cdf of p(y), $F^{-1}(x)$. If x ~ U(0,1), then $y = F^{-1}(x)$ has the pdf p(y), because
$P(y \le t) = P(x \le F(t)) = F(t)$.
The transformation method: Three examples
Example 1: The exponential distribution
To generate a random variable y with the exponential pdf $p(y) = \lambda e^{-\lambda y}$ ($y \ge 0$)
with parameter $\lambda$ (see L2:24), we use the analytical cdf
$$F(y) = 1 - e^{-\lambda y}$$
with the inverse
$$F^{-1}(x) = -\frac{\ln(1 - x)}{\lambda}$$
Thus, if x ~ U(0,1) we find that $y = -(\ln x)/\lambda$ has the desired distribution.
Note that 1 − x ~ U(0,1) so we can save one subtraction by using x instead of 1 − x.
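As a quick check of Example 1, the transformation y = −(ln x)/λ can be applied to a vector of uniform numbers. This is a minimal sketch; the rate lambda = 2, the sample size and the seed are arbitrary choices for illustration.

```matlab
rng(1, 'twister');          % fixed seed so the run can be reproduced
lambda = 2;                 % hypothetical rate parameter
n = 1e5;                    % number of variates

x = rand(n,1);              % x ~ U(0,1); rand never returns exactly 0
y = -log(x) / lambda;       % transformation method

% The exponential distribution has mean and standard deviation 1/lambda
fprintf('mean = %.4f, std = %.4f, expected %.4f\n', mean(y), std(y), 1/lambda);
```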
Example 2: The Cauchy distribution
The standard form of the Cauchy distribution is
$$p(y) = \frac{1}{\pi (1 + y^2)}$$
which is symmetric about y = 0 and has a FWHM (Full Width at Half Maximum) of 2 units.
As discussed in L1:19 this distribution has undefined mean value and infinite variance.
The analytical cdf is
$$F(y) = \frac{1}{2} + \frac{\arctan y}{\pi}$$
with the inverse
$$F^{-1}(x) = \tan\left[\pi \left(x - \tfrac{1}{2}\right)\right]$$
Thus, if x ~ U(0,1) we find that $y = \tan[(x - 0.5)\pi]$ has the desired distribution.
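Since the mean and variance of the Cauchy distribution are not useful for checking a sample, a sketch of Example 2 can instead look at the median (which should be 0) and at the fraction of values with |y| ≤ 1 (which should be 1/2, because the quartiles are at ±1, consistent with the FWHM of 2). The sample size and seed are arbitrary.

```matlab
rng(2, 'twister');              % fixed seed for reproducibility
n = 1e5;

x = rand(n,1);                  % x ~ U(0,1)
y = tan(pi * (x - 0.5));        % transformation method for the Cauchy distribution

med  = median(y);               % should be close to 0
frac = mean(abs(y) <= 1);       % should be close to 0.5

fprintf('median = %.4f, fraction within [-1,1] = %.4f\n', med, frac);
```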
Example 3: The Box-Muller transformation for the normal distribution
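It is not convenient to use the transformation method directly on the one-dimensional
normal distribution N(0,1), because the cdf
$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt$$
and its inverse cannot be expressed in terms of elementary functions. However,
if x ~ N(0,1) and y ~ N(0,1) are independent normal variates, then their joint pdf is
$$p(x, y) = \frac{1}{2\pi} \exp\left(-\frac{x^2 + y^2}{2}\right)$$
Transforming to polar coordinates by means of $x = r\cos\theta$, $y = r\sin\theta$, we find
$$p(r, \theta) = \frac{1}{2\pi}\, r \exp\left(-\frac{r^2}{2}\right)$$
which shows that r and θ are independent, that θ ~ U(0, 2π), and that the cdf of r is
$F(r) = 1 - \exp(-r^2/2)$, with inverse $r = F^{-1}(u) = [-2\ln(1 - u)]^{1/2}$. Thus, given two uniform variates
u1 ~ U(0,1) and u2 ~ U(0,1) we obtain two independent normal variates as
$$x = \sqrt{-2\ln u_1}\,\cos(2\pi u_2), \qquad y = \sqrt{-2\ln u_1}\,\sin(2\pi u_2)$$
which is the Box-Muller transformation (here $u_1$ replaces $1 - u_1$, as in Example 1).
However, more efficient algorithms exist (see the ziggurat algorithm under the rejection method).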
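The Box-Muller transformation can be sketched in a few vectorised lines. The sample size and seed are arbitrary, and the checks at the end (zero mean, unit standard deviation, negligible correlation between the two outputs) are only a rough sanity test, not a proof of normality.

```matlab
rng(3, 'twister');              % fixed seed for reproducibility
n = 1e5;

u1 = rand(n,1);                 % u1 ~ U(0,1)
u2 = rand(n,1);                 % u2 ~ U(0,1)

r     = sqrt(-2 * log(u1));     % radius, from the inverse cdf of r
theta = 2 * pi * u2;            % angle, uniform on (0, 2*pi)

x = r .* cos(theta);            % two independent N(0,1) variates
y = r .* sin(theta);

c = corrcoef(x, y);             % 2x2 correlation matrix
fprintf('x: mean %.4f std %.4f | y: mean %.4f std %.4f | corr %.4f\n', ...
        mean(x), std(x), mean(y), std(y), c(1,2));
```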
Generating arbitrary distributions: The rejection method
The rejection method does not require the inverse cdf, or even the cdf, but only that you can
compute p(x) for any given x. Moreover, you need some other function f (x) such that
• p(x) ≤ f (x) everywhere
• the integral of f is finite (say A)
• the inverse $F^{-1}(a)$ of the cumulative function $F(x) = \int_{-\infty}^{x} f(t)\,dt$ can be computed for 0 < a < A
Using the rejection method, the algorithm to generate x0 ~ p(x) is:
1. Generate a number a ~ U(0, A)
2. Apply transformation $x_0 = F^{-1}(a)$
3. Compute f(x0) and p(x0)
4. Generate another random number b ~ U(0, f(x0))
5. If b ≤ p(x0) accept x0, otherwise go to 1
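The five steps above can be sketched with a non-constant comparison function. The target and the envelope used here are assumptions made for illustration and do not come from the lecture: the target is the half-normal pdf p(x) = sqrt(2/π) exp(−x²/2) for x ≥ 0, and the envelope is f(x) = A exp(−x) with A = sqrt(2e/π), which satisfies p(x) ≤ f(x) because (x − 1)² ≥ 0. Its cumulative function is F(x) = A(1 − exp(−x)), with inverse F⁻¹(a) = −ln(1 − a/A).

```matlab
rng(4, 'twister');                          % fixed seed for reproducibility

% Illustrative target and envelope (assumptions, see the text above)
p    = @(x) sqrt(2/pi) * exp(-x.^2 / 2);    % half-normal pdf, x >= 0
A    = sqrt(2 * exp(1) / pi);               % integral of the envelope
f    = @(x) A * exp(-x);                    % envelope, p(x) <= f(x)
Finv = @(a) -log(1 - a / A);                % inverse cumulative function of f

n       = 1e4;                              % number of variates wanted
samples = zeros(n,1);
ntrials = 0;                                % counts all candidates tried

for k = 1:n
    accepted = false;
    while ~accepted
        a  = A * rand;                      % step 1: a ~ U(0, A)
        x0 = Finv(a);                       % step 2: candidate x0
        fx = f(x0);  px = p(x0);            % step 3: evaluate f and p
        b  = fx * rand;                     % step 4: b ~ U(0, f(x0))
        accepted = (b <= px);               % step 5: accept or try again
        ntrials  = ntrials + 1;
    end
    samples(k) = x0;
end

efficiency = n / ntrials;                   % expected value 1/A
fprintf('mean = %.4f (expected %.4f), efficiency = %.3f (expected %.3f)\n', ...
        mean(samples), sqrt(2/pi), efficiency, 1/A);
```

The closer the envelope follows the target, the closer A is to 1 and the fewer candidates are wasted.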
The rejection method: Two examples
Example 1: Generate x ~ Beta(2,5)
This is the beta distribution (see L1:18) with parameters α = 2 and β = 5:
$$p(x) = \frac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha,\beta)} = 30\,x\,(1-x)^4, \qquad 0 \le x \le 1$$
The transformation method is not useful, since the cdf is a polynomial of degree 6.
But we have p(x) < 2.5 everywhere (see diagram), so we can use f(x) = 2.5 (0 ≤ x ≤ 1)
with integral A = 2.5. The cdf of f(x) is y = F(x) = Ax for 0 ≤ x ≤ 1, and the inverse is
$x = F^{-1}(y) = y/A$. Thus the procedure in this case is:
1. Generate a number a ~ U(0, A)
2. Apply transformation $x = F^{-1}(a) = a/A$ [this is simply x ~ U(0,1)]
3. Compute f(x) = A and p(x)
4. Generate another random number y ~ U(0, A)
5. If y ≤ p(x) accept x, otherwise go to 1
It can be seen that this is equivalent to placing
random points (x, y) in the rectangle outlined by f (x)
and accepting the x value if the y value is below p(x).
The efficiency of the method depends on the ratio of
areas below the two curves, i.e., A = 2.5 in this case.
On average, A pairs (x, y) are needed to generate one x.
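The rectangle picture lends itself to a vectorised sketch: generate many points (x, y) at once and keep the x values that fall below the curve. The pdf is written out as 30 x (1 − x)^4, the Beta(2,5) density, so that no toolbox function is needed; the number of pairs and the seed are arbitrary.

```matlab
rng(5, 'twister');                  % fixed seed for reproducibility

p = @(x) 30 * x .* (1 - x).^4;      % Beta(2,5) pdf on 0 <= x <= 1
A = 2.5;                            % height (and area) of the rectangle

npairs = 1e5;
x = rand(npairs,1);                 % steps 1-2: x = a/A with a ~ U(0,A)
y = A * rand(npairs,1);             % step 4: y ~ U(0, A)

keep    = (y <= p(x));              % step 5: accept points below p(x)
samples = x(keep);

efficiency = mean(keep);            % expected 1/A = 0.4
fprintf('accepted %d of %d pairs (%.3f), sample mean %.4f (expected %.4f)\n', ...
        numel(samples), npairs, efficiency, mean(samples), 2/7);
```

The sample mean is compared with the mean α/(α + β) = 2/7 of the Beta(2,5) distribution.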
[Figure: p(x) for Beta(2,5) and the constant comparison function f(x) = 2.5, plotted against x from −0.2 to 1.2.]
Example 2: The ziggurat algorithm for x ~ N(0,1)
The ziggurat algorithm (by Marsaglia) is the most commonly used method to generate
Gaussian numbers (e.g., randn in MATLAB) because it is very efficient.
It is essentially the rejection method applied to segments of the Gaussian curve (see
diagram). First, one of the rectangles is selected at random (they all have equal area).
A second random number then gives the x value within that rectangle: if x lies to the left
of the dotted line it is accepted immediately; otherwise the rejection method is applied
to decide whether the point is below the Gaussian curve.
[Figure: Gaussian curve for 0 ≤ x ≤ 4 divided into equal-area rectangles, with the dotted lines referred to in the text.]
[Figure: Artist's drawing of a Guto-Sumerian ziggurat (step pyramid). Source: www.iranian.com]
MATLAB functions for common distributions
Distribution   pdf or pmf   cdf          inverse cdf     random      standard
                            p = F(x)     x = F^-1(p)     generator   (0, 1)
Uniform        unifpdf      unifcdf      unifinv         unifrnd     rand
Beta           betapdf      betacdf      betainv         betarnd
Normal         normpdf      normcdf      norminv         normrnd     randn
Chi-squared    chi2pdf      chi2cdf      chi2inv         chi2rnd
Exponential    exppdf       expcdf       expinv          exprnd
Gamma          gampdf       gamcdf       gaminv          gamrnd
Binomial       binopdf      binocdf      binoinv         binornd
Poisson        poisspdf     poisscdf     poissinv        poissrnd
Initialization of random number generator: rng(seed), rng('shuffle')
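The inverse-cdf functions in the table can be combined with rand to apply the transformation method directly. The sketch below assumes that the Statistics and Machine Learning Toolbox is installed (the functions in the table belong to it, apart from rand and randn). Note that the toolbox parameterises the exponential distribution by its mean μ = 1/λ rather than by the rate λ; the value lambda = 2 is a hypothetical choice.

```matlab
rng(6, 'twister');              % fixed seed for reproducibility
lambda = 2;                     % hypothetical rate
mu     = 1 / lambda;            % toolbox functions take the mean, not the rate
n      = 1e5;

u  = rand(n,1);                 % u ~ U(0,1)
y1 = expinv(u, mu);             % transformation method via the inverse cdf
y2 = exprnd(mu, n, 1);          % built-in generator, for comparison

pcheck = expcdf(y1, mu);        % the cdf maps y1 back to u

fprintf('mean(y1) = %.4f, mean(y2) = %.4f, expected %.4f\n', ...
        mean(y1), mean(y2), mu);
```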