Ren_RandomNumbers.ppt

advertisement
Pseudorandom Number Generators
and the Metropolis Algorithm
Jane Ren
Outline
• Introduction to random number
generators
• The Marsaglia random number
generator
• SPRNG (A Scalable Library for
Pseudorandom Number Generation)
• Textbook assignment 3.2.3 with
Marsaglia random numbers
• Assignment 3.2.3 with random
numbers from another type of
random number generator.
• Test results
• Conclusions
Properties of Good Random
Number Generators
•
•
•
•
•
•
•
•
•
Randomness
Uniformity
Computational Efficiency
Long period length
Unpredictability
Repeatability
Portability
Homogeneity
Good random numbers are not easy
to find
The Marsaglia Random Number Generator
• The programs in our textbook use
the Marsaglia generator.
• A combinations of two generators
• Lagged Fibonacci series
– In = In-r – In-s mod 224, r = 97, s = 33
• Arithmetic Series
– I – k, I – 2k, I – 3k…, mod (224 - 3), k
= 7654321
• The final step
–
–
–
–
Take xn from a lagged Fibonacci series
Take yn from a n arithmetic series
If x >= y, the final number is x – y
Else the final number is x – y + 1
Sample Output from the Marsaglia Generator
• Assignment a0102_02 with seed1 = 1
and seed2 = 6.
• Run 1
RANMAR INITIALIZED.
idat= 1, xr= 0.894103587
idat= 2, xr= 0.034489155
idat= 3, xr= 0.330096781
idat= 4, xr= 0.613509536
extra
xr= 0.710489452
• Run 2
MARSAGLIA CONTINUATION.
idat= 1, xr= 0.710489452
idat= 2, xr= 0.429685593
idat= 3, xr= 0.142630935
idat= 4, xr= 0.369295061
extra
xr= 0.894588530
Some Advantages of he
Marsaglia Generator
• The lagged Fibonacci series has a
period of (224 – 1) * 294 ≈2110. This
is very good for a pseudorandom
number generator.
• The Marsaglia random number
generator is portable because it
uses the IEEE standard
representation of 32 bit floating
point numbers.
Some Drawbacks of the Marsaglia Generator
•
Low discretization step size 2-24
– We have already seen this problem from assignment
a0204_03.
•
•
Not as fast as some other generators, such as the
congruential generator.
It fails the birthday spacing test.
– The birthday spacing test chooses m birthdays in a year of
n days. If a certain day appeared more than once, then
that value should be asymptotically Poisson distributed
with mean m**3/(4n).
3.2.3 Assignment for section 3.2 Page 152
• a0302_01
• “Write a Metropolis program to generate
the normal distribution through the
Markov process
x  x’ = x + 2a(xr – 0.5)
where xr is a uniformly distributed
random number in the range [0,1) and a
is a real constant. Use a = 3.0 and the
initial value x = 0.0 to test your program.
(i) Generate a chain of 2000 events. Plot
the empirical peaked distribution
function for these event in comparison
with the exact peaked distribution
function. Monitor also the acceptance
rate. (ii) Repeat the above for a chain of
20000 events.”
Assignment a0302_01
•
Output for a chain of 20 000 events
RANMAR INITIALIZED.
Metropolis Generation of Gaussian random numbers
Ndat=20000, a=3.0000
Acceptance rate found 0.491700
•
CPU time for running the Metropolis Program with
Marsaglia = 0.126u (average of 10 runs)
•
Comparison between the empirical peaked distribution and
the exact peaked normal distribution
Scalable Library for Pseudorandom
Number Generation (SPRNG)
•
•
Software package released under the
GNU General Public License.
There are 6 kinds of random number
generators in SPRNG
1. Combined Multiple Recursive Generator
2. 48 Bit Linear Congruential Generator with
Prime Addend
3. 64 Bit Linear Congruential Generator with
Prime Addend
4. Modified Lagged Fibonacci Generator
5. Multiplicative Lagged Fibonacci Generator
6. Prime Modulus Linear Congruential
Generator
* Mersenne Twister Generator (will be added)
•
Link http://sprng.cs.fsu.edu/
Why SPRNG?
• The random number generators in SPRNG
are considered the top generators.
• The generators have never failed the
standard tests.
– Including some of the largest random number
tests with up to 10**13
• The tests used were collisions, coupon,
equidistribution, gap, maximum of t,
permutations, poker, runs up, serial, sum of
distributions, Ising: Metropolis, Ising: Wolff, and
Random Walk.
• What if one of the SPRNG generators
was used for assignment a0302_01?
• All of the random number generators
passed the empirical tests, so which one
do we want to use?
Descriptions of Some of the Tests
SPRNG Generators passed
• Permutation Test
– Divide the random sequence into
subsequences of a certain length.
– Rank each value in the subsequence by
magnitude.
– Count the frequency each permutation occurs.
– Conduct a Chi-Square test with these counts.
– Compare the results.
• Collision Test
– Measure the number of times the number falls
at a position that already had a number in it.
• Coupon Collector’s Test
– Compare actual period with expected period
The Generators
1. Combined Multiple Recursive Generator
•
•
Period ≈ 2219
Fast because it is a linear congruential
generator
2. 48 Bit Linear Congruential Generator
with Prime Addend
•
•
•
Period = 248
Number of distinct streams is of the order of
219
Fast because it is a linear congruential
generator
3. 64 Bit Linear Congruential Generator
with Prime Addend
•
•
•
Period = 264
Number of distinct streams is of the order of
108
Fast because it is a linear congruential
generator
The Generators
4. Modified Lagged Fibonacci Generator
•
Period = 231(2l – 1), where l is the lag.
•
•
The default period is 21310
Number of distinct streams is of the order of
231(l – 1)-1 , default = 21310
5. Multiplicative Lagged Fibonacci
Generator
•
•
Period = 261(2l – 1), default = 281
Number of distinct streams is of the order of
263(l – 1)-1 , default = 281
6. Prime Modulus Linear Congruential
Generator
•
•
•
Period = 264 - 2
Number of distinct streams ≈ 258
Fast because it is a linear congruential
generator
Is there a better generator?
• The current 6 random number
generators in SPRNG work
sufficiently well.
• The Mersenne Twister generator
MT19937 is not a part of SPRNG
yet, but it is being added to SPRNG.
• I chose to use Mersenne Twister as
a substitute for the Marsaglia
generator for a0302_01.
Why Mersenne Twister?
• Astronomical period of 219937 – 1
– No other generator has achieved this!
– Marsaglia has 2110
• 623 dimensional equidistribution with up to 32 bit
accuracy in a working area of only 624 words.
– No other generator has achieved this!
• Can be employed in practical important
simulations because it is based on a good
definition of randomness.
– Many other generators are only random in a
particular area.
• Performance exceeds that of integer-largemodulus generators.
• Efficient random number generation.
• Passed tests many stringent tests, including the
diehard tests (invented by Marsaglia) and tests
on the higher dimensional uniformity, including
the spectral test and the k-distribution test.
• Most suitable for Monte Carlo because it
emphasizes on the most significant bits.
Mersenne Twister – The Algorithm
•
Based on the recurrence (mod 2)
•
l, s, t, and u are integers, b and c are bit masks of word size,
x[0:n-1] is an array of n unsigned integers of word size, and
u, ll, and a are unsigned constant integers of word size.
Tempering Matrix
• The Mersenne Twister uses a tempering
matrix, so that multiplication by the
tempering matrix is very efficient and fast.
– Example of a tempering Matrix
• Let x = (xw-1, xw-2, xw-3,…, x0)
a = (aw-1, aw-2, aw-3,…, a0)
• xA can be calculated with only binary
operations.
– xA = shiftright(x)
if x0 = 0
– xA = shiftright(x) XOR a if x0 = 1
Mersenne Twister – How does It Work?
• Step 0
– Create a mask for the upper bits and another one
for the lower bits.
• Step 1
– The x array is initialized with a seed
• Step 2
– y is the upper bits of x[i] concatenated with the
lower bits of x[i+1].
Mersenne Twister – How does It Work?
• Step 3
– y * A.
– A is chosen carefully so it can be easily calculated
with right shift and exor.
• Step 4
– Multiply with T (tempering matrix) for better
equidistribution and good acurracy.
• Steps 5 & 6
•
Increment i by 1 and repeat the process
Mersenne Twister
• We use an improved version of the
Mersenne Twister from it’s original
authors (released 2/2002).
• Some disadvantages of the original
version
– Most significant bits only affect most
significant bits of the array state.
– Non-overlapping subsequences on
successive runs is not possible.
– Not as fast as the improved version.
Sample Output from the Mersenne Twister
Run 1
MERSENNE TWISTER INITIALIZED.
idat= 1, xr= 0.814723692
idat= 2, xr= 0.135477004
idat= 3, xr= 0.905791934
idat= 4, xr= 0.835008590
extra xr= 0.126986812
• Run 2
MERSENNE TWISTER CONTINUATION.
idat= 0, xr= 0.126986812
idat= 1, xr= 0.968867771
idat= 2, xr= 0.913375856
idat= 3, xr= 0.221034043
extra xr= 0.632359250
Mersenne Twister in a302_01
• Output for a chain of 20 000 events
RANMAR INITIALIZED.
Mersenne Twister Generation of
Gaussian random numbers
Ndat=20000, a=3.0000
Acceptance rate found 0.497850
• The acceptance rate is 0.4917000
with the Marsaglia generator
• CPU Time for the Metropolis
Program with Mersenne Twister =
0.124 s (Average of 10 runs)
– Marsaglia = 0.126 s
Mersenne Twister in a302_01
• Comparison between the empirical
peaked distribution and the exact
peaked normal distribution
Results from Marsaglia and Mersenne Twister Combined
Mersenne Twister and Marsaglia Execution Times
• The CPU times for running the Metropolis
program with Marsaglia and Mersenne
Twister are very close.
• Does it mean both generators have
roughly the same speed?
•
Mersenne Twister is faster than
Marsaglia.
• I wrote a program to only generate
4,294,967,295 Marsaglia numbers and
another one that only generates
4,294,967,295 Mersenne Twister
numbers.
– Mersenne Twister = 244.710u
– Marsaglia = 312.580u
– MT is approximately 22% faster.
Gaussian Difference Test
• Data from the Metropolis program
• Marsaglia
– Mean = 0.003543
– Error bar = 0.007169
• Mersenne Twister
– Mean = -0.008677
– Error bar = 0.006998
• Q = 0.222553
GNUPLOT FOR DISTRIBUTION FUNCTION.
• STMC_DF_GNU(Ndat, data1);
• STMC_DF_GNU(Ndat2, data2);
The Two-Sided Kolmogorov Test
• N = N1 = N2 = 50000
• KOLM2_AS2(N1, N2, DAT1, DAT2, DEL, Q)
– DAT1 = Marsaglia, DAT2 = Mersenne Twister
• DEL = 0.001172, Q = 0.882084
– DAT1 = Marsaglia1, DAT2 = Marsaglia2
• DEL = 0.001148, Q = 0.896565
– DAT1 = MT1, DAT2 = MT2
• DEL = 0.002050, Q = 0.243914
• DEL is the maximum deviation from the
exact distribution function
• Q is the approximation to the probability
that this deviation is due to chance.
Conclusions
• The Mersenne Twister generator is
more suitable for Monte Carlo
Simulations than the Marsaglia
random number generator.
– Mersenne Twister is faster.
– Mersenne Twister more accurate.
– Mersenne Twister has a linear
discretization.
• Mersenne Twister is probably the
best generator at the present time.
References
1.
2.
3.
4.
5.
6.
Berg, Bernd. Markov Chain Monte Carlo
Simulations and Their Statistical Analysis.
World Scientific, 2004.
Karaoglu, Hihat. Mersenne Twister: A Study on
Random Number Generators. Project Prestudy.
Vrije Universiteit Brussel, Apr 2004.
Korab, Holly. Best of the Random Number
Generators. The National Center for
Supercomputing Applications. Featured story,
Feb 1999.
L’Ecuyer Pierre. Uniform Random Number
Generation. The Annals of Operations Research
(1994).
Lee, J.C. Random Number and Statistical Tests.
Talk at Chuo University, Japan. Jan, 2003.
Matsumoto, Makoto and Takuji Nishimura. Mersenne
Twister: A 623-Dimensionally Equidistributed Uniform
Pseudo-Random Number Generator. ACM
Transactions on Modeling and Computer Simulations,
Vol. 8, No. 1, Jan 1998, Pages 3-30.
References (continued)
7.
8.
Matsumoto, Makoto and Takuji Nishmura.
rng_afm.c. An Improved version of the
Mersenne Twister.
http://www.math.keio.ac.jp/matumoto/emt.ht
ml
Mascagni, Michael, and Srinivasan, Ashok.
Algorithm 806: SPRNG: A Scalable Library for
Pseudorandom Number Generation. ACM
Transactions on Mathematical Software, Vol 26,
No. 3, September 2000, pages 436-461.
9. RSA Laboratories’ Bulletin – News and Advice on
Data Security and Cryptography. Number 8.
Sept 3, 1998.
10. Sinclair, Bart. Permutation Test. Class notes for
Elec 428 Computer Systems Performance.
http://www.owlnet.rice.edu/~elec428/rng/perexp.h
tml
11. The Scalable Parallel Random Number
Generator Library (SPRNG) for ASCI Monte
Carlo Computations. http://sprng.cs.fsu.edu/
Questions
Download