Random-Number Generation Andy Wang CIS 5930-03 Computer Systems

Random-Number Generation
Andy Wang
CIS 5930-03
Computer Systems
Performance Analysis
Generate Random Values
• Two steps
– Random-number generation
• Get a sequence of random numbers distributed
uniformly between 0 and 1
– Random-variate generation
• Transform the sequence to produce random
values satisfying the desired distribution
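The two steps can be sketched as follows. This is a minimal illustration, not something taken from the slides: the exponential target distribution, the rate `lam`, and the helper name `exponential_variates` are assumptions, and Python's `random.Random` stands in for the uniform generator of step one.

```python
import math
import random


def exponential_variates(count, lam, seed):
    """Return `count` exponential(lam) values built from uniform numbers."""
    rng = random.Random(seed)          # step 1: uniform numbers in [0, 1)
    values = []
    for _ in range(count):
        u = rng.random()
        # step 2: inverse transform; 1 - u lies in (0, 1], so the log is defined
        values.append(-math.log(1.0 - u) / lam)
    return values


sample = exponential_variates(10_000, lam=2.0, seed=1)
print(sum(sample) / len(sample))       # should be close to 1 / lam = 0.5
```

Any other target distribution would change only step two; the uniform stream in step one stays the same.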
Background
• The most common method
– Use a recursive function
$x_n = f(x_{n-1}, x_{n-2}, \ldots)$
Example
• $x_n = (5x_{n-1} + 1) \bmod 16$
– Suppose $x_0 = 5$
– The first 32 numbers are between 0 and 15
• Divide $x_n$ by 15 to get numbers between 0 and 1
[Figures: two plots of random number versus nth number (0 to 40), one for the integers 0 to 15 and one for the values scaled to 0 to 1]
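The example generator can be reproduced in a few lines. The function name `lcg_example` is only illustrative; the constants and the division by 15 are the ones on the slide.

```python
def lcg_example(seed=5, count=32):
    """Yield `count` values of x_n = (5 * x_{n-1} + 1) % 16, starting after the seed."""
    x = seed
    for _ in range(count):
        x = (5 * x + 1) % 16
        yield x


integers = list(lcg_example())
uniforms = [x / 15 for x in integers]   # the slide's scaling: 0 maps to 0.0, 15 to 1.0

print(integers[:16])   # [10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5]
print(integers[16:] == integers[:16])   # True: the second 16 repeat the first 16
```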
Basic Terms
• $x_0$ = seed
– Given the function, the entire sequence can be regenerated from $x_0$
• Generated numbers are pseudorandom
– Deterministic
– Can pass statistical tests for randomness
– Preferred to fully random numbers so that simulated results can be repeated
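A short sketch of why the seed matters for repeatability: the same function and the same seed always regenerate the same sequence. The helper `lcg_stream` and its default constants (borrowed from the small example generator) are illustrative assumptions.

```python
def lcg_stream(seed, count, a=5, b=1, m=16):
    """Return `count` values of x_n = (a * x_{n-1} + b) % m from the given seed."""
    values, x = [], seed
    for _ in range(count):
        x = (a * x + b) % m
        values.append(x)
    return values


run_1 = lcg_stream(seed=5, count=10)
run_2 = lcg_stream(seed=5, count=10)   # same seed, same function
run_3 = lcg_stream(seed=6, count=10)   # different seed

print(run_1 == run_2)   # True: the sequence is fully determined by the seed
print(run_1 == run_3)   # False
```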
Cycle Length
• Note that starting with the 17th number,
the sequence repeats
– Cycle length of 16
[Figure: random number (0 to 1) versus nth number (0 to 40)]
More Terms
• Some generators do not repeat the
initial part (tail) of the sequence
• Period of a generator = tail + cycle length
[Figure: timeline marking the tail, the cycle length, and the period]
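Tail and cycle length can be measured by remembering where each value first appeared. The sketch below assumes a first-order generator (the next value depends only on the current one); the helper name `tail_and_cycle` and the second, made-up generator are illustrative only.

```python
def tail_and_cycle(step, seed):
    """Return (tail, cycle_length) of the sequence seed, step(seed), ..."""
    first_seen = {}            # value -> index of its first occurrence
    x, index = seed, 0
    while x not in first_seen:
        first_seen[x] = index
        x = step(x)
        index += 1
    tail = first_seen[x]       # number of values before the cycle starts
    return tail, index - tail


# The example LCG has no tail: every value is on the cycle.
print(tail_and_cycle(lambda x: (5 * x + 1) % 16, 5))    # (0, 16)
# A made-up generator with a tail: 1, 2, 4, 8, 4, 8, ...
print(tail_and_cycle(lambda x: (2 * x) % 12, 1))        # (2, 2)
```

With the slide's definition, the period of the second generator is 2 + 2 = 4.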
Question
• How to choose seeds and random-number generation functions?
1. Efficiently computable
• Heavily used in simulations
2. The period should be large
3. Successive values should be independent and uniformly distributed
Types of Random-Number
Generators
• Linear-congruential generators
• Tausworthe generators
• Extended Fibonacci generators
• Combined generators
• Others
Linear-Congruential
Generators
• In 1951, Lehmer found that residues of successive powers of a number have good randomness properties
$x_n = a^n \bmod m = (a \cdot a^{n-1}) \bmod m = a x_{n-1} \bmod m$
• Lehmer’s choices of a and m
a = 23 (multiplier)
m = $10^8 + 1$ (modulus)
• Implemented on ENIAC
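The equivalence between the power form and the recurrence can be checked directly with Lehmer's constants. The starting value $x_0 = a^0 = 1$ is an assumption implied by the power form, and the variable names are illustrative.

```python
A = 23              # Lehmer's multiplier
M = 10**8 + 1       # Lehmer's modulus

x = 1               # x_0 = a^0 = 1
by_recurrence = []
for n in range(1, 11):
    x = (A * x) % M             # x_n = a * x_{n-1} mod m
    by_recurrence.append(x)

by_powers = [pow(A, n, M) for n in range(1, 11)]   # x_n = a^n mod m

print(by_recurrence[:3])            # [23, 529, 12167]
print(by_recurrence == by_powers)   # True
```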
(Mixed) Linear-Congruential
Generators (LCG)
• $x_n = (a x_{n-1} + b) \bmod m$
• $x_n$ is between 0 and $m - 1$
• a and b are non-negative integers
• “Mixed” means using both multiplication by a and addition of b
The Choice of a, b, and m
• m should be large
– Period is never longer than m
• To compute mod m efficiently
– Make $m = 2^k$
– Just truncate the result to its k low-order bits
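When the modulus is a power of two, the remainder is just the k low-order bits, which a bit mask extracts without a division. A minimal sketch, with `K`, `MASK`, and `mod_pow2` as illustrative names:

```python
K = 16
MASK = (1 << K) - 1        # binary: K ones


def mod_pow2(value):
    """Return value % 2**K by keeping only the K low-order bits."""
    return value & MASK


product = 25173 * 12345 + 13849      # arbitrary non-negative test value
print(mod_pow2(product) == product % (1 << K))   # True
```

In a language with fixed-width unsigned integers, choosing k equal to the word size makes the truncation happen for free on overflow.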
The Choice of a, b, and m
• If b > 0, the maximum period m is obtained when
– $m = 2^k$
– $a = 4c + 1$
– b is odd
– c, b, and k are positive integers
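For a small modulus, the conditions can be checked by brute force. The sketch below uses $m = 2^8$ and hypothetical values for a and b; the helper `lcg_period` simply counts steps until the seed recurs.

```python
def lcg_period(a, b, m, seed=0):
    """Count steps until x_n = (a * x_{n-1} + b) % m returns to the seed."""
    x, steps = seed, 0
    while True:
        x = (a * x + b) % m
        steps += 1
        if x == seed:
            return steps
        if steps > m:           # the seed is on a tail and never recurs
            return None


M = 2**8
print(lcg_period(a=4 * 3 + 1, b=7, m=M))   # a = 4c + 1, b odd: full period, 256
print(lcg_period(a=4 * 3 + 1, b=6, m=M))   # b even: shorter period
print(lcg_period(a=4 * 3 + 3, b=7, m=M))   # a is not of the form 4c + 1: shorter
```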
Full-Period Generators
• Generators with maximum possible
periods
• Not equally good
– Look for low autocorrelations between
successive numbers
– $x_n = ((2^{34} + 1)x_{n-1} + 1) \bmod 2^{35}$ has an autocorrelation of 0.25
– $x_n = ((2^{18} + 1)x_{n-1} + 1) \bmod 2^{35}$ has an autocorrelation of $2^{-18}$
Multiplicative LCG
• $x_n = a x_{n-1} \bmod m$, i.e., b = 0
• Can be computed more efficiently when $m = 2^k$
• However, the maximum period is only $2^{k-2}$
• Problem: cyclic patterns in the lower bits
Multiplicative LCG
with $m = 2^k$
• When $a = 8i \pm 3$
– E.g., $x_n = 5x_{n-1} \bmod 2^5$
• Period is only 8
• Which is ¼ of $2^5$
• When $a \neq 8i \pm 3$
– E.g., $x_n = 7x_{n-1} \bmod 2^5$
• Period is only 4
[Figures: random number (0 to 30) versus nth number (0 to 40), one plot for each generator]
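Both small generators can be run to completion to confirm the periods. The seed of 1 is an assumption (any odd seed would do), and `mult_lcg_cycle` is an illustrative helper.

```python
def mult_lcg_cycle(a, m, seed):
    """Return one full cycle of x_n = (a * x_{n-1}) % m starting from the seed."""
    cycle, x = [], seed
    while True:
        x = (a * x) % m
        cycle.append(x)
        if x == seed:
            return cycle


M = 2**5
cycle_5 = mult_lcg_cycle(5, M, seed=1)
cycle_7 = mult_lcg_cycle(7, M, seed=1)

print(cycle_5)   # [5, 25, 29, 17, 21, 9, 13, 1]  -> period 8
print(cycle_7)   # [7, 17, 23, 1]                 -> period 4
print({x & 0b11 for x in cycle_5})   # {1}: the two lowest bits never change
```

The last line illustrates the low-order-bit problem: even in the longer cycle, the two least significant bits are constant.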
Multiplicative LCG
with $m \neq 2^k$
• To get a longer period, use a prime number for m
– With a proper choice of a, it is possible to get a period of $m - 1$
– a needs to be a primitive root of m
• If and only if $a^n \bmod m \neq 1$ for $n = 1, \ldots, m - 2$
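The condition on a translates directly into a loop over the powers of a. This sketch assumes m is prime and 0 < a < m; the function name `is_primitive_root` is illustrative.

```python
def is_primitive_root(a, m):
    """Check the slide's condition: a^n % m != 1 for n = 1 .. m - 2 (m prime)."""
    power = 1
    for _ in range(1, m - 1):       # n = 1, 2, ..., m - 2
        power = (power * a) % m
        if power == 1:
            return False
    return True


print(is_primitive_root(3, 31))     # True: x_n = 3 * x_{n-1} % 31 has period 30
print(is_primitive_root(5, 31))     # False: 5^3 % 31 == 1
```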
Multiplicative LCG
with $m \neq 2^k$
• $x_n = 3x_{n-1} \bmod 31$
– $x_0 = 1$
– Period is 30
– 3 is a primitive root of 31
[Figure: random number (0 to 30) versus nth number (0 to 40)]
Multiplicative LCG
with $m \neq 2^k$
• $x_n = 7^5 x_{n-1} \bmod (2^{31} - 1)$
– $7^5$ is a primitive root of $2^{31} - 1$
– But watch out for computational errors
• Multiplication overflow
– Need to apply the tricks mentioned on p. 442 of the textbook
• Truncation due to the number of digits available
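One well-known way to avoid the overflow is Schrage's method, sketched below. Whether this is exactly the trick on the cited textbook page is an assumption. Python integers never overflow, so the direct version serves as a reference to check the overflow-safe one against.

```python
A = 7**5            # 16807
M = 2**31 - 1       # 2147483647
Q = M // A          # 127773
R = M % A           # 2836


def next_direct(x):
    """Reference version; relies on Python's arbitrary-precision integers."""
    return (A * x) % M


def next_schrage(x):
    """Same result, but every intermediate value stays below 2**31."""
    x = A * (x % Q) - R * (x // Q)
    return x if x > 0 else x + M


x_direct = x_schrage = 1
for _ in range(10_000):
    x_direct = next_direct(x_direct)
    x_schrage = next_schrage(x_schrage)

print(x_direct == x_schrage)   # True
```

The method relies on R being smaller than Q, which holds for this choice of a and m.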
Tausworthe Generators
• How to generate large random
numbers?
• The Tausworthe generator produces a
random sequence of binary digits
– The generator then divides the sequence
into strings of desired lengths
– Based on a characteristic polynomial
Tausworthe Example
• Suppose we use the following characteristic polynomial
$x^7 + x^3 + 1$
– The corresponding generation function is
• $b_{n+7} \oplus b_{n+3} \oplus b_n = 0$
Or
• $b_n = b_{n-4} \oplus b_{n-7}$
– Need a 7-bit seed
Tausworthe Example
• The bit stream sequence
1111111000011101111001011001…
• Convert to random numbers between 0 and 1, with 8-bit numbers
$x_0 = 0.11111110_2 = 0.99219_{10}$
$x_1 = 0.00011101_2 = 0.11328_{10}$
$x_2 = 0.11100101_2 = 0.89453_{10}$
…
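The bit stream and the 8-bit numbers can be regenerated from the recurrence $b_n = b_{n-4} \oplus b_{n-7}$. The all-ones seed is inferred from the first seven bits of the stream shown; the helper names are illustrative.

```python
def tausworthe_bits(count, seed_bits=(1, 1, 1, 1, 1, 1, 1)):
    """Return `count` bits of b_n = b_{n-4} XOR b_{n-7}, seed bits included."""
    bits = list(seed_bits)
    while len(bits) < count:
        bits.append(bits[-4] ^ bits[-7])
    return bits[:count]


def to_fractions(bits, width=8):
    """Cut the bit stream into `width`-bit strings read as binary fractions."""
    fractions = []
    for start in range(0, len(bits) - width + 1, width):
        word = int("".join(map(str, bits[start:start + width])), 2)
        fractions.append(word / 2**width)
    return fractions


stream = tausworthe_bits(28)
print("".join(map(str, stream)))        # 1111111000011101111001011001
print(to_fractions(stream))             # [0.9921875, 0.11328125, 0.89453125]
```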
Tausworthe Generator
Characteristics
• For the L-bit numbers generated
+ $E[x_n] = 1/2$
+ $V[x_n] = 1/12$
+ The serial correlation is zero
+ Good results over the complete cycle
- Poor local behavior within a sequence
Tausworthe Example
• If a characteristic polynomial of order q has a period of $2^q - 1$, it is a primitive polynomial
• For $x^7 + x^3 + 1$
• q = 7
• Sequence repeats after 127 bits = $2^7 - 1$
• A primitive polynomial
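The period can be measured by stepping the q most recent bits until they return to the seed. The sketch handles only trinomials $x^q + x^r + 1$, and the helper name `bit_stream_period` and the all-ones seed are assumptions; the second call uses a polynomial that is not from the slides, purely for contrast.

```python
def bit_stream_period(q, r):
    """Period of b_n = b_{n-(q-r)} XOR b_{n-q}, i.e. of x^q + x^r + 1."""
    seed = (1,) * q                     # all-ones seed
    state, steps = seed, 0
    while True:
        new_bit = state[r] ^ state[0]   # state holds b_{n-q} ... b_{n-1}
        state = state[1:] + (new_bit,)
        steps += 1
        if state == seed:
            return steps


print(bit_stream_period(7, 3))   # 127 = 2**7 - 1, so x^7 + x^3 + 1 is primitive
print(bit_stream_period(4, 2))   # 6 for this seed, well short of 2**4 - 1 = 15
```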
Tausworthe Implementation
• Can be easily generated via linear-feedback shift registers
• For $x^5 + x^3 + 1$
[Figure: shift register with stages labeled $b_n$, $b_{n-1}$, $b_{n-2}$, $b_{n-3}$, $b_{n-4}$, $b_{n-5}$]
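A shift register for $x^5 + x^3 + 1$ can be sketched with integer bit operations. The tap positions are an assumption derived by applying the same convention as the $x^7 + x^3 + 1$ example, which gives $b_n = b_{n-2} \oplus b_{n-5}$; the wiring in the original figure is not recoverable here.

```python
def lfsr_bits(count, register=0b11111):
    """Generate bits of b_n = b_{n-2} XOR b_{n-5} with a 5-bit shift register.

    Bit i of `register` (counting from 0 at the right) holds b_{n-1-i},
    so bit 0 is the newest bit and bit 4 is the oldest.
    """
    out = []
    for _ in range(count):
        new_bit = ((register >> 1) ^ (register >> 4)) & 1   # b_{n-2} XOR b_{n-5}
        register = ((register << 1) | new_bit) & 0b11111    # shift in the new bit
        out.append(new_bit)
    return out


bits = lfsr_bits(62)
print(bits[:31] == bits[31:])   # True: the stream repeats after 2**5 - 1 = 31 bits
```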
Extended Fibonacci
Generators
• $x_n = (x_{n-1} + x_{n-2}) \bmod m$
– Does not have good randomness properties
– High serial correlation
• An extension
– $x_n = (x_{n-5} + x_{n-17}) \bmod 2^k$
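The extended generator needs the 17 most recent values, which a fixed-length queue keeps conveniently. In this sketch the seed values 1 to 17 and the word size k = 16 are arbitrary placeholders, not recommendations.

```python
from collections import deque


def additive_fibonacci(seeds, k, count):
    """Return `count` values of x_n = (x_{n-5} + x_{n-17}) % 2**k.

    `seeds` supplies the 17 starting values x_0 .. x_16; all-even seeds
    would produce only even numbers.
    """
    if len(seeds) != 17:
        raise ValueError("need exactly 17 seed values")
    mask = (1 << k) - 1
    history = deque(seeds, maxlen=17)        # history[-j] is x_{n-j}
    values = []
    for _ in range(count):
        x = (history[-5] + history[-17]) & mask
        history.append(x)                    # drops the oldest value
        values.append(x)
    return values


print(additive_fibonacci(list(range(1, 18)), k=16, count=5))   # [14, 16, 18, 20, 22]
```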
Combined Generators
• Add random numbers from two or more generators
– Can considerably increase the period and randomness
$x_n = 40014 x_{n-1} \bmod 2147483563$
$y_n = 40692 y_{n-1} \bmod 2147483399$
$w_n = (x_n - y_n) \bmod 2147483562$
– This generator has a period of $2.3 \times 10^{18}$
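The three formulas map directly onto code. The seeds 12345 and 67890 below are arbitrary placeholders, and `combined_lcg` is an illustrative name.

```python
def combined_lcg(x_seed, y_seed, count):
    """Return `count` values w_n built from the two multiplicative LCGs."""
    x, y = x_seed, y_seed
    values = []
    for _ in range(count):
        x = (40014 * x) % 2147483563
        y = (40692 * y) % 2147483399
        # Python's % returns a non-negative result even when x < y
        values.append((x - y) % 2147483562)
    return values


w = combined_lcg(12345, 67890, count=5)
print(w)
```

In a language where the remainder of a negative number is negative, the difference needs an explicit correction (add 2147483562 when it is below zero).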
Combined Generators
$w_n = 157 w_{n-1} \bmod 32363$
$x_n = 146 x_{n-1} \bmod 31727$
$y_n = 142 y_{n-1} \bmod 31657$
$v_n = (w_n - x_n + y_n) \bmod 32362$
– This generator has a period of $8.1 \times 10^{12}$
– Can avoid the multiplication overflow problem
Combined Generators
• XOR random numbers from two or more generators
Combined Generators
• Shuffle
– One sequence serves as an index
• Into an array filled with random numbers generated by the second sequence
– The chosen array entry is replaced by a new random number from the second sequence
– Problem
• Cannot skip to the nth random number
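The shuffle scheme can be sketched as follows. Two instances of Python's `random.Random` stand in for the two sequences, and the table size of 100 is an assumption; neither comes from the slides.

```python
import random


def shuffled_stream(index_rng, value_rng, count, table_size=100):
    """Combine two generators by shuffling.

    `value_rng` fills a table; `index_rng` picks which entry to output next.
    """
    table = [value_rng.random() for _ in range(table_size)]
    output = []
    for _ in range(count):
        i = index_rng.randrange(table_size)    # index from the first sequence
        output.append(table[i])
        table[i] = value_rng.random()          # refill from the second sequence
    return output


numbers = shuffled_stream(random.Random(1), random.Random(2), count=5)
print(numbers)
```

Because each output depends on the current contents of the table, there is no shortcut to the nth number: all earlier draws have to be replayed.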
A Survey of Random-Number Generators
• Some published generator functions
$x_n = 7^5 x_{n-1} \bmod (2^{31} - 1)$
– Full period of $2^{31} - 2$
– Low-order bits are randomly distributed
• Many others (see textbook)
– All have problems
• General lessons: Use established ones; do not invent your own
Seed Selection
• If the generator has a full period and only one random variable is required
– Any seed value is good
• However, with more than one random variable, the story is different for multistream simulations
– E.g., random arrival and service times
– Should use two streams of random numbers
Seed Selection Guidelines
• Do not use zero
– Not good for multiplicative LCGs and
Tausworthe generators
• Avoid even values
– Not good if a generator does not have a full
period
• Do not use one stream for all variables
– May yield strong correlations among
variables
Seed Selection Guidelines
• Use nonoverlapping streams
– Each stream requires a separate seed
– Otherwise…
• A long interarrival time may correlate with a long service time
– Suppose we need 10,000 random numbers for interarrival times and 10,000 for service times; use the 1st and the 10,001st numbers of the sequence as the two seeds
– $x_n = [a^n x_0 + c(a^n - 1)/(a - 1)] \bmod m$
• Here c is the additive constant (b in the earlier LCG definition)
• For multiplicative LCGs, c = 0
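The closed form lets the seed of a later stream be computed without generating everything before it. In the sketch below, the division by a - 1 is kept exact by first reducing $a^n$ modulo m(a - 1); the helper name `jump_ahead` and the choice of $x_0 = 1$ are illustrative.

```python
def jump_ahead(x0, n, a, c, m):
    """Return x_n of x_n = (a * x_{n-1} + c) % m directly from x_0."""
    if a == 1:
        return (x0 + c * n) % m
    # (a^n - 1) / (a - 1) is an integer; reducing modulo m * (a - 1) first
    # keeps the division exact while the numbers stay small.
    geometric = (pow(a, n, m * (a - 1)) - 1) // (a - 1)
    return (pow(a, n, m) * x0 + c * geometric) % m


print(jump_ahead(5, 3, a=5, c=1, m=16))     # 0, matching 5 -> 10 -> 3 -> 0

# Seeds for two nonoverlapping streams of 10,000 numbers each,
# using the multiplicative LCG with a = 7^5 and m = 2^31 - 1:
first_seed = 1
second_seed = jump_ahead(first_seed, 10_000, a=7**5, c=0, m=2**31 - 1)
print(second_seed)
```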
Seed Selection Guidelines
• Do not reuse seeds in successive simulation runs
– There is no point in running a simulation again with the same seed
– Just continue with the last random number as the seed for the next run
Seed Selection Guidelines
• Do not use random random-number
generator seeds
– E.g., do not use the time of day, or
/dev/random to seed simulations
– Simulations should be repeatable
– Cannot guarantee that multiple streams will
not overlap
• Do not use numbers generated by
random-number generators as seeds
Myths About Random-Number Generation
• A complex set of operations leads to random results
– Hard to guess does not mean random
• Random numbers are not predictable
– Given a few successive numbers from an LCG, one can solve for a, c, and m
– Not suitable for cryptographic applications
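A simplified sketch of the attack: it assumes the modulus m is already known and prime, so only a and c have to be recovered from three successive outputs. The hidden parameters and the seed are hypothetical; recovering m as well takes more outputs and is not shown.

```python
M = 2**31 - 1                          # assumed known to the observer
SECRET_A, SECRET_C = 16807, 12345      # hypothetical hidden parameters


def observe(seed, count):
    """Return `count` successive outputs of the hidden LCG."""
    outputs, x = [], seed
    for _ in range(count):
        x = (SECRET_A * x + SECRET_C) % M
        outputs.append(x)
    return outputs


s0, s1, s2, s3 = observe(seed=2024, count=4)

# s2 - s1 = a * (s1 - s0) (mod m), so a follows from a modular inverse.
a = ((s2 - s1) * pow((s1 - s0) % M, -1, M)) % M    # needs Python 3.8+
c = (s1 - a * s0) % M

print(a, c)                      # 16807 12345
print((a * s2 + c) % M == s3)    # True: the next output is predicted
```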
Myths About Random-Number Generation
• Some seeds are better than others
– True
– Avoid generators whose period and
randomness depend on the seed
• Accurate implementation is not
important
– Watch out for overflows and truncations
Myths About Random-Number Generation
• Bits of successive words generated by a random-number generator are equally randomly distributed
– Nope
Myths About Random-Number Generation
• $x_n = 25173 x_{n-1} \bmod 2^{16}$
– $x_0 = 1$
– Least significant bit is always 1
– Bit 2 is always 0
– Bit 3 has a cycle of 2
– Bit 4 has a cycle of 4
– Bit 5 has a cycle of 8

n   decimal   binary
1   25173     01100010 01010101
2   12345     00110000 00111001
3   54509     11010100 11101101
4   27825     01101100 10110001
5   55493     11011000 11000101
6   25449     01100011 01101001
7   13277     00110011 11011101
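The bit patterns can be checked by collecting each low-order bit over successive values. The sketch uses the multiplicative form $x_n = 25173 x_{n-1} \bmod 2^{16}$ with $x_0 = 1$, since that is the form that reproduces the tabulated values 25173, 12345, 54509, and so on; the helper name `low_bit_columns` is illustrative.

```python
def low_bit_columns(count, a=25173, m=2**16, seed=1, bits=5):
    """Return, for bits 1..`bits` (bit 1 = least significant), the bit values
    taken by successive x_n = (a * x_{n-1}) % m."""
    columns = [[] for _ in range(bits)]
    x = seed
    for _ in range(count):
        x = (a * x) % m
        for position in range(bits):
            columns[position].append((x >> position) & 1)
    return columns


for number, column in enumerate(low_bit_columns(16), start=1):
    print(f"bit {number}:", "".join(map(str, column)))
```

Each printed row is one bit position read across 16 successive numbers, so a repeating row shows that bit's cycle.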
Myths About Random-Number Generation
• For all multiplicative LCGs with $m = 2^k$
• The Lth bit has a period that is at most $2^L$
• For LCGs of the form $x_n = a x_{n-1} \bmod 2^k$
– The least significant bit is always 0 or always 1
• High-order bits are more random
More on Random-Number Generation
• Mersenne twister
– Period = $2^{19937} - 1$
• /dev/random
– Extracts randomness from physical devices
– Truly random
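Both kinds of source are available from Python's standard library, which makes the contrast easy to see. `random.Random` is the Mersenne Twister and is repeatable from a seed; `os.urandom` asks the operating system for random bytes and takes no seed. How the operating system produces those bytes varies by platform and is not shown here.

```python
import os
import random

# Mersenne Twister: the same seed regenerates the same sequence.
first = random.Random(2024)
second = random.Random(2024)
draws_1 = [first.random() for _ in range(3)]
draws_2 = [second.random() for _ in range(3)]
print(draws_1 == draws_2)        # True

# Operating-system randomness: no seed, so runs are not repeatable.
print(os.urandom(8).hex())
```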