Random-Number Generation
Andy Wang
CIS 5930-03
Computer Systems Performance Analysis

Generate Random Values
• Two steps
  – Random-number generation
    • Get a sequence of random numbers distributed uniformly between 0 and 1
  – Random-variate generation
    • Transform the sequence to produce random values satisfying the desired distribution

Background
• The most common method
  – Use a recursive function x_n = f(x_{n-1}, x_{n-2}, …)

Example
• x_n = (5x_{n-1} + 1) % 16
  – Suppose x_0 = 5
  – The first 32 numbers are between 0 and 15
• Divide x_n by 15 to get numbers between 0 and 1
[Figure: random number vs. Nth number, plotted as raw values (0 to 15) and as scaled values (0 to 1)]

Basic Terms
• x_0 = seed
  – Given a function, the entire sequence can be regenerated from x_0
• Generated numbers are pseudo-random
  – Deterministic
  – Can pass statistical tests for randomness
  – Preferred to fully random numbers so that simulation results can be repeated

Cycle Length
• Note that starting with the 17th number, the sequence repeats
  – Cycle length of 16
[Figure: random number (0 to 1) vs. Nth number]

More Terms
• Some generators do not repeat the initial part (tail) of the sequence
• Period of a generator = tail + cycle length
[Figure: sequence timeline labeled tail, cycle length, and period]

Question
• How to choose seeds and random-number generation functions?
  1. Efficiently computable
     • Heavily used in simulations
  2. The period should be large
  3. Successive values should be independent and uniformly distributed

Types of Random-Number Generators
• Linear-congruential generators
• Tausworthe generators
• Extended Fibonacci generators
• Combined generators
• Others

Linear-Congruential Generators
• In 1951, Lehmer found that residues of successive powers of a number have good randomness properties
  x_n = a^n % m = (a · a^{n-1}) % m = (a · x_{n-1}) % m
• Lehmer's choices of a and m
  a = 23 (multiplier)
  m = 10^8 + 1 (modulus)
• Implemented on ENIAC

(Mixed) Linear-Congruential Generators (LCG)
• x_n = (a·x_{n-1} + b) % m
• x_n is between 0 and m – 1
• a and b are non-negative integers
• "Mixed": uses both multiplication by a and addition of b

The Choice of a, b, and m
• m should be large
  – Period is never longer than m
• To compute % m efficiently
  – Make m = 2^k
  – Just truncate the result to its low-order k bits
• If b > 0, the maximum period m is obtained when
  – m = 2^k
  – a = 4c + 1
  – b is odd
  – c, b, and k are positive integers

Full-Period Generators
• Generators with the maximum possible period
• Not equally good
  – Look for low autocorrelation between successive numbers
  – x_n = ((2^34 + 1)x_{n-1} + 1) % 2^35 has an autocorrelation of 0.25
  – x_n = ((2^18 + 1)x_{n-1} + 1) % 2^35 has an autocorrelation of 2^{-18}

Multiplicative LCG
• x_n = a·x_{n-1} % m, b = 0
• Can be computed more efficiently when m = 2^k
• However, the maximum period is only 2^{k-2}
• Problem: cyclic patterns in the lower bits

Multiplicative LCG with m = 2^k
• When a = 8i ± 3
  – E.g., x_n = 5x_{n-1} % 2^5
    • Period is only 8
    • Which is ¼ of 2^5
• When a ≠ 8i ± 3
  – E.g., x_n = 7x_{n-1} % 2^5
    • Period is only 4
[Figure: random number vs. Nth number for each of the two generators]

Multiplicative LCG with m ≠ 2^k
• To get a longer period, use m = prime number
  – With a proper choice of a, it is possible to get a period of m – 1
  – a needs to be a primitive root of m
    • If and only if a^n % m ≠ 1 for n = 1..m – 2
• x_n = 3x_{n-1} % 31
  – x_0 = 1
  – Period is 30
  – 3 is a primitive root of 31
[Figure: random number vs. Nth number]
• x_n = 7^5 x_{n-1} % (2^31 – 1)
  – 7^5 is a primitive root of 2^31 – 1
  – But watch out for computational errors
    • Multiplication overflow
      – Need to apply the tricks mentioned on p. 442
    • Truncation due to the number of digits available

Tausworthe Generators
• How to generate large random numbers?
• The Tausworthe generator produces a random sequence of binary digits
  – The generator then divides the sequence into strings of the desired length
  – Based on a characteristic polynomial

Tausworthe Example
• Suppose we use the following characteristic polynomial
  x^7 + x^3 + 1
  – The corresponding generation function is
    • b_{n+7} ⊕ b_{n+3} ⊕ b_n = 0
    Or
    • b_n = b_{n-4} ⊕ b_{n-7}
  – Need a 7-bit seed

Tausworthe Example
• The bit-stream sequence
  1111111000011101111001011001…
• Convert to random numbers between 0 and 1, with 8-bit numbers
  x_0 = 0.11111110 (base 2) = 0.99219 (base 10)
  x_1 = 0.00011101 (base 2) = 0.11328 (base 10)
  x_2 = 0.11100101 (base 2) = 0.89453 (base 10)
  …

Tausworthe Generator Characteristics
• For the L-bit numbers generated
  + E[x_n] = ½
  + V[x_n] = 1/12
  + The serial correlation is zero
  + Good results over the complete cycle
  - Poor local behavior within a sequence

Tausworthe Example
• If a characteristic polynomial of order q has a period of 2^q – 1, it is a primitive polynomial
• For x^7 + x^3 + 1
  • q = 7
  • Sequence repeats after 127 bits = 2^7 – 1
  • A primitive polynomial

Tausworthe Implementation
• Can be easily generated via linear-feedback shift registers
• For x^5 + x^3 + 1
[Figure: linear-feedback shift register with stages b_n, b_{n-1}, b_{n-2}, b_{n-3}, b_{n-4}, b_{n-5}]

Extended Fibonacci Generators
• x_n = (x_{n-1} + x_{n-2}) % m
  – Does not have good randomness properties
  – High serial correlation
• An extension
  – x_n = (x_{n-5} + x_{n-17}) % 2^k

Combined Generators
• Add random numbers produced by two or more generators
  – Can considerably increase the period and randomness
  x_n = 40014x_{n-1} % 2147483563
  y_n = 40692y_{n-1} % 2147483399
  w_n = (x_n – y_n) % 2147483562
  – This generator has a period of 2.3 × 10^18
  w_n = 157w_{n-1} % 32363
  x_n = 146x_{n-1} % 31727
  y_n = 142y_{n-1} % 31657
  v_n = (w_n – x_n + y_n) % 32362
  – This generator has a period of 8.1 × 10^12
  – Can avoid the multiplication overflow problem
• XOR random numbers produced by two or more generators
• Shuffle
  – One sequence serves as an index
    • Into an array filled with random numbers generated by the second sequence
  – The chosen number from the second sequence is replaced by a new random number
  – Problem
    • Cannot skip to the nth random number

A Survey of Random-Number Generators
• Some published generator functions
  x_n = 7^5 x_{n-1} % (2^31 – 1)
  – Full period of 2^31 – 2
  – Low-order bits are randomly distributed
• Many others (see textbook)
  – All have problems
• General lessons: use established ones; do not invent your own

Seed Selection
• If the generator has a full period and only one random variable is required
  – Any seed value is good
• However, with more than one random variable, the story is different for multistream simulations
  – E.g., random arrival and service times
  – Should use two streams of random numbers

Seed Selection Guidelines
• Do not use zero
  – Not good for multiplicative LCGs and Tausworthe generators
• Avoid even values
  – Not good if a generator does not have a full period
• Do not use one stream for all variables
  – May yield strong correlations among variables
• Use nonoverlapping streams
  – Each stream requires a separate seed
  – Otherwise…
    • A long interarrival time may correlate with a long service time
  – Suppose we need 10,000 random numbers for interarrival times and 10,000 for service times; use the 1st and the 10,001st numbers of the sequence (x_0 and x_{10,000}) as the two seeds
  – x_n = [a^n x_0 + b(a^n – 1)/(a – 1)] % m
    • For multiplicative LCGs, b = 0
• Do not reuse seeds in successive simulation runs
  – No point in running a simulation again with the same seed
  – Just continue with the last random number as the seed for the successive runs
• Do not use random seeds
  – E.g., do not use the time of day or /dev/random to seed simulations
  – Simulations should be repeatable
  – Cannot guarantee that multiple streams will not overlap
• Do not use numbers generated by random-number generators as seeds

Myths About Random-Number Generation
• A complex set of operations leads to random results
  – Hard to guess does not mean random
• Random numbers are not predictable
  – Given a few successive numbers from an LCG
  – Can solve for a, b, and m
  – Not suitable for cryptographic applications
• Some seeds are better than others
  – True
  – Avoid generators whose period and randomness depend on the seed
• Accurate implementation is not important
  – Watch out for overflows and truncations
• Bits of successive words generated by a random-number generator are equally randomly distributed
  – Not true
• x_n = 25173x_{n-1} % 2^16
  – x_0 = 1
  – Least significant bit is always 1
  – Bit 2 is always 0
  – Bit 3 has a cycle of 2
  – Bit 4 has a cycle of 4
  – Bit 5 has a cycle of 8

  n   decimal   binary
  1   25173     01100010 01010101
  2   12345     00110000 00111001
  3   54509     11010100 11101101
  4   27825     01101100 10110001
  5   55493     11011000 11000101
  6   25449     01100011 01101001
  7   13277     00110011 11011101

• For all multiplicative LCGs
  • The Lth bit has a period that is at most 2^L
• For LCGs of the form x_n = a·x_{n-1} % 2^k
  – The least significant bit is always 0 or always 1
• High-order bits are more random

More on Random-Number Generation
• Mersenne twister
  – Period = 2^19937 – 1
• /dev/random
  – Extracts randomness from physical devices
  – Truly random
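
Worked example: LCG sequences and cycle lengths
• The sketch below iterates x_n = (a·x_{n-1} + b) % m and measures the tail and cycle length by recording when each value was first seen. It is meant only to illustrate the small generators quoted in these slides (5x + 1 mod 16, 5x mod 2^5, 7x mod 2^5, 3x mod 31). The helper names lcg and tail_and_cycle are made up for this illustration.

```python
from itertools import islice


def lcg(a, b, m, seed):
    """Yield x_1, x_2, ... for x_n = (a * x_{n-1} + b) % m with x_0 = seed."""
    x = seed
    while True:
        x = (a * x + b) % m
        yield x


def tail_and_cycle(a, b, m, seed):
    """Return (tail, cycle_length) of the sequence that starts at x_0 = seed."""
    first_seen = {}          # value -> index at which it first appeared
    x, n = seed, 0
    while x not in first_seen:
        first_seen[x] = n
        x = (a * x + b) % m
        n += 1
    tail = first_seen[x]     # values before the first repeated value
    return tail, n - tail


if __name__ == "__main__":
    print(list(islice(lcg(5, 1, 16, 5), 16)))              # mixed LCG from the Example slide
    print("5x+1 mod 16:", tail_and_cycle(5, 1, 16, 5))
    print("5x mod 32:  ", tail_and_cycle(5, 0, 32, 1))
    print("7x mod 32:  ", tail_and_cycle(7, 0, 32, 1))
    print("3x mod 31:  ", tail_and_cycle(3, 0, 31, 1))
```

• The period as defined in these slides is tail + cycle length; for these four generators the tail is zero, so the period equals the cycle length.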
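
Worked example: Tausworthe bit stream
• A minimal sketch of the Tausworthe example for x^7 + x^3 + 1, assuming the all-ones 7-bit seed that the quoted bit stream starts with. It applies b_n = b_{n-4} XOR b_{n-7} and then cuts the stream into 8-bit binary fractions. The function names tausworthe_bits and to_fractions are hypothetical.

```python
def tausworthe_bits(seed_bits, count):
    """Extend seed_bits to `count` bits using b_n = b_{n-4} XOR b_{n-7}."""
    bits = list(seed_bits)
    if len(bits) < 7:
        raise ValueError("x^7 + x^3 + 1 needs a 7-bit seed")
    while len(bits) < count:
        bits.append(bits[-4] ^ bits[-7])
    return bits[:count]


def to_fractions(bits, width=8):
    """Read consecutive `width`-bit groups as binary fractions in [0, 1)."""
    fractions = []
    for start in range(0, len(bits) - width + 1, width):
        value = 0
        for bit in bits[start:start + width]:
            value = (value << 1) | bit
        fractions.append(value / 2 ** width)
    return fractions


if __name__ == "__main__":
    stream = tausworthe_bits([1] * 7, 254)
    print("".join(map(str, stream[:28])))
    print(to_fractions(stream[:24]))
    print("repeats after 127 bits:", stream[:127] == stream[127:254])
```

• The three printed fractions are 254/256, 29/256, and 229/256, which round to the 0.99219, 0.11328, and 0.89453 quoted in the example.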
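
Worked example: combined generator with small moduli
• The following sketch steps the three small multiplicative LCGs (multipliers 157, 146, 142) and combines them as v_n = (w_n – x_n + y_n) % 32362. The seeds of 1 in the usage lines are an arbitrary assumption for illustration, and the function name combined_small is made up. Scaling v_n into the unit interval is left out because the slides do not specify it.

```python
from itertools import islice


def combined_small(w, x, y):
    """Yield v_n = (w_n - x_n + y_n) % 32362 from three small multiplicative LCGs.

    w, x, y are the three seeds (each should be nonzero).
    """
    while True:
        w = (157 * w) % 32363
        x = (146 * x) % 31727
        y = (142 * y) % 31657
        # Python's % returns a non-negative result for a positive modulus,
        # so no extra correction is needed when w - x + y is negative.
        yield (w - x + y) % 32362


if __name__ == "__main__":
    # Largest intermediate product is 157 * 32362 = 5,080,834, well below 2**31,
    # which is why this form sidesteps multiplication overflow in 32-bit arithmetic.
    print(list(islice(combined_small(1, 1, 1), 5)))
```

• In a language with fixed-width integers the subtraction can go negative, so the final reduction would need an explicit adjustment there.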
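
Worked example: periods of the low-order bits
• To illustrate the claim that low-order bits are less random, this sketch measures the period of each of the five lowest bits. It assumes the multiplicative form x_n = 25173x_{n-1} % 2^16 with x_0 = 1, which reproduces the tabulated decimal values 25173, 12345, 54509, … The names sequence and bit_period are hypothetical, and bits are numbered from 1 at the least significant end, as in the slides.

```python
def sequence(count, a=25173, m=2 ** 16, seed=1):
    """Return x_1 .. x_count for x_n = a * x_{n-1} % m with x_0 = seed."""
    values, x = [], seed
    for _ in range(count):
        x = (a * x) % m
        values.append(x)
    return values


def bit_period(values, bit):
    """Smallest p such that bit number `bit` (1 = least significant) repeats every p steps.

    Only meaningful when `values` is much longer than the true period.
    """
    bits = [(v >> (bit - 1)) & 1 for v in values]
    for p in range(1, len(bits) // 2 + 1):
        if all(bits[i] == bits[i + p] for i in range(len(bits) - p)):
            return p
    return None


if __name__ == "__main__":
    xs = sequence(64)
    print(xs[:7])
    for bit in range(1, 6):
        print("bit", bit, "period", bit_period(xs, bit))
```

• A period of 1 means the bit never changes, which is the case for bits 1 and 2 here.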
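
Worked example: jumping ahead to seed a second stream
• The closed form x_n = [a^n x_0 + b(a^n – 1)/(a – 1)] % m from the seed selection guidelines lets a second stream start n numbers into the sequence without generating the numbers in between. The sketch below is one possible way to evaluate it with integer arithmetic: a^n is reduced modulo (a – 1)·m so that the division by (a – 1) stays exact. The function names lcg_jump and lcg_iterate are made up, and b denotes the additive constant of the mixed LCG.

```python
def lcg_jump(a, b, m, x0, n):
    """Return x_n of x_k = (a * x_{k-1} + b) % m without iterating n times."""
    if a == 1:
        return (x0 + n * b) % m
    # Reducing a^n modulo (a - 1) * m keeps (a^n - 1) divisible by (a - 1)
    # and leaves the quotient unchanged modulo m.
    a_n = pow(a, n, (a - 1) * m)
    geometric_sum = ((a_n - 1) // (a - 1)) % m   # 1 + a + ... + a^(n-1), mod m
    return (a_n * x0 + b * geometric_sum) % m


def lcg_iterate(a, b, m, x0, n):
    """Reference implementation: step the recurrence n times."""
    x = x0
    for _ in range(n):
        x = (a * x + b) % m
    return x


if __name__ == "__main__":
    a, m = 7 ** 5, 2 ** 31 - 1
    first_seed = 1                                      # assumed seed for stream 1
    second_seed = lcg_jump(a, 0, m, first_seed, 10_000) # x_10000 seeds stream 2
    print(first_seed, second_seed)
```

• Python integers do not overflow, so no special multiplication trick is needed here; a fixed-width implementation would still need one.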