Use and Testing of Pseudo-Random Number Generators A. M. Amthor, J. C. Woods North Georgia College and State University Abstract Several types of pseudo-random number generators (PRNGs) will be discussed along with methods for evaluating PRNGs. Some applications of random number generators will also be discussed. Special attention will be given to the limitations of particular generators and requirements for Monte Carlo simulations and cryptographic applications. Examples will be given throughout. 1 Introduction Today randomness is a necessary ingredient in many man-made systems, and, while these systems may vary widely, most share a common centerpiece, a digital computer. This fact allows a valuable simplification at the start. Because they are to be used on digital computers, the discussion of randomness in general may be limited to random integers. One might argue that computers are capable of processing noninteger values, but even floating point numbers regardless of precision are stored in memory as binary integers. Furthermore, a stream of random integers on a given range can easily be converted to a stream of floating point numbers of any desired precision over any desired range. The chief benefit of this simplification is that it will allow discrete probabilities to be considered over discrete and finite sets of values as opposed to dealing always with continuous probability distributions integrated on intervals. Luckily as the demand for random numbers has increased with the pervasion of computer technology and probabilistic mathematics in many fields of human venture, so has our ability to generate “randomness”. Although, much of this generated randomness is more properly called pseudo-randomness since it is in actuality the result of a deterministic mathematical process, which only has the appearance of randomness in the natural sense. Why settle for second best, using pseudorandom numbers when random ones are available? Consider using an equiprobable six-sided die to generate a stream of random digits modulo six, or perhaps several such die in a box to increase the speed of the process. The output of this process could be considered truly random, since the outcome could not possibly be determined in advance. The main problem with this system is that most applications of random numbers require a much higher bit rate (number of random bits generated per unit time) of random numbers than could be produced through this method. If a computer expects a random stream as an input to some process, then random numbers must be generated at a rate comparable to the speed of the computer. The most obvious solution is to have the computer generate the random numbers. Then as computers become faster and demand higher bit rates, higher bit rates can be provided by the same increase in computational speed that required them. The downside to this solution is that computers are by nature not random but deterministic and are designed to provide predictable output for any given state and input values. Therefore, these computer-generated numbers will never be truly random, but rather they will be the apparently random output of some deterministic process operating on a finite set of integers or integer equivalents. It should be noted that there is some variation in the field with respect to certain acronyms. The reader should be careful when reading any literature on the subject to verify the definition of an acronym before assuming equivalence. 2 Generators It seems only appropriate to begin the study of pseudo-random number generators (PRNGs) with a definition of a generator. L’Ecuyer gives such a definition in [1]. A generator is a structure made up of a finite set of states, S; an initial state, s o, also known as the seed; a transition function, T, from S to S; a finite set of outputs, U; and an output function, G, from S to U. As an example, consider one of the most fundamental generators, known as a linear congruential generator (LCG). A LCG uses a recursion relation with four userdefined values to generate a stream of integers. LCG(M,A,b,so) defines a generator with state set S ⊂ ℤ mod M, T(sn) = sn+1 = (A*sn + b) mod M, U = S, and G(sn) = sn, where sn is defined as the nth state. Notice that the number of states in S, according to this definition, must be less than or equal to M. This, combined with the fact that T is a function, allows us to set an upper limit on the period, p, where p is defined to be the minimum value of k such that sn = sn+k. In fact it has been shown that the maximum period is M-1, and necessary conditions for a LCG to achieve this period are that M is prime, A is a primitive root1 mod M, and so ≠ 0. Another quantity, the transient τ, is the least value of n for which sn = sn+k. Although it may be tempting to think that the transient is always zero, that would be to assume that T is injective, which is not necessarily true. I accidentally proved the existence of nonzero transients when I tried to write a program to find the period of several LCGs, incorrectly, by searching for the value n such that sn = so. For any generator with nonzero transient however, so is never repeated even after generating states beyond sM. LCGs are by far the most widely used of all PRNGs primarily because they are simple in design and lack the significant memory and processor requirements of more complex generators. The choice of the four input parameters is the difficult part. It usually involves a good deal of study as these four parameters will determine the extent to which the generated numbers will appear random as well as determining the period of the generator. Some examples of well-known LCGs among students of mathematics would be the generator in the Maple function rand(), which is given by LCG(1012-11,427419669081,0,1) and the Mathematica PRNG, which is given by LCG(α48-α8+1,α=231,0,1). On the other hand, LCGs are not suitable for all applications due to their poor performance on several empirical tests, their relatively short periods, and the fact that the state is equal to the output. LCGs do serve as the bases for several other more complex PRNGs with better characteristics. The extended linear congruential generator (ELCG), also known as a recursive generator (RG) or multiple recursive generator (MRG), is similar to an LCG except that its output function, G, includes more than one recursive term, and consequently its state A primitive root for a prime, p, is given by the number g if every (g n) Mod p is unique for 1 ≤ n ≤ p-1, and (gp-1) Mod p = 1. For more information see [7]. 1 consists of a vector of previous outputs. ELCG(M,a 1,a2,…,ak,b,so) defines a generator with state set S ⊂ {(un-1,…,un-k) such that all um ∈ U}, k-dimensional vectors containing the k sequential outputs preceding the nth output of the generator; T(sn) gives the new state vector by removing the rightmost element, shifting the elements right one space, and making G(sn) the new leftmost element; U ⊂ ℤ mod M; and G(sn = (un-1,…,un-k)) = (a1*un-1+…+ ak*un-k+b) mod M. The most significant difference between the ELCG and the standard LCG is that for the ELCG, S ≠ U. This eliminates the basis for the theoretical limit on the period that exists for an LCG. As explained in [2], the period length is now limited by p ≤ Mk-1, and necessary conditions for the maximum period are that M is prime and the characteristic polynomial, P(z) = zk – a1zk-1 - … - ak, is a primitive polynomial2 modulo M. The cost of the improved performance of the ELCG over the standard LCG is the more complicated implementation. Storing the larger state variable should not be a problem, but one multiplication operation and likely one modulus operation will be needed for each nonzero am, considering the large size of each amun-m. Although, methods do exist for dealing easily with larger numbers, such that only one modulus operation per output is needed, if the an parameters are chosen well. A combined linear congruential generator (cLCG) is another variation of the standard LCG technique. In this case, more than one LCG is used and the individual outputs are combined in the process of computing the final output, G(s n). In this case, the period is rather obviously limited by the least common multiple of the periods of the individual LCGs. Some methods of combining the individual LCG outputs are as follows. Consider the individual outputs of a certain number, J, of separately defined LCGs. The individual outputs of the J generators are multiplied by independent constant integer coefficients, δj, usually chosen to be relatively prime to Mj, the modulus of the jth generator, then divided by Mj. All J products of this process are then summed, and their remainder after division by one is a pseudo-random number on the interval [0,1). Alternatively, the outputs of the individual LCGs may be combined using an exclusive or 2 A polynomial is primitive mod M if it is irreducible mod M, meaning that it cannot be factored into non-trivial polynomials over the integers mod M. For example, x2 + x + 1 is primitive mod 2, but x2 + 1 is not since (x + 1)(x + 1) = x2 + 2x + 1 ≡ x2 + 1 (mod 2). For more information see [7]. binary operation. Another method similar to the first takes the sum of all terms δjuj,n (where uj,n is the nth output of the jth generator) over J generators modulo Mc, divided by Mc (where Mc is a modulus specific to the combination and not necessarily equal to any of the Mj values), to generate a uniform variate on [0,1). Since it requires substantially fewer division operations, this third method is preferable to the first computationally. Wichman and Hill designed a cLCG using three LCGs: LCG1(30269,171,0,1), LCG2(30307,172,0,1) and LCG3(30323,170,0,1) and combined the generators using the first process described above, see [8]. Their cLCG was shown in 1986 by Zeisl in the Journal of Applied Statistics to be equivalent to the generator: LCG(27817185604309,16555425264690,0,1) Theoretically, all such combined generators are equivalent to single LCGs with large moduli. This is why they are so popular, because you get the effective output of the larger generator without quite so much of the computational difficulty. Both of these two altered LCG methods may be combined to form a third variant, a multiple recursive generator (MRG). These are sometimes referred to as combined multiple recursive generators by those who refer to an ELCG or RG as a multiple recursive generator. These are also in wide use in the scientific community although the limitations of standard computer architecture make implementation of strong MRGs difficult except for specially chosen parameters. A fundamentally different method, but one which may still be seen as a multiplicative, recursive method, is a generator based on a feedback shift-register (FSR). One of the most fundamental FSRs is a linear feedback shift-register (LFSR), see [3], [4], and [5]. These generators produce streams of random bits based on an nbit state value. The output bit is the rightmost or nth bit, and the transition function determines the new state by shifting each bit right one position and recursively generating a new first, leftmost, bit. The recursion generates the first bit by adding the values in selected bit positions—known as taps—in the previous state modulo 2. The process is often considered in terms of matrices, with the state represented by an ndimensional binary vector acted on by the transition function in the form of matrix multiplication by an n-by-n transition matrix. The transition matrix contains ones in the positions corresponding to taps in the first column, zeros otherwise, and the other columns form a diagonal matrix of 1's, excluding the bottom row, which serve to shift the bits right. The bottom row will be all zeros except possibly the first column. The period of an FSR is determined by the number of states, which is 2 n, and the fact that the zero vector must be excluded. Regardless of the tap sequence a zero state vector would always yield a zero state. Thus, the maximum period is 2n – 1. To achieve the maximum period it is necessary that: A) the characteristic polynomial be monic irreducible3, and b) m = 2n – 1 is the smallest for the smallest m such that the characteristic polynomial divides xm + 1. The characteristic polynomial is produced by xn + cnxn-1 + … + c1, with the first column of the transition matrix elements numbered cn, … , c1 from top to bottom. For example, a FSR with a 4-bit state will have a full period for c1 = c4 = 1 and c2 = c3 = 0, since x4 + x3 + 1 is monic irreducible, and the smallest m for which the characteristic polynomial divides xm + 1 is m =15, which is 24 – 1. Other more complicated FSRs may achieve more random output by varying the transition matrix between outputs according to some set algorithm. 3 Tests Given so many generators, some method must be found to determine which generators are suitable, and which are not, for a given application. There has yet to be developed a complete theoretical understanding of all methods involved in PRNGs, especially the most complex mechanisms in the field, combined hybrid generators, which combine several different generator methods, and even the conceptually simple Bays-Durham shuffle, a method used to enhance the performance of some simpler pseudo-random number sources. Thus, it is essential that methods of empirical testing be developed to evaluate the performance of PRNGs, and, more specifically, to determine which generators are suitable for which applications. 3 A function is monic irreducible if it cannot be factored into non-trivial polynomials over the integers mod 2, see footnote 2 of this paper or [7] for an example or more information. The most fundamental requirement for a random number generator is that its output, for sample sets much smaller than the full period, be approximately uniformly distributed over the set of possible outputs. More generally, it is important that n- dimensional vectors, composed of n sequential outputs, be approximately uniformly distributed over the n-dimensional output space. Depending on the application, uniformity in some particular dimension or dimensions may be considerably more important than in others. The standard method for evaluating uniformity in n- dimensional space is to generate a certain number of n-dimensional vectors, N, and divide the space into K "bins," with each bin having the same probability that an ndimensional vector will fall into it. Then examine the distribution of N vectors into the K bins using a Chi-squared test with K-1 degrees of freedom to see if the distribution is adequately similar to a distribution that would be expected from a truly random set of vectors with approximately N/K vectors in each bin and an error near ±(N/K)(1/2). A Monte Carlo integration test is another test that similarly compares the spatial distribution of pseudo-random vectors to that expected of a truly random sequence. A definite integral whose value is well known is used together with a larger area, which contains the entire integrated area. The ratio of the integral to the larger area is compared to the number of pseudo-random vectors within the area of integration divided by the total number of vectors in the larger area. For example, consider the unit circle in the first quadrant. The area under the curve is π/4. This means that a PRNG with a uniform distribution in two dimensions should have N*π/4 of its two dimensional vectors (x,y) such that x2 + y2 < 1. One property of many PRNGs, including most LCG based varieties as well as FSR based generators is that, considered over a complete period, they exhibit a lattice structure in s-dimensions. The spectral test studies the s-dimensional lattice structure of a generator as a measure of the “coarseness” of the distribution. Ideally, the parallel hyperplanes formed in s-dimensional space by the generator should not have a great variation in the distances measured between hyperplanes. The spectral test considers the maximal distance between adjacent parallel hyperplanes in the s-dimensional lattice, taking the maximum of these values over all families of hyperplanes as d s. Since the vectors for the entire space are typically mapped onto the unit cube, the distance, ds, will naturally depend on the total number of possible output values, though if it performs well, ds should not be much greater than the distance between planes created by points which are perfectly evenly distributed in the space, where the distance between several hyperplanes will be equal. The tests mentioned so far have focused primarily on the geometrical distribution of the generator and are mainly useful for determining the viability of a generator for use in Monte Carlo simulations. More rigorous tests are called for, however, when the PRNG output is expected to imitate purely random numbers in other ways. The Diehard battery of tests developed by George Marsaglia is intended to test the randomness of PRNGs to this higher standard for use in computational number theory. The GCD (greatest common divisor) test, thoroughly described in [6], from the Diehard battery, studies the distribution of two variables, GCD(u n, un+1) and k, derived from application of the Euclidian algorithm to consecutive output values, u n and un+1, from a PRNG. The GCD(un, un+1), as well as k, the number of iterations of the Euclidian algorithm required to find the GCD for two independent random numbers, have distributions that are well known either theoretically or through extensive experiment. Marsaglia states that given two random integers on [1,n], k is approximately normally distributed with a mean of 0.842766 ln(n) + 0.06535, and a variance of 0.5151 ln(n) + 0.1666. The distribution of GCD(un, un+1) was found through extensive experiment for 32-bit integers. Part of the experimental results given in [6] for selected generators are shown below, with the expected counts in the first row of each table. Each generator was used to generate 7x106 random numbers to be tested. K Expected Counts KISS <=3 Xn=69069*Xn-1+12345 mod 232 Xn=69071*Xn-1mod(232+15) GCD Expected Counts KISS Xn=69067*Xn-1 mod(235+951) Xn=69069*Xn-1+12345 mod 232 5.5 7 4 29.5 30 5 144.6 142 6 590.7 583 7 2065 2148 8 6277 6294 3 20 101 491 1629 4965 154 21 123 701 1896 6121 1 2 6079271 1519817 6078818 1521176 3 675474 675496 4 379954 379749 5 243171 242677 6 168869 168462 6077628 1520790 675940 379369 243256 168537 8106215 900376 0 324000 0 0 The performance of a given generator is evaluated by comparing the number of counts in each column to the expected number of counts using a chi-squared test. The p-value gives the probability that the numbers in the sample were generated from a random sample. All LCGs fail the Diehard tests. This does not mean that LCGs are useless, since the Diehard tests are more demanding than many applications require. Passing the GCD test, the one element of Diehard mentioned here, depends heavily on the last digit of the output value, even though the last digit has relatively little effect on the uniformity of the distribution. 4 Applications The purpose of these efforts in the development and testing of PRNGs is to have a selection of easily implementable and verifiably robust methods to produce numbers that satisfy the various requirements for “randomness” of several man-made systems that require them to operate. Since these applications are the justification for much of the study which has and continues to take place in the field it is only proper to give at least a brief synopsis of the most popular applications. Those being Monte Carlo simulations, cryptography, and computational number theory. Monte Carlo simulations use known or theoretical probability distribution functions, sampled with the use of a PRNG, to simulate the real-world system which the probability distribution functions presume to model. Monte Carlo simulations are used, with great success, in areas from experimental nuclear physics to business planning. As an example, consider the process of radioactive decay. Given a certain sample of N radioactive nuclei, the model predicts that a certain number, D, of those nuclei will decay over a given time period. The current model also states that, on average, D is proportional to N by some constant λ. Monte Carlo techniques can be used to simulate this system. First define the number of unstable nuclei (N), the decay constant (λ∈[0,1)), and the number of time steps which will be observed (T). Then perform the following algorithm T times, keeping a list of values of all variables involved except the random numbers used: a) Choose N random numbers, un, between zero and one. b) Let D equal the number of un values such that un < λ. c) Subtract D from N At completion the list of D values should be very similar to the activity of a radioisotope, if λ is large there will be a noticeable decrease in activity due to the decrease in the overall population of the radioisotope. This simulation serves to verify an already well verified theory of radioactive decay, and so may not seem all that interesting. Although, clearly more complex simulations could be rewarding. Successive applications of varying probability distributions may be used involving ion beams impacting solid targets to cause radiative capture fusion reactions, subsequent shielding interaction simulation and finally detector interaction simulation of the reaction products to determine the validity of a current theory or the applicability of an experimental result to the theory. Monte Carlo simulations make several demands of PRNGs. Firstly, they must be uniformly distributed in one, two, and three dimensional space, including cases where non-cartesian coordinate systems may be used. PRNGs must also be able to produce many parallel streams of uncorrelated outputs to ensure that interactions among independent particles can be modeled with unbiased output. Arguably the most widespread use of PRNGs is in Cryptography. The security of a cryptographic system depends on a users ability to generate sets of cryptographic keys in an unpredictable way. A PRNG can be used to randomly generate sets of keys, but it will be necessary to design a generator with certain characteristics which enhance security, as discussed in [9]. Firstly, the generator algorithm must be assumed to be public knowledge. This will help build a system that is secure even against insider hacks. Secondly, the generator must have a very large number of states to protect against a brute force attack, an attempt to find the state value by trying all possibilities. Thirdly, there must be a computationally irreversible separation between the state and the output to prevent knowledge of the output from being used to determine the state and other outputs. Finally, the algorithm should allow for additional entropy to be added to the state periodically, perhaps in the form of optional user input or a hardware entropy source, to reduce the risk from a one-time state compromise. Computational number theory is in many ways the most demanding application for PRNGs. This serves as the motivation for much of Marsaglia’s work in developing novel testing methods to ensure uniformity of the outputs, testing for regularities previously ignored. Computational procedures use a PRNG and draw theoretical conclusions while assuming that the outputs are completely independent and random. 5 References [1] Pierre L’Ecuyer, 1994, Uniform Random Number Generators, preprint of article to appear in Annals of Operations Research. [2] Pierre L’Ecuyer and Renée Touzin, 2000, Fast Combined Multiple Recursive Generators with Multipliers of the Form a = ±2q ± 2r, Proceedings of the 2000 Winter Simulation Conference. [3] W. Cherowitzo, Linear Feedback Shift Registers, 2000, http://www-math.cudenver.edu/~wcherowi/courses/m5410/m5410fsr.html [4] Cheryl Praeger, Linear Feedback Shift Registers, 2001, http://www.maths.uwa.edu.au/~praeger/teaching/3CC/WWW/chapter4.html [5] Unknown Author, Linear Feedback Shift Registers, http://homepage.mac.com/afj/lfsr.html [6] George Marsaglia and Wai Wan Tsang, 2002, Some Difficult-to-Pass Tests of Randomness, Journal of Statistical Software, Volume 7, Issue 3. [7] Eric W. Weisstein, 1999, “Primitive Root” & “Irreducable Polynomial”, http://mathworld.wolfram.com/, CRC Press LLC. [8] K. Entacher and P. Hallekalek, Combined LCGs : cLCG, 2000, http://crypto.mat.sbg.ac.at/results/karl/server/node8.html [9] J. Kelsey, B. Schneier, D. Wagner, and C. Hall, Cryptanalytic Attacks on Pseudorandom Number Generators, http://www.counterpane.com/pseudorandom_number.pdf