A Survey on Random Number Testing Algorithms and Software

Shuaiyuan Zhou
Abstract
This paper surveys random number testing algorithms and software. Marsaglia’s Diehard
tests are discussed in detail, including a description of all 12 testing algorithms. A
software library, TestU01, is also studied, and some experiments carried out with it are
reported.
Keywords
Randomness testing, Diehard Tests, TestU01
1. Introduction
Random numbers are widely used in many applications, so a great deal of work has been
done on random number generation. Various random number generators have been
proposed over the years by researchers such as Knuth [1].
Another concern in the study of random number generation is how to evaluate a given
random number generator, that is, how to decide whether the sequence it produces is
random. This is called randomness testing, and a family of algorithms has been developed
and implemented for this purpose.
This paper surveys the random number testing algorithms and software studied in my
term project. It is organized as follows: Section 2 gives an overview of random number
testing algorithms and some concrete examples of testing methods; Section 3 focuses on
the Diehard tests developed by Marsaglia; Section 4 summarizes current software for
testing random number generators; Section 5 discusses one software library, TestU01, in
detail; and Section 6 concludes the paper.
2. Random Number Testing Methods
Randomness tests, in data evaluation, as discussed in [2], are used to analyze the
distribution pattern of a set of data. One practical way of measuring randomness, based
on the notion of statistical randomness, is to apply statistical tests.
According to [3], a numeric sequence is said to be statistically random when it contains
no recognizable patterns or regularities but follows a certain probability distribution.
Statistical randomness does not imply true randomness: in many cases, a statistically
random sequence is only pseudo-random.
As stated in [3], a statistical test is a method of deciding whether data are random by
looking at the behavior of experimental data or, more specifically, at their distribution.
The basic procedure of a random number testing algorithm can be described as follows:
given a random number sequence produced by the generator being tested, we compute
some statistic from it and examine whether the result follows the distribution expected
for a truly random sequence. If it does, we may claim that the randomness of the sequence
is good enough and thus that the generator passes the test.
Researchers have proposed a variety of algorithms for randomness testing. The first tests
for random numbers were published by M. G. Kendall and Bernard Babington Smith in
1938; they are listed in Table 1 and discussed in [3]. In 1995, George Marsaglia published
his Diehard tests, which are discussed in detail later in this paper. In 2001, the National
Institute of Standards and Technology (NIST) published another set of tests [4], listed in
Table 2.
Table 1 Early Days Tests
No.  Test
1    Frequency test
2    Serial test
3    Poker test
4    Gap test
Table 2 NIST Tests
No.  Test
1    Frequency (monobit)
2    Frequency (block)
3    Runs test
4    Longest run of ones in a block
5    Binary matrix rank
6    Discrete Fourier transform (spectral)
7    Non-overlapping template matching
8    Overlapping template matching
9    Maurer’s universal statistical
10   Linear complexity
11   Serial
12   Approximate entropy
13   Cumulative sums
14   Random excursions
15   Random excursions variants
3. Diehard Tests
The Diehard tests are a battery of statistical tests for measuring the quality of a set of
random numbers. They were developed by George Marsaglia and first published in 1995
on a CD-ROM. I suspect that Marsaglia named his work “diehard” because these tests
were more stringent than the ones existing at the time, and thus provided a better
measure of randomness.
Over years of development, Marsaglia adopted ideas on randomness testing from
previous research, designed some tests himself, and assembled them into the Diehard
battery. Diehard includes 12 tests, which are summarized in Table 3 and discussed in
detail in the following subsections.
Table 3 Diehard Tests
No.  Test
1    Birthday spacing
2    Overlapping permutation
3    Ranks of matrices
4    Monkey test
5    Count the 1s
6    Parking lot test
7    Minimum distance test
8    Random sphere test
9    The squeeze test
10   Overlapping sums test
11   Runs test
12   The craps test
3.1 Birthday spacing
Birthday spacing is the discrete version of the iterated spacings test, which is described
in detail in [5]. Marsaglia named this test “birthday spacing” because it is related to the
famous birthday paradox in probability theory, which states that in a group of at least 23
randomly chosen people, there is a more than 50% probability that some pair of them
share a birthday; for 57 or more people, this probability exceeds 99%.
In [5], Marsaglia stated that a test based on duplicate birthdays is not stringent enough,
so he used duplicate spacings instead. The idea is as follows. First, choose m integers
I(1), I(2), …, I(m), each in the range 1 to n, from the sequence produced by the RNG
being tested, and treat them as m birthdays in a year of n days. Second, sort these
numbers so that I(1) ≤ I(2) ≤ … ≤ I(m), and calculate the m spacings between neighbors:
I(2) - I(1), I(3) - I(2), …, I(m) - I(m-1), and the wrap-around spacing I(1) + n - I(m).
Third, count the duplicate spacings, that is, the values that appear more than once among
the spacings.
Under the hypothesis of randomness, this count should asymptotically follow a Poisson
distribution with parameter $\lambda = m^3/(4n)$, and the test checks the observed count
against that distribution.
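The counting step described above can be sketched in C as follows. This is only an illustrative sketch, not Marsaglia's code: the function name count_duplicate_spacings and its interface are hypothetical, and "duplicate" is taken to mean a distinct spacing value that occurs more than once, as worded in the text (other implementations may count repeated values differently).

```c
#include <assert.h>
#include <stdlib.h>

/* Three-way comparison for qsort on long values. */
static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: sorts the m birthdays (values in 1..n) in place,
 * forms the m spacings including the wrap-around spacing, and returns how
 * many distinct spacing values occur more than once.
 * Returns -1 if memory allocation fails.
 */
long count_duplicate_spacings(long *birthdays, size_t m, long n)
{
    assert(birthdays != NULL && m >= 2 && n >= 1);
    long *sp = malloc(m * sizeof *sp);
    if (sp == NULL)
        return -1;

    qsort(birthdays, m, sizeof *birthdays, cmp_long);
    for (size_t i = 0; i + 1 < m; i++)
        sp[i] = birthdays[i + 1] - birthdays[i];
    sp[m - 1] = birthdays[0] + n - birthdays[m - 1];   /* wrap-around */

    /* After sorting, equal spacings are adjacent. */
    qsort(sp, m, sizeof *sp, cmp_long);
    long dup = 0;
    for (size_t i = 1; i < m; i++) {
        /* Count each repeated value once, at its second occurrence. */
        int starts_new_repeat = (i == 1) || (sp[i - 1] != sp[i - 2]);
        if (sp[i] == sp[i - 1] && starts_new_repeat)
            dup++;
    }
    free(sp);
    return dup;
}
```

For example, the birthdays 1, 3, 5, 10 in a year of n = 12 days give the spacings 2, 2, 5 and the wrap-around spacing 3, so exactly one spacing value is duplicated.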
3.2 Overlapping permutation
The overlapping permutation test is an improvement on the overlapping m-tuple tests
discussed in [5]. The motivation is to adapt overlapping m-tuple tests to sequences whose
elements are not independent. Since the elements are not independent, we consider the
states of the overlapping tuples in the sequence and use this newly formed sequence of
states as the object of the test, relying on the independence of the occurrence of
different states.
The procedure can be illustrated by a concrete example. Let u1, u2, …, un be uniform
variables produced by an RNG; then each of the overlapping 3-tuples (u1, u2, u3),
(u2, u3, u4), … is in one of six possible states, one for each of the 3! = 6 possible
orderings of its three elements. Thus, the overlapping triples of u’s lead to a sequence
of states whose elements are taken from the set {1, 2, …, 6}.
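As a minimal sketch of how overlapping triples can be turned into a state sequence, the C code below numbers the six orderings from 1 to 6. The numbering scheme and the names triple_state and overlapping_states are assumptions made for illustration (the source does not say which ordering gets which label), and ties between values are assumed not to occur.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical labeling of the ordering of (a, b, c) by a state in 1..6.
 * rank_a is the number of the other two values that a exceeds (0, 1 or 2);
 * b_above_c records the relative order of the remaining two values.
 * For distinct values, each of the 6 orderings gets a different state.
 */
int triple_state(double a, double b, double c)
{
    int rank_a = (a > b) + (a > c);
    int b_above_c = (b > c);
    return 2 * rank_a + b_above_c + 1;
}

/*
 * Writes the states of the overlapping triples (u[i], u[i+1], u[i+2]) into
 * states[] and returns how many were written (n - 2, or 0 if n < 3).
 */
size_t overlapping_states(const double *u, size_t n, int *states)
{
    assert(u != NULL && states != NULL);
    if (n < 3)
        return 0;
    for (size_t i = 0; i + 2 < n; i++)
        states[i] = triple_state(u[i], u[i + 1], u[i + 2]);
    return n - 2;
}
```

With this labeling, an increasing triple maps to state 1 and a decreasing triple to state 6; any other bijection between orderings and labels would serve the test equally well.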
Given this sequence of states, an overlapping m-tuple test is carried out. More
specifically, let w_{i,j,k} be the number of times the triple of states i, j, k appears in
the state sequence, let µ_{i,j,k} be the mean of w_{i,j,k}, and let C be the covariance
matrix of the counts and C^- a weak inverse of C. The test is then based on the fact that
the quadratic form
$(w - \mu)^{T} C^{-} (w - \mu)$,
where w and µ are the vectors of the w_{i,j,k} and µ_{i,j,k}, asymptotically follows a
χ2-distribution.
3.3 Ranks of matrices
If a matrix is viewed as a collection of row vectors, its rank is defined as the number of
linearly independent row vectors. The rank of an N × N matrix can vary from 0 to N, and
the matrix is said to be of full rank if its rank is N.
The binary rank test for 32 × 32 matrices is discussed in [6]. Given the sequence
produced by a random number generator, a 32 × 32 binary matrix is formed by using the
leftmost 32 bits of 32 variates as row vectors. In practice, the generated values can be
viewed as a sequence of computer words, and each matrix is constructed from 32 distinct
words.
After the matrices are constructed, their ranks are calculated and counted. Although the
rank of a 32 × 32 matrix can be anywhere in the range [0, 32], a random matrix rarely
has rank 29 or less. An experiment in [6] showed that, out of 40,000 matrices, the
expected number with rank 29 or less is only 211.5, or merely 0.529%. Given this
observation, matrices with rank 29 or less are pooled when the ranks are counted; in
other words, we obtain counts for matrices of rank 32, 31, 30, and 29 or less.
A chi-squared test is then performed on these counts to assess the randomness of the
generated sequence and thereby evaluate the generator.
3.4 Monkey test
According to [7], the name of this test comes from the infinite monkey theorem, which
states that a monkey hitting keys at random on a typewriter keyboard for an infinite
amount of time will almost surely type a given text, such as the complete works of
William Shakespeare.
The monkey test is also known as the overlapping m-tuple test, described in detail in [5].
The basic idea is as follows: treat m consecutive elements in the output of a random
number generator as a “word”; count the overlapping words in a stream; the number of
possible “words” that never appear should then follow a certain distribution. Three kinds
of monkey tests are discussed in [8]: the overlapping-pairs-sparse-occupancy (OPSO)
test, the overlapping-quadruples-sparse-occupancy (OQSO) test, and the DNA test.
For demonstration, take the OPSO test as an example. In this test, two-letter “words” are
formed from an alphabet of 2^10 = 1024 letters. Each letter is determined by a specified
10 bits of a 32-bit integer in the output of the random number generator. We then count
the missing “words”, that is, the letter combinations that do not appear anywhere in the
sequence being tested. Finally, this count is expected to have an asymptotic Poisson
distribution.
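The counting step of OPSO can be sketched as below. This is an assumption-laden illustration rather than the Diehard implementation: the function name opso_missing_words and the shift parameter (which selects the 10 bits used as a letter) are hypothetical, and the comparison of the count with its reference distribution is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define OPSO_LETTER_BITS 10
#define OPSO_WORDS (1UL << (2 * OPSO_LETTER_BITS))   /* 2^20 two-letter words */

/*
 * Counts the two-letter words that never occur among the n - 1 overlapping
 * pairs of letters taken from x[0..n-1]. The letter of x[i] is the 10-bit
 * field starting at bit `shift`. Returns -1 if memory allocation fails.
 */
long opso_missing_words(const uint32_t *x, size_t n, unsigned shift)
{
    assert(x != NULL && n >= 2 && shift <= 32 - OPSO_LETTER_BITS);
    unsigned char *seen = calloc(OPSO_WORDS, 1);   /* one flag per word */
    if (seen == NULL)
        return -1;

    long missing = (long)OPSO_WORDS;
    uint32_t prev = (x[0] >> shift) & 0x3FFu;
    for (size_t i = 1; i < n; i++) {
        uint32_t cur = (x[i] >> shift) & 0x3FFu;
        uint32_t word = (prev << OPSO_LETTER_BITS) | cur;
        if (!seen[word]) {
            seen[word] = 1;
            missing--;
        }
        prev = cur;   /* overlapping: the second letter starts the next word */
    }
    free(seen);
    return missing;
}
```

The OQSO and DNA variants would follow the same pattern with a different letter width and word length; only the field mask and the way the word index is built would change.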
The OQSO and DNA tests are similar to the OPSO test; they differ in the composition of
the alphabet and in the way words are formed. More specifically, the OQSO test forms
4-letter words from an alphabet of 2^5 = 32 letters, and the DNA test uses an alphabet of
4 letters, somewhat like the structure of DNA, with words of 10 letters.
In [8], the authors report, based on experiments applying various monkey tests to many
different random number generators, that the OPSO, OQSO and DNA tests are the most
effective of the monkey tests.
3.5 Count the 1s
Two kinds of count-the-1s tests are included in Diehard: count the 1’s in a stream of
bytes, and count the 1’s in specific bytes. They differ in how the sequence being tested is
sampled: either successive bytes are used, or bytes are picked at predefined positions.
The procedure is discussed in [5] and summarized here. Each byte is first mapped to one
of 5 letters according to its number of 1’s, with the counts grouped into 5 sets: {0, 1, 2},
{3}, {4}, {5} and {6, 7, 8}. Overlapping words of 5 letters are then formed from the
sequence generated by the random number generator. Finally, the number of occurrences
of each of the 5^5 = 3125 possible words is counted, and a chi-squared test is performed
on the counts to assess the randomness of the generator.
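The mapping from bytes to letters and from 5 letters to a word index can be sketched as follows. The names ones_letter and word_index are hypothetical, and the sketch assumes that the five groups {0, 1, 2}, {3}, {4}, {5}, {6, 7, 8} are numbered 0 to 4 and that a word is encoded as a base-5 number, which gives 5^5 = 3125 possible words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Letter (0..4) of a byte, determined by how many of its 8 bits are 1. */
int ones_letter(uint8_t byte)
{
    int ones = 0;
    for (unsigned b = byte; b != 0; b >>= 1)
        ones += (int)(b & 1u);
    if (ones <= 2)
        return 0;          /* group {0, 1, 2} */
    if (ones >= 6)
        return 4;          /* group {6, 7, 8} */
    return ones - 2;       /* groups {3}, {4}, {5} -> 1, 2, 3 */
}

/* Index in 0..3124 of the 5-letter word formed by 5 consecutive bytes. */
int word_index(const uint8_t *bytes)
{
    assert(bytes != NULL);
    int idx = 0;
    for (int i = 0; i < 5; i++)
        idx = 5 * idx + ones_letter(bytes[i]);
    return idx;
}
```

A driver would slide a 5-byte window over the byte stream one byte at a time, increment counts[word_index(window)] in an array of 3125 counters, and hand the counters to the chi-squared computation.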
3.6 Parking lot test
This test is designed to assess the uniformity of points in m-space when the coordinates
of the points are taken from successive elements of the sequence produced by the random
number generator being tested.
In this test, successive numbers produced by the generator are treated as the coordinates
of a point in m-space, which is also the center of a cubic “car” of specified size. Given a
cube as the parking lot, and supposing that we park by ear, the test parks cars at random:
each new car is placed at the next random point, and the attempt succeeds only if the car
does not hit any of those already parked.
With this parking strategy, n attempts yield a list of k successfully parked cars. It is
stated in [5] that, after a relatively large number of attempts, the number of successfully
parked cars follows a certain normal distribution.
It is also mentioned that the dimension of the space has no influence on the performance
of the test, so the only things that need to be considered are the size and shape of the
“cars”.
3.7 Minimum distance test
In this test, the values of a generated random sequence are viewed as points in a square
of a certain size, and the quantity being tested is the distance between points.
Let us look at a concrete example. From the sequence generated by the random number
generator being tested, we take 8,000 points and place them in a square. The minimum
distance over all pairs of points is then calculated, and the test is based on the
observation that the square of this minimum distance should be exponentially distributed
with a certain mean.
One relevant issue is how distance is measured. Euclidean distance is probably the most
widely used measure.
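A straightforward way to obtain the test quantity is to compare every pair of points, as in the sketch below. The function name min_squared_distance is hypothetical, Euclidean distance is assumed, and the brute-force O(n^2) search is chosen for clarity rather than speed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Smallest squared Euclidean distance between any two of the n points
 * (x[i], y[i]). Working with squared distances avoids a square root,
 * and the squared minimum is the quantity the test examines.
 */
double min_squared_distance(const double *x, const double *y, size_t n)
{
    assert(x != NULL && y != NULL && n >= 2);
    double best = -1.0;   /* sentinel: no pair examined yet */
    for (size_t i = 0; i + 1 < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            double dx = x[i] - x[j];
            double dy = y[i] - y[j];
            double d2 = dx * dx + dy * dy;
            if (best < 0.0 || d2 < best)
                best = d2;
        }
    }
    return best;
}
```

For 8,000 points this means about 32 million pair comparisons per sample, which is acceptable for a sketch; sorting the points by one coordinate first would let an implementation skip most pairs.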
3.8 Random sphere test
This test is similar to the minimum distance test discussed above. The difference is that
the quantity considered is the volume of spheres rather than the minimum distance
between pairs of points.
We treat the elements of the sequence being tested as coordinates of points, and choose
4,000 points in a cube of edge 1,000. A sphere is then centered on each point, with radius
equal to the minimum distance from that center to any other point.
For testing, we rank the spheres by volume and use the smallest one as the test quantity.
According to [9], the volume of the smallest sphere should be exponentially distributed.
The issue of how to measure distance, discussed in the previous section, also needs to be
taken into consideration here when choosing the radius of each sphere.
3.9 The squeeze test
The squeeze test was designed for random number generators that produce floats on
[0,1). For generators whose output is integers, we can apply a linear mapping
(normalization) to project the sequence onto floats on [0,1). This linear operation does
not affect the correctness of the result.
The basic idea of the test is the “squeezing” in its name: starting from a large integer, a
process repeatedly reduces it until it reaches 1. More specifically, starting with
k = 2^31, the test finds j, the number of iterations necessary to reduce k to 1 using the
reduction
$k \leftarrow \lceil k \cdot U \rceil$,
with each U taken from the sequence being tested.
In the test, such j’s are found 100,000 times, and the number of times j was ≤ 6,
7, …, 47, ≥ 48 is counted. Values of j that are too small (6 or less) or too large (48 or
more) are pooled, since experience shows that j rarely falls into these ranges.
A chi-squared test is then performed on these counts to indicate the fitness of the
generator being tested.
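One squeeze can be sketched in C as below, assuming the reduction replaces k by the ceiling of k times U and starts from k = 2^31. The names squeeze_iterations and squeeze_bin are hypothetical, and the ceiling is computed by hand so that the sketch needs no math library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ceiling of a non-negative value that fits in a 64-bit integer. */
static double ceil_nonneg(double v)
{
    double t = (double)(int64_t)v;   /* truncation toward zero */
    return t < v ? t + 1.0 : t;
}

/*
 * Starting from k = 2^31, repeatedly replaces k by ceil(k * U), with U
 * taken in turn from u[0..n-1], until k <= 1. Returns the number of
 * iterations j, or -1 if the n values run out first. If `used` is not
 * NULL, it receives the number of values consumed.
 */
long squeeze_iterations(const double *u, size_t n, size_t *used)
{
    assert(u != NULL);
    double k = 2147483648.0;   /* 2^31 */
    long j = 0;
    size_t i = 0;
    while (k > 1.0) {
        if (i == n)
            return -1;
        k = ceil_nonneg(k * u[i++]);
        j++;
    }
    if (used != NULL)
        *used = i;
    return j;
}

/* Chi-squared cell for j: 0 for j <= 6, 1..41 for 7..47, 42 for j >= 48. */
int squeeze_bin(long j)
{
    if (j <= 6)
        return 0;
    if (j >= 48)
        return 42;
    return (int)(j - 6);
}
```

As a sanity check, feeding the constant value U = 0.5 halves k at every step, so k = 2^31 reaches 1 after exactly 31 iterations.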
3.10 Overlapping sums test
The overlapping sums test, discussed in [6] and [10], takes floats on [0, 1] as input,
meaning that the sequence being tested first needs to be normalized and converted to
floats to obtain uniform U[0,1] variables U1, U2, …, UN.
Overlapping sums of 100 consecutive variables are then constructed:
S1 = U1 + U2 + … + U100; S2 = U2 + U3 + … + U101; …; Si = Ui + Ui+1 + … + Ui+99; etc.
By the central limit theorem, these S’s are approximately multivariate normal. The next
step is to take 100 of the S’s and convert them into independent standard normal
variables by multiplying (S - µ) by a matrix A, where µ is the mean of S and A is chosen
so that the result has identity covariance, that is, $A \Sigma A^{T} = I$ with $\Sigma$
the covariance matrix of the S’s.
These standard normal variables are then converted into uniform U[0,1] variables by
applying the standard normal cumulative distribution function. Finally, a chi-squared
test is carried out on these uniform variables.
3.11 Runs test
The runs test, also called the Wald-Wolfowitz test in [11], is similar to the longest run
within a block test discussed in [10]. The focus of this test is the total number of runs in
the sequence generated by the random number generator being tested, where the
sequence is viewed as being composed of 0’s and 1’s.
A run, as in [11], is defined as a maximal non-empty segment of the sequence consisting
of adjacent equal elements; in other words, an uninterrupted sequence of identical bits
bounded before and after by a bit of the opposite value.
Suppose that a sequence of length N contains N0 occurrences of 0 and N1 occurrences of
1 (so N = N0 + N1). The test is done by checking that the number of runs is
approximately normally distributed with mean
$\mu = \frac{2 N_0 N_1}{N} + 1$
and variance
$\sigma^2 = \frac{2 N_0 N_1 (2 N_0 N_1 - N)}{N^2 (N - 1)} = \frac{(\mu - 1)(\mu - 2)}{N - 1}$.
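The quantities needed for the runs test can be gathered in one pass over the bits, as in the following sketch. The struct and function names are hypothetical, and the mean and variance use the standard Wald-Wolfowitz expressions for the number of runs, mean = 2*N0*N1/N + 1 and variance = (mean - 1)(mean - 2)/(N - 1), which are stated here as an assumption about what the test compares against.

```c
#include <assert.h>
#include <stddef.h>

/* Summary of a 0/1 sequence for the Wald-Wolfowitz runs test. */
typedef struct {
    long n0, n1;        /* occurrences of 0 and of 1 */
    long runs;          /* observed number of runs */
    double mean;        /* expected number of runs */
    double variance;    /* variance of the number of runs */
} runs_summary;

/* Any non-zero element of bits[] is treated as a 1. */
runs_summary runs_test_summary(const unsigned char *bits, size_t n)
{
    assert(bits != NULL && n >= 2);
    runs_summary r = {0, 0, 1, 0.0, 0.0};   /* the first bit opens a run */
    for (size_t i = 0; i < n; i++) {
        if (bits[i])
            r.n1++;
        else
            r.n0++;
        /* A new run starts wherever the bit value changes. */
        if (i > 0 && (bits[i] != 0) != (bits[i - 1] != 0))
            r.runs++;
    }
    double N = (double)n;
    r.mean = 2.0 * (double)r.n0 * (double)r.n1 / N + 1.0;
    r.variance = (r.mean - 1.0) * (r.mean - 2.0) / (N - 1.0);
    return r;
}
```

For the sequence 0, 0, 1, 1, 0 this gives N0 = 3, N1 = 2 and 3 runs, with an expected value of 3.4 runs and a variance of 0.84; a z-score would follow as (runs - mean) divided by the square root of the variance.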
3.12 The craps test
This test comes from the game of craps, in which players place wagers on the outcome of
a roll, or a series of rolls, of two dice.
In this test, we play 200,000 games of craps and count the number of wins and the
number of throws per game. We then perform a chi-squared test on each count.
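A single game can be simulated from uniform floats as sketched below. The function names are hypothetical, each die is assumed to be obtained as 1 + floor(6U) from a uniform U on [0,1), and the usual rules of craps are assumed: a first throw of 7 or 11 wins, 2, 3 or 12 loses, and any other sum becomes the point, which must be thrown again before a 7 appears.

```c
#include <assert.h>
#include <stddef.h>

/* Maps a uniform value on [0,1) to a die face 1..6. */
static int die_from_uniform(double u)
{
    return 1 + (int)(6.0 * u);
}

/*
 * Plays one game using uniforms u[*pos], u[*pos + 1], ... (two per throw).
 * Returns 1 for a win, 0 for a loss, or -1 if the n values run out.
 * *throws receives the number of throws; *pos is advanced past the
 * values that were consumed.
 */
int play_craps(const double *u, size_t n, size_t *pos, int *throws)
{
    assert(u != NULL && pos != NULL && throws != NULL);
    int point = 0;   /* 0 means the first throw has not been made yet */
    *throws = 0;
    for (;;) {
        if (*pos + 2 > n)
            return -1;
        int sum = die_from_uniform(u[*pos]) + die_from_uniform(u[*pos + 1]);
        *pos += 2;
        (*throws)++;
        if (point == 0) {
            if (sum == 7 || sum == 11)
                return 1;
            if (sum == 2 || sum == 3 || sum == 12)
                return 0;
            point = sum;
        } else {
            if (sum == point)
                return 1;
            if (sum == 7)
                return 0;
        }
    }
}
```

A driver would call play_craps 200,000 times on the generator's output, accumulating the number of wins and a histogram of throws per game for the statistical tests.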
4. Software for Testing RNGs
Quite a few software suites for testing random number generators have been developed
by various research groups, by implementing the testing algorithms discussed above,
modifying them for special applications, or incorporating new testing algorithms. Some
widely used test suites are TestU01, DIEHARD, SPRNG, NIST, ENT and Crypt-X. This
section gives an overview of these test suites, and the next section examines TestU01 in
detail.
4.1 TestU01
TestU01 is a software library implemented in ANSI C by Pierre L’Ecuyer of the
Université de Montréal. It offers a collection of utilities for the empirical statistical
testing of uniform random number generators. TestU01 is free to download, and the
latest version as well as the user’s guide can be found at [12].
4.2 DIEHARD
DIEHARD is the implementation of Marsaglia’s battery of Diehard tests discussed in
Section 3. More information about this software can be found at [14]. The author
provided the source code in both C and Fortran, the compiled program, and a user’s
guide on the CD-ROM.
4.3 SPRNG
The SPRNG library [15] is another public-domain software package; it implements the
classical tests for random number generators given by Knuth in his book [1], plus a few
others. Although the tests proposed by Knuth are now considered quite mild, allowing
known “bad” generators to pass, SPRNG is still of value, since the suitability of a test
depends on the application and in certain cases Knuth’s tests perform fairly well.
4.4 NIST Statistical Test Suite (NIST)
The NIST Statistical Test Suite [4] is a statistical package consisting of 16 tests, released
by the National Institute of Standards and Technology in 2001. The testing algorithms
are the NIST tests mentioned in Section 2. As the result of a collaboration between the
Computer Security Division and the Statistical Engineering Division at NIST, this test
suite is especially suitable for testing the randomness of arbitrarily long binary
sequences produced by either hardware- or software-based cryptographic random or
pseudo-random number generators. Thus, the NIST test suite is commonly used to test
and certify random number generators used in cryptographic applications.
4.5 ENT
The ENT program, developed by John Walker in 1998 and described in [16], is a set of
statistical tests for detecting non-randomness in sequences. It is described as being
useful for those evaluating pseudo-random number generators for encryption and
statistical sampling applications, compression algorithms, and other applications where
the information density of a file is of interest.
4.6 Crypt-X
The Crypt-X suite of statistical tests was developed by researchers at the Information
Security Research Centre at Queensland University of Technology in Australia and is a
commercial software package. Crypt-X supports stream ciphers, block ciphers and
keystream generators. More information can be found at [17].
5. TestU01
This section focuses on TestU01, a software library implementing a series of classical
statistical testing algorithms as well as several types of random number generators.
Started in 1981 as a Pascal program implementing the tests suggested by Knuth, TestU01
has kept developing over the years, adding new testing algorithms as well as new
generators. At the time of writing, the latest version, TestU01-1.2.3, written in ANSI C,
was released on August 18th, 2009.
In the following, the organization (architecture) of the library is discussed, and some
practical issues such as installation and usage of the library are addressed.
5.1 Organization
According to the user’s guide [13], the software tools of TestU01 are organized in four
classes of modules: those implementing random number generators, those implementing
statistical tests, those implementing predefined batteries of tests, and those
implementing tools for applying tests to entire families of generators. The names of the
modules in these classes start with the letters u, s, b, and f respectively, and within each
module the types, variables and functions are prefixed by the name of the module.
5.1.1 The generator implementations
The u modules provide the basic tools for defining and implementing uniform random
number generators. A set of uniform generators is already programmed in the library,
such as the linear congruential generators in module ulcg, the multiple recursive
generators in umrg, the inversive generators in uinv, etc.
Besides these built-in generators, functions to combine two or more generators are also
provided, such as unif01_CreateCombAdd2() and unif01_CreateCombXor2(). Various
other functions manipulate generators: for example, unif01_WriteState() prints the
current state of a generator, unif01_CreateDoubleGen() filters the output by combining
successive values in the sequence, and the speed of a particular generator can be
measured with unif01_TimerGen() and the result printed with unif01_WriteTimerRec().
5.1.2 The statistical tests
The statistical tests are implemented in the s modules. To apply a test to a particular
generator, the generator must first be created by the appropriate function in a u module
and then passed as a parameter to the function implementing the required test in an s
module. The test results are printed automatically to the standard output.
The tests are organized in different modules. For example, module sknuth implements
the classical statistical tests for random number generators described in Knuth’s book,
and module smarsa implements the Diehard tests discussed in Section 3. Other modules
group further tests according to their similarity.
5.1.3 Batteries of tests
The b modules contain predefined suites of standard statistical tests with fixed
parameters. They are convenient for users because different types of tests are
incorporated in each suite, and all the user needs to do is pass the generator to be tested
as a parameter to the suite.
A number of predefined batteries of tests are available in TestU01. For example, the
battery Pseudo-DIEHARD applies most of the tests in Marsaglia’s Diehard suite, and the
battery FIPS_140_2 implements the small suite of tests of the NIST standard.
5.1.4 Tools for testing families of generators
The f modules provide a set of tools, built on top of the modules that implement the
generator families and the tests, for performing systematic studies of the interaction
between certain types of tests and the structure of the point sets produced by given
families of random number generators.
For each family to be tested, a random number generator with period length near 2^i has
been selected for every integer i in some interval. Typically, a given random number
generator starts to fail a test once the sample size becomes large enough. Given this fact,
the study consists of determining the sample size at which the test starts to reject the
generator, as a function of the period length of the generator being tested.
5.2 Installation
Both the source code and compiled binary files are provided at [12], together with an
installation guide covering compilation and installation on different platforms. Based on
my experience in this term project, I would like to add a few remarks about installation
under Microsoft Windows.
In my project, I used TestU01 under Microsoft Windows in Cygwin, which provides a
complete UNIX-like environment including the GCC compiler, the Bourne shell and many
other tools. Although I used the precompiled binary files containing the compiled
TestU01 library, the GCC compiler was still needed to link executables against the
library. Since I used the compiled version, installation simply consisted of unzipping the
library in the root directory, which can be done with the following commands:
cd /
unzip bin-cygwin.zip
After successful execution of the above commands, the files are extracted into
/usr/include, /usr/lib and /usr/share/TestU01.
Another point worth mentioning is the setting of environment variables. Before running
TestU01, the PATH must be set in Cygwin with the following command:
export PATH=/usr/bin:${PATH}
5.3 Experiments and results
As an experiment, I demonstrate the usage of the TestU01 library with an existing
example of the birthday spacing test. This sample code can be found in /examples as
birth1.c and is reproduced here:
#include "unif01.h"
#include "ulcg.h"
#include "smarsa.h"
#include <stddef.h>

int main (void)
{
   unif01_Gen *gen;

   /* LCG with m = 2147483647, a = 397204094, c = 0 and initial state 12345 */
   gen = ulcg_CreateLCG (2147483647, 397204094, 0, 12345);
   /* Two birthday spacing tests that differ in n and d */
   smarsa_BirthdaySpacings (gen, NULL, 1, 1000, 0, 10000, 2, 1);
   smarsa_BirthdaySpacings (gen, NULL, 1, 10000, 0, 1000000, 2, 1);
   ulcg_DeleteGen (gen);
   return 0;
}
As we can see, the library is used simply by calling a few functions. In this particular
example, functions of the u and s modules are used. The two functions from the u
module handle the creation and deletion of a particular linear congruential generator,
whose modulus, multiplier, additive constant and initial state are specified as parameters
of the creation function. The birthday spacing test itself is performed by the function
smarsa_BirthdaySpacings(). Two tests are carried out on the same generator with
different parameters, and the difference can be seen in the output of the program:
*****************************************************************
ulcg_CreateLCG: m = 2147483647, a = 397204094, c = 0, s = 12345

smarsa_BirthdaySpacings test:
-----------------------------------------------------------------
N = 1, n = 1000, r = 0, d = 10000, t = 2, Order = 1

Number of cells = d^t = 100000000
Lambda = Poisson mean = 2.5000

-----------------------------------------------------------------
Total expected number = N*Lambda : 2.50
Total observed number            : 6
Significance level of test       : 0.04

-----------------------------------------------------------------
CPU time used                    : 00:00:00.00

Generator state:
s = 1858647048

*****************************************************************
ulcg_CreateLCG: m = 2147483647, a = 397204094, c = 0, s = 12345

smarsa_BirthdaySpacings test:
-----------------------------------------------------------------
N = 1, n = 10000, r = 0, d = 1000000, t = 2, Order = 1

Number of cells = d^t = 1000000000000
Lambda = Poisson mean = 0.2500

-----------------------------------------------------------------
Total expected number = N*Lambda : 0.25
Total observed number            : 44
Significance level of test       : eps *****

-----------------------------------------------------------------
CPU time used                    : 00:00:00.01

Generator state:
s = 731506484
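From the output, we can see that in both tests N = 1, r = 0 and t = 2, but the sample size
n and the number of cells differ. When n = 10^3, the expected number of collisions λ is
2.5, and when n = 10^4 it is 0.25. In the first case, 6 collisions were observed against 2.5
expected, with a significance level of 0.04 that the program does not flag, so the count
is still regarded as compatible with a Poisson distribution with mean λ. In the second
case, 44 collisions were observed against 0.25 expected, and the significance level is
reported as eps and flagged with *****, so the generator clearly fails this birthday
spacing test.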
6. Conclusion
This paper summarized random number testing algorithms and software, and examined
the Diehard tests and TestU01 more thoroughly. No single test performs well for all
random number generators, so a variety of tests is needed for different generators. As
computer hardware develops, the criteria for randomness may become more stringent,
and new, well-designed test methods will be needed.
References
[1] Knuth, D. E. The Art of Computer Programming, Vol. 2: Seminumerical Algorithms,
2nd Edition, Addison-Wesley, Reading, Mass., 1981.
[2] Randomness test: http://en.wikipedia.org/wiki/Randomness_tests
[3] Statistical randomness: http://en.wikipedia.org/wiki/Statistical_randomness
[4] NIST: http://csrc.nist.gov/rng/
[5] George Marsaglia. A Current View of Random Number Generators, Computer
Science and Statistics, 1985.
[6] The Distributed Systems Group, Computer Science Department, TCD. Analysis of
an On-line Random Number Generator, April 2001.
[7] Infinite monkey theorem: http://en.wikipedia.org/wiki/Infinite_monkey_theorem
[8] Monkey Tests for Random Number Generators, Computers and Mathematics with
Applications, 9, 1-10, 1993.
[9] James E. Gentle. Random Number Generation and Monte Carlo Methods, Chapter 2:
Quality of Random Number Generators, 2nd Edition, Springer, 1998.
[10] The Distributed Systems Group, Computer Science Department, TCD. Random
Number Generators: An Evaluation and Comparison of Random.org and Some
Commonly Used Generators, April 2005.
[11] Runs test: http://en.wikipedia.org/wiki/Wald%E2%80%93Wolfowitz_runs_test
[12] TestU01: http://www.iro.umontreal.ca/~simardr/testu01/tu01.html
[13] Pierre L’Ecuyer and Richard Simard. TestU01: A Software Library in ANSI C for
Empirical Testing of Random Number Generators.
[14] DIEHARD: http://stat.fsu.edu/~geo/diehard.html
[15] SPRNG: http://sprng.cs.fsu.edu/
[16] ENT: http://www.fourmilab.ch/random/
[17] Crypt-X: http://www.isrc.qut.edu.au/resource/cryptx/