# A comparative study of some pseudorandom number generators

**ABSTRACT** We present results of a test program of a group of pseudorandom number generators which are commonly used in the applications of physics, in particular in Monte Carlo simulations. The generators include public domain programs, manufacturer installed routines and a random number sequence produced from physical noise. We start by traditional standard tests, followed by detailed bit level and visual tests. The computational speed of various algorithms is also scrutinized. Our results allow direct comparisons between the properties of different generators, as well as an assessment of the efficiency of various test methods. Together with recently developed application specific tests, this information provides a good criterion to choose the best generator among the tested ones for a given problem.



arXiv:hep-lat/9304008v2 10 Aug 1993

A Comparative Study of

Some Pseudorandom Number Generators

I. Vattulainen¹, K. Kankaala¹,², J. Saarinen¹, and T. Ala-Nissila¹,³

¹ Department of Electrical Engineering, Tampere University of Technology, P.O. Box 692, FIN - 33101 Tampere, Finland

² Centre for Scientific Computing, P.O. Box 405, FIN - 02100 Espoo, Finland

³ Research Institute for Theoretical Physics, P.O. Box 9 (Siltavuorenpenger 20 C), FIN - 00014 University of Helsinki, Finland

Abstract

We present results of an extensive test program of a group of pseudorandom number generators which are commonly used in the applications of physics, in particular in Monte Carlo simulations. The generators include public domain programs, manufacturer installed routines and a random number sequence produced from physical noise. We start by traditional statistical tests, followed by detailed bit level and visual tests. The computational speed of various algorithms is also scrutinized. Our results allow direct comparisons between the properties of different generators, as well as an assessment of the efficiency of the various test methods. This information provides the best available criterion to choose the best possible generator for a given problem. However, in light of recent problems reported with some of these generators, we also discuss the importance of developing more refined physical tests to find possible correlations not revealed by the present test methods.

PACS numbers: 02.50.-r, 02.50.Ng, 75.40.Mg.

Key words: Randomness, random number generators, Monte Carlo simulations.


## 1 Introduction

“Have you generated any

new random numbers today?”

J. Käpyaho

Long sequences of random numbers are currently required in numerous applications, in particular within statistical mechanics, particle physics, and applied mathematics. The methods utilizing random numbers include Monte Carlo simulation techniques [5], stochastic optimization [1], and cryptography [6, 12, 61], all of which usually require fast and reliable random number sources. In practice, the random numbers needed for these methods are produced by deterministic rules, implemented as (pseudo)random number generator algorithms which usually rely on simple arithmetic operations. By their definition, the maximum-length sequences produced by all these algorithms are finite and reproducible, and can thus be “random” only in some limited sense [8, 9].

Despite the importance of creating good pseudorandom number generators, fairly

little theoretical work exists on random number generation. Thus, the properties of

many generators are not well understood in depth. Some random number generator

algorithms have been studied in the general context of cellular automata [8], and

deterministic chaos [4]. In particular, number theory has yielded exact results on the

periodicity and lattice structure for linear congruential and Tausworthe generators

[11, 40, 30, 55]. These results have led to theoretical methods of evaluating the

algorithms, the most notable being the so called spectral test. However, most of

these theoretical results are derived for the full period of the generator while in

practice the behavior of subsequences of substantially shorter lengths is of particular

importance in applications. In addition, the actual implementation of the random

number generator algorithm may affect the quality of its output. Thus, in situ tests

of implemented programs are usually needed.

Despite this obvious need for in situ testing of pseudorandom number generators, only relatively few authors have presented results to this end [32, 33, 49, 42]. Most likely, there are two main reasons for this. The first is the persistence of underlying fundamental problems in the actual definitions of “randomness” and “random” sequences which have given no unique practical recipe for testing a finite sequence of numbers [31]. Thus various authors have developed an array of different tests which mostly probe some of the statistical properties of the sequences, or test correlations e.g. on the binary level. Recently, Compagner and Hoogland [8] have presented a somewhat more systematic approach to randomness as embodied in finite sequences. They propose testing the values of all possible correlation coefficients of an ensemble of a given sequence and all of its “translations” (iterated variations) [10], a task which nevertheless appears rather formidable for practical purposes. We are not aware of any attempts to actually carry out their program. The second reason is probably more practical, namely the gradual evolution of improved pseudorandom number generator algorithms, which has led to a diversity of generators available in computer software, public domain and so on. For many of these algorithms (and their implementations) only a few rudimentary tests have been performed.

In this work we have undertaken an extensive test program [58] of a group of pseudorandom number generators which are often employed in the applications of physics. These generators, which are described in detail in Section 2, include the public domain programs GGL, RANMAR, RAN3, RCARRY and R250, the library subroutine G05FAF, the manufacturer installed routines RAND and RANF, and even a sequence generated from physical noise (PURAN II). Our strategy is to perform a large set of different tests for all of these generators, whose results can then be directly compared with each other. There are two main reasons for this. First, there is a difficulty associated with most quantitative tests in the choice of the test parameters and final criteria for judging the results. Thus, we think that full comparative tests of a large group of generators using identical test parameters and criteria can yield more meaningful results, in particular when there is a need for a reliable generator with a good overall performance. Second, performing a large number of tests also allows a comparison of exactly how efficient each test is in finding certain kinds of correlations.

As discussed in Section 3, we first employ an array of standard statistical tests, which measure the degree of uniformity of the distribution of numbers, as well as correlations between them. Following this, we perform a series of bit level tests, some of which should be particularly efficient in finding correlations between consecutive bits in the random number sequences. The third part of our testing utilizes visual pictures of random numbers and their bits on a plane. Finally, for the sake of completeness we have also included a relative performance test of the generators in our results. A complete summary of the test results is presented in Section 4.

As our main result we find three generators, namely GGL, G05FAF, and R250, with an overall best performance in all our tests, although some other generators such as RANF and RANMAR perform almost as convincingly. We also find that the bit level tests are most efficient in finding local correlations in the random numbers, but nevertheless do not guarantee good statistical properties, as shown in the case of RCARRY. Our results also show that visual tests can indeed reveal spatial correlations not clearly detected in the quantitative tests. Our work thus provides a rather comprehensive test bench which can be utilized in choosing a random number generator for a given application. However, choosing a “good quality” random number generator for all applications may not be trivial, as discussed in Section 5, in light of the recent results reporting anomalous correlations in Monte Carlo simulations [14, 23] using R250, which performs almost impeccably here. Thus, more physical ways of testing random number sequences are probably needed, a project which is currently underway [59].


## 2 Generation of Random Numbers

“The generation of random numbers

is too important to be left to chance.”

R. Coveyou

Pseudorandom number sequences needed for high speed applications are usually

generated at run time using an algorithm which often is a relatively simple nonlinear

deterministic map. The implementation of the corresponding recurrence relation

must also ensure that the stream of numbers is reproducible from identical initial

conditions. The deterministic nature of generation means that the designer has to be

careful in the choice of the precise relationship of the recursion, otherwise unwanted

correlations will appear as amply demonstrated in the literature [8].

However, even the best generator algorithm can be defeated by a poor computer

implementation. Whenever an exact mathematical algorithm is translated into a

computer subroutine, different possibilities for its implementation may exist. Only

if the operation of a generator can be exactly specified on the binary level, has

the implementation a chance to be unambiguous; otherwise, machine dependent

features become incorporated into the routine. These include finite precision of real

numbers, limited word size of the computer, and numerical accuracy of mathematical

functions. Furthermore, it would often be desirable that the implemented routine

performed identically in each environment in which it is to be executed, i.e. it would

be portable.

Some of the desired properties of good pseudorandom number generators are easily defined but often difficult to achieve simultaneously. Namely, besides good “randomness” properties, portability, repeatability, performance speed, and a very long period are often required. Ideally, a random number generator would be designed for each application, and then tested within that application to ensure that the inevitable correlations that do exist in a deterministic algorithm cause no observable effects. In practice, this is seldom possible, which is another reason why extensive tests of pseudorandom number generators are needed.

The most commonly used pseudorandom number generator algorithms are the linear congruential method, the lagged Fibonacci method, the shift register method, and combination methods. A special case are nonalgorithmic or physical generators which are used for creating a non-reproducible sequence of random numbers. These are usually based on “random” physical events, e.g. changes in physical characteristics of devices, cosmic ray bursts or electromagnetic interference. Details and properties of the algorithms will be summarized in the next section. Following this, we shall describe in more detail the particular generators chosen for our tests. Reviews of the current state of generation methods can be found in e.g. Marsaglia [42], James [27], L’Ecuyer [34], and Anderson [3].

### 2.1 Classification of Generation Methods

“Anyone who considers arithmetical methods

of producing random digits is, of course,

in a state of sin.”

J. von Neumann

Among the simplest algorithms are the linear congruential generators which use the integer recursion

$$X_{i+1} = (a X_i + b) \bmod m, \tag{1}$$

where the integers $a$, $b$ and $m$ are constants. It generates a sequence $X_1, X_2, \ldots$ of random integers between $0$ and $m-1$ (or in the case $b = 0$, between $1$ and $m-1$). Each $X_i$ is then scaled into the interval $[0,1)$. The parameter $m$ is often chosen to be equal or nearly equal to the largest integer in the computer. Linear congruential generators can be classified into mixed ($b > 0$) and multiplicative ($b = 0$) types, and are usually denoted by LCG($a$, $b$, $m$) and MLCG($a$, $m$), respectively.
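As a minimal sketch of recursion (1), the following Python generator iterates a linear congruential rule and scales each integer into $[0,1)$. The names `lcg` and `first_values` and the toy parameters LCG(5, 3, 16) are illustrative assumptions, chosen only so that a full period can be listed by hand; they are not among the generators tested in this work.

```python
def lcg(a, b, m, seed):
    """Yield (X_i, X_i / m) for the recursion X_{i+1} = (a*X_i + b) mod m."""
    x = seed
    while True:
        x = (a * x + b) % m
        yield x, x / m          # integer state and its scaling into [0, 1)


def first_values(a, b, m, seed, count):
    """Return the first `count` integers produced from `seed`."""
    gen = lcg(a, b, m, seed)
    return [next(gen)[0] for _ in range(count)]


# Toy mixed generator LCG(5, 3, 16): it visits all 16 residues before repeating.
toy = first_values(5, 3, 16, seed=0, count=16)
print(toy)
print(len(set(toy)) == 16)
```

With such a small modulus the whole cycle is visible at once; realistic choices of $m$ near the machine word size behave the same way structurally but cannot be inspected by eye.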

Since the introduction of this algorithm by Lehmer [35], its properties have been

researched in detail. Marsaglia [40] pointed out about 20 years ago that the random

numbers in d dimensions lie on a relatively small number of parallel hyperplanes.

Further theoretical work [11, 15, 16] has been done to weed out bad choices of the

constants a, b and m but so far no consensus has evolved on a unique best choice

for these parameters.

To increase the period of the linear congruential algorithm, it is natural to generalize it to the form

$$X_i = (a_1 X_{i-1} + \cdots + a_p X_{i-p}) \bmod m, \tag{2}$$

where $p > 1$ and $a_p \neq 0$. The period is the smallest positive integer $\lambda$ for which

$$(X_0, \ldots, X_{p-1}) = (X_\lambda, \ldots, X_{\lambda+p-1}). \tag{3}$$

Since there are $m^p$ possible $p$-tuples, and the all-zero tuple only reproduces itself, the maximum period is $m^p - 1$. In this category the simplest algorithm is of the Fibonacci type. The use of $p = 2$, $a_1 = a_2 = 1$ leads to the Fibonacci generator

$$X_i = (X_{i-1} + X_{i-2}) \bmod m. \tag{4}$$

Since no multiplications are involved, this implementation has the advantage of being fast.
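The period definition (3) can be checked directly for small moduli. The sketch below is an illustration under stated assumptions: the function name `recurrence_period` is hypothetical, and the search is a brute-force walk over $p$-tuples that is only practical for toy values of $m$ and $p$.

```python
def recurrence_period(coeffs, m, start):
    """Smallest lam > 0 with (X_lam..X_{lam+p-1}) == (X_0..X_{p-1}).

    coeffs = (a_1, ..., a_p) for X_i = (a_1*X_{i-1} + ... + a_p*X_{i-p}) mod m.
    start  = (X_0, ..., X_{p-1}).
    """
    p = len(coeffs)
    initial = tuple(start)
    state = initial
    lam = 0
    while True:
        # state holds (X_{i-p}, ..., X_{i-1}); a_j multiplies X_{i-j}.
        nxt = sum(a * x for a, x in zip(coeffs, reversed(state))) % m
        state = state[1:] + (nxt,)
        lam += 1
        if state == initial:
            return lam
        if lam > m ** p:
            raise ValueError("initial p-tuple never recurs")


print(recurrence_period((1, 1), 2, (0, 1)))    # 3, the bound 2^2 - 1
print(recurrence_period((1, 1), 3, (0, 1)))    # 8, the bound 3^2 - 1
print(recurrence_period((1, 1), 10, (0, 1)))   # 60, well below 10^2 - 1
```

The last case shows that the bound $m^p - 1$ is an upper limit only: for the Fibonacci generator (4) with $m = 10$ the period falls well short of it.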

A lagged Fibonacci generator requires an initial set of elements $X_1, X_2, \ldots, X_r$ and then uses the integer recursion

$$X_i = X_{i-r} \otimes X_{i-s}, \tag{5}$$

where $r$ and $s$ are two integer lags satisfying $r > s$ and $\otimes$ is a binary operation ($+$, $-$, $\times$, or $\oplus$ (exclusive-or)). The corresponding generators are designated by LF($r$, $s$, $\otimes$). Typically, the initial elements are chosen as integers and the binary operation is addition modulo $2^n$. Lagged Fibonacci generators are elaborated in e.g. Ref. [42].
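Recursion (5) can be sketched with a fixed-length buffer of the last $r$ values. The function name `lagged_fibonacci`, the explicit `modulus` argument and the toy lags LF(5, 2, +) are assumptions made for illustration; the seed values are arbitrary.

```python
from collections import deque
from itertools import islice
import operator


def lagged_fibonacci(r, s, op, seed_values, modulus):
    """Yield X_i = (X_{i-r} op X_{i-s}) mod modulus for lags r > s > 0."""
    if not (r > s > 0) or len(seed_values) != r:
        raise ValueError("need lags r > s > 0 and exactly r seed values")
    history = deque(seed_values, maxlen=r)   # history[0] is X_{i-r}
    while True:
        x = op(history[0], history[r - s]) % modulus
        history.append(x)                    # the oldest element drops out
        yield x


# Toy additive generator LF(5, 2, +) modulo 2^4.
sample = list(islice(lagged_fibonacci(5, 2, operator.add, [1, 2, 3, 4, 5], 2**4), 6))
print(sample)
```

Passing `operator.sub` or `operator.xor` instead of `operator.add` gives the subtractive and exclusive-or variants with the same buffer logic.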

An alternative generator type is the shift register generator. Feedback shift register generators are also sometimes called Tausworthe generators [54]. The feedback shift register algorithm is based on the theory of primitive trinomials of the form $x^p + x^q + 1$. Given such a primitive trinomial and $p$ binary digits $x_0, x_1, x_2, \ldots, x_{p-1}$, a binary shift register sequence can be generated by the following recurrence relation:

$$x_k = x_{k-p} \oplus x_{k-p+q}, \tag{6}$$

where $\oplus$ is the exclusive-or operator, which is equivalent to addition modulo 2. $l$-bit vectors can be formed from bits taken from this binary sequence as

$$W_k = x_k x_{k+d} x_{k+2d} \cdots x_{k+(l-1)d}, \tag{7}$$

where $d$ is a chosen delay between elements of this binary vector. The resulting binary vectors are then treated as random numbers. Such a generated sequence of random integers will have the maximum possible period of $2^p - 1$, if $x^p + x^q + 1$ is a primitive trinomial and if this trinomial divides $x^n - 1$ for $n = 2^p - 1$, but for no smaller $n$. These conditions can easily be met by choosing $p$ to be a Mersenne exponent, i.e. a prime number $p$ for which $2^p - 1$ is also a prime (a Mersenne prime). A list of Mersenne primes can be found e.g. in Refs. [60, 62, 6, 25]. Generators based on small values of $p$ do not perform well on the tests [42]. According to some statistical tests on computers [57] the value of $q$ should be small or close to $p/2$.
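Recurrence (6) and the word construction (7) can be tried out on a register small enough to enumerate. The sketch below is illustrative only: the function names are hypothetical, the trinomial $x^5 + x^2 + 1$ is used as a known primitive example, and $x^4 + x^2 + 1$ as a non-primitive contrast.

```python
def shift_register_bits(p, q, seed_bits):
    """Yield bits x_k = x_{k-p} XOR x_{k-p+q} for the trinomial x^p + x^q + 1."""
    if not (0 < q < p) or len(seed_bits) != p or not any(seed_bits):
        raise ValueError("need 0 < q < p and p seed bits, not all zero")
    bits = list(seed_bits)           # bits[0] is x_{k-p}
    while True:
        new = bits[0] ^ bits[q]      # x_{k-p+q} sits q places after x_{k-p}
        bits = bits[1:] + [new]
        yield new


def register_period(p, q, seed_bits):
    """Number of steps until the p-bit register returns to its seed state."""
    gen = shift_register_bits(p, q, seed_bits)
    start = list(seed_bits)
    state, steps = list(seed_bits), 0
    while True:
        state = state[1:] + [next(gen)]
        steps += 1
        if state == start:
            return steps


def word(bits, k, l, d):
    """Pack x_k, x_{k+d}, ..., x_{k+(l-1)d} into one l-bit integer, as in Eq. (7)."""
    value = 0
    for j in range(l):
        value = (value << 1) | bits[k + j * d]
    return value


print(register_period(5, 2, [1, 0, 0, 0, 0]))   # 31 = 2^5 - 1 (primitive)
print(register_period(4, 2, [1, 0, 0, 0]))      # 6, short of 2^4 - 1 = 15
```

The second call illustrates why primitivity matters: a trinomial that factors gives a period far below $2^p - 1$.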

Lewis and Payne [37] formed $l$-bit words by introducing a delay between the words. The corresponding generator is called the generalized feedback shift register generator, denoted by GFSR($p$, $q$, $\oplus$). In a GFSR generator the words $W_k$ satisfy the recurrence relation

$$W_k = W_{k-p} \oplus W_{k-p+q}. \tag{8}$$

Under special conditions, the maximal period length of $2^p - 1$ can be achieved. Lewis and Payne [37] and Niederreiter [47] have also studied the properties of the algorithm theoretically. An important aspect of the GFSR algorithm concerns its initialization, where $p$ initial seeds are required. This question has been studied theoretically in Refs. [18, 19, 55, 56, 20].

Given the inevitable dependencies that will exist in a pseudorandom sequence, it seems natural that one should try to shuffle a sequence [26] or to combine separate sequences. An example of such an approach is given by MacLaren and Marsaglia [39] who were apparently the first to suggest the idea of combining two generators together to produce a single sequence of random numbers. The essential idea is that if $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ are two random number sequences, then the sequence $Z_1, Z_2, \ldots$ defined by $Z_i = X_i \otimes Y_i$ will not only be more uniform than either of the two sequences but will also be more independent. Algorithms using this idea are often called mixed or combination generators.
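One concrete choice for the combining operation is subtraction modulo 1 on reals in $[0,1)$. The sketch below assumes that choice; the name `combine_subtract` is hypothetical, and the code shows only the mechanics of combining two streams, not the claimed gain in uniformity or independence.

```python
def combine_subtract(xs, ys):
    """Yield Z_i = (X_i - Y_i) mod 1 for two streams of reals in [0, 1)."""
    for x, y in zip(xs, ys):
        z = x - y
        yield z if z >= 0.0 else z + 1.0   # wrap negative differences into [0, 1)


combined = list(combine_subtract([0.5, 0.25], [0.25, 0.75]))
print(combined)
```

Any two iterables of floats can be passed in, so the same helper works with generator objects that produce their values lazily.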

As mentioned before, physical devices have also been used in the creation of random

number sequences. Usually, however, such sequences are generated too slowly to

be used in real time, but rather stored in the computer memory where they can be

easily accessed. This also guarantees the reproducibility of the chosen sequence in

applications. However, physical memory restrictions often severely limit the number

of stored numbers. Unwanted and unknown physical correlations may also affect the

quality of physical random numbers. As a result, physical random numbers have not

been commonly used in simulations. One implementation of a physical generator

can be found in Ref. [52].

### 2.2 Descriptions of Generators

In this section, we shall describe in more detail the generators which have been chosen for the tests. Since many combinations of possible parameters exist, we have tried to choose those particular algorithms which have been most commonly used in physics applications, or which have been previously tested. At the end of this section, we shall also describe a sequence of random numbers generated from physical noise, which has been included for purposes of comparison.

• GGL

GGL is a uniform random number generator based on the linear congruential method [48]. The form of the generator is MLCG($16807$, $2^{31} - 1$) or

$$X_{i+1} = (16807\, X_i) \bmod (2^{31} - 1). \tag{9}$$

This generator has been particularly popular [48]. It has seen extensive use in the IBM computers [65], and is also available in some commercial software packages such as subroutine RNUN in the IMSL library [66] and subroutine RAND in the MATLAB software [67]. MLCG($16807$, $2^{31} - 1$) generators are quite fast and have been argued to have good statistical properties [36]. Results of tests with and without shuffling are reported by Learmonth and Lewis [32]. Other test results on implementations of this algorithm have been given in [29, 2, 3, 34]. Its drawback is its cycle length $2^{31} - 1$ ($\approx 2 \times 10^9$ steps) [29], which can be exhausted fast on a modern high speed computer. We also note that GGL is particularly sensitive to the arithmetic accuracy of its implementation (cf. Section 4). Our Fortran implementation of GGL produces the same sequence as RNUN of the IMSL library.¹
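The sensitivity to arithmetic accuracy arises because the product $16807\,X_i$ can exceed 32 bits. A common way around this, shown below as a sketch, is Schrage's decomposition, which keeps every intermediate value below $2^{31}$ in magnitude. This is a swapped-in textbook technique, not the Fortran implementation used in this work; the names `ggl_next`, `Q` and `R` are illustrative assumptions.

```python
A = 16807
M = 2**31 - 1
Q, R = divmod(M, A)        # M = A*Q + R with Q = 127773, R = 2836


def ggl_next(x):
    """One step of X_{i+1} = 16807 * X_i mod (2^31 - 1) via Schrage's method.

    For 0 < x < M both products below stay under 2^31, so the same steps
    would fit in 32-bit signed integer arithmetic.
    """
    hi, lo = divmod(x, Q)
    t = A * lo - R * hi
    return t if t > 0 else t + M


x = 1
for _ in range(3):
    x = ggl_next(x)
    print(x)
```

In Python the direct expression `(A * x) % M` gives the same values because integers are unbounded; the decomposition matters only where word size is limited.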

• RAND

RAND uses a linear congruential random number algorithm with a period of $2^{32}$ [63] to return successive pseudorandom numbers in the range from $0$ to $2^{31} - 1$. The generator is LCG($69069$, $1$, $2^{32}$) or

$$X_{i+1} = (69069\, X_i + 1) \bmod 2^{32}. \tag{10}$$

The multiplier 69069 has been used in many generators, probably because it was strongly recommended in 1972 by Marsaglia [41], and is part of the SUPER-DUPER generator [3]. Test results on various implementations of the LCG($69069$, $1$, $2^{32}$) algorithm have been reported in [32, 42, 3, 44]. The generator tested here is the implementation by Convex Corp. on the Convex C3840 computer system [63].
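A well-known property of full-period congruential generators with a power-of-two modulus is that the lowest $k$ bits of the state repeat with period $2^k$. The sketch below illustrates this for recursion (10) applied to the raw 32-bit state; the function names and the seed are assumptions, and the code does not reproduce how the Convex routine maps the state to its returned value.

```python
def rand_lcg(seed, count):
    """Return `count` integers from X_{i+1} = (69069 * X_i + 1) mod 2^32."""
    values, x = [], seed
    for _ in range(count):
        x = (69069 * x + 1) % 2**32
        values.append(x)
    return values


def low_bits_period(values, k):
    """Smallest shift at which the lowest k bits of `values` repeat."""
    low = [v & ((1 << k) - 1) for v in values]
    for shift in range(1, len(low) // 2 + 1):
        if all(low[i] == low[i + shift] for i in range(len(low) - shift)):
            return shift
    return None


states = rand_lcg(seed=12345, count=64)
for k in (1, 3, 4):
    print(k, low_bits_period(states, k))
```

This is one reason bit level tests are informative for such generators: the least significant bits are far less random than the most significant ones.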

¹ We also unsuccessfully tried the IBM assembly code implementation of Lewis et al. [36] on an IBM 3090 computer.

• RANF

The RANF algorithm uses two equations for generation of uniform random numbers. It utilizes the multiplicative congruential method with modulus $2^{48}$. The algorithms are MLCG($M_1$, $2^{48}$) and MLCG($M_{64}$, $2^{48}$):

$$X_{i+1} = (M_1 X_i) \bmod 2^{48}, \tag{11}$$

$$X_{i+64} = (M_{64} X_i) \bmod 2^{48}, \tag{12}$$

where $M_1 = 44485709377909$ and $M_{64} = 247908122798849$. The period length of the RANF generator is $2^{46}$ [45]. Spectral test results on the RANF generator have been given in Refs. [3, 17]. On the CRAY-X/MP and CRAY-Y/MP systems, RANF is a standard vectorized library function [64]. The operations $(M_1 X_i)$ and $(M_{64} X_i)$ are done as integer multiplications in such a way as to preserve the lower 48 bits. We tested RANF on a Cray X-MP/432.
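Equation (12) has the form of a 64-step jump of Eq. (11), which suggests that $M_{64}$ should equal $M_1^{64} \bmod 2^{48}$. The sketch below computes that jump multiplier and checks that it reproduces 64 single steps; the function name `ranf_step` and the seed are illustrative assumptions, and the final line merely compares the computed multiplier with the constant quoted above so the reader can verify it.

```python
M1 = 44485709377909
M64 = 247908122798849
MODULUS = 2**48


def ranf_step(x, times=1):
    """Apply X_{i+1} = (M1 * X_i) mod 2^48 `times` times."""
    for _ in range(times):
        x = (M1 * x) % MODULUS
    return x


jump_64 = pow(M1, 64, MODULUS)   # multiplier equivalent to 64 single steps
seed = 1234567                   # arbitrary odd seed, chosen for illustration
stepped = ranf_step(seed, 64)
jumped = (jump_64 * seed) % MODULUS
print(stepped == jumped)         # the jump and 64 single steps must agree
print(jump_64 == M64)            # comparison with the constant quoted in the text
```

A jump multiplier of this kind is what allows a vectorized routine to advance 64 independent lanes in a single multiplication each.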

• G05FAF

G05FAF is a library routine in the NAG software package [68]. It calls G05CAF which is a multiplicative congruential algorithm MLCG($13^{13}$, $2^{59}$) or

$$X_{i+1} = (13^{13} X_i) \bmod 2^{59}. \tag{13}$$

G05FAF can be used to generate a vector of $n$ pseudorandom numbers which are exactly the same as $n$ successive calls to the G05CAF routine. Generated pseudorandom numbers are uniformly distributed over the specified interval $[a,b)$. The period of the basic generator is $2^{57}$ [68]. Its performance has been analyzed by the spectral test [30].

• R250

R250 is an implementation of a generalized feedback shift register generator [37]. The 31-bit integers are generated by a recurrence of the form GFSR($250$, $103$, $\oplus$) or

$$X_i = X_{i-250} \oplus X_{i-(250-103)}. \tag{14}$$

Implementation of the algorithm is straightforward, and $p = 250$ words of memory are needed to store the 250 latest random numbers. A new term of the sequence can be generated by a simple exclusive-or operation. An IBM assembly language implementation of this generator has been presented by Kirkpatrick and Stoll [29] who use a MLCG($16807$, $2^{31} - 1$) to produce the first 250 initializing integers. Due to the popularity of R250, there have been many different approaches for its initialization [50, 18, 7, 20]. The period of the generator is $2^{250} - 1$ [29]. Some test results of the R250 generator have been reported by Kirkpatrick and Stoll [29]. We have implemented R250 in Fortran [24].
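Recurrence (14) maps naturally onto a circular table of 250 words. The following is a minimal sketch under stated assumptions: the table is filled directly from the MLCG($16807$, $2^{31} - 1$) stream, and any further conditioning of the seed table discussed in the initialization literature [50, 18, 7, 20] is omitted. It is not the Fortran implementation used in this work, and the name `r250` for the Python function is illustrative.

```python
from itertools import islice


def r250(seed=1):
    """Yield 31-bit integers from X_i = X_{i-250} XOR X_{i-147}."""
    # Fill the 250-word table from a multiplicative congruential generator.
    state, table = seed, []
    for _ in range(250):
        state = (16807 * state) % (2**31 - 1)
        table.append(state)
    pos = 0                                   # table[pos] holds X_{i-250}
    while True:
        # X_{i-147} = X_{i-250+103} lies 103 slots ahead in the circular table.
        new = table[pos] ^ table[(pos + 103) % 250]
        table[pos] = new                      # overwrite the oldest word with X_i
        pos = (pos + 1) % 250
        yield new


outputs = list(islice(r250(seed=1), 600))
print(outputs[:3])
```

Each new number costs one exclusive-or and two index updates, which is why generators of this type are fast.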

• RAN3

The RAN3 generator is a lagged Fibonacci generator LF($55$, $24$, $-$) or

$$X_i = X_{i-55} - X_{i-24}. \tag{15}$$

The algorithm has also been called a subtractive method. The period length of RAN3 is $2^{55} - 1$ [30], and it requires an initializing sequence of 55 numbers. The generator was originally Knuth’s suggestion [30] for a portable routine but with an add operation instead of a subtraction. This was translated to a real Fortran implementation by Press et al. [51]. We were unable to find any published test results for RAN3.

• RANMAR

RANMAR is a combination of two different generators [27, 43]. The first is a lagged Fibonacci generator

$$X_i = \begin{cases} X_{i-97} - X_{i-33}, & \text{if } X_{i-97} \geq X_{i-33}; \\ X_{i-97} - X_{i-33} + 1, & \text{otherwise.} \end{cases} \tag{16}$$

Only the 24 most significant bits are used for single precision reals. The second part of the generator is a simple arithmetic sequence for the prime modulus $2^{24} - 3 = 16777213$. The sequence is defined as

$$Y_i = \begin{cases} Y_{i-1} - c, & \text{if } Y_{i-1} \geq c; \\ Y_{i-1} - c + d, & \text{otherwise,} \end{cases} \tag{17}$$

where $c = 7654321/16777216$ and $d = 16777213/16777216$.

The final random number $Z_i$ is then produced by combining the obtained $X_i$ and $Y_i$ as

$$Z_i = \begin{cases} X_i - Y_i, & \text{if } X_i \geq Y_i; \\ X_i - Y_i + 1, & \text{otherwise.} \end{cases} \tag{18}$$

The total period of RANMAR is about $2^{144}$ [43]. A scalar version of the algorithm has been tested on bit level with good results [43]. We used the implementation by James [27] which is available in the Computer Physics Communications (CPC) software library, and has been recommended as a universal generator.

• RCARRY

RCARRY [44] is based on the operation known as “subtract-and-borrow”. The algorithm is similar to that of lagged Fibonacci, but it has the occasional addition of an extra bit. The extra bit is added if the Fibonacci sum is greater than one. The basic formula is:

$$X_i = (X_{i-24} \pm X_{i-10} \pm c) \bmod b. \tag{19}$$

The carry bit $c$ is zero if the sum is less than or equal to $b$, and otherwise “$c = 1$ in the least significant bit position” [27]. The choice for $b$ is $2^{24}$.

The period of the generator is about $2^{1407}$ [44] when 24-bit integers are used for the random numbers. We were unable to find any published test results for RCARRY. We used the implementation of James [27], again available in the CPC software library.
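Formula (19) leaves the signs open. The sketch below assumes one common reading, the subtract-with-borrow form $X_i = (X_{i-10} - X_{i-24} - c) \bmod b$, in which $c$ is set to 1 whenever the previous difference was negative. The sign convention, the integer representation and the name `subtract_with_borrow` are all assumptions, and the code is not the CPC implementation tested in this work.

```python
from itertools import islice


def subtract_with_borrow(seed_values, b=2**24, r=24, s=10):
    """Yield X_i = (X_{i-s} - X_{i-r} - c) mod b with a borrow bit c."""
    if len(seed_values) != r:
        raise ValueError("need exactly r seed values")
    history = list(seed_values)       # history[0] is X_{i-r}
    carry = 0
    while True:
        diff = history[r - s] - history[0] - carry
        carry = 1 if diff < 0 else 0  # borrow carried into the next step
        x = diff % b
        history = history[1:] + [x]
        yield x


# Toy parameters (b = 10, r = 3, s = 1) so the borrow can be followed by hand.
toy_swb = list(islice(subtract_with_borrow([1, 2, 3], b=10, r=3, s=1), 5))
print(toy_swb)
```

In the toy run the third value is produced from a negative difference, so the borrow is set and lowers the fourth value by one.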

• PURAN II

PURAN II is a physical random number generator created by Richter [52]. It uses random noise from a semiconductor device. The generated data has been permanently stored on a computer disk, from which it can be transferred by request. In this work, we have tested the PURAN II data on bit level only (cf. Section 4), and also used it to verify the correct operation of our test programs.


## 3 Description of the Tests

“Of course, the quality of a generator can

never be proven by any statistical test.”

P. L’Ecuyer

A fundamental problem in testing finite pseudorandom number sequences stems from

the fact that the definition of randomness for such sequences is not unique [31]. Thus,

one usually has to decide upon some criteria which test at least the most fundamental

properties that such sequences should possess, such as correct values of the moments

of their probability distribution. This has led to the emergence of a large number

of tests which can be divided into three approximate categories: Statistical (or

traditional) tests for testing random numbers in real or integer representation, bit

level tests for binary representations of random numbers, and more phenomenological

visual tests. In this work, we have employed several tests belonging to each of these

categories, as will be discussed below. Also, the spectral test for LCG generators

was included. We should note here that recently Compagner and Hoogland [8] have

suggested a more systematic test program for finite sequences. We shall not employ

it in this work, however.

The traditional utilitarian approach has been to subject pseudorandom number sequences to tests which derive from mathematical statistics [30]. In their simplest form, tests in this category, such as the $\chi^2$ test, reveal possible deviations of the distribution of numbers from a uniform distribution. However, some of the more sophisticated tests should actually probe correlations between successive numbers as well [33].

Another approach is to test the properties of random numbers on the bit level. Of

the traditional tests, some can be performed in this manner also. Marsaglia [42]

has proposed additional tests which explicitly probe the individual bits of random

number sequences represented as binary computer words. Some of these tests have

been further refined [2]. We have included two of these tests here, in particular to

examine possible correlations between bits of successive binary words.

A rather different way of testing spatial correlations between random numbers is possible by using direct visualization. This can most easily be done in two dimensions by plotting pairs of points on a plane, or visualizing the bits of binary numbers. In addition to yielding qualitative information, such tests offer a possibility to develop more physical quantitative tests through interpretation of the visualized configurations as representations of physical systems, such as the Ising model [8, 59]. In this work, however, we have simply used a few different types of visual tests to complement our quantitative tests.

Before discussing each test in detail, we would like to emphasize that although some of the generators we have tested have previously been subjected to similar tests, an extensive comparative testing of a large collection of generators has been lacking up to date. The importance of this becomes obvious when one considers the freedom of choice of various parameters in the tests, as discussed below. Only comparative testing with identical parameters allows a direct comparison between different generators. Another difficulty concerns the implementation of random number generator algorithms and the testing routines [53, 48, 3, 22]. Problems in either may actually lead to significant differences in the results. In fact, as an example we shall explicitly demonstrate for GGL and RAND how slightly different implementations of the same generator can lead to completely different results.

### 3.1 Statistical tests

The statistical tests included in our test bench were the uniformity test, the serial

test, the gap test, the maximum of t test, the collision test, and the run test. In

addition, we carried out the park test [42]. A review of the statistical tests can be

found e.g. in Ref. [30], and a suggestion for implementing them in Ref. [13].

The leading idea in carrying out the statistical tests was to improve the statistical accuracy of these tests by utilizing a one-way Kolmogorov-Smirnov (KS) test. This was achieved by repeating each individual test described below $N$ times, and then submitting the obtained empirical distribution to a KS test (for the park test, however, this was not possible). A similar approach has been suggested earlier by Dudewicz and Ralley [13] and realized by L’Ecuyer [33]. The KS test reveals deviations of an empirical distribution function $F_n(x)$ from the theoretical one $F(x)$. This can be quantified by test variables $K^+$ and $K^-$, which are defined by

$$K^+ = \sqrt{n}\, \sup_x \{F_n(x) - F(x)\}, \qquad K^- = \sqrt{n}\, \sup_x \{F(x) - F_n(x)\}.$$

$K^+$ measures the maximum deviation of $F_n(x)$ from $F(x)$ when $F_n(x) > F(x)$ and $K^-$ measures the respective quantity for $F_n(x) < F(x)$. The tests are as follows:

(i) To test the uniformity of a random number sequence, a standard $\chi^2$ test was used [30]. $n$ random numbers were generated in the half open interval $[0,1)$, then multiplied by $\nu$ and truncated to integers in the interval $[0,\nu)$. The number of occurrences in each of the $\nu$ bins was compared to the theoretical prediction using the $\chi^2$ test.

(ii) Serial correlations were tested [32, 30] by studying the occurrence of $d$-tuples of $n$ random numbers distributed in the interval $[0,1)$. For example, in the case of pairs, we tabulated the number of occurrences of $(x_{2i}, x_{2i+1})$ for all $i \in [0,n)$. Each $d$-tuple occurs with the probability $\nu^{-d}$, where $\nu$ is the number of bins in the interval. The results were then subjected to the $\chi^2$ test.

(iii) The gap test [30] probes the uniformity of the random number sequence of length $n$. Once a random number $x_i$ falls within a given interval $[\alpha,\beta]$, we observe the number of subsequent numbers $x_{i+1}, \ldots, x_{i+j-1} \notin [\alpha,\beta]$. When again $x_{i+j} \in [\alpha,\beta]$, it defines a gap of length $j$. For finite sequences, it is useful to define a maximum gap length $l$. Then we can test the results against the theoretical probability using the $\chi^2$ test.

(iv) The maximum of $t$ test [30] is a simple uniformity test. If we take a random number sequence of length $n$ ($x_i \in [0,1)$, $i = 1, \ldots, n$), divide it into subsequences of length $t$ and pick the maximum value for each subsequence, the maxima should follow the distribution function $F(x) = x^t$.

(v) The collision test [30] can be used to test the uniformity of the sequence when the number of available random numbers ($n$) is much less than the number of bins ($w$). We then study how many times a random number falls in the same bin, i.e. how many collisions occur. The probability for $j$ collisions is:

$$\frac{w(w-1)\cdots(w-n+j+1)}{w^n} \begin{Bmatrix} n \\ n-j \end{Bmatrix}, \tag{20}$$

where the braces denote a Stirling number of the second kind, $w = s^d$, $d$ is the dimension and $s$ can be chosen.

(vi) In the run test [32, 30], we calculate the number of occurrences of increasing or decreasing subsequences of length $1 \leq i < l$ for a given random number sequence $x_1, x_2, \ldots, x_n$. To carry out this test we chose $l = 6$ and followed Knuth [30] in the choice of the relevant test quantity.

(vii) In the park test [42], we choose points randomly in a $d$-dimensional space and allocate a diameter for each point. Within each diameter, “a car is parked”. The aim is to park as many non-overlapping cars as possible, and study the distribution of $k$ cars. Unfortunately, since the theoretical distribution is not known, this test can only be used for qualitative comparative studies.
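The KS variables $K^+$ and $K^-$ defined before the list can be evaluated from a sorted sample. The sketch below applies them to the block maxima of test (iv), whose distribution function is $F(x) = x^t$. It is an illustration under stated assumptions: Python's built-in `random` module stands in for a generator (it is not one of those tested here), the function names are hypothetical, and the $N$-fold repetition scheme described above is not reproduced.

```python
import math
import random


def ks_statistics(sample, cdf):
    """Return (K+, K-) for `sample` against the continuous CDF `cdf`."""
    n = len(sample)
    f = sorted(cdf(x) for x in sample)
    # For the sorted values, F_n jumps from j/n to (j+1)/n at the j-th point.
    k_plus = math.sqrt(n) * max((j + 1) / n - fj for j, fj in enumerate(f))
    k_minus = math.sqrt(n) * max(fj - j / n for j, fj in enumerate(f))
    return k_plus, k_minus


def maximum_of_t(numbers, t):
    """Maxima of consecutive length-t blocks of `numbers`."""
    blocks = len(numbers) // t
    return [max(numbers[i * t:(i + 1) * t]) for i in range(blocks)]


rng = random.Random(1993)                  # stand-in generator with a fixed seed
numbers = [rng.random() for _ in range(5000)]
maxima = maximum_of_t(numbers, t=5)
k_plus, k_minus = ks_statistics(maxima, lambda x: x ** 5)
print(k_plus, k_minus)
```

Values of $K^+$ or $K^-$ that are consistently very large or very small over repeated runs would indicate a departure from the expected distribution; the acceptance thresholds themselves depend on the chosen significance level and are not fixed by this sketch.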
