Validating a Random Number Generator
Download
Report
Transcript Validating a Random Number Generator
Validating a Random Number
Generator
Based on:
A Test of Randomness Based on the Consecutive
Distance Between Random Number Pairs
By:
Matthew J. Duggan, John H. Drew, Lawrence M. Leemis
Presented By:
Sarah Daugherty
MSIM 852
Fall 2007
Introduction
Random numbers are critical to Monte
Carlo simulation, discrete event simulation,
and bootstrapping
There is a need for RNG with good
statistical properties.
One of the most popular methods for
generating random numbers in a computer
program is a Lehmer RNG.
Daugherty MSIM 852 Fall 2007
Lehmer Random Number Generators
Lehmer’s algorithm: an iterative equation
produces a stream of random numbers.
Requires 3 inputs: m, a, and x0.
xi 1 axi mod m
m = modulus, a large fixed prime number
a = multiplier, a fixed positive integer < m
x0 = initial seed, a positive integer < m
Produces integers in the range (1, m-1)
Daugherty MSIM 852 Fall 2007
Problem
Lehmer RNG are not truly random
With carefully chosen m and a, it’s possible to
generate output that is “random enough” from a
statistical point of view.
However, still considered good generators
because their output can be replicated,
they’re portable, efficient, and thoroughly
documented.
Marsaglia (1968) discovered too much
regularity in Lehmer RNG’s.
Daugherty MSIM 852 Fall 2007
Marsaglia’s Discovery
He observed a lattice structure when consecutive
random numbers were plotted as overlapping ordered
pairs.
((x0, x1, x2,…, xn),
(x1, x2,…, xn+1))
Lattice created using
m = 401, a = 23.
Does not appear to be
random at all; BUT a
degree of randomness
MAY be hidden in it.
Daugherty MSIM 852 Fall 2007
Solution
Find the hidden randomness in the order in
which the points are generated.
The observed distribution of the distance
between consecutive RN’s should be close
to the theoretical distance.
Develop a test based on these distances.
Hoping to observe that points generally are not
generated in order along a plane or in a regular
pattern between planes.
Daugherty MSIM 852 Fall 2007
Overlapping vs. Non-overlapping Pairs
Considering distance between consecutive
pairs of random numbers, points can be
overlapping or non-overlapping.
Overlapping: (xi, xi+1), (xi+1, xi+2)
Non-overlapping: (xi, xi+1), (xi+2, xi+3)
Both approaches are valid.
The non-overlapping case is mathematically
easier in that the 4 numbers represented
are independent therefore the 2 points they
represent are also independent.
Daugherty MSIM 852 Fall 2007
Non-overlapping Theoretical Distribution
If we assume X1, X2, X3, X4 are IID U(0,1)
random variables, we can find the
distance between (X1, X2) and (X3, X4) by:
D ( X1 X 3 ) ( X 2 X 4 )
2
2
Daugherty MSIM 852 Fall 2007
Non-overlapping Theoretical Distribution
The cumulative
distribution,
F(x), of D.
Daugherty MSIM 852 Fall 2007
Goodness-of-Fit Test
Now we can compare our theoretical distribution
against the Lehmer generator.
Convert the distances between points into an
^
empirical distribution, F(x), which will allow us to
perform a hypothesis test.
N ( x)
F ( x)
n
^
N(x) = # of values that do
not exceed x
^
H 0 : F ( x) F0 ( x)
^
H1 : F ( x) F0 ( x)
n = # of distances collected
Daugherty MSIM 852 Fall 2007
Classification of Results
Based on results of 3 hypothesis tests (KS,
CVM, and AD tests), each RNG can be
classified as:
Good – the null hypothesis was not rejected in
any test.
Suspect – the null hypothesis was rejected in 1
or 2 of the tests.
Bad – the null hypothesis was rejected in all 3
tests.
Daugherty MSIM 852 Fall 2007
Results
Interesting cases are when a multiplier is
rejected by only 1 or 2 of the 3 tests. See
a = 3 in table.
Daugherty MSIM 852 Fall 2007
Random
number
pairs
Distances
connecting
pairs
F(x)
(solid) vs.
^
Good
Suspect
Bad
F(x)
(dotted)
Daugherty MSIM 852 Fall 2007
Summary
A test of randomness was developed for
Lehmer RNG’s based on distance between
consecutive pairs of random numbers.
Since some multipliers are rejected by
only one or two of the 3 hypothesis tests,
the distance between parallel hyperplanes
should not be used as the only basis for a
test of randomness. The order in which
pairs are generated is a second factor to
consider.
Daugherty MSIM 852 Fall 2007
Critique
Potential – limited. Many other tests exist for
validating RNG’s.
Impact – minimal. Frequently used RNG’s use a
modulus much larger than the m=401 used here.
Overall – paper is well written; in it’s current state,
this test is a justified addition to collection of tests
for RNG’s.
Future – use larger modulus; improve theoretical
distribution by improving numerical calculations of
integral for cdf; test other non-Lehmer generators
such as additive linear, composite, or quadratic.
Daugherty MSIM 852 Fall 2007