Generating Random Numbers

Download Report

Transcript Generating Random Numbers

Techniques in Signal Processing
CSC 508
Generating Random Numbers
with a Computer
2.3 Generating Random Numbers
Before we begin this section, it is important to realize that a digital (VonNeumann
architecture) computer cannot generate sequences of random numbers. Only after you
accept and understand this will we proceed to generate sequences of random numbers
with a computer.
At this point some explanation is in order: A conventional digital computer is perfectly
deterministic and, therefore, is incapable of exhibiting any sort of random behavior
(unless it is broken). Computation methods exist t generate sequences of numbers that
appear to be random and that pass certain standard tests for randomness. These are called
pseudo-random sequences. You must appreciate this difference to avoid embarassing and
sometimes costly errors and simulation and analysis.
Any simulation or other computer-based analysis which relies on the randomness of a
pseudo-random sequence, is susceptable to a special kind of error. Improperly used, a
pseudo-random sequence can produce structure or artifacts in your data that are a product
of the pseudo-random number generator itself rather than a real feature of the system
being modeled. The most important thing to remember is that every pseudo-random
number sequence generated on a computer will eventually begin to repeat itself. We must
never draw from such a sequence past this point.
Method of Congruence
One of the most popular methods of generating pseudo-random sequences is called the
method of congruence. A sequence of n pseudo-random number Xn={x0,x1,. . ., xn) may be
generated by the recurrence relation,
xk 1  (ax  b) mod T
where we establish the following relationship between the values a,b and T:
(1) T is the maximum possible value in Xn, (a function of our computer).
(2) b is relatively prime to T (i.e. they have no common factors)
(3) a = 1 mod p where p is a prime factor of T.
(4) a = 1 mod 4 if 4 is a factor of T.
For example, if we let T=2q then b must be odd and we must ensure that a=1 mod 4. If
we let T=10q then b cannot be divisible by 2 or 5 (the prime factors of T) and we must
choose a value of a so that a = 1 mod 20.
To answer the question building in your mind. No. We cannot rely on the built-in
random number generators provided with most lanuage compilers. These ready-to-go
random number generators are notoriously flawed and inadequate for serious computer
research.
A simple Ada function using the method of congruence is shown below. This function
generates a uniformly distributed pseudo-random sequence. An inital seed value for
XVAL (say 141) is needed to start the sequence. Each time RAN(MAXVAL) is called
an integer between 0 and MAXVAL is returned.
If our computer can represent integers up to 32767 in value then we must make
sure that A*XVAL+B does not exceed this value. As a test we note that for
XVAL0=141, T=210, A=24+1=17 and B=1 The maximum value of XVAL is T-1 or
1023 and that A*XVAL+B never exceeds 17392 which is < 32767.
In general the value of MAXVAL must not be set larger than T/2, otherwise some values
in the range (0..MAXVAL) cannot be generated by RAN. This is due to the truncation in
the conversion from floating point back to integer type in the RAN function.
Now that we have established the maximum size for MAXVAL, we should consider how
many values can be drawn from RAN before we are in danger of obtaining a repeated
sequence. A good rule of thumb is to never exceed the square root of the maximum
number of values which can be represented by the computer system we are using. In our
case we are limited to 327671/2=181. Those of you with programming experience may be
interested in testing the random number generator built into your favorite programming
language for the length of repeating sequences. (Or maybe not.)
We can use a uniform random number generator to generate random numbers over any
range of values. For example, we can use RAN to produce a source of random floatingpoint numbers U[-1.0.1.0] in the range [-1.0,+1.0] with the following scaling transormation,
 RAN ( MAXVAL) 
U [ 1.0,1.0]  2
 1
MAXVAL


where MAXVAL is the largest integer allowed as an upper limit for the function RAN.
In general, a uniform random distribution over any range can be obtained through a
process of scaling and shifting the values produced by RAN.
Consider the graph on the left in which
a random deviate x in the range
[0,MaxVal] is used in the transformation function R(x) to compute a new
uniform deviate in the range [-1.0,+1.0].
This is called the inverse method of
generating random deviates and can be
used to generate other types of random
distributions such as the normal
distribution from a uniform distribution.
To use the inverse method to produce
normally distributed random numbers
we need to find a function N(x) to
replace R(x). This function must map
uniformly distributed values U(x) into
normally distributed values N(x).
N(x) is the functional inverse of the
integral of the Gaussian PDF. This
integral is called the cumlative
distribution function (CDF) and has
no closed-form solution. We can
approximate N(x) using a method
after Abramowitz and Stegun, as
shown in this chart.
This approximation produces values
of N(x) which are good to within
+/_ 4.5x10-4 when modeling a
normal distribution with zero mean
and unit sigma. You can then convert
these normal deviates to any mean
and standard deviation by
multiplying each times the desired s
and adding the desired m.
N'(x) = sN(x) + m
The Direct Method of Generating Normal Deviates
Another method for generating normal deviates is called the direct method. Given two
draws from a uniform random distribution U1 and U2 with range [0.0,1.0], we generate
two independent normal random deviates N1 and N2 by,
N1   2 ln( U1 ) cos 2 U 2
N 2   2 ln( U1 ) sin 2 U 2
Remember that the implied mean of the normal distribution is 0.0 and the standard
deviation is 1.0 for the normal distribution modeled using the direct method. What
values of U1 and U2 produce the largest positive and largest negative values for N?
What values of U1 and U2 result in N=0.0?
This is the preferred method when the cost of computing the logarthims and
trigonometric functions is not prohibitive. With the nearly universal availability of
CPUs with built-in math coprocessors, the direct method of computing normal
random deviates is becoming the standard.