orthogonal arrays application to pseudorandom numbers generation

Download Report

Transcript orthogonal arrays application to pseudorandom numbers generation

ORTHOGONAL ARRAYS APPLICATION TO
PSEUDORANDOM NUMBERS GENERATION
AND OPTIMIZATION PROBLEMS
A.G.Chefranov †‡, T.A.Mazurova ‡, I.D.Sidorov ‡, T.S.Letia
†Eastern Mediterranean University, Gazimagusa, North Cyprus
‡Taganrog State University of Radio-Engineering, Taganrog,
Russia
Technical University of Cluj-Napoca, Cluj-Napoca, Romania
Orthogonal Arrays
 Orthogonal arrays (OA) are widely used in many areas.
 In OA of strength t with L levels having Lt rows and f columns with




elements from {0,..,L-1} each combination of t columns contains without
repetition all Lt combinations of numbers {0,..,L-1}.
OA may be used for generation of pseudorandom sequences using
enumeration of combinations of columns and writing their rows in line.
This results in very long not repeated sequences of numbers even in
the case of rather small values of L and t. Thus, for L=7 and t=4 period
length is equal to 107537. This approach requires generation of full OA;
in the mentioned above case it contains 74 rows each of 7 numbers
from {0,..,6}.
For effective enumeration of permutations for this algorithm special
partially factorial numbers representation is introduced.
Also, directly formula for generating elements of OA may be used for
pseudorandom numbers generation.
Estimation of distances between OA rows is made showing uniform
distribution of these vectors. This may be the ground for using OA in
optimization
Formula for OA Generation
 Elements of OA with prime L, strength t, L
columns:
t 1
(1)
N   N  j Lj ,
j 0
t 1
OAL ,t ( N , k )  ( N  j k j ) mod L,
j 0
N  0, Lt  1,
k  0, L  1
OA-PRNGA1
1.
2.
3.
4.
Choose L,t and some permutation of OA rows and permutation
of some t columns as a seed of algorithm
Run through OA rows inside selected columns outputting OA
elements as pseudorandom numbers. To improve statistical
characteristics of generated sequence some transformation
may be applied to it (for example, 0-s from this sequence are
deleted, values are taken modulus 2, resulting bit sequence is
transformed by grouping 2 neighboring bits, 00-s and 11-s are
deleted, 01-s and 10-s are converted into 0-s and 1-s
respectively, and then XORing of neighboring bytes).
Take next combination of this or next t columns, repeat step 2
Take next permutation of OA rows, repeat steps 2,3
Partially Factorial-Based Numbers
Representation
n is a length of permutation, xi – numbers,
which can be uniquely obtained from
permutation, and which uniquely define
respective permutation
n
n 1
(2)
x
 (x
i 0
i
0  xi  i,

 j ),
j i  2
n
 j 1
j  n 1
Enumeration of Permutations and
Calculation of xi
 Let we have permutation a1a2…an-1. Substituting n serially on
all possible (numerated as 0,.., n-1) positions in this permutation
we get n new permutations: na1a2…an-1, a1na2…an-1, …,
a1a2…nan-1, a1a2…an-1n.
 To enumerate all permutations of 3 elements 0,1,2: 0 (all
permutations of 1 element are ready), 10, 01 (all permutations of
2 elements are got), 210, 120, 102 (got by substitution of 2 into
1st permutation of size 2), 201, 021, 012 (all permutations are
got).
 Value xi may be defined as place for insertion chosen in this
algorithm for i-th element while getting particular permutation,
for example, if we take permutation 012 then x0 = 0, x1 =1, x2
=2, and according to (2) x=3*x1+1*x2=5, for permutation 201 we
get x0 = 0, x1 =1, x2 =0, x= 3*x1+1*x2=3, value of x0 will always
be equal to 0.
OA-PRNGA2
Choose L,t, some OA row number N and some
combination of 2L numbers from {0,..,L-1} mix[2,L]
as a seed.
2. Calculate next L OA row numbers and mix them
with array mix, outputting L numbers taken as L
outputs of generator. This mixture may be taken as
1.
out[i ]  (OArow[i ] * mix[1, i ]  mix[2, i ])
mod L,
i  0, L  1
where out[i] stands for i-th output pseudorandom
number obtained from current OA row, i-th element
of which is denoted as OArow[i]
3. Take next row cyclically by Lt, repeat step 2.
Measures Used for Comparison
 number of 1-s in generated bit sequence (length>=20000 bit),
statistics is
k
2
s
Q
1
F1  ( )  n
n s 1 p s
where k=2, must have asymptotic distribution as χ2 with k-1
levels of freedom;
 number of m-bit vectors, statistics is given by:
m
2 m 1
2
2
F2 
(  Ni )  k
k i 0
where Ni is a frequency of appearance of i-th vector in their
sequence of k vectors; must have asymptotic distribution as χ2
with 2m-1levels of freedom;
Measures Used for Comparison-1
 number of sequences of 1-s and 0-s,
statistics is
( Bi  Ei )
(Gi  Ei )
F3  

Ei
Ei
i 1
i 1
k
2
k
2
where Bi is number of series of 0-s of length i,
Gi is number of series of 1-s of length i,
Ei=(n-i+3)/(2i+2) is expected number of
series, k is maximal integer i for which Ei > 5;
must have asymptotic distribution as χ2 with
2k-2 levels of freedom;
Measures Used for Comparison-2
 coefficient of sequential correlation showing
dependency of the next symbol on the
previous one, calculated as
F4 
n2
n 1
i 0
i 0
n( U iU i 1  U n 1U 0 )  ( U i ) 2
n 1
n 1
n U  ( U i ) 2
i 0
2
i
i 0
Value of this statistics must be near 0 in 95%
of occasions, n>2
Comparison Results
 Sequences of 2500 and 250 000 bytes were
used
 OA-PRNGA1, OA-PRNGA2, RC4, Standard
Delphi Generator were compared
 Approximately same characteristics
 Timing for OA-PRNGA1 is slightly worse
 P-4 1.7Ghz 256 M was used
UNIFORMITY OF OA ROWS DISTRIBUTION
AND OA APPLICATION TO OPTIMIZATION
PROBLEMS
Theorem. Distance between any two rows of
OA with L levels, L is a prime, L+1 columns,
strength t=2, where i  i1 L  i0 , k  k 1 L  k 0
is
= d L (i1 , k1 )  CL , i1  k1
oa  oa
i
k L
L 1
C 
4
2
L
{
Ld L (i0 , k0 ), i1  k1
d L (a, b)  min{ abs(a  b),
L  abs(a  b)},
a, b  SL
Investigated for Optimization
Functions
n
f1 ( x)   x , f 2 ( x)  100( x  x 2 )  (1  x1 )
i 1
2
i
2
1
2
2
 Rosenbrock’s saddle and its generalization:
n 1
f 21 ( x)  (1  x1 )  100( xi2  xi 1 ) 2
2
i 1
n
f 3 ( x)   || xi ||
25
1
1


f 4 ( x)
500
j 1
i 1
1
2
j   ( xi  aij ) 6
i 1
f 5 ( x)  ( x12  x 2  11) 2  ( x1  x 22  7) 2
Himmelblau’s function and its generalizations
OA-Based Optimization Algorithms
 Essence of these algorithms is in mapping of initial ranges for
each variable onto set of levels,
 calculation of function in the points shown by respective
columns of OA,
 selection of the best point,
 and remapping of twice shrunk range on the set of levels.
There may be used different approaches to select next point:
 by averaging of function’s values along each of dimensions
separately, and choosing the optimal one as combination of
separate optimums (AvgOA),
 or find best function value and choose corresponding set of
variables as next optimum (DirOA);
 for remapping, next optimal point components may retain their
relative positions as in the previous act of mapping (AvgOA,
DirOA), or they become the center of new mapped range
(CenOA).
Results of Algorithms Investigation
 Algorithms DirOA, AvgOA, CenOA together with




Monte-Carlo method (MCM) were considered
Number of functions’ arguments 2-19
Number of OA levels up to 19
Number of runs for MCM up to 1 000 000
For little number of variables OA-based optimization
algorithms have nearly the same productivity as
MCM, but for larger number of variables (5-19) OAbased algorithms give significantly better results in
quality and in timings as well.
Conclusion
 Applicability of two proposed OA-based algorithms for
pseudorandom numbers generating with extremely large
periods was shown by comparison with well known algorithms.
 Special partially factorial numbers representation was
introduced for effective enumeration of permutations for
generated pseudorandom sequences. Such sequences may be
used as stream ciphers for encryption in networks and for
random number generation while modeling.
 Uniformity of scattering of points represented by OA rows is
estimated. OA-based optimization algorithms iteratively reducing
scope of search with the help of OA were considered. Relatively
low computational complexity and sufficient accuracy especially
in the case of large number of variables allows using them for
optimization in control processes.