
Simple Constructions of Almost k-wise Independent Random Variables
Gil Tamir, 08/12/2008
Based on the paper by Noga Alon, Johan Hastad, Oded Goldreich and Rene Peralta
Introduction: Randomized algorithms

General Idea
• Algorithms that have access to a random number generator to generate random bits.

Importance
• There exist many problems that can be solved efficiently using randomized algorithms.
• Examples include testing numbers for primality or finding cycles in a graph.

Deterministic Simulation
• A typical way is to iterate over all possible generations of the random bits.
• Note that the number of options is exponential in the number of random bits!

Interest
• Certain problems can be efficiently solved using random variables which aren't completely random, yet have a certain random property.
• Let's focus on a specific type of random property.
Definitions

Definition: k-wise independence
A sample space S ⊆ {0,1}^n, where if X = x_1...x_n is chosen uniformly from S,
then for any k positions i_1 < i_2 < ... < i_k and any α = α_1...α_k ∈ {0,1}^k:
Pr[ x_{i_1}...x_{i_k} = α_1...α_k ] = 2^{-k}
Definitions

Definition: (ε,k)-wise independence
A sample space S ⊆ {0,1}^n, where if X = x_1...x_n is chosen uniformly from S,
then for any k positions i_1 < i_2 < ... < i_k and any α = α_1...α_k ∈ {0,1}^k:
| Pr[ x_{i_1}...x_{i_k} = α_1...α_k ] − 2^{-k} | ≤ ε
Definitions

Definition: (ε,k)-biased linearly, for k > 0
A sample space S ⊆ {0,1}^n, where if X = x_1...x_n is chosen uniformly from S,
then for any k' with 0 < k' ≤ k and positions i_1 < i_2 < ... < i_{k'}:
| Pr[ x_{i_1} ⊕ ... ⊕ x_{i_{k'}} = 0 ] − Pr[ x_{i_1} ⊕ ... ⊕ x_{i_{k'}} = 1 ] | ≤ ε
• If S is (ε,n)-biased linearly, we denote it simply ε-biased linearly.
Definitions

Let's examine the relations between the probability spaces over n bits.

[Diagram: containment of sample-space classes over n bits. Within all sample spaces, the (ε,k)-wise independent spaces contain the (ε,k)-biased linearly spaces (same ε, k), which in turn contain the k-wise independent spaces, which contain the n-wise independent spaces.]
Definitions

Short explanation
• Let S be a k-wise independent sample space over n bits,
  X = x_1...x_n a uniformly chosen vector from S,
  and i_1 < i_2 < ... < i_{k'} a set of k' > 0 positions, k' ≤ k.
• Now let J = { α_1...α_{k'} ∈ {0,1}^{k'} : α_1 ⊕ ... ⊕ α_{k'} = 0 }.
• From symmetry we get |J| = 2^{k'}/2, thus, since S is k-wise independent,
  Pr[ x_{i_1}...x_{i_{k'}} ∈ J ] = 1/2.
• Finally we get Pr[ x_{i_1} ⊕ ... ⊕ x_{i_{k'}} = 0 ] = Pr[ x_{i_1}...x_{i_{k'}} ∈ J ] = 1/2,
  and S is ε-biased linearly, for any ε.
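The argument above can be checked exhaustively on a tiny example. The sketch below is an illustration, not part of the talk: it uses the even-weight space over 3 bits, which is 2-wise independent, and verifies that every XOR over at most 2 positions is perfectly balanced, while the XOR of all 3 positions (beyond k) is maximally biased.

```python
from itertools import combinations

# The even-weight space over 3 bits: a 2-wise independent sample space of size 4.
S = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

def xor_bias(space, positions):
    """|Pr[XOR of the chosen bits = 0] - Pr[XOR = 1]| under the uniform
    distribution on the sample space."""
    signed = sum(1 if sum(x[i] for i in positions) % 2 == 0 else -1
                 for x in space)
    return abs(signed) / len(space)

# Every XOR over at most k = 2 positions is unbiased...
small = [xor_bias(S, idx) for r in (1, 2) for idx in combinations(range(3), r)]
# ...while the parity of all 3 bits is constant, hence maximally biased.
full = xor_bias(S, (0, 1, 2))
```

This also shows why the parameter k matters: the implication "k-wise independent, hence ε-biased" only covers XORs over at most k positions.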
Interest

How small can a sample space S ⊆ {0,1}^n which satisfies these definitions be?

n-wise independence
• S must contain every vector X ∈ {0,1}^n, so |S| = 2^n.

k-wise independence
• Alon, Babai and Itai have shown a construction of size roughly n^{k/2}.
• It has been proven to be within a constant factor of the theoretical lower bound.

(ε,k)-wise independence
• We will show a construction of size roughly (k·log(n)/ε)^2.
Motivation

(ε,k)-wise vs. k-wise independent sample spaces
• In many cases, if an algorithm behaves "well" on points chosen from a k-wise independent sample space, then it will behave "essentially as well" on points chosen from an (ε,k)-wise independent sample space.
• In many cases, iterating over all n^{k/2} points in a k-wise sample space is simply not efficient, yet iterating over all (k·log(n)/ε)^2 points in an (ε,k)-wise sample space is.
Goal

A small (ε,k)-wise independent probability space over n bits
• Given (n, k, ε), where
  - n = 2^t − 1        (if n is not of this form, use a bigger n', n' < 2n, that is)
  - k = 2d + 1, k ≤ n/log(n)    (if k is even, use k' := k + 1;
                                 if k > n/log(n), we'll handle it later)
  - ε > 0
• We create an (ε,k)-wise independent S ⊆ {0,1}^n
  so that |S| ≤ (k·log(n)/ε)^2.
• This is the same as constructing such S using 2·log(k·log(n)/ε) bits.
Previous results

A small (ε,k)-wise independent probability space over n bits
• Naor and Naor:
  a construction of an (ε,k)-wise independent S ⊆ {0,1}^n
  so that |S| ≤ k·log(n)/ε^4.
• This is the same as constructing such S using log(k·log(n)/ε^4) bits.
Comparison

Our results vs. previous results (bits used):
2·log(k·log(n)/ε)  vs.  log(k·log(n)/ε^4),
i.e. sizes (k·log(n)/ε)^2 vs. k·log(n)/ε^4.
• Asymptotically, the numbers of bits used are the same.
• Our size bound is better as long as ε^2 ≤ 1/(k·log(n)).
• In any realistic application, ε ≤ 2^{-k},
  and we get sizes 2^{2k}·(k·log(n))^2 vs. 2^{4k}·k·log(n).
• In cases where, in addition, 2^k is much larger than log(n),
  we get sizes roughly 2^{2k} vs. 2^{4k}.
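As a quick numeric check of these size bounds, the snippet below evaluates both expressions in the "realistic" regime ε ≤ 2^{-k} and in a large-ε regime, showing the crossover at ε^2 ≈ 1/(k·log(n)). The concrete parameter values (n = 2^10 − 1, k = 5) are illustrative assumptions, not from the talk.

```python
import math

def ours(n, k, eps):
    """This talk's size bound: (k*log2(n)/eps)^2."""
    return (k * math.log2(n) / eps) ** 2

def naor_naor(n, k, eps):
    """The Naor-Naor size bound quoted above: k*log2(n)/eps^4."""
    return k * math.log2(n) / eps ** 4

n, k = 2 ** 10 - 1, 5

# Realistic regime: eps <= 2^-k, so eps^2 <= 1/(k*log(n)) and our bound is smaller.
eps = 2.0 ** -k
assert ours(n, k, eps) < naor_naor(n, k, eps)

# Large-eps regime: eps^2 > 1/(k*log(n)), and the Naor-Naor bound is smaller.
assert naor_naor(n, k, 0.5) < ours(n, k, 0.5)
```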
Method

n = 2^t − 1, k = 2d + 1. Build an (ε,k)-wise independent S_n^{2log((1+dt)/ε)} ⊆ {0,1}^n:
(1) Use 2·log((1+dt)/ε) bits to get an ε-biased linearly T_{1+dt}^{2log((1+dt)/ε)} ⊆ {0,1}^{1+dt}.
(2) Consider H ∈ {0,1}^{n×(1+dt)} for which any k rows are linearly independent.
(3) S_n^{2log((1+dt)/ε)} is the sample space from which the uniformly selected x ∈ S is defined:
    • Select uniformly z ∈ T.
    • x = H·z.
Method
0,1
2log(
2 log(
1 dt

1 dt

0,1
)
n
n-wise
independent
)-wise independent
Im( H )1n dt
1
k-wise
independent
2
0,1
1 dt
(1+dt)-wise
independent
2log(
T1 dt
1 dt

ε-biased
ε-biased
linearly
linearly
2 log(
Sn
)
1 dt

)
(ε,k)-wise
independent
3
14
The construction (1)

This works for all m, n such that n ≤ 2^m.
Denote by S_n^{2m} ⊆ {0,1}^n the sample space generated by 2m bits,
from which a uniformly selected vector R = r_0...r_{n−1} is defined:
• Select uniformly two vectors X, Y ∈ GF(2^m).
• r_i = ⟨X^i, Y⟩ (the inner product mod 2 of the bit representations of X^i and Y).

GF(2^m): this notation denotes the standard field of size 2^m.
Since all fields of this size are isomorphic to each other,
we can choose the following field:
• Members: all polynomials over Z_2 with degree < m.
• Operations: standard +, × polynomial operations,
  modulo a specific irreducible polynomial of degree m.
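A minimal sketch of this "powering" construction, under assumed toy parameters: m = 3, so GF(2^3) with the irreducible polynomial x^3 + x + 1, and n = 7. Field elements are stored as m-bit integers holding the polynomial coefficients.

```python
M = 3            # field degree: GF(2^M)
IRRED = 0b1011   # x^3 + x + 1, irreducible over GF(2)
N = 7            # output length; must satisfy N <= 2^M

def gf_mul(a, b):
    """Multiply two elements of GF(2^M): polynomial product modulo IRRED."""
    res = 0
    for _ in range(M):
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:   # degree reached M: reduce
            a ^= IRRED
    return res

def parity(v):
    """Inner product mod 2 is the parity of the AND of the bit vectors."""
    return bin(v).count("1") % 2

def sample_point(x, y):
    """One vector R = r_0 ... r_{N-1}, with r_i = <x^i, y>."""
    r, xi = [], 1          # xi tracks x^i, starting at x^0 = 1
    for _ in range(N):
        r.append(parity(xi & y))
        xi = gf_mul(xi, x)
    return r

# The whole sample space S_N^{2M}: one point per seed (x, y), 2^(2M) points.
space = [sample_point(x, y) for x in range(1 << M) for y in range(1 << M)]
```

By the proposition that follows, every non-empty XOR of positions of a uniformly drawn point should have bias at most N/2^M, which can be confirmed exhaustively for these tiny parameters.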
The construction (1)

Proposition
S_n^{2m} is (n/2^m)-biased linearly.
Specifically: T_{1+dt}^{2log((1+dt)/ε)} is ε-biased linearly
(taking n := 1+dt and m := log((1+dt)/ε) gives bias (1+dt)/2^m = ε).

Proof
Let R = r_0...r_{n−1} be a uniformly chosen vector from S,
and let i_1 < i_2 < ... < i_k be a non-empty series of indexes.
We need to prove:
| Pr[ r_{i_1} ⊕ ... ⊕ r_{i_k} = 0 ] − Pr[ r_{i_1} ⊕ ... ⊕ r_{i_k} = 1 ] | ≤ n/2^m
The construction (1)

Proof continued (1)
• First, let X, Y ∈ GF(2^m) be the two creators of R. Thus:
  r_{i_1} ⊕ ... ⊕ r_{i_k} = ⟨X^{i_1}, Y⟩ ⊕ ... ⊕ ⟨X^{i_k}, Y⟩ = ⟨ X^{i_1} + ... + X^{i_k}, Y ⟩
• Let's define p(λ) = λ^{i_1} + ... + λ^{i_k}, a polynomial over GF(2^m), and we get:
  r_{i_1} ⊕ ... ⊕ r_{i_k} = ⟨ p(X), Y ⟩
The construction (1)

Proof continued (2)
• Using the law of total probability we get:
  Pr[ ⟨p(X),Y⟩ = 0 ] = Pr[ ⟨p(X),Y⟩ = 0 | p(X) = 0 ] · Pr[ p(X) = 0 ]
                     + Pr[ ⟨p(X),Y⟩ = 0 | p(X) ≠ 0 ] · Pr[ p(X) ≠ 0 ]
• Pr[ ⟨p(X),Y⟩ = 0 | p(X) = 0 ] = 1      (always true)
• Pr[ ⟨p(X),Y⟩ = 0 | p(X) ≠ 0 ] = 1/2    (Y is chosen uniformly)
• Pr[ p(X) = 0 ] ≤ (n−1)/2^m             (0 ≤ deg(p) ≤ n−1 and X is chosen uniformly;
                                          n ≤ 2^m assures that p cannot be the zero polynomial)
The construction (1)

Proof continued (3)
This leads us to:
• Pr[ ⟨p(X),Y⟩ = 0 ] = (1/2)·Pr[ p(X) ≠ 0 ] + 1·Pr[ p(X) = 0 ] = 1/2 + (1/2)·Pr[ p(X) = 0 ]
• Pr[ ⟨p(X),Y⟩ = 1 ] = (1/2)·Pr[ p(X) ≠ 0 ] = 1/2 − (1/2)·Pr[ p(X) = 0 ]
And finally:
• Pr[ ⟨p(X),Y⟩ = 0 ] − Pr[ ⟨p(X),Y⟩ = 1 ] = Pr[ p(X) = 0 ] ≤ (n−1)/2^m ≤ n/2^m
The construction (2)

The linear transformation H
• Let x_1, ..., x_n (n = 2^t − 1) be the non-zero elements of GF(2^t), written as row vectors of t bits.
• Define
  H := [ 1  x_1  x_1^3  ...  x_1^{2d−1} ]
       [ 1  x_2  x_2^3  ...  x_2^{2d−1} ]
       [ ...                            ]
       [ 1  x_n  x_n^3  ...  x_n^{2d−1} ]
• Notice that H has n rows and 1 + dt columns.
• It is known that any k (= 2d + 1) rows of H are linearly independent over GF(2).
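A sketch of H under assumed toy parameters: t = 4 (so n = 15, with GF(2^4) represented via the irreducible polynomial x^4 + x + 1) and d = 2 (so k = 5). Each row is (1, x_i, x_i^3), with every field element flattened to t bits, giving 1 + 2t = 9 columns.

```python
T = 4            # GF(2^T); n = 2^T - 1 = 15 rows
IRRED = 0b10011  # x^4 + x + 1, irreducible over GF(2)
D = 2            # k = 2*D + 1 = 5

def gf_mul(a, b):
    """Multiply two elements of GF(2^T) modulo IRRED."""
    res = 0
    for _ in range(T):
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if (a >> T) & 1:
            a ^= IRRED
    return res

def gf_pow(a, e):
    """Raise a field element to the power e by repeated multiplication."""
    res = 1
    for _ in range(e):
        res = gf_mul(res, a)
    return res

def bits(a):
    """Flatten a field element to a list of T bits."""
    return [(a >> j) & 1 for j in range(T)]

# The non-zero elements of GF(2^T), generated as powers of the element "x".
elements, g = [], 1
for _ in range((1 << T) - 1):
    elements.append(g)
    g = gf_mul(g, 0b0010)

# Row i of H: (1, x_i, x_i^3, ..., x_i^(2D-1)), each power written as T bits.
H = [[1] + [b for j in range(1, D + 1) for b in bits(gf_pow(x, 2 * j - 1))]
     for x in elements]
```

For these parameters, XOR-ing every non-empty set of at most k = 5 rows and checking that the result is never the zero vector confirms the linear-independence claim exhaustively.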
The construction (3)

Proposition
S_n^{2log((1+dt)/ε)} := Im( H( T_{1+dt}^{2log((1+dt)/ε)} ) ) is (ε,k)-biased linearly.

Proof
Let R = r_0...r_{n−1} be a uniformly chosen vector,
and let i_1 < i_2 < ... < i_{k'} be a non-empty series of indexes, k' ≤ k.
We need to prove:
| Pr[ r_{i_1} ⊕ ... ⊕ r_{i_{k'}} = 0 ] − Pr[ r_{i_1} ⊕ ... ⊕ r_{i_{k'}} = 1 ] | ≤ ε
The construction (3)

Proof continued
• First, let V = v_1...v_{1+dt} ∈ T_{1+dt}^{2log((1+dt)/ε)} be the source of R (i.e. R = H·V). Thus:
  r_{i_1} ⊕ ... ⊕ r_{i_{k'}} = ⊕_{j=1..k'} ( ⊕_{l=1..1+dt} h_{i_j,l}·v_l )
                             = ⊕_{l=1..1+dt} v_l · ( ⊕_{j=1..k'} h_{i_j,l} )
• Now since the sum of rows i_1, ..., i_{k'} of H is not equal to 0,
  the above sum is equal to v_{l_1} ⊕ ... ⊕ v_{l_{k''}} for some non-empty series of
  indexes l_1 < l_2 < ... < l_{k''}.
• Now since T_{1+dt}^{2log((1+dt)/ε)} is ε-biased linearly:
  | Pr[ v_{l_1} ⊕ ... ⊕ v_{l_{k''}} = 0 ] − Pr[ v_{l_1} ⊕ ... ⊕ v_{l_{k''}} = 1 ] | ≤ ε,
  and the proposition follows.
The construction (3)

Almost final step
• We've shown S_n^{2log((1+dt)/ε)} to be (ε,k)-biased linearly.
• Recall the lemma stating:
  For any sample space S, if S is (ε,k)-biased linearly,
  then it is also (ε,k)-wise independent.
• Thus we conclude that S_n^{2log((1+dt)/ε)} is (ε,k)-wise independent.
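The three steps can be composed end-to-end. The sketch below uses assumed toy parameters, not values from the talk: t = 3 (n = 7) and d = 1 (k = 3), so H has 1 + dt = 4 columns, and T is the powering space over GF(2^4), whose bias is at most 4/2^4 = 0.25 = ε.

```python
def gf_mul(a, b, m, irred):
    """Multiply in GF(2^m) with reducing polynomial irred."""
    res = 0
    for _ in range(m):
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= irred
    return res

def parity(v):
    return bin(v).count("1") % 2

# Step (1): T = S_4^{2*4}, the powering space on 1 + dt = 4 bits (eps = 4/2^4).
def powering_space(n_bits, m, irred):
    space = []
    for x in range(1 << m):
        for y in range(1 << m):
            xi, r = 1, []
            for _ in range(n_bits):
                r.append(parity(xi & y))
                xi = gf_mul(xi, x, m, irred)
            space.append(r)
    return space

T = powering_space(4, 4, 0b10011)   # x^4 + x + 1, irreducible over GF(2)

# Step (2): H over GF(2^3); with d = 1, row i is (1, x_i) as 1 + 3 = 4 bits.
t, irred3 = 3, 0b1011               # x^3 + x + 1, irreducible over GF(2)
rows, g = [], 1
for _ in range((1 << t) - 1):
    rows.append([1] + [(g >> j) & 1 for j in range(t)])
    g = gf_mul(g, 0b010, t, irred3)

# Step (3): S = { H z : z in T }, an (eps,k)-biased linearly space over n = 7
# bits, hence (eps,k)-wise independent by the lemma.
S = [[sum(h * v for h, v in zip(row, z)) % 2 for row in rows] for z in T]
```

The resulting space has 2^8 = 256 points (one per seed of T), and every XOR over at most k = 3 of the 7 positions has bias at most ε = 0.25, which can be verified exhaustively.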
The construction (3)

The step after the almost final step
• If k > n/log(n), we can simply use the construction from step (1)
  with n := n, m := log(n/ε)   (n ≤ 2^m as required)
  and get an ε-biased linearly sample space of size (n/ε)^2.
• Using the same lemma as before, this sample space is (ε,k)-wise independent.
• Since k > n/log(n), and thus n < k·log(n), this satisfies our goal of (k·log(n)/ε)^2.
The construction (3)

Interesting Bonus!
• Since {0,1}^{1+dt} is (1+dt)-wise independent, for any ε > 0 it is ε-biased linearly.
• Thus, following the same proof, for any ε > 0, Im(H) is (ε,k)-biased linearly.
• Again, using the same lemma as before,
  we conclude that for any ε > 0, Im(H) is (ε,k)-wise independent.
• This proves that Im(H) ⊆ {0,1}^n is k-wise independent.
The construction (3)

Concluding remarks
In our construction we relied on an existing representation of GF(2^m).
This implies having an irreducible polynomial of degree m over GF(2).
Here are two ways to obtain one:
• In many applications, we're allowed a preprocessing stage of comparable size
  to that of the needed sample space. In such cases, we can simply go over all
  polynomials of degree m, and discard those with non-trivial divisors.
• Another option is to sample degree-m polynomials randomly, until an irreducible
  one is found. The density of irreducibles is ≈ 1/m, so with roughly m^2 samples
  we succeed with good probability.
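Both approaches can be sketched with integer-encoded GF(2) polynomials (bit j holds the coefficient of x^j). The trial-division test and the rejection-sampling loop below are illustrations of the two options, not the talk's code.

```python
import random

def poly_mod(a, b):
    """Remainder of a divided by b in GF(2)[x] (polynomials as bit vectors)."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def is_irreducible(p, m):
    """p of degree m is irreducible iff it has no divisor of degree 1..m//2:
    trial division over all candidate polynomials of those degrees."""
    return all(poly_mod(p, d) != 0 for d in range(2, 1 << (m // 2 + 1)))

def random_irreducible(m, rng=None):
    """Rejection sampling: irreducibles have density ~ 1/m among the degree-m
    polynomials, so this loop terminates quickly in expectation."""
    rng = rng or random.Random(0)   # seeded here only for reproducibility
    while True:
        p = (1 << m) | rng.randrange(1 << m)   # force degree exactly m
        if is_irreducible(p, m):
            return p
```

For example, `is_irreducible(0b1011, 3)` holds (x^3 + x + 1), while x^3 + x^2 + x + 1 = (x + 1)(x^2 + 1) is rejected.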
The construction (3)

Future Improvements
Our construction for finding an (ε,k)-wise independent sample space in {0,1}^n
is composed of two main steps:
(1) Finding an ε-biased linearly space in {0,1}^{k·log(n)} of size (k·log(n)/ε)^2.
(2) Finding a linear transformation from {0,1}^{k·log(n)} to {0,1}^n
    with an image that is a k-wise independent space.
• Part (2) has been proven to be within a constant factor of the optimum,
  and thus improving upon this method can only be done by finding a smaller
  ε-biased linearly space in {0,1}^{k·log(n)}.
• A completely different method might exist, and might produce better results.