
Optimal Bounds for Johnson-Lindenstrauss Transforms and
Streaming Problems with Sub-Constant Error
T.S. Jayram
David Woodruff
IBM Almaden
Data Stream Model
• Have a stream of m updates to an n-dimensional vector v
– “add x to coordinate i”
– Insertion model -> all updates x are positive
– Turnstile model -> x can be positive or negative
– stream length and update magnitudes are at most poly(n)
• Estimate statistics of v
– # of distinct elements F0
– Lp-norm |v|p = (Σi |vi|^p)^(1/p)
– entropy
– and so on
• Goal: output a (1+ε)-approximation with limited memory
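To make the updates concrete, here is a minimal Python sketch of a turnstile stream and an AMS-style L2 estimator. It is an illustration under simplified assumptions (explicit sign arrays rather than the 4-wise independent hashing a real small-space algorithm would use), not one of the algorithms in this talk.

    import random

    # Turnstile stream: a sequence of (i, x) updates meaning v[i] += x,
    # where x may be negative.

    def ams_l2_estimate(stream, n, reps=100, seed=0):
        rng = random.Random(seed)
        signs = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(reps)]
        sketches = [0] * reps
        for i, x in stream:                     # process one update
            for r in range(reps):
                sketches[r] += signs[r][i] * x  # Z_r = sum_i s_r[i] * v[i]
        # E[Z_r^2] = |v|_2^2, so averaging the squares reduces variance.
        return (sum(z * z for z in sketches) / reps) ** 0.5

    stream = [(0, 3), (1, -2), (0, -1), (2, 5)]   # v = (2, -2, 5)
    print(ams_l2_estimate(stream, n=3))           # ~ |v|_2 = sqrt(33) = 5.74...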
Lots of “Optimal” Papers
• Lots of “optimal” results
– “An optimal algorithm for the distinct elements problem”
[KNW]
– “Fast moment estimation in optimal space” [KNPW]
– “A near-optimal algorithm for estimating entropy of a stream”
[CCM]
– “Optimal approximations of the frequency moments of data
streams” [IW]
– “A near-optimal algorithm for L1-difference” [NW]
– “Optimal space lower bounds for all frequency moments” [W]
• This paper
– Optimal Bounds for Johnson-Lindenstrauss Transforms and
Streaming Problems with Sub-Constant Error
What Is Optimal?
• F0 = # of non-zero entries in v
• “For a stream of indices in {1, …, n}, our algorithm
computes a (1+ε)-approximation using an optimal O(ε-2 +
log n) bits of space with 2/3 success probability… This
probability can be amplified by independent repetition.”
• If we want high probability, say, 1-1/n, this increases the
space by a multiplicative log n
• So “optimal” algorithms are only optimal among algorithms
with constant success probability
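The repetition step is the standard median trick: run a constant-success-probability estimator k = O(log 1/δ) times independently and output the median. A minimal sketch, with an illustrative constant and a toy stand-in estimator:

    import math, random, statistics

    # Median-of-k amplification: if each run is a (1+ε)-approximation with
    # probability 2/3, a Chernoff bound makes the median one with
    # probability 1 - δ for k = O(log 1/δ), at a multiplicative
    # O(log 1/δ) cost in space.

    def amplify(estimator, delta, seed=0):
        rng = random.Random(seed)
        k = max(1, math.ceil(36 * math.log(1 / delta)))  # constant illustrative
        return statistics.median(estimator(rng.random()) for _ in range(k))

    truth = 100.0
    noisy = lambda r: truth if r < 2 / 3 else 2 * truth  # wrong 1/3 of the time
    print(amplify(noisy, delta=1e-6))                    # 100.0 w.p. >= 1 - delta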
Can We Improve the Lower Bounds?
x ∈ {0,1}^(ε-2)
y ∈ {0,1}^(ε-2)
Gap-Hamming: promised that either Δ(x,y) > ½ + ε or Δ(x,y) < ½ - ε
Lower bound of Ω(ε-2) with 1/3 error probability
But a trivial upper bound of ε-2 bits with 0 error probability (Alice sends all of x)
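A toy version of the promise problem, to fix notation:

    # Gap-Hamming on ε^-2-bit strings: decide whether the normalized
    # Hamming distance is above 1/2 + ε or below 1/2 - ε (inputs outside
    # the promise may be answered arbitrarily; here we return None).

    def gap_hamming(x, y, eps):
        assert len(x) == len(y) == round(eps ** -2)
        frac = sum(a != b for a, b in zip(x, y)) / len(x)
        if frac > 0.5 + eps:
            return 1
        if frac < 0.5 - eps:
            return 0
        return None  # outside the promise

    eps = 0.1
    x, y = [0] * 100, [1] * 70 + [0] * 30
    print(gap_hamming(x, y, eps))  # 1: normalized distance 0.7 > 0.5 + eps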
Our Results
Streaming Results
• Independent repetition is optimal!
• Estimating Lp-norm in turnstile model up to 1+ε w.p. 1-δ
– Ω(ε-2 log n log 1/δ) bits for any p
– [KNW] get O(ε-2 log n log 1/δ) for 0 ≤ p ≤ 2
• Estimating F0 in insertion model up to 1+ε w.p. 1-δ
– Ω(ε-2 log 1/δ + log n) bits
– [KNW] get O(ε-2 log 1/δ) for ε-2 > log n
• Estimating entropy in turnstile model up to 1+ε w.p. 1-δ
– Ω(ε-2 log n log 1/δ) bits
– Improves Ω(ε-2 log n) bound [KNW]
Johnson-Lindenstrauss Transforms
• Let A be a random matrix so that for any fixed q ∈ R^d,
with probability 1 - δ,
|Aq|2 = (1 ± ε) |q|2
• [JL] A can be a (1/ε2 log 1/δ) × d matrix
– Gaussians or sign variables work
• [Alon] A needs to have Ω((1/ε2 log 1/δ) / log(1/ε))
rows
• Our result: A needs to have Ω(1/ε2 log 1/δ) rows
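A minimal sketch of a sign-variable JL transform with ε-2 log 1/δ rows, as in the [JL] bullet above; constants are dropped, so this is an illustration rather than a tuned implementation:

    import numpy as np

    # Random ±1 matrix with k = ε^-2 log(1/δ) rows, scaled so that
    # E[|Aq|_2^2] = |q|_2^2; for fixed q, |Aq|_2 = (1 ± ε)|q|_2 w.p. 1 - δ.

    def jl_matrix(d, eps, delta, rng):
        k = int(np.ceil(eps ** -2 * np.log(1 / delta)))
        return rng.choice((-1.0, 1.0), size=(k, d)) / np.sqrt(k)

    rng = np.random.default_rng(0)
    d, eps, delta = 10_000, 0.1, 0.01
    A = jl_matrix(d, eps, delta, rng)
    q = rng.standard_normal(d)
    print(np.linalg.norm(A @ q) / np.linalg.norm(q))  # typically within 1 ± eps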
Communication Complexity
Separation
f(x,y) ∈ {0,1}
Alice holds x, Bob holds y
Dρ,1/3(f) = communication of the best 1-way deterministic protocol
that errs w.p. 1/3 on distribution ρ
[KNR]: R||1/3(f) = max over product distributions μ × λ of Dμ×λ,1/3(f)
Communication Complexity
Separation
f(x,y) ∈ {0,1}
VC-dimension: maximum number r of columns for which all
2^r rows occur in the communication matrix on these columns
[KNR]: R||1/3(f) = Θ(VC-dimension(f))
Our result: there exist f and g with VC-dimension k, but:
R||δ(f) = Θ(k log 1/δ) while
R||δ(g) = Θ(k)
Our Techniques
Lopsided Set Intersection (LSI)
U = 1/ε2 · 1/δ
Is S ∩ T = ∅?
S ⊆ {1, 2, …, U}, |S| = 1/ε2
T ⊆ {1, 2, …, U}, |T| = 1/δ
- Alice cannot describe S with o(ε-2 log U) bits
- If S and T are uniform, then with constant probability S ∩ T = ∅
- R||1/3(LSI) ≥ Duniform,1/3(LSI) = Ω(ε-2 log 1/δ)
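A quick simulation of the disjointness claim: for independent uniform S and T of these sizes, P[S ∩ T = ∅] ≈ (1 - |S|/U)^|T| = (1 - δ)^(1/δ) ≈ 1/e. Parameter values below are illustrative.

    import random

    eps, delta = 0.1, 0.01
    U = round(eps ** -2 / delta)                 # |universe| = 1/ε^2 · 1/δ
    sS, sT = round(eps ** -2), round(1 / delta)  # |S|, |T|

    rng = random.Random(0)
    trials = 2000
    disjoint = sum(
        not (set(rng.sample(range(U), sS)) & set(rng.sample(range(U), sT)))
        for _ in range(trials)
    )
    print(disjoint / trials)  # ≈ 1/e ≈ 0.37: disjoint with constant probability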
Lopsided Set Intersection (LSI2)
U = 1/ε2 · 1/δ
Is S ∩ T = ∅?
S ⊆ {1, 2, …, U}, |S| = 1/ε2
T ⊆ {1, 2, …, U}, |T| = 1
- R||δ/3(LSI2) ≥ R||1/3(LSI) = Ω(ε-2 log 1/δ)
- Union bound over the 1/δ elements of T in the LSI instance
Low Error Inner Product
U = 1/ε2 · 1/δ
Does <x,y> = 0?
x ∈ {0, ε}^U, |x|2 = 1
y ∈ {0, 1}^U, |y|2 = 1
Estimate <x, y> up to ε w.p. 1-δ -> solve LSI2 w.p. 1-δ
R||δ (inner productε) = Ω(ε-2 log 1/δ)
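A sketch of the encoding behind this reduction (variable names ours): Alice spreads unit L2 mass over her set, Bob's single element becomes a standard basis vector, and <x,y> = ε exactly when j ∈ S, so a sufficiently accurate estimate of <x,y> decides LSI2.

    import numpy as np

    eps, delta = 0.1, 0.01
    U = round(eps ** -2 / delta)
    S = list(range(round(eps ** -2)))  # Alice's set, |S| = ε^-2
    j = 5                              # Bob's single element

    x = np.zeros(U); x[S] = eps        # |x|_2 = ε · |S|^(1/2) = 1
    y = np.zeros(U); y[j] = 1.0        # |y|_2 = 1
    print(x @ y, j in S)               # <x,y> = ε iff j ∈ S, else 0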
L2-estimationε
U = 1/ε2 · 1/δ
x ∈ {0, ε}^U, |x|2 = 1
y ∈ {0, 1}^U, |y|2 = 1
What is |x-y|2?
- |x-y|2^2 = |x|2^2 + |y|2^2 - 2<x,y> = 2 - 2<x,y>
- Estimating |x-y|2 up to a (1+Θ(ε)) factor solves inner-productε
- So R||δ(L2-estimationε) = Ω(ε-2 log 1/δ)
- The log 1/δ factor is new, but we want an Ω(ε-2 log n log 1/δ) lower bound
- Can use a known trick to get an extra log n factor
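A numerical check of the identity, with the vectors from the reduction above:

    import numpy as np

    # |x - y|_2^2 = |x|_2^2 + |y|_2^2 - 2<x,y> = 2 - 2<x,y> on unit vectors,
    # so a (1 + Θ(ε))-factor estimate of |x - y|_2 pins down <x,y> to Θ(ε).

    eps, delta = 0.1, 0.01
    U = round(eps ** -2 / delta)
    x = np.zeros(U); x[:round(eps ** -2)] = eps  # Alice's set encoding
    y = np.zeros(U); y[5] = 1.0                  # Bob's basis vector
    print(np.linalg.norm(x - y) ** 2, 2 - 2 * (x @ y))  # both 1.8 here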
Augmented Lopsided Set
Intersection (ALSI2)
Universe [U] = [1/ε2 · 1/δ]
Alice: S1, …, Sr ⊆ [U], all i: |Si| = 1/ε2
Bob: j ∈ [U], i* ∈ {1, 2, …, r}, and Si*+1, …, Sr
Is j ∈ Si*?
R||1/3(ALSI2) = Ω(r · ε-2 log 1/δ)
Reduction of ALSI2 to L2-estimationε
- Set r = Θ(log n)
- Alice encodes S1, …, Sr as vectors x1, …, xr (as in the LSI2 reduction)
- Bob encodes j as yi*; knowing Si*+1, …, Sr, he can cancel all levels above i*:
y - x = 10^i* · yi* - Σi=1..i* 10^i · xi
- |y-x|2 is dominated by 10^i* |yi* - xi*|2
- R||δ(L2-estimationε) = Ω(ε-2 log n log 1/δ)
- Streaming space ≥ R||δ(L2-estimationε)
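A toy check of the scaling step, with random stand-ins for the encoded vectors; the weights 10^i make level i* dominate everything below it:

    import numpy as np

    # Since sum_{i<i*} 10^i < 10^i* / 9, the level-i* term dominates |y - x|_2.

    rng = np.random.default_rng(0)
    U, r, i_star = 50, 6, 4
    xs = [rng.choice([0.0, 0.1], size=U) for _ in range(r)]  # stand-in levels
    y_level = rng.choice([0.0, 1.0], size=U)                 # stand-in y_{i*}

    # y - x = 10^i* · y_{i*} - sum_{i=1}^{i*} 10^i · x_i (levels above i* cancel)
    diff = 10.0 ** i_star * y_level - sum(
        10.0 ** i * xs[i] for i in range(1, i_star + 1))
    top = 10.0 ** i_star * np.linalg.norm(y_level - xs[i_star])
    print(np.linalg.norm(diff) / top)  # ≈ 1: level i* dominates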
Lower Bounds for Johnson-Lindenstrauss
x ∈ {-n^O(1), …, n^O(1)}^t
y ∈ {-n^O(1), …, n^O(1)}^t
Use public randomness to agree on a JL matrix A
- Alice sends Ax; Bob forms Ay and computes |A(x-y)|2 = |Ax - Ay|2
- Can estimate |x-y|2 up to 1+ε w.p. 1-δ
- #rows(A) = Ω(r · ε-2 log 1/δ / log n)
Low-Error Hamming Distance
Universe = [n]
Δ(x,y) = Hamming Distance
between x and y
x ∈ {0,1}^n
y ∈ {0,1}^n
• R||δ(Δ(x,y)ε) = Ω(ε-2 log 1/δ log n)
• Reduction from ALSI2
• Gap-Hamming to LSI2 reductions with Low Error
• Implies our lower bounds for estimating
• Any Lp-norm
• Distinct Elements
• Entropy
Conclusions
• Prove the first streaming space lower bounds that
depend on the error probability δ
– Optimal for Lp-norms, distinct elements
– Improves lower bound for entropy
– Optimal dimensionality bound for JL transforms
• Adds several twists to augmented indexing proofs
– Augmented indexing with a small set in a large domain
– Proof builds upon lopsided set disjointness lower bounds
– Uses multiple Gap-Hamming to Indexing reductions that
handle low error
ALSI2 to Hamming Distance
Alice: S1, …, Sr ⊆ [1/ε2 · 1/δ], all i: |Si| = 1/ε2
Bob: j ∈ [U], i* ∈ {1, 2, …, r}, and Si*+1, …, Sr
- Embed multiple copies by duplicating coordinates at different scales
- Let t = ε-2 log 1/δ
- Use the public coin to generate t random strings b1, …, bt ∈ {0,1}^U
- Alice sets xi = majority over k ∈ Si of bi,k
- Bob sets yi = bi,j
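A simulation of this majority encoding for a single set/level (parameters illustrative): when j ∈ Si, each yi is correlated with the majority bit xi, so Δ(x,y)/t sits below 1/2 by about Θ(ε); when j ∉ Si it concentrates at 1/2.

    import math, random

    rng = random.Random(0)
    eps, delta = 0.2, 0.1
    S_size = round(eps ** -2)                   # |S_i| = 25 (odd: no ties)
    t = round(eps ** -2 * math.log(1 / delta))  # number of public strings

    def mean_hamming_frac(j_in_S, reps=200):
        mismatches = 0
        for _ in range(reps * t):
            bits = [rng.randint(0, 1) for _ in range(S_size)]  # b_i on S_i
            x_i = 1 if 2 * sum(bits) > S_size else 0           # majority bit
            y_i = bits[0] if j_in_S else rng.randint(0, 1)     # b_{i,j}
            mismatches += x_i != y_i
        return mismatches / (reps * t)

    print(mean_hamming_frac(True))   # ≈ 1/2 - Θ(ε): j ∈ S_i
    print(mean_hamming_frac(False))  # ≈ 1/2: j ∉ S_i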