Assymetric Communication Complexity
Download
Report
Transcript Assymetric Communication Complexity
Asymmetric
Communication Complexity
And its implications on Cell Probe
Complexity
Based on a paper of Peter Bro Miltersen,
Noam Nisan, Muli Safra and Avi Wigderson
Slides by Elad Verbin
Purpose
We want to get Lower Bounds.
Best known lower bounds:
Sorting is Ω(nlogn) in the comparison
model
Trivial lower bounds. i.e. MAX is Ω(n)
What can we really do, i.e. for RAM?
Outline
Yao decided to strengthen the model –
Considered the Cell Probe model.
Lower bounding Cell Probe is hard too.
We strengthen even more –
Communication Complexity
Outline
1.
2.
We show the relationship between Cell
Probe and Communication Complexity.
We show how to get lower bounds for
Communication Complexity using two
techniques:
The Richness Technique
The Round Elimination Technique
Communication Complexity
The problem : f:XY{0,1}
Alice gets xX, Bob gets yY, their goal is
to exchange messages to decide f(x,y).
f(x,y)
A solution is a communication protocol that
can compute f(x,y) for all x,y.
Asymmetric Communication
Complexity
A Communication Protocol that computes
function f in which Alice sends at most a
bits and Bob sends at most b bits is called
a [a,b]-protocol for f.
Pink<=a
f(x,y)
Blue<=b
Randomized Protocols
If the protocol is allowed to flip (public)
coins, and gives the correct answer with
probability > 2/3 it is called a randomized
protocol.
If it always correctly identifies a 0-instance
it is called a one-sided error protocol.
Example
For any problem, there are trivial
deterministic protocols:
[log|X|,1]-protocol
[1,log|Y|]-protocol.
The Problem DISJ(N,k,l)
We work on the universe U={0,1,…,N-1}
Alice gets x, a set of k elements
Bob gets y, a set of l elements
They must decide if x∩y=Ø
x
y
x∩y=Ø?
A one-sided error randomized
[O(k),O(l)]-protocol for DISJ(N,k,l)
If we say that x and y are disjoint then we
want to have complete confidence. If we
say they intersect, we want to be
reasonably certain.
Flip public coins to get a sequence of
random subsets of the universe: R1, R2, …
x
y
R3
x
y
R3
y=y∩R3
y
x
y
R3
y=y∩R3
R8
x=x∩R8
AND SO ON….
y
A one-sided error randomized
[O(k),O(l)]-protocol for DISJ(N,k,l)
We don’t really send the index of R8, we
just send the distance from the last set
(8-3=5). This means that the expected
numbers of bits sent by a player is equal
to the size of his set
If at some point one of the sets becomes
empty, then the originals were disjoint –
say so. Otherwise, after a long time, say
that there is an intersection.
A one-sided error randomized
[O(k),O(l)]-protocol for DISJ(N,k,l)
If x and y were indeed disjoint, the sizes of
x and y decrease by a factor of 2 each
round. Therefore the total communication
is [O(k),O(l)].
If the sets were disjoint, what is the
chance that we say that there is an
intersection? Very low.
Fixed-round protocols
If t alternating messages are sent and
each message is of size a or b it is called
a [t,a,b]-protocol.
a
b
a
t
b
f(x,y)
Static Data Structure Problems
A static data structure problem is a function
f:DQR
D – the data
Q – the queries
R – Possible answers. Typically, R={0,1}
DS
Data
query
The Problem MEMBERSHIP(N)
INPUT: a set S[N]
QUERIES: of the form “xS?”
D={S[N]}, |D|=2N
Q=[N]
R={0,1}
The trivial solution is optimal.
The Problem MEMBERSHIP(N,n)
INPUT: a set S[N] of size n
QUERIES: of the form “xS?”
D={S[N] | |S|=n}, |D|=choose(N,n)
Q=[N]
R={0,1}
The Cell Probe Model
Parameter w – word size
s cells, each containing w bits.
Each query probes at most t cells to get
answer
A query is a decision tree of depth t and
degree 2w
MEMBERSHIP(N,n)
Solutions:
Keep every possible answer. s=N, t=1
(better – s=N/w, t=1)
Keep a nonredundant representation.
s=log(choose(N,n)), t=log(choose(N,n))
MEMBERSHIP(N,n)
Solutions:
Keep a sorted list of all elements.
s=nlog(N)/w , t=log(n)*log(N)/w
There is a randomized solution with
s=(n/w)c, t=O(1), for some constant c.
What is the connection
between Cell Probe and
Asymmetric
Communication
Complexity?
ACC <-> Cell Probe
The communication problem related to a
static data structure problem f:DQ{0,1}
if the problem where Alice gets a query,
Bob gets the data, and they should decide
if this is a “yes” instance or a “no” instance
Communication Problem MEM(N,n)
Alice gets x[N], Bob gets y[N], |y|=n,
they should decide if xy.
Trivial protocols: [1,nlogN] , [logN,1]
Lemma CP->AAC
If there is a solution to the data
structure problem with word size w
taking s cells and with query time t,
then there is a [2t,log(s),w]-protocol
for the communication problem
Therefore a lower bound on ACC gives us
a lower bound on Cell Probe
Finer points of CP->AAC
How is the communication complexity
model stronger than the Cell Probe
Model?
Answer: In its adaptivity
Which form of Cell Probe lower bounds
can we get from the CP->AAC Lemma?
Answer: the bound on space is up to a
polynomial
Restricted AAC->CP
If there is a [O(1),a,b]-protocol for the
communication problem then the data
structure problem has a solution with word
size w=b, t=O(1) and s=2O(a)
Proof: The Data Structure for input y
contains the message Bob should send
next for every possible history of
messages Alice can send, for any query.
Lower Bounding The
Communication Complexity
Communication Problem MEM(N,l)
Alice gets x[N], Bob gets y[N], |y|=l,
they should decide if xy.
NONMEM(N,l) is the same problem, when
Alice and Bob want to decide if xy
Trivial protocols: [1,l*logN] , [logN,1]
Problem <-> Matrix
We identify a communication problem
f:X×Y{0,1} with a |X|×|Y| Matrix where
M[x][y]=f(x,y).
The matrix of NONMEM(N,l) has N rows
N
and l columns. Each column has N-l 1
entries
Problem <-> Matrix
A problem (matrix) is (u,v)-rich if at least v
columns contain at least u 1-entries.
(4,3)-rich
NONMEM(N,l) is
N
(N-l, l )-rich.
The Richness Lemma
1.
2.
Let f be a communication problem that:
is (u,v)-rich
has a randomized one-sided error [a,b]protocol.
Then f contains a u/2a+2 over
v/2a+b+2 submatrix of 1-entries.
Randomized Lower Bound for
MEM(N,l)
Say MEM(N,l) has a negative-one-sided
error [a,b]-protocol. Let a<log(l), l<N/2.
Then NONMEM(N,l) has a one-sided error
[a,b]-protocol
Randomized Lower Bound for
NONMEM(N,l)
N
(N-l, l )-rich
NONMEM(N,l) is
Therefore it has a 1-submatrix of
dimensions at least (N-l)/2a+2 over
N
/2a+b+2
l
Randomized Lower Bound for
NONMEM(N,l)
However, if there is a 1-submatrix
of
N
r
dimensions r on s then s≤ l
By substituting for s and r, simplifying and
bounding we get 2a(a+b)=Ω(l)
Randomized [O(a),O(l/2a)] Upper
Bound for NONMEM(N,l)
On the other hand, NONMEM(N,l) has a
[O(a),O(l/2a)]-protocol, for all a<log(l):
Alice sends Bob the first a indices of R’s
that contain x. This allows Bob to reduce y
to expected size l/2a.
Then Bob sends a couple indices that
contain y.
If we are not yet sure that they are disjoint,
we say that they intersect.
Tightness for NONMEM(N,l)
NONMEM(N,l) has a [O(a),O(l/2a)]-protocol, for
all a<log(l)
2a(a+b)= 2a(a+l/2a)=?O(l) Therefore the last
result is tight?
There are constants c,c’>0, so that for any a,
b=l/2ca is enough.
b=l/2c’a is not enough.
The Richness Lemma
1.
2.
Let f be a communication problem that:
is (u,v)-rich
has a randomized one-sided error [a,b]protocol.
Then f contains a u/2a+2 over
v/2a+b+2 submatrix of 1-entries.
Proof of the Richness Lemma
First let us prove a weaker result: if f has a
deterministic [a,b]-protocol then it contains
a contains a u/2a over
v/2a+b submatrix of 1-entries.
We prove this by induction on a+b:
Proof of the Richness Lemma
For a+b=0 – |X|≥u, |Y|≥v, and f(x,y)=1 for
all x,y, so this is trivial.
Now, if Alice send the first bit:
X0 – inputs for which she sends 0
X1 – inputs for which she sends 1
Let f0, f1 be the restrictions of f to X0Y,
X1Y.
Proof of the Richness Lemma
At least one of them is (u/2,v/2)-rich, and
both have a [a-1,b] protocol.
By the induction it contains a (u/2)/2a-1
over (v/2)/2a+b-1 1-submatrix.
In the other case, Bob send the first bit.
Define Y0,Y1,f0,f1. At least one of them is
(u,v/2)-rich, and proceed similarly.
Proof of the Richness Lemma
Now let us prove the general case:
Let S be the set of u*v rich-positions in the
matrix
Let us look at some coin-flip sequence.
Proof of the Richness Lemma
Let X = #{1s in S}
E[X]>=2/3 * uv
=> There exists such a sequence for
which X>=2/3 * uv
Fix the sequence, to get a deterministic
algorithm. This algorithm computes a
function f’ that is close to f.
Proof of the Richness Lemma
By a counting argument, f’ is (u/4,v/4)-rich,
and so it has a 1-submatrix of the required
size ( u/2a+2 over v/2a+b+2 )
This is a 1-submatrix in f too, because the
error is one-sided.
Q.E.D.
A Richness Results for two-sided
error
Let d,e>0, and let f:X×Y{0,1} be a
communication problem with at least a dfraction of 1s. If f has a randomized twosided error [a,b]-protocol then f has a
submatrix M of dimensions at least
|X|/2O(a) over |Y|/2O(a+b) with at least a
(1-e)-fraction of 1s.
The SPAN(n) Problem
In SPAN, Alice gets x{0,1}n and Bob gets
a vector subspace y{0,1}n
y can be represented using a basis of k≤n
vectors – O(n2) bits
Alice and Bob must decide if x∈y.
Trivial Protocols:[n,1] , [1,n2]
Lower bounds for SPAN
1.
2.
Let’s prove that in any [a,b] randomized
one-sided error protocol for SPAN, either
a=Ω(n), or b=Ω(n2)
We will assume that y is of dimension
n/2.
We will prove that:
SPAN is (2n/2,2n2/4)-rich, and
SPAN does not contain a 1-submatrix of
2/12
n/3
n
dimensions 2 over 2
SPAN is
2/4
n/2
n
(2 ,2 )-rich
Each subspace contains exactly 2n/2
vectors => each column contains 2n/2 1s.
How many subspaces of dimension n/2
are there?
Lets choose a basis: we have 2n-1
possibilities for the first vector, 2n-2 for the
second, 2n-4 for the third, etc.
SPAN is
2/4
n/2
n
(2 ,2 )-rich
We chose each basis (n/2)! times
How many basis does a subspace has?
We have 2n/2-1 options to choose the first
vector, 2n/2-2 for the second, etc.
We again chose each basis (n/2)! times.
Thus, there are at least
dimension n/2.
2/4
n
2
subspaces of
SPAN does not contain a 1-submatrix
2/12
n/3
n
of dimensions 2 over 2
Lets look at a 1-submatrix with at least 2n/3
rows. The subspace spanned by them is
of dimension at least n/3.
How many subspaces of dimension n/2
can include this entire subspace?
2/12
n
2
And we’re done.
The Round Elimination
Lemma
f->Pm(f)
Let f:X×Y->{0,1} be a communication problem
Pm(f) is:
Alice gets m elements from X, x1, …, xm
Bob gets 1≤i≤m, yY and also x1, …, xi-1
They want to compute f(xi,y)
How meaningful can Alice’s first message be?
The Round Elimination Lemma
Let C=99, R=4256.
Say that PRa(f) has a randomized twosided error [t,a,b]-protocol in which Alice
sends the first message.
Then there is a randomized two-sided
error [t-1,Ca,Cb]-protocol with Bob
sending the first message.
General framework for LB proofs using
the Round Elimination Lemma
[t,a,b]-protocol for F(n)
[t,a,b]-protocol for Pm(F(n’)) (typically n’=n/m)
[t-1,Ca,Cb]-protocol for F(n’)
[1,Ct-1a,Ct-1b]-protocol for F(n(t-1))
The problem GT(n)
Alice and Bob each gets an n-bit integer.
They want to decide if x<y.
x
y
x<y?
The problem GT(n)
Deterministic communication complexity is
linear
Randomized comm. complexity with twosided error is O(logn) (using a logarithmic
number of rounds)
When limited to t rounds: There is a
[t,n1/tlogn,n1/tlogn]-protocol
GT(n) does not have a
[t,n1/tC-t,n1/tC-t]-protocol
Theorem: Let C=99.
There does not exist a randomized twosided error [t,n1/tC-t,n1/tC-t]-protocol for
GT(n)
GT(n) does not have a
[t,n1/tC-t,n1/tC-t]-protocol
By induction on t
Say there was a [t,n1/tC-t,n1/tC-t]-protocol
for GT(n).
Then there is a [t,n1/tC-t,n1/tC-t]-protocol for
Pm(GT(n’)) for m=n1/t, n’=n(t-1)/t
From the round Elimination Lemma, there
is a [t-1, n1/tC-(t-1),n1/tC-(t-1)]-protocol for
GT(n’). Contradiction.
GT(n) does not have a
[t,n1/tC-t,n1/tC-t]-protocol
[t,n1/tC-t,n1/tC-t]-protocol for GT(n)
[t,n1/tC-t,n1/tC-t]-protocol for Pm(GT(n’)) for m=n1/t,
n’=n(t-1)/t
Alice constructs a n-bit integer x’: She
concatenates x1…xm
Bob constructs a n-bit integer y’: He
concatenates x1…xi-1 then y and the rest is 1s
We get x’>y’ xi>y
THE END