Transcript Slide 1
The Polynomial Method
In Quantum and Classical Computing
y
x
Scott Aaronson (MIT)
Overview
The polynomial method: Just an awesome tool
that every CS theorist should know about
Goes back to the prehistory of the field (1960’s),
but also plays a major role in current work
[including at this FOCS] on machine learning,
quantum computing, circuit lower bounds,
communication complexity…
Idea: Reduce CS questions to questions about the
minimum degree of real polynomials
Easy to learn! “Look ma, no quantum”
This Talk: Just Some Basics
1. Polynomials in machine learning
- Perceptrons
2. Polynomials in quantum computing
Stuff I -wish
I could
cover but can’t
lack algorithms
of time
Optimality
of Deutsch-Jozsa
andfor
Grover
- Polynomials
finite
fields (Razborov-Smolensky)
- Collisionover
lower
bound
- Reduction of communication problems to polynomials
3.- Sherstov’s
Polynomials
in circuit
complexity
pattern
matrix method
- Deep
- Linial-Mansour-Nisan
connections to Fourier
and
analysis
Bazzi
4. Polynomials everywhere!
- Communication complexity, oracles, streaming…
Our story starts in St. Petersburg, around 1889…
привет! I proved a cool
theorem:
if pyou’re
is a quadratic,
Uhh …
on
max p' x 4 max px
And what if p has
degree d?
your own
1 x 1
1 x 1
y
x
1
Pr X kXˆ
k
Dmitri Mendeleev
A. A. Markov
(periodic table dude)
(inequality dude)
Markov did generalize Mendeleev’s bound to
arbitrary degree (about which more later)
He thereby helped start a field called approximation
theory
Approximation theory is a proto-complexity theory!
Real polynomials = Model of computation
Degree = Complexity measure
So, maybe not so surprising that it ends up being
related to actual complexity theory…
1. POLYNOMIALS IN MACHINE
LEARNING
Fast-forward to 1969…
And AI researchers were studying perceptrons
f
f1
f2
…
m
1 working
if i f i
Bill Ayers
was
f
i 1
for the McCain’08
0
otherwise
campaign
fm
1 , , m ,
A perceptron of order k is a Boolean function
f:{0,1}n{0,1} that’s a threshold of subfunctions
on at most k variables each
Minsky and Papert: Small
perceptrons have serious
limitations!
Application: “killed neural net research for a decade”
Suppose f:{0,1}n{0,1} is represented by an order-k
perceptron
Then there’s clearly a degree-k polynomial p:RnR
such that for all x1,…,xn{0,1},
sgn px1 ,, xn f x1,, xn
Furthermore, without loss of generality p is
multilinear: no variable raised to higher power than 1
Example: The PARITY function
Suppose
sgn px1,, xn x1 xn
for all x1,…,xn{0,1}. Then what can we say
about deg(p)?
Theorem: deg(p)n
Key idea: Symmetrization
Replace multivariate polynomials by univariate
ones, which are easier to understand
How Symmetrization Works
Let
Let
px1 ,, xn
qk
x
S n
S d
S
i
iS
px1 ,, xn
x x k
EX
1
n
Key Lemma:
q(k) is itself a polynomial in k,
of degree at most d
Proof: By linearity of expectation,
qk S EX xi
x1 xn k
S n
iS
S d
n S n
EX xi
k
x1 xn k
k
S
iS
n S !
k!n k !
k S !n k ! n!
n S !
k k 1 k S 1
n!
which is a degree-|S| polynomial in k.
So, suppose there’s an order-k perceptron
computing the parity of n bits
Then there’s a degree-k multilinear polynomial p
such that sgn p x ,, x x x
1
n
1
n
Hence there’s a degree-k univariate polynomial q
such that for all k=0,…,n,
qk 0 if k even
qk 0 if k odd
Must have degree n
2. POLYNOMIALS IN QUANTUM
COMPUTING
Quantum Query Model In One Slide
What are the allowed operations?
Initialize vector of amplitudes
1
n
Quantum state:
Unit vector in Cn
Query the input bits
1 1 1 2 x1
1 2 x
n
n n
One further detail: The quantum state can have
more a
than
n dimensions,
with multiple“Measure”
components
Apply
unitary
transformation
querying each xi, as well as components that don’t
1 u11 u1n 1
make
at all
queries
Outcome i observed
2
with
probability
|
|
Complexity
Measure:
Q(f)
=
minimum
number
of
i
u
u
nn n
n n1
queries needed to compute a Boolean function f with
probability 2/3, on all inputs x=x1…xn
Example: The Deutsch-Jozsa Algorithm
Does something spectacular:
Computes the XOR of two bits with one oracle call!
1 x1 x2
x2 x1
1
2
1
2
1
1 2x
0
1
2
1 0
1
2
x
2
2
1
2
1
2
By computing x1x2, x3x4, etc., can compute the
parity of n bits with n/2 oracle calls
Is that optimal?
Lemma (Beals et al. 1998): If a quantum algorithm
makes T queries, its probability of accepting is a
degree-2T multilinear polynomial over the xi’s
Right-to-Left Proof:
0 u11 u1n 1 2 x1
0 1
1 2 x1
Implication:
algorithm
If a quantum
computed
x0 x 1with
u queries,
it
would
<n/2
lead
to
a
2
x
u
0
1
2
x
1
n
n n1
nn
n n
polynomial approximating PARITY with degree <n.
Entries
After
are Deutsch-Jozsa
now
TStill
queries,
Degree-2
degree-1
degree-T
polynomials
polynomials
polynomials
polynomials
over the xi’s
Hence
must be
optimal!
Then
i accepting
2
i
has degree 2T
Another Famous Quantum
Algorithm: Grover’s
Computes the OR of n bits using O(n) queries
Is Grover’s algorithm optimal?
BBBV 1994: Yes, by a quantum argument
We’ll instead prove Grover is optimal using …
wait for it …
Given a Boolean function f, let deg(f) be the
minimum degree of a real polynomial p:RnR
such that
px f x x 0,1
n
Theorem (Nisan-Szegedy 1994):
deg ORn n
Observation: Is that lower bound tight? Yes,
because of Grover’s algorithm!
To prove deg(OR)=(n), we need to
revisit our good friend Markov…
Theorem (Markov): If p is a degree-d real
polynomial, then
max p' x d max px
2
1 x 1
1 x 1
Another convenient form: for all n>0,
deg p
n max p ' x
0 x n
2 max p x
0 x n
Markov’s inequality is tight.
The extremal cases are called the
Chebyshev polynomials:
Td x cosd arccosx
Uhh … why is that a polynomial at all?
Td cos x cos dx
Re e idx
Re cos x i sin x
d
which is a degree-d polynomial in cos x
y
x
y
x
y
x
y
x
y
x
y
x
y
x
y
x
y
x
y
x
Let p satisfy
px ORx x 0,1
n
We want to lower-bound deg(p)
Symmetrize:
qk EX px , degq deg p
x k
1
0
1 qk 1 k 1,, n
1
1by
q' xSo
2Markov’s
for some0 inequality,
x 1
n max1 2 ,2c 1
degq
n
c
0
q0
One remaining problem: q(x) need not be
bounded at non-integer x
Solution: Notice max q' x 2c 1, c max q x
0 x n
0 x n
Collision Problem
Illustrates the amazing reach of the polynomial
method
Problem: Given f:[n][n], decide whether f is 1-to-1
or 2-to-1, promised it’s one or the other
By the Birthday Paradox, ~n queries to f
are necessary and sufficient classically
[Brassard et al. 1997] gave a quantum algorithm
making O(n1/3) queries
[A. 2002]: Any quantum algorithm needs (n1/5)
queries. Improved to (n1/3) by Shi
Lower bound by polynomial method
1 if f x h
Let x, h
0 otherwise
Lemma (following Beals et al.): If a quantum
algorithm makes T queries to f, the probability p(f)
that it accepts is a degree-2T polynomial in the
(x,h)’s
Now let
qk
p f
k - to-1 functions f
EX
be the expected acceptance probability on a
random k-to-1 function
The Miracle:
q(k) is itself a polynomial in k, of
degree at most 2T
Why?
nr
n d !
r dh
n / k r
EX
xh , j , h
r
k - to-1 functions f
n
/
k
r
h
1
j
1
k!
k d h !
n
n!
n / k
k!n / k
h 1
n/k
n r !n d !
k! n / k !n n / k !
r
n!n!
n / k r !n n / k !k!n / k r k d h !
d1
d2
h 1
d3
n r !n d !
n!n!
n r !n d !
d
n!n!
h 1
n n n
k
k
1
k
d
1
1
r
1
h
k k k
h 1
n r !n d !
n!n!
k!r n / k !
r
n / k r !
k d h !
r
k
1
k
d
1
nn k n rk k
h
r
h 1
which is a degree-d polynomial in k. That’s why.
Technicality: Need to deal with k not dividing n
Another Useful Hammernomial:
Bernstein’s Inequality
deg p
max p' x xn x
0 x n
max px
0 x n
Application: Any quantum algorithm to compute the
MAJORITY of n bits requires (n) queries
Ouch, that really
hurts the degree!
Oh, and don’t forget the inequality of V.
A. Markov—A. A.’s younger brother!
px
d k p 2e deg p max
0 x n
max k
0 x n dx
nk
2
k
Application [A. 2004]: Direct product theorem for
quantum search. After T queries, the probability that
a quantum algorithm finds K marked items out of N
is at most (cT2/N)K
0 1
K
N
3. POLYNOMIALS IN CIRCUIT
COMPLEXITY
Linial-Mansour-Nisan 1993: If a Boolean function f
is computable by an AC0 circuit of size s and depth k,
then we can find a degree-d real polynomial p such
that
EX n px f x 2s 2
x0,1
2
d 1 / k / 20
Proof uses the Switching Lemma to upper-bound
high-degree Fourier coefficients
By Nisan-Szegedy, the above
theorem would be false if we wanted
|p(x)-f(x)| to be small for every x
Bazzi 2007: Let F=C1Cm be a DNF formula.
Then we can find degree-d real polynomials p and q
such that
n
px F x qx x 0,1
O 1 d
EX qx px m 2
x0,1n
Implies that polylog-wise independent distributions
“fool” small DNFs.
The proof takes
64
pages
4. POLYNOMIALS EVERYWHERE
Polynomials in Oracle-Building
Beigel 1992: There exists an oracle relative to which
PNP PP
Use the following problem: Given exponentially-long
integers x=x1…xN and y=y1…yN, is xy?
It’s in PNP, since we can use binary search to find the
leftmost i such that xiyi
But is there a low-degree polynomial p such that
if x y
1
sgn px1 ,, xN , y1 ,, yN
?
1 otherwise
Sure:
2
N 1
x1 2N 2 x2 20 xN 2N 1 y1 2N 2 y2 20 yN
But by clever repeated use of Markov’s inequality,
one can show that any such polynomial must take
on huge (doubly-exponentially-large) values
This means the problem can’t be in PP
[A. 2006] generalized Beigel’s result to give an
oracle relative to which PP has linear-size circuits
Requires handling many polynomials simultaneously
Slide of Guilt: The Polynomial Method
in Communication Complexity
Razborov 2002: Any quantum
protocol for the Disjointness
problem requires (n) qubits of
Razborov and Sherstov, this very FOCS:
communication
An AC0 function with large unbounded-error
communication complexity
Sherstov, this very FOCS:
Characterizes the unboundederror communication complexity
Chattopadhyay-Ada,
Lee-Shraibman 2008: Lower
of symmetric functions
bounds for the k-party communication complexity of
Disjointness in the Number-On-Forehead And
model
more!
Some Positive Uses of Polynomials
Harvey-Nelson-Onak, this very FOCS: Chebyshev
polynomials used to give a streaming algorithm for
approximating the Shannon entropy
Beigel-Reingold-Spielman 1991: PP is closed
under intersection
Future Direction 1: Beyond
Symmetrization
Find better techniques to lower-bound the degrees
of multivariate polynomials.
deg
OR
AND AND
Upper bound: O(n)
(from quantum algorithm)
n
AND ?
Lower bound: (n1/3)
n
(can be proved using the
n1/3 collision lower bound)
deg(f)=O(deg(f)2) for all Boolean functions f?
Best known relation: deg(f)=O(deg(f)6) (Beals et al.)
Future Direction 2: Understanding
Bounded Real Polynomials
Conjecture. Let p:Rn[0,1] be a real polynomial of
degree d. Suppose EXx,y[|p(x)-p(y)|]=(1). Then there
exists an i[n] such that EXx[|p(x)-p(xi)|]=(1/poly(d)).
Would have major implications for quantum!
e.g., for P vs. BQP relative to a random oracle
Given a partial function f:S{0,1} (S{0,1}n), let deg(f)
be the minimum degree of a polynomial p such that
(1) 0p(x)1 for all x{0,1}n,
(2) |p(x)-f(x)| for all xS.
Is there a partial f for which deg(f) is exponentially
smaller than Q(f)?
Future Direction 3: Matrix- Valued
Polynomials
What Boolean functions can we approximate as
Would imply an oracle relative to which
p11 x1 , , xn p1m x1 , , xn
SZKQMA (i.e., “there are no succinct
pij d ?
max quantum
for problems
, deg
proofs
like
graph
p x , , x non-isomorphism”)
p
x
,
,
x
n
mm
1
n
m1 1
Conjecture. Suppose
max(A(x))[0,1] for all x{0,1}n
max(A(x))2/3 for all x encoding a 1-to-1 function
max(A(x))1/3 for all x encoding a 2-to-1 function
Then d2(d+log m)=(n).
Future Direction 4: Extending Bazzi’s
Theorem to AC0 (the Linial-Nisan Conjecture)
Problem: Given fAC0, construct polylog(n)-degree
polynomials p,q:RnR such that
px f x qx x 0,1 , EX n qx px
n
x0,1
If p,q have the further property that
p x
q x
C x
C
Clauses C
C x
C
Clauses C
C
C
P rC On
P rC O n
1 / 2
1 / 2
C
C
then we get an oracle relative to which BQPPH.
The polynomial method:
the choice of hardworking
American lowerboundsmen
I
approve!
y
x
OPEN PROBLEM