Transcript Slide 1

The Polynomial Method
In Quantum and Classical Computing
y
x
Scott Aaronson (MIT)
Overview
The polynomial method: Just an awesome tool
that every CS theorist should know about
Goes back to the prehistory of the field (1960’s),
but also plays a major role in current work
[including at this FOCS] on machine learning,
quantum computing, circuit lower bounds,
communication complexity…
Idea: Reduce CS questions to questions about the
minimum degree of real polynomials
Easy to learn! “Look ma, no quantum”
This Talk: Just Some Basics
1. Polynomials in machine learning
- Perceptrons
2. Polynomials in quantum computing
Stuff I -wish
I could
cover but can’t
lack algorithms
of time
Optimality
of Deutsch-Jozsa
andfor
Grover
- Polynomials
finite
fields (Razborov-Smolensky)
- Collisionover
lower
bound
- Reduction of communication problems to polynomials
3.- Sherstov’s
Polynomials
in circuit
complexity
pattern
matrix method
- Deep
- Linial-Mansour-Nisan
connections to Fourier
and
analysis
Bazzi
4. Polynomials everywhere!
- Communication complexity, oracles, streaming…
Our story starts in St. Petersburg, around 1889…
привет! I proved a cool
theorem:
if pyou’re
is a quadratic,
Uhh …
on
max p' x   4 max px 
And what if p has
degree d?
your own
1 x 1
1 x 1
y
x


1
Pr X  kXˆ 
k
Dmitri Mendeleev
A. A. Markov
(periodic table dude)
(inequality dude)
Markov did generalize Mendeleev’s bound to
arbitrary degree (about which more later)
He thereby helped start a field called approximation
theory
Approximation theory is a proto-complexity theory!
Real polynomials = Model of computation
Degree = Complexity measure
So, maybe not so surprising that it ends up being
related to actual complexity theory…
1. POLYNOMIALS IN MACHINE
LEARNING
Fast-forward to 1969…
And AI researchers were studying perceptrons
f
f1
f2
…
m

1 working
if   i f i  
Bill Ayers
was
f 
i 1
for the McCain’08
0
otherwise
campaign
fm
1 ,  ,  m ,   
A perceptron of order k is a Boolean function
f:{0,1}n{0,1} that’s a threshold of subfunctions
on at most k variables each
Minsky and Papert: Small
perceptrons have serious
limitations!
Application: “killed neural net research for a decade”
Suppose f:{0,1}n{0,1} is represented by an order-k
perceptron
Then there’s clearly a degree-k polynomial p:RnR
such that for all x1,…,xn{0,1},
sgn  px1 ,, xn   f x1,, xn 
Furthermore, without loss of generality p is
multilinear: no variable raised to higher power than 1
Example: The PARITY function
Suppose
sgn  px1,, xn   x1  xn
for all x1,…,xn{0,1}. Then what can we say
about deg(p)?
Theorem: deg(p)n
Key idea: Symmetrization
Replace multivariate polynomials by univariate
ones, which are easier to understand
How Symmetrization Works
Let
Let
px1 ,, xn  
qk  
   x
S n
S d
S
i
iS

px1 ,, xn 
x  x  k
EX
1
n
Key Lemma:
q(k) is itself a polynomial in k,
of degree at most d
Proof: By linearity of expectation,


qk     S EX  xi 
x1  xn  k
S n 
 iS 
S d

 n  S  n
  
EX  xi   
 k
x1  xn  k
k

S
 iS  
  

n  S !
k!n  k !

k  S !n  k ! n!

n  S !

k k  1 k  S  1
n!
which is a degree-|S| polynomial in k.
So, suppose there’s an order-k perceptron
computing the parity of n bits
Then there’s a degree-k multilinear polynomial p
such that sgn p x ,, x  x  x
 1
n

1
n
Hence there’s a degree-k univariate polynomial q
such that for all k=0,…,n,
qk   0 if k even
qk   0 if k odd
Must have degree n
2. POLYNOMIALS IN QUANTUM
COMPUTING
Quantum Query Model In One Slide
What are the allowed operations?
Initialize vector of amplitudes
 1 
 
  
 
 n
Quantum state:
Unit vector in Cn
Query the input bits
 1   1 1  2 x1  
  


  

     1  2 x 
n 
 n  n
One further detail: The quantum state can have
more a
than
n dimensions,
with multiple“Measure”
components
Apply
unitary
transformation
querying each xi, as well as components that don’t
 1   u11  u1n  1 
make
at all  
 queries

Outcome i observed
          
2
with
probability
|
|
Complexity
Measure:
Q(f)
=
minimum
number
of
i
   u
  

u
nn  n 
 n   n1
queries needed to compute a Boolean function f with
probability  2/3, on all inputs x=x1…xn
Example: The Deutsch-Jozsa Algorithm
Does something spectacular:
Computes the XOR of two bits with one oracle call!


1  x1  x2  

 
 x2  x1  


1
2
1
2
1 

 1 2x

0 
1
2 



1  0
1

2
x
2



2

1 

2
1 

2
By computing x1x2, x3x4, etc., can compute the
parity of n bits with n/2 oracle calls
Is that optimal?
Lemma (Beals et al. 1998): If a quantum algorithm
makes T queries, its probability of accepting is a
degree-2T multilinear polynomial over the xi’s
Right-to-Left Proof:
0   u11  u1n  1  2 x1
0   1 
1  2 x1



 
  Implication:

  algorithm

If a quantum
computed

  
 x0 x 1with
  u queries,
 it
 would
  
<n/2
lead
to
a

2
x

u
0
1

2
x
1
n
n   n1
nn  
n  n 

polynomial approximating PARITY with degree <n.
Entries
After
are Deutsch-Jozsa
now
TStill
queries,
Degree-2
degree-1
degree-T
polynomials
polynomials
polynomials
polynomials
over the xi’s
Hence
must be
optimal!
Then

i accepting
2
i
has degree 2T
Another Famous Quantum
Algorithm: Grover’s
Computes the OR of n bits using O(n) queries
Is Grover’s algorithm optimal?
BBBV 1994: Yes, by a quantum argument
We’ll instead prove Grover is optimal using …
wait for it …
Given a Boolean function f, let deg(f) be the
minimum degree of a real polynomial p:RnR
such that
px   f x    x  0,1
n
Theorem (Nisan-Szegedy 1994):
 
deg ORn    n
Observation: Is that lower bound tight? Yes,
because of Grover’s algorithm!
To prove deg(OR)=(n), we need to
revisit our good friend Markov…
Theorem (Markov): If p is a degree-d real
polynomial, then
max p' x   d max px 
2
1 x 1
1 x 1
Another convenient form: for all n>0,
deg  p  
n max p '  x 
0 x  n
2 max p  x 
0 x  n
Markov’s inequality is tight.
The extremal cases are called the
Chebyshev polynomials:
Td x  cosd arccosx
Uhh … why is that a polynomial at all?
Td cos x   cos dx
 
 Re e idx

 Re cos x  i sin x 
d

which is a degree-d polynomial in cos x
y
x
y
x
y
x
y
x
y
x
y
x
y
x
y
x
y
x
y
x
Let p satisfy
px   ORx    x  0,1
n
We want to lower-bound deg(p)
Symmetrize:
qk   EX  px , degq   deg p 
x k
1
0
1    qk   1   k  1,, n
1
  1by
q' xSo
 2Markov’s
for some0 inequality,
 x 1
n max1  2 ,2c  1
degq  
 n
c
 
0
   q0  
One remaining problem: q(x) need not be
bounded at non-integer x
Solution: Notice max q'  x   2c  1, c  max q x 
0 x  n
0 x  n
Collision Problem
Illustrates the amazing reach of the polynomial
method
Problem: Given f:[n][n], decide whether f is 1-to-1
or 2-to-1, promised it’s one or the other
By the Birthday Paradox, ~n queries to f
are necessary and sufficient classically
[Brassard et al. 1997] gave a quantum algorithm
making O(n1/3) queries
[A. 2002]: Any quantum algorithm needs (n1/5)
queries. Improved to (n1/3) by Shi
Lower bound by polynomial method
1 if f x   h
Let x, h   
0 otherwise
Lemma (following Beals et al.): If a quantum
algorithm makes T queries to f, the probability p(f)
that it accepts is a degree-2T polynomial in the
(x,h)’s
Now let
qk  

p f 
k - to-1 functions f
EX
be the expected acceptance probability on a
random k-to-1 function
The Miracle:
q(k) is itself a polynomial in k, of
degree at most 2T
Why?
 nr 

n  d !
 r dh
  n / k  r 
EX
  xh , j , h  
r
k - to-1 functions f
n
/
k

r
h

1
j

1

 k!
 k  d h !
 n 

n!
n / k 
k!n / k
h 1
n/k


n  r !n  d !
k! n / k !n  n / k !


r
n!n!
n / k  r !n  n / k !k!n / k r  k  d h !
d1
d2
h 1
d3
n  r !n  d ! 

n!n!
n  r !n  d ! 

d
n!n!
h 1
 n  n   n





k
k

1

k

d

1


1


r

1

 
h

 k  k   k
 

 h 1
 
n  r !n  d ! 

n!n!
k!r  n / k !
r
n / k  r !
 k  d h !
r





k

1

k

d

1
nn  k  n  rk  k 
h

r
 h 1

which is a degree-d polynomial in k. That’s why.
Technicality: Need to deal with k not dividing n
Another Useful Hammernomial:
Bernstein’s Inequality
deg p  
max p' x  xn  x 
0 x  n
max px 
0 x  n
Application: Any quantum algorithm to compute the
MAJORITY of n bits requires (n) queries
Ouch, that really
hurts the degree!
Oh, and don’t forget the inequality of V.
A. Markov—A. A.’s younger brother!
px  
d k p  2e deg p  max

0 x  n
max k  

0 x  n dx
nk



2
k
Application [A. 2004]: Direct product theorem for
quantum search. After T queries, the probability that
a quantum algorithm finds K marked items out of N
is at most (cT2/N)K
0 1
K
N
3. POLYNOMIALS IN CIRCUIT
COMPLEXITY
Linial-Mansour-Nisan 1993: If a Boolean function f
is computable by an AC0 circuit of size s and depth k,
then we can find a degree-d real polynomial p such
that


EX n  px   f x   2s  2
x0,1
2
 d 1 / k / 20
Proof uses the Switching Lemma to upper-bound
high-degree Fourier coefficients
By Nisan-Szegedy, the above
theorem would be false if we wanted
|p(x)-f(x)| to be small for every x
Bazzi 2007: Let F=C1Cm be a DNF formula.
Then we can find degree-d real polynomials p and q
such that
n
px  F x  qx x 0,1
O 1    d 
EX qx   px   m 2
x0,1n
Implies that polylog-wise independent distributions
“fool” small DNFs.
The proof takes
64
pages
4. POLYNOMIALS EVERYWHERE
Polynomials in Oracle-Building
Beigel 1992: There exists an oracle relative to which
PNP  PP
Use the following problem: Given exponentially-long
integers x=x1…xN and y=y1…yN, is xy?
It’s in PNP, since we can use binary search to find the
leftmost i such that xiyi
But is there a low-degree polynomial p such that
if x  y
1
sgn  px1 ,, xN , y1 ,, yN   
?
 1 otherwise
Sure:
2
N 1
 
x1  2N 2 x2   20 xN  2N 1 y1  2N 2 y2   20 yN
But by clever repeated use of Markov’s inequality,
one can show that any such polynomial must take
on huge (doubly-exponentially-large) values
This means the problem can’t be in PP
[A. 2006] generalized Beigel’s result to give an
oracle relative to which PP has linear-size circuits
Requires handling many polynomials simultaneously

Slide of Guilt: The Polynomial Method
in Communication Complexity
Razborov 2002: Any quantum
protocol for the Disjointness
problem requires (n) qubits of
Razborov and Sherstov, this very FOCS:
communication
An AC0 function with large unbounded-error
communication complexity
Sherstov, this very FOCS:
Characterizes the unboundederror communication complexity
Chattopadhyay-Ada,
Lee-Shraibman 2008: Lower
of symmetric functions
bounds for the k-party communication complexity of
Disjointness in the Number-On-Forehead And
model
more!
Some Positive Uses of Polynomials
Harvey-Nelson-Onak, this very FOCS: Chebyshev
polynomials used to give a streaming algorithm for
approximating the Shannon entropy
Beigel-Reingold-Spielman 1991: PP is closed
under intersection
Future Direction 1: Beyond
Symmetrization
Find better techniques to lower-bound the degrees
of multivariate polynomials.


deg  


OR
AND AND
Upper bound: O(n)
 (from quantum algorithm)
n

AND  ?
Lower bound: (n1/3)
n 
 (can be proved using the
n1/3 collision lower bound)
deg(f)=O(deg(f)2) for all Boolean functions f?
Best known relation: deg(f)=O(deg(f)6) (Beals et al.)
Future Direction 2: Understanding
Bounded Real Polynomials
Conjecture. Let p:Rn[0,1] be a real polynomial of
degree d. Suppose EXx,y[|p(x)-p(y)|]=(1). Then there
exists an i[n] such that EXx[|p(x)-p(xi)|]=(1/poly(d)).
Would have major implications for quantum!
e.g., for P vs. BQP relative to a random oracle
Given a partial function f:S{0,1} (S{0,1}n), let deg(f)
be the minimum degree of a polynomial p such that
(1) 0p(x)1 for all x{0,1}n,
(2) |p(x)-f(x)| for all xS.
Is there a partial f for which deg(f) is exponentially
smaller than Q(f)?
Future Direction 3: Matrix- Valued
Polynomials
What Boolean functions can we approximate as
Would imply an oracle relative to which
 p11 x1 , , xn   p1m  x1 , , xn  
 SZKQMA (i.e., “there are no succinct

 pij   d ?
max  quantum

 for problems

, deg
proofs
like
graph
 p x , , x non-isomorphism”)





p
x
,

,
x
n
mm
1
n 
 m1 1
Conjecture. Suppose
max(A(x))[0,1] for all x{0,1}n
max(A(x))2/3 for all x encoding a 1-to-1 function
max(A(x))1/3 for all x encoding a 2-to-1 function
Then d2(d+log m)=(n).
Future Direction 4: Extending Bazzi’s
Theorem to AC0 (the Linial-Nisan Conjecture)
Problem: Given fAC0, construct polylog(n)-degree
polynomials p,q:RnR such that
px   f x   qx x  0,1 , EX n qx   px   
n
x0,1
If p,q have the further property that
p x  
q x  
  C x 
C
Clauses C
  C x 
C
Clauses C

C
C


P rC   On
P rC   O n
1 / 2 
1 / 2 
C


C
then we get an oracle relative to which BQPPH.
The polynomial method:
the choice of hardworking
American lowerboundsmen
I
approve!
y
x
OPEN PROBLEM