The Polynomial Method in Quantum and Classical Computing
Scott Aaronson (MIT)
Mar 27, 2015
Overview
The polynomial method: Just an awesome tool that every CS theorist should know about
Goes back to the prehistory of the field (1960’s), but also plays a major role in current work [including at this FOCS] on machine learning, quantum computing, circuit lower bounds, communication complexity…
Idea: Reduce CS questions to questions about the minimum degree of real polynomials
Easy to learn! “Look ma, no quantum”
This Talk: Just Some Basics
1. Polynomials in machine learning
- Perceptrons
2. Polynomials in quantum computing
- Optimality of Deutsch-Jozsa and Grover algorithms
- Collision lower bound
3. Polynomials in circuit complexity
- Linial-Mansour-Nisan and Bazzi
4. Polynomials everywhere!
- Communication complexity, oracles, streaming…
Stuff I wish I could cover but can’t for lack of time
- Polynomials over finite fields (Razborov-Smolensky)
- Reduction of communication problems to polynomials
- Sherstov’s pattern matrix method
- Deep connections to Fourier analysis
Our story starts in St. Petersburg, around 1889…
Dmitri Mendeleev (periodic table dude)
A. A. Markov (inequality dude)
$\Pr[X \ge k\,\mathrm{E}[X]] \le 1/k$
Привет! (Hi!) I proved a cool theorem: if p is a quadratic, then
$\max_{-1 \le x \le 1} |p'(x)| \le 4 \max_{-1 \le x \le 1} |p(x)|$
And what if p has degree d?
Uhh … you’re on your own
Markov did generalize Mendeleev’s bound to arbitrary degree (about which more later)
He thereby helped start a field called approximation theory
Approximation theory is a proto-complexity theory!
Real polynomials = Model of computation
Degree = Complexity measure
So, maybe not so surprising that it ends up being related to actual complexity theory…
1. POLYNOMIALS IN MACHINE LEARNING
Fast-forward to 1969…
Bill Ayers was working for the McCain’08 campaign
And AI researchers were studying perceptrons
A perceptron of order k is a Boolean function $f:\{0,1\}^n \to \{0,1\}$ that’s a threshold of subfunctions $f_1, f_2, \ldots, f_m$ on at most k variables each:
$f = 1$ if $\sum_{i=1}^{m} \alpha_i f_i \ge \theta$, and $f = 0$ otherwise
Minsky and Papert: Small perceptrons have serious limitations!
Suppose $f:\{0,1\}^n \to \{0,1\}$ is represented by an order-k perceptron
Then there’s clearly a degree-k polynomial $p:\mathbb{R}^n \to \mathbb{R}$ such that for all $x_1,\ldots,x_n \in \{0,1\}$,
$\mathrm{sgn}(p(x_1,\ldots,x_n)) = f(x_1,\ldots,x_n)$
Furthermore, without loss of generality p is multilinear: no variable raised to higher power than 1
Application: “killed neural net research for a decade”
Example: The PARITY function
Suppose $\mathrm{sgn}(p(x_1,\ldots,x_n)) = x_1 \oplus \cdots \oplus x_n$
for all $x_1,\ldots,x_n \in \{0,1\}$. Then what can we say about deg(p)?
Theorem: $\deg(p) \ge n$
Key idea: Symmetrization
Replace multivariate polynomials by univariate ones, which are easier to understand
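Let $q(k) := \mathrm{EX}_{x_1+\cdots+x_n=k}\,[p(x_1,\ldots,x_n)]$
Key Lemma: q(k) is itself a polynomial in k, of degree at most d = deg(p)
How Symmetrization Works
Let $p(x_1,\ldots,x_n) = \sum_{S \subseteq [n],\ |S| \le d} \alpha_S \prod_{i \in S} x_i$
Proof: By linearity of expectation,
$q(k) = \sum_{S \subseteq [n],\ |S| \le d} \alpha_S\, \mathrm{EX}_{x_1+\cdots+x_n=k}\Big[\prod_{i \in S} x_i\Big]$
$\mathrm{EX}_{x_1+\cdots+x_n=k}\Big[\prod_{i \in S} x_i\Big] = \binom{n-|S|}{k-|S|} \Big/ \binom{n}{k} = \frac{(n-|S|)!}{(k-|S|)!\,(n-k)!} \cdot \frac{k!\,(n-k)!}{n!} = \frac{(n-|S|)!}{n!}\, k(k-1)\cdots(k-|S|+1)$
which is a degree-|S| polynomial in k.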
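A quick sanity check of the Key Lemma, as a minimal Python sketch. The helper names (`random_multilinear`, `q_bruteforce`, `q_formula`) and the random integer coefficients are illustrative assumptions, not from the talk. It compares the brute-force average of p over each Hamming-weight layer with the closed form from the proof.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial
import random

def random_multilinear(n, d, seed=0):
    """Random integer coefficients alpha_S for every S in [n] with |S| <= d (illustrative)."""
    rng = random.Random(seed)
    return {S: rng.randint(-5, 5)
            for size in range(d + 1)
            for S in combinations(range(n), size)}

def evaluate(poly, x):
    # p(x) = sum_S alpha_S * prod_{i in S} x_i, for x in {0,1}^n
    return sum(a for S, a in poly.items() if all(x[i] for i in S))

def q_bruteforce(poly, n, k):
    """Average of p over all x in {0,1}^n with exactly k ones."""
    total = 0
    for ones in combinations(range(n), k):
        x = [0] * n
        for i in ones:
            x[i] = 1
        total += evaluate(poly, x)
    return Fraction(total, comb(n, k))

def q_formula(poly, n, k):
    """sum_S alpha_S * (n-|S|)!/n! * k(k-1)...(k-|S|+1), as in the proof."""
    total = Fraction(0)
    for S, a in poly.items():
        falling = 1
        for j in range(len(S)):
            falling *= (k - j)          # vanishes automatically when k < |S|
        total += Fraction(a * factorial(n - len(S)) * falling, factorial(n))
    return total
```

Since each term of `q_formula` is a polynomial in k of degree |S| ≤ d, agreement with `q_bruteforce` at k = 0,…,n is exactly the statement of the lemma on the integer points.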
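So, suppose there’s an order-k perceptron computing the parity of n bits
Then there’s a degree-k multilinear polynomial p such that
$\mathrm{sgn}(p(x_1,\ldots,x_n)) = x_1 \oplus \cdots \oplus x_n$
Hence there’s a univariate polynomial q of degree at most k such that for all j=0,…,n (writing j for the Hamming weight, since k is already the perceptron’s order),
$q(j) \ge 0$ if j is odd
$q(j) < 0$ if j is even
Such a q must have degree at least n, so k ≥ n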
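One way to see why such a sign pattern forces degree n is through finite differences; here is a small Python sketch. The sign convention (nonnegative at odd points, negative at even points) is an assumption, since the inequality signs are not legible in the source, and the function names are made up for illustration. A polynomial of degree below n has vanishing n-th forward difference, whereas for a parity-like sequence every term of that difference has the same sign and the j=0 term is nonzero.

```python
from math import comb

def nth_forward_difference(values):
    """Delta^n q(0) = sum_j (-1)^(n-j) * C(n, j) * q(j), for values q(0), ..., q(n)."""
    n = len(values) - 1
    return sum((-1) ** (n - j) * comb(n, j) * v for j, v in enumerate(values))

def parity_like(values):
    """True if values[j] >= 0 for odd j and values[j] < 0 for even j (assumed convention)."""
    return all((v >= 0) if j % 2 else (v < 0) for j, v in enumerate(values))

# A degree-4 polynomial sampled at 0..5 has zero 5th difference...
low_degree = [j ** 4 for j in range(6)]
# ...but a parity-like sequence on 0..5 cannot.
alternating = [-3, 0, -1, 7, -2, 5]
```

So no polynomial of degree less than n can take the values in `alternating` (or any other parity-like values) at the points 0,…,n.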
2. POLYNOMIALS IN QUANTUM COMPUTING
Quantum Query Model In One Slide
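Quantum state: Unit vector in $\mathbb{C}^n$, with amplitudes $(\alpha_1,\ldots,\alpha_n)$
What are the allowed operations?
- Initialize vector of amplitudes
- Apply a unitary transformation: $\begin{pmatrix}\alpha_1\\ \vdots\\ \alpha_n\end{pmatrix} \to \begin{pmatrix}u_{11} & \cdots & u_{1n}\\ \vdots & \ddots & \vdots\\ u_{n1} & \cdots & u_{nn}\end{pmatrix}\begin{pmatrix}\alpha_1\\ \vdots\\ \alpha_n\end{pmatrix}$
- Query the input bits: $\begin{pmatrix}\alpha_1\\ \vdots\\ \alpha_n\end{pmatrix} \to \begin{pmatrix}(1-2x_1)\,\alpha_1\\ \vdots\\ (1-2x_n)\,\alpha_n\end{pmatrix}$
- “Measure”: Outcome i observed with probability $|\alpha_i|^2$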
One further detail: The quantum state can have more than n dimensions, with multiple components querying each $x_i$, as well as components that don’t make queries at all
Complexity Measure: Q(f) = minimum number of queries needed to compute a Boolean function f with probability 2/3, on all inputs $x=x_1\ldots x_n$
Example: The Deutsch-Jozsa Algorithm
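Does something spectacular: Computes the XOR of two bits with one oracle call!
$\begin{pmatrix}\frac{1}{\sqrt2} & \frac{1}{\sqrt2}\\ \frac{1}{\sqrt2} & -\frac{1}{\sqrt2}\end{pmatrix}\begin{pmatrix}1-2x_1 & 0\\ 0 & 1-2x_2\end{pmatrix}\begin{pmatrix}\frac{1}{\sqrt2}\\ \frac{1}{\sqrt2}\end{pmatrix} = \begin{pmatrix}1-x_1-x_2\\ x_2-x_1\end{pmatrix}$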
By computing $x_1 \oplus x_2$, $x_3 \oplus x_4$, etc., can compute the parity of n bits with n/2 oracle calls
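The two-bit case can be simulated in a few lines. This is a minimal sketch under the simple query model of the previous slide (two amplitudes, a query multiplies the i-th amplitude by 1-2x_i); the function name is illustrative.

```python
from math import sqrt, isclose

def deutsch_jozsa_two_bits(x1, x2):
    """Final amplitudes after a single query to the bits x1, x2."""
    s = 1 / sqrt(2)
    amps = [s, s]                                             # uniform initial state
    amps = [(1 - 2 * x1) * amps[0], (1 - 2 * x2) * amps[1]]   # one query: phase flips
    # Hadamard-type unitary with rows (s, s) and (s, -s)
    return [s * (amps[0] + amps[1]), s * (amps[0] - amps[1])]

# Probability of observing the second outcome, for each input
xor_probability = {(x1, x2): deutsch_jozsa_two_bits(x1, x2)[1] ** 2
                   for x1 in (0, 1) for x2 in (0, 1)}
```

The second amplitude works out to x2 - x1, so the second outcome is observed with probability 1 exactly when the two bits differ.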
Is that optimal?
Lemma (Beals et al. 1998): If a quantum algorithm makes T queries, its probability of accepting is a degree-2T multilinear polynomial over the $x_i$’s
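Right-to-Left Proof:
$\cdots\, D_x\, U\, D_x \begin{pmatrix}\alpha_1\\ \vdots\\ \alpha_n\end{pmatrix}$, where $U = (u_{ij})$ is a unitary and $D_x = \mathrm{diag}(1-2x_1,\ldots,1-2x_n)$ is a query
- After the first query: entries are now degree-1 polynomials over the $x_i$’s
- After the unitary: still degree-1 polynomials
- After the second query: degree-2 polynomials
- After T queries: degree-T polynomials
Then $\sum_{\text{accepting } i} |\alpha_i|^2$ has degree 2T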
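The degree bookkeeping in this proof can be replayed symbolically. The sketch below is an illustration under simplifying assumptions that are not from the talk: real orthogonal matrices (random Householder reflections) stand in for general unitaries, the state has exactly n amplitudes, and polynomials are stored as dictionaries from sets of variable indices to coefficients, reduced with x_i^2 = x_i.

```python
import math
import random
from itertools import product

def mul(p, q):
    """Product of multilinear polynomials, using x_i^2 = x_i on {0,1}."""
    out = {}
    for S, a in p.items():
        for T, b in q.items():
            out[S | T] = out.get(S | T, 0.0) + a * b
    return out

def degree(p, tol=1e-9):
    return max((len(S) for S, a in p.items() if abs(a) > tol), default=0)

def evaluate(p, x):
    return sum(a for S, a in p.items() if all(x[i] for i in S))

def query(amps):
    """alpha_i -> (1 - 2 x_i) * alpha_i for every i."""
    return [mul(a, {frozenset(): 1.0, frozenset([i]): -2.0}) for i, a in enumerate(amps)]

def apply_matrix(U, amps):
    out = []
    for row in U:
        acc = {}
        for u, poly in zip(row, amps):
            for S, a in poly.items():
                acc[S] = acc.get(S, 0.0) + u * a
        out.append(acc)
    return out

def householder(n, rng):
    """Random real orthogonal matrix I - 2 v v^T / (v.v) (stand-in for a unitary)."""
    v = [rng.gauss(0, 1) for _ in range(n)]
    norm2 = sum(t * t for t in v)
    return [[(1.0 if r == c else 0.0) - 2 * v[r] * v[c] / norm2 for c in range(n)]
            for r in range(n)]

def acceptance_polynomial(n, T, accepting, seed=0):
    """Acceptance probability, as a multilinear polynomial, after T queries."""
    rng = random.Random(seed)
    amps = [{frozenset(): 1 / math.sqrt(n)} for _ in range(n)]
    for _ in range(T):
        amps = apply_matrix(householder(n, rng), query(amps))
    acc = {}
    for i in accepting:
        for S, a in mul(amps[i], amps[i]).items():   # real amplitudes: |a|^2 = a^2
            acc[S] = acc.get(S, 0.0) + a
    return acc
```

With n = 5 and T = 2, the resulting polynomial has degree at most 4 even though 5 variables are available, and it evaluates to a value in [0,1] on every Boolean input, as a probability should.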
Implication: If a quantum algorithm computed $x_1 \oplus \cdots \oplus x_n$ with <n/2 queries, it would lead to a polynomial approximating PARITY with degree <n. Hence Deutsch-Jozsa must be optimal!
Another Famous Quantum Algorithm: Grover’s
Computes the OR of n bits using O(√n) queries
Is Grover’s algorithm optimal?
BBBV 1994: Yes, by a quantum argument
We’ll instead prove Grover is optimal using … wait for it …
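Given a Boolean function f, let $\widetilde{\deg}(f)$ be the minimum degree of a real polynomial $p:\mathbb{R}^n \to \mathbb{R}$ such that
$|p(x) - f(x)| \le \varepsilon$ for all $x \in \{0,1\}^n$ (for a fixed constant ε < 1/2, e.g. ε = 1/3)
Theorem (Nisan-Szegedy 1994): $\widetilde{\deg}(\mathrm{OR}_n) = \Omega(\sqrt{n})$
Observation: Is that lower bound tight? Yes, because of Grover’s algorithm!
To prove $\widetilde{\deg}(\mathrm{OR}) = \Omega(\sqrt{n})$, we need to revisit our good friend Markov…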
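Theorem (Markov): If p is a degree-d real polynomial, then
$\max_{-1 \le x \le 1} |p'(x)| \le d^2 \max_{-1 \le x \le 1} |p(x)|$
Another convenient form: for all n>0,
$\deg(p) \ge \sqrt{\dfrac{n \max_{0 \le x \le n} |p'(x)|}{2 \max_{0 \le x \le n} |p(x)|}}$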
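Markov’s inequality is tight. The extremal cases are called the Chebyshev polynomials:
$T_d(x) = \cos(d \arccos x)$
Uhh … why is that a polynomial at all?
$T_d(\cos x) = \cos(dx) = \mathrm{Re}\big(e^{idx}\big) = \mathrm{Re}\big((\cos x + i \sin x)^d\big)$
which is a degree-d polynomial in cos x (only even powers of sin x survive in the real part, and $\sin^2 x = 1 - \cos^2 x$)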
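To see the Chebyshev polynomials as honest polynomials, and to see that they saturate Markov’s bound, here is a short Python sketch. It builds the coefficients with the standard three-term recurrence T_{j+1}(x) = 2x T_j(x) - T_{j-1}(x), which is a well-known fact but not something stated on the slide; the function names are illustrative.

```python
import math

def chebyshev_coeffs(d):
    """Integer coefficients of T_d, lowest degree first."""
    prev, cur = [1], [0, 1]                      # T_0 = 1, T_1 = x
    if d == 0:
        return prev
    for _ in range(d - 1):
        nxt = [0] + [2 * c for c in cur]         # 2x * T_j
        for i, c in enumerate(prev):             # minus T_{j-1}
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur

def evaluate(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def derivative(coeffs):
    return [i * c for i, c in enumerate(coeffs)][1:]

# T_d'(1) for small d; Markov's bound says this can be at most d^2 * max|T_d| = d^2
slopes_at_one = {d: evaluate(derivative(chebyshev_coeffs(d)), 1) for d in range(1, 8)}
```

Since |T_d| ≤ 1 on [-1,1] while the derivative at x = 1 equals d^2, the inequality holds with equality for these polynomials.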
[Figure: sequence of ten plots, each with axes x and y]
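Let p satisfy $|p(x) - \mathrm{OR}(x)| \le \varepsilon$ for all $x \in \{0,1\}^n$
We want to lower-bound deg(p)
Symmetrize: $q(k) := \mathrm{EX}_{|x|=k}[p(x)]$, $\deg(q) \le \deg(p)$
$q(0) \le \varepsilon$
$1-\varepsilon \le q(k) \le 1+\varepsilon$ for $k = 1,\ldots,n$
$q'(x) \ge 1-2\varepsilon$ for some $0 \le x \le 1$
One remaining problem: q(x) need not be bounded at non-integer x
Solution: Notice that if $\max_{0 \le x \le n} |q(x)| = c$, then $\max_{0 \le x \le n} |q'(x)| \ge 2(c-1-\varepsilon)$, since |q| is at most $1+\varepsilon$ at the nearest integer, which is at distance at most 1/2
So by Markov’s inequality,
$\deg(q) \ge \sqrt{\dfrac{n \max\{1-2\varepsilon,\ 2(c-1-\varepsilon)\}}{2c}} = \Omega(\sqrt{n})$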
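For comparison with the lower bound, one can also check numerically that degree about √n suffices. The sketch below uses a standard Chebyshev-based construction, which is an assumption brought in for illustration (the talk gets tightness from Grover’s algorithm instead): q(k) = 1 - T_d((n-k)/(n-1)) / T_d(n/(n-1)). The multivariate polynomial is then p(x) = q(x_1+…+x_n).

```python
import math

def cheb(d, y):
    """T_d(y) for real y, via the three-term recurrence."""
    prev, cur = 1.0, y
    if d == 0:
        return prev
    for _ in range(d - 1):
        prev, cur = cur, 2 * y * cur - prev
    return cur

def or_approximation(n, d):
    """Values q(0), ..., q(n) of the degree-d candidate (needs n >= 2)."""
    top = cheb(d, n / (n - 1))
    return [1 - cheb(d, (n - k) / (n - 1)) / top for k in range(n + 1)]

def max_error(n, d):
    q = or_approximation(n, d)
    return max(abs(q[k] - (1 if k > 0 else 0)) for k in range(n + 1))

def smallest_degree(n, eps=1/3):
    """Least d for which this particular construction has error at most eps."""
    d = 1
    while max_error(n, d) > eps:
        d += 1
    return d
```

Here q(0) = 0 exactly, and for k ≥ 1 the argument (n-k)/(n-1) lies in [0,1], where |T_d| ≤ 1, so the error is at most 1/T_d(n/(n-1)). Calling `smallest_degree(n)` for growing n shows the required degree growing roughly like a constant times √n.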
Collision Problem
Problem: Given $f:[n] \to [n]$, decide whether f is 1-to-1 or 2-to-1, promised it’s one or the other
By the Birthday Paradox, ~√n queries to f are necessary and sufficient classically
[Brassard et al. 1997] gave a quantum algorithm making O(n^{1/3}) queries
[A. 2002]: Any quantum algorithm needs Ω(n^{1/5}) queries. Improved to Ω(n^{1/3}) by Shi
Illustrates the amazing reach of the polynomial method
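The classical √n figure can be checked exactly. In this sketch (an illustration; the function name and the model of querying t distinct, uniformly random points are assumptions) the t queried points avoid a collision only if no two of them come from the same pair of the 2-to-1 function.

```python
from fractions import Fraction
from math import comb

def collision_probability(n, t):
    """Chance that t distinct random points of a 2-to-1 function on [n] (n even)
    contain two points with the same value."""
    # No collision: pick t of the n/2 pairs, then one of the 2 points in each.
    no_collision = Fraction(comb(n // 2, t) * 2 ** t, comb(n, t))
    return 1 - no_collision

n = 10 ** 4
far_below_sqrt_n = collision_probability(n, 10)     # t much smaller than sqrt(n) = 100
around_sqrt_n = collision_probability(n, 100)
few_times_sqrt_n = collision_probability(n, 300)
```

With far fewer than √n queries a collision is very unlikely, while a small multiple of √n queries finds one almost surely.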
Lower bound by polynomial method
Let $\Delta(x,h) = 1$ if $f(x) = h$, and $\Delta(x,h) = 0$ otherwise
Lemma (following Beals et al.): If a quantum algorithm makes T queries to f, the probability p(f) that it accepts is a degree-2T polynomial in the Δ(x,h)’s
Now let $q(k) := \mathrm{EX}_{f\ k\text{-to-}1}[p(f)]$
be the expected acceptance probability on a random k-to-1 function
The Miracle:
q(k) is itself a polynomial in k, of degree at most 2T
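Why?
Take a monomial $\prod_{j=1}^{d} \Delta(x_j,h_j)$ with distinct $x_j$, in which r distinct values h occur, the i-th of them $d_i$ times ($d_1+\cdots+d_r = d$). Then
$\mathrm{EX}_{f\ k\text{-to-}1}\Big[\prod_{j=1}^{d} \Delta(x_j,h_j)\Big] = \dfrac{\binom{n-r}{n/k-r}}{\binom{n}{n/k}} \cdot \dfrac{(n-d)!}{n!} \cdot \prod_{i=1}^{r} \dfrac{k!}{(k-d_i)!}$
$= \dfrac{(n-r)!\,(n-d)!}{(n!)^2} \cdot \dfrac{(n/k)!}{(n/k-r)!} \cdot \prod_{i=1}^{r} \dfrac{k!}{(k-d_i)!}$
$= \dfrac{(n-r)!\,(n-d)!}{(n!)^2} \cdot \prod_{\ell=0}^{r-1} (n-\ell k) \cdot \prod_{i=1}^{r} \prod_{\ell=1}^{d_i-1} (k-\ell)$
which is a polynomial in k of degree at most d. That’s why.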
Technicality: Need to deal with k not dividing n
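The “miracle” can be tested by brute force on tiny cases. The sketch below is an illustration under stated assumptions: k divides n, the constrained inputs x are distinct, and the closed form used is (n-r)!(n-d)!/(n!)^2 · ∏_{l<r}(n - l·k) · ∏_i ∏_{1≤l<d_i}(k - l), where r is the number of distinct target values and d_i their multiplicities. Function names are made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial

def k_to_1_functions(n, k):
    """All f:[n]->[n] (as tuples) in which every value that occurs is taken exactly k times."""
    return [f for f in product(range(n), repeat=n)
            if all(c == k for c in Counter(f).values())]

def expectation_bruteforce(n, k, constraints):
    """Fraction of k-to-1 functions with f(x) = h for every (x, h) in constraints."""
    fs = k_to_1_functions(n, k)
    good = sum(all(f[x] == h for x, h in constraints) for f in fs)
    return Fraction(good, len(fs))

def expectation_formula(n, k, constraints):
    """Closed form; a polynomial in k once n and the constraint pattern are fixed."""
    mult = Counter(h for _, h in constraints)      # d_i for each distinct value h
    d, r = len(constraints), len(mult)
    value = Fraction(factorial(n - r) * factorial(n - d), factorial(n) ** 2)
    for l in range(r):
        value *= (n - l * k)
    for d_i in mult.values():
        for l in range(1, d_i):
            value *= (k - l)
    return value
```

For example, with n = 4 and k = 2, requiring f(0) = f(1) = 0 leaves 3 of the 36 two-to-one functions, and the formula gives the same 1/12.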
Another Useful Hammernomial: Bernstein’s Inequality
$\deg(p) \ge \dfrac{\max_{0 \le x \le n} \sqrt{x(n-x)}\, |p'(x)|}{\max_{0 \le x \le n} |p(x)|}$
Application: Any quantum algorithm to compute the MAJORITY of n bits requires Ω(n) queries
Ouch, that really hurts the degree!
Oh, and don’t forget the inequality of V. A. Markov, A. A.’s younger brother!
$\max_{0 \le x \le n} \Big|\dfrac{d^k p}{dx^k}\Big| \le \Big(\dfrac{e\,\deg(p)^2}{nk}\Big)^k \max_{0 \le x \le n} |p(x)|$
Application [A. 2004]: Direct product theorem for quantum search. After T queries, the probability that a quantum algorithm finds K marked items out of N is at most $(cT^2/N)^K$
3. POLYNOMIALS IN CIRCUIT COMPLEXITY
Linial-Mansour-Nisan 1993: If a Boolean function f is computable by an $\mathrm{AC}^0$ circuit of size s and depth k, then we can find a degree-d real polynomial p such that
$\mathrm{EX}_{x \in \{0,1\}^n}\big[(p(x)-f(x))^2\big] \le 2s \cdot 2^{-d^{1/k}/20}$
Proof uses the Switching Lemma to upper-bound high-degree Fourier coefficients
By Nisan-Szegedy, the above theorem would be false if we wanted |p(x)-f(x)| to be small for every x
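Bazzi 2007: Let $F = C_1 \vee \cdots \vee C_m$ be a DNF formula. Then we can find degree-d real polynomials p and q such that
$p(x) \le F(x) \le q(x)$ for all $x \in \{0,1\}^n$
$\mathrm{EX}_{x \in \{0,1\}^n}[q(x)-p(x)] \le m^{O(1)}\, 2^{-\Omega(\sqrt{d})}$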
Implies that polylog-wise independent distributions “fool” small DNFs.
The proof takes 64 pages
[Razborov 2008]
4. POLYNOMIALS EVERYWHERE
Polynomials in Oracle-Building
Beigel 1992: There exists an oracle relative to which $\mathrm{P}^{\mathrm{NP}} \not\subset \mathrm{PP}$
Use the following problem: Given exponentially-long integers $x = x_1 \ldots x_N$ and $y = y_1 \ldots y_N$, is $x \ge y$?
It’s in $\mathrm{P}^{\mathrm{NP}}$, since we can use binary search to find the leftmost i such that $x_i \ne y_i$
But is there a low-degree polynomial p such that
$\mathrm{sgn}(p(x_1,\ldots,x_N,y_1,\ldots,y_N)) = 1$ if $x \ge y$, and $-1$ otherwise?
Sure:
$p = 2^{N-1}x_1 + 2^{N-2}x_2 + \cdots + 2^0 x_N - 2^{N-1}y_1 - 2^{N-2}y_2 - \cdots - 2^0 y_N$
But by clever repeated use of Markov’s inequality, one can show that any such polynomial must take on huge (doubly-exponentially-large) values
This means the problem can’t be in PP
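The degree-1 polynomial is easy to check exhaustively for small N. This is a minimal sketch, assuming x_1 and y_1 are the most significant bits and taking sgn(0) = 1 so that x = y counts as x ≥ y; the function names are illustrative.

```python
from itertools import product

def comparison_polynomial(x_bits, y_bits):
    """p = sum_i 2^(N-i) * (x_i - y_i), i.e. the integer x minus the integer y."""
    N = len(x_bits)
    return sum(2 ** (N - 1 - i) * (x - y)
               for i, (x, y) in enumerate(zip(x_bits, y_bits)))

def sgn(v):
    return 1 if v >= 0 else -1      # assumed convention: sgn(0) = 1

def largest_coefficient(N):
    return 2 ** (N - 1)
```

The catch is visible in `largest_coefficient`: with N itself exponential in the input length of the oracle problem, the coefficients and values are doubly exponential.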
[A. 2006] generalized Beigel’s result to give an oracle relative to which PP has linear-size circuits
Requires handling many polynomials simultaneously
Slide of Guilt: The Polynomial Method in Communication Complexity
Razborov 2002: Any quantum protocol for the Disjointness problem requires Ω(√n) qubits of communication
Razborov and Sherstov, this very FOCS: An $\mathrm{AC}^0$ function with large unbounded-error communication complexity
Sherstov, this very FOCS: Characterizes the unbounded-error communication complexity of symmetric functions
Chattopadhyay-Ada, Lee-Shraibman 2008: Lower bounds for the k-party communication complexity of Disjointness in the Number-On-Forehead model
And more!
Some Positive Uses of Polynomials
Harvey-Nelson-Onak, this very FOCS: Chebyshev polynomials used to give a streaming algorithm for approximating the Shannon entropy
Beigel-Reingold-Spielman 1991: PP is closed under intersection
Future Direction 1: Beyond Symmetrization
Find better techniques to lower-bound the degrees of multivariate polynomials.
$\widetilde{\deg}$(OR of √n ANDs, each on √n bits) = ?
Upper bound: O(√n) (from quantum algorithm)
Lower bound: Ω(n^{1/3}) (can be proved using the n^{1/3} collision lower bound)
$\deg(f) = O(\widetilde{\deg}(f)^2)$ for all Boolean functions f?
Best known relation: $\deg(f) = O(\widetilde{\deg}(f)^6)$ (Beals et al.)
Future Direction 2: Understanding Bounded Real Polynomials
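Conjecture. Let p be a real polynomial of degree d with $p(x) \in [0,1]$ for all $x \in \{0,1\}^n$. Suppose $\mathrm{EX}_{x,y}[|p(x)-p(y)|] = \Omega(1)$. Then there exists an $i \in [n]$ such that $\mathrm{EX}_x[|p(x)-p(x^i)|] = \Omega(1/\mathrm{poly}(d))$, where $x^i$ is x with the i-th bit flipped.
Given a partial function $f:S \to \{0,1\}$ ($S \subseteq \{0,1\}^n$), let $\widetilde{\deg}(f)$ be the minimum degree of a polynomial p such that
(1) $0 \le p(x) \le 1$ for all $x \in \{0,1\}^n$,
(2) $|p(x)-f(x)| \le \varepsilon$ for all $x \in S$.
Is there a partial f for which $\widetilde{\deg}(f)$ is exponentially smaller than Q(f)?
Would have major implications for quantum! E.g., for P vs. BQP relative to a random oracle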
Future Direction 3: Matrix- Valued Polynomials
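What Boolean functions can we approximate as
$\lambda_{\max}(A(x))$, where $A(x) = \begin{pmatrix}p_{11}(x_1,\ldots,x_n) & \cdots & p_{1m}(x_1,\ldots,x_n)\\ \vdots & \ddots & \vdots\\ p_{m1}(x_1,\ldots,x_n) & \cdots & p_{mm}(x_1,\ldots,x_n)\end{pmatrix}$ and $\deg(p_{ij}) \le d$?
Conjecture. Suppose
$\lambda_{\max}(A(x)) \in [0,1]$ for all $x \in \{0,1\}^n$
$\lambda_{\max}(A(x)) \ge 2/3$ for all x encoding a 1-to-1 function
$\lambda_{\max}(A(x)) \le 1/3$ for all x encoding a 2-to-1 function
Then $d^2(d+\log m) = \Omega(n)$.
Would imply an oracle relative to which $\mathrm{SZK} \not\subset \mathrm{QMA}$ (i.e., “there are no succinct quantum proofs for problems like graph non-isomorphism”)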
Future Direction 4: Extending Bazzi’s Theorem to AC0 (the Linial-Nisan Conjecture)
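Problem: Given $f \in \mathrm{AC}^0$, construct polylog(n)-degree polynomials $p,q:\mathbb{R}^n \to \mathbb{R}$ such that
$p(x) \le f(x) \le q(x)$ for all $x \in \{0,1\}^n$, and $\mathrm{EX}_{x \in \{0,1\}^n}[q(x)-p(x)] \le \varepsilon$
If p,q have the further property that
$p(x) = \sum_{C \in \mathrm{Clauses}} \alpha_C\, C(x)$ and $q(x) = \sum_{C \in \mathrm{Clauses}} \beta_C\, C(x)$, with the weighted coefficient sums $\sum_C |\alpha_C| \Pr[C]$ and $\sum_C |\beta_C| \Pr[C]$ suitably bounded in terms of n,
then we get an oracle relative to which $\mathrm{BQP} \not\subset \mathrm{PH}$.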
The polynomial method: the choice of hardworking American lowerboundsmen
I approve!