Rounding Sum of Squares Relaxations Boaz Barak – Microsoft Research Joint work with Jonathan Kelner (MIT) and David Steurer (Cornell) workshop on semidefinite programming and graph algorithms February 10-14, 2014

Dec 17, 2015


Kory Cameron

Rounding Sum of Squares Relaxations

Boaz Barak – Microsoft Research

Joint work with Jonathan Kelner (MIT) and David Steurer (Cornell)

workshop on semidefinite programming and graph algorithms

February 10-14, 2014


This talk is about

• Semidefinite programming, SOS/Positivstellensatz method

• Proof complexity

• The Unique Games Conjecture

• Graph partitioning, small set expansion

• Machine Learning

• Cryptography (in spirit).


Sum-of-Squares (SOS) Algorithm [Shor ’87, Parrilo ’00, Nesterov ’00, Lasserre ’01]

Motivation: Sometimes a polynomial can have exponentially many local minima…

E.g.

… but there is still a short proof that

E.g.

… and this proof can be found efficiently via semidefinite programming (SDP)

SOS Algorithm: For low-degree $P, P_1, \dots, P_k$ we consider the program:

$$\max_{x \in \mathbb{R}^n} P(x) \quad \text{s.t.} \quad P_1(x) = \cdots = P_k(x) = 0$$

SOS proof that $\nu$ is an upper bound on $P$: polynomials $Q_1, \dots, Q_k$ and SOS polynomials $S, S'$ s.t.

$$(\nu - P)\,S = \sum_i P_i Q_i + S' + 1$$

Degree of proof: max degree of the polynomials appearing in this identity [Grigoriev-Vorobjov ’99]

Theorem: [Shor ’87, Parrilo ’00, Nesterov ’00, Lasserre ’01]

1) A proof of degree $d$ can be found in $n^{O(d)}$ time.

2) Can find in $n^{O(d)}$ time the minimum $\nu$ s.t. there is a degree-$d$ proof that $\nu$ is an upper bound on $P$.

Positivstellensatz: All true bounds have an SOS proof. [Artin ’27, Krivine ’64, Stengle ’74]

Can optimize in $n^{O(d)}$ time over programs with degree-$d$ proofs.
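To make the certificate identity concrete, here is a minimal sketch on a toy program that is not from the talk: maximize $P(x) = x$ subject to $P_1(x) = x^2 - 1 = 0$. The choices $\nu = 2$, $S = 1$, $Q_1 = -1/2$ and $S' = \frac{1}{2}(x-1)^2$ are illustrative assumptions. The code only checks numerically that they satisfy $(\nu - P)S = P_1 Q_1 + S' + 1$, which is a degree-2 SOS proof that 2 is an upper bound on $x$ over the feasible set $\{-1, +1\}$.

```python
# Toy program (hypothetical, not from the talk):
#   maximize P(x) = x   subject to   P1(x) = x^2 - 1 = 0
NU = 2.0  # the claimed upper bound on P


def P(x):
    return x


def P1(x):
    return x * x - 1.0


def S(x):
    # SOS multiplier on the left-hand side (the constant 1 = 1^2)
    return 1.0


def Q1(x):
    # arbitrary (not necessarily SOS) multiplier for the constraint
    return -0.5


def S_prime(x):
    # SOS polynomial: (1/sqrt(2) * (x - 1))^2
    return 0.5 * (x - 1.0) ** 2


def certificate_residual(x):
    """(NU - P) * S  minus  (P1 * Q1 + S' + 1); zero for every real x."""
    lhs = (NU - P(x)) * S(x)
    rhs = P1(x) * Q1(x) + S_prime(x) + 1.0
    return lhs - rhs


# The identity is between polynomials, so it must hold at every point,
# not only on the feasible set.
max_residual = max(abs(certificate_residual(t / 10.0)) for t in range(-50, 51))
print("max residual on test points:", max_residual)
```

Since the identity holds for all real $x$, plugging in a feasible point kills the $P_1 Q_1$ term and leaves $(\nu - P)S = S' + 1 \ge 1$, which is why the certificate forces $P$ below $\nu$ there.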

Program:

$$\max_{x \in \mathbb{R}^n} P(x) \quad \text{s.t.} \quad P_1(x) = \cdots = P_k(x) = 0$$

SOS proof that $\nu$ is an upper bound on $P$:

$$(\nu - P)\,S = \sum_i P_i Q_i + S' + 1$$

Can optimize in $n^{O(d)}$ time over programs with degree-$d$ proofs.

Can’t hope for low-degree proofs always: Captures SAT, CLIQUE, 3COL, MAX-CUT, etc…

But maybe often? Essentially only one (robust) lower bound is known [Grigoriev ’01]

Applications:

• Optimizing polynomials w/ non-negative coefficients over sphere.

• Algorithms for quantum separability problem [Brandao-Harrow ’13]

• Sparse coding: learning dictionaries beyond the barrier.

• Finding sparse vectors in subspaces.

• Approach to refute the Unique Games Conjecture.

This talk: General method to analyze the SOS algorithm. [B-Kelner-Steurer ’13]

Program:

$$\max_{x \in \mathbb{R}^n} P(x) \quad \text{s.t.} \quad P_1(x) = \cdots = P_k(x) = 0$$

Finding a maximizer is hard. We consider an easier problem:

“Finding a needle in a needle-stack”

Given many $x$’s maximizing $P$, find a single $x$ with value close to the maximum.

(Multi)set $S$ of near-maximizing $x$’s → Combiner → single $x$ with value close to the maximum

Non-trivial combiner: Only depends on low-degree marginals of $S$:

$$\{\mathbb{E}_{x \sim S}\, x_{i_1} \cdots x_{i_k}\}_{i_1, \dots, i_k \in [n]}$$

[B-Kelner-Steurer ’13]: Transform “simple” non-trivial combiners to an algorithm for the original problem.

Idea in a nutshell: Simple combiners will output a solution even when fed “fake marginals”.

Pseudoexpectations (aka “Fake Marginals”)

Def: [Lasserre ’01] A degree-$d$ pseudoexpectation is an operator $\tilde{\mathbb{E}}$ mapping any polynomial $P$ of degree $\le d$ into a number $\tilde{\mathbb{E}}\,P$ satisfying:

• Normalization: $\tilde{\mathbb{E}}\,1 = 1$

• Linearity: $\tilde{\mathbb{E}}(aP + bQ) = a\,\tilde{\mathbb{E}}\,P + b\,\tilde{\mathbb{E}}\,Q$ for all $P, Q$ of deg $\le d$

• Positivity: $\tilde{\mathbb{E}}\,P^2 \ge 0$ for all $P$ of deg $\le d/2$

Fundamental Fact: A degree-$d$ SOS proof of a bound on $P$ corresponds to that bound holding for $\tilde{\mathbb{E}}\,P$ under any degree-$d$ pseudoexpectation operator $\tilde{\mathbb{E}}$.

Take home message:

• Pseudoexpectation “looks like” real expectation to low-degree polynomials.

• Can efficiently find pseudoexpectation matching any polynomial constraints.

• Proofs about real random vars can often be “lifted” to pseudoexpectation.
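As a small illustration of what “fake marginals” can look like, the sketch below uses a standard textbook example that is not taken from the talk: degree-2 pseudo-moments for three $\pm 1$ variables with every pairwise product equal to $-1/2$. The moment table `M` and the helper names are assumptions made for this example. The table passes the normalization and positivity checks for degree-1 polynomials, yet no actual distribution over $\{-1,+1\}^3$ has these moments, because $x_1x_2 + x_1x_3 + x_2x_3 \ge -1$ at every point of the cube.

```python
import random
from itertools import product

# Hypothetical degree-2 pseudo-moments, indexed by the monomials (1, x1, x2, x3):
# M[i][j] is the pseudoexpectation of monomial_i * monomial_j.
# So E~[1] = 1, E~[x_i] = 0, E~[x_i^2] = 1 and E~[x_i x_j] = -1/2 for i != j.
M = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, -0.5, -0.5],
    [0.0, -0.5, 1.0, -0.5],
    [0.0, -0.5, -0.5, 1.0],
]


def pseudo_expectation_of_square(c):
    """E~[p^2] for p = c[0] + c[1]*x1 + c[2]*x2 + c[3]*x3, extended by linearity."""
    return sum(c[i] * M[i][j] * c[j] for i in range(4) for j in range(4))


# Positivity: E~[p^2] >= 0 for random degree-1 polynomials p.
rng = random.Random(0)
min_square = min(
    pseudo_expectation_of_square([rng.uniform(-1, 1) for _ in range(4)])
    for _ in range(1000)
)

# Value the pseudoexpectation assigns to x1*x2 + x1*x3 + x2*x3 ...
pseudo_value = M[1][2] + M[1][3] + M[2][3]

# ... versus the smallest value any real point of {-1,+1}^3 achieves.
true_min = min(a * b + a * c + b * c for a, b, c in product((-1, 1), repeat=3))

print("min pseudo E[p^2] over samples:", min_square)
print("pseudo value:", pseudo_value, " true minimum:", true_min)
```

The gap between the pseudo value and the true minimum is exactly why a rounding step is needed: the pseudoexpectation is consistent with every low-degree test, but it does not have to come from real solutions.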

Combining ⇒ Rounding

Problem: Given low-degree $P, P_1, \dots, P_k$, maximize $P(x)$ s.t. $P_1(x) = \cdots = P_k(x) = 0$.

[B-Kelner-Steurer ’13]: Transform “simple” non-trivial combiners to an algorithm for the original problem.

Non-trivial combiner: an algorithm with

Input: low-degree moments of a random variable over $\mathbb{R}^n$ that is supported on near-maximal solutions.

Output: a single solution with value close to the maximum.

Corollary: In this case, we can find such a solution efficiently:

• Use the SOS SDP to find a pseudoexpectation matching the input conditions.

• Use the combiner to round the SDP solution into an actual solution.

Crucial Observation: If the proof that the combiner’s output is a good solution is in the SOS framework, then it holds even if the combiner is fed with a pseudoexpectation.

Example Application: Dictionary Learning / Sparse Coding [Olshausen-Field ’96]

Find the “right” representation of observed data.

Let $\{a_i\}$ be a set of vectors.

Goal: Given examples of the form $y = \sum_i W_i a_i$, where the coefficients $W_i$ are sparse, recover the $a_i$’s.

LOTS of work: important primitive in Machine Learning, Vision, Neuroscience, …

Previous best (rigorous) results: [Spielman-Wang-Wright ’12, Arora-Moitra-Ge ’13, Agarwal-Anandkumar-Jain-Netrapalli-Tandon ’13]

We show: a weaker sparsity condition is sufficient* (even in the non-independent, overcomplete case).

* In quasipolynomial time; in polynomial time we show that a somewhat stronger condition is sufficient.

Let $\{a_i\}$ be a set of vectors.

Goal: Given examples of the form $y = \sum_i W_i a_i$, where the coefficients $W_i$ are sparse, recover the $a_i$’s.

Achieve in 3 steps:

(1) Find a program s.t. every maximizing $x$ is close to one of the $a_i$’s.

(2) Give a combining algorithm taking moments of a distribution over maximizers into a vector close to one of the $a_i$’s.

(3) Show that the arguments in (1) and (2) fall under the SOS framework.

For simplicity, assume the $a_i$’s are an orthonormal basis and the $W_i$’s are i.i.d. random variables satisfying moment bounds in terms of a small parameter $\mu$. The result generalizes to the overcomplete, non-independent case.

Step 1.

Consider the polynomial

$$P(x) = \mathbb{E}\,\langle y, x\rangle^4 = \mathbb{E}\Big(\sum_i W_i \langle a_i, x\rangle\Big)^4$$

(can approximate arbitrarily well from examples).

Expanding the parentheses, we get

$$P(x) \le \mu \sum_i \langle a_i, x\rangle^4 + 2\mu^2 \Big(\sum_i \langle a_i, x\rangle^2\Big)^2 = \mu \sum_i \langle a_i, x\rangle^4 + o(\mu)\,\|x\|^4$$

Corollary: every unit $x$ that (nearly) maximizes $P$ is close to one of the $a_i$’s. Establishes (1)!
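The behavior of $P(x) = \mathbb{E}\,\langle y, x\rangle^4$ can be checked by brute force in a tiny instance. The sketch below rests on assumptions that are not stated in the slides: the $a_i$’s are the standard basis of $\mathbb{R}^n$, and each $W_i$ is independently $0$ with probability $1-\mu$ and $\pm 1$ with probability $\mu/2$ each. Under these assumptions it computes $P(x)$ exactly by enumerating all $3^n$ coefficient patterns, and compares a dictionary vector with a direction spread evenly over all of them.

```python
from itertools import product
from math import sqrt


def quartic_objective(x, mu):
    """Exact E<y, x>^4 for y = sum_i W_i e_i, under the assumed sparse model:
    W_i = 0 with prob. 1 - mu, and +1 or -1 with prob. mu / 2 each."""
    n = len(x)
    total = 0.0
    for w in product((-1, 0, 1), repeat=n):
        prob = 1.0
        for wi in w:
            prob *= (1.0 - mu) if wi == 0 else mu / 2.0
        inner = sum(wi * xi for wi, xi in zip(w, x))  # <y, x> for this pattern
        total += prob * inner ** 4
    return total


n, mu = 6, 0.05

# A dictionary vector (here the first standard basis vector).
basis_direction = [1.0] + [0.0] * (n - 1)

# A unit vector equally correlated with every dictionary vector.
spread_direction = [1.0 / sqrt(n)] * n

value_at_basis = quartic_objective(basis_direction, mu)
value_at_spread = quartic_objective(spread_direction, mu)

print("P at a dictionary vector:", value_at_basis)
print("P at a spread-out vector:", value_at_spread)
```

In this toy model the value at a dictionary vector equals $\mu$, while the spread-out direction scores noticeably lower, matching the claim that near-maximizers must concentrate on a single $a_i$.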

Step 2.

Let $D$ be a distribution over unit vectors such that every $x$ in its support is close to one of the $a_i$’s.

Pick a set of random (standard Gaussian) vectors, and let $M$ be a matrix defined from these vectors and the low-degree moments of $D$.

Our combining algorithm outputs the top eigenvector of $M$.

If the random vectors single out one of the $a_i$’s, then $M$ is (up to scaling) dominated by that $a_i$ and we succeed. This happens with some probability over the choice of the Gaussian vectors.

Establishes (2)!

(3) Show that the arguments in (1) and (2) fall under the SOS framework: slightly tedious but straightforward computations.
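The combining step can be sketched on actual samples. The exact matrix used in the talk is not recoverable from this transcript, so the version below is an assumption in the same spirit: reweight each sample $x$ by $\prod_j \langle g_j, x\rangle^2$ for random Gaussian vectors $g_j$, average the reweighted $x x^\top$, and output the top eigenvector. Function names and parameters are hypothetical. Without reweighting, a distribution spread evenly over an orthonormal basis has second moment proportional to the identity, which carries no information about the individual $a_i$’s; the random weights break that symmetry.

```python
import random


def reweighted_second_moment(samples, gaussians):
    """Average of w(x) * x x^T with w(x) = prod_j <g_j, x>^2 (assumed weighting)."""
    n = len(samples[0])
    M = [[0.0] * n for _ in range(n)]
    for x in samples:
        weight = 1.0
        for g in gaussians:
            weight *= sum(gi * xi for gi, xi in zip(g, x)) ** 2
        for i in range(n):
            for j in range(n):
                M[i][j] += weight * x[i] * x[j] / len(samples)
    return M


def top_eigenvector(M, squarings=60):
    """Top eigenvector of a symmetric PSD matrix via repeated squaring.

    After k squarings the matrix is proportional to M^(2^k), which approaches
    v v^T for the top eigenvector v; any nonzero column then points along v.
    """
    n = len(M)
    A = [row[:] for row in M]
    for _ in range(squarings):
        trace = sum(A[i][i] for i in range(n))
        A = [[a / trace for a in row] for row in A]  # keep entries bounded
        A = [[sum(A[i][k] * A[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    col = max(range(n), key=lambda j: sum(A[i][j] ** 2 for i in range(n)))
    v = [A[i][col] for i in range(n)]
    norm = sum(t * t for t in v) ** 0.5
    return [t / norm for t in v]


def combine(samples, num_gaussians=3, seed=0):
    """Hypothetical combiner: many near-dictionary samples in, one vector out."""
    rng = random.Random(seed)
    n = len(samples[0])
    gaussians = [[rng.gauss(0.0, 1.0) for _ in range(n)]
                 for _ in range(num_gaussians)]
    return top_eigenvector(reweighted_second_moment(samples, gaussians))


# Samples sitting exactly on an orthonormal basis (the standard basis of R^5).
n = 5
samples = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
recovered = combine(samples)
print("recovered direction:", [round(t, 3) for t in recovered])
```

The sketch only uses moments of the sample distribution (through the averaged matrix), which is the property that lets such a combiner be run on a pseudoexpectation instead of real samples.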

A personal overview of the Unique Games Conjecture

Unique Games Conjecture: UG/SSE problem is NP-hard. [Khot ’02, Raghavendra-Steurer ’08]

Reasons to believe:

• “Standard crypto heuristic”: Tried to solve it and couldn’t.

• Very clean picture of complexity landscape: simple algorithms are optimal [Khot ’02 … Raghavendra ’08 …]

• Simple poly algorithms can’t refute it [Khot-Vishnoi ’04]

• Simple subexp algorithms can’t refute it [B-Gopalan-Håstad-Meka-Raghavendra-Steurer ’12]

Reasons to suspect:

• Random instances are easy via simple algorithm [Arora-Khot-Kolla-Steurer-Tulsiani-Vishnoi ’05]

• Subexponential algorithm [Arora-B-Steurer ’10]

• Quasipoly algo on KV instance [Kolla ’10]

• SOS proof system: SOS solves all candidate hard instances [B-Brandao-Harrow-Kelner-Steurer-Zhou ’12]

• SOS proof system: SOS useful for sparse vector problem; candidate algorithm for search problem [B-Kelner-Steurer ’13]


Conclusions

• Sum of Squares is a powerful algorithmic framework that can yield strong results for the right problems.

(contrast with previous results on SDP/LP hierarchies, showing lower bounds when using either wrong hierarchy or wrong problem.)

• “Combiner” view lets us focus on the features of the problem rather than the details of the relaxation.

• SOS seems particularly useful for problems with some geometric structure, including several problems related to unique games and machine learning.

• Still have only a rudimentary understanding of when SOS works or not.

• Other connections between proof complexity and approximation algorithms?


Other Results

Sparse vector problem: Recover a sparse vector in a subspace, given an arbitrary basis of the subspace.

Random case: Recovery for any

(Improving on [Demanet-Hand ‘13])

[Brandao-Harrow ’12]: Using our techniques, find a separable quantum state maximizing a “local operations and classical communication” (LOCC) measurement.

Worst case: Recovery* for

(Motivation: machine learning, optimization [Demanet-Hand ’13]; the worst-case variant is the algorithmic bottleneck in the UG/SSE algorithm of [Arora-B-Steurer ’10].)