Global Optimization with Polynomials - Department of Statistics

"Global Optimization with Polynomials"

Geoffrey Schiebinger, Stephen Kemmerling

Math 301, 2010/2011

March 16, 2011


Overview

"Global Optimization with Polynomials and the Problem of Moments", by Jean B. Lasserre (2001)

Goal: Solve $\min_{x \in K} p(x)$, where $p$ is an arbitrary polynomial and $K = \bigcap_i \{x \mid g_i(x) \ge 0\}$ for arbitrary polynomials $g_i$.

Result: Possible as a sequence of SDPs whose optimal values approach the solution.


Outline

The Moment Problem. Equivalence.

SDP relaxation. Exactness.

General unconstrained case ($p$ not S.O.S., $K = \mathbb{R}^n$).

Constrained case.

Detecting Optimality.

Generalizations/Applications.

Conclusions.


The Problem of Moments I

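Given a polynomial $p : \mathbb{R}^n \to \mathbb{R}$, consider

$$\mathbb{P}: \quad p^* := \min_{x \in \mathbb{R}^n} p(x).$$

Moment formulation:

$$\mathcal{P}: \quad p^* := \min_{\mu \in \mathcal{P}(\mathbb{R}^n)} \int p(x)\, d\mu,$$

where $\mathcal{P}(\mathbb{R}^n)$ is the set of probability measures on $\mathbb{R}^n$.

Assumption: A minimizer always exists.

Theorem: $\mathbb{P}$ and $\mathcal{P}$ are equivalent. Specifically:

(a) $\inf \mathbb{P} = \inf \mathcal{P}$.

(b) $x^* = \arg\min \mathbb{P} \Rightarrow \mu^* = \delta_{x^*} = \arg\min \mathcal{P}$.

(c) $\mu^* = \arg\min \mathcal{P} \Rightarrow p(x) = \min \mathbb{P}$, $\mu^*$-a.e.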


The Problem of Moments II


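(a) $\inf \mathbb{P} = \inf \mathcal{P}$.

Proof. We have $p(x) = \int p\, d\delta_x$, thus $\inf \mathcal{P} \le \inf \mathbb{P}$. Conversely, let $p^* := \inf \mathbb{P}$. Then, since $p(x) \ge p^*$ for all $x$, we have $\inf \mathcal{P} = \inf_\mu \int p\, d\mu \ge p^* = \inf \mathbb{P}$.

(b) $x^* = \arg\min \mathbb{P} \Rightarrow \mu^* = \delta_{x^*} = \arg\min \mathcal{P}$.

Proof. Immediate from $p(x^*) \le p(x)$ for all $x$.

(c) $\mu^* = \arg\min \mathcal{P} \Rightarrow p(x) = \min \mathbb{P}$, $\mu^*$-a.e.

Proof. Suppose there is a set $B \subset \mathbb{R}^n$ with $\mu^*(B) > 0$ and $p(x) > p^*$ for all $x \in B$. Then $\int p\, d\mu^* = \int_B p\, d\mu^* + \int_{\mathbb{R}^n \setminus B} p\, d\mu^* > p^*$, contradicting (a).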
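The three statements can be checked numerically on discrete measures. The sketch below is only an illustration: the polynomial $p(x) = x^4 - 2x^2$ (global minimum $-1$ at $x = \pm 1$) and the particular atoms and weights are assumptions chosen for this example, not taken from the slides.

```python
def p(x):
    # Hypothetical example polynomial p(x) = x^4 - 2x^2, minimized at x = +1 and x = -1.
    return x**4 - 2 * x**2

def integrate(atoms, weights):
    """Integral of p against the discrete measure sum_k weights[k] * delta_{atoms[k]}."""
    return sum(w * p(a) for a, w in zip(atoms, weights))

p_star = p(1.0)  # value of p at a global minimizer

# (b): the Dirac measure at a minimizer attains p*.
dirac_value = integrate([1.0], [1.0])

# (c): a minimizing measure may be spread over several minimizers.
mixed_value = integrate([-1.0, 1.0], [0.25, 0.75])

# (a): any mass placed away from the minimizers raises the integral above p*.
spread_value = integrate([-1.0, 0.0, 2.0], [0.5, 0.25, 0.25])

print(p_star, dirac_value, mixed_value, spread_value)
```

The weights are dyadic fractions so that the floating-point sums are exact.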













The Problem of Moments III

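With $p = \sum_\alpha p_\alpha x^\alpha$:

$$\int p(x)\, d\mu = \sum_\alpha p_\alpha \int x^\alpha\, d\mu = \sum_\alpha p_\alpha y_\alpha$$

Thus:

$$\mathcal{P}: \quad \min_y \sum_\alpha p_\alpha y_\alpha \quad \text{s.t. } y_\alpha \text{ are moments}$$

Relaxation:

$$\mathbb{Q}: \quad \min_y \sum_\alpha p_\alpha y_\alpha \quad \text{s.t. } M_m(y) \succeq 0$$

where $M_m(y)$ is the moment matrix up to degree $m$: its rows and columns are indexed by the monomials of degree at most $m$, so its entries are the $y_\alpha$ with $\sum_i \alpha_i \le 2m$. If $y$ is the moment sequence of a measure $\mu_y$, then $\langle v, M_m(y) v \rangle = \int v^2\, d\mu_y$ for every polynomial $v$ of degree at most $m$ (identified with its coefficient vector).

Slater's condition holds for $\mathbb{Q}$.

$\inf \mathbb{Q} = \inf \mathcal{P}$ if $p(x) - p^*$ is S.O.S.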
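The linearization and the moment matrix identity can be verified exactly in the univariate case. This is a minimal sketch under stated assumptions: the two-atom measure, the polynomial $p(x) = x^4 - 2x^2$, and the test polynomial $v$ are hypothetical choices, and the helper names are not from any library.

```python
from fractions import Fraction

def moments(atoms, weights, max_degree):
    """y_k = integral of x^k for the measure sum_i weights[i] * delta_{atoms[i]}."""
    return [sum(w * a**k for a, w in zip(atoms, weights))
            for k in range(max_degree + 1)]

def moment_matrix(y, m):
    """M_m(y) in the basis 1, x, ..., x^m: entry (i, j) is y_{i+j}."""
    return [[y[i + j] for j in range(m + 1)] for i in range(m + 1)]

def quad_form(v, M):
    """<v, M v> for a coefficient vector v."""
    return sum(v[i] * M[i][j] * v[j]
               for i in range(len(v)) for j in range(len(v)))

def poly_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * x^k."""
    return sum(c * x**k for k, c in enumerate(coeffs))

# Hypothetical probability measure with atoms -1 and 2.
atoms = [Fraction(-1), Fraction(2)]
weights = [Fraction(1, 4), Fraction(3, 4)]
m = 2
y = moments(atoms, weights, 2 * m)   # moments up to degree 2m
M = moment_matrix(y, m)

# Linear objective sum_k p_k y_k versus the integral of p.
p_coeffs = [0, 0, -2, 0, 1]          # p(x) = x^4 - 2x^2
linearized = sum(pk * yk for pk, yk in zip(p_coeffs, y))
integral_p = sum(w * poly_eval(p_coeffs, a) for a, w in zip(atoms, weights))

# Quadratic form <v, M_m(y) v> versus the integral of v^2.
v = [1, -3, 2]                       # v(x) = 1 - 3x + 2x^2
integral_v2 = sum(w * poly_eval(v, a) ** 2 for a, w in zip(atoms, weights))

print(linearized == integral_p, quad_form(v, M) == integral_v2)
```

Because $\int v^2\, d\mu_y \ge 0$ for every $v$, the moment matrix of a genuine measure is positive semidefinite, which is exactly the condition kept in the relaxation.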


General p via Sequence of SDPs

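Theorem: Let $p : \mathbb{R}^n \to \mathbb{R}$ be a polynomial of degree $2m$ of the form $p(x) = \sum_\alpha p_\alpha x^\alpha$, with global minimum $p^* = \min \mathbb{P}$ and such that $\|x^*\| \le a$ for some $a > 0$ at some global minimizer $x^*$.

Then, as $N \to \infty$, one has

$$\inf \mathbb{Q}^N_a \uparrow p^*.$$

Here $\mathbb{Q}^N_a$ is the convex LMI problem

$$\mathbb{Q}^N_a: \quad \inf_y \sum_\alpha p_\alpha y_\alpha \quad \text{s.t.} \quad M_N(y) \succeq 0, \quad M_{N-1}(\theta y) \succeq 0,$$

with $\theta(x) = a^2 - \|x\|^2$ and $M_m(\theta y)(i,j) = \sum_\alpha \theta_\alpha y_{\beta(i,j)+\alpha}$, where $y_{\beta(i,j)}$ denotes the $(i,j)$ entry of $M_m(y)$.

What really matters: $\langle v, M_m(\theta y) v \rangle = \int \theta(x) v(x)^2\, \mu_y(dx)$.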
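The role of the second constraint can be seen on Dirac measures in one variable. The following sketch assumes $\theta(x) = a^2 - x^2$ with the hypothetical value $a = 2$; the function names are illustrative only.

```python
from fractions import Fraction

def localizing_matrix(y, theta, m):
    """M_m(theta*y)(i, j) = sum_k theta[k] * y[i+j+k] in the basis 1, x, ..., x^m."""
    return [[sum(t * y[i + j + k] for k, t in enumerate(theta))
             for j in range(m + 1)] for i in range(m + 1)]

def dirac_moments(x, max_degree):
    """Moments of the Dirac measure at x: y_k = x^k."""
    return [Fraction(x) ** k for k in range(max_degree + 1)]

def quad_form(v, M):
    """<v, M v> for a coefficient vector v."""
    return sum(v[i] * M[i][j] * v[j]
               for i in range(len(v)) for j in range(len(v)))

a = 2                       # assumed radius
theta = [a**2, 0, -1]       # theta(x) = a^2 - x^2
N = 2

# A point inside the ball: theta(1) = 3 > 0, so the localizing matrix is PSD.
L_in = localizing_matrix(dirac_moments(1, 2 * N), theta, N - 1)

# A point outside the ball: theta(3) = -5 < 0, so the constraint is violated.
L_out = localizing_matrix(dirac_moments(3, 2 * N), theta, N - 1)

v = [1, -2]                 # v(x) = 1 - 2x, so v(1)^2 = 1
print(L_in, L_out[0][0], quad_form(v, L_in))
```

For a Dirac measure at $x$ the localizing matrix is $\theta(x)$ times the moment matrix, so $M_{N-1}(\theta y) \succeq 0$ rules out moment vectors of points with $\theta(x) < 0$.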


Proof Setup

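Writing $M_N(y) = \sum_\alpha y_\alpha B_\alpha$ for appropriate matrices $\{B_\alpha\}$ and $M_{N-1}(\theta y) = \sum_\alpha y_\alpha C_\alpha$ for appropriate matrices $\{C_\alpha\}$, we can express the dual

$$(\mathbb{Q}^N_a)^*: \quad \sup_{X, Z \succeq 0} \; -X(1,1) - a^2 Z(1,1) \quad \text{s.t.} \quad \langle X, B_\alpha \rangle + \langle Z, C_\alpha \rangle = p_\alpha, \quad \alpha \ne 0.$$

Let $K_a = \{x : \theta(x) \ge 0\}$.

Fact: Every polynomial $p(x)$ strictly positive on $K_a$ can be written as

$$p(x) = \sum_{i=1}^{r_1} q_i(x)^2 + \theta(x) \sum_{j=1}^{r_2} t_j(x)^2$$

for some polynomials $q_i$, $t_j$. (See, e.g., Berg (1980), "The multidimensional moment problem and semi-groups".)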




Proof

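Since $x^* \in K_a$, the moment vector of $\delta_{x^*}$,

$$y^* = (x_1^*, \ldots, (x_1^*)^{2N}, \ldots, (x_n^*)^{2N}),$$

satisfies $M_N(y^*) \succeq 0$ and $M_{N-1}(\theta y^*) \succeq 0$, so $y^*$ is admissible for $\mathbb{Q}^N_a$ and thus $\inf \mathbb{Q}^N_a \le p^*$.

Let $\varepsilon > 0$. Then $p(x) - (p^* - \varepsilon) > 0$, so there exists $N_0$ such that

$$p(x) - p^* + \varepsilon = \sum_{i=1}^{r_1} q_i(x)^2 + \theta(x) \sum_{j=1}^{r_2} t_j(x)^2$$

for some polynomials $q_i$ of degree at most $N_0$ and polynomials $t_j$ of degree at most $N_0 - 1$.

Identifying each polynomial with its coefficient vector, set

$$X = \sum_{i=1}^{r_1} q_i q_i', \qquad Z = \sum_{j=1}^{r_2} t_j t_j', \qquad X, Z \succeq 0.$$

Therefore $(X, Z)$ is admissible for $(\mathbb{Q}^{N_0}_a)^*$ with value $-X(1,1) - a^2 Z(1,1) = p^* - \varepsilon$, and therefore, by weak duality,

$$p^* - \varepsilon \le \inf \mathbb{Q}^{N_0}_a \le p^*.$$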
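The construction of the dual-feasible pair $(X, Z)$ can be traced on a small univariate instance. The sketch below assumes the hypothetical polynomial $p(x) = x^4 - 2x^2$ (so $p_0 = 0$ and $p^* = -1$), $\varepsilon = 1/4$ and $a = 2$; here $p - p^* + \varepsilon = (x^2 - 1)^2 + (1/2)^2$ is already a sum of squares, so $Z = 0$ suffices. Note that the slides index matrix entries from 1, while the code indexes from 0.

```python
from fractions import Fraction

def outer(q):
    """Rank-one Gram matrix q q' of a coefficient vector q."""
    return [[qi * qj for qj in q] for qi in q]

def mat_add(A, B):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + z for x, z in zip(ra, rb)] for ra, rb in zip(A, B)]

def gram_to_coeffs(X):
    """Coefficients of sum_{i,j} X[i][j] x^(i+j); entry k equals <X, B_k>."""
    coeffs = [Fraction(0)] * (2 * len(X) - 1)
    for i, row in enumerate(X):
        for j, xij in enumerate(row):
            coeffs[i + j] += xij
    return coeffs

p_coeffs = [0, 0, -2, 0, 1]       # p(x) = x^4 - 2x^2, constant term p_0 = 0
p_star = Fraction(-1)             # global minimum, attained at x = +1 and x = -1
eps = Fraction(1, 4)
a = 2                             # assumed radius; any a >= 1 contains a minimizer

q1 = [Fraction(-1), Fraction(0), Fraction(1)]     # q1(x) = x^2 - 1
q2 = [Fraction(1, 2), Fraction(0), Fraction(0)]   # q2(x) = sqrt(eps) = 1/2
X = mat_add(outer(q1), outer(q2))
Z = [[Fraction(0)] * 2 for _ in range(2)]         # no theta-weighted squares needed

constraint_lhs = gram_to_coeffs(X)                # Z = 0 contributes nothing
dual_value = -X[0][0] - a**2 * Z[0][0]

print(constraint_lhs[1:] == p_coeffs[1:], dual_value == p_star - eps)
```

The equality constraints for $\alpha \ne 0$ match the coefficients of $p$, and the dual objective equals $p^* - \varepsilon$, as in the proof.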







Optimality Conditions

$\inf \mathbb{Q}^N_a = p^*$ iff

$$p(x) - p^* = \sum_{i=1}^{r_1} q_i(x)^2 + \theta(x) \sum_{j=1}^{r_2} t_j(x)^2,$$

with $\deg(q_i) \le N$, $\deg(t_j) \le N - 1$.

Practical sufficient condition: $\operatorname{rank} M_N(y) = \operatorname{rank} M_{N-1}(y)$. (See, e.g., Curto, Fialkow (2000): "The truncated complex K-moment problem".)

Extraction of an optimal point is possible with an SVD of $M_N(y)$. (See, e.g., Henrion, Lasserre (2005): "Detecting global optimality and extracting solutions in GloptiPoly".)
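The rank condition and the recovery of minimizers can be illustrated in one variable. The sketch below is a simplified stand-in, not the SVD-based procedure of Henrion and Lasserre: it assumes a hypothetical optimal moment vector coming from a measure on the two minimizers $\pm 1$ of $p(x) = x^4 - 2x^2$, checks the rank condition exactly, and then recovers the atoms as roots of the monic quadratic whose coefficient vector lies in the kernel of $M_2(y)$.

```python
from fractions import Fraction
import math

def hankel(y, m):
    """Univariate moment matrix M_m(y): entry (i, j) is y_{i+j}."""
    return [[y[i + j] for j in range(m + 1)] for i in range(m + 1)]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    A = [row[:] for row in M]
    r = 0
    for c in range(len(A[0])):
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [x - f * z for x, z in zip(A[i], A[r])]
        r += 1
    return r

# Hypothetical moments of (1/4) delta_{-1} + (3/4) delta_{1}, degrees 0..4.
y = [Fraction(1), Fraction(1, 2), Fraction(1), Fraction(1, 2), Fraction(1)]

rank_M2 = rank(hankel(y, 2))
rank_M1 = rank(hankel(y, 1))

# Solve M_1 (c0, c1)' = -(y2, y3)' by Cramer's rule, so that x^2 + c1*x + c0
# vanishes on the support of the measure.
M1 = hankel(y, 1)
det = M1[0][0] * M1[1][1] - M1[0][1] * M1[1][0]
b0, b1 = -y[2], -y[3]
c0 = (b0 * M1[1][1] - M1[0][1] * b1) / det
c1 = (M1[0][0] * b1 - M1[1][0] * b0) / det

root = math.sqrt(c1 * c1 - 4 * c0)
atoms = sorted([(-c1 - root) / 2, (-c1 + root) / 2])

print(rank_M2, rank_M1, atoms)
```

When the ranks agree, the number of atoms equals the common rank, which is why the condition certifies that the relaxation has found a genuine measure.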










Constrained Case

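Consider $\mathbb{P}_K: \ p^*_K := \min_{x \in K} p(x)$, with $K = \bigcap_{i=1}^{r} \{x \mid g_i(x) \ge 0\}$, $g_i$ arbitrary polynomials.

Assumption: $x \in K \Rightarrow \|x\|^2 \le a$ for some $a$. (Weaker possible!)

Then, analogous to the unconstrained case, let

$$\mathbb{Q}^N_K: \quad \min_y \sum_\alpha p_\alpha y_\alpha \quad \text{s.t.} \quad M_N(y) \succeq 0, \quad M_{N - \lceil \deg(g_i)/2 \rceil}(g_i y) \succeq 0, \quad i = 1, \ldots, r,$$

and we have $\inf \mathbb{Q}^N_K \uparrow p^*_K$ as $N \to \infty$. The proof proceeds as in the unconstrained case, but uses that (given the assumption) every polynomial $p$ strictly positive on $K$ can be written as

$$p(x) = \sum_{i=1}^{r_1} q_i(x)^2 + \sum_{k=1}^{r} g_k(x) \sum_{j=1}^{r_2} t_{kj}(x)^2$$

for some polynomials $q_i$, $t_{kj}$. (See, e.g., Jacobi, Prestel (2000), "On Special Representations of strictly positive polynomials".)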
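A small worked instance may help fix the notation. The problem below is a hypothetical example, not from the slides: minimize $p(x) = x$ over $K = \{x : g(x) = x(1 - x) \ge 0\} = [0, 1]$, so that $p^*_K = 0$.

```latex
% Assumed example: p(x) = x, g(x) = x(1 - x), K = [0, 1], p^*_K = 0.
\begin{align*}
p(x) - p^*_K &= x = x^2 + x(1 - x) = q(x)^2 + g(x)\, t(x)^2,
  \qquad q(x) = x,\quad t(x) = 1, \\
\mathbb{Q}^1_K:\quad & \min_y\; y_1 \quad \text{s.t.} \quad
  M_1(y) = \begin{pmatrix} 1 & y_1 \\ y_1 & y_2 \end{pmatrix} \succeq 0,
  \qquad M_0(g\,y) = y_1 - y_2 \ge 0 .
\end{align*}
```

The constraints give $y_1 \ge y_2 \ge y_1^2$, hence $y_1 \in [0, 1]$ and $\inf \mathbb{Q}^1_K = 0 = p^*_K$. The first relaxation is already exact here, in line with the certificate above having $\deg q = 1 \le N$ and $\deg t = 0 \le N - \lceil \deg(g)/2 \rceil$ for $N = 1$.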


Convergence rate?

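The rate of convergence in $\inf \mathbb{Q}^N_a \uparrow p^*$ is unknown.

Also, solving $\mathbb{Q}^N$ gets expensive very quickly. Already for $n = 2$, $N = 2$:

$$M_2(y) = \begin{pmatrix}
1 & y_{1,0} & y_{0,1} & y_{2,0} & y_{1,1} & y_{0,2} \\
y_{1,0} & y_{2,0} & y_{1,1} & y_{3,0} & y_{2,1} & y_{1,2} \\
y_{0,1} & y_{1,1} & y_{0,2} & y_{2,1} & y_{1,2} & y_{0,3} \\
y_{2,0} & y_{3,0} & y_{2,1} & y_{4,0} & y_{3,1} & y_{2,2} \\
y_{1,1} & y_{2,1} & y_{1,2} & y_{3,1} & y_{2,2} & y_{1,3} \\
y_{0,2} & y_{1,2} & y_{0,3} & y_{2,2} & y_{1,3} & y_{0,4}
\end{pmatrix}$$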
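The growth can be quantified with standard monomial counts: there are $\binom{n+N}{N}$ monomials of degree at most $N$ in $n$ variables, which is the side length of $M_N(y)$, and $\binom{n+2N}{2N}$ moments of degree at most $2N$. The short sketch below tabulates these counts; the function names and the sample values of $n$ and $N$ are illustrative.

```python
from math import comb

def moment_matrix_size(n, N):
    """Side length of M_N(y): number of monomials in n variables of degree <= N."""
    return comb(n + N, N)

def num_moments(n, N):
    """Number of moments y_alpha with |alpha| <= 2N, counting y_0."""
    return comb(n + 2 * N, 2 * N)

for n, N in [(2, 2), (10, 3), (20, 3)]:
    print(n, N, moment_matrix_size(n, N), num_moments(n, N))
```

For $n = 2$, $N = 2$ this gives a $6 \times 6$ matrix with 15 distinct entries, matching $M_2(y)$ above; for $n = 20$, $N = 3$ the matrix is already $1771 \times 1771$.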

In practice, however, it works well:

GloptiPoly: Global Optimization over Polynomials with Matlab and SeDuMi, by Didier Henrion and Jean-Bernard Lasserre (December 2006)

SOSTOOLS, by Stephen Prajna, Antonis Papachristodoulou, Peter Seiler, and Pablo A. Parrilo


Sparsity in Coefficient Vectors

"Convergent SDP-relaxations in polynomial optimization with sparsity", Lasserre, 2006

A similar result holds: $\inf \mathbb{Q}_r \uparrow p^*$ as $r \to \infty$, where

$\mathbb{P}: \ \inf_{x \in \mathbb{R}^n} \{ f(x) \mid x \in K \}$

$K := \{x \in \mathbb{R}^n \mid g_j(x) \ge 0,\ j = 1, \ldots, m\}$

$g_j$ and $f$ depend only on $\{x_i \mid i \in I_k\}$ for some $k$, and $|I_k| \le \kappa$.

Advantage: the number of variables is $O(\kappa^{2r})$ instead of $O(n^{2r})$, with LMIs of size $O(\kappa^r)$ instead of $O(n^r)$. This is significant when $\kappa < n$.

Necessary condition on the way the $I_k$ are related: $I_{k+1} \cap \bigcup_{j=1}^{k} I_j \subseteq I_s$ for some $s \le k$ (running intersection property).
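The running intersection property is easy to test for a given ordering of the index sets. The following is a minimal sketch; the function name and the two sample families of index sets are hypothetical.

```python
def has_running_intersection(index_sets):
    """Check, in the given order, that each I_{k+1} meets the union of the
    earlier sets inside a single earlier set I_s."""
    for k in range(1, len(index_sets)):
        union_prev = set().union(*index_sets[:k])
        overlap = index_sets[k] & union_prev
        if not any(overlap <= index_sets[s] for s in range(k)):
            return False
    return True

chain = [{1, 2}, {2, 3}, {3, 4}]   # each set overlaps only its predecessor
cycle = [{1, 2}, {2, 3}, {3, 1}]   # the last set overlaps two different sets

print(has_running_intersection(chain), has_running_intersection(cycle))
```

The check depends on the order in which the sets are listed, so a family that fails in one ordering may still satisfy the property after reordering.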


Generalization to non-commuting variables

Same flavor: construct a sequence of SDPs that solve the problem of interest.

Applications in quantum chemistry and quantum mechanics:

Computing atomic and molecular ground state energies (solving Hartree-Fock equations).

Computing upper bounds on the maximal violation of Bell inequalities.

"Convergent relaxations of polynomial optimization problems with non-commuting variables", S. Pironio, M. Navascues, A. Acin, 2009


Conclusions

Very General framework.

Quite a few interesting applications.

Software available.

Substantial Computational Challenges!

