Linear Conic Programming: A New Modeling Tool for Analytical Decision Making

Professor Shu-Cherng Fang

Department of Industrial and Systems Engineering
Graduate Program in Operations Research

North Carolina State University
Raleigh, North Carolina, USA

July 2, 2013

Summer Workshop in Taiwan


Topics

• Introduction to Linear Conic Programming

• General Models

• Essential Concepts

• Basic Theory

• Solution Methods

• Recent Research Directions


References I

Books

• Fang S.-C. and Xing W., Linear Conic Optimization (in Chinese), Operations Research and Management Science Series 16, Science Press: Beijing, China 2013.

• Bertsekas D.P., Nedic A. and Ozdaglar A.E., Convex Analysis and Optimization, Athena Scientific: Belmont, MA USA 2003.

• Boyd S. and Vandenberghe L., Convex Optimization, Cambridge University Press: Cambridge, UK 2004.

• Fang S.-C. and Puthenpura S., Linear Optimization and Extensions: Theory and Algorithms, Prentice-Hall Inc.: Englewood Cliffs, NJ USA 1993.

• Ben-Tal A. and Nemirovski A., Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, Society for Industrial and Applied Mathematics: Philadelphia, PA USA 2001.

• Renegar J., A Mathematical View of Interior-point Methods in Convex Optimization, Society for Industrial and Applied Mathematics: Philadelphia, PA USA 2001.


References II

• Wolkowicz H., Saigal R. and Vandenberghe L. (eds.), Handbook of Semidefinite Programming: Theory, Algorithms, and Applications, Kluwer Academic Publishers: Norwell, MA USA 2000.

• Rockafellar R.T., Convex Analysis, Princeton University Press: Princeton, NJ USA 1970.

Lectures

• Ye Y., Linear Conic Programming, lecture notes online: http://www.stanford.edu/class/msande314/sdpmain.pdf

• Todd M.J., Semidefinite Programming, lecture notes online: http://people.orie.cornell.edu/~miketodd/cornellonly/or637/or637.html

Software

• SeDuMi, a very popular general-purpose SDP solver by Jos F. Sturm, can be found at: http://sedumi.ie.lehigh.edu

• CVX, another very popular solver for convex programming problems, can be found at: http://cvxr.com/cvx





I. Introduction


Modern Management Science, Operations Research and Industrial Engineering

• Production and Operations

• Logistics and Supply Chain

• Medical and Health Systems

• Financial Mathematics and Engineering

• Service Systems and Management

• Communications and Data Mining

• Manufacturing and Ergonomics

Modeling and analytic decision making is a must.


Some Old Sayings

• Every model is a wrong model, but some are more useful.

• To do a good job, one must first sharpen one's tools (工欲善其事,必先利其器).


Toolbox for Modeling and Decision Making

When you have a hammer in hand, everything looks like a nail.

I’m gonna “nail” you!


Most Frequently Used Tools for MS/OR/IE

• Linear Regression Model

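\[ y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_n x_n \]

• Linear Programming Model

\[
\begin{array}{ll}
\min & c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \\
\text{s.t.} & a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1 \\
& \qquad \vdots \\
& a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n = b_m \\
& x_1, x_2, \ldots, x_n \ge 0
\end{array}
\]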


Why Is Linear Programming Model Popular?

• Simple and applicable
  – Fortune 500 companies
  – Classroom teaching

• Theoretical supports
  – Optimality conditions
  – Duality theory
  – Sensitivity analysis

• Fast computing algorithms
  – Polynomial time complexity
  – Large scale problems
  – Excel LP solver


Limitations of Linear Programming Model

• Nature involves nonlinearity and nonconvexity.

• Linearity only provides the first-order information for approximation.


Extensions of Linear Programming Model

• Semidefinite Programming (SDP)

• Second Order Cone Programming (SOCP)

• Copositive Programming (CoP)

• Completely Positive Programming (CPP)

Linear Conic Programming (LCoP)


What is Linear Conic Programming?

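\[
\begin{array}{lll}
\min & c^T x & \\
\text{s.t.} & (a^i)^T x = b_i, & i = 1, \ldots, m \\
& x \ge 0 \quad (x \in \mathbb{R}^n) &
\end{array}
\tag{LP}
\]

where $a^i = (a_{i1}, a_{i2}, \ldots, a_{in})^T \in \mathbb{R}^n$, $b_i \in \mathbb{R}$ and $c \in \mathbb{R}^n$.

\[
\begin{array}{lll}
\min & C \bullet X & \\
\text{s.t.} & A_i \bullet X = b_i, & i = 1, \ldots, m \\
& X \in K &
\end{array}
\tag{LCoP}
\]

where $K$ is a closed, convex cone; $b_i \in \mathbb{R}$; and $C$, $A_i$ lie in the space of interest, with "$\bullet$" being a linear operator.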


Basic Definitions

• Let $E^n$ be an n-dimensional Euclidean space and let $K$ be a subset of $E^n$. If $\lambda x \in K$ for all $x \in K$ and all $\lambda \ge 0$, then $K$ is a cone in $E^n$.

• A cone $K$ is convex if the line segment joining any two points of $K$ is contained in $K$.

• A cone $K$ is closed if all accumulation points of $K$ are contained in $K$.

Figure: A closed, convex cone

• The dual cone $K^*$ of a cone $K$ is defined by

\[ K^* = \{ y \in E^n \mid \langle x, y \rangle \ge 0, \ \forall x \in K \}. \]


Linear Conic Programming

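\[
\begin{array}{lll}
\min & C \bullet X & \\
\text{s.t.} & A_i \bullet X = b_i, & i = 1, \ldots, m \\
& X \in K &
\end{array}
\tag{LCoP}
\]

where $K$ is a closed, convex cone; $b_i \in \mathbb{R}$; and $C$, $A_i$ lie in the space of interest (usually an n-dimensional Euclidean space $E^n$), with "$\bullet$" being a linear operator.

Using the concept of the dual cone, the dual problem of (LCoP) has the following form:

\[
\begin{array}{ll}
\max & b^T y \\
\text{s.t.} & \sum_{i=1}^m y_i A_i + S = C \\
& S \in K^*
\end{array}
\tag{LCoD}
\]

where $y \in \mathbb{R}^m$ and $K^*$ is the dual cone of $K$.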


Power of Linear Conic Programming

• Nonlinearity and nonconvexity may be absorbed by the cone.

• Shares a structure very similar to that of Linear Programming.

• Possesses well-developed theory.
  – Optimality conditions
  – Duality theory
  – Sensitivity analysis

• Admits an interior-point approach.
  – polynomial-time solvable subclasses of problems
  – polynomial-time approximation to general problems


Applications of LCoP

• Warehouse Location Problem

• Correlation Matrix Verification Problem

• Robust Portfolio Design Problem

• Stochastic Queue Location Problem

• Max-Cut Problem

• Quadratic Knapsack Problem

• Quadratically Constrained Quadratic Programming Problems

• Discrete Optimization Problems

...


II. General Models


How to build an LCoP Model?

• Modeling is Art!

– Solid understanding

– Wide knowledge

– Constant practice


Where does the cone come from?

• Vectors

• Matrices

• Functions

• ...


From Vectors: K = Rn+ (First Orthant)

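When $K = \mathbb{R}^n_+ = \{ x \in \mathbb{R}^n \mid x_i \ge 0, \ i = 1, \ldots, n \}$, LCoP becomes LP.

\[
\begin{array}{ll}
\min & c^T x \\
\text{s.t.} & A x = b \\
& x \in \mathbb{R}^n_+
\end{array}
\tag{LP}
\]

where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$ and $c \in \mathbb{R}^n$.

Dual of LP:

\[
\begin{array}{ll}
\max & b^T y \\
\text{s.t.} & A^T y + s = c \\
& s \in \mathbb{R}^n_+
\end{array}
\tag{LD}
\]

where $y \in \mathbb{R}^m$ and $s \in \mathbb{R}^n$.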


K = Rn+

Figures: $\mathbb{R}^1_+$, $\mathbb{R}^2_+$ and $\mathbb{R}^3_+$


K = Ln (Lorentz Cone/Second Order Cone)

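When $K = L^n = \{ x \in \mathbb{R}^n \mid \sqrt{x_1^2 + \cdots + x_{n-1}^2} \le x_n \}$, LCoP becomes SOCP.

\[
\begin{array}{ll}
\min & c^T x \\
\text{s.t.} & A x = b \\
& x \in L^n
\end{array}
\tag{SOCP}
\]

where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$ and $c \in \mathbb{R}^n$.

Dual of SOCP:

\[
\begin{array}{ll}
\max & b^T y \\
\text{s.t.} & A^T y + s = c \\
& s \in L^n
\end{array}
\tag{SOCD}
\]

where $y \in \mathbb{R}^m$ and $s \in \mathbb{R}^n$.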


K = Ln

Figures: $L^2$ and $L^3$


Generalized SOCP

The product of Second Order Cones

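$K = L^{n_1} \times \cdots \times L^{n_r}$, where $n_1 + \cdots + n_r = n$. Then $K$ is still a cone.

Alternative SOCP:

\[
\begin{array}{ll}
\min & c^T x \\
\text{s.t.} & A x - b \in K, \\
& x \in \mathbb{R}^n.
\end{array}
\]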


From Matrices: K = Sn+ (Positive Semidefinite Cone)

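When $K = S^n_+ = \{ X \in \mathbb{R}^{n \times n} \mid X = X^T \succeq 0 \}$, LCoP becomes SDP.

\[
\begin{array}{lll}
\min & C \bullet X & \\
\text{s.t.} & A_i \bullet X = b_i, & i = 1, \ldots, m, \\
& X \in S^n_+ &
\end{array}
\tag{SDP}
\]

where $C, A_1, \ldots, A_m$ are given $n \times n$ symmetric matrices, $b_1, \ldots, b_m$ are given scalars, and

\[ M \bullet X = \sum_{i,j} M_{ij} X_{ij} = \mathrm{tr}(M^T X). \]

Dual of SDP:

\[
\begin{array}{ll}
\max & b^T y \\
\text{s.t.} & \sum_{i=1}^m y_i A_i + S = C \\
& S \in S^n_+
\end{array}
\tag{SDD}
\]

where $y = (y_1, \ldots, y_m)^T$ is a vector in $\mathbb{R}^m$ and $S$ is an $n \times n$ symmetric matrix.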


Alternative SDP form

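Given $C_i, D \in S^n$ and $y_i \in \mathbb{R}$:

\[
\begin{array}{ll}
\min & \sum_{i=1}^m C_i \bullet X_i \\
\text{s.t.} & \sum_{i=1}^m y_i X_i - D \in S^n_+, \\
& X_i \in S^n
\end{array}
\]

More generally,

\[
\begin{array}{lll}
\min & C \bullet X + d^T x & \\
\text{s.t.} & A_i \bullet X + (a^i)^T x = b_i, & i = 1, \ldots, m, \\
& X \in S^n_+, \ x \in \mathbb{R}^p_+, &
\end{array}
\]

where $C, A_i \in S^n$ and $d, a^i \in \mathbb{R}^p$ for $i = 1, \ldots, m$, and $b = (b_1, \ldots, b_m)^T \in \mathbb{R}^m$.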


K = Sn+

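\[
S^2_+ = \left\{ (x, y, z) \in \mathbb{R}^3 \;\middle|\; \begin{bmatrix} x & y \\ y & z \end{bmatrix} \succeq 0 \right\},
\qquad
\begin{bmatrix} x & y \\ y & z \end{bmatrix} \succeq 0 \iff x \ge 0, \ z \ge 0, \ xz \ge y^2.
\]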
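The characterization of $S^2_+$ can be spot-checked numerically. The sketch below is an illustration only: the function names and sample points are hypothetical, and the eigenvalue test uses the closed-form smallest eigenvalue of a symmetric 2×2 matrix.

```python
import math

def is_psd_2x2(x, y, z, tol=1e-12):
    """PSD test for [[x, y], [y, z]] via its smallest eigenvalue."""
    mean = 0.5 * (x + z)
    radius = math.hypot(0.5 * (x - z), y)
    return mean - radius >= -tol

def in_s2_by_inequalities(x, y, z, tol=1e-12):
    """The three inequalities from the slide: x >= 0, z >= 0, xz >= y^2."""
    return x >= -tol and z >= -tol and x * z - y * y >= -tol

# Hypothetical sample points (x, y, z), some inside and some outside the cone.
samples = [(1.0, 0.5, 1.0), (1.0, 2.0, 1.0), (0.0, 0.0, 3.0),
           (-1.0, 0.0, 1.0), (2.0, -1.0, 0.5)]
agreement = [is_psd_2x2(*p) == in_s2_by_inequalities(*p) for p in samples]
print(agreement)
```

Both tests agree on every sample, including the boundary point (2, −1, 0.5), where $xz = y^2$.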


SOCP =⇒ SDP

$L^{n+1}$ can be easily embedded into $S^{n+1}_+$ by observing that

\[
\begin{bmatrix} x \\ t \end{bmatrix} \in L^{n+1}
\iff
\begin{bmatrix} t & x^T \\ x & t I_n \end{bmatrix} \in S^{n+1}_+ .
\]
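One way to see this equivalence is through the Schur complement of the block $t I_n$. The short derivation below is a sketch added for illustration; it is not taken from the slides.

```latex
% Case t > 0: the block t I_n is positive definite, so by the Schur complement
\begin{bmatrix} t & x^T \\ x & t I_n \end{bmatrix} \succeq 0
\iff t - x^T (t I_n)^{-1} x \ge 0
\iff t^2 \ge x^T x
\iff \|x\|_2 \le t .
% Case t = 0: a PSD matrix with a zero diagonal entry has a zero row and column,
% so x = 0, which again matches \|x\|_2 \le t.
% Case t < 0: the diagonal entry t is negative, so the matrix is not PSD,
% and \|x\|_2 \le t is impossible as well.
```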


From Functions: Cone of Nonnegative Functions

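• Nonnegative quadratic functions over a given $F \subseteq \mathbb{R}^n$:

\[
f(x) = x^T A x + 2 b^T x + c \ge 0, \ \forall x \in F,
\qquad
f \Leftrightarrow \begin{bmatrix} c & b^T \\ b & A \end{bmatrix}
\]

• $D_F = \left\{ \begin{bmatrix} c & b^T \\ b & A \end{bmatrix} \in S^{n+1} \;\middle|\; \begin{bmatrix} c & b^T \\ b & A \end{bmatrix} \bullet \begin{bmatrix} 1 & x^T \\ x & x x^T \end{bmatrix} \ge 0, \ \forall x \in F \right\}$ is a cone.

• $D_F^* = \mathrm{cl}\, \mathrm{Cone} \left\{ \begin{bmatrix} 1 & x^T \\ x & x x^T \end{bmatrix} \in S^{n+1} \;\middle|\; x \in F \right\}$ is a closed convex cone.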


From Functions: Cone of Nonnegative Functions

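• $F = \mathbb{R}^n \implies D_F = S^{n+1}_+$ (Positive Semidefinite Cone), $D_F^* = S^{n+1}_+$.

• $F = \mathbb{R}^n_+ \implies D_F = C_{n+1}$ (Copositive Cone), $D_F^* = C^*_{n+1}$ (Completely Positive Cone).

• Larger $F$ implies smaller $D_F$; in particular,

\[
C^*_{n+1} \subseteq (S^{n+1}_+ \cap N^{n+1}_+) \subseteq (S^{n+1}_+)^* = S^{n+1}_+ \subseteq (S^{n+1}_+ + N^{n+1}_+) \subseteq C_{n+1},
\]

and

\[ D_F^* \subseteq S^{n+1}_+ \subseteq D_F, \quad \forall F \subseteq \mathbb{R}^n. \]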


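From Functions: K = DF (Cone of Nonnegative Quadratic Functions)

When $K = D_F$, LCoP becomes CoP.

\[
\begin{array}{lll}
\min & C \bullet X & \\
\text{s.t.} & A_i \bullet X = b_i, & i = 1, \ldots, m, \\
& X \in D_F &
\end{array}
\tag{CoP}
\]

where $C, A_1, \ldots, A_m$ are given $(n+1) \times (n+1)$ symmetric matrices (matching $D_F \subseteq S^{n+1}$) and $b_1, \ldots, b_m$ are given scalars.

Dual of CoP:

\[
\begin{array}{ll}
\max & b^T y \\
\text{s.t.} & \sum_{i=1}^m y_i A_i + S = C, \\
& S \in D_F^*
\end{array}
\tag{CoD}
\]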



Copositive Cone and Completely Positive Cone

\[ D^*_{\mathbb{R}^n_+} = C^*_{n+1} \subseteq S^{n+1}_+ \subseteq C_{n+1} = D_{\mathbb{R}^n_+}. \]

Figures: Copositive Cone $C_2$ and Completely Positive Cone $C_2^*$ (3-D plots over axes x, y, z)


Exampling Model

Torricelli Point Problem

The problem was proposed by Pierre de Fermat in the 17th century. Given three points a, b and c in the plane $\mathbb{R}^2$, find the point in the plane that minimizes the total distance to the three given points. The solution method was found by Torricelli, hence the point is known as the Torricelli point.

Figure : Torricelli Point Problem


Torricelli Point Problem

Hint

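\[
t_1 \ge \|x - a\|_2 \Leftrightarrow \begin{bmatrix} x - a \\ t_1 \end{bmatrix} \in L^3, \qquad
t_2 \ge \|x - b\|_2 \Leftrightarrow \begin{bmatrix} x - b \\ t_2 \end{bmatrix} \in L^3, \qquad
t_3 \ge \|x - c\|_2 \Leftrightarrow \begin{bmatrix} x - c \\ t_3 \end{bmatrix} \in L^3.
\]

SOCP Formulation

\[
\begin{array}{ll}
\min & t_1 + t_2 + t_3 \\
\text{s.t.} & \begin{bmatrix} x - a \\ t_1 \end{bmatrix} \in L^3, \quad
\begin{bmatrix} x - b \\ t_2 \end{bmatrix} \in L^3, \quad
\begin{bmatrix} x - c \\ t_3 \end{bmatrix} \in L^3
\end{array}
\]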


Exampling Model

Weber Problem

In 1909, the German economist Alfred Weber introduced the problem of finding the best location for a company's warehouse, such that the total transportation cost to serve the customers is minimized. Suppose that there are m customers to be served. Let the location of customer i be $a_i \in \mathbb{R}^2$, $i = 1, \ldots, m$. Customers may have different demands, which translate into a weight $\omega_i$ for customer i, $i = 1, \ldots, m$. Denote the desired location of the warehouse by x.

SOCP Formulation

\[
\begin{array}{ll}
\min & \sum_{i=1}^m \omega_i t_i \\
\text{s.t.} & \begin{bmatrix} x - a_i \\ t_i \end{bmatrix} \in L^3, \quad i = 1, \ldots, m.
\end{array}
\]
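For small instances, the Weber objective $\sum_i \omega_i \|x - a_i\|_2$ can also be minimized without a conic solver. The sketch below uses the classical Weiszfeld fixed-point iteration, which is a swapped-in technique and not the SOCP approach of the slides. The function names and test data are hypothetical, and the handling of an iterate that lands exactly on a customer location is a simplification.

```python
import math

def total_cost(x, points, weights):
    """Weber objective: sum_i w_i * ||x - a_i||_2."""
    return sum(w * math.dist(x, p) for p, w in zip(points, weights))

def weber_point(points, weights, iters=1000, eps=1e-12):
    """Weiszfeld iteration for the planar Weber problem (illustrative sketch)."""
    total = sum(weights)
    # Start from the weighted centroid.
    x = [sum(w * p[k] for p, w in zip(points, weights)) / total for k in range(2)]
    for _ in range(iters):
        num, den = [0.0, 0.0], 0.0
        for p, w in zip(points, weights):
            d = math.dist(x, p)
            if d < eps:
                # Iterate coincides with a data point; stop (simplification).
                return x
            num[0] += w * p[0] / d
            num[1] += w * p[1] / d
            den += w / d
        x = [num[0] / den, num[1] / den]
    return x

# Torricelli special case: unit weights on an equilateral triangle,
# whose Torricelli point is the centroid.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
x_star = weber_point(triangle, [1.0, 1.0, 1.0])

# A weighted instance: the result should cost no more than the weighted centroid.
customers = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
demand = [1.0, 2.0, 1.0, 3.0]
x_w = weber_point(customers, demand)
centroid = [sum(w * p[k] for p, w in zip(customers, demand)) / sum(demand) for k in range(2)]
```

Setting all weights to 1 with three points recovers the Torricelli point problem as a special case.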


Exampling Model

Correlation Matrix Verification

Consider three random variables A, B and C with pairwise correlation coefficients $\rho_{AB}$, $\rho_{AC}$ and $\rho_{BC}$. Suppose we know from prior knowledge (e.g., empirical results of experiments) that $-0.2 \le \rho_{AB} \le -0.1$ and $0.4 \le \rho_{BC} \le 0.5$. What are the smallest and largest values that $\rho_{AC}$ can take?

Hint

The correlation coefficients are valid if and only if

\[
\begin{bmatrix} 1 & \rho_{AB} & \rho_{AC} \\ \rho_{AB} & 1 & \rho_{BC} \\ \rho_{AC} & \rho_{BC} & 1 \end{bmatrix} \succeq 0 .
\]


Correlation Matrix Verification

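SDP formulation

The above problem can be formulated as the following problem:

\[
\begin{array}{ll}
\min / \max & \rho_{AC} \\
\text{s.t.} & -0.2 \le \rho_{AB} \le -0.1 \\
& 0.4 \le \rho_{BC} \le 0.5 \\
& \rho_{AA} = \rho_{BB} = \rho_{CC} = 1 \\
& \begin{bmatrix} \rho_{AA} & \rho_{AB} & \rho_{AC} \\ \rho_{AB} & \rho_{BB} & \rho_{BC} \\ \rho_{AC} & \rho_{BC} & \rho_{CC} \end{bmatrix} \in S^3_+
\end{array}
\]

Answer

\[ -0.978 \le \rho_{AC} \le 0.872 \]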
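The stated answer can be cross-checked without an SDP solver. For a 3×3 matrix with unit diagonal and $|\rho_{AB}|, |\rho_{BC}| \le 1$, positive semidefiniteness reduces to the determinant condition $(\rho_{AC} - \rho_{AB}\rho_{BC})^2 \le (1 - \rho_{AB}^2)(1 - \rho_{BC}^2)$. The sketch below scans a grid over the two given intervals; the helper name and the grid resolution are illustrative choices.

```python
import math

def rho_ac_range(rho_ab, rho_bc):
    """Interval of rho_AC keeping the 3x3 correlation matrix PSD for fixed rho_AB, rho_BC."""
    spread = math.sqrt((1 - rho_ab ** 2) * (1 - rho_bc ** 2))
    return rho_ab * rho_bc - spread, rho_ab * rho_bc + spread

steps = 100  # grid resolution (an arbitrary choice)
lo, hi = float("inf"), float("-inf")
for i in range(steps + 1):
    rho_ab = -0.2 + 0.1 * i / steps
    for j in range(steps + 1):
        rho_bc = 0.4 + 0.1 * j / steps
        a, b = rho_ac_range(rho_ab, rho_bc)
        lo, hi = min(lo, a), max(hi, b)

print(round(lo, 3), round(hi, 3))
```

The minimum is reached at $(\rho_{AB}, \rho_{BC}) = (-0.2, 0.4)$ and the maximum at $(-0.1, 0.4)$, both corners of the box, so the grid recovers the bounds on the slide.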


Exampling Model

Robust Portfolio Design

Assume there are n investment options in the market with an uncertain reward $r$, where $E[r] = \hat{r}$ and $Var[r] = \Sigma$. Find a robust portfolio that optimizes the worst possible reward.

Hint

We may assume that the rewards $r$ lie in an ellipsoid

\[ E = \{ r = \hat{r} + \kappa \Sigma^{1/2} u : \|u\|_2 \le 1 \}, \]

where $\hat{r}$ is the expected return, $\Sigma$ is the empirical covariance matrix, and $0 < \kappa < 1$ is a given constant.

Robust counterpart (optimize the worst case):

\[ \max_{\omega} \ \min_{r \in E} \ \{ r^T \omega : e^T \omega = 1, \ \omega \ge 0 \}. \]


Robust Portfolio Design

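SOCP Formulation

Notice that

\[
\min_{r \in E} r^T \omega
= \min_{\|u\|_2 \le 1} \{ \hat{r}^T \omega + \kappa u^T \Sigma^{1/2} \omega \}
= \hat{r}^T \omega - \kappa \|\Sigma^{1/2} \omega\|_2
\]

and

\[
\hat{r}^T \omega - \kappa \|\Sigma^{1/2} \omega\|_2 \ge t
\iff
\begin{bmatrix} \kappa \Sigma^{1/2} \omega \\ \hat{r}^T \omega - t \end{bmatrix} \in L^{n+1} .
\]

The robust portfolio problem becomes an SOCP:

\[
\begin{array}{ll}
\max & \hat{r}^T \omega - \kappa \|\Sigma^{1/2} \omega\|_2 \\
\text{s.t.} & e^T \omega = 1, \ \omega \ge 0
\end{array}
\iff
\begin{array}{ll}
\max & t \\
\text{s.t.} & e^T \omega = 1, \ \omega \ge 0 \\
& \begin{bmatrix} \kappa \Sigma^{1/2} \omega \\ \hat{r}^T \omega - t \end{bmatrix} \in L^{n+1}
\end{array}
\]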
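The closed-form worst case $\hat{r}^T\omega - \kappa\|\Sigma^{1/2}\omega\|_2$ can be spot-checked by sampling the unit ball. The sketch below assumes a diagonal $\Sigma$ so that $\Sigma^{1/2}$ is elementwise; all numbers and function names are hypothetical.

```python
import math
import random

def worst_case_return(r_hat, sigma_diag, w, kappa):
    """Closed form r_hat^T w - kappa * ||Sigma^{1/2} w||_2 for diagonal Sigma."""
    mean = sum(r * x for r, x in zip(r_hat, w))
    risk = math.sqrt(sum(s * x * x for s, x in zip(sigma_diag, w)))
    return mean - kappa * risk

def realized_return(r_hat, sigma_diag, w, kappa, u):
    """r^T w for the reward r = r_hat + kappa * Sigma^{1/2} u."""
    return sum((r + kappa * math.sqrt(s) * ui) * x
               for r, s, ui, x in zip(r_hat, sigma_diag, u, w))

# Hypothetical data: three assets, diagonal covariance, a feasible portfolio.
r_hat = [0.08, 0.12, 0.10]
sigma_diag = [0.01, 0.04, 0.02]
w = [0.5, 0.2, 0.3]
kappa = 0.5

bound = worst_case_return(r_hat, sigma_diag, w, kappa)

# The minimizing direction is u* = -Sigma^{1/2} w / ||Sigma^{1/2} w||_2.
v = [math.sqrt(s) * x for s, x in zip(sigma_diag, w)]
norm_v = math.sqrt(sum(t * t for t in v))
u_star = [-t / norm_v for t in v]
attained = realized_return(r_hat, sigma_diag, w, kappa, u_star)

# Random points of the unit ball never do worse than the closed-form bound.
rng = random.Random(0)
samples = []
for _ in range(1000):
    u = [rng.gauss(0.0, 1.0) for _ in w]
    scale = rng.random() / math.sqrt(sum(t * t for t in u))
    samples.append(realized_return(r_hat, sigma_diag, w, kappa, [scale * t for t in u]))
```

The sampled returns stay above `bound`, and `attained` matches it, which is the Cauchy-Schwarz argument behind the second equality above.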


Exampling Model

Stochastic Queue Location Problem

Suppose there is a deliverer who serves m potential customers in a region. Customers place requests by phone at random times. Once a customer call is received, the deliverer is dispatched in a first-come, first-served manner. If the deliverer is out, the customer has to wait. The goal is to find a good location for the deliverer's station so as to minimize the expected waiting time for service.



Hint

Assume one delivery per trip, and let the probability that customer i calls be $p_i$, $i = 1, \ldots, m$. The demand/request process follows a Poisson distribution with overall arrival rate $\lambda$. The problem can be treated as an M/G/1 queue in queueing theory, so that the expected service time, including waiting and traveling, can be computed explicitly. To this end, denote the speed of the deliverer by $v$, the location of customer i by $a_i$, and the location of the deliverer's station by $x$.

Reference

Mamnoon Jamil, Alok Baveja, Rajan Batta: The stochastic queue center problem. Computers and Operations Research, 26, 1999, pp. 1423-1436.



Problem formulation

According to queueing theory, the expected waiting time for customer i is given by

\[
\omega_i(x) =
\frac{(2\lambda / v^2) \sum_{j=1}^m p_j \|x - a_j\|_2^2}{1 - (2\lambda / v) \sum_{j=1}^m p_j \|x - a_j\|_2}
+ \frac{1}{v} \|x - a_i\|_2 ,
\]

where the first term is the expected waiting time for the deliverer to become free and the second term is the time the deliverer takes to travel to the customer after departing from the station.



Note that

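\[
\|x - a_i\|_2 \le t_{0i} \iff \begin{bmatrix} x - a_i \\ t_{0i} \end{bmatrix} \in L^3
\]

\[
\|x - a_i\|_2^2 / s \le t_i, \ s > 0 \iff
\left\| \begin{bmatrix} x - a_i \\ \frac{t_i - s}{2} \end{bmatrix} \right\|_2 \le \frac{t_i + s}{2}
\]

We can formulate this problem as an SOCP:

\[
\begin{array}{ll}
\min & (2 m \lambda / v^2) \sum_{i=1}^m p_i t_i + (1/v) \sum_{i=1}^m t_{0i} \\
\text{s.t.} & s = 1 - (2\lambda / v) \sum_{i=1}^m p_i t_{0i}, \\
& \begin{bmatrix} x - a_i \\ t_{0i} \end{bmatrix} \in L^3, \quad
\begin{bmatrix} x - a_i \\ \frac{t_i - s}{2} \\ \frac{t_i + s}{2} \end{bmatrix} \in L^4, \quad i = 1, \ldots, m.
\end{array}
\]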


Exampling Model

Convex QCQP =⇒ SOCP

The popularity of SOCP is also due to the fact that it generalizes convex QCQP (Quadratically Constrained Quadratic Programming). Specifically, consider the following QCQP:

\[
\begin{array}{ll}
\min & x^T A_0 x + 2 b_0^T x + c_0 \\
\text{s.t.} & x^T A_i x + 2 b_i^T x + c_i \le 0, \quad i = 1, \ldots, m
\end{array}
\]

where $A_0 \succeq 0$ and $A_i \succeq 0$ for $i = 1, \ldots, m$.

Note that

\[
t \ge \sum_{i=1}^n x_i^2
\iff
\left\| \begin{bmatrix} x_1 \\ \vdots \\ x_n \\ (t-1)/2 \end{bmatrix} \right\|_2 \le \frac{t+1}{2}
\iff
\begin{bmatrix} x_1 \\ \vdots \\ x_n \\ (t-1)/2 \\ (t+1)/2 \end{bmatrix} \in L^{n+2}
\]
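The first equivalence rests on a difference-of-squares identity. The short worked step below is added for illustration.

```latex
% Difference of squares:
\left(\frac{t+1}{2}\right)^2 - \left(\frac{t-1}{2}\right)^2 = t .
% If t >= \sum_i x_i^2, then t >= 0, so (t+1)/2 > 0 and squaring is reversible:
\sum_{i=1}^n x_i^2 + \left(\frac{t-1}{2}\right)^2 \le \left(\frac{t+1}{2}\right)^2
\iff \sum_{i=1}^n x_i^2 \le t .
% Conversely, the norm inequality forces (t+1)/2 >= 0, and squaring gives the same condition.
```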


Convex QCQP =⇒ SOCP

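Therefore, for each $i = 0, \ldots, m$,

\[
x^T A_i x + 2 b_i^T x + c_i \le u_i
\iff
\begin{bmatrix}
A_i^{1/2} x \\
- b_i^T x - c_i/2 - 1/2 + u_i/2 \\
- b_i^T x - c_i/2 + 1/2 + u_i/2
\end{bmatrix} \in L^{n+2}
\]

Convex QCQP can be equivalently written as

\[
\begin{array}{ll}
\min & u \\
\text{s.t.} &
\begin{bmatrix}
A_0^{1/2} x \\
- b_0^T x - c_0/2 - 1/2 + u/2 \\
- b_0^T x - c_0/2 + 1/2 + u/2
\end{bmatrix} \in L^{n+2} \\[2ex]
&
\begin{bmatrix}
A_i^{1/2} x \\
- b_i^T x - c_i/2 - 1/2 \\
- b_i^T x - c_i/2 + 1/2
\end{bmatrix} \in L^{n+2}, \quad i = 1, \ldots, m.
\end{array}
\]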


General QCQP

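Define

\[
D_{n+1} = \left\{ U \in S^{n+1} \;\middle|\; U \bullet \begin{bmatrix} 1 & x^T \\ x & x x^T \end{bmatrix} \ge 0, \ \forall x \in \mathrm{feas(QCQP)} \right\}
\]

\[
D^*_{n+1} = \mathrm{cl}\, \mathrm{cone} \left\{ X = \begin{bmatrix} 1 \\ x \end{bmatrix} \begin{bmatrix} 1 \\ x \end{bmatrix}^T \;\middle|\; x \in \mathrm{feas(QCQP)} \right\}
\]

General QCQP becomes LCoP:

\[
\begin{array}{ll}
\min & \begin{bmatrix} c_0 & b_0^T \\ b_0 & A_0 \end{bmatrix} \bullet Y \\
\text{s.t.} & Y_{11} = 1, \\
& Y \in D^*_{n+1}.
\end{array}
\tag{LCoP}
\]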


Exampling Model

MAX CUT Problem

Given a graph $G = (V, E, W)$, find an optimal partition of the node set into two subsets $V_1$ and $V_2$ such that the weighted cut is maximized.

\[
\begin{array}{ll}
\max & \sum_{i=1}^n \sum_{j=1}^n w_{ij} \dfrac{1 - x_i x_j}{2} \\
\text{s.t.} & x_i \in \{-1, 1\}, \quad i = 1, \ldots, n
\end{array}
\tag{MAX-CUT}
\]


MAX CUT Problem

Hint

• $x_i \in \{-1, 1\} \iff x_i^2 = 1$, so Max-cut is a QCQP.

• $S^{n+1}_+$ is contained in $D_{n+1}$.

Theorem

If $w_{ij} \ge 0$ for all $i \ne j$, then the expected value of the cut produced by the randomized algorithm is at least $\alpha \approx 0.878$ times the value of the maximum cut.

Reference

Michel X. Goemans, David P. Williamson: Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, 42(6), 1995, pp. 1115-1145.
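To make the ±1 formulation concrete, the sketch below solves tiny Max-cut instances by brute-force enumeration. This only illustrates the objective; it is not the semidefinite relaxation or the randomized rounding of the reference. It sums over pairs $i < j$ so that each edge is counted once; for a symmetric $W$, the double sum in (MAX-CUT) gives exactly twice this value. The graphs and function names are hypothetical.

```python
from itertools import product

def cut_weight(w, x):
    """Weight of the cut induced by x in {-1, 1}^n, each edge {i, j} counted once."""
    n = len(w)
    return sum(w[i][j] * (1 - x[i] * x[j]) / 2
               for i in range(n) for j in range(i + 1, n))

def max_cut_brute_force(w):
    """Enumerate all 2^n sign vectors; only sensible for very small n."""
    n = len(w)
    best_x = max(product((-1, 1), repeat=n), key=lambda x: cut_weight(w, x))
    return cut_weight(w, best_x), best_x

# Hypothetical unit-weight graphs: a triangle and a 4-cycle.
triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
square = [[0, 1, 0, 1],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [1, 0, 1, 0]]

triangle_value, triangle_x = max_cut_brute_force(triangle)
square_value, square_x = max_cut_brute_force(square)
```

The triangle is an odd cycle, so at most two of its three edges can be cut, while the bipartite 4-cycle has all four edges in its maximum cut.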


Concluding Recommendation

Update your toolbox with Linear Conic Programming models!


III. Essential Concepts


Basic Knowledge

Content
• Vectors, Matrices, and Spaces
• Inner Products and Norms
• Open, Closed, Interior, and Boundary Sets
• Functions
• Linear Systems
• Convex Sets and Functions

Key Concepts
• Relative interior of a given set
• Conjugate function/transform of a given function
• Dual cone of a given cone


Vectors, Matrices and Spaces

• Real numbers: R, R+, R++

• Euclidean space: Rn

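• First orthant: $\mathbb{R}^n_+$

• n-dimensional (column) vector: $x = (x_1, x_2, \ldots, x_n)^T$

• Matrix space: $\mathbb{R}^{m \times n}$

• Matrix: $M \in \mathbb{R}^{m \times n}$, with ith row $M_{i\cdot}$, jth column $M_{\cdot j}$, and ijth entry $M_{ij}$ (or $M_{i,j}$)

• Space of symmetric square matrices (an $n(n+1)/2$-dimensional space):

\[ S^n = \{ M \in \mathbb{R}^{n \times n} \mid M = M^T \}. \]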



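Given $M \in \mathbb{R}^{m \times n}$, $N \in \mathbb{R}^{n \times m}$, $S \in \mathbb{R}^{n \times n}$:

• Determinant: $\det(S)$

• Trace: $\mathrm{tr}(S)$, with $\mathrm{tr}(MN) = \mathrm{tr}(NM)$

• Null space: $\mathcal{N}(M) = \{ x \in \mathbb{R}^n \mid Mx = 0 \}$.

• Range space: $\mathcal{R}(M) = \{ y \in \mathbb{R}^m \mid y = Mx \text{ for some } x \in \mathbb{R}^n \}$.

• Positive semidefinite matrix: $S \succeq 0 \iff z^T S z \ge 0, \ \forall z \in \mathbb{R}^n$

• Positive definite matrix: $S \succ 0 \iff z^T S z > 0, \ \forall z \in \mathbb{R}^n$ with $z \ne 0$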



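Theorem (Schur complement theorem)

Given

\[
X = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix}
\quad \text{and} \quad
S = C - B^T A^{-1} B,
\]

if $A \succ 0$, then $X \succeq 0 \ (\succ 0) \iff S \succeq 0 \ (\succ 0)$.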
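A standard way to see why the theorem holds is the block factorization below, which is a congruence transformation and therefore preserves (semi)definiteness. It is a sketch added for illustration.

```latex
\begin{bmatrix} A & B \\ B^T & C \end{bmatrix}
=
\begin{bmatrix} I & 0 \\ B^T A^{-1} & I \end{bmatrix}
\begin{bmatrix} A & 0 \\ 0 & C - B^T A^{-1} B \end{bmatrix}
\begin{bmatrix} I & A^{-1} B \\ 0 & I \end{bmatrix}
% The outer factors are invertible and transposes of each other (A is symmetric),
% so X is PSD (PD) iff diag(A, S) is PSD (PD); with A positive definite this
% reduces to the condition on S.
```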


Inner Products and Norms

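• Inner products:

\[ x \bullet y = x^T y = \sum_i x_i y_i, \qquad X \bullet Y = \mathrm{tr}(X^T Y) = \sum_{i,j} X_{ij} Y_{ij} \]

• Norms:
  – Euclidean norm: $\|x\|_2 = \sqrt{x \bullet x}$
  – p-norm: $\|x\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}$ for $p \ge 1$
  – Infinity-norm: $\|x\|_\infty = \max\{ |x_1|, \ldots, |x_n| \}$
  – Frobenius norm: $\|X\|_F = \sqrt{X \bullet X} = \sqrt{\mathrm{tr}(X^T X)}$

• Note that $x^T A x = A \bullet x x^T$.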


Open, Closed, Interior and Boundary Sets

• Neighborhood: N(x0; ε) = {x ∈ Rn| ‖x− x0‖ < ε}.

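• Open: $X \subset \mathbb{R}^n$ is open if for any $x \in X$ there exists $\varepsilon > 0$ such that $N(x; \varepsilon) \subset X$.

• Closed: $X \subset \mathbb{R}^n$ is closed if $\mathbb{R}^n \setminus X = \{ x \in \mathbb{R}^n \mid x \notin X \}$ is open.

• Closure: the closure of a set $X \subset \mathbb{R}^n$ is the smallest closed set containing $X$, denoted $\mathrm{cl}(X)$.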


Open, Closed, Interior and Boundary Sets

• Interior: the interior of a given set X ⊂ Rn is

int(X ) = {x ∈ X |∃ εx > 0 such that N(x; εx) ⊂ X}

• Boundary of a set X ⊂ Rn:

bdry(X ) = cl(X )\int(X ) = {x ∈ cl(X )|x /∈ int(X )}

• Bounded: a set X ⊂ Rn is bounded if there exists an r > 0 such that

‖x‖ < r,∀x ∈ X


Functions

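• Continuous: $f : X \subset \mathbb{R}^n \to \mathbb{R}$ is continuous at $x_0$ if
  (i) $x_0 \in X$
  (ii) $\lim_{x \to x_0} f(x) = f(x_0)$

• Continuous function: $f \in C^0(X)$ means f is continuous at all points in $X \subset \mathbb{R}^n$.

• Gradient: for $f : X \subset \mathbb{R}^n \to \mathbb{R}$,

\[ \nabla f(x) = \left[ \frac{\partial f(x)}{\partial x_1}, \cdots, \frac{\partial f(x)}{\partial x_n} \right]_{1 \times n} \]

• Hessian: for $f : X \subset \mathbb{R}^n \to \mathbb{R}$,

\[ F(x) = \left[ \frac{\partial^2 f(x)}{\partial x_i \partial x_j} \right]_{n \times n} \]

• Continuously differentiable function: $f \in C^p(X)$ ($p = 1, 2, \cdots$) means f is p times continuously differentiable over $X \subset \mathbb{R}^n$.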


Functions

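Theorem (Taylor theorem)

Let $X$ be open, $f \in C^p(X)$, $x^1, x^2 \in X$, $x^1 \ne x^2$ and

\[ x(\theta) = \theta x^1 + (1 - \theta) x^2 \in X, \quad \forall \ 0 \le \theta \le 1. \]

Then there exists $\bar{x} = \bar{\theta} x^1 + (1 - \bar{\theta}) x^2 \in X$, $0 < \bar{\theta} < 1$, such that

\[
f(x^2) = f(x^1) + \sum_{k=1}^{p-1} \frac{1}{k!} d^k f(x^1; x^2 - x^1) + \frac{1}{p!} d^p f(\bar{x}; x^2 - x^1)
\]

where $d^k f(x; h)$ is the k-th order differential of the function f along h.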


Functions: Big O and Small o

Let g(·) be a real-valued function on R.

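• $g(x) = O(m(x))$: there exists $c \ge 0$ such that

\[ \left| \frac{g(x)}{m(x)} \right| \le c \quad \text{as } x \to 0 \ (\text{or } +\infty) \]

• $g(x) = o(m(x))$:

\[ \left| \frac{g(x)}{m(x)} \right| \to 0 \quad \text{as } x \to 0 \ (\text{or } +\infty) \]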


Functions

Taylor theorem in small o formulation:

• p = 1

\[ f(x + h) = f(x) + \nabla f(x) h + o(\|h\|) \]

• p = 2

\[ f(x + h) = f(x) + \nabla f(x) h + \frac{1}{2} h^T F(x) h + o(\|h\|^2) \]


Linear Systems

Given $x^1, \cdots, x^m \in \mathbb{R}^n$:

• Linear combination: $\sum_{i=1}^m \lambda_i x^i$, where $\lambda_i \in \mathbb{R}$, $i = 1, \ldots, m$.

• Linearly independent: $\sum_{i=1}^m \lambda_i x^i = 0 \Rightarrow \lambda_1 = \cdots = \lambda_m = 0$

• Affine combination: a linear combination with $\sum_{i=1}^m \lambda_i = 1$

• Affinely independent: if $x^2 - x^1, \cdots, x^m - x^1$ are linearly independent.


Linear Systems

• Convex combination: a linear combination with $\sum_{i=1}^m \lambda_i = 1$ and $\lambda_i \ge 0$, $i = 1, \ldots, m$

• Hyperplane:

\[ X = \{ x \in \mathbb{R}^n \mid a^T x = \sum_{i=1}^n a_i x_i = b \} \]

• Affine space: any affine combination of two points in the space is still in the space. (An intersection of finitely many hyperplanes.)

• Linear subspace: an affine space containing the origin.

We can always transform an affine space $Y \subset \mathbb{R}^n$ into a linear subspace $X \subset \mathbb{R}^n$ by choosing $x^0 \in Y$ and setting

\[ X = \{ x - x^0 \mid x \in Y \} \]


Linear Systems

• Half space:

\[ X = \{ x \in \mathbb{R}^n \mid a^T x = \sum_{i=1}^n a_i x_i \le b \} \]

• Polyhedron: an intersection of finitely many half spaces.

• Polytope: a bounded polyhedron.

• Dimension of a linear subspace: the maximum number of linearly independent vectors in the subspace.

• Dimension of an affine space: the dimension of the transformed linear subspace.

• Dimension of a polyhedron: the dimension of the smallest affine space containing it.


Linear Systems

• Linear equations

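\[
\left.
\begin{array}{c}
a^1 \bullet x = b_1 \\
a^2 \bullet x = b_2 \\
\cdots \\
a^m \bullet x = b_m
\end{array}
\right\} \Rightarrow A x = b,
\]

where $a^1, \cdots, a^m$ and $x$ are all in $\mathbb{R}^n$.

\[
\left.
\begin{array}{c}
A_1 \bullet X = b_1 \\
A_2 \bullet X = b_2 \\
\cdots \\
A_m \bullet X = b_m
\end{array}
\right\} \Rightarrow \mathcal{A} X = b,
\]

where $A_1, \cdots, A_m$ and $X$ are all in $S^n$.

• For convenience, $\mathcal{A}^* y = \sum_{i=1}^m y_i A_i$.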


Convex Sets and Properties

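• A set $X \subset \mathbb{R}^n$ is convex if for any $x^1 \in X$ and $x^2 \in X$, we have $\lambda x^1 + (1 - \lambda) x^2 \in X$ for all $0 \le \lambda \le 1$.

• Convex hull: the smallest convex set containing a given set,

\[
\mathrm{conv}(X) = \left\{ x \in \mathbb{R}^n \;\middle|\; x = \sum_{i=1}^m \lambda_i y^i \text{ for some } m \in \mathbb{N}_+, \ \lambda_i \ge 0, \ \sum_{i=1}^m \lambda_i = 1, \ y^i \in X, \ i = 1, \ldots, m \right\}
\]

• Dimension of a convex set: the dimension of the smallest affine space containing it.

• Relative interior of a convex set $X \subset \mathbb{R}^n$: suppose $H$ is the smallest affine space containing $X$; then

\[ \mathrm{ri}(X) = \{ x \in \mathbb{R}^n \mid \exists \text{ open set } Y \subseteq \mathbb{R}^n \text{ such that } x \in Y \cap H \subset X \} \]

• Supporting hyperplane $H = \{ x \in \mathbb{R}^n \mid a^T x = b \}$ of a convex set $X$:

\[ a^T y \ge b, \ \forall y \in X \quad \text{and} \quad X \cap H \ne \emptyset. \]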


Convex Functions and Properties

• Epigraph of a function f : X ⊂ Rn → R

epif = {(x, λ) ∈ Rn+1|λ ≥ f(x), x ∈ X}

• Closed function: if epif is a closed set.

• Convex function: if epif is a convex set.

• Concave function: if −f is a convex function.

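• Convex hull function: $\mathrm{conv}(f)$ of a function $f : X \subset \mathbb{R}^n \to \mathbb{R}$ is a function on $X$ such that $\mathrm{epi}(\mathrm{conv}(f)) = \mathrm{conv}(\mathrm{epi}(f))$.

Lemma

$f : X \subset \mathbb{R}^n \to \mathbb{R}$ is a convex function if and only if for any $x^1, x^2 \in X$ and $0 \le \lambda \le 1$, we have

\[ f(\lambda x^1 + (1 - \lambda) x^2) \le \lambda f(x^1) + (1 - \lambda) f(x^2). \]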



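• Subgradient: $d \in \mathbb{R}^n$ is a subgradient of a convex function $f : X \subset \mathbb{R}^n \to \mathbb{R}$ at $x \in X$ if for any $y \in X$,

\[ f(y) \ge f(x) + d^T (y - x) \]

• The set $\{ (y, \lambda) \in \mathbb{R}^{n+1} \mid \lambda - d^T y = f(x) - d^T x \}$ is a supporting hyperplane of $\mathrm{epi} f$ at $(x, f(x))$.

• Subdifferential of a convex function $f : X \subset \mathbb{R}^n \to \mathbb{R}$ at $x \in X$:

\[ \partial f(x) = \{ d \in \mathbb{R}^n \mid d \text{ is a subgradient of } f \text{ at } x \} \]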



Figure: The graph of $f(x) = x^2/4$ with tangent lines $y = \pm x - 1$, $y = \pm 2x - 4$ and $y = \pm 3x - 9$, illustrating $(x, f(x) = x^2/4) \leftrightarrow (m, b(m) = -m^2)$ for lines $y = mx + b$

Figure: $(m, b(m) = -m^2)$ for lines $y = mx + b$ $\leftrightarrow$ $(x, f(x) = x^2/4)$

Figure: The graph of $g(y) = -y^2$ with tangent lines $x = \pm 2y + 1$ and $x = \pm 4y + 4$, illustrating $(y, g(y) = -y^2) \to (m, b(m) = m^2/4)$ for lines $x = my + b$

Figure: $(m, b(m) = m^2/4)$ for lines $x = my + b$ $\leftrightarrow$ $(y, g(y) = -y^2)$



• Conjugate (transform) of $f : X \subset \mathbb{R}^n \to \mathbb{R}$:

\[ h(y) = \sup_{x \in X} \{ y \bullet x - f(x) \} \]

with $h$ being defined on $Y = \{ y \in \mathbb{R}^n \mid h(y) < +\infty \}$.

Lemma

$h : Y$ is a closed, convex function.

Lemma (Fenchel's inequality)

Given $f : X$ and its conjugate $h : Y$,

\[ x \bullet y \le f(x) + h(y), \quad \forall x \in X \text{ and } y \in Y. \]

Moreover, $x \bullet y = f(x) + h(y) \iff y \in \partial f(x)$.
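As a small worked instance, take $f(x) = x^2/4$ on $\mathbb{R}$. The supremum of $yx - x^2/4$ is attained at $x = 2y$, so $h(y) = y^2$. The sketch below approximates the conjugate by a maximum over a finite grid and checks Fenchel's inequality at a few points; the grid and the helper name are illustrative assumptions.

```python
def conjugate_on_grid(f, y, xs):
    """Approximate h(y) = sup_x { y*x - f(x) } by a maximum over the grid xs."""
    return max(y * x - f(x) for x in xs)

def f(x):
    return x * x / 4.0

xs = [i / 100.0 for i in range(-2000, 2001)]  # grid on [-20, 20], step 0.01
ys = (-3.0, -1.0, 0.0, 0.5, 2.0)

# The maximizer x = 2y lies on the grid for each y, so the grid value is exact here.
h_values = {y: conjugate_on_grid(f, y, xs) for y in ys}

# Fenchel's inequality x*y <= f(x) + h(y), using the exact conjugate h(y) = y^2.
fenchel_ok = all(x * y <= f(x) + y * y + 1e-12 for x in xs[::50] for y in ys)
```

Equality in Fenchel's inequality holds exactly when $y = f'(x) = x/2$, which is the condition $y \in \partial f(x)$ from the lemma.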


Conjugate Functions and Properties

Let $f : X \subset \mathbb{R}^n \to \mathbb{R}$ be a function with conjugate transform $h : Y$.

• For $\alpha \in \mathbb{R}$, the conjugate of $f + \alpha$ is $h - \alpha$.

• For $a \in \mathbb{R}^n$, the conjugate of $\hat{f}(x) = f(x) + x \bullet a$ on $X$ is $\hat{h}(y) = h(y - a)$, $\forall y \in Y$.

• For $a \in \mathbb{R}^n$, the conjugate of $\hat{f}(x) = f(x - a)$ on $X$ is $\hat{h}(y) = h(y) + y \bullet a$, $\forall y \in Y$.

• For $\lambda > 0$, the conjugate of $f_1(x) = \lambda f(x)$ on $X$ is $h_1(y) = \lambda h(y / \lambda)$, $\forall y \in \lambda Y$.

• For $\lambda > 0$, the conjugate of $f_2(x) = f(x / \lambda)$ on $\lambda X$ is $h_2(y) = h(\lambda y)$, $\forall y \in Y / \lambda$.

Theorem

Assume that $f_1 : X$ and $f_2 : X$ have the same convex hull function. Then they have the same conjugate transform $h : Y$ when it exists.


Conjugate Functions and Properties

We know the dual problem of LD is LP again. When will the conjugate transform of $h : Y$ become $f : X$?

Proper function

A convex function f is proper if its epigraph is non-empty and contains no vertical lines, i.e., if $f(x) < +\infty$ for at least one x and $f(x) > -\infty$ for every x.

Theorem

Let $f : X \subset \mathbb{R}^n \to \mathbb{R}$ be a proper closed convex function with conjugate transform $h : Y$. Then the conjugate transform of $h : Y$ is $f : X$. Moreover, $y \in \partial f(x)$ if and only if $x \in \partial h(y)$. In this case,

\[ x \bullet y = f(x) + h(y) \iff y \in \partial f(x) \text{ or } x \in \partial h(y) \]


Convex Cone Structure

Content

• Convex Cones and Properties

• Partial Order and Ordered Vector Space

• Some Examples


Convex Cones and Properties

• A set K ⊂ Rn is a cone if

∀x ∈ K and λ ≥ 0⇒ λx ∈ K;

• A cone K ⊂ Rn is pointed if

K ∩ −K = {0};

• A cone $K \subset \mathbb{R}^n$ is solid if $\mathrm{int} K \ne \emptyset$;

• A cone K ⊂ Rn is proper if it is pointed, solid, closed and convex.



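• Conic combination: a linear combination $\sum_{i=1}^m \lambda_i x^i$ with $\lambda_i \ge 0$, $x^i \in \mathbb{R}^n$ for all $i = 1, \ldots, m$.

• The conic hull of a set $X \subset \mathbb{R}^n$ is

\[
\mathrm{Cone}(X) = \left\{ x \in \mathbb{R}^n \;\middle|\; x = \sum_{i=1}^m \lambda_i x^i \text{ for some } m \in \mathbb{N}_+ \text{ and } x^i \in X, \ \lambda_i \ge 0, \ i = 1, \ldots, m \right\}
\]

• The dual cone $K^* \subset \mathbb{R}^n$ of a cone $K \subset \mathbb{R}^n$ is

\[ K^* = \{ y \in \mathbb{R}^n \mid y \bullet x \ge 0, \ \forall x \in K \} \]

$K^*$ is a closed, convex cone.

• If $K^* = K$, then $K$ is a self-dual cone.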



$K$, $K_1$, $K_2$ are convex cones in $\mathbb{R}^n$.

• $(K^*)^* = \mathrm{cl}(K)$

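• $K_1 \cap K_2$, $K_1 \cup K_2$, $K_1 + K_2$ are all cones

• $(K_1 + K_2)^* = K_1^* \cap K_2^*$

• $K_1$ and $K_2$ both closed does not, in general, guarantee that $K_1 + K_2$ is closed; the sum is closed, for example, when both cones are polyhedral.

• $\mathrm{ri}(K_1 + K_2) = \mathrm{ri}(K_1) + \mathrm{ri}(K_2)$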

• The supporting hyperplane of $K$ always contains the origin

• If $K$ is solid (pointed), then $K^*$ is pointed (solid).


Partial Order and Ordered Vector Space

• A relation “≥” is a partial order on a set X if it has:

1. reflexivity: a ≥ a for all a ∈ X ;

2. antisymmetry: a ≥ b and b ≥ a imply a = b;

3. transitivity: a ≥ b and b ≥ c imply a ≥ c.

• An ordered vector space X is equipped with a partial order “≥” which also satisfies:

• homogeneity: a ≥ b and λ ∈ R+ imply λa ≥ λb;

• additivity: a ≥ b and c ≥ d imply a+ c ≥ b+ d.


Partial Order and Ordered Vector Space

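• A proper cone $K$ in a vector space induces a partial order "$\ge_K$":

\[ a \ge_K b \Leftrightarrow a - b \in K \]

which leads to an ordered vector space.

• Similarly, we can define "$\le_K$":

\[ a \le_K b \Leftrightarrow b \ge_K a \]

• Closedness of $K$ allows passing to limits in $\ge_K$:

\[ a_i \ge_K b_i, \ a_i \to a, \ b_i \to b \text{ as } i \to \infty \ \Rightarrow \ a \ge_K b. \]

• Solidness of $K$ allows us to define a strict inequality:

\[ a >_K b \Leftrightarrow a - b \in \mathrm{int} K, \quad \text{and} \quad a <_K b \Leftrightarrow b >_K a. \]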


Examples: Rn+

• Rn+ is a proper cone.

• Inner product: x • y = xT y

• (Rn+)∗ = Rn+ (self-dual)

• Partial order: "$\ge_{\mathbb{R}^n_+}$"


Examples: Ln

• $L^n$ / SOC(n − 1): Lorentz cone (second order cone)

\[ L^n = \{ x \in \mathbb{R}^n \mid x_n \ge \sqrt{x_1^2 + \cdots + x_{n-1}^2} \} \]

• $L^n$ is a proper cone.

• Inner product: $x \bullet y = x^T y$

• $(L^n)^* = L^n$ (self-dual)

• Partial order: "$\ge_{L^n}$"
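Self-duality of the Lorentz cone can be illustrated numerically: inner products of cone members are nonnegative, and any point outside the cone has an explicit witness inside the cone with negative inner product. The sketch below is a sampling check, not a proof; the function names, the dimension and the sample sizes are arbitrary choices.

```python
import math
import random

def in_lorentz(x, tol=1e-12):
    """Membership test for L^n: x_n >= ||(x_1, ..., x_{n-1})||_2."""
    return x[-1] + tol >= math.sqrt(sum(v * v for v in x[:-1]))

def random_lorentz_point(n, rng):
    """A random member of L^n: random head, last entry at least the head's norm."""
    head = [rng.uniform(-1.0, 1.0) for _ in range(n - 1)]
    return head + [math.sqrt(sum(v * v for v in head)) + rng.uniform(0.0, 1.0)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def witness_against(y):
    """For y outside L^n, return x in L^n with x . y < 0."""
    head_norm = math.sqrt(sum(v * v for v in y[:-1]))
    if head_norm == 0.0:
        return [0.0] * (len(y) - 1) + [1.0]   # here y_n < 0
    return [-v for v in y[:-1]] + [head_norm]

rng = random.Random(1)
points = [random_lorentz_point(4, rng) for _ in range(200)]
min_inner = min(dot(x, y) for x in points for y in points)

outside = [1.0, 2.0, 2.0, 1.0]   # norm of head is 3 > 1, so not in L^4
x_w = witness_against(outside)
```

The witness construction mirrors the usual proof of $(L^n)^* \subseteq L^n$, while the nonnegativity of `min_inner` reflects the Cauchy-Schwarz argument for $L^n \subseteq (L^n)^*$.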


Examples: Sn+

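• $S^n_+ \subset S^n$: the set of symmetric positive semidefinite matrices

• $S^n_+$ is a proper cone.

• Inner product: $X \bullet Y = \mathrm{tr}(X^T Y)$

• Another view:

\[
\mathrm{vec}(X) = [X_{11}, \sqrt{2} X_{12}, X_{22}, \sqrt{2} X_{13}, \sqrt{2} X_{23}, X_{33}, \cdots, X_{nn}]^T \in \mathbb{R}^{n(n+1)/2}
\]

Then

\[ X \bullet Y = \mathrm{vec}(X) \bullet \mathrm{vec}(Y) = \sum_{i,j} X_{ij} Y_{ij} \]

• Partial order: "$\ge_{S^n_+}$" or "$\succeq$"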


Examples: Sn+

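Lemma

$(S^n_+)^* = S^n_+$ (self-dual)

Proof.

"⊆": If $X \in (S^n_+)^*$, then $z^T X z = X \bullet z z^T \ge 0$ for all $z \in \mathbb{R}^n$. Therefore, $X \in S^n_+$.

"⊇": Any $Y \in S^n_+$ can be written as

\[ Y = \sum_{i=1}^n \lambda_i z^i (z^i)^T \quad \text{with } \lambda_i \ge 0. \]

If $X \in S^n_+$, then

\[ X \bullet Y = \sum_{i=1}^n \lambda_i X \bullet z^i (z^i)^T = \sum_{i=1}^n \lambda_i (z^i)^T X z^i \ge 0. \]

Therefore, $X \in (S^n_+)^*$.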


Examples: Cn and C∗n

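• Copositive cone:

\[ C_n = \{ X \in S^n \mid z^T X z \ge 0, \ \forall z \ge_{\mathbb{R}^n_+} 0 \} \]

• Completely positive (nonnegative) cone:

\[
C_n^* = \left\{ X \in S^n \;\middle|\; X = \sum_{i=1}^m z^i (z^i)^T \text{ for some } m \in \mathbb{N}_+ \text{ and } z^i \ge_{\mathbb{R}^n_+} 0, \ i = 1, \ldots, m \right\}
\]

• $(C_n)^* = C_n^*$ and $C_n = (C_n^*)^*$

• $C_n^* \subset S^n_+ \subset C_n$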
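The inclusion $S^n_+ \subset C_n$ is strict for $n \ge 2$. A small illustration, with a matrix chosen here for the purpose: $X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ gives $z^T X z = 2 z_1 z_2 \ge 0$ whenever $z \ge 0$, yet $X$ is not positive semidefinite. The grid test below is only a spot check of copositivity, not a proof.

```python
def quad_form(X, z):
    """Evaluate z^T X z for a square matrix X given as nested lists."""
    n = len(z)
    return sum(X[i][j] * z[i] * z[j] for i in range(n) for j in range(n))

X = [[0.0, 1.0],
     [1.0, 0.0]]

# Nonnegative on a grid of the nonnegative orthant (z^T X z = 2 * z1 * z2).
grid = [k / 10.0 for k in range(11)]
copositive_on_grid = all(quad_form(X, (a, b)) >= 0.0 for a in grid for b in grid)

# A vector with mixed signs shows that X is not positive semidefinite.
psd_violation = quad_form(X, (1.0, -1.0))
```

Here `psd_violation` is negative, so $X \in C_2$ but $X \notin S^2_+$.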


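Examples: Cones of Nonnegative Quadratic Functions (Homogeneous)

• $F \subset \mathbb{R}^n$

• Nonnegative homogeneous quadratic functions over $F$:

\[ f(x) = x^T A x \ge 0, \ \forall x \in F, \qquad f \Leftrightarrow A \]

• $HD_F = \{ A \in S^n \mid x^T A x \ge 0, \ \forall x \in F \}$ is a closed, convex cone.

  (i) Closedness:

\[ x^T A_i x \ge 0 \text{ and } A_i \to A \ \Rightarrow \ x^T A x \ge 0 \]

  (ii) Convexity:

\[ x^T A_i x \ge 0, \ i = 1, 2 \ \Rightarrow \ x^T (\lambda A_1 + (1 - \lambda) A_2) x \ge 0, \ \forall \ 0 \le \lambda \le 1 \]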


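Examples: Cones of Nonnegative Quadratic Functions (Homogeneous)

• $HD_F^* = \mathrm{cl}(\mathrm{Cone}\{ x x^T \mid x \in F \})$

• $(HD_F)^* = HD_F^*$ and $(HD_F^*)^* = HD_F$

• Examples:
  – $F = \mathbb{R}^n$: $HD_F = HD_F^* = S^n_+$
  – $F = \mathbb{R}^n_+$: $HD_F = C_n$ and $HD_F^* = C_n^*$
  – $F = \{ x \in \mathbb{R}^n_+ \mid e^T x = 1 \}$: $HD_F = C_n$ and $HD_F^* = C_n^*$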


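Examples: Cones of Nonnegative Quadratic Functions (Nonhomogeneous)

• Nonnegative quadratic functions over $F \subset \mathbb{R}^n$:

\[
f(x) = x^T A x + 2 b^T x + c \ge 0, \ \forall x \in F, \qquad
f \Leftrightarrow \begin{bmatrix} c & b^T \\ b & A \end{bmatrix}
\]

• $D_F = \left\{ \begin{bmatrix} c & b^T \\ b & A \end{bmatrix} \in S^{n+1} \;\middle|\; \begin{bmatrix} 1 \\ x \end{bmatrix}^T \begin{bmatrix} c & b^T \\ b & A \end{bmatrix} \begin{bmatrix} 1 \\ x \end{bmatrix} \ge 0, \ \forall x \in F \right\}$ is a closed, convex cone.

• $D_F^* = \mathrm{cl}\left( \mathrm{Cone} \left\{ \begin{bmatrix} 1 & x^T \\ x & x x^T \end{bmatrix} \;\middle|\; x \in F \right\} \right)$

• $(D_F^*)^* = D_F$ and $(D_F)^* = D_F^*$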


Examples: Cones of Nonnegative Quadratic Functions (Nonhomogeneous)

Examples:

• $F = \mathbb{R}^n$: $D_F = D_F^* = S^{n+1}_+$

• $F = \mathbb{R}^n_+$: $D_F = C_{n+1}$ and $D_F^* = C^*_{n+1}$


IV. Basic Theory


Duality Theory of Linear Conic Programming

Content

• Definition of LCoP and LCoD

• Conjugate Duality Theory

• Deriving LCoD from LCoP

• Conic Duality Theorems for LCoP

• Duality Theorems of LP, SOCP and SDP


Linear Conic Programs

Recall that

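\[
\begin{array}{lll}
\min & C \bullet X & \\
\text{s.t.} & A_i \bullet X = b_i, & i = 1, \ldots, m \\
& X \in K &
\end{array}
\tag{LCoP}
\]

where $K$ is a closed, convex cone; $b_i \in \mathbb{R}$; and $C$, $A_i$ lie in the space of interest, with "$\bullet$" being an appropriate linear operator.

Note that when $K = \mathbb{R}^n_+$ or $L^n$, $X$ is a vector; when $K = S^n_+$, $X$ is an $n \times n$ matrix.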


Linear Conic Programs

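\[
\begin{array}{lll}
\min & c \bullet x & \\
\text{s.t.} & a^i \bullet x = b_i, & i = 1, \ldots, m \\
& x \in K &
\end{array}
\tag{LCoP}
\]

where $K$ is a closed, convex cone, such as $\mathbb{R}^n_+$, $L^n$ or $S^n_+$.

\[
\begin{array}{ll}
\max & b^T y \\
\text{s.t.} & \sum_{i=1}^m y_i a^i + s = c \\
& s \in K^*, \ y \in \mathbb{R}^m
\end{array}
\tag{LCoD}
\]

where $K^*$ is the dual cone of $K$.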


Conjugate Duality Theory

Conjugate Program

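\[
\begin{array}{ll}
\inf & f(x) \\
\text{s.t.} & x \in X \cap K
\end{array}
\tag{CP}
\]

where $f : X \subset \mathbb{R}^n \to \mathbb{R}$ and $K$ is a cone in $\mathbb{R}^n$.

Conjugate Dual

\[
\begin{array}{ll}
\inf & h(y) \\
\text{s.t.} & y \in Y \cap K^*
\end{array}
\tag{CD}
\]

where $h : Y$ is the conjugate transform of $f : X$ and $K^*$ is the dual cone of $K$.

• feas(*) denotes the feasible domain of problem (*)
• opt(*) denotes the optimal solution set of problem (*)
• v(*) denotes the optimal value of problem (*)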



Theorem (Conjugate duality theorem)

If $x \in$ feas(CP) and $y \in$ feas(CD), then

\[ 0 \le x \bullet y \le f(x) + h(y) \]

with equality holding if and only if

\[ x \bullet y = 0 \quad \text{and} \quad y \in \partial f(x), \]

in which case $x \in$ opt(CP) and $y \in$ opt(CD).

Proof

The inequality follows from Fenchel's inequality and the definition of the dual cone. The rest follows easily.



Theorem (Weak duality theorem)

If both CP and CD are feasible, then

(i) v(CP) is finite and

\[ v(\mathrm{CP}) + h(y) \ge 0, \ \forall y \in \mathrm{feas(CD)}; \]

(ii) v(CD) is finite and

\[ v(\mathrm{CP}) + v(\mathrm{CD}) \ge 0. \]

Proof

This theorem follows from the previous conjugate duality theorem.



Theorem (Fenchel's theorem/Strong duality theorem)

Suppose that $f : X$ and $K$ are closed and convex. If v(CD) is finite and one of the following conditions holds:

(i) $\mathrm{ri}(K^*) \cap \mathrm{ri}(Y) \ne \emptyset$,
(ii) both $K^*$ and $Y$ are polyhedra,

then v(CP) + v(CD) = 0 and opt(CP) ≠ ∅.

Similarly, if v(CP) is finite and one of the following conditions holds:

(i) $\mathrm{ri}(K) \cap \mathrm{ri}(X) \ne \emptyset$,
(ii) both $K$ and $X$ are polyhedra,

then v(CP) + v(CD) = 0 and opt(CD) ≠ ∅.

Proof: See Rockafellar's book "Convex Analysis", Section 31.


Deriving LCoD from LCoP

LCoP

\[
\begin{array}{lll}
\min & c \bullet x & \\
\text{s.t.} & a^i \bullet x = b_i, & i = 1, \ldots, m \\
& x \in K &
\end{array}
\tag{LCoP}
\]

We derive LCoD within the framework of conjugate programs.



LCoP as CP

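Variables: $u^T = (u_0, u_1, \ldots, u_m) \in \mathbb{R}^{m+1}$;

$f(u) = u_0$;

$X = \{ u \in \mathbb{R}^{m+1} \mid u_i = b_i, \ i = 1, \ldots, m \}$;

$K_0 = \{ u \in \mathbb{R}^{m+1} \mid u_0 = c \bullet x, \ u_i = a^i \bullet x, \ i = 1, \ldots, m, \ x \in K \}$.

\[
\begin{array}{ll}
\inf & f(u) \\
\text{s.t.} & u \in X \cap K_0
\end{array}
\]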



Corresponding CD

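Variables: $v^T = (v_0, v_1, \ldots, v_m) \in \mathbb{R}^{m+1}$;

\[
h(v) = \sup_{u \in X} \{ u \bullet v - f(u) \}
= \sup_{u_0 \in \mathbb{R}} \left\{ (v_0 - 1) u_0 + \sum_{i=1}^m b_i v_i \right\},
\]

which is finite ($< +\infty$) only when $v_0 = 1$. Hence

$h(v) = \sum_{i=1}^m b_i v_i$;

$Y = \{ v \in \mathbb{R}^{m+1} \mid v_0 = 1 \}$;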



Corresponding CD

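Moreover,

\[
\begin{aligned}
K_0^* &= \{ v \in \mathbb{R}^{m+1} \mid v \bullet u \ge 0, \ \forall u \in K_0 \} \\
&= \{ v \in \mathbb{R}^{m+1} \mid (v_0 c + \textstyle\sum_{i=1}^m v_i a^i) \bullet x \ge 0, \ \forall x \in K \} \\
&= \{ v \in \mathbb{R}^{m+1} \mid v_0 c + \textstyle\sum_{i=1}^m v_i a^i \in K^* \}.
\end{aligned}
\]

Hence

\[ Y \cap K_0^* = \left\{ v \in \mathbb{R}^{m+1} \;\middle|\; v_0 = 1, \ c + \sum_{i=1}^m v_i a^i = s, \ s \in K^* \right\}. \]

\[
\begin{array}{ll}
\inf & \sum_{i=1}^m b_i v_i \\
\text{s.t.} & c + \sum_{i=1}^m v_i a^i = s \\
& s \in K^*
\end{array}
\]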



CD to LCoD

Defining the variables $y = -(v_1, \ldots, v_m)^T$, we have

\[
\begin{array}{ll}
\max & b^T y \\
\text{s.t.} & \sum_{i=1}^m y_i a^i + s = c \\
& s \in K^*, \ y \in \mathbb{R}^m
\end{array}
\tag{LCoD}
\]

Therefore, the duality theorems of conjugate programs apply to LCoP.


Conic Duality Theorems for LCoP

Theorem (Weak duality theorem)

If both LCoP and LCoD are feasible, then

\[ c \bullet x \ge b^T y, \quad \forall x \in \mathrm{feas(LCoP)} \text{ and } (y, s) \in \mathrm{feas(LCoD)}. \]

Theorem (Strong duality theorem)

(i) If feas(LCoP) ∩ int(K) ≠ ∅ and v(LCoP) is finite, then there exists $(y^*, s^*) \in$ feas(LCoD) such that $b^T y^* = v(\mathrm{LCoP})$.

(ii) If feas(LCoD) ∩ int(K∗) ≠ ∅ and v(LCoD) is finite, then there exists $x^* \in$ feas(LCoP) such that $c \bullet x^* = v(\mathrm{LCoD})$.

Proof: See Aharon Ben-Tal and Arkadi Nemirovski's book "Lectures on Modern Convex Optimization", Chapter 2.
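The weak duality inequality can be verified directly from the two feasibility systems. The one-line derivation below is a sketch added for illustration, using the notation of (LCoP) and (LCoD).

```latex
% For x in feas(LCoP) and (y, s) in feas(LCoD):
c \bullet x - b^T y
= \Big( \sum_{i=1}^m y_i a^i + s \Big) \bullet x - \sum_{i=1}^m y_i b_i
= \sum_{i=1}^m y_i \,( a^i \bullet x - b_i ) + s \bullet x
= s \bullet x \;\ge\; 0 ,
% where the last inequality holds because x \in K and s \in K^*.
```

The same identity explains the complementarity condition $x^* \bullet s^* = c \bullet x^* - b^T y^* = 0$ used in the optimality conditions.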


Conic Duality Theorems for LCoP

Theorem

If feas(LCoP) and feas(LCoD) are both nonempty and feas(LCoP) ∩ int(K) ≠ ∅, then $x^*$ is optimal for LCoP if and only if the following conditions hold:

(i) $x^* \in$ feas(LCoP);

(ii) there exists $(y^*, s^*) \in$ feas(LCoD);

(iii) $c \bullet x^* = b^T y^*$ (or equivalently $x^* \bullet s^* = c \bullet x^* - b^T y^* = 0$).

Proof: "=⇒" follows from the strong duality theorem.

"⇐=" is obvious.


Linear Program (LP)

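\[
\begin{array}{ll}
\min & c^T x \\
\text{s.t.} & A x = b \\
& x \ge_{\mathbb{R}^n_+} 0
\end{array}
\tag{LP}
\]

\[
\begin{array}{ll}
\max & b^T y \\
\text{s.t.} & A^T y + s = c \\
& s \ge_{\mathbb{R}^n_+} 0
\end{array}
\tag{LD}
\]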


Linear Program (LP)

Theorem (LP duality theorem)

(i) If either LP or LD is unbounded, then the other one is infeasible.

(ii) If either v(LP) or v(LD) is finite, then there exist $x^* \in$ feas(LP) and $(y^*, s^*) \in$ feas(LD) such that $v(\mathrm{LP}) = c^T x^* = b^T y^* = v(\mathrm{LD})$.

(iii) If LP is feasible and v(LP) is finite, then $x^*$ is optimal for LP if and only if the following conditions hold:

(a) $A x^* = b$, $x^* \ge_{\mathbb{R}^n_+} 0$;

(b) there exists $(y^*, s^*)$ satisfying $A^T y^* + s^* = c$, $s^* \ge_{\mathbb{R}^n_+} 0$;

(c) $(x^*)^T s^* = c^T x^* - b^T y^* = 0$.


Second Order Cone Program (SOCP)

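\[
\begin{array}{ll}
\text{Min} & c^T x \\
\text{s.t.} & Ax = b, \\
& x \ge_{K} 0,
\end{array}
\tag{SOCP}
\]

where $K = L^{n_1} \times \cdots \times L^{n_r} = \{x \in \mathbb{R}^n \mid n_1 + \cdots + n_r = n,\ (x_1, \dots, x_{n_1})^T \in L^{n_1}, \dots, (x_{n-n_r+1}, \dots, x_n)^T \in L^{n_r}\}$.

\[
\begin{array}{ll}
\text{Max} & b^T y \\
\text{s.t.} & A^T y + s = c, \\
& s \ge_{K} 0.
\end{array}
\tag{SOCD}
\]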



Theorem (SOCP duality theorem)

(i) If either SOCP or SOCD is unbounded, then the other one is infeasible.

(ii) If there exists a feasible solution $x$ such that $x \in \mathrm{int}(K)$, and $v(\mathrm{SOCP})$ is finite, then there exists $(y^*, s^*) \in \mathrm{feas}(\mathrm{SOCD})$ such that $v(\mathrm{SOCP}) = b^T y^* = v(\mathrm{SOCD})$.

(iii) If there exists a feasible solution $(y, s)$ such that $s \in \mathrm{int}(K)$, and $v(\mathrm{SOCD})$ is finite, then there exists $x^* \in \mathrm{feas}(\mathrm{SOCP})$ such that $v(\mathrm{SOCP}) = c^T x^* = v(\mathrm{SOCD})$.



Theorem (SOCP duality theorem)

(iv) If both SOCP and SOCD are feasible, and there exists a feasible solution $x$ such that $x \in \mathrm{int}(K)$, then $x^*$ is optimal for SOCP if and only if the following conditions hold:

(a) $Ax^* = b$, $x^* \ge_K 0$;

(b) there exists $(y^*, s^*)$ satisfying $A^T y^* + s^* = c$, $s^* \ge_K 0$;

(c) $(x^*)^T s^* = c^T x^* - b^T y^* = 0$.


Difference between LP and SOCP (interior feasible solution):

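\[
\begin{array}{ll}
\text{Min} & -x_2 \\
\text{s.t.} & x_1 - x_3 = 0, \\
& x \in L^3.
\end{array}
\qquad
\begin{array}{ll}
\text{Max} & 0 \cdot y \\
\text{s.t.} & \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix} - y \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}
             = \begin{pmatrix} -y \\ -1 \\ y \end{pmatrix} \in L^3.
\end{array}
\]

$v(\mathrm{SOCP}) = 0$ but SOCD is infeasible.

Figure: The feasible domain is a ray $x_1 = x_3$ in the hyperplane $x_2 = 0$. There is no feasible interior point.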



Finite nonzero duality gap:

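\[
\begin{array}{ll}
\text{Min} & -x_2 \\
\text{s.t.} & x_1 + x_3 - x_4 + x_5 = 0, \\
& x_2 + x_4 = 1, \\
& x \in L^3 \times L^2.
\end{array}
\qquad
\begin{array}{ll}
\text{Max} & y_2 \\
\text{s.t.} & y_1 + s_1 = 0, \\
& y_2 + s_2 = -1, \\
& y_1 + s_3 = 0, \\
& -y_1 + y_2 + s_4 = 0, \\
& y_1 + s_5 = 0, \\
& s \in L^3 \times L^2.
\end{array}
\]

\[
x^* = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \times \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
y^* = \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \qquad
s^* = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \times \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]

$v(\mathrm{SOCP}) = 0 \ne -1 = v(\mathrm{SOCD})$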
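The stated points can be checked numerically. The following is a minimal sketch in plain Python that verifies feasibility of $x^*$ and $(y^*, s^*)$ and evaluates both objectives. It does not prove that these points are optimal. The helper name `in_soc` is hypothetical, and the cone convention $L^n = \{z : \sqrt{z_1^2 + \dots + z_{n-1}^2} \le z_n\}$ is the one used elsewhere in these slides.

```python
import math

def in_soc(z, tol=1e-12):
    """Membership test for L^n = {z : sqrt(z_1^2 + ... + z_{n-1}^2) <= z_n}."""
    return math.sqrt(sum(t * t for t in z[:-1])) <= z[-1] + tol

# Data of the example: constraints A x = b, objective c^T x, cone L^3 x L^2.
A = [[1, 0, 1, -1, 1],
     [0, 1, 0, 1, 0]]
b = [0, 1]
c = [0, -1, 0, 0, 0]

x_star = [-1, 0, 1, 1, 1]
y_star = [-1, -1]
# Dual slack recovered from the dual constraint: s = c - A^T y.
s_star = [c[j] - sum(A[i][j] * y_star[i] for i in range(2)) for j in range(5)]

primal_ok = (
    all(sum(A[i][j] * x_star[j] for j in range(5)) == b[i] for i in range(2))
    and in_soc(x_star[:3]) and in_soc(x_star[3:])
)
dual_ok = in_soc(s_star[:3]) and in_soc(s_star[3:])
primal_value = sum(cj * xj for cj, xj in zip(c, x_star))
dual_value = sum(bi * yi for bi, yi in zip(b, y_star))
print(primal_ok, dual_ok, primal_value, dual_value)
```

Both points are feasible while the two objective values differ, which is the finite nonzero gap claimed on the slide.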



Zero duality gap with non-attainable value:

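\[
\begin{array}{ll}
\text{Min} & x_1 \\
\text{s.t.} & -x_2 - x_3 = 0, \\
& x_2 = -1, \\
& x \in L^3.
\end{array}
\qquad
\begin{array}{ll}
\text{Max} & -y_2 \\
\text{s.t.} & s_1 = 1, \\
& -y_1 + y_2 + s_2 = 0, \\
& -y_1 + s_3 = 0, \\
& s \in L^3.
\end{array}
\]

\[
x^* = \begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix}
\]

$v(\mathrm{SOCD}) = 0$ but is not attainable.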


Semidefinite Program (SDP)

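\[
\begin{array}{ll}
\text{Min} & C \bullet X \\
\text{s.t.} & \mathcal{A}X = b, \\
& X \succeq 0.
\end{array}
\tag{SDP}
\]

\[
\begin{array}{ll}
\text{Max} & b^T y \\
\text{s.t.} & \mathcal{A}^* y + S = C, \\
& S \succeq 0.
\end{array}
\tag{SDD}
\]

Note:

\[
\mathcal{A}^* y = \sum_{i=1}^{m} y_i A_i
\]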



Theorem (SDP duality theorem)

(i) If either SDP or SDD is unbounded, then the other one is infeasible.

(ii) If there exists a feasible solution $X$ such that $X \succ 0$, and $v(\mathrm{SDP})$ is finite, then there exists $(y^*, S^*) \in \mathrm{feas}(\mathrm{SDD})$ such that $v(\mathrm{SDP}) = b^T y^* = v(\mathrm{SDD})$.

(iii) If there exists a feasible solution $(y, S)$ such that $S \succ 0$, and $v(\mathrm{SDD})$ is finite, then there exists $X^* \in \mathrm{feas}(\mathrm{SDP})$ such that $v(\mathrm{SDP}) = C \bullet X^* = v(\mathrm{SDD})$.



Theorem (SDP duality theorem)

(iv) If both SDP and SDD are feasible, and there exists a feasible solution $X$ such that $X \succ 0$, then $X^*$ is optimal for SDP if and only if the following conditions hold:

(a) $\mathcal{A}X^* = b$, $X^* \succeq 0$;

(b) there exists $(y^*, S^*)$ satisfying $\mathcal{A}^* y^* + S^* = C$, $S^* \succeq 0$;

(c) $X^* \bullet S^* = C \bullet X^* - b^T y^* = 0$.



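Interior feasible solution

Infinite duality gap:

\[
C = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
A = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \quad b = 0
\]

\[
X^* = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \text{ and SDD is infeasible.}
\]

Zero duality gap with non-attainable value:

\[
C = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad
A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad b = 1
\]

$v(\mathrm{SDP}) = 0$ but is not attainable. $y^* = 0$ and $S^* = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$.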



Finite nonzero duality gap:

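\[
C = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad
A_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
A_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}, \quad
b = \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\]

\[
X^* = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
y^* = \begin{bmatrix} 0 \\ -1 \end{bmatrix}, \quad
S^* = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

$v(\mathrm{SDP}) = 0 \ne -1 = v(\mathrm{SDD})$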


V. Solution Methods


Interior Point Methods

Content

• Interior Points and Primal-Dual Model

• Barrier Functions and Optimality Systems

• Central Path and Newton Methods

• Path Following Method


Interior Point Methods

• Interior point approach
  • Start from an interior point solution.
  • If the current solution is not good enough, then move to another interior point solution.
  • Stop at an interior point solution whose objective value is close to the optimum (within an ε gap).

• Advantages:
  • Polynomial time complexity (compared with the simplex method for LP)
  • Excellent computational performance in practice (compared with the ellipsoid method)

• Three types: primal; dual; primal-dual


Primal-dual Model

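• Primal-dual type of LP

\[
\begin{array}{ll}
\text{Min} & s^T x \\
\text{s.t.} & Ax = b, \\
& A^T y + s = c, \\
& x \ge_{\mathbb{R}^n_+} 0,\ s \ge_{\mathbb{R}^n_+} 0.
\end{array}
\tag{LPD}
\]

• Primal-dual type of SDP

\[
\begin{array}{ll}
\text{Min} & S \bullet X \\
\text{s.t.} & \mathcal{A}X = b, \\
& \mathcal{A}^* y + S = C, \\
& X \succeq 0,\ S \succeq 0.
\end{array}
\tag{SDPD}
\]

• Note: $\mathcal{A}X = [A_1 \bullet X, \cdots, A_m \bullet X]^T$ and $\mathcal{A}^* y = \sum_{i=1}^{m} y_i A_i$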


Interior Points

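\[
\begin{aligned}
\mathrm{feas}^+(\mathrm{LP}) &= \{x \in \mathbb{R}^n \mid Ax = b,\ x >_{\mathbb{R}^n_+} 0\} \\
\mathrm{feas}^+(\mathrm{LD}) &= \{(y, s) \in \mathbb{R}^m \times \mathbb{R}^n \mid A^T y + s = c,\ s >_{\mathbb{R}^n_+} 0\} \\
\mathrm{feas}^+(\mathrm{LPD}) &= \mathrm{feas}^+(\mathrm{LP}) \times \mathrm{feas}^+(\mathrm{LD})
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{feas}^+(\mathrm{SDP}) &= \{X \in \mathcal{S}^n \mid \mathcal{A}X = b,\ X \succ 0\} \\
\mathrm{feas}^+(\mathrm{SDD}) &= \{(y, S) \in \mathbb{R}^m \times \mathcal{S}^n \mid \mathcal{A}^* y + S = C,\ S \succ 0\} \\
\mathrm{feas}^+(\mathrm{SDPD}) &= \mathrm{feas}^+(\mathrm{SDP}) \times \mathrm{feas}^+(\mathrm{SDD})
\end{aligned}
\]

• Assumptions:
  • $\mathrm{feas}^+(\mathrm{LP})$ and $\mathrm{feas}^+(\mathrm{LD})$ are not empty and the rows of $A$ are linearly independent.
  • $\mathrm{feas}^+(\mathrm{SDP})$ and $\mathrm{feas}^+(\mathrm{SDD})$ are not empty and the matrices $A_i$ in $\mathcal{A}$ are linearly independent.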


Barrier function

• Properties required:
  • Strictly convex (concave).
  • Goes to +∞ (−∞) when the point is close to the boundary.
  • Sufficiently continuously differentiable.

• Barrier functions:

  LP:  $-\sum_{i=1}^{n} \log x_i$          SDP:  $-\log\det(X)$

  LD:  $\sum_{i=1}^{n} \log s_i$           SDD:  $\log\det(S)$

  LPD: $-\sum_{i=1}^{n} \log(x_i s_i)$     SDPD: $-\log\det(XS)$


LP with Barrier

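\[
\begin{array}{ll}
\text{Min} & c^T x - \mu \sum_{i=1}^{n} \log x_i \\
\text{s.t.} & Ax = b, \\
& x >_{\mathbb{R}^n_+} 0.
\end{array}
\tag{LPB}
\]

\[
\begin{array}{ll}
\text{Max} & b^T y + \mu \sum_{i=1}^{n} \log s_i \\
\text{s.t.} & A^T y + s = c, \\
& s >_{\mathbb{R}^n_+} 0.
\end{array}
\tag{LDB}
\]

\[
\begin{array}{ll}
\text{Min} & s^T x - \mu \sum_{i=1}^{n} \log(x_i s_i) \\
\text{s.t.} & Ax = b, \\
& A^T y + s = c, \\
& x >_{\mathbb{R}^n_+} 0,\ s >_{\mathbb{R}^n_+} 0.
\end{array}
\tag{LPDB}
\]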


Common Optimality System for LP with Barrier

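\[
\begin{array}{l}
Ax = b \\
A^T y + s = c \\
\Lambda_x s = \mu e \\
x >_{\mathbb{R}^n_+} 0,\ s >_{\mathbb{R}^n_+} 0,
\end{array}
\]

where $e = (1, \dots, 1)^T$ and $\Lambda_x$ is a diagonal matrix with $(\Lambda_x)_{ii} = x_i$, $i = 1, \dots, n$.

Notice that

\[
\mu = \frac{x^T s}{n} = \frac{c^T x - b^T y}{n}.
\]

When $\mu \to 0$, $s^T x \to 0$. Optimal!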
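The two expressions for $\mu$ follow directly from the optimality system. A short worked derivation, using only the three equations of that system:

```latex
\[
x^T s = e^T \Lambda_x s = \mu\, e^T e = n\mu ,
\qquad
x^T s = x^T (c - A^T y) = c^T x - (Ax)^T y = c^T x - b^T y .
\]
```

So $\mu$ measures the average complementarity product and, at the same time, the duality gap divided by $n$.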


SDP with Barrier

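\[
\begin{array}{ll}
\text{Min} & C \bullet X - \mu \log\det(X) \\
\text{s.t.} & \mathcal{A}X = b, \\
& X \succ 0.
\end{array}
\tag{SDPB}
\]

\[
\begin{array}{ll}
\text{Max} & b^T y + \mu \log\det(S) \\
\text{s.t.} & \mathcal{A}^* y + S = C, \\
& S \succ 0.
\end{array}
\tag{SDDB}
\]

\[
\begin{array}{ll}
\text{Min} & S \bullet X - \mu \log\det(XS) \\
\text{s.t.} & \mathcal{A}X = b, \\
& \mathcal{A}^* y + S = C, \\
& X \succ 0,\ S \succ 0.
\end{array}
\tag{SDPDB}
\]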


Common Optimality System for SDP with Barrier

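\[
\begin{array}{l}
\mathcal{A}X = b \\
\mathcal{A}^* y + S = C \\
XS = \mu I \\
X \succ 0,\ S \succ 0
\end{array}
\]

Notice that

\[
\mu = \frac{S \bullet X}{n} = \frac{C \bullet X - b^T y}{n}.
\]

When $\mu \to 0$, $S \bullet X \to 0$. Optimal!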


Central Path for LP and SDP

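\[
C_{\mathrm{LP}} = \{(x, y, s) \in \mathrm{feas}^+(\mathrm{LPD}) \mid \Lambda_x s = \mu e,\ 0 < \mu < +\infty\}
\]

\[
C_{\mathrm{SDP}} = \{(X, y, S) \in \mathrm{feas}^+(\mathrm{SDPD}) \mid XS = \mu I,\ 0 < \mu < +\infty\}
\]

Under proper assumptions:

• For any $0 < \mu < +\infty$, there exists a unique point on the central path.

  LP: $(x(\mu), y(\mu), s(\mu))$

  SDP: $(X(\mu), y(\mu), S(\mu))$

• Given $\bar\mu > 0$, the set $\{(x, y, s) \in \mathrm{feas}^+(\mathrm{LPD}) \mid \Lambda_x s = \mu e,\ 0 < \mu < \bar\mu\}$ is bounded.
  Given $\bar\mu > 0$, the set $\{(X, y, S) \in \mathrm{feas}^+(\mathrm{SDPD}) \mid XS = \mu I,\ 0 < \mu < \bar\mu\}$ is bounded.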


Example: Central Path (Primal)

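\[
\begin{array}{ll}
\text{Min} & x_1 + x_2 \\
\text{s.t.} & x_1 + x_2 \le 3, \\
& x_1 - x_2 \le 1, \\
& x_2 \le 2, \\
& x_1 \ge 0,\ x_2 \ge 0.
\end{array}
\]

Figure: Projection of the central path on $(x_1, x_2)$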


Example: Central Path (Dual)

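\[
\begin{array}{ll}
\text{Max} & 3y_1 + y_2 + 2y_3 \\
\text{s.t.} & y_1 + y_2 \le 1, \\
& y_1 - y_2 + y_3 \le 1, \\
& y_1 \le 0,\ y_2 \le 0,\ y_3 \le 0.
\end{array}
\]

Figure: Projection of the central path on $(y_1, y_2)$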


Newton Method for LP

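Given $(x^0, y^0, s^0) \in \mathrm{feas}^+(\mathrm{LPD})$ with $\mu^0 = \frac{(s^0)^T x^0}{n}$ and $0 \le \gamma \le 1$, find $(d_x, d_y, d_s)$ satisfying

\[
\begin{array}{l}
A(x^0 + d_x) = b \\
A^T(y^0 + d_y) + (s^0 + d_s) = c \\
\Lambda_{x^0 + d_x}(s^0 + d_s) = \gamma\mu^0 e \\
x^0 + d_x >_{\mathbb{R}^n_+} 0,\ s^0 + d_s >_{\mathbb{R}^n_+} 0.
\end{array}
\]

After linearization:

\[
\begin{bmatrix} A & 0 & 0 \\ 0 & A^T & I \\ \Lambda_{s^0} & 0 & \Lambda_{x^0} \end{bmatrix}
\begin{bmatrix} d_x \\ d_y \\ d_s \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ \gamma\mu^0 e - \Lambda_{x^0}\Lambda_{s^0} e \end{bmatrix},
\qquad
x^0 + d_x >_{\mathbb{R}^n_+} 0,\ s^0 + d_s >_{\mathbb{R}^n_+} 0.
\]

Solving this system directly is not easy.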


Newton Method for LP

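Linear scaling: Given a positive diagonal matrix $D \in \mathbb{R}^{n \times n}$, let

\[
\bar A = AD, \quad \bar x^0 = D^{-1} x^0, \quad \bar s^0 = D s^0, \quad \bar c = Dc.
\]

\[
\begin{bmatrix} \bar A & 0 & 0 \\ 0 & \bar A^T & I \\ \Lambda_{\bar s^0} & 0 & \Lambda_{\bar x^0} \end{bmatrix}
\begin{bmatrix} \bar d_x \\ d_y \\ \bar d_s \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ \gamma\mu^0 e - \Lambda_{\bar x^0}\Lambda_{\bar s^0} e \end{bmatrix},
\qquad
\bar x^0 + \bar d_x >_{\mathbb{R}^n_+} 0,\ \bar s^0 + \bar d_s >_{\mathbb{R}^n_+} 0.
\]

• $D = \Lambda_{x^0}$: $\bar x^0 = e \Rightarrow \bar x^0 + \bar d_x >_{\mathbb{R}^n_+} 0,\ \forall \|\bar d_x\|_2 < 1$ (Primal)

• $D = \Lambda_{s^0}^{-1}$: $\bar s^0 = e \Rightarrow \bar s^0 + \bar d_s >_{\mathbb{R}^n_+} 0,\ \forall \|\bar d_s\|_2 < 1$ (Dual)

• $D = \Lambda_{x^0}^{1/2} \Lambda_{s^0}^{-1/2}$: $v^0 = \bar x^0 = \bar s^0 = \Lambda_{x^0}^{1/2} \Lambda_{s^0}^{1/2} e$ (Primal-dual)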


Primal-Dual Interior-Point Method for LP

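With $D = \Lambda_{x^0}^{1/2} \Lambda_{s^0}^{-1/2}$:

\[
\begin{bmatrix} \bar A & 0 & 0 \\ 0 & \bar A^T & I \\ I & 0 & I \end{bmatrix}
\begin{bmatrix} \bar d_x \\ d_y \\ \bar d_s \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ \gamma\mu^0 \Lambda_{v^0}^{-1} e - v^0 \end{bmatrix},
\qquad
\bar x^0 + \bar d_x >_{\mathbb{R}^n_+} 0,\ \bar s^0 + \bar d_s >_{\mathbb{R}^n_+} 0.
\]

One can solve

\[
\bar A \bar A^T d_y = -\bar A\,(\gamma\mu^0 \Lambda_{v^0}^{-1} e - v^0)
\]

and then solve for $\bar d_s$ and $\bar d_x$:

\[
\bar d_s = -\bar A^T d_y, \qquad
\bar d_x = -\bar d_s + \gamma\mu^0 \Lambda_{v^0}^{-1} e - v^0 .
\]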


Newton Method for SDP

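Given $(X^0, y^0, S^0) \in \mathrm{feas}^+(\mathrm{SDPD})$ with $\mu^0 = \frac{S^0 \bullet X^0}{n}$ and $0 \le \gamma \le 1$, find $(\Delta X, d_y, \Delta S)$ satisfying

\[
\begin{array}{l}
\mathcal{A}(X^0 + \Delta X) = b \\
\mathcal{A}^*(y^0 + d_y) + (S^0 + \Delta S) = C \\
(X^0 + \Delta X)(S^0 + \Delta S) = \gamma\mu^0 I \\
X^0 + \Delta X \succ 0,\ S^0 + \Delta S \succ 0.
\end{array}
\]

After linearization:

\[
\begin{array}{l}
\mathcal{A}\Delta X = 0 \\
\mathcal{A}^* d_y + \Delta S = 0 \\
\Delta X\, S^0 + X^0 \Delta S = \gamma\mu^0 I - X^0 S^0 \\
X^0 + \Delta X \succ 0,\ S^0 + \Delta S \succ 0.
\end{array}
\]

Solving this system directly is not easy.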


Newton Method for SDP

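Linear transformation: Given an invertible matrix $L \in \mathbb{R}^{n \times n}$, let $\bar{\mathcal{A}} = (\bar A_1, \dots, \bar A_m)$, $\bar A_i = L^T A_i L$ for $i = 1, \dots, m$, and
$\bar X^0 = L^{-1} X^0 L^{-T}$, $\bar S^0 = L^T S^0 L$, $\bar C = L^T C L$.

\[
\begin{array}{l}
\bar{\mathcal{A}}\,\Delta\bar X = 0 \\
\bar{\mathcal{A}}^* d_y + \Delta\bar S = 0 \\
\Delta\bar X\, \bar S^0 + \bar X^0 \Delta\bar S = \gamma\mu^0 I - \bar X^0 \bar S^0 \\
\bar X^0 + \Delta\bar X \succ 0,\ \bar S^0 + \Delta\bar S \succ 0
\end{array}
\]

• $L = (X^0)^{1/2}$: $\bar X^0 = I \Rightarrow \bar X^0 + \Delta\bar X \succ 0,\ \forall \|\Delta\bar X\|_F < 1$ (Primal)
• $L = (S^0)^{-1/2}$: $\bar S^0 = I \Rightarrow \bar S^0 + \Delta\bar S \succ 0,\ \forall \|\Delta\bar S\|_F < 1$ (Dual)
• $LL^T = (S^0)^{-1/2}\big[(S^0)^{1/2} X^0 (S^0)^{1/2}\big]^{1/2}(S^0)^{-1/2}$: $V^0 = \bar X^0 = \bar S^0$ (Primal-dual)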


Primal-Dual Interior-Point Method for SDP

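With $LL^T = (S^0)^{-1/2}\big[(S^0)^{1/2} X^0 (S^0)^{1/2}\big]^{1/2}(S^0)^{-1/2}$:

\[
\begin{bmatrix} \bar{\mathcal{A}} & 0 & 0 \\ 0 & \bar{\mathcal{A}}^* & I \\ I & 0 & I \end{bmatrix}
\begin{bmatrix} \Delta\bar X \\ d_y \\ \Delta\bar S \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ \gamma\mu^0 (V^0)^{-1} - V^0 \end{bmatrix},
\qquad
\bar X^0 + \Delta\bar X \succ 0,\ \bar S^0 + \Delta\bar S \succ 0.
\]

One can solve

\[
\bar{\mathcal{A}} \bar{\mathcal{A}}^* d_y = -\bar{\mathcal{A}}\,(\gamma\mu^0 (V^0)^{-1} - V^0)
\]

and then solve for $\Delta\bar S$ and $\Delta\bar X$:

\[
\Delta\bar S = -\bar{\mathcal{A}}^* d_y, \qquad
\Delta\bar X = -\Delta\bar S + \gamma\mu^0 (V^0)^{-1} - V^0 .
\]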


Neighborhood of Central Path for LP

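Notice that $\bar x^0 = \bar s^0 = v^0$.

• Distance to the central path: for $u >_{\mathbb{R}^n_+} 0$,

\[
\delta(u) = \Big\| e - \frac{n}{u^T u} \Lambda_u u \Big\|_2
\]

• Neighborhoods of the central path:

\[
N_2(\beta) = \{u \mid u >_{\mathbb{R}^n_+} 0,\ \delta(u) \le \beta\}
\]

\[
N_{-\infty}(\beta) = \Big\{u \,\Big|\, u >_{\mathbb{R}^n_+} 0,\ \Lambda_u u \ge_{\mathbb{R}^n_+} (1 - \beta)\frac{u^T u}{n}\, e \Big\}
\]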


Examples: $N_2(\tfrac{1}{2})$ and $N_{-\infty}(\tfrac{1}{2})$

Figure: Neighborhood $N_2(\tfrac{1}{2})$. Figure: Neighborhood $N_{-\infty}(\tfrac{1}{2})$.


Finding Step Length for LP

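\[
\begin{pmatrix} \bar x^0 + \alpha \bar d_x \\ \bar s^0 + \alpha \bar d_s \end{pmatrix}
\xrightarrow{\ \text{scaling back}\ }
\begin{pmatrix} x^1 \\ s^1 \end{pmatrix}
\xrightarrow{\ \text{new scaling}\ }
v^1 = \bar x^1 = \bar s^1
\]

Lemma
For any $0 \le \alpha \le 1$,

\[
\mu^1 = \frac{\|v^1\|_2^2}{n} = \frac{(\bar x^0 + \alpha \bar d_x)^T(\bar s^0 + \alpha \bar d_s)}{n} = (1 - \alpha + \gamma\alpha)\mu^0 .
\]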



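Lemma
If $\delta(v^0) < 1$ and $\alpha$ satisfies $\bar x^0 + \alpha \bar d_x >_{\mathbb{R}^n_+} 0$ and $\bar s^0 + \alpha \bar d_s >_{\mathbb{R}^n_+} 0$, then

\[
(1 - \alpha + \gamma\alpha)\,\delta(v^1) \le (1 - \alpha)\,\delta(v^0) + \frac{\alpha^2}{2}\Big(\frac{\gamma^2 \delta(v^0)^2}{1 - \delta(v^0)} + n(1 - \gamma)^2\Big).
\]

Proof:

\[
\begin{aligned}
\mu^1 \delta(v^1) &= \mu^1 \Big\| e - \frac{1}{\mu^1} \Lambda_{v^1} v^1 \Big\|_2 \\
&= \big\| (1 - \alpha + \gamma\alpha)\mu^0 e - \Lambda_{(v^0 + \alpha \bar d_x)}(v^0 + \alpha \bar d_s) \big\|_2 \\
&\le \Big\| (1 - \alpha)\mu^0 \Big(e - \frac{1}{\mu^0} \Lambda_{v^0} v^0\Big) \Big\|_2 + \big\| \alpha^2 \Lambda_{\bar d_x} \bar d_s \big\|_2 \\
&\le (1 - \alpha)\mu^0 \delta(v^0) + \frac{\alpha^2}{2} \|\bar d_x + \bar d_s\|_2^2 \\
&= (1 - \alpha)\mu^0 \delta(v^0) + \frac{\alpha^2}{2}\Big(\gamma^2 \|\mu^0 \Lambda_{v^0}^{-1} e - v^0\|_2^2 + (1 - \gamma)^2 n \mu^0\Big) \\
&\le (1 - \alpha)\mu^0 \delta(v^0) + \frac{\alpha^2}{2}\Big(\frac{\mu^0 \gamma^2 \delta(v^0)^2}{1 - \delta(v^0)} + (1 - \gamma)^2 n \mu^0\Big).
\end{aligned}
\]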



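Lemma
If $v^0 \in N_2(\beta)$ with $\beta = \tfrac{1}{2}$, $\gamma = \frac{1}{1 + 1/\sqrt{2n}}$ and $\alpha = 1$, then

(i) $v^1 \in N_2(\beta)$

(ii) $x^1 \bullet s^1 = \bar x^1 \bullet \bar s^1 = \|v^1\|_2^2 = n\gamma\mu^0$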
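The choice of $\gamma$ makes the bound of the previous lemma close up exactly. The following worked computation is a sketch of the bound on $\delta(v^1)$ only; it takes the previous lemma as given and does not show that the full step keeps the iterate strictly positive.

```latex
% Set \alpha = 1 and write \delta_0 = \delta(v^0) \le 1/2, t = 1/\sqrt{2n}, \gamma = 1/(1+t).
\[
\gamma\,\delta(v^1) \;\le\; \frac{1}{2}\Big(\frac{\gamma^2 \delta_0^2}{1-\delta_0} + n(1-\gamma)^2\Big).
\]
% Since \delta \mapsto \delta^2/(1-\delta) is increasing on [0,1), \delta_0 \le 1/2 gives
\[
\frac{\delta_0^2}{1-\delta_0} \le \frac{1}{2},
\qquad
1-\gamma = \frac{t}{1+t} = t\gamma
\;\Longrightarrow\;
n(1-\gamma)^2 = n t^2 \gamma^2 = \frac{\gamma^2}{2}.
\]
% Combining the two estimates:
\[
\gamma\,\delta(v^1) \le \frac{1}{2}\Big(\frac{\gamma^2}{2} + \frac{\gamma^2}{2}\Big) = \frac{\gamma^2}{2}
\;\Longrightarrow\;
\delta(v^1) \le \frac{\gamma}{2} < \frac{1}{2} = \beta .
\]
```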


Path Following Algorithm for LP

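Step 1: (Initialization) Choose $\varepsilon > 0$ and $(x^0, y^0, s^0)$ with $v^0 \in N_2(\beta)$, where $\beta = \tfrac{1}{2}$.
Set $k = 0$, $\gamma = \frac{1}{1 + 1/\sqrt{2n}}$, and $\alpha = 1$.

Step 2: Solve the Newton system introduced above and get $(d_x, d_y, d_s)$.
Set

\[
x^{k+1} = x^k + \alpha d_x, \qquad
y^{k+1} = y^k + \alpha d_y, \qquad
s^{k+1} = s^k + \alpha d_s,
\]

with $v^{k+1} = \Lambda_{x^{k+1}}^{1/2} \Lambda_{s^{k+1}}^{1/2} e$.
Set $k = k + 1$.

Step 3: If $x^k \bullet s^k < \varepsilon$, stop. Otherwise, go to Step 2.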
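The three steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the lecture's implementation: it solves the Newton system in the original (unscaled) variables through the normal equations $(A X S^{-1} A^T)\, d_y = -A S^{-1}(\gamma\mu e - XSe)$, which is an algebraically equivalent rearrangement of the linearized system, and it runs on a hypothetical one-constraint LP (Min $x_1 + 2x_2 + 3x_3$ s.t. $x_1 + x_2 + x_3 = 11$, $x \ge 0$) whose starting point lies exactly on the central path with $\mu = 6$. The names `solve` and `short_step_lp` are illustrative.

```python
import math

def solve(M, rhs):
    """Solve the square system M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z

def short_step_lp(A, x, y, s, eps=1e-6, max_iter=10000):
    """Full-step (alpha = 1) path following from a strictly feasible, well-centered start."""
    m, n = len(A), len(x)
    gamma = 1.0 / (1.0 + 1.0 / math.sqrt(2.0 * n))
    for k in range(max_iter):
        gap = sum(xi * si for xi, si in zip(x, s))
        if gap < eps:                                   # Step 3: stopping test
            return x, y, s, k
        mu = gap / n
        r = [gamma * mu - x[j] * s[j] for j in range(n)]  # gamma*mu*e - XSe
        # Normal equations: (A X S^{-1} A^T) dy = -A S^{-1} r
        M = [[sum(A[i][j] * x[j] / s[j] * A[l][j] for j in range(n))
              for l in range(m)] for i in range(m)]
        rhs = [-sum(A[i][j] * r[j] / s[j] for j in range(n)) for i in range(m)]
        dy = solve(M, rhs)
        ds = [-sum(A[i][j] * dy[i] for i in range(m)) for j in range(n)]  # ds = -A^T dy
        dx = [(r[j] - x[j] * ds[j]) / s[j] for j in range(n)]             # from S dx + X ds = r
        x = [x[j] + dx[j] for j in range(n)]
        y = [y[i] + dy[i] for i in range(m)]
        s = [s[j] + ds[j] for j in range(n)]
    raise RuntimeError("iteration limit reached")

# Hypothetical test problem; b = 11 and c = (1, 2, 3) are implied by the starting point.
A = [[1.0, 1.0, 1.0]]
x_opt, y_opt, s_opt, iters = short_step_lp(A, [6.0, 3.0, 2.0], [0.0], [1.0, 2.0, 3.0])
print(x_opt, y_opt, iters)
```

On this instance the iterates approach the primal solution $(11, 0, 0)$ and the dual solution $y = 1$, and the duality gap shrinks by the factor $\gamma$ at every step.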


Complexity for LP

Theorem
Given the above settings, we have

(i) $v^k \in N_2(\beta)$, $k = 0, 1, 2, \dots$.

(ii) The algorithm stops in

\[
O\Big(\sqrt{n}\,\log\frac{x^0 \bullet s^0}{\varepsilon}\Big)
\]

steps and outputs a primal-dual solution satisfying

\[
x^k \bullet s^k < \varepsilon .
\]
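The step count can be recovered from the step-length lemma. A short worked estimate, assuming full steps ($\alpha = 1$) so that the gap contracts by $\gamma$ at every iteration; the explicit constant $2\sqrt{2}$ is one sufficient choice, not a sharp one.

```latex
\[
x^k \bullet s^k = \gamma^k\,(x^0 \bullet s^0) < \varepsilon
\iff
k > \frac{\log\big((x^0 \bullet s^0)/\varepsilon\big)}{\log(1/\gamma)}
  = \frac{\log\big((x^0 \bullet s^0)/\varepsilon\big)}{\log\big(1 + 1/\sqrt{2n}\big)} .
\]
% For 0 \le t \le 1 one has \log(1+t) \ge t/2, so with t = 1/\sqrt{2n} it suffices that
\[
k \;\ge\; 2\sqrt{2n}\,\log\frac{x^0 \bullet s^0}{\varepsilon}
  \;=\; O\Big(\sqrt{n}\,\log\frac{x^0 \bullet s^0}{\varepsilon}\Big).
\]
```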


Neighborhood of Central Path for SDP

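Notice that $\bar X^0 = \bar S^0 = V^0$.

• Distance to the central path: for $U \in \mathcal{S}^n_+$ with $U \succ 0$,

\[
\delta(U) = \Big\| I - \frac{n}{I \bullet U^2}\, U^2 \Big\|_F, \quad \text{with } U^2 = UU
\]

• Neighborhoods of the central path:

\[
N_2(\beta) = \{U \mid U \succ 0,\ \delta(U) \le \beta\}
\]

\[
N_{-\infty}(\beta) = \Big\{U \,\Big|\, U \succ 0,\ U^2 \succeq (1 - \beta)\frac{I \bullet U^2}{n}\, I \Big\}
\]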


Finding Step Length for SDP

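\[
\begin{pmatrix} \bar X^0 + \alpha \Delta\bar X \\ \bar S^0 + \alpha \Delta\bar S \end{pmatrix}
\xrightarrow{\ \text{scaling back}\ }
\begin{pmatrix} X^1 \\ S^1 \end{pmatrix}
\xrightarrow{\ \text{new scaling}\ }
V^1 = \bar X^1 = \bar S^1
\]

Lemma
For any $0 \le \alpha \le 1$,

\[
\mu^1 = \frac{\|V^1\|_F^2}{n} = \frac{\mathrm{tr}\big[(\bar X^0 + \alpha \Delta\bar X)(\bar S^0 + \alpha \Delta\bar S)\big]}{n} = (1 - \alpha + \gamma\alpha)\mu^0 .
\]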



Lemma
For any square matrix $U$, we have

\[
\mathrm{tr}(U^2) = \Big\| \frac{U + U^T}{2} \Big\|_F^2 - \Big\| \frac{U - U^T}{2} \Big\|_F^2 \le \Big\| \frac{U + U^T}{2} \Big\|_F^2 .
\]

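Lemma
Suppose $\delta(V^0) < 1$ and $\alpha \ge 0$ satisfies $\bar X^0 + \alpha \Delta\bar X \succ 0$ and $\bar S^0 + \alpha \Delta\bar S \succ 0$. Let

\[
W = \frac{(\bar X^0 + \alpha \Delta\bar X)(\bar S^0 + \alpha \Delta\bar S) + \big((\bar X^0 + \alpha \Delta\bar X)(\bar S^0 + \alpha \Delta\bar S)\big)^T}{2},
\]

then

\[
W = (1 - \alpha)(V^0)^2 + \alpha\gamma\mu^0 I + \alpha^2\, \frac{\Delta\bar X \Delta\bar S + \Delta\bar S \Delta\bar X}{2}
\]

and

\[
\delta(V^1)^2 \le \Big\| I - \frac{1}{\mu^1} W \Big\|_F^2 .
\]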



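Lemma
Suppose $\delta(V^0) < 1$ and $\alpha \ge 0$ satisfies $\bar X^0 + \alpha \Delta\bar X \succ 0$ and $\bar S^0 + \alpha \Delta\bar S \succ 0$. Then

\[
(1 - \alpha + \gamma\alpha)\,\delta(V^1) \le (1 - \alpha)\,\delta(V^0) + \frac{\alpha^2}{2}\Big(\frac{\gamma^2 \delta(V^0)^2}{1 - \delta(V^0)} + n(1 - \gamma)^2\Big).
\]

Proof:

\[
\begin{aligned}
\mu^1 \delta(V^1) &\le (1 - \alpha)\mu^0 \delta(V^0) + \alpha^2 \Big\| \frac{\Delta\bar X \Delta\bar S + \Delta\bar S \Delta\bar X}{2} \Big\|_F \\
&\le (1 - \alpha)\mu^0 \delta(V^0) + \frac{\alpha^2}{2} \|\Delta\bar X + \Delta\bar S\|_F^2 \\
&= (1 - \alpha)\mu^0 \delta(V^0) + \frac{\alpha^2}{2}\Big(\gamma^2 \|\mu^0 (V^0)^{-1} - V^0\|_F^2 + (1 - \gamma)^2 n \mu^0\Big) \\
&\le (1 - \alpha)\mu^0 \delta(V^0) + \frac{\alpha^2 \mu^0}{2}\Big(\frac{\gamma^2 \delta(V^0)^2}{1 - \delta(V^0)} + n(1 - \gamma)^2\Big).
\end{aligned}
\]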



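Lemma
If $V^0 \in N_2(\beta)$ with $\beta = \tfrac{1}{2}$, $\gamma = \frac{1}{1 + 1/\sqrt{2n}}$ and $\alpha = 1$, then

(i) $V^1 \in N_2(\beta)$

(ii) $X^1 \bullet S^1 = \bar X^1 \bullet \bar S^1 = \|V^1\|_F^2 = n\gamma\mu^0$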


Path Following Algorithm for SDP

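Step 1: (Initialization) Choose $\varepsilon > 0$ and $(X^0, y^0, S^0)$ with $V^0 \in N_2(\beta)$, where $\beta = \tfrac{1}{2}$.
Set $k = 0$, $\gamma = \frac{1}{1 + 1/\sqrt{2n}}$, and $\alpha = 1$.

Step 2: Solve the equation system introduced above and get $(\Delta X, d_y, \Delta S)$.
Set

\[
X^{k+1} = X^k + \alpha \Delta X, \qquad
y^{k+1} = y^k + \alpha d_y, \qquad
S^{k+1} = S^k + \alpha \Delta S,
\]

with $V^{k+1} = \bar X^{k+1} = \bar S^{k+1}$.
Set $k = k + 1$.

Step 3: If $X^k \bullet S^k < \varepsilon$, stop. Otherwise, go to Step 2.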


Complexity

Theorem
Given the above settings, we have

(i) $V^k \in N_2(\beta)$, $k = 0, 1, 2, \dots$.

(ii) The algorithm stops in

\[
O\Big(\sqrt{n}\,\log\frac{X^0 \bullet S^0}{\varepsilon}\Big)
\]

steps and outputs a primal-dual solution satisfying

\[
X^k \bullet S^k < \varepsilon .
\]


Example: Path Following Algorithm

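\[
\begin{array}{ll}
\text{Min} & x_1 + x_2 \\
\text{s.t.} & x_1 + x_2 \le 3, \\
& x_1 - x_2 \le 1, \\
& x_2 \le 2, \\
& x_1 \ge 0,\ x_2 \ge 0.
\end{array}
\]

Figure: Path following algorithm with $\beta = 1/2$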


Initialization and Improve the Performance

Initialization

• Big-M Method
• Two-Phase Method
• Self-Dual Embedding Method

Different Path-Following Methods

• Short Step Algorithm
• Long Step Algorithm
• Predictor-Corrector Algorithm
• Largest Step Algorithm

Reference: Handbook of Semidefinite Programming: Theory, Algorithms, and Applications, edited by Wolkowicz H., Saigal R. and Vandenberghe L., Kluwer Academic Publisher: Norwell, MA USA 2000.


VI. Recent Research Directions


Quadratically Constrained Quadratic Programming Problem (QCQP)

\[
\begin{array}{ll}
\text{Min} & x^T Q_0 x + 2 b_0^T x + c_0 \\
\text{s.t.} & x^T Q_i x + 2 b_i^T x + c_i \le 0, \quad i = 1, \dots, m_1, \\
& x^T Q_i x + 2 b_i^T x + c_i = 0, \quad i = m_1 + 1, \dots, m_1 + m_2,
\end{array}
\]

where $Q_i$ is an $n \times n$ real symmetric matrix, $b_i$ is an $n$-dimensional real vector and $c_i$ is a real number, for each $i = 0, 1, \dots, m_1 + m_2$.

Example


Motivation — Why Study QCQP

Quadratically Constrained Quadratic Programming (QCQP)

• Connection between linear programming and nonlinear programming

• Nonlinearity and nonconvexity

• Second order information for approximation.


Motivation — Why are QCQP problems important?

• QCQP problems can be used as subroutines in other methods for more general optimization problems.

  – Trust region method for nonlinear analysis
  – Least squares method for statistics

• Help understand the difference between convex and nonconvex problems in terms of the difficulties involved in solving these problems.

• Help develop a deeper understanding of the complicated structure of quadratic optimization.

• Build more useful tools by exploring the polynomial-time solvable subclasses.


Motivation — Why are QCQP problems important?

• Wide Applications
  • Portfolio Optimization Problem
  • Knapsack Problem
  • Location-allocation Problem
  • Information Network Security
  • Combinatorial Problem
  • . . .

• Generalizations of known optimization problems
  • 0-1 Constrained Quadratic Programming Problem
  • Box Constrained Quadratic Programming Problem
  • Mixed-integer Constrained Quadratic Programming Problem
  • Standard Quadratic Programming Problem
  • Second-order Cone Constrained Quadratic Programming Problem
  • . . .


Application I: Portfolio Optimization Problem

What is the best portfolio selection for your investment?

Invest instead of saving in your pocket!


Application I: Portfolio Optimization Problem

The classical mean-variance (MV) model developed by Markowitz (1952) uses the mean and variance of the portfolio to measure the expected value and risk of the selection.

Let $x$ be the vector of weights invested in $n$ securities.

Let $\xi$ be the random vector of returns of the $n$ risky assets, with $\mu_i = E(\xi_i)$, $i = 1, \dots, n$, and $\sigma^2(\xi^T x) = x^T Q x$.

\[
\begin{array}{ll}
\text{min} & x^T Q x \\
\text{s.t.} & \mu^T x \ge \rho, \\
& Ax \le b,
\end{array}
\tag{MV}
\]

where $\rho$ is a prescribed return level and $Ax \le b$ represents some real-world trading conditions.


Application I: Portfolio Optimization Problem with Hard Constraints

• Cardinality constraint: the number of assets in the optimal portfolio could be limited,

\[
|\mathrm{supp}(x)| \le K,
\]

where $\mathrm{supp}(x) = \{i \mid x_i \ne 0\}$, $1 \le K \ll n$.
The need to account for this limit is due to transaction costs and managerial concerns.

• Minimum buy-in threshold:

\[
\alpha_i \le x_i \le \beta_i,\ i \in \mathrm{supp}(x) \quad \text{or} \quad x_i \in \{0\} \cup [\alpha_i, \beta_i],\ i = 1, \dots, n.
\]

Difficulty: testing the feasibility of the domain defined by the new constraints is already NP-complete when $A$ has three rows; see Bienstock (1996).


Application I: Reformulation of Portfolio Optimization Problem as QCQP

• The cardinality constraint can be represented by

\[
e^T y \le K, \quad 0 \le x_i \le \beta_i y_i,\ i = 1, \dots, n, \quad y \in \{0, 1\}^n.
\]

• The minimum buy-in threshold can be expressed as

\[
\alpha_i y_i \le x_i \le \beta_i y_i,\ i = 1, \dots, n, \quad y \in \{0, 1\}^n.
\]

• Mixed-integer constrained quadratic programming (MIQP) problem:

\[
\begin{array}{ll}
\text{min} & x^T Q x \\
\text{s.t.} & Ax \le b, \quad e^T y \le K, \\
& y_i^2 - y_i = 0 \ (y_i \in \{0, 1\}), \quad i = 1, \dots, n, \\
& \alpha_i y_i \le x_i \le \beta_i y_i, \quad i = 1, \dots, n.
\end{array}
\tag{CMV}
\]


Application II: Quadratic Knapsack Problem

Which items will you pick for your weight-limited bag?


Application II: Quadratic Knapsack Problem

• Given $n$ items, where item $j$ has a positive integer weight $w_j$.

• Given an $n \times n$ nonnegative integer matrix $Q$, where $Q_{ii}$ is the profit achieved if item $i$ is selected and $Q_{ij} = Q_{ji}$ is the profit achieved if both items $i$ and $j$ are selected.

• The quadratic knapsack problem selects an item subset whose overall weight does not exceed a given knapsack capacity $c$, so as to maximize the overall profit.


Application II: Quadratic Knapsack Problem as QCQP

Let $w^T = (w_1, w_2, \dots, w_n)$. By introducing binary variables, the problem can be reformulated as

\[
\begin{array}{ll}
\text{max} & x^T Q x \\
\text{s.t.} & w^T x \le c, \\
& x_i \in \{0, 1\} \ (x_i^2 - x_i = 0), \quad i = 1, \dots, n.
\end{array}
\tag{QKP}
\]

Difficulty: the quadratic knapsack problem is NP-Hard; see Gallo et al. (1980).

Gallo, G., Hammer, P. and Simeone, B., 1980. Quadratic knapsack problems. Mathematical Programming Study, 12, 132-149.


Application III: Location-Allocation Problem

Where to locate these factories?


Application III: Location-Allocation problem

• Given the flow $f_{ij}$ between facilities $i$ and $j$, and the distance $d_{kp}$ between locations $k$ and $p$, for $i, j, k, p = 1, \dots, n$.

• Assign facilities to locations in such a way that each facility is designated to exactly one location and vice versa.

• The location-allocation problem aims to find a minimum cost allocation of facilities to locations, taking the cost as the sum of all possible distance-flow products.

• The location-allocation problem is equivalent to the quadratic assignment problem.


Application III: Location-Allocation Problem as QCQP

\[
\begin{array}{ll}
\text{min} & \sum_{i,j=1}^{n} \sum_{k,p=1}^{n} f_{ij}\, d_{kp}\, x_{ik}\, x_{jp} \\
\text{s.t.} & \sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, \dots, n, \\
& \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, \dots, n, \\
& x_{ij} \in \{0, 1\} \ (x_{ij}^2 - x_{ij} = 0), \quad i, j = 1, \dots, n.
\end{array}
\tag{QLAP}
\]

Difficulty: the quadratic assignment problem is NP-Hard; see Sahni and Gonzalez (1976).

Sahni, S. and Gonzalez, T., 1976. P-complete approximation problems. Journal of the Association for Computing Machinery, 23, 555-565.


Application IV: Information Network Security

For a hacker, what is the biggest damage to the information flow by destroying some edges in the network?

The problem of information network security is equivalent to the max-cut problem.


Application IV: Information Network Security as QCQP

• Given the weight $w_{ij} = w_{ji}$ for the edge between nodes $i$ and $j$.
• Introduce binary variables $x_i \in \{-1, 1\}$, $i = 1, \dots, n$, to indicate the partition.

\[
\begin{array}{ll}
\text{max} & \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, \dfrac{1 - x_i x_j}{2} \\
\text{s.t.} & x_i \in \{-1, 1\} \ (x_i^2 = 1), \quad i = 1, \dots, n.
\end{array}
\tag{MC}
\]

• Difficulty: the max-cut problem is NP-Complete; see Karp (1972).

• If $w_{ij} \ge 0$, $\forall i \ne j$, then the expected value of the randomized algorithm based on Semidefinite Programming is at least $\alpha \approx 0.878$ times the value of the maximum cut; see Goemans and Williamson (1995).

Goemans, M.X. and Williamson, D.P., 1995. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of ACM, 42, 1116-1145.


How Difficult are QCQP Problems?

Difficulty: Pardalos and Vavasis (1991) have proved that the quadratic programming problem is NP-Hard. Therefore, QCQP problems and their extensions are all NP-Hard problems.

Reference: Pardalos, P.M. and Vavasis, S.A., 1991. Quadratic programming with one negative eigenvalue is NP-Hard. Journal of Global Optimization, 1, 15-22.


Current Research Directions of QCQP Problems

• Find sufficient global optimality conditions.

• Characterize the structures of QCQP problems with few constraints or special constraints.

• Identify polynomial-time solvable subclasses of QCQP problems.

• Develop approximations for some difficult QCQP problems.


How can we deal with QCQP?

• It is difficult to solve QCQP problems directly.

• New reformulations and tools are needed for QCQP problems.

We need a "magic stick"!


New Tool for QCQP: Linear Conic Programming

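\[
\begin{array}{ll}
\text{(LP)} & \\
\text{Min} & c^T x \\
\text{s.t.} & (a^i)^T x = b_i, \quad i = 1, \dots, m, \\
& x \ge 0 \quad (x \in \mathbb{R}^n_+).
\end{array}
\qquad
\begin{array}{ll}
\text{(LCoP)} & \\
\text{Min} & C \bullet X \\
\text{s.t.} & A_i \bullet X = b_i, \quad i = 1, \dots, m, \\
& X \in K \quad (K \text{ is a cone}).
\end{array}
\]

$K$ is a closed, convex cone; $b_i \in \mathbb{R}$, and $C$, $A_i$ are in the space of interest, with "•" being an appropriate linear operator.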


Special Cases of Linear Conic Programming

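• Linear Programming (LP): $K = \mathbb{R}^n_+$.

• Second-order Cone Programming (SOCP): $K = L^n = \{x \in \mathbb{R}^n \mid \sqrt{x_1^2 + \dots + x_{n-1}^2} \le x_n\}$.

• Semidefinite Programming (SDP): $K = \mathcal{S}^n_+ = \{M \in \mathcal{S}^n \mid x^T M x \ge 0,\ \forall x \in \mathbb{R}^n\}$.

• Copositive Programming (CoP): $K = \mathcal{C}_n = \{M \in \mathcal{S}^n \mid x^T M x \ge 0,\ \forall x \in \mathbb{R}^n_+\}$.

Figure: $\mathbb{R}^3_+$. Figure: $L^2$. Figure: $\mathcal{S}^2_+$.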


Refreshment: Cone of Nonnegative Functions

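• Nonnegative quadratic functions over a given $F \subseteq \mathbb{R}^n$:

\[
f(x) = x^T A x + 2 b^T x + c \ge 0,\ \forall x \in F,
\qquad
f \Leftrightarrow \begin{bmatrix} c & b^T \\ b & A \end{bmatrix}
\]

• $\mathcal{D}_F = \left\{ \begin{bmatrix} c & b^T \\ b & A \end{bmatrix} \in \mathcal{S}^{n+1} \,\middle|\, \begin{bmatrix} c & b^T \\ b & A \end{bmatrix} \bullet \begin{bmatrix} 1 & x^T \\ x & xx^T \end{bmatrix} \ge 0,\ \forall x \in F \right\}$ is a cone.

• $\mathcal{D}_F^* = \mathrm{cl\,Cone}\left\{ \begin{bmatrix} 1 & x^T \\ x & xx^T \end{bmatrix} \in \mathcal{S}^{n+1} \,\middle|\, x \in F \right\}$ is a closed convex cone.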

Refreshment: Cone of Nonnegative Functions

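• $F = \mathbb{R}^n \Longrightarrow \mathcal{D}_F = \mathcal{S}^{n+1}_+$ (Positive Semidefinite Cone), $\mathcal{D}_F^* = \mathcal{S}^{n+1}_+$.

• $F = \mathbb{R}^n_+ \Longrightarrow \mathcal{D}_F = \mathcal{C}_{n+1}$ (Copositive Cone), $\mathcal{D}_F^* = \mathcal{C}^*_{n+1}$ (Completely Positive Cone).

• Larger $F$ implies smaller $\mathcal{D}_F$; in particular,

\[
\mathcal{C}^*_{n+1} \subseteq (\mathcal{S}^{n+1}_+ \cap \mathcal{N}^{n+1}_+) \subseteq (\mathcal{S}^{n+1}_+)^* = \mathcal{S}^{n+1}_+ \subseteq (\mathcal{S}^{n+1}_+ + \mathcal{N}^{n+1}_+) \subseteq \mathcal{C}_{n+1},
\]

and

\[
\mathcal{D}_F^* \subseteq \mathcal{S}^{n+1}_+ \subseteq \mathcal{D}_F, \quad \forall F \subseteq \mathbb{R}^n.
\]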


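Refreshment: $K = \mathcal{D}_F$ (Cone of Nonnegative Quadratic Functions)

When $K = \mathcal{D}_F$, LCoP becomes CoP.

\[
\begin{array}{ll}
\text{Min} & C \bullet X \\
\text{s.t.} & A_i \bullet X = b_i, \quad i = 1, \dots, m, \\
& X \in \mathcal{D}_F,
\end{array}
\tag{CoP}
\]

where $C, A_1, \dots, A_m$ are given $n \times n$ symmetric matrices and $b_1, \dots, b_m$ are given scalars.

Dual of CoP:

\[
\begin{array}{ll}
\text{Max} & b^T y \\
\text{s.t.} & \sum_{i=1}^{m} y_i A_i + S = C, \\
& S \in \mathcal{D}_F^*.
\end{array}
\tag{CoD}
\]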



Copositive Cone and Completely Positive Cone

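\[
\mathcal{D}^*_{\mathbb{R}^n_+} = \mathcal{C}^*_{n+1} \subseteq \mathcal{S}^{n+1}_+ \subseteq \mathcal{C}_{n+1} = \mathcal{D}_{\mathbb{R}^n_+}.
\]

Figure: Copositive Cone $\mathcal{C}_2$ (axes $x$, $y$, $z$). Figure: Completely Positive Cone $\mathcal{C}^*_2$.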


Nonconvex Quadratic Programming and Linear Conic Programming

Consider the following nonconvex quadratic programming problem (NQP):

\[
\begin{array}{ll}
\text{Min} & x^T P^0 x + 2 (q^0)^T x + \gamma^0 \\
\text{s.t.} & x \in F,
\end{array}
\tag{NQP}
\]

where $\emptyset \ne F \subseteq \mathbb{R}^n$ is a possibly nonconvex domain. It is equivalent to the linear conic programming problem (MP) defined as

\[
\begin{array}{ll}
\text{Min} & \begin{bmatrix} \gamma^0 & (q^0)^T \\ q^0 & P^0 \end{bmatrix} \bullet X \\
\text{s.t.} & X_{11} = 1, \quad X \in \mathcal{D}_F^*.
\end{array}
\tag{MP}
\]

Problem (MP) is still NP-hard in general because decomposing an optimal solution of problem (MP) to find a solution of problem (NQP) may not be polynomial-time [Sturm2003].


Conic Duality Theorems

Weak Conic Duality Theorem [Ben-Tal2001]
Assume problems (LCoP) and (LCoD) are both feasible. Then, the optimal value of problem (LCoD) is a lower bound for the optimal value of problem (LCoP).

Strong Conic Duality Theorem [Ben-Tal2001]

a. If problem (LCoP) is bounded below and strictly feasible, then problem (LCoD) is feasible, an optimal solution is attainable for problem (LCoD), and the optimal values of problems (LCoP) and (LCoD) are equal.

b. If problem (LCoD) is bounded above and strictly feasible, then problem (LCoP) is feasible, an optimal solution is attainable for problem (LCoP), and the optimal values of problems (LCoP) and (LCoD) are equal.


Computational Complexity

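• (LCoP) has polynomial-time interior-point algorithms when

  (i) $K = \mathbb{R}^n_+$  (ii) $K = L^n$  (iii) $K = \mathcal{S}^n_+$  (iv) $K = \mathcal{S}^n_+ + \mathcal{N}^n_+$.

• When $K = \mathcal{C}_{n+1}$, (LCoP) becomes NP-Hard.

• The boundary between P and NP-Hard lies somewhere between $(\mathcal{S}^n_+ + \mathcal{N}^n_+)$ and $\mathcal{C}_{n+1}$.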


Polynomial-time Approximation

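• Since $\mathcal{D}^*_{\mathbb{R}^n} = \mathcal{D}_{\mathbb{R}^n} = \mathcal{S}^{n+1}_+$ and $\mathcal{D}_F^* \subseteq \mathcal{S}^{n+1}_+ \subseteq \mathcal{D}_F$, $\forall F \subseteq \mathbb{R}^n$,

  the (SDP) relaxation always provides a lower bound for (CoP) in polynomial time.

  the (SDD) relaxation always provides an upper bound for (CoD) in polynomial time.

• Replacing $\mathcal{S}^{n+1}_+$ by $(\mathcal{S}^n_+ + \mathcal{N}^n_+)$ may provide a better lower bound for (CoP) in polynomial time.

• The same scheme works for (NQP) and (MP).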


Rank-one Decomposition

For any $X$ in $\mathcal{S}^n_+$, $X$ has a rank-one decomposition, that is,

\[
X = \sum_{i=1}^{r} x^i (x^i)^T,
\]

where $r \in \mathbb{N}$ is the rank of $X$ and $x^i \in \mathbb{R}^n$ for $i = 1, \dots, r$ (ref. [Horn1990]).

Ye and Zhang [Ye2003]
Let $Y$ be a given symmetric matrix in $\mathcal{S}^n$ and $X$ be a positive semidefinite matrix with rank $0 < r \le n$. Suppose that $X \bullet Y \le 0$. Then a rank-one decomposition of $X$ can be computed in polynomial time, giving $x^i \in \mathbb{R}^n$, $i = 1, \dots, r$, such that

\[
X = \sum_{i=1}^{r} x^i (x^i)^T, \qquad (x^i)^T Y x^i \le 0 .
\]


Linear Matrix Inequality

A linear matrix inequality (LMI) is an expression of the form

\[
A_0 + y_1 A_1 + \cdots + y_m A_m \succeq 0,
\]

where $A_0, \dots, A_m$ are given $n \times n$ symmetric matrices, $y = (y_1, \dots, y_m)$ is a vector of real variables, and $B \succeq 0$ means $B$ is a positive semidefinite matrix.

• The form of an LMI is very general. It includes linear inequalities, convex quadratic inequalities, etc.

\[
(x - x_c)^T Q^{-1} (x - x_c) \le 1,\ Q \in \mathcal{S}^n_{++}
\iff
\begin{bmatrix} 1 & (x - x_c)^T \\ (x - x_c) & Q \end{bmatrix} \succeq 0
\]

• SDP with additional LMI constraints can be solved efficiently [Gahinet1993]. ⇒ Improve the bounds generated by SDP relaxations.


Reformulation-Linearization Technique (RLT)

RLT generates LMIs for SDP relaxations.

RLT originated in 1986 [Adams1986] and has been applied to 0-1 and mixed 0-1 linear and polynomial programming problems [Sherali1990, Sherali1994], and to continuous, nonconvex polynomial programming problems [Sherali1995, Sherali2001].

RLT involves two steps:
1. Reformulation Step: additional valid nonlinear inequalities are generated.
2. Linearization Step: each product term in the valid nonlinear inequalities is replaced by a single continuous variable.


Completely Positive Programming (CPP) Problem

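\[
\begin{array}{ll}
\text{Min} & f(X) = C \bullet X \\
\text{s.t.} & A_i \bullet X = b_i, \quad i = 1, 2, \dots, m, \\
& X \in \mathcal{C}^*_n,
\end{array}
\tag{CPP}
\]

where $C \in \mathcal{S}^n$, $A_i \in \mathcal{S}^n$, $b_i \in \mathbb{R}$, $i = 1, 2, \dots, m$, and $\mathcal{C}^*_n$ is the completely positive cone of order $n$.

• A quadratic programming problem with linear and binary constraints can be written as a completely positive programming (CPP) problem [Burer2009].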


Literature Review

Quadratic programming problem with linear and binary constraints ⇒ Completely positive programming (CPP) problem [Burer2009].

• Box constrained quadratic programming problem.
• Standard quadratic programming problem.
• Maximum clique problem.
• Binary constrained quadratic programming problem.
• Mixed integer quadratic programming problem.


Literature Review

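• The study of copositivity and complete positivity can be traced back to 1965 [Motzkin1965].

• Structure of $\mathcal{C}_n$ and $\mathcal{C}^*_n$
  • Checking copositivity and complete positivity: co-NP-complete [Murty1987].
  • Matrices with special structures (polynomial-time solvable): tridiagonal and acyclic [Bomze2000, Ikramov2002].
  • $\mathcal{C}^*_n \subseteq (\mathcal{S}^n_+ \cap \mathcal{N}^n_+) \subset (\mathcal{S}^n_+ + \mathcal{N}^n_+) \subseteq \mathcal{C}_n$:

    It was shown in [Maxfield1963] that

\[
\mathcal{C}^*_n = (\mathcal{S}^n_+ \cap \mathcal{N}^n_+) \subset (\mathcal{S}^n_+ + \mathcal{N}^n_+) = \mathcal{C}_n \quad (n \le 4),
\]

    and

\[
\mathcal{C}^*_n \subset (\mathcal{S}^n_+ \cap \mathcal{N}^n_+) \subset (\mathcal{S}^n_+ + \mathcal{N}^n_+) \subset \mathcal{C}_n \quad (n \ge 5).
\]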


Literature Review

Algorithms of CPP problem

• Global optimization techniques: [Bundfuss and Dür 2009].
• Difference-of-convex (d.c.) decompositions: [Bomze and Eichfelder 2010].
• Hierarchy on cones: [Parrilo2000], [Bomze and Klerk 2002] and [Peña et al. 2007].


New Results Obtained for The CPP Problem

• Computable representation of the cone of nonnegative quadratic forms over a general nontrivial second-order cone by linear matrix inequalities (LMIs).

• Approximation to the underlying cone of completely positive matrices.

• Adaptive algorithm with "reformulation-linearization technique" (RLT) constraints.


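Cone of Nonnegative Quadratic Forms: A Special Case of Cone of Nonnegative Quadratic Functions

Given a nonempty set $F \subseteq \mathbb{R}^n$, the cone of nonnegative quadratic forms over $F$ [Sturm2003] is

\[
\mathcal{C}_F = \{M \in \mathcal{S}^n \mid x^T M x \ge 0 \text{ for all } x \in F\}.
\]

Its dual cone is

\[
\mathcal{C}_F^* = \mathrm{cl\,Cone}\{xx^T \in \mathcal{S}^n \mid x \in F\}.
\]

Example:

\[
\mathcal{C}_n = \{M \in \mathcal{S}^n \mid x^T M x \ge 0 \text{ for all } x \in \mathbb{R}^n_+\},
\qquad
\mathcal{C}^*_n = \mathrm{cl\,Cone}\{xx^T \in \mathcal{S}^n \mid x \in \mathbb{R}^n_+\}.
\]

Lemma: $F_1 \subseteq F_2 \Rightarrow \mathcal{C}_{F_1} \supseteq \mathcal{C}_{F_2}$ and $\mathcal{C}^*_{F_1} \subseteq \mathcal{C}^*_{F_2}$.

Motivation: Use the smallest set $F$ covering $\mathbb{R}^n_+$ such that $\mathcal{C}_F^*$ is computable.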


Approximating CPP Problem

• Using the second-order cones to cover the first orthant

Figure : Approximation based on SOCs


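Cone of Nonnegative Quadratic Functions over Second-order Cone

Consider the cone of nonnegative quadratic forms over a nontrivial second-order cone $F_{\mathrm{SOC}} = \{x \in \mathbb{R}^n \mid \sqrt{x^T Q x} \le f^T x\}$, where $Q \in \mathcal{S}^n_{++}$ and $f \in \mathbb{R}^n$.

• Theorem A. A matrix $M \in \mathcal{S}^n$ satisfies $M \in \mathcal{C}_{F_{\mathrm{SOC}}}$ if and only if there exists a $\lambda \ge 0$ such that

\[
M + \lambda (Q - ff^T) \in \mathcal{S}^n_+ .
\]

• Theorem B. A matrix $X \in \mathcal{S}^n$ satisfies $X \in \mathcal{C}^*_{F_{\mathrm{SOC}}}$ if and only if $X$ satisfies

\[
(Q - ff^T) \bullet X \le 0, \qquad X \in \mathcal{S}^n_+ .
\]

Theorems A and B lead to LMI representations of $\mathcal{C}_{F_{\mathrm{SOC}}}$ and $\mathcal{C}^*_{F_{\mathrm{SOC}}}$.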


Second-order Cone Covering a Simplex

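• Let $V$ be a simplex generated by a set of vertices $\{v_1, v_2, \dots, v_n\}$, with $v_i \in \mathbb{R}^n$, $i = 1, 2, \dots, n$, being linearly independent.

• The system of linear equations $v_i^T y = 1$, $i = 1, 2, \dots, n$, must have a unique solution $y = f$.

• We solve the following SDP problem to get a second-order cone $F_{\mathrm{SOC}} = \{x \in \mathbb{R}^n \mid \sqrt{x^T Q x} \le f^T x\}$ such that $V \subseteq F_{\mathrm{SOC}}$:

\[
\begin{array}{ll}
\text{Min} & \log\det(Q^{-1}) \\
\text{s.t.} & Q \bullet [v_i v_i^T] = 1, \quad i = 1, 2, \dots, n, \\
& Q \in \mathcal{S}^n_{++}.
\end{array}
\]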


Approximation

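Basic Idea: Partition the standard simplex $\Delta = \{x \in \mathbb{R}^n_+ \mid (e^n)^T x = 1\}$, where $e^n \in \mathbb{R}^n$ is a vector of all ones, into several small simplices $\Delta = \Delta_1 \cup \Delta_2 \cup \dots \cup \Delta_k$, then use $F^i_{\mathrm{SOC}}$ to cover each $\Delta_i$. As the partition becomes finer, these second-order cones cover $\mathbb{R}^n_+$ more precisely.

Theorem. If $\Delta = \Delta_1 \cup \Delta_2 \cup \dots \cup \Delta_k$ and $\Delta_i \subseteq F^i_{\mathrm{SOC}}$ for $i = 1, 2, \dots, k$, then

\[
\bigcap_{i=1}^{k} \mathcal{C}_{F^i_{\mathrm{SOC}}} \subseteq \mathcal{C}
\quad \text{and} \quad
\sum_{i=1}^{k} \mathcal{C}^*_{F^i_{\mathrm{SOC}}} \supseteq \mathcal{C}^* .
\]

The approximation problem under the simplex partition can be formulated as follows:

\[
\begin{array}{ll}
\text{Min} & C \bullet X \\
\text{s.t.} & A_i \bullet X = b_i, \quad i = 1, 2, \dots, m, \\
& X = X_1 + \dots + X_k, \\
& (Q_i - B) \bullet X_i \le 0, \\
& X_i \in \mathcal{S}^n_+, \quad i = 1, 2, \dots, k,
\end{array}
\tag{RACP}
\]

where $B = e^n (e^n)^T$.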


Adaptive Scheme

Intuitive Idea: For an optimization problem, the feasible points are not all equally important. A sensitive subregion, which has a high potential to contain an optimal solution, should be paid more attention [Lu2011].


Adaptive Scheme

Assume $X = \sum_{i=1}^{k} \sum_{j=1}^{r_i} \mu_{ij}\, x_{ij} x_{ij}^T$ is a decomposition of an optimal solution of problem (RACP). Let $I = \{(i, j) \mid 1 \le i \le k,\ 1 \le j \le r_i\}$ be the index set of the decomposition and $I_p = \{(i, j) \mid (i, j) \in I,\ x_{ij} \notin \mathbb{R}^n_+\}$ be the index set of the infeasible solutions.

Definition. Define any $x^* = \arg\min_{x \in \{x_{ij} \mid (i,j) \in I_p\}} x^T D x$ to be a sensitive point. Let $t$ be the smallest first index $i$ among all sensitive points $x^*$; then $\Delta_t$ is defined to be the "corresponding sensitive simplex".

Definition. Let $F \subseteq \mathbb{R}^n$. For any $\sigma > 0$, the $\sigma$-neighborhood of $F$ is $N_{\sigma,F} = \{x \in \mathbb{R}^n \mid \exists y \in F \text{ s.t. } \|x - y\|_\infty < \sigma\}$.


Algorithm

Step 1 (Initialization Step): Set the initial second-order cone $F^0_{\mathrm{SOC}} = \{x \in \mathbb{R}^n \mid \sqrt{x^T I_n x} \le (e^n)^T x\}$ to cover $\Delta$. Let $k = 1$.

Step 2: Solve (RACP) with the approximation cones to obtain the optimal solution $X^* = X^*_1 + \dots + X^*_k$. Record the optimal value of (RACP) as $l_k$.

Step 3: Decompose $X^*$. If there is no sensitive point, then return $X^*$ as the optimal solution of problem (CPP). Otherwise, find all sensitive points $x^*$ and the corresponding sensitive simplex $\Delta_t$.

Step 4: Check the stopping criteria. If all sensitive points satisfy $x^* \notin \mathrm{Cone}(N_{\sigma,\Delta})$ and the computation time is less than the pre-assigned maximum time, go to Step 5. Otherwise, return $\max\{l_1, \dots, l_k\}$ as a lower bound for problem (CPP).

Step 5: Drop $F^t_{\mathrm{SOC}}$ from the approximation cones. Split $\Delta_t$ into two new simplices. After that, approximate each of these two simplices by two smaller second-order cones. Set $k = k + 1$ and go to Step 2.


Numerical Results

Four different problems are tested:
• Box constrained quadratic programming problem.
• Standard quadratic programming problem.
• Maximum clique problem.
• Binary constrained quadratic programming problem.

The algorithm was implemented using MATLAB 7.9.0 on a computer with an Intel Core 2 CPU at 2.8 GHz and 4 GB of memory. The solvers cvx [Grant2010] and SeDuMi 1.3 [Sturm1999] were used to solve the problems. The optimal values of the test problems were calculated by the commercial software BARON [Sahinidis2010].


Table : Box constrained quadratic programming problems

Instance n fopt fAACP Iters T imespar020-100-1 20 -966 -966.001 1 24.93sspar020-100-2 20 -769 -769.001 1 27.85sspar020-100-3 20 -1019 -1019.001 1 26.20sspar030-060-1 30 -1514 -1514.057 1 333.39sspar030-060-2 30 -1811 -1811.691 1 338.30sspar030-060-3 30 -1917 -1917.001 1 261.05sspar030-080-1 30 -3072 -3072.003 1 285.13sspar030-080-2 30 -2186.208 -2186.209 1 278.43sspar030-080-3 30 -1897.522 -1897.534 1 311.47sspar040-030-1 40 -2244 -2244.010 1 1575.12sspar040-030-2 40 -2088.820 -2094.681 1 1822.23sspar040-030-3 40 -2006 -2008.148 1 1875.38s

The algorithm provides very good lower bounds at the first iteration: every bound in the table is within 0.3% of the optimal value.


Table: Standard quadratic programming problems

Instance        n    f_opt      f_AACP     Iters  Time
spar020-100-1   20   -44.9231   -44.9231   1      1.12s
spar020-100-2   20   -49        -49        1      0.58s
spar020-100-3   20   -48        -48        1      0.44s
spar030-060-1   30   -48        -48        1      2.82s
spar030-060-2   30   -39.0508   -39.0508   1      3.86s
spar030-060-3   30   -45        -45        1      2.76s
spar030-080-1   30   -49        -49        1      3.01s
spar030-080-2   30   -49        -49        1      3.37s
spar030-080-3   30   -38        -38        1      3.39s
spar040-030-1   40   -46        -46        1      14.09s
spar040-030-2   40   -35.2      -35.2      1      15.29s
spar040-030-3   40   -30.3707   -30.3707   1      15.60s

The algorithm finds the optimal values at the first iteration.


Table: Maximum clique problems

Instance     Nodes  Edges  w(G)  w_AACP  Iters  Time
Sanchis50    50     900    10    10      1      69.46s
Sanchis60    60     1100   14    14      1      216.98s
Sanchis70    70     1500   20    20      1      453.44s
Sanchis80    80     1800   20    20      1      1075.94s
Sanchis90    90     2000   22    22      1      2123.63s
Sanchis100   100    2500   25    25      1      4081.67s
Brock50      50     350    10    10      1      54.73s
Brock60      60     541    14    14      1      171.52s
Brock70      70     701    20    20      1      411.03s
Brock80      80     903    20    20      1      929.74s
Brock90      90     1568   22    22      1      1282.75s
Brock100     100    1978   25    25      1      3625.51s

The algorithm returns the exact maximum clique numbers for both graph families (Sanchis and Brock) at the first iteration.
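One standard way to connect the maximum clique problem to quadratic programming over the simplex is the Motzkin-Straus theorem [Motzkin and Straus, 1965]: the maximum of $x^T A x$ over the standard simplex equals $1 - 1/\omega(G)$, where $A$ is the adjacency matrix and $\omega(G)$ the clique number. The sketch below illustrates this identity on a tiny hand-made graph. The graph, the brute-force helper `clique_number`, and the helper `quadratic_form` are assumptions made for illustration; this section does not state which formulation was used for the instances in the table.

```python
from itertools import combinations


def clique_number(n, edges):
    """Brute-force clique number; only sensible for very small graphs."""
    adjacent = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        for nodes in combinations(range(n), size):
            if all(frozenset(p) in adjacent for p in combinations(nodes, 2)):
                return size, nodes
    return 0, ()


def quadratic_form(edges, x):
    """Evaluate x^T A x for the adjacency matrix A of an undirected graph."""
    # Each undirected edge (i, j) contributes A_ij x_i x_j + A_ji x_j x_i.
    return sum(2.0 * x[i] * x[j] for i, j in edges)


# Hypothetical 5-node graph: a triangle {0, 1, 2} with a path 2-3-4 attached.
n = 5
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]

omega, clique = clique_number(n, edges)

# Uniform weights on a maximum clique attain the Motzkin-Straus value.
x = [1.0 / omega if i in clique else 0.0 for i in range(n)]
value = quadratic_form(edges, x)

print(omega, value, 1.0 - 1.0 / omega)
```

The brute-force search is exponential and serves only to confirm the identity on a toy example; it is not a substitute for the conic approach reported in the table.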


Table: Binary constrained quadratic programming problems

Instance        n    f_opt   f_AACP      %Error  Iters  Time
spar010-100-1   10   -535    -537.437    0.46%   8      16.92s
spar010-100-2   10   -390    -391.088    0.28%   34     68.19s
spar010-100-3   10   -554    -554.001    0.00%   1      0.21s
spar020-100-1   20   -915    -919.745    0.52%   23     213.43s
spar020-100-2   20   -984    -986.233    0.23%   18     174.81s
spar020-100-3   20   -910    -914.531    0.50%   21     193.45s
spar030-060-1   30   -1917   -1920.863   0.20%   257    35211.51s
spar030-060-2   30   -2362   -2366.173   0.18%   228    34735.73s
spar030-060-3   30   -2003   -2010.641   0.38%   385    53287.61s
spar030-080-1   30   -2187   -2192.226   0.24%   312    43726.12s
spar030-080-2   30   -1761   -1767.003   0.34%   307    40273.47s
spar030-080-3   30   -1777   -1784.365   0.41%   288    35238.88s

The algorithm efficiently approximates the optimal values of small and medium-size problems, and obtains good lower bounds for larger problems at a much higher computational cost.
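The %Error column of the binary constrained table is consistent with a relative gap between the lower bound and the optimal value. The sketch below assumes the formula $|f_{AACP} - f_{opt}| / |f_{opt}|$; this is inferred from the reported numbers and is not stated on the slides. The function name `relative_error_percent` is hypothetical.

```python
def relative_error_percent(f_opt, f_bound):
    """Relative gap between a bound and the optimal value, in percent."""
    return abs(f_bound - f_opt) / abs(f_opt) * 100.0


# (instance, f_opt, f_AACP) taken from the binary constrained table.
rows = [
    ("spar010-100-1", -535.0, -537.437),
    ("spar010-100-2", -390.0, -391.088),
    ("spar020-100-1", -915.0, -919.745),
]

errors = {
    name: round(relative_error_percent(f_opt, f_bound), 2)
    for name, f_opt, f_bound in rows
}

for name, err in errors.items():
    print(f"{name}: {err:.2f}%")
```

Under this assumed definition the three sampled rows reproduce the tabulated 0.46%, 0.28%, and 0.52%.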


Summary

• The computable representation of the cone of nonnegative quadratic forms over a general nontrivial second-order cone has been given.

• An approximation algorithm based on the computable cone over a union of second-order cones has been provided for solving the CPP problem.

• An adaptive scheme and RLT constraints have been used to improve the efficiency.

Publications:

• Tian, Y., Fang, S.-C., Deng, Z. and Lu, C. (2013) Computable representation of the cone of nonnegative quadratic forms over a general second-order cone and its application to completely positive programming, Journal of Industrial and Management Optimization, vol. 9, 701-709.

• Jin, Q., Tian, Y., Deng, Z., Fang, S.-C. and Xing, W. (2013) Exact computable representation of some quadratically constrained quadratic programming problems, Journal of the Operations Research Society of China, vol. 1, 107-134.

• Deng, Z., Fang, S.-C., Jin, Q. and Xing, W. (2013) Detecting copositivity of a symmetric matrix by an adaptive ellipsoid-based approximation scheme, European Journal of Operational Research, vol. 229, 21-28.


Concluding Recommendation

Update your toolbox with Linear Conic Programming models!


References I

• Adams, W.P. and H.D. Sherali (1986). A tight linearization and an algorithm for zero-one quadratic programming problems. Management Science, 32(10), 1274-1290.

• Ben-Tal, A. and A. Nemirovskii (2001). Lectures on Modern Convex Optimization: Analysis, Algorithms and Engineering Applications, Society for Industrial and Applied Mathematics: Philadelphia, PA.

• Bertsekas, D.P., Nedic, A. and A.E. Ozdaglar (2003). Convex Analysis and Optimization, Athena Scientific: Belmont, MA.

• Bomze, I.M. (2000). Linear-time copositivity detection for tridiagonal matrices and extension to block-tridiagonality. SIAM Journal on Matrix Analysis and Applications, 21, 840-848.

• Bomze, I.M. and G. Eichfelder (2013). Copositivity detection by difference-of-convex decomposition and ω-subdivision. Mathematical Programming, 138, 365-400.


References II

• Bomze, I.M. and E. de Klerk (2002). Solving standard quadratic optimization problem via linear, semidefinite and copositive programming. Journal of Global Optimization, 24, 163-185.

• Boyd, S. and L. Vandenberghe (2004). Convex Optimization, Cambridge University Press: Cambridge, UK.

• Bundfuss, S. and M. Dür (2009). An adaptive linear approximation algorithm for copositive programs. SIAM Journal on Optimization, 20, 30-53.

• Burer, S. (2009). On the copositive representation of binary and continuous nonconvex quadratic programs. Mathematical Programming, 120, 479-495.

• Fang, S.-C. and S. Puthenpura (1993). Linear Optimization and Extensions: Theory and Algorithms, Prentice-Hall Inc.: Englewood Cliffs, NJ.

• Gahinet, P. and A. Nemirovsky (1993). LMI: A Package for Manipulating and Solving LMIs. National Institute for Research in Computer Science and Control.


References III

• Grant, M. and S. Boyd (2010). CVX: Matlab software for disciplined convex programming, version 1.2. <http://cvxr.com/cvx>

• Horn, R.A. and C.R. Johnson (1990). Matrix Analysis, Cambridge University Press: Cambridge, UK.

• Ikramov, K.D. (2002). Linear-time algorithm for verifying the copositivity of an acyclic matrix. Computational Mathematics and Mathematical Physics, 42, 1701-1703.

• Lu, C., Jin, Q., Fang, S.-C., Wang, Z. and W. Xing (2011). An LMI based adaptive approximation scheme to cones of nonnegative quadratic functions. Submitted to Mathematical Programming.

• Motzkin, T.S. and E.G. Straus (1965). Maxima for graphs and a new proof of a theorem of Turán. Canadian Journal of Mathematics, 17, 533-540.

• Murty, K.G. and S.N. Kabadi (1987). Some NP-complete problems in quadratic and nonlinear programming. Mathematical Programming, 39, 117-129.



References IV

• Nemirovski, A. (2001). Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, Society for Industrial and Applied Mathematics: Philadelphia, PA.

• Parrilo, P. (2000). Structured Semidefinite Programs and Semi-Algebraic Geometry Methods in Robustness and Optimization. Ph.D. Thesis, California Institute of Technology. Available at: <http://etd.caltech.edu/etu/available/etd-05062004-055516/>

• Peña, J., Vera, J. and L. Zuluaga (2007). Computing the stability number of a graph via linear and semidefinite programming. SIAM Journal on Optimization, 18, 87-105.

• Renegar, J. (2001). A Mathematical View of Interior-point Methods in Convex Optimization, Society for Industrial and Applied Mathematics: Philadelphia, PA.

• Rockafellar, R.T. (1970). Convex Analysis, Princeton University Press: Princeton, NJ.



References V

• Sherali, H.D. and W.P. Adams (1990). A hierarchy of relaxations between the continuous and convex hull representations for zero-one programming problems. SIAM Journal on Discrete Mathematics, 3(3), 411-430.

• Sherali, H.D. and W.P. Adams (1994). A hierarchy of relaxations and convex hull characterizations for mixed-integer zero-one programming problems. Discrete Applied Mathematics, 52(1), 83-106.

• Sherali, H.D. and C.H. Tuncbilek (1995). A reformulation-convexification approach for solving nonconvex quadratic programming problems. Journal of Global Optimization, 7(1), 1-31.

• Sherali, H.D. and H. Wang (2001). Global optimization of nonconvex factorable programming problems. Mathematical Programming, 89(3), 459-478.


References VI

• Sturm, J. (1999). SeDuMi 1.02, a Matlab toolbox for optimization over symmetric cones. Optimization Methods and Software, 11&12, 625-653.

• Sturm, J.F. and S. Zhang (2003). On cones of nonnegative quadratic functions. Mathematics of Operations Research, 28(2), 246-267.

• Wolkowicz, H., Saigal, R. and L. Vandenberghe (eds.) (2000). Handbook of Semidefinite Programming: Theory, Algorithms, and Applications, Kluwer Academic Publisher: Norwell, MA.

• Wright, S. (1997). Primal-Dual Interior-Point Methods, Society for Industrial and Applied Mathematics: Philadelphia, PA.

• Ye, Y. and S. Zhang (2003). New results on quadratic minimization. SIAM Journal on Optimization, 14(1), 245-267.


References VII

Others

• Ye, Y., Linear Conic Programming, lecture notes online: http://www.stanford.edu/class/msande314/sdpmain.pdf

• Todd, M.J., Semidefinite Programming, lecture notes online: http://people.orie.cornell.edu/~miketodd/cornellonly/or637/or637.html

Software

• SeDuMi, a very popular general-purpose SDP solver by Jos F. Sturm, can be found at: http://sedumi.ie.lehigh.edu/

• CVX, another very popular solver for convex programming problems, can be found at: cvxr.com/cvx/

• Sahinidis, N.V. and M. Tawarmalani (2010). BARON 9.0.4: Global Optimization of Mixed-Integer Nonlinear Programs. <http://archimedes.cheme.cmu.edu/baron/baron.html>





Thank You !

Questions?


