
Ecole Polytechnique Fédérale de Lausanne
Master of Mathematical Sciences

Semester project

Prof. Daniel Kressner

Supervisor: Petar Sirkovic

Eigenvalue Optimization for Solving the MAX-CUT Problem

Julien HESS

Lausanne, fall 2012


Abstract

The purpose of this semester project is to investigate the Spectral Bundle Method, which is a specialized subgradient method particularly suited for solving large scale semidefinite programs that can be cast as eigenvalue optimization problems of the form

    min_{y∈R^m}  a λmax(C − ∑_{i=1}^m A_i y_i) + b^T y,

where C and the A_i are given real symmetric matrices, b ∈ R^m allows one to specify a linear cost term, and a > 0 is a constant multiplier for the maximum eigenvalue function λmax(·). In particular, a semidefinite relaxation of the well-known max-cut problem belongs to this class of problems.

After a general description of the Spectral Bundle Method, a MATLAB implementation of the method designed for solving the eigenvalue relaxation of the max-cut problem is presented. Finally, it is explained how to extract an optimal solution of the original max-cut problem from the optimal solution of the eigenvalue relaxation.


Contents

1 Introduction

2 Preliminaries
  2.1 Notation
  2.2 Linear Algebra

3 The Max-Cut Problem
  3.1 Standard Formulation
  3.2 Algebraic Formulation

4 Semidefinite Programming
  4.1 The Standard Primal-Dual Pair of Semidefinite Programs
  4.2 Duality Theory
  4.3 Semidefinite Relaxation of the Max-Cut Problem
  4.4 Solving the Semidefinite Relaxation with SeDuMi

5 Eigenvalue Optimization
  5.1 Standard Formulation
  5.2 Eigenvalue Relaxation of the Max-Cut Problem
  5.3 Analysis of the Objective Function
  5.4 The Spectral Bundle Method
    5.4.1 Verbal Description
    5.4.2 Mathematical Description
    5.4.3 The Algorithm
    5.4.4 Improvements

6 Resolution of the Max-Cut Problem
  6.1 Extraction of a Dual Optimal Solution
  6.2 Extraction of a Primal Optimal Solution
  6.3 Best Rank One Approximation
  6.4 Extraction of an Optimal Cut

7 Numerical Experiments

A Matlab codes
  A.1 The Spectral Bundle Method
    A.1.1 Lanczos with Reorthogonalization
    A.1.2 The Oracle
    A.1.3 The Trial Point Finding
    A.1.4 The Model Update
    A.1.5 The Algorithm
  A.2 Resolution of the Max-Cut Problem
    A.2.1 Extraction of a Dual Optimal Solution
    A.2.2 Extraction of a Primal Optimal Solution
    A.2.3 Best Rank One Approximation
    A.2.4 Extraction of an Optimal Cut


1 Introduction

Although it is one of the simplest graph partitioning problems to formulate, the max-cut problem (MC) is one of the most difficult combinatorial optimization problems to solve. Indeed, it belongs to the class of NP-complete problems, meaning that it is not yet known whether it can be solved by an algorithm running in polynomial time for general graphs. For this reason, it is customary to solve a relaxation of the problem rather than MC itself, which provides an upper bound for the optimal solution.

The most common relaxation of the max-cut problem is expressed in the framework of semidefinite programming. From a theoretical point of view, interior point methods are ideal for solving semidefinite programs (see Helmberg [3], Chap. 4). However, for practical applications with many constraints, the price to pay in a single iteration is often too high. In this view, the Spectral Bundle Method - which is a specialized subgradient method - offers an interesting alternative for solving the class of large scale semidefinite programs that can be cast as eigenvalue optimization problems (cf. [3], Chap. 5). As the max-cut problem belongs to this class, the main purpose of this paper is to show how the Spectral Bundle Method can be used in order to solve it.

To begin with, we introduce the notation used in this paper and give some basic results of linear algebra in Section 2. In Section 3 we introduce the max-cut problem in more detail, whereas the standard formulation of the primal-dual pair of semidefinite programs and the main results of duality theory are briefly presented in Section 4. The reader familiar with the max-cut problem and with semidefinite programming might want to skip those sections and go directly to Section 5, where the main topic of this paper is discussed. A general description of the Spectral Bundle Method is given, as well as a MATLAB implementation designed for solving the particular case of MC. In Section 6, we then explain how to extract an optimal solution of the max-cut problem from an optimal solution of its eigenvalue relaxation. Finally, a number of numerical results are presented in Section 7.


2 Preliminaries

As usual when dealing with optimization problems, we will use matrix notation to represent linear systems of equations in a compact and convenient way. Although we assume the notions of matrices and matrix operations to be familiar to the reader, Subsection 2.1 briefly describes the main notations that we are going to use in this paper. For the sake of convenience, we also recall in Subsection 2.2 a number of important results from linear algebra that are going to be useful in the following sections.

2.1 Notation

Let m, n be integers and let us consider R^m and R^n as vector spaces over R. Let also M_{m,n} be the set of m × n real matrices. For a vector x ∈ R^n and a matrix A ∈ M_{m,n}, we denote the j-th component of x by x_j and the (i, j)-entry of A by a_{ij}, where 1 ≤ i ≤ m and 1 ≤ j ≤ n. We will use capital Latin letters to denote matrices and small Latin letters to denote vectors. Finally, we denote by e := (1, 1, . . . , 1)^T the vector of appropriate length whose coordinates are all ones and by I the identity matrix of appropriate size.

If we interpret M_{m,n} as a vector space in R^{m·n}, the natural inner product between two elements A = (a_{ij}) and B = (b_{ij}) in M_{m,n} is the Frobenius inner product

    〈A, B〉 = tr(B^T A) = ∑_{i=1}^m ∑_{j=1}^n a_{ij} b_{ij},

where the trace tr(·) is the sum of the diagonal elements of a square matrix. Note that for vectors, the Frobenius inner product corresponds to the usual Euclidean inner product, i.e. 〈x, y〉 = y^T x for x, y ∈ R^n. The trace is a linear operator, which implies that the Frobenius inner product is bilinear. The induced norm is the Frobenius norm,

    ‖A‖_F = √〈A, A〉.

In this paper we shall mainly work with the set of symmetric matrices in M_{n,n}, denoted by S_n, which we naturally endow with the inner product of M_{n,n}. An arbitrary matrix A ∈ S_n can always be diagonalized, which means there exist an orthonormal matrix P ∈ M_{n,n} and a diagonal matrix Λ ∈ S_n containing the eigenvalues of A on its main diagonal such that A = PΛP^T. All eigenvalues of a symmetric matrix A are real and we denote them by λ_i(A), i = 1, . . . , n.

For our purposes we assume that the eigenvalues are sorted non-increasingly, namely λmax(A) = λ_1(A) ≥ λ_2(A) ≥ · · · ≥ λ_n(A) = λmin(A). A matrix A ∈ M_{n,n} is said to be positive semidefinite (respectively, positive definite) if x^T Ax ≥ 0 for all x ∈ R^n (respectively, if x^T Ax > 0 for all x ∈ R^n \ {0}). We write A ≽ 0 when the matrix is symmetric positive semidefinite and A ≻ 0 when it is symmetric positive definite.


2.2 Linear Algebra

For future reference and for the reader's convenience, we state here a few basic results from linear algebra. We first recall the Rayleigh-Ritz Theorem, which provides the most common way to characterize the maximum eigenvalue of a symmetric matrix.

Theorem 2.1 (Rayleigh-Ritz). Let A ∈ S_n. Then

    λmax(A) = max_{‖v‖=1} v^T Av,        (2.2.1)
    λmin(A) = min_{‖v‖=1} v^T Av.        (2.2.2)

Proof. See Horn and Johnson [7].

In particular, the maximum in (2.2.1) is attained for eigenvectors to the maximum eigenvalue of A. Next, we recall the Singular Value Decomposition (SVD) of a matrix.

Theorem 2.2 (SVD). Let A ∈ M_{m,n}. Then there are orthogonal matrices U ∈ M_{m,m} and V ∈ M_{n,n} such that

    A = UΣV^T,  with Σ = Diag(σ_1, . . . , σ_r, 0, . . . , 0) ∈ M_{m,n}

and σ_1 ≥ σ_2 ≥ · · · ≥ σ_r ≥ 0 for r = min{m, n}.

Proof. See Horn and Johnson [7].

The uniquely determined scalars σ_1, . . . , σ_r in Theorem 2.2 are called the singular values of A. Let A = UΣV^T be an SVD of an arbitrary matrix A ∈ M_{m,n}. For any k ≤ min{m, n}, let U_k ∈ M_{m,k} and V_k ∈ M_{n,k} contain the first k columns of U and V, respectively. Then the matrix

    A_k := U_k Σ_k V_k^T,  with Σ_k = Diag(σ_1, . . . , σ_k)        (2.2.3)

clearly has rank at most k. Moreover,

    ‖A − A_k‖_F = ‖Diag(σ_{k+1}, . . . , σ_r)‖_F = √(σ_{k+1}^2 + · · · + σ_r^2),

which the following result shows to be minimal.

Theorem 2.3 (Eckart-Young). For a matrix A ∈ M_{m,n}, let A_k be defined as in (2.2.3). Then

    ‖A − A_k‖_F = min {‖A − B‖_F : B ∈ M_{m,n} has rank at most k}.

Proof. See Horn and Johnson [7].
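As a quick numerical illustration of Theorem 2.3 (this sketch is not part of the report's code, and all variable names are chosen for the example only), MATLAB's built-in svd can be used to form A_k and to check that the Frobenius error indeed equals the tail of the singular values:

    m = 8; n = 6; k = 2;
    A = randn(m, n);                            % arbitrary test matrix
    [U, S, V] = svd(A);                         % singular values sorted non-increasingly
    Ak = U(:,1:k) * S(1:k,1:k) * V(:,1:k)';     % truncated SVD A_k of (2.2.3)
    sig = diag(S);
    err_direct  = norm(A - Ak, 'fro');          % ||A - A_k||_F
    err_formula = sqrt(sum(sig(k+1:end).^2));   % sqrt(sigma_{k+1}^2 + ... + sigma_r^2)
    fprintf('%.3e  %.3e\n', err_direct, err_formula)   % the two values agree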


3 The Max-Cut Problem

Although it is one of the simplest graph partitioning problems to formulate, the max-cut problem (MC) is one of the most difficult combinatorial optimization problems to solve. Indeed, it belongs to the class of NP-complete problems, meaning that it is not yet known whether it can be solved by an algorithm running in polynomial time for general graphs (for planar graphs, it has been shown that MC is the dual to the route inspection problem, thus solvable in polynomial time [2]). It is interesting to note however that the inverse problem of MC, namely the min-cut problem, can be solved in polynomial time by means of network flow techniques [9]. Applications of MC appear in various fields such as VLSI circuit design [8] and statistical physics [1].

In this section we briefly introduce the standard and algebraic formulations of the max-cut problem.

3.1 Standard Formulation

The max-cut problem is defined in the framework of graph theory. Denote an undirected graph G = (V, E) as a pair consisting of a set of vertices V and a set of edges E. We assume that G contains no loops nor multiple edges, that is, there is no edge starting and ending at the same vertex and there can be at most one edge between two distinct vertices. We use V = {1, . . . , n} and E = {ij : i < j, i, j ∈ V }, i.e. an edge with endpoints i and j is denoted by ij. Let the map

    w : E → R,  ij ↦ w(ij) := w_{ij}

be a weight function defined on the set of edges. In this paper, we shall only work with the general case of the complete edge-weighted graph K_n. Indeed, any graph can easily be modelled in this setting; it suffices to consider a non-existing edge as an edge with zero weight by setting w_{ij} = 0 if ij ∉ E. For an unweighted graph we simply assign a unit weight to every edge, namely w_{ij} = 1 for all ij ∈ E. The symmetric matrix W = (w_{ij}) ∈ S_n is referred to as the weighted adjacency matrix of the graph. Since we assume G to contain no loops, all the diagonal entries of W are zero, hence tr(W) = 0.

The notion of cut is intimately related to that of a partition of the vertex set. Indeed, a cut is defined as a set of edges such that the vertex set is partitioned into two disjoint subsets when those edges are removed. More precisely, for a subset S ⊂ V, the cut δ(S) is the set of edges that have one endpoint in S and the other in V \ S, namely δ(S) = {ij ∈ E : i ∈ S, j ∈ V \ S}.

The weight of a cut is the sum of the weights of its edges. The max-cut problem therefore consists in finding a cut of maximum weight. More formally, it can be written as

    (MC)    max_{S⊂V} ∑_{ij∈δ(S)} w_{ij}.        (3.1.1)


Figure 3.1 shows a very simple example of a complete graph on 3 vertices corresponding to the adjacency matrix

    W =
        0 1 4
        1 0 3
        4 3 0 .

In this example, we can see very easily that the maximum cut (represented by the red edges) has weight 7.


Figure 3.1: A complete edge-weighted graph on 3 vertices.

3.2 Algebraic Formulation

In order to solve the max-cut problem, it will be more convenient to work with an algebraic formulation of (3.1.1). For a subset S ⊂ V, let us introduce the associated cut vector x ∈ {−1, 1}^n defined by

    x_i = 1 if i ∈ S,    x_i = −1 if i ∈ V \ S,

for i = 1, . . . , n. Observe that for an edge ij ∈ E, we have x_i = −x_j if ij ∈ δ(S) and x_i = x_j if ij ∉ δ(S). Therefore the expression (1 − x_i x_j)/2 evaluates to 1 if the edge ij is in the cut δ(S) and to 0 otherwise; the vector with these entries is called the incidence vector associated to the cut. Since we have to count each edge in the cut only once, we can write

    ∑_{ij∈δ(S)} w_{ij} = ∑_{i<j} w_{ij} (1 − x_i x_j)/2.


Exploiting the symmetry of W and the fact that x_i x_i = 1, this yields

    ∑_{ij∈δ(S)} w_{ij} = (1/4) ∑_{i,j} w_{ij} (1 − x_i x_j)
                       = (1/4) ∑_{i=1}^n ( ∑_{j=1}^n w_{ij} x_i x_i − ∑_{j=1}^n w_{ij} x_i x_j )
                       = (1/4) x^T (Diag(W e) − W) x,

where Diag(·) : R^n → S_n denotes the linear operator that maps a vector y ∈ R^n to the diagonal matrix

    Diag(y) = Diag(y_1, y_2, . . . , y_n)        (3.2.1)

with y_1, . . . , y_n on its main diagonal. The matrix

    L := (1/4) (Diag(W e) − W)        (3.2.2)

is called the Laplace matrix of the graph. Note that the symmetry of W implies the symmetry of L. With this notation, we finally obtain an equivalent algebraic formulation of (3.1.1), namely

    (AMC)    max_{x∈{−1,1}^n} x^T L x.        (3.2.3)
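To make the algebraic formulation concrete, the following MATLAB sketch (illustrative only, not taken from the report's appendix, and feasible only for very small n) builds the Laplace matrix (3.2.2) for the graph of Figure 3.1 and enumerates all cut vectors x ∈ {−1, 1}^n; it recovers the maximum cut weight 7 mentioned above.

    W = [0 1 4; 1 0 3; 4 3 0];             % weighted adjacency matrix of Figure 3.1
    n = size(W, 1);
    L = (diag(W*ones(n,1)) - W) / 4;       % Laplace matrix (3.2.2)
    best = -inf;
    for k = 0:2^n-1                        % enumerate all x in {-1,1}^n
        x = 2*(dec2bin(k, n) == '1')' - 1;
        val = x' * L * x;                  % objective of (AMC)
        if val > best, best = val; xbest = x; end   % xbest holds a maximizing cut vector
    end
    fprintf('maximum cut weight: %g\n', best)       % prints 7 for this example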


4 Semidefinite Programming

Semidefinite programming is a subfield of convex optimization that includes several common classes of optimization problems, such as for instance linear programming, quadratic programming, and semidefinite linear complementarity problems. In general, semidefinite programs (SDP) naturally arise from problems whose data is given by matrices. In addition, many relaxations of practical problems in operations research and combinatorial optimization can be formulated as SDP, which makes this relatively new field of optimization very interesting.

A close relation can be drawn between semidefinite programming and linear programming: instead of optimizing over the cone of the nonnegative orthant x ≥ 0, in semidefinite programming the optimization is performed over the more general cone of symmetric positive semidefinite matrices X ≽ 0. However, unlike for linear programs, not every SDP satisfies strong duality.

In this section we first introduce our standard formulation of a primal semidefinite program and derive its dual. After a glimpse at the duality theory of semidefinite programming, we finally derive a semidefinite relaxation of the max-cut problem.

4.1 The Standard Primal-Dual Pair of Semidefinite Programs

The standard formulation of a semidefinite program is

    (PSDP)    max  〈C, X〉
              s.t. AX = b,        (4.1.1)
                   X ≽ 0,

where C ∈ S_n is a given cost matrix and X ∈ S_n is the matrix variable. The constraints are given by the vector b ∈ R^m and the linear operator A : S_n → R^m defined by

    AX = (〈A_1, X〉, . . . , 〈A_m, X〉)^T,        (4.1.2)

where the A_i are given symmetric matrices for i = 1, . . . , m.

As often in constrained optimization, it is possible to convert the primal problem to a dual form. Before deriving the dual of (PSDP), we need to determine the adjoint operator to A. By definition, it is the operator A^T : R^m → S_n satisfying 〈AX, y〉 = 〈X, A^T y〉 for all X ∈ S_n and y ∈ R^m. Exploiting the linearity of the inner product, we have

    〈AX, y〉 = ∑_{i=1}^m 〈A_i, X〉 y_i = 〈∑_{i=1}^m y_i A_i, X〉,

hence we obtain

    A^T y = ∑_{i=1}^m y_i A_i.        (4.1.3)
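For later reference, the operator A of (4.1.2) and its adjoint (4.1.3) are easy to realize in MATLAB. The fragment below is illustrative only (the cell array Acell of constraint matrices is an assumption, not part of the report's code) and relies on implicit expansion, so it requires a reasonably recent MATLAB release.

    % Acell = {A1, ..., Am}: cell array of n-by-n symmetric constraint matrices
    Aop  = @(X) cellfun(@(Ai) sum(sum(Ai .* X)), Acell(:));        % (A X)_i = <A_i, X>
    ATop = @(y) sum(cat(3, Acell{:}) .* reshape(y, 1, 1, []), 3);  % A^T y = sum_i y_i A_i
    % for the max-cut relaxation of Section 4.3 these reduce to diag(X) and Diag(y)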

In order to derive the dual of (PSDP) we follow a Lagrangian approach, that is, we lift the m primal equality constraints of (4.1.1) into the objective function by means of a Lagrangian multiplier y ∈ R^m. By making this Lagrangian relaxation as tight as


possible, the primal problem reads sup_{X≽0} inf_{y∈R^m} [〈C, X〉 + 〈b − AX, y〉]. The dual of (PSDP) is then constructed by swapping inf and sup, leading to

    sup_{X≽0} inf_{y∈R^m} [〈C, X〉 + 〈b − AX, y〉] ≤ inf_{y∈R^m} sup_{X≽0} [〈b, y〉 + 〈X, C − A^T y〉].        (4.1.4)

By construction, the value of the dual problem is an upper bound for the value of the primal problem. Conversely, the value of the primal problem is a lower bound for the value of the dual problem. An explicit justification of this fact will be given in (4.2.1). To ensure the finiteness of the infimum on the right hand side, the inner maximization over X ≽ 0 has to remain finite for some y ∈ R^m. This requires 〈X, C − A^T y〉 ≤ 0 for all X ≽ 0, which by Fejer's Trace Theorem is equivalent to A^T y − C ≽ 0 (see Helmberg [3], Corollary 1.2.7). In order to take this condition into account we introduce a slack variable Z := A^T y − C ∈ S_n and require it to be positive semidefinite. We finally obtain the standard formulation of the dual semidefinite program to (PSDP),

    (DSDP)    min  〈b, y〉
              s.t. A^T y − Z = C,        (4.1.5)
                   y ∈ R^m, Z ≽ 0.

4.2 Duality Theory

A matrix X ∈ S_n is called a primal feasible solution if it satisfies the constraints of (PSDP), i.e. if AX = b and X ≽ 0. Similarly, a pair (y, Z) ∈ R^m × S_n is called a dual feasible solution if it satisfies the constraints of (DSDP), i.e. A^T y − Z = C and Z ≽ 0. In order to emphasize the importance of the inequality shown in (4.1.4), we consider the duality gap between a dual feasible solution (y, Z) and a primal feasible solution X:

    〈b, y〉 − 〈C, X〉 = 〈AX, y〉 − 〈A^T y − Z, X〉 = 〈Z, X〉 ≥ 0,        (4.2.1)

where the last inequality comes from the fact that both X and Z are positive semidefinite matrices. The property that the duality gap is always nonnegative is referred to as weak duality. In other words, the objective value of any primal feasible solution is at most the objective value of any dual feasible solution.

If the duality gap 〈Z, X〉 is equal to zero, then the primal-dual triplet (X, y, Z) is an optimal solution. However, unlike for linear programming, optimality no longer implies 〈Z, X〉 = 0 in the case of semidefinite programming. Indeed, it may happen that primal and dual optimal objective values do not coincide. This is due to the fact that semidefinite sets of feasible solutions have a nonpolyhedral structure. Nevertheless, the following result of strong duality shows that the problem is well behaved if the primal or the dual feasible sets contain a positive definite feasible point.

Definition 4.1. A point X is strictly feasible for (PSDP) if it is feasible for (PSDP) and satisfies X ≻ 0. A pair (y, Z) is strictly feasible for (DSDP) if it is feasible for (DSDP) and satisfies Z ≻ 0.


Theorem 4.2 (Strong Duality). Assume that there exists a strictly feasible solution for (DSDP) and let

    p* := sup{〈C, X〉 : AX = b, X ≽ 0}    and
    d* := inf{〈b, y〉 : A^T y − Z = C, Z ≽ 0}.

Then p* = d* and if p* is finite it is attained for some X ∈ {X ≽ 0 : AX = b}.

Proof. See Helmberg [3], Theorem 2.2.5.

4.3 Semidefinite Relaxation of the Max-Cut Problem

In order to derive a semidefinite relaxation of the algebraic formulation (3.2.3) of the max-cut problem, observe that

    x^T L x = ∑_{i=1}^n ∑_{j=1}^n x_i L_{ij} x_j = ∑_{i=1}^n ∑_{j=1}^n L_{ij} (x_i x_j) = 〈L, xx^T〉

for all x ∈ R^n. By construction, the matrix xx^T has rank one and it is symmetric positive semidefinite. Moreover, for all x ∈ {−1, 1}^n, its diagonal entries are equal to 1. Relaxing xx^T to an arbitrary matrix X ∈ S_n satisfying these three properties, we obtain

    max  〈L, X〉
    s.t. diag(X) = e,        (4.3.1)
         X ≽ 0,
         rank(X) = 1,

where diag(·) : S_n → R^n is defined by

    diag(X) = (x_{11}, x_{22}, . . . , x_{nn})^T        (4.3.2)

for all X ∈ S_n. The constraint diag(X) = e can be written in the form AX = b with b = e and A : S_n → R^n defined as in (4.1.2) with A_i being the n × n matrix with a 1 at the (i, i)-entry and zeros elsewhere. In contrast, the constraint rank(X) = 1 cannot be expressed in the form AX = b, hence (4.3.1) is not yet a semidefinite program. In fact, it is still equivalent to (3.2.3), because both problems optimize the same objective function over the same feasible set. Indeed, the constraints X ≽ 0 and rank(X) = 1 together imply that there exists a factorization x ∈ R^n such that X = xx^T. Moreover, the constraint diag(X) = e implies 1 = X_{ii} = x_i x_i for i = 1, . . . , n, which yields x ∈ {−1, 1}^n.

As a consequence, we have to drop the rank one constraint in order to obtain a continuous feasible set, and the primal semidefinite relaxation of (AMC) reads

    (P)    max  〈L, X〉
           s.t. diag(X) = e,        (4.3.3)
                X ≽ 0.


In order to derive the corresponding dual problem, we first need to determine the adjoint operator of diag(·). In fact, it is the linear operator Diag(·) defined in (3.2.1). Indeed, for all X ∈ S_n and y ∈ R^n we have

    〈diag(X), y〉 = ∑_{i=1}^n x_{ii} y_i = ∑_{i=1}^n x_{ii} (Diag(y))_{ii} = ∑_{i=1}^n ∑_{j=1}^n x_{ij} (Diag(y))_{ij} = 〈X, Diag(y)〉.

Therefore, according to (4.1.5), the dual problem to (P) is

    (D)    min  〈e, y〉
           s.t. Diag(y) − Z = L,        (4.3.4)
                y ∈ R^n, Z ≽ 0.

4.4 Solving the Semidefinite Relaxation with SeDuMi

At this point, provided that the dimension n of the problem is not too large (see Section 5), we could already solve the semidefinite relaxations (P) and (D) in order to obtain an approximation of the optimal solution to the original max-cut problem (3.2.3). A very convenient and effective way to do so is given by the MATLAB toolbox SeDuMi [11] (which stands for Self-Dual-Minimization), allowing one to solve optimization problems over symmetric cones.

SeDuMi implements the self-dual embedding technique proposed by Ye, Todd and Mizuno [12]. In particular, SeDuMi offers the advantage of stating a problem in MATLAB in a very similar way as one would formulate it in theory. The problem has to be stated between two tags between which the solver routines are called.

Assuming that the Laplace matrix L of a graph on n vertices is given, we can solve the primal semidefinite relaxation (P) of the max-cut problem with the MATLAB code:

    cvx_begin sdp
        variable X(n,n) symmetric
        maximize( sum(sum(L.*X)) )
        diag(X) == ones(n,1)      % the all-ones vector e
        X >= 0                    % X positive semidefinite (sdp mode)
    cvx_end

In the same way, the dual semidefinite relaxation (D) of the max-cut problem can be solved with the MATLAB code:

    cvx_begin sdp
        variable y(n)
        variable Z(n,n) symmetric
        minimize( sum(y) )
        Z == diag(y) - L
        Z >= 0                    % Z positive semidefinite
    cvx_end
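As a small sanity check (not part of the report), once both blocks above have been solved one can verify weak duality (4.2.1) numerically: the gap e^T y − 〈L, X〉 should be nonnegative and, at optimality, close to zero.

    gap = sum(y) - sum(sum(L .* X));   % <e,y> - <L,X>, equals <Z,X> >= 0
    fprintf('duality gap: %.2e\n', gap)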

In Section 7 we will use the solutions obtained by SeDuMi as a way to verify the accuracy of the solutions obtained by our MATLAB implementation of the Spectral Bundle Method.


5 Eigenvalue Optimization

Optimization problems involving eigenvalues arise in several mathematical disciplines. When optimization is performed over affine sets of matrices, eigenvalue optimization is closely related to semidefinite programming. This is due to the fact that a symmetric matrix X ∈ S_n is positive semidefinite if and only if its minimal eigenvalue is nonnegative, namely λmin(X) ≥ 0. In this section we make this relation more precise and give a standard formulation of an eigenvalue optimization problem. Then we derive an eigenvalue relaxation of the max-cut problem. We also present a complete description of the Spectral Bundle Method, which gives rise to an iterative algorithm for solving eigenvalue optimization problems. Finally, we implement this method in MATLAB in order to solve the particular case of the max-cut problem.

5.1 Standard Formulation

Let us consider again the dual semidefinite program in standard form

    (DSDP)    min  〈b, y〉
              s.t. Z = A^T y − C,
                   y ∈ R^m, Z ≽ 0,

that we introduced in Section 4. In order to reformulate (DSDP) as an eigenvalue optimization problem, we use the fact that

    Z ≽ 0  ⇔  0 ≤ λmin(Z) = −λmax(−Z)  ⇔  λmax(−Z) ≤ 0

and lift this constraint into the objective function by means of a Lagrangian multiplier a ≥ 0. This yields our standard formulation of an eigenvalue optimization problem

    (EOP)    min_{y∈R^m} [a λmax(C − A^T y) + b^T y].        (5.1.1)

For the sake of convenience we will denote the objective function by

    f(y) := a λmax(C − A^T y) + b^T y        (5.1.2)

in the sequel of this paper. Under the assumption that the n × n identity matrix belongs to the range of the operator A^T, problem (EOP) is equivalent to (DSDP). This important result is captured in the following proposition (see Helmberg [3], Proposition 5.1.1).

Proposition 5.1. Assume there exists ȳ ∈ R^m such that I = A^T ȳ. Then (DSDP) is equivalent to (EOP) for a = max{0, b^T ȳ}.

Proof. Let ȳ ∈ R^m be such that I = A^T ȳ and for y ∈ R^m consider the half ray R(y) := {y + λȳ : λ ≥ λmax(C − A^T y)}. Note that by construction, R(y) is feasible for (DSDP) for all y ∈ R^m. Indeed, due to the condition on λ we have

    A^T (y + λȳ) − C = λ A^T ȳ + A^T y − C = λI − (C − A^T y) ≽ 0.

In particular, this implies that (DSDP) has strictly feasible solutions (just choose λ > λmax(C − A^T y) above). We now consider the two cases b^T ȳ < 0 and b^T ȳ ≥ 0:


Case 1: Assume b^T ȳ < 0. Then a = max{0, b^T ȳ} = 0 and the objective values of both (DSDP) and (EOP) tend to minus infinity along any ray y + λȳ when λ → ∞.

Case 2: Assume b^T ȳ ≥ 0. Then a = max{0, b^T ȳ} = b^T ȳ ≥ 0. We first show that for any feasible solution y ∈ R^m of (EOP) there exists a feasible solution of (DSDP) with the same objective value. Let y ∈ R^m be a given feasible solution of (EOP). Note that problem (EOP) is constant along directions λȳ for λ ∈ R, because

    f(y + λȳ) = a λmax(C − A^T (y + λȳ)) + b^T (y + λȳ)
              = a λmax(C − A^T y − λI) + b^T y + λa
              = a λmax(C − A^T y) + b^T y
              = f(y).

In particular, the choice λ := λmax(C − A^T y) ensures that the term λmax(C − A^T (y + λȳ)) vanishes, hence ỹ := y + λȳ ∈ R(y) is a feasible solution of (DSDP) satisfying b^T ỹ = f(ỹ) = f(y).

Conversely, we show that for any feasible solution y ∈ R^m of (DSDP) there exists a feasible solution ỹ of (EOP) with at most the same objective value. Let y ∈ R^m be a given feasible solution of (DSDP). Consider the point ỹ := y + λmax(C − A^T y) ȳ ∈ R(y). Since y is feasible for (DSDP) we have

    Z = A^T y − C ≽ 0  ⇔  λmax(C − A^T y) ≤ 0,

therefore we obtain

    f(ỹ) = b^T ỹ = b^T y + λmax(C − A^T y) a ≤ b^T y.

Because (EOP) is a minimization problem, this suffices to prove the equivalence of (DSDP) and (EOP).

Remark 5.2. If (PSDP) is feasible, all its feasible solutions satisfy tr(X) = a, because

    0 = 〈AX − b, ȳ〉 = 〈X, A^T ȳ〉 − b^T ȳ = 〈X, I〉 − a = tr(X) − a.

Moreover, since (DSDP) has strictly feasible solutions, the Strong Duality Theorem 4.2 implies that the primal optimum is attained and is equal to the infimum of (DSDP).

5.2 Eigenvalue Relaxation of the Max-Cut Problem

In the particular case of the max-cut problem, the dual semidefinite relaxation is

    (D)    min  e^T y
           s.t. Z = Diag(y) − L,
                y ∈ R^n, Z ≽ 0,


where the adjoint A^T of the linear operator A is Diag(·) : R^n → S_n. Clearly we have I = Diag(e), which yields ȳ = e. Therefore we have a = max{0, e^T e} = n > 0 and the eigenvalue relaxation of the max-cut problem reads

    (E)    min_{y∈R^n} [n λmax(L − Diag(y)) + e^T y].        (5.2.1)

In view of Proposition 5.1, we have the equivalence between problems (D) and (E).
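For small instances, the objective of (E) is straightforward to evaluate in MATLAB; the anonymous function below is a minimal illustration (not the report's implementation, which uses the Lanczos-based oracle of Section 5.4).

    % f(y) = n*lambda_max(L - Diag(y)) + e'*y for the max-cut relaxation (E)
    n = size(L, 1);
    f = @(y) n * max(eig(L - diag(y))) + sum(y);
    f(zeros(n, 1))    % at y = 0 this is just n*lambda_max(L)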

5.3 Analysis of the Objective Function

In this subsection we go back to the general case and analyse the cost function of problem (EOP), namely f(y) = a λmax(C − A^T y) + b^T y. As the case a = 0 is of little interest, we assume that a > 0 in the rest of this section. To begin with, we investigate the convex structure of the maximum eigenvalue function λmax(·) as well as its subdifferential. We first use the Rayleigh-Ritz Theorem 2.1, which allows us to characterize the maximum eigenvalue of a symmetric matrix as

    λmax(X) = max_{‖v‖=1} v^T Xv.        (5.3.1)

The next result will be very helpful in order to reformulate (5.3.1).

Proposition 5.3. The set W = {W ∈ S_n : W ≽ 0, tr(W) = 1} is the convex hull of the set V = {vv^T : v ∈ R^n, ‖v‖ = 1}.

Proof. It is sufficient to show that any W ∈ W can be written as a convex combination of elements in V. Let W ∈ W be given. Since W is symmetric positive semidefinite, a Singular Value Decomposition (see Theorem 2.2) has the form

    W = V Σ V^T,

where V ∈ M_{n,n} is an orthogonal matrix and Σ = Diag(σ_1, . . . , σ_n). Denoting the i-th column of V by v_i, we obtain

    W = ∑_{i=1}^n σ_i v_i v_i^T.        (5.3.2)

Since V is orthogonal we have ‖v_i‖^2 = v_i^T v_i = 1, hence v_i v_i^T ∈ V for i = 1, . . . , n. Moreover, the singular values satisfy σ_i ≥ 0 for i = 1, . . . , n and we have

    ∑_{i=1}^n σ_i = tr(Σ) = tr(W) = 1,

where we used the fact that the trace is invariant under similarity transformations. This proves that (5.3.2) is a convex combination.

Using the fact that v^T Xv = 〈X, vv^T〉 for all v ∈ R^n, the linearity of the Frobenius scalar product and Proposition 5.3, maximizing over V in (5.3.1) amounts to maximizing over W and thus we can reformulate the maximum eigenvalue function as a semidefinite program,

    λmax(X) = max_{W∈W} 〈X, W〉.        (5.3.3)


As the maximum over a family of linear functions, λmax(·) is convex. This brings us to the notion of subgradient. The subgradients of λmax(·) at X are the linear forms W_S that satisfy the subgradient inequality λmax(Y) ≥ λmax(X) + 〈Y − X, W_S〉 for all Y ∈ S_n. The subdifferential of λmax(·) at X is the set of all subgradients of λmax(·) at X and is denoted by ∂λmax(X). In particular, we have the characterization

    ∂λmax(X) = Argmax_{W∈W} 〈X, W〉 = {PV P^T : tr(V) = 1, V ≽ 0},        (5.3.4)

where the columns of P form an orthonormal basis of the eigenspace to the maximum eigenvalue of X (see Helmberg [3], p. 74).

Remark 5.4. Due to Proposition 5.3, the subdifferential in (5.3.4) may be viewed as the convex hull of the dyadic products of the normalized eigenvectors to the maximum eigenvalue of X. In particular, any normalized eigenvector v ∈ R^n to λmax(X) gives rise to a subgradient of λmax(·) at X of the form W_S = vv^T.

Let us now consider the function a λmax(X) = max_{W∈W} 〈X, aW〉. Using the linearity of the trace operator, we have tr(aW) = a tr(W) = a for all W ∈ W. Moreover, aW ≽ 0 ⇔ W ≽ 0 (recall that we assumed a > 0), so that if we set

    W_a := {W ∈ S_n : W ≽ 0, tr(W) = a}        (5.3.5)

we can write

    a λmax(X) = max_{W∈W_a} 〈X, W〉,

whose subdifferential is

    ∂ a λmax(X) = {PV P^T : tr(V) = a, V ≽ 0}.

Finally, this allows us to reformulate the objective function f of (5.1.2) as

    f(y) = max_{W∈W_a} [〈C, W〉 + 〈b − AW, y〉].        (5.3.6)

The convexity of λmax implies that of f. As a consequence, the subdifferential of f at y is

    ∂f(y) = {b − AW : W ∈ ∂ a λmax(C − A^T y)}.        (5.3.7)


5.4 The Spectral Bundle Method

The Spectral Bundle Method is a specialized subgradient method for solving eigenvalue optimization problems of the form (EOP). It was developed by C. Helmberg [3] as an alternative for solving the class of large constrained semidefinite programs that can be cast as eigenvalue optimization problems. Indeed, although primal-dual interior point methods work well on reasonably defined semidefinite programs in theory, their applicability is limited to problems with about 7000 constraints on a workstation equipped with the current technology [3]. This is due to the fact that the system matrix arising in the computation of the step direction is in general dense and positive definite, which often makes the cost of a single iteration too high. The Spectral Bundle Method allows one to solve high dimensional problems of the form (EOP) (in particular, problems arising from semidefinite programs with a large number of constraints), but has the drawback of a poor convergence rate: it is only a first order method.

Before giving a more precise mathematical description of the Spectral Bundle Method and its algorithm, we begin with a verbal description of the general principles of the method.

5.4.1 Verbal Description

Due to the maximum eigenvalue function λmax(·), the objective function f of (5.1.2) that we want to minimize is a nonsmooth convex function (see Subsection 5.3). Therefore, the Spectral Bundle Method employs nonsmooth convex optimization techniques to solve (EOP) and requires the computation of a subgradient of f at a given point y ∈ R^m. Together with the function value f(y), a subgradient gives rise to a linear minorant of f that touches f at y. This linear function minorizing f is usually referred to as a supporting hyperplane or a cutting surface model of f at y.

Rather than minimizing f directly, subgradient methods work with a sequence of cutting surface models of f, and therefore an oracle returning f(y) and providing some subgradient s ∈ ∂f(y) for a given point y is the only necessary information for such methods. In view of Remark 5.4, subgradients to the maximum eigenvalue function at X are completely determined by the eigenspace to the maximum eigenvalue of X. Therefore, as long as the matrix structure enables quick matrix-vector multiplications, maximal eigenvalues and eigenvectors of large-scale matrices can be efficiently determined by iterative methods such as, for instance, the Lanczos method (see [10]). In this case, function value and subgradient can be provided by the oracle fairly quickly.

In particular, the Spectral Bundle Method is a specialized subgradient method which produces a sequence of semidefinite cutting surface models of f by accumulating and updating subgradient information in a matrix called the bundle. This bundle matrix determines the size of a convex quadratic semidefinite subproblem that has to be solved at each iteration, hence the importance of keeping the size of the bundle matrix relatively small. Subgradient information that is removed from the bundle matrix is accumulated in another matrix called the aggregate.


5.4.2 Mathematical Description

We will now describe one iteration of the Spectral Bundle Method in more detail. We also propose a MATLAB implementation of the method for the particular case of the max-cut problem. Recall that the goal is to minimize the cost function

    f(y) = a λmax(C − A^T y) + b^T y

over y ∈ R^m with a > 0. Assume that we are at iteration k and denote the k-th iterate by y^k. The current center of stability ŷ^k is the starting point or the last successful iterate.

The Oracle

As a specialized subgradient method, the Spectral Bundle Method requires an oracle able to provide the objective value f(y^k) and a subgradient s^k ∈ ∂f(y^k) for some given iterate y^k ∈ R^m. Due to the structure of ∂f(y^k) (cf. (5.1.2)), the oracle is therefore assumed to deliver the maximum eigenvalue of a given matrix C − A^T y^k and a subgradient

    W_S^k ∈ Argmax_{W∈W_a} 〈C − A^T y^k, W〉.

Remark 5.4 implies that we can take W_S^k of the form W_S^k = a v^k (v^k)^T, where v^k is an eigenvector to the maximum eigenvalue of C − A^T y^k. In practice, the maximum eigenvalue and a corresponding eigenvector are usually computed by the Lanczos method, which is the best approach for computing a few extremal eigenvalues of a matrix whose product with a vector can be carried out quickly. We refer to Saad [10] for a full description of the Lanczos process.

Our implementation of the algorithm for the max-cut problem uses the Lanczos method with complete orthogonalization, lanczos_orth A.1.1, as soon as the dimension m of the problem is greater than 100. For small dimensions we simply use the MATLAB function eig in order to compute the eigenvectors and eigenvalues of a given symmetric matrix. The oracle is implemented in the MATLAB function oracle A.1.2.
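For the max-cut case the oracle is particularly simple, since A^T y = Diag(y) and b = e. The following function is a simplified, dense-eigensolver sketch of what oracle A.1.2 computes; the name oracle_mc and the use of eig instead of Lanczos are assumptions made for illustration only.

    function [fval, v, subgrad] = oracle_mc(L, y)
    % Oracle for (E): returns f(y), an eigenvector v to lambda_max(L - Diag(y)),
    % and the subgradient b - A(W) with W = n*v*v' (cf. (5.3.7)).
        n = size(L, 1);
        M = L - diag(y);
        [V, D] = eig((M + M')/2);         % dense eigendecomposition (small n only)
        [lmax, idx] = max(diag(D));
        v = V(:, idx);
        fval = n*lmax + sum(y);           % f(y) = n*lambda_max(L - Diag(y)) + e'*y
        subgrad = ones(n, 1) - n*(v.^2);  % e - diag(n*v*v')
    end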

The Cutting Surface Model

In each iteration of the Spectral Bundle Method, a minorizing cutting surface model of f at the current center of stability ŷ^k has to be computed. Characterization (5.3.6) shows that any fixed W ∈ W_a gives rise to a linear function

    f_W(y) := 〈C, W〉 + 〈b − AW, y〉

minorizing f. Therefore, a subset Ŵ ⊂ W_a yields a convex minorant f_Ŵ of f (i.e., a cutting surface model) as the pointwise maximum over those linear functions, namely

    f_Ŵ(y) := max_{W∈Ŵ} f_W(y) ≤ max_{W∈W_a} f_W(y) = f(y)    for all y ∈ R^m.        (5.4.1)

In order to exploit the structure of (EOP), it is very important to choose an appropriate cutting surface model for f (see Remark 5.7). Following Helmberg [3], the particular choice for iteration k is

    Ŵ_k = {P_k V P_k^T + α W̄^k : tr(V) + α = a, V ≽ 0, α ≥ 0},        (5.4.2)


where P_k is an orthonormal matrix of size n × r_k and W̄^k is an n × n positive semidefinite matrix of trace 1 (i.e., W̄^k ∈ W). We refer to P_k as the bundle and to W̄^k as the aggregate, while the number of columns r_k of P_k is the size of the bundle. Both P_k and W̄^k maintain relevant subgradient information and are updated at the end of each iteration. The cutting surface model corresponding to this subset is therefore f_{Ŵ_k} (cf. (5.4.1)), for which a characterization is given in the following proposition.

Proposition 5.5. For Ŵ_k of (5.4.2) and f_{Ŵ_k} defined as in (5.4.1),

    f_{Ŵ_k}(y) = max{λmax(P_k^T (C − A^T y) P_k), 〈C − A^T y, W̄^k〉} + b^T y ≤ f(y).        (5.4.3)

Proof. See Helmberg [3], Proposition 5.2.1.

This proposition shows that the maximum eigenvalue of P_k^T (C − A^T y) P_k, and hence the value of the cutting surface model, can be efficiently determined provided that the bundle size r_k remains small. Moreover, it helps to make a clever choice as to how to construct the bundle matrix P_k. Indeed, since f_{Ŵ_k} is a minorant of f, it is best to have values of f_{Ŵ_k} that are as large as possible in the vicinity of the current stability center ŷ^k in order to obtain a good approximation of f around ŷ^k. This is the case if the columns of P_k span the eigenspaces of the largest eigenvalues of C − A^T ŷ^k.

The Augmented Model

Since f_{Ŵ_k} is a linear approximation of f, it can be expected to be of reasonable quality only locally. Therefore the next iterate y^{k+1} is sought in a neighborhood of ŷ^k and determined by minimizing the augmented model

    f_k(y) := f_{Ŵ_k}(y) + (u/2) ‖y − ŷ^k‖^2,    u > 0,        (5.4.4)

which is the sum of the cutting surface model f_{Ŵ_k} and a positive quadratic term penalizing the distance from ŷ^k. More precisely, y^{k+1} is the minimizer of the problem

    min_{y∈R^m} max_{W∈Ŵ_k} [〈C, W〉 + 〈b − AW, y〉 + (u/2) ‖y − ŷ^k‖^2].        (5.4.5)

In particular, the quadratic term (u/2) ‖y − ŷ^k‖^2 ensures the minimum to be unique and finite. The parameter u > 0 allows one to control (to some extent) the distance between y^{k+1} and ŷ^k. It is intended to keep y^{k+1} in the region where f_{Ŵ_k} should be close to f. For the sake of convenience, we introduce the augmented Lagrangian as

    L^k(y, W) := 〈C, W〉 + 〈b − AW, y〉 + (u/2) ‖y − ŷ^k‖^2.        (5.4.6)

Rather than solving (5.4.5) we solve its dual, namely

    max_{W∈Ŵ_k} min_{y∈R^m} L^k(y, W).        (5.4.7)


The inner minimization min_{y∈R^m} L^k(y, W) being an unconstrained convex quadratic problem, we can solve it explicitly for any fixed W ∈ Ŵ_k. From

    0 = ∇_y L^k(y, W) = b − AW + (u/2)(2y − 2ŷ^k)

we obtain y_min^k(W) := ŷ^k + (1/u)(AW − b), so that

    L^k(y_min^k(W), W) = 〈C, W〉 + 〈b − AW, ŷ^k + (1/u)(AW − b)〉 + (u/2) ‖ŷ^k + (1/u)(AW − b) − ŷ^k‖^2
                       = 〈C, W〉 + 〈b − AW, ŷ^k〉 − (1/u) 〈AW − b, AW − b〉 + (1/2u) ‖AW − b‖^2
                       = 〈C − A^T ŷ^k, W〉 + b^T ŷ^k − (1/2u) ‖AW − b‖^2.

Due to the definition (5.4.2) of Ŵ_k, maximizing L^k(y_min^k(W), W) over Ŵ_k amounts to solving the following quadratic semidefinite programming problem in V and α:

    (Q)    min  (1/2u) ‖AW − b‖^2 − 〈C − A^T ŷ^k, W〉 − b^T ŷ^k
           s.t. W = P_k V P_k^T + α W̄^k,        (5.4.8)
                tr(V) + α = a,
                V ≽ 0, α ≥ 0.

The dimension of problem (Q) depends on the bundle size r_k, which again emphasizes the importance of keeping the bundle relatively small throughout the algorithm. The following result shows that strong duality holds for the duals (5.4.5), (5.4.7).

Proposition 5.6. Let L^k be the augmented Lagrangian as defined in (5.4.6). Then

    min_{y∈R^m} max_{W∈Ŵ_k} L^k(y, W) = L^k(y^{k+1}, W^{k+1}) = max_{W∈Ŵ_k} min_{y∈R^m} L^k(y, W)        (5.4.9)

with y^{k+1} = y_min^k(W^{k+1}) unique and W^{k+1} an optimal solution of (Q).

Proof. See Helmberg [3], Lemma 5.2.2.

Remark 5.7. Of course, (5.4.2) is not the unique way to construct the subset Ŵ_k ⊂ W_a. Depending on the matrix structure of problem (EOP), various other choices of sets Ŵ_k may turn out useful in particular applications. The main design criterion for a model (i.e. for the choice of a set Ŵ_k) is the efficiency in solving the corresponding quadratic semidefinite subproblem. Determining a good model requires finding a compromise between the quality of the model, which strongly influences the number of iterations needed to achieve the desired precision, and the cost of evaluating f itself.

The structure (5.4.2) of the subset Ŵ_k ⊂ W_a yields the convex quadratic semidefinite subproblem (Q), which can be solved fairly efficiently by means of interior point methods (provided that the bundle size is not too large). We refer to Helmberg [3], Section 5.5, for a description of such a method that exploits the particular structure of the problem. See also Helmberg and Rendl [6] and Helmberg and Kiwiel [5] for a complete description of the interior point algorithm.

As this paper is only a humble semester project, our implementation of the algorithm for the max-cut problem uses the MATLAB toolbox SeDuMi [11] for solving the quadratic subproblem (Q). This step is implemented in the MATLAB function trial_point A.1.3.
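In the spirit of Section 4.4, the subproblem (Q) for the max-cut case (C = L, b = e, AW = diag(W), a = n) can be stated almost literally in CVX. The sketch below is hypothetical (the variable names P, Wbar, yk, u and the post-processing are not taken from trial_point) and is only meant to show the structure of the step.

    r = size(P, 2);                                     % current bundle size r_k
    cvx_begin sdp quiet
        variable V(r, r) symmetric
        variable alph nonnegative
        expression W(n, n)
        W = P*V*P' + alph*Wbar;                         % W in the set (5.4.2)
        minimize( sum_square(diag(W) - ones(n,1))/(2*u) ...
                  - trace((L - diag(yk))*W) - sum(yk) )
        subject to
            trace(V) + alph == n;
            V >= 0;
    cvx_end
    W = P*V*P' + alph*Wbar;                 % numeric optimizer W^{k+1}
    ynext = yk + (diag(W) - ones(n,1))/u;   % trial point y_min^k(W^{k+1})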


The Descent Test

The candidate y^{k+1} is then accepted as the new stability center only if the progress of its function value f(y^{k+1}) (obtained by the oracle) with respect to f(ŷ^k) is satisfactory compared to the decrease predicted by the model value f_{W^{k+1}}(y^{k+1}), i.e. if

    f(ŷ^k) − f(y^{k+1}) ≥ κ [f(ŷ^k) − f_{W^{k+1}}(y^{k+1})]        (5.4.10)

for some parameter κ ∈ (0, 1). If this condition is satisfied, this yields a descent step and the current center of stability is updated as ŷ^{k+1} := y^{k+1}; otherwise it is called a null step and the current stability center remains unchanged, namely ŷ^{k+1} := ŷ^k.
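In code, the descent test (5.4.10) is a single comparison; the fragment below is illustrative only, with assumed variable names (f_center, f_cand, fmodel_cand and kappa standing for f(ŷ^k), f(y^{k+1}), f_{W^{k+1}}(y^{k+1}) and κ).

    if f_center - f_cand >= kappa * (f_center - fmodel_cand)
        y_center = y_cand;          % descent step: move the stability center
        f_center = f_cand;
    end                             % otherwise: null step, the center is kept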

The Model Update

After a descent or a null step, the cutting surface model is updated. More precisely, a new subset Ŵ_{k+1} ⊂ W_a of the form (5.4.2) is determined by updating the bundle and the aggregate. Information from the new subgradient W_S^{k+1} = a v^{k+1} (v^{k+1})^T, where v^{k+1} is an eigenvector to the maximum eigenvalue of C − A^T y^{k+1}, is added to the model by adding the new eigenvector v^{k+1} to the old bundle matrix P_k, hence increasing the bundle size. In order to still be able to solve the quadratic subproblem (Q) efficiently, some of the columns of P_k have to be removed in this process and their contribution to the solution is accumulated in the new aggregate matrix. The important information in Ŵ_k that has to be maintained in Ŵ_{k+1} is contained in the matrix W^{k+1}. Indeed, as (y^{k+1}, W^{k+1}) is a saddle point of L^k (see Proposition 5.6), W^{k+1} ∈ Ŵ_{k+1} ensures that after a null step the value of the augmented model cannot decrease. This forces a sequence of null steps to eventually produce a descent step over time. In fact, it has been shown (see Helmberg [3], Section 5.3) that convergence of the Spectral Bundle Method is guaranteed provided that

    W^{k+1}, W_S^{k+1} ∈ Ŵ_{k+1}    with Ŵ_{k+1} of the form (5.4.2).        (5.4.11)

The minimal choice within this framework is W̄^{k+1} = W^{k+1} and P_{k+1} = v^{k+1}. However, although r_{k+1} = 1 would suffice in theory, in practice it is well-advised to also preserve the most important part of the subspace spanned by the columns of P_k. This allows one to accumulate the relevant eigenspace information without recomputing the whole spectrum in each iteration. The matrix W^{k+1} helps to determine which subspace has been important in the last computation. Denoting by V_* and α_* the optimal solutions of (Q) that gave rise to W^{k+1}, consider an eigenvalue decomposition QΛQ^T = V_* of V_*, where Q^T Q = I and Λ = diag(λ_1, . . . , λ_{r_k}) with λ_1 ≥ · · · ≥ λ_{r_k} ≥ 0. Then we can write

    W^{k+1} = P_k V_* P_k^T + α_* W̄^k = (P_k Q) Λ (P_k Q)^T + α_* W̄^k,        (5.4.12)

which shows that more information about W^{k+1} is carried by the first columns of P_k Q than by the last ones. Assuming therefore that the first columns of P_k Q are more important for the remaining part of the optimization process, we split Q into two parts, Q = [Q_1, Q_2], where Q_1 carries the eigenvectors corresponding to the largest



eigenvalues of V_*. The matrix Λ is analogously split into two smaller diagonal matrices Λ_1 and Λ_2. The new bundle matrix P_{k+1} will then contain an orthonormal basis of the space spanned by the columns of P_k Q_1 and v^{k+1}, namely

    P_{k+1} = orth[P_k Q_1, v^{k+1}].        (5.4.13)

The remaining columns P_k Q_2 are finally incorporated in the new aggregate matrix as

    W̄^{k+1} = ((P_k Q_2) Λ_2 (P_k Q_2)^T + α_* W̄^k) / (tr Λ_2 + α_*),        (5.4.14)

where the scaling factor 1/(tr Λ_2 + α_*) ensures that W̄^{k+1} has a trace equal to 1. The following result shows that updating the bundle and the aggregate in this way maintains the structure (5.4.2) of the set Ŵ_k.

Proposition 5.8. For W_S^{k+1} = a v^{k+1} (v^{k+1})^T ∈ W_a, the update formulas (5.4.13) and (5.4.14) ensure that P_{k+1} is orthonormal, W̄^{k+1} ∈ W, and that condition (5.4.11) is satisfied for Ŵ_{k+1} constructed as in (5.4.2).

Proof. See Helmberg [3], Proposition 5.2.3.

Our implementation of the algorithm for the max-cut problem adds 7 new eigenvectors to the bundle at each iteration and keeps the bundle size bounded using a parameter rmax ∈ N as an upper bound. The model update process is implemented in the MATLAB function model_updating A.1.4.
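A stripped-down version of the update (5.4.12)-(5.4.14) might look as follows; this is an illustrative sketch only (the variable names and the eigenvalue threshold are assumptions, and it supposes tr Λ_2 + α_* > 0), not the report's model_updating.

    [Q, Lam] = eig(Vstar);                       % V_* = Q*Lam*Q', Vstar from (Q)
    [lam, order] = sort(diag(Lam), 'descend');
    Q = Q(:, order);
    keep = min(rmax - 1, sum(lam > 1e-8));       % columns kept in the bundle
    Q1 = Q(:, 1:keep);   Q2 = Q(:, keep+1:end);
    Lam2 = diag(lam(keep+1:end));
    Pnext = orth([P*Q1, vnew]);                  % new bundle, cf. (5.4.13)
    Wbarnext = (P*Q2*Lam2*(P*Q2)' + alphastar*Wbar) / ...
               (trace(Lam2) + alphastar);        % new aggregate, cf. (5.4.14)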

The Stopping Criterion

It remains to determine a reasonable stopping criterion. Of course, we would like the algorithm to stop when the objective value f(ŷ^k) at the current stability center is close enough to the optimal value min_{y∈R^m} f(y). However, a lower bound for the latter value is unfortunately not available. Indeed, although we know that the cutting surface model f_{Ŵ_k} is a minorant for f, we do not know its minimizer. Instead, we do know the minimizer y^{k+1} of the augmented model. Considering the quadratic term of the augmented model as a kind of trust region constraint for the cutting surface model, y^{k+1} may be viewed as the minimizer of f_{Ŵ_k} over a ball around ŷ^k. Provided that the weight u remains reasonably small, the value f_{Ŵ_k}(y^{k+1}) = f_{W^{k+1}}(y^{k+1}) provides a lower bound for f over a ball of reasonable size. Therefore, if the gap between f(ŷ^k) and f_{W^{k+1}}(y^{k+1}) is small, namely if

    f(ŷ^k) − f_{W^{k+1}}(y^{k+1}) < ε (|f(ŷ^k)| + 1),        (5.4.15)

no satisfactory progress can be expected within the trust region and we stop the algorithm.
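In the implementation this amounts to one more comparison per iteration; again an illustrative fragment with assumed variable names (eps_stop plays the role of ε, f_center and fmodel_cand as before).

    if f_center - fmodel_cand < eps_stop * (abs(f_center) + 1)
        break;    % stopping criterion (5.4.15): no further progress expected
    end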


5.4.3 The Algorithm

Here we state the algorithm in pseudo-code. For the particular case of the max-cut problem, the whole algorithm is implemented in the MATLAB function solvesbm A.1.5.

Algorithm 5.9 (Spectral Bundle Method).
Inputs: cost matrix C ∈ S_n, adjoint operator A^T, vector b ∈ R^m, starting point y^0 ∈ R^m, stopping parameter ε ≥ 0, descent parameter κ ∈ (0, 1), weight u > 0.

0. (Initialization) Set k = 0, ŷ^0 = y^0, compute f(ŷ^0) and determine Ŵ_0 (oracle).

1. (Find trial point) Compute W^{k+1} and y^{k+1} by solving (Q) (Proposition 5.6).

2. (Stopping criterion) If f(ŷ^k) − f_{W^{k+1}}(y^{k+1}) ≤ ε(|f(ŷ^k)| + 1) then stop.

3. (Evaluation) Compute f(y^{k+1}) and find a subgradient W_S^{k+1} (oracle).

4. (Descent test) If f(ŷ^k) − f(y^{k+1}) ≥ κ[f(ŷ^k) − f_{W^{k+1}}(y^{k+1})] then set ŷ^{k+1} = y^{k+1} (descent step); otherwise set ŷ^{k+1} = ŷ^k (null step).

5. (Update the model) Choose Ŵ_{k+1} ⊃ {W^{k+1}, W_S^{k+1}} of the form (5.4.2).

6. Increase k by one and go to 2.

Helmberg [3] proved the convergence of Algorithm 5.9 for ε = 0. The proof is divided into several steps. It is first proven that the stopping criterion identifies optimal solutions correctly if the algorithm stops after a finite number of iterations. After an analysis of the asymptotic behavior of null steps, it is then shown that unless the current stability center is optimal, a descent step will eventually be triggered after a finite number of iterations. It is finally proven that the sequence of objective values at the stability centers satisfies f(ŷ^k) ↓ inf_{y∈R^m} f(y) in the case of infinitely many descent steps.

Theorem 5.10 (Convergence). Let {ŷ^k} be the sequence of stability centers generated by Algorithm 5.9. Then either ŷ^k → y* ∈ Argmin_{y∈R^m} f(y), or Argmin_{y∈R^m} f(y) = ∅ and ‖ŷ^k‖ → ∞. In both cases f(ŷ^k) ↓ inf_{y∈R^m} f(y).

Proof. See Helmberg [3], Theorem 5.3.6.

The proof of convergence shows that (5.4.11) is the only requirement for a sequence of null steps to produce a descent step over time. It is also interesting to note that the solution W^k of the quadratic subproblem can be related to the optimal solutions of the primal problem (PSDP) (see Helmberg [3], Theorem 5.3.8). In particular, if Algorithm 5.9 stops after a finite number of steps, say at iteration k = K, then W^{K+1} belongs to the set of optimal solutions of the primal semidefinite program corresponding to (EOP), i.e.

    W^{K+1} ∈ Argmax {〈C, X〉 : AX = b, tr X = a, X ≽ 0}.


5.4.4 Improvements

As this paper is only a humble semester project, our MATLAB implementation solvesbm A.1.5 of the Spectral Bundle Method is by far not as effective as it could be. There are indeed many improvements and modifications that could be made to it in order to save time and memory. Here we briefly state the different aspects that could be improved.

Firstly, it may be useful to restart the Lanczos process after nL steps by setting the new starting vector to the Lanczos vector corresponding to the maximal eigenvalue of the tridiagonal matrix. A small parameter nL will be of interest when the matrix-vector multiplication is computationally cheap, otherwise it is preferable to choose a larger parameter. This topic is discussed by Helmberg [4], who proposes different heuristics for choosing the parameter nL. Step 4 of Algorithm 5.9 may also be slightly modified in order to save time by stopping eigenvalue computations early when the current estimate is already good enough to prove that the current iterate will result in a null step (see Helmberg [4]).

Secondly, the weight u of the augmented model (5.4.4) is usually updated during the algorithm for efficiency reasons. Since this parameter controls the distance of the new iterate with respect to the center of stability, it has to adapt to the local geometry of the objective function. Indeed, a large value of u penalizes the distance from the stability center and therefore allows only small steps. In this view, a sequence of descent steps would for example indicate that u should be decreased in order to take larger steps. In the same vein, if a long sequence of null steps precedes a descent step, a larger value of u might be better. Several rules for updating the parameter u are proposed by Helmberg [4], none of them being known to be the best.

Thirdly, rather than solving the quadratic subproblem (Q) with the MATLAB toolbox SeDuMi, it may be faster to solve it with an appropriate interior point method exploiting the structure of the problem. As aforementioned, such a method is described by Helmberg [3] in Section 5.5. See also Helmberg and Rendl [6] and Helmberg and Kiwiel [5] for a full description of the interior point code.

Finally, the number of Lanczos vectors to add and to keep in the bundle may also be dynamically updated at each iteration. Helmberg [4] proposes a heuristic which chooses four parameters at each iteration for controlling the model update.


6 Resolution of the Max-Cut Problem

Let us assume that we are given the Laplace matrix L of a graph on n vertices. In this section, we explain how to compute an approximation of the optimal solution to the original max-cut problem in algebraic form

    (AMC)    max_{x ∈ {−1,1}^n}  x^T L x

from an optimal solution of its eigenvalue relaxation

    (E)    min_{y ∈ R^n}  [ n λmax(L − Diag(y)) + e^T y ] =: f(y).

Assume that yE ∈ R^n is an optimal solution of (E) obtained by the matlab function solvesbm A.1.5, which is an implementation of Algorithm 5.9 (i.e., the Spectral Bundle Method) for the particular case of the max-cut problem.

By construction of the eigenvalue relaxation, we know that γ := f(yE) is an upper bound for the optimal value of (AMC). The goal is now to find a cut vector x ∈ {−1, 1}^n whose objective value x^T L x is very close to γ. This is done in several steps.

6.1 Extraction of a Dual Optimal Solution

To begin with, we use the equivalence between the eigenvalue relaxation (E) and the dual semidefinite relaxation

    (D)    min  e^T y
           s.t. Diag(y) − Z = L,
                y ∈ R^n,  Z ⪰ 0,

in order to compute an optimal solution of (D) from yE. This equivalence is ensured by Proposition 5.1, whose proof tells us that

    y := yE + λmax(L − Diag(yE)) e ∈ R^n                    (6.1.1)

is a feasible solution of (D) with objective value e^T y = γ. As a consequence, (y, Z) with Z := Diag(y) − L is an optimal solution of (D). This first step is implemented in the matlab function shifty A.2.1.

6.2 Extraction of a Primal Optimal Solution

In the next step, we use the strong duality between the dual semidefinite relaxation (D) and the primal semidefinite relaxation

    (P)    max  ⟨L, X⟩
           s.t. diag(X) = e,
                X ⪰ 0,

in order to compute an optimal solution of (P) from (y, Z). Strong duality is ensured by Theorem 4.2 and the fact that the dual problem (D) has strictly feasible solutions (cf. proof of Proposition 5.1).


The goal is to find a matrix X ∈ Sn with objective value ⟨L, X⟩ = γ satisfying the constraints diag(X) = e and X ⪰ 0. The condition on the objective value forces the duality gap ⟨X, Z⟩ to be zero. Since Z is positive semidefinite and X must be positive semidefinite, this requires the columns of X to lie in the nullspace of Z. Let N be a matrix whose columns form a basis of the nullspace of Z, and assume this nullspace has dimension k ≤ n, i.e. N is an n × k matrix. We are now looking for a k × n matrix O such that NO is symmetric and satisfies diag(NO) = e.

For the sake of convenience, let us denote the i-th row of N by N_i and the i-th column of O by O_i, for i = 1, . . . , n. With respect to this notation, the constraint diag(NO) = e is equivalent to the system

    N_i O_i = 1,    i = 1, . . . , n.                                      (6.2.1)

Moreover, requiring NO to be symmetric amounts to requiring

    N_i O_j − N_j O_i = 0,    j = 1, . . . , n − 1,  i = j + 1, . . . , n.  (6.2.2)

Note that we have nk variables (i.e., the number of entries in O) and that (6.2.1) and (6.2.2) respectively yield n and n(n−1)/2 equations. Thus, unless k ≥ (n+1)/2, there are more equations than variables and the matrix O can be determined. We express (6.2.1)-(6.2.2) as one big linear system

A vec(O) = b, (6.2.3)

where vec(·) : M_{k,n} → R^{kn} is the usual vectorization operator stacking the columns of a matrix on top of each other, A is an (n(n+1)/2) × kn matrix and b ∈ R^{n(n+1)/2}. More precisely, the first n rows of A place the row vectors N_1, . . . , N_n block-diagonally (row i carries N_i in block column i, enforcing (6.2.1)), and each of the remaining n(n−1)/2 rows carries N_i in block column j and −N_j in block column i for a pair j < i (enforcing (6.2.2)):

    A =  ⎡ N_1   0    ⋯     0      ⎤
         ⎢  0   N_2         0      ⎥
         ⎢  ⋮          ⋱    ⋮      ⎥
         ⎢  0    0    ⋯    N_n     ⎥
         ⎢ N_2  −N_1  ⋯     0      ⎥
         ⎢ N_3   0   −N_1   0      ⎥
         ⎢  ⋮               ⋱      ⎥
         ⎢ N_n   0    ⋯   −N_1     ⎥
         ⎢  0   N_3  −N_2   0      ⎥
         ⎢  ⋮               ⋱      ⎥
         ⎢  0   N_n   0   −N_2     ⎥
         ⎢  ⋮                      ⎥
         ⎣  0    0   N_n  −N_{n−1} ⎦ ,        b = (1, . . . , 1, 0, . . . , 0)^T,

where b contains n ones followed by n(n−1)/2 zeros.

We can then solve system (6.2.3) easily, for example using a QR-factorization of A. Transforming the obtained solution vec(O) back into a k × n matrix O, we eventually obtain the desired matrix X := NO. This second step is implemented in the matlab function dual2primal A.2.2.
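For illustration, a minimal sketch of such a QR-based solve of (6.2.3), assuming A, b, N and k have been assembled as above (dual2primal in Appendix A.2.2 uses Matlab's backslash operator instead, which performs an equivalent orthogonal factorization internally):

% Sketch only: least-squares solution of A*vec(O) = b via an explicit QR
% factorization (A is converted to a dense matrix for simplicity).
[Q,R] = qr(full(A),0);        % economy-size QR factorization of A
vecO = R\(Q'*b);              % solve the triangular system R*vec(O) = Q'*b
O = reshape(vecO,k,n);        % reshape vec(O) back into a k-by-n matrix
X = N*O;                      % candidate primal optimal solution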


6.3 Best Rank One Approximation

When deriving the primal semidefinite relaxation (P) in Section 4.3, we dropped the rank one constraint (rank(X) = 1) present in the original algebraic formulation of the max-cut problem (4.3.1). As a consequence, the primal optimal solution X that we derived is not necessarily of rank one. In order to deal with this problem and with the purpose of extracting an optimal cut more easily, we compute the best rank one approximation of X. Using the fact that X is a symmetric matrix, an SVD decomposition of X (cf. Theorem 2.2) reads X = UΣU^T. Let u ∈ R^n be the first column of U and σ_1 the largest singular value of X. Then the matrix

    X_1 := σ_1 u u^T

is the best rank one approximation of X by Theorem 2.3. This step is implemented in the matlab function bestrank1approx A.2.3.

6.4 Extraction of an Optimal Cut

Now that we have the best rank one approximation X_1 of X, it remains to determine the corresponding cut. Since X_1 is of rank one, there exists a vector z ∈ R^n such that X_1 = z z^T. We can then extract a cut vector x ∈ {−1, 1}^n from z by setting

    x_i := 1 if z_i ≥ 0,    x_i := −1 if z_i < 0,    i = 1, . . . , n,

which finally yields the cut δ(S) associated to the set of vertices

    S = {i ∈ {1, . . . , n} : x_i = 1}.

This last step is implemented in the matlab function extractcut A.2.4.
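Putting the four steps together, a minimal end-to-end sketch for a given Laplace matrix L of a graph on n vertices (the parameter values passed to solvesbm are purely illustrative):

% Sketch only: chain the extraction steps of this section, using the
% functions from Appendix A; parameter values are examples.
sbm = solvesbm(n,L,1e-5,0.1,4,10,1000);  % solve the eigenvalue relaxation (E)
[y,Z] = shifty(sbm.y,L);                 % dual optimal solution of (D)
[X,gap] = dual2primal(y,Z);              % primal optimal solution of (P)
X1 = bestrank1approx(X);                 % best rank one approximation of X
x = extractcut(X1);                      % cut vector in {-1,1}^n (row vector)
cutvalue = x*L*x';                       % objective value, compare with sbm.optval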


7 Numerical Experiments

As mentioned in subsection 5.4.4, there are many things that could be done to improve the efficiency of our algorithm implemented in the matlab function solvesbm A.1.5. In particular, the weight u of the augmented model and the parameters governing the update of the bundle (in our case, only the upper bound rmax for the bundle size) should be dynamically updated at each iteration. As a consequence, it might seem pointless or unfounded at first sight to perform numerical experiments on the running time of solvesbm with these two parameters being fixed throughout the algorithm. However, although our numerical results are not representative of those that can be obtained with an optimal implementation of the spectral bundle method, we can still make some interesting observations.

We begin with an empirical study of the running time of solvesbm. In our numerical experiments, we choose ε = 10^−5 as parameter for the stopping criterion (5.4.15). Following Helmberg [4], we also set κ = 0.1 as parameter for the descent test (5.4.10) and y0 = (0, . . . , 0)^T as the starting point. Since the best choice of the parameters u and rmax depends significantly on the structure of a graph (i.e., its dimension and density), we perform the numerical experiments on several graphs of the same size in order to compute an average running time.

For each fixed dimension ni = 10i, i = 1, . . . , 10, we generate a set of 10 random complete graphs on ni vertices whose edge weights are uniformly distributed in [0, 1]. We then solve each of the 10 corresponding eigenvalue relaxations (E) of the max-cut problem with solvesbm and compute the average CPU-time in seconds.
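For completeness, a sketch of how one such test instance can be generated (our own illustration of the setup just described; the call to solvesbm uses example parameter values):

% Sketch only: random complete graph on n vertices with uniform edge
% weights in [0,1], its Laplace matrix, and one solver run.
n = 50;
W = rand(n);
W = triu(W,1); W = W + W';               % symmetric weight matrix, zero diagonal
L = diag(W*ones(n,1)) - W;               % Laplace matrix L = Diag(We) - W
sbm = solvesbm(n,L,1e-5,0.1,4,10,1000);  % solve the eigenvalue relaxation (E)
fprintf('CPU time: %.2f s\n',sbm.t);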

Figure 7.1: Average CPU-time [s] with respect to the dimension n of the graph for the six combinations of parameters u ∈ {1, 4, 8} and rmax ∈ {10, 20}.


Figure 7.1 shows how the average CPU-time increases with respect to the dimension of the graphs for different combinations of the parameters u and rmax, ranging respectively over {1, 4, 8} and {10, 20}. It appears that the average CPU-time is quite sensitive to these two parameters and that it is unclear whether there is a pair (u, rmax) ensuring fast convergence for every dimension of the graph.

For the following numerical experiments we set u = 4 and rmax = 10. In Figure 7.2 we compare the average CPU-time of solvesbm with the average CPU-time needed by SeDuMi for solving the primal and dual semidefinite relaxations of the max-cut problem (i.e., (P) and (D), respectively) for the same sets of random graphs. The scale on the vertical axis is logarithmic; it appears that in all cases the average running time grows exponentially with respect to the dimension of the graphs. Although SeDuMi applied to problem (P) is faster than solvesbm, the gap between their average CPU-times appears to stay roughly constant on the logarithmic scale, i.e. the two methods differ by roughly a constant factor. On the contrary, solvesbm clearly outperforms SeDuMi when the latter is applied to problem (D).

Figure 7.2: Comparison of average CPU-times [s] (logarithmic scale): solvesbm applied to (E) versus SeDuMi applied to (P) and to (D).


The spectral bundle method gives a rather good approximation of an optimal solution fairly quickly, but it has a tailing-off effect. In other words, the progress made in the last iterations, before the stopping criterion terminates the algorithm, is very small. Figure 7.3 shows the error of solvesbm (with respect to the optimal value computed by SeDuMi) at each iteration for a graph on n = 50 vertices. The scale on the vertical axis is logarithmic, which shows that the convergence to the optimal value is much faster in the first iterations than in the last ones.

Figure 7.3: Error in the objective value at each iteration (logarithmic scale) for a graph on 50 vertices.


We now investigate the influence and sensitivity of the parameters u, κ and rmax. As the choice of the parameters u and rmax depends on the structure of a graph, in the following we perform numerical experiments on a single (randomly generated) graph and no longer consider average running times.

Figure 7.4 shows how the CPU-time varies as a function of the parameter u. In the top part of the figure, the dimension is fixed to n = 50 and the upper bound rmax for the bundle size ranges over the set {10, 15, 20}. In the bottom part of the figure, the bundle parameter is fixed to rmax = 10 and the dimension n ranges over the set {20, 40, 50}.

We can see that small values of u should be avoided, as they significantly increase the running time of solvesbm. This comes from the fact that too small values of u allow overly large steps (i.e., the distance from the stability center is only weakly penalized in the augmented model), which do not provide satisfactory progress in the objective function and therefore lead to a series of null steps.

Figure 7.4: CPU-time [s] with respect to parameter u; top: n = 50 with rmax ∈ {10, 15, 20}, bottom: rmax = 10 with n ∈ {20, 40, 50}.


Figure 7.5 shows how the parameter κ influences the number of iterations (more precisely, the number of descent steps) and the running time of solvesbm. In the top part of the figure, both the number of iterations and the CPU-time appear as functions of κ. The parameters u and rmax are fixed to u = 4 and rmax = 10, while the dimension n runs through the set {20, 50}. It emphasizes how the CPU-time is naturally related to the number of iterations.

In the bottom part of the figure, the percentage of descent steps is shown as a function of κ. Again, the parameters u and rmax are fixed to u = 4 and rmax = 10 and the dimension n ranges over the set {20, 50}. We can see that large values of κ tighten the descent test (5.4.10) and therefore allow fewer descent steps.

Figure 7.5: Influence of parameter κ; top: number of iterations and CPU-time [s], bottom: percentage of descent steps, for n ∈ {20, 50} with u = 4 and rmax = 10.


A Matlab codes

A.1 The Spectral Bundle Method

A.1.1 Lanczos with Reorthogonalization

function [H,U] = lanczos_orth(A,ns,x)
% Perform ns steps of the Lanczos process with storing the full
% basis U and with reorthogonalization on Hermitian A with starting
% vector x. On return, H is an (ns+1,ns) upper tridiagonal matrix.

% This code will malfunction if A is not square, if x is of different
% dimension from A, if x = 0, or if ns is greater than the dimension
% of A.

x = x / norm(x);

H = zeros(ns+1,ns+1);
n = size(A,1);
% store all vectors
U = x;
beta = 0;
ns = min(n,ns);
for j = 1:ns
    z = A*U(:,end);
    alpha = U(:,end)'*z;
    u = z - U(:,end)*alpha;
    if j > 1
        u = u - U(:,end-1)*beta;
    end
    % Reorthogonalization
    alphas = U'*u;
    u = u - U*alphas;
    alpha = alpha + alphas(end);

    beta = norm(u);
    u = u / beta;
    U = [U,u];
    H(j,j) = alpha;
    H(j+1,j) = beta;
    H(j,j+1) = beta;
end

H = H(:,1:end-1);
end


A.1.2 The Oracle

function [fy,V] = oracle(n,y,L,nL)
%ORACLE: Returns the value fy of the objective function and a
%matrix V containing (at most nL) Lanczos vectors of the matrix
%L - Diag(y).
if n <= 100
    % Compute max. eigenvalue of L-Diag(y) using the Matlab eig function
    [V,D] = eig(L-diag(y));
    [eigval,I] = sort(diag(D),'descend');

    V = V(:,I);
    V = V(:,1:min(n,nL));
    fy = n*eigval(1) + sum(y);
else
    % Compute max. eigenvalue of L-Diag(y) using the Lanczos method
    x = rand(n,1);
    [H,U] = lanczos_orth(L-diag(y),nL,x);
    [V,D] = eig(H(1:end-1,:));
    [eigval,I] = sort(diag(D),'descend');

    V = U(:,1:end-1)*V(:,I);
    fy = n*eigval(1) + sum(y);
end
end

A.1.3 The Trial Point Finding

function [W_new,y_new,V,alpha] = trial_point(n,y,L,P,Wbar,u,r)
%TRIAL_POINT: solves the semidefinite convex quadratic subproblem using
%SeDuMi (via CVX).
e = ones(n,1);

% Solve (Q) with SeDuMi
cvx_begin sdp quiet
    variable V(r,r) symmetric
    minimize( pow_pos(norm(e-diag(P*V*P' + ...
        (n - trace(V))*Wbar)),2)/(2*u) - ...
        sum(sum((P*V*P' + (n - trace(V))*Wbar).* ...
        (L-diag(y)))) - sum(y) )
    trace(V) <= n
    V >= 0
cvx_end

alpha = n - trace(V);
W_new = P*V*P' + alpha*Wbar;
y_new = y + (diag(W_new)-e)/u;
end


A.1.4 The Model Update

function [P,Wbar,r] = model_updating(P,Wbar,r,V,alpha,V_new,rmax)
%MODEL_UPDATING: updates the bundle matrix and the aggregate matrix
[Q,D] = eig(V);
[d,I] = sort(diag(D),'descend');
Q = Q(:,I);
D = diag(d);

% aggregation tolerance
ta = 0.01;
% maximum number of Lanczos vectors to add in the bundle
nA = 7;
% minimum number of columns to maintain in the bundle
rmin = min(r,max(0,rmax-nA));
% number of columns to maintain in the bundle
N = rmin;
for i=(rmin+1):r
    if (d(i) >= ta*d(1)) && (i >= rmin)
        N = i;
    end
end
% number of Lanczos vectors to add in the bundle
radd = min(r,rmax-N);

Q1 = Q(:,1:N);
D1 = D(1:N,1:N);
Q2 = Q(:,(N+1):end);
D2 = D((N+1):end,(N+1):end);
% new aggregate matrix
Wbar = (P*Q2*D2*Q2'*P' + alpha*Wbar)/(trace(D2)+alpha);
% new bundle matrix
P = orth([P*Q1,V_new(:,1:radd)]);
% new bundle size
r = size(P,2);
end

A.1.5 The Algorithm

function [sbm] = solvesbm(n,L,epsilon,kappa,u,rmax,maxit)
%SOLVESBM: Solves the eigenvalue optimization relaxation of the Max-Cut
%problem for the graph corresponding to the Laplace matrix L using the
%Spectral Bundle Method

% Step 0 (Initialization)
t = cputime;
nL = 100;                       % number of steps for Lanczos
y = zeros(n,1);                 % starting point for the iterates

[fy0,V0] = oracle(n,y,L,nL);

Wbar = eye(n)/n;                % initial aggregate matrix
r = min(rmax,size(V0,2));       % initial bundle size
P = V0(:,1:r);                  % initial bundle matrix

fy = [fy0];                     % store the objective function values

dsteps = 0;                     % counter for descent steps
k = 1;                          % counter for iterations

% Define the linear minorant f_W(y)
f = @(W,y)(sum(sum((L-diag(y)).*W))+sum(y));

while k <= maxit
    % Step 1 (Trial point finding)
    [W_new,y_new,V,alpha] = trial_point(n,y,L,P,Wbar,u,r);

    % Step 2 (Stopping criterion)
    if (fy(k)-f(W_new,y_new)) <= epsilon*(abs(fy(k))+1)
        break;
    end

    % Step 3 (Evaluation)
    [fy_new,V_new] = oracle(n,y_new,L,nL);

    % Step 4 (Descent test)
    if (fy(k)-fy_new) >= kappa*(fy(k)-f(W_new,y_new))
        % Descent step
        dsteps = dsteps + 1;
        y = y_new;              % Update the center of stability
        fy(k+1) = fy_new;       % Store the new objective value
    else
        % Null step
        fy(k+1) = fy(k);        % Store the new objective value
    end

    % Step 5 (Model updating)
    [P,Wbar,r] = model_updating(P,Wbar,r,V,alpha,V_new,rmax);

    % Next step
    k = k + 1;
end
dsteps = dsteps/(k-1);

sbm.t = cputime-t;              % cpu time in seconds
sbm.y = y;                      % optimal solution y
sbm.fy = fy;                    % objective function values
sbm.optval = fy(end);           % optimal value
sbm.dsteps = dsteps;            % fraction of descent steps
if (k-1) == maxit
    sbm.iter = 0;               % SBM has reached the max. number of iterations
else
    sbm.iter = (k-1);
end
end


A.2 Resolution of the Max-Cut Problem

A.2.1 Extraction of a Dual Optimal Solution

function [y,Z] = shifty(sbm_y,L)
%SHIFTY: Extraction of a dual optimal solution
y = sbm_y + max(eig(L-diag(sbm_y)));
Z = diag(y)-L;
end

A.2.2 Extraction of a Primal Optimal Solution

function [X,gap] = dual2primal(y,Z)
%DUAL2PRIMAL: Extraction of a primal optimal solution
tol = 1e-6;
n = length(y);
% Put eigenvalues of Z to zero if they are smaller than tol
[V,D] = eig(Z);
N = [];
for i=1:n
    if D(i,i) < tol
        D(i,i) = 0;
        N = [N V(:,i)];
    end
end
Z = V*D/V;
k = size(N,2);
% Construction of the system of constraints
b = [ones(n,1);zeros(n*(n-1)/2,1)];
A = sparse(n,n*k);
for j=1:n
    % Enforce constraint X_jj = 1
    A(j,(k*(j-1)+1):j*k) = N(j,:);
    for i=(j+1):n
        % Enforce constraint X_ij = X_ji
        C = sparse(1,[(k*(j-1)+1):j*k (k*(i-1)+1):i*k],[N(i,:) ...
            -N(j,:)],1,n*k);
        A = [A;C];
    end
end
vecO = A\b;
O = reshape(vecO,k,n);
X = N*O;
gap = sum(sum(Z.*X));
end


A.2.3 Best Rank One Approximation

function [X1] = bestrank1approx(X)
%BESTRANK1APPROX: Best rank one approximation of X
n = size(X,2);
% Compute SVD of X
[U,S,V] = svd(X);
% Keep only the information from the largest singular value
X1 = U(:,1)*S(1,1)*V(:,1)';
end

A.2.4 Extraction of an Optimal Cut

function [x] = extractcut(X)
%EXTRACTCUT: Extraction of the optimal cut
n = size(X,2);
% Compute SVD of X
[U,S,V] = svd(X);
% Extract an incidence vector from the first column of U
for i=1:n
    if U(i,1) >= 0
        x(i) = 1;
    else
        x(i) = -1;
    end
end
end


References

[1] F. Barahona, M. Grötschel, M. Jünger, and G. Reinelt. An application of combinatorial optimization to statistical physics and circuit layout design. Operations Research, 36(3):493–513, 1988.

[2] F. Hadlock. Finding a maximum cut of a planar graph in polynomial time. SIAM Journal on Computing, 4(3):221–225, 1975.

[3] C. Helmberg. Semidefinite programming for combinatorial optimization. Konrad-Zuse-Zentrum für Informationstechnik Berlin, 2000.

[4] C. Helmberg. A C++ implementation of the spectral bundle method. Manual, version 1.1, 2004.

[5] C. Helmberg and K.C. Kiwiel. A spectral bundle method with bounds. Mathematical Programming, 93(2):173–194, 2002.

[6] C. Helmberg and F. Rendl. A spectral bundle method for semidefinite programming. SIAM Journal on Optimization, 10(3):673–696, 2000.

[7] R.A. Horn and C.R. Johnson. Matrix analysis. Cambridge University Press, 1990.

[8] R.Y. Pinter. Optimal layer assignment for interconnect. Journal of VLSI and Computer Systems, 1(2):123–137, 1984.

[9] R.K. Ahuja, T.L. Magnanti, and J.B. Orlin. Network flows: Theory, algorithms, and applications. 1993.

[10] Y. Saad. Numerical methods for large eigenvalue problems, volume 158. SIAM, 1992.

[11] J.F. Sturm. Using SeDuMi 1.02, a Matlab toolbox for optimization over symmetric cones. Optimization Methods and Software, 11(1-4):625–653, 1999.

[12] Y. Ye, M.J. Todd, and S. Mizuno. An O(√(nL))-iteration homogeneous and self-dual linear programming algorithm. Mathematics of Operations Research, pages 53–67, 1994.
