McMaster University

Advanced Optimization Laboratory

Title:

The Complexity of Self-Regular Proximity Based Infeasible IPMs

Authors:

Maziar Salahi, Tamás Terlaky, and Guoqing Zhang

AdvOl-Report No. 2003/3

May 2003, Hamilton, Ontario, Canada

The Complexity of Self-Regular Proximity Based Infeasible IPMs

Maziar Salahi∗†, Tamás Terlaky†, Guoqing Zhang‡

Revised November 23, 2004

∗Department of Mathematical Sciences, Sharif University of Technology, P.O. Box 11365-9415, Tehran, Iran, email: [email protected].
†Advanced Optimization Lab, Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada, L8S 4L7, email: [email protected], [email protected].
‡Department of Industrial and Manufacturing Systems Engineering, University of Windsor, Windsor, Ontario, Canada N9B 3P4, email: [email protected].

Abstract

Primal-Dual Interior-Point Methods (IPMs) have shown their power in solving large classes of optimization problems. In this paper a self-regular proximity based Infeasible Interior-Point Method (IIPM) is proposed for linear optimization problems. First we mention some interesting properties of a specific self-regular proximity function, studied recently by Peng and Terlaky, and use it to define infeasible neighborhoods. These simple but interesting properties of the proximity function indicate that, when the current iterate is in a large neighborhood of the central path, large-update IIPMs emerge as the only natural choice. Then, we apply these results to design a specific self-regularity based dynamic large-update IIPM in a large neighborhood. The new dynamic IIPM always takes large updates and does not utilize any inner iterations to get centered. An O(n² log(n/ε)) worst-case iteration bound of the algorithm is established. Finally, we report the main results of our computational experiments.

Keywords: Linear Optimization, Infeasible Interior Point Method, Self-Regular Proximity Function, Polynomial Complexity.

1 Introduction

Interior Point Methods (IPMs), initiated by Karmarkar [3], not only have polynomial complexity, but are also highly efficient in practice for solving Linear Optimization (LO) problems. A new paradigm of IPMs based on Self-Regular (SR) proximity functions was presented by Peng, Roos and Terlaky [14]. They proved that SR-proximity based feasible IPMs for LO enjoy the best worst-case theoretical complexity of large-update IPMs. In this paper, we aim to develop an SR-proximity based Infeasible IPM (IIPM) for LO and establish its polynomial complexity.

Feasible IPMs start with a strictly feasible interior point and maintain feasibility during the solution process. It is not trivial to find an initial feasible interior point. One method to overcome this problem is to use the homogeneous embedding model by introducing artificial variables. Such a homogeneous self-dual formulation was first presented by Ye et al. [28]. The other option is to develop IIPMs. Infeasible IPMs are widely adopted in many efficient software packages. IIPMs start with an arbitrary positive point, and feasibility is reached as optimality is approached. The choice of the starting point in IIPMs is crucial for good performance. Lustig [7] and Tanabe [19] were the first to present IIPMs. Kojima et al. [5] proved the global convergence of an IIPM. Zhang [29] proved an O(n²L)-iteration bound for IIPMs under certain conditions. Mizuno [9] introduced a primal-dual IIPM and proved global convergence of the algorithm.

In this paper we consider primal-dual LO problems in the standard form:

$$(P)\qquad \min\{\,c^Tx : Ax = b,\ x \ge 0\,\},$$

and the dual problem is given by

$$(D)\qquad \max\{\,b^Ty : A^Ty + s = c,\ s \ge 0\,\},$$

where A ∈ R^{m×n}, b ∈ R^m, y ∈ R^m, c ∈ R^n, x, s ∈ R^n, and without loss of generality [16] we may assume that rank(A) = m. The vectors y, x, and s are the vectors of variables.

In this paper we adopt a new SR proximity based search direction to develop new SR proximitybased IIPMs.

Our paper is organized as follows. In Section 2 we first introduce self-regular functions, and then we discuss the Newton system of infeasible primal-dual methods based on a self-regular proximity function. In Section 3 we discuss some properties of our SR proximity measure. In Section 4 we specify our new algorithm. In Section 5 we establish some technical results needed for the complexity analysis. Then, in Section 6 the polynomial complexity of our new algorithm is established. Some of our computational results are presented in Section 7. Conclusions are given in Section 8. To keep the paper easily readable, we move most detailed proofs of the technical results to the Appendix.

2 Self-Regular Infeasible IPMs

2.1 Self-Regular Functions

In this section we recall the definition of self-regular functions [12, 14] and some of their properties. The family of univariate self-regular functions is defined as follows.

Definition 2.1 A function ψ(t) ∈ C²: (0, ∞) → R is self-regular if it satisfies the following conditions:

SR.1 ψ(t) is strictly convex with respect to t > 0 and vanishes at its global minimal point t = 1, i.e., ψ(1) = ψ′(1) = 0. Further, there exist positive constants ν₂ ≥ ν₁ > 0 and p ≥ 1, q ≥ 1 such that

$$\nu_1\bigl(t^{p-1} + t^{-1-q}\bigr) \le \psi''(t) \le \nu_2\bigl(t^{p-1} + t^{-1-q}\bigr), \qquad \forall\, t \in (0,\infty); \tag{1}$$

SR.2 For any t₁, t₂ > 0,

$$\psi\bigl(t_1^{\,r}t_2^{\,1-r}\bigr) \le r\,\psi(t_1) + (1-r)\,\psi(t_2), \qquad \forall\, r \in [0,1]. \tag{2}$$


If ψ(t) is self-regular (SR), the parameter q is called the barrier degree and p the growth degree of the function ψ(t).

There are two popular families of SR functions. The first family is given by

$$\Upsilon_{p,q}(t) = \frac{t^{p+1}-1}{p(p+1)} + \frac{t^{1-q}-1}{q(q-1)} + \frac{p-q}{pq}\,(t-1), \qquad p, q \ge 1, \tag{3}$$

with ν₁ = ν₂ = 1. The second family is defined as

$$\Gamma_{p,q}(t) = \frac{t^{p+1}-1}{p+1} + \frac{t^{1-q}-1}{q-1}, \qquad p \ge 1,\ q > 1, \tag{4}$$

with ν₁ = 1 and ν₂ = q.

Let v ∈ R^n_{++}. Then an SR-proximity measure Ψ: R^n_{++} → R₊ measures the discrepancy between v and e = (1, 1, …, 1)^T, and is defined as

$$\Psi(v) = \sum_{i=1}^{n}\psi(v_i),$$

where ψ(t) is a univariate SR function, called the kernel function of the SR-proximity.

A new paradigm of IPMs was introduced by Peng et al. in [12, 14], where SR-proximity measures are used to define search directions and to control the iterative process. In the rest of this section a general scheme of SR-proximity based IIPMs is developed.

To solve the primal-dual problems (P) and (D), we need to solve the following optimality conditions:

$$\begin{aligned} Ax &= b, & x \ge 0,\\ A^Ty + s &= c, & s \ge 0,\\ xs &= 0, \end{aligned}$$

where xs denotes the coordinatewise (Hadamard) product of the two vectors. In the optimality conditions the first two constraints represent primal and dual feasibility, while the last one is the so-called complementarity condition. Instead of solving the problem directly, IPMs relax the complementarity condition by using a centrality parameter µ. The central path is defined as the set of unique solutions {x(µ) | µ > 0} and {(y(µ), s(µ)) | µ > 0} of the system

$$\begin{aligned} Ax &= b, & x > 0,\\ A^Ty + s &= c, & s > 0, \qquad (5)\\ xs &= \mu e, \end{aligned}$$

where µ > 0 is the centrality parameter. System (5) is solved approximately by applying Newton's method. As µ goes to zero, (x(µ), y(µ), s(µ)) converges to an optimal solution. Given any x > 0, y, and s > 0, the Newton direction for (5) in an IIPM is determined by the following linear system of equations:

$$\begin{pmatrix} A & 0 & 0\\ 0 & A^T & I\\ S & 0 & X \end{pmatrix} \begin{pmatrix} \Delta x\\ \Delta y\\ \Delta s \end{pmatrix} = \begin{pmatrix} -r_b\\ -r_c\\ \mu e - xs \end{pmatrix}, \tag{6}$$

where X and S denote the diagonal matrices diag(x) and diag(s), and r_b and r_c are the residuals defined by

$$r_b = Ax - b, \qquad r_c = A^Ty + s - c.$$

In SR-IIPMs the Newton system (6) is modified. Let

$$v := \sqrt{\frac{xs}{\mu}} \qquad\text{and}\qquad v^{-1} := \sqrt{\frac{\mu}{xs}},$$

whose i-th components are $\sqrt{x_is_i/\mu}$ and $\sqrt{\mu/(x_is_i)}$, respectively. Then the Newton system for SR-IIPMs is given as

$$\begin{pmatrix} A & 0 & 0\\ 0 & A^T & I\\ S & 0 & X \end{pmatrix} \begin{pmatrix} \Delta x\\ \Delta y\\ \Delta s \end{pmatrix} = \begin{pmatrix} -r_b\\ -r_c\\ -\mu v\nabla\Psi(v) \end{pmatrix}, \tag{7}$$

where $v\nabla\Psi(v) = (v_1\psi'(v_1), \dots, v_n\psi'(v_n))^T$. Observe that the right-hand side of the last set of equations is exactly the same as in feasible SR-IPMs [12, 14]. We solve the Newton system as follows. From the last equation of system (7), we derive

$$\Delta s = x^{-1}\bigl(-\mu v\nabla\Psi(v) - s\Delta x\bigr).$$

Then we have the so-called augmented system:

$$\begin{pmatrix} 0 & A\\ A^T & -D^{-2} \end{pmatrix} \begin{pmatrix} \Delta y\\ \Delta x \end{pmatrix} = \begin{pmatrix} -r_b\\ r_h \end{pmatrix}, \tag{8}$$

where

$$D^2 = S^{-1}X, \qquad r_h = -r_c + x^{-1}\mu v\nabla\Psi(v).$$

The normal equation is derived by making a block pivot on −D⁻²:

$$AD^2A^T\,\Delta y = AD^2r_h - r_b.$$

We obtain ∆y by solving the normal equation, we get ∆x by the formula

$$\Delta x = D^2\bigl(A^T\Delta y - r_h\bigr), \tag{9}$$

and finally ∆s can be computed from the last equation of (7). Having the Newton direction, we make a damped step with a certain step length α to get the new iterate:

$$x(\alpha) := x + \alpha\Delta x, \qquad y(\alpha) := y + \alpha\Delta y, \qquad s(\alpha) := s + \alpha\Delta s.$$
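The back-substitution chain (7)→(8)→(9) can be written in a few lines of linear algebra. The sketch below (ours, in NumPy; the authors' implementation is in C with WSMP, and a dense solve stands in here for WSMP's sparse factorization) computes the SR search direction for a general kernel passed as ψ′; the example kernel shown at the end is the one used from Section 3 on.

```python
import numpy as np

def sr_direction(A, b, c, x, y, s, mu, psi_prime):
    """Solve system (7) via the augmented system (8) and the normal equations."""
    rb = A @ x - b                          # r_b = Ax - b
    rc = A.T @ y + s - c                    # r_c = A^T y + s - c
    v = np.sqrt(x * s / mu)
    r3 = -mu * v * psi_prime(v)             # RHS of the third block of (7)
    rh = -rc - r3 / x                       # r_h = -r_c + x^{-1} mu v grad Psi(v)
    d2 = x / s                              # D^2 = S^{-1} X
    M = A @ (d2[:, None] * A.T)             # A D^2 A^T  (WSMP factors this sparsely)
    dy = np.linalg.solve(M, A @ (d2 * rh) - rb)
    dx = d2 * (A.T @ dy - rh)               # formula (9)
    ds = (r3 - s * dx) / x                  # back-substitute the third block
    return dx, dy, ds

# example kernel: psi(t) = t^2/2 - 1 + t^(-2)/2, so psi'(t) = t - t^(-3)
psi_prime = lambda t: t - t**-3
```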

3 Properties of the Proximity Function

In this section we investigate, with respect to the argument µ, the properties of a specific SR-proximity function¹ that will be useful in our algorithm design and complexity analysis:

$$\Phi(x, s, \mu) = \Psi(v) = \frac{1}{2}\bigl\|v - v^{-1}\bigr\|^2.$$

Observe that Ψ(v) is an SR-proximity function with the kernel function ψ(t) = t²/2 − 1 + t⁻²/2. We are particularly interested in the case when the current iterate (x, s) is far away from the central path. For notational convenience, µ_g := x^Ts/n denotes the parameter value associated with the current duality gap, and µ_h := n/(x^{−T}s^{−1}) denotes the harmonic mean of (x₁s₁, …, x_ns_n). Note that when the point (x, s) is fixed, we can cast the proximity function Φ(x, s, µ) as a function of µ, i.e.,

$$\Phi(x, s, \mu) := \frac{x^Ts}{2\mu} - n + \frac{\mu}{2}\,x^{-T}s^{-1}.$$

If µ = µ_g, from the choice of the kernel function we know that the value of the proximity function is determined by the product

$$x^Ts\,x^{-T}s^{-1} = \frac{n^2\mu_g}{\mu_h}.$$

By the arithmetic mean-harmonic mean inequality we know that µ_h ≤ µ_g, and equality holds if and only if (x, s) is on the central path. Next we consider the behavior of the function Φ(x, s, µ) with respect to µ. Because both x and s are positive vectors, the function Φ(x, s, µ) is convex with respect to µ. Using the optimality conditions for convex optimization problems, we can easily prove the following results, which will be used in the proofs of Theorems 6.2 and 6.3.

¹Throughout the paper ‖·‖ denotes the 2-norm of vectors.

Proposition 3.1 For any fixed iterate (x, s) > 0, the proximity function Φ(x, s, µ), as a function of µ, has its global minimizer at the geometric mean of µ_g and µ_h:

$$\mu^* = \sqrt{\frac{x^Ts}{x^{-T}s^{-1}}} = \sqrt{\mu_g\mu_h}.$$

It is easy to verify the following interesting relation, which plays a crucial role later in the design of our algorithmic scheme.

Proposition 3.2 Suppose that the iterate (x, s) > 0 is fixed. Then we have

$$\Phi(x, s, \mu_g) = \Phi(x, s, \mu_h) = \frac{n}{2}\left(\frac{\mu_g}{\mu_h} - 1\right).$$

In fact, with simple calculus, we can obtain the following relation between the values of the proximity function at µ* and µ_g:

$$\Phi(x, s, \mu_g) = \Phi(x, s, \mu^*) + \frac{\Phi(x, s, \mu^*)^2}{2n}. \tag{10}$$
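The identities in Propositions 3.1-3.2 and relation (10) are easy to check numerically; the following snippet (ours, for illustration) does so for a random positive pair (x, s).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
x, s = rng.uniform(0.1, 5.0, n), rng.uniform(0.1, 5.0, n)

def phi(x, s, mu):            # Phi(x,s,mu) = x^T s/(2 mu) - n + (mu/2) x^{-T} s^{-1}
    return x @ s / (2 * mu) - len(x) + (mu / 2) * np.sum(1.0 / (x * s))

mu_g = x @ s / n                     # duality-gap mean
mu_h = n / np.sum(1.0 / (x * s))     # harmonic mean of the products x_i s_i
mu_star = np.sqrt(mu_g * mu_h)       # Proposition 3.1: global minimizer

assert np.isclose(phi(x, s, mu_g), phi(x, s, mu_h))              # Proposition 3.2
assert np.isclose(phi(x, s, mu_g), 0.5 * n * (mu_g / mu_h - 1))
assert np.isclose(phi(x, s, mu_g),                               # relation (10)
                  phi(x, s, mu_star) + phi(x, s, mu_star)**2 / (2 * n))
```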

Now, let us recall that when the primal-dual pair (x, s) is in a large neighborhood of the central path, µ_g ≫ µ_h may hold. Moreover, we can write

$$\mu_h = \left(1 - \frac{\mu_g - \mu_h}{\mu_g}\right)\mu_g = (1-\theta)\mu_g,$$

where θ > 1/2 when µ_g > 2µ_h. This, considering Φ(x, s, µ_g) = Φ(x, s, µ_h), leads naturally to a large-update method. In practical implementations of IPMs, a large update is used whenever the iterate is in a certain neighborhood of the central path or, equivalently, when the proximity function Φ(x, s, µ_g), or the ratio µ_g/µ_h, is bounded above by a certain number τ ≫ 1. Thus, after one update µ := µ_g/τ, we also need to investigate the growth behavior of the proximity function, which is demonstrated by the following lemma [13]. For ease of understanding the proof is included.

Lemma 3.3 Let τ > 1 be a constant. If

$$\frac{\mu_g}{\mu_h} = \frac{x^Ts\,x^{-T}s^{-1}}{n^2} \le \tau,$$

then

$$\Phi\left(x, s, \frac{\mu_g}{\tau}\right) \le \frac{(\tau-1)n}{2}.$$

Proof: By the assumption of the lemma we can write µ_g = τ̄µ_h for some τ̄ ≤ τ. It follows that

$$\Phi\left(x, s, \frac{\mu_g}{\tau}\right) = \frac{\tau n}{2} - n + \frac{n\mu_g}{2\tau\mu_h} = \frac{(\tau-1)n}{2} - \frac{n}{2} + \frac{n\bar\tau}{2\tau} \le \frac{(\tau-1)n}{2},$$

which concludes the lemma. □

The following corollary gives a useful relation that will be used throughout the paper. It is an obvious consequence of Proposition 3.2.

Corollary 3.4 If τ > 1, then µ_g/µ_h ≤ τ if and only if Φ(x, s, µ_g) ≤ (τ−1)n/2.

In fact, for any τ > 1, one can see that Φ(x, s, µ) = (τ−1)n/2 if and only if the target µ = µ_t solves

$$x^{-T}s^{-1}\mu^2 - (\tau+1)n\mu + x^Ts = 0. \tag{11}$$

This equation has two roots, one larger and one smaller than µ_g. We are interested in the smaller root, which is defined as follows:

$$\mu_t := \frac{2x^Ts}{(\tau+1)n + \sqrt{(\tau+1)^2n^2 - 4x^Ts\,x^{-T}s^{-1}}} = \frac{2\mu_g}{\tau + 1 + \sqrt{(\tau+1)^2 - 4\mu_g/\mu_h}}. \tag{12}$$

Observe that if µ_g/µ_h ≤ τ, then µ_t is well defined. It is also easy to verify that µ_t can be cast as a decreasing function of τ. In particular, µ_t = µ_h if and only if µ_g = τµ_h, and µ_h > µ_t whenever µ_g < τµ_h. One can also easily prove the following lemma, which will later be used in the proofs of technical lemmas (for example, Lemma 5.5).

Lemma 3.5 If µ_g/µ_h ≤ τ, then

$$\frac{1}{\tau+1} \le \frac{\mu_t}{\mu_g} \le \frac{1}{\tau}.$$

Now we proceed to discuss the properties of the search direction based on our specific self-regular proximity function for different updates of µ. Note that, due to the specific choice of the kernel function ψ(t), we can rewrite the Newton system (7) in the original space as

$$\begin{aligned} A\Delta x &= -r_b,\\ A^T\Delta y + \Delta s &= -r_c, \qquad (13)\\ s\Delta x + x\Delta s &= \mu^2x^{-1}s^{-1} - xs. \end{aligned}$$

Let us denote by (∆x(µ), ∆y(µ), ∆s(µ)) the solution of system (13). The following lemma describes the change of the duality gap along the search direction (∆x(µ), ∆s(µ)) for µ = µ_h.

Lemma 3.6 Let (∆x(µ_h), ∆s(µ_h)) be the solution of system (13) with µ = µ_h. Then we have

$$x^T\Delta s(\mu_h) + s^T\Delta x(\mu_h) = \frac{n^2}{x^{-T}s^{-1}} - x^Ts = n(\mu_h - \mu_g).$$

Recall that in traditional IPMs, based on the standard Newton direction, we need to solve equation system (6) at each iteration. In this case, if we set the target to µ₊ = µ_h, then the solution of system (6) satisfies

$$x^T\Delta s + s^T\Delta x = n\mu_h - x^Ts = n(\mu_h - \mu_g) < 0.$$

This implies that if the targeted parameter is µ_h, then the search direction based on our specific self-regular proximity function and the standard Newton direction predict the change of the duality gap in the same way, i.e.,

$$(x + \alpha\Delta x(\mu_h))^T(s + \alpha\Delta s(\mu_h)) = (x + \alpha\Delta x)^T(s + \alpha\Delta s) = x^Ts\left(1 - \alpha + \frac{\mu_h\alpha}{\mu_g} + \alpha^2\,\frac{\Delta x^T\Delta s}{x^Ts}\right).$$
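Lemma 3.6 depends only on the third equation of (13), so it can be verified without forming A: summing the components of s∆x + x∆s = µ_h²x⁻¹s⁻¹ − xs gives x^T∆s + s^T∆x directly. A quick check (ours):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
x, s = rng.uniform(0.2, 3.0, n), rng.uniform(0.2, 3.0, n)
mu_g = x @ s / n
mu_h = n / np.sum(1.0 / (x * s))

# sum over components of the third equation of (13) with mu = mu_h
lhs = np.sum(mu_h**2 / (x * s) - x * s)       # = x^T ds + s^T dx
assert np.isclose(lhs, n * (mu_h - mu_g))     # Lemma 3.6
```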

4 A Dynamic Large-Update Infeasible IPM

In this section we introduce our new algorithm. Our algorithm works with an infeasible central path neighborhood N(τ, β). Although the neighborhood can be defined for any SR-proximity function, in this paper we choose the particular case Ψ(v) = ½‖v − v⁻¹‖². Motivated by the observation in Lemma 3.3, the SR-proximity based neighborhood is defined by

$$\mathcal{N}(\tau, \beta) = \left\{(x, y, s) \;:\; \Phi(x, s, \mu_g) \le \frac{(\tau-1)n}{2},\ \ \|r_b\| \le \bigl\|r_b^0\bigr\|\,\frac{\mu_g}{\mu^0}\,\beta,\ \ \|r_c\| \le \bigl\|r_c^0\bigr\|\,\frac{\mu_g}{\mu^0}\,\beta\right\}, \tag{14}$$

where µ_g = x^Ts/n, µ⁰ = µ⁰_g, (x⁰, y⁰, s⁰) is an arbitrary triple with x⁰, s⁰ strictly positive, and β ≥ 1 is chosen so that the initial point (x⁰, y⁰, s⁰) belongs to the neighborhood N(τ, β). If (x, y, s) ∈ N(τ, β), then the infeasibility is bounded by a multiple of µ_g and the initial infeasibility. Further, the target value µ_t given by (12) is well defined in the neighborhood N(τ, β).

We utilize a parameter τ ≥ 10 to control the update of the duality gap parameter µ and to force the value of the proximity function to satisfy the relation

$$\Phi(x, s, \mu_g) = \Phi(x, s, \mu_h) \le \frac{(\tau-1)n}{2}. \tag{15}$$

We also stipulate that when the proximity function Φ(x, s, µ_g) has a relatively small value, for instance, if Φ(x, s, µ_g) = Φ(x, s, µ_h) ≤ (τ−2)n/4, then we choose µ_t defined by (12) as the targeted centering parameter in system (13) for the search direction. Otherwise, we choose µ_h as the targeted parameter. Moreover, the step size is carefully chosen so that all the iterates remain in the neighborhood N(τ, β). Note that in both cases we have µ_g ≥ (τ/2)µ⁺_t, where µ⁺_t is the targeted parameter. Hence, our algorithm is indeed a large-update one if τ ≥ 10. For simplicity we use the notation x(α) := x + α∆x, y(α) := y + α∆y, and s(α) := s + α∆s. Correspondingly we also define

$$\mu_g(\alpha) = \frac{x(\alpha)^Ts(\alpha)}{n}, \qquad \mu_h(\alpha) = \frac{n}{x(\alpha)^{-T}s(\alpha)^{-1}}, \qquad \mu^*(\alpha) = \sqrt{\mu_g(\alpha)\mu_h(\alpha)}.$$

At each step, we stipulate that the step size is chosen such that the proximity function Φ(x(α), s(α), µ⁺_t) decreases sufficiently, so that after the step the proximity function still satisfies (15) at the new iterate. The algorithm can be outlined as follows.

Algorithm SR-IIPM

Input: proximity parameters τ ≥ 10 and β ≥ 1; the neighborhood N(τ, β); an accuracy parameter ε > 0; (x⁰, y⁰, s⁰) ∈ N(τ, β); k = 0.

begin
  while max{(xᵏ)ᵀsᵏ, ‖rᵏ_b‖, ‖rᵏ_c‖} ≥ ε doᵃ
    if µᵏ_g/µᵏ_h ≥ τ/2 then µ := µᵏ_h; otherwise µ := µᵏ_t given by (12).
    Solve system (13) for (∆xᵏ, ∆yᵏ, ∆sᵏ).
    Determine a step sizeᵇ αᵏ such that
      Φ(x(αᵏ), s(αᵏ), µᵏ_t) ≤ Φ(xᵏ, sᵏ, µᵏ_t) − (αᵏ/2)Φ(xᵏ, sᵏ, µᵏ_t)
      and (x(αᵏ), y(αᵏ), s(αᵏ)) ∈ N(τ, β);
    xᵏ⁺¹ := x(αᵏ); yᵏ⁺¹ := y(αᵏ); sᵏ⁺¹ := s(αᵏ); k := k + 1.
  end
end

ᵃIn the algorithm rᵏ_b, rᵏ_c, µᵏ_g, µᵏ_h, etc. denote the primal and dual residuals and the actual µ_g and µ_h values at iteration k, respectively.
ᵇThe step size α* is estimated in Corollary 6.4.
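For illustration, here is a hypothetical end-to-end NumPy sketch of Algorithm SR-IIPM (the names sr_iipm and phi are ours; the reported solver is written in C with WSMP/OSL, and the exact step size α* of Corollary 6.4 is replaced here by simple backtracking against the two conditions of the algorithm):

```python
import numpy as np

def phi(x, s, mu):                                   # proximity Phi(x, s, mu)
    return x @ s / (2 * mu) - len(x) + (mu / 2) * np.sum(1.0 / (x * s))

def sr_iipm(A, b, c, zeta=10.0, tau=10.0, beta=1.0, eps=1e-8, max_iter=200):
    m, n = A.shape
    x, y, s = np.full(n, zeta), np.zeros(m), np.full(n, zeta)   # point (18)
    nrb0 = np.linalg.norm(A @ x - b)
    nrc0 = np.linalg.norm(A.T @ y + s - c)
    mu0 = zeta**2                                    # relation (25)
    for _ in range(max_iter):
        rb, rc = A @ x - b, A.T @ y + s - c
        if max(x @ s, np.linalg.norm(rb), np.linalg.norm(rc)) < eps:
            break
        mu_g = x @ s / n
        mu_h = n / np.sum(1.0 / (x * s))
        mu_t = 2 * mu_g / (tau + 1 + np.sqrt((tau + 1)**2 - 4 * mu_g / mu_h))
        mu = mu_h if mu_g / mu_h >= tau / 2 else mu_t    # target selection
        # SR Newton direction: system (13) via the normal equations
        r3 = mu**2 / (x * s) - x * s
        rh = -rc - r3 / x
        d2 = x / s
        dy = np.linalg.solve(A @ (d2[:, None] * A.T), A @ (d2 * rh) - rb)
        dx = d2 * (A.T @ dy - rh)
        ds = (r3 - s * dx) / x
        # backtrack until positivity, N(tau, beta), and Phi-decrease hold
        alpha = 1.0
        while alpha > 1e-13:
            xa, ya, sa = x + alpha * dx, y + alpha * dy, s + alpha * ds
            if xa.min() > 0 and sa.min() > 0:
                mg = xa @ sa / n
                in_nbhd = (phi(xa, sa, mg) <= (tau - 1) * n / 2
                           and np.linalg.norm(A @ xa - b) <= nrb0 * mg / mu0 * beta
                           and np.linalg.norm(A.T @ ya + sa - c) <= nrc0 * mg / mu0 * beta)
                if in_nbhd and phi(xa, sa, mu_t) <= (1 - alpha / 2) * phi(x, s, mu_t):
                    break
            alpha *= 0.5
        x, y, s = x + alpha * dx, y + alpha * dy, s + alpha * ds
    return x, y, s

# demo on a random LP with strictly feasible primal and dual points
rng = np.random.default_rng(0)
m, n = 3, 7
A = rng.standard_normal((m, n))
b = A @ rng.uniform(1.0, 2.0, n)
c = A.T @ rng.standard_normal(m) + rng.uniform(1.0, 2.0, n)
x, y, s = sr_iipm(A, b, c)
print("final duality gap:", x @ s)
```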

Remark 4.1 All the iterates of Algorithm SR-IIPM stay in the neighborhood N(τ, β); this is proved in Corollary 6.6.

For future use we introduce the scalar quantity ρ_k defined by

$$\rho_k = \prod_{i=0}^{k-1}(1 - \alpha_i).$$

Because the first two equations of system (13) are linear, we have

$$(r_b^k, r_c^k) = (1-\alpha_{k-1})(r_b^{k-1}, r_c^{k-1}) = (1-\alpha_{k-1})\cdots(1-\alpha_0)(r_b^0, r_c^0) = \rho_k(r_b^0, r_c^0), \tag{16}$$

where rᵏ_b, rᵏ_c, xᵏ, and sᵏ are the corresponding residuals and variables at step k. Since (xᵏ, yᵏ, sᵏ) ∈ N(τ, β), from (16) we derive

$$\frac{\rho_k\bigl\|(r_b^0, r_c^0)\bigr\|}{\mu_g^k} = \frac{\bigl\|(r_b^k, r_c^k)\bigr\|}{\mu_g^k} \le \beta\,\frac{\bigl\|(r_b^0, r_c^0)\bigr\|}{\mu^0}.$$

Provided that (r⁰_b, r⁰_c) ≠ 0, it follows from this inequality that

$$\rho_k \le \frac{\beta\mu_g^k}{\mu^0}. \tag{17}$$

Polynomial complexity of Algorithm SR-IIPM follows from the result that the lower bound on the step size is an inverse polynomial function of n if we choose the starting point to be

$$(x^0, y^0, s^0) = (\zeta e, 0, \zeta e), \tag{18}$$

where ζ is a scalar for which

$$\|(x^*, s^*)\|_\infty \le \zeta \tag{19}$$

for some primal-dual optimal solution (x*, y*, s*). Usually we do not know ‖(x*, s*)‖_∞, because we do not know any optimal solution a priori. However, these conditions are still relevant. Theoretically we can choose ζ = O(2^L), where L is the input length of the LO problem [16]. Such an initial point with sufficiently large ζ is helpful in computational practice for the following reason: a well-centered starting point for which the ratio

$$\frac{\bigl\|(r_b^0, r_c^0)\bigr\|}{\mu^0} \tag{20}$$

is small leads to faster convergence than do poorly centered points that are much closer to the solution set. The point (18) satisfies these criteria: it is perfectly centered, and the ratio (20) is bounded above.
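A tiny check (ours) of the properties just claimed for the starting point (18):

```python
import numpy as np

n, zeta = 5, 8.0
x0, s0 = np.full(n, zeta), np.full(n, zeta)    # starting point (18)
mu0 = x0 @ s0 / n
assert mu0 == zeta**2                          # mu^0 = zeta^2, cf. (25)
assert np.max(np.abs(np.concatenate([x0, s0]))) == zeta   # ||(x0, s0)||_inf = zeta
# perfectly centered: Phi(x0, s0, mu0) = 0
assert np.isclose(x0 @ s0 / (2 * mu0) - n + (mu0 / 2) * np.sum(1 / (x0 * s0)), 0.0)
```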

5 Preliminary Technical Results

In contrast to the feasible SR-proximity based IPMs given in [12], the orthogonality of ∆x and ∆s does not hold in IIPMs, i.e., ∆x^T∆s ≠ 0. In this section we estimate the second-order term ∆x^T∆s/µ. The proofs are analogous to the proofs in Chapter 6 of [24].

5.1 Technical Results I: Bounding ρ_k‖(x, s)‖

The following orthogonality property is useful in several places in the analysis. If (x̄, ȳ, s̄) is any vector triple that satisfies the conditions

$$A\bar x = 0, \qquad A^T\bar y + \bar s = 0, \tag{21}$$

then

$$\bar x^T\bar s = -\bar x^TA^T\bar y = 0. \tag{22}$$

The first result establishes a bound on ρ_k‖(xᵏ, sᵏ)‖ that will be used in the proof of Lemma 5.5.

Lemma 5.1 Suppose that the initial point is chosen to satisfy (18) and (19), and that the current iterate (xᵏ, yᵏ, sᵏ) ∈ N(τ, β). Then there is a positive constant C₁ such that for all k ≥ 0

$$\zeta\rho_k\bigl\|(x^k, s^k)\bigr\|_1 \le C_1 n\mu_g^k.$$

Proof: Let (x*, y*, s*) be a primal-dual optimal solution and let (x̄, ȳ, s̄) be defined by

$$(\bar x, \bar y, \bar s) = \rho_k(x^0, y^0, s^0) + (1-\rho_k)(x^*, y^*, s^*) - (x^k, y^k, s^k).$$

It is not hard to check that (x̄, ȳ, s̄) satisfies (21). Hence, from (22), we have

$$0 = \bar x^T\bar s = \bigl(\rho_kx^0 + (1-\rho_k)x^* - x^k\bigr)^T\bigl(\rho_ks^0 + (1-\rho_k)s^* - s^k\bigr)$$
$$= \rho_k^2(x^0)^Ts^0 + (1-\rho_k)^2(x^*)^Ts^* + \rho_k(1-\rho_k)\bigl((x^0)^Ts^* + (s^0)^Tx^*\bigr) + (x^k)^Ts^k - \rho_k\bigl((s^k)^Tx^0 + (x^k)^Ts^0\bigr) - (1-\rho_k)\bigl((s^k)^Tx^* + (x^k)^Ts^*\bigr).$$

Since (sᵏ)ᵀx* + (xᵏ)ᵀs* ≥ 0 and (x*)ᵀs* = 0, we have

$$\rho_k\bigl((s^k)^Tx^0 + (x^k)^Ts^0\bigr) \le \rho_k^2(x^0)^Ts^0 + (x^k)^Ts^k + \rho_k(1-\rho_k)\bigl((x^0)^Ts^* + (s^0)^Tx^*\bigr). \tag{23}$$

Since at each iteration (xᵏ, sᵏ) > 0, by definition (18) we have

$$(x^k)^Ts^0 + (s^k)^Tx^0 = \zeta\bigl\|(x^k, s^k)\bigr\|_1.$$

Now from (23), using ρ_k ∈ (0, 1), we have

$$\rho_k\zeta\bigl\|(x^k, s^k)\bigr\|_1 \le \rho_k^2 n\mu^0 + (x^k)^Ts^k + \rho_k(1-\rho_k)\bigl(\|x^0\|_\infty\|s^*\|_1 + \|s^0\|_\infty\|x^*\|_1\bigr) \le \rho_k n\mu^0 + (x^k)^Ts^k + \rho_k\bigl\|(x^0, s^0)\bigr\|_\infty\bigl\|(x^*, s^*)\bigr\|_1. \tag{24}$$

By our choice of ζ and (x⁰, y⁰, s⁰), we also have

$$\bigl\|(x^0, s^0)\bigr\|_\infty = \zeta, \qquad \bigl\|(x^*, s^*)\bigr\|_1 \le 2n\zeta, \qquad \mu^0 = \frac{(x^0)^Ts^0}{n} = \zeta^2. \tag{25}$$

Substituting these values into (24) and using β ≥ 1 and (17), we obtain

$$\zeta\rho_k\bigl\|(x^k, s^k)\bigr\|_1 \le \beta\mu_g^kn + (x^k)^Ts^k + 2n\beta\mu_g^k = C_1 n\mu_g^k,$$

where C₁ = 3β + 1. This completes the proof of the lemma. □

5.2 Technical Results II: Bounding ‖∇Ψ(v)‖

We introduce the notations

$$d_x := \frac{v\Delta x}{x}, \qquad d_s := \frac{v\Delta s}{s},$$

and define

$$\sigma_1^2 := \|d_x\|^2 + \|d_s\|^2 \qquad\text{and}\qquad \sigma := \|{-\nabla\Psi(v)}\| = \|d_x + d_s\|,$$

where the last equality follows from (7). Then we have

$$\sigma^2 = \|{-\nabla\Psi(v)}\|^2 = \|d_x\|^2 + \|d_s\|^2 + 2d_x^Td_s = \sigma_1^2 + 2d_x^Td_s = \sigma_1^2 + 2\,\frac{\Delta x^T\Delta s}{\mu}. \tag{26}$$
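Relation (26) is a purely algebraic identity for the scaled vectors d_x and d_s, so it holds for arbitrary (∆x, ∆s), not only for the Newton step (for the actual Newton step one additionally has σ = ‖∇Ψ(v)‖). The snippet below (ours) verifies the identity on random data.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 7
x, s = rng.uniform(0.3, 2.0, n), rng.uniform(0.3, 2.0, n)
dx_raw, ds_raw = rng.standard_normal(n), rng.standard_normal(n)   # any step
mu = 1.3
v = np.sqrt(x * s / mu)
dxs, dss = v * dx_raw / x, v * ds_raw / s        # scaled directions d_x, d_s
sigma1_sq = dxs @ dxs + dss @ dss
sigma_sq = np.sum((dxs + dss)**2)
assert np.isclose(sigma_sq, sigma1_sq + 2 * dx_raw @ ds_raw / mu)   # (26)
```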

The following lemma gives an upper bound for σ that will be used to derive the corresponding bound in Lemma 5.5 and inequality (42). For simplicity, in the next four lemmas we omit all superscripts and subscripts.


Lemma 5.2 If Ψ(v) ≤ (τ−1)n/2, then σ² ≤ (τn+1)³.

Proof: An upper bound for the optimal value of the problem

$$\max\ \sigma^2 \qquad\text{s.t.}\quad \Psi(v) \le \frac{(\tau-1)n}{2}$$

is (τn+1)³. This can be proved by examining the KKT conditions of the problem:

$$\psi'(v_i)\bigl[\psi''(v_i) - \lambda\bigr] = 0, \quad i = 1, \dots, n,$$
$$\Psi(v) \le \frac{(\tau-1)n}{2}, \tag{27}$$
$$\lambda\left(\Psi(v) - \frac{(\tau-1)n}{2}\right) = 0, \quad \lambda \ge 0.$$

Because ψ″(v_i) ≠ 0, from the first condition we see that λ = 0 if and only if v = e, since λ = 0 is equivalent to ψ′(v_i) = 0 for i = 1, …, n. In this case all the conditions of (27) are satisfied. On the other hand, due to the choice of ψ(t), we have that

$$\psi''(v_i) = 1 + 3v_i^{-4}$$

is a strictly decreasing convex function. If v ≠ e and λ ≠ 0, then the following two cases may happen:

(i) v₁ = v₂ = … = v_{n−k} = 1, v_{n−k+1} = v_{n−k+2} = … = v_n > 1;
(ii) v₁ = v₂ = … = v_{n−k} = 1, v_{n−k+1} = v_{n−k+2} = … = v_n < 1.

For case (i) we have

$$\frac{kv_n^2 - k}{2} - \frac{n}{2} \le \frac{kv_n^2 - k}{2} + \frac{kv_n^{-2} - k}{2} = \Psi(v) = \frac{(\tau-1)n}{2},$$

which implies $v_n \le \sqrt{(\tau n + k)/k} \le \sqrt{\tau n + 1}$. Now

$$\sigma^2 = \sum_{i=1}^n\bigl(v_i - v_i^{-3}\bigr)^2 = k\bigl(v_n - v_n^{-3}\bigr)^2 \le kv_n^2 \le k\,\frac{\tau n + k}{k} \le (\tau+1)n.$$

For case (ii) we have

$$\frac{kv_n^{-2} - k}{2} - \frac{n}{2} \le \frac{kv_n^2 - k}{2} + \frac{kv_n^{-2} - k}{2} = \Psi(v) = \frac{(\tau-1)n}{2},$$

which implies $v_n^{-2} \le (\tau n + k)/k$. Now

$$\sigma^2 = \sum_{i=1}^n\bigl(v_i - v_i^{-3}\bigr)^2 = k\bigl(v_n - v_n^{-3}\bigr)^2 \le kv_n^{-6} \le k\left(\frac{\tau n + k}{k}\right)^3 \le (\tau n + 1)^3.$$

The proof is completed. □

The following lemma gives a lower bound for v_min, the smallest coordinate of v, which plays a significant role in determining the step size (see the proof of Theorem 6.1).

Lemma 5.3 Let σ be defined by (26). Then

$$v_{\min} \ge (1 + \sigma)^{-\frac{1}{3}}. \tag{28}$$

Proof: If v_min ≥ 1, then (28) is obvious. Let v_min < 1; then we have

$$\sigma = \bigl\|v - v^{-3}\bigr\| \ge \bigl|v_{\min} - v_{\min}^{-3}\bigr| = v_{\min}^{-3} - v_{\min} \ge v_{\min}^{-3} - 1,$$

which implies (28). □

The combination of the previous lemma with the following lemma will be used in the proof of Theorem 6.1.

Lemma 5.4 Let σ be as defined by (26) and τ ≥ 10. Then for µ_t as the target value we have σ ≥ 3, and for µ_h we have σ ≥ 2.

Proof: Using Proposition 3.1.5 of [12], we have σ² ≥ 2Φ(x, s, µ). Due to the choice of the target value in the algorithm, the following cases may happen. If µ = µ_t, then Φ(x, s, µ_t) = (τ−1)n/2, which implies σ² ≥ (τ−1)n ≥ 9. If µ = µ_h, then Φ(x, s, µ_h) ≥ (τ−2)n/4, and due to the choice of τ we have σ² ≥ (τ−2)n/2 ≥ 4. This completes the proof. □

5.3 Technical Results III: Bounding (Dᵏ)⁻¹∆xᵏ and Dᵏ∆sᵏ

Consistent with the notation introduced at the augmented system (8), we define Dᵏ as

$$D^k = (X^k)^{\frac{1}{2}}(S^k)^{-\frac{1}{2}} = \mathrm{diag}\bigl((x^k)^{\frac{1}{2}}(s^k)^{-\frac{1}{2}}\bigr).$$

We also make repeated use of matrix norms, defined for a matrix M ∈ R^{p×q} as

$$\|M\| = \max_{u\in\mathbb{R}^q:\,\|u\|=1}\|Mu\|.$$

This definition holds for all three norms ‖·‖₁, ‖·‖₂, and ‖·‖_∞. A simple consequence of this definition is that

$$\|Mu\| \le \|M\|\,\|u\| \qquad\text{for all } u \in \mathbb{R}^q.$$

Because we do not have the orthogonality of ∆xᵏ and ∆sᵏ, the next lemma gives bounds on the scaled vectors (Dᵏ)⁻¹∆xᵏ and Dᵏ∆sᵏ, which help us to determine an upper bound for (∆xᵏ)ᵀ∆sᵏ that will be used frequently in the sequel. The proof is analogous to the proof of Lemma 6.5 in [24].

Lemma 5.5 Suppose that the current iterate (xᵏ, yᵏ, sᵏ) ∈ N(τ, β) and let (∆xᵏ, ∆yᵏ, ∆sᵏ) be the solution of system (13) at iteration k of Algorithm SR-IIPM, with the starting point as defined by (18) and (19). Then there is a constant C₂, independent of n, such that for all k ≥ 0

$$\bigl\|(D^k)^{-1}\Delta x^k\bigr\| \le C_2\sqrt{\mu_g^k}\,n(\sigma^k)^{\frac{1}{3}} \qquad\text{and}\qquad \bigl\|D^k\Delta s^k\bigr\| \le C_2\sqrt{\mu_g^k}\,n(\sigma^k)^{\frac{1}{3}}. \tag{29}$$

Proof: Let us define

$$(\bar x, \bar y, \bar s) = (\Delta x^k, \Delta y^k, \Delta s^k) + \rho_k(x^0, y^0, s^0) - \rho_k(x^*, y^*, s^*),$$

where (x*, y*, s*) is an optimal solution that satisfies inequality (19). It is easy to verify that (x̄, ȳ, s̄) satisfies (21); thus from (22) we have

$$\bigl[\Delta x^k + \rho_k(x^0 - x^*)\bigr]^T\bigl[\Delta s^k + \rho_k(s^0 - s^*)\bigr] = 0. \tag{30}$$

From sᵏ∆xᵏ + xᵏ∆sᵏ = −µᵏvᵏ∇Ψ(vᵏ), we have

$$s^k\bigl(\Delta x^k + \rho_k(x^0 - x^*)\bigr) + x^k\bigl(\Delta s^k + \rho_k(s^0 - s^*)\bigr) = -\mu^kv^k\nabla\Psi(v^k) + \rho_ks^k(x^0 - x^*) + \rho_kx^k(s^0 - s^*). \tag{31}$$

If we multiply this system by (XᵏSᵏ)^{−1/2} and note that (Xᵏ)^{−1/2}(Sᵏ)^{1/2} = (Dᵏ)⁻¹ and (Xᵏ)^{1/2}(Sᵏ)^{−1/2} = Dᵏ, then we have

$$(D^k)^{-1}\bigl(\Delta x^k + \rho_k(x^0 - x^*)\bigr) + D^k\bigl(\Delta s^k + \rho_k(s^0 - s^*)\bigr) = -\sqrt{\mu^k}\,\nabla\Psi(v^k) + \rho_k(D^k)^{-1}(x^0 - x^*) + \rho_kD^k(s^0 - s^*). \tag{32}$$

Now, because of (30), we have

$$\bigl\|(D^k)^{-1}(\Delta x^k + \rho_k(x^0 - x^*)) + D^k(\Delta s^k + \rho_k(s^0 - s^*))\bigr\|^2 = \bigl\|(D^k)^{-1}(\Delta x^k + \rho_k(x^0 - x^*))\bigr\|^2 + \bigl\|D^k(\Delta s^k + \rho_k(s^0 - s^*))\bigr\|^2. \tag{33}$$

Taking squared norms of both sides in (32) and using (33), we have

$$\bigl\|(D^k)^{-1}(\Delta x^k + \rho_k(x^0 - x^*))\bigr\|^2 + \bigl\|D^k(\Delta s^k + \rho_k(s^0 - s^*))\bigr\|^2 \le \Bigl(\bigl\|\sqrt{\mu^k}\,\nabla\Psi(v^k)\bigr\| + \rho_k\bigl\|(D^k)^{-1}(x^0 - x^*)\bigr\| + \rho_k\bigl\|D^k(s^0 - s^*)\bigr\|\Bigr)^2. \tag{34}$$

Let us now isolate the first term on the left-hand side and write

$$\bigl\|(D^k)^{-1}(\Delta x^k + \rho_k(x^0 - x^*))\bigr\| \le \bigl\|\sqrt{\mu^k}\,\nabla\Psi(v^k)\bigr\| + \rho_k\bigl\|(D^k)^{-1}(x^0 - x^*)\bigr\| + \rho_k\bigl\|D^k(s^0 - s^*)\bigr\|. \tag{35}$$

A simple application of the triangle inequality and the addition of an extra term ρ_k‖Dᵏ(s⁰ − s*)‖ to the right-hand side give

$$\bigl\|(D^k)^{-1}\Delta x^k\bigr\| \le \bigl\|\sqrt{\mu^k}\,\nabla\Psi(v^k)\bigr\| + 2\rho_k\bigl\|(D^k)^{-1}(x^0 - x^*)\bigr\| + 2\rho_k\bigl\|D^k(s^0 - s^*)\bigr\|. \tag{36}$$

Next, we show that each term on the right-hand side of (36) is O(√(µᵏ_g)) in magnitude. We have

$$\bigl\|(X^kS^k)^{-\frac{1}{2}}\bigr\| = \max_{i=1,\dots,n}\bigl(x_i^ks_i^k\bigr)^{-\frac{1}{2}} = \frac{1}{\min_{i=1,\dots,n}\bigl(x_i^ks_i^k\bigr)^{\frac{1}{2}}}.$$

Since vᵏ_min ≥ (1 + σᵏ)^{−1/3} by Lemma 5.3, i.e., $\min_{i}\sqrt{x_i^ks_i^k/\mu^k} \ge (1+\sigma^k)^{-1/3}$, we have

$$\bigl\|(X^kS^k)^{-\frac{1}{2}}\bigr\| \le (\mu^k)^{-\frac{1}{2}}(1+\sigma^k)^{\frac{1}{3}} \le (\mu_t^k)^{-\frac{1}{2}}(1+\sigma^k)^{\frac{1}{3}} \le \sqrt{\frac{\tau+1}{\mu_g^k}}\,(1+\sigma^k)^{\frac{1}{3}}, \tag{37}$$

where the second inequality follows from the fact that µᵏ ≥ µᵏ_t (the target is either µᵏ_t or µᵏ_h, and µᵏ_h ≥ µᵏ_t), and the last inequality follows from Lemma 3.5.

For the matrix norm ‖(Dᵏ)⁻¹‖ we have

$$\bigl\|(D^k)^{-1}\bigr\| = \max_{i=1,\dots,n}\bigl(D_{ii}^k\bigr)^{-1} = \bigl\|(D^k)^{-1}e\bigr\|_\infty = \bigl\|(X^kS^k)^{-\frac{1}{2}}S^ke\bigr\|_\infty \le \bigl\|(X^kS^k)^{-\frac{1}{2}}\bigr\|\,\bigl\|s^k\bigr\|_1, \tag{38}$$

and similarly

$$\bigl\|D^k\bigr\| \le \bigl\|(X^kS^k)^{-\frac{1}{2}}\bigr\|\,\bigl\|x^k\bigr\|_1.$$

For the last two terms in (36), we have

$$\rho_k\bigl\|(D^k)^{-1}(x^0 - x^*)\bigr\| + \rho_k\bigl\|D^k(s^0 - s^*)\bigr\| \le \rho_k\zeta\Bigl(\bigl\|(D^k)^{-1}e\bigr\| + \bigl\|D^ke\bigr\|\Bigr) = \rho_k\zeta\Bigl(\bigl\|(X^kS^k)^{-\frac{1}{2}}s^k\bigr\| + \bigl\|(X^kS^k)^{-\frac{1}{2}}x^k\bigr\|\Bigr) \le \rho_k\zeta\bigl\|(X^kS^k)^{-\frac{1}{2}}\bigr\|\,\bigl\|(x^k, s^k)\bigr\|_1.$$

From (37), (38), and Lemma 5.1 we have

$$\rho_k\bigl\|(D^k)^{-1}(x^0 - x^*)\bigr\| + \rho_k\bigl\|D^k(s^0 - s^*)\bigr\| \le C_1\sqrt{\tau+1}\,(1+\sigma^k)^{\frac{1}{3}}\sqrt{\mu_g^k}\,n. \tag{39}$$

Finally, by combining (36), (37), and (39), we obtain the bound on ‖(Dᵏ)⁻¹∆xᵏ‖ in (36) as

$$\bigl\|(D^k)^{-1}\Delta x^k\bigr\| \le \sqrt{\mu_g^k}\,\sigma^k + 2C_1(1+\sigma^k)^{\frac{1}{3}}\sqrt{\tau+1}\,\sqrt{\mu_g^k}\,n \le \Bigl(\sigma^k + C_1(\tau+1)(\sigma^k)^{\frac{1}{3}}n\Bigr)\sqrt{\mu_g^k} \le C_2\sqrt{\mu_g^k}\,n(\sigma^k)^{\frac{1}{3}}, \tag{40}$$

where C₂ = (C₁ + 1)(τ + 1); the second inequality follows from Lemma 5.4, which implies σ ≥ 2 for all target values, and from τ ≥ 10, while the last inequality follows from Lemma 5.2. One can analogously derive

$$\bigl\|D^k\Delta s^k\bigr\| \le C_2\sqrt{\mu_g^k}\,n(\sigma^k)^{\frac{1}{3}},$$

which completes the proof of the lemma. □

Now we are ready to derive a bound for σᵏ₁. From (29) we have

$$(\Delta x^k)^T\Delta s^k = \bigl((D^k)^{-1}\Delta x^k\bigr)^T\bigl(D^k\Delta s^k\bigr) \le \bigl\|(D^k)^{-1}\Delta x^k\bigr\|\,\bigl\|D^k\Delta s^k\bigr\| \le C_2^2\mu_g^kn^2(\sigma^k)^{\frac{2}{3}},$$

which gives

$$\frac{(\Delta x^k)^T\Delta s^k}{\mu_g^k} \le C_2^2n^2(\sigma^k)^{\frac{2}{3}}. \tag{41}$$

Since

$$(\sigma_1^k)^2 = (\sigma^k)^2 - 2\,\frac{(\Delta x^k)^T\Delta s^k}{\mu^k} \le (\sigma^k)^2 + 2(\tau+1)C_2^2n^2(\sigma^k)^{\frac{2}{3}},$$

where µᵏ is the target value, we have

$$(\sigma^k)^2 - 2(\tau+1)C_2^2n^2(\sigma^k)^{\frac{2}{3}} \le (\sigma_1^k)^2 \le (\sigma^k)^2 + 2(\tau+1)C_2^2n^2(\sigma^k)^{\frac{2}{3}} \le C_3n^2(\sigma^k)^{\frac{2}{3}}, \tag{42}$$

where C₃ = 3(τ+1)C₂² = 3(τ+1)³(3β+2)², and the last inequality follows from Lemma 5.2 and the definition of the constant C₂.

6 Complexity Analysis

In this section we derive a polynomial upper bound for the number of iterations of Algorithm SR-IIPM with a strictly positive step size α. The following theorem specifies the maximal step size when µᵏ_t is the target value, and estimates the reduction of the proximity function.

Theorem 6.1 Let τ ≥ 10, µᵏ_g/µᵏ_h < τ/2, and let (∆xᵏ, ∆yᵏ, ∆sᵏ) be the solution of system (13) with µ = µᵏ_t. Then the maximal feasible step size αᵏ_max satisfies

$$\alpha_{\max}^k \ge \frac{9}{10}\,(\sigma^k)^{-\frac{1}{3}}(\sigma_1^k)^{-1}.$$

Moreover, for any step size

$$\alpha \le (\alpha_0^*)^k = \frac{3}{10}\,\frac{(\sigma^k)^{\frac{2}{3}}(\sigma_1^k)^{-1}}{\sigma^k + 3\sigma_1^k},$$

the following relation holds:

$$\Phi(x(\alpha), s(\alpha), \mu_t^k) \le \Phi(x^k, s^k, \mu_t^k) - \frac{\alpha}{4}\,(\sigma^k)^2.$$

Proof: See the Appendix. □

We proceed to estimate the proximity function Φ(x(α), s(α), µ_g(α)) or, equivalently, the function² Φ(x(α), s(α), µ*(α)) for a feasible step size when µ_t is used in the algorithm as the targeted parameter. Due to the inequality Φ(x(α), s(α), µ*(α)) ≤ Φ(x(α), s(α), µ*), it suffices to consider the function Φ(x(α), s(α), µ*).

Theorem 6.2 Let τ ≥ 10, µᵏ_g/µᵏ_h < τ/2, and let (∆xᵏ, ∆yᵏ, ∆sᵏ) be the solution of system (13) with µ = µᵏ_t. Then the step size

$$(\alpha_t^*)^k = \min\left(\frac{1}{C_2\sqrt{\tau+1}\,n^{\frac{1}{2}}(\sigma^k)^{\frac{1}{3}}},\ \frac{1}{9}\,(\sigma^k)^{-\frac{1}{3}}(\sigma_1^k)^{-1}\right)$$

is strictly feasible. Moreover, for any step size α ≤ (α*_t)ᵏ the following relation holds:

$$\Phi(x(\alpha), s(\alpha), \mu_g(\alpha)) \le \Phi(x^k, s^k, \mu_t^k) = \frac{(\tau-1)n}{2}. \tag{43}$$

Proof: See the Appendix. □

It remains to consider the behavior of Φ(x(α), s(α), µᵏ_t) and Φ(x(α), s(α), µ_g(α)) when µᵏ_h is used as the targeted duality gap parameter in Algorithm SR-IIPM.

²Here we use µ*(α) = √(µ_g(α)µ_h(α)), where µ_g(α) and µ_h(α) are defined by using x(α) and s(α).

Theorem 6.3 Let τ ≥ 10, τ/2 ≤ µᵏ_g/µᵏ_h ≤ τ (we write τ₁ := µᵏ_g/µᵏ_h), and let (∆xᵏ, ∆yᵏ, ∆sᵏ) be the solution of system (13) with µ = µᵏ_h. Then the step size

$$(\alpha_h^*)^k = \min\left(\frac{(\sigma^k)^{-\frac{2}{3}}}{4C_2^2n},\ \frac{29}{10}\,\frac{(\sigma^k)^{\frac{2}{3}}(\sigma_1^k)^{-1}}{10\sigma^k + 32\tau_1\sigma_1^k}\right)$$

is strictly feasible. Moreover, for any step size α ≤ (α*_h)ᵏ we have

$$\Phi(x(\alpha), s(\alpha), \mu_g(\alpha)) \le \Phi(x^k, s^k, \mu_t^k) = \frac{(\tau-1)n}{2}, \tag{44}$$

$$\Phi(x(\alpha), s(\alpha), \mu_t) \le \Phi(x^k, s^k, \mu_t^k) - \frac{\alpha}{2}\,\Phi(x^k, s^k, \mu_t^k). \tag{45}$$

Proof: See the Appendix. □

Combining the results of Theorems 6.1, 6.2, and 6.3, we have the following conclusion, which is the key result for proving the polynomial complexity.

Corollary 6.4 Let τ ≥ 10, and let (∆xᵏ, ∆yᵏ, ∆sᵏ) be the solution of system (13) with µ = µᵏ_t if µᵏ_g/µᵏ_h < τ/2, or µ = µᵏ_h if τ/2 ≤ µᵏ_g/µᵏ_h ≤ τ. Then any step size

$$\alpha \le \alpha^* = \frac{78}{10\tau(10 + 32\tau)C_3n^2}$$

is strictly feasible and

$$\Phi(x(\alpha^*), s(\alpha^*), \mu_t^k) \le \Phi(x^k, s^k, \mu_t^k) - \frac{\alpha^*}{2}\,\Phi(x^k, s^k, \mu_t^k), \tag{46}$$

$$\mu_g^{k+1} \ge (1-\alpha^*)\mu_g^k. \tag{47}$$

Proof: From Lemma 5.2 and (42) we have (α*₀)ᵏ ≥ 3/(40C₃n²); from Theorem 6.2, Lemma 5.2, and (42) we have (α*_t)ᵏ ≥ 1/(√C₃(τ+1)n²); and from Theorem 6.3, Lemma 5.2, and (42) we have

$$(\alpha_h^*)^k \ge \frac{29}{10}\,\frac{1}{C_3(10+32\tau)n^2}.$$

Then let

$$\alpha_p^* = \min\bigl((\alpha_0^*)^k, (\alpha_t^*)^k, (\alpha_h^*)^k\bigr) = \frac{29}{10}\,\frac{1}{C_3(10+32\tau)n^2},$$

which implies (46) for any α ≤ α*_p. For (47), when the target value is µᵏ_t we have

$$\mu_g^{k+1} = \mu_g^k\left(1 - \alpha + \alpha\,\frac{(\mu_t^k)^2}{\mu_g^k\mu_h^k} + \alpha^2\,\frac{\Delta x^T\Delta s}{n\mu_g^k}\right),$$

and thus it suffices to show that

$$\frac{(\mu_t^k)^2}{\mu_g^k\mu_h^k} + \alpha\,\frac{(\Delta x^k)^T\Delta s^k}{n\mu_g^k} \ge 0. \tag{48}$$

Using the definition of µᵏ_t, (41), and Lemma 5.2, one can easily derive that (48) holds for any

$$\alpha \le \alpha^* = \frac{78}{10\tau(10+32\tau)C_3n^2}.$$

When the target value is µᵏ_h, one can analogously prove that (47) is true for α ≤ α*. The proof is completed. □

Finally, summarizing the results, we have:

Theorem 6.5 Let τ ≥ 10, suppose that the current iterate (xᵏ, yᵏ, sᵏ) ∈ N(τ, β), and let (∆xᵏ, ∆yᵏ, ∆sᵏ) be the solution of system (13), where µᵏ is as defined in Algorithm SR-IIPM. Then the step size

$$\alpha^* = \frac{78}{10\tau(10+32\tau)C_3n^2}$$

is strictly feasible. Moreover, we have

$$\Phi(x(\alpha^*), s(\alpha^*), \mu_g(\alpha^*)) \le \Phi(x^k, s^k, \mu_t^k) = \frac{(\tau-1)n}{2}, \tag{49}$$

$$\Phi(x(\alpha^*), s(\alpha^*), \mu_t^k) \le \Phi(x^k, s^k, \mu_t^k) - \frac{\alpha^*}{2}\,\Phi(x^k, s^k, \mu_t^k), \tag{50}$$

$$\mu_g^{k+1} \ge (1-\alpha^*)\mu_g^k. \tag{51}$$

The following corollary guarantees that all the iterates of Algorithm SR-IIPM remain in the neighborhood N(τ, β).

Corollary 6.6 Let τ ≥ 10. Suppose that (xᵏ, yᵏ, sᵏ) ∈ N(τ, β) and let (∆xᵏ, ∆yᵏ, ∆sᵏ) be the solution of system (13), where µᵏ is as defined in Algorithm SR-IIPM. Then (xᵏ⁺¹, yᵏ⁺¹, sᵏ⁺¹) ∈ N(τ, β).

Proof: Using Theorem 6.5 we have µᵏ⁺¹_g ≥ (1 − α*)µᵏ_g; therefore we may write

$$\frac{\bigl\|(r_b^{k+1}, r_c^{k+1})\bigr\|}{\mu_g^{k+1}} \le \frac{(1-\alpha^*)\bigl\|(r_b^k, r_c^k)\bigr\|}{\mu_g^{k+1}} \le \frac{\bigl\|(r_b^k, r_c^k)\bigr\|}{\mu_g^k} \le \beta\,\frac{\bigl\|(r_b^0, r_c^0)\bigr\|}{\mu^0}.$$

Again by Theorem 6.5 we have

$$\Phi(x(\alpha^*), s(\alpha^*), \mu_g(\alpha^*)) \le \frac{(\tau-1)n}{2},$$

which implies the statement of the corollary. □

To obtain an upper bound for the total number of iterations of Algorithm SR-IIPM, we need to estimate the change of the parameter µ_t after one iteration. The following technical lemma will be used in this estimation; its proof follows easily from the definition of the proximity function.

Lemma 6.7 Let v⁺ = v/√(1−θ) for some θ ∈ (0, 1). Then we have

$$\Psi(v^+) \le \frac{1}{1-\theta}\,\Psi(v) + \frac{n\theta}{1-\theta}.$$

By applying Lemma 6.7 to Theorem 6.5, we can prove the following theorem.

Theorem 6.8 Let τ ≥ 10, suppose that the current iterate (xᵏ, yᵏ, sᵏ) ∈ N(τ, β), and let (∆xᵏ, ∆yᵏ, ∆sᵏ) be the solution of system (13), where µᵏ is as defined in Algorithm SR-IIPM, and α* is the default step size defined in Theorem 6.5. Let θ ≤ α*/4. Then

$$\Phi\bigl(x(\alpha^*), s(\alpha^*), \mu_t^k(1-\theta)\bigr) \le \Phi(x^k, s^k, \mu_t^k).$$

Proof: From Lemma 6.7 we can see that, to prove the theorem, it suffices to choose θ so that it satisfies the inequality

$$\Phi(x(\alpha^*), s(\alpha^*), \mu_t^k) + n\theta \le (1-\theta)\,\Phi(x^k, s^k, \mu_t^k).$$

Using Theorem 6.5, we can conclude that the above inequality is satisfied if

$$\theta\,\Phi(x^k, s^k, \mu_t^k) + n\theta \le \frac{\alpha^*}{2}\,\Phi(x^k, s^k, \mu_t^k). \tag{52}$$

Recalling that Φ(xᵏ, sᵏ, µᵏ_t) = (τ−1)n/2, we can rewrite (52) as

$$\frac{(\tau+1)n\theta}{2} \le \frac{n(\tau-1)\alpha^*}{4}.$$

This relation implies that if we choose θ = α*/4, then we have

$$\Phi\bigl(x(\alpha^*), s(\alpha^*), \mu_t^k(1-\theta)\bigr) \le \Phi(x^k, s^k, \mu_t^k).$$

The proof is completed. □

Now we can proceed to discuss the complexity of Algorithm SR-IIPM. By the definition of µᵏ_t we know that the proximity function Φ(x, s, µ_t) stays constant over all the iterates. Let us denote by µᵏ⁺¹_t the target parameter value after step k. Then we have

$$\Phi(x(0), s(0), \mu_t^k) = \Phi(x(\alpha^*), s(\alpha^*), \mu_t^{k+1}).$$

On the other hand, we know that

$$\Phi(x(\alpha^*), s(\alpha^*), \mu_g(\alpha^*)) \le \Phi(x(\alpha^*), s(\alpha^*), \mu_t^{k+1}),$$
$$\Phi(x(\alpha^*), s(\alpha^*), \mu_t^k) \le \Phi(x(\alpha^*), s(\alpha^*), \mu_t^{k+1}).$$

Because µᵏ⁺¹_t ≤ µ_g(α*), from these two inequalities we get µᵏ_t ≥ µᵏ⁺¹_t. Therefore, by using Theorem 6.8 we can claim that

$$\mu_t^{k+1} \le \left(1 - \frac{\alpha^*}{4}\right)\mu_t^k. \tag{53}$$

Now we are ready to prove the complexity of Algorithm SR-IIPM.

Theorem 6.9 Let τ ≥ 10 and t₀ = max(1, ‖(r⁰_b, r⁰_c)‖/µ⁰). Then Algorithm SR-IIPM terminates after at most

$$\left\lceil \frac{40}{78}\,\tau(10+32\tau)C_3n^2\,\log\frac{n(\tau+1)\beta t_0}{\varepsilon} \right\rceil$$

iterations with a solution satisfying x^Ts ≤ ε and ‖(r_b, r_c)‖ ≤ ε.

Proof: In light of relation (53), we know that after at most

$$\left\lceil \frac{4}{\alpha^*}\,\log\frac{n(\tau+1)\beta t_0}{\varepsilon} \right\rceil$$

iterations we have µ_t ≤ ε/(n(τ+1)βt₀), which implies x^Ts ≤ ε/(βt₀) ≤ ε. Then by Corollary 6.6 we have ‖(r_b, r_c)‖ ≤ βµ_g t₀ ≤ ε. □


7 Implementation and Numerical Results

We implemented the algorithm in C, using the WSMP [2] sparse matrix package and some elements of OSL [11]. First we call some OSL subroutines for data input and preprocessing, then we transfer the data of the LO problem to our solver; after the problem is solved with our SR-proximity based IIPM solver, the results are fed back to the OSL postprocessing subroutines, which produce the final solution. WSMP is utilized to solve the normal equation system at each iteration, and some ESSL subroutines are used for matrix and vector operations. The system environment is an IBM RS/6000 44P Model 270 workstation running AIX 4.3. The generation of initial points in SR-IIPM basically follows the method used in LIPSOL, with minor changes. The parameters are fixed for all test problems, and we use the usual backtracking strategy (analogous to LIPSOL [30]) to ensure that the iterates stay sufficiently far from the boundary of the positive orthant. We tested all the benchmark problems in NETLIB. The results are highly encouraging: the average iteration number is less than that of OSL and LIPSOL [30], while the solutions have slightly higher precision.

As an illustration, Table 1 compares the iteration numbers and precision of SR-IIPM with those of OSL and LIPSOL for the problems of the Kennington set from the NETLIB library. The computational times for SR-IIPM and LIPSOL are also given in Table 1. As a stopping criterion, we set the same tolerance on the duality gap for the three implementations. In the table, the column "Digits" shows how many digits of the objective value agree with the standard reference solutions in NETLIB.

For the total of 95 problems of the standard set in NETLIB, our algorithm achieves an average of 9.93 digits of precision in an average of 21.33 iterations, while on average OSL gives 9.01 precise digits in 21.68 iterations and LIPSOL gives 9.53 precise digits in 21.34 iterations. For 34 problems our results are better than LIPSOL's, by requiring fewer iterations (with the same precision), achieving higher precision (with the same number of iterations), or both. For 19 problems our results are worse than LIPSOL's, while for the other 42 problems the results are equally good.

8 Conclusions

In this paper, a self-regular proximity based infeasible IPM is presented and polynomial complexity of the algorithm is established. The number of iterations is bounded by O(n² log(n/ε)). Limited computational results are reported as well. Numerical experience demonstrates the potential of the algorithm to solve practical problems efficiently.

Acknowledgement: The first author would like to thank the Iranian Ministry of Science, Research and Technology for supporting his research. This research was also supported by an NSERC Discovery Grant, the Canada Research Chair program, MITACS, and an FPP grant from the IBM T.J. Watson Research Laboratory. The authors are grateful to two anonymous referees for carefully reading an earlier version of this paper, correcting some typos, and giving numerous useful suggestions on how to improve the paper.

Table 1: Comparison of iteration numbers, precision, and CPU times

                 SR-IIPM                 OSL            LIPSOL
Problem   Iter  Digits  Seconds   Iter  Digits   Iter  Digits  Seconds
cre-a      29     12      5.33     35     10      30     11      5.17
cre-b      42     12    187.00     48     10      42     11    118.69
cre-c      28     12      3.83     34     10      30     11      4.40
cre-d      41     11    139.00     51      9      38     11     96.86
ken-07     14     11      0.73     16     11      16     11      2.12
ken-11     18     10     14.90     21     11      22     11     19.09
ken-13     23     11     52.50     25     11      27     11     52.23
ken-18     32     11    657.00     31     11       ?      ?        ?
osa-07     29     12    206.00     24     10      27     11     19.86
osa-14     41     11    960.00     25     11      37     11     62.21
osa-30     31     12    413.00     36     11      36     10    124.12
osa-60     34     10   1360.00     32     10       ?      ?        ?
pds-02     21     12      2.95     22     11      29     11      5.37
pds-06     33     11     30.50     34     11      43     11     60.86
pds-10     45     12     92.80     46     11      53     11    249.78
pds-20     49     12    479.00     58     11      69     11   2075.26

Notes: The symbol "?" indicates that the MPS reader failed to read those problems, so the solver does not give solutions.

References

[1] E.D. Anderson, C. Roos, T. Terlaky, T. Trafalis, and J.P. Warners. The use of low-rank updates in interior-point methods. Technical Report 96-149, revised in 1999, Delft University of Technology, The Netherlands, 1996.



[2] A. Gupta. WSMP: Watson Sparse Matrix Package (Part I: direct solution of symmetric sparse systems). Technical Report RC 21866(98462), IBM T.J. Watson Research Center, Yorktown Heights, NY, 2000. http://www.cs.umn.edu/~agupta/wsmp.html.

[3] N. Karmarkar. A new polynomial-time algorithm for linear programming. Combinatorica, 4, 373–395, 1984.

[4] M. Kojima. Basic lemmas in polynomial-time infeasible-interior-point methods for linear programs. Annals of Operations Research, 62, 1–28, 1996.

[5] M. Kojima, N. Megiddo, and S. Mizuno. A primal-dual infeasible-interior-point algorithm for linear programming. Mathematical Programming, 61, 263–280, 1993.

[6] I.J. Lustig. Interior point methods: computational state of the art. Linear Algebra and Its Applications, 152, 191–222, 1993.

[7] I.J. Lustig. Feasibility issues in a primal-dual interior-point method for linear programming. Mathematical Programming, 49, 145–162, 1990/91.

[8] S. Mehrotra. On the implementation of a primal-dual interior point method. SIAM Journal on Optimization, 2, 575–601, 1992.

[9] S. Mizuno. Polynomiality of infeasible-interior-point algorithms for linear programming. Mathematical Programming, 67, 109–119, 1994.

[10] R.D.C. Monteiro and I. Adler. Interior path following primal-dual algorithms. Part I: Linear programming. Mathematical Programming, 44, 27–41, 1989.

[11] OSL: The Optimization Solution Library. Home page: http://www-3.ibm.com/software/data/bi/index.html

[12] J. Peng, C. Roos, and T. Terlaky. Self-Regularity: A New Paradigm for Primal-Dual Interior-Point Algorithms. Princeton University Press, Princeton, NJ, 2002.

[13] J. Peng and T. Terlaky. A dynamic large-update primal-dual interior-point method for linear optimization. Optimization Methods and Software, 17, 1077–1104, 2002.

[14] J. Peng, C. Roos, and T. Terlaky. Self-regular proximities and new search directions for linear and semidefinite optimization. Mathematical Programming, 93, 129–171, 2002.

[15] F.A. Potra. A quadratically convergent predictor-corrector method for solving linear programs from infeasible starting points. Mathematical Programming, 67, 383–406, 1994.

[16] C. Roos, T. Terlaky, and J.-Ph. Vial. Theory and Algorithms for Linear Optimization. An Interior Point Approach. John Wiley and Sons, Chichester, UK, 1997.

[17] D. Shanno and E. Simantiraki. An infeasible interior-point method for linear complementarity problems. SIAM Journal on Optimization, 7, no. 3, 620–640, 1997.

[18] J. Stoer, M. Wechs, and S. Mizuno. High order infeasible-interior-point methods for solving sufficient linear complementarity problems. In: System Modeling and Optimization, Detroit, MI, Chapman & Hall/CRC Res. Notes Math., 396, 245–252, 1997.

[19] K. Tanabe. Centered Newton method for linear programming: Interior and 'exterior' point method (in Japanese). In: New Methods for Linear Programming, K. Tone (Ed.), 3, 98–100, 1990.


[20] T. Terlaky (Ed.). Interior Point Methods of Mathematical Programming. Kluwer Academic Publishers, 1996.

[21] M.J. Todd and Y. Ye. Approximate Farkas lemmas and stopping rules for iterative infeasible-point algorithms for linear programming. Mathematical Programming, 81, 1–21, 1998.

[22] P. Tseng. Analysis of an infeasible interior path-following method for complementarity problems. SIAM Journal on Optimization, 7, no. 2, 386–402, 1997.

[23] R.H. Tütüncü. An infeasible-interior-point potential-reduction algorithm for linear programming. Mathematical Programming, 86, no. 2, 313–334, 1999.

[24] S.J. Wright. Primal-Dual Interior-Point Methods. SIAM, Philadelphia, 1996.

[25] S.J. Wright. An infeasible-interior-point algorithm for linear complementarity problems. Mathematical Programming, 67, 29–52, 1994.

[26] S.J. Wright and Y. Zhang. A superquadratic infeasible-interior-point method for linear complementarity problems. Mathematical Programming, 73, no. 3, 269–289, 1996.

[27] Y. Ye. Interior-Point Algorithms, Theory and Analysis. John Wiley & Sons, Chichester, UK, 1997.

[28] Y. Ye, M.J. Todd, and S. Mizuno. An O(√n L)-iteration homogeneous and self-dual linear programming algorithm. Mathematics of Operations Research, 19, 53–67, 1994.

[29] Y. Zhang. On the convergence of a class of infeasible-interior-point methods for the horizontal linear complementarity problem. SIAM Journal on Optimization, 4, 208–227, 1994.

[30] Y. Zhang. Solving large-scale linear programs by interior-point methods under the MATLAB environment. Optimization Methods and Software, 10, 1–31, 1999.

[31] Y. Zhang. User's Guide to LIPSOL: Linear-Programming Interior Point Solvers V0.4. Optimization Methods and Software, 11/12, 385–396, 1999.

9 Appendix

In this section we present the detailed proofs of several technical results. For simplicity we omit all superscripts and subscripts in this section.

Proof of Theorem 6.1. Before we start the proof, we quote the following technical result from [12, 14].

Lemma 9.1 Suppose that γ ∈ [0, 1]. Then

$$(1+t)^\gamma \le 1 + \gamma t \quad \forall\, t \in [-1, \infty), \qquad (1-t)^\gamma \le 1 - \gamma t \quad \forall\, t \in (-\infty, 1]. \tag{54}$$

Now we outline the key steps of the proof; the reader is referred to [12] or [14] for the detailed proof. In order to estimate the decrease of the proximity function after a step, using Lemmas 5.3 and 5.4 we have

$$v_{\min} \ge (1+\sigma)^{-\frac{1}{3}} \ge \left(\frac{\sigma}{3} + \sigma\right)^{-\frac{1}{3}} \ge \frac{9}{10}\,\sigma^{-\frac{1}{3}}, \tag{55}$$

where the second inequality follows from Lemma 5.4. Let ᾱ = v_minσ₁⁻¹. It follows immediately that

$$\alpha_{\max} \ge v_{\min}\sigma_1^{-1} \ge \frac{9}{10}\,\sigma^{-\frac{1}{3}}\sigma_1^{-1}. \tag{56}$$

This proves the first statement of the theorem. Now let us define

$$f(\alpha) = \Psi(v(\alpha)) - \Psi(v),$$

where v(α) = √(x(α)s(α)/µ). Using condition SR.2 of Definition 2.1, we have

$$f(\alpha) \le \frac{1}{2}\Psi(v + \alpha d_x) + \frac{1}{2}\Psi(v + \alpha d_s) - \Psi(v) := f_1(\alpha) = -\Psi(v) + \frac{1}{2}\sum_{i\in I}\bigl(\psi(v_i + \alpha[d_x]_i) + \psi(v_i + \alpha[d_s]_i)\bigr). \tag{57}$$

Obviously, both functions f(α) and f₁(α) are twice continuously differentiable with respect to α if α ≤ α_max. Analogously to Lemmas 3.3.2 and 3.3.3 of [12, 14] we have

$$f_1''(\alpha) \le \frac{\sigma_1^2}{2}\bigl(1 + 3(v_{\min} - \alpha\sigma_1)^{-4}\bigr)$$

and

$$f(0) = f_1(0) = 0, \qquad f'(0) = f_1'(0) = -\frac{\sigma^2}{2}.$$

Then we may write

$$f_1(\alpha) \le -\frac{\sigma^2\alpha}{2} + \frac{\sigma_1^2}{2}\int_0^\alpha\!\!\int_0^\xi\bigl[1 + 3(v_{\min} - \eta\sigma_1)^{-4}\bigr]\,d\eta\,d\xi := f_2(\alpha).$$

The function f₂(α) is convex and twice continuously differentiable in the interval [0, ᾱ). Let us denote by α* the global minimizer of f₂(α) in this interval, that is,

$$\alpha^* = \arg\min\{f_2(\alpha) : \alpha \in [0, \bar\alpha]\}.$$

Analogously to the proof of Lemma 3.3.3 of [12] we can see that α* is the solution of the equation

$$-\sigma^2 + \sigma_1^2\alpha + \sigma_1\bigl((v_{\min} - \alpha\sigma_1)^{-3} - v_{\min}^{-3}\bigr) = 0. \tag{58}$$

Let us define

$$\omega_1(\alpha) = -\frac{\sigma^2}{2} + \sigma_1^2\alpha, \qquad \omega_2(\alpha) = -\frac{\sigma^2}{2} + \sigma_1\bigl((v_{\min} - \alpha\sigma_1)^{-3} - v_{\min}^{-3}\bigr).$$

It is easy to verify that both functions ω₁(α) and ω₂(α) are increasing for α ∈ [0, ᾱ). Using these two functions we can write equation (58) as

$$\omega_1(\alpha^*) + \omega_2(\alpha^*) = 0.$$

The root α*₁ of ω₁(α) = 0 is α*₁ = σ²/(2σ₁²). Through simple calculus, we find that the root α*₂ of ω₂(α) = 0 is

$$\alpha_2^* = \frac{v_{\min}}{\sigma_1}\left(1 - \left(1 + \frac{v_{\min}^3\sigma^2}{2\sigma_1}\right)^{-\frac{1}{3}}\right). \tag{59}$$

Now by using (54) we can write

$$\left(1 + \frac{v_{\min}^3\sigma^2}{2\sigma_1}\right)^{-\frac{1}{3}} = \left(1 - \frac{v_{\min}^3\sigma^2}{2\sigma_1 + v_{\min}^3\sigma^2}\right)^{\frac{1}{3}} \le 1 - \frac{v_{\min}^3\sigma^2}{3(2\sigma_1 + v_{\min}^3\sigma^2)},$$

which implies that

$$\alpha_2^* \ge \frac{v_{\min}^4\sigma^2}{3\sigma_1(2\sigma_1 + v_{\min}^3\sigma^2)}.$$

By (55) we have

$$\alpha_2^* \ge \frac{3}{10}\,\frac{\sigma^{\frac{2}{3}}\sigma_1^{-1}}{\sigma + 3\sigma_1}.$$

Now we have α* ≥ min(α*₁, α*₂) ≥ (3/10)σ^{2/3}σ₁⁻¹/(σ + 3σ₁). Since

$$f_2(0) = 0 \qquad\text{and}\qquad f_2'(0) \le 0,$$

by Lemma 1.3.3 of [12] the second statement also holds. □

Proof of Theorem 6.2. It suffices to estimate the interval in which the proximity function satisfies

$$\Phi(x(\alpha), s(\alpha), \mu_g(\alpha)) \le \frac{(\tau-1)n}{2}.$$

We start by considering the function

$$g_1(\alpha) = \Phi(x(\alpha), s(\alpha), \mu^*) = \frac{1}{2}\left(\frac{\mu_t^k}{\mu^*}\,\|v(\alpha)\|^2 - 2n + \frac{\mu^*}{\mu_t^k}\,\bigl\|v(\alpha)^{-1}\bigr\|^2\right).$$

Note that, from the definitions of µ* and µ_t, we know that

$$\frac{\mu_t^k}{\mu^*}\,\|v(0)\|^2 = \frac{\mu^*}{\mu_t^k}\,\bigl\|v(0)^{-1}\bigr\|^2 = n\sqrt{\tau_0},$$

where τ₀ := µ_g/µ_h. On the other hand, we have

$$\|v(\alpha)\|^2 = \|v\|^2 + \alpha v^T(d_x + d_s) + \alpha^2d_x^Td_s \le \|v\|^2 + \alpha^2d_x^Td_s,$$

because αvᵀ(d_x + d_s) = α(‖v⁻¹‖² − ‖v‖²) ≤ 0 for any strictly positive step size α. It is easy to verify that

$$\bigl\|v(\alpha)^{-1}\bigr\|^2 \le \bigl(1 - \alpha v_{\min}^{-1}\sigma_1\bigr)^{-2}\,\bigl\|v^{-1}\bigr\|^2.$$

We choose α such that (1 − αv_min⁻¹σ₁)⁻² ≤ √2 ≤ √(τ/τ₀); because 1 − 2^{−1/4} ≥ 1/8 and v_min ≥ (9/10)σ^{−1/3} (see the proof of the previous theorem), this holds whenever α ≤ (1/9)σ^{−1/3}σ₁⁻¹. Then we have

$$g_1(\alpha) \le \frac{1}{2}\left(n\sqrt{\tau_0} - 2n + \alpha^2\bigl|d_x^Td_s\bigr| + n\sqrt{\tau_0}\sqrt{\frac{\tau}{\tau_0}}\right) \le \frac{1}{2}\Bigl(2n\sqrt{\tau} - 2n + \alpha^2(\tau+1)C_2^2n^2\sigma^{\frac{2}{3}}\Bigr),$$

so if we take

$$\alpha_t^* = \min\left(\frac{1}{C_2\sqrt{\tau+1}\,n^{\frac{1}{2}}\sigma^{\frac{1}{3}}},\ \frac{1}{9}\,\sigma^{-\frac{1}{3}}\sigma_1^{-1}\right),$$

we have g₁(α) ≤ ½(2n√τ − 2n + n) ≤ (τ−1)n/2. The proof is completed. □

Proof of Theorem 6.3. First we observe from the assumption that µᵏ_g = τ₁µᵏ_h, where τ/2 ≤ τ₁ ≤ τ; hence µ* = √τ₁ µᵏ_h. Let us define

$$g(\alpha) = \Phi(x(\alpha), s(\alpha), \mu^*) - \Phi(x, s, \mu^*).$$

Using condition SR.2 we have

$$g(\alpha) \le \frac{1}{2}\Phi\!\left(\frac{v + \alpha d_x}{\tau_1^{1/4}}\right) + \frac{1}{2}\Phi\!\left(\frac{v + \alpha d_s}{\tau_1^{1/4}}\right) - \Phi(v) := g_1(\alpha).$$

Moreover, from the definition of v we have ‖v⁻¹‖² = n, which further implies

$$\bigl\|v^{-3}\bigr\|^2 - \bigl\|v^{-1}\bigr\|^2 = \bigl\|v^{-3} - v^{-1}\bigr\|^2 + 2\bigl\|v^{-2} - e\bigr\|^2 \ge 0.$$

This inequality, together with the fact that ‖v‖² = τ₁‖v⁻¹‖², gives

$$g_1'(0) \le -\frac{\sigma^2}{2\sqrt{\tau_1}}.$$

In the same way as Lemma 3.3.3 in [12] is proved, we have

$$g_1(\alpha) \le -\frac{\sigma^2\alpha}{2\sqrt{\tau_1}} + \frac{\sigma_1^2}{2\sqrt{\tau_1}}\int_0^\alpha\!\!\int_0^\xi\bigl(1 + 3\tau_1(v_{\min} - \eta\sigma_1)^{-4}\bigr)\,d\eta\,d\xi := g_2(\alpha).$$

It is easy to see, by simple calculus, that g₂(α) is convex and twice differentiable for all feasible α. Let α*₂ denote the global minimizer of g₂(α) in this interval. As in Lemma 3.3.3 of [12], α*₂ is the solution of the equation

$$-\frac{\sigma^2}{\sqrt{\tau_1}} + \frac{\sigma_1^2\alpha}{\sqrt{\tau_1}} + \sigma_1\sqrt{\tau_1}\bigl((v_{\min} - \alpha\sigma_1)^{-3} - v_{\min}^{-3}\bigr) = 0.$$

Analogously to Theorem 6.1, using Lemma 5.4 one can show that v_min ≥ (87/100)σ^{−1/3} when µ_h is the target value. This implies α*₁ = σ²/(2σ₁²) and

$$\alpha_2^* \ge \frac{29}{10}\,\frac{\sigma^{\frac{2}{3}}\sigma_1^{-1}}{10\sigma + 32\tau_1\sigma_1}.$$

It follows that α* ≥ min(α*₁, α*₂) ≥ (29/10)σ^{2/3}σ₁⁻¹/(10σ + 32τ₁σ₁). Hence, whenever α ≤ α*₂, the relation

$$\Phi(x(\alpha), s(\alpha), \mu^*(\alpha)) \le \Phi(x(\alpha), s(\alpha), \mu^*(0)) \le \Phi(x, s, \mu^*)$$

holds. Now, by using relation (10) we can claim that

$$\Phi(x(\alpha), s(\alpha), \mu_g(\alpha)) \le \Phi(x, s, \mu_g) \le \Phi(x, s, \mu_t),$$

where the last inequality follows from the convexity of Φ(x, s, ·) in µ and the fact that µ_g ≥ µ_h ≥ µ_t. Now we focus on inequality (45). In order to investigate the behavior of the proximity function Φ(x(α), s(α), µ_t) for a feasible step size α, we define

$$h(\alpha) := \Phi(x(\alpha), s(\alpha), \mu_t) - \Phi(x(0), s(0), \mu_t) \tag{60}$$
$$= \frac{1}{2}\left(\frac{\mu_h}{\mu_t} - \frac{\mu_t}{\mu_h}\right)\bigl[\alpha v^T(d_x + d_s) + \alpha^2d_x^Td_s\bigr] + \frac{\mu_t}{2\mu_h}\left[\alpha v^T(d_x + d_s) - \bigl\|v^{-1}\bigr\|^2 + \sum_{i=1}^n\frac{1}{[v + \alpha d_x]_i\,[v + \alpha d_s]_i} + \alpha^2d_x^Td_s\right]. \tag{61}$$

Now, by applying a procedure similar to the proof of Theorem 6.1 to the second term in the above formula, we can prove that for α ≤ (3/10)σ^{2/3}σ₁⁻¹/(σ + 3σ₁) the following relation holds:

$$h(\alpha) \le \frac{1}{2}\left(\frac{\mu_h}{\mu_t} - \frac{\mu_t}{\mu_h}\right)\bigl[\alpha v^T(d_x + d_s) + \alpha^2d_x^Td_s\bigr] - \frac{\alpha\mu_t}{4\mu_h}\,\sigma^2.$$

Moreover, for any α ≤ σ^{−2/3}/(4C₂²n) we have

$$v^T(d_x + d_s) + \alpha d_x^Td_s \le -\left(\frac{\mu_g}{\mu_h} - 1\right)\bigl\|v^{-1}\bigr\|^2 + \alpha(\tau+1)C_2^2n^2\sigma^{\frac{2}{3}} \le -\bigl\|v^{-1}\bigr\|^2.$$

Finally, if we choose

$$\alpha \le \min\left(\frac{3}{10}\,\frac{\sigma^{\frac{2}{3}}\sigma_1^{-1}}{\sigma + 3\sigma_1},\ \frac{\sigma^{-\frac{2}{3}}}{4C_2^2n}\right),$$

then

$$h(\alpha) \le -\frac{\alpha}{2}\,\Phi(x, s, \mu_t).$$

The proof is completed. □
