Research Reports on Mathematical and Computing Sciences
Department of Mathematical and Computing Sciences
Tokyo Institute of Technology
SERIES B: Operations Research
ISSN 1342-2804

A Relaxation Algorithm with Probabilistic Guarantee for Robust Deviation Optimization Problems

Akiko Takeda, Shunsuke Taguchi and Tsutomu Tanaka

March 2006, B-427


B-427 A Relaxation Algorithm with Probabilistic Guarantee for Robust Deviation Optimization Problems

Akiko Takeda†, Shunsuke Taguchi‡ and Tsutomu Tanaka§

March 2006

Abstract. For uncertain optimization problems, three measures of robustness (absolute robustness, deviation robustness and relative robustness) have been proposed, depending on the goal and specifics of the decision maker. Absolute robustness has been discussed extensively with respect to tractable formulations and applicability, but studies on the other robustness measures appear to be restricted to the field of discrete optimization.

We mainly focus on deviation robustness for uncertain convex quadratic optimization problems, and propose a relaxation technique based on random sampling for solving robust deviation optimization problems with familiar uncertainty sets. The relaxation problem gives a tighter approximate solution than a simple sampled relaxation problem, both theoretically and experimentally. Furthermore, the robustness of the solution is measured in a probabilistic setting. The number of random samples needed to obtain an approximate solution with a probabilistic guarantee is estimated, and the approximation error is evaluated both a-priori and a-posteriori. Our relaxation algorithm with probabilistic guarantee makes an a-posteriori assessment to evaluate the accuracy of approximate solutions.

Key words. Robust Optimization, Relative Robustness, Probabilistic Robustness Analysis, Relaxation Method, Worst-Case Violation.

† Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, 2-12-1-W8-29 Oh-Okayama, Meguro-ku, Tokyo 152-8552, Japan. [email protected]

‡ Digital Media Network Company, Toshiba Corporation, 2-9 Suehiro-Cho, Ome, Tokyo 198-8710, Japan. [email protected]

§ Department of Information Science, Tokyo Institute of Technology, 2-12-1-W8-29 Oh-Okayama, Meguro-ku, Tokyo 152-8552, Japan. [email protected]


1 Introduction

Uncertainty is an inevitable feature of many decision-making environments. On a regular basis, engineers, economists, investment professionals, and others need to make decisions to optimize a system with incomplete information and considerable uncertainty. Robust optimization (RO) is a term that describes both modeling strategies and solution methods for optimization problems defined by uncertain inputs. The objective of robust optimization models and algorithms is to obtain solutions that are guaranteed to perform well (in terms of feasibility and near-optimality) for all, or at least most, possible realizations of the uncertain input parameters.

For uncertain optimization problems, several robustness criteria have been proposed, depending on the goal and specifics of the decision maker. In particular, Kouvelis and Yu (1997) defined three measures of robustness: absolute robustness, deviation robustness and relative robustness. In their robust optimization problems, a scenario-based approach is used to represent the input data uncertainty. Let f(x, A_s) denote the cost of solution x ∈ X in scenario s ∈ S, where X is the set of all solutions and S is the set of all potentially realizable input data scenarios; A_s denotes the instance of the input data that corresponds to scenario s. Then, the three robust optimization problems are described as follows (Kouvelis and Yu 1997, Hites and Salazar-Neumann 2004, Yaman et al. 2004, e.g.):

* Absolute robust optimization problem, which minimizes the maximum total cost among all feasible decisions over all realizable input data scenarios:
$$\min_{x \in X} \max_{s \in S} f(x, A_s),$$
* Robust deviation optimization problem, which exhibits the best worst-case deviation from optimality:
$$\min_{x \in X} \max_{s \in S} \{ f(x, A_s) - f(x^*_s, A_s) \}, \quad \text{where } f(x^*_s, A_s) := \min_{x \in X} f(x, A_s),$$
* Relative robust optimization problem, which exhibits the best worst-case percentage deviation from optimality:
$$\min_{x \in X} \max_{s \in S} \frac{f(x, A_s) - f(x^*_s, A_s)}{f(x^*_s, A_s)}, \quad \text{where } f(x^*_s, A_s) := \min_{x \in X} f(x, A_s).$$

In the studies of Hites and Salazar-Neumann (2004) and Yaman et al. (2004), the set of scenarios S can be the Cartesian product of intervals, that is, the uncertain data may take any value in some interval. Robust deviation optimization problems, as well as absolute robust optimization problems, have been studied extensively for discrete optimization problems with uncertainty: shortest path problems (Yu and Yang 1998), spanning tree problems (Aron and Hentenryck 2004, Yaman et al. 2001), and so on. Concretely, the shortest path problem of Yu and Yang (1998) considers the travel time on each arc to be uncertain because of various scenarios of traffic conditions, such as the presence of accidents. It is proposed to minimize the maximum deviation of the path length from the optimal path length of the corresponding scenario, which is appropriate when the chosen path's relative performance compared to the optimal paths under various circumstances matters most.

On the other hand, there are different definitions of robust optimization problems in the literature (Ben-Tal and Nemirovski 1998, 1999, El Ghaoui and Lebret 1997, Goldfarb and Iyengar 2003a,b, e.g.), but these robust optimization problems may be regarded as absolute robust optimization problems. The standard robust optimization formulations assume that the uncertain input parameters are known only within certain bounds, described by an uncertainty set U, and focus on the worst case in U. As descriptions of U for uncertain data, not only scenarios {A_s : s ∈ S} and intervals, but also other types of uncertainty sets have been proposed. Ben-Tal and Nemirovski (1999) pointed out that an ellipsoidal uncertainty set U, which assumes that the uncertain data lie in some ellipsoidal set, avoids a "too conservative" solution compared to an interval uncertainty set. Conversely, in this robust optimization framework, other robustness measures such as deviation robustness are not discussed much, though Krishnamurthy (2004) proposed several robust deviation optimization problems with an uncertainty set U represented by the convex hull of scenarios {A_s : s ∈ S} for portfolio optimization problems.

In this paper, we mainly focus on deviation robustness for uncertain convex quadratic optimization problems, and propose a relaxation technique for solving robust deviation optimization problems (RD) with the familiar uncertainty sets U used in (Ben-Tal and Nemirovski 1998, 1999, El Ghaoui and Lebret 1997, Goldfarb and Iyengar 2003a,b). The concerned uncertainty sets are known to provide reasonable representations of uncertain input data. From U we draw N random samples and construct a constraint relaxation problem (CRP_N) whose constraints are formed with these random samples. The proposed problem (CRP_N) is shown to give a tighter approximate solution than a simple sampled relaxation problem, both theoretically and experimentally.

Furthermore, the robustness of a solution given by (CRP_N) is measured in a probabilistic setting. Probabilistic robustness analysis and its applications to robust control are extensively discussed in Tempo et al. (2005). The concept of violation probability is applied to sampled relaxation problems in (Calafiore and Campi 2004, 2005, e.g.), and the number of random samples is estimated to guarantee probabilistically that the resulting solution violates only a small portion of the constraints. Furthermore, Kanamori and Takeda (2006) investigated how severely the solution violates each constraint. Extending the probabilistic robustness analysis to (CRP_N), we estimate the number of samples N needed to obtain an approximate solution with a probabilistic guarantee, and assess the approximation error, that is, the difference of optimal values between (RD) and (CRP_N), both a-priori and a-posteriori. Our relaxation algorithm with probabilistic guarantee, presented in this paper for (RD), makes an a-posteriori assessment to evaluate the accuracy of approximate solutions. Numerical results show that only a few iterations of the algorithm are necessary to attain an approximate solution with the required accuracy.

The rest of this paper is organized as follows. Section 2 proposes a relaxation technique for robust deviation optimization problems with uncertainty set U. The resulting relaxation problem, constructed with N random samples from U, gives a tighter lower bound than the simple sampled relaxation problem. Furthermore, in Section 3, a-priori and a-posteriori assessments of the approximation error are introduced to estimate the number of samples N, and then a relaxation algorithm with probabilistic guarantee is presented. In Section 4, we discuss applications of the developed results to the linear least squares problem and a problem in financial mathematics, the maximum Sharpe ratio problem. Finally, Section 5 concludes the paper with some remarks.

2 Relaxation Techniques for Robust Deviation Optimization

We discuss robust deviation optimization problems with two kinds of uncertainty sets U, and derive relaxation problems formulated as second-order cone programming or semidefinite programming problems. A relaxation problem for relative robust optimization with an ellipsoidal uncertainty set is also briefly discussed.

2.1 Robust Deviation Optimization Problem

We consider the following robust deviation optimization problem:
$$\mathrm{(RD)} \quad \min_{x \in X} \max_{(Q, q, \gamma) \in U} f(x; Q, q, \gamma) - f^*(Q, q, \gamma), \tag{1}$$
where
$$f(x; Q, q, \gamma) := x^\top Q x + q^\top x + \gamma, \qquad f^*(Q, q, \gamma) := \min_{x \in X} f(x; Q, q, \gamma).$$

Throughout this paper, we assume that the regions X and U are bounded. Also, suppose that f(x; Q, q, γ) is convex quadratic in x, that is, Q is a positive semidefinite matrix, and that the feasible region X consists of convex quadratic constraints.

For absolute robust optimization problems, several kinds of uncertainty sets U have been proposed in (Ben-Tal and Nemirovski 1998, 1999, Goldfarb and Iyengar 2003a). Krishnamurthy (2004) picked up the so-called polytopic uncertainty set U of Goldfarb and Iyengar (2003a), expressed as the convex hull of ℓ given points:
$$U = \left\{ (Q, q, \gamma) : (Q, q, \gamma) = \sum_{j=1}^{\ell} u_j (Q_j, q_j, \gamma_j), \; Q_j \succeq O, \; j = 1, \dots, \ell, \; u \ge 0, \; \sum_{j=1}^{\ell} u_j = 1 \right\}, \tag{2}$$
and proposed to solve the ℓ convex programs $\min_{x \in X} f(x; Q_j, q_j, \gamma_j)$ to obtain the optimal values $f^*(Q_j, q_j, \gamma_j)$ for all $j = 1, \dots, \ell$. Then, the convex program
$$\begin{array}{ll} \min_{x \in X, \, t} & t \\ \text{s.t.} & f(x; Q_j, q_j, \gamma_j) - f^*(Q_j, q_j, \gamma_j) \le t, \quad j = 1, \dots, \ell, \end{array}$$
is constructed and solved as a convex quadratic optimization problem.
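As an aside, a minimal sketch of this two-stage procedure for the polytopic case follows. The scenario data (Q_j, q_j, γ_j) and the box region standing in for the convex quadratic feasible region X are hypothetical, and cvxpy is used only for illustration; this is not the paper's code.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
n, ell = 4, 3                               # dimensions (illustrative)

# Hypothetical scenario triples (Q_j, q_j, gamma_j) with Q_j PSD.
scen = []
for _ in range(ell):
    V = rng.standard_normal((n, n))
    scen.append((V.T @ V / n, rng.standard_normal(n), float(rng.standard_normal())))

def f_star(Q, q, g):
    """First stage: f*(Q_j, q_j, gamma_j) = min over X of f(x; Q_j, q_j, gamma_j)."""
    x = cp.Variable(n)
    prob = cp.Problem(cp.Minimize(cp.quad_form(x, Q) + q @ x + g),
                      [x >= -1, x <= 1])    # box stands in for X
    prob.solve()
    return prob.value

opt_vals = [f_star(*s) for s in scen]

# Second stage: epigraph form with one deviation constraint per vertex of U.
x, t = cp.Variable(n), cp.Variable()
cons = [x >= -1, x <= 1]
cons += [cp.quad_form(x, Q) + q @ x + g - fs <= t
         for (Q, q, g), fs in zip(scen, opt_vals)]
cp.Problem(cp.Minimize(t), cons).solve()
print("robust deviation value:", t.value)
```

The two stages are independent: each f* solve depends only on its own scenario, so the first stage parallelizes trivially.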

In this paper, we deal with familiar uncertainty sets U: the norm-constrained uncertainty set (Goldfarb and Iyengar 2003a), the ellipsoidal uncertainty set (Ben-Tal and Nemirovski 1998, 1999) and its variant, defined as follows:

• Norm-constrained uncertainty set (Goldfarb and Iyengar 2003a):
$$U = \left\{ (Q, q, \gamma) : (Q, q, \gamma) = (Q_0, q_0, \gamma_0) + \sum_{j=1}^{\ell} u_j (Q_j, q_j, \gamma_j), \; Q_j \succeq O, \; j = 0, 1, \dots, \ell, \; u \in \mathcal{U} \right\}, \tag{3}$$
$$\mathcal{U} := \{ u : u \ge 0, \; \|u\| \le 1 \}. \tag{4}$$


Then, the function f(x; Q, q, γ) is written as
$$f(x, u) := f(x; Q, q, \gamma) = \sum_{j=1}^{\ell} (x^\top Q_j x + q_j^\top x + \gamma_j) u_j + x^\top Q_0 x + q_0^\top x + \gamma_0 = \alpha(x)^\top u + x^\top Q_0 x + q_0^\top x + \gamma_0, \tag{5}$$
where
$$\alpha(x) = (x^\top Q_1 x + q_1^\top x + \gamma_1, \; \dots, \; x^\top Q_\ell x + q_\ell^\top x + \gamma_\ell)^\top. \tag{6}$$

• Ellipsoidal uncertainty set (Ben-Tal and Nemirovski 1998):
$$U = \left\{ (Q, q, \gamma) : Q = \Big(D_0 + \sum_{j=1}^{\ell} u_j D_j\Big)^{\!\top} \Big(D_0 + \sum_{j=1}^{\ell} u_j D_j\Big), \; (q, \gamma) = (q_0, \gamma_0) + \sum_{j=1}^{\ell} u_j (q_j, \gamma_j), \; u \in \mathcal{U} \right\}, \tag{7}$$
$$\mathcal{U} := \{ u : \|u\| \le 1 \}.$$
Then, the function f(x; Q, q, γ) is written as
$$f(x, u) := f(x; Q, q, \gamma) = \sum_{i,j=1}^{\ell} (x^\top D_i^\top D_j x) u_i u_j + \sum_{j=1}^{\ell} \big\{ x^\top (D_0^\top D_j + D_j^\top D_0) x + q_j^\top x + \gamma_j \big\} u_j + x^\top D_0^\top D_0 x + q_0^\top x + \gamma_0 = u^\top D(x) u + r(x)^\top u + \mu(x)^\top u + x^\top D_0^\top D_0 x + q_0^\top x + \gamma_0, \tag{8}$$
where the matrix D(x) has (i, j)th element $x^\top D_i^\top D_j x$, i, j = 1, ..., ℓ, the vector r(x) has $2 x^\top D_0^\top D_j x$, j = 1, ..., ℓ, as its jth element, and the vector µ(x) has $q_j^\top x + \gamma_j$, j = 1, ..., ℓ, as its jth element.

• A variant of the ellipsoidal uncertainty set:
$$U = \left\{ (Q, q, \gamma) : Q = \sum_{k=1}^{m} \Big(D_0^k + \sum_{j=1}^{\ell} u_j^k D_j^k\Big)^{\!\top} \Big(D_0^k + \sum_{j=1}^{\ell} u_j^k D_j^k\Big), \; (q, \gamma) = \sum_{k=1}^{m} \Big( (q_0^k, \gamma_0^k) + \sum_{j=1}^{\ell} u_j^k (q_j^k, \gamma_j^k) \Big), \; u^k = (u_1^k, \dots, u_\ell^k)^\top \in \mathcal{U}^k, \; k = 1, \dots, m \right\}, \tag{9}$$
$$\mathcal{U}^k := \{ u^k : \|u^k\| \le 1 \}.$$
Denoting u¹ ∈ 𝒰¹, ..., u^m ∈ 𝒰^m by u ∈ 𝒰, we have
$$f(x, u) := f(x; Q, q, \gamma) = \sum_{k=1}^{m} \left\{ u^{k\top} D^k(x) u^k + r^k(x)^\top u^k + \mu^k(x)^\top u^k + x^\top D_0^{k\top} D_0^k x + q_0^{k\top} x + \gamma_0^k \right\},$$
where the matrix D^k(x) has (i, j)th element $x^\top D_i^{k\top} D_j^k x$, i, j = 1, ..., ℓ, the vector r^k(x) has $2 x^\top D_0^{k\top} D_j^k x$, j = 1, ..., ℓ, as its jth element, and the vector µ^k(x) has $q_j^{k\top} x + \gamma_j^k$, j = 1, ..., ℓ, as its jth element.


2.2 Relaxation Problems for Robust Deviation Optimization

We rewrite f(x; Q, q, γ) as f(x, u) and f*(Q, q, γ) as f*(u). Also, we refer to 𝒰 as the uncertainty set instead of U. Then, Problem (1) is reformulated as a problem with infinitely many constraints:
$$\mathrm{(RD)} \quad \begin{array}{ll} \min_{x \in X, \, t} & t \\ \text{s.t.} & f(x, u) - f^*(u) \le t, \quad \forall u \in \mathcal{U}. \end{array}$$
When f(x, u) is linear in u as in (5), f(x, u) − f*(u) is convex in u since f*(u) = min_{x∈X} f(x, u) is concave. However, f(x, u) − f*(u) is not necessarily convex in u even if f(x, u) is convex in u as in (8). Furthermore, the infinite number of constraints f(x, u) − f*(u) ≤ t makes (RD) difficult to solve.

As a simple way to construct a relaxation problem of (RD), we prepare a set of random samples $\mathcal{U}^{(N)} := \{u_1, \dots, u_N\}$ drawn from 𝒰, and consider the sampled relaxation problem:
$$\mathrm{(RD}_N) \quad \begin{array}{ll} \min_{x \in X, \, t} & t \\ \text{s.t.} & f(x, u) - f^*(u) \le t, \quad \forall u \in \mathcal{U}^{(N)}. \end{array}$$
Here f*(u) is obtained by solving the convex quadratic optimization problem min_{x∈X} f(x, u) for each random sample u ∈ 𝒰^{(N)}. This problem (RD_N), which consists of a finite number of constraints, can be solved as a convex quadratic optimization problem via existing optimization methods. The constraint set of (RD) includes all constraints of (RD_N), and thus (RD_N) yields a lower bound for (RD). Furthermore, if we take a sufficiently large N, the optimal value of (RD_N) may be sufficiently close to that of (RD), though the number of constraints of (RD_N) then becomes large.
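For concreteness, here is a minimal sketch of assembling and solving (RD_N) for the norm-constrained case (5). The data (Q_j, q_j, γ_j), the box region standing in for X, and the rejection sampler for 𝒰 = {u : u ≥ 0, ‖u‖ ≤ 1} are all assumptions made for illustration, not choices from the paper.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, ell, N = 5, 3, 50                      # dimensions and sample count (illustrative)

# Hypothetical data: Q_j PSD plus linear/constant terms, j = 0, ..., ell.
Qs = []
for _ in range(ell + 1):
    V = rng.standard_normal((n, n))
    Qs.append(V.T @ V / n)                # positive semidefinite
qs = [rng.standard_normal(n) for _ in range(ell + 1)]
gs = rng.standard_normal(ell + 1)

def f(x, u):
    """f(x, u) of (5): affine in u, convex quadratic in x."""
    val = cp.quad_form(x, Qs[0]) + qs[0] @ x + gs[0]
    for j in range(1, ell + 1):
        val += u[j - 1] * (cp.quad_form(x, Qs[j]) + qs[j] @ x + gs[j])
    return val

def f_star(u):
    """f*(u) = min over X of f(x, u), with X a box here for simplicity."""
    x = cp.Variable(n)
    prob = cp.Problem(cp.Minimize(f(x, u)), [x >= -1, x <= 1])
    prob.solve()
    return prob.value

# Rejection sampling from U = {u : u >= 0, ||u|| <= 1}.
samples = []
while len(samples) < N:
    u = rng.random(ell)
    if np.linalg.norm(u) <= 1:
        samples.append(u)

# Sampled relaxation (RD_N): one deviation constraint per sample.
x, t = cp.Variable(n), cp.Variable()
cons = [x >= -1, x <= 1]
cons += [f(x, u) - f_star(u) <= t for u in samples]
cp.Problem(cp.Minimize(t), cons).solve()
print("lower bound opt(RD_N) =", t.value)
```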

In this section, we propose a "nicer" relaxation problem which yields tighter lower bounds than (RD_N). The proposed relaxation problem is also constructed using the random samples u₁, ..., u_N, as is (RD_N). The difference between the two relaxation problems is that our relaxation problem approximates the function f*(u) by a set of tractable functions f(x_i, u), i = 1, ..., N, with optimal solutions x_i of min_{x∈X} f(x, u_i) for the random samples u_i, while (RD_N) approximates the set 𝒰 by the set of points 𝒰^{(N)} = {u₁, ..., u_N}.

Lemma 2.1 Let x_i be an optimal solution of min_{x∈X} f(x, u_i) for random samples u_i, i = 1, ..., N. Then, the constraint relaxation problem
$$\mathrm{(CRP}_N) \quad \begin{array}{ll} \min_{x \in X, \, t} & t \\ \text{s.t.} & f(x, u) - f(x_i, u) \le t, \quad \forall u \in \mathcal{U}, \quad i = 1, \dots, N \end{array}$$
yields a lower bound for (RD). Furthermore, the lower bound is tighter than that of the sampled relaxation problem (RD_N).

Proof: Let ϕ(u) = min_{i=1,...,N} f(x_i, u). Naturally, ϕ(u) ≥ f*(u) holds for all u ∈ 𝒰, since f(x_i, u) ≥ min_{x∈X} f(x, u) = f*(u) holds for i = 1, ..., N and all u ∈ 𝒰. Then, we have
$$\min_{x \in X} \max_{u \in \mathcal{U}} f(x, u) - f^*(u) \ge \min_{x \in X} \max_{u \in \mathcal{U}} f(x, u) - \phi(u),$$
which implies that (RD) is lower bounded by $\min_{x \in X} \max_{u \in \mathcal{U}} f(x, u) - \phi(u)$. One can easily show the equivalence between this relaxation problem and
$$\min_{x \in X} \; \max_{u^{(i)} \in \mathcal{U}, \, i = 1, \dots, N} f(x, u^{(i)}) - f(x_i, u^{(i)}),$$
which is also described as (CRP_N). The relaxation problem (CRP_N) obviously has a tighter lower bound than the sampled relaxation problem (RD_N), since the constraint set of (CRP_N) includes that of (RD_N). Indeed, the constraint f(x, u_i) − f*(u_i) ≤ t of (RD_N) induced from a random sample u_i coincides with the ith constraint of (CRP_N) with fixed u_i ∈ 𝒰, i.e., f(x, u_i) − f(x_i, u_i) ≤ t. □

We denote the optimal value of a problem (•) by opt(•). The above lemma implies the following relation among the three problems: opt(RD_N) ≤ opt(CRP_N) ≤ opt(RD). If the problem (CRP_N) can be transformed into a tractable optimization problem, it may yield a sufficiently tight lower bound for (RD); a small numerical check of the key inequality behind the lemma is sketched below.

We now assume the norm-constrained or ellipsoidal uncertainty set 𝒰 for the uncertain input data, and deal with a quadratic function f(x, u). For each uncertainty set, we transform (CRP_N) into a tractable optimization problem.

2.2.1 Norm-constrained Uncertainty Set

In this section, we focus on the norm-constrained uncertainty set defined in (3) and deal with the robust deviation optimization problem (RD) with the uncertainty set 𝒰 of (4) and the function f(x, u) = α(x)^⊤u + x^⊤Q₀x + q₀^⊤x + γ₀ of (5). Moreover, the positive semidefinite matrices Q_j (j = 0, ..., ℓ) are factorized as Q_j = V_j^⊤ V_j. For the random samples u₁, ..., u_N drawn from 𝒰, let the corresponding solutions be x_i = arg min_{x∈X} f(x, u_i), i = 1, ..., N.

Theorem 2.2 The relaxation problem (CRP_N) of (RD) with the norm-constrained uncertainty set is formulated as the second-order cone programming problem:
$$\begin{array}{ll}
\min_{x \in X, \, t, \, g^1, \dots, g^N, \, \nu} & t \\[2pt]
\text{s.t.} & g^i \ge 0, \\[2pt]
& \left\| \begin{pmatrix} 2 V_j x \\ 1 - g^i_j + q_j^\top x - x_i^\top Q_j x_i - q_j^\top x_i \end{pmatrix} \right\| \le 1 + g^i_j - q_j^\top x + x_i^\top Q_j x_i + q_j^\top x_i, \\[2pt]
& \left\| \begin{pmatrix} 2 V_0 x \\ 1 - \nu \end{pmatrix} \right\| \le 1 + \nu, \\[2pt]
& \|g^i\| \le -\nu - q_0^\top x - \left( \gamma_0 - f^*(u_i) + \alpha(x_i)^\top u_i \right) + t, \\[2pt]
& j = 1, \dots, \ell, \quad i = 1, \dots, N.
\end{array} \tag{10}$$

Proof: The functions f(x_i, u), i = 1, ..., N, are written as
$$f(x_i, u) = (\alpha(x_i)^\top u_i + x_i^\top Q_0 x_i + q_0^\top x_i + \gamma_0) + \alpha(x_i)^\top (u - u_i) = f^*(u_i) + \alpha(x_i)^\top (u - u_i). \tag{11}$$

Then, Lemma 2.1 leads to a relaxation problem of (RD) of the form
$$\begin{array}{ll} \min_{x \in X, \, t} & t \\ \text{s.t.} & f(x, u) - \{ f^*(u_i) + \alpha(x_i)^\top (u - u_i) \} \le t, \quad \forall u \in \mathcal{U}, \quad i = 1, \dots, N. \end{array} \tag{12}$$
The inequality constraint with index i (i = 1, ..., N) in (12) is rewritten as
$$x^\top Q_0 x + q_0^\top x + \left( \gamma_0 - f^*(u_i) + \alpha(x_i)^\top u_i \right) - t + \max_{\|u\| \le 1, \, u \ge 0} \sum_{j=1}^{\ell} u_j (x^\top Q_j x + q_j^\top x - x_i^\top Q_j x_i - q_j^\top x_i) \le 0.$$

Similarly to Lemma 2 of Goldfarb and Iyengar (2003a), the above inequality is transformed into
$$\begin{array}{l}
g^i \ge 0, \\
g^i_j \ge x^\top Q_j x + q_j^\top x - x_i^\top Q_j x_i - q_j^\top x_i, \quad j = 1, \dots, \ell, \\
x^\top Q_0 x + q_0^\top x + \left( \gamma_0 - f^*(u_i) + \alpha(x_i)^\top u_i \right) - t + \|g^i\| \le 0,
\end{array}$$
which is equivalent to
$$\begin{array}{l}
g^i \ge 0, \\
\left\| \begin{pmatrix} 2 V_j x \\ 1 - g^i_j + q_j^\top x - x_i^\top Q_j x_i - q_j^\top x_i \end{pmatrix} \right\| \le 1 + g^i_j - q_j^\top x + x_i^\top Q_j x_i + q_j^\top x_i, \quad j = 1, \dots, \ell, \\
\left\| \begin{pmatrix} 2 V_0 x \\ 1 - \nu \end{pmatrix} \right\| \le 1 + \nu, \\
\|g^i\| \le -\nu - q_0^\top x - \left( \gamma_0 - f^*(u_i) + \alpha(x_i)^\top u_i \right) + t,
\end{array}$$
and the relaxation problem (10) follows. □

2.2.2 Ellipsoidal Uncertainty Set

We consider the uncertainty set (7), which contains quadratic terms of the uncertain parameter u. Then, the robust deviation optimization problem (RD) consists of 𝒰 = {u : ‖u‖ ≤ 1} and f(x, u) = u^⊤D(x)u + r(x)^⊤u + µ(x)^⊤u + x^⊤D₀^⊤D₀x + q₀^⊤x + γ₀.

Theorem 2.3 The relaxation problem (CRP_N) of (RD) with the ellipsoidal uncertainty set is formulated as the semidefinite programming problem:
$$\begin{array}{ll}
\min_{x \in X, \, t, \, \lambda} & t \\[2pt]
\text{s.t.} & \left(\begin{array}{cc}
I & \begin{array}{cccc} D_0 x & D_1 x & \cdots & D_\ell x \end{array} \\[2pt]
\begin{array}{c} (D_0 x)^\top \\ (D_1 x)^\top \\ \vdots \\ (D_\ell x)^\top \end{array} &
\begin{array}{cc} t - \delta_i(x) + f^*(u_i) - \lambda & \frac{1}{2} (r(x_i) + \mu(x_i) - \mu(x))^\top \\ \frac{1}{2} (r(x_i) + \mu(x_i) - \mu(x)) & D(x_i) + \lambda I \end{array}
\end{array}\right) \succeq 0, \\[2pt]
& \lambda \ge 0, \quad i = 1, \dots, N,
\end{array} \tag{13}$$
where $\delta_i(x) = q_0^\top x + \gamma_0 + u_i^\top D(x_i) u_i + r(x_i)^\top u_i + \mu(x_i)^\top u_i$.

Proof: For optimal solutions x_i of min_{x∈X} f(x, u_i), i = 1, ..., N, we have
$$f(x_i, u) = u^\top D(x_i) u + r(x_i)^\top u + \mu(x_i)^\top u + x_i^\top D_0^\top D_0 x_i + q_0^\top x_i + \gamma_0 = f^*(u_i) + u^\top D(x_i) u + r(x_i)^\top u + \mu(x_i)^\top u - (u_i^\top D(x_i) u_i + r(x_i)^\top u_i + \mu(x_i)^\top u_i).$$
The ith constraint of (CRP_N) is described as
$$f(x, u) - f(x_i, u) = u^\top (D(x) - D(x_i)) u + (r(x) - r(x_i) + \mu(x) - \mu(x_i))^\top u + x^\top D_0^\top D_0 x + \underbrace{(q_0^\top x + \gamma_0 + u_i^\top D(x_i) u_i + r(x_i)^\top u_i + \mu(x_i)^\top u_i)}_{\delta_i(x)} - f^*(u_i) \le t, \quad \forall u \in \mathcal{U}. \tag{14}$$

(14)

We equivalently transform the constraint (14) into

u⊤(D(xi)−D(x))u + τ(r(xi)− r(x) + µ(xi)− µ(x))⊤u

+τ2(t− x⊤D⊤0 D0x− δi(x) + f∗(ui)) ≥ 0,

for all (τ,u) satisfying ‖u‖2 ≤ τ2 by multiplying τ 2 to (14) and regarding τu as u. The constraint

can be interpreted as follows: for all (τ,u) satisfying (τ,u⊤)

[1 0⊤

0 −I

](τ

u

)≥ 0,

(τ u⊤)

[t− δi(x) + f∗(ui)

12(r(xi) + µ(xi)− µ(x))⊤

12(r(xi) + µ(xi)− µ(x)) D(xi)

](τ

u

)

−(τ u⊤) [D0x D1x . . . Dℓx]⊤ [D0x D1x . . . Dℓx]

u

)

≥ 0

follows. Utilizing S-lemma, this interpretation is equivalent to the existence of λ ≥ 0 such as[

t− δi(x) + f∗(ui)− λ 12 (r(xi) + µ(xi)− µ(x))⊤

12 (r(xi) + µ(xi)− µ(x)) D(xi) + λI

]

− [D0x . . . Dℓx]⊤ [D0x . . . Dℓx] � 0.

(15)

The Schur complement procedure transforms (15) into the linear matrix inequality of the ith constraint of problem (13). □

The problem (13) consists of linear matrix inequality constraints, and therefore we can solve it by interior point methods.

Remark 2.4 Consider the variant (9) of the ellipsoidal uncertainty set. Then, we have
$$f(x, u) = \sum_{k=1}^{m} \left\{ u^{k\top} D^k(x) u^k + r^k(x)^\top u^k + \mu^k(x)^\top u^k + x^\top D_0^{k\top} D_0^k x + q_0^{k\top} x + \gamma_0^k \right\}.$$
The random samples $u_1^k, \dots, u_N^k$ are drawn from $\mathcal{U}^k$, k = 1, ..., m. Here, the m-tuple of random samples drawn from $\mathcal{U}^1, \dots, \mathcal{U}^m$ is denoted by $u_i := (u_i^1, \dots, u_i^m)$, i = 1, ..., N. With optimal solutions $x_i = \arg\min_x f(x, u_i)$, we have
$$f(x_i, u) = f(x_i, (u^1, \dots, u^m)) = f^*(u_i) + \sum_{k=1}^{m} \left\{ u^{k\top} D^k(x_i) u^k + r^k(x_i)^\top u^k + \mu^k(x_i)^\top u^k - (u_i^{k\top} D^k(x_i) u_i^k + r^k(x_i)^\top u_i^k + \mu^k(x_i)^\top u_i^k) \right\}.$$
Then, similarly to (14), the ith constraint of (CRP_N) is described as
$$f(x, u) - f(x_i, u) = \sum_{k=1}^{m} \left\{ u^{k\top} (D^k(x) - D^k(x_i)) u^k + (r^k(x) - r^k(x_i) + \mu^k(x) - \mu^k(x_i))^\top u^k + x^\top D_0^{k\top} D_0^k x + \delta_i^k(x) \right\} - f^*(u_i) \le t, \quad \forall u^1 \in \mathcal{U}^1, \dots, \forall u^m \in \mathcal{U}^m,$$
where $\delta_i^k(x) := q_0^{k\top} x + \gamma_0^k + u_i^{k\top} D^k(x_i) u_i^k + r^k(x_i)^\top u_i^k + \mu^k(x_i)^\top u_i^k$. This is equivalent to
$$\begin{array}{l}
\sum_{k=1}^{m} s_i^k - f^*(u_i) \le t, \\
u^{k\top} (D^k(x) - D^k(x_i)) u^k + (r^k(x) - r^k(x_i) + \mu^k(x) - \mu^k(x_i))^\top u^k + x^\top D_0^{k\top} D_0^k x + \delta_i^k(x) \le s_i^k, \\
\forall u^1 \in \mathcal{U}^1, \dots, \forall u^m \in \mathcal{U}^m, \quad i = 1, \dots, N,
\end{array}$$
and therefore the relaxation problem results in
$$\begin{array}{ll}
\min_{x \in X, \, t, \, s_1^1, \dots, s_N^m, \, \lambda^1, \dots, \lambda^m} & t \\[2pt]
\text{s.t.} & \sum_{k=1}^{m} s_i^k - f^*(u_i) \le t, \\[2pt]
& \left(\begin{array}{cc}
I & \begin{array}{cccc} D_0^k x & D_1^k x & \cdots & D_\ell^k x \end{array} \\[2pt]
\begin{array}{c} (D_0^k x)^\top \\ (D_1^k x)^\top \\ \vdots \\ (D_\ell^k x)^\top \end{array} &
\begin{array}{cc} s_i^k - \delta_i^k(x) - \lambda^k & \frac{1}{2} (r^k(x_i) + \mu^k(x_i) - \mu^k(x))^\top \\ \frac{1}{2} (r^k(x_i) + \mu^k(x_i) - \mu^k(x)) & D^k(x_i) + \lambda^k I \end{array}
\end{array}\right) \succeq 0, \\[2pt]
& \lambda^k \ge 0, \quad k = 1, \dots, m, \quad i = 1, \dots, N.
\end{array} \tag{16}$$


2.3 Extension to Relative Robust Optimization

Suppose that f*(Q, q, γ) > 0 for all (Q, q, γ) ∈ U, and consider the relative robust optimization problem:
$$\min_{x \in X} \max_{(Q, q, \gamma) \in U} \frac{f(x; Q, q, \gamma) - f^*(Q, q, \gamma)}{f^*(Q, q, \gamma)}. \tag{17}$$
It should be noted that the numerator satisfies f(x; Q, q, γ) − f*(Q, q, γ) ≥ 0, so the objective function is nonnegative on the feasible region. As for the robust deviation optimization problem, we assume that f(x; Q, q, γ) := x^⊤Qx + q^⊤x + γ and Q is a positive semidefinite matrix. When the uncertain data (Q, q, γ) of f(x; Q, q, γ) are linearly perturbed with parameter u as in (2) and (3), f*(Q, q, γ) is concave and the function f(x; Q, q, γ) − f*(Q, q, γ) is convex in u. Therefore, the objective function of (17) is quasiconvex in u, and thus, for the polytopic uncertainty set U expressed as the convex hull of ℓ given points as in (2), the relative robust optimization problem (17) can be solved via the convex quadratic optimization problem:
$$\begin{array}{ll} \min_{x \in X, \, t} & t \\ \text{s.t.} & f(x; Q_j, q_j, \gamma_j) - (1 + t) f^*(Q_j, q_j, \gamma_j) \le 0, \quad j = 1, \dots, \ell. \end{array}$$
However, (17) with the norm-constrained or ellipsoidal uncertainty set is generally intractable.

As with the relaxation problems for robust deviation optimization, we prepare approximating functions for f*(Q, q, γ) and construct a tractable relaxation problem for relative robust optimization. We now deal with the norm-constrained uncertainty set U of (3), and use the notation f(x, u) instead of f(x; Q, q, γ) and f*(u) instead of f*(Q, q, γ).

Theorem 2.5 A relaxation problem of the relative robust optimization problem (17) with the norm-constrained uncertainty set is formulated as the second-order cone programming problem:
$$\begin{array}{ll}
\min_{x \in X, \, t, \, g^1, \dots, g^N, \, \nu} & t \\[2pt]
\text{s.t.} & g^i \ge 0, \\[2pt]
& \left\| \begin{pmatrix} 2 V_j x \\ 1 - g^i_j + q_j^\top x - x_i^\top Q_j x_i - q_j^\top x_i - \delta^i_j t \end{pmatrix} \right\| \le 1 + g^i_j - q_j^\top x + x_i^\top Q_j x_i + q_j^\top x_i + \delta^i_j t, \\[2pt]
& \left\| \begin{pmatrix} 2 V_0 x \\ 1 - \nu \end{pmatrix} \right\| \le 1 + \nu, \\[2pt]
& \|g^i\| \le -\nu - q_0^\top x - (\alpha(x_i)^\top u_i - f^*(u_i)) t - (\gamma_0 + \alpha(x_i)^\top u_i - f^*(u_i)), \\[2pt]
& j = 1, \dots, \ell, \quad i = 1, \dots, N,
\end{array} \tag{18}$$
where $\delta^i_j = x_i^\top Q_j x_i + q_j^\top x_i + \gamma_j$.

Proof: Let x_i be an optimal solution of min_{x∈X} f(x, u_i) for i = 1, ..., N. Note that Problem (17) is transformed into
$$\min_{x \in X, \, t} \; t \quad \text{s.t.} \quad f(x, u) - (1 + t) f^*(u) \le 0, \quad \forall u \in \mathcal{U}. \tag{19}$$
Applying the relation f(x_i, u) ≥ f*(u), i = 1, ..., N, and the expression (11) for f(x_i, u), which involves the vector α(x) of (6), to the constraint of (19), we obtain the following inequalities for i = 1, ..., N:
$$\begin{array}{l}
f(x, u) - (1 + t) f^*(u) \\
\ge f(x, u) - (1 + t) \{ f^*(u_i) + \alpha(x_i)^\top (u - u_i) \} \\
= x^\top Q_0 x + q_0^\top x + (\alpha(x_i)^\top u_i - f^*(u_i)) t + (\gamma_0 + \alpha(x_i)^\top u_i - f^*(u_i)) \\
\quad + \max_{\|u\| \le 1, \, u \ge 0} \sum_{j=1}^{\ell} u_j \Big( x^\top Q_j x + q_j^\top x - x_i^\top Q_j x_i - q_j^\top x_i - \underbrace{(x_i^\top Q_j x_i + q_j^\top x_i + \gamma_j)}_{\delta^i_j} \, t \Big).
\end{array}$$
Therefore, we construct a relaxation problem whose constraints are
$$x^\top Q_0 x + q_0^\top x + (\alpha(x_i)^\top u_i - f^*(u_i)) t + (\gamma_0 + \alpha(x_i)^\top u_i - f^*(u_i)) + \max_{\|u\| \le 1, \, u \ge 0} \sum_{j=1}^{\ell} u_j (x^\top Q_j x + q_j^\top x - x_i^\top Q_j x_i - q_j^\top x_i - \delta^i_j t) \le 0,$$
which is equivalent to
$$\begin{array}{l}
g^i \ge 0, \\
\left\| \begin{pmatrix} 2 V_j x \\ 1 - g^i_j + q_j^\top x - x_i^\top Q_j x_i - q_j^\top x_i - \delta^i_j t \end{pmatrix} \right\| \le 1 + g^i_j - q_j^\top x + x_i^\top Q_j x_i + q_j^\top x_i + \delta^i_j t, \quad j = 1, \dots, \ell, \\
\left\| \begin{pmatrix} 2 V_0 x \\ 1 - \nu \end{pmatrix} \right\| \le 1 + \nu, \\
\|g^i\| \le -\nu - q_0^\top x - (\alpha(x_i)^\top u_i - f^*(u_i)) t - (\gamma_0 + \alpha(x_i)^\top u_i - f^*(u_i)),
\end{array}$$
and the relaxation problem (18) follows. □

Similarly to Lemma 2.1, the relaxation problem (18) for the relative robust optimization problem (17) also has a better lower bound than the sampled relaxation problem:
$$\begin{array}{ll} \min_{x \in X, \, t} & t \\[2pt] \text{s.t.} & \dfrac{f(x, u_i) - f^*(u_i)}{f^*(u_i)} \le t, \quad i = 1, \dots, N. \end{array}$$

3 Estimation of Probabilistic Approximation Error

In the previous section, we chose N samples randomly from the given uncertainty set and constructed relaxation problems (CRP_N) for robust deviation optimization problems (RD) with two different types of uncertainty sets 𝒰. Also, the relaxation problem (18) was proposed for relative robust optimization. Noticing that the lower bound achieved by (CRP_N) depends on the drawn random samples, we consider how many samples N we should take to obtain an approximate solution with a probabilistic guarantee. Here we focus on (RD), but the same discussion holds for relative robust optimization.


3.1 Convergence Properties

Here, we consider the sampled relaxation problem (RD_N), which is constructed using N independently and identically distributed (iid) random samples 𝒰^{(N)} = {u₁, ..., u_N} from the uncertainty set 𝒰. A direct application of Theorem 4.1 of Kanamori and Takeda (2006) leads to Proposition 3.1 and Theorem 3.2, which ensure that the sampled relaxation problem (RD_N) converges to (RD). In their paper, an increasing function q(δ) for 0 ≤ δ ≤ B is defined as follows:
$$q(\delta) := \frac{1}{V_\ell(1)} \, D_\ell\!\left(\frac{\delta}{L}, 1\right), \qquad
\begin{array}{ll}
B := L\sqrt{\ell + 1} & \text{(norm-constrained uncertainty (3))}, \\
B := 2L & \text{(ellipsoidal uncertainty (7))}.
\end{array}$$
Here $V_\ell(r)$ denotes the volume of the ℓ-dimensional hypersphere with radius r, and $D_\ell(r, s)$ is defined as
$$D_\ell(r, s) = V_{\ell-1}(1) \left\{ s^\ell \int_0^{\cos^{-1}\!\left(1 - \frac{r^2}{2 s^2}\right)} (\sin x)^\ell \, dx + r^\ell \int_{\cos^{-1}\!\left(-\frac{r}{2s}\right)}^{\pi} (\sin x)^\ell \, dx \right\}.$$

Let L be a Lipschitz constant satisfying
$$| \{ f(x, u) - f^*(u) \} - \{ f(x, v) - f^*(v) \} | \le L \|u - v\| \quad \text{for } \forall x \in X \text{ and } \forall u, v \in \mathcal{U}.$$

Let (x*_N, t*_N) be an optimal solution of the sampled relaxation problem (RD_N) with random samples in 𝒰^{(N)}. Assertion (i) of the following proposition follows directly from Corollary 1 of Calafiore and Campi (2004). The probability Pr{u ∈ 𝒰 : f(x*_N, u) − f*(u) > t*_N} in the assertion is called the violation probability (Calafiore and Campi 2004, 2005, e.g.). If a uniform probability distribution is assumed on 𝒰, the violation probability is calculated from the volume of the set {u ∈ 𝒰 : f(x*_N, u) − f*(u) > t*_N}. From Kanamori and Takeda (2006), we see that under the uniform distribution over 𝒰, the function q(δ) satisfies
$$q(\delta) \le \Pr\left\{ u \in \mathcal{U} : \max_{v \in \mathcal{U}} \{ f(x, v) - f^*(v) \} - \delta < f(x, u) - f^*(u) \right\} \tag{20}$$
for all x ∈ X. The uniform distribution can be replaced by another probability distribution satisfying some regularity conditions, but for the sake of simplicity, the uniform distribution is assumed.

Proposition 3.1 (Kanamori and Takeda (2006)) Let ε ∈ (0, q(B)), η ∈ (0, 1) and
$$N \ge N(\epsilon, \eta) := \frac{2}{\epsilon} \log \frac{1}{\eta} + 2 \dim(x) + \frac{2 \dim(x)}{\epsilon} \log \frac{2}{\epsilon},$$
where dim(x) denotes the dimension of x. An optimal solution (x*_N, t*_N) of the sampled relaxation problem (RD_N) satisfies the following inequalities simultaneously with probability at least 1 − η:
$$\text{(i)} \quad \Pr\{ u \in \mathcal{U} : f(x^*_N, u) - f^*(u) > t^*_N \} \le \epsilon,$$
$$\text{(ii)} \quad \min_{x \in X} \max_{u \in \mathcal{U}} \{ f(x, u) - f^*(u) \} - \min_{x \in X} \max_{u \in \mathcal{U}^{(N)}} \{ f(x, u) - f^*(u) \} \le q^{-1}(\epsilon).$$

Note that the value
$$\min_{x \in X} \max_{u \in \mathcal{U}^{(N)}} \{ f(x, u) - f^*(u) \}$$
in (ii) corresponds to the optimal value t*_N of (RD_N) at an optimal solution x*_N. Therefore, assertion (ii) of this proposition implies that the approximation error between the robust deviation optimization problem (RD) and its sampled relaxation problem (RD_N) is guaranteed to be within q⁻¹(ε) with probability at least 1 − η.
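For reference, the sample-size bound N(ε, η) of Proposition 3.1 is a one-line computation. The sketch below uses the Sharpe-ratio setting of Section 4 (dim(x) = 20, η = 0.01) and roughly reproduces the N column of Table 1; the paper fixes N and solves for ε, so exact agreement is not expected.

```python
import math

def sample_size(eps: float, eta: float, dim_x: int) -> int:
    """N(eps, eta) = (2/eps) log(1/eta) + 2 dim(x) + (2 dim(x)/eps) log(2/eps)."""
    n = (2.0 / eps) * math.log(1.0 / eta) + 2 * dim_x \
        + (2.0 * dim_x / eps) * math.log(2.0 / eps)
    return math.ceil(n)

for eps in (0.787, 0.125, 0.020):
    print(f"eps = {eps}: N(eps, eta) = {sample_size(eps, 0.01, 20)}")
```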

Only the estimation of the Lipschitz constant L remains for constructing the function q(δ). Taking the case of the function f(x, u) for the norm-constrained uncertainty set (3), we briefly show how to evaluate the Lipschitz constant L. Note that f(x, u) is described as f(x, u) = α(x)^⊤u + x^⊤Q₀x + q₀^⊤x + γ₀ with the notation α(x) of (6). Then we have
$$| \{ f(x, u_1) - f^*(u_1) \} - \{ f(x, u_2) - f^*(u_2) \} | \le | f(x, u_1) - f(x, u_2) | + | f^*(u_1) - f^*(u_2) |,$$
and the first term of the right-hand side is upper bounded by $\{ \max_{x \in X} \|\alpha(x)\| \} \, \|u_1 - u_2\|$. Also, considering the relations
$$f^*(u_1) = f(x_1, u_1) \le f(x_2, u_1), \qquad f^*(u_2) = f(x_2, u_2) \le f(x_1, u_2)$$
for optimal solutions x_i of min_{x∈X} f(x, u_i) = f*(u_i), i = 1, 2, we see that
$$| f^*(u_1) - f^*(u_2) | \le \max\{ \|\alpha(x_1)\|, \|\alpha(x_2)\| \} \, \|u_1 - u_2\|.$$
Therefore, L = 2 × { max_{x∈X} ‖α(x)‖ } serves as a Lipschitz constant; it can be easily estimated using the maximum eigenvalues of the matrices Q_j, j = 1, ..., ℓ, a sufficiently large value r_x such that max_{x∈X} ‖x‖ ≤ r_x, and so on. For other kinds of uncertainty sets, we estimate L similarly.
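As a concrete reading of this estimate, the small sketch below bounds each component of α(x) by λ_max(Q_j)·r_x² + ‖q_j‖·r_x + |γ_j| for ‖x‖ ≤ r_x, and returns 2·‖(component bounds)‖ ≥ 2·max_x ‖α(x)‖; the data layout and radius r_x are assumptions, not taken from the paper.

```python
import numpy as np

def lipschitz_estimate(Qs, qs, gs, r_x):
    """Upper bound on L = 2 * max_{||x|| <= r_x} ||alpha(x)||.

    Qs, qs, gs hold the perturbation data (Q_j, q_j, gamma_j), j = 1, ..., ell.
    Each component satisfies
    |x^T Q_j x + q_j^T x + gamma_j| <= lam_max(Q_j) r_x^2 + ||q_j|| r_x + |gamma_j|.
    """
    comp_bounds = [
        np.linalg.eigvalsh(Q)[-1] * r_x**2 + np.linalg.norm(q) * r_x + abs(g)
        for Q, q, g in zip(Qs, qs, gs)
    ]
    # ||alpha(x)|| <= ||(bounds)|| componentwise, hence the bound on L.
    return 2.0 * np.linalg.norm(comp_bounds)
```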

Assertion (ii) indicates that the inequality opt(RD) − opt(RD_N) ≤ q⁻¹(ε) holds probabilistically. Moreover, Lemma 2.1 ensures that opt(RD) − opt(CRP_N) ≤ opt(RD) − opt(RD_N) always holds under the same random samples 𝒰^{(N)}, and thus we have
$$\Pr{}^{N}\!\left\{ \mathrm{opt(RD)} - \mathrm{opt(CRP}_N) \le q^{-1}(\epsilon) \right\} \ge \Pr{}^{N}\!\left\{ \mathrm{opt(RD)} - \mathrm{opt(RD}_N) \le q^{-1}(\epsilon) \right\} \ge 1 - \eta, \tag{21}$$
where Pr^N{···} denotes the probability over the N independent random samples 𝒰^{(N)}. Therefore, (ii) also holds for a solution of (CRP_N).

The proposition leads to convergence properties of the relaxation problems (RD_N) and (CRP_N) to (RD).

Theorem 3.2 The optimal value of (RD_N) converges in probability to that of (RD), i.e., for all δ ∈ (0, B),
$$\lim_{N \to \infty} \Pr{}^{N}\!\left\{ \min_{x \in X} \max_{u \in \mathcal{U}} \{ f(x, u) - f^*(u) \} - \min_{x \in X} \max_{u \in \mathcal{U}^{(N)}} \{ f(x, u) - f^*(u) \} > \delta \right\} = 0.$$
The optimal value of the relaxation problem (CRP_N) also converges in probability to that of (RD).

Proof: From (21), we have
$$\Pr{}^{N}\!\left\{ \mathrm{opt(RD)} - \mathrm{opt(CRP}_N) > q^{-1}(\epsilon) \right\} \le \eta, \qquad \Pr{}^{N}\!\left\{ \mathrm{opt(RD)} - \mathrm{opt(RD}_N) > q^{-1}(\epsilon) \right\} \le \eta.$$
It is possible to obtain η satisfying ⌈N(ε, η)⌉ = N implicitly when some probability ε and the number of samples N are given. That is, under a fixed parameter ε > 0, there exists a sequence η_N → 0 as N → ∞. Since δ = q⁻¹(ε) takes values in (0, B) for ε ∈ (0, q(B)), the statement of the theorem follows. □

This theorem means that if we take a sufficiently large number N, the relaxation problem (CRP_N) yields a lower bound almost equal to opt(RD).

3.2 Relaxation Algorithm with Probabilistic Guarantee

Proposition 3.1 determines the number of samples N so that the approximation error q⁻¹(ε) between (RD) and (CRP_N) is guaranteed theoretically. However, when a small approximation error is required, it is difficult from a practical point of view to solve the relaxation problem (CRP_N), which then includes many constraints induced from the N samples.

In this section, we introduce a practical iterative algorithm for (RD) that solves relaxation problems (CRP_N) with N less than the theoretical number N(ε, η). The number of samples N is gradually increased and the resulting relaxation problems (CRP_N) are solved until a sufficiently tight lower bound is obtained for (RD). As the criterion to terminate, that is, to decide that the lower bound is sufficiently tight, we adopt an a-posteriori assessment of the worst violation via Monte-Carlo techniques. Let an optimal solution of (CRP_N) be (x*_N, t*_N). Then, constraint violation of (RD) may possibly occur, i.e., f(x*_N, u) − f*(u) − t*_N > 0 for some u ∈ 𝒰. We construct a set of random samples 𝒰^{(M)} from 𝒰 with a sufficiently large number M, and evaluate the violation in the worst case: $\max_{u \in \mathcal{U}^{(M)}} \{ f(x^*_N, u) - f^*(u) - t^*_N \}$. Our algorithm terminates when the worst violation becomes small enough to neglect. In this procedure, we obtain an approximate solution (x*_N, t*_N) with a probabilistic guarantee if the number of random samples M is determined properly.

Theorem 3.3 Let (x*_N, t*_N) be an optimal solution of (CRP_N), and let 𝒰^{(M)} := {u₁, ..., u_M} be a set of M (≥ ⌈ln η / ln(1 − q(δ))⌉) random samples from 𝒰 for a confidence parameter η ∈ (0, 1) and a permissible error δ ∈ (0, B]. For the worst-case violation among the M samples,
$$\beta_M = \max_{u \in \mathcal{U}^{(M)}} \{ f(x^*_N, u) - f^*(u) - t^*_N \},$$
we have
$$\mathrm{opt(RD)} - \mathrm{opt(CRP}_N) = \min_{x \in X} \max_{u \in \mathcal{U}} \{ f(x, u) - f^*(u) \} - t^*_N < \beta_M + \delta \tag{22}$$
with probability at least 1 − η.

Proof: Using the inequality (20), we have
$$\begin{array}{l}
\Pr{}^{M}\!\left\{ \max_{u \in \mathcal{U}} \{ f(x, u) - f^*(u) \} - \delta < \max_{u \in \mathcal{U}^{(M)}} \{ f(x, u) - f^*(u) \} \right\} \\[2pt]
\quad = 1 - \prod_{i=1}^{M} \Pr\left\{ \max_{u \in \mathcal{U}} \{ f(x, u) - f^*(u) \} - \delta \ge f(x, u_i) - f^*(u_i) \right\} \\[2pt]
\quad \ge 1 - (1 - q(\delta))^M \ge 1 - \eta.
\end{array}$$
This implies that
$$\max_{u \in \mathcal{U}} \{ f(x, u) - f^*(u) \} < \max_{u \in \mathcal{U}^{(M)}} \{ f(x, u) - f^*(u) \} + \delta$$
holds with probability at least 1 − η for any x ∈ X. Now, utilizing (x*_N, t*_N), we subtract t*_N from both sides of the inequality and substitute x*_N for x. Then we have
$$\max_{u \in \mathcal{U}} \{ f(x^*_N, u) - f^*(u) \} - t^*_N < \beta_M + \delta.$$
Since $\min_{x \in X} \max_{u \in \mathcal{U}} \{ f(x, u) - f^*(u) \} \le \max_{u \in \mathcal{U}} \{ f(x^*_N, u) - f^*(u) \}$ holds, (22) follows. □
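The a-posteriori check of Theorem 3.3 amounts to a Monte-Carlo maximization. A sketch follows, in which f_value, f_star and draw_sample are hypothetical callbacks for the problem at hand, and q(δ) is assumed to have been evaluated beforehand.

```python
import math

def required_M(eta: float, q_delta: float) -> int:
    """Sample size M >= ln(eta) / ln(1 - q(delta)) from Theorem 3.3."""
    return math.ceil(math.log(eta) / math.log(1.0 - q_delta))

def worst_violation(x_star, t_star, draw_sample, f_value, f_star, M: int) -> float:
    """beta_M = max over M samples of f(x*, u) - f*(u) - t*."""
    beta = -math.inf
    for _ in range(M):
        u = draw_sample()
        beta = max(beta, f_value(x_star, u) - f_star(u) - t_star)
    return beta

# Example: eta = 0.01 with a hypothetical value q(delta) = 0.002 gives
# M = required_M(0.01, 0.002) = 2301 samples.
```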

Kanamori and Takeda (2006) proved the statement of Theorem 3.3 for an optimal solution (x*_N, t*_N) of the sampled relaxation problem (RD_N). It should be noted that, as the above proof shows, the statement holds not only for an optimal solution of the relaxation problem (RD_N) or (CRP_N), but also for any approximate solution x ∈ X with approximate value t. We propose to use an optimal solution (x*_N, t*_N) of (CRP_N) instead of that of (RD_N), since the worst-case violation β_M of (CRP_N) is expected to be considerably smaller than that of (RD_N).

Note that (22) of Theorem 3.3, like (21), evaluates the approximation error opt(RD) − opt(CRP_N) theoretically. The distinction between (21) and (22) is as follows: (21) makes an a-priori assessment of the approximation error as q⁻¹(ε), while (22) makes an a-posteriori assessment as β_M + δ. Concretely, in the case of the a-priori assessment, before solving the relaxation problem (CRP_N), the number of random samples N can be determined for the required accuracy q⁻¹(ε). In the case of the a-posteriori assessment, once a solution (x*_N, t*_N) has been computed, the approximation error β_M + δ of the solution is evaluated with the use of Monte-Carlo techniques. It is expected that β_M + δ is far smaller than q⁻¹(ε), since β_M + δ is estimated for the given solution (x*_N, t*_N). Moreover, in the a-posteriori assessment of Kanamori and Takeda (2006), the function q(δ, x*_N), defined for the solution x*_N, is used instead of q(δ) for the evaluation of M. To form q(δ, x*_N), the Lipschitz constant L of q(δ) is replaced with $L_{x^*_N}$, which satisfies
$$| \{ f(x^*_N, u) - f^*(u) \} - \{ f(x^*_N, v) - f^*(v) \} | \le L_{x^*_N} \|u - v\| \quad \text{for } \forall u, v \in \mathcal{U}. \tag{23}$$
$L_{x^*_N}$ can be smaller than L, and thus the number M of necessary random samples is decreased.

In the following algorithm, the a-posteriori assessment is carried out for an optimal solution (x*_N, t*_N) of (CRP_N). When we set values of η and δ close to 0, we need to solve min_{x∈X} f(x, u), u ∈ 𝒰^{(M)}, for a large number M, and repeat the function evaluations f(x*_N, u) − f*(u) − t*_N, u ∈ 𝒰^{(M)}, many times. If opt(RD) − opt(CRP_N) < β_M + δ is ensured with high confidence 1 − η, and furthermore β_M + δ is sufficiently small at an approximate solution (x*_N, t*_N), we regard (x*_N, t*_N) as almost optimal for the robust deviation optimization problem (RD).

Algorithm 3.4
Input: ζ > 0 (threshold for β_M), δ ∈ (0, B], η ∈ (0, 1) and an initial number of random samples N₀.
Output: An almost optimal solution (x*_N, t*_N), whose approximation error is guaranteed to be less than ζ + δ with probability at least 1 − η.

Step 0: Construct 𝒰^{(N₀)} = {u₁, u₂, ..., u_{N₀}} via random sampling from 𝒰. Let N ← N₀, and go to Step 1.

Step 1: Construct a relaxation problem (CRP_N) using 𝒰^{(N)} and let (x*_N, t*_N) be its optimal solution. Go to Step 2.

Step 2: For the sample size M = ⌈ln η / ln(1 − q(δ, x*_N))⌉, prepare another set of random samples 𝒰^{(M)} = {u₁, u₂, ..., u_M} from 𝒰. Then, compute
$$\beta_M = \max_{u \in \mathcal{U}^{(M)}} \{ f(x^*_N, u) - f^*(u) - t^*_N \}.$$
If β_M > ζ, then go to Step 3. Otherwise go to Step 4.

Step 3: Define the subset 𝒱 := {u ∈ 𝒰^{(M)} : f(x*_N, u) − f*(u) − t*_N > 0} of 𝒰^{(M)}. Let V = |𝒱|, 𝒰^{(N+V)} ← 𝒰^{(N)} ∪ 𝒱 and N ← N + V. Go to Step 1.

Step 4: Terminate with the almost optimal solution (x*_N, t*_N).
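A minimal sketch of the control flow of Algorithm 3.4 is given below; solve_crp (solving (CRP_N) for the current sample set), q_delta_x (evaluating q(δ, x*_N)), draw_sample, f_value and f_star are hypothetical problem-specific callbacks, not code from the paper.

```python
import math

def relaxation_algorithm(zeta, delta, eta, N0,
                         draw_sample, solve_crp, f_value, f_star, q_delta_x):
    """Sketch of Algorithm 3.4: iterative relaxation with a-posteriori check."""
    samples = [draw_sample() for _ in range(N0)]          # Step 0
    while True:
        x_star, t_star = solve_crp(samples)               # Step 1: solve (CRP_N)
        # Step 2: a-posteriori Monte-Carlo assessment with M fresh samples.
        M = math.ceil(math.log(eta) / math.log(1.0 - q_delta_x(delta, x_star)))
        test = [draw_sample() for _ in range(M)]
        viol = [f_value(x_star, u) - f_star(u) - t_star for u in test]
        beta_M = max(viol)
        if beta_M <= zeta:                                # Step 4: accept
            return x_star, t_star
        # Step 3: add the violated samples to the constraint set and re-solve.
        samples += [u for u, v in zip(test, viol) if v > 0]
```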

4 Numerical Results

As simple examples of robust deviation optimization problems, the linear least squares problem and the maximum Sharpe ratio problem are considered. We compare the accuracy of the lower bounds obtained by (RD_N) and (CRP_N) for these application problems. Then, a-priori and a-posteriori assessments are carried out for the maximum Sharpe ratio problem. All computations were conducted on an Opteron 850 (2.4 GHz) with 8 GB of physical memory and 1 MB of L2 cache, running SuSE Linux Enterprise Server 9.

4.1 Robust Deviation Linear Least Squares Problem

The linear least squares problem finds the best-fitting line $y = \sum_{i=1}^{n} a_i x_i + x_{n+1}$ through a set of points. Suppose that we have m points (a₁, b₁), ..., (a_m, b_m), where a_k ∈ ℝⁿ and b_k ∈ ℝ (k = 1, ..., m), and formulate the linear least squares problem as
$$\min_{x \in \mathbb{R}^{n+1}} \|Ax - b\|^2, \tag{24}$$
where
$$A = \begin{pmatrix} a_1^\top & 1 \\ \vdots & \vdots \\ a_m^\top & 1 \end{pmatrix} \in \mathbb{R}^{m \times (n+1)} \quad \text{and} \quad b = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix} \in \mathbb{R}^m.$$
If the data A and b are available, an optimal solution of (24) is easily found as
$$x^* = (A^\top A)^{-1} A^\top b. \tag{25}$$
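As a numerical aside, (25) can be checked in a few lines of numpy on synthetic data (the points below are hypothetical); in practice numpy.linalg.lstsq is preferred over forming the normal equations explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 2))                    # hypothetical m = 8 points in R^2
b = a @ np.array([1.0, -0.5]) + 0.3 + 0.01 * rng.standard_normal(8)

A = np.hstack([a, np.ones((8, 1))])                # append the intercept column
x_normal = np.linalg.solve(A.T @ A, A.T @ b)       # x* = (A^T A)^{-1} A^T b of (25)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)    # numerically stabler equivalent
assert np.allclose(x_normal, x_lstsq)
```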

In some applications of least squares problems, the problem data (A, b) may include measurement error. In order to reduce the sensitivity of the decision x to perturbations in the data, the papers (El Ghaoui and Lebret 1997, Goldfarb and Iyengar 2003a) proposed several types of absolute robust optimization problems for the linear least squares problem. For simplicity of notation, we describe Problem (24) as
$$\min_{x \in \mathbb{R}^{n+2}} \|Dx\|^2 \quad \text{s.t.} \quad x_{n+2} = 1,$$
where D = [A, −b]. The absolute robust optimization problems treat the coefficient matrix D as uncertain input data, and focus on the worst case of D ∈ U, where the regression error ‖Dx‖² becomes largest.

In this section we take notice of the worst-case deviation from optimality and consider the robust deviation optimization problem for the linear least squares problem:
$$\min_x \max_{D \in U} \|Dx\|^2 - f^*(D) \quad \text{s.t.} \quad x_{n+2} = 1, \tag{26}$$
where $f^*(D) = \min_x \|Dx\|^2$ s.t. $x_{n+2} = 1$; f*(D) is easily obtained from x* of (25). For each data point (a_k, b_k), k = 1, ..., m, of D subject to measurement error, we consider the ellipsoidal uncertainty set
$$U = \left\{ D = \begin{pmatrix} \alpha_1^\top \\ \vdots \\ \alpha_m^\top \end{pmatrix} : \alpha_k = \alpha_0^k + \sum_{j=1}^{\ell} u_j^k \alpha_j^k, \; \|u^k\| \le 1, \; k = 1, \dots, m \right\}. \tag{27}$$

We refer to 𝒰^k = {u^k : ‖u^k‖ ≤ 1}, k = 1, ..., m, as uncertainty sets, and denote the conditions u¹ ∈ 𝒰¹, ..., u^m ∈ 𝒰^m by u ∈ 𝒰. Replacing f*(D) by f*(u), we rewrite (26) as
$$\min_x \max_{u \in \mathcal{U}} \sum_{k=1}^{m} x^\top \Big( \alpha_0^k + \sum_{j=1}^{\ell} u_j^k \alpha_j^k \Big) \Big( \alpha_0^k + \sum_{j=1}^{\ell} u_j^k \alpha_j^k \Big)^{\!\top} x - f^*(u) \quad \text{s.t.} \quad x_{n+2} = 1.$$
Note that $\alpha_j^k$ corresponds to the (n + 2) × 1 dimensional matrix $(D_j^k)^\top$ in the variant (9) of the ellipsoidal uncertainty set. Hence, this problem can be formulated as a semidefinite programming problem similar to (16).

We consider 8 two-dimensional points (a_i, b_i), i = 1, ..., 8, which fall on a line. All data points are considered to be uncertain, and each point has an uncertainty set given by a circle with radius 0.5. Figure 1 shows how a shift of the point positions influences the fitted lines of the two robust problems: absolute robust optimization and robust deviation optimization.

Figure 1: Optimal fitting lines of absolute robust optimization and robust deviation optimization. [Two panels plot the 8 uncertain data points and the fitted lines of both methods, before (left) and after (right) two data points are shifted downward.]

Figure 2: Optimal values of (CRP_N) and (RD_N), and their average computational time. [Optimal value (left) and average computation time in seconds (right), plotted against the number of samples N on a log scale from 10¹ to 10³.]

As Figure 1 (left) shows, the two robust optimization problems have almost the same optimal solutions, whose fitted lines go through the nominal data (the center of each circle). When two data points move downward, the dotted fitted line, formed from an optimal solution of the robust deviation optimization, tends to shift downward. The shift of the optimal line is caused by the objective of robust deviation optimization: to find a robust fitted line with the least difference from the optimal fitted line under various circumstances, the resulting optimal line naturally shifts downward. On the other hand, the optimal fitted line of absolute robust optimization remains the same for this small change of two data points, since a downward shift of the fitted line would induce a larger regression error for the remaining 6 data points in the worst case. The choice of robust model depends on the concerned objective: the fitted line's relative performance or its worst-case performance.

Now we compare the two relaxation problems (CRP_N) and (RD_N) in terms of their approximation accuracy and computational time. Figure 2 (left) shows the minimum, maximum and average optimal values of (CRP_N) and (RD_N) over 10 trials for each fixed sample number N. The horizontal axis shows that the sample number N is chosen on a log scale as 10¹, 10^{1.25}, 10^{1.5}, ..., 10³. Note that even (RD_N) with N = 1000 cannot find a tighter lower bound than (CRP_N) with N = 10. On the other hand, in terms of computational time, (RD_N) is superior to (CRP_N). Indeed, most of the computational time of (CRP_N) is devoted to solving a semidefinite programming problem (16) with many linear matrix inequality constraints, whereas (RD_N) only requires solving a convex quadratic optimization problem, since f*(u), u ∈ 𝒰^{(N)}, is easily obtained from the computation (25). The difference in computational time stems from the resulting optimization problems: (16) for (CRP_N) versus a convex quadratic program for (RD_N). However, though (CRP_N) takes more computational time for the same N, the relaxation problem (CRP_N) with only N = 10 surpasses (RD_N) with N = 1000 in terms of both computational time and approximation accuracy. In that sense, solving (CRP_N) with a small N is an efficient way to obtain a tight lower bound for (RD).

4.2 Robust Deviation Maximum Sharpe Ratio Problem

For a given expected return vector µ and a positive definite covariance matrix Q, the Sharpe ratio problem finds a portfolio that maximizes the Sharpe ratio:
$$\max_{x \in X} \frac{\mu^\top x - r_f}{\sqrt{x^\top Q x}},$$
where r_f ≥ 0 is the expected return of a riskless asset. The numerator is the expected excess return on the portfolio, i.e., the return in excess of the risk-free rate r_f, while the denominator is the standard deviation of the return. The Sharpe ratio is a measure used to evaluate a portfolio.
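As a small numerical aside, the Sharpe ratio of a fixed portfolio follows directly from the definition; the return and covariance data below are hypothetical.

```python
import numpy as np

mu = np.array([0.10, 0.07, 0.05])     # hypothetical expected returns
Q = np.array([[0.040, 0.006, 0.002],  # hypothetical covariance (positive definite)
              [0.006, 0.025, 0.004],
              [0.002, 0.004, 0.010]])
rf = 0.02                             # risk-free rate
x = np.array([0.5, 0.3, 0.2])         # a fixed portfolio: x >= 0, entries sum to 1

sharpe = (mu @ x - rf) / np.sqrt(x @ Q @ x)
print(f"Sharpe ratio: {sharpe:.3f}")
```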

Goldfarb and Iyengar (2003b) considered Q and µ to be uncertain, and assumed an interval uncertainty set and a factorized uncertainty set (Goldfarb and Iyengar 2003a,b) for µ and Q, respectively. The objective of the robust counterpart of the maximum Sharpe ratio problem is to choose a portfolio that maximizes the worst-case ratio of the expected excess return to the standard deviation of the return. On the other hand, Krishnamurthy (2004) proposed the robust deviation Sharpe ratio problem, which minimizes the maximum deviation of the Sharpe ratio from the maximum ratio obtained over all possible realizations of the expected return vector µ. The robust deviation counterpart with uncertain expected return vector µ ∈ U is formulated as follows:
$$\min_{x \in X} \max_{\mu \in U} \{ f^*(\mu) - f(x, \mu) \}, \tag{28}$$
where X := {x : x ≥ 0, e^⊤x = 1},
$$f(x, \mu) = \frac{(\mu - r_f e)^\top x}{\sqrt{x^\top Q x}}, \quad \text{and} \quad f^*(\mu) = \max_{x \in X} \frac{(\mu - r_f e)^\top x}{\sqrt{x^\top Q x}}.$$

It is shown in Krishnamurthy (2004) that when the uncertainty set is given as the convex hull of ℓ vectors µ₁, ..., µ_ℓ, i.e., the polytopic uncertainty set (2), an optimal solution of (28) can be obtained by solving (ℓ + 1) second-order cone programming problems.

As the uncertainty set U, we assume an ellipsoidal uncertainty set where the quadratic term in x disappears. We write µ ∈ U as $\mu_0 + \sum_{j=1}^{\ell} u_j \mu_j$, u ∈ 𝒰 := {u : ‖u‖ ≤ 1}, and use the notation f(x, u) and f*(u) instead of f(x, µ) and f*(µ), respectively. Applying the techniques of Goldfarb and Iyengar (2003b) and Krishnamurthy (2004), Problem (28) is equivalently transformed into
$$\begin{array}{ll}
\min_{x, \, t} & t \\[2pt]
\text{s.t.} & f^*(u) - \big(\mu_0 + \sum_{j=1}^{\ell} u_j \mu_j - r_f e\big)^{\!\top} x \le t, \quad \forall u \in \mathcal{U}, \\[2pt]
& x^\top Q x \le 1, \quad x \ge 0,
\end{array} \tag{29}$$
since the Sharpe ratio is homogeneous in the portfolio x, and furthermore, optimality is achieved at x^⊤Qx = 1.

Now we solve the robust deviation optimization problem approximately. To formulate the relaxation problem (CRP_N), it is necessary to approximate the function f*(u) from below. For that purpose, we prepare a set of random samples 𝒰^{(N)} from 𝒰 and obtain optimal solutions x_i of max_{x∈X} f(x, u_i) for the random samples u_i ∈ 𝒰^{(N)}. Using the optimal solutions x_i, i = 1, ..., N, we have
$$f^*(u) \ge f(x_i, u) = f^*(u_i) + \frac{\alpha(x_i)^\top (u - u_i)}{\sqrt{x_i^\top Q x_i}},$$
where $\alpha(x) = (\mu_1^\top x, \dots, \mu_\ell^\top x)^\top$. Then the first constraint of (29) is relaxed into
$$\max_{u \in \mathcal{U}} \left( \frac{\alpha(x_i)}{\sqrt{x_i^\top Q x_i}} - \alpha(x) \right)^{\!\top} u + (r_f e - \mu_0)^\top x + f^*(u_i) - \frac{\alpha(x_i)^\top u_i}{\sqrt{x_i^\top Q x_i}} \le t,$$
and the maximum value with respect to u is obtained as $\left\| \frac{\alpha(x_i)}{\sqrt{x_i^\top Q x_i}} - \alpha(x) \right\|$. Therefore, the relaxation problem (CRP_N) for (29) results in a second-order cone programming problem.

We assume that the dimension of the variable x, that is, the number of assets, is 20, and that the dimension ℓ of the uncertain u is 4. The data µ_j, j = 0, 1, ..., 4, of $\mu = \mu_0 + \sum_{j=1}^{4} u_j \mu_j$ are given as
$$\mu_0 = (0.6, 0.58, 0.56, \dots, 0.24, 0.22)^\top,$$
$$\mu_j = (\underbrace{0, \dots, 0}_{5 \times (j-1)}, \; 0.45, 0.35, 0.25, 0.15, 0.05, \; \underbrace{0, \dots, 0}_{5 \times (4-j)})^\top,$$
under the assumption that the nominal return per unit, µ₀, decreases from the 1st asset to the 20th asset by 0.02, and that each group of 5 assets can behave similarly under various circumstances (for example, they belong to a similar industry). Q is constructed randomly based on the uncertain expected return vector µ so that the risk becomes higher as the expected return gets higher.

Figure 3 shows the difference in optimal portfolios between absolute robust optimization and robust deviation optimization. For convenience, identification numbers from 1 to 20 are assigned to the assets. The horizontal axis shows the number of each asset, and the vertical axis the optimal investment rate x*_i of the ith asset (i = 1, ..., 20).

Figure 3: Characteristic of optimal solutions of the two robust problems. [Two panels plot the optimal investment rate of each of the 20 assets, for robust deviation optimization and absolute robust optimization.]

The optimal solution of absolute robust optimization exhibits a similar tendency within each group of 5 assets: from the (k + 1)th to the (k + 5)th asset, k ∈ {0, 5, 10, 15}, the investment rates increase. This tendency is caused by the parameter setting of µ, under which the (k + 1)th asset may take a small return per unit in the worst case, though its nominal value is relatively high. We can say that an optimal solution of absolute robust optimization is strongly sensitive to the uncertainty set. On the other hand, in the optimal solution of robust deviation optimization, such a tendency is hardly recognized; a portfolio of the robust deviation optimization problem may be well balanced between return and risk.

We compare the lower bounds of the relaxation problems, the sampled relaxation problem (RD_N) and the proposed relaxation problem (CRP_N), with the number of random samples N fixed appropriately, to see how much (CRP_N) improves the lower bound. For each N, 100 different sets of random samples are drawn and the two relaxation problems (RD_N) and (CRP_N) are constructed. Figure 4 depicts the average optimal value of each relaxation problem together with the maximum and minimum among the 100 optimal values. The numbers of samples N are taken on a log scale as 10¹, 10^{1.25}, 10^{1.5}, ..., 10³. The optimal value 1.0601 of (CRP_N) with N = 50000 is substituted for opt(RD). Note that the average lower bound of (RD_N) with N = 1000 is achieved by (CRP_N) with only N = 32 ≈ 10^{1.5}. Moreover, the width of the optimal value interval of (CRP_N) tends to be smaller than that of (RD_N). Consequently, the numerical results indicate that the optimal value of (CRP_N) converges to that of (RD) far faster than that of (RD_N).

In terms of computational time, there is no large difference between the two relaxation problems with the same number of random samples N, as Figure 5 shows. Indeed, most of the total computational time is devoted to obtaining f*(u) for u ∈ 𝒰^{(N)}, that is, to solving the optimization problems defining f*(u_i), i = 1, ..., N. Concretely, this computation occupied more than 97% of the total computational time of (CRP_N), and more than 99% of that of (RD_N), for N = 1000 random samples. Noticing that the computational time increases linearly as the number of samples N increases, we see that the proposed relaxation problem (CRP_N) yields a tight lower bound for (RD) with less computational time.

4.3 Evaluation of Relaxation Algorithm

We now carry out a-priori and a-posteriori assessments of opt(RD) − opt(CRP_N) for the robust deviation maximum Sharpe ratio problem, and finally evaluate the proposed algorithm, which provides an approximate solution with a probabilistic guarantee.

Figure 4: Optimal values of (CRP_N) and (RD_N). [Two panels plot, against the number of samples N on a log scale, the average, maximum and minimum optimal values of (CRP_N) (left) and (RD_N) (right), together with the almost optimal value of (RD).]

Figure 5: Average computational time of (CRP_N) and (RD_N). [Average computation time in seconds, plotted against the number of samples N on a log scale.]

A-priori assessment. We construct the function q(δ) with dim(x) = 20 for x and ℓ = 4 for u. The Lipschitz constant is evaluated as L = 2 × max_x ‖α(x)‖ = 0.9. For the ellipsoidal uncertainty set, δ ranges over [0, 2L] in q(δ); for the inverse function q⁻¹(ε), the range of ε is [0, 1] because q(2L) = 1.

Table 1 indicates the relation between the number of random samples N(ε, η) and the theoretical error measures (the theoretical error q⁻¹(ε) and its relative error) under the above parameter setting. The relative error is defined by R.err := q⁻¹(ε)/|opt(RD)| × 100. The optimal value of (CRP_N) obtained with N = 50000, which is 1.060, is substituted for opt(RD). Proposition 3.1 guarantees with probability at least 1 − η (= 0.99) that the relaxation problem (CRP_N) constructed with N = 10000 random samples gives a lower bound of (RD) whose approximation error is within q⁻¹(ε) = 0.429, which corresponds to a 40.4% relative error. The theoretical approximation error q⁻¹(ε) is common to the relaxation problems (RD_N) and (CRP_N), and therefore we expect that (CRP_N) actually achieves a better lower bound whose error is far less than q⁻¹(ε). From an empirical point of view, far fewer samples, N ≪ 10000, may be required to attain x*_N with approximation error less than 0.429. Certainly, Table 2 (left) shows that the error of opt(CRP_N) achieved with N = 1000 is far less than the theoretical error q⁻¹(0.125) = 0.724.

Table 1: A-priori assessment for opt(RD) − opt(CRP_N) (η = 0.01)

    ε        N(ε, η)    q⁻¹(ε)    R.err [%]
    0.787    100        1.397     131.8
    0.125    1000       0.724     68.3
    0.020    10000      0.429     40.4

Table 2: A-posteriori assessment for opt(RD) − opt(CRP_N) with x*_N (η = 0.01)

    x*_{N1} (N1 = 100):
    δ       M         β_M      A.err    R.err [%]
    0.15    2297      0.030    0.180    16.9
    0.10    11074     0.016    0.116    11.0
    0.05    169026    0.031    0.081    7.6

    x*_{N2} (N2 = 1000):
    δ       M         β_M      A.err    R.err [%]
    0.15    2294      0.006    0.156    14.7
    0.10    11057     0.006    0.106    10.0
    0.05    168759    0.010    0.060    5.7

    x*_{N3} (N3 = 10000):
    δ       M         β_M      A.err    R.err [%]
    0.15    2285      -0.077   0.073    6.9
    0.10    11013     -0.020   0.080    7.5
    0.05    168073    -0.003   0.047    4.4

A-posteriori assessment. Next, a-posteriori assessments are carried out for x*_{N_i} (i = 1, 2, 3), which are obtained via (CRP_N) with N₁ = 100, N₂ = 1000 and N₃ = 10000, respectively. In the a-posteriori assessment, q(δ, x*_N) is used to evaluate the sample number M. The Lipschitz constant $L_{x^*_N}$ of (23) is estimated as $L_{x^*_N} = \|\alpha(x^*_N)\| + \max_x \|\alpha(x)\|$, whose first term comes from the estimation of |f(x*_N, u) − f(x*_N, v)| and whose second term comes from |f*(u) − f*(v)| for all u, v ∈ 𝒰. The value $L_{x^*_{N_i}} = 0.575$ is applicable to all x*_{N_i} (i = 1, 2, 3).

Now, for some fixed M > 0, we compute the worst violation:
$$\beta_M = \max_{u \in \mathcal{U}^{(M)}} \{ f^*(u) - f(x^*_N, u) - t^*_N \}.$$
As a theoretical error measure, we define A.err := β_M + δ for the δ satisfying M = ⌈ln η / ln(1 − q(δ, x*_N))⌉ with fixed M and η = 0.01. Theorem 3.3 guarantees with probability at least 1 − η that the error opt(RD) − opt(CRP_N) is less than A.err whenever δ ∈ (0, 2L_{x^*_N}] holds.

Table 3: Iterative relaxation algorithm with probabilistic guarantee (ζ = 0.01, η = 0.01, N₀ = 10)

    δ = 0.15 (ζ + δ ≤ 0.16)
    itr.   V.rate (V/M)     β_M       t*_N
    1      42/2274          0.275     0.856
    2      0/2276           -0.043    1.041

    δ = 0.10 (ζ + δ ≤ 0.11)
    itr.   V.rate (V/M)     β_M       t*_N
    1      248/10963        0.275     0.856
    2      0/11002          -0.012    1.057

    δ = 0.05 (ζ + δ ≤ 0.06)
    itr.   V.rate (V/M)     β_M       t*_N
    1      3653/167295      0.349     0.856
    2      0/168250         -0.004    1.060

Table 2 shows A.err for the solutions x*_{N1}, x*_{N2} and x*_{N3} of (CRP_N). R.err is evaluated by A.err/|opt(RD)| × 100. Note that A.err = β_M + δ can take a small value for a sufficiently small parameter δ, though the sample number M then becomes large and consequently the computational time of the a-posteriori assessment increases. Compared to the a-priori assessment, the a-posteriori one is considerably stricter. Indeed, the a-posteriori assessment guarantees that the relative error of x*_{N1} is within 7.6% with probability 1 − η (= 0.99), while a 131.8% relative error is guaranteed via the a-priori assessment. Also, from this table, we see that x*_{N3} is sufficiently close to an optimal solution of (RD), since constraint violations, that is, f*(u) − f(x*_N, u) − t*_N > 0, u ∈ 𝒰^{(M)}, hardly occur among the M randomly chosen samples.

Now we check the validity of the a-posteriori assessment. We have lower bounds t*_{N1} = 1.040, t*_{N2} = 1.045 and t*_{N3} = 1.055 for (RD), achieved by x*_{N1}, x*_{N2} and x*_{N3} of (CRP_N), respectively. Since opt(RD) − t*_N < A.err holds with probability at least 1 − η = 0.99, we see that
$$t^*_{N_1} = 1.040 \le \mathrm{opt(RD)} < 1.121, \qquad t^*_{N_2} = 1.045 \le \mathrm{opt(RD)} < 1.105, \qquad t^*_{N_3} = 1.055 \le \mathrm{opt(RD)} < 1.102.$$
The almost optimal value 1.060 of (RD) lies exactly within these ranges.

Evaluation of Algorithm 3.4. For achieving a good approximation of an optimal solution of (RD), a sensible strategy is to solve (CRP_N) with an appropriately large N and check the accuracy of the obtained solution x*_N via the a-posteriori assessment. If the solution is sufficiently accurate, it can be accepted as an almost optimal solution of (RD).

Table 3 evaluates Algorithm 3.4 with three different permissible errors δ under the common parameters ζ = 0.01, η = 0.01 and N₀ = 10. t*_N is the lower bound of (RD) obtained by (CRP_N) at each iteration of the algorithm, and V.rate shows that V constraints are violated among the M sampled constraints by the x*_N achieved at each iteration. We see that as a more accurate solution is required, with smaller δ, the number of random samples M becomes large and violations of constraints become more likely, that is, V becomes large. However, by adding such violated constraints to (CRP_N), the resulting approximate solution improves considerably, and as a result, an approximate solution with probabilistic guarantee is obtained within a few iterations. Note that the solution of the algorithm with δ = 0.05 may be regarded as an optimal solution of (RD), since the lower bound t*_N coincides with 1.060, which is the almost optimal value of (RD).

The convergence of Algorithm 3.4 within a few iterations has been observed even with N₀ = 10. The tight approximate solutions of (CRP_N) also contribute to this behavior: the algorithm with its relaxation problems (CRP_N) replaced by (RD_N) produces a lower bound of only t*_N = 1.017 after 5 iterations under the parameter δ = 0.15.

5 Concluding Remarks

In this paper, we have mainly focused on deviation robustness for uncertain convex quadratic optimization problems, and proposed relaxation techniques for robust deviation optimization problems with familiar uncertainty sets. The proposed relaxation problem is formulated with random samples from the given uncertainty set, and the resulting lower bound depends on the drawn random samples. Therefore, our concern was how many samples N we should take to obtain an approximate solution with a probabilistic guarantee.

The a-priori assessment makes it possible to estimate N for a required accuracy, while, once a solution has been computed, the a-posteriori assessment evaluates the accuracy of the solution. We applied theoretical results on a-priori and a-posteriori assessments of the approximation error to the application examples of the linear least squares problem and the maximum Sharpe ratio problem. Numerical results indicate that the a-posteriori assessment of the approximation error is effectively utilized in our proposed algorithm. Furthermore, the numerical results show that the proposed relaxation technique is extremely effective compared to the simple sampling technique. Indeed, in the example of the linear least squares problem, the proposed relaxation problem (CRP_N) with only N = 10 surpasses the sampled relaxation problem (RD_N) with N = 1000 in terms of both computational time and approximation accuracy.

The proposed relaxation technique and the a-priori and a-posteriori assessments may be applicable to other kinds of uncertainty sets, and also to relative robust optimization problems. One possible extension is to modify the relaxation technique and algorithm presented here for a wider range of robust optimization problems with deviation robustness or relative robustness.

References

Aron I.D., Hentenryck P.V. On the complexity of the robust spanning tree problem with interval data. Operations Research Letters 2004;32; 36-40.

Ben-Tal A., Nemirovski A. Robust convex optimization. Mathematics of Operations Research 1998;23; 769-805.

Ben-Tal A., Nemirovski A. Robust solutions of uncertain linear programs. Operations Research Letters 1999;25; 1-13.

Calafiore G., Campi M.C. A new bound on the generalization rate of sampled convex programs. 43rd IEEE Conference on Decision and Control (CDC04) 2004;5; 5328-5333.

Calafiore G., Campi M.C. Uncertain convex programs: randomized solutions and confidence levels. Mathematical Programming 2005;102; 25-46.

El Ghaoui L., Lebret H. Robust solutions to least-squares problems with uncertain data. SIAM Journal on Matrix Analysis and Applications 1997;18; 1035-1064.

Goldfarb D., Iyengar G. Robust convex quadratically constrained programs. Mathematical Programming 2003;97; 495-515.

Goldfarb D., Iyengar G. Robust portfolio selection problems. Mathematics of Operations Research 2003;28; 1-38.

Hites R., Salazar-Neumann M. The robust deviation p-elements problem with interval data. Technical Report; Service de Mathematiques de la Gestion; Universite Libre de Bruxelles; 2004. www.ulb.ac.be/polytech/smg/indexpublications.html.

Kanamori T., Takeda A. Worst-case violation of sampled convex programs for optimization with uncertainty. Research Report B-425; Dept. of Mathematical and Computing Sciences; Tokyo Institute of Technology; 2006. www.is.titech.ac.jp/research/research-report/B/index.html.

Kouvelis P., Yu G. Robust discrete optimization and its applications. Kluwer Academic Publishers; Norwell; 1997.

Krishnamurthy V. Robust optimization in finance. Second Summer Paper for the Doctoral Program (supervised by Tutuncu R.H.); 2004. Preprint.

Tempo R., Calafiore G., Dabbene F. Randomized algorithms for analysis and control of uncertain systems. Springer-Verlag London Limited; 2005.

Yaman H., Karasan O.E., Pinar M.C. Restricted robust optimization for maximization over uniform matroid with interval data uncertainty. Technical Report; Bilkent University; 2004. www.bilkent.edu.tr/~hyaman/RRD.htm.

Yaman H., Karasan O.E., Pinar M.C. The robust spanning tree problem with interval data. Operations Research Letters 2001;29; 31-40.

Yu G., Yang J. On the robust shortest path problem. Computers & Operations Research 1998;25; 457-468.