Research Reports on
Mathematical and
Computing Sciences
Department of Mathematical and Computing Sciences
Tokyo Institute of Technology
SERIES B: Operations Research
ISSN 1342-2804
A Relaxation Algorithm with Probabilistic Guarantee
for Robust Deviation Optimization Problems
Akiko Takeda, Shunsuke Taguchi
and Tsutomu Tanaka
March 2006, B–427
B-427 A Relaxation Algorithm with Probabilistic Guarantee for Robust Deviation Optimization Problems
Akiko Takeda†, Shunsuke Taguchi‡ and Tsutomu Tanaka§
March 2006
Abstract. For uncertain optimization problems, three measures of robustness (absolute robustness, deviation robustness and relative robustness) have been proposed, depending on the goal and specifics of the decision maker. Absolute robustness has been well studied with respect to tractable formulations and applicability, but studies on the other robustness measures seem to be restricted to the field of discrete optimization.
We mainly focus on deviation robustness for uncertain convex quadratic optimization problems, and propose a relaxation technique based on random sampling for solving robust deviation optimization problems with familiar uncertainty sets. The relaxation problem gives a tighter approximate solution than a simple sampled relaxation problem, both theoretically and experimentally. Furthermore, robustness of the solution is measured in a probabilistic setting. The number of random samples needed to obtain an approximate solution with probabilistic guarantee is estimated, and the approximation error is evaluated a-priori and a-posteriori. Our relaxation algorithm with probabilistic guarantee performs an a-posteriori assessment to evaluate the accuracy of approximate solutions.
Key words.
Robust Optimization, Relative Robustness, Probabilistic Robustness Analysis, Relaxation Method, Worst-Case Violation.
† Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, 2-12-1-W8-29 Oh-Okayama, Meguro-ku, Tokyo 152-8552, Japan. [email protected]
‡ Digital Media Network Company, Toshiba Corporation, 2-9 Suehiro-Cho, Ome, Tokyo 198-8710, Japan. [email protected]
§ Department of Information Science, Tokyo Institute of Technology, 2-12-1-W8-29 Oh-Okayama, Meguro-ku, Tokyo 152-8552, Japan. [email protected]
1 Introduction
Uncertainty is an inevitable feature of many decision-making environments. On a regular basis
engineers, economists, investment professionals, and others need to make decisions to optimize a
system with incomplete information and considerable uncertainty. Robust optimization (RO) is
a term that is used to describe both modeling strategies and solution methods for optimization
problems that are defined by uncertain inputs. The objective of robust optimization models and
algorithms is to obtain solutions that are guaranteed to perform well (in terms of feasibility and
near-optimality) for all, or at least most, possible realizations of the uncertain input parameters.
For uncertain optimization problems, several robustness criteria are proposed depending on the
goal and specifics of the decision maker. In particular, Kouvelis and Yu (1997) defined three measures
of robustness: absolute robustness, deviation robustness and relative robustness. In their robust
optimization problems, a scenario based approach is used to represent the input data uncertainty.
Let f(x, A_s) denote the cost of solution x ∈ X in scenario s ∈ S, where X is the set of all solutions and S is the set of all potentially realizable input data scenarios; A_s denotes the instance of the input data that corresponds to scenario s. Then, the three robust optimization problems are described in the following ways (Kouvelis and Yu 1997, Hites and Salazar-Neumann 2004, Yaman et al. 2004, e.g.):
* Absolute robust optimization problem, which minimizes the maximum total cost among all feasible decisions over all realizable input data scenarios: min_{x∈X} max_{s∈S} f(x, A_s),
* Robust deviation optimization problem, which exhibits the best worst-case deviation from optimality: min_{x∈X} max_{s∈S} {f(x, A_s) − f(x*_s, A_s)}, where f(x*_s, A_s) := min_{x∈X} f(x, A_s),
* Relative robust optimization problem, which exhibits the best worst-case percentage deviation from optimality: min_{x∈X} max_{s∈S} {f(x, A_s) − f(x*_s, A_s)} / f(x*_s, A_s), where f(x*_s, A_s) := min_{x∈X} f(x, A_s).
In the studies of Hites and Salazar-Neumann (2004) and Yaman et al. (2004), the set of scenarios S can be the Cartesian product of intervals, that is, each uncertain datum may take any value in some interval. Robust deviation optimization problems, as well as absolute robust optimization problems, are well studied for discrete optimization problems with uncertainty: shortest path problems (Yu and Yang 1998), spanning tree problems (Aron and Hentenryck 2004, Yaman et al. 2001), and so on. Concretely, the shortest path problem of Yu and Yang (1998) considers the travel time on each arc to be uncertain because of various traffic-condition scenarios, such as the presence of accidents, and proposes to minimize the maximum deviation of the path length from the optimal path length of the corresponding scenario; this criterion is appropriate when the chosen path's performance relative to the optimal paths under various circumstances matters most.
On the other hand, there are different definitions of robust optimization problems in the literature
(Ben-Tal and Nemirovski 1998, 1999, El Ghaoui and Lebret 1997, Goldfarb and Iyengar 2003a,b, e.g.),
but these robust optimization problems may be regarded as absolute robust optimization problems.
The standard robust optimization formulations assume that the uncertain input parameters are
known only within certain bounds, which define an uncertainty set U, and focus on the worst case over U. To describe the uncertain data, not only scenario sets {A_s : s ∈ S} and intervals but also other types of uncertainty sets U have been proposed. Ben-Tal and Nemirovski (1999) pointed out that an ellipsoidal uncertainty set U, which assumes that the uncertain data lie in some ellipsoidal set, avoids a "too conservative" solution, compared to an interval uncertainty set. However, in this robust optimization framework, robustness measures other than absolute robustness, such as deviation robustness, have not been discussed much, though Krishnamurthy (2004) proposed several robust deviation optimization problems with an uncertainty set U represented by the convex hull of scenarios {A_s : s ∈ S} for portfolio optimization problems.
In this paper, we mainly focus on deviation robustness for uncertain convex quadratic optimization problems, and propose a relaxation technique for solving robust deviation optimization problems (RD) with the familiar uncertainty sets U used in (Ben-Tal and Nemirovski 1998, 1999, El Ghaoui and Lebret 1997, Goldfarb and Iyengar 2003a,b). These uncertainty sets are known to provide reasonable representations of uncertain input data. From U we draw N random samples and construct a constraint relaxation problem (CRP_N) whose constraints are formed with these random samples. The proposed problem (CRP_N) is shown to give a tighter approximate solution than a simple sampled relaxation problem, both theoretically and experimentally.
Furthermore, the robustness of a solution given by (CRP_N) is measured in a probabilistic setting. Probabilistic robustness analysis and its applications to robust control are extensively discussed in Tempo et al. (2005). The concept of violation probability is applied to sampled relaxation problems in (Calafiore and Campi 2004, 2005, e.g.), and the number of random samples is estimated so as to guarantee probabilistically that the resulting solution violates only a small portion of constraints. Furthermore, Kanamori and Takeda (2006) investigated how severely the solution violates each constraint. Extending the probabilistic robustness analysis to (CRP_N), we estimate the number of samples N needed to obtain an approximate solution with probabilistic guarantee, and assess the approximation error, that is, the difference of the optimal values of (RD) and (CRP_N), both a-priori and a-posteriori. Our relaxation algorithm with probabilistic guarantee, presented in this paper for (RD), performs an a-posteriori assessment to evaluate the accuracy of approximate solutions. Numerical results show that only a few iterations of the algorithm are necessary to attain an approximate solution with the required accuracy.
The rest of this paper is organized as follows. Section 2 proposes a relaxation technique for robust deviation optimization problems with uncertainty set U. The resulting relaxation problem, constructed with N random samples from U, gives a tighter lower bound than the simple sampled relaxation problem. Furthermore, in Section 3, a-priori and a-posteriori assessments of the approximation error are introduced to estimate the number of samples N, and then a relaxation algorithm with probabilistic guarantee is presented. In Section 4, we discuss applications of the developed results to the linear least squares problem and to a problem in financial mathematics, the maximum Sharpe ratio problem. Finally, Section 5 concludes the paper with some remarks.
2 Relaxation Techniques for Robust Deviation Optimization
We discuss robust deviation optimization problems with two kinds of uncertainty sets U, and derive relaxation problems formulated as second-order cone programming or semidefinite programming problems. A relaxation problem for relative robust optimization with an ellipsoidal uncertainty set is also briefly discussed.
2.1 Robust Deviation Optimization Problem
We consider the following robust deviation optimization problem:
(RD)  min_{x∈X} max_{(Q,q,γ)∈U} f(x; Q, q, γ) − f*(Q, q, γ),   (1)

where f(x; Q, q, γ) := x⊤Qx + q⊤x + γ and f*(Q, q, γ) := min_{x∈X} f(x; Q, q, γ).
Throughout this paper, we assume that the regions X and U are bounded. Also, suppose that f(x; Q, q, γ) is convex quadratic in x, that is, Q is a positive semidefinite matrix, and that the feasible region X consists of convex quadratic constraints.
For absolute robust optimization problems, several kinds of uncertainty sets U are proposed in
(Ben-Tal and Nemirovski 1998, 1999, Goldfarb and Iyengar 2003a). Krishnamurthy (2004) picked up the so-called polytopic uncertainty set U of Goldfarb and Iyengar (2003a), expressed as the convex hull of ℓ given points:

U = { (Q, q, γ) : (Q, q, γ) = ∑_{j=1}^{ℓ} u_j (Q_j, q_j, γ_j), Q_j ⪰ O, j = 1, …, ℓ, u ≥ 0, ∑_{j=1}^{ℓ} u_j = 1 },   (2)

and proposed to solve the ℓ convex programs min_{x∈X} f(x; Q_j, q_j, γ_j) to obtain the optimal values f*(Q_j, q_j, γ_j) for all j = 1, …, ℓ. Then, the convex program

min_{x∈X, t} t   s.t. f(x; Q_j, q_j, γ_j) − f*(Q_j, q_j, γ_j) ≤ t, j = 1, …, ℓ,

is constructed and solved as a convex quadratic optimization problem.
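For concreteness, this two-stage recipe can be sketched with a generic convex solver. The sketch below is illustrative rather than the authors' implementation: it assumes the cvxpy modeling package and, purely for illustration, a hypothetical box feasible region X = {x : lb ≤ x ≤ ub}.

```python
import numpy as np
import cvxpy as cp

def polytopic_robust_deviation(Qs, qs, gammas, lb, ub):
    """Robust deviation over the convex hull of scenarios (Q_j, q_j, gamma_j),
    following the two-stage recipe in the text. The box X = [lb, ub]^n is an
    illustrative stand-in for a general convex quadratic feasible region."""
    n = len(qs[0])
    # Stage 1: solve the l convex programs min_{x in X} f(x; Q_j, q_j, gamma_j).
    f_star = []
    for Q, q, g in zip(Qs, qs, gammas):
        x = cp.Variable(n)
        obj = cp.quad_form(x, Q) + q @ x + g
        cp.Problem(cp.Minimize(obj), [x >= lb, x <= ub]).solve()
        f_star.append(obj.value)
    # Stage 2: the epigraph problem min t s.t. f(x; ...) - f*_j <= t for all j.
    x, t = cp.Variable(n), cp.Variable()
    cons = [x >= lb, x <= ub]
    for (Q, q, g), fs in zip(zip(Qs, qs, gammas), f_star):
        cons.append(cp.quad_form(x, Q) + q @ x + g - fs <= t)
    cp.Problem(cp.Minimize(t), cons).solve()
    return x.value, t.value
```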
In this paper, we deal with the following familiar uncertainty sets U: the norm-constrained uncertainty set (Goldfarb and Iyengar 2003a), the ellipsoidal uncertainty set (Ben-Tal and Nemirovski 1998, 1999) and a variant of it, which are defined as follows:
• Norm-constrained uncertainty set (Goldfarb and Iyengar 2003a):

U = { (Q, q, γ) : (Q, q, γ) = (Q_0, q_0, γ_0) + ∑_{j=1}^{ℓ} u_j (Q_j, q_j, γ_j), Q_j ⪰ O, j = 0, 1, …, ℓ, u ∈ U },   (3)

U := { u : u ≥ 0, ‖u‖ ≤ 1 }.   (4)

Then, the function f(x; Q, q, γ) is written as

f(x, u) := f(x; Q, q, γ) = ∑_{j=1}^{ℓ} (x⊤Q_j x + q_j⊤x + γ_j) u_j + x⊤Q_0 x + q_0⊤x + γ_0 = α(x)⊤u + x⊤Q_0 x + q_0⊤x + γ_0,   (5)

where

α(x) = (x⊤Q_1 x + q_1⊤x + γ_1, …, x⊤Q_ℓ x + q_ℓ⊤x + γ_ℓ)⊤.   (6)
• Ellipsoidal uncertainty set (Ben-Tal and Nemirovski 1998):

U = { (Q, q, γ) : Q = (D_0 + ∑_{j=1}^{ℓ} u_j D_j)⊤(D_0 + ∑_{j=1}^{ℓ} u_j D_j), (q, γ) = (q_0, γ_0) + ∑_{j=1}^{ℓ} u_j (q_j, γ_j), u ∈ U },   (7)

U := { u : ‖u‖ ≤ 1 }.

Then, the function f(x; Q, q, γ) is written as

f(x, u) := f(x; Q, q, γ)
 = ∑_{i,j=1}^{ℓ} (x⊤D_i⊤D_j x) u_i u_j + ∑_{j=1}^{ℓ} { x⊤(D_0⊤D_j + D_j⊤D_0)x + q_j⊤x + γ_j } u_j + x⊤D_0⊤D_0 x + q_0⊤x + γ_0
 = u⊤D(x)u + r(x)⊤u + µ(x)⊤u + x⊤D_0⊤D_0 x + q_0⊤x + γ_0,   (8)

where the matrix D(x) has (i, j)th element x⊤D_i⊤D_j x, i, j = 1, …, ℓ, the vector r(x) has 2x⊤D_0⊤D_j x, j = 1, …, ℓ, as its jth element, and the vector µ(x) has q_j⊤x + γ_j, j = 1, …, ℓ, as its jth element.
• A variant of ellipsoidal uncertainty set:

U = { (Q, q, γ) : Q = ∑_{k=1}^{m} (D_0^k + ∑_{j=1}^{ℓ} u_j^k D_j^k)⊤(D_0^k + ∑_{j=1}^{ℓ} u_j^k D_j^k), (q, γ) = ∑_{k=1}^{m} [ (q_0^k, γ_0^k) + ∑_{j=1}^{ℓ} u_j^k (q_j^k, γ_j^k) ], u^k = (u_1^k, …, u_ℓ^k)⊤ ∈ U_k, k = 1, …, m },   (9)

U_k := { u^k : ‖u^k‖ ≤ 1 }.

Denoting u^1 ∈ U_1, …, u^m ∈ U_m by u ∈ U, we have

f(x, u) := f(x; Q, q, γ) = ∑_{k=1}^{m} { u^{k⊤} D^k(x) u^k + r^k(x)⊤u^k + µ^k(x)⊤u^k + x⊤D_0^{k⊤}D_0^k x + q_0^{k⊤}x + γ_0^k },

where the matrix D^k(x) has (i, j)th element x⊤D_i^{k⊤}D_j^k x, i, j = 1, …, ℓ, the vector r^k(x) has 2x⊤D_0^{k⊤}D_j^k x, j = 1, …, ℓ, as its jth element, and the vector µ^k(x) has q_j^{k⊤}x + γ_j^k, j = 1, …, ℓ, as its jth element.
2.2 Relaxation Problems for Robust Deviation Optimization
We rewrite f(x; Q, q, γ) as f(x, u) and f*(Q, q, γ) as f*(u). Also, we refer to U as the uncertainty set instead of U. Then, Problem (1) is reformulated as a problem with infinitely many constraints:

(RD)  min_{x∈X, t} t   s.t. f(x, u) − f*(u) ≤ t, ∀u ∈ U.

When f(x, u) is linear in u as in (5), f(x, u) − f*(u) is convex in u, since f*(u) = min_{x∈X} f(x, u) is concave. However, f(x, u) − f*(u) is not necessarily convex in u even if f(x, u) is convex in u as in (8). Furthermore, the infinite number of constraints f(x, u) − f*(u) ≤ t makes (RD) difficult to solve.
As a simple way to construct a relaxation problem of (RD), we prepare a set of random samples U^(N) := {u_1, …, u_N} selected from U, and consider the sampled relaxation problem:

(RD_N)  min_{x∈X, t} t   s.t. f(x, u) − f*(u) ≤ t, ∀u ∈ U^(N).

Here f*(u) is obtained by solving the convex quadratic optimization min_{x∈X} f(x, u) for each random sample u ∈ U^(N). Problem (RD_N), which consists of a finite number of constraints, can be solved as a convex quadratic optimization problem via existing optimization methods. The set of constraint functions of (RD) includes all constraints of (RD_N), and thus (RD_N) yields a lower bound for (RD). Furthermore, if we take a sufficiently large number N, the optimal value of (RD_N) may be sufficiently close to that of (RD), though the number of constraints of (RD_N) becomes large.
In this section, we propose "nice" relaxation problems which yield tighter lower bounds than (RD_N). The proposed relaxation problem is also constructed with the use of the random samples u_1, …, u_N, as is (RD_N). The difference between the two relaxation problems is that our relaxation problem approximates the function f*(u) by the set of tractable functions f(x_i, u), i = 1, …, N, where x_i is an optimal solution of min_{x∈X} f(x, u_i) for random sample u_i, while (RD_N) approximates the set U by the finite set of points U^(N) = {u_1, …, u_N}.
Lemma 2.1 Let x_i be an optimal solution of min_{x∈X} f(x, u_i) for random samples u_i, i = 1, …, N. Then, the constraint relaxation problem:

(CRP_N)  min_{x∈X, t} t
  s.t. f(x, u) − f(x_1, u) ≤ t, ∀u ∈ U,
    …
    f(x, u) − f(x_N, u) ≤ t, ∀u ∈ U

yields a lower bound for (RD). Furthermore, the lower bound is tighter than that of the sampled relaxation problem (RD_N).
Proof: Let ϕ(u) = min_{i=1,…,N} f(x_i, u). Naturally, ϕ(u) ≥ f*(u) holds for ∀u ∈ U, since f(x_i, u) ≥ min_{x∈X} f(x, u) = f*(u) for i = 1, …, N and ∀u ∈ U. Then, we have

min_{x∈X} max_{u∈U} {f(x, u) − f*(u)} ≥ min_{x∈X} max_{u∈U} {f(x, u) − ϕ(u)},

which implies that (RD) is bounded below by min_{x∈X} max_{u∈U} {f(x, u) − ϕ(u)}. One can easily show the equivalence between this relaxation problem and

min_{x∈X} max_{u^(i)∈U, i=1,…,N} { f(x, u^(i)) − f(x_i, u^(i)) },

which is also described as (CRP_N). The relaxation problem (CRP_N) obviously has a tighter lower bound than the sampled relaxation problem (RD_N), since the set of constraint functions of (CRP_N) includes that of (RD_N). Indeed, the constraint f(x, u_i) − f*(u_i) ≤ t of (RD_N) induced from a random sample u_i coincides with the ith constraint of (CRP_N) at the fixed point u_i ∈ U, i.e., f(x, u_i) − f(x_i, u_i) ≤ t. □
We denote the optimal value of a problem (•) by opt(•). The above lemma implies the following relation among the three problems: opt(RD_N) ≤ opt(CRP_N) ≤ opt(RD). If the problem (CRP_N) can be transformed into a tractable optimization problem, it may yield a sufficiently tight lower bound for (RD).
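The essential device behind Lemma 2.1 is the piecewise approximation ϕ(u) = min_i f(x_i, u), which over-estimates f*(u). A minimal numerical sanity check of this inequality is sketched below; the toy data, the box region X and the use of scipy for the inner minimizations are all our assumptions, with f(x, u) linear in u as in (5) (γ_j = 0).

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
ell, n, N = 3, 4, 10

# Toy data: f(x, u) = sum_j u_j (x^T Q_j x + q_j^T x) + x^T Q_0 x + q_0^T x,
# a special case of (5); X is a hypothetical box [-1, 1]^n.
Qs = [np.diag(rng.uniform(0.5, 2.0, n)) for _ in range(ell + 1)]
qs = [rng.normal(size=n) for _ in range(ell + 1)]
bounds = [(-1.0, 1.0)] * n

def f(x, u):
    val = x @ Qs[0] @ x + qs[0] @ x
    for j in range(ell):
        val += u[j] * (x @ Qs[j + 1] @ x + qs[j + 1] @ x)
    return val

def f_star(u):
    """f*(u) = min_{x in X} f(x, u), solved numerically."""
    res = minimize(lambda x: f(x, u), np.zeros(n), bounds=bounds)
    return res.fun, res.x

def draw_u():
    """A sample from the set (4): u >= 0, ||u|| <= 1."""
    v = np.abs(rng.normal(size=ell))
    return v / max(1.0, np.linalg.norm(v))

us = [draw_u() for _ in range(N)]
xs = [f_star(u)[1] for u in us]     # the minimizers x_i of Lemma 2.1

# phi(u) = min_i f(x_i, u) should never fall below f*(u).
for u in (draw_u() for _ in range(5)):
    phi = min(f(xi, u) for xi in xs)
    assert phi >= f_star(u)[0] - 1e-6
```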
Now we assume the norm-constrained or ellipsoidal uncertainty set U for the uncertain input data, and deal with a quadratic function f(x, u). For each uncertainty set, we transform (CRP_N) into a tractable optimization problem.
2.2.1 Norm-constrained Uncertainty Set
In this section, we focus on the norm-constrained uncertainty set defined by (3), and deal with the robust deviation optimization problem (RD) with the set U of (4) and the function f(x, u) = α(x)⊤u + x⊤Q_0 x + q_0⊤x + γ_0 of (5). Moreover, each positive semidefinite matrix Q_j (j = 0, …, ℓ) is decomposed as Q_j = V_j⊤V_j. For the random samples u_1, …, u_N selected from U, let the corresponding solutions be x_i = arg min_{x∈X} f(x, u_i), i = 1, …, N.
Theorem 2.2 Relaxation problem (CRP_N) of (RD) with the norm-constrained uncertainty set is formulated as the second-order cone programming problem:

min_{x∈X, t, g^1,…,g^N, ν} t
s.t. g^i ≥ 0,
 ‖( 2V_j x, 1 − g^i_j + q_j⊤x − x_i⊤Q_j x_i − q_j⊤x_i )‖ ≤ 1 + g^i_j − q_j⊤x + x_i⊤Q_j x_i + q_j⊤x_i,
 ‖( 2V_0 x, 1 − ν )‖ ≤ 1 + ν,
 ‖g^i‖ ≤ −ν − q_0⊤x − ( γ_0 − f*(u_i) + α(x_i)⊤u_i ) + t,
 j = 1, …, ℓ, i = 1, …, N.   (10)
Proof: The functions f(x_i, u), i = 1, …, N, are described as

f(x_i, u) = (α(x_i)⊤u_i + x_i⊤Q_0 x_i + q_0⊤x_i + γ_0) + α(x_i)⊤(u − u_i) = f*(u_i) + α(x_i)⊤(u − u_i).   (11)

Then, Lemma 2.1 leads to a relaxation problem of (RD) such as

min_{x∈X, t} t
s.t. f(x, u) − { f*(u_1) + α(x_1)⊤(u − u_1) } ≤ t, ∀u ∈ U,
  ⋯
 f(x, u) − { f*(u_N) + α(x_N)⊤(u − u_N) } ≤ t, ∀u ∈ U.   (12)

The inequality constraint with index i (i = 1, …, N) in (12) is rewritten as

x⊤Q_0 x + q_0⊤x + ( γ_0 − f*(u_i) + α(x_i)⊤u_i ) − t + max_{‖u‖≤1, u≥0} ∑_{j=1}^{ℓ} u_j ( x⊤Q_j x + q_j⊤x − x_i⊤Q_j x_i − q_j⊤x_i ) ≤ 0.

Similar to Lemma 2 of Goldfarb and Iyengar (2003a), the above inequality is transformed into

g^i ≥ 0,
g^i_j ≥ x⊤Q_j x + q_j⊤x − x_i⊤Q_j x_i − q_j⊤x_i, j = 1, …, ℓ,
x⊤Q_0 x + q_0⊤x + ( γ_0 − f*(u_i) + α(x_i)⊤u_i ) − t + ‖g^i‖ ≤ 0,

⇔

g^i ≥ 0,
‖( 2V_j x, 1 − g^i_j + q_j⊤x − x_i⊤Q_j x_i − q_j⊤x_i )‖ ≤ 1 + g^i_j − q_j⊤x + x_i⊤Q_j x_i + q_j⊤x_i, j = 1, …, ℓ,
‖( 2V_0 x, 1 − ν )‖ ≤ 1 + ν,
‖g^i‖ ≤ −ν − q_0⊤x − ( γ_0 − f*(u_i) + α(x_i)⊤u_i ) + t,

and the relaxation problem (10) follows. □
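A possible rendering of the SOCP (10) in a modeling language is sketched below. It assumes cvxpy, the factor matrices V_j with Q_j = V_j⊤V_j passed as data, and, as before, an illustrative box region X; it is a sketch of the theorem's formulation, not the authors' code.

```python
import numpy as np
import cvxpy as cp

def crp_socp(Vs, qs, gammas, us, xs, fstars, lb, ub):
    """SOCP (10): Vs = [V_0, ..., V_l] with Q_j = V_j^T V_j; us, xs, fstars are
    the samples u_i, minimizers x_i and optimal values f*(u_i); X = [lb, ub]^n."""
    ell, n, N = len(Vs) - 1, Vs[0].shape[1], len(us)
    def alpha(z):  # alpha(x) of (6), for a numeric vector z
        return np.array([z @ Vs[j].T @ Vs[j] @ z + qs[j] @ z + gammas[j]
                         for j in range(1, ell + 1)])
    x, t, nu = cp.Variable(n), cp.Variable(), cp.Variable()
    G = cp.Variable((N, ell))        # rows are the vectors g^i
    cons = [x >= lb, x <= ub, G >= 0,
            cp.norm(cp.hstack([2 * Vs[0] @ x, 1 - nu])) <= 1 + nu]
    for i, (ui, xi, fi) in enumerate(zip(us, xs, fstars)):
        for j in range(1, ell + 1):
            cij = xi @ Vs[j].T @ Vs[j] @ xi + qs[j] @ xi  # x_i^T Q_j x_i + q_j^T x_i
            s = G[i, j - 1] - qs[j] @ x + cij             # encodes x^T Q_j x <= s
            cons.append(cp.norm(cp.hstack([2 * Vs[j] @ x, 1 - s])) <= 1 + s)
        cons.append(cp.norm(G[i, :])
                    <= -nu - qs[0] @ x - (gammas[0] - fi + alpha(xi) @ ui) + t)
    cp.Problem(cp.Minimize(t), cons).solve()
    return x.value, t.value
```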
2.2.2 Ellipsoidal Uncertainty Set
We consider the uncertainty set (7), which contains quadratic terms of the uncertain parameter u. Then, the robust deviation optimization problem (RD) consists of U = {u : ‖u‖ ≤ 1} and f(x, u) = u⊤D(x)u + r(x)⊤u + µ(x)⊤u + x⊤D_0⊤D_0 x + q_0⊤x + γ_0.
Theorem 2.3 Relaxation problem (CRP_N) of (RD) with the ellipsoidal uncertainty set is formulated as the semidefinite programming problem:

min_{x∈X, t, λ} t
s.t. [ I  B(x) ; B(x)⊤  M_i(x, t, λ) ] ⪰ 0, λ ≥ 0, i = 1, …, N,   (13)

where B(x) := [D_0 x  D_1 x … D_ℓ x],

M_i(x, t, λ) := [ t − δ_i(x) + f*(u_i) − λ   (1/2)(r(x_i) + µ(x_i) − µ(x))⊤ ; (1/2)(r(x_i) + µ(x_i) − µ(x))   D(x_i) + λI ],

and δ_i(x) = q_0⊤x + γ_0 + u_i⊤D(x_i)u_i + r(x_i)⊤u_i + µ(x_i)⊤u_i.
Proof: For the optimal solutions x_i of min_{x∈X} f(x, u_i), i = 1, …, N, we have

f(x_i, u) = u⊤D(x_i)u + r(x_i)⊤u + µ(x_i)⊤u + x_i⊤D_0⊤D_0 x_i + q_0⊤x_i + γ_0
 = f*(u_i) + u⊤D(x_i)u + r(x_i)⊤u + µ(x_i)⊤u − ( u_i⊤D(x_i)u_i + r(x_i)⊤u_i + µ(x_i)⊤u_i ).

The ith constraint of (CRP_N) is described as

f(x, u) − f(x_i, u) = u⊤(D(x) − D(x_i))u + (r(x) − r(x_i) + µ(x) − µ(x_i))⊤u + x⊤D_0⊤D_0 x + δ_i(x) − f*(u_i) ≤ t, ∀u ∈ U,   (14)

with δ_i(x) = q_0⊤x + γ_0 + u_i⊤D(x_i)u_i + r(x_i)⊤u_i + µ(x_i)⊤u_i as above. We equivalently transform the constraint (14) into

u⊤(D(x_i) − D(x))u + τ(r(x_i) − r(x) + µ(x_i) − µ(x))⊤u + τ²( t − x⊤D_0⊤D_0 x − δ_i(x) + f*(u_i) ) ≥ 0

for all (τ, u) satisfying ‖u‖² ≤ τ², by multiplying (14) by τ² and regarding τu as u. The constraint can be interpreted as follows: for all (τ, u) satisfying

(τ, u⊤) [ 1  0⊤ ; 0  −I ] (τ; u) ≥ 0,

it holds that

(τ, u⊤) [ t − δ_i(x) + f*(u_i)   (1/2)(r(x_i) + µ(x_i) − µ(x))⊤ ; (1/2)(r(x_i) + µ(x_i) − µ(x))   D(x_i) ] (τ; u) − (τ, u⊤) B(x)⊤B(x) (τ; u) ≥ 0.

Utilizing the S-lemma, this is equivalent to the existence of λ ≥ 0 such that

[ t − δ_i(x) + f*(u_i) − λ   (1/2)(r(x_i) + µ(x_i) − µ(x))⊤ ; (1/2)(r(x_i) + µ(x_i) − µ(x))   D(x_i) + λI ] − B(x)⊤B(x) ⪰ 0.   (15)

The Schur complement procedure transforms (15) into the linear matrix inequality of the ith constraint of problem (13). □
The problem (13) consists of linear matrix inequality constraints, and therefore, we can solve it
by interior point methods.
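Analogously, the LMI constraints of (13) can be assembled blockwise; the sketch below uses cvxpy's bmat, introduces one multiplier per sampled constraint (the S-lemma is applied constraint-wise), and again assumes an illustrative box region X. The helper names are ours.

```python
import numpy as np
import cvxpy as cp

def crp_sdp(Ds, qs, gammas, us, xs, fstars, lb, ub):
    """SDP (13): Ds = [D_0, ..., D_l], each of shape (d, n); us, xs, fstars are
    the samples u_i, minimizers x_i and values f*(u_i); X = [lb, ub]^n."""
    ell, (d, n), N = len(Ds) - 1, Ds[0].shape, len(us)
    Dmat = lambda z: np.array([[z @ Ds[i].T @ Ds[j] @ z for j in range(1, ell + 1)]
                               for i in range(1, ell + 1)])   # D(x) of (8)
    rvec = lambda z: np.array([2 * z @ Ds[0].T @ Ds[j] @ z for j in range(1, ell + 1)])
    x, t = cp.Variable(n), cp.Variable()
    lam = cp.Variable(N, nonneg=True)
    mu_x = cp.hstack([qs[j] @ x + gammas[j] for j in range(1, ell + 1)])  # mu(x)
    B = cp.hstack([cp.reshape(Ds[j] @ x, (d, 1)) for j in range(ell + 1)])
    cons = [x >= lb, x <= ub]
    for i, (ui, xi, fi) in enumerate(zip(us, xs, fstars)):
        Di, ri = Dmat(xi), rvec(xi)
        mui = np.array([qs[j] @ xi + gammas[j] for j in range(1, ell + 1)])
        delta = qs[0] @ x + gammas[0] + ui @ Di @ ui + ri @ ui + mui @ ui
        w = 0.5 * cp.reshape(ri + mui - mu_x, (ell, 1))
        M = cp.bmat([[cp.reshape(t - delta + fi - lam[i], (1, 1)), w.T],
                     [w, Di + lam[i] * np.eye(ell)]])
        cons.append(cp.bmat([[np.eye(d), B], [B.T, M]]) >> 0)
    cp.Problem(cp.Minimize(t), cons).solve()
    return x.value, t.value
```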
Remark 2.4 Consider the variant (9) of the ellipsoidal uncertainty set. Then, we have

f(x, u) = ∑_{k=1}^{m} { u^{k⊤} D^k(x) u^k + r^k(x)⊤u^k + µ^k(x)⊤u^k + x⊤D_0^{k⊤}D_0^k x + q_0^{k⊤}x + γ_0^k }.

Random samples u_1^k, …, u_N^k are extracted from U_k, k = 1, …, m, and the m-tuple of random samples drawn from U_1, …, U_m is denoted by u_i := (u_i^1, …, u_i^m), i = 1, …, N. With the optimal solutions x_i = arg min_x f(x, u_i), we have

f(x_i, u) = f(x_i, (u^1, …, u^m)) = f*(u_i) + ∑_{k=1}^{m} { u^{k⊤} D^k(x_i) u^k + r^k(x_i)⊤u^k + µ^k(x_i)⊤u^k − ( u_i^{k⊤} D^k(x_i) u_i^k + r^k(x_i)⊤u_i^k + µ^k(x_i)⊤u_i^k ) }.

Then, similar to (14), the ith constraint of (CRP_N) is described as

f(x, u) − f(x_i, u) = ∑_{k=1}^{m} { u^{k⊤}(D^k(x) − D^k(x_i))u^k + (r^k(x) − r^k(x_i) + µ^k(x) − µ^k(x_i))⊤u^k + x⊤D_0^{k⊤}D_0^k x + δ_i^k(x) } − f*(u_i) ≤ t, ∀u^1 ∈ U_1, …, ∀u^m ∈ U_m,

where δ_i^k(x) := q_0^{k⊤}x + γ_0^k + u_i^{k⊤} D^k(x_i) u_i^k + r^k(x_i)⊤u_i^k + µ^k(x_i)⊤u_i^k. It is also equivalent to

∑_{k=1}^{m} s_i^k − f*(u_i) ≤ t,
u^{k⊤}(D^k(x) − D^k(x_i))u^k + (r^k(x) − r^k(x_i) + µ^k(x) − µ^k(x_i))⊤u^k + x⊤D_0^{k⊤}D_0^k x + δ_i^k(x) ≤ s_i^k,
 ∀u^1 ∈ U_1, …, ∀u^m ∈ U_m, i = 1, …, N,

and therefore, the relaxation problem results in

min_{x∈X, t, s_1^1,…,s_N^m, λ^1,…,λ^m} t
s.t. ∑_{k=1}^{m} s_i^k − f*(u_i) ≤ t,
 [ I  B^k(x) ; B^k(x)⊤  M_i^k(x, s_i^k, λ^k) ] ⪰ 0,
 λ^k ≥ 0, k = 1, …, m, i = 1, …, N,   (16)

where B^k(x) := [D_0^k x  D_1^k x … D_ℓ^k x] and

M_i^k(x, s_i^k, λ^k) := [ s_i^k − δ_i^k(x) − λ^k   (1/2)(r^k(x_i) + µ^k(x_i) − µ^k(x))⊤ ; (1/2)(r^k(x_i) + µ^k(x_i) − µ^k(x))   D^k(x_i) + λ^k I ].
2.3 Extension to Relative Robust Optimization
Suppose that f*(Q, q, γ) > 0 for ∀(Q, q, γ) ∈ U, and consider the relative robust optimization problem:

min_{x∈X} max_{(Q,q,γ)∈U} [ f(x; Q, q, γ) − f*(Q, q, γ) ] / f*(Q, q, γ).   (17)

It should be noted that the numerator satisfies f(x; Q, q, γ) − f*(Q, q, γ) ≥ 0, and hence the objective function is nonnegative on the feasible region. As for the robust deviation optimization problem, we assume that f(x; Q, q, γ) := x⊤Qx + q⊤x + γ and that Q is a positive semidefinite matrix. When the uncertain data (Q, q, γ) of f(x; Q, q, γ) are linearly perturbed by the parameter u as in (2) and (3), f*(Q, q, γ) is concave in u and the function f(x; Q, q, γ) − f*(Q, q, γ) is convex in u. Therefore, the objective function of (17) is quasiconvex in u, and thus, for the polytopic uncertainty set U expressed as the convex hull of ℓ given points defined by (2), the relative robust optimization problem (17) can be solved via the convex quadratic optimization problem:

min_{x∈X, t} t   s.t. f(x; Q_j, q_j, γ_j) − (1 + t)f*(Q_j, q_j, γ_j) ≤ 0, j = 1, …, ℓ.

However, (17) with the norm-constrained or ellipsoidal uncertainty set is generally intractable.

As with the relaxation problems for robust deviation optimization, we prepare approximating functions for f*(Q, q, γ) and construct a tractable relaxation problem for relative robust optimization. We now deal with the norm-constrained uncertainty set U of (3), and use the notation f(x, u) instead of f(x; Q, q, γ) and f*(u) instead of f*(Q, q, γ).
Theorem 2.5 A relaxation problem of the relative robust optimization problem (17) with norm-constrained uncertainty set is formulated as the second-order cone programming problem:

min_{x∈X, t, g^1,…,g^N, ν} t
s.t. g^i ≥ 0,
 ‖( 2V_j x, 1 − g^i_j + q_j⊤x − x_i⊤Q_j x_i − q_j⊤x_i − δ_j^i t )‖ ≤ 1 + g^i_j − q_j⊤x + x_i⊤Q_j x_i + q_j⊤x_i + δ_j^i t,
 ‖( 2V_0 x, 1 − ν )‖ ≤ 1 + ν,
 ‖g^i‖ ≤ −ν − q_0⊤x − ( α(x_i)⊤u_i − f*(u_i) )t − ( γ_0 + α(x_i)⊤u_i − f*(u_i) ),
 j = 1, …, ℓ, i = 1, …, N,   (18)

where δ_j^i = x_i⊤Q_j x_i + q_j⊤x_i + γ_j.
Proof: Let x_i be an optimal solution of min_{x∈X} f(x, u_i) for i = 1, …, N. Note that Problem (17) is transformed into

min_{x∈X, t} t   s.t. f(x, u) − (1 + t)f*(u) ≤ 0, ∀u ∈ U.   (19)

Applying the relation f(x_i, u) ≥ f*(u), i = 1, …, N, with f(x_i, u) of (11), which involves the vector α(x) of (6), to the constraint of (19), we have the following inequalities: for i = 1, …, N,

f(x, u) − (1 + t)f*(u)
 ≥ f(x, u) − (1 + t){ f*(u_i) + α(x_i)⊤(u − u_i) }
 = x⊤Q_0 x + q_0⊤x + ( α(x_i)⊤u_i − f*(u_i) )t + ( γ_0 + α(x_i)⊤u_i − f*(u_i) )
  + max_{‖u‖≤1, u≥0} ∑_{j=1}^{ℓ} u_j ( x⊤Q_j x + q_j⊤x − x_i⊤Q_j x_i − q_j⊤x_i − δ_j^i t ),

where δ_j^i = x_i⊤Q_j x_i + q_j⊤x_i + γ_j. Therefore, we construct a relaxation problem whose constraints are

x⊤Q_0 x + q_0⊤x + ( α(x_i)⊤u_i − f*(u_i) )t + ( γ_0 + α(x_i)⊤u_i − f*(u_i) ) + max_{‖u‖≤1, u≥0} ∑_{j=1}^{ℓ} u_j ( x⊤Q_j x + q_j⊤x − x_i⊤Q_j x_i − q_j⊤x_i − δ_j^i t ) ≤ 0

⇔

g^i ≥ 0,
‖( 2V_j x, 1 − g^i_j + q_j⊤x − x_i⊤Q_j x_i − q_j⊤x_i − δ_j^i t )‖ ≤ 1 + g^i_j − q_j⊤x + x_i⊤Q_j x_i + q_j⊤x_i + δ_j^i t, j = 1, …, ℓ,
‖( 2V_0 x, 1 − ν )‖ ≤ 1 + ν,
‖g^i‖ ≤ −ν − q_0⊤x − ( α(x_i)⊤u_i − f*(u_i) )t − ( γ_0 + α(x_i)⊤u_i − f*(u_i) ),

and the relaxation problem (18) follows. □
Similar to Lemma 2.1, the relaxation problem (18) for the relative robust optimization problem (17) also gives a better lower bound than the sampled relaxation problem:

min_{x∈X, t} t   s.t. ( f(x, u_i) − f*(u_i) ) / f*(u_i) ≤ t, i = 1, …, N.
3 Estimation of Probabilistic Approximation Error
In the previous section, we chose N samples randomly from the given uncertainty set, and constructed relaxation problems (CRP_N) for robust deviation optimization problems (RD) with two different types of uncertainty sets U. Also, relaxation problem (18) was proposed for relative robust optimization. Noticing that the lower bound achieved by (CRP_N) depends on the drawn random samples, we consider how many samples N we should take to obtain an approximate solution with probabilistic guarantee. Here, we focus on (RD), but the same discussion holds for relative robust optimization.
3.1 Convergence Properties
Here, we consider the sampled relaxation problem (RD_N), which is constructed using N independent and identically distributed (iid) random samples U^(N) = {u_1, …, u_N} from the uncertainty set U. A direct application of Theorem 4.1 of Kanamori and Takeda (2006) leads to Proposition 3.1 and Theorem 3.2, which ensure that the sampled relaxation problem (RD_N) converges to (RD). In their paper, an increasing function q(δ) for 0 ≤ δ ≤ B is defined as follows:

q(δ) := (1 / V_ℓ(1)) D_ℓ(δ/L, 1),
B := L√(ℓ+1) (norm-constrained uncertainty (3)),  B := 2L (ellipsoidal uncertainty (7)).

V_ℓ(r) denotes the volume of the ℓ-dimensional hypersphere with radius r, and D_ℓ(r, s) is defined as

D_ℓ(r, s) = V_{ℓ−1}(1) [ s^ℓ ∫_0^{arccos(1 − r²/(2s²))} (sin x)^ℓ dx + r^ℓ ∫_{arccos(−r/(2s))}^{π} (sin x)^ℓ dx ].

Let L be a Lipschitz constant satisfying

| {f(x, u) − f*(u)} − {f(x, v) − f*(v)} | ≤ L‖u − v‖ for ∀x ∈ X and ∀u, v ∈ U.
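Under these definitions, q(δ) is a one-dimensional quadrature away. A sketch using scipy (helper names ours, assuming the reconstruction of D_ℓ above) is:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def vol_ball(l, r=1.0):
    """Volume V_l(r) of the l-dimensional ball of radius r."""
    return np.pi ** (l / 2) / gamma(l / 2 + 1) * r ** l

def D_l(l, r, s):
    """D_l(r, s) as defined above, by numerical quadrature."""
    sin_l = lambda x: np.sin(x) ** l
    a = np.arccos(1 - r ** 2 / (2 * s ** 2))
    b = np.arccos(-r / (2 * s))
    return vol_ball(l - 1) * (s ** l * quad(sin_l, 0, a)[0]
                              + r ** l * quad(sin_l, b, np.pi)[0])

def q(delta, L, l):
    """q(delta) = D_l(delta / L, 1) / V_l(1)."""
    return D_l(l, delta / L, 1.0) / vol_ball(l)

# Example with the Section 4.3 parameters (L = 0.9, l = 4):
print(q(0.5, L=0.9, l=4))
```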
Let (x*_N, t*_N) be an optimal solution of the sampled relaxation problem (RD_N) with the random samples U^(N). Assertion (i) of the following proposition follows directly from Corollary 1 of Calafiore and Campi (2004). The probability Pr{u ∈ U : f(x*_N, u) − f*(u) > t*_N} in the assertion is called the violation probability (Calafiore and Campi 2004, 2005, e.g.). If a uniform probability distribution is assumed on U, the violation probability is calculated from the volume of the set {u ∈ U : f(x*_N, u) − f*(u) > t*_N}. From Kanamori and Takeda (2006), we see that under the uniform distribution over U, the function q(δ) satisfies

q(δ) ≤ Pr{ u ∈ U : max_{v∈U} {f(x, v) − f*(v)} − δ < f(x, u) − f*(u) }   (20)

for all x ∈ X. The uniform distribution can be replaced by other probability distributions under some regularity conditions, but for the sake of simplicity, the uniform distribution is assumed.
Proposition 3.1 (Kanamori and Takeda (2006)) Let ǫ ∈ (0, q(B)), η ∈ (0, 1) and N ≥ N(ǫ, η) := (2/ǫ) log(1/η) + 2 dim(x) + (2 dim(x)/ǫ) log(2/ǫ), where dim(x) denotes the dimension of x. An optimal solution (x*_N, t*_N) of the sampled relaxation problem (RD_N) satisfies the following inequalities simultaneously with probability at least 1 − η:

(i) Pr{u ∈ U : f(x*_N, u) − f*(u) > t*_N} ≤ ǫ,
(ii) min_{x∈X} max_{u∈U} {f(x, u) − f*(u)} − min_{x∈X} max_{u∈U^(N)} {f(x, u) − f*(u)} ≤ q^{-1}(ǫ).
Note that the value min_{x∈X} max_{u∈U^(N)} {f(x, u) − f*(u)} in (ii) corresponds to the optimal value t*_N of (RD_N) at an optimal solution x*_N. Therefore, assertion (ii) of the proposition implies that the approximation error between the robust deviation optimization problem (RD) and its sampled relaxation problem (RD_N) is guaranteed to be within q^{-1}(ǫ) with probability at least 1 − η.
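The a-priori sample size N(ǫ, η) of the proposition is directly computable; the small helper below (our naming) reproduces the order of magnitude of Table 1 in Section 4.3.

```python
import numpy as np

def sample_size(eps, eta, dim_x):
    """A-priori sample size N(eps, eta) from Proposition 3.1."""
    return int(np.ceil(2 / eps * np.log(1 / eta)
                       + 2 * dim_x + 2 * dim_x / eps * np.log(2 / eps)))

print(sample_size(0.125, 0.01, 20))  # roughly the N = 1000 row of Table 1
```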
Only the estimation of the Lipschitz constant L remains for constructing the function q(δ). Taking the case of the function f(x, u) of the norm-constrained uncertainty set (3), we briefly show how to evaluate L. Note that f(x, u) is described as f(x, u) = α(x)⊤u + x⊤Q_0 x + q_0⊤x + γ_0 with the notation α(x) of (6). Then we have

| {f(x, u_1) − f*(u_1)} − {f(x, u_2) − f*(u_2)} | ≤ | f(x, u_1) − f(x, u_2) | + | f*(u_1) − f*(u_2) |,

and the first term of the right-hand side is bounded above by { max_{x∈X} ‖α(x)‖ } ‖u_1 − u_2‖. Also, considering the relations

f*(u_1) = f(x_1, u_1) ≤ f(x_2, u_1),  f*(u_2) = f(x_2, u_2) ≤ f(x_1, u_2)

for the optimal solutions x_i of min_{x∈X} f(x, u_i) = f*(u_i), i = 1, 2, we see that

| f*(u_1) − f*(u_2) | ≤ max{ ‖α(x_1)‖, ‖α(x_2)‖ } ‖u_1 − u_2‖.

Therefore, we can take L = 2 × { max_{x∈X} ‖α(x)‖ }, which is easily estimated from the maximum eigenvalues of the matrices Q_j, j = 1, …, ℓ, a sufficiently large value r_x such that max_{x∈X} ‖x‖ ≤ r_x holds, and so on. For other kinds of uncertainty sets, we estimate L similarly.
Assertion (ii) indicates that the inequality opt(RD) − opt(RD_N) ≤ q^{-1}(ǫ) holds probabilistically between (RD) and (RD_N). Moreover, Lemma 2.1 ensures that opt(RD) − opt(CRP_N) ≤ opt(RD) − opt(RD_N) always holds under the same random samples U^(N), and thus we have

Pr^N{ opt(RD) − opt(CRP_N) ≤ q^{-1}(ǫ) } ≥ Pr^N{ opt(RD) − opt(RD_N) ≤ q^{-1}(ǫ) } ≥ 1 − η,   (21)

where Pr^N{···} denotes the probability over the N independent random samples U^(N). Therefore, (ii) also holds for a solution of (CRP_N).
The proposition leads to convergence properties of the relaxation problems (RD_N) and (CRP_N) toward (RD).

Theorem 3.2 The optimal value of (RD_N) converges in probability to that of (RD), i.e., for ∀δ ∈ (0, B),

lim_{N→∞} Pr^N{ min_{x∈X} max_{u∈U} {f(x, u) − f*(u)} − min_{x∈X} max_{u∈U^(N)} {f(x, u) − f*(u)} > δ } = 0.

The optimal value of the relaxation problem (CRP_N) also converges in probability to that of (RD).
Proof: From (21), we have

Pr^N{ opt(RD) − opt(CRP_N) > q^{-1}(ǫ) } ≤ η,  Pr^N{ opt(RD) − opt(RD_N) > q^{-1}(ǫ) } ≤ η.

It is possible to obtain η satisfying ⌈N(ǫ, η)⌉ = N implicitly when a probability ǫ and the number of samples N are given. That is, under a fixed parameter ǫ > 0, there exists a sequence η_N → 0 as N → ∞. Since δ = q^{-1}(ǫ) takes values in (0, B) as ǫ ranges over (0, q(B)), the statement of the theorem is proved. □
This theorem means that if we take a sufficiently large N, relaxation problem (CRP_N) yields a lower bound almost equal to opt(RD).
3.2 Relaxation Algorithm with Probabilistic Guarantee
Proposition 3.1 determines the number of samples N so that the approximation error q^{-1}(ǫ) between (RD) and (CRP_N) is guaranteed theoretically. However, when a small approximation error is required, it is difficult from a practical point of view to solve the relaxation problem (CRP_N), which includes many constraints induced from the N samples.
In this section, we introduce a practical iterative algorithm for (RD) that solves relaxation problems (CRP_N) with N smaller than the theoretical number N(ǫ, η). The number of samples N is gradually increased and the resulting relaxation problems (CRP_N) are solved until a sufficiently tight lower bound is obtained for (RD). As the criterion to terminate, that is, to decide that the lower bound is sufficiently tight, we adopt an a-posteriori assessment of the worst violation with Monte-Carlo techniques. Let an optimal solution of (CRP_N) be (x*_N, t*_N). Then, constraint violation of (RD) may occur: f(x*_N, u) − f*(u) − t*_N > 0 for some u ∈ U. We construct a set of random samples U^(M) from U with a sufficiently large number M, and evaluate the worst-case violation: max_{u∈U^(M)} {f(x*_N, u) − f*(u) − t*_N}. Our algorithm terminates when the worst violation becomes small enough to be neglected. In this procedure, we obtain an approximate solution (x*_N, t*_N) with probabilistic guarantee if the number of random samples M is determined properly.
Theorem 3.3 Let (x*_N, t*_N) be an optimal solution of (CRP_N), and let U^(M) := {u_1, …, u_M} be a set of M (≥ ⌈ln η / ln(1 − q(δ))⌉) random samples from U for a confidence parameter η ∈ (0, 1) and a permissible error δ ∈ (0, B]. For the worst-case violation among the M samples:

β_M = max_{u∈U^(M)} { f(x*_N, u) − f*(u) − t*_N },

we have

opt(RD) − opt(CRP_N) = min_{x∈X} max_{u∈U} {f(x, u) − f*(u)} − t*_N < β_M + δ   (22)

with probability at least 1 − η.
Proof: Using the inequality (20), we have

Pr^M{ max_{u∈U} {f(x, u) − f*(u)} − δ < max_{u∈U^(M)} {f(x, u) − f*(u)} }
 = 1 − ∏_{i=1}^{M} Pr{ max_{u∈U} {f(x, u) − f*(u)} − δ ≥ f(x, u_i) − f*(u_i) }
 ≥ 1 − (1 − q(δ))^M ≥ 1 − η.

This implies that

max_{u∈U} {f(x, u) − f*(u)} < max_{u∈U^(M)} {f(x, u) − f*(u)} + δ

holds with probability at least 1 − η for any x ∈ X. Now, utilizing (x*_N, t*_N), we subtract t*_N from both sides of the inequality and substitute x*_N for x. Then, we have

max_{u∈U} {f(x*_N, u) − f*(u)} − t*_N < β_M + δ.

Since min_{x∈X} max_{u∈U} {f(x, u) − f*(u)} ≤ max_{u∈U} {f(x*_N, u) − f*(u)} holds, (22) follows. □
Kanamori and Takeda (2006) proved the statement of Theorem 3.3 for an optimal solution (x*_N, t*_N) of the sampled relaxation problem (RD_N). It should be noted that, as the above proof shows, the statement holds not only for an optimal solution of relaxation problem (RD_N) or (CRP_N), but also for any approximate solution x ∈ X with approximate value t. We propose to use an optimal solution (x*_N, t*_N) of (CRP_N) instead of that of (RD_N), since the worst-case violation β_M of (CRP_N) is expected to be considerably smaller than that of (RD_N).
Note that (22) of Theorem 3.3, as well as (21), theoretically evaluates the approximation error opt(RD) − opt(CRP_N). The distinction between (21) and (22) is as follows: (21) makes an a-priori assessment of the approximation error as q^{-1}(ǫ), while (22) makes an a-posteriori assessment as β_M + δ. Concretely, in the case of the a-priori assessment, the number of random samples N can be determined for a required accuracy q^{-1}(ǫ) before solving relaxation problem (CRP_N). In the case of the a-posteriori assessment, once a solution (x*_N, t*_N) has been computed, the approximation error β_M + δ of the solution is evaluated with the use of Monte-Carlo techniques. It is expected that β_M + δ is much smaller than q^{-1}(ǫ), since β_M + δ is estimated with the given solution (x*_N, t*_N). Moreover, in the a-posteriori assessment of Kanamori and Takeda (2006), a function q(δ, x*_N), defined for the solution x*_N, is used instead of q(δ) for the evaluation of M. To form q(δ, x*_N), the Lipschitz constant L of q(δ) is replaced with L_{x*_N}, which satisfies

| {f(x*_N, u) − f*(u)} − {f(x*_N, v) − f*(v)} | ≤ L_{x*_N} ‖u − v‖ for ∀u, v ∈ U.   (23)

L_{x*_N} can be smaller than L, and thus the number M of necessary random samples is decreased.
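In code, this a-posteriori machinery amounts to a sample-size formula and a Monte-Carlo maximum. A sketch (with f, f_star and the value q(δ, x*_N) supplied as oracles, which is our packaging, not the authors') is:

```python
import numpy as np

def posterior_sample_size(eta, q_delta):
    """Smallest M with M >= ln(eta) / ln(1 - q(delta, x_N)), as in Theorem 3.3."""
    return int(np.ceil(np.log(eta) / np.log(1.0 - q_delta)))

def worst_violation(x_N, t_N, samples, f, f_star):
    """beta_M = max over the M samples of f(x_N, u) - f*(u) - t_N."""
    return max(f(x_N, u) - f_star(u) for u in samples) - t_N
```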
In the following algorithm, the a-posteriori assessment is carried out for an optimal solution (x*_N, t*_N) of (CRP_N). When we set values of η and δ close to 0, we need to solve min_{x∈X} f(x, u) for u ∈ U^(M) with a large number M, and repeat the function evaluations f(x*_N, u) − f*(u) − t*_N, u ∈ U^(M), many times. If opt(RD) − opt(CRP_N) < β_M + δ is ensured with high confidence 1 − η, and furthermore β_M + δ is sufficiently small at an approximate solution (x*_N, t*_N), we regard (x*_N, t*_N) as almost optimal for the robust deviation optimization problem (RD).
Algorithm 3.4 Input: ζ > 0 (threshold for β_M), δ ∈ (0, B], η ∈ (0, 1) and an initial number of random samples N_0.
Output: Almost optimal solution (x*_N, t*_N), whose approximation error is guaranteed to be less than ζ + δ with probability at least 1 − η.

Step 0: Construct U^(N_0) = {u_1, u_2, …, u_{N_0}} via random sampling from U. Let N ← N_0, and go to Step 1.

Step 1: Construct a relaxation problem (CRP_N) using U^(N) and let (x*_N, t*_N) be its optimal solution. Go to Step 2.

Step 2: For sample size M = ⌈ln η / ln(1 − q(δ, x*_N))⌉, prepare another set of random samples U^(M) = {u_1, u_2, …, u_M} from U. Then, compute

β_M = max_{u∈U^(M)} { f(x*_N, u) − f*(u) − t*_N }.

If β_M > ζ, then go to Step 3. Otherwise go to Step 4.

Step 3: Define the subset V := {u ∈ U^(M) : f(x*_N, u) − f*(u) − t*_N > 0} of U^(M). Let V = |V|, U^(N+V) ← U^(N) ∪ V and N ← N + V. Go to Step 1.

Step 4: Terminate with the almost optimal solution (x*_N, t*_N).
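A generic skeleton of Algorithm 3.4 might look as follows; the oracles sample_U, solve_crp, f, f_star and q_of are placeholders for the problem-specific pieces developed above, so this is a sketch of the control flow rather than a complete implementation.

```python
import numpy as np

def relaxation_algorithm(sample_U, solve_crp, f, f_star, q_of, zeta, delta, eta, N0):
    """sample_U(k) draws k iid samples from U; solve_crp(samples) returns an
    optimal (x, t) of (CRP_N); q_of(delta, x) evaluates q(delta, x*_N)."""
    U_N = list(sample_U(N0))                                  # Step 0
    while True:
        x_N, t_N = solve_crp(U_N)                             # Step 1
        M = int(np.ceil(np.log(eta) / np.log(1 - q_of(delta, x_N))))
        U_M = list(sample_U(M))                               # Step 2
        viol = [f(x_N, u) - f_star(u) - t_N for u in U_M]
        if max(viol) <= zeta:                                 # Step 4
            return x_N, t_N
        U_N += [u for u, v in zip(U_M, viol) if v > 0]        # Step 3
```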
4 Numerical Results
As simple examples of robust deviation optimization problems, the linear least squares problem and the maximum Sharpe ratio problem are considered. We compare the accuracy of the lower bounds obtained by (RD_N) and (CRP_N) for these application problems. Then, a-priori and a-posteriori assessments are carried out for the maximum Sharpe ratio problem. All computations were conducted on an Opteron 850 (2.4GHz) with 8GB of physical memory and 1MB of L2 cache, running SuSE Linux Enterprise Server 9.
4.1 Robust Deviation Linear Least Squares Problem
The linear least squares problem finds the best-fitting straight line y = ∑_{i=1}^{n} a_i x_i + x_{n+1} through a set of points. Suppose that we have m points (a_1, b_1), …, (a_m, b_m), where a_k ∈ IR^n and b_k ∈ IR (k = 1, …, m), and formulate the linear least squares problem as follows:

min_{x∈IR^{n+1}} ‖Ax − b‖²,   (24)

where A is the m × (n+1) matrix whose kth row is (a_k⊤, 1) and b = (b_1, …, b_m)⊤ ∈ IR^m. If the data A and b are available, we easily find an optimal solution of (24) as

x* = (A⊤A)^{−1}A⊤b.   (25)
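Computing f*(D) for a sampled matrix D = [A, −b] therefore reduces to one ordinary least squares solve. A small sketch (using numpy's lstsq instead of forming (A⊤A)^{−1} explicitly, for numerical stability) is:

```python
import numpy as np

def lsq_f_star(A, b):
    """f*(D) for D = [A, -b]: the residual of the least squares fit (25)."""
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.linalg.norm(A @ x - b) ** 2, x
```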
In some applications of least squares problems, the problem data (A, b) may include measurement error. In order to reduce the sensitivity of the decision x to perturbations in the data, the papers (El Ghaoui and Lebret 1997, Goldfarb and Iyengar 2003a) proposed several types of absolute robust optimization problems for the linear least squares problem. For simplicity of notation, we describe Problem (24) as

min_{x∈IR^{n+2}} ‖Dx‖²  s.t. x_{n+2} = 1,

where D = [A, −b]. The absolute robust optimization problems treat the coefficient matrix D as uncertain input data, and focus on the worst case of D ∈ U where the residual ‖Dx‖² becomes large.
In this section we take notice of the worst-case deviation from optimality and consider the robust deviation optimization problem for the linear least squares problem:

min_x max_{D∈U} ‖Dx‖² − f*(D)  s.t. x_{n+2} = 1,   (26)

where f*(D) = min_x { ‖Dx‖² : x_{n+2} = 1 }. f*(D) is easily obtained from x* of (25). For each data point (a_k, b_k), k = 1, …, m, of D subject to measurement error, we consider the ellipsoidal uncertainty set

U = { D = [α_1⊤ ; … ; α_m⊤] : α_k = α_0^k + ∑_{j=1}^{ℓ} u_j^k α_j^k, ‖u^k‖ ≤ 1, k = 1, …, m }.   (27)
We refer to U_k = {u^k : ‖u^k‖ ≤ 1}, k = 1, …, m, as uncertainty sets, and denote the conditions u^1 ∈ U_1, …, u^m ∈ U_m by u ∈ U. Replacing f*(D) by f*(u), we rewrite (26) as

min_x max_{u∈U} ∑_{k=1}^{m} x⊤( α_0^k + ∑_{j=1}^{ℓ} u_j^k α_j^k )( α_0^k + ∑_{j=1}^{ℓ} u_j^k α_j^k )⊤ x − f*(u)  s.t. x_{n+2} = 1.

Note that α_j^k corresponds to the (n+2) × 1 dimensional matrix (D_j^k)⊤ in the variant of the ellipsoidal uncertainty set (9). Hence, this problem can be formulated as a semidefinite programming problem similar to (16).
We consider 8 two-dimensional points (a_i, b_i), i = 1, …, 8, which lie on a line. All data points are considered to be uncertain, and each point has a circular uncertainty set with radius 0.5. Figure 1 shows how a shift of point positions influences the fitted straight lines of the two robust problems: absolute robust optimization and robust deviation optimization. As Figure 1 (left) shows, the two robust optimization problems have almost the same optimal solutions, whose fitted straight lines go through the nominal data (center) of each circle. With two data points going down, the dotted fitted line, formed from an optimal solution of robust deviation optimization, tends to shift downward. The shift of the optimal line is caused by the objective of robust deviation optimization: to find a robust fitted line which has the least difference from the optimal fitted line under various circumstances, the resulting optimal line naturally shifts downward. On the other hand, the optimal fitted line of absolute robust optimization remains the same for a small change of the two data points, since a downward shift of the fitted line induces a larger regression error for the remaining 6 data points in the worst case. The choice of robust models depends on the concerned objective: the fitted line's relative performance or its worst-case performance.

Figure 1: Optimal fitting lines of absolute robust optimization and robust deviation optimization

Figure 2: Optimal values of (CRP_N) and (RD_N) (left) and their average computational time in seconds (right), plotted against the number of samples N
Now we compare the two relaxation problems (CRP_N) and (RD_N) in terms of their approximation accuracy and computational time. Figure 2 (left) shows the minimum, maximum and average optimal values of (CRP_N) and (RD_N) among 10 trials for each fixed sample number N. On the horizontal axis, the sample number N is chosen on a log scale as 10^1, 10^{1.25}, 10^{1.5}, …, 10^3. Note that even (RD_N) with N = 1000 cannot find a tighter lower bound than (CRP_N) with N = 10. On the other hand, in terms of computational time, (RD_N) is superior to (CRP_N). Indeed, most of the computational time of (CRP_N) is devoted to solving a semidefinite programming problem (16) with many linear matrix inequality constraints, while that of (RD_N) goes to a convex quadratic optimization problem, since f*(u), u ∈ U^(N), are easily obtained from the computation of (25). The difference in computational time is induced by the resulting optimization problems: (16) for (CRP_N) versus a convex quadratic problem for (RD_N). However, though (CRP_N) takes more computational time for the same N, relaxation problem (CRP_N) with only N = 10 surpasses (RD_N) with N = 1000 in terms of both computational time and approximation accuracy. In that sense, solving (CRP_N) with a small N is an efficient way to obtain a tight lower bound for (RD).
4.2 Robust Deviation Maximum Sharpe Ratio Problem
For a given expected return vector µ and a positive definite covariance matrix Q, the Sharpe ratio problem finds a portfolio that maximizes the Sharpe ratio:

max_{x∈X} ( µ⊤x − r_f ) / √(x⊤Qx),

where r_f ≥ 0 is the expected return of a riskless asset. The numerator is regarded as the expected excess return on the portfolio, i.e., the return in excess of the risk-free rate r_f, while the denominator indicates the standard deviation of the return. The Sharpe ratio is a measure used to evaluate a portfolio.
Goldfarb and Iyengar (2003b) considered Q and µ to be uncertain, and assumed an interval uncertainty set and a factorized uncertainty set (Goldfarb and Iyengar 2003a,b) for µ and Q, respectively. The objective of the robust counterpart of the maximum Sharpe ratio problem is to choose a portfolio that maximizes the worst-case ratio of the expected excess return to the standard deviation of the return. On the other hand, Krishnamurthy (2004) proposed the robust deviation Sharpe ratio problem, which minimizes the maximum deviation of the Sharpe ratio from the maximum ratio obtained over all possible realizations of the expected return vector µ. The robust deviation counterpart with uncertain expected return vector µ ∈ U is formulated as follows:

min_{x∈X} max_{µ∈U} { f*(µ) − f(x, µ) },   (28)

where X := {x : x ≥ 0, e⊤x = 1},

f(x, µ) = (µ − r_f e)⊤x / √(x⊤Qx), and f*(µ) = max_{x∈X} (µ − r_f e)⊤x / √(x⊤Qx).

It is shown in Krishnamurthy (2004) that when the uncertainty set is given as the convex hull of ℓ vectors µ_1, …, µ_ℓ, which is the polytopic uncertainty set (2), an optimal solution of (28) can be obtained by solving (ℓ + 1) second-order cone programming problems.
As the uncertainty set U, we assume an ellipsoidal uncertainty set in which the quadratic term in x is left unperturbed. We describe µ ∈ U as µ = µ_0 + ∑_{j=1}^{ℓ} u_j µ_j, u ∈ U := {u : ‖u‖ ≤ 1}, and use the notation f(x, u) and f*(u) instead of f(x, µ) and f*(µ), respectively. Applying the techniques of Goldfarb and Iyengar (2003b) and Krishnamurthy (2004), Problem (28) is equivalently transformed into

min_{x, t} t
s.t. f*(u) − ( µ_0 + ∑_{j=1}^{ℓ} u_j µ_j − r_f e )⊤x ≤ t, ∀u ∈ U,
 x⊤Qx ≤ 1, x ≥ 0,   (29)

since the Sharpe ratio is homogeneous in the portfolio x, and furthermore, optimality is achieved at x⊤Qx = 1.
Now we try to solve the robust deviation optimization problem approximately. To formulate relaxation problem (CRP_N), it is necessary to approximate the function f*(u) from below. For that purpose, we prepare a set of random samples U^(N) from U and obtain optimal solutions x_i of min_{x∈X} f(x, u_i) for random samples u_i ∈ U^(N). Using the optimal solutions x_i, i = 1, …, N, we have

f*(u) ≥ f(x_i, u) = f*(u_i) + α(x_i)⊤(u − u_i) / √(x_i⊤Qx_i),

where α(x) = (µ_1⊤x, …, µ_ℓ⊤x)⊤. Then the first constraint of (29) is relaxed into

max_{u∈U} ( α(x_i)/√(x_i⊤Qx_i) − α(x) )⊤u + (r_f e − µ_0)⊤x + f*(u_i) − α(x_i)⊤u_i / √(x_i⊤Qx_i) ≤ t,

and the maximum value with respect to u is obtained as ‖ α(x_i)/√(x_i⊤Qx_i) − α(x) ‖. Therefore, relaxation problem (CRP_N) for (29) results in a second-order cone programming problem.
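A sketch of this second-order cone program in cvxpy (our illustrative rendering, with the sampled data u_i, x_i and f*(u_i) passed in) is:

```python
import numpy as np
import cvxpy as cp

def sharpe_crp(mu0, mus, Q, rf, us, xs, fstars):
    """Relaxation (CRP_N) of (29). mus is the list (mu_1, ..., mu_l); us, xs,
    fstars come from the N sampled subproblems max_{x in X} f(x, u_i)."""
    n, A = len(mu0), np.vstack(mus)      # rows of A are mu_j^T, so alpha(x) = A x
    e = np.ones(n)
    x, t = cp.Variable(n), cp.Variable()
    cons = [x >= 0, cp.quad_form(x, Q) <= 1]
    for ui, xi, fi in zip(us, xs, fstars):
        ai = A @ xi / np.sqrt(xi @ Q @ xi)   # alpha(x_i) / sqrt(x_i^T Q x_i)
        cons.append(cp.norm(ai - A @ x) + (rf * e - mu0) @ x + fi - ai @ ui <= t)
    cp.Problem(cp.Minimize(t), cons).solve()
    return x.value, t.value
```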
We assume that the dimension of the variable x, that is, the number of assets, is 20, and that the dimension ℓ of the uncertain u is 4. The data µ_j, j = 0, 1, …, 4, of µ = µ_0 + ∑_{j=1}^{4} u_j µ_j are given as

µ_0 = (0.6, 0.58, 0.56, …, 0.24, 0.22)⊤,
µ_j = ( 0, …, 0 [5×(j−1) zeros], 0.45, 0.35, 0.25, 0.15, 0.05, 0, …, 0 [5×(4−j) zeros] )⊤,

under the assumption that the nominal return per unit, µ_0, decreases by 0.02 from the 1st asset to the 20th asset, and that each group of 5 assets can behave similarly under various circumstances (for example, they belong to similar industries). Q is constructed randomly based on the uncertain expected return vector µ so that the risk becomes higher as the expected return becomes higher.
Figure 3 shows the difference in optimal portfolios between absolute robust optimization and robust deviation optimization. For convenience, identification numbers from 1 to 20 are given to the assets. The horizontal axis shows the number of each asset, and the vertical axis shows the optimal investment rate x*_i of the ith asset (i = 1, …, 20). The optimal solution of absolute robust optimization shows a similar tendency within each group of 5 assets: from the (k + 1)th to the (k + 5)th asset, k ∈ {0, 5, 10, 15}, the investment rates increase. This tendency is caused by the parameter setting of µ, which implies that the (k + 1)th asset may take a small return per unit in the worst case, though its nominal value is relatively high. We can say that an optimal solution of absolute robust optimization is strongly sensitive to the uncertainty set. On the other hand, in the optimal solution of robust deviation optimization, such a tendency is not observed. A portfolio of the robust deviation optimization problem may be well balanced between return and risk.

Figure 3: Characteristic of optimal solutions between two robust problems (investment rate per asset)
We compare the lower bounds of the two relaxation problems, the sampled relaxation problem (RD_N) and the proposed relaxation problem (CRP_N), fixing the number of random samples N adequately, to see how much (CRP_N) improves the lower bound. For each N, 100 different sets of random samples are drawn and the two relaxation problems (RD_N) and (CRP_N) are constructed. Figure 4 depicts the average optimal values of each relaxation problem together with the maximum and minimum values among the 100 optimal values. The numbers of samples N are taken on a log scale as 10^1, 10^{1.25}, 10^{1.5}, …, 10^3. The optimal value 1.0601 of (CRP_N) with N = 50000 is substituted for opt(RD). Note that the average lower bound of (RD_N) with N = 1000 is achieved by (CRP_N) with only N = 32 ≈ 10^{1.5}. Moreover, the width of the optimal value interval for (CRP_N) tends to be smaller than that for (RD_N). Consequently, the numerical results indicate that the optimal value of (CRP_N) converges to that of (RD) much faster than that of (RD_N).

In terms of computational time, there is no large difference between the two relaxation problems with the same number of random samples N, as Figure 5 shows. Indeed, most of the total computational time is devoted to obtaining f*(u) for u ∈ U^(N), that is, to solving min_{x∈X} f(x, u_i), i = 1, …, N. Concretely, this computation occupied more than 97% of the total computational time necessary for (CRP_N), and more than 99% for (RD_N), with N = 1000 random samples. Noticing that the computational time increases linearly as the number of samples N increases, we see that the proposed relaxation problem (CRP_N) yields a tight lower bound of (RD) with less computational time.
Figure 4: Optimal values of (CRP_N) and (RD_N), plotted against the number of samples N and compared with opt(RD)

Figure 5: Average computational time of (CRP_N) and (RD_N)

4.3 Evaluation of Relaxation Algorithm

We now carry out a-priori and a-posteriori assessments of opt(RD) − opt(CRP_N) for the robust deviation maximum Sharpe ratio problem, and finally evaluate the proposed algorithm, which provides an approximate solution with probabilistic guarantee.
A-priori assessment  We construct the function q(δ) with dim(x) = 20 and ℓ = 4. The Lipschitz constant is evaluated as L = 2 × max_x ‖α(x)‖ = 0.9. For the ellipsoidal uncertainty set, δ ranges over [0, 2L] in q(δ), and for the inverse function q^{-1}(ǫ) the range of ǫ is [0, 1] because q(2L) = 1.

Table 1 indicates the relation between the number of random samples N(ǫ, η) and the theoretical error measures (the theoretical error q^{-1}(ǫ) and its relative error) under the above parameter setting. The relative error is defined by R.err := q^{-1}(ǫ)/|opt(RD)| × 100. The optimal value 1.060 of (CRP_N) obtained with N = 50000 is substituted for opt(RD). Proposition 3.1 guarantees with probability at least 1 − η (= 0.99) that the relaxation problem (CRP_N) constructed with N = 10000 random samples gives a lower bound of (RD) whose approximation error is within q^{-1}(ǫ) = 0.429, which corresponds to a 40.4% relative error. The theoretical approximation error q^{-1}(ǫ) is common to the relaxation problems (RD_N) and (CRP_N), and therefore we expect that (CRP_N) actually achieves a better lower bound whose error is far less than q^{-1}(ǫ). From an empirical point of view, far fewer samples N ≪ 10000 may be required to attain x*_N with approximation error less than 0.429. Certainly, Figure 4 (left) shows that the error of opt(CRP_N) achieved with N = 1000 is far less than the theoretical error q^{-1}(0.125) = 0.724.

Table 1: A-priori assessment for opt(RD) − opt(CRP_N) (η = 0.01)

  ǫ       N(ǫ, η)   q^{-1}(ǫ)   R.err [%]
  0.787     100       1.397      131.8
  0.125    1000       0.724       68.3
  0.020   10000       0.429       40.4

Table 2: A-posteriori assessment for opt(RD) − opt(CRP_N) with x*_N (η = 0.01)

  x*_{N_1} (N_1 = 100):
  δ      M        β_M     A.err   R.err [%]
  0.15   2297     0.030   0.180   16.9
  0.10   11074    0.016   0.116   11.0
  0.05   169026   0.031   0.081    7.6

  x*_{N_2} (N_2 = 1000):
  δ      M        β_M     A.err   R.err [%]
  0.15   2294     0.006   0.156   14.7
  0.10   11057    0.006   0.106   10.0
  0.05   168759   0.010   0.060    5.7

  x*_{N_3} (N_3 = 10000):
  δ      M        β_M      A.err   R.err [%]
  0.15   2285     −0.077   0.073    6.9
  0.10   11013    −0.020   0.080    7.5
  0.05   168073   −0.003   0.047    4.4
A-posteriori assessment  Next, a-posteriori assessments are carried out for x*_{N_i} (i = 1, 2, 3), which are obtained via (CRP_N) with N_1 = 100, N_2 = 1000 and N_3 = 10000, respectively. In the a-posteriori assessment, q(δ, x*_N) is used to evaluate the sample number M. The Lipschitz constant L_{x*_N} of (23) is estimated as L_{x*_N} = ‖α(x*_N)‖ + max_x ‖α(x)‖, whose first term comes from the estimation of | f(x*_N, u) − f(x*_N, v) | and whose second term comes from | f*(u) − f*(v) | for ∀u, v ∈ U. L_{x*_{N_i}} = 0.575 is applicable to all x*_{N_i} (i = 1, 2, 3).

Now, for a fixed M > 0, we compute the worst violation:

β_M = max_{u∈U^(M)} { f*(u) − f(x*_N, u) − t*_N }.

As a theoretical error measure, we define A.err := β_M + δ for the δ satisfying M = ⌈ln η / ln(1 − q(δ, x*_N))⌉ with the fixed M and η = 0.01. Theorem 3.3 guarantees with probability at least 1 − η that the error opt(RD) − opt(CRP_N) is less than A.err whenever δ ∈ (0, 2L_{x*_N}] holds.
Table 3: Iterative relaxation algorithm with probabilistic guarantee (ζ = 0.01, η = 0.01, N_0 = 10)

  δ = 0.15 (ζ + δ ≤ 0.16):
  itr.   V.rate (V/M)   β_M      t*_N
  1      42/2274        0.275    0.856
  2      0/2276         −0.043   1.041

  δ = 0.10 (ζ + δ ≤ 0.11):
  itr.   V.rate (V/M)   β_M      t*_N
  1      248/10963      0.275    0.856
  2      0/11002        −0.012   1.057

  δ = 0.05 (ζ + δ ≤ 0.06):
  itr.   V.rate (V/M)   β_M      t*_N
  1      3653/167295    0.349    0.856
  2      0/168250       −0.004   1.060
Table 2 shows A.err for the solutions x*_{N_1}, x*_{N_2} and x*_{N_3} of (CRP_N). R.err is evaluated by A.err/|opt(RD)| × 100. Note that A.err = β_M + δ can take a small value for a sufficiently small parameter δ, though the sample number M then becomes large and consequently the computational time of the a-posteriori assessment increases. Compared to the a-priori assessment, the a-posteriori one is considerably tighter. Indeed, the a-posteriori assessment guarantees that the relative error of x*_{N_1} is within 7.6% with probability 1 − η (= 0.99), while a 131.8% relative error is guaranteed via the a-priori assessment. Also, from this table, we see that x*_{N_3} is sufficiently close to an optimal solution of (RD), since constraint violation, that is, f*(u) − f(x*_N, u) − t*_N > 0 for u ∈ U^(M), hardly occurs among the randomly chosen M samples.
Now we check the validity of the a-posteriori assessment. We have lower bounds t*_{N_1} = 1.040, t*_{N_2} = 1.045 and t*_{N_3} = 1.055 for (RD), which are achieved by x*_{N_1}, x*_{N_2} and x*_{N_3} of (CRP_N), respectively. Since opt(RD) − t*_N < A.err holds with probability at least 1 − η = 0.99, we see that

t*_{N_1} = 1.040 ≤ opt(RD) < 1.121,  t*_{N_2} = 1.045 ≤ opt(RD) < 1.105,  t*_{N_3} = 1.055 ≤ opt(RD) < 1.102.

The almost optimal value 1.060 of (RD) lies exactly within these ranges.
Evaluation of Algorithm 3.4  To achieve a good approximation of an optimal solution of (RD), a sensible strategy is to solve (CRP_N) with an appropriately large N and to check the accuracy of the obtained solution x*_N via the a-posteriori assessment. If the solution is sufficiently accurate, it can be accepted as an almost optimal solution of (RD).

Table 3 evaluates Algorithm 3.4 with three different permissible errors δ under the common parameters ζ = 0.01, η = 0.01 and N_0 = 10. t*_N is the lower bound of (RD) obtained by (CRP_N) at each iteration of the algorithm, and V.rate shows that V constraints are violated by the current x*_N among the M sampled constraints at each iteration. We see that as a more accurate solution is required with smaller δ, the number of random samples M becomes larger and violation of constraints is more likely to occur, that is, V becomes larger. However, by adding such violated constraints to (CRP_N), the resulting approximate solution improves considerably, and as a result, an approximate solution with probabilistic guarantee is obtained within a few iterations. Note that the solution of the algorithm with δ = 0.05 may be regarded as an optimal solution of (RD), since the lower bound t*_N coincides with 1.060, the almost optimal value of (RD).
This behavior of Algorithm 3.4, terminating within a few iterations, has been observed even with N_0 = 10. The tightness of the approximate solutions of (CRP_N) also contributes to this behavior: the same algorithm with the relaxation problems (CRP_N) replaced by (RD_N) produces only the lower bound t*_N = 1.017 after 5 iterations under the parameter δ = 0.15.
5 Concluding Remarks
In this paper, we have mainly focused on deviation robustness for uncertain convex quadratic optimization problems, and proposed relaxation techniques for robust deviation optimization problems with familiar uncertainty sets. The proposed relaxation problem is formulated with random samples from the given uncertainty set, and the resulting lower bound depends on the drawn random samples. Therefore, our concern was how many samples N we should take to obtain an approximate solution with probabilistic guarantee.
The a-priori assessment makes it possible to estimate N for a required accuracy, while once a solution has been computed, the a-posteriori assessment evaluates the accuracy of that solution. We applied theoretical results on a-priori and a-posteriori assessments of the approximation error to the application examples of the linear least squares problem and the maximum Sharpe ratio problem. Numerical results indicate that the a-posteriori assessment of approximation error is effectively utilized in our proposed algorithm. Furthermore, from the numerical results, we see that the proposed relaxation technique is extremely effective compared to the simple sampling technique. Indeed, in the example of the linear least squares problem, the proposed relaxation problem (CRP_N) with only N = 10 surpasses the sampled relaxation problem (RD_N) with N = 1000 in terms of computational time and approximation accuracy.
The proposed relaxation technique and the a-priori and a-posteriori assessments may be applicable to other kinds of uncertainty sets and also to relative robust optimization problems. One possible extension is to adapt the relaxation technique and algorithm presented here to a wide range of robust optimization problems with deviation robustness or relative robustness.
References
Aron I.D, Hentenryck P.V. On the complexity of the robust spanning tree problem with interval
data. Operations Research Letters 2004;32; 36-40.
Ben-Tal A, Nemirovski A. Robust convex optimization. Mathematics of Operations Research 1998;23;
769-805.
Ben-Tal A, Nemirovski A. Robust solutions of uncertain linear programs. Operations Research
Letters 1999;25; 1-13.
Calafiore G, Campi M.C. A new bound on the generalization rate of sampled convex programs. 43rd
IEEE Conference on Decision and Control (CDC04) 2004;5; 5328-5333.
Calafiore G, Campi M.C. Uncertain convex programs: randomized solutions and confidence levels.
Mathematical Programming 2005;102; 25-46.
El Ghaoui L, Lebret H. Robust solutions to least-squares problems with uncertain data. SIAM
Journal on Matrix Analysis and Applications 1997;18; 1035-1064.
Goldfarb D, Iyengar G. Robust convex quadratically constrained programs. Mathematical Program-
ming 2003;97; 495-515.
Goldfarb D, Iyengar G. Robust portfolio selection problems. Mathematics of Operations Research
2003;28; 1-38.
Hites R, Salazar-Neumann M. The robust deviation p-elements problem with interval data. Tech-
nical Report; Service de Mathematiques de la Gestion; Universite Libre de Bruxelles; 2004.
www.ulb.ac.be/polytech/smg/indexpublications.html.
Kanamori T, Takeda A. Worst-case violation of sampled convex programs for optimization with
uncertainty. Research Report B-425; Dept. of Mathematical and Computing Sciences; Tokyo
Institute of Technology; 2006. www.is.titech.ac.jp/research/research-report/B/index.html
Kouvelis P, Yu G. Robust discrete optimization and its applications. Kluwer Academic Publishers;
Norwell; 1997.
Krishnamurthy V. Robust optimization in finance. Second Summer Paper for the Doctoral Program
(supervised by Tutuncu R.H.); 2004. Preprint.
Tempo R, Calafiore G, Dabbene F. Randomized algorithms for analysis and control of uncertain
systems. Springer-Verlag London Limited; 2005.
Yaman H, Karasan O.E, Pinar M.C. Restricted robust optimization for maximization over uni-
form matroid with interval data uncertainty. Technical Report; Bilkent University; 2004.
www.bilkent.edu.tr/∼hyaman/RRD.htm.
Yaman H, Karasan O.E, Pinar M.C. The robust spanning tree problem with interval data. Operations
Research Letters 2001;29; 31-40.
Yu G, Yang J. On the robust shortest path problem. Computers & Operations Research 1998;25; 457-468.