SIAM J. MATRIX ANAL. APPL., Vol. 28, No. 2, pp. 425–445. © 2006 Society for Industrial and Applied Mathematics.

FINDING A GLOBAL OPTIMAL SOLUTION FOR A QUADRATICALLY CONSTRAINED FRACTIONAL QUADRATIC PROBLEM WITH APPLICATIONS TO THE REGULARIZED TOTAL LEAST SQUARES∗
AMIR BECK† , AHARON BEN-TAL† , AND MARC TEBOULLE‡
Abstract. We consider the problem of minimizing a fractional quadratic problem involving the ratio of two indefinite quadratic functions, subject to a two-sided quadratic form constraint. This formulation is motivated by the so-called regularized total least squares (RTLS) problem. A key difficulty with this problem is its nonconvexity, and all currently known methods to solve it are guaranteed only to converge to a point satisfying first order necessary optimality conditions. We prove that a global optimal solution to this problem can be found by solving a sequence of very simple convex minimization problems parameterized by a single parameter. As a result, we derive an efficient algorithm that produces an ε-global optimal solution in a computational effort of O(n^3 log ε^{−1}). The algorithm is tested on problems arising from the inverse Laplace transform and image deblurring. Comparison to other well-known RTLS solvers illustrates the attractiveness of our new method.
Key words. regularized total least squares, fractional programming, nonconvex quadratic optimization, convex programming
AMS subject classifications. 65F20, 90C20, 90C32
DOI. 10.1137/040616851
1. Introduction. In this paper we consider the problem of minimizing a fractional quadratic function subject to a quadratic constraint:
    min_{x ∈ F} f_1(x)/f_2(x),    (1)
where
    f_i(x) = x^T A_i x − 2 b_i^T x + c_i,   i = 1, 2,    (2)
A_1, A_2 ∈ R^{n×n} are symmetric matrices, b_1, b_2 ∈ R^n, c_1, c_2 ∈ R, and 0 ≤ L < U. We do not assume that A_1 and A_2 are positive semidefinite, and the only assumption required for the problem to be well defined is that f_2(x) is bounded away from zero. We will discuss two cases of the feasible set F:
    F_1 = {x ∈ R^n : L^2 ≤ x^T T x ≤ U^2},
where T is a positive definite matrix and U > L ≥ 0, and
    F_2 = {x ∈ R^n : x^T B x ≤ U^2},
where B is a positive semidefinite matrix and U > 0.
∗Received by the editors October 13, 2004; accepted for publication (in revised form) by P. C. Hansen November 29, 2005; published electronically May 26, 2006.
http://www.siam.org/journals/simax/28-2/61685.html
†MINERVA Optimization Center, Department of Industrial Engineering and Management, Technion–Israel Institute of Technology, Haifa 3200, Israel ([email protected], [email protected]). The research of the second author was partially supported by BSF grant 2002038.
‡School of Mathematical Sciences, Tel-Aviv University, Ramat-Aviv 69978, Israel ([email protected]). The research of this author was partially supported by BSF grant 2002010.
The major difficulty associated with problem (1) is the nonconvexity of the objective function and, in the case of F_1, also the nonconvexity of the feasible set.
The main motivation for considering problem (1) comes from the so-called regularized total least squares (RTLS) problem. Many problems in data fitting and estimation give rise to an overdetermined system of linear equations Ax ≈ b, where both the matrix A ∈ R^{m×n} and the vector b ∈ R^m are contaminated by noise. The total least squares (TLS) approach to this problem [9, 10, 15] is to seek a perturbation matrix E ∈ R^{m×n} and a perturbation vector r ∈ R^m that minimize ‖E‖^2 + ‖r‖^2 subject to the consistency equation (A + E)x = b + r (here and elsewhere in this paper a matrix norm is always the Frobenius norm and a vector norm is the Euclidean one). The TLS approach has been extensively used in a variety of scientific disciplines such as signal processing, automatic control, statistics, physics, economics, biology, and medicine (see, e.g., [15] and the references therein). The TLS problem has essentially an explicit solution, expressed via the singular value decomposition of the augmented matrix (A, b).
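The classical SVD-based TLS solution mentioned above can be sketched in a few lines of NumPy: the solution is read off the right singular vector of (A, b) associated with the smallest singular value. The function name and the guard for the nongeneric case are ours:

```python
import numpy as np

def tls(A, b):
    """Unregularized TLS via the SVD of the augmented matrix (A, b).

    The solution comes from the right singular vector associated with the
    smallest singular value; it exists when that vector's last entry is
    nonzero (the "generic" case)."""
    m, n = A.shape
    _, _, Vt = np.linalg.svd(np.column_stack([A, b]))
    v = Vt[-1]                     # right singular vector for smallest sigma
    if abs(v[n]) < 1e-12:
        raise ValueError("nongeneric TLS problem: last component vanishes")
    return -v[:n] / v[n]

# On a consistent system the TLS solution coincides with the exact solution,
# since (x_true, -1) spans the null space of (A, b).
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
x_true = np.array([1.0, -1.0])
b = A @ x_true
x = tls(A, b)
```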
Regularization of the TLS solution is required in the case where A is nearly rank deficient. Such problems arise, for example, from the discretization of ill-posed problems such as integral equations of the first kind (see, e.g., [8, 13] and the references therein). In these problems the TLS solution can be physically meaningless, and thus regularization is employed in order to stabilize the solution.
Regularization of the TLS solution was addressed by several approaches: truncation methods [5, 13], Tikhonov regularization [8], and recently by introducing a quadratic constraint [20, 11, 8]. All the above methods are still trapped in the nonconvexity of the problem and thus are not guaranteed to converge to a global optimum. At best, they are proven to converge to a point satisfying first order necessary optimality conditions. In contrast, in this paper, we develop an efficient algorithm which finds the global optimal solution by converting the original problem into a sequence of very simple convex optimization problems parameterized by a single parameter α. The optimal solution corresponds to a particular value of α, which can be found by a simple one-dimensional search. The algorithm finds an ε-optimal solution x* of (1), i.e.,

    f_1(x*)/f_2(x*) ≤ min_{x ∈ F} f_1(x)/f_2(x) + ε,

in a computational effort of order O(n^3 log(1/ε)).
The paper is organized as follows. In the next section, we show how to recover the formulation of the RTLS problem as a quadratically constrained fractional quadratic problem. Section 3 describes a schematic algorithm designed to solve (1) for general quadratic functions f_1 and f_2; it provides the starting point for the analysis and the main results that are developed in section 4. In section 5 we return to the RTLS problem and give a detailed algorithm (RTLSC) for its solution. In order to illustrate the performance of algorithm RTLSC, two problems from the “Regularization Tools” package [13] are employed: a problem that arises from the discretization of the inverse Laplace transform and an image deblurring problem. These numerical examples are reported in section 6, where we also compare the performance of our algorithm RTLSC with other well-known RTLS solvers. Some useful technical results used throughout the paper are collected in the appendix.
2. The RTLS problem. In this section we show how to recover a known formulation of the RTLS problem as a quadratically constrained fractional quadratic
programming problem. This result is well known [9, 15, 20]. However, we believe that the derivation we give below is simpler. The RTLS problem as stated in [20] is
    min_{E,r,x} ‖E‖^2 + ‖r‖^2
    subject to (A + E)x = b + r,   x ∈ F_2.    (3)
To show that the RTLS problem (3) is a special case of problem (1), let us write (3) as

    min_{x ∈ F_2}  min_{E,r : (A+E)x = b+r} ‖E‖^2 + ‖r‖^2.    (4)
Next, fix x ∈ F_2 and consider the inner minimization problem in (4). Denote w = vec(E, r), where, for a matrix M, vec(M) denotes the vector obtained by stacking the columns of M. The linear constraint (in E and r) (A + E)x = b + r can be written as Q_x w = b − Ax, where

    Q_x = ( x̄^T   0   · · ·   0
             0   x̄^T  · · ·   0
             :     :           :
             0     0   · · ·  x̄^T )

and x̄ = (x^T, −1)^T. Thus, the inner minimization problem in (4) takes the form

    min_{Q_x w = b − Ax} ‖w‖^2.    (5)
Using the KKT conditions, it is easy to see that the solution of (5) is attained at w = Q_x^T (Q_x Q_x^T)^{−1}(b − Ax), and as a result the optimal value of problem (5) is equal to

    (b − Ax)^T (Q_x Q_x^T)^{−1}(b − Ax).

Since Q_x Q_x^T = ‖x̄‖^2 I, we deduce that the value of the inner minimization problem (5) is equal to ‖Ax − b‖^2/‖x̄‖^2 = ‖Ax − b‖^2/(‖x‖^2 + 1). Consequently, the RTLS problem (3) reduces to

    min_{x ∈ F_2} ‖Ax − b‖^2/(‖x‖^2 + 1),    (6)

which is indeed a special case of problem (1).
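The closed-form value of the inner problem (5) is easy to verify numerically. The sketch below builds Q_x explicitly (assuming w stacks the rows of (E, r), which is what the block structure of Q_x requires) and compares the minimum-norm solution of (5) with ‖Ax − b‖^2/(‖x‖^2 + 1):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x = rng.standard_normal(n)          # any fixed x

xbar = np.append(x, -1.0)           # xbar = (x^T, -1)^T
Q = np.kron(np.eye(m), xbar)        # block-diagonal Q_x: one xbar^T per block
rhs = b - A @ x

# Minimum-norm solution of the underdetermined system Q_x w = b - Ax
# (np.linalg.lstsq returns the least-squares solution of smallest norm)
w, *_ = np.linalg.lstsq(Q, rhs, rcond=None)

inner_value = float(w @ w)
closed_form = float(rhs @ rhs) / (float(x @ x) + 1.0)
```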
3. A schematic algorithm. We consider problem (1) and henceforth make the following assumption.
Assumption 1. f_2 is bounded below on F by a positive number N.

Let m and M be numbers such that

    m ≤ min_{x ∈ F} f_1(x)/f_2(x) ≤ M.    (7)

Such bounds are easy to find; see section 4.3.

Remark 3.1. For the RTLS problem (6), Assumption 1 is trivially satisfied for N = 1. The lower bound m can be chosen as 0, and M can be taken to be f(0) = ‖b‖^2.
Although both the denominator and the numerator in the RTLS problem (6) are convex, this property does not make the problem simpler, since the quotient of convex functions is not necessarily convex.
A simple observation that goes back to Dinkelbach [4] and will enable us to solve (1) is the following.

Observation. The following two statements are equivalent:
1. min_{x ∈ F} f_1(x)/f_2(x) ≤ α.
2. min_{x ∈ F} {f_1(x) − α f_2(x)} ≤ 0.

Using the above observation, we can solve (1) by the following schematic bisection algorithm.
Schematic Algorithm

Initial Step: Set lb_0 = m and ub_0 = M.
General Step: For every k ≥ 1:
1. Define α_k = (lb_{k−1} + ub_{k−1})/2.
2. Calculate β_k = min_{x ∈ F} {f_1(x) − α_k f_2(x)}.
   (a) If β_k ≤ 0, then define lb_k = lb_{k−1} and ub_k = α_k.
   (b) If β_k > 0, then define lb_k = α_k and ub_k = ub_{k−1}.
Stopping Rule: Stop at the first iteration k* that satisfies ub_{k*} − lb_{k*} ≤ ε.
Output:

    x* ∈ argmin_{x ∈ F} {f_1(x) − ub_{k*} f_2(x)}.    (8)
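In code, the schematic algorithm is a plain bisection around the inner oracle of step 2. The sketch below is illustrative only: `inner_min` is a brute-force stand-in on a 1-D grid, whereas the paper's oracle is the convex subproblem developed in section 4.

```python
import numpy as np

def fractional_bisection(inner_min, m, M, eps):
    """Dinkelbach-style bisection for min f1/f2 over F (Schematic Algorithm).

    inner_min(alpha) must return (beta, x): the minimal value and a minimizer
    of f1(x) - alpha * f2(x) over F."""
    lb, ub = m, M
    while ub - lb > eps:
        alpha = 0.5 * (lb + ub)
        beta, _ = inner_min(alpha)
        if beta <= 0:
            ub = alpha      # statement 2 of the observation holds: alpha* <= alpha
        else:
            lb = alpha
    return inner_min(ub)[1]  # output rule (8)

# Toy 1-D instance: f1(x) = (x - 2)^2 + 1, f2(x) = x^2 + 1, F = [-3, 3].
grid = np.linspace(-3.0, 3.0, 20001)
f1 = lambda x: (x - 2.0) ** 2 + 1.0
f2 = lambda x: x ** 2 + 1.0

def inner_min(alpha):
    # brute-force stand-in for the convex subproblem of section 4
    vals = f1(grid) - alpha * f2(grid)
    j = int(np.argmin(vals))
    return vals[j], grid[j]

# m = 0 is valid since f1 >= 0 and f2 > 0; M = f1(0)/f2(0) as in Remark 3.1
x_star = fractional_bisection(inner_min, m=0.0, M=f1(0.0) / f2(0.0), eps=1e-6)
best = float(np.min(f1(grid) / f2(grid)))
```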
Proposition 3.1. The schematic algorithm ends after ⌈ln((M − m)/ε)/ln 2⌉ iterations with an output x* that is an ε-optimal solution of problem (1). More precisely,

    x* ∈ F,   α* ≤ f_1(x*)/f_2(x*) ≤ α* + ε,

where α* = min_{x ∈ F} f_1(x)/f_2(x).
Proof. The length of the initial interval is ub_0 − lb_0 = M − m. By the definition of lb_k and ub_k, we have that for every k ≥ 1, ub_k − lb_k = (1/2)(ub_{k−1} − lb_{k−1}), and therefore ub_k − lb_k = (M − m)(1/2)^k. From this it follows that k*, the number of iterations of the schematic algorithm, is the smallest integer k satisfying

    (M − m)(1/2)^k ≤ ε,

which is equivalent to k ≥ ⌈ln((M − m)/ε)/ln 2⌉. By (8), x* is feasible, i.e., x* ∈ F. Also, by the definition of the bisection process we have that lb_k ≤ α* ≤ ub_k. By (8) we have that lb_k ≤ α* ≤ f_1(x*)/f_2(x*) ≤ ub_k for every k, and finally, since ub_{k*} ≤ lb_{k*} + ε, the result follows.

Remark 3.2. By writing “min” and not “inf” in statements 1 and 2 of the observation and in the above scheme, we implicitly assumed that the minimum of the corresponding problems is attained (which is certainly the case when F = F_1).
Otherwise, the inequalities in the statements of the observation should be replaced by strict inequalities and the schematic algorithm revised accordingly. The schematic algorithm then terminates with a point x*, at which the objective value is at most ε away from the infimum. Thus henceforth we will assume that the minimum is attained.
To convert the schematic algorithm to a practical scheme we still need to address the following two questions:
1. How do we choose the lower and upper bounds m and M?
2. How do we solve the subproblem

    min_{x ∈ F} {f_1(x) − α f_2(x)}?    (9)

The first question is rather easy (see section 4.3). The second one is seemingly more difficult since problem (9), like the original problem (1), is nonconvex. In the next section we give complete answers to these two questions.
4. Analysis and main results. In sections 4.1 and 4.2 we show how to efficiently solve the subproblem (9). We first transform problem (9) into a convex optimization problem by using the methodology of Ben-Tal and Teboulle [2]. We then show that the solution of the derived convex optimization problem consists of one eigenvector decomposition and solutions of at most two one-dimensional secular equations [17]. Finally, in section 4.3 we show how to find the lower and upper bounds m and M.
4.1. Solving the subproblem in the case F = F_1. In this section we consider the case in which the feasible set is equal to {x : L^2 ≤ x^T T x ≤ U^2}, where T is a positive definite matrix. Notice that in this case the feasible set is compact, and thus the minimum is always attained, both in the original problem (1) and in the subproblem (9). First, we convert problem (9) to one with a Euclidean norm constraint by making the change of variables s = T^{1/2} x. The result is the following optimization problem:

    min_{L^2 ≤ ‖s‖^2 ≤ U^2} {f_1(T^{−1/2} s) − α f_2(T^{−1/2} s)}.    (10)
Using the notation

    A = T^{−1/2}(A_1 − α A_2) T^{−1/2},
    b = T^{−1/2}(b_1 − α b_2),
    c = c_1 − α c_2,

we obtain that problem (10) is the same as

    (P) :  min_{L ≤ ‖s‖ ≤ U} {s^T A s − 2 b^T s + c}.    (11)
A is symmetric and hence can be diagonalized by an orthogonal matrix U, so that

    U^T A U = D = diag(λ_1, λ_2, . . . , λ_n),    (12)

where λ_1 ≥ λ_2 ≥ · · · ≥ λ_n. Making the change of variables s = Uz, we obtain that (11) is equivalent to

    min_{L^2 ≤ ‖z‖^2 ≤ U^2} { Σ_{j=1}^n (λ_j z_j^2 − 2 f_j z_j) + c },    (13)
where f = U^T b. The following lemma will enable us to transform problem (13) into a convex optimization problem.

Lemma 4.1. Let (z*_1, z*_2, . . . , z*_n) be an optimal solution of

    min_{L^2 ≤ ‖z‖^2 ≤ U^2} q(z),

where

    q(z) = Σ_{j=1}^n (λ_j z_j^2 − 2 f_j z_j).    (14)

Then z*_j f_j ≥ 0 for every j = 1, 2, . . . , n for which f_j ≠ 0.
Proof. Since w = (z*_1, z*_2, . . . , z*_n) is optimal it is in particular feasible, i.e., L^2 ≤ ‖w‖^2 ≤ U^2. An immediate result is that (z*_1, . . . , z*_{k−1}, −z*_k, z*_{k+1}, . . . , z*_n) is also feasible for every k = 1, 2, . . . , n. Since w is optimal we have that for every k = 1, 2, . . . , n,

    q(z*_1, . . . , z*_n) ≤ q(z*_1, . . . , z*_{k−1}, −z*_k, z*_{k+1}, . . . , z*_n).    (15)

Substituting (14) into (15) yields

    Σ_{j=1}^n (λ_j (z*_j)^2 − 2 f_j z*_j) ≤ Σ_{j=1, j≠k}^n (λ_j (z*_j)^2 − 2 f_j z*_j) + λ_k (−z*_k)^2 + 2 f_k z*_k.

Therefore, f_k z*_k ≥ 0, and the result follows.
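The mechanism of the proof is easy to check numerically: flipping every coordinate of z to carry the sign of the corresponding f_j leaves ‖z‖ (and hence feasibility) unchanged and never increases q. A small NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
lam = np.sort(rng.standard_normal(n))[::-1]   # lambda_1 >= ... >= lambda_n, possibly indefinite
f = rng.standard_normal(n)

def q(z):
    return float(np.sum(lam * z ** 2 - 2.0 * f * z))

# Sign-aligning z preserves |z_j| (hence the norm and feasibility in
# L^2 <= ||z||^2 <= U^2) and can only decrease the linear term of q.
gaps, norm_gaps = [], []
for _ in range(1000):
    z = rng.standard_normal(n)
    z_aligned = np.sign(f) * np.abs(z)
    gaps.append(q(z_aligned) - q(z))          # should never be positive
    norm_gaps.append(abs(np.linalg.norm(z_aligned) - np.linalg.norm(z)))
worst_gap = max(gaps)
worst_norm_gap = max(norm_gaps)
```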
Note that if f_j = 0 for some j, then the objective function q(z) is symmetric with respect to z_j, and as a result we can arbitrarily restrict z_j to be nonnegative or nonpositive. In view of this and Lemma 4.1, we can make the change of variables

    z_j = sign(f_j) √v_j,   j = 1, 2, . . . , n,    (16)

where v_j ≥ 0. Substituting (16) into (13), we conclude that problem (9) is equivalent to the convex optimization problem

    min_{v_j ≥ 0} { Σ_{j=1}^n (λ_j v_j − 2 |f_j| √v_j) + c : L^2 ≤ Σ_{j=1}^n v_j ≤ U^2 }.    (17)
Proposition 4.1. Let A ∈ R^{n×n} be a symmetric matrix, b ∈ R^n, c ∈ R, and let the spectral decomposition of A be given by A = UDU^T, where D = diag(λ_1, λ_2, . . . , λ_n) and λ_1 ≥ λ_2 ≥ · · · ≥ λ_n. Then the global solution to the optimization problem

    min_{L^2 ≤ ‖s‖^2 ≤ U^2} {s^T A s − 2 b^T s + c}

is given by s = Uz, where

    z_j = sign(f_j) √v_j,   j = 1, 2, . . . , n,

and v is the solution of the convex optimization problem (17).

Proposition 4.1 shows that the main step in the schematic algorithm (step 2) consists of solving the linearly constrained convex optimization problem (17). This
will be done by solving the dual problem since, as we are about to show, the latter requires the solution of at most two single-variable convex problems.
To develop the dual problem of (17), we assign a nonnegative multiplier ξ to the linear inequality constraint −Σ_{j=1}^n v_j + L^2 ≤ 0 and a nonpositive multiplier η to the linear inequality constraint −Σ_{j=1}^n v_j + U^2 ≥ 0 and form the Lagrangian of (17):

    L(v, η, ξ) = Σ_{j=1}^n (λ_j v_j − 2 |f_j| √v_j) − η (Σ_{j=1}^n v_j − U^2) + ξ (−Σ_{j=1}^n v_j + L^2) + c
               = Σ_{j=1}^n ((λ_j − η − ξ) v_j − 2 |f_j| √v_j) + η U^2 + ξ L^2 + c.    (18)
Differentiating (18) with respect to v_j and equating to zero, we obtain

    v_j = f_j^2/(λ_j − η − ξ)^2,   j = 1, 2, . . . , n,    (19)

subject to the conditions η + ξ < λ_n, η ≤ 0, and ξ ≥ 0. Thus, the dual objective function is given by

    inf_{v_j ≥ 0} L(v, η, ξ) = h(η, ξ)   if η + ξ < λ_n, η ≤ 0, ξ ≥ 0,   and −∞ otherwise,

where

    h(η, ξ) = −Σ_{j=1}^n f_j^2/(λ_j − η − ξ) + η U^2 + ξ L^2 + c,

and the dual problem of (17) is

    (D) :  max_{η,ξ} { h(η, ξ) : η + ξ < λ_n, η ≤ 0, ξ ≥ 0 }.
From duality theory for convex optimization problems we have that [19, 3]
val(P) = val(D),
where val(P) (val(D)) denotes the optimal value of problem (P) (problem (D)). Now we note that the dual variables η and ξ cannot both be nonzero, since in that case we would have by the complementary slackness condition that Σ_{j=1}^n v_j is equal to both U^2 and L^2, which is clearly a contradiction. As a result, instead of considering problem (D) in two variables, we can consider the following two single-variable convex optimization problems (maximization of concave functions subject to a simple convex bound constraint):
    (D1) :  max_{η ≤ min{λ_n, 0}}  h(η, 0) = −Σ_{j=1}^n f_j^2/(λ_j − η) + η U^2 + c

and

    (D2) :  max_{0 ≤ ξ < λ_n}  h(0, ξ) = −Σ_{j=1}^n f_j^2/(λ_j − ξ) + ξ L^2 + c.
We thus obtain that in order to solve (D), we need to perform the following three steps:
1. Find a solution η of (D1).
2. Find a solution ξ of (D2).
3. If h(η, 0) > h(0, ξ), then the solution of (D) is (η, 0). Otherwise, the solution is (0, ξ).

Notice that both (D1) and (D2) are easy problems to solve since they consist of maximizing a concave function of a single variable. A very efficient algorithm for solving problems with the exact structure of (D1) and (D2) will be discussed at the end of this section.
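The three steps above can be sketched in NumPy. This is an illustrative implementation under our own naming: the stationary points of (D1) and (D2) are located here by bisection on the derivative (rather than by algorithm SEC, discussed at the end of this section), and strong duality is used as a self-check.

```python
import numpy as np

def solve_annulus_qp(lam, f, c, L, U, tol=1e-12):
    """Minimize sum_j (lam_j z_j^2 - 2 f_j z_j) + c over L^2 <= ||z||^2 <= U^2
    via the dual recipe of section 4.1: solve (D1) and (D2), keep the better
    multiplier, and recover z from (16)/(19).  Assumes lam is sorted
    nonincreasingly and all f_j != 0.  Sketch only."""
    lam, f = np.asarray(lam, float), np.asarray(f, float)
    lam_n = lam[-1]
    G = lambda t: float(np.sum(f ** 2 / (lam - t) ** 2))   # increasing for t < lam_n
    h = lambda t, r2: float(-np.sum(f ** 2 / (lam - t)) + t * r2 + c)

    def argmax_h(r2, lo, hi):
        # h(., r2) is concave with derivative r2 - G(t); bisect on its sign
        if G(hi) <= r2:
            return hi          # still increasing at the right end
        if G(lo) >= r2:
            return lo          # already decreasing at the left end
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if G(mid) < r2 else (lo, mid)
        return 0.5 * (lo + hi)

    hi1 = min(lam_n - 1e-9, 0.0)                       # (D1): eta <= min(lam_n, 0)
    eta, xi = argmax_h(U ** 2, hi1 - 1e6, hi1), 0.0
    if lam_n > 0:                                      # (D2): 0 <= xi < lam_n
        xi2 = argmax_h(L ** 2, 0.0, lam_n - 1e-9)
        if h(xi2, L ** 2) > h(eta, U ** 2):            # step 3: keep the better one
            eta, xi = 0.0, xi2
    t, r2 = (eta, U ** 2) if xi == 0.0 else (xi, L ** 2)
    z = f / (lam - eta - xi)                           # primal recovery, (19)
    return z, h(t, r2)

# Indefinite example; the global solution lands on the outer sphere ||z|| = U.
lam, f, c, L, U = [2.0, 1.0, -1.0], [1.0, 1.0, 1.0], 0.0, 1.0, 2.0
z, val = solve_annulus_qp(lam, f, c, L, U)
primal = float(np.sum(np.array(lam) * z ** 2) - 2.0 * np.dot(f, z) + c)

# Weak duality guarantees val <= objective at every feasible point:
rng = np.random.default_rng(3)
dirs = rng.standard_normal((20000, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
pts = dirs * rng.uniform(L, U, (20000, 1))
sampled = np.sum(np.array(lam) * pts ** 2, axis=1) - 2.0 * pts @ np.array(f) + c
sampled_min = float(sampled.min())
```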
We summarize our results on the solution of (11) in the following theorem.
Theorem 4.1. Let A ∈ R^{n×n} be a symmetric matrix, b ∈ R^n, c ∈ R, and let the spectral decomposition of A be given by A = UDU^T, where D = diag(λ_1, λ_2, . . . , λ_n) with λ_1 ≥ λ_2 ≥ · · · ≥ λ_n. Then the solution to the optimization problem

    min_{L^2 ≤ ‖s‖^2 ≤ U^2} {s^T A s − 2 b^T s + c}

is s = Uz, where z ∈ R^n is given by

    z_j = f_j/(λ_j − η* − ξ*),   j = 1, 2, . . . , n,

with (η*, ξ*) given by

    (η*, ξ*) = (η, 0)   if [λ_n > 0 and h(η, 0) > h(0, ξ)] or λ_n ≤ 0,
    (η*, ξ*) = (0, ξ)   if [λ_n > 0 and h(η, 0) ≤ h(0, ξ)],

where η and ξ are the optimal solutions of problems (D1) and (D2), respectively.

As was already mentioned, solving problems (D1) and (D2) is an easy task; to demonstrate this fact, let us consider the solution of (D1) in the case λ_n ≤ 0 (all other instances can be treated similarly). In this case, (D1) takes the following form:
    max_{η < λ_n} { −Σ_{j=1}^n f_j^2/(λ_j − η) + η U^2 + c }.
Since h_1(η) = h(η, 0) is continuous and strictly concave for η < λ_n and also satisfies

    lim_{η → −∞} h_1(η) = −∞,    lim_{η → λ_n^−} h_1(η) = −∞,

we conclude that the maximum is attained at a unique point η < λ_n that satisfies h'_1(η) = 0. Therefore, in this case we need to find the unique root of the following so-called secular equation [17]:

    η < λ_n,   G(η) = U^2,    (20)

where

    G(η) ≡ Σ_{j=1}^n f_j^2/(η − λ_j)^2.    (21)
Finding the unique root of the secular equation (20), which lies to the left of λ_n, is a well-studied problem (see, e.g., [17, 7]). Specifically, Melman [17] transforms the problem into the equivalent problem

    G^{−1/2}(η) = U^{−1},    (22)
for which Newton’s method exhibits global quadratic convergence. The algorithm is as follows.
Algorithm SEC.
Input: (f, Λ, U), where f ∈ R^n, Λ = diag(λ_1, λ_2, . . . , λ_n) with λ_1 ≥ λ_2 ≥ · · · ≥ λ_n, and U > 0.
Output: η* < λ_n that satisfies |G(η*) − U^2| < ε_2, where G is defined in (21).
Initial step: η_0 = λ_n − ε_1.
General step: for every k ≥ 0,

    η_{k+1} = η_k + 2 (G^{−1/2}(η_k) − U^{−1}) / (G^{−3/2}(η_k) G'(η_k)).

Stopping rule: Stop at the first iteration k* that satisfies |G(η_{k*}) − U^2| < ε_2. Set η* = η_{k*}.
In our implementation the tolerance parameters ε_1 and ε_2 take the values ε_1 = 10^{−4}, ε_2 = 10^{−15}. Melman’s algorithm solves the secular equation very fast (typically 5 or 6 iterations suffice to achieve 15 digit accuracy, independently of n).
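Algorithm SEC amounts to a few lines of NumPy. The sketch below follows the update rule above and reproduces the example considered next in the text (λ_i = i, f_i = 1, U = 1); the iteration cap and helper names are ours:

```python
import numpy as np

def sec(f, lam, U, eps1=1e-4, eps2=1e-15, max_iter=100):
    """Algorithm SEC: Newton's method applied to Melman's transformed
    equation G(eta)^(-1/2) = 1/U, started at eta_0 = lam_n - eps1.
    `lam` must be sorted nonincreasingly."""
    lam_n = lam[-1]
    G  = lambda t: float(np.sum(f ** 2 / (t - lam) ** 2))
    Gp = lambda t: float(np.sum(-2.0 * f ** 2 / (t - lam) ** 3))
    eta = lam_n - eps1                          # initial step
    for _ in range(max_iter):
        g = G(eta)
        if abs(g - U ** 2) < eps2:              # stopping rule
            break
        eta += 2.0 * (g ** -0.5 - 1.0 / U) / (g ** -1.5 * Gp(eta))
    return eta

# The example from the text: n = 100, lam_i = i, f_i = 1, U = 1.
lam = np.arange(100, 0, -1, dtype=float)        # nonincreasing; lam_n = 1
f = np.ones(100)
eta_star = sec(f, lam, 1.0)
resid = abs(float(np.sum(1.0 / (eta_star - lam) ** 2)) - 1.0)
```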
Example. To demonstrate the rate of convergence of algorithm SEC we consider problem (20) with n = 100, λ_i = i, f_i = 1 (i = 1, 2, . . . , 100), and U = 1. We compare algorithm SEC with a simple bisection algorithm with initial interval [−100, λ_n] and a stopping criterion identical to that of algorithm SEC.
Table 1. Quadratic rate of convergence of Melman’s algorithm.
From Table 1 it is clear that the algorithm exhibits a quadratic rate of convergence right from the very first iteration. The bisection algorithm terminated in this example after 55 iterations.
The dominant computational effort when solving the subproblem in the case F = F_1 consists of (i) the calculation of the matrices T^{1/2} and T^{−1/2} and (ii) the spectral decomposition of the matrix A. Each requires a computational effort of O(n^3). By Proposition 3.1, the schematic algorithm requires solving O(log ε^{−1}) subproblems in order to generate an ε-global optimal solution. We thus conclude that the overall computational effort of the schematic algorithm is O(n^3 log ε^{−1}).
4.2. Solving the subproblem in the case F = F_2. Here we consider problem (9) in the case where the feasible set is F_2 = {x : x^T B x ≤ U^2}, where B is positive semidefinite but not positive definite. Thus, the subproblem in step 2 of the schematic algorithm under consideration here is

    β* = min_{x^T B x ≤ U^2} {x^T A x − 2 b^T x + c},    (23)

where

    A = A_1 − α A_2,   b = b_1 − α b_2,   c = c_1 − α c_2.
Notice that since B is singular the feasible set F_2 is not compact, and therefore the value of the subproblem (23) might be −∞. This issue is addressed in the following.
Lemma 4.2. Let A ∈ R^{n×n} be a symmetric matrix, b ∈ R^n, c ∈ R, U > 0, and let B ∈ R^{n×n} be a positive semidefinite matrix. Then
1. if there exists λ ≥ 0 such that A + λB ≻ 0, then the minimum of (23) is finite: β* > −∞;
2. if no λ ≥ 0 exists such that A + λB ⪰ 0, then the minimum of (23) is not finite.

Proof. The optimal value β* of problem (23) is finite if and only if the following statement is true:

    ∃ μ ∈ R :  x^T B x ≤ U^2 ⇒ x^T A x − 2 b^T x + c ≥ μ,    (24)

which by the S-lemma (see Lemma A.1 in the appendix) is equivalent to

    ∃ μ ∈ R, λ ∈ R_+ :  ( A   −b ; −b^T   c − μ ) ⪰ λ ( −B   0 ; 0   U^2 ),

which can also be written as

    ∃ μ ∈ R, λ ∈ R_+ :  ( A + λB   −b ; −b^T   c − μ − λU^2 ) ⪰ 0.    (25)

Since a necessary condition for the validity of (25) is that there exists a λ ≥ 0 such that A + λB ⪰ 0, the second statement of the lemma is proven. Moreover, if there exists a λ_0 ≥ 0 such that A + λ_0 B ≻ 0, then taking μ_0 < c − λ_0 U^2 − b^T (A + λ_0 B)^{−1} b we have by Schur’s complement (Lemma A.2) that the linear matrix inequality (LMI) (25) is satisfied for λ = λ_0 and μ = μ_0, and therefore β* > −∞ and the first statement of the lemma is proven.
Notice that the only case not covered by Lemma 4.2 is the case where there is a λ ≥ 0 such that A + λB ⪰ 0 but there does not exist a λ ≥ 0 such that A + λB ≻ 0. Later, we will see that we can ignore this case.
In the next result we find equivalent conditions for the finiteness of the minimization problem (23) that can be easily checked and analyzed.
Lemma 4.3. Let A ∈ R^{n×n} be a symmetric matrix and B ∈ R^{n×n} a positive semidefinite matrix of rank r. Denote by F the n × (n − r) matrix whose columns are a basis for the null space of B. Then the following two statements are equivalent:
1. There exists λ ≥ 0 such that A + λB ≻ 0.
2. F^T A F ≻ 0.

Proof. First, since B ⪰ 0, statement 1 is equivalent to the same statement without the sign constraint on λ:

    ∃ λ ∈ R :  A + λB ≻ 0.

By Finsler’s theorem (see Theorem A.1 in the appendix), this condition is equivalent to the following statement:

    x^T A x > 0 for every x ≠ 0 such that x^T B x = 0.    (26)

Now, since B ⪰ 0, we have that x^T B x = 0 is equivalent to x ∈ Null(B). Thus, (26) is equivalent to

    x^T A x > 0 for every x ≠ 0 such that x ∈ Null(B),
which is equivalent to saying that F^T A F ≻ 0.

A direct consequence of Lemmas 4.3 and 4.2 is that if

    F^T A F ≻ 0,    (27)

then β* > −∞, and if F^T A F is not positive semidefinite (i.e., has at least one negative eigenvalue), then β* = −∞. In the case where condition (27) is satisfied we can simultaneously diagonalize A and B (see Appendix B), and therefore we can continue with the hidden convexity argument.
Let C be a nonsingular matrix that simultaneously diagonalizes A and B:

    C^T B C = diag(1, . . . , 1, 0, . . . , 0)   (r ones followed by n − r zeros),
    C^T A C = diag(λ_1, λ_2, . . . , λ_r, 1, . . . , 1)   (n − r trailing ones),

where λ_1 ≥ λ_2 ≥ · · · ≥ λ_r (see Appendix B for details). Making the change of variables x = Cz, we obtain that (23) is equivalent to

    min { Σ_{j=1}^r λ_j z_j^2 + Σ_{j=r+1}^n z_j^2 − 2 Σ_{j=1}^n f_j z_j + c : Σ_{j=1}^r z_j^2 ≤ U^2 },    (28)
where f = C^T b. The same argument as in Lemma 4.1 shows that we can make the change of variables

    z_j = sign(f_j) √v_j,   j = 1, 2, . . . , n,

where v_j ≥ 0. We obtain the following equivalent convex optimization problem:

    min_{v_j ≥ 0} { Σ_{j=1}^r (λ_j v_j − 2 |f_j| √v_j) + Σ_{j=r+1}^n (v_j − 2 |f_j| √v_j) + c : Σ_{j=1}^r v_j ≤ U^2 }.    (29)
To develop the dual problem of (29), we assign a nonpositive multiplier λ to the linear inequality constraint −Σ_{j=1}^r v_j + U^2 ≥ 0 and form the Lagrangian of (29) given by

    L(v, λ) = Σ_{j=1}^r (λ_j v_j − 2 |f_j| √v_j) + Σ_{j=r+1}^n (v_j − 2 |f_j| √v_j) − λ (Σ_{j=1}^r v_j − U^2) + c
            = Σ_{j=1}^r ((λ_j − λ) v_j − 2 |f_j| √v_j) + Σ_{j=r+1}^n (v_j − 2 |f_j| √v_j) + λ U^2 + c.    (30)
Differentiating (30) with respect to v_j and equating to zero, we obtain

    v_j = f_j^2/(λ_j − λ)^2,   j = 1, 2, . . . , r,
    v_j = f_j^2,   j = r + 1, . . . , n,
subject to the conditions λ ≤ 0 and λ < λ_r. Thus, the dual objective function is given by

    h(λ) = inf_{v_j ≥ 0} L(v, λ) = −Σ_{j=1}^r f_j^2/(λ_j − λ) + λ U^2 + d   if λ ≤ 0 and λ < λ_r,   and −∞ otherwise,

where d = c − Σ_{j=r+1}^n f_j^2. The dual problem of (29) is therefore

    (D) :  max_{λ ≤ min{λ_r, 0}} h(λ).
From duality theory for convex optimization problems we have that [19, 3]

    val(P) = val(D).

The solution of (D) involves the solution of a single secular equation of the form (20). We summarize the above discussion in Theorem 4.2.
Theorem 4.2. Let A ∈ R^{n×n} be a symmetric matrix, B ∈ R^{n×n} a positive semidefinite matrix of rank r, b ∈ R^n, and c ∈ R. Moreover, suppose that F^T A F ≻ 0, where F is an n × (n − r) matrix whose columns are an orthogonal basis for the null space of B. Let C be a nonsingular matrix for which the following is satisfied:

    C^T B C = ( I_r   0 ; 0   0 ),    C^T A C = ( Λ   0 ; 0   I_{n−r} ),

where Λ = diag(λ_1, λ_2, . . . , λ_r) and λ_1 ≥ λ_2 ≥ · · · ≥ λ_r. Then the solution to the optimization problem

    min_{x^T B x ≤ U^2} {x^T A x − 2 b^T x + c}

is x = Cz, where z ∈ R^n is given by

    z_j = f_j/(λ_j − λ),   j = 1, 2, . . . , r,    z_j = f_j,   j = r + 1, . . . , n    (f = C^T b),

and λ is the solution to the maximization problem

    max_{λ ≤ min{λ_r, 0}} { −Σ_{j=1}^r f_j^2/(λ_j − λ) + λ U^2 },

whose solution consists of at most one root finding of a single-variable secular equation of the form (20).
We will impose an additional assumption on the quadratic function f_2(x).

Assumption 2. f_2 is a strongly convex function (i.e., A_2 ≻ 0).

Note that Assumption 2 is readily satisfied by the RTLS problem (6). Recall that in the schematic algorithm A = A_1 − α A_2, so (27) is equivalent to

    F^T A_1 F − α F^T A_2 F ≻ 0.    (31)

F has full column rank and, by Assumption 2, A_2 is positive definite; as a consequence F^T A_2 F is also positive definite. Multiplying (31) from the right and left by Q = (F^T A_2 F)^{−1/2}, we obtain the following equivalent LMI:

    Q (F^T A_1 F) Q − α I ≻ 0.
The last LMI is equivalent to α < λ_min(Q (F^T A_1 F) Q). We summarize this in the following proposition.

Proposition 4.2. Let ᾱ = λ_min(Q (F^T A_1 F) Q), where Q = (F^T A_2 F)^{−1/2}. Then the minimum of (23) is finite if α < ᾱ and equal to −∞ if α > ᾱ.

ᾱ is of course an upper bound for the minimal value of the original problem (1), and thus, in the schematic algorithm, we will always take an upper bound M that is at most ᾱ. We therefore conclude that throughout the schematic algorithm, we need to consider only subproblems with a finite minimum satisfying (27).

A similar argument to the one given in the case F = F_1 shows that in the case F = F_2 as well, the algorithm produces an ε-global optimal solution in a computational effort of O(n^3 log ε^{−1}).
4.3. Finding the bounds. In this section we present some suggestions for the lower and upper bounds m and M of the schematic algorithm. In the special case of the original RTLS problem, simpler bounds are derived in section 5.
4.3.1. The case F = F_1. In this case the constraint is given by L^2 ≤ x^T T x ≤ U^2. From this it follows that ‖x‖^2 ≤ U^2/λ_min(T). We can therefore bound the objective function of problem (1) as follows:

    |f_1(x)/f_2(x)| = |x^T A_1 x − 2 b_1^T x + c_1| / |x^T A_2 x − 2 b_2^T x + c_2|
                    ≤ (1/N) |x^T A_1 x − 2 b_1^T x + c_1|
                    ≤ (1/N) (|x^T A_1 x| + |2 b_1^T x| + |c_1|)
                    ≤ (1/N) (U^2 λ_max(A_1)/λ_min(T) + 2 ‖b_1‖ U/√λ_min(T) + |c_1|).

Thus, we can choose m and M to be

    M = (1/N) (U^2 λ_max(A_1)/λ_min(T) + 2 ‖b_1‖ U/√λ_min(T) + |c_1|),   m = −M.
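Given N, the bound is one eigenvalue computation. A hedged NumPy sketch on random data (our own variable names; we use max |λ(A_1)| in place of λ_max(A_1) so that the same constant also certifies the lower bound m = −M when A_1 is indefinite, and we take the RTLS denominator with N = 1):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
A1 = rng.standard_normal((n, n)); A1 = (A1 + A1.T) / 2   # symmetric, indefinite A_1
b1, c1 = rng.standard_normal(n), rng.standard_normal()
T = np.diag(rng.uniform(0.5, 2.0, n))                    # positive definite T
U, N = 3.0, 1.0                                          # N = 1 (RTLS case)

lmin_T = float(np.min(np.diag(T)))
# max |eigenvalue| of A_1 bounds |x^T A_1 x| / ||x||^2 from above:
M = (U ** 2 * float(np.max(np.abs(np.linalg.eigvalsh(A1)))) / lmin_T
     + 2.0 * float(np.linalg.norm(b1)) * U / np.sqrt(lmin_T) + abs(c1)) / N
m = -M

# Spot check on random feasible points x (x^T T x <= U^2), with the RTLS
# denominator f_2(x) = ||x||^2 + 1 >= N:
y = rng.standard_normal((2000, n))
y *= (U * rng.uniform(0, 1, (2000, 1)) ** (1.0 / n)) / np.linalg.norm(y, axis=1, keepdims=True)
X = y @ np.diag(np.diag(T) ** -0.5)                      # x = T^{-1/2} y
f1_vals = np.einsum('ij,jk,ik->i', X, A1, X) - 2.0 * X @ b1 + c1
ratios = f1_vals / (np.sum(X ** 2, axis=1) + 1.0)
```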
The only element in the definition of m and M which is not given explicitly is the positive number N, defined in Assumption 1. For the RTLS problem, where f_2(x) = ‖x‖^2 + 1, we can take N equal to 1. Also, for other problems we can define N to be the optimal value of the minimization problem min_{L ≤ ‖x‖_T ≤ U} {x^T A_2 x − 2 b_2^T x + c_2}.

4.3.2. The case F = F_2. In this case the constraint is given by x^T B x ≤ U^2, where B is a positive semidefinite matrix. We consider the case where both Assumptions 1 and 2 hold true and f_2 is bounded below on R^n. The upper bound can be taken as M = ᾱ, where ᾱ is given in Proposition 4.2. To find a lower bound m, we first make the change of variables z = x − A_2^{−1} b_2, resulting in the following form of the objective function:

    (z^T A_1 z − 2 e^T z + f)/(z^T A_2 z + d),    (32)

where d = c_2 − b_2^T A_2^{−1} b_2 > 0, e = b_1 − A_1 A_2^{−1} b_2, and f = c_1 + b_2^T A_2^{−1} A_1 A_2^{−1} b_2 − 2 b_1^T A_2^{−1} b_2.

The unconstrained minimum of the last expression (32) is a lower bound on the optimal value, and we can bound it from below using a relaxation technique:
min_z (z^T A1 z − 2 e^T z + f) / (z^T A2 z + d)
    [substituting w = A2^{1/2} z]
  = min_w (w^T A2^{-1/2} A1 A2^{-1/2} w − 2 e^T A2^{-1/2} w + f) / (‖w‖^2 + d)
    [writing t = √d]
  = min_{w, t=√d} (w^T A2^{-1/2} A1 A2^{-1/2} w − (2/√d) e^T A2^{-1/2} w t + (f/d) t^2) / (‖w‖^2 + t^2)
  ≥ min_{w, t} (w^T A2^{-1/2} A1 A2^{-1/2} w − (2/√d) e^T A2^{-1/2} w t + (f/d) t^2) / (‖w‖^2 + t^2)
  = λ_min ( [ A2^{-1/2} A1 A2^{-1/2}    −(1/√d) A2^{-1/2} e
              −(1/√d) e^T A2^{-1/2}      f/d ] ).

Thus, we can take

m = λ_min ( [ A2^{-1/2} A1 A2^{-1/2}    −(1/√d) A2^{-1/2} e
              −(1/√d) e^T A2^{-1/2}      f/d ] ).
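Numerically, this lower bound is the smallest eigenvalue of an (n+1) × (n+1) bordered matrix. A hedged NumPy sketch (function name ours; assumes A2 is positive definite and d > 0; the sign of the off-diagonal block does not affect the smallest eigenvalue):

```python
import numpy as np

def lower_bound_F2(A1, A2, b1, b2, c1, c2):
    """m = lmin of the bordered matrix built from A2^{-1/2} A1 A2^{-1/2},
    A2^{-1/2} e / sqrt(d), and f/d, with d, e, f as in (32).
    Assumes A2 positive definite and d = c2 - b2^T A2^{-1} b2 > 0."""
    w, V = np.linalg.eigh(A2)
    A2_is = V @ np.diag(1.0 / np.sqrt(w)) @ V.T      # A2^{-1/2}
    A2_inv = V @ np.diag(1.0 / w) @ V.T              # A2^{-1}
    d = c2 - b2 @ A2_inv @ b2
    e = b1 - A1 @ A2_inv @ b2
    f = c1 + b2 @ A2_inv @ A1 @ A2_inv @ b2 - 2.0 * b1 @ A2_inv @ b2
    col = (-A2_is @ e / np.sqrt(d))[:, None]
    P = np.block([[A2_is @ A1 @ A2_is, col],
                  [col.T, np.array([[f / d]])]])
    return np.linalg.eigvalsh(P)[0]
```

By construction, the returned value is a lower bound on (z^T A1 z − 2 e^T z + f)/(z^T A2 z + d) for every z.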
5. A detailed algorithm for the RTLS problem. In this section we use the results obtained so far to write in full detail the schematic algorithm of section 3 as applied to the RTLS problem:

min_x { f(x) ≡ ‖Ax − b‖^2 / (‖x‖^2 + 1) : ‖Lx‖ ≤ U }.   (33)

We call this algorithm RTLSC. The RTLSC algorithm solves at each iteration a subproblem of the form

min { x^T Q x − 2 d^T x : ‖Lx‖ ≤ U }.   (34)
The detailed algorithm SUBP for solving the latter problem is explicitly written below. It invokes three procedures:
• SDG—an algorithm for simultaneous diagonalization of two matrices, one of which is positive semidefinite (see Appendix B).
• SDGP—an algorithm for simultaneous diagonalization of two matrices, one of which is positive definite (see Appendix B).
• SEC—Melman's algorithm for solving secular equations, given in section 4.

Algorithm SUBP.
Input: (Q, d, L, U, F), where Q ∈ R^{n×n} is a symmetric matrix, d ∈ R^n, L ∈ R^{r×n} (r ≤ n) is a full rank matrix, U > 0, and F ∈ R^{n×(n−r)} is a matrix whose columns are an orthogonal basis for the null space of L.
Output: (x*, μ). x* is an optimal solution to problem (34) and μ is the corresponding optimal value.
1. If r < n, then call algorithm SDG with input (Q, L^T L, F) and obtain an output (C, Λ). Else call algorithm SDGP with input (Q, L^T L) and obtain an output (C, Λ).
2. Set f = C^T d.
3. If λ_r > 0 and Σ_{j=1}^r f_j^2/λ_j^2 < U^2, then set λ* = 0. Else call algorithm SEC with
5. Set x* = Cv and μ = (x*)^T Q x* − 2 d^T x*.

Algorithm RTLSC.
Input: (A, b, L, U, ub, ε), where A ∈ R^{m×n} (m ≥ n), b ∈ R^m, L ∈ R^{r×n} (r < n) has full row rank, U > 0, ub > 0 is an upper bound on the optimal function value, and ε > 0 is a tolerance parameter.
Output: x*—an ε-optimal solution of problem (33).
1. Set k = 0, lb_0 = 0, ub_0 = ub.
2. Calculate a matrix F ∈ R^{n×(n−r)} whose columns are an orthogonal basis for the null space of L.
3. While ub_k − lb_k > ε, do
(a) α_k = (lb_k + ub_k)/2.
(b) Call algorithm SUBP with input (A^T A − α_k I, A^T b, L, U, F) and obtain an output (x_k, β_k).
(c) Calculate f_k = f(x_k).
(d) If β_k + ‖b‖^2 − α_k > 0, then

lb_{k+1} = α_k,  ub_{k+1} = min{ub_k, f_k},   (35)

else

lb_{k+1} = lb_k,  ub_{k+1} = min{α_k, f_k}.   (36)

(e) Set k ← k + 1.
End.
4. Define x* = x_m, where m is chosen so that f_m = min{f_0, f_1, . . . , f_{k−1}}.
Choice of lower and upper bounds. In the case where L is square and nonsingular, the upper bound can be chosen as ub = f(x), where x is any feasible point (such as 0). A tighter upper bound can be obtained by choosing x as a solution of another method such as regularized least squares. In the rank deficient case, Proposition 4.2 implies that λ_min(F^T A F) is an upper bound on the optimal function value. Hence, an initial upper bound is given by min{λ_min(F^T A F), f(x)}. This choice guarantees that all subproblems have a finite value.
Remark 5.1. Note that the update equations (35) and (36) for the upper bound ub_k are different from the naive implementation suggested in the schematic algorithm of section 3. The idea behind the revised update formulas is to incorporate the information gained at previous iterations in order to find better upper bounds. At each iteration we calculate a new feasible point x_k, which induces a new upper bound f_k ≡ f(x_k) on the optimal function value. Thus, the update equation ub_{k+1} = ub_k in the original schematic algorithm is converted to ub_{k+1} = min{ub_k, f_k}. The following example demonstrates the advantage of using the new update equations.
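For illustration, the outer bisection with the improved updates (35)–(36) can be sketched compactly for the special case L = I, so that the constraint is a Euclidean ball. This is our own simplified rendering, not the paper's implementation: plain bisection on the secular equation stands in for Melman's SEC solver, and the "hard case" of the ball-constrained subproblem is ignored.

```python
import numpy as np

def subp_ball(Q, d, U):
    """min x^T Q x - 2 d^T x  s.t. ||x|| <= U  (problem (34) with L = I).
    Eigendecomposition plus bisection on the secular equation; the 'hard
    case' (d orthogonal to the bottom eigenspace of Q) is not handled."""
    lam, V = np.linalg.eigh(Q)
    f = V.T @ d
    phi = lambda mu: np.sum((f / (lam + mu)) ** 2)   # ||x(mu)||^2, decreasing in mu
    if lam[0] > 0 and phi(0.0) <= U ** 2:
        mu = 0.0                  # unconstrained minimizer is already feasible
    else:
        lo = max(0.0, -lam[0]) + 1e-12
        hi = max(lo + 1.0, 1.0)
        while phi(hi) > U ** 2:   # expand until the constraint is slack
            hi *= 2.0
        for _ in range(200):      # bisection on phi(mu) = U^2
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if phi(mid) > U ** 2 else (lo, mid)
        mu = 0.5 * (lo + hi)
    x = V @ (f / (lam + mu))
    return x, x @ Q @ x - 2.0 * d @ x

def rtlsc(A, b, U, ub, eps=1e-8):
    """Outer bisection on alpha, with the upper-bound updates (35)-(36)."""
    n = A.shape[1]
    AtA, Atb, bb = A.T @ A, A.T @ b, b @ b
    fval = lambda x: np.sum((A @ x - b) ** 2) / (x @ x + 1.0)
    lb, best_x, best_f = 0.0, np.zeros(n), ub
    while ub - lb > eps:
        alpha = 0.5 * (lb + ub)
        x, beta = subp_ball(AtA - alpha * np.eye(n), Atb, U)
        fx = fval(x)
        if fx < best_f:
            best_x, best_f = x, fx
        if beta + bb - alpha > 0:   # f > alpha on the feasible set: update (35)
            lb, ub = alpha, min(ub, fx)
        else:                       # optimum <= alpha: update (36)
            ub = min(alpha, fx)
    return best_x, best_f

x_star, f_star = rtlsc(np.array([[1.0, 2.0], [3.0, 4.0]]),
                       np.array([10.0, 25.0]), U=1.0, ub=725.0)
```

Note how the feasible point x_k produced in step (b) tightens ub in both branches, exactly as the remark describes.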
Example. In this section we illustrate a single run of the RTLSC algorithm. We consider problem (33) with

n = 2,   A = [ 1  2 ; 3  4 ],   b = [ 10 ; 25 ],   L = [ 1  0 ; 0  2 ],   ρ = 10.
Table 2 describes the first six iterations of algorithm RTLSC. The initial upper bound ub_0 was chosen to be f(0) = ‖b‖^2 = 725. Note that the decrease in the upper bound is very drastic at the first few iterations. The size of the interval [lb_k, ub_k] decreases by a factor of 3000 between iteration 0 and iteration 1 (instead of a factor of 2 with the old update equations). The minimum value is equal to 0.047501 and is reached after only three iterations. This run is typical in the sense that usually the algorithm converges to a point after very few iterations.
6. Numerical examples. In order to test the performance of algorithm RTLSC, two problems from the “Regularization Tools” [13] are employed: a problem that arises from the discretization of the inverse Laplace transform and an image deblurring problem. The following algorithms are tested:
• RLS—Regularized Least Squares. This is the solution to the problem

min{‖Ax − b‖^2 : ‖Lx‖ ≤ ρ},

implemented in the function lsqi from [13].
• TTLS—Truncated Total Least Squares, originating from [5] and implemented in the function ttls from [13].
• RTLSC—Our algorithm from section 5.
• QEP—Sima, Van Huffel, and Golub's solver for RTLS [20].
• GR—Guo and Renaut's eigenvalue method for RTLS [11] with the RLS solution as a starting vector.
6.1. Inverse Laplace transform. We consider the problem of estimating the function f(t) from its given Laplace transform [21]:

∫_0^∞ e^{−st} f(t) dt = 2/(s + 1/2)^3.

By means of Gauss–Laguerre quadrature, the problem reduces to a linear system Ax = b. This system and its solution x_R are implemented in the function ilaplace(n,3) from [13]. The perturbed right-hand side is generated by

b = (A + σE)x_R + σe,   (37)

where each component of E and e is generated from a standard normal distribution and σ runs through the values 1e-1, 1e-2, and 1e-4. The matrix L approximates the first-derivative operator and is implemented in the function get_l(n,1) from [13]. Two cases are tested: m = n = 20 and m = n = 100. Table 3 describes the relative error ‖x − x_R‖/‖x_R‖ averaged over 300 random realizations of E and e.
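The perturbation model (37) is straightforward to reproduce; a sketch (ilaplace itself is a MATLAB routine from [13], so here we simply take A and x_R as given arrays):

```python
import numpy as np

def perturbed_rhs(A, x_r, sigma, rng):
    """Right-hand side b = (A + sigma*E) x_R + sigma*e as in (37);
    E and e have i.i.d. standard normal entries."""
    m, n = A.shape
    E = rng.standard_normal((m, n))
    e = rng.standard_normal(m)
    return (A + sigma * E) @ x_r + sigma * e

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))   # stand-in for the ilaplace(20,3) system
x_r = rng.standard_normal(20)
b = perturbed_rhs(A, x_r, 1e-4, rng)
```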
The best results in each row are emphasized in boldface. The RTLSC and QEP methods give the best results in all but one case. The RLS method also performed quite well. Note that the average relative errors for the RTLSC and QEP solvers are equal. It is interesting to note that not only the averages were the same but in fact for all 1800 simulations of QEP and RTLSC, the results were the same. Incidentally, this provides experimental evidence for the claim that QEP finds the global minimum, although such a theoretical claim was not proved in [20].
The CPU time in seconds of the three RTLS solvers averaged over 20 realizations of E and e is given in Table 4 (σ was fixed to be 1e-4). To make a fair comparison, we employed the same stopping rule for each of the methods: ‖x_{k+1} − x_k‖/‖x_k‖ < 10^{-3}.
It is clear from Table 4 that RTLSC and QEP are significantly faster than GR. Moreover, RTLSC and QEP require more or less the same running time.
6.2. Image deblurring. We consider the problem of estimating a 32 × 32 two-dimensional image obtained from the sum of three harmonic oscillations:

x(z_1, z_2) = Σ_{l=1}^3 a_l cos(w_{l,1} z_1 + w_{l,2} z_2 + φ_l),   (w_{l,i} = 2π k_{l,i}/n),   1 ≤ z_1, z_2 ≤ 32,

where k_{l,i} ∈ Z (see Figure 1(A)). The specific values of the parameters are given in Table 5.

The image is blurred by atmospheric turbulence blur originating from [12] and implemented in the function blur(n,3,1) from [13]. The blurred image is generated by the relation (37) with σ = 0.1, which results in a highly noisy image (see Figure 1(B)).

Choice of regularization matrix. We first ran algorithm RLS with standard regularization (L = I). The result is the poor image given in Figure 1(C). We then chose L as a discrete approximation of the Laplace operator [16], which is a two-dimensional convolution with the following mask:

[ −1 −1 −1
  −1  8 −1
  −1 −1 −1 ].
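Applying this mask amounts to a 2-D convolution. A small self-contained sketch (zero padding is our choice; the paper does not specify boundary handling, and since the mask is symmetric, correlation and convolution coincide):

```python
import numpy as np

# 3x3 discrete Laplacian mask from the text.
MASK = np.array([[-1.0, -1.0, -1.0],
                 [-1.0,  8.0, -1.0],
                 [-1.0, -1.0, -1.0]])

def laplacian_filter(img):
    """'Same'-size 2-D convolution of img with MASK, zero-padded borders."""
    p = np.pad(img, 1)
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(p[i:i + 3, j:j + 3] * MASK)
    return out
```

Since the mask entries sum to zero, the filter annihilates constant regions, which is what makes it a useful roughness penalty here.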
The above results demonstrate the importance of the choice of the regularization matrix. In the following experiments we use the nonstandard L. The result for algorithm TTLS is given in Figure 1(E). A much improved result is obtained by our algorithm RTLSC (Figure 1(F)). Here again algorithm QEP gave the same result as algorithm RTLSC. Also, algorithm GR gave in this example the same image as RLS.
It is interesting to note that in this and many other examples algorithm RTLSC required only three iterations in order to produce quality reconstructions. As an illustration, Figure 2 shows the result of the first three iterations of algorithm RTLSC. The function values of the images generated in iterations 1, 2, and 3 are 2.0934, 1.5715, and 1.5566, respectively. The difference between the first and second iterations is substantial. However, the image produced at the second iteration is almost identical to the image produced at the third iteration. Further iterations of RTLSC do not improve the image, although the function value decreases to the minimal value 1.5234.

(A) True Image  (B) Observation  (C) RLS with L = I  (D) RLS with Laplace operator  (E) TTLS  (F) RTLSC

Fig. 1. Results for different regularization solvers.
Fig. 2. First three iterations of algorithm RTLSC.
Appendix A. Known results.

Lemma A.1 (S-lemma [1]). Let A and B be n × n symmetric matrices, e, f ∈ R^n, and g, h ∈ R. Assume that the quadratic inequality

x^T A x + 2 e^T x + g ≥ 0   (38)

is strictly feasible; i.e., there exists x such that x^T A x + 2 e^T x + g > 0. Then the quadratic inequality

x^T B x + 2 f^T x + h ≥ 0   (39)

is a consequence of (38) if and only if there exists a nonnegative λ such that

[ B  f ; f^T  h ] ⪰ λ [ A  e ; e^T  g ].
Lemma A.2 (Schur's complement [1]). Let

M = [ A  B^T ; B  C ]

be a symmetric matrix with C ≻ 0. Then M ⪰ 0 if and only if Δ_C ⪰ 0, where Δ_C is the Schur complement of C in M and is given by

Δ_C = A − B^T C^{-1} B.
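A quick numerical sanity check of Lemma A.2 (helper name ours; psd-ness is tested via the smallest eigenvalue):

```python
import numpy as np

def schur_equivalence_holds(A, B, C, tol=1e-9):
    """Check Lemma A.2 numerically: with C positive definite,
    M = [[A, B^T], [B, C]] >= 0  iff  Delta_C = A - B^T C^{-1} B >= 0."""
    M = np.block([[A, B.T], [B, C]])
    delta = A - B.T @ np.linalg.solve(C, B)
    m_psd = np.linalg.eigvalsh(M)[0] >= -tol
    d_psd = np.linalg.eigvalsh(delta)[0] >= -tol
    return m_psd == d_psd

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
M0 = G.T @ G                       # PSD by construction, C-block pd
ok = schur_equivalence_holds(M0[:2, :2], M0[2:, :2], M0[2:, 2:])
```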
Theorem A.1 (Finsler's theorem [6]). Let A and B be symmetric n × n matrices. Then the quadratic inequality

x^T B x > 0

is a consequence of the quadratic equality

x^T A x = 0

if and only if there exists α ∈ R such that B − αA ≻ 0.
Appendix B. Algorithms for simultaneous diagonalization. In this section we derive an algorithm for the simultaneous diagonalization of an n × n symmetric matrix A and a positive semidefinite matrix B ∈ R^{n×n} of rank r (< n). We denote by F the n × (n − r) matrix whose columns are an orthogonal basis for the null space of B and assume that the condition

F^T A F ≻ 0   (40)

is satisfied, which implies that the matrices A and B are simultaneously diagonalizable by a nonsingular matrix. This fact follows directly from [18, Theorem 6.2.2]. Here we explicitly derive the algorithm that follows from [18] for the special case where (40) is satisfied.
Algorithm SDG.
Input: (A, B, F), where A ∈ R^{n×n} is a symmetric matrix, B ∈ R^{n×n} is positive semidefinite of rank r (r < n), and F ∈ R^{n×(n−r)} is a matrix whose columns are an orthogonal basis for the null space of B.
Condition: F^T A F ≻ 0.
Output: (C, Λ). C is a nonsingular matrix and Λ = diag(λ_1, λ_2, . . . , λ_r) (λ_1 ≥ λ_2 ≥ · · · ≥ λ_r) is a diagonal matrix such that

C^T B C = [ I_r  0 ; 0  0 ],   C^T A C = [ Λ  0 ; 0  I_{n−r} ].
1. Find a full row rank r × n matrix L such that^1 B = L^T L.
2. Define M = L^T (L L^T)^{-1} (M is a right inverse of L). We have M^T B M = I_r.
3. Define S = ( M − F(F^T A F)^{-1} F^T A M,  F ). We have

S^T B S = [ I_r  0 ; 0  0 ],   S^T A S = [ E  0 ; 0  F^T A F ],

where E is an r × r symmetric matrix.
4. Find an r × r orthogonal matrix Q_1 such that Q_1^T E Q_1 = Λ.
5. Find an (n − r) × (n − r) matrix Q_2 such that Q_2^T (F^T A F) Q_2 = I_{n−r} (this is possible since we assume that F^T A F ≻ 0).
6. Define

C = S [ Q_1  0 ; 0  Q_2 ].
In the case where one of the matrices is positive definite, simultaneous diagonalization is always possible without any restrictions [14]. The procedure for simultaneous diagonalization in that case is much simpler and is given below.

Algorithm SDGP.
Input: (A, B), where A ∈ R^{n×n} is a symmetric matrix and B ∈ R^{n×n} is a positive definite matrix.
Output: (C, Λ). C is a nonsingular matrix and Λ = diag(λ_1, λ_2, . . . , λ_n) (λ_1 ≥ λ_2 ≥ · · · ≥ λ_n) is a diagonal matrix such that

C^T B C = I,   C^T A C = Λ.
^1 This step can be done by, e.g., Cholesky's factorization. In some applications B is already given in that form.
1. Find a nonsingular matrix L such that B = L^T L.
2. Calculate the spectral decomposition of (L^T)^{-1} A L^{-1}:

U^T ((L^T)^{-1} A L^{-1}) U = D,

where U is an orthogonal matrix, D = diag(λ_1, λ_2, . . . , λ_n), and λ_1 ≥ λ_2 ≥ · · · ≥ λ_n.
3. Set C = L^{-1} U, Λ = D.
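A NumPy sketch of SDGP (ours, not the paper's code), with Cholesky playing the role of the factorization B = L^T L:

```python
import numpy as np

def sdgp(A, B):
    """Algorithm SDGP sketch: B positive definite.  Uses Cholesky
    B = Lo Lo^T (so L = Lo^T satisfies B = L^T L), then the spectral
    decomposition of (L^T)^{-1} A L^{-1} = Lo^{-1} A Lo^{-T}.
    Returns C, lam with C^T B C = I and C^T A C = diag(lam), descending."""
    Lo = np.linalg.cholesky(B)
    Mid = np.linalg.solve(Lo, np.linalg.solve(Lo, A).T).T   # Lo^{-1} A Lo^{-T}
    lam, Uv = np.linalg.eigh(Mid)
    lam, Uv = lam[::-1], Uv[:, ::-1]   # descending, as in the output spec
    C = np.linalg.solve(Lo.T, Uv)      # C = L^{-1} U = Lo^{-T} U
    return C, lam
```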
Acknowledgments. We thank the associate editor and two anonymous referees for their constructive comments.
REFERENCES

[1] A. Ben-Tal and A. Nemirovski, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, MPS-SIAM Ser. Optim. 2, SIAM, Philadelphia, 2001.
[2] A. Ben-Tal and M. Teboulle, Hidden convexity in some nonconvex quadratically constrained quadratic programming, Math. Programming, 72 (1996), pp. 51–63.
[3] D. P. Bertsekas, Nonlinear Programming, 2nd ed., Athena Scientific, Belmont, MA, 1999.
[4] W. Dinkelbach, On nonlinear fractional programming, Management Sci., 13 (1967), pp. 492–498.
[5] R. D. Fierro, G. H. Golub, P. C. Hansen, and D. P. O'Leary, Regularization by truncated total least squares, SIAM J. Sci. Comput., 18 (1997), pp. 1223–1241.
[6] P. Finsler, Über das Vorkommen definiter und semi-definiter Formen in Scharen quadratischer Formen, Comment. Math. Helv., 9 (1937), pp. 188–192.
[7] W. Gander, G. H. Golub, and U. von Matt, A constrained eigenvalue problem, Linear Algebra Appl., 114/115 (1989), pp. 815–839.
[8] G. H. Golub, P. C. Hansen, and D. P. O'Leary, Tikhonov regularization and total least squares, SIAM J. Matrix Anal. Appl., 21 (1999), pp. 185–194.
[9] G. H. Golub and C. F. Van Loan, An analysis of the total least squares problem, SIAM J. Numer. Anal., 17 (1980), pp. 883–893.
[10] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed., The Johns Hopkins University Press, Baltimore, MD, 1996.
[11] H. Guo and R. Renaut, A regularized total least squares algorithm, in Total Least Squares and Errors-in-Variables Modeling, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2002, pp. 57–66.
[12] M. Hanke and P. C. Hansen, Regularization methods for large-scale problems, Surveys Math. Indust., 3 (1993), pp. 253–315.
[13] P. C. Hansen, Regularization tools, a Matlab package for analysis of discrete regularization problems, Numer. Algorithms, 6 (1994), pp. 1–35.
[14] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, UK, 1985.
[15] S. Van Huffel and J. Vandewalle, The Total Least Squares Problem: Computational Aspects and Analysis, Frontiers Appl. Math. 9, SIAM, Philadelphia, 1991.
[16] A. K. Jain, Fundamentals of Digital Image Processing, Prentice–Hall, Englewood Cliffs, NJ, 1989.
[17] A. Melman, A unifying convergence analysis of second-order methods for secular equations, Math. Comp., 66 (1997), pp. 333–344.
[18] C. R. Rao and S. K. Mitra, Generalized Inverse of Matrices and Its Applications, John Wiley and Sons, New York, 1971.
[19] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, 1970.
[20] D. Sima, S. Van Huffel, and G. H. Golub, Regularized total least squares based on quadratic eigenvalue problem solvers, BIT, 44 (2004), pp. 793–812.
[21] J. M. Varah, Pitfalls in the numerical solution of linear ill-posed problems, SIAM J. Sci.