On the role of Linear Algebra in the development of Interior Point algorithms and software
Due Giorni di Algebra Lineare Numerica, Bologna, March 6-7, 2007
D. di Serafino, Second University of Naples, Italy, daniela.diserafi[email protected]
Joint work with S. Cafieri, École Polytechnique, Paris; M. D’Apuzzo, V. De Simone, Second University of Naples; G. Toraldo, University of Naples “Federico II”
PRQP - Potential Reduction software for Quadratic Programming problems
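Motivations
Primal problem:
$$\min\; q(x) = \frac{1}{2} x^T Q x + c^T x \quad \text{s.t.}\quad J_i x - s = b,\;\; J_e x = d,\;\; x + v = u,\;\; (x, s, v) \ge 0$$
Dual problem:
$$\max\; p(x, y, \lambda, t) = -\frac{1}{2} x^T Q x + b^T y + d^T \lambda - u^T t \quad \text{s.t.}\quad Qx - J_i^T y - J_e^T \lambda - z + t = -c,\;\; (x, y, z, t) \ge 0$$
$Q \in R^{n \times n}$ spsd, $J_i \in R^{m_1 \times n}$, $J_e \in R^{m_2 \times n}$, $m = m_1 + m_2$
x, y, λ, t primal and dual variables; s, v, z slack variables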
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
KKT system and IP methods
OPTIMIZATION -> LINEAR ALGEBRA
type of problem, formulation of the method -> different blocks in the KKT system
large-scale problems -> sparse direct or iterative solvers
local convexity of the problem -> KKT matrix inertia control
accuracy requirements -> inexact solution of the KKT system, adaptive stopping criteria
increasing ill conditioning as the iterates approach the solution -> preconditioning techniques
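Infeasible primal-dual PR framework
$$\min\; \Phi(x, y, s, z, v, t) = \rho \ln(x^T z + s^T y + v^T t) - \sum_{i=1}^{n} \ln(x_i z_i) - \sum_{j=1}^{m_1} \ln(s_j y_j) - \sum_{i=1}^{n} \ln(v_i t_i)$$
potential function (Tanabe, 1988; Todd & Ye, 1990): the argument of the first logarithm is the duality gap $\Delta$, the three sums are barrier functions
$$\text{s.t.}\quad \tilde w = (x, y, s, z, v, t) > 0, \qquad \frac{\sigma}{\sigma_0} \le \frac{\Delta}{\Delta_0}$$
reduce the infeasibility at least at the same rate as the duality gap (Kojima, Mizuno & Todd, 1995)
$\Delta_0$, $\sigma_0$ corresponding to the initial point $w_0$
$$\sigma = \|(r_{p1}, r_{p2}, r_{p3}, r_d)\|_2$$
primal infeasibility: $r_{p1} = J_i x - s - b$, $r_{p2} = x + v - u$, $r_{p3} = J_e x - d$
dual infeasibility: $r_d = Qx - J_i^T y - J_e^T \lambda - z + t + c$
$\sigma = 0$: feasible version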
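As an illustration, the potential function and the infeasibility measure can be evaluated with a few lines of NumPy. This is a minimal sketch, not PRQP code: the function names, the symbol `rho` for the weight of the duality-gap term, and the sign conventions of the residuals (inequalities written as J_i x - s = b, upper bounds as x + v = u) are assumptions made for the example.

```python
import numpy as np

def potential(x, y, s, z, v, t, rho):
    """rho * ln(duality gap) minus the logarithmic barrier terms."""
    gap = x @ z + s @ y + v @ t
    barrier = np.log(x * z).sum() + np.log(s * y).sum() + np.log(v * t).sum()
    return rho * np.log(gap) - barrier

def infeasibility(Q, c, Ji, Je, b, d, u, x, y, lam, s, z, v, t):
    """2-norm of the stacked primal and dual residuals (assumed sign conventions)."""
    r_p1 = Ji @ x - s - b
    r_p2 = x + v - u
    r_p3 = Je @ x - d
    r_d = Q @ x + c - Ji.T @ y - Je.T @ lam - z + t
    return np.linalg.norm(np.concatenate([r_p1, r_p2, r_p3, r_d]))

# Tiny example: n = 2 variables, one inequality, one equality, all iterates equal to 1.
n = 2
Q = np.eye(n)
Ji = np.array([[1.0, 2.0]])
Je = np.array([[1.0, -1.0]])
x, v, z, t = np.ones(n), np.ones(n), np.ones(n), np.ones(n)
s, y, lam = np.ones(1), np.ones(1), np.ones(1)

# Problem data built so that this point is primal and dual feasible.
b = Ji @ x - s
d = Je @ x
u = x + v
c = -Q @ x + Ji.T @ y + Je.T @ lam + z - t

phi = potential(x, y, s, z, v, t, rho=10.0)
sigma = infeasibility(Q, c, Ji, Je, b, d, u, x, y, lam, s, z, v, t)
print(phi, sigma)
```

With all complementarity products equal to 1 the barrier terms vanish, so the potential reduces to `rho` times the logarithm of the duality gap, and the infeasibility is zero by construction.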
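PR basic steps
1. Given the current interior iterate $w = (x, y, s, z, \lambda, v, t)$, apply a Newton step to the perturbed KKT conditions to obtain a direction $\delta w$; $\nabla G(w)$ is the Jacobian of the KKT conditions and $\Delta/\rho$ (duality gap over potential parameter) is the perturbation parameter
2. Update w: $w \leftarrow w + \theta\,\delta w$, with suitably chosen $\theta$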
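Newton system reduction: KKT system
$$\begin{pmatrix} H & J^T \\ J & -D \end{pmatrix} \begin{pmatrix} \delta x_1 \\ \delta x_2 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$$
$H = Q + E$, $E = X^{-1}Z + V^{-1}T$, E accounts for bound constraints
equality + bound constraints: $\begin{pmatrix} H & J_e^T \\ J_e & 0 \end{pmatrix}$
inequality + bound constraints: $\begin{pmatrix} H & J_i^T \\ J_i & -D_i \end{pmatrix}$, $D_i$ pd
ineq. + eq. + bound constraints: $\begin{pmatrix} H & J_i^T & J_e^T \\ J_i & -D_i & 0 \\ J_e & 0 & 0 \end{pmatrix}$, $D_i$ pd
bound constraints only (J = I): condensed system $H\,\delta x_1 = b_1 + V^{-1}T\,b_2$, $\delta x_2 = V^{-1}T(\delta x_1 - b_2)$, with $V^{-1}T$ diagonal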
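The relation between an augmented system with a positive definite (2,2) block and its condensed form can be checked numerically. The sketch below uses random dense data as a stand-in for H, J and D (all names and sizes are assumptions for the example) and eliminates the second block of unknowns from a system with matrix [[H, J^T], [J, -D]].

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 3
A = rng.standard_normal((n, n))
H = A @ A.T + np.eye(n)                  # spd stand-in for H = Q + E
J = rng.standard_normal((m, n))
D = np.diag(rng.uniform(0.5, 2.0, m))    # positive definite diagonal block
b1 = rng.standard_normal(n)
b2 = rng.standard_normal(m)

# Augmented (KKT) system solved as a whole.
K = np.block([[H, J.T], [J, -D]])
full = np.linalg.solve(K, np.concatenate([b1, b2]))

# Condensed system: the second block row gives dx2 = D^{-1} (J dx1 - b2),
# which is substituted into the first block row.
D_inv = np.diag(1.0 / np.diag(D))
dx1 = np.linalg.solve(H + J.T @ D_inv @ J, b1 + J.T @ D_inv @ b2)
dx2 = D_inv @ (J @ dx1 - b2)

print(np.allclose(full[:n], dx1), np.allclose(full[n:], dx2))
```

The condensed matrix is symmetric positive definite here, at the price of forming J^T D^{-1} J, which may be much denser than K for sparse problems.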
Inherent ill conditioning
approaching the solution, some entries of D and E may become very large, producing an increasing ill conditioning in the matrix
$$\begin{pmatrix} Q + E & J^T \\ J & -D \end{pmatrix}, \qquad D = \begin{pmatrix} Y^{-1}S & 0 \\ 0 & 0 \end{pmatrix}, \qquad E = X^{-1}Z + V^{-1}T$$
it is crucial to use preconditioning strategies
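The growth of the condition number can be reproduced on a toy problem. The sketch below is only an illustration under simplifying assumptions: it keeps the lower-bound part X^{-1}Z of E, sets D = 0, and mimics an interior point trajectory by letting x_i = mu, z_i = 1 on a hypothetical set of active bounds and x_i = 1, z_i = mu elsewhere.

```python
import numpy as np

def kkt_condition_number(mu, Q, J, active):
    """Condition number of [[Q + E, J^T], [J, 0]] with E = X^{-1} Z."""
    m = J.shape[0]
    x = np.where(active, mu, 1.0)   # active bounds: x_i -> 0
    z = np.where(active, 1.0, mu)   # inactive bounds: z_i -> 0
    E = np.diag(z / x)
    K = np.block([[Q + E, J.T], [J, np.zeros((m, m))]])
    return np.linalg.cond(K)

rng = np.random.default_rng(1)
n, m = 6, 2
Q = np.eye(n)
J = rng.standard_normal((m, n))
active = np.array([True, True, True, False, False, False])

mus = [1e-1, 1e-3, 1e-5, 1e-7]
conds = [kkt_condition_number(mu, Q, J, active) for mu in mus]
for mu, kappa in zip(mus, conds):
    print(f"mu = {mu:.0e}   cond(K) = {kappa:.2e}")
```

The entries of E on the active set behave like 1/mu, so the condition number grows roughly at that rate as mu decreases.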
KKT system solvers: direct vs iterative
Direct solvers: widely used in well-established IP software
Constraint Preconditioner (CP)
Indefinite preconditioner with the same structure as the KKT matrix:
$$K = \begin{pmatrix} H & J^T \\ J & -D \end{pmatrix}, \qquad P = \begin{pmatrix} M & J^T \\ J & -D \end{pmatrix}$$
M “simple” approximation to H such that $M + J_i^T D_i^{-1} J_i$ is spd on ker($J_e$). Common choice: M diagonal.
CP variants investigated by many researchers (Axelsson, 1979; Golub & Wathen, 1998; Luksan & Vlcek, 1998; Keller et al., 2000; Rozloznik & Simoncini, 2002; Durazzi & Ruggiero, 2003; Gondzio et al., 2004; Dollar et al., 2006-2007; di Serafino et al., 2007; Forsgren et al., 2007; …)
CP: spectral properties
$P^{-1}K$ has at least m unit eigenvalues
If rank(D) = p, then $P^{-1}K$ has additional m-p unit eigenvalues
The remaining eigenvalues are real positive (bounds are available)
(Keller et al., 2000; Durazzi & Ruggiero, 2003; Bergamaschi et al., 2004; Dollar, 2007)
When the iterate approaches the solution, if q entries of D get close to zero, then additional q eigenvalues tend to be clustered around 1
We expect that the preconditioner increases its effectiveness as the IP method progresses
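The clustering at 1 is easy to observe on a small dense example. The sketch below builds K and P from random data with M = diag(H) and looks at the two extreme cases D = 0 (p = 0) and D positive definite (p = m); the sizes, the random data and the tolerance used to count unit eigenvalues are assumptions for the example.

```python
import numpy as np

def cp_eigenvalues(H, J, D, M):
    """Eigenvalues of P^{-1} K with K = [[H, J^T], [J, -D]], P = [[M, J^T], [J, -D]]."""
    K = np.block([[H, J.T], [J, -D]])
    P = np.block([[M, J.T], [J, -D]])
    return np.linalg.eigvals(np.linalg.solve(P, K))

def count_unit(eigs, tol=1e-4):
    """Number of computed eigenvalues within tol of 1."""
    return int(np.sum(np.abs(eigs - 1.0) < tol))

rng = np.random.default_rng(0)
n, m = 8, 3
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)     # spd (1,1) block
J = rng.standard_normal((m, n))
M = np.diag(np.diag(H))         # diagonal approximation of H

eig_zero = cp_eigenvalues(H, J, np.zeros((m, m)), M)       # rank(D) = 0
eig_pd = cp_eigenvalues(H, J, np.diag([0.5, 1.0, 2.0]), M)  # rank(D) = m

print("D = 0 :", count_unit(eig_zero), "unit eigenvalues out of", n + m)
print("D pd  :", count_unit(eig_pd), "unit eigenvalues out of", n + m)
```

For D = 0 the unit eigenvalue is defective, so the computed values are only accurate to roughly the square root of machine precision; this is why the count uses a loose tolerance.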
CP: use of Conjugate Gradient (CG) method
Preconditioned Projected CG
(Gould, Hribar, Nocedal, 2001; Rozloznik, Simoncini, 2002; Cafieri, D’Apuzzo, De Simone, di Serafino, 2007; Dollar, 2007)
Starting guess $(\delta x_1^0, \delta x_{21}^0, \delta x_{22}^0)$ such that
$$\begin{pmatrix} J_i & -D_i \\ J_e & 0 \end{pmatrix} \begin{pmatrix} \delta x_1^0 \\ \delta x_{21}^0 \end{pmatrix} = \begin{pmatrix} b_{21} \\ b_{22} \end{pmatrix}$$
CP+CG applied to the KKT system behaves as CG applied to a reduced system with matrix $Z^T (H + J_i^T D_i^{-1} J_i) Z$, using $Z^T (M + J_i^T D_i^{-1} J_i) Z$ as preconditioner (Z basis of ker($J_e$))
No breakdown
Convergence in at most n-m+p iterations, p = rank(D)
Building CP
Explicit:
$LBL^T$ factorization of P (B block diagonal with 1x1 or 2x2 blocks)
$LBL^T$ factorization of the Schur complement of -M:
$$P = \begin{pmatrix} M & J^T \\ J & -D \end{pmatrix} = \begin{pmatrix} I & 0 \\ J M^{-1} & L_0 \end{pmatrix} \begin{pmatrix} M & 0 \\ 0 & -B_0 \end{pmatrix} \begin{pmatrix} I & M^{-1} J^T \\ 0 & L_0^T \end{pmatrix}, \qquad S = D + J M^{-1} J^T = L_0 B_0 L_0^T$$
Implicit:
Schilders factorization of P (Dollar et al., 2006)
May still account for a large part of the computational cost!
“Requiring a factorization of P may still be considered a disadvantage, and methods which avoid this are urgently needed” (Gould, Orban, Toint, Acta Numerica, 2005)
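When M is diagonal, applying the preconditioner through the Schur complement S = D + J M^{-1} J^T needs a single factorization of S, reused at every application. The sketch below is a dense illustration under stated assumptions: S is symmetric positive definite here, so a Cholesky factorization is swapped in for the LBL^T factorization, the helper name is hypothetical, and the triangular solves use the general dense solver for brevity.

```python
import numpy as np

def make_cp_solver(m_diag, J, D):
    """Return a function applying P^{-1}, with P = [[diag(m_diag), J^T], [J, -D]]."""
    Minv = 1.0 / m_diag
    S = D + (J * Minv) @ J.T        # S = D + J M^{-1} J^T
    L = np.linalg.cholesky(S)       # computed once, reused for every right-hand side

    def apply(r1, r2):
        # Block elimination: S v = J M^{-1} r1 - r2, then u = M^{-1} (r1 - J^T v).
        rhs = J @ (Minv * r1) - r2
        v = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        u = Minv * (r1 - J.T @ v)
        return u, v

    return apply

rng = np.random.default_rng(0)
n, m = 7, 3
m_diag = rng.uniform(1.0, 3.0, n)
J = rng.standard_normal((m, n))
D = np.diag(rng.uniform(0.1, 1.0, m))
r1 = rng.standard_normal(n)
r2 = rng.standard_normal(m)

apply_cp = make_cp_solver(m_diag, J, D)
u, v = apply_cp(r1, r2)
print(u, v)
```

For large sparse problems the cost is dominated by forming and factorizing S, which is the part of the computation the slide warns about.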
Reducing the cost of building CP
$$P = \begin{pmatrix} M & J^T \\ J & -D \end{pmatrix}$$
Always set the (2,2)-block to 0 (Dollar, Gould, Schilders, Wathen, 2007)
Approximate J by dropping entries below a prescribed tolerance and outside a fixed band (case D = 0) (Bergamaschi, Gondzio, Venturin, Zilli, 2006)
Approximate the Schur complement $J M^{-1} J^T$ via incomplete Cholesky factorization (case D = 0) (Benzi, Simoncini, 2006)
Reuse CP for some IP iterations (Cafieri, D’Apuzzo, De Simone, di Serafino, 2007b)
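Reusing CP
The idea is not new (condensed system: Carpenter & Shanno, 1993; Karmarkar & Ramakrishnan, 1991)
Reusing CP means using an approximate CP:
$$\tilde P = \begin{pmatrix} \tilde M & J^T \\ J & -\tilde D \end{pmatrix}, \qquad \tilde D = \begin{pmatrix} \tilde Y^{-1} \tilde S & 0 \\ 0 & 0 \end{pmatrix}, \qquad \tilde M = \mathrm{diag}(Q + \tilde X^{-1} \tilde Z + \tilde V^{-1} \tilde T)$$
with $\tilde X, \tilde S, \tilde V, \tilde T, \tilde Y, \tilde Z$ computed at a previous outer step
CG cannot be used, apply SQMR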
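Reusing CP: spectral properties (work in progress)
rank(D) = p ≤ m, rank($D - \tilde D$) = j ≤ p
$\tilde P^{-1} K$ has
an eigenvalue at 1 with multiplicity at least 2m-p-j
at most 2j eigenvalues with nonzero imaginary part
The eigenvalues λ satisfy bounds in terms of the extreme eigenvalues $\lambda_{\min}$, $\lambda_{\max}$ of $\tilde H$ and $C$, with $\tilde H = \tilde M^{-1/2} H \tilde M^{-1/2}$, $\tilde J = J \tilde M^{-1/2}$, $\tilde S = \tilde D + J \tilde M^{-1} J^T = R R^T$ (the expression of $C$ and the bounds themselves are not legible in this transcript)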
Reusing CP
CP is reused until its effectiveness deteriorates:
the number of inner iterations must not increase too much
the number of outer steps for which CP is reused must be bounded
by taking into account the increasing ill conditioning and the accuracy requirements
while (PR stopping criterion not satisfied) do
    …
    k = k+1
    factorize P_k
    apply to the KKT system SQMR with P_k
    j = k; l = 0
    while (iit(k) ≤ α ∙ iit(j) and l ≤ β ∙ lmax) do
        apply to the KKT system SQMR with P_j
        k = k+1; l = l+1
    endwhile
    lmax = l
    …
endwhile
iit(k) = # inner iter. at outer step k
α = 2, β = 1 (by numerical experiments)
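The reuse rule can be exercised in isolation by replacing the SQMR solves with a callback that returns an inner iteration count. Everything in the sketch below beyond the two tests on `alpha` and `beta` is an assumption for illustration: the function and callback names, the fixed number of outer steps standing in for the PR stopping criterion, the initial value `lmax = 0`, and the toy iteration-count model. The counter is advanced before each solve so that the count being tested always refers to a completed solve.

```python
def reuse_cp_schedule(inner_iterations, n_outer, alpha=2.0, beta=1.0, lmax=0):
    """Return the list of (outer step k, index j of the preconditioner used)."""
    schedule = []
    k = 0
    while k < n_outer:                      # stand-in for the PR stopping criterion
        k += 1
        j = k                               # "factorize P_k"
        iit_j = inner_iterations(k, j)      # solve with the fresh preconditioner
        schedule.append((k, j))
        iit_k, l = iit_j, 0
        while k < n_outer and iit_k <= alpha * iit_j and l <= beta * lmax:
            k += 1
            l += 1
            iit_k = inner_iterations(k, j)  # solve step k reusing P_j
            schedule.append((k, j))
        lmax = l
    return schedule

def toy_inner_iterations(k, j):
    """Hypothetical model: 10 iterations with a fresh CP, 6 more per step of reuse."""
    return 10 + 6 * (k - j)

schedule = reuse_cp_schedule(toy_inner_iterations, n_outer=8)
factorizations = sum(1 for k, j in schedule if k == j)
print(schedule)
print("factorizations:", factorizations)
```

In this toy run the bound on the number of reuses grows by at most one per cycle, so the first preconditioner is reused once and the later ones twice, until the iteration count exceeds `alpha` times the count obtained with the fresh factorization.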
Termination control
Adaptive stopping criteria, according to the quality of the IP iterate (to reduce the number of inner iterations)
Basic idea: at early outer iterations, low accuracy in solving the KKT system; as the iterates approach the optimal solution, the accuracy requirement increases
Regard the IP method as an inexact Newton method: an IP step is an inexact Newton step for $h(w) = 0$, the KKT conditions of the original pb., whose residual combines $r^{(k)}$, the residual of the KKT system, and the perturbation $\Delta^{(k)}/\rho$ of the KKT cond. of the original pb.
condition for convergence:
$$\|r^{(k)}\| \le \eta^{(k)}\,\|h(w^{(k)})\|, \qquad \eta^{(k)} < 1 \text{ suitably chosen}$$
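The inexact Newton viewpoint can be illustrated on a small nonlinear system, with an inner CG solve stopped as soon as the linear residual falls below a fraction of the current nonlinear residual. This is a generic sketch and not the PRQP criterion: the test problem, the forcing-term rule `eta = min(eta_max, ||F(w)||)` and all names are assumptions for the example.

```python
import numpy as np

def cg(A, b, tol_abs, max_iter=1000):
    """Plain CG on A x = b, stopped when ||b - A x|| <= tol_abs."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = r @ r
    it = 0
    while np.sqrt(rr) > tol_abs and it < max_iter:
        Ap = A @ p
        a = rr / (p @ Ap)
        x += a * p
        r -= a * Ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
        it += 1
    return x, it

def inexact_newton(F, Jac, w, eta_max=0.5, tol=1e-10, max_outer=50):
    """Newton iteration whose linear systems are solved only up to eta * ||F(w)||."""
    history = []                      # (forcing term, inner iterations) per outer step
    for _ in range(max_outer):
        Fw = F(w)
        nF = np.linalg.norm(Fw)
        if nF <= tol:
            break
        eta = min(eta_max, nF)        # loose at early steps, tighter near the solution
        step, inner = cg(Jac(w), -Fw, eta * nF)
        history.append((eta, inner))
        w = w + step
    return w, history

n = 20
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
F = lambda w: A @ w + w ** 3 - b
Jac = lambda w: A + np.diag(3.0 * w ** 2)

w_star, history = inexact_newton(F, Jac, np.zeros(n))
for eta, inner in history:
    print(f"eta = {eta:.2e}   inner iterations = {inner}")
```

The printed history shows the pattern described above: few inner iterations while the iterate is far from the solution, more as the forcing term shrinks.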
Termination control
polynomial convergence under a bound on $\|r\|$, where r is the residual of the approx. KKT solution (the explicit bound is not legible in this transcript)
(Cafieri, D’Apuzzo, De Simone, di Serafino, Toraldo, 2007c)