Modern iterative methods

For basic iterative methods, convergence is linear; modern iterative methods converge faster.
– Krylov subspace methods
• Steepest descent method
• Conjugate gradient (CG) method --- most popular
• Preconditioned CG (PCG) method
• GMRES for nonsymmetric matrices
– Other methods (read yourself)
• Chebyshev iterative method
• Lanczos methods
• Conjugate gradient normal residual (CGNR)

Recall the basic iteration (from a splitting of A) and the minimization form of the problem:
$$Ax = b \iff Dx = Rx + c \implies x^{(m+1)} = D^{-1}Rx^{(m)} + D^{-1}c,$$
$$\min_{x \in \mathbb{R}^n} \phi(x) := \frac{1}{2}x^T A x - x^T b.$$
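For reference, a minimal NumPy sketch of such a basic iteration (the Jacobi splitting $A = D - R$, so $c = b$ here; function and variable names are my own, not from the slides):

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-10, max_iter=10_000):
    """Basic iteration x^(m+1) = D^{-1} R x^(m) + D^{-1} c with A = D - R, c = b."""
    D = np.diag(A)                 # diagonal part of A (stored as a vector)
    R = np.diag(D) - A             # remainder, so that A = diag(D) - R
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        x_new = (R @ x + b) / D    # one linearly convergent sweep
        if np.linalg.norm(x_new - x) <= tol:
            break
        x = x_new
    return x_new
```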
Modern iterative methods

Ideas:
– Minimizing the residual
– Projecting to a Krylov subspace

Thm: If A is an n-by-n real symmetric positive definite matrix, then the linear system and the minimization problem
$$Ax = b \qquad\text{and}\qquad \min_{x \in \mathbb{R}^n} \phi(x) = \min_{x \in \mathbb{R}^n}\left(\frac{1}{2}x^T A x - x^T b\right)$$
have the same solution
$$x^* = A^{-1}b, \qquad \phi(x^*) = -\frac{1}{2}b^T A^{-1} b.$$
Proof: see details in class.

Given a search direction $d^{(m)}$, one step of line minimization reads
$$x^{(m+1)} = x^{(m)} + \alpha_m d^{(m)}, \qquad \alpha_m = \operatorname*{argmin}_{\alpha}\, \phi(x^{(m)} + \alpha d^{(m)}).$$
Steepest descent method

Suppose we have an approximation $x_c \approx x^*$. Choose the direction as the negative gradient of $\phi$:
$$d_c = -\nabla \phi(x)\big|_{x = x_c} = -(Ax_c - b) = b - Ax_c =: r_c$$
– If $r_c = b - Ax_c = 0$, then $x_c$ is the exact solution!
– Else, choose $\alpha_c$ to minimize $\phi(x_c + \alpha d_c)$.
Steepest descent method

Computation: expand $\phi$ along the line $x_c + \alpha d_c$,
$$\phi(x_c + \alpha d_c) = \frac{1}{2}(x_c + \alpha d_c)^T A (x_c + \alpha d_c) - (x_c + \alpha d_c)^T b = \phi(x_c) + \alpha\, d_c^T (A x_c - b) + \frac{\alpha^2}{2}\, d_c^T A\, d_c.$$
Setting the derivative with respect to $\alpha$ to zero, choose $\alpha_c$ as
$$\alpha_c = \frac{d_c^T d_c}{d_c^T A\, d_c} = \frac{r_c^T r_c}{r_c^T A\, r_c} \qquad (\text{since } d_c = r_c).$$
Algorithm – Steepest descent method

    Initial guess x^(0)
    Compute r_0 = b - A x^(0) & set m = 0
    while ||r_m||_2 > eps (e.g. eps = 10^(-10)) do
        alpha_m = (r_m^T r_m) / (r_m^T A r_m)
        x^(m+1) = x^(m) + alpha_m r_m  &  r_(m+1) = b - A x^(m+1)
        m = m + 1
    endwhile
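A direct NumPy transcription of this algorithm (a sketch; the function and variable names are mine, not from the slides):

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10_000):
    """Solve A x = b, with A symmetric positive definite, by steepest descent."""
    x = x0.astype(float).copy()
    r = b - A @ x                    # r_0 = b - A x^(0)
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            break
        Ar = A @ r
        alpha = (r @ r) / (r @ Ar)   # exact line minimization along r_m
        x += alpha * r               # x^(m+1) = x^(m) + alpha_m r_m
        r -= alpha * Ar              # r_(m+1) = b - A x^(m+1) = r_m - alpha_m A r_m
    return x
```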
Theory – Steepest descent method

Suppose A is symmetric positive definite.
– Define the A-inner product: $(x, y)_A := (Ax, y) = y^T A x$
– Define the A-norm: $\|x\|_A := \sqrt{(x, x)_A} = \sqrt{x^T A x}$

In this notation, the steepest descent method reads: initial guess $x^{(0)}$ & $r_0 = b - Ax^{(0)}$, then
$$\alpha_m = \frac{(r_m, r_m)}{(r_m, r_m)_A}, \qquad x^{(m+1)} = x^{(m)} + \alpha_m r_m, \qquad r_{m+1} = b - Ax^{(m+1)}.$$
Theory

Thm: For the steepest descent method, we have
$$\phi(x^{(m+1)}) - \phi(x^*) \le \left(1 - \frac{1}{k_2(A)}\right)\left(\phi(x^{(m)}) - \phi(x^*)\right),$$
where $x^* = A^{-1}b$ and $\phi(x^*) = -\frac{1}{2}b^T A^{-1} b$.
Proof: Exercise.
Theory

Rewrite the steepest descent method: along the direction $r_m$, compare the exact line-minimization step with a step of arbitrary size $\tilde\alpha$:
$$x^{(m+1)} = x^{(m)} + \alpha_m r_m, \qquad \tilde x^{(m+1)} = x^{(m)} + \tilde\alpha\, r_m, \qquad \alpha_m = \frac{(r_m, r_m)}{(r_m, r_m)_A}.$$
Let the errors be
$$e^{(m)} := x^{(m)} - x^*, \qquad \tilde e^{(m+1)} := \tilde x^{(m+1)} - x^*.$$
Lemma: For the method, we have
$$(e^{(m+1)}, r_m)_A = 0, \qquad\text{and hence}\qquad \|e^{(m+1)}\|_A \le \|\tilde e^{(m+1)}\|_A \quad\text{for any } \tilde\alpha.$$
Theory

Thm: For the steepest descent method, we have
$$\|e^{(m+1)}\|_A \le \frac{k_2(A) - 1}{k_2(A) + 1}\, \|e^{(m)}\|_A,$$
i.e. the algorithm converges monotonically in the sense of the A-norm.
Proof: See details in class (or as an exercise).
Steepest descent method

Performance
– Converges globally, for any initial data
– If $k_2(A) = O(1)$, then it converges very fast
– If $k_2(A) \gg 1$, then it converges very slowly!!!

Geometric interpretation
– The contour plots of $\phi$ are long, flat ellipses!!
– The local best direction (steepest descent direction) is not necessarily a good global direction
– Computational experience shows that the method suffers a decreasing convergence rate after a few iteration steps because the search directions become linearly dependent!!! (See the numerical illustration below.)
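A quick numerical illustration of this dependence on $k_2(A)$ (my own test problem, using diagonal SPD matrices so that the condition number is known exactly):

```python
import numpy as np

def sd_iterations(A, b, tol=1e-8, max_iter=100_000):
    """Number of steepest-descent steps until ||r||_2 <= tol."""
    x = np.zeros_like(b)
    r = b - A @ x
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol:
            return k
        Ar = A @ r
        alpha = (r @ r) / (r @ Ar)
        x += alpha * r
        r -= alpha * Ar
    return max_iter

b = np.ones(2)
for kappa in (2.0, 100.0, 1000.0):
    A = np.diag([1.0, kappa])               # SPD with k_2(A) = kappa
    print(int(kappa), sd_iterations(A, b))  # iteration count grows roughly like kappa
```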
Conjugate gradient (CG) method

Since A is symmetric positive definite, the A-norm
$$\|x\|_A = \sqrt{(x, x)_A} = \sqrt{x^T A x}$$
is well defined. In the CG method, the direction vectors are chosen to be A-orthogonal (and are called conjugate vectors), i.e.
$$(d^{(m)})^T A\, d^{(i)} = 0, \qquad i \ne m.$$
CG method

In addition, we take the new direction vector as a linear combination of the old direction vector and the descent direction:
$$d^{(m+1)} = r_{m+1} + \beta_m d^{(m)}, \qquad r_{m+1} = b - A x^{(m+1)}.$$
By the assumption $(d^{(m+1)})^T A\, d^{(m)} = 0$, we get
$$0 = (r_{m+1} + \beta_m d^{(m)})^T A\, d^{(m)} \quad\implies\quad \beta_m = -\frac{r_{m+1}^T A\, d^{(m)}}{(d^{(m)})^T A\, d^{(m)}}.$$
Algorithm – CG Method

    Choose initial guess x^(0)
    Compute r_0 = b - A x^(0) & set d_0 = r_0
    For m = 0, 1, ..., do
        Compute alpha_m = (r_m^T r_m) / (d_m^T A d_m)
        x^(m+1) = x^(m) + alpha_m d_m  &  r_(m+1) = r_m - alpha_m A d_m
        If ||r_(m+1)||_2 <= eps (e.g. eps = 10^(-10)), then
            stop
        endif
        beta_m = (r_(m+1)^T r_(m+1)) / (r_m^T r_m)  &  d_(m+1) = r_(m+1) + beta_m d_m
    endfor

(This $\beta_m$ is equivalent to the expression $-r_{m+1}^T A\, d^{(m)} / (d^{(m)})^T A\, d^{(m)}$ derived on the previous slide.)
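A direct NumPy transcription of this algorithm (a sketch; the names are mine, not from the slides):

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10):
    """Solve A x = b, with A symmetric positive definite, by the CG method."""
    x = x0.astype(float).copy()
    r = b - A @ x            # residual r_0
    d = r.copy()             # first direction d_0 = r_0
    rr = r @ r
    for _ in range(len(b)):  # at most n steps in exact arithmetic
        Ad = A @ d
        alpha = rr / (d @ Ad)        # alpha_m = (r_m^T r_m) / (d_m^T A d_m)
        x += alpha * d               # x^(m+1) = x^(m) + alpha_m d_m
        r -= alpha * Ad              # r_(m+1) = r_m - alpha_m A d_m
        rr_new = r @ r
        if np.sqrt(rr_new) <= tol:   # stop when the residual is small
            break
        beta = rr_new / rr           # beta_m = (r_(m+1)^T r_(m+1)) / (r_m^T r_m)
        d = r + beta * d             # d_(m+1) = r_(m+1) + beta_m d_m
        rr = rr_new
    return x
```

The loop is capped at n passes because of the finite-termination property; in floating-point arithmetic one may allow a few more iterations.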
An example

Consider
$$A = \begin{pmatrix} 5 & 1 & 1 \\ 1 & 5 & 1 \\ 1 & 1 & 5 \end{pmatrix}, \qquad b = \begin{pmatrix} 7 \\ 7 \\ 7 \end{pmatrix}, \qquad Ax = b, \qquad x^* = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$
Initial guess:
$$x^{(0)} = (0.0000,\ 0.0000,\ 0.0000)^T.$$
The first step gives $r_0 = d_0 = (7.0000,\ 7.0000,\ 7.0000)^T$, $\alpha_0 = 0.1429$, and the approximate solution
$$x^{(1)} = (1.0003,\ 1.0003,\ 1.0003)^T.$$
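One can verify this step with a few lines of NumPy (my own check, not part of the slides). Note that in exact arithmetic $\alpha_0 = 1/7$ and $x^{(1)}$ is already the exact solution, because $r_0$ happens to be an eigenvector of $A$; the value 1.0003 comes from the rounded $\alpha_0 = 0.1429$.

```python
import numpy as np

A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])
b = np.array([7., 7., 7.])
x0 = np.zeros(3)

r0 = b - A @ x0                       # r_0 = d_0 = (7, 7, 7)^T
alpha0 = (r0 @ r0) / (r0 @ (A @ r0))
print(alpha0)                         # 0.142857... (rounded to 0.1429 on the slide)
print(x0 + alpha0 * r0)               # [1. 1. 1.] = exact solution in one step
```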
CG method

In the CG method, the directions $d_m$ & $d_{m+1}$ are A-orthogonal, i.e. $(d^{(m+1)})^T A\, d^{(m)} = 0$ (orthogonality with respect to the A-inner product)!

Define the linear space spanned by the direction vectors as
$$\operatorname{span}\{d_0, d_1, \ldots, d_m\} = \Big\{\, y \ \Big|\ y = \sum_{i=0}^{m} \gamma_i d_i,\ \gamma_i \in \mathbb{R} \,\Big\}.$$
Lemma: In the CG method, for m = 0, 1, ..., we have
$$\operatorname{span}\{d_0, d_1, \ldots, d_m\} = \operatorname{span}\{r_0, r_1, \ldots, r_m\} = \operatorname{span}\{r_0, A r_0, \ldots, A^m r_0\}.$$
– Proof: See details in class or as an exercise.
CG method

In the CG method, $d_{m+1}$ is A-orthogonal to $d_0, d_1, \ldots, d_m$, i.e. to $\operatorname{span}\{d_0, d_1, \ldots, d_m\}$, with respect to the A-inner product.

Lemma: In the CG method, we have
$$(d^{(i)})^T A\, d^{(j)} = 0, \qquad r_i^T r_j = 0, \qquad i \ne j.$$
– Proof: See details in class or as an exercise.

Thm (error estimate for the CG method): for m = 0, 1, 2, ..., we have
$$\|e^{(m)}\|_A \le 2\left(\frac{\sqrt{k_2(A)} - 1}{\sqrt{k_2(A)} + 1}\right)^m \|e^{(0)}\|_A,$$
where $e^{(m)} = x^{(m)} - x^*$, $\|x\|_A = \sqrt{x^T A x}$, and $k_2(A) = \lambda_{\max}(A)/\lambda_{\min}(A)$.
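These orthogonality relations are easy to observe numerically. The sketch below (my own construction) stores all CG residuals and directions for a small random SPD system and checks that $r_i^T r_j \approx 0$ and $d_i^T A\, d_j \approx 0$ for $i \ne j$:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6 * np.eye(6)        # random symmetric positive definite matrix
b = rng.standard_normal(6)

# Run CG, keeping every residual r_m and direction d_m
x = np.zeros(6)
r = b - A @ x
d = r.copy()
R, D = [r.copy()], [d.copy()]
for _ in range(6):
    Ad = A @ d
    alpha = (r @ r) / (d @ Ad)
    x += alpha * d
    r_new = r - alpha * Ad
    beta = (r_new @ r_new) / (r @ r)
    d = r_new + beta * d
    r = r_new
    R.append(r.copy()); D.append(d.copy())

R, D = np.array(R), np.array(D)
print(np.max(np.abs(np.triu(R @ R.T, k=1))))      # r_i^T r_j,   i < j: ~0
print(np.max(np.abs(np.triu(D @ A @ D.T, k=1))))  # d_i^T A d_j, i < j: ~0
```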
CG method

Computational cost
– At each iteration, 2 matrix-vector multiplications; this can be further reduced to 1 matrix-vector multiplication (compute $A d_m$ once and reuse it in the updates of both $x$ and $r$)
– After at most n steps, we get the exact solution (in exact arithmetic)!!!

Convergence rate depends on the condition number
– $k_2(A) = O(1)$: converges very fast!!
– $k_2(A) \gg 1$: converges slowly, but can be accelerated by preconditioning!!
Preconditioning

Idea: Replace $Ax = b$ by $\tilde A \tilde x = \tilde b$ with
$$\tilde A = C^{-1} A\, C^{-1}, \qquad \tilde x = C x, \qquad \tilde b = C^{-1} b,$$
satisfying
– C is symmetric positive definite
– $\tilde A$ is well-conditioned, i.e. $k_2(\tilde A) \ll k_2(A)$
– $\tilde A \tilde x = \tilde b$ can be easily solved

Conditions for choosing the preconditioning matrix
– $k_2(\tilde A)$ as small as possible
– Systems with C (i.e. $Cz = r$) are easy to solve
– Trade-off between these two requirements
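For intuition, here is a small demonstration (my own example) of how even the simple diagonal choice $C = \operatorname{diag}(A)^{1/2}$ can shrink the condition number:

```python
import numpy as np

A = np.diag([1.0, 10.0, 100.0, 1000.0]) + 0.1 * np.ones((4, 4))
Cinv = np.diag(1.0 / np.sqrt(np.diag(A)))  # C^{-1} for C = diag(A)^{1/2}
A_tilde = Cinv @ A @ Cinv                  # preconditioned matrix C^{-1} A C^{-1}

print(np.linalg.cond(A))        # k_2(A): large (~1000)
print(np.linalg.cond(A_tilde))  # k_2(A~): close to 1
```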
Algorithm – PCG Method

    Choose initial guess x^(0) & compute r_0 = b - A x^(0)
    Solve C r~_0 = r_0 & set d_0 = r~_0
    For m = 0, 1, ..., do
        Compute alpha_m = (r~_m^T r_m) / (d_m^T A d_m)
        x^(m+1) = x^(m) + alpha_m d_m  &  r_(m+1) = r_m - alpha_m A d_m
        If ||r_(m+1)||_2 <= eps (e.g. eps = 10^(-10)), then
            stop
        endif
        Solve C r~_(m+1) = r_(m+1)
        beta_m = (r~_(m+1)^T r_(m+1)) / (r~_m^T r_m)  &  d_(m+1) = r~_(m+1) + beta_m d_m
    endfor
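A minimal NumPy sketch of this algorithm, using the diagonal (Jacobi) preconditioner $C = \operatorname{diag}(A)$ as one simple choice from the list on the next slide (all names are mine; z plays the role of $\tilde r$):

```python
import numpy as np

def pcg(A, b, x0, tol=1e-10, max_iter=None):
    """Preconditioned CG with the Jacobi preconditioner C = diag(A)."""
    C = np.diag(A)               # store only the diagonal; solving Cz = r is trivial
    x = x0.astype(float).copy()
    r = b - A @ x
    z = r / C                    # solve C z_0 = r_0
    d = z.copy()
    zr = z @ r
    for _ in range(max_iter or len(b)):
        Ad = A @ d
        alpha = zr / (d @ Ad)    # alpha_m = (z_m^T r_m) / (d_m^T A d_m)
        x += alpha * d
        r -= alpha * Ad
        if np.linalg.norm(r) <= tol:
            break
        z = r / C                # solve C z_(m+1) = r_(m+1)
        zr_new = z @ r
        beta = zr_new / zr       # beta_m = (z_(m+1)^T r_(m+1)) / (z_m^T r_m)
        d = z + beta * d
        zr = zr_new
    return x
```

Solving $Cz = r$ must stay cheap relative to a matrix-vector product; for the Jacobi choice it is a single elementwise division.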
Preconditioning

Ways to choose the matrix C (read yourself)
– Diagonal part of A
– Tridiagonal part of A
– m-step Jacobi preconditioner
– Symmetric Gauss-Seidel preconditioner
– SSOR preconditioner
– Incomplete Cholesky decomposition
– Incomplete block preconditioning
– Preconditioning based on domain decomposition
– …
Extension of CG method to nonsymmetric matrices

Biconjugate gradient (BiCG) method
– Solves $Ax = b$ & $A^T y = b$ simultaneously
– Works well when A is positive definite but not symmetric
– If A is symmetric, BiCG reduces to CG

Conjugate gradient squared (CGS) method
– Useful when A has a special formula for computing $Ax$ but its transpose does not
– Multiplication by A is efficient, but multiplication by its transpose is not
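In practice one typically calls library implementations; SciPy, for instance, provides both solvers (a usage sketch with a made-up nonsymmetric test matrix):

```python
import numpy as np
from scipy.sparse.linalg import bicg, cgs

rng = np.random.default_rng(1)
A = 10 * np.eye(20) + rng.standard_normal((20, 20))  # nonsymmetric, positive definite
b = rng.standard_normal(20)

x1, info1 = bicg(A, b)   # info == 0 signals convergence
x2, info2 = cgs(A, b)
print(info1, np.linalg.norm(A @ x1 - b))
print(info2, np.linalg.norm(A @ x2 - b))
```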
Krylov subspace methods

Problem I. Linear system:
$$Ax = b$$
Problem II. Variational formulation: find $x \in \mathbb{R}^n$ such that
$$(Ax, v) = (b, v) \qquad \forall\, v \in \mathbb{R}^n$$
Problem III. Minimization problem:
$$\min_{x \in \mathbb{R}^n} \phi(x) := \frac{1}{2}x^T A x - x^T b = \frac{1}{2}(Ax, x) - (b, x)$$
– Thm 1: Problem I is equivalent to Problem II
– Thm 2: If A is symmetric positive definite, all three problems are equivalent
Krylov subspace methods

To reduce the problem size, we replace $\mathbb{R}^n$ by a subspace
$$S_m = \operatorname{span}\{d_0, d_1, \ldots, d_{m-1}\} \qquad \text{with an initial guess } x^{(0)},$$
and seek
$$x^{(m)} = x^{(0)} + \sum_{k=0}^{m-1} \gamma_k d_k \ \in\ x^{(0)} + S_m.$$

Subspace minimization:
– Find $x^{(m)} \in x^{(0)} + S_m$
– Such that $\phi(x^{(m)}) = \min_{x \in x^{(0)} + S_m} \phi(x) = \min_{x \in x^{(0)} + S_m} \frac{1}{2}(Ax, x) - (b, x)$

Subspace projection:
– Find $x^{(m)} \in x^{(0)} + S_m$
– Such that $(Ax^{(m)}, v) = (b, v)$ for all $v \in S_m$, i.e. $(Ax^{(m)} - b, d_k) = 0$ for $0 \le k \le m-1$
Krylov subspace methods

To determine the coefficients $\gamma_0, \ldots, \gamma_{m-1}$, we have the
– Normal equations: substituting $x^{(m)} = x^{(0)} + \sum_k \gamma_k d_k$ into the projection conditions gives $\sum_{k=0}^{m-1} (A d_k, d_j)\,\gamma_k = (b - Ax^{(0)}, d_j)$, $0 \le j \le m-1$
– It is a linear system of size m!!

m = 1: line minimization, or line search, or 1D projection.
By converting this formula into an iteration, we reduce the original problem to a sequence of line minimizations (successive line minimization), as sketched below.
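As a concrete sketch of this projection step (my own illustration): assemble the m-by-m normal equations $G\gamma = \rho$, with $G_{jk} = (A d_k, d_j)$ and $\rho_j = (b - Ax^{(0)}, d_j)$, and solve for the coefficients; with a single direction (m = 1) this is exactly one line minimization.

```python
import numpy as np

def subspace_projection(A, b, x0, D):
    """Project A x = b onto x0 + span(columns of D); D is n x m."""
    G = D.T @ A @ D               # normal-equations matrix, G_jk = (A d_k, d_j)
    rho = D.T @ (b - A @ x0)      # right-hand side, rho_j = (b - A x0, d_j)
    gamma = np.linalg.solve(G, rho)
    return x0 + D @ gamma         # x^(m) = x^(0) + sum_k gamma_k d_k

# m = 1: a single line minimization along d_0 (here the steepest-descent direction)
A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])
b = np.array([7., 7., 7.])
x0 = np.zeros(3)
d0 = (b - A @ x0).reshape(3, 1)
print(subspace_projection(A, b, x0, d0))   # [1. 1. 1.]
```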