FMB - NLA: Multilevel preconditioning methods
Multigrid (MG) and Local Refinement for Elliptic Partial Differential Equations
Klaus Stüben
FhG-SCAI, Schloss Birlinghoven, 53754 St. Augustin, Germany
e-mail: [email protected]
Multigrid Tutorial
Overview
• Why multigrid?
• Basic multigrid principles
• Full multigrid (FMG)
• Nonlinear multigrid (FAS)
• Eigenproblems
• Local Refinements
Historical Papers
• Discovery of multigrid (theoretical)
– Fedorenko, R.P.: The speed of convergence of an iterative method, USSR Comput. Math. and Math. Phys. 4,3 (1964).
– Bakhvalov, N.S.: On the convergence of a relaxation method with natural constraints on the elliptic operator, USSR Comput. Math. and Math. Phys. 6,5 (1966).
• Beginning of multigrid
– Brandt, A.: Multi-level adaptive technique (MLAT) for fast numerical solution to boundary value problems, Lecture Notes in Physics 18, Springer (1973).
– Brandt, A.: Multi-level adaptive solutions to boundary value problems, Math. Comp. 31 (1977).
• Re-discovery of multigrid
– Hackbusch, W.: On the multigrid method applied to difference equations, Computing 20 (1978).
Text Books
• Classical
– Stüben, K.; Trottenberg, U.: Multigrid methods: Fundamental algorithms, model problem analysis and applications, Lecture Notes in Mathematics 960, Springer (1982).
– Brandt, A.: Multigrid techniques: 1984 Guide with applications to fluid dynamics, GMD-Studie No. 85 (1984).
• Theory
– Hackbusch, W.: Multigrid methods and applications, Springer Series in Comp. Math. 4, Springer (1985).
– McCormick, S. (ed.): Multigrid methods, Frontiers in Applied Mathematics, Vol. 5, SIAM, Philadelphia (1987).
• Tutorial-level
– Briggs, W.: A multigrid tutorial, SIAM, Philadelphia (1987). New edition: 2001.
• Engineers
– Wesseling, P.: An introduction to multigrid methods, Pure and Applied Mathematics Series, John Wiley and Sons (1992).
• Engineers and Practitioners
– Trottenberg, U.; Oosterlee, C.W.; Schüller, A.: Multigrid, Academic Press (2001), with appendices by Brandt, A., Oswald, P. and Stüben, K.
A Multigrid Tutorial
By William L. Briggs
Presented by Van Emden Henson
Center for Applied Scientific Computing, Lawrence Livermore National Laboratory
This work was performed, in part, under the auspices of the United States Department of Energy by University of California Lawrence Livermore National Laboratory under contract number W-7405-Eng-48.
First observation toward multigrid
• Many relaxation schemes have the smoothing property, where oscillatory modes of the error are eliminated effectively, but smooth modes are damped very slowly.
• This might seem like a limitation, but by using coarse grids we can use the smoothing property to good advantage.
• Why use coarse grids?
[Figure: the fine grid $\Omega^h$ with points $x_0, \dots, x_N$ above the coarse grid $\Omega^{2h}$ with points $x_0, \dots, x_{N/2}$]
Reason #1 for using coarse grids: Nested Iteration
• Coarse grids can be used to compute an improved initial guess for the fine-grid relaxation. This is advantageous because:
– Relaxation on the coarse grid is much cheaper (1/2 as many points in 1D, 1/4 in 2D, 1/8 in 3D)
– Relaxation on the coarse grid has a marginally better convergence rate, for example $1 - O(4h^2)$ instead of $1 - O(h^2)$
Idea! Nested Iteration
• …
• Relax on $Au=f$ on $\Omega^{4h}$ to obtain initial guess $v^{2h}$
• Relax on $Au=f$ on $\Omega^{2h}$ to obtain initial guess $v^h$
• Relax on $Au=f$ on $\Omega^h$ to obtain … final solution???
• But, what is $Au=f$ on $\Omega^{2h}$, $\Omega^{4h}$, … ?
• What if the error still has smooth components when we get to the fine grid $\Omega^h$?
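1D Interpolation (Prolongation)
• Mapping from the coarse grid to the fine grid: $I_{2h}^h : \Omega^{2h} \to \Omega^h$
• Let $v^h$, $v^{2h}$ be defined on $\Omega^h$, $\Omega^{2h}$. Then
$$I_{2h}^h v^{2h} = v^h,$$
where
$$v_{2i}^h = v_i^{2h}, \qquad v_{2i+1}^h = \tfrac{1}{2}\left(v_i^{2h} + v_{i+1}^{2h}\right) \quad \text{for } 0 \le i \le \tfrac{N}{2} - 1.$$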
1D Interpolation (Prolongation)
[Figure: interpolation from the coarse grid $\Omega^{2h}$ to the fine grid $\Omega^h$]
• Values at points on the coarse grid map unchanged to the fine grid
• Values at fine-grid points NOT on the coarse grid are the averages of their coarse-grid neighbors
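The two rules above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the tutorial: the function name `interpolate_1d` is hypothetical, only interior points are stored, and zero boundary values are assumed.

```python
import numpy as np

def interpolate_1d(v2h):
    """Linear interpolation of interior coarse-grid values to the fine grid.

    Assumes homogeneous (zero) boundary values; v2h holds the N/2 - 1
    interior coarse values, the result holds the N - 1 interior fine values.
    """
    n_coarse = len(v2h)
    vh = np.zeros(2 * n_coarse + 1)
    # Fine points that coincide with coarse points keep the coarse value.
    vh[1::2] = v2h
    # Remaining fine points get the average of their two coarse neighbors
    # (the boundary neighbors are the assumed zeros).
    padded = np.concatenate(([0.0], v2h, [0.0]))
    vh[0::2] = 0.5 * (padded[:-1] + padded[1:])
    return vh

fine = interpolate_1d(np.array([2.0, 4.0, 6.0]))
```

For the coarse vector (2, 4, 6) with N = 8, the coarse values reappear at every second fine point and the points in between hold the neighbor averages.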
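The prolongation operator (1D)
• We may regard $I_{2h}^h$ as a linear operator from $\mathbb{R}^{N/2-1}$ to $\mathbb{R}^{N-1}$
• e.g., for N=8,
$$\begin{pmatrix} 1/2 & & \\ 1 & & \\ 1/2 & 1/2 & \\ & 1 & \\ & 1/2 & 1/2 \\ & & 1 \\ & & 1/2 \end{pmatrix}_{7 \times 3} \begin{pmatrix} v_1^{2h} \\ v_2^{2h} \\ v_3^{2h} \end{pmatrix}_{3 \times 1} = \begin{pmatrix} v_1^h \\ v_2^h \\ v_3^h \\ v_4^h \\ v_5^h \\ v_6^h \\ v_7^h \end{pmatrix}_{7 \times 1}$$
• $I_{2h}^h$ has full rank, and thus a trivial null space, $N(I_{2h}^h) = \{0\}$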
1D Restriction by injection
• Mapping from the fine grid to the coarse grid: $I_h^{2h} : \Omega^h \to \Omega^{2h}$
• Let $v^h$, $v^{2h}$ be defined on $\Omega^h$, $\Omega^{2h}$. Then
$$I_h^{2h} v^h = v^{2h},$$
where $v_i^{2h} = v_{2i}^h$.
1D Restriction by full-weighting
• Let $v^h$, $v^{2h}$ be defined on $\Omega^h$, $\Omega^{2h}$. Then
$$I_h^{2h} v^h = v^{2h},$$
where
$$v_i^{2h} = \tfrac{1}{4}\left(v_{2i-1}^h + 2 v_{2i}^h + v_{2i+1}^h\right).$$
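Prolongation and restriction are often nicely related
• For the 1D examples, linear interpolation and full-weighting are related by:
$$I_{2h}^h = \frac{1}{2}\begin{pmatrix} 1 & & \\ 2 & & \\ 1 & 1 & \\ & 2 & \\ & 1 & 1 \\ & & 2 \\ & & 1 \end{pmatrix}, \qquad I_h^{2h} = \frac{1}{4}\begin{pmatrix} 1 & 2 & 1 & & & & \\ & & 1 & 2 & 1 & & \\ & & & & 1 & 2 & 1 \end{pmatrix}$$
• A commonly used, and highly useful, requirement is that
$$I_h^{2h} = c\,(I_{2h}^h)^T \quad \text{for } c \text{ in } \mathbb{R}.$$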
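The relation between the two transfer operators can be checked numerically for the N = 8 example. The sketch below is illustrative only; the helper names `prolongation_matrix` and `full_weighting_matrix` are hypothetical and only interior grid points are represented.

```python
import numpy as np

def prolongation_matrix(n_coarse):
    """Linear interpolation matrix of shape (2*n_coarse + 1, n_coarse)."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        # Coarse point j feeds its own fine point and the two fine neighbors.
        P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def full_weighting_matrix(n_coarse):
    """Full-weighting restriction matrix of shape (n_coarse, 2*n_coarse + 1)."""
    R = np.zeros((n_coarse, 2 * n_coarse + 1))
    for i in range(n_coarse):
        R[i, 2 * i:2 * i + 3] = [0.25, 0.5, 0.25]
    return R

P = prolongation_matrix(3)      # N = 8: 7 fine and 3 coarse interior points
R = full_weighting_matrix(3)
c = 0.5
relation_holds = np.allclose(R, c * P.T)
rank_of_P = np.linalg.matrix_rank(P)
```

With these stencils the constant is c = 1/2, and the rank check confirms that the interpolation matrix has full column rank.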
Now, let’s put all these ideas together
• Nested Iteration (effective on smooth error modes)
• Relaxation (effective on oscillatory error modes)
• Residual equation (i.e., residual correction)
• Prolongation and Restriction
Coarse Grid Correction Scheme: $v^h \leftarrow CG(v^h, f^h, \alpha_1, \alpha_2)$
• 1) Relax $\alpha_1$ times on $A^h u^h = f^h$ on $\Omega^h$ with arbitrary initial guess $v^h$.
• 2) Compute $r^h = f^h - A^h v^h$.
• 3) Compute $r^{2h} = I_h^{2h} r^h$.
• 4) Solve $A^{2h} e^{2h} = r^{2h}$ on $\Omega^{2h}$.
• 5) Correct fine-grid solution $v^h \leftarrow v^h + I_{2h}^h e^{2h}$.
• 6) Relax $\alpha_2$ times on $A^h u^h = f^h$ on $\Omega^h$ with initial guess $v^h$.
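The six steps can be traced in a short NumPy sketch. Everything concrete here is an assumption made for illustration: the model operator is the 1D Poisson matrix, the relaxation is weighted Jacobi with weight 2/3, restriction is full weighting, and the coarse operator is simply the same discretization on the coarse grid. All function names are hypothetical.

```python
import numpy as np

def poisson_1d(n):
    """Assumed model operator: -u'' on (0, 1), n interior points, h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def weighted_jacobi(A, v, f, sweeps, omega=2.0 / 3.0):
    """Assumed relaxation scheme: weighted Jacobi."""
    d = np.diag(A)
    for _ in range(sweeps):
        v = v + omega * (f - A @ v) / d
    return v

def prolongation_matrix(n_coarse):
    """Linear interpolation from n_coarse to 2*n_coarse + 1 interior points."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def coarse_grid_correction(v, f, alpha1=2, alpha2=2):
    n = len(v)
    n_coarse = (n - 1) // 2
    A_h, A_2h = poisson_1d(n), poisson_1d(n_coarse)
    P = prolongation_matrix(n_coarse)
    R = 0.5 * P.T                               # full weighting
    v = weighted_jacobi(A_h, v, f, alpha1)      # 1) pre-relaxation
    r_h = f - A_h @ v                           # 2) fine-grid residual
    r_2h = R @ r_h                              # 3) restrict the residual
    e_2h = np.linalg.solve(A_2h, r_2h)          # 4) solve the coarse problem
    v = v + P @ e_2h                            # 5) correct
    return weighted_jacobi(A_h, v, f, alpha2)   # 6) post-relaxation
```

Calling `coarse_grid_correction` repeatedly on the same right-hand side drives the error down by a roughly constant factor per call, which is the behavior the scheme is designed to deliver.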
Coarse-grid Correction
• Relax on $A^h u^h = f^h$
• Compute $r^h = f^h - A^h u^h$
• Restrict $r^{2h} = I_h^{2h} r^h$
• Solve $A^{2h} e^{2h} = r^{2h}$, i.e. $e^{2h} = (A^{2h})^{-1} r^{2h}$
• Interpolate $e^h \approx I_{2h}^h e^{2h}$
• Correct $u^h \leftarrow u^h + e^h$
What is $A^{2h}$?
• For this scheme to work, we must have $A^{2h}$, a coarse-grid operator. For the moment, we will simply assume that $A^{2h}$ is “the coarse-grid version” of the fine-grid operator $A^h$.
• We will return to the question of constructing $A^{2h}$ later.
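How do we “solve” the coarse-grid residual equation? Recursion!
Descending to coarser grids:
• $u^h \leftarrow G^{\nu}(A^h, f^h)$, then $f^{2h} \leftarrow I_h^{2h}(f^h - A^h u^h)$
• $u^{2h} \leftarrow G^{\nu}(A^{2h}, f^{2h})$, then $f^{4h} \leftarrow I_{2h}^{4h}(f^{2h} - A^{2h} u^{2h})$
• $u^{4h} \leftarrow G^{\nu}(A^{4h}, f^{4h})$, then $f^{8h} \leftarrow I_{4h}^{8h}(f^{4h} - A^{4h} u^{4h})$
• $u^{8h} \leftarrow G^{\nu}(A^{8h}, f^{8h})$, …
On the coarsest grid:
• $e^H = (A^H)^{-1} f^H$
Ascending to finer grids:
• …, $u^{8h} \leftarrow u^{8h} + e^{8h}$
• $e^{4h} \leftarrow I_{8h}^{4h} u^{8h}$, then $u^{4h} \leftarrow u^{4h} + e^{4h}$
• $e^{2h} \leftarrow I_{4h}^{2h} u^{4h}$, then $u^{2h} \leftarrow u^{2h} + e^{2h}$
• $e^h \leftarrow I_{2h}^h u^{2h}$, then $u^h \leftarrow u^h + e^h$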
V-cycle (recursive form): $v^h \leftarrow MV^h(v^h, f^h)$
1) Relax $\alpha_1$ times on $A^h u^h = f^h$, initial $v^h$ arbitrary
2) If $\Omega^h$ is the coarsest grid, go to 4). Else:
$f^{2h} \leftarrow I_h^{2h}(f^h - A^h v^h)$
$v^{2h} \leftarrow 0$
$v^{2h} \leftarrow MV^{2h}(v^{2h}, f^{2h})$
3) Correct $v^h \leftarrow v^h + I_{2h}^h v^{2h}$
4) Relax $\alpha_2$ times on $A^h u^h = f^h$, initial guess $v^h$
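The recursive form translates almost line by line into code. The sketch below makes several assumptions that are not part of the slide: a 1D Poisson model operator, weighted Jacobi relaxation, full-weighting restriction, and a one-point coarsest grid that is solved exactly rather than only relaxed. The function names are hypothetical.

```python
import numpy as np

def poisson_1d(n):
    """Assumed model operator: -u'' on (0, 1), n interior points."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def weighted_jacobi(A, v, f, sweeps, omega=2.0 / 3.0):
    d = np.diag(A)
    for _ in range(sweeps):
        v = v + omega * (f - A @ v) / d
    return v

def prolongation_matrix(n_coarse):
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def v_cycle(v, f, alpha1=1, alpha2=1):
    """One V-cycle; the grid size must be 2**m - 1 interior points."""
    n = len(v)
    A = poisson_1d(n)
    if n == 1:
        # Coarsest grid (assumption of this sketch): solve exactly.
        return f / A[0, 0]
    v = weighted_jacobi(A, v, f, alpha1)             # 1) pre-relaxation
    n_coarse = (n - 1) // 2
    P = prolongation_matrix(n_coarse)
    f_2h = 0.5 * P.T @ (f - A @ v)                   # 2) restricted residual
    v_2h = v_cycle(np.zeros(n_coarse), f_2h, alpha1, alpha2)
    v = v + P @ v_2h                                 # 3) correct
    return weighted_jacobi(A, v, f, alpha2)          # 4) post-relaxation
```

Matrices are rebuilt on every call to keep the sketch short; a practical implementation would set up the grid hierarchy once and reuse it.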
Storage Costs: $v^h$ and $f^h$ must be stored on each level
In 1-d, each coarse grid has about half the number of points as the finer grid.
In 2-d, each coarse grid has about one-fourth the number of points as the finer grid.
In d dimensions, each coarse grid has about $2^{-d}$ times the number of points as the finer grid.
$$2N^d\left(1 + 2^{-d} + 2^{-2d} + 2^{-3d} + \dots + 2^{-Md}\right) < \frac{2N^d}{1 - 2^{-d}}$$
Total storage cost: less than 2, 4/3, 8/7 the cost of storage on the fine grid for 1, 2, and 3-d problems, respectively.
Computation Costs
• Let 1 Work Unit (WU) be the cost of one relaxation sweep on the fine grid.
• Ignore the cost of restriction and interpolation (typically about 20% of the total cost).
• Consider a V-cycle with 1 pre-coarse-grid-correction relaxation sweep and 1 post-coarse-grid-correction relaxation sweep.
• Cost of V-cycle (in WU):
$$2\left(1 + 2^{-d} + 2^{-2d} + 2^{-3d} + \dots + 2^{-Md}\right) < \frac{2}{1 - 2^{-d}}$$
• Cost is about 4, 8/3, 16/7 WU per V-cycle in 1, 2, and 3 dimensions.
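The quoted costs follow directly from the geometric series. A small sketch, with hypothetical function names, evaluates both the finite sum for a given number of coarse levels and the closed-form bound:

```python
def v_cycle_cost_wu(d, coarse_levels, sweeps_per_level=2):
    """Relaxation cost of one V-cycle in work units.

    Level j (j = 0 is the finest) costs 2**(-j*d) WU per sweep;
    transfer costs are ignored, as on the slide.
    """
    return sum(sweeps_per_level * 2.0 ** (-j * d)
               for j in range(coarse_levels + 1))

def v_cycle_cost_bound(d, sweeps_per_level=2):
    """Limit of the geometric series: sweeps / (1 - 2**(-d))."""
    return sweeps_per_level / (1.0 - 2.0 ** (-d))

bounds = [v_cycle_cost_bound(d) for d in (1, 2, 3)]   # 4, 8/3, 16/7
```

The same two functions with `sweeps_per_level` replaced by the number of stored vectors per level reproduce the storage factors 2, 4/3 and 8/7 relative to one fine-grid vector pair.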
Work needed to converge to the level of truncation
• Since $\theta$ V-cycles at convergence rate $\gamma$ are required, we see that
$$\gamma^{\theta} \sim O(N^{-p}),$$
implying that $\theta \sim O(\log N)$.
• Since one V-cycle costs O(1) WU and one WU is $O(N^d)$, we see that the cost of converging to the level of truncation using the MV method is
$$O(N^d \log N),$$
which is comparable to fast direct methods (FFT based).
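The step from the error requirement to the cycle count can be written out explicitly. The numbers in the example are chosen purely for illustration and are not taken from the tutorial.

```latex
\gamma^{\theta} \approx N^{-p}
\;\Longrightarrow\;
\theta \log \gamma \approx -p \log N
\;\Longrightarrow\;
\theta \approx \frac{p \log N}{\log(1/\gamma)} = O(\log N).
% Illustrative values: p = 2, \gamma = 0.1, N = 1024 give
% \theta \approx 2 \log_{10}(1024) \approx 6.02, i.e. about 6 to 7 V-cycles.
```

Because $\gamma$ and $p$ do not depend on $N$, the cycle count grows only logarithmically with the grid size.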
Multilevel preconditioning methods: MG
Procedure MG: $u^{(k)} \leftarrow MG\big(u^{(k)}, f^{(k)}, k, \{\nu_j^{(k)}\}_{j=1}^{k}\big)$;
if k = 0, then solve $A^{(0)} u^{(0)} = f^{(0)}$ exactly or by smoothing,
else
$u^{(k)} \leftarrow \big(S_1^{(k)}\big)^{s_1}\big(u^{(k)}, f^{(k)}\big)$, perform $s_1$ pre-smoothing steps,
Correct the residual:
$r^{(k)} = A^{(k)} u^{(k)} - f^{(k)}$; form the current residual,
$r^{(k-1)} \leftarrow R\big(r^{(k)}\big)$, restrict the residual on the next coarser grid,
$e^{(k-1)} \leftarrow MG\big(0, r^{(k-1)}, k-1, \{\nu_j^{(k-1)}\}_{j=1}^{k-1}\big)$;
$e^{(k)} \leftarrow P\big(e^{(k-1)}\big)$; prolong the error from the next coarser to the current grid,
$u^{(k)} = u^{(k)} - e^{(k)}$; update the solution,
$u^{(k)} \leftarrow \big(S_2^{(k)}\big)^{s_2}\big(u^{(k)}, f^{(k)}\big)$, perform $s_2$ post-smoothing steps.
endif
end Procedure MG
[Figure: One MG step (V-cycle), showing pre-smoothing steps, restriction, exact solving on the coarsest level, prolongation, and post-smoothing steps]
[Figure: The MG W-cycle]
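Nested iterations
Procedure NI: $u^{(\ell)} \leftarrow NI\big(u^{(0)}, \{f^{(k)}\}_{k=1}^{\ell}, \ell, \{\nu^{(k)}\}_{k=1}^{\ell}\big)$;
$u^{(0)} = (A^{(0)})^{-1} f^{(0)}$,
for k = 1 to $\ell$ do
$u^{(k)} = P\big(u^{(k-1)}\big)$;
$u^{(k)} \leftarrow MG\big(u^{(k)}, f^{(k)}, k, \{\nu_j^{(k)}\}_{j=1}^{k}\big)$;
endfor
end Procedure NI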
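A compact sketch of nested iteration wrapped around a recursive MG cycle is given below. It is not the course code: the 1D Poisson model problem, weighted Jacobi smoothing, full-weighting restriction, the grid sizes $2^{k+1}-1$, and all function names are assumptions. The sketch also uses the sign convention $r = f - Au$, $u \leftarrow u + e$, which is equivalent to the convention $r = Au - f$, $u \leftarrow u - e$ of Procedure MG.

```python
import numpy as np

def poisson_1d(n):
    """Assumed model operator: -u'' on (0, 1), n interior points."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def weighted_jacobi(A, u, f, sweeps, omega=2.0 / 3.0):
    d = np.diag(A)
    for _ in range(sweeps):
        u = u + omega * (f - A @ u) / d
    return u

def prolongation_matrix(n_coarse):
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def mg(u, f, k, nu=1, s1=1, s2=1):
    """One MG cycle on level k; nu = 1 gives a V-cycle, nu = 2 a W-cycle."""
    A = poisson_1d(len(u))
    if k == 0:
        return np.linalg.solve(A, f)         # exact solve on the coarsest level
    u = weighted_jacobi(A, u, f, s1)         # pre-smoothing
    P = prolongation_matrix((len(u) - 1) // 2)
    r = 0.5 * P.T @ (f - A @ u)              # restricted residual
    e = np.zeros(P.shape[1])
    for _ in range(nu):
        e = mg(e, r, k - 1, nu, s1, s2)      # recursive coarse-level correction
    u = u + P @ e                            # prolong and update
    return weighted_jacobi(A, u, f, s2)      # post-smoothing

def nested_iteration(f_func, ell, nu=1):
    """Solve on level 0, then prolong and run one MG cycle per finer level."""
    def grid(k):
        return np.linspace(0.0, 1.0, 2 ** (k + 1) + 1)[1:-1]
    u = np.linalg.solve(poisson_1d(1), f_func(grid(0)))
    for k in range(1, ell + 1):
        u = prolongation_matrix(len(u)) @ u
        u = mg(u, f_func(grid(k)), k, nu)
    return u

u_fmg = nested_iteration(lambda x: np.pi**2 * np.sin(np.pi * x), ell=6)
```

With `nu=1` on every level this corresponds to the full MG V-cycle variant; the example right-hand side has the exact solution $\sin(\pi x)$, so the result can be compared against it directly.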
Full MG (V-cycle)
The so-called full MG corresponds to Procedure $NI(\cdot, \cdot, \ell, 1, 1, \dots, 1)$.
[Figure: The full MG (V-cycle)]
A compact formula presenting the MG procedure in terms of a recursively defined iteration matrix:
(i) Let $M^{(0)} = 0$,
(ii) For k = 1 to $\ell$, define
$$M^{(k)} = \big(S^{(k)}\big)^{s_2}\Big(\big(A^{(k)}\big)^{-1} - P_{k-1}^{k}\big(I - (M^{(k-1)})^{\nu}\big)\big(A^{(k-1)}\big)^{-1} R_{k}^{k-1}\Big) A^{(k)} \big(S^{(k)}\big)^{s_1},$$
where $S^{(k)}$ is a smoothing iteration matrix (assuming $S_1$ and $S_2$ are the same), $R_k^{k-1}$ and $P_{k-1}^{k}$ are matrices which transfer data between two consecutive grids and correspond to the restriction and prolongation operators R and P, respectively, and $\nu = 1$ and $\nu = 2$ correspond to the V- and W-cycles.
It turns out that in many cases the spectral radius of $M^{(\ell)}$, $\rho\big(M^{(\ell)}\big)$, is independent of $\ell$, thus the rate of convergence of the NI method is optimal. Also, a mechanism to make the spectral radius of $M^{(\ell)}$ smaller is to choose $s_1$ and $s_2$ larger. The price for the latter is, clearly, a higher computational cost.
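The recursion for the iteration matrix can be evaluated numerically on a small model problem to observe how the spectral radius behaves as levels are added. The sketch below assumes a 1D Poisson operator, weighted Jacobi smoothing with weight 2/3, linear interpolation and full weighting; none of these choices, nor the function names, come from the slides.

```python
import numpy as np

def poisson_1d(n):
    """Assumed model operator: -u'' on (0, 1), n interior points."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def prolongation_matrix(n_coarse):
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def mg_iteration_matrix(ell, nu=1, s1=1, s2=1, omega=2.0 / 3.0):
    """Build M^(ell) by the recursion, starting from M^(0) = 0."""
    M = np.zeros((1, 1))                     # level 0: one unknown, exact solve
    for k in range(1, ell + 1):
        n = 2 ** (k + 1) - 1                 # assumed size of level k
        n_coarse = (n - 1) // 2
        A, A_c = poisson_1d(n), poisson_1d(n_coarse)
        P = prolongation_matrix(n_coarse)
        R = 0.5 * P.T
        S = np.eye(n) - omega * A / np.diag(A)[:, None]   # I - omega D^{-1} A
        inner = np.eye(n_coarse) - np.linalg.matrix_power(M, nu)
        coarse = P @ inner @ np.linalg.solve(A_c, R)
        M = (np.linalg.matrix_power(S, s2)
             @ (np.linalg.inv(A) - coarse) @ A
             @ np.linalg.matrix_power(S, s1))
    return M

def spectral_radius(M):
    return np.max(np.abs(np.linalg.eigvals(M)))

radii = [spectral_radius(mg_iteration_matrix(ell)) for ell in range(1, 7)]
```

Printing `radii` for the V-cycle (`nu=1`) and for larger `s1`, `s2` gives a quick empirical check of the two statements above for this particular model problem.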
Rate of convergence
$$\rho(M_\ell) \le 1 - C\ell^{-2}$$
$$\rho(M_\ell) \le 1 - O\big(\ell^{-\frac{1-\alpha}{\alpha}}\big)$$
$$\rho(M_\ell) \le 1 - O\big(\ell^{-\frac{(1-\alpha)^2}{\alpha}}\big)$$
$$u \in H^{1+\alpha}(\Omega), \quad 0 \le \alpha < 1.$$
The larger $\alpha$, the better the convergence, but the stronger the requirements with respect to the regularity of the solution.
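Multilevel preconditioning methods: AMLI
The Algebraic Multilevel Iteration (AMLI) methods are the first regularity-free multilevel methods.
They were derived in a series of papers by Owe Axelsson and Panayot Vassilevski in 1989-1991.
Sequence of matrices $\{A^{(k)}\}_{k=k_0}^{\ell}$ on nested index sets
$$N_{k_0} \subset N_{k_0+1} \subset \dots \subset N_\ell,$$
$$A^{(k)} = \begin{bmatrix} A_{11}^{(k)} & A_{12}^{(k)} \\ A_{21}^{(k)} & A_{22}^{(k)} \end{bmatrix} \begin{matrix} \}\; N_k \setminus N_{k-1} \\ \}\; N_{k-1} \end{matrix}. \qquad (1)$$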
$$A^{(k)} = A_{22}^{(k+1)} - A_{21}^{(k+1)} B_{11}^{(k+1)} A_{12}^{(k+1)}, \qquad (2)$$
where $B_{11}^{(k+1)}$ is some sparse, symmetric, positive definite and nonnegative approximation of $\big(A_{11}^{(k+1)}\big)^{-1}$.
How to split $N_{k+1}$ into two parts: the order $n_k$ of the matrices $A^{(k)}$ should decrease geometrically:
$$\frac{n_{k+1}}{n_k} = \rho_k \ge \rho > 1.$$
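$M^{(k_0)} = A^{(k_0)}$,
for $k = k_0, k_0 + 1, \dots, \ell - 1$
$$M^{(k+1)} = \begin{bmatrix} A_{11}^{(k+1)} & 0 \\ A_{21}^{(k+1)} & \tilde{S}^{(k)} \end{bmatrix} \begin{bmatrix} I_1^{(k+1)} & \big(A_{11}^{(k+1)}\big)^{-1} A_{12}^{(k+1)} \\ 0 & I_2^{(k+1)} \end{bmatrix}, \qquad (3)$$
endfor
where $\tilde{S}^{(k)}$ can be, for instance, of the following form:
$$\tilde{S}^{(k)} = A^{(k)}\Big[I - P_\nu\big((M^{(k)})^{-1} A^{(k)}\big)\Big]^{-1}. \qquad (4)$$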
In $\tilde{S}^{(k)} = A^{(k)}\big[I - P_\nu\big((M^{(k)})^{-1} A^{(k)}\big)\big]^{-1}$, $P_\nu(t)$ denotes a polynomial of degree $\nu$ which satisfies the conditions
$$0 \le P_\nu(t) < 1 \ \text{ for } 0 < t \le 1, \quad \text{and} \quad P_\nu(0) = 1. \qquad (5)$$
The fact that $P_\nu$ is normalized at the origin is important because then the expression for $\tilde{S}^{(k)}$ does not require actions of $\big(A^{(k)}\big)^{-1}$.
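Forward sweep:
Solve
$$\begin{bmatrix} A_{11}^{(k+1)} & 0 \\ A_{21}^{(k+1)} & \tilde{S}^{(k)} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix},$$
i.e.
(F1) $w_1 = \big(A_{11}^{(k+1)}\big)^{-1} y_1$,
(F2) $w_2 = \big(\tilde{S}^{(k)}\big)^{-1}\big(y_2 - A_{21}^{(k+1)} w_1\big)$.
Backward sweep:
Solve
$$\begin{bmatrix} I_1^{(k+1)} & \big(A_{11}^{(k+1)}\big)^{-1} A_{12}^{(k+1)} \\ 0 & I_2^{(k+1)} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix},$$
i.e.
(B1) $x_2 = w_2$,
(B2) $x_1 = w_1 - \big(A_{11}^{(k+1)}\big)^{-1} A_{12}^{(k+1)} x_2$.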
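The two sweeps can be illustrated on a single two-by-two block system. The sketch below is a two-level toy example under stated assumptions: dense NumPy matrices, a randomly generated symmetric positive definite test matrix, and the exact Schur complement standing in for $\tilde{S}^{(k)}$, in which case the factorization reproduces the matrix itself. The function name `block_solve` is hypothetical.

```python
import numpy as np

def block_solve(A11, A12, A21, S, y1, y2):
    """Apply the inverse of [[A11, 0], [A21, S]] [[I, A11^{-1} A12], [0, I]]."""
    # Forward sweep
    w1 = np.linalg.solve(A11, y1)                 # (F1)
    w2 = np.linalg.solve(S, y2 - A21 @ w1)        # (F2)
    # Backward sweep
    x2 = w2                                       # (B1)
    x1 = w1 - np.linalg.solve(A11, A12 @ x2)      # (B2)
    return x1, x2

# Illustrative test matrix (assumption): random SPD matrix split 4 + 2.
rng = np.random.default_rng(0)
G = rng.standard_normal((6, 6))
A = G @ G.T + 6.0 * np.eye(6)
A11, A12, A21, A22 = A[:4, :4], A[:4, 4:], A[4:, :4], A[4:, 4:]

# With the exact Schur complement the block factorization equals A.
S_exact = A22 - A21 @ np.linalg.solve(A11, A12)
y = rng.standard_normal(6)
x1, x2 = block_solve(A11, A12, A21, S_exact, y[:4], y[4:])
x = np.concatenate([x1, x2])
residual_norm = np.linalg.norm(A @ x - y)
```

Replacing `S_exact` by an approximation turns the same routine into the action of a preconditioner rather than an exact solver.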
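Since $P_\nu(t)$ is of the form $P_\nu(t) = 1 - a_1 t - \dots - a_\nu t^\nu$, we observe that
$$v = \big(\tilde{S}^{(k)}\big)^{-1} z = \Big[I - P_\nu\big((M^{(k)})^{-1} A^{(k)}\big)\Big]\big(A^{(k)}\big)^{-1} z = \Big[a_1 I + a_2 (M^{(k)})^{-1} A^{(k)} + \dots + a_\nu \big((M^{(k)})^{-1} A^{(k)}\big)^{\nu-1}\Big](M^{(k)})^{-1} z.$$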
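The last expression can be evaluated with a Horner-type loop that needs only solves with $M^{(k)}$ and products with $A^{(k)}$. The sketch below checks the identity numerically; the diagonal stand-in for $M$, the test polynomial $P_2(t) = (1-t)^2$ (so $a_1 = 2$, $a_2 = -1$) and the function name are assumptions made for the example.

```python
import numpy as np

def apply_s_tilde_inverse(A, M_solve, coeffs, z):
    """Return [a_1 I + a_2 X + ... + a_nu X^(nu-1)] M^{-1} z with X = M^{-1} A.

    coeffs = [a_1, ..., a_nu]; M_solve(r) returns M^{-1} r.
    Evaluated by Horner's rule, so no inverse of A is ever applied.
    """
    y = M_solve(z)
    v = coeffs[-1] * y
    for a in reversed(coeffs[:-1]):
        v = a * y + M_solve(A @ v)
    return v

# Illustrative data (assumption): random SPD A, diagonal M.
rng = np.random.default_rng(0)
G = rng.standard_normal((5, 5))
A = G @ G.T + 5.0 * np.eye(5)
d = 2.0 * np.diag(A)
z = rng.standard_normal(5)

v = apply_s_tilde_inverse(A, lambda r: r / d, [2.0, -1.0], z)

# Reference: [I - P_2(X)] A^{-1} z, formed directly with the inverse of A.
X = A / d[:, None]
I = np.eye(5)
reference = (I - (I - X) @ (I - X)) @ np.linalg.solve(A, z)
```

The agreement of `v` and `reference` reflects that $I - P_\nu(X)$ always contains a factor $X = M^{-1}A$, which cancels the inverse of $A$.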
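$$M^{(k+1)} = \begin{bmatrix} B_{11}^{(k+1)} & 0 \\ \tilde{A}_{21}^{(k+1)} & \tilde{S}^{(k)} \end{bmatrix} \begin{bmatrix} I_1^{(k+1)} & \big(B_{11}^{(k+1)}\big)^{-1} \tilde{A}_{12}^{(k+1)} \\ 0 & I_2^{(k+1)} \end{bmatrix} \qquad (6)$$
with $\tilde{S}^{(k)}$ defined as $\tilde{S}^{(k)} = A^{(k)}\big[I - P_\nu\big((M^{(k)})^{-1} A^{(k)}\big)\big]^{-1}$.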
$$M^{(k+1)} = \begin{bmatrix} \big(B_{11}^{(k+1)}\big)^{-1} & 0 \\ A_{21}^{(k+1)} & I_2^{(k+1)} \end{bmatrix} \begin{bmatrix} I_1^{(k+1)} & B_{11}^{(k+1)} A_{12}^{(k+1)} \\ 0 & \tilde{S}^{(k)} \end{bmatrix}, \qquad (7)$$
where $\tilde{S}^{(k)} = A^{(k)}\big[I - P_\nu\big((M^{(k)})^{-1} A^{(k)}\big)\big]^{-1}$. This time it is not $A_{11}^{(k+1)}$ but its inverse which is approximated by some other matrix $B_{11}^{(k+1)}$. Thus, instead of solving systems with $A_{11}^{(k+1)}$ or with some approximation of it, we need only to do matrix multiplications with $B_{11}^{(k+1)}$. The matrix $B_{11}^{(k+1)}$ is constructed so that it meets certain requirements: it has a prescribed sparsity pattern, is nonnegative when $A_{11}^{(k+1)}$ is monotone, and is a sufficiently good approximation of $\big(A_{11}^{(k+1)}\big)^{-1}$. In the case of discontinuous coefficients, a simple diagonal approximation of $\big(A_{11}^{(k+1)}\big)^{-1}$ suffices.
The AMLI algorithm
Procedure AMLI: $u^{(k)} \leftarrow AMLI\big(f^{(k)}, k, \nu_k, \{a_j^{(k)}\}_{j=0}^{\nu_k}\big)$;
$[f_1^{(k)}, f_2^{(k)}] \leftarrow f^{(k)}$,
$w_1^{(k)} = B_{11}^{(k)} f_1^{(k)}$,
$w_2^{(k)} = f_2^{(k)} - A_{21}^{(k)} w_1^{(k)}$,
$k = k - 1$,
if k = 0 then $u_2^{(0)} = \big(A^{(0)}\big)^{-1} w_2^{(1)}$, solve on the coarsest level exactly;
else
$u_2^{(k)} \leftarrow AMLI\big(a_{\nu_k}^{(k)} w_2^{(k)}, k, \nu_k, \{a_j^{(k)}\}_{j=0}^{\nu_k}\big)$;
for j = 1 to $\nu_k - 1$:
$u_2^{(k)} \leftarrow AMLI\big(A^{(k)} u_2^{(k)} + a_{\nu_k - j}^{(k)} w_2^{(k)}, k, \nu_k, \{a_j^{(k)}\}_{j=0}^{\nu_k}\big)$;
endfor
endif
$k = k + 1$,
$u_1^{(k)} = w_1^{(k)} - B_{11}^{(k)} A_{12}^{(k)} u_2^{(k)}$,
$u^{(k)} \leftarrow [u_1^{(k)}, u_2^{(k)}]$
end Procedure AMLI
[Figure: One AMLI step (V-cycle), showing multiplication with $A_{21}^{(k)}$, $B_{11}^{(k)}$, solution on the coarsest level, multiplication with $A_{12}^{(k)}$, $B_{11}^{(k)}$, and multiplication with $A^{(k)}$]
[Figure: $\nu$-fold W-cycle, [1, 1, 3, 1], over levels 0 to 5, with $\nu = 1$, $\nu = 3$, $\nu = 1$, $\nu = 1$]
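Computational complexity
Recall: $\dfrac{n_{k+1}}{n_k} \ge \rho > 1$
$$W_\ell = C\,(n_\ell + n_{\ell-1} + \dots + n_{\ell-\mu}) + C\nu\,(n_{\ell-\mu-1} + n_{\ell-\mu-2} + \dots + n_{\ell-2\mu-1}) + C\nu^2(\cdots) + \dots \le C\, n_\ell \Big(1 + \frac{1}{\rho} + \dots + \frac{1}{\rho^{\mu}}\Big) \frac{1}{1 - \nu\rho^{-(\mu+1)}}$$
If we impose the condition
$$\nu < \rho^{\mu+1},$$
then the work per iteration is proportional to $n_\ell$, with a constant that is bounded independently of $\ell$.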
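The bound and the condition on $\nu$ can be explored numerically. The sketch below assumes an idealized hierarchy in which every level is exactly $\rho$ times smaller than the next finer one; the function names and the sample values are illustrative only.

```python
def amli_work_bound(n_fine, rho, mu, nu, C=1.0):
    """Closed-form bound C n (1 + 1/rho + ... + 1/rho^mu) / (1 - nu rho^-(mu+1))."""
    if nu >= rho ** (mu + 1):
        raise ValueError("bound requires nu < rho**(mu + 1)")
    geometric = sum(rho ** (-i) for i in range(mu + 1))
    return C * n_fine * geometric / (1.0 - nu * rho ** (-(mu + 1)))

def amli_work(n_fine, rho, mu, nu, coarse_levels, C=1.0):
    """Direct sum of the work, assuming n_k = n_fine * rho**(-(ell - k)) exactly.

    Levels are grouped in blocks of mu + 1; the j-th block (j = 0 is the
    finest) is visited nu**j times per iteration.
    """
    total = 0.0
    for depth in range(coarse_levels + 1):
        visits = nu ** (depth // (mu + 1))
        total += C * visits * n_fine * rho ** (-depth)
    return total

bound = amli_work_bound(1.0e6, rho=4, mu=0, nu=3)       # 2D-like coarsening
work = amli_work(1.0e6, rho=4, mu=0, nu=3, coarse_levels=10)
```

For $\rho = 4$, $\mu = 0$ the condition reads $\nu < 4$, and with $\nu = 3$ the bound evaluates to four times the cost on the finest level.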
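Rate of convergence
We want:
$$\kappa_\ell = \kappa\big((M^{(\ell)})^{-1} A^{(\ell)}\big) = O(1), \quad \ell \to \infty.$$
It involves $\kappa_k = \kappa\big((M^{(k)})^{-1} A^{(k)}\big)$, and, in turn, the need to estimate the extreme eigenvalues on the intermediate levels.
It turns out that the condition for optimal $\kappa_\ell$ can be formulated as
$$f\big(\lambda((M^{(k)})^{-1} A^{(k)})\big) \le \nu,$$
which gives a lower bound for the degrees of the polynomials to be used. Thus, we obtain
$$f\big(\lambda((M^{(k)})^{-1} A^{(k)})\big) \le \nu < \rho^{\mu+1},$$
where the lower bound secures the optimal convergence rate and the upper bound the optimal computational complexity.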
How does the whole thing actually work?