Parallel Numerics, WT 2016/2017

5 Iterative Methods for Sparse Linear Systems of Equations
Contents

1 Introduction
  1.1 Computer Science Aspects
  1.2 Numerical Problems
  1.3 Graphs
  1.4 Loop Manipulations
2 Elementary Linear Algebra Problems
  2.1 BLAS: Basic Linear Algebra Subroutines
  2.2 Matrix-Vector Operations
  2.3 Matrix-Matrix Product
3 Linear Systems of Equations with Dense Matrices
  3.1 Gaussian Elimination
  3.2 Parallelization
  3.3 QR-Decomposition with Householder Matrices
4 Sparse Matrices
  4.1 General Properties, Storage
  4.2 Sparse Matrices and Graphs
  4.3 Reordering
  4.4 Gaussian Elimination for Sparse Matrices
5 Iterative Methods for Sparse Linear Systems of Equations
  5.1 Stationary Methods
  5.2 Nonstationary Methods
  5.3 Preconditioning
6 Domain Decomposition
  6.1 Overlapping Domain Decomposition
  6.2 Non-overlapping Domain Decomposition
  6.3 Schur Complements
• Disadvantages of direct methods (in parallel):
  – strongly sequential
  – may lead to dense matrices: the sparsity pattern changes, additional fill-in entries become necessary
  – indirect addressing
  – storage
  – computational effort
• Iterative solver:
  – choose an initial guess (starting vector) x^(0), e.g., x^(0) = 0
  – iteration function x^(k+1) := Φ(x^(k))
• Applied to solving a linear system:
  – the main part of Φ should be a matrix-vector multiplication Ax (matrix-free!?)
  – easy to parallelize, no change in the pattern of A
  – x^(k) → x* = A^(-1) b for k → ∞
  – main problem: fast convergence!
5.1. Stationary Methods

5.1.1. Richardson Iteration
• Construct an iteration process from Ax = b via an (artificial) splitting of A:

  b = Ax = (A − I + I) x = x − (I − A) x  ⇒  x = b + (I − A) x = b + N x,  with N := I − A

• This yields a fixed-point equation x = Φ(x) with Φ(x) := b + N x:

  start: x^(0);
  x^(k+1) := Φ(x^(k)) = b + N x^(k) = b + (I − A) x^(k)
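A minimal MATLAB sketch of this iteration (the function name, iteration cap, and stopping tolerance are illustrative assumptions, not course code):

function x = richardson(A, b, maxit)
  % Richardson iteration x^(k+1) = b + (I - A)*x^(k) = x^(k) + r^(k)
  x = zeros(size(b));               % starting vector x^(0) = 0
  for k = 1:maxit
    r = b - A*x;                    % residual r^(k)
    if norm(r) < 1e-10, break; end  % tolerance: an assumption
    x = x + r;
  end
end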
Richardson Iteration (cont.)

start: x^(0);
x^(k+1) := Φ(x^(k)) = b + N x^(k) = b + (I − A) x^(k)

If the sequence x^(k) converges, x^(k) → x̄, then

x̄ = Φ(x̄) = b + N x̄ = b + (I − A) x̄  ⇒  A x̄ = b,

and therefore

x^(k) → x̄ = x* := A^(-1) b.

Residual-based formulation:

x^(k+1) = Φ(x^(k)) = b + (I − A) x^(k) = b + x^(k) − A x^(k)
        = x^(k) + (b − A x^(k)) = x^(k) + r(x^(k)),

where r(x) := b − A x denotes the residual.
Convergence Analysis via Neumann Series

x^(k) = b + N x^(k-1) = b + N (b + N x^(k-2)) = b + N b + N² x^(k-2) = ...
      = b + N b + N² b + ... + N^(k-1) b + N^k x^(0)
      = ( Σ_{j=0}^{k-1} N^j ) b + N^k x^(0)

Special case x^(0) = 0:

x^(k) = ( Σ_{j=0}^{k-1} N^j ) b

⇒ x^(k) ∈ span{b, N b, N² b, ..., N^(k-1) b} = span{b, A b, A² b, ..., A^(k-1) b} = K_k(A, b),

which is called the Krylov space of A and b.

For ‖N‖ < 1 it holds that

Σ_{j=0}^{k-1} N^j → Σ_{j=0}^{∞} N^j = (I − N)^(-1) = (I − (I − A))^(-1) = A^(-1).
Convergence Analysis via Neumann Series (cont.)

x^(k) → ( Σ_{j=0}^{∞} N^j ) b = (I − N)^(-1) b = A^(-1) b = x*

The Richardson iteration therefore converges for ‖N‖ < 1, i.e., for A ≈ I.

Error analysis for e^(k) := x^(k) − x*:

e^(k+1) = x^(k+1) − x* = Φ(x^(k)) − Φ(x*) = (b + N x^(k)) − (b + N x*) = N (x^(k) − x*) = N e^(k)

‖e^(k)‖ ≤ ‖N‖ ‖e^(k-1)‖ ≤ ‖N‖² ‖e^(k-2)‖ ≤ ... ≤ ‖N‖^k ‖e^(0)‖

‖N‖ < 1 ⇒ ‖N‖^k → 0 ⇒ ‖e^(k)‖ → 0 for k → ∞

• Convergence holds exactly if ρ(N) = ρ(I − A) < 1, where ρ is the spectral radius: ρ(N) = |λ_max| = max_i |λ_i| (λ_i the eigenvalues of N).
• Hence all eigenvalues of A have to lie in the circle around 1 with radius 1.
Splittings of A

• The Richardson iteration converges only in very special cases! Try to improve the iteration for better convergence.
• Write A in the form A := M − N:

  b = Ax = (M − N) x = M x − N x  ⇔  x = M^(-1) b + M^(-1) N x = Φ(x)

  Φ(x) = M^(-1) b + M^(-1) N x = M^(-1) b + M^(-1) (M − A) x = M^(-1) (b − A x) + x = x + M^(-1) r(x)

• N should be such that N y can be evaluated efficiently.
• M should be such that M^(-1) y can be evaluated efficiently.

  x^(k+1) = x^(k) + M^(-1) r^(k)

• The iteration with splitting M − N is equivalent to the Richardson iteration applied to the preconditioned system M^(-1) A x = M^(-1) b; see the sketch below.
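A minimal MATLAB sketch of this splitting iteration, with the action of M^(-1) passed in as a function handle (all names and the tolerance are our assumptions):

function x = splitting_iter(A, b, Msolve, maxit)
  % stationary iteration x^(k+1) = x^(k) + M\r^(k);
  % Msolve(r) must return M\r for the chosen splitting A = M - N
  x = zeros(size(b));
  for k = 1:maxit
    r = b - A*x;                    % residual r^(k)
    if norm(r) < 1e-10, break; end  % tolerance: an assumption
    x = x + Msolve(r);
  end
end

Richardson is recovered with Msolve = @(r) r, i.e., M = I.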
Convergence

• The iteration with splitting A = M − N converges if

  ρ(M^(-1) N) = ρ(I − M^(-1) A) < 1.

• For fast convergence it should hold that
  – M^(-1) A ≈ I,
  – M^(-1) A is better conditioned than A itself.
• Such a matrix M is called a preconditioner for A. It is also used in other iterative methods to accelerate convergence.
• Condition number:

  κ(A) = ‖A^(-1)‖ ‖A‖,  |λ_max / λ_min|,  or  σ_max / σ_min
5.1.2. Jacobi (Diagonal) Splitting

Choose A = M − N = D − (L + U) with
  D = diag(A),
  −L the strictly lower triangular part of A, and
  −U the strictly upper triangular part of A.

x^(k+1) = D^(-1) b + D^(-1) (L + U) x^(k)
        = D^(-1) b + D^(-1) (D − A) x^(k) = x^(k) + D^(-1) r^(k)

Convergent for A ≈ diag(A), e.g., for diagonally dominant matrices:

ρ(D^(-1) N) = ρ(I − D^(-1) A) < 1
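In the splitting_iter sketch above, Jacobi is the special case M = D:

x = splitting_iter(A, b, @(r) r ./ diag(A), 100);  % M = D = diag(A)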
Jacobi (Diagonal) Splitting (cont.)

The iteration process written elementwise:

x^(k+1) = D^(-1) (b − (A − D) x^(k))  ⇒  x_j^(k+1) = (1 / a_jj) ( b_j − Σ_{m=1, m≠j}^{n} a_jm x_m^(k) )

a_jj x_j^(k+1) = b_j − Σ_{m=1}^{j-1} a_jm x_m^(k) − Σ_{m=j+1}^{n} a_jm x_m^(k)

• Damping or relaxation for improving convergence.
• Idea: view the iterative method as a correction of the last iterate in a search direction.
• Introduce a step length for this correction step,

  x^(k+1) = x^(k) + D^(-1) r^(k)  →  x^(k+1) = x^(k) + ω D^(-1) r^(k),

  with an additional damping parameter ω.
• The damped Jacobi iterate is a weighted average of the undamped Jacobi iterate x^(k+1) and the previous iterate:

  x_damped^(k+1) = x^(k) + ω D^(-1) r^(k) = ω x^(k+1) + (1 − ω) x^(k)
Damped Jacobi Iteration

x^(k+1) = x^(k) + ω D^(-1) r^(k) = x^(k) + ω D^(-1) (b − A x^(k)) = ...
        = ω D^(-1) b + [ (1 − ω) I + ω D^(-1) (L + U) ] x^(k)

is convergent for

ρ( (1 − ω) I + ω D^(-1) (L + U) ) < 1;

note that the iteration matrix tends to I for ω → 0.

Look for the optimal ω with the best convergence (an additional degree of freedom); see the one-line sketch below.
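With the splitting_iter sketch above, damping only rescales the correction (the value ω = 2/3, a common smoothing choice, is our assumption, not from the slide):

omega = 2/3;
x = splitting_iter(A, b, @(r) omega * (r ./ diag(A)), 100);  % x^(k+1) = x^(k) + omega*D\r^(k)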
Parallelism in the Jacobi Iteration

• The Jacobi method is easy to parallelize: it needs only A x and D^(-1) x.
• But convergence is often too slow!
• Improvement: block Jacobi iteration.
5.1.3. Gauss-Seidel Iteration

Always use the newest information available!

Jacobi iteration (both sums use the already computed x^(k)):

a_jj x_j^(k+1) = b_j − Σ_{m=1}^{j-1} a_jm x_m^(k) − Σ_{m=j+1}^{n} a_jm x_m^(k)

Gauss-Seidel iteration (the first sum uses the entries x_m^(k+1) already computed in the current sweep):

a_jj x_j^(k+1) = b_j − Σ_{m=1}^{j-1} a_jm x_m^(k+1) − Σ_{m=j+1}^{n} a_jm x_m^(k)

A sweep written in MATLAB follows below.
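One Gauss-Seidel sweep as a MATLAB sketch, working in place: x(1:j-1) already holds the new values x^(k+1) while x(j+1:n) still holds x^(k):

for j = 1:n
  % row j: split into already-updated and not-yet-updated parts
  s = A(j,1:j-1) * x(1:j-1) + A(j,j+1:n) * x(j+1:n);
  x(j) = (b(j) - s) / A(j,j);
end

(x is a column vector; the empty products for j = 1 and j = n evaluate to 0 in MATLAB.)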
Gauss-Seidel Iteration (cont.)

• Compare the dependency graphs for general iterative algorithms. Jacobi corresponds to the fixed-point form

  x = f(x) = D^(-1) (b + (D − A) x) = D^(-1) (b + (L + U) x),

  Gauss-Seidel to the splitting A = (D − L) − U = M − N:

  x^(k+1) = (D − L)^(-1) b + (D − L)^(-1) U x^(k)
          = (D − L)^(-1) b + (D − L)^(-1) (D − L − A) x^(k)
          = x^(k) + (D − L)^(-1) r^(k)

• Convergence depends on the spectral radius: ρ(I − (D − L)^(-1) A) < 1.
Parallelism in the Gauss-Seidel Iteration

• A linear system with D − L is easy to solve because D − L is lower triangular, but
• the forward substitution is strongly sequential!
• Use red-black ordering or graph colouring as a compromise: parallelism ↔ convergence.
Successive Over-Relaxation (SOR)

• Damping or relaxation of the Gauss-Seidel step:

  x^(k+1) = x^(k) + ω (D − L)^(-1) r^(k) = ω (D − L)^(-1) b + [ (1 − ω) I + ω (D − L)^(-1) U ] x^(k)

• Convergence depends on the spectral radius of the iteration matrix (1 − ω) I + ω (D − L)^(-1) U.
• Parallelization of SOR = parallelization of Gauss-Seidel; a one-line sketch follows.
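As a MATLAB one-liner: under the sign convention above, tril(A) is exactly D − L, so the relaxed step of this slide reads

x = x + omega * (tril(A) \ (b - A*x));  % x^(k+1) = x^(k) + omega*(D-L)\r^(k)

(omega = 1 recovers Gauss-Seidel; the classical SOR splitting uses (D − ωL) instead, which coincides with this step only for ω = 1.)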
Stationary Methods (in General)

• Can always be written in the two normal forms

  x^(k+1) = c + B x^(k),

  with convergence governed by the spectral radius ρ(B) of the iteration matrix B, and

  x^(k+1) = x^(k) + F r^(k),

  with preconditioner F and B = I − F A.

• For x^(0) = 0:

  x^(k) ∈ K_k(B, c),

  the Krylov space with respect to the matrix B and the vector c.

• Slow convergence (but good smoothing properties! → multigrid)
MATLAB Example

• clear; n=100; k=10; omega=1; stationary
• For tridiag(−0.5, 1, −0.5):
  – Jacobi norm |cos(π/n)|
  – Gauss-Seidel norm cos(π/n)²
  – both < 1 → convergence, but slow
• To improve convergence → nonstationary methods (or multigrid); a small check follows below.
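A sketch checking the quoted factors (gallery('tridiag') is standard MATLAB; the course script stationary itself is not reproduced here):

n = 100;
A = full(gallery('tridiag', n, -0.5, 1, -0.5));
D = diag(diag(A));
rhoJ  = max(abs(eig(eye(n) - D \ A)));        % spectral radius, Jacobi
rhoGS = max(abs(eig(eye(n) - tril(A) \ A)));  % spectral radius, Gauss-Seidel
disp([rhoJ, abs(cos(pi/n)); rhoGS, cos(pi/n)^2])  % computed vs. quoted values

The computed radii match the quoted values up to the distinction between n and n + 1 in the exact eigenvalue formula cos(kπ/(n+1)).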
Chair of Informatics V (SCCS)
Efficient Numerical Algorithms, Parallel & HPC

• High-dimensional numerics (sparse grids)
• Fast iterative solvers (multi-level methods, preconditioners, eigenvalue solvers)
• Uncertainty Quantification
• Space-filling curves
• Numerical linear algebra
• Numerical algorithms for image processing
• HW-aware numerical programming

Fields of application in simulation:

• CFD (incl. fluid-structure interaction)
• Plasma physics
• Molecular dynamics
• Quantum chemistry

Further info → www5.in.tum.de
Feel free to come around and ask for thesis topics!
5.2. Nonstationary Methods

5.2.1. Gradient Method

• Consider A = A^T > 0 (A symmetric positive definite) and the function

  Φ(x) = (1/2) x^T A x − b^T x.

• Its graph is an n-dimensional paraboloid, Φ: R^n → R.
• Gradient: ∇Φ(x) = A x − b.
• The position where ∇Φ(x) = 0 is exactly the minimum of the paraboloid.
• Instead of solving A x = b, consider min_x Φ(x).
• Local descent direction at x: over all directions y, the slope ∇Φ(x)·y is minimal for y = −∇Φ(x).
Gradient Method (cont.)

• Optimization: start with x^(0) and iterate

  x^(k+1) := x^(k) + α_k d^(k),

  with search direction d^(k) and step size α_k.
• In view of the previous results, the locally optimal search direction is

  d^(k) := −∇Φ(x^(k)) = b − A x^(k).

• To determine α_k, minimize the one-dimensional function g(α) := Φ(x^(k) + α d^(k)):

  min_α g(α) = min_α ( (1/2) (x^(k) + α d^(k))^T A (x^(k) + α d^(k)) − b^T (x^(k) + α d^(k)) )
             = min_α ( (1/2) α² d^(k)T A d^(k) − α d^(k)T d^(k) + (1/2) x^(k)T A x^(k) − x^(k)T b )

  ⇒ α_k = (d^(k)T d^(k)) / (d^(k)T A d^(k)).

A MATLAB sketch follows below.
Gradient Method (cont. 2)

x^(k+1) = x^(k) + ( ‖b − A x^(k)‖₂² / ( (b − A x^(k))^T A (b − A x^(k)) ) ) (b − A x^(k))

• Method of steepest descent.
• Disadvantage: distorted contour lines for ill-conditioned A.
• Slow convergence (zig-zag path).
• The local descent direction is not globally optimal.
Analysis of the Gradient Method

• Definition of the A-norm: ‖x‖_A := √(x^T A x).
• Consider the error with respect to the solution x* = A^(-1) b:

  ‖x − x*‖_A² = ‖x − A^(-1) b‖_A² = (x^T − b^T A^(-1)) A (x − A^(-1) b)
              = x^T A x − 2 b^T x + b^T A^(-1) b = 2 Φ(x) + b^T A^(-1) b

• Therefore, minimizing Φ is equivalent to minimizing the error in the A-norm! A more detailed analysis reveals:

  ‖x^(k+1) − x*‖_A² ≤ ( 1 − 1/κ(A) ) · ‖x^(k) − x*‖_A²

• Therefore, for κ(A) ≫ 1: very slow convergence!
5.2.2. The Conjugate Gradient Method

• Improve the descent direction so that it becomes globally optimal.
• x^(k+1) := x^(k) + α_k p^(k), where the search direction is not the negative gradient, but the projection of the gradient that is A-conjugate to all previous search directions:

  p^(k) ⊥ A p^(j) for all j < k,  i.e.,  p^(k) ⊥_A p^(j),  i.e.,  p^(k)T A p^(j) = 0 for j < k.

• We choose the new search direction as the component of the last residual that is A-conjugate to all previous search directions.
• α_k is again determined by one-dimensional minimization as before (for the chosen p^(k)).
The Conjugate Gradient Algorithm

p^(0) = r^(0) = b − A x^(0)
for k = 0, 1, 2, ... do
  α^(k) = −⟨r^(k), r^(k)⟩ / ⟨p^(k), A p^(k)⟩
  x^(k+1) = x^(k) − α^(k) p^(k)
  r^(k+1) = r^(k) + α^(k) A p^(k)
  if ‖r^(k+1)‖₂² ≤ ε then break
  β^(k) = ⟨r^(k+1), r^(k+1)⟩ / ⟨r^(k), r^(k)⟩
  p^(k+1) = r^(k+1) + β^(k) p^(k)
end for
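The same algorithm as runnable MATLAB (with the sign of α flipped so that it is positive, which is equivalent; the tolerance argument is an assumption):

function x = cg(A, b, x, epsilon)
  % conjugate gradient method for SPD A
  r = b - A*x;  p = r;  rho = r'*r;
  for k = 1:numel(b)                 % at most n steps in exact arithmetic
    Ap = A*p;
    alpha = rho / (p'*Ap);           % = -alpha^(k) of the slide
    x = x + alpha*p;
    r = r - alpha*Ap;                % residual update
    rho_new = r'*r;
    if rho_new <= epsilon, break; end
    p = r + (rho_new/rho)*p;         % beta^(k) = rho_new/rho
    rho = rho_new;
  end
end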
Properties of Conjugate Gradients

• It holds that

  p^(j)T A p^(k) = 0 = r^(j)T r^(k) for j ≠ k

• and

  span(p^(0), ..., p^(j-1)) = span(r^(0), ..., r^(j-1)) = span(r^(0), A r^(0), ..., A^(j-1) r^(0)) = K_j(A, r^(0)).

• In particular, for x^(0) = 0 it holds that

  K_j(A, r^(0)) = span(b, A b, ..., A^(j-1) b).

• x^(k) is the best approximate solution to A x = b in the subspace K_k(A, r^(0)). For x^(0) = 0: x^(k) ∈ K_k(A, b).
• Error:

  ‖x^(k) − x*‖_A = min_{x ∈ K_k(A,b)} ‖x − x*‖_A

• The cheap one-dimensional minimization gives the optimal k-dimensional solution for free!
Properties of Conjugate Gradients (cont.)

• Consequence: after n steps, K_n(A, b) = R^n, and therefore x^(n) = A^(-1) b is the exact solution in exact arithmetic.
• Unfortunately, this is not true in floating-point arithmetic.
• Furthermore, O(n) iteration steps would be too costly: cost = #iterations × cost(matrix-vector product).
• The matrix-vector product is easy to parallelize.
• But how can we get fast convergence and reduce #iterations?
Error Estimation (x^(0) = 0)

‖e^(k)‖_A = ‖x^(k) − x*‖_A = min_{x ∈ K_k(A,b)} ‖x − x*‖_A
          = min_{α_0,...,α_{k-1}} ‖ Σ_{j=0}^{k-1} α_j A^j b − x* ‖_A
          = min_{P^(k-1)} ‖ P^(k-1)(A) b − x* ‖_A
          = min_{P^(k-1)} ‖ P^(k-1)(A) A x* − x* ‖_A
          = min_{P^(k-1)} ‖ ( P^(k-1)(A) A − I ) ( x* − x^(0) ) ‖_A
          = min_{Q^(k): Q^(k)(0)=1} ‖ Q^(k)(A) e^(0) ‖_A

for polynomials P^(k-1) of degree k − 1 and Q^(k) of degree k with Q^(k)(0) = 1.
Error Estimation (cont.)

• The matrix A has an orthonormal basis of eigenvectors u_j, j = 1, ..., n, with eigenvalues λ_j.
• It holds that

  A u_j = λ_j u_j, j = 1, ..., n, and u_j^T u_k = 0 for j ≠ k, u_j^T u_j = 1.

• Expand the start error in this ONB: e^(0) = Σ_{j=1}^{n} ρ_j u_j. Then

‖e^(k)‖_A = min_{Q^(k)(0)=1} ‖ Q^(k)(A) Σ_{j=1}^{n} ρ_j u_j ‖_A = min_{Q^(k)(0)=1} ‖ Σ_{j=1}^{n} ρ_j Q^(k)(A) u_j ‖_A
          = min_{Q^(k)(0)=1} ‖ Σ_{j=1}^{n} ρ_j Q^(k)(λ_j) u_j ‖_A
          ≤ min_{Q^(k)(0)=1} { max_{j=1,...,n} |Q^(k)(λ_j)| } ‖ Σ_{j=1}^{n} ρ_j u_j ‖_A
          = min_{Q^(k)(0)=1} { max_{j=1,...,n} |Q^(k)(λ_j)| } ‖ e^(0) ‖_A
Error Estimates

By choosing particular polynomials with Q^(k)(0) = 1, we can derive estimates for the error after the k-th step. Choose, e.g.,

Q^(k)(x) := ( 1 − 2x / (λ_max + λ_min) )^k.

Then

‖e^(k)‖_A ≤ max_{j=1,...,n} |Q^(k)(λ_j)| ‖e^(0)‖_A ≤ | 1 − 2 λ_max / (λ_max + λ_min) |^k ‖e^(0)‖_A
          = ( (λ_max − λ_min) / (λ_max + λ_min) )^k ‖e^(0)‖_A = ( (κ(A) − 1) / (κ(A) + 1) )^k ‖e^(0)‖_A
Better Estimates

• Choose normalized Chebyshev polynomials T_n(x) = cos(n arccos(x)):

  ‖e^(k)‖_A ≤ ( 1 / T_k( (κ(A) + 1) / (κ(A) − 1) ) ) ‖e^(0)‖_A ≤ 2 ( (√κ(A) − 1) / (√κ(A) + 1) )^k ‖e^(0)‖_A

• For clustered eigenvalues choose a special polynomial; e.g., assume that A has only two distinct eigenvalues λ_1 and λ_2:

  Q^(2)(x) := (λ_1 − x)(λ_2 − x) / (λ_1 λ_2)

  ‖e^(2)‖_A ≤ max_{j=1,2} |Q^(2)(λ_j)| ‖e^(0)‖_A = 0

  Convergence of CG after two steps!
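As a rough numerical illustration (the numbers are ours, not from the slides): for κ(A) = 100, the gradient-type bound contracts by (κ − 1)/(κ + 1) = 99/101 ≈ 0.980 per step, the Chebyshev bound by (√κ − 1)/(√κ + 1) = 9/11 ≈ 0.818; reducing the initial A-norm error by a factor of 10^6 then takes roughly 690 versus 69 iterations.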
Outliers/Cluster

Assume the matrix has one eigenvalue λ_1 > 1, and all other eigenvalues are contained in an ε-neighborhood of 1:

∀ λ ≠ λ_1: |λ − 1| < ε.

Choose

Q^(2)(x) := (λ_1 − x)(1 − x) / λ_1.

Then

‖e^(2)‖_A ≤ max_{|λ−1|<ε} | (λ_1 − λ)(1 − λ) / λ_1 | ‖e^(0)‖_A ≤ ( (λ_1 − 1 + ε) ε / λ_1 ) ‖e^(0)‖_A = O(ε) ‖e^(0)‖_A

A very good approximation by CG after only two steps!

Important: a small number of outliers combined with a cluster.
Conjugate Gradients Summary

• To get fast convergence and reduce the number of iterations: find a preconditioner M such that M^(-1) A x = M^(-1) b has clustered eigenvalues.
• Conjugate gradients (CG) is in general the method of choice for symmetric positive definite A.
• To improve convergence, include preconditioning (PCG).
• CG has two important properties: it is optimal and cheap.
Parallel Conjugate Gradients Algorithm

[This slide consists of a figure only.]
Non-Blocking Collective Operations

• Use them to hide communication!
• They allow the overlap of numerical computation and communication.
• In MPI-1/MPI-2, non-blocking operation is only possible for point-to-point communication: MPI_Isend and MPI_Irecv. Additional libraries are necessary for collective operations, for example LibNBC (non-blocking collectives).
• Non-blocking collectives are included in the new MPI-3 standard.
Example: Pseudocode for a Non-Blocking Reduction

#include <mpi.h>
// sketch; compute() and evaluate() are application placeholders

MPI_Request req;
MPI_Status stat;
int sbuf1[SIZE], rbuf1[SIZE], buf2[SIZE];

// compute sbuf1
compute(sbuf1, SIZE);

// start non-blocking allreduce of sbuf1:
// computation and communication overlap
MPI_Iallreduce(sbuf1, rbuf1, SIZE, MPI_INT, MPI_SUM, MPI_COMM_WORLD, &req);

// compute buf2 (independent of sbuf1)
compute(buf2, SIZE);

// synchronization
MPI_Wait(&req, &stat);

// use data in rbuf1; final computation
evaluate(rbuf1, buf2, SIZE);
Iterative Methods for General (Nonsymmetric) A: BiCGSTAB

• Applicable if little memory is at hand and A^T is not available.
• Computational costs per iteration are similar to BiCG and CGS.
• An alternative to CGS that avoids the irregular convergence patterns of CGS while maintaining similar convergence speed.
• Less loss of accuracy in the updated residual.
Iterative Methods for General (Nonsymmetric) A: GMRES

• Leads to the smallest residual for a fixed number of iteration steps, but these steps become increasingly expensive.
• To limit the increasing storage requirements and work per iteration step, restarting is necessary. When to do so depends on A and b; it requires skill and experience.
• Requires only matrix-vector products with the coefficient matrix.
• The number of inner products grows linearly with the iteration number (up to the restart point).
• An implementation based on classical Gram-Schmidt has independent inner products → only one synchronization point. Using modified Gram-Schmidt gives one synchronization point per inner product.

We consider GMRES in the following.
5.2.3. GMRES

• Iterative solution method for general A.
• Consider a small subspace U_m and determine the optimal approximate solution to A x = b in U_m. With a basis matrix U_m of this subspace, x is of the form x := U_m y:

  min_{x ∈ U_m} ‖A x − b‖₂ = min_y ‖A (U_m y) − b‖₂

• Can be solved via the normal equations: (U_m^T A^T A U_m) y = U_m^T A^T b.
• Try to find a sequence of "good" subspaces U_1 → U_2 → U_3 → ... such that the optimal solutions can be updated iteratively,

  x_1 → x_2 → x_3 → ... → A^(-1) b,

  using mainly matrix-vector products.
GMRES: Subspace

Which subspace gives fast convergence and easy computations?

U_m := K_m(A, b) = span(b, A b, ..., A^(m-1) b)  (Krylov space)

Problem: b, A b, A² b, ... is a bad (nearly linearly dependent) basis for this subspace!

So we need a first step that computes a good basis for U_m. Start with u_1 := b / ‖b‖₂ and do, for j = 2, ..., m:

ũ_j := A u_{j-1} − Σ_{k=1}^{j-1} (u_k^T A u_{j-1}) u_k = A u_{j-1} − Σ_{k=1}^{j-1} h_{k,j-1} u_k

u_j := ũ_j / ‖ũ_j‖₂ = ũ_j / h_{j,j-1}

This is the standard orthogonalization method (the Arnoldi method); a sketch follows below.
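A MATLAB sketch of the Arnoldi loop (indexed per column j = 1, ..., m producing u_{j+1}; it uses the modified Gram-Schmidt update and omits the breakdown check h_{j+1,j} = 0 that a robust code needs):

function [U, H] = arnoldi(A, b, m)
  % builds the ONB U of K_{m+1}(A,b) and the Hessenberg matrix H
  % with A*U(:,1:m) = U*H
  n = numel(b);
  U = zeros(n, m+1);  H = zeros(m+1, m);
  U(:,1) = b / norm(b);
  for j = 1:m
    w = A * U(:,j);                  % next Krylov direction
    for k = 1:j                      % orthogonalize against u_1, ..., u_j
      H(k,j) = U(:,k)' * w;
      w = w - H(k,j) * U(:,k);
    end
    H(j+1,j) = norm(w);              % = 0 would signal breakdown
    U(:,j+1) = w / H(j+1,j);
  end
end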
Matrix Form of the Arnoldi ONB

U_m = span(b, A b, ..., A^(m-1) b) = span(u_1, u_2, ..., u_m)  (ONB)

Writing the orthogonalization method in matrix form:

A u_{j-1} = Σ_{k=1}^{j-1} h_{k,j-1} u_k + ũ_j = Σ_{k=1}^{j} h_{k,j-1} u_k

A U_m = A (u_1 ... u_m) = (u_1 ... u_{m+1}) H_{m+1,m} = U_{m+1} H_{m+1,m}

with the (m+1) × m upper Hessenberg matrix

H_{m+1,m} =
  ( h_{1,1}  h_{1,2}  ...  h_{1,m}   )
  ( h_{2,1}  h_{2,2}  ...  h_{2,m}   )
  (    0     h_{3,2}  ...  h_{3,m}   )
  (   ...      ...    ...    ...     )
  (    0       0      ...  h_{m+1,m} )

i.e., h_{k,j} = 0 for k > j + 1.
GMRES: Minimization

This leads to the minimization problem

min_{x ∈ U_m} ‖A x − b‖ = min_y ‖A U_m y − b‖
                        = min_y ‖U_{m+1} H_{m+1,m} y − ‖b‖ u_1‖
                        = min_y ‖U_{m+1} ( H_{m+1,m} y − ‖b‖ e_1 )‖
                        = min_y ‖H_{m+1,m} y − ‖b‖ e_1‖,

using b = ‖b‖ u_1 and the fact that U_{m+1} has orthonormal columns (it is part of an orthogonal matrix).
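Putting the pieces together in MATLAB (reusing the arnoldi sketch above; backslash on the small rectangular system computes the least-squares solution, whereas production GMRES codes instead update a QR factorization of H with Givens rotations step by step):

[U, H] = arnoldi(A, b, m);          % ONB of K_{m+1}(A,b) and Hessenberg H
e1 = zeros(m+1, 1);  e1(1) = 1;
y = H \ (norm(b) * e1);             % min_y || H_{m+1,m} y - ||b|| e1 ||_2
x = U(:,1:m) * y;                   % GMRES approximation in U_m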