Math 170A (Winter 2009) - Lecture 24

Emre Mengi
Department of Mathematics
University of California at San Diego
[email protected]
Outline
Eigenvalues and Eigenvectors
Convergence Properties of Power Iteration - section 5.3
Extensions of Power Iteration - section 5.3
Power Iteration

Reminders from last lecture

Given a matrix A ∈ C^{n×n} and an initial vector q0 ∈ C^n, the power
iteration generates the sequence of vectors {qk} satisfying

    qk := A q_{k−1} / ‖A q_{k−1}‖,    (k = 1, 2, . . .)

Convergence to dominant eigenvector: The sequence {qk} approaches a unit
eigenvector v̂ associated with the eigenvalue λ1 with largest modulus.

Retrieval of dominant eigenvalue: Notice that v̂*Av̂ = v̂*λ1v̂ = λ1‖v̂‖² = λ1.
Power Iteration

Pseudocode

Given A ∈ C^{n×n} and q0 ∈ C^n s.t. ‖q0‖ = 1.
for k = 1, . . . , m do
    qk ← A q_{k−1}
    qk ← qk / ‖qk‖
end for
v ← qm
λ ← qm* A qm
Return (λ, v)
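The pseudocode above can be sketched in NumPy; the 2×2 test matrix in the usage note is an illustrative example, not from the slides:

```python
import numpy as np

def power_iteration(A, q0, m):
    """Power iteration as in the pseudocode: m steps of multiply-and-normalize,
    then the dominant eigenvalue is recovered via the Rayleigh quotient."""
    q = q0 / np.linalg.norm(q0)        # ensure the starting vector is unit length
    for _ in range(m):
        q = A @ q                      # q_k <- A q_{k-1}
        q = q / np.linalg.norm(q)      # q_k <- q_k / ||q_k||
    lam = np.vdot(q, A @ q)            # lambda <- q_m* A q_m (q_m has unit norm)
    return lam, q
```

For example, A = [[2, 1], [1, 2]] has eigenvalues 3 and 1, so power_iteration(A, [1, 0], 60) returns λ ≈ 3 with eigenvector ≈ (1, 1)/√2; the error shrinks by a factor of about |λ2/λ1| = 1/3 per step.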
Power Iteration

Rate of Convergence: Suppose lim_{k→∞} vk = v̂.

Linear convergence: for some positive constant c < 1,

    lim_{k→∞} ‖v_{k+1} − v̂‖ / ‖vk − v̂‖ = c

e.g. {10^{−k}} = {0.1, 0.01, 0.001, . . .} converges to 0 linearly:

    (10^{−(k+1)} − 0) / (10^{−k} − 0) = 0.1

Quadratic convergence: for some positive constant c,

    lim_{k→∞} ‖v_{k+1} − v̂‖ / ‖vk − v̂‖² = c

e.g. {10^{−2^k}} = {10^{−2}, 10^{−4}, 10^{−8}, 10^{−16}, . . .} converges to 0
quadratically:

    (10^{−2^{k+1}} − 0) / (10^{−2^k} − 0)² = 10^{−2^{k+1}} / 10^{−2·2^k} = 10^{−2^{k+1}} / 10^{−2^{k+1}} = 1
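The two example sequences can be checked directly in Python; this snippet just evaluates the ratios shown above:

```python
# Error ratios for the two example sequences from the slide.
# Linear:    e_k = 10^(-k),   so e_{k+1} / e_k    stays at 0.1.
# Quadratic: e_k = 10^(-2^k), so e_{k+1} / e_k**2 stays at 1.
linear = [10.0 ** (-k) for k in range(1, 6)]          # 0.1, 0.01, ..., 1e-5
quadratic = [10.0 ** (-2 ** k) for k in range(1, 5)]  # 1e-2, 1e-4, 1e-8, 1e-16

linear_ratios = [linear[k + 1] / linear[k] for k in range(4)]
quadratic_ratios = [quadratic[k + 1] / quadratic[k] ** 2 for k in range(3)]
```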
Power Iteration

Rate of Convergence: It can be shown that for some constant c,

    lim_{k→∞} ‖v̂ − q_{k+1}‖ / ‖v̂ − qk‖ = c |λ2 / λ1|

When it is convergent, the power iteration converges only linearly.

The closer the moduli of the eigenvalues λ2 and λ1 are, the slower the
convergence is.

Dominant Eigenvalue: The eigenvalue with largest modulus is given by r(v̂),
where

    r(x) = x*Ax / x*x

is called the Rayleigh quotient of x ∈ C^n.

Note that r(v̂) = v̂*Av̂ / v̂*v̂ = v̂*λ1v̂ / v̂*v̂ = λ1.
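The Rayleigh quotient is one line of NumPy; the 2×2 matrix in the usage note is an illustrative example:

```python
import numpy as np

def rayleigh_quotient(A, x):
    """r(x) = x* A x / (x* x) for a nonzero x in C^n.
    np.vdot conjugates its first argument, so this is x* correctly
    even for complex x."""
    return np.vdot(x, A @ x) / np.vdot(x, x)
```

Applied to an eigenvector it returns the eigenvalue exactly: for A = [[2, 1], [1, 2]], r([1, 1]) = 3 and r([1, −1]) = 1.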
Inverse Iteration

Power iteration suffers from slow convergence when |λ1| ≈ |λ2|.

A key observation to speed up power iteration:

    Av = λv ⟺ Av − µv = λv − µv
            ⟺ (A − µI)v = (λ − µ)v
            ⟺ (λ − µ)^{−1} v = (A − µI)^{−1} v

(λ, v) is an eigenpair of A ⟺ ((λ − µ)^{−1}, v) is an eigenpair of (A − µI)^{−1}.

Suppose σ is a good estimate of an eigenvalue λl. That is,

    |λl − σ| ≪ |λj − σ|  (equivalently 1/|λl − σ| ≫ 1/|λj − σ|)  for all j ≠ l.

The eigenvalues of (A − σI)^{−1} are 1/(λ1 − σ), 1/(λ2 − σ), . . . , 1/(λn − σ).

Power iteration applied to (A − σI)^{−1} must converge to vl quickly.
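The eigenpair correspondence above can be verified numerically; the diagonal matrix and shift below are made-up examples chosen so the eigenvalues are obvious:

```python
import numpy as np

# A has eigenvalues 5, 2, -1; sigma = 1.9 is a good estimate of 2.
A = np.diag([5.0, 2.0, -1.0])
sigma = 1.9
B = np.linalg.inv(A - sigma * np.eye(3))

# Eigenvalues of (A - sigma I)^{-1} are 1/(lambda_i - sigma):
# roughly 0.32, 10.0, -0.34.  The eigenvalue nearest sigma is now
# strongly dominant, so power iteration on B converges quickly.
mu = np.linalg.eigvals(B)
```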
Inverse Iteration

Rate of Convergence: Let λj be the eigenvalue second closest to σ. Then

    lim_{k→∞} ‖v̂ − q_{k+1}‖ / ‖v̂ − qk‖ = c |(1/(λj − σ)) / (1/(λl − σ))| = c |(λl − σ) / (λj − σ)|

Inverse iteration requires the product (A − σI)^{−1} qk, equivalently the
solution of the linear system (A − σI)x = qk, at each iteration.

In practice an LU factorization of (A − σI) is computed initially (at a
cost of 2n³/3 flops).

At each iteration the system

    (A − σI)x = LUx = qk

is solved by forward and back substitutions (at a cost of O(n²)).
Inverse Iteration

Pseudocode

Given A ∈ C^{n×n}, q0 ∈ C^n s.t. ‖q0‖ = 1, and σ ∈ C.
Compute an LU factorization of (A − σI)
for k = 1, . . . , m do
    Solve Lx̂ = q_{k−1} by forward substitution.
    Solve Ux = x̂ by back substitution.
    qk ← x / ‖x‖
end for
v ← qm
λ ← qm* A qm
Return (λ, v)
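A NumPy sketch of the pseudocode (for brevity it calls np.linalg.solve every step; a faithful implementation would compute the LU factorization of A − σI once and reuse the triangular factors, as above):

```python
import numpy as np

def inverse_iteration(A, q0, sigma, m):
    """Inverse iteration with fixed shift sigma. Each step applies
    (A - sigma I)^{-1} by solving a linear system, then renormalizes."""
    n = A.shape[0]
    M = A - sigma * np.eye(n)          # factored once in practice
    q = q0 / np.linalg.norm(q0)
    for _ in range(m):
        x = np.linalg.solve(M, q)      # x = (A - sigma I)^{-1} q_{k-1}
        q = x / np.linalg.norm(x)
    lam = np.vdot(q, A @ q)            # Rayleigh quotient recovers lambda_l
    return lam, q
```

For A = [[2, 1], [1, 2]] (eigenvalues 3 and 1) and shift σ = 0.9, the convergence factor |λl − σ|/|λj − σ| = 0.1/2.1 ≈ 0.05, so a couple dozen steps give λ ≈ 1 to machine precision.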
Inverse Iteration

Inverse iteration is commonly employed to compute the eigenvectors given
the eigenvalues (or very good estimates).

Suppose λ is very close to an eigenvalue.

(A − λI) is almost singular.

The condition number of the matrix (A − λI) is large.

The computed solution x̂ for (A − λI)x = qk can potentially have a large
error ‖x̂ − x‖.

The direction of the computed solution is usually accurate, i.e.
‖ x̂/‖x̂‖ − x/‖x‖ ‖ is usually small.
Rayleigh Iteration

Rayleigh iteration is similar to the inverse iteration with the exception
that the shift σ is set to the Rayleigh quotient at every iteration, i.e.

    qk := (A − σ_{k−1}I)^{−1} q_{k−1} / ‖(A − σ_{k−1}I)^{−1} q_{k−1}‖,
    where σ_{k−1} := r(q_{k−1}) = q_{k−1}* A q_{k−1} / q_{k−1}* q_{k−1}

Upside: Rayleigh iteration usually converges to an eigenvector vl
associated with an eigenvalue λl very quickly.

The quick convergence is due to the fact that r(qk) becomes an
increasingly better estimate of r(vl) = λl as qk approaches vl:

    |r(qk) − λl| = |r(qk) − r(vl)| ≤ 2‖A‖ ‖qk − vl‖

See Theorem 5.3.25 on page 326 in the textbook.
Rayleigh Iteration

Rate of Convergence: Suppose lim_{k→∞} qk = v̂. Then

    lim_{k→∞} ‖v̂ − q_{k+1}‖ / ‖v̂ − qk‖² = c

Rate of convergence is quadratic.

Downside: At each iteration an LU factorization of (A − σ_{k−1}I) needs to
be computed from scratch to solve (A − σ_{k−1}I)x = q_{k−1} for x.

Each iteration costs 2n³/3 flops.
Rayleigh Iteration

Pseudocode

Given A ∈ C^{n×n} and q0 ∈ C^n s.t. ‖q0‖ = 1.
for k = 1, . . . , m do
    σ_{k−1} ← q_{k−1}* A q_{k−1}
    Compute an LU factorization of (A − σ_{k−1}I)
    Solve Lx̂ = q_{k−1} by forward substitution.
    Solve Ux = x̂ by back substitution.
    qk ← x / ‖x‖
end for
v ← qm
λ ← qm* A qm
Return (λ, v)
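A NumPy sketch of the pseudocode (np.linalg.solve stands in for the per-step LU factorization and triangular solves; the early exit is an added safeguard, since A − σI becomes numerically singular once σ has converged to an eigenvalue):

```python
import numpy as np

def rayleigh_iteration(A, q0, m):
    """Rayleigh quotient iteration: the shift is refreshed to r(q_{k-1})
    before every solve, giving quadratic convergence near an eigenpair."""
    n = A.shape[0]
    q = q0 / np.linalg.norm(q0)
    for _ in range(m):
        sigma = np.vdot(q, A @ q)                  # sigma_{k-1} <- r(q_{k-1})
        if np.linalg.norm(A @ q - sigma * q) < 1e-12:
            break                                  # converged; stop before the
                                                   # shifted system turns singular
        x = np.linalg.solve(A - sigma * np.eye(n), q)
        q = x / np.linalg.norm(x)
    lam = np.vdot(q, A @ q)
    return lam, q
```

For A = [[2, 1], [1, 2]] and q0 = [1, 0.3], the first shift is ≈ 2.55, closer to the eigenvalue 3 than to 1, and the iteration locks onto λ = 3 in a handful of steps.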
Next

Today: Reduction to Hessenberg form - section 5.5
Next Lecture: The QR Algorithm - section 5.6