
PART III

(5) Computation with Markov Chains: Iterative Methods

- Vector Norms and Matrix Norms
- Power Method for Matrix Eigenvalues
- Iterative Methods for Systems of Linear Equations
- Spectral Radius
- Steepest Descent Method and Conjugate Gradient Method

(6) Markovian Queueing Networks, Manufacturing and Re-manufacturing Systems

- A Single Markovian Queue (M/M/s/n-s-1)
- Two-Queue Free Models
- Two-Queue Overflow Free Models
- Manufacturing and Re-manufacturing Systems

Computational science is now the third paradigm of science, complementing theory and experiment.

Kenneth G. Wilson, Nobel Laureate in Physics. (Wikipedia)

http://hkumath.hku.hk/~wkc/course/part3.pdf


5 Computation with Markov Chains: Iterative Methods

5.1 From Numerical Analysis to Scientific Computing

• It was not until the 20th century that "Numerical Analysis" became a recognized mathematical discipline, although numerical methods for obtaining approximate solutions to many mathematical problems existed in ancient times.

• In fact, many numerical methods bear the names of great mathematicians such as Gauss, Fourier, Jacobi and Newton.

• For example, Newton's divided difference method is a famous interpolation formula for fitting a polynomial to a given set of points, and Gaussian elimination is a direct method for solving a system of linear equations.


Figure 1: Gauss (1777-1855) (Left) Fourier (1768-1830) (Right)

Figure 2: Jacobi (1804-1851) (Left) Newton (1642-1727) (Right) Taken from Wikipedia


• The following are two books on the history of Numerical Analysis and Scientific Computing:

(1) H. H. Goldstine, A History of Numerical Analysis from the 15th Through the 19th Century, Springer, New York, 1977.

(2) H. H. Goldstine, The Computer from Pascal to von Neumann, Princeton University Press, Princeton, NJ, 1972.

• You can also consult the slides on the topic "Key Moments in the History of Numerical Analysis" (pdf) by Michele Benzi. (The remainder of the notes on the history of Numerical Analysis is partially taken from these slides.)

http://history.siam.org/pdf/nahist_Benzi.pdf


• There was very little work in Numerical Analysis / Computational Mathematics before 1940. The following is a remarkable early work on numerical methods for solving differential equations.

• L. F. Richardson, The approximate arithmetical solution by finite differences of physical problems involving differential equations, with an application to stresses in a masonry dam, Philosophical Transactions of the Royal Society of London, A, 210 (1910) 307-357.

• Theoretically speaking, approximating a differential equation by the finite difference method is an excellent idea. However, solving a large system of linear equations (thousands or millions of unknowns) is computationally infeasible without a computer! Thus, at the time, research in this area seemed to be of little practical use.


• Two important events triggered the development of Numerical Analysis and Scientific Computing:

(i) the Second World War (1939);
(ii) the invention of the first digital computer (1945).

• Due to the Second World War, many European refugees, especially from Nazi Germany and Fascist Italy, moved to the US.

• Many of them were scientists and made important contributions to the war effort.


Figure 3: Courant (1888-1972) Taken from Wikipedia.

One important giant was Richard Courant, the successor of David Hilbert as director of the famous Mathematical Institute in Göttingen (a leading center for research in quantum physics in the 1920s-30s). Courant left Germany in 1933 as he was classified as a Jew by the Nazis. After one year in Cambridge, Courant went to New York City, where he became a professor at New York University in 1936. He was given the task of founding an institute for graduate studies in mathematics, a task which he carried out very successfully. The Courant Institute of Mathematical Sciences (as it was renamed in 1964) continues to be one of the most respected research centers in applied mathematics. (Taken from Wikipedia.)


Figure 4: von Neumann (1903-1957) Taken from Wikipedia.

Another giant is John von Neumann. Between 1926 and 1930 he taught at the University of Berlin as a Privatdozent, the youngest in its history. His father, Max von Neumann, died in 1929. In 1930, von Neumann, his mother, and his brothers emigrated to the United States. Von Neumann was invited to Princeton University, New Jersey, in 1930, and, subsequently, was one of the first four people selected for the faculty of the Institute for Advanced Study (two of the others being Albert Einstein and Kurt Gödel), where he remained a mathematics professor from its formation in 1933 until his death. (Taken from Wikipedia.)


Figure 5: The ENIAC (1945). Taken from Wikipedia.

The first large-scale electronic computer, the Electronic Numerical Integrator and Computer (ENIAC), was built in 1945.


Figure 6: Goldstine (1913-2004) Taken from Wikipedia.

After World War II, Goldstine joined von Neumann and Burks at the Institute for Advanced Study in Princeton, where they built a computer referred to as the IAS machine. The IAS machine influenced the design of IBM's early computers through von Neumann, who was a consultant to IBM. When von Neumann died in 1957, the IAS computer project was terminated. Von Neumann and Goldstine started the pioneering work in numerical linear algebra. (Taken from Wikipedia.)


Figure 7: Hestenes (1906-1991) (Left) Lanczos (1893-1974)(Middle) Stiefel (1909-1978) (Right) Taken from Wikipedia

M. Hestenes, together with C. Lanczos and E. Stiefel, invented the conjugate gradient (CG) method in the 1950s.

The CG method is an iterative method for solving symmetric positive definite linear systems.


Figure 8: Young (1923-2008) Taken from Wikipedia.

David M. Young designed the Successive Over-Relaxation (SOR) method in the 1950s, a variant of the Gauss-Seidel method for solving a system of linear equations with faster convergence. He is also called Dr. SOR.


5.2 Vector Norms and Matrix Norms

Definition 1 On a vector space V, a norm is a function ∥ · ∥ from V to the set of non-negative real numbers such that

(i) ∥x∥ > 0 ∀x ∈ V with x ≠ 0;

(ii) ∥λx∥ = |λ|∥x∥ ∀x ∈ V, λ ∈ R;

(iii) ∥x + y∥ ≤ ∥x∥ + ∥y∥ ∀x,y ∈ V .

Proposition 1 The following are three popular vector norms:¹

(a) ℓ2-norm: ∥x∥2 = (∑_{i=1}^n x_i²)^{1/2}, where x = (x_1, . . . , x_n)^T;

(b) ℓ1-norm: ∥x∥1 = ∑_{i=1}^n |x_i|;

(c) ℓ∞-norm: ∥x∥∞ = max_{1≤i≤n} {|x_i|}.

¹ An iterative method produces a sequence of approximate solutions to a system of linear equations. To measure the error of these approximations, we have to introduce a measurement: the vector norm and the matrix norm.


Proof: (a) (i) We note that if x ≠ 0, then at least one x_i ≠ 0. Thus

(∑_{i=1}^n x_i²)^{1/2} > 0.

(ii) We note that

∥λx∥2 = (∑_{i=1}^n (λx_i)²)^{1/2} = |λ| (∑_{i=1}^n x_i²)^{1/2} = |λ|∥x∥2.

(iii) Moreover, using the Cauchy-Schwarz inequality in the middle step, we also have

((∑_{i=1}^n x_i²)^{1/2} + (∑_{i=1}^n y_i²)^{1/2})²
= ∑_{i=1}^n x_i² + ∑_{i=1}^n y_i² + 2 (∑_{i=1}^n x_i²)^{1/2} (∑_{i=1}^n y_i²)^{1/2}
≥ ∑_{i=1}^n x_i² + ∑_{i=1}^n y_i² + 2 ∑_{i=1}^n |x_i||y_i|
= ∑_{i=1}^n (|x_i| + |y_i|)² ≥ ∑_{i=1}^n |x_i + y_i|².

Hence the result follows.


(b) (i) We note that if x ≠ 0, then at least one x_i ≠ 0. Thus

∑_{i=1}^n |x_i| > 0.

(ii) Second, we have

∥λx∥1 = ∑_{i=1}^n |λx_i| = |λ| ∑_{i=1}^n |x_i| = |λ|∥x∥1.

(iii) Moreover, we have

∑_{i=1}^n |x_i| + ∑_{i=1}^n |y_i| = ∑_{i=1}^n (|x_i| + |y_i|) ≥ ∑_{i=1}^n |x_i + y_i|.

Hence the result follows.


(c) (i) If x ≠ 0, then at least one x_i ≠ 0. Thus we have

max_i {|x_i|} > 0.

(ii) We also have

∥λx∥∞ = max_i {|λx_i|} = |λ| max_i {|x_i|}.

(iii) Finally, we have

∥x + y∥∞ = max_i {|x_i + y_i|} ≤ max_i {|x_i|} + max_i {|y_i|} = ∥x∥∞ + ∥y∥∞.

Hence the result follows.

• We have the above THREE popular vector norms. Is there any other vector norm? In fact, the answer is yes, and we shall introduce a family of vector norms with the above three vector norms as particular cases.
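The norms above are straightforward to evaluate numerically. Below is a minimal NumPy sketch (the helper name lp_norm is mine, not part of the notes) that evaluates ∥x∥p directly from the definition; taking p large already illustrates how the ℓp-norm approaches the ℓ∞-norm discussed in Section 5.2.2.

import numpy as np

def lp_norm(x, p):
    # (sum_i |x_i|^p)^(1/p) for p >= 1
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0, 1.0])
print(lp_norm(x, 1))        # l1-norm: 8.0
print(lp_norm(x, 2))        # l2-norm: sqrt(26), approx. 5.0990
print(lp_norm(x, 100))      # already close to the l-infinity norm
print(np.max(np.abs(x)))    # l-infinity norm: 4.0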


5.2.1 The ℓp-norm

For p ≥ 1, the following is a vector norm:

∥x∥p = (∑_{i=1}^n |x_i|^p)^{1/p}.

(i) It is clear that if x ≠ 0 then ∥x∥p > 0.

(ii) We have

∥λx∥p = (∑_{i=1}^n |λx_i|^p)^{1/p} = |λ| (∑_{i=1}^n |x_i|^p)^{1/p} = |λ|∥x∥p.

(iii) Finally, we have to show that ∥x + y∥p ≤ ∥x∥p + ∥y∥p, i.e.

(∑_{i=1}^n |x_i + y_i|^p)^{1/p} ≤ (∑_{i=1}^n |x_i|^p)^{1/p} + (∑_{i=1}^n |y_i|^p)^{1/p}.

For p = 1 we have already proved this, so we shall consider p > 1.


Lemma 1 Let p > 1 and define q such that 1/p + 1/q = 1. Then for any non-negative a and b, we have

a^{1/p} b^{1/q} ≤ a/p + b/q.

Proof: The inequality is clearly true for b = 0 or b = a.

Case 1: Assume that a ≥ b > 0 and let x = a/b ≥ 1. Dividing both sides by b, the inequality can be re-written as

x^{1/p} − 1 ≤ (1/p)(x − 1) for x ≥ 1.

Define

f(x) = x^{1/p} − 1 − (1/p)(x − 1),

so that

f′(x) = (1/p) x^{1/p − 1} − 1/p = (1/p)(x^{−1/q} − 1).

Since f(1) = 0 and f′(x) ≤ 0 for x ≥ 1, we have f(x) ≤ 0 for x ≥ 1.

Case 2: Assume b > a > 0 and let x = b/a. The proof is similar to Case 1.


Lemma 2 Let p > 1 and q be defined such that 1/p + 1/q = 1. Then

∑_{i=1}^n |x_i y_i| ≤ (∑_{i=1}^n |x_i|^p)^{1/p} (∑_{i=1}^n |y_i|^q)^{1/q}.

Proof: Let

A = (∑_{i=1}^n |x_i|^p)^{1/p},  B = (∑_{i=1}^n |y_i|^q)^{1/q},  a_i = |x_i|^p / A^p,  b_i = |y_i|^q / B^q.

By Lemma 1 we have, for i = 1, 2, . . . , n,

a_i^{1/p} b_i^{1/q} = |x_i||y_i| / (AB) ≤ a_i/p + b_i/q.

Summing over i and noting that ∑_i a_i = ∑_i b_i = 1, we get

∑_{i=1}^n |x_i||y_i| ≤ AB ∑_{i=1}^n (a_i/p + b_i/q) = (1/p + 1/q) AB = AB = (∑_{i=1}^n |x_i|^p)^{1/p} (∑_{i=1}^n |y_i|^q)^{1/q}.


• Now for p > 1 and x, y ≠ 0, we have

∑_{i=1}^n |x_i + y_i|^p = ∑_{i=1}^n |x_i + y_i||x_i + y_i|^{p−1} ≤ ∑_{i=1}^n |x_i||x_i + y_i|^{p−1} + ∑_{i=1}^n |y_i||x_i + y_i|^{p−1}.

By Lemma 2, we have

∑_{i=1}^n |x_i||x_i + y_i|^{p−1} ≤ (∑_{i=1}^n |x_i|^p)^{1/p} (∑_{i=1}^n |x_i + y_i|^{(p−1)q})^{1/q}

and

∑_{i=1}^n |y_i||x_i + y_i|^{p−1} ≤ (∑_{i=1}^n |y_i|^p)^{1/p} (∑_{i=1}^n |x_i + y_i|^{(p−1)q})^{1/q}.

Hence, since (p − 1)q = p,

∑_{i=1}^n |x_i + y_i|^p ≤ [ (∑_{i=1}^n |x_i|^p)^{1/p} + (∑_{i=1}^n |y_i|^p)^{1/p} ] (∑_{i=1}^n |x_i + y_i|^p)^{1/q},

and the result follows.


5.2.2 The ℓ∞-norm

For the ℓ∞-norm, one can regard it as

lim_{p→∞} ∥x∥p = lim_{p→∞} (∑_{i=1}^n |x_i|^p)^{1/p}.

Let |x| = max_{1≤i≤n} {|x_i|}. We note that

∥x∥p = |x| · (∑_{i=1}^n (|x_i|/|x|)^p)^{1/p}

and |x_i|/|x| ≤ 1. Thus we have

1 ≤ (∑_{i=1}^n (|x_i|/|x|)^p)^{1/p} ≤ n^{1/p}

and

1 ≤ lim_{p→∞} (∑_{i=1}^n (|x_i|/|x|)^p)^{1/p} ≤ lim_{p→∞} n^{1/p} = 1.


• We conclude that

lim_{p→∞} ∥x∥p = lim_{p→∞} (∑_{i=1}^n |x_i|^p)^{1/p} = max_{1≤i≤n} {|x_i|},

and therefore we define

∥x∥∞ = max_{1≤i≤n} {|x_i|}.

Definition 2 The matrix norm of an n × n square matrix A is defined as

∥A∥M = sup {∥Au∥ : u ∈ Rn, ∥u∥ = 1}. (5.1)

Remark 1 We note that ∥ · ∥ is a vector norm; it can be ∥ · ∥∞, ∥ · ∥1 or ∥ · ∥2. A matrix norm is induced by a vector norm.


Proposition 2 If ∥ · ∥ is any vector norm on Rn, then (5.1) defines a norm on the linear space of all n × n matrices.

Proof: (i) Suppose A ≠ 0, say A_ij ≠ 0. We let

x = (0, 0, · · · , 0, 1, 0, · · · , 0)^T  (1 in the j-th entry),  so that ∥x∥ = 1

and ∥Ax∥ = ∥A_j∥ > 0, where A_j is the j-th column of A. Hence ∥A∥M > 0.

(ii) We have

∥λA∥M = sup{∥λAu∥ : ∥u∥ = 1} = |λ| sup{∥Au∥ : ∥u∥ = 1} = |λ|∥A∥M.

(iii) Finally, we have to show that ∥A + B∥M ≤ ∥A∥M + ∥B∥M:

∥A + B∥M = sup{∥(A + B)u∥ : ∥u∥ = 1}
≤ sup{∥Au∥ + ∥Bu∥ : ∥u∥ = 1}
≤ sup{∥Au∥ : ∥u∥ = 1} + sup{∥Bu∥ : ∥u∥ = 1}
= ∥A∥M + ∥B∥M.


Proposition 3 We have ∥Ax∥ ≤ ∥A∥M∥x∥.

Proof: Case 1: For x = 0, the result is obvious.

Case 2: For x ≠ 0, let

u = x / ∥x∥,

so that ∥u∥ = 1. We have

∥A∥M ≥ ∥Au∥ = ∥Ax∥ / ∥x∥.

Hence

∥A∥M ≥ (1/∥x∥) ∥Ax∥,

and therefore ∥A∥M ∥x∥ ≥ ∥Ax∥.

Remark 2 If A = I then

∥A∥M = sup{∥Au∥ : u ∈ Rn, ∥u∥ = 1} = sup{∥u∥ : ∥u∥ = 1} = 1.

That is to say ∥I∥M = 1.


Proposition 4 ∥AB∥M ≤ ∥A∥M · ∥B∥M .

Proof: From Proposition 3 above, we have, for all x ∈ Rn,

∥ABx∥ ≤ ∥A∥M ∥Bx∥ ≤ ∥A∥M ∥B∥M ∥x∥.

Hence

∥AB∥M = sup{∥ABx∥ : x ∈ Rn, ∥x∥ = 1}
≤ sup{∥A∥M ∥B∥M ∥x∥ : x ∈ Rn, ∥x∥ = 1}
= ∥A∥M · ∥B∥M.


Proposition 5

∥A∥M∞ = max_{1≤i≤n} ∑_{j=1}^n |A_ij|.

Proof:

∥A∥M∞ = sup {∥Ax∥∞ : x ∈ Rn, ∥x∥∞ = 1} = sup_{∥x∥∞=1} {∥Ax∥∞}
= sup_{∥x∥∞=1} { max_{1≤i≤n} |∑_{j=1}^n A_ij x_j| }
= max_{1≤i≤n} sup_{∥x∥∞=1} { |∑_{j=1}^n A_ij x_j| } = max_{1≤i≤n} { ∑_{j=1}^n |A_ij| }.

• Note that

sup_{∥x∥∞=1} {−2x_1 + 3x_2 − 4x_3} = 2 + 3 + 4 = 9.

The supremum above is achieved by taking x_1 = x_3 = −1 and x_2 = 1.


Proposition 6

∥A∥M1 = max_{1≤j≤n} { ∑_{i=1}^n |A_ij| }.

Proof: We have

∥A∥M1 = sup {∥Ax∥1 : x ∈ Rn, ∥x∥1 = 1} = sup_{∥x∥1=1} {∥Ax∥1}
= sup_{∥x∥1=1} { ∑_{i=1}^n |∑_{j=1}^n A_ij x_j| }
≤ sup_{∥x∥1=1} { ∑_{j=1}^n |x_j| ∑_{i=1}^n |A_ij| } = 1 · max_{1≤j≤n} { ∑_{i=1}^n |A_ij| }.

We note that if

max_{1≤j≤n} { ∑_{i=1}^n |A_ij| } = ∑_{i=1}^n |A_ik|,

then this bound is attained by letting

x = e_k = (0, 0, . . . , 0, 1, 0, · · · , 0)^T  (1 in the k-th entry).


Proposition 7

∥A∥M2 = √(λmax(A^T A)).

Proof: We note that

∥A∥²M2 = sup{∥Ax∥²2 : ∥x∥²2 = 1}.

Since A^T A is a symmetric matrix, there exists a matrix P such that

A^T A = P^T D P and P^T P = I.

Here D is a diagonal matrix containing all the eigenvalues of A^T A. Hence

∥Ax∥²2 = (Ax)^T (Ax) = x^T A^T A x = (Px)^T D (Px).

We also observe that

x^T x = 1 if and only if (Px)^T (Px) = x^T P^T P x = 1.

Hence, by letting y = Px, we have

∥A∥²M2 = sup{y^T D y : ∥y∥²2 = 1} = λmax(A^T A).


Example 1 Suppose

A = [ 2 −1 0 ]
    [ 3  2 1 ]
    [ 0  1 2 ].

Then we have

∥A∥M1 = max{5, 4, 3} = 5,
∥A∥M∞ = max{3, 6, 3} = 6,
∥A∥M2 = √(λmax(A^T A)) = √16.5498 = 4.0681.
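A quick numerical check of Example 1 (a minimal sketch; the induced 1- and ∞-norms are available directly from numpy.linalg.norm):

import numpy as np

A = np.array([[2.0, -1.0, 0.0],
              [3.0,  2.0, 1.0],
              [0.0,  1.0, 2.0]])

norm1   = np.linalg.norm(A, 1)        # max column sum of |A_ij|
norminf = np.linalg.norm(A, np.inf)   # max row sum of |A_ij|
norm2   = np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A)))  # sqrt of lambda_max(A^T A)

print(norm1, norminf, norm2)          # 5.0  6.0  approx. 4.0681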


Definition 3 A sequence of vectors v_k converges to a vector v, i.e.,

lim_{k→∞} v_k = v,

if

lim_{k→∞} ∥v_k − v∥ = 0.

Example 2 Consider

v_k = ( 1 − 1/2^k , 1 + 1/3^k )^T ∈ R².

We note that

v_k − (1, 1)^T = ( −1/2^k , 1/3^k )^T.

Hence

∥v_k − (1, 1)^T∥1 = 1/2^k + 1/3^k  and  ∥v_k − (1, 1)^T∥∞ = 1/2^k.

Both of these tend to 0 as k → ∞. We say lim_{k→∞} v_k = v = (1, 1)^T.

Remark 3 The same concept can be applied to a sequence of matrices. One can replace the vector norm ∥ · ∥ by a matrix norm ∥ · ∥M.


Proposition 8 If A is an n × n matrix such that ∥A∥M < 1, then (I − A)^{−1} exists and equals

∑_{k=0}^∞ A^k.

Proof: Suppose that (I − A) is not invertible. Then there exists x ≠ 0 such that (I − A)x = 0.

• Let

u = x/∥x∥,

so that ∥u∥ = 1 and u = Au.

This implies that ∥A∥M ≥ 1, a contradiction. Hence (I − A) is invertible and (I − A)^{−1} exists.

• To prove

(I − A)^{−1} = ∑_{k=0}^∞ A^k,

we will prove that

(I − A) · (∑_{k=0}^∞ A^k) = I.


• In other words, we wish to prove

lim_{m→∞} ∥ (I − A) ∑_{k=0}^m A^k − I ∥M = 0.

We note that

(I − A) ∑_{k=0}^m A^k = ∑_{k=0}^m A^k − ∑_{k=0}^m A^{k+1} = I − A^{m+1}.

Therefore

lim_{m→∞} ∥ (I − A) ∑_{k=0}^m A^k − I ∥M = lim_{m→∞} ∥A^{m+1}∥M ≤ lim_{m→∞} ∥A∥M^{m+1} = 0.


Example 3

A = [ 0.5 0.4 ]
    [ 0.3 0.6 ],     ∥A∥M∞ = 0.9 < 1,

A^100 = [ 1.1 1.5 ] × 10^{−5},
        [ 1.1 1.5 ]

(I − A)^{−1} = [  0.5 −0.4 ]^{−1} = I + A + A² + · · · = [ 5    5    ]
               [ −0.3  0.4 ]                             [ 3.75 6.25 ].


• Hence we have

I + A + · · · + A^100 ≈ [ 4.9999 4.9999 ]
                        [ 3.7499 6.2499 ].

We denote the truncation error by E, where

∥E∥M∞ = ∥A^101 + A^102 + · · · ∥M∞.

We have

∥E∥M∞ ≤ ∥A^101∥M∞ + ∥A^102∥M∞ + · · ·
      ≤ ∥A∥M∞^101 + ∥A∥M∞^102 + · · ·
      = 0.9^101 + 0.9^102 + · · ·
      = 0.9^101 · (1 / (1 − 0.9)) = 10 · 0.9^101.
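A short sketch of Example 3 (the variable names are mine): it sums the truncated Neumann series and compares it with (I − A)^{−1}; the observed error is indeed below the bound 10 · 0.9^101.

import numpy as np

A = np.array([[0.5, 0.4],
              [0.3, 0.6]])

# truncated Neumann series I + A + ... + A^100
S = np.eye(2)
P = np.eye(2)
for _ in range(100):
    P = P @ A
    S = S + P

exact = np.linalg.inv(np.eye(2) - A)
print(S)                             # approx. [[5, 5], [3.75, 6.25]]
print(np.max(np.abs(S - exact)))     # below the bound 10 * 0.9**101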


5.3 Power Method for Matrix Eigenvalues

We discuss the problem of computing the dominant eigenvalue and its corresponding eigenvector of a square matrix. Let the n × n matrix A satisfy:

(i) There is a single eigenvalue of maximum modulus. Let the eigenvalues λ1, λ2, · · · , λn be labeled so that

|λ1| > |λ2| ≥ |λ3| ≥ · · · ≥ |λn|.

(ii) To briefly discuss the idea, we assume that there is a linearly independent set of n unit eigenvectors. This means that there is a basis

{u^(1), u^(2), · · · , u^(n)}

for Rn such that

A u^(i) = λ_i u^(i),  i = 1, 2, · · · , n,  and ∥u^(i)∥ = 1.


• Beginning with an initial vector x^(0), we write

x^(0) = a_1 u^(1) + a_2 u^(2) + · · · + a_n u^(n).

Here {u^(i)} is a basis (of unit vectors) for Rn.

• Now

A^k x^(0) = a_1 A^k u^(1) + · · · + a_n A^k u^(n)
          = a_1 λ_1^k u^(1) + · · · + a_n λ_n^k u^(n)   (because A u^(i) = λ_i u^(i))
          = λ_1^k { a_1 u^(1) + (λ_2/λ_1)^k a_2 u^(2) + · · · + (λ_n/λ_1)^k a_n u^(n) }.

• We remark that the convergence "speed" of the power method depends on the "gap" between |λ1| and |λ2|. That is to say, the smaller the value of |λ2|/|λ1|, the faster the rate will be, as one can observe that

1 > |λ2/λ1| ≥ |λ3/λ1| ≥ · · · ≥ |λn/λ1|.


• Since

|λ_i| / |λ_1| < 1  for i = 2, . . . , n,

we have

lim_{k→∞} |λ_i|^k / |λ_1|^k = 0  for i = 2, . . . , n.

Hence we have

A^k x^(0) ≈ a_1 λ_1^k u^(1).

• Define

r_{k+1} = A^{k+1} x^(0) / ∥A^k x^(0)∥,  so that r_{k+1} = A r_k / ∥r_k∥.

We note that

lim_{k→∞} r_{k+1} = lim_{k→∞} a_1 λ_1^{k+1} u^(1) / ∥a_1 λ_1^k u^(1)∥ = λ_1 u^(1),

where ∥ · ∥ can be ∥ · ∥1, ∥ · ∥2 or ∥ · ∥∞. Therefore we have

lim_{k→∞} r_{k+1} / ∥r_{k+1}∥ = u^(1),

and λ_1 can be found by comparing A u^(1) and u^(1).


Example 4 Consider

A = [ 2 1 0 ]
    [ 1 2 1 ]
    [ 0 1 2 ],   with initial guess x^(0) = (1, 1, 1)^T.

We take the vector norm ∥ · ∥ to be ∥ · ∥2:

x^(1) = (1.7321, 2.3094, 1.7321)^T,
x^(2) = (1.7150, 2.4010, 1.7150)^T,
x^(3) = (1.7086, 2.4121, 1.7086)^T,
x^(4) = (1.7074, 2.4139, 1.7074)^T,

∥r_1∥2 = 3.3665, ∥r_2∥2 = 3.4128, ∥r_3∥2 = 3.4142, ∥r_4∥2 = 3.4142.

• Therefore λ_1 ≈ 3.4142 and u^(1) ≈ (1.7074, 2.4139, 1.7074)^T.
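A minimal sketch of the power method applied to Example 4 (the function name is mine; here λ1 is recovered with a Rayleigh quotient rather than by comparing Au^(1) with u^(1) entrywise):

import numpy as np

def power_method(A, x0, iters=50):
    # power iteration: r_{k+1} = A r_k / ||r_k||_2
    r = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = A @ r / np.linalg.norm(r)
    u = r / np.linalg.norm(r)      # approximate dominant eigenvector
    lam = (A @ u) @ u              # Rayleigh quotient approximates lambda_1
    return lam, u

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
lam, u = power_method(A, [1.0, 1.0, 1.0])
print(lam)   # approx. 3.4142 = 2 + sqrt(2)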


5.4 Iterative Solutions for Matrix Equations

Proposition 9 If ∥H∥M < 1, then the iterative scheme

x_{k+1} = H x_k + r

will converge to the solution of the linear system (I − H)x = r for any given x_0.

Proof: We note that

x_{k+1} = H x_k + r = H² x_{k−1} + (I + H) r = · · · = H^{k+1} x_0 + ∑_{m=0}^k H^m r.

By Proposition 8 we have

lim_{k→∞} H^{k+1} x_0 = 0  and  lim_{k→∞} ∑_{m=0}^k H^m r = (I − H)^{−1} r.

Hence

lim_{k→∞} x_{k+1} = (I − H)^{−1} r,

which is the solution of the linear system (I − H)x = r.


Proposition 10 If A and B are n × n matrices such that ∥I − AB∥M < 1, then A and B are invertible. Furthermore,

A^{−1} = B ∑_{k=0}^∞ (I − AB)^k  and  B^{−1} = ∑_{k=0}^∞ (I − AB)^k A.

Proof: From Proposition 8, the matrix AB is invertible, and this implies that both A and B are invertible. Now

(AB)^{−1} = ∑_{k=0}^∞ (I − AB)^k,

i.e.

B^{−1} A^{−1} = (AB)^{−1} = ∑_{k=0}^∞ (I − AB)^k.

Hence we have

B^{−1} = ∑_{k=0}^∞ (I − AB)^k · A  and  A^{−1} = B · ∑_{k=0}^∞ (I − AB)^k.


Proposition 11 If ∥I − B^{−1}A∥M < 1, then the following iterative scheme²

x_{k+1} = x_k + B^{−1}(b − A x_k) = B^{−1}b + (I − B^{−1}A) x_k

(with r = B^{−1}b and H = I − B^{−1}A) will converge to the solution of the linear system Ax = b.

Proof: Using Proposition 10, we have

B^{−1} = ∑_{k=0}^∞ (I − AB)^k A  if ∥I − AB∥M < 1.

Replacing B by A and A by B^{−1}, we get

A^{−1} = ∑_{k=0}^∞ (I − B^{−1}A)^k B^{−1}

if ∥I − B^{−1}A∥M < 1. Clearly we have

lim_{k→∞} ∥(I − B^{−1}A)^{k+1}∥M = 0.

Therefore, by Proposition 9 with H = I − B^{−1}A and r = B^{−1}b, x_k converges to the solution of Ax = b.

² This is called the preconditioning technique. To solve Ax = b, if ∥I − A∥M ≥ 1, try to find a matrix B such that ∥I − B^{−1}A∥M < 1. Then we solve B^{−1}Ax = B^{−1}b instead of the original system.


Example 5 Consider the following n × n linear system

A_n x = r   (n ≥ 2),

where A_n has diagonal entries 2n and all off-diagonal entries equal to 1:

A_n = [ 2n  1  · · ·  1 ]
      [ 1  2n  · · ·  1 ]
      [ ...  . . .   ...]
      [ 1   · · · 1  2n ].

Suggest a preconditioner matrix B such that the iterative scheme

x_{k+1} = (I_n − B^{−1}A_n) x_k + B^{−1} r

converges to the solution of A_n x = r for any given x_0. With your suggestion, what is the computational cost in each iteration?


• Take B = 2n I_n. Then we can check that

∥I_n − B^{−1}A_n∥M1 = (n − 1)/(2n) < 1/2 < 1.

Hence the scheme converges to the solution of A_n x = r.

• In each iteration, the main computational cost comes from the matrix-vector multiplication of the form B^{−1}A_n y. The cost of forming A_n y is O(n) because

A_n y = ((2n − 1) I_n + (1, 1, · · · , 1)^T (1, 1, · · · , 1)) y
      = (2n − 1) y + (1, 1, · · · , 1)^T ∑_{i=1}^n y_i.

It is clear that the cost is O(n). Furthermore, there is no cost in forming B^{−1}.

• Since B is a diagonal matrix, the cost of B^{−1}y is O(n). Finally, the addition of two vectors in Rn is O(n). Therefore the total cost per iteration is O(n).
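A small sketch of Example 5 under the choice B = 2n I_n (the function name is mine); each iteration applies A_n in O(n) operations using A_n = (2n − 1)I_n + e e^T:

import numpy as np

def solve_example5(n, r, iters=100):
    # preconditioned iteration x_{k+1} = x_k + B^{-1}(r - A_n x_k), B = 2n I_n
    x = np.zeros(n)
    for _ in range(iters):
        Ax = (2 * n - 1) * x + np.sum(x)    # A_n x computed in O(n)
        x = x + (r - Ax) / (2 * n)
    return x

n = 5
r = np.ones(n)
x = solve_example5(n, r)
A = (2 * n - 1) * np.eye(n) + np.ones((n, n))   # dense A_n, only for checking
print(np.max(np.abs(A @ x - r)))                # should be tiny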


5.5 Iterative Methods Based on Matrix Splitting

• There are at least three different ways of splitting a matrix A. For example, with

A = [ 1/2 1/3  0  ]
    [ 1/3  1  1/3 ]
    [  0  1/3 1/2 ],

we can write

A = [ 1 0 0 ]   [ −1/2 1/3   0  ]
    [ 0 1 0 ] + [ 1/3   0   1/3 ]      (case 1)
    [ 0 0 1 ]   [  0   1/3 −1/2 ]

  = [ 1/2 0  0  ]   [  0  1/3  0  ]
    [  0  1  0  ] + [ 1/3  0  1/3 ]    (case 2)
    [  0  0 1/2 ]   [  0  1/3  0  ]

  = [ 1/2  0   0  ]   [ 0 1/3  0  ]
    [ 1/3  1   0  ] + [ 0  0  1/3 ]    (case 3)
    [  0  1/3 1/2 ]   [ 0  0   0  ]

  = S + (A − S).

• Now

Ax = (S + (A − S))x = b,

and therefore Sx + (A − S)x = b. Hence we may write

x = S^{−1}b − S^{−1}(A − S)x,

where we assume that S^{−1} exists.


• Given an initial guess x_0 of the solution of Ax = b, one may consider the following iterative scheme:

x_{k+1} = S^{−1}b − S^{−1}(A − S) x_k,   (5.2)

where −S^{−1}(A − S) is the iteration matrix.

• Clearly if x_k → x as k → ∞, then we have

x = A^{−1}b.

• From the results in the previous sections, we know that Eq. (5.2) converges if and only if there is a matrix norm ∥ · ∥M such that

∥S^{−1}(A − S)∥M < 1.

Therefore we have the following proposition.

Proposition 12 If

∥S^{−1}(A − S)∥M < 1,

then the iterative scheme (5.2) converges to the solution of

Ax = b.


Example 6 Let A be the matrix in Section 5.5 and let b = (5, 10, 5)^T. We use x_0 = (0, 0, 0)^T as the initial guess.

Case 1: S = I (the 3 × 3 identity matrix). Then

x_{k+1} = b − (A − I) x_k
        = [ 5  ]   [ −1/2 1/3   0  ]
          [ 10 ] − [ 1/3   0   1/3 ] x_k,
          [ 5  ]   [  0   1/3 −1/2 ]

x_1 = (5.0000, 10.0000, 5.0000)^T,
x_2 = (4.1667, 6.6667, 4.1667)^T,
x_3 = (4.8611, 7.2222, 4.8611)^T,
x_4 = (5.0231, 6.7593, 5.0231)^T,
...
x_30 = (5.9983, 6.0014, 5.9983)^T.

When S = I, this is called the Richardson method.


Example 7 Case 2: S = Diag(1/2, 1, 1/2). Therefore

x_{k+1} = S^{−1}b − S^{−1}(A − S) x_k
        = (10, 10, 10)^T − [  0  2/3  0  ]
                           [ 1/3  0  1/3 ] x_k,
                           [  0  2/3  0  ]

x_1 = (10.0000, 10.0000, 10.0000)^T,
x_2 = (3.3333, 3.3333, 3.3333)^T,
x_3 = (7.7778, 7.7778, 7.7778)^T,
...
x_30 = (6.0000, 6.0000, 6.0000)^T.

When S = Diag(a_11, · · · , a_nn), this is called the Jacobi method.


Example 8 Case 3: S is the lower triangular part of A (including the diagonal):

S = [ 1/2  0   0  ]
    [ 1/3  1   0  ]
    [  0  1/3 1/2 ].

Then

x_{k+1} = S^{−1}b − S^{−1}(A − S) x_k
        = (10, 20/3, 50/9)^T − S^{−1} [ 0 1/3  0  ]
                                      [ 0  0  1/3 ] x_k,
                                      [ 0  0   0  ]

x_1 = (10.0000, 20/3, 50/9)^T,
x_2 = (5.5556, 6.2963, 5.8025)^T,
x_3 = (5.8025, 6.1317, 5.9122)^T,
x_4 = (5.9122, 6.0585, 5.9610)^T,
...
x_14 = (6.0000, 6.0000, 6.0000)^T.

When S is the lower triangular part of the matrix A, this method is called the Gauss-Seidel method.
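The three splittings are easy to experiment with numerically. Below is a sketch (the names are mine) that runs the generic scheme (5.2) with the Jacobi and Gauss-Seidel choices of S on the system of Examples 6-8:

import numpy as np

A = np.array([[0.5, 1/3, 0.0],
              [1/3, 1.0, 1/3],
              [0.0, 1/3, 0.5]])
b = np.array([5.0, 10.0, 5.0])

def splitting_iteration(A, b, S, iters=30):
    # x_{k+1} = S^{-1} b - S^{-1}(A - S) x_k, starting from x_0 = 0
    x = np.zeros_like(b)
    for _ in range(iters):
        x = np.linalg.solve(S, b - (A - S) @ x)
    return x

S_jacobi = np.diag(np.diag(A))      # Jacobi: diagonal part of A
S_gs = np.tril(A)                   # Gauss-Seidel: lower triangular part of A
print(splitting_iteration(A, b, S_jacobi))   # approaches (6, 6, 6)
print(splitting_iteration(A, b, S_gs))       # approaches (6, 6, 6) faster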


5.6 Spectral Radius

Definition 4 Given an n × n square matrix A, the spectral radius of A is defined as

ρ(A) = max{|λ| : det(A − λI) = 0},

or in other words, if λ1, λ2, · · · , λn are the eigenvalues of A, then ρ(A) = max_i{|λi|}.

Example 9

A = A1 + A2 ≡ [ 0 0 ] + [ 0 −1 ] = [ 0 −1 ]
              [ 1 0 ]   [ 0  0 ]   [ 1  0 ].

The eigenvalues of A are ±i and |i| = |−i| = 1. Therefore ρ(A) = 1 in this case, while ρ(A1) = ρ(A2) = 0.

Remark 4 Here ρ does NOT define a norm for square matrices. We note that

1 = ρ(A1 + A2) > ρ(A1) + ρ(A2) = 0 + 0 = 0.

Proposition 13 If A = PBP−1 then ρ(A) = ρ(B).

Proof: Because A and B have the same set of eigenvalues.


Lemma 3 Every square matrix A is unitarily similar to an upper triangular matrix C, i.e., there exists S such that

S* · A · S = C and S · S* = I.

Proposition 14 Every square matrix A is similar to an upper triangular matrix whose off-diagonal elements are arbitrarily small.

Proof: From Lemma 3 we have

S^{−1} A S = [ c_11 c_12 · · ·  c_1n      ]
             [      c_22 c_23  ...        ]
             [           . . . c_{n−1,n}  ]
             [  0              c_nn       ],

an upper triangular matrix.


• Define

E = diag(ε, ε², . . . , ε^n)

to be a diagonal matrix, where ε ≠ 0.

• Now we have

E^{−1}(S^{−1} A S) E = [ c_11 εc_12 ε²c_13 · · · ε^{n−1}c_1n ]
                       [       c_22  εc_23 · · ·  ...        ]
                       [              . . .     εc_{n−1,n}   ]
                       [  0                       c_nn       ],

i.e.

(SE)^{−1} A (SE) = diag(c_11, . . . , c_nn) + U,

where SE depends on ε and

|U_ij| = |c_ij ε^{j−i}| ≤ ε |c_ij| for j > i,  U_ij = 0 for j ≤ i.

We note that ∥U∥M∞ ≤ ε ∥C∥M∞ ≤ ε′, which can be made arbitrarily small.


Proposition 15 For any square matrix A, we have

ρ(A) = inf_{∥·∥M} {∥A∥M}.

Proof: We first show that

ρ(A) ≤ inf_{∥·∥M} {∥A∥M}.

• Let λ be an eigenvalue of A and x be the corresponding eigenvector. Then

|λ|∥x∥ = ∥λx∥ = ∥Ax∥ ≤ ∥A∥M ∥x∥

(using, in turn, a property of the vector norm, the eigenvalue relation and Proposition 3 on matrix norms).

• Hence |λ| ≤ ∥A∥M, where ∥ · ∥M is an arbitrary matrix norm.

• This implies that

ρ(A) ≤ ∥A∥M for every ∥ · ∥M,

and therefore

ρ(A) ≤ inf_{∥·∥M} {∥A∥M}.


Next we wish to show that

inf_{∥·∥M} {∥A∥M} ≤ ρ(A).

• For any square matrix A, by Proposition 14 there exists S_ε such that

S_ε^{−1} A S_ε = diag(λ1, λ2, . . . , λn) + T,

where ∥T∥M∞ ≤ ε for any ε > 0.

• We have

∥S_ε^{−1} A S_ε∥M∞ ≤ ∥diag(λ1, . . . , λn)∥M∞ + ∥T∥M∞ ≤ ρ(A) + ε,

because the λ_i are the eigenvalues of A.


• We note that

∥A∥M_ε := ∥S_ε^{−1} A S_ε∥M∞

defines a matrix norm.

• Therefore this matrix norm satisfies ∥A∥M_ε ≤ ρ(A) + ε.

Since ε can be arbitrarily small (but not equal to zero),

inf_{∥·∥M} {∥A∥M} ≤ ρ(A).

Remark 5 If

ρ(A) < 1,

then there exists a matrix norm ∥ · ∥M such that ∥A∥M < 1.


Proposition 16 The iterative scheme

x_k = G x_{k−1} + c

converges to

(I − G)^{−1} c

for any starting vectors x_0 and c if and only if ρ(G) < 1.

Proof: We note that

x_1 = G x_0 + c;
x_2 = G² x_0 + G c + c;
...
x_k = G^k x_0 + ∑_{j=0}^{k−1} G^j c.

Suppose ρ(G) < 1. Then there exists ∥ · ∥M such that ∥G∥M < 1, and therefore ∥G^k∥M → 0 as k → ∞.


• We have

∑_{j=0}^{k−1} G^j → (I − G)^{−1} as k → ∞.

Hence

x_k → (I − G)^{−1} c as k → ∞.

• Conversely, suppose ρ(G) ≥ 1. Then there exists u ≠ 0 such that

G u = λ u and |λ| ≥ 1.

Let x_0 = c = u. Then

x_k = λ^k u + ∑_{j=0}^{k−1} λ^j u = ( ∑_{j=0}^k λ^j ) u,

and the sum ∑_{j=0}^k λ^j (which equals (1 − λ^{k+1})/(1 − λ) for λ ≠ 1) does not converge as k → ∞ when |λ| ≥ 1.


Proposition 17 (Gershgorin's theorem) The eigenvalues of an n × n matrix A are contained in the union of the following n disks D_i, where

D_i = { z ∈ C : |z − A_ii| ≤ ∑_{j=1, j≠i}^n |A_ij| }.

Proof: • Let λ be an eigenvalue of A and x be its corresponding eigenvector, scaled such that ∥x∥∞ = |x_i| = 1.

• This can be done by dividing x by |x_i| = max_j{|x_j|}.

• Since Ax = λx, we have

λ x_i = ∑_{j=1}^n A_ij x_j, and therefore (λ − A_ii) x_i = ∑_{j=1, j≠i}^n A_ij x_j.

Hence

|λ − A_ii| = |(λ − A_ii) x_i| ≤ ∑_{j=1, j≠i}^n |A_ij x_j| ≤ ∑_{j=1, j≠i}^n |A_ij|.

Therefore λ ∈ D_i.
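A short sketch (the function name is mine) that computes the Gershgorin disks of a matrix and checks that every eigenvalue lies in at least one of them:

import numpy as np

def gershgorin_disks(A):
    # return (center, radius) of each Gershgorin disk
    centers = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)   # row sums without the diagonal
    return list(zip(centers, radii))

A = np.array([[2.0, -1.0, 0.0],
              [3.0,  2.0, 1.0],
              [0.0,  1.0, 2.0]])
disks = gershgorin_disks(A)
eigs = np.linalg.eigvals(A)
for lam in eigs:
    # every eigenvalue lies in at least one disk
    assert any(abs(lam - c) <= r + 1e-12 for c, r in disks)
print(disks, eigs)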


Proposition 18 If Q is the column stochastic matrix of a Markov chain (a non-negative matrix with all column sums equal to one), then ρ(Q^k) = 1.

Proof: We note that 1Q = 1, where 1 = (1, 1, . . . , 1).

• Therefore

1 Q^k = 1.

This means that 1 is an eigenvalue of Q^k. Thus we conclude that ρ(Q^k) ≥ 1.

• By using Gershgorin's theorem and the facts that all the entries of Q^k are non-negative and all the column sums of Q^k are equal to one, we have ρ(Q^k) ≤ 1.

• Hence we conclude that ρ(Q^k) = 1.


Proposition 19 The iterative scheme

x_{k+1} = S^{−1}b − S^{−1}(A − S) x_k = (I − S^{−1}A) x_k + S^{−1}b

converges to A^{−1}b if and only if ρ(I − S^{−1}A) < 1.

Proof: Take G = I − S^{−1}A and c = S^{−1}b in Proposition 16.

Proposition 20 If A is row (or column) diagonally dominant, i.e., for i = 1, 2, . . . , n,

2|A_ii| > ∑_{j=1}^n |A_ij|   (respectively 2|A_jj| > ∑_{i=1}^n |A_ij| for each column j),

then the Gauss-Seidel method converges for any starting x_0.

Proof: Let S be the lower triangular part of A (including the diagonal). From Proposition 19 above, one only needs to show that ρ(I − S^{−1}A) < 1.

• Let λ be an eigenvalue of (I − S^{−1}A) and x its corresponding eigenvector, scaled such that ∥x∥∞ = 1. We want to show |λ| < 1.


• We have

(I − S^{−1}A)x = λx,

and therefore

Sx − Ax = λSx for x = (x_1, x_2, . . . , x_n)^T.

In other words, we have

[ 0 −a_12 · · ·     −a_1n    ]       [ a_11  0   · · ·   0   ]
[       0  ...       ...     ] x = λ [ a_21 a_22  ...    ...  ] x.
[           . . . −a_{n−1,n} ]       [  ...       . . .   0   ]
[ 0  · · ·            0      ]       [ a_n1 · · · · · ·  a_nn ]

• Therefore

−(a_12 x_2 + · · · + a_1n x_n) = λ a_11 x_1,
−(a_23 x_3 + · · · + a_2n x_n) = λ (a_21 x_1 + a_22 x_2),
...
−a_{n−1,n} x_n = λ (a_{n−1,1} x_1 + · · · + a_{n−1,n−1} x_{n−1}).


• In general we have

−∑_{j=i+1}^n a_ij x_j = λ ∑_{j=1}^i a_ij x_j  for i = 1, · · · , n − 1.

• Since ∥x∥∞ = 1, there exists i such that

|x_i| = 1 ≥ |x_j| for all j.

• For this i we have

|λ||a_ii| = |λ a_ii x_i| ≤ ∑_{j=i+1}^n |a_ij| + |λ| ∑_{j=1}^{i−1} |a_ij|,

and therefore

|λ| ≤ ( ∑_{j=i+1}^n |a_ij| ) / ( |a_ii| − ∑_{j=1}^{i−1} |a_ij| ) < 1.


5.7 The Successive Over-Relaxation (SOR) Method

Solving Ax = b, one may split A as follows:

A = (L + wD) + (1 − w)D + U,

where L is the strictly lower triangular part, D the diagonal part and U the strictly upper triangular part of A.

Example 10

[ 2 1 ]   [ 0 0 ]     [ 2 0 ]           [ 2 0 ]   [ 0 1 ]
[ 1 2 ] = [ 1 0 ] + w [ 0 2 ] + (1 − w) [ 0 2 ] + [ 0 0 ]
            L           D                  D         U

One may consider the iterative scheme with S = L + wD as follows:

x_{n+1} = S^{−1}b + S^{−1}(S − A) x_n = S^{−1}b + (I − S^{−1}A) x_n.

We remark that

I − S^{−1}A = I − (L + wD)^{−1}A.
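A sketch of the SOR scheme in the form used in these notes, with S = L + wD (so that w = 1 reproduces Gauss-Seidel); the function name and the small test system are mine:

import numpy as np

def sor_solve(A, b, w, iters=100):
    # x_{k+1} = S^{-1} b + (I - S^{-1} A) x_k with S = L + w D
    L = np.tril(A, -1)              # strictly lower triangular part
    D = np.diag(np.diag(A))         # diagonal part
    S = L + w * D
    x = np.zeros_like(b)
    for _ in range(iters):
        x = np.linalg.solve(S, b - (A - S) @ x)
    return x

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
b = np.array([4.0, 5.0])
print(sor_solve(A, b, w=1.0))    # Gauss-Seidel limit: approx. (1, 2)
print(sor_solve(A, b, w=0.9))    # another admissible choice with w > 1/2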


• Moreover, when w = 1, the method is just the Gauss-Seidel method. This method is called the SOR method.

• It is clear that the method converges if and only if the iteration matrix has a spectral radius less than one.

Proposition 21 The SOR method converges to the solution of Ax = b if and only if

ρ(I − (L + wD)^{−1}A) < 1.

• We are going to prove that if A is a positive definite Hermitian matrix and w > 1/2, then the SOR method converges. Let us recall that:

1. A is Hermitian if A = A*.

2. Define < x, y > = ∑_{i=1}^n x̄_i y_i. Then < x, λy > = λ < x, y > and < λx, y > = λ̄ < x, y >.

3. If A is Hermitian then < Ax, y > = < x, Ay >.

4. A is positive definite if < Ax, x > > 0 for x ≠ 0.


Proposition 22 Let A be a positive definite Hermitian matrix. If w > 1/2, then the SOR method converges.

Proof: We will show that ρ(I − S^{−1}A) < 1, so that the SOR method converges.

• Let λ be an eigenvalue of

G = I − S^{−1}A

and x be the corresponding eigenvector.

• Therefore we have

Gx = (I − S^{−1}A)x = λx. (5.3)

Let

y = (I − G)x = S^{−1}Ax. (5.4)

Then we have

Sy = (L + wD)y = S · S^{−1}Ax = Ax.


Hence we conclude that

< Ly + wDy, y > = < Ly, y > + < wDy, y > = < Ax, y >. (5.5)

We note that

AGx = A(I − S^{−1}A)x = Ax − Ay   (by (5.3) and (5.4)) (5.6)
    = Sy − Ay = (S − A)y (5.7)
    = (−(1 − w)D − U)y. (5.8)

Hence

< y, AGx > = −< y, Dy > + < y, wDy > − < y, Uy >. (5.9)

Adding Eq. (5.5) and Eq. (5.9) together, we get

< Ly, y > + < wDy, y > − < y, Dy > + < y, wDy > − < y, Uy > = < Ax, y > + < y, AGx >. (5.10)

We note that

< Ly, y > = < y, L*y > = < y, Uy >,

so we have

< wDy, y > − < y, (1 − w)Dy > = < Ax, y > + < y, AGx >. (5.11)


Since

y = (I − G)x = x − λx = (1 − λ)x,

we have

< wDy, y > = < w(1 − λ)Dx, (1 − λ)x > = w (1 − λ̄)(1 − λ) < Dx, x >

and

< y, Dy > = < (1 − λ)x, (1 − λ)Dx > = (1 − λ̄)(1 − λ) < x, Dx >.

We note that (1 − λ̄)(1 − λ) = |1 − λ|² and

< y, wDy > = |1 − λ|² w < Dx, x >.

Hence the L.H.S. of Eq. (5.11) becomes

(2w − 1)|1 − λ|² < Dx, x >,

and the R.H.S. of Eq. (5.11) becomes

< Ax, y > + < y, AGx > = < Ax, (1 − λ)x > + < (1 − λ)x, λAx >
                       = (1 − λ) < Ax, x > + (1 − λ̄)λ < x, Ax >
                       = (1 − |λ|²) < Ax, x >.


• Now we observe that

(2w − 1) |1 − λ|² < Dx, x > = (1 − |λ|²) < Ax, x >,

where (2w − 1) > 0, |1 − λ|² ≥ 0 and < Dx, x >, < Ax, x > > 0.

Thus we have |λ| ≤ 1, but we wish to prove |λ| < 1. We shall show that |λ| ≠ 1.

Case 1: If |λ| = 1 and λ ≠ 1, then the right-hand side is 0 while the left-hand side equals (2w − 1)|1 − λ|² < Dx, x > > 0 (for instance, for λ = −1 it equals 4(2w − 1) < Dx, x >). This is NOT possible.

Case 2: If λ = 1, then y = (1 − λ)x = 0. We have 0 = Sy = Ax with x ≠ 0. Again this is NOT possible, since A is positive definite.

Hence we have |λ| < 1, i.e. ρ(I − S^{−1}A) < 1.

Proposition 23 If A is a positive definite Hermitian matrix, then the Gauss-Seidel method converges to the solution of Ax = b.

Proof: Take w = 1 and apply Proposition 22. The result follows.


5.8 Steepest Descent Method and Conjugate Gradient Method

We consider the problem of solving Ax = b such that

(1) A is an n × n matrix;
(2) A is symmetric, i.e., A^T = A;
(3) A is positive definite, i.e., x^T A x > 0 for x ≠ 0.

Remark 6 Condition (3) implies that A^{−1} exists.

Recall the properties of the inner product in Rn:

< x, y > = x^T y = ∑_{i=1}^n x_i y_i.

(i) < x, y > = < y, x >;
(ii) < αx, y > = α < x, y >;
(iii) < x, αy > = α < x, y >;
(iv) < x + y, z > = < x, z > + < y, z >;
(v) < x, Ay > = < A^T x, y >.


Proposition 24 If A is symmetric and positive definite, then the problem of solving

Ax = b

is equivalent to the problem of minimizing

q(x) = < x, Ax > − 2 < x, b >.

Proof: Let v be a vector and t be a scalar. We consider the function

q(x + tv) = < x + tv, A(x + tv) > − 2 < x + tv, b >
          = < x, Ax > + t < x, Av > + t < v, Ax > + t² < v, Av > − 2 < x, b > − 2t < v, b >
          = q(x) + 2t < v, Ax > − 2t < v, b > + t² < v, Av >
          = q(x) + 2t < v, Ax − b > + t² < v, Av >.


• Now one can regard this as a function of t:

q(x + tv) = f(t) = q(x) + 2 < v, Ax − b > t + < v, Av > t².

In fact it is a quadratic function in t. Moreover, f(t) attains its minimum at the t satisfying f′(t) = 0, i.e.

2 < v, Ax − b > + 2 < v, Av > t = 0.

• Solving the equation, we have

t* = < v, b − Ax > / < v, Av >.

• We remark that < v, Av > ≠ 0 because A is positive definite (for v ≠ 0).


• Therefore

q(x + t*v) = q(x) + t* {2 < v, Ax − b > + < v, Av > t*}
           = q(x) + t* {2 < v, Ax − b > + < v, b − Ax >}
           = q(x) + t* {< v, Ax − b >}
           = q(x) − < v, b − Ax >² / < v, Av >,

where the last term is non-negative.

• We note that a reduction in the value of q(x) always occurs in passing from x to x + t*v, unless < v, b − Ax > = 0; in that case v is orthogonal to b − Ax.

• So if b − Ax ≠ 0, then we can find a vector v such that

< v, b − Ax > ≠ 0 and q(x + t*v) < q(x),

and x is NOT the minimizer of q(x).

• If b − Ax = 0, then q(x + t*v) = q(x) for any vector v (indeed q(x + tv) = q(x) + t² < v, Av > ≥ q(x) for any t and v). Therefore x is the minimizer.


• One may design an iterative method by using the idea in Proposition 24.

• Given A, an n × n symmetric positive definite matrix, and b, an n × 1 vector.

• With x_0, an initial guess of the solution of Ax = b, we develop an iterative algorithm, namely the steepest descent method. The iterative method reads:

Input: Max, A, b, x_0, Error-tol.
k = 0; r_0 = b − A x_0 (initial residual).
While ∥r_k∥ ≥ Error-tol and k < Max
    r_k = b − A x_k;
    t_k = < r_k, r_k > / < r_k, A r_k >;
    x_{k+1} = x_k + t_k · r_k;
    k = k + 1;
end


We remark that r_k is the search direction and t_k is the step size. In the iterative method, t = t* from Proposition 24 with

v = r = b − Ax.

Example 11

A = [ 2 1 ],  b = [ 4 ],  x_0 = [ 0 ].
    [ 1 2 ]       [ 5 ]         [ 0 ]

k    x_k                  t        ∥r_k∥2
1    (1.34, 1.68)^T       0.3361   6.4031
2    (0.98, 1.97)^T       0.9762   0.4724
3    (1.01, 1.99)^T       0.3361   0.1012
4    (0.99, 1.99)^T       0.9762   0.0075
5    (1.00, 1.99)^T       0.3361   0.0016
...

The true solution is (1, 2)^T. The steepest descent method is rarely used on its own because its convergence rate is often "too slow".
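A sketch of the steepest descent method above applied to Example 11 (the function name is mine):

import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=1000):
    # steepest descent for symmetric positive definite A;
    # the residual is used as the search direction
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = b - A @ x                   # residual = search direction
        if np.linalg.norm(r) < tol:
            break
        t = (r @ r) / (r @ (A @ r))     # optimal step size t*
        x = x + t * r
    return x

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
b = np.array([4.0, 5.0])
print(steepest_descent(A, b, np.zeros(2)))   # approx. (1, 2)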


5.8.1 Conjugate Gradient Method

Definition 5 (A-orthonormality) Assuming that A is an n × n symmetric positive definite matrix, suppose that a set of vectors {u_1, u_2, · · · , u_n} is provided and has the property

⟨u_i, A u_j⟩ = δ_ij,

where

δ_ij = 1 if i = j and δ_ij = 0 if i ≠ j.

This property is called A-orthonormality. Clearly it is a generalization of ordinary orthonormality, where A = I_n.

Remark 7 Here ∥x∥A² = < x, Ax > defines a norm on Rn.

Proposition 25 Let

U = (u_1, · · · , u_n).

Then

U^T A U = I_n.

Proof: It follows from the definition.


Proposition 26 The set {u_1, · · · , u_n} forms a basis for Rn.

Proof: We need only show that the {u_i} are linearly independent. Suppose

∑_{i=1}^n α_i u_i = 0.

Then, for j = 1, · · · , n,

0 = ⟨∑_{i=1}^n α_i u_i, A u_j⟩ = ∑_{i=1}^n α_i ⟨u_i, A u_j⟩ = α_j ⟨u_j, A u_j⟩ = α_j.

Hence α_j = 0 for j = 1, · · · , n.

• This shows that the {u_i} are linearly independent and hence form a basis for Rn.


Proposition 27 Let {u_1, · · · , u_n} be an A-orthonormal system. Define the following recursive scheme:

x_i = x_{i−1} + ⟨b − A x_{i−1}, u_i⟩ u_i

for i = 1, 2, · · · , n, in which x_0 is an arbitrary vector in Rn. Then we have

A x_n = b.

Proof: Define

t_i = ⟨b − A x_{i−1}, u_i⟩.

The iterative method reads

x_i = x_{i−1} + t_i u_i.

We note that

A x_i = A x_{i−1} + t_i A u_i.

Therefore

A x_n = A x_{n−1} + t_n A u_n = A x_{n−2} + t_{n−1} A u_{n−1} + t_n A u_n = · · ·


• Finally we have

A x_n = A x_0 + t_1 A u_1 + · · · + t_n A u_n.

• Now

⟨A x_n − b, u_i⟩ = ⟨A x_0 − b, u_i⟩ + t_i.

Since

t_i = ⟨b − A x_{i−1}, u_i⟩
    = ⟨b − A x_0 + A x_0 − A x_1 + A x_1 + · · · − A x_{i−1}, u_i⟩
    = ⟨b − A x_0, u_i⟩ + ⟨A x_0 − A x_1, u_i⟩ + ⟨A x_1 − A x_2, u_i⟩ + · · · + ⟨A x_{i−2} − A x_{i−1}, u_i⟩
    = ⟨b − A x_0, u_i⟩ + ⟨−t_1 A u_1, u_i⟩ + · · · + ⟨−t_{i−1} A u_{i−1}, u_i⟩
    = ⟨b − A x_0, u_i⟩,

we have

⟨A x_n − b, u_i⟩ = 0 for i = 1, · · · , n, and hence A x_n − b = 0,

because A x_n − b is orthogonal to all the u_i and must therefore be the zero vector.


Definition 6 (A-orthogonality) Assuming A is an n × n symmetric positive definite matrix, a set of vectors

{v_1, · · · , v_n}

is said to be A-orthogonal if

⟨v_i, A v_j⟩ = 0 whenever i ≠ j.

Proposition 27 can be extended as follows.

Proposition 28 Let

{v_1, · · · , v_n}

be an A-orthogonal system of non-zero vectors for a symmetric positive definite n × n matrix A. Define

x_i = x_{i−1} + (⟨b − A x_{i−1}, v_i⟩ / ⟨v_i, A v_i⟩) v_i,

in which x_0 is arbitrary. Then A x_n = b.


• The CG algorithm reads:

Given an initial guess x_0, A, b, Max, tol:
r_0 = b − A x_0;
v_0 = r_0;
For k = 0 to Max − 1 do
    If ∥v_k∥2 = 0 then stop
    t_k = < r_k, r_k > / < v_k, A v_k >;
    x_{k+1} = x_k + t_k v_k;
    r_{k+1} = r_k − t_k A v_k;
    If < r_{k+1}, r_{k+1} > < tol then stop
    v_{k+1} = r_{k+1} + (< r_{k+1}, r_{k+1} > / < r_k, r_k >) v_k;
end;
Output x_{k+1}, ∥r_{k+1}∥2.
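A sketch of the CG algorithm above (the function name is mine); for a symmetric positive definite system it reaches the solution in at most n steps in exact arithmetic:

import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-12, max_iter=None):
    # conjugate gradient method for symmetric positive definite A
    x = np.asarray(x0, dtype=float)
    r = b - A @ x
    v = r.copy()
    rr = r @ r
    steps = len(b) if max_iter is None else max_iter
    for _ in range(steps):
        Av = A @ v
        t = rr / (v @ Av)            # step size t_k
        x = x + t * v
        r = r - t * Av               # updated residual
        rr_new = r @ r
        if rr_new < tol:
            break
        v = r + (rr_new / rr) * v    # next search direction
        rr = rr_new
    return x

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
b = np.array([4.0, 5.0])
print(conjugate_gradient(A, b, np.zeros(2)))   # approx. (1, 2)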


• The main computational cost in the CG algorithm comes from the matrix-vector multiplication of the form Ax. It takes at most O(n²) operations.

• If A is not symmetric, one can consider the normal equation:

A^T A x = A^T b.

• The algorithm converges in at most n steps (in exact arithmetic). However, it can be faster, as the convergence rate of this method also depends on the spectrum of the matrix A_n. For example, if the spectrum of A_n is contained in an interval, i.e. σ(A_n) ⊆ [a, b], then the error in the i-th iteration satisfies

∥e_i∥ / ∥e_0∥ ≤ 2 ((√b − √a)/(√b + √a))^i,

i.e. the convergence rate is linear. Hence an approximate upper bound for the number of iterations required to make the relative error ∥e_i∥/∥e_0∥ ≤ δ is given by

(1/2)(√(b/a) − 1) log(2/δ) + 1.


• Very often the CG method is used with a matrix called a preconditioner to accelerate its convergence rate.

• A good preconditioner C should satisfy the following conditions:

(i) The matrix C can be constructed easily;

(ii) Given a right-hand-side vector r, the linear system Cy = r can be solved efficiently; and

(iii) The spectrum (or singular values) of the preconditioned system C^{−1}A should be clustered around one.


• In the Preconditioned Conjugate Gradient (PCG) method, we solve the linear system

C^{−1}Ax = C^{−1}b

instead of the original linear system

Ax = b.

We expect that the faster convergence rate of the PCG method more than compensates for the extra cost of solving the preconditioner system

Cy = r

in each iteration step of the PCG method.


• Apart from the condition-number approach, condition (iii) is also very commonly used in proving convergence rates. In the following we give the definition of clustering.

Definition 7 We say that a sequence of matrices S_n of size n has a clustered spectrum around one if for all ϵ > 0 there exist non-negative integers n_0 and n_1 such that for all n > n_0, at most n_1 eigenvalues of the matrix

S*_n S_n − I_n

have absolute values larger than ϵ.

• One sufficient condition for the matrix to have eigenvalues clustered around one is that

H_n = I_n + L_n,

where I_n is the n × n identity matrix and L_n is a low-rank matrix (rank(L_n) is bounded above, independently of the matrix size n).


5.9 A Summary of Learning Outcomes

1. Able to compute a given vector norm, for example ∥·∥1, ∥·∥2 and ∥·∥∞.

2. Able to define and compute a matrix norm (induced by a vector norm via the sup definition) and examples such as ∥·∥M1, ∥·∥M2 and ∥·∥M∞.

3. Able to show some properties of a matrix norm, e.g. ∥A · B∥M ≤ ∥A∥M · ∥B∥M.

4. Able to recognize and apply the iterative scheme x_{k+1} = (I − A)x_k + r for solving Ax = r under the condition that ∥I − A∥M < 1 for some matrix norm.

5. Able to recognize the preconditioning technique: find a matrix B such that

∥I − B^{−1}A∥M < 1

for some matrix norm and such that systems with B are easy to solve; then apply the iterative scheme to solve B^{−1}Ax = B^{−1}r instead of Ax = r.

6. Able to apply the power method for the largest eigenvalue and the corresponding eigenvector of a square matrix.

7. Able to program and apply conjugate gradient type methods for solving systems of linear equations.


6 Markovian Queueing Networks, Manufacturing and Re-manufacturing Systems

6.1 A Single Markovian Queue (M/M/s/n-s-1)

• λ: input rate (Arrival Rate).
• µ: output rate (Service Rate, Production Rate).

[Figure: the M/M/s/n−s−1 queue — arrivals at rate λ join a waiting line with n − s − 1 buffer places and are served by s parallel exponential servers, each of rate µ. The diagram marks empty buffer places, customers waiting in the queue, and customers being served.]


6.1.1 The Steady-state Distribution

• Let pi be the steady-state probability that there are i customers in the queueing system. Here p = (p0, . . . , pn−1)^T is the steady-state probability vector.

• Important for system performance analysis, e.g. the average waiting time of the customers in the long run.

• B. Bunday, Introduction to Queueing Theory, Arnold, N.Y., (1996).

• Here the pi are governed by the Kolmogorov equations:

[Figure: balance of rates at state i — outgoing rates λ (to state i + 1) and iµ (to state i − 1); incoming rates λ from state i − 1 and (i + 1)µ from state i + 1. For 0 < i < s this gives (λ + iµ) pi = λ pi−1 + (i + 1)µ pi+1; for i ≥ s the service rate is capped at sµ.]


[Figure: the Markov chain of the M/M/s/n−s−1 queue — a birth-death chain on the states 0, 1, . . . , n − 1 with birth rate λ and death rate min(i, s)µ in state i.]

• We are solving: A0 p0 = 0, Σi pi = 1, pi ≥ 0.

• A0, the generator matrix, is given by the n × n tridiagonal matrix:

A0 =
[  λ     −µ                                    ]
[ −λ    λ+µ    −2µ                             ]
[       −λ    λ+2µ    −3µ                      ]
[               ·       ·       ·              ]
[              −λ     λ+sµ    −sµ              ]
[                       ·       ·       ·      ]
[                      −λ     λ+sµ    −sµ      ]
[                              −λ      sµ      ]
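• A minimal Python sketch that builds A0 and computes the steady-state distribution by solving A0 p = 0 together with Σ pi = 1 (the parameter values below are illustrative assumptions):

```python
import numpy as np

def mmsn_generator(lam, mu, s, n):
    """Generator A0 of the M/M/s/n-s-1 queue (n states), as displayed above."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = (lam if i < n - 1 else 0.0) + min(i, s) * mu
        if i > 0:
            A[i, i - 1] = -lam                    # arrival moves state i-1 -> i
        if i < n - 1:
            A[i, i + 1] = -min(i + 1, s) * mu     # departure moves state i+1 -> i
    return A

# Illustrative parameters: lam = 1, mu = 0.8, s = 2, n = 8.
A0 = mmsn_generator(1.0, 0.8, 2, 8)

# Solve A0 p = 0 with the normalization sum(p) = 1 via least squares.
M = np.vstack([A0, np.ones(8)])
rhs = np.zeros(9); rhs[-1] = 1.0
p, *_ = np.linalg.lstsq(M, rhs, rcond=None)
print(p, p.sum())
```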


6.2 Two-Queue Free Models

[Figure: two independent Markovian queues — queue 1 with arrival rate λ1, s1 servers of rate µ1 and n1 − s1 − 1 buffer places; queue 2 with arrival rate λ2, s2 servers of rate µ2 and n2 − s2 − 1 buffer places.]


• Let pi,j be the steady-state probability that there are i customers in queue 1 and j customers in queue 2. Then the Kolmogorov equations for the two-queue network are:

[Figure: balance of rates at state (i, j) — outgoing rates λ1, λ2, iµ1 and jµ2; incoming rates λ1 from (i − 1, j), λ2 from (i, j − 1), (i + 1)µ1 from (i + 1, j) and (j + 1)µ2 from (i, j + 1).]


• Again we have to solve A1 p = 0, Σi,j pij = 1, pij ≥ 0.

• The generator matrix A1 is separable (no interaction between the queues):

A1 = A0 ⊗ I + I ⊗ A0.

• Kronecker tensor product of two matrices An×r and Bm×k:

An×r ⊗ Bm×k =
[ a11 B   a12 B   · · ·   a1r B ]
[ a21 B   a22 B   · · ·   a2r B ]
[   ·       ·       ·       ·   ]
[ an1 B   an2 B   · · ·   anr B ]

which is an nm × rk matrix.

• It is easy to check that the Markov chain of the queueing system is irreducible and that the unique solution is p = p0 ⊗ p0, where p0 is the steady-state probability vector of the single queue.
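• A small Python check of this tensor structure (the single-queue parameters are illustrative assumptions; steady_state solves the generator equation by least squares, as in the earlier sketch):

```python
import numpy as np

def steady_state(A):
    """Solve A p = 0 with sum(p) = 1 (columns of the generator sum to zero)."""
    m = A.shape[0]
    M = np.vstack([A, np.ones(m)])
    rhs = np.zeros(m + 1); rhs[-1] = 1.0
    p, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return p

# A small single-queue generator A0 (illustrative: s = 1, n = 4 states).
lam, mu, n = 1.0, 1.5, 4
A0 = np.zeros((n, n))
for i in range(n):
    A0[i, i] = (lam if i < n - 1 else 0.0) + (mu if i > 0 else 0.0)
    if i > 0:
        A0[i, i - 1] = -lam
    if i < n - 1:
        A0[i, i + 1] = -mu

I = np.eye(n)
A1 = np.kron(A0, I) + np.kron(I, A0)     # separable two-queue generator
p0 = steady_state(A0)
p1 = steady_state(A1)
print(np.allclose(p1, np.kron(p0, p0)))  # expect True: p = p0 ⊗ p0
```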


6.3 Two-Queue Overflow Networks

[Figure: the two-queue overflow network — the same two queues as in the free model, with an additional overflow path: an arrival to queue 1 that finds it full overflows to queue 2.]


• The generator matrix A2 is given by

A2 = A0 ⊗ I + I ⊗ A0 + Diag(0, . . . , 0, 1) ⊗ R0,

where

R0 = λ1 ×
[  1                        ]
[ −1    1                   ]
[        ·     ·            ]
[             −1    1       ]
[                  −1    0  ]

describes the overflow discipline of the queueing system.

• In fact, we may write

A2 = A1 + Diag(0, . . . , 0, 1) ⊗ R0.

• Unfortunately, an analytic solution for the steady-state distribution p is not available.


• The generator matrices are sparse and have block structures.

• Direct methods (LU decomposition will result in dense factors L and U) are not efficient in general.

• Block Gauss-Seidel (BGS) is the usual approach for the above queueing problems. Its convergence is not fast, and the number of iterations in general increases linearly with the size of the generator matrix.

• A fast algorithm should make use of the block structure and the sparsity of the generator matrices. We shall apply Preconditioned Conjugate Gradient type methods.

• R. Varga, Matrix Iterative Analysis, Prentice-Hall, N.J., (1963).


6.4 Practical Examples of Queueing Systems

6.4.1 The Telecommunication System

[Figure: the telecommunication system — n external queues (queue i with arrival rate λi and si servers of rate µi) overflow into a main queue of size N with its own arrival rate λ and s servers of rate µ.]

• K. Hellstern, The Analysis of a Queue Arising in Overflow Models, IEEETrans. Commun., 37 (1989).

• W. Ching, R. Chan and X. Zhou, Circulant Preconditioners for Markov Modulated Poisson Processes and Their Applications to Manufacturing Systems, SIAM J. Matrix Anal., 17 (1997).


• We may regard the telecommunication network as an (MMPP/M/s/s+N) queueing system.

• An MMPP is a Poisson process whose instantaneous rate is itself a stationary random process which varies according to an irreducible n-state Markov chain (when n = 1, it is just the Poisson process).

• Important in the analysis of the blocking probability and the system utilization.

• M. Neuts, Matrix-Geometric Solutions in Stochastic Models, Johns Hopkins University Press, M.D., (1981).

• J. Flood, Telecommunication Switching Traffic and Networks, Prentice-Hall, N.Y., (1995).


• Generator matrix is given by

A3 =
[ Q+Γ      −µI                                        ]
[  −Γ    Q+Γ+µI     −2µI                              ]
[            ·         ·         ·                    ]
[           −Γ    Q+Γ+sµI      −sµI                   ]
[                      ·          ·        ·          ]
[                     −Γ     Q+Γ+sµI     −sµI         ]
[                               −Γ       Q+sµI        ]

((N + 1)-block by (N + 1)-block), where

Γ = Λ + λI2n,

Q = (Q1 ⊗ I2 ⊗ · · · ⊗ I2) + (I2 ⊗ Q2 ⊗ I2 ⊗ · · · ⊗ I2) + · · · + (I2 ⊗ · · · ⊗ I2 ⊗ Qn),

Λ = (Λ1 ⊗ I2 ⊗ · · · ⊗ I2) + (I2 ⊗ Λ2 ⊗ I2 ⊗ · · · ⊗ I2) + · · · + (I2 ⊗ · · · ⊗ I2 ⊗ Λn),

Qj =
[  σj1   −σj2 ]
[ −σj1    σj2 ]

and

Λj =
[ λj   0 ]
[  0   0 ].


6.4.2 The Manufacturing System of Two Machines in Tandem

[Figure: two machines in tandem — machine M1 (rate µ1) feeds buffer B1 of size l, which feeds machine M2 (rate µ2), which feeds buffer B2 of size N; demand arrives at B2 at rate λ.]

• We search for the optimal buffer sizes l and N (N >> l) which (1) minimize the average running cost, (2) maximize the throughput, or (3) minimize the blocking and starving rates.

• G. Yamazaki, T. Kawashima and H. Sakasegawa, Reversibility of Tandem Blocking Queueing Systems, Manag. Sci., 31 (1985).

• W. Ching, Iterative Methods for Manufacturing Systems of Two Stations in Tandem, Applied Mathematics Letters, 11 (1998).


• The generator matrix is of the form:

A4 =
[ Λ+µ1I        −Σ                                     ]
[  −µ1I     Λ+D+µ1I       −Σ                          ]
[               ·           ·         ·               ]
[             −µ1I      Λ+D+µ1I      −Σ               ]
[                          −µ1I      Λ+D              ]

((l + 1)-block by (l + 1)-block), where

Λ =
[ 0   −λ                 ]
[      λ    −λ           ]
[            ·      ·    ]
[                 λ   −λ ]
[                      λ ]

Σ =
[ 0                   ]
[ µ2    0             ]
[        ·      ·     ]
[              µ2   0 ]

and D = Diag(µ2, · · · , µ2, 0).


6.4.3 The Re-Manufacturing System

[Figure: the re-manufacturing system — returned products join a return inventory of size N, are re-manufactured at rate γ, and feed the inventory of (serviceable) product, which can also be replenished by an outside procurement of size Q.]

• There are two types of inventory to manage: the serviceable product and the returned product. The recycling process is modelled by an M/M/1/N queue.

• The serviceable product inventory level and the outside procurements are controlled by an (r, Q) continuous review policy. Here r is the outside procurement level and Q is the procurement quantity. We assume that N >> Q.

• M. Fleischmann, Quantitative Models for Reverse Logistics, LNEMS 501, Springer, Berlin (2001).

• W. Yuen, W. Ching and M. Ng, A Direct Method for Solving Block-Toeplitz with Near-Circulant-Block Systems with Applications to Hybrid Manufacturing Systems, Numerical Linear Algebra with Applications, 12 (2005) 957-966.


• The generator matrix is given by

A5 =
[  B     −λI                            ]
[ −L      B     −λI                     ]
[          ·      ·       ·             ]
[                −L       B      −λI    ]
[ −λI                    −L       BQ    ]

where

L =
[ 0   µ                ]
[     0    µ           ]
[           ·     ·    ]
[                0   µ ]
[                    0 ]

B = λIN+1 +
[  γ                          ]
[ −γ    γ+µ                   ]
[         ·        ·          ]
[                −γ    γ+µ    ]
[                      −γ   µ ]

and BQ = B − Diag(0, µ, . . . , µ).


6.5 Circulant-based Preconditioners

• Circulant matrices are Toeplitz matrices (constant entries along each diagonal) such that each column is a cyclic shift of its preceding column.

• The class of circulant matrices is denoted by F.

• C ∈ F implies that C can be diagonalized by the Fourier matrix F:

C = F*ΛF.

Hence C−1x = F*Λ−1Fx.

• The eigenvalues of a circulant matrix have an analytic form, which facilitates the spectral analysis of the preconditioned matrix.

• C−1x can be computed in O(n log n) operations via the Fast Fourier Transform (FFT).
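• A minimal Python illustration of this O(n log n) solve (the small circulant below is an assumed example): the eigenvalues Λ of a circulant matrix are the discrete Fourier transform of its first column, so C−1x reduces to two FFTs and one inverse FFT.

```python
import numpy as np

def circulant_solve(c, x):
    """Solve C y = x, where C is the circulant matrix with first column c,
    using C = F* Lambda F: the eigenvalues of C are fft(c)."""
    return np.real(np.fft.ifft(np.fft.fft(x) / np.fft.fft(c)))

# Check against a dense solve on a small example.
c = np.array([4.0, -1.0, 0.0, -1.0])                    # first column of C
C = np.column_stack([np.roll(c, k) for k in range(4)])  # dense circulant
x = np.array([1.0, 2.0, 3.0, 4.0])
print(np.allclose(circulant_solve(c, x), np.linalg.solve(C, x)))  # True
```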

• P. Davis, Circulant Matrices, John Wiley and Sons, N.J. (1985).


• In the queueing problems considered above:

A =
[  λ     −µ                                    ]
[ −λ    λ+µ    −2µ                             ]
[          ·      ·       ·                    ]
[         −λ    λ+sµ    −sµ                    ]
[                  ·       ·       ·           ]
[                 −λ    λ+sµ    −sµ            ]
[                         −λ      sµ           ]

s(A) =
[ λ+sµ    −sµ                              −λ  ]
[  −λ    λ+sµ    −sµ                           ]
[           ·       ·       ·                  ]
[          −λ    λ+sµ     −sµ                  ]
[                   ·        ·       ·         ]
[                  −λ     λ+sµ     −sµ         ]
[ −sµ                       −λ     λ+sµ        ]

We have rank(A− s(A)) = s + 1.


6.5.1 The Telecommunication System

• A3 = I ⊗ Q + A ⊗ I + R ⊗ Λ, where

R =
[  1                       ]
[ −1    1                  ]
[       −1     ·           ]
[               ·    1     ]
[                   −1   0 ]

• s(A3) = s(I) ⊗ Q + s(A) ⊗ I + s(R) ⊗ Λ, with s(I) = I and

s(R) =
[  1                    −1 ]
[ −1    1                  ]
[       −1     ·           ]
[               ·    1     ]
[                   −1   1 ]


6.5.2 The Manufacturing System of Two Machines in Tandem

• Circulant-based approximation of A4:

s(A4) =
[ s(Λ)+µ1I          −s(Σ)                                     ]
[   −µ1I      s(Λ)+s(D)+µ1I       −s(Σ)                       ]
[                 ·                  ·            ·           ]
[               −µ1I         s(Λ)+s(D)+µ1I      −s(Σ)         ]
[                                  −µ1I        s(Λ)+s(D)      ]

((l + 1)-block by (l + 1)-block), where

s(Λ) =
[  λ   −λ                ]
[       λ    −λ          ]
[             ·      ·   ]
[                 λ   −λ ]
[ −λ                  λ  ]

s(Σ) =
[ 0                  µ2 ]
[ µ2    0               ]
[         ·      ·      ]
[              µ2    0  ]

and s(D) = Diag(µ2, · · · , µ2, µ2).


6.5.3 The Re-manufacturing System

• Circulant-based approximation of A5:

s(A5) =
[  s(B)    −λI                                 ]
[ −s(L)    s(B)    −λI                         ]
[             ·       ·        ·               ]
[                  −s(L)     s(B)     −λI      ]
[  −λI                      −s(L)     s(BQ)    ]

where

s(L) =
[ 0   µ                ]
[     0    µ           ]
[           ·     ·    ]
[                0   µ ]
[ µ                  0 ]

s(B) = λIN+1 +
[ γ+µ                     −γ ]
[ −γ    γ+µ                  ]
[          ·       ·         ]
[                −γ    γ+µ   ]

and s(BQ) = s(B) − µI.


• In fact all the generator matrices A take the form

A = Σ_{i=1}^{m} Ai1 ⊗ Ai2 ⊗ · · · ⊗ Ain,

where Ai1 is relatively huge in size.

• Our preconditioner is defined as

C = Σ_{i=1}^{m} s(Ai1) ⊗ Ai2 ⊗ · · · ⊗ Ain.

• We note that

(F ⊗ I ⊗ · · · ⊗ I)* · C · (F ⊗ I ⊗ · · · ⊗ I) = Σ_{i=1}^{m} Λi1 ⊗ Ai2 ⊗ · · · ⊗ Ain = ⊕_{k=1}^{ℓ} ( Σ_{i=1}^{m} λ_{k,i1} Ai2 ⊗ · · · ⊗ Ain ),

which is a block-diagonal matrix. Here Λi1 = Diag(λ_{1,i1}, . . . , λ_{ℓ,i1}) is the diagonal matrix of eigenvalues of the circulant s(Ai1).


• One advantage of our preconditioner is that it can easily be inverted in parallel on a parallel computer. This therefore saves a lot of computational cost.

• Theorem: If all the parameters stay fixed, then the preconditioned matrix has singular values clustered around 1. Thus we expect our PCG method to converge very fast.

• Ai1 ≈ Toeplitz, up to a rank (s + 1) perturbation, ≈ s(Ai1), up to a rank (s + 1) perturbation.

• R. Chan and W. Ching, Circulant Preconditioners for Stochastic Automata Networks, Numerische Mathematik, (2000).


6.5.4 Numerical Results

• Since the generator A is non-symmetric, we use a generalized CG method, the Conjugate Gradient Squared (CGS) method. This method does not require multiplications with A^T.

• Our proposed method is applied to the following systems.

(1) The Telecommunication System.

(2) The Manufacturing Systems of Two Machines in Tandem.

(3) The Re-Manufacturing System.

• P. Sonneveld, A Fast Lanczos-type Solver for Non-symmetric Linear Systems, SIAM J. Sci. Comput., 10 (1989).

• Stopping criterion:

||rn||2 / ||r0||2 < 10^−10,

where rn is the residual at the nth iteration.
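• A generic SciPy illustration of calling the CGS solver and measuring the relative residual (the sparse test matrix below is an assumed example, not one of the queueing generators; note that SciPy's default tolerance is looser than the 10^−10 used in the tables):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cgs

# A nonsymmetric sparse tridiagonal test system (illustrative).
n = 200
A = sp.diags([-1.0, 3.0, -2.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = cgs(A, b)                      # info == 0 means CGS converged
rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(info, rel_res)                     # relative residual ||r_n|| / ||r_0||
```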


6.5.5 The Telecommunications System

• n, number of external queues; N, size of the main queue.

• Cost per iteration:

        I                C                 BGS
   O(n 2^n N)    O(n 2^n N log N)    O((2^n)^2 N)

• Number of iterations (s = 2):

                 n = 1                  n = 4
    N        I     C    BGS         I     C    BGS
    32      155    8    171        161   13    110
    64      ∗∗     7    242        ∗∗    13    199
   128      ∗∗     8    366        ∗∗    14    317
   256      ∗∗     8    601        ∗∗    14    530
   512      ∗∗     8    ∗∗         ∗∗    14    958

• ’∗∗’ means greater than 1000 iterations.


6.5.6 The Manufacturing Systems of Two Machines in Tandem

• l, size of the first buffer; N, size of the second buffer.

• Cost per iteration:

       I           C           BGS
     O(lN)    O(lN log N)     O(lN)

• Number of iterations:

                 l = 1                 l = 4
     N       I     C    BGS        I     C    BGS
    32       34    5     72        64    10    72
    64      129    7    142       139    11   142
   128      ∗∗     8    345        ∗∗    12   401
   256      ∗∗     8    645        ∗∗    12    ∗∗
  1024      ∗∗     8    ∗∗         ∗∗    12    ∗∗

• ’∗∗’ means greater than 1000 iterations.


6.5.7 The Re-Manufacturing System

• Q, size of the serviceable inventory; N, size of the return inventory.

• Cost per iteration:

       I           C           BGS
     O(QN)    O(QN log N)     O(QN)

• Number of iterations:

                Q = 2                  Q = 3                  Q = 4
     N       I     C    BGS        I     C    BGS        I     C    BGS
   100      246    8    870       ∗∗    14   1153       ∗∗    19   1997
   200      ∗∗    10   1359       ∗∗    14    ∗∗        ∗∗    19    ∗∗
   400      ∗∗    10    ∗∗        ∗∗    14    ∗∗        ∗∗    19    ∗∗
   800      ∗∗    10    ∗∗        ∗∗    14    ∗∗        ∗∗    19    ∗∗

• ’∗∗’ means greater than 2000 iterations.


6.6 A Summary of Learning Outcomes

1. Able to recognize and apply networks of Markovian queues for real-world problems.

2. Able to apply the network models to real applications in manufacturing systems, re-manufacturing systems and telecommunication networks.

3. Able to apply the circulant-based preconditioning techniques.
