
    Matrix Computations

    Marko Huhtanen

    1 Introduction

Matrix computations is at the center of numerical analysis, requiring knowledge of several mathematical techniques and at least a rudimentary understanding of programming. In the early era, during the 50s and 60s, the field could be described as being comprised of certain fundamental matrix factorizations and how to compute them reliably. (Then computer architectures were sequential, whereas parallel computing has become the dominant paradigm since.) These included the LU factorization and the singular value decomposition (SVD), as well as algorithms for solving the eigenvalue problem. The complexity of these algorithms is $O(n^3)$ while the storage requirement is $O(n^2)$.

Having these algorithms in an acceptable (early) form did not mean that the numerical linear algebra problems were wiped away. Typically applications involve partial differential equations, already from the very early era of computing in the 40s. Once discretized, the matrices approximating the corresponding linear operators are sparse. Roughly, this means that only $O(n)$ of the entries are nonzero. (The reason: differentiation operates locally on functions at a point.) Stored by taking this into account, i.e., zeros are not stored, the storage requirement is just $O(n)$ as opposed to $O(n^2)$. Consequently, very large matrices could be generated to approximate the original problem. In fact, so large that the existing matrix computational techniques could not be used at all to solve the corresponding linear algebra problems. This was (and still is) certainly irritating: everything has been carefully set up and it just remains to do the matrix computations. Except that it is not going to happen. The solution turns out to be out of reach because of its severe computational complexity. Something that was originally considered a trivial linear algebra problem has turned into the exact opposite, i.e., an exceptionally tough problem. So either you scale down your ambitions and accept coarser approximations (something you do not want to do), or try to solve the linear algebra problems somehow (but not at any cost, since you cannot afford it).

From the 70s onwards, iterative methods started seriously gaining ground for solving very large linear algebra problems without $O(n^3)$ complexity and $O(n^2)$ storage requirement. Analogous developments are still going on in every area of numerical analysis (or scientific computing, or computational science, or whatever you want to call it). As a rule, in executing iterative methods, the matrix (or matrices) related with the problem cannot be manipulated freely. For example, it may be that matrix-vector products with the matrix are the only information available, although you certainly want to avoid such an extreme. Mathematically this means that the underlying assumptions are getting closer to those usually made in, let us say, operator theory. To get an idea of what this could imply as opposed to studying classical matrix analysis: assuming the availability of the Hermitian transpose may be unrealistic.

A reason for writing these lecture notes is the hope of being able to combine classical matrix analytic techniques with those mathematical ideas that are useful in studying iterative methods. Occasionally this means that the viewpoint is more abstract and functional analytic. The abstractness is, however, not an aim. This is underscored by the fact that everything that is done takes place in $\mathbb{C}^{n\times n}$.¹

It is assumed that the students have learnt undergraduate linear algebra and know things such as the Gram-Schmidt orthogonalization process and Gaussian elimination, and know the basics of eigenvalues and eigenvectors, like how to solve tiny (2-by-2 or 3-by-3) eigenvalue problems by hand. One should also be familiar with the standard Euclidean geometry of $\mathbb{C}^n$ originating from the inner product
$$(x, y) = y^*x = \sum_{j=1}^{n} x_j\bar{y}_j$$
of vectors $x, y \in \mathbb{C}^n$. (This is, of course, needed in the Gram-Schmidt orthogonalization.) The Hermitian transpose of a matrix $A$ is the unique matrix $A^*$ satisfying $(Ax, y) = (x, A^*y)$ for all vectors $x$ and $y$.

    2 Product of matrix subspaces

Factoring matrices is the way to solve small linear algebra problems. (Small is relative. It typically means that you can use a PC to finish the computations in reasonable time.) The most well-known case is that of solving a linear system
$$Ax = b \qquad (1)$$
with a nonsingular $A \in \mathbb{C}^{n\times n}$ and a vector $b \in \mathbb{C}^n$ given. The solution can be obtained by Gaussian elimination by computing the LU factorization of $A$, i.e., $A$ is represented as (actually replaced with) the product
$$A = LU, \qquad (2)$$

¹Here it is appropriate to quote P. Halmos: "Matrix theory is operator theory in the most important and the most translucent special case."


where $L$ is a lower triangular and $U$ an upper triangular matrix.² For the computational complexity, it requires $O(n^3)$ floating point operations to compute the LU factorization.
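As a quick illustration (not part of the notes), a minimal Python sketch, assuming NumPy and SciPy are available, computes an LU factorization with a library routine. Note that scipy.linalg.lu performs the partially pivoted elimination discussed later in these notes, returning $P$, $L$, $U$ with $A = PLU$.

```python
import numpy as np
from scipy.linalg import lu

A = np.random.rand(5, 5)            # a generic (hence LU-factorizable) matrix
P, L, U = lu(A)                     # A = P @ L @ U  (pivoted form, see Section 6)
print(np.allclose(A, P @ L @ U))    # True up to rounding errors
print(np.allclose(L, np.tril(L)), np.allclose(U, np.triu(U)))
```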

In view of large-scale problems, it is beneficial to approach factoring matrices more generally. A reason for this is that then approximate factoring is more realistic than factoring exactly. (Also in small computational problems the equality (2) holds only approximately, because of finite precision.)

The notion of a matrix subspace is a reasonably flexible structure to this end. That is, $\mathcal{V} \subset \mathbb{C}^{n\times n}$ is a matrix subspace of $\mathbb{C}^{n\times n}$ over $\mathbb{C}$ (or $\mathbb{R}$) if
$$\alpha V_1 + \beta V_2 \in \mathcal{V}$$
holds whenever $\alpha, \beta \in \mathbb{C}$ (or $\mathbb{R}$) and $V_1, V_2 \in \mathcal{V}$.

Any subalgebra of $\mathbb{C}^{n\times n}$ is clearly also a subspace of $\mathbb{C}^{n\times n}$. The lower and the upper triangular matrices are subalgebras of $\mathbb{C}^{n\times n}$.

Example 1 The set of complex symmetric matrices consists of the matrices $M \in \mathbb{C}^{n\times n}$ satisfying $M^T = M$. It is a matrix subspace of $\mathbb{C}^{n\times n}$ of dimension $n + (n-1) + \cdots + 1 = (n+1)n/2$.

Example 2 The set of Hermitian matrices consists of the matrices $M \in \mathbb{C}^{n\times n}$ satisfying $M^* = M$. It is a matrix subspace of $\mathbb{C}^{n\times n}$ over $\mathbb{R}$ of dimension
$$n + 2\big((n-1) + \cdots + 1\big) = n^2.$$

Example 3 The set of Toeplitz matrices consists of the matrices $M \in \mathbb{C}^{n\times n}$ having constant diagonals. It is a matrix subspace of $\mathbb{C}^{n\times n}$ of dimension $2n - 1$.
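A small sketch (mine, not from the notes; assumes NumPy/SciPy) of Example 3: a Toeplitz matrix is determined by its first column and first row, i.e., by $2n - 1$ parameters, and every diagonal is constant.

```python
import numpy as np
from scipy.linalg import toeplitz

n = 4
c = np.random.rand(n) + 1j * np.random.rand(n)   # first column
r = np.random.rand(n) + 1j * np.random.rand(n)   # first row
r[0] = c[0]                                      # the (1,1) entry is shared
T = toeplitz(c, r)                               # 2n - 1 defining parameters in all
# each diagonal of T holds a single repeated value:
print(all(len(set(np.diag(T, k))) == 1 for k in range(-n + 1, n)))
```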

    The structure you have in the LU factorization is the following.

Definition 1 Assume $\mathcal{V}_1$ and $\mathcal{V}_2$ are matrix subspaces of $\mathbb{C}^{n\times n}$ over $\mathbb{C}$ (or $\mathbb{R}$). Their set of products is defined as
$$\mathcal{V}_1\mathcal{V}_2 = \{V_1V_2 : V_1 \in \mathcal{V}_1 \text{ and } V_2 \in \mathcal{V}_2\}.$$

With two matrix subspaces $\mathcal{V}_1$ and $\mathcal{V}_2$, basic questions arising in practice are the following. For a given $A \in \mathbb{C}^{n\times n}$, does
$$A \in \mathcal{V}_1\mathcal{V}_2$$
hold, i.e., do we have $A = V_1V_2$? If it does, how is such a factorization computed? And what is the computational complexity? If we do not care about an exact factorization, how do we approximate $A$ with an element of the set of products to have $A \approx V_1V_2$? For the LU factorization the answers are fortunately known.

²Later on, we will see that partial pivoting is needed for a numerically reliable computation $PA = LU$, where $P$ is a permutation.

Problem 1 Show that a nonsingular $A \in \mathbb{C}^{n\times n}$ has an LU factorization if and only if it is strongly nonsingular.³

Problem 2 Show that either $A \in \mathbb{C}^{n\times n}$ has an LU factorization or there is a matrix arbitrarily close to $A$ which has one. (In other words, the closure of the set of products of lower and upper triangular matrices is $\mathbb{C}^{n\times n}$.)

The singular value decomposition is related with the problem of approximating a given matrix $A \in \mathbb{C}^{n\times n}$ with matrices of rank $k$ at most. Such matrices can be expressed as a product of matrix subspaces. (For simplicity, we only consider the square matrix case.)

Definition 2 The matrices of rank $k$ at most in $\mathbb{C}^{n\times n}$ are the set of products $\mathcal{V}_1\mathcal{V}_2$, where $\mathcal{V}_1$ (resp. $\mathcal{V}_2$) is the subspace of $\mathbb{C}^{n\times n}$ consisting of matrices having the last $n - k$ columns (resp. rows) zero.

The matrices of rank $k$ at most in $\mathbb{C}^{n\times n}$ are denoted by $\mathcal{F}_k$. Often one uses the expansion
$$V_1V_2 = [u_1\ u_2\ \cdots\ u_k\ 0\ \cdots\ 0]\,[v_1\ v_2\ \cdots\ v_k\ 0\ \cdots\ 0]^* = \sum_{j=1}^{k} u_jv_j^* \qquad (3)$$
with $u_j, v_j \in \mathbb{C}^n$ for $j = 1, \ldots, k$, to represent matrices from $\mathcal{F}_k$. This shows that for $k < n$ the elements of $\mathcal{F}_k$ are singular (and do not constitute a subspace). Namely, take $x \perp \operatorname{span}\{v_1, \ldots, v_k\}$⁴ to have $x \in N(V_1V_2)$. Observe that, by using (3), an element of $\mathcal{F}_k$ requires storing only $2nk$ complex numbers. This means that the rank-one matrices in the sum are not computed explicitly unless necessary. Certain operations, such as computing matrix-vector products, are still possible (and less expensive) by using (3).

³$A$ is strongly nonsingular if all its principal minors are nonzero. A principal minor is the determinant of a $k$-by-$k$ matrix cut from the upper-left corner of $A$.

⁴Find an element in the nullspace of the matrix whose rows are $v_1^*, \ldots, v_k^*$.
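A minimal sketch (mine, assuming NumPy) of the point just made: a matrix stored in the factored form (3) is applied to a vector without ever being formed, at a cost of $O(nk)$ instead of $O(n^2)$.

```python
import numpy as np

n, k = 1000, 5
U = np.random.rand(n, k) + 1j * np.random.rand(n, k)   # columns u_1, ..., u_k
V = np.random.rand(n, k) + 1j * np.random.rand(n, k)   # columns v_1, ..., v_k
x = np.random.rand(n)

y = U @ (V.conj().T @ x)          # (sum_j u_j v_j^*) x using only the factors, O(nk) work
y_full = (U @ V.conj().T) @ x     # the same product via the full n-by-n matrix, O(n^2) work
print(np.allclose(y, y_full))
```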

    4

  • 8/13/2019 Iterative Matrix computation

    5/55

Example 4 Matrices of the form $I + V_1V_2$ are important as well, where $I \in \mathbb{C}^{n\times n}$ denotes the identity matrix and $V_1V_2 \in \mathcal{F}_k$. Historically they are related with integral equations. More importantly, they can be inverted quickly (whenever invertible) since their structure is preserved in inversion. In the rank-one case, try this out by finding the scalar $\beta \in \mathbb{C}$ that solves the equation
$$(I + \beta u_1v_1^*)(I + u_1v_1^*) = I.$$
The cost is about one inner product. This is an example of a matrix which is very easy to invert, i.e., never use standard methods such as Gaussian elimination with matrices of this form.
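A hedged numerical check of Example 4 (the solution $\beta = -1/(1 + v^*u)$ is worked out here, not stated in the notes):

```python
import numpy as np

n = 6
u = np.random.rand(n) + 1j * np.random.rand(n)
v = np.random.rand(n) + 1j * np.random.rand(n)

beta = -1.0 / (1.0 + np.vdot(v, u))                 # essentially one inner product v^* u
A = np.eye(n) + np.outer(u, v.conj())               # I + u v^*
A_inv = np.eye(n) + beta * np.outer(u, v.conj())    # I + beta u v^*, same structure
print(np.allclose(A @ A_inv, np.eye(n)))
```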

    3 The singular value decomposition

The singular value decomposition (SVD) has many applications. (The approach is completely analogous if $A$ is rectangular rather than square.) From the viewpoint of data compression, $n^2$ complex numbers must be kept in memory when $A$ is stored. There are ways to try to approximate matrices with fewer parameters, i.e., to compress $A$. The singular value decomposition is related with the approximation problem
$$\min_{F_k \in \mathcal{F}_k} \|A - F_k\|, \qquad (4)$$
where the norm of a matrix will be defined below.⁵ Bear in mind that an element of $\mathcal{F}_k$ requires storing just $2nk$ complex numbers. Consequently, if for a small $k$ the value of (4) is small, then $A$ can be well compressed with the help of the singular value decomposition.

Observe that in the case of a compression where we do not attain zero in (4), we are satisfied with an approximate factorization
$$A \approx V_1V_2 \qquad (5)$$
of $A$. (One should keep in mind that actually any numerically computed factorization is approximate.) Let us finally define the singular value decomposition.

Definition 3 The singular value decomposition of $A \in \mathbb{C}^{n\times n}$ is a factorization $A = U\Sigma V^*$ with $U, V \in \mathbb{C}^{n\times n}$ unitary and $\Sigma \in \mathbb{C}^{n\times n}$ diagonal with the diagonal entries satisfying $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n \geq 0$.

⁵This approximation problem can be formulated in Banach spaces, to measure compactness. It is just in the Euclidean setting of $\mathbb{C}^n$ where the SVD happens to solve the problem.


If $\sigma_{k+1} = 0$, then the so-called compressed SVD is $A = \hat U\hat\Sigma\hat V^*$ with $\hat U$ and $\hat V$ of size $n$-by-$k$ consisting of the first $k$ columns of $U$ and $V$. The diagonal matrix $\hat\Sigma$ is the $k$-by-$k$ block in the upper-left corner of $\Sigma$. Then the columns of $\hat U$ yield an orthonormal basis of the column space of $A$. The dropped $n - k$ columns of $V$ yield an orthonormal basis of the null space of $A$.

Recall that a matrix $Q \in \mathbb{C}^{n\times n}$ is unitary if its columns are orthonormal, i.e.,
$$Q^*Q = I \qquad (6)$$
holds. Unitary matrices are extremely important for computations. The reason is that they do not magnify errors, by the fact that unitary matrices are norm preserving, i.e., (6) is equivalent to having
$$\|Qx\| = \|x\| \qquad (7)$$
for every $x \in \mathbb{C}^n$. Unitary matrices can be generated easily once you invoke the Gram-Schmidt process.

Problem 3 Assume $A \in \mathbb{C}^{n\times n}$ is nonsingular. Show that orthonormalizing its columns with the Gram-Schmidt process, starting from the leftmost column, is equivalent to computing a representation $A = QR$ with $Q$ unitary and $R$ upper triangular with positive diagonal entries. (This is called the QR factorization of $A$.)
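A minimal sketch (mine, not the notes' algorithm) related to Problem 3: orthonormalizing the columns of a nonsingular $A$ by (modified) Gram-Schmidt produces $A = QR$ with $Q$ unitary and $R$ upper triangular with positive diagonal.

```python
import numpy as np

def gram_schmidt_qr(A):
    n = A.shape[1]
    Q = A.astype(complex)                      # working copy of the columns
    R = np.zeros((n, n), dtype=complex)
    for j in range(n):
        for i in range(j):
            R[i, j] = np.vdot(Q[:, i], Q[:, j])  # projection coefficient (a_j, q_i)
            Q[:, j] -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(Q[:, j])        # positive diagonal entry
        Q[:, j] /= R[j, j]
    return Q, R

A = np.random.rand(5, 5)
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.conj().T @ Q, np.eye(5)))
```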

Recall that unitary matrices appear also in connection with the Hermitian eigenvalue problem.

Problem 4 Assume $A \in \mathbb{C}^{n\times n}$ is Hermitian, i.e., $A^* = A$. Show that $A$ can be unitarily diagonalized, i.e.,
$$A = U\Lambda U^* \qquad (8)$$
with a unitary $U \in \mathbb{C}^{n\times n}$ and a diagonal matrix $\Lambda \in \mathbb{C}^{n\times n}$. (Hint: show first that eigenvectors related to differing eigenvalues are orthogonal.)

That there exists a singular value decomposition is based on inspecting the eigenvalue problem for $A^*A$. Since $A^*A$ is Hermitian, it can be unitarily diagonalized as
$$A^*A = V\Lambda V^*.$$
Assume the eigenvalues are ordered nonincreasingly. (Note that $A^*A$ is positive semidefinite, i.e., its eigenvalues are nonnegative.) Take any two eigenvectors $v_j$ and $v_l$, $j \neq l$, among the columns of $V$. Then $Av_j$ and $Av_l$ are orthogonal by the fact that
$$(Av_j, Av_l) = (v_j, A^*Av_l) = (v_j, \lambda_lv_l) = 0. \qquad (9)$$


Consequently, take the columns of $U$ to be
$$u_j = \frac{Av_j}{\|Av_j\|} = \frac{Av_j}{\sqrt{\lambda_j}} \qquad (10)$$
for the nonzero eigenvalues $\lambda_j$ of $A^*A$ (so that the singular values are $\sigma_j = \sqrt{\lambda_j}$). For zero eigenvalues, the remaining eigenvectors are in the null space of $A$ by (9). Corresponding to these, take any set of orthonormal vectors in the orthogonal complement of the vectors $u_j$ already chosen. For them, the corresponding singular values are zeros.

The (operator) norm of a matrix $A \in \mathbb{C}^{n\times n}$ is defined as
$$\|A\| = \max_{\|x\|=1}\|Ax\|. \qquad (11)$$
It expresses how large, at most, $A$ can make unit vectors.⁶ Because of (7) we have
$$\|A\| = \|\Sigma\|. \qquad (12)$$
For diagonal matrices the norm is easy to compute and we obtain $\|A\| = \sigma_1$. So the first singular value of $A$ is unique.

⁶A unit vector is a vector of length one.

Problem 5 Show that for a diagonal matrix the norm is the maximal absolute value of its diagonal entries. (Observe that you can restrict the computations to real numbers.)

Repeating these arguments yields the fact that the singular values of a matrix are uniquely determined.

Problem 6 Show that the singular values of a matrix $A \in \mathbb{C}^{n\times n}$ are unique.

Without resorting to (12) it would be very challenging to compute the norm of $A$. In particular, if you use a norm other than the Euclidean norm on $\mathbb{C}^n$, then you have this challenge. (And numerous other challenges. So the reason for not using the Euclidean norm should be exceptionally good.)

How fast the singular values decay is directly related with how well $A$ is approximated in (4).

Theorem 1 Let $A \in \mathbb{C}^{n\times n}$. Then the value of the minimization problem (4) is $\sigma_{k+1}$.

Proof. The value of (4) is at most $\sigma_{k+1}$. This is seen by using the singular value decomposition of $A$ and forming $F_k$ by setting $\sigma_{k+1} = \cdots = \sigma_n = 0$.


(This corresponds to replacing the $n - k$ last columns of $U$ and rows of $V^*$ with zeros.)

To see that $\sigma_{k+1}$ is actually the minimum, consider $V_1V_2 \in \mathcal{F}_k$ which realizes (4). Let $w_1^*, \ldots, w_k^*$ be the nonzero rows of $V_2$. Then choose a unit vector $v$ which is a linear combination of $v_1, \ldots, v_{k+1}$ and which lies in the orthogonal complement of $w_1, \ldots, w_k$. (For this, find a nonzero solution to a homogeneous linear system of $k$ equations in $k + 1$ unknowns.) Then $(A - V_1V_2)v = Av$ and its norm is at least $\sigma_{k+1}$. End of proof.
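A small numerical check (mine, assuming NumPy) of Theorem 1: truncating the SVD after $k$ terms gives an element of $\mathcal{F}_k$ whose distance to $A$ in the operator norm equals $\sigma_{k+1}$.

```python
import numpy as np

n, k = 8, 3
A = np.random.rand(n, n)
U, s, Vh = np.linalg.svd(A)                      # A = U diag(s) Vh
F_k = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]      # rank-k truncation
err = np.linalg.norm(A - F_k, 2)                 # operator (spectral) norm of the error
print(np.isclose(err, s[k]))                     # equals sigma_{k+1}
```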

The solution constructed from the singular value decomposition solves the minimization problem (4) actually in any unitarily invariant norm. A norm $|||\cdot|||$ on $\mathbb{C}^{n\times n}$ is said to be unitarily invariant if
$$|||A||| = |||Q_1AQ_2|||$$
holds for any $A \in \mathbb{C}^{n\times n}$ and any unitary $Q_1, Q_2 \in \mathbb{C}^{n\times n}$. Aside from the operator norm, the Frobenius norm $\|\cdot\|_F$ is often used in practice, due to its computational convenience. The Frobenius norm is induced by the inner product
$$(A, B) = \operatorname{trace} B^*A \qquad (13)$$
on $\mathbb{C}^{n\times n}$. When dealing with matrix subspaces over $\mathbb{R}$, use
$$(A, B) = \operatorname{Re}\operatorname{trace} B^*A. \qquad (14)$$
The trace is computed by summing the diagonal entries of the matrix. This simply means that $\mathbb{C}^{n\times n}$ is treated as $\mathbb{C}^{n^2}$ using the standard inner product. Therefore
$$\|A\|_F = \sqrt{\sum_{j=1}^{n}\sum_{k=1}^{n}|a_{jk}|^2}.$$
This is clearly an easy computation, whereas computing $\|A\|$ is more involved.

Numerically reliable algorithms for computing the SVD do not proceed the way we showed the existence of the SVD above. The problem with the existence proof is the formation of $A^*A$, which can contain, relatively speaking, very large and very small numbers. Numerically this may cause problems, and in those situations one should never accept the numerical results blindly. (Roughly, for any nontrivial numerical problem, there is no numerical method which is 100 percent reliable. A numerical method is good if the measure of those problems where it fails is negligible.) Fortunately, all the respectable publicly available or commercial codes for computing the SVD can be regarded as reliable.⁷

⁷Such as LAPACK or Matlab. For numerical libraries, see, e.g., Wikipedia.


Problem 7 In a numerically reliable computation of the SVD, the so-called Householder transformations are used. They are matrices of the form $I + uv^*$ with $u, v \in \mathbb{C}^n$ which are additionally unitary. Using Example 4 for $\beta$, find explicit conditions on $u$ and $v$ to have a Householder transformation, i.e., a unitary $I + uv^*$.
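A hedged sketch related to Problem 7 (this uses the standard reflector form $H = I - 2vv^*/(v^*v)$, which is one instance of a unitary $I + uv^*$, rather than the general conditions the problem asks for):

```python
import numpy as np

n = 5
v = np.random.rand(n) + 1j * np.random.rand(n)
H = np.eye(n) - 2.0 * np.outer(v, v.conj()) / np.vdot(v, v)   # I - 2 v v^* / (v^* v)
print(np.allclose(H.conj().T @ H, np.eye(n)))                 # unitary
print(np.allclose(H, H.conj().T))                             # and Hermitian
```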

    4 Nonsingular and invertible matrix subspaces

The singular value decomposition can be viewed as a method for solving the approximate factoring problem (5). Other (approximate) factorization problems require completely different techniques. In this section we identify a family of matrix subspaces which allows solving the task.

It is noteworthy that $\mathcal{F}_k$ possesses a very special structure, since both factors involve singular matrices. It is not that we absolutely want to use singular approximations in practice. Often it is quite the opposite, since it is easy to see that an arbitrary $A \in \mathbb{C}^{n\times n}$ is invertible with probability one. Therefore the SVD actually provides a very peculiar way to approximate matrices in general.

Example 5 A way to circumvent the described problems with the SVD is to consider a more general approximation in terms of summing. This can be done in many ways. For a classical example, principal component analysis is an approach in statistics to analyze data. The process is simply an SVD approximation after the data has been mean centered, i.e., translated so that its center of mass is at the origin.

The question of the existence of invertible elements in a matrix subspace is hence of central relevance for general algorithmic factoring.

Definition 4 A matrix subspace $\mathcal{V}$ of $\mathbb{C}^{n\times n}$ is said to be nonsingular if there are invertible elements in $\mathcal{V}$.

Example 6 In the eigenvalue problem for a given $A \in \mathbb{C}^{n\times n}$ one deals with the nonsingular matrix subspace
$$\mathcal{V} = \operatorname{span}\{I, A\}. \qquad (15)$$
The singular elements of $\mathcal{V}$ correspond to the eigenvalues of $A$.

Example 7 In the generalized eigenvalue problem for given $A, B \in \mathbb{C}^{n\times n}$ one is concerned with solving
$$Ax = \lambda Bx \qquad (16)$$


for $\lambda \in \mathbb{C}$ and nonzero $x \in \mathbb{C}^n$. In the case of interest, one deals with the nonsingular matrix subspace
$$\mathcal{V} = \operatorname{span}\{A, B\}. \qquad (17)$$
The singular elements of $\mathcal{V}$ correspond to the generalized eigenvalues.
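A minimal sketch (mine, assuming SciPy) of Example 7: a library routine computes the generalized eigenvalues, and each computed $\lambda$ is indeed a point where the element $A - \lambda B$ of $\operatorname{span}\{A, B\}$ becomes singular.

```python
import numpy as np
from scipy.linalg import eig

n = 5
A = np.random.rand(n, n)
B = np.random.rand(n, n)
lam, X = eig(A, B)                               # solves A x = lambda B x
j = 0
print(np.allclose(A @ X[:, j], lam[j] * (B @ X[:, j])))
print(np.isclose(np.linalg.det(A - lam[j] * B), 0.0, atol=1e-8))
```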

Since the singular elements of a matrix subspace are important, let us make some remarks on their structure. To this end, assume $V_1, \ldots, V_k$ is a basis of a matrix subspace $\mathcal{V}$ of $\mathbb{C}^{n\times n}$ over $\mathbb{C}$ (or $\mathbb{R}$). Thus, for any $V \in \mathcal{V}$ we have a unique representation
$$V = \sum_{j=1}^{k} z_jV_j \qquad (18)$$
with $z_j \in \mathbb{C}$ (or $\mathbb{R}$) for $j = 1, \ldots, k$. Define
$$p(z_1, \ldots, z_k) = \det\sum_{j=1}^{k} z_jV_j$$
and set
$$V(p) = \{z \in \mathbb{C}^k : p(z_1, \ldots, z_k) = 0\}. \qquad (19)$$
This gives us a so-called determinantal hypersurface in $\mathbb{C}^k$ (or $\mathbb{R}^k$).

Definition 5 $V(p)$ is said to be the spectrum of $\mathcal{V}$ in the basis $V_1, \ldots, V_k$ of $\mathcal{V}$.

It is noteworthy that the spectrum is basis dependent. However, if $W_1, \ldots, W_k$ is another basis of $\mathcal{V}$, then there exists a matrix $M \in \mathbb{C}^{k\times k}$ such that $MV(p)$ is the spectrum in this new basis.

Problem 8 Let $\mathcal{V}$ be the set of upper triangular matrices. Find the spectrum in some basis of $\mathcal{V}$.

Problem 9 Suppose $W_l = \sum_{j=1}^{k} m_{lj}V_j$. Show that $M = \{m_{lj}\}$ yields the matrix for transforming the spectrum into the new basis $W_1, \ldots, W_k$.

Example 8 In the eigenvalue problem for a given $A \in \mathbb{C}^{n\times n}$ one chooses, without exception, $V_1 = I$ and $V_2 = A$.

For $\dim\mathcal{V} > 2$ computing the spectrum is a tough problem in general. In the case $\dim\mathcal{V} = 2$ it is handled with equivalence transformations.


Definition 6 Matrix subspaces $\mathcal{V}$ and $\mathcal{W}$ are said to be equivalent if there exist invertible matrices $X, Y$ such that $\mathcal{W} = X\mathcal{V}Y^{-1}$.

With the spectrum it is easy to see that if there are nonsingular elements in $\mathcal{V}$, then most of its elements are nonsingular.

Proposition 1 Suppose there are nonsingular elements in a matrix subspace $\mathcal{V}$. Then the set of nonsingular elements is open and dense.

Proof. Because of the isomorphism (18), $\mathcal{V}$ can be identified with $\mathbb{C}^k$ (or $\mathbb{R}^k$). It is clear that $V(p)$ is a closed set in $\mathbb{C}^k$ (or $\mathbb{R}^k$). It cannot have interior points either. Namely, if $(z_1, \ldots, z_k)$ were an interior point, then fix $z_2, \ldots, z_k$ and regard $p$ as a polynomial in the single variable $z_1$. It has a finite number of zeros. Thereby an arbitrarily small perturbation yields a nonzero value. End of proof.

There are two types of nonsingular matrix subspaces. To see this, we start by recalling how the LU factorization is found.

Suppose $A \in \mathbb{C}^{n\times n}$ is invertible and can be LU factored. Gaussian elimination to this end constructs a lower triangular matrix $L^{-1}$ such that
$$L^{-1}A = U \qquad (20)$$
is an upper triangular matrix. Then the fact that the inverse of a nonsingular lower triangular matrix is a lower triangular matrix is used to conclude that this actually yields an LU factorization of $A$.

To generalize this, for a nonsingular matrix subspace $\mathcal{V}$ set
$$\operatorname{Inv}(\mathcal{V}) = \{V^{-1} : V \in \mathcal{V},\ \det V \neq 0\} \qquad (21)$$
and call it the set of inverses of $\mathcal{V}$. It is not easy to characterize this set in general. The structure you have in the lower triangular case is of the following type.

Definition 7 A matrix subspace $\mathcal{V}$ is said to be invertible if
$$\operatorname{Inv}(\mathcal{V}) = \{W : W \in \mathcal{W},\ \det W \neq 0\}$$
for a nonsingular matrix subspace $\mathcal{W}$.

This means that the set of inverses is a matrix subspace, aside from those elements which are singular. Or, in other words, the closure of $\operatorname{Inv}(\mathcal{V})$ is a matrix subspace.


Clearly, $\mathcal{W}$ is unique; it is denoted by $\mathcal{V}^{-1}$ and called the inverse of $\mathcal{V}$. The set of lower (upper) triangular matrices is an invertible matrix subspace by the fact that it is a subalgebra of $\mathbb{C}^{n\times n}$ containing invertible elements. The inverse is the set of lower (upper) triangular matrices.

Example 9 Let $\mathcal{V}$ be the set of Hermitian matrices. Use (8) to conclude that $\mathcal{V}$ is invertible with $\mathcal{V}^{-1} = \mathcal{V}$.

Invertibility is preserved under equivalence, i.e., if $\mathcal{W} = X\mathcal{V}Y^{-1}$ and $\mathcal{V}$ is invertible, then so is $\mathcal{W}$ and
$$\mathcal{W}^{-1} = Y\mathcal{V}^{-1}X^{-1}$$
holds.

Problem 10 Let $\mathcal{V}^T = \{V^T : V \in \mathcal{V}\}$. Then $\mathcal{V}$ is invertible if and only if $\mathcal{V}^T$ is. (For the Hermitian transpose an analogous claim holds.)

The following notion is of importance for matrix subspaces as well as for the iterative methods considered later.

Definition 8 Let $A \in \mathbb{C}^{n\times n}$. Then the minimal polynomial⁸ of $A$ is the monic polynomial $p$ of the least degree satisfying
$$p(A) = 0. \qquad (22)$$

Problem 11 Show that similar matrices have the same minimal polynomial. (Recall that two matrices $A$ and $B$ are similar if there exists an invertible matrix $X$ such that $A = XBX^{-1}$.) Show also that the minimal polynomial is unique.

The Jordan canonical form can be used to show that the characteristic polynomial annihilates $A$. Hence the degree of the minimal polynomial is at most $n$. The characteristic polynomial may not yield the minimal polynomial, though.

Problem 12 Suppose $A \in \mathbb{C}^{n\times n}$ is diagonalizable. Determine the degree of the minimal polynomial in terms of the distinct eigenvalues of $A$.

⁸A monic polynomial $c_kz^k + c_{k-1}z^{k-1} + \cdots + c_2z^2 + c_1z + c_0$ has the leading coefficient $c_k = 1$.


Let $p(z) = z^k + c_{k-1}z^{k-1} + \cdots + c_2z^2 + c_1z + c_0$ be the minimal polynomial of $A$. Then $A$ is invertible if and only if $c_0 \neq 0$. Namely, if $c_0 \neq 0$, then
$$A\Big(-\frac{1}{c_0}A^{k-1} - \frac{c_{k-1}}{c_0}A^{k-2} - \cdots - \frac{c_1}{c_0}I\Big) = I. \qquad (23)$$
In particular, this also shows that the inverse of $A$ is a polynomial in $A$. For the converse, if $c_0 = 0$, then
$$A(A^{k-1} + c_{k-1}A^{k-2} + \cdots + c_1I) = 0$$
and since $A^{k-1} + c_{k-1}A^{k-2} + \cdots + c_1I \neq 0$, $A$ cannot be invertible.
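A hedged illustration of the identity (23) (mine; it uses the characteristic polynomial in place of the minimal polynomial, since by Cayley-Hamilton it also annihilates $A$, and np.poly is used only for this small demonstration, not as a numerical method):

```python
import numpy as np

n = 4
A = np.random.rand(n, n)
c = np.poly(A)               # coefficients of z^n + c_{n-1} z^{n-1} + ... + c_1 z + c_0
# A^{-1} = -(A^{n-1} + c_{n-1} A^{n-2} + ... + c_1 I) / c_0, as in (23)
B = np.zeros_like(A)
for coeff in c[:-1]:         # Horner evaluation of the bracketed polynomial in A
    B = B @ A + coeff * np.eye(n)
A_inv = -B / c[-1]
print(np.allclose(A_inv @ A, np.eye(n)))
```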

Example 10 The set of complex symmetric matrices is an invertible matrix subspace. Its inverse is the set of complex symmetric matrices. To see this, assume $A$ is complex symmetric and invertible. If $p$ is a polynomial, then $p(A)^T = p(A^T) = p(A)$, i.e., any polynomial in $A$ is complex symmetric. Therefore the inverse of $A$ is complex symmetric.

In this example we used the following fact. A matrix subspace $\mathcal{V}$ over $\mathbb{C}$ (or $\mathbb{R}$) is said to be polynomially closed if $V \in \mathcal{V}$ implies $p(V) \in \mathcal{V}$ for any polynomial $p$ with complex (real) coefficients.

Proposition 2 Suppose a nonsingular matrix subspace $\mathcal{V}$ is equivalent to a matrix subspace which is polynomially closed. Then $\mathcal{V}$ is invertible.

Problem 13 Let $A \in \mathbb{C}^{n\times n}$. For iterative methods the matrix subspaces
$$\mathcal{K}_j(A; I) = \operatorname{span}\{I, A, \ldots, A^{j-1}\}$$
are very important, for $j = 1, 2, \ldots$. Are these nonsingular matrix subspaces? For what value of $j$ is $\mathcal{K}_j(A; I)$ invertible?

    5 Factoring algorithmically

Assume given two nonsingular matrix subspaces $\mathcal{V}_1$ and $\mathcal{V}_2$, of which one is invertible. Let us suppose $\mathcal{V}_2$ is invertible with the inverse $\mathcal{W}$. Suppose $A \in \mathbb{C}^{n\times n}$ is nonsingular and the task is to determine whether $A \in \mathcal{V}_1\mathcal{V}_2$. Clearly, $A = V_1V_2$ holds if and only if
$$AW = V_1 \qquad (24)$$
for some nonsingular $W \in \mathcal{W}$. This latter problem is linear and thereby completely solvable. (Once done, we have $A = V_1W^{-1}$.) To this end we will use projections.


Recall that a linear operator $P$ on a vector space is a projection if $P^2 = P$. A projection moves points onto its range and acts like the identity operator on the range. Such operators are of importance in numerical computations and approximation. In so-called dimension reduction approximation, the task is to find, in some sense, a good projection which is used to replace the original problem with a problem which is smaller in dimension. Then the problem is projected onto the range of $P$. Observe that if $P$ is a projection, then so is $I - P$.

It is preferable to use orthogonal projections since they take the shortest path while moving points to the range. We say that $P$ is an orthogonal projection if
$$R(P) \perp R(I - P),$$
i.e., the range of $P$ is orthogonal to the range of $I - P$. Orthogonality requires using an inner product, so we use (13) on $\mathbb{C}^{n\times n}$. (Recall that then $\mathbb{C}^{n\times n}$ is isometrically isomorphic with $\mathbb{C}^{n^2}$.)

The standard way of constructing an orthogonal projection onto a given subspace is to take an orthonormal basis $q_1, \ldots, q_k$ of it. Then set
$$Px = \sum_{j=1}^{k} q_j(x, q_j).$$

Occasionally orthogonal projections onto familiar matrix subspaces are readily available. The orthogonal projection of $\mathbb{C}^{n\times n}$ onto the set of Hermitian matrices is given by
$$PA = \frac{1}{2}(A + A^*). \qquad (25)$$
(Because the set of Hermitian matrices is a subspace over $\mathbb{R}$, use (14).) This is the so-called Hermitian part of $A$. Similarly, onto the set of complex symmetric matrices the orthogonal projection acts according to
$$PA = \frac{1}{2}(A + A^T). \qquad (26)$$
These are simple to apply.
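A small sketch (mine, assuming NumPy) of (25): the Hermitian part of $A$ is its orthogonal projection onto the Hermitian matrices with respect to the real inner product (14), so $A - PA$ is orthogonal to every Hermitian matrix.

```python
import numpy as np

n = 4
A = np.random.rand(n, n) + 1j * np.random.rand(n, n)
PA = 0.5 * (A + A.conj().T)                       # Hermitian part of A, formula (25)
H = np.random.rand(n, n) + 1j * np.random.rand(n, n)
H = 0.5 * (H + H.conj().T)                        # an arbitrary Hermitian matrix
inner = np.real(np.trace(H.conj().T @ (A - PA)))  # Re trace(H^* (A - P A)), inner product (14)
print(np.isclose(inner, 0.0))
```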

Problem 14 Suppose $P$ is a projection. Show that $P$ is an orthogonal projection iff the operator norm $\|P\| = 1$ iff for every $x$ there holds $\|x\|^2 = \|(I - P)x\|^2 + \|Px\|^2$.

A matrix is called standard if exactly one of its entries equals 1 while all other entries equal zero. A matrix subspace $\mathcal{V}$ is called standard if it has a basis consisting of standard matrices. This simply means that there are no interdependencies between the entries of $\mathcal{V}$. In this case the orthogonal projection $P$ onto $\mathcal{V}$ acts such that $PA$ simply replaces with zeros those entries of $A$ which are outside the sparsity structure of $\mathcal{V}$. The other entries of $A$ are kept intact.

Definition 9 The sparsity structure of a matrix subspace $\mathcal{V}$ means the location of those entries which are nonzero for some $V \in \mathcal{V}$.

Consider now solving the problem (24). Denote by $P_1$ the orthogonal projection onto $\mathcal{V}_1$. Define a linear map
$$W \mapsto (I - P_1)AW \qquad (27)$$
from $\mathcal{W}$ to $\mathbb{C}^{n\times n}$. Since $I - P_1$ is an orthogonal projection onto the orthogonal complement of $\mathcal{V}_1$, the solutions can be computed by finding the nullspace of (27). Namely, if $W$ is in the nullspace, then $(I - P_1)AW = 0$. Thereby we have $AW = V_1 \in \mathcal{V}_1$. Of course, since (27) is linear, the nullspace is computable. (As we know, there are even finite step algorithms to this end.)

It is a classical result that every matrix $A \in \mathbb{C}^{n\times n}$ is the product of two complex symmetric matrices. So far the demonstrations of this result have been quite ad hoc. Let us see how this happens routinely with the above method.

Example 11 Let us factor $A = \begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix}$ into the product of two symmetric matrices. We have
$$(I - P)A\begin{pmatrix} s_1 & s_2 \\ s_2 & s_3 \end{pmatrix} = (I - P)\begin{pmatrix} s_1 + 2s_2 & s_2 + 2s_3 \\ s_1 + s_2 & s_2 + s_3 \end{pmatrix} = \begin{pmatrix} 0 & -s_1/2 + s_3 \\ s_1/2 - s_3 & 0 \end{pmatrix}.$$
Thereby, whenever $s_1 = 2s_3$ we have a symmetric $AS = S_1$. The nullspace is clearly nonsingular. Moreover, its dimension is two, so that the factorization is certainly not unique.
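A hedged sketch (the parametrization and helper names are mine) of the nullspace computation in Example 11: find a symmetric $W$ with $AW$ symmetric, i.e., $(I - P)(AW) = 0$, and recover $A = (AW)W^{-1}$ as a product of two symmetric matrices.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0], [1.0, 1.0]])

def W_of(s):                                  # symmetric W parametrized by (s1, s2, s3)
    return np.array([[s[0], s[1]], [s[1], s[2]]])

# The constraint "(A W)_{12} = (A W)_{21}" as a 1-by-3 matrix acting on (s1, s2, s3).
E = np.array([[(A @ W_of(e))[0, 1] - (A @ W_of(e))[1, 0] for e in np.eye(3)]])
N = null_space(E)                             # two-dimensional, as stated in Example 11
c = np.random.randn(2)
W = W_of(N @ c)        # a random nullspace element; nonsingular with probability one (Prop. 1)
S1, S2 = A @ W, np.linalg.inv(W)              # both symmetric
print(np.allclose(S1, S1.T), np.allclose(S2, S2.T), np.allclose(S1 @ S2, A))
```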

This is no coincidence. It can be shown that in factoring any $A \in \mathbb{C}^{n\times n}$ into the product of two complex symmetric matrices, the nullspace is nonsingular and at least of dimension $n$.

Because of Proposition 1, the dimension of the nullspace of (27) expresses how nonunique the factorization is. Of course, in practice it typically suffices to compute a single factorization.


Observe that if $\mathcal{V}_1$ had been invertible instead of $\mathcal{V}_2$, then we can proceed by transposing the problem, i.e., then the question reads whether $A^T \in \mathcal{V}_2^T\mathcal{V}_1^T$ holds. Another option is to proceed by replacing (27) with
$$W \mapsto (I - P_2)WA \qquad (28)$$
where $\mathcal{W}$ is now the inverse of $\mathcal{V}_1$. Then we look for the nullspace of this linear map. If $(I - P_2)WA = 0$, then $WA = V_2 \in \mathcal{V}_2$. This relates directly with (20) and how the LU factorization can be computed.

Example 12 In the standard method for an LU factorization, one computes (20) by constructing a single element $L^{-1}$ by performing row operations such that $L^{-1}A$ is upper triangular. Let us see how we solve the problem completely with (28). Then $\mathcal{W}$ is the set of lower triangular matrices and $\mathcal{V}_2$ the set of upper triangular matrices. They are both standard matrix subspaces. Let an invertible $A = \{a_{jk}\}$ be given. Denote by $W = \{w_{jk}\}$ the lower triangular matrix of variables. Now $I - P_2$ projects onto the set of strictly lower triangular matrices. Therefore finding the nullspace of (28) gives us the equations collected row-wise from the strictly lower triangular part
$$a_{11}w_{21} + a_{21}w_{22} = 0$$
$$a_{11}w_{31} + a_{21}w_{32} + a_{31}w_{33} = 0$$
$$a_{12}w_{31} + a_{22}w_{32} + a_{32}w_{33} = 0$$
$$\vdots$$
and so forth. There are no equations for $w_{11}$. One equation binds $w_{21}$ and $w_{22}$. Two equations bind $w_{31}$, $w_{32}$ and $w_{33}$. And so forth. It is noteworthy that here the transpose of $A$ starts appearing row-wise, instead of $A$. In the generic case (= what happens with probability one) the dimension of the nullspace of (28) is $n$. This means that any two different factors $L$ of an LU factorization differ by a scaling. That is, $A = LU = LDD^{-1}U = \tilde L\tilde U$ for a diagonal matrix $D$.

Definition 10 Let $A \in \mathbb{C}^{n\times n}$. Multiplying $A$ from the right (left) by an invertible diagonal matrix is called a right (left) scaling of $A$.

Observe that it may not be necessary to invert the factor $W$ at all. This is the case, for example, in Gaussian elimination applied to the linear system (1). Then one has $Ux = c$, where $U = L^{-1}A$ and $c = L^{-1}b$. In such a case it suffices that $\mathcal{W}$ be nonsingular, since we do not need to understand what $\operatorname{Inv}(\mathcal{W})$ is like. In preconditioning (considered later in connection with iterative methods) one has such a situation.

In practice it may be realistic to compute only an approximate factorization. The problem is challenging if we consider
$$\inf_{V_1\in\mathcal{V}_1,\ V_2\in\mathcal{V}_2} \|A - V_1V_2\| \qquad (29)$$


to this end. First, there does not seem to exist any direct way of solving this problem. Second, we have to accept dealing with an infimum instead of a minimization problem. Numerically such a situation indicates possibly severe problems.

Example 13 Let
$$A = \begin{pmatrix} \epsilon & 1 \\ 1 & 1 \end{pmatrix}$$
and suppose $\mathcal{V}_1$ is the set of lower and $\mathcal{V}_2$ the set of upper triangular matrices. Then the value of (29) is zero for any $\epsilon$. For $\epsilon = 0$ the minimum does not exist since $A$ is not LU factorizable. Otherwise we have
$$\begin{pmatrix} \epsilon & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1/\epsilon & 1 \end{pmatrix}\begin{pmatrix} \epsilon & 1 \\ 0 & 1 - 1/\epsilon \end{pmatrix}.$$
For very small $\epsilon$, the factors are huge. Moreover, in finite precision arithmetic, $1 - 1/\epsilon$ is replaced with $-1/\epsilon$. This has a dramatic effect since
$$\begin{pmatrix} 1 & 0 \\ 1/\epsilon & 1 \end{pmatrix}\begin{pmatrix} \epsilon & 1 \\ 0 & -1/\epsilon \end{pmatrix} - A = \begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix},$$
which is not small at all.

Numerically one option is to look at
$$\min_{W\in\mathcal{W}} \|(I - P_1)AW\| \qquad (30)$$
by imposing additional constraints for $W$ to satisfy. These could be some sort of norm conditions. As a result, we have $(I - P_1)AW \approx 0$, i.e., $AW \approx V_1$ with $V_1 = P_1AW$. Then it remains to invert $W$, if required.

In approximations with (30), one should keep in mind that
$$\frac{\|A - V_1W^{-1}\|}{\|W^{-1}\|} \leq \|AW - V_1\| \leq \|A - V_1W^{-1}\|\,\|W\|.$$
Here appears the condition number
$$\kappa(W) = \|W\|\,\|W^{-1}\| \qquad (31)$$
of $W$, which scales the accuracy of the approximations. This is no accident. The condition number appears often in assessing the accuracy of numerical linear algebra computations. From the SVD of $W$ one obtains $\kappa(W) = \frac{\sigma_1}{\sigma_n}$.

6 Computing the LU factorization with partial pivoting

In factoring in practice, one typically computes a single element of the nullspace of (27) or (28). This should be done as fast as possible without sacrificing the accuracy of the numerical results. This means that one tries to benefit from the properties of the problem as much as possible. For the LU factorization this means computing the standard LU factorization not for $A$ but for a matrix which has the rows of $A$ reordered, i.e.,
$$PA = LU, \qquad (32)$$
where $P$ is a permutation matrix chosen to make the computations numerically more stable. This permutation is not known in advance. Finding $P$ is part of the algorithm, called the LU factorization with partial pivoting.

Problem 15 Show that the complexity of standard Gaussian elimination for $A \in \mathbb{C}^{n\times n}$ (when all the pivots are nonzero) is $\frac{2}{3}n^3$ flops.⁹

We work out a low dimensional example to see what is going on.¹⁰ For the matrix $A$ below, the standard Gaussian row operations give
$$A = \begin{pmatrix} 2 & 1 & 1 & 0 \\ 4 & 3 & 3 & 1 \\ 8 & 7 & 9 & 5 \\ 6 & 7 & 9 & 8 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 4 & 3 & 1 & 0 \\ 3 & 4 & 1 & 1 \end{pmatrix}\begin{pmatrix} 2 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 2 \end{pmatrix} = LU.$$
In the $L$ factor we get a hint of the catastrophic behaviour of Example 13, i.e., large entries appear away from the diagonal. To avoid this, one must partially pivot in between the row operations. The resulting LU factorization with partial pivoting has replaced the standard LU factorization since the 1950s. (To such an extent that by an LU factorization of $A$ one typically means the LU factorization (32).)

The rule is simple: the pivot must be the largest (in absolute value) entry among those entries under elimination. This is achieved by permuting rows. We have for the first column
$$P_1A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & 1 & 1 & 0 \\ 4 & 3 & 3 & 1 \\ 8 & 7 & 9 & 5 \\ 6 & 7 & 9 & 8 \end{pmatrix} = \begin{pmatrix} 8 & 7 & 9 & 5 \\ 4 & 3 & 3 & 1 \\ 2 & 1 & 1 & 0 \\ 6 & 7 & 9 & 8 \end{pmatrix}.$$

Then the row operations expressed in terms of a matrix-matrix product for the first column read
$$L_1P_1A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -\tfrac{1}{2} & 1 & 0 & 0 \\ -\tfrac{1}{4} & 0 & 1 & 0 \\ -\tfrac{3}{4} & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 8 & 7 & 9 & 5 \\ 4 & 3 & 3 & 1 \\ 2 & 1 & 1 & 0 \\ 6 & 7 & 9 & 8 \end{pmatrix} = \begin{pmatrix} 8 & 7 & 9 & 5 \\ 0 & -\tfrac{1}{2} & -\tfrac{3}{2} & -\tfrac{3}{2} \\ 0 & -\tfrac{3}{4} & -\tfrac{5}{4} & -\tfrac{5}{4} \\ 0 & \tfrac{7}{4} & \tfrac{9}{4} & \tfrac{17}{4} \end{pmatrix}.$$

⁹By a flop is meant a floating point operation: a sum, difference, product or quotient of two complex numbers.

¹⁰This example is from [4].


Then
$$P_2L_1P_1A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}\begin{pmatrix} 8 & 7 & 9 & 5 \\ 0 & -\tfrac{1}{2} & -\tfrac{3}{2} & -\tfrac{3}{2} \\ 0 & -\tfrac{3}{4} & -\tfrac{5}{4} & -\tfrac{5}{4} \\ 0 & \tfrac{7}{4} & \tfrac{9}{4} & \tfrac{17}{4} \end{pmatrix} = \begin{pmatrix} 8 & 7 & 9 & 5 \\ 0 & \tfrac{7}{4} & \tfrac{9}{4} & \tfrac{17}{4} \\ 0 & -\tfrac{3}{4} & -\tfrac{5}{4} & -\tfrac{5}{4} \\ 0 & -\tfrac{1}{2} & -\tfrac{3}{2} & -\tfrac{3}{2} \end{pmatrix}$$
and
$$L_2P_2L_1P_1A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & \tfrac{3}{7} & 1 & 0 \\ 0 & \tfrac{2}{7} & 0 & 1 \end{pmatrix}\begin{pmatrix} 8 & 7 & 9 & 5 \\ 0 & \tfrac{7}{4} & \tfrac{9}{4} & \tfrac{17}{4} \\ 0 & -\tfrac{3}{4} & -\tfrac{5}{4} & -\tfrac{5}{4} \\ 0 & -\tfrac{1}{2} & -\tfrac{3}{2} & -\tfrac{3}{2} \end{pmatrix} = \begin{pmatrix} 8 & 7 & 9 & 5 \\ 0 & \tfrac{7}{4} & \tfrac{9}{4} & \tfrac{17}{4} \\ 0 & 0 & -\tfrac{2}{7} & \tfrac{4}{7} \\ 0 & 0 & -\tfrac{6}{7} & -\tfrac{2}{7} \end{pmatrix}.$$

Then, similarly, with
$$P_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \quad\text{and}\quad L_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -\tfrac{1}{3} & 1 \end{pmatrix}$$
we complete the process to have
$$L_3P_3L_2P_2L_1P_1A = U = \begin{pmatrix} 8 & 7 & 9 & 5 \\ 0 & \tfrac{7}{4} & \tfrac{9}{4} & \tfrac{17}{4} \\ 0 & 0 & -\tfrac{6}{7} & -\tfrac{2}{7} \\ 0 & 0 & 0 & \tfrac{2}{3} \end{pmatrix}.$$

Problem 16 The Gaussian row operation matrices (also called Gauss transforms) can be expressed as
$$L_j = I + l_je_j^* \qquad (33)$$
with $l_j \in \mathbb{C}^n$ having its first $j$ entries zero. Show that $L_j^{-1} = I - l_je_j^*$. (Here $e_j$ denotes the $j$th standard basis vector.) Using this, show that the inverse of $L_{n-1}\cdots L_1$ is $I - \sum_{j=1}^{n-1} l_je_j^*$.

This certainly looks complicated. However, we have here (hidden) the factorization $PA = LU$ we are after. To see this, there holds
$$L_3P_3L_2P_2L_1P_1 = L_3'L_2'L_1'P_3P_2P_1$$
once we set
$$L_3' = L_3, \quad L_2' = P_3L_2P_3^{-1} \quad\text{and}\quad L_1' = P_3P_2L_1P_2^{-1}P_3^{-1}.$$
We have $P = P_3P_2P_1$ and $L^{-1} = L_3'L_2'L_1'$.


The matrices $L_j'$ are easily found. Because of (33),
$$L_j' = I + P_{n-1}\cdots P_{j+1}\,l_je_j^*\,P_{j+1}^{-1}\cdots P_{n-1}^{-1}.$$
Observe that when $P_l$ is applied to a vector, it does not permute its first $l - 1$ entries. Therefore $e_j^*P_{j+1}^{-1}\cdots P_{n-1}^{-1} = (P_{n-1}\cdots P_{j+1}e_j)^* = e_j^*$. Moreover, the first $j$ entries of $l_j' = P_{n-1}\cdots P_{j+1}l_j$ are zeros, while the remaining entries are just those of $l_j$ permuted.

If $A \in \mathbb{C}^{n\times n}$ is nonsingular, then Gaussian elimination with partial pivoting always yields a factorization (32). The reason is that the permutations and Gauss transforms are invertible, so that the transformed matrix remains invertible. (And if there were just zeros in a column, such that no $P_l$ could bring a nonzero to the $l$th diagonal position, then the first $l$ columns would necessarily be linearly dependent.)

Problem 17 In assessing the accuracy of finite precision computations, often the norm used on $\mathbb{C}^n$ is
$$\|x\|_\infty = \max_{1\leq j\leq n}|x_j|.$$
This is the so-called max norm. The corresponding norm of a matrix $A \in \mathbb{C}^{n\times n}$ is then defined as
$$\|A\|_\infty = \max_{\|x\|_\infty=1}\|Ax\|_\infty.$$
Show that in (32), produced with the partially pivoted Gaussian elimination, we have $\|L\|_\infty \leq n$.

Problem 18 In Problem 17, a moderate linear growth with respect to the dimension was shown to bound the max norm of the $L$ factor. Unfortunately, the max norm of the $U$ factor can grow exponentially, like $2^{n-1}$, relative to that of $A$. Show that this is possible by considering $A = \{a_{ij}\}$ with
$$a_{ij} = \begin{cases} 1, & \text{when } i = j \text{ or } j = n, \\ -1, & \text{when } i > j, \\ 0, & \text{else.} \end{cases}$$
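A small numerical illustration (mine, assuming SciPy; the observed behaviour also relies on the library breaking pivot ties by keeping the diagonal entry) of Problem 18's example: the last column of $U$ doubles row by row, so the entries of $U$ grow like $2^{n-1}$.

```python
import numpy as np
from scipy.linalg import lu

n = 20
A = np.eye(n) - np.tril(np.ones((n, n)), -1)   # 1 on the diagonal, -1 strictly below
A[:, -1] = 1.0                                 # and 1 in the last column
P, L, U = lu(A)
print(U[-1, -1])                               # 2**(n-1) = 524288.0
```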

Example 14 Consider Example 13. For $A = \begin{pmatrix} \epsilon & 1 \\ 1 & 1 \end{pmatrix}$ the partially pivoted Gaussian elimination yields $P_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, so that
$$P_1A = \begin{pmatrix} 1 & 0 \\ \epsilon & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 0 & 1 - \epsilon \end{pmatrix}.$$
In finite precision arithmetic, $1 - \epsilon$ is replaced with $1$. This does not have a dramatic effect at all, since then
$$P_1^{-1}LU - A = \begin{pmatrix} 0 & \epsilon \\ 0 & 0 \end{pmatrix}.$$


Regarding the cost of the partial pivoting, in choosing $P_j$ one needs to compute the absolute values of $n - j + 1$ entries of a column of the $U$ matrix computed so far. In all, this sums up to the computation of $O(n^2)$ absolute values. This is negligible in comparison with the $\frac{2}{3}n^3$ flops consumed in the row operations.

The so-called complete pivoting means applying permutations from the left and the right, as $PAQ$, whose LU factorization then gets computed. The corresponding algorithm is significantly more costly by the fact that $O(n^3)$ absolute values now need to be evaluated in constructing $P$ and $Q$. Therefore the LU factorization with complete pivoting is rarely used. So far, partial pivoting has been sufficient to deal with practical computational problems.

Because of Examples 13 and 14, we must say a few words about the floating point arithmetic that computers rely on. The current standard is the IEEE double precision arithmetic. Practically all computer manufacturers have chosen to use this standard. Its purpose is to discretize $\mathbb{R}$ by replacing it with a finite set of rational numbers. Numbers can be, in absolute value, between $2.23\cdot 10^{-308}$ and $1.79\cdot 10^{308}$. Otherwise there will be underflow or overflow. This is usually not the thing to worry about. Since only a finite set of rational numbers is in use, there appear gaps between numbers compared with $\mathbb{R}$. To describe this, the interval $[1, 2] \subset \mathbb{R}$ is represented by
$$1,\ 1 + 2^{-52},\ 1 + 2\cdot 2^{-52},\ 1 + 3\cdot 2^{-52},\ \ldots,\ 2. \qquad (34)$$
For other intervals this representation is then scaled, such that the interval $[2^j, 2^{j+1}]$ is represented by the rational numbers (34) multiplied by $2^j$. This is done until the overflow is reached. (Similarly for the underflow.) For example, the interval $[2, 4] \subset \mathbb{R}$ is represented by
$$2,\ 2 + 2^{-51},\ 2 + 2\cdot 2^{-51},\ 2 + 3\cdot 2^{-51},\ \ldots,\ 4.$$
This means that the gaps are of the same size in the relative sense.

In modelling the IEEE double precision standard, one assumes that there is neither underflow nor overflow, and that there is the number zero. Designate this set by $F$. Then $fl : \mathbb{R} \to F$ maps real numbers onto $F$ such that
$$fl(x) = x(1 + \epsilon),$$
where $|\epsilon| \leq \epsilon_{\text{machine}}$ with the IEEE double precision standard $\epsilon_{\text{machine}} = 2^{-53} \approx 1.11\cdot 10^{-16}$. This is the way the computer is regarded to round real numbers; it retains about 16 correct digits.

Once numbers are rounded, we can perform elementary arithmetic operations with them. These are the sum, difference, product and quotient. Let $\ast$ designate such an operation. If $x, y \in F$, it is assumed that the floating point arithmetic for computing $x \ast y \in \mathbb{R}$ yields
$$(x \ast y)(1 + \epsilon) \in F,$$


where $|\epsilon| \leq \epsilon_{\text{machine}}$.

Using this model, it can be shown that if $L$ and $U$ are the factors computed in finite precision arithmetic, then the standard LU factorization satisfies
$$LU = A + \delta A,$$
where $\frac{\|\delta A\|}{\|L\|\,\|U\|} = O(\epsilon_{\text{machine}})$. The partially pivoted LU factorization satisfies
$$LU = PA + \delta A,$$
where $\frac{\|\delta A\|}{\|A\|} = O(\rho\,\epsilon_{\text{machine}})$. Here $\rho = \frac{\max_{i,j}|u_{ij}|}{\max_{i,j}|a_{ij}|}$.¹¹ Examples 13 and 14 are clearly in line with these results.

7 Using the structure: Cholesky factorization, Sylvester equation and FFT

To solve a linear system (1), we now know that it can be done reliably at the cost of $O(n^3)$ flops, requiring the storage of $n^2$ numbers. This is thereby the worst scenario, in the sense that it can only be improved. (With a linear system one should always ask: do I really need the LU factorization, or is there a better way?) An improvement requires that the linear system has some special structure. This is often the case in applications. We illustrate this with three very different examples: the Cholesky factorization, the Sylvester equation and the use of the FFT.

The Cholesky factorization replaces the LU factorization if the matrix is positive definite. It is simply the LU factorization computed by keeping in mind that $A$ is positive definite.

Definition 11 A matrix $A \in \mathbb{C}^{n\times n}$ is positive definite if it is Hermitian and satisfies
$$(Ax, x) > 0 \qquad (35)$$
for any nonzero $x \in \mathbb{C}^n$.

Problems related with energy minimization typically involve positive definite matrices. (In physics, there is no lack of such problems!)

Problem 19 Show that a Hermitian matrix is positive definite if and only if its eigenvalues are strictly positive. Moreover, show that if $A$ is positive definite and $M$ is invertible, then $M^*AM$ is positive definite.

¹¹See Numerical Linear Algebra, L. N. Trefethen and D. Bau, III, SIAM, 1997.


Suppose a positive definite $A \in \mathbb{C}^{n\times n}$ is split as $A = \begin{pmatrix} a_{11} & a^* \\ a & B \end{pmatrix}$ with $a \in \mathbb{C}^{n-1}$. Use the condition (35) with $x = e_1$ to conclude that $a_{11} > 0$. Then we have
$$A = \begin{pmatrix} \alpha & 0 \\ a/\alpha & I \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & B - aa^*/a_{11} \end{pmatrix}\begin{pmatrix} \alpha & a^*/\alpha \\ 0 & I \end{pmatrix} = R_1^*A_1R_1 \qquad (36)$$
with $\alpha = \sqrt{a_{11}}$. The middle factor $A_1$ is positive definite by Problem 19. Therefore $B - aa^*/a_{11}$ is positive definite as well. To see this, use the condition (35) with vectors $x$ whose first entry is zero. Consequently, the trick of (36) can be repeated with the block $B - aa^*/a_{11}$. Once completed, we have
$$A = R_1^*R_2^*\cdots R_n^*\,I\,R_n\cdots R_2R_1 = R^*R,$$
the so-called Cholesky factorization of $A$. Since only an upper triangular matrix $R$ is needed, the Cholesky factorization requires storing about $n^2/2$ numbers.

Problem 20 Show that the Cholesky factorization can be computed twice as fast as the standard LU factorization, i.e., that it requires $\frac{1}{3}n^3$ flops to compute the Cholesky factorization of a positive definite $A \in \mathbb{C}^{n\times n}$.

It is quite remarkable that no partial pivoting is needed in computing the Cholesky factorization. The reason is that the operator norm of the factor $R$ can be elegantly controlled in terms of that of $A$. Namely, if $R = U\Sigma V^*$ is the SVD of $R$, then $A = R^*R = V\Sigma^2V^*$ is the SVD of $A$. (Compare with Example 13, where the matrix is not positive definite for small $\epsilon$.)
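A minimal sketch (mine, assuming NumPy): a library Cholesky factorization. NumPy returns the lower triangular factor $L$ with $A = LL^*$, so the upper triangular $R$ of the text is $R = L^*$.

```python
import numpy as np

n = 5
M = np.random.rand(n, n) + 1j * np.random.rand(n, n)
A = M.conj().T @ M + n * np.eye(n)        # Hermitian positive definite (cf. Problem 19)
L = np.linalg.cholesky(A)                 # A = L L^*
R = L.conj().T                            # the upper triangular Cholesky factor
print(np.allclose(R.conj().T @ R, A))
print(np.allclose(R, np.triu(R)))
```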

Next we consider solving the so-called Sylvester equation, which appears in control theory and stability analysis. To this end, assume given $A, B, C \in \mathbb{C}^{n\times n}$. The task is to find a matrix $X \in \mathbb{C}^{n\times n}$ solving
$$AX - XB = C, \qquad (37)$$
which is called the Sylvester equation. The linear operator associated with the Sylvester equation is
$$X \mapsto AX - XB \qquad (38)$$
from $\mathbb{C}^{n\times n}$ to $\mathbb{C}^{n\times n}$. Hence (37) could be written as a standard linear system of size $n^2$-by-$n^2$. Then the LU factorization would consume $\frac{2}{3}n^6$ flops to solve the problem. This is not very attractive. It turns out that the problem can be solved numerically reliably by consuming only $O(n^3)$ flops. Then we do not factor the linear operator (38).

To this end we need a similarity transformation called the Schur decomposition.


Theorem 2 Let $A \in \mathbb{C}^{n\times n}$. Then there exists a unitary $Q \in \mathbb{C}^{n\times n}$ and an upper triangular $T \in \mathbb{C}^{n\times n}$ such that $A = QTQ^*$.

Proof. The following construction is not algorithmic.

Because the characteristic polynomial has (by the fundamental theorem of algebra) at least one zero, $A$ has at least one eigenvector $q_1 \in \mathbb{C}^n$. Suppose it is of unit length and $Q_1 \in \mathbb{C}^{n\times n}$ is unitary having $q_1$ as its first column. Then $A = Q_1T_1Q_1^*$, where the first column of $T_1$ consists of zeros except that the first entry is the eigenvalue $\lambda_1$ corresponding to $q_1$.

This idea is repeated with the lower-right $(n-1)$-by-$(n-1)$ block of $T_1$. We obtain $A = Q_1Q_2T_2Q_2^*Q_1^*$, where the first column of $T_2$ equals the first column of $T_1$. The second column of $T_2$ consists of zeros except for the first two entries.

Continue this construction to have
$$A = Q_1\cdots Q_{n-1}\,T\,Q_{n-1}^*\cdots Q_1^*,$$
where $T = T_{n-1}$ and $Q = Q_1\cdots Q_{n-1}$. End of proof.

Observe that the eigenvalues of $A$, denoted by $\sigma(A)$, can be found from the diagonal of $T$.

If $A$ is Hermitian, then $T$ is diagonal with real entries. More generally, if $T$ is diagonal, then $A$ is said to be normal.

By the same reasoning, there exists a factorization of $A$ as $A = QTQ^*$ with a unitary $Q$ and a lower triangular $T$. (Or, use the theorem for $A^*$ and then Hermitian transpose the result.)

The above construction is not algorithmic, since the computation of the zeros of a polynomial is numerically a very tough problem. Still, the Schur decomposition of a matrix can be computed numerically reliably by using $O(n^3)$ flops. The algorithm is the so-called QR iteration. This is how the eigenvalue problem is solved in practice.

Problem 21 It seems that finding the zeros of the characteristic polynomial gives you the eigenvalues. In reality it goes the other way. To see this, consider a monic polynomial $p(z) = z^3 + a_2z^2 + a_1z + a_0$. How are its zeros related with the eigenvalues of the matrix
$$A = \begin{pmatrix} 0 & 0 & -a_0 \\ 1 & 0 & -a_1 \\ 0 & 1 & -a_2 \end{pmatrix},$$
called the companion matrix of $p$? From this you can guess the companion matrix of $p(z) = z^m + a_{m-1}z^{m-1} + \cdots + a_1z + a_0$.
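A small sketch (mine, assuming NumPy) in the spirit of Problem 21: the zeros of a monic polynomial coincide with the eigenvalues of its companion matrix, and numpy.roots in fact computes roots this way.

```python
import numpy as np

a2, a1, a0 = 1.0, -2.0, 3.0                  # p(z) = z^3 + a2 z^2 + a1 z + a0
A = np.array([[0, 0, -a0],
              [1, 0, -a1],
              [0, 1, -a2]])                  # companion matrix of p
print(np.sort_complex(np.linalg.eigvals(A)))
print(np.sort_complex(np.roots([1.0, a2, a1, a0])))   # the same values
```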


To solve the Sylvester equation numerically reliably, we assume having computed the Schur decompositions of $A$ and $B$ as
$$A = Q_1TQ_1^* \quad\text{and}\quad B = Q_2SQ_2^*,$$
where $T$ is lower and $S$ is upper triangular. Then (37) is equivalent to
$$TY - YS = D \qquad (39)$$
with $Y = Q_1^*XQ_2$ and $D = Q_1^*CQ_2$. Since $T$ and $S$ are lower and upper triangular, the solving can be done by substitution. It takes place column-wise, starting from the first column as
$$(t_{11} - s_{11})y_{11} = d_{11}$$
$$(t_{22} - s_{11})y_{21} + t_{21}y_{11} = d_{21}$$
$$(t_{33} - s_{11})y_{31} + t_{32}y_{21} + t_{31}y_{11} = d_{31}$$
$$\vdots$$
These have a unique solution for any first column of $D$ if and only if the eigenvalue $s_{11}$ of $B$ is not an eigenvalue of $A$. Having these solved, move on to the second column
$$(t_{11} - s_{22})y_{12} - s_{12}y_{11} = d_{12}$$
$$(t_{22} - s_{22})y_{22} - s_{12}y_{21} + t_{21}y_{12} = d_{22}$$
$$(t_{33} - s_{22})y_{32} - s_{12}y_{31} + t_{32}y_{22} + t_{31}y_{12} = d_{32}$$
$$\vdots$$
These have a unique solution for any second column of $D$ if and only if the eigenvalue $s_{22}$ of $B$ is not an eigenvalue of $A$.

    These arguments can be repeated to have the following result.

Theorem 3 Let $A, B \in \mathbb{C}^{n\times n}$. Then (37) has a unique solution for any $C \in \mathbb{C}^{n\times n}$ if and only if $\sigma(A) \cap \sigma(B) = \emptyset$.
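A hedged sketch (mine, assuming SciPy): solving the Sylvester equation with a library routine based on Schur forms. scipy.linalg.solve_sylvester solves $AX + XB = Q$, so equation (37), $AX - XB = C$, is obtained by passing $-B$.

```python
import numpy as np
from scipy.linalg import solve_sylvester

n = 6
A = np.random.rand(n, n)
B = np.random.rand(n, n) + 10 * np.eye(n)   # shifted so that sigma(A) and sigma(B) do not overlap
C = np.random.rand(n, n)
X = solve_sylvester(A, -B, C)               # solves A X + X (-B) = C, i.e., (37)
print(np.allclose(A @ X - X @ B, C))
```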

Problem 22 Assume $A$ and $B$ are diagonalizable and you manage to diagonalize them numerically reliably as $A = X_1D_1X_1^{-1}$ and $B = X_2D_2X_2^{-1}$. How can you use this to solve the Sylvester equation (37)?

The third example we deal with is the fast Fourier transform (FFT), originating from numerical Fourier analysis. It is an algorithm encompassing many ideas and principles which are important more generally in matrix computations. Recall that the Fourier coefficients of a sufficiently regular function $f : [0, 1] \to \mathbb{C}$ are defined by
$$c_j = \int_0^1 f(t)e^{-2\pi jti}\,dt \qquad (40)$$


for $j \in \mathbb{Z}$. Then $f(t) = \sum_{j=-\infty}^{\infty} c_je^{2\pi jti}$ holds (in some sense). Initially, the Fourier transform was used in solving partial differential equations. Since then it (and its many variants) has found use in many applications, most notably in signal processing.

In practice, only a finite number of the Fourier coefficients can be computed. Then the computation of a single Fourier coefficient relies on numerical integration. (Only rarely does one have an $f$ which allows integration in closed form. Also, it is possible that $f$ can only be sampled at a finite set.) Since the computation of the Fourier coefficients is linear,¹² the numerical problem turns into a problem in matrix computations.

In the numerical computation of (40), the interval $[0, 1]$ is divided into $n$ subintervals. Thereafter the integral is approximated with the Riemann sum. These approximations can be expressed in terms of a matrix-vector product. For $n = 1, 2, 3, \ldots$ the associated matrices are
$$F_1 = [1], \quad F_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad F_3 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & w_3 & w_3^2 \\ 1 & w_3^2 & w_3^4 \end{pmatrix}, \ \ldots,$$
where $w_n = e^{-2\pi i/n}$, i.e., an $n$th complex root of $1$. For $F_n = \{f_{jk}\} \in \mathbb{C}^{n\times n}$ the $(j, k)$-entry is
$$f_{jk} = e^{-2\pi i(j-1)(k-1)/n}.$$

This is called the Fourier matrix. An application of $\frac{1}{n}F_n$ to a vector $x \in \mathbb{C}^n$ corresponds to computing a numerical approximation to the Fourier coefficients $c_0, \ldots, c_{n-1}$ of $f$. (Then $f$ is sampled with the values put into $x$.) The operation is called the discrete Fourier transform (DFT) of $x$. The vector $x$ consists of the values of $f$ at the grid points $0, 1/n, \ldots, (n-1)/n$.

Problem 23 The Fourier matrix is a so-called Vandermonde matrix, defined as follows. Let $z_0, \ldots, z_{n-1} \in \mathbb{C}$ be distinct. Then the associated Vandermonde matrix is
$$V(z_0, \ldots, z_{n-1}) = \begin{pmatrix} 1 & z_0 & z_0^2 & \cdots & z_0^{n-1} \\ 1 & z_1 & z_1^2 & \cdots & z_1^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & z_{n-1} & z_{n-1}^2 & \cdots & z_{n-1}^{n-1} \end{pmatrix}.$$
Consider an interpolation problem: let $c_0, \ldots, c_{n-1} \in \mathbb{C}$ be given. Find a polynomial $p$ of degree $n - 1$ such that $p(z_j) = c_j$ for $j = 0, \ldots, n-1$. How can you solve this problem with the help of $V(z_0, \ldots, z_{n-1})$?

Hence, $F_n = V(1, w_n, w_n^2, \ldots, w_n^{n-1})$.

¹²We have $\int_0^1(\alpha f(t) + \beta g(t))e^{-2\pi jti}\,dt = \alpha\int_0^1 f(t)e^{-2\pi jti}\,dt + \beta\int_0^1 g(t)e^{-2\pi jti}\,dt$ for any $\alpha, \beta \in \mathbb{C}$.


Proposition 3 Let $F_n$ be the Fourier matrix. Then $F_n$ is complex symmetric and satisfies $F_n^*F_n = nI$.

Proof. It is clear that $F_n$ is complex symmetric. To show that $F_n^*F_n = nI$, the $(j, k)$-entry of the matrix $F_n^*F_n$ is $s_{jk} = \sum_{l=0}^{n-1}\overline{w_n^{jl}}\,w_n^{kl} = \sum_{l=0}^{n-1} w_n^{(k-j)l}$. For $j = k$ we have $s_{jk} = n$. Otherwise, compute the sum in the standard way to have
$$s_{jk} - w_n^{k-j}s_{jk} = 0$$
by the fact that $w_n^{k-j}\,w_n^{(k-j)(n-1)} = 1$. End of proof.

This means that $\frac{1}{\sqrt{n}}F_n$ is unitary. (Because of this, sometimes $\frac{1}{\sqrt{n}}F_n$ is called the Fourier matrix.) And $F_n^{-1} = \frac{1}{n}F_n^* = \frac{1}{n}\overline{F_n}$ because of complex symmetry. Only because the Fourier matrix appears so often in applications does it deserve all this special attention.

The FFT is an algorithm to perform matrix-vector products rapidly with the Fourier matrix when $n = 2^l$ for $l \in \mathbb{N}$. In other words, the DFT, i.e., the numerical computation of the Fourier coefficients, can be done very fast. Bear in mind that for an arbitrary matrix $A \in \mathbb{C}^{n\times n}$, the matrix-vector product can be expected to require about $2n^2$ flops ($n^2$ multiplications and $n(n-1)$ sums). For certain values of $n$, this can be done much faster with $F_n$.

Assume thus $n = 2^l$. Denote $m = \frac{n}{2}$. Then
$$F_n = \begin{pmatrix} I & D \\ I & -D \end{pmatrix}\begin{pmatrix} F_m & 0 \\ 0 & F_m \end{pmatrix}P, \qquad (41)$$
where $D$ is a diagonal matrix and $P$ a permutation matrix defined as follows. We have $D = \operatorname{diag}(1, w_n, \ldots, w_n^{m-1})$ and $P$ acts as
$$P[x_0\ \cdots\ x_{n-1}]^T = [x_0\ x_2\ x_4\ \cdots\ x_{n-2}\ x_1\ x_3\ x_5\ \cdots\ x_{n-1}]^T,$$
i.e., $P$ collects first the even components and thereafter the odd components of $x$. Before showing that (41) holds, let us do the operation count for a matrix-vector product with the right-hand side of (41).

We assume that forming $Px$ is free. Thereafter, applying $F_m$ twice costs $2(2(\frac{n}{2})^2) = n^2$ flops. Then the last matrix-vector product costs $2n$ flops when applied taking its structure into account, i.e., zeros are ignored when applying the diagonal matrices in the blocks. In all, $n^2 + 2n$. This is less than $2n^2$, of course.

This idea can be used again. There is nothing we can do about these $2n$ flops. However, applying the same trick twice to $F_m$ replaces $n^2$ with $\frac{1}{2}n^2 + 2n$. This can be repeated $\log_2 n$ times. Thereby we consume just $2n\log_2 n = 2nl$ flops in all. This is much less than $2n^2$, the cost of computing $F_nx$ by performing the matrix-vector product directly.¹³

¹³Sometimes only multiplications are counted. On a computer they take longer than sums. Then there are other constants instead of 2 in front.


It remains to show (41). Let $y = F_nx$. Then
$$y_j = \sum_{k=0}^{n-1} w_n^{kj}x_k = \sum_{k=0}^{m-1} w_n^{2kj}x_{2k} + \sum_{k=0}^{m-1} w_n^{(2k+1)j}x_{2k+1}$$
by separating the components of $x$ into even and odd. Thus we have
$$y_j = \sum_{k=0}^{m-1} w_m^{kj}x_k' + w_n^j\sum_{k=0}^{m-1} w_m^{kj}x_k'' \qquad (42)$$
by denoting $x' = [x_0\ x_2\ x_4\ \cdots\ x_{n-2}]^T$ and $x'' = [x_1\ x_3\ x_5\ \cdots\ x_{n-1}]^T$. Set $y' = F_mx'$ and $y'' = F_mx''$. Then (42) can be written as
$$y_j = y_j' + w_n^jy_j'' \qquad (43)$$
and
$$y_{j+m} = y_j' - w_n^jy_j'' \qquad (44)$$
for $j = 0, \ldots, m-1$. (Use $w_m^{km} = 1$ and $w_n^m = -1$ to have (44).) This is the factorization (41) written componentwise.
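A hedged sketch (mine, not the notes' code) of the recursion (43)-(44), assuming the convention $w_n = e^{-2\pi i/n}$ used above (which matches numpy's FFT): a radix-2 FFT for $n = 2^l$.

```python
import numpy as np

def fft_radix2(x):
    n = len(x)
    if n == 1:
        return x.astype(complex)
    y_even = fft_radix2(x[0::2])            # F_m applied to the even-indexed entries
    y_odd = fft_radix2(x[1::2])             # F_m applied to the odd-indexed entries
    w = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    return np.concatenate([y_even + w * y_odd,      # y_j     = y'_j + w_n^j y''_j   (43)
                           y_even - w * y_odd])     # y_{j+m} = y'_j - w_n^j y''_j   (44)

x = np.random.rand(2 ** 6) + 1j * np.random.rand(2 ** 6)
print(np.allclose(fft_radix2(x), np.fft.fft(x)))
```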

Observe that the inverse of the Fourier matrix can be applied with the same speed, by the fact that
$$F_n^{-1}x = \frac{1}{n}\overline{F_n}\,x = \frac{1}{n}\overline{F_n\overline{x}}.$$

Aside from numerical Fourier analysis, the Fourier matrix appears in connection with Toeplitz matrices. (See Example 3.) This is best illustrated with the circulant matrices, a subset of the Toeplitz matrices. Circulant matrices are defined conveniently with the help of the permutation matrix
$$P = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}. \qquad (45)$$

Problem 24 If you have a Hermitian matrix, you know its eigenvalues are located on the real axis. (You can see this by using the Schur decomposition.) Since $P$ is a permutation, it is also unitary. Using the Schur decomposition, where are the eigenvalues of $P$ located now?

Definition 12 Let $P$ be the permutation (45). Then $\mathcal{K}_n(P; I)$ is the set of circulant matrices.


Let p(z) = Σ_{j=0}^{n-1} c_j z^j be a polynomial. Then the polynomial p in a matrix A ∈ C^{n×n} is defined to be the matrix p(A) = Σ_{j=0}^{n-1} c_j A^j. In the case of P we have

    C = p(P) = Σ_{j=0}^{n-1} c_j P^j = [ c_0      c_{n-1}  c_{n-2}  ...  c_1
                                         c_1      c_0      c_{n-1}  ...  c_2
                                         c_2      c_1      c_0      ...  c_3
                                         ...
                                         c_{n-1}  c_{n-2}  ...      c_1  c_0 ],          (46)

illustrating the matrix structure of circulant matrices.
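As a small illustration (not from the notes), the following NumPy snippet builds P from (45), forms C = p(P) as in (46), and checks the circulant pattern C[i, j] = c_{(i-j) mod n}.

```python
import numpy as np

n = 5
c = np.random.randn(n)                      # coefficients c_0, ..., c_{n-1}

# P from (45): ones on the subdiagonal and a one in the top-right corner.
P = np.roll(np.eye(n), 1, axis=0)

# C = p(P) = sum_j c_j P^j as in (46).
C = sum(c[j] * np.linalg.matrix_power(P, j) for j in range(n))

# The circulant pattern: entry (i, j) equals c_{(i-j) mod n}.
C_pattern = c[(np.arange(n)[:, None] - np.arange(n)[None, :]) % n]
print(np.allclose(C, C_pattern))            # True
```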

    The columns of the Fourier matrix are eigenvectors of circulant matrices.

Problem 25 Let F_n be the Fourier matrix and P ∈ C^{n×n} the permutation matrix (45). Find the diagonal matrix Λ satisfying

    P F_n = F_n Λ,                                                    (47)

i.e., determine the eigenvalues of P by hand.

Denote by Q the unitary matrix Q = (1/√n) F_n and consider the circulant matrix C in (46). Because of (47), C is normal, i.e., unitarily similar with a diagonal matrix. This follows from

    Q^*CQ = Σ_{j=0}^{n-1} c_j Q^*P^j Q = Σ_{j=0}^{n-1} c_j (Q^*PQ)^j = Σ_{j=0}^{n-1} c_j Λ^j.          (48)

Since the eigenvalues of P are known (see Problem 25), this yields an O(n^2) algorithm to determine the eigenvalues of a circulant matrix C by evaluating the corresponding polynomial p at the eigenvalues of P. (Assuming the computation of p(λ) for any eigenvalue λ costs O(n) flops. Realistic because the eigenvalues are so special.)

Problem 26 The polynomial trick (48) generalizes. That is, show that if A ∈ C^{n×n} and p is a polynomial, then

    σ(p(A)) = p(σ(A)),

where p(σ(A)) = {p(λ) : λ ∈ σ(A)}. (Hint: Again the Schur decomposition.)

If n = 2^l, then there is a much faster way to find the eigenvalues and hence diagonalize C. This relies on using the FFT as follows. From the identity

    F_n^{-1} C = Λ F_n^{-1},


consider the first columns on both sides to have

    F_n [c_0 c_1 ... c_{n-1}]^T = [λ_1 λ_2 ... λ_n]^T.

Consequently, the eigenvalues require computing a single matrix-vector product with F_n. Hence, by invoking the FFT, the eigenvalues of C can be found at the cost of O(n log_2 n) flops. Moreover, then a linear system

    Cx = b                                                            (49)

with b ∈ C^n can be solved in O(n log_2 n) flops by using the diagonalization C = F_n Λ F_n^{-1} and the FFT. That is, x = F_n Λ^{-1} F_n^{-1} b.
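In floating point this is exactly what the following NumPy sketch does: it uses numpy.fft (whose DFT convention may differ from the F_n of these notes by a conjugation, which the circular-convolution identity absorbs) to obtain the eigenvalues from the first column c of C and then solves Cx = b in O(n log n) work. The helper name is illustrative.

```python
import numpy as np

def solve_circulant(c, b):
    """Solve Cx = b where C is the circulant matrix with first column c.
    Uses the diagonalization of C by the DFT: the eigenvalues are fft(c)."""
    lam = np.fft.fft(c)                          # eigenvalues of C (one FFT)
    return np.fft.ifft(np.fft.fft(b) / lam)      # x = F^{-1} diag(lam)^{-1} F b

# Check on a small example.
n = 8
c = np.random.randn(n)
b = np.random.randn(n)
C = c[(np.arange(n)[:, None] - np.arange(n)[None, :]) % n]   # circulant as in (46)
x = solve_circulant(c, b)
print(np.allclose(C @ x, b))   # True
```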

Problem 27 Suppose T ∈ C^{k×k} is a Toeplitz matrix. Devise a method to perform matrix-vector products fast with T in two steps. First construct a circulant matrix C ∈ C^{n×n} with n = 2^l ≥ 2k containing T as its block. Then perform matrix-vector products appropriately with C to have matrix-vector products with T.

There are also fast algorithms for solving linear systems involving Toeplitz matrices [1].

8 Eigenvalue problems and functions of matrices

Next we are concerned with the eigenvalue problem

    Ax = λx                                                           (50)

for a given A ∈ C^{n×n}. We also say something about the generalized eigenvalue problem (16).

Compared with solving linear systems, the eigenvalue problem is of a different nature since it is not solvable through a finite computation. This follows from the fact that finding eigenvalues and finding zeros of polynomials are equivalent problems. (See Problem 21.) Abel's theorem on finding zeros of polynomials states that, in general, the solutions cannot be expressed exactly with radicals^14 in terms of the coefficients of the polynomial if the degree of the polynomial is five or larger.

Because of this negative result, all the numerical methods for solving the eigenvalue problem are iterative and stopped after some degree of accuracy is

^14 Repeatedly forming sums, differences, products, quotients, and radicals (nth roots, for some integer n) of previously obtained numbers.


achieved with the approximations. Of course, for computers, the IEEE double precision accuracy would certainly suffice. However, if there are eigenvalues of very different magnitude, it is unlikely to be attained for all the eigenvalues.

In practice, there are two types of eigenvalue problems. Either all the eigenvalues and perhaps the corresponding eigenvectors need to be computed. Or, only the eigenvalues (and perhaps the corresponding eigenvectors) located in some region of the complex plane are of interest. Techniques for this latter problem are described in connection with iterative methods.

Example 15 How could only a few eigenvalues be of interest? In quantum mechanics one deals with Hermitian operators. The energy levels of a system are given by the eigenvalues. One is often interested in knowing the lowest energy levels, i.e., finding the smallest eigenvalues of the matrix obtained after discretizing the problem.

Before outlining how the full eigenvalue problem is very successfully solved with the so-called QR iteration, we mention some inclusion results. An inclusion region for the eigenvalues is a subset of C which is known to include σ(A). It should be mentioned that such results were more important in the early days of numerical analysis, before the existence of reliable numerical methods for the eigenvalue problem. It seems that their importance is marginal at present.

Example 16 The eigenvalues of a Hermitian matrix are contained in the real axis. The eigenvalues of a skew-Hermitian matrix are contained in the imaginary axis.

The so-called Gershgorin discs of a matrix A ∈ C^{n×n} are defined as follows. For j = 1, ..., n, set R_j = Σ_{l≠j} |a_{jl}|. Then define

    D_j = {z ∈ C : |a_{jj} − z| ≤ R_j}.                               (51)

Theorem 4 Let A ∈ C^{n×n}. Then there holds

    σ(A) ⊆ ∪_{j=1}^{n} D_j,

where D_j is the Gershgorin disc of A defined in (51).

Proof. Suppose λ ∈ σ(A), so that (50) holds for a nonzero vector x ∈ C^n. Assume the jth entry satisfies |x_j| ≥ |x_l| for all l = 1, ..., n. Then

    |a_{jj} − λ| ≤ Σ_{l≠j} |a_{jl}| |x_l|/|x_j| ≤ Σ_{l≠j} |a_{jl}|


and therefore λ ∈ D_j. End of proof.

First, we do not show that every D_j contains an eigenvalue. Second, this so-called Gershgorin's theorem is basis dependent. (That is, the linear operator corresponding to A represented in another basis gives rise to different Gershgorin discs.) This can be taken advantage of. Before there were numerical methods to solve the eigenvalue problem, various tricks using similarity transformations were invented to squeeze more information out of Gershgorin's theorem. The goal: sharp information is obtained in case A is diagonal.
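A small NumPy illustration (not part of the notes): compute the disc centers and radii from (51) and verify that every eigenvalue lies in at least one disc, as Theorem 4 guarantees.

```python
import numpy as np

def gershgorin_discs(A):
    """Return (centers, radii) of the Gershgorin discs (51) of A."""
    A = np.asarray(A)
    centers = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)   # R_j = sum_{l != j} |a_jl|
    return centers, radii

A = np.array([[4.0, 1.0, 0.2],
              [0.5, -3.0, 0.1],
              [0.1, 0.3, 1.0]])
centers, radii = gershgorin_discs(A)
eigs = np.linalg.eigvals(A)
# Each eigenvalue belongs to the union of the discs (Theorem 4).
print(all(np.min(np.abs(lam - centers) - radii) <= 1e-12 for lam in eigs))  # True
```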

Another inclusion region for the eigenvalues is the so-called field of values of A ∈ C^{n×n} defined as

    F(A) = {x^*Ax : ||x|| = 1}.

Occasionally, the field of values is also called the numerical range of A. It contains the eigenvalues of A (easy) and is convex (not so easy). It has also some other uses. However, the computation of F(A) is tedious and based on using the following two observations repeatedly. First, F(αA) = αF(A) for any α ∈ C. Second, we have A = H + iK with Hermitian

    H = (1/2)(A + A^*)  and  K = (1/(2i))(A − A^*).

Then x^*Ax = x^*Hx + i x^*Kx gives the real and imaginary parts of x^*Ax. In particular, finding min_{||x||=1} x^*Hx and max_{||x||=1} x^*Hx gives the left and right vertical extremes of F(A). (This means finding two extreme eigenvalues of H.) Repeat this with e^{iθ}A for a finite number of θ ∈ [0, 2π).

To have an idea how the Schur decomposition is computed, consider first the power method to approximate an eigenvalue of A having the largest modulus. Starting with an initial guess q^{(0)}, the so-called power method proceeds as

    for k = 1, 2, ...
        z^{(k)} = A q^{(k-1)}
        q^{(k)} = z^{(k)} / ||z^{(k)}||
        λ^{(k)} = (A q^{(k)}, q^{(k)})
    end                                                               (52)

Hence, A is repeatedly applied to q^{(0)} and the result is scaled to keep the iterates of unit length.
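A direct transcription of (52) into NumPy, assuming a fixed number of steps and the Euclidean inner product; in practice one would add a stopping criterion based on the residual. The function name is illustrative.

```python
import numpy as np

def power_method(A, q0, steps=200):
    """Power method (52): approximate the eigenvalue of largest modulus and its eigenvector."""
    q = q0 / np.linalg.norm(q0)
    lam = None
    for _ in range(steps):
        z = A @ q                      # z^(k) = A q^(k-1)
        q = z / np.linalg.norm(z)      # normalize the iterate
        lam = np.vdot(q, A @ q)        # Rayleigh quotient (Aq, q) = q^* A q
    return lam, q

A = np.array([[2.0, 1.0], [1.0, 3.0]])
lam, q = power_method(A, np.random.randn(2))
print(lam, np.linalg.norm(A @ q - lam * q))   # dominant eigenvalue, small residual
```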

To see when the power method can be expected to converge, assume A is diagonalizable as A = XΛX^{-1} with X = [x_1 x_2 ... x_n]. Suppose |λ_1| > |λ_2| ≥ |λ_3| ≥ ... ≥ |λ_n|. We also assume the starting vector to be such that a_1 ≠ 0 in the expansion

    q^{(0)} = a_1 x_1 + ... + a_n x_n.                                (53)


(If q^{(0)} is randomly chosen, then this holds with probability one.) Then, without scaling, we have after k steps

    A^k q^{(0)} = a_1 λ_1^k ( x_1 + Σ_{j=2}^{n} (λ_j/λ_1)^k (a_j/a_1) x_j ).

Hence, q^{(k)} expanded in the basis x_1, ..., x_n as q^{(0)} in (53) is dominantly in the direction of x_1 in the sense that the remaining components are O((|λ_j|/|λ_1|)^k).

The Schur decomposition is computed, roughly, by executing the power method simultaneously with all the columns of A.

The computation of the Schur decomposition consists of two phases. Since the ideas involved are fundamental, we describe the main ingredients of the scheme. Both phases require using the Householder transformations

    H = I − 2 vv^*/(v^*v),                                            (54)

where v ∈ C^n is chosen in such a way that Hx is in the direction of e_1 for a given x ∈ C^n. (Or an obvious modification of this.) This is achieved by making an educated guess

    v = x + αe_1,                                                     (55)

where α ∈ C must be chosen so that H has the desired property. This leads to α = ((x, e_1)/|(x, e_1)|) ||x||, and α = ||x|| if (x, e_1) = 0. Then Hx = −αe_1. Note that H is unitary and Hermitian.

All this may sound simple. In a sense it is. However, the Householder transformations are fundamental building blocks for more complicated algorithms.
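A sketch of the choice (55) in NumPy, under the assumption that the inner product is (x, y) = y^*x; it returns v and α and checks that Hx is a multiple of e_1. The helper name is illustrative.

```python
import numpy as np

def householder_vector(x):
    """Return v and alpha of (55) so that H = I - 2 vv*/(v*v) maps x to -alpha*e_1."""
    x = np.asarray(x, dtype=complex)
    normx = np.linalg.norm(x)
    # alpha = (x_1/|x_1|)*||x||, and alpha = ||x|| if x_1 = 0 (avoids cancellation).
    alpha = normx if x[0] == 0 else (x[0] / abs(x[0])) * normx
    v = x.copy()
    v[0] += alpha
    return v, alpha

x = np.random.randn(5) + 1j * np.random.randn(5)
v, alpha = householder_vector(x)
H = np.eye(5) - 2.0 * np.outer(v, v.conj()) / np.vdot(v, v)
print(np.allclose(H @ x, -alpha * np.eye(5)[:, 0]))   # True: Hx = -alpha e_1
print(np.allclose(H @ H.conj().T, np.eye(5)))         # H is unitary (and Hermitian)
```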

The first phase of the computation of the Schur decomposition consists of the construction of n − 1 Householder transformations to have the unitary matrix

    Q_0 = H_{n-1} H_{n-2} ... H_1                                     (56)

such that Q_0 A Q_0^* is a Hessenberg matrix.

Definition 13 A matrix H ∈ C^{n×n} is a Hessenberg matrix if h_{jk} = 0 for j ≥ k + 2.

Hence, a Hessenberg matrix is almost upper triangular: it can just have one extra nonzero diagonal right below the main diagonal.

To have H_1 in (56), in (55) take x = [0 a_{21} a_{31} ... a_{n1}]^T and replace e_1 with e_2. Then form H_1 accordingly. As a result, once we form H_1 A H_1^* = Ã, this matrix has the first column of the correct form. (An application with H_1^* = H_1 from the right does not affect the first column: Denote M = H_1 A. Then


MH_1 = M − 2 Mvv^*/(v^*v). Now the first component of v is zero.) Next in (55) take x = [0 0 ã_{32} ... ã_{n2}]^T and replace e_1 with e_3. Then form H_2 accordingly. As a result, once we form H_2 H_1 A H_1^* H_2^* = Ã, this matrix has the first two columns of the correct form.

Once complete, we have the unitary matrix (56) such that Q_0 A Q_0^* = H is a Hessenberg matrix.
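Tying the pieces together, here is a compact, unoptimized sketch of the first phase in NumPy: one Householder similarity per column reduces A to Hessenberg form. For clarity the full transformations are formed explicitly (making this O(n^4)); a production code applies them implicitly. Function name and details are illustrative, not from the notes.

```python
import numpy as np

def hessenberg_reduce(A):
    """Return (H, Q0) with Q0 unitary and Q0 A Q0^* upper Hessenberg."""
    H = np.array(A, dtype=complex)
    n = H.shape[0]
    Q0 = np.eye(n, dtype=complex)
    for k in range(n - 2):
        x = np.zeros(n, dtype=complex)
        x[k + 1:] = H[k + 1:, k]            # the part of column k to be zeroed out
        if np.linalg.norm(x[k + 2:]) == 0:
            continue                        # column already in the desired form
        nx = np.linalg.norm(x)
        alpha = nx if x[k + 1] == 0 else (x[k + 1] / abs(x[k + 1])) * nx
        v = x.copy()
        v[k + 1] += alpha                   # v = x + alpha*e_{k+1}, cf. (55)
        Hk = np.eye(n, dtype=complex) - 2.0 * np.outer(v, v.conj()) / np.vdot(v, v)
        H = Hk @ H @ Hk.conj().T            # unitary similarity preserves eigenvalues
        Q0 = Hk @ Q0                        # accumulate the product of the H_k
    return H, Q0

A = np.random.randn(6, 6)
H, Q0 = hessenberg_reduce(A)
print(np.allclose(np.tril(H, -2), 0, atol=1e-10))          # Hessenberg structure
print(np.allclose(Q0 @ A @ Q0.conj().T, H, atol=1e-10))    # Q0 A Q0^* = H
```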

Problem 28 Explain how at most n − 1 Householder transformations are needed to compute the QR factorization of A ∈ C^{n×n}. (This is the numerically reliable way to compute the QR factorization.)

Next comes the second phase whose purpose is to construct unitary similarity transformations (plus various shifts alongside to speed up the convergence to achieve O(n^3) complexity) with the aim of making the subdiagonal arbitrarily small so that the result is an upper triangular matrix plus E. The matrix E should be so small that it can be discarded. This is the actual QR iteration. Once achieved, the iteration is stopped.

The method can be motivated by generalizing the power method. With Q^{(0)} = I iterate according to

    for k = 1, 2, ...
        Z^{(k)} = H Q^{(k-1)}
        Z^{(k)} = Q^{(k)} R^{(k)}
    end                                                               (57)

This is a way to generalize the power method to matrices. Let us emphasize that the QR factorization of Z^{(k)} is needed. Mere scaling by the norms of the columns would be doing just the same power method n times. Recall that the QR factorization is the same as the Gram-Schmidt process started from the left-most column. This means that everything in the direction of the columns orthogonalized so far is projected away when the algorithm proceeds. This is the key idea for converging to different eigenvalues with differing columns.

Note that if we put

    H_{k-1} = Q^{(k-1)*} H Q^{(k-1)} = Q^{(k-1)*} (H Q^{(k-1)}) = (Q^{(k-1)*} Q^{(k)}) R^{(k)}.

Then

    H_k = Q^{(k)*} H Q^{(k)} = (Q^{(k)*} H Q^{(k-1)}) (Q^{(k-1)*} Q^{(k)}) = R^{(k)} (Q^{(k-1)*} Q^{(k)}).

This means that the iteration (57) can be interpreted as computing the QR decomposition of H_{k-1}; then H_k is obtained by changing the order of the factors. This means we can do the following QR iteration

    for k = 0, 1, 2, ...
        H_k = Q^{(k)} R^{(k)}
        H_{k+1} = R^{(k)} Q^{(k)}
    end                                                               (58)


instead. (Here we have H_0 = H.) It is this sequence H_k which converges to an upper triangular matrix, under sufficient assumptions. In high quality mathematical software, there are many tricks to speed up the convergence of this basic version of the QR iteration. For the convergence, see [1, Chapter 7.3].
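A bare-bones version of (58) in NumPy, without shifts or deflation, just to see the subdiagonal decay; here it is applied directly to a small symmetric matrix for simplicity, whereas the notes apply it to the Hessenberg form H. Real software is far more refined.

```python
import numpy as np

def qr_iteration(H0, steps=500):
    """Unshifted QR iteration (58): H_{k+1} = R^(k) Q^(k) where H_k = Q^(k) R^(k)."""
    Hk = np.array(H0, dtype=complex)
    for _ in range(steps):
        Q, R = np.linalg.qr(Hk)
        Hk = R @ Q                 # reversing the factors is a unitary similarity of H_k
    return Hk

A = np.random.randn(5, 5)
A = A + A.T                        # symmetric example with real eigenvalues
Hk = qr_iteration(A)
# For most such matrices the diagonal approaches the eigenvalues after enough steps.
print(np.sort(np.diag(Hk).real))
print(np.sort(np.linalg.eigvalsh(A)))   # compare: the two lists should (nearly) agree
```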

Problem 29 Suppose you have computed Q and T in the Schur decomposition of A ∈ C^{n×n}. At this point you know σ(A). How do you compute an eigenvector related with a given λ_j ∈ σ(A)?

In Problem 29 the eigenvectors are immediately available if A is normal, i.e., when T is a diagonal matrix. Then the eigenvalue problem can be numerically reliably solved, at least when the eigenvalues are distinct. In the nonnormal case, the eigenvalue problem can be extremely tough in finite precision arithmetic.

Problem 30 A reliable computation of the eigenvalues can be tough. Suppose you have performed the first phase of the QR iteration and computed H. Suppose your H looks like P in (45) except that the (1, n) entry is ε. You are not sure if your ε is a result of a rounding error. If so, perhaps it should be replaced with zero. Compute the eigenvalues (with the help of Problem 21) to see whether it really makes any difference.

    The generalized eigenvalue problem (16) is solved by computing the so-called

    generalized Schur decomposition.

Theorem 5 Let A, B ∈ C^{n×n}. Then there exist unitary Q, Z ∈ C^{n×n} and upper triangular T, S ∈ C^{n×n} such that A = QTZ^* and B = QSZ^*.

Proof. The following construction is not algorithmic. It is a generalization of the proof of the existence of the Schur decomposition.

Consider the polynomial p(λ) = det(A − λB). By the fundamental theorem of algebra, p has at least one zero λ. Hence, there exists at least one generalized eigenvector z_1 ∈ C^n such that Az_1 = λBz_1. Suppose it is of unit length and Z_1 ∈ C^{n×n} is unitary having z_1 as its first column. Let Q_1 ∈ C^{n×n} be unitary having q_1 = Bz_1/||Bz_1|| as its first column. (If Bz_1 = 0, then take any unit vector to be q_1.) Note that Az_1 is in the direction of q_1.

Then A = Q_1 T_1 Z_1^*, where T_1 has the first column consisting of zeros except the first entry. Similarly, B = Q_1 S_1 Z_1^*, where S_1 has the first column consisting of zeros except the first entry.

This idea is repeated with the right lower (n − 1)-by-(n − 1) blocks of T_1 and S_1. After orthonormalizations, we obtain A = Q_1 Q_2 T_2 Z_2^* Z_1^*, where the first


column of T_2 equals the first column of T_1. The second column of T_2 consists of zeros except for the first two entries. We obtain B = Q_1 Q_2 S_2 Z_2^* Z_1^*, where the first column of S_2 equals the first column of S_1. The second column of S_2 consists of zeros except for the first two entries.

    Continue this construction to have the claim. End of proof.

Now if we have the generalized Schur decomposition available, then

    Ax − λBx = 0

holds if and only if

    Q(T − λS)Z^*x = 0.                                                (59)

The generalized eigenvalues of the problem Ty = λSy are recovered immediately from the diagonal entries: t_{jj} − λ s_{jj} = 0 for j = 1, ..., n. For the eigenvectors, the techniques of Problem 29 apply.

The algorithm to compute the generalized Schur decomposition is called the QZ iteration [1]. Observe that there are also ways to transform a generalized eigenvalue problem into a standard eigenvalue problem. For instance, if B is invertible, then the generalized eigenvalue problem (16) is equivalent to solving the standard eigenvalue problem

    Mx = λx

with M = B^{-1}A. A possible source of numerical problems in equivalence transformations like this is the possible ill-conditioning of B. If the condition number

    κ(B) = ||B|| ||B^{-1}||

of B is very large, then B is said to be ill-conditioned. As opposed to this, in (59) all the appearing equivalence transformations are unitary matrices.

Problem 31 Let A, B ∈ C^{n×n}. Give an algorithm for computing unitary Q, Z such that Q^*AZ is a Hessenberg matrix and Q^*BZ upper triangular.

Since everything in numerical computations is subject to perturbations, it is informative to have a quantitative estimate of how eigenvalues behave then. The distance of a point x from a set Y is defined as ρ(x, Y) = inf_{y∈Y} ||x − y||. Then the distance of the set X from Y is defined as

    ρ(X, Y) = sup_{x∈X} ρ(x, Y).

The distance between X and Y is d(X, Y) = max{ρ(X, Y), ρ(Y, X)}. The following result is called the Bauer–Fike theorem.


Theorem 6 Suppose A ∈ C^{n×n} is diagonalizable as A = XΛX^{-1}. Then

    ρ(σ(A + E), σ(A)) ≤ κ(X) ||E||.

Proof. Suppose λ ∈ σ(A + E). If λ ∈ σ(A), then the claim holds. So let us assume λ ∉ σ(A). Then

    (λI − Λ)^{-1} X^{-1} (λI − A − E) X = I − (λI − Λ)^{-1} X^{-1} E X

is singular. This forces^15

    1 ≤ ||(λI − Λ)^{-1} X^{-1} E X|| ≤ ||(λI − Λ)^{-1}|| ||X^{-1}|| ||E|| ||X||.

Since λI − Λ is diagonal, we obtain

    ||(λI − Λ)^{-1}|| = max_j 1/|λ − λ_j(A)| = 1/min_j |λ − λ_j(A)|,

which yields the claim. End of proof.

This means that the best stability results can be expected in the normal case, i.e., when X can be chosen to be unitary. Otherwise, κ(X) is a good measure of the reliability of computations. Of course, a nice thing about eigenvalue computations is that you can always compute the so-called residual ||Ax − λx|| to check how accurate eigenpair approximations λ and x you have generated.
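For example, with NumPy one can always assess computed eigenpairs a posteriori through their residuals (a small sketch, not from the notes):

```python
import numpy as np

A = np.random.randn(6, 6)
lam, X = np.linalg.eig(A)            # computed eigenvalues and eigenvectors (columns)
for j in range(A.shape[0]):
    r = np.linalg.norm(A @ X[:, j] - lam[j] * X[:, j])   # residual ||Ax - lam x||
    print(lam[j], "residual:", f"{r:.2e}")
```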

    There are also perturbation results for the non-diagonalizable case, but the

    bounds are very weak. The reason for this can be seen in Problem 30.

Problem 32 The trace of A ∈ C^{n×n} is defined as tr(A) = Σ_{j=1}^{n} a_{jj}. Show that

    p(λ) = det(A − λI) = (−1)^n λ^n + (−1)^{n-1} tr(A) λ^{n-1} + ... + det(A).

Conclude (by using the Schur decomposition) that tr(A) is the sum of the eigenvalues of A. Consequently, the sum of the eigenvalues depends continuously on A very nicely.

Once there is a way to solve the eigenvalue problem, it allows computing functions of matrices economically. So far we have dealt with polynomials in a matrix A defined as

    p(A) = Σ_{j=0}^{k} a_j A^j                                        (60)

^15 Recall that if ||M|| < 1, then I − M is invertible.


for any polynomial p(z) = Σ_{j=0}^{k} a_j z^j. Polynomials in matrices are of importance in connection with iterative methods, although they are not formed explicitly.

For more complicated functions, the most important example is the exponential of A defined analogously by replacing the variable z with A to have

    e^A = Σ_{j=0}^{∞} A^j / j!.                                       (61)

Typically the exponential appears in solving differential equations with time dependence. Then it is used by applying it as

    t ↦ e^{At} b,                                                     (62)

where t > 0 and b ∈ C^n. It solves the initial value problem x' = Ax with x(0) = b. When resulting from discretizing time dependent PDEs, n is very large. This is important for the following reason. In (62), the exponential itself is not needed, just matrix-vector products with it when t varies. Therefore, you do not want to compute the exponential e^{At}, if somehow it can be avoided. (Compare with solving a linear system. Then A^{-1} is not needed, only A^{-1}b.)

The most general definition is given by using complex analysis and the Cauchy integral formula as follows. If f is analytic in a neighborhood of σ(A), then

    f(A) = (1/(2πi)) ∮_Γ f(λ) (λI − A)^{-1} dλ,                       (63)

where Γ is a simple Jordan curve surrounding the eigenvalues counterclockwise (and included in the domain of definition of f). This is easy to accept by considering a diagonalizable matrix A = XΛX^{-1}. Then we can factor using the similarity to have

    f(A) = X ( (1/(2πi)) ∮_Γ f(λ) (λI − Λ)^{-1} dλ ) X^{-1}.          (64)

For the integral this means using the standard complex analysis on the diagonal, i.e., the resulting diagonal matrix is D(f) = diag(f(λ_1), ..., f(λ_n)).

Thereafter perform the similarity to get f(A) = XD(f)X^{-1}. In particular, we see that if f is a polynomial or the exponential function, then f(A) agrees with our earlier definitions (60) and (61). The usage of (64) requires that A can be reliably diagonalized. This is not always a realistic assumption.
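As an illustration of f(A) = XD(f)X^{-1}, here is a sketch assuming A is reliably diagonalizable and X is well conditioned; for the exponential of a general matrix one would rather use a dedicated routine such as scipy.linalg.expm. The function name is illustrative.

```python
import numpy as np

def fun_of_matrix(A, f):
    """Compute f(A) = X diag(f(lambda_1), ..., f(lambda_n)) X^{-1}.
    Assumes A is diagonalizable with a well conditioned eigenvector matrix X."""
    lam, X = np.linalg.eig(A)
    return X @ np.diag(f(lam)) @ np.linalg.inv(X)

A = np.array([[0.0, 1.0], [-1.0, 0.0]])          # skew-symmetric example
expA = fun_of_matrix(A, np.exp)
# e^A is the rotation by one radian, and it is unitary (in line with Example 18 below).
print(np.allclose(expA, [[np.cos(1), np.sin(1)], [-np.sin(1), np.cos(1)]]))
print(np.allclose(expA @ expA.conj().T, np.eye(2)))
```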

The above arguments prove the spectral mapping theorem

    σ(f(A)) = f(σ(A))

in the diagonalizable case.


Example 17 Let A ∈ C^{n×n} be skew-Hermitian, i.e., A^* = −A. Then (I − A)(I + A)^{-1} is the so-called Cayley transform of A. It follows from (64) that the Cayley transform is unitary.

Example 18 Let A ∈ C^{n×n} be skew-Hermitian. Then in the numerical solving of the time dependent Schrödinger equation one needs to apply e^{At} with t > 0 to a vector. Also now e^{At} is unitary.

In practice it depends on f how f(A) should be computed. For the exponential function, the method of choice is the so-called squaring and scaling algorithm [1]. (Any high quality mathematical software, such as Matlab, has an implementation of this algorithm.) The squaring and scaling algorithm is not based on diagonalizing A.

Problem 33 (This problem is from [1, Chapter 11].) Let A ∈ C^{n×n} be Hermitian positive definite. (a) Show that there exists a unique Hermitian positive definite X such that A = X^2. (b) Show that if X_0 = I and X_{k+1} = (X_k + AX_k^{-1})/2, then X_k → √A with quadratic speed, where √A denotes the matrix X of (a).

    9 Iterative methods

Next we consider ways to solve the linear system

    Ax = b                                                            (65)

for an invertible A ∈ C^{n×n} and b ∈ C^n given when n is large. By large is meant that direct methods such as the LU factorization of O(n^3) complexity and O(n^2) storage are not acceptable. Thus n is of order O(10^4) at least. It could be of order O(10^7) or even O(10^8).

Practically all the problems considered so far in these lecture notes appear for n large. The large scale eigenvalue problem is encountered often. So is the task of computing the SVD. (Or, rather, some singular values and possibly related singular vectors.) Also the problem of applying the exponential of a very large matrix to a vector arises in practice.

Instead of direct methods, in large scale problems one executes iterative methods. Iterative methods require a different mindset compared with when direct methods are used. By iterative methods we mean algorithms which are not based on factoring the coefficient matrix A. In iterative methods one uses information based on matrix-vector products and then constructs approximations to the solution of (65). A rule of thumb is that a single iteration step


should not cost more than O(n) or O(n log n) floating point operations. The approximations (hopefully) improve step by step until sufficient accuracy is reached. Very seldom does one need approximations in full machine precision. Something like four or six correct digits may well suffice.

Since a single iteration step should not cost more than O(n) or O(n log n) floating point operations, none of the performed matrix-vector products can cost more than this. It means that A is either sparse, or has some very specific structure which makes this possible (like some sort of Toeplitz-like structure). Sparse means that A has only O(n) nonzero entries. Sparse matrices are not stored like full matrices, i.e., zeros are not stored. The location and the values of the entries are stored instead.^17 Matrix-vector products are programmed such that no zero multiplications are performed.

    If there is an option to model a problem either with a differential equation

or with an integral equation, the former gives rise to sparse matrices, after discretizing.

Iterative methods are not guaranteed to converge, i.e., it may take far too many iterations to achieve the required accuracy. In other words, although a single iteration step is assumed to be inexpensive, we cannot afford to take them indefinitely. In fact, in serious applications, the basic iterative methods typically do not converge fast enough. Fortunately, there are ways to speed up the convergence by so-called preconditioning. It can be said that the task of solving a large scale problem depends on how successfully the problem can be preconditioned.

In large scale problems one still needs direct methods very much, although more indirectly. While the iteration is executed, direct methods are used for very small sub-problems at every iteration step. With the so-called optimal iterative methods one needs to solve least squares problems.

In the least squares problem one is concerned with a linear system

    My = c                                                            (66)

for M ∈ C^{k×j} and c ∈ C^k given such that k ≥ j. It is assumed that the columns of M are linearly independent. Since the problem is overdetermined, i.e., there are more equations than variables, (66) is rarely solvable. (Clearly, the problem is solvable if and only if c is in the column space of M.^18) Therefore it is of interest to solve the problem in the least squares sense by solving

    min_{y∈C^j} ||My − c||^2                                          (67)

^17 Sparse matrix storage means that A is located in memory in a nonstandard way. This has some consequences. For example, performing matrix-vector products with A can be very costly, i.e., slow.

^18 Recall that the column space of M is the subspace {My : y ∈ C^j} of C^k, i.e., the set of all linear combinations of the columns of M.


instead. This is easily accomplished with the help of the orthogonal projector P onto the column space of M. This is achieved by computing the reduced QR factorization M = QR of M, where Q ∈ C^{k×j} has orthonormal columns spanning the column space of M and R ∈ C^{j×j} is upper triangular.^19 (Do this either with the help of Problem 3 or 28. For numerical stability, the latter alternative is recommended.) Then P = QQ^*. Consequently, (67) equals

    min_{y∈C^j} ||My − QQ^*c||^2 + ||(I − QQ^*)c||^2.                 (68)

Since ||(I − QQ^*)c|| is a constant, it suffices to concentrate on

    min_{y∈C^j} ||My − QQ^*c||^2 = min_{y∈C^j} ||QQ^*My − QQ^*c||^2 = min_{y∈C^j} ||Q(Ry − Q^*c)||^2.   (69)

Since the columns of M were assumed to be linearly independent, the value of this latter minimization problem is zero by choosing y = R^{-1}Q^*c. This also solves (67). (Gauss invented a way to solve the least squares problem. The above is a numerically correct way of doing it.)
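A NumPy transcription of this recipe, assuming M has full column rank; numpy.linalg.qr in reduced mode gives exactly the factorization used above, and the data and helper name are hypothetical.

```python
import numpy as np

def least_squares_qr(M, c):
    """Solve min_y ||My - c||: compute the reduced M = QR and set y = R^{-1} Q^* c."""
    Q, R = np.linalg.qr(M, mode='reduced')       # Q has orthonormal columns, R is j x j
    return np.linalg.solve(R, Q.conj().T @ c)    # back substitution would suffice here

# Polynomial fitting in the spirit of Problem 34, with made-up data.
x = np.linspace(0.0, 1.0, 9)
y = 1.0 + 0.5 * x - 0.3 * x**2 + 0.01 * np.random.randn(9)
M = np.vander(x, 4, increasing=True)             # rectangular Vandermonde matrix
coef = least_squares_qr(M, y)
print(np.linalg.norm(M @ coef - y))              # small residual
print(np.allclose(coef, np.linalg.lstsq(M, y, rcond=None)[0]))   # agrees with lstsq
```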

Least squares problems appear frequently, and not just in connection with iterative methods.

Problem 34 In Problem 23 you had square Vandermonde matrices because you did interpolation. Explain what kind of rectangular Vandermonde matrices you obtain when you have to fit a polynomial of degree j − 1 in the least squares sense to go approximately through the points (x_1, y_1), ..., (x_k, y_k).

Find the best cubic least squares fit to

    (0, 0.486), (0.15, 1.144), (0.30, 1.166), (0.45, 1.095),
    (0.60, 1.099), (0.75, 1.117), (0.9, 1.38), (1.05, 1.857).

Use Matlab (and its built-in reduced QR factorization function) and plot your curve against the points.

With this tool available, we can now introduce the so-called GMRES method (generalized minimal residual method) for iteratively solving the linear system (65). The reasoning behind the GMRES method is largely based on the equality (23), i.e., we know there exists a polynomial p such that

    Ap(A)b = b

and therefore x = p(A)b. Since n is very large, constructing this polynomial is completely unrealistic. Instead, one uses lower degree polynomials to

^19 This is the reduced QR factorization of M. The other choice is M = QR with Q ∈ C^{k×k} unitary and R = [R̃^T 0]^T ∈ C^{k×j}, where R̃ ∈ C^{j×j} is upper triangular.


get approximations through solving the arising minimization problems. This means that at the jth step one solves

    min_{deg(p_{j-1}) ≤ j-1} ||Ap_{j-1}(A)b − b||                     (70)

to have the corresponding approximation as x_j = p_{j-1}(A)b. This is realistic only for j ≪ n. Here deg(p) denotes the degree of a polynomial p. Of course, one cannot expect to have zero with (70) since j is small.

Example 19 Suppose A can be split as A = I − B with ||B|| < 1. Then A^{-1} = Σ_{j=0}^{∞} B^j by the Neumann series, so truncating the series yields polynomial approximations to x = A^{-1}b; already a low degree polynomial in A applied to b can then be a good approximation.

In (70) the approximation x_j = p_{j-1}(A)b is sought from the Krylov subspace K_j(A; b) = span{b, Ab, ..., A^{j-1}b}. For computations, an orthonormal basis of K_j(A; b) is constructed as follows. Set q_1 = b/||b||. Compute Aq_1 and orthonormalize it against q_1 to have h_{21}q_2 = Aq_1 − h_{11}q_1, where the coefficients are computed by executing the modified Gram-Schmidt process.^20 Then compute Aq_2 and orthonormalize this vector against q_1 and q_2 to have

    h_{32} q_3 = Aq_2 − h_{12} q_1 − h_{22} q_2.

The coefficients h_{12}, h_{22} and h_{32} are computed by executing the modified Gram-Schmidt process. The unit vector q_3 is orthogonal against q_1 and q_2. Hence, the kth iterate is computed according to

    h_{k,k-1} q_k = Aq_{k-1} − Σ_{l=1}^{k-1} h_{l,k-1} q_l,           (73)

where the coefficients h_{l,k-1} are computed by executing the modified Gram-Schmidt process such that Aq_{k-1} is orthogonalized against the orthonormal vectors q_1, ..., q_{k-1} computed so far. After j steps we have computed an

    The computations satisfy the matrix identity

    AQj =Qj+1Hj, (74)

    where qj = [q1 q2 qj] Cnj has orthonormal columns and Hj = {hst} C(j+1)j has a Hessenberg-like structure. The entries of the k 1th columnofHj are obtained from the identity (73). Using this means that (70) can bewritten as

    minyjCj

    ||AQj yj b|| = minyjCj

    ||Qj+1Hjyj b|| = minyjCj

    ||Hjyj e1||, (75)

    where = ||b||.Solving this (small) least squares problem is straighforward.Once done, we obtain the approximate solution as xj =Qj yj.
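Putting (73)–(75) together, here is a compact GMRES sketch in NumPy (no restarts, no preconditioning, dense least squares solve); it is meant to mirror the description above, not to compete with a library routine such as scipy.sparse.linalg.gmres, and the function name is illustrative.

```python
import numpy as np

def gmres_simple(A, b, m=30):
    """GMRES on Ax = b: Arnoldi basis of K_m(A; b) plus the small least squares problem (75)."""
    n = b.size
    beta = np.linalg.norm(b)
    Q = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    Q[:, 0] = b / beta
    for k in range(m):
        w = A @ Q[:, k]                        # one matrix-vector product per step
        for l in range(k + 1):                 # modified Gram-Schmidt, cf. (73)
            H[l, k] = np.vdot(Q[:, l], w)
            w = w - H[l, k] * Q[:, l]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] == 0:                   # invariant subspace reached, cf. (76)
            m = k + 1
            break
        Q[:, k + 1] = w / H[k + 1, k]
    # Solve min_y ||H y - beta e_1|| and form x = Q_m y, cf. (75).
    e1 = np.zeros(m + 1); e1[0] = beta
    y, *_ = np.linalg.lstsq(H[:m + 1, :m], e1, rcond=None)
    return Q[:, :m] @ y

n = 200
A = np.eye(n) + 0.5 * np.random.randn(n, n) / (2.0 * np.sqrt(n))   # A = I - B, ||B|| ~ 0.5
b = np.random.randn(n)
x = gmres_simple(A, b, m=30)
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))   # small relative residual (cf. Example 19)
```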

Problem 35 How many inner products are required to have the factorization (74)?

Problem 36 Suppose at the jth step the dimension of the Krylov subspaces ceases to grow, i.e.,

    j = dim K_j(A; b) = dim K_{j+1}(A; b).                            (76)

Show that then (74) can be written as AQ_j = Q_j H_j, where H_j ∈ C^{j×j} is a Hessenberg matrix. If j = n and A is Hermitian, what can you say about H_n? When taken into account, how many inner products are needed then to have the factorization (74)?

^20 The modified Gram-Schmidt process is a more accurate version of the standard Gram-Schmidt process. In exact arithmetic the results coincide.


Problem 37 If (76) holds, then K_j(A; b) is said to be an invariant subspace of the matrix A. This means that

    A K_j(A; b) ⊆ K_j(A; b)

holds. How are the eigenvalues and corresponding eigenvectors of H_j related with those of A?

Denote by r_j = b − Ax_j the residual at the jth step. Since K_j(A; b) ⊆ K_{j+1}(A; b), we have

    ||r_j|| ≥ ||r_{j+1}||                                             (77)

for every j. In this sense the GMRES method is optimal and improves the approximation at every step.

Regarding the computational cost of the GMRES method, the Arnoldi iteration, i.e., the construction of an orthonormal basis of K_j(A; b), dominates the complexity. It requires computing j − 1 matrix-vector products. Then, the Gram-Schmidt process requires computing inner products. The cost of solving (75) is negligible.

The storage required by the GMRES method grows linearly with the iteration number. It consists, in essence, of the need to store the orthonormal basis q_1, ..., q_j. (Of course, A must be stored as well.) The storage required by H_j is negligible.