CHAPTER III
SPECTRA OF OPERATORS
In this chapter, we investigate the central problem in linear algebra: the eigenvalue
and eigenvector problem. The importance of this problem can be understood from a purely
mathematical point of view: it is the gateway leading to our understanding of the structure
of a linear operator. It is also needed for understanding our physical world. We can tell if
a star millions of light years away is mainly composed of hydrogen atoms by reading through
a spectroscope the “eigenvalues” of the Schrödinger operator for hydrogen from the light
it emits. When a bell shows cracks, it starts to sound dull because of the decrease in
each eigenvalue of a certain operator. Working in natural science or engineering research,
we should always be prepared to encounter eigenvalue problems.
§1. Eigenvalues and Eigenvectors
1.1. As we know, given a linear operator T on a finite dimensional space V ,
we can convert it into a matrix by using a basis of V. However, different bases give
different matrix representations of T. Naturally we wonder: can we pick a basis so that the
corresponding matrix representing T is a diagonal matrix, which is considered to be the
simplest? Certainly, diagonal matrices are easy for computation purposes, and an answer
to this question has practical value. Suppose that a basis B in V consisting of vectors
v1, v2, . . . , vn is judiciously chosen so that the matrix representing T relative to B is
diagonal, say
[T]_B = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}.
This means Tvj = λjvj , or (T −λjI)vj = 0 for all j = 1, 2, . . . , n. This identity indicates
that λj is an eigenvalue of T and vj is a corresponding eigenvector.
Definition 1.1. By the spectrum of an operator T on a finite dimensional complex
vector space V , denoted by σ(T ), we mean the set of all complex numbers λ such that
T − λI is not invertible. We know that an operator on a finite dimensional vector space is
not invertible if and only if its kernel is nonzero. Therefore we may put
σ(T) = {λ ∈ C : ker(T − λI) ≠ {0}}.
A complex number λ in the spectrum σ(T ) is also called an eigenvalue of T . By our
definition here, if λ is an eigenvalue of T, the subspace ker(T − λI) is nonzero. This
subspace is called the eigenspace corresponding to λ. A nonzero vector in this eigenspace
is called an eigenvector of T corresponding to the eigenvalue λ.
Notice that
v ∈ ker(T − λI) ⇔ (T − λI)v = 0 ⇔ Tv = λv.
Hence we have:
λ ∈ σ(T ) if and only if Tv = λv holds for some nonzero vector v.
♠ Aside: The importance of the word “nonzero” in the above statement cannot be overem-
phasized. If it were dropped, the statement would become absolute nonsense, because
Tv = λv is always satisfied for some vector v, namely, 0. ♠
1.2. How do we find eigenvalues? Take any basis E in V and consider the matrix
representation [T − λI] := [T ]− λI of T − λI relative to this basis; (here we use the same
symbol for the identity operator as well as the identity matrix). Now λ is an eigenvalue
of T means that T − λI is not invertible, and descending to matrices, this means that
[T ]− λI is not invertible, or λI − [T ] is not invertible. But we know that a matrix is not
invertible if and only if its determinant is zero. Hence λ is an eigenvalue of T if and only
if it satisfies:
det(λI − [T ]) = 0. (1.2.1)
If the dimension of the space V is n, i.e. if [T ] is an n × n matrix, then (1.2.1) is a
polynomial equation in λ of degree n. It is called the characteristic equation of the
matrix [T ], or of the operator T . The expression det(λI − [T ]), which is a polynomial in
λ of degree n, (n = dimV ), is called the characteristic polynomial of the operator T .
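As a quick computational illustration (a minimal Python/NumPy sketch, not part of the original text; the sample matrix is hypothetical), np.poly returns the coefficients of the characteristic polynomial det(λI − A) of a square matrix, and its roots are the eigenvalues:

    import numpy as np

    # A sample 2x2 matrix (hypothetical illustration).
    A = np.array([[1.0, 1.0],
                  [2.0, 0.0]])

    coeffs = np.poly(A)             # coefficients of det(lambda*I - A), highest degree first
    eigenvalues = np.roots(coeffs)  # roots of the characteristic polynomial
    print(coeffs)                   # [ 1. -1. -2.]  i.e.  lambda^2 - lambda - 2
    print(eigenvalues)              # [ 2. -1.]  (up to ordering)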
Example 1.2.1. Define an operator T on P2 by
T (p(x)) = xp′(x)− p(x+ 1).
Take the standard basis B = {1, x, x2} of P2. To find the matrix [T ]B, we compute:
T (1) = 0− 1 = −1,
T (x) = x− (x+ 1) = −1,
T (x2) = x(2x)− (x+ 1)2 = x2 − 2x− 1.
Hence

[T] \equiv A = \begin{pmatrix} -1 & -1 & -1 \\ 0 & 0 & -2 \\ 0 & 0 & 1 \end{pmatrix}, \quad \text{and} \quad \det([T] - \lambda I) = \begin{vmatrix} -1-\lambda & -1 & -1 \\ 0 & -\lambda & -2 \\ 0 & 0 & 1-\lambda \end{vmatrix}.
Thus the characteristic equation of T is
(−1− λ)(−λ)(1− λ) = 0
and hence the eigenvalues of T are 1, 0,−1. In other words, σ(T ) = {1, 0,−1}. Next we
find an eigenvector of T for each eigenvalue. First consider the eigenvalue λ = 1. An
eigenvector corresponding to λ = 1 is a polynomial p ≡ p(x) such that (T − λI)(p) = 0.
This gives [T − λI][p] = 0, or ([T ]− λI)[p] = 0. Here [T ] is the matrix A given above, λ is
1, and [p] is a column, say [p] = X = [x1, x2, x3]⊤. Thus we have (A− I)X = O, i.e.
\begin{pmatrix} -2 & -1 & -1 \\ 0 & -1 & -2 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
This matrix equation gives the following homogeneous system of linear equations:
(−2)x1 + (−1)x2 + (−1)x3 = 0
(−1)x2 + (−2)x3 = 0
We only need to find one nontrivial solution to this equation. This is easy to do. Set x3 = 1.
Then, from the second equation, we have x2 = (−2)x3 = −2. From the first equation,
we have x1 = (1/2)(−x2 − x3) = (1/2)(−(−2) − 1) = 1/2. Thus X = [1/2,−2, 1]⊤ is
a solution. To get a neater expression, we multiply this solution by 2 to obtain another
solution X1 = [1,−4, 2]⊤. The polynomial p1(x) with X1 = [1,−4, 2]⊤ as its column
representation relative to the standard basis {1, x, x²} is p1(x) = 1 − 4x + 2x². (Aside. We
can check that this polynomial is indeed an eigenvector corresponding to λ = 1: since
p1(x + 1) = 2x² − 1, we have T(p1(x)) = x(4x − 4) − (2x² − 1) = 2x² − 4x + 1 = p1(x).)
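As an additional check (a Python/NumPy sketch, not part of the original text), we can verify numerically that AX1 = 1 · X1 and that the spectrum is {1, 0, −1}:

    import numpy as np

    A = np.array([[-1, -1, -1],
                  [ 0,  0, -2],
                  [ 0,  0,  1]])
    X1 = np.array([1, -4, 2])

    print(A @ X1)                # [ 1 -4  2], i.e. 1 * X1, so X1 is an eigenvector for lambda = 1
    print(np.linalg.eigvals(A))  # 1, 0, -1 (up to ordering)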
Since v1, v2, . . . , vk are eigenvectors of T corresponding to the distinct eigenvalues λ1, λ2, . . . , λk
respectively, these k eigenvectors, by our induction hypothesis, are linearly independent.
Therefore we must have a1 = 0, a2 = 0, . . . , ak = 0. It remains to show a_{k+1} = 0.
Return to (1.3.2). We can rewrite this identity as a_{k+1}v_{k+1} = 0. Since v_{k+1} is nonzero
(because it is an eigenvector), we also have a_{k+1} = 0. The proof is complete.
Theorem 1.3.2. If a linear operator T defined on an n-dimensional complex vector
space V has n distinct eigenvalues, then T is diagonalizable, that is, there exists a basis B
consisting of eigenvectors of T so that the representing matrix [T ]B of T relative to B is a
diagonal matrix.
Let λ1, λ2, . . . , λn be the distinct eigenvalues of T and let v1, v2, . . . , vn be their corresponding
eigenvectors: Tv1 = λ1v1, Tv2 = λ2v2, . . . , Tvn = λnvn. By Theorem 1.3.1, we know
that v1,v2, . . . ,vn are linearly independent. Since n = dimV , these vectors form a basis
of V . So the above theorem is valid.
1.4. Let A be an n× n complex matrix. We may regard A as a linear operator on
Cn; (that is, we identify A with the linear operator MA defined by MAx = Ax for all
x in Cn). Assume that A has n distinct eigenvalues λ1, λ2, . . . , λn with corresponding
eigenvectors P1, P2, . . . , Pn (which are column vectors). According to Theorem 1.3.1,
these column vectors are linearly independent. Let us write
AP1 = P1λ1, AP2 = P2λ2, . . . , APn = Pnλn.
Here, certainly Pkλk is the scalar multiple of the column vector Pk by λk. The reason
we write in this way instead of λkPk is because Pk is regarded as an n × 1 matrix and
λk as an 1 × 1 matrix. The correct order here is crucial for performing block matrix
multiplication below. Let P be the n × n matrix [P1 P2 · · · Pn]. The matrix P is
invertible, since it is a square matrix and its columns are linearly independent. Now
AP = A[P_1 \; P_2 \; \cdots \; P_n] = [AP_1 \; AP_2 \; \cdots \; AP_n] = [P_1\lambda_1 \; P_2\lambda_2 \; \cdots \; P_n\lambda_n]
   = [P_1 \; P_2 \; \cdots \; P_n] \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} = PD.
Thus AP = PD, where D is the diagonal matrix with the eigenvalues of A along its diagonal,
or A = PDP^{-1}, which gives a diagonalization of the matrix A.
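Here is a minimal computational sketch of this factorization (Python/NumPy, not part of the original text; the sample matrix is hypothetical). np.linalg.eig returns exactly the ingredients: the eigenvalues (the diagonal of D) and a matrix P whose columns are corresponding eigenvectors:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [2.0, 0.0]])   # sample matrix with two distinct eigenvalues

    evals, P = np.linalg.eig(A)  # columns of P are eigenvectors
    D = np.diag(evals)

    # Check A = P D P^{-1} up to floating-point error.
    print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True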
Example 1.4.1. In this example we show how diagonalization helps for solving linear
differential equations. We are asked to find a general solution to the system of equations
\frac{dy_1}{dt} = y_1 + y_2, \qquad \frac{dy_2}{dt} = 2y_1.
We can rewrite this system as
\frac{dy}{dt} = Ay \quad \text{with} \quad y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \quad A = \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix}.
Using the method described in Example 1.2.1, we find the eigenvalues 2, −1 of A with
corresponding eigenvectors P1 = (1, 1), P2 = (1, −2). Let

P = \begin{pmatrix} 1 & 1 \\ 1 & -2 \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix} \quad \text{with} \quad P^{-1} = -\frac{1}{3}\begin{pmatrix} -2 & -1 \\ -1 & 1 \end{pmatrix}.
Then we have AP = PD, as we can check directly. Replacing A in dy/dt = Ay by
PDP^{-1}, we have dy/dt = PDP^{-1}y, or d(P^{-1}y)/dt = DP^{-1}y. Let w = P^{-1}y. Then
dw/dt = Dw and y = Pw. Thus we have
\frac{dw_1}{dt} = 2w_1, \qquad \frac{dw_2}{dt} = -w_2, \quad \text{with} \quad y_1 = w_1 + w_2, \quad y_2 = w_1 - 2w_2.
The new system of differential equations is easy to solve: w_1 = C_1e^{2t}, w_2 = C_2e^{-t}. Our
final answer is

y_1 = C_1e^{2t} + C_2e^{-t}, \qquad y_2 = C_1e^{2t} - 2C_2e^{-t}.
(The reader should check this answer.)
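The requested check can also be automated; the following sketch (Python/SymPy, not part of the original text) substitutes the answer back into the two differential equations:

    import sympy as sp

    t, C1, C2 = sp.symbols('t C1 C2')
    y1 = C1*sp.exp(2*t) + C2*sp.exp(-t)
    y2 = C1*sp.exp(2*t) - 2*C2*sp.exp(-t)

    # Both residuals simplify to 0, so the general solution satisfies the system.
    print(sp.simplify(sp.diff(y1, t) - (y1 + y2)))  # 0
    print(sp.simplify(sp.diff(y2, t) - 2*y1))       # 0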
Example 1.4.2. Consider the system of difference equations
u_{n+1} = u_n + v_n, \qquad v_{n+1} = 2u_n; \qquad n ≥ 0.
We can rewrite this system as
y_{n+1} = Ay_n \quad \text{with} \quad y_n = \begin{pmatrix} u_n \\ v_n \end{pmatrix}, \quad A = \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix}.
We have y_1 = Ay_0, y_2 = Ay_1 = A^2y_0, y_3 = Ay_2 = A^3y_0, etc. In general, y_n = A^ny_0.
So, in order to find the general solution to this system of difference equations, we need to
give an explicit expression for An. Using the method described in Example 1.2.1, we find
A = PDP−1, where
P = \begin{pmatrix} 1 & 1 \\ 1 & -2 \end{pmatrix}, \quad D = \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix} \quad \text{and} \quad P^{-1} = -\frac{1}{3}\begin{pmatrix} -2 & -1 \\ -1 & 1 \end{pmatrix}.
Now
A^n = (PDP^{-1})^n = PDP^{-1} \, PDP^{-1} \cdots PDP^{-1} = PD^nP^{-1}.
So
A^n = -\frac{1}{3}\begin{pmatrix} 1 & 1 \\ 1 & -2 \end{pmatrix}\begin{pmatrix} 2^n & 0 \\ 0 & (-1)^n \end{pmatrix}\begin{pmatrix} -2 & -1 \\ -1 & 1 \end{pmatrix} = \frac{1}{3}\begin{pmatrix} 2^{n+1} + (-1)^n & 2^n - (-1)^n \\ 2^{n+1} - 2(-1)^n & 2^n + 2(-1)^n \end{pmatrix}.
Thus [un vn]⊤ = yn = Any0 gives
u_n = \frac{2^{n+1} + (-1)^n}{3}\, u_0 + \frac{2^n - (-1)^n}{3}\, v_0,
v_n = \frac{2^{n+1} - 2(-1)^n}{3}\, u_0 + \frac{2^n + 2(-1)^n}{3}\, v_0.
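A quick numerical cross-check of this closed form (a Python/NumPy sketch, not part of the original text):

    import numpy as np

    A = np.array([[1, 1],
                  [2, 0]])
    n = 10
    An = np.linalg.matrix_power(A, n)

    closed = np.array([[2**(n+1) + (-1)**n,   2**n - (-1)**n],
                       [2**(n+1) - 2*(-1)**n, 2**n + 2*(-1)**n]]) / 3

    print(np.allclose(An, closed))  # True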
Example 1.4.3. An elevator in Herzberg Building has two states: s1 for “working”
and s2 for “out of order”. Let pij denote the probability of being in state i on the next day,
when today’s elevator is in state j. A student trying to take this elevator every day comes
up with the following subjective probabilities: p11 = 0.5, p21 = 0.5, p12 = 0.1, p22 = 0.9,
after a year of observation. Let
P = \begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix} = \begin{pmatrix} 0.5 & 0.1 \\ 0.5 & 0.9 \end{pmatrix}.
To find the frequency of the elevator in working condition, we need to compute \lim_{n\to\infty} P^n.
Following the method described in Example 1.2.1, we find eigenvalues 1, 0.4 with corre-
sponding eigenvectors (1, 5), (1,−1). Then we have P = SDS−1, where
S = \begin{pmatrix} 1 & 1 \\ 5 & -1 \end{pmatrix}, \quad D = \begin{pmatrix} 1 & 0 \\ 0 & 0.4 \end{pmatrix} \quad \text{and} \quad S^{-1} = \frac{1}{6}\begin{pmatrix} 1 & 1 \\ 5 & -1 \end{pmatrix}.
As n → ∞,
P^n = SD^nS^{-1} = \frac{1}{6}\begin{pmatrix} 1 & 1 \\ 5 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0.4^n \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 5 & -1 \end{pmatrix}

tends to

\frac{1}{6}\begin{pmatrix} 1 & 1 \\ 5 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 5 & -1 \end{pmatrix} = \begin{pmatrix} 1/6 & 1/6 \\ 5/6 & 5/6 \end{pmatrix}.
So, on average, about 1/6 of the time the elevator is in working condition. This seems to
fit the student’s experience over the year.
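A direct numerical check of this limit (a Python/NumPy sketch, not part of the original text; since 0.4^n decays quickly, a moderate power already sits very close to the limit):

    import numpy as np

    P = np.array([[0.5, 0.1],
                  [0.5, 0.9]])

    print(np.linalg.matrix_power(P, 50))
    # approximately [[0.1667, 0.1667],
    #                [0.8333, 0.8333]], i.e. [[1/6, 1/6], [5/6, 5/6]]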
1.5. We have seen that, if A = PDP−1, then An = PDnP−1. More generally, if
p is a polynomial, then p(A) = Pp(D)P−1. This suggests the definition of f(A) for any
function f (defined on the spectrum σ(A) of A) by putting f(A) = Pf(D)P−1, where
f(D) = \begin{pmatrix} f(\lambda_1) & 0 & \cdots & 0 \\ 0 & f(\lambda_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & f(\lambda_n) \end{pmatrix} \quad \text{for} \quad D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}.
Computing f(A) is called the functional calculus of A. Besides polynomials, another
commonly used function for functional calculus is f_t(x) = e^{xt}, where t is a parameter. We
will write f_t(A) as e^{At} in the future.
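A minimal sketch of this functional calculus (Python/NumPy with SciPy, not part of the original text; it assumes A is diagonalizable), compared against SciPy's matrix exponential:

    import numpy as np
    from scipy.linalg import expm

    def func_calc(A, f):
        # f(A) = P f(D) P^{-1} via an eigendecomposition (assumes A is diagonalizable).
        evals, P = np.linalg.eig(A)
        return P @ np.diag(f(evals)) @ np.linalg.inv(P)

    A = np.array([[1.0, 1.0],
                  [2.0, 0.0]])
    t = 0.5

    # e^{At} computed by functional calculus agrees with SciPy's expm.
    print(np.allclose(func_calc(A * t, np.exp), expm(A * t)))  # True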
Consider the initial value problem dy/dt = Ay with y(0) = y0, where y0 is a given
vector in Cn. Formally we can write down the solution as
y(t) = eAty0. (1.5.1)
The 1-dimensional case is well known: the solution of the initial value problem dy/dt = ay
with y(0) = y_0 is y(t) = e^{at}y_0. It is known that the Taylor expansion for the exponential
function e^{at} is

e^{at} = \sum_{n=0}^{\infty} \frac{(at)^n}{n!}.
(a) If A is similar to B and if C is similar to D, then AC is similar to BD.
(b) If A is similar to B, then A2 is similar to B2.
(c) If A is similar to B, then A⊤ is similar to B⊤.
(d) If A is invertible and B is similar to A, then B is also invertible.
(e) If A is similar to the identity matrix I, then A = I.
Exercises
1. Show that the matrices A = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix} and B = \begin{pmatrix} 1 & 2 \\ 0 & 2 \end{pmatrix} are similar by finding an
invertible matrix P such that PAP^{-1} = B.
2. Consider the 2-dimensional complex vector space V of functions spanned by sinx and
cosx. For a fixed real number α, define a linear operator T ≡ Tα on V by putting
T (f(x)) = f(x + α). Find the matrices [T ]B and [T ]E of T relative to the bases
B = {cosx, sinx} and E = {cosx + i sinx, cosx − i sinx}. Find an invertible matrix
P implementing the similarity between [T]_B and [T]_E: [T]_B = P[T]_E P^{-1}.
3. Show that, if A = [a_{ij}] is an n × n matrix, then

tr(A^⊤A) = \sum_{j,k=1}^{n} a_{jk}^2.
4. Show that the n × n identity matrix I cannot be written in the form AB − BA for
some n× n matrices A and B. (Hint: Use a basic property of tr(·).)
5. Let A and B be n× n matrices. (a) Show that, if A is invertible, then AB is similar
to BA. (b) Show that AB and BA may not be similar in general.
6. Criticize the following “proof” of the statement “if a (square) matrix A is similar to
(1/2)A, then A = O”. “Proof”: For simplicity, we write B ∼ C for “B is similar to C”.
Notice that, if B ∼ C, then (1/2)B ∼ (1/2)C. So, from A ∼ (1/2)A we obtain
(1/2)A ∼ (1/4)A. In the same way, we get (1/4)A ∼ (1/8)A, etc. Hence A ∼ (1/2^n)A for
every positive integer n. Letting n → ∞, we get A ∼ O, from which it follows that A = O.
7. Let A and B be n × n matrices with B invertible. Simplify the expression

\sum_{k=1}^{n} A^k(A - B^{-1})B^k.
8. Use the summation sign to give a careful inductive proof of the binomial formula

(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k}b^k.
§3. Basic Spectral Theory
3.1. Spectral Theory is considered as “the heart of the matter” in linear algebra. Here
we only present some basic aspects of this theory, which are adequate for most applications.
The full theory, not described here, includes the Jordan canonical form (see Appendix C),
which is substantially more difficult.
Let T be a linear operator on a finite dimensional vector space V over the complex
field C with dim V = n. As we know, L(V), the space of all linear operators on V, is
also finite dimensional, with dim L(V) = n². Thus, letting N be any
integer with N ≥ n², the N + 1 operators I, T, T², . . . , T^N must be linearly dependent.
Hence there exist complex numbers a0, a1, . . . , aN , not all zeroes, such that
a0I + a1T + · · · + aNTN = O.
Let p(x) = a0 + a1x + · · · + aNxN . Then p(x) is a nonzero polynomial such that
p(T ) = O. We have proved that, given a linear operator on a finite dimensional space,
there is a nonzero polynomial p(x) such that p(T ) = O.
Among all nonzero polynomials p(x) satisfying p(T ) = O, we pick the one with
the smallest degree, say m, with the leading coefficient one, and denote it by pT (x).
Now, suppose that q(x) is any nonzero polynomial such that q(T ) = O. Divide q(x) by
pT (x) to get q(x) = Q(x)pT (x)+r(x), where r(x) is the remainder, which is a polynomial
of degree less than m, the degree of pT (x). Now q(T ) = Q(T )pT (T ) + r(T ). From
q(T ) = O and pT (T ) = O we get r(T ) = O. Since the degree of r(x) is less than that of
pT (x), it is necessarily the zero polynomial; (otherwise it would be a nonzero polynomial
of lower degree satisfying r(T ) = O, contradicting our choice of pT (x)). Thus we have
q(x) = Q(x)pT (x). In the future we will call pT (x) the minimal polynomial of T . We
have proved that any polynomial q(x) satisfying q(T ) = O is a multiple of the minimal
polynomial pT (x) of T .
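As a computational illustration of the minimal polynomial (a Python/SymPy sketch, not part of the original text; the sample matrix is hypothetical, and the search assumes the characteristic polynomial factors with exact roots), we can try exponent patterns bounded by the multiplicities in the characteristic polynomial and keep the annihilating product of least degree:

    import itertools
    import sympy as sp

    x = sp.symbols('x')
    A = sp.Matrix([[2, 0, 0],
                   [0, 2, 0],
                   [0, 0, 3]])   # sample matrix; characteristic polynomial (x - 2)^2 (x - 3)
    n = A.rows

    root_mults = sp.roots(A.charpoly(x).as_expr())   # {2: 2, 3: 1}

    best = None
    for exps in itertools.product(*[range(1, m + 1) for m in root_mults.values()]):
        M = sp.eye(n)
        for lam, k in zip(root_mults.keys(), exps):
            M = M * (A - lam * sp.eye(n)) ** k       # evaluate the candidate polynomial at A
        if M == sp.zeros(n, n) and (best is None or sum(exps) < sum(best)):
            best = exps

    p_T = sp.Integer(1)
    for lam, k in zip(root_mults.keys(), best):
        p_T *= (x - lam) ** k
    print(p_T)   # (x - 2)*(x - 3): strictly smaller degree than the characteristic polynomial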
Let λ1, λ2, . . . , λr be all roots of pT (x) with multiplicities m1, m2, . . . , mr respec-
tively. Thus p_T(x) = (x - \lambda_1)^{m_1}(x - \lambda_2)^{m_2} \cdots (x - \lambda_r)^{m_r} and

(T - \lambda_1 I)^{m_1}(T - \lambda_2 I)^{m_2} \cdots (T - \lambda_r I)^{m_r} = p_T(T) = O.
Let q_1(x) = (x - \lambda_1)^{m_1 - 1}(x - \lambda_2)^{m_2} \cdots (x - \lambda_r)^{m_r}. Since the degree of q_1(x) is less than
the degree of p_T(x), we must have q_1(T) ≠ O. Hence there exists a vector u in V such
that v ≡ q_1(T)u ≠ 0. Since (x - \lambda_1)q_1(x) = p_T(x), we have

(T - \lambda_1 I)v = (T - \lambda_1 I)q_1(T)u = p_T(T)u = 0.
This shows that λ_1 is an eigenvalue of T. In the same way, we can show that λ_k for any k ≤ r
is an eigenvalue of T. We have proved that the roots of the minimal polynomial of a linear
operator on a finite dimensional complex vector space are eigenvalues of that operator. In particular,
we have proved that eigenvalues do exist for such an operator. (♠ Remark: If we work in
a field other than C, say R, eigenvalues may not exist.♠)
Next, we use an idea in the proof of Proposition 3.4.1 in Chapter I to investigate
the so–called spectral decomposition of T . Let fk(x) be the polynomial obtained from
p_T(x) = (x - \lambda_1)^{m_1}(x - \lambda_2)^{m_2} \cdots (x - \lambda_r)^{m_r} by deleting the factor (x - \lambda_k)^{m_k}. Thus we
have (x - \lambda_k)^{m_k} f_k(x) = p_T(x) for all k. Clearly, the polynomials f_1(x), f_2(x), . . . , f_r(x)
do not have any common root. So they are coprime polynomials. Hence there exist
polynomials g_1(x), g_2(x), . . . , g_r(x) such that g_1(x)f_1(x) + g_2(x)f_2(x) + \cdots + g_r(x)f_r(x) = 1.
P_2 = -A(A - 2I) and P_3 = \frac{1}{2}A(A - I). Actual computation shows

P_1 = \begin{pmatrix} 1 & -2 & -1/2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad P_2 = \begin{pmatrix} 0 & 2 & -2 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{pmatrix}, \quad P_3 = \begin{pmatrix} 0 & 0 & 5/2 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}.
As you can check, PjPk = δjkPj , P1 + P2 + P3 = I, AP1 = O, AP2 = P2, AP3 = 2P3.
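These relations are easy to confirm numerically (a Python/NumPy sketch, not part of the original text; since the definition of A falls outside this excerpt, A is recovered here from the spectral decomposition as A = 0·P1 + 1·P2 + 2·P3):

    import numpy as np

    P1 = np.array([[1, -2, -0.5], [0, 0, 0], [0, 0, 0]])
    P2 = np.array([[0, 2, -2], [0, 1, -1], [0, 0, 0]])
    P3 = np.array([[0, 0, 2.5], [0, 0, 1], [0, 0, 1]])
    A = 0*P1 + 1*P2 + 2*P3          # spectral decomposition with eigenvalues 0, 1, 2

    I = np.eye(3)
    print(np.allclose(P1 + P2 + P3, I))                          # True
    print(np.allclose(P1 @ P2, 0*I), np.allclose(P2 @ P2, P2))   # True True (PjPk = delta_jk Pj)
    print(np.allclose(A @ P1, 0*P1),
          np.allclose(A @ P2, 1*P2),
          np.allclose(A @ P3, 2*P3))                             # True True True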
3.2. We go back to our discussion of a linear operator T on V with minimal polynomial
p_T(x) = (x - \lambda_1)^{m_1}(x - \lambda_2)^{m_2} \cdots (x - \lambda_r)^{m_r}. In a lucky situation, all of the exponents m_k
are equal to 1; in other words, the minimal polynomial is of the form

p_T(x) = (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_r),

where λ_k (1 ≤ k ≤ r) are the distinct roots of p_T(x) and thus p_T(x) has simple roots (that is,
each root does not repeat itself). Then (3.1.5) says that (T - \lambda_k I)P_k = O for all k. This
identity tells us that the range P_k(V) of P_k is contained in the eigenspace of T corresponding
to the eigenvalue λ_k. From (3.1.2) we see that each vector v can be expressed as a sum

v = \sum_{k=1}^{r} P_k v,
where P_k v is either zero or an eigenvector of T. This tells us that T is diagonalizable, that
is, it has a basis consisting of eigenvectors; (a more detailed argument is given in Appendix
B). The converse is also true: if T is diagonalizable, then its minimal polynomial p_T(x) has
simple roots. Indeed, if b_1, b_2, . . . , b_n is a basis of eigenvectors and λ_1, λ_2, . . . , λ_r
are the distinct eigenvalues of T, then each b_j is annihilated by T - \lambda_k I for some k and
hence (T - \lambda_1 I)(T - \lambda_2 I) \cdots (T - \lambda_r I)b_j = 0. Since the b_j (1 ≤ j ≤ n) span V,
(T - \lambda_1 I)(T - \lambda_2 I) \cdots (T - \lambda_r I)v = 0
for all v. Thus p(T ) = O, where p(x) = (x− λ1)(x− λ2) · · · (x− λr) is a polynomial with
simple roots. We have proved:
Theorem 3.2.1. A linear operator T defined on a finite dimensional complex
vector space is diagonalizable if and only if its minimal polynomial is of the form p_T(x) =
(x−λ1)(x−λ2) · · · (x−λr), where λ1, λ2, . . . , λr is the set of distinct eigenvalues of T .
According to this theorem, to see if an operator T is diagonalizable, we can take the
following two steps: first, find all distinct eigenvalues λ1, λ2, . . . , λr of T ; second, form
the polynomial p(x) = (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_r) and check whether p(T) = O holds. If
p(T) = O, the answer is yes; if p(T) ≠ O, the answer is no.
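This two-step test translates directly into code. Here is a floating-point sketch (Python/NumPy, not part of the original text; a numerical tolerance stands in for the exact check p(T) = O, so it is a heuristic for matrices with well-separated eigenvalues):

    import numpy as np

    def is_diagonalizable(A, tol=1e-8):
        # Step 1: find the (numerically) distinct eigenvalues of A.
        distinct = []
        for lam in np.linalg.eigvals(A):
            if all(abs(lam - mu) > 1e-6 for mu in distinct):
                distinct.append(lam)
        # Step 2: form p(A) = product of (A - lambda*I) and test whether it vanishes.
        n = A.shape[0]
        M = np.eye(n, dtype=complex)
        for lam in distinct:
            M = M @ (A - lam * np.eye(n))
        return np.max(np.abs(M)) < tol

    print(is_diagonalizable(np.array([[1.0, 0.0], [0.0, 2.0]])))  # True
    print(is_diagonalizable(np.array([[1.0, 1.0], [0.0, 1.0]])))  # False (a Jordan block)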
Example 3.2.1. Find the condition on a, b, c such that the matrix
A = \begin{pmatrix} 1 & 0 & a \\ b & 2 & c \\ 0 & 0 & 2 \end{pmatrix}

is diagonalizable.
Solution. We find that the characteristic polynomial of A is (x - 1)(x - 2)^2 and hence
the distinct eigenvalues are 1, 2. Form p(x) = (x - 1)(x - 2). Then
p(A) = \begin{pmatrix} 0 & 0 & a \\ b & 1 & c \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} -1 & 0 & a \\ b & 0 & c \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & ab + c \\ 0 & 0 & 0 \end{pmatrix},
which is the zero matrix if and only if ab+ c = 0. Thus A is diagonalizable if and only if
ab+ c = 0.
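A symbolic check of this computation (a Python/SymPy sketch, not part of the original text):

    import sympy as sp

    a, b, c = sp.symbols('a b c')
    A = sp.Matrix([[1, 0, a],
                   [b, 2, c],
                   [0, 0, 2]])

    pA = (A - sp.eye(3)) * (A - 2*sp.eye(3))   # p(A) for p(x) = (x - 1)(x - 2)
    print(pA)   # Matrix([[0, 0, 0], [0, 0, a*b + c], [0, 0, 0]])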
If we know that T satisfies p(T ) = O for a polynomial p(x) with simple roots, then T is
diagonalizable. This is because the minimal polynomial pT (x) is a factor of p(x) and hence
it also has simple roots.
Example 3.2.2. An operator P is called a projection if P² = P. A projection P
is diagonalizable, since P² = P tells us that p(P) = O, where p(x) = x² − x = x(x − 1),
which is a polynomial with simple roots. Next, suppose that T is an operator satisfying
T^m = I for some positive integer m; then T is diagonalizable because p(T) = O is satisfied
with p(x) = x^m − 1 and p(x) has simple roots: indeed,

x^m - 1 = (x - 1)(x - \omega)(x - \omega^2) \cdots (x - \omega^{m-1})

with m distinct roots 1, ω, ω², . . . , ω^{m−1}, where ω = e^{2\pi i/m}.
3.3. We continue with the general discussion of the spectral decomposition of T and
keep the notation used in subsection 3.1. LetMk be the range of the spectral projection Pk:
Mk = Pk(V ). We call Mk the spectral subspace of T corresponding to the eigenvalue
λk. The identity TPk = PkT (see (3.1.5)) tells us that Mk is invariant for T . Indeed a
vector in Mk := Pk(V ) has the form Pkv and T (Pkv) = Pk(Tv), showing that T (Pkv) is
also in Mk. Thus T sends vectors in Mk to vectors in Mk. So we can define an operator
T_k on M_k by putting T_k v = Tv for all v in M_k. (The operator T_k defined in this
way is called the restriction of T to M_k.) Let Q_k = T_k − λ_k I_k, where I_k is the identity
operator on M_k. Then

T_k = \lambda_k I_k + Q_k \quad \text{and} \quad Q_k^{m_k} = O, \qquad (3.3.1)

according to (3.1.5). We call a linear operator Q a nilpotent operator if some power
of it vanishes, that is, Q^m = O for some positive integer m. Thus Q_k here is a nilpotent
operator on M_k for each k. So, by means of the spectral decomposition, the problem
of the structure of a general operator boils down to that of the structure of
a general nilpotent operator.
Example 3.3.1. Consider the operator D on the space P_n (of polynomials of degree
at most n) defined by D(p(x)) = p′(x), the derivative of p(x). We have D^{n+1} = O, which
tells us that D is nilpotent. This can be seen from the fact that D reduces the degree
of a nonconstant polynomial by one and sends constant polynomials to zero. Take any
constant a and let

\tau_k(x) = \frac{(x - a)^k}{k!}, \qquad k = 0, 1, 2, . . . , n.
(By convention, 0! = 1 and hence τ0(x) = 1. The Greek letter τ is pronounced as “tau”.
We choose this letter because of its association with “Taylor”.) Notice that the degree of
τk(x) is k. From this fact it is not hard to deduce that τ0(x), τ1(x), τ2(x), . . . , τn(x) form
a basis of Pn, say T . Notice that, for k ≥ 1,
D(\tau_k(x)) = \frac{d}{dx}\frac{(x - a)^k}{k!} = \frac{k(x - a)^{k-1}}{k!} = \frac{(x - a)^{k-1}}{(k - 1)!} = \tau_{k-1}(x).
This shows that the matrix [D]_T of D relative to T is given by

[D]_T = \begin{pmatrix} 0 & 1 & & & \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ & & & & 0 \end{pmatrix},

the (n + 1) × (n + 1) matrix with 1's along the superdiagonal and 0's elsewhere, since
D(\tau_0(x)) = 0 and D(\tau_k(x)) = \tau_{k-1}(x) for k ≥ 1.
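A small numerical illustration of this nilpotency (a Python/NumPy sketch, not part of the original text), using the superdiagonal matrix above with n = 3:

    import numpy as np

    n = 3
    D = np.zeros((n + 1, n + 1))
    for k in range(1, n + 1):
        D[k - 1, k] = 1.0            # column k records D(tau_k) = tau_{k-1}

    # Each power pushes the 1's one diagonal further out; the (n+1)-st power vanishes.
    print(np.linalg.matrix_power(D, n + 1))   # the zero matrix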