Lie groups, Lie algebras, and their representations
September 16, 2016
These are the lecture notes for the 5M reading course "Lie groups, Lie algebras, and their representations" at the University of Glasgow, autumn 2015.
Contents

1 Introduction
2 Manifolds - a refresher
3 Lie groups and Lie algebras
4 The exponential map
5 The classical Lie groups and their Lie algebras
6 Representation theory
7 The structure of Lie algebras
8 Complete reducibility
9 Cartan subalgebras and Dynkin diagrams
10 The classification of simple, complex Lie algebras
11 Weyl's character formula
12 Appendix: Quotient vector spaces

1 Introduction
Lie groups and Lie algebras, together called Lie theory, originated in the study of natural symme-
tries of solutions of differential equations. However, unlike say the finite collection of symmetries of
the hexagon, these symmetries occurred in continuous families, just as the rotational symmetries
of the plane form a continuous family isomorphic to the unit circle.
The theory as we know it today began with the groundbreaking work of the Norwegian
mathematician Sophus Lie, who introduced the notion of continuous transformation groups and
showed the crucial role that Lie algebras play in their classification and representation theory.
Lie's ideas played a central role in Felix Klein's grand "Erlangen program" to classify all possible
geometries using group theory. Today Lie theory plays an important role in almost every branch of
pure and applied mathematics, is used to describe much of modern physics, in particular classical
and quantum mechanics, and is an active area of research.
You might be familiar with the idea that abstract group theory really began with Galois’
work on algebraic solutions of polynomial equations; in particular, the generic quintic. But, in a
sense, the idea of groups of transformations had been around for a long time already. As mentioned
above, it was already present in the study of solutions of differential equations coming from physics
(for instance in Newton’s work). The key point is that these spaces of solutions are often stable
under the action of a large group of symmetries, or transformations. This means that applying a
given transformation to one solution of the differential equation gives us another solution. Hence
one can quickly and easily generate new solutions from old ones, "buy one, get one free".
As a toy example, consider a "black hole" centred at (0, 0) in R². Then, any particle in
R² ∖ {(0, 0)}, whose position at time t is given by (x(t), y(t)), will be attracted to the origin, the
strength of the attraction increasing as it nears (0, 0). This attraction can be encoded by the
system of differential equations

d²x/dt² = −x/(x² + y²)^{3/2},   d²y/dt² = −y/(x² + y²)^{3/2}.   (1)

In polar coordinates (r(t), θ(t)), this becomes

d²r/dt² − r (dθ/dt)² = −1/r²,   r d²θ/dt² + 2 (dr/dt)(dθ/dt) = 0.
The circle S¹ = {ϑ | 0 ≤ ϑ < 2π} (this is a group, with ϑ1 ∗ ϑ2 = (ϑ1 + ϑ2) mod 2π) acts on R² by rotations:

ϑ · (x, y) = [cos ϑ  −sin ϑ; sin ϑ  cos ϑ] (x; y) = (x cos ϑ − y sin ϑ, x sin ϑ + y cos ϑ).   (2)

Then it is clear from (2) that if (x(t), y(t)) is a solution to (1), then so too is

ϑ · (x(t), y(t)) = (x(t) cos ϑ − y(t) sin ϑ, x(t) sin ϑ + y(t) cos ϑ).
Thus, the group S1 acts on the space of solutions to this system of differential equations.
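This symmetry can be checked numerically. The following sketch (not from the notes; it assumes numpy, and the helper names force and rotation are our own) verifies that the right-hand side of (1) is equivariant under the rotations (2), which is exactly why S¹ sends solutions to solutions.

```python
import numpy as np

def force(v):
    """Right-hand side of (1): the acceleration of a particle at v = (x, y)."""
    x, y = v
    r3 = (x**2 + y**2) ** 1.5
    return np.array([-x / r3, -y / r3])

def rotation(theta):
    """The matrix through which theta in S^1 acts on R^2, as in (2)."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

rng = np.random.default_rng(0)
for _ in range(100):
    v = rng.normal(size=2)
    R = rotation(rng.uniform(0, 2 * np.pi))
    # F(R v) = R F(v): a rotation matrix passes through d^2/dt^2 unchanged,
    # so rotating a solution of (1) gives another solution.
    assert np.allclose(force(R @ v), R @ force(v))
```

Since rotation matrices are constant in t, they commute with the second derivative; equivariance of the force field is the only thing that needs checking.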
The content of these lecture notes is based to a large extent on the material in the books [?]
and [?]. Other sources that treat the material in these notes are [?], [?], [?], [?] and [?].
2 Manifolds - a refresher
In this course we will consider smooth real manifolds and complex analytic manifolds.
2.1 Topological terminology
Every manifold is in particular a topological space, so we will occasionally need some basic
topological terms. We begin by recalling that a topological space is a pair (X, T ), where X is a
set and T is a collection of subsets of X, the open subsets of X. The collection T of open subsets
must contain X and the empty set ∅, and be closed under arbitrary unions and finite intersections.
Recall that a topological space X is disconnected if there exist two non-empty open subsets
U, V ⊂ X such that U ∩ V = ∅ and U ∪ V = X; otherwise X is connected. It is path connected if, for
any two points x, y ∈ X, there is a path from x to y (every path connected space is connected,
but the converse is not true). We say that X is simply connected if it is path connected and the
fundamental group π1(X) of X is trivial, i.e. every closed loop in X is homotopic to the trivial
loop. The space X is compact if every open covering admits a finite subcover. The space X is said
to be Hausdorff if for each pair of distinct points x, y ∈ X there exist open sets U ∋ x and V ∋ y with U ∩ V = ∅.
For each ℓ ∈ R+ and x ∈ Rⁿ, we denote by B̄ℓ(x), resp. Bℓ(x), the set of all points y ∈ Rⁿ
such that ||x − y|| ≤ ℓ, resp. ||x − y|| < ℓ. We will always consider Rⁿ as a topological space,
equipped with the Euclidean topology, i.e. a base for this topology is given by the open balls
Bℓ(x). Recall that the Heine-Borel Theorem states:

Theorem 2.1. Let M be a subset of Rⁿ. Then, M is compact if and only if it is closed and
contained inside a closed ball B̄ℓ(0) for some ℓ > 0.
Remark 2.2. If M is some compact subspace of Rⁿ and (Xi)i∈N a Cauchy sequence in M, then
the limit of this sequence exists in M. This fact is a useful way of showing that certain subspaces
of Rⁿ are not compact.
2.2 Manifolds
Recall that a real n-dimensional manifold is a Hausdorff topological space M with an atlas
A(M) = {φi : Ui → Vi | i ∈ I}, where {Ui | i ∈ I} is an open cover of M, each chart φi : Ui → Vi is a
homeomorphism onto Vi, an open subset of Rⁿ, and the composite maps

φi ∘ φj⁻¹ : Vi,j → Vj,i

are smooth, where Vi,j = φj(Ui ∩ Uj) ⊂ Rⁿ and Vj,i = φi(Ui ∩ Uj) ⊂ Rⁿ.
Example 2.3. The circle S¹ = {(x1, x2) | x1² + x2² = 1} ⊂ R² is a manifold. We can use stereographic
projection to construct charts on S¹. Let N = (0, 1) and S = (0, −1) be the north and south poles.

[Figure: stereographic projection; the line through N and P = (x1, x2) meets the x1-axis at Q = (z, 0).]

A line through N and P meets the x1-axis in a unique point. This defines a map S¹ ∖ {N} → R.
To get an explicit formula for this, the line through N and P is described by {t(x1, x2) + (1 − t)(0, 1) | t ∈ R}. This implies that z = x1/(1 − x2). Therefore we define U1 = S¹ ∖ {N}, V1 = R and

φ1 : U1 → V1, φ1(x1, x2) = x1/(1 − x2).
Similarly, if we perform stereographic projection from the south pole, then we get a chart

φ2 : U2 → V2, φ2(x1, x2) = x1/(x2 + 1),

where U2 = S¹ ∖ {S} and V2 = R.
For this to define a manifold, we need to check that the transition map φ2 ∘ φ1⁻¹ is smooth. Since
U1 ∩ U2 = S¹ ∖ {N, S}, we have φ1(U1 ∩ U2) = φ2(U1 ∩ U2) = R ∖ {0}. Hence φ2 ∘ φ1⁻¹ : R ∖ {0} → R ∖ {0}.
A direct calculation shows that φ2 ∘ φ1⁻¹(z) = 1/z. This is clearly smooth.
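The transition map can also be checked numerically. This is a quick sanity check of our own (assuming numpy; phi1 and phi2 are hypothetical helper names), confirming that φ2(φ1⁻¹(z)) = 1/z on points of S¹ away from the poles.

```python
import numpy as np

def phi1(p):  # stereographic projection from the north pole N = (0, 1)
    x1, x2 = p
    return x1 / (1 - x2)

def phi2(p):  # stereographic projection from the south pole S = (0, -1)
    x1, x2 = p
    return x1 / (1 + x2)

# sample points of S^1 \ {N, S} and compare phi2 with 1/phi1
for t in np.linspace(0.1, 2 * np.pi - 0.1, 50):
    p = np.array([np.sin(t), np.cos(t)])   # a point on the unit circle
    assert np.isclose(phi2(p), 1.0 / phi1(p))
```

The algebraic reason the check passes is that φ1(p)·φ2(p) = x1²/(1 − x2²) = 1 on the circle.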
A continuous function f : M → R is said to be smooth if f ∘ φi⁻¹ : Vi → R is a smooth,
i.e. infinitely differentiable, function for all charts φi. The space of all smooth functions on M is
denoted C∞(M).
Example 2.4. The real projective space RPⁿ of all lines through 0 in Rⁿ⁺¹ is a manifold. The
points of RPⁿ are written [x0 : x1 : · · · : xn], where (x0, x1, . . . , xn) ∈ Rⁿ⁺¹ ∖ {0} and
[x0 : x1 : · · · : xn] represents the line through 0 and (x0, x1, . . . , xn). Thus, for each λ ∈ R×,

[x0 : x1 : · · · : xn] = [λx0 : λx1 : · · · : λxn].

The n + 1 open sets Ui = {[x0 : x1 : · · · : xn] ∈ RPⁿ | xi ≠ 0} cover RPⁿ. The maps φi : Ui ≅ Rⁿ,

[x0 : x1 : · · · : xn] ↦ (x0/xi, . . . , xi−1/xi, xi+1/xi, . . . , xn/xi),

define an atlas on RPⁿ. Thus, it is a manifold.
Exercise 2.5. Check that the maps φi are well-defined, i.e. only depend on the line through
(x0, x1, . . . , xn). Prove that RPⁿ is a manifold by explicitly describing the maps φj ∘ φi⁻¹ : φi(Ui ∩
Uj) ⊂ Rⁿ → φj(Ui ∩ Uj) ⊂ Rⁿ and checking that they are indeed smooth maps.
The following theorem is extremely useful for producing explicit examples of manifolds.
Theorem 2.6. Let m < n and f : Rⁿ → Rᵐ, f = (f1(x1, . . . , xn), . . . , fm(x1, . . . , xn)), a smooth
map. Assume that u is in the image of f. Then f⁻¹(u) is a smooth submanifold of Rⁿ, of
dimension n − m, if and only if the differential

dvf = (∂fi/∂xj)_{i=1,...,m; j=1,...,n} |_{x=v} : Rⁿ → Rᵐ

is a surjective linear map for all v ∈ f⁻¹(u).
The proof of the above theorem is based on the inverse function theorem, which implies that
for each v in the closed set f⁻¹(u), there is some ℓ > 0 such that Bℓ(v) ∩ f⁻¹(u) ≃ Bℓ′(v) for some
(n − m)-dimensional ball Bℓ′(v). This implies that f⁻¹(u) is a submanifold of Rⁿ of dimension
(n − m). A proof of this theorem can be found in [10, Corollary 1.29].
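The surjectivity criterion of Theorem 2.6 is easy to probe numerically. The sketch below (our own illustration, assuming numpy; numerical_gradient is a hypothetical helper) takes f(x, y) = x² + y² − 1, so f⁻¹(0) = S¹; here dvf is the row vector (2x, 2y), and a linear map R² → R is surjective precisely when it is nonzero.

```python
import numpy as np

def numerical_gradient(f, v, h=1e-6):
    """Central finite-difference approximation of the differential d_v f."""
    v = np.asarray(v, dtype=float)
    grad = np.zeros_like(v)
    for j in range(len(v)):
        e = np.zeros_like(v); e[j] = h
        grad[j] = (f(v + e) - f(v - e)) / (2 * h)
    return grad

f = lambda v: v[0] ** 2 + v[1] ** 2 - 1

for t in np.linspace(0, 2 * np.pi, 100):
    v = np.array([np.cos(t), np.sin(t)])      # a point of f^{-1}(0)
    g = numerical_gradient(f, v)
    assert np.allclose(g, 2 * v, atol=1e-4)   # d_v f = (2x, 2y)
    assert np.linalg.norm(g) > 1e-3           # nonzero, hence surjective onto R
```

Since the differential never vanishes on f⁻¹(0), the theorem confirms that S¹ is a 1-dimensional submanifold of R².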
Exercise 2.7. By considering the derivatives of f = x1² + · · · + xn² − 1, show that the (n − 1)-sphere
Sⁿ⁻¹ ⊂ Rⁿ is a manifold.
Exercise 2.8. Consider the smooth functions f1 = x1² + x2³ − x1x2x3 − 1, f2 = x1²x2x3 and f3 =
(2x2 − x3³)e^{x1} from R³ to R. For which of these functions is the differential dvfi always surjective
for all v ∈ fi⁻¹(0)? For those that are not, the closed subset fi⁻¹(0) is not a submanifold of R³.
Remark 2.9. Since every point in a manifold admits an open neighborhood homeomorphic to an
open subset of Rⁿ, for some n, manifolds are "especially well-behaved" topological spaces. We mention,
in particular, that a connected manifold is path connected and always admits a universal cover.
2.3 Tangent spaces

The tangent space at a point of a manifold is extremely important, playing a key role when trying
to compare manifolds, study functions on the manifold, etc. Intuitively, I hope it is clear what
the tangent space at a point should be - it's simply all vectors tangent to the manifold at that
point. Unfortunately, in making this mathematically precise, some of the intuition is lost in the
technicalities. Remarkably, there are several equivalent definitions of the tangent space TmM to
M at the point m ∈ M. In this course we will see three of these definitions, and show that they
are equivalent. The first definition, which we will take as "the definition", is in terms of point
derivations.
Definition 2.10. Fix a point m ∈ M . A point derivation at m is a map ν : C∞(M) → R such
that the following two properties are satisfied
1. ν is linear i.e. ν(αf + βg) = αν(f) + βν(g) for α, β ∈ R and f, g ∈ C∞(M).
2. It satisfies the derivation rule:
ν(fg) = ν(f)g(m) + f(m)ν(g), ∀ f, g ∈ C∞(M).
One can easily check that if ν, µ are point derivations at m and α ∈ R, then ν + µ and αν
are point derivations at m. Thus, the set of point derivations forms a vector space. The tangent
space to M at m is defined to be the vector space TmM of all point derivations at m. To get some
intuition for this notion, let's first consider the case where the manifold M is just Rⁿ. For
a = (a1, . . . , an) ∈ Rⁿ and u ∈ Rⁿ, we define the point derivation va at u by

va(f) = a1 (∂f/∂x1)(u) + · · · + an (∂f/∂xn)(u).

I claim that a ↦ va defines an isomorphism Rⁿ ≅ TuRⁿ. One can easily check that va is a point
derivation. Also, va(xi) = ai, which implies that the map is injective. Therefore the only thing to
check is that every point derivation at u can be written as va for some a. Let ν be an arbitrary
point derivation and set ai = ν(xi) ∈ R. Then if a = (a1, . . . , an), we need to show that ν − va = 0.
It is certainly zero on each of the coordinate functions xi. Locally, every smooth function has a
Taylor expansion

f(x) = ∑_{k1,...,kn ≥ 0} (1/(k1! · · · kn!)) (∂^{k1+···+kn} f / ∂x1^{k1} · · · ∂xn^{kn})(u) (x1 − u1)^{k1} · · · (xn − un)^{kn}.

Using the derivation rule for point derivations it is easy to see that ν(f) − va(f) = 0 too. Thus,
we can think of ∂/∂x1|u, . . . , ∂/∂xn|u as being a basis of TuRⁿ for any u ∈ Rⁿ. In particular, it is clear that
dim TuRⁿ = n for all u ∈ Rⁿ.
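The two defining properties of a point derivation can be verified numerically for va. Below is a minimal sketch of our own (assuming numpy; v_a is a hypothetical helper realising va as a directional derivative on R²), checking the Leibniz rule va(fg) = va(f)g(u) + f(u)va(g).

```python
import numpy as np

def v_a(f, a, u, h=1e-6):
    """Finite-difference approximation of the point derivation v_a at u:
    the derivative of f at u in the direction a."""
    a, u = np.asarray(a, float), np.asarray(u, float)
    return (f(u + h * a) - f(u - h * a)) / (2 * h)

f = lambda x: np.sin(x[0]) * x[1]
g = lambda x: x[0] ** 2 + np.exp(x[1])
fg = lambda x: f(x) * g(x)

a, u = np.array([1.0, -2.0]), np.array([0.3, 0.7])
lhs = v_a(fg, a, u)
rhs = v_a(f, a, u) * g(u) + f(u) * v_a(g, a, u)
assert np.isclose(lhs, rhs, atol=1e-5)   # the derivation rule of Definition 2.10
```

Linearity in f can be checked the same way; both properties hold exactly because v_a is a directional derivative.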
Remark 2.11. The tangent space TmM at m is a local property of M, i.e. it only sees what
happens locally on M near m. This statement can be made precise as follows. Let U ⊂ M be
an open neighborhood of m. Then, the fact that any smooth function f ∈ C∞(M) restricts to a
smooth function on U, i.e. f|U ∈ C∞(U), means that we can define a canonical map TmU → TmM,
ν ↦ ν̃, by

ν̃(f) := ν(f|U).

This map is an isomorphism. In order to prove this, one needs to use the existence of partitions
of unity, which in this case imply that every function g ∈ C∞(U) can be extended to a smooth
function on M, i.e. for each g ∈ C∞(U) there exists f ∈ C∞(M) such that f|U = g.
Since a manifold locally looks like Rn, and the tangent space at m only sees what happens
around m, it is not surprising that:
Proposition 2.12. If M is an n-dimensional manifold, then dimTmM = n for all m ∈M .
Proof. To prove the proposition, we will show that a chart ϕ : U → Rⁿ around m defines an
isomorphism of vector spaces ϕ∗ : TmM ≅ Tϕ(m)Rⁿ. The definition of ϕ∗ is very simple. Given
a point derivation ν ∈ TmM and a function f ∈ C∞(Rⁿ), we define

ϕ∗(ν)(f) := ν(f ∘ ϕ).

We leave it to the reader to check that ϕ∗(ν) ∈ Tϕ(m)Rⁿ is a point derivation. To show that it is
an isomorphism, it suffices to note that we also have a map ϕ⁻¹ : Im ϕ → U and hence we can
define a map (ϕ⁻¹)∗ : Tϕ(m)Rⁿ → TmM. Unpacking the definitions, one sees that (ϕ⁻¹)∗ ∘ ϕ∗ = id
and ϕ∗ ∘ (ϕ⁻¹)∗ = id, as required. For instance,

(ϕ∗ ∘ (ϕ⁻¹)∗)(ν)(f) = ϕ∗((ϕ⁻¹)∗(ν))(f)
= (ϕ⁻¹)∗(ν)(f ∘ ϕ)
= ν(f ∘ ϕ ∘ ϕ⁻¹) = ν(f).
The second definition of tangent space uses the notion of embedded curves. A curve on M is
a smooth morphism γ : (−ε, ε) → M, where ε ∈ R>0 ∪ {∞}. We say that γ is a curve through
m ∈ M if γ(0) = m. Given a curve γ through m, we can construct a point derivation γ̇ at m by
the simple rule

γ̇(f) := d/dt (f ∘ γ)|_{t=0}. (3)

The key point here is that f ∘ γ is just a function (−ε, ε) → R, which we can easily differentiate.
Let's consider the case where M = Rⁿ. If ρ : (−ε, ε) → Rⁿ is a curve in Rⁿ then we can differentiate
it and get a vector ρ′(0) ∈ Tρ(0)Rⁿ = Rⁿ. Concretely, if ρ(t) = (ρ1(t), . . . , ρn(t)), then ρ′(0) is the
point derivation ρ1′(0) ∂/∂x1 + · · · + ρn′(0) ∂/∂xn at ρ(0). For instance, consider ρ : R → R³,
ρ(t) = (t², 3t, 2 sin t); then ρ′(0) = 3 ∂/∂x2 + 2 ∂/∂x3.

Figure 1: A curve through m and its point derivation.
Now the question becomes: when do two curves through m define the same point derivation?
We see from the definition that if we declare γ1 ∼ γ2 if and only if

d/dt (f ∘ γ1)|_{t=0} = d/dt (f ∘ γ2)|_{t=0}, ∀ f ∈ C∞(M),

then γ1 ∼ γ2 ⇔ γ̇1 = γ̇2 ∈ TmM. Denote by [γ] the class of curves through m that are equivalent
to γ. The claim is that, as a set at least, the tangent space TmM at m can be identified with the
equivalence classes of curves through m, under the above equivalence relation. By construction,
there is an injective map from the set of equivalence classes to TmM. So we just need to show
that it is surjective, i.e.
Lemma 2.13. For any ν ∈ TmM, there exists a curve γ through m such that γ̇ = ν.

Proof. Recall from the proof of Proposition 2.12 that, given a chart ϕ : U → Rⁿ around m, we
constructed an isomorphism ϕ∗ : TmM ≅ Tϕ(m)Rⁿ, where (ϕ∗ν)(f) = ν(f ∘ ϕ). Let's assume that
we can find a curve γ : (−ε, ε) → Rⁿ through ϕ(m) such that γ̇ = ϕ∗ν. Then let μ = ϕ⁻¹ ∘ γ. We have

μ̇(f) = d/dt (f ∘ ϕ⁻¹ ∘ γ)|_{t=0}
= γ̇(f ∘ ϕ⁻¹)
= (ϕ∗ν)(f ∘ ϕ⁻¹) = ν(f ∘ ϕ⁻¹ ∘ ϕ) = ν(f).

Thus, it suffices to assume that M = Rⁿ and m = (m1, . . . , mn) ∈ Rⁿ. In this case we have seen
that

ν = a1 ∂/∂x1|m + · · · + an ∂/∂xn|m

for some ai ∈ R. Let γ be the curve γ(t) = (a1t + m1, . . . , ant + mn). Then γ(0) = m and equation
(3) shows that γ̇ = ν.
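The final step of the proof can be tested numerically. The following sketch (our own, assuming numpy) takes the straight-line curve γ(t) = (a1t + m1, . . . , ant + mn) and checks, by finite differences, that differentiating f along γ at t = 0 agrees with ν(f) = Σ ai (∂f/∂xi)(m).

```python
import numpy as np

a = np.array([2.0, -1.0, 0.5])   # the coefficients a_i of nu
m = np.array([1.0, 0.0, 3.0])    # the base point m
gamma = lambda t: a * t + m      # the straight-line curve through m

f = lambda x: x[0] * np.exp(x[1]) + np.sin(x[2])
h = 1e-6

# gamma-dot applied to f: differentiate f along the curve at t = 0
curve_derivation = (f(gamma(h)) - f(gamma(-h))) / (2 * h)

# nu applied to f: sum of a_i times the partial derivatives of f at m
partials = np.zeros(3)
for i in range(3):
    e = np.zeros(3); e[i] = h
    partials[i] = (f(m + e) - f(m - e)) / (2 * h)
nu_of_f = np.dot(a, partials)

assert np.isclose(curve_derivation, nu_of_f, atol=1e-5)
```

Both quantities are approximations of the same directional derivative, which is exactly what the chain rule in the proof asserts.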
Putting all the tangent spaces into a single family, we get
Definition 2.14. The tangent bundle of M is the set

TM = {(m, v) | m ∈ M, v ∈ TmM}.
The tangent bundle TM is itself a manifold and comes equipped with smooth morphisms
i : M → TM , i(m) = (m, 0) and π : TM →M , π(m, v) = m.
Exercise 2.15. If M = f⁻¹(0) ⊂ Rⁿ, where f : Rⁿ → Rᵐ, then the tangent space to M at m is
the subspace Ker(dmf : Rⁿ → Rᵐ) of TmRⁿ = Rⁿ. Describe the tangent space to Sⁿ⁻¹ = f⁻¹(0)
in Rⁿ at (1, 0, . . . , 0), where f = x1² + · · · + xn² − 1.
The differential of a function f ∈ C∞(M), at a point m, is a linear map dmf : TmM → R.
In terms of curves, dmf([γ]) = (f ∘ γ)′(0). In terms of charts, if Xi ∈ Tφi(m)Rⁿ, then f ∘ φi⁻¹ : Vi → R
is a smooth function. Differentiating this function gives dφi(m)(f ∘ φi⁻¹) : Rⁿ → R and we define
(dmf)([Xi]) = dφi(m)(f ∘ φi⁻¹)(Xi). Of course, one must check that both these definitions are
actually well-defined, i.e. independent of the choice of representative of the equivalence class.
Let f : M → N be a smooth map between manifolds M and N. Then, for each m ∈ M, f
defines a linear map dmf : TmM → Tf(m)N between tangent spaces, given by (dmf)([γ]) = [f ∘ γ].
Since we get one such map for each point m ∈ M and they vary continuously over M, we actually get
a smooth map

df : TM → TN, (df)(m, [γ]) = (f(m), dmf([γ])).
The following fact, which is a consequence of the inverse function theorem, will be useful to us
later. Let f : M → N be a smooth map between manifolds M and N such that the differential
dmf is an isomorphism at every point m ∈ M . If N is simply connected and M connected then
f is an isomorphism.
2.4 Vector fields
A vector field is a continuous family of vectors in the tangent bundle, i.e. it is a rule that assigns
to each m ∈ M a vector Xm ∈ TmM such that the family {Xm}m∈M varies smoothly on M.
The notion of vector field will be crucial later in relating a Lie group to its Lie algebra.

Definition 2.16. A vector field on M is a smooth morphism X : M → TM such that π ∘ X = idM.
The space of all vector fields on M is denoted Vect(M).

The key point of defining a vector field is that one can differentiate functions along vector fields.
Let X be a vector field on M and f : M → R a smooth function. We define X(f)(m) = (f ∘ γ)′(0)
for some (any) choice of curve γ through m such that [γ] = Xm.
Lemma 2.17. The vector field X defines a map C∞(M)→ C∞(M) satisfying the product rule
X(fg) = X(f)g + fX(g), ∀ f, g ∈ C∞(M). (4)
Proof. Let f, g be smooth maps and m ∈ M. Then,

X(fg)(m) = ((fg) ∘ γ)′(0) = ((f ∘ γ)(g ∘ γ))′(0)
= (f ∘ γ)(0)(g ∘ γ)′(0) + (f ∘ γ)′(0)(g ∘ γ)(0)
= f(m)X(g)(m) + X(f)(m)g(m).

Hence X(fg) = X(f)g + fX(g).
A linear map C∞(M) → C∞(M) satisfying equation (4) is called a derivation. Thus, every
vector field defines a derivation. The converse is also true - every derivation defines a unique
vector field on M (we won't need this fact though). An equivalent definition of the action of a
vector field is that X(f) is the function on M whose value at m equals (dmf)(Xm). One should
think of vector fields, or derivations, as continuous families of point derivations.
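The product rule (4) can be seen concretely. Here is a small sketch of our own (assuming numpy; the field and the helper X_of are hypothetical), taking the rotation vector field X = x2 ∂/∂x1 − x1 ∂/∂x2 on R² and checking X(fg) = X(f)g + fX(g) via X(f)(m) = (dmf)(Xm).

```python
import numpy as np

def X_of(f, m, h=1e-6):
    """Apply the vector field X = x2 d/dx1 - x1 d/dx2 to f at the point m,
    i.e. differentiate f at m in the direction of the tangent vector X_m."""
    m = np.asarray(m, float)
    Xm = np.array([m[1], -m[0]])   # the value of X at m
    return (f(m + h * Xm) - f(m - h * Xm)) / (2 * h)

f = lambda x: x[0] ** 2 + x[1]
g = lambda x: np.cos(x[0] * x[1])

for m in [np.array([1.0, 2.0]), np.array([-0.5, 0.3])]:
    lhs = X_of(lambda x: f(x) * g(x), m)
    rhs = X_of(f, m) * g(m) + f(m) * X_of(g, m)
    assert np.isclose(lhs, rhs, atol=1e-5)   # the product rule (4)
```

At each point m this is just the Leibniz rule for the point derivation Xm, which is why the identity holds pointwise and hence as functions on M.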
Exercise 2.18. Let M, N and P be manifolds and f : M → N, g : N → P smooth maps. Show
that the linear map dm(g ∘ f) : TmM → Tg(f(m))P equals (df(m)g) ∘ (dmf). Hint: Using the
description of dmf in terms of curves, this is virtually a tautology.
Exercise 2.19. Since S² ⊂ R³, we can define the function f : S² → R by saying that it is the
restriction of 2x1 − x2² + x1x3. Recall the description of the tangent space T(1,0,0)S² given in exercise
2.15. Describe d(1,0,0)f : T(1,0,0)S² → R.

Similarly, let g : R³ → R² be the function g = (f1(x1, x2, x3), f2(x1, x2, x3)), where f1(x1, x2, x3) =
x2³x1 − x3 sin x1 and f2(x1, x2, x3) = e^{x3}x2 − cos x2. What is the linear map d(0,π,1)g : R³ → R²?
2.5 Integral curves
Let X be a vector field on a manifold M and fix m ∈ M. An integral curve (with respect to X)
γ : J → M through m is a curve such that γ(0) = m and (dxγ)(1) = Xγ(x) for all x ∈ J. Then γ
is a solution to the equation

dγ/dx (x) = Xγ(x)

for all x ∈ J. By choosing a chart containing m, the problem of finding an integral curve through
m is easily seen to be equivalent to solving a system of first order ordinary differential equations.
Therefore the fundamental theorem of ordinary differential equations says that there exists some
ε > 0 and an integral curve γ : (−ε, ε) → M. Moreover, γ is unique. One can try to make the
open set J ⊂ R as large as possible. There is a unique largest open set on which γ exists; if J is
this maximal set then γ : J → M is called the maximal integral curve for X through m.
Definition 2.20. A vector field X on M is said to be complete if, for all m ∈ M , the maximal
integral curve through m with respect to X is defined on the whole of R.
Exercise 2.21. Let X be the vector field x1 ∂/∂x1 − (2x1 + 1) ∂/∂x2 on R². Construct an integral curve
γ for X through (a, b) ∈ R². Is X complete?
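A candidate answer to this exercise can be tested numerically. The sketch below (our own, assuming numpy; it is one possible solution, not taken from the notes) proposes γ(t) = (a e^t, b − 2a(e^t − 1) − t), which satisfies x1′ = x1 and x2′ = −(2x1 + 1), is defined for all t ∈ R, and passes through (a, b) at t = 0.

```python
import numpy as np

def gamma(t, a, b):
    """Candidate integral curve through (a, b) for the field of Exercise 2.21."""
    return np.array([a * np.exp(t), b - 2 * a * (np.exp(t) - 1) - t])

def X_at(p):
    """The value of the vector field X = x1 d/dx1 - (2 x1 + 1) d/dx2 at p."""
    return np.array([p[0], -(2 * p[0] + 1)])

a, b, h = 1.5, -0.7, 1e-6
for t in np.linspace(-3, 3, 25):
    # the velocity of gamma at time t should equal X at the point gamma(t)
    velocity = (gamma(t + h, a, b) - gamma(t - h, a, b)) / (2 * h)
    assert np.allclose(velocity, X_at(gamma(t, a, b)), atol=1e-4)
assert np.allclose(gamma(0.0, a, b), [a, b])   # the curve passes through (a, b)
```

Since this curve exists for every t ∈ R and every (a, b), the check suggests that X is complete.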
2.6 Complex analytic manifolds
All the above definitions and results hold for complex analytic manifolds, where M is said to be
a complex analytic manifold if the atlas consists of charts into open subsets of Cⁿ such that the
transition maps φi ∘ φj⁻¹ are biholomorphic.

All maps between complex analytic manifolds are required to be holomorphic; e.g. one considers
holomorphic vector fields and Chol(M), the space of holomorphic functions on M.
3 Lie groups and Lie algebras
In this section we introduce the stars of the show, Lie groups and Lie algebras.
3.1 Lie groups
Let (G, m, e) be a group, where m : G × G → G is the multiplication map and e ∈ G the identity
element.

Definition 3.1. The group (G, m, e) is said to be a Lie group if G is a manifold such that both
the multiplication map m : G × G → G and the inversion map g ↦ g⁻¹, G → G, are smooth.

We drop the notation m and simply write gh for m(g, h) if g, h ∈ G.
Example 3.2. Let S1 ⊂ C be the unit circle. Multiplication on C restricts to S1×S1 → S1, making
it a Lie group. It is the group of rotations of the real plane.
Example 3.3. The set of invertible n × n matrices GL(n, R) is an open subset of R^{n²} and hence
a manifold. Matrix multiplication and taking inverses are smooth maps. Therefore GL(n, R) is a
Lie group.
Example 3.5. Let SO(3) denote the set of all 3 × 3 real matrices A such that AAᵀ = 1 and
det(A) = 1. It is clear that this set is a group under the usual multiplication of matrices.
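The group property can be checked numerically. This is a quick sketch of our own (assuming numpy; random_so3 is a hypothetical helper sampling rotation matrices via a QR factorization): if AAᵀ = BBᵀ = 1 and det A = det B = 1, the same should hold for the product AB.

```python
import numpy as np

def random_so3(rng):
    """Sample a random rotation matrix: orthogonalise a Gaussian matrix by QR,
    then fix signs so that the determinant is +1."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q @ np.diag(np.sign(np.diag(r)))   # make the factorisation canonical
    if np.linalg.det(q) < 0:               # flip one column if det = -1
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(1)
for _ in range(50):
    A, B = random_so3(rng), random_so3(rng)
    C = A @ B
    assert np.allclose(C @ C.T, np.eye(3), atol=1e-10)   # C is orthogonal
    assert np.isclose(np.linalg.det(C), 1.0)             # and det C = 1
```

Of course the algebraic proof is one line: (AB)(AB)ᵀ = ABBᵀAᵀ = 1 and det(AB) = det A det B = 1.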
3.2 Morphisms of Lie groups
A morphism φ : G → H between Lie groups is a group homomorphism which is also a smooth
map between manifolds.
Exercise 3.6. For g ∈ G, let Lg : G→ G be the map Lg(h) = gh of left multiplication.
1. Using the fact that ig : G → G×G, ig(h) = (g, h), is a smooth map, show that Lg : G→ G
is a smooth map.
2. Using part (1), show that if U ⊂ G is an open subset and g ∈ G, then gU is also open in G.
3. Similarly, show that if C ⊂ G is a closed subset then gC is also closed in G.
Hints: In part 1. use the fact that the composite of two smooth maps is smooth. In parts 2. and
3. recall that a map f : X → Y between topological spaces is continuous if and only if f−1(U) is
open, resp. f−1(C) is closed, in X for all U ⊂ Y open, resp. C ⊂ Y closed.
Assume that the Lie group G is connected. Then, as the following proposition shows, one can
tell a great deal about the group by considering the various neighborhoods of the identity.
Proposition 3.7. Let G be a connected Lie group and U an open neighborhood of the identity.
Then the elements of U generate G.
Proof. Recall that G connected implies that the only non-empty subset of G that is both open and
closed is G itself. Since the map g ↦ g⁻¹ is smooth, U⁻¹ = {g⁻¹ | g ∈ U} is open in G. Thus, U ∩ U⁻¹
is also open. It is non-empty because e ∈ U ∩ U⁻¹. Replacing U by this intersection we may
assume that g⁻¹ ∈ U if and only if g ∈ U. By exercise 3.6, if g ∈ U then gU is open in G. Hence
U · U = ⋃_{g∈U} gU is open in G. This implies by induction that H = {g1 · · · gk | gi ∈ U} is an
open subset of G. But it's easy to check that H is also a subgroup of G. Therefore, to show that
H = G, it suffices to show that H is closed in G.

Let C = G ∖ H. Since H is open in G, C is closed. We assume that C ≠ ∅. Notice that if
g ∈ C, then gH ⊂ C and hence H ⊂ g⁻¹C, a closed subset of G. Thus, H is contained in the
intersection C′ = ⋂_{g∈C} g⁻¹C. The arbitrary intersection of closed sets is closed, thus C′ is closed.
Hence it suffices to show that H = C′. If f ∈ C′ ∖ H then, in particular, f ∈ (f⁻¹)⁻¹(G ∖ H),
i.e. there exists g ∈ G ∖ H such that f = (f⁻¹)⁻¹g = fg. But this implies that g = e belongs to
G ∖ H; a contradiction.
In particular, if f : G → H is a morphism of Lie groups with G connected then Proposition
3.7 implies that f is uniquely determined by what it does on neighborhoods of the identity. Taking
smaller and smaller neighborhoods of e, one eventually "arrives" at the tangent space of G at e
and the map def : TeG → TeH. Remarkably, this linear map captures all the information about
the original morphism f:
Theorem 3.8. Let G and H be Lie groups, with G connected. Then a morphism f : G → H is
uniquely defined by the linear map def : TeG→ TeH.
What this really means is that if f, g : G→ H are morphisms of Lie groups then f = g if and
only if def = deg. The proof of Theorem 3.8 is given in section 4; see Corollary 4.20.
Exercise 3.9. Let G = R× (:= R ∖ {0}), where the multiplication comes from the usual multiplication
on R. Show that the map φ : G → G, φ(x) = xⁿ, is a homomorphism of Lie groups. What
is TeG? Describe deφ : TeG → TeG.
Naturally, one can ask, as a converse to Theorem 3.8, which linear maps TeG → TeH extend
to a homomorphism of groups G → H? Surprisingly, there is a precise answer to this question.
But before it can be given we will need to introduce the notion of Lie algebras and describe the
Lie algebra that is associated to each Lie group.
3.3 The adjoint action
In this section we will define the Lie algebra of a Lie group. The idea is that geometric objects are
inherently non-linear, e.g. the manifold M ⊂ R³ defined by the non-linear equation x⁵ + y⁵ − z⁷ = 1.
The same applies to Lie groups. But humans don't seem to cope very well with non-linear objects.
Therefore Lie algebras are introduced as linear approximations to Lie groups. The truly remarkable
thing that makes Lie theory so successful is that this linear approximation captures a great deal
(a frankly unjustifiable amount) of information about the Lie group.
We begin with automorphisms. An automorphism φ of a Lie group G is an invertible morphism
of Lie groups G → G.
Exercise 3.10. Show that the set Aut(G) of all automorphisms of G forms a group.
An easy way to cook up a lot of automorphisms of G is to make G act on itself by conjugation.
Namely, for each g ∈ G, define Ad(g) ∈ Aut(G) by Ad(g)(h) = ghg−1. This defines a map
G→ Aut(G), g 7→ Ad(g), called the adjoint action.
Exercise 3.11. Check that Ad is a group homomorphism.
If φ ∈ Aut(G) belongs to the image of Ad, we say that φ is an inner automorphism of G.
We may also consider Ad as a smooth map G × G → G, Ad(g, h) = ghg−1. The key reason for
introducing the adjoint action is that it fixes the identity i.e. Ad(g)(e) = e for all g ∈ G.
If we return to the setting of Theorem 3.8, then the homomorphism f : G → H also sends
e ∈ G to e ∈ H. Moreover, the diagram

G --Ad(g)--> G
|f           |f
v            v
H --Ad(f(g))--> H        (5)

is commutative; that is,

Ad(f(g))(f(u)) = f(Ad(g)(u)) for all u ∈ G.

So we can begin our linear approximation process by differentiating diagram (5) at the identity,
to get a commutative diagram of linear maps

TeG --def--> TeH
|de Ad(g)    |de Ad(f(g))
v            v
TeG --def--> TeH        (6)
However, if we want to check that def really is the differential of some homomorphism f : G → H,
then (6) doesn't really help, because the right vertical arrow is de Ad(f(g)) and we would need to
know the value f(g) for g ≠ e, i.e. we still need to see what f is doing away from the identity.
To overcome this problem, we will use the different interpretation of Ad as a map G × G → G,
(g, h) ↦ ghg⁻¹. Just as in diagram (5), we get a commutative diagram

G × G --f×f--> H × H
|Ad            |Ad
v              v
G -----f-----> H        (7)
The temptation now is just to differentiate this diagram at (e, e). But this is not quite the right
thing to do. Instead, we differentiate each entry of Ad, resp. of f × f, separately to get a bilinear
map. You may not have seen the definition of bilinear before. As a reminder:
Definition 3.12. Let k be a field and U, V and W k-vector spaces. A map b : U ×V → W is said
to be bilinear if both b(u,−) : V → W and b(−, v) : U → W are linear maps, ∀ u ∈ U, v ∈ V , i.e.
b(αu1 + βu2, γv1 + δv2) = αγb(u1, v1) + αδb(u1, v2) + βγb(u2, v1) + βδb(u2, v2),
for all u1, u2 ∈ U, v1, v2 ∈ V, α, β, γ, δ ∈ k.
Remark 3.13. The bidifferential: if M, N and K are manifolds and f : M × N → K is a smooth
map, then the bidifferential b(m,n)f of f at (m, n) ∈ M × N is a bilinear map TmM × TnN → Tf(m,n)K.
Informally, one first fixes m ∈ M to get a smooth map fm : N → K. Differentiating at
n, we get a linear map dnfm : TnN → Tf(m,n)K. Then we fix w ∈ TnN and define f′w : M → TK
by f′w(m) = dnfm(w). Differentiating again we get dmf′w : TmM → Tf(m,n)K and hence b(m,n)f :
TmM × TnN → Tf(m,n)K given by (v, w) ↦ (dmf′w)(v). If one differentiates along M first and
then along N then we get the same bilinear map. This follows from the fact that ∂²f/∂xi∂xj = ∂²f/∂xj∂xi
for a smooth map f : Rⁿ → R.
Thus, we get a commutative diagram of bilinear maps

TeG × TeG --def × def--> TeH × TeH
|b(e,e) Ad               |b(e,e) Ad
v                        v
TeG --------def--------> TeH        (8)

Notice that only the bidifferential of f at the identity appears in the above diagram. We have no
need to know what f does away from the identity. To make the notation less cumbersome we fix
g = TeG and h = TeH. Then [−,−]G := b(e,e) Ad : g × g → g is a bilinear map. This is the Lie
bracket on the vector space g. Thus, we have shown
Proposition 3.14. Let f : G→ H be a morphism of Lie groups. Then, the linear map def : g→ h
preserves brackets i.e.
(def)([X, Y ]G) = [(def)(X), (def)(Y )]H , ∀ X, Y ∈ g. (9)
This leads us to the following key result, which is one of the main motivations in the definition
of Lie algebras.
Theorem 3.15. Let G and H be Lie groups, with G simply connected. Then a linear map g→ h
is the differential of a homomorphism G→ H if and only if it preserves the bracket, as in (9).
The proof of Theorem 3.15 will be given in section 4; see Theorem 4.18.
Example 3.16. As an example to keep some grasp on reality, we'll consider the case G = GL(V),
where V is some n-dimensional real vector space (so V ≃ Rⁿ). Then, for matrices A, B ∈ GL(V),
Ad(A)(B) = ABA⁻¹ is really just naive matrix conjugation. For G = GL(V), the tangent space
TeGL(V) equals End(V), the space of all linear maps V → V (after fixing a basis of V, End(V)
is just the space of all n × n matrices over R, so that End(V) ≃ R^{n²}). If A ∈ GL(V) and
Y ∈ End(V), then differentiating Ad with respect to B in the direction of Y gives

d(A,1) Ad(Y) = lim_{ε→0} (A(1 + εY)A⁻¹ − A(1)A⁻¹)/ε = AYA⁻¹.

Thus, d(A,1) Ad is just the usual conjugation action of A on End(V). Next, for each Y ∈ End(V),
we want to differentiate the map A ↦ d(A,1) Ad(Y) = AYA⁻¹ at 1 ∈ GL(V). This will give a linear
map ad(Y) : End(V) → End(V). If X, Y ∈ End(V), then (1 + εX)⁻¹ = 1 − εX + ε²X² − · · · . So,

ad(Y)(X) = lim_{ε→0} ((1 + εX)Y(1 − εX + ε²X² − · · ·) − Y)/ε
         = lim_{ε→0} (εXY − εYX + ε²(· · ·))/ε
         = XY − YX.

Thus, b(1,1) Ad(X, Y) = [X, Y] = XY − YX, the usual commutator of matrices.
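The limit computed in Example 3.16 can be approximated on a machine. The following sketch (our own, assuming numpy; random matrices stand in for X and Y) replaces the limit by a central difference of the conjugation map A ↦ AYA⁻¹ through A = 1 and compares it with the commutator XY − YX.

```python
import numpy as np

rng = np.random.default_rng(3)
n, eps = 4, 1e-6
X = rng.normal(size=(n, n))
Y = rng.normal(size=(n, n))

conj = lambda A: A @ Y @ np.linalg.inv(A)   # the map A |-> A Y A^{-1}

# central difference through 1 + eps X approximates the derivative at A = 1
deriv = (conj(np.eye(n) + eps * X) - conj(np.eye(n) - eps * X)) / (2 * eps)
bracket = X @ Y - Y @ X

assert np.allclose(deriv, bracket, atol=1e-4)
```

The agreement up to O(ε²) is precisely the series expansion (1 + εX)⁻¹ = 1 − εX + ε²X² − · · · used in the example.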
As explained in remark 3.13, [−,−]G is defined by differentiating the map Ad′Y : G→ g given
by Ad′Y (g) = (de Ad(g))(Y ). Swapping arguments, we may also consider the map Ad(g) : g → g
again given by Y 7→ (de Ad(g))(Y ).
Exercise 3.17. Show that Ad(g) Ad(h) = Ad(gh) for all g, h ∈ G. Conclude that Ad(g) is an
invertible linear map and hence defines a group homomorphism Ad : G→ GL(g).
The map Ad : G → GL(g) is a morphism of Lie groups. Its differential at the identity is
denoted ad : g→ End(g). Applying Proposition 3.14 to this situation gives the following lemma,
which will be useful later.
Lemma 3.18. The map ad preserves brackets i.e. ad([X, Y ]G) = [ad(X), ad(Y )]E, where [A,B]E :=
AB −BA is the bracket on End(g).
Remark 3.19. One can check that (de Ad′Y)(X) = ad(X)(Y) = [X, Y]G for all X, Y ∈ g.
3.4 Lie algebras
We define here the second protagonist in the story - the Lie algebra. We've essentially already seen
above that for each Lie group G, the space g is an example of a Lie algebra.
Definition 3.20. Let k be a field and g a k-vector space. Then g is said to be a Lie algebra if
there exists a bilinear map [−,−] : g× g→ g, called the Lie bracket, such that
1. The Lie bracket is anti-symmetric meaning that
[X, Y ] = −[Y,X], ∀ X, Y ∈ g.
2. The Jacobi identity is satisfied:
[X, [Y, Z]] + [Z, [X, Y ]] + [Y, [Z,X]] = 0, ∀ X, Y, Z ∈ g.
Exercise 3.21. Assume that char k 6= 2. Show that the first axiom in the definition of a Lie algebra
is equivalent to the condition [X,X] = 0 for all X ∈ g.
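Both axioms of Definition 3.20 can be verified numerically for the commutator of matrices (which, by Example 3.16, is the bracket coming from GL(V)). This sketch is our own, assuming numpy and random test matrices.

```python
import numpy as np

def bracket(A, B):
    """The commutator [A, B] = AB - BA on n x n matrices."""
    return A @ B - B @ A

rng = np.random.default_rng(4)
n = 5
X, Y, Z = (rng.normal(size=(n, n)) for _ in range(3))

# axiom 1: anti-symmetry
assert np.allclose(bracket(X, Y), -bracket(Y, X))

# axiom 2: the Jacobi identity
jacobi = (bracket(X, bracket(Y, Z))
          + bracket(Z, bracket(X, Y))
          + bracket(Y, bracket(Z, X)))
assert np.allclose(jacobi, np.zeros((n, n)), atol=1e-10)
```

Expanding the three double commutators shows that all eight triple products XYZ, XZY, . . . cancel in pairs, which is why the Jacobi identity holds identically for the commutator.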
Example 3.22. Let M be a manifold. Then the space Vect(M) of vector fields on M is a Lie
algebra via the rule

[X, Y] := X ∘ Y − Y ∘ X,

where X ∘ Y is the operator on C∞(M) given by first applying Y and then applying X.
Example 3.23. Let V be a k-vector space and gl(V ) = End(V ) the space of all linear maps V → V .
Then, as we have already seen, gl(V ) is a Lie algebra, the general linear Lie algebra, with bracket
[F,G] = F ∘ G − G ∘ F . If V is n-dimensional, then we may identify gl(V ) with gl(n, k), the Lie
algebra of n × n matrices, where the bracket of two matrices A and B is just the commutator
[A,B] = AB − BA. The Lie algebra gl(n, k) contains many interesting Lie subalgebras such as
n(n, k), the Lie algebra of all strictly upper triangular matrices, or b(n, k), the Lie algebra of all upper triangular matrices.
Exercise 3.24. Prove that gl(n, k) is a Lie algebra, and that b(n, k) and n(n, k) are Lie subalgebras
of gl(n, k).
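A numerical illustration of exercise 3.24 (not a substitute for the proof; the helper names are mine):

```python
# The commutator of two upper triangular matrices is upper triangular, and
# the commutator of two strictly upper triangular matrices is strictly upper
# triangular, so b(n, R) and n(n, R) are closed under the bracket.
import numpy as np

rng = np.random.default_rng(1)
n = 4

def bracket(A, B):
    return A @ B - B @ A

B1, B2 = (np.triu(rng.standard_normal((n, n))) for _ in range(2))       # b(n, R)
N1, N2 = (np.triu(rng.standard_normal((n, n)), k=1) for _ in range(2))  # n(n, R)

assert np.allclose(bracket(B1, B2), np.triu(bracket(B1, B2)))       # stays in b(n, R)
assert np.allclose(bracket(N1, N2), np.triu(bracket(N1, N2), k=1))  # stays in n(n, R)
# In fact [b, b] lands inside n: the diagonal of a commutator of two
# triangular matrices vanishes, since diagonal entries multiply commutatively.
assert np.allclose(np.diag(bracket(B1, B2)), 0)
```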
3.5 The Lie algebra of a Lie group
Recall that we defined in section 3.3 a bracket [−,−] on the tangent space g at the identity of a
Lie group G. As expected,
Proposition 3.25. The pair (g, [−,−]) is a Lie algebra.
Proof. The bracket is bilinear by construction. Therefore we need to check that it is anti-symmetric
and satisfies the Jacobi identity. By exercise 3.21, to check that the first axiom holds it suffices
to show that [X,X] = 0 for all X ∈ g.
Recall from the first definition of tangent spaces that an element X ∈ g can be written γ′(0)
for some γ : (−ε, ε) → G. Let Y = ρ′(0) be another element in g. We can express the bracket of
X and Y in terms of γ and ρ. First, for each t ∈ (−ε, ε) and g ∈ G, Ad(γ(t))(g) = γ(t)gγ(t)−1.
Then, taking g = ρ(s) and differentiating in s at s = 0 to get Y,
(de Ad(γ(t)))(Y ) = d/ds (γ(t)ρ(s)γ(t)−1)|s=0.
Differentiating in t at t = 0 gives
ad(X)(Y ) = d/dt [d/ds (γ(t)ρ(s)γ(t)−1)|s=0]|t=0.
Now take ρ = γ, so that Y = X. Since γ(t)γ(s) = γ(s)γ(t) for all s, t (this is proved in Lemma 4.6
below), we have γ(t)γ(s)γ(t)−1 = γ(s), and hence
[X,X] = ad(X)(X) = d/dt [d/ds γ(s)|s=0]|t=0 = d/dt [X]|t=0 = 0.
Using the anti-symmetric property of the bracket, the Jacobi identity is equivalent to the
identity [X, [Y, Z]] − [Y, [X,Z]] = [[X, Y ], Z] for all X, Y, Z ∈ g. Recall that ad(X)(Y ) = [X, Y ].
Therefore the above identity can be written [ad(X)ad(Y )−ad(Y )ad(X)](Z) = ad([X, Y ])(Z),
which would follow from the identity ad(X)ad(Y )−ad(Y )ad(X) = ad([X, Y ]) in End(g). But
this is exactly the statement of Lemma 3.18 that ad preserves brackets.
A Lie algebra h is said to be abelian if [X, Y ] = 0 for all X, Y ∈ h.
Exercise 3.26. Let H be an abelian Lie group. Show that its Lie algebra h is abelian.
3.6 Ideals and quotients
Let g be a Lie algebra. A subspace h of g is called a subalgebra if the bracket on g restricts to a
bilinear map [−,−] : h× h→ h. This makes h into a Lie algebra. A subalgebra l is said to be an
ideal if [l, g] ⊂ l. If l is an ideal of g then the quotient vector space g/l is itself a Lie algebra, with
[X + l, Y + l] := [X, Y ] + l.
Exercise 3.27. Let l ⊂ g be an ideal. Check that the bracket on g/l is well-defined and that g/l is
indeed a Lie algebra.
Exercise 3.28. Let l be a Lie subalgebra of g such that [g, g] ⊂ l. Show that l is an ideal and that
the quotient g/l is abelian.
Exercise 3.29. Show that n(2, k) is a one-dimensional ideal in b(2, k). More generally, show that
n(n, k) is an ideal in b(n, k). What are the dimensions of b(n, k) and n(n, k)? Is the quotient
b(n, k)/n(n, k) abelian?
3.7 Lie algebras of small dimension
Let g be a one-dimensional Lie algebra. Then g = kX for any 0 ≠ X ∈ g. What is the bracket
on g? The fact that the bracket must be anti-symmetric implies that [X,X] = 0. Therefore the
bracket is zero and g is unique up to isomorphism.
When dim g = 2, let X1, X2 be some basis of g. If the bracket on g is not zero, then the only
non-zero bracket can be [X1, X2] = −[X2, X1] and hence [g, g] is a one dimensional subspace of g.
Let Y span this subspace. Let X be any element not in [g, g] so that X, Y is also a basis of g.
Then [X, Y ] must be a non-zero element in [g, g], hence it is a multiple of Y . By rescaling X, we
may assume that [X, Y ] = Y . This uniquely defines the bracket on g (and one can easily check
that the bracket does indeed make g into a Lie algebra). Thus, up to isomorphism there are only
two Lie algebras of dimension two.
In dimension three there are many more examples, but it is possible to completely classify
them. The most important three dimensional Lie algebra is sl(2,C), the subalgebra of gl(2,C)
consisting of matrices of trace zero.
Exercise 3.30. Let
E = ( 0 1 ; 0 0 ), F = ( 0 0 ; 1 0 ), H = ( 1 0 ; 0 −1 ),
where (a b ; c d) denotes the 2 × 2 matrix with rows (a, b) and (c, d). Show that E,F,H is a
basis of sl(2,C). Calculate [H,E], [H,F ] and [E,F ].
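A quick numerical sanity check of exercise 3.30 (it spoils the answer, so try the hand computation first; the helper name `bracket` is mine):

```python
# The standard basis of sl(2, C), realised over the reals for simplicity.
import numpy as np

E = np.array([[0., 1.], [0., 0.]])
F = np.array([[0., 0.], [1., 0.]])
H = np.array([[1., 0.], [0., -1.]])

def bracket(A, B):
    return A @ B - B @ A

# All three matrices are trace-free, so they lie in sl(2).
assert all(np.isclose(np.trace(M), 0) for M in (E, F, H))

assert np.allclose(bracket(H, E), 2 * E)   # [H, E] = 2E
assert np.allclose(bracket(H, F), -2 * F)  # [H, F] = -2F
assert np.allclose(bracket(E, F), H)       # [E, F] = H
```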
4 The exponential map
In this section, we define the exponential map. This allows us to go back from the Lie algebra of
a Lie group to the group itself.
4.1 Left-invariant vector fields
First we note that there is a natural action of G on C∞(G), the space of all smooth functions on
G. Namely, given f ∈ C∞(G) and g, h ∈ G, define the new function g · f by
(g · f)(h) = f(g−1h). (10)
Now let X ∈ Vect(G) be a vector field on G. Since X is uniquely defined by its action on C∞(G),
we can define an action of G on Vect(G) by
(g ·X)(f) = g · [X(g−1 · f)], ∀ f ∈ C∞(G). (11)
It might seem a bit strange that I've put the inverse (−)−1 into equation (10), but this
is necessary to ensure that both C∞(G) and Vect(G) become left G-modules. The vector field X
is said to be left invariant if g · X = X for all g ∈ G. We denote by VectL(G) ⊂ Vect(G) the
subspace of all left invariant vector fields.
Exercise 4.1. Show that VectL(G) is a Lie subalgebra of Vect(G).
Lemma 4.2. Let G be a Lie group. The map X 7→ Xe defines an isomorphism of vector spaces VectL(G) ≃ g = TeG.
Proof. Let X ∈ VectL(G). We will show that X is uniquely defined by its value Xe at e. Let g, h ∈ G
and f ∈ C∞(G). First we consider the differential dh(g · f) : ThG → R. Since (g · f)(h) = f(g−1h),
we have g · f = f ∘ Lg−1 , where Lg : G → G is defined by Lg(h) = gh. Thus, dh(g · f) =
dg−1hf ∘ dhLg−1 . Then,
[(g ·X)(f)](h) = [g · (X(g−1 · f))](h) = [X(g−1 · f)](g−1h)
= [dg−1h(g−1 · f)](Xg−1h)
= (dhf) ∘ (dg−1hLg)(Xg−1h).
Since X is assumed to be left invariant, this implies that (dhf)(Xh) = (dhf) ∘ (dg−1hLg)(Xg−1h)
for all f ∈ C∞(G). Hence
Xh = (dg−1hLg)(Xg−1h), ∀ g, h ∈ G. (12)
In particular, taking h = g shows that Xg = (deLg)(Xe) i.e. X is uniquely defined by Xe. Thus,
the map X 7→ Xe is injective.
Conversely, for each X ∈ g, define the vector field X̃ on G by X̃g = (deLg)(X). This means
that X̃(f)(h) = (dhf)(deLh(X)) for all f ∈ C∞(G). It is left as an exercise to check that X̃
belongs to VectL(G).
Exercise 4.3. Check that the vector field X̃ defined in the proof of Lemma 4.2 is left invariant.
Remark 4.4. One can actually show that the isomorphism X 7→ Xe of Lemma 4.2 is an isomor-
phism of Lie algebras.
Example 4.5. Let G = (R×, ·). Then TG = R× × R and Vect(G) = {f ∂/∂x | f ∈ C∞(R×)}. If α ∈ R×
and f ∈ C∞(G), then the group R× acts on C∞(R×) by (α · f)(p) = f(α−1p), e.g. α · xn = α−nxn since
(α · xn)(β) = xn(α−1β) = (α−1β)n = α−nβn.
This implies that C∞(R×)G = R, the constant functions. Similarly,
(α · (f ∂/∂x))(xn) = α · [f · ∂/∂x(α−1 · xn)] = α · [f n αn xn−1] = n α (α · f) xn−1 = [α(α · f) ∂/∂x](xn).
Thus, α · (f ∂/∂x) = α(α · f) ∂/∂x; for instance α · (xn ∂/∂x) = α−n+1 xn ∂/∂x. This implies that
VectL(R×) = {c x ∂/∂x | c ∈ R} = R · x ∂/∂x.
4.2 One parameter subgroups
By Lemma 4.2 each element X ∈ g defines a unique left invariant vector field ν(X) such that
ν(X)e = X. Then, associated to ν(X) is an integral curve ϕX : J → G through e.
Lemma 4.6. The integral curve ϕX is defined on the whole of R. Moreover, it is a homomorphism
of Lie groups R→ G i.e. ϕ(s+ t) = ϕ(s)ϕ(t) for all s, t ∈ R.
Proof. Choose s, t ∈ J such that s+ t also belongs to J . Then we claim that ϕ(s+ t) = ϕ(s)ϕ(t).
Fix s and let t vary in some small open set J0 ⊂ J containing 0 such that s + t still belongs to J .
Then α(t) = ϕ(s+ t) and β(t) = ϕ(s)ϕ(t) are curves in G such that α(0) = β(0) = ϕ(s). Differentiating,
α′(t) = ϕ′(s+ t) = ν(X)ϕ(s+t) = ν(X)α(t). To calculate β′(t), we first write β(t) = (Lϕ(s) ∘ ϕ)(t). Then
β′(t) = (dϕ(t)Lϕ(s))(ϕ′(t)) = (dϕ(t)Lϕ(s))(ν(X)ϕ(t)) = ν(X)ϕ(s)·ϕ(t) = ν(X)β(t),
where we have used (12) in the last equality but one. Thus, α and β are both integral curves for ν(X)
through the point ϕ(s). By uniqueness of integral curves, they are equal.
We can use the equality ϕ(s + t) = ϕ(s)ϕ(t) to extend ϕ to the whole of R: for each s, t ∈ J such that s + t is not in J , set ϕ(s + t) = ϕ(s)ϕ(t). The uniqueness property of integral curves
shows that this is well-defined.
The homomorphism ϕX : R → G is called the one-parameter subgroup associated to X ∈ g.
The uniqueness of integral curves implies that it is the unique homomorphism γ : R→ G such
that γ′(0) = X.
Definition 4.7. The exponential map exp : g→ G is defined by exp(X) = ϕX(1).
The uniqueness of ϕX implies that ϕsX(t) = ϕX(st) for all s, t ∈ R (since the derivatives with
respect to t of ϕsX(t) and ϕX(st) at 0 are equal). Hence
(d0 exp)(X) = lim_{ε→0} (exp(εX) − exp(0))/ε = lim_{ε→0} (ϕX(ε) − ϕX(0))/ε = ϕ′X(0) = X,
where we have used the facts that exp(εX) = ϕεX(1) = ϕX(ε) and exp(0) = ϕ0(1) = e. Thus, the derivative d0 exp of
exp at 0 ∈ g is just the identity map.
In the case of G = GL(n,R), and hence g = gl(n,R), the exponential map can be explicitly
written down; it is just the usual exponential of matrices,
exp(X) = Σ_{i≥0} X^i / i! = In + X + X²/2 + · · · .
The same formula applies for any closed subgroup G of GL(n,R). This function behaves much
the same way as the exponential function ex : R→ R×. However, it is not true that exp(X + Y )
equals exp(X) exp(Y ) in general - see subsection 4.6 for more.
4.3 Cartan’s Theorem
So far the only significant example of a Lie group we have is GL(n,R). Cartan’s Theorem allows
us to cook up lots of new Lie groups. Our proof of Cartan’s Theorem is based on the proof in [?].
The proof of the theorem involves some basic analysis, which we give as a separate lemma. Let
Sn−1 denote the unit sphere in Rn. For each x ∈ Rn r 0, let [x] = x/||x|| ∈ Sn−1, i.e. [x] is the point
on the unit sphere in Rn that also lies on the line through 0 and x.
Lemma 4.8. Let xn, x ∈ Rn r 0 with limn→∞ xn = 0. Then, limn→∞[xn] = [x] if and only if
there exist positive integers an such that limn→∞ anxn = x.
Proof. If such integers exist then clearly limn→∞[xn] = [x], since [anxn] = [xn] for all n. Conversely,
assume that limn→∞[xn] = [x]. To say that limn→∞ anxn = x means that the distance ||anxn − x||
between anxn and x becomes arbitrarily small as n→∞. For each n choose a positive integer an
such that |an − ||x||/||xn|| | < 1. If v, w ∈ Rn and α, β ∈ R then the triangle inequality implies that
||αv − w|| ≤ ||βv − w||+ |α− β| · ||v||.
In our situation, this implies that
||anxn − x|| ≤ || (||x||/||xn||) xn − x || + |an − ||x||/||xn|| | · ||xn|| ≤ || (||x||/||xn||) xn − x || + ||xn||.
Now limn→∞ xn = 0 implies that limn→∞ ||xn|| = 0, and
|| (||x||/||xn||) xn − x || = ||x|| · || xn/||xn|| − x/||x|| || = ||x|| · || [xn] − [x] ||
also tends to zero as n tends to infinity since limn→∞[xn] = [x].
Theorem 4.9 (Cartan’s Theorem). Let G be a Lie group and H a closed subgroup. Then H is a
submanifold of G. Hence H is a Lie subgroup of G.
Proof. We begin by noting that it suffices to construct some open neighborhood U of e in G such
that H ∩ U is a closed submanifold of U . Assuming this, then for each h ∈ H, hU is an open
neighborhood of h in G such that H ∩ hU = h(H ∩ U) is a closed submanifold.
We fix a norm || · || on the vector space g. The key to the proof of Cartan’s Theorem is to
choose carefully a candidate subspace W ⊂ g for the Lie algebra of H. The space W is defined
to be all w ∈ g such that either w = 0 or there exists a sequence wn in g such that wn ≠ 0,
exp(wn) ∈ H, wn → 0 and [wn]→ [w].
We will show:
1. exp(W ) ⊆ H.
2. W is a subspace of g.
3. There is an open neighborhood U of 0 in g and a diffeomorphism φ : U → φ(U) ⊂ G with
φ(0) = e such that φ(U ∩W ) = φ(U) ∩H.
Assume (1)-(3) hold. Then, since W is a subspace of g, U ∩W is clearly a submanifold of
U . Therefore, part (3) implies that φ(U ∩W ) is a submanifold of φ(U). Hence φ(U) ∩ H is a
submanifold of φ(U) as required.
Proof of part (1): Let w ∈ W r 0, with wn as in the definition of W . By Lemma 4.8,
there exist positive integers an such that anwn → w. Since exp(anwn) = exp(wn)an ∈ H, and H
is closed in G, the limit exp(w) = limn→∞ exp(wn)an belongs to H.
Part (2): Since [w] = [tw] for all t ∈ R×, tw ∈ W if w ∈ W . Thus, it suffices to show that if
v, w ∈ W then v+w ∈ W . We can assume without loss of generality that v, w, v+w are non-zero.
Recall that exp is a diffeomorphism in a neighborhood of 0. Therefore, for sufficiently small t,
there exists a smooth curve t 7→ u(t) in g such that
exp(tv) exp(tw) = exp(u(t)), (13)
and u(0) = 0. Equation (13) implies that exp(u(t)) ∈ H for all small t, and differentiating (13) at t = 0 gives
lim_{ε→0} u(ε)/ε = u′(0) = v + w.
Then, since u(1/n) → u(0) = 0, exp(u(1/n)) belongs to H for all large n, and [u(1/n)] → [v + w]
(by Lemma 4.8, since n · u(1/n) → u′(0) = v + w), we conclude that v + w ∈ W .
Part (3): Let V be a complement to W in g. We define
φ : g = V ⊕W → G, (v, w) 7→ exp(v) exp(w).
The differential d0φ is just the identity map on g. Therefore there is some neighborhood U of 0 in g
such that φ is a diffeomorphism from U onto φ(U). Clearly, φ(U ∩W ) ⊂ φ(U)∩φ(W ) ⊂ H∩φ(U).
Therefore, we need to show that H ∩ φ(U) is contained in φ(U ∩W ).
Assume that this is not the case. Then, in every open neighbourhood Un of 0 in g there exist
(vn, wn) ∈ V ⊕W such that φ(vn + wn) ∈ H but vn ≠ 0. In particular, exp(vn) = φ(vn + wn) exp(wn)−1 ∈ H,
since exp(wn) ∈ H by part (1). We take U1 ⊃ U2 ⊃ · · · such that ∩n Un = {0}, so that vn is a
sequence converging to 0. Since the unit sphere S in V is compact, there exists some v ∈ V r 0
and a subsequence v′n of vn such that [v′n]→ [v] (this is the result from metric spaces saying that
every sequence in a compact space has a convergent subsequence). But this implies that v ∈ W ; a contradiction.
Remark 4.10. We've actually shown in the proof of Theorem 4.9 that, for any closed subgroup H of G, there is some neighborhood
U of e in G such that U ∩H = exp(h) ∩ U , where h = W is the Lie algebra of H.
Example 4.11. As a consequence, the following are Lie groups:
B(n,R) := {all upper triangular, invertible matrices},
N(n,R) := {A ∈ B(n,R) | the diagonal entries of A all equal 1},
T (n,R) := {all invertible diagonal matrices},
since they are all subgroups of GL(n,R) defined as the zeros of some polynomial equations.
Exercise 4.12. Using the criterion of example 2.6, show directly that B(n,R), N(n,R) and T (n,R)
are submanifolds of GL(n,R).
The analogue of Cartan's Theorem holds for closed subgroups of GL(n,C); they are complex Lie groups.
4.4 Simply connected Lie groups
We recall that a path connected topological space X is simply connected if the fundamental group
π1(X) of X is trivial i.e. every closed loop in X is homotopic to the trivial loop. In general
it is not true that a Lie group is uniquely defined by its Lie algebra i.e. it is possible to find
non-isomorphic Lie groups whose Lie algebras are isomorphic. However, if we demand that our
Lie group be simply connected, then:
Theorem 4.13 (S. Lie). Let g be a finite dimensional, real Lie algebra. Then, there exists a unique
simply connected Lie group G whose Lie algebra is g. Moreover, if G′ is any other connected Lie
group with Lie algebra g, then G′ is a quotient of G.
Recall that a covering map f : M → N is a map such that every n ∈ N is contained in some
open neighborhood U with f−1(U) a disjoint union of open sets, each mapping homeomorphically
onto U . Before we prove the theorem, we require some preparatory results. A covering map always
satisfies the path lifting property: if γ : [0, 1] → N is a path with n = γ(0) and m is a lift of n,
then there is a unique path γ̃ : [0, 1]→ M lifting γ such that γ̃(0) = m (we say that γ̃ is a lift of
γ if f ∘ γ̃ = γ). Using the path lifting property one can easily show that:
Lemma 4.14. Assume that M is simply connected and let g : Z → N be a smooth morphism from
a simply connected manifold Z sending z to n. Then there exists a unique morphism g̃ : Z →M ,
sending z to m, such that f ∘ g̃ = g.
An easy, but important, consequence of Lemma 4.14 is that every Lie group admits a simply connected cover, which is again a Lie group:
Proposition 4.15. Let G be a Lie group, H a connected manifold, and ϕ : H → G a covering
map. Choose e′, an element lying over the identity of G. Then there is a unique Lie group structure
on H such that e′ is the identity and ϕ is a map of Lie groups; and the kernel of ϕ is in the centre
of H.
Proof. We first assume that H is a simply connected manifold. Define the map α : H ×H → G
by α(h1, h2) = ϕ(h1)ϕ(h2)−1. By Lemma 4.14, there exists a unique map α′ : H ×H → H such
that α′(e′, e′) = e′ and ϕ ∘ α′ = α. Then we define h−1 := α′(e′, h) and h1 · h2 := α′(h1, h2−1) for
h, h1, h2 ∈ H. This defines smooth morphisms H → H and H ×H → H respectively, and one can use
the uniqueness of the lift α′ to show that this makes H into a group.
The proof of the following proposition follows easily from the proof of Cartan's Theorem, so we omit it.
Proposition 4.16. Let G be a Lie group with Lie algebra g and h a Lie subalgebra of g. Then
the subgroup H generated by exp(h) is a closed, Lie subgroup of G, whose Lie algebra is h.
We also note:
Lemma 4.17. Let G be a connected group and φ : G → H a morphism of Lie groups. If
deφ : g→ h is an isomorphism, then dgφ : TgG→ Tφ(g)H is an isomorphism for all g ∈ G.
Proof. This is simply a case of rewriting the map φ in a clever way. Recall that we have
Lg : G → G, Lg(u) = gu. Fix g ∈ G and define ϕ : G → H by ϕ(u) = φ(g)φ(u). We can rewrite
this as ϕ = Lφ(g) ∘ φ. This implies that deϕ = (deLφ(g)) ∘ (deφ) is an isomorphism. On the other
hand, ϕ also equals u 7→ φ(gu), which we can write as ϕ = φ ∘ Lg. Therefore, deϕ = dgφ ∘ deLg.
Since both deϕ and deLg are invertible linear maps, this implies that dgφ is too.
Proof of Theorem 4.13. The key to Theorem 4.13 is Ado's Theorem, Theorem 7.27, which says
that there exists some n ≫ 0 such that g is a Lie subalgebra of gl(n,R). Therefore Proposition
4.16 implies that there is some connected, closed Lie subgroup G′ ⊂ GL(n,R) with Lie G′ = g.
Let ϕ : G → G′ be the universal cover of G′. Proposition 4.15 says that we can endow G with the
structure of a Lie group such that ϕ is a quotient of Lie groups. Moreover, deϕ : g → g is the
identity map.
Thus, it suffices to show that if G′ is another simply connected Lie group with Lie algebra
g then G′ ≃ G. To show this, we consider the product G × G′. Its Lie algebra is g ⊕ g and
the diagonal copy g∆ of g in g ⊕ g is a Lie subalgebra. Therefore Proposition 4.16 implies that
there is some connected, closed Lie subgroup K ⊂ G × G′ such that Lie K = g∆. The maps
φ1 : K ⊂ G × G′ → G and φ2 : K ⊂ G × G′ → G′ (inclusion followed by projection) are
homomorphisms of Lie groups whose differential at the identity is the identity map on g. Thus, we have maps φi between Lie groups
whose differential is an isomorphism on Lie algebras. Hence, Lemma 4.17 implies that dkφi is an
isomorphism for all k ∈ K. Now, we have maps φi : K → G,G′ between connected manifolds,
where G and G′ are simply connected. As mentioned at the end of section 2.3, this implies that
each φi is an isomorphism. Hence we may identify K ' G ' G′.
4.5 The proof of Theorems 3.15 and 3.8
Finally, we have the theorem that motivated the definition of a Lie algebra in the first place. It
was stated as Theorem 3.15.
Theorem 4.18. Let G and H be Lie groups with G simply connected, and let g and h be their
Lie algebras. A linear map µ : g → h is the differential of a morphism φ : G → H if and only if
µ is a map of Lie algebras.
Proof. We showed in Proposition 3.14 that if µ is the differential of a homomorphism then it must
be a map of Lie algebras. Therefore we need to show the existence of φ, knowing that µ is a map
of Lie algebras.
We will deduce the theorem from Proposition 4.16, applied to G × H. The Lie algebra of
G×H is just g× h. Let Γµ = {(X,µ(X)) | X ∈ g} ⊂ g× h be the graph of µ. Then the fact that
µ is a map of Lie algebras is equivalent to saying that Γµ is a Lie subalgebra of g× h. Let K be
the subgroup of G×H generated by exp(Γµ). By Proposition 4.16, K is a closed Lie subgroup of
G×H whose Lie algebra is Γµ.
Projection from G × H onto G is a homomorphism of Lie groups. Therefore the composite
η : K → G×H → G is also a homomorphism. The differential deη is just the projection map from
Γµ to g, which is an isomorphism. Thus, we have a map η between Lie groups whose differential
is an isomorphism on Lie algebras. We note that since exp(Γµ) is the image of a connected space
under a continuous map, it is connected. Therefore, exp(Γµ) is contained in K0, the connected
component of K containing e. Thus, K = K0 is connected. Hence, Lemma 4.17 implies that
dkη is an isomorphism for all k. Now, we have a map η : K → G between connected manifolds,
where G is simply connected. As mentioned at the end of section 2.3, this implies that η is an
isomorphism. Hence we may identify G ≃ K. Then the composite G ≃ K ⊂ G×H → H is a map
of Lie groups whose differential is µ, as required.
Remark 4.19. The map φ constructed in the proof of Theorem 4.18 is unique. To see this, assume
we are given another map ϕ : G → H such that deϕ = µ. Let Γϕ = {(g, ϕ(g)) ∈ G×H | g ∈ G} ⊂ G×H
be the graph of ϕ. Since ϕ is a homomorphism of Lie groups, Γϕ is a Lie subgroup
of G × H. The Lie algebra of Γϕ equals the Lie algebra Γµ of K ⊂ G × H. This implies that
exp(Γµ) is contained in Γϕ and hence K ⊂ Γϕ. But both K and Γϕ are connected Lie groups of
the same dimension, hence they are equal. Since both ϕ and φ are defined to be projection from
K = Γϕ onto H, ϕ = φ.
Now the proof of Theorem 3.8 is an easy corollary.
Corollary 4.20. Let G and H be Lie groups, with G connected. Then a morphism f : G→ H is
uniquely defined by the linear map def : TeG→ TeH.
Proof. By Theorem 4.13, there is a simply connected Lie group G̃ and surjection u : G̃ → G such
that Lie G̃ = Lie G = g and deu = idg. Thus, de(f ∘ u) = def . By Theorem 4.18 and remark
4.19, there is a unique homomorphism h : G̃ → H such that deh = de(f ∘ u). This implies that
h = f ∘ u. If g : G → H were another map such that deg = def , then f ∘ u = g ∘ u. But u is
surjective, therefore this implies that f = g.
4.6 The Campbell-Hausdorff formula
Recall that exp(X) exp(Y ) ≠ exp(X + Y ) in general. What we'd like is some ”product” X ⋆ Y
such that exp(X ⋆ Y ) = exp(X) exp(Y ). If g ∈ G is an element that is sufficiently close to the
identity in G, then an inverse to exp is given by
log(g) = Σ_{i≥1} (−1)^{i+1} (g − e)^i / i ∈ g ⊂ gl(n,R).
We can try to use this to define ⋆ on gl(n,R) by
X ⋆ Y = log(exp(X) exp(Y )).
If we unpack this, being careful to remember that X and Y don't necessarily commute, then we get
exp(X) exp(Y ) = e + (X + Y ) + (1/2)(X² + 2XY + Y ²) + · · · ,
so that
X ⋆ Y = (X + Y ) + (1/2)(X² + 2XY + Y ²) − (1/2)(X + Y )² + · · ·
= X + Y + (1/2)[X, Y ] + · · · .
We see that X ⋆ Y , up to quadratic terms, only depends on linear combinations of X, Y and brackets
of X and Y . Remarkably, this is true for all higher terms too. The resulting formula is called the
Campbell-Hausdorff formula.
Exercise 4.21. Calculate the degree three term of the Campbell-Hausdorff formula for X ⋆ Y .
The key point of the Campbell-Hausdorff formula is that it shows that the product in G ⊂ GL(n,R)
can be described, at least in some neighborhood of the identity, completely in terms of
the Lie bracket on g.
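The second-order Campbell-Hausdorff approximation can be tested numerically; the sketch below (my own truncated-series helpers `expm` and `logm`, valid only near the identity) checks that the error is of order t³:

```python
# Check that log(exp(tX)exp(tY)) = t(X+Y) + (t^2/2)[X,Y] + O(t^3)
# for a pair of small non-commuting matrices.
import numpy as np

def expm(A, terms=40):
    R, T = np.eye(len(A)), np.eye(len(A))
    for i in range(1, terms):
        T = T @ A / i
        R = R + T
    return R

def logm(G, terms=40):
    # log(g) = sum_{i>=1} (-1)^{i+1} (g - e)^i / i, valid near the identity.
    A = G - np.eye(len(G))
    R, P = np.zeros_like(A), np.eye(len(A))
    for i in range(1, terms):
        P = P @ A
        R = R + (-1) ** (i + 1) * P / i
    return R

X = np.array([[0., 1.], [0., 0.]])
Y = np.array([[0., 0.], [1., 0.]])

for t in [0.1, 0.05]:
    star = logm(expm(t * X) @ expm(t * Y))              # t X * t Y
    bch2 = t * (X + Y) + 0.5 * t**2 * (X @ Y - Y @ X)   # second-order BCH
    assert np.linalg.norm(star - bch2) < 2 * t**3       # error shrinks like t^3
```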
5 The classical Lie groups and their Lie algebras
In this section we describe the classical Lie groups. By Cartan’s Theorem, every closed subgroup
of GL(n,R) or GL(n,C) is a real (resp. complex) Lie group. Using this fact we can produce many
interesting new examples.
5.1 The classical real Lie groups
Given a matrix A in Mat(n,R) or in Mat(n,C), we let AT denote its transpose. Similarly, if
A ∈ Mat(n,C) then A∗ = (Ā)T is the Hermitian conjugate of A, where Ā is the matrix
obtained by taking the complex conjugate of each entry, e.g.
( 2 + i  −7i ; 3  −1 + 6i )∗ = ( 2 − i  3 ; 7i  −1 − 6i ),
where (a b ; c d) denotes the 2 × 2 matrix with rows (a, b) and (c, d). Finally we define the
2n× 2n matrix
Jn = ( 0  In ; −In  0 ),
where In ∈ GL(n,R) is the identity matrix.
Remark 5.1. If v, w ∈ R2n, thought of as column vectors, then ω(v, w) := vT · Jn · w is a number.
Thus, ω defines a bilinear form on R2n. It is skew-symmetric and non-degenerate. This form
is the starting point of symplectic geometry. This subject is the modern face of
”Hamiltonian mechanics”, first developed by the Irish mathematician William Hamilton in 1833
(the same mathematician who first conjured up the quaternions).
Now we can define the real Lie groups
SL(n,R) := {A ∈ GL(n,R) | det(A) = 1},
SO(n,R) := {A ∈ GL(n,R) | det(A) = 1, AT · A = 1},
O(n,R) := {A ∈ GL(n,R) | AT · A = 1},
Sp(n,R) := {A ∈ GL(2n,R) | AT · Jn · A = Jn},
SU(n) := {A ∈ GL(n,C) | det(A) = 1, A∗ · A = 1},
U(n) := {A ∈ GL(n,C) | A∗ · A = 1};
these are the special linear group, special orthogonal group, orthogonal group, symplectic
group, special unitary group and unitary group, respectively. Their Lie algebras are
LieSL(n,R) = sl(n,R) := {A ∈ gl(n,R) | Tr(A) = 0},
LieSO(n,R) = LieO(n,R) = o(n,R) := {A ∈ gl(n,R) | AT + A = 0},
LieSp(n,R) = sp(n,R) := {A ∈ gl(2n,R) | AT · Jn + Jn · A = 0},
LieSU(n) = su(n) := {A ∈ gl(n,C) | Tr(A) = 0, A∗ + A = 0},
LieU(n) = u(n) := {A ∈ gl(n,C) | A∗ + A = 0}.
Let us show that the Lie algebra of SL(n,R) is sl(n,R). If M is a submanifold of Rk defined,
as in example 2.6, by the series of equations f1 = · · · = fr = 0 and m ∈ M , then TmM is the
subspace of TmRk = Rk defined by
TmM = { v ∈ Rk | lim_{ε→0} (fi(m+ εv)− fi(m))/ε = 0, ∀ i = 1, . . . , r } = Ker(dmF : TmRk → T0Rr),
where F = (f1, . . . , fr) : Rk → Rr. So, in our case, we have
TIn SL(n,R) = { A ∈ gl(n,R) | lim_{ε→0} ((det(In + εA)− 1)− (det(In)− 1))/ε = 0 }
= { A ∈ gl(n,R) | lim_{ε→0} (det(In + εA)− det(In))/ε = 0 }.
Recall that an explicit formula for det is given by
det(B) = Σ_{σ∈Sn} (−1)^σ b1,σ(1) · · · bn,σ(n), (14)
where Sn is the symmetric group on n letters. In the limit ε → 0 only the term linear in ε in the
numerator will matter; all higher order terms will go to zero. If we take an arbitrary term in the
sum (14), corresponding to some σ ≠ 1, then there must be i and j such that σ(i) ≠ i and σ(j) ≠ j. For
each such term in det(In + εA), the factor bi,σ(i)bj,σ(j) will contribute at order ε². Thus,
det(In + εA) = (1 + εa1,1) · · · (1 + εan,n) + ε²(· · · ),
so that
det(In + εA)− det(In) = εTr(A) + ε²(· · · ).
Therefore, A ∈ TIn SL(n,R) if and only if Tr(A) = 0.
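A finite-difference spot check (illustrative only) of the identity det(In + εA) − det(In) = εTr(A) + ε²(· · ·):

```python
# The derivative of det at the identity is the trace.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))

for eps in [1e-4, 1e-5]:
    lhs = (np.linalg.det(np.eye(4) + eps * A) - 1.0) / eps
    # the difference quotient approaches Tr(A) with error O(eps)
    assert abs(lhs - np.trace(A)) < 100 * eps
```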
We'll also show LieSp(n,R) = sp(n,R), just to make sure we get the hang of things. This one
is much easier:
TI2n Sp(n,R) = { A ∈ gl(2n,R) | lim_{ε→0} (((I2n + εA)T Jn(I2n + εA)− Jn)− (I2nT Jn I2n − Jn))/ε = 0 }.
Since
((I2n + εA)T Jn(I2n + εA)− Jn)− (I2nT Jn I2n − Jn) = ε(AT Jn + JnA) + ε²(AT Jn A),
we deduce that sp(n,R) = {A ∈ gl(2n,R) | AT Jn + JnA = 0}.
Exercise 5.2. Show that LieSO(n,R) = LieO(n,R) = o(n,R) and LieSU(n) = su(n).
Exercise 5.3. What is the dimension of the Lie groups SL(2,R), Sp(2n,R), O(n,R) and SU(n)?
Hint: The dimension of a connected manifold is the same as the dimension of the tangent space
TmM for any m ∈M .
Exercise 5.4. If ei,j is the n× n matrix with a one in the (i, j)th position and zero elsewhere, give
a formula in terms of Kronecker-delta symbols δi,j for the commutators [ei,j, ek,l] in gl(n,R).
5.2 The classical complex Lie groups
We list here the complex analogues of the above real Lie groups. There are no natural analogues
of the unitary and special unitary groups.
SL(n,C) := {A ∈ GL(n,C) | det(A) = 1},
SO(n,C) := {A ∈ GL(n,C) | det(A) = 1, AT · A = 1},
O(n,C) := {A ∈ GL(n,C) | AT · A = 1},
Sp(n,C) := {A ∈ GL(2n,C) | AT · Jn · A = Jn}.
Their Lie algebras are
LieSL(n,C) = sl(n,C) := {A ∈ gl(n,C) | Tr(A) = 0},
LieSO(n,C) = LieO(n,C) = o(n,C) := {A ∈ gl(n,C) | AT + A = 0},
LieSp(n,C) = sp(n,C) := {A ∈ gl(2n,C) | AT · Jn + Jn · A = 0}.
5.3 The quaternions
The complex numbers C are a field, which can also be thought of as a two-dimensional vector
space over R. One can ask if there are other fields that are finite dimensional vector spaces over R.
Strangely, the answer is no. However, if one considers skew-fields, i.e. real vector spaces that are
also rings (but not necessarily commutative) such that every non-zero element is invertible, then
there does exist one other example.
The quaternions are the four-dimensional real vector space H = R⊕Ri⊕Rj⊕Rk, which is also
a ring, where multiplication is determined by
i2 = j2 = k2 = ijk = −1.
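These relations determine the whole multiplication table of H; the sketch below (plain 4-tuples, my own helper `qmul`) verifies them:

```python
# A quaternion a + bi + cj + dk is stored as the tuple (a, b, c, d);
# qmul implements the Hamilton product derived from i^2 = j^2 = k^2 = ijk = -1.
def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
minus_one = (-1, 0, 0, 0)

assert qmul(i, i) == qmul(j, j) == qmul(k, k) == minus_one
assert qmul(qmul(i, j), k) == minus_one                 # ijk = -1
assert qmul(i, j) == k and qmul(j, i) == (0, 0, 0, -1)  # ij = k = -ji
```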
Exercise 5.5. Show that every element in H× := Hr0 is invertible. By giving explicit equations
for multiplication, show that H× is a real Lie group.
The complex conjugation on C extends to a conjugation u 7→ ū on H, where ū := a − bi − cj − dk
for u = a + bi + cj + dk.
Exercise 5.6. Show that uū = a² + b² + c² + d² if u = a + bi + cj + dk.
As a complex vector space H = C ⊕ Cj, and the conjugate of z + wj is z̄ − wj for z, w ∈ C. The group
H× acts on H on the right, u 7→ uv for u ∈ H and v ∈ H×. The subgroup of H× consisting of all
elements v such that vv̄ = 1 is denoted S1(H).
Exercise 5.7. 1. Thinking of elements u in H as row vectors of length two, describe the action
of v ∈ H× on u as a 2 by 2 complex matrix A(v) so that u 7→ uA(v). Hint: First show that
if z ∈ C ⊂ H, then jz = zj.
2. Describe, as 2 by 2 matrices those elements belonging to S1(H).
3. Using part (2), construct an explicit isomorphism of Lie groups S1(H)∼−→ SU(2).
4. Show that S1(H) ' SU(2) ' S3 (the 3-sphere in R4) as manifolds.
If you enjoy playing around with quaternions, you should take a look at John Baez’ brilliant
article on the octonions at
5.4 Other exercises
Exercise 5.8. Construct an isomorphism between GL(n,R) and a closed subgroup of SL(n+1,R).
Exercise 5.9. Show that the map C××SL(n,C)→ GL(n,C), (λ,A) 7→ λA, is surjective. Describe
its kernel. Describe the corresponding homomorphism of Lie algebras.
6 Representation theory
In this section, we introduce representations of Lie algebras. For simplicity, we will assume that
g is a complex Lie algebra.
6.1 Representations of Lie algebras
Let g be a Lie algebra and V a finite dimensional vector space. A representation of g is a
homomorphism of Lie algebras ρ : g→ gl(V ).
Example 6.1. If g is the Lie algebra of a Lie group, then Lemma 3.18 says that ad : g → gl(g) is a representation of g.
As in the literature, we will often use the equivalent language of modules. A g-module is a vector
space V together with a bilinear action map − · − : g× V → V such that
[X, Y ] · v = X · (Y · v)− Y · (X · v) ∀ X, Y ∈ g, v ∈ V.
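The motivating example: any subalgebra g ⊂ gl(n,C) makes Cⁿ a g-module via X · v = Xv, and the axiom above is just associativity of matrix multiplication. A numerical spot check (my own setup, real matrices for simplicity):

```python
# [X, Y].v = X.(Y.v) - Y.(X.v) when the action is matrix-vector multiplication.
import numpy as np

rng = np.random.default_rng(3)
X, Y = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
v = rng.standard_normal(3)

lhs = (X @ Y - Y @ X) @ v        # [X, Y] . v
rhs = X @ (Y @ v) - Y @ (X @ v)  # X.(Y.v) - Y.(X.v)
assert np.allclose(lhs, rhs)
```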
Exercise 6.2. Let ρ : g→ gl(V ) be a representation. Show that V is a g-module with action map
X · v = ρ(X)(v). Conversely, if V is a g-module, define ρ : g → End(V ) by ρ(X)(v) = X · v.
Show that ρ is actually a representation. Check that this defines a natural equivalence between
g-representations and g-modules.
Remark 6.3. For those of you who are comfortable with the language of categories, both repre-
sentations of a Lie algebra g and the collection of all g-modules form categories; in fact they are
abelian categories. Then exercise 6.2 is really saying that these two categories are equivalent.
A morphism of g-modules is a linear map φ : V1 → V2 such that φ(X · v) = X · φ(v) for all X ∈ g
and v ∈ V1, i.e. φ commutes with the action of g.
If φ : V1 → V2 is an invertible morphism of g-modules then φ is said to be an isomorphism of
g-modules.
Exercise 6.4. Let φ : V1 → V2 be an isomorphism of g-modules. Show that φ−1 : V2 → V1 is also
a morphism of g-modules.
6.4 Simple modules
Let V be a g-module. A subspace W of V is said to be a submodule of V if the action map
· : g× V → V restricts to an action map · : g×W → W . Equivalently, if X · w belongs to W for
all X ∈ g and w ∈ W . We say that W is a proper submodule of V if 0 ≠ W ⊊ V .
Definition 6.5. A g-module is simple if it contains no proper submodules.
Given a particular Lie algebra g, one of the first things that one would want to work out as a
representation theorist is a way to describe all the simple g-modules. This is often possible (but not always easy).
6.5 New modules from old
We describe ways of producing new g-modules. Given two g-modules M and N , the space of
g-module homomorphisms is denoted Homg(M,N). When M = N we write Endg(M) for this
space. Notice that Homg(M,N) is a subspace of HomC(M,N) and Endg(M) is a subspace of EndC(M).
Lemma 6.6 (Schur’s Lemma). Let V be a simple, finite dimensional g-module. Then every
g-module endomorphism of V is just a multiple of the identity i.e. Endg(V ) = C.
Exercise 6.7. 1. Prove Schur’s lemma.
2. If V and W are non-isomorphic, simple g-modules, show that Homg(V,W ) = 0.
Let m and l be Lie algebras. Make m⊕ l into a Lie algebra by
[(X1, Y1), (X2, Y2)] = ([X1, X2], [Y1, Y2]), ∀ Xi ∈ m, Yi ∈ l.
Exercise 6.8. If V is an m-module and W is an l-module, show that V ⊗W is an (m⊕ l)-module via
(X, Y ) · (v ⊗ w) = (X · v)⊗ w + v ⊗ (Y · w), ∀ (X, Y ) ∈ m⊕ l, v ∈ V, w ∈ W.
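In matrix terms, the natural (m ⊕ l)-action on V ⊗ W, (X, Y ) · (v ⊗ w) = (X · v) ⊗ w + v ⊗ (Y · w), is given by the Kronecker sum X ⊗ I + I ⊗ Y. A numerical check (my own helper names, illustrative only) that this assignment preserves brackets:

```python
# rho(X, Y) = X (x) I + I (x) Y acting on V (x) W; the cross terms commute,
# so rho([(X1,Y1),(X2,Y2)]) = [rho(X1,Y1), rho(X2,Y2)].
import numpy as np

rng = np.random.default_rng(4)

def rho(X, Y):
    return np.kron(X, np.eye(Y.shape[0])) + np.kron(np.eye(X.shape[0]), Y)

def bracket(A, B):
    return A @ B - B @ A

X1, X2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))  # acting on V
Y1, Y2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))  # acting on W

assert np.allclose(rho(bracket(X1, X2), bracket(Y1, Y2)),
                   bracket(rho(X1, Y1), rho(X2, Y2)))
```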
Next we prove the following proposition, based on exercise 6.8.
Proposition 6.9. Let V be a simple m-module and W a simple l-module. Then V ⊗W is a simple
(m⊕ l)-module.
Using only Schur's lemma, the proof of proposition 6.9 is quite difficult. We break it into
a series of lemmata.
Lemma 6.10. Let V be a simple m-module and v1, v2 ∈ V such that v1 is not proportional to v2.
Then the smallest submodule of V ⊕ V containing (v1, v2) is V ⊕ V .
Proof. Let U be the smallest submodule of V ⊕ V containing (v1, v2). The inclusion map i : U →
V ⊕ V is a module homomorphism, as are the two projection maps p1, p2 : V ⊕ V → V. Therefore
the maps p1 ∘ i, p2 ∘ i : U → V are also homomorphisms. Since V is simple, U is non-zero and p1 ∘ i
is a non-zero map, it must be surjective. Now the kernel of p1 ∘ i is contained in the kernel of p1,
which equals 0 ⊕ V , a simple module. Therefore, either the kernel of p1 ∘ i is 0 and U ≃ V ,
or it is 0 ⊕ V , in which case U = V ⊕ V . Let's assume that the kernel of p1 ∘ i is zero so that
p1 ∘ i : U ≃ V . Its inverse will be written φ : V → U . The map p2 ∘ i will also be an isomorphism
U ≃ V . Hence p2 ∘ i ∘ φ is an isomorphism V ≃ V . Since V is simple, Schur's lemma implies
that p2 ∘ i ∘ φ is some multiple of the identity map. But p2 ∘ i ∘ φ applied to v1 is p2 ∘ i(v1, v2) = v2.
By assumption, v2 is not proportional to v1, so p2 ∘ i ∘ φ can't be a multiple of the identity. This
is a contradiction. Hence U = V ⊕ V .
Next, we prove a special case of the proposition.
Lemma 6.11. Let V be a simple m-module and W a simple l-module. The smallest (m ⊕ l)-
submodule of V ⊗W containing a pure tensor v ⊗ w ≠ 0 is V ⊗W .
Proof. The module V ⊗W is spanned by all vectors v′ ⊗w′ for v′ ∈ V and w′ ∈ W . So it suffices
to show that v′ ⊗w′ is contained in the smallest submodule U of V ⊗W containing v ⊗w. Since
(0, Y ) · (v ⊗ w) = v ⊗ (Y · w) for all Y ∈ l ⊂ m ⊕ l and W is simple, v ⊗W is the smallest l-submodule of
V ⊗W containing v ⊗ w. Hence v ⊗ w′ ∈ U for every w′ ∈ W . Similarly, the smallest m-submodule of U containing
v ⊗ w′ is V ⊗ w′. Hence v′ ⊗ w′ ∈ U .
Proof of Proposition 6.9. Let u be any non-zero element in V ⊗W and U the smallest (m ⊕ l)-submodule of V ⊗W containing u. We need to show that U = V ⊗W . If we knew that
0 ≠ v ⊗ w ∈ U for some pure tensor, then the result would follow from Lemma 6.11. Write u = v1 ⊗ w1 + · · · + vk ⊗ wk. After
rewriting, we may assume that v1, . . . , vk are linearly independent. Moreover, we may also assume
that no pair wi1 , wi2 is proportional (if wi2 = αwi1 , then vi1 ⊗ wi1 + vi2 ⊗ wi2 = (vi1 + αvi2) ⊗ wi1).
Replacing u by another element of U if necessary, we may also assume that k is minimal among non-zero elements of U satisfying
the above properties. If k = 1 then we are done. So we assume that k > 1 and construct an
element in U of smaller length. Since v1, . . . , vk are linearly independent, we can define an
injective l-module homomorphism ψ : W ⊕ · · · ⊕W → V ⊗W (k copies), ψ(w′1, . . . , w′k) = v1 ⊗ w′1 + · · · + vk ⊗ w′k.
Then u is the image of w := (w1, . . . , wk) under this map. If w′ is any element in the smallest
l-submodule W ′ of W ⊕ · · · ⊕W containing w, then ψ(w′) belongs to U . So it suffices to show
that there is some non-zero element w′ in W ′ with at least one coordinate 0; in this case ψ(w′)
will be a sum of fewer than k terms. Consider the projection of W ′ onto the first two factors W ⊕W .
The image is an l-submodule of W ⊕W containing (w1, w2). But we assumed that w1 and w2 are not
proportional. Hence Lemma 6.10 (applied to the simple l-module W ) implies that this projection must be the whole of W ⊕W . In
particular, (w1, 0) is in the image of the projection. Let w′ be some element in W ′ projecting onto
(w1, 0). This element is non-zero and has at least one coordinate zero, as required.
Exercise 6.12. Let V and W be finite dimensional g-modules and HomC(V,W ) the space of linear
maps from V to W . Show that the rule
(X · f)(v) = X · f(v)− f(X · v), ∀ X ∈ g, v ∈ V, f ∈ HomC(V,W )
makes HomC(V,W ) into a g-module.
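The module axiom for this action, [X, Y ] · f = X · (Y · f) − Y · (X · f), is easy to check numerically. The following Python sketch is our own illustration (not part of the notes): we take V = W = C² with sl(2, C) acting by its defining matrices, so that X · f is simply the matrix commutator.

```python
import numpy as np

# Basis of sl(2, C), acting on V = W = C^2 via the defining 2x2 matrices.
E = np.array([[0., 1.], [0., 0.]])
F = np.array([[0., 0.], [1., 0.]])
H = np.array([[1., 0.], [0., -1.]])

def act(X, f):
    """(X . f)(v) = X . f(v) - f(X . v); for matrix f this is X @ f - f @ X."""
    return X @ f - f @ X

rng = np.random.default_rng(0)
f = rng.standard_normal((2, 2))  # a random element of Hom_C(V, W)

# Check [X, Y] . f = X . (Y . f) - Y . (X . f) for each pair of basis elements.
for X, Y in [(E, F), (H, E), (H, F)]:
    lhs = act(X @ Y - Y @ X, f)
    rhs = act(X, act(Y, f)) - act(Y, act(X, f))
    assert np.allclose(lhs, rhs)
```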
6.6 Representations of sl2
In this section we will completely describe the simple finite dimensional representations of sl(2,C).
Recall that g := sl(2,C) has a basis E,F,H such that [H,E] = 2E, [H,F ] = −2F and
[E,F ] = H. The relations imply that H is a semi-simple element. Let V be some finite dimensional
g-module. We can decompose V into generalized eigenspaces with respect to H,
V = ⊕_{α∈C} Vα, where Vα = {v ∈ V | (H − α)^N · v = 0 for N ≫ 0}.
We say that V is a direct sum of weight spaces for H. If v ∈ V belongs to a single Vα then we say
that v is a weight vector.
Exercise 6.13. If v ∈ Vα, show that E · v ∈ Vα+2 and F · v ∈ Vα−2.
Notice that if Vα ≠ 0 but Vα+2 = 0 then the exercise implies that E · v = 0 for all v ∈ Vα. A
non-zero weight vector v ∈ V is called a highest weight vector if E · v = 0 and H · v = αv.
Exercise 6.14. Let V be a finite dimensional g-module. Show that V contains a highest weight
vector v0. Set v−1 = 0 and vi = (1/i!)F^i · v0 for i ≥ 0. If v0 has weight α, show by induction on i that
1. H · vi = (α− 2i)vi,
2. E · vi = (α− i+ 1)vi−1,
3. F · vi = (i+ 1)vi+1.
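Assuming the formulas of exercise 6.14 (with highest weight α = N, as established below), one can check numerically that they define an honest representation of sl(2, C) on the span of v0, . . . , vN. A Python sketch; the helper name `sl2_module` is ours:

```python
import numpy as np

def sl2_module(N):
    """Matrices of E, F, H on span(v_0, ..., v_N), using the formulas of
    exercise 6.14 with highest weight alpha = N."""
    d = N + 1
    E = np.zeros((d, d)); F = np.zeros((d, d)); H = np.zeros((d, d))
    for i in range(d):
        H[i, i] = N - 2 * i          # H . v_i = (N - 2i) v_i
        if i >= 1:
            E[i - 1, i] = N - i + 1  # E . v_i = (N - i + 1) v_{i-1}
        if i + 1 < d:
            F[i + 1, i] = i + 1      # F . v_i = (i + 1) v_{i+1}
    return E, F, H

def bracket(X, Y):
    return X @ Y - Y @ X

# The defining relations of sl(2, C) hold for every N.
for N in range(6):
    E, F, H = sl2_module(N)
    assert np.allclose(bracket(H, E), 2 * E)
    assert np.allclose(bracket(H, F), -2 * F)
    assert np.allclose(bracket(E, F), H)
```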
Let V be as in exercise 6.14. Since H · vi = (α − 2i)vi, the vectors vi are all linearly independent
(they live in different weight spaces). But V is assumed to be finite dimensional, hence there exists
some N ≥ 0 such that vN ≠ 0 but vN+1 = 0. Consider equation (2) of exercise 6.14 with i = N + 1:
since vN+1 = 0 but vN ≠ 0, this implies that α − N = 0, i.e. α = N is a non-negative integer. If V
is moreover simple, then, since the span of v0, . . . , vN is a non-zero submodule containing v0, the vectors v0, . . . , vN are a basis of V and each weight
space VN−2i for i = 0, . . . , N is one-dimensional with basis vi. Moreover, dimV = N + 1. Thus,
we have shown:
Theorem 6.15. Let V be a simple, finite dimensional sl(2,C)-module, with highest weight vector of
weight α. Then,
1. α is a non-negative integer N .
2. dimV = N + 1 and the non-zero weight spaces of V are VN , VN−2, . . . , V−N , each of which is one-dimensional.
3. Conversely, for any non-negative integer N there exists a unique simple sl(2,C)-module V (N)
with highest weight vector of weight N .
Exercise 6.16. Recall from exercise 3.30 that the elements E,F and H can be written as particular
2 × 2 matrices. In representation theoretic terms, this means that sl(2,C) has a natural two-
dimensional representation, the "vectorial representation". Is this representation simple? If so, it
is isomorphic to V (N) for some N . What is N?
Exercise 6.17. We’ve also seen that sl(2,C) acts on itself by the adjoint action ad : sl(2,C) → gl(sl(2,C)). Is the adjoint representation simple? If so, what is N in this case? Explicitly write
the highest weight vector in terms of E,F and H.
7 The structure of Lie algebras
In this section we introduce the notions of semi-simple, solvable and nilpotent Lie algebras. The
main result of the section says that every Lie algebra is built up, in some precise way, from a
semi-simple and a solvable Lie algebra (Levi’s Theorem). Moreover, every semi-simple Lie algebra
is a direct sum of simple Lie algebras. In order to simplify the proofs we will fix k = C, the
complex numbers (what is really needed for all of the results of this section to hold is that k be
algebraically closed of characteristic zero).
7.1 A rough classification of Lie algebras
The centre z(g) of a Lie algebra g is the subspace {X ∈ g | [X, Y ] = 0 for all Y ∈ g}. Notice that an
abelian Lie algebra is precisely the same as a Lie algebra whose centre is the whole algebra. At
the other extreme, a Lie algebra is called simple if it is non-abelian and contains no proper ideals, i.e. the only ideals
in g are g and 0.
Exercise 7.1. Show that z(g) is an ideal in g. Hence, if g is simple, then z(g) = 0.
Example 7.2. The centre of gl(n,C) is the one-dimensional ideal consisting of multiples of the
identity matrix C · In. On the other hand, the Lie algebra sl(n,C) is a simple Lie algebra for n ≥ 2.
Let g be a Lie algebra. The lower central series of g is defined inductively by g_1 = [g, g]
and g_k = [g, g_{k−1}]. The derived series of g is similarly defined inductively by g^1 = [g, g] and
g^k = [g^{k−1}, g^{k−1}].
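For a matrix Lie algebra both series can be computed mechanically. A small Python sketch (our own illustration; the helpers `span_dim` and `next_term` are ours) for n(3,C), the strictly upper-triangular 3 × 3 matrices:

```python
import numpy as np
from itertools import product

def bracket(X, Y):
    return X @ Y - Y @ X

def span_dim(mats):
    """Dimension of the linear span of a list of matrices."""
    return np.linalg.matrix_rank(np.array([M.ravel() for M in mats]))

def next_term(A, B):
    """A spanning set for [span A, span B]."""
    return [bracket(X, Y) for X, Y in product(A, B)]

def Eij(i, j):
    M = np.zeros((3, 3)); M[i, j] = 1.0
    return M

# Basis of n(3, C): E12, E13, E23.
g = [Eij(0, 1), Eij(0, 2), Eij(1, 2)]

# Lower central series g_1 = [g, g], g_k = [g, g_{k-1}].
lower = [next_term(g, g)]
while span_dim(lower[-1]) > 0:
    lower.append(next_term(g, lower[-1]))
print([span_dim(t) for t in lower])    # -> [1, 0]: n(3, C) is nilpotent

# Derived series g^1 = [g, g], g^k = [g^{k-1}, g^{k-1}].
derived = [next_term(g, g)]
while span_dim(derived[-1]) > 0:
    derived.append(next_term(derived[-1], derived[-1]))
print([span_dim(t) for t in derived])  # -> [1, 0]: nilpotent implies solvable
```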
Definition 7.3. We say that g is
• nilpotent if there exists some n > 0 such that g_n = 0,
• solvable if there exists some n > 0 such that g^n = 0,
• semi-simple if g contains no non-zero solvable ideals.
Exercise 7.4. Show that every nilpotent Lie algebra is solvable. Give an example of a solvable Lie
algebra that is not nilpotent. Hint: Try dim g = 2.
Exercise 7.5. Show that each piece g_n of the lower central series is an ideal in g. Is the same true
of the pieces g^n of the derived series? Hint: what does the Jacobi identity tell you in this case?
Exercise 7.6. Let g be a Lie algebra and I, J solvable ideals of g.
1. If g/I is solvable, show that g is solvable.
2. Using the fact that (I + J)/I ≅ J/(I ∩ J), show that I + J is a solvable ideal of g.
Lemma 7.7. Let g be a finite dimensional Lie algebra. Then g contains a unique maximal solvable
ideal. It is denoted rad g, and called the solvable radical of g.
Proof. Let I = ∑_i Ii be the sum of all solvable ideals of g. Since g is finite dimensional, I is
certainly finite dimensional. Thus, there exist finitely many solvable ideals I1, . . . , Ik such that
I = I1 + · · · + Ik. Inductively applying exercise 7.6, I is a solvable ideal. It is clearly the unique maximal solvable ideal.
Notice that Lemma 7.7 implies that g is semi-simple if and only if its solvable radical rad g is zero.
7.2 Engel’s Theorem
Engel’s Theorem is crucial in describing nilpotent Lie algebras. First, we begin with:
Exercise 7.8. Let V be a finite dimensional vector space and X ∈ gl(V ) a nilpotent endomorphism,
X^n = 0 say. Show that ad(X)^{2n+1}(Y ) = 0 for all Y ∈ gl(V ).
Theorem 7.9. Let n be a subalgebra of gl(V ), for some finite dimensional vector space V . If n
consists of nilpotent endomorphisms and V ≠ 0 then there exists some non-zero vector v ∈ V such that
n(v) = 0.
Proof. The proof is by induction on dim n. If dim n = 0 or 1, the claim is clear.
First we show that there is an ideal in n of codimension one. Let l be any maximal proper
subalgebra of n. Since [l, l] ⊂ l, the algebra l acts on n/l via ad. Exercise 7.8 implies that ad(l) consists
of nilpotent endomorphisms in gl(n). Hence the image of l in gl(n/l) also consists of nilpotent
endomorphisms. By induction, this implies that there is some 0 ≠ Y ∈ n/l such that l · Y = 0, i.e.
a representative Y ∈ n, not in l, with [l, Y ] ⊂ l. Thus, l ⊕ CY is a subalgebra of n strictly containing l. By maximality of l, this implies that l ⊕ CY = n and l is an ideal of codimension one.
Since dim l < dim n, induction implies that there exists some 0 ≠ v ∈ V such that l · v = 0.
Let W be the subspace of all such vectors. Since n = l ⊕ CY , it suffices to show that there is
some 0 ≠ w ∈ W such that Y (w) = 0. Let w ∈ W and X ∈ l. Then,
XY (w) = Y X(w) + [X, Y ](w) = 0,
since X, [X, Y ] ∈ l implies that X(w) = [X, Y ](w) = 0. Hence Y (W ) ⊂ W . Since Y is a nilpotent
endomorphism of V and preserves W , it is a nilpotent endomorphism of W . Thus, there exists
some 0 ≠ w ∈ W such that Y (w) = 0.
What is Engel’s theorem really saying? Here are two important corollaries of his theorem.
Exercise 7.10. Let n be a Lie algebra such that n/z(n) is nilpotent. Show that n is nilpotent.
Corollary 7.11. Let n be a finite dimensional Lie algebra. If every element in n is ad-nilpotent,
then n is nilpotent.
Proof. We may consider l := ad(n) ⊂ gl(n). Our hypothesis says that l consists of nilpotent
endomorphisms. Therefore, by Theorem 7.9, there exists 0 ≠ m ∈ n such that [n,m] = 0, i.e.
z(n) ≠ 0. If n consists of ad-nilpotent elements, then clearly n/z(n) does too. Hence, by induction,
we may assume that n/z(n) is a nilpotent Lie algebra. Then the corollary follows from exercise 7.10.
Recall that n(n, k) denotes the Lie subalgebra of gl(n, k) consisting of all strictly upper-triangular matrices. In order to concretely understand the meaning of Engel’s Theorem we introduce the notion of flags in V . A flag V• in V is a collection
0 = V0 ⊊ V1 ⊊ V2 ⊊ · · · ⊊ Vk ⊂ V
of nested subspaces of V . We say that the flag V• is complete if dimVi/Vi−1 = 1 for all i, i.e. it is not
possible to insert any more subspaces into the flag. If one fixes a basis e1, . . . , en of V then the
standard complete flag is V•, where Vi = Span{e1, . . . , ei}. The following lemma is clear.
Lemma 7.12. Let V• be the standard complete flag in Cn.
1. The endomorphism X ∈ gl(n,C) belongs to n(n,C) if and only if X(Vi) ⊂ Vi−1 for all i.
2. The endomorphism X ∈ gl(n,C) belongs to b(n,C) if and only if X(Vi) ⊂ Vi for all i.
Notice that every complete flag in V is the standard complete flag of V with respect to some
basis of V .
Corollary 7.13. Let l be a subalgebra of gl(V ), for some finite dimensional vector space V . If
l consists of nilpotent endomorphisms then there exists a basis of V such that l ⊂ n(n, k); where
n = dimV .
Proof. By Lemma 7.12, it suffices to show that there exists a complete flag V• of V such that
X(Vi) ⊂ Vi−1 for all X ∈ l. Let v be as in Theorem 7.9 and set V1 = Cv ⊂ V . Then l acts on
V /V1, again by nilpotent endomorphisms. Hence, by induction, there is a complete flag 0 ⊂ W1 ⊂ · · · ⊂ Wn−1 = V /V1 in V /V1 such that l(Wi) ⊂ Wi−1 for all i. Let Vi be the preimage of Wi−1 in V , for i = 2, . . . , n. Then l(Vi) ⊂ Vi−1 as required.
You may think, based on Corollaries 7.11 and 7.13, that if l is a nilpotent subalgebra of gl(V ),
then one can always find a basis of V such that l ⊂ n(n,C). But this is not the case. Consider
for instance the subalgebra h of gl(n,C) consisting of all diagonal matrices. This is abelian and
hence nilpotent. It is not possible to change the basis of Cn such that this algebra sits in n(n,C).
The point here is that even though h is nilpotent, the elements of h are not nilpotent matrices.
7.3 Lie’s Theorem
The analogue of Engel’s Theorem for solvable Lie algebras is Lie’s Theorem. Unfortunately, its
proof is more involved than that of Engel’s Theorem.
Theorem 7.14. Let V be a finite dimensional k-vector space and s a solvable subalgebra of gl(V ).
If V 6= 0 then there exists a common eigenvector for all endomorphisms in s.
We should first say what it actually means for s to have a common eigenvector. This means
that there exists 0 ≠ v in V and scalars αX for every X ∈ s such that X(v) = αXv.
Proof. The structure of the proof of Lie’s Theorem is identical to the proof of Engel’s Theorem,
but the justification of each step is slightly different.
The proof is again by induction on dim s. If dim s = 1 then the claim is trivial. Therefore we
assume that dim s > 1. Since s is assumed to be solvable, [s, s] is a proper ideal of s, or is zero. In
either case, the Lie algebra s/[s, s] is abelian and hence every subspace of this Lie algebra is an
ideal. Take a subspace of codimension one and let n denote its pre-image in s. Then n is an ideal
in s of codimension one. By induction, there exists a joint eigenvector 0 6= v ∈ V for n i.e. there
is some linear function λ : n→ C such that X(v) = λ(X)v for all X ∈ n.
Consider the subspace W = {w ∈ V | X(w) = λ(X)w for all X ∈ n} of V . We’ve shown that it’s
non-zero. Lemma 7.15 below says that W is an s-submodule of V . If we choose some Y ∈ s not in n
(so that s = C · Y ⊕ n as a vector space), then Y (W ) ⊂ W . Hence there exists some 0 ≠ w ∈ W that is an eigenvector for Y ; this w is an eigenvector for all elements in s.
Lemma 7.15. Let n be an ideal in a Lie algebra s. Let V be a representation of s and λ : n→ k
a linear map. Set
W = {v ∈ V | X(v) = λ(X)v ∀ X ∈ n}.
Then W is an s-subrepresentation of V , i.e. Y (W ) ⊂ W for all Y ∈ s.
Proof. Let w ∈ W be non-zero. To show that Y (w) belongs to W , we need to show that
X(Y (w)) = λ(X)Y (w) for all X ∈ n. We have
X(Y (w)) = Y (X(w)) + [X, Y ](w) (15)
= λ(X)Y (w) + λ([X, Y ])w (16)
where we have used the fact that [X, Y ] ∈ n because n is an ideal. Therefore we need to show
that λ([X, Y ]) = 0 i.e. λ([s, n]) = 0.
The proof of this fact is a very clever trick using the trace of an endomorphism. Let U be the
subspace of V spanned by w, Y (w), Y^2(w), . . . . Clearly, Y (U ) ⊂ U . We claim that U is also an
n-submodule of V , i.e. X(U ) ⊂ U for all X ∈ n. Certainly X(w) = λ(X)w ∈ U and equation (16)
implies that X(Y (w)) ∈ U . So we assume by induction that X(Y^k(w)) ∈ U for all k < m. Then,
X(Y^m(w)) = Y (X(Y^{m−1}(w))) + [X, Y ](Y^{m−1}(w)).
Since X, [X, Y ] ∈ n, both X(Y^{m−1}(w)) and [X, Y ](Y^{m−1}(w)) belong to U . Thus, X(Y^m(w)) also belongs
to U . In fact the above argument shows inductively that X(Y^m(w)) = λ(X)Y^m(w) + terms
involving only Y^k(w) for k < m. Thus, there is a basis of U such that X is upper-triangular
with λ(X) on the diagonal. In particular, Tr(X|U) = λ(X) dimU . This applies to [X, Y ] too,
Tr([X, Y ]|U) = λ([X, Y ]) dimU . But the trace of the commutator of two endomorphisms is zero.
Hence λ([X, Y ]) = 0.[1]
Notice that the proof of Lemma 7.15 shows that if V is a one-dimensional s-module, then
[s, s] · V = 0.
Again, just as in corollary 7.13, Lie’s Theorem, together with Lemma 7.12, immediately implies:
Corollary 7.16. Let V be a finite dimensional k-vector space and s a solvable subalgebra of gl(V ).
Then, there exists a basis of V such that s is a subalgebra of b(n, k) i.e. we can simultaneously
put every element of s in upper triangular form.
The proof of corollary 7.16 is essentially the same as the proof of corollary 7.13.
Corollary 7.17. Let s ⊂ gl(V ) be a solvable Lie algebra. Then [s, s] consists of nilpotent endo-
morphisms in gl(V ). In particular, [s, s] is a nilpotent Lie algebra.
[1] Notice that we are using here the fact that the characteristic of our field is zero. The lemma is false in positive characteristic (as is Lie’s Theorem).
Proof. By Corollary 7.16, we may assume that s ⊂ b(n,C). Then, [s, s] ⊂ [b(n,C), b(n,C)] =
n(n,C). Hence [s, s] consists of nilpotent endomorphisms of V . The second statement then follows
from Corollary 7.11 of Engel’s Theorem.
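The identity [b(n,C), b(n,C)] = n(n,C) used in the proof can be verified directly for small n. A sketch for n = 3 (our own numerical illustration; the basis helper `Eij` is ours):

```python
import numpy as np
from itertools import product

def Eij(i, j, n=3):
    M = np.zeros((n, n)); M[i, j] = 1.0
    return M

# Basis of b(3, C): the upper-triangular matrices E_ij with i <= j.
b = [Eij(i, j) for i in range(3) for j in range(3) if i <= j]

brackets = [X @ Y - Y @ X for X, Y in product(b, b)]

# Every bracket is strictly upper triangular (its lower triangle,
# diagonal included, vanishes) ...
assert all(np.allclose(np.tril(B), 0) for B in brackets)

# ... and together the brackets span all of n(3, C), which has dimension 3.
rank = np.linalg.matrix_rank(np.array([B.ravel() for B in brackets]))
assert rank == 3
```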
7.4 The Killing form
Let g be a finite dimensional Lie algebra. Recall that the adjoint representation defines a homo-
morphism ad : g → gl(g). We can use the adjoint representation to define a particular bilinear
form on g.
Definition 7.18. The Killing form on g is the bilinear form κ : g× g→ C defined by κ(X, Y ) =
Tr(ad(X) ad(Y )).
Since Tr(AB) = Tr(BA) for any two square matrices, the Killing form is symmetric i.e.
κ(X, Y ) = κ(Y,X).
Exercise 7.19. Let g = sl(2,C) with the usual basis E,F,H.
1. Calculate explicitly the adjoint representation of g in terms of the basis E,F,H.
2. Using part (1), calculate the Killing form on g.
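For comparison, the Killing form of sl(2,C) can be computed numerically. The following Python sketch (the helpers `coords` and `ad` are ours) carries out both parts of exercise 7.19 at once:

```python
import numpy as np

# sl(2, C) with its usual basis E, F, H (as 2x2 matrices).
E = np.array([[0., 1.], [0., 0.]])
F = np.array([[0., 0.], [1., 0.]])
H = np.array([[1., 0.], [0., -1.]])
basis = [E, F, H]

def coords(M):
    """Coordinates of a traceless 2x2 matrix M = aE + bF + cH."""
    return np.array([M[0, 1], M[1, 0], M[0, 0]])

def ad(X):
    """Matrix of ad(X) in the basis (E, F, H)."""
    return np.column_stack([coords(X @ B - B @ X) for B in basis])

# kappa(X, Y) = Tr(ad(X) ad(Y))
kappa = np.array([[np.trace(ad(X) @ ad(Y)) for Y in basis] for X in basis])
print(kappa)
# [[0. 4. 0.]
#  [4. 0. 0.]
#  [0. 0. 8.]]
```

So in the ordered basis (E, F, H) the only non-zero pairings are κ(E, F ) = κ(F, E) = 4 and κ(H, H) = 8; note that κ is non-degenerate, as Corollary 7.24 below predicts.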
Exercise 7.20. A bilinear form β on g is said to be associative if β([X, Y ], Z) = β(X, [Y, Z]) for
all X, Y, Z ∈ g. Show that the Killing form is associative.
The following key (but difficult) result shows that the Killing form can be used to tell if a
given Lie algebra is solvable or not.
Theorem 7.21 (Cartan’s criterion). Let V be a finite dimensional vector space and g ⊂ gl(V ) a
Lie algebra. If Tr(XY ) = 0 for all X, Y ∈ g then g is solvable.
The proof of Cartan’s criterion is tricky, so we’ll skip it.
Exercise 7.22. Show that every non-zero solvable ideal of a finite dimensional Lie algebra contains
a non-zero abelian ideal of g. Hint: if l^n = 0 but l^{n−1} ≠ 0, consider l^{n−1}.
Exercise 7.23. Show that the Lie algebra s is solvable if and only if its image ad(s) in gl(s) is solvable.
Corollary 7.24. Let g be a finite dimensional Lie algebra. Then g is semi-simple if and only if
its Killing form κ is non-degenerate.
Proof. Let S be the radical of the Killing form κ, i.e. S = {X ∈ g | κ(X, Y ) = 0 for all Y ∈ g}.
First we assume that the solvable radical rad g of g is zero. Since κ(S, S) = 0, Cartan’s criterion
(applied to ad(S) ⊂ gl(g)) together with exercise 7.23 implies that S is a solvable Lie algebra. But S is also an ideal in g.
Hence it is contained in the solvable radical of g. Thus, it is zero.
Conversely, assume that S = 0. Since every solvable ideal of g contains a non-zero abelian
ideal, it suffices to show that every abelian ideal l of g is contained in S. Let l be one such ideal.
Let x ∈ l and y ∈ g. Then (ad x) (ad y) maps g into l, and hence, since l is abelian, (ad x ad y)^2 maps g to
zero. Hence ad x ad y is a nilpotent endomorphism. This implies that
κ(x, y) = Tr(adx ad y) = 0,
and hence l ⊂ S as required.
Theorem 7.25. Let g be a finite dimensional, semi-simple Lie algebra. Then, there is a unique decomposition (up to reordering the summands)
g = g1 ⊕ · · · ⊕ gk
of g into a direct sum of simple Lie algebras.
Proof. Let l be any non-zero ideal in g and let l⊥ = {X ∈ g | κ(X, Y ) = 0 ∀ Y ∈ l}. Then l⊥ is also
an ideal - check! The Killing form restricted to l ∩ l⊥ is clearly zero. Therefore Cartan’s criterion
says that it is a solvable Lie algebra. But since it is also the intersection of two ideals, it is an
ideal in g. Thus, since g is semi-simple, l ∩ l⊥ = 0. Hence l ⊕ l⊥ ⊂ g. But dim l + dim l⊥ = dim g, since κ is non-degenerate by Corollary 7.24.
Hence l ⊕ l⊥ = g. Since any solvable ideal of l (or l⊥) would also be a solvable ideal in g, both l
and l⊥ are semi-simple.
Applying the same argument to l and l⊥ gives a finer decomposition of g into a direct sum of
semi-simple algebras. Since g is finite dimensional, this process can’t continue indefinitely, so we
get some decomposition of g = g1 ⊕ · · · ⊕ gk into a direct sum of simple Lie algebras.
To show uniqueness, let h be some simple ideal in g. We must show that h = gi for some i.
The space [g, h] is an ideal in g; it is non-zero because the centre of g is trivial. Therefore, since
h was assumed to be simple, [g, h] = h. But,
[g, h] = [g1, h] ⊕ · · · ⊕ [gk, h] = h.
Thus, there must exist a unique i such that [gi, h] = h. But [gi, h] is an ideal of the simple Lie
algebra gi, and [gi, h] ≠ 0, so [gi, h] = gi. Hence h = gi, as required.
In the proof of Theorem 7.25, we have shown that if g is semi-simple then g = [g, g].
7.5 Levi’s Theorem and Ado’s Theorem
In this final section we state, without proof, two further important structure theorems about Lie algebras.
Theorem 7.26 (Levi). Let g be a finite dimensional Lie algebra and r its radical. Then there
exists a semi-simple subalgebra l of g such that g = r⊕ l as an l-module.
Most of the examples of Lie algebras we will encounter in the course are subalgebras of gl(V )
for some finite dimensional vector space V . Ado’s Theorem says that this is not a coincidence.
Theorem 7.27 (Ado). Every finite dimensional Lie algebra admits a faithful, finite dimensional representation.
That is, given any g, we can always find some finite dimensional vector space V such that g is
a subalgebra of gl(V ).
8 Complete reducibility
In this section, we state and prove Weyl’s complete reducibility theorem for semi-simple Lie algebras.
Definition 8.1. A g-module V is said to be completely reducible if there is a decomposition
V = V1 ⊕ · · · ⊕ Vk of V into a direct sum of simple g-modules.
Weyl’s complete reducibility theorem says:
Theorem 8.2. Let g be a simple Lie algebra. Then, every finite dimensional representation of g
is completely reducible.
Remark 8.3. The decomposition of a completely reducible g-module need not be unique. For
instance, consider the extreme example where g = 0 so that a g-module is just a vector space. Any
such module is always completely reducible: this is just the statement that any finite dimensional
vector space can be expressed as the direct sum of one-dimensional subspaces. But there are many
ways to decompose a vector space into a direct sum of one-dimensional subspaces.
Weyl’s Theorem is a truly remarkable, and to me surprising, result. It is also very useful.
When g is not semi-simple, the statement of the theorem is simply not true. Also, even when
g is semi-simple, the assumption that V is finite dimensional is crucial. It is false for infinite
dimensional representations. It is very useful to rephrase the notion of complete reducibility in
terms of complements.
Lemma 8.4. A finite dimensional g-module V is completely reducible if and only if every submod-
ule W ⊂ V admits a complement i.e. there exists a submodule W ′ ⊂ V such that V = W ⊕W ′.
Proof. Assume first that every submodule of V admits a complement. In order to use induction on
dimV , let us show that submodules inherit this property i.e. if W ⊂ V is a submodule and U ⊂ W
another submodule then U admits a complement in W . We can certainly find a complement U ′′
to U in V . We claim that U ′ := U ′′∩W is a complement to U in W . Being the intersection of two
submodules, it is a submodule, and U ′ ∩ U = 0. So it suffices to show that U ′ + U = W . Let
w ∈ W . Then w = u+u′′, where u ∈ U and u′′ ∈ U ′′. But u′′ = w−u ∈ W since u,w ∈ W . Hence
u′′ ∈ U ′ and U + U ′ = W . Therefore, it follows by induction on dimV , that if every submodule
of V admits a complement in V then V is completely reducible.
Now assume that V = V1 ⊕ · · · ⊕ Vk for some simple submodules of V . Let W be an arbitrary
submodule. Notice that if V ′ is a simple submodule of V then either V ′ ⊕W is a submodule of
V or V ′ ⊂ W : if V ′ is not contained in W then V ′ ∩ W is a proper submodule of V ′, hence V ′ ∩ W = 0 and
V ′ ⊕ W ⊂ V . Now, if W ≠ V , since V = V1 ⊕ · · · ⊕ Vk, there must exist i1 such that Vi1 is not contained in W . Hence
Vi1 ⊕ W ⊂ V . If Vi1 ⊕ W ⊊ V then again there is some i2 such that Vi2 ∩ (Vi1 ⊕ W ) = 0, i.e.
Vi2 ⊕ Vi1 ⊕W ⊂ V . Continuing in this way, we eventually get
V = Vir ⊕ · · · ⊕ Vi1 ⊕W
for some r ≤ k i.e. Vir ⊕ · · · ⊕ Vi1 is a complement to W in V .
We give a couple of counter-examples when g is not semi-simple.
• If g = CX with the trivial bracket, then we can consider the two-dimensional g-module
V = Ce1 ⊕ Ce2, where X · e2 = e1 and X · e1 = 0. Then V1 = Ce1 is a g-submodule of
V . Any complementary subspace to V1 in V is of the form V2 = C(e2 + βe1) for some β ∈ C. But then
X · (e2 + βe1) = e1 ∉ V2 means that none of these subspaces is a g-submodule. So no decomposition
V = V1 ⊕ V2 into g-submodules exists.
• If we take g = n(3,C) and V = Ce1 ⊕ Ce2 ⊕ Ce3 the standard representation on column vectors,
then X · e1 = 0 for all X ∈ n(3,C). So V1 = Ce1 is a g-submodule. But there can be no
other g-submodule U such that V = V1 ⊕ U as a g-module: assume that such a U exists and
take a non-zero vector u = αe1 + βe2 + γe3 in U . Then either β ≠ 0 or γ ≠ 0. Writing Eij
for the matrix with a 1 in position (i, j) and zeros elsewhere, we have
E12 · u = βe1, E13 · u = γe1,
which shows in either case that U ∩ V1 ≠ 0 - a contradiction.
In order to prove Weyl’s Theorem, we will need the notion of the Casimir element in End(V ).
Let g be a semi-simple Lie algebra. Let β be some symmetric, non-degenerate, associative bilinear
form on g. Then, if we fix a basis X1, . . . , Xn of g, there exists a unique "dual basis" Y1, . . . , Yn of g
such that β(Xi, Yj) = δi,j. A g-module V is said to be faithful if the action morphism ρ : g → gl(V ) is injective.
Lemma 8.5. Let g be a simple Lie algebra and V a g-module. Then either V is faithful or X ·v = 0
for all X ∈ g and v ∈ V .
Proof. Let ρ : g → gl(V ) be the action morphism. Then Ker ρ is an ideal in g. Since g is simple
it has no proper ideals. Therefore, either Ker ρ = 0 i.e. V is faithful, or Ker ρ = g i.e. X · v = 0
for all X ∈ g and v ∈ V .
Let V be a faithful g-module. Since each ρ(X), for X ∈ g, is an endomorphism of V , we can
define βV (X, Y ) = Tr(ρ(X)ρ(Y )). The fact that V is faithful implies, by Cartan’s Criterion, that
βV is non-degenerate. The Casimir of V is defined to be
ΩV = ρ(X1)ρ(Y1) + · · · + ρ(Xn)ρ(Yn) ∈ End(V ),
where the Xi and Yi are dual bases with respect to the form βV . The whole point of defining the Casimir is the following lemma.
Lemma 8.6. Let V be a faithful g-module. Then the Casimir ΩV is an endomorphism of V
commuting with the action of g i.e. [ΩV , ρ(X)] = 0 in gl(V ) for all X ∈ g. Moreover, the trace
Tr(ΩV ) of ΩV , as an endomorphism of V , equals dim g.
Proof. First let X ∈ g and write [X, Xi] = ∑_{j=1}^{n} ai,j Xj for some ai,j ∈ C. Similarly, [X, Yi] = ∑_{j=1}^{n} bi,j Yj. Then,
ai,j = ∑_{k=1}^{n} ai,k βV (Xk, Yj) = βV ([X, Xi], Yj) = −βV (Xi, [X, Yj]) = −∑_{k=1}^{n} bj,k βV (Xi, Yk) = −bj,i,
where we have used the associativity of βV .
Now, using the fact that [XY, Z] = [X,Z]Y + X[Y, Z] in End(V ), we have
[ΩV , ρ(X)] = ∑_{i=1}^{n} ( [ρ(Xi), ρ(X)]ρ(Yi) + ρ(Xi)[ρ(Yi), ρ(X)] )
= ∑_{i=1}^{n} ( ρ([Xi, X])ρ(Yi) + ρ(Xi)ρ([Yi, X]) )
= −∑_{i,j=1}^{n} ( ai,j ρ(Xj)ρ(Yi) + bi,j ρ(Xi)ρ(Yj) ) = 0,
since bj,i = −ai,j. We have
Tr(ΩV ) = ∑_{i=1}^{n} Tr(ρ(Xi)ρ(Yi)) = ∑_{i=1}^{n} βV (Xi, Yi) = dim g,
since X1, . . . , Xn is a basis of g and βV (Xi, Yi) = 1.
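As a concrete illustration (our own, not in the notes), take V = C² the vectorial representation of g = sl(2,C), so that ρ is the identity and βV (X, Y ) = Tr(XY ). Then the dual basis to (E, F, H) is (F, E, H/2), and both assertions of Lemma 8.6 can be checked directly:

```python
import numpy as np

# The vectorial representation of sl(2, C): rho(X) = X as a 2x2 matrix.
E = np.array([[0., 1.], [0., 0.]])
F = np.array([[0., 0.], [1., 0.]])
H = np.array([[1., 0.], [0., -1.]])

# beta_V(X, Y) = Tr(XY): beta(E, F) = 1, beta(H, H) = 2, all other
# pairings among E, F, H vanish, so the dual basis to (E, F, H) is (F, E, H/2).
basis = [E, F, H]
dual = [F, E, H / 2]
assert np.allclose([[np.trace(X @ Y) for Y in dual] for X in basis], np.eye(3))

# Casimir Omega_V = sum_i rho(X_i) rho(Y_i); here it is (3/2) * identity.
Omega = sum(X @ Y for X, Y in zip(basis, dual))

assert np.allclose(np.trace(Omega), 3)  # Tr(Omega_V) = dim sl(2, C) = 3
for X in basis:                         # Omega_V commutes with the g-action
    assert np.allclose(Omega @ X - X @ Omega, 0)
```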
Exercise 8.7. Let V be a finite dimensional complex vector space and X ∈ EndC(V ). Prove that
V = ⊕_{α∈C} Vα, where Vα = {v ∈ V | (X − α)^N · v = 0 for N ≫ 0}.
Lemma 8.4 implies that, in order to prove Weyl’s Theorem, it suffices to show that if V is a
g-module and W a submodule, then there exists a complementary g-submodule W ′ to W in V .
That is, V = W ⊕W ′ as g-modules.
Exercise 8.8. Let g be a semi-simple Lie algebra and V = C · e a one-dimensional g-module. Show
that X · e = 0 for all X ∈ g. Hint: use the fact that g = [g, g].
We begin by proving Weyl’s Theorem in a special case. The general case reduces easily to this one.
Lemma 8.9. Let V be a g-module and W a submodule of codimension one. Then there exists a
one-dimensional complementary g-submodule W ′ to W in V .
Proof. The proof is by induction on dimV . The case dimV = 1 is vacuous.
By Lemma 8.5, either V is faithful or X · v = 0 for any X ∈ g and v ∈ V . In the latter
case, one can take W ′ to be any complementary subspace to W in V (in this case every subspace
of V is a submodule). Therefore we assume that V is a faithful g-module. By Lemma 8.6, the
Casimir ΩV is a g-module endomorphism of V . Therefore V will decompose into a direct sum of
generalized eigenspaces, V = ⊕_{α∈C} Vα, where Vα = {v ∈ V | (ΩV − α)^N (v) = 0, N ≫ 0} (exercise 8.7). Since W is a g-submodule and ΩV is expressed
in terms of the ρ(X), it maps W into itself. Therefore W = ⊕_{α∈C} Wα, where
Wα = {w ∈ W | (ΩV − α)^N (w) = 0, N ≫ 0} = Vα ∩W.
Claim 8.10. We have Vα = Wα for all α ≠ 0 and dimV0/W0 = 1.
Proof. Since each Vα is a g-module and Wα a submodule, the quotient Vα/Wα is a g-module. We have
V /W ≅ ⊕_{α∈C} Vα/Wα. Since dimV/W = 1, there is exactly one α for which Vα ≠ Wα. For this α,
dimVα/Wα = 1. Now exercise 8.8 says that X · [v] = 0 for all X ∈ g and [v] ∈ Vα/Wα. Since ΩV is
expressed in terms of the ρ(X), ΩV acts as zero on the quotient. On the other hand, ΩV acts with
generalized eigenvalue α on Vα/Wα. Hence α = 0. This completes the proof of the claim.
To complete the proof of the lemma, note that W0 ⊂ V0 is a submodule of codimension one, so by
induction it suffices to show that dimV0 < dimV : a one-dimensional complement to W0 in V0 is then
also a complement to W in V , since Vα = Wα for all α ≠ 0. But Lemma 8.6 says that Tr(ΩV ) = dim g ≠ 0.
Hence there exists at least one α ≠ 0 such that Vα ≠ 0, and so dimV0 < dimV .
Finally, we are in a position to prove Weyl’s Theorem in complete generality. Thus, let V be a
g-module and W a proper submodule. Recall from exercise 6.12 that HomC(V,W ) is a g-module,
(X · f)(v) = X · f(v)− f(X · v), ∀ X ∈ g, v ∈ V, f ∈ HomC(V,W ). (17)
Set
U = {f ∈ HomC(V,W ) | f |W = λ IdW for some λ ∈ C}
and U ′ = {f ∈ HomC(V,W ) | f |W = 0}. Then it is an easy exercise, left to the reader, that
U ′ and U are g-submodules of HomC(V,W ). There is a natural map U → HomC(W,W ) given by
f 7→ f |W . Clearly the image is the one-dimensional subspace C · IdW of HomC(W,W ); this map is actually
a homomorphism of g-modules. Moreover, the kernel is precisely U ′. Thus, U ′ has codimension
one in U . Lemma 8.9 says that there is a complementary one-dimensional submodule U ′′ to U ′ in
U . Choose 0 ≠ φ ∈ U ′′. Since φ does not lie in U ′, after rescaling we may assume that φ|W = IdW .
Claim 8.11. The map φ is a homomorphism of g-modules.
Proof. Since U ′′ is one-dimensional, X ·φ = 0 for all X ∈ g. Equation (17) implies that this means
that φ is a homomorphism of g-modules.
The fact that φ|W = IdW means that φ is a surjective map. The kernel Kerφ is the comple-
mentary g-submodule to W in V .
Exercise 8.12. Let g = Cx ⊕ Cy with [x, y] = y be the unique non-abelian solvable 2-dimensional
Lie algebra. Show that the adjoint representation of g is not completely reducible.
8.1 Generalizing to semi-simple Lie algebras
It is possible to generalize Weyl’s Theorem to semi-simple Lie algebras.
Exercise 8.13. Let V be an (m ⊕ l)-module and W an l-module. Show that
(X · f)(w) = (X, 0) · f(w), ∀ X ∈ m, f ∈ Homl(W,V ), w ∈ W,
makes Homl(W,V ) into an m-module.
Theorem 8.14. Let g be a semi-simple Lie algebra. Then, every finite dimensional representation
of g is completely reducible.
Proof. Recall from Theorem 7.25 that g = g1 ⊕ · · · ⊕ gk, where each gi is simple. We prove the
claim by induction on k. The case k = 1 is Theorem 8.2.
Write g = g1 ⊕ m, where m = g2 ⊕ · · · ⊕ gk. We may assume that every finite dimensional
m-module is completely reducible. Since g1 is simple, Theorem 8.2 gives V = V1^{n1} ⊕ · · · ⊕ Vr^{nr} as a g1-module, where the Vi are simple,
pairwise non-isomorphic g1-modules. We define a map Homg1(Vi, V ) ⊗ Vi → V by φ ⊗ v 7→ φ(v).
By exercise 8.13, Homg1(Vi, V ) is a m-module. Hence Homg1(Vi, V )⊗ Vi is a g-module. Then the
map Homg1(Vi, V ) ⊗ Vi → V is a homomorphism of g-modules. This extends to an isomorphism
(Homg1(V1, V )⊗ V1)⊕ · · · ⊕ (Homg1(Vr, V )⊗ Vr)∼−→ V.
Since each Homg1(Vi, V ) is a completely reducible m-module, proposition 6.9 lets us decompose each Homg1(Vi, V ) ⊗ Vi into a direct sum of simple g-modules.
Exercise 8.15. In the proof of Theorem 8.14, show that the map Homg1(Vi, V ) ⊗ Vi → V is a
homomorphism of g-modules. Show that (Homg1(V1, V )⊗ V1)⊕ · · · ⊕ (Homg1(Vr, V )⊗ Vr)→ V is an isomorphism.
9 Cartan subalgebras and Dynkin diagrams
In this section we begin the classification of simple, complex Lie algebras. Throughout, g will
be a simple Lie algebra over C.
9.1 Cartan subalgebras
Let V be a finite dimensional vector space. Recall that an element A ∈ EndC(V ) is called semi-
simple, resp. nilpotent, if A can be diagonalized, resp. A^n = 0 for some n > 0. In this context,
Jordan’s decomposition theorem can be stated as
Proposition 9.1. Let A ∈ EndC(V ). Then there is a unique decomposition A = As + An, where
As is semi-simple and An is nilpotent such that [As, An] = 0.
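The decomposition is easy to see in a small example. The following sketch (a hypothetical 2×2 matrix, not taken from the notes) checks the three defining properties of Proposition 9.1:

```python
# Jordan decomposition of A = [[2, 1], [0, 2]]: semi-simple part A_s = 2*Id
# (already diagonal) and nilpotent part A_n = A - A_s.

def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

A  = [[2, 1], [0, 2]]
As = [[2, 0], [0, 2]]          # semi-simple part
An = [[0, 1], [0, 0]]          # nilpotent part

# A = As + An
assert all(A[i][j] == As[i][j] + An[i][j] for i in range(2) for j in range(2))
# [As, An] = As*An - An*As = 0
comm = [[matmul(As, An)[i][j] - matmul(An, As)[i][j] for j in range(2)] for i in range(2)]
assert comm == [[0, 0], [0, 0]]
# An is nilpotent: An^2 = 0
assert matmul(An, An) == [[0, 0], [0, 0]]
```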
Now let g be a finite dimensional Lie algebra. We say that X ∈ g is semi-simple, resp. nilpotent,
if ad(X) ∈ End(g) is semi-simple, resp. is nilpotent.
Definition 9.2. A Lie subalgebra h of g is called a Cartan subalgebra if every element of h is
semi-simple and it is a maximal subalgebra with these properties i.e. if h′ is another subalgebra
of g consisting of semi-simple elements and h ⊂ h′ then h = h′.
Since 0 is a subalgebra of g consisting of semi-simple elements and g is finite dimensional,
it is contained in at least one Cartan subalgebra i.e. Cartan subalgebras exist.
Lemma 9.3. Let h be a subalgebra of g consisting of semi-simple elements. Then h is abelian.
Proof. Let X ∈ h. We need to show that adh(X) = 0. Since adg(X) is semi-simple and adg(h) ⊂ h,
adh(X) is also semi-simple. So it suffices to show that adh(X) has no non-zero eigenvalues i.e. if
Y ∈ h \ {0} such that ad(X)(Y ) = aY , then a = 0. Assume that a ≠ 0.
Since Y ∈ h too, we may diagonalize ad(Y ). That is, there exists a basis X1, . . . , Xn of h and
αi ∈ C such that ad(Y )(Xi) = αiXi. We may assume that X1 = Y and hence α1 = 0. Therefore,
there exist unique ui ∈ C such that X = u1Y + u2X2 + · · ·unXn. But then,
ad(Y )(X) = −aY = α2u2X2 + · · ·+ αnunXn.
Since Y,X2, . . . , Xn is a basis of h, this implies that a = 0; a contradiction.
In particular, Lemma 9.3 implies that every Cartan subalgebra of g is abelian. For any subal-
gebra m of g, we denote by Ng(m) the normalizer {X ∈ g | [X,m] ⊂ m} of m in g.
Exercise 9.4. Show that the normalizer Ng(m) of a subalgebra m is itself a subalgebra of g.
Moreover, show that Ng(m) = g if and only if m is an ideal in g.
A key property of Cartan subalgebras is that they equal their normalizers i.e. if X ∈ g such
that [X, h] ⊂ h, then X ∈ h.
Proposition 9.5. The normalizer of a Cartan subalgebra h is h itself.
Since each element h in h is semi-simple, g will decompose into a direct sum of eigenspaces
⊕α gα, where [h,X] = αX for all X ∈ gα. In fact, since h is abelian, we can simultaneously
decompose g into eigenspaces for all h ∈ h. This means that
g = g0 ⊕ (⊕_{α ≠ 0} gα),
where gα = {X ∈ g | [h,X] = α(h)X, ∀h ∈ h}. Since g is finite dimensional, there are only
finitely many α ∈ h∗ \ {0} such that gα ≠ 0. This set R ⊂ h∗ is called the set of roots of g. Since
h is abelian, h ⊂ g0; this is actually an equality g0 = h by Proposition 9.5.
Corollary 9.6. Let h be a Cartan subalgebra of g and α, β ∈ R ∪ {0}. The Killing form defines
a non-degenerate pairing between gα and g−α, and κ(gα, gβ) = 0 if α + β ≠ 0. In particular, κ|h
is non-degenerate.
Proof. Let X ∈ gα and Y ∈ gβ. Then,
α(h)κ(X, Y ) = κ([h,X], Y ) = −κ([X, h], Y ) = −κ(X, [h, Y ]) = −β(h)κ(X, Y ).
Thus, (α(h) + β(h))κ(X, Y ) = 0 for all h ∈ h. If α + β ≠ 0, then Ker(α + β) is a proper subset
of h. Hence there exists some h ∈ h such that α(h) + β(h) ≠ 0. Thus, if β ≠ −α, we have
κ(gα, gβ) = 0. However, we know that the Killing form is non-degenerate on g. Therefore, it must
define a non-degenerate pairing between gα and g−α.
Since κ is non-degenerate on h, there is a canonical identification η : h ∼→ h∗ given by η(t) =
κ(t,−). Under this identification, we denote by tα the element in h corresponding to α ∈ h∗ i.e.
tα := η^{−1}(α).
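The root space decomposition can be made completely explicit in a small example. The following sketch (hypothetical, not from the notes) records it for g = sl(3,C) with h the traceless diagonal matrices: the matrix units eij (i ≠ j) satisfy [h, eij] = (hi − hj)eij, so the roots are the functionals h ↦ hi − hj:

```python
# Root space decomposition of sl(3, C): each root is recorded by the pair (i, j)
# labelling the functional h -> h_i - h_j, with root space C e_ij.

n = 3
roots = [(i, j) for i in range(n) for j in range(n) if i != j]

assert len(roots) == n * (n - 1)            # 6 roots for sl(3)
dim_h = n - 1                               # traceless diagonal matrices
# each root space is one-dimensional, and dim g = dim h + #roots
assert dim_h + len(roots) == n * n - 1      # dim sl(3) = 8
# alpha in R implies -alpha in R (swap i and j)
assert all((j, i) in roots for (i, j) in roots)
```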
9.2 α-strings through β
Next we’ll try to understand a bit more this decomposition of g with respect to a fixed Cartan
subalgebra h. The key point is that the Killing form κ is non-degenerate on both g and h.
Lemma 9.7. Let R be the roots of g with respect to h.
1. If α ∈ R then −α ∈ R.
2. The set of roots in R span h∗.
Proof. Part (1) is a direct consequence of Corollary 9.6. If R does not span h∗ then there exists
some h ∈ h such that α(h) = 0 for all α ∈ R (this follows from the fact that if U ⊆ h∗ is a subspace
then dimU + dimU⊥ = dim h, where U⊥ = h ∈ h | u(h) = 0, ∀u ∈ U). But this means that
[h, gα] = 0 for all α. Since [h, h] = 0 too, this implies that [h, g] = 0 i.e. h belongs to the centre
z(g) of g. But g is semi-simple so z(g) = 0.
The following proposition tells us that every semi-simple g is ”built up” from a collection of
copies of sl(2,C) that interact in some complex way.
Proposition 9.8. Let R be the roots of g with respect to h. For each α ∈ R:
1. There exist elements Eα in gα and Fα in g−α such that Eα, Fα and Hα := [Eα, Fα] span a
copy sα of sl(2,C) in g.
2. Hα = 2tα/κ(tα, tα), where tα was defined above.
Proof. We begin by showing that
[X, Y ] = κ(X, Y )tα, ∀ X ∈ gα, Y ∈ g−α. (18)
Since [X, Y ]− κ(X, Y )tα belongs to h and κ|h is non-degenerate, it suffices to show that
κ([X, Y ]− κ(X, Y )tα, h) = 0 for all h ∈ h. But
κ([X, Y ], h) = κ(X, [Y, h]) = α(h)κ(X, Y ) = κ(tα, h)κ(X, Y ).
Hence,
κ([X, Y ]− κ(X, Y )tα, h) = κ(tα, h)κ(X, Y )− κ(tα, h)κ(X, Y ) = 0,
as required.
Claim 9.9. κ(tα, tα) = α(tα) is non-zero.
Proof. Assume otherwise, then [tα, X] = [tα, Y ] = 0 for all X ∈ gα and Y ∈ g−α. By Corollary
9.6 we can choose X, Y such that κ(X, Y ) = 1. Then s = C{X, Y, tα} is a solvable
subalgebra of g. By Corollary 7.17, this implies that C adg(tα) = [adg(s), adg(s)] consists of
nilpotent endomorphisms. That is, adg(tα) is nilpotent. Since tα ∈ h, adg(tα) is also semi-simple
i.e. tα = 0. This is a contradiction. Hence α(tα) 6= 0. This completes the proof of the claim.
Now choose any 0 ≠ Eα ∈ gα and find Fα ∈ g−α such that κ(Eα, Fα) = 2/κ(tα, tα). Set
Hα = 2tα/κ(tα, tα). Then the fact that [X, Y ] = κ(X, Y )tα implies that Eα, Fα, Hα is an sl2-triple.
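A concrete instance (hypothetical, not from the notes) of such a triple inside sl(3,C): Eα = e12, Fα = e21, Hα = diag(1,−1, 0). The sketch below checks the sl2-triple relations:

```python
# Verify [E, F] = H, [H, E] = 2E, [H, F] = -2F for an sl2-triple in sl(3, C).

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(A, B):
    P, Q = matmul(A, B), matmul(B, A)
    return [[P[i][j] - Q[i][j] for j in range(len(A))] for i in range(len(A))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

E = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]      # e_12
F = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]      # e_21
H = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]     # diag(1, -1, 0)

assert bracket(E, F) == H                  # [E, F] = H
assert bracket(H, E) == scale(2, E)        # [H, E] = 2E
assert bracket(H, F) == scale(-2, F)       # [H, F] = -2F
```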
Corollary 9.10. For each α ∈ R, we have dim gα = 1 and ±α are the only multiples of α in R.
Proof. We will consider the decomposition of g as a sα-module. We already know what the simple
sα-modules look like; they are the V (n) described in section 6.6.
Let M be the space spanned by all gcα, where c ∈ C, and by h. Then M is an sα-submodule of g.
The classification of simple sl(2,C)-modules, together with Weyl’s complete reducibility theorem
implies that the weights of Hα on M are all integers; if cα ∈ R then cα(Hα) = 2c ∈ Z. Now
h = Kerα⊕CHα. This implies that Kerα is an sα-submodule on which sα acts trivially. Thus, we
can decompose M = sα ⊕Kerα⊕M ′ for some sα-submodule M ′. Notice that M0 = h, so M ′0 = 0,
i.e. all the Hα-weights in M ′ are odd. This implies already that dim gα = 1. Also, we see that 2α
cannot be a root in R, i.e. for any root in R, twice that root is never a root. Now if (1/2)α were a
root, then 2 · (1/2)α = α could not possibly be a root. So actually (1/2)α isn't a root either. One
can deduce from this that ±α are the only multiples of α in R.
Finally, we turn to asking how sα acts on those weight spaces gβ for β 6= ±α. By studying this
action, we will deduce:
Lemma 9.11. Let α, β ∈ R, β 6= ±α. Then, β(Hα) ∈ Z and β − β(Hα)α ∈ R.
Proof. The space gβ will be part of some simple sα-module V (n), say. Since Eα · gβ ⊂ gβ+α and
Fα · gβ ⊂ gβ−α, there will be some r, s ≥ 0 such that V (n) = gβ+rα ⊕ · · · ⊕ gβ−sα. Notice that all the
weights β + rα, β + (r − 1)α, . . . , β − sα are roots in R. Moreover, the fact that dim gβ = 1 implies
that if β + tα ∈ R then r ≥ t ≥ −s (otherwise consider the sα-module generated by gβ+tα). We call
β + rα, β + (r − 1)α, . . . , β − sα the α-string through β. In this case n = (β + rα)(Hα) = β(Hα) + 2r.
Now, as a subspace of V (n), gβ = V (n)k, where k = β(Hα). Thus, β(Hα) ∈ Z. Moreover, V (n)−k
will also be a non-zero weight space gγ for some γ ∈ R. What’s this γ? Well, it must be in the
α-string through β so γ = β + qα for some q and γ(Hα) = −β(Hα). Since α(Hα) = 2, we see that
γ = β − β(Hα)α.
Recall that we have used the Killing form to identify h∗ with h, α ↔ tα. Therefore we can
define a non-degenerate symmetric bilinear form (−,−) on h∗ by (α, β) = κ(tα, tβ). Notice that
(α, α) = κ(tα, tα). Then, Lemma 9.11 says
2(β, α)/(α, α) ∈ Z, β − (2(β, α)/(α, α))α ∈ R, ∀ α, β ∈ R.
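These two integrality conditions are easy to check numerically. The sketch below (a hypothetical realisation, not from the notes) does so for the root system of type B2, realised as R = {±ε1, ±ε2, ±ε1±ε2} in R² with the standard inner product:

```python
# For every pair of non-proportional roots, check that 2(beta, alpha)/(alpha, alpha)
# is an integer and that beta - <beta, alpha> alpha is again a root.

from fractions import Fraction

R = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]

def ip(x, y):
    return x[0] * y[0] + x[1] * y[1]

def pairing(beta, alpha):              # <beta, alpha> = 2(beta, alpha)/(alpha, alpha)
    return Fraction(2 * ip(beta, alpha), ip(alpha, alpha))

for alpha in R:
    for beta in R:
        if beta in (alpha, tuple(-a for a in alpha)):
            continue
        k = pairing(beta, alpha)
        assert k.denominator == 1                      # <beta, alpha> is an integer
        refl = (beta[0] - k * alpha[0], beta[1] - k * alpha[1])
        assert (int(refl[0]), int(refl[1])) in R       # beta - <beta, alpha> alpha in R
```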
9.3 Root systems
We can axiomatize some of the properties of the set R. This leads to the notion of a root system.
Let E be some real finite dimensional vector space (in our examples E will be a real subspace
of our Cartan h such that R ⊂ E, dimRE = dimC h and h = C ⊗R E), equipped with a positive
definite symmetric bilinear form (−,−). Such a vector space is called a Euclidean space. A linear
transformation M : E → E is said to be orthogonal if (M(v),M(w)) = (v, w) for all v, w ∈ E.
Exercise 9.12. By fixing an orthonormal basis x1, . . . , xn of E, show that one can identify the
group of orthogonal transformations of E with
{M ∈ gl(n,R) | M^TM = Id}.
Deduce that det(M) = ±1.
A reflection of E is an orthogonal transformation s such that the subspace Fix(s) = {x ∈ E | s(x) = x} has codimension one in E. Similarly, a rotation of E is an orthogonal transformation
r such that the subspace Fix(r) = {x ∈ E | r(x) = x} has codimension two in E. For each non-zero
α ∈ E, we define Hα = {x ∈ E | (x, α) = 0} to be the hyperplane orthogonal to α.
Exercise 9.13. Let s be a reflection and α ∈ E such that Fix(s) = Hα. Show that
1. s(α) = −α,
2. For all x ∈ E,
s(x) = x− (2(x, α)/(α, α))α,
3. Deduce that s2 = Id.
Hint: First show E = Rα ⊕Hα and s(α) ∈ Rα.
The above exercise shows that s is uniquely defined by α. Therefore, we can associate to each
α ∈ E \ {0} a reflection sα.
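The formula from Exercise 9.13 is short enough to test directly. A sketch (with hypothetical data, in R³ with the dot product):

```python
# s_alpha(x) = x - (2(x, alpha)/(alpha, alpha)) alpha: check it negates alpha,
# fixes H_alpha, squares to the identity, and preserves the inner product.

from fractions import Fraction

def ip(x, y):
    return sum(a * b for a, b in zip(x, y))

def reflect(alpha, x):
    c = Fraction(2 * ip(x, alpha), ip(alpha, alpha))
    return tuple(xi - c * ai for xi, ai in zip(x, alpha))

alpha = (1, -1, 0)
assert reflect(alpha, alpha) == (-1, 1, 0)          # s_alpha(alpha) = -alpha
assert reflect(alpha, (1, 1, 5)) == (1, 1, 5)       # fixes H_alpha pointwise
x, y = (3, 7, -2), (0, 2, 1)
assert reflect(alpha, reflect(alpha, x)) == x       # s_alpha^2 = Id
assert ip(reflect(alpha, x), reflect(alpha, y)) == ip(x, y)   # orthogonality
```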
Exercise 9.14. Let M be an orthogonal transformation of E, with dimE = 2. Fixing an orthonor-
mal basis of E, show that there is a unique θ ∈ [0, 2π) such that
M = (cos θ, −sin θ; sin θ, cos θ), or M = (cos θ, sin θ; sin θ, −cos θ).
Deduce that every orthogonal transformation of E is either a rotation or reflection. How can one
easily distinguish the two cases?
Definition 9.15. A subset R of E is called a root system if
1. R is finite, spans E and doesn’t contain zero.
2. If α ∈ R then the only multiples of α in R are ±α.
3. If α ∈ R then the reflection sα maps R to itself.
4. If α, β ∈ R then
〈β, α〉 := 2(β, α)/(α, α)
belongs to Z.
Of course, this definition is chosen precisely so that Lemma 9.7, Corollary 9.10 and Lemma
9.11 imply that:
Proposition 9.16. Let g be a semi-simple Lie algebra and h a Cartan subalgebra. The decom-
position of g into h-weight spaces defines a root system R, in the abstract sense of Definition 9.15.
Exercise 9.17. Let α 6= ±β ∈ R, for some root system R. Show that sαsβ is a rotation of E. Hint:
decompose E = R{α, β} ⊕ (Hα ∩Hβ) and consider sαsβ acting on R{α, β}. Use exercise 9.14.
9.4 The angle between roots
Recall that if α, β ∈ E are non-zero vectors, then the angle θ between α and β can be calculated
using the cosine formula ||α|| · ||β|| cos θ = (α, β). Thus,
〈β, α〉 = 2(β, α)/(α, α) = 2(||β||/||α||) cos θ,
and hence 〈β, α〉〈α, β〉 = 4 cos^2 θ. If α, β ∈ R, then the fourth axiom implies that 4 cos^2 θ is a
non-negative integer. Since 0 ≤ cos^2 θ ≤ 1, we have 0 ≤ 〈β, α〉〈α, β〉 ≤ 4. Hence the only possible
values of 〈β, α〉 and 〈α, β〉 are:
〈β, α〉  〈α, β〉  θ
0       0       π/2
1       1       π/3
−1      −1      2π/3
1       2       ?
−1      −2      3π/4
1       3       π/6
−1      −3      ??
1       4       0
−1      −4      π
What about 〈β, α〉 = 〈α, β〉 = ±2?
Exercise 9.18. What are the angles θ in (?) and (??)?
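The identity 〈β, α〉〈α, β〉 = 4 cos²θ can be checked numerically on any concrete pair. A sketch (hypothetical example) using the pair α = ε1 − ε2, β = ε2 from the root system of type B2, which realises the table row (−1, −2, 3π/4):

```python
# Check <beta, alpha><alpha, beta> = 4 cos^2(theta) for a B2 pair of roots.

import math

def ip(x, y):
    return sum(a * b for a, b in zip(x, y))

def pairing(beta, alpha):
    return 2 * ip(beta, alpha) / ip(alpha, alpha)

alpha, beta = (1, -1), (0, 1)
p, q = pairing(beta, alpha), pairing(alpha, beta)
assert (p, q) == (-1.0, -2.0)

cos_theta = ip(alpha, beta) / math.sqrt(ip(alpha, alpha) * ip(beta, beta))
theta = math.acos(cos_theta)
assert abs(p * q - 4 * cos_theta ** 2) < 1e-12    # the identity above
assert abs(theta - 3 * math.pi / 4) < 1e-12       # this pair realises theta = 3*pi/4
```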
From this, it is possible to describe the rank two root systems.
Exercise 9.19. Write out all the roots in R(B2) as linear combinations of the roots α and β.
9.5 The Weyl group
The subgroup of GL(E) generated by all reflections {sα | α ∈ R} is called the Weyl group of R.
Lemma 9.20. The Weyl group W of R is finite.
Proof. By axiom (3) of a root system every sα maps the finite set R into itself. Therefore, there
is a group homomorphism W → SN , where N = |R|. This map is injective: if w ∈ W such that
its action on R is trivial then, in particular, w fixes a basis of E. Hence w acts trivially on E i.e.
w = 1.
Exercise 9.21. Let E = {x = ∑_{i=1}^{n+1} xiεi ∈ R^{n+1} | ∑_{i=1}^{n+1} xi = 0}, where ε1, . . . , εn+1 is the
standard basis of R^{n+1} with (εi, εj) = δi,j. Let R = {εi − εj | 1 ≤ i ≠ j ≤ n+ 1}.
1. Show that R is a root system.
2. Construct a set of simple roots for R.
3. By considering the action of the reflections sεi−εj on the basis ε1, . . . , εn+1 of Rn+1, show
that the Weyl group of R is isomorphic to Sn+1.
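Part (3) of the exercise can be verified computationally for small n. The sketch below (hypothetical, for n = 3) generates the group generated by the reflections s_{εi−εj}, which act by swapping the i-th and j-th coordinates, and checks that it is the full symmetric group S4:

```python
# Close the set of transpositions of {0, 1, 2, 3} under composition and check the
# resulting group has order 4! = 24, i.e. is all of S_4.

from itertools import permutations

dim = 4   # n + 1 for n = 3

def swap(i, j):
    p = list(range(dim))
    p[i], p[j] = p[j], p[i]
    return tuple(p)

gens = [swap(i, j) for i in range(dim) for j in range(i + 1, dim)]

def compose(p, q):
    return tuple(p[q[k]] for k in range(dim))

group = set(gens) | {tuple(range(dim))}
changed = True
while changed:
    changed = False
    for g in list(group):
        for h in gens:
            gh = compose(g, h)
            if gh not in group:
                group.add(gh)
                changed = True

assert len(group) == 24                        # |W(A_3)| = 4!
assert group == set(permutations(range(dim)))  # W(A_3) is the symmetric group S_4
```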
9.6 Simple roots
If R is a root system in E then by the first axiom of root systems, there is some subset of R that
forms a basis of E.
Definition 9.22. Let R be a root system. A set of simple roots for R is a subset ∆ ⊂ R such that
1. ∆ is a basis of E.
2. Each β ∈ R can be written as β = ∑_{α∈∆} mαα, where either all mα are non-negative integers, or all are
non-positive integers.
The problem with the above definition is that it’s not at all clear that a given root system
contains a set of simple roots. However, one can show:
Theorem 9.23. Let R be a root system.
1. There exists a set ∆ of simple roots in R.
2. The group W is generated by sα | α ∈ ∆.
3. For any two sets of simple roots ∆,∆′, there exists a w ∈ W such that w(∆) = ∆′.
In fact the Weyl group W acts simply transitively on the collection of all sets of simple roots.
The construction of all ∆’s is quite easy, the difficulty is in showing that the sets constructed are
indeed sets of simple roots and that the properties of Theorem 9.23 are satisfied. Recall that Hα
denotes the hyperplane in E perpendicular to α. The open subset E \ ⋃_{α∈R} Hα is a union of
connected components. The components are called the Weyl chambers of R. If P is one of these
chambers then choose some γ ∈ P . Notice that (γ, α) ≠ 0 for all α ∈ R. We define R(γ)+ to
be all α such that (γ, α) > 0 and R(γ)− similarly, so that R = R(γ)+ ⊔ R(γ)−. The sets R(γ)±
are independent of the choice of γ ∈ P . Then we say that α ∈ R(γ)+ is decomposable if there exist
β1, β2 ∈ R(γ)+ such that α = β1 + β2. Otherwise α is said to be indecomposable. The set of all
indecomposable roots in R(γ)+ is denoted ∆(γ). Then, ∆(γ) is a set of simple roots in R and
the construction we have described is in fact a bijection between the Weyl chambers of R and the
collection of sets of simple roots. In particular, the Weyl group acts simply transitively on the set
of Weyl chambers.
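The construction of ∆(γ) is easy to carry out by hand in type A2. A sketch (hypothetical example) with R = {εi − εj | i ≠ j} in R³ and the generic vector γ = (3, 2, 1):

```python
# Compute R(gamma)+ and the indecomposable positive roots Delta(gamma) for A2.

def root(i, j, n=3):           # the vector e_i - e_j in R^3
    v = [0] * n
    v[i], v[j] = 1, -1
    return tuple(v)

R = [root(i, j) for i in range(3) for j in range(3) if i != j]
gamma = (3, 2, 1)              # generic: (gamma, alpha) != 0 for all alpha in R

def ip(x, y):
    return sum(a * b for a, b in zip(x, y))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

R_plus = [a for a in R if ip(gamma, a) > 0]
# alpha is decomposable if alpha = beta1 + beta2 with beta1, beta2 in R(gamma)+
simple = [a for a in R_plus
          if not any(add(b1, b2) == a for b1 in R_plus for b2 in R_plus)]

assert sorted(R_plus) == sorted([root(0, 1), root(0, 2), root(1, 2)])
assert sorted(simple) == sorted([root(0, 1), root(1, 2)])  # {e1 - e2, e2 - e3}
```

Here ε1 − ε3 = (ε1 − ε2) + (ε2 − ε3) is the unique decomposable positive root, so ∆(γ) = {ε1 − ε2, ε2 − ε3} as expected.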
Example 9.24. The root system of type G2 has 12 Weyl chambers. This implies that the Weyl
group has order 12. On the other hand, the picture in figure 9.24 shows that the group generated
by the reflections sα and sβ is the group of symmetries of either hexagon. This group (which must be
contained in the Weyl group) has order 12 too, thus W (G2) equals the group of symmetries of the
hexagon. It is called the dihedral group of order 12.
Exercise 9.25. How many Weyl chambers are there for the root system of type B2? For each
chamber describe the corresponding set of simple roots.
9.7 Cartan matrices and Dynkin diagrams
We have already extracted from our simple Lie algebra g a root system, which is some abstract set
of vectors in a real vector space. From this root system we will extract something even simpler - a
finite graph. Remarkably, this graph, called the Dynkin diagram of g, contains all the information
needed to recover the Lie algebra g. It is this fact that allows us to completely classify the simple,
complex Lie algebras.
We proceed as follows. Given a root system R, fix a set of simple roots ∆ and order them:
∆ = {α1, . . . , α`}. Then define the Cartan matrix of ∆ to be the `× ` matrix whose (i, j)th entry
is 〈αi, αj〉.
Example 9.26. The rank two root systems have Cartan matrices (for a suitable ordering of the simple roots):
A1 × A1 : (2, 0; 0, 2), A2 : (2, −1; −1, 2), B2 : (2, −2; −1, 2), G2 : (2, −3; −1, 2).
The following proposition implies that a root system R is uniquely defined up to isomorphism
by its Cartan matrix.
Proposition 9.27. Let R ⊂ E and R′ ⊂ E ′ be two root systems, and ∆ ⊂ R, ∆′ ⊂ R′ sets of
simple roots. Assume that there exists a bijection Φ : ∆→ ∆′ such that 〈αi, αj〉 = 〈Φ(αi),Φ(αj)〉for all i, j. Then there is an isomorphism φ : E
∼→ E ′ such that φ defines a bijection R → R′,
whose restriction to ∆ equals Φ. This induces an isomorphism of Weyl groups W ' W ′.
Proof. Since ∆ and ∆′ define a basis of E and E ′ respectively, Φ extends uniquely to an isomor-
phism φ : E → E ′. For each α ∈ ∆, φ sα φ−1 = sφ(α) since sα is uniquely defined by what it
does on ∆. By Theorem 9.23, W is generated by the reflections in ∆. Therefore φ Wφ−1 = W ′.
Theorem 9.23 also says that for each α ∈ R, there is some w ∈ W such that w(α) ∈ ∆. Therefore
φ(α) = φ(w)−1(φ(w(α))) ∈ R′ i.e. φ(R) ⊂ R. Similarly, φ−1(R′) ⊂ R. Thus, φ(R) = R′.
Now given a Cartan matrix, define the Dynkin diagram of R to be the graph with ` vertices
labeled by the ` simple roots α1, . . . , α` and with 〈αi, αj〉〈αj, αi〉 edges between vertex αi and αj.
Finally to encode whether a simple root is long or short, we decorate the double and triple edges
with an arrow pointing towards the shorter roots.
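Both the Cartan matrix entries and the edge counts are direct computations from an explicit realisation of the simple roots. A sketch (hypothetical realisation) for type B2, with α1 = ε1 − ε2 (long) and α2 = ε2 (short):

```python
# Compute the Cartan matrix <a_i, a_j> = 2(a_i, a_j)/(a_j, a_j) and the Dynkin
# edge count <a_i, a_j><a_j, a_i> for B2.

from fractions import Fraction

def ip(x, y):
    return sum(a * b for a, b in zip(x, y))

def pairing(b, a):             # <b, a> = 2(b, a)/(a, a)
    return Fraction(2 * ip(b, a), ip(a, a))

a1, a2 = (1, -1), (0, 1)
cartan = [[pairing(ai, aj) for aj in (a1, a2)] for ai in (a1, a2)]

assert cartan == [[2, -2], [-1, 2]]   # the B2 Cartan matrix for this ordering
edges = cartan[0][1] * cartan[1][0]
assert edges == 2                     # double edge in the B2 Dynkin diagram
assert ip(a2, a2) < ip(a1, a1)        # the arrow points towards the shorter root a2
```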
Example 9.28. The rank two root systems have Dynkin diagrams:
A1 × A1 : two vertices joined by no edge; A2 : a single edge; B2 : a double edge; G2 : a triple edge;
in the B2 and G2 cases the edge carries an arrow pointing towards the shorter root.
Exercise 9.29. Calculate the Cartan matrix associated to the Dynkin diagram
Exercise 9.30. What is the Dynkin diagram associated to the root system of exercise 9.21?
10 The classification of simple, complex Lie algebras
Based solely on the definition of root system and Cartan matrix, it is possible to completely classify
the possible Dynkin diagrams that can arise. The proof of the classification theorem involves only
elementary combinatorial arguments, but it is rather long and tedious.
Theorem 10.1. Let D be a connected Dynkin diagram. Then D belongs to the list: An (n ≥ 1),
Bn (n ≥ 2), Cn (n ≥ 3), Dn (n ≥ 4), E6, E7, E8, F4, G2.
The Dynkin diagrams of type An, Bn, Cn and Dn correspond to the classical complex Lie
algebras sl(n+ 1,C), so(2n+ 1,C), sp(2n,C) and so(2n,C) respectively.
Of course one has to then show that the Dynkin diagrams of Theorem 10.1 do come from some
root system. This is done by explicitly constructing the root system in each case (for the classical
types one can do this just by choosing a Cartan subalgebra and explicitly decomposing the Lie
algebra with respect to the Cartan subalgebra).
10.1 Constructing Lie algebras from root systems
Let R be a root system. Serre described how to construct a semi-simple Lie algebra g from R such
that R is the root system of g. This is done by giving g in terms of generators and relations.
Let ∆ ⊂ R be a choice of simple roots, ∆ = {α1, . . . , α`}. Then g will be generated by elements
H1, . . . , H`, E1, . . . , E` and F1, . . . , F`. Now we need to give all necessary relations amongst the
generators H1, . . . , F`, in addition to the anti-symmetry and Jacobi relations that all Lie algebras
satisfy.
Theorem 10.2 (Serre). Let g be the complex Lie algebra generated by H1, . . . , H`, E1, . . . , E` and
F1, . . . , F` and satisfying the relations
1. [Hi, Hj] = 0, for all 1 ≤ i, j ≤ `;
2. [Hi, Ej] = 〈αi, αj〉Ej, and [Hi, Fj] = −〈αi, αj〉Fj, for all 1 ≤ i, j ≤ `;
3. [Ei, Fi] = Hi and [Ei, Fj] = 0 for all i 6= j;
4. ad(Ei)^{−〈αi,αj〉+1}(Ej) = 0 for all i ≠ j;
5. ad(Fi)^{−〈αi,αj〉+1}(Fj) = 0 for all i ≠ j.
Then g is a semi-simple, finite dimensional Lie algebra with root system R.
If g is as in Theorem 10.2 then H1, . . . , H` are a basis for a Cartan subalgebra h of g. Since
g = h⊕ ⊕_{α∈R} gα and we have shown that dim gα = 1, it follows that dim g = `+ |R|. The number
` is called the rank of g.
Exercise 10.3. Let Hi = ei,i − ei+1,i+1, Ei = ei,i+1 and Fi = ei+1,i in gl(n+ 1,C). If R is the root
system of exercise 9.21, show that these satisfy the relations of Theorem 10.2. Show that they also
generate the Lie algebra sl(n+ 1,C). Thus, R is the root system of sl(n+ 1,C).
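The relations of Theorem 10.2 can be checked by explicit matrix computation for small n. A sketch (hypothetical check, for the case of sl(3,C), where 〈αi, αi〉 = 2 and 〈αi, αj〉 = −1 for i ≠ j):

```python
# Verify a sample of the Serre relations for H_i = e_ii - e_(i+1)(i+1),
# E_i = e_i(i+1), F_i = e_(i+1)i inside gl(3, C).

def unit(i, j, n=3):
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def br(A, B):
    P, Q = matmul(A, B), matmul(B, A)
    return [[P[i][j] - Q[i][j] for j in range(len(A))] for i in range(len(A))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

Z = [[0] * 3 for _ in range(3)]
E = [unit(i, i + 1) for i in range(2)]
F = [unit(i + 1, i) for i in range(2)]
H = [br(E[i], F[i]) for i in range(2)]       # H_i = [E_i, F_i] (relation 3, i = j)

assert br(H[0], H[1]) == Z                   # relation 1: [H_1, H_2] = 0
assert br(H[0], E[0]) == scale(2, E[0])      # relation 2: [H_1, E_1] = 2 E_1
assert br(H[0], E[1]) == scale(-1, E[1])     # relation 2: [H_1, E_2] = -E_2
assert br(E[0], F[1]) == Z                   # relation 3: [E_1, F_2] = 0
assert br(E[0], br(E[0], E[1])) == Z         # relation 4: ad(E_1)^2(E_2) = 0
```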
Exercise 10.4. Write out the explicit relations for sl(n+ 1,C) by calculating the integers 〈αi, αj〉.
Exercise 10.5. We define the 2n× 2n matrix
Jn = (0, In; −In, 0),
where In ∈ GL(n,C) is the identity matrix. Concretely, the symplectic Lie algebra sp(2n) is
defined to be the set of all A ∈ gl(2n,C) such that AT · Jn + Jn · A = 0. Let h ⊂ sp(2n) denote
the set of diagonal matrices.
(a) Check that dim h = n.
(b) Writing A = (a, b; c, d), where a, b, c, d ∈ gl(n,C), show that A ∈ sp(2n) if and only if
bT = b, cT = c and aT = −d. What is the dimension of sp(2n)?
(c) Using part (b), decompose sp(2n) into a direct sum of weight spaces for h. Deduce that h is
a Cartan subalgebra of sp(2n).
(d) (Harder) Using (c), write down the root system of sp(2n).
(e) Using (d), compute the Dynkin diagram for sp(2n).
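Part (b) of the exercise can be sanity-checked numerically. A sketch (hypothetical data, for n = 2): any block matrix with b, c symmetric and d = −a^T should satisfy A^T·Jn + Jn·A = 0.

```python
# Build A = (a, b; c, d) with b, c symmetric, d = -a^T, and check A^T J + J A = 0.

n = 2

def zeros(r, c):
    return [[0] * c for _ in range(r)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

I = [[1, 0], [0, 1]]
negI = [[-1, 0], [0, -1]]
J = block(zeros(n, n), I, negI, zeros(n, n))

a = [[1, 2], [3, 4]]
b = [[5, 6], [6, 7]]                               # symmetric
c = [[8, 9], [9, 1]]                               # symmetric
d = [[-x for x in row] for row in transpose(a)]    # d = -a^T
A = block(a, b, c, d)

lhs, rhs = matmul(transpose(A), J), matmul(J, A)
S = [[lhs[i][j] + rhs[i][j] for j in range(2 * n)] for i in range(2 * n)]
assert S == zeros(2 * n, 2 * n)                    # A lies in sp(4)
# dim sp(2n) = n^2 + 2 * n(n+1)/2 = 2n^2 + n
assert n * n + n * (n + 1) == 2 * n * n + n
```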
10.2 The classification
Just as in Lie’s classification theorem, Theorem 4.13, it is known that g is always the Lie algebra
of some complex Lie group G (though in general there are several Lie groups whose Lie algebra
is g). We’ve seen that every simple g gives rise to a Dynkin diagram. Conversely, given a Dynkin
diagram, Theorem 10.2 says that we can construct a simple g with that Dynkin diagram. We
would like to show that this gives a bijection between the isomorphism classes of simple Lie
algebras and Dynkin diagrams. In extracting the Dynkin diagram from g we made two choices.
Firstly, we choose a Cartan subalgebra h of g. We need to show that the corresponding root
system is essentially independent of this choice of Cartan subalgebra.
Theorem 10.6. Any two Cartan subalgebras of g are conjugate by an element of G.
Theorem 10.6 says that if I’m given two Cartan subalgebras h and h′ of g then there exists
some g ∈ G such that h′ = Ad(g)(h). Thus, Ad(g) is an automorphism of g sending the h-weight
decomposition of g into the h′-weight decomposition. This means that the isomorphism Ad(g) :
h → h′ (which is an isometry since the Killing form satisfies κ(Ad(g)(X),Ad(g)(Y )) = κ(X, Y ))
maps the root system R(h) bijectively onto R(h′). The second choice we made was a choice of
simple roots ∆ in R. We’ve seen in Theorem 9.23 that for any two ∆,∆′ ⊂ R there is an element
w in the Weyl group W such that w(∆) = ∆′. Thus, the worst that can happen is that the Dynkin
diagram of ∆ differs by an automorphism from the Dynkin diagram of ∆′. Hence, up to Dynkin
automorphisms, the Dynkin diagram is uniquely defined by g.
Summarizing, the classification result says:
Theorem 10.7. A simple Lie algebra g is uniquely defined, up to isomorphism, by its Dynkin
diagram. Moreover, for each connected Dynkin diagram, there exists a simple Lie algebra with
that Dynkin diagram.
Hence, up to isomorphism, there are four infinite series (type An, Bn, Cn and Dn) of simple Lie
algebras - these are the Lie algebras of classical type, and five exceptional Lie algebras (type G2,
F4, E6, E7 and E8).
11 Weyl’s character formula
We’ve seen in the previous two sections that the key to classifying semi-simple complex Lie algebras
is to choose a Cartan subalgebra h of g and decompose g, via the adjoint representation, as an h-
module. We can apply the same idea to g-modules i.e. given a g-module V , we can consider it
as an h-module and ask for its decomposition. First, we describe the classification of simple, finite
dimensional g-modules.
11.1 Highest weight modules
Fix a set of simple roots ∆ so that R = R+ ⊔ R−. Let n+ = ⊕_{α∈R+} gα. Since [gα, gβ] ⊂ gα+β,
the subspace n+ is a nilpotent Lie subalgebra. The subalgebra n− is defined similarly, so that
g = n+ ⊕ h ⊕ n−. Since we have fixed a Cartan subalgebra h of g, every finite dimensional
g-module admits a weight space decomposition
V = ⊕_{λ∈h∗} Vλ,
where Vλ = {v ∈ V | (H − λ(H))^N · v = 0, ∀ N ≫ 0, H ∈ h}. In fact, since g is semi-
simple, Vλ = {v ∈ V | H · v = λ(H)v, ∀ H ∈ h}. To prove this, we first note that Weyl’s
complete reducibility Theorem implies that it is enough to check this for V simple. In that case,
consider the subspace ⊕_{λ∈h∗} V ′λ, where V ′λ = {v ∈ V | H · v = λ(H)v, ∀ H ∈ h} ⊂ Vλ. Since
[H,X] = α(H)X for all X ∈ gα and H ∈ h, the subspace ⊕_{λ∈h∗} V ′λ is a g-submodule. Therefore,
the fact that V is simple implies that this subspace is the whole of V .
Definition 11.1. Let V be a finite-dimensional g-module. A weight vector v ∈ Vλ is said to be
highest weight if n+ · v = 0.
Lemma 11.2. Let V be a finite dimensional g-module. Then V contains a highest weight vector.
Proof. Let b+ = h ⊕ ⊕_{α∈R+} gα = h ⊕ n+. Then, since [b+, b+] ⊂ n+ and n+ is nilpotent, b+ is a
solvable subalgebra of g. Lie’s Theorem implies that V contains a common eigenvector v for b+
i.e. there exists λ : b+ → C, a linear functional, such that X · v = λ(X)v for all X ∈ b+. It suffices
to show that λ(X) = 0 for all X ∈ n+. Let X ∈ gα ⊂ n+. Then α ≠ 0 implies that there exists
some H ∈ h such that α(H) ≠ 0. This implies that (1/α(H))[H,X] = X. But then
λ(X)v = X · v = (1/α(H))[H,X] · v = (1/α(H))(λ(H)λ(X)v − λ(X)λ(H)v) = 0.
Hence λ(X) = 0, as required i.e. v is a highest weight vector.
If ∆ = {β1, . . . , β`}, then by Proposition 9.8, there exist sl2-triples Ei, Fi, Hi in g such that
Ei spans gβi . Since the elements in ∆ are a basis of h∗, the set H1, . . . , H` is a basis of
h. The set of all λ ∈ h∗ such that λ(Hi) ∈ Z, for all i, is denoted P . It is a Z-lattice in h∗, and
is called the weight lattice of g. Let P+ = {λ ∈ P | λ(Hi) ∈ Z≥0 ∀ i}; elements of P+ are called
dominant integral weights. Notice that
P = {λ ∈ h∗ | 〈λ, βi〉 ∈ Z, ∀ i}, P+ = {λ ∈ h∗ | 〈λ, βi〉 ∈ Z≥0, ∀ i}.
As an example, consider the simple Lie algebra g(B3) of type B3. If E is the three dimensional
real vector space with basis ε1, ε2, ε3 then β1 = ε1 − ε2, β2 = ε2 − ε3, β3 = ε3 is a set of simple
roots and R = {±εi, ±(εi ± εj) | 1 ≤ i ≠ j ≤ 3} is the set of all roots.
Exercise 11.3. Describe the set of positive roots with respect to β1, β2, β3. What is the dimension
of g(B3)?
A weight λ = λ1ε1 + λ2ε2 + λ3ε3 will be dominant integral if and only if 〈λ, β1〉 = λ1 − λ2, 〈λ, β2〉 =
λ2 − λ3, and 〈λ, β3〉 = 2λ3 are all non-negative integers i.e. λ1 − λ2, λ2 − λ3, 2λ3 ∈ Z≥0. The Weyl group
W of type B3 is generated by the three reflections
sβ1 = (0, 1, 0; 1, 0, 0; 0, 0, 1), sβ2 = (1, 0, 0; 0, 0, 1; 0, 1, 0), sβ3 = (1, 0, 0; 0, 1, 0; 0, 0, −1).
It has 3! · 2^3 = 48 elements. The 3! comes from the fact that sβ1 and sβ2 generate a subgroup
isomorphic to S3.
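The order of W(B3) can be confirmed by brute force. A sketch (hypothetical computation) closing the group generated by the three reflections, realised as signed permutation matrices on R³ (sβ1 swaps ε1, ε2; sβ2 swaps ε2, ε3; sβ3 negates ε3):

```python
# Generate the group <s_beta1, s_beta2, s_beta3> and count its elements.

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

s1 = ((0, 1, 0), (1, 0, 0), (0, 0, 1))    # reflection in beta1 = e1 - e2
s2 = ((1, 0, 0), (0, 0, 1), (0, 1, 0))    # reflection in beta2 = e2 - e3
s3 = ((1, 0, 0), (0, 1, 0), (0, 0, -1))   # reflection in beta3 = e3

gens = [s1, s2, s3]
group = set(gens)
changed = True
while changed:
    changed = False
    for g in list(group):
        for h in gens:
            gh = matmul(g, h)
            if gh not in group:
                group.add(gh)
                changed = True

assert len(group) == 48    # |W(B3)| = 3! * 2^3 = 48
```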
In the example of R = {εi − εj | 1 ≤ i ≠ j ≤ n+ 1}, the root system of type An, an element
λ1(ε1 − ε2) + · · ·+ λn(εn − εn+1) belongs to P+ if and only if 2λi − λi−1 − λi+1 ∈ Z≥0 for all i.
Lemma 11.4. Let V be a finite dimensional g-module.
1. If µ ∈ h∗ such that Vµ 6= 0, then µ ∈ P .
2. If v ∈ Vλ is a highest weight vector, then λ ∈ P+.
Exercise 11.5. Prove Lemma 11.4. Hint: first check that the lemma holds for sl2-modules. Next,
note that it suffices to check for each i = 1, . . . , ` that Vµ 6= 0 implies µ(Hi) ∈ Z and for (2) that
λ(Hi) ≥ 0. Deduce this from the sl2 case.
A g-module M is called highest weight if it is generated by some highest weight vector m ∈Mλ.
11.2 Verma modules
Next we will define certain ”universal” highest weight modules. These are called Verma modules.
Let Y1, . . . , YN be a weight basis of n−. Since n− is nilpotent, we may order the Yi so that
[Yi, Yj ] ∈ C{Yk | k > i} for all i ≤ j. Choose some λ ∈ h∗ and let Cvλ be the one-dimensional
b+-module such that n+ · vλ = 0 and H · vλ = λ(H)vλ for all H ∈ h. Let ∆(λ) be the (infinite
dimensional!) vector space with basis given by Yi1Yi2 · · ·Yikvλ, where (i1 ≤ i2 ≤ · · · ≤ ik) is a
k-tuple of elements from {1, . . . , N}. We say that the length of the basis element Yi1Yi2 · · ·Yikvλ is k.
Lemma 11.6. The space ∆(λ) has a natural g-module structure.
Proof. The proof is by induction on the length of the basis element. The only element of length
zero is vλ. We define n+ · vλ = 0, H · vλ = λ(H)vλ and Yi · vλ = Yivλ. Now we assume that
we’ve defined the action of g on all basis elements of length less than k. First we consider
∆(λ) as a module over n− i.e. over the Yi. We’d like to define this in the most naive way:
Yi · (Yi1Yi2 · · ·Yikvλ) = YiYi1Yi2 · · ·Yikvλ. But if i > i1, this won’t be a basis element because the
subscripts i, i1, . . . have to be ordered. So we set
Yi · (Yi1Yi2 · · ·Yikvλ) =
YiYi1Yi2 · · ·Yikvλ if i ≤ i1
Yi1 · (Yi · (Yi2 · · ·Yikvλ)) + [Yi, Yi1 ] · (Yi2 · · ·Yikvλ) if i > i1.
Using the fact that we have ordered the Yi so that [Yi, Yj ] ∈ C{Yk | k > i} for all i ≤ j, one checks
that the above rule makes ∆(λ) into an n−-module. The action of b+ is easier to define:
X · (Yi1Yi2 · · ·Yikvλ) = Yi1 · (X · (Yi2 · · ·Yikvλ)) + [X, Yi1 ] · (Yi2 · · ·Yikvλ).
It is a direct check to see that the above rules do indeed make ∆(λ) into a g-module. This check
is not so obvious since one really needs to check that all the relations of Theorem 10.2 hold.
Each of the basis elements Yi1Yi2 · · ·Yikvλ is a weight vector with weight −αi1−· · ·−αik , where
Yi ∈ g−αi and αi ∈ R+. Here is the key fact about Verma modules that we need.
Lemma 11.7. Let λ ∈ h∗. The Verma module ∆(λ) has a unique simple quotient V (λ) i.e. if
∆(λ) ↠ V and ∆(λ) ↠ V ′ for some simple g-modules V and V ′ then V ' V ′.
Proof. Let’s call the maps ∆(λ) ↠ V and ∆(λ) ↠ V ′, φ and φ′ respectively. First, we claim
that φ(vλ) is a highest weight vector in V , that generates V as a g-module, and similarly for
φ′(vλ) ∈ V ′.
Since φ is surjective, V is spanned by φ(Yi1Yi2 · · ·Yikvλ). But notice that our rule for the
g-action on ∆(λ) means that Yi1Yi2 · · ·Yikvλ = Yi1 · (Yi2 · (· · · (Yik · vλ) · · · )) and hence
φ(Yi1Yi2 · · ·Yikvλ) = Yi1 · (Yi2 · (· · · (Yik · φ(vλ)) · · · )).
Thus, φ(vλ) generates V . If X ∈ b+ then X · φ(vλ) = φ(X · vλ), so it is clear that φ(vλ) is also a
highest weight vector.
Let M = ∑_{M ′⊂∆(λ)} M ′ be the sum of all g-submodules M ′ of ∆(λ) such that M ′ ∩∆(λ)λ = 0
(equivalently, the λ-weight space M ′λ of M ′ is zero). Then M is a g-submodule of ∆(λ) such that
Mλ = 0 i.e. it is a proper submodule. To show that V and V ′ are isomorphic, it suffices to show
that Kerφ = Kerφ′ = M . Since φ(vλ) 6= 0 and the λ-weight space of ∆(λ) is spanned by vλ, φ
restricts to an isomorphism ∆(λ)λ∼−→ Vλ. Thus, Kerφ ∩∆(λ)λ = 0 and hence Kerφ ⊂M . This
means that the non-zero g-module ∆(λ)/M is a quotient of V = ∆(λ)/Kerφ. But V is simple.
Hence ∆(λ)/M = V and M = Kerφ. The same argument applies to φ′.
Exercise 11.8. Let V be a finite dimensional g-module with highest weight vector v of weight
λ ∈ P+. Show that vλ 7→ v extends uniquely to a g-module homomorphism ∆(λ) → V . This
explains why ∆(λ) is the universal highest weight module with weight λ. Hint: read carefully
through the proof of Lemma 11.7.
11.3 The classification
Notice that we have already shown that if λ /∈ P+ then there can be no finite dimensional g-module
with highest weight vector of weight λ. Thus, the simple quotient V (λ) of ∆(λ) is infinite dimensional
in this case.
Theorem 11.9. Let V be a simple, finite dimensional g-module.
1. There is some λ ∈ P+ such that V ' V (λ).
2. If λ, µ ∈ h∗, λ 6= µ then V (λ) 6' V (µ).
Proof. By Lemma 11.2, there exists at least one highest weight vector, v say, in V . If λ is the weight
of v then we have shown that λ ∈ P+. By exercise 11.8, there exists a non-zero homomorphism
∆(λ)→ V , vλ 7→ v. But then Lemma 11.7 says that V ' V (λ) as required.
Now take λ, µ ∈ h∗, λ 6= µ. Since isomorphisms will map highest weight vectors to highest
weight vectors, it suffices to show that V (λ) does not contain a highest weight vector of weight
µ. We know that V (λ) is a quotient of ∆(λ) and all non-zero weight spaces in ∆(λ) have weight
λ−αi1−· · ·−αik for some αi ∈ R+. Since V (λ) is a quotient of ∆(λ), the same applies to it. There
are two cases to consider: the first is where there exist i1, . . . , ik such that λ− µ = αi1 + · · ·+ αik,
and the second, when we can’t find any such αi. In the second case, there can’t be any highest
weight vector in V (λ) of weight µ so there’s nothing to show.
So we assume λ−µ = αi1 + · · ·+αik and there is some v ∈ V (λ) of weight µ. Then, by exercise
11.8, there is a non-zero map ∆(µ) → V (λ), vµ 7→ v. It is surjective. Hence the weights of V (λ)
must also be of the form µ−αj1 − · · · −αjl for some j’s. In particular λ = µ−αj1 − · · · −αjl and
αi1 + · · ·+ αik = −αj1 − · · · − αjl . (20)
But recall that we have fixed simple roots β1, . . . , β`. This means that every α ∈ R+ can be written
as α = ∑`_{i=1} niβi for some non-negative integers ni. Moreover, β1, . . . , β` are a basis of h∗ so such an
expression is unique. This means that αi1 + · · · + αik can be uniquely written as ∑`_{i=1} niβi
for some non-negative integers ni. But it also means that −αj1 − · · · − αjl can be uniquely written as
∑`_{i=1}(−mi)βi for some non-negative integers mi. Thus, equation (20) implies that all ni = mi = 0,
hence λ = µ, contradicting our initial assumptions.
In fact, it is not so difficult to show (see e.g. [8, Theorem 22.2]) that V(λ) is finite dimensional for all λ ∈ P+. Thus, λ 7→ V(λ) defines a bijection

P+ ∼−→ {isomorphism classes of simple, finite dimensional g-modules}.
11.4 Weyl’s formula
We end with Weyl’s beautiful character formula for the character of a simple, finite dimensional
g-module. First we should explain what we mean by the character Ch(V ) of a finite dimensional
g-module. We know that V decomposes as the direct sum of its weight spaces. Just as the
dimension of a vector space is a good invariant of that space (in fact it uniquely defines it up to
isomorphism), the character of a g-module, which records the dimensions of all the weight spaces of the module, is a good invariant of the module (again, for g semi-simple and V finite dimensional, the character Ch(V) determines V uniquely up to isomorphism).
We consider formal linear combinations ∑_{λ∈P} m_λ e^λ, where m_λ ∈ Z. One can add these expressions in the obvious way. But it is also possible to multiply them: first we define e^λ · e^µ = e^{λ+µ}, then we extend this rule by linearity to all expressions. Now, given V , define

Ch(V) = ∑_{λ∈P} (dim V_λ) e^λ.
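These formal expressions are easy to model on a computer. The following sketch (my own illustration, not from the notes) stores a formal sum ∑ m_λ e^λ as a dictionary mapping weight tuples to integer coefficients, with multiplication given by e^λ · e^µ = e^{λ+µ}:

```python
from collections import Counter

def add(f, g):
    # Add two formal sums, merging coefficients of equal weights.
    h = Counter(f)
    h.update(g)          # Counter.update adds counts
    return {w: m for w, m in h.items() if m != 0}

def mul(f, g):
    # Multiply using e^λ · e^µ = e^(λ+µ), extended bilinearly.
    h = Counter()
    for lam, a in f.items():
        for mu, b in g.items():
            h[tuple(s + t for s, t in zip(lam, mu))] += a * b
    return {w: m for w, m in h.items() if m != 0}

# Character of the natural 3-dimensional gl(3,C)-module: weights ε1, ε2, ε3.
nat = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}
print(mul(nat, nat))   # the character of the tensor square
```

For example, squaring the character of the natural module gives the 9-dimensional character of V ⊗ V, with the weights εi + εj appearing with the expected multiplicities.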
Define ρ = (1/2) ∑_{α∈R+} α; ρ is called the half sum of positive roots. In the example of B3, considered in the previous section, ρ = (5/2)ε1 + (3/2)ε2 + (1/2)ε3. Finally, we need one extra ingredient in order to give Weyl's formula. Recall that we can define the sign function on permutations in the symmetric group; a permutation is even if and only if it can be written as a product of an even number of transpositions. This generalizes to an arbitrary Weyl group. Fix a set of simple roots ∆ and recall from Theorem 9.23 that W is generated by the set {s_α | α ∈ ∆}. We define the length ℓ(σ) of σ ∈ W to be the minimal number of factors needed to write σ as a product of elements of {s_α | α ∈ ∆}. Then sgn(σ) := (−1)^{ℓ(σ)}, generalizing the sign function on symmetric groups.
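For the symmetric group S_n (the Weyl group of type A_{n−1}), the length with respect to the simple reflections s_i = (i, i+1) equals the number of inversions of the permutation, so sgn can be computed directly. A small illustration (helper names are my own):

```python
from itertools import permutations

def inversions(p):
    # Pairs i < j with p[i] > p[j].  For the symmetric group this equals the
    # length ℓ(p): the minimal number of simple reflections s_i = (i, i+1)
    # needed to write p as a product.
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])

def sgn(p):
    return (-1) ** inversions(p)

# Tabulate length and sign for the six elements of S3.
for p in permutations(range(3)):
    print(p, inversions(p), sgn(p))
```

The three even and three odd permutations of S3 come out with signs +1 and −1 respectively, so the signs sum to zero.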
Theorem 11.10 (Weyl). Let V(λ) be the simple g-module with highest weight λ ∈ P+. Then

Ch V(λ) = ( ∑_{w∈W} sgn(w) e^{w(λ+ρ)} ) / ( ∑_{w∈W} sgn(w) e^{w(ρ)} ).
To illustrate how to use Weyl’s character formula, we’ll consider the simple Lie algebra sl(3,C)
of type A2. As per usual, it is easier when dealing with root systems of type A to consider the
simple sl(n,C)-module V as a simple gl(n,C)-module. This can be done by making the central
element Id ∈ gl(n,C) act by a fixed scalar (which we are free to choose).
We have ρ = ε1 − ε3 and

∑_{w∈S3} sgn(w) e^{w(ρ)} = x1 x3^{−1} − x2 x3^{−1} − x1^{−1} x3 − x1 x2^{−1} + x1^{−1} x2 + x2^{−1} x3.
Let’s consider the simple gl(3,C)-module with highest weight (2, 1, 0). I claim that the character
of V ((2, 1, 0)) equals x21x2 + x2
1x3 + x22x1 + x2
2x3 + x23x1 + x2
3x2 + 2x1x2x3. To show this, we note
sgn(w)ew(3,1,−1) = x31x2x
−13 − x3
2x1x−13 − x3
3x2x−11 − x3
1x3x−12 + x3
2x3x−11 + x3
Then, one checks, by explicitly multiplying, that

( ∑_{w∈S3} sgn(w) e^{w(ρ)} ) · Ch V((2, 1, 0)) = ∑_{w∈S3} sgn(w) e^{w(3,1,−1)}.
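This multiplication can also be checked by machine. The sketch below (using sympy; the function names are my own) builds the alternating sums ∑_{w∈S3} sgn(w) e^{w(µ)} as Laurent polynomials in x1, x2, x3 and verifies the claim for V((2, 1, 0)):

```python
from itertools import permutations
import sympy

x = sympy.symbols('x1 x2 x3')

def sgn(p):
    # Sign of a permutation via its inversion count.
    return (-1) ** sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])

def alt(mu):
    # The alternating sum over w in S3 of sgn(w) e^{w(mu)}, where
    # e^{(a,b,c)} = x1^a x2^b x3^c and w sends ε_i to ε_{w(i)}.
    total = 0
    for p in permutations(range(3)):
        monom = sympy.Integer(1)
        for i in range(3):
            monom *= x[p[i]] ** mu[i]
        total += sgn(p) * monom
    return total

rho = (1, 0, -1)
claimed = (x[0]**2*x[1] + x[0]**2*x[2] + x[1]**2*x[0] + x[1]**2*x[2]
           + x[2]**2*x[0] + x[2]**2*x[1] + 2*x[0]*x[1]*x[2])

# Weyl's formula predicts  Ch V((2,1,0)) * alt(rho) == alt((2,1,0) + rho).
assert sympy.expand(claimed * alt(rho) - alt((3, 1, -1))) == 0
print("Weyl check passed")
```

The same two functions can be reused for the exercises below: replace `claimed` and the weight `(3, 1, -1)` by the relevant data.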
Exercise 11.11. Using Weyl’s formula, show that Ch V ((1, 1, 1)) = x1x2x3, where V ((1, 1, 1)) is
the irreducible gl(3,C)-module with highest weight (1, 1, 1).
Exercise 11.12. Let E be the two dimensional real vector space with basis ε1, ε2. Then β1 = ε1 − ε2, β2 = ε2 is a set of simple roots and R = {±εi, ±(εi ± εj) | 1 ≤ i ≠ j ≤ 2} is the set of all roots for B2.
(a) Show that a weight λ = λ1ε1 + λ2ε2 is dominant integral if and only if λ2 ∈ (1/2)Z≥0 and λ1 − λ2 ∈ Z≥0. Also show that ρ = (3/2)ε1 + (1/2)ε2.
(b) The Weyl group W of type B2 is generated by the two reflections

s_{β1} = [[0, 1], [1, 0]], s_{β2} = [[1, 0], [0, −1]]

(matrices acting on E in the basis ε1, ε2, rows listed). Show that W has eight elements, and list them.
(c) Finally, calculate Ch(V(1, 0)), Ch(V(1, 1)) and Ch(V(3, 1)). What is the dimension of these modules?
Exercise 11.13. The root system for sl2 (i.e. type A1) is {±2ε1} and, if ∆ = {2ε1}, then P+ = {nε1 | n ∈ Z≥0}. What is the Weyl group? For all n ≥ 0, calculate Ch V(n). How does this compare to Theorem 6.15?
12 Appendix: Quotient vector spaces
12.1 The definition
Let F be a field and V a vector space over F. Let W be a subspace of V . Then we form the quotient

V/W = {[v] | v ∈ V }, where [v] = [v′] if and only if v − v′ ∈ W.

Equivalently, since V is an abelian group and W a subgroup, V/W is the set of W-cosets in V .
Lemma 12.1. The quotient V/W is a vector space.
Proof. It is already clear that it is an abelian group. So we just need to define α[v] for α ∈ F and
[v] ∈ V/W , and check that this makes V/W into a vector space.
We define α[v] = [αv]. To see that this is well-defined, let [v] = [v′]. Then v − v′ ∈ W , hence α(v − v′) = αv − αv′ ∈ W , and thus [αv] = [αv′]. Hence the operation is well-defined.
To make sure that this makes V/W into a vector space, we need to check, for example, that α([v] + [v′]) = α[v] + α[v′] (the remaining axioms are verified in the same way). But
α([v] + [v′]) = α([v + v′]) = [α(v + v′)] = [αv + αv′] = α[v] + α[v′].
One way to think about the quotient space V/W (though I would argue, the wrong way!) is in terms of complementary subspaces. Recall that a complement to W is a subspace W′ of V such that
• W′ + W = V , and
• W′ ∩ W = 0.
We normally write this as V = W ⊕ W′.
There is a canonical map V → V/W , v 7→ [v]. It is a linear map.
Lemma 12.2. Let W′ be a complement to W in V . Then the quotient map defines an isomorphism
φ : W ′ → V/W .
Proof. To be precise, φ is the composite W ′ → V → V/W . Being the composite of two linear
maps, it is linear. To show that it is surjective, take [v] ∈ V/W . Since W ′ +W = V , we can find
w′ ∈ W ′ and w ∈ W such that v = w+w′. This implies that v−w′ = w ∈ W and hence [v] = [w′]
in V/W i.e. φ(w′) = [v]. So φ is surjective.
Now assume that w′ ∈ Ker φ. This means that [w′] = [0], i.e. w′ ∈ W . But then w′ ∈ W′ ∩ W = 0. Hence w′ = 0, which means that Ker φ = 0. So φ is injective.
Notice that, if V is finite dimensional, then
dimV = dim(W +W ′) = dimW + dimW ′ − dimW ∩W ′ = dimW + dimW ′.
But φ : W ′ ∼−→ V/W . Hence dimW ′ = dimV/W . Thus, we see that
dimV/W = dimV − dimW.
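Numerically, this dimension formula is just a rank computation. A quick illustration with numpy (the matrix is a made-up example): if W is the column span of a matrix A inside V = R^4, then dim V/W = 4 − rank(A):

```python
import numpy as np

# Take V = R^4 and let W be the column span of a (made-up) 4x3 matrix A whose
# third column is the sum of the first two, so dim W = rank(A) = 2.
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [0., 0., 0.]])

dim_V = 4
dim_W = int(np.linalg.matrix_rank(A))
dim_quot = dim_V - dim_W   # dim V/W = dim V - dim W
print(dim_W, dim_quot)     # 2 2
```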
12.2 Basis of V/W
Assume now that V is finite dimensional.
Lemma 12.3. Let v1, . . . , vk ∈ V be such that [v1], . . . , [vk] is a basis of V/W , and let w1, . . . , wℓ be a basis of W . Then v1, . . . , vk, w1, . . . , wℓ is a basis of V .
Proof. We need to show that v1, . . . , vk, w1, . . . , wℓ span V and are linearly independent.
First, we show spanning. If v ∈ V , then the fact that [v1], . . . , [vk] is a basis of V/W implies that there exist α1, . . . , αk ∈ F such that [v] = α1[v1] + · · · + αk[vk]. This means that [v − α1v1 − · · · − αkvk] = [0], i.e. v − α1v1 − · · · − αkvk ∈ W . Since w1, . . . , wℓ are a basis of W , we can find β1, . . . , βℓ ∈ F such that
v − α1v1 − · · · − αkvk = β1w1 + · · · + βℓwℓ,
i.e. v = α1v1 + · · · + αkvk + β1w1 + · · · + βℓwℓ. So v1, . . . , vk, w1, . . . , wℓ span V .
Next, assume that α1, . . . , αk, β1, . . . , βℓ ∈ F are such that α1v1 + · · · + αkvk + β1w1 + · · · + βℓwℓ = 0. This means that
[α1v1 + · · · + αkvk + β1w1 + · · · + βℓwℓ] = α1[v1] + · · · + αk[vk] = [0]
in V/W . But [v1], . . . , [vk] are a basis of V/W . Hence α1 = · · · = αk = 0. Thus, β1w1 + · · · + βℓwℓ = 0. But w1, . . . , wℓ is a basis of W . Hence β1 = · · · = βℓ = 0 too.
Notice that we didn’t really have to prove linear independence in the above proof. If we
know that v1, . . . , vk, w1, . . . , w` span, then the fact that dimV = dimV/W + dimW implies that
dimV = k + l and hence any spanning set of V with k + l elements must be a basis.
Lemma 12.4. If v1, . . . , vk ∈ V are such that [v1], . . . , [vk] is a basis of V/W , then W′ = span{v1, . . . , vk} is a complement to W in V .
Proof. Since [v1], . . . , [vk] is a basis of V/W , the map φ : W′ → V → V/W , vi 7→ [vi], is surjective. But dim W′ = dim V/W = k. Hence φ is an isomorphism. The kernel of φ equals W ∩ W′; hence, since φ is an isomorphism, we have W ∩ W′ = 0.
If v ∈ V , then we can find w′ ∈ W′ such that φ(w′) = [v], i.e. v − w′ ∈ W . So there must be w ∈ W such that v − w′ = w. Hence v = w′ + w. So we have shown that V = W + W′.
12.3 Endomorphisms of V/W
Sometimes a linear map V → V also defines for us a linear map V/W → V/W . Let X : V → V be
a linear map.
Lemma 12.5. If X(W) ⊆ W , then X defines a linear map X̄ : V/W → V/W by X̄([v]) = [X(v)].
Proof. First we show that X̄ is well-defined and then check that it is linear. So we need to check that if [v] = [v′] then X̄([v]) = X̄([v′]). But [v] = [v′] iff v − v′ ∈ W . Since X(W) ⊆ W , we have X(v − v′) = X(v) − X(v′) ∈ W and hence
X̄([v]) = [X(v)] = [X(v′)] = X̄([v′]).
So now we just check that it is linear. This is easy:
X̄(α[v] + β[v′]) = X̄([αv + βv′])
= [X(αv + βv′)]
= [αX(v) + βX(v′)]
= α[X(v)] + β[X(v′)]
= αX̄([v]) + βX̄([v′]).
We’ll do an example just to get a feel for things. Let’s take V = C4 and
, W ′ =
where W ′ is a compliment to W in V . Since v1 =
and v2 =
are a basis of W ′, [v1]
and [v2] are a basis of V/W . Let
4 2 −1 6
0 1 0 6
0 0 2 −3
0 0 0 2
Then, one can check that X(W ) ⊂ W . So now
−α + 6β
−α + 6β
Thus, X([v1]) = 2[v1] and X([v2]) = −3[v1] + [v2]. This implies that
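The induced matrix can also be read off mechanically. In the sketch below (my own choices: W = span{e1, e2} and W′ = span{e3, e4}, which the given X visibly preserves), the matrix of the induced map on V/W in the basis [e3], [e4] is the lower-right 2×2 block of X, because the first two coordinates of X(e3) and X(e4) lie in W and vanish in the quotient:

```python
import numpy as np

X = np.array([[4., 2., -1., 6.],
              [0., 1., 0., 6.],
              [0., 0., 2., -3.],
              [0., 0., 0., 2.]])

# X(W) ⊆ W for W = span{e1, e2}: the lower-left 2x2 block of X is zero,
# so X(e1) and X(e2) stay inside span{e1, e2}.
assert np.all(X[2:, :2] == 0)

# In the basis [e3], [e4] of V/W, the induced map is the lower-right block:
# the first two coordinates of X(e3), X(e4) are killed when passing to V/W.
X_bar = X[2:, 2:]
print(X_bar)
```

For a complement not spanned by standard basis vectors one would instead express each X(vi) in a combined basis of W and W′ and keep only the W′-coordinates; the block extraction above is the special case where that basis is the standard one.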
12.4 Bilinear pairings
Let V be an F-vector space. A bilinear pairing is a map
(−,−) : V × V → F
such that
1. (u + v, w) = (u, w) + (v, w) and (u, v + w) = (u, v) + (u, w) for all u, v, w ∈ V ;
2. (αu, βv) = αβ(u, v) for all u, v ∈ V and α, β ∈ F.
The bilinear form is said to be symmetric if (u, v) = (v, u) for all u, v ∈ V . We do not assume that (−,−) is symmetric in general. The radical of (−,−) is defined to be
rad(−,−) = {u ∈ V | (u, v) = 0 ∀ v ∈ V }.
We say that the bilinear form (−,−) is non-degenerate if rad(−,−) = 0.
The dual of a vector space V is defined to be the space V ∗ of all linear maps λ : V → F.
Lemma 12.6. If V is finite dimensional, then V ∗ is finite dimensional and dimV = dimV ∗.
Proof. Let n = dimV . Fixing a basis of V , every element of V ∗ can be uniquely written as a 1×n-
matrix i.e. a row vector. It is clear that the space of row vectors of length n is n-dimensional.
The following lemma shows the relationship between dual vector spaces and bilinear forms.
Lemma 12.7. If V is equipped with a non-degenerate bilinear form (−,−), then the form defines
a canonical isomorphism φ : V → V ∗ given by
φ(u)(v) := (u, v).
Proof. For fixed u, the fact that (−,−) is bilinear implies that φ(u) is a linear map, i.e. φ(u) ∈ V ∗. Moreover, bilinearity also implies that φ : V → V ∗ is a linear map. The kernel of φ is clearly rad(−,−). Therefore, since we have assumed that (−,−) is non-degenerate, φ is injective. On the other hand, Lemma 12.6 says that dim V = dim V ∗. Therefore φ must be an isomorphism.
If W is a subspace of V then the perpendicular of W with respect to (−,−) is defined to be
W⊥ := {u ∈ V | (u, w) = 0 ∀ w ∈ W}.
Lemma 12.8. Let V be a finite dimensional vector space, (−,−) a non-degenerate bilinear form
on V and W a subspace of V . Then we have a canonical isomorphism
W⊥ ∼−→ (V/W )∗
and dimW⊥ + dimW = dimV .
Proof. We define ψ : W⊥ → (V/W )∗ by ψ(u)([v]) = (u, v). Let us check that this is well-defined
i.e. ψ(u)([v]) = ψ(u)([v′]) if [v] = [v′]. We have [v] = [v′] if and only if v − v′ ∈ W . But then
(u, v − v′) = 0 for all u ∈ W⊥. Hence (u, v) = (u, v′) and ψ(u)([v]) = ψ(u)([v′]).
Next we show that (V/W )∗ can be identified with the subspace U of V ∗ consisting of all λ such
that W ⊂ Kerλ i.e. λ(W ) = 0. If λ ∈ U then we can define λ′ ∈ (V/W )∗ by λ′([v]) = λ(v). Just as
in the previous paragraph, it is easy to check that this is well-defined. Now let v1, . . . , vk, w1, . . . , wℓ be the basis of V defined in Lemma 12.3. Given ν ∈ (V/W)∗, we define λ(vi) = ν([vi]) and λ(wj) = 0 for all i, j. Then λ extends uniquely by linearity to an element of V ∗. Since λ(wj) = 0 for all j, λ belongs to U . By construction, λ′ = ν, so λ 7→ λ′ is surjective; it is also injective, since λ′ = 0 forces λ(v) = λ′([v]) = 0 for all v ∈ V . Hence U ∼−→ (V/W)∗.
Finally, we note that, if φ is the isomorphism of Lemma 12.7, then
φ−1(U) = {u ∈ V | φ(u)(W) = 0} = {u ∈ V | (u, w) = 0 ∀ w ∈ W} = W⊥.
Since φ(u)′ = ψ(u) for all u ∈ φ−1(U), ψ is an isomorphism.
The dimension formula follows from the fact that dimV = dimV/W + dimW and dimV =
dimV ∗ by Lemma 12.6.
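As a concrete check of Lemma 12.8, one can compute W⊥ as a null space. A small sympy sketch (the form and the subspace are made-up examples): u ∈ W⊥ exactly when u^T B w = 0 for each basis vector w of W, i.e. when (B w)^T u = 0.

```python
import sympy

# A non-degenerate (here non-symmetric) bilinear form (u, v) = u^T B v on F^3;
# B is invertible, so the form is non-degenerate.
B = sympy.Matrix([[1, 2, 0],
                  [0, 1, 0],
                  [0, 0, 3]])
w = sympy.Matrix([1, 0, 1])   # basis vector of a 1-dimensional subspace W

# u ∈ W⊥  iff  u^T B w = 0  iff  (B w)^T u = 0,
# so W⊥ is the null space of the row vector (B w)^T.
M = (B * w).T
perp = M.nullspace()
print(len(perp))   # dim W⊥ = dim V - dim W = 3 - 1 = 2
```

The two basis vectors returned for W⊥ confirm the dimension count dim W⊥ + dim W = dim V from the lemma.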
References
[1] A. Baker. Matrix groups. Springer Undergraduate Mathematics Series. Springer-Verlag London Ltd., London, 2002. An introduction to Lie group theory.
[2] R. W. Carter. Lie algebras of finite and affine type, volume 96 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2005.
[3] A. Cannas da Silva. Lectures on symplectic geometry. http://www.math.ist.utl.pt/~acannas/Books/
[4] K. Erdmann and M. J. Wildon. Introduction to Lie algebras. Springer Undergraduate Mathematics Series. Springer-Verlag London Ltd., London, 2006.
[5] W. Fulton and J. Harris. Representation theory, volume 129 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1991. A first course, Readings in Mathematics.
[6] V. Guillemin and S. Sternberg. Symplectic techniques in physics. Cambridge University Press, Cambridge, second edition, 1990.
[7] B. C. Hall. Lie groups, Lie algebras, and representations, volume 222 of Graduate Texts in Mathematics. Springer-Verlag, New York, 2003. An elementary introduction.
[8] J. E. Humphreys. Introduction to Lie Algebras and Representation Theory. Springer-Verlag, New York, 1972. Graduate Texts in Mathematics, Vol. 9.
[9] A. Kirillov, Jr. An introduction to Lie groups and Lie algebras, volume 113 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2008.
[10] D. Mond. Lecture notes for MA455 Manifolds. http://homepages.warwick.ac.uk/~masbm/