Page 1: Foundations of Quantum Mechanics

FOUNDATIONS OF QUANTUM MECHANICS

JOSEF M. JAUCH
University of Geneva, Switzerland

ADDISON-WESLEY PUBLISHING COMPANY
Reading, Massachusetts · Menlo Park, California · London · Don Mills, Ontario

This book is in the
ADDISON-WESLEY SERIES IN ADVANCED PHYSICS

Consulting Editor: MORTON HAMERMESH

COPYRIGHT 1968 BY ADDISON-WESLEY PUBLISHING COMPANY, INC. ALL RIGHTS RESERVED. THIS BOOK, OR PARTS THEREOF, MAY NOT BE REPRODUCED IN ANY FORM WITHOUT WRITTEN PERMISSION OF THE PUBLISHER. PRINTED IN THE UNITED STATES OF AMERICA. PUBLISHED SIMULTANEOUSLY IN CANADA. LIBRARY OF CONGRESS CATALOG CARD No. 67-2397K.


PREFACE

This book is an advanced text on elementary quantum mechanics. By "elementary" I designate here the subject matter of nonrelativistic quantum mechanics for the simplest physical systems. With the word "advanced" I refer to the use of modern mathematical tools and the careful study of difficult questions concerning the physical interpretation of quantum mechanics.

These questions of interpretation have been a source of difficulties from the beginning of the theory in the late twenties to the present day. They have been the subject of numerous controversies and they continue to worry contemporary thoughtful students of the subject.

In spite of these difficulties, quantum mechanics is indispensable for most modern research in physics. For this reason every physicist worth his salt must know how to use at least the language of quantum mechanics. For many forms of communication, knowledge of the approved usage of the language may be quite sufficient. A deeper understanding of the meaning is then not absolutely indispensable.

The pragmatic tendency of modern research has often obscured the difference between knowing the usage of a language and understanding the meaning of its concepts. There are many students everywhere who pass their examinations in quantum mechanics with top grades without really understanding what it all means. Often it is even worse than that. Instead of learning quantum mechanics in parrot-like fashion, they may learn in this fashion only particular approximation techniques (such as perturbation theory, Feynman diagrams or dispersion relations), which then lead them to believe that these useful techniques are identical with the conceptual basis of the theory. This tendency appears in scores of textbooks and is encouraged by some prominent physicists.

This text, on the contrary, is not concerned with applications or approximations, but with the conceptual foundations of quantum mechanics. It is restricted to the general aspects of the nonrelativistic theory. Other fundamental topics such as scattering theory, quantum statistics and relativistic quantum mechanics will be reserved for subsequent publications.


When I wrote this book I had three categories of readers in mind: the student of physics who has already acquired a first knowledge of quantum mechanics, the experienced physicist who is in search of a deeper understanding, and the mathematician who is interested in the mathematical problems of quantum mechanics.

The book consists of three parts. Part 1, called Mathematical Foundations, contains in four chapters a sundry collection of mathematical results, not usually found in the arsenal of a physicist but indispensable for understanding the rest of the book. I have taken special care to explain, motivate, and define the basic concepts and to state the important theorems. The theorems are rarely proved, however. Most of the concepts are from functional analysis and algebra. For a physicist this part may be useful as a short introduction to certain mathematical results which are applicable in many other domains of physics. The mathematician will find nothing new here, and after a glance at the notation can proceed to Chapter 5.

In Part 2, called Physical Foundations, I present in an axiomatic form the basic notions of general quantum mechanics, together with a detailed analysis of the deep epistemological problems connected with them.

The central theme here is the lattice of propositions, an empirically determined algebraic structure which characterizes the intrinsic physical properties of a quantum system.

Part 3 is devoted to the quantum mechanics of elementary particles. The important new notion which is introduced here is localizability, together with homogeneity and isotropy of the physical space. In this part the reader will finally find the link with the conventional presentation of quantum mechanics. And it is here also that he encounters Planck's constant, which fixes the scale of the quantum features.

Of previous publications those of von Neumann have most strongly influenced the work presented here. There is also considerable overlap with the book by G. Ludwig and with lecture notes by G. Mackey. In addition to material available in these and other references, it also contains the results of recent research on the foundations of quantum mechanics carried out in Geneva over the past seven years.

The presentation uses a more modern mathematical language than is customary in textbooks of quantum mechanics. There are essentially three reasons for this:

First of all, I believe that mathematics itself can profit by maintaining its relations with the development of physical ideas. In the past, mathematics has always renewed itself in contact with nature, and without such contacts it is doomed to become pure symbolism of ever-increasing abstraction.

Second, physical ideas can be expressed much more forcefully and clearly if they are presented in the appropriate language. The use of such a language will enable us to distinguish more easily the difficulties which we might call syntactical from those of interpretation. Contrary to a widespread belief, mathematical rigor, appropriately applied, does not necessarily introduce complications. In physics it means that we replace a traditional and often antiquated language by a precise but necessarily abstract mathematical language, with the result that many physically important notions formerly shrouded in a fog of words become crystal clear and of surprising simplicity.

Third, in all properly formulated physical ideas there is an economy of thought which is beautiful to contemplate. I have always been convinced that this esthetic aspect of a well-expressed physical theory is just as indispensable as its agreement with experience. Only beauty can lead to that "passionate sympathetic contemplation" of the marvels of the physical world which the ancient Greeks expressed with the orphic word "theory."

About 350 problems appear throughout the book. Most of them are short exercises designed to reinforce the notions introduced in the text. Others are more or less obvious supplements of the text. There are also some deeper problems indicated by an asterisk. The latter kind are all supplied with a reference or a hint.

There remains the pleasant task of remembering here the many colleagues, collaborators, and students who in one way or another have helped to shape the content and the form of this book.

My first attempts to rethink quantum mechanics were very much stimulated by Prof. D. Finkelstein of Yeshiva University and Prof. D. Speiser of the University of Louvain. It was during the year 1958, when all three of us were spending a very stimulating year at CERN, that we began examining the question of possible generalizations of quantum mechanics. Many of the ideas conceived during this time were subsequently elaborated in publications of my students at the University of Geneva. I should mention here especially the work of G. Emch, M. Guenin, J. P. Marchand, B. Misra, and C. Piron.

In the early stages I profited much from various discussions and correspondence with Professor G. Mackey. Many colleagues have read and criticized different portions of the manuscript. I mention here especially Dr. R. Hagedorn of CERN, whose severe criticism of the pedagogical aspects of the first four chapters has been most valuable. With Dr. J. Bell of CERN, I debated especially the sections on hidden variables and the measuring process. The chapter on the measuring process has also been influenced by correspondence with Prof. E. Wigner of Princeton and with Prof. L. Rosenfeld of Copenhagen. Several other sections were improved by criticism from Prof. R. Ascoli of Palermo, Prof. F. Rohrlich of Syracuse, New York, dell'Antonio of Naples, and Dr. G. Baron of Rye, New York. Prof. C. F. v. Weizsäcker of Hamburg read and commented on the section of Chapter 5. Professor G. Mackey read critically the entire manuscript and suggested many improvements. Dr. C. Piron and Mr. A. Salah read the proofs, and I am much indebted to them for their conscientious and patient collaboration. To all of them, and many others too numerous to mention, I wish to express here my thanks.

The major portion of the book was written while I had the privilege of holding an invited professorship at the University of California in Los Angeles, during the winter semester of 1964. May Professors D. Saxon and R. Finkelstein find here my deeply felt gratitude for making this sojourn, and thereby this book, possible. The University of Geneva contributed its share by granting a leave of absence which liberated me from teaching duties for three months.

I have also enjoyed the active support of CERN, which, by according me the status of visiting scientist, has greatly facilitated my access to its excellent research facilities and contact with numerous other physicists interested in the matters treated in this book.

It is unavoidable that my interpretation of controversial questions is not shared by all of my correspondents. Of course, I alone am responsible for the answers to such questions which appear in this book.

Mrs. Dorothy Pederson of Los Angeles and Mlle. Frances Prost of Geneva gave generously of their competent services in typing a difficult manuscript. May they, too, as well as their collaborators, find here my expression of gratitude.

Geneva, Switzerland
August 1966
J. M. J.

CONTENTS

PART 1 Mathematical Foundations

Chapter 1 Measure and Integral
1-1 Some notions and notations from set theory
1-2 The measure space
1-3 Measurable and integrable functions
1-4 The theorem of Radon-Nikodym
1-5 Function spaces

Chapter 2 The Axioms of Hilbert Space
2-1 The axioms of Hilbert space
2-2 Comments on the axioms
2-3 Realizations of Hilbert space
2-4 Linear manifolds and subspaces
2-5 The lattice of subspaces

Chapter 3 Linear Functionals and Linear Operators
3-1 Bounded linear functionals
3-2 Sesquilinear functionals and quadratic forms
3-3 Bounded linear operators
3-4 Projections
3-5 Unbounded operators
3-6 Examples of operators

Chapter 4 Spectral Theorem and Spectral Representation
4-1 Self-adjoint operators in finite-dimensional spaces
4-2 The resolvent and the spectrum
4-3 The spectral theorem
4-4 The functional calculus
4-5 Spectral densities and generating vectors
4-6 The spectral representation
4-7 Eigenfunction expansions

PART 2 Physical Foundations

Chapter 5 The Propositional Calculus
5-1 Historic-philosophic prelude
5-2 Yes-no experiments
5-3 The propositional calculus
5-4 Classical systems and Boolean lattices
5-5 Compatible and incompatible propositions
5-6 Modularity
5-7 The lattice of subspaces
5-8 Proposition systems

Chapter 6 States and Observables
6-1 The notion of state
6-2 The measurement of the state
6-3 Description of states
6-4 The notion of observables
6-5 Properties of observables
6-6 Compatible observables
6-7 The functional calculus for observables
6-8 The superposition principle
6-9 Superselection rules

Chapter 7 Hidden Variables
7-1 A thought experiment
7-2 Dispersion-free states
7-3 Hidden variables
7-4 Alternative ways of introducing hidden variables

Chapter 8 Proposition Systems and Projective Geometries
8-1 Projective geometries
8-2 Reduction theory
8-3 The structure of irreducible proposition systems
8-4 Orthocomplementation and the metric of the vector space
8-5 Quantum mechanics in Hilbert space

Chapter 9 Symmetries and Groups
9-1 The meaning of symmetry
9-2 Abstract groups
9-3 Topological groups
9-4 The automorphisms of a proposition system
9-5 Transformation of states
9-6 Projective representation of groups

Chapter 10 The Dynamical Structure
10-1 The time evolution of a system
10-2 The dynamical group
10-3 Different descriptions of the time evolution
10-4 Nonconservative systems

Chapter 11 The Measuring Process
11-1 Uncertainty relations
11-2 General description of the measuring process
11-3 Description of the measuring process for quantum-mechanical systems
11-4 Properties of the measuring device
11-5 Equivalent states
11-6 Events and data
11-7 Mathematical interlude: The tensor product
11-8 The union and separation of systems
11-9 A model of the measuring process
11-10 Three paradoxes

PART 3 Elementary Particles

Chapter 12 The Elementary Particle in One Dimension
12-1 Localizability
12-2 Homogeneity
12-3 The canonical commutation rules
12-4 The elementary particle
12-5 Velocity and Galilei invariance
12-6 The harmonic oscillator
12-7 A Hilbert space of analytical functions
12-8 Localizability and modularity

Chapter 13 The Elementary Particle without Spin
13-1 Localizability
13-2 Homogeneity and isotropy
13-3 Rotations as kinematical symmetries
13-4 Velocity and Galilei invariance
13-5 Gauge transformations and gauge invariance
13-6 Density and current of an observable
13-7 Space inversion
13-8 Time reversal

Chapter 14 Particles with Spin
14-1 Spin, a nonclassical degree of freedom
14-2 The description of a particle with spin
14-3 Spin and rotations
14-4 Spin and orbital angular momentum
14-5 Spin under space reflection and time inversion
14-6 Spin in an external force field
14-7 Elementary particle with arbitrary spin

Chapter 15 Identical Particles
15-1 Assembly of several particles
15-2 Mathematical digression: The multiple tensor product
15-3 The notion of identity in quantum mechanics
15-4 Systems of several identical particles
15-5 The Bose gas
15-6 The Fermi gas

Author Index

Subject Index

PART 1

Mathematical Foundations


CHAPTER 1

MEASURE AND INTEGRAL

Therefore there is no perfect measure of continuous quantity except by means of indivisible continuous quantity, for example by means of a point, and no quantity can be perfectly measured unless it is known how many individual points it contains. And since these are infinite, therefore their number cannot be known by a creature but by God alone, who disposes everything in number, weight, and measure.

ROBERT GROSSETESTE,

13th century A.D.

The purpose of this chapter is to acquaint the reader with the modern theory of integration. Section 1-1 contains some basic notions of set theory together with a list of terms and formulas. In Section 1-2 we present the notion of measure space and some properties of measures. We define measures on $\sigma$-rings of a class of measurable sets, but we pay no attention to the maximal extensions of such measures. The following section (1-3) introduces the measurable and integrable functions and defines the notion of integral. Section 1-4 introduces the theorem of Radon-Nikodym by way of a trivial example. The last section (1-5) on function spaces forms the bridge to the general theory of Hilbert space to be presented in Chapter 2.

1-1. SOME NOTIONS AND NOTATIONS FROM SET THEORY

A collection of objects taken as a whole is called a set. The objects which make up a set are called the elements of the set. We denote sets by capital letters, for instance $A, B, \ldots, S$; and the elements by small letters, for instance $a, b, \ldots, x$. If the element $x$ is contained in the set $S$ we write $x \in S$; if it is not contained in the set $S$ we write $x \notin S$. If every element of a set $A$ is contained in $B$ we write $A \subset B$ or $B \supset A$, and we say $A$ is a subset of $B$.

If $A \subset B$ and $B \subset A$, then we say the two sets are equal and we write $A = B$.

The set $A \cup B$ denotes the set of elements which are either in $A$ or in $B$ or in both. It will be called the union of $A$ and $B$.

Fig. 1-1 Relations between point sets: (a) $A \subset B$ ($A$ subset of $B$); (b) $A \cap B = \emptyset$ (disjoint sets).

The set $A \cap B$ denotes the set of elements which are in $A$ as well as in $B$. It is called the intersection of $A$ and $B$.

If $A$ is a subset of a set $S$, we define by $A'$ (with respect to $S$) the set of all elements which are in $S$ but not in $A$. The set $A \cap B' = A - B$ is called the difference of $A$ and $B$, or the relative complement of $B$ in $A$.

A subset of a general set $S$ can be defined by a certain property $m(x)$. The set $A$ of all elements which have property $m(x)$ is written

$$A = \{x : m(x)\}.$$

It means: $A$ is the set of all elements which satisfy property $m(x)$. Thus for instance the operation $A \cup B$ can be defined by

$$A \cup B = \{x : x \in A \text{ and/or } x \in B\}.$$

Similarly we write

$$A \cap B = \{x : x \in A \text{ and } x \in B\}.$$

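If there exists no element which has property $m(x)$, then the set $\emptyset = \{x : m(x)\}$ defines the empty set. We have always, for any set $A \subset S$:

$$\emptyset \subset A, \qquad \emptyset \cup A = A, \qquad \emptyset \cap A = \emptyset, \qquad \emptyset' = S.$$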
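The operations and identities above can be tried out on small finite sets. The following is only an illustrative sketch in Python: the concrete sets `S`, `A`, `B` and the helper `complement` are assumptions chosen for the example and do not appear in the text.

```python
# Finite stand-ins for the sets of the text; S plays the role of the ambient set.
S = frozenset(range(10))
A = frozenset({1, 2, 3, 4})
B = frozenset({3, 4, 5, 6})
EMPTY = frozenset()


def complement(X, universe=S):
    """A' with respect to S: all elements of S that are not in X."""
    return universe - X


union = A | B          # elements in A or in B or in both
intersection = A & B   # elements in A as well as in B
difference = A - B     # relative complement of B in A

# A - B coincides with the intersection of A with B'.
assert difference == A & complement(B)

# The four identities for the empty set.
assert EMPTY <= A
assert EMPTY | A == A
assert EMPTY & A == EMPTY
assert complement(EMPTY) == S

print(sorted(union), sorted(intersection), sorted(difference))
```

Python's built-in `frozenset` is used so that the same sets can later serve as elements of a class of subsets.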

If two sets $A$ and $B$ are such that $A \cap B = \emptyset$, they are called disjoint.

All these notions can be easily illustrated and remembered by using point sets in a plane (see Figs. 1-1 and 1-2).

We shall now go a step further and consider collections of subsets of a set. We speak then of a class of subsets. Of particular interest are classes which are closed with respect to certain operations defined above.

Fig. 1-2 Intersection-, union-, and difference-sets: (a) $A \cap B$; (b) $A \cup B$; (c) $A - B = A \cap B'$ (for $S$ the entire plane).

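The most useful notion is that of a ring. A nonempty class of subsets is called a ring $\mathcal{R}$ if, for $A \in \mathcal{R}$ and $B \in \mathcal{R}$, it follows that $A \cup B \in \mathcal{R}$ and $A - B \in \mathcal{R}$. Examples of rings are easily constructed. One of the simplest possible is the ring consisting of an arbitrary subset $A \subset S$, together with the sets $\emptyset$, $A'$ and $S$.

Since $A - A = \emptyset$, every ring contains the empty set. One proves by mathematical induction that if $A_1, A_2, \ldots, A_n$ is a finite collection of subsets in $\mathcal{R}$, then

$$\bigcup_{i=1}^{n} A_i \in \mathcal{R} \quad \text{and} \quad \bigcap_{i=1}^{n} A_i \in \mathcal{R}.$$

Here we have introduced the easily understandable notation

$$\bigcup_{i=1}^{n} A_i = A_1 \cup A_2 \cup \cdots \cup A_n,$$

and similarly for the intersection.

A $\sigma$-ring is a ring with the additional property that, for every countable sequence $A_i$ ($i = 1, 2, \ldots$) of sets contained in $\mathcal{R}$, we have

$$\bigcup_{i=1}^{\infty} A_i \in \mathcal{R}.$$

A ring is called an algebra (or Boolean algebra) if it contains $S$, or equivalently if $A \in \mathcal{R}$ implies $A' \in \mathcal{R}$.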
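For a finite class of subsets, the defining closure properties of a ring can be checked by brute force. The sketch below is an illustration only; the function names `is_ring` and `is_algebra` and the sample classes are hypothetical and not taken from the text.

```python
from itertools import product


def is_ring(cls):
    """True if the nonempty class of frozensets is closed under union and difference."""
    cls = set(cls)
    if not cls:
        return False
    return all((X | Y) in cls and (X - Y) in cls for X, Y in product(cls, repeat=2))


def is_algebra(cls, S):
    """A ring is an algebra if it also contains the whole set S."""
    return is_ring(cls) and S in set(cls)


S = frozenset({1, 2, 3, 4})
A = frozenset({1, 2})

# The simplest example from the text: A together with the empty set, A' and S.
simple_ring = {frozenset(), A, S - A, S}

# Two overlapping sets alone are not closed under union or difference.
not_a_ring = {A, frozenset({2, 3})}

print(is_ring(simple_ring), is_algebra(simple_ring, S), is_ring(not_a_ring))
```

The class consisting of the empty set and `A` alone is a ring without being an algebra, since it does not contain `S`.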

The primary purpose for introducing the notions of ring, $\sigma$-ring, and algebra of sets is to obtain sufficiently large classes of sets to be useful for a theory of integration. On the other hand, for the construction of a measure, the class of sets must be restricted so that an explicit construction of a measure is possible. This class must contain certain simple sets and for this reason we want to construct $\sigma$-rings of sets generated by a certain class of subsets. How this is done is now to be explained.

If $\mathcal{E}$ is any class of sets, we may define a unique ring called the ring generated by $\mathcal{E}$, denoted by $\mathcal{R}(\mathcal{E})$. It is defined as follows: Denote by $\mathcal{R}_i$ ($i \in I$ = some index set) the family of all the rings which contain the class $\mathcal{E}$. The intersection of all $\mathcal{R}_i$ is again a ring (Problem 9) and it defines the ring generated by $\mathcal{E}$:

$$\mathcal{R}(\mathcal{E}) = \bigcap_{i \in I} \mathcal{R}_i.$$

It is clearly the smallest ring which contains the class $\mathcal{E}$ of sets. Furthermore it is unique. The same procedure can be used for $\sigma$-rings.
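The definition above takes the intersection of all rings containing the generating class. For subsets of a finite set, the same smallest ring can also be reached from below, by repeatedly adding unions and differences until nothing new appears. The sketch below uses this closure procedure, which is a substitute for (not a transcription of) the intersection definition; the name `generated_ring` and the sample generators are assumptions made for the example.

```python
def generated_ring(generators):
    """Smallest class containing the generators and closed under union and difference.

    Assumes a nonempty, finite collection of finite sets.
    """
    ring = {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        for X in list(ring):
            for Y in list(ring):
                for Z in (X | Y, X - Y):
                    if Z not in ring:
                        ring.add(Z)
                        changed = True
    return ring


# Two overlapping generators inside S = {1, 2, 3, 4}.
ring = generated_ring([{1, 2}, {2, 3}])
for member in sorted(ring, key=lambda s: (len(s), sorted(s))):
    print(sorted(member))
```

In this example the generated ring consists of all subsets of {1, 2, 3}. The element 4 never enters, so the ring does not contain the full set {1, 2, 3, 4} and is not an algebra on it.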

We are now prepared for the most important notion of this section, the Borel set.

Let $S$ be the real line $S = \{x : -\infty < x < +\infty\}$.

For $\mathcal{E}$ we choose the set of all bounded semiclosed intervals of the form

$$[a, b) \equiv \{x : a \le x < b\}.$$

The Borel sets on the real line are the sets contained in the $\sigma$-ring $\mathcal{R}(\mathcal{E})$ generated by this class $\mathcal{E}$.

The choice of semiclosed intervals as starting sets might be somewhat surprising, but there is a technical reason for this. The finite unions of semiopen intervals are a ring (Problem 7), while this is not so for closed or open intervals. It is, however, a posteriori possible to show that the $\sigma$-ring generated by the open or the closed intervals is also the class of Borel sets. These properties are not difficult to prove, but they require certain technical devices which transcend the purpose of this book ([1], §15). We shall therefore state them without proof:

1) The class of all Borel sets is the $\sigma$-ring generated by all open or all closed sets.

2) The entire set $S$ is a Borel set. The $\sigma$-ring of Borel sets is thus an algebra.

3) Every countable set is a Borel set.

The first of these properties permits an extension of the notion of Borel sets to certain topological spaces. For instance, in a locally compact Hausdorff space one defines the Borel sets as the $\sigma$-ring generated by all closed subsets [1, Chapter 10].

With this general notion one can define Borel sets, for instance, on an n-dimensional Euclidean space, on a finite-dimensional manifold, such as a circle, a torus, or a sphere, and on many other much more complicated spaces. In many applications which we shall use we have to deal with the Borel sets for an arbitrary closed subset of the real line.

PROBLEMS

1. $(A')' = A$ for all sets $A$. [Note: In order to prove equality of two sets $A$ and $B$, one must prove separately $A \subset B$ and $B \subset A$.]

2. $A \subset B$ implies $B' \subset A'$.

3. $A$ and $B$ are disjoint if and only if $A \subset B'$, or equivalently $B \subset A'$.

4. $(A \cap B)' = A' \cup B'$.

5. $(A - B)' = A' \cup B$.

6. The class of all subsets of a set $S$ is a ring.

7. Let $S = \{x : -\infty < x < +\infty\}$; then the class of all subsets of $S$ of the form

$$\bigcup_{i=1}^{n} [a_i, b_i)$$

is a ring.

8. If a ring $\mathcal{R}$ of subsets of $S$ contains $S$, then $A \in \mathcal{R}$ implies $A' \in \mathcal{R}$, and vice versa.

9. The intersection of any family of rings ($\sigma$-rings) $\mathcal{R}_i$ is a ring ($\sigma$-ring).

1-2. THE MEASURE SPACE

A measure space is a set of elements $S$, together with a $\sigma$-ring $M$ of subsets of $S$ and a nonnegative function $\mu(A)$, defined on all subsets of the class $M$, which satisfies certain properties to be enumerated below.

The subsets of the $\sigma$-ring $M$ are called the measurable sets. We denote the measure space by $(S, M, \mu)$. Sometimes the explicit reference to the measurable sets and the measure $\mu$ is suppressed, and we then simply refer to $S$ as a measure space. Because $M$ is a ring, $\emptyset \in M$ and so the null set is always measurable. It is always possible to arrange that $S \in M$, too, so that $M$ is an algebra.

The conditions to be satisfied by the set-function $\mu(A)$ for $A \in M$ are as follows:

1) $0 \le \mu(A) < \infty$;

2) $\mu(\emptyset) = 0$;

3) For any disjoint sequence of sets $A_i \in M$ ($i = 1, 2, \ldots$),

$$\mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} \mu(A_i).$$

Property 1 may be relaxed to include infinite measures. Then we can only require $0 \le \mu(A) \le \infty$. (Most applications, however, will be for finite measures.) A set function which satisfies property (3) is called $\sigma$-additive.

The sets $A \in M$ with $\mu(A) = 0$ are called the sets of measure zero. A property which is true for all $x \in S$ except on a set of measure zero is said to be true "almost everywhere" (abbreviated a.e.).

Naturally the question arises whether any measures exist and what their properties are. Instead of entering into these rather difficult questions, we shall explicitly exhibit two types of measures which we shall use constantly: the Lebesgue measure and the Lebesgue-Stieltjes measure.

For the Lebesgue measure, the $\sigma$-ring of measurable sets consists of the Borel sets on the real line. On the half-open intervals $[a, b)$, with $a \le b$, we define

$$\mu\{[a, b)\} = b - a.$$

In the theory of measure, one proves that this set function on the half-open intervals has a unique extension to the Borel sets such that it satisfies conditions (1), (2), and (3). This extension will be called the Lebesgue measure on the real line. (Actually the measure can be further extended to the class of Lebesgue measurable sets, but we shall not need this extension explicitly.)


The Lebesgue-Stieltjes measure is a generalization of the Lebesgue measure obtained in the following way: Let $\rho(\lambda)$ be a real valued, nondecreasing function, defined for $-\infty \le \lambda \le +\infty$ and such that

$$\rho(\lambda + 0) = \lim_{\varepsilon \to +0} \rho(\lambda + \varepsilon) = \rho(\lambda).$$

For any semiopen interval $[a, b)$, ($a \le b$), we define

$$\mu\{[a, b)\} = \rho(b) - \rho(a).$$

The unique extension of this set function to the Borel sets on the real line is called the Lebesgue-Stieltjes measure on the real line.

For $\rho(\lambda) = \lambda$ this measure reduces to the Lebesgue measure. The greater generality of the Lebesgue-Stieltjes measure is especially convenient for discrete measures.

We obtain a discrete measure by letting $\rho(\lambda)$ be constant except for a countable number of discontinuities $\lambda_k$, where

$$\rho(\lambda_k) = \rho(\lambda_k - 0) + \rho_k.$$

The measure is then said to be concentrated at the points $\lambda_k$ with the weights $\rho_k$.
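A discrete measure is easy to experiment with numerically, because the measure of a set is just the sum of the weights of the points it contains (this is the statement of Problem 1 of this section). The sketch below is an assumption-laden illustration: the points, the weights, and the helper names `mu` and `interval` are invented for the example, and sets are represented by membership tests rather than as Borel sets.

```python
from fractions import Fraction

# Hypothetical discrete measure: points lambda_k mapped to their weights rho_k.
weights = {
    Fraction(0): Fraction(1, 2),
    Fraction(1): Fraction(1, 3),
    Fraction(2): Fraction(1, 6),
}


def mu(contains, weights=weights):
    """Measure of the set described by the membership test `contains`."""
    return sum((w for lam, w in weights.items() if contains(lam)), Fraction(0))


def interval(a, b):
    """Membership test for the half-open interval [a, b)."""
    return lambda lam: a <= lam < b


# Additivity on the disjoint intervals [0, 1) and [1, 3), whose union is [0, 3).
left = mu(interval(0, 1))
right = mu(interval(1, 3))
whole = mu(interval(0, 3))
assert left + right == whole

# An interval containing none of the points lambda_k has measure zero.
assert mu(interval(5, 6)) == 0

print(left, right, whole)
```

Exact rational arithmetic from the standard `fractions` module is used so that the additivity check is not affected by floating-point rounding.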

We say two measures $\mu_1$ and $\mu_2$ are comparable if they are defined on the same $\sigma$-ring of measurable sets. Thus all measures on the Borel sets are comparable.

In the following we shall examine the relation between different comparable measures. The main point is the observation that comparable measures can be partially ordered. In the following we shall assume all measures to be comparable without repeating it.

A measure $\mu_1$ is said to be inferior to a measure $\mu_2$ if all sets of $\mu_2$-measure zero are also of $\mu_1$-measure zero. $\mu_1$ is also called absolutely continuous with respect to $\mu_2$. We use the notation $\mu_1 \prec \mu_2$. Thus $\mu_1 \prec \mu_2$ if and only if $\mu_2(A) = 0$ implies $\mu_1(A) = 0$. Two measures $\mu_1$, $\mu_2$ are said to be equivalent if both

$$\mu_1 \prec \mu_2 \quad \text{and} \quad \mu_2 \prec \mu_1.$$

The two measures have then the same null sets. We write for this relation $\mu_1 \sim \mu_2$ and we note that it is an equivalence relation (Problem 2).

Two measures $\mu_1$ and $\mu_2$ are said to be mutually singular if there exist two disjoint sets $A$ and $B$ such that $A \cup B = S$ and such that, for every measurable set $X \subset S$,

$$\mu_1(A \cap X) = \mu_2(B \cap X) = 0.$$

Examples of mutually singular measures are easily constructed (Problem 8).
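For discrete measures these three relations reduce to statements about the points that carry nonzero weight, in line with Problems 3 and 8 of this section. The following sketch assumes that each measure is given as a dictionary from points to weights; the function names and the sample measures are hypothetical.

```python
def support(weights):
    """Points that carry strictly positive weight."""
    return {lam for lam, w in weights.items() if w > 0}


def is_inferior(mu1, mu2):
    """mu1 is inferior to mu2: every point charged by mu1 is also charged by mu2."""
    return support(mu1) <= support(mu2)


def are_equivalent(mu1, mu2):
    """Both measures have the same null sets."""
    return is_inferior(mu1, mu2) and is_inferior(mu2, mu1)


def are_mutually_singular(mu1, mu2):
    """No point is charged by both measures."""
    return support(mu1).isdisjoint(support(mu2))


mu_a = {0: 0.5, 1: 0.5}
mu_b = {0: 0.25, 1: 0.25, 2: 0.5}
mu_c = {0: 0.9, 1: 0.1}
mu_d = {3: 1.0}

print(is_inferior(mu_a, mu_b), is_inferior(mu_b, mu_a))
print(are_equivalent(mu_a, mu_c), are_mutually_singular(mu_a, mu_d))
```

The reduction to supports is specific to discrete measures; for general measures the null sets themselves have to be compared.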

If a measure $\mu$ is absolutely continuous with respect to Lebesgue measure, it is simply called absolutely continuous.


PROBLEMS

1. Let $\mu$ be a discrete Lebesgue-Stieltjes measure. For every Borel set $A$, $\mu(A) = \sum \rho_k$, where the sum extends over all $\lambda_k \in A$.

2. The relation $\sim$ is an equivalence relation; this means it is reflexive, symmetrical, and transitive:

(a) $\mu \sim \mu$;

(b) $\mu_1 \sim \mu_2$ implies $\mu_2 \sim \mu_1$;

(c) $\mu_1 \sim \mu_2$ and $\mu_2 \sim \mu_3$ implies $\mu_1 \sim \mu_3$.

3. Two discrete measures $\mu_1$ and $\mu_2$ on the real line are equivalent if and only if they are concentrated in the same points and their respective weights are nonzero.

4. Every Lebesgue-Stieltjes measure on the real line can be decomposed uniquely into a discrete part and a continuous part, corresponding to the decomposition of the nondecreasing function $\rho(\lambda) = \rho_d(\lambda) + \rho_c(\lambda)$ into a discrete and a continuous function. The discrete part $\rho_d(\lambda)$ is constant except on a finite or countably infinite set of points where $\rho_d(\lambda)$ is discontinuous.

5. Every continuous nondecreasing function $\rho_c(\lambda)$ can be decomposed into an absolutely continuous and a singular function $\rho_c(\lambda) = \rho_a(\lambda) + \rho_s(\lambda)$. The function $\rho_s(\lambda)$ is singular in the sense that its derivative $\rho_s'(\lambda)$ exists almost everywhere and is equal to zero, yet $\rho_s(\lambda)$ is continuous and nondecreasing ([2], Section 25).

*6. Theorem (Lebesgue). A finite nondecreasing function $\rho(\lambda)$ (or, more generally, a function of bounded variation) possesses a finite derivative a.e. ([2], Section 4).

7. Theorem (Lebesgue). The necessary and sufficient condition that a finite, continuous and nondecreasing function is equal to the integral of its derivative is that it is absolutely continuous ([2], Section 25).

8. Let $\mu_1$ be a discrete measure concentrated at the points $\lambda_i^{(1)}$ and $\mu_2$ another discrete measure concentrated at the points $\lambda_k^{(2)}$. Then the two measures are singular with respect to one another if and only if $\lambda_i^{(1)} \ne \lambda_k^{(2)}$ for all pairs of indices $i$ and $k$.

1-3. MEASURABLE AND INTEGRABLE FUNCTIONS

The theory of measure spaces permits a definition of the integral of functions which is much more general than the so-called Riemann integral usually introduced in elementary calculus. This more general type of integral, to be defined now, is absolutely indispensable for the definition of Hilbert space and other function spaces used constantly in quantum mechanics.

We start with the definition of a function. A function $f$ is a correspondence between the elements of a set $D_f$, called the domain of $f$, and a set $\Delta_f$, called the range of $f$, such that to every $x \in D_f$ there corresponds exactly one element $f(x) \in \Delta_f$. The elements of $D_f$ are called the arguments of the function.


We remark especially that we define here what is sometimes called a single-valued function. It is possible (and, for a systematic exposition, advisable) to treat multivalued functions by reducing them to single-valued ones. In analytic function theory this procedure leads in a natural manner to the theory of Riemann surfaces.

We emphasize, too, that a function has three determining elements: a domain, a range, and a rule of correspondence $x \to f(x)$. It will sometimes be necessary to distinguish two functions which have different domains, although in a common part of these domains the value of the two functions may agree.

Fig. 1-3 The inverse of a function $f(x)$.

The sets $D_f$ and $\Delta_f$ may be quite general sets, for instance, subsets of real or complex numbers. In that case we obtain real or complex functions. But more often they will consist of points in a topological space, functions in a function space, or even subsets of sets. In the latter case we speak also of set functions.

An example of a set function of great importance is the following:

Let $B$ be a subset of the range $\Delta_f$; we define the inverse image $f^{-1}(B)$ (see Fig. 1-3) by setting

$$f^{-1}(B) = \{x : f(x) \in B\}.$$

(Note that the inverse image is not the inverse function. As a function it is defined on the subsets of $\Delta_f$ and its values are subsets of $D_f$.)

If the correspondence $f$ between $D_f$ and $\Delta_f$ is one-to-one, we can define the inverse function, also denoted by $f^{-1}$ but with arguments $y \in \Delta_f$. The inverse function $f^{-1}(y)$ has domain $D_{f^{-1}} = \Delta_f$, range $\Delta_{f^{-1}} = D_f$, and satisfies

$$f^{-1}(f(x)) = x \quad \text{for all } x \in D_f$$

and

$$f(f^{-1}(y)) = y \quad \text{for all } y \in \Delta_f.$$
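The difference between the inverse image and the inverse function shows up already for a function that is not one-to-one. The sketch below is illustrative only: the finite domain, the squaring function, and the helper name `inverse_image` are assumptions chosen for the example.

```python
def inverse_image(f, domain, B):
    """Inverse image of B: all x in the domain with f(x) in B."""
    return frozenset(x for x in domain if f(x) in B)


domain = frozenset(range(-3, 4))   # the integers -3, ..., 3


def square(x):
    return x * x


# square is not one-to-one on this domain, so it has no inverse function,
# but the inverse image of any set of values is still well defined.
pre_1_4 = inverse_image(square, domain, {1, 4})
pre_2 = inverse_image(square, domain, {2})       # no x has square 2

# The inverse image respects unions of sets of values.
B1, B2 = {0, 1}, {1, 9}
assert inverse_image(square, domain, B1 | B2) == (
    inverse_image(square, domain, B1) | inverse_image(square, domain, B2)
)

print(sorted(pre_1_4), sorted(pre_2))
```

The inverse image takes sets of values to sets of arguments, which is why it exists even when several arguments share the same value.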


Although, strictly speaking, these two identity functions should be distinguished (since in one case the domain is $D_f$, in the other $\Delta_f$), they are usually considered identical, an assumption which is entirely correct only if $D_f = \Delta_f$.

For the rest of this section we shall consider real-valued functions over a measure space. $\Delta_f$ is then a subset of the real line $\mathbb{R}$.

Let $(S, M, \mu)$ be a measure space and $f$ a real-valued function with domain $D_f = S$. We call the function $f$ measurable on $S$ if for every Borel set $B$ on the real line, the set $f^{-1}(B)$ is measurable.

The simplest examples of measurable functions are obtained from the set functions $\chi_A(x)$ defined by

$$\chi_A(x) = \begin{cases} 1 & \text{for } x \in A, \\ 0 & \text{for } x \notin A. \end{cases}$$

Such a function is measurable if and only if the set $A$ is measurable. Indeed we find immediately that, for a Borel set $B$ which does not contain 0,

$$\chi_A^{-1}(B) = \begin{cases} A & \text{if } 1 \in B, \\ \emptyset & \text{if } 1 \notin B, \end{cases}$$

while for $0 \in B$ the complement of $A$ is adjoined to these sets, so that $\chi_A(x)$ is measurable if $A \in M$.

There is a resemblance between measurable functions in a measure space and continuous functions in a topological space $S$. A topological space is defined by the class of all open subsets of $S$. A function $f(x)$ from $S$ onto $\Delta_f \subset \mathbb{R}$ is then said to be continuous if the inverse image $f^{-1}(B)$ of any open set is an open set in $S$. One obtains the more general class of measurable functions by replacing the word "open" by "measurable" in the above definition of continuous function. The class is more general because (at least in all the measure spaces which we consider) the open sets are measurable sets.

One of the most important problems in measure theory is to identify the class of measurable functions over a measure space. An efficient way of doing this is to construct the measurable functions from certain simple functions by the operations of sums, products, and the passage to the limit. In what follows we shall describe this process, without giving proofs.
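As an illustration of the measurability condition, one can test it by brute force on a finite set. The sketch below assumes a hypothetical four-point space whose measurable sets are all unions of three given blocks; for a function with finitely many values it is enough to check the inverse images of single values, since every other inverse image is a finite union of these.

```python
from itertools import combinations

def generated_ring(blocks):
    """All unions of the given disjoint blocks, including the empty set."""
    ring = set()
    for r in range(len(blocks) + 1):
        for combo in combinations(blocks, r):
            ring.add(frozenset().union(*combo))
    return ring

def is_measurable(f, S, ring):
    """Check that every level set f^{-1}({c}) belongs to the ring."""
    values = {f(x) for x in S}
    return all(frozenset(x for x in S if f(x) == c) in ring for c in values)

def indicator(A):
    """The function chi_A: 1 on A, 0 elsewhere."""
    return lambda x: 1 if x in A else 0

# Hypothetical measure space (assumption): S has four points, and the
# measurable sets are generated by the blocks {1, 2}, {3}, {4}.
S = {1, 2, 3, 4}
M = generated_ring([frozenset({1, 2}), frozenset({3}), frozenset({4})])

chi_good = indicator({1, 2, 3})   # A is a union of blocks
chi_bad = indicator({1})          # A splits the block {1, 2}
```

Here `chi_good` passes the test and `chi_bad` fails it, in line with the statement that $\chi_A$ is measurable exactly when $A$ is.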

First, one observes that if two functions $f$ and $g$ are measurable, then the functions $f + g$ and $fg$, defined by

$$(f + g)(x) = f(x) + g(x),$$

$$(fg)(x) = f(x)g(x),$$

are also measurable. From this it follows that the so-called simple functions defined by

$$f(x) = \sum_{i=1}^{n} \alpha_i \chi_{A_i}(x),$$

with $\alpha_i$ real constants and $A_i \in M$, are also measurable. We define the integral of a simple function as

$$\int f \, d\mu = \sum_{i=1}^{n} \alpha_i \mu(A_i).$$

It is a finite number since the $\mu(A_i)$ are all finite (we admit only finite measures).
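The integral of a simple function is a finite sum and can be evaluated directly. The following sketch uses a hypothetical finite measure on four points (the weights and the two simple functions are assumptions chosen so that the arithmetic is exact in binary floating point).

```python
def integrate_simple(terms, mu):
    """Integral of sum_i alpha_i * chi_{A_i}: returns sum_i alpha_i * mu(A_i).

    terms: list of pairs (alpha_i, A_i); mu: dict mapping each point to its weight.
    """
    return sum(alpha * sum(mu[x] for x in A) for alpha, A in terms)

def evaluate(terms, x):
    """Value of the simple function at the point x."""
    return sum(alpha for alpha, A in terms if x in A)

# Hypothetical finite measure (assumption): total mass 1 on four points.
mu = {1: 0.5, 2: 0.25, 3: 0.125, 4: 0.125}

f_terms = [(2.0, {1, 2}), (-1.0, {3})]   # f = 2*chi_{1,2} - chi_{3}
g_terms = [(1.0, {2, 3, 4})]             # g = chi_{2,3,4}

integral_f = integrate_simple(f_terms, mu)
integral_sum = integrate_simple(f_terms + g_terms, mu)   # terms of f + g
```

Concatenating the term lists represents $f + g$, so the additivity of the integral for simple functions can be read off from the sum itself.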

Next we consider sequences of functions $f_n$ $(n = 1, 2, \ldots)$. We say that such a sequence converges in the measure to a limit function $f$ if, for every $\varepsilon > 0$,

$$\lim_{n \to \infty} \mu(\{x : |f_n(x) - f(x)| \geq \varepsilon\}) = 0.$$

The notion of measurability is more general than that of integrability, the latter being restricted by the condition that there exist a finite-valued integral. Since simple functions are not only measurable but also integrable, we can define the integrable functions as follows: A finite-valued function $f$ on a measure space $(S, M, \mu)$ is integrable if there exists a sequence of simple functions $f_n$ such that $f_n$ tends to $f$ in the measure. It follows then that the numbers $\int f_n \, d\mu$ tend to a limit which defines the integral of $f$:

$$\int f \, d\mu = \lim_{n \to \infty} \int f_n \, d\mu.$$

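Various theorems then permit the usual operations with integrals. For instance, if $f_1$ and $f_2$ are integrable, so are $(f_1 + f_2)$ and $\alpha f_1$ for any real constant $\alpha$, and

$$\int (f_1 + f_2) \, d\mu = \int f_1 \, d\mu + \int f_2 \, d\mu \quad \text{and} \quad \int \alpha f_1 \, d\mu = \alpha \int f_1 \, d\mu.$$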

The integral defined here corresponds to what in the elementary integration theory is called the definite integral. There is also a concept which generalizes the usual notion of the indefinite integral. In the usual definition, the indefinite integral depends on the lower and upper limit of the integration variable. It is thus an interval function.

More generally we may define a set function $\nu(A)$ for all $A \in M$ by setting

$$\nu(A) = \int_A f \, d\mu \equiv \int \chi_A f \, d\mu.$$


If the integrable function $f$ is positive, the set function $\nu(A)$ is a new measure defined on the $\sigma$-ring $M$. It is easy to verify that the two measures $\nu$ and $\mu$ are equivalent. (The converse of this is an important theorem which we shall discuss in the next section.)

If the measure space $S$ is the real line, $M$ are the Borel sets, and $\mu$ the Lebesgue measure, then the integral is called the Lebesgue integral. We write for it

$$\int f(x) \, dx.$$

Some slight modifications are necessary in some of the definitions and theorems quoted above, since Lebesgue measure is not a finite measure.

Similarly we obtain the Lebesgue-Stieltjes integral if we define the measure corresponding to some nondecreasing function $\rho(x)$. We write for this integral

$$\int f \, d\rho(x).$$

The notion of measurable and integrable functions is easily extended to complex-valued functions by defining such a function as integrable if both real and imaginary parts are integrable. The integral of the function $f = f_1 + i f_2$ is then defined by

$$\int f \, d\mu = \int f_1 \, d\mu + i \int f_2 \, d\mu.$$

PROBLEMS

1. A continuous function is measurable.

2. There exist measurable functions which are not continuous.

3. A function $f$ has an inverse if and only if $f(x_1) = f(x_2)$ implies $x_1 = x_2$.

4. If two real-valued functions $f$ and $g$ are integrable, then $cf$, for a constant $c$, is integrable and $(f + g)$ is integrable.

5. If $f$ is integrable, then the positive and negative parts of $f$ defined by

$$f^+ = \tfrac{1}{2}(|f| + f)$$

and

$$f^- = \tfrac{1}{2}(|f| - f)$$

are also integrable.

6. If the real-valued function $f$ is integrable, then the indefinite integral of $f$ defines a finite measure on the class of all measurable sets.

7. The measure of Problem 6 is absolutely continuous with respect to the measure which is used for the definition of the integral.


1-4. THE THEOREM OF RADON-NIKODYM

In this section we shall consider the relations between comparable measures on a fixed measure space. This relation is of the greatest importance in the applications of measure theory. We shall neither give the most general form of the theorem nor prove it; but we shall illustrate it with an example which renders it sufficiently plausible.

Let us begin with the example: Let $S$ consist of all the positive integers, $S = \{1, 2, \ldots, n, \ldots\}$, and let $\mu_n$ be a set of positive numbers such that

$$\sum_{n=1}^{\infty} \mu_n < \infty.$$

The $\sigma$-ring of measurable sets consists of all the subsets of $S$, and the measure $\mu(A)$ is defined by

$$\mu(A) = \sum_{i \in A} \mu_i.$$

Let $\sigma_n$ be another sequence of nonnegative numbers such that

$$\sum_{n=1}^{\infty} \sigma_n < \infty$$

and define another measure

$$\nu(A) = \sum_{i \in A} \sigma_i.$$

The only set of $\mu$-measure zero is the null set $\emptyset$, which is of course also of $\nu$-measure zero. Thus $\nu$ is absolutely continuous with respect to $\mu$: $\nu \prec \mu$. We shall define the function $f(n)$ by setting $f(n) = \sigma_n / \mu_n$; then we find that

$$\nu(A) = \sum_{i \in A} f(i) \mu_i = \int_A f \, d\mu.$$

Thus we have established, in this particular case, that if $\nu \prec \mu$, there exists a nonnegative measurable function $f$ on $S$ such that for every measurable set $A$ we have

$$\nu(A) = \int_A f \, d\mu.$$

In the special case which we have discussed, the function $f$ is not only measurable but also integrable; but this is the case if and only if the measure $\nu$ is finite, as one can easily verify. The generalization of this property to any measure is the content of the following theorem.
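The discrete example can be checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions: the space is truncated to the first 30 integers, and the particular weights $2^{-n}$ and $3^{-n}$ are hypothetical choices, not taken from the text.

```python
from fractions import Fraction

N = 30  # truncation of S = {1, 2, ...}; an assumption to keep the example finite

# Hypothetical weights: mu_n > 0 and sigma_n >= 0, both summable.
mu = {n: Fraction(1, 2 ** n) for n in range(1, N + 1)}
sigma = {n: Fraction(1, 3 ** n) for n in range(1, N + 1)}

# The density f(n) = sigma_n / mu_n of nu with respect to mu.
f = {n: sigma[n] / mu[n] for n in mu}

def measure(A, weights):
    """Measure of a set A for a discrete measure given by point weights."""
    return sum((weights[n] for n in A), Fraction(0))

def integral_over(A, func, weights):
    """Integral of func over A with respect to the discrete measure."""
    return sum((func[n] * weights[n] for n in A), Fraction(0))

A = {1, 4, 7, 20}
nu_A = measure(A, sigma)
integral_A = integral_over(A, f, mu)
```

Because all $\mu_n$ are strictly positive, the ratio $\sigma_n/\mu_n$ is always defined; that is the concrete content of absolute continuity in this example.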

Theorem (Radon-Nikodym): The necessary and sufficient condition that the measure $\nu$ on the real line $S$ be absolutely continuous with respect to the measure $\mu$ is that there exists a uniquely determined bounded nonnegative measurable function $f(x)$ with domain $S$ such that

$$\nu(A) = \int_A f \, d\mu \quad \text{for every } A \in M.$$

If, furthermore, the two measures $\nu$ and $\mu$ are equivalent, then the function $f$ is positive a.e., and

$$\mu(A) = \int_A \frac{1}{f} \, d\nu.$$

If both measures are finite then both $f$ and $1/f$ are integrable.

The function $f$ is called the Radon-Nikodym derivative and it is often written as $f = d\nu/d\mu$. The notation underlines the analogy of this concept with the ordinary derivative of a nondecreasing function. The analogy is further enhanced by considering the Lebesgue-Stieltjes measure on the real line determined by a nondecreasing function $\rho(x)$. If $\sigma(x)$ is another measure of this kind, then the function $f$ of the Radon-Nikodym theorem is given by

$$f(x) = \frac{d\rho}{d\sigma}.$$

If $\sigma(x) = x$, the measure induced by it is the Lebesgue measure, and we may write $f(x) = d\rho/dx$. In this case the function $f(x)$ coincides (a.e.) with the usual derivative of $\rho(x)$.

PROBLEMS

1. Let $S = \{1, 2, \ldots, n, \ldots\}$ and $\mu(\{n\}) = \mu_n > 0$. The $\sigma$-ring generated by the sets $\{n\}$, with $(n = 1, 2, \ldots)$, consists of all the subsets of $S$, and the measure $\mu$ on the sets $\{n\}$ has a unique extension to this ring given by

$$\mu(A) = \sum_{i \in A} \mu_i.$$

2. If $\nu$ is another measure on the measure space of Problem 1, then there exists a positive function $f(n)$, with $n = 1, 2, \ldots$, such that

$$\nu(A) = \int_A f \, d\mu$$

and another positive function $g(n)$ such that

$$\mu(A) = \int_A g \, d\nu.$$

Furthermore, $f(n) = 1/g(n)$. If $\mu$ and $\nu$ are finite measures, both functions $f$ and $g$ are integrable.

3. If $\mu_1, \mu_2, \mu_3$ are three measures such that $\mu_1 \prec \mu_2 \prec \mu_3$, then

$$\frac{d\mu_1}{d\mu_2} \, \frac{d\mu_2}{d\mu_3} = \frac{d\mu_1}{d\mu_3}.$$

1-5. FUNCTION SPACES

We consider the set of all complex, integrable functions on a measure space $(S, M, \mu)$ and denote it by $L^1$. If $f \in L^1$ then any scalar multiple $cf$ is also in $L^1$. Likewise, if $f_1$ and $f_2 \in L^1$ then $f_1 + f_2 \in L^1$. The functions of the class $L^1$ are thus a linear manifold.

For any $f \in L^1$ we define the norm as

$$\|f\| = \int |f| \, d\mu$$

and for any pair $f, g \in L^1$ we define a distance function $\rho(f, g)$ by setting

$$\rho(f, g) = \|f - g\|.$$

Two functions $f$ and $g$ for which $\rho(f, g) = 0$ are equal except possibly on a set of measure zero. Two such functions are said to be equivalent. The classes of equivalent functions form a metric space, with the metric defined by the distance function $\rho(f, g)$.

An important property of the space $L^1$ is its completeness. A space is said to be complete if every fundamental sequence has a limit in the space. This means that for every sequence $f_n$ with the property $\|f_n - f_m\| \to 0$ as $n, m \to \infty$, there exists an integrable function $f$ such that

$$\|f_n - f\| \to 0 \quad \text{for } n \to \infty.$$

Of more importance in quantum mechanics is the space $L^2(S, M, \mu) \equiv L^2$, which consists of all those measurable functions on $S$ which are square-integrable. Thus $f \in L^2$ if

$$\|f\|^2 \equiv \int |f|^2 \, d\mu < \infty.$$

Just as in $L^1$, we can define a distance function $\rho(f, g) = \|f - g\|$.

A more general concept is the scalar product $(f, g)$ defined for any pair of functions $f$ and $g$ in $L^2$. It is defined by the formula

$$(f, g) = \int f^* g \, d\mu,$$

where $f^*$ is the complex conjugate of $f$. With this definition we have

$$\|f\|^2 = (f, f).$$


The space $L^2$ is also complete. Thus if $f_n$ is a fundamental sequence, there exists a function $f$ such that $\|f_n - f\| \to 0$. The space $L^2$ is the basic mathematical object for quantum mechanics and we shall devote the next three chapters to its study.

PROBLEMS

1. The space of all sequences $\{x_n\}$ of complex numbers satisfying the condition

$$\sum_{n=1}^{\infty} |x_n| < \infty$$

is an $L^1$-space.

2. The space of all sequences $\{x_n\}$ of complex numbers satisfying the condition

$$\sum_{n=1}^{\infty} |x_n|^2 < \infty$$

is an $L^2$-space.

3. Every element in $L^1$ of Problem 1 is contained in $L^2$ and there exist elements in $L^2$ which are not in $L^1$, so that $L^1 \subset L^2$.

4. In general neither $L^1 \subset L^2$ nor $L^2 \subset L^1$ is true.

REFERENCES

1. P. R. HALMOS, Measure Theory. Princeton: Van Nostrand (1950).

2. F. RIESZ AND B. SZ.-NAGY, Functional Analysis. New York: F. Ungar Publ. Co. (1955).

3. N. DUNFORD AND J. T. SCHWARTZ, Linear Operators (Part I, especially Chapter III). New York: Interscience Publishers (1958).


CHAPTER 2

THE AXIOMS OF HILBERT SPACE

I think we may safely say that the studies preliminary to the construction of a great theory should be at least as deliberate and thorough as those that are preliminary to the building of a dwelling-house.

CHARLES S. PEIRCE

In this chapter we introduce the basic properties of Hilbert space in axiomatic form. The axioms are given in four groups in Section 2-1. The comments on the axioms in Section 2-2 introduce such basic material as linear manifolds, dimension, Schwarz's and Minkowski's inequalities, strong and weak convergence, and orthonormal systems. In Section 2-3 we discuss various realizations of the abstract Hilbert space. We devote the whole of Section 2-4 to the important distinction between manifolds and subspaces. We also discuss the decomposition theorem with respect to a subspace. In a final section (2-5) we introduce the notion of the lattice of subspaces, a notion which will play a fundamental role throughout this book.

2-1. THE AXIOMS OF HILBERT SPACE

The abstract Hilbert space $\mathcal{H}$ is a collection of objects called vectors, denoted by $f, g, \ldots$, which satisfy certain axioms to be enumerated below.

The axioms fall into four groups, each of which refers to a different structure property of Hilbert space. Group 1 expresses the fact that $\mathcal{H}$ is a linear vector space over the field of complex numbers. Group 2 defines the scalar product and the metric. Group 3 expresses separability, and group 4, completeness, of the space.

1. $\mathcal{H}$ is a linear vector space with complex coefficients. This means that to every pair of elements $f, g \in \mathcal{H}$ there is associated a third $(f + g) \in \mathcal{H}$. Furthermore to every element $f \in \mathcal{H}$ and every complex number $\lambda$ there corresponds another element $\lambda f \in \mathcal{H}$. The following rules are postulated:

$$f + g = g + f;$$
$$(f + g) + h = f + (g + h);$$
$$\lambda(f + g) = \lambda f + \lambda g;$$
$$(\lambda + \mu) f = \lambda f + \mu f;$$
$$\lambda(\mu f) = (\lambda \mu) f;$$
$$1 \cdot f = f.$$

There exists a unique zero vector $0$ such that for all $f \in \mathcal{H}$

$$0 + f = f;$$
$$0 \cdot f = 0.$$

2. There exists a strictly positive scalar product in $\mathcal{H}$. The scalar product $(f, g)$ is a function of pairs of elements $f, g \in \mathcal{H}$ with complex values and satisfying the following conditions:

$$(f, g) = (g, f)^*;$$
$$(f, g + h) = (f, g) + (f, h);$$
$$(f, \lambda g) = \lambda (f, g) \quad \text{for all complex } \lambda;$$
$$\|f\|^2 \equiv (f, f) > 0 \quad \text{unless } f = 0.$$

3. The space $\mathcal{H}$ is separable. This means that there exists a sequence $f_n \in \mathcal{H}$ $(n = 1, 2, \ldots)$ with the property that it is dense in $\mathcal{H}$ in the following sense: For any $f \in \mathcal{H}$ and any $\varepsilon > 0$, there exists at least one element $f_n$ of the sequence such that

$$\|f - f_n\| < \varepsilon.$$

4. The space $\mathcal{H}$ is complete. Any sequence $f_n$ with the property

$$\lim_{n, m \to \infty} \|f_n - f_m\| = 0$$

defines a unique limit $f \in \mathcal{H}$ such that

$$\lim_{n \to \infty} \|f - f_n\| = 0.$$
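The conditions of group 2 can be checked mechanically in the finite-dimensional space of complex triples. The sketch below assumes the usual scalar product on $C^3$, taken conjugate-linear in the first argument so that $(f, \lambda g) = \lambda (f, g)$; the sample vectors are hypothetical and have integer components so that the floating-point arithmetic is exact.

```python
def inner(f, g):
    """Scalar product on C^n: conjugate-linear in f, linear in g."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def add(f, g):
    """Vector addition f + g."""
    return [a + b for a, b in zip(f, g)]

def scale(lam, f):
    """Multiplication of the vector f by the complex number lam."""
    return [lam * a for a in f]

# Hypothetical sample vectors and scalar (assumptions for illustration).
f = [1 + 2j, 3j, -1 + 0j]
g = [2 - 1j, 1 + 1j, 4j]
h = [0j, 1 - 2j, 3 + 0j]
lam = 2 - 3j

hermitian_ok = inner(f, g) == inner(g, f).conjugate()
additive_ok = inner(f, add(g, h)) == inner(f, g) + inner(f, h)
linear_ok = inner(f, scale(lam, g)) == lam * inner(f, g)
norm_squared = inner(f, f)
```

A consequence of the first and third conditions is that a scalar in the first argument comes out complex-conjugated, $(\lambda f, g) = \lambda^* (f, g)$.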

2-2. COMMENTS ON THE AXIOMS

The axioms of groups 1 and 2 describe the Hilbert space as a linear vector space with scalar product. The special choice of the complex numbers for the field of the coefficients will be justified later from a physical point of view. Here we remark that one can define Hilbert spaces over the reals and the quaternions (or any other field), and they share many of the properties which one finds for the Hilbert space with complex numbers.

In the axioms of group 2 we require positive definiteness of the scalar product. There is no difficulty in defining spaces with indefinite scalar products. We shall, however, need only the definite scalar product. This too will be justified from a physical point of view.

Axioms 3 and 4 are restrictions on the size of the space in opposite directions. Here one should say that axiom 4 is in some sense superfluous since it can always be fulfilled by a standard procedure, called the completion of the space. It is the same kind of procedure used in the construction of the real numbers from a dense subset such as the rational numbers. It is the axiom which permits the notion of continuity in Hilbert space.

Axiom 3, on the other hand, is an important restriction on the size of the space. If it is omitted, one obtains nonseparable spaces. These will not be used in this book since their physical meaning is not yet understood, although many of the properties of separable spaces can be transferred to the nonseparable ones.

The reader may have noticed the absence of a dimension axiom. This axiom was omitted intentionally, since it is convenient to have a definition which is valid for finite- as well as for infinite-dimensional spaces.

In order to define the notion of the dimension, one needs the notion of linear independence. A finite or infinite sequence of vectors $f_n$ is called linearly independent if a relation such as $\sum_n \lambda_n f_n = 0$ implies $\lambda_n = 0$ for all $n$. If this is not the case we shall call the sequence linearly dependent.

The maximal number of linearly independent vectors in $\mathcal{H}$ is called the dimension of $\mathcal{H}$. Thus we say $\mathcal{H}$ has the dimension $n = 1, 2, \ldots$, if there exists a set of $n$ linearly independent elements in $\mathcal{H}$ but every set of $(n + 1)$ vectors is linearly dependent. If $n = \infty$, then we obtain what is usually called the Hilbert space.

An important question concerns the independence of the axioms. If the dimension $n < \infty$, then Axioms 3 and 4 are consequences of the others, but not for $n = \infty$.

If $\{f_\nu\}$ is a linearly independent sequence, then the set of all elements of the form

$$\sum_{\nu=1}^{n} \lambda_\nu f_\nu$$

is an example of a linear manifold. The elements of a linear manifold satisfy the axioms in groups 1 and 2 but not necessarily Axioms 3 and 4. The number $n$ of elements in the set $\{f_\nu\}$ is called the dimension of the linear manifold. A formal definition and a more systematic discussion of linear manifolds will be given in Section 2-4.


If $S$ is an arbitrary set of vectors in $\mathcal{H}$, we may consider the smallest linear manifold containing $S$. Such a linear manifold always exists and is unique. We call this the linear manifold spanned by $S$.

The positive definiteness of the scalar product implies the important inequality of Schwarz:

$$|(f, g)| \leq \|f\| \, \|g\|.$$

The proof of this is obtained immediately from the remark that the quantity $\|f + \alpha g\|^2$ is nonnegative for all complex numbers $\alpha$, and for $g \neq 0$ its minimum is equal to†

$$\|f\|^2 - \frac{|(f, g)|^2}{\|g\|^2}.$$

One also sees from this proof that the equality sign holds if and only if $f$ and $g$ are linearly dependent.

An easy corollary is Minkowski's inequality (also called the triangle inequality):

$$\|f + g\| \leq \|f\| + \|g\|.$$
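The minimum used in the proof of the Schwarz inequality can be checked numerically. The sketch below works in $C^3$ with hypothetical sample vectors; the minimizing value $\alpha_0 = -(g, f)/\|g\|^2$ is obtained by completing the square and is stated here as part of the illustration, since the text itself only quotes the minimum.

```python
import random

def inner(f, g):
    """Scalar product on C^n: conjugate-linear in f, linear in g."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def norm_sq(f):
    """Squared norm (f, f), returned as a real number."""
    return inner(f, f).real

def combo(f, alpha, g):
    """The vector f + alpha * g."""
    return [a + alpha * b for a, b in zip(f, g)]

# Hypothetical sample vectors (assumptions for illustration).
f = [1 + 2j, 3j, -1 + 0j]
g = [2 - 1j, 1 + 1j, 4j]

alpha0 = -inner(g, f) / norm_sq(g)          # candidate minimizer
minimum = norm_sq(combo(f, alpha0, g))      # value of ||f + alpha g||^2 there
predicted = norm_sq(f) - abs(inner(f, g)) ** 2 / norm_sq(g)

# Random trial values of alpha; none should give a smaller value.
rng = random.Random(0)
trials = [complex(rng.uniform(-3, 3), rng.uniform(-3, 3)) for _ in range(200)]
smallest_trial = min(norm_sq(combo(f, a, g)) for a in trials)
```

Since the minimum is a squared norm it cannot be negative, and rearranging `predicted >= 0` gives the Schwarz inequality; the triangle inequality then follows by expanding $\|f + g\|^2$.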

Hilbert space is not only a vector space but also a topological space. This means we have a notion of convergence (or equivalently the notion of open subsets). In order to define convergence we may use either the norm or the scalar product. When using the norm, we say a sequence $f_n$ converges in the norm to $f$ if

$$\lim_{n \to \infty} \|f_n - f\| = 0.$$

In this case we speak of strong convergence.

If we use the scalar product we may define a weak convergence as follows: The sequence $f_n$ converges weakly towards $f$ if, for every $g$,

$$\lim_{n \to \infty} (f_n, g) = (f, g).$$

The two kinds of convergence coalesce in finite dimensions, but not in infinite dimensions. A sequence which converges weakly toward a limit need not converge strongly toward anything (Problem 10).
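A numerical hint of the difference between the two kinds of convergence can be obtained with the unit vectors $e_n$ of a sequence space. The sketch below is only suggestive: it tests a single hypothetical vector $g = \{1/\nu\}$ rather than every $g$, and it stores vectors sparsely so that no truncation of the space is needed.

```python
def unit_vector(n):
    """The sequence e_n with a single component 1 at position n, stored sparsely."""
    return {n: 1.0}

def inner_with(sparse_f, g):
    """(f, g) for a finitely supported f and a sequence g given as a function of the index."""
    return sum(complex(c).conjugate() * g(i) for i, c in sparse_f.items())

def distance(sparse_f, sparse_h):
    """Norm of f - h for two finitely supported sequences."""
    keys = set(sparse_f) | set(sparse_h)
    return sum(abs(sparse_f.get(k, 0.0) - sparse_h.get(k, 0.0)) ** 2 for k in keys) ** 0.5

# Hypothetical test vector (assumption): g has components 1/nu, which are square-summable.
g = lambda nu: 1.0 / nu

indices = (1, 10, 100, 1000)
weak_values = [abs(inner_with(unit_vector(n), g)) for n in indices]   # tends to 0
gap = distance(unit_vector(5), unit_vector(6))                        # stays at sqrt(2)
```

The scalar products $(e_n, g) = 1/n$ shrink to zero, while any two distinct $e_n$ remain a distance $\sqrt{2}$ apart, so the sequence cannot be fundamental in the norm.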

Two vectors $f$ and $g$ with $(f, g) = 0$ are called orthogonal. A sequence of vectors $\varphi_n$ is called orthonormal if they satisfy

$$(\varphi_n, \varphi_m) = \delta_{nm}.$$

(We shall always denote vectors of norm 1 with Greek letters.)

† Cf. Problem 7.


For any vector $f$ and any orthonormal sequence $\varphi_\nu$, Bessel's inequality (Problem 11) is valid:

$$\sum_{\nu=1}^{\infty} |(\varphi_\nu, f)|^2 \leq \|f\|^2.$$

The orthonormal system $\varphi_\nu$ is called complete if Bessel's inequality is in fact an equality for any $f$. In that case one finds (Problem 11) that the partial sums $f_n = \sum_{\nu=1}^{n} (\varphi_\nu, f) \varphi_\nu$ converge strongly towards $f$, so that one may write

$$f = \sum_{\nu=1}^{\infty} (\varphi_\nu, f) \varphi_\nu.$$

Such a system $\varphi_\nu$ is called a coordinate system. The existence of coordinate systems is an important consequence of separability (Axiom 3).
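Bessel's inequality and the role of completeness can be seen already in three dimensions. In the sketch below the orthonormal vectors and the vector $f$ are hypothetical choices: the first two vectors form an incomplete system, and adding the third makes it complete.

```python
def inner(f, g):
    """Scalar product on C^n: conjugate-linear in f, linear in g."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def bessel_sum(system, f):
    """Sum of |(phi, f)|^2 over the orthonormal system."""
    return sum(abs(inner(p, f)) ** 2 for p in system)

def expansion(system, f):
    """The vector sum of (phi, f) * phi over the orthonormal system."""
    out = [0j] * len(f)
    for p in system:
        c = inner(p, f)
        out = [o + c * x for o, x in zip(out, p)]
    return out

# Hypothetical orthonormal vectors in C^3 (assumptions for illustration).
s = 2 ** -0.5
phi = [[1, 0, 0], [0, s, s], [0, s, -s]]
f = [1 + 2j, 3j, -1 + 0j]
norm_sq_f = inner(f, f).real

incomplete = bessel_sum(phi[:2], f)   # strictly below the squared norm of f
complete = bessel_sum(phi, f)         # equals the squared norm of f
rebuilt = expansion(phi, f)           # reproduces f
```

With only two of the three vectors the expansion yields the component of $f$ in their span, which is why the inequality is strict in that case.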

PROBLEMS

1. The complex numbers are a Hilbert space of dimension 1.

2. The set of all square matrices $A$ of $n$ rows and columns $(n < \infty)$ makes up a Hilbert space of dimension $n^2$, if the scalar product is defined by $(A, B) = \operatorname{Tr} A^* B$, where Tr denotes the trace (sum of diagonal matrix elements).

3. In any Hilbert space one has the "law of the parallelogram":

$$\|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2.$$

4. (Jordan and von Neumann [4].) A normed vector space permits the definition of a scalar product such that $(f, f) = \|f\|^2$ if and only if the norm satisfies the parallelogram law.

5. The vectors of a linear manifold satisfy the axioms in groups 1 and 2.

6. The intersection of any family $\mathcal{M}_i$ $(i \in I)$ of linear manifolds is again a linear manifold.

7. For $g \neq 0$ the minimum value of $\|f + \alpha g\|^2$ as $\alpha$ runs through the complex numbers is

$$\|f\|^2 - \frac{|(f, g)|^2}{\|g\|^2}.$$

*8. (Generalization of the inequality of Schwarz.) Let $f_\nu$ with $\nu = 1, \ldots, n$ be a finite set of vectors; then the determinant of Gram $\operatorname{Det} (f_\nu, f_\mu) \geq 0$. Moreover, the equality sign holds if and only if the $f_\nu$ are linearly dependent.

9. Minkowski's inequality $\|f + g\| \leq \|f\| + \|g\|$ is a consequence of Schwarz's inequality.

10. Any infinite orthonormal sequence of vectors $\varphi_n$ converges weakly to zero. No such sequence can converge strongly to a limit.

11. For any orthonormal system $\{\varphi_\nu\}$ one has the inequality

$$0 \leq \Big\| f - \sum_{\nu=1}^{n} (\varphi_\nu, f) \varphi_\nu \Big\|^2 = \|f\|^2 - \sum_{\nu=1}^{n} |(\varphi_\nu, f)|^2.$$


2-3. REALIZATIONS OF HILBERT SPACE

The axiomatic which we have described in the preceding two sections constitutes a definition of abstract Hilbert space. The abstract space is a most convenient notion when we wish to study very general properties which are not related to any particular realization of the space in terms of other mathematical objects. However, this is not the way Hilbert space appears in physics. In concrete physical problems involving quantum mechanics, Hilbert space appears always in a particular realization, for instance, as a function space or as a space of sequences of numbers.

Many realizations of Hilbert space are possible. Such realizations are also useful from a purely mathematical point of view, since they demonstrate that the axioms of Section 2-1 are consistent. We shall discuss only the infinite dimensional space here, since the finite dimensional case is already understood.

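The first of these realizations is the space $l^2$. It consists of all infinite sequences of complex numbers $f = \{x_n\}$ with the property

$$\sum_{n=1}^{\infty} |x_n|^2 < \infty.$$

If $\lambda$ is a complex number, we define $\lambda f = \{\lambda x_n\}$, and if $g = \{y_n\}$ is another sequence with $\sum |y_n|^2 < \infty$, then we define

$$f + g = \{x_n + y_n\} \quad \text{and} \quad (f, g) = \sum_{n=1}^{\infty} x_n^* y_n.$$

One easily verifies the axioms of groups 1 and 2, but the proofs of the completeness and separability theorems require certain technical devices.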

A second realization of Hilbert space is the function space $L^2(S, M, \mu)$ introduced at the end of Chapter 1. The elements of this space are classes of equivalent functions in $L^2(S, M, \mu)$, where two functions $f$ and $g$ are declared equivalent if

$$\int |f - g|^2 \, d\mu = 0.$$

The functions $f$ and $g$ are then equal a.e. with respect to the measure $\mu$. If $f = \{f(x)\}$, $(x \in S)$, is a vector of this space, then we define $\lambda f = \{\lambda f(x)\}$. If $g = \{g(x)\}$ is another vector of this space, then we define

$$f + g = \{f(x) + g(x)\} \quad \text{and} \quad (f, g) = \int f^* g \, d\mu.$$

That with these operations we obtain a Hilbert space is a nontrivial assertion. Axioms 1 and 2 are easy enough to verify (Problem 3), but Axioms 3 and 4 are deep theorems [1].


If the measure $\mu$ is discrete and concentrated at an infinite number of points, we obtain a slight generalization of the space $l^2$, consisting of all sequences $\{x_n\}$ subject to the condition that

$$\sum_{n=1}^{\infty} |x_n|^2 \mu_n < \infty$$

for a fixed sequence of positive numbers $\mu_n$. If we choose for these numbers $\mu_n = 1$ we again obtain the space $l^2$.

From the abstract point of view, all these different realizations represent the same abstract Hilbert space. This abstract space is completely determined by the axioms. Thus if we have two different realizations, then there exists a one-to-one mapping of one onto the other which preserves the Hilbert-space structure. Two realizations which stand in this relation to one another are said to be isomorphic. Two Hilbert spaces of the same dimensions are always isomorphic.

PROBLEMS

1. The space $l^2$ is a linear vector space which satisfies the axioms of groups 1 and 2.

*2. The space $l^2$ is separable and complete.

3. The function space $L^2(S, M, \mu)$ satisfies axioms 1 and 2 if scalar multiplication, addition, and scalar product are defined by

$$\lambda f = \{\lambda f(x)\},$$

$$f + g = \{f(x) + g(x)\},$$

$$(f, g) = \int f^* g \, d\mu.$$

2-4. LINEAR MANIFOLDS AND SUBSPACES

We shall now discuss with greater care a notion already introduced in Section 2-2, the linear manifold.

A subset $\mathcal{M}$ of a Hilbert space $\mathcal{H}$ is called a linear manifold if $f \in \mathcal{M}$ implies that $\lambda f \in \mathcal{M}$, and if $f \in \mathcal{M}$ and $g \in \mathcal{M}$ imply that $(f + g) \in \mathcal{M}$. We say the set is stable with respect to multiplication with scalars and vector addition.

A linear manifold automatically satisfies axioms 1, 2, and 3. The first two are just the definition of linear manifold, and the third is a consequence of a theorem in topology which says that the subset of a separable set is also separable. But what about axiom 4?

Let us examine this question by means of an example. Consider the space $l^2$. It is easy to verify (Problem 1) that all sequences with only a finite number of components $\neq 0$ are a linear manifold in $l^2$ which is not complete (Problem 2).
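A short computation makes the lack of completeness plausible. The sketch below uses the hypothetical sequences $f_n = (1, 1/2, \ldots, 1/n, 0, 0, \ldots)$, each of which has only finitely many nonzero components, and measures their mutual distances with exact rational arithmetic.

```python
from fractions import Fraction

def f(n):
    """The finitely supported sequence (1, 1/2, ..., 1/n, 0, 0, ...), stored as a list."""
    return [Fraction(1, v) for v in range(1, n + 1)]

def dist_sq(a, b):
    """Squared distance between two finitely supported sequences."""
    length = max(len(a), len(b))
    pad = lambda s: s + [Fraction(0)] * (length - len(s))
    return sum((x - y) ** 2 for x, y in zip(pad(a), pad(b)))

# The squared distance between f_m and f_n (m < n) is the sum of 1/v^2 for
# m < v <= n, which is smaller than 1/m, so the sequence is fundamental.
d_10_20 = dist_sq(f(10), f(20))
d_100_200 = dist_sq(f(100), f(200))
```

The distances become arbitrarily small, yet the only possible limit has the components $1/\nu$ for every $\nu$, none of which vanish, so it lies outside the manifold of finitely supported sequences.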

A vector $f \in \mathcal{H}$ is a limit vector of $\mathcal{M}$ if there exists a sequence $f_n \in \mathcal{M}$ such that $f_n \to f$. If every limit vector of $\mathcal{M}$ belongs to $\mathcal{M}$, we call $\mathcal{M}$ a closed linear manifold $M$, or a subspace. A subspace is a Hilbert space. Every linear manifold $\mathcal{M}$ can be closed by adding to it all the limit vectors; if we want to express this process we denote it by $M = \bar{\mathcal{M}}$ and call it the closure of $\mathcal{M}$. The closure of $\mathcal{M}$ is thus the smallest subspace which contains $\mathcal{M}$.

If $\mathcal{S}$ is a set of vectors, we denote by $\mathcal{S}^\perp$ the set of all vectors orthogonal to all vectors of $\mathcal{S}$. Thus

$$\mathcal{S}^\perp = \{f : (f, g) = 0 \text{ for all } g \in \mathcal{S}\}.$$

It is easy to verify that $\mathcal{S}^\perp$ is a subspace (Problem 4). Furthermore, if $\mathcal{S}_1$ and $\mathcal{S}_2$ are two subsets of $\mathcal{H}$ such that $\mathcal{S}_1 \subset \mathcal{S}_2$, then $\mathcal{S}_2^\perp \subset \mathcal{S}_1^\perp$. Since $\mathcal{M} \subset M$ it follows that $M^\perp \subset \mathcal{M}^\perp$ and therefore (Problem 8)

$$\mathcal{M}^{\perp\perp} \subset M^{\perp\perp} = M.$$

But $M$ is the smallest subspace containing $\mathcal{M}$, and so we must have $\mathcal{M}^{\perp\perp} = M$. For every infinite dimensional subspace $M$ there are infinitely many different linear manifolds which are dense in $M$. (For physical applications the subspaces are more important than the linear manifolds.)

Fig. 2-1 Geometrical interpretation of the decomposition of a vector with respect to a subspace.

A very important property is the decomposition of a vector $f$ with respect to a subspace $M$: To every subspace $M$ there belongs a unique decomposition $f = f_1 + f_2$ such that $f_1 \in M$ and $f_2 \in M^\perp$. The geometrical content of this theorem can be seen at a glance from Fig. 2-1.
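In a finite-dimensional space the decomposition can be computed explicitly. The sketch below is one possible construction, assuming the subspace is given by a list of spanning vectors: it orthonormalizes them by the Gram-Schmidt procedure (a technique brought in for the illustration, not described in the text) and projects $f$ onto the result. The vectors used are hypothetical.

```python
def inner(f, g):
    """Scalar product on C^n: conjugate-linear in f, linear in g."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def axpy(alpha, x, y):
    """The vector alpha * x + y."""
    return [alpha * a + b for a, b in zip(x, y)]

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormal basis of the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for p in basis:
            w = axpy(-inner(p, w), p, w)      # remove the component along p
        n = abs(inner(w, w)) ** 0.5
        if n > tol:                           # skip linearly dependent vectors
            basis.append([x / n for x in w])
    return basis

def decompose(f, spanning_vectors):
    """Return (f1, f2) with f1 in the span and f2 orthogonal to it."""
    f1 = [0j] * len(f)
    for p in gram_schmidt(spanning_vectors):
        f1 = axpy(inner(p, f), p, f1)
    f2 = [a - b for a, b in zip(f, f1)]
    return f1, f2

# Hypothetical data (assumptions): a two-dimensional subspace of C^3.
spanning = [[1, 1, 0], [1, 0, 1j]]
f = [1 + 2j, 3j, -1 + 0j]
f1, f2 = decompose(f, spanning)
```

Orthogonality of the two parts gives $\|f\|^2 = \|f_1\|^2 + \|f_2\|^2$, which is the right-triangle picture suggested by the figure.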

These considerations can be generalized to more than one subspace. Suppose that $M_\nu$ is a sequence of mutually orthogonal subspaces such that $\bigcap_\nu M_\nu^\perp = 0$. We say then that the $M_\nu$ span the entire space. We can then decompose every vector $f$ in a unique manner as a sum

$$f = \sum_{\nu=1}^{\infty} f_\nu \quad \text{with } f_\nu \in M_\nu.$$

This infinite sum is to be interpreted in the sense of strong convergence,

$$\Big\| f - \sum_{\nu=1}^{n} f_\nu \Big\| \to 0 \quad \text{for } n \to \infty.$$

We say that the space $\mathcal{H}$ is represented as a direct sum of orthogonal subspaces and we write

$$\mathcal{H} = \bigoplus_{\nu=1}^{\infty} M_\nu.$$

PROBLEMS

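1. The set of all sequences $\{x_\nu\} \in l^2$ with only a finite number of $x_\nu \neq 0$ is a linear manifold $\mathcal{M}$ in $l^2$ which is dense in $l^2$.

2. The sequence $f_n = \{x_\nu^{(n)}\}$ defined by

$$x_\nu^{(n)} = \begin{cases} 1/\nu & \text{for } \nu \leq n, \\ 0 & \text{for } \nu > n \end{cases}$$

is a Cauchy sequence and therefore defines a limit $f \in l^2$. All $f_n \in \mathcal{M}$ but $f \notin \mathcal{M}$.

3. The intersection of two subspaces is again a subspace.

4. The set $\mathcal{S}^\perp = \{f : (f, g) = 0 \text{ for all } g \in \mathcal{S}\}$ is a subspace.

5. If $M$ and $M^\perp$ are two orthogonal subspaces, then the vectors $f$ of the form $f = f_1 + f_2$, $f_1 \in M$, $f_2 \in M^\perp$, are a subspace.

6. The subspace of Problem 5 is the entire space.

7. If $f \in M$ and $f \in M^\perp$, then $f = 0$.

8. If $M$ is a subspace then $M^{\perp\perp} = M$.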

2-5. THE LATTICE OF SUBSPACES

While the preceding sections represent basic material of the theory of Hilbert space, the topic of this section is selected primarily with a view to the physical interpretation.

Let $M$ and $N$ be two subspaces. Their set-theoretic intersection $M \cap N$ is also a subspace. It is the largest subspace contained both in $M$ and $N$. In a similar way we may define the smallest subspace containing both $M$ and $N$, and we denote it by $M \cup N$.

The two operations $\cap$ and $\cup$ have properties similar to the set-theoretic intersections and unions introduced in Chapter 1, but there are some important differences which we shall now examine.

It is convenient to introduce sets of subspaces which are closed with respect to these two operations, intersection ($\cap$) and union ($\cup$). Such a system is an example of a lattice, and we denote it by $\mathcal{L}$.


If we require in addition that $\mathcal{L}$ be closed even with respect to a countably infinite number of intersections and unions, and, furthermore, that with $M$ there is also $M^\perp$ in $\mathcal{L}$, then we obtain a complete, orthocomplemented lattice. Since $M \cap M^\perp = 0$ and $0^\perp = \mathcal{H}$, such a lattice always contains $0$ and $\mathcal{H}$. A formal definition and detailed discussion of this notion will be presented in Section 5-3.

If $M_i$ ($i \in I$, some index set) is a family of subspaces, we denote the union and intersection of these subspaces by

$$\bigcup_{i \in I} M_i \quad \text{and} \quad \bigcap_{i \in I} M_i.$$

The difference between this and the lattice of subsets of a set comes to light when we consider mixed operations involving unions and intersections in one and the same formula.

Fig. 2-2 Three subspaces of a two-dimensional space which do not satisfy the distributive law.

Fig. 2-3 Illustration of the relation $(A \cap B) \cup (A \cap B') = A$.

Let us examine this in a very special case which displays the characteristic features of the general situation. We take a two-dimensional Hilbert space $\mathcal{H}$ and choose two one-dimensional subspaces, $M$ and $M^\perp$, for instance. Let $N$ be any one-dimensional subspace $\neq M$ and $M^\perp$; then we have (see Fig. 2-2)

$$N \cap (M \cup M^\perp) = N \cap \mathcal{H} = N,$$

but

$$(N \cap M) \cup (N \cap M^\perp) = 0 \cup 0 = 0.$$

We see therefore that the operations $\cup$ and $\cap$ do not always satisfy the distributive law as they do in the case for sets (Problem 3).
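The two-dimensional counterexample can be reproduced with a very small model of the lattice. The sketch below is restricted, as an assumption, to the real plane and to lines through the origin with integer direction vectors; the encoding of subspaces as `"0"`, `"H"`, or a reduced direction pair is purely illustrative.

```python
from math import gcd

ZERO, PLANE = "0", "H"   # the null subspace and the whole two-dimensional space

def line(a, b):
    """One-dimensional subspace spanned by the integer vector (a, b) != (0, 0)."""
    g = gcd(a, b)
    a, b = a // g, b // g
    if a < 0 or (a == 0 and b < 0):
        a, b = -a, -b        # fix the sign so each line has one representation
    return (a, b)

def meet(X, Y):
    """Intersection of two subspaces."""
    if X == Y:
        return X
    if X == PLANE:
        return Y
    if Y == PLANE:
        return X
    return ZERO              # two distinct lines, or anything with the null subspace

def join(X, Y):
    """Smallest subspace containing both X and Y."""
    if X == Y:
        return X
    if X == ZERO:
        return Y
    if Y == ZERO:
        return X
    return PLANE             # two distinct lines, or anything with the whole space

def perp(X):
    """Orthogonal complement."""
    if X == ZERO:
        return PLANE
    if X == PLANE:
        return ZERO
    a, b = X
    return line(-b, a)

M = line(1, 0)
N = line(1, 1)
lhs = meet(N, join(M, perp(M)))                  # N intersected with the whole space
rhs = join(meet(N, M), meet(N, perp(M)))         # union of two null subspaces
```

The two sides differ because the union of subspaces is their linear span rather than the set-theoretic union, so it contains vectors belonging to neither $M$ nor $M^\perp$.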

It is of great importance to have a criterion which tells under what conditions subspaces satisfy the distributive law. It is easy to see that for any pair of sets $A$ and $B$ we have an identity (Fig. 2-3):

$$(A \cap B) \cup (A \cap B') = A,$$

which can be obtained immediately from the distributive law.

In the lattice of subspaces, the operation corresponding to the complement $A'$ is the orthocomplement $M^\perp$. Thus we arrive at the following definition: Two subspaces $M$ and $N$ of a Hilbert space are called compatible if

$$(M \cap N) \cup (M \cap N^\perp) = M. \qquad (2\text{-}1)$$

This defining property seems to be unsymmetrical. Actually it is not hard to show that the relation is, however, a symmetrical one (Problem 6), so that Eq. (2-1) implies also

$$(N \cap M) \cup (N \cap M^\perp) = N. \qquad (2\text{-}2)$$

We shall introduce the notation $M \leftrightarrow N$ to designate two subspaces which satisfy either (and hence both) the relations (2-1) and (2-2).

There is another possibility of expressing the relation of compatibility. It is obtained by introducing the notion of disjoint subspaces. We say two subspaces $M$ and $N$ are disjoint if $M \subset N^\perp$. It follows then that $N \subset M^\perp$, so that the relation of disjointness is symmetrical. We shall write for it $M \perp N$.

Two subspaces $M$ and $N$ are compatible if there exist three mutually disjoint subspaces $M_1$, $N_1$, and $K$, such that

$$M = M_1 \cup K \quad \text{and} \quad N = N_1 \cup K.$$

It follows then that $K = M \cap N$ and $M_1 = M \cap K^\perp$, $N_1 = N \cap K^\perp$.

The following theorem finally gives the connection between compatibility and the distributive law:

Three subspaces $L$, $M$, and $N$ which are pairwise compatible satisfy the distributive law

$$L \cap (M \cup N) = (L \cap M) \cup (L \cap N),$$
$$L \cup (M \cap N) = (L \cup M) \cap (L \cup N).$$
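One family of subspaces for which compatibility is easy to verify consists of the spans of subsets of a fixed orthonormal basis. The sketch below assumes a four-dimensional space and encodes each such subspace by the set of basis labels it contains; for this special family (and only for it) intersection, union, and orthocomplement reduce to the corresponding set operations on the labels.

```python
from itertools import combinations

# Labels of a fixed orthonormal basis (assumption for illustration).
BASIS = frozenset({0, 1, 2, 3})

def meet(X, Y):
    """Intersection of two coordinate subspaces."""
    return X & Y

def join(X, Y):
    """Smallest subspace containing both coordinate subspaces."""
    return X | Y

def perp(X):
    """Orthocomplement: span of the remaining basis vectors."""
    return BASIS - X

def compatible(M, N):
    """Test the defining relation (M meet N) join (M meet N-perp) = M."""
    return join(meet(M, N), meet(M, perp(N))) == M

subspaces = [frozenset(c) for r in range(len(BASIS) + 1)
             for c in combinations(sorted(BASIS), r)]

# The decomposition M = M1 join K, N = N1 join K for one hypothetical pair.
M, N = frozenset({0, 1}), frozenset({1, 2})
K = meet(M, N)
M1, N1 = meet(M, perp(K)), meet(N, perp(K))
```

All sixteen coordinate subspaces are pairwise compatible, and they satisfy both distributive laws, as the theorem requires; the one-dimensional subspaces of the two-dimensional counterexample are not of this kind.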

PROBLEMS

1. $M \cup N = (M^\perp \cap N^\perp)^\perp$.

2. The operations of intersection and union of subspaces are associative:

$$(M_1 \cap M_2) \cap M_3 = M_1 \cap (M_2 \cap M_3),$$
$$(M_1 \cup M_2) \cup M_3 = M_1 \cup (M_2 \cup M_3).$$

3. If $A$, $B$, and $C$ are three subsets of a set, then one always has

$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$$

and

$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C).$$

4. If $M \subset N$ then $M \cup N = N$, and vice versa.

5. $M \leftrightarrow N$ implies

$$M \cup (N \cap M^\perp) = M \cup N = N \cup (M \cap N^\perp).$$

6. The relation $M \leftrightarrow N$ is symmetric.

REFERENCES

1. N. J. ACHIESER AND I. M. GLASMANN, Theorie der linearen Operatoren im Hilbert-Raum (especially Chapter 1). Berlin: Akademie Verlag (1958).

2. P. R. HALMOS, Introduction to Hilbert Space (Chapters I and II). New York: Chelsea Publ. Co. (1957).

3. F. RIESZ AND B. SZ.-NAGY, Functional Analysis (especially Section 83). New York: F. Ungar Publ. Co. (1955).

4. P. JORDAN AND J. VON NEUMANN, Ann. Math. 36, 719 (1935).

5. M. H. STONE, "Linear Transformations in Hilbert Space." Am. Math. Soc. Coll. Publ. (1932).


CHAPTER 3

LINEAR FUNCTIONALSAND LINEAR OPERATORS

Simplicio: "Concerning natural things we need not always seek the necessity of mathematical demonstrations."

Sagredo: "Of course, when you cannot reach it. But if you can, why not?"

GALILEO GALILEI, Dialogue on the Two Major Systems of the World

The two notions of linear functionals and linear operators are closely related and are treated in this chapter side by side. Sections 3-1 and 3-3 correspond to each other in this sense, and Section 3-2 on sesquilinear functionals is the link between the two. The central theorem here is the theorem of Riesz, which shows that linear functionals are scalar products. In Section 3-4 we collect a number of useful properties of projections, and introduce the notion of spectral measure. In Section 3-5 we expose enough of the difficulties which one encounters with unbounded operators to warn the unwary of the pitfalls. Our tactics will be to avoid if possible the unbounded operators, and, if that cannot be done, to subject them to such severe conditions of propriety that they cannot cause trouble. In the applications we shall be dealing mostly with symmetrical (Hermitian) operators; and here the reader should retain the important distinction between "symmetrical" and "self-adjoint."

3-1. BOUNDED LINEAR FUNCTIONALS

A bounded linear functional $\phi(f)$ on a Hilbert space $\mathcal{H}$ is a function with domain $\mathcal{H}$, which has for its range the complex numbers $C$, and which satisfies the following conditions:

$$\phi(f + g) = \phi(f) + \phi(g) \quad \text{for all } f, g \in \mathcal{H};$$

$$\phi(\lambda f) = \lambda \phi(f) \quad \text{for all } f \in \mathcal{H} \text{ and all } \lambda \in C;$$

$$|\phi(f)| \le M \|f\| \quad (M < \infty).$$

The greatest lower bound of the numbers $M$ which satisfy the last inequality is called the norm of the functional $\phi(f)$; it is denoted by $\|\phi\|$.

The simplest example of a functional is the scalar product of $f$ with a fixed vector $g$,

$$\phi(f) = (g, f). \qquad (3\text{-}1)$$

One verifies without difficulty that the norm of this functional is $\|\phi\| = \|g\|$ (Problem 1).

The bounded linear functionals form a linear vector space if one defines addition of $\phi_1(f)$ and $\phi_2(f)$ by $(\phi_1(f) + \phi_2(f))$, and multiplication of $\phi(f)$ with a scalar $\lambda$ by $\lambda\phi(f)$. They are also a normed vector space with the norm defined by $\|\phi\|$.

We shall now show that the linear functionals also constitute a Hilbert space; that is, there exists a scalar product in this space. First we verify that $\phi(f)$ is a continuous function with respect to the strong topology in $\mathcal{H}$. To see this, let $f_n \to f$, so that $\|f_n - f\| \to 0$ as $n \to \infty$. Then

$$|\phi(f_n) - \phi(f)| = |\phi(f_n - f)| \le \|\phi\| \, \|f_n - f\|,$$

and the left side tends to zero with the right.

We can now prove the following theorem.

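Theorem (Riesz): Every bounded linear functional $\phi$ in a Hilbert space $\mathcal{H}$ is of the form $\phi(f) = (g, f)$ with $g$ some fixed vector in $\mathcal{H}$.

Proof: Let $\{\varphi_n\}$ be a complete orthonormal system in $\mathcal{H}$, and let $f = \sum_n x_n \varphi_n$. Since $\phi(f)$ is continuous we have

$$\phi(f) = \sum_{n=1}^{\infty} x_n \, \phi(\varphi_n).$$

We then define

$$g = \sum_{n=1}^{\infty} \phi^*(\varphi_n) \, \varphi_n$$

and find

$$(g, f) = \sum_{n=1}^{\infty} x_n \, \phi(\varphi_n) = \phi(f). \qquad \text{Q.E.D.}$$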
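The construction in the proof can be tried out in a finite-dimensional space, where no convergence questions arise. The following is only an illustrative sketch: the functional `phi`, the dimension 3, and the helper names `inner` and `riesz_vector` are assumptions made for the example, not part of the text.

```python
# Finite-dimensional sketch of the Riesz construction (all names are illustrative).

def inner(g, f):
    """Scalar product (g, f): conjugate-linear in g, linear in f."""
    return sum(complex(gk).conjugate() * fk for gk, fk in zip(g, f))

def riesz_vector(phi, dim):
    """Return g with phi(f) = (g, f), using g_n = conj(phi(e_n))
    on the standard orthonormal basis e_1, ..., e_dim."""
    basis = [[1 if k == n else 0 for k in range(dim)] for n in range(dim)]
    return [complex(phi(e)).conjugate() for e in basis]

# A hypothetical bounded linear functional on a 3-dimensional space.
def phi(f):
    return (2 - 1j) * f[0] + 3j * f[1] - 0.5 * f[2]

g = riesz_vector(phi, 3)
f = [1 + 2j, -1j, 4.0]
difference = abs(phi(f) - inner(g, f))  # should vanish up to rounding
```

The vector `g` depends only on the values of `phi` on the basis vectors, which is the content of the proof.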

With the theorem of Riesz we have no difficulty defining a scalar product for linear functionals, by setting

$$(\phi_1, \phi_2) = (g_2, g_1),$$

where $\phi_1(f) = (g_1, f)$ and $\phi_2(f) = (g_2, f)$.

The linear functionals are thus a linear manifold with a scalar product (also called a pre-Hilbert space). Its closure in the norm is a Hilbert space called the dual space of $\mathcal{H}$. There exists a natural mapping of the linear functionals $\phi$ of the dual space onto the vectors of $\mathcal{H}$ by assigning to the linear functional $\phi(f) = (g, f)$ the vector $g \in \mathcal{H}$. If $\phi \to g$ in this mapping, then $\lambda\phi \to \lambda^* g$; and if $\phi_1 \to g_1$ and $\phi_2 \to g_2$, then $(\phi_1 + \phi_2) \to (g_1 + g_2)$. Such a mapping is called antilinear.

If in the linear functional $\phi(f) = (g, f)$ we keep the vector $f$ fixed and let $g$ vary over the space $\mathcal{H}$, then we obtain a conjugate linear functional $\Psi(g) = (g, f)$. If $\Psi(g)$ is conjugate linear, then $\Psi^*(g)$ (the complex conjugate of $\Psi$) is a linear functional.

It follows from the foregoing that the bounded linear functionals of $\mathcal{H}$ are again a Hilbert space, the dual space, and that there exists a natural norm-preserving antilinear mapping of the dual space onto $\mathcal{H}$. We can thus identify the dual space with $\mathcal{H}$.

This duality property of Hilbert space is exploited by Dirac in his notation of the bra and ket vectors [1, p. 18]. In fact, a ket $|f\rangle$ is a vector $f \in \mathcal{H}$, and a bra $\langle g|$ is a linear functional over $\mathcal{H}$ which for $f$ takes on the value

$$\langle g | f \rangle = (g, f).$$

PROBLEMS

1. The norm of the bounded linear functional $\phi(f) = (g, f)$ for some fixed vector $g$ is equal to $\|g\|$.

2. The norm of a bounded linear functional satisfies the parallelogram law:

$$\|\phi_1 + \phi_2\|^2 + \|\phi_1 - \phi_2\|^2 = 2\|\phi_1\|^2 + 2\|\phi_2\|^2$$

(use the Riesz theorem).

3. A bounded linear functional is determined everywhere by its values on a subset $S \subset \mathcal{H}$ which spans $\mathcal{H}$.

4. A bounded linear functional is strongly and weakly continuous.

3-2. SESQUILINEAR FUNCTIONALS AND QUADRATIC FORMS

A sesquilinear functional is a complex-valued function $\phi(f, g)$ of two variable vectors $f$ and $g \in \mathcal{H}$. Thus the domain of such a functional is the topological product $\mathcal{H} \times \mathcal{H}$ of the Hilbert space with itself, consisting of all pairs of vectors, and the range is the complex numbers.

We shall restrict our discussion in this subsection to bounded sesquilinear functionals, which we define by the requirements:

For fixed $f$ the functional $\phi(f, g)$ is a bounded linear functional of $g$.

For fixed $g$ the functional $\phi(f, g)$ is a bounded conjugate linear functional of $f$.

If, in addition to these properties, $\phi(f, g) = \phi^*(g, f)$, then the functional $\phi$ is called symmetrical.

Every sesquilinear functional defines a certain quadratic form $\hat\phi(f) \equiv \phi(f, f)$. Conversely, every such quadratic form defines uniquely a sesquilinear functional by the formula (Problem 1):

$$4\phi(f, g) = \hat\phi(f + g) - \hat\phi(f - g) + i\hat\phi(f - ig) - i\hat\phi(f + ig).$$

The process expressed in this equation is known as polarization.

This last result shows that a sesquilinear functional $\phi(f, g)$ is in fact completely determined by its values on the diagonal $\phi(f, f)$, that is, by its associated quadratic form $\hat\phi(f)$. It follows from this remark that a sesquilinear functional is symmetrical if and only if its associated quadratic form is real.
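The polarization formula is easy to check numerically in two dimensions. The sketch below assumes a sesquilinear functional of the form $(f, Tg)$ for an arbitrarily chosen matrix `T`; the matrix, the test vectors, and the helper names are hypothetical and serve only to illustrate the identity with the sign convention used here (linear in $g$, conjugate linear in $f$).

```python
# Numerical check of polarization for phi(f, g) = (f, T g) in two dimensions.

def make_form(T):
    """Return the sesquilinear functional phi(f, g) = (f, T g)."""
    def phi(f, g):
        Tg = [sum(T[r][s] * g[s] for s in range(len(g))) for r in range(len(T))]
        return sum(complex(fr).conjugate() * tr for fr, tr in zip(f, Tg))
    return phi

def combine(f, g, c):
    """Return the vector f + c * g."""
    return [a + c * b for a, b in zip(f, g)]

def polarize(quad, f, g):
    """Recover phi(f, g) from the quadratic form quad(h) = phi(h, h)."""
    return (quad(combine(f, g, 1)) - quad(combine(f, g, -1))
            + 1j * quad(combine(f, g, -1j)) - 1j * quad(combine(f, g, 1j))) / 4

T = [[1 + 2j, 3], [-1j, 0.5]]   # arbitrary, not Hermitian
phi = make_form(T)
quad = lambda h: phi(h, h)

f, g = [1j, 2], [1 - 1j, -3]
error = abs(polarize(quad, f, g) - phi(f, g))
```

Because `T` is not Hermitian, the quadratic form is complex here, yet the identity still holds; only symmetry of the functional requires a real quadratic form.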

A sesquilinear functional $\phi(f, g)$ is called positive if its associated quadratic form takes only positive values: $\hat\phi(f) \ge 0$. It is called strictly positive if $\hat\phi(f) = 0$ implies $f = 0$.

An example of a strictly positive sesquilinear functional is the scalar product $\phi(f, g) = (f, g)$.

If $\phi(f, g)$ is a sesquilinear functional, then for every fixed $f$ this is a linear functional of $g$. Thus by Riesz' theorem we can associate an $f'$ with every $f$ by the formula

$$\phi(f, g) = (f', g) \quad \text{for all } g \in \mathcal{H}.$$

The correspondence $f \to f'$ has the following properties: $\lambda f \to \lambda f'$ and $(f_1 + f_2)' = f_1' + f_2'$ (Problem 3). Such a correspondence is called a linear operator. We shall devote the next section to the study of such operators.

PROBLEMS

1. If $\hat\phi(f) \equiv \phi(f, f)$ for some sesquilinear functional $\phi$, then

$$4\phi(f, g) = \hat\phi(f + g) - \hat\phi(f - g) + i\hat\phi(f - ig) - i\hat\phi(f + ig).$$

2. A sesquilinear functional is symmetrical if and only if its associated quadratic form is real.

3. The correspondence $f \to f'$ established by the equation

$$\phi(f, g) = (f', g) \quad \text{for all } g \in \mathcal{H}$$

has the properties

$$(\lambda f)' = \lambda f', \qquad (f_1 + f_2)' = f_1' + f_2'.$$

3-3. BOUNDED LINEAR OPERATORS

A bounded linear operator $T$ is a function with a linear manifold $D_T \subset \mathcal{H}$ as domain and a subset $\Delta_T$ in $\mathcal{H}$ as range, and such that

$$T(f + g) = Tf + Tg \quad \text{for all } f, g \in D_T;$$

$$T(\lambda f) = \lambda\, Tf \quad \text{for all complex } \lambda;$$

$$\|Tf\| \le M \|f\| \quad \text{for } 0 \le M < \infty.$$

It follows immediately from this definition that the range is also a linear manifold. The greatest lower bound of the numbers $M$ which satisfy the last inequality is called the norm of the operator $T$ and is denoted by $\|T\|$. If there does not exist such an $M < \infty$, then the operator is said to be unbounded. In this section we shall discuss only bounded operators.

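If $f_1 \ne f_2$ implies $Tf_1 \ne Tf_2$, then there exists another linear operator $T^{-1}$, called the inverse of $T$, with domain $D_{T^{-1}} = \Delta_T$ and range $\Delta_{T^{-1}} = D_T$. It has the property $T^{-1}Tf = f$ for all $f \in D_T$.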

An operator is said to be continuous at $f \in D_T$ if $f_n \in D_T$ and $f_n \to f$ implies $Tf_n \to Tf$. Every bounded operator is continuous everywhere. Conversely, every linear operator which is continuous at one point is continuous everywhere and bounded. Continuity and boundedness are thus interchangeable concepts for linear operators. (See Problems 2 and 3.)

If $T_1$ is a bounded linear operator defined on $D_{T_1} \supset D_T$ and agreeing with $T$ on $D_T$, then we say $T_1$ is an extension of $T$, and we write $T_1 \supset T$ (or $T \subset T_1$).

If $f_n \in D_T$ is a Cauchy sequence, $f_n \to f$, then if $f \notin D_T$ we define

$$Tf = \lim_{n \to \infty} Tf_n.$$

This limit always exists, because $Tf_n$ is also a Cauchy sequence:

$$\|Tf_n - Tf_m\| \le \|T\| \, \|f_n - f_m\|.$$

With this assignment we obtain an extension of the linear operator from $D_T$ to the closure $\bar{D}_T$ (Problem 4). For this reason we can always assume that a bounded linear operator is defined on a subspace; in particular, if $D_T$ is dense in $\mathcal{H}$, this is the entire space. The domain can even be extended to all of $\mathcal{H}$ if $D_T$ is not dense in $\mathcal{H}$. We first extend $T$ to $\bar{D}_T$ by continuity. Then we define $T$ on $\bar{D}_T^{\perp}$ as an arbitrary bounded linear operator; for instance, $Tf = 0$ for $f \in \bar{D}_T^{\perp}$. On a general $f \in \mathcal{H}$, the extended operator is then defined by linearity; it has the same norm as the original operator. When nothing specific is said about the domain of a bounded operator $T$, we shall always assume that we are dealing with this maximal extension.

This remark is especially useful if we want to define the sum and the product of linear operators. For instance, let $T_1$ and $T_2$ be two operators; then the sum of these two operators is defined by the formula:

$$(T_1 + T_2)(f) = T_1 f + T_2 f.$$

Similarly we define the product $T_1 T_2$ by setting

$$T_1 T_2 (f) = T_1 (T_2 f).$$

We can also define a multiplication with scalars by

$$(\lambda T)(f) = \lambda (Tf),$$

and it is easily seen that if $T_1$, $T_2$, and $T$ are bounded operators, their sum, product, and scalar multiples are also (Problem 5).

Every bounded linear operator defines a sesquilinear functional $\phi$ as follows:

$$\phi(f, g) \equiv (f, Tg).$$

At the end of the preceding section we have shown that the converse of this is also true. Actually, every such functional defines not one but two bounded linear operators $T$ and $T^*$ by the formula

$$\phi(f, g) = (T^* f, g) = (f, Tg).$$

The two operators $T$ and $T^*$ are said to be adjoint to one another. One easily verifies the following properties:

$$(T_1 T_2)^* = T_2^* T_1^*;$$

$$(\lambda T)^* = \lambda^* T^* \quad (\lambda^* = \text{complex conjugate of } \lambda);$$

$$(T_1 + T_2)^* = T_1^* + T_2^*.$$

An operator $T$ with the property $T = T^*$ is called self-adjoint or symmetrical. From the definition of the adjoint operator one easily verifies that $\|T^*\| = \|T\|$ (Problem 6).
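In a finite-dimensional space the adjoint is the conjugate transpose of the matrix, and the rules above can be checked directly. This is a minimal sketch with arbitrarily chosen 2 x 2 matrices; `dagger`, `matmul`, `apply`, `inner`, and `close` are hypothetical helper names introduced for the example.

```python
# Checking the rules for adjoints with 2 x 2 complex matrices.

def dagger(T):
    """Adjoint (conjugate transpose) of a square complex matrix."""
    n = len(T)
    return [[complex(T[s][r]).conjugate() for s in range(n)] for r in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][s] for k in range(n)) for s in range(n)]
            for r in range(n)]

def apply(T, f):
    return [sum(T[r][s] * f[s] for s in range(len(f))) for r in range(len(T))]

def inner(f, g):
    """Scalar product (f, g): conjugate-linear in f, linear in g."""
    return sum(complex(a).conjugate() * b for a, b in zip(f, g))

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

T1 = [[1 + 2j, 3], [-1j, 0.5]]
T2 = [[0, 1j], [2, 1 - 1j]]
lam = 2 - 3j
f, g = [1j, 2], [1 - 1j, -3]

# (T1 T2)* = T2* T1*
product_rule = close(dagger(matmul(T1, T2)), matmul(dagger(T2), dagger(T1)))
# (lam T)* = lam* T*
scalar_rule = close(dagger([[lam * t for t in row] for row in T1]),
                    [[lam.conjugate() * t for t in row] for row in dagger(T1)])
# (T* f, g) = (f, T g)
defining_relation = abs(inner(apply(dagger(T1), f), g)
                        - inner(f, apply(T1, g))) < 1e-12
```

Note the reversal of order in the product rule, which is the only one of the three rules that is not termwise.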

We describe a few examples of bounded linear operators.

1) The identity operator $I$: This is the operator which does nothing to the vectors, so that $If = f$ for all $f \in \mathcal{H}$.

2) The projection operators $E$, defined everywhere, are characterized by the property $E^* E = E$.

Every projection operator determines a subspace which is its range and, conversely, every subspace $M$ determines a unique projection $E$ with range $M$ (Problem 7).

If $f = f_1 + f_2$ with $f_1 \in M$ and $f_2 \in M^\perp$, then the projection $E$ with range $M$ is defined by the equation

$$Ef = f_1.$$

3) The partial isometries $\Omega$ are linear operators with the property: $\Omega^* \Omega = E$ is a projection. It is easy to verify that $F = \Omega \Omega^*$ is then also a projection (Problem 8). $E$ and $F$ are called right- and left-projections respectively.

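4) The unitary operators are a special class of isometries, namely those for which both $E$ and $F$ are unit operators. Thus $U$ is unitary if

$$U^* U = U U^* = I.$$

The prototype of an isometry which is not unitary is the shift operator, defined as follows: Let $\{\varphi_n\}$ $(n = 1, 2, \ldots)$ be a complete orthonormal system, and define the operator $\Omega$ on all $\varphi_n$ by

$$\Omega \varphi_n = \varphi_{n+1}.$$

One easily finds then that $\Omega^*$ satisfies

$$\Omega^* \varphi_{n+1} = \varphi_n \quad\text{and}\quad \Omega^* \varphi_1 = 0.$$

For all other $f$ we define

$$\Omega f = \sum_n x_n \Omega \varphi_n \quad\text{if}\quad f = \sum_n x_n \varphi_n.$$

From these definitions it follows that

$$\Omega^* \Omega \varphi_n = \varphi_n \quad (n = 1, 2, \ldots).$$

Thus $\Omega^* \Omega = I$. Furthermore,

$$\Omega \Omega^* \varphi_n = \varphi_n \quad\text{with } n = 2, 3, \ldots, \quad\text{and}\quad \Omega \Omega^* \varphi_1 = 0.$$

Thus we have verified that $\Omega \Omega^*$ is a projection on the orthogonal complement of $\varphi_1$.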
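The shift operator can be imitated on vectors with finitely many nonzero components, stored as coefficient lists. This is a sketch under that assumption only; the function names `shift` and `shift_adjoint` are illustrative.

```python
# Shift operator on finitely supported coefficient lists [x_1, x_2, ...].

def shift(x):
    """Omega: sends phi_n to phi_{n+1}, so every coefficient moves up one place."""
    return [0] + list(x)

def shift_adjoint(x):
    """Omega*: sends phi_{n+1} to phi_n and phi_1 to 0."""
    return list(x[1:])

def norm_squared(x):
    return sum(abs(c) ** 2 for c in x)

x = [3, -1j, 2 + 2j]

right = shift_adjoint(shift(x))   # Omega* Omega x, equal to x
left = shift(shift_adjoint(x))    # Omega Omega* x, first component removed
```

Here `right` reproduces `x`, while `left` has lost the component along the first basis vector, so the operator is an isometry without being unitary.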

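Another example of an isometry is the continuous shift operator, defined as follows: Let $\mathcal{H} = L^2(0, \infty)$, so that $f = \{f(x)\}$ with $0 \le x < \infty$. If we set $f(x) = 0$ for $x < 0$, we can define, for a fixed $a > 0$, an operator $\Omega$ by setting

$$(\Omega f)(x) = f(x - a), \qquad (\Omega^* f)(x) = f(x + a).$$

With this definition we find

$$(\Omega^* \Omega f)(x) = f(x),$$

$$(\Omega \Omega^* f)(x) = \begin{cases} f(x) & \text{for } x \ge a, \\ 0 & \text{for } x < a. \end{cases}$$

Thus we have verified that $\Omega$ is an isometry.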

Examples of bounded self-adjoint linear operators are obtained easily as follows: Let $\{\varphi_n\}$ be a complete orthonormal system; define $A\varphi_n = \lambda_n \varphi_n$ ($\lambda_n$ real, with the set of the $\lambda_n$ bounded) and extend by linearity to the entire space.

The sesquilinear functional associated with a self-adjoint operator is symmetrical (cf. Section 3-2), and every such functional defines a self-adjoint operator (Problem 10). We shall call a self-adjoint operator a positive operator if its associated quadratic form is positive.

PROBLEMS

1. The correspondence $f \to (\varphi, f)\varphi$ for some fixed normalized $\varphi$ is a bounded linear operator with bound 1.

2. If a linear operator is continuous at one point, it is continuous everywhere.

3. A linear operator is continuous if and only if it is bounded.

4. Every bounded linear operator $T$ with domain $D_T$ admits a unique extension $T_1$ to the closure $\bar{D}_T$ without changing the norm, so that

$$D_{T_1} = \bar{D}_T \quad\text{and}\quad \|T_1\| = \|T\|.$$

5. If $T_1$ and $T_2$ are bounded linear operators, then $(T_1 + T_2)$ and $T_1 T_2$ are, too. Furthermore, the bounds satisfy the inequalities

$$\|T_1 + T_2\| \le \|T_1\| + \|T_2\|, \quad\text{and}\quad \|T_1 T_2\| \le \|T_1\| \, \|T_2\|.$$

6. $\|T^*\| = \|T\|$.

7. Every projection operator defines a unique subspace which is its range, and vice versa.

8. If $\Omega$ is a linear operator such that $\Omega^* \Omega = E$ is a projection, then $\Omega \Omega^* = F$ is also a projection.

9. In a finite-dimensional space, every isometry with right projection $E = I$ is unitary (that is, its left projection is also $I$).

10. A bounded self-adjoint operator defines a symmetrical sesquilinear functional, and vice versa.

3-4. PROJECTIONS

A linear operator $E$ defined on all of $\mathcal{H}$ and satisfying the relation $E E^* = E$ is called a projection. It follows immediately from this relation that $E = E^*$ and consequently $E^2 = E$. Projections are thus self-adjoint and idempotent. The range $\Delta_E$ of a projection is a subspace, and to every subspace there corresponds a unique projection. We have therefore a one-to-one correspondence of projections to subspaces which permits us to replace one by the other.

It follows from this remark that the entire lattice structure of subspaces can be transferred to projections. They are, as are the subspaces, a partially ordered system. For instance, we say that $E_1 \le E_2$ if for the corresponding subspaces $M_1$ and $M_2$ we have $M_1 \subset M_2$. It is easy to show that $E_1 \le E_2$ if and only if $E_1 E_2 = E_1$ (Problem 1).

If $E$ is a projection with range $M$, then $(I - E)$ is the projection with range $M^\perp$. These remarks show that certain relations in the lattice of subspaces can be easily expressed in terms of algebraic operations on corresponding projections.

It is natural to ask the question how the operations of union and intersection of subspaces are expressed algebraically in terms of the corresponding projections: Let $M_1$ and $M_2$ be two subspaces, and $E_1$ and $E_2$ the projections with ranges $M_1$ and $M_2$ respectively. We have defined the intersection $M_1 \cap M_2$ as the subspace consisting of all the vectors common to $M_1$ and $M_2$. The projection $F$ with range $N = M_1 \cap M_2$ must be a function of the projections $E_1$ and $E_2$. We shall denote it by $F = E_1 \cap E_2$, and it is by definition the largest projection contained in $E_1$ and in $E_2$. What is this function?

Our first impulse might be to suggest the answer $F = E_1 E_2$. But this cannot be generally right, since $F$ is a projection only if $E_1$ and $E_2$ commute (Problem 2). In that case the answer is correct, and $F \equiv E_1 \cap E_2 = E_1 E_2$ (Problem 3).

If $E_1$ and $E_2$ do not commute, then the problem of finding $E_1 \cap E_2$ as a function of $E_1$ and $E_2$ is more difficult. We give here the result and shall be content with an illustration of the result in a special case.

Let us consider the situation of Fig. 3-1. The two projections $E_1$ and $E_2$ are represented by two (nonorthogonal) one-dimensional subspaces. The vectors $f_n = (E_1 E_2)^n f$ are seen to converge to zero, illustrating that for this special case

$$E_1 \cap E_2 = \lim_{n \to \infty} (E_1 E_2)^n = 0.$$

[Fig. 3-1: Illustration of the formula $E_1 \cap E_2 = \lim_{n \to \infty} (E_1 E_2)^n$, showing the vectors $f$, $E_2 f$, $E_1 E_2 f$, $E_2 E_1 E_2 f$, $(E_1 E_2)^2 f$.]

The general formula valid for any two projections is

$$E_1 \cap E_2 = \lim_{n \to \infty} (E_1 E_2)^n. \qquad (3\text{-}2)$$

If $E_1$ and $E_2$ commute, then $(E_1 E_2)^n = E_1 E_2$, and we find, as before, that $E_1 \cap E_2 = E_1 E_2$. It follows then that $E_1 \cup E_2 = E_1 + E_2 - E_1 E_2$ (Problem 3).
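The special case of the figure can be reproduced numerically in the real plane. In the sketch below the two one-dimensional subspaces are lines through the origin at angles 0 and $\pi/3$; these angles, the starting vector, and the helper names are assumptions chosen for illustration.

```python
import math

# Two non-orthogonal lines in the plane and the iteration f_n = (E1 E2)^n f.

def line_projection(theta):
    """Projection onto the line through the origin at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def apply(E, f):
    return [E[0][0] * f[0] + E[0][1] * f[1],
            E[1][0] * f[0] + E[1][1] * f[1]]

E1 = line_projection(0.0)
E2 = line_projection(math.pi / 3)

f = [1.0, 2.0]
norms = []
for _ in range(20):
    f = apply(E1, apply(E2, f))      # one more factor E1 E2
    norms.append(math.hypot(f[0], f[1]))
```

Each further application of $E_1 E_2$ multiplies the length by $\cos^2(\pi/3) = 1/4$, so `norms` decreases geometrically to zero, in agreement with $E_1 \cap E_2 = 0$ for two distinct lines.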

Of special importance in quantum mechanics are families of commuting projections which are closed with respect to countable unions and intersections. If $\mathcal{L}$ is such a family and $E_i \in \mathcal{L}$ a sequence of projections in $\mathcal{L}$, then $\bigcup_i E_i$ and $\bigcap_i E_i$ are again in $\mathcal{L}$. Furthermore, with $E \in \mathcal{L}$, also $(I - E) \in \mathcal{L}$. If $\mathcal{L}$ contains also $0$ and $I$, we obtain a Boolean algebra.

A canonical way of constructing such Boolean algebras of projections isthe following:

Let $\mathcal{H} = L^2(S, M, \mu)$ and $\Delta \in M$ an arbitrary measurable set. To any such set one can associate a projection $E(\Delta)$ by defining, for any $f(x) \in L^2(S, M, \mu)$, $(x \in S)$, the operator $E(\Delta)$:

$$(E(\Delta) f)(x) = \chi_\Delta(x) f(x), \qquad (3\text{-}3)$$

where $\chi_\Delta(x)$ is the characteristic function of the set $\Delta$; that is, the function defined by

$$\chi_\Delta(x) = \begin{cases} 1 & \text{for } x \in \Delta, \\ 0 & \text{for } x \notin \Delta. \end{cases}$$

Because the class $M$ of measurable sets is $\sigma$-additive, the projections $E(\Delta)$ are $\sigma$-additive, too. That is, if $\Delta_i$ is any sequence of disjoint sets, then

$$\text{(a)} \quad E\Big(\bigcup_i \Delta_i\Big) = \sum_i E(\Delta_i).$$

Furthermore, for any two sets $\Delta_1$, $\Delta_2$, we have

$$\text{(b)} \quad E(\Delta_1) \cap E(\Delta_2) = E(\Delta_1 \cap \Delta_2), \qquad E(\Delta_1) \cup E(\Delta_2) = E(\Delta_1 \cup \Delta_2);$$

and finally,

$$\text{(c)} \quad E(\emptyset) = 0, \qquad E(S) = I.$$

Such a map from the class of sets $M$ to a Boolean algebra of projections which satisfies the relations (a), (b), and (c) is called a spectral measure over the measure space $(S, M, \mu)$. This fundamental notion will be used constantly in the following discussion.

The canonical spectral measure defined by the special formula (3-3) is, of course, especially simple to realize. The spectral measure is obviously a generalization of a numerical measure. Instead of assigning a number to the sets, the spectral measure assigns projections in a Hilbert space. From any spectral measure, one can easily obtain a large number of numerical measures. It suffices to define for an arbitrary vector $f$

$$\mu_f(\Delta) = (f, E(\Delta) f).$$

This definition of measures depending on vectors $f$ will be of great importance later on.
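A toy model makes the passage from the spectral measure to the numerical measures concrete. The sketch assumes a finite set $S$ of six points with counting measure, so that square-integrable functions are simply lists of six complex numbers; the set, the vector `f`, and the names `E` and `mu` are illustrative choices.

```python
# Canonical spectral measure on a finite set S with counting measure.

S = range(6)

def E(delta):
    """Projection E(delta): multiplication by the characteristic function of delta."""
    def project(f):
        return [f[x] if x in delta else 0 for x in S]
    return project

def mu(f, delta):
    """Numerical measure mu_f(delta) = (f, E(delta) f)."""
    Ef = E(delta)(f)
    return sum((complex(f[x]).conjugate() * Ef[x]).real for x in S)

f = [1 + 1j, 2, -1j, 0.5, 3, -2 + 1j]
d1, d2 = {0, 1}, {3, 5}          # two disjoint subsets of S

additive = abs(mu(f, d1 | d2) - (mu(f, d1) + mu(f, d2))) < 1e-12
total = mu(f, set(S))            # equals the squared norm of f
```

For disjoint sets the measure is additive, `mu(f, set())` vanishes, and `total` is the squared norm of `f`, mirroring properties (a) and (c).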

PROBLEMS

1. If $E_1$ and $E_2$ are two projections, and $\Delta_{E_1}$ and $\Delta_{E_2}$ their corresponding ranges, then $\Delta_{E_2} \subset \Delta_{E_1}$ if and only if $E_1 E_2 = E_2$.

2. If $E_1$ and $E_2$ are projections, then $F = E_1 E_2$ is one if and only if $E_1 E_2 = E_2 E_1$.

3. If $E_1$ and $E_2$ commute, then $E_1 \cap E_2 = E_1 E_2$ and $E_1 \cup E_2 = E_1 + E_2 - E_1 E_2$.

4. Three commuting projections $E_1$, $E_2$, and $E_3$ satisfy the distributive law,

$$E_1 \cap (E_2 \cup E_3) = (E_1 \cap E_2) \cup (E_1 \cap E_3),$$

$$E_1 \cup (E_2 \cap E_3) = (E_1 \cup E_2) \cap (E_1 \cup E_3).$$

3-5. UNBOUNDED OPERATORS

We cannot avoid discussing the unbounded operators since they are constantly used in quantum mechanics. We do it here primarily to acquaint the reader with the mathematical complications which one encounters with unbounded operators. (For a more complete discussion, see reference 2.)

An unbounded linear operator $T$ is such that $\|Tf\| / \|f\|$ exceeds all finite bounds. For such operators one can always find, for any finite $M$, an $f \in D_T$ such that

$$\|Tf\| > M \|f\|.$$

Such operators exist only in the infinite-dimensional Hilbert space. In the finite-dimensional case every linear operator is bounded (Problem 1).

An example of an unbounded operator is easily constructed, for instance as follows: Let $\{\varphi_n\}$ be a complete orthonormal system and define $A\varphi_n = n\varphi_n$. The domain $D_A$ of this operator consists of all linear combinations

$$f = \sum_{n=1}^{\infty} x_n \varphi_n \quad\text{such that}\quad \sum_{n=1}^{\infty} n^2 |x_n|^2 < \infty.$$

This is a dense linear manifold $\subset \mathcal{H}$. The operator is unbounded because $\|A\varphi_n\| = n$; thus for $n$ sufficiently large this norm exceeds any finite quantity.

It is easy to verify that this operator cannot be continuous. Consider, for instance, the sequence of vectors $f_n = (1/n)\varphi_n$. We obtain $Af_n = \varphi_n$, and this does not tend to zero, although $f_n$ does. Thus $A$ is not continuous. We also observe that $A$ is not defined everywhere in $\mathcal{H}$, since the vector $f$ with components $x_n = 1/n$ is not contained in $D_A$.
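Both observations can be illustrated with finite coefficient lists. This is a sketch only: a computer can show the trend for finitely many $n$, not the limit itself, and the names `apply_A`, `f_n`, and `domain_sum` are hypothetical.

```python
import math

# The operator A phi_n = n phi_n on coefficient lists [x_1, x_2, ...].

def norm(x):
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def apply_A(x):
    # x[k] is the coefficient of phi_{k+1}; A multiplies it by k + 1.
    return [(k + 1) * c for k, c in enumerate(x)]

def f_n(n):
    # f_n = (1/n) phi_n, stored as a coefficient list of length n.
    return [0.0] * (n - 1) + [1.0 / n]

# The norm of f_n shrinks like 1/n while the norm of A f_n stays equal to 1.
pairs = [(norm(f_n(n)), norm(apply_A(f_n(n)))) for n in (1, 10, 100, 1000)]

def domain_sum(N):
    """Partial sum of n^2 |x_n|^2 for x_n = 1/n; it grows like N."""
    return sum((n * (1.0 / n)) ** 2 for n in range(1, N + 1))
```

The partial sums `domain_sum(N)` grow without bound, which is why the vector with components $1/n$, although it has finite norm, lies outside $D_A$.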

These properties are, in a certain sense to be made precise now, also characteristic of the unbounded operators. In order to explain this in more detail we need a new concept which replaces that of continuity in the unbounded case.

A linear operator $T$ is called closed if, for every sequence of vectors $f_n$ in the domain $D_T$ of $T$, the relations

$$f_n \to f \quad\text{and}\quad Tf_n \to g$$

imply $f \in D_T$ and $Tf = g$. Superficially this definition seems to be like the definition of continuity. But there is an important difference: For the definition of continuity, $f$ is assumed to be in $D_T$ and $Tf_n \to Tf$ is asserted, while for a closed operator, $f$ is asserted to be in $D_T$ under the assumption that $Tf_n$ converges.

Since the domain of a continuous operator is always closed (cf. Section 3-3), a continuous operator is always closed, but there are closed operators which are not continuous.

An operator which is not closed may be extended to a closed one. We shall not examine under what conditions such an extension is possible. Suffice it to remark here that all the unbounded operators which we need to consider do admit closures. If the closure is possible it is unique.

For closed unbounded operators one can prove that they cannot be defined in the entire space. We have the following theorem:

Theorem: Every closed linear transformation defined on all of $\mathcal{H}$ is necessarily bounded [3].

Let us now examine the definition of the adjoint transformation. This is a little more delicate, since $D_T$ is not the entire space. We can and will, however, assume in the following discussion that $D_T$ is dense in $\mathcal{H}$. This is not a severe restriction. Let us consider the expression $\phi(g) = (f, Tg)$ for a fixed $f$. If there exists a vector $f^*$ with the property

$$(f, Tg) = (f^*, g) \quad \text{for all } g \in D_T,$$

then we say $f \in D_{T^*}$ and we define the operator $T^*$ by setting $T^* f = f^*$. The operator $T$ is called symmetrical (or Hermitian) if $T^* \supset T$; we have called an operator self-adjoint if $T^* = T$. Such an operator is thus always symmetrical.

In general we have the following situation for symmetrical operators $T$:

$$T \subset T^{**} \subset T^*,$$

and we may distinguish the following cases:

1) $T = T^{**}$ but $T^{**} \subset T^*$. The operator $T$ is closed but not self-adjoint.

2) $T \subset T^{**} = T^*$. The operator $T$ is not closed but its smallest closed extension $T^{**}$ is self-adjoint (Problem 7). It is then called essentially self-adjoint.

3) $T \subset T^{**} \subset T^*$. The operator $T$ is neither closed nor essentially self-adjoint.

4) $T = T^{**} = T^*$. The operator $T$ is self-adjoint and hence closed.

A closed symmetrical operator, if it is not self-adjoint, may admit further symmetrical extensions. If there are no further extensions possible, then the symmetrical operator is called maximal. If there exists a maximal and self-adjoint extension, then the extended operator is called hypermaximal (see von Neumann [4]).

Von Neumann has given a complete characterization of symmetrical operators and their various extensions. In quantum mechanics the only symmetrical operators which have any useful physical interpretation are the essentially self-adjoint operators, in which we shall therefore be particularly interested.

The restriction in the domain of unbounded operators poses a great difficulty in the formation of sums and products of operators. Consider, for instance, the operators $T_1$ and $T_2$ with their respective domains $D_1$ and $D_2$.

The operator $T_1 + T_2$ is then definable only in the intersection $D_1 \cap D_2$, and there it is given by

$$(T_1 + T_2) f = T_1 f + T_2 f.$$

Similarly, for the definition of a product operator $T_1 T_2$, we need the conditions $f \in D_2$ and $T_2 f \in D_1$. Then we can define

$$(T_1 T_2) f = T_1 (T_2 f).$$

The intersection of two domains is of course again a linear manifold, but it need not be dense. In fact it can consist of the zero vector alone, even if both $D_1$ and $D_2$ are dense. It is seen from these remarks that the definition of sums and products of unbounded operators is already quite complicated for merely two operators.

There are two situations when a simplification is possible. The first occurs when there exists a common dense domain for two operators which is also invariant under the operations. The restriction of the operators to this dense domain permits the definition of unrestricted sums and products, and hence the formation of any algebraic expression.

The other simplification occurs when one of the two operators is bounded. Let $T$ be an unbounded, and $B$ any bounded operator; then $(T + B)$ and $BT$ are defined in $D_T$. On the other hand, $TB$ is defined only on the elements $f$ for which $Bf \in D_T$.

This leads us to the definition: A bounded operator $B$ commutes with the (unbounded) operator $T$ if $TB$ is an extension of $BT$, that is, if $BT \subset TB$.

If $B$ is a projection $E$, then $ET \subset TE$ implies $ETE = TE \supset ET$. A projection $E$ which stands with an operator $T$ in this relation is said to reduce $T$.

If the projection $E$ reduces $T$, then $F = (I - E)$ also reduces $T$. We can then decompose the operator into two: $T = T_E + T_F$, where $T_E = TE$, $T_F = TF$ are called the reductions of $T$ to the subspaces $M = E\mathcal{H}$ and $N = F\mathcal{H}$ respectively. $T_E$ operates only in $M$ and $T_F$ only in $N$.

3-6. EXAMPLES OF OPERATORS

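1. The position operator $Q$. The position operator $Q$ is defined in a subset $D_Q$ of the space $L^2(-\infty, +\infty)$ consisting of all Lebesgue square-integrable functions $\psi(x)$ by setting

$$(Q\psi)(x) = x\psi(x).$$

The subset $D_Q$ is defined by

$$D_Q = \Big\{ \psi : \int_{-\infty}^{+\infty} x^2 |\psi(x)|^2 \, dx < \infty \Big\}.$$

The operator $Q$ is symmetrical, since for every $\varphi \in D_Q$ and every $\psi \in D_Q$, we have

$$(\psi, Q\varphi) = \int_{-\infty}^{+\infty} \psi^*(x) \, x \, \varphi(x) \, dx = (Q\psi, \varphi).$$

This equation shows that for all such $\psi$, $Q^*\psi$ is defined and, in fact, $Q \subset Q^*$.

We shall now show that, for this domain $D_Q$, the operator $Q$ is also self-adjoint. To see this, we verify that in fact $D_{Q^*} \subset D_Q$, so that $Q = Q^*$. Thus let $\psi \in D_{Q^*}$ and define $\psi_1 = Q^*\psi$; then

$$(\psi, Q\varphi) = (\psi_1, \varphi) \quad \text{for all } \varphi \in D_Q.$$

Thus

$$\int_{-\infty}^{+\infty} \big( x\psi(x) - \psi_1(x) \big)^* \varphi(x) \, dx = 0 \quad \text{for all } \varphi \in D_Q.$$

Since $D_Q$ is dense in $\mathcal{H}$ we must have, a.e.,

$$x\psi(x) = \psi_1(x).$$

Therefore $\psi \in D_Q$ and $Q^*\psi = Q\psi$.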

2. The momentum operator. The operator $P$ is defined on the subset $D_P$ which consists of all absolutely continuous functions $\psi(x)$ which are differentiable a.e., and for which $(d\psi/dx) \in L^2(-\infty, +\infty)$. On this set the operator $P$ is defined by

$$(P\psi)(x) = -i \frac{d\psi}{dx}.$$

One can show that this operator, too, is self-adjoint [2, Chapter IV].

There exists, furthermore, a common dense domain $D$ which is invariant under the operations $Q$ and $P$, and on which both operators are defined. For any $\varphi \in D$ one finds (Problem 8)

$$(QP - PQ)\varphi = i\varphi. \qquad (3\text{-}4)$$

We see that the operator $QP - PQ$ is bounded on the dense domain $D$, and therefore admits a unique extension to the entire space; so we may write the operator equation

$$QP - PQ \subset iI. \qquad (3\text{-}5)$$

We shall call this the canonical commutation rule. It plays a fundamental role in quantum mechanics.

It is usually (incorrectly) written $QP - PQ = iI$, which disregards the fact that the left-hand side is defined only on a dense domain.
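For a differentiable function in the common domain, relation (3-4) is a one-line consequence of the product rule. The following worked steps use only the definitions $(Q\varphi)(x) = x\varphi(x)$ and $(P\varphi)(x) = -i\,\varphi'(x)$ given above, with the prime denoting $d/dx$.

```latex
\begin{aligned}
(QP\varphi)(x) &= x \cdot \bigl(-i\,\varphi'(x)\bigr) = -i\,x\,\varphi'(x), \\
(PQ\varphi)(x) &= -i\,\frac{d}{dx}\bigl(x\,\varphi(x)\bigr)
                = -i\,\varphi(x) - i\,x\,\varphi'(x), \\
\bigl((QP - PQ)\varphi\bigr)(x) &= -i\,x\,\varphi'(x) + i\,\varphi(x) + i\,x\,\varphi'(x)
                = i\,\varphi(x).
\end{aligned}
```

The computation is purely formal; the domain questions discussed in the text decide where it is legitimate.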


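3. The creation and annihilation operators $A^*$ and $A$. Let $\varphi_n$ $(n = 0, 1, \ldots)$ be a complete orthonormal system in an abstract Hilbert space $\mathcal{H}$. Define

$$A\varphi_n = \sqrt{n}\, \varphi_{n-1} \quad (n = 1, 2, \ldots),$$

$$A^*\varphi_n = \sqrt{n+1}\, \varphi_{n+1} \quad (n = 0, 1, 2, \ldots),$$

$$A\varphi_0 = 0.$$

By linearity we can extend the definition of $A$ and $A^*$ to all linear combinations

$$f = \sum_n x_n \varphi_n \quad\text{for which}\quad \sum_n n |x_n|^2 < \infty.$$

On such $f$ we define

$$Af = \sum_n x_n A\varphi_n = \sum_n \sqrt{n}\, x_n \varphi_{n-1}$$

and

$$A^* f = \sum_n x_n A^*\varphi_n = \sum_n \sqrt{n+1}\, x_n \varphi_{n+1}.$$

The two operators $A$ and $A^*$ are then the adjoint of one another, and $D_A = D = D_{A^*}$, so that the $*$ has the usual significance of the adjoint for unbounded operators. There is a simple relation between the operators $A$, $A^*$ on the one hand and $Q$, $P$ on the other (Problem 9).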
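On vectors with finitely many nonzero coefficients the two operators act on coefficient lists, and the definitions can be exercised directly. The sketch below also checks the relation $AA^* - A^*A = I$ on such vectors; that relation is a standard consequence of the definitions but is not stated in the text, so it should be read as an added illustration. The names `annihilate`, `create`, and `inner` are hypothetical.

```python
import math

# Creation and annihilation operators on coefficient lists [x_0, x_1, ...].

def annihilate(x):
    """A f: the coefficient of phi_{n-1} is sqrt(n) * x_n."""
    return [math.sqrt(n) * x[n] for n in range(1, len(x))]

def create(x):
    """A* f: the coefficient of phi_{n+1} is sqrt(n+1) * x_n."""
    return [0.0] + [math.sqrt(n + 1) * x[n] for n in range(len(x))]

def inner(f, g):
    """Scalar product (f, g); a missing trailing coefficient counts as zero."""
    return sum(complex(a).conjugate() * b for a, b in zip(f, g))

f = [1.0, -2.0, 0.5 + 1j, 3.0]
g = [2j, 1.0, -1.0, 0.25]

# (A A* - A* A) f, component by component; it should reproduce f.
commutator = [a - b for a, b in zip(annihilate(create(f)), create(annihilate(f)))]

# Adjointness on finitely supported vectors: (A* f, g) = (f, A g).
adjoint_gap = abs(inner(create(f), g) - inner(f, annihilate(g)))
```

The list `commutator` agrees with `f` up to rounding, since the coefficient of $\varphi_n$ is multiplied by $(n+1) - n = 1$.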

PROBLEMS

1. In a finite-dimensional Hilbert space, every linear operator is bounded.

2. The set of elements $f$ with the property that, for some unbounded operator $T$ with dense domain, there exists a vector $f^*$ such that

$$(f, Tg) = (f^*, g) \quad \text{for all } g \in D_T$$

is a linear manifold $D_{T^*}$.

3. If $D_T$ is dense, then the vector $f^*$ of Problem 2 is unique.

4. The operator $T^*$ is always closed.

5. Every symmetrical operator $T$ admits a closed symmetrical extension, namely $T^{**}$.

6. If $E$ reduces the operator $T$, then $F = (I - E)$ also reduces $T$.

7. If the symmetrical operator $A$ has the property $A^{**} = A^*$, then its smallest closed symmetrical extension $\bar{A} = A^{**}$ is self-adjoint.

8. The functions

$$\varphi_n(x) = \frac{1}{\sqrt{2^n n! \sqrt{\pi}}} \, H_n(x) \, e^{-x^2/2} \quad (n = 0, 1, \ldots),$$

where $H_n(x)$ are the polynomials of Hermite [5, p. 62], are a complete orthonormal system in $L^2(-\infty, +\infty)$; and the finite linear combinations of the $\varphi_n(x)$ are a dense linear manifold $D$ which is invariant under $P$ and $Q$, and on which both $P$ and $Q$ are defined. For every $\varphi \in D$ one verifies

$$(QP - PQ)\varphi = i\varphi.$$

9. The operators

$$A = \frac{1}{\sqrt{2}} (Q + iP) \quad\text{and}\quad A^* = \frac{1}{\sqrt{2}} (Q - iP)$$

are adjoints of each other on the domain

$$D = \Big\{ f : f = \sum_n x_n \varphi_n, \ \sum_n n |x_n|^2 < \infty \Big\},$$

and they satisfy

$$A\varphi_n = \sqrt{n}\, \varphi_{n-1} \quad (n = 1, 2, \ldots), \qquad A\varphi_0 = 0,$$

$$A^*\varphi_n = \sqrt{n+1}\, \varphi_{n+1} \quad (n = 0, 1, 2, \ldots).$$

REFERENCES

1. P. A. M. DIRAC, Principles of Quantum Mechanics, 4th ed. Oxford: Clarendon Press (1958).

2. N. I. ACHIESER AND I. M. GLASMANN, Theorie der linearen Operatoren im Hilbertraum, especially Chapters I, II, III, and IV. Berlin: Akademie Verlag (1958).

3. F. RIESZ AND B. SZ.-NAGY, Functional Analysis, especially Chapter VIII. New York: F. Ungar Publ. Co. (1955).

4. J. VON NEUMANN, Mathematische Grundlagen der Quantenmechanik, especially Chapter II. Berlin: Springer (1932).

5. L. I. SCHIFF, Quantum Mechanics. New York: McGraw-Hill (1949).


CHAPTER 4

SPECTRAL THEOREM

AND SPECTRAL REPRESENTATION

Unless the vessel's pure, all you pour in turns sour.

HORACE

In this chapter we are concerned primarily with those aspects of self-adjoint linear operators which have to do with their realizations in a function space. Most of the practical calculations in the applications are done by means of such realizations. There is a canonical realization which we call the spectral representation. It uses for its Hilbert space a certain function space over the spectrum of the operator.

In order to formulate the spectral representation, we need several basic concepts and theorems which we introduce in an informal manner in Section 4-1, in the context of finite-dimensional spaces. There we present the notions of spectrum, spectral measure, simple spectrum, and spectral representation unencumbered by technical details of topology and measure theory.

In Section 4-2, we give the definition of the spectrum of a self-adjoint operator in terms of the domain of the resolvent operator. The central theorem of this chapter, the spectral theorem, is stated without proof in Section 4-3. It establishes a unique correspondence between spectral measures and self-adjoint operators, and it is an indispensable tool in the establishment of a functional calculus as explained in the following section (4-4). In Section 4-5, we prepare the ground for the spectral representation by studying some properties of the spectral density functions. This clarifies the role of the cyclic vector and leads to two equivalent definitions of the simple spectrum of an operator.

Section 4-6 gives a formulation of the spectral representation. The case of one operator is explained in sufficient detail to complete the existence proof without too much trouble, and to make the subsequent generalization to a countable family of operators at least plausible. In the final section (4-7), we sketch the method of eigenfunction expansion which, if available, gives a convenient method for calculating the spectral representation.

4-1. SELF-ADJOINT OPERATORS IN FINITE-DIMENSIONAL SPACES

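In this section $\mathcal{H}$ will be a finite-dimensional Hilbert space, and we denote its dimension by $n < \infty$. We shall briefly recapitulate the principal properties of the self-adjoint linear operators $A$ in $\mathcal{H}$. Since $n < \infty$, the operator is defined on the whole space: $D_A = \mathcal{H}$, and its range $\Delta_A$ is a subspace of $\mathcal{H}$.

The equation $f' = Af$, which associates with every $f \in \mathcal{H}$ another $f' \in \mathcal{H}$, can always be realized as a finite linear system of equations by introducing a complete orthonormal system $\{\varphi_r\}$ with $r = 1, \ldots, n$, and $(\varphi_r, \varphi_s) = \delta_{rs}$. If

$$f = \sum_s x_s \varphi_s \quad\text{and}\quad f' = \sum_r x_r' \varphi_r,$$

then

$$x_r' = \sum_s A_{rs} x_s,$$

where

$$A_{rs} \equiv (\varphi_r, A\varphi_s) = A_{sr}^* \quad (r, s = 1, \ldots, n).$$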

The structure of the operator is revealed by referring it to a particular coordinate system, consisting of eigenvectors. They are the nontrivial solutions of the equations

$$A\psi_r = \lambda_r \psi_r \quad (r = 1, \ldots, n).$$

One can show that there always exist $n$ nontrivial solutions of this equation, where the numbers $\lambda_r$, the eigenvalues, are the solutions of the secular equation

$$\det (A_{rs} - \lambda \delta_{rs}) = 0.$$

The determinant on the left-hand side of this equation is a polynomial in $\lambda$ and it has, according to the fundamental theorem of algebra, exactly $n$ solutions; since $A_{rs}$ is Hermitian, they are all real. These solutions are called the spectrum of the operator $A$. Some of these solutions may coincide; they must then be counted with their proper multiplicity. In the special case in which the roots are all simple (that is, of multiplicity 1), we shall call the eigenvalues nondegenerate and the spectrum simple.

The following facts are easily verified: If $\lambda_r \ne \lambda_s$, then the corresponding eigenvectors $\psi_r$ and $\psi_s$ are orthogonal: $(\psi_r, \psi_s) = 0$. Furthermore, if the eigenvalue $\lambda_r$ has multiplicity $\alpha(r)$, then there exist $\alpha(r)$ linearly independent eigenvectors $\psi_r$ with this eigenvalue (Problems 1 and 2). The eigenvectors which belong to $\lambda_r$ are then an $\alpha(r)$-dimensional subspace of $\mathcal{H}$.

From the preceding remarks it follows that there always exists a coordinate system in $\mathcal{H}$ consisting of eigenvectors $\psi_r$ ($r = 1, 2, \ldots, n$). If the $\lambda_r$ are all nondegenerate, the $\psi_r$ are already orthogonal and complete. If they are degenerate we can choose them to be so in many different ways. In this coordinate system the operator $A$ appears in a particularly simple form: If $f = \sum_{r=1}^{n} x_r \psi_r$, then $f' = Af = \sum_{r=1}^{n} x'_r \psi_r$ with $x'_r = \lambda_r x_r$. This is the spectral representation of $A$.

For the sake of later generalizations of this important result, we shall formulate it as follows.

We have really two different spaces: $\mathcal{H}$ and $l^2_n$. The latter consists of all sequences of $n$ complex numbers $\{x_r\}$ with $r = 1, 2, \ldots, n$. By the formula $x_r = (\psi_r, f)$ we establish a one-to-one correspondence $\Omega$ between vectors $f$ in the abstract space $\mathcal{H}$ and sequences $\{x_r\}$ in $l^2_n$. This correspondence is isometric if the norm of sequences $\{x_r\}$ is defined by

$$\|\{x_r\}\|^2 = \sum_r |x_r|^2$$

(Problem 4). Isometric means that if $\{x_r\} = \Omega f$, then

$$\|\{x_r\}\|^2 = \|f\|^2.$$

Two Hilbert spaces which are the linear isometric image of each other are identical in their abstract structure.

Fig. 4-1 Diagram illustrating the equation $\hat{A} = \Omega A \Omega^{-1}$.

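The operator $A$ defined in $\mathcal{H}$ may be considered an operator in $l^2_n$ via the isometric image $\Omega$ of $\mathcal{H}$ onto $l^2_n$. We shall denote it by $\hat{A}$. Thus $\hat{A} = \Omega A \Omega^{-1}$, and (cf. Fig. 4-1)

$$\hat{A}\{x_r\} = \{\lambda_r x_r\}. \qquad (4\text{-}1)$$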
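The finite-dimensional statement (4-1) can be checked numerically. The following is a minimal sketch and not part of the original text: the matrix `A`, the test vector `f`, and the helper names `inner`, `apply_matrix`, and `omega` are hypothetical choices made for illustration, and the eigenvectors are worked out by hand rather than computed.

```python
import math

# Hypothetical 2x2 real symmetric (hence self-adjoint) matrix.
A = [[2.0, 1.0],
     [1.0, 2.0]]

# Secular equation: (2 - lam)**2 - 1 = 0, so lam = 1 or lam = 3.
s = 1.0 / math.sqrt(2.0)
eigenvalues = [1.0, 3.0]
eigenvectors = [[s, -s], [s, s]]   # psi_1, psi_2 (orthonormal)

def inner(u, v):
    """Inner product (u, v), antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply_matrix(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def omega(f):
    """The map Omega: f -> {x_r} with x_r = (psi_r, f)."""
    return [inner(psi, f) for psi in eigenvectors]

f = [1.0 + 2.0j, -0.5j]                 # arbitrary test vector
x = omega(f)                            # coordinates of f
x_prime = omega(apply_matrix(A, f))     # coordinates of A f

norm_f_sq = inner(f, f).real            # ||f||^2
norm_x_sq = sum(abs(c) ** 2 for c in x) # ||{x_r}||^2
```

For this choice `norm_f_sq` and `norm_x_sq` agree, and each component of `x_prime` equals the corresponding eigenvalue times the component of `x`, which is the isometry property together with the multiplication rule of Eq. (4-1).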

Every self-adjoint operator in a finite-dimensional space thus permits a spectral representation. We are interested in the generalization of this result to the infinite-dimensional Hilbert space. In order to formulate the spectral representation in Hilbert space, it is necessary to introduce some new concepts which will permit an alternative but equivalent formulation of the above result. The main difficulty which we must negotiate is connected with the use of the coordinate system $\{\psi_r\}$. In Hilbert space a complete system of eigenvectors does not necessarily exist. It is thus necessary to seek a formulation which does not use such a system.

Let us begin by an alternative definition of the spectrum $\Lambda$ of the operator $A$. We have seen that $\Lambda$ is the set of numbers $\lambda$ for which the equation $(A - \lambda I)\psi = 0$ has nontrivial solutions. In other words, if $\lambda \notin \Lambda$, then this equation has only the trivial solution $\psi = 0$. A linear operator which has this property admits an inverse, which is again a linear operator. This means the operator

$$R_\lambda = (A - \lambda I)^{-1} \qquad (4\text{-}2)$$

exists for $\lambda \notin \Lambda$. The operator $R_\lambda$ is called the resolvent operator.

If (still in finite-dimensional spaces) $\lambda_r \in \Lambda$, then the set of $\psi$ for which $(A - \lambda_r I)\psi = 0$ is a finite-dimensional subspace $M_r$. Let $P_r$ be the projection with range $M_r$. If $\lambda_r$ is nondegenerate, then $M_r$ is one-dimensional. Generally, the degree of degeneracy of $\lambda_r$ equals $\dim M_r$. The previous statements concerning the orthogonality and completeness of the $\psi_r$ can now be expressed by the equations

$$P_r P_s = \delta_{rs} P_r \quad \text{and} \quad \sum_r P_r = I. \qquad (4\text{-}3)$$

A system of projections which satisfy such relations is sometimes called a decomposition of unity. In terms of the projections $P_r$, the operator $A$ has the simple form $A = \sum_r \lambda_r P_r$ (Problem 7).
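As an illustration of a decomposition of unity, the sketch below checks the orthogonality and completeness relations and the sum formula for one small example. The matrix, its hand-computed projections, and the helper names `matmul`, `lincomb`, and `close` are assumptions of this sketch, not taken from the text.

```python
# Hypothetical example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3 with
# normalized eigenvectors (1, -1)/sqrt(2) and (1, 1)/sqrt(2), found by hand.
A = [[2.0, 1.0], [1.0, 2.0]]
lams = [1.0, 3.0]
P = [
    [[0.5, -0.5], [-0.5, 0.5]],  # P_1: projection onto M_1
    [[0.5, 0.5], [0.5, 0.5]],    # P_2: projection onto M_2
]
I2 = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(coeffs, mats):
    """Linear combination sum_r coeffs[r] * mats[r]."""
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two square matrices."""
    n = len(X)
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(n) for j in range(n))

# P_r P_s = delta_rs P_r
orthogonal = all(
    close(matmul(P[r], P[s]), P[r] if r == s else ZERO)
    for r in range(2) for s in range(2)
)
# sum_r P_r = I
complete = close(lincomb([1.0, 1.0], P), I2)
# A = sum_r lambda_r P_r
reconstructs = close(lincomb(lams, P), A)
```

All three flags come out true for this example; a degenerate eigenvalue would simply be handled by a projection of rank greater than one.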

We have already stated what we mean by a simple spectrum. Now we shall take up this notion once more, and cast it into a different form, which at first sight may seem more complicated but which in the end will be the only useful form of expressing this notion for general operators.

Consider the set of all polynomials

$$u(A) = a_0 A^m + a_1 A^{m-1} + a_2 A^{m-2} + \cdots + a_m I.$$

They form an algebra $\mathcal{A}$, that is a vector space with a multiplication law: If $T_1$ and $T_2 \in \mathcal{A}$, then $\lambda_1 T_1 + \lambda_2 T_2 \in \mathcal{A}$ for complex $\lambda_1$, $\lambda_2$ and $T_1 T_2 \in \mathcal{A}$.

The dimension of this vector space cannot exceed $n$ since the operator $A$ satisfies an algebraic equation of order $n$, that is, the Cayley equation:

$$\prod_{r=1}^{n} (\lambda_r - A) = 0.$$

If the dimension of $\mathcal{A}$ is exactly equal to $n$, then the spectrum of $A$ is simple. If the dimension is less than $n$, it is degenerate.

Another equally useful way of characterizing the simple spectrum is the following: if $g$ is a fixed vector in $\mathcal{H}$, we can consider the set of all the elements of the form $f = Tg$ with $T \in \mathcal{A}$. We denote this set by $\{\mathcal{A}g\}$. It is clearly a subspace of $\mathcal{H}$. The size of this subspace will depend on $g$ as well as on $\mathcal{A}$. Concerning the dependence on $g$, it is convenient to select vectors for which $\dim \{\mathcal{A}g\}$ is maximal. For such vectors the dimension of $\{\mathcal{A}g\}$ depends only on the structure of the algebra, and will therefore depend on its dimension. More precisely, we have the following theorem.

Theorem: The spectrum of $A$ is simple if and only if there exist vectors $g$ such that $\{\mathcal{A}g\}$ is equal to the entire space $\mathcal{H}$.


Instead of giving a proof of this theorem we merely make some heuristic remarks which can easily be expanded and sharpened to a complete proof (Problem 6).

The vectors $g$ with the property that $\{\mathcal{A}g\} = \mathcal{H}$ are called cyclic vectors of the algebra $\mathcal{A}$. They are characterized by the property $P_r g \neq 0$ for all $r$. If the $P_r$ are all one-dimensional, then $f = u(A)g = \sum_r u(\lambda_r) P_r g$.

By suitable choice of $u(\lambda_r)$ one can reach any arbitrary vector $f \in \mathcal{H}$. Thus we see that we may characterize the operators with simple spectrum by saying that they give the highest mobility of the vector $Tg$ with $T \in \mathcal{A}$. The other extreme would be obtained if the spectrum of $A$ is completely degenerate. Then the operator $A$ is merely a multiple of the unit operator, and we have the least mobility. The degree of mobility of vectors $Tg$ with $T \in \mathcal{A}$ thus measures the degree of degeneracy of the spectrum.
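The notion of mobility can be made concrete in two dimensions, where the set of vectors $Tg$ with $T$ a polynomial in $A$ is spanned by $g$ and $Ag$ alone (higher powers reduce by the Cayley equation). The sketch below is illustrative only; the matrices, the vectors, and the helper `is_cyclic_2d` are hypothetical.

```python
def apply_matrix(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def is_cyclic_2d(M, g, tol=1e-12):
    """In two dimensions g is cyclic iff g and M g are linearly independent."""
    Mg = apply_matrix(M, g)
    det = g[0] * Mg[1] - g[1] * Mg[0]
    return abs(det) > tol

A_simple = [[2.0, 1.0], [1.0, 2.0]]      # eigenvalues 1 and 3: simple spectrum
A_degenerate = [[2.0, 0.0], [0.0, 2.0]]  # eigenvalue 2 twice: a multiple of I

g_good = [1.0, 0.0]   # has components along both eigenvectors of A_simple
g_bad = [1.0, 1.0]    # is itself an eigenvector of A_simple (eigenvalue 3)

results = {
    "simple, g_good": is_cyclic_2d(A_simple, g_good),
    "simple, g_bad": is_cyclic_2d(A_simple, g_bad),
    "degenerate, g_good": is_cyclic_2d(A_degenerate, g_good),
    "degenerate, g_bad": is_cyclic_2d(A_degenerate, g_bad),
}
```

Only the first entry is true: the simple-spectrum matrix has a cyclic vector, a vector lying in a single eigenspace is never cyclic, and the completely degenerate matrix has no cyclic vector at all.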

Both of the preceding definitions of the simple spectrum can, with suitable modifications, be transferred to the infinite-dimensional Hilbert space. Before we do this, we shall introduce the notion of the spectral measure (still for the finite-dimensional case).

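Let $\Delta$ be a Borel set on the real line. To every set we can associate a projection $E(\Delta)$ defined by

$$E(\Delta) = \sum_{\lambda_r \in \Delta} P_r.$$

In this way we obtain a projection-valued set function, defined on all Borel sets and satisfying the relations

$$E(\Delta_1) \cup E(\Delta_2) = E(\Delta_1 \cup \Delta_2),$$
$$E(\Delta_1) \cap E(\Delta_2) = E(\Delta_1 \cap \Delta_2), \qquad (4\text{-}4)$$
$$E(\emptyset) = 0,$$
$$E(R^1) = I.$$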

At the end of Chapter 3 we called a projection-valued set function with these properties a spectral measure. Thus we see that every self-adjoint operator $A$ defines a unique spectral measure $E$. With every such measure we can also define, for any fixed vector $f$, a finite numerical-valued measure $\rho_f$ by the formula

$$\rho_f(\Delta) = (f, E(\Delta)f).$$

In our particular case (finite dimension of $\mathcal{H}$), every $\rho_f$ is a discrete Lebesgue-Stieltjes measure concentrated at the points $\lambda_r$. The weights at the points $\lambda_r$ depend on the choice of $f$, and they may be zero for certain vectors $f$. However, for cyclic vectors $g$ we have

$$\rho_g(\lambda_r) - \rho_g(\lambda_r - 0) = (g, P_r g) \neq 0,$$

where we have used the notation $\rho_g(\lambda) \equiv \rho_g((-\infty, \lambda])$. From this remark we see immediately that the measure $\rho_f$ for any $f$ must be absolutely continuous with respect to any measure $\rho_g$ with $g$ cyclic.

Thus, in the notation of Section 1-4,

$$\rho_f \prec \rho_g.$$

In particular, if $f$ is also a cyclic vector, the two measures are equivalent: $\rho_f \sim \rho_g$. One verifies easily with the Radon-Nikodym theorem that, conversely, if $\rho_f \sim \rho_g$, then $f$ must be cyclic. We summarize this situation with the following theorem:

Theorem: Two numerical spectral measures $\rho_{g_1}$ and $\rho_{g_2}$ are equivalent if both $g_1$ and $g_2$ are cyclic vectors. Every other numerical measure $\rho_f$ is inferior to them.
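In the finite-dimensional case the numerical measures are determined by finitely many weights $(f, P_r f)$, so the ordering of measures can be read off directly. The sketch below uses a hypothetical two-dimensional example with hand-computed projections; the names `weight` and `rho` are illustrative helpers.

```python
# Hypothetical example: spectrum {1, 3} with one-dimensional projections.
lams = [1.0, 3.0]
P = [
    [[0.5, -0.5], [-0.5, 0.5]],  # P_1
    [[0.5, 0.5], [0.5, 0.5]],    # P_2
]

def weight(Pr, f):
    """The weight (f, P_r f) that the measure of f places on lambda_r."""
    Pf = [sum(Pr[i][j] * f[j] for j in range(len(f))) for i in range(len(f))]
    return sum(complex(a).conjugate() * b for a, b in zip(f, Pf)).real

def rho(f, borel_set):
    """(f, E(Delta) f), with the set Delta given as a membership test."""
    return sum(weight(Pr, f) for lam, Pr in zip(lams, P) if borel_set(lam))

g = [1.0, 0.0]  # cyclic: both weights are nonzero
f = [1.0, 1.0]  # eigenvector for lambda = 3: not cyclic

weights_g = [weight(Pr, g) for Pr in P]   # weights at lambda = 1 and 3
weights_f = [weight(Pr, f) for Pr in P]

# A set containing only lambda = 1 separates the two measures.
below_two_g = rho(g, lambda lam: lam < 2.0)
below_two_f = rho(f, lambda lam: lam < 2.0)
```

Here the measure of `f` vanishes on a set on which the measure of `g` does not, while every set of zero measure for `g` misses the spectrum entirely and so has zero measure for `f` as well. The measure of `f` is therefore inferior to that of the cyclic vector but not equivalent to it.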

All of the results in this section have analogues in the infinite-dimensional case, and the rest of this chapter is devoted to the explicit formulation of these analogues and their applications for establishing the spectral representation.

PROBLEMS

1. Let $A$ be a self-adjoint operator in the finite-dimensional space $\mathcal{H}$. Then

$$A\psi_1 = \lambda_1\psi_1, \quad A\psi_2 = \lambda_2\psi_2 \quad \text{and} \quad \lambda_1 \neq \lambda_2$$

imply that $(\psi_1, \psi_2) = 0$. Is this result also valid in infinite-dimensional spaces?

2. If $\lambda$ is a root of multiplicity $\alpha$ of the secular equation $\det(A_{rs} - \lambda\delta_{rs}) = 0$, then there exist exactly $\alpha$ linearly independent solutions $\psi$ of the equation $A\psi = \lambda\psi$.

3. If $\psi_1$ and $\psi_2$ are two linearly independent eigenvectors belonging to the same eigenvalue $\lambda$, then all vectors in the subspace spanned by $\psi_1$ and $\psi_2$ are eigenvectors with eigenvalue $\lambda$.

4. The correspondence $\Omega : f \to \{x_r\}$, where $x_r = (\psi_r, f)$ and $\psi_r$ is a complete orthonormal system in $\mathcal{H}$, is one-to-one and isometric provided that the norm of the sequence $\{x_r\}$ is defined by

$$\|\{x_r\}\|^2 = \sum_r |x_r|^2.$$

5. A vector $g$ is cyclic with respect to a self-adjoint operator $A$ if and only if $P_r g \neq 0$ for all spectral projections $P_r$.

6. The spectrum of $A$ is simple if and only if $\{\mathcal{A}g\} = \mathcal{H}$ for cyclic vectors $g$.

7. In a finite-dimensional space the self-adjoint operator $A$ has the form $A = \sum_r \lambda_r P_r$, where $P_r$ is the projection with range $M_r$ and $M_r$ is the linear subspace of vectors $\psi_r$ which satisfy

$$(A - \lambda_r I)\psi_r = 0.$$


4-2. THE RESOLVENT AND THE SPECTRUM

In this section we consider a self-adjoint operator $A$ in Hilbert space. $A$ may be bounded or unbounded; in any case it is closed. For any complex $z = x + iy$, we consider the operator $A - zI$, and we denote by $D$ its domain and by $\Delta(z)$ its range, so that $\Delta_A \equiv \Delta(0)$ is the range of $A$.

The equation $Af - zf = g$ establishes a correspondence between the elements of $D$ and the elements of $\Delta(z)$. If this correspondence is one-to-one, then there exists the resolvent operator $R_z = (A - zI)^{-1}$ with domain $\Delta(z)$ and range $D$.

We adopt the following definition of the spectrum of the operator $A$: The values of $z$ for which $\Delta(z) = \mathcal{H}$ are called regular values for $A$; all other values are called the spectrum of $A$. All the values $z = x + iy$, with $y \neq 0$, are regular values for $A$ (Problems 1 and 2). Thus the spectrum of a self-adjoint operator $A$ is always real; in particular the proper values are real.

For real values $z = \lambda$, $\Delta(\lambda)$ may or may not be equal to $\mathcal{H}$. There are four cases possible, depending on whether $\Delta(\lambda)$ is closed or not, and whether we have $\overline{\Delta(\lambda)} \subsetneq \mathcal{H}$ or $\overline{\Delta(\lambda)} = \mathcal{H}$. We can thus divide the spectrum $\Lambda$ into two parts, $\Lambda = \Lambda_c \cup \Lambda_d$, called the continuous and the discrete part. If $\Delta(\lambda) \subsetneq \overline{\Delta(\lambda)}$, then $\lambda \in \Lambda_c$, and if $\overline{\Delta(\lambda)} \subsetneq \mathcal{H}$, then $\lambda \in \Lambda_d$. The four cases which arise in this way are given in Table 4-1.

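Table 4-1

THE SPECTRUM OF A SELF-ADJOINT OPERATOR

Property of $\Delta(\lambda)$ | $\overline{\Delta(\lambda)} = \mathcal{H}$ | $\overline{\Delta(\lambda)} \subsetneq \mathcal{H}$
$\Delta(\lambda) = \overline{\Delta(\lambda)}$ (closed) | $\lambda \notin \Lambda$ | $\lambda \notin \Lambda_c$, $\lambda \in \Lambda_d$
$\Delta(\lambda) \subsetneq \overline{\Delta(\lambda)}$ (not closed) | $\lambda \in \Lambda_c$, $\lambda \notin \Lambda_d$ | $\lambda \in \Lambda_c$, $\lambda \in \Lambda_d$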

We see from this table that the range $\Delta(\lambda)$ of the operator $A - \lambda I$ furnishes a convenient means to identify the structure of the spectrum of a self-adjoint operator.

This characterization does not say anything about the multiplicity of the spectrum. In the discrete spectrum, it is easy to verify that the dimension of $\Delta(\lambda)^\perp$ is exactly equal to the multiplicity of the eigenvalue $\lambda$ (Problem 6). To characterize the multiplicity in the continuous part of $\Lambda$ is more difficult. In order to do this, we need the functional calculus to which we devote much of the remaining part of this chapter.


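We conclude this section with some remarks on the resolvent operator $R_z = (A - zI)^{-1}$. If $z$ is a regular value for $A$ (for instance, for all $z$ with $\operatorname{Im} z \neq 0$), then $R_z$ is a bounded operator defined for all elements $g \in \mathcal{H} = \Delta(z)$.

From the two equivalent relations $(A - zI)f = g$ and $R_z g = f$, we easily obtain, for all $g \in \mathcal{H}$,

$$R_{z_1} g = R_{z_2}(A - z_2 I) R_{z_1} g,$$

$$R_{z_2} g = R_{z_2}(A - z_1 I) R_{z_1} g.$$

By taking the difference of the two equations we obtain

$$R_{z_1} - R_{z_2} = (z_1 - z_2) R_{z_2} R_{z_1}. \qquad (4\text{-}5)$$

This is called the Hilbert relation. It follows immediately that two bounded resolvents commute:

$$R_{z_1} R_{z_2} = R_{z_2} R_{z_1}.$$

Finally one can prove the relation (Problem 7)

$$R_z^* = R_{z^*}.$$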
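The resolvent identities of this section can be spot-checked on a small matrix. The sketch below is a numerical illustration under stated assumptions: the matrix `A`, the sample points `z1` and `z2`, and all helper names are hypothetical, and the resolvent is computed from the closed form of a 2x2 inverse.

```python
A = [[2.0, 1.0], [1.0, 2.0]]  # hypothetical self-adjoint matrix, spectrum {1, 3}

def resolvent(z):
    """R_z = (A - z I)^{-1}, using the closed form of a 2x2 inverse."""
    a, b = A[0][0] - z, A[0][1]
    c, d = A[1][0], A[1][1] - z
    det = a * d - b * c          # nonzero whenever z is not an eigenvalue
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(X, Y):
    """Entrywise difference X - Y."""
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    """Scalar multiple c X."""
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def adjoint(X):
    """Conjugate transpose of X."""
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

z1, z2 = 0.5 + 1.0j, -1.0 + 2.0j   # both have Im z != 0, hence are regular
R1, R2 = resolvent(z1), resolvent(z2)

hilbert_relation = close(sub(R1, R2), scale(z1 - z2, matmul(R2, R1)))
commute = close(matmul(R1, R2), matmul(R2, R1))
adjoint_rule = close(adjoint(R1), resolvent(z1.conjugate()))
```

The three flags correspond, in order, to the Hilbert relation, the commutativity of resolvents, and the adjoint relation for the resolvent; all hold for this example.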

PROBLEMS

1. The operator $(A - zI)^{-1}$, with $A$ self-adjoint and $z = x + iy$ ($y \neq 0$), exists and is bounded.

2. If $z = x + iy$ ($y \neq 0$), then $\Delta(z) = \mathcal{H}$.

3. The solutions of the eigenvalue problem $A\psi = \lambda\psi$ are the orthogonal complement of $\Delta(\lambda)$.

4. The spectrum of the operator $Q$ (Section 3-5) is the entire real line.

5. The point $\lambda$ is in the continuous spectrum of the self-adjoint operator $A$ if and only if there exists a sequence of normalized vectors $\varphi_n$ ($\|\varphi_n\| = 1$) such that $(A - \lambda I)\varphi_n \to 0$ for $n \to \infty$.

6. The multiplicity of an eigenvalue $\lambda \in \Lambda_d$ of a self-adjoint operator is equal to the dimension of $\Delta(\lambda)^\perp$.

7. The resolvent of a self-adjoint operator satisfies the relation $R_z^* = R_{z^*}$ for all regular values $z$.

4-3. THE SPECTRAL THEOREM

At the end of Section 4-1 we introduced the notion of the spectral measure for the finite-dimensional case. We shall now generalize this notion to the operators in the infinite-dimensional Hilbert space. The fundamental spectral theorem may then be stated in the following way.


To every self-adjoint operator $A$ there corresponds a unique spectral measure $\Delta \to E(\Delta)$ defined on the Borel sets of the real line $R^1$ such that

$$A = \int_{-\infty}^{+\infty} \lambda \, dE_\lambda, \qquad (4\text{-}6)$$

where $E_\lambda \equiv E((-\infty, \lambda])$ and the integral is to be interpreted as the Lebesgue-Stieltjes integral valid for any $f \in D_A$:

$$(f, Af) = \int_{-\infty}^{+\infty} \lambda \, d(f, E_\lambda f). \qquad (4\text{-}7)$$

The proof of this important theorem is found in reference 1. It suffices here to see that it is an obvious generalization of the corresponding theorem for the discrete case (Section 4-1).

The one-to-one correspondence of the spectral measures on the real line to the self-adjoint operators permits us to replace one by the other.

To every spectral measure on the real line belongs a spectral family of projections $E_\lambda$ with $-\infty < \lambda < +\infty$. The general projection of such a spectral family is defined by $E_\lambda \equiv E((-\infty, \lambda])$. The spectral family satisfies the relations

$$E_\lambda \leq E_\mu \quad \text{for } \lambda < \mu,$$
$$E_{\lambda+0} = E_\lambda, \qquad (4\text{-}8)$$
$$E_{-\infty} = 0, \qquad E_{+\infty} = I.$$

If the spectral family is known, the spectral measure can be reconstructed by the formula

$$E(\Delta) = \int_\Delta dE_\lambda$$

for any Borel set $\Delta$. This formula is an abbreviated notation for the numerical integral, valid for any $f \in \mathcal{H}$:

$$(f, E(\Delta)f) = \int_\Delta d(f, E_\lambda f).$$

PROBLEMS

1. If the spectrum $\Lambda$ of a self-adjoint operator is continuous (that is, $\Lambda_d$ is the null set), then $(f, E_\lambda f)$ is a continuous function of $\lambda$ for all vectors $f$.

2. The spectral measure of a self-adjoint operator $A$ reduces the resolvent operator $R_z = (A - zI)^{-1}$; that is, $E(\Delta) R_z = R_z E(\Delta)$.


4-4. THE FUNCTIONAL CALCULUS

The last formula of the preceding section is a special case of a much more general relation between certain functions $u(\lambda)$ of a real variable $\lambda$ and linear operators in Hilbert space.

If $T$ is a bounded operator defined in $D_T = \mathcal{H}$, then it is clear what we mean by the operator $T^2$. It is the operator which assigns to every vector $f$ the vector $T^2 f = T(Tf)$. It is not difficult to extend this to any finite positive power of $T$, by a recursive definition: $T^n f = T(T^{n-1} f)$. If $u(\lambda) \equiv a_0\lambda^n + a_1\lambda^{n-1} + \cdots + a_n$ is a polynomial of $\lambda$, then we define the operator $u(T) \equiv a_0 T^n + a_1 T^{n-1} + \cdots + a_n I$ by setting

$$u(T)f = a_0 T^n f + a_1 T^{n-1} f + \cdots + a_n f.$$

This assignment of polynomials to operators satisfies the relations

(u1 + u2)(T) = u1(T) + u2(T),

(u1u2)(T) = u1(T)u2(T), (4—9)

(cu)(T) = cu(T)

for all complex $c$. Furthermore, all the operators $u(T)$ commute with each other.

The correspondence of functions $u(\lambda)$, with operators $u(T)$, which satisfies relations (4-9) is called a functional calculus. It is desirable to extend this calculus to the largest possible class of functions for self-adjoint operators. A convenient way to do this is the following.

We define $u(\lambda)$ as measurable with respect to the spectral family $E_\lambda$ if for every $f \in \mathcal{H}$ the function $u(\lambda)$ is measurable with respect to the Lebesgue-Stieltjes measure defined by the spectral density functions $\rho_f(\lambda) = (f, E_\lambda f)$.

Similarly, we say $u(\lambda)$ is integrable with respect to the spectral family $E_\lambda$ if for every $f \in \mathcal{H}$ the function $u(\lambda)$ is integrable with respect to the measure defined by $\rho_f(\lambda)$.

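Let $A$ be a self-adjoint linear operator and $E_\lambda$ its spectral family. If $u(\lambda)$ is measurable and integrable with respect to $E_\lambda$, then we can define, for any pair of vectors $f, g \in \mathcal{H}$, the expression

$$L_g(f) = \int_{-\infty}^{+\infty} u(\lambda) \, d(g, E_\lambda f),$$

and one verifies easily that $L_g(f)$ is a bounded linear functional of $f$. By Riesz' theorem there thus exists a unique vector $g^*$ with the property $L_g(f) = (g^*, f)$.

The correspondence $g \to g^*$ defines a bounded linear operator; writing $g^* = u(A)^* g$, we obtain a bounded linear operator $u(A)$ so that we have the formula

$$(f, u(A)g) = \int_{-\infty}^{+\infty} u(\lambda) \, d(f, E_\lambda g).$$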


This will be written in an abbreviated notation as

$$u(A) = \int_{-\infty}^{+\infty} u(\lambda) \, dE_\lambda.$$

One can verify that the correspondence $u(\lambda) \to u(A)$ of functions with operators has the properties (4-9) (Problem 6).

The operator $u(A)$ is bounded if the function $u(\lambda)$ is essentially bounded with respect to $E_\lambda$; that is, if there exists a real number $M < \infty$ such that the set $\{\lambda : |u(\lambda)| > M\}$ has measure zero for all measures $\rho_f$. The greatest lower bound of the numbers $M$ with this property is the bound of the operator $u(A)$ (Problem 1).
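In finite dimensions the spectral integral reduces to the finite sum $\sum_r u(\lambda_r) P_r$, which makes the functional calculus easy to test. The sketch below compares that sum with direct polynomial evaluation and checks the product rule of (4-9). The matrix, its projections, the two sample polynomials, and the helper names are hypothetical choices for illustration.

```python
# Hypothetical example: A = [[2, 1], [1, 2]] = 1 * P_1 + 3 * P_2.
A = [[2.0, 1.0], [1.0, 2.0]]
lams = [1.0, 3.0]
P = [
    [[0.5, -0.5], [-0.5, 0.5]],
    [[0.5, 0.5], [0.5, 0.5]],
]
I2 = [[1.0, 0.0], [0.0, 1.0]]

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lincomb(coeffs, mats):
    """Linear combination sum_r coeffs[r] * mats[r]."""
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(2)]
            for i in range(2)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

def u_of_A(u):
    """Spectral definition: u(A) = sum_r u(lambda_r) P_r."""
    return lincomb([u(lam) for lam in lams], P)

def u1(x):
    return x * x + 1.0

def u2(x):
    return 2.0 * x - 3.0

# Direct evaluation of the same polynomials on the matrix itself.
direct_u1 = lincomb([1.0, 1.0], [matmul(A, A), I2])   # A^2 + I
direct_u2 = lincomb([2.0, -3.0], [A, I2])             # 2A - 3I

agrees = close(u_of_A(u1), direct_u1) and close(u_of_A(u2), direct_u2)
product_rule = close(u_of_A(lambda x: u1(x) * u2(x)),
                     matmul(u_of_A(u1), u_of_A(u2)))
```

Both flags are true here. The product rule works because the projections are mutually orthogonal, so cross terms in the product of the two sums vanish.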

It is possible to extend the functional calculus to unbounded operators by restricting the vectors $f$ and $g$ to suitably defined dense linear manifolds. Some slight modifications of the relations (4-9) are then needed because of the restrictions of the domains of definition.

Some important applications of the functional calculus are the following examples:

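1) The integral representation of the self-adjoint operator $A$ with the spectral family $E_\lambda$ is given by

$$A = \int_{-\infty}^{+\infty} \lambda \, dE_\lambda.$$

It is bounded if and only if

$$\int_{-\infty}^{+\infty} \lambda^2 \, d(f, E_\lambda f)$$

exists for all $f \in \mathcal{H}$. If it is unbounded, then the domain of definition $D_A$ is given by the set of vectors $f$ for which

$$\int_{-\infty}^{+\infty} \lambda^2 \, d(f, E_\lambda f) < \infty.$$

2) Let $\chi_\Delta(\lambda)$ be the characteristic function of the Borel set $\Delta$. It is measurable and integrable with respect to $E_\lambda$. The operator

$$\int_{-\infty}^{+\infty} \chi_\Delta(\lambda) \, dE_\lambda = \int_\Delta dE_\lambda = E(\Delta)$$

defines the projection of the spectral measure associated with $\Delta$.

3) For every real $t$ and every self-adjoint operator $A$ one can define

$$U_t = e^{iAt} = \int_{-\infty}^{+\infty} e^{i\lambda t} \, dE_\lambda. \qquad (4\text{-}10)$$

One verifies that this is a family of unitary operators which satisfies

$$U_{t_1} U_{t_2} = U_{t_1 + t_2}.$$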


This relation between self-adjoint operators and unitary one-parameter groups has a converse in the form of

Stone's theorem: Every unitary one-parameter group $U_t$ for which $(f, U_t g)$ is a continuous function in $t$ for all $f, g \in \mathcal{H}$ defines a unique spectral measure such that

$$U_t = \int_{-\infty}^{+\infty} e^{i\lambda t} \, dE_\lambda.$$

The self-adjoint operator $A$ which corresponds to this spectral measure is called the generating operator of the group. We may then write the preceding relation equivalently $U_t = e^{iAt}$.
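For a matrix with known spectral projections, the one-parameter group can be written as a finite sum and its defining properties checked directly. This is a sketch under assumptions: it takes the sign convention $U_t = \sum_r e^{i\lambda_r t} P_r$, and the example spectrum, projections, sample times, and helper names are all hypothetical.

```python
import cmath

# Hypothetical example: generator with spectrum {1, 3} and these projections.
lams = [1.0, 3.0]
P = [
    [[0.5, -0.5], [-0.5, 0.5]],
    [[0.5, 0.5], [0.5, 0.5]],
]
I2 = [[1.0, 0.0], [0.0, 1.0]]

def U(t):
    """U_t = sum_r exp(i lambda_r t) P_r (sign convention assumed)."""
    return [[sum(cmath.exp(1j * lam * t) * Pr[i][j] for lam, Pr in zip(lams, P))
             for j in range(2)] for i in range(2)]

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(X):
    """Conjugate transpose of X."""
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

t1, t2 = 0.7, -1.9
group_law = close(matmul(U(t1), U(t2)), U(t1 + t2))
unitary = close(matmul(adjoint(U(t1)), U(t1)), I2)
identity_at_zero = close(U(0.0), I2)
```

The group law follows from the orthogonality of the projections, and unitarity from the fact that each phase factor has modulus one.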

The set of all bounded operators of the form $u(A)$, with $A$ some self-adjoint operator, forms an algebra $\mathcal{A}$. We shall denote it the algebra generated by $A$.

If $T_1, T_2 \in \mathcal{A}$, then $T_1 + T_2 \in \mathcal{A}$ and $T_1 T_2 \in \mathcal{A}$. Furthermore if $T \in \mathcal{A}$, then $\lambda T \in \mathcal{A}$ for all complex $\lambda$. The algebra is abelian; this means that for every pair of operators $T_1$ and $T_2$ one has $T_1 T_2 = T_2 T_1$. The algebra can also be characterized in the following equivalent way:

Let $\mathcal{S}$ be any set of self-adjoint operators; then we denote by $\mathcal{S}'$ the set of all bounded operators which commute with $\mathcal{S}$. The algebra generated by $\mathcal{S}$ is then the set $\mathcal{S}'' \equiv (\mathcal{S}')'$. The algebra $\mathcal{A}$ defined above is then also the algebra defined by $\mathcal{A} = \{A\}''$ (Problem 4). The fact that $\mathcal{A}$ is abelian is expressed by the relation $\mathcal{A} \subseteq \mathcal{A}'$. If $\mathcal{A} = \mathcal{A}'$, then the algebra is called maximal abelian. It then does not admit any abelian extension.

In Section 4-1 we stated that the self-adjoint operator $A$ has simple spectrum if and only if the algebra generated by the operator $A$ has maximal dimension. We can now transfer this property to the infinite-dimensional case by defining: The operator $A$ has simple spectrum if the algebra $\mathcal{A} = \{A\}''$ is a maximal abelian algebra: $\mathcal{A} = \mathcal{A}'$.

At this point we have succeeded in transferring the notion of simple spectrum to the infinite-dimensional case without making use of the notion of eigenvector.

PROBLEMS

1. The operator $u(A) = \int u(\lambda) \, dE_\lambda$ is bounded if and only if the function $u(\lambda)$ is essentially bounded.

2. The domain of definition of the unbounded operator $u(A)$ is the set of vectors $f$ for which

$$\int |u(\lambda)|^2 \, d(f, E_\lambda f) < \infty.$$

3. The operator $u(A)$ for $A$ self-adjoint is unitary if and only if $|u(\lambda)| = 1$ a.e. with respect to the spectral family $E_\lambda$.

4. The algebra $\mathcal{A} = \{A\}''$ consists of all the operators of the form $u(A)$ where $u(\lambda)$ is an essentially bounded measurable and integrable function with respect to the spectral family $E_\lambda$ of $A$ [2, Section 129].

5. If the spectrum of the self-adjoint operator $A$ is discrete and $\{A\}'' = \mathcal{A} \subsetneq \mathcal{A}'$, then there exists at least one degenerate eigenvalue $\lambda$ of $A$.

6. The correspondence

$$u(\lambda) \to u(A) \equiv \int u(\lambda) \, dE_\lambda$$

is a functional calculus; that is, it satisfies relations (4-9).

4-5. SPECTRAL DENSITIES AND GENERATING VECTORS

In this section we shall study in more detail the spectral density function introduced in the preceding section. Let $E_\lambda$ be a spectral family, and $f$ a vector in $\mathcal{H}$; then we define the spectral density function $\sigma_f(\lambda) = (f, E_\lambda f)$. This function will depend on the spectral family $E_\lambda$ on the one hand, and on the vector $f$ on the other. We shall study the dependence on $f$ for a fixed spectral family $E_\lambda$.

First we list a few simple properties of any function $\sigma(\lambda) = (f, E_\lambda f)$ which follow directly from the basic properties (4-8) of a spectral family.

$$\sigma(\lambda + 0) = \sigma(\lambda),$$
$$\sigma(-\infty) = 0, \qquad (4\text{-}11)$$
$$\sigma(+\infty) = 1 \quad \text{for } \|f\| = 1.$$

Furthermore the function $\sigma(\lambda)$ is discontinuous for certain values $\lambda \in \Lambda_d$. For all other values of $\lambda$ it is continuous. In particular for all values $\lambda \notin \Lambda$ the function $\sigma(\lambda)$ is constant.

Any function $\sigma(\lambda)$ with these properties defines a measure $\rho(\Delta)$ for all Borel subsets $\Delta$ of the real line through the formula:

$$\rho(\Delta) = \int_\Delta d\sigma(\lambda).$$

Thus for each $f$ we have defined a measure.

We shall now examine the dependence of this measure on the vector $f$. We recall from Section 1-2 that measures are a partially ordered set. A measure $\rho_1$ is inferior with respect to $\rho_2$ ($\rho_1 \prec \rho_2$) if $\rho_1$ is absolutely continuous with respect to $\rho_2$.


For the measures which we are studying there is another partial ordering possible by using the dependence of $\sigma$ on the vector $f$. Let

$$\sigma_1(\lambda) = (f_1, E_\lambda f_1) \quad \text{and} \quad \sigma_2(\lambda) = (f_2, E_\lambda f_2).$$

To any vector $f$ we can associate a subspace

$$M(f) \equiv \{\mathcal{A}'f\},$$

where the right-hand side denotes the subspace generated by all vectors of the form $Sf$ with $S \in \mathcal{A}'$. If we denote by

$$M_1 \equiv M(f_1), \qquad M_2 \equiv M(f_2),$$

then we may define a partial ordering of the measure by denoting $\sigma_1$ inferior to $\sigma_2$ if $M_1 \subseteq M_2$. We have adopted a terminology which is justified in anticipation of the following.

Theorem: Let $E_\lambda$ be a spectral family and $f_1, f_2 \in \mathcal{H}$ two arbitrary vectors in $\mathcal{H}$; then the spectral density function $\sigma_1$ is absolutely continuous with respect to $\sigma_2$ if and only if $M_1 \subseteq M_2$ [3].

This theorem shows that the notions of partial ordering derived from absolute continuity and from the partial ordering of associated subspaces are equivalent. It follows from this that the measure associated with a cyclic vector of $\mathcal{A}'$ has a maximal property:

If $g$ is a vector such that $\{\mathcal{A}'g\} = \mathcal{H}$, then the measure defined by $\rho(\lambda) = (g, E_\lambda g)$ is maximal in the partially ordered set of measures. That is, for any $f \in \mathcal{H}$, we have

$$\sigma_f \prec \rho.$$

It follows immediately from this remark that if $g'$ is any other cyclic vector and $\rho'$ the associated measure, then $\rho \sim \rho'$ (Problem 4). The spectral density functions for two different cyclic vectors define the same (maximal) measure class.

In conclusion of this subsection we can now give an alternate definition of the operator with simple spectrum which generalizes the one given in Section 4-1. We have defined an operator $A$ to have simple spectrum if $\{A\}'' = \mathcal{A} = \mathcal{A}'$. An equivalent definition is possible with the cyclic vectors by using the following.

Theorem: An operator $A$ has simple spectrum if and only if there exists a cyclic vector for the algebra $\mathcal{A} = \{A\}''$ (Problem 5).

This theorem generalizes the corresponding elementary theorem for the finite-dimensional case.


PROBLEMS

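1. If $\mathcal{A}$ is abelian and $M = \{\mathcal{A}'g\}$, then the projection $P$ with range $M$ is contained in $\mathcal{A}$.

2. If $M \equiv \{\mathcal{A}'g\} \subsetneq \mathcal{H}$, then there exists a vector $g_1$ such that

$$M \subsetneq \{\mathcal{A}'g_1\}.$$

3. There exists a cyclic vector $g$ for $\mathcal{A}'$, such that

$$\{\mathcal{A}'g\} = \mathcal{H}.$$

(Use Zorn's lemma.)

4. If $g$ and $g'$ are two cyclic vectors for $\mathcal{A}'$, then the associated measures $\rho$ and $\rho'$ are equivalent ($\rho \sim \rho'$).

5. $\mathcal{A} \equiv \{A\}'' = \mathcal{A}'$ if and only if there exists a vector $g$ such that $\{\mathcal{A}g\} = \mathcal{H}$. In that case the spectrum of $A$ is simple.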

4-6. THE SPECTRAL REPRESENTATION

This chapter started out with a discussion of the spectral representation in the finite-dimensional case. We have now all the tools on hand to demonstrate the existence of a spectral representation in the infinite-dimensional case. Let us recall the finite situation.

We have seen in Section 4-1 that any self-adjoint operator in an abstract space $\mathcal{H}$ defines an $l^2_n$ consisting of finite sequences $\{x_\nu\}$ ($\nu = 1, 2, \ldots, n$) together with an isometric mapping $\Omega$ of $\mathcal{H}$ onto $l^2_n$ such that the transformed operator $\hat{A} = \Omega A \Omega^{-1}$ is a multiplication operator.

When we try to transfer this theorem to the infinite-dimensional case, the first question that comes up is: what is the measure in the continuous part of the spectrum of $A$? From the remarks in Section 4-1 it seems fairly natural to suspect that only the class of equivalent measures will be uniquely determined, and that the measure which is thus defined is the measure associated with the cyclic vector.

We consider thus an operator $A$ with simple spectrum, and define a measure over the spectrum with the spectral density function $\rho(\lambda) = (g, E_\lambda g)$. The space $L^2(\Lambda, \rho)$ consists of functions $u(\lambda)$ where $\lambda \in \Lambda$, over the spectrum $\Lambda$ of the operator $A$. The norm is defined by $\|u\|^2 = \int |u(\lambda)|^2 \, d\rho(\lambda)$, with $\rho(\lambda) \equiv (g, E_\lambda g)$. In this expression $E_\lambda$ is the uniquely determined spectral family of $A$, and $g$ is a cyclic vector. To every $f \in \{\mathcal{A}g\}$, that is, to every $f$ of the form $f = Tg$ with $T \in \mathcal{A}$, we can associate a function $u(\lambda)$ by the rule $T = u(A)$. Furthermore, because

$$\|f\|^2 = \|Tg\|^2 = (g, T^*Tg) = \int_\Lambda |u(\lambda)|^2 \, d\rho(\lambda),$$

we see that the correspondence $f \leftrightarrow \{u(\lambda)\} \in L^2(\Lambda, \rho)$ is an isometry from $\{\mathcal{A}g\}$ onto a dense linear subset of $L^2(\Lambda, \rho)$. Since $\{\mathcal{A}g\}$ is dense in $\mathcal{H}$, we can extend this correspondence by continuity to an isometric mapping of $\mathcal{H}$ onto $L^2(\Lambda, \rho)$. We denote this mapping by $\Omega$.

What becomes of the operator $A$ when it is transformed into an operator $\hat{A}$ in $L^2(\Lambda, \rho)$? Since the functional calculus is multiplicative, the operator $Au(A)$ corresponds to the function $\lambda u(\lambda)$. This is true without qualification if $\lambda u(\lambda)$ is essentially bounded. If it is unbounded, certain precautions are required in order to take account of the restricted domain of definition. Let us therefore concentrate on the bounded case.

It follows then, from the definition of $\Omega$, that

$$\Omega A \Omega^{-1}\{u(\lambda)\} = \Omega A f = \Omega A u(A) g = \{\lambda u(\lambda)\},$$

so that

$$\hat{A}\{u(\lambda)\} = \{\lambda u(\lambda)\} \quad \text{and} \quad \hat{A} = \Omega A \Omega^{-1}.$$

This is the spectral representation of the operator $A$.

The spectral representation is unique in the sense of equivalence. If $f \leftrightarrow \{u_1(\lambda)\}$ is another such representation, then there exists an equivalent density function $\rho_1 \sim \rho$, and the two representations are connected by a transformation

$$u(\lambda) = \sqrt{\frac{d\rho_1}{d\rho}} \, u_1(\lambda),$$

where $d\rho_1/d\rho$ is the Radon-Nikodym derivative of the two equivalent measures.
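As a quick consistency check on this uniqueness statement, suppose the two representations are related by $u(\lambda) = \sqrt{d\rho_1/d\rho}\,u_1(\lambda)$, with $u_1$ normed with respect to $\rho_1$ and $u$ with respect to $\rho$. That specific form is an assumption of this sketch and is in any case fixed only up to a phase factor of modulus one. The defining property of the Radon-Nikodym derivative then gives equality of the two norms:

```latex
\begin{aligned}
\int_\Lambda |u(\lambda)|^2 \, d\rho(\lambda)
  &= \int_\Lambda |u_1(\lambda)|^2 \, \frac{d\rho_1}{d\rho}(\lambda) \, d\rho(\lambda) \\
  &= \int_\Lambda |u_1(\lambda)|^2 \, d\rho_1(\lambda) .
\end{aligned}
```

Both sides equal $\|f\|^2$, so either representation is isometric whenever the other is.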

The spectral representation has a generalization to a finite or countably infinite set of commuting self-adjoint operators. Let $A_r$ ($r = 1, 2, \ldots$) be such a set, and let $\Lambda_r$ be the spectrum of $A_r$; then there exists a uniquely defined measure class $\{\rho\}$ on the Borel sets of the Cartesian product space $S = \Lambda_1 \times \Lambda_2 \times \cdots$ and an isometric mapping of $\mathcal{H}$ onto $L^2(S, \rho)$ such that if $f \leftrightarrow \{u(\lambda)\}$, where $\lambda = \{\lambda_1, \lambda_2, \ldots\} \in S$, then

$$\|f\|^2 = \int_S |u(\lambda)|^2 \, d\rho.$$

If we write as before $\{u(\lambda)\} = \Omega f$, then the transformed operators $\hat{A}_r$ are defined by

$$\hat{A}_r = \Omega A_r \Omega^{-1}.$$

One can prove that if $P_r(\Delta_r)$ is a spectral projection of the operator $A_r$, then $\hat{P}_r(\Delta_r) \equiv \Omega P_r(\Delta_r) \Omega^{-1}$ is given explicitly by

$$\hat{P}_r(\Delta_r)\{u(\lambda)\} = \{\chi_{\Delta_r}(\lambda_r) u(\lambda)\},$$

where $\chi_{\Delta_r}(\lambda_r)$ is the characteristic function of the set $\Delta_r$. However, this does not mean that the operator $\hat{A}_r$ is a multiplication operator in $L^2(S, \rho)$. In order to prove this property, a further assumption is needed which expresses something which we might describe as independence of the operators and which we shall not give in detail [3]. Under this additional assumption, one finds that

$$\hat{A}_r\{u(\lambda)\} = \{\lambda_r u(\lambda)\} \qquad (r = 1, 2, \ldots). \qquad (4\text{-}12)$$

PROBLEMS

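1. If $f = \{f(\lambda)\} \in L^2(-\infty, +\infty)$ and $A$ is the operator $Af = \{-i(df/d\lambda)\}$ defined on a certain dense linear subset $D_A$ of $L^2$ such that $A$ is self-adjoint, then the transformation

$$\tilde{f}(\mu) = (\Omega f)(\mu) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-i\mu\lambda} f(\lambda) \, d\lambda$$

is isometric and transforms $A$ into the spectral representation

$$\hat{A} = \Omega A \Omega^{-1}, \qquad (\hat{A}\tilde{f})(\mu) = \mu\tilde{f}(\mu).$$

*2. If two self-adjoint operators $A$ and $B$ commute, and if $A$ has simple spectrum, then $B$ is a function of $A$: $B = u(A)$. $B$ has simple spectrum, too, if and only if the function $u(\lambda)$ has an inverse.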

4-7. EIGENFUNCTION EXPANSIONS

Let us consider a self-adjoint operator $A$ with spectrum $\Lambda$. From the preceding sections we know that such an operator defines a unique spectral measure $E(\Delta)$ ($\Delta \subseteq \Lambda$) and that this spectral measure determines the operator. All the properties of the operator $A$ are known if an explicit method is known for calculating the projections $E(\Delta)$ of the spectral measure. One such explicit method, which often works in practice, is obtained by utilizing the eigenfunction expansion. This method resembles in some respects the expansion in eigenvectors used for illustrative purposes in Section 4-1. The difference here is that we use solutions of the eigenvalue equation which fall outside the Hilbert space. We illustrate this in a special case.

Let t = L2( — cc, + cc) be the space of square-integrable functions onthe real line and A some self-adjoint operator in if. The solutionsof the equation = which are square-integrable are called eigenvec-tors of A, and they may be chosen so as to form an orthonormal system in if.If this system is complete, then the spectrum of A is discrete, and we haveessentially the case discussed in Section 4-1. If the system is not complete,then there may exist other solutions of the operator equation

Aço = (4—13)

for which if. In order for this equation to make sense, it is evidentlynecessary to extend the domain of definition of the operator A to such func-

Page 76: Foundations of Quantum Mechanics

4-7 EIGENFUNCTION EXPANSIONS 63

tions; this can often be done. For instance, if A is the differential operatorP discussed in Section 3-6, then we may define it on functions 9(x) whichare not in L2. The only property of 9(x) which we need to define the operatorP = — i(d/dx) is that 9(x) be differentiable. Such a function is, for instance,the exponential 9(x) = (k real), and one verifies that in this case one hasan equation

$P\varphi = k\varphi$ for all $-\infty < k < +\infty$.

It may be that there exists a complete set of eigenfunctions, that is, solutions of Eq. (4-13) which are not contained in $\mathcal{H}$. By "complete" we mean that the solutions depend on one (or several) parameters k: $\varphi(k, x)$, such that every vector f(x) of $L^2$ which is orthogonal to the eigenvectors is a linear superposition of the eigenfunctions $\varphi(k, x)$:

$f(x) = \int_K \tilde{f}(k)\,\varphi(k, x)\,d\rho(k),$

where the integral extends to the entire range K of the parameter k. If such a complete set of eigenfunctions exists, then we can also normalize them suitably so that the norm of f satisfies Parseval's equation

$\int_{-\infty}^{+\infty} |f(x)|^2\,dx = \int_K |\tilde{f}(k)|^2\,d\rho(k).$

For instance, in the above example with A = P, the system

$\varphi(k, x) = \frac{1}{\sqrt{2\pi}}\,e^{ikx}$

is such a complete system, and the transformation

$f(x) = \frac{1}{\sqrt{2\pi}} \int_K \tilde{f}(k)\,e^{ikx}\,dk$

is simply the Fourier transformation of the function f(x).

If such a complete system exists for all f orthogonal to the eigenvectors, then we say the operator A admits an eigenfunction expansion. It is then possible to prove that

$(Af)(x) = \int_K \tilde{f}(k)\,\lambda(k)\,\varphi(k, x)\,d\rho(k),$

where $\lambda(k)$ is the eigenvalue belonging to $\varphi(k, x)$. When A is considered as an operator in the system of functions $\tilde{f}(k)$, it is a multiplication operator. The eigenfunction expansion, if it exists, can thus serve the same purpose as the spectral representation.

Just as in the case of the Fourier transforms, there exists an inversion of the transformation (4-13) given by an equation such as

$\tilde{f}(k) = \int_{-\infty}^{+\infty} f(x)\,\varphi^*(k, x)\,dx.$

The correspondence $f(x) \leftrightarrow \tilde{f}(k)$ is thus an isometric mapping of the space $L^2(-\infty, +\infty)$ onto the space $L^2(K, \rho)$, such that A appears in the latter space as a multiplication operator.
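The statements above for A = P can be checked numerically. The following is only a minimal sketch in Python (standard library only), under assumptions that are not part of the text: the Gaussian test function, the grid spacings, and all function names are hypothetical choices, and the integrals are approximated by Riemann sums on finite grids. It compares both sides of Parseval's equation and tests that P appears as multiplication by k on the functions $\tilde{f}(k)$.

```python
import cmath
import math

def transform(func, xs, dx, k):
    """Riemann-sum approximation of (2*pi)**-0.5 * integral func(x) exp(-i k x) dx."""
    total = sum(func(x) * cmath.exp(-1j * k * x) for x in xs)
    return total * dx / math.sqrt(2 * math.pi)

def f(x):
    # Hypothetical test function: a Gaussian wave packet centred at k = 1.5.
    return math.exp(-x * x / 2) * cmath.exp(1.5j * x)

def Pf(x):
    # (P f)(x) = -i f'(x), differentiated by hand for the test function above.
    return (1.5 + 1j * x) * f(x)

dx = dk = 0.1
xs = [-10 + n * dx for n in range(201)]   # finite grid standing in for the real line
ks = [-6.5 + n * dk for n in range(161)]  # finite grid standing in for the range K

f_tilde = [transform(f, xs, dx, k) for k in ks]
Pf_tilde = [transform(Pf, xs, dx, k) for k in ks]

# Both sides of Parseval's equation, with d rho(k) = dk.
norm_x = sum(abs(f(x)) ** 2 for x in xs) * dx
norm_k = sum(abs(a) ** 2 for a in f_tilde) * dk

# Largest deviation between the transform of P f and k times the transform of f.
mult_error = max(abs(b - k * a) for k, a, b in zip(ks, f_tilde, Pf_tilde))

print(norm_x, norm_k, mult_error)
```

For this rapidly decaying test function the two norms should agree closely and the deviation `mult_error` should be negligible; for functions that decay slowly the finite grids would have to be enlarged.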

The usefulness of this method of obtaining a spectral representation for self-adjoint operators is evident if the operator A is a differential operator. The solution of Eq. (4-13) is then obtained as the solution of an ordinary or partial differential equation.

However, the expansion theorem is not generally true; and even if it is true, as, for instance, for certain partial differential operators, Eq. (4-13) does not suffice to determine the eigenfunctions uniquely. It is then necessary to add further conditions, which determine the eigenfunctions. For further detail on these questions the reader should refer to references 4 and 5.

The mathematical theory of eigenfunction expansions is not yet sufficiently developed to guarantee its availability in every case of practical interest. In general situations we must fall back on the spectral representation, which always exists. For certain special cases, however, especially for some elementary problems in one-particle quantum mechanics, the expansion theorem is available; and then it is a most convenient tool to calculate the spectrum and the spectral representation.

REFERENCES

1. N. I. ACHIESER AND I. M. GLASMANN, Theorie der linearen Operatoren im Hilbert-Raum. Berlin: Akademie Verlag (1958).

2. F. RIESZ AND B. SZ.-NAGY, Functional Analysis. New York: F. Ungar Publ. Co. (1955).

3. J. M. JAUCH AND B. MISRA, "The Spectral Representation." Helv. Phys. Acta, 38, 30 (1965).

4. E. C. TITCHMARSH, Eigenfunction Expansions Associated with Second-order Differential Equations. Parts I and II, 2nd ed. Oxford (1962).

5. T. IKEBE, Arch. Rat. Mech. and Anal., 5, 1 (1960).

PART 2

Physical Foundations

CHAPTER 5

THE PROPOSITIONAL CALCULUS

The first process, therefore, in the effectual study of the sciences, must be one of simplification and reduction of the results of previous investigations to a form in which the mind can grasp them.

J. C. MAXWELL,

On Faraday's lines of force

The topic of this chapter is the propositional calculus of quantum mechanics, sometimes also called (misleadingly) the logic of quantum mechanics. This calculus expresses the kinematic structure of physical systems and is quite independent of any dynamical process law. Already we notice on this general level the profound difference between classical and quantum mechanics, which will be placed into historical and philosophical context in Section 5-1. The appropriate tools for expressing this calculus are the yes-no experiments, also called propositions, which are introduced in Section 5-2. In the following section, 5-3, we list the principal structural properties of these propositions. The corresponding properties are then formulated for classical systems, and it is shown, in Section 5-4, that for such systems the lattice becomes a Boolean algebra. The notion of Boolean sublattice introduced in Section 5-5 is useful for the precise definition of the fundamental concept of compatibility.

A brief sketch of modular lattices is presented in Section 5-6, and it is pointed out that modularity, although more general than distributivity, is still too restrictive for certain physical systems. In Section 5-7 we connect the development with the parallel but more special topic of Section 2-5 on the lattice of subspaces. In particular it is shown that the formally different definition of compatibility of Section 2-5 is identical with that given in this chapter. The correct axiom (P) which replaces the distributive law of classical systems is finally enunciated in the last section, 5-8, together with the atomicity axiom, which is justified principally by its convenience.

5-1. HISTORIC-PHILOSOPHIC PRELUDE

Quantum mechanics has introduced into the description of nature such controversial novelties that a re-examination of the epistemological foundation of physical science has become a widely felt need. Indeed, some of the founders of this theory, notably Einstein, Schrödinger, and de Broglie, could not agree with the "orthodox" interpretation of quantum mechanics proposed by Born and elaborated by the so-called Copenhagen school.

Yet when the advent of quantum mechanics is viewed in a larger historical context, it is seen as one more step in a process of disintegration of the mechanistic view which began in the second half of the nineteenth century, and to which Einstein himself made decisive contributions. Classical mechanics originated in the seventeenth century, and its point of view dominated physical science for more than two hundred years. Newton replaced the anthropomorphic and qualitative physics of the Middle Ages by a general and quantitative process law which seemed to be the ideal of a physical law to which all other physical phenomena might eventually be reduced. This conviction was the driving motive behind most of the research of eighteenth- and nineteenth-century physics. Huygens wrote in 1690: "In true philosophy the causes of all natural phenomena are conceived in mechanical terms. We must do this, in my opinion, or give up all hope of ever understanding anything in physics." Nearly two hundred years later Helmholtz still believes that "To understand a phenomenon means nothing else than to reduce it to Newtonian laws. Then the necessity of explanation has been satisfied in a palpable way."

And indeed, who could doubt that the mechanical model would remain the ideal of physical science for all time to come, after two centuries of the most spectacular success? The difficulties of this program began with the advent of Maxwell's theory of the electromagnetic field. Although Maxwell was guided to some extent by mechanical analogies, the mechanistic point of view could not be carried through consistently. Not only would it entail a mechanical ether which had to have absurd physical properties, but the mechanical model was in the end found to be definitely in contradiction with the optical experiments by Michelson and Morley. H. Hertz was most explicit in announcing the departure from the mechanistic ideal by writing: "Maxwell's theory is nothing else than Maxwell's equations." Physicists began to realize that they could do physics just as well without ever mentioning the ether. With Einstein's special theory of relativity, this process was completed, and the ether with its mechanical properties quietly faded from sight.

But the special theory of relativity had introduced another radical novelty by showing that henceforth it was no longer admissible to consider space-time properties always as attributes of individual objects. The question: "What is the length of this rod?" has a unique answer only if it is specified with respect to what system of reference one determines the length. The same is true for the duration of physical processes and the simultaneity of distant events. This meant one had to learn to think of space-time properties such as length and duration as relations rather than as attributes.

This process has gone much further with the advent of quantum mechanics. The discovery of the uncertainty relation by Heisenberg showed that the replacement of attributes by relations had to be extended even to the measurement of coordinates and momenta of a physical system if account were taken of the finite size of the quantum of action.

The question "What is the position?" or "What is the momentum?" of a particle becomes meaningful only in connection with a physical arrangement which, at least in principle, permits the determination of the quantity asked for. Bohr has shown, by a detailed analysis of many special cases, that physical arrangements which permit an objective determination of such quantities are in general incompatible, and he was led in this way to formulate the notion of complementarity. All this contributed to a complete disintegration of the mechanistic ideal.

The safe mechanical model of Huygens was replaced by the symbolism of mathematical formalism. This process was accompanied on the philosophical side by independent parallel developments.

In Austria Ernst Mach had published a series of historical-philosophical studies which tended to show that science is merely a description of relations between sensations. The radical antimetaphysical tendencies of Mach greatly stimulated a group of philosophers, known as the Vienna circle, who were prominent in the analysis of the new development in physics. A similar development in the form of a reaction to Kant's synthetic a priori was taking place elsewhere with such diverse representatives as Nietzsche, Poincaré and Duhem.

There was no lack of opposition to this tendency. Many viewed the disintegration process with alarm because the whole of science seemed to collapse like a house of cards. The French historian and philosopher A. Rey in 1907 gave expression to this feeling of failure of the scientific method with the words: "We can have a collection of empirical recipes; we can even systematize them for the convenience of memorizing them; but we have no cognition of the phenomena to which this system or these recipes are applied."

Max Planck, the founder of quantum theory, rejected the Machian conception of science as the most economical way of expressing relations between sensations. Planck on the contrary insisted on the "reality" corresponding to physical concepts.

The most violent opposition came from the ideologies. Both Nazism and Communism have declared open warfare on these "idealistic" tendencies in modern science, naturally for quite different reasons.

The new development seemed to becloud the concept of reality. What does it mean to affirm "reality" for a physical construct, if these constructs dissolve into a collection of mathematical symbols, which to boot are not even unique?

We shall not attempt to review all the different answers which have been given to this disturbing question. We merely mention the question here to sketch the philosophical climate in which the evolution of quantum mechanics occurred.

We believe that, as far as physics is concerned, the answer to such questions will be largely irrelevant. Physics has been remarkably insensitive to metaphysical concerns. But there is one aspect which concerns us and to which we should pay attention. This is the new role assigned to mathematics in modern physical theories.

There is no doubt that the importance of mathematics in theoretical physics has increased with the collapse of the mechanistic model for physics. Not only has mathematics become more sophisticated and more intricate, but in addition it has acquired a kind of independence.

For instance, the basic physical notion of general covariance obtained its perfect expression only in the absolute differential calculus of Ricci and Levi-Civita. Quantum mechanics can only be formulated correctly by adopting formal techniques from functional analysis, notably the theory of Hilbert space.

In the absence of a clear-cut mechanical model, certain mathematical structures have risen to the status of a model in their own right. The most striking examples of this kind are found in recent trends toward classifying the different states of the elementary particle systems with the help of certain Lie algebras.

In view of these tendencies, it is necessary to have a sober look at the role of mathematics in physical theory, so that we can resist the temptation of a new form of rationalism which overestimates the significance of mathematics.

The prime source of scientific knowledge about the physical world is the experience gained by systematic observation of physical systems. Purely mathematical knowledge, although very useful for the organization of empirical raw material into a body of interrelated facts, is useless as a source of knowledge about the physical world. The reason for this is that mathematical truth is analytical truth; this means that it contains nothing more than what is already contained in the premises or the axioms. It is always certain, but, just because of this certainty, essentially tautological. Empirical truth, on the other hand, is synthetic truth. The general physical laws are arrived at by induction from observed facts, and therefore are never certain but only verifiable for a finite number of instances.

These general laws are formulated as axioms in a mathematical language. Conclusions are then drawn from these axioms by rigorous mathematical methods which lead to the prediction of other observable facts. These facts may be verified or not by observations. If they are not verified, the theory has to be modified by modifying the axioms. For instance, the axiom of the invariance of the physical processes under space inversion predicted the spherical symmetry for the decay products of polarized radioactive atoms. The fact that this symmetry was not observed indicated a violation of parity conservation for weak interactions.

This somewhat schematic description of the role of mathematics and experience would have to be refined to be accurate. For instance, the facts which are observed are not just a haphazard collection of facts but are selected with a view to establishing general and simple regularities, that is, physical laws. Likewise, the mathematical axioms are to some extent arbitrary and must be selected by other criteria in addition to those mentioned above. A certain freedom is left to the creative theoretician which often can be a decisive element in the success of a theory.

The main problem in the selection of empirical material is the separation of relevant from irrelevant conditions. This is often made possible by isolating a sufficiently simple part from the rest of the physical world and studying the properties of the isolated part alone. We shall call such an isolated part a physical system. The simplest physical systems are those which consist of just one elementary particle, if we disregard its interactions with other particles.

The notion of the physical system as defined here becomes blurred in the general theory of relativity and in relativistic quantum mechanics. In the general theory of relativity one hopes to show that some mechanical properties such as inertia must depend on the state of the rest of the world (Mach's principle). In relativistic quantum mechanics one must consider transitions between states with a variable number of particles. One is thus led to consider physical systems which normally would be called different as merely different states of one and the same system. These limitations of the concept of "system" need not concern us here since we are considering only nonrelativistic quantum mechanics.

A physical law which is obtained by the method sketched above may be expressed in the following form: If a system S is subject to conditions A, B, ..., then the effects X, Y, ... can be observed. In this form it establishes a relation between the conditions and the effects.

The most general relation of this kind which can be formulated is a probability relation. There are two extreme cases possible: The effects are not correlated with the conditions, and they are distributed independently. (In common parlance this means there is no law which connects conditions and effects.) The other extreme case would be described thus: Every condition produces exactly one effect. (There is no scattering in the observed effects if the experiment is repeated.) This is still a probability connection but with dispersion zero. In all physical experiments one observes an intermediate situation. Measured quantities always fluctuate. It was an axiom of classical mechanics that, by increasing the accuracy of the measurements, one could, for any system, reduce the dispersion of the measured effects below any arbitrarily small amount. This axiom has been found false for certain elementary systems. This is an empirical fact, and is one of the pillars of quantum mechanics.

There is thus, as far as we know today, an irreducible statistical element in quantum mechanics quite in contrast with classical mechanics, where it is assumed that the statistical element, if it does appear, merely reflects our incomplete knowledge of the state of the system. This has often been considered an unsatisfactory feature of quantum mechanics. Seen historically, it is a further step in the "demechanization" of the physical laws.

There are those who long for a reestablishment of strict determinism in physics, be it for the reconciliation of some ideological faith, or for esthetic reasons. Quite soon after the discovery of quantum mechanics the question was raised as to whether there might exist "hidden parameters" not accessible to the usual observation and not affected by the gross manipulations in the conditioning experiments A, B, ... We shall discuss this question of the hidden variables in great detail in Chapter 7. Here we shall just state that such attempts have failed and are bound to fail unless they spring from a much more profound knowledge of the physical microcosmos than anyone has at the present time.

In this chapter we shall introduce the statistical character of quantal systems as a basic axiom without further questions. Enough has been said in the above sketch of scientific methodology to obviate any further justification for this procedure. Its empirical justification will appear in the subsequent development of the theory.

5-2. YES-NO EXPERIMENTS

In this section we shall explain a mode of description of physical systems in terms of so-called "yes-no experiments."

Let us consider a physical system. This may be any specified system, but for illustrative purposes we shall occasionally speak about a proton. If we wish to determine the physical characteristics of a proton, we must perform a series of experiments which then in their ensemble will be a full operational equivalent for the construct proton. There are two complications which make it difficult to formalize this program into a convenient mathematical structure. First, physical experiments with a proton can be of a great many different kinds, giving results which are expressed in a variety of different ways. Secondly, the results of such experiments may turn out to be quite different for one and the same system depending on the "state" of the system. For instance, if we determine the momentum of a proton, we may obtain a variety of values, depending on the exact manner of the preparation of the system before the measurement was performed. On the other hand, if we measure its rest mass by any method we always obtain the same value. There are thus properties which depend on the state and there are others which characterize the system and which therefore are independent of the state.

In order to overcome the first of these difficulties, we introduce a particular kind of experiment which we call yes-no experiments; these are observations which permit only one of two alternatives as an answer (hence the name "yes-no" experiment). Such experiments are part of the daily routine of every experimental physicist; for example, a counter which registers the presence of a particle within a certain region of space. What is perhaps less well known, however, is that every measurement on a physical system can be reduced, at least in principle, to the measurements with a certain number of yes-no experiments.

A typical example of this reduction of an experiment to yes-no experiments is the channel analyzer. The individual channels each define a yes-no experiment; and their ensemble together permits the measurement of an energy spectrum, for instance. Each measurable quantity has a certain range of values which may be indicated as a subset of the real line (or perhaps a Euclidean space). A determination of this quantity is obtained by dividing the real line into smaller intervals, and then deciding whether the measured value falls within any one of the intervals. By making the intervals sufficiently small, one can determine the value of the quantity to any desired accuracy.
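The reduction of a measurement to yes-no experiments can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `make_filter` and `channel_analyzer`, the half-open intervals, and the numerical values are hypothetical and do not come from the text.

```python
def make_filter(lo, hi):
    """Yes-no experiment answering: does the measured value lie in [lo, hi)?"""
    return lambda value: lo <= value < hi

def channel_analyzer(value, start, stop, n_channels):
    """Question n_channels yes-no experiments; return the interval that answers 'yes'."""
    width = (stop - start) / n_channels
    for i in range(n_channels):
        lo, hi = start + i * width, start + (i + 1) * width
        if make_filter(lo, hi)(value):
            return lo, hi
    return None  # every channel answered "no": the value lies outside the analyzer's range

# Hypothetical measured energy, located with finer and finer channels.
energy = 3.37
for n in (4, 16, 64):
    print(n, channel_analyzer(energy, 0.0, 8.0, n))
```

Each refinement of the channels narrows the interval in which the value is located, which is the sense in which the value can be determined to any desired accuracy.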

If we wish to determine the position of the proton we would place a number of counters at various positions in space and register their response. This assembly of yes-no experiments will suffice to locate the proton to within the extension of one counter.

We shall introduce a terminology which will be useful in the discussion of general systems. A yes-no experiment which serves to select the values of a measurable physical quantity will be called a filter for this quantity. The name is clearly borrowed from the analogous situations encountered in frequency filters. We shall also refer to yes-no experiments simply as propositions of a physical system.

The second of the above-mentioned difficulties will be overcome if we investigate the properties of the propositions of a physical system which are independent of the state of the system. Evidently the individual propositions will not do that, since they can be either true or false. Such properties must therefore be found among the relations between different propositions. Let us illustrate this again with an example.

Suppose we have two propositions, a and b, which can be measured by two counters. Counter A locates a proton in the volume $V_A$ and counter B locates the proton in the volume $V_B$. The responses of the counters A and B cannot be entirely independent of each other. This is obvious if for instance $V_A \subset V_B$. Then whenever the response to counter A is "yes," counter B must also respond with "yes" (assuming 100% efficiency of the counters).

Between certain pairs of propositions there exists thus a relation which we express as $a \subseteq b$; meaning: Whenever a is true, then b is true, too. What is important for us is that this relation is independent of the state of the system. It is thus the desired structural property which expresses a property of the system independent of its state.

5-3. THE PROPOSITIONAL CALCULUS

In this section we present in axiomatic form the structural properties of the propositions of a physical system which are independent of the state.

Let $\mathcal{L}$ be the set of all propositions (yes-no experiments) a, b, ... of a physical system. This set is partially ordered. We say a implies b and write $a \subseteq b$ if, whenever a is true, it follows that b is true, too. The relation $\subseteq$ satisfies the following conditions:

1) $a \subseteq a$ for all $a \in \mathcal{L}$;
2) $a \subseteq b$ and $b \subseteq a$ implies $a = b$; (I)
3) $a \subseteq b$ and $b \subseteq c$ implies $a \subseteq c$.

We note here that we have not yet defined what we mean by equality of two propositions. Rule (2) may therefore be regarded as a definition of equality. This definition is possible, since the simultaneous validity of the relations $a \subseteq b$ and $b \subseteq a$ is an equivalence relation which permits us to define the class of equivalent propositions. If we set a = b we have essentially replaced the propositions by classes of equivalent propositions. In the physical interpretation this means that we have defined a proposition as a class of physical yes-no experiments, all of which measure the same proposition.

The next axiom asserts the existence of the greatest lower bound for any subset of propositions.

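Let I be an index set and $a_i$ ($i \in I$) any subset of $\mathcal{L}$, $a_i \in \mathcal{L}$. Then there exists a proposition, denoted by $\bigcap_I a_i$, with the property

$x \subseteq a_i$ for all $i \in I$ $\Leftrightarrow$ $x \subseteq \bigcap_I a_i$.   (II)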

The physical interpretation of this axiom requires considerably more explanation than does (I). Let us first consider the case that the set $\{a_i\}$ consists of just two elements which we call a and b. The proposition with the property (II) will then be denoted by $a \cap b$.†

† Note that the notation $a \cap b$ is identical with that used for set intersection in Chapter 1. Here it has a different meaning since the propositions a, b, ... are not sets.

As a physical measurement $a \cap b$ denotes the measurement of "a and b." This is the proposition which is true if a is true and if b is true, and which is not true in all other cases. The existence of such a proposition is thus assured if it is possible to measure it, be it only in principle.

In a classical system there is no difficulty in measuring the proposition "a and b" if a and b can be separately measured. It suffices to measure both a and b and then we have also measured $a \cap b$. In fact $a \cap b$ is true if and only if both a and b are true.

This possibility in classical systems depends in an essential way on the property that measurements on such systems can be performed without modifying the state of the system. For quantal systems this is in general not possible, and then the measurement of the proposition $a \cap b$ may become obscure even if both a and b are separately measurable. The difficulty is this: If the measurement of $a \cap b$ is attempted by measuring first a, to be followed by a measurement of b, then we are not sure whether the first measurement of a may not have modified the state in such a way that the subsequent measurement of b may be affected. Such situations are easily exhibited in quantum mechanics.

It is possible to overcome this difficulty by the following procedure. Instead of making one pair of measurements a and b, we construct a filter for the proposition $a \cap b$ by using an infinite sequence of alternating pairs of filters for the propositions a and b respectively. This construction is symbolically represented in Fig. 5-1.

Fig. 5-1 Construction of the filter for the proposition $a \cap b$ in quantum mechanics.

The proposition $a \cap b$ is then true if the system passes this filter, and it is not true otherwise. Although infinite processes are, of course, not possible in actual physical measurements, the construction can be used as a base for an approximate determination of the proposition to any needed degree of accuracy.
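The convergence of such an alternating sequence can be made plausible with a small numerical sketch. It rests on an assumption that this section does not make: the filters a and b are represented by projection operators on subspaces, in the spirit of the lattice of subspaces of Section 2-5, and passing a filter is represented by applying its projection. The two planes in three-dimensional real space, and all names in the code, are hypothetical choices for illustration (Python, standard library only).

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical example in R^3: a is the (e1, e2)-plane,
# b is the plane spanned by e1 and e2 + e3.
P_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
P_b = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]
# The two planes intersect in the line along e1.
P_meet = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

def alternating_filter(n_pairs):
    """Operator for n_pairs successive passages through filter a, then filter b."""
    pair = matmul(P_b, P_a)
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    for _ in range(n_pairs):
        result = matmul(pair, result)
    return result

def distance(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

for n in (1, 5, 20):
    print(n, distance(alternating_filter(n), P_meet))
```

In this example the two projections do not commute, so a single pair of filters does not give the projection on the intersection; the distance to it is halved with every additional pair, which illustrates how a finite number of alternating filters approximates $a \cap b$ to any needed accuracy.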

It is just as easy to define the proposition $\bigcap_I a_i$ if I has a finite number of elements. With axiom II we require much more than that. We affirm the existence of the greatest lower bound even for an infinite family of propositions. There is no empirical correlate in the infinite conjunction implied in the definition of $\bigcap_I a_i$, since one can perform only a finite number of experiments. If we extend this definition to infinite sets, we transcend the proximately observable facts and we introduce ideal elements into the description of physical systems.

This is a procedure which is quite common in many other axiomatizations of pure or applied mathematics. For instance, in the calculus of probability it would correspond to the infinite additivity of the probability measure. It is justified by the resulting simplicity of the mathematical object thereby defined.

The axiom II implies the existence of a proposition 0 with the property $0 \subseteq a$ for all $a \in \mathcal{L}$. It is defined by

$0 = \bigcap_{a \in \mathcal{L}} a,$

and it is called the absurd proposition, because it is always false. From this property its idealized nature is obvious.

The next axiom affirms the existence of the orthocomplementation.

To every $a \in \mathcal{L}$ there exists another proposition $a' \in \mathcal{L}$ such that

1) $(a')' = a$;

2) $a' \cap a = 0$; (III)

3) $a \subseteq b$ implies $b' \subseteq a'$.

The physical interpretation of this axiom is quite easy, because the proposition a' is measured with exactly the same apparatus as a. The only difference is that the results are inverted: If a is true, then a' is false, and vice versa. We introduce here a strong negation denoted by "false" and distinguished from simply "not true." The property (1) is then obvious, while (2) expresses the law of the excluded middle. Property (3) is called de Morgan's law in formal logic. Here we postulate it. The proposition a' is thus the (strong) negation of the proposition a. We could call it the proposition "not a."

From the three groups of axioms we immediately deduce the existence of the least upper bound. It is defined by

$\bigcup_I a_i = \left( \bigcap_I a_i' \right)'$   (5-1)

and we verify easily that it satisfies

$a_i \subseteq x$ for all $i \in I$ $\Leftrightarrow$ $\bigcup_I a_i \subseteq x$, for all $x \in \mathcal{L}$.   (5-2)

(See Problem 1.)

The proposition $I \equiv \bigcup_{a \in \mathcal{L}} a$ is called the trivial proposition since it is always true (Problem 2). One can then prove that $x \cup x' = I$ for all $x \in \mathcal{L}$. (Problem 3.)

The proposition $a \cup b$ has the logical significance "a or b" and it is true if a or b, or both, are true.
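The definition (5-1) and the property (5-2) can be tried out on a small example. The sketch below assumes, purely for illustration, the simplest classical model: propositions are the subsets of a hypothetical three-point set, implication is set inclusion, and the orthocomplement is the set complement. None of these names or choices comes from the text (Python, standard library only).

```python
from itertools import combinations

UNIVERSE = frozenset({1, 2, 3})  # hypothetical three-point "phase space"
LATTICE = [frozenset(c) for r in range(4) for c in combinations(sorted(UNIVERSE), r)]

def implies(a, b):
    """Partial order: a implies b."""
    return a <= b

def complement(a):
    """Orthocomplement a' (here: the set complement)."""
    return UNIVERSE - a

def meet(family):
    """Greatest lower bound of a family of propositions (axiom II)."""
    result = UNIVERSE
    for a in family:
        result = result & a
    return result

def join(family):
    """Least upper bound, defined through Eq. (5-1) rather than as a set union."""
    return complement(meet([complement(a) for a in family]))

ABSURD = meet(LATTICE)   # the proposition 0
TRIVIAL = join(LATTICE)  # the proposition I

def satisfies_5_2(family):
    """Check Eq. (5-2): all members imply x exactly when the join implies x."""
    return all(all(implies(a, x) for a in family) == implies(join(family), x)
               for x in LATTICE)

print(all(satisfies_5_2([a, b]) for a in LATTICE for b in LATTICE))
print(all(join([x, complement(x)]) == TRIVIAL for x in LATTICE))
```

Because the join is computed only from the meet and the complement, the checks exercise (5-1) itself; in this classical model the result coincides with the ordinary set union.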

Table 5-1

THE PROPOSITIONAL CALCULUS OF A PHYSICAL SYSTEM

Lattice operation      Interpretation
$a \subseteq b$        a implies b
$a'$                   not a
$a \cap b$             a and b
$a \cup b$             a or b

A set $\mathcal{L}$ of elements which satisfy the axioms I, II, and III is called a complete, orthocomplemented lattice. We have thus postulated the following:

The propositions of a physical system are a complete, orthocomplemented lattice.

The structure of this lattice will be independent of the state of the physical system; the lattice describes the intrinsic structure of the system.

In Table 5-1 we summarize the fundamental relations in a lattice, together with their interpretation.

The propositional calculus of a physical system has a certain similarity to the corresponding calculus of ordinary logic. In the case of quantum mechanics, one often refers to this analogy and speaks of quantum logic in contradistinction to ordinary logic. This has unfortunately caused such confusion that we shall add a few words of explanation here to avoid any misunderstanding.

The calculus introduced here has an entirely different meaning from the analogous calculus used in formal logic. Our calculus is the formalization of a set of empirical relations which are obtained by making measurements on a physical system. It expresses an objectively given property of the physical world. It is thus the formalization of empirical facts, inductively arrived at and subject to the uncertainty of any such fact. The calculus of formal logic, on the other hand, is obtained by making an analysis of the meaning of propositions. It is true under all circumstances and even tautologically so. Thus, ordinary logic is used even in quantum mechanics of systems with a propositional calculus vastly different from that of formal logic. The two need have nothing in common. It turns out, however, that, if viewed as abstract structures, they have a great deal in common without being identical. Their most important difference is that the calculus of formal logic is Boolean, while that of physical propositions is Boolean only for classical systems (cf. Section 5-4 and references 14 through 20). The distinction between a formal logic and an empirical proposition system is exhibited most clearly in reference 16.
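That a complete orthocomplemented lattice need not be Boolean can be seen on a small finite example. The six-element lattice below is a hypothetical illustration, not taken from the text: one may picture a, a' and b, b' as two pairs of perpendicular lines through the origin of a plane, with 0 the origin and I the whole plane. The sketch (Python, standard library only) checks axioms I and III by brute force and then tests the distributive law.

```python
ELEMENTS = ["0", "a", "a'", "b", "b'", "I"]
COMPLEMENT = {"0": "I", "I": "0", "a": "a'", "a'": "a", "b": "b'", "b'": "b"}

def implies(x, y):
    """Partial order: 0 implies everything, everything implies I, and x implies x."""
    return x == y or x == "0" or y == "I"

def meet(x, y):
    """Greatest lower bound, found by brute force from the order relation."""
    lower = [z for z in ELEMENTS if implies(z, x) and implies(z, y)]
    return next(z for z in lower if all(implies(w, z) for w in lower))

def join(x, y):
    """Least upper bound via Eq. (5-1): the complement of the meet of the complements."""
    return COMPLEMENT[meet(COMPLEMENT[x], COMPLEMENT[y])]

E = ELEMENTS
axiom_I = (all(implies(x, x) for x in E)
           and all(x == y for x in E for y in E if implies(x, y) and implies(y, x))
           and all(implies(x, z) for x in E for y in E for z in E
                   if implies(x, y) and implies(y, z)))
axiom_III = (all(COMPLEMENT[COMPLEMENT[x]] == x for x in E)
             and all(meet(COMPLEMENT[x], x) == "0" for x in E)
             and all(implies(COMPLEMENT[y], COMPLEMENT[x])
                     for x in E for y in E if implies(x, y)))

# Distributive law: a n (b u b') compared with (a n b) u (a n b').
lhs = meet("a", join("b", "b'"))
rhs = join(meet("a", "b"), meet("a", "b'"))
print(axiom_I, axiom_III, lhs, rhs)
```

Here the left-hand side evaluates to a and the right-hand side to 0, so the distributive law fails although the axioms hold; a lattice of this kind is orthocomplemented without being Boolean.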

We shall occasionally use a graphical representation of lattices which is very convenient for analyzing simple finite examples of lattices. In this representation, propositions are denoted by points and the implication relation by a more or less vertical line. The simplest possible complete orthocomplemented lattice has the form given in Fig. 5-2.

[Figure 5-2]

PROBLEMS

1. If $\emptyset = \bigcap_{a \in \mathcal{L}} a$, then

$$\emptyset \subseteq x \quad\text{for all } x \in \mathcal{L}.$$

2. The proposition

$$I = \bigcup_{a \in \mathcal{L}} a$$

has the property $I = \emptyset'$; consequently it is always true (trivial proposition).

3. $x \cup x' = I$ for all $x \in \mathcal{L}$.

4. If the two relations $x \subseteq a$ and $x \subseteq b$ imply one another, then $a = b$.

5. $(a \cap b)' = a' \cup b'$; $(a \cup b)' = a' \cap b'$.

5-4. CLASSICAL SYSTEMS AND BOOLEAN LATTICES

The propositional calculus so far developed is so general that it contains as yet no specific properties of physical systems. In fact, it does not even distinguish between classical and quantal systems. We shall now examine the additional properties of a lattice which will characterize the classical systems.

In classical mechanics, the state of a system is represented as a point in phase space, defined by the values $q_1, q_2, \ldots, p_1, p_2, \ldots$ of the coordinates and canonical momenta. We shall simply designate such a point by $(q, p)$. Any measurable physical quantity will be represented by a real-valued Borel function $F(q, p)$. The elementary propositions associated with such a measurable quantity are then the propositions

$$F(q, p) \in A,$$

where $A$ is some Borel set of the real line. To every such proposition, one can associate in a unique manner a Borel set of the phase space

$$F^{-1}(A) = \{(q, p) : F(q, p) \in A\}.$$

Conversely, to any Borel set $S$ in the phase space, one can associate a proposition, for instance, $\chi_S(q, p) = 1$, where $\chi_S(q, p)$ is the characteristic function of the set $S$. Thus, every Borel set $S$ in phase space represents a proposition of the system. The proposition is true if $(q, p) \in S$. We see from this remark that in this case the greatest lower bound $a \cap b$ of two propositions corresponds to set intersection. Thus if $S_x$ is the set which represents the proposition $x$, we have

$$S_{a \cap b} = S_a \cap S_b \quad\text{and}\quad S_{a'} = S_a' \ \text{(complementary set)}.$$

The proposition $\emptyset$ is then represented by the null set and the proposition $I$ by the entire phase space.

The postulated axioms are now elementary consequences from the theory of sets, except that II is valid only for countable families of propositions. In this case we replace II by $\mathrm{II}_\sigma$.

Fig. 5-3 Illustration of the distributive law: $a \cap (b \cup c) = (a \cap b) \cup (a \cap c)$.

The lattice of a class of subsets satisfies, however, additional properties which are independent of the axioms I, II, and III, which we shall now enumerate: One verifies easily (cf. Fig. 5-3) that if $a, b, c \in \mathcal{L}$, then

$$a \cap (b \cup c) = (a \cap b) \cup (a \cap c), \qquad a \cup (b \cap c) = (a \cup b) \cap (a \cup c). \tag{D}$$

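This is called the distributive law.

For our example we can verify, from our definition of propositions, the existence of minimal propositions (or "points") $P$, with the following property:

$$\emptyset \subset P, \quad\text{and}\quad x \subseteq P \ \text{implies}\ x = \emptyset \ \text{or}\ x = P. \tag{$A_0$}$$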

We call such propositions atomic. They are represented by sets consisting of exactly one point, and every proposition $a \neq \emptyset$ contains at least one such proposition.
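The classical construction can be tried out on a toy model. The following is a minimal sketch, assuming a hypothetical finite "phase space" of four points in place of the Borel sets of a real phase space; propositions are its subsets, and the distributive law (D) and the atoms are checked by brute force. The names `PHASE_SPACE`, `all_propositions`, `is_distributive`, and `atoms` are introduced here for illustration only.

```python
from itertools import combinations

# Hypothetical toy "phase space" with four points (an assumption for illustration).
PHASE_SPACE = frozenset({0, 1, 2, 3})

def all_propositions(space):
    """Every subset of the space, i.e. every proposition of the toy system."""
    points = sorted(space)
    return [frozenset(c) for r in range(len(points) + 1)
            for c in combinations(points, r)]

def meet(a, b):
    """Greatest lower bound: set intersection."""
    return a & b

def join(a, b):
    """Least upper bound: set union."""
    return a | b

def complement(a, space=PHASE_SPACE):
    """Orthocomplement: the complementary set."""
    return space - a

def is_distributive(props):
    """Check both forms of the distributive law (D) on all triples."""
    return all(
        meet(a, join(b, c)) == join(meet(a, b), meet(a, c))
        and join(a, meet(b, c)) == meet(join(a, b), join(a, c))
        for a in props for b in props for c in props
    )

def atoms(props):
    """Nonempty propositions that contain no smaller nonempty proposition."""
    return [p for p in props if p and not any(x and x < p for x in props)]

props = all_propositions(PHASE_SPACE)
print(len(props))                               # 16
print(is_distributive(props))                   # True
print(sorted(sorted(p) for p in atoms(props)))  # [[0], [1], [2], [3]]
```

In this toy model the atoms are exactly the one-point sets, and every nonempty proposition contains at least one of them.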

The existence and meaning of atomic propositions in physical systems have been questioned by Birkhoff and von Neumann. In reference 3 they argue that it is operationally meaningless to say that the angular velocity of the moon around the earth (in radians per second) is a rational number. They assume that propositions are not Borel sets but classes of equivalent sets with respect to the Lebesgue measure in phase space, where two sets $A_1$ and $A_2$ are considered equivalent if they differ only by a set of Lebesgue measure zero. It is clear that the lattice of such classes of sets is nonatomic and it certainly contains fewer idealized elements than the class of all Borel sets.

However, in classical mechanics, this distinction is not important, since even the classes of equivalent sets contain many elements which are probably just as difficult to associate with concrete propositions as some of the Borel sets. Furthermore, the lattice of Birkhoff and von Neumann can be embedded in an obvious way in the lattice of Borel sets, and so one always has the possibility of accordingly enlarging the lattice to an atomic lattice. Finally, only the atomic lattice represents really classical mechanics as it always has been understood, where the precise values $(q, p)$ of the coordinates and momenta are used in the usual treatment of classical systems.

A lattice which satisfies I, $\mathrm{II}_\sigma$, III, D, and $A_0$ is called an atomic, $\sigma$-complete Boolean lattice. We are thus led to postulate:

The propositions of a classical physical system are a $\sigma$-complete atomic Boolean lattice.

5-5. COMPATIBLE AND INCOMPATIBLE PROPOSITIONS

It is now time to identify the property which allows the distinction of classical from quantal systems. To this end we shall introduce a new concept, that of compatibility of propositions.

In classical systems all propositions are compatible. In the usual interpretation of this notion, it means that a simultaneous measurement of several propositions can be made without affecting the state of the system. This means specifically that the result of the measurement will be independent of the order in which the measurements are performed. This is naturally true only in an approximate sense; any measurement involves an interaction with the system. But in classical mechanics it is asserted that the effect of the disturbance of the state by the measurement can be reduced to a negligible amount, so that it may be omitted in a discussion of idealized experiments.

Since we have not yet defined and discussed the notion of "state" of a system, we shall, for the time being, define the notion of compatibility as equivalent to being "classical." In order to make this precise, we need a few definitions.

Let $\mathcal{L}$ be a lattice (orthocomplemented and complete; we shall not repeat these adjectives). A subset $\mathcal{L}_0 \subseteq \mathcal{L}$ is called a sublattice if it is again a lattice with respect to the operations which make $\mathcal{L}$ a lattice. Thus, in the lattice of Fig. 5-4, the subset $\{\emptyset, I\}$ is a sublattice but not the subset $\{\emptyset, a\}$, since $a'$ and $I$ are not in that subset.

One proves easily: If $\mathcal{L}_i$ ($i \in I$) is a family of sublattices of a lattice $\mathcal{L}$, then the intersection $\bigcap_i \mathcal{L}_i$ is also a sublattice (Problem 1).

Now let $S \subseteq \mathcal{L}$ be any subset of $\mathcal{L}$ and denote by $\mathcal{L}_i$ the family of all the sublattices which contain $S$ ($S \subseteq \mathcal{L}_i$). Then the lattice $\mathcal{L}(S) = \bigcap_i \mathcal{L}_i$ is the smallest lattice which contains $S$. We shall call it the lattice generated by $S$. If $\mathcal{L}(S) = \mathcal{L}$, then the set $S$ is called a total set.

A sublattice $\mathcal{L}_B \subseteq \mathcal{L}$ is called a Boolean sublattice if it satisfies the distributive law (D).

We now define a subset $S \subseteq \mathcal{L}$ as a compatible set of propositions if the lattice generated by $S$ is a Boolean sublattice of $\mathcal{L}$.

Since a sublattice of any Boolean sublattice is again Boolean (Problem 2), we see that every subset of propositions in a classical system is compatible. This is just what we wanted to require of our definition of compatibility. Figure 5-5 illustrates this definition for a non-Boolean lattice.

Fig. 5-4 Example of a lattice which contains a sublattice, for instance, $(\emptyset, I)$.

Fig. 5-5 A Boolean sublattice consisting of the elements $(\emptyset, I, a, a')$ in a non-Boolean lattice. It is generated by the subset $S = \{a\}$.

It is easy to prove that the two propositions $a$ and $a'$ are always compatible (Problem 3). This is physically very reasonable, since we have previously shown that these two propositions are measured with the same physical apparatus, and therefore they cannot disturb one another.

If two propositions $a$ and $b$ are compatible, that is, if the set $S = \{a, b\}$ generates a Boolean sublattice, we shall write $a \leftrightarrow b$. The notation indicates a symmetrical relationship. It is important to realize that the relationship of compatibility is not transitive. Thus if $a \leftrightarrow b$ and $b \leftrightarrow c$, we cannot in general assert $a \leftrightarrow c$.
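The definition of compatibility can be checked mechanically on a small example. The sketch below assumes a hypothetical six-element lattice (not necessarily the one drawn in the figures) in which $\emptyset$ lies below, and $I$ above, four mutually incomparable elements $a$, $a'$, $b$, $b'$. It closes a subset under union, intersection, and orthocomplement and then tests the distributive law on the result. All function names are illustrative.

```python
from itertools import product

# Toy non-Boolean lattice (an assumption for illustration): "0" below and "I"
# above four mutually incomparable elements a, a', b, b'.
ELEMENTS = ["0", "a", "a'", "b", "b'", "I"]
COMPLEMENT = {"0": "I", "I": "0", "a": "a'", "a'": "a", "b": "b'", "b'": "b"}

def leq(x, y):
    """Implication relation x <= y in the toy lattice."""
    return x == y or x == "0" or y == "I"

def join(x, y):
    """Least upper bound: the upper bound that lies below all other upper bounds."""
    uppers = [u for u in ELEMENTS if leq(x, u) and leq(y, u)]
    return next(u for u in uppers if all(leq(u, v) for v in uppers))

def meet(x, y):
    """Greatest lower bound: the lower bound that lies above all other lower bounds."""
    lowers = [l for l in ELEMENTS if leq(l, x) and leq(l, y)]
    return next(l for l in lowers if all(leq(m, l) for m in lowers))

def generated_sublattice(subset):
    """Close a subset under meet, join, and orthocomplement (0 and I included)."""
    closed = set(subset) | {"0", "I"}
    while True:
        new = {COMPLEMENT[x] for x in closed}
        new |= {op(x, y) for x, y in product(closed, repeat=2) for op in (meet, join)}
        if new <= closed:
            return closed
        closed |= new

def is_boolean(sub):
    """Distributive law on all triples (one form implies the other in a lattice)."""
    return all(meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
               for x, y, z in product(sub, repeat=3))

def compatible(*props):
    """A set of propositions is compatible if it generates a Boolean sublattice."""
    return is_boolean(generated_sublattice(props))

print(compatible("a", "a'"))                      # True
print(compatible("a", "b"))                       # False
print(compatible("a", "I"), compatible("I", "b"))  # True True
```

The last two lines show the failure of transitivity in this toy lattice: $a \leftrightarrow I$ and $I \leftrightarrow b$ hold, but $a \leftrightarrow b$ does not.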

The two ideal propositions $\emptyset$ and $I$ have the property that they are compatible with any proposition (Problem 4). The set of propositions which are compatible with all propositions of $\mathcal{L}$ is called the center of $\mathcal{L}$. It is a Boolean sublattice of $\mathcal{L}$ (Problem 5). In a Boolean lattice the center is identical with $\mathcal{L}$. If the center consists only of the elements $\emptyset$ and $I$, it is called trivial, and a lattice with trivial center is called irreducible. In all other cases the lattice is reducible.

The notion of irreducibility introduced here permits another interpretation which justifies the use of this term. In order to show this we introduce the notion of the direct union of two lattices. Let $\mathcal{L}_1$ and $\mathcal{L}_2$ be two lattices. We can define a new lattice $\mathcal{L}$ called the direct union of $\mathcal{L}_1$ and $\mathcal{L}_2$ by considering the elements of $\mathcal{L}$ as pairs of elements $\{x_1, x_2\}$ with $x_1 \in \mathcal{L}_1$ and $x_2 \in \mathcal{L}_2$.

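The order relation in $\mathcal{L}$ is then defined by

$$\{x_1, x_2\} \subseteq \{y_1, y_2\} \quad\text{if}\quad x_1 \subseteq y_1 \ \text{and}\ x_2 \subseteq y_2.$$

With this order relation, it is possible to define unions and intersections in $\mathcal{L}$ by setting

$$\{x_1, x_2\} \cap \{y_1, y_2\} = \{x_1 \cap y_1,\, x_2 \cap y_2\}$$

and

$$\{x_1, x_2\} \cup \{y_1, y_2\} = \{x_1 \cup y_1,\, x_2 \cup y_2\}.$$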

The lattice $\mathcal{L}$ obtained in this way is called the direct union. It contains the two lattices $\mathcal{L}_1$ and $\mathcal{L}_2$ as sublattices, since the set of elements $\{x_1, \emptyset_2\}$ is isomorphic with (has the same lattice structure as) $\mathcal{L}_1$ and the set of elements $\{\emptyset_1, x_2\}$ is isomorphic with $\mathcal{L}_2$.

This construction is easily generalized to any family of lattices: Let $\mathcal{L}_i$ ($i \in I$) be a finite or infinite family of lattices. The direct union of the $\mathcal{L}_i$ is the lattice $\mathcal{L}$ consisting of collections of elements $\{x_i\}$ with $x_i \in \mathcal{L}_i$. A partial order in $\mathcal{L}$ is defined by

$$\{x_i\} \subseteq \{y_i\} \quad\text{if}\quad x_i \subseteq y_i \ \text{for all } i \in I.$$

If unions and intersections in $\mathcal{L}$ are defined by

$$\{x_i\} \cup \{y_i\} = \{x_i \cup y_i\}, \qquad \{x_i\} \cap \{y_i\} = \{x_i \cap y_i\},$$

then $\mathcal{L}$ becomes a lattice.

In the lattice of the first example, the elements $\{I_1, \emptyset_2\}$ and $\{\emptyset_1, I_2\}$ are in the center of $\mathcal{L}$ and they are different from the elements $\emptyset = \{\emptyset_1, \emptyset_2\}$ and $I = \{I_1, I_2\}$. Thus the center of $\mathcal{L}$ is nontrivial. For certain lattices (which we shall specify presently), one can prove a converse: If the center of a lattice $\mathcal{L}$ is nontrivial, then there exist at least two lattices such that $\mathcal{L}$ is the direct union of the two. It is possible to expand these remarks to a complete reduction theory analogous to the reduction theory of Hilbert spaces of group representations. We shall, however, not pursue these questions further until we have more fully specified the structure of the lattice of propositions.
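The componentwise definitions of the direct union can be illustrated with a small sketch. It assumes two hypothetical component lattices, the subsets of $\{1, 2\}$ and the subsets of $\{x\}$, and takes the orthocomplement componentwise as well (an assumption made here for the illustration). The check at the end expresses that the pair $\{I_1, \emptyset_2\}$, $\{\emptyset_1, I_2\}$ splits every element of the direct union into its two components.

```python
from itertools import combinations, product

def powerset(points):
    """All subsets of a finite set, used here as a toy component lattice."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

# Two hypothetical component lattices (assumptions for illustration).
SPACE_1, SPACE_2 = frozenset({1, 2}), frozenset({"x"})
L1, L2 = powerset(SPACE_1), powerset(SPACE_2)

# Elements of the direct union are pairs (x1, x2) with x1 in L1 and x2 in L2.
DIRECT_UNION = list(product(L1, L2))

def leq(x, y):
    """Componentwise order relation."""
    return x[0] <= y[0] and x[1] <= y[1]

def meet(x, y):
    """Componentwise intersection."""
    return (x[0] & y[0], x[1] & y[1])

def join(x, y):
    """Componentwise union."""
    return (x[0] | y[0], x[1] | y[1])

def complement(x):
    """Componentwise orthocomplement (assumed for this sketch)."""
    return (SPACE_1 - x[0], SPACE_2 - x[1])

ZERO = (frozenset(), frozenset())   # {0_1, 0_2}
ONE = (SPACE_1, SPACE_2)            # {I_1, I_2}
E1 = (SPACE_1, frozenset())         # {I_1, 0_2}
E2 = (frozenset(), SPACE_2)         # {0_1, I_2}

# E1 and E2 differ from 0 and I, are complements of one another,
# and decompose every element x as (x n E1) u (x n E2).
splits_everything = all(x == join(meet(x, E1), meet(x, E2)) for x in DIRECT_UNION)
print(len(DIRECT_UNION), complement(E1) == E2, splits_everything)  # 8 True True
```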

PROBLEMS

1. If $\mathcal{L}_i$ ($i \in I$) is a family of sublattices of a lattice, then $\bigcap_i \mathcal{L}_i$ is also a sublattice.

2. A sublattice of a Boolean lattice is again Boolean.

3. The two propositions $a$ and $a'$ are always compatible.

4. The two elements $\emptyset$ and $I$ in a lattice $\mathcal{L}$ are compatible with any element $a \in \mathcal{L}$.

5. The center $\mathcal{C}$ of a lattice $\mathcal{L}$ is a Boolean sublattice of $\mathcal{L}$.

5-6. MODULARITY

The problem before us is the following: We know that the lattice of propositions cannot be Boolean if it should be more general than the lattice of a classical system; but we do not yet have any restriction which permits us to set a limit to the generalizations. What is this restriction?

Such a restriction should be motivated by physical considerations. Before we formulate it we shall briefly discuss in this section a proposal for such a restrictive axiom which is, however, physically untenable [3, 4]. This is the so-called modular law.

In any lattice one always has (Problem 1)

$$x \cup (y \cap z) \subseteq (x \cup y) \cap z \quad\text{for } x \subseteq z.$$

If the lattice is such that for all $x \subseteq z$ we actually have equality in this relation, that is,

$$x \cup (y \cap z) = (x \cup y) \cap z \quad\text{for all } x \subseteq z, \tag{M}$$

then we say that the lattice is a modular lattice. It is clear that a Boolean lattice is always modular. It is also easy to give examples of orthocomplemented lattices which are not modular. Two such examples are given in Fig. 5-6.

Fig. 5-6 Two examples of orthocomplemented nonmodular lattices.
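A failure of the modular law can be exhibited by direct enumeration. The sketch below assumes a hexagon-shaped orthocomplemented lattice with the two chains $\emptyset \subset a \subset b \subset I$ and $\emptyset \subset b' \subset a' \subset I$; it is chosen for illustration and is not claimed to coincide with either lattice of Fig. 5-6. The code lists every triple with $x \subseteq z$ for which (M) fails and confirms that the weaker inclusion always holds.

```python
from itertools import product

ELEMENTS = ["0", "a", "b", "a'", "b'", "I"]
# Assumed order: two chains sharing only the bottom "0" and the top "I".
CHAINS = [["0", "a", "b", "I"], ["0", "b'", "a'", "I"]]

def leq(x, y):
    """x <= y when both lie on a common chain and x does not come after y."""
    return any(x in c and y in c and c.index(x) <= c.index(y) for c in CHAINS)

def join(x, y):
    """Least upper bound, found by searching the finite set of upper bounds."""
    uppers = [u for u in ELEMENTS if leq(x, u) and leq(y, u)]
    return next(u for u in uppers if all(leq(u, v) for v in uppers))

def meet(x, y):
    """Greatest lower bound, found by searching the finite set of lower bounds."""
    lowers = [l for l in ELEMENTS if leq(l, x) and leq(l, y)]
    return next(l for l in lowers if all(leq(m, l) for m in lowers))

def modular_violations():
    """Triples (x, y, z) with x <= z where x u (y n z) differs from (x u y) n z."""
    return [(x, y, z) for x, y, z in product(ELEMENTS, repeat=3)
            if leq(x, z) and join(x, meet(y, z)) != meet(join(x, y), z)]

# The inclusion x u (y n z) <= (x u y) n z for x <= z holds in any lattice.
inequality_holds = all(leq(join(x, meet(y, z)), meet(join(x, y), z))
                       for x, y, z in product(ELEMENTS, repeat=3) if leq(x, z))

violations = modular_violations()
print(inequality_holds)                 # True
print(("a", "b'", "b") in violations)   # True
```

For the triple $x = a$, $y = b'$, $z = b$ one finds $x \cup (y \cap z) = a$ but $(x \cup y) \cap z = b$, so equality in (M) fails.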

The modularity postulate has much in its favor. For one thing, it is the best known of the possible generalizations of Boolean lattices. This, of course, would not suffice for a justification of its use as a physical axiom.

Birkhoff and von Neumann [3, 4] have tried to justify modularity by pointing out that on finite modular lattices one can define a dimension function $v(a)$ with the properties:

$$v(a) = 0 \ \text{if and only if}\ a = \emptyset,$$

$$v(a) = 1 \ \text{if and only if}\ a = I,$$

and

$$v(a) + v(b) = v(a \cup b) + v(a \cap b).$$

Such a function has the characteristic properties of a probability measure, and $v(a)$ would represent the a priori probability for finding the system with property $a$ when nothing is specified as to its preparation. It is known that there are systems for which such a finite a priori probability does not exist. It would correspond classically to a system with a phase space of finite invariant measure. A particle moving in empty space does not admit a finite dimension function either in classical or in quantum mechanics. Thus this argument for the modularity of the proposition system is not convincing.
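For a finite classical system the dimension function is easy to write down. As a minimal sketch, assume the Boolean (hence modular) lattice of subsets of a hypothetical four-point set and take $v$ to be the normalized counting measure; the three defining properties can then be verified exhaustively. The names `SPACE` and `v` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite "phase space" (an assumption for illustration).
SPACE = frozenset(range(4))

def subsets(space):
    """All propositions of the toy system, i.e. all subsets of the space."""
    pts = sorted(space)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def v(a):
    """Normalized counting measure, playing the role of the dimension function."""
    return Fraction(len(a), len(SPACE))

props = subsets(SPACE)

# v(a) = 0 only for the null set, v(a) = 1 only for the whole space.
zero_only_for_empty = all((v(a) == 0) == (a == frozenset()) for a in props)
one_only_for_full = all((v(a) == 1) == (a == SPACE) for a in props)

# v(a) + v(b) = v(a u b) + v(a n b) for every pair of propositions.
additive = all(v(a) + v(b) == v(a | b) + v(a & b) for a in props for b in props)

print(zero_only_for_empty, one_only_for_full, additive)  # True True True
```

The normalization by `len(SPACE)` is exactly what becomes impossible when the phase space has infinite measure, which is the objection raised in the text.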

An argument against modularity is obtained from our analysis of the notion of localizability of a particle. This notion will be examined in Chapter 12, where it will be shown that, in the context of conventional quantum mechanics in Hilbert space, it is incompatible with modularity.

PROBLEMS

1. In any lattice one has

$$x \cup (y \cap z) \subseteq (x \cup y) \cap z \quad\text{for } x \subseteq z.$$

2. The lattice given by the diagram below is not modular.

5-7. THE LATTICE OF SUBSPACES

It is high time we connected our lattice of propositions with the lattice of subspaces discussed in Section 2-5. We have already remarked that the lattice of subspaces of a Hilbert space is a complete orthocomplemented lattice. In that section we also introduced the notion of compatibility, but in a different way; and we must now show that it is identical with the definition used in this chapter. Finally, we shall answer the question whether the lattice of all subspaces is modular or not.

Let us begin with the notion of compatibility. Compatibility as defined in Section 2-5 is expressed very conveniently with the projections $E$ and $F$, with ranges $M$ and $N$, respectively. Indeed if $M \leftrightarrow N$, then $EF = FE$; that is, the projections commute (Problem 1).

Now it is not difficult to verify that the sublattice generated by two commuting projections is a Boolean lattice (Problem 2).

Conversely, if $E$ and $F$ are two projections generating a Boolean lattice, then one can show that they must be compatible in the sense of Section 2-5 (Problem 3). Thus the two definitions are equivalent.
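The connection between compatibility and commuting projections can be made concrete with small matrices. The sketch below assumes a three-dimensional real space and three hand-picked projections; for commuting projections it uses the standard formulas $E \cap F = EF$ and $E \cup F = E + F - EF$, which are valid only in the commuting case. The helper names are illustrative, and plain nested lists are used so that the example needs no external library.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def commute(E, F):
    return matmul(E, F) == matmul(F, E)

def meet(E, F):
    """Projection on the intersection; valid only when E and F commute."""
    return matmul(E, F)

def join(E, F):
    """Projection on the closed span; valid only when E and F commute."""
    return sub(add(E, F), matmul(E, F))

def complement(E):
    """Projection on the orthogonal complement."""
    return sub(IDENTITY, E)

# Assumed examples: E projects on span(e1, e2), F on span(e2, e3),
# and G on the line through (1, 0, 1).
E = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
F = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
G = [[0.5, 0.0, 0.5], [0.0, 0.0, 0.0], [0.5, 0.0, 0.5]]

print(commute(E, F))                                      # True
print(join(meet(E, F), meet(E, complement(F))) == E)      # True
print(commute(E, G))                                      # False
```

For the commuting pair the relation $(E \cap F) \cup (E \cap F') = E$ holds, while $E$ and $G$ do not commute and the formulas for `meet` and `join` no longer apply to them.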

Let us next examine the property of modularity. Here we must distinguish the finite- from the infinite-dimensional case. If the Hilbert space is finite-dimensional, then we can easily show that the lattice of subspaces is indeed modular.

To see this, let $M$, $N$, and $K$ be three subspaces such that $M \subseteq K$. We have already established that, in any lattice,

$$M \cup (N \cap K) \subseteq (M \cup N) \cap K \quad\text{if } M \subseteq K$$

(cf. Problem 1, Section 5-6). Thus, in order to demonstrate modularity, we need to show the inclusion in the reverse sense. If $a$ is an arbitrary vector $a \in (M \cup N) \cap K$, it follows from the definition of intersection that $a \in M \cup N$ and $a \in K$. But every vector in $M \cup N$ is a sum of a vector in $M$ and a vector in $N$, because $M \cup N$ is finite-dimensional. Thus we may write $a = b + c$ with $b \in M$ and $c \in N$. But according to the assumption $M \subseteq K$, we have also $b \in K$. Thus $c = a - b \in K$ and $c \in N$. Therefore $c \in K \cap N$, and we have shown that $a$ is a sum of two vectors $b \in M$ and $c \in N \cap K$. This means that $a \in M \cup (N \cap K)$, and thus

$$M \cup (N \cap K) = (M \cup N) \cap K.$$

Hence, the lattice is modular.

When we examine this demonstration, we observe that we have used the finite dimensionality in an essential way, at the point where we wrote $a = b + c$ with $b \in M$ and $c \in N$. This remark leads us easily to the construction of a counterexample if the two spaces $M$ and $N$ are infinite-dimensional.

Let us assume that $d \in M \cup N$ and that $d$ is a limit vector of vectors of the form $(a + b)$ with $a \in M$ and $b \in N$, but that $d$ is not itself of this form. Let us further assume, for simplicity, that $M \cap N = 0$ and let $K$ be the subspace generated by $M$ and this vector $d$. Every vector $x \in N \cap K$ satisfies $x = \lambda d + b \in N$ with $b \in M$. But this is possible only under the stated conditions for $x = 0$. Thus $N \cap K = 0$. Therefore

$$M \cup (N \cap K) = M, \quad\text{but}\quad (M \cup N) \cap K = K \supset M.$$

Thus we conclude:

The subspaces of an infinite-dimensional Hilbert space are a nonmodular lattice.

PROBLEMS

1. Let $M$ and $N$ be the ranges of the projections $E$ and $F$ respectively; then $M \leftrightarrow N$ implies $EF = FE$ (cf. Section 2-5).

2. The lattice generated by two commuting projections is Boolean.

3. If two projections $E$ and $F$ generate a Boolean sublattice of the lattice of all projections, then they satisfy

$$(E \cap F) \cup (E \cap F') = E \quad\text{and}\quad (F \cap E) \cup (F \cap E') = F.$$

5-8. PROPOSITION SYSTEMS

In this section we shall give the answer to the question formulated in Section 5-6 as to the nature of the axiom which is to be postulated in place of the distributive property. Several equivalent and apparently independent formulations of this axiom have been given [9, 10, 11, and 12].

Let us consider two propositions $a$ and $b$ such that $a$ implies $b$: $a \subseteq b$. It is then reasonable to postulate that two such propositions are compatible with one another. That is, we postulate

$$a \subseteq b \;\Rightarrow\; a \leftrightarrow b. \tag{P}$$

Let us first show that in a Hilbert space, this axiom is automatically satisfied. Thus, if $E$ and $F$ are two projections such that $E \subseteq F$, or $EF = E$, then $FE = (EF)^* = E^* = E = EF$. Thus, the two projections must commute; and, according to the preceding section, they are compatible.

Let us next verify that axiom (P) is independent of the others; that is, it actually excludes certain lattices which satisfy all the other axioms. Two examples of orthocomplemented and complete lattices which do not satisfy (P) are given in Fig. 5-6 (a) and (b). Both of these lattices are nonmodular, and it is, in fact, easy to prove that modularity implies (P) (Problem 1). It seems, therefore, justified to call axiom (P) a weak modularity.

It follows immediately from axiom (P) that if $a \subseteq b'$, that is, if $a$ and $b$ are disjoint, then $a \leftrightarrow b'$. Furthermore, one can prove that if $a \leftrightarrow b$, it follows also that $a \leftrightarrow b'$ (Problem 2). This means that disjoint propositions are always compatible.

A further consequence of axiom (P) is the following useful distributivity property:

If $\{a_i\}$ ($i \in I$) is any collection of propositions, and if $a_i \leftrightarrow b$ for all $i \in I$, then

$$b \cap \Big(\bigcup_I a_i\Big) = \bigcup_I (b \cap a_i) \quad\text{and}\quad b \cup \Big(\bigcap_I a_i\Big) = \bigcap_I (b \cup a_i)$$

(Problem 3).

From this, one deduces easily that if $a_i$ ($i \in I$) is any subset of elements, and if $a_i \leftrightarrow b$ for all $i \in I$, then

$$\bigcup_I a_i \leftrightarrow b \quad\text{and}\quad \bigcap_I a_i \leftrightarrow b$$

(Problem 4).

As a final theorem, we mention three other equivalent formulations of axiom (P):

1) If we define by the segment $[a, b]$ the set of elements $x$ such that $a \subseteq x \subseteq b$, then the mapping $x \to (a \cup x') \cap b$ is an orthocomplementation for the segment $[a, b]$ (see reference 9);
2) $a \perp b$, $a \perp c$, $a \cup b = a \cup c$, and $b \subseteq c$ imply $b = c$;
3) the compatible complement $a'$ is unique (cf. reference 13).

Of these equivalent properties, the last (3) is especially interesting, since it is most easily interpretable in physical terms. These useful theorems simplify many steps in the study of such lattices.

The last axiom which we postulate for a proposition system is the atomicity axiom. It consists of two parts:

A1: For any proposition $a \neq \emptyset$, there exists a point $P$ (minimal proposition, cf. Section 5-4) such that $P \subseteq a$.

A2: If $Q$ is a point, then $a \subseteq x \subseteq a \cup Q$ implies $x = a$ or $x = a \cup Q$.

The physical justification for this axiom is not very strong; it is analogous to the justification which we have given for the atomicity in classical proposition systems. It is therefore desirable to draw as many conclusions as possible without the explicit use of A (= A1 and A2).

From now on we shall call a system of elements $\mathcal{L}$ which satisfies the axioms I, II, III, P, and A a proposition system. The kinematical structure which contains all the properties independent of the state of the system will be expressed entirely in the structure of the proposition system $\mathcal{L}$.

PROBLEMS

1. A modular orthocomplemented lattice satisfies axiom (P).

*2. The following statements are equivalent for a lattice which satisfies (P):
(1) $a \leftrightarrow b$.
(2) $a \leftrightarrow b'$.
(3) $(a \cap b') \cup b = (b \cap a') \cup a = a \cup b$.
(4) The four elements $a$, $b$, $a'$, $b'$ satisfy the distributive law for any combination.
(5) $(a \cap b) \cup (a \cap b') \cup (a' \cap b) \cup (a' \cap b') = I$.
(Cf. reference 9.)

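3. If $a_i$ ($i \in I$) is a subset of elements in a lattice satisfying (P), and if $a_i \leftrightarrow b$, then

$$b \cap \Big(\bigcup_I a_i\Big) = \bigcup_I (b \cap a_i) \quad\text{and}\quad b \cup \Big(\bigcap_I a_i\Big) = \bigcap_I (b \cup a_i)$$

for any $b \in \mathcal{L}$. (Cf. reference 9.)

4. If $a_i$ ($i \in I$) is an arbitrary subset of a lattice satisfying the axiom (P), then $a_i \leftrightarrow b$ for all $i \in I$ implies

$$\bigcap_I a_i \leftrightarrow b \quad\text{and}\quad \bigcup_I a_i \leftrightarrow b.$$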

REFERENCES

A readable account of some historical and epistemological questions concerning quantum mechanics is found in:

1. P. FRANK, Modern Science and its Philosophy. Collier Books (1961).

A valuable source book with many further references is

2. H. FEIGL AND M. BRODBECK, Readings in the Philosophy of Science. Appleton-Century-Crofts (1953).

The proposition calculus of quantum mechanics is given by:

3. G. BIRKHOFF AND J. VON NEUMANN, Ann. Math. 37, 823 (1936).

Although lattice theory led to many interesting developments in mathematics, there was little impact of this branch of mathematics on physics. The little there is may be found in:

4. G. BIRKHOFF, "Lattices in Applied Mathematics." Proc. Symp. Pure Math. Vol. II, Am. Math. Soc. (1961); especially p. 155.

The notion of compatibility is found in:

5. P. JORDAN, Arch. Math. 2, 166 (1950).

6. S. MAEDA, J. Sci. Hirosh. Univ. A19, 211 (1955).

For a convenient general introduction to lattice theory, without any physical applications, however, see the following two references:

7. M. L. DUBREIL-JACOTIN, L. LESIEUR, R. CROISOT, "Leçons sur la théorie des treillis" in Cahiers Scientifiques XXI. Paris: Gauthier-Villars (1953).

8. G. SZÁSZ, Introduction to Lattice Theory, 3rd ed. New York: Academic Press (1963).

9. C. PIRON, "Axiomatique Quantique," Helv. Phys. Acta 37, 439 (1964).

10. R. SCALETTAR, Thesis, Cornell University (1959).

11. L. H. LOOMIS, Mem. Amer. Math. Soc. 18, 36 (1955).

12. G. LUDWIG, Z. Physik 181, 233 (1964).

13. G. ROSE, Z. Physik 181, 331 (1964).

The relationship between logic and empirical science has been the subject of many studies. We mention here in the first place

14. P. MITTELSTAEDT, Sitzungsber. der Bay. Akad. der Wissensch., München (1959).

15. P. MITTELSTAEDT, Naturw. 47, 385 (1960).

16. P. MITTELSTAEDT, Philosophische Probleme der modernen Physik, Mannheim, 1966; especially Chapter VI.

17. E. SCHEIBE, Die kontingenten Aussagen der Physik. Frankfurt, 1964.

A point of view differing from ours is expressed in:

18. P. DESTOUCHES-FÉVRIER, "La Structure des Théories Physiques." Paris: Presses Universitaires de France (1951).

19. C. F. v. WEIZSÄCKER, "Komplementarität und Logik," Naturwiss. 42, 521, 545 (1955).

20. H. REICHENBACH, Philosophical Foundations of Quantum Mechanics, 3rd ed. Los Angeles: University of California Press (1948).

CHAPTER 6

STATES AND OBSERVABLES

There are two kinds of truths. To the one kind belong statements so simple and clear that the opposite assertion obviously could not be defended. The other kind, the so-called "deep truths," are statements in which the opposite also contains deep truth.

NIELS BOHR

In this chapter we introduce three important notions, that of state, that of observable, and the superposition principle. In Section 6-1 we compare the classical with the quantum mechanical notion of state, and point out some of the limitations of this notion in the latter case. State is, however, a measurable physical quantity, and the general aspects of such measurements are described in Section 6-2. The insight gained from this description is transferred into the precise mathematical description of state in axiomatic form, and some conclusions are drawn in Section 6-3.

The important notion of observable is introduced in Section 6-4, and it is shown that the essential feature of this object is a $\sigma$-homomorphism from Borel sets on the real line into the lattice of propositions. The properties of observables are further discussed in Section 6-5, and such important properties as spectrum, range, separability, etc., are introduced. Section 6-6 introduces compatibility for observables as a straightforward generalization of that same notion for propositions. In Section 6-7 we introduce the functional calculus of observables, and prove a theorem of von Neumann in the generalized form of Varadarajan concerning systems of compatible observables.

Section 6-8 contains first a description and then a precise mathematical formulation of the superposition principle. The final Section 6-9 deals with superselection rules, where the superposition principle is not generally true.

6-1. THE NOTION OF STATE

The notion of the state of a physical system is so familiar from its use in classical mechanics and field theory that it has entered quantum mechanics almost without any further analysis. Yet this notion becomes substantially modified in quantum mechanics, and it requires a much more careful analysis than that usually given for classical systems.

In order to exhibit this difference, let us for a minute examine the notion of state for a classical system. In a classical system we can distinguish certain properties which are independent of the state. Such properties appear in the description of the system as the number of degrees of freedom and the topological structure of the phase space. The dynamical property, which expresses itself in the structure of the Hamiltonian, is also independent of the state.

The state, on the other hand, appears as the initial condition which determines the solutions of the equations of motion of the system. In a classical mechanical system, the initial condition is determined by fixing the values of the positions and momenta, that is, by fixing a point in phase space. The evolution of the state is then described by the orbit which passes through this point in phase space (see Fig. 6-1).

Fig. 6-1 Symbolic representation of a state in classical mechanics represented by the initial condition $p_0$, $q_0$ which determines an orbit.

The physical significance of this description for classical systems is this: If one prepares the system in such a way that at some time the initial conditions $p_0$, $q_0$ are realized, then the system evolves in time according to the unique orbit which passes through the point $p_0$, $q_0$ in phase space. At every moment it is then in a state given by the values of $p$ and $q$ which are obtained for that value of time for this particular solution.

In quantum mechanics the specific description of the system in terms of the phase space is no longer possible. The structure that is independent of the state which replaces the classical phase space is the proposition system defined in Chapter 5.

The state of a quantum system is still a meaningful and useful notion if we attach this meaning closely to the empirical aspect of the state. This aspect is contained in the description: The state of a physical system is the result of a preparation of a system. The state thus embodies the specific history which preceded the instant to which the state refers. A preparation of a system is a series of manipulations which affect the system under consideration.

As simple as this description of the preparation of a state might appear at first sight, it is actually quite intricate because of the undefined notion of affecting a system. Without previous knowledge of the physical laws, it is not immediately obvious whether certain physical conditions will affect a system or not.

Let us consider some examples to make this clear. If the system is a magnetic ion, then a static electric field will (in a very good approximation) have no observable effect on the state of the ion. However, if the electric field is oscillating, then the state may be markedly affected.

The photons emitted from a gas discharge tube might be affected by the intensity of the discharge, but again, in a good approximation, they are not.

We are thus led to the notion of relevant conditions for the preparation of a state. It is an empirical fact that some conditions are relevant and others are not; we stress here that we have no prior knowledge as to which of the conditions are relevant and which are not. The determination of the relevant conditions is a question of physics.

We summarize this discussion with a formal definition.

Definition: A state is the result of a series of physical manipulations on the system which constitute the preparation of the state. Two states are identical if the relevant conditions in the preparation of the state are identical.

We shall add three remarks to this formal definition. The distinction between the system and its states cannot be maintained under all circumstances with the precision implied by this definition. The reason is that systems which we regard under normal circumstances as different may be considered as two different states of the same system. An example is a positronium and a system of two photons. We know that the former transforms spontaneously into the latter; and the adequate description for this process is that which assumes that a positronium and a pair of photons are two different states of one and the same system.

A similar situation is found in the description of a nucleon. A proton and a neutron may be considered two different physical systems, but a more unified and very useful description is obtained by considering them as two different states of a single system, called "nucleon." We speak then of isotopic spin, and we distinguish the two states by their isotopic spin value.

The second remark refers to the fact that there are certain conditions which are so obviously irrelevant that they are rarely considered seriously. For instance, only astrologists would include the phases of the moon as possible relevant conditions in the preparation of a micro system.

An important class of irrelevant physical conditions refers to the isotropy and homogeneity of space and time which are of basic importance for the objective verifiability of physical theories. For instance, it is a most fortunate fact that the epoch at which a state is prepared is usually irrelevant. This is very important for the possibility of measuring a state by a statistical series of experiments repeated at various times under otherwise identical conditions.

Likewise, the fact that an experiment on neutrinos, for instance, carried out in Geneva and in Brookhaven gives identical results means that the absolute position in space is not a relevant condition for the preparation of a state.

A final remark: The notion of state defined here has a limited, precise and technical meaning. It would lead to futile pseudo-problems if one tried to extend this meaning to situations where the notion is not applicable. Thus, it would be meaningless to speak about the state of the universe; and cosmological questions should therefore not be mixed into the kind of quantum mechanics that we discuss here.

6-2. THE MEASUREMENT OF THE STATE

A state is a measurable quantity. How can a state be determined by measure-ments? The measurements which are available are the set of yes—no experi-ments which form the propositions of the system. Let us suppose then thatwe try to determine these quantities.

Here we encounter a characteristic difficulty for quantum systems because not all elementary propositions are simultaneously measurable. The necessary and sufficient condition for this to be possible would be that every proposition is compatible with every other proposition. Since we shall see later that this is, in fact, not so, we cannot possibly determine the value of all the elementary propositions with a set of compatible measurements.

Thus, a state of a quantum system can only be measured if the system can be prepared an unlimited number of times in the same state. We have seen that the states of two systems are defined as identical if they are prepared under identical relevant conditions.

It is a fundamental fact that identical states do not yield identical results for the truth or falsehood of a proposition. Thus, a measurement of proposition a may sometimes give the value 1 (true) and sometimes the value 0 (false).

An example of such behavior is, for instance, found when one observes linearly polarized photons with an analyzer with its axis at 45° to the axis of the polarizer. It is possible to carry out such experiments with individual photons; and one observes that each individual photon either passes the analyzer or it does not. If a statistical study is made with a large number of photons, one observes that about half the photons pass through the analyzer and half are absorbed. It is, therefore, impossible to attribute to these photons the property of being polarized at 45°, nor is it possible to deny this property, since some of them do pass the analyzer.

From this example we see clearly that what we may expect in quantum mechanics is not a definite value for each of the elementary propositions a in a given state, but at the most a probability p(a) for this value. This probability would be measured by making a large number n of experiments and recording the number n(a) of those experiments which give the value 1 for the proposition a. The probability in question would then be expected to be

$$p(a) = \lim_{n \to \infty} \frac{n(a)}{n} \tag{6-1}$$

so that it could be determined in principle to an arbitrary degree of accuracy by making a sufficiently large number n of measurements.
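The frequency definition (6-1) can be illustrated with a small simulation of the 45° analyzer experiment. This is only a sketch: the function name `estimate_probability` is hypothetical, and it assumes that each photon passes independently with probability cos² of the angle between polarizer and analyzer, a rule that is not derived in this section.

```python
import math
import random


def estimate_probability(angle_deg: float, n: int, seed: int = 0) -> float:
    """Return n(a)/n for the proposition a = 'the photon passes the analyzer'.

    Assumption: each photon passes independently with probability
    cos^2(angle), the usual rule for an ideal linear analyzer.
    """
    rng = random.Random(seed)
    p_pass = math.cos(math.radians(angle_deg)) ** 2
    n_a = sum(1 for _ in range(n) if rng.random() < p_pass)  # count of "yes" outcomes
    return n_a / n


# The relative frequency settles near 1/2 as n grows (analyzer at 45 degrees).
for n in (100, 10_000, 200_000):
    print(n, estimate_probability(45.0, n))
```

No finite run gives p(a) exactly; the limit in (6-1) is an idealization of the trend seen as n increases.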

It should be pointed out that this feature of elementary propositions is observed also for classical systems. Thus, for instance, if our initial prescription is to throw a die (without further detail), we observe one of the sides to turn up with a probability 1/6. The difference between classical and quantal systems is that the classical state can be further determined by specifying further relevant conditions (such as throwing the die from a certain position in a certain direction and with a certain velocity) while for quantum systems such specification is usually not possible.

In any case we expect that the measured quantity is a probability function p(a) defined on all the propositions. The physical interpretation which we attach to this function implies certain mathematical properties which we shall now enumerate.

6-3. DESCRIPTION OF STATES

According to the description of the preceding section, a state is mathematically determined by a real valued function p(a) on all the propositions $a \in \mathcal{L}$. We shall require the following properties for the states p(a):

1) $0 \le p(a) \le 1$.

2) $p(\emptyset) = 0$, $p(I) = 1$.

3) If $\{a_i\}$ is a sequence of propositions satisfying $a_i \subseteq a_k'$ for all pairs ($i \ne k$), then $\sum_i p(a_i) = p(\bigcup_i a_i)$.

4) For any sequence $a_i$, $p(a_i) = 1$ implies $p(\bigcap_i a_i) = 1$.

5) (a) If $a \ne \emptyset$, then there exists a state p such that $p(a) \ne 0$.
(b) If $a \ne b$, then there exists a state p such that $p(a) \ne p(b)$.

Let us examine these axioms to determine to what extent they reflect the physical meaning of the state.

1) There is not much to say here, since this property is a direct consequence of the definition, Eq. (6-1).

2) $p(\emptyset) = 0$ means that the absurd proposition is always false, and $p(I) = 1$ says that the trivial proposition is always true. This is consistent with the idealized nature of these propositions.


3) This axiom is better discussed first in its finite form. Let, for instance, a and b be two propositions such that $a \subseteq b'$. We shall refer to such a pair as disjoint. Property (3) says then that

$$p(a) + p(b) = p(a \cup b). \tag{6-2}$$

Let us remember that the proposition $a \cup b$ has the physical interpretation of "a or b." In view of Eq. (6-1), the relation (6-2) is thus true if a and b are compatible and if

$$n(a) + n(b) = n(a \cup b).$$

The condition for this to hold is that $n(a \cap b) = 0$. But $a \cap b = \emptyset$, and thus condition (2) implies at least $n(a \cap b)/n \to 0$ (for $n \to \infty$). (The compatibility of a and b is a consequence of axiom (P), cf. Problem 1.)

Condition (3) is thus easily justified for two disjoint propositions, and hence for finite sequences of disjoint propositions. It is, however, postulated for infinite sequences, and this is a restriction which cannot be justified any longer by direct appeal to the experimental determination of p(a). The extension to infinite sequences is a convenient mathematical axiom which corresponds exactly to the analogous axiom in Kolmogorov's theory of probability.

A simple consequence of (3) is: If $a \leftrightarrow b$, then $p(a) + p(b) = p(a \cap b) + p(a \cup b)$. This relation is always true for any pair of elements if p(a) is a probability function on a Boolean algebra of sets (Problem 3), but it need not be true for propositions which are not compatible.
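A small numerical check shows the relation holding for a compatible pair and failing for an incompatible one. The sketch below uses a model that is an assumption at this point in the text: propositions are lines through the origin of a real plane (directions of linear polarization), and a pure state polarized along the angle phi assigns the probability cos² of the angle difference. The names `p_line` and `both_sides` are hypothetical.

```python
import math


def p_line(theta_deg: float, phi_deg: float) -> float:
    """Probability of the proposition 'polarized along theta' in the pure
    state 'polarized along phi' (assumed cos^2 rule)."""
    return math.cos(math.radians(theta_deg - phi_deg)) ** 2


def both_sides(theta_a: float, theta_b: float, phi_deg: float):
    """Return (p(a) + p(b), p(a n b) + p(a u b)) for two distinct lines a, b.

    For distinct lines in the plane the meet is the absurd proposition
    (probability 0) and the join is the trivial one (probability 1).
    """
    lhs = p_line(theta_a, phi_deg) + p_line(theta_b, phi_deg)
    rhs = 0.0 + 1.0
    return lhs, rhs


print(both_sides(0.0, 90.0, 30.0))  # orthogonal (compatible) pair: the two sides agree
print(both_sides(0.0, 45.0, 0.0))   # incompatible pair: the two sides differ
```

In this model two distinct lines are disjoint, and hence compatible, exactly when they are orthogonal; only then do the two sides agree for every state.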

The reader will have noticed that we are dealing with a kind of generalized probability. The classical probability is defined as a measure on a Boolean algebra of sets. The probability which we introduce here is a measure on a non-Boolean lattice. Some mathematical aspects of this notion have been studied by Varadarajan [2].

4) We discuss this property again first for two elements. Let p(a) = p(b) = 1. Then both propositions a and b are true. It is therefore reasonable to require that the proposition "a and b" also be true, that is, that

$$p(a) = p(b) = 1 \quad \text{implies} \quad p(a \cap b) = 1.$$

Property (4) is the generalization of this to an infinite sequence of propositions. Again, the same remarks apply that were made in connection with property (3), as to the idealized nature of this axiom.

For Boolean lattices property (4), when restricted to finite sequences, is a consequence of the others. This is no longer true for non-Boolean lattices (cf. Problem 4). We shall not always use property (4) in its infinite form in the following discussion. If we use only the finite form of (4), we shall replace it by

4)' p(a) = p(b) = 1 implies $p(a \cap b) = 1$.


An immediate consequence of (4)' and the other axioms is:

$$p(a) = p(b) = 0 \quad \text{implies} \quad p(a \cup b) = 0$$

(cf. Problem 5).

5) This is an axiom which guarantees that there are sufficiently many states. If it were not true, then a proposition a for which p(a) = 0 for all states would be, in a certain sense, unobservable. Likewise, if p(a) = p(b) were true for all states, then the pair of propositions a, b would be physically indistinguishable; hence, they might as well be identified. It is possible that the lattice can be redefined in such a way that property (5) is always true. In this sense property (5) seems not a new physical property but merely a requirement which assumes the avoidance of redundancy.

The first question that ought to be settled is whether there exist states which possess the properties enumerated. This question is by no means easy to answer, because there exist examples of Boolean lattices which admit no states at all [3, p. 186]. It is thus not profitable to tackle this question in this general setting. In the more special context of complex quantum mechanics, we shall exhibit plenty of states, and in fact obtain all of them.

The states form a convex set. This means the following. If $p_1$ and $p_2$ are two different states, then

$$p = \lambda_1 p_1 + \lambda_2 p_2 \quad \text{with} \quad \lambda_1 + \lambda_2 = 1, \tag{6-3}$$

and $\lambda_1 > 0$, $\lambda_2 > 0$, is again a state. This is easily verified (Problem 6). A state which is thus represented with the help of two others is said to be a mixture. A state which cannot be represented as a mixture of two others is said to be pure. The pure states are thus the extremal subset of a convex set.

We shall define the function $\sigma(a) \equiv p(a) - p^2(a)$ as the dispersion of the state. We shall sometimes also speak of the overall dispersion $\sigma$ defined by

$$\sigma = \sup_{a \in \mathcal{L}} \sigma(a). \tag{6-4}$$

If $\sigma = 0$ we shall call the state dispersion-free. For such a state we have obviously the alternatives

$$p(a) = \begin{cases} 1 \\ 0 \end{cases} \tag{6-5}$$

The overall dispersion of a mixture of two different states is always > 0, so that dispersion-free states are always pure (cf. Problems 7 and 8). Pure states need not be dispersion-free, as we shall see later (Section 7-3).
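The statement about mixtures can be checked numerically in a classical toy model. The sketch below is an illustration under stated assumptions, not part of the text's argument: the lattice is the Boolean algebra of all subsets of a three-point set, the dispersion-free states are point masses, and the names `point_state`, `mix`, and `dispersion` are hypothetical.

```python
from itertools import combinations

OMEGA = (0, 1, 2)
# All propositions of the Boolean lattice: every subset of OMEGA.
PROPS = [frozenset(c) for r in range(len(OMEGA) + 1) for c in combinations(OMEGA, r)]


def point_state(w):
    """Dispersion-free state concentrated on the point w."""
    return {a: (1.0 if w in a else 0.0) for a in PROPS}


def mix(lam1, p1, p2):
    """Mixture lam1*p1 + lam2*p2 with lam2 = 1 - lam1."""
    return {a: lam1 * p1[a] + (1.0 - lam1) * p2[a] for a in PROPS}


def dispersion(p):
    """sigma(a) = p(a) - p(a)^2 for every proposition a."""
    return {a: p[a] - p[a] ** 2 for a in PROPS}


lam1 = 0.3
p1, p2 = point_state(0), point_state(1)
sigma = dispersion(mix(lam1, p1, p2))
overall = max(sigma.values())
print(overall)  # close to lam1 * (1 - lam1)
```

For each proposition a the computed value agrees with $\lambda_1 \lambda_2 (p_1(a) - p_2(a))^2$, which is strictly positive whenever the two point masses disagree on a.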


PROBLEMS

1. Two disjoint propositions are compatible. That is, $a \subseteq b' \Rightarrow a \leftrightarrow b$. [Hint: Use the result of Problem 2 in Section 5-8.]

2. If $a \leftrightarrow b$, then there exist three disjoint propositions $a_1$, $b_1$ and c such that

$$a = a_1 \cup c, \quad b = b_1 \cup c.$$

They are given by

$$a_1 = a \cap b', \quad b_1 = b \cap a', \quad c = a \cap b.$$

3. For any pair $a, b \in \mathcal{L}$ and any state, $a \leftrightarrow b$ implies

$$p(a) + p(b) = p(a \cup b) + p(a \cap b).$$

(Use result of Problem 2, and axiom 3.)

4. If $\mathcal{L}$ is Boolean and p(a) satisfies properties (1), (2), and (3), then

$$p(a) = p(b) = 1 \quad \text{implies} \quad p(a \cap b) = 1.$$

5. If p(a) = p(b) = 0, then $p(a \cup b) = 0$.

6. If $p_1$ and $p_2$ are two different states and $\lambda_1$, $\lambda_2$ two positive numbers such that $\lambda_1 + \lambda_2 = 1$, then $p = \lambda_1 p_1 + \lambda_2 p_2$ is also a state.

7. If $p_1$ and $p_2$ are two dispersion-free states, then their mixture $p = \lambda_1 p_1 + \lambda_2 p_2$ has dispersion $\sigma = \lambda_1 \lambda_2 (p_1 - p_2)^2$, where $\lambda_1 > 0$, $\lambda_2 > 0$, $\lambda_1 + \lambda_2 = 1$.

8. A dispersion-free state is always pure.

9. Conditions (1), (2), (3), and (4) for states are equivalent to conditions (1), (2), and the following:

(3') If $a \subseteq b'$, then $p(a) + p(b) = p(a \cup b)$.

(4') If p(a) = p(b) = 1, then $p(a \cap b) = 1$.

(C) Continuity. For any increasing sequence $a_1 \subseteq a_2 \subseteq \cdots \subseteq a_n \subseteq \cdots$,

$$\lim_{n \to \infty} p(a_n) = p\Bigl(\bigcup_n a_n\Bigr)$$

(cf. reference 10).

6-4. THE NOTION OF OBSERVABLES

In physics an observation of a physical system consists in the manipulation of certain instruments (the measuring device) and the eventual reading (or recording) of a scale. The scale may consist of a continuum of values, or a discrete set, or both. The simplest "scale" would be the counter which registers the number of occurrences of certain elementary events. In any case, a measurement will give the answer to a number of elementary propositions corresponding to the different aggregates of values of the possible measuring result.


Since a measuring device is one experiment only, the question of compatibility cannot arise. Thus we expect all the elementary propositions associated with an observable to be mutually compatible. Moreover, we expect these propositions to be a Boolean algebra.

In order to formalize these properties, it is convenient to introduce the notion of a $\sigma$-homomorphism.

Definition: A $\sigma$-homomorphism is a mapping x from $B(R^1)$ (the Borel sets on the real line) into $\mathcal{L}$, $x : A \to x(A) \in \mathcal{L}$ for all $A \in B(R^1)$, satisfying the following conditions:

1) $x(\emptyset) = \emptyset$, $x(R^1) = I$;

2) $A_1 \perp A_2$ implies $x(A_1) \perp x(A_2)$;

3) for any sequence $A_i$ ($i = 1, 2, \ldots$) such that $A_i \perp A_k$ for all pairs, we have

$$x\Bigl(\bigcup_i A_i\Bigr) = \bigcup_i x(A_i).$$

Here we have used the notation $a \perp b$ for the disjoint relation $a \subseteq b'$ of a lattice.

All these properties, except the $\sigma$-additivity, flow directly from the heuristic meaning of an observable. The $\sigma$-additivity, property (3), can be justified only by the greater simplicity of the mathematical object thereby defined.

We shall call such a $\sigma$-homomorphism an observable; it is also sometimes called an $\mathcal{L}$-valued measure or a random variable. The last two terms suggest the relations to the spectral measure and the observable in quantum mechanics, on the one hand, and the relation to classical probability calculus on the other hand. The above definition is indeed a generalization of these concepts and it embraces them both.

In classical mechanics, for instance, the system defines a phase space $\Omega$, and a random variable can be given by giving a Borel function $f(\omega)$ ($\omega \in \Omega$) from $\Omega$ into the reals $R^1$. Since it is a Borel function, the set

$$\{\omega : f(\omega) \in A\}$$

defines a Borel subset of $\Omega$ for any Borel subset A of $R^1$. It is called the inverse image $f^{-1}(A)$ of A. The correspondence $A \to f^{-1}(A)$ defines a $\sigma$-homomorphism of the Borel sets on $R^1$ into those on $\Omega$. Here the function f serves no other purpose than to establish this homomorphism via its inverse image. Thus, in order to rid the notion of observable of irrelevancies, it is advisable to leave out the function f altogether and concentrate on the homomorphism itself. Then one sees immediately that the fact that the image of the homomorphism is an algebra of subsets is irrelevant, too; it can just as well be any Boolean subalgebra of a (not necessarily Boolean) proposition system.
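The inverse-image homomorphism of the classical case can be sketched on a finite toy model. Everything concrete here is an assumption made for illustration: the seven-point "phase space" `OMEGA`, the quantity `f`, and the use of finite sets of real values in place of Borel sets.

```python
OMEGA = range(-3, 4)          # toy finite "phase space" (assumption)


def f(w):
    """A classical quantity on OMEGA, here the square of the point."""
    return float(w * w)


def x(A):
    """Inverse image f^{-1}(A) = {w : f(w) in A} for a set A of real values."""
    return frozenset(w for w in OMEGA if f(w) in A)


ALL_VALUES = frozenset(f(w) for w in OMEGA)   # stands in for the whole real line
A1, A2 = frozenset({0.0, 1.0}), frozenset({4.0})

print(sorted(x(A1)), sorted(x(A2)))
print(x(A1 | A2) == x(A1) | x(A2))            # unions are preserved
```

The three defining conditions of the homomorphism can be read off directly: the empty set goes to the empty set, the full set of values goes to all of `OMEGA`, and disjoint value sets have disjoint images whose union is the image of the union.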


In this way we arrive at the abstract notion of an observable which can be used equally well for classical as for quantal systems. In ordinary quantum mechanics, the propositions are the projection operators on some Hilbert space, and the observable becomes what we have called in Section 4-3 a spectral measure. Since every spectral measure determines a self-adjoint operator, it follows immediately that an observable is nothing else than a self-adjoint operator in that case, a well-known fact taught in all elementary texts of quantum mechanics.

It is easy to prove that the definition of an observable as given here implies that all the propositions x(A) are compatible with one another (Problem 1).

There is a converse of this. If a and b are two compatible propositions, then there exist an observable x and two Borel sets $A_1$ and $A_2$ such that $a = x(A_1)$ and $b = x(A_2)$.

For the proof, we select four arbitrary points on $R^1$ denoted by 1, 2, 3, and 4. For every set A which contains none of these points, we set $x(A) = \emptyset$. For the others we define

$$x(\{1\}) = a \cap b', \quad x(\{2\}) = b \cap a', \quad x(\{3\}) = a \cap b, \quad x(\{4\}) = a' \cap b'.$$

Using compatibility of a and b, one verifies easily that

x({1, 3}) = a, and x({2, 3}) = b.

Thus A1 = {1, 3}, A2 = {2, 3} will do what is claimed of them.
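The four-point construction can be tried out in a classical toy model where propositions are subsets of a finite set, so that any two of them are compatible. The sets `OMEGA`, `a`, and `b` below are arbitrary choices, and extending x to arbitrary sets by taking the union over the chosen points they contain is an assumption consistent with additivity rather than a quotation from the proof.

```python
OMEGA = frozenset(range(8))
a = frozenset({0, 1, 2, 3})
b = frozenset({2, 3, 4, 5})

# Values of x on the four chosen points 1, 2, 3, 4 of the real line.
ATOMS = {
    1: a - b,            # a n b'
    2: b - a,            # b n a'
    3: a & b,            # a n b
    4: OMEGA - (a | b),  # a' n b'
}


def x(A):
    """x(A) is the union of the atoms x({k}) for the chosen points k lying in A."""
    result = frozenset()
    for k, atom in ATOMS.items():
        if k in A:
            result |= atom
    return result


print(x({1, 3}) == a, x({2, 3}) == b)
```

In a non-Boolean lattice the same recipe works only because compatibility of a and b guarantees $a = (a \cap b') \cup (a \cap b)$; without it the union of the first and third atoms could fall short of a.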

PROBLEMS

1. The range of an observable is a Boolean $\sigma$-subalgebra of propositions.

2. If $a_i$ is an infinite sequence of mutually compatible propositions, then there exist an observable x and Borel sets $A_i$ such that $a_i = x(A_i)$.

6-5. PROPERTIES OF OBSERVABLES

For every state p, the observable x induces a numerical-valued measure $\mu_x(A)$ on the Borel sets defined by the formula

$$\mu_x(A) = p(x(A)). \tag{6-6}$$

It represents the probability of finding the value of x in the set A when the system is in the state p. Physically it can be determined by performing a large number of measurements of the proposition x(A) on systems in the identical state p. We call it the expectation value.
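A two-valued observable gives a minimal illustration of the induced measure p(x(A)). The sketch assumes the conventional Hilbert-space picture that is only introduced later in the text: an observable with eigenvalues +1 and -1, and a pure state with components (cos(theta/2), sin(theta/2)) in its eigenbasis. The names `weights` and `mu` are hypothetical, and Borel sets are represented by predicates on the reals.

```python
import math

EIGENVALUES = (+1.0, -1.0)   # spectrum of the toy observable (assumption)


def weights(theta):
    """Probabilities of the two eigenvalues in the pure state
    (cos(theta/2), sin(theta/2)), written in the eigenbasis of x."""
    return {+1.0: math.cos(theta / 2) ** 2, -1.0: math.sin(theta / 2) ** 2}


def mu(A, theta):
    """mu(A) = p(x(A)): total weight of the eigenvalues that lie in A.

    A is any predicate on the reals, standing in for a Borel set.
    """
    w = weights(theta)
    return sum(w[lam] for lam in EIGENVALUES if A(lam))


theta = math.pi / 3
print(mu(lambda t: t > 0, theta))    # probability of finding the value +1
print(mu(lambda t: True, theta))     # the whole line has measure 1
mean = sum(lam * mu(lambda t, lam=lam: t == lam, theta) for lam in EIGENVALUES)
print(mean)                          # first moment of mu, equal to cos(theta)
```

Any set that contains neither eigenvalue receives measure zero in every state, which is the situation described by the null sets introduced next.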


A Borel set A will be called an x-null set if $p(x(A)) = 0$ for all states p. Let $\{A_i\}$ be the class of all open x-null sets. Then $A_0 = \bigcup_i A_i$ is also open and x-null. Its complement $\Lambda = A_0'$ is therefore closed, and it is called the spectrum of x. If $\Lambda$ is compact, then x is called bounded and we define the bound of an observable as

$$\|x\| = \sup \{ |\lambda| : \lambda \in \Lambda \}.$$

The range of an observable x is the subset $\mathcal{B}_x \subseteq \mathcal{L}$ defined by

$$\mathcal{B}_x = \{ a : a = x(A) \text{ for some } A \in B(R^1) \}.$$

It is easily verified that $\mathcal{B}_x$ is a Boolean sublattice of propositions in $\mathcal{L}$ (Problem 1).

At this point the natural question arises: What are the properties which characterize the Boolean sublattices $\mathcal{B}_x$? In other words, given a Boolean sublattice $\mathcal{B} \subseteq \mathcal{L}$, what additional property must it have in order that there exist an observable x such that $\mathcal{B} = \mathcal{B}_x$? For physical reasons we are interested in proposition systems $\mathcal{L}$ for which every Boolean sublattice is the range of some observable.

This condition must evidently be a restriction on the size of the lattice $\mathcal{L}$, since we have seen already that finite and infinite Boolean lattices which are generated by a countable set are the range of some observable (cf. Problem 2 of Section 6-5). We shall now establish that this property is also necessary.

First we define the notion of separability. Let $\mathcal{S} \subseteq \mathcal{L}$ be a subset of mutually compatible propositions of $\mathcal{L}$. We denote by $\mathcal{B}(\mathcal{S})$ the Boolean sublattice generated by $\mathcal{S}$. We remind the reader that this is the smallest Boolean sublattice containing $\mathcal{S}$.

A Boolean sublattice $\mathcal{B} \subseteq \mathcal{L}$ is called separable if there exists a countable subset $\{a_i\} \subseteq \mathcal{B}$ such that $\mathcal{B} = \mathcal{B}(\{a_i\})$. That is, there should exist in $\mathcal{B}$ a countable generating set.

We shall call a proposition system $\mathcal{L}$ separable if every Boolean sublattice of $\mathcal{L}$ is separable.

The usefulness of separability is revealed in the following:

Theorem: A Boolean sublattice $\mathcal{B}$ of a proposition system $\mathcal{L}$ is separable if and only if there exists an observable x such that $\mathcal{B} = \mathcal{B}_x$. (For the proof, cf. proposition 3.15 of reference 2.)

From here on we shall make the assumption that the proposition system $\mathcal{L}$ is separable. Every Boolean sublattice is then the range of some observable.

PROBLEMS

1. The range of an observable is a Boolean sublattice of propositions.

2. Observables are partially ordered as follows: We say $x \subseteq y$ if $\mathcal{B}_x \subseteq \mathcal{B}_y$.


6-6. COMPATIBLE OBSERVABLES

The notion of compatibility for pairs of propositions is, as we have seen in Section 5-5, easily extended to sets of propositions. Thus we may also transfer it to pairs or sets of observables. We define:

Two observables x and y are compatible if every proposition of the range $\mathcal{B}_x$ is compatible with every proposition in the range $\mathcal{B}_y$.

We write in this case $x \leftrightarrow y$ and $\mathcal{B}_x \leftrightarrow \mathcal{B}_y$. The sublattice generated by the set $\mathcal{B}_x \cup \mathcal{B}_y$ of propositions is then a Boolean sublattice. A family $\{x_i\}$ of observables is said to be compatible if $x_i \leftrightarrow x_k$ for every pair of $\{x_i\}$.

There is a special case which is of great interest in the following. If an observable x is such that $\mathcal{B}_x$ is a maximal Boolean lattice, then we say it is complete. Likewise, a system of observables $x_i$, all of which are compatible with one another, is called complete if the Boolean algebra generated by the ensemble of all $\mathcal{B}_{x_i}$ is maximal.

Every system of compatible observables can always be completed by adjoining to them a certain set of compatible observables.

PROBLEMS

1. Every Boolean sublattice can be extended to a maximal Boolean sublattice.

2. If $\{x_i\}$ is a compatible set of observables, then either it is complete or it can be completed by adjoining a certain set of additional observables.

3. If $x \subseteq y$ (cf. Problem 2 of Section 6-6), then $x \leftrightarrow y$.

6-7. THE FUNCTIONAL CALCULUS FOR OBSERVABLES

There is an intuitive meaning to a function u(x) of an observable x. If an observation of x yields the value $\xi$, then u(x) should give the result $u(\xi)$. This of course does not suffice to define u(x) as an observable. However, $u(\xi) \in A$ if and only if $\xi \in u^{-1}(A)$. From this we see that the natural definition of the observable u(x) would be the homomorphism which sends the set $A \in B(R^1)$ into $x(u^{-1}(A))$. Thus we define:

Let u(t) be a real-valued Borel function on $R^1$, and x an observable. We define the observable u(x) as the mapping of $B(R^1)$ into $\mathcal{L}$ defined by $u(x) : A \to u(x)(A) \equiv x(u^{-1}(A))$.
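The definition of u(x) through the inverse image can be sketched in a classical toy model. The assumptions are the same kind as before and are purely illustrative: a finite set `OMEGA` plays the role of the proposition system, the observable x is given by a quantity `f`, Borel sets are represented by predicates on the reals, and `function_of` is a hypothetical helper name.

```python
OMEGA = range(-3, 4)              # toy finite phase space (assumption)


def f(w):
    """The classical quantity that defines the observable x."""
    return float(w)


def x(A):
    """x(A) = {w : f(w) in A}; the set A is given as a predicate on the reals."""
    return frozenset(w for w in OMEGA if A(f(w)))


def function_of(u, obs):
    """Return the observable u(obs), defined by A -> obs(u^{-1}(A))."""
    def u_obs(A):
        return obs(lambda t: A(u(t)))   # t lies in u^{-1}(A) iff u(t) lies in A
    return u_obs


square = lambda t: t * t
x_squared = function_of(square, x)

A = lambda s: 0.5 < s < 4.5           # the interval (0.5, 4.5)
print(sorted(x_squared(A)))           # points w with 0.5 < f(w)^2 < 4.5
```

Since every value of `x_squared` is obtained by calling `x` on some predicate, the range of u(x) is automatically contained in the range of x.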

If A runs through all Borel sets $B(R^1)$, then $u^{-1}(A)$ runs through a subclass of $B(R^1)$. It follows that

$$\mathcal{B}_{u(x)} \subseteq \mathcal{B}_x.$$

More interesting is the converse of this expressed in the following:

Theorem: If there are two observables x and y such that $\mathcal{B}_y \subseteq \mathcal{B}_x$, then there exists a Borel function u such that y = u(x).


For the proof of this theorem it is necessary to construct a Borel function u(t) with the property

$$x(u^{-1}(A)) = y(A).$$

Let $r_n$ ($n = 1, 2, \ldots$) be an enumeration of the rational numbers. Set

$$a_n = y((-\infty, r_n)).$$

We shall show below that it is possible to construct Borel sets $A_n$ with the properties

1) $x(A_n) = a_n$,

2) $A_i \subseteq A_k$ whenever $r_i < r_k$,

3) $x(\bigcup_n A_n) = I$.

For the moment we shall assume this construction possible. Denote by $A_0 = \bigcup_n A_n$ the union of all these sets $A_n$. For any $t \in A_0$ define

$$\bar{u}(t) = \inf \{ r_n : t \in A_n \}.$$

We have then $\{t : \bar{u}(t) < s\} = \bigcup_{n : r_n < s} A_n$, so that $\bar{u}(t)$ is a Borel function. If we define

$$u(t) = \begin{cases} \bar{u}(t) & \text{for } t \in A_0, \\ 0 & \text{for } t \in A_0', \end{cases}$$

then

$$\{t : u(t) < s\} = \begin{cases} \{t : \bar{u}(t) < s\} & \text{for } s \le 0, \\ \{t : \bar{u}(t) < s\} \cup A_0' & \text{for } 0 < s, \end{cases}$$

so that u(t) is also a Borel function. Furthermore

$$x(\{t : u(t) < s\}) = \bigcup_{n : r_n < s} x(A_n) = \bigcup_{n : r_n < s} y((-\infty, r_n)) = y((-\infty, s)) \quad \text{for any real } s.$$

Since the sets $(-\infty, s)$ generate all the Borel sets, we conclude

$$x(u^{-1}(A)) = y(A) \quad \text{for all } A \in B(R^1).$$

Thus the function u(t) does everything that is required of it, and the theorem is proved.

There remains the construction of the sets $A_n$ with the three enumerated properties. We do this by induction. Let $A_1, A_2, \ldots, A_n$ be so constructed that they satisfy conditions (1) and (2), and let 1', 2', ..., n' be a permutation of 1, 2, ..., n such that $r_{1'} < r_{2'} < \cdots < r_{n'}$.

We now use the hypothesis of the theorem. Since $a_{n+1} = y((-\infty, r_{n+1})) \in \mathcal{B}_y$, it follows from the hypothesis that $a_{n+1} \in \mathcal{B}_x$ also. Thus there exists a set $A \in B(R^1)$ such that $a_{n+1} = x(A)$. We now distinguish three cases:

i) $r_{n+1} < r_{1'}$;

ii) there exists a p such that $1 \le p < n$ and $r_{p'} < r_{n+1} < r_{(p+1)'}$;

iii) $r_{n'} < r_{n+1}$.

We define,

in case (i), $A_{n+1} = A \cap A_{1'}$,

in case (ii), $A_{n+1} = A_{p'} \cup (A \cap A_{(p+1)'})$,

in case (iii), $A_{n+1} = A_{n'} \cup A$,

so that properties (1) and (2) are also satisfied for the sets $A_1, A_2, \ldots, A_{n+1}$.

We find

$$\bigcup_n x(A_n) = \bigcup_n y((-\infty, r_n)) = I,$$

so that property (3) is also verified. This finishes the construction and thereby completes the proof of the theorem.

We finish this section with an important theorem from the calculus of observables. For self-adjoint operators, von Neumann has proved the theorem:

For every commuting family $A_i$ ($i \in I$) of self-adjoint operators there exist a self-adjoint operator x and Borel functions $u_i$ such that $A_i = u_i(x)$.

This theorem can be (and has been) generalized for observables defined on lattices. Its generalization was conjectured by Mackey [4, p. 71] and its proof was given by Varadarajan [2].

With the result of the preceding theorems the proof becomes quite transparent, so we shall give it here.

Theorem (von Neumann-Varadarajan): Let $x_i$ ($i \in I$) be a family of compatible observables in a separable lattice $\mathcal{L}$ so that $x_i \leftrightarrow x_j$ for all $i, j \in I$. Then there exist an observable x and Borel functions $u_i$ such that

$$x_i = u_i(x).$$

Let $\mathcal{B}$ be the Boolean sublattice generated by the ranges $\mathcal{B}_{x_i}$ of the observables $x_i$. Since $\mathcal{L}$ is separable, $\mathcal{B}$ is separable too. By the theorem of Section 6-5 there exists an observable x such that $\mathcal{B} = \mathcal{B}_x$. Clearly $\mathcal{B}_{x_i} \subseteq \mathcal{B}_x$, and the theorem of Section 6-7 then tells us that there exist Borel functions $u_i$ such that $x_i = u_i(x)$. This proves everything we needed.

It should be mentioned that this theorem is weaker than the one given by Varadarajan, since we have made stronger assumptions about the proposition system. The significance of this theorem is more mathematical than practical. The reason is that it is often easy to describe physical arrangements which measure a set of commuting observables, while it may be practically impossible to describe such an arrangement for the observable x of which they are all functions. This difficulty is, however, only a practical one and must be sharply distinguished from the impossibility in principle of certain measurements to be discussed below.

The usefulness of the theorem for theoretical considerations is that it permits an extension of the functional calculus to systems of compatible observables (cf. Problems 4 and 5).

The extension of the functional calculus for noncompatible observables is an unsolved problem; thus it is not possible to give a meaning to the observable (x + y) if x and y are noncompatible. An obvious way to attempt doing that would be through the method of expectation values. Thus if there exists an observable z with the property that

$$\langle z \rangle_p = \langle x \rangle_p + \langle y \rangle_p \quad \text{for all states } p,$$

then one could define z = x + y. However, the existence of such a z cannot be proved with lattice-theoretical methods alone.

In Segal's method of axiomatic quantum mechanics [5], one starts with a noncommutative algebra of bounded observables rather than a lattice of propositions. Such an algebra is then equipped with a linear functional which is interpreted as expectation value; but the connection with the lattice of propositions is then lost.

The link between the lattice and the observables and their expectation values can be established explicitly only for conventional quantum mechanics in complex Hilbert spaces. The connecting link is the important theorem of Gleason [6] which so far has not been generalized sufficiently to be applicable for the more general situation envisaged here.

PROBLEMS

1. If x and y are two observables such that $\mathcal{B}_x = \mathcal{B}_y$, then there exist two Borel functions u and v such that y = u(x) and x = v(y).

2. If x is an observable and $u_1$ and $u_2$ are Borel functions on $R^1$, then we can define the Borel function $u = u_1 \circ u_2$ by setting $u(t) = u_1(u_2(t))$. The following relation is then true:

$$(u_1 \circ u_2)(x) = u_1(u_2(x)).$$

3. If x defines the measure $\mu_x(A)$ in the state p, then u(x) defines the measure

$$\mu_{u(x)}(A) = \mu_x(u^{-1}(A)).$$

4. Let x and y be two compatible observables; then there exist an observable z and Borel functions u and v such that

$$x = u(z), \quad y = v(z).$$


We can define the observable (x + y) by setting

$$w = u + v \quad \text{and} \quad x + y = w(z).$$

Similarly we define the observable $x \cdot y$ by setting

$$w = u \cdot v \quad \text{and} \quad x \cdot y = w(z).$$

5. Generalization of Problem 4: Let $x_1, \ldots, x_n$ be a finite set of compatible observables and let $u_i$ be Borel functions and x an observable such that $x_i = u_i(x)$. Let $\varphi(t_1, \ldots, t_n)$ be a Borel function on $R^n$. We define the Borel function $u(t) = \varphi(u_1(t), \ldots, u_n(t))$ and the $\sigma$-homomorphism

$$A \to x(u^{-1}(A)).$$

This defines an observable $\varphi(x_1, \ldots, x_n)$.

This definition of the function of several compatible observables has the usual properties of a functional calculus (cf. Varadarajan [2], esp. Theorem 3.4).

6-8. THE SUPERPOSITION PRINCIPLE

In classical physics the principle of superposition is well known. It is a direct consequence of the linearity of certain equations of motion describing physical properties. Thus if $\varphi_1(x)$ and $\varphi_2(x)$ are two numerical functions describing two different physical processes of a linear system, then $\varphi(x) = \varphi_1(x) + \varphi_2(x)$ is again a solution describing another possible physical process.

The superposition principle of quantum mechanics has certain similarities with this classical form of the principle, at least when expressed in conventional quantum mechanics. Yet its physical content is radically different, and, in fact, contains the essential properties of quantum systems, to the extent that Dirac [7], in his well-known book on quantum mechanics, could introduce it as the fundamental principle of quantum mechanics from which much of the theory follows.

In order to illustrate the quantum mechanical superposition principle and its relation to the classical principle, let us examine a couple of examples.

One of the simplest examples is the polarization interference of light. For a light wave, the coherent superposition of two linearly polarized light waves will result in an elliptically polarized wave. The parameters of the elliptical polarization will depend not only on the relative intensity of the linearly polarized components, but also on the relative phase of these components (Fig. 6-2).

Fig. 6-2 Elliptically polarized light as a superposition of two linearly polarized components.

If the intensity of the interfering light is reduced, one may observe individual photons. Under constant intensity and phase conditions, each of the individual photons is in a definite state of (generally elliptical) polarization. Yet each of the photons partakes in some sense, to be made precise, of the properties of linear polarizations from which it was obtained. This can be seen by letting the elliptically polarized photon pass through an analyzer which determines linear polarization in one of two orthogonal directions. What one observes under these conditions is individual photons linearly polarized but distributed over the two alternatives with a certain probability. Thus each individual photon also potentially has the property of linear polarization in some direction. We shall therefore find it possible to consider elliptically polarized photons as a superposition of linearly polarized ones.

A similar situation is observed in an ordinary interference experiment such as the one sketched in Fig. 6-3.

Fig. 6-3 Interference experiments.

Here a light source L sends light onto a partially transparent mirror which reflects half the intensity and transmits the rest. The original beam splits into two which are reunited at the detector D after having traversed different paths, 1 and 2.

Again this experiment can be carried out with individual photons, and then we find that the actual state of a photon in such an apparatus consists of a superposition of the states represented by the two paths, 1 and 2.

The quantum-mechanical description of the superposition principle must thus contain the following ingredients: For any pair of propositions a, b there must exist a number of propositions different from a as well as b, which imply the proposition "a and b." It is easiest to express this for the special case of atomic or minimal propositions.

We have defined minimal or atomic propositions in Section 5-4 as those propositions $e \ne \emptyset$ with the property:

$$x \subseteq e \text{ and } x \ne e \quad \text{implies} \quad x = \emptyset.$$

Principle of Superposition. For any pair of atomic propositions $e_1$ and $e_2$ ($e_1 \ne e_2$) there exists a third, $e_3$, such that $e_3 \ne e_1$, $e_3 \ne e_2$, and such that

$$e_1 \cup e_2 = e_1 \cup e_3 = e_2 \cup e_3. \tag{6-7}$$


The third atomic proposition $e_3$ which is here assumed to exist plays the role of one of the superpositions of $e_1$ and $e_2$. We shall, in fact, see later that there exists not only one but a one-parameter family of propositions such as $e_3$ which satisfy the relation (6-7). This fact gives rise to the phenomena of interference which are so characteristic in wave mechanics and which have led to the notion of the wave nature of matter.

Let us now examine some of the mathematical consequences of the principle of superposition. A first remark: A lattice which satisfies the principle of superposition cannot be Boolean. Indeed, let $e_1$, $e_2$ and $e_3$ be as in Eq. (6-7). If the lattice were Boolean, we would have, for instance,

$$e_3 \cap (e_1 \cup e_2) = (e_3 \cap e_1) \cup (e_3 \cap e_2).$$

But this relation cannot be right, as one sees from the following: Since two different minimal propositions always have the absurd proposition as their intersection, $e_3 \cap e_1 = e_3 \cap e_2 = \emptyset$. Hence the right-hand side is equal to $\emptyset \cup \emptyset = \emptyset$. But the left-hand side is equal to $e_3$ since $e_1 \cup e_2$ contains $e_3$. Thus the equation is incompatible with $e_3 \ne \emptyset$.
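The failure of distributivity can be seen concretely in the smallest lattice that satisfies the superposition principle. The sketch below assumes a specific model that the text has not yet introduced: the lattice of subspaces of a real plane, whose atoms are the lines through the origin. The names `line`, `meet`, and `join` are hypothetical.

```python
ZERO, PLANE = "0", "I"      # absurd and trivial propositions


def line(angle_deg):
    """Atomic proposition: the line through the origin at the given angle."""
    return ("line", angle_deg % 180)


def meet(a, b):
    """Greatest lower bound (intersection of subspaces)."""
    if a == b:
        return a
    if a == PLANE:
        return b
    if b == PLANE:
        return a
    return ZERO             # two distinct lines, or anything with ZERO


def join(a, b):
    """Least upper bound (span of the two subspaces)."""
    if a == b:
        return a
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return PLANE            # two distinct lines, or anything with PLANE


e1, e2, e3 = line(0), line(90), line(45)
print(join(e1, e2) == join(e1, e3) == join(e2, e3))   # relation (6-7) holds
lhs = meet(e3, join(e1, e2))
rhs = join(meet(e3, e1), meet(e3, e2))
print(lhs, rhs)                                       # e3 versus the absurd proposition
```

Any third line would serve equally well as e3 here, which reflects the one-parameter family of superpositions mentioned above.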

We can now understand why the validity of the superposition principle isso characteristic for quantum systems: It implies the non-Boolean character ofthe proposition system.

A further consequence is the following: The proposition e3 which satisfies(6—7) can never be compatible with both e1 and e2. Indeed, it is quite easy toshow that if e3 ÷-+ and e3 ÷-+ e2, then e3 satisfies a relation of distributivity(Problem 1):

e3 n (e1 u e2) = (e3 n e1) u (e3 n

which we have just shown to be incompatible with Eq. (6—7).This property is quite significant in view of the question as to what extent

a system which is known to have property e3 may be considered to havesimultaneously property e1 or e2 or both. We see that the answer to suchquestions would involve the simultaneous determination of incompatiblepropositions, a feat which we shall show generally to be impossible.

The foregoing does not exclude the possibility that the two propositionsand e2 may be compatible. This is, for instance, the case for the two exam-

ples which we have sketched in Figs. 6—2 and 6—3.The questions which we touch here have been extensively discussed by

Bohr in connection with a large variety of specific examples [8]. In all theseexamples, Bohr could show that the simultaneous determination of properties('3 with either one of the two properties e1 or e2 would involve incompatiblephysical arrangements.

Here we have shown that this fundamental property expresses itself, in the mathematical structure of the proposition system, by the fact that the sublattice generated by $e_1$, $e_2$, and $e_3$ which satisfy (6-7) cannot be Boolean.


Finally, we want to discuss the relation of the superposition principle to the reducibility of proposition systems. In Section 5-5 we have introduced the notion of center, reducibility and irreducibility, and direct union of sublattices. Here we shall connect these concepts with the superposition principle.

Suppose that the lattice $\mathcal{L}$ is a direct union of two lattices $\mathcal{L}_1$ and $\mathcal{L}_2$, consisting of elements $\{x_1, x_2\}$ with $x_1 \in \mathcal{L}_1$ and $x_2 \in \mathcal{L}_2$. Let us now consider a point (minimal proposition) $e_1 \in \mathcal{L}_1$ and denote by $\bar{e}_1$ the element of the form $\{e_1, 0_2\}$. Similarly, consider the element $\bar{e}_2$ of the form $\{0_1, e_2\}$, where $e_2$ is a point in $\mathcal{L}_2$. If the superposition principle holds, then there must exist an element $\bar{e}_3 \neq 0$ such that

$$\bar{e}_1 \cup \bar{e}_2 = \bar{e}_1 \cup \bar{e}_3 = \bar{e}_2 \cup \bar{e}_3.$$

This is easily seen to be impossible. For such an element would be contained in $\bar{e}_1 \cup \bar{e}_2$. Therefore if we write $\bar{e}_3 = \{x_1, x_2\}$, its two components $x_1$ and $x_2$ would have to satisfy $x_1 \subseteq e_1$ and $x_2 \subseteq e_2$. Since $e_1$ and $e_2$ are points, this implies $x_1 = 0_1$ and $x_2 = 0_2$. But this is incompatible with $\bar{e}_3 \neq 0$. We conclude with the following:

Theorem: A proposition system which satisfies the principle of superposition for all pairs of points is irreducible.

Since a proposition system with nontrivial center is reducible (Problem 2) we have immediately the following:

Corollary: A proposition system which satisfies the principle of superposition for all pairs of points has trivial center.

Since the center of a Boolean lattice is identical with the lattice itself (cf. Problem 5, Section 5-5), we see in this corollary again how the superposition principle implies the essential features of quantum mechanics.

The superposition principle is thus not a new axiom of quantum mechanics; it is merely a consequence of the non-Boolean structure of the proposition system. The detailed discussion of it in this section is given primarily in order to bring to light the connection with historical discussions of quantum mechanics.

PROBLEMS

1. If $x \leftrightarrow a$ and $x \leftrightarrow b$, then

$$x \cup (a \cap b) = (x \cup a) \cap (x \cup b)$$

and

$$x \cap (a \cup b) = (x \cap a) \cup (x \cap b).$$

2. A proposition system with nontrivial center is reducible.


6-9. SUPERSELECTION RULES

Whether the superposition principle is valid without restriction is a question of experience, and experience has shown that it is not universally valid [9]. When it is not valid we speak of superselection rules.

An example of a superselection rule is obtained from the neutron-proton system. These two elementary particles, which form the building blocks of all nuclei, are usually considered as two different states of one and the same nucleon. This description, coupled with the isotopic spin formalism, results in certain simplifications in calculation. However, one must not overlook the fact that this description involves a superselection rule. It has never been possible to produce a state which could be considered a superposition of a neutron and a proton state. It is therefore a reasonable hypothesis that such a superposition does not exist and that the neutron-proton system gives rise to a reducible lattice of propositions.

Thus we must expect that the proposition system is reducible and that the superposition principle does not possess unrestricted validity. It is convenient to introduce the word coherent for all those propositions which form an irreducible sublattice of propositions.

A state $p$ for a proposition system which is $\neq 0$ in at least two different coherent subsystems is always a mixture. This can be seen as follows: Let $p$ be such a state, and let

$$\lambda_1 = p(\{I_1, 0_2\}), \qquad \lambda_2 = p(\{0_1, I_2\}).$$

Since $p \neq 0$ in both coherent subsystems, we have $0 < \lambda_1$ and $0 < \lambda_2$. Furthermore, Property (3) for states (cf. Section 6-3) implies $\lambda_1 + \lambda_2 = 1$.

If $x = \{x_1, x_2\}$ is any proposition, then the functions $p_1(x)$ and $p_2(x)$, defined by

$$p_1(x) = \frac{1}{\lambda_1}\, p(\{x_1, 0_2\}) \quad \text{and} \quad p_2(x) = \frac{1}{\lambda_2}\, p(\{0_1, x_2\}),$$

are easily seen to be different states. Furthermore, $p(x) = \lambda_1 p_1(x) + \lambda_2 p_2(x)$. This shows that $p(x)$ is indeed a mixture. It is possible to generalize this theorem to states which have components in a discrete or continuous family of superselection rules.
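The decomposition can be followed numerically on a toy model. The sketch below rests on assumptions that are not in the text: each coherent subsystem is replaced by a finite set of outcomes with hypothetical weights (`w1`, `w2`), so that a proposition $\{x_1, x_2\}$ is a pair of subsets and the state simply adds the weights. It illustrates only the bookkeeping of $\lambda_1$, $\lambda_2$, $p_1$, and $p_2$, not the quantum structure inside each sector.

```python
from fractions import Fraction as F

# Hypothetical weights of the state p on the outcomes of the two sectors.
w1 = {"n_up": F(1, 6), "n_down": F(1, 6)}   # sector 1 (say, neutron states)
w2 = {"p_up": F(1, 2), "p_down": F(1, 6)}   # sector 2 (say, proton states)

def p(x1, x2):
    """State evaluated on the proposition {x1, x2}."""
    return sum(w1[k] for k in x1) + sum(w2[k] for k in x2)

I1, I2 = set(w1), set(w2)   # trivial propositions of each sector
EMPTY = set()               # absurd proposition of either sector

lam1 = p(I1, EMPTY)         # lambda_1 = p({I_1, 0_2})
lam2 = p(EMPTY, I2)         # lambda_2 = p({0_1, I_2})

def p1(x1, x2):
    """Restriction of p to sector 1, renormalized."""
    return p(x1, EMPTY) / lam1

def p2(x1, x2):
    """Restriction of p to sector 2, renormalized."""
    return p(EMPTY, x2) / lam2

x = ({"n_up"}, {"p_up", "p_down"})
mixture = lam1 * p1(*x) + lam2 * p2(*x)
print(lam1, lam2, p(*x), mixture)
```

With these weights $\lambda_1 + \lambda_2 = 1$, and the value $p(x)$ coincides with $\lambda_1 p_1(x) + \lambda_2 p_2(x)$ for the chosen proposition.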

The occurrence of superselection rules in nature shows that the actual physical proposition systems take an intermediate position between the Boolean systems and the coherent systems. The Boolean systems characterize those systems with only classical mechanical properties. The coherent systems are those for which the quantum properties are universal and essential. This positioning of the actual physical proposition systems somewhere between the two extremes shows, in a most striking way, that the theory of quantum systems is a generalization of the classical theories, rather than (as it is often argued) in contradiction with them.


The fundamental cleavage between the two descriptions, which has caused so much concern, is thus resolved in the synthesis of an all-embracing theoretical frame which is flexible enough to accommodate all the known physical theories.

It will be our task in the following chapters to introduce further properties into this frame, which characterize specific physical systems.

REFERENCES

1. A. N. KOLMOGOROV, Foundations of the Theory of Probability. New York: Chelsea Publ. Co. (1950).

2. V. S. VARADARAJAN, Comm. of Pure and Appl. Math., XV, No. 2, 189 (1962).

3. G. BIRKHOFF, Lattice Theory, Am. Math. Soc. Coll. Publ. (1948).

4. G. MACKEY, The Mathematical Foundation of Quantum Mechanics. New York: W. A. Benjamin Inc. (1963).

5. I. E. SEGAL, Ann. of Math. 48, 930 (1947).

6. A. M. GLEASON, J. Rat. Mech. Analysis 6, 885 (1957).

7. P. A. M. DIRAC, Principles of Quantum Mechanics. Oxford (1958).

8. N. BOHR, Library of Living Philosophers, Vol. VII, 199 (1949).

9. G. C. WICK, A. S. WIGHTMAN, AND E. WIGNER, Phys. Rev. 88, 101 (1952).

10. J. NEVEU, Bases mathématiques du calcul des probabilités. Paris: Masson (1964); especially Lemma 1-3-2, p. 11.


CHAPTER 7

HIDDEN VARIABLES

I reject the basic idea of contemporary statistical quantum theory, insofar as I do not believe that this fundamental concept will prove a useful basis for the whole of physics. . . . I am, in fact, firmly convinced that the essentially statistical character of contemporary quantum theory is solely to be ascribed to the fact that this theory operates with an incomplete description of physical systems.

A. EINSTEIN

The new epistemological situation underlying quantum mechanics is satisfactory, both from the standpoint of physics and from the broader standpoint of the conditions of human knowledge in general.

W. PAULI

In this short chapter we formulate the important question concerning the possible existence of hidden variables in quantum theory. A partial answer to this question can already be given at this stage of the theory, and it is desirable to avoid confusing this problem with the specific form of conventional quantum mechanics. In Section 7-1 we show with a thought experiment (due to Einstein) that it is in general not possible to assign definite values to all the observable quantities of a physical system. This leaves open the question whether there might not exist in principle unobservable quantities (hidden variables) which would account for the probabilistic feature of the observed states, due to our ignorance of these variables. After a short preparatory section (7-2) we give a precise definition of hidden variables in Section 7-3 and prove the theorem which shows that the existence of hidden variables of this kind is in contradiction with empirical facts. Two other proposals, which are not excluded by these considerations, are briefly discussed in the last section (7-4).


7-1. A THOUGHT EXPERIMENT

In Section 6-1 we analyzed the notion of the state of a physical system and characterized it as a result of the preparation. In Section 6-2 we pointed out that a state can be determined by measuring the values of a sufficient number of propositions. For each individual measurement the outcome is one of the two alternatives "yes" or "no"; but we have emphasized that a repetition of the experiment, prepared under identical relevant conditions, will not always produce identical results for the value of a given proposition.

Let us illustrate this situation by a thought experiment discussed by Einstein at the fifth Solvay Congress in 1927.

Fig. 7-1 Einstein's thought experiment.

A stream of particles, homogeneous in momentum, falls on a screen $S$ with a small hole $O$, as shown in Fig. 7-1. Opposite the hole is a photographic plate $P$, which can detect the arrival of the particles. The intensity of the incident beam is assumed to be sufficiently low that individual events can be detected. If the hole is sufficiently small, particles will arrive at $P$ separated by distances much larger than the diameter of the hole. The questions now arise: Does an individual particle actually have a definite momentum which, because it varies from case to case in a random manner, is not known; or does it have not a definite momentum but only a probability distribution which is revealed by the statistics of the measurements?

We can formulate this question in the language of propositional calculus. Let $p_0$ represent the state obtained by the preparation of the system after sending the particle through the hole in the screen. Let $p_1$ and $p_2$ be the states of the system after the particle has hit the screen at $a_1$ or $a_2$, respectively. These states are manifestly different from $p_0$ and from each other, since a measurement of the proposition "particle is at $a_1$" will be true with certainty in the state $p_1$ and false with certainty in the state $p_2$, while it has an intermediate value in state $p_0$. Thus we have the following relations:

$$p_1(a_1) = p_2(a_2) = 1, \qquad p_1(a_2) = p_2(a_1) = 0;$$

$$0 < p_0(a_1) < 1 \quad \text{and} \quad 0 < p_0(a_2) < 1.$$


Fig. 7-2 Interference effects observed on screen T.

Since $p_1$ represents the state with the momentum of the particle in the direction $Oa_1$, the adequate description for the state for which the particle has definite but unknown momenta in the directions $Oa_1$ and $Oa_2$ would be the statistical mixture $\lambda_1 p_1 + \lambda_2 p_2$ with $\lambda_1 = p_0(a_1)$, $\lambda_2 = p_0(a_2)$. Generally, a screen $P$ with a discrete (finite or infinite) set of points $a_i$ would require a state

$$p = \sum_i \lambda_i p_i \quad \text{with} \quad \lambda_i = p_0(a_i). \tag{7-1}$$

However, whether the state is of this form or not can be decided by experiment. It suffices to replace two of the detectors $a_1$ and $a_2$ by holes, and observe the effect on a third screen $T$. The result, we know, is an interference pattern as indicated schematically in Fig. 7-2. However, the state expressed by Eq. (7-1) cannot be distinguished from a state where the two holes $a_1$ and $a_2$ are replaced by independent sources of particles. Such an arrangement would produce only two maxima in front of the two holes (as shown in Fig. 7-3), and this is quite different from the interference pattern of Fig. 7-2.

Fig. 7-3 Distribution of particles from two independent sources.

Thus, whether in this experiment the momenta have definite but unknown values can in principle be decided by an experiment and the answer is negative. We shall show later that this interdependence of mutually incompatible quantities is a quite general feature of quantum mechanics expressible quantitatively in the form of uncertainty relations.


The foregoing analysis merely shows that the actual state of the particle, after it has passed the hole $O$, is such that it does not allow the interpretation that the particle in this state has a definite (albeit unknown) momentum, because such a state would behave differently in a subsequent experiment from the actual observed behavior. But this analysis does not exclude the possibility that the state in question might be the statistical mixture of unknown states, which cannot be produced with known physical equipment but which assign definite values to all measurable propositions. If this is the case for every state, we say that the system admits hidden variables. We shall give a formal definition of such systems in Section 7-3.

7-2. DISPERSION-FREE STATES

When actually observed, most real states show dispersion. This is true in quantum as well as in classical physics. This means that, when we observe a proposition $a$ in a state $p$, we usually obtain a result $p(a)$ which lies between 0 and 1. We can give a sort of quantitative measure for the dispersion by defining the dispersion function $\sigma(a) = p(a) - p^2(a)$.

As shown in Fig. 7-4, the dispersion function has the value of zero for $p = 0$ and $p = 1$, and assumes its maximum value for $p = 1/2$. It is thus an adequate measure of the degree of uncertainty associated with any particular value of $p$.

Fig. 7-4 The dispersion function.

We shall also define an overall dispersion $\sigma$ for each state, defined by

$$\sigma = \sup_a \sigma(a). \tag{7-2}$$

A state is called dispersion-free if $\sigma = 0$.
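The behavior of the dispersion function is easy to tabulate. The following sketch evaluates $\sigma = p - p^2$ on a grid of values of $p$ and computes an overall dispersion for a state given on finitely many propositions; the dictionary `example_state` and its proposition names are hypothetical illustration data, not taken from the text.

```python
from fractions import Fraction

def dispersion(p):
    """Dispersion function sigma = p - p^2 for a probability value p."""
    return p - p * p

# Tabulate sigma on a grid of exact rational values of p in [0, 1].
grid = [Fraction(k, 100) for k in range(101)]
p_at_max = max(grid, key=dispersion)

def overall_dispersion(state):
    """Overall dispersion: the largest sigma(a) over the listed propositions."""
    return max(dispersion(value) for value in state.values())

# Hypothetical values p(a) of a state on three propositions.
example_state = {"a": Fraction(1), "b": Fraction(0), "c": Fraction(1, 3)}

print(dispersion(grid[0]), dispersion(grid[-1]))
print(p_at_max, dispersion(p_at_max))
print(overall_dispersion(example_state))
```

On the grid the function vanishes at the two end points and is largest at $p = 1/2$, where it takes the value $1/4$; the example state is not dispersion-free because of the proposition with $p = 1/3$.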

Physically, the existence of dispersion-free states is a question to be decided by experiment. In classical mechanics it is implicitly assumed that such states exist, at least in some idealized sense. Even if actual physical measurements on classical systems show dispersion for individual propositions, in classical physics nothing prevents us from assuming that such dispersion can be reduced further and further by a suitable refinement of the preparing experiment. In quantum physics this is not so, as we have seen in the discussion of Einstein's thought experiment of the preceding section. However, we can imagine that dispersion-free states, although not realizable physically, might still be useful as a mental construct for the formulation of a quantum mechanics based on a deeper level of causal processes. Such hypothetical states would then have the character of freely constructed mental images with no direct correlation with reality. Contact with reality would appear only as derived consequences which might be obtained in a detailed dynamical theory involving such states. This would be a theory with hidden variables. In order to examine this question it is desirable to study some of the properties of dispersion-free states.

There is one property which can be immediately verified for dispersion-free states, contained in the following

Proposition 1: Every dispersion-free state is pure.

Proof: Suppose the state $p$ is a mixture. Then there exist two different states $p_1$ and $p_2$, as well as two positive numbers $\lambda_1$ and $\lambda_2$, such that $\lambda_1 + \lambda_2 = 1$ and $p = \lambda_1 p_1 + \lambda_2 p_2$. Since the two states $p_1$ and $p_2$ are different from one another, there exists a proposition $a$ such that $p_1(a) \neq p_2(a)$ (cf. Property 5b of Section 6-3). Since the states are dispersion-free, there are two possibilities only: $p_1(a) = 1$, $p_2(a) = 0$ or $p_1(a) = 0$, $p_2(a) = 1$. In either case we have $\sigma(a) = \lambda_1 \lambda_2 \neq 0$. Thus the state is not dispersion-free, contrary to the assumption. This proves the proposition.
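The value $\sigma(a) = \lambda_1 \lambda_2$ quoted in the proof follows from $\sigma(a) = p(a) - p^2(a)$ with $p(a) = \lambda_1$ or $p(a) = \lambda_2$. A short numerical check, with an arbitrarily chosen weight `lam1` (an illustrative assumption), covers both cases:

```python
from fractions import Fraction

def dispersion(p):
    """Dispersion function sigma = p - p^2."""
    return p - p * p

lam1 = Fraction(1, 3)      # arbitrary weight with 0 < lam1 < 1
lam2 = 1 - lam1

# The two possibilities for the values (p1(a), p2(a)) considered in the proof.
cases = [(1, 0), (0, 1)]
sigmas = [dispersion(lam1 * v1 + lam2 * v2) for v1, v2 in cases]

print(sigmas, lam1 * lam2)
```

Both entries of `sigmas` equal $\lambda_1 \lambda_2$, which is strictly positive whenever both weights are.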

The converse of this proposition is false. We shall later see many examples of pure states which are not dispersion-free. For coherent proposition systems (cf. Section 6-9 for this notion), we can even prove that:

Proposition 2: On a nontrivial coherent proposition system there exists no dispersion-free state.

Proof: Let us recall that a coherent proposition system $\mathcal{L}$ is defined as a system with trivial center. The center is the set of all propositions which are compatible with every other proposition in $\mathcal{L}$. If the center is trivial, this means that it consists of only the two elements $0$ and $I$.

We base the proof on the following:

Lemma: If there exists a dispersion-free state $p$ in a proposition system $\mathcal{L}$, then there exists a point $e$ in the center of $\mathcal{L}$ such that $p(e) = 1$.

Proof of Lemma: Let $\mathcal{L}_1$ be the subset of all propositions $a_i$ ($i \in I$ = some index set) in $\mathcal{L}$ such that $p(a_i) = 1$, and let $a_0 = \bigcap_i a_i$. By Property (4) of Section 6-3, we also have that $p(a_0) = 1$, so that $a_0 \in \mathcal{L}_1$.

Now consider any $x \subseteq a_0$ but $x \neq a_0$. Since $p(x) = 1$ would imply $a_0 \subseteq x$ and hence $a_0 = x$, we must have $p(x) = 0$. Since $x$ is compatible with $x'$, it follows from the property of states that

$$p(x) + p(x') = p(x \cup x') = 1.$$

Since $p(x) = 0$, this means $p(x') = 1$, or $a_0 \subseteq x'$. It follows that

$$x = x \cap a_0 \subseteq x \cap x' = 0.$$

Therefore $x = 0$. This means that $a_0$ is a point, which we shall designate by $e$ from now on.


Next we prove that $e$ is compatible with every other proposition $x$ in $\mathcal{L}$. If $x \in \mathcal{L}_1$, then $e \subseteq x$ and hence, by axiom (P) of Section 5-8, $e \leftrightarrow x$. If, on the other hand, $x \notin \mathcal{L}_1$, then, as we have just shown, $x' \in \mathcal{L}_1$ and so $e \leftrightarrow x'$. It follows then from a known theorem (cf. Problem 2 of Section 5-8) that $e \leftrightarrow x$ also. Thus we have, in all cases, $e \leftrightarrow x$. This proves the lemma.

The proof of Proposition 2 is then completed by observing that a trivial center cannot contain any point unless $\mathcal{L}$ is trivial itself. This proves Proposition 2.

7-3. HIDDEN VARIABLES

We must now express in mathematical language what we mean by a system that admits hidden variables. To this end we must generalize the description used in Section 7-1. The essential property of such a system is that every state which can be physically realized is a mixture of some dispersion-free states. In the general case the mixture may involve not just two states but a whole family of them. It is therefore necessary to introduce a measure space $X$, called the space of hidden variables, and a finite measure $\mu$ defined on the Borel sets of $X$, and to consider mixtures of the form

$$p(a) = \int_X p_\xi(a)\, d\mu(\xi),$$

where $\xi \in X$ and $p_\xi$ is a state which is at the same time a measurable function of $\xi$ for all propositions $a$.

We shall now define a system with hidden variables in the following way:

Definition: A physical system is said to admit hidden variables if there exists a measure space $X$ together with a finite measure $\mu$ (normalized so that $\mu(X) = 1$) on $X$ such that every state $p$ of the system can be represented as a mixture

$$p(a) = \int_X p_\xi(a)\, d\mu(\xi) \tag{7-3}$$

of dispersion-free states $p_\xi$.

If the measure $\mu$ is concentrated on a finite or countably infinite sequence of points $\xi_i$, then we obtain a representation in the form of a sum

$$p(a) = \sum_i \lambda_i p_i(a),$$

where $\lambda_i = \mu(\{\xi_i\})$ and $p_i = p_{\xi_i}$.

The variable $\xi \in X$ represents a variable which for a given state need never be known, and the states $p_\xi$ are not necessarily producible by known physical equipment. For this reason we refer to these variables as hidden.


We shall now study the most important problem in connection with hidden variables: What are the physical properties of systems which admit hidden variables?

Two types of answers are possible. Either there are no observable physical properties which follow from the existence of hidden variables, or there are such observable physical properties. In the first case, the hidden variables have no physical significance, and their use for the description of physical systems is not a question of physics and would therefore not concern us here. In the second case, one may ask whether these physical consequences are in agreement with experience or not. If they are not in agreement, then the existence of hidden variables is empirically refuted.

There is one case where the answer can be given already without further work. It is the case of the coherent proposition system. In this case we have already shown that there do not even exist dispersion-free states (Proposition 2 of preceding section). For such simple systems, the question is thus already answered. However, the answer is more difficult to obtain if we have a system with superselection rules. It is contained in the following:

Theorem: If a proposition system $\mathcal{L}$ admits hidden variables, then any pair of propositions $a$, $b$ ($a \in \mathcal{L}$, $b \in \mathcal{L}$) is compatible: $a \leftrightarrow b$.

Proof: The proof is based on the following:

Lemma: If a proposition system $\mathcal{L}$ admits hidden variables, then for any pair of propositions $a, b \in \mathcal{L}$ and any state $p$, one has

$$p(a) + p(b) = p(a \cap b) + p(a \cup b). \tag{7-4}$$

Proof of Lemma: Since $\mathcal{L}$ admits hidden variables, every state is of the form

$$p(a) = \int_X p_\xi(a)\, d\mu(\xi), \tag{7-5}$$

where all states $p_\xi$ are dispersion-free. It suffices thus to establish the relation in question for dispersion-free states. Thus let $p$ be dispersion-free, so that $p(a)$ is either 0 or 1. There are then four cases possible:

1. $p(a) = 0$, $p(b) = 0$,
2. $p(a) = 1$, $p(b) = 0$,
3. $p(a) = 0$, $p(b) = 1$,
4. $p(a) = 1$, $p(b) = 1$.

Let us examine each of these cases separately. Evidently (3) is reducible to (2) by interchanging the roles of $a$ and $b$.

If $p(a) = p(b) = 0$, then $p(a \cap b) \leq p(a)$, and therefore $p(a \cap b) = 0$ also. On the other hand $p(a') = 1 = p(b')$. Thus by Property 4 of Section 6-3, $p(a' \cap b') = 1$. Since $(a \cup b)' = a' \cap b'$, we have

$$p(a \cup b) = 1 - p((a \cup b)') = 1 - p(a' \cap b') = 0.$$

Thus $p(a \cup b) = 0$ also, and the relation expressed in Eq. (7-4) is established.

Now let $p(a) = 1$ and $p(b) = 0$. It follows then that $p(a \cup b) \geq p(a) = 1$. Therefore $p(a \cup b) = 1$. Furthermore $p(a \cap b) \leq p(b) = 0$. Therefore $p(a \cap b) = 0$. Thus, again relation (7-4) is verified.

Finally, assume $p(a) = p(b) = 1$. We then have

$$p(a') = 1 - p(a) = 0, \qquad p(b') = 1 - p(b) = 0.$$

Furthermore,

$$p(a \cup b) = 1 - p(a' \cap b'), \qquad p(a \cap b) = 1 - p(a' \cup b').$$

Therefore

$$p(a) + p(b) - p(a \cap b) - p(a \cup b) = -\left[\, p(a') + p(b') - p(a' \cap b') - p(a' \cup b') \,\right].$$

The right-hand side is zero since it is an instance of Case 1, applied to $a'$ and $b'$. Thus the relation (7-4) is verified in this case also, and the lemma is proved.
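Relation (7-4) can be checked exhaustively in the simplest setting where dispersion-free states certainly exist. The sketch below assumes a Boolean proposition system, modelled as all subsets of a three-element set `OMEGA` (an illustrative choice), whose dispersion-free states are taken to be the point states that assign 1 to every subset containing a fixed element. It is a consistency check on a classical model, not a replacement for the case analysis above.

```python
from itertools import combinations

OMEGA = (0, 1, 2)

# All propositions of the Boolean model: every subset of OMEGA.
SUBSETS = [frozenset(c)
           for r in range(len(OMEGA) + 1)
           for c in combinations(OMEGA, r)]

def point_state(omega):
    """Two-valued state: 1 if the proposition contains omega, else 0."""
    return lambda a: 1 if omega in a else 0

def satisfies_7_4(p):
    """Check p(a) + p(b) == p(a n b) + p(a u b) for every pair a, b."""
    return all(p(a) + p(b) == p(a & b) + p(a | b)
               for a in SUBSETS for b in SUBSETS)

results = [satisfies_7_4(point_state(w)) for w in OMEGA]
print(results)
```

Every point state passes the check, and since relation (7-4) is linear in $p$, any mixture of such states passes it as well.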

We now proceed to the proof of the theorem. For every state we have, by repeated use of the lemma,

$$\begin{aligned}
p((a \cap b') \cup b) &= p(a \cap b') + p(b) \\
&= p(a) + p(b') - p(a \cup b') + p(b) \\
&= p(a) + 1 - p(a \cup b') = p(a) + p(a' \cap b) \\
&= p(a \cup (a' \cap b)).
\end{aligned}$$

Since this is true for every state we have, by Property 5b of Section 6-3,

$$(a \cap b') \cup b = (a' \cap b) \cup a.$$

This means (cf. Problem 2 of Section 5-8) $a \leftrightarrow b$, and this proves the theorem.

We may remark here that we have not used the axioms in full generality. For instance, we have used Property 4 only for a finite number of propositions, and we have not used the axiom of atomicity. This is very satisfactory since both of these axioms are only weakly justified empirically and were posited mainly for convenience.

The conclusion of the theorem is seen to be very strong, since it affirms compatibility for all pairs of propositions. Thus it suffices to exhibit a single pair of noncompatible propositions to establish that hidden variables are empirically refuted. Now we have seen that the occurrence of noncompatible propositions is the essence of quantum mechanics, since the lattice is Boolean and the system behaves classically if every pair of propositions is compatible. Because of this result we may simply affirm: A quantum system cannot admit hidden variables in the sense in which we have defined them. With this result the quest for hidden variables of this particular kind has found its definitive answer in the negative.
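A single pair of noncompatible propositions can be exhibited numerically. The sketch below anticipates the conventional Hilbert-space realization, which is an assumption at this point of the text: propositions are lines through the origin of a real plane, and a pure state along the unit vector at angle `psi` assigns to the line at angle `theta` the probability $\cos^2(\psi - \theta)$. For two different lines $a$ and $b$ the intersection is the zero subspace (probability 0) and the union is the whole plane (probability 1).

```python
import math

def line_probability(psi, theta):
    """Probability that a pure state at angle psi lies on the line at angle theta."""
    return math.cos(psi - theta) ** 2

psi = 0.0              # state vector along the first axis
a = 0.0                # line a coincides with the state direction
b = math.pi / 4        # line b at 45 degrees to a

# Left and right sides of relation (7-4) for this pair of lines.
lhs = line_probability(psi, a) + line_probability(psi, b)
rhs = 0.0 + 1.0        # p(a n b) + p(a u b): zero subspace and whole plane

print(lhs, rhs)
```

Here the left-hand side is close to 1.5 while the right-hand side is 1, so relation (7-4) fails for this state; by the lemma, such a system cannot admit hidden variables in the sense defined above.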

[Figure 7-5: diagram with regions labeled "Dispersion-free states", "Mixtures of dispersion-free states", "Physical states which are not mixtures of dispersion-free states", and "Pure states".]

Fig. 7-5 The circular disk represents all the states. The physically realizable states are the circumference of the circle, and the dispersion-free states are the indicated subset of the circumference.

We can represent the result of this theorem with Fig. 7-5, which shows symbolically the different states and their relation to each other.

7-4. ALTERNATIVE WAYS OF INTRODUCING HIDDEN VARIABLES

It is clear from the foregoing considerations that any attempt at introducing hidden variables without leading to empirically refutable consequences would have to be done by giving up one or several of the axioms on which the theorems of Section 7-3 were based. Which of them should be given up is largely a matter of personal taste, and each of the chosen ways of doing so amounts to an alternative way of defining "hidden variables."

We shall briefly discuss two such possibilities which can serve for such a modified definition of hidden variables.

The first is due to G. Mackey, and it is based on the notion of "$\varepsilon$-dispersion-free states." A physical system is said to admit $\varepsilon$-dispersion-free states if for every $\varepsilon > 0$ there exists a state $p(a)$ such that the overall dispersion for this state defined by Eq. (7-2) is smaller than this $\varepsilon$.

A system can then be said to admit "quasi-hidden variables" if every state $p(a)$ admits a representation (7-5) where all the states $p_\xi$ are $\varepsilon$-dispersion-free for any $\varepsilon > 0$.

So long as $\varepsilon \neq 0$, the proof which we have given in the previous section does not go through. On the other hand no example of a non-Boolean proposition system is known which admits such quasi-hidden variables.

Another way of avoiding the conclusions of the previous section is to modify the properties of the hypothetical dispersion-free states which are admitted in the representation (7-5).


An explicit example has been constructed by J. S. Bell [7]. The axiom which is violated in this example is (4) of Section 6-3. It is possible to construct other examples which violate other properties of states.

Of course none of these logical possibilities can be properly called theories with hidden variables unless they lead to empirically testable consequences which can be verified or refuted. As long as this is not done, the question concerning such generalized hidden variables remains rather academic. The merit of the considerations in Section 7-3 lies in the fact that they show that at least one class of hidden variables, the obvious and most natural one, can thus be empirically refuted.

REFERENCES

The literature on hidden variables is extensive and not always easy reading. We mention only a selection.

1. L. DE BROGLIE, La théorie de la mesure en mécanique ondulatoire. Paris: Gauthier-Villars (1957).

2. D. BOHM, Causality and Chance in Modern Physics. London: Routledge and Kegan Paul (1958).

3. W. PAULI, Louis de Broglie, Physicien et Penseur. Paris: Albin Michel (1953).

The first "proof" of the impossibility of hidden variables in quantum mechanics is due to von Neumann:

4. J. VON NEUMANN, Mathematische Grundlagen der Quantenmechanik. Berlin: Springer (1932).

See also:

5. J. ALBERTSON, Am. J. Phys. 29, 478 (1961).

The theorem of this chapter, which is a strengthening of von Neumann's result, was first stated and proved in:

6. J. M. JAUCH AND C. PIRON, Helv. Phys. Acta 36, 827 (1963).

7. J. S. BELL, Physics 1, 195 (1965).

The most recent discussion is found in:

8. B. MISRA, Nuovo Cim. 47, 841 (1967).


CHAPTER 8

PROPOSITION SYSTEMS AND PROJECTIVE GEOMETRIES

Ubi materia, ibi geometria.

J. KEPLER

In this chapter we begin the building of the bridge which connects the general quantum theory, as an abstract proposition system, with conventional quantum theory in a complex Hilbert space. This bridge is not yet complete. There are no convincing empirical grounds why our Hilbert space should be constructed over the field of the complex numbers. But it is possible to understand why we need linear vector spaces. The path leads via the mathematical theory of projective geometries which are defined in Section 8-1. We prepare the ground for a general representation theory of propositions in Section 8-2 by sketching a general reduction theory which leads to a simplification of the problem. The structure of the irreducible components is the subject of Section 8-3, where we demonstrate the general representation of these components as subspaces in a linear vector space over a field $F$.

The property of orthocomplementation leads to a restriction of the field. In Section 8-4 this restriction is stated and it is furthermore shown that orthocomplementation defines a definite Hermitian form in the vector space $V$. The vector space $V$ thus becomes a metric space with positive definite metric. The metric, and especially its definite character, is thus directly traced to the axiom of orthocomplementation. The final section, 8-5, describes the representation of the proposition system in Hilbert space.

8-1. PROJECTIVE GEOMETRIES

In Sections 2-5 and 5-7 we showed that the subspaces of a Hilbert space are a lattice if the intersection of two subspaces is the set-intersection and the union of two subspaces is defined as the closed linear subspace spanned by them. It was pointed out that this lattice satisfies all the axioms of a proposition system; in particular it is orthocomplemented, atomic, and complete. This shows, first of all, that the axioms of a proposition system are self-consistent and, since conventional quantum mechanics can be formulated as a theory using Hilbert space, that the axioms are realized in conventional quantum mechanics.

In this chapter we examine the converse question: Given the axioms of a proposition system, to what extent do they determine a realization in a vector space? Are there several independent and inequivalent realizations possible, and if so what are the physical reasons for choosing one over the other?

Let us admit to begin with that the full answer to this important question is not yet known. Certain inroads, however, have been made recently which permit a restriction of the problem. The essentials of the known results are as follows: It is possible to show that an irreducible proposition system can always be represented as the subspaces in a linear vector space of finite or countably infinitely many dimensions with coefficients from a field. However, not much is known about the physical implications of the nature of the field, or the uniqueness of this representation [14].

This result can be obtained by reducing the problem to another one which is part of a highly developed theory, namely, the theory of projective geometries.

It has been realized for a long time that the essential structural properties of a projective geometry are expressible in terms of the intersections and unions of the fundamental geometrical elements which can be built up from points, lines, planes, etc. Such a structure is an atomic lattice. Since a great deal is known about the structure of projective geometries and their realizations as subspaces, there is hope that this knowledge may be applicable for quantum-mechanical proposition systems, too. The difficulty which stands in the way of the execution of such a program is that a proposition system is not a projective geometry. What is missing is the modular law; a projective geometry is always defined as a modular lattice (cf. Section 5-6).

In the special case where the maximal element $I$ is a finite union of points, this difficulty disappears, because one can demonstrate that such a finite proposition system is always modular. But for infinite proposition systems, that is, systems for which $I$ is not a finite union of points, modularity is no longer compatible with the other axioms.

We could of course have retained modularity and given up some of the other axioms; but we have not done so for the reason that they are not suitable for the description of the physical reality. The possibilities which one could envisage would be the following:

The lattice of a proposition system is

1) A complete, modular, atomic but not orthocomplementable lattice. Sucha lattice is, for instance, obtained by considering as elements all the linearmanifolds (not necessarily closed) of an infinite-dimensional filbertspace.


2) A complete, modular, orthocomplemented but nonatomic lattice. The existence of such lattices was discovered by von Neumann in the projections from a factor of type II, and they were called by him continuous geometries.

3) A complete, orthocomplemented, atomic but nonmodular lattice.

We can immediately exclude lattices of type (1) since the orthocomplementation is a property which can be very easily justified by the physical interpretation of the propositions.

The lattices of type (2) are not so easily rejected. We have already pointed out that atomicity is very difficult to justify on physical grounds, at least in the generality needed for the axiom system. For this reason von Neumann has on various occasions given serious consideration to lattices of type (2) as a possible generalization of conventional quantum mechanics. Indeed it seems impossible to exclude such lattices without further examination of the empirical material. The decisive property, which we have not yet considered, is what we shall call localizability. We shall show that a system which possesses this property, as we know to be the case for elementary particles, cannot have a modular lattice of propositions (cf. Section 12-7). This would then also exclude case (2), and there remains only possibility (3), which is precisely that which we have adopted for a general proposition system.

We are thus faced with the problem of how to relate a proposition system (which in general is nonmodular) to a projective geometry (which is modular). We shall find it convenient to adopt the following definition of a projective geometry which is valid in the finite as well as in the infinite case [2, 5].

Let E be a set of elements called points, and let there be a class G of subsets of E which have the following properties:

1) There exists a distinguished class of subsets called "lines" in G such that, to every pair of different points $e_1$ and $e_2$ in G, there exists exactly one line $\ell$ which contains $e_1$ and $e_2$.

2) Three points $e_1$, $e_2$, and $e_3$ form a triangle which has the following property: If $e_4$ is a point on the line through $e_1$ and $e_2$, and $e_5$ is a point on the line through $e_2$ and $e_3$, then $e_4$ and $e_5$ determine a line $\ell$ which contains one point, $e_6$, of the line through $e_3$ and $e_1$ (cf. Fig. 8-1).

3) The necessary and sufficient condition that a subset of points $a$ belongs to the class of subsets G is that it contain the line which passes through any pair of points from $a$.

Fig. 8-1 The triangle axiom.


If the partial ordering in G is defined by set inclusion, and if unions and intersections are defined as least upper bound and greatest lower bound, then one can prove that such a system G is an atomic modular lattice (not necessarily finite), and hence a projective geometry [2].
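As a concrete check of axioms (1) and (2), the following Python sketch verifies them by brute force for the smallest projective plane, the seven-point Fano plane. The choice of this example, the labels 0 to 6 for the points, and the helper names `line_through` and `triangle_axiom_holds` are assumptions made for illustration; they do not appear in the text.

```python
from itertools import permutations

# Points 0..6 and the seven lines of the Fano plane (projective plane over GF(2)).
POINTS = range(7)
LINES = [frozenset(l) for l in
         [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
          (1, 4, 6), (2, 3, 6), (2, 4, 5)]]


def line_through(p, q):
    """Return the unique line containing the two distinct points p and q."""
    found = [l for l in LINES if p in l and q in l]
    if len(found) != 1:          # axiom (1) would be violated
        raise ValueError("axiom (1) fails for points %r, %r" % (p, q))
    return found[0]


def triangle_axiom_holds():
    """Check axiom (2) for every triangle e1, e2, e3 and every admissible e4, e5."""
    for e1, e2, e3 in permutations(POINTS, 3):
        if e3 in line_through(e1, e2):
            continue             # collinear points do not form a triangle
        for e4 in line_through(e1, e2) - {e1, e2}:
            for e5 in line_through(e2, e3) - {e2, e3}:
                # The line through e4 and e5 must meet the line through
                # e3 and e1 in exactly one point e6.
                common = line_through(e4, e5) & line_through(e3, e1)
                if len(common) != 1:
                    return False
    return True
```

Calling `triangle_axiom_holds()` runs through all ordered triangles of the plane; axiom (1) is checked implicitly each time `line_through` is called.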

The principal problem to be discussed in the rest of this chapter, therefore, is how to relate a proposition system of a physical system to a projective geometry G, defined above. Before we attack this problem, it is convenient to develop the reduction theory of proposition systems, to be followed by the representation theory of irreducible proposition systems. The reduction theory is the object of the following section.

8-2. REDUCTION THEORY

A first important step towards the complete structure theory of proposition systems is the reduction theory, which we shall discuss in this section. The idea is the following: Every lattice $\mathcal{L}$ has a center $\mathcal{C}$ consisting of all those elements of $\mathcal{L}$ which are compatible with every other element in $\mathcal{L}$. In Boolean lattices the center is identical with $\mathcal{L}$. In more general lattices it is a proper subset of $\mathcal{L}$. The center always contains the elements 0 and I of $\mathcal{L}$. If it contains only these two elements we call the center trivial. A lattice with trivial center is called coherent or irreducible.

If the center is not trivial, then there exists at least one element $z \in \mathcal{C}$ such that $z \neq 0$ and $z \neq I$. It follows then from the axioms of a proposition system that $z' \in \mathcal{C}$ also (cf. Problem 2 of Section 5-8). Every element $a \in \mathcal{L}$ can then be decomposed according to the formulas $a_1 = a \cap z$ and $a_2 = a \cap z'$.

Since $z$ and $z'$ are in the center of $\mathcal{L}$, the distributive law is valid for the expression

$$(a \cap z) \cup (a \cap z') = a \cap (z \cup z') = a.$$

Furthermore, $a_1 \cap a_2 = (a \cap z) \cap (a \cap z') = a \cap (z \cap z') = a \cap 0 = 0$. Thus, every element $a$ is decomposed by $z$ into a pair $\{a_1, a_2\}$ of elements, and one finds that if $a = \{a_1, a_2\}$ and $b = \{b_1, b_2\}$ are two elements of $\mathcal{L}$, then (Problem 1)

$$a \cap b = \{a_1 \cap b_1,\ a_2 \cap b_2\} \quad \text{and} \quad a \cup b = \{a_1 \cup b_1,\ a_2 \cup b_2\}.$$

For the orthocomplement $a'$ of $a$, one finds the decomposition $a' = \{a'_1, a'_2\}$, where $a'_1 = (a \cap z)' \cap z = a' \cap z$ and $a'_2 = (a \cap z')' \cap z' = a' \cap z'$ (Problem 2).

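We have thus constructed an explicit reduction of the lattice $\mathcal{L}$ into two sublattices $\mathcal{L}_1$ and $\mathcal{L}_2$. We say that $\mathcal{L}$ is the direct union of the lattices $\mathcal{L}_1$ and $\mathcal{L}_2$.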
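The decomposition by a central element can be tried out on the simplest case, a Boolean lattice of subsets, where every element belongs to the center. The sketch below is an illustration under that assumption: the four-element base set and the names `comp`, `decompose`, and `check_reduction` are hypothetical choices, with set intersection, set union, and set complement standing in for the lattice operations.

```python
from itertools import combinations

# Assumption for illustration: the Boolean lattice of all subsets of {0, 1, 2, 3}.
UNIT = frozenset({0, 1, 2, 3})
ELEMENTS = [frozenset(c) for r in range(5) for c in combinations(sorted(UNIT), r)]


def comp(a):
    """Orthocomplement a' (here simply the set complement)."""
    return UNIT - a


def decompose(a, z):
    """Return the pair (a1, a2) with a1 = a n z and a2 = a n z'."""
    return (a & z, a & comp(z))


def check_reduction(z):
    """Verify the componentwise rules of the reduction for all elements."""
    for a in ELEMENTS:
        a1, a2 = decompose(a, z)
        if (a1 | a2) != a or (a1 & a2):
            return False                       # a = a1 u a2 and a1 n a2 = 0
        # Components of a' are (a n z)' n z and (a n z')' n z'.
        if decompose(comp(a), z) != (comp(a1) & z, comp(a2) & comp(z)):
            return False
        for b in ELEMENTS:
            b1, b2 = decompose(b, z)
            if decompose(a & b, z) != (a1 & b1, a2 & b2):
                return False
            if decompose(a | b, z) != (a1 | b1, a2 | b2):
                return False
    return True
```

For instance, `decompose(frozenset({0, 2}), frozenset({0, 1}))` splits the element {0, 2} into the components {0} and {2}.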

Both sublattices $\mathcal{L}_1$ and $\mathcal{L}_2$ by themselves satisfy all the axioms of a proposition system (Problem 4). Each of them contains a center, $\mathcal{C}_1$ and $\mathcal{C}_2$ respectively, which may or may not be trivial. If both centers are trivial, the decomposition stops, and both $\mathcal{L}_1$ and $\mathcal{L}_2$ are irreducible. If, for instance, $\mathcal{C}_1$ is not trivial, then the decomposition can be continued by choosing an element $z_1 \in \mathcal{C}_1$ such that $z_1 \neq 0_1$ and $z_1 \neq I_1$, where $0_1$ is the null element and $I_1$ is the unit element of $\mathcal{L}_1$. Using this element one can then further reduce the sublattice $\mathcal{L}_1$, and so on.

The question presents itself whether the process will ever stop, and if it does, whether the resultant irreducible sublattices are uniquely determined apart from a permutation.

These questions are easily answered in the affirmative if the lattice satisfies a finiteness condition: The unit element $I$ is a finite union of compatible points,

$$I = \bigcup_i e_i$$

(cf. Problem 5).

If the finite lattice is Boolean, then the irreducible lattices which are obtained in the reduction process each consist of exactly two elements. This is so because the only irreducible Boolean lattice is the trivial one consisting of two elements only.

A large class of infinite Boolean lattices can still be reduced to irreducible lattices in a similar way. When this is possible, we can label the irreducible components by an index $i$ which runs through an index-space $\Omega$, the phase-space of the system. Every proposition $x \in \mathcal{L}$ is then of the form $x = \{x_i\}$ with $x_i$ either $0_i$ or $I_i$. If we let $A$ be the subset of $\Omega$ defined by

$$A = \{\, i : x_i = I_i \,\},$$

we see that this defines a one-to-one correspondence between subsets $A$ of $\Omega$ and propositions $x \in \mathcal{L}$. The operation of union and intersection becomes set union and set intersection, and the entire lattice is isomorphically mapped into the lattice of the subsets of the phase space $\Omega$. The null set of $\Omega$ corresponds to 0 and the entire set corresponds to $I$.

The reduction theory of general Boolean lattices is not as complete as that, however. The strongest results are contained in the papers by Stone and by Loomis (cf. references 6 and 7), where it is shown that any Boolean lattice can be represented as the lattice of certain subsets of a set. In this general case one is very far from a complete reduction theory.

Let us now discuss the case of a non-Boolean lattice. We shall sketch only the main lines of the proof of the reduction theorem, referring for the details to reference 5.

One begins by establishing an equivalence relation between points in $\mathcal{L}$. Two points $e_1$ and $e_2$ are said to be perspective if there exists a third point $e_3 \subset e_1 \cup e_2$. One shows that perspectivity is an equivalence relation. (Only transitivity uses some deeper properties of the proposition system.) It is in fact based on the following:

Lemma: Given three points $e_1$, $e_2$, and $e_3$, which satisfy the property that $e_1 \cup e_2$ contains a point $e_4$ such that $e_4 \neq e_1$ and $e_4 \neq e_2$, and $e_2 \cup e_3$ contains a point $e_5$ such that $e_5 \neq e_2$ and $e_5 \neq e_3$, then $e_4$, $e_5$ define a line $\ell = e_4 \cup e_5$ which has exactly one point $e_6$ in common with $e_3 \cup e_1$ (cf. Fig. 8-1).

For the proof of this lemma, see reference 5. Now let $e_\alpha$ be any point and denote by $\{e_\alpha\}$ the class of equivalent points containing $e_\alpha$; then we can construct the element

$$z_\alpha = \bigcup_{e \in \{e_\alpha\}} e,$$

which we shall denote in shorter fashion by $z_\alpha = \bigcup e_\alpha$. This element is contained in the center $\mathcal{C}$ of $\mathcal{L}$; in fact, it is the smallest element in $\mathcal{C}$ which contains all the points $e_\alpha$. We shall call it the central cover of $e_\alpha$ (Problem 6). Now it is not hard to demonstrate that the desired unique decomposition of an arbitrary $x \in \mathcal{L}$ has the form

$$x = \bigcup_\alpha x_\alpha \quad \text{with} \quad x_\alpha = x \cap z_\alpha.$$

It is easy to verify that the elements of the form $x_\alpha = x \cap z_\alpha$ are themselves an irreducible lattice, and that for any pair of elements $x = \bigcup x_\alpha$ and $y = \bigcup y_\alpha$, one has $x \cup y = \bigcup (x_\alpha \cup y_\alpha)$ and $x \cap y = \bigcup (x_\alpha \cap y_\alpha)$ (see reference 5). This reduction theorem simplifies the discussion of the representation theory of proposition systems. We can, in fact, now concentrate on the representation theory of irreducible lattices.

PROBLEMS

1. If $z$ is a nontrivial element of the center of a lattice $\mathcal{L}$, and we define for every element $x \in \mathcal{L}$ the decomposition $x_1 = x \cap z$, $x_2 = x \cap z'$, then we have

$$(a \cap b)_1 = a_1 \cap b_1, \qquad (a \cup b)_1 = a_1 \cup b_1,$$

$$(a \cap b)_2 = a_2 \cap b_2, \qquad (a \cup b)_2 = a_2 \cup b_2.$$

2. For every $z \in \mathcal{C}$ one has $a' \cap z = (a \cap z)' \cap z$.

3. A finite Boolean lattice has $2^n$ elements ($n = 1, 2, \ldots$). The elements of such a lattice are in one-to-one correspondence to the subsets of $n$ elements.

4. If $z$ is an element of the center $\mathcal{C}$ of a proposition system $\mathcal{L}$, then the set of elements of the form $a \cap z$, for all $a \in \mathcal{L}$, is again a proposition system.

5. A lattice which satisfies a finite chain condition can be reduced in a unique way to the direct union of a finite set of irreducible lattices.

6. The element $z_\alpha = \bigcup e_\alpha$ is contained in the center $\mathcal{C}$ of $\mathcal{L}$, and it is the smallest element in $\mathcal{C}$ which contains any $e_\alpha$ (cf. reference 5).


8-3. THE STRUCTURE OF IRREDUCIBLE PROPOSITION SYSTEMS

We shall now study an irreducible lattice which is complete, atomic, orthocomplemented, and weakly modular (that is, it satisfies axiom (P) of Section 5-8). Irreducibility can be expressed in two equivalent forms (cf. Problem 1):

Either to every pair of points $e_1$, $e_2$ there exists a third $e_3$ such that

$$e_1 \cup e_2 = e_1 \cup e_3 = e_2 \cup e_3,$$

or the center is trivial.

In physical terms, irreducibility means the unrestricted validity of the superposition principle.

We shall introduce the notion of the dimension n of the system as the maximum number of compatible points whose union is equal to I. The dimension may be finite or infinite, and it is always > 0.

The structure of an irreducible proposition system is related to the structure of an (irreducible) projective geometry. The exact nature of this relation is contained in the following:

Theorem: Every irreducible proposition system $\mathcal{L}$ can be imbedded in a canonical way into a projective geometry G by a correspondence $\alpha$ which has the following properties:

1) $\alpha$ is a correspondence from $\mathcal{L}$ into G.

2) The restriction of $\alpha$ to the points of $\mathcal{L}$ is a mapping onto the points of G.

3) $a \subseteq b$ if and only if $\alpha(a) \subseteq \alpha(b)$.

4) $\alpha\left(\bigcap a_i\right) = \bigcap \alpha(a_i)$.

5) $\alpha(a \cup e) = \alpha(a) \cup \alpha(e)$ for any point $e \in \mathcal{L}$.

Proof: For the proof of this theorem, we utilize the definition of a projective geometry which we have given in Section 8-1. The set E in this definition will be taken as the set of all points in $\mathcal{L}$. We define as $\alpha(a)$ the set of points contained in the proposition $a$. Let us verify that $\alpha(a)$ thus defined is indeed a set of the class of subsets which constitute the projective geometry G. According to the definition of G, such a subset must contain, with every pair of points, the line which passes through these points. This is evidently the case here, because every point of the line is $\subset a$ and hence contained in $\alpha(a)$.

Properties (2) and (3) are immediate consequences of the definition of $\alpha$. Let us verify (4). The left-hand side consists of all the points contained in $\bigcap a_i$, and the right-hand side is equal to all the points contained in all the $a_i$, which is by definition equal to all the points in $\bigcap a_i$. Property (5) is a consequence of the fact that every point $e_1 \subset a \cup e$ but different from $e$ is on a line passing through $e$ and a point in $a$. This is true if there exists a point in $a \cap (e \cup e_1)$ (cf. Problem 2). With this the theorem is proved.

We mention here for greater clarity that the union in an expression such as $\alpha(a) \cup \alpha(b)$ signifies the union with respect to the lattice structure of the projective geometry G. It does not follow from property (5) that for any pair of propositions $a$, $b$, one has $\alpha(a \cup b) = \alpha(a) \cup \alpha(b)$. In fact, in general one has $\alpha(a) \cup \alpha(b) \subset \alpha(a \cup b)$.

The theorem which we have now established enables us to reduce the study of the structure of a proposition system $\mathcal{L}$ to that of the projective geometry G into which any proposition system is thus canonically imbedded.

The structure of an irreducible projective geometry is completely determined by the following:

Theorem: If the projective geometry G has dimension $n \geq 3$, then there exists a linear vector space V with coefficients from a field (determined up to isomorphism) and a one-to-one correspondence between the elements of G and the linear manifolds of V. This correspondence preserves the partial order relation, and, in particular, maps points into one-dimensional subspaces of V. (For the proof refer to reference 1, Chapter VII.)

With this theorem the first part of the analysis of the structure of proposition systems is accomplished. We may summarize it as follows:

Every proposition system is a unique direct union of irreducible proposition systems.

Every irreducible proposition system is imbedded in a canonical way into a projective geometry.

Every projective geometry is algebraically isomorphic with the linear manifolds of some linear vector space V with coefficients from a field. The nature of the field is, up to isomorphism, determined by the algebraic structure of the lattice. The field may be different for different components of a reducible lattice.

PROBLEMS

1. In a proposition system the two properties

a) To every pair of points $e_1$, $e_2$ there exists a third $e_3$ such that

$$e_1 \cup e_2 = e_1 \cup e_3 = e_2 \cup e_3;$$ and

b) the center is trivial

are equivalent.

*2. Given a point $e$ and an element $a$ in a proposition system $\mathcal{L}$. Then every point $e_1 \subset a \cup e$ is situated on a line passing through $e$ and a point $e_2 \subset a$ [5].


8-4. ORTHOCOMPLEMENTATION AND THE METRIC OF THE VECTOR SPACE

The theorem of the preceding section makes no statement about the nature of the field over which the vector space V is constructed. The field could be finite, discrete, or continuous. It could be commutative or not. Conventional quantum mechanics is constructed over a Hilbert space with coefficients from the complex numbers. It would be most desirable to understand and exhibit the physical basis for that particular choice in the conventional theory, but this has not yet been possible.

A certain restriction concerning the nature of the field comes from the postulate of orthocomplementation. The attentive reader will have noticed that the axiom of orthocomplementation has not yet been used for an irreducible proposition system in the analysis of the structure of such systems. This we shall do now. The aim of this section is thus to find out the restriction on the vector space V which results from the existence of orthocomplementation.

A first example of such a restriction has been established by Birkhoff and von Neumann. In order to formulate their theorem we need a couple of definitions.

We shall define an involutive antiautomorphism of a field F as a mapping (denoted by *) of F onto F with the following properties:

$$(\alpha + \beta)^* = \alpha^* + \beta^*, \qquad (\alpha\beta)^* = \beta^*\alpha^*, \qquad \text{and} \qquad (\alpha^*)^* = \alpha$$

for all $\alpha, \beta \in F$.

We shall call a sesquilinear form in V a mapping $f$ of the cartesian product $V \times V$ into F such that for all $x_1, x_2, y_1, y_2 \in V$ and $\lambda \in F$,

$$f(x_1 + \lambda x_2, y) = f(x_1, y) + \lambda^* f(x_2, y),$$

$$f(x, y_1 + \lambda y_2) = f(x, y_1) + f(x, y_2)\lambda.$$

Such a form is called Hermitian if in addition

$$f(x, y)^* = f(y, x);$$

and it is called definite if

$$f(x, x) = 0 \quad \text{implies} \quad x = 0.$$
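To make these definitions concrete, the sketch below checks them numerically for one familiar candidate, the form $f(x, y) = \sum_k x_k^* y_k$ on complex vectors, with complex conjugation as the antiautomorphism. The choice of this form, the sample vectors, and the helper names `add` and `scale` are assumptions for illustration; the components are Gaussian integers so that the floating-point arithmetic is exact.

```python
def f(x, y):
    """Candidate form on C^n: f(x, y) = sum_k conj(x_k) * y_k."""
    return sum(xk.conjugate() * yk for xk, yk in zip(x, y))


def add(x, y):
    """Componentwise sum of two vectors."""
    return [a + b for a, b in zip(x, y)]


def scale(x, lam):
    """Multiply every component of x by the scalar lam."""
    return [a * lam for a in x]


# Hypothetical sample data with Gaussian-integer components.
x1, x2 = [1 + 2j, 3 - 1j], [2 - 1j, 1j]
y1, y2 = [1j, 4 + 0j], [2 + 2j, -1 + 0j]
lam = 2 - 3j

# f(x1 + lam x2, y) = f(x1, y) + lam* f(x2, y)
antilinear_first = (f(add(x1, scale(x2, lam)), y1)
                    == f(x1, y1) + lam.conjugate() * f(x2, y1))
# f(x, y1 + lam y2) = f(x, y1) + f(x, y2) lam
linear_second = (f(x1, add(y1, scale(y2, lam)))
                 == f(x1, y1) + f(x1, y2) * lam)
# Hermitian: f(x, y)* = f(y, x)
hermitian = f(x1, y1).conjugate() == f(y1, x1)
# Definite on the samples: f(x, x) is real and positive for x != 0
positive = f(x1, x1).imag == 0 and f(x1, x1).real > 0
```

Over the complex numbers the order of the factors in $f(x, y_2)\lambda$ is immaterial; the order written in the definition matters only for a noncommutative field such as the quaternions.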

Birkhoff and von Neumann have demonstrated the following:

Theorem 1: Let V be a vector space over a field F and let n be its dimension. If $3 \leq n < \infty$, then every orthocomplementation in V defines an antiautomorphism of F and a definite Hermitian form $f$ such that every subspace can be represented by the vectors $x$ which satisfy $f(x, y_i) = 0$ for a certain family of vectors $y_i$ (cf. references 5 and 1).


The existence of a definite Hermitian form implies the existence of an involutive antiautomorphism. Not every field has this property. We see thus that in the finite case orthocomplementation implies a restriction on the field. The real, complex, and quaternionic fields all have this property and thus may be suitable candidates for the representation of proposition systems.
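For the quaternions the involutive antiautomorphism is quaternionic conjugation, and the reversal of factors in $(\alpha\beta)^* = \beta^*\alpha^*$ is essential because the multiplication does not commute. The following sketch illustrates this with quaternions stored as integer 4-tuples $(a, b, c, d)$ standing for $a + bi + cj + dk$; the representation and the names `qmul`, `qadd`, and `qconj` are assumptions for illustration.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def qadd(p, q):
    """Componentwise sum of two quaternions."""
    return tuple(x + y for x, y in zip(p, q))


def qconj(q):
    """Quaternionic conjugation: a + bi + cj + dk -> a - bi - cj - dk."""
    a, b, c, d = q
    return (a, -b, -c, -d)


# Hypothetical sample quaternions.
p, q = (1, 2, 3, 4), (5, 6, 7, 8)

additive = qconj(qadd(p, q)) == qadd(qconj(p), qconj(q))      # (p + q)* = p* + q*
reverses = qconj(qmul(p, q)) == qmul(qconj(q), qconj(p))      # (pq)* = q* p*
involutive = qconj(qconj(p)) == p                             # (p*)* = p
keeps_order = qconj(qmul(p, q)) == qmul(qconj(p), qconj(q))   # fails in general
```

With these sample values `keeps_order` is false, so conjugation is an antiautomorphism of the quaternions and not an automorphism.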

The following remarks outline the extension of the above theorem to the infinite case. A useful intermediate step is obtained from the following:

Theorem 2: If the sesquilinear forms $f$ and $g$ over a vector space represent the same orthocomplementation, then there exists an element $\gamma \in F$ such that $g(x, y) = \gamma f(x, y)$ for all $x, y \in V$. (Cf. reference 1, Proposition 2, p. 105.)

This theorem now permits the following construction: Let $\mathcal{L}$ be an infinite irreducible proposition system, and let G be the infinite projective geometry into which it is canonically imbedded. Denote by V the vector space over the field F which yields an isomorphic representation of G. Let $V_0$ be any three-dimensional subspace of V. The restriction of the orthocomplementation in V to $V_0$ defines an orthocomplementation in $V_0$, and since $V_0$ is finite-dimensional, it defines a Hermitian definite form $f_0(x, y)$.

Now let $V_1$ be any other finite-dimensional subspace of V which contains $V_0$: $V_0 \subset V_1$. According to Theorem 1, there exists a Hermitian form $g_1(x, y)$ for which we denote the restriction to $V_0$ by $g_0(x, y)$. According to Theorem 2, we can find a number $\gamma \in F$ such that

$$f_0(x, y) = \gamma g_0(x, y).$$

Thus if we define $f_1(x, y) = \gamma g_1(x, y)$, we have found the uniquely determined Hermitian form which defines the orthocomplementation on $V_1$ and for which the restriction to $V_0$ is equal to $f_0(x, y)$.

We can now define a definite Hermitian form on the entire space V in the following way:

Let us define $V_1 = V_0(x, y)$ as the linear subspace of V which is generated by $V_0$ and the two vectors $x$ and $y \in V$, and denote by $f_1$ the unique extension described in the preceding remarks of $f_0$ to the subspace $V_1$. Then we set

$$f(x, y) = f_1(x, y).$$

This is, according to the definition, a definite Hermitian form on the entire space V. Moreover, if $e$ is a one-dimensional subspace (or a point of the lattice), then the subspace $e'$ consists of all those vectors $x$ which satisfy

$$f(x, y) = 0 \quad \text{for} \quad y \in e.$$

We see from these remarks that the orthocomplementation defines on V a definite Hermitian form, even if V is infinite-dimensional.


The usual assumption in conventional quantum mechanics of a definite metric is thus traced directly to the existence of the orthocomplementation. (For more details see reference 5.)

This result throws an interesting light on the various attempts at generalizing quantum field theory by postulating an indefinite metric in Hilbert space. While such formulations may indeed be possible, we see from the foregoing that nothing of an essential physical nature can be gained by it. Indeed, we may announce here quite generally that any physical theory (which admits orthocomplementation) represented in a Hilbert space with indefinite metric can be transcribed into another physically equivalent theory which operates in a Hilbert space with definite metric.

8-5. QUANTUM MECHANICS IN HILBERT SPACE

We have now completed the construction of the connection between the proposition system as an abstract lattice and its representation as a lattice of subspaces in some linear vector space with coefficients from some field. We obtain the proposition system of conventional quantum mechanics if for this field we choose the complex numbers, and for the antiautomorphism the ordinary complex conjugation.

Although it has not yet been possible to single out in full generality the empirical basis for the choice of complex numbers, the following remarks may be made concerning this choice. If the field contains the real numbers as a subfield, then the choice of field is severely limited by a celebrated mathematical theorem according to which there are only three fields which contain the reals as a subfield. These are the reals themselves, the complex numbers, and the quaternions. Quantum mechanics in real Hilbert space has been developed by Stueckelberg [10], and he has found that the empirical evidence points towards the existence of a superselection rule, which has the effect that at least for simple systems the proposition system is essentially equivalent to the system of subspaces in a complex Hilbert space.

It is more difficult to rule out quaternionic Hilbert spaces, and the possibility of a "quaternionic quantum mechanics" was seriously considered [11]. A rather decisive limitation comes from the relativistic form of quaternionic quantum mechanics, which can be shown, with a very plausible supplementary assumption, to be equivalent to complex quantum mechanics for a system consisting of just one particle [12].

From now on, therefore, we shall concentrate on quantum mechanics in complex Hilbert space. In this case elementary propositions are represented by projection operators or, equivalently, by closed linear subspaces. The intersection and unions are defined as set intersection and closed linear span of subspaces. The orthocomplement is defined with the fundamental Hermitian form $f$: If $M$ is a subspace, then $M^\perp$ is the set of all vectors $x \in \mathcal{H}$ such that $f(x, y) = 0$ for all $y \in M$. The Hermitian form $f(x, y)$ defines a scalar product of elements in $\mathcal{H}$ by setting

$$(x, y) \equiv f(x, y).$$

Two propositions are compatible if and only if their corresponding projection operators commute.

Observables are represented by $\sigma$-homomorphisms $\Delta \to E_\Delta$, which in this case are nothing else than spectral measures. Hence, in view of the one-to-one correspondence of spectral measures with self-adjoint linear operators, we may say: Observables are self-adjoint linear operators.

The states are $\sigma$-additive positive functionals on the projection operators. Gleason [13] has proved the important theorem that every such functional has the form

$$p(E) = \mathrm{Tr}(WE),$$

where Tr signifies the trace defined by the formula

$$\mathrm{Tr}\, X = \sum_r (\varphi_r, X\varphi_r)$$

for any complete system of normalized orthogonal vectors $\varphi_r$. The operator $W$ is a self-adjoint positive operator which satisfies the relations

$$W > 0, \qquad \mathrm{Tr}\, W = 1, \qquad \text{and} \qquad W^2 \leq W.$$

The state is pure if and only if the operator $W$ satisfies the relation $W^2 = W$. In that case $W$ is a projection operator of rank 1 (Problems 1 and 2).
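The purity criterion is easy to try out in a two-dimensional space. The sketch below compares a projection of rank 1 with a mixture; the two sample matrices and the helper names `matmul`, `trace`, and `is_pure` are assumptions for illustration, and exact rational arithmetic is used so that the test $W^2 = W$ is not affected by rounding.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(X):
    """Sum of the diagonal elements."""
    return sum(X[i][i] for i in range(len(X)))


def is_pure(W):
    """A state is pure if and only if W squared equals W."""
    return matmul(W, W) == W


half = Fraction(1, 2)
# Hypothetical examples: projection onto the unit vector (1, 1)/sqrt(2) ...
W_pure = [[half, half], [half, half]]
# ... and a mixture of two orthogonal states with weights 3/4 and 1/4.
W_mixed = [[Fraction(3, 4), 0], [0, Fraction(1, 4)]]
```

Both matrices have unit trace, but only the first is idempotent; for the mixture one finds $\mathrm{Tr}\, W^2 = 5/8 < 1$.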

We define the expectation value $p(A)$ of an observable $A$ by setting

$$p(A) = \int \lambda \, d\,\mathrm{Tr}(E_\lambda W) = \mathrm{Tr}(WA).$$

The expectation value is a linear functional on all the observables; that is, we always have

$$p(A + B) = p(A) + p(B),$$

whether $A$ and $B$ commute or not (Problem 5). The operator $W$ is called the density operator. If $W$ is a projection with one-dimensional range, and $\varphi$ is a unit vector in that range, then $\varphi$ is called the state vector. A state vector thus always represents a pure state. In such a state, the expectation value of the observable $A$ is given by $p(A) = (\varphi, A\varphi)$. In the general case of a mixture there always exists an orthonormal family $\varphi_r$ and positive numbers $\lambda_r$ such that

$$p(A) = \sum_r \lambda_r (\varphi_r, A\varphi_r) \quad \text{where} \quad \sum_r \lambda_r = 1$$

(Problem 6).
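A small numerical illustration of these two statements: the sketch below takes a diagonal density operator and two noncommuting observables (the matrices usually called $\sigma_x$ and $\sigma_z$) and compares $\mathrm{Tr}(WA)$ with the weighted sum over the eigenvectors of $W$. The specific matrices, weights, and helper names are assumptions for illustration; real vectors are used, so no complex conjugation is needed in the scalar product.

```python
from fractions import Fraction


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def expectation(W, A):
    """p(A) = Tr(W A)."""
    return trace(matmul(W, A))


def inner(x, y):
    """Scalar product of two real vectors."""
    return sum(a * b for a, b in zip(x, y))


def apply(A, x):
    """Matrix A acting on the vector x."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]


# Hypothetical mixture: weights 3/4 and 1/4 on the two basis vectors.
lams = [Fraction(3, 4), Fraction(1, 4)]
phis = [[1, 0], [0, 1]]
W = [[lams[0], 0], [0, lams[1]]]

A = [[0, 1], [1, 0]]     # sigma_x
B = [[1, 0], [0, -1]]    # sigma_z, does not commute with sigma_x


def mixture_expectation(X):
    """Sum over r of lambda_r (phi_r, X phi_r)."""
    return sum(l * inner(p, apply(X, p)) for l, p in zip(lams, phis))
```

Here `expectation(W, matadd(A, B))` equals `expectation(W, A) + expectation(W, B)` although `matmul(A, B)` and `matmul(B, A)` differ.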


PROBLEMS

1. The state represented by the operator $W$ is pure if and only if $W^2 = W$.

2. For a pure state $W$ is a projection operator with one-dimensional range.

3. If $W$ is pure, then there exists a normalized vector $\varphi \in \mathcal{H}$ such that the expectation value of any observable $A$ is given by

$$\langle A \rangle = (\varphi, A\varphi).$$

4. If the state $p$ is pure, then there exists a projection $P$ of one-dimensional range such that $p(P) = 1$ and vice versa.

5. The expectation value $p(A)$ for any state is a linear functional on all observables:

$$p(A + B) = p(A) + p(B).$$

6. If $p$ is any state, then there exists an orthonormal family of vectors $\varphi_r$ (not necessarily complete) and real numbers $\lambda_r > 0$, such that

$$p(A) = \sum_r \lambda_r (\varphi_r, A\varphi_r) \quad \text{with} \quad \sum_r \lambda_r = 1.$$

7. If $A$ is an observable with spectrum $\Lambda$, then we always have $a \leq p(A) \leq b$, where $a$ is the greatest lower bound and $b$ the least upper bound of $\Lambda$.

REFERENCES

Useful general references on projective geometries are:

1. R. BAER, Linear Algebra and Projective Geometry. New York: Academic Press (1952).

2. M. L. DUBREUIL-JACOTIN, L. LESIEUR, AND R. CROISOT, Leçons sur la Théorie des Treillis. Paris: Gauthier-Villars (1953); especially Section 3.

3. E. ARTIN, Geometric Algebra. New York: Interscience Publishers (1957).

Factors of type II were discovered by Murray and von Neumann. The lattices of projections in a factor of type II were called continuous geometries by von Neumann.

4. F. J. MURRAY AND J. VON NEUMANN, Ann. Math. 37, 116 (1936). Also, J. VON NEUMANN, Proc. Nat. Acad. Sci. 22, 92, 101 (1936).

The main result of this chapter is the subject of the thesis of C. Piron.

5. C. PIRON, Ph.D. thesis, Helv. Phys. Acta 37, 439 (1964). See also M. D. MACLAREN, Thesis, Harvard University (1962).

The general theory of Boolean lattices is discussed in:

6. M. H. STONE, Trans. Am. Math. Soc. 40, 37 (1936).

7. L. H. LOOMIS, Bull. Am. Math. Soc. 53, 757 (1947).

The lattice-theoretic formulation of projective geometry is found primarily in:

8. G. BIRKHOFF, Ann. Math. 36, 743 (1935).


9. K. MENGER, Ann. Math. 37, 456 (1936).

Quantum mechanics over real and quaternionic Hilbert spaces is considered in:

10. E. C. G. STUECKELBERG et al., Helv. Phys. Acta 33, 727 (1960); 34, 621, 675 (1961); 35, 673 (1962).

11. D. FINKELSTEIN, J. M. JAUCH, AND D. SPEISER, J. Math. Phys. 3, 207 (1962); 4, 136, 788 (1963).

12. G. EMCH, Helv. Phys. Acta 36, 739, 770 (1963).

Gleason's theorem is proved in:

13. A. M. GLEASON, J. of Rat. Mech. and Analysis 6, 885 (1957).

Proposition systems can also be realized in algebraic Hilbert spaces, as shown in:

14. L. HORWITZ AND L. BIEDENHARN, Helv. Phys. Acta 38, 385 (1965).

15. L. HORWITZ, Helv. Phys. Acta 39, 144 (1965).


CHAPTER 9

SYMMETRIES AND GROUPS

Symmetry, as wide or as narrow as you may define its meaning, is one idea by which man through the ages has tried to comprehend and create order, beauty, and perfection.

H. WEYL

In this chapter we introduce the formal tools for the treatment of symmetries in physical systems. After a brief exposition of the meaning of symmetry in physics and its mathematical formulation (Section 9-1), we devote two sections (9-2 and 9-3) to the theory of groups. What is offered here is a short recapitulation of the principal notions concerning abstract and Lie groups, and it does not replace a more thorough study of group theory. The basic group which concerns us here is the group of automorphisms of a proposition system, defined in Section 9-4. In Section 9-5 we show how states are transformed under this group and how this property can be used to make it into a topological group H. The symmetry groups G which we encounter in nature are, however, given by basic laws or by particular circumstances, and they are different from the group of automorphisms. There arises then the problem of the homomorphisms of G into H. We call these the projective representations of G (Section 9-6). Complete solutions of this problem are given for the case of Lie groups represented in the lattice of the subspaces of a Hilbert space. These are the most important cases for later applications.

9-1. THE MEANING OF SYMMETRY

The notion of symmetry has been of ever increasing importance in fundamental theoretical physics. The principles of symmetries are often the only guides into unknown territory, and in the realm of the known they furnish convenient shortcuts to important results.

Symmetries pervade the whole of physics. They are important in the classical as well as the quantum domain. They are expressions of the immutable laws which make physics as a science possible, and they are the most profound structural properties of space and time, the arena of the physical phenomena.

A symmetry has an immediate intuitive appeal and can often be recognized at a very early stage of a physical theory. But the notion of symmetry can also be given a precise mathematical sense which is crystallized, so to speak, in the notion of the group. Mathematics has developed the theory of abstract groups to a high degree of perfection, and in the formulation of physical symmetries the theory of groups is an indispensable tool.

If we analyze the notion of symmetry, we find that it is based on two other more primitive notions, that of transformation and that of invariance. We say that we have a symmetry if there exist certain transformations of the elements of a set which leave one or several relations among these elements unchanged.

The lattice of a proposition system $\mathcal{L}$ furnishes excellent examples of symmetries. In order to show this, we introduce the notion of a morphism of a lattice $\mathcal{L}_1$ onto another one $\mathcal{L}_2$.

A morphism $m$ for the lattices $\mathcal{L}_1$ and $\mathcal{L}_2$ is a bijective (that is, one-to-one) mapping of $\mathcal{L}_1$ onto $\mathcal{L}_2$ with the following two properties:

1) $a \subseteq b$ implies $m(a) \subseteq m(b)$, and vice versa.

2) $m(a') = m(a)'$ for every $a \in \mathcal{L}_1$.

If two lattices are connected by a morphism then they are said to be isomorphic.†

Since the mapping $m$ is bijective, there exists an inverse mapping $m^{-1}$ with domain $\mathcal{L}_2$ and range $\mathcal{L}_1$, and this inverse is also a morphism.

In the special case that $\mathcal{L}_1 = \mathcal{L}_2 = \mathcal{L}$, the mapping is a permutation of the elements of $\mathcal{L}$ and $m$ becomes an automorphism.

If there exist automorphisms of a lattice, we say the lattice has symmetries.

If $m_1$ and $m_2$ are two different automorphisms of a lattice, then the mapping which is obtained by carrying out first $m_1$ followed by $m_2$ is again an automorphism denoted by $m_2 m_1$, and called the product of $m_1$ with $m_2$. The set $M$ of all automorphisms of $\mathcal{L}$ is thus a family of morphisms which is closed under the operations of the product and the inverse. Such a family of transformations is called a group.
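The closure of the automorphisms under product and inverse can be checked by enumeration on a small example. The sketch below uses a six-element orthocomplemented lattice with null element, unit element, and two pairs of mutually orthogonal atoms $a, a'$ and $b, b'$; this particular lattice and the names `leq`, `is_automorphism`, `product`, and `inverse` are assumptions chosen for illustration.

```python
from itertools import permutations

# Hypothetical example lattice: 0 < a, a', b, b' < I, with the four atoms
# pairwise incomparable and orthocomplementation a <-> a', b <-> b', 0 <-> I.
ELEMS = ["0", "a", "a'", "b", "b'", "I"]
COMP = {"0": "I", "I": "0", "a": "a'", "a'": "a", "b": "b'", "b'": "b"}


def leq(x, y):
    """Partial order of the example lattice."""
    return x == y or x == "0" or y == "I"


def is_automorphism(m):
    """Property 1 (order preserved both ways) and property 2 (m(a') = m(a)')."""
    order_ok = all(leq(x, y) == leq(m[x], m[y]) for x in ELEMS for y in ELEMS)
    comp_ok = all(m[COMP[x]] == COMP[m[x]] for x in ELEMS)
    return order_ok and comp_ok


# Enumerate all bijections of the six elements and keep the automorphisms.
AUTS = [m for m in (dict(zip(ELEMS, p)) for p in permutations(ELEMS))
        if is_automorphism(m)]


def product(m2, m1):
    """The product m2 m1: carry out first m1, then m2."""
    return {x: m2[m1[x]] for x in ELEMS}


def inverse(m):
    """The inverse mapping of a bijection m."""
    return {v: k for k, v in m.items()}


closed_under_product = all(product(m2, m1) in AUTS for m1 in AUTS for m2 in AUTS)
closed_under_inverse = all(inverse(m) in AUTS for m in AUTS)
```

For this lattice the enumeration yields eight automorphisms: each pair of atoms may be interchanged internally, and the two pairs may be exchanged with each other.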

The algebraic structure of the group of automorphisms of $\mathcal{L}$ embodies a great deal of the structure of $\mathcal{L}$. It is therefore of some interest to study the structure of groups quite independently of the structure of the objects which the elements of the group transform. Some of the elements of the structure of groups are discussed in the following two sections.

† It should be remarked that morphisms in the mathematical literature are usually given a more general meaning than the one adopted here. A morphism in the general sense need be neither one-to-one nor onto. Here we do not need it in this generality.


9-2. ABSTRACT GROUPS

An abstract group is a finite or infinite set of objects called the elements of the group endowed with an algebraic structure as follows: To every pair of elements $r, s \in G$ there corresponds a unique element $rs \in G$ such that:

1) $(rs)t = r(st)$ (associative law).

2) There exists a neutral element $e \in G$ such that, for all $r \in G$, $er = re = r$.

3) To every $r$ there exists a unique inverse $r^{-1}$ such that $rr^{-1} = r^{-1}r = e$.

The order of the elements in an expression such as rs is importantbecause in general rs sr. If rs = sr, we say the elements commute. Agroup for which all pairs of elements commute is called commutative orabe/ian.

The set of elements in G which commute with every element in G is calledthe center Z of G. Abelian groups are thus characterized by the propertyZ = G. The neutral element e certainly belongs to the center. Furthermore,if z1, z2 are two elements e Z, then z1z2r = z1rz2 = rz1z2, so that z1z2 e Z.Likewise if z e Z, then zr = rz, from which it follows that rz1 = z1r.Thus z1 e Z. We see from these remarks that the center Z is itself an abeliangroup, and since Z G we call it a subgroup of G.

In general, a subset H c G is called a subgroup of G, if it is itself a groupunder the group operations of G.

There always exist two trivial subgroups, namely, $H = \{e\}$ and $H = G$. In the first case $H$ consists of the single element $e$ (evidently a group), and in the second case it consists of all the elements of $G$. An easy way to construct nontrivial subgroups of a group is the following: Take any element $a \in G$ ($a \neq e$) and form the successive powers of $a$: $a^2 = aa$, $a^3 = aa^2$, etc. If there exists an $n$ such that $a^n = e$, we obtain a finite subgroup of a special kind called a cyclic group. If there exists no such $n$, then we obtain an infinite cyclic subgroup.
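The construction of a cyclic subgroup from successive powers can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that the group elements are permutations of {0, 1, 2} stored as tuples; the names `compose` and `cyclic_subgroup` are hypothetical and do not come from the text.

```python
# Permutations of {0, 1, 2} are stored as tuples: p[i] is the image of i.
def compose(p, q):
    """Group product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def cyclic_subgroup(a):
    """Return [e, a, a^2, ...], stopping just before the powers return to e."""
    e = tuple(range(len(a)))
    powers = [e]
    x = a
    while x != e:
        powers.append(x)
        x = compose(a, x)      # a^(k+1) = a a^k
    return powers

a = (1, 2, 0)                  # the 3-cycle 0 -> 1 -> 2 -> 0
H = cyclic_subgroup(a)
print(len(H))                  # the smallest n with a^n = e
```

For a finite group the loop always terminates, since the powers of an element must eventually repeat.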

If $H$ is a subgroup of $G$, and $r$ an element of $G$ which is not contained in $H$, then we denote by $rH = R_1$ the set of elements of the form $rx$ for all $x \in H$ and fixed $r$. It is called a left coset of $H$. It has no elements in common with $H$. If there exists a further element $s \in G$ such that $s \notin H$ and $s \notin rH$, then we can construct a further left coset $R_2 = sH$. By continuing this process we can exhaust the entire group $G$ and arrive at a decomposition of $G$ as a union of disjoint subsets,

$$G = H \cup R_1 \cup R_2 \cup \cdots$$

One can do the same thing by using right cosets instead, for instance $R_r = Hr$, etc.; one then obtains a decomposition of $G$ into right cosets.

These two decompositions become identical if every left coset is equal to some right coset and vice versa. The necessary and sufficient condition for this to be the case is that $rH = Hr$ for all $r \in G$. This means that for any $x \in H$ there exists another $y \in H$ such that $rx = yr$. A subgroup with this property is called an invariant subgroup of $G$.

If $H$ is an invariant subgroup, then the cosets are themselves a group. A group operation can be defined between two cosets $R$ and $S$ by selecting any two elements $r \in R$ and $s \in S$. Then the product $rs$ lies in a coset $RS$. This notation is justified if the coset $RS$ does not depend on the choice of $r \in R$ nor on the choice of $s \in S$. For this to be the case, it is necessary and sufficient that $H$ be an invariant subgroup of $G$. The product can then be transferred to the cosets themselves. The group properties are easily verified (Problem 1). The group of cosets with respect to an invariant subgroup $H$ is called the factor group, and is denoted by $G/H$.
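The role of invariance in making the coset product well defined can be checked on a small example. The following sketch assumes the group of all permutations of three objects as the illustrative group $G$; the helper names and the two subgroups `A3` and `K` are choices made here, not notation from the text.

```python
from itertools import permutations

def compose(p, q):
    """Group product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))         # the six permutations of {0, 1, 2}
e = (0, 1, 2)

def left_coset(r, H):
    return frozenset(compose(r, x) for x in H)

def right_coset(r, H):
    return frozenset(compose(x, r) for x in H)

def is_invariant(H):
    """H is invariant if rH = Hr for every r in G."""
    return all(left_coset(r, H) == right_coset(r, H) for r in G)

def coset_product(R, S):
    """All products rs with r in R and s in S."""
    return frozenset(compose(r, s) for r in R for s in S)

A3 = [e, (1, 2, 0), (2, 0, 1)]           # cyclic subgroup of the 3-cycles
K = [e, (1, 0, 2)]                       # subgroup generated by one transposition

cosets_A3 = {left_coset(r, A3) for r in G}
cosets_K = {left_coset(r, K) for r in G}

# For the invariant subgroup the product of two cosets is again a coset.
closed_A3 = all(coset_product(R, S) in cosets_A3
                for R in cosets_A3 for S in cosets_A3)
# For the non-invariant subgroup this fails for at least one pair.
closed_K = all(coset_product(R, S) in cosets_K
               for R in cosets_K for S in cosets_K)

print(is_invariant(A3), closed_A3)
print(is_invariant(K), closed_K)
```

In this example the two cosets of `A3` form a factor group of two elements, while the three left cosets of `K` do not close under multiplication.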

If we have two groups $G_1$ and $G_2$, and if there exists a correspondence $r_1 \to r_2$ between elements $r_1 \in G_1$ and $r_2 \in G_2$ such that $r_1 \to r_2$ and $s_1 \to s_2$ implies $r_1 s_1 \to r_2 s_2$, we have a homomorphism of $G_1$ into $G_2$. The homomorphism is onto if every element of $G_2$ is the image of some element in $G_1$. The group $G_2$ is then the homomorphic image of the group $G_1$.

The group $G_2$ may be considered to reflect some but not all of the properties of $G_1$. It is indistinguishable from $G_1$ as an abstract object if the homomorphism is one-to-one. In that case the two groups are said to be isomorphic.

If the two groups are identical, that is if $G_1 = G_2$, then an isomorphism is called an automorphism (cf. Problems 4 and 5).
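A homomorphism that is onto but not one-to-one can be illustrated as follows. The sketch assumes, as an example chosen here, the parity map from the permutations of three objects onto the two-element group {+1, -1} under multiplication; `sign` and `compose` are hypothetical helper names.

```python
from itertools import permutations

def compose(p, q):
    """Group product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """Parity of a permutation: +1 if the number of inversions is even, else -1."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

G1 = list(permutations(range(3)))

# r1 -> r2 and s1 -> s2 must imply r1 s1 -> r2 s2.
is_homomorphism = all(sign(compose(r, s)) == sign(r) * sign(s)
                      for r in G1 for s in G1)
image = {sign(r) for r in G1}                 # the homomorphic image G2
kernel = [r for r in G1 if sign(r) == 1]      # elements mapped to the neutral element

print(is_homomorphism, sorted(image), len(kernel))
```

The map reflects only the parity of each element, so the image retains some but not all of the structure of the original group.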

PROBLEMS

1. The cosets with respect to an invariant subgroup $H \subset G$ are a group.

2. If $G_1 \to G_2$ is a homomorphism, then the set of elements $H$ which are mapped into the neutral element $e_2 \in G_2$ is an invariant subgroup and there exists a natural isomorphism between $G_1/H$ and $G_2$.

3. The center $Z$ is an invariant subgroup of $G$.

4. The transformation $r \to s r s^{-1}$ ($s \in G$) is an automorphism of $G$ which leaves the center pointwise invariant. This is called an inner automorphism.

5. There are automorphisms which are not inner: The simplest example is the cyclic group of three elements, $a$, $a^2$, $a^3 = e$. The automorphism $a \to a^2$ is not inner.

6. Every abstract group is isomorphic to some transformation group. Such an isomorphism can be implemented by considering the left translation $L_r$ of the group space associated with the element $r$ and defined by

$$L_r : s \to rs.$$

We have then for any pair $r_1, r_2$,

$$L_{r_1} L_{r_2} = L_{r_1 r_2}.$$
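The statement of Problem 6 can be verified directly on a small group. The sketch below assumes the permutations of three objects as the abstract group and represents each left translation as a Python dictionary on the group space; all names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Group product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))

def left_translation(r):
    """L_r as a mapping s -> rs of the group space onto itself."""
    return {s: compose(r, s) for s in G}

def compose_maps(f, g):
    """Product of transformations: (f g)(s) = f(g(s))."""
    return {s: f[g[s]] for s in g}

# L_{r1} L_{r2} = L_{r1 r2} for every pair r1, r2.
product_rule = all(
    compose_maps(left_translation(r1), left_translation(r2))
    == left_translation(compose(r1, r2))
    for r1 in G for r2 in G
)
# Different elements give different translations, so r -> L_r is one-to-one.
distinct = len({tuple(sorted(left_translation(r).items())) for r in G}) == len(G)

print(product_rule, distinct)
```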


9-3. TOPOLOGICAL GROUPS

The abstract groups defined in the preceding section can be endowed with a topology. The group is then simultaneously an abstract group and a topological space. It is then meaningful to consider groups such that the product $rs^{-1}$ of two elements is a continuous function of both of the elements. If this is the case we speak of a topological group [3]. If the topology is the trivial discrete topology, then we speak of a discrete topological group. More interesting are certain continuous groups, called Lie groups, which have a wide range of applications.

a) Lie groups. For Lie groups the topology is locally that of a finite-dimensional linear manifold. This means that for every element $r_0 \in G$ there exists a neighborhood $N_0$ which is the continuous and one-to-one image of some neighborhood of a finite-dimensional Euclidean space. If $n$ is the dimension of this space, then every element $r \in N_0$ can be represented as a set of $n$ real numbers $\rho_1, \rho_2, \ldots, \rho_n$. It suffices to introduce cartesian coordinates in the Euclidean space and use the coordinates of the representative point for $r$ in that space. These coordinates can, of course, be chosen in many different ways. In any case we can write the group relations as a set of functional equations between the parameters. For instance, let $s$ be a second element of the group with the parameters $\sigma_1, \sigma_2, \ldots, \sigma_n$, and assume furthermore that $t = rs \in N_0$; then we can express the law of the group multiplication in the form

$$\tau_i = f_i(\rho_1, \ldots, \rho_n; \sigma_1, \ldots, \sigma_n) \quad (i = 1, 2, \ldots, n), \tag{9-1}$$

where $\tau_1, \tau_2, \ldots, \tau_n$ are the parameters for the element $t = rs$. The $f_i$ are $n$ continuous functions of the $2n$ variables $\rho_i$, $\sigma_j$. For Lie groups the parameters can be chosen in such a way that the $f_i$ are not only continuous but also analytic in a suitable neighborhood. They then admit derivatives to any order. It is natural in this case to study the relation of the local properties of Lie groups to the global properties, that is, the structure of the group in the large. It turns out that the local properties determine to a large extent the properties of the group in the large, and the remaining ambiguity can be completely characterized.

b) Local properties of Lie groups. We adopt the following notation. The labels $r, s, \ldots$ for the group elements are used simultaneously for the set of parameters in some parametrization of the group. Thus $r = \{\rho_1, \rho_2, \ldots, \rho_n\}$. We choose the parameters so that the neutral element has the parameters $e = \{0, 0, \ldots, 0\}$. Equation (9-1) will then appear in the abbreviated form $t = f(r, s)$. The local properties in the neighborhood $N_0$ of an arbitrary element $r_0$ can always be related to the neighborhood of the neutral element by the transformation $r_0^{-1} N_0$. Thus it suffices to study the neighborhood of the identity.


A curve which passes through the neutral element is a continuous function $r(\alpha)$ of the real variable $\alpha$ in some neighborhood of $\alpha = 0$, and such that $r(0) = e$. A curve is a one-parameter subgroup if the parameter can be chosen such that $r(\alpha) r(\beta) = r(\alpha + \beta)$ in some neighborhood of 0. In that case the functions $r(\alpha)$ are analytic and we can define the tangent vector as a vector in Euclidean space with the components $a_i = (d\rho_i/d\alpha)_{\alpha = 0}$.

If we differentiate the equation (9-1) with respect to the parameter $\alpha$, we obtain the differential equation

$$\frac{d\rho_i}{d\alpha} = \sum_{k=1}^{n} a_k \left(\frac{\partial f_i(r, s)}{\partial \sigma_k}\right)_{s = e}. \tag{9-2}$$

Thus every one-parameter subgroup of a Lie group satisfies a differential equation (9-2); and conversely, every solution of this differential equation is a one-parameter subgroup.

Consider two arbitrary local one-parameter subgroups $r(\alpha)$ and $s(\beta)$ with the tangent vectors $a$ and $b$ respectively. We use vector notation. In a particular coordinate system the vector $a$ has components $a_i = (d\rho_i/d\alpha)_{\alpha = 0}$, where $\rho_1, \rho_2, \ldots, \rho_n$ are the coordinates of the point $r$. We define the commutant

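$$q(\alpha; a, b) = \frac{1}{\alpha^2}\, r(\alpha)\, s(\alpha)\, r^{-1}(\alpha)\, s^{-1}(\alpha)$$

and the infinitesimal commutant

$$[a, b] = \lim_{\alpha \to 0} q(\alpha; a, b). \tag{9-3}$$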

This expression is antisymmetrical in its arguments,

$$[a, b] = -[b, a], \tag{9-4}$$

and it satisfies the Jacobi identity

$$[[a, b], c] + [[b, c], a] + [[c, a], b] = 0. \tag{9-5}$$

If the group is abelian, then the bracket $[a, b] = 0$. If it is not abelian, then its value is some sort of measure of the deviation from commutativity in the group. In any case the bracket expression $[a, b]$ is again a tangent vector, and it defines an algebraic structure on the tangent vectors of a Lie group. Such an algebra which satisfies the relations (9-4) and (9-5) is called a Lie algebra or an infinitesimal Lie group.

The structure of a Lie algebra can be expressed explicitly by referring it to a particular coordinate system. Let $a_i$ ($i = 1, \ldots, n$) be a base of tangent vectors, where $a_i$ is the tangent vector in the direction $i$ of the local coordinate axis (not the component of a vector $a$!). Then the explicit evaluation of the commutant gives the following result [3]:

$$[a_i, a_j] = \sum_{k=1}^{n} c^k_{ij}\, a_k, \tag{9-6}$$

where

$$c^k_{ij} = \left(\frac{\partial^2 f_k}{\partial \rho_i\, \partial \sigma_j} - \frac{\partial^2 f_k}{\partial \rho_j\, \partial \sigma_i}\right)_{r = s = e} \tag{9-7}$$

are the structure constants of the Lie algebra. From the expression (9-7) one sees immediately that the structure constants are antisymmetrical in the lower indices,

$$c^k_{ij} = -c^k_{ji}. \tag{9-4'}$$

Furthermore, they satisfy

$$\sum_{m=1}^{n} \left(c^l_{mk}\, c^m_{ij} + c^l_{mi}\, c^m_{jk} + c^l_{mj}\, c^m_{ki}\right) = 0, \tag{9-5'}$$

as a consequence of the Jacobi identity.

In the theory of Lie groups, one proves that the structure constants determine the structure of the group completely in a suitable neighborhood of the neutral element. This is the essential result of the local theory of Lie groups.
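The two conditions on the structure constants, antisymmetry in the lower indices and the quadratic relation that follows from the Jacobi identity, can be checked mechanically for a concrete Lie algebra. The sketch below assumes, as an illustrative choice, the algebra with $c^k_{ij} = \varepsilon_{ijk}$ (the cross-product algebra of three-dimensional vectors, which is the Lie algebra of the rotation group in a suitable basis); the function name `c` and the zero-based indices are conventions of the sketch.

```python
def c(k, i, j):
    """Structure constant c^k_{ij} = epsilon_{ijk} for indices in {0, 1, 2}."""
    # The product below equals 2, -2, or 0, so the integer division is exact.
    return (i - j) * (j - k) * (k - i) // 2

n = 3
idx = range(n)

# Antisymmetry in the lower indices: c^k_{ij} = -c^k_{ji}.
antisymmetric = all(c(k, i, j) == -c(k, j, i)
                    for k in idx for i in idx for j in idx)

# Jacobi relation: sum over m of
#   c^l_{mk} c^m_{ij} + c^l_{mi} c^m_{jk} + c^l_{mj} c^m_{ki}
# vanishes for every choice of i, j, k, l.
jacobi = all(
    sum(c(l, m, k) * c(m, i, j)
        + c(l, m, i) * c(m, j, k)
        + c(l, m, j) * c(m, k, i) for m in idx) == 0
    for i in idx for j in idx for k in idx for l in idx
)

print(antisymmetric, jacobi)
```

Any other candidate set of structure constants can be tested the same way by replacing the body of `c`.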

We turn now to global properties.

c) The global properties of Lie groups. Let $G$ be a Lie group and $r$ and $s$ two arbitrary elements of $G$. These two elements are said to be connected if there exists a continuous curve $x(\alpha) \in G$ ($0 \le \alpha \le 1$) such that $x(0) = r$ and $x(1) = s$. The set of all the elements which are connected with the neutral element $e$ is an invariant subgroup $G_0$ of $G$, called the connected component of $G$ (cf. Problem 2).

A connected group is said to be simply connected if every closed curve in the group can be contracted to a point. This means that if $r(\alpha)$ ($0 \le \alpha \le 1$) is closed ($r(0) = r(1)$), then there exists a continuous function $r(\alpha, \beta)$ of the two variables $\alpha, \beta$ ($0 \le \beta \le 1$) such that

$$r(\alpha, 1) = r(\alpha) \quad \text{and} \quad r(\alpha, 0) = r(0, 0).$$

If this is not the case, then $G_0$ is said to be multiply connected.

According to a fundamental theorem on Lie groups there exists for every connected, multiply-connected Lie group $G_0$ a unique simply-connected universal covering group $G^*$ with the following properties:

1) $G_0$ is the homomorphic image of $G^*$;

2) the kernel of the homomorphism $G^* \to G_0$ is an invariant discrete subgroup $N$ of the center of $G^*$.


This theorem then spells out in detail to what extent the Lie algebra determines the global structure of the Lie group. Every connected Lie group is a factor group of its universal covering group. The structure constants determine the nature of the covering group, but they allow no general conclusion as to the connectedness of the group.

There is one more topological property which is of great importance in the applications of groups to quantum mechanics. A topological group, seen as a topological space, may be either compact or noncompact. A topological space is compact if every infinite sequence has a convergent subsequence. Since the structure constants determine the covering group completely, it must be possible to determine whether this group is compact or not from an examination of these constants alone; this is indeed the case. We shall be content with this general remark here. For greater details we refer the reader to reference 3.

PROBLEMS

1. The group of the real numbers (called the real line) consists of all real numbers under the group operation of addition. It is a one-parameter abelian group. Its structure constants are identically zero. The group is connected, simply connected, and noncompact.

2. The connected component $G_0$ of a Lie group $G$ is an invariant subgroup of $G$.

3. The circle group consists of the real numbers $\rho$ ($0 \le \rho < 2\pi$) under the operation of addition modulo $2\pi$. Its Lie algebra is identical with that of the real line. The group is connected, multiply connected, and compact.

4. The real line is the universal covering group $G^*$ of the circle group $G$. The kernel of the homomorphism $G^* \to G$ is the cyclic group of infinite order.

5. The Euclidean motions in a two-dimensional plane are a three-parameter nonabelian group. It is connected, multiply connected, and noncompact.

6. The rotations of three-dimensional Euclidean space are a three-parameter Lie group. It is nonabelian, connected, doubly connected, and compact. The kernel of the homomorphism $G^* \to G_0$ is the cyclic group $Z_2$ of order 2 (cf. Section 13-3, especially Problem 3).

7. The rotation-reflection group of the three-dimensional Euclidean space consists of two disconnected components.
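Problems 3 and 4 can be explored numerically. The sketch below assumes the circle group is realized as the real numbers modulo $2\pi$ and compares angles through their images on the unit circle in order to tolerate rounding errors; the function names and the sample values are illustrative.

```python
import cmath
import math

TWO_PI = 2 * math.pi

def to_circle(x):
    """Homomorphism from the real line onto the circle group: x -> x mod 2*pi."""
    return x % TWO_PI

def circle_add(p, q):
    """Group operation of the circle group: addition modulo 2*pi."""
    return (p + q) % TWO_PI

def same_point(p, q, tol=1e-9):
    """Compare two angles as points of the circle, tolerant of rounding."""
    return abs(cmath.exp(1j * p) - cmath.exp(1j * q)) < tol

xs = [-7.5, -1.0, 0.0, 2.5, 10.0]

# The image of a sum is the circle-group product of the images.
is_homomorphism = all(
    same_point(to_circle(x + y), circle_add(to_circle(x), to_circle(y)))
    for x in xs for y in xs
)
# The integer multiples of 2*pi are mapped to the neutral element.
kernel_ok = all(same_point(to_circle(TWO_PI * k), 0.0) for k in range(-3, 4))

print(is_homomorphism, kernel_ok)
```

The kernel found here, the integer multiples of $2\pi$, is an infinite cyclic group generated by the single element $2\pi$.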

9-4. THE AUTOMORPHISMS OF A PROPOSITION SYSTEM

As we have seen, by way of illustration of the notion of symmetry in Section 9-1, there exists a group of automorphisms of a lattice $\mathcal{L}$. Automorphisms $m$ are one-to-one mappings of $\mathcal{L}$ onto itself with the properties

1) $a \subseteq b$ if and only if $m(a) \subseteq m(b)$,

2) $m(a') = m(a)'$.


It follows immediately that an automorphism admits an inverse $m^{-1}$ which is also an automorphism. If $a_1 = m(a)$, then the inverse $m^{-1}$ is defined by $m^{-1}(a_1) = a$. Thus if $a_1 \subseteq b_1$, then by property (1), $m^{-1}(a_1) \subseteq m^{-1}(b_1)$. Property (2) read from right to left then gives $m^{-1}(a_1') = m^{-1}(a_1)'$.

We shall now show that an automorphism leaves not only the partial ordering intact but the entire lattice structure invariant. We prove first:

Lemma 1: $m(\bigcup_i a_i) = \bigcup_i m(a_i)$ and $m(\bigcap_i a_i) = \bigcap_i m(a_i)$.

Proof: We have, from the definition of union, $a_i \subseteq \bigcup_i a_i$ for all $i$. Thus by property (1) it follows that $m(a_i) \subseteq m(\bigcup_i a_i)$, and by the definition of the union we obtain $m(a_i) \subseteq \bigcup_i m(a_i) \subseteq m(\bigcup_i a_i)$. From the first of these two inclusion relations we obtain $a_i \subseteq m^{-1}(\bigcup_i m(a_i))$, by observing that $m^{-1}$ is also an automorphism. Consequently, by taking unions and applying $m$ again,

$$m\Big(\bigcup_i a_i\Big) \subseteq \bigcup_i m(a_i).$$

Comparing this with the previous inclusion in the reverse sense, we conclude

$$m\Big(\bigcup_i a_i\Big) = \bigcup_i m(a_i). \quad \text{Q.E.D.}$$

The proof of the second assertion of the lemma is similar.

From Lemma 1 it follows immediately that $m(\emptyset) = \emptyset$ and $m(I) = I$. Furthermore, if $a$ is compatible with $b$, $a \leftrightarrow b$, then $m(a) \leftrightarrow m(b)$ and conversely. To see this, it suffices to remark that compatibility is expressible entirely with the relations $\cup$ and $\cap$. These relations are invariant under an automorphism $m$, and so is compatibility. The converse follows again from the fact that $m^{-1}$ is also an automorphism.

Another corollary is that a point $e$ is transformed by an automorphism into a point $m(e)$. Suppose indeed that $x \subset m(e)$. It follows then that $m^{-1}(x) \subset e$, and since $e$ is a point, $m^{-1}(x) = \emptyset$. This implies that $x = m(\emptyset) = \emptyset$. Thus $m(e)$ is a point.

Lemma 2: An automorphism is completely determined by its restriction to the points.

Proof: The assertion is a direct consequence of the fact that every proposition is the union of its points. Thus if $a = \bigcup_i e_i$, then $m(a) = m(\bigcup_i e_i) = \bigcup_i m(e_i)$, by Lemma 1. This proves Lemma 2.
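Lemmas 1 and 2 can be illustrated on the simplest kind of lattice. The sketch below assumes a Boolean lattice, namely all subsets of a three-point set with set complement as the orthocomplement, and an automorphism defined only by a permutation of the points; the lattice, the permutation `perm`, and the helper names are illustrative choices and not taken from the text.

```python
from itertools import combinations

points = (0, 1, 2)
I = frozenset(points)                       # the greatest element
lattice = [frozenset(c) for k in range(len(points) + 1)
           for c in combinations(points, k)]

def complement(a):
    """Orthocomplement a' in this Boolean lattice."""
    return I - a

perm = {0: 1, 1: 2, 2: 0}                   # hypothetical action on the points

def m(a):
    """Automorphism determined by its restriction to the points (Lemma 2)."""
    return frozenset(perm[e] for e in a)

# Properties (1) and (2) of an automorphism.
order_ok = all((a <= b) == (m(a) <= m(b)) for a in lattice for b in lattice)
complement_ok = all(m(complement(a)) == complement(m(a)) for a in lattice)

# Lemma 1 for pairs: unions and intersections are preserved.
lemma1_ok = all(m(a | b) == m(a) | m(b) and m(a & b) == m(a) & m(b)
                for a in lattice for b in lattice)

# Consequences: the least and greatest elements are fixed, points go to points.
bounds_ok = m(frozenset()) == frozenset() and m(I) == I
points_ok = all(len(m(frozenset([e]))) == 1 for e in points)

print(order_ok, complement_ok, lemma1_ok, bounds_ok, points_ok)
```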

In Section 8-2 we defined the central cover of a point $e$ as the union of all the points $e_i$ which are coherent with $e$: $z = \bigcup_i e_i$. Let us denote by a coherent component the segment $[\emptyset, z]$, consisting of all elements $x$ of the lattice which satisfy $\emptyset \subseteq x \subseteq z$. What happens to the coherent components under an automorphism? The answer is contained in Lemma 3.


Lemma 3: The image of a coherent component under an automorphism is again a coherent component.

Proof: If $z$ is the union of coherent points, then $m(z)$ is also a union of coherent points, since coherence is invariant under an automorphism. This proves Lemma 3.

We have now obtained the following characterization of an automorphism of a lattice: Let $\mathcal{L}$ be a lattice and $\mathcal{L}_\alpha$ its (uniquely determined) coherent components. An automorphism induces a permutation of the coherent components. If $m(\alpha)$ is the index of the image of the component with index $\alpha$, then the automorphism induces a morphism between the lattice $\mathcal{L}_\alpha$ and the lattice $\mathcal{L}_{m(\alpha)}$.

At this point we obtain a further characterization of automorphisms if we make use of the fundamental theorem of projective geometry. We have seen that every projective geometry admits a representation as subspaces of a linear vector space $V$ with coefficients over a field $F$. Let $u_1 \in V_1$; we say the correspondence $Su_1 = u_2 \in V_2$ of the vector space $V_1$ onto $V_2$ is a semilinear transformation if $S(u_1 + v_1) = Su_1 + Sv_1$ and $S(fu) = f^S\, Su$ for every element $f \in F$, where $f^S$ is the image of $f$ under an automorphism of the field $F$. It is clear that every semilinear transformation of $V_1$ onto $V_2$ induces a morphism which maps the lattice of subspaces of $V_1$ onto the lattice of subspaces of $V_2$. The fundamental theorem of projective geometry affirms the converse: Every morphism of $\mathcal{L}_1$ onto $\mathcal{L}_2$ is induced by a semilinear transformation of $V_1$ onto $V_2$, provided the dimension of $V_1$ and $V_2$ is at least three [4]. With this we have arrived at the following fundamental theorem.

Theorem (Wigner): Every automorphism $m$ of a lattice of propositions represented by a family of vector spaces of dimension at least three maps every coherent space onto another one by a semilinear transformation [7], [8], [9].

In the special case of quantum mechanics, the vector spaces are all Hilbert spaces over the complex numbers. The only continuous automorphisms of the complex numbers are the identity and the complex conjugation. We speak then of linear or antilinear transformations of the vector space $V$.

PROBLEMS

1. An automorphism of a lattice always has at least two fixed elements.

2. If $a$ and $b$ are fixed elements of an automorphism $m$ of $\mathcal{L}$, then the segment $[a, b]$ is invariant under $m$. (The segment is defined as the set of elements $x$ such that $a \subseteq x \subseteq b$.)

3. If an automorphism $m$ of subspaces in a complex Hilbert space is a square of another automorphism, then $m$ is induced by a linear transformation of the Hilbert space.


9-5. TRANSFORMATION OF STATES

Every automorphism in a lattice $\mathcal{L}$ induces a transformation of the states on $\mathcal{L}$. It is defined as follows: Let $a \to m(a)$ be an automorphism of $\mathcal{L}$ and define $p^m(a) \equiv p(m^{-1}(a))$. Then we can easily verify that $p^m(a)$ is also a state (Problem 1).

The important application of this realization of the automorphisms of $\mathcal{L}$ is the definition of a topology on the group $M$ of automorphisms.

For every $\varepsilon > 0$ we can define an $\varepsilon$-neighborhood $N_\varepsilon$ of the identity $e \in M$ by setting

$$N_\varepsilon = \{m : |p^m(a) - p(a)| < \varepsilon \text{ for all } a \text{ and all } p\}.$$

This system of neighborhoods satisfies the conditions of Theorem 10 in reference 3, so that this theorem is applicable and we have thereby defined a topological group $M$ of automorphisms.

It now becomes meaningful to speak of connected automorphisms. Two automorphisms $m_1$ and $m_2$ are said to be connected if there exists a continuous mapping $m(\alpha)$ from the interval $[0, 1]$ into $M$ such that $m(0) = m_1$ and $m(1) = m_2$.

The elements $m \in M$ which are connected with the identity $e \in M$ are an invariant subgroup $M_0 \subset M$, called the connected component of $M$ (Problem 2).

We can now use continuity arguments to prove the following:

Theorem: If $\mathcal{L}$ is the discrete direct union of several coherent sublattices $\mathcal{L}_\alpha$, then every automorphism $m$ which is connected to the identity transforms each $\mathcal{L}_\alpha$ onto itself (Problem 3).

The topology of the group $M$ of automorphisms is involved too when we speak about the (projective) representation of a group $G$. We define: A continuous homomorphism of a topological group $G$ into $M$ is called a projective representation of $G$.

Let $r \to U_r$ be a representation of a group $G$ with elements $r$ in the automorphisms $M$ of a lattice $\mathcal{L}$. We shall say that the pair $(\mathcal{L}, U)$ is an elementary system if

$$U_r a = a \text{ for all } r \quad \text{implies} \quad a = \emptyset \text{ or } a = I.$$

It is then easy to prove (Problem 5): Every elementary system with respect to a connected group $G$ is necessarily coherent.

PROBLEMS

1. If $p(a)$ is a state on the lattice $\mathcal{L}$ and $a \to m(a)$ is an automorphism of $\mathcal{L}$, then $p(m^{-1}(a))$ is also a state.

2. The connected component $M_0$ of the automorphisms $M$ of a lattice is an invariant subgroup of $M$.

*3. If $\mathcal{L}$ is a finite direct union of irreducible lattices $\mathcal{L}_i$, then an automorphism $m$ which is connected with the identity maps each $\mathcal{L}_i$ onto itself ([6], Theorem 1.2).

4. If the state $p$ is pure, then $p^m$ is pure, too.

*5. If $G$ is a connected group and $\mathcal{L}$ an elementary system with respect to $G$, then $\mathcal{L}$ is necessarily coherent [6].

9-6. PROJECTIVE REPRESENTATION OF GROUPS

From now on we shall concentrate on the projective representations of symmetry groups in Hilbert space and we shall study the representation in a connected component of the lattice of subspaces.

We have thus a group $G$ and a homomorphism $r \to U_r$ into the automorphisms $M$ of the lattice of subspaces in some Hilbert space. Every such automorphism is induced by a semilinear transformation which we denote also by $U_r$. $U_r$ is determined only up to a numerical factor. By choosing this factor conveniently, we can always make $U_r$ unitary or antiunitary. If we concentrate first on the connected component of a Lie group we can even assume that $U_r$ is unitary. Under these conditions the operator $U_r$ is only determined up to an arbitrary phase factor. The composition law for the representation of the group $G$ may therefore be written in the form

$$U_r U_s = \omega(r, s)\, U_{rs}, \tag{9-8}$$

where $\omega(r, s)$ is a complex number of magnitude 1.

The problem of the projective representation of a symmetry group is thus to find a continuous mapping of $G$ into the group of unitary operators $U_r$ which satisfies a relation such as Eq. (9-8).

It is useful to point out here that the exact values of the numbers $\omega(r, s)$ have no significance because the operators $U_r$ are, as we have said, only determined up to phase factors of magnitude 1. It is therefore appropriate to introduce the notion of equivalent phase factors.

If we carry out the transformation $U_r \to U_r' = \phi(r)\, U_r$ ($|\phi(r)| = 1$), then the new operators satisfy $U_r' U_s' = \omega'(r, s)\, U_{rs}'$ with

$$\omega'(r, s) = \omega(r, s)\, \frac{\phi(r)\, \phi(s)}{\phi(rs)}. \tag{9-9}$$

Two factors which are related to one another by the relation (9-9) are said to be equivalent. We easily verify that this is indeed an equivalence relation, and we shall write $\omega \sim \omega'$ (Problem 1). If every factor is $\sim 1$, the factor $\omega$ is irrelevant and can be removed. In this case and in this case only, it is possible to reduce the problem of the projective representations to that of a vector representation. One of the major problems in the theory of projective representations of a group is to determine all the classes of inequivalent phase factors.
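The effect of a change of phase on the factor can be written out in one line. The following is a sketch under the assumption that the composition law (9-8) reads $U_r U_s = \omega(r, s)\, U_{rs}$, which is the ordering consistent with the associativity condition (9-10); the function $\phi$ denotes the phase introduced in the transformation $U_r \to U_r'$.

```latex
\begin{align*}
U_r'\, U_s' &= \phi(r)\,\phi(s)\, U_r U_s \\
            &= \phi(r)\,\phi(s)\,\omega(r,s)\, U_{rs} \\
            &= \frac{\phi(r)\,\phi(s)}{\phi(rs)}\,\omega(r,s)\, U_{rs}' ,
\qquad\text{so that}\qquad
\omega'(r,s) = \omega(r,s)\,\frac{\phi(r)\,\phi(s)}{\phi(rs)} .
\end{align*}
```

The last step uses $U_{rs} = \phi(rs)^{-1}\, U_{rs}'$, and the new factor again has magnitude 1 because $|\phi| = 1$.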

The functions $\omega(r, s)$ are not entirely arbitrary, since $G$ is a group and satisfies the associative law. If we consider three elements $r$, $s$, and $t \in G$, and we write the equation $(rs)t = r(st)$ in the projective representation, then we find

$$\omega(r, s)\, \omega(rs, t) = \omega(s, t)\, \omega(r, st). \tag{9-10}$$

We can furthermore introduce a normalization of an arbitrary overall phase factor by assuming $U_e = I$. From this it follows that

$$1 = \omega(r, e) = \omega(e, r) \quad \text{for all } r \in G. \tag{9-11}$$

Every solution of these two equations furnishes us a possible phase factor. The problem is thus reduced to finding all the equivalence classes of complex-valued functions $\omega(r, s)$ of magnitude 1 which satisfy conditions (9-10) and (9-11).
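The two conditions can be derived explicitly. The lines below are a sketch that assumes the composition law in the form $U_r U_s = \omega(r, s)\, U_{rs}$ and evaluates the triple product $U_r U_s U_t$ with both placements of the parentheses.

```latex
\begin{align*}
(U_r U_s)\, U_t &= \omega(r,s)\, U_{rs}\, U_t = \omega(r,s)\,\omega(rs,t)\, U_{rst}, \\
U_r\, (U_s U_t) &= \omega(s,t)\, U_r\, U_{st} = \omega(s,t)\,\omega(r,st)\, U_{rst}.
\end{align*}
```

Since operator multiplication is associative, the two right-hand sides are equal, which is the condition on the factors stated for three elements $r$, $s$, $t$. With the normalization $U_e = I$ one has $U_r U_e = U_r = \omega(r, e)\, U_r$ and $U_e U_r = U_r = \omega(e, r)\, U_r$, so that $\omega(r, e) = \omega(e, r) = 1$.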

The problem can be broken up into two others, a local one and a global one. The local problem deals with the classes of inequivalent phase factors in a neighborhood of the identity. The global problem deals with the extension of these factors to the entire group.

In the theory of the Lie groups, the determination of all local factors reduces to a relatively simple problem in vector algebra. Here we shall give merely a sketch of the solution of this problem. For greater details we refer the reader to references 2 and 8.

One first shows that within the class of equivalent factors it is always possible to choose a canonical factor such that all one-parameter subgroups are represented as vector representations. Thus for such a factor a one-parameter subgroup $r(\alpha)$, where $r(\alpha_1)\, r(\alpha_2) = r(\alpha_1 + \alpha_2)$, would be represented by unitary operators $r(\alpha) \to U_\alpha$ such that $U_{\alpha_1} U_{\alpha_2} = U_{\alpha_1 + \alpha_2}$. Moreover, the representation is continuous, and we can therefore, via Stone's theorem, associate with every one-parameter subgroup a unique self-adjoint operator $A$ through the relation $U_\alpha = e^{i\alpha A}$. We may regard the self-adjoint operator $A$ as the representation of the tangent vector $a$ corresponding to the one-parameter subgroup $r(\alpha)$. If $r \to U_r$ were a true vector representation for the group, then one could show that the Lie algebra for the tangent vectors is faithfully represented by the self-adjoint operators $A$ in the following sense: To a complete family of tangent vectors $a_i$ there corresponds a complete set of self-adjoint operators $A_i$ such that to the Lie bracket $[a_i, a_j]$ of two tangent vectors there corresponds the commutator of $A_i$ and $A_j$ multiplied with $i$. Thus, if the Lie algebra has the basic structure equation

$$[a_i, a_j] = \sum_{m=1}^{n} c^m_{ij}\, a_m, \tag{9-12}$$

then the self-adjoint operators $A_i$ corresponding to $a_i$ satisfy commutator relations

$$i[A_i, A_j] = \sum_{m=1}^{n} c^m_{ij}\, A_m. \tag{9-12'}$$

However, if we are dealing with a projective representation, then we obtain, by differentiating Eq. (9-8), a commutator relation such as

$$i[A_i, A_j] = \sum_{m=1}^{n} c^m_{ij}\, A_m + \beta_{ij}\, I. \tag{9-13}$$

It differs from the relation (9-12)' by the additional terms on the right-hand side. These terms come from the phase factors in Eq. (9-8). In fact, one finds

$$\beta_{ij} = \left(\frac{\partial^2 \xi(r, s)}{\partial \rho_i\, \partial \sigma_j} - \frac{\partial^2 \xi(r, s)}{\partial \rho_j\, \partial \sigma_i}\right)_{r = s = e},$$

where $\omega(r, s) = e^{i\xi(r, s)}$. We see from this expression, as well as from Eq. (9-13), that the $\beta_{ij}$ are antisymmetrical:

$$\beta_{ij} = -\beta_{ji}. \tag{9-14}$$

From the associative law, for instance by differentiating Eq. (9-10), we further obtain the relation

$$\sum_{m=1}^{n} \left(c^m_{ij}\, \beta_{mk} + c^m_{jk}\, \beta_{mi} + c^m_{ki}\, \beta_{mj}\right) = 0. \tag{9-15}$$

Every solution of the identities (9-14) and (9-15) will be called a local factor. Every local factor can be extended by integration to a factor in a suitable neighborhood of the identity.

It follows readily from the Jacobi identity (9-5) that, if $\beta_{ij}$ is a solution of the relations (9-14) and (9-15), then

$$\beta_{ij}' = \beta_{ij} + \sum_{k=1}^{n} c^k_{ij}\, \lambda_k \quad \text{with } \lambda_k \text{ real}, \tag{9-16}$$

is again such a solution. The corresponding factors are in the same equivalence class, and any two factors in the same equivalence class lead to local factors related by a set of equations (9-16) with some real constants $\lambda_k$.

From these results it is easily seen that the equivalence classes of local factors form a linear vector space. If the structure constants are identically zero (abelian groups), then this vector space has the dimension $n^*$ of the antisymmetrical matrices of $n$ dimensions, that is $n^* = \tfrac{1}{2} n(n - 1)$. If the structure constants are different from zero, then this dimension is reduced by the dimension of the antisymmetrical matrices spanned by the structure constants. Thus we have generally $n^* \le \tfrac{1}{2} n(n - 1)$.


The determination of all local factors is thus obtained as the solution of a set of linear equations.

By a suitable choice of the parameters on the group, the factors in the neighborhood of the neutral element may be written in the form

$$\omega(r, s) = e^{i\xi(r, s)}$$

with

$$\xi(r, s) = \tfrac{1}{2} \sum_{i,j} \beta_{ij}\, \rho_i\, \sigma_j. \tag{9-17}$$

The determination of the factors on the entire group is easy when the group is simply connected, because of the following result due to Bargmann: On a simply connected group every local factor determines a factor on the entire group. Moreover, in every class of equivalent factors there exist continuous and differentiable factors [9]. With this result the determination of the projective representations of simply connected Lie groups is completed.

PROBLEMS

1. The formula

$$\omega'(r, s) = \omega(r, s)\, \frac{\phi(r)\, \phi(s)}{\phi(rs)}$$

establishes an equivalence relation between two factors of projective representations of a group.

2. For a one-parameter abelian group, every projective representation is equivalent to a vector representation.

3. Consider the translation group of the Euclidean plane, described by pairs of real numbers $(\alpha, \beta)$ with the composition law (vector addition)

$$(\alpha_1, \beta_1)(\alpha_2, \beta_2) = (\alpha_1 + \alpha_2, \beta_1 + \beta_2).$$

This group is abelian, connected, simply connected, and noncompact, and of dimension $n = 2$.

There exists a one-parameter family of nonequivalent classes of factors. Let $(\alpha, \beta) \to W(\alpha, \beta)$ be any of the projective representations, so that

$$W(\alpha_1, \beta_1)\, W(\alpha_2, \beta_2) = \omega(\alpha_1, \alpha_2; \beta_1, \beta_2)\, W(\alpha_1 + \alpha_2, \beta_1 + \beta_2).$$

In each of the equivalence classes we can then choose a factor of the form

$$\omega(\alpha_1, \alpha_2; \beta_1, \beta_2) = e^{(i\kappa/2)(\alpha_1 \beta_2 - \alpha_2 \beta_1)}$$

with $\kappa$ some arbitrary real number.


4. Consider the rotation group in three dimensions. It is nonabelian, connected, doubly connected, and compact. Its dimension is $n = 3$. The three infinitesimal rotations along the three orthogonal axes are a complete system of tangent vectors. They satisfy the commutation rules

$$[A_1, A_2] = i(A_3 + \beta_3 I), \quad [A_2, A_3] = i(A_1 + \beta_1 I), \quad [A_3, A_1] = i(A_2 + \beta_2 I).$$

Show, by a suitable choice of local factors, that every local factor is equivalent to the factor 0, so that every projective representation of the rotation group is locally equivalent to a vector representation.
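The factor of Problem 3 can be tested numerically against the associativity and normalization conditions for factors. The sketch below assumes the factor has the form $\exp[(i\kappa/2)(\alpha_1\beta_2 - \alpha_2\beta_1)]$; the value of `KAPPA`, the sample points, and the function names are arbitrary illustrative choices.

```python
import cmath
import itertools

KAPPA = 0.7  # hypothetical value of the real parameter kappa

def mul(r, s):
    """Composition law of the translation group: vector addition."""
    return (r[0] + s[0], r[1] + s[1])

def omega(r, s, kappa=KAPPA):
    """Assumed factor exp[(i*kappa/2) * (a1*b2 - a2*b1)] for r=(a1,b1), s=(a2,b2)."""
    (a1, b1), (a2, b2) = r, s
    return cmath.exp(0.5j * kappa * (a1 * b2 - a2 * b1))

e = (0.0, 0.0)
samples = [(0.3, -1.2), (2.0, 0.5), (-0.7, 1.1), e]

# Associativity condition: w(r,s) w(rs,t) = w(s,t) w(r,st).
cocycle_ok = all(
    abs(omega(r, s) * omega(mul(r, s), t)
        - omega(s, t) * omega(r, mul(s, t))) < 1e-12
    for r, s, t in itertools.product(samples, repeat=3)
)
# Normalization: w(r,e) = w(e,r) = 1.
normalized = all(abs(omega(r, e) - 1) < 1e-12 and abs(omega(e, r) - 1) < 1e-12
                 for r in samples)
# The factor is not symmetric in its two arguments.
asymmetric = abs(omega(samples[0], samples[1]) - omega(samples[1], samples[0])) > 1e-6

print(cocycle_ok, normalized, asymmetric)
```

Because the group is abelian, the ratio $\omega(r, s)/\omega(s, r)$ is unchanged when every operator is multiplied by a phase, so a factor that is not symmetric in $r$ and $s$ cannot be equivalent to the trivial factor 1.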

REFERENCES

All the essentials of group theory needed for this chapter can be found in a context of physical applications in:

1. E. P. WIGNER, Group Theory and its Application to Quantum Mechanics of Atomic Spectra. New York: Academic Press (1959).

2. M. HAMERMESH, Group Theory and its Application to Physical Problems. Reading, Mass.: Addison-Wesley (1962).

For the theory of topological groups, the standard reference is

3. L. PONTRYAGIN, Topological Groups. Princeton: Princeton University Press (1958).

The fundamental theorem of projective geometry which is used here is found, for instance, in:

4. E. ARTIN, Geometric Algebra. New York: Academic Press (1957); Theorem 2.26, p. 88.

Wigner's theorem has been proved many times by different methods; see the following:

5. V. BARGMANN, J. of Math. Phys. 5, 862 (1964).

6. G. EMCH, Helv. Phys. Acta 36, 739, 770 (1963).

7. G. EMCH AND C. PIRON, J. of Math. Phys. 4, 469 (1963).

A complete list and critical evaluation of all the references (up to 1963) concerning Wigner's theorem are found in:

8. U. UHLHORN, Arkiv Fysik 23, 307 (1963).

9. V. BARGMANN, Ann. of Math. 59, 1 (1954).


CHAPTER 10

THE DYNAMICAL STRUCTURE

The objective world simply is, it does not happen. Only to the gaze of my consciousness, crawling up along the life-line of my body, does a section of this world come to life as a fleeting image in space which continually changes in time.

H. WEYL

The dynamical structure of a physical system contains the law which governsthe time evolution of the states (Section 10-1). For the conservative systemthis law is a homomorphism of the additive group of real numbers into theautomorphisms of a proposition system. The application of results fromprevious chapters readily permits us to derive, without any further assump-tions, the Schrodinger equation (Section 10-2) which we cast into the differentbut equivalent forms corresponding to the Schrodinger, Heisenberg, and Diracpictures (Section 10-3). The last section (10-4) contains some remarksconcerning the dynamical law for nonconservative systems.

10-1. THE TIME EVOLUTION OF A SYSTEM

Until now we have considered merely the kinematic aspect of quantum physical systems. This aspect refers to properties which can be measured at one particular instant of time. We have so far completely ignored the time evolution of the state of a system. This time evolution contains the dynamical aspects of the system.

It is perhaps not superfluous to point out that the reference to the properties at "one instant of time" is an abstraction which can never be completely satisfied for real observations or real physical systems. All measurements take time; some measurements may even take a very long time. For such measurements the notion of the "state at a given instant" or similar notions (which refer to a certain definite value of time) may become very difficult to define. In such cases the practical distinction between the kinematic and dynamical aspects of the system may be obscured. Nevertheless this distinction can be maintained as an idealization.


A second difficulty in the application of this notion is encountered in relativistic quantum mechanics where it is not possible to give an absolute meaning to the notion of simultaneity of distant events. This is one of the difficulties of a relativistic quantum mechanics for nontrivial systems, and it can only be dealt with within the framework of a field theory where time is a local parameter and not an overall variable as in the nonrelativistic case. This is one of the reasons why we are forced to restrict our discussion to nonrelativistic physical systems.

The dynamical property of a physical system expresses itself as a transformation of the state at some time $t = 0$ to the state at some other time $t \neq 0$. What can we say about the nature of this transformation?

In the classical mechanics of conservative systems, the state evolves in accordance with the solution of some first-order differential equation in the canonical variables. For such systems, the state at one instant of time determines the state at any other instant. It is not unreasonable to assume a similar behavior for quantum mechanical systems, modified only with respect to the quantum mechanical meaning of "state" as given in Section 6-1.

We expect, then, that any given state $p = p_0$ at time $t = 0$ will uniquely determine another state $p_t$ at time $t \neq 0$, and that the transformation $p \to p_t$ is continuous in the topology induced by the states.

We shall speak of conservative systems if the transformation $p \to p_t$ does not depend on the value of the initial time (which we have here chosen to be $t = 0$). For such systems we may say more explicitly: The correspondence $p_t \to p_{t+\tau}$ depends only on $\tau$ and not on $t$.

This general property does not yet sufficiently determine the character of the time evolution of quantum systems. Just as a continuous curve in phase space is a much more general object than the solution of a differential equation, so it is with the time evolution of states in quantum systems: Continuity alone gives very little information as to the character of the transformation $p \to p_t$.

We therefore supplement continuity with a very important and far-reaching further assumption: The time evolution of a physical system is induced by a symmetry transformation of the proposition system.

Let us discuss what this assumption means from the physical point of view. If time evolution is a symmetry transformation, then the physical structure of the proposition system is indistinguishable at two different values of the time. This expresses the homogeneity of time. For isolated systems this property is equivalent to the existence of immutable dynamical laws. With such systems it is not possible to determine, by physical measurements, an absolute value of the time; only time differences are accessible to measurements. Whether there are systems which are, in this sense, homogeneous in time is of course a matter of experience, and it is indeed one of the fundamental experiences about the physical world that this is the case.


These considerations lead us therefore to the following formulation of the basic dynamical law for a conservative quantum system:

The evolution $p \to p_t$ of the state is induced by a continuous symmetry transformation of the lattice of propositions. This means that there exists an automorphism $a \to a_t$ of the lattice $\mathscr{L}$ of propositions, such that

$$p_t(a_t) = p(a).$$

Moreover, this automorphism is a continuous function of the parameter $t$.

We should emphasize here that this specification of the time evolution is certainly correct for a certain class of systems, those which correspond classically to the conservative systems; but it need not be true for all systems. Indeed, systems which during their evolution are subject to external influences may deviate from the behavior of conservative systems in two important respects: They may evolve in a manner which depends on the absolute value of the time, and their evolution may not necessarily be induced by a symmetry transformation of the proposition system. The second of these modifications has hardly been considered; yet it is possible to give quite elementary examples where this behavior must certainly occur (cf. Section 11-9). It is expected to occur always when the quantal nature of the external interacting system is not negligible.

We shall here be primarily concerned with the behavior of conservative systems which comprise a very large and important class. For such systems the description can be made much more specific.

10-2. THE DYNAMICAL GROUP

We shall now assume that we are dealing with a conservative system described by a proposition system which can be represented as the subspaces in a Hilbert space. The time evolution of such a system is described by a continuous homomorphism of the one-parameter group of the real numbers into the automorphisms of the proposition system. According to the discussion in the preceding chapter, such a homomorphism can be induced by a unitary vector transformation of Hilbert space which maps the real line continuously into a one-parameter group of unitary operators $U_t$ ($-\infty < t < +\infty$):

$$U_{t_1 + t_2} = U_{t_1} U_{t_2} \quad \text{and} \quad U_t^{-1} = U_{-t}, \tag{10-1}$$

and $U_t$ is a continuous function of the real variable $t$. Under this group, observables, and in particular projections, transform according to

$$A \to U_t A U_t^{-1}. \tag{10-2}$$

The transformed state $p_t$ is then defined by

$$p_t(U_t A U_t^{-1}) = p(A). \tag{10-3}$$


In Hilbert space this becomes

$$\mathrm{Tr}\,(W_t U_t A U_t^{-1}) = \mathrm{Tr}\,(WA). \tag{10-4}$$

Since this relation must be true for all $A$, in particular for all projections, we obtain, by using the cyclic invariance of the trace,

$$W_t = U_t W U_t^{-1}. \tag{10-5}$$

In particular if the state is pure and if it is represented by a state vector $\psi$ in the range of $W$, then the time evolution $\psi_t$ of the state is given by the unitary transformation

$$\psi_t = U_t \psi. \tag{10-6}$$

We call the particular homomorphism $t \to U_t$ the dynamical group of the system. The dynamical group is thus more than a mere abstract group. As an abstract group, it would merely be the additive group of real numbers. As a homomorphism it is a particular one-parameter family of unitary transformations of Hilbert space. We can further characterize the dynamical group by expressing it in infinitesimal form: The linear manifold of vectors $\psi \in \mathscr{H}$ for which

$$\lim_{t \to 0} i\,\frac{U_t - I}{t}\,\psi = H\psi \tag{10-7}$$

exists is the domain of a self-adjoint linear operator $H$ defined by the operation on the left-hand side of Eq. (10-7) (cf. Problem 1). The operator $H$ is called the evolution operator or Hamiltonian of the system. With this operator we may write, according to Stone's theorem, $U_t = e^{-iHt}$.
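A small numerical check may help make Eq. (10-7) and Stone's theorem concrete. The sketch below assumes a two-level system with an arbitrarily chosen Hermitian matrix as Hamiltonian (the numbers and helper names are illustrative and not taken from the text). It builds $e^{-iHt}$ in closed form and tests the group law of Eq. (10-1) and the limit of Eq. (10-7).

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_diff(a, b):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Assumed example: H = H0*I + HX*sigma_x + HY*sigma_y + HZ*sigma_z
H0, HX, HY, HZ = 0.5, 1.0, -0.3, 0.7
H = [[H0 + HZ, HX - 1j * HY],
     [HX + 1j * HY, H0 - HZ]]

def U(t):
    """Closed form of exp(-iHt) for the 2x2 Hermitian H above."""
    r = math.sqrt(HX**2 + HY**2 + HZ**2)
    phase = complex(math.cos(H0 * t), -math.sin(H0 * t))
    c, s = math.cos(r * t), math.sin(r * t) / r
    # exp(-iHt) = e^{-i H0 t} (cos(rt) I - i sin(rt) (h.sigma)/r)
    return [[phase * (c - 1j * s * HZ), phase * (-1j * s * (HX - 1j * HY))],
            [phase * (-1j * s * (HX + 1j * HY)), phase * (c + 1j * s * HZ)]]

# Group law (10-1): U_{t1+t2} = U_{t1} U_{t2}
group_error = max_diff(U(0.4 + 1.1), matmul(U(0.4), U(1.1)))

# Generator (10-7): i (U_t - I)/t approaches H as t -> 0
t = 1e-6
Ut = U(t)
identity = [[1, 0], [0, 1]]
generator = [[1j * (Ut[i][j] - identity[i][j]) / t for j in range(2)]
             for i in range(2)]
generator_error = max_diff(generator, H)
```

Here `group_error` should vanish up to rounding, while `generator_error` shrinks in proportion to the step `t`, as the limit in Eq. (10-7) suggests.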

Equation (10-5) can now be written in infinitesimal form by formal expansion of $U_t = e^{-iHt}$. In this way we obtain

$$i\dot{W}_t = [H, W_t]. \tag{10-5$'$}$$

Similarly, we obtain for Eq. (10-6),

$$i\dot{\psi}_t = H\psi_t. \tag{10-6$'$}$$

This is called the Schrödinger equation of the system.

The two descriptions are completely equivalent. The connecting link is furnished by the spectral measure associated with the self-adjoint operator $H$. Indeed, if we know this spectral measure, then we can attain the finite form through the formula

$$U_t = \int e^{-i\lambda t}\, dE_\lambda,$$

where $E_\lambda$ is the uniquely determined spectral family of the operator $H$ (cf. Section 4-4). It is evident from this that the spectral measure of the Hamiltonian contains the dynamical structure of the system.
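In finite dimensions the spectral integral reduces to a sum over eigenvalues, $U_t = \sum_k e^{-i\lambda_k t} P_k$. The following sketch illustrates this for an assumed 2x2 Hamiltonian whose eigenprojections are known in closed form; the parameters `EPS` and `G` are hypothetical. It compares the spectral sum with the directly computed exponential.

```python
import cmath
import math

EPS, G = 0.4, 1.5                      # assumed parameters
H = [[EPS, G], [G, EPS]]               # H = EPS*I + G*sigma_x

# Spectral data of H: eigenvalues EPS + G and EPS - G, with projections
# onto (1, 1)/sqrt(2) and (1, -1)/sqrt(2) respectively.
spectrum = [
    (EPS + G, [[0.5, 0.5], [0.5, 0.5]]),
    (EPS - G, [[0.5, -0.5], [-0.5, 0.5]]),
]

def U_spectral(t):
    """Finite-dimensional analogue of U_t = integral exp(-i*lambda*t) dE_lambda."""
    return [[sum(cmath.exp(-1j * lam * t) * P[i][j] for lam, P in spectrum)
             for j in range(2)] for i in range(2)]

def U_closed(t):
    """Closed form exp(-iHt) = e^{-i EPS t} (cos(Gt) I - i sin(Gt) sigma_x)."""
    ph = cmath.exp(-1j * EPS * t)
    c, s = math.cos(G * t), math.sin(G * t)
    return [[ph * c, -1j * ph * s], [-1j * ph * s, ph * c]]

t = 0.9
Us, Uc = U_spectral(t), U_closed(t)
spectral_error = max(abs(Us[i][j] - Uc[i][j])
                     for i in range(2) for j in range(2))

# The same spectral data rebuild H itself: H = sum of lambda * P.
H_rebuilt = [[sum(lam * P[i][j] for lam, P in spectrum) for j in range(2)]
             for i in range(2)]
```

The point of the comparison is that the pairs (eigenvalue, projection) carry all the dynamical information: both `H` and `U_t` are obtained from them.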

PROBLEMS

1. The operator

$$i \lim_{t \to 0} \frac{U_t - I}{t}$$

is defined on a dense linear manifold and there defines a self-adjoint linear operator.

2. If $H$ commutes with a state $W$, then $W$ is constant in time.

3. A pure state which is stationary is necessarily an eigenstate of the operator H.

4. If H has only continuous spectrum, then there does not exist a stationary state.

10-3. DIFFERENT DESCRIPTIONS OF THE TIME EVOLUTION

In the practical calculations of the evolution of the states of a system, it is often convenient to have different but equivalent descriptions of this process. If we choose the description corresponding to Eqs. (10-5) and (10-6), we say that we are using the Schrödinger picture.

Another description, called the Heisenberg picture, is obtained if we retain the states fixed but instead change the observables in such a way that all expectation values are identical with the expectation values calculated in the Schrödinger picture. Thus we introduce new time-dependent operators $A_t$ representing observables, and determined in such a way that

$$\mathrm{Tr}\,(W_t A) = \mathrm{Tr}\,(W A_t). \tag{10-4$'$}$$

If we substitute the expression (10-5) for $W_t$ and use the fact that the relation (10-4)' must be an identity, valid for any observable, we obtain

$$A_t = U_t^{-1} A U_t. \tag{10-8}$$

This is the time dependence of an observable in the Heisenberg picture. The states $W$, and consequently also the state vectors $\psi$, are then independent of time.
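The equivalence of the two pictures can be checked numerically on a small example. The sketch below assumes a spin-like two-level system with $H = \omega\sigma_x$, a hand-picked density operator, and the observable $\sigma_z$; all of these choices are illustrative. It evaluates the expectation value once with the evolved state and once with the evolved observable.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

OMEGA = 0.8                            # assumed Hamiltonian H = OMEGA * sigma_x

def U(t):
    """exp(-iHt) for H = OMEGA * sigma_x."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return [[c, -1j * s], [-1j * s, c]]

W = [[0.75, 0.25], [0.25, 0.25]]       # assumed density operator (trace 1, positive)
A = [[1, 0], [0, -1]]                  # assumed observable: sigma_z

T = 1.3
Ut, Ut_inv = U(T), dagger(U(T))        # for unitary U the inverse is the adjoint

# Schrodinger picture: the state moves, W_t = U_t W U_t^{-1}   (Eq. 10-5)
W_t = matmul(matmul(Ut, W), Ut_inv)
schrodinger_value = complex(trace(matmul(W_t, A))).real

# Heisenberg picture: the observable moves, A_t = U_t^{-1} A U_t   (Eq. 10-8)
A_t = matmul(matmul(Ut_inv, A), Ut)
heisenberg_value = complex(trace(matmul(W, A_t))).real
```

Both values agree up to rounding, which is just the cyclic invariance of the trace used in the text.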

In many practical calculations a third description is even more important than either of the two preceding ones. This description is in a sense intermediate between the two, where part of the time evolution appears as a change of the state, and another part as a change of the observables. Let us assume that the Hamiltonian for the system is a sum of two terms $H = H_0 + V$, and let us define the evolution operators corresponding to $H_0$ and to $H$ by setting

$$U_t = e^{-iH_0 t}, \qquad V_t = e^{-iHt}. \tag{10-9}$$

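Table 10-1

THE THREE PICTURES FOR THE TIME EVOLUTION OF A CONSERVATIVE PHYSICAL SYSTEM.

NOTATIONS: $H = H_0 + V$, $U_t = e^{-iH_0 t}$, $V_t = e^{-iHt}$, $V(t) = U_t^{-1} V U_t$.

Schrödinger picture
Differential forms. State-vectors: $i\dot{\psi}_t = H\psi_t$. Density operators: $i\dot{W}_t = [H, W_t]$. Observables: $\dot{A} = 0$.
Integrated forms. State-vectors: $\psi_t = V_t \psi$. Density operators: $W_t = V_t W V_t^{-1}$. Observables: $A$.

Heisenberg picture
Differential forms. State-vectors: $\dot{\psi} = 0$. Density operators: $\dot{W} = 0$. Observables: $i\dot{A}_t = [A_t, H]$.
Integrated forms. State-vectors: $\psi$. Density operators: $W$. Observables: $A_t = V_t^{-1} A V_t$.

Dirac picture
Differential forms. State-vectors: $i\dot{\psi}(t) = V(t)\psi(t)$. Density operators: $i\dot{W}(t) = [V(t), W(t)]$. Observables: $i\dot{A}(t) = [A(t), H_0]$.
Integrated forms. State-vectors: $\psi(t) = U_t^{-1} V_t \psi$. Density operators: $W(t) = U_t^{-1} V_t W V_t^{-1} U_t$. Observables: $A(t) = U_t^{-1} A U_t$.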

Let us then introduce a time-dependence of an observable

$$A(t) = U_t^{-1} A U_t. \tag{10-10}$$

This formula is identical with (10-8), but it has a different meaning because the operator $U_t$ is the evolution operator corresponding to the Hamiltonian $H_0$. We can now determine the time dependence of the state $W(t)$ such that

$$\mathrm{Tr}\,(W(t)A(t)) = \mathrm{Tr}\,(W_t A). \tag{10-11}$$

A short calculation then gives the result

$$W(t) = U_t^{-1} V_t W V_t^{-1} U_t. \tag{10-12}$$

We shall refer to this description as the Dirac picture. In much of the current literature it is also called the interaction picture.
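As a hedged illustration of the Dirac picture, the sketch below assumes $H_0 = \epsilon\sigma_z$ and $V = g\sigma_x$ (hypothetical choices), writes `U0(t)` for $e^{-iH_0 t}$ and `U_full(t)` for $e^{-iHt}$, and checks two things: that the Dirac-picture expectation value agrees with the Schrödinger one, and that the Dirac state vector $e^{iH_0 t} e^{-iHt}\psi$ satisfies $i\dot{\psi}(t) = V(t)\psi(t)$ to finite-difference accuracy.

```python
import cmath
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def apply(m, v):
    """Matrix times column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

EPS, G = 1.0, 0.3                      # assumed: H0 = EPS*sigma_z, V = G*sigma_x
V = [[0, G], [G, 0]]

def U0(t):
    """Free evolution exp(-i H0 t); diagonal because H0 = EPS*sigma_z."""
    return [[cmath.exp(-1j * EPS * t), 0], [0, cmath.exp(1j * EPS * t)]]

def U_full(t):
    """Full evolution exp(-i H t) for H = EPS*sigma_z + G*sigma_x."""
    r = math.hypot(EPS, G)
    c, s = math.cos(r * t), math.sin(r * t) / r
    return [[c - 1j * s * EPS, -1j * s * G], [-1j * s * G, c + 1j * s * EPS]]

W = [[0.75, 0.25], [0.25, 0.25]]       # assumed initial density operator
A = [[0, 1], [1, 0]]                   # assumed observable: sigma_x
T = 2.0
U, Vt = U0(T), U_full(T)

# Schrodinger picture: W_t = V_t W V_t^{-1}, observable fixed
W_schr = matmul(matmul(Vt, W), dagger(Vt))
schrodinger_value = complex(trace(matmul(W_schr, A))).real

# Dirac picture: A(t) = U^{-1} A U and W(t) = U^{-1} V_t W V_t^{-1} U
A_dirac = matmul(matmul(dagger(U), A), U)
W_dirac = matmul(matmul(dagger(U), W_schr), U)
dirac_value = complex(trace(matmul(W_dirac, A_dirac))).real

def psi_dirac(t, psi0):
    """Dirac-picture state vector exp(i H0 t) exp(-i H t) psi0."""
    return apply(matmul(dagger(U0(t)), U_full(t)), psi0)

# Central-difference check of i dpsi/dt = V(t) psi(t), V(t) = U^{-1} V U
psi0, h = [1.0, 0.0], 1e-5
plus, minus = psi_dirac(T + h, psi0), psi_dirac(T - h, psi0)
lhs = [1j * (p - m) / (2 * h) for p, m in zip(plus, minus)]
rhs = apply(matmul(matmul(dagger(U), V), U), psi_dirac(T, psi0))
ode_residual = max(abs(x - y) for x, y in zip(lhs, rhs))
```

The split between `U0` and `U_full` mirrors the way the text distributes the evolution between observables and states; only the interaction term drives the state.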

We can also express the different pictures in the infinitesimal forms. For instance, the time dependence of an observable in the Heisenberg picture satisfies a differential equation given by

$$i\dot{A}_t = [A_t, H], \tag{10-13}$$

while the density operator is of course constant and thus satisfies $\dot{W} = 0$. Similarly, we find for the Dirac picture

$$i\,\frac{dA(t)}{dt} = [A(t), H_0] \tag{10-14}$$

and

$$i\,\frac{dW(t)}{dt} = [V(t), W(t)], \tag{10-15}$$

where $V(t) = U_t^{-1} V U_t$ is the interaction operator in the Dirac picture.

A pure state represented in the Dirac picture is then a solution of the differential equation

$$i\dot{\psi}(t) = V(t)\psi(t), \tag{10-16}$$

and such a solution can be given in the form

$$\psi(t) = U_t^{-1} V_t \psi. \tag{10-17}$$

The different forms are collected for convenience in Table 10-1.

10-4. NONCONSERVATIVE SYSTEMS

The conservative systems which we have described so far are systems which interact with the external world through constant forces and which do not react back on this world at all. Examples of such systems which we shall study in greater detail later on are, for instance, a particle in a constant external field of force or a spin carrying a magnetic moment in an external magnetic field. The effect of the external forces is then entirely incorporated in the nature of the Hamiltonian which expresses the dynamical law of the system.

We must bear in mind that such systems are always some sort of approximation, whether because the external forces are not constant, or because there is some reaction of the system back to the external world. In many situations the approximation is a very good one indeed, and we need not concern ourselves with these limitations at all. The theory for conservative systems is then fully applicable and gives excellent results. But there are also situations of great physical interest where the approximation for conservative systems is no longer applicable. We can imagine two causes for a breakdown of this approximation. One may be due to a time variation of the external forces which act on the system; the other may be due to the reaction of the system back to the external world.

These situations are of course well known from classical mechanics. There it is often possible to reduce a nonconservative system to a conservative one by enlarging the system so that it contains not only the original system but also the source of the external force. But in many situations this procedure is not practicable, and it is then desirable to have a generalization of the dynamical law for nonconservative systems. This same need exists in dealing with quantum-mechanical systems. But in such systems the situation is even more complicated because of the following circumstance: If two classical systems interact and both of them are, at the beginning of the interaction, in a pure state, then after the end of the interaction each separate component is still in a pure state. If the same problem is considered for quantum systems, one finds that after the interaction the component systems are, in general, in a mixture, even if before the interaction they were in a pure state. This is a property of quantum systems which is of fundamental importance in the theory of measurement. We shall discuss this in great detail in the next chapter.

In this section we shall concentrate instead on the modifications needed when we are dealing with time-dependent external forces, and we can still neglect the reaction of the system back to the source of the forces.

We could describe such a system by saying that its dynamical law changes from one moment to the next so that it is still expressible in a differential form, but with a Hamiltonian $H(t)$ which depends explicitly on time. Thus, instead of Eq. (10-5)' describing the evolution of a state, we would expect an equation

$$i\dot{W}_t = [H(t), W_t], \tag{10-18}$$

and similarly, instead of (10-6)', we should have

$$i\dot{\psi}_t = H(t)\psi_t. \tag{10-19}$$

For such systems it is no longer possible to give a simple expression for the integrated form of the dynamical law, although states at different times are still connected by unitary transformations which depend on time but which no longer have the group property. Thus while we can still write $\psi_t = U_t \psi$, we must admit that $U_{t_1 + t_2} \neq U_{t_1} U_{t_2}$.

There does not yet exist a general integration theory for the solutions of Eq. (10-18) or Eq. (10-19), and it is only possible to discuss individual cases or approximation methods.
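One individual case that can be worked out by hand is a Hamiltonian that is constant on successive time intervals. The sketch below assumes $H(t) = \sigma_z$ for $t < 1$ and $H(t) = \sigma_x$ for $t \geq 1$ (an invented example, not from the text). The evolution from time 0 to time $t$ is then a product of two exponentials; it stays unitary but loses the group property.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def exp_sigma_z(theta):
    """exp(-i * theta * sigma_z)."""
    return [[complex(math.cos(theta), -math.sin(theta)), 0],
            [0, complex(math.cos(theta), math.sin(theta))]]

def exp_sigma_x(theta):
    """exp(-i * theta * sigma_x)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -1j * s], [-1j * s, c]]

def U(t):
    """Evolution from time 0 to t for H(t) = sigma_z (t < 1), sigma_x (t >= 1)."""
    if t <= 1.0:
        return exp_sigma_z(t)
    # The later Hamiltonian acts after the earlier one, so it stands on the left.
    return matmul(exp_sigma_x(t - 1.0), exp_sigma_z(1.0))

U1, U2 = U(1.0), U(2.0)

# Group property fails: U_{1+1} differs from U_1 U_1
group_defect = max_diff(U2, matmul(U1, U1))

# Unitarity survives: U_2^* U_2 = I
unitarity_defect = max_diff(matmul(dagger(U2), U2), [[1, 0], [0, 1]])
```

The nonzero `group_defect` arises because $\sigma_z$ and $\sigma_x$ do not commute; for a time-independent Hamiltonian the same comparison gives zero.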


CHAPTER 11

THE MEASURING PROCESS

That things have a quality in themselves quite apart from interpretation and subjectivity, is an idle hypothesis: It would presuppose that to interpret and to be a subject are not essential, that a thing detached from all relations is still a thing.

F. NIETZSCHE,

in The Will to Power

In this chapter we give an analysis of the process of measurement. The characteristic limitation of the accuracy of measurements in the form of the uncertainty relations (Section 11-1) is traced to the interaction of the measuring device and the system (Sections 11-2 and 11-3). In order to analyze this effect we describe the characteristic properties of a measuring device in Section 11-4. The following sections, 11-5 and 11-6, introduce the notions of equivalent states, events, and data. In a mathematical interlude we sketch the theory of the tensor product (Section 11-7), followed by Section 11-8 on the union and separation of systems. All the tools are then on hand to analyze in detail the measuring process on a particular and simple model (Section 11-9). In the last section, 11-10, we describe three paradoxes of the measuring process.

11-1. UNCERTAINTY RELATIONS

The deeper understanding of quantum phenomena began with the discovery of the uncertainty relations [1]. From the earliest beginnings in the discussion of the uncertainty relations, it was recognized that the analysis of these relations, which restrict the precision of measurement, must be somehow related to the inevitable interaction between measuring device and the system during the process of measurement.

In our formulation of quantum mechanics, the actual behavior of a physical system under the process of measurement is already incorporated in the empirically given structure of the proposition system. It is therefore not surprising that the uncertainty relations appear, at this stage of our presentation, as a purely mathematical consequence of the properties of observables. In this section we shall demonstrate this.

Let $A$ and $B$ stand for two observables, which need not be compatible. They are represented by self-adjoint operators which need not commute. In order to avoid inessential complications, we shall assume that the two operators are bounded so that their domain of definition is the entire Hilbert space. Let us consider a general state given by the density operator $W$, and define the expectation values

$$a \equiv \mathrm{Tr}\,WA, \qquad b \equiv \mathrm{Tr}\,WB. \tag{11-1}$$

The mean square deviation of the observable $A$ in the state $W$ is then defined by

$$(\Delta a)^2 \equiv \mathrm{Tr}\,W(A - a)^2. \tag{11-2}$$

Similarly, we define

$$(\Delta b)^2 \equiv \mathrm{Tr}\,W(B - b)^2. \tag{11-3}$$

The uncertainty relation is an inequality for the product of $\Delta a$ with $\Delta b$. We write $A_0 = A - a$ and $B_0 = B - b$. Then we have, by definition,

$$\mathrm{Tr}\,WA_0 = \mathrm{Tr}\,WB_0 = 0.$$

In order to obtain such an inequality we consider the operator $T \equiv A_0 + ipB_0$, with $p$ real. Since $A$ and $B$ are self-adjoint and bounded, $T^* = A_0 - ipB_0$. It follows from this that $\mathrm{Tr}\,WTT^* = \mathrm{Tr}\,T^*WT \geq 0$ for all values of $p$. In order to see this, it suffices to recall that the trace of any operator $X$, if it exists, can be calculated with the formula

$$\mathrm{Tr}\,X = \sum_r (\varphi_r, X\varphi_r)$$

for some arbitrary, complete, orthonormal system of vectors $\varphi_r$. Thus

$$\mathrm{Tr}\,T^*WT = \sum_r (\varphi_r, T^*WT\varphi_r) = \sum_r (T\varphi_r, WT\varphi_r).$$

Let $\psi_r = T\varphi_r$; then we have finally

$$\mathrm{Tr}\,T^*WT = \sum_r (\psi_r, W\psi_r). \tag{11-4}$$

The system $\psi_r$ is in general neither complete nor orthogonal, but this does not matter. What does matter is that every term $(\psi_r, W\psi_r)$ is positive. Therefore the trace in question is expressed as a sum of positive terms. Thus we have found

$$\mathrm{Tr}\,WTT^* = \mathrm{Tr}\,WA_0^2 - ip\,\mathrm{Tr}\,W[A_0, B_0] + p^2\,\mathrm{Tr}\,WB_0^2 \geq 0. \tag{11-5}$$

The right-hand side is a quadratic form in $p$ which is thus positive definite. This implies that its determinant must be positive, too. Since $A_0$ and $B_0$ are both self-adjoint, the operator $[A_0, B_0]$ which occurs in Eq. (11-5) is antiself-adjoint. We can then define a new self-adjoint operator $C$ by setting $[A, B] = iC$. We obtain the real quadratic form

$$(\Delta a)^2 + p\,\mathrm{Tr}\,WC + p^2 (\Delta b)^2 \geq 0.$$

The positive definiteness of this quadratic expression in $p$ implies that

$$(\Delta a)^2 (\Delta b)^2 \geq \tfrac{1}{4} (\mathrm{Tr}\,WC)^2, \tag{11-6}$$

or, after taking square roots,

$$(\Delta a)(\Delta b) \geq \tfrac{1}{2}\,|\mathrm{Tr}\,WC|. \tag{11-7}$$

This is the general uncertainty relation. It gives a limitation on the value of the product of the root-mean-square deviation for any two observables. In the event that $A$ and $B$ do not commute, then $C \neq 0$, and there exist, in general, states such that the right-hand side is positive (and not zero). In case $C$ is a positive operator (which will be seen to occur in the most important application), then this will occur for any state. In this case we have what we might call an absolute uncertainty relation: The two observables in question can never be measured with an arbitrary simultaneous accuracy.
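The inequality $(\Delta a)(\Delta b) \geq \tfrac{1}{2}|\mathrm{Tr}\,WC|$ can be tested on the simplest bounded observables. The sketch below assumes $A = \sigma_x$, $B = \sigma_y$ on a two-dimensional Hilbert space and a hand-picked mixed state $W$; these choices and the helper names are illustrative only. The operator $C$ is computed from its definition $[A, B] = iC$.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def expectation(W, X):
    """Tr W X, real for a density operator W and self-adjoint X."""
    return complex(trace(matmul(W, X))).real

def spread(W, X):
    """Root-mean-square deviation sqrt(Tr W (X - x)^2) with x = Tr W X."""
    x = expectation(W, X)
    n = len(X)
    X0 = [[X[i][j] - (x if i == j else 0) for j in range(n)] for i in range(n)]
    return math.sqrt(expectation(W, matmul(X0, X0)))

A = [[0, 1], [1, 0]]                   # sigma_x
B = [[0, -1j], [1j, 0]]                # sigma_y
W = [[0.8, 0.15], [0.15, 0.2]]         # assumed density operator (trace 1, positive)

# C is defined by [A, B] = iC, that is C = -i (AB - BA)
AB, BA = matmul(A, B), matmul(B, A)
C = [[-1j * (AB[i][j] - BA[i][j]) for j in range(2)] for i in range(2)]

lhs = spread(W, A) * spread(W, B)      # (Delta a)(Delta b)
rhs = 0.5 * abs(expectation(W, C))     # (1/2) |Tr W C|
```

For this state `lhs` comes out as $\sqrt{0.91}$ and `rhs` as $0.6$, so the inequality holds with room to spare; equality would require a state that minimizes the quadratic form in $p$.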

The difficulty in the interpretation of the uncertainty relation stems from the fact that $A$ and $B$ might be observables which admit an arbitrarily small mean square deviation for certain states. The uncertainty relation implies, then, that this can never occur for the same state. For such cases the observable $A$ separately, as well as the observable $B$ separately, can be measured with unrestricted accuracy. However, if we desire to measure $A$ and $B$ simultaneously, we find that the two measurements must be disturbing each other in precisely such a way that the uncertainty relation is satisfied. This behavior, which for classical systems is completely unknown, gives rise to the often discussed conjecture whether the actual values of the observables are perhaps determined, but unknown and unknowable because the measuring process interferes with the state of the system.

This point of view is closely related to the possible existence of hidden variables which we have already discussed in Chapter 7. But in this case the hidden aspect of the state would even have a subjective character, depending on the state of information which we have about the system. That this would lead to quite inadmissible consequences can be seen, for instance, by the fact that the state would be different for two different observers who have different information about one and the same system. This would amount to a denial of objectively valid physical laws throughout the world of microsystems, something for which we have at present no evidence, and no need [9].

We shall therefore take the position that the correct interpretation of the uncertainty relation is that an observable in a given state need not have a definite value even though the observable is capable of assuming definite values for other states. The value of an observable may be not only ambiguous, it may even be undetermined, and this is as much an irreducible attribute of a state as specific values were in classical physics.

We observe here a significant difference between quantum and classical physics. The situation has often been described by saying that in quantum mechanics it is not possible to ignore the reaction of the measuring device on the system. This interaction, so we are told, produces an uncontrollable effect on the system, so that the actual values of the observables remain partially hidden from us. Although this discussion of the interaction of the measuring device and system has been most useful in attracting our attention to the fact that we are dealing with physical objects and not with a mathematical model, the introduction of an uncontrollable reaction of one on the other has given a misleading impression. It is not an uncontrollable effect which causes the uncertainty relation; on the contrary, the time evolution of the joint system, consisting of measuring device and original system, may be completely controlled and determined by a fixed physical arrangement. It is the laws of quantum mechanics as they are embodied in the structure properties of the proposition system which cause this result.

A complete understanding of these laws can only be obtained by following them through their effects on the measuring process, and this is precisely what we shall do in the following sections.

11-2. GENERAL DESCRIPTION OF THE MEASURING PROCESS

All the information which we have about physical systems is obtained from observations and measurements on such systems. Observations consist in bringing the system under examination in contact with some other system, the observer, or some measuring device $M$, and observing the reaction of the system on the observer. The general setup of a measurement on a physical system $S$ has thus schematically the form indicated in Fig. 11-1.

Fig. 11-1 Schematic representation of the interaction between measuring device M and system S.

Let us point out at the outset two important features which we shall encounter throughout the discussion on the quantum-mechanical measuring process. The first of these concerns the fact that the measuring device $M$, if it is to be of any use at all, must interact somehow with the system $S$. But an interaction always acts both ways. Not only does $S$ influence $M$, thereby producing the desired measurable effect, but also $M$ acts on $S$, producing an effect on $S$ with no particularly desirable consequences. In fact, this back-effect on $S$ seems to be the cause of much of the difficulty in the interpretation of quantum mechanics.


The second point to be mentioned here is that, in a complete measurement, the schematic arrangement of Fig. 11-1 is an oversimplification. If the measurement is to be useful there must be a further observation on $M$, which we shall call "reading the scale." Such further observations may be made at a later time by examining a permanent record of some sort, but in any case, if they are to be of some use for the construction of a scientific theory they must, at some point, enter the consciousness of a scientific observer.

This appearance of the "conscious observer" in a full description of the measuring process is a disturbing element, since it seems to introduce into the process of reconstructing the objective physical laws a foreign subjective element. It is essential in the construction of an objective science that it be freed from anthropomorphic elements. This requirement of objectivity of the physical science of microobjects can actually be satisfied because of the fact that the last stages of observation which we have designated symbolically as "reading the scale" take place on the classical level. On this level the measuring process also follows the general scheme of Fig. 11-1, with one important difference. The interconnection between $S$ and $M$ acts only in one direction, namely from $S$ to $M$. The reverse action from $M$ to $S$, although always present, is not important because its effect can always be reduced to a negligible amount. This is an essential feature of the classical measuring process which distinguishes it fundamentally from the quantum mechanical measuring process, where this reaction is not negligible.

The insensitivity of the system with respect to its observation when "reading the scale" has the consequence that all the data furnished by a macroscopic measuring device have an objective meaning. By this we mean that the "scale can be read" by a number of different observers who can communicate and establish that they read concurrent results. The physical fact is "objectivized," as we might say. The individual observer, although necessary for completing an actual observation, can now fade into the background; and we retain only the objectively verifiable content of the observation. These are the building bricks of man's theory of the physical world.

The situation which we encounter here is very similar to the description of a physical event in space and time with respect to some coordinate system. Although a coordinate description of such events is usually necessary, their objective character and their spatio-temporal interrelations are objective properties quite independent of the particular choice of such a description.

11-3. DESCRIPTION OF THE MEASURING PROCESS FOR QUANTUM-MECHANICAL SYSTEMS

As we have pointed out in the preceding section, the quantum-mechanical measuring process will affect the system. We shall now examine the effect of this reaction more carefully. Let us begin with two examples.


First we consider the measurement of the position of some elementary particle by a counter with a finite sensitive volume. After the measurement has been performed and the counter has recorded the presence of a particle inside its sensitive volume, we know for certain that the particle, at the instant of the triggering, is actually inside the sensitive volume. By this we mean the following: Suppose we repeated the measurement immediately after it has occurred (this is of course an idealization, since counters are notorious for having a dead time after they are triggered), then we would with certainty observe the particle inside the volume of the counter.

In the second example, we consider a momentum measurement with a counter which analyzes the pulse height of a recoil particle. Here the situation is quite different. The experiment will permit us to determine the value of the momentum only before the collision occurred. If we repeat the measurement immediately after it has occurred, then we find that the momentum of the particle will have a quite different value from its measured value. The very act of measurement has changed the momentum, and it is this change which produced the observable effect.

We shall call a measurement which will give the same value when immediately repeated a measurement of the first kind. The second example is then a measurement of the second kind [3].

From now on we shall be primarily concerned with measurements of the first kind. They are easier to discuss, yet they exhibit the characteristic quantum features which we want to explore here.

Measurements of the first kind can be used for preparing a state with definite values for certain observables. If they are used in this way we speak of filters. For the preparation of such a state, the condition imposed by the measuring device becomes a relevant condition in the preparation of the state (cf. Section 6-1). The states before and after the measurement are not the same, even if the duration of the measurement can be considered negligible. The change in the state must be attributed to the interaction of the system with the measuring device.

We shall now obtain a formula for the change of the state by a measurement of the first kind. Let us begin with a simple special case where the observable to be measured is represented by a self-adjoint operator $A$ with nondegenerate eigenvalues. Let $\psi_r$ be the eigenvectors of $A$, and denote by $P_r$ the projection which contains $\psi_r$ in its range. The operator $A$ may then be represented in the form

$$A = \sum_r a_r P_r, \tag{11-8}$$

where $a_r$ denotes the eigenvalues of $A$.

A measurement of the quantity $A$ in a system in the state $W$ will give the result $a_r$ with a probability which is given by

$$p(a_r) = \mathrm{Tr}\,WP_r. \tag{11-9}$$


If the measurement is of the first kind, then the repetition of a measurement which yielded the value $a_r$ will reproduce this value with certainty. If the measurement is not used as a filter (that is, if it is not used to select a certain subensemble of systems which have definite values for $A$), then the state of the system after the measurement will be a mixture of the pure states $\psi_r$ with probabilities as expressed by Eq. (11-9).

The density operator for this mixture is given by

$$W' = \sum_r P_r W P_r. \tag{11-10}$$

To see this we verify that for projections $P$ with one-dimensional range we have

$$PWP = (\operatorname{Tr} WP)\,P.$$

Let $\psi$ be a unit vector in the range of $P$, and let $f$ be an arbitrary vector. We have then, by definition of the projection operator $P$,

$$PWPf = (\psi, W\psi)(\psi, f)\,\psi = (\psi, W\psi)\,Pf.$$

Thus it suffices to show that

$$(\psi, W\psi) = \operatorname{Tr} WP.$$

This can be verified by evaluating the trace in a special orthonormal system $\psi_r$ which is so chosen that its first vector $\psi_1 = \psi$. We have then, because $P$ has one-dimensional range, $P\psi_r = 0$ for all $r \neq 1$. Thus the infinite sum in the evaluation of the trace reduces to one single term, the term on the left-hand side of the last equation. Applying this identity to each $P_r$ gives $\sum_r P_r W P_r = \sum_r (\operatorname{Tr} W P_r)\,P_r$, which is the mixture of the states $\psi_r$ with the weights of Eq. (11-9). With this we have established formula (11-10) for the case that the ranges of all $P_r$ are one-dimensional.
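As a numerical illustration of Eqs. (11-9) and (11-10), the following sketch works through a two-level system. The particular density operator `W`, the projections in `P`, and all helper names are assumptions chosen for this example; they are not taken from the text.

```python
# Non-selective measurement of an observable with nondegenerate eigenvalues
# on a two-level system. All names and numbers are illustrative assumptions.

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# A density operator W (symmetric, positive, trace 1) and the projections P_r
# onto the two eigenvectors of the measured observable.
W = [[0.75, 0.25],
     [0.25, 0.25]]
P = [[[1.0, 0.0], [0.0, 0.0]],
     [[0.0, 0.0], [0.0, 1.0]]]

# Eq. (11-9): p(a_r) = Tr W P_r
probs = [trace(matmul(W, Pr)) for Pr in P]

# Eq. (11-10): W' = sum_r P_r W P_r
W_after = [[0.0, 0.0], [0.0, 0.0]]
for Pr in P:
    W_after = add(W_after, matmul(matmul(Pr, W), Pr))

# The same operator written as a mixture, using P W P = (Tr W P) P
mixture = [[0.0, 0.0], [0.0, 0.0]]
for p, Pr in zip(probs, P):
    mixture = add(mixture, scale(p, Pr))

print(probs)
print(W_after)
print(mixture)
```

In this example the off-diagonal elements of `W` are removed by the measurement, while the diagonal elements, which are the outcome probabilities, are unchanged.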

If the eigenvalues $a_r$ of the observable $A$ are degenerate, then the state of the system referring to a subspace associated with an $a_r$ depends on the detailed nature of the measuring equipment. The degeneracy leaves us with a certain freedom of choice for the state after the measurement, which cannot be removed without a detailed knowledge of the actual measuring equipment used. In such cases the measurement of one and the same observable with different equipment may result in different states after the interaction.

It is therefore convenient to introduce the notion of the ideal measurement which affects the state in a minimal way, and for which the state after the measurement is still given by formula (11-10) but without the requirement that the projection operators $P_r$ be one-dimensional.

For the special case of the ideal measurement of a projection operator $E$, we shall then obtain for the state after measurement the density operator

$$W' = EWE + E'WE', \tag{11-11}$$

where $E' = I - E$ and $W$ is the density operator before the measurement.

If the projection operator $E$ is used to describe a filter, then we shall call it a passive filter if after the filter process the state is given by

$$W' = \frac{EWE}{\operatorname{Tr} EW}. \tag{11-12}$$

This state is obtained if we prepare the state $W$, and add to the conditions which prepare this state the further relevant condition that the measurement of the property represented by $E$ must be true.

When we examine the change of the state as expressed, for instance, in Eq. (11-11), we observe an important point. This change is quite different from the change of the state due to the time-evolution of a system, which is expressed by a formula such as

$$W_t = e^{-iHt}\, W\, e^{iHt} \tag{11-13}$$

with some Hamiltonian operator $H$. The fundamental character of the difference between (11-11) and (11-13) can be seen by the fact that the transformation $W \to W'$ is not unitary. For instance, if the state $W$ is pure and $E$ does not commute with $W$, then $W'$ is always a mixture (cf. Problem 1).
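The contrast between the two kinds of change can be checked numerically with the quantity $\operatorname{Tr} W^2$, which equals 1 exactly for pure states. The sketch below is only an illustration: the state `W`, the projection `E`, and the real orthogonal matrix `U` (standing in for a time-evolution operator) are assumptions made for this example.

```python
# Ideal measurement versus unitary change for a pure state of a two-level
# system. W, E and U are illustrative assumptions; U stands in for exp(-iHt).

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def ideal_measurement(W, E):
    """Eq. (11-11): W' = E W E + E' W E' with E' = I - E."""
    n = len(E)
    E_c = [[(1.0 if i == j else 0.0) - E[i][j] for j in range(n)]
           for i in range(n)]
    return add(matmul(matmul(E, W), E), matmul(matmul(E_c, W), E_c))

def purity(W):
    """Tr W^2, which equals 1 exactly when W is pure."""
    return trace(matmul(W, W))

W = [[0.5, 0.5], [0.5, 0.5]]     # pure state (a one-dimensional projection)
E = [[1.0, 0.0], [0.0, 0.0]]     # projection that does not commute with W
U = [[0.0, -1.0], [1.0, 0.0]]    # real orthogonal, so its inverse is its transpose

W_evolved = matmul(matmul(U, W), transpose(U))   # unitary change
W_measured = ideal_measurement(W, E)             # change under measurement

print(purity(W), purity(W_evolved), purity(W_measured))
```

The unitary change keeps the purity at 1, while the ideal measurement lowers it, so no unitary transformation can connect `W` and `W_measured`.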

There is one special case when the observation of a property $E$ does not affect the state, namely, when $E$ commutes with $W$. If $E$ commutes with $W$, we may in fact write $EWE = WE$ and $E'WE' = WE'$; therefore $W' = EWE + E'WE' = WE + WE' = W(E + E') = W$. This condition is also necessary. Indeed, if $W' = W$, then $WE = W'E = EWE = EW' = EW$. Thus we have established:

The necessary and sufficient condition such that an ideal measurement of a proposition represented by the projection operator $E$ does not disturb a state $W$ is that $E$ commute with $W$.
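The criterion can be tried out on small matrices. In the following sketch the two density operators and the helper names are assumptions introduced for illustration; one state commutes with `E` and the other does not.

```python
# Checking that an ideal measurement leaves W unchanged exactly when E and W
# commute, on two illustrative (assumed) two-level states.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol
               for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def ideal_measurement(W, E):
    """W' = E W E + E' W E' with E' = I - E."""
    n = len(E)
    E_c = [[(1.0 if i == j else 0.0) - E[i][j] for j in range(n)]
           for i in range(n)]
    return add(matmul(matmul(E, W), E), matmul(matmul(E_c, W), E_c))

def commutes(W, E):
    return close(matmul(W, E), matmul(E, W))

def undisturbed(W, E):
    return close(ideal_measurement(W, E), W)

E = [[1.0, 0.0], [0.0, 0.0]]
W_commuting = [[0.75, 0.0], [0.0, 0.25]]
W_noncommuting = [[0.75, 0.25], [0.25, 0.25]]

for W in (W_commuting, W_noncommuting):
    print(commutes(W, E), undisturbed(W, E))
```

For both states the two printed values agree, as the statement above requires.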

The measurements which do not disturb the state are thus very special cases; in general we must expect a change which leads to a density operator $W'$ which is not unitarily equivalent to the operator $W$.

This behavior of the state under measurement is at first sight very strange, because in Chapter 10 we derived the unitary transformation of states under very general conditions. Even if the system is subject to variable external forces, we must expect such a unitary transformation of the state. It seems as if the unity of the description of the physical laws is broken at this point: If we leave the system under its own influence the state changes in one way, given by (11-13); if we observe or measure something on the system, the state changes in another entirely different way, given by (11-11).

We might suppose this difference to be due to the interaction of the system $S$ with the measuring device $M$. The difficulty with this explanation is that one is always at liberty to consider the time evolution of the combined system, consisting of $S$ and $M$. This combined system, if left undisturbed by outside influences, then evolves in time according to Eq. (11-13) where the operator $H$ is now the total Hamiltonian of the combined system. That such an operator must exist follows from the fact that the combined system $S + M$ is also subject to the laws of quantum mechanics, even if it is, as in most practical cases, a very complicated system. The complication should not distract us from the essence of the question, namely, how to reconcile the two different behaviors of the state vectors without violating the unity of the laws of nature.

We shall analyze and answer this question in the subsequent sections of this chapter.

PROBLEMS

1. If $W$ is pure and $E$ is a projection which does not commute with $W$, then $W' = EWE + E'WE'$ is a mixture.

2. A repetition of a measurement $E$ on a state $W$ will not disturb the state $W'$ obtained after the first measurement of $E$.

11-4. PROPERTIES OF THE MEASURING DEVICE

The general description of the measuring process that we have given in the preceding section did not specify the nature of the measuring device $M$. This we shall do now.

It is clear that not every system $M$ will be suitable as a measuring apparatus. In order to fulfill its function of determining the measurable physical properties of the system $S$, it must satisfy certain conditions which we shall formulate now.

If we examine any measuring apparatus $M$ commonly used for the measurements of quantum systems $S$, then we observe at once a common feature: Every measuring device is usually a large system, producing macroscopic effects which can be observed or recorded with equipment for which the quantum effects are completely irrelevant. For instance, if our apparatus $M$ is a counter of some sort, it will contain a large number of molecules in an unstable state which can be triggered by the micro-event to be measured. The triggering then produces a chain reaction which may lead, for instance, to a potential drop on a capacitor, and is eventually recorded in some mechanical counter. The resulting large-scale effect in the last stage of the measurement can then be read off by any observer without interfering in the least with the state of the apparatus. The final stage of the observation can therefore be described completely in classical terms alone.

This last remark is very important, and it has played an essential part in the analysis of the measuring process given, for instance, by Bohr [8]. It is this classical aspect of the measuring device which enables us to establish the objective aspects of the state of a system, and it is from such objectively given ingredients that the physical laws governing the microsystems must be reconstructed.

Thus we must assume that measuring devices are usually constructed out of macroscopic parts which can be adequately described by classical laws; yet this device must be sensitive to the quantum features of the system $S$. This is often accomplished by the use of metastable systems which can be triggered by micro-events, which subsequently are amplified to macro-events. A typical example is the bubble chamber using a superheated liquid as the metastable macrosystem. A single micro-event in the form of the ionization of an atom may be sufficient to trigger the sequence of events which lead to the formation of a bubble.

The classical aspect of the measuring device which we have stressed here is usually associated with its macroscopic features. We must, however, be on guard not to confuse the two. To be sure, most macroscopic objects (that is, objects which consist of a very large number of microsystems) do, in most situations, behave according to the classical laws of physics. But this is not necessarily so. There are macro-objects which exhibit quantal features, and there are micro-objects which, for certain types of observations, behave classically. The part of the measuring system which amplifies the triggered event to a macroscopic phenomenon is therefore a convenient device which enables us to observe and register this event, but it is not absolutely indispensable for the completion of a measurement. This means that the characteristic difficulties in understanding the measuring process are not to be attributed to the inevitable complication of an amplifying device; they are already present in the microscopic part of the measuring equipment.

In order to make this explicit, it is convenient to divide the measuring device $M$ into two parts, $M = m + A$. Here $m$ denotes the "little" measuring device, that is, that part which is truly responsible for the measurement of the system M; and $A$ represents the amplifying part of $M$. Part $A$ may be disconnected in thought or sometimes in reality from the little measuring device, without impairing the essential part of the measuring process.

For instance, if our measuring equipment is the photographic plate, the little measuring device $m$ may be a single silver-halide complex, while the amplifying device $A$ may be an individual grain, containing a large number of atoms. The record of an event may be triggered by a single ionization process and stored as a latent image long before the amplification $A$ in the form of the development of the plate is put into effect.

The system $S + m$ which really performs the measurement must satisfy further restrictions. The function of the measurement consists in transferring certain properties of the system $S$ to the system $m$, in such a way that a mere observation on the system $m$ (by means of the amplifier $A$) will permit us to draw certain conclusions as to the state of the system $S$. There must therefore exist a correlation between the events on $M$ and the states of $S$. The measuring device can then distinguish states of $S$ which are correlated with observable events in $M$.

In a measurement of the first kind, we have the additional requirement that a repetition of the measurement immediately after it has occurred will reproduce the same result. This means that the system $S$ is left after the measurement in the state registered by $M$.

We have now listed the essential properties which we must require of the measuring device for a measurement of the first kind.

11-5. EQUIVALENT STATES

In this section we shall now examine in detail the modification in the description of states which results from a restricted system of observables. In the preceding section we have pointed out that the observables which are measured by a measuring device must be of a classical nature in order that the measurement have an objective character. This means that one and the same device $M$ can, in general, measure only a restricted class of observables, all of which must commute with one another. This has the consequence that certain states which may be represented by different density operators may actually be indistinguishable with respect to this system of observables.

The natural way of describing this situation is by introducing a theory of equivalence classes of states. We shall first do this quite generally, without the requirement that we are dealing with classical observables, and shall specialize later for classical observables.

Let $\mathcal{S}$ be a system of observables. We say that two states $W_1$ and $W_2$ are equivalent with respect to $\mathcal{S}$ if

$$\operatorname{Tr} A W_1 = \operatorname{Tr} A W_2 \tag{11-14}$$

for all $A \in \mathcal{S}$. We shall write for two states equivalent with respect to $\mathcal{S}$: $W_1 \sim W_2\ (\mathcal{S})$. If two states are equivalent in this sense, then no measurement with observables from $\mathcal{S}$ can distinguish the two states. It is not difficult to show (Problem 1) that $W_1 \sim W_2$ is an equivalence relation. We can therefore divide the set of all states into classes of equivalent states.

From the physical point of view, the selection of an individual representative inside a class of equivalent states is irrelevant; any choice will be equally good. It is therefore quite natural to consider the states with respect to a system $\mathcal{S}$ not as one of the members of the class but as the classes themselves. By this procedure we remove the redundancy in the description of the state, and we restore the one-to-one correspondence between the physical notion of state and its mathematical description.
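A small numerical case may make the equivalence relation (11-14) concrete. In the sketch below the restricted system of observables is assumed to consist of the diagonal operators of a two-level system, represented by its two diagonal projections; by linearity of the trace it is enough to compare expectation values on these. The states and the function names are illustrative assumptions.

```python
# Equivalence of states with respect to a restricted set of observables,
# here (by assumption) the diagonal operators of a two-level system.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def equivalent(Wa, Wb, observables, tol=1e-12):
    """True if Tr A Wa = Tr A Wb for every A in the given list, cf. (11-14)."""
    return all(abs(trace(matmul(A, Wa)) - trace(matmul(A, Wb))) <= tol
               for A in observables)

# Projections that generate the diagonal observables.
observables = [[[1.0, 0.0], [0.0, 0.0]],
               [[0.0, 0.0], [0.0, 1.0]]]

W_pure = [[0.5, 0.5], [0.5, 0.5]]     # a pure state
W_mixed = [[0.5, 0.0], [0.0, 0.5]]    # a mixture with the same diagonal
W_other = [[0.75, 0.0], [0.0, 0.25]]  # a state with a different diagonal

print(equivalent(W_pure, W_mixed, observables))
print(equivalent(W_pure, W_other, observables))
```

The pure state and the mixture are different density operators, yet they belong to the same class with respect to these observables, whereas the third state does not.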

Let us denote by $[W]$ the class of the states which are all equivalent to $W$. We shall call the redundant states $W$ the microstates and the classes of equivalent states $[W]$ the macrostates.

The operation of the mixture of states can be transferred from the microstates to the macrostates. This means we claim the property expressed in the formula

$$[\lambda_1 W_1 + \lambda_2 W_2] = \lambda_1 [W_1] + \lambda_2 [W_2]. \tag{11-15}$$

This formula implies the following: If $W_1 \sim W_1'$ and $W_2 \sim W_2'$, and

$$W = \lambda_1 W_1 + \lambda_2 W_2, \qquad W' = \lambda_1 W_1' + \lambda_2 W_2',$$

then $W \sim W'$. Due to this property it is possible to transfer the process of mixing from the microstates to the macrostates, as we have indicated with the notation of the right-hand side of (11-15).

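In order to prove this property, we mention that it is sufficient to verify it for the projections contained in $\mathcal{S}$. Let $E$ denote the projections in $\mathcal{S}$. We want to show that

$$\operatorname{Tr} EW = \operatorname{Tr} EW'.$$

This we see by using the linearity property of the trace through the following sequence of steps:

$$\begin{aligned}
\operatorname{Tr} EW &= \operatorname{Tr} E(\lambda_1 W_1 + \lambda_2 W_2) \\
&= \lambda_1 \operatorname{Tr} EW_1 + \lambda_2 \operatorname{Tr} EW_2 \\
&= \lambda_1 \operatorname{Tr} EW_1' + \lambda_2 \operatorname{Tr} EW_2' \\
&= \operatorname{Tr} E(\lambda_1 W_1' + \lambda_2 W_2') = \operatorname{Tr} EW'.
\end{aligned}$$

In the third step we have made use of the fact that $W_1 \sim W_1'$ and $W_2 \sim W_2'$. Thus the property is verified.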

We shall have occasion to use a corollary which states that if $W_1 \sim W_2$, then any mixture $W = \lambda_1 W_1 + \lambda_2 W_2$ is in the same equivalence class $[W] = [W_1] = [W_2]$. The equivalence classes are thus closed against the operation of mixing. This implies that every class which contains more than one microstate contains mixtures.

An important question which will concern us now is this: What happens to the macrostates during a measurement? We know what happens to microstates; if, for instance, the state is $W$ before the measurement, then after the measurement of $E$ it is

$$W^E \equiv EWE + E'WE', \tag{11-16}$$

where we have written $E' = I - E$.

Let $[W]$ be the class which contains $W$ and $[W^E]$ the class which contains $W^E$. We shall now determine under what condition the class $[W^E]$ is independent of the representative $W$ in the class $[W]$. Thus we choose two members $W_1, W_2 \in [W]$ from the same class, so that $W_1 \sim W_2$. The condition that $W_1^E \sim W_2^E$ is that for any $F \in \mathcal{S}$ we have

$$\operatorname{Tr} F(EW_1E + E'W_1E') = \operatorname{Tr} F(EW_2E + E'W_2E').$$

By using the invariance property of the trace under cyclic permutations of the operators, we can change this into

$$\operatorname{Tr} (EFE + E'FE')W_1 = \operatorname{Tr} (EFE + E'FE')W_2.$$

This must be true for all projection operators $F \in \mathcal{S}$. This is true for all equivalent pairs $W_1, W_2$ if and only if

$$F^E \equiv EFE + E'FE' \in \mathcal{S}. \tag{11-17}$$

This, then, is the condition which guarantees that the macrostates are not broken up under the process of measurement.
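The step that moves the measurement from the state to the observable rests on the cyclic invariance of the trace, $\operatorname{Tr} F W^E = \operatorname{Tr} F^E W$. The following sketch checks this identity on assumed 2 x 2 matrices and also shows that $F^E = F$ when $E$ and $F$ commute. The matrices and the helper name `pinch` are illustrative choices, not notation from the text.

```python
# Numerical check of Tr F W^E = Tr F^E W, where X^E = E X E + E' X E'.
# All matrices below are assumptions chosen for illustration.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def pinch(X, E):
    """Return E X E + E' X E' with E' = I - E (hypothetical helper name)."""
    n = len(E)
    E_c = [[(1.0 if i == j else 0.0) - E[i][j] for j in range(n)]
           for i in range(n)]
    return add(matmul(matmul(E, X), E), matmul(matmul(E_c, X), E_c))

W = [[0.75, 0.25], [0.25, 0.25]]   # state
F = [[1.0, 0.0], [0.0, 0.0]]       # observed projection
E = [[0.5, 0.5], [0.5, 0.5]]       # measured projection, not commuting with F

lhs = trace(matmul(F, pinch(W, E)))   # Tr F W^E
rhs = trace(matmul(pinch(F, E), W))   # Tr F^E W
print(lhs, rhs)

# Abelian case: a projection E2 that commutes with F leaves F unchanged.
E2 = [[0.0, 0.0], [0.0, 1.0]]
print(pinch(F, E2) == F)
```

The two traces coincide, and in the commuting case the transformed projection is again `F`, which is why condition (11-17) holds automatically for an abelian set.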

We shall be particularly interested in the case of an abelian set of observables. In that case $F^E = F$, and so condition (11-17) is always true. When this condition is satisfied we may transfer the process of change under measurements from the micro- to the macrostates and we can write a formula

$$[W^E] = [W]^E$$

where the right-hand side denotes the class which contains the element $W^E$.

We are now also in a position to answer the question under what condition a state is left invariant under all the measurements. The condition for this is that

$$[W]^E = [W] \quad \text{or} \quad W^E \sim W \quad \text{for all } W \in [W].$$

This means that for all $E \in \mathcal{S}$ and all $F \in \mathcal{S}$ we must have

$$\operatorname{Tr} F W^E = \operatorname{Tr} F W = \operatorname{Tr} F^E W.$$

A sufficient condition for this to hold is

$$F^E \equiv EFE + E'FE' = F \tag{11-18}$$

for all $E, F \in \mathcal{S}$. If we require invariance for every state, then this condition is also necessary (Problem 2).

Since invariance of the states under measurements is a property of idealized classical systems, we shall call a state which is invariant under all measurements a classical state.

For a classical system the relation (11-18) is always satisfied since $E$ and $F$ commute. It follows that in a classical system every state is a classical state.

PROBLEMS

1. The relation $W_1 \sim W_2\ (\mathcal{S})$ is an equivalence relation.

2. If $\operatorname{Tr} AW = \operatorname{Tr} BW$ for all states $W$, then $A = B$.

11-6. EVENTS AND DATA

One of the chief difficulties in the epistemology of quantum mechanics is its apparent inadequacy for describing events. The fact that there are systems which do not admit dispersion-free states leads to the inevitable and irreducible probability statements regarding the occurrence of certain events. Such events may be the measurements associated with yes-no experiments, and as such they may be macroscopic phenomena. The individual occurrence of such phenomena is then completely outside the scope of the theory; only the probabilities for such events can be accounted for in our description of the state.

In order to understand this problem better we may compare it with the classical situation. The occurrence of probability statements in the description of states is not unknown in classical mechanics. Most systems, with a large number of degrees of freedom, are much better described with states which are not dispersion-free.

To illustrate this, suppose we want to describe the state of a thrown die before it is examined to learn what number it shows. Such a system is in a state determined by the conditions of the throw, but this state can only be described by a probability function as to the occurrence on top of one of the six sides. The initial condition or preparation which determined this state could be described by a rule such as: Throw die in the air not higher than so and so, and let it come to rest on the table.

From our experience with classical objects such as dice, we know that this description results in a probability statement for the outcome of the throw merely because it is not precise enough. We know from experience with similar systems that the specification of the preparation (throwing a die) can be made more precise, enough so as to determine the outcome of the throw with certainty. It is possible to add other relevant conditions to the prescription for preparing the state. For instance, we might specify the exact position, direction of throw, air currents in the vicinity, angular momentum, and many other variables, to such a degree of precision that the result of the throw will always be the same under the same conditions and we can predict it with certainty. In this case the state has no dispersion any more.

If we examine this question for quantum systems, then we find that it is not always possible to add conditions which permit the preparation of states without dispersion. Adding further conditions may indeed change the state but it will not make it dispersion-free.

The classical states show dispersion not because of any intrinsic occurrence of probabilities, but because the prescriptions for preparing the states already involve probabilities. This gives us justification for considering the probabilities as expressions of our ignorance of the finer features of the state. We do not have any doubt that a die, when it has come to rest, does indeed show one of the numbers before we look at it; the final act of observation does not produce this number; it merely uncovers a fact which has already occurred.

It is convenient to have a special technical terminology for the description of such facts to which we want to ascribe reality before they are observed. We shall call them events. The important property of events is that they represent objectively given phenomena capable of being determined by observations which in no way interfere with the state of the system. When an event has been observed we call it a datum.

While in classical mechanics every physical property which, after an observation, is found to be true, may be called an event (because we can always arrange it so that such an observation does not affect the state), the situation is more complex in quantum mechanics. It was pointed out by Einstein that we may have events in quantum mechanics, too [8]. Einstein used the term "element of reality" for the description of properties which we have called events. There are, however, also properties which do not have this element of reality and which cannot be called events. It is of the utmost importance in the analysis of the measuring process to be able to distinguish between the two kinds of properties.

Let $W$ be a state and $E$ a projection representing a yes-no experiment. The property $E$ is an event if and only if the measurement of $E$ does not affect the state $W$. We have previously shown that this is the case if and only if $W$ commutes with $E$. We may then affirm that each individual system of an ensemble of identically prepared systems in the state $W$ realizes one or several of the events; which of these events are in fact realized for an individual system is determined in principle by making measurements of all the events $E$ on that individual system. Since none of the measurements change the state, all the results which are obtained pertain to that one system in the unchangeable state $W$.

A little more delicate is the question concerning events if the system of observables is a restricted class of operators $\mathcal{S}$. As we have seen in the preceding section, the proper description of states in this case consists of the classes of equivalent states, that is, the macrostates. We are then entitled to affirm that a projection $E$ is an event in the state $W$ if $[W^E] \equiv [W]^E = [W]$. Thus the observation $E$ may very well change the states in one and the same class but it does not change the class. This condition is weaker than the one we had for microstates. $W$ need not commute with $E$: it is only necessary that it leave the macrostate unchanged, to be an event.
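The weaker condition for macrostates can be seen in a two-level example. The sketch assumes that the restricted system of observables consists of the diagonal operators, represented by the two diagonal projections; the state `W`, the projection `E`, and the helper names are likewise assumptions made for illustration.

```python
# A projection E that changes the microstate W but not its macrostate [W]
# with respect to an assumed abelian set of (diagonal) observables.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol
               for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def ideal_measurement(W, E):
    """W^E = E W E + E' W E' with E' = I - E, cf. (11-16)."""
    n = len(E)
    E_c = [[(1.0 if i == j else 0.0) - E[i][j] for j in range(n)]
           for i in range(n)]
    return add(matmul(matmul(E, W), E), matmul(matmul(E_c, W), E_c))

def equivalent(Wa, Wb, observables, tol=1e-12):
    return all(abs(trace(matmul(A, Wa)) - trace(matmul(A, Wb))) <= tol
               for A in observables)

observables = [[[1.0, 0.0], [0.0, 0.0]],
               [[0.0, 0.0], [0.0, 1.0]]]
W = [[0.5, 0.5], [0.5, 0.5]]
E = [[1.0, 0.0], [0.0, 0.0]]

W_E = ideal_measurement(W, E)
microstate_unchanged = close(W_E, W)
macrostate_unchanged = equivalent(W_E, W, observables)
print(microstate_unchanged, macrostate_unchanged)
```

Here `E` does not commute with `W` and the density operator changes, yet every observable in the restricted set has the same expectation value before and after, so `E` qualifies as an event in the macrostate sense.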

This remark will be very important in the analysis of the measuring process, since the necessary classical feature of the measuring equipment implies the restriction of the observables of the measuring device to an abelian subset of all the observables.

11-7. MATHEMATICAL INTERLUDE: THE TENSOR PRODUCT

In the following parts of the analysis of the measuring process, we need to apply the theory of the tensor product of Hilbert spaces, which we shall develop in this section. The tensor product is involved whenever we consider the union or separation of two subsystems, a process which occurs precisely during a measurement; the two systems in this case are the system to be measured on the one hand and the measuring equipment on the other.

The reason for the occurrence of the tensor product may be seen from the following remarks. Let $S_1$ and $S_2$ be two systems, and let $\mathcal{S}_1$ be a complete set of commuting observables of $S_1$ and $\mathcal{S}_2$ such a set for $S_2$. Every observable $A_1 \in \mathcal{S}_1$ is naturally also an observable on the joint system $S_1 + S_2$. The same is true for every observable $A_2 \in \mathcal{S}_2$. Furthermore every observable $A_1$ commutes with every observable $A_2$ and $\{\mathcal{S}_1, \mathcal{S}_2\}$ is a complete set of commuting observables for $S_1 + S_2$.

We must thus find a description which will incorporate these characteristic properties of the union of two systems. This can be done in the spectral representation for the two systems $\mathcal{S}_1$ and $\mathcal{S}_2$, for instance, as follows: Let $\Lambda_1$ be the Cartesian product of the spectra of a complete commuting set of some of the observables $\mathcal{S}_1$; and similarly, let $\Lambda_2$ be the corresponding product of the spectra of some of the observables $\mathcal{S}_2$.

The Hilbert space $\mathcal{H}_1$ of the spectral representation for $\mathcal{S}_1$ consists of square-integrable functions $\varphi_1(\lambda_1)$ with $\lambda_1 \in \Lambda_1$. Similarly the Hilbert space $\mathcal{H}_2$ of the spectral representation for $\mathcal{S}_2$ consists of square-integrable functions $\varphi_2(\lambda_2)$ with $\lambda_2 \in \Lambda_2$. The Hilbert space $\mathcal{G}$ of the spectral representation for the set $\{\mathcal{S}_1, \mathcal{S}_2\}$ is then a set of functions $\varphi(\lambda_1, \lambda_2)$, square-integrable over the Cartesian product space $\Lambda_1 \times \Lambda_2$.

The measure $\rho$ for the space $\Lambda_1 \times \Lambda_2$ may be taken as the product measure $\rho_1 \rho_2$ of the measures for the spaces $\Lambda_1$ and $\Lambda_2$ separately.

In this way we arrive quite naturally at the notion of the tensor product. Associated with the pair of spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ we have constructed another space $\mathcal{G}$, the tensor product of $\mathcal{H}_1$ and $\mathcal{H}_2$, which in some respect generalizes the notion of product.

The space $\mathcal{G}$ contains a certain subset of vectors, those of the form $\varphi(\lambda_1, \lambda_2) = \varphi_1(\lambda_1)\varphi_2(\lambda_2)$. This subset is the image of a bilinear mapping of pairs of vectors $\varphi_1 \in \mathcal{H}_1$ and $\varphi_2 \in \mathcal{H}_2$ into $\mathcal{G}$. Furthermore, this image set is a complete set of vectors in $\mathcal{G}$ in the sense that every vector in $\mathcal{G}$ can be written as a linear combination of vectors of the form $\varphi_1(\lambda_1)\varphi_2(\lambda_2)$, or as a limit of such combinations.

These last two remarks will permit us to free ourselves from the particular construction of the tensor product which we have adopted here for illustrative purposes. We shall now proceed to a formal and abstract definition of the tensor products.

Let $\mathcal{H}_1$ and $\mathcal{H}_2$ be two Hilbert spaces. The tensor product $\mathcal{G} = \mathcal{H}_1 \otimes \mathcal{H}_2$ is a Hilbert space together with a bilinear mapping $\varphi$ from the topological product $\mathcal{H}_1 \times \mathcal{H}_2$ into $\mathcal{G}$ such that

1) the set of all vectors $\varphi(f_1, f_2)$ (where $f_1 \in \mathcal{H}_1$, $f_2 \in \mathcal{H}_2$) spans $\mathcal{G}$;

2) $(\varphi(f_1, f_2), \varphi(g_1, g_2)) = (f_1, g_1)(f_2, g_2)$ for all $f_1, g_1 \in \mathcal{H}_1$ and for all $f_2, g_2 \in \mathcal{H}_2$.

A few remarks on this definition and the notation may enhance the reader's understanding of them:

We are using two different kinds of products, the topological product $\mathcal{H}_1 \times \mathcal{H}_2$, and the tensor product $\mathcal{H}_1 \otimes \mathcal{H}_2$, which should not be confused. The first consists simply of the pairs of vectors $\{f_1, f_2\}$ with $f_1 \in \mathcal{H}_1$ and $f_2 \in \mathcal{H}_2$. The second is a Hilbert space together with a bilinear mapping $\varphi$ which satisfies the two conditions indicated above.

We have designated the scalar product in $\mathcal{G}$ with the same bracket notation as the scalar product in $\mathcal{H}_1$ and $\mathcal{H}_2$. The reader should not confuse the two.

It is important to note that the image of the bilinear mapping $\varphi$ is not the entire Hilbert space $\mathcal{G}$; it is only a proper subset. However, this subset is total in the sense that it spans the entire space $\mathcal{G}$.
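In finite dimensions these statements can be checked with coordinate vectors. The sketch below takes the Kronecker product of coordinate lists as the bilinear mapping, which is one standard realization and an assumption of this example; it is restricted to real two-component vectors for simplicity, and the function names are hypothetical.

```python
# A finite-dimensional, real-valued illustration of the two defining
# conditions, using the Kronecker product of coordinate vectors as the
# bilinear mapping (an assumed realization, not the book's construction).

def kron(f, g):
    """Coordinates of the product vector: all products f[i] * g[j]."""
    return [a * b for a in f for b in g]

def inner(f, g):
    """Scalar product of two real coordinate vectors."""
    return sum(a * b for a, b in zip(f, g))

def is_product_vector(v):
    """For two 2-dimensional factors, v = (v00, v01, v10, v11) is a product
    vector exactly when v00 * v11 == v01 * v10."""
    return v[0] * v[3] == v[1] * v[2]

f1, f2 = [1, 2], [3, -1]
g1, g2 = [0, 1], [2, 2]

# Condition 2: the scalar product of product vectors factorizes.
lhs = inner(kron(f1, f2), kron(g1, g2))
rhs = inner(f1, g1) * inner(f2, g2)
print(lhs, rhs)

# The image of the mapping is a proper subset: this sum of two product
# vectors is not itself a product vector, although it lies in their span.
e0, e1 = [1, 0], [0, 1]
v = [a + b for a, b in zip(kron(e0, e0), kron(e1, e1))]
print(is_product_vector(kron(f1, f2)), is_product_vector(v))
```

The vector `v` lies in the span of product vectors without being one, which is the sense in which the image of the mapping is total but not the whole space.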

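The definition need not be restricted to the product of two spaces. Indeed, we may define the tensor product of a finite number $n$ of spaces $\mathcal{G} = \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \cdots \otimes \mathcal{H}_n$ as a Hilbert space together with a multilinear mapping $\varphi$ from $\mathcal{H}_1 \times \mathcal{H}_2 \times \cdots \times \mathcal{H}_n$ into $\mathcal{G}$ which satisfies

1') the set of all vectors $\varphi(f_1, f_2, \ldots, f_n)$ ($f_r \in \mathcal{H}_r$) spans $\mathcal{G}$;

2') $(\varphi(f_1, f_2, \ldots, f_n), \varphi(g_1, g_2, \ldots, g_n)) = (f_1, g_1)(f_2, g_2) \cdots (f_n, g_n)$ for all $f_r, g_r \in \mathcal{H}_r$ ($r = 1, 2, \ldots, n$).

We have already proved the existence of the tensor product for $n = 2$ by the construction employed in our example at the beginning of this section. We now need only verify that this construction does indeed satisfy the conditions (1) and (2) of the definition (Problem 1).

The tensor product is unique in the following precise sense: Let $\mathcal{H}_1$ and $\mathcal{H}_2$ be two Hilbert spaces and denote by $\mathcal{G}$ and $\mathcal{G}'$ two different tensor products with the associated bilinear mappings $\varphi$ and $\varphi'$ respectively; then there exists a unique isometric operator $U$ with domain $\mathcal{G}$ and range $\mathcal{G}'$, such that

$$U\varphi(f_1, f_2) = \varphi'(f_1, f_2) \quad \text{for all } f_1 \in \mathcal{H}_1,\ f_2 \in \mathcal{H}_2. \tag{11-19}$$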

This uniqueness property of the tensor product is very important for its physical interpretation. The projection operators in $\mathcal{G}$ are the yes-no experiments for the joint system. The representation of the Hilbert space is determined only up to unitary equivalence by the lattice structure of the propositions. Then the uniqueness of the tensor product means that the physical property of the joint system, insofar as it is contained in the algebraic structure of the lattice of propositions, is entirely determined by the structure of the component systems.

An explicit construction of the tensor product, independent of any particular reference system, can be given as follows:

We define first the notion of the conjugate linear transformation $T$ from $\mathcal{H}_2$ into $\mathcal{H}_1$. Such a $T$ is required to satisfy

$$T(f_2 + g_2) = Tf_2 + Tg_2, \qquad T(\lambda f_2) = \lambda^* Tf_2 \tag{11-20}$$

for all $f_2, g_2 \in \mathcal{H}_2$. Let $\{\psi_r\}$ be a complete orthonormal system in $\mathcal{H}_2$ and define the norm

$$\|T\|^2 = \sum_r \|T\psi_r\|^2. \tag{11-21}$$

The sum on the right-hand side need not be finite, but it is independent of the choice of the orthonormal system $\{\psi_r\}$ (Problem 2). This norm satisfies the parallelogram identity (cf. Section 2-2, Problem 3), so that it can be derived from a scalar product as follows:

$$(T, S) = \sum_r (T\psi_r, S\psi_r). \tag{11-22}$$

The set of all conjugate linear mappings $T$, with $\|T\| < \infty$, is therefore a Hilbert space $\mathcal{G}$. Moreover there exists a bilinear mapping $\varphi$ from $\mathcal{H}_1 \times \mathcal{H}_2$ into $\mathcal{G}$ by setting, for each pair of vectors $f_1, f_2$,

$$Tg_2 = (g_2, f_2) f_1. \tag{11-23}$$

We denote this particular $T$ by $f_1 \otimes f_2$. Its norm is $\|f_1 \otimes f_2\| = \|f_1\|\,\|f_2\|$ (Problem 3).

Now let $T = f_1 \otimes f_2$ and $S = g_1 \otimes g_2$; then one verifies easily that (Problem 3)

$$(T, S) = (f_1, g_1)(f_2, g_2). \tag{11-24}$$

Thus we have completed the construction of the tensor product: The elements of the space $\mathcal{G}$ are the conjugate linear mappings of $\mathcal{H}_2$ into $\mathcal{H}_1$ with finite norm, and the bilinear mapping is given by Eq. (11-23).
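The construction by conjugate linear mappings can be tried out in two dimensions. In the sketch below a conjugate linear map is stored as a matrix `M` that acts on the complex conjugate of its argument; this storage convention, the sample vectors, and the function names are assumptions made for the example. The scalar product is taken to be conjugate linear in its first argument, in line with Eq. (11-23).

```python
# Conjugate linear maps T: H2 -> H1 in two dimensions, stored as a matrix M
# with (T g)[i] = sum_j M[i][j] * conj(g[j]). Storage convention, vectors and
# function names are illustrative assumptions.

def inner(f, g):
    """Scalar product, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def product_map(f1, f2):
    """Matrix of T = f1 (x) f2, defined by T g2 = (g2, f2) f1, Eq. (11-23)."""
    return [[a * b for b in f2] for a in f1]

def apply(M, g):
    """Apply the conjugate linear map stored in M to the vector g."""
    return [sum(m * x.conjugate() for m, x in zip(row, g)) for row in M]

def scalar_product(MT, MS):
    """(T, S) = sum_r (T e_r, S e_r) over the standard basis, Eq. (11-22)."""
    return sum(t.conjugate() * s
               for row_t, row_s in zip(MT, MS) for t, s in zip(row_t, row_s))

f1, f2 = [1, 1j], [2, 1 - 1j]
g1, g2 = [1, 0], [1, 1]

T = product_map(f1, f2)
S = product_map(g1, g2)

# T g2 agrees with (g2, f2) f1.
print(apply(T, g2), [inner(g2, f2) * a for a in f1])

# Conjugate linearity, Eq. (11-20): T(c g) = conj(c) T g.
c = 2 + 1j
print(apply(T, [c * x for x in g2]), [c.conjugate() * y for y in apply(T, g2)])

# Eq. (11-24): (T, S) = (f1, g1)(f2, g2); the norm follows with S = T.
print(scalar_product(T, S), inner(f1, g1) * inner(f2, g2))
print(scalar_product(T, T), inner(f1, f1) * inner(f2, f2))
```

Each printed pair should agree. The last line compares the squared norm of `T` with the product of the squared norms of `f1` and `f2`.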

To every $T \in \mathcal{G}$ we can uniquely associate another conjugate linear mapping $T^\#$ from $\mathcal{H}_1$ into $\mathcal{H}_2$ by the formula

$$(f_1, Tf_2) = (f_2, T^\# f_1).$$

This mapping is unique because of Riesz' theorem (Section 3-1). One verifies (Problem 5) the following rules of this correspondence:

$$T^{\#\#} = T, \qquad \|T^\#\| = \|T\|,$$

$$(T + S)^\# = T^\# + S^\#, \qquad (\lambda T)^\# = \lambda T^\#.$$

If $T = f_1 \otimes f_2$, so that, according to Eq. (11-23), $Tg_2 = (g_2, f_2) f_1$, then we find that

$$(g_1, Tg_2) = (g_1, f_1)(g_2, f_2) = (g_2, T^\# g_1);$$

therefore we must have $T^\# g_1 = (g_1, f_1) f_2$. It is suggestive to denote this particular $T^\#$ by $T^\# = f_2 \otimes f_1$.

The notion of the product can be extended from the vectors to the linear operators. Thus, if we have two bounded operators $A_1$ and $A_2$, where $A_1$ operates in $\mathcal{H}_1$ and $A_2$ operates in $\mathcal{H}_2$, then we may define for every $T = f_1 \otimes f_2$ the operation

$$(A_1 \otimes A_2)(f_1 \otimes f_2) = A_1 f_1 \otimes A_2 f_2.$$

This definition of $A_1 \otimes A_2$ can be extended by linearity to the entire space $\mathcal{G}$. Then we have, for every $T \in \mathcal{G}$, the formula (Problem 6)

$$(A_1 \otimes A_2)T = A_1 T A_2^\dagger, \tag{11-25}$$

where $A_2^\dagger$ is the Hermitian conjugate of $A_2$. This completes the construction of the tensor product.
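Formula (11-25) can be checked on a product mapping in two dimensions. As in any such sketch, the representation is an assumption: a conjugate linear map is stored as a matrix acting on the complex conjugate of its argument, and the operators `A1`, `A2` and the vectors are arbitrary illustrative choices.

```python
# Checking (A1 (x) A2) T = A1 T A2^dagger for T = f1 (x) f2 by applying both
# sides to a test vector. Representation and sample values are assumptions.

def inner(f, g):
    """Scalar product, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dagger(A):
    """Hermitian conjugate (conjugate transpose) of a square matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def product_map(f1, f2):
    """Matrix of the conjugate linear map g -> (g, f2) f1."""
    return [[a * b for b in f2] for a in f1]

def apply(M, g):
    """Apply the conjugate linear map stored in M to the vector g."""
    return [sum(m * x.conjugate() for m, x in zip(row, g)) for row in M]

A1 = [[1, 1j], [0, 2]]
A2 = [[0, 1], [1j, 1]]
f1, f2 = [1, 1j], [2, 1 - 1j]
g = [1, 1j]

T = product_map(f1, f2)

# Left-hand side: the product mapping built from A1 f1 and A2 f2.
lhs = apply(product_map(matvec(A1, f1), matvec(A2, f2)), g)

# Right-hand side: A1 applied after T applied after A2^dagger.
rhs = matvec(A1, apply(T, matvec(dagger(A2), g)))

print(lhs)
print(rhs)
```

Both sides reduce to $(g, A_2 f_2)\,A_1 f_1$, which is why the Hermitian conjugate of $A_2$, and not $A_2$ itself, appears on the right.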

PROBLEMS

1. The construction of the tensor product by means of the spectral representation of two complete sets of commuting operators used in the first part of this section does satisfy the formal definition of the tensor product.

2. If $T$ is a conjugate linear mapping of a Hilbert space $\mathcal{H}_2$ into another $\mathcal{H}_1$, then

$$\sum_{r=1}^{\infty} \|T\psi_r\|^2 = \sum_{r=1}^{\infty} \|T\varphi_r\|^2,$$

where $\{\psi_r\}$ and $\{\varphi_r\}$ are two arbitrary complete orthonormal systems in $\mathcal{H}_2$.

3. If $f_1 \otimes f_2$ is defined as the conjugate linear mapping $T$ given by

$$Tg_2 = (g_2, f_2) f_1,$$

and the scalar product of two such mappings is defined by

$$(T, S) = \sum_{r=1}^{\infty} (T\psi_r, S\psi_r),$$

then

$$(f_1 \otimes f_2, g_1 \otimes g_2) = (f_1, g_1)(f_2, g_2)$$

and

$$\|f_1 \otimes f_2\| = \|f_1\|\,\|f_2\|.$$

4. If $T$ is a conjugate linear mapping from $\mathcal{H}_2$ into $\mathcal{H}_1$, then $T^\#$ defined by $(f_1, Tf_2) = (f_2, T^\# f_1)$ for all $f_1 \in \mathcal{H}_1$, $f_2 \in \mathcal{H}_2$ is a uniquely determined conjugate linear mapping of $\mathcal{H}_1$ into $\mathcal{H}_2$.

5. The correspondence $T \to T^\#$ satisfies the rules

$$T^{\#\#} = T, \qquad \|T^\#\| = \|T\|,$$

$$(T + S)^\# = T^\# + S^\#, \qquad (\lambda T)^\# = \lambda T^\#.$$

6. If $A_1$ is a bounded operator in $\mathcal{H}_1$ and $A_2$ a similar operator in $\mathcal{H}_2$, then $(A_1 \otimes A_2)T = A_1 T A_2^\dagger$ for every $T \in \mathcal{G}$.

11-8. THE UNION AND SEPARATION OF SYSTEMS

We shall now consider two coherent systems and The joint system wedenote by + S2. The principal question that we want to answer here ishow the states of the component systems are related to the states of the jointsystem, and vice versa.

Let us begin with the first part of the question. The states of the compo-nent system shall be given by their respective density operators W1 and W2.We wish to know what the state is if we consider the two components togetheras a joint system. The criterion for answering this question is a physical one:lf we measure observables which refer to only one of the components we mustobtain the same result whether we consider them measured on the joint systemor on the component system.

We shall denote by $A_1$ an observable which refers to the component $S_1$. As an observable on $S_1$ it is a self-adjoint operator in the Hilbert space $\mathcal{H}_1$ pertaining to the system $S_1$. But as an observable on the joint system $S_1 + S_2$ it is a self-adjoint operator in the Hilbert space $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ pertaining to the joint system. Since it is an observable on $S_1$ alone, it must be the identity operator $I_2$ in $\mathcal{H}_2$. That is, it must have the form $A_1 \otimes I_2$, which we continue to denote by $A_1$ when it acts on the joint system. Similarly an observable on the joint system which refers only to system $S_2$ must have the form $I_1 \otimes A_2$, again denoted by $A_2$.

Let $W_1$ be the state of system $S_1$ and $W_2$ the state of system $S_2$. The joint system $S_1 + S_2$ is then in a state $W$ for which we now want to determine the manner of its dependence on $W_1$ and $W_2$. The conditions which $W$ must satisfy are thus

$$\operatorname{Tr} A_1 W = \operatorname{Tr}_1 A_1 W_1, \qquad \operatorname{Tr} A_2 W = \operatorname{Tr}_2 A_2 W_2. \tag{11-26}$$

Here we have, for greater clarity, introduced the notation $\operatorname{Tr}_1$ and $\operatorname{Tr}_2$ for traces which refer only to $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively.

One possible solution of Eq. (11-26) is given by $W = W_1 \otimes W_2$, as one may verify immediately. In general this is not the only solution. Physically this means that the state of the compound system $S_1 + S_2$ cannot be completely determined by measurements on the component systems alone. There are thus physically distinguishable properties which express themselves as correlations between observations on $S_1$ and $S_2$. Such correlations are nonexistent for the state $W_1 \otimes W_2$.
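The non-uniqueness just described can be illustrated with a small numerical sketch. The four-index storage layout, the function names, and the two example states below are assumptions made for illustration and do not come from the text. The sketch builds a product state and a correlated mixture with identical reduced states and shows that a joint (correlation) observable tells them apart.

```python
# Joint density operators are stored as w[i][j][k][l] = <e_i f_j| W |e_k f_l>
# (an illustrative layout, not the book's notation).

def product_state(w1, w2):
    """Return W1 (x) W2 in the four-index layout described above."""
    n1, n2 = len(w1), len(w2)
    return [[[[w1[i][k] * w2[j][l] for l in range(n2)]
              for k in range(n1)]
             for j in range(n2)]
            for i in range(n1)]

def reduce_to_first(w):
    """Trace out system 2; the result satisfies the first condition of (11-26)."""
    n1, n2 = len(w), len(w[0])
    return [[sum(w[i][j][k][j] for j in range(n2)) for k in range(n1)]
            for i in range(n1)]

def reduce_to_second(w):
    """Trace out system 1; the result satisfies the second condition of (11-26)."""
    n1, n2 = len(w), len(w[0])
    return [[sum(w[i][j][i][l] for i in range(n1)) for l in range(n2)]
            for j in range(n2)]

half = [[0.5, 0.0], [0.0, 0.5]]           # maximally mixed state of one component
w_product = product_state(half, half)     # the solution W1 (x) W2

# A second solution of (11-26): equal mixture of e_0 (x) f_0 and e_1 (x) f_1.
w_correlated = [[[[0.5 if i == j == k == l else 0.0 for l in range(2)]
                  for k in range(2)]
                 for j in range(2)]
                for i in range(2)]

for name, w in (("product", w_product), ("correlated", w_correlated)):
    # w[0][0][0][0] is the expectation value of the joint projection P_0 (x) Q_0.
    print(name, reduce_to_first(w), reduce_to_second(w), w[0][0][0][0])
```

Both states reduce to the same component states, yet the expectation value of the joint projection differs, which is the kind of correlation the paragraph above refers to.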

There is one exception: If one of the states $W_1$ or $W_2$ is pure (Problem 1), then the states $W_1$ and $W_2$ determine the state of the compound system uniquely.

Let us now examine the reverse problem: Given $W$, determine $W_1$ and $W_2$ such that Eqs. (11-26) hold. This problem always has a unique solution. Let us first demonstrate the uniqueness of the solution. Let $W$, $W_1$, and $W_2$ be three density operators satisfying Eq. (11-26) for all $A_1$ and $A_2$. Let us assume that $W$, $W_1'$, and $W_2'$ is another set of such operators also satisfying the conditions (11-26). We find then that

$$\operatorname{Tr}_1 A_1 W_1 = \operatorname{Tr}_1 A_1 W_1' \tag{11-27}$$

for all observables $A_1$ of $S_1$. Since $S_1$ was assumed to be a coherent system, this means that (11-27) must be true for all projections $P$, in particular one-dimensional ones. Let $P$ be such a projection and $\varphi$ a unit vector in its range. Then we obtain from Eq. (11-27)

$$(\varphi, W_1 \varphi) = (\varphi, W_1' \varphi)$$

for all $\varphi$. This is possible only if $W_1 = W_1'$. Similarly one proves $W_2 = W_2'$. This proves uniqueness.

Let us next assume that $W$ is a mixture, for instance, $W = \lambda U + \mu V$. Let $U_1$ and $U_2$ be the component states determined by $U$, and similarly let $V_1$ and $V_2$ be the component states determined by $V$. The linearity of the connection (11-26) between $W$ and $W_1$, $W_2$ then results immediately in the statement: The component states $W_1$ and $W_2$ corresponding to the state $W$ are given by

$$W_1 = \lambda U_1 + \mu V_1, \qquad W_2 = \lambda U_2 + \mu V_2. \tag{11-28}$$

This shows in particular that the component states can be pure only if $W$ is pure.

In any case we need only determine the component states for pure states $W$; for the mixtures they can, by means of Eq. (11-28), be calculated immediately in terms of the pure states contained in the mixture.

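Let us then assume that $W$ is pure and denote by $\Phi$ the vector in $\mathcal{H}_1 \otimes \mathcal{H}_2$ contained in the range of $W$. $\Phi$ is thus an antilinear mapping of $\mathcal{H}_2$ into $\mathcal{H}_1$. We assume $\Phi$ normalized, so that $\|\Phi\| = 1$. Let us choose a complete orthonormal system $\Phi_r \in \mathcal{H}_1 \otimes \mathcal{H}_2$ such that $\Phi_1 = \Phi$. We then obtain

$$\operatorname{Tr} A_2 W = \sum_r (\Phi_r, A_2 W \Phi_r) = (\Phi, A_2 \Phi).$$

For $A_2 = I_1 \otimes A_2$ we use Eq. (11-25) and obtain

$$A_2 \Phi = (I_1 \otimes A_2)\,\Phi = \Phi A_2^{*}.$$

Therefore, with $\{\psi_r\}$ a complete orthonormal system in $\mathcal{H}_2$,

$$\operatorname{Tr} A_2 W = \sum_r (\Phi\psi_r, \Phi A_2^{*}\psi_r) = \sum_r (\psi_r, A_2 \Phi^{\#}\Phi\,\psi_r) = \operatorname{Tr}_2 A_2 \Phi^{\#}\Phi,$$

which shows that $W_2 = \Phi^{\#}\Phi$. In a similar way we evaluate $W_1$. The result is

$$W_1 = \Phi\Phi^{\#}, \qquad W_2 = \Phi^{\#}\Phi. \tag{11-29}$$

In the general case in which $W$ is a mixture of orthogonal states $\Phi_r$ with weights $\lambda_r$, the component states are

$$W_1 = \sum_r \lambda_r \Phi_r \Phi_r^{\#} \quad\text{and}\quad W_2 = \sum_r \lambda_r \Phi_r^{\#} \Phi_r. \tag{11-30}$$

We shall refer to Eqs. (11-29) and (11-30) as the reduction formulas, and the states $W_1$, $W_2$ are called the reduced (or component) states.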
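A minimal numerical sketch of the reduction formula for a pure state follows. It assumes that $\Phi$ is represented by a coefficient matrix `c` in fixed orthonormal bases, so that the antilinear map acts as $(\Phi g)_i = \sum_j c_{ij}\,\overline{g_j}$ and $(\Phi^{\#} f)_j = \sum_i c_{ij}\,\overline{f_i}$. The function names and the example matrix are illustrative choices, not part of the text.

```python
def apply_phi(c, g):
    """Antilinear map Phi from H2 into H1: (Phi g)_i = sum_j c[i][j] * conj(g[j])."""
    return [sum(c[i][j] * g[j].conjugate() for j in range(len(g)))
            for i in range(len(c))]

def apply_phi_sharp(c, f):
    """Antilinear map Phi# from H1 into H2: (Phi# f)_j = sum_i c[i][j] * conj(f[i])."""
    return [sum(c[i][j] * f[i].conjugate() for i in range(len(f)))
            for j in range(len(c[0]))]

def matrix_of(linear_map, dim):
    """Matrix of a linear map; column k is the image of the k-th basis vector."""
    columns = [linear_map([1.0 if m == k else 0.0 for m in range(dim)])
               for k in range(dim)]
    return [[columns[k][r] for k in range(dim)] for r in range(len(columns[0]))]

# Example pure state (an assumption): Phi = 0.6 e_0 (x) f_0 + 0.8i e_1 (x) f_1.
c = [[0.6, 0.0], [0.0, 0.8j]]

# The composition of two antilinear maps is linear, so it has a matrix.
w1 = matrix_of(lambda f: apply_phi(c, apply_phi_sharp(c, f)), 2)   # Phi Phi#
w2 = matrix_of(lambda g: apply_phi_sharp(c, apply_phi(c, g)), 2)   # Phi# Phi

# Independent check: partial trace of the projection onto Phi over system 2.
w1_direct = [[sum(c[i][j] * c[k][j].conjugate() for j in range(2))
              for k in range(2)]
             for i in range(2)]

print(w1)
print(w2)
print(w1_direct)
```

In this representation $\Phi\Phi^{\#}$ coincides with the matrix product of `c` with its conjugate transpose, which is the usual partial trace of the projection onto $\Phi$.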

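Let us now discuss the reduction formula (11-29) in a little more detail. Consider first the case that $\Phi = \varphi \otimes \psi$ where $\varphi$ and $\psi$ are both normalized and $\varphi \in \mathcal{H}_1$, $\psi \in \mathcal{H}_2$. From the definition of $\Phi$ and $\Phi^{\#}$ it follows that

$$\Phi\psi = (\psi, \psi)\,\varphi = \varphi, \qquad \Phi^{\#}\varphi = (\varphi, \varphi)\,\psi = \psi.$$

Thus $\Phi\Phi^{\#}\varphi = \varphi$ and $\Phi^{\#}\Phi\psi = \psi$. Furthermore, if $\varphi'$ is orthogonal to $\varphi$, so that $(\varphi', \varphi) = 0$, then

$$\Phi^{\#}\varphi' = (\varphi', \varphi)\,\psi = 0$$

so that $\Phi\Phi^{\#}\varphi' = 0$. Similarly, if $\psi'$ is orthogonal to $\psi$, then $\Phi^{\#}\Phi\psi' = 0$.

In this case we find that $\Phi\Phi^{\#} = P$ is a projection in $\mathcal{H}_1$ with one-dimensional range containing $\varphi$. Similarly $\Phi^{\#}\Phi = Q$ is a projection in $\mathcal{H}_2$ with one-dimensional range containing $\psi$. This means that if the compound state has the form $\Phi = \varphi \otimes \psi$, then the reduced states are pure. The converse we have already seen, and so we have established:

The reduced states are pure if and only if the pure state $\Phi$ is of the form $\varphi \otimes \psi$.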

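Let us now consider the case in which the compound state is still pure, but not of this form. Then we know from the previous discussion that neither $W_1$ nor $W_2$ can be pure. Let $W_1 = \sum_r \alpha_r^2 P_r$ with projections $P_r$ with one-dimensional range and $\alpha_r > 0$, $\sum_r \alpha_r^2 = 1$. Define $\psi_r = (1/\alpha_r)\,\Phi^{\#}\varphi_r$, where $\varphi_r$ is a normalized vector in the range of $P_r$. We have then

$$(\psi_r, \psi_s) = \frac{1}{\alpha_r \alpha_s}\,(\Phi^{\#}\varphi_r, \Phi^{\#}\varphi_s) = \frac{1}{\alpha_r \alpha_s}\,(\varphi_s, \Phi\Phi^{\#}\varphi_r) = \delta_{rs}$$

and

$$W_2 \psi_r = \frac{1}{\alpha_r}\,\Phi^{\#}\Phi\Phi^{\#}\varphi_r = \frac{1}{\alpha_r}\,\Phi^{\#} W_1 \varphi_r = \alpha_r^2\,\psi_r.$$

Thus $\psi_r$ is a normalized eigenvector of $W_2$ with eigenvalue $\alpha_r^2$. Furthermore every eigenvector of $W_2$ is of this form. It follows from this that $W_2$ has the form $W_2 = \sum_r \alpha_r^2 Q_r$ with $Q_r \psi_r = \psi_r$.

If we complete the vectors $\varphi_r$ and $\psi_r$ to complete orthonormal systems in $\mathcal{H}_1$ and $\mathcal{H}_2$ respectively, we obtain such a system in $\mathcal{H}_1 \otimes \mathcal{H}_2$ in the form $\varphi_r \otimes \psi_s$. By substituting the definition of $\varphi_r \otimes \psi_s$ as an antilinear mapping of $\mathcal{H}_2$ into $\mathcal{H}_1$ we find

$$(\Phi, \varphi_r \otimes \psi_s) = \sum_t \bigl(\Phi\psi_t, (\varphi_r \otimes \psi_s)\psi_t\bigr) = (\Phi\psi_s, \varphi_r) = \alpha_r\,\delta_{rs}.$$

This means $\Phi$ has the development

$$\Phi = \sum_r \alpha_r\,\varphi_r \otimes \psi_r. \tag{11-31}$$

We have thus established the following result:

If $\Phi$ is a general vector in $\mathcal{H}_1 \otimes \mathcal{H}_2$, then there exists an orthonormal system $\varphi_r$ in $\mathcal{H}_1$, a similar system $\psi_r$ in $\mathcal{H}_2$, and positive numbers $\alpha_r$ such that

$$\Phi = \sum_r \alpha_r\,\varphi_r \otimes \psi_r,$$

$$W_1 = \sum_r \alpha_r^2 P_r, \qquad W_2 = \sum_r \alpha_r^2 Q_r,$$

$$P_r \varphi_r = \varphi_r, \qquad Q_r \psi_r = \psi_r.$$

With this result we have established the normal form of the reduction of the pure state $\Phi$ of the compound system to its component states.

PROBLEMS

1. If one of the states $W_1$, $W_2$ is pure and

$$\operatorname{Tr} A_1 W = \operatorname{Tr}_1 A_1 W_1, \qquad \operatorname{Tr} A_2 W = \operatorname{Tr}_2 A_2 W_2$$

for all observables $A_1$ and $A_2$, then $W = W_1 \otimes W_2$.

2. Let $W_1$, $W_2$ be the reduced states of the state $W$. If $W_1$ is pure, then $W_2$ is pure too.

3. If $P = \Phi\Phi^{\#}$ is a projection of one-dimensional range, then $Q = \Phi^{\#}\Phi$ is also a projection of one-dimensional range.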
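The normal form of the reduction derived in this section can be checked numerically in the reverse direction. The sketch below assumes particular orthonormal systems and positive coefficients (all illustrative choices), builds the coefficient matrix of $\Phi = \sum_r \alpha_r\,\varphi_r \otimes \psi_r$, and verifies that $\psi_r = (1/\alpha_r)\,\Phi^{\#}\varphi_r$ and $\Phi\Phi^{\#}\varphi_r = \alpha_r^2\,\varphi_r$.

```python
import math

def apply_phi(c, g):
    """Antilinear map Phi from H2 into H1: (Phi g)_i = sum_j c[i][j] * conj(g[j])."""
    return [sum(c[i][j] * g[j].conjugate() for j in range(len(g)))
            for i in range(len(c))]

def apply_phi_sharp(c, f):
    """Antilinear map Phi# from H1 into H2: (Phi# f)_j = sum_i c[i][j] * conj(f[i])."""
    return [sum(c[i][j] * f[i].conjugate() for i in range(len(f)))
            for j in range(len(c[0]))]

def close(u, v, tol=1e-12):
    """True if two vectors agree componentwise within tol."""
    return all(abs(x - y) < tol for x, y in zip(u, v))

s = 1 / math.sqrt(2)
phis = [[s, s], [s, -s]]              # assumed orthonormal system in H1
psis = [[s, s * 1j], [s, -s * 1j]]    # assumed orthonormal system in H2
alphas = [0.8, 0.6]                   # positive numbers with squares summing to 1

# Coefficient matrix of Phi = sum_r alpha_r phi_r (x) psi_r.
c = [[sum(a * p[i] * q[j] for a, p, q in zip(alphas, phis, psis))
      for j in range(2)]
     for i in range(2)]

checks = []
for a, phi_r, psi_r in zip(alphas, phis, psis):
    recovered_psi = [x / a for x in apply_phi_sharp(c, phi_r)]
    w1_phi = apply_phi(c, apply_phi_sharp(c, phi_r))   # W1 applied to phi_r
    checks.append(close(recovered_psi, psi_r))
    checks.append(close(w1_phi, [a * a * x for x in phi_r]))

print(checks)
```

Going from a given $\Phi$ to its normal form requires diagonalizing $W_1$; the sketch only confirms that the relations of the text hold for a state already written in that form.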


11-9. A MODEL OF THE MEASURING PROCESS

In this section we shall return to physics and complete the analysis of the process of measurement on a model which is so designed that it permits an explicit description of the state of the system and the measuring device during the entire measuring process. The measuring device is here assumed in the simplest form, so that its quantum character is not obscured by the complexity of a large system. The purpose of the model is to show the consistency of measurement in the quantum description of the entire system.

In order for the model to represent the essential features of the measurement, it must satisfy the conditions laid down in Section 11-4. The system thus consists of two parts, the system $S_1 = S$ on which a state is to be measured, and a microscopic but classical measuring device $S_2 = m$.

For the system $S_1$ we take a system represented by a two-dimensional state vector. Let $\varphi_+$ and $\varphi_-$ be two orthogonal vectors in this space which are the eigenstates of the quantity to be measured.

The system $m$ is assumed to be described by a three-dimensional Hilbert space. It contains the vector $\psi_0$, which we shall call the neutral state of $m$, and two more orthogonal vectors $\psi_+$ and $\psi_-$. The vector $\psi_+$ represents the "pointer reading" indicating system $S_1$ to be in state $\varphi_+$. Similarly $\psi_-$ is the indicator for state $\varphi_-$.

If before interaction the system $S + m$ is in the pure state

$$\varphi_+ \otimes \psi_0$$

then after the measurement is completed the system $S + m$ is in the pure state

$$U(\varphi_+ \otimes \psi_0) = \varphi_+ \otimes \psi_+ \tag{11-32}$$

where $U$ is some unitary operator. Similarly the initial state $\varphi_- \otimes \psi_0$ is, after the interaction, given by

$$U(\varphi_- \otimes \psi_0) = \varphi_- \otimes \psi_-. \tag{11-33}$$

These formulas describe the characteristic behavior of a measurement of the first kind in the context of this model.

We now take the general initial state of the form

$$\Phi_0 = (\alpha_+ \varphi_+ + \alpha_- \varphi_-) \otimes \psi_0.$$

Since the transformation $U$ is linear, we can obtain the final state after the measurement in this case by linear superposition of the final states (11-32) and (11-33):

$$\Phi = U\Phi_0 = \alpha_+\,\varphi_+ \otimes \psi_+ + \alpha_-\,\varphi_- \otimes \psi_-. \tag{11-34}$$

After the measurement is completed we can imagine the interaction between $S$ and $m$ removed. The reading of the scale consists in amplifying the record contained in $m$ and deducing from it the state of $S$.

The state of $m$, to be read with the amplifier, is obtained from the pure state (11-34) by reducing that state to the system $m$. We use the reduction formulas of the previous section. In this case their application is especially easy since (11-34) is already in the normal form (11-31). The only slight generalization here is that the coefficients are complex numbers (in Eq. 11-31 they were real); however, an adjustment of the phases of the basis vectors will reduce them to reals.

Thus we may write for the reduced states

$$W_1 = |\alpha_+|^2 P_+ + |\alpha_-|^2 P_-, \qquad W_2 = |\alpha_+|^2 Q_+ + |\alpha_-|^2 Q_-, \tag{11-35}$$

where $P_\pm$ and $Q_\pm$ are the projections containing $\varphi_\pm$ and $\psi_\pm$ respectively. We see that both states have become mixtures. Since $m$ (as well as $m + A$) is a classical system, the state is a classical state. No further observation on $m$ will modify the state, and the measurement has become an objective record. According to Section 11-6, each individual system $m$ which may be used in a statistic of the measurement realizes one of the two alternatives. These alternatives are thus events in the sense of Section 11-6, and their amplification will make them data. There is no question of any superposition here. The reduction of the state to the system $m$ has wiped out any phase relations.

Moreover, we have a measurement since the events in $m$ and those in $S$ are correlated. If $m$ is in the state $\psi_+$ then $S$ is necessarily in the state $\varphi_+$. In order to see this, we calculate the expectation value for the cross correlation expressed by the proposition $P_-$ and $Q_+$ and represented by the projection $P_- \otimes Q_+$. It is given by

$$(\Phi,\; P_- \otimes Q_+\,\Phi) = 0. \tag{11-36}$$

Therefore the event $\psi_+$ is strictly correlated with $\varphi_+$, and $\psi_-$ is similarly correlated with $\varphi_-$. Thus in the reduction formulas we see the true origin of the probability statements in the quantum-mechanical measuring process.
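The model of this section is small enough to be followed numerically. The sketch below is only an illustration under stated assumptions: the unitary operator is taken to be the basis permutation that exchanges $\varphi_\pm \otimes \psi_0$ with $\varphi_\pm \otimes \psi_\pm$ (any unitary satisfying (11-32) and (11-33) would serve), the amplitudes are arbitrary example values, and all names are hypothetical.

```python
S_LABELS = ("+", "-")        # eigenstates phi_+, phi_- of the system S
M_LABELS = ("0", "+", "-")   # neutral state psi_0 and pointer states psi_+, psi_-

def measure(state):
    """Apply an assumed unitary U: a permutation of basis vectors that
    realizes Eqs. (11-32) and (11-33). States are dicts {(s, m): amplitude}."""
    swap = {("+", "0"): ("+", "+"), ("+", "+"): ("+", "0"),
            ("-", "0"): ("-", "-"), ("-", "-"): ("-", "0")}
    return {swap.get(key, key): amplitude for key, amplitude in state.items()}

def amp(state, s, m):
    """Amplitude of the basis vector phi_s (x) psi_m, zero if absent."""
    return state.get((s, m), 0.0)

alpha_plus, alpha_minus = 0.6, 0.8j   # example amplitudes, |a+|^2 + |a-|^2 = 1
initial = {("+", "0"): alpha_plus, ("-", "0"): alpha_minus}
final = measure(initial)              # the state of Eq. (11-34)

# Reduced states of Eq. (11-35): trace out m for W1, trace out S for W2.
w1 = {(s, t): sum(amp(final, s, m) * amp(final, t, m).conjugate()
                  for m in M_LABELS)
      for s in S_LABELS for t in S_LABELS}
w2 = {(m, n): sum(amp(final, s, m) * amp(final, s, n).conjugate()
                  for s in S_LABELS)
      for m in M_LABELS for n in M_LABELS}

# Cross correlation of Eq. (11-36): expectation value of P_- (x) Q_+.
cross = abs(amp(final, "-", "+")) ** 2

print(w1[("+", "+")], w1[("-", "-")], w1[("+", "-")])
print(w2[("+", "+")], w2[("-", "-")], w2[("0", "0")])
print(cross)
```

The off-diagonal elements of both reduced states vanish, which is the numerical counterpart of the statement that the reduction wipes out the phase relations.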

A difficulty seems to appear for this interpretation if we include the system $S$ in the measuring device. In this case there is no occasion to reduce the pure state to that of a mixture. If we do this, then the observable projections of the joint system are $\Pi_+$ with range $\varphi_+ \otimes \psi_+$ and $\Pi_-$ with range $\varphi_- \otimes \psi_-$ and, in our model, there are no others. The abelian system of observables on this system consists of all linear combinations of these two projections. In this case we are in the situation where the pure state $\Phi$ is contained in a class of equivalent microstates. These states contain the state

$$W = |\alpha_+|^2\,\Pi_+ + |\alpha_-|^2\,\Pi_-, \tag{11-37}$$

since

$$\operatorname{Tr} W\Pi_+ = (\Phi, \Pi_+\Phi) = |\alpha_+|^2$$

and

$$\operatorname{Tr} W\Pi_- = (\Phi, \Pi_-\Phi) = |\alpha_-|^2.$$

Thus the final state is again a probability distribution of events, this time described by the projections $\Pi_\pm$. In this case the probability statement comes in through the theory of equivalent states.
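The equivalence of the pure state and the mixture (11-37) with respect to the abelian system of observables can be made concrete in the two-dimensional subspace spanned by $\varphi_+ \otimes \psi_+$ and $\varphi_- \otimes \psi_-$. The following sketch assumes real example amplitudes and arbitrary example operators; the names are hypothetical.

```python
alpha = (0.6, 0.8)   # real example amplitudes alpha_+, alpha_-

# Projection onto Phi and the mixture of Eq. (11-37), both as 2x2 matrices
# in the basis (phi_+ (x) psi_+, phi_- (x) psi_-).
pure = [[alpha[r] * alpha[c] for c in range(2)] for r in range(2)]
mixture = [[alpha[0] ** 2, 0.0], [0.0, alpha[1] ** 2]]

def expectation(w, a):
    """Expectation value Tr(W A) for 2x2 matrices."""
    return sum(w[r][c] * a[c][r] for r in range(2) for c in range(2))

# A member of the abelian system: the linear combination 2*Pi_+ - 3*Pi_-.
observable = [[2.0, 0.0], [0.0, -3.0]]
# An operator outside that system, sensitive to the relative phase.
interference = [[0.0, 1.0], [1.0, 0.0]]

print(expectation(pure, observable), expectation(mixture, observable))
print(expectation(pure, interference), expectation(mixture, interference))
```

The two states agree on every linear combination of $\Pi_+$ and $\Pi_-$ and differ only on operators that, in the model, are not observables.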

11-10. THREE PARADOXES

a) Schrödinger's cat. In a paper entitled "The Present State of Quantum Mechanics," Schrödinger wrote a criticism of the orthodox view of quantum mechanics [2]. He pointed out that this view would imply rather grotesque situations for macroscopic events, and he illustrated it with an example involving a cat. This example has been reformulated by many other authors in more or less equivalent forms, and it has to this day been considered by many an unsolved paradox. Here we shall give a literal translation of Schrödinger's cat paradox. Schrödinger writes:

"A cat is placed in a steel chamber, together with the following hellish contraption (which must be protected against direct interference by the cat): In a Geiger counter there is a tiny amount of radioactive substance, so tiny that maybe within an hour one of the atoms decays, but equally probably none of them decays. If one decays then the counter triggers and via a relay activates a little hammer which breaks a container of cyanide. If one has left this entire system for an hour, then one would say that the cat is still living if no atom has decayed. The first decay would have poisoned it. The $\psi$-function of the entire system would express this by containing equal parts of the living and dead cat.

"The typical feature in these cases is that an indeterminacy is transferred from the atomic to the crude macroscopic level, which then can be decided by direct observation. This prevents us from accepting a "blurred model" so naively as a picture of reality. By itself it is not at all unclear or contradictory. There is a difference between a blurred or poorly focussed photograph and a picture of clouds or fog patches."

The paradoxical aspect of this example is to be found in the supposed reduction of the state from a superposition of macroscopically distinct alternatives to one of the events during the act of observation.

b) Einstein's element of physical reality. Einstein has been not only one of the founders of quantum mechanics, but also one of its strongest critics. His critique does not concern the existing theory as such, which he recognizes as satisfactory as far as it goes in the description of physical phenomena. He questions its completeness. The paradox of Einstein, Podolsky, and Rosen [12] is one of the most striking forms in which this question is expressed.

The authors take the position that physics is concerned with the description of "physical reality" and they affirm that an objective reality exists which does not depend on our observation. A priori we do not know what it is, so they say, but this precisely is the task of physics: to establish the properties of the existing physical reality.

They are aware that this position requires a meaningful definition of "physical reality." This is, of course, not easy and it is probably impossible in physical terms alone.

However, certain elements of physical reality can, so they affirm, be given a precise meaning. Indeed, if the value of a physical quantity for a physical system can be determined with certainty without in any manner whatsoever perturbing the state of the system, then this quantity has for them an element of "physical reality" in that system.

The authors then proceed to construct an example which seems to lead to the conclusion that quantum mechanics is in contradiction with a complete description of all elements of physical reality. We reproduce this example here in a simplified form.

Let us assume that we have two systems I and II, which at a given time can interact with each other. We assume that the states of each system are completely described by a two-dimensional vector space. Let $\varphi_\pm$ represent a complete orthonormal set of vectors in the first space and $\psi_\pm$ a similar set in the second space. Let us further assume that the interaction between the two systems is such that at some time the (pure) state of the joint system is given by

$$\Phi = \frac{1}{\sqrt{2}}\,(\varphi_+ \otimes \psi_+ + \varphi_- \otimes \psi_-). \tag{11-38}$$

We now assume that the two systems can be isolated from each other, for instance by separating them spatially, so that any observation carried out on one of the component systems cannot have any physical effect on the other system.

After this separation the state is still given by Eq. (11-38). If we now measure on system I whether it is in the state $\varphi_+$ or $\varphi_-$, we find that it is in $\varphi_\pm$ with probability $\tfrac{1}{2}$. The interesting point is that a measurement of $\varphi_\pm$ constitutes at the same time a measurement of $\psi_\pm$ on system II. Indeed, according to the general theory of the measuring process, we know that whenever a measurement on system I has given the result $\varphi_+$, any future measurement on system II will give the result $\psi_+$. Since the two systems are physically separated we have a means of determining the state of system II "without in any manner whatsoever perturbing the state" of that system. According to the criterion of Einstein, Podolsky, and Rosen, the quantity with the eigenstates $\psi_\pm$ of system II must therefore have an element of physical reality.

The value of this quantity is of course not known before the measurement on system I is completed, but that does not invalidate the conclusion that it has a definite value, since one can determine it by a measurement carried out entirely on system I. Moreover this definite value must have had the same element of reality even before the measurement on system I was carried out, since a measurement on system I cannot produce any physical effect whatsoever on system II and thus cannot change the reality of a physical quantity in that system.

We are thus driven to the conclusion that the system (I + II) is in a mixture of two different states, namely, the states $\varphi_+ \otimes \psi_+$ and $\varphi_- \otimes \psi_-$ mixed, with probabilities $\tfrac{1}{2}$. But such a state is different from the state expressed by Eq. (11-38). Thus the acceptance of the notion of "physical reality" has led us to a contradiction.

This paradox can be given still another form. It is possible to carry out a simultaneous change of coordinate systems in the vector spaces referring to systems I and II respectively in such a way that the vector $\Phi$ remains invariant. This means we can find other orthonormal pairs $\varphi'_\pm$ and $\psi'_\pm$ such that

$$\Phi = \frac{1}{\sqrt{2}}\,(\varphi'_+ \otimes \psi'_+ + \varphi'_- \otimes \psi'_-). \tag{11-38}'$$

The same reasoning that was applied for the form (11-38) can now be repeated identically for the representation (11-38)', with the conclusion that system II is in one of the arbitrary states $\psi'_\pm$. But a system cannot be simultaneously in two different states; hence we have encountered another contradiction.
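Both steps of the argument can be checked on a small example. The sketch below assumes the symmetric state with coefficient matrix equal to the identity divided by $\sqrt{2}$, and one particular unitary change of basis in which the second system is transformed with the complex-conjugate matrix; these choices and all names are illustrative assumptions. It verifies that the state vector is unchanged by the simultaneous change of basis, and that the equal mixture of the two product states is a different state.

```python
import math

s = 1 / math.sqrt(2)

# Coefficient matrix of Phi in the basis phi_i (x) psi_j (assumed symmetric form).
c = [[s, 0.0], [0.0, s]]

# An assumed unitary matrix V; its columns define the new vectors of system I,
# and the complex-conjugate columns define the new vectors of system II.
v = [[s, s], [s * 1j, -s * 1j]]
phi_new = [[v[i][r] for i in range(2)] for r in range(2)]
psi_new = [[v[j][r].conjugate() for j in range(2)] for r in range(2)]

# Coefficients of (1/sqrt 2) * sum_r phi'_r (x) psi'_r in the original basis.
c_new = [[s * sum(phi_new[r][i] * psi_new[r][j] for r in range(2))
          for j in range(2)]
         for i in range(2)]

invariant = all(abs(c_new[i][j] - c[i][j]) < 1e-12
                for i in range(2) for j in range(2))

# Expectation value of the projection onto Phi in the pure state and in the
# equal mixture of phi_+ (x) psi_+ and phi_- (x) psi_-.
in_pure_state = sum(abs(c[i][j]) ** 2 for i in range(2) for j in range(2)) ** 2
in_mixture = 0.5 * abs(c[0][0]) ** 2 + 0.5 * abs(c[1][1]) ** 2

print(invariant, in_pure_state, in_mixture)
```

The projection onto $\Phi$ has expectation value 1 in the pure state and 1/2 in the mixture, so the two states are distinguishable in principle by an observable of the joint system.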

Einstein, Podolsky, and Rosen have drawn the conclusion from this paradox that quantum mechanics does not furnish a complete description of the physical reality of individual systems but merely describes the statistical properties of ensembles of systems.

This paradox was discussed by Bohr [13], who showed that it could not be considered a refutation of the basic principles of quantum mechanics but that it merely revealed the limits of the traditional concepts of natural philosophy. In a rejoinder [14] Einstein admits the logical possibility of Bohr's viewpoint, but reaffirms his belief in and preference for another point of view.

c) Wigner's friend. In 1962 Wigner added a new element to the paradoxes already known by including consciousness for the physical systems involved [15]. The situation discussed by Wigner is identical with that of formula (11-34), in Section 11-9, except that Wigner endows system II (the measuring apparatus) with the faculty of consciousness. He then proceeds to introduce the ultimate observer $\Omega$ who observes and communicates with the (conscious) apparatus II. When $\Omega$ asks II what he has observed, he will receive the answer that he has observed the state $\varphi_+$ or $\varphi_-$ (as the case may be) and this with probability $|\alpha_\pm|^2$. All this is quite satisfactory and in agreement with the theory of measurement.

However, Wigner now inquires what would happen if $\Omega$ asked his friend (system II): "What did you feel just before I asked you?" Then the friend will answer, "I told you already I observed $\varphi_+$ (or $\varphi_-$)," as the case may be. In other words, the question whether his friend did observe $\varphi_+$ or $\varphi_-$ was already present in his consciousness before $\Omega$ asked him. But at that moment there was no question of any interference of the observer $\Omega$ into the natural process of evolution of the two interacting systems; thus its state at that time must have been the superposition (11-34).

But this does not seem to be compatible with the information directly accessible to the conscious friend who is aware of his state before he was asked by the observer what his state is.

For if his awareness is correct, then the state of I + II, before $\Omega$ asked his friend, was already a mixture of the two states $(\varphi_+ \otimes \psi_+)$ and $(\varphi_- \otimes \psi_-)$ and not the superposition (11-34).

Wigner considers this paradox an indication of the influence of consciousness on the physicochemical conditions of living systems. He finds such an influence entirely in accord with the general principle of action and reaction, since it is known that these physicochemical conditions have in turn a profound influence on conscious sensations.

d) Discussion of the paradoxes. The similarity between the three paradoxes is obvious. In all three cases one considers two interacting systems, and the paradox is produced by obtaining some information on the state of one of the systems which seems to be in contradiction with the state obtained from the principle of superposition. The difference in the three cases refers only to the method of obtaining this information.

In case (a) one appeals to the common-sense notion that a cat is either dead or alive, and that no other state which would leave us undecided about these two alternatives can occur.

In case (b) the information about system II is obtained by looking at the other system I and using the known correlation of observations in I with those in II.

Finally in case (c) we have the "consciousness" of system II which, seemingly without outside interference, is capable of determining the state of II by introspection.

Having thus stressed the similarity, we now pay attention to the differences. Here one sees at once that (b) stands in a class apart, since in this case only does one obtain information about the system through an outside observer which interacts with the system (I + II). To be sure, the interaction is assumed to affect only system I and not system II, about which we thus obtain information without outside interference.

Case (b) differs from the other two in another respect. In cases (a) and (c) one appeals to notions which are outside the confines of physics. To "be alive" or to "be conscious" are presumably certain states of very complicated physical systems, but it is impossible to express in physical terms what these states are.

In paradox (b), on the other hand, an effort is made to reduce the problem entirely to physical terms. For this reason it is easier to discuss this case and we shall do it first.

If science is possible then there is nothing paradoxical about the physical world, and insofar as quantum mechanics is a correct physical theory it cannot contain paradoxes. Thus if paradoxes seem to appear, they must originate either from an inconsistent (and hence incorrect) physical theory, or they must indicate the limitation of concepts in physics which have acquired their meaning outside the domain of physics. In case (b) we can exclude the second possibility, and so we can discuss this case entirely within existing physical theory, without first having to interpret the physical content of nonphysical concepts.

What does quantum mechanics tell us about the state of the physical systems I, II, and (I + II) after a third observer has carried out a measurement of the quantities $P_\pm$ on system I?

We use the notation and the theory of Sections 11-8 and 11-9. This theory tells us that after measurement of the quantity $P_\pm$ the system is in the state $W_{\mathrm{I}} \otimes W_{\mathrm{II}}$, where $W_{\mathrm{I}}$ and $W_{\mathrm{II}}$ are the reductions of the state $W = P_\Phi$ to the subsystems I and II respectively. This result is a direct consequence of the analysis of Section 11-9, the only difference being that the system is now (I + II) while the apparatus is the observer $\Omega$.

The effect of the observer $\Omega$ on the system (I + II) was thus to change the state $W$ of that system to the state $W_{\mathrm{I}} \otimes W_{\mathrm{II}}$. This change of the state of the entire system is exactly the same as the change which would have been obtained by measuring the quantity $Q_\pm$ of system II. We see now quite clearly that the attempt at restricting the observation to I is illusory. The effect on the entire system is exactly the same, whether we observe $P_\pm$ in system I or $Q_\pm$ in system II. To be sure, in neither case is the state of subsystem I or subsystem II modified in any manner whatsoever. This state is before and after the measurement given by $W_{\mathrm{I}}$ for I and $W_{\mathrm{II}}$ for II.

The paradox originates in our habit of thinking that the states of two subsystems determine uniquely the state of the composite system. As we have shown in Section 11-8, this is usually not the case. In the present example the two different states $W = P_\Phi$ and $W_{\mathrm{I}} \otimes W_{\mathrm{II}}$ have the same reductions to the systems I and II, and the measurement of either $P_\pm$ or $Q_\pm$ changes the state $W$ of the combined system to the state $W_{\mathrm{I}} \otimes W_{\mathrm{II}}$.

This shows that the application of Einstein's criterion of physical reality becomes ambiguous. It all depends how we want to interpret the condition "in any manner whatsoever." If we refer it only to the states of the subsystems I or II, it is obviously fulfilled; if we refer it to the entire system (I + II), it is not. In no case is there a contradiction of the uncertainty relation, because, as we have seen, a measurement of $P_\pm$ has exactly the same effect on the states as a measurement of $Q_\pm$.

Thus the "paradox" of Einstein, Podolsky, and Rosen does not reveal any contradiction of quantum mechanics; it merely emphasizes in a most striking way the essential nonclassical consequences of the quantum-mechanical superposition of states. It is this very superposition principle which leads to the ambiguity in the application of Einstein's criterion of "physical reality."

Let us now turn to the more difficult discussion of paradoxes (a) and (c). The difficulty in these cases stems from the fact that the outside observer is pushed into the background. In case (a) he may merely be needed to verify whether the cat is dead or alive, an observation which may reasonably be assumed to have no effect whatsoever on the biological state of the cat. In case (c) he is even entirely superfluous since consciousness becomes aware of itself by introspection. Of course this faculty of observing the state of II without any observer is obtained here with properties which are difficult to express in physical terms, namely, "being alive" in case (a) and "being conscious" in case (c). In either case the alternatives of the microscopic system are transferred to the crude macroscopic level and thus are no longer subject to the quantum-mechanical ambiguities associated with coherent interferences of two different states.

One might reformulate the Schrödinger cat paradox by using only the macroscopic features of system II. Such a reformulation has been given, for instance, by Einstein (cf. reference 8, reply to criticisms), who replaced the "hellish contraption" of Schrödinger by a moving film strip which records the event of the radioactive decay in a permanent, macroscopic, and unobserved record. In this form the paradox is formulated entirely within the confines of physics, and yet at first sight it seems to retain its paradoxical character.

The essential point here is that system II, which contains this recording device, can be made as large as one wishes. In a subsequent observation on this system, the inevitable interaction of the outside observer with system II can therefore be made as small as one wishes, and thus (one is tempted to conclude) it can be neglected altogether.

It would, however, be incorrect to neglect it altogether. For we must not forget that the distinction between state $W$ and state $W_{\mathrm{I}} \otimes W_{\mathrm{II}}$ becomes increasingly difficult to detect with the increase in size of the whole apparatus, and it is precisely this distinction which is under discussion here. The theory of the preceding two sections has shown with sufficient generality that the remaining interaction between II and an outside observer $\Omega$ is in fact the essential effect which will indeed obliterate the distinction between the two states.

If the information of the recording device does have objective validity, that is, if it can be communicated, then this very property makes it impossible to distinguish between the two states $W$ and $W_{\mathrm{I}} \otimes W_{\mathrm{II}}$.

Thus the paradox of Schrödinger's cat can be resolved when it is reformulated entirely in physical terms.

Wigner's friend could be treated in the same manner with the same conclusion, but this would not meet the heart of Wigner's problem. As long as one insists on including consciousness as a property of quantum-mechanical systems, the outside observer $\Omega$ can be dispensed with altogether and then we have no answer to the paradox. Must we conclude from this, as Wigner does, that quantum mechanics, as we know it now, would be inapplicable for systems with consciousness? The answer to such a question obviously presupposes a characterization and analysis of "consciousness" in physical terms, a task which seems to transcend the present limitations of physics.

REFERENCES

The uncertainty relations are discussed in nearly every book on quantum mechanics. They were discovered and analyzed by Heisenberg in a special case.

1. W. HEISENBERG, Z. Physik 43, 172 (1927).

2. E. SCHRÖDINGER, Naturwiss. 48, 52 (1935).

3. W. PAULI, Die allgemeinen Prinzipien der Wellenmechanik, Handb. d. Phys., Bd. V, Teil 1, 1.

Concerning the problem of measurement one may profitably consult, in addition to the above:

4. G. LUDWIG, Grundlagen der Quantenmechanik. Berlin: Springer-Verlag (1954).

5. J. VON NEUMANN, Mathematische Grundlagen der Quantenmechanik. Berlin: Springer-Verlag (1932).

6. G. LUDWIG, in Werner Heisenberg und die Physik unserer Zeit, p. 150. Braunschweig: Vieweg & Sohn (1961).

7. F. LONDON AND E. BAUER, "La Théorie de l'Observation en Mécanique Quantique," Actualités Scientifiques et Industrielles, 775. Paris: Hermann (1939).

8. A. EINSTEIN, Philosopher-Scientist, especially articles by Bohr and Einstein. Evanston (1949).

9. L. DE BROGLIE, La Théorie de la Mesure en Mécanique Quantique. Paris: Gauthier-Villars (1957).

10. N. R. HANSON, The Concept of the Positron. Cambridge: Cambridge University Press (1963).

11. Observation and Interpretation, S. Körner, ed. London: Academic Press and Butterworth (1957).

12. A. EINSTEIN, B. PODOLSKY, AND N. ROSEN, Phys. Rev. 47, 777 (1935).

13. N. BOHR, Phys. Rev. 48, 696 (1935).

14. A. EINSTEIN, J. Franklin Inst. 221, 349 (1936).

15. E. WIGNER, The Scientist Speculates, p. 284; I. J. Good, ed. London: Heinemann (1962).

A very clear recent review of the entire problem of measurement in quantum mechanics is found in:

16. B. D'ESPAGNAT, "Conceptions de la Physique," Actualités Scientifiques et Industrielles 1320. Paris: Hermann (1965).

See also:

17. J. M. JAUCH, Helv. Phys. Acta 37, 293 (1964).

18. J. M. JAUCH, E. P. WIGNER, AND M. M. YANASE, Nuovo Cim. 48, 144 (1967).

PART 3

Elementary Particles

CHAPTER 12

THE ELEMENTARY PARTICLE IN ONE DIMENSION

I shall never forget the thrill which I experienced when I succeeded in condensing Heisenberg's ideas on quantum conditions in the mysterious equation

$$pq - qp = h/2\pi i,$$

which is the center of wave mechanics and was later found to imply the uncertainty relation.

MAX BORN, Physics and Metaphysics, Joule Memorial Lecture, 1950

In this chapter we begin the quantum theory of elementary particles. The central notion is localizability, which we introduce in Section 12-1 for the particle in one dimension. Closely related to it is the notion of homogeneity of space (Section 12-2) which leads in a natural manner to the canonical commutation rules (Section 12-3) and the systems of imprimitivities. The definition of elementary particle is given in Section 12-4. In Section 12-5 we treat the relation between Galilei invariance and the equation of motion for an elementary particle. The theory of the harmonic oscillator follows as an illustrative example, in Section 12-6. In the following section (12-7) we give a modern version of the harmonic oscillator in a Hilbert space of analytic functions, which has lately become important in the treatment of coherence phenomena. In the final section (12-8) we take up once more the question of modularity of the proposition system left open in Chapter 5.

12-1. LOCALIZABILITY

The atomic hypothesis is based on the idea that the constituents of matter are elementary indivisible entities, the atoms, whose properties determine the behavior of the bulk of matter. Although this hypothesis was strikingly successful, especially in the development of the kinetic theory of heat, crystal structure, atomic and molecular structure, it had to undergo a number of modifications and refinements in the course of the development of physics in the last fifty years. One of the most significant aspects of the present state of knowledge about elementary particles is the fact that the known constituents of matter, which would merit this label, can be transformed into each other, and most can decay into other particles. Only very few of them are known to be stable. These phenomena force us to question the very notion of an elementary particle as a suitable notion for a basic theory.

In a nonrelativistic theory, as we develop it here, with a definite, although limited, domain of validity it is quite possible to retain the notion of elementary particle as a useful basic concept. It is meaningful to study the behavior of the center of mass of a complex nucleus even if we know such a nucleus to be a composite system of many nucleons. It is equally meaningful to study the properties of a muonic atom even if we know that it will decay in $10^{-6}$ sec. It is in this sense that we shall here develop the nonrelativistic quantum theory of elementary particles.

The key concept in the theory of elementary particles is localizability, to which we have to turn our attention first. We begin with a description of the term.

One of the basic features of the physical world is its occurrence in a physical space of three dimensions. This physical space is the "arena" of the events in physical systems. A careful examination of the preceding chapters shows that so far we have nowhere introduced this fact as an essential element in the theory. This we must do now.

Localizability of an elementary particle means first of all a special property of the proposition system. This system must contain a class of propositions which answer the question whether the particle is in this or in that volume element of physical space. We are familiar with this from classical mechanics, where the corresponding propositions are represented by the Boolean algebra of the Borel subsets in the three-dimensional Euclidean space. How must we represent these propositions in quantum mechanics?

We must first examine the question whether propositions concerning the location of a particle in different volume elements are compatible or not. The answer must be sought through experience with counters which locate particles in various regions in space, and we can only record the fact that so far all our experience is consistent with the assumption that all these propositions are indeed compatible.

We shall point out here, however, that this experience is of course limited to relatively large volume elements, and no systematic study has ever been made of this question. A quantum mechanics of localizable systems, for which not all propositions expressing localization are mutually compatible, is a distinct possibility. It would imply a far more profound modification of quantum mechanics than has ever been contemplated.

Let us proceed then on the traditional assumption that the propositions locating a particle in various domains $\Delta$ of physical space are all compatible. We represent them therefore by a projection-valued measure $\Delta \to E_\Delta$ which satisfies the properties to be explained and enumerated now. We shall do this in this chapter for the case in which the physical space is one-dimensional. The $\Delta$ are then Borel sets on the real line. If $\Delta_1$ and $\Delta_2$ are two Borel sets on the real line, then the proposition that the system is located in the intersection $\Delta_1 \cap \Delta_2$ is represented by $E_{\Delta_1} \cap E_{\Delta_2}$. Since $E_{\Delta_1}$ and $E_{\Delta_2}$ are compatible, this same proposition is also given by $E_{\Delta_1}E_{\Delta_2} = E_{\Delta_1} \cap E_{\Delta_2}$.

Similarly the proposition $E_{\Delta_1} \cup E_{\Delta_2}$ may also be represented by $E_{\Delta_1} \cup E_{\Delta_2} = E_{\Delta_1} + E_{\Delta_2} - E_{\Delta_1}E_{\Delta_2}$. Finally if we denote the complementary Borel set as usual by $\Delta'$ and the complementary proposition $I - E$ by $E'$, we must have $E_{\Delta'} = E'_\Delta$. By generalizing the additivity property to a countable sequence of Borel sets, we obtain the fundamental relations for the propositions which localize a physical system:

$$
\begin{aligned}
E_{\Delta_1 \cap \Delta_2} &= E_{\Delta_1} \cap E_{\Delta_2}, \\
E_{\Delta_1 \cup \Delta_2} &= E_{\Delta_1} \cup E_{\Delta_2}, \\
\sum_i E_{\Delta_i} &= E_{\bigcup_i \Delta_i} \quad \text{for any sequence of disjoint } \Delta_i, \\
E_{\Delta'} &= E'_\Delta.
\end{aligned}
\tag{12-1}
$$

A special case of the last relation is obtained by choosing for $\Delta$ the null set $\emptyset$, so that $\emptyset' = \Lambda$, the entire real line:

$$E_\Lambda = E'_\emptyset = I. \tag{12-2}$$

We see thus that the physical concept of localizability leads in a natural way to a spectral measure over the real line $\Lambda$.

According to the fundamental theorem quoted in Section 4-3, every spectral measure defines a self-adjoint operator $Q$. The operator defined by the spectral measure (Eq. 12-1) is called the position operator. It is a self-adjoint operator with a continuous spectrum extending over the entire real line.
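The relations (12-1) and (12-2) can be illustrated on a toy model. The following is only a sketch under a strong simplifying assumption: the real line is replaced by a finite grid of ten points, so that each localization proposition becomes a diagonal matrix of zeros and ones, stored here as the list of its diagonal entries. The names `GRID`, `projection`, and `product` are hypothetical and chosen for this illustration only.

```python
# Toy model (an assumption for illustration): a finite grid stands in for
# the real line, and a projection is stored as its 0/1 diagonal.
GRID = list(range(10))

def projection(delta):
    """Diagonal of the projection for the set delta: 1 on delta, 0 elsewhere."""
    return [1 if q in delta else 0 for q in GRID]

def product(e, f):
    """Product of two commuting diagonal projections."""
    return [a * b for a, b in zip(e, f)]

d1, d2 = {1, 2, 3, 4}, {3, 4, 5, 6}
e1, e2 = projection(d1), projection(d2)

# Intersection of sets corresponds to the product of projections.
meet_ok = projection(d1 & d2) == product(e1, e2)
# Union corresponds to E1 + E2 - E1*E2.
join_ok = projection(d1 | d2) == [a + b - a * b for a, b in zip(e1, e2)]
# Complementary set corresponds to I - E.
complement_ok = projection(set(GRID) - d1) == [1 - a for a in e1]
# The whole space corresponds to the unit operator.
total_ok = projection(set(GRID)) == [1] * len(GRID)
```

In this finite model every projection commutes with every other, which mirrors the compatibility assumption made in the text; it says nothing about the continuous spectrum of the actual position operator.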

12-2. HOMOGENEITY

Two of the basic properties of physical space are its homogeneity and isotropy. Both of these properties express the fact that physical space has no observable physical properties. This means that different points in physical space are physically indistinguishable. For a particle moving in a one-dimensional space, we can express homogeneity by requiring that a translation of the space $\Lambda$ by an arbitrary amount $\alpha$ will induce a symmetry transformation in the proposition system.

In order to express this in a formula, we introduce the notation $\Delta + \alpha$ for the set $\{\lambda : \lambda - \alpha \in \Delta\}$; that is, the set $\Delta$ translated as a whole by the amount $\alpha$. If the space $\Lambda$ is physically homogeneous, we must require that there exist unitary operators $U_\alpha$ such that

$$U_\alpha^{-1} E_\Delta U_\alpha = E_{\Delta + \alpha}. \tag{12-3}$$


The unitary operators $U_\alpha$ depend continuously on the parameter $\alpha$ and we can choose the as yet undetermined phase of $U_\alpha$ in such a way that they form a continuous vector representation of the additive group of real numbers:

$$U_{\alpha_1} U_{\alpha_2} = U_{\alpha_1 + \alpha_2}. \tag{12-4}$$

According to Stone's theorem, such a group uniquely determines a self-adjoint operator $P$ such that

$$U_\alpha = e^{i\alpha P}. \tag{12-5}$$

We shall call $P$ the displacement operator.

The relation (12-3) is fundamental. It is the precise mathematical expression of the notion of localizability in a homogeneous space. In the following we shall develop a number of consequences and other equivalent forms of this fundamental property.

First of all we can transform Eq. (12-3) into another equivalent but more symmetrical form as follows: The self-adjoint operator $Q$ may also be considered as a generator of a one-parameter group representation by setting

$$V_\beta = e^{i\beta Q} = \int e^{i\beta\lambda}\, dE_\lambda.$$

We then obtain from Eqs. (12-3) and (12-4),

$$U_\alpha^{-1} V_\beta U_\alpha = \int e^{i\beta\lambda}\, dE_{\lambda + \alpha} = \int e^{i\beta(\lambda - \alpha)}\, dE_\lambda = e^{-i\alpha\beta} V_\beta,$$

or

$$U_\alpha V_\beta = e^{i\alpha\beta} V_\beta U_\alpha. \tag{12-6}$$

Relation (12-6) is the canonical commutation rule in Weyl's form. In this form the symmetry between the two groups is more apparent than in the form of Eq. (12-3). We see at once, for instance, that the group $V_\beta$ induces in the spectral measure of $P$ also a displacement in the opposite direction. Thus if $F_\Delta$ are the spectral projections of the operator $P$, we have

$$V_\beta^{-1} F_\Delta V_\beta = F_{\Delta - \beta}. \tag{12-7}$$

We can throw the canonical commutation rule (12-6) into still another form by expressing it in terms of the generators of the unitary groups. The generator $P$ is defined on all vectors $f$ for which the limit

$$\lim_{\alpha \to 0} \frac{1}{i\alpha}(U_\alpha - I)f = Pf$$

exists, while $Q$ is defined for the vectors $f$ which admit the limit

$$\lim_{\beta \to 0} \frac{1}{i\beta}(V_\beta - I)f = Qf.$$

From these definitions and Eq. (12-3), we easily obtain the commutation rule (cf. Problem 2)

$$[Q, P]f = if. \tag{12-8}$$

In order to give this equation a meaning, we must have $Qf \in D_P$ and $Pf \in D_Q$. The set $D$ of such vectors $f$ is everywhere dense (Problem 1), and the restriction of $Q$ or $P$ to $D$ is essentially self-adjoint (Problem 3).

PROBLEMS

1. There exists a dense domain $D$ of vectors $f$ which satisfy $Pf \in D_Q$ and $Qf \in D_P$, where $D_Q$ is the domain of $Q$ and $D_P$ is the domain of $P$.

2. If $U_\alpha$ and $V_\beta$ are two one-parameter unitary group representations satisfying Eq. (12-3), then their infinitesimal generators $P$ and $Q$ respectively satisfy the canonical commutation rule

$$[Q, P]f = if,$$

for all $f \in D$ where $D$ is the domain of Problem 1.

*3. The restriction of $Q$ to the domain $D$ is essentially self-adjoint; so is the restriction of $P$ to $D$.

*4. Theorem (Plessner): The spectrum of $P$ and the spectrum of $Q$ are absolutely continuous (cf. reference 2).

12-3. THE CANONICAL COMMUTATION RULES

We have seen that the notion of localizability in a one-dimensional homogeneous space leads in a natural way to a pair of self-adjoint operators $P$ and $Q$, which on a dense subset $D$ of the entire Hilbert space satisfy the relation

$$[Q, P]f = if \quad \text{for all } f \in D. \tag{12-9}$$

We shall now examine the mathematical aspects of this canonical commutation rule. In many discussions it is written simply as an operator relation

$$[Q, P] = i. \tag{12-10}$$


We shall do this too, occasionally, but in doing so we must always bear in mind that this relation is only valid on the dense set $D$, and that the restriction of $P$ or $Q$ to $D$ is only essentially self-adjoint. Failure to take account of these important restrictions can cause pseudo problems in the form of paradoxes.

The principal mathematical problem of the canonical commutation rules is connected with the question of the possible irreducible representations of the canonical commutation rules as in Eq. (12-9).

We first construct an explicit representation. Let us assume that there exists a generating vector $g$ for the operator $Q$, so that its spectrum is simple (cf. Section 4-5). There then exists a spectral representation of $Q$, and in this representation the vectors $f \in D_Q$ are given by functions $\psi(x)$ on the real line ($-\infty < x < +\infty$) such that $x\psi(x) \in L^2(-\infty, +\infty)$ and

$$(Q\psi)(x) = x\psi(x). \tag{12-11}$$

A possible expression for $P$ in this representation is given by

$$(P\psi)(x) = -i\frac{d\psi(x)}{dx}. \tag{12-12}$$

The domain $D$ on which the commutation rules are satisfied is determined by the following conditions:

1) $\psi(x) \in D_Q$,

2) $\psi(x)$ is differentiable a.e., (12-13)

3) $\psi'(x) \in D_Q$.

With these restrictions on the domain, the following operations can be carried out:

$$(PQ\psi)(x) = -i\frac{d}{dx}\bigl(x\psi(x)\bigr) = -i\psi(x) - ix\psi'(x),$$

$$(QP\psi)(x) = -ix\psi'(x),$$

so that, for all such $\psi$,

$$[Q, P]\psi(x) = i\psi(x).$$

We have thus constructed the Schrödinger representation of the canonical commutation rules.
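The commutation rule in the Schrödinger representation can be checked exactly on a convenient family of test functions. The sketch below assumes functions of the form $p(x)e^{-x^2/2}$ with $p$ a polynomial, which lie in the domain described by conditions (12-13); this family and the helper names (`trim`, `add`, `Q`, `P`) are choices made for the illustration, not part of the text. A function is stored as the coefficient list of $p$, and differentiation uses $\frac{d}{dx}\bigl[p\,e^{-x^2/2}\bigr] = (p' - xp)\,e^{-x^2/2}$.

```python
# psi(x) = p(x) * exp(-x**2 / 2), with p stored as coefficients [c0, c1, ...].

def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q, sign=1):
    """Return p + sign*q for two coefficient lists of any lengths."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + sign * b for a, b in zip(p, q)])

def Q(p):
    """(Q psi)(x) = x psi(x): multiply the polynomial by x."""
    return trim([0] + list(p))

def P(p):
    """(P psi)(x) = -i dpsi/dx, using d/dx[p e^{-x^2/2}] = (p' - x p) e^{-x^2/2}."""
    dp = [k * c for k, c in enumerate(p)][1:] or [0]
    return trim([-1j * c for c in add(dp, Q(p), sign=-1)])

p = [2, -1, 0, 3]                              # p(x) = 2 - x + 3x^3
commutator = add(Q(P(p)), P(Q(p)), sign=-1)    # (QP - PQ) psi
expected = trim([1j * c for c in p])           # i psi
```

Because the coefficients are small integers, the arithmetic is exact and `commutator` coincides with `expected` term by term.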

The exponential form of this representation, that is, the representation of relation (12-6), is given by the formulas

$$(U_\alpha\psi)(x) = \psi(x + \alpha), \qquad (V_\beta\psi)(x) = e^{i\beta x}\psi(x). \tag{12-14}$$

It follows from these formulas that

$$(U_\alpha V_\beta\psi)(x) = e^{i\beta(x + \alpha)}\psi(x + \alpha),$$

$$(V_\beta U_\alpha\psi)(x) = e^{i\beta x}\psi(x + \alpha),$$

and we have verified Eq. (12-6).

We turn now to the question of the uniqueness of the irreducible representation of the commutation rules. This question is much easier to answer for the representation of Weyl's commutation rules than for the unbounded ones. For Weyl's form, von Neumann has proved the uniqueness of the irreducible representation up to unitary equivalence [1]. Many other proofs have been given since. We may thus state the

Theorem (von Neumann): Every irreducible representation of the commutation relations $U_\alpha V_\beta = e^{i\alpha\beta} V_\beta U_\alpha$ is unitarily equivalent to the Schrödinger representation (Eq. 12-14).
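Weyl's form of the commutation rule can also be checked numerically in the Schrödinger representation. The following is a minimal sketch under assumptions that are not in the text: the line is replaced by a periodic grid of `N` points with period `L`, translations are restricted to whole grid steps, and $\beta$ is restricted to multiples of $2\pi/L$ so that multiplication by $e^{i\beta x}$ is compatible with the periodicity. The function names `U` and `V` mirror the operators $U_\alpha$ and $V_\beta$; the test function `psi` is arbitrary.

```python
import cmath
import math

N = 64                      # number of grid points (illustrative choice)
L = 2 * math.pi             # period of the toy "line" (an assumption)
dx = L / N
xs = [j * dx for j in range(N)]

def U(k, psi):
    """Translation by alpha = k*dx: (U psi)(x) = psi(x + alpha), cyclically."""
    return [psi[(j + k) % N] for j in range(N)]

def V(n, psi):
    """Multiplication by exp(i*beta*x) with beta = 2*pi*n/L."""
    beta = 2 * math.pi * n / L
    return [cmath.exp(1j * beta * x) * p for x, p in zip(xs, psi)]

# An arbitrary complex test function sampled on the grid.
psi = [cmath.exp(-(x - 3.0) ** 2) * (1 + 0.5j * x) for x in xs]

k, n = 5, 3
alpha, beta = k * dx, 2 * math.pi * n / L
lhs = U(k, V(n, psi))                                         # U_alpha V_beta psi
rhs = [cmath.exp(1j * alpha * beta) * z for z in V(n, U(k, psi))]
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
```

On this grid the relation holds up to floating-point rounding, so `max_error` is of the order of machine precision. The check concerns only this discretized model and is no substitute for the uniqueness theorem.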

We shall not prove this theorem here, but we shall show that it is a corollary of another theorem, of much greater generality, due to Frobenius and Mackey, which plays a very important role in quantum mechanics. This is the theorem on systems of imprimitivities. This notion appears in the following way:

We are given a topological space $M$ (the "configuration space" of the particle) together with a locally compact transformation group $G$ which acts transitively on $M$. This means there exists a function $(q, x) \to [q]x$ on $M \times G$ ($q \in M$, $x \in G$) with values in $M$ with the following properties:

a) For each fixed $x \in G$ the function $q \to [q]x$ is a one-to-one and continuous mapping of $M$ onto itself.

b) $[[q]x_1]x_2 = [q]x_1x_2$ for all $x_1, x_2 \in G$ and all $q$.

c) $[q]e = q$.

d) If $q_1, q_2 \in M$, there exists an $x \in G$ such that $[q_1]x = q_2$ (transitivity).

Let us further assume that we are given a projection-valued measure on the Borel sets of $M$ and a representation of the group $G$, $x \to U_x$, such that

$$U_x^{-1} E_\Delta U_x = E_{[\Delta]x}, \tag{12-15}$$

where $[\Delta]x$ denotes the set $\{q : [q]x^{-1} \in \Delta\}$ obtained from $\Delta$ by the action of the group element $x$ on the configuration space.

A projection-valued measure $\Delta \to E_\Delta$ which satisfies the fundamental relation (12-15) is called a transitive system of imprimitivities for the representation $x \to U_x$ based on the space $M$.

A comparison of (12-15) with (12-3) shows immediately that the propositions $E_\Delta$ representing the localization of a particle in the Borel sets $\Delta$ are a transitive system of imprimitivities based on the real line $\Lambda$. Here the group $G$ is the translation group of $\Lambda$. It is obviously transitive.


The greater generality of the imprimitivity system, which is so useful for later application, is that we admit for $M$ any topological space and for $G$ any group which acts continuously and transitively on $M$.

For every $M$ and every group $G$, we can always construct a special system of imprimitivities called the canonical system, defined in the following way:

Let $\mu(\Delta)$ be a measure on $M$ with the property $\mu([\Delta]x) \sim \mu(\Delta)$. This means that all the measures which are obtained by the action of the group on the space $M$ are equivalent to one another. One can prove that such a measure always exists and that it is unique up to equivalence. For instance, in the example of system (12-3), the measure in question is simply Lebesgue measure on the real line. In this particular case the measure is even invariant under the translations. What we need is, however, only the weaker condition $\mu([\Delta]x) \sim \mu(\Delta)$. We call such a measure quasi-invariant under the group $G$.

Such a measure defines the unique Radon-Nikodym derivative $\rho_x(q)$ (cf. Section 1-4) which is an a.e. positive bounded function for all $x \in G$.

We can then define a Hilbert space $L^2_\mu(M)$ consisting of all complex valued measurable functions $f(q)$ over $M$ which are square-integrable:

$$\int_M |f(q)|^2\, d\mu(q) < \infty.$$

The transformation $U_x$ defined by

$$(U_x f)(q) = \sqrt{\rho_x(q)}\, f([q]x) \tag{12-16}$$

is then easily verified to be unitary.

We define the projection-valued measure $\Delta \to E_\Delta$ by setting

$$(E_\Delta f)(q) = 1_\Delta(q) f(q), \tag{12-17}$$

where $1_\Delta$ is the characteristic function of the set $\Delta$:

$$1_\Delta(q) = \begin{cases} 1 & \text{for } q \in \Delta, \\ 0 & \text{for } q \notin \Delta. \end{cases}$$

This measure defines a system of imprimitivities with respect to the representation $x \to U_x$. To see this, it suffices to verify the equation (Problem 5)

$$(U_x^{-1} E_\Delta U_x f)(q) = (E_{[\Delta]x} f)(q). \tag{12-18}$$
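The imprimitivity relation (12-18) can be checked by hand on a finite example. The sketch below makes several assumptions for the sake of illustration: the space $M$ and the group $G$ are both taken to be the integers modulo `N`, the group acts by $[q]x = q + x \pmod N$, and the measure is the counting measure, which is invariant so that the Radon-Nikodym factor equals 1. The names `act`, `U`, `E`, and `translate` are hypothetical.

```python
N = 7                                   # M = G = integers mod N (toy choice)

def act(q, x):
    """Group action [q]x = q + x (mod N)."""
    return (q + x) % N

def U(x, f):
    """(U_x f)(q) = f([q]x); counting measure is invariant, so no density factor."""
    return [f[act(q, x)] for q in range(N)]

def E(delta, f):
    """(E_Delta f)(q) = 1_Delta(q) f(q)."""
    return [f[q] if q in delta else 0 for q in range(N)]

def translate(delta, x):
    """[Delta]x = {q : [q]x^{-1} in Delta}; here x^{-1} is -x."""
    return {q for q in range(N) if act(q, -x) in delta}

f = [3, 1, 4, 1, 5, 9, 2]               # an arbitrary function on M
delta, x = {0, 2, 3}, 3
lhs = U(-x, E(delta, U(x, f)))          # U_x^{-1} E_Delta U_x f
rhs = E(translate(delta, x), f)         # E_{[Delta]x} f
```

Here `translate(delta, x)` is the set `{3, 5, 6}`, and both sides keep exactly the values of `f` at those three points. The example only exercises the algebra of the relation; the measure-theoretic content of the general construction does not appear in a finite model.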

We shall call it a canonical system of imprimitivities.

This procedure of constructing systems of imprimitivities can be generalized as follows: we consider an arbitrary point $q_0 \in M$ together with the subgroup $G_0$ of $G$ which consists of all those transformations which leave the point $q_0$ invariant. We shall denote the elements of this subgroup $G_0$ by means of Greek letters so that

$$[q_0]\xi = q_0 \quad \text{for all } \xi \in G_0,$$

and we call $G_0$ the "little group."

The choice of the point $q_0$ is arbitrary. The group $G_0$ will of course depend on that choice. However, the group associated with another choice will be a conjugate subgroup (Problem 6).

Next we consider an irreducible representation $\xi \to L_\xi$ of $G_0$ by unitary operators in a Hilbert space $\mathcal{H}_0$. We can now construct a new Hilbert space $\mathcal{H}^L$ in the following way. The elements of $\mathcal{H}^L$ are functions $f(x)$ on the group $G$ with values in $\mathcal{H}_0$ which satisfy the following conditions:

a) $(f(x), g)$ is a measurable function for all $g \in \mathcal{H}_0$;

b) for all $x \in G$ and all $\xi \in G_0$, one has $f(\xi x) = L_\xi f(x)$;

c) $\int_M \|f(x)\|^2\, d\mu < \infty$.

Property (c) needs an explanation. The functions $f(x)$ are defined on the group $G$ but the integration is extended over the space $M$. This is possible because, by condition (b), the quantity $\|f(x)\|^2$ is constant on all right cosets and the latter are in one-to-one correspondence with the points $q \in M$ (Problem 7). Since every element $x \in G$ is in exactly one right coset, there exists a natural homomorphism from $G$ to $M$ which is given explicitly by $x \to [q_0]x$.

For the same reason, we can also define the scalar product between two functions $F = \{f(x)\}$ and $G = \{g(x)\}$ by writing

$$(F, G) = \int_M (f(x), g(x))\, d\mu.$$

If addition and multiplication are defined by

$$F + G = \{f(x) + g(x)\},$$

$$\lambda F = \{\lambda f(x)\},$$

we obtain a Hilbert space $\mathcal{H}^L$. For each $x \in G$ we define a unitary operator $U_x^L$ by setting

$$(U_x^L f)(y) = \sqrt{\rho_x(y)}\, f(yx). \tag*{(12-16)'}$$

Here we have denoted by $\rho_x(y)$ the Radon-Nikodym derivative of the measure $\mu([\Delta]x)$ with respect to the measure $\mu(\Delta)$.

It is now easily verified that the operators $U_x^L$ are a unitary representation of the group $G$ in the space $\mathcal{H}^L$. It is called an induced representation.


Equipped with this representation, we can define a generalized irreducible canonical system of imprimitivities by setting

$$(E_\Delta f)(x) = 1_\Delta([q_0]x)\, f(x). \tag*{(12-17)'}$$

The verification of the relation

$$\bigl((U_x^L)^{-1} E_\Delta U_x^L f\bigr)(y) = (E_{[\Delta]x} f)(y) \tag*{(12-18)'}$$

is then just as easy as that of the corresponding relation (12-18).

There exists a very powerful theorem which states that every irreducible system of imprimitivities is unitarily equivalent to one of the canonical ones that we have constructed.

More explicitly, if $\Delta \to E_\Delta$ is such a system, where $E_\Delta$ are projections in the representation space $\mathcal{H}$ of the representation $x \to U_x$, then there exists an irreducible unitary representation $L$ of the little group and a unitary mapping $W$ which maps $\mathcal{H}$ onto $\mathcal{H}^L$ such that the image system of imprimitivities under this mapping is a canonical one [5, 6].

For the proof of the uniqueness of the commutation rules in Weyl's form, we need only a special case of this theorem. For in this particular case, the space $M$ is the real line and the group $G$ its one-parameter translation group, the little group $G_0$ is the trivial group $\{e\}$ and its representation is unity. Thus the particular irreducible imprimitivity system (12-3) is unique (up to unitary equivalence) and this establishes the uniqueness of the commutation rules (12-6) in the same sense.

This theorem has many useful applications. For instance the uniqueness theorem of the canonical commutation rules in Weyl's form (12-6) is a corollary of the imprimitivity theorem. Indeed any irreducible representation of (12-6) will determine a system of imprimitivities which is obtained from (12-6) by choosing for the $E_\Delta$ the spectral projections of $Q$. Since such a system is unique up to unitary equivalence, the same is true for the representation of (12-6).

The more general imprimitivity theorem permits the definition of localizability in other than Euclidean spaces, for instance on the surface of a sphere. This notion is also needed for the quantum mechanics of systems subject to constraints where position operators may not exist.

We have now established the uniqueness of the representations of Weyl's commutation rules (12-6) on the basis of the imprimitivity theorem. Every such representation furnishes also a representation of the canonical commutation rules (12-10). But the converse is not true; the uniqueness of (12-6) does not imply the uniqueness of (12-10). The reason for this asymmetry is the fact that operators which satisfy relations such as (12-10) cannot both be bounded (cf. Problem 4). Hence they are never definable on the entire Hilbert space. When this is the case, then the relations (12-10) do not imply the relations (12-6) even if $Q$ and $P$ are essentially self-adjoint on their respective domains.

It is possible to impose additional conditions on the operators $P$ and $Q$, so that their representations also become unique. Such additional conditions have been formulated by Rellich [7], Dixmier [8], Sz.-Nagy et al. [9], and Kilpi [10]. Unfortunately none of these conditions have any obvious physical interpretation.

PROBLEMS

*1. The Schrödinger representation of the canonical commutation rules defines two essentially self-adjoint operators on the domain $D$ given by the three conditions (12-13) (cf. reference 4).

2. An irreducible representation of a system of imprimitivities determines a maximal abelian von Neumann algebra $\mathcal{A} = \{E_\Delta\}''$ generated by the projections $E_\Delta$.

3. In any irreducible representation of the canonical commutation rules $[Q, P] = i$, the spectrum of $Q$ and $P$ is simple.

4. The symmetrical operators $Q$ and $P$ which satisfy $[Q, P] = i$ on some dense domain cannot both be bounded.

5. Define $E_\Delta$ and $U_x$ as in Eq. (12-17) and (12-16); then

$$U_x^{-1} E_\Delta U_x = E_{[\Delta]x}.$$

6. If $G_1$ is the subgroup of $G$ which leaves $q_1 \in M$ invariant and $G_2$ is similarly defined for another point $q_2 \in M$, then the two groups $G_1$ and $G_2$ are similar: There exists an element $x \in G$ such that $G_2 = x^{-1} G_1 x$.

7. If $G_0$ is a subgroup of $G$ acting on a homogeneous space $M$, then the right cosets of $G_0$ are in one-to-one correspondence with the points $q \in M$. This correspondence is a homeomorphism.

12-4. THE ELEMENTARY PARTICLE

With the representation theory of the canonical commutation rules out of the way, we can now give a precise definition of the notion of an "elementary particle." Let $M$ be the three-dimensional Euclidean space and $G$ the translation group in $M$. We define:

A localizable system with the system of imprimitivities $\{E_\Delta, U_\alpha\}$ describes an elementary particle if the system of imprimitivities is irreducible.

Irreducibility means that the only operators which commute with the entire system $\{E_\Delta, U_\alpha\}$ are the multiples of the unit operator. Thus $\{E_\Delta, U_\alpha\}' = \{\lambda I\}$ for elementary particles.

It is convenient to generalize this notion slightly in order to accommodate systems with spins, where the "nonelementary" aspect can be absorbed into a finite-dimensional matrix algebra. Thus we define:

A localizable system is quasi-elementary if the ring of bounded operators $\{E_\Delta, U_\alpha\}'$ is isomorphic to a finite-dimensional matrix algebra.

Whether such systems exist in nature is a question of experience. As far as we know there are many systems which have this property to a very good degree of approximation. However, one justifiably doubts whether the notion can have an absolute and fundamental significance. A great many particles such as nucleons, mesons, electrons in very energetic collisions behave more like systems with an internal structure which would require, for its complete description, the introduction of many more degrees of freedom than there would be available from the operators $\{E_\Delta, U_\alpha\}$. Notwithstanding this reservation, the notion of elementary particles is certainly useful for a large number of physical systems even if it is only approximate.

12-5. VELOCITY AND GALILEI INVARIANCE

The postulates of localizability and homogeneity have led us to a fairly complete description of the kinematical aspect of an elementary particle. As to the dynamical characteristics we know that the state of an isolated elementary particle will evolve according to the solution of a Schrödinger equation, expressed by a unitary group $U_t = e^{-iHt}$. But the principles which we have enunciated so far will give us no information about the nature of the evolution operator $H$. We shall now introduce new principles which will give us such information.

We begin with the definition of the velocity. As an observable, the velocity must be represented by a self-adjoint operator. It can be obtained from the position operator $Q(t) = U_t^{-1} Q U_t$ in the Heisenberg picture by a formal differentiation with respect to the time $t$. In this way we find that $\dot{Q}(0)$ is given by

$$\dot{Q} = i[H, Q]. \tag{12-19}$$

In this equation we have ignored questions pertaining to the domains associated with the unbounded operators. Such questions would have to be discussed if the arguments in this section were to be made mathematically rigorous.

It is seen from this definition that the velocity will in general depend on $H$. Conversely if we impose certain properties on $\dot{Q}$, then we must expect that they will restrict the evolution operator $H$.

In order to motivate the conditions which we shall impose on $\dot{Q}$, we consider for a moment the classical situation. The velocity $\dot{Q}$ depends on the system of reference with respect to which the velocity is measured. Thus if we change the system of reference to a new system which moves with the constant velocity $v$ with respect to the old system then the velocity of a particle will change according to

$$\dot{Q} \to \dot{Q} + v. \tag{12-20}$$

It is well known that the classical equations of motion are invariant under this transformation. We are thus led to consider the same transformation (12-20) in quantum mechanics, supplemented by the condition $Q \to Q$. We shall refer to such a transformation as the Galilei transformation.

In analogy to the classical principle of Galilei invariance, we would expect that in nonrelativistic quantum mechanics the transformation (12-20) should have a special significance. Inspired by this analogy we may formulate a principle of Galilei invariance:

The Galilei transformations (12-20) induce symmetry transformations in the lattice of propositions.

Let us now examine the consequences of this principle for an elementary particle in a one-dimensional space.

If Eq. (12-20) is a symmetry transformation, then this means that there exists a one-parameter unitary group $G_v$ with the properties

$$\dot{Q} + v = G_v \dot{Q} G_v^{-1} \tag{12-21}$$

and

$$G_{v_1} G_{v_2} = G_{v_1 + v_2}. \tag{12-22}$$

Furthermore, since $Q$ remains unchanged under Galilei transformation, $G_v$ commutes with $Q$. Since the system is elementary, this implies that $G_v$ is a function of $Q$. By Stone's theorem we may write $G_v = e^{ivK}$ with $K = u(Q)$, where $u$ is some Borel function on the real line.

We can go a step further if we combine the Galilei transformation with the displacements discussed in the preceding sections. We define the two-parameter family of unitary operators $W(\alpha, v)$ with the properties

$$Q + \alpha = W(\alpha, v)\, Q\, W^{-1}(\alpha, v), \qquad \dot{Q} + v = W(\alpha, v)\, \dot{Q}\, W^{-1}(\alpha, v). \tag{12-23}$$

Then $W(\alpha, v)$ is a projective representation of the translation group of the plane:

$$W(\alpha_1, v_1) W(\alpha_2, v_2) = \omega(\alpha_1, v_1; \alpha_2, v_2)\, W(\alpha_1 + \alpha_2, v_1 + v_2). \tag{12-24}$$

According to the general theory of such representations (cf. Section 9-6), it is possible to choose the arbitrary phase factors in the definition of $W$ in such a way that the factor $\omega$ in Eq. (12-24) assumes the form

$$\omega(\alpha_1, v_1; \alpha_2, v_2) = e^{-i(\mu/2)(\alpha_1 v_2 - \alpha_2 v_1)}, \tag{12-25}$$

where $\mu$ is an arbitrary real constant which distinguishes the different inequivalent projective representations of the two-dimensional translation groups.

The two one-parameter subgroups $U_\alpha$ and $G_v$ are recovered by specializing the parameter values according to

$$U_\alpha = W(\alpha, 0), \qquad G_v = W(0, v). \tag{12-26}$$

Specialized to these two subgroups, the relations (12-24) and (12-25) become

$$U_\alpha G_v = e^{-i\mu\alpha v}\, G_v U_\alpha. \tag{12-27}$$

Comparison of this equation with Eq. (12-6) shows that a possible solution is obtained by setting

$$G_v = V_{-\mu v} = e^{-i\mu v Q}, \tag{12-28}$$

and we conclude from the representation theorem of the preceding section that every other solution must be unitarily equivalent to it.

Before we proceed further, we shall examine the case $\mu = 0$. In this case we see from (12-27) that $G_v$ commutes with $U_\alpha$. According to the remark following Eq. (12-22), it also commutes with $Q$ and hence with $E_\Delta$. Since the system is supposed to be an elementary particle we conclude that $G_v$ must be a multiple of the identity. But this is impossible, since it contradicts Eq. (12-21). Hence, we conclude $\mu \neq 0$ [11].

Proceeding now with the expression (12-28), we find that

$$\frac{1}{\mu}P + v = G_v\, \frac{1}{\mu}P\, G_v^{-1}. \tag{12-29}$$

By taking the difference of (12-29) and (12-21), we obtain the result that $\dot{Q} - (1/\mu)P$ commutes with $G_v$ and therefore with $Q$, and so it can only be a function $u(Q)$ of $Q$. We have thus derived the fundamental relation

$$\dot{Q} = \frac{1}{\mu}P + u(Q). \tag{12-30}$$

The function $u(Q)$ depends on the representation of $P$. For a given $Q$ we can always find a representation of $P$ such that $u(Q) \equiv 0$. In order to see this, it suffices to show that there exists a canonical transformation $S$ which commutes with $Q$ and which transforms $P$ according to $SPS^{-1} = P - \mu u(Q)$ (Problem 1). Thus, replacing $P$ by $SPS^{-1}$ we find the relationship

$$\dot{Q} = \frac{1}{\mu}P. \tag{12-31}$$

If we combine this result with Eq. (12-19), we find

$$\frac{1}{\mu}P = i[H, Q]. \tag{12-32}$$

From the commutation rules for $P$ and $Q$, it follows that the operator $H_0 = (1/2\mu)P^2$ satisfies

$$\frac{1}{\mu}P = i[H_0, Q]. \tag{12-33}$$

Thus by taking the difference between Eqs. (12-33) and (12-32), we find that $H - H_0$ commutes with $Q$. Since we assumed that our system is an elementary particle, it follows that $H - H_0$ is a function of $Q$. Let us denote it by $v(Q)$.

We have thus found the most general evolution operator which is compatible with the principle of Galilei invariance. It has the form

$$H = \frac{1}{2\mu}P^2 + v(Q). \tag{12-34}$$
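That an evolution operator of the form (12-34) reproduces the velocity relation $i[H, Q] = (1/\mu)P$ can be checked exactly on sample functions. The sketch below is written under assumptions chosen only for illustration: the Schrödinger representation $P = -i\,d/dx$, test functions of the form $p(x)e^{-x^2/2}$ stored as polynomial coefficient lists, the value `MU = 2.0`, and the sample potential $v(Q) = Q^2$. All helper names are hypothetical.

```python
MU = 2.0   # illustrative value of the constant mu (an assumption)

def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q, sign=1):
    """Return p + sign*q for two coefficient lists of any lengths."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + sign * b for a, b in zip(p, q)])

def scale(c, p):
    """Multiply every coefficient by the number c."""
    return trim([c * a for a in p])

def Q(p):
    """Position operator: multiply by x."""
    return trim([0] + list(p))

def P(p):
    """Displacement operator -i d/dx acting on p(x) exp(-x^2/2)."""
    dp = [k * c for k, c in enumerate(p)][1:] or [0]
    return scale(-1j, add(dp, Q(p), sign=-1))

def H(p):
    """H = P^2/(2 mu) + v(Q) with the sample choice v(Q) = Q^2."""
    return add(scale(1 / (2 * MU), P(P(p))), Q(Q(p)))

p = [1, 2, 0, -1]                                       # p(x) = 1 + 2x - x^3
velocity = scale(1j, add(H(Q(p)), Q(H(p)), sign=-1))    # i[H, Q] psi
expected = scale(1 / MU, P(p))                          # (1/mu) P psi
max_residual = max(abs(c) for c in add(velocity, expected, sign=-1))
```

The potential term drops out of the commutator because it commutes with $Q$, which is the step used in the text to conclude that $H - H_0$ is a function of $Q$.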

Moreover, we have shown that the displacement operator $P$ is proportional to the velocity operator $\dot{Q}$. (More precisely, we can always adopt a representation so that it is.) The function $v(Q)$ can be interpreted as the effect of an external force acting on the particle. Classically it would represent the potential of that force. The function $v(Q)$ might even depend explicitly on time. In that case the evolution of the state is no longer described by a group, and some modification in the preceding derivation is necessary.

For the special case in which $H$ is invariant under displacements, the function $v(Q)$ must reduce to a constant. We shall refer to this case as a free particle.

We see from the foregoing that localizability and Galilei invariance, in the form which we have given it, lead to a one-parameter family of elementary particles distinguished by the real parameter $\mu$.

Let us now study the physical significance of the parameter $\mu$. In order to do this we refer to the classical aspects of an elementary particle. Classically such a system would be described by the movement of a point in configuration space, which in our example is the real line. It is therefore natural to identify the classical motion of the particle with the motion of the expectation value of the position operator. We shall denote this value by $x$, so that

$$x = \operatorname{Tr} WQ, \tag{12-35}$$

where $W$ is the density operator for the quantum-mechanical state. The velocity is then given by

$$\frac{dx}{dt} = \operatorname{Tr} W\dot{Q} = i \operatorname{Tr} W[H, Q] = \frac{1}{\mu} \operatorname{Tr} WP,$$

and for the momentum we find

$$m\frac{dx}{dt} = \frac{m}{\mu} \operatorname{Tr} WP, \tag{12-36}$$

where $m$ is the mass of the particle. We thus see that the operator $p = (m/\mu)P$ may be interpreted as the momentum operator. The quantity $m/\mu$ which connects the displacement operator with the momentum operator is a fundamental universal constant of the theory, which can be determined by any experiment which relates the measurement of a wavelength (for instance, a diffraction) to that of a momentum or energy. This constant is Planck's constant

$$\frac{m}{\mu} = \hbar = \frac{h}{2\pi}.$$

It has the approximate value

$$\hbar \approx 1.05 \times 10^{-27} \text{ erg sec}.$$

We can pursue the classical limit even further. If we calculate the second derivative of $x$ with the same method, we obtain

$$m\frac{d^2x}{dt^2} = -\hbar \operatorname{Tr}\left(W \frac{dv(Q)}{dQ}\right). \tag{12-37}$$

The operator $\hbar v(Q) \equiv V(Q)$ may thus be interpreted as the potential energy of the particle. Consistently with this we would interpret

$$\frac{\hbar}{2\mu}P^2 = \frac{1}{2m}p^2 \tag{12-38}$$

as the kinetic energy, and $\hbar H$ as the total energy of the particle.

The relations which we have now established lead to a simple rule which permits the transition from a classical to a quantum system. If, in the classical analogue of a spinless one-particle problem, we have a total energy of the form

$$E(p, q) = (1/2m)p^2 + V(q),$$

then we obtain the quantum mechanical evolution operator for such a system by reinterpreting the quantities $p$ and $q$ according to the formulas

$$p = \hbar P, \qquad q = Q, \qquad E = \hbar H.$$

PROBLEM

1. If $Q$ and $P$ are canonical operators, there exists a unitary operator $S$ which commutes with $Q$ and which transforms $P$ according to

$$SPS^{-1} = P + v(Q),$$

where $v$ is an arbitrary Borel function of a real variable.


12-6. THE HARMONIC OSCILLATOR

The simple harmonic oscillator in classical mechanics is a mechanical system of one degree of freedom with a total energy expression of the form

$$E = \frac{1}{2m}p^2 + \frac{f}{2}q^2. \tag{12-39}$$

According to the rule of the preceding section, the quantum-mechanical harmonic oscillator is characterized by the position operator $Q$, the displacement operator $P = (1/\hbar)p$ and the evolution operator

$$H = \frac{1}{2\mu}P^2 + \frac{f}{2\hbar}Q^2. \tag{12-40}$$

We define the characteristic frequency $\omega$ of the oscillator

$$\omega^2 = \frac{f}{m} = \frac{f}{\mu\hbar} \tag{12-41}$$

and carry out the canonical transformation

$$P \to \sqrt[4]{\frac{\mu f}{\hbar}}\,P, \qquad Q \to \sqrt[4]{\frac{\hbar}{\mu f}}\,Q;$$

then we obtain for $H$ in the new variables the expression

$$H = \frac{\omega}{2}(P^2 + Q^2). \tag{12-42}$$

The properties of this operator can be established in considerable detail. For the following calculations we shall omit the constant $\omega$ and reintroduce it in the final result.

It is easily seen that the operator $H$ is a positive definite operator, since, for any $\psi \in D_P$ and $\psi \in D_Q$,

$$(\psi, H\psi) = \tfrac{1}{2}\left(\|P\psi\|^2 + \|Q\psi\|^2\right) \geq 0.$$

The operator $H$ is also strictly positive since from $(\psi, H\psi) = 0$ it would follow that $P\psi = Q\psi = 0$, which is impossible because of the commutation rules given by Eq. (12-10).

Next we show that the operator $H$ has at least one eigenvalue. To this end we introduce the "annihilation" and "creation" operators defined by

$$a = \frac{1}{\sqrt{2}}(Q + iP), \qquad a^* = \frac{1}{\sqrt{2}}(Q - iP). \tag{12-43}$$

On a dense domain of $\mathcal{H}$, these operators satisfy the commutation rules

$$[a, a^*] = I. \tag{12-44}$$


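Furthermore, we have the relations

$$a^*a = \tfrac{1}{2}(P^2 + Q^2 - 1), \qquad aa^* = \tfrac{1}{2}(P^2 + Q^2 + 1),$$

so that

$$H = a^*a + \tfrac{1}{2} \equiv N + \tfrac{1}{2}. \tag{12-45}$$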

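Let $\psi_0$ be a normalized vector which satisfies

$$a\psi_0 = 0. \tag{12-46}$$

When this equation is expressed in the Schrödinger representation, we find that $\psi_0(x)$ is the solution of the differential equation

$$x\psi_0 + \frac{d\psi_0}{dx} = 0. \tag{12-47}$$

The solution of this equation is given by

$$\psi_0(x) = Ae^{-x^2/2}. \tag{12-48}$$

The real integration constant $A$ is determined by normalization: $\|\psi_0\|^2 = 1$, or

$$1 = \int_{-\infty}^{+\infty}|\psi_0(x)|^2\,dx = A^2\int_{-\infty}^{+\infty}e^{-x^2}\,dx = A^2\sqrt{\pi},$$

so that

$$\psi_0(x) = \frac{1}{\sqrt[4]{\pi}}\,e^{-x^2/2}. \tag{12-49}$$

This normalized vector of the space $\mathcal{H} = L^2(-\infty, +\infty)$ is thus an eigenvector of $N$ with the eigenvalue 0.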

We can obtain a whole series of further eigenvectors if we use the commutation rules (12-44) which yield the two formulas

$$Na = a(N - 1), \qquad Na^* = a^*(N + 1). \tag{12-50}$$

Consider now the vector $\psi_1 = a^*\psi_0$. It satisfies

$$N\psi_1 = Na^*\psi_0 = a^*(N + 1)\psi_0 = a^*\psi_0 = \psi_1.$$

It is thus an eigenvector of $N$ to the eigenvalue 1. Furthermore

$$\|\psi_1\|^2 = (a^*\psi_0, a^*\psi_0) = (\psi_0, aa^*\psi_0) = (\psi_0, (1 + N)\psi_0) = \|\psi_0\|^2 = 1,$$

so that it is also normalized.


We can continue this process and define the vector

$$\psi_n = \frac{1}{\sqrt{n!}}\,a^{*n}\psi_0.$$

It is an eigenvector of $N$ to the eigenvalue $n$ and it is normalized (Problem 1):

$$\|\psi_n\|^2 = \frac{1}{n!}\|a^{*n}\psi_0\|^2 = 1.$$

In this manner we have obtained an infinite sequence of normalized eigenvectors of $N$, one for each of the nonnegative integers $n$.

The eigenfunctions can be expressed explicitly in the Schrödinger representation:

$$\psi_n(x) = \frac{1}{\sqrt{2^n n!\sqrt{\pi}}}\left(x - \frac{d}{dx}\right)^n e^{-x^2/2}. \tag{12-51}$$

The execution of the differential operator leads to an expression of the form

$$\left(x - \frac{d}{dx}\right)^n e^{-x^2/2} = H_n(x)\,e^{-x^2/2},$$

where $H_n(x)$ is a polynomial of the $n$th degree in the variable $x$, the so-called Hermite polynomial.

Explicit evaluation of the first few polynomials gives:

$$H_0(x) = 1; \quad H_1(x) = 2x; \quad H_2(x) = 4x^2 - 2; \quad H_3(x) = 8x^3 - 12x.$$

The general formula is (Problem 2)

$$H_n(x) = \sum_{k=0}^{[n/2]}(-1)^k\frac{n!}{k!\,(n - 2k)!}(2x)^{n-2k},$$

where $[n/2]$ denotes the greatest integer $\leq n/2$. These polynomials satisfy the differential equation

$$H_n'' - 2xH_n' + 2nH_n = 0,$$

and the recursion relation

$$H_{n+1} - 2xH_n + 2nH_{n-1} = 0.$$
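The closed formula and the recursion relation can be checked against each other mechanically. The following is a minimal sketch in Python, using exact integer arithmetic; the function names and the coefficient-list representation (entry `j` holds the coefficient of $x^j$) are choices made for this illustration only and do not come from the text.

```python
from math import factorial


def hermite_closed_form(n):
    """Coefficients of H_n(x) from the closed-form sum; index j holds the x**j term."""
    coeffs = [0] * (n + 1)
    for k in range(n // 2 + 1):
        power = n - 2 * k
        # n! / (k! (n-2k)!) is always an integer, so integer division is exact.
        magnitude = factorial(n) // (factorial(k) * factorial(power))
        coeffs[power] = (-1) ** k * magnitude * 2 ** power
    return coeffs


def hermite_by_recursion(n):
    """Coefficients of H_n(x) from H_{m+1} = 2x H_m - 2m H_{m-1}."""
    prev, curr = [1], [0, 2]  # H_0 = 1 and H_1 = 2x
    if n == 0:
        return prev
    for m in range(1, n):
        nxt = [0] * (m + 2)
        for j, c in enumerate(curr):
            nxt[j + 1] += 2 * c      # contribution of 2x * H_m
        for j, c in enumerate(prev):
            nxt[j] -= 2 * m * c      # contribution of -2m * H_{m-1}
        prev, curr = curr, nxt
    return curr
```

For example, `hermite_closed_form(3)` yields the coefficient list of $8x^3 - 12x$, in agreement with the explicit polynomial quoted above.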

We shall show that the system of eigenvectors $\psi_n$ ($n = 0, 1, 2, \ldots$) is complete in $\mathcal{H}$ if $P$ and $Q$ are irreducible in $\mathcal{H}$. To this end let $\mathcal{H}_0 \subset \mathcal{H}$ be the space spanned by the vectors $\psi_n$ and denote by $E$ the projection with range $\mathcal{H}_0 = E\mathcal{H}$. If $f \in \mathcal{H}_0^\perp$ then $(f, \psi_n) = 0$ for all $n$. Since

$$(af, \psi_n) = (f, a^*\psi_n) = \sqrt{n + 1}\,(f, \psi_{n+1}) = 0,$$

and

$$(a^*f, \psi_n) = (f, a\psi_n) = \sqrt{n}\,(f, \psi_{n-1}) = 0,$$

it follows that $\mathcal{H}_0$ and $\mathcal{H}_0^\perp$ are invariant under $a$ and $a^*$. Thus the projection $E$ reduces $a$ and $a^*$; consequently it also reduces

$$Q = \frac{1}{\sqrt{2}}(a + a^*)$$

and

$$P = \frac{1}{i\sqrt{2}}(a - a^*).$$

Since $Q$ and $P$ are irreducible, it follows that $E = 0$ or $E = I$. The first alternative is impossible; hence $E = I$. This means that

$$(f, \psi_n) = 0 \text{ for all } n \text{ implies } f = 0,$$

and we have proved that the system $\psi_n$ is a complete system.

With this the operator $H = (\omega/2)(Q^2 + P^2)$ is completely analyzed with the result: There exists a complete system of eigenvectors $\psi_n$ with the eigenvalues $\lambda_n = \omega(n + \tfrac{1}{2})$ with $n = 0, 1, 2, \ldots$ Each eigenvalue is nondegenerate.
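The algebra of this section can be illustrated with finite matrices. The sketch below (Python with NumPy) represents $a$ and $a^*$ in the basis $\psi_0, \ldots, \psi_{d-1}$, truncated at a dimension $d$ chosen for illustration; the function names are hypothetical. Truncation is an assumption of the sketch and not part of the theory: the relation $[a, a^*] = I$ cannot hold for finite matrices, so the last basis vector is spoiled and only the upper-left block reproduces the exact result.

```python
import numpy as np


def ladder_operators(dim):
    """Truncated matrices of a and a* in the basis psi_0, ..., psi_{dim-1}."""
    # a psi_n = sqrt(n) psi_{n-1}: sqrt(1), ..., sqrt(dim-1) on the superdiagonal.
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    return a, a.T.copy()


def oscillator_hamiltonian(dim, omega=1.0):
    """H = (omega/2)(P^2 + Q^2) built from Q = (a + a*)/sqrt(2), P = (a - a*)/(i sqrt(2))."""
    a, a_dag = ladder_operators(dim)
    q = (a + a_dag) / np.sqrt(2.0)
    p = (a - a_dag) / (1j * np.sqrt(2.0))
    return 0.5 * omega * (p @ p + q @ q)


if __name__ == "__main__":
    h = oscillator_hamiltonian(8, omega=2.0)
    # All diagonal entries except the last (a truncation artifact) equal omega*(n + 1/2).
    print(np.real(np.diag(h))[:-1])
```

Building $H$ from $Q$ and $P$ rather than directly from $N$ makes the check slightly less circular: the diagonal form with entries $\omega(n + \tfrac{1}{2})$ then emerges from the matrix products.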

PROBLEMS

1. The vector $\psi_n = (1/\sqrt{n!})\,a^{*n}\psi_0$ is a normalized eigenvector of $N = a^*a$ with eigenvalue $n$.

2. $(x - (d/dx))^n e^{-x^2/2} = H_n(x)\,e^{-x^2/2}$, where

$$H_n(x) = \sum_{k=0}^{[n/2]}(-1)^k\frac{n!}{k!\,(n - 2k)!}(2x)^{n-2k}$$

and $[n/2]$ denotes the largest integer $\leq n/2$.

3. The calculation of the eigenfunctions and eigenvalues of the evolution operator

I A

+ Q2, withA >0,

can be reduced to that of the harmonic oscillator by the substitutions

4. The eigenfunctions of the operator H = 1- -f Qa, with a real, canbe determined by carrying out the substitution Q - Q I a.


12-7. A HILBERT SPACE OF ANALYTICAL FUNCTIONS [13]

The great similarity of the commutation relations for the operators $a$ and $a^*$ with those for $Q$ and $P$ raises the question whether the operators $a$, $a^*$ may be represented as differential and multiplication operators in a suitably defined Hilbert space analogously to the Schrödinger representation of the operators $Q$ and $P$. In such a space the vector $\psi_0$ would be represented by the function $f(z) = 1$ (since $a\psi_0 = 0$) and $\psi_n$ would have to be represented by

$$u_n(z) = \frac{z^n}{\sqrt{n!}}.$$

Naturally the linear combination $f = \sum_n c_n\psi_n$, with $\sum_n |c_n|^2 < \infty$, would be represented by the entire analytical function

$$f(z) = \sum_n c_n\frac{z^n}{\sqrt{n!}}.$$

Such functions clearly form a linear vector space. In order to have a Hilbert space, it is necessary to complete this structure of linear vector space by a scalar product between any two such functions $f$, $g$. This product must be consistent with the fact that $a^*$ is the Hermitian conjugate of $a$. Thus we must have, for any pair of vectors $f$, $g$ from the domain of $a$ and $a^*$,

$$\left(f, \frac{dg}{dz}\right) = (zf, g). \tag{12-52}$$

In particular, for the functions $u_n(z)$, we must have $(u_n, u_m) = \delta_{nm}$. If we write the scalar product in the form

$$(f, g) = \int f^*(z)\,g(z)\,d\mu(z), \tag{12-53}$$

then $d\mu(z)$ is a positive measure in the complex $z$-plane $C$ which is completely determined by the preceding conditions. Indeed from these conditions it follows that

$$(z^n, z^m) = n!\,\delta_{nm}. \tag{12-54}$$

In particular, for $n = m$,

$$\int_C |z|^{2n}\,d\mu(z) = n!. \tag{12-54$'$}$$

The set of equations (12-54)' are the conditions for a moment problem. As is well known from the theory of this problem, these conditions determine the positive measure $d\mu(z)$ uniquely. The measure which has this property for all values of $n$ is

$$d\mu(z) = \frac{1}{\pi}e^{-|z|^2}\,dx\,dy,$$

where $z = x + iy$ and $dx\,dy$ denotes the Lebesgue measure in the complex $z$-plane.
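The moment conditions can be verified numerically for the Gaussian measure $(1/\pi)e^{-|z|^2}\,dx\,dy$. The sketch below is only a plausibility check under stated assumptions: it uses polar coordinates, a midpoint rule, and a cutoff radius `r_max`, all of which (together with the function name) are illustrative choices and not part of the text.

```python
import math


def moment(n, r_max=12.0, steps=20_000):
    """Approximate the integral of |z|**(2n) with respect to (1/pi) exp(-|z|**2) dx dy."""
    # In polar coordinates the angular integral contributes 2*pi, which leaves
    # 2 * integral_0^inf r**(2n+1) * exp(-r**2) dr; this is evaluated by the midpoint rule.
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** (2 * n + 1) * math.exp(-r * r)
    return 2.0 * h * total


if __name__ == "__main__":
    for n in range(5):
        print(n, moment(n), math.factorial(n))
```

The orthogonality of $z^n$ and $z^m$ for $n \neq m$ needs no numerical check in this picture: the angular integral of $e^{i(m-n)\theta}$ over a full circle vanishes.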

We denote the Hilbert space thus constructed by $F$. It is isomorphic to the space $\mathcal{H}$ spanned by the eigenvectors $\psi_n$ of the harmonic oscillator. The explicit form of the isomorphism is given by the formula:

$$f(z) = \sum_n (\psi_n, f)\frac{z^n}{\sqrt{n!}}. \tag{12-55}$$

This realization has some remarkable properties which we shall briefly sketch here.

The space $F$ is a function space, but because its elements are analytical functions, the space differs in some respects from an $L^2$-space of complex-valued functions. We recall that, in the latter space, it is not the functions themselves which are the elements of the space but rather classes of equivalent functions, where two functions are called equivalent if they differ at most on a set of measure zero.

In the space $F$, on the other hand, every element is represented by exactly one function and not by a class of equivalent functions. This is so because two analytical functions which differ at most on a set of measure zero are identical. Another consequence of the same order is the fact that if a sequence of functions $f_n(z)$ converges in the mean to a limit function $f(z)$, then the functions converge pointwise to $f(z)$.

To the particular vector $f = \psi_0$ corresponds the function $f(z) = 1$. The operators $a$ and $a^*$ are represented by $d/dz$ and by $z$, respectively.

The harmonic-oscillator problem in this representation is defined by the evolution operator $H = a^*a = z(d/dz)$. The eigenvectors $u_n(z)$ are thus the properly normalized solutions of the differential equation

$$z\frac{du_n}{dz} = nu_n \tag{12-56}$$

given by $u_n(z) = z^n/\sqrt{n!}$.

A characteristic property of this space is the existence of a family of principal vectors $e_\alpha$ for each value of the complex number $\alpha = \alpha_1 + i\alpha_2$. They are defined in the following manner. For each value of $\alpha$ we consider the bounded linear functional (cf. Problem 6)

$$L_\alpha(f) = f(\alpha). \tag{12-57}$$

According to the theorem of Riesz (cf. Section 3-1), every such functional determines a unique vector $e_\alpha \in F$ such that

$$(e_\alpha, f) = f(\alpha). \tag{12-58}$$

From $(a^*f)(z) = zf(z)$ we find

$$(e_\alpha, a^*f) = \alpha f(\alpha) = \alpha(e_\alpha, f) = (\alpha^*e_\alpha, f)$$

for all $f \in F$. Thus $ae_\alpha = \alpha^*e_\alpha$. When this relation is expressed in $F$ it reads

$$\frac{de_\alpha}{dz} = \alpha^*e_\alpha,$$

from which we obtain

$$e_\alpha(z) = e^{\alpha^*z}. \tag{12-59}$$

Here we have already chosen the correct normalization factor which is determined by the condition $(e_\alpha, e_\alpha) = e_\alpha(\alpha)$. This may be rewritten and verified as the identity

$$\frac{1}{\pi}\int e^{\alpha z^* + \alpha^*z - |z|^2}\,dx\,dy = e^{|\alpha|^2}. \tag{12-60}$$

The expression (12-59) leads to the remarkable reproducing property for any analytical function $f(z) \in F$:

$$f(\alpha) = \frac{1}{\pi}\int e^{\alpha z^*}f(z)\,e^{-|z|^2}\,dx\,dy. \tag{12-61}$$

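If we choose for $f$ in (12-58) the function $e_\beta$, we obtain the relation

$$(e_\alpha, e_\beta) = e_\beta(\alpha) = e^{\beta^*\alpha}. \tag{12-62}$$

The normalized vectors $(1/\|e_\alpha\|)e_\alpha$ are thus given by

$$\frac{1}{\|e_\alpha\|}\,e_\alpha(z) = e^{-|\alpha|^2/2 + \alpha^*z}. \tag{12-63}$$

It is easy to see that the set of vectors $e_\alpha$ is total in $F$. Indeed if $f$ were orthogonal to all of them, it would have to satisfy $(e_\alpha, f) = f(\alpha) = 0$ for all $\alpha$; that is, $f = 0$.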
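The reproducing property of the principal vectors can be illustrated with truncated power series. In the sketch below an element of $F$ is stored as its list of expansion coefficients $c_n = (\psi_n, f)$, so that the scalar product becomes $\sum_n c_n^* d_n$; the coefficients $\alpha^{*n}/\sqrt{n!}$ used for $e_\alpha$ are those stated in Problem 1 of this section. The truncation order `N_TERMS` and all function names are assumptions of this illustration.

```python
import cmath
from math import factorial, sqrt

N_TERMS = 60  # truncation order of the series (an assumption of this sketch)


def principal_vector(alpha):
    """Expansion coefficients (psi_n, e_alpha) = conj(alpha)**n / sqrt(n!)."""
    return [complex(alpha).conjugate() ** n / sqrt(factorial(n)) for n in range(N_TERMS)]


def inner(f, g):
    """Scalar product (f, g) = sum_n conj(c_n) d_n, antilinear in the first argument."""
    return sum(complex(fc).conjugate() * gc for fc, gc in zip(f, g))


def evaluate(f, z):
    """Value f(z) = sum_n c_n z**n / sqrt(n!) of the analytic function with coefficients f."""
    return sum(c * z ** n / sqrt(factorial(n)) for n, c in enumerate(f))


if __name__ == "__main__":
    alpha, beta = 0.7 - 0.4j, -0.3 + 1.1j
    # Both numbers should agree up to truncation and rounding error.
    print(inner(principal_vector(alpha), principal_vector(beta)))
    print(cmath.exp(beta.conjugate() * alpha))
```

With these conventions `inner(principal_vector(alpha), f)` reproduces `evaluate(f, alpha)` for any coefficient list `f` shorter than the truncation order, which is the content of Eq. (12-58).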

On the other hand, the vectors $e_\alpha$ are linearly dependent. This can be seen most easily from the relation $(e_z, e_\beta) = e_\beta(z)$. When the integration on the left is written out explicitly, one obtains

$$\int e_z^*(\alpha)\,e_\beta(\alpha)\,d\mu(\alpha) = e_\beta(z).$$

Using $e_z^*(\alpha) = e_\alpha(z)$ (cf. Eq. 12-59), this relation becomes

$$\int e_\alpha(z)\,e_\beta(\alpha)\,d\mu(\alpha) = e_\beta(z), \tag{12-64}$$

which expresses $e_\beta(z)$ linearly in terms of the $e_\alpha(z)$. In spite of this linear dependence of the vectors, it is possible to give a unique expansion of every vector $f$ in terms of the $e_\alpha$,

$$f = \int c(\alpha)\,e_\alpha\,d\mu(\alpha), \tag{12-65}$$

provided we subject the $c(\alpha)$ to the additional condition that $c(\alpha)$ be an analytical function of $\alpha$ contained in $F$. Indeed from the relation (12-65) we can calculate $c(\alpha) = f(\alpha)$. Uniqueness then follows from the completeness relation of the $e_\alpha$.

It follows from these remarks that any finite set of the vectors $e_\alpha$ is necessarily linearly independent. This gives us the possibility of an intrinsic construction of the Hilbert space $F$.

Let $e_i \equiv e_{\alpha_i}$ and define the scalar product of two finite linear combinations $f = \sum_i \lambda_ie_i$, $g = \sum_k \kappa_ke_k$ by setting

$$(f, g) = \sum_{i,k}\lambda_i^*\kappa_k\,e^{\alpha_k^*\alpha_i}.$$

This set of vectors provided with this scalar product is a linear manifold with a scalar product. This is called a pre-Hilbert space. It can be completed to a Hilbert space by adjoining the limit points. This is the space $F$.

The remarkable property of the $e_\alpha$ of behaving like a coordinate system, although they are linearly dependent, is brought to light even more explicitly by the fact that in the space $F$ every linear operator $T$ such that $e_\alpha \in D_T$ is an integral operator and its matrix elements are given by

$$T(\alpha, \beta) \equiv (e_\alpha, Te_\beta). \tag{12-66}$$

We have already seen this for the identity operator for which $T(\alpha, \beta) = e^{\alpha\beta^*}$ (cf. Eq. 12-61). For the general case we prove it by evaluating

$$\begin{aligned}
(Tf)(\alpha) &= (e_\alpha, Tf) = (T^*e_\alpha, f) \\
&= \int (T^*e_\alpha)^*(\beta)\,f(\beta)\,d\mu(\beta) \\
&= \int (Te_\beta)(\alpha)\,f(\beta)\,d\mu(\beta) \\
&= \int (e_\alpha, Te_\beta)\,f(\beta)\,d\mu(\beta).
\end{aligned}$$

Here we have used the identity $(Te_\beta)^*(\alpha) = (T^*e_\alpha)(\beta)$, which is left as a problem (7).

The states $e_\alpha$ are especially useful in connection with the study of coherence properties of radiation (cf. reference 14).

PROBLEMS

1. The vectors $e_\alpha$ have a representation

$$e_\alpha = \sum_n \frac{\alpha^{*n}}{\sqrt{n!}}\,\psi_n.$$

[Hint: Use the relation $f(\alpha) = (e_\alpha, f)$ together with Eq. (12-55).]


2. Define = V82 = and

E'V(a) (a = a1 + ia2)then

W(a)WQ3) = e012) + /3).

3. For each a = a1 + Ia2, one finds

W(a)QW1(a) = Q + W(a)PW'(a) = P

4. The vector can be obtained from by the "translation" = W 1(a)çU

5. For any f e 31° one has, for the correspondingf(z) e

f(a) = E'V(a)f).

The correspondence which associates for each $\alpha \in C$ and for each $f(z) \in F$ the number $f(\alpha)$ is a bounded linear functional.

6. The bound of the linear functional $L_\alpha$ is $e^{|\alpha|^2/2}$. From this one obtains the inequality

$$|f(z)| \leq e^{|z|^2/2}\,\|f\|,$$

valid for all $f \in F$.

7. For any linear operator $T$ in $F$ such that $e_\alpha \in D_T$ we have

$$(Te_\beta)^*(\alpha) = (T^*e_\alpha)(\beta).$$

12-8. LOCALIZABILITY AND MODULARITY

In Section 5-6 we discussed the notion of modularity, and we stated there without proof that modularity is incompatible with localizability. In this section we shall give the proof of this assertion. It involves studying the structure of the lattice generated by the projection operators of $Q$ and $P$.

Let us denote by $E_\Delta$ the spectral projection of the spectral measure of $Q$ associated with the Borel set $\Delta$ on the real line. Similarly we denote by $F_\Delta$ the corresponding spectral projection of $P$. We define the lattice $\mathcal{L}$ generated by $E_\Delta$ and $F_\Delta$ as the smallest lattice of projections which is complete and closed under countably infinite operations of unions and intersections of projections in $\mathcal{L}$. The intersection of any pair of projections $X, Y \in \mathcal{L}$ is defined by

$$X \cap Y = \lim_{n\to\infty}(XY)^n,$$

and the union is defined by

$$X \cup Y = I - (I - X) \cap (I - Y).$$

If $X_i$ is any sequence of projections in $\mathcal{L}$ then by definition the projections $\bigcap_i X_i$ and $\bigcup_i X_i$ are also in $\mathcal{L}$. Furthermore $\mathcal{L}$ contains all the projections $E_\Delta$ and $F_\Delta$, and it is the smallest lattice with these properties.
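The two lattice operations are easy to try out in a finite-dimensional space, where projections are matrices. The following sketch (Python with NumPy) approximates the limit of $(XY)^n$ by a fixed finite power; the iteration count, the function names, and the three-dimensional example are assumptions made for illustration, and the infinite-dimensional projections $E_\Delta$, $F_\Delta$ of the text are not modeled.

```python
import numpy as np


def projection_onto(vectors):
    """Orthogonal projection onto the span of the given (linearly independent) vectors."""
    basis = np.column_stack([np.asarray(v, dtype=float) for v in vectors])
    q, _ = np.linalg.qr(basis)  # orthonormal basis of the span
    return q @ q.T


def meet(x, y, iterations=200):
    """Intersection of X and Y, approximated by the finite power (XY)**iterations."""
    return np.linalg.matrix_power(x @ y, iterations)


def join(x, y, iterations=200):
    """Union of X and Y, defined as I - (I - X) meet (I - Y)."""
    identity = np.eye(x.shape[0])
    return identity - meet(identity - x, identity - y, iterations)


if __name__ == "__main__":
    # Two planes in three dimensions that share only the first coordinate axis.
    x = projection_onto([[1, 0, 0], [0, 1, 0]])
    y = projection_onto([[1, 0, 0], [0, 1, 1]])
    print(np.round(meet(x, y), 6))
    print(np.round(join(x, y), 6))
```

In this example the intersection is the projection onto the common axis and the union is the identity, as one expects for two distinct planes through a common line.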


The question which we wish to answer here is whether the lattice $\mathcal{L}$ thus defined is modular or not. If $\mathcal{L}$ were identical with the lattice of all projections, our task would be finished, because we have shown in Section 5-7 that this lattice is nonmodular. But it is not certain whether this is the case, and so we must employ more powerful results to prove the nonmodularity of $\mathcal{L}$.

The crucial property which we need for this proof is the $\cap$-continuity and its dual, the $\cup$-continuity, which we shall formulate now.

A lattice $\mathcal{L}$ is called $\cap$-continuous if, for every nondecreasing sequence $a_1 \leq a_2 \leq \cdots \leq a_n \leq \cdots$ and every $b \in \mathcal{L}$, we have

$$b \cap \Bigl(\bigcup_n a_n\Bigr) = \bigcup_n (b \cap a_n).$$

A lattice $\mathcal{L}$ is called $\cup$-continuous if, for every nonincreasing sequence $a_1 \geq a_2 \geq \cdots \geq a_n \geq \cdots$ and every $b \in \mathcal{L}$, we have

$$b \cup \Bigl(\bigcap_n a_n\Bigr) = \bigcap_n (b \cup a_n).$$

We are now equipped to understand the meaning of the following theorem of Kaplansky [12]:

Theorem (Kaplansky): A complete, orthocomplemented, modular lattice is $\cap$-continuous and $\cup$-continuous.

The relevance of this theorem for our problem is evident since the lattice $\mathcal{L}$ defined above is complete and orthocomplemented. Hence if it is also modular, it satisfies the hypotheses of the theorem and therefore also its conclusion. Thus let us examine whether the lattice $\mathcal{L}$ is $\cap$-continuous and $\cup$-continuous.

To this end we consider the increasing sequence of Borel sets

$$\Delta_n = \{\lambda \mid -n \leq \lambda \leq +n\}.$$

We have

$$\Delta_n \subset \Delta_{n+1},$$

and consequently also, for $E_n \equiv E_{\Delta_n}$,

$$E_n \leq E_{n+1}.$$

Furthermore we find that $\bigcup_n E_n = I$. Now let $b$ be represented by a projection $F \equiv F_\Delta$ where $\Delta$ is a fixed finite interval on the real line. We then find from $\cap$-continuity the relation

$$F = F \cap \Bigl(\bigcup_n E_n\Bigr) = \bigcup_n (F \cap E_n).$$

But this relation is easily seen to be impossible, for the simple reason that every one of the terms $F \cap E_n$ on the right-hand side is the zero projection.


This last property can be seen as follows. If $\psi$ is a vector contained in the range of $F \cap E_n$ then this $\psi$ would appear in the $x$-representation as a function $\psi(x)$ which vanishes outside the finite interval $\Delta_n = (-n, +n)$. Its Fourier transform also would vanish outside the finite interval $\Delta$. One can show that such a function must be identically zero (the Fourier transform of a function which vanishes outside a finite interval is an entire analytic function, and such a function cannot vanish on an interval without vanishing everywhere). Hence $F \cap E_n = 0$.

With this we have shown that the hypothesis that $\mathcal{L}$ is modular is incorrect; hence $\mathcal{L}$ is not modular. We have thereby justified the choice of a weaker axiom for the structure of the lattice of propositions.

It should be mentioned that we have proved the nonmodularity of the proposition system only in the context of the usual Hilbert-space formulation of quantum mechanics, under the assumption that the system is localizable. The question of whether localizability can be formulated within a modular lattice of propositions in a more general setting remains open.

REFERENCES

1. J. VON NEUMANN, Math. Annalen 104, 570 (1931).

2. A. PLESSNER, J. für Math. 160, 26 (1929).

3. M. H. STONE, loc. cit. (Chapter 2, reference 5).

4. N. I. ACHIESER AND I. M. GLASMANN, loc. cit. (Chapter 3; especially Section 49).

5. G. MACKEY, Amer. J. Math. 73, 576 (1951).

6. G. MACKEY, Ann. of Math. 55, 101 (1952); 58, 193 (1953).

7. F. RELLICH, Gött. Nachr. 107, (1946).

8. J. DIXMIER, Compositio Math. 13, 263 (1958).

9. C. FOIAS, L. GEHER, B. SZ.-NAGY, Acta Sci. Math. Szeged 21, 78 (1960).

10. Y. KILPI, Ann. Acad. Sci. Fennicae, A315, 3 (1962).

11. E. INONU AND F. P. WIGNER, Nuovo Cim. 9, 705 (1952).

12. I. KAPLANSKY, Ann. of Math. 61, 524 (1955).

13. V. BARGMANN, Commun. Pure and Appl. Math. 14, 187 (1961); Proc. Nat. Acad. Sci. 48, 199 (1962).

14. R. J. GLAUBER, Phys. Rev. 131, 2766 (1963).

For the concept of localizability in relativistic quantum mechanics, consult:

15. T. D. NEWTON AND E. P. WIGNER, Rev. Mod. Phys. 21, 400 (1949).

16. L. FOLDY AND S. WOUTHUYSEN, Phys. Rev. 78, 29 (1950).

17. A. S. WIGHTMAN, Rev. Mod. Phys. 34, 845 (1962).


CHAPTER 13

THE ELEMENTARY PARTICLE WITHOUT SPIN

After having reached an opinion for a special case, one gradually modifies the circumstances of this case in one's imagination as far as possible, and in so doing tries to stick to the original opinion as closely as one can.

E. MACH

This chapter is devoted to the quantum mechanics of the spinless elementary particle in three-dimensional space. The first two sections introduce the notions of localizability, homogeneity and isotropy. The first two are notions which are easily recognized as straightforward generalizations of the analogous notions in one dimension treated in the preceding chapter. The greater richness of the three-dimensional space is mirrored in the new notion of isotropy. It originates from the complexity of the rotation group which is discussed in some detail in Section 13-3. The dynamical structure is introduced as in the preceding chapter via the notion of Galilei invariance (Section 13-4), which largely determines the structure of the evolution operator $H$. In the following section (13-5) we discuss gauge transformations and we show that gauge invariance is a consequence of Galilei invariance as formulated here. The question of the localization of various observable quantities leads us to the general definition of the densities of observables of which the probability density and the probability current are special cases. The last two sections are devoted to space inversion (13-7) and time reversal (13-8).

13-1. LOCALIZABILITY

In this section we shall formulate the notion of localizability in a Euclidean space of three dimensions. This notion is a generalization of the content of Section 12-1 of the preceding chapter. We aim at a complete quantum-mechanical description of an elementary particle which moves in the three-dimensional physical space $E_3$. Let us briefly recapitulate the basic concepts which are needed for such a description.

As we explained in Section 12-1, the notion of localizability is described by a function $\Delta \to E_\Delta$ which associates with every Borel subset $\Delta$ from $E_3$ a projection operator. The Borel sets $\Delta$ are subsets of $E_3$ and the projections $E_\Delta$ are projections representing the yes-no experiments to find the particle in the subsets $\Delta$. We assume again, as before, that these elementary propositions are all compatible and that, therefore, the projection operators all commute with one another. From this property follow the fundamental relations (12-1), which say that the correspondence $\Delta \to E_\Delta$ is a projection-valued measure.

There is an important difference between this measure and the one defined in the preceding chapter. The latter was defined on the Borel sets of the real line, the former on the Borel sets of $E_3$. A projection-valued measure on the real line defines via the spectral theorem a self-adjoint operator, the position operator.

In $E_3$ it is also possible to define an operator if the measure $\Delta \to E_\Delta$ represents an elementary system. We define the system as "elementary" if the projections $E_\Delta$ generate a maximal abelian von Neumann algebra $\mathcal{A}$, so that

$$\{E_\Delta\}'' = \mathcal{A} = \mathcal{A}'. \tag{13-1}$$

According to a theorem of von Neumann [1], a maximal abelian algebra can always be generated by a single self-adjoint operator $X$ such that $\{X\}'' = \mathcal{A}$. However, such an operator has no physical interpretation. There is no experiment known which would effectuate a measurement of the quantity represented by such an $X$. From the physical point of view this $X$ is therefore not very useful.

Much more useful are the position operators associated with the three Cartesian coordinates of a point in $E_3$. We can define them as follows:

Let $E_\lambda^{(r)}$ ($r = 1, 2, 3$) be the projections associated with the sets

$$\Delta_\lambda^{(r)} = \{x_1, x_2, x_3 \mid x_r < \lambda\} \quad (r = 1, 2, 3), \tag{13-2}$$

where $x_r$ are the Cartesian coordinates of a point in $E_3$. It follows from this definition that

$$E_{\lambda_1}^{(r)} \leq E_{\lambda_2}^{(r)} \ \text{for } \lambda_1 < \lambda_2, \quad \text{and} \quad E_{-\infty}^{(r)} = 0, \quad E_{+\infty}^{(r)} = I. \tag{13-3}$$

Thus the $E_\lambda^{(r)}$ are, for each $r = 1, 2, 3$, a spectral family and therefore they define three operators $Q_r$ ($r = 1, 2, 3$). These operators represent the Cartesian coordinates of the particle. More generally, if $u(x)$ is a measurable function with respect to the measure $\Delta \to E_\Delta$, that is, such that for any $f \in L^2(E_3)$ the function is measurable in the ordinary sense with respect to the numerical measure $\Delta \to \rho_f(\Delta) \equiv (f, E_\Delta f)$, then we may define an operator $u(Q)$ by setting for any $f$

$$(f, u(Q)f) = \int u(x)\,d\rho_f, \quad \text{for all } f \in L^2(E_3).$$


We might symbolically write for this operator

$$u(Q) = \int_{E_3} u(x)\,dE.$$

This notation should be interpreted as an abbreviation of the precedingequation.

Just as we have done in Section 12-3, we can construct a special representation, the Schrödinger representation of the projection-valued measure $\Delta \to E_\Delta$ and the operators $Q_r$. It is defined in the Hilbert space $L^2(E_3)$ of Lebesgue square-integrable complex-valued functions $\psi(x)$. The projection operators $E_\Delta$ then operate on the functions according to the formula

$$(E_\Delta\psi)(x) = 1_\Delta(x)\,\psi(x), \tag{13-4}$$

where $1_\Delta(x)$ is the characteristic function of the set $\Delta$, defined by

$$1_\Delta(x) = \begin{cases} 1 & \text{for } x \in \Delta, \\ 0 & \text{for } x \notin \Delta. \end{cases}$$

In this representation the operators $Q_r$ appear as multiplication operators:

$$(Q_r\psi)(x) = x_r\psi(x) \quad (r = 1, 2, 3),$$

while for any function $u(Q)$ we have

$$(u(Q)\psi)(x) = u(x)\,\psi(x).$$

The operator $u(Q)$ is self-adjoint if $u(x)$ is a real-valued Borel function, and the spectral projections of $u(Q)$ are then given by the formula

$$E_\Delta^{u(Q)} = E_{u^{-1}(\Delta)}. \tag{13-5}$$
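The multiplication form of the projections makes their lattice properties almost self-evident, and this can be seen on sampled wave functions. In the sketch below (Python with NumPy) a wave function is represented by its values on a finite set of sample points and a Borel set by a boolean predicate; this discretization and all function names are assumptions of the illustration, not constructions from the text.

```python
import numpy as np


def characteristic(region, points):
    """Values of the characteristic function 1_Delta at the sample points."""
    return region(points).astype(float)


def apply_projection(region, psi_values, points):
    """(E_Delta psi)(x) = 1_Delta(x) psi(x), evaluated at the sample points."""
    return characteristic(region, points) * psi_values


def apply_position(r, psi_values, points):
    """(Q_r psi)(x) = x_r psi(x) for r = 0, 1, 2 (zero-based coordinate index)."""
    return points[:, r] * psi_values


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(5, 3))
    psi = np.exp(-np.sum(points ** 2, axis=1))
    unit_ball = lambda p: np.linalg.norm(p, axis=1) < 1.0
    print(apply_projection(unit_ball, psi, points))
```

Applying two such projections in either order multiplies by the product of the two characteristic functions, which is the characteristic function of the intersection; this is the commutativity assumed for the elementary propositions.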

13-2. HOMOGENEITY AND ISOTROPY

We must now express the physically fundamental properties of homogeneity and isotropy of the physical space. We begin with homogeneity and we formulate it by generalizing the notions introduced in Section 12-2.

The translation by a vector $\alpha$ transports every Borel set $\Delta$ into the set $\Delta + \alpha$. It consists of all the points of the form $x + \alpha$ with $x \in \Delta$.

This transformation is physically indifferent if there exist unitary operators $U_\alpha$ such that

$$E_{\Delta+\alpha} = U_\alpha^{-1}E_\Delta U_\alpha. \tag{13-6}$$

The operators $U_\alpha$ are only determined up to an arbitrary phase factor. A choice of this factor is possible, so that the $U_\alpha$ are a (projective) representation of the three-dimensional translation group

$$U_\alpha U_\beta = \omega(\alpha, \beta)\,U_{\alpha+\beta}. \tag{13-7}$$

According to the theorem of Bargmann, there are three different classes of such projective representations possible. In our case only the class which contains the vector representation ($\omega \equiv 1$) can occur. This is so because the physical space is not only homogeneous; it is also isotropic. Thus we shall proceed to the notion of isotropy.

A rotation $R$ of $E_3$ is a transformation of the point with coordinates $x_r$ into the point with the coordinates

$$x_r' = \sum_{s=1}^{3}R_{rs}x_s, \tag{13-8}$$

where $R_{rs}$ is a real orthogonal matrix. Every Borel set $\Delta$ transforms under the operation $R$ into another one $\Delta' = [\Delta]R$ consisting of all the points $x' = Rx$ with $x \in \Delta$. Isotropy of the physical space is then taken to mean that there exists a set of unitary operators $U_R$ such that for every $\Delta$

$$E_{[\Delta]R} = U_R^{-1}E_\Delta U_R. \tag{13-9}$$

The $U_R$ too are a (projective) representation of the rotation group

$$U_{R_1}U_{R_2} = \omega(R_1, R_2)\,U_{R_1R_2}. \tag{13-10}$$

We have shown in Section 9-6, Problem 4, that every such representation is equivalent to a vector representation of the universal covering group. This means that a suitable choice of the phase factors in $U_R$ will reduce the factor $\omega$ in Eq. (13-10) to $\pm 1$.

The two groups of translations and rotations can be welded together into a single six-parameter group of Euclidean motions of $E_3$. Let $(\alpha, R)$ represent the general element of this group consisting of the rotation $R$, followed by the translation $\alpha$. The composition law of this group is then given by

$$(\alpha_1, R_1)(\alpha_2, R_2) = (\alpha_1 + R_1\alpha_2, R_1R_2) \equiv (\alpha', R'). \tag{13-11}$$

Let $\Delta'$ be the set obtained from $\Delta$ by the operation $(\alpha, R)$. Then the two kinds of symmetries lead to the condition

$$E_{\Delta'} = W^{-1}(\alpha, R)\,E_\Delta\,W(\alpha, R), \tag{13-12}$$

where $W(\alpha, R)$ is a (projective) representation of the group of Euclidean motions in $E_3$. It was shown by Bargmann (cf. reference in Section 9-6) that there exists locally only one class of projective representations of this group, and it contains the vector representation. Thus we may assume that, at least locally, we have the relations

$$W(\alpha_1, R_1)\,W(\alpha_2, R_2) = W(\alpha_1 + R_1\alpha_2, R_1R_2). \tag{13-13}$$


An explicit representation of this relation is obtained from the Schrödinger representation by setting

$$(W(\alpha, R)\psi)(Rx + \alpha) = \psi(x). \tag{13-14}$$

With this definition of $W(\alpha, R)$, Eq. (13-13) is identically satisfied. Furthermore, the $W(\alpha, R)$ thereby defined is unitary (Problems 1 and 2).
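The statement that the definition (13-14) satisfies the composition law (13-13) can be checked pointwise. The sketch below (Python with NumPy) represents a group element as a pair `(a, r)` of a translation vector and a rotation matrix, and a wave function as an ordinary Python callable; these representations and the helper names are assumptions of the illustration.

```python
import numpy as np


def rotation_about_axis(axis, phi):
    """Rotation matrix by the angle phi about coordinate axis 1, 2 or 3."""
    i, j = [(1, 2), (2, 0), (0, 1)][axis - 1]
    r = np.eye(3)
    c, s = np.cos(phi), np.sin(phi)
    r[i, i], r[i, j], r[j, i], r[j, j] = c, -s, s, c
    return r


def compose(g1, g2):
    """Composition law (a1, R1)(a2, R2) = (a1 + R1 a2, R1 R2)."""
    (a1, r1), (a2, r2) = g1, g2
    return (a1 + r1 @ a2, r1 @ r2)


def act(g, psi):
    """Return W(a, R) psi, defined through (W psi)(R x + a) = psi(x)."""
    a, r = g
    # Solving y = R x + a for x gives x = R^T (y - a), since R^{-1} = R^T.
    return lambda y: psi(r.T @ (y - a))


if __name__ == "__main__":
    g1 = (np.array([0.3, -1.0, 0.5]), rotation_about_axis(3, 0.8))
    g2 = (np.array([-0.7, 0.2, 1.1]), rotation_about_axis(1, -1.3))
    psi = lambda x: (1.0 + x[0] + 2.0 * x[1] ** 2) * np.exp(-x @ x)
    y = np.array([0.4, -0.2, 0.9])
    # The two printed numbers should coincide up to rounding.
    print(act(g1, act(g2, psi))(y), act(compose(g1, g2), psi)(y))
```

Unitarity is not tested here; it follows from the fact that the substitution $x \to Rx + \alpha$ preserves the Lebesgue measure.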

The relation (13-13), which holds in a suitable neighborhood of the identity, can actually be extended to the entire group of Euclidean motions (Problem 3). This shows incidentally that only the single-valued representations can be obtained in the form (13-14).

The operators $W(\alpha, R)$, together with the projection-valued measure $\Delta \to E_\Delta$ defined by Eq. (13-4), are the canonical system of imprimitivities for the representation $(\alpha, R) \to W(\alpha, R)$ based on the Euclidean space $E_3$ (cf. Section 12-3, Problem 4).

It is useful to remember at this point that the condition (13-12) determines the canonical representation $W(\alpha, R)$ only up to unitary equivalence. In fact, let $\Omega$ be any unitary operator which commutes with all the $E_\Delta$. Then $\Omega^{-1}W(\alpha, R)\Omega$ is another representation of the Euclidean group which has the same property (13-12) (see Problem 5).

Let us now consider the subgroup of the translations only; that is, the elements of the form $(\alpha, I)$. They are represented through Eq. (13-14) by a subgroup $W(\alpha, I) \equiv U_\alpha$ which acts on $\psi(x)$ according to

$$(U_\alpha\psi)(x + \alpha) = \psi(x). \tag{13-15}$$

The together with the = . furnish us a representation of thethree-dimensional canonical commutation rules in Weyl's form:

= (13—16)

The infinitesimal generators P of the group defined by

= (13—17)

satisfy the canonical commutation rules

$$[Q_r, P_s] = i\delta_{rs}. \tag{13-18}$$

The same precaution as in the one-dimensional case is necessary here. The left-hand side of Eq. (13-18) is only defined on a dense set $D$ on which both operators $Q_r$ and $P_s$ are essentially self-adjoint. On this domain the commutator $[Q_r, P_s]$ is equal to $i\delta_{rs}$. The latter is a bounded essentially antiself-adjoint operator which admits a unique extension to the entire Hilbert space.

A reasoning similar to that employed in Section 12-3 then shows that in the Schrödinger representation, these operators appear in the form

$$(Q_r\psi)(x) = x_r\psi(x), \qquad (P_r\psi)(x) = -i\frac{\partial\psi}{\partial x_r}. \tag{13-19}$$


We have now recovered the usual wave-mechanical description of an elementary particle in three-dimensional Euclidean space by starting from the notion of localizability.

Compared to the one-dimensional case we have an additional feature: The natural symmetry group of the system contains a compact subgroup, the three-parameter rotation group. This fact permits an important simplification of many problems in elementary particle physics. We shall devote the following section to the study of this feature.

PROBLEMS

1. The operators $W(\alpha, R)$ defined by Eq. (13-14) satisfy the relation (13-13).

2. The operator $W(\alpha, R)$ of Eq. (13-14) is unitary in the space $L^2(E_3)$.

3. The relation (13-13) can be extended to the entire group for the representation (13-14).

4. The projections defined by $(E_\Delta\psi)(x) = 1_\Delta(x)\,\psi(x)$ satisfy the relation

$$E_{\Delta'} = W^{-1}(\alpha, R)\,E_\Delta\,W(\alpha, R),$$

with $W(\alpha, R)$ defined by (13-14) and $\Delta' = [\Delta](\alpha, R)$ the translated domain.

5. If $W(\alpha, R)$ is the representation (13-14) of the Euclidean group, then

$$(W'(\alpha, R)\psi)(Rx + \alpha) = e^{i[\omega(x) - \omega(Rx + \alpha)]}\,\psi(x)$$

defines a new representation, where $\omega(x)$ is an arbitrary real-valued measurable function of $x$. The new representation is connected with the old one by a unitary transformation $\Omega$:

$$W'(\alpha, R) = \Omega^{-1}W(\alpha, R)\,\Omega,$$

with $\Omega$ defined by

$$(\Omega\psi)(x) = e^{i\omega(x)}\,\psi(x).$$

$W'(\alpha, R)$ also satisfies Eq. (13-12).

13-3. ROTATIONS AS KINEMATICAL SYMMETRIES

The natural invariance group of the particle in $E_3$ is the Euclidean group of motions of this space. This is a six-parameter Lie group which contains the translations as an invariant abelian subgroup. We have seen in the preceding section that the translations are generated by three self-adjoint displacement operators $P_r$ ($r = 1, 2, 3$).

In this section we shall study the three-parameter subgroup of the rotations. This subgroup is neither invariant nor abelian. On the other hand it is compact, and this feature makes it possible to study its representations with relatively elementary means. With the three parameters are associated three generators, which are a representation of the nonabelian Lie algebra of this group. In Cartesian coordinates the three generators are proportional to the three observables which are the components of angular momentum.

However, the infinitesimal or local properties of this group are not the only ones of physical interest. The investigation of the global properties shows that the rotation group is doubly-connected. This property has important physical consequences which we shall study in the following chapter.

Let us begin with some basic facts on rotations. A rotation is a transformation of the Euclidean space $E_3$ which associates with every point $x$ with Cartesian coordinates $x_r$ another point $x'$ with Cartesian coordinates $x_r'$:

$$x_r \to x_r' = \sum_s R_{rs}x_s. \tag{13-20}$$

Here the $R_{rs}$ are a set of nine real numbers which satisfy the six relations

$$\sum_r R_{rs}R_{rt} = \delta_{st}, \tag{13-21}$$

from which follow also the relations

$$\sum_t R_{rt}R_{st} = \delta_{rs}. \tag{13-22}$$

It is often convenient to introduce a matrix notation in which the last three equations appear in the shorter form

$$x' = Rx \quad \text{and} \quad R^TR = RR^T = I. \tag{13-23}$$

The letter $R$ stands for the matrix with components $R_{rs}$ and $R^T$ is the transposed matrix with components $(R^T)_{rs} = R_{sr}$. Eqs. (13-21) and (13-22) show that the matrix $R$ has an inverse. In fact they also show that $(\det R)^2 = 1$ and $R^{-1} = R^T$. From this it follows that $\det R = \pm 1$. We shall in this section be primarily concerned with the proper rotations for which $\det R = +1$, and we shall refer to them as rotations without qualification.

If $R_1$ and $R_2$ are two rotations, then the rotation $R_1R_2$ is the matrix product of $R_1$ and $R_2$. It is clearly again a rotation. Thus the rotations form a group, the rotation group.

The rotation matrices $R$ depend on nine real parameters which, in addition, satisfy the six relations (13-21) or (13-22). They therefore contain three independent parameters. It is often convenient to have an explicit parametrization of the rotations which associates with every rotation a point in a three-dimensional parameter space. This can be and has been done in many different ways. We shall adopt one such representation which has the additional advantage of a simple geometrical interpretation.

Every rotation $R$ admits a vector $\boldsymbol{\alpha}$ which satisfies $R\boldsymbol{\alpha} = \boldsymbol{\alpha}$ (Problem 1). This invariant vector determines the rotation axis. If this axis is chosen as the 3-axis of a Cartesian coordinate system, the rotation matrix has the form

$$R = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{13-24}$$

The parameter $\varphi$ is the rotation angle. We can represent the rotation $R$ by a point with coordinates $\boldsymbol{\varphi} = \varphi\boldsymbol{\alpha}$ ($|\boldsymbol{\alpha}| = 1$) such that $|\boldsymbol{\varphi}| = \varphi$. These points are in the interior of a sphere of radius $\pi$. The points on the surface of the sphere represent rotations with angle $\varphi = \pm\pi$. Two such rotations at opposite ends of a diameter are identical since the corresponding matrices $R$ are identical.

We can introduce a topology in this group space by defining as open sets the interiors of Euclidean spheres. The group operations are then continuous, and the rotation group becomes a topological group. It is easily seen that it is connected. The rotations $R_\varphi$ around a fixed rotation axis $\boldsymbol{\alpha}$ and variable angle $\varphi$ are one-parameter subgroups. We have

$$R_{\varphi_1} R_{\varphi_2} = R_{[\varphi_1 + \varphi_2]}, \tag{13-25}$$

where $[\varphi_1 + \varphi_2] = \varphi_1 + \varphi_2 \pmod{2\pi}$. Thus these one-parameter subgroups are all isomorphic to the circle group (cf. Section 9-3, Problem 3), and to every one we can associate an antisymmetric real transformation of $E_3$ by the definition

$$A = \lim_{\varphi \to 0} \frac{1}{\varphi}\,(R_\varphi - I). \tag{13-26}$$

We shall call the operator $A$ an infinitesimal rotation around the axis $\boldsymbol{\alpha}$. In the special coordinate system used for the representation (13-24), $A$ is easily calculated and it has the form

$$A = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{13-27}$$
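The limit that defines the infinitesimal rotation can be watched numerically. The sketch below forms the difference quotient $(R_\varphi - I)/\varphi$ for a small angle about the 3-axis and compares it with the antisymmetric matrix quoted in the text; the function names and the step size are illustrative assumptions.

```python
import math

def rot3(phi):
    """Rotation by the angle phi about the 3-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def generator_estimate(phi):
    """Difference quotient (R_phi - I) / phi, an approximation to A."""
    r = rot3(phi)
    return [[(r[i][j] - (1.0 if i == j else 0.0)) / phi for j in range(3)]
            for i in range(3)]

# The matrix A for a rotation about the 3-axis.
A = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

estimate = generator_estimate(1e-6)
max_error = max(abs(estimate[i][j] - A[i][j]) for i in range(3) for j in range(3))
print(max_error)  # of the order of the step size
```

The residual error comes from the diagonal entries $(\cos\varphi - 1)/\varphi \approx -\varphi/2$, which vanish only in the limit.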

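The special rotations around the axes 1, 2, and 3 respectively are represented by the special antisymmetric matrices

$$A_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{13-28}$$

These matrices satisfy the commutation rules

$$[A_1, A_2] = A_3, \quad [A_2, A_3] = A_1, \quad [A_3, A_1] = A_2. \tag{13-29}$$

This set of commutation rules constitutes the Lie algebra of the rotation group.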
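The commutation rules of the three infinitesimal rotations can be verified by direct matrix multiplication. The sketch below uses integer matrices, so the comparison is exact; the matrix entries are the standard antisymmetric generators for rotations about the axes 1, 2, 3, and the helper names are illustrative.

```python
A1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
A2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
A3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

print(commutator(A1, A2) == A3,
      commutator(A2, A3) == A1,
      commutator(A3, A1) == A2)
```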

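With the three fundamental infinitesimal rotations we can construct the infinitesimal rotation $A$ around any axis $\boldsymbol{\alpha}$ ($|\boldsymbol{\alpha}| = 1$) by the rule

$$A = \boldsymbol{\alpha}\cdot\mathbf{A} \equiv \alpha_1A_1 + \alpha_2A_2 + \alpha_3A_3.$$

It is then easily verified, by integrating the differential equation (13-26), that the rotation $R_\varphi$ has the form

$$R_\varphi = e^{\varphi A} = e^{\varphi\,\boldsymbol{\alpha}\cdot\mathbf{A}} = e^{\boldsymbol{\varphi}\cdot\mathbf{A}}. \tag{13-30}$$

With this form we have "parametrized" the rotation group. An arbitrary rotation is written explicitly as a function of the three parameters $\varphi_1, \varphi_2, \varphi_3$, which represent the rotation (cf. Problem 5).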
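The exponential parametrization can be compared with the closed formula of Problem 5, $R_\varphi = I + A\sin\varphi + A^2(1 - \cos\varphi)$. The sketch below sums the power series of the matrix exponential and checks it against that closed form; the truncation at 40 terms, the sample axis, and all function names are illustrative assumptions.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def generator(alpha):
    """A = alpha1*A1 + alpha2*A2 + alpha3*A3 for a unit vector alpha."""
    a1, a2, a3 = alpha
    return [[0.0, -a3, a2], [a3, 0.0, -a1], [-a2, a1, 0.0]]

def closed_form(alpha, phi):
    """I + A sin(phi) + A^2 (1 - cos(phi)), the formula of Problem 5."""
    a = generator(alpha)
    a_sq = matmul(a, a)
    one = identity()
    return [[one[i][j] + a[i][j] * math.sin(phi) + a_sq[i][j] * (1 - math.cos(phi))
             for j in range(3)] for i in range(3)]

def exp_series(alpha, phi, terms=40):
    """Truncated power series of exp(phi * A)."""
    a = generator(alpha)
    m = [[phi * a[i][j] for j in range(3)] for i in range(3)]
    result, term = identity(), identity()
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, m)]  # m^n / n!
        result = [[result[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return result

alpha = (1 / 3, 2 / 3, 2 / 3)   # unit vector: (1 + 4 + 4) / 9 = 1
phi = 2.0
r_closed = closed_form(alpha, phi)
r_series = exp_series(alpha, phi)
max_diff = max(abs(r_closed[i][j] - r_series[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The agreement rests on $A^3 = -A$ for a unit axis, which collapses the series into the sine and cosine terms.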

Let us now proceed to the interpretation of the rotation as a kinematical symmetry for an elementary particle. According to the preceding section, such a symmetry is expressed by a system of relations for the projection-valued measure $\Delta \to E_\Delta$:

$$E_{[\Delta]_R} = U_R^{-1} E_\Delta U_R. \tag{13-31}$$

Here $\Delta$ is an arbitrary Borel set of $E_3$, $[\Delta]_R$ the set $\{x \mid R^{-1}x \in \Delta\}$, and $U_R = W(0, R)$ the representation of the rotation group obtained by restricting the representation $W(\alpha, R)$ of the Euclidean group to the rotations alone.

If the particle is elementary, the representation $W(\alpha, R)$ is, according to the imprimitivity theorem, unique (up to unitary equivalence) and consequently so is $U_R$. We shall now study this representation.

First, we note that the representation is, according to the theorem of Bargmann (cf. Section 9-6), locally equivalent to a vector representation, so that we may assume

$$U_{R_1} U_{R_2} = U_{R_1R_2} \tag{13-32}$$

in a suitable neighborhood of the identity. The operator $U_R$ is explicitly given in the space $L^2(E_3)$ by the formula

$$(U_R\psi)(Rx) = \psi(x), \tag{13-33}$$

from which it can be seen that the relation (13-32) is valid for the entire group. This representation is reducible. Let us determine its irreducible parts.

To this end it is convenient to introduce polar coordinates $r$, $\omega$. Here $r = |x|$ is the distance of the point $x$ from the origin, and $\omega$ represents a point on the unit sphere. If $\omega' = R\omega$ is the corresponding point after the rotation, we have

$$(U_R\psi)(r, \omega') = \psi(r, \omega). \tag{13-34}$$

Thus the rotation affects only the angular part of the function $\psi$. It is thus possible to discuss the representations $U_R$ by studying their effects on functions on the unit sphere $\Omega$. These functions form a Hilbert space $L^2(\Omega)$. Any function $f(\omega) \in L^2(\Omega)$ can be developed in a complete orthonormal system of such functions. A system which is especially convenient for the problem at hand consists of the spherical harmonics $Y_l^m(\omega)$, where $l$ and $m$ are integers and $l = 0, 1, 2, \ldots$; $-l \le m \le l$. An explicit definition of the spherical harmonics is given by

= - 1)1. (13-35)

Here $\xi = \cos\theta$; $\theta$ is the polar angle, and $\varphi$ the azimuth of the point $\omega$ on the unit sphere.

One shows in standard texts [2, 3] that the spherical harmonics are a complete orthonormal system of functions on the sphere. That is, we have

$$\int Y_l^{m*}(\omega)\, Y_{l'}^{m'}(\omega)\, d\omega = \delta_{ll'}\,\delta_{mm'} \tag{13-36}$$

and any function $f(\omega)$ in $L^2(\Omega)$ admits a convergent development

$$f(\omega) = \sum_{l=0}^{\infty} \sum_{m=-l}^{+l} c_l^m\, Y_l^m(\omega) \tag{13-37}$$

with

$$c_l^m = \int Y_l^{m*}(\omega)\, f(\omega)\, d\omega. \tag{13-38}$$

Under rotations the $2l + 1$ functions $Y_l^m$ for fixed $l$ transform irreducibly among themselves, such that

$$(U_R Y_l^m)(\omega) = \sum_{m'=-l}^{+l} D^l_{m'm}(R)\, Y_l^{m'}(\omega), \tag{13-39}$$

where

$$D^l_{m'm}(R) = \int Y_l^{m'*}(\omega)\, (U_R Y_l^m)(\omega)\, d\omega. \tag{13-40}$$

Fig. 13-1 The polar angle $\theta$ and the azimuth $\varphi$ of the point $\omega$ on the unit sphere.

These formulas give the explicit reduction of the canonical representation on the unit sphere into its irreducible parts. This result illustrates the usefulness of the spherical harmonics.

It is easy to see that, with the exception of the trivial case $l = 0$, all these representations are faithful (Problem 2). It is more difficult to show that these are the only faithful representations of the rotation group. The usual procedure for this demonstration is to reduce the problem to an algebraic one, and to discuss the representations of the associated Lie algebra. To this end one writes $U_R = \ldots$, where $\boldsymbol{\alpha}$ is the unit vector in the direction of the rotation axis and $L_r$ are three self-adjoint operators defined by

Q*O

Since $U_R$ is a representation of the rotation group, the operators $L_r$ satisfy the same commutation rules as the operators $iA_r$, that is,

$$[L_1, L_2] = iL_3$$

and its cyclic permutations. Using the definition of $U_R$, one finds that

$$L_1 = Q_2P_3 - Q_3P_2, \quad L_2 = Q_3P_1 - Q_1P_3, \quad L_3 = Q_1P_2 - Q_2P_1.$$

One can determine with elementary algebraic methods all the irreducible matrices $L_r$ which satisfy these commutation rules.

With this method one obtains not only the representations given above in the finite form, but another infinite set associated with half-integer values for $l$. This set furnishes, however, not a representation of the rotation group, but only of its simply-connected covering group. It is sometimes referred to as the double-valued representation of the rotation group.
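The simplest half-integer case makes the double-valuedness concrete. The sketch below takes $L_r = \sigma_r/2$ with the Pauli matrices $\sigma_r$ (standard matrices, not introduced in this section, so they are an assumption here), checks $[L_1, L_2] = iL_3$, and evaluates the exponential $\cos(\varphi/2)\,I - i\sin(\varphi/2)\,\boldsymbol{\alpha}\cdot\boldsymbol{\sigma}$ at $\varphi = 2\pi$ and $4\pi$. The sign of the exponent is a convention; the outcome at $2\pi$ does not depend on it.

```python
import math

# Spin-1/2 matrices L_r = sigma_r / 2 (Pauli matrices assumed).
L1 = [[0, 0.5], [0.5, 0]]
L2 = [[0, -0.5j], [0.5j, 0]]
L3 = [[0.5, 0], [0, -0.5]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def spin_half_rotation(alpha, phi):
    """cos(phi/2) I - i sin(phi/2) (alpha . sigma) for a unit vector alpha."""
    a1, a2, a3 = alpha
    c, s = math.cos(phi / 2), math.sin(phi / 2)
    n = [[a3, a1 - 1j * a2], [a1 + 1j * a2, -a3]]   # alpha . sigma
    return [[(c if i == j else 0) - 1j * s * n[i][j] for j in range(2)]
            for i in range(2)]

i_L3 = [[1j * L3[i][j] for j in range(2)] for i in range(2)]
alpha = (1 / 3, 2 / 3, 2 / 3)
identity = [[1, 0], [0, 1]]
minus_identity = [[-1, 0], [0, -1]]

print(max_abs_diff(commutator(L1, L2), i_L3))
print(max_abs_diff(spin_half_rotation(alpha, 2 * math.pi), minus_identity))
print(max_abs_diff(spin_half_rotation(alpha, 4 * math.pi), identity))
```

A rotation by $2\pi$, which is the identity in the rotation group, is represented by $-I$; only after $4\pi$ does the operator return to $+I$. The two operators $\pm I$ over the same rotation are what "double-valued" refers to.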

In order to see explicitly that the rotation group is not simply- but doubly-connected, we consider the parametrization introduced before in the $\varphi$-sphere. The order of connectedness is the number of inequivalent classes of closed paths in the group. A closed path is equivalent to another if it can be transformed into the other by continuous deformation.

Fig. 13-2 Inequivalent paths in the rotation group.

Figure 13-2 shows two closed paths $L_1$ and $L_2$ which are inequivalent. Indeed, $L_1$ is easily seen to contract continuously into the unit element while $L_2$ cannot be so contracted. The path $L_2$ is closed because the point $P$ and its opposite $P'$ are to be identified.

Every closed path intersects the surface of the sphere in an even number of points. If $2n$ is this number, then the path can be continuously contracted into a point if $n$ is even and it cannot be so contracted if $n$ is odd (Problem 3). Hence there are exactly two classes of inequivalent paths: The rotation group is doubly-connected.

The simply-connected universal covering group is a larger group than the rotation group. It also has more representations. Yet its Lie algebra is identical with that of the rotation group. This is the reason for the doubling of the number of representations when one uses the algebraic method for their construction.

Having thus established that all the irreducible representations of the rotation group appear in the representation $U_R$, we now proceed to the transformation of operators under rotations. First we recall the basic relation for the projection-valued measure:

$$E_{[\Delta]_R} = U_R^{-1} E_\Delta U_R.$$

It follows from this relation that for any complex-valued function $u(x)$ measurable with respect to the measure $E_\Delta$, we have, for any $f \in \mathcal{H}$, $(f, u(Q)f) = \int u(x)\, d\mu_f$, where $\mu_f$ is the numerical-valued measure $\Delta \to \mu_f(\Delta) = (f, E_\Delta f)$. It follows from this that

$$(f, U_R\, u(Q)\, U_R^{-1} f) = \int u(x)\, d\mu_{U_R^{-1}f} = \int u(Rx)\, d\mu_f = (f, u(RQ) f).$$

Since this relation is true for any $f$, we have established the operator relation

$$U_R\, u(Q)\, U_R^{-1} = u(RQ). \tag{13-41}$$

In particular, for $u(Q) = Q_r$, we find

$$U_R Q_r U_R^{-1} = \sum_s R_{rs} Q_s. \tag{13-42}$$

It follows from this and the canonical commutation rules that

$$U_R P_r U_R^{-1} = \sum_s R_{rs} P_s. \tag{13-43}$$

This result may be generalized. Any operator triplet $X_r$ which transforms under rotations according to

$$X'_r = U_R X_r U_R^{-1} = \sum_s R_{rs} X_s \tag{13-44}$$

is called a vector operator. The $L_r$ ($r = 1, 2, 3$) are such a triplet (Problem 4).

In a similar way one can introduce tensor operators of any rank. Once the transformation law of the fundamental set $Q_r$ and $P_r$ is determined, the transformation law of any operator which is a function of them is also determined. If the system is irreducible (elementary particle), then this includes every operator.

With this we conclude this discussion of the rotations as kinematical symmetries and proceed to the dynamical characteristics of the one-particle system.

PROBLEMS

1. Every rotation matrix $R$ admits a real eigenvector $\boldsymbol{\alpha}$ with eigenvalue 1, so that for this $\boldsymbol{\alpha}$ we have $R\boldsymbol{\alpha} = \boldsymbol{\alpha}$. The secular determinant $\det(R - \lambda I)$ always has one root $\lambda = 1$.

2. The representations $R \to D^l(R)$ of the rotation group are faithful. That is, if $D^l(R) = I$ then $R = I$.

3. Every closed path in the rotation group can be continuously contracted into a point if the number $n$ of rotations by the angle $\pi$ contained in the path is even. If this number is odd, the path cannot be so contracted.

4. The operators $L_1 = Q_2P_3 - Q_3P_2$ and their cyclic permutations are a set of vector operators. They satisfy relations

$$U_R L_r U_R^{-1} = \sum_{s=1}^{3} R_{rs} L_s.$$

5. The matrix $R_\varphi$ for the rotation with angle $\varphi$ around the rotation axis $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$, ($|\boldsymbol{\alpha}| = 1$), is given explicitly by the formula

$$R_\varphi = I + A \sin\varphi + A^2(1 - \cos\varphi)$$

with

$$A = \begin{pmatrix} 0 & -\alpha_3 & \alpha_2 \\ \alpha_3 & 0 & -\alpha_1 \\ -\alpha_2 & \alpha_1 & 0 \end{pmatrix}.$$

13-4. VELOCITY AND GALILEI INVARIANCE

Let $A$ be any observable not depending explicitly on time. We can define a velocity of $A$, denoted by $\dot A$, by requiring that for any state $\psi_t$

$$\frac{d}{dt}(\psi_t, A\psi_t) = (\psi_t, \dot A\psi_t). \tag{13-45}$$

In this expression $\psi_t$ is the time-dependent state vector in the Schrödinger picture (cf. Section 10-3) which satisfies a Schrödinger equation

$$i\dot\psi_t = H\psi_t. \tag{13-46}$$

It follows from this definition of $\dot A$ that for every $\psi = \psi_t$

$$(\psi, i[H, A]\psi) = (\psi, \dot A\psi)$$

or

$$\dot A = i[H, A]. \tag{13-47}$$
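The step from the defining property of the velocity to the commutator formula can be spelled out. The following is a sketch under assumptions not stated at this point of the text: the Schrödinger equation is read as $i\dot\psi_t = H\psi_t$, the scalar product $(f, g)$ is taken antilinear in its first argument, and $H$ is self-adjoint.

$$\begin{aligned}
\frac{d}{dt}(\psi_t, A\psi_t) &= (\dot\psi_t, A\psi_t) + (\psi_t, A\dot\psi_t) \\
&= (-iH\psi_t, A\psi_t) + (\psi_t, -iAH\psi_t) \\
&= i(\psi_t, HA\psi_t) - i(\psi_t, AH\psi_t) \\
&= (\psi_t, i[H, A]\psi_t).
\end{aligned}$$

Since this holds for every state, the operator whose expectation value is the time derivative of the expectation value of $A$ is $i[H, A]$.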

In particular, if $A$ is one of the position operators $Q_r$, we may define the velocity operator

$$\dot Q_r = i[H, Q_r]. \tag{13-48}$$

It is only defined if the operator $H$ is known. In general both $H$ and $Q_r$ are unbounded operators. Thus $\dot Q_r$ is defined only on a dense set which we assume to be so large that $\dot Q_r$ is essentially self-adjoint. In all known examples which have been studied, this is the case. We shall denote both $\dot Q_r$ of (13-48) and its self-adjoint extension by the same symbol.

The $\dot Q_r$ thus defined are observables and their expectation values can be measured.

If an observer $O$ has measured the observable $\dot Q_r$ and has found a value $\alpha_r$, then an observer $O'$ in relative motion with velocity $v_r$ with respect to $O$, who measures $\dot Q_r$ on the same system, will observe the value $\alpha_r + v_r$. This would be the expectation value of $\dot Q_r + v_r$ if the same state had been prepared by the observer $O'$. Thus the connection between the two systems is obtained via the Galilei transformation which transforms position and velocity according to

$$Q_r \to Q_r, \qquad \dot Q_r \to \dot Q_r + v_r. \tag{13-49}$$

We shall define the system as Galilei invariant if this transformation is a kinematical symmetry; that is, if there exists a unitary operator $G_v$ which commutes with $Q_r$ and for which

$$G_v \dot Q_r G_v^{-1} = \dot Q_r + v_r. \tag{13-50}$$

This definition implies certain restrictions on the nature of the operators $\dot Q_r$ and hence also on the operator $H$ which is involved in the definition of $\dot Q_r$. For instance Eq. (13-50) implies that the spectrum of $\dot Q_r$ is continuous and extends from $-\infty$ to $+\infty$.

Let us now determine the operators $H$ which are admitted by particles satisfying Galilei invariance. We do this under the assumption that we are dealing with an elementary system, so that the position operators $Q_r$ generate a maximal abelian algebra.

If we combine the Galilei transformations (13-49) with the displacements, we obtain a six-parameter group of translations. We define the family of unitary operators $W(\alpha, v)$ with the properties

$$W(\alpha, v)\, Q_r\, W^{-1}(\alpha, v) = Q_r + \alpha_r, \qquad W(\alpha, v)\, \dot Q_r\, W^{-1}(\alpha, v) = \dot Q_r + v_r. \tag{13-51}$$

It follows from this that the $W(\alpha, v)$ are a projective representation of the six-dimensional vector space

$$W(\alpha_1, v_1)\, W(\alpha_2, v_2) = \omega(\alpha_1, v_1; \alpha_2, v_2)\, W(\alpha_1 + \alpha_2, v_1 + v_2). \tag{13-52}$$

According to the general theory of such representations developed in Section 9-6, it is possible to determine the as yet arbitrary phase factors of $W$ in such a way that the factor $\omega$ in (13-52) assumes the form

( . — —V1, V21 — e ,

where $\mu$ is an arbitrary real constant $\neq 0$. There exists also a representation for $\mu = 0$, but it is reducible and does not describe elementary particles.

The two subgroups $U_\alpha$ and $G_v$ are obtained by specializing the parameter values

$$U_\alpha = W(\alpha, 0) \quad\text{and}\quad G_v = W(0, v). \tag{13-54}$$

For these two subgroups, the relation (13—53) becomes

= (t3'—55)

If we set $\mu v = \beta$ and = we obtain from (13-55) the canonical commutation rules in Weyl's form. It follows in particular that

$$\frac{1}{\mu} P_r + v_r = G_v\, \frac{1}{\mu} P_r\, G_v^{-1}. \tag{13-56}$$

By comparing this with Eq. (13-50) we see that $(1/\mu)P_r$ and $\dot Q_r$ have the same commutation rules with $G_v$. Their difference commutes with $G_v$; thus it must be a function of the $Q_r$ alone. Hence we find the important relation

$$\mu \dot Q_r = P_r - a_r \qquad (r = 1, 2, 3) \tag{13-57}$$

where $a_r(Q)$ are three functions of $Q_1, Q_2, Q_3$, which may depend explicitly on time.

From the relation (13-57) we obtain the commutation rules

$$\mu [Q_r, \dot Q_s] = i\delta_{rs}, \tag{13-58}$$

and it follows from them that the operator $H_0 = (\mu/2)\, \dot Q^2$ satisfies

$$i[H_0, Q_s] = \dot Q_s. \tag{13-59}$$

Consequently, in view of (13-48), $H - H_0$ commutes with $Q_s$, hence it must be a function $v(Q)$ of the $Q_r$, which may even depend on time.

We have now shown that the evolution operator $H$ must have the form

$$H = \frac{1}{2\mu}(P - a)^2 + v. \tag{13-60}$$
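The two commutator steps that lead to this form of $H$ are short enough to write out. The sketch assumes the canonical commutation rules in the form $[Q_r, P_s] = i\delta_{rs}$ and reads the relation between velocity and displacement operators as $\mu\dot Q_r = P_r - a_r(Q)$ with $H_0 = (\mu/2)\sum_r \dot Q_r^2$.

$$\begin{aligned}
\mu[Q_r, \dot Q_s] &= [Q_r, P_s] - [Q_r, a_s(Q)] = i\delta_{rs}, \\
i[H_0, Q_s] &= \frac{i\mu}{2} \sum_r \left( \dot Q_r [\dot Q_r, Q_s] + [\dot Q_r, Q_s] \dot Q_r \right)
= \frac{i\mu}{2} \sum_r 2\dot Q_r \left( -\frac{i}{\mu}\delta_{rs} \right) = \dot Q_s.
\end{aligned}$$

The term $[Q_r, a_s(Q)]$ vanishes because $a_s$ is a function of the commuting operators $Q_r$ alone. Since $H$ and $H_0$ then have the same commutator with every $Q_s$, their difference commutes with all $Q_s$ and is itself a function of the $Q_r$.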

We have thus arrived at the main result of this section. Every localizable elementary physical system which satisfies Galilei invariance in the sense of (13-50) evolves in time according to Eq. (13-46), with $H$ as given by Eq. (13-60).

The operators $a(Q)$ and $v(Q)$ are not entirely determined by Eqs. (13-57) and (13-60) since the quantities $P_r$ and $H$ are not determined by their commutation properties. The remaining ambiguity is closely connected with the gauge invariance of the theory which we shall discuss in the next section.

The physical interpretation of this result is analogous to the case of the particle in one dimension. The parameter $\mu$ is proportional to the mass $m$ of the particle, $m = \hbar\mu$. The proportionality factor $\hbar$ is Planck's constant divided by $2\pi$. The operator $p = \hbar P$ represents the momentum of the particle. The operators $a$ and $v$ represent the effects of external forces on the motion of the particles. These forces are the quantum mechanical analogues of the forces due to an arbitrary external electromagnetic field. The identification is completed by identifying $(c\hbar/e)a = A$ with the vector potential and $(\hbar/e)v = V$ with the scalar potential of this field. Here $e$ is the electric charge of the particle which, for an electron, has the value $-4.8 \times 10^{-10}$ esu.

It is interesting to note that the principle of Galilei invariance as stated at the beginning of this section limits the possible nature of external forces to those of electromagnetic origin.

13-5. GAUGE TRANSFORMATIONS AND GAUGE INVARIANCE

In classical electrodynamics it is shown that the electromagnetic field determines the potentials only up to a gauge transformation

$$a \to a + \nabla\phi, \qquad v \to v - \frac{\partial\phi}{\partial t}. \tag{13-61}$$

It is therefore natural to expect that this classical property corresponds to a certain invariance property of the system characterized by the evolution operator $H$ of Eq. (13-60). This indeed is the case. The invariance property is called gauge invariance.

Let $\Omega$ be the unitary operator defined in the Schrödinger representation by

$$(\Omega\psi)(x) = e^{i\phi(x)}\psi(x). \tag{13-62}$$

Here $\phi(x)$ is an arbitrary differentiable function of $x$ which may depend explicitly on time. Under this transformation the operator $P$ transforms according to (cf. Problem 1)

$$\Omega P \Omega^{-1} = P - \nabla\phi. \tag{13-63}$$

Let us now determine the effect of such a transformation on the Schrödinger equation. To this end we define $\varphi_t = \Omega\psi_t$ and we find by an elementary calculation (Problem 2) that

$$i\dot\varphi_t = G\varphi_t \tag{13-64}$$

with

$$G = \Omega H \Omega^{-1} + i\dot\Omega\,\Omega^{-1}. \tag{13-65}$$

The explicit evaluation of this expression gives

$$G = \frac{1}{2\mu}(P - a - \nabla\phi)^2 + v - \frac{\partial\phi}{\partial t}, \tag{13-66}$$

where $\phi$ is now considered as a (possibly time-dependent) function of the operators $Q_r$.

Thus we find that the effect of the transformation $\Omega$ is exactly the same as that of a gauge transformation. Motivated by this example we can now express explicitly what we mean by gauge invariance:

A quantum mechanical theory is gauge invariant if every gauge transformation of the electromagnetic potentials can be induced by a unitary transformation of the Hilbert space of state vectors.

We have thus shown in this section that a one-particle theory whichsatisfies the principle of Galilei invariance is automatically gauge invariant.
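The mechanism behind this statement can be checked numerically in one dimension: multiplying the state by a phase $e^{i\phi(x)}$ while shifting $a \to a + \phi'$ leaves the combination $(d/dx - ia)\psi$ unchanged up to the same phase. The sketch below uses central differences; the sample functions `psi`, `a`, `phi` and the step size are arbitrary illustrative choices.

```python
import cmath
import math

def psi(x):
    """Sample complex wave function (not normalized)."""
    return cmath.exp(-x * x / 2) * (1 + 0.5j * x)

def a(x):
    """Sample vector potential in one dimension."""
    return 0.3 * x

def phi(x):
    """Sample gauge function."""
    return math.sin(x)

def dphi(x):
    """Exact derivative of the gauge function."""
    return math.cos(x)

def covariant_derivative(f, potential, x, h=1e-5):
    """(d/dx - i*potential(x)) f(x), with a central-difference derivative."""
    return (f(x + h) - f(x - h)) / (2 * h) - 1j * potential(x) * f(x)

def residual(x):
    """|(d/dx - i a') (e^{i phi} psi) - e^{i phi} (d/dx - i a) psi| at x."""
    psi_new = lambda y: cmath.exp(1j * phi(y)) * psi(y)
    a_new = lambda y: a(y) + dphi(y)
    lhs = covariant_derivative(psi_new, a_new, x)
    rhs = cmath.exp(1j * phi(x)) * covariant_derivative(psi, a, x)
    return abs(lhs - rhs)

worst = max(residual(x) for x in (-1.3, -0.2, 0.0, 0.7, 1.9))
print(worst)  # limited only by the finite-difference error
```

Because the kinetic term is built from this combination alone, the transformed evolution operator has the same form with the shifted potentials.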

When we consider the effect of gauge transformation on operator functions of $Q$ and $P$, we may distinguish three types of quantities.

1) Operators which are invariant under the general transformations $\Omega$ and gauge transformations. For example the operators $Q_r$ themselves.

2) Operators, such as $\dot Q_r$ and $H$, which transform identically under $\Omega$ and gauge transformations.

3) Operators which transform differently under $\Omega$ and gauge transformations.

We shall call quantities which belong to one of the groups 1 and 2 gauge-invariant quantities. Thus, according to this definition, the evolution operator $G$ (see Eq. 13-66) is gauge invariant, but the vector potentials $A_r$ or the displacement operators $P_r$ are not (cf. Problem 4).

In classical physics it is known that only gauge invariant quantities are observable. Thus it is reasonable to assume that only gauge invariant operators can represent observables in quantum mechanics. According to this assumption, the $P_r$ are not observables but the $\dot Q_r$ are, and they represent the Cartesian components of the velocity. The operators which represent the momentum of a particle are $m\dot Q_r = \hbar(P_r - a_r)$, and they, too, are gauge invariant and thus observable.

PROBLEMS

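1. Let $U$ be the unitary operator defined by

$$(U\psi)(x) = e^{i\phi(x)}\psi(x),$$

where $\phi(x)$ is a differentiable and real-valued function. Then

$$UPU^{-1} = P - \nabla\phi.$$

2. If $\varphi_t = U\psi_t$ and $\psi_t$ satisfies the Schrödinger equation $i\dot\psi_t = H\psi_t$, then $\varphi_t$ satisfies an equation

$$i\dot\varphi_t = G\varphi_t \quad\text{with}\quad G = UHU^{-1} + i\left(\frac{dU}{dt}\right)U^{-1}.$$

3. If for every observable $A$ we define

$$\dot A = i[H, A] + \frac{\partial A}{\partial t},$$

we have for the acceleration $\ddot Q$ the relation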

/' ba

4. Both velocity and acceleration are gauge invariant quantities but the displacement operator $P$ is not. Under gauge transformations the latter transforms according to $P \to P + \nabla\phi$.

13-6. DENSITY AND CURRENT OF AN OBSERVABLE

Localizability, as we have previously formulated it, implies the existence of a family of propositions represented by projection operators $E_\Delta$. Each of these projections represents the proposition that the localizable system is contained in the Borel subset $\Delta$ of the three-dimensional Euclidean space $E_3$.

This concept permits the definition of a density of probability in the space $E_3$. We may define this quantity, for instance, in the following manner. Let $x$ be a point in $E_3$ and let $\Delta = V$ be a volume element containing $x$. We may for instance choose for $V$ a cube centered at the point $x$. Let $\psi(x)$ be the Schrödinger function for a pure state. The probability of finding the system in the volume element $V$ is then given by $(\psi, E_V\psi)$. We define the probability density by

$$\rho(x) = \lim_{V \to 0} \frac{1}{V}(\psi, E_V\psi). \tag{13-67}$$

It is normalized, $\int \rho(x)\, d^3x = 1$, and positive definite: $\rho(x) \ge 0$. In general the probability density is a function of time. But it satisfies a differential conservation law, the continuity equation, which connects the time variation of $\rho(x)$ with a suitably defined probability current $\mathbf{j}(x)$:

$$\dot\rho(x) + \nabla\cdot\mathbf{j}(x) = 0. \tag{13-68}$$

The expression for the current is determined by this continuity equation up to a numerical vector field with vanishing divergence, and is given by

$$\mathbf{j}(x) = \frac{1}{2\mu i}\left(\psi^*(\nabla - ia)\psi - \psi(\nabla + ia)\psi^*\right). \tag{13-69}$$

Equation (13-68) is then an elementary consequence of the Schrödinger equation

$$i\dot\psi = -\frac{1}{2\mu}(\nabla - ia)^2\psi + v\psi. \tag{13-70}$$
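A one-dimensional example makes the current concrete. The sketch below assumes the current has the standard form $j = \mathrm{Re}\,[\psi^*\,(1/\mu i)(d/dx - ia)\psi]$, which is how the garbled formula in the source is read here, and evaluates it for a plane wave $e^{ikx}$ and for a real wave function. The names `current`, `plane_wave`, and the numerical values are illustrative.

```python
import cmath

def current(psi, a, mu, x, h=1e-5):
    """Re[psi* (1/(i mu)) (d/dx - i a) psi] at the point x (central difference)."""
    dpsi = (psi(x + h) - psi(x - h)) / (2 * h)
    value = psi(x).conjugate() * (dpsi - 1j * a(x) * psi(x)) / (1j * mu)
    return value.real

k, mu = 1.5, 2.0
plane_wave = lambda x: cmath.exp(1j * k * x)
real_state = lambda x: cmath.exp(-x * x / 2)
no_field = lambda x: 0.0
constant_field = lambda x: 0.4

j_free = current(plane_wave, no_field, mu, 0.3)         # expected k / mu
j_field = current(plane_wave, constant_field, mu, 0.3)  # expected (k - a) / mu
j_real = current(real_state, no_field, mu, 0.3)         # expected 0
print(j_free, j_field, j_real)
```

For the plane wave the current is the density times the velocity $(k - a)/\mu$, and a real wave function carries no current when $a = 0$.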

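Expression (13-69) can be related to the velocity operator $\dot Q$ defined previously. Indeed, if we define

$$\dot Q(V) = \tfrac{1}{2}(E_V\dot Q + \dot Q E_V), \tag{13-71}$$

then we verify without difficulty that

$$\mathbf{j}(x) = \lim_{V \to 0} \frac{1}{V}(\psi, \dot Q(V)\psi), \tag{13-72}$$

where $\dot Q$ stands for the operator $(1/\mu)(P - a) = (1/\mu i)(\nabla - ia)$.

More generally, let $\mathcal{O}$ be any observable and define

$$\mathcal{O}(V) = \tfrac{1}{2}(E_V\mathcal{O} + \mathcal{O}E_V); \tag{13-73}$$

then the quantity

$$\mathcal{O}(x) = \lim_{V \to 0} \frac{1}{V}(\psi, \mathcal{O}(V)\psi) \tag{13-74}$$

defines the density of the quantity $\mathcal{O}$ at the position $x$. It can be given explicitly if we define $\varphi(x) = (\mathcal{O}\psi)(x)$. It then has the form

$$\mathcal{O}(x) = \tfrac{1}{2}\left(\varphi^*(x)\psi(x) + \psi^*(x)\varphi(x)\right) = \mathrm{Re}\,\psi^*(x)\varphi(x). \tag{13-75}$$

That this expression is a generalization of the probability density $\rho(x)$ can be seen by taking for $\mathcal{O}$ the unit operator $I$.

The corresponding current can also be defined by

$$\mathbf{j}_{\mathcal{O}}(x) = \frac{1}{4\mu i}\Big(\varphi^*(x)(\nabla - ia)\psi(x) - \psi(x)(\nabla + ia)\varphi^*(x) + \psi^*(x)(\nabla - ia)\varphi(x) - \varphi(x)(\nabla + ia)\psi^*(x)\Big). \tag{13-76}$$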

However, only if $\mathcal{O}$ commutes with $H$ does it satisfy a continuity equation

$$\dot{\mathcal{O}}(x) + \nabla\cdot\mathbf{j}_{\mathcal{O}}(x) = 0. \tag{13-77}$$

As an example we may calculate the density of angular momentum (in units of $\hbar$) by choosing for $\mathcal{O}$ the vector operator

$$\mathbf{L} = \mathbf{Q} \times \mathbf{P} \tag{13-78}$$

with the components $L_1 = Q_2P_3 - Q_3P_2$, etc. We find after a short calculation (Problem 1) for the angular momentum density, the expression

$$\mathbf{L}(x) = \mu\,\mathbf{x} \times \mathbf{j}(x), \tag{13-79}$$

which is just what one would have expected from the physical interpretation of the current operator $\mathbf{j}(x)$.

For an observable $\mathcal{O}$ which does not commute with the evolution operator $H$ the definition of the probability current is less useful, since the continuity equation (13-77) is no longer correct. The expression (13-77), instead of being equal to zero, is then equal to a nonvanishing term representing the sources or sinks of the quantity $\mathcal{O}$ resulting from the action of the external fields.

PROBLEMS

1. The angular momentum density $\mathbf{L}(x)$ for a spinless elementary particle is given by

$$\mathbf{L}(x) = \mu\,\mathbf{x} \times \mathbf{j}(x).$$

2. If $\psi(x)$ is a function of $r = \sqrt{x_1^2 + x_2^2 + x_3^2}$ alone, then the angular momentum density is identically zero.

3. $(\psi, \dot Q\psi) = \int \mathbf{j}(x)\, d^3x$ for all pure states $\psi(x)$.

4. For any observable $\mathcal{O}$ the density $\mathcal{O}(x)$ and the current $\mathbf{j}_{\mathcal{O}}(x)$ are real quantities.

13-7. SPACE INVERSION

In order to simplify the notation we shall consider the one-dimensional case. The generalization to three dimensions is then obvious. The transformation

$$Q \to -Q = Q', \qquad P \to -P = P' \tag{13-80}$$

is called a space inversion. It leaves the commutation rules invariant. The following discussion is formal and, mathematically, not completely rigorous. If the operators $Q$ and $P$ are unitarily equivalent to the Schrödinger representation, then the following formal discussion could be carried through in full mathematical rigor, if we expressed it in terms of the bounded operators $e^{i\alpha P}$ and $e^{i\beta Q}$.

We shall continue the discussion in the unbounded form, however, benefitting thereby from some formal simplification.

Since $Q'$ and $P'$ satisfy the same commutation rules, there exists a unitary operator $\Pi$ with the property

$$\Pi Q \Pi^{-1} = -Q, \qquad \Pi P \Pi^{-1} = -P. \tag{13-81}$$

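We shall now determine this operator $\Pi$. In order to do this it is useful to consider the transformation (13-80) as a special case of a rotation

$$Q \to Q' = \cos\alpha\, Q + \sin\alpha\, P, \qquad P \to P' = -\sin\alpha\, Q + \cos\alpha\, P. \tag{13-82}$$

This transformation is easily seen also to be canonical, so that there exists a unitary one-parameter group $U_\alpha$ such that

$$Q' = U_\alpha Q U_\alpha^{-1}, \qquad P' = U_\alpha P U_\alpha^{-1}. \tag{13-83}$$

The $U_\alpha = e^{i\alpha A}$ can be determined easily if we consider the infinitesimal part:

$$Q' = Q + \alpha P + \cdots = Q + i\alpha[A, Q] + \cdots.$$

It follows from this that $A$ must satisfy the commutation relations

$$P = i[A, Q], \qquad -Q = i[A, P]. \tag{13-84}$$

The formal solution of these relations is obtained by setting, for $A$,

$$A = \tfrac{1}{2}(P^2 + Q^2). \tag{13-85}$$

Thus

$$U_\alpha = e^{i(\alpha/2)(P^2 + Q^2)}. \tag{13-86}$$

For the special value $\alpha = \pi$ we find the space inversion operator

$$\Pi = e^{i(\pi/2)(P^2 + Q^2)}. \tag{13-87}$$

It is interesting to note that for $\alpha = \pi/2$, we obtain a transformation

$$\Phi = e^{i(\pi/4)(P^2 + Q^2)}$$

with the following properties

$$\Phi P \Phi^{-1} = -Q, \quad\text{and}\quad \Phi Q \Phi^{-1} = P.$$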
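The rotation in the $(Q, P)$ plane can be followed on the level of coefficients. Writing $Q' = aQ + bP$ and $P' = cQ + dP$, the commutator obeys $[Q', P'] = (ad - bc)[Q, P]$, so the transformation is canonical exactly when $ad - bc = 1$. The sketch below checks this and the special angles $\pi$ and $\pi/2$; the function names are illustrative.

```python
import math

def rotated_coefficients(alpha):
    """Coefficients (a, b), (c, d) with Q' = aQ + bP and P' = cQ + dP."""
    c, s = math.cos(alpha), math.sin(alpha)
    return (c, s), (-s, c)

def commutator_factor(q_coeff, p_coeff):
    """Factor ad - bc multiplying [Q, P] in [Q', P']."""
    return q_coeff[0] * p_coeff[1] - q_coeff[1] * p_coeff[0]

q_pi, p_pi = rotated_coefficients(math.pi)            # space inversion
q_half, p_half = rotated_coefficients(math.pi / 2)    # Fourier transformation
factor = commutator_factor(*rotated_coefficients(0.83))
print(q_pi, p_pi)
print(q_half, p_half)
print(factor)
```

At $\alpha = \pi$ the coefficients are $(-1, 0)$ and $(0, -1)$ up to rounding, which is the space inversion; at $\alpha = \pi/2$ they give $Q' = P$ and $P' = -Q$.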

The transformation $\Phi$ is called the Fourier transformation (cf. Problems 1 through 5). If the evolution operator $H$ has the property that it commutes with the space inversion operator $\Pi$, then the space inversion constitutes a dynamical symmetry of the system.

A dynamical symmetry in an extended sense is present even if the evolution operator $H$ does not commute with $\Pi$ but changes under $H \to \Pi H \Pi^{-1}$ into another operator which can be obtained from the original one by the transformation $a(Q) \to -a(-Q)$ and $v(Q) \to v(-Q)$.

Physically this latter transformation represents the space inversion operation of the external electromagnetic field, and the invariance in this extended sense expresses the fact that the original system in the original field has the same dynamical structure as the space-inverted system in the space-inverted field.

It is immediately seen that an evolution operator $H$ of the form (13-60) does have this invariance property, and we have thus established that Galilei invariance implies not only gauge invariance, but also the extended dynamical invariance under space inversion.

PROBLEMS

1. The Fourier transformation has, in the spectral representation of Q, the ex-plicit form

=

x'> =

2. If1 1= , where A = (Q + IP) and Acpo = 0,

v2

then =3. Every vector f e at permits a unique decomposition into four vectors

+f2such that =

4. The Fourier transformation satisfies the properties

II, = 1.

5. If pn(x) is thc vcctor

/S is!

in the spectral representation of $Q$, then the operator (13-86) may be represented as an integral operator with the kernel

x'> = +

71=0

6. The space inversion operator $\Pi$ commutes with the angular momentum operators $L_r$ ($r = 1, 2, 3$):

$$L_1 = Q_2P_3 - Q_3P_2.$$

7. If $\psi(r, \theta, \varphi)$ is the polar-coordinate representation of a wave function, then $(\Pi\psi)(r, \theta, \varphi) = \psi(r, \pi - \theta, \varphi + \pi)$. In particular,

$$\Pi Y_l^m = (-1)^l\, Y_l^m.$$

13-8. TIME REVERSAL

The time-reversal transformation is defined by $Q \to Q$, $\dot Q \to -\dot Q$. It is immediately seen that the evolution operator as given by Eq. (13-60) is invariant under this transformation. It follows from this that it cannot be a unitary transformation. If it were, we could conclude, from the definition of $\dot Q = i[H, Q]$, that $\dot Q$ must be invariant too, contrary to the defining property of the time-reversal transformation.

Let us examine whether there exists an anti-unitary transformation which generates the time-reversal transformation. Such a transformation does indeed exist. If we define by $T$ the complex conjugation of the state in the spectral representation of $Q$, that is, the transformation which assigns to any vector $\psi(x)$ the vector

$$(T\psi)(x) = \psi^*(x), \tag{13-88}$$

then we find that

$$PT = -TP, \quad QT = TQ, \quad\text{and}\quad T^2 = I.$$

Furthermore, $THT^{-1} = (1/2\mu)(P + a(Q))^2 + v(Q)$. The evolution operator is thus invariant under time reversal if the external fields $a(Q)$ and $v(Q)$ are transformed simultaneously into $-a(Q)$ and $v(Q)$.

Since $\mu\dot Q - P = -a(Q)$, this implies that $\dot Q$ transforms into $-\dot Q$ under the transformation $T$, as it must.
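The sign change of the velocity under $T$ can be seen in the probability current: complex conjugation of the wave function together with $a \to -a$ reverses the current at every point. The sketch below checks this in one dimension, assuming the standard form $j = \mathrm{Re}\,[\psi^*\,(1/\mu i)(d/dx - ia)\psi]$; the wave packet, the field, and the function names are arbitrary illustrative choices.

```python
import cmath

def current(psi, a, mu, x, h=1e-5):
    """Re[psi* (1/(i mu)) (d/dx - i a) psi] at the point x (central difference)."""
    dpsi = (psi(x + h) - psi(x - h)) / (2 * h)
    return (psi(x).conjugate() * (dpsi - 1j * a(x) * psi(x)) / (1j * mu)).real

def time_reverse(psi):
    """T acts as complex conjugation in the spectral representation of Q."""
    return lambda x: psi(x).conjugate()

mu = 2.0
packet = lambda x: cmath.exp(-x * x / 2 + 1.2j * x)   # moving Gaussian packet
field = lambda x: 0.3 * x
reversed_field = lambda x: -field(x)

points = (-0.8, 0.0, 0.5, 1.4)
forward = [current(packet, field, mu, x) for x in points]
backward = [current(time_reverse(packet), reversed_field, mu, x) for x in points]
worst = max(abs(f + b) for f, b in zip(forward, backward))
print(worst)  # forward and backward currents cancel up to rounding
```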

A special case, which is of great importance in some applications, is the case of scalar interactions in three dimensions. The evolution operator is then always equivalent to the operator

$$H = \frac{1}{2\mu}P^2 + v(Q)$$

for a suitable choice of the gauge. The form invariance of this operator under gauge transformations has the consequence that to every solution $\psi_t$ of the Schrödinger equation there corresponds another one which is related to the first by a time-reversal transformation. The time-reversed solution $\psi'_t$ is defined by

$$\psi'_t = T\psi_{-t}. \tag{13-89}$$

We verify that with $\psi_t$ this is also a solution of the Schrödinger equation. Thus let $i\dot\psi_t = H\psi_t$. Applying $T$ on both sides of the equation, we find

$$Ti\dot\psi_t = -iT\dot\psi_t = TH\psi_t = HT\psi_t.$$

Here we have used the fact that $T$ is anti-unitary and that it commutes with $H$. Since $T\psi_t = \psi'_{-t}$, we may write the above result also in the form

$$i\,\frac{d\psi'_{t'}}{dt'} = H\psi'_{t'} \quad\text{with}\quad t' = -t.$$

This shows that $\psi'_t$ represents the time-reversed solution of the Schrödinger equation.

It is interesting to examine the effect of time reversal on the angular-momentum operator. It is not difficult to verify (Problem 1) that

$$T\mathbf{L} = -\mathbf{L}T. \tag{13-90}$$

It follows from this that (up to a phase factor)

TO2J7' = (13—91)

For the particular choice of phases adopted in Eq. (13—35), the phase factoris + 1 as written in the last equation.

PROBLEMS

1. The angular-momentum operator anticommutes with the time-reversal operator:

$$T\mathbf{L} = -\mathbf{L}T.$$

2. If $H\psi = E\psi$, and $E$ is nondegenerate, then the solution $\psi$ is proportional to a real function in the spectral representation of $Q$.

3. The time-reversal transformation $T$ commutes with the rotations and the space inversion:

$$TU_R = U_RT, \quad\text{and}\quad T\Pi = +\Pi T.$$

[Note: In the derivation of these relations it is essential to use the antilinear property of $T$.]

4. In the spectral representation of $P$, the time-reversal transformation has the form

$$(T\varphi)(k) = \varphi^*(-k).$$

5. If $T$ commutes with $H$, and if there exists an eigenstate $\psi$ of the evolution operator, $H\psi = E\psi$, with a current density $\mathbf{j}(x)$ not identically zero, then the eigenvalue $E$ is necessarily degenerate. [Hint: A solution of $H\psi = E\psi$ with $\mathbf{j}(x) \neq 0$ is necessarily complex. The operator $T$ then generates another solution.]

REFERENCES

1. J. VON NEUMANN, Annals of Math. 32, 191 (1931).

2. E. P. WIGNER, Gruppentheorie. Braunschweig: F. Vieweg (1931).

3. H. A. BETHE, Handbuch der Physik, Vol. 24, 1. Berlin: Springer (1933); especially the Appendix.

CHAPTER 14

PARTICLES WITH SPIN

Das ist ja ein ganz witziger Einfall. (That is really quite a witty idea.)

W. PAULI to R. de L. Kronig, January 8, 1925

In this chapter the nonrelativistic theory of a particle is refined to the theory of a particle with spin. The case of spin ½ is considered in detail, being at once the simplest and the most fundamental one. In Section 14-1 we prepare the physical and mathematical background for the description of "quasi-elementary" particles, that is, particles which have some sort of internal degree of freedom. These general considerations are then specialized in Section 14-2 to the elementary theory of a particle with spin ½. The aspect of the spin as an intrinsic angular momentum is then related to the representation theory of the rotation group in Section 14-3. How the spin and the orbital angular momentum combine to the resultant total angular momentum is explained in Section 14-4. We sketch in Section 14-6 the effect of external electric and magnetic fields on the dynamical structure of a particle with spin which leads to the observable phenomena of hyperfine splitting and anomalous Zeeman effects. The case of a general spin is considered in the last section (14-7), where it is shown that the entire kinematics of a particle with any spin can be obtained from the imprimitivity theorem.

14-1. SPIN, A NONCLASSICAL DEGREE OF FREEDOM

In Section 13-1 we gave a definition of an elementary particle as a localizable system for which the position observables constitute a complete system of compatible observables. In mathematical terms this can be expressed in two different, but equivalent, ways. In order to do this, it is convenient to introduce the abelian von Neumann algebra generated by the projection-valued measure $\Delta \to E_\Delta$, where $\Delta$ is a Borel set in the Euclidean space $E_3$ of three dimensions and $E_\Delta$ is the projection representing the elementary proposition: "The particle is in the set $\Delta$." The algebra $\mathcal{A}$ generated by this set is defined by the formula $\mathcal{A} = \{E_\Delta\}''$, which says that $\mathcal{A}$ is the double commutant of the family of projections $E_\Delta$. $\mathcal{A}$ consists of all functions $u(Q)$ defined by

$$(f, u(Q) f) = \int u(x)\, d\rho_f, \quad \text{where } \rho_f(\Delta) = (f, E_\Delta f)$$

is the measure induced by the projections $E_\Delta$ for any fixed vector $f$.

The statement that $E_\Delta$ is a complete system of compatible observables is expressed in full rigor and extreme simplicity by the relation (cf. Eq. (13-1))

$$\mathcal{A} = \mathcal{A}' \tag{14-1}$$

which says that $\mathcal{A}$ is a maximal abelian algebra.

We have previously shown that this property can also be expressed in terms of the cyclic vector $g$. In fact, Eq. (14-1) is equivalent to the statement that the family of projections $E_\Delta$ admits a cyclic vector: There exists a vector $g$ with the property that the linear manifold

$$\{\mathcal{A} g\} \tag{14-2}$$

is dense in $\mathcal{H}$.

Whether there exist elementary particles is ultimately a matter of experience. There is nothing in the axioms of quantum mechanics which would throw any light on this question. We shall see that electrons, for instance, are not elementary in this sense. They have an internal degree of freedom, called spin, which one must include in a complete set of compatible observables.

But even for systems which in a first approximation may be considered "elementary," the notion may have only a limited validity.

We can illustrate this point by discussing a few examples: A He nucleus in its ground state can, to a good approximation, be considered an elementary particle in the sense just described. But we know that this notion represents only an approximation, which is fully adequate for the description of all low-energy phenomena, but which is no longer valid for very high-energy processes. The reason is that a He nucleus is a complex structure with internal degrees of freedom, which we may conveniently ignore as long as we focus our attention only on the ground state. These internal degrees of freedom make themselves felt in processes of excitation and dissociation which may occur at high energies.

As a second example, we consider the $\pi$-meson. Again, if we consider a $\pi$-meson of a definite charge, we may, to a good approximation, consider it an elementary particle in the above sense, but only as long as we ignore its instability against decay into other particles. Furthermore we observe that $\pi$-mesons occur in nature in three varieties of charge, $+e$, $-e$, and $0$, and of masses very nearly equal. For this reason physicists have for a long time adopted the point of view that the $\pi$-meson might really be not an elementary system but rather a threefold degenerate version of a larger system. The additional degrees of freedom which need to be added in order to obtain a full description of the system are, in this case, known as the isotopic spin. This point of view has proved very fruitful for the discussion of the so-called strong interactions, because it is found that these interactions are invariant under rotations in the isotopic spin-space.

Finally we mention one of the most interesting cases of additional degeneracy from the domain of weak interactions. This is the system of the neutral K-mesons. There are two neutral particles $K^0$ and $\bar{K}^0$ which have identical mass and no electric charge; they appear in different reactions and they decay differently. What is more striking is that they may occur in coherent superpositions.

Historically the earliest example known of a system which is not elementary in the above sense is furnished by the electron. Soon after the discovery of quantum mechanics, it was realized that the number of terms found, for instance in the analysis of the Zeeman effect for alkali atoms, was actually twice as large as the number one would have expected on the assumption of elementarity for the electron. This doubling was first attributed to a residual angular momentum of the core electron [1]. This interpretation was, however, soon found to be wrong, and the doubling of the states was recognized to be a double degeneracy of the states of the electron itself. Furthermore it was found that the doubly degenerate states transform under space rotations like the two-component representation of the group SU(2), the unitary group in two variables with determinant 1. This is the simply-connected universal covering group of the rotation group of $E_3$.

In physical language this means that the electron carries with it an intrinsic angular momentum, different from the orbital angular momentum. For this additional intrinsic angular momentum, the word "spin" is quite generally used today.

14-2. THE DESCRIPTION OF A PARTICLE WITH SPIN ½

In this section we shall now consider the kinematics of a particle with spin ½, a system for which the electron is a typical example.

Let $\Delta \to E_\Delta$ be the projection-valued measure describing localizability of the system and let $\mathcal{A} = \{E_\Delta\}''$ be the von Neumann algebra generated by the $E_\Delta$. Since the spectrum of this algebra is degenerate, there no longer exists a cyclic vector. Indeed a vector $g_1$ with the property that the measure $\rho_1(\Delta) = (g_1, E_\Delta g_1)$ is maximal defines a linear manifold $D_{g_1} = \{\mathcal{A} g_1\}$, the closure of which, $M_1 \equiv M(g_1) = \overline{D_{g_1}}$, is a proper subspace of the Hilbert space.

Let $E_1$ be the projection with range $M_1$, and $E_2 = I - E_1$ the projection with range $M_1^\perp = M_2$. In $M_2$ we can choose a second maximal vector $g_2$ such that the measure $\rho_2(\Delta) = (g_2, E_\Delta g_2)$ is also maximal. It is therefore equivalent to $\rho_1$. The invariance of the measures $\rho_1$ and $\rho_2$ under translations implies that both these measures are equivalent to Lebesgue measure in $E_3$. If the algebra $\mathcal{A}$ is doubly degenerate, then the vector $g_2$ generates the entire space $M(g_2)$ so that

$$M_2 = M(g_2) = \overline{\{\mathcal{A} g_2\}}.$$

This is the situation which is found for electrons.

Let $\psi \in \mathcal{H}$ be an arbitrary unit vector in $\mathcal{H}$, representing, for instance, a pure state of the electron system, and $\psi = \psi_1 + \psi_2$ the unique decomposition with $\psi_1 = E_1\psi$, $\psi_2 = E_2\psi$. Furthermore, let $\mathcal{A}_1 = \mathcal{A}E_1 = E_1\mathcal{A}E_1$ and $\mathcal{A}_2 = \mathcal{A}E_2 = E_2\mathcal{A}E_2$ be the reduction of $\mathcal{A}$ to the subspaces $M_1$ and $M_2$ respectively. Then $\mathcal{A}_1$ is maximal abelian in $M_1$ and $\mathcal{A}_2$ is maximal abelian in $M_2$. Hence the vector $\psi_1$ admits a spectral representation $\psi_1 = \{\psi_1(x)\}$ and so does $\psi_2 = \{\psi_2(x)\}$.

We have thus verified that the pure states of the doubly degenerate electron system permit a unique description in terms of pairs of functions $\psi = \{\psi_1(x), \psi_2(x)\}$. The square of the norm of the vector $\psi$ is then equal to the sum of the squares of the norms of $\psi_1$ and $\psi_2$:

$$\|\psi\|^2 = \|\psi_1\|^2 + \|\psi_2\|^2 = \int \left(|\psi_1(x)|^2 + |\psi_2(x)|^2\right) d^3x.$$

A slightly more abstract description is obtained by noting that for each value of $x$ the two complex numbers $\psi_1(x)$ and $\psi_2(x)$ may be regarded as the two components in a two-dimensional Hilbert space $\mathcal{H}_2$. We may therefore describe the spectral representation of a doubly degenerate localizable system by giving for each $x$ a vector $\psi(x)$ in some Hilbert space $\mathcal{H}_2(x)$. Such a family of vectors $\psi = \{\psi(x)\}$ with the norm defined by

$$\|\psi\|^2 = \int \|\psi(x)\|^2\, d^3x$$

forms a Hilbert space $\mathcal{H}$ which is called the direct integral of the spaces $\mathcal{H}_2(x)$.

Instead of a family of two-dimensional spaces, we may also choose a standard space $\mathcal{H}_2$ (independent of $x$). If $f \in L^2(E_3)$ and $u \in \mathcal{H}_2$, we can define a vector $\varphi(f, u) \in \mathcal{H}$ by setting $\varphi(f, u) = \{\varphi_1(x), \varphi_2(x)\}$, where $\varphi_1$ and $\varphi_2$ are proportional to the components of $u$ in a fixed coordinate system $(u_1, u_2)$ in $\mathcal{H}_2$. That is

$$\varphi_1(x) = \lambda_1 f(x), \qquad \varphi_2(x) = \lambda_2 f(x)$$

with

$$\lambda_1 = (u_1, u), \quad \lambda_2 = (u_2, u) \quad \text{and} \quad (u_1, u_2) = 0.$$

The following facts are easily verified:

1) The mapping $\varphi(f, u)$ from $L^2(E_3) \times \mathcal{H}_2$ into $\mathcal{H}$ is bilinear in $f$ and $u$.
2) Every vector of $\mathcal{H}$ is a linear combination of vectors of the form $\varphi(f, u) \equiv f \otimes u$.

We have previously shown (Section 11-7) that these properties characterize the tensor product of the two spaces $L^2(E_3)$ and $\mathcal{H}_2$. Thus we see: The pure states are unit vectors in the tensor product $L^2(E_3) \otimes \mathcal{H}_2$.

Furthermore, because the space $\mathcal{H}_2$ is two-dimensional, every pure state in $\mathcal{H}$ is the linear combination of exactly two states of the form $\varphi(f, u)$. Indeed, let $u_1$ and $u_2$ be an orthonormal coordinate system in $\mathcal{H}_2$. We then verify easily that the pure state

$$\psi = \{\psi_1(x), \psi_2(x)\}$$

is represented by

$$\psi = \psi_1 \otimes u_1 + \psi_2 \otimes u_2 \tag{14-4}$$

provided we choose for $u_1$ and $u_2$ the special representations

$$u_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad u_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

In the following, both representations of the vectors in $\mathcal{H}$ will be used interchangeably. The transition from one to the other is given by formula

The fact that pure states for a particle with spin are unit vectors in a direct product space of $L^2(E_3)$ and the two-dimensional spin space has interesting consequences when we consider the reduction of the states to these two factor spaces.

For instance, we may ask the question: What is the spin state of a particle which is in the pure state (14-4)? Similarly we may ask: What is the state in the space $L^2(E_3)$? We encounter here again the problem of the reduction of the state of a system in a product space to one of its factor spaces. This problem is identical with the problem of the reduction of states in the measuring process. We have solved it in detail in Section 11-8 and we can take over the solution here in full. From the general theory of Section 11-8 we know that a pure state in a product space is in general a mixture in one of the factor spaces and is therefore best represented by the density matrix.

We express the result in the notation of Section 11-8 by introducing $\mathcal{H}' = L^2(E_3)$, $\mathcal{H}'' = \mathcal{H}_2$, and $W = P_\psi$ for the pure state (14-4). A simple calculation then gives for the reduced states $W'$, $W''$ the expressions (Problems 1, 2)

$$W' = \|\psi_1\|^2 P_{\psi_1} + \|\psi_2\|^2 P_{\psi_2} \tag{14-5}$$

$$W'' = \begin{pmatrix} (\psi_1, \psi_1) & (\psi_1, \psi_2) \\ (\psi_2, \psi_1) & (\psi_2, \psi_2) \end{pmatrix} \tag{14-6}$$

where $P_{\psi_r}$ denotes the projection onto the ray of $\psi_r$. From these expressions we can verify immediately what we know already from the general reduction theory: The reduced states $W'$ and $W''$ are pure if and only if the pure state is of the form $f \otimes u$ where $f \in L^2(E_3)$ and $u \in \mathcal{H}_2$ (Problems 3, 4).


The matrix $W''$ represents the spin state for the entire system. We may physically also speak of the spin states of the particles inside a volume element $\Delta$. It is then convenient to introduce the spin-state density $W''(x)$ defined by the matrix

$$W''(x) = \begin{pmatrix} |\psi_1(x)|^2 & \psi_1^*(x)\psi_2(x) \\ \psi_2^*(x)\psi_1(x) & |\psi_2(x)|^2 \end{pmatrix}. \tag{14-7}$$

This expression is normalized so that

$$\int \operatorname{Tr} W''(x)\, d^3x = 1.$$

The spin-state density is always a pure state since $\det W''(x) = 0$; therefore $W''(x)$ is of rank 1. However, the spin state of a finite volume element, given by

$$W''_\Delta = \int_\Delta W''(x)\, d^3x \tag{14-8}$$

is in general a mixture.

An interesting illustration of these properties of the spin which was of great importance in the development of the spin theory is the Stern-Gerlach and related types of experiments. These experiments are based on the fact that the magnetic properties associated with the different spin states permit modifications of the components $\psi_1(x)$ and $\psi_2(x)$ by external magnetic fields.

Fig. 14-1 Schematic representation of the Stern-Gerlach experiment.

For instance, in the Stern-Gerlach experiment an atom carrying the spin of a single electron is subject to an inhomogeneous magnetic field which separates the wave function $\psi_1(x)$ from $\psi_2(x)$ (cf. Fig. 14-1). In the region $\Delta_1$ the pure state of the system is represented by a pair of functions of the form

$$\psi(x) = \{\lambda_1 f(x), \lambda_2 f(x)\}.$$

The spin state for $\Delta_1$ is pure too, and it is given by

$$u = \lambda_1 u_1 + \lambda_2 u_2.$$

In the region $\Delta_2$ the state $\psi(x)$ is still pure but it is given by a pair of functions

$$\psi(x) = \{\psi_1(x), \psi_2(x)\}$$

with $(\psi_1, \psi_2) = 0$.

The spin state for $\Delta_2$ is now a mixture of the states $u_1$ and $u_2$ with the respective probabilities

$$\int_{\Delta_2} |\psi_1(x)|^2\, d^3x \quad \text{and} \quad \int_{\Delta_2} |\psi_2(x)|^2\, d^3x.$$
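The distinction between the pointwise spin-state density (14-7) and the integrated spin state (14-8) can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the continuous position variable is replaced by a handful of grid points, the sampled amplitudes are invented for illustration, and the matrix entries are taken as $\psi_r^*(x)\psi_s(x)$, which assumes a scalar product that is antilinear in its first argument. The names `spin_density`, `region_state`, `product_region`, and `separated_region` are hypothetical.

```python
def spin_density(psi1, psi2):
    """Pointwise 2x2 matrix with entries conj(psi_r) * psi_s, as in (14-7)."""
    comps = (complex(psi1), complex(psi2))
    return [[comps[r].conjugate() * comps[s] for s in range(2)] for r in range(2)]


def det2(w):
    """Determinant of a 2x2 matrix given as nested lists."""
    return w[0][0] * w[1][1] - w[0][1] * w[1][0]


def trace2(w):
    """Trace of a 2x2 matrix given as nested lists."""
    return w[0][0] + w[1][1]


def region_state(samples):
    """Sum the pointwise densities over the grid points of a region, as in (14-8).

    The grid spacing is assumed to be absorbed into the sampled amplitudes.
    """
    total = [[0j, 0j], [0j, 0j]]
    for psi1, psi2 in samples:
        w = spin_density(psi1, psi2)
        for r in range(2):
            for s in range(2):
                total[r][s] += w[r][s]
    return total


# Hypothetical samples: both components proportional to one function f(x),
# i.e. a state of the product form f (x) u.
f_values = (0.6, 0.8j)
product_region = [(0.6 * f, 0.8 * f) for f in f_values]

# Hypothetical samples: the two components occupy disjoint grid points,
# as after the beams have been separated by the magnet.
separated_region = [(0.6, 0.0), (0.0, 0.8)]

for name, region in (("product", product_region), ("separated", separated_region)):
    w = region_state(region)
    print(name, "trace =", trace2(w).real, "|det| =", abs(det2(w)))
```

In this toy setting every pointwise density has vanishing determinant, the product-type region integrates to a matrix of rank one, and the separated region integrates to a diagonal matrix with nonzero determinant, that is, to a mixture.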

PROBLEMS

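1. Let $\mathcal{H} = \mathcal{H}' \otimes \mathcal{H}''$ where $\mathcal{H}' = L^2(E_3)$ and $\mathcal{H}'' = \mathcal{H}_n$ is a finite ($n$)-dimensional Hilbert space. The reduction of a pure state $\psi \in \mathcal{H}$ to $\mathcal{H}'$ has the form

$$W' = \sum_{r=1}^{n} \|\psi_r\|^2 P_{\psi_r}$$

where $\psi_r$ is the $r$th component of $\psi$ in the direct product.

2. Under the assumptions of Problem 1, the reduction of the pure state $W$ to the space $\mathcal{H}''$ is given by the matrix of Gram

$$W''_{rs} = (\psi_r, \psi_s) \qquad (r, s = 1, \ldots, n).$$

3. If $W'$ and $W''$ of the preceding problems are pure, then the matrix $W''_{rs}$ is of rank one and all the functions $\psi_r$ are proportional: $\psi_r = \lambda_r f$.

4. If $W'$ and $W''$ are pure, then $\psi$ is of the form $\psi = f \otimes u$ with $\psi_r = \lambda_r f$ and

$$u = \sum_{r=1}^{n} \lambda_r u_r.$$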

14-3. SPIN AND ROTATIONS

So far we have formulated only the mathematical aspect of the degeneracy caused by an internal degree of freedom. Now we must proceed to express in mathematical terms that this internal degree of freedom is an intrinsic angular momentum, in short, a spin.

This property implies a definite transformation law of the spin components under space rotations: The spin-space will undergo, under space rotations, a linear transformation which furnishes a projective representation of the rotation group. The problem of finding the transformation of the spin is thus reduced to a mathematical one: Find all the projective representations of the rotation group in a two-dimensional complex space. The solution of this problem is unique up to a similarity transformation: There exists exactly one class of equivalent representations of the rotation group. We shall now determine this representation.

There are essentially two methods which one may employ in determining the unique two-dimensional projective representation of the rotation group. One of them emphasizes the global point of view, the other the local one.

The global method is based on the fact that the rotation group $G$ is doubly connected and that its simply connected universal covering group is SU(2), the group of unimodular unitary transformations in a two-dimensional complex space. Every such transformation $U$ induces, in fact, a rotation in $E_3$ in the following manner: We represent a vector in $\mathcal{H}_2$ by two complex numbers $(z_1, z_2)$ and write for $U$,

$$\begin{aligned} z_1' &= \alpha z_1 + \beta z_2 \\ z_2' &= -\beta^* z_1 + \alpha^* z_2, \end{aligned} \tag{14-9}$$

where $\alpha$, $\beta$ are two complex numbers subject to the condition

$$\alpha^*\alpha + \beta^*\beta = 1. \tag{14-10}$$

It follows from Eq. (14-9) that the quantities

$$\begin{aligned} s_1 &= z_1^* z_2 + z_2^* z_1 \\ s_2 &= -i(z_1^* z_2 - z_2^* z_1) \\ s_3 &= z_1^* z_1 - z_2^* z_2 \end{aligned} \tag{14-11}$$

undergo a linear transformation with real coefficients if the $z_1$, $z_2$ are transformed according to Eq. (14-9). Furthermore the Euclidean length $\sqrt{s_1^2 + s_2^2 + s_3^2}$ is left invariant under this transformation. One finds, in fact, that (Problem 1)

$$s_1'^2 + s_2'^2 + s_3'^2 = (\alpha^*\alpha + \beta^*\beta)^2 (s_1^2 + s_2^2 + s_3^2). \tag{14-12}$$

Thus to every transformation $U$ there corresponds exactly one rotation. Evidently the same rotation corresponds to the transformations $U$ and $-U$. Furthermore to every rotation $R$ there corresponds exactly one such pair of transformations $\pm U(R)$ in $\mathcal{H}_2$ (Problem 2).

It follows from this that the transformations $U(R)$ constitute a ray representation of the rotations:

$$U(R_1)U(R_2) = \pm U(R_1R_2). \tag{14-13}$$
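The global construction can be checked numerically for a single example. The sketch below applies a transformation of the form (14-9) to one arbitrarily chosen pair $(z_1, z_2)$, forms the quantities (14-11) before and after, and compares their Euclidean lengths; it also confirms that $U$ and $-U$ give the same transformed $s$. The parameter values and the function names `apply_u`, `s_vector`, and `length_sq` are illustrative assumptions, not part of the text.

```python
import cmath
import math


def apply_u(alpha, beta, z1, z2):
    """Transformation (14-9) with complex parameters alpha, beta."""
    return (alpha * z1 + beta * z2,
            -beta.conjugate() * z1 + alpha.conjugate() * z2)


def s_vector(z1, z2):
    """The three real quantities (14-11) built from z1, z2."""
    cross = z1.conjugate() * z2
    s1 = (cross + cross.conjugate()).real
    s2 = (-1j * (cross - cross.conjugate())).real
    s3 = (z1.conjugate() * z1 - z2.conjugate() * z2).real
    return (s1, s2, s3)


def length_sq(s):
    """Squared Euclidean length of a 3-vector."""
    return sum(c * c for c in s)


# Arbitrary illustrative parameters satisfying (14-10):
# |alpha|^2 + |beta|^2 = cos^2 + sin^2 = 1.
theta = 0.7
alpha = math.cos(theta) * cmath.exp(1.9j)
beta = math.sin(theta) * cmath.exp(-0.4j)

z = (0.3 + 1.1j, -0.8 + 0.2j)
s_before = s_vector(*z)
s_after = s_vector(*apply_u(alpha, beta, *z))
s_after_minus = s_vector(*apply_u(-alpha, -beta, *z))

print("length^2 before:", length_sq(s_before))
print("length^2 after: ", length_sq(s_after))
```

Because the $s_r$ are quadratic in $z_1, z_2$, replacing $U$ by $-U$ cannot change them, which is the two-to-one correspondence expressed by the sign in (14-13).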

Let us now examine this representation from the local point of view. Locally, the rotation group $G$ and the group SU(2) are isomorphic. This means there exist neighborhoods of the identity in the two groups and a one-to-one correspondence of these neighborhoods which is left invariant under the group operations. This local structure is completely characterized by the Lie algebra, which for the rotation group is given by

$$[L_1, L_2] = iL_3, \quad \ldots \text{ cycl.} \tag{14-14}$$

Here and subsequently, we indicate by "... cycl." the other two equations which are obtained by cyclic permutations of the indices.

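A solution of these commutation rules by two-dimensional matrices may be obtained in the form

$$L_r = \tfrac{1}{2}\sigma_r \qquad (r = 1, 2, 3),$$

where

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{14-15}$$

are the Pauli spin-matrices. They satisfy, in addition to the relation (14-14), the stronger relations

$$\sigma_1\sigma_2 = i\sigma_3 = -\sigma_2\sigma_1, \quad \ldots \text{ cycl.}, \qquad \sigma_r^2 = I \quad (r = 1, 2, 3). \tag{14-16}$$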
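The relations (14-14) and (14-16) are easy to confirm by direct multiplication of the matrices (14-15). The following short sketch does this with plain nested lists; the helper names (`matmul`, `scale`, `subtract`, `close`, `check_relations`) are hypothetical conveniences introduced only for this illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    """Multiply every entry of matrix a by the number c."""
    return [[c * x for x in row] for row in a]


def subtract(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def close(a, b, tol=1e-12):
    """True if two matrices agree entrywise within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


IDENTITY = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],       # sigma_1
    [[0, -1j], [1j, 0]],    # sigma_2
    [[1, 0], [0, -1]],      # sigma_3
]
L = [scale(0.5, s) for s in SIGMA]  # L_r = sigma_r / 2


def check_relations():
    """Check (14-16) and (14-14) for all cyclic index triples."""
    for r in range(3):
        s, t = (r + 1) % 3, (r + 2) % 3
        assert close(matmul(SIGMA[r], SIGMA[s]), scale(1j, SIGMA[t]))
        assert close(matmul(SIGMA[s], SIGMA[r]), scale(-1j, SIGMA[t]))
        assert close(matmul(SIGMA[r], SIGMA[r]), IDENTITY)
        commutator = subtract(matmul(L[r], L[s]), matmul(L[s], L[r]))
        assert close(commutator, scale(1j, L[t]))
    return True


print("relations hold:", check_relations())
```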

One can prove that the irreducible representations (14-15) of the commutation relations are unique up to equivalence. This is expressed in the form of the following:

Theorem. If $\sigma_r'$ ($r = 1, 2, 3$) is an irreducible representation of the algebraic relations (14-16), then there exists a nonsingular transformation $S$ such that

$$\sigma_r' = S\sigma_r S^{-1} \qquad (r = 1, 2, 3)$$

where $\sigma_r$ are the Pauli spin matrices (14-15).

If $\sigma_r'$ are Hermitian matrices, then $S$ is unitary. For the proof of the theorem see Problems 3, 4, 5, and 6.

The uniqueness of the representations of (14-14) can also be demonstrated by construction of all the representations of (14-14) with algebraic methods. We shall here sketch this method. More details can be found in many textbooks, since this is one of the oldest methods which has been used for this purpose.

We observe first that the operator $L^2 \equiv L_1^2 + L_2^2 + L_3^2$ commutes with all three components and it is therefore a multiple of the unity. We denote it by $\kappa^2 \cdot I$. Since $L^2$ is a sum of squares, it is a positive operator. Therefore $\kappa^2 \geq 0$, so that $\kappa$ is real. We introduce the raising and lowering operators

$$L_\pm = L_1 \pm iL_2. \tag{14-17}$$

They satisfy the commutation rules

$$[L_3, L_\pm] = \pm L_\pm, \qquad [L_+, L_-] = 2L_3. \tag{14-18}$$

Let $|m\rangle$ represent an eigenvector of $L_3$ with eigenvalue $m$, so that

$$L_3 |m\rangle = m |m\rangle. \tag{14-19}$$

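It follows from the relations (14-18) that

$$L_3 L_+ |m\rangle = (L_+ L_3 + L_+)|m\rangle = (m + 1)L_+ |m\rangle. \tag{14-20}$$

This shows that $L_+|m\rangle$ is also an eigenvector of $L_3$ with eigenvalue $m + 1$. Thus we may write $L_+|m\rangle = \lambda_m |m + 1\rangle$ where $\lambda_m$ is some numerical constant to be determined. One proves in a similar way that $L_-|m\rangle = \mu_m |m - 1\rangle$, where $\mu_m$ is some other numerical constant. Since $L_+$ and $L_-$ are adjoints of one another it follows immediately that $\lambda_m = \mu_{m+1}^*$. Furthermore, from

$$L_- L_+ = L_1^2 + L_2^2 + i[L_1, L_2] = L^2 - L_3^2 - L_3,$$

it follows that

$$|\mu_{m+1}|^2 = |\lambda_m|^2 = \kappa^2 - m(m + 1).$$

From this equation one sees immediately that the sequence $(m)$ is bounded above and below. Since $\operatorname{Tr} L_3 = 0$ (cf. Problem 3), the values of $m$ must be symmetrically distributed between positive and negative values. This is only possible if $\kappa^2 = l(l + 1)$ and $-l \leq m \leq +l$ with $l$ one of the values $l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$ It follows that $|\lambda_m|^2 = l(l + 1) - m(m + 1)$ and the arbitrary phase can be fixed such that

$$\mu_{m+1} = \lambda_m = \sqrt{l(l + 1) - m(m + 1)}.$$

We have thus constructed the matrix representation of the operators $L_r$ in a $(2l + 1)$-dimensional space for $l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$, etc. We summarize the result:

$$\begin{aligned} L_3 |m\rangle &= m |m\rangle \\ L_+ |m\rangle &= \lambda_m |m + 1\rangle \\ L_- |m\rangle &= \lambda_{m-1} |m - 1\rangle \end{aligned} \tag{14-21}$$

with $\lambda_m = \sqrt{l(l + 1) - m(m + 1)}$.

One verifies that one finds the representation (14-15) by specializing $l = \tfrac{1}{2}$.

Returning to this special case $l = \tfrac{1}{2}$ once more, we can now easily establish the connection between the local and the global approach. Since the rotation $R_\varphi$ with the parameter vector $\boldsymbol{\varphi}$ is generated by the infinitesimal rotation in the direction of $\boldsymbol{\varphi}$, we find immediately that this rotation is represented by $R_\varphi \to \pm U_\varphi$ with

$$U_\varphi = e^{\frac{i}{2}(\boldsymbol{\varphi}\cdot\boldsymbol{\sigma})}. \tag{14-22}$$

The last relation is very useful for obtaining an explicit formula for the matrix $U_\varphi$. If we write

$$\varphi = |\boldsymbol{\varphi}| \quad \text{and} \quad \boldsymbol{\alpha} = \frac{1}{\varphi}\boldsymbol{\varphi},$$

we have

$$U_\varphi = \cos\frac{\varphi}{2} + i(\boldsymbol{\alpha}\cdot\boldsymbol{\sigma})\sin\frac{\varphi}{2},$$

$$U_\varphi = \begin{pmatrix} \cos\frac{\varphi}{2} + i\alpha_3\sin\frac{\varphi}{2} & i(\alpha_1 - i\alpha_2)\sin\frac{\varphi}{2} \\ i(\alpha_1 + i\alpha_2)\sin\frac{\varphi}{2} & \cos\frac{\varphi}{2} - i\alpha_3\sin\frac{\varphi}{2} \end{pmatrix} \tag{14-24}$$

so that, in the notation of (14-9),

$$\alpha = \cos\frac{\varphi}{2} + i\alpha_3\sin\frac{\varphi}{2}, \qquad \beta = (\alpha_2 + i\alpha_1)\sin\frac{\varphi}{2}. \tag{14-25}$$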
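The algebraic construction summarized in (14-21) translates directly into explicit matrices. The sketch below builds $L_3$, $L_+$, and $L_-$ for a given $l$ in the basis $|m\rangle$ ordered as $m = l, l-1, \ldots, -l$ (an ordering chosen here for convenience, not prescribed by the text) and checks the commutation rules (14-18). The function names `ladder_matrices`, `commutator`, `scaled`, and `close` are hypothetical.

```python
from fractions import Fraction
import math


def ladder_matrices(l):
    """Return (L3, L+, L-) for angular momentum l, following (14-21).

    Basis ordering (an assumption of this sketch): m = l, l-1, ..., -l.
    The entry [row][col] is the matrix element <row| operator |col>.
    """
    l = Fraction(l)
    dim = int(2 * l + 1)
    ms = [l - k for k in range(dim)]
    l3 = [[float(ms[i]) if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    l_plus = [[0.0] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = ms[col]
        # L+|m> = lambda_m |m+1>, and |m+1> sits one index earlier.
        l_plus[col - 1][col] = math.sqrt(float(l * (l + 1) - m * (m + 1)))
    # L- is the adjoint of L+; the entries are real, so a transpose suffices.
    l_minus = [[l_plus[j][i] for j in range(dim)] for i in range(dim)]
    return l3, l_plus, l_minus


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]


def scaled(c, a):
    """Multiply every entry of matrix a by the number c."""
    return [[c * x for x in row] for row in a]


def close(a, b, tol=1e-12):
    """True if two matrices agree entrywise within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


for l in (Fraction(1, 2), 1, Fraction(3, 2)):
    l3, l_plus, l_minus = ladder_matrices(l)
    ok = close(commutator(l3, l_plus), l_plus) and \
        close(commutator(l_plus, l_minus), scaled(2, l3))
    print("l =", l, "satisfies (14-18):", ok)
```

For $l = \tfrac{1}{2}$ the matrices $\tfrac{1}{2}(L_+ + L_-)$ and $L_3$ produced this way coincide with $\tfrac{1}{2}\sigma_1$ and $\tfrac{1}{2}\sigma_3$ of (14-15).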

This transformation law of the spin components is now to be combined with the other effect of space rotations on the state vector to obtain the general transformation law for the state vectors. We define the transformed vector $R\psi$ by the equation

$$(R\psi)(Rx) = U(R)\psi(x). \tag{14-26}$$

This definition of the transformed state vector $R\psi$ is made in such a way that for two rotations $R_1$ and $R_2$, we find the composition law

$$(R_1R_2\psi)(R_1R_2x) = U(R_1)U(R_2)\psi(x). \tag{14-27}$$

PROBLEMS

1. If $s_r$ ($r = 1, 2, 3$) are defined according to (14-11), then the linear transformation $s_r \to s_r'$ of the variables $s_r$ induced by (14-9) is such that

$$s_1'^2 + s_2'^2 + s_3'^2 = (\alpha^*\alpha + \beta^*\beta)^2 (s_1^2 + s_2^2 + s_3^2).$$

2. The correspondence between rotations $R$ in $E_3$ and unimodular unitary transformations $U$ in $\mathcal{H}_2$ is such that to every pair of transformations $\pm U$ there corresponds exactly one rotation.

3. If $L_r$ ($r = 1, 2, 3$) is any set of matrices satisfying (14-14), then

$$\operatorname{Tr} L_r = 0.$$

4. Any set $\sigma_r$ satisfying Eq. (14-16) is linearly independent:

$$\sum_r \lambda_r \sigma_r = 0 \quad \text{implies} \quad \lambda_r = 0.$$

5. Every representation of the relation (14-16) is equivalent to a Hermitian one.

6. Every irreducible Hermitian representation of (14-16) is of dimension two, and every two such representations are unitarily equivalent.

7. Two rotations $R_\varphi$ and $R_{\varphi'}$ have the composition law $R_\varphi R_{\varphi'} = R_{\varphi''}$, where the rotation angle of $R_{\varphi''}$ is $\psi = |\boldsymbol{\varphi}''|$ and is given by

$$\cos\frac{\psi}{2} = \cos\frac{\varphi}{2}\cos\frac{\varphi'}{2} - (\boldsymbol{\alpha}\cdot\boldsymbol{\alpha}')\sin\frac{\varphi}{2}\sin\frac{\varphi'}{2}$$

with

$$\boldsymbol{\alpha} = \frac{1}{\varphi}\boldsymbol{\varphi}, \qquad \boldsymbol{\alpha}' = \frac{1}{\varphi'}\boldsymbol{\varphi}'.$$

14-4. SPIN AND ORBITAL ANGULAR MOMENTUM

The combination of spin and orbital angular momentum gives rise to a number of interesting phenomena which are caused by the fact that in general neither the spin nor the orbital angular momentum is separately conserved in the time evolution of the system.

For a particle without spin we have found that in a spherically symmetrical external field the evolution operator $H$ commutes with the angular momentum operators $L_r$ ($r = 1, 2, 3$). For a particle with spin this need no longer be the case. Instead, another quantity, the total angular momentum, has this property.

In order to formulate this conservation law it is convenient to change the notation. We retain the notation $L_r$ ($r = 1, 2, 3$) for the three components of the orbital angular momentum. They are defined as before as the three operators with the components $L_r$. The spin angular momentum components we designate by $S_r = \tfrac{1}{2}\sigma_r$ ($r = 1, 2, 3$). We then define the total angular momentum $\mathbf{J}$ in vector notation by

$$\mathbf{J} = \mathbf{L} + \mathbf{S}. \tag{14-28}$$

For an infinitesimal rotation with parameters we then have

(14-29)

In an external field which is spherically symmetric the rotations leave the evolution operator $H$ invariant. It follows that the infinitesimal parts of the rotations commute also with $H$:

$$[\mathbf{J}, H] = 0. \tag{14-30}$$

This equation expresses the conservation of angular momentum in quantum mechanics.

For spinless systems we have shown that the possible values for the total angular momentum in units of $\hbar$ are given by $L^2 = l(l + 1)$ where $l$ assumes one of the values $l = 0, 1, 2, \ldots$ We shall now examine the corresponding question in a system with spin $s = \tfrac{1}{2}$. From (14-28) we find

$$\mathbf{J}^2 = \mathbf{L}^2 + \mathbf{S}^2 + 2\mathbf{L}\cdot\mathbf{S}. \tag{14-31}$$

Let us examine the properties of this operator in a subspace spanned by the $2l + 1$ eigenvectors of $L_3$ and the $2s + 1 = 2$ eigenvectors of $S_3$. These vectors span the $2(2l + 1)$-dimensional product space characterized by the fixed values $\mathbf{L}^2 = l(l + 1)$ and $\mathbf{S}^2 = \tfrac{3}{4}$.

Let us examine the operator (14-31). Since the values of $\mathbf{L}^2$ and $\mathbf{S}^2$ are fixed we must concentrate our attention on the operator $X = \mathbf{L}\cdot\mathbf{S}$. An elementary calculation, using the commutation rules of $\mathbf{L}$ and $\mathbf{S}$, yields the relation

$$X^2 + \tfrac{1}{2}X = \tfrac{1}{4}\mathbf{L}^2 = \tfrac{1}{4}l(l + 1). \tag{14-32}$$

This shows immediately that the $2(2l + 1)$-dimensional matrix $X$ has only two different eigenvalues which we may denote by $x_1$ and $x_2$. They are the two roots of the quadratic equation $x^2 + \tfrac{1}{2}x = \tfrac{1}{4}l(l + 1)$:

$$x_1 = \tfrac{1}{2}l, \qquad x_2 = -\tfrac{1}{2}(l + 1).$$

It follows from this result that if we write $\mathbf{J}^2 = j(j + 1)$ then $j$ may assume exactly two values which are determined by the two equations:

$$j(j + 1) = l(l + 1) + \tfrac{3}{4} + l = (l + \tfrac{1}{2})(l + \tfrac{3}{2})$$

$$j(j + 1) = l(l + 1) + \tfrac{3}{4} - (l + 1) = (l - \tfrac{1}{2})(l + \tfrac{1}{2}).$$

Thus the two possible values of $j$ are $l \pm \tfrac{1}{2}$ (for $l = 0$ only $j = \tfrac{1}{2}$ occurs).

This is the composition law of a spin $s = \tfrac{1}{2}$ with an orbital angular momentum $l$: The resultant angular momentum $j$ assumes only the two values $j = l \pm \tfrac{1}{2}$. This composition law is a special case of a more general law which governs the composition of any two angular momenta $\mathbf{J}_1$ and $\mathbf{J}_2$ to a resultant $\mathbf{J} = \mathbf{J}_1 + \mathbf{J}_2$. If $\mathbf{J}_1^2 = j_1(j_1 + 1)$ and $\mathbf{J}_2^2 = j_2(j_2 + 1)$, then $\mathbf{J}^2 = j(j + 1)$ assumes the values $j = j_1 + j_2,\ j_1 + j_2 - 1,\ \ldots,\ |j_1 - j_2|$. The proof of this is given in many references (e.g. [3]).
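The composition law can be explored with exact rational arithmetic. The sketch below lists the allowed values of $j$ for given $j_1$, $j_2$, checks that the multiplicities $2j + 1$ add up to the dimension $(2j_1 + 1)(2j_2 + 1)$ of the product space, and evaluates $\mathbf{L}\cdot\mathbf{S} = \tfrac{1}{2}[j(j+1) - l(l+1) - s(s+1)]$, which follows from (14-31). The function names `allowed_j`, `multiplicity_check`, and `ls_eigenvalue` are hypothetical.

```python
from fractions import Fraction


def allowed_j(j1, j2):
    """Values j = j1 + j2, j1 + j2 - 1, ..., |j1 - j2|, as exact fractions."""
    j1, j2 = Fraction(j1), Fraction(j2)
    j = j1 + j2
    values = []
    while j >= abs(j1 - j2):
        values.append(j)
        j -= 1
    return values


def multiplicity_check(j1, j2):
    """True if the multiplicities 2j + 1 sum to the product-space dimension."""
    j1, j2 = Fraction(j1), Fraction(j2)
    total = sum(2 * j + 1 for j in allowed_j(j1, j2))
    return total == (2 * j1 + 1) * (2 * j2 + 1)


def ls_eigenvalue(j, l, s):
    """Eigenvalue of L.S obtained by solving (14-31) for the cross term."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    return (j * (j + 1) - l * (l + 1) - s * (s + 1)) / 2


half = Fraction(1, 2)
for l in (1, 2, 3):
    js = allowed_j(l, half)
    xs = [ls_eigenvalue(j, l, half) for j in js]
    print("l =", l, "j values:", [str(j) for j in js], "L.S:", [str(x) for x in xs])
```

For $s = \tfrac{1}{2}$ the two values returned by `ls_eigenvalue` coincide with the roots $x_1 = \tfrac{1}{2}l$ and $x_2 = -\tfrac{1}{2}(l + 1)$ of the quadratic equation obtained from (14-32).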

PROBLEMS

1. If $\mathbf{L}^2 = l(l + 1)$ and $\mathbf{S}^2 = s(s + 1)$ then $\mathbf{L}\cdot\mathbf{S}$ may assume the values

$$\frac{j(j + 1) - l(l + 1) - s(s + 1)}{2}.$$

2. Two spins $\mathbf{S}_1$ and $\mathbf{S}_2$ each of value $s = \tfrac{1}{2}$ may combine into a triplet and a singlet system of total angular momentum $j = 1$ or $j = 0$ respectively.

3. The values $j$ resulting from the combination of two angular momenta are either all half integers or all integers.

14-5. SPIN UNDER SPACE REFLECTION AND TIME INVERSION

We shall now examine the effect on a spin of the transformations of space reflection and time inversion.

The space reflection $\sigma$ is defined by the transformation:

$$Q \to -Q, \quad P \to -P, \quad \text{and} \quad S \to S. \tag{14-33}$$

It is evident that this is a canonical transformation: The commutation rules are invariant and there exists a unitary transformation $U$ such that

$$\begin{aligned} UQU^{-1} &= -Q \\ UPU^{-1} &= -P \\ [U, S] &= 0. \end{aligned} \tag{14-34}$$

The transformation $U$ with these properties is given by the same expression (13-87) as for the particle without spin.

The time inversion $\tau$ is defined by the transformation

$$Q \to Q, \quad P \to -P, \quad \text{and} \quad S \to -S. \tag{14-35}$$

Just as in the spinless case it follows that the operator which produces this transformation must be antiunitary. But it is not simply given by complex conjugation as in the spinless case.

This is so because the spin-matrices do not all reverse the sign under complex conjugation; only $S_2$ does. Thus the transformation of complex conjugation must be combined with a unitary transformation which commutes with $Q$ and $P$ and $S_2$ and anticommutes with $S_1$ and $S_3$. This transformation has the form $\sigma_2$. The time reversal transformation is thus finally given in the form $T = \sigma_2 K$, where $K$ stands for the operation of complex conjugation in the $x$-representation. With this expression for $T$ we find indeed

$$\begin{aligned} TQT^{-1} &= Q \\ TPT^{-1} &= -P \\ TST^{-1} &= -S. \end{aligned} \tag{14-36}$$

We mention incidentally that $T^2 = -I$, a relation which follows from the fact that

$$K^2 = I \quad \text{and} \quad K\sigma_2 K = -\sigma_2,$$

since $\sigma_2$ is purely imaginary. Thus a repetition of the transformation of time reversal does not restore the wave function of a pure state but it replaces it by its negative. The state is thereby not affected since it is represented by a ray in Hilbert space and both $\psi$ and $-\psi$ are evidently in the same ray. This is in contrast to the case without spin where $T^2 = +I$. The generalization of this result for general spin is left as a problem (5).
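The properties of $T = \sigma_2 K$ can be checked on a single two-component spinor, ignoring the dependence on $x$ (at a fixed point, $K$ acts simply as complex conjugation of the two components). The sketch below verifies $T^2 = -I$ and the sign reversal of the spin matrices, written as $T\sigma_r T = \sigma_r$, which is equivalent to $T\sigma_r T^{-1} = -\sigma_r$ because $T^{-1} = -T$. It also shows that $\psi$ and $T\psi$ are orthogonal, which bears on the hint to Problem 7 of this section. The spinor values and the names `apply`, `time_reverse`, `inner`, and `close` are illustrative assumptions.

```python
SIGMA = [
    [[0, 1], [1, 0]],       # sigma_1
    [[0, -1j], [1j, 0]],    # sigma_2
    [[1, 0], [0, -1]],      # sigma_3
]


def apply(matrix, spinor):
    """Apply a 2x2 matrix to a two-component spinor."""
    return [sum(matrix[i][k] * spinor[k] for k in range(2)) for i in range(2)]


def time_reverse(spinor):
    """T = sigma_2 K: complex-conjugate the components, then apply sigma_2."""
    conjugated = [complex(c).conjugate() for c in spinor]
    return apply(SIGMA[1], conjugated)


def inner(a, b):
    """Scalar product, antilinear in the first argument (an assumed convention)."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def close(a, b, tol=1e-12):
    """True if two spinors agree componentwise within tol."""
    return all(abs(x - y) < tol for x, y in zip(a, b))


psi = [0.6 + 0.2j, -0.3 + 0.7j]  # arbitrary illustrative spinor

twice = time_reverse(time_reverse(psi))
print("T^2 psi == -psi:", close(twice, [-c for c in psi]))
print("|(psi, T psi)| =", abs(inner(psi, time_reverse(psi))))
for r, sigma in enumerate(SIGMA, start=1):
    lhs = time_reverse(apply(sigma, time_reverse(psi)))
    print("T sigma_%d T psi == sigma_%d psi:" % (r, r), close(lhs, apply(sigma, psi)))
```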

PROBLEMS

1. In the standard representation of any angular momentum the complex conjugation operation $K$ has the effect

$$KL_1K = L_1, \qquad KL_2K = -L_2, \qquad KL_3K = L_3.$$

2 The unitary operator X with the property

X =

3. For the operator $X$ of Problem 2 one has $KXK = X$.

4. For a fixed value $l$ for the angular momentum operator ($\mathbf{L}^2 = l(l + 1)$; $l = 0, \tfrac{1}{2}, 1, \ldots$) we have

5. For a particle with a spin of value $s$ the time inversion operator $T$ has the property

$$T^2 = (-1)^{2s}.$$

6. If $T$ is a time inversion operator then $T' = e^{i\delta}T$ for any real $\delta$ is, too. Furthermore

$$T'^2 = T^2.$$

7. Let $H = (1/2\mu)P^2 + v(Q)$ be the evolution operator for a spinning particle of spin ½ in an external electric field. (No magnetic field, $\mathbf{a} = 0$.) For such a system every eigenstate of $H$ is at least doubly degenerate (Kramers degeneracy). (Hint: Consider the transformation $T$, show that it commutes with $H$ and that for any eigenvector $\psi$ of $H$, $\psi$ and $T\psi$ are in different rays.)

14-6. SPIN IN AN EXTERNAL FORCE FIELD

In this section we shall examine the dynamical structure of a spin in an external field of force. In other words we shall determine the most general form of the evolution operator $H$ in the presence of a spin. We have shown in Section 13-4 that under the hypothesis of Galilei invariance the evolution operator for a spinless particle must necessarily be of the form (13-60). In the case of a spin the arguments which led to this expression are no longer valid, since we used explicitly that the position operator generates a maximal abelian algebra. In the presence of a spin degree of freedom this is not the case and the evolution operator may assume other forms involving the spin operator. We shall now examine the other possibilities.

The possibilities for the other forms of $H$ are restricted if we impose the conditions that the evolution operator be invariant not only under rotations but also under space reflections and time inversions.

We shall first of all assume that in the absence of an external field the evolution is governed by the operator $H_0$ as it follows from Galilei invariance for each spin-component separately. The spin-degree of freedom is thus affected only in the presence of an external field.

We consider the case of a static field only. There are two vectors which must be considered: $\mathbf{e} = -\nabla v$ and $\mathbf{h} = \nabla \times \mathbf{a}$. Under space inversion $\mathbf{e}$ behaves like a polar vector and $\mathbf{h}$ like an axial vector. Under time reversal $\mathbf{e}$ is invariant while $\mathbf{h}$ changes sign.

There are exactly two invariants which can be formed with $\mathbf{e}$, $\mathbf{h}$ and the other dynamical variables. They are of the form (Problem 1)

= —c4e 0 x S), H2 = —/3(h . S). (14—37)

Here $\alpha$ and $\beta$ are two constants to be determined by experiments. Invariance arguments alone do not suffice for their determination. The values obtained for them from the spectra for the atoms are

1h h 1

cc = — ——--i, . (14—38)2mc m /i

In conventional units where the external field $\mathbf{e}$ is measured in e.s.u. units, $\mathbf{h}$ in gauss, and the energy operator is given in ergs, they take on the values

ehfl= (14—38)'

2 \mc,' mc

The fact that in these expressions the constant $c$, the velocity of light, appears explicitly indicates that they cannot be calculated in a purely nonrelativistic theory. On the other hand, in the relativistic theory of the electron they can be calculated, and they are found to be in agreement with the experimental values.

The actual value of these constants played an important role in the early days of quantum mechanics. Indeed the value $\beta$ is twice as large as one would expect on the basis of a simple model of the electron, giving rise to the remarkable phenomenon of the anomalous Zeeman effect.

On the other hand the constant $\alpha$, also called the spin-orbit constant, is half as large as one would expect on the basis of the magnetic field in the rest system of the electron due to the electric field $\mathbf{e}$. This factor ½ was explained by Thomas on the basis of a more exact quasi-relativistic spin theory. It is often referred to as the Thomas factor [4].

In a spherical external field $v(r)$ the term $H_1$ can be transformed since

$$\mathbf{e} = -\nabla v = -\frac{\mathbf{x}}{r}\frac{dv}{dr},$$

so that $H_1$ becomes proportional to

$$(\mathbf{S}\cdot\mathbf{L})\,\frac{1}{r}\frac{dv}{dr}.$$

In this form it is seen that the term $H_1$ produces a rotation-invariant coupling between spin and orbital angular momenta. This term is responsible for the fine-structure splitting of the energy levels in the atoms.

PROBLEMS

1. For a particle with spin the only invariants under space rotations, space reflections, and time inversion which depend on the external forces through $\mathbf{e}$ and $\mathbf{h}$ are of the form

e-(OxS) and (h-S)

2. The hyperfine interaction is zero for an electron with orbital angular momentum $l = 0$.

3. A charged particle with charge e is placed in a uniform magnetic field H.The additional term in the energy operator (= h times the evolution operator)which is linear in H is of the form

(H L)2nic

where L is the orbital angular momentum.

4. A spin in a uniform magnetic field H is governed by the evolution operator

e (H5)2rnc

so that for any state W in the two—dimensional spin space

[ii U'] S), U]

where $\omega = eH/2mc$ is the Larmor frequency and $\boldsymbol{\alpha} = (1/H)\mathbf{H}$ is the unit vector in the direction of the magnetic field.


5. Define the spin vector S with components

(r=1,2,3).It has the length

3

r j

6. For a spin in a uniform magnetic field $\mathbf{H}$ (cf. Problem 4), the spin vector of an electron satisfies the equation of motion $\dot{\mathbf{s}} = \omega(\mathbf{s}\times\boldsymbol{\alpha})$, where $\omega = eH/2mc$ (Larmor precession).

14-7. ELEMENTARY PARTICLE WITH ARBITRARY SPIN

In this last section we shall return to the case of the free particle but with an arbitrary value of the spin. We could, in principle, pursue the same road that we have used for the case of spin $\tfrac{1}{2}$. However, we shall not do this here, but shall instead use this occasion to show how the imprimitivity theorem, which we quoted in Section 12-3, is sufficient to furnish us a classification of all the possible types of elementary particles which can occur in nature. Let us review the content of this theorem and apply it to the case we wish to study.

We are given a group $G$ and a homogeneous space $M$. In our case $G$ is the group of Euclidean motions (or, rather, its simply-connected covering group), and $M$ is the three-dimensional Euclidean space. Both are separable and locally compact, and $M$ has an invariant measure. The action of the group on $M$ is expressed as a continuous function from $M \times G$ to $M$ which, in the example, is transitive. If $q \in M$ and $x \in G$, this function is denoted by $[q]x$.

A system of imprimitivities is a $\sigma$-additive projection-valued measure $\Delta \to E_\Delta$ on the Borel sets $\Delta$ of $M$ and a representation $x \to U_x$ of $G$ by unitary operators, all in a Hilbert space $\mathcal{H}$ and such that

$U_x^{-1} E_\Delta U_x = E_{[\Delta]x}$. (14-39)

The physical interpretation of such a system is as follows: The $E_\Delta$ represent the propositions of finding the particle in the set $\Delta$, and property (14-39) expresses the homogeneity and isotropy of the physical space in which the particle is located. An irreducible representation of such a system then represents an elementary particle. We therefore wish to know all the irreducible representations of the system (14-39). The answer is contained in the imprimitivity theorem (cf. Section 12-3).

Let $q_0$ be an arbitrary point of $M$ and let $G_0$ be the set of all elements $\xi \in G$ which leave the point $q_0$ fixed: $[q_0]\xi = q_0$. In our example the set $G_0$ will consist of the subgroup of all rotations with a rotation axis passing through $q_0$. Thus $G_0$ is simply the rotation group (which we shall always replace by its covering group). The elements $\xi \in G_0$ can thus be identified with the rotations $R$, while a general element $x \in G$ is represented by $(R, \boldsymbol{\alpha})$, which means the rotation $R$ followed by the translation given by the vector $\boldsymbol{\alpha}$. With this convention the composition law of the Euclidean group is given by

$(R_1, \boldsymbol{\alpha}_1)(R_2, \boldsymbol{\alpha}_2) = (R_1R_2, \boldsymbol{\alpha}_1 + R_1\boldsymbol{\alpha}_2)$. (14-40)
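The composition law (14-40) can be checked numerically on a few sample elements. The following Python sketch is an illustration only and not part of the original argument: the helper names (`compose`, `act`, `rot_z`, `rot_x`) and the sample values are assumptions, and the group element is taken to act on points from the left, $q \to Rq + \boldsymbol{\alpha}$, which need not coincide with the right-action notation $[q]x$ used in the text.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(r, v):
    """Apply the 3x3 matrix r to the 3-vector v."""
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def compose(x1, x2):
    """Composition law (14-40): (R1, a1)(R2, a2) = (R1 R2, a1 + R1 a2)."""
    (r1, a1), (r2, a2) = x1, x2
    r1a2 = apply(r1, a2)
    return (matmul(r1, r2), [a1[i] + r1a2[i] for i in range(3)])

def act(x, q):
    """Rotate the point q by R, then translate by a (left-action assumption)."""
    r, a = x
    rq = apply(r, q)
    return [rq[i] + a[i] for i in range(3)]

def flatten(x):
    """Flatten a group element (R, a) into a list of 12 numbers."""
    r, a = x
    return [entry for row in r for entry in row] + list(a)

def close(u, v, tol=1e-12):
    return all(abs(p - q) <= tol for p, q in zip(u, v))

x1 = (rot_z(0.7), [1.0, -2.0, 0.5])
x2 = (rot_x(-1.1), [0.3, 0.4, -1.5])
x3 = (rot_z(2.0), [-0.6, 0.0, 0.9])
q = [0.2, 1.3, -0.8]

# Acting with the product x1 x2 equals acting with x2 first and then with x1.
product_matches_action = close(act(compose(x1, x2), q), act(x1, act(x2, q)))

# The law is associative, as a group composition must be.
associative = close(flatten(compose(compose(x1, x2), x3)),
                    flatten(compose(x1, compose(x2, x3))))

print(product_matches_action, associative)
```

Both checks rely only on the form of (14-40); a different action convention would change `act` but not `compose`.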

We shall continue with the general notation, however, until it is necessary touse specific properties of G and M.

Let $\xi \to L_\xi$ be a particular irreducible representation of the little group $G_0$ in a Hilbert space $\mathcal{H}_0$. The induced representation consists of operators which act on functions $f(x)$ on the group with values in $\mathcal{H}_0$ and satisfying

$f(\xi x) = L_\xi f(x)$. (14-41)

Let $\mu$ be the quasi-invariant measure on $M$, and define the equivalent measure $\mu_x$ by setting $\mu_x(\Delta) = \mu([\Delta]x)$. Let $\rho_x(q)$ be the Radon-Nikodym derivative of $\mu_x$ with respect to $\mu$; for $q = [q_0]y$ we also write it as $\rho_x(y)$. We then define

$(U^L_x f)(y) = \sqrt{\rho_x(y)}\,f(yx)$. (14-42)

(In our case we may take Lebesgue measure and assume $\rho_x = 1$, but for the time being we retain the more general formulas.) One can then prove that (14-42) is a unitary representation of $G$. Furthermore we have also obtained a canonical system of imprimitivities by setting

$(E_\Delta f)(x) = \chi_\Delta([q_0]x)\,f(x)$, (14-43)

where $\chi_\Delta$ denotes the characteristic function of the set $\Delta$.

The last two formulas contain the entire kinematics of nonrelativistic elementary particles. However, in this form they have little resemblance to what one might have expected as a generalization of the spin-$\tfrac{1}{2}$ case.

This generalization would have looked as follows: The representation space $\mathcal{H} = L^2_\mu(M, \mathcal{H}_0)$ is the Hilbert space of functions $F(q)$ over $M$ with values in a "spin space" $\mathcal{H}_0$ of finite dimension. If $D(R)$ is the irreducible representation of $G_0$ with that dimension, then $U_{(R, \boldsymbol{\alpha})}$ acts on such functions according to

$(U_{(R, \boldsymbol{\alpha})}F)(q) = D(R)F([q](R, \boldsymbol{\alpha}))$, (14-44)

and the projections $E_\Delta$ are given by

$(E_\Delta F)(q) = \chi_\Delta(q)F(q)$, (14-45)

where $\chi_\Delta$ denotes the characteristic function of the set $\Delta$.

The chief difference between the last two pairs of equations is that in one case we are dealing with functions over the group $G$ and in the other case with functions over the space $M$.


We shall now show that these two representations of the system of imprimitivities are nevertheless equivalent in the sense that there exists an isometric linear mapping from $\mathcal{H}^L$ (the space of the functions $f$ over the group) onto $\mathcal{H}$ (the space of the functions $F$ over $M$) which transforms the two systems into each other.

Let $F(q)$ be a vector-valued function over $M$. We can change it into a function over the group by writing $q = [q_0]x$. But as a function of $x$ it would in general not satisfy Eq. (14-41). In order to obtain this relation we proceed as follows: Let $x \to B(x)$ be a function from $G$ to the unitary operators in $\mathcal{H}_0$ which satisfies the identity

$B(\xi x) = L_\xi B(x)$ for all $\xi \in G_0$, $x \in G$. (14-46)

We then define

$f(x) = B(x)F([q_0]x)$, (14-47)

and verify without effort that this function does satisfy Eq. (14-41). Conversely, if $f(x)$ is given, we can define a function $F(q)$ by the formula

$F([q_0]x) = B^{-1}(x)f(x)$, (14-48)

and again it is routine to verify that the $F$ thus defined is indeed a function on the right coset $q = [q_0]x$ only.

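We have thus established a one-to-one correspondence between $\mathcal{H}^L$ and $\mathcal{H}$ which is linear and isometric. If we denote by $\Omega$ this mapping, which takes $\mathcal{H}^L$ to $\mathcal{H}$, then we can calculate in a straightforward manner the form of the representation $U_x$ in $L^2_\mu(M, \mathcal{H}_0)$ defined by

$U_x = \Omega\,U^L_x\,\Omega^{-1}$, (14-49)

where $U^L_x$ is the induced representation (14-42). The result is

$(U_x F)(q) = \sqrt{\rho_x(y)}\,B^{-1}(y)B(yx)\,F([q]x)$ (14-50)

with $q = [q_0]y$. We find that for each fixed $x$ there appears a unitary operator

$Q_x(y) = B^{-1}(y)B(yx)$. (14-51)

Both $Q_x(y)$ and the numerical factor $\rho_x(y)$ satisfy a functional equation which can be obtained from (14-50) by the simple device of expressing the fact that $U_x$ is a representation of $G$. Thus if we operate on the left twice, once with $U_{x_1}$ and then with $U_{x_2}$, and then compare the result with the operation $U_{x_1x_2}$, we find the identities:

$\rho_{x_1}(y)\,\rho_{x_2}(yx_1) = \rho_{x_1x_2}(y)$,

$Q_{x_1}(y)\,Q_{x_2}(yx_1) = Q_{x_1x_2}(y)$. (14-52)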


We have now brought $U_x$ into the form

$(U_x F)(q) = \sqrt{\rho_x(y)}\,Q_x(y)\,F([q]x)$. (14-53)

This begins to resemble Eq. (14-44), but is not yet identical with it. In order to bring it into the final form we need to use special properties of the Euclidean group.

First we make somewhat less arbitrary the definition of $B(x)$ and thus of $Q_x(y)$. To this end we choose in each right coset (remember $M$ is in one-to-one correspondence with the right cosets) an arbitrary element $x_0$ so that $q = [q_0]x_0$. Then we fix $B(x)$ on each coset by setting $B(x_0) = I$. We then obtain

$B(x) = B(xx_0^{-1}x_0) = L_{xx_0^{-1}}B(x_0)$,

since $xx_0^{-1} \in G_0$. Thus, using $B(x_0) = I$, we have an explicit formula for $B(x)$:

$B(x) = L_{xx_0^{-1}}$. (14-54)

From this we find the standard form of $Q_x(y)$:

$Q_x(q) \equiv Q_x(y) = L^{-1}_{yy_0^{-1}}\,L_{yx(yx)_0^{-1}}$ $(q = [q_0]y)$. (14-55)

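The last step consists now in showing that for the Euclidean group the representative element $x_0$ in the coset $q$ can be so chosen that $Q_x(q)$ becomes simply $L(R, 0)$ (independent of $q$).

If $x = (R, \boldsymbol{\alpha})$ we choose $x_0 = (I, R^{-1}\boldsymbol{\alpha})$ and find, from the composition law (14-40), that

$xx_0^{-1} = (R, 0)$ with $x_0^{-1} = (I, -R^{-1}\boldsymbol{\alpha})$.

Let the element $y = (S, \boldsymbol{\beta})$ be similarly written. Then we find

$yy_0^{-1} = (S, 0)$, $yx = (SR, \boldsymbol{\beta} + S\boldsymbol{\alpha})$,

$(yx)_0 = (I, R^{-1}S^{-1}\boldsymbol{\beta} + R^{-1}\boldsymbol{\alpha})$,

$yx\,(yx)_0^{-1} = (SR, 0)$.

Substituting all these results into (14-55) we find

$Q_x(q) = L(R, 0)$. (14-56)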
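The coset computation leading to (14-56) can be verified on sample group elements. The sketch below is a numerical illustration under stated assumptions, not the author's derivation: group elements are pairs (R, a) composed by (14-40), the representative of x = (R, a) is taken as x0 = (I, R^(-1) a), and the helper names are hypothetical. It checks that the group element whose image under L gives Q_x(y) is the pure rotation (R, 0), whatever y is.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def apply(r, v):
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def compose(x1, x2):
    """Composition law (14-40): (R1, a1)(R2, a2) = (R1 R2, a1 + R1 a2)."""
    (r1, a1), (r2, a2) = x1, x2
    r1a2 = apply(r1, a2)
    return (matmul(r1, r2), [a1[i] + r1a2[i] for i in range(3)])

def inverse(x):
    """(R, a)^(-1) = (R^T, -R^T a); for a rotation R^T = R^(-1)."""
    r, a = x
    rt = transpose(r)
    return (rt, [-c for c in apply(rt, a)])

def representative(x):
    """Coset representative x0 = (I, R^(-1) a) of x = (R, a)."""
    r, a = x
    return (IDENTITY, apply(transpose(r), a))

def little_group_part(x):
    """x x0^(-1), which should be a pure rotation (zero translation)."""
    return compose(x, inverse(representative(x)))

def q_element(y, x):
    """Group element (y y0^-1)^-1 (yx (yx)0^-1), whose L-image is Q_x(y)."""
    return compose(inverse(little_group_part(y)),
                   little_group_part(compose(y, x)))

def close(u, v, tol=1e-12):
    return all(abs(p - q) <= tol for p, q in zip(u, v))

def flat(m):
    return [entry for row in m for entry in row]

x = (rot_z(0.7), [1.0, -2.0, 0.5])     # x = (R, alpha)
y = (rot_x(-1.1), [0.3, 0.4, -1.5])    # y = (S, beta)

g = q_element(y, x)
rotation_is_R = close(flat(g[0]), flat(x[0]))
translation_is_zero = close(g[1], [0.0, 0.0, 0.0])
print(rotation_is_R, translation_is_zero)
```

Changing `y` leaves the result unchanged, which is the numerical counterpart of the statement that $Q_x(q)$ is independent of $q$.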

This is the desired result. It suffices to identify $L(R, 0)$ with $D(R)$ of Eq. (14-44). The equivalence is therefore established and our expectations are fully justified. We have now also the assurance that the properties of localizability and homogeneity are sufficient to determine in a general way the kinematics of the system. The only arbitrariness which is left is the positive integer $n = \dim \mathcal{H}_0$ which determines the spin $s$ of the particle by the formula

$s = (n - 1)/2$.


REFERENCES

The early history of the discovery of the spin is recounted in the following papers:

1. R. KRONIG, "The Turning Point," Pauli Memorial Volume. New York: Interscience (1960); p. 5.

2. B. L. VAN DER WAERDEN, "The Exclusion Principle and Spin." Ibid. (1960), p. 199.

Angular momenta in quantum mechanics are discussed in many good textbooks. We refer, for instance, to:

3. A. MESSIAH, Mécanique Quantique. Paris: Dunod (1960); especially Volume 2, Chapter XIII.

4. L. H. THOMAS, Nature 117, 514 (1926); Phil. Mag. 17, 3 (1927).


CHAPTER 15

IDENTICAL PARTICLES

"This must be the wood, where things have no names. I wonder what'll become of my name when I go in?"

LEWIS CARROLL,

Alice in Wonderland

The formal theory of welding several similar particles into a single system is developed in Section 15-1. The central notion is the tensor product of the Hilbert spaces, referring to the individual particles, a notion which was already introduced in connection with the measuring process in Section 11-7. Here we extend it to several factors, and study some superficial aspects of the structure of the von Neumann algebras generated by the observables for such a system.

In Section 15-2 we develop the theory of the tensor product with several factors in somewhat greater detail and generality, in order to have the tools ready for expressing the notion of identity in quantum mechanics, introduced in the following section (15-3). Great care is taken to emphasize the difference in meaning for this notion in quantum mechanics and in classical mechanics. It is then formalized in the proper mathematical language for a system of two particles.

In Section 15-4 we begin the treatment of any number of identical particles. We introduce the two classes of statistics corresponding to the symmetrical and the antisymmetrical subspaces of the tensor product space. The possibility of intermediate cases (parastatistics) is mentioned, and it is shown that the existence of a complete set of commuting observables is not compatible with parastatistics. The formal treatment of the Bose gas is the subject of Section 15-5, and the Fermi gas, that of Section 15-6.

15-1. ASSEMBLY OF SEVERAL PARTICLES

The preceding sections were concerned with the quantum theory of a single particle. We now turn to the theory of an assembly of several particles. We begin with the theory of two particles: the generalization to an arbitrary finite number of particles is then easy.


When we consider a physical system consisting of two particles, we are dealing with a special case of the union of two systems into a joint larger system. The theory of this process was discussed in detail in Section 11-7, in connection with the measuring process. Although the purpose here is different, we can take over the formalism of that section in full.

Let $\mathcal{S}_1$ be the set of all observables of particle 1. They are represented by a family of self-adjoint operators in a Hilbert space $\mathcal{H}_1$. We shall assume that there are no superselection rules, so that $\mathcal{S}_1'' = \mathcal{N}_1 = \mathcal{B}(\mathcal{H}_1)$. This means that the von Neumann algebra $\mathcal{N}_1$ generated by $\mathcal{S}_1$ is the set of all bounded operators over $\mathcal{H}_1$.

Let $\mathcal{S}_2$ be the corresponding set of operators for particle 2, and assume also $\mathcal{S}_2'' = \mathcal{N}_2 = \mathcal{B}(\mathcal{H}_2)$.

The kinematical independence of the two sets of observables expresses itself with the equation

$\mathcal{S}_1 \subset \mathcal{S}_2'$, (15-1)

which is equivalent to the statement that any observable from the set $\mathcal{S}_1$ is compatible with any observable from the set $\mathcal{S}_2$.

Let us now consider the joint system (1 + 2). Denote by $\mathcal{S}$ the set of all observables for this system. Every such observable is a self-adjoint operator in the tensor product space. This set contains, first of all, observables of the form $A_1 \otimes I_2$ with $A_1 \in \mathcal{S}_1$ and $I_2$ the identity in $\mathcal{H}_2$, which represents the measurement of the quantity $A_1$ on the joint system (1 + 2). Likewise, it will contain the observable $I_1 \otimes A_2$ with $I_1$ the identity in $\mathcal{H}_1$ and $A_2 \in \mathcal{S}_2$. This observable represents the observation of $A_2$ on the joint system. Since these last two operators commute, they can be measured independently, and such a measurement constitutes also a measurement of their product $A_1 \otimes A_2$, which is thus also an observable.

However, not every observable is of the form $A_1 \otimes A_2$. For instance, for two particles the total energy is not of this form, even in the absence of a dynamical interaction. In fact, any quantity which is additive in the constituent parts cannot be of this form. We have thus

$\mathcal{S}_1 \times \mathcal{S}_2 \subset \mathcal{S}$, (15-2)

where $\mathcal{S}_1 \times \mathcal{S}_2$ denotes the set of observables of the form $A_1 \otimes A_2$; $A_i \in \mathcal{S}_i$.
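These statements can be made concrete in a small finite-dimensional model. The following Python sketch is an illustration only: the two-dimensional matrices `A1` and `A2` are arbitrary sample values standing in for observables, and `is_product` is a hypothetical helper which tests whether a matrix has the form $A \otimes B$ by checking that all of its blocks are multiples of one common block. It shows that the amplified operators commute, that their product is $A_1 \otimes A_2$, and that the additive combination is not itself of product form.

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker (tensor) product of two square matrices as nested lists."""
    na, nb = len(a), len(b)
    return [[a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(na * nb)]
            for i in range(na * nb)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def is_product(mat, n1, n2):
    """True if mat = A (x) B with A of size n1 and B of size n2.

    Every n2 x n2 block of A (x) B is a scalar multiple of B, and conversely.
    """
    blocks = [[[mat[i * n2 + p][j * n2 + q] for q in range(n2)]
               for p in range(n2)]
              for i in range(n1) for j in range(n1)]
    nonzero = [b for b in blocks if any(v != 0 for row in b for v in row)]
    if not nonzero:
        return True  # the zero operator is 0 (x) 0
    ref = nonzero[0]
    p, q = next((p, q) for p in range(n2) for q in range(n2) if ref[p][q] != 0)
    for b in blocks:
        c = Fraction(b[p][q], ref[p][q])
        if any(b[i][j] != c * ref[i][j] for i in range(n2) for j in range(n2)):
            return False
    return True

A1 = [[1, 2], [2, -1]]   # sample self-adjoint (real symmetric) matrix
A2 = [[0, 1], [1, 3]]    # a second sample

amp1 = kron(A1, identity(2))   # A1 (x) I2
amp2 = kron(identity(2), A2)   # I1 (x) A2

commute = matmul(amp1, amp2) == matmul(amp2, amp1)
product_is_tensor = matmul(amp1, amp2) == kron(A1, A2)
total = add(amp1, amp2)        # additive quantity, e.g. a total energy

print(commute, product_is_tensor,
      is_product(kron(A1, A2), 2, 2), is_product(total, 2, 2))
```

Integer entries are used so that all comparisons are exact.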

Let us define the tensor product of two von Neumann algebras $\mathcal{N}_1 \otimes \mathcal{N}_2$ as the algebra generated by elements of the form $T_1 \otimes T_2$ with $T_1 \in \mathcal{N}_1$ and $T_2 \in \mathcal{N}_2$. In the notation adopted with Eq. (15-2) we may thus write

$\mathcal{N}_1 \otimes \mathcal{N}_2 = (\mathcal{N}_1 \times \mathcal{N}_2)''$. (15-3)

From Eq. (15-2) it follows first that

$\mathcal{S}' \subset (\mathcal{S}_1 \times \mathcal{S}_2)'$. (15-4)

One can prove furthermore [1] that

$(\mathcal{S}_1 \times \mathcal{S}_2)' = (\mathcal{N}_1 \times \mathcal{N}_2)'$. (15-5)

Hence, combining Eqs. (15-4) and (15-5), we find

$\mathcal{S}' \subset (\mathcal{N}_1 \times \mathcal{N}_2)'$. (15-6)

From this equation we obtain, after once more taking the commutant,

$(\mathcal{N}_1 \times \mathcal{N}_2)'' \subset \mathcal{S}'' = \mathcal{N}$, (15-7)

where $\mathcal{N}$ is the von Neumann algebra generated by all the observables $\mathcal{S}$ on the system.

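Let us examine the commutant of $\mathcal{N}$. According to Eq. (15-7), we obtain

$\mathcal{N}' \subset (\mathcal{N}_1 \otimes \mathcal{N}_2)' = \mathcal{N}_1' \otimes \mathcal{N}_2'$.

The affirmation contained in the last equality sign is by no means obvious. It can be proved only for the so-called semifinite algebras [2]. Fortunately this hypothesis is satisfied for our case where $\mathcal{N}_1 = \mathcal{B}(\mathcal{H}_1)$ and $\mathcal{N}_2 = \mathcal{B}(\mathcal{H}_2)$. These algebras are known to be semifinite [3].

Since $\mathcal{N}_1$ and $\mathcal{N}_2$ are both irreducible, it follows from Eq. (15-7) that

$\mathcal{N}' = \{\lambda I\}$.

Thus $\mathcal{N}$ is irreducible too and $\mathcal{N}$ consists of all bounded operators over $\mathfrak{G} = \mathcal{H}_1 \otimes \mathcal{H}_2$. We summarize this result with the following:

Theorem: If $\mathcal{S}_1$ is a set of self-adjoint operators in $\mathcal{H}_1$ generating the algebra of all bounded operators in $\mathcal{H}_1$: $\mathcal{S}_1'' = \mathcal{B}(\mathcal{H}_1)$, and if $\mathcal{S}_2$ is a similar set of operators in $\mathcal{H}_2$, then the set of operators $A_1 \otimes A_2$ in the product space $\mathfrak{G} = \mathcal{H}_1 \otimes \mathcal{H}_2$ ($A_1 \in \mathcal{S}_1$, $A_2 \in \mathcal{S}_2$) generates the algebra of all bounded operators in $\mathfrak{G}$.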

With this theorem established, we can now proceed with the description of the system of two particles. Let $\mathcal{C}_1 \subset \mathcal{S}_1$ be a complete system of compatible observables in $\mathcal{S}_1$, that is, a system of commuting self-adjoint operators from $\mathcal{S}_1$ such that $\mathcal{C}_1' = \mathcal{C}_1''$. For an elementary particle such a system always exists (cf. the definition of elementary particle in Section 12-4). We denote by $\Lambda_1$ the Cartesian product of the spectra of the system of operators in $\mathcal{C}_1$.

Let $\mathcal{C}_2 \subset \mathcal{S}_2$ be a similar set in $\mathcal{S}_2$, and $\Lambda_2$ the Cartesian product of the spectra of the operators in $\mathcal{C}_2$.

The spectral representation of the set $\mathcal{C}_1$ is given by functions $\psi_1(\lambda_1)$ ($\lambda_1 \in \Lambda_1$) in the Hilbert space $L^2_{\rho_1}(\Lambda_1)$, where $\rho_1$ is a measure in the uniquely defined measure class determined by $\mathcal{C}_1$ in the space $\Lambda_1$. Similarly the spectral representation of $\mathcal{C}_2$ is given by functions $\psi_2(\lambda_2)$ ($\lambda_2 \in \Lambda_2$) in the Hilbert space $L^2_{\rho_2}(\Lambda_2)$.


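Every operator $A_1 \in \mathcal{C}_1$ may be amplified to an operator $A_1 \otimes I_2$ in $\mathfrak{G} = \mathcal{H}_1 \otimes \mathcal{H}_2$. Similarly every operator $A_2$ in $\mathcal{H}_2$ may be amplified to an operator $I_1 \otimes A_2$ in $\mathfrak{G}$. We shall denote these sets of amplified operators again by $\mathcal{C}_1$ and $\mathcal{C}_2$. Their union will be denoted by $\mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2$. It is now possible to prove that if $\mathcal{C}_1$ and $\mathcal{C}_2$ are both complete sets of commuting operators in $\mathcal{H}_1$, $\mathcal{H}_2$ respectively, then $\mathcal{C}$ is a complete commuting set of operators in $\mathfrak{G} = \mathcal{H}_1 \otimes \mathcal{H}_2$ (Problem 1).

The spectral representation of this set is then given by functions $\psi(\lambda_1, \lambda_2) \in L^2_\rho(\Lambda_1 \times \Lambda_2)$ over the Cartesian product space $\Lambda_1 \times \Lambda_2$, where $\rho$ is the product measure $\rho_1 \times \rho_2$.

In most calculations in applications one chooses the position and the spin for the sets $\mathcal{C}_i$, so that for particles with spin $\tfrac{1}{2}$, for instance, $\Lambda = \{\mathbf{x}, s\}$ ($\mathbf{x} \in R^3$, $s = \pm 1$). For elementary particles with spin $\tfrac{1}{2}$ this set is always a complete set of commuting observables by the definition of elementarity.

The foregoing considerations can easily be extended to the case of any finite number of particles. Successive applications of the preceding theorem established for two Hilbert spaces give the following result:

Theorem: If $\mathcal{S}_i$ is a set of self-adjoint operators in $\mathcal{H}_i$ generating the algebra of all bounded operators in $\mathcal{H}_i$ ($i = 1, \ldots, n$), so that $\mathcal{S}_i'' = \mathcal{N}_i = \mathcal{B}(\mathcal{H}_i)$, then the set of operators $A_1 \otimes A_2 \otimes \cdots \otimes A_n$ in the product space $\mathfrak{G} = \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \cdots \otimes \mathcal{H}_n$ with $A_i \in \mathcal{S}_i$ generates the algebra of all bounded operators in $\mathfrak{G}$.

With the same reasoning one can establish that if $\mathcal{C}_i \subset \mathcal{S}_i$ are complete sets of commuting operators in $\mathcal{H}_i$ then, after amplification of the operators from $\mathcal{H}_i$ to $\mathfrak{G}$, we can affirm that $\mathcal{C}_1 \cup \mathcal{C}_2 \cup \cdots \cup \mathcal{C}_n$ is a complete set of commuting operators in $\mathfrak{G}$. The spectral representation for this set is given by functions $\psi(\lambda_1, \lambda_2, \ldots, \lambda_n)$ in a space $L^2_\rho(\Lambda_1 \times \Lambda_2 \times \cdots \times \Lambda_n)$, where $\rho$ is the product measure $\rho = \rho_1 \times \rho_2 \times \cdots \times \rho_n$ of the measures induced in $\Lambda_i$ by $\mathcal{C}_i$.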

PROBLEMS

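1. If $\mathcal{C}_1$ is a complete abelian set of self-adjoint operators in a Hilbert space $\mathcal{H}_1$, and $\mathcal{C}_2$ is a similar set in $\mathcal{H}_2$, then the set of operators $A_1 \otimes I_2$, $I_1 \otimes A_2$, with $A_1 \in \mathcal{C}_1$ and $A_2 \in \mathcal{C}_2$, is a complete abelian set in $\mathcal{H}_1 \otimes \mathcal{H}_2$. [Hint: Show that $g = g_1 \otimes g_2$ is a cyclic vector in $\mathcal{H}_1 \otimes \mathcal{H}_2$ if $g_1 \in \mathcal{H}_1$, $g_2 \in \mathcal{H}_2$ are cyclic vectors for $\mathcal{C}_1$ and $\mathcal{C}_2$.]

*2. If $\psi_1(\lambda_1) \in L^2_{\rho_1}(\Lambda_1)$ is the spectral representation for a set $\mathcal{C}_1$ of operators in a Hilbert space $\mathcal{H}_1$ and $\psi_2(\lambda_2) \in L^2_{\rho_2}(\Lambda_2)$ a similar representation for another set $\mathcal{C}_2$ in another Hilbert space $\mathcal{H}_2$, then $\psi(\lambda_1, \lambda_2) \in L^2_\rho(\Lambda_1 \times \Lambda_2)$ is a spectral representation for the set $\mathcal{C}$ in the Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2$. Furthermore the measure $\rho = \rho_1 \times \rho_2$ [consult: J. M. Jauch and B. Misra, Helv. Phys. Acta 38, 30 (1965)].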


15-2. MATHEMATICAL DIGRESSION: THE MULTIPLE TENSOR PRODUCT

The theorems stated at the end of the last section make use of the multiple tensor product which we have not yet discussed explicitly. We can, of course, always proceed by iteration, using the abstract definition of the product of two spaces given in Section 11-7. However, it is desirable, especially for the treatment of identical particles, to treat all the individual subsystems and their Hilbert spaces on a symmetrical basis. For this reason we shall briefly sketch here how the multiple tensor product may be defined for any number of factors in a symmetrical manner.

We recall the definition of the tensor product given in Section 11-7. That such an object exists can always be shown by explicitly constructing it. There are some advantages in constructing the tensor product as a function space independently of any representation of the Hilbert spaces, since its uniqueness is then much easier to prove and certain general aspects of the tensor product become very transparent.

We define a bounded conjugate multilinear functional on the Cartesian product $\mathcal{H}_1 \times \mathcal{H}_2 \times \cdots \times \mathcal{H}_n$ as a complex-valued function

$T(x_1, x_2, \ldots, x_n)$ $(x_i \in \mathcal{H}_i)$

with the following properties:

$T(x_1, \ldots, x_i + y_i, \ldots, x_n) = T(x_1, \ldots, x_i, \ldots, x_n) + T(x_1, \ldots, y_i, \ldots, x_n)$,

$T(x_1, x_2, \ldots, \lambda x_i, \ldots, x_n) = \lambda^* T(x_1, x_2, \ldots, x_i, \ldots, x_n)$,

$|T(x_1, x_2, \ldots, x_n)| \leq C\,\|x_1\|\,\|x_2\| \cdots \|x_n\|$ $(C < \infty)$.

Such functionals always exist, for instance,

$T(x_1, x_2, \ldots, x_n) = (x_1, a_1)(x_2, a_2) \cdots (x_n, a_n)$ (15-8)

for an arbitrary set of vectors $a_i \in \mathcal{H}_i$. We denote this particular functional by

$T = a_1 \otimes a_2 \otimes \cdots \otimes a_n$.

The notation anticipates the role which it is going to play in the construction of the tensor product.

We first note that the set of all bounded functionals is a linear vector space: If $T$ and $S$ are two such functionals, then so is $(T + S)$, as well as $\lambda T$ for all complex $\lambda$. The axioms of the vector space for these operations are verified.

Next let us define the scalar product of two functionals $T$ and $S$. We define

$(T, S) = \sum_{r_1, \ldots, r_n} T^*(\varphi_{r_1}, \ldots, \varphi_{r_n})\,S(\varphi_{r_1}, \ldots, \varphi_{r_n})$,

where $\varphi_{r_i} \in \mathcal{H}_i$ is a complete orthonormal system in $\mathcal{H}_i$. The indicated sum need not be convergent. We can prove (by Schwarz's inequality) its convergence if we admit only such functionals $T$ for which $(T, T) < \infty$. This is the case, for instance, for the special functional $T = a_1 \otimes a_2 \otimes \cdots \otimes a_n$ for which we find easily

$(T, T) = \|a_1\|^2\,\|a_2\|^2 \cdots \|a_n\|^2$.

It is then possible to prove that the indicated sum in the definition of $(T, S)$ is actually independent of any particular choice of the orthonormal systems $\varphi_{r_i}$.

Let us now consider the set of all functionals $T$ of the form $T = a_1 \otimes a_2 \otimes \cdots \otimes a_n$. The set of all finite linear combinations of such functionals is then a linear manifold. The closure in the norm topology of this manifold is a Hilbert space and this space is the tensor product $\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \cdots \otimes \mathcal{H}_n$ of the spaces $\mathcal{H}_i$.

The separability of this space is not hard to prove. We note that this construction can be carried through with some caution for an infinite sequence of spaces. The caution refers to the convergence of infinite products as shown by Eq. (15-8). Separability is, however, in general not true. We shall not use the infinite product.

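In the next section we shall need certain subspaces of the space $\mathfrak{G}$ which we define as follows: To each $T \in \mathfrak{G}$ of the form $T = a_1 \otimes a_2 \otimes \cdots \otimes a_n$ we can associate $PT$ defined by

$PT = a_{i_1} \otimes a_{i_2} \otimes \cdots \otimes a_{i_n}$,

where $P$ stands for the permutation

$P = \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}$.

We then define

$\Pi_+ T = \frac{1}{n!}\sum_P PT$,

where the summation is extended to all permutations $P$. The operator $\Pi_+$ thus defined on all $T$ of the special form $a_1 \otimes a_2 \otimes \cdots \otimes a_n$ can be extended by linearity to the entire space $\mathfrak{G}$, and it is a projection operator on this space. Its range defines a linear subspace $\mathfrak{G}_+$.

In a similar way we define the space

$\mathfrak{G}_- = \Pi_-\mathfrak{G}$ with $\Pi_- T = \frac{1}{n!}\sum_P \delta_P\,PT$.

Here the $\delta_P$ are the signatures of the permutations $P$, and the sum extends over all permutations.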


15-3. THE NOTION OF IDENTITY IN QUANTUM MECHANICS

The elementary particles known to occur in nature fall into a finite number of distinct classes. The members of one and the same class all have the same properties. It would be possible logically that every individual particle have certain identifying characteristics which would enable us to distinguish it, at least in principle, from all the other particles in the same class. This does not seem to be the case.

The problem of identity which we are facing here has for thousands of years been one of the most baffling in metaphysical speculations. It separated the schoolmen who were involved in the great controversy of the reality of universals. Thomas Aquinas held that the only difference between identical material objects was accidental, their essences being considered by him exactly alike. Duns Scotus, on the other hand, held that there were always differences in essence between two different individual things. This same view was held by Leibnitz who defended it in the celebrated correspondence with Clarke against the views of Newton and his followers. In modern philosophy the problem still exists in a modified form in the question concerning the meaning of proper names for individual objects [9].

In physics the problem reappears as an empirical question: "Can two individual physical systems always be distinguished, or are there systems which are exactly alike?"

This is a question which can be brought down from the purely speculative level to the empirical level. It suffices to remind ourselves how we would establish a difference between two physical systems. The only way we can establish a difference is by measuring the properties of the two systems and seeing whether they show any difference.

Here a first difficulty appears which we must overcome before we can use this criterion. It is obvious that two systems cannot give the same results for all the measurable properties if they are not in the same state. We can only expect identical results for identical systems if they were prepared exactly the same way, that is, if their states are identical. This remark leads us to emphasize again the distinction between extrinsic properties which depend on the state of the system and intrinsic properties which are independent of the state. This distinction corresponds exactly to the accidental and essential qualities of the scholastic philosophers.

It is also the same distinction that we made in Chapter 5 where we showed that intrinsic properties are expressed in the lattice structure of the proposition system. For elementary particles the proposition system can be completely characterized by the property of localizability and the value of a few constants such as mass, charge, and spin (and possibly some others).

We are thus led to define identity for elementary particles in the following sense: Two elementary particles are identical if they agree in all their intrinsic properties.


In this sense two electrons, for instance, are identical and so are two protons or two neutrons. But a negaton (negative electron) differs from its positive counterpart, the positon, in the sign of the charge. Likewise an electron differs from a muon by the value of its mass and its lifetime, although the two particles seem to be identical in all the other intrinsic properties.

There is an important difference between the notion of identity in classical and in quantum mechanics to which we want to draw particular attention. Although classical particles may be identical in the above sense, they can still be distinguished from each other by their extrinsic properties, for instance their initial conditions at a given instant of time. A pair of such particles can, for instance, be distinguished, and thus identified, at all times by the fact that the initial state evolves continuously in time. The evolution of the states in classical mechanics is such that at any later time we can keep track of a particle with given initial conditions by following it along a continuous path in phase space. Thus, although two particles may be mechanically identical, they can in a classical system be identified at all times. We might say the particles can be "named" and, although the names are purely conventional, they serve at least to distinguish them from each other.

This possibility disappears for quantum systems. In such systems there is no orbit which could be retraced by continuity to some initial value. In fact, if the wave functions of the two particles overlap to any extent, they intertwine so thoroughly that their identity is lost. Thus in quantum mechanics the identification by initial conditions is not possible. This is the reason why the notion of identity in quantum physics is more fundamental than in classical physics.

Let us now express this notion in mathematical terms. We begin with the case of two particles. Let $\mathcal{S}_1$ be the set of observables for one of the particles. They are a family of self-adjoint operators in a Hilbert space $\mathcal{H}_1$. Let $\mathcal{S}_2$ be a similar set of operators in a second Hilbert space $\mathcal{H}_2$. The two sets are in every respect identical; they are merely two copies of the same set of observables.

The observables of the joint system will be a set of self-adjoint operators $\mathcal{S}$ in the Hilbert space $\mathfrak{G} = \mathcal{H}_1 \otimes \mathcal{H}_2$. If the two particles are merely similar but not identical, then $\mathcal{S}$ consists of operators of the form $A_1 \otimes A_2$, $A_1 \in \mathcal{S}_1$, $A_2 \in \mathcal{S}_2$, and functions of them. If the particles are identical, on the other hand, not all these operators can be observables because some of them distinguish between particle 1 and 2. Only those operators which do not distinguish between the two particles can be observables.

We can express this in a mathematical language by introducing the permutation operator $P$ which interchanges the two particles. To this operator there corresponds a unitary transformation $U_P$ of the Hilbert space $\mathfrak{G}$ defined as follows: Let $f$ be a vector in $\mathfrak{G}$ of the form $f = f_1 \otimes f_2$ with $f_1 \in \mathcal{H}_1$, $f_2 \in \mathcal{H}_2$. To every such vector we associate another one, $U_P f = f_2 \otimes f_1$. This defines the operator $U_P$ on all vectors of the form $f_1 \otimes f_2$. We now extend the operator $U_P$ by linearity to all linear combinations of vectors of this form and obtain in this manner the operator $U_P$ in the entire Hilbert space $\mathfrak{G}$. It is easily shown that $U_P$ is unitary (Problem 1).

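Let $W$ be a state and $A \in \mathcal{S}$ an observable for the joint system. The expectation value of $A$ is then $\langle A \rangle = \mathrm{Tr}\,WA$. If we carry out the transformation $U_P$ the state changes to $W' = U_P W U_P^{-1}$. For this state the expectation value of the observable $A$ is $\langle A \rangle' = \mathrm{Tr}\,W'A = \mathrm{Tr}\,(W U_P^{-1} A U_P)$, and since the two particles are identical it must agree with the original one:

$\mathrm{Tr}\,(W U_P^{-1} A U_P) = \mathrm{Tr}\,WA$. (15-9)

We shall now make the assumption that the observables are separated by the states. By this we mean: If $A_1$ and $A_2$ are two different observables, then there exists a state $W$ such that

$\mathrm{Tr}\,WA_1 \neq \mathrm{Tr}\,WA_2$. (15-10)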

Since Eq. (15-9) should be true for any state, it follows from Eq. (15-10) that

$A = U_P^{-1} A U_P$, (15-11)

or $U_P$ commutes with $A$.

Thus we see that only operators $A$ which commute with $U_P$ can represent observables of a system with two identical particles. The operators must be symmetrical in the individual particle variables. An operator which does have this property is, for instance, $A \otimes I_2 + I_1 \otimes A$. Another one is $A_1 \otimes A_2 + A_2 \otimes A_1$.
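The symmetry requirement can be tried out on small matrices. The sketch below is an illustration under stated assumptions: each particle is modelled by a two-dimensional space, `A` and `B` are arbitrary sample matrices, and `swap_matrix` is a hypothetical helper building the matrix of the exchange operator on product basis vectors. It checks that the two symmetrical combinations named above commute with the exchange, while the one-sided operator $A \otimes I$ does not.

```python
def kron(a, b):
    """Kronecker (tensor) product of two square matrices as nested lists."""
    na, nb = len(a), len(b)
    return [[a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(na * nb)]
            for i in range(na * nb)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def swap_matrix(d):
    """Exchange operator on C^d (x) C^d: e_i (x) e_j -> e_j (x) e_i."""
    u = [[0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            u[j * d + i][i * d + j] = 1
    return u

def commutes(a, b):
    return matmul(a, b) == matmul(b, a)

A = [[1, 2], [2, -1]]   # sample single-particle operators
B = [[0, 1], [1, 3]]
I = identity(2)
U = swap_matrix(2)

one_sided = kron(A, I)                      # acts on particle 1 only
additive = add(kron(A, I), kron(I, A))      # A (x) I + I (x) A
symmetrized = add(kron(A, B), kron(B, A))   # A1 (x) A2 + A2 (x) A1

print(commutes(one_sided, U), commutes(additive, U), commutes(symmetrized, U))
```

The one-sided operator fails the test because the exchange carries $A \otimes I$ into $I \otimes A$.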

Since the observables commute with $U_P$, they also commute with every function of $U_P$. We define the two projection operators (Problem 4):

$\Pi_+ = \tfrac{1}{2}(I + U_P)$, $\Pi_- = \tfrac{1}{2}(I - U_P)$. (15-12)

Every observable $A \in \mathcal{S}$ commutes with them and it follows that the projections $\Pi_\pm$ reduce all the $A \in \mathcal{S}$. If there are no superselection rules, then the only projections which reduce all the observables are 0 or $I$. In this case only two possibilities exist:

1) $\Pi_+ = I$, $\Pi_- = 0$, $U_P = I$,
(15-13)
2) $\Pi_+ = 0$, $\Pi_- = I$, $U_P = -I$.

In the first case the states and the operators are all reduced to the subspace

$\mathfrak{G}_+ = \Pi_+\mathfrak{G}$. (15-14)+

In the second case they are reduced to the subspace

$\mathfrak{G}_- = \Pi_-\mathfrak{G}$. (15-14)-
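The properties of $U_P$ and of the projections (15-12) can be verified explicitly when the single-particle space is finite-dimensional. The following sketch assumes a three-dimensional single-particle space (an arbitrary choice for illustration) and uses hypothetical helper names; exact rational arithmetic avoids rounding questions. The traces of the projections give the dimensions of the symmetrical and antisymmetrical subspaces.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def swap_matrix(d):
    """Exchange operator on C^d (x) C^d: e_i (x) e_j -> e_j (x) e_i."""
    u = [[0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            u[j * d + i][i * d + j] = 1
    return u

def half_combination(a, b, sign):
    """(a + sign * b) / 2 with exact rational entries."""
    return [[Fraction(x + sign * y, 2) for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

d = 3                               # assumed single-particle dimension
I = identity(d * d)
U = swap_matrix(d)
pi_plus = half_combination(I, U, +1)    # (I + U_P) / 2
pi_minus = half_combination(I, U, -1)   # (I - U_P) / 2
zero = [[0] * (d * d) for _ in range(d * d)]

unitary = matmul(U, transpose(U)) == I          # U is real, so U^T = U^(-1)
squares_to_identity = matmul(U, U) == I         # hence eigenvalues +1, -1
projections = (matmul(pi_plus, pi_plus) == pi_plus
               and matmul(pi_minus, pi_minus) == pi_minus)
orthogonal = matmul(pi_plus, pi_minus) == zero
dims = (trace(pi_plus), trace(pi_minus))        # d(d+1)/2 and d(d-1)/2

print(unitary, squares_to_identity, projections, orthogonal, dims)
```

For $d = 3$ the two subspaces have dimensions 6 and 3, which together exhaust the nine-dimensional product space; this complete splitting is special to two particles.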


If we choose a spectral representation $\varphi(\lambda_1, \lambda_2)$ with respect to some spectral set $\Lambda_1 \times \Lambda_2$ for the system (1 + 2), the functions $\varphi(\lambda_1, \lambda_2)$ satisfy

$\varphi(\lambda_1, \lambda_2) = \varphi(\lambda_2, \lambda_1)$ in case 1 (15-15)+

and

$\varphi(\lambda_1, \lambda_2) = -\varphi(\lambda_2, \lambda_1)$ in case 2. (15-15)-

The two cases can be physically distinguished, for instance, by the analysis of the spectrum of a two-particle system. In case (1) one speaks of Einstein-Bose statistics, in case (2) of Fermi-Dirac statistics.

It is known empirically that all the particles which have an integral multiple of the value of the spin fall under case (1), while all particles with half-integer value for the spin fall under case (2). Thus electrons satisfy Fermi-Dirac statistics [4, 5]. This relation between spin and statistics can be deduced from the relativistic theory of many-particle systems. So far it has not been possible to do the same within the limitations of the nonrelativistic theory. Thus in a nonrelativistic theory we must adopt it as an empirical fact.

PROBLEMS

1. The linear operator $U_P$ defined on the set of vectors $f \in \mathcal{H}_1 \otimes \mathcal{H}_2$ of the form $f = f_1 \otimes f_2$ is unitary.

2. The eigenvalues of $U_P$ of Problem 1 are $\pm 1$.

3. If $A = A_1 \otimes A_2$, then $U_P A U_P^{-1} = A_2 \otimes A_1$.

4. The two operators $\Pi_\pm = \tfrac{1}{2}(I \pm U_P)$ are projections which commute with all observables of a system of two identical particles.

5. Every system of two identical particles admits a complete set of commuting observables. [Hint: Establish $\mathcal{S}' \subset \mathcal{S}''$, and use Lemma 1, p. 704, Reference 8.]

6. If is the set of observables for n identical particles satisfying Einstein-Boseor Fermi-Dirac statistics, then the set of operators >< in ® at1have nonabelian superselection rules.

15-4. SYSTEMS OF SEVERAL IDENTICAL PARTICLES

In the last section we deduced from the notion of identity that a system oftwo identical particles can fall into one of two physically distinguishableclasses, the symmetrical or the antisymmetrical class. We now examine thesystem of several identical particles. We introduce a slight change of notation.The observables for any one of the single particles will be denoted by 6°,.They are operators in the Hilbert space t,. The Hubert space for thecombined system of n identical particles is the n-fold tensor product

=t Øat ® .. . ®r,. The set of observables // is a set of self-

Page 292: Foundations of Quantum Mechanics

15-4 SYSTEMS OF SEVERAL IDENTICAL PARTICLES 279

adjoint linear operators in the space The identity of the particles restrictsthis set to those operators which are symmetrical under the permutation ofthe n-particle variables. A general permutation is

Vi 2

It changes 1 into i1, 2 into i2, etc. To every such permutation we associate aunitary operator U1, operating in When operating on a vector f of theform ®f2 ® it changes it into ®f12 ®® j5. U1 is then extended to the entire space by linearity. The U1 thusdefined are a unitary representation of the permutation group (Problem 7).

The representation $P \to U_P$ is reducible. Not all the irreducible representations contained in $U_P$ are needed. Physically important are the two subspaces of $\mathcal{H}^n$ which contain the symmetrical and the antisymmetrical vectors.

A vector $f$ is symmetrical if $U_P f = f$ for all permutations $P$. It is antisymmetrical if $U_P f = \delta_P f$. The projection on the symmetrical subspace is defined by

$$\Pi_+ = \frac{1}{n!} \sum_P U_P. \tag{15-16$_+$}$$

The projection on the antisymmetrical space is

$$\Pi_- = \frac{1}{n!} \sum_P \delta_P U_P, \tag{15-16$_-$}$$

where the summations are extended over all $n!$ permutations $P$ and $\delta_P$ is $+1$ or $-1$ according to whether $P$ is even or odd (Problem 1).
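The two projections can be checked numerically in a small case. The following is a minimal sketch, assuming a finite-dimensional one-particle space whose basis vectors are labeled by integers; a vector in the $n$-fold tensor product is stored as a dictionary from index tuples to exact rational coefficients. The function names (`parity`, `apply_U`, `project`) are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def parity(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    sign, seen = 1, set()
    for start in range(len(perm)):
        if start in seen:
            continue
        length, k = 0, start
        while k not in seen:  # walk one cycle of the permutation
            seen.add(k)
            k = perm[k]
            length += 1
        if length % 2 == 0:  # a cycle of even length is an odd permutation
            sign = -sign
    return sign


def apply_U(perm, vec):
    """U_P on a vector {index tuple: coefficient}: factor j of the result is factor perm[j] of the input."""
    out = {}
    for idx, c in vec.items():
        new = tuple(idx[i] for i in perm)
        out[new] = out.get(new, 0) + c
    return out


def project(vec, n, sign):
    """Pi_+ (sign=+1) or Pi_- (sign=-1): the average of delta_P U_P over all n! permutations."""
    out = {}
    for perm in permutations(range(n)):
        weight = Fraction(parity(perm) if sign < 0 else 1, factorial(n))
        for idx, c in apply_U(perm, vec).items():
            out[idx] = out.get(idx, 0) + weight * c
    return {k: v for k, v in out.items() if v != 0}
```

Applying `project` twice returns the same vector, a symmetrized vector is annihilated by the antisymmetrizer, and an antisymmetrized vector picks up the factor $\delta_P$ under every `apply_U(perm, ...)`.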

For identical particles we must have

$$[A, U_P] = 0 \tag{15-17}$$

for all $A \in \mathcal{S}$ and all $P$ (Problem 2). It follows immediately that the projections $\Pi_\pm$ reduce the observables:

$$[A, \Pi_\pm] = 0. \tag{15-18}$$

There exist other projections which reduce $U_P$ and $A$. This shows that there occur other irreducible representations in $U_P$ besides the symmetrical and antisymmetrical ones. Physically they are of no importance since every kind of particle falls into one of the two classes $\Pi_+ = I$, $\Pi_- = 0$, or $\Pi_+ = 0$, $\Pi_- = I$. Thus the subspace in which the observables $A$ operate is either $\mathcal{H}^n_+ = \Pi_+ \mathcal{H}^n$ or $\mathcal{H}^n_- = \Pi_- \mathcal{H}^n$. The first case corresponds to Einstein-Bose statistics, the second to Fermi-Dirac statistics.
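A dimension count makes the existence of the other reducing subspaces concrete. The sketch below assumes a one-particle space of finite dimension $d$ (an assumption made only for illustration; the one-particle space in the text is in general infinite-dimensional) and uses the standard counting formulas for the symmetrical and antisymmetrical subspaces of the $n$-fold tensor product.

```python
from math import comb


def subspace_dimensions(d, n):
    """Dimensions of the full n-fold tensor product of a d-dimensional space,
    of its symmetrical and antisymmetrical subspaces, and of the remainder."""
    total = d ** n
    symmetric = comb(d + n - 1, n)   # multisets of n indices chosen from d
    antisymmetric = comb(d, n)       # sets of n distinct indices chosen from d
    return total, symmetric, antisymmetric, total - symmetric - antisymmetric
```

For two particles the remainder is zero, so the symmetrical and antisymmetrical vectors already span the whole space. From three particles on, the remainder is positive (for instance `subspace_dimensions(2, 3)` leaves four dimensions unaccounted for), which is where parastatistics would have to live.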

The question has often been discussed whether some particles might obey a parastatistics where the observables are reduced to some of the other reducing subspaces. So far there has been no experimental indication that such parastatistics occur in nature. Parastatistics can be excluded if one makes the assumption that there exists at least one complete system of commuting observables in $\mathcal{S}$. (Cf. Problems 4, 5, and 6.)

PROBLEMS

1. The operators

$$\Pi_+ = \frac{1}{n!} \sum_P U_P, \qquad \Pi_- = \frac{1}{n!} \sum_P \delta_P U_P$$

are projections which commute with all the observables of the system of $n$ identical particles.

2. If $\mathcal{S}$ are the observables of a system of $n$ identical particles and if the states separate the observables, then for all permutations $P$

$$[A, U_P] = 0 \quad \text{for all } A \in \mathcal{S}.$$

3. The projections $\Pi_\pm$ reduce the operators $U_P$:

$$[\Pi_\pm, U_P] = 0 \quad \text{for all } P.$$

4. There exists a complete system of commuting observables in the set $\mathcal{S}$ of all observables if and only if $\mathcal{S}' \subset \mathcal{S}''$ [cf. J. M. Jauch and B. Misra, reference 8].

5. If $\mathcal{S}' \subset \mathcal{S}''$, then the set of operators which commute with all the observables commute with each other (abelian superselection rules).

6. A system of identical particles with abelian superselection rules obeys either Einstein-Bose or Fermi-Dirac statistics.

7. The operators $U_P$ defined in $\mathcal{H}^n$ are a faithful representation of the permutation group.

8. The reduction of $U_P$ to the subspaces $\mathcal{H}^n_\pm$ is an abelian representation of the permutation group.

15-5. THE BOSE GAS

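We now consider an assembly of $n$ identical noninteracting particles satisfying the Einstein-Bose statistics. Such a system will be called a free Bose gas. The Hilbert space for this system is $\mathcal{H}^n_+ = \Pi_+ \mathcal{H}^n$. Let $\varphi_r$ ($r = 1, 2, \ldots$) be a complete orthonormal system of vectors in $\mathcal{H}_1$. With such a system we can construct a complete orthonormal system in $\mathcal{H}^n$ in the form

$$\varphi_{r_1} \otimes \varphi_{r_2} \otimes \cdots \otimes \varphi_{r_n}$$

(Problem 1). Every such vector when projected with $\Pi_+$ into the subspace $\mathcal{H}^n_+$ yields a vector

$$\varphi[r_1 r_2 \cdots r_n] = \Pi_+\, \varphi_{r_1} \otimes \varphi_{r_2} \otimes \cdots \otimes \varphi_{r_n}. \tag{15-19}$$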


These vectors are still orthogonal and complete, but in general they are not normalized. In order to normalize them, we introduce a new notation.

Let $n_1$ be the number of indices among the $r_1 r_2 \cdots r_n$ which are equal to 1; similarly let $n_2$ be the number of such indices equal to 2, and so on. The following facts follow then from the definition of Eq. (15-19).

a) $n_1 + n_2 + \cdots + n_r + \cdots = n$.

b) Two $\varphi[r_1 r_2 \cdots r_n]$ are equal if and only if the corresponding $n_1, n_2, \ldots$ are equal.

c) $$\|\varphi[r_1 r_2 \cdots r_n]\|^2 = \frac{n_1!\, n_2! \cdots n_r! \cdots}{n!} \tag{15-20}$$

(Problem 2). Because of (b) we may replace the index $[r_1 r_2 \cdots r_n]$ by $\{n\} \equiv \{n_1 n_2 \cdots n_r \cdots\}$ and we obtain with

$$\varphi(\{n\}) \equiv \varphi(n_1 n_2 \cdots n_r \cdots) = \sqrt{\frac{n!}{n_1!\, n_2! \cdots n_r! \cdots}}\; \Pi_+\, \varphi_{r_1} \otimes \varphi_{r_2} \otimes \cdots \otimes \varphi_{r_n} \tag{15-21}$$

a complete orthonormal system of vectors in $\mathcal{H}^n_+$. Since every observable is a self-adjoint operator which is reduced by $\Pi_+$, it is possible to express it entirely in the system (15-21). A representation in this system is called an occupation number representation.
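The norm of a symmetrized product vector can be verified by brute force. The sketch below is an illustration under the assumption that the one-particle basis vectors are labeled by integers and are orthonormal; it computes the squared norm directly from the sum over permutations and compares it with the ratio of factorials in the occupation numbers. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def symmetrized_norm_squared(indices):
    """Squared norm of the symmetrized product vector, from the permutation sum itself."""
    n = len(indices)
    coeffs = Counter()
    # The symmetrizer gives each permuted index tuple the weight 1/n!;
    # tuples that coincide (repeated indices) accumulate their weights.
    for perm in permutations(range(n)):
        coeffs[tuple(indices[i] for i in perm)] += Fraction(1, factorial(n))
    # Product basis vectors are orthonormal, so the squared norm is the
    # sum of the squared coefficients.
    return sum(c * c for c in coeffs.values())


def occupation_formula(indices):
    """The closed form n_1! n_2! ... / n! computed from the occupation numbers."""
    numerator = 1
    for count in Counter(indices).values():
        numerator *= factorial(count)
    return Fraction(numerator, factorial(len(indices)))
```

For the index set `(1, 1, 2)`, with occupation numbers $n_1 = 2$, $n_2 = 1$, both functions give the same rational number, and the same agreement holds for any other choice of indices small enough to enumerate.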

So far we have considered the number $n$, the total number of particles, as fixed. There are several advantages in dropping this restriction and considering a much larger system with an undetermined number of particles. Such a system is described in the direct sum of the Hilbert spaces $\mathcal{H}^n_+$ ($n = 0, 1, 2, \ldots, \infty$) called the Fock space. We define it as follows: A vector $F \in \mathcal{H}^0_+ \oplus \mathcal{H}^1_+ \oplus \cdots \oplus \mathcal{H}^n_+ \oplus \cdots$ is a finite or infinite sequence of vectors $f_n \in \mathcal{H}^n_+$ subject to the condition

$$\sum_n \|f_n\|^2 < \infty. \tag{15-22}$$

The scalar product of two vectors $F = \{f_n\}$ and $G = \{g_n\}$ is then defined by

$$(F, G) = \sum_n (f_n, g_n). \tag{15-23}$$

Addition and multiplication with scalars is defined by

$$F + G = \{f_n + g_n\}, \qquad \lambda F = \{\lambda f_n\}. \tag{15-24}$$


The set of all sequences $F = \{f_n\}$ with these properties thus have the structure of a Hilbert space. The vectors in the space $\mathcal{H}^0_+$ represent the system in the state with no particle. We shall assume that this state is nondegenerate so that $\mathcal{H}^0_+$ is a one-dimensional subspace. We call it the vacuum state. The vectors $\varphi(n_1 n_2 \cdots n_r \cdots)$ defined in Eq. (15-21) are vectors in the space $\mathcal{H}^n_+$ ($n = n_1 + n_2 + \cdots + n_r + \cdots$); as such they may also be considered as vectors $F = \{f_k\}$ in the Fock space with $f_k = 0$ for $k \neq n$.

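It is now possible to introduce a set of operators in the Fock space which permit a description of a Bose gas in terms of harmonic oscillators. We define for each $r = 1, 2, \ldots$

$$\begin{aligned} a_r\, \varphi(n_1 n_2 \cdots n_r \cdots) &= \sqrt{n_r}\; \varphi(n_1 n_2 \cdots n_r - 1 \cdots), \\ a_r^*\, \varphi(n_1 n_2 \cdots n_r \cdots) &= \sqrt{n_r + 1}\; \varphi(n_1 n_2 \cdots n_r + 1 \cdots), \end{aligned} \tag{15-25}$$

and we call $a_r$ a destruction operator and $a_r^*$ a creation operator. From these definitions follows

$$a_r\, \varphi(0, 0, \ldots, 0, \ldots) = 0, \tag{15-26}$$

and for any pair $\varphi$ we find

$$\bigl(\varphi(n_1 n_2 \cdots n_r \cdots),\, a_r\, \varphi(m_1 m_2 \cdots m_r \cdots)\bigr) = \bigl(a_r^*\, \varphi(n_1 n_2 \cdots n_r \cdots),\, \varphi(m_1 m_2 \cdots m_r \cdots)\bigr) = \sqrt{n_r + 1}\; \delta_{n_1 m_1} \delta_{n_2 m_2} \cdots \delta_{n_r + 1,\, m_r} \cdots. \tag{15-27}$$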

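Thus the operator $a_r^*$ is the Hermitian conjugate of $a_r$. The operators $a_r$, $a_r^*$ satisfy the commutation rules (Problem 3):

$$[a_r, a_s] = [a_r^*, a_s^*] = 0, \qquad [a_r, a_s^*] = \delta_{rs}. \tag{15-28}$$

Furthermore, the operator $N_r \equiv a_r^* a_r$ satisfies

$$N_r\, \varphi(n_1 n_2 \cdots n_r \cdots) = n_r\, \varphi(n_1 n_2 \cdots n_r \cdots). \tag{15-29}$$

This justifies calling it the number operator for the state $r$. The total number operator is then defined by

$$N = \sum_r N_r. \tag{15-30}$$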
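The action of the destruction and creation operators on occupation-number states is easy to try out. The following sketch assumes a finite number of one-particle states, counted from 0 rather than from 1, and stores a Fock-space vector as a dictionary from occupation tuples to floating-point amplitudes. The function names are illustrative only.

```python
from math import isclose, sqrt


def destroy(r, state):
    """Destruction operator for mode r: lowers n_r by one with the factor sqrt(n_r)."""
    out = {}
    for occ, amp in state.items():
        if occ[r] > 0:  # an empty mode is annihilated
            new = occ[:r] + (occ[r] - 1,) + occ[r + 1:]
            out[new] = out.get(new, 0.0) + sqrt(occ[r]) * amp
    return out


def create(r, state):
    """Creation operator for mode r: raises n_r by one with the factor sqrt(n_r + 1)."""
    out = {}
    for occ, amp in state.items():
        new = occ[:r] + (occ[r] + 1,) + occ[r + 1:]
        out[new] = out.get(new, 0.0) + sqrt(occ[r] + 1) * amp
    return out


def commutator(r, s, state):
    """The commutator of the destruction operator for mode r with the
    creation operator for mode s, applied to a state."""
    first = destroy(r, create(s, state))
    second = create(s, destroy(r, state))
    return {k: first.get(k, 0.0) - second.get(k, 0.0) for k in set(first) | set(second)}
```

On the state with occupation numbers `(2, 0, 1)`, `commutator(0, 0, ...)` reproduces the state up to rounding, `commutator(0, 1, ...)` gives zero amplitudes, and `create(0, destroy(0, ...))` multiplies the state by its occupation number 2.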

We recognize in this formulation that the free Bose gas is kinematically indistinguishable from an independent collection of distinct harmonic oscillators. The numbers $n_r$ correspond to the excitation of the oscillator $r$ to the state $n_r$. Just as we have done for the simple harmonic oscillator, so we can show here that the vacuum state vector is a cyclic vector: By using Eq. (15-25) we obtain

$$\varphi(n_1 n_2 \cdots n_r \cdots) = \frac{1}{\sqrt{n_1!\, n_2! \cdots n_r! \cdots}}\, (a_1^*)^{n_1} (a_2^*)^{n_2} \cdots (a_r^*)^{n_r} \cdots \varphi(0), \tag{15-31}$$

where $\varphi(0) \equiv \varphi(0, 0, \ldots, 0, \ldots)$.

Let us now consider the energy of the free Bose gas. By assumption the evolution operator and therefore the energy for a state in $\mathcal{H}^n$ is given by an operator of the form $H = H_1 + H_2 + \cdots + H_n$, where $H_r = I \otimes I \otimes \cdots \otimes H_0 \otimes \cdots \otimes I$ and where $H_0$ is the energy of one single boson. In order to treat this problem with relatively elementary mathematical means, it is convenient to assume that $H_0$ has only a discrete spectrum. We choose then for our orthonormal system $\varphi_r$ the eigenvectors of the operator $H_0$:

$$H_0 \varphi_r = \varepsilon_r \varphi_r. \tag{15-32}$$

We find then

$$H\, \varphi(n_1 n_2 \cdots n_r \cdots) = (n_1 \varepsilon_1 + n_2 \varepsilon_2 + \cdots + n_r \varepsilon_r + \cdots)\, \varphi(n_1 n_2 \cdots n_r \cdots).$$

By using the number operator (15-29) we can write, therefore, for $H$ the expression

$$H = \varepsilon_1 a_1^* a_1 + \varepsilon_2 a_2^* a_2 + \cdots + \varepsilon_r a_r^* a_r + \cdots. \tag{15-33}$$

From this we see that also the dynamical structure of a free Bose gas is identical with that of a collection of harmonic oscillators.

Equation (15-33) is a special case of an energy operator of the form

$$H = \sum_{r, s} A_{rs}\, a_r^* a_s. \tag{15-34}$$

It corresponds to a system of bosons in an external force-field. If $V$ is the additional term in the one-particle energy we find, by taking matrix elements of (15-34) between any two one-particle states,

$$(\varphi_r, V \varphi_s) = \bigl(a_r^* \varphi(0),\, H\, a_s^* \varphi(0)\bigr).$$

The right-hand side can be evaluated using the commutation rules and the definition (15-26) of the vacuum state, and it is thus found to be equal to $A_{rs}$. Thus we have generally a relation between the one-particle energy operator and Eq. (15-34) in the form

$$(\varphi_r, V \varphi_s) = A_{rs}. \tag{15-35}$$
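The evaluation of the vacuum matrix element can be written out step by step. The following is a sketch of that calculation; the summation indices $p$, $q$ are introduced here only for the purpose of the derivation, and $a^*$ denotes the Hermitian conjugate as in the commutation rules for the Bose operators.

```latex
\begin{align*}
\bigl(a_r^*\varphi(0),\, H\, a_s^*\varphi(0)\bigr)
  &= \sum_{p,q} A_{pq}\,\bigl(\varphi(0),\, a_r a_p^*\, a_q a_s^*\, \varphi(0)\bigr), \\
a_q a_s^*\,\varphi(0)
  &= \bigl(\delta_{qs} + a_s^* a_q\bigr)\varphi(0) = \delta_{qs}\,\varphi(0)
  \qquad\text{since } a_q\varphi(0) = 0, \\
\bigl(\varphi(0),\, a_r a_p^*\,\varphi(0)\bigr)
  &= \delta_{rp}\,\bigl(\varphi(0), \varphi(0)\bigr) = \delta_{rp}, \\
\bigl(a_r^*\varphi(0),\, H\, a_s^*\varphi(0)\bigr)
  &= \sum_{p,q} A_{pq}\,\delta_{rp}\,\delta_{qs} = A_{rs}.
\end{align*}
```

Only the commutation rule between a destruction and a creation operator and the vanishing of every destruction operator on the vacuum are used.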

The energy (Eq. 15-33) is clearly a special case of this for the diagonal one-particle energy operator $V = H_0$ satisfying Eq. (15-32).


In a similar way we can express more general types of operators in the occupation number representation. For instance, an energy operator, which is the sum of operators between pairs of particles (two-body interaction), would have the form

$$H = \sum_{r_1 r_2 s_1 s_2} A_{r_1 r_2 s_1 s_2}\, a_{r_1}^* a_{r_2}^* a_{s_1} a_{s_2}. \tag{15-36}$$

By a calculation which generalizes the arguments given above, one finds for the coefficients the expression

$$A_{r_1 r_2 s_1 s_2} = \bigl(\varphi_{r_1} \otimes \varphi_{r_2},\, V\, \varphi_{s_1} \otimes \varphi_{s_2}\bigr), \tag{15-37}$$

which relates them to the two-particle interaction operator $V$ considered as an operator in the space $\mathcal{H}^2 = \mathcal{H}_1 \otimes \mathcal{H}_1$. It is not hard to see how this result can be generalized to many-body interactions of any kind.

We have thus succeeded in transcribing the entire theory of a system of interacting identical bosons into the theory of a system of interacting and distinct harmonic oscillators.

We may remark here, by way of a historical digression, that this result establishes the connection with the original formulation of quantum theory which treated the statistics of a gas of photons as a system of harmonic oscillators with discrete energy levels (black-body radiation). In this treatment the oscillators were represented as the normal modes of the electromagnetic field in a finite enclosure. These normal modes correspond to our choice of base vectors $\varphi_r$ in $\mathcal{H}_1$ which diagonalize the free energy operator $H_0$.

PROBLEMS

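1. If $\varphi_r \in \mathcal{H}_1$ is a complete orthonormal system in $\mathcal{H}_1$, then $\varphi_{r_1} \otimes \varphi_{r_2} \otimes \cdots \otimes \varphi_{r_n}$ is such a system in $\mathcal{H}^n = \mathcal{H}_1 \otimes \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_1$.

2. For any two vectors of the form of Eq. (15-19) we have

$$\bigl(\varphi[r_1 r_2 \cdots r_n],\, \varphi[s_1 s_2 \cdots s_n]\bigr) = \frac{n_1!\, n_2! \cdots n_r! \cdots}{n!}\; \delta_{n_1 m_1} \delta_{n_2 m_2} \cdots,$$

where $n_r$ is the number of $r_i$ equal to $r = 1, 2, \ldots$, $m_s$ is the number of $s_k$ equal to $s = 1, 2, \ldots$, and

$$n = n_1 + n_2 + \cdots + n_r + \cdots = m_1 + m_2 + \cdots + m_s + \cdots.$$

3. The operators $a_r$, $a_r^*$ defined by Eq. (15-25) are Hermitian conjugates of one another and they satisfy the commutation rules (15-28).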


15-6. THE FERMI GAS

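An assembly of $n$ identical noninteracting particles satisfying the Fermi-Dirac statistics will be called a free Fermi gas. The Hilbert space for this system is $\mathcal{H}^n_- = \Pi_- \mathcal{H}^n$. For any complete orthonormal system $\varphi_r \in \mathcal{H}_1$ we define another such system in $\mathcal{H}^n_-$ from the vectors

$$\varphi[r_1 r_2 \cdots r_n] = \Pi_-\, \varphi_{r_1} \otimes \varphi_{r_2} \otimes \cdots \otimes \varphi_{r_n}. \tag{15-38}$$

If any two indices $r_i = r_k$ for $i \neq k$, then the corresponding $\varphi[r_1 r_2 \cdots r_n]$ is identically zero. We may therefore assume that the indices are all different from one another.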

If we denote by $n_r$ (= 0 or 1) the number of indices among the $r_1, r_2, \ldots, r_n$ which are equal to $r$, we obtain in complete analogy to the properties for the Bose gas,

a) $n_1 + n_2 + \cdots + n_r + \cdots = n$.

b) Two $\varphi[r_1 r_2 \cdots r_n]$, with corresponding numbers $n_r$ equal, differ at most by a sign.

c) $\|\varphi[r_1 r_2 \cdots r_n]\|^2 = 1/n!$.
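The three properties of the antisymmetrized vectors can be observed directly in a small example. The sketch below assumes integer labels for an orthonormal one-particle basis and represents an antisymmetrized product vector as a dictionary from index tuples to exact rational coefficients; `sign_of` and `antisymmetrize` are hypothetical helper names.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def sign_of(perm):
    """Parity of a permutation of 0..n-1, obtained by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def antisymmetrize(indices):
    """Antisymmetrizing projection applied to the product vector with the given indices."""
    n = len(indices)
    coeffs = Counter()
    for perm in permutations(range(n)):
        coeffs[tuple(indices[i] for i in perm)] += Fraction(sign_of(perm), factorial(n))
    # Terms with a repeated index cancel in pairs and are dropped here.
    return {k: c for k, c in coeffs.items() if c != 0}


def norm_squared(vec):
    """Squared norm in the orthonormal product basis."""
    return sum(c * c for c in vec.values())
```

A repeated index yields the empty (zero) vector, exchanging two indices flips the overall sign, and the squared norm for three distinct indices is the reciprocal of $3!$.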

We call the numbers $n_r$ the occupation numbers of the Fermi gas and we define the complete orthonormal system

$$\varphi(n_1 n_2 \cdots n_r \cdots) = \sqrt{n!}\; \Pi_-\, \varphi_{r_1} \otimes \varphi_{r_2} \otimes \cdots \otimes \varphi_{r_n} \tag{15-39}$$

as the natural basis for the occupation-number representation.

Just as we have done for the Bose gas, so we can also introduce here the Fock space consisting of sequences of vectors $F = \{f_n\}$, $f_n \in \mathcal{H}^n_-$, satisfying Eq. (15-22). The scalar product, addition, and scalar multiplication are defined by Eqs. (15-23) and (15-24), and the vacuum state $\varphi(0)$ is the unique unit vector in the one-dimensional space $\mathcal{H}^0_-$.

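We can now introduce, in analogy to Eq. (15-25), the annihilation and creation operators $a_r$ and $a_r^*$ by defining

$$\begin{aligned} a_r\, \varphi(n_1 n_2 \cdots n_r \cdots) &= (-1)^{s_r}\, n_r\, \varphi(n_1 n_2 \cdots n_r - 1 \cdots), \\ a_r^*\, \varphi(n_1 n_2 \cdots n_r \cdots) &= (-1)^{s_r}\, (1 - n_r)\, \varphi(n_1 n_2 \cdots n_r + 1 \cdots), \end{aligned} \tag{15-40}$$

where $s_r = \sum_{k < r} n_k$. From these definitions it follows that

$$a_r\, \varphi(0) = 0 \tag{15-41}$$

and for any pair of vectors $\varphi$ we find

$$\bigl(\varphi(n_1 n_2 \cdots n_r \cdots),\, a_r\, \varphi(m_1 m_2 \cdots m_r \cdots)\bigr) = \bigl(a_r^*\, \varphi(n_1 n_2 \cdots n_r \cdots),\, \varphi(m_1 m_2 \cdots m_r \cdots)\bigr). \tag{15-42}$$

Thus the operator $a_r^*$ is the Hermitian conjugate of $a_r$. Furthermore, the operators $a_r$ satisfy the commutation rules

$$\{a_r, a_s\} = \{a_r^*, a_s^*\} = 0, \qquad \{a_r, a_s^*\} = \delta_{rs}. \tag{15-43}$$

Here we have introduced the anticommutator brackets defined by $\{A, B\} \equiv AB + BA$.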
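The anticommutation rules can be checked on a few occupation-number states. The sketch below assumes a finite number of one-particle states counted from 0, occupation numbers restricted to 0 or 1, and the common sign convention in which an operator for mode $r$ picks up a factor $-1$ for every occupied mode in front of $r$. That convention, like the function names, is an assumption of this illustration.

```python
def annihilate(r, state):
    """Remove a particle from mode r if it is occupied; sign = (-1)^(number of occupied modes before r)."""
    out = {}
    for occ, amp in state.items():
        if occ[r] == 1:
            sign = -1 if sum(occ[:r]) % 2 else 1
            new = occ[:r] + (0,) + occ[r + 1:]
            out[new] = out.get(new, 0) + sign * amp
    return {k: v for k, v in out.items() if v != 0}


def create(r, state):
    """Add a particle to mode r if it is empty, with the same sign rule."""
    out = {}
    for occ, amp in state.items():
        if occ[r] == 0:
            sign = -1 if sum(occ[:r]) % 2 else 1
            new = occ[:r] + (1,) + occ[r + 1:]
            out[new] = out.get(new, 0) + sign * amp
    return {k: v for k, v in out.items() if v != 0}


def anticommutator(op_a, r, op_b, s, state):
    """Apply op_a(r) op_b(s) + op_b(s) op_a(r) to a state."""
    first = op_a(r, op_b(s, state))
    second = op_b(s, op_a(r, state))
    total = {k: first.get(k, 0) + second.get(k, 0) for k in set(first) | set(second)}
    return {k: v for k, v in total.items() if v != 0}
```

With these definitions the anticommutator of an annihilation and a creation operator returns the input state when the two modes coincide and the zero vector otherwise. Two creation operators applied to the vacuum in opposite orders give the same state with opposite signs.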

The operator $N_r \equiv a_r^* a_r$ is the number operator for the state $r$. It is diagonal in the representation which we have adopted, and it has the two eigenvalues 0 and 1. The total number operator is then defined by

$$N = \sum_r N_r. \tag{15-44}$$

In complete analogy to Eq. (15-31), we can obtain any vector $\varphi$ by operating with creation operators on the vacuum state $\varphi(0)$:

$$\varphi(n_1 n_2 \cdots n_r \cdots) = (a_1^*)^{n_1} (a_2^*)^{n_2} \cdots (a_r^*)^{n_r} \cdots \varphi(0). \tag{15-45}$$

The normalization factor is here superfluous since all the $n_r!$ are equal to 1. [Note: $0! = 1$.]

The energy of the free Fermi gas will be given by an operator $H = H_1 + H_2 + \cdots + H_r + \cdots + H_n$, where $H_r = I \otimes I \otimes \cdots \otimes H_0 \otimes \cdots \otimes I$ and where $H_0$ is the energy for a single fermion. We assume that $H_0$ has a discrete spectrum only, and we write $\varphi_r$ for the complete system of eigenvectors for $H_0$:

$$H_0 \varphi_r = \varepsilon_r \varphi_r. \tag{15-46}$$

We find then, for $H$ in the occupation-number representation, the expression

$$H = \varepsilon_1 a_1^* a_1 + \varepsilon_2 a_2^* a_2 + \cdots + \varepsilon_r a_r^* a_r + \cdots, \tag{15-47}$$

which is formally identical with Eq. (15-33). The content is, however, quite different, in spite of the apparent similarity, since the operators $a_r$ for the Fermi system satisfy different commutation rules. Thus the possible values for the energy eigenvalues are

$$E = n_1 \varepsilon_1 + n_2 \varepsilon_2 + \cdots + n_r \varepsilon_r + \cdots, \tag{15-48}$$

where the $n_r$ can assume only the values $n_r = 0, 1$, with $\sum_r n_r = n$.

For the interacting Fermi system we may have additional terms in the energy operators of the form of either Eq. (15-34) or Eq. (15-36). The formulas (15-35) and (15-37) which we have derived for the constants occurring in the interaction energy remain exactly the same.

A new feature appears in the free Fermi gas in that the lowest value of the energy is not simply given by $E_0 = n \varepsilon_1$ (we assume $\varepsilon_1 \leq \varepsilon_2 \leq \cdots \leq \varepsilon_r \leq \cdots$) as in the Bose gas but instead by the expression

$$E_0 = \varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_n.$$

This different behavior for the ground-state energy has important observable physical consequences. For the atomic electrons, for instance, it is the reason for the shell structure of the periodic system which led Pauli to the discovery of the exclusion principle [5].
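The contrast between the two ground-state energies is a matter of simple arithmetic. The sketch below assumes a finite list of one-particle energies in which every entry stands for one one-particle state, so that a degenerate level appears several times; the function names are illustrative.

```python
def bose_ground_energy(levels, n):
    """All n bosons may occupy the lowest one-particle state."""
    return n * min(levels)


def fermi_ground_energy(levels, n):
    """Each one-particle state holds at most one fermion, so the n lowest states are filled."""
    if n > len(levels):
        raise ValueError("more fermions than available one-particle states")
    return sum(sorted(levels)[:n])
```

For the hypothetical spectrum `[1, 2, 3, 5]` and three particles, the Bose gas has ground-state energy 3, while the Fermi gas must fill the three lowest states and reaches 6.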

REFERENCES

1. J. DIXMIER, Les algèbres d'opérateurs dans l'espace Hilbertien. Paris: Gauthier-Villars (1957); especially Chapter 1, §2 (26 ff).

2. J. DIXMIER, ref. [1], Chapter 1, §6.9, Proposition 14, p. 102.

3. J. DIXMIER, ref. [1], Chapter 1, §6.1, Theorem 5.

The Fermi statistic for electrons was first discovered by Pauli in the form of the exclusion principle.

4. W. PAULI, Z. Phys. 31, 765 (1925).

5. W. PAULI, Nobel lecture, Stockholm (1948); also, Science 103, 213 (1946).

The general connection between spin and statistics was established in:

6. W. PAULI, Phys. Rev. 58, 716 (1940).

A modern version of this connection can be found in:

7. R. JOST, Pauli Memorial Volume. London: Interscience (1960); especially Section 6, p. 133 ff.

8. J. M. JAUCH AND B. MISRA, Helv. Phys. Acta 34, 699 (1961).

Concerning "identity" in modern philosophy, consult, for instance:

9. B. RUSSELL, Human Knowledge. New York: Simon and Schuster (1948); Section 4, Chapter VIII, p. 292 ff.


INDEXES


AUTHOR INDEX

Folias, C., 221Frank, P., 88

Frobenius, G., 201

Galilei, G., 30Geher, L., 221Glasmann, I. M., 29,Glauber, R. J., 221Gleason, A. M., 104,Grosseteste, R., 3

Halmos, P. R., 17, 29Hamermesh, M., 150Hanson, N. R., 191Heisenberg, W., 191Helmholtz, H. L. F. von, 68Hertz, H., 68Horace, 46Horwitz, L., 134Huygens, C., 68

Ikebe, T., 64Inönü, E., 221

Jauch, J. M., 64, 120, 134, 272, 280,287

Jordan, P., 22, 29, 88Jost, R., 287

Kaplansky, I., 220, 221Kepler, J., 121Kilpi, Y., 205, 221Kolmogorov, A. N., 95, 110Korner, S., 191Kronig, R., 247, 268

Lebesgue, H. L., 9
Leibnitz, G. W. von, 275

45, 64, 221

110, 132, 134

Achieser, N. I., 29, 45, 64, 221Albertson, J., 120Aquinas, T., 275Artin, E., 133, 150

Baer, R., 133Bargmann V., 149, 150, 221, 225Bauer, E., 191Bell, J. S., 120Bethe, H. A., 246Biedenharn, L., 134Birkhoff, G., 79, 80, 83, 88, 110, 129,

133

Bohm, D., 120Bohr, N., 69, 90, 107, 110, 168, 187,

192Born, M., 195Brodbeck, M., 88Broglie de, L., 120, 191

Carroll, L., 269Croiset, R., 88, 133

d'Espagnat, B., 192Destouche-Février, P., 89Dirac, P. A. M., 31, 45, 105, 110Dixmier, J., 205, 221, 287Dubreil-Jacotin, L., 88, 133Dunford, N., 17

Einstein, A., 111, 112, 174, 185, 186,187, 190, 191, 192

Emch, G., 134, 150

Feigl, H., 88

Fermi, E., 278Finkelstein, 134Foldy, 221


Lesieur, L., 88, 133 Rose, G., 88London, F., 191 Rosen, N., 185, 186, 187, 190, 192Loomis, L. H., 88, 125, 133 Russell, B., 287Ludwig, G., 88, 191

Scalettar, R., 88Mach, E., 69, 222 Scheibe, E., 89Mackey, G., 103, 110, 119, 201, 221 Schiff, L. I., 45Maeda, S., 88 Schrodinger, E., 185, 190, 191Maxwell, J. 0., 67 Schwartz, J. T., 17Menger, K., 134 Scotus, D., 275Messiah, A., 268 Segal, I. E., 104, 110Misra, B., 64, 120, 272, 280, 287 Speiser, D., 134Mittelstaedt, P., 89 Stone, M. H., 29, 57, 125, 133, 221Murray, F. J., 133 Stueckelberg, E. C. G,, 131, 134

Szász, G., 88Nagy, B. Sz., 17, 29, 45, 64, 205, 221Neveu, j., 110 Thomas, L. H., 263, 268Newton, T. D., 221 Titchmarsh, E. C., 64Nietzsche, F., 160 Uhlhorn, U., 150Nikodym, 0., 14

Van der Waerden, B. L., 268Pauli, W., 111, 120, 191, 247, 287 Varadarajan, V. S., 95, 103, 110Pierce, C. S., 18 von Neumann, J., 22, 29, 41, 45, 79,Piron, C., 88, 120, 133, 150 80, 83, 88, 103, 120, 123, 129, 133,Plessner, A., 199, 221 191, 201, 221, 246Podolsky, B., 185, 186, 187, 190, 192Pontryagin, L., 150 Weizsacker, C. F. v., 89

Weyl, H., 135, 151Radon, J., 14 Wick, G. C., 110Reichenbach, H., 89 Wightman, A. S., 110, 221Rellich, F., 205, 221 Wigner, E. P., 110, 144, 150, 187, 188,Rey, A., 69 191, 192, 221, 226Riesz, F., 17, 29, 31, 45, 64 Wouthuysen, S., 221


Page numbers in italic indicate principal references.

SUBJECT INDEX

Abelian, algebra, 57maximal, 57

group, 137Absolutely continuous measure, 8Adjoint operator, 35a.e., see Almost everywhereAlgebra, abelian, 57

Boolean, 5generated by an operator, 57Lie, 140maximal abelian, 57semi-finite, 271von Neumann, 57, 247, 249

Almost everywhere, 7Angular momentum, 228

composition of, 259density, 241intrinsic (or spin), 249orbital, 258total, 258

Annihilation operator, 44,285

Antiautomorphism, 129Antilinear mapping, 31Antisymmetrical subspace, 279Atomic proposition, 79, 87Automorphism, 136, 142, 153

anti, 129inner, 138

Bessel's inequality, 22Borel, function, 78, 9S, 101

set, 6

Bose-Einstein statistics, 278Bose gas (free), 280Bound of propositions,

least upper, 76Bounded, linear functional; see

Functionallinear operator; see Operator

Bra, 32

('ompact sptice, 59

74greatest lower,

211, 282,

Calculus, functional, forobservables, 101

propositional, 77Canonical, commutation rules, 43, 199

transformation, 211Cauchy sequence, 16, 19, 26Cayley equation, 49Center, of a lattice, 81, 108

trivial, 81Characteristic function, 39Closed, linear manifold, 25

operator, 40Closure of, a linear manifold, 25

a linear operator, 40Coherent, component, 143, 145

lattice, 124proposition system, 115sublattice, 109, 145

Commutation rules, irreduciblerepresentation of the, 201

Schrödinger representation of the, 200•Weyl's form of the, I 98.


Compatibility, 80, 81Compatible, observables, 101

projections, 86propositions, 81, 95subspaces, 27

Complement, orthogonal, 25relative, of a subset, 4

Complementarity, 69Complete, lattice, 77,

u-, 80orthonormal system, 22space, 16, 19system of observables, 101

Conjugate, linear functional, 32linear transformation, 177

Connected, automorphisms, 145component, 145multiply, 141simply, 141

Conservative system, 156Convergence, in the measure, 12

strong, 21weak, 21

Coordinate system, 22Coset, left, 137

right, 137Covering group, universal, 141Creation operator, 44, 211, 282,

285Current, probability, 240

Datum, 174Dense sequence, 19Density, 240

operator, 132probability, 239

Dimension of a space, 20Dirac picture, 157Direct integral of Hilbert spaces,

250Direct sum of, Hilbert spaces, 281

subspaces, 25, 26Disjoint, propositions, 86, 95

subspaces, 28Displacement operator, 198Distributivity, 28, 39, 79

Domain, of a function, 9of an operator, 33

Dual space, 31

Eigenfunction expansion, 62Eigenvalue, of an operator, 47

simple, 47Eigenvector of an operator, 47Element, 3Elementary, particle, 205, 235

quasi-, system, 206system, 145

Energy, kinetic, 210potential, 210total, 210

Equality, of propositions, 74of sets, 3

Event, 174Evolution, operator, 154, 209

time, 151Expectation value of an observable, 99Experiment, Stern—Gerlach, 252

Yes—no; see PropositionExtension of an operator, 34

Faithful representation, 231, 234Fermi gas (free), 285Filter, 73, 165Fock space, 281Form, definite, 129

Hermitian, 129quadratic, 32sesquilinear, 129

Fourier transformation, 63, 243Function, analytic, 215

argument of a, 9Borel, 78, 98, 101continuous, 11domain of a, 9eigenfunction expansion, 62integrable, 12, 55inverse, 10measurable, 11, 55range of a, 9set, 10a- additive, 7


simple, 11single-valued, 10

Function space, 16Functional, bounded linear, 30, 55

conjugate linear, 32norm of a linear, 30positive sesquilinear, 33sesquilinear, 32, 35strictly positive sesquilinear, 33symmetrical sesquilinear, 32

Functional calculus, 55

Galilei, invariance, 207invariant, 235transformation, 207

Gas, free Bose, 280free Fermi, 285

Geometry, continuous, 123projective, 123, 127, 128

Gleason's theorem, 132Gram determinant, 22Group, 137

abelian, 137cyclic, 137dynamical, 154factor, 138invariant sub-, 138Lie, 139sub-, 137topological, 139universal covering, 141

Hamiltonian (or evolution operator)of a system, 154

Harmonic oscillator, 211Heisenberg — picture, 155Hermite polynomials, 45, 213

Hermitian operator; see SymmetricalHidden variables, 116Hilbert, relation, 53

space, 18Homogeneity, 197, 224Homomorphism, 138

Identical particles, 275Imprimitivities; see System of

imprimitivities

Integral, definite, 12direct, 250indefinite, 12Lebesgue, 13Lebesgue-Stieltjes, /3of a simple function, /2

Intersection of, propositions; seeBound of propositions,greatest lower

sets, 4subspaces, 26, 38

Invariance, Galilei, 207gauge, 238

Inverse, of a function, /0image, 10, 98

of an operator, 34Isometry (partial), 35Isomorphism; see MorphismIsotopic spin, 249Isotropy, 225

Jacobi identity, 140

Ket, 32Kramer degeneracy, 261

Larmor frequency, 263Lattice, 26

atomic, 80Boolean, 80center of a, 81, 108

complete, 27, 77direct union of, 81distributive, 80irreducible, 81, 108, 124

modular, 83, 122

orthocomplemented, 27, 77, 129reducible, 81, 108, 125sub-, 80of subspaces, 85weakly modular, 86

Lebesgue, integral, 13measure, 7

-Stieltjes integral, 13-Stieltjes measure, 8theorem, 9


Lie, algebra, 140group, 139, 141

Linear, manifold, 20, 24, 128closed, 25closure of a, 25

operator, 33vector space, 18

Linearly, dependent system, 20independent system, 20

Local factor, 148Localizability, 123, 196, 222Logic (formal), 77

Maximal symmetrical operator,

Measurable, function, 11, 55set, 7

Measure(s), absolutely continuous,8, 14, 58

comparable, 8convergence in the, 12discrete, 8equivalent, 8, 51Lebesgue, 7Lebesgue-Stieltjes, 8£-valued, 98quasi-invariant, 202set of, zero, 7spectral, 39, 50, 197

Measurement, ideal, 166, 167of the first kind, 165of the second kind, 165

Measuring process, 163, 183Minkowski's inequality (triangle

inequality), 21Mixture, 96Modular lattice, 83, 122, 219

weakly, 86Momentum, angular, 228

operator, 43, 210of a particle, 237

Morphism, 136automorphism, 136, 142, 144,

153homomorphism, 138inner automorphism, 138

41, 50, 54

41

Norm of, a linear functional, 30a linear operator, 34a vector, 19

Observable(s), 98, 99compatible, 101

Operator(s), adjoint, 35annihilation (destruction), 44, 211,

282, 285bounded linear, 33, 55closed, 40, 41continuous, 34creation, 44, 211, 282, 285density, 132displacement, 198domain of an, 33essentially self-adjoint, 41extension of an, 34identity, 35inverse, 34maximal, 41maximal symmetrical, 41momentum, 210multiplication with scalar, 36norm of a bounded linear, 34partial isometry, 35position, 42, 197positive, 36product of, 35, 42projection, 35, 37range of an, 33resolvent, 49self-adjoint, 35,shift, 36sum of, 34, 42symmetrical, 41tensor, 234total number, 282, 286unbounded linear, 34, 40unitary, 36vector, 233

Order (partial), 74Orthocomplementation, 76, 129Orthocomplemented lattice, 77, 129Orthogonal, complement, 25

subspaces, 25vectors, 2!


minimal; see Atomic propositionsystem, 87trivial, 76


Orthonormal, system complete. 22 Radon-Nikodym, derivative, 15, 61sequence of vectors, 2! theorem, 14system, 22 Random variable; see Observable(s)

Range of, a function, 9an observable, 99

Paradox, Einstein, Podolsky, Rosen, an operator, 33185 Reduction of an operator, 42

Schrodinger's cat, 185 Regular value, 52Wigner's friend, 187 Representation, induced, 203

Parallelogram law, 22 occupation number, 281Parastatistics, 279 projective, of a group, 145, 146Parseval's equation, 63 Schrodinger, 200Partial, isometry, 35 Resolvent operator, 49, 53

order, 74 Riesz's theorem, 31Particle, elementary, 205 Ring, 5

free, 209 generated by a class of sets, 5Pauli matrices, 255 u, 5Phase factors, 146 Rotation, 228Physical system, 71 axis, 228Picture, Dirac (or interaction), proper, 228

157Heisenberg, 155 Scalar product, 19Schrodinger, 155 positive definite, 20

Planck's constant, 210 Schrödinger, cat paradox, 185Point; see Atomic proposition equation, 154

perspective, 125 picture, 155Polarization law, 33 representation, 200Position operator, 42, 197 Schwartz's inequality, 21Probability, density, 239 Secular equation, 47

current, 240 Self-adjoint operator, 35, 41, 50Product, of operators, 35, 42 essentially, 35

scalar, 19 Separable, proposition system,tensor, 176, 270, 273 100

topological, 176 space, 100Projection operator, 35, 37 sublattice, 19, 100Projections, commuting, 39 Sequence, Cauchy, 16, 19, 26

compatible, 86 dense, 19Projective, geometry, 123, 127, Set(s), Borel, 6

128 class of sub-, 4representation, 145, 146 compatible, of propositions, 81

Proposition(s), 73 disjoint, 4absurd, 76 empty, 4compatible set of, 81, 95 open sub-, 11disjoint, 86, 95 sub-, 4

total, 80x.-null, /00

u-homomorphism; see Observable(s)


Space, compact, 142 Statistics, Einstein-Bose, 278complete, 16, 19 Fermi-Dirac, 278completion of a, 20 para-, 279dimension of a, 20 Stern-Gerlach experiment, 252dual, 31 Stone's theorem, 57Fock, 281 Strong, interaction, 249function, 16 convergence, 21Hilbert, 18, 131 Structure constants of a Lie algebra,homogeneous, 198 141L', 16 SU(2) group, 254L2, 23 Subspace(s) (closed linear manifold),

23 20, 24linear vector, 18 compatible, 28measure, 7 disjoint, 28separable, 19 Superposition principle, 106, 108topological, 10, 11 Superselection rules, 109

Space inversion, 241, 260 abelian, 280operator, 242 Symmetrical, operator, 41

Spectral, density, 58, 59 maximal, 41family, 54 subspace, 279measure, 39, 50, 197 Symmetry of a lattice; seerepresentation, 48, 56, 61 Automorphismtheorem, 53 System, classical, 78, 80

Spectrum, 52 complete, of observables, 101continuous, 52 composite, 196discrete, 52 conservative, 156of an observable, 100 nonconservative, 157

simple, 47, 49, 57, 59 physical, 71Spherical harmonics, 231 of propositions, 87Spin, 249, 267 separable, of propositions, 100

isotopic, 249 System of imprimitivities, 2021 249 canonical, 203space, 253 irreducible, 204, 205state, 252 transitive, 203

State(s), 91, 94, 112Thomas factor, 263classical, 172Time, evolution, 151

dispersion of a, 96, 114dispersion-free, 96, 114 reversal, 244, 260

equivalent, 170 operator, 244, 245Topological space, 11

macro, 171Transformation, Fourier, 63, 243

micro, 171mixture of, 96 gauge, 237

semilinear, 144pure, 96, 115, 132, 181

of states, 145reduced, 181spin, 252 Uncertainty relation, 162vacuum, 282 Union, direct, of lattices, 81vector, 132 of sets, 3


of subspuccs, 26, 3M orthogonal, 21of systems, 179 orthonormal, 21

Unitary operator, 36 state, 132Universal covering group, /4/ Velocity, 206, 235

Value, eigen-, 47expectation, 99, 132 Weak, convergence, 21

regular, 52 modularity, 86

Vector(s), 18 Wigner's friend, 187

cyclic, 50, 51 theorem, 144

eigen-, 47linear, space, 18 Zeeman effect (anomalous), 262
