Interpretation of Quantum Mechanics

using a density matrix formalism. by Marcel Nooijen

Chemistry Department

Princeton University

One arrives at very implausible theoretical conceptions, if one attempts to maintain the

thesis that the statistical quantum theory is in principle capable of producing a complete

description of an individual physical system. On the other hand, those difficulties of

theoretical interpretation disappear, if one views the quantum mechanical description as

the description of ensembles of systems. Albert Einstein (1949).

Quantum mechanics has given rise to a lot of discussion since its conception, as some

things are so weird. I will try to clarify some of the issues here. One thing is very

important from the outset. Quantum mechanics is a statistical theory. It tells us the

various possible outcomes of experiments and the corresponding probabilities if we

would do a large number of identical experiments on individual quantum systems.

Identical experiments are necessarily idealizations, but this is not much of a restriction in

practice, as many variables (e.g. what is going on in Sydney or on the next bench in the

lab) are irrelevant. If we take this view that quantum mechanics provides a statistical

description of a large number of identical experiments, but cannot say much (unless

probabilities are unity or zero) about the outcome of a single experiment, a lot of the

difficulties disappear. In this context taking a spectrum of a sample in the gas phase

appears to be a single experiment but in our view it amounts to doing measurements on

many individual quantum systems. The systems are not all identical but this is the same

type of fluctuation that occurs in classical statistical descriptions. At first sight the

situation may not appear very different from the description provided by classical

statistical mechanics. In that case however, we have a description (classical mechanics)

that provides a complete description of the world, which is far too complex, however, to

be of use. Think of the microscopic description of a gas of classical particles for example.

In addition there are issues with classical chaotic behavior that make the premise of an

"in principle complete classical description" a little shaky. In quantum mechanics we do

have the statistics, but there appears to be no underlying more complete level of

description. For instance in quantum mechanics it is in principle impossible to say when a

particular nucleus will decay, or if an electron will make a left or a right turn in a Stern-

Gerlach experiment. The information we can obtain is inherently statistical, and there is

no indication at present that we will be able to do more than give a statistical description

of experiments. It also appears to be sufficient in practice. In particular there is no

contradiction in that quantum mechanics can be used to derive classical mechanics, as at

a microscopic level there are always many nearly identical 'experiments' leading to

deterministic probability distributions that can be used to formulate classical mechanics.

A detailed analysis of this is rather involved and certainly beyond the scope of these

notes. Let me emphasize however that statistical arguments based on a larger ensemble of

some sort will always need to be invoked in Quantum Mechanical explanations.

These notes consist of four parts. The first part gives a brief discussion of the connection

between theory and experiment and gives a brief description of the conventional

formulation of quantum mechanics, but phrased in terms of density matrices and

projectors, rather than wave functions. The wave function formulation itself can be found

in many textbooks and I refer you to the excellent textbook by Cohen-Tannoudji, Diu,

and Laloë. In the second part I will discuss a childishly simple example of a classical

measurement and indicate which aspects are different in quantum mechanics. They may

not make sense - which is the point I wish to drive home - but they do describe the factual

experimental situation. This example includes in essence a discussion of the uncertainty

relations, the so-called non-separability problem, Bell's inequalities, and Aspect's

experiments in the eighties and my reading of them. In the third part we will go beyond the

conventional presentation of QM and discuss in more detail the act of measurement and

an elegant formulation in terms of reduced density operators. This treatment is essential

to describe many common experiments as the conventional formulation of quantum

mechanics is too restricted. In particular we need to be able to describe a sequence of

measurements starting from a single ensemble. In the conventional treatment we always

continue with only the branch that yielded a particular outcome to the first experiment,

and so forth. This is seldom the case in practice. In the fourth part I will argue that the

essence of measurement, namely decoherence of the wave packet and evolution into a statistical

mixture, does not require a macroscopic apparatus. This provides a basis to derive

statistical mechanics from quantum mechanics using collisions of microscopic systems to

create density matrices that are diagonal in the energy representation. This is very much

the same as the classical treatment and Boltzmann's H-theorem. In a way these notes

extend far beyond interpretation issues, but that is how they started to evolve.

I. Elementary introduction.

Every description of an experiment on a microscopic system (even single molecule

spectroscopy) is essentially statistical. Typically one performs an experiment on a sample

consisting of similar microscopic systems. In an idealized theoretical description we view

such an experiment as equivalent to performing a sequence of measurements on each

(now supposedly identical) microscopic system in isolation. This generates a definite

result for each individual experiment, and the statistics of the distribution of results is

described by quantum mechanics. This distribution of results may not quite agree with

the experimental result for a variety of reasons. The actual experimental sample will

contain a distribution of different microsystems, it will involve some interaction between

different microsystems or the environment, and so forth. Some of these effects can be

taken into account by using statistical mechanics. This is beyond where I want to go,

however. Let us simply assume that the above quantum description would agree very

well with the experimental result.

You may have noticed that the above is quite an abstraction from the actual experimental

situation. Theory describes the outcome of experiment, but it does not even try to

describe the actual physical situation or reality. Taking this minimalistic point of view on

the theoretical descriptions will take the sting out of many issues on interpretation of

quantum mechanics. The agreement with experiment is what we can verify. All the rest is

speculation. I do not say that the rest is not important for science. As scientists we need

some visualization of reality in order to think creatively about new possibilities. And it is

even perfectly valid to teach these views as it helps to progress science. However, these

views may be personal and erroneous. We may argue a lot about these things, but at the

end of the day, who is to say? Therefore I will restrict the discussion as much as possible

to things we do 'know'. Or, according to Popper: "a statement can only be scientific if it is

possible, in principle, to do an experiment to prove it wrong".

---- Discussion of postulates ----- See handout Cohen-Tannoudji -----

Discussion of postulates using density matrices and projectors.

The formulation of quantum mechanics can be phrased a little more compactly and

elegantly using density matrices and projectors. This avoids distinguishing between cases

where eigenvalues are degenerate and also the overall phase of the wave function is

irrelevant. It will pave the way for subsequent discussion.

Let us denote by $|a_i, t\rangle \equiv |i, t\rangle$, $t = 1, \ldots, n_i$, a set of orthonormal eigenvectors of the operator $A$ that correspond to the same $n_i$-fold degenerate eigenvalue $a_i$ and hence span the corresponding subspace. Then we can define the orthogonal projector on the eigenspace by

$$P(a_i) = \sum_{t=1}^{n_i} |i,t\rangle\langle i,t|,$$

with matrix representation

$$P_{pq}(a_i) = \sum_{t=1}^{n_i} \langle p|i,t\rangle\langle i,t|q\rangle.$$

An orthogonal projector has the important properties $P^\dagger = P$, $P^2 = P$ (idempotency), as you can verify for yourself. Moreover the only eigenvalues of $P(a_i)$ are 0 or 1. Any vector (completely) within the subspace corresponds to eigenvalue 1, while vectors orthogonal to the subspace have eigenvalue 0. $P(a_i)$ acts as the identity operator within the subspace. The operator $P(a_i)$ is independent of the precise definition of the eigenvectors $|i,t\rangle$. Another orthonormal set of vectors $|i,x\rangle$ that span the subspace would do just as well: $P(a_i) = \sum_{x=1}^{n_i} |i,x\rangle\langle i,x|$ would give the same matrix representation $P_{pq}(a_i)$ (verify). Finally $P(a_i) P(a_j) = 0$ for $a_i \neq a_j$, because the eigenvectors corresponding to different eigenvalues are orthogonal. The operator $A$ can be represented as $A = \sum_i a_i P(a_i)$, from which follows immediately

$$A^2 = \sum_i a_i P(a_i) \sum_j a_j P(a_j) = \sum_i a_i^2 P(a_i),$$

$$f(A) = \sum_i f(a_i) P(a_i).$$

The probability to measure an eigenvalue $a_i$ in a state $|\Psi\rangle$ is given by

$$\sum_{t=1}^{n_i} \langle\Psi|i,t\rangle\langle i,t|\Psi\rangle = \langle\Psi|P(a_i)|\Psi\rangle.$$

Moreover the (unnormalized) state after the measurement is given by $P(a_i)|\Psi\rangle = \sum_{t=1}^{n_i} |i,t\rangle\langle i,t|\Psi\rangle$. In short, projectors are a convenient way to deal with degenerate states. Only the subspaces are relevant and this is precisely the focus of the projectors.
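
The projector properties listed above can be checked numerically. What follows is a minimal sketch in plain Python, assuming a hypothetical three-dimensional example in which $a_1 = 2$ is doubly degenerate and $a_2 = 5$ is non-degenerate. The helper names (`outer`, `projector`, `expectation`, and so on) and the chosen vectors are illustrative and are not part of the original notes.

```python
import math

def outer(u, v):
    """Matrix |u><v| for two vectors given as lists of numbers."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def close(X, Y, tol=1e-12):
    """True if two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def projector(vectors):
    """P = sum_t |i,t><i,t| over an orthonormal set spanning the eigenspace."""
    dim = len(vectors[0])
    P = [[0.0] * dim for _ in range(dim)]
    for v in vectors:
        P = add(P, outer(v, v))
    return P

def expectation(psi, P):
    """<psi|P|psi> for a normalized state psi."""
    n = len(psi)
    return sum(psi[p].conjugate() * P[p][q] * psi[q]
               for p in range(n) for q in range(n)).real

# Assumed example: a_1 = 2 spans the first two axes, a_2 = 5 the third.
s = 1 / math.sqrt(2)
P1 = projector([[1, 0, 0], [0, 1, 0]])        # built from the set |1,t>
P1_alt = projector([[s, s, 0], [s, -s, 0]])   # built from a rotated set |1,x>
P2 = projector([[0, 0, 1]])
ZERO = [[0.0] * 3 for _ in range(3)]
A = add(scale(2, P1), scale(5, P2))           # A = sum_i a_i P(a_i)

same_projector = close(P1, P1_alt)            # independent of the basis choice
idempotent = close(matmul(P1, P1), P1)        # P^2 = P
orthogonal = close(matmul(P1, P2), ZERO)      # P(a_1) P(a_2) = 0
square_ok = close(matmul(A, A), add(scale(4, P1), scale(25, P2)))

psi = [1 / math.sqrt(3)] * 3
prob_a1 = expectation(psi, P1)                # probability to measure a_1
```

With the equal-weight state chosen here, two of the three components lie in the degenerate subspace, so `prob_a1` comes out as 2/3 regardless of which orthonormal set was used to build the projector.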

We can go one step further and associate a projector with the state $|\Psi\rangle$ itself. This is called the density operator and is denoted $D = |\Psi\rangle\langle\Psi|$. In this case the density operator corresponds to a pure state and is a projector. Moreover

$$\mathrm{Tr}(D) = \sum_p \langle p|\Psi\rangle\langle\Psi|p\rangle = \sum_p \langle\Psi|p\rangle\langle p|\Psi\rangle = \langle\Psi|\Psi\rangle = 1.$$

The density $D$ completely characterizes the system and is independent of the overall phase of $|\Psi\rangle$. The probability to measure $a_i$ on a system described by $D$ is given by

$$p(a_i) = \mathrm{Tr}(P(a_i) D) = \sum_p \sum_{t=1}^{n_i} \langle p|i,t\rangle\langle i,t|\Psi\rangle\langle\Psi|p\rangle = \sum_p \sum_{t=1}^{n_i} \langle\Psi|p\rangle\langle p|i,t\rangle\langle i,t|\Psi\rangle = \sum_{t=1}^{n_i} \langle\Psi|i,t\rangle\langle i,t|\Psi\rangle,$$

which agrees with the postulates. The system after measurement of eigenvalue $a_i$ (without normalization) would be given by

$$P(a_i) D P(a_i) = \sum_{t,s} |i,t\rangle\langle i,t|\Psi\rangle\langle\Psi|i,s\rangle\langle i,s|.$$

The density as given above is normalized to

$$\mathrm{Tr}(P(a_i) D P(a_i)) = \mathrm{Tr}(P(a_i) P(a_i) D) = \mathrm{Tr}(P(a_i) D) = p(a_i).$$

Later on we will see that the complete ensemble after the measurement of $A$ (at time $t_a$) can be represented as $D(t_a) = \sum_i P(a_i) D P(a_i)$ with normalization $\sum_i p(a_i) = 1$. This density is not idempotent in general and is not a projector. It would not correspond to a pure state but to a mixture $D(t_a) = \sum_i p(a_i) |\Psi_i\rangle\langle\Psi_i|$. We will discuss this later on.
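
The difference between a pure state and the mixture obtained after measurement can be made concrete with a small numerical sketch. It assumes a hypothetical two-level system with real amplitudes and an observable $A$ whose two eigenprojectors are diagonal. The names `P_up`, `P_down` and the quantity `purity` (the trace of $D^2$) are introduced here for illustration only.

```python
import math

def outer(u, v):
    """Matrix |u><v| for two vectors given as lists of numbers."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def sandwich(P, D):
    """Unnormalized density P D P of the branch selected by projector P."""
    return matmul(matmul(P, D), P)

# Assumed pure state and eigenprojectors of a two-level observable A.
psi = [math.sqrt(0.8), math.sqrt(0.2)]
D = outer(psi, psi)                     # D = |psi><psi|
P_up = [[1, 0], [0, 0]]
P_down = [[0, 0], [0, 1]]

p_up = trace(matmul(P_up, D))           # p(a_i) = Tr(P(a_i) D)
p_down = trace(matmul(P_down, D))

# Complete ensemble after the measurement: D(t_a) = sum_i P(a_i) D P(a_i)
D_after = add(sandwich(P_up, D), sandwich(P_down, D))

purity_before = trace(matmul(D, D))               # 1 for a pure state
purity_after = trace(matmul(D_after, D_after))    # below 1 for a mixture
```

In this example the probabilities are 0.8 and 0.2, the trace of `D_after` is still 1, and the purity drops from 1 to 0.68, which is one way to see that `D_after` is no longer idempotent.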

The time dependence of the density operator (in general, pure state or mixture) is given by $-i\hbar \frac{\partial D}{\partial t} = [D, H]$. This is discussed in the section of Cohen-Tannoudji that accompanies these notes, and I have little to add. We will use this in the exercises.
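
For a pure state the equation of motion of $D$ follows in two lines from the Schrödinger equation. The sketch below assumes the standard form $i\hbar\,\partial_t |\Psi\rangle = H|\Psi\rangle$ with a Hermitian $H$; writing the factor $\hbar$ explicitly is an assumption about units.

$$i\hbar \frac{\partial}{\partial t}|\Psi\rangle = H|\Psi\rangle, \qquad -i\hbar \frac{\partial}{\partial t}\langle\Psi| = \langle\Psi|H,$$

$$-i\hbar \frac{\partial D}{\partial t} = -i\hbar \left( \frac{\partial |\Psi\rangle}{\partial t}\langle\Psi| + |\Psi\rangle \frac{\partial \langle\Psi|}{\partial t} \right) = -H|\Psi\rangle\langle\Psi| + |\Psi\rangle\langle\Psi|H = [D, H].$$

Because the result is linear in $D$, the same form carries over to a mixture written as a sum of pure-state densities with time-independent weights.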

Let us finally consider two hermitean operators $A$ and $B$ having the respective eigenspace projectors $P(a_i)$ and $P(b_j)$. If $A$ and $B$ commute they have a complete set of common eigenvectors. It can be shown that in this case the projectors on the respective eigenspaces commute, $[P(a_i), P(b_j)] = 0 \;\; \forall\, i, j$. The proof runs as follows. Let

$$A = \sum_i a_i P(a_i); \qquad B = \sum_j b_j P(b_j)$$

and

$$[A, B] = \sum_{i,j} a_i b_j [P(a_i), P(b_j)] = 0.$$

Each individual term in the sum should equal zero as the operator parts are independent (projectors on different subspaces). Therefore either $a_i = 0$, $b_j = 0$ or the individual projectors commute. The special cases require some extra work. Since all projectors not corresponding to zero eigenvalues necessarily commute, we know that

$$[P(a=0), B] = \Big[\Big(1 - \sum_{a_i \neq 0} P(a_i)\Big), B\Big] = \Big[P(a=0), \sum_j b_j P(b_j)\Big] = 0.$$

Hence the "null-projector" for $A$ commutes with all non-null-projectors for $B$ and therefore also with $\big(1 - \sum_{b_j \neq 0} P(b_j)\big) = P(b=0)$, which completes the proof. This result is completely equivalent to the statement that $A$ and $B$ have a complete set of common eigenfunctions. We will use this result later on. The projectors $P(a_i) P(b_j)$ would project on the subspace spanned by those eigenvectors $|a_i, b_j, t\rangle$ that all have the same eigenvalues $a_i$ and $b_j$.

II. Measurement of non-commuting observables.

The most famous example of two non-commuting observables are position and

momentum. The properties of these operators are a little complicated because their

spectra are continuous. It is easier to consider the case of measuring angular momentum or

even better the spin of an S=1/2 system. The three cartesian components of S do not

commute and we have the commutation relations $[S_x, S_y] = i\hbar S_z$. However we can very

well measure any of these individual quantities and we can also perform a sequence of

measurements and analyse the results. In the absence of magnetic interactions in the

hamiltonian the resulting state vectors after the measurement are independent of time,

which is another simplification. In fact to discuss the results of quantum mechanics let us

not use any mathematics at all. Let us analyse the real content first and then venture into

mathematical formulations.

As our ensemble we take a class of schoolkids. Each of these kids has a lunchpacket that

consists of three items. They all have a turkey or roastbeef sandwich ($t$ or $r$), a coke or a

sprite to drink ($c$ or $s$) and an apple or an orange ($a$ or $o$) for dessert. Our measurement

consists of asking a kid what is in the lunchbag, and getting statistics on the ensemble

(the class). However, we can ask only one question at a time. For example: "everybody

with a turkey sandwich stand to the right". But not: "All that have an orange and a coke

please stand on the left". That is asking two questions at once, and in the analogy with the

spin system reflects the impossibility to simultaneously measure non-commuting

observables. In fact any 'measurement' we do should obey the laws of quantum

mechanics. Our goal is to characterize the distribution of lunchbags (e.g. how many

$tca, rca, tsa, rco, \ldots$ etc. there are). Can we do this? If things behaved classically, easily.

But not in the quantum world. Let us try. We would first ask all kids who has turkey and

who has roast beef, and partition them into two groups. Then we would ask the turkey

group who has a coke and set them apart. Fine. We already have an ensemble that has

both a turkey and a coke, right? Let us check, and ask again. Who has a coke? Everybody

has a coke. Now, who has a turkey sandwich? Oops. This doesn't work. Only about half

of them have turkey. Asking the coke question destroyed the information we had on the

turkey. In the quantum world it is impossible to isolate a group where everybody has both

a coke and turkey. Asking the question changes the ensemble. This is fairly easy to

understand mathematically, describing an ensemble as a vector in Hilbert space, that

rotates under measurement, but it certainly does not make much sense when asking about

lunch bags.
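
The turkey-and-coke sequence can be mimicked with projectors on a spin-1/2 ensemble. The following is a rough sketch under stated assumptions: "turkey" is identified with spin up along $z$ and "coke" with spin up along $x$ (an illustrative mapping, not one fixed by the notes), the class starts as an unpolarized ensemble, and the helper names `select` and `prob` are hypothetical.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def prob(D, P):
    """Fraction of the ensemble D that answers 'yes' to the question P."""
    return trace(matmul(P, D))

def select(D, P):
    """Sub-ensemble that answered 'yes' to P, renormalized to trace 1."""
    PDP = matmul(matmul(P, D), P)
    norm = trace(PDP)
    return [[x / norm for x in row] for row in PDP]

# Assumed mapping: turkey = spin up along z, coke = spin up along x.
P_TURKEY = [[1.0, 0.0], [0.0, 0.0]]
P_COKE = [[0.5, 0.5], [0.5, 0.5]]

class_ensemble = [[0.5, 0.0], [0.0, 0.5]]        # unpolarized starting class
turkey_group = select(class_ensemble, P_TURKEY)  # "who has turkey?"
turkey_coke_group = select(turkey_group, P_COKE) # "who has a coke?"

p_coke_again = prob(turkey_coke_group, P_COKE)      # everybody has a coke
p_turkey_again = prob(turkey_coke_group, P_TURKEY)  # only half has turkey
```

Under these assumptions the repeated coke question is answered "yes" with probability 1, while the turkey question, which was certain before the coke question was asked, now gives 1/2.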

The above is a representation in as simple a language as possible of some puzzling

properties of quantum mechanics. The essence is that according to quantum mechanics

(sometimes) we cannot create an ensemble that for sure will yield definite values for two

non-commuting observables. This is the content of the Heisenberg uncertainty principle.

The precise formulation would be

$$\Delta A\, \Delta B \geq \frac{1}{2} \left| \langle [A, B] \rangle \right|.$$

For a proof and discussion see Cohen-Tannoudji pages 286-289.
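
As a concrete check, here is a worked example that is not in the original notes: a spin-1/2 ensemble prepared in the $S_z$ eigenstate $|{+}z\rangle$, with $A = S_x$ and $B = S_y$, assuming the standard relation $[S_x, S_y] = i\hbar S_z$.

$$\langle S_x \rangle = \langle S_y \rangle = 0, \qquad \langle S_x^2 \rangle = \langle S_y^2 \rangle = \frac{\hbar^2}{4} \quad \Rightarrow \quad \Delta S_x\, \Delta S_y = \frac{\hbar^2}{4},$$

$$\frac{1}{2} \left| \langle [S_x, S_y] \rangle \right| = \frac{1}{2} \left| i\hbar \langle S_z \rangle \right| = \frac{1}{2} \cdot \hbar \cdot \frac{\hbar}{2} = \frac{\hbar^2}{4}.$$

This ensemble saturates the bound. For an ensemble prepared in $|{+}x\rangle$ instead, $\Delta S_x = 0$ and $\langle S_z \rangle = 0$, so both sides vanish, which shows how the minimum spread depends on the state under consideration.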

It is often stated as "one cannot measure the precise value of A and B simultaneously".

This is a very incomplete statement of the principle and it has led to all kinds of

ingenious constructions to violate the principle. It is much easier and more complete to

interpret the principle in a different way. There is no problem to measure $A$ or $B$, and for

each measurement (either $A$ or $B$, but not both) on an individual system you get definite

results. However, for certain pairs of eigenvalues of $A$ and $B$, $(a_i, b_j)$ say, it is in

principle impossible (according to QM) to prepare an ensemble such that all of the

measurements on this ensemble yield precisely the result $a_i$ if you measure $A$ and $b_j$ if

you would measure $B$. In contrast there is no problem in preparing an ensemble such that

every member would yield $a_i$ if you measure $A$. You might put in some effort to

appreciate the precise translation of the mathematical formulation of the uncertainty

principle into words. It is a little easier if the commutator $[A, B]$ is a constant, since then

no ensemble will yield the same value for A and B for all elements in the ensemble. So

necessarily there is a spread, and the minimum spread depends on the commutator. In the

general formulation the minimum spread depends on an expectation value and hence on

the state under consideration. Note that quantum mechanics actually does not preclude

that individual systems have definite values for all observables. It does say that within the

realm of quantum mechanics you cannot create an ensemble to prove it. Also note that it

is impossible to discuss the uncertainty principle using a single system. It is perfectly

possible to have an experiment where you measure A then B then A then B and find

nothing weird: measurement of $A$ yields $a_i$ twice, while the measurement of $B$ yields $b_j$

in both cases. This is quite a possible outcome of this experiment. But beforehand you

cannot be certain that it will happen that way. It is impossible to create an ensemble

where all elements necessarily behave in this fashion. Of course you might be lucky and

by chance, using small enough ensembles one can easily violate the Heisenberg

uncertainty principle. That is all part of statistics.

Let us discuss another hair raising situation. A long standing controversy is the so-called

Einstein-Podolsky-Rosen paradox (EPR). EPR sought the properties of individual

systems obeying the laws of quantum mechanics. In essence all parties can agree on the

fact that a measurement can change the system. So in the example above if I ask Mary if

she has a coke, afterwards she might no longer have the roastbeef sandwich that she

started out with. However, the issue at hand is different. EPR thought it would be

possible that each lunchbag has a definite content before measurement, and we are simply

looking what is in it. By looking at one piece of information we might, in the convoluted

act of measurement, change another piece of information in ways that are hard to predict.

This would then be the reason that one cannot prepare well specified ensembles, which

are themselves prepared by measurements. It might be that we simply have too little

control over the act of measurement (at present?). Quantum aficionados tend to think

differently about what happens during a measurement on an individual system. Their idea

is that by measuring you force the microsystem to take a position. It is like flipping a coin

at the moment of measurement. "Choose my dear electron! Up or Down?" By the act of

measurement you force the system into an eigenstate of the corresponding observable,

and it does so with probabilities predicted by quantum mechanics. The precise outcome

of an individual experiment is unpredictable in principle. If one reads initial accounts of

the Heisenberg uncertainty principle, they very much reflect the viewpoint of EPR.

Heisenberg himself for example discusses how measuring position necessarily changes

the momentum of an electron. The later accepted viewpoint according to the so-called

Copenhagen interpretation is rather convoluted in that they use classical mechanics to

describe the measuring apparatus and so there is a mysterious connection between the

quantum and classical system. However, I think that the above stated position of the

quantum aficionado reflects the attitude of many scientists in the field. It was my

position until I wrote these notes.

Let us adjust our lunchbag parable a little so that we can describe the EPR line of thought

in trivial terms. What if we could gain information about what is in a lunchbag without

asking a question? Let us set up the experiment in a tricky way. Say we know that the

lunchbags are handed out in complementary pairs. Each pair contains both turkey and

roastbeef, an apple and an orange, a coke and a sprite. So a pair of lunchbags might

consist of $tca$ & $rso$ or $rso$ & $tca$ and so forth. We look when the lunchbags are handed out

and keep track of the corresponding pairing of the kids. The actual quantum experiment

consists for example of two spin 1/2 atoms in an overall S=0 state. You will discuss it

yourself later on, working through a set of questions... Back to the kids. Let's say, Lois

and Clark form a pair. Now we take Clark out on the playground and ask him about his

sandwich. "Turkey," he says. "I would like salami!" Lois doesn't even know we asked, but

we now know that she has roastbeef without asking her (or perturbing her lunchbag). If

we would ask her she would say roastbeef 100% of the time. However, we don't need to

ask her about her sandwich as we know already. Instead we ask Lois about her drink. "I

have a coke," she says. After we ask the coke question she might no longer have

roastbeef, but if we assume she has something definite in her lunchbag, before the coke

question it was most definitely roastbeef and a coke. So this is a smart measurement that

shows it makes perfect sense that every lunchbag has something definite in it and by

measuring we simply find out what it is. Only, by asking one specific question we might

change the content of the lunchbag in other respects, and in unpredictable ways. At the

time EPR wrote their paper this interpretation was in no conflict with any piece of data

whatsoever. It was just an interpretation that should have appealed as something far more

rational than flipping a coin at the time of measurement. If we take the alternative

quantum interpretation about what actually happens, the EPR experiment is seen to take

on all of its weirdness. Asking Clark what is in his lunchpacket forces him to take a

position. Clark flips a coin to make a decision. "Turkey". If we now would ask Lois about

her sandwich she will say roastbeef for sure. So she flips her coin too, but it always yields

the same result. If we wouldn't have asked Clark it would give a fifty-fifty result, but now

it yields 100%. Now neither Lois nor her coin knows anything about our asking Clark. To put it

in the extreme: flipping a coin in Tokyo determines the outcome of the flipping of the

coin in New York. That doesn't make sense. The EPR interpretation is far more

reasonable: if we assume there is something definite in the lunch bag, there is nothing

strange about us knowing what is in Lois's lunch bag if we know what Clark has, given

they form a perfect pair.

However, EPR did something more. They claimed that physical theories should describe

'reality', which means that quantum mechanics should allow for ensembles of completely

specified lunchbags. This it did not, and therefore the theory was not quite up to par.

Quantum theory was incomplete. In order to describe ensembles of well specified

$S_x, S_y, S_z$ the structure of the theory needs to be changed completely. If we use the

concept of Hilbert space, operators and eigenvalues it cannot accommodate EPR's

reality. Quantum theory was too successful to discard it, just because of a difference in

interpretation that had apparently no measurable consequences. Glad we didn't. In my

opinion the more reasonable thing to do would have been to adopt the EPR interpretation

but live with the fact that quantum mechanics only describes ensembles that can actually

be prepared by measurements. Measurements unfortunately tend to perturb the system

such that no fully specified ensembles can be prepared. Or perhaps they could and one

might make further advances, necessarily leading to a new theory. In essence this would

mean to say EPR might be right, but quantum mechanics seemed to do the job in practice.

It would have kept the search for alternative theories alive but they would necessarily

have the same statistics as quantum mechanics, which has served us very well.

This was the situation until John Bell came around. He showed that the EPR

interpretation might lead to different results from the usual quantum theory for some

experiments. And he used the EPR experiment to show it. This is how it works in terms

of lunchbags. If EPR's position is right then in fact I can construct what was in Lois's

lunchpacket from the pairing experiment. From Clark's answer I know she had roastbeef,

and by our question we also know she has a coke. We are simply assuming that the

question to Clark could not possibly have affected Lois's lunch box. There is no

unpredictable act of measurement that has a range from New York to Tokyo. Let us

assume therefore for the sake of argument that EPR are right. Every lunchbox has a

definite content and by doing the pairing experiment I can determine two items in a

lunchbox. Now we take our whole class and do three types of experiment starting from

identical ensembles in each experiment. In the first experiment we use the pairing

experiment to determine if somebody has a turkey sandwich and a coke. By assumption

she would then have either an orange or an apple as the third item. If we do this for the

whole first ensemble we can write

$$n[t,c] = n[t,c,o] + n[t,c,a],$$

where $n[t,c]$ denotes the number of kids in the ensemble that have both a turkey and a

coke, and so forth. In the next group we determine the number that has a sprite and an

orange, in the third group turkey and orange. In total we would then have the following

relations, assuming the minimal EPR conditions

$$n[t,c] = n[t,c,o] + n[t,c,a]$$

$$n[s,o] = n[t,s,o] + n[r,s,o]$$

$$n[t,o] = n[t,c,o] + n[t,s,o]$$

From this we can derive the so-called Bell inequality:

$$n[t,c] + n[s,o] \geq n[t,c,o] + n[t,s,o] = n[t,o].$$
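
The inequality follows by adding the first two relations and dropping the non-negative counts $n[t,c,a]$ and $n[r,s,o]$. As a sanity check, the sketch below draws random classes of fully specified lunchbags and confirms that the inequality holds for every one of them. It is a simplification: the three counts are taken on one and the same ensemble rather than on three identically prepared ones, and the names `BAGS`, `count` and `bell_holds`, the class size, and the random weights are all assumptions made for illustration.

```python
import itertools
import random

# All eight fully specified lunchbags, e.g. "tca" = turkey, coke, apple.
BAGS = ["".join(bag) for bag in itertools.product("tr", "cs", "ao")]

def count(ensemble, *items):
    """Number of bags in the ensemble that contain all the given items."""
    return sum(1 for bag in ensemble if all(item in bag for item in items))

def bell_holds(ensemble):
    """Check n[t,c] + n[s,o] >= n[t,o] for one classical ensemble."""
    lhs = count(ensemble, "t", "c") + count(ensemble, "s", "o")
    return lhs >= count(ensemble, "t", "o")

random.seed(0)
results = []
for _ in range(1000):
    # A class of 200 kids with an arbitrary distribution over the 8 bags.
    weights = [random.random() for _ in BAGS]
    ensemble = random.choices(BAGS, weights=weights, k=200)
    results.append(bell_holds(ensemble))

all_hold = all(results)
```

No choice of weights can make `all_hold` false, because the inequality is a counting identity for any collection of bags with definite contents.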

This is an inequality that one can test in an actual experiment, as one can make a spin

zero pair, let it fly apart and measure the spin in different directions for the particle in

Tokyo and the particle in New York. We will discuss the full details of the precise

quantum treatment later on. But the outcome is that the usual treatment of quantum

mechanics is in conflict with the above analysis based on the assumptions of EPR.

Quantum mechanics violates Bell's inequalities. At the time of the EPR paper (1935)

people couldn't really say if EPR was right or the standard Copenhagen interpretation was

right. Neither contradicted any experiment. It appeared simply a matter of

interpretation. With Bell however, there was a testable hypothesis. Experiments were

done from the seventies onward, and the experiments by Alain Aspect in the early eighties are perhaps best known

(though not the first). The technical details and fine print are rather involved, but the

conclusion was that the traditional laws of quantum mechanics are correct. So one cannot

assume that individual particles actually have definite values for $S_x, S_y, S_z$ and we are

simply determining what they are, although perturbing these values in the process.
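
The conflict with the lunchbag counting can be illustrated with a short calculation for a spin-zero pair. This is only a sketch under explicit assumptions that are not taken from the notes: the three questions are mapped to spin components along three coplanar directions at 0, 120 and 240 degrees (sandwich, drink, fruit), the first letter of each pair ($t$, $c$, $a$) is taken as spin up, and Lois's unasked item is inferred as the opposite of Clark's answer. The function names are hypothetical.

```python
import math

def spin_state(angle, sign):
    """Spin-1/2 eigenvector along a direction in the x-z plane.

    angle is measured from the z axis; sign is +1 (up) or -1 (down).
    """
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return (c, s) if sign > 0 else (-s, c)

def joint_probability(angle1, sign1, angle2, sign2):
    """Probability that particle 1 gives sign1 along angle1 and particle 2
    gives sign2 along angle2, for the singlet (|ud> - |du>)/sqrt(2)."""
    u = spin_state(angle1, sign1)
    v = spin_state(angle2, sign2)
    amplitude = (u[0] * v[1] - u[1] * v[0]) / math.sqrt(2)
    return amplitude ** 2

# Assumed measurement directions for the three questions.
SANDWICH, DRINK, FRUIT = 0.0, 2 * math.pi / 3, 4 * math.pi / 3

# Clark answers roastbeef (so Lois is assigned turkey), Lois answers coke.
f_tc = joint_probability(SANDWICH, -1, DRINK, +1)
# Clark answers coke (so Lois is assigned sprite), Lois answers orange.
f_so = joint_probability(DRINK, +1, FRUIT, -1)
# Clark answers roastbeef (so Lois is assigned turkey), Lois answers orange.
f_to = joint_probability(SANDWICH, -1, FRUIT, -1)

violated = f_tc + f_so < f_to
```

With these angles the quantum fractions are 0.125, 0.125 and 0.375, so the sum of the first two falls below the third and the inequality $n[t,c] + n[s,o] \geq n[t,o]$ fails. Other choices of directions may or may not show a violation.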

Does this mean that we have to accept the alternative interpretation? Flipping a coin in

Tokyo determines the outcome of the flipping of a coin in New York? Not in my opinion.

This 'making a choice during the measurement' aspect appears to be an act of human

imagination. Us trying to understand what we cannot grasp. I think it is better to take a

very mundane position. Quantum mechanics describes the statistical outcomes of

complete experiments. In doing the measurement in Tokyo I am preparing a specific

ensemble. The subsequent measurement in New York is described using this new

ensemble. From the perspective of quantum mechanics the pairing experiment is no

different from first asking Lois if she has roastbeef and then asking if she has a coke. A

measurement is a measurement, and if a measurement in Tokyo tells you something

about the situation in New York, you have to adjust the ensemble accordingly. Of course

this is nothing more than using the laws of quantum mechanics which are very definite

for this type of experiment. What is hard to understand is how there can be such a strong

correlation between two distant particles, which cannot be assumed to individually have

definite properties, while as a pair they do. Quantum mechanics gives us the

mathematical prescription but it does go against common sense, and this is illustrated

very vividly by the 'flipping a coin at the time of measurement' picture. This

phenomenon is called entanglement in the literature. I have a set of exercises that has

you work out the quantum mechanics of Bell's experiment.

III. Measurements.

Below I will discuss a simple model of measurement, treating everything quantum

mechanically. Many things will be seen to fall into place, as long as the quantum

mechanics is treated only as a statistical theory and NOT as a description of individual

quantum systems. This includes interference effects that can be washed out by

measurements, 'reduction of the wave packet', Schrödinger's cat paradox, and so forth.

Let us start from the basic postulates of quantum mechanics and build a model for

measurements that is consistent with these postulates. Let us consider a two-level

quantum system described by a state $|\Psi\rangle$ and an observable $B$ with two non-degenerate

eigenfunctions $|b_1\rangle, |b_2\rangle$ having eigenvalues $b_1$ and $b_2$ respectively. The observable $B$ is

measured by a perfect measuring apparatus that can be in one of three possible states

$|B_0\rangle, |B_1\rangle, |B_2\rangle$, where $|B_0\rangle$ indicates that nothing has been measured, while $|B_1\rangle$ and $|B_2\rangle$

indicate two different values for the pointer on the apparatus to be associated with the

two eigenvalues of the observable $B$. We want to describe everything quantum

mechanically, so at time $t_0$ (long before measurement) the overall state of the world $|W\rangle$

is given by

$$|W(t_0)\rangle = |B_0\rangle |\Psi\rangle.$$

Let us be perfectly clear about the notation. The Hilbert space that contains the state of

the world is a direct product space

$$\mathcal{H}_W = \mathcal{H}_B \otimes \mathcal{H}_Q.$$

The fact that $|W(t_0)\rangle$ is a product function (rather than some linear combination) indicates

that the system is initially non-interacting. The overall Hamiltonian is given by

$$H_W = H_B \otimes 1_Q + 1_B \otimes H_Q + H_{\mathrm{int}} = H_B \oplus H_Q \oplus H_{\mathrm{int}}.$$

The interaction part of the Hamiltonian that is responsible for the actual measurement is

supposed to be short range (think of a particle flying through a detector as in a Stern-

Gerlach experiment). The system evolves in time and the Hamiltonian of interaction

between quantum system and measuring apparatus acts in such a fashion that shortly after

the measurement (time tb say) the system is described by

$$|W(t_b)\rangle = |B_1\rangle |b_1\rangle \langle b_1|\Psi\rangle + |B_2\rangle |b_2\rangle \langle b_2|\Psi\rangle.$$

Beyond this time the effect of $H_{\mathrm{int}}$ is negligible again, and both quantum system and

measuring apparatus evolve independently. We are making some simplifications here.

There is no time evolution under the influence of the zeroth order Hamiltonian $H_B + H_Q$,

implying we are using the interaction picture (to be discussed later). We also completely

leave open the act of measurement or a discussion of $H_{\mathrm{int}}$ as it does not enter the

postulates. Our aim is to find a model system that is completely described by quantum

mechanics, and which agrees with the postulates. A better understanding of

measurements will demand a more complete description. For the current goals of

interpretation the above is sufficient, however.

The data to be abstracted from the above wave function is as follows: "If we would repeat

this experiment many times the pointer would read $B_1$ with a probability $p(b_1) = |\langle b_1|\Psi\rangle|^2$

and $B_2$ with a probability $p(b_2) = |\langle b_2|\Psi\rangle|^2$". This is in perfect agreement with the

postulates. Note that the wave function of the world is a superposition of macroscopic

states. This is not a problem as this just reflects the statistics of the situation and not a

"true state of affairs" (whatever that means). The paradox of Schrödinger's cat does not

appear, as Schrödinger discussed only a single experiment (one cat). A quantum

treatment and interpretation necessarily involves a large number of identical experiments.

In this case we do have a valid repeatable experiment. Now you may well ask, "why is

this a measurement? It looks like ordinary quantum mechanics of a compound system."

Very true. There is no reduction of the wave function and there appears to be nothing too

special about measurements. There are no quantum jumps, just a special type of time-

evolution, described by the world-Hamiltonian in the world-Hilbert space. There is

something special about the macroscopic states $|B_i\rangle$, however. They are so different that

the two components of the wave function can never show interference effects anymore.

Mathematically $\langle B_1|O|B_2\rangle = 0$ for any physical operator O (that consists of one- and

two-particle interactions between elementary particles). This is the essence of the act of


measurement: The quantum wave function is decomposed into its eigenvector

components and each of these components becomes correlated with a macroscopic state,

destroying interference. At that point the various components of the wave function start

behaving as a classical ensemble (in particular no interference effects). This means that I

can think of my ensemble as being partitioned into two subsets. All particles in subset 1

have property $b_1$, while the elements of subset 2 have property $b_2$. Everything I can derive

from the wave function of the world would agree with such an interpretation. Finally an

element in the ensemble that has property $b_1$ has the measurement apparatus in state $|B_1\rangle$

and this means that the pointer on the apparatus refers to this particular value. It is clear

that this discussion is not altogether clear-cut or comfortable. The idea of measurement is

very difficult, and in the end we might not be able to say more than that we are linking

different entities together by introducing correlations. Ok, let us not make this into a

philosophy class and discuss what we can learn from the wave function of the world.
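
A minimal numerical sketch of the single-measurement model may help here. It assumes numpy, takes the eigenvectors of B to be the standard basis, and uses an illustrative state $|\Psi\rangle = (0.6, 0.8i)$; none of these choices come from the notes themselves. It builds $|W(t_b)\rangle$ in $\mathcal{H}_B \otimes \mathcal{H}_Q$ and reads off the pointer probabilities.

```python
import numpy as np

# Illustrative choices (assumptions): B is diagonal in the standard basis.
b = np.eye(2)                  # b[0] = |b1>, b[1] = |b2>
B = np.eye(3)                  # B[0] = |B0>, B[1] = |B1>, B[2] = |B2>
psi = np.array([0.6, 0.8j])    # normalized |Psi>

# |W(t_b)> = |B1>|b1><b1|Psi> + |B2>|b2><b2|Psi>  (apparatus index first)
W = sum(np.kron(B[i + 1], b[i]) * np.vdot(b[i], psi) for i in range(2))

def pointer_probability(i):
    """Expectation value of |B_i><B_i| (x) 1_Q in the world state."""
    P = np.kron(np.outer(B[i], B[i]), np.eye(2))
    return np.vdot(W, P @ W).real

p0, p1, p2 = (pointer_probability(i) for i in range(3))
print(p0, p1, p2)
```

With these values the pointer reads $B_1$ with probability 0.36 and $B_2$ with probability 0.64, which are just $|\langle b_1|\Psi\rangle|^2$ and $|\langle b_2|\Psi\rangle|^2$, while the "nothing measured" state $|B_0\rangle$ has probability zero.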

Next consider the following situation. Suppose that before we measure B we measure an

observable A. We have to adjust the wave function of the world and the Hilbert space

accordingly to incorporate the new situation. Using analogous notation, the wave

function before any measurement would look like

$$|W(t_0)\rangle = |A_0\rangle |B_0\rangle |\Psi\rangle$$

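Shortly after measuring A (time $t_A$ say) we have the wave function

$$|W(t_A)\rangle = |A_1\rangle |B_0\rangle |a_1\rangle \langle a_1|\Psi\rangle + |A_2\rangle |B_0\rangle |a_2\rangle \langle a_2|\Psi\rangle$$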

while if we subsequently measure B we obtain

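$$\begin{aligned} |W(t_B)\rangle = {} & |A_1 B_1\rangle |b_1\rangle \langle b_1|a_1\rangle \langle a_1|\Psi\rangle + |A_1 B_2\rangle |b_2\rangle \langle b_2|a_1\rangle \langle a_1|\Psi\rangle \\ & + |A_2 B_1\rangle |b_1\rangle \langle b_1|a_2\rangle \langle a_2|\Psi\rangle + |A_2 B_2\rangle |b_2\rangle \langle b_2|a_2\rangle \langle a_2|\Psi\rangle \end{aligned}$$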

There are now four distinct events with their individual probability amplitudes. They are

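most easily collected in a matrix that reflects the pointers on the apparatuses A and B:

$$\begin{array}{c|cc} & A_1 & A_2 \\ \hline B_1 & \langle b_1|a_1\rangle \langle a_1|\Psi\rangle & \langle b_1|a_2\rangle \langle a_2|\Psi\rangle \\ B_2 & \langle b_2|a_1\rangle \langle a_1|\Psi\rangle & \langle b_2|a_2\rangle \langle a_2|\Psi\rangle \end{array}$$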


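The probability to measure $b_1$ irrespective of the value of A is now given by the sum of

the squared moduli of the entries in the first row

$$\begin{aligned} p(b_1) &= |\langle b_1|a_1\rangle \langle a_1|\Psi\rangle|^2 + |\langle b_1|a_2\rangle \langle a_2|\Psi\rangle|^2 \\ &= |\langle b_1|a_1\rangle|^2 \, p(a_1) + |\langle b_1|a_2\rangle|^2 \, p(a_2) \end{aligned}$$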

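This may be compared to the probability to measure $B_1$ if we did not measure A first,

which was given by

$$\begin{aligned} p(b_1) &= |\langle b_1|\Psi\rangle|^2 = |\langle b_1|a_1\rangle \langle a_1|\Psi\rangle + \langle b_1|a_2\rangle \langle a_2|\Psi\rangle|^2 \\ &= |\langle b_1|a_1\rangle|^2 |\langle a_1|\Psi\rangle|^2 + |\langle b_1|a_2\rangle|^2 |\langle a_2|\Psi\rangle|^2 + 2\,\mathrm{Re}\left( \langle b_1|a_1\rangle \langle a_1|\Psi\rangle \langle b_1|a_2\rangle^* \langle a_2|\Psi\rangle^* \right) \end{aligned}$$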

It is seen that the two results differ by the so-called interference term in the B-only

measurement. Measuring A before B affects the results of the measurement of B unless

the interference term happens to be zero! This is the origin of the statement that

according to quantum mechanics measurements necessarily perturb the system. Please

note that the first result can be interpreted as saying that a fraction $p(a_1)$ of the ensemble is in the state $|a_1\rangle$ and

another fraction $p(a_2)$ is in the state $|a_2\rangle$. The measurement of A has effectively

partitioned the ensemble into two subensembles. The above has immediate bearing on, for

example, the two-slit experiment: measuring through which slit the electron goes destroys

the interference pattern. The above discussion indicates that interference is the normal

case, when dealing with waves or wave functions. The source for wonder should be that

the act of measurement can destroy this interference pattern. According to quantum

mechanics it does not depend in any way on the details of the measurement (e.g. the

precise form of the interaction Hamiltonian). We will investigate in a minute what

happens if A and B commute and one might expect that the measurements do not

influence each other then.
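
The difference between the two probabilities can be checked numerically. The sketch below assumes numpy, a real rotation angle `theta` between the eigenbases of A and B, and an illustrative real state; all of these are assumptions made for the example only. It compares adding probabilities (A measured first) with adding amplitudes (B only).

```python
import numpy as np

theta = np.pi / 5                         # illustrative angle between the two eigenbases
b1 = np.array([1.0, 0.0])                 # |b1>
a1 = np.array([np.cos(theta), np.sin(theta)])     # |a1>
a2 = np.array([-np.sin(theta), np.cos(theta)])    # |a2>
psi = np.array([0.6, 0.8])                # normalized |Psi>

# Amplitudes <b1|a_i><a_i|Psi> for the two intermediate outcomes of A
amp1 = np.vdot(b1, a1) * np.vdot(a1, psi)
amp2 = np.vdot(b1, a2) * np.vdot(a2, psi)

p_b1_after_A = abs(amp1) ** 2 + abs(amp2) ** 2     # A measured first: probabilities add
p_b1_direct = abs(amp1 + amp2) ** 2                # B only: amplitudes add
interference = 2 * (amp1 * np.conj(amp2)).real     # 2 Re(amp1 amp2*)

print(p_b1_after_A, p_b1_direct, interference)
```

The direct probability equals $|\langle b_1|\Psi\rangle|^2 = 0.36$, and it differs from the A-then-B probability by exactly the interference term, which is nonzero for this choice of angle.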

Let me mention here another way to calculate the probabilities in the A + B

measurement. The wave function of the world corresponding to the eigenvalue $b_1$ is

two-fold degenerate: $|A_1 B_1 b_1\rangle$ and $|A_2 B_1 b_1\rangle$ are two orthogonal eigenfunctions. The

probability to find the eigenvalue $b_1$ is then the squared length of the projection of $|W\rangle$ onto this subspace, which is

given by $|\langle b_1|a_1\rangle|^2 \, p(a_1) + |\langle b_1|a_2\rangle|^2 \, p(a_2)$ as before.
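
The same number can be obtained by an explicit projection in the world Hilbert space. This is only a sketch, assuming numpy, three pointer states per apparatus, and the same illustrative angle and state as one might use for a hand calculation; the ordering of the tensor factors (A, then B, then the quantum system) is a convention chosen here.

```python
import numpy as np

theta = np.pi / 5
b = np.eye(2)                                          # rows: |b1>, |b2>
a = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])        # rows: |a1>, |a2>
ptr = np.eye(3)                                        # pointer states |X0>, |X1>, |X2>
psi = np.array([0.6, 0.8])

def world(i, j):
    """Basis vector |A_i B_j b_j> in H_A (x) H_B (x) H_Q (i, j = 1, 2)."""
    return np.kron(np.kron(ptr[i], ptr[j]), b[j - 1])

# |W(t_B)> = sum_ij |A_i B_j>|b_j><b_j|a_i><a_i|Psi>
W = sum(world(i, j) * np.vdot(b[j - 1], a[i - 1]) * np.vdot(a[i - 1], psi)
        for i in (1, 2) for j in (1, 2))

# Projector onto the two-fold degenerate subspace {|A1 B1 b1>, |A2 B1 b1>}
P_b1 = sum(np.outer(world(i, 1), world(i, 1)) for i in (1, 2))
p_b1 = np.vdot(W, P_b1 @ W).real

p_a = [abs(np.vdot(a[i], psi)) ** 2 for i in range(2)]
p_b1_formula = sum(abs(np.vdot(b[0], a[i])) ** 2 * p_a[i] for i in range(2))
print(p_b1, p_b1_formula)
```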

The above notation is rather cumbersome as we have in principle to keep track of all

measurements to understand what is going on. Life can be simplified by using the

concept of density matrices and in particular so-called reduced density matrices. We

define the density matrix of a pure quantum state $|\Psi\rangle$ as the operator

$$D_\Psi = |\Psi\rangle \langle\Psi|$$

It is easily verified that the density operator for a pure state is a projection operator,

$D^2 = D$, $D^\dagger = D$. In addition $\mathrm{Tr}[D] = 1$. Similarly, measurements are best described in

terms of projection operators, in case there are degeneracies. In the above model we

assumed no degeneracies, and we will continue to do so. However, using projection

operators it makes no difference, which is one reason to use them. So let us define

$$P^Q(a_1) = |a_1\rangle \langle a_1|, \quad P^A(A_1) = |A_1\rangle \langle A_1|, \quad \text{etc.}$$

Please note that the projectors act on different Hilbert spaces. We can construct operators

that act in the complete space $\mathcal{H}_W$ as before, e.g. $1_A \otimes P^Q(a_1)$. None of this poses any

essential difficulties. It is merely notation. The superscripts on the projection operators

are redundant and we will omit them from now on.
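
These properties are easy to confirm numerically. The following sketch assumes numpy and an illustrative normalized state; the three-state apparatus and the choice $|a_1\rangle = (1, 0)$ are assumptions made only to have something concrete to multiply.

```python
import numpy as np

psi = np.array([0.6, 0.8j])                  # normalized |Psi> (illustrative)
D = np.outer(psi, psi.conj())                # D = |Psi><Psi|

is_idempotent = np.allclose(D @ D, D)        # D^2 = D
is_hermitian = np.allclose(D, D.conj().T)    # D^dagger = D
trace = np.trace(D).real                     # Tr[D] = 1

# A projector on the quantum system, lifted to the full space H_A (x) H_Q
a1 = np.array([1.0, 0.0])
P_a1 = np.outer(a1, a1.conj())               # P(a1) = |a1><a1|
P_world = np.kron(np.eye(3), P_a1)           # 1_A (x) P(a1), a 6x6 matrix

print(is_idempotent, is_hermitian, trace, P_world.shape)
```

The lifted operator is again a projector, so nothing changes in the formalism when we move between the quantum Hilbert space and the world Hilbert space.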

Let us return to a description of the measurement of an observable A using a description

in terms of density matrices. The initial density matrix of the world is given by

$$D_W(t_0) = |A_0\rangle |\Psi\rangle \langle\Psi| \langle A_0|$$

and it evolves under action of the measurement Hamiltonian into

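$$D_W(t_a) = \left( |A_1\rangle |a_1\rangle \langle a_1|\Psi\rangle + |A_2\rangle |a_2\rangle \langle a_2|\Psi\rangle \right) \left( \langle\Psi|a_1\rangle \langle a_1| \langle A_1| + \langle\Psi|a_2\rangle \langle a_2| \langle A_2| \right)$$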

The density matrix $D_W(t)$ still corresponds to a pure state, and we only used the

time-evolution of $\Psi_W$ to derive the above equation. Later on we will discuss the equation of

motion for density matrices (the Liouville equation). At this point we have collected all

of the information concerning the measurement on A. In order to get rid of the


redundancies in the notation we can integrate out the inessential variables concerning the

measuring apparatus A. To this end we define a reduced density operator

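$$\begin{aligned} R_Q(t_a) &= \sum_i \langle A_i| D_W(t_a) |A_i\rangle \\ &= |a_1\rangle \langle a_1|\Psi\rangle \langle\Psi|a_1\rangle \langle a_1| + |a_2\rangle \langle a_2|\Psi\rangle \langle\Psi|a_2\rangle \langle a_2| \\ &= p(a_1) |a_1\rangle \langle a_1| + p(a_2) |a_2\rangle \langle a_2| \\ &= \sum_i p(a_i) P(a_i) \end{aligned}$$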

The reduced density operator acts solely in the 'quantum' Hilbert space. It is seen that all

potential cross terms disappear due to taking the trace. This reduced density matrix fully

describes the quantum system after measurement of A. The weights $p(a_i)$ are precisely

the probabilities to measure $A_i$.
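
The partial trace over the apparatus can be carried out explicitly. The sketch below assumes numpy, a three-state apparatus, and illustrative values for the A eigenbasis and the state; the tensor ordering (apparatus index first) is a convention chosen for this example.

```python
import numpy as np

theta = np.pi / 5
a = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])        # rows: |a1>, |a2>
A = np.eye(3)                                          # apparatus states |A0>, |A1>, |A2>
psi = np.array([0.6, 0.8j])

# |W(t_a)> = |A1>|a1><a1|Psi> + |A2>|a2><a2|Psi>
W = sum(np.kron(A[i + 1], a[i]) * np.vdot(a[i], psi) for i in range(2))
D_W = np.outer(W, W.conj())                            # pure state of the world

# R_Q = sum_i <A_i| D_W |A_i>: reshape to (A, Q, A', Q') and trace over A = A'
R_Q = np.einsum('iaib->ab', D_W.reshape(3, 2, 3, 2))

p = [abs(np.vdot(a[i], psi)) ** 2 for i in range(2)]
R_expected = sum(p[i] * np.outer(a[i], a[i].conj()) for i in range(2))

print(np.allclose(R_Q, R_expected), np.trace(R_Q).real)
```

The world density matrix is idempotent, while the reduced one is not: the cross terms between $|A_1\rangle$ and $|A_2\rangle$ are exactly what the trace removes.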

Suppose now we decide to also measure B at a later time $t_b$. We should go back and

redefine our initial $D_W(t)$. The state $|B_0\rangle$ carries through the whole analysis and after

taking the trace over the states in $\mathcal{H}_A$, we obtain for the density operator before

measurement of B (but after measurement of A):

$$R_{BQ}(t_a) = p(a_1) |B_0\rangle |a_1\rangle \langle a_1| \langle B_0| + p(a_2) |B_0\rangle |a_2\rangle \langle a_2| \langle B_0|$$

Upon measurement of B this operator evolves into

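$$\begin{aligned} R_{BQ}(t_b) = {} & p(a_1) \left( |B_1\rangle |b_1\rangle \langle b_1|a_1\rangle + |B_2\rangle |b_2\rangle \langle b_2|a_1\rangle \right) \left( \langle a_1|b_1\rangle \langle b_1| \langle B_1| + \langle a_1|b_2\rangle \langle b_2| \langle B_2| \right) \\ & + p(a_2) \left( |B_1\rangle |b_1\rangle \langle b_1|a_2\rangle + |B_2\rangle |b_2\rangle \langle b_2|a_2\rangle \right) \left( \langle a_2|b_1\rangle \langle b_1| \langle B_1| + \langle a_2|b_2\rangle \langle b_2| \langle B_2| \right) \end{aligned}$$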

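Upon taking the trace with respect to $\mathcal{H}_B$ we obtain the reduced quantum density operator

$$\begin{aligned} R_Q(t_b) = {} & p(a_1) \left( |b_1\rangle |\langle b_1|a_1\rangle|^2 \langle b_1| + |b_2\rangle |\langle b_2|a_1\rangle|^2 \langle b_2| \right) \\ & + p(a_2) \left( |b_1\rangle |\langle b_1|a_2\rangle|^2 \langle b_1| + |b_2\rangle |\langle b_2|a_2\rangle|^2 \langle b_2| \right) \end{aligned}$$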

Overall the probability to find the measured value b1 equals

$$p(b_1) = p(a_1) |\langle b_1|a_1\rangle|^2 + p(a_2) |\langle b_1|a_2\rangle|^2$$

the same result as obtained before. At this point the usage of the reduced density matrices

becomes clear. One never really has to explicitly consider the measuring apparatus. For

good reasons it does not enter the postulates. The final measurement dictates what a

convenient representation of $R_Q$ looks like. It is diagonal in the eigenvector basis

corresponding to the final measurement, and in general it is not a pure state, but a

mixture, meaning $R^2 \neq R$ in general.
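
The two-step calculation can be followed numerically. The sketch assumes numpy, a three-state apparatus B, and illustrative values for the eigenbases and the initial state; the helper `after_B` is a hypothetical name introduced here for the map $|B_0\rangle|\phi\rangle \rightarrow \sum_j |B_j\rangle|b_j\rangle\langle b_j|\phi\rangle$.

```python
import numpy as np

theta = np.pi / 5
a = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])        # rows: |a1>, |a2>
b = np.eye(2)                                          # rows: |b1>, |b2>
B = np.eye(3)                                          # apparatus states |B0>, |B1>, |B2>
psi = np.array([0.6, 0.8j])
p_a = [abs(np.vdot(a[i], psi)) ** 2 for i in range(2)]

def after_B(state):
    """State of apparatus B plus quantum system after the B measurement."""
    return sum(np.kron(B[j + 1], b[j]) * np.vdot(b[j], state) for j in range(2))

# R_BQ(t_b) = sum_i p(a_i) |W_i><W_i| with |W_i> = after_B(|a_i>)
R_BQ = sum(p_a[i] * np.outer(after_B(a[i]), after_B(a[i]).conj()) for i in range(2))

# Trace over H_B to get the reduced quantum density operator R_Q(t_b)
R_Q = np.einsum('iaib->ab', R_BQ.reshape(3, 2, 3, 2))

p_b1 = R_Q[0, 0].real
p_b1_formula = sum(p_a[i] * abs(np.vdot(b[0], a[i])) ** 2 for i in range(2))
is_diagonal = np.allclose(R_Q, np.diag(np.diag(R_Q)))
is_mixed = not np.allclose(R_Q @ R_Q, R_Q)
print(p_b1, p_b1_formula, is_diagonal, is_mixed)
```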

The general formulation uses projection operators, which are independent of a choice of

basis. So if an initial (reduced) density matrix is given by $R(t_0)$, the reduced density

matrix after measurement of an operator A would be given by

$$R(t_a) = \sum_i P(a_i) R(t_0) P(a_i)$$

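which is independent of the basis set. The probability to find the eigenvalue $a_k$ is given

by

$$\begin{aligned} p(a_k) &= \mathrm{Tr}[P(a_k) R(t_a)] \\ &= \mathrm{Tr}[P(a_k) \sum_i P(a_i) R(t_0) P(a_i)] \\ &= \mathrm{Tr}[P(a_k) R(t_0)] \end{aligned}$$

where the last step uses the cyclic invariance of the trace and $P(a_i) P(a_k) = \delta_{ik} P(a_k)$.

You can easily verify that $\mathrm{Tr}[R(t_a)] = \mathrm{Tr}[R(t_0)] = 1$ for any measurement. This does not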

require any renormalization of any kind. It is clear that the use of reduced density

matrices has enormous advantages. We will need to spend some more time exploring the

properties of the above general formulation and practicing with it. Something else that has

been neglected at this point is the time-dependence of the system due to $H_Q$. This will

need to be clarified, but it is not very essential for the interpretation of Quantum

Mechanics.
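
The general formula works unchanged when eigenvalues are degenerate. As a sketch, assuming numpy and a hypothetical observable A on a 3-dimensional space whose eigenvalue $a_1$ is doubly degenerate, one can check trace preservation and the probability formula directly. The function name `measure` is introduced here for illustration.

```python
import numpy as np

def measure(R, projectors):
    """Measurement of A on the whole ensemble: R -> sum_i P(a_i) R P(a_i)."""
    return sum(P @ R @ P for P in projectors)

# Hypothetical observable: a1 doubly degenerate, a2 non-degenerate
P_a1 = np.diag([1.0, 1.0, 0.0])
P_a2 = np.diag([0.0, 0.0, 1.0])
projectors = [P_a1, P_a2]

psi = np.array([1.0, 1.0j, 1.0]) / np.sqrt(3)    # illustrative pure initial state
R0 = np.outer(psi, psi.conj())
Ra = measure(R0, projectors)

trace_before = np.trace(R0).real
trace_after = np.trace(Ra).real
p_from_Ra = [np.trace(P @ Ra).real for P in projectors]   # Tr[P(a_k) R(t_a)]
p_from_R0 = [np.trace(P @ R0).real for P in projectors]   # Tr[P(a_k) R(t_0)]
print(trace_before, trace_after, p_from_Ra, p_from_R0)
```

Both ways of computing the probabilities give 2/3 and 1/3 for this state, and the trace stays equal to one without any renormalization.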

The nature of the reduction postulate is also becoming clear at present. In the case that we

consider the measurement of A followed by B one considers what can happen next to a

system, given that at time $t_a$ one has actually measured $a_i$. Therefore one is interested in

finding the so-called conditional probability to find $b_j$, given the result $a_i$ upon

measuring A. This conditional probability is given by $P(A_i \,\&\, B_j) / P(A_i)$, and is simply

given by $|\langle b_j|a_i\rangle|^2$ in the non-degenerate case. This is the reason to redefine the wave

function to be $|\Psi\rangle = |a_i\rangle$ if we want the conditional probability. In general:


Measuring A for a density matrix R yields

$$\mathrm{Tr}(P(a_i) R) = p(a_i)$$

$$R(t_a) = \sum_i P(a_i) R(t_0) P(a_i)$$

Subsequent measurement of B:

$$p(b_j) = \mathrm{Tr}[P(b_j) \sum_i P(a_i) R P(a_i)]$$

$$p(b_j \,\&\, a_i) = \mathrm{Tr}[P(b_j) P(a_i) R(t_0) P(a_i)]$$

$$p(b_j \,\&\, a_i) / p(a_i) = \; ??$$

Nothing simple emerges. The above result would agree with the conditional probability

that is described by the usual postulates of quantum mechanics. However it

depends on the initial state in the degenerate case, and we cannot simplify. We will

do an exercise to make this clear. The fact remains that a reduction of the wave function

is done more out of convenience than anything else. Using reduced density matrices there

is little reason to proceed like this. The density matrix formalism is capable of describing

actual experiments that contain sequential measurements on a complete ensemble.

Moreover everything is described in the language of quantum mechanics. There is no

reference to the fact that the measuring apparatus should be described by classical

mechanics, as in the Copenhagen interpretation of quantum mechanics.
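
The dependence of the conditional probability on the initial state can be made concrete. The sketch below assumes numpy and a 3-dimensional space; the two initial states and the projectors are hypothetical choices. It evaluates $\mathrm{Tr}[P(b_j) P(a_i) R P(a_i)] / \mathrm{Tr}[P(a_i) R]$ for a rank-one and for a rank-two projector $P(a_i)$.

```python
import numpy as np

def conditional(P_b, P_a, R):
    """p(b & a) / p(a) = Tr[P_b P_a R P_a] / Tr[P_a R]."""
    return (np.trace(P_b @ P_a @ R @ P_a) / np.trace(P_a @ R)).real

def pure(v):
    """Projector |v><v| onto the normalized vector v."""
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

P_b1 = pure([1, 0, 0])
R1, R2 = pure([1, 1, 1]), pure([1, 2, 1])              # two illustrative initial states

P_a_nondeg = pure([1, 1, 0])                           # non-degenerate eigenvalue: rank 1
P_a_deg = np.diag([1.0, 1.0, 0.0]).astype(complex)     # degenerate eigenvalue: rank 2

nondeg = [conditional(P_b1, P_a_nondeg, R) for R in (R1, R2)]
deg = [conditional(P_b1, P_a_deg, R) for R in (R1, R2)]
print(nondeg, deg)
```

For the rank-one projector both initial states give 0.5, which is $|\langle b_1|a\rangle|^2$. For the rank-two projector the results are 0.5 and 0.2, so the conditional probability carries a memory of the initial state.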

Let us also consider the case of commuting observables. They are alternatively

characterized by the fact that the projectors on the respective eigenspaces commute

$[P(a_i), P(b_j)] = 0 \;\; \forall \, i, j$, as discussed before. Using this result we can easily show that the

measurement of A before B yields the same result as measurement of B alone, provided

A and B commute.

Starting from a density matrix R the density matrix after measuring A is given by

$$\sum_i P(a_i) R P(a_i)$$

Subsequently upon measuring B the probability to measure the eigenvalue $b_k$ is given by

$$\begin{aligned} \mathrm{Tr}[P(b_k) \sum_i P(a_i) R P(a_i)] &= \mathrm{Tr}[\sum_i P(a_i) P(b_k) R P(a_i)] \\ &= \mathrm{Tr}[\sum_i P(a_i)^2 P(b_k) R] = \mathrm{Tr}[\sum_i P(a_i) P(b_k) R] = \mathrm{Tr}[P(b_k) R] \end{aligned}$$

which is the probability to measure $b_k$ directly. Here we used the commutation of the projectors, the cyclic

invariance of the trace, $P(a_i)^2 = P(a_i)$ and $\sum_i P(a_i) = 1$. Therefore the measurements of A and B

do not influence each other. Both can be measured precisely, and a system can be

prepared in a common eigenstate of A and B. You may wish to return to the more

elementary discussion on pages 13 and 14 to see what happens if A and B commute.
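
A quick numerical check of this statement, assuming numpy and hypothetical projectors on a 3-dimensional space: the set `P_b` commutes with every projector of A, while the set `P_c` (added only for contrast) does not.

```python
import numpy as np

def measure(R, projectors):
    """R -> sum_i P_i R P_i."""
    return sum(P @ R @ P for P in projectors)

def probs(R, projectors):
    """Probabilities Tr[P R] for each projector."""
    return np.array([np.trace(P @ R).real for P in projectors])

psi = np.array([1.0, 1.0j, 2.0]) / np.sqrt(6)    # illustrative initial state
R = np.outer(psi, psi.conj())

P_a = [np.diag([1.0, 1.0, 0.0]), np.diag([0.0, 0.0, 1.0])]
P_b = [np.diag([1.0, 0.0, 0.0]), np.diag([0.0, 1.0, 1.0])]   # commutes with all P_a

direct_b = probs(R, P_b)
after_A_b = probs(measure(R, P_a), P_b)

# Contrast: projectors that do not commute with P_a
v = np.array([1.0, 0.0, 1.0]) / np.sqrt(2)
P_c = [np.outer(v, v), np.eye(3) - np.outer(v, v)]
direct_c = probs(R, P_c)
after_A_c = probs(measure(R, P_a), P_c)

print(direct_b, after_A_b)
print(direct_c, after_A_c)
```

In the commuting case the probabilities are 1/6 and 5/6 whether or not A is measured first. In the non-commuting case the first probability drops from 3/4 to 5/12 once A has been measured.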

Finally it is easy to include the time evolution in the formalism. If we assume a

time-independent Hamiltonian for the quantum system we have the equation of motion

$$i\hbar \frac{\partial D(t)}{\partial t} = H D(t) - D(t) H = [H, D(t)]; \quad D(t = t_0) = D(t_0) \;\rightarrow\; D(t) = e^{-iH(t - t_0)/\hbar} D(t_0) e^{iH(t - t_0)/\hbar}$$

Following the discussion in Cohen-Tannoudji we can also use the evolution operator

$U(t, t_0) = e^{-iH_Q(t - t_0)/\hbar} = \sum_p e^{-i\omega_p (t - t_0)} |p\rangle \langle p|$, where the basis $|p\rangle$ are the eigenfunctions of the

time-independent $H_Q$ with eigenvalue $E_p = \hbar\omega_p$. Using the time evolution operator the time

dependent density matrix can be written as

$$D(t) = U(t, t_0) D(t_0) U^\dagger(t, t_0) = U(t, t_0) D(t_0) U(t_0, t),$$

and this expression remains valid even if the Hamiltonian is time-dependent. Please

notice the nice matrix-like ordering of the time arguments in this formulation. The time

evolution using U would take place until the time of (instantaneous) measurement, then

the projection operators corresponding to measurement are inserted and the time

evolution is continued. For example measuring A first at time $t_a$ and then B at time $t_b$

would yield the probabilities (in full glory)

$$p(b_j)[t_b, t_a, t_0] = \sum_i \mathrm{Tr}[P(b_j) U(t_b, t_a) P(a_i) U(t_a, t_0) D(t_0) U(t_0, t_a) P(a_i) U(t_a, t_b)]$$

Simplifications occur if for example $[H, A] = 0$ and hence $[P(a_i), U(t, t_0)] = 0$. The

above expression would then no longer depend on $t_a$, as using the commutation and the

fact that $U(t_b, t_a) U(t_a, t_0) = U(t_b, t_0)$ we can write

$$\begin{aligned} p(b_j)[t_b, t_0] &= \sum_i \mathrm{Tr}[P(b_j) U(t_b, t_0) P(a_i) D(t_0) P(a_i) U(t_0, t_b)] \\ &= \sum_i \mathrm{Tr}[P(b_j) P(a_i) U(t_b, t_0) D(t_0) U(t_0, t_b) P(a_i)] = \sum_i \mathrm{Tr}[P(a_i) P(b_j) P(a_i) D(t_b)] \end{aligned}$$

Using the cyclic invariance of the trace operation many different formulations can be

achieved, all yielding precisely the same results. In the exercises you will be asked to

evaluate various probabilities and their time dependence depending on the mutual

commutation relations of A, B and H using a 4-dimensional Hilbert space.
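
As a warm-up for such calculations, here is a sketch of the sequential-measurement formula with time evolution. It assumes numpy, units with $\hbar = 1$, and a hypothetical 3-level Hamiltonian with illustrative projectors (not the 4-dimensional system of the exercises); the function names `U` and `p_b` are introduced here for illustration.

```python
import numpy as np

hbar = 1.0                                   # assumption: units with hbar = 1

def U(H, t, t0):
    """U(t, t0) = exp(-i H (t - t0) / hbar), built from the eigenbasis of H."""
    E, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * E * (t - t0) / hbar)) @ V.conj().T

def p_b(P_bj, P_a, H, D0, tb, ta, t0=0.0):
    """p(b_j)[t_b, t_a, t_0]: measure A at t_a, then B at t_b."""
    Da = U(H, ta, t0) @ D0 @ U(H, t0, ta)    # evolve to t_a
    Da = sum(P @ Da @ P for P in P_a)        # insert the projectors of A
    Db = U(H, tb, ta) @ Da @ U(H, ta, tb)    # evolve to t_b
    return np.trace(P_bj @ Db).real

H = np.diag([0.0, 1.0, 2.0])                 # illustrative Hamiltonian
v = np.ones(3) / np.sqrt(3)
D0 = np.outer(v, v)                          # initial pure state
P_b1 = np.outer(v, v)                        # B projector onto the same vector

# A commuting with H (diagonal projectors) versus A' not commuting with H
P_a = [np.diag([1.0, 1.0, 0.0]), np.diag([0.0, 0.0, 1.0])]
w = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
P_a_nc = [np.outer(w, w), np.eye(3) - np.outer(w, w)]

commuting = [p_b(P_b1, P_a, H, D0, tb=2.0, ta=ta) for ta in (0.3, 1.1)]
non_commuting = [p_b(P_b1, P_a_nc, H, D0, tb=2.0, ta=ta) for ta in (0.3, 1.1)]
print(commuting, non_commuting)
```

When A commutes with H the result is the same for both values of $t_a$ (here $(3 + 2\cos 2)/9$ for $t_b - t_0 = 2$), whereas for the non-commuting choice the probability changes with the time at which A is measured.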

That's it for now. We will work through some exercises to help you digest this material.
