
Lectures on Foundations of Quantum Information Theory

Vlatko Vedral

October 3, 2004


Contents

1 Classical Information
  1.1 Introduction
  1.2 Quantifying Information
  1.3 Data Compression
  1.4 Correlations

2 Quantum Mechanics
  2.1 Rules of Quantum Mechanics
  2.2 Mixed States
  2.3 Quantum Measurement
  2.4 General Measurement: POVM
  2.5 Two Level Systems
  2.6 Many systems - tensor products

3 Quantum Information – Basics
  3.1 Landauer's Insight
  3.2 Quantum Encoding: Non-orthogonal states
  3.3 Quantum Cryptography
  3.4 No cloning of quantum bits
  3.5 Generalized Dynamics: CP maps
  3.6 Implementing CP maps physically (Ozawa)
  3.7 Rules of Quantum Mechanics Revisited

4 Quantum Protocols
  4.1 Pure State Entanglement
  4.2 Schmidt decomposition
  4.3 Dense Coding
  4.4 Teleportation
  4.5 Entanglement Swapping
  4.6 No Instantaneous Transfer of Information

5 Quantum Information I
  5.1 Von Neumann Entropy
  5.2 Schumacher's Data Compression
  5.3 Entropy of Observation

6 Quantum Information II
  6.1 Quantum Measures of Information
  6.2 Encoding Classical Information into Quantum States
  6.3 Discrimination and Second Law

7 Quantum Entanglement
  7.1 Historical Background for Entanglement
  7.2 Bell's Inequalities
  7.3 Separable States
  7.4 Pure Entangled States Violate Bell Inequalities
  7.5 Mixed Entangled States May Not Violate Bell Inequalities

8 Quantum Entanglement Detection I
  8.1 Detecting Entanglement?
  8.2 Jamiolkowski Isomorphism
  8.3 Examples of entanglement witnesses

9 Quantum Entanglement Detection II
  9.1 Peres-Horodecki Criterion
  9.2 Interferometric implementation of Peres-Horodecki
  9.3 Operational Definition: Formation and distillation
  9.4 Pure State Distillation
  9.5 Carnot Cycle Analogy

10 Measures of Entanglement
  10.1 Schmidt Decomposition & Correlations
  10.2 Mixed States
  10.3 Relative Entropy of Entanglement
  10.4 Uniqueness of the measure?
  10.5 Taxonomy of Entanglement
  10.6 Other approaches?
  10.7 Entanglement and Thermodynamics

11 Quantum Algorithms
  11.1 Computational Complexity
  11.2 Deutsch's Algorithm
  11.3 Grover's Search
  11.4 Factorisation Problem
  11.5 Shor's Algorithm
    11.5.1 A. Beam Splitters
    11.5.2 B. Fourier Transforms (Discrete)
    11.5.3 C. Phase Estimation
    11.5.4 D. Order Finding Problem
    11.5.5 E. Factoring
    11.5.6 F. Example of Order Finding

12 More on Entanglement and Quantum Measurements
  12.1 Search Optimality from Entanglement
  12.2 Model for Quantum Measurement

13 Quantum Error Correction
  13.1 Introduction
  13.2 Simple Example
  13.3 General Conditions
  13.4 Reliable Quantum Computation from Unreliable Components

14 Appendix
  14.1 Measuring tr(ρ²)?
  14.2 Generalization of this to tr(ρᵏ)
  14.3 Checking for Entanglement
  14.4 Measuring tr((ρ^T₂)ᵏ)
  14.5 Measuring fidelity between ρ and σ
  14.6 Properties of von Neumann Entropy
  14.7 Properties of Relative Entropy

Chapter 1

Classical Information

1.1 Introduction

This course deals with the basic concept of information, its importance, classical versus quantum information and, if time permits, the experimental state of the art in quantum information. In terms of literature the closest article to these lectures is my review of a few years back (Rev. Mod. Phys. 74, 197 (2002)). A good reference book (but far too large to ever read) is Nielsen and Chuang, "Quantum Computation and Quantum Information".

The summary of the course is: first we discuss classical information theory and its application to communications; then we discuss quantum mechanics and communication based on the laws of quantum mechanics; then we talk about entanglement, both from the fundamental perspective and for its information processing capability; finally, we introduce quantum computation (the search and factorisation algorithms) and the basics of quantum error correction. Throughout the lectures, connections between information theory, thermodynamics and (quantum) physics are emphasised.

1.2 Quantifying Information

To quantify information it is necessary to agree on what the use of this quantity should be in the first place (i.e. within which context do we speak about information?). This then dictates the properties we would like the information measure to possess. Shannon developed the first successful theory of information in 1948. It should be noted that although his main motivation was to maximise telephone communication profit for Bell Labs, namely to investigate the maximum communication capacity of a (classical) communication channel, his theory is so general that it can be (and has been) applied to many diverse disciplines. It is used in biology


(especially genetics), sociology and economics, to name a few applications. Needless to say, it is also used in physics, and this usage will be the main topic of these lectures. We will look at information from two different perspectives: we investigate what the laws of physics have to say about information processing, and what the laws of information have to say about physics (if anything, of course).

Shannon postulated the following (reasonable, as will be seen later) requirements for any measure of information to possess:

• The amount of information in an event must depend reciprocally on its probability, p. This is a very natural requirement since the more surprised we are by an event happening, the more information this event carries. Since low probability events are more surprising than high probability ones, this justifies our postulate.

• Two further constraints on the measure are:

1. I(p) is a continuous function of p. This is a reasonable postulate for any physical quantity (disregarding phase transitions and some other anomalies). It requires that if the probability changes by a very small amount, the information should also do so. It makes no sense that a small change in the occurrence of some event leads to us being much more surprised when the event takes place.

2. I(p₁p₂) = I(p₁) + I(p₂). The additivity assumption is the most stringent one in the sense that not many functions of real positive numbers (i.e. probabilities) are additive. Why do we want additivity? It is natural that if we have two independent events, one happening with probability p₁ and the next with probability p₂, our surprise should just be the sum of the individual surprises. I would like to warn the reader that this assumption can be given up, but then the information measure will also change. Such measures exist, but are nowhere near as useful and applied as Shannon's information.

• Shannon then proved that there is a unique measure satisfying these requirements (it is unique up to an affine transformation, to be more precise, which means up to an additive and multiplicative constant). The proof of uniqueness is as follows:

• Induction from property 2: I(tʳ) = rI(t), for any r ∈ ℚ.

• Continuity: the same then holds for any r ∈ ℝ⁺.

Let t = 1/2 and write p = 2⁻ʳ, i.e. r = −log₂ p. Then I(p) = I((1/2)ʳ) = rI(1/2), so

∴ I(p) = I(1/2)(−log₂ p)

We define I(1/2) = 1 (a bit!).

In this way we obtain the measure I(p) = −log p.

• Shannon's measure of information is just the average of the above with respect to some probability distribution {pᵢ}:

H = −Σᵢ pᵢ log pᵢ = ⟨−log pᵢ⟩

This measure is the key to classical information theory (but also to classical statistical mechanics via Boltzmann's statistical formula for entropy; this is in fact why Shannon called his quantity "entropy").
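As a quick aside (a minimal numerical sketch I have added; the function name is my own), the measure is straightforward to compute:

```python
import numpy as np

def shannon_entropy(p):
    """H = -sum_i p_i log2(p_i), in bits; 0 log 0 = 0 by convention."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

print(shannon_entropy([0.5, 0.5]))  # 1.0: a fair coin carries one bit
print(shannon_entropy([0.9, 0.1]))  # ~0.47: a biased coin surprises us less
```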

Interestingly, Shannon's logic in using entropy because of its additive property is reminiscent of Maxwell's argument in deriving his velocity distribution in a gas of atoms (molecules). Briefly, he said that the density must be a function of energy, i.e. f(v²), where v is the velocity. However, the probability distributions in the x, y, z directions must be independent, and so we must have

f(vₓ² + v_y² + v_z²) = f(vₓ²)f(v_y²)f(v_z²)   (1.1)

The only function that satisfies the above equation is the exponential, and so f(v) = Ae^{Bv²}, i.e. the Maxwell-Boltzmann distribution! You see how far simple logic can take us in general...

1.3 Data Compression

Here is the first application of Shannon's entropy; it will now become clearer why this concept is useful in the theory of communication.

• The basic concepts of communication we need are: the sender, the receiver, the source and the channel (noisy or noiseless). First I will explain a special case of communication that is very easy to understand. This example will then easily be generalised to cover all relevant cases.

• Noiseless communication with two states:

A source emits 0 with probability p₀ and 1 with probability p₁, where p₀ + p₁ = 1, producing strings such as 0001011000... of length n.

• Number of Messages

The number of typical messages is the number of n-bit strings containing np₀ zeros, i.e. the binomial coefficient C(n, np₀). Calculate this quantity:

log C(n, np₀) = log n! − log(np₀)! − log(np₁)!

Using Stirling's approximation we obtain:

log C(n, np₀) = n log n − np₀ log(np₀) − np₁ log(np₁) = nH(p₀)

∴ N = 2^{nH(p₀)}

• Compression: n → nH(p)

Corollary: A completely random string cannot be compressed.
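A small numerical check of the Stirling estimate above (a sketch I have added; the parameter values are illustrative):

```python
import math

def H2(p):
    """Binary Shannon entropy in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

n, p0 = 10_000, 0.1
log2_messages = math.log2(math.comb(n, int(n * p0)))
print(log2_messages / n)  # ~0.467
print(H2(p0))             # ~0.469: log2 C(n, n*p0) ≈ n H(p0)
```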

• Noisy communication requires us to investigate the concept of correlations (the channel capacity is in this case equal to the correlations between the sender and the receiver after the information has gone through the noisy channel).

1.4 Correlations

(concept + quantification)

A very important quantity in this course will be that of correlations between events. We tend to think of correlations in terms of properties. My socks are correlated if they are always of the same colour: for example, either they are both black or both white. Correlations imply that if I look at one sock and see that it is black (white) then I know that the other one is also black (white). However, in order to talk about correlations we don't need concepts such as colour, black, white, up, down, right, wrong, etc. All we need is to be able to calculate probabilities for different alternatives (e.g. the probability that your socks are black). Once we have probabilities we can then define correlations, and this is the best way of going about it. Let's formalise things a bit with two random variables (the colours of two socks, for example).

• Example: two random variables X and Y:

X  Y  P
0  0  1/2
0  1  0
1  0  0
1  1  1/2

Correlations are measured with a quantity called Mutual Information (also suggested by Shannon). It is defined as:


• Mutual Information

I(X : Y) = H(X) + H(Y) − H(X, Y)

This is a good measure of correlations because it looks at the difference between the uncertainties in the individual events and the total uncertainty in both events. So, if the events are uncorrelated, the individual and total uncertainties are the same and this measure gives zero. Otherwise it is positive.

In the above table the mutual information between X and Y is equal to log 2 + log 2 − log 2 = log 2, which is called "one bit of correlation" or one c-bit ("c" stands for "classical").
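In code (my own minimal sketch of the calculation just done):

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Joint distribution P(X, Y) from the table above
P = np.array([[0.5, 0.0],
              [0.0, 0.5]])
I = entropy(P.sum(axis=1)) + entropy(P.sum(axis=0)) - entropy(P)
print(I)  # 1.0: one c-bit of correlation
```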

• Noisy Channel Capacity.

Let's now explain why this is a good candidate for the capacity of a noisy classical channel. When one person communicates with another, the signal that is sent (the string of bits) can be thought of as a random variable, say X. If the channel is noisy, this string will in general be changed (due to errors, which will occasionally flip zeros into ones and vice versa). The received string is therefore another random variable, say Y. How correlated X and Y are is in fact the same as the channel capacity. If, when the sender sends one bit, the receiver receives exactly this bit (a perfect channel), then the correlation between them is also one bit and so is the capacity. If this bit has a chance of being flipped with probability one half, then the sender and receiver are uncorrelated (a randomising channel), the correlation is zero and so is the capacity for communication.

Theorem (Shannon). If R is the rate of information production, then provided that R < C the information can be transmitted with arbitrary reliability.

Here we only present intuitive reasoning to justify the above form of the capacity. We stress that this argument is valid only for ergodic, stationary sources, for which most sequences of n bits of a source with entropy S have a probability of about e^{−nS}. Loosely speaking, a source is stationary if the probabilities of emitting states do not change over time; it is ergodic if each subsequence of states appears in longer sequences with a frequency equal to its probability. This statement is then an information-theoretic analogue of the Law of Large Numbers in probability theory. The source with entropy S(A) will generate about e^{TS(A)} sequences in a time interval T (this result also follows from the Law of Large Numbers, and 'about' indicates that this is only true asymptotically). Now, each of these will be measured at the output, and each output could then be produced by about e^{TS_B(A)} inputs,


since S_B(A) represents the entropy of A once B has been measured. Therefore the total number of useful (non-redundant) messages (as they are called in communications) is

N = e^{T(S(A) − S_B(A))}   (1.2)

and therefore for the capacity we choose a source with an entropy that maximises S(A) − S_B(A); this is just the mutual information, as S_B(A) = S(A, B) − S(B). If we instead chose a source whose entropy is larger than the channel capacity, then that particular channel would not be able to handle the input and errors would inevitably result. The mutual information between the input and the output of a communication channel is thus a very important quantity, since it determines the maximum rate at which information can be transmitted without errors occurring.

Formally, this is proved as follows. Suppose that the rate is R < C. Then the probability of successful communication is

p_S = e^{TR}/e^{TI} = e^{T(R−I)}   (1.3)

If the rate is just below I, say R = I − ε, then the probability of error after n transmissions is one minus p_S to the power of n,

p_E = (1 − e^{−Tε})ⁿ ≈ (Tε)ⁿ → 0   (1.4)

Therefore the error tends to zero. If the rate is higher than I, the error is amplified.

Finally, I introduce another relevant measure of information.

• Relative Entropy

H(X||Y) = Σᵢ p(xᵢ) log [p(xᵢ)/p(yᵢ)]

This quantity was of great importance in the classical statistical mechanics of Gibbs, but it is also relevant in quantum information, as many key results follow from its monotonicity properties (to be discussed later). The relative entropy expresses the entropic difference between two random variables. This is, in fact, the most proper way of talking about information, which is always a relative concept. The uncertainty in some variable is always measured relative to some other variable. The mutual information, in fact, can be phrased as the relative entropy between the total probability, p(X, Y), and the product of the marginals, p(X)p(Y).
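A sketch of the computation (added by me; it assumes q is strictly positive wherever p is):

```python
import numpy as np

def relative_entropy(p, q):
    """H(X||Y) = sum_i p_i log2(p_i / q_i)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

print(relative_entropy([0.5, 0.5], [0.9, 0.1]))  # ~0.74, strictly positive
print(relative_entropy([0.5, 0.5], [0.5, 0.5]))  # 0.0: zero iff the two agree
```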


• There are many other measures of information that will not be directly relevant for us here. They invariably violate one of Shannon's assumptions (such as continuity and/or additivity) and will therefore not concern us.


Chapter 2

Quantum Mechanics

This chapter reviews all the rules of quantum mechanics. We need to talk about the basic, fundamental notions as well as some practical ways of performing quantum computations. The concept of information will be seen to play an important role here through quantum measurement and quantum entanglement (which are not actually unrelated concepts themselves, as we will touch upon in the course). Both of these will be discussed extensively throughout the lectures. I'll do my best to make it independent of any interpretational issues (which of course are also exciting and important, but I don't want to "poison" the reader at this early stage!).

2.1 Rules of Quantum Mechanics

There are 4 basic postulates of quantum mechanics and these tell you how to represent physical systems, how to represent observations, how to carry out measurements and how systems evolve when "not measured". None of these rules is obvious or natural. They cannot be inferred from anything deeper as far as we know (although there are many worthwhile attempts). There is also a fifth postulate when we talk about many-body quantum mechanics (or relativistic field theory, which is basically the same thing), but we will only discuss this if and when necessary.

• Postulates of quantum mechanics: 1. States of physical systems are represented by vectors in complex vector spaces, known as Hilbert spaces (Hilbert spaces require additional structure beyond plain vector spaces that is not relevant to us here); 2. Quantities that can be observed (measured) are represented by Hermitian operators acting on states in Hilbert spaces; 3. The Born Rule: when a measurement is made, the state of the physical system "collapses" to one of the eigenstates of the Hermitian operator representing the observable. The probability with which this happens is given by the mod square of the overlap between the eigenstate and the actual state; 4. Schrödinger's equation describes the evolution of the system when no measurements are made (identical particles are needed later on: the 5th postulate).

Now, how do we physically motivate postulates like these? They do not seem natural in any possible way. Why vectors? Why complex, above all? Why Hermitian, etc.? I will now present an example that answers quite a few of these whys and at the same time offers us a way to discuss quantum information and show how exactly it differs from classical physics.

• Motivate superposition: we now analyse the well-known Mach-Zehnder interferometer (any other interferometer would serve the same purpose, of course).

• Dirac notation: statics and dynamics. |Ψ〉 → |Φ〉, |Φ〉〈Ψ|.

• Operators as matrices: representing the “butterflies” in the matrixform.

2.2 Mixed States

Pure states represent states of maximal knowledge about the system (we cannot know more even in principle, according to the logic of quantum mechanics, as confirmed by all experiments so far). When we don't have maximal information the state is said to be mixed, and we need to use density matrices to represent it. Let's first explain the physical difference between a superposition and a mixture, as this is frequently misunderstood by many people.

• Example with polarization: pure versus mixed (45° polarization vs unpolarized light).

• Density Matrix ρ:

1. tr(ρ) = 1.
2. ρ† = ρ.
3. ρ is positive semi-definite.

• Purity Factor: tr(ρ²).

• Testing purity: Mach-Zehnder (see appendix).
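These defining properties and the purity factor are easy to verify numerically; a minimal sketch I have added, using the polarization example above:

```python
import numpy as np

def is_density_matrix(rho, tol=1e-9):
    """Check unit trace, Hermiticity and positive semi-definiteness."""
    rho = np.asarray(rho, dtype=complex)
    trace_ok = abs(np.trace(rho) - 1) < tol
    hermitian_ok = np.allclose(rho, rho.conj().T, atol=tol)
    positive_ok = np.all(np.linalg.eigvalsh(rho) > -tol)
    return trace_ok and hermitian_ok and positive_ok

pure = np.array([[0.5, 0.5], [0.5, 0.5]])   # |+><+|, 45-degree polarization
mixed = np.array([[0.5, 0.0], [0.0, 0.5]])  # unpolarized light
for rho in (pure, mixed):
    print(is_density_matrix(rho), np.trace(rho @ rho).real)  # purity 1.0 vs 0.5
```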

Now we discuss the idea of extracting information from states by performing measurements.


2.3 Quantum Measurement

• Projectors (formalised by von Neumann): P = |Ψ⟩⟨Ψ|

1. P² = P
2. P ≥ 0

• Evolution under "complete" measurement:

|Ψ⟩⟨Ψ| → P₀|Ψ⟩⟨Ψ|P₀ + P₁|Ψ⟩⟨Ψ|P₁ = ρ

Suppose P₀ = |0⟩⟨0| and P₁ = |1⟩⟨1| (note that P₀ + P₁ = I). Then

ρ = |0⟩⟨0|Ψ⟩⟨Ψ|0⟩⟨0| + |1⟩⟨1|Ψ⟩⟨Ψ|1⟩⟨1| = |⟨0|Ψ⟩|²|0⟩⟨0| + |⟨1|Ψ⟩|²|1⟩⟨1|

where |⟨0|Ψ⟩|² = p₀ and |⟨1|Ψ⟩|² = p₁.

• If |Ψ⟩ = a|0⟩ + b|1⟩, then the complete measurement removes the off-diagonal (coherence) terms of the density matrix:

( |a|²  ab* )      complete measurement      ( |a|²   0  )
( a*b  |b|² )  ------------------------->    (  0   |b|² )

i.e. coherence → decoherence.

Comment: decoherence is a way in which some people understand the transition from quantum to classical, i.e. the emergence of the classical world.
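A numerical sketch of this decoherence (my own illustration, with an assumed amplitude a = 1/√3):

```python
import numpy as np

a, b = 1 / np.sqrt(3), np.sqrt(2 / 3)
psi = np.array([a, b], dtype=complex)
rho = np.outer(psi, psi.conj())          # pure state, with coherences ab*

P0 = np.diag([1, 0]).astype(complex)     # |0><0|
P1 = np.diag([0, 1]).astype(complex)     # |1><1|
rho_after = P0 @ rho @ P0 + P1 @ rho @ P1
print(np.round(rho, 3))        # off-diagonal terms present
print(np.round(rho_after, 3))  # diagonal only: decoherence
```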

2.4 General Measurement: POVM

This is a very important concept that you would not naturally have seen in your undergraduate studies. It tells us about the most general way in which we can formulate a measurement.

• Need positive operators (we want probabilities to remain positive):

∀|Ψ⟩ : ⟨Ψ|E|Ψ⟩ ≥ 0

Aside: if ⟨Ψ|A|Ψ⟩ ∈ ℝ for all |Ψ⟩, then A is Hermitian.

10

Page 15: Lectures on foundations of quantum information theory by vlatko vedral

Proof

(⟨Ψ| + ⟨Φ|)A(|Ψ⟩ + |Φ⟩) − (⟨Ψ| − ⟨Φ|)A(|Ψ⟩ − |Φ⟩) + i(⟨Ψ| − i⟨Φ|)A(|Ψ⟩ − i|Φ⟩) − i(⟨Ψ| + i⟨Φ|)A(|Ψ⟩ + i|Φ⟩) = 4⟨Ψ|A|Φ⟩

And we can set up the same expression for A†; therefore one gets

⟨Ψ|A|Φ⟩ = ⟨Ψ|A†|Φ⟩

and A = A†. Q.E.D.

• Probability = tr(Eρ) = tr(ρE)

• Σᵢ Eᵢ = I because of conservation of probability.

2.5 Two Level Systems

This course will be all about encoding information into discrete quantum systems. Everything can be built out of the basic building blocks: the two-level systems.

• Bloch representation;

• Pauli Operators;

• Uncertainty;

• Preparation and measurement game.

2.6 Many systems - tensor products

When we have two systems, we create a joint state of the two by combining their individual states using the operation of the tensor (direct) product ⊗. So if system A is in the state |Ψ_A⟩ and system B is in the state |Ψ_B⟩, then the joint state is |Ψ_A⟩ ⊗ |Ψ_B⟩. The tensor product just means that there are two different systems and we should treat them as such. So, if A is an operator acting on system A only, then this action is written as

A ⊗ I |Ψ_A⟩ ⊗ |Ψ_B⟩ = (A|Ψ_A⟩) ⊗ |Ψ_B⟩   (2.1)

where I is the identity operator signifying that nothing is done to B. The tensor product sign is usually superfluous and we will almost always omit it.
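As a sketch (added by me; numpy's kron plays the role of ⊗), Eq. (2.1) can be checked directly:

```python
import numpy as np

psi_A = np.array([1, 0], dtype=complex)                # |0>
psi_B = np.array([1, 1], dtype=complex) / np.sqrt(2)   # |+>
X = np.array([[0, 1], [1, 0]], dtype=complex)          # Pauli X, acts on A only
I = np.eye(2, dtype=complex)

joint = np.kron(psi_A, psi_B)                          # |0>|+>
lhs = np.kron(X, I) @ joint                            # (X ⊗ I)|0>|+>
rhs = np.kron(X @ psi_A, psi_B)                        # (X|0>) ⊗ |+>
print(np.allclose(lhs, rhs))                           # True: Eq. (2.1)
```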


Chapter 3

Quantum Information – Basics

3.1 Landauer’s Insight

• Information must be encoded into physical systems.

• Information must be processed using the physical laws of microscopic dynamics (whatever they are).

Therefore, all limitations on information processing follow from the restrictions of the underlying physical laws. The quantum laws of physics are fundamentally different from the classical ones, and so therefore is the resulting information processing.

3.2 Quantum Encoding: Non-orthogonal states

• How is quantum information different from classical?

Encode: 0 ↦ |Ψ₀⟩ and 1 ↦ |Ψ₁⟩, with ⟨Ψ₁|Ψ₀⟩ ≠ 0.

Looks like we can pack more information!? Wrong!

• Retrieving Information

How much information we can encode into some states depends on how much information we can extract from them. Here is how we can formalise this idea.

– With Projectors (Holevo):

P₀ ↦ |Ψ₀⟩, P₁ ↦ |Ψ₁⟩, with P₀ + P₁ = I.

P_error = ⟨Ψ₀|P₁|Ψ₀⟩ + ⟨Ψ₁|P₀|Ψ₁⟩

The error is minimal when the angles between |Ψ₀⟩ and P₀ and between |Ψ₁⟩ and P₁ are the same.

– We can do better (i.e. unambiguously) with a POVM (Helstrom):

A₀ = (I − |Ψ₁⟩⟨Ψ₁|)/(1 + S),  A₁ = (I − |Ψ₀⟩⟨Ψ₀|)/(1 + S),  A₂ = I − A₀ − A₁

where S = |⟨Ψ₀|Ψ₁⟩| and A₀, A₁ and A₂ are positive operators.

Inference is made in the following way:

A₀ ↦ |Ψ₀⟩,  A₁ ↦ |Ψ₁⟩,  A₂ ↦ "don't know".

How is this actually implemented in practice? We need an ancilla; then

|Ψᵢ⟩ → (|Aᵢ⟩ + a|A₂⟩)   (3.1)

and we then measure the ancilla. We need to ensure that the overlaps are the same before and after, because of unitarity.
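A numerical sketch of the unambiguous-discrimination POVM above (mine; θ is an assumed overlap angle):

```python
import numpy as np

theta = np.pi / 6
psi0 = np.array([1, 0], dtype=complex)
psi1 = np.array([np.cos(theta), np.sin(theta)], dtype=complex)
S = abs(psi0.conj() @ psi1)
I = np.eye(2, dtype=complex)

A0 = (I - np.outer(psi1, psi1.conj())) / (1 + S)
A1 = (I - np.outer(psi0, psi0.conj())) / (1 + S)
A2 = I - A0 - A1

for E in (A0, A1, A2):
    assert np.all(np.linalg.eigvalsh(E) > -1e-12)   # all positive operators
# A1 never fires on psi0 and A0 never fires on psi1: unambiguous inference
print(abs(psi0.conj() @ A1 @ psi0), abs(psi1.conj() @ A0 @ psi1))  # both ~0
```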

• Quantum Bits.

This concept will be formalised much later, but I want to discuss it briefly here first, as this is a natural place to do so.

3.3 Quantum Cryptography

First review the two state protocol. Then the BB84 protocol.

3.4 No cloning of quantum bits

The first striking difference between quantum and classical information storage is that we cannot clone ("unknown") quantum states. This is due to the linearity of quantum mechanics. It is simple to see that if we want

|0⟩|0⟩ → |0⟩|0⟩   (3.2)

|1⟩|0⟩ → |1⟩|1⟩   (3.3)

then

(|0⟩ + |1⟩)|0⟩ → |0⟩|0⟩ + |1⟩|1⟩ ≠ (|0⟩ + |1⟩)(|0⟩ + |1⟩)   (3.4)


We cannot clone in all bases! This fact will feature in some quantum protocols later in the lectures.

3.5 Generalized Dynamics: CP maps

• Linear, completely positive, trace-preserving maps have the form

ρ ↦ Σᵢ Mᵢ ρ Mᵢ†   (Kraus)

where Σᵢ Mᵢ†Mᵢ = I (ρ = TT† ⇒ Kraus).

• Explain complete positivity: positivity for all extensions of the state.

The density matrix arises from partial tracing over the extended Hilbert space: given the spectral decomposition

ρ_S = Σᵢ rᵢ|rᵢ⟩⟨rᵢ|, define the purification |Ψ_SE⟩ = Σᵢ √rᵢ |rᵢ⟩ ⊗ |eᵢ⟩

where ⟨eᵢ|eⱼ⟩ = δᵢⱼ. Then

ρ_S = tr_E{|Ψ_SE⟩⟨Ψ_SE|}

3.6 Implementing CP maps physically (Ozawa)

The implementation runs as follows: first we add an ancilla (environment) to the system; then we apply a unitary transformation on the combined state of the system and ancilla; finally we trace out the ancilla. The combination of these operations is a CP-map on the system alone. Mathematically,

ρ_S ↦ ρ_S ⊗ |Ψ_E⟩⟨Ψ_E| → U_SE (ρ_S ⊗ |Ψ_E⟩⟨Ψ_E|) U_SE†

↦ ρ′ = tr_E[U_SE (ρ_S ⊗ |Ψ_E⟩⟨Ψ_E|) U_SE†] = Σᵢ ⟨eᵢ|U_SE|Ψ_E⟩ ρ_S ⟨Ψ_E|U_SE†|eᵢ⟩

where Mᵢ = ⟨eᵢ|U_SE|Ψ_E⟩ and Mᵢ† = ⟨Ψ_E|U_SE†|eᵢ⟩.
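A minimal sketch of a CP map in Kraus form (my example: amplitude damping with an assumed decay probability g; this particular channel is not discussed in the notes):

```python
import numpy as np

g = 0.3  # assumed decay probability
M0 = np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex)
M1 = np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)
assert np.allclose(M0.conj().T @ M0 + M1.conj().T @ M1, np.eye(2))  # sum Mi†Mi = I

rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)  # |+><+|
rho_out = M0 @ rho @ M0.conj().T + M1 @ rho @ M1.conj().T
print(np.trace(rho_out).real)   # 1.0: trace preserved
print(np.round(rho_out, 3))     # population shifted towards |0>, coherence damped
```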

3.7 Rules of Quantum Mechanics Revisited

Here are the most general rules of quantum mechanics known currently:

• States are represented by density matrices over a Hilbert Space

• Observables are Hermitian Operators


• Measurements are POVMs: p = tr(Eρ)

• Evolution is a CP-map

• Composite systems: |Ψ₁⟩ ⊗ |Ψ₂⟩

• Change of state ("collapse"):

ρ ↦ ρᵢ = MᵢρMᵢ† / tr(MᵢρMᵢ†), in case we register the i-th outcome.

If no outcome is registered then we have to average over all possibilities:

ρ = Σᵢ pᵢρᵢ = Σᵢ MᵢρMᵢ†   (Kraus form)

Measurements and dynamics are NOT distinguished (both are CP-maps)! This makes quantum theory similar to stochastic dynamics in classical probability theory (and the measurement problem disappears!).


Chapter 4

Quantum Protocols

So far we have discussed the encoding and extraction of information. Now let us describe how to communicate this information. I now describe some protocols that are only possible because of quantum entanglement and that have no classical analogue.

4.1 Pure State Entanglement

• Definition: if |Ψ₁₂⟩ ≠ |Ψ₁⟩ ⊗ |Ψ₂⟩ for any choice of local states, then |Ψ₁₂⟩ is entangled.

• Test:

Let ρ₂ = tr₁|Ψ₁₂⟩⟨Ψ₁₂|. The state is entangled ⇐⇒ tr ρ₂² < 1.

Example: |φ⁺⟩ = (|00⟩ + |11⟩)/√2. Tracing out one of the qubits gives tr ρ₂² = 1/2 < 1 ⇒ the state is entangled.

4.2 Schmidt decomposition

A composite quantum system is one that consists of a number of quantum subsystems. When those subsystems are entangled it is impossible to ascribe a definite state vector to any one of them. The most often quoted entangled system is a pair of two photons in the "EPR" state. The composite system is then mathematically described by

|Ψ⟩ = (1/√2)(|↑⟩|↓⟩ + |↓⟩|↑⟩)   (4.1)

where the first ket in either product belongs to one photon and the second to the other. The property described is the direction of spin or polarization along the z-axis, which can either be "up" (|↑⟩) or "down" (|↓⟩). A two-level system of this type is the quantum analogue of a bit, which


we shall henceforth call a qubit. We can immediately see that neither of the photons possesses a definite state vector. The best that one can say is that if a measurement is made on one photon, and it is found to be in the state "up" for example, then the other photon is certain to be in the state "down". This idea cannot be applied to a general composite system, unless its state is written in a special form. This motivates us to introduce the so-called Schmidt decomposition, which not only is mathematically convenient, but also gives a deeper insight into correlations between the two subsystems.

According to the rules of quantum mechanics the state vector of a composite system, consisting of subsystems U and V, is represented by a vector belonging to the tensor product of the two Hilbert spaces, H_U ⊗ H_V. The general state of this system can be written as a linear superposition of products of individual states:

|Ψ⟩ = Σₙₘ cₙₘ |uₙ⟩|vₘ⟩   (4.2)

where {|uₙ⟩} (n = 1, ..., N) and {|vₘ⟩} (m = 1, ..., M) are orthonormal bases of the subsystems U and V respectively, whose dimensions are dim U = N and dim V = M. We will now describe the procedure of Schmidt decomposition, whereby the above state |Ψ⟩ is re-expressed in terms of the so-called Schmidt basis.

To that end, let us assume that M ≥ N, which in no way affects our line of argument since the procedure is symmetric with respect to the subsystems. Then we have the following five steps:

1. First we construct a density matrix describing |Ψ⟩. Once the density matrix is known, all the properties of the system can be deduced from it. Moreover, ensembles which are prepared differently but have the same density matrix are statistically indistinguishable and therefore equivalent. Generally, if we have a mixed state involving vectors |Ψ₁⟩, |Ψ₂⟩, ..., |Ψ_D⟩ with corresponding classical probabilities w₁, w₂, ..., w_D, then the density matrix is defined to be:

ρ = Σ_{d=1}^{D} w_d |Ψ_d⟩⟨Ψ_d|   (4.3)

Since in our case |Ψ⟩ is a pure state, the density matrix is a projection operator onto |Ψ⟩, i.e.

ρ = |Ψ⟩⟨Ψ| = Σ_{nmpq} ρ_{nmpq} |uₙ⟩⟨u_p| ⊗ |vₘ⟩⟨v_q|   (4.4)

where ρ_{nmpq} = cₙₘ c*_{pq}. If, however, we wish to deal with one of the subsystems only, then we employ the concept of the reduced density matrix.


2. We find the reduced density matrix of the subsystem U, obtained by tracing ρ over all states of the subsystem V, so that

ρ_U = Σ_q ⟨v_q|ρ|v_q⟩ = Σ_{nmp} ρ_{nmpm} |uₙ⟩⟨u_p|   (4.5)

The crucial step in the Schmidt decomposition is diagonalizing the above. We shall call the eigenvalues of ρ_U |g₁|², |g₂|², ..., |g_N|², and the corresponding eigenvectors |u′₁⟩, |u′₂⟩, ..., |u′_N⟩.

3. Then we re-express the state in terms of the |u′⟩'s, i.e.

|Ψ⟩ = Σₙₘ c′ₙₘ |u′ₙ⟩|vₘ⟩   (4.6)

4. Now we construct a new orthonormal basis of the subsystem V such that each new vector is a "clever" linear superposition of the old ones, so that

|v′ᵢ⟩ = Σₘ (c′ᵢₘ/gᵢ) |vₘ⟩   (4.7)

5. The Schmidt decomposition of |Ψ⟩ is now given by

|Ψ⟩ = Σₙ gₙ |u′ₙ⟩|v′ₙ⟩   (4.8)

There are two important observations to be made, which are absolutely fundamental to understanding correlations between the two subsystems in a joint pure state:

• The reduced density matrices of both subsystems, written in the Schmidt basis, are diagonal and have the same positive spectrum. In particular, the overall density matrix is given by

ρ = Σₙₘ gₙg*ₘ |u′ₙ⟩⟨u′ₘ| ⊗ |v′ₙ⟩⟨v′ₘ|   (4.9)

whereas the reduced ones are

ρ_U = Σₘ ⟨v′ₘ|ρ|v′ₘ⟩ = Σₙ |gₙ|² |u′ₙ⟩⟨u′ₙ|   (4.10)

ρ_V = Σₙ ⟨u′ₙ|ρ|u′ₙ⟩ = Σₘ |gₘ|² |v′ₘ⟩⟨v′ₘ|   (4.11)

• If a subsystem is N-dimensional it can be entangled with no more than N orthogonal states of another one.
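Numerically, the five steps above collapse into a single singular value decomposition; a sketch I have added (a random pure state with assumed dimensions 2 × 3):

```python
import numpy as np

rng = np.random.default_rng(0)
c = rng.normal(size=(2, 3)) + 1j * rng.normal(size=(2, 3))  # c[n, m], Eq. (4.2)
c /= np.linalg.norm(c)

# SVD gives the Schmidt form c = U diag(g) V†, i.e. |Psi> = sum_n g_n |u'_n>|v'_n>
U, g, Vh = np.linalg.svd(c, full_matrices=False)
print(g)                                  # Schmidt coefficients g_n
print(np.isclose(np.sum(g**2), 1))        # the |g_n|^2 sum to 1
rho_U = c @ c.conj().T                    # reduced density matrix of U
print(np.linalg.eigvalsh(rho_U))          # spectrum equals {g_n^2}
```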


At the end we would like to point out that the Schmidt decomposition is, in general, impossible for more than two entangled subsystems. Mathematical details of this fact are exposed in the literature. To clarify it, however, we consider three entangled subsystems as an example. Here, our intention would be to write a general state such that by observing the state of one of the subsystems we instantaneously and with certainty know the state of the other two. But this is impossible in general, for the presence of the third system makes the prediction uncertain. Loosely speaking, while we know the state of one of the subsystems, the other two might still be entangled and cannot have definite vectors associated with them (an exception to this general rule is, for example, a state of the Greenberger-Horne-Zeilinger (GHZ) type, (1/√2)(|↑⟩|↑⟩|↑⟩ + |↓⟩|↓⟩|↓⟩)). Clearly, the involvement of even more subsystems complicates this analysis even further and produces, so to speak, an even greater mixture and uncertainty. The same reasoning applies to mixed states of two or more subsystems (i.e. states whose density operator is not idempotent, ρ² ≠ ρ), for which we cannot have the Schmidt decomposition in general. This reason alone is responsible for the fact that the entanglement of two subsystems in a pure state is simple to understand and quantify, while for mixed states, or states consisting of more than two subsystems, the question is much more involved.

4.3 Dense Coding

• Bell states:

|Φ±⟩ = (1/√2)(|00⟩ ± |11⟩)
|Ψ±⟩ = (1/√2)(|01⟩ ± |10⟩)

• Pauli operators (both Hermitian, hence observables, and unitary, hence evolutions).

In matrix form:

σx = ( 0 1 ; 1 0 ),  σy = ( 0 −i ; i 0 ),  σz = ( 1 0 ; 0 −1 )

In operator form:

σx = |0⟩⟨1| + |1⟩⟨0|,  σy = i(|1⟩⟨0| − |0⟩⟨1|),  σz = |0⟩⟨0| − |1⟩⟨1|

• Shared entanglement: Alice sends two CLASSICAL bits by operating on one qubit only and sending it to Bob.
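A sketch of why this works (added by me): the four local Pauli operations turn |Φ⁺⟩ into the four mutually orthogonal Bell states, so Bob can read off two classical bits:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)  # (|00>+|11>)/sqrt(2)

# Alice applies I, sx, sz or sy to HER qubit only
states = [np.kron(P, I2) @ phi_plus for P in (I2, sx, sz, sy)]
overlaps = np.abs(np.array([[u.conj() @ v for v in states] for u in states]))
print(np.round(overlaps, 10))  # identity matrix: 4 perfectly distinguishable messages
```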


4.4 Teleportation

• Transfer an (unknown) state from A to B using 1 entangled pair plus 2 bits of classical communication.

• This protocol is possible because of the following identity:

|ψ⟩₁|Φ⁺₂₃⟩ = ½( |Φ⁺₁₂⟩|ψ⟩₃ + |Φ⁻₁₂⟩σz|ψ⟩₃ + |Ψ⁺₁₂⟩σx|ψ⟩₃ − i|Ψ⁻₁₂⟩σy|ψ⟩₃ )

• Measurement on 1, 2, followed by classical communication and correction (which works since σ²_{x,y,z} = I).

4.5 Entanglement Swapping

This is just teleportation of one out of two entangled systems. However, it has some nice conceptual consequences, such as that particles that never interact can still become entangled!

4.6 No Instantaneous Transfer of Information

• Measurements of entangled particles seem to imply instantaneous collapse at a distance.

• This does not contradict RELATIVITY!

• General result:

A cannot change the state of B, no matter what they do!

Proof

ρ′_B = tr_A{ Σᵢ (Aᵢ ⊗ I) ρ_AB (Aᵢ† ⊗ I) }
     = Σᵢ tr_A{ (Aᵢ ⊗ I) ρ_AB (Aᵢ† ⊗ I) }
     = Σᵢ tr_A{ (Aᵢ†Aᵢ ⊗ I) ρ_AB }
     = tr_A{ (Σᵢ Aᵢ†Aᵢ ⊗ I) ρ_AB }

Since {Aᵢ} is a complete set, i.e. Σᵢ Aᵢ†Aᵢ = I, we get

ρ′_B = tr_A ρ_AB = ρ_B !!

This is very intriguing, as quantum mechanics has nothing to do with relativity a priori, and yet instantaneous transfer cannot be achieved with entanglement.

• What happens with other entangled states? E.g. a|00⟩ + b|11⟩ (a ≠ b). Use mutual information to quantify entanglement? I = −2(a² log a² + b² log b²).

• Cloning would lead to superluminal signalling. The states |00⟩⟨00| + |11⟩⟨11| and |++⟩⟨++| + |−−⟩⟨−−| are not the same. Interesting!


Chapter 5

Quantum Information I

Now I go through the quantum equivalent of chapter 1 and introduce quantum measures of information.

5.1 Von Neumann Entropy

• Classically, information is a function of probability distributions. In quantum information it is a function of the density matrix.

• Natural conditions:

1. I_Q is a continuous function of ρ, meaning that as we continuously change the parameters of the density matrix, the corresponding measure of information also changes continuously.

2. I_Q(ρ₁ ⊗ ρ₂) = I_Q(ρ₁) + I_Q(ρ₂), where ρ₁ ⊗ ρ₂ is the mathematical description of two uncorrelated systems.

• Analogous to Shannon ⇒ I_Q = −log ρ.

Note that functions of operators are defined as follows: f(ρ) = Σᵢ f(rᵢ)|rᵢ⟩⟨rᵢ|, where ρ = Σᵢ rᵢ|rᵢ⟩⟨rᵢ|, i.e. ρ is expressed in its diagonal form: the rᵢ are the eigenvalues of ρ and the |rᵢ⟩ are the corresponding eigenvectors.

• Von Neumann Entropy

S(ρ) = ⟨−log ρ⟩_ρ = −tr(ρ log ρ)

Thus,

von Neumann entropy of ρ = Shannon entropy of the eigenvalues of ρ


The properties of S(ρ) are reviewed in the appendix.

Note that S(I/2) = log 2 ⇒ 1 qubit of information.
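In code (a sketch I have added):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -tr(rho log2 rho), via the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return -np.sum(evals * np.log2(evals))

print(von_neumann_entropy(np.eye(2) / 2))         # 1.0: one qubit of information
pure = np.array([[0.5, 0.5], [0.5, 0.5]])         # |+><+|
print(von_neumann_entropy(pure))                  # 0.0: pure state
```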

5.2 Schumacher’s Data Compression

• Definition: ρ^⊗n ≡ ρ ⊗ ρ ⊗ ... ⊗ ρ (n factors).

• Source: emits |Ψ₀⟩ with probability p₀ and |Ψ₁⟩ with probability p₁, so that ρ = p₀|Ψ₀⟩⟨Ψ₀| + p₁|Ψ₁⟩⟨Ψ₁|.

• Diagonalize:

ρ = Σᵢ rᵢ|rᵢ⟩⟨rᵢ|

Then,

ρ^⊗n = Σ_{i₁,...,iₙ} r_{i₁}...r_{iₙ} |r_{i₁}...r_{iₙ}⟩⟨r_{i₁}...r_{iₙ}|

• What is the composition of a typical sequence r_{i₁}...r_{iₙ}? About nr₁ occurrences of r₁, nr₂ of r₂, ..., nr_m of r_m.

• Size of the "typical" subspace:

N = n! / ((nr₁)!(nr₂)!...(nr_m)!) ≈ 2^{n(−Σᵢ rᵢ log rᵢ)} = 2^{nS(ρ)}

∴ Compress n ⇒ nS(ρ).

• Schumacher's Protocol:

– Project onto the typical subspace.
– If successful ⇒ encode.
– If NOT ⇒ do nothing (the probability of this → 0 as n → ∞, by the law of large numbers in statistics).

• How about the equivalent of noisy communication capacity?

• Discuss quantum and classical capacities of quantum channels.

• Parallel Shannon's proof of channel capacity...


5.3 Entropy of Observation

In addition to the von Neumann entropy, there is another entropy in quantum mechanics, due to measurement. Suppose that the state of the system is ρ, and that we measure the observable A = Σᵢ aᵢPᵢ, where the aᵢ are the eigenvalues and the Pᵢ projectors onto the eigenvectors of the observable. The probability of obtaining aⱼ is

pⱼ = tr(ρPⱼ)   (5.1)

The corresponding Shannon entropy of this distribution is

S_ρ(A) = −Σⱼ (tr ρPⱼ) log(tr ρPⱼ)   (5.2)

Interestingly, projective measurement cannot decrease entropy, so that we have

S_ρ(A) ≥ S(ρ)   (5.3)

so the entropy in the state as a whole is less than the entropy in any of its observables. Equality is achieved only when the observable commutes with the state.
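A quick numerical check of inequality (5.3) (my sketch; ρ = |+⟩⟨+| and A = σ_z are assumed choices):

```python
import numpy as np

def shannon(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-12]
    return -np.sum(p * np.log2(p))

rho = np.array([[0.5, 0.5], [0.5, 0.5]])        # pure |+><+|, so S(rho) = 0
projs = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]  # sigma_z projectors
p = np.array([np.trace(rho @ P).real for P in projs])
print(shannon(p), ">=", shannon(np.linalg.eigvalsh(rho)))  # 1.0 >= 0.0
```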


Chapter 6

Quantum Information II

Other measures of quantum information are introduced and one of the most important results derived: the Holevo bound.

6.1 Quantum Measures of Information

• Mutual information: I(ρ_AB) = S(ρ_A) + S(ρ_B) − S(ρ_AB)

• Relative Entropy: S(σ||ρ) = tr{σ(log σ − log ρ)}

Note: S(ρ_AB||ρ_A ⊗ ρ_B) = I(ρ_AB)

The distance of ρ_AB from ρ_A ⊗ ρ_B is the amount of correlations between A and B in ρ_AB.

• Discuss relative entropy statistically (indistinguishability).

• Important theorem:

Theorem. Relative entropy NEVER INCREASES under CP-maps:

S(σ||ρ) ≥ S(Φ(σ)||Φ(ρ))

Proof

S(Φ(σ)||Φ(ρ)) = S(tr₂{U(σ ⊗ P_α)U†} || tr₂{U(ρ ⊗ P_α)U†})
             ≤ S(U(σ ⊗ P_α)U† || U(ρ ⊗ P_α)U†)    §1
             = S(σ ⊗ P_α || ρ ⊗ P_α)               §2
             = S(σ||ρ) + S(P_α||P_α)                §3
             = S(σ||ρ)


where Φ(σ) = tr₂{U(σ ⊗ P_α)U†} is the Ozawa representation of the CP-map, and the following results hold:

§1: tracing reduces information;
§2: information is invariant under unitary transformations;
§3: S(P_α||P_α) = 0.

6.2 Encoding Classical Information into Quantum States

Classical messages 1, 2, ..., n occur with probabilities p₁, p₂, ..., pₙ and are encoded into quantum states: i → ρᵢ.

What is the most efficient way to distinguish them? What is the capacity of this communication?

Expect: S(Σᵢ pᵢρᵢ) − Σᵢ pᵢS(ρᵢ) = Σᵢ pᵢ S(ρᵢ||ρ), where ρ = Σᵢ pᵢρᵢ.

Bob sets up a measurement {Eᵢ}, a POVM. How much information does this carry?

pᵢ = tr(ρEᵢ)
p(i|j) = tr(ρⱼEᵢ)

The accessible information is given by the mutual information between the states and the POVM:

I(E : ρ) = −Σᵢ pᵢ log pᵢ + Σⱼ pⱼ Σᵢ p(i|j) log p(i|j)

Theorem. Holevo Bound (1973).

max_E I(E : ρ) ≤ Σᵢ pᵢ S(ρᵢ||ρ)

Proof


Eᵢ = Aᵢ†Aᵢ   (Kraus)

Write the accessible information as an average relative entropy, I(E : ρ) = Σᵢ pᵢ S(p(j|i)||p(j)), and consider the CP-map which performs the measurement:

ρᵢ → p(j|i) = tr(Eⱼρᵢ),  ρ → p(j) = tr(Eⱼρ).

Since S does not increase under CP-maps,

∴ I(E : ρ) = Σᵢ pᵢ S(p(j|i)||p(j)) ≤ Σᵢ pᵢ S(ρᵢ||ρ)

Q.E.D.

Corollary. One qubit cannot contain more information than one bit.

Note: the Holevo bound equals I(ρ_AB) where ρ_AB = Σᵢ pᵢ|i⟩⟨i| ⊗ ρᵢ; here pᵢ|i⟩⟨i| is the mathematical representation of the classical messages, and ρᵢ represents the quantum encoding.

All entropic measures are practically attainable only asymptotically.
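A sketch of the bound (mine; two equiprobable non-orthogonal pure states as an assumed encoding):

```python
import numpy as np

def S(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return -np.sum(ev * np.log2(ev))

theta = np.pi / 8
kets = [np.array([1, 0], dtype=complex),
        np.array([np.cos(theta), np.sin(theta)], dtype=complex)]
ps = [0.5, 0.5]
rhos = [np.outer(k, k.conj()) for k in kets]
rho = sum(p * r for p, r in zip(ps, rhos))

chi = S(rho) - sum(p * S(r) for p, r in zip(ps, rhos))  # the Holevo quantity
print(chi)  # ~0.24 < 1: strictly less than one bit is accessible
```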

• Discuss: distinguishability cannot be increased. Under successive CP-maps Φ₁, Φ₂, ..., both σ and ρ are dragged towards the equilibrium state ξ, with Φ(ξ) = ξ, and S(σ||ρ) can only decrease along the way.

This is the quantum analogue of the (classical) data processing inequality.

• Links with thermodynamics? When does entropy increase under CP-maps?


6.3 Discrimination and Second Law

Show that discriminating non-orthogonal states leads to a violation of the Second Law of Thermodynamics. We can make a closed loop where non-orthogonal states are made orthogonal by perfect discrimination, then used to do perfect work and finally returned to the same state. No heat is dissipated but work is done!


Chapter 7

Quantum Entanglement

7.1 Historical Background for Entanglement

Einstein, Podolsky and Rosen. Schrödinger (1935). Heisenberg (1928) and Mott (1930) on Heisenberg's cut. Von Neumann (1930) on measurement being entanglement between the system and apparatus. Everett (1952) on the universality of quantum mechanics. Bell (1964).

7.2 Bell’s Inequalities

A₁, A₂ and B₁, B₂ are dichotomic observables with eigenvalues ±1. Construct

C ≡ A₁B₁ + A₁B₂ + A₂B₁ − A₂B₂

• Classically |⟨C⟩| ≤ 2.

Proof

C = A₁(B₁ + B₂) + A₂(B₁ − B₂); for definite values ±1, one bracket is ±2 and the other 0.

• Quantumly: max ⟨C⟩ = 2√2.

• Why not 4?
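A numerical sketch of the quantum value (mine; the measurement angles are the standard optimal choice for |φ⁺⟩):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def spin(a):
    """Dichotomic observable cos(a) sz + sin(a) sx, eigenvalues +/-1."""
    return np.cos(a) * sz + np.sin(a) * sx

phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
A1, A2 = spin(0), spin(np.pi / 2)
B1, B2 = spin(np.pi / 4), spin(-np.pi / 4)

C = np.kron(A1, B1) + np.kron(A1, B2) + np.kron(A2, B1) - np.kron(A2, B2)
print((phi_plus.conj() @ C @ phi_plus).real)  # 2.828... = 2*sqrt(2)
```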

7.3 Separable States

A separable (or disentangled) state is defined as:

ρ_AB = Σᵢ pᵢ ρ_A^i ⊗ ρ_B^i


• Can be prepared locally (with classical correlations).

• Does not violate Bell inequalities (but the converse does not hold: failure to violate does not imply separability).

7.4 Pure Entangled States Violate Bell Inequalities

• Physical Intuition (LO = Local Operations):

LO(a|00⟩ + b|11⟩) = |00⟩ + |11⟩

But |φ⁺⟩ violates Bell ⇒ a|00⟩ + b|11⟩ also violates Bell.

• Present Protocol

Suppose a ≤ b. Then use the filtering POVM

A₁ = ( 1  0 ; 0  a/b ),   A₂†A₂ = I − A₁†A₁

Probability of success = ⟨Ψ|A₁†A₁|Ψ⟩ = 2a²

Unitary à la Ozawa, acting on an ancilla {|g⟩, |e⟩} and the second qubit: |g⟩|1⟩ → cos θ|g⟩|1⟩ + sin θ|e⟩|0⟩, so that

|g⟩ ⊗ (a|00⟩ + b|11⟩) ⇒ a|g⟩|00⟩ + b cos θ|g⟩|11⟩ + b sin θ|e⟩|01⟩ = |g⟩(a|00⟩ + b cos θ|11⟩) + |e⟩(b sin θ|01⟩)

Measure the ancilla; finding |g⟩ ⇒ a|00⟩ + b cos θ|11⟩ (unnormalised).

Ensure that b cos θ = a; then the outcome is the maximally entangled a(|00⟩ + |11⟩).

7.5 Mixed Entangled States May Not Violate Bell Inequalities

• Example: Werner states (invariant under local unitaries of the form U ⊗ U):

ρ_w = F|φ⁺⟩⟨φ⁺| + (1−F)/3 (|ψ⁺⟩⟨ψ⁺| + |ψ⁻⟩⟨ψ⁻| + |φ⁻⟩⟨φ⁻|)

For F > 0.78 Bell's inequalities are violated; however, the state is entangled for all F > 1/2.

Therefore, Bell's inequalities are not a necessary and sufficient indicator of entanglement.


Chapter 8

Quantum Entanglement Detection I

In this section I discuss the mathematics behind identifying states as entangled or disentangled. What I will say may sound too theoretical, but it can in fact be implemented experimentally (as I will discuss in the lectures; see the appendix).

8.1 Detecting Entanglement?

• The set of all states T is convex:

ρ₁, ρ₂ ∈ T ⇒ λρ₁ + (1 − λ)ρ₂ ∈ T.

The mixture of two physical states is also a physical state.

• The set of all disentangled states D is convex:

ρ₁, ρ₂ ∈ D ⇒ λρ₁ + (1 − λ)ρ₂ ∈ D.

Proof

ρ₁ = Σᵢ pᵢ ρ_{1A}^i ⊗ ρ_{1B}^i and ρ₂ = Σᵢ qᵢ ρ_{2A}^i ⊗ ρ_{2B}^i ⇒

λρ₁ + (1 − λ)ρ₂ = Σᵢ ( λpᵢ ρ_{1A}^i ⊗ ρ_{1B}^i + (1 − λ)qᵢ ρ_{2A}^i ⊗ ρ_{2B}^i ), which is again separable.

Detecting entanglement is the same as determining whether a point belongs to a convex set or not.

Simple observation: given a convex set and a point outside it, there exists a plane such that the point is on one side of it, while the set is on the other side (this is a corollary of the Hahn-Banach Theorem).

Translate into vector algebra

[Figure: a plane separating a point u from a convex set; e denotes a generic point.]

Note that for all e on the same side of the plane as u we have ⟨u, e⟩ > 0, while for the points in the set, ⟨u, e⟩ < 0.

• Translation into Hermitian operators:

Inner product of A, B = tr(A†B)

This is because if A|Ψ⟩ = |Ψ_A⟩ and B|Ψ⟩ = |Ψ_B⟩, then ⟨Ψ_A|Ψ_B⟩ = ⟨Ψ|A†B|Ψ⟩, which suggests tr(A†B).

• Therefore:

Theorem. A state σ is entangled if there is a (Hermitian) operator W such that tr(Wσ) < 0 while for all separable states ρ_S we have tr(Wρ_S) ≥ 0.

W is called an Entanglement Witness.

Note: This is an impractical criterion.
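Still, a concrete witness is easy to exhibit; a sketch I have added, using the standard witness W = I/2 − |φ⁺⟩⟨φ⁺| for states near |φ⁺⟩:

```python
import numpy as np

phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)   # |phi+>
W = np.eye(4) / 2 - np.outer(phi, phi.conj())  # tr(W rho_sep) >= 0 for separable rho

ent = np.outer(phi, phi.conj())                       # entangled |phi+><phi+|
sep = np.diag([0.5, 0.0, 0.0, 0.5]).astype(complex)   # classically correlated mixture
print(np.trace(W @ ent).real)  # -0.5 < 0: detected as entangled
print(np.trace(W @ sep).real)  #  0.0 >= 0: consistent with separability
```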

8.2 Jamiolkowski Isomorphism

An isomorphism (1976) relating Hermitian operators to positive maps:

A = (I ⊗ Φ_A)(|φ⁺⟩⟨φ⁺|)

where A is a Hermitian operator, Φ_A is a positive map, and

|φ⁺⟩ = (1/√N) Σ_{n=1}^{N} |nn⟩

Proof

A is Hermitian ⇒ A = Σᵢ aᵢ|aᵢ⟩⟨aᵢ|.

Let us first suppose A ≥ 0.

Define (I ⊗ Mᵢ)|φ⁺⟩ = √aᵢ |aᵢ⟩ ⇒ A = Σᵢ (I ⊗ Mᵢ)|φ⁺⟩⟨φ⁺|(I ⊗ Mᵢ†)

∴ Φ_A = Σᵢ Mᵢ(·)Mᵢ†, which is completely positive.

If we need A to have negative eigenvalues as well, then Φ_A needs to be only positive (not completely positive).

Theorem. A state σ₁₂ is entangled iff there exists a positive map Λ (not a CP-map) such that

(I ⊗ Λ)(σ₁₂) < 0   (i.e. it has a negative eigenvalue)

Proof

tr(Wσ₁₂) = tr{(I ⊗ Λ)(|φ⁺⟩⟨φ⁺|) σ₁₂} = ⟨φ⁺|(I ⊗ Λ)(σ₁₂)|φ⁺⟩

So,

tr(Wσ₁₂) < 0 ⇐⇒ ⟨φ⁺|(I ⊗ Λ)(σ₁₂)|φ⁺⟩ < 0 ⇐⇒ (I ⊗ Λ)(σ₁₂) < 0

Q.E.D.

8.3 Examples of entanglement witnesses

• Heisenberg interaction.

• Thermodynamical witnesses.


Chapter 9

Quantum Entanglement Detection II

This section discusses the special case of two qubits, where there is a simple operational condition to decide whether the state is entangled or not.

9.1 Peres-Horodecki Criterion

• Positive maps and entanglement witnesses are not "useful", i.e. not effective, as general criteria.

• No better criterion exists in general.

• Luckily, for 2 × 2 and 2 × 3 systems:

Positive = CP₁ + CP₂ ∘ T, where T stands for transposition; applied to one subsystem this is partial transposition.

Example: for the state (|00⟩ + |11⟩)/√2,

( 1/2  0   0  1/2 )          ( 1/2  0   0   0  )
(  0   0   0   0  )    T₂    (  0   0  1/2  0  )
(  0   0   0   0  )   --->   (  0  1/2  0   0  )
( 1/2  0   0  1/2 )          (  0   0   0  1/2 )

The matrix on the RHS has a negative eigenvalue (−1/2) ⇒ entangled.
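In code (my sketch of the criterion for two qubits):

```python
import numpy as np

def partial_transpose(rho):
    """Transpose the second qubit of a two-qubit density matrix."""
    r = rho.reshape(2, 2, 2, 2)           # [ket1, ket2, bra1, bra2]
    return r.transpose(0, 3, 2, 1).reshape(4, 4)  # swap ket2 <-> bra2

phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(phi, phi.conj())
print(np.linalg.eigvalsh(partial_transpose(rho)))  # [-0.5, 0.5, 0.5, 0.5]
```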

9.2 Interferometric implementation of Peres-Horodecki

Partial transposition is not a physical operation, in that it is a positive but not a completely positive map. Therefore, as we argued earlier, it cannot be implemented within the quantum formalism. Looking at a simple 2 by 2 matrix of a qubit, transposition corresponds to a time reversal (the off-diagonal elements are exchanged, which is like e^{iωt} → e^{−iωt}; we can also think of it as phase conjugation). However, we remember that an entanglement witness is the average of some Hermitian operator, and this average is a physically measurable quantity. So we must be able to measure the effects of partial transposition.

The key observation is that we can use an interferometer (see the appendix) to measure powers of an operator, tr(ρⁿ). With a bit more ingenuity we can measure tr((ρ₁₂^{T₂})ⁿ), i.e. all the powers of the partially transposed matrix. These are just real numbers (not complex), so we are not contradicting quantum physics!

9.3 Operational Definition: Formation and distillation

Local operations + classical communication: LOCC(σ₁₂^⊗n) = |φ₁₂⟩^⊗m

Definition.

E_D = lim_{n→∞} m(n)/n   ← the distillable entanglement

E_F = the minimum entanglement we need to invest, inverting the procedure, in order to create the state by LOCC (the entanglement of formation)

9.4 Pure State Distillation

Consider the state:

|Ψ⟩ = a|00⟩ + b|11⟩

Start with |Ψ⟩^⊗n and project onto the typical subspace on A; this is effectively data compression:

(a²|0⟩⟨0| + b²|1⟩⟨1|)^⊗n → ((1/2)|0⟩⟨0| + (1/2)|1⟩⟨1|)^⊗m

such that

lim m/n = S(a²|0⟩⟨0| + b²|1⟩⟨1|)

The von Neumann entropy of the reduced density matrix measures entanglement for a pure state (= −a² log a² − b² log b²).


9.5 Carnot Cycle Analogy

LOCC ⇐⇒ Adiabatic processes
Entanglement ⇐⇒ Entropy

QI: LOCC cannot increase entanglement.
2nd law: Adiabatic processes cannot decrease entropy.

Question: Entropy is unique in thermodynamics. Is entanglement unique in QI?


Chapter 10

Measures of Entanglement

10.1 Schmidt Decomposition & Correlations

• Theorem. Schmidt Decomposition. Any bipartite pure state

|Ψ⟩ = Σᵢⱼ dᵢⱼ |i⟩ ⊗ |j⟩

can always be written as

|Ψ⟩ = Σₙ cₙ |aₙ⟩ ⊗ |bₙ⟩

Proof

tr₁|Ψ⟩⟨Ψ| = ρ₂ ← diagonalize. Then tr₁|Ψ⟩⟨Ψ| = Σₙ |cₙ|² |bₙ⟩⟨bₙ|. Find the |aₙ⟩ in the same way.

Q.E.D.

• Correlations: |aₙ⟩ ↔ |bₙ⟩

How entangled A and B are is the same as how correlated they are, which is the same as how mixed ρ_A and ρ_B are:

Entanglement = S(ρ_A) = S(ρ_B) = −Σₙ |cₙ|² log |cₙ|²

10.2 Mixed States

• Things don't work this way for mixed states.

• We need something to discriminate between

|00⟩ + |11⟩ and |00⟩⟨00| + |11⟩⟨11| (both correlated, but only the former is entangled).

10.3 Relative Entropy of Entanglement

• Distance idea:

E(σ₁₂) = min_{ρ₁₂ ∈ D} D(σ₁₂||ρ₁₂)

Take D to be the relative entropy, S(σ||ρ) = tr{σ log σ − σ log ρ}. Then:

1. E(ρ) = 0 ⇐⇒ ρ ∈ D.
2. E is invariant under local unitaries.
3. E does not increase under LOCC.
4. E(|Ψ_AB⟩) = S(ρ_A).

• Theorem. E_D ≤ E_RE ≤ E_F.

Proof

nE_RE = nS(σ||ω)   (definition; ω the closest separable state)
      ≥ S(σ^⊗n||ω^⊗n)   (subadditivity)
      ≥ S((|Ψ⟩⟨Ψ|)^⊗m||LOCC(ω^⊗n))   (rel. ent. does not increase under CP-maps)
      ≥ S((|Ψ⟩⟨Ψ|)^⊗m||ρ^⊗m)
      = m E(|Ψ⟩)
      = m

∴ E_RE ≥ m/n = E_D

Also,

S(Σᵢ pᵢσᵢ||ω) ≤ Σᵢ pᵢ S(σᵢ||ω) = E_F   (convexity)

where S(Σᵢ pᵢσᵢ||ω) ≥ E_RE by definition.

Q.E.D.


10.4 Uniqueness of the measure?

• For pure bipartite states everything is fine: E_D = E_RE = E_F.

• For mixed states, not: distillability and separability are different.

• Multiparticle states: there is no equivalent of formation or distillation. Relative entropy is OK, but its meaning is unclear and its computation difficult.

• Question of convertibility:

Single copies: majorization.

Asymptotically: given σ and ρ, is it true that either σ —LOCC→ ρ or ρ —LOCC→ σ?

10.5 Taxonomy of Entanglement

• Bound states. If a state is distillable, then it has a negative partial transpose (PT). But there are entangled states with positive PT. So, not all entangled states can be distilled.

• PPT bound?

• Higher dimensions...

10.6 Other approaches?

Defining entanglement with respect to some other protocols, e.g. dense coding, teleportation.

10.7 Entanglement and Thermodynamics

• Mathematical equivalence;

• Using entanglement to extract work.


Chapter 11

Quantum Algorithms

11.1 Computational Complexity

• Classical (networks, oracles).

• Difficult and easy problems.

• Examples.

• Quantum Complexity (Networks and oracles).

11.2 Deutsch’s Algorithm

Problem: decide whether f(0) = f(1) or f(0) ≠ f(1).

Classical: we must evaluate both f(0) and f(1).

Quantum: use the phase oracle |x⟩ → e^{iπf(x)}|x⟩ on a superposition:

|0⟩ + |1⟩ → e^{iπf(0)}|0⟩ + e^{iπf(1)}|1⟩

which is ±(|0⟩ + |1⟩) if f(0) = f(1), and ±(|0⟩ − |1⟩) if f(0) ≠ f(1).

One computation of "f" solves the problem.


[Figure: Mach-Zehnder implementation: beam splitter (BS), π phase shifters in the two arms, second beam splitter (BS).]
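A sketch of the algorithm as plain linear algebra (added by me; the oracle is the phase box |x⟩ → e^{iπf(x)}|x⟩ from above):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def deutsch(f):
    """One-query Deutsch: returns True iff f(0) == f(1)."""
    psi = H @ np.array([1, 0], dtype=complex)        # (|0>+|1>)/sqrt(2)
    oracle = np.diag([np.exp(1j * np.pi * f(0)),
                      np.exp(1j * np.pi * f(1))])    # |x> -> e^{i pi f(x)}|x>
    psi = H @ (oracle @ psi)                         # interfere again
    return abs(psi[0]) ** 2 > 0.5                    # ends in |0> iff f constant

print(deutsch(lambda x: 0), deutsch(lambda x: x))    # True False
```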

11.3 Grover’s Search

• Statement of the problem: f(i) = 1, f(x ≠ i) = 0.

• Quantum box: |x⟩ → (−1)^{δ_{xi}}|x⟩.

• Network.

• 2D Picture.

• 2 qubit example.
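For the 2-qubit example in the last bullet, a single Grover iteration already succeeds with certainty; a sketch I have added (the marked item is chosen arbitrarily):

```python
import numpy as np

n, marked = 4, 2                      # search among 4 items for index 2
psi = np.ones(n, dtype=complex) / 2   # uniform superposition over 2 qubits

oracle = np.eye(n, dtype=complex)
oracle[marked, marked] = -1                            # |x> -> (-1)^{d_xi}|x>
diffusion = 2 * np.outer(psi, psi.conj()) - np.eye(n)  # inversion about the mean

psi = diffusion @ (oracle @ psi)      # one iteration suffices for n = 4
print(np.round(np.abs(psi) ** 2, 6))  # probability 1 on the marked item
```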

11.4 Factorisation Problem

The whole point of this algorithm is to determine the period of a certain function; determining this period is something that classical computers find difficult to do, as far as we know.

• Compute f(x) = aˣ mod N, where x = 0, 1, 2, ..., N is the number to factorize and a is chosen such that gcd(a, N) = 1.

• The Q.F.T. finds the period p of this function in polynomial time.

• The factors of N can then easily be computed by calculating gcd(a^{p/2} ± 1, N).

This algorithm has implications for public-key crypto-systems, which is why the algorithm is considered important as well.

11.5 Shor’s Algorithm

11.5.1 A. Beam Splitters

The beam splitter has the following effect:

|0⟩ → |0⟩ + |1⟩   (11.1)

|1⟩ → |0⟩ − |1⟩   (11.2)

(normalisation omitted). This operation is known as the Hadamard transformation in quantum information theory.

[Figure: beam splitter with phase shifters f(1) and f(2) in the two arms; inputs |0⟩ and |1⟩.]

If we input |0⟩ into a beam splitter with two different phase shifters along the two paths, we obtain the following:

|0⟩ → |0⟩ + |1⟩   (11.3)

→ e^{iφ(0)}|0⟩ + e^{iφ(1)}|1⟩   (11.4)

→ e^{iφ(0)}(|0⟩ + |1⟩) + e^{iφ(1)}(|0⟩ − |1⟩)   (11.5)

∝ cos[(φ(0) − φ(1))/2]|0⟩ + i sin[(φ(0) − φ(1))/2]|1⟩   (11.6)

(up to a global phase). An example is Deutsch's problem, obtained by setting φ(x) = πf(x) ∈ {0, π}.

11.5.2 B. Fourier Transforms (Discrete)

(x₀, x₁, x₂, ..., x_{N−1}) —F.T.→ (y₀, y₁, y₂, ..., y_{N−1})   (11.7)

y_k := (1/√N) Σ_{j=0}^{N−1} x_j e^{2πi jk/N}   (11.8)

This is the definition of the discrete Fourier transform. The quantum Fourier transform is given by:

|j⟩ → (1/√N) Σ_{k=0}^{N−1} e^{2πi jk/N} |k⟩   (11.9)


Superpositions:N−1∑

j=0

xj |j〉 →N−1∑

k=0

yk|k〉 (11.10)

Take N = 2^n, where n is the number of qubits. Write j = j_1 j_2 . . . j_n in binary representation, where each j_i is zero or one; equivalently, in decimal representation, j = j_1 2^{n−1} + . . . + j_n 2^0. Also important is the binary fraction 0.j_l j_{l+1} . . . j_m = j_l/2 + j_{l+1}/4 + . . . + j_m/2^{m−l+1}.

As an exercise, show the following.

|j_1 . . . j_n〉 → (1/√2^n) (|0〉 + e^{2πi(0.j_n)}|1〉)(|0〉 + e^{2πi(0.j_{n−1}j_n)}|1〉) . . . (|0〉 + e^{2πi(0.j_1 j_2 . . . j_n)}|1〉) (11.11)

Network for the quantum Fourier transform:

[Figure: QFT network. Each qubit |j_k〉 passes through a Hadamard followed by controlled rotations R_2, . . . , R_{n−k+1} conditioned on the qubits below it; the outputs are the factors |0〉 + e^{2πi(0.j_n)}|1〉, . . . , |0〉 + e^{2πi(0.j_1 j_2 . . . j_n)}|1〉 of eq. (11.11).]

R_k =
( 1        0          )
( 0   e^{2πi/2^k} )     (11.12)

Example with 3 qubits:

[Figure: the same network for three qubits: Hadamards and controlled-R rotations take |j_1 j_2 j_3〉 to (|0〉 + e^{2πi(0.j_1 j_2 j_3)}|1〉)(|0〉 + e^{2πi(0.j_2 j_3)}|1〉)(|0〉 + e^{2πi(0.j_3)}|1〉).]


11.5.3 C. Phase Estimation

U|u〉 = e^{2πiφ}|u〉, where φ is unknown. We are given controlled-U^{2^j} as a black-box operation; the way to perform this will be discussed later.

U^2|u〉 = e^{2πi(2^1 φ)}|u〉 (11.13)

... (11.14)

U^{2^j}|u〉 = e^{2πi(2^j φ)}|u〉 (11.15)

This is the phase "kick-back" idea (it is not a physical "kick-back"): |x〉|ψ〉 → |x〉 e^{2πiφ(x)}|ψ〉 = e^{2πiφ(x)}|x〉|ψ〉.

[Figure: phase-estimation network. Each of t register qubits |0〉 is put through a Hadamard and controls one of U^{2^0}, . . . , U^{2^{t−1}} acting on |u〉; the j-th control qubit ends in |0〉 + e^{2πi(2^j φ)}|1〉, and the register then enters a Q.F.T.]

Suppose φ = 0.φ_1 . . . φ_t:

(1/√2^t) Σ_{j=0}^{2^t−1} e^{2πiφj}|j〉 ⊗ |u〉 −→ |φ〉|u〉 (11.16)

where the arrow indicates an inverse quantum Fourier transform applied to the first register (the t-qubit sum above).


[Figure: the three-qubit Fourier network run in reverse (inverse Q.F.T.), and the phase-estimation circuit in which register qubits |0〉 control U^{2^x} applied to |u〉 before the inverse Q.F.T.]

Let's have a look at an example with three qubits, using the inverse Fourier transform. Recall that |0〉 + e^{2πi(0.j_3)}|1〉 −H→ |j_3〉, and similarly each qubit is unwound in turn. This is a network for phase estimation!
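A small simulation of the register, assuming the controlled-U^{2^j} stage has already imprinted the phases as on the left-hand side of eq. (11.16) (a sketch; phase_estimate is my name):

import numpy as np

def phase_estimate(phi, t):
    # Register state (1/sqrt(T)) sum_j e^{2 pi i phi j} |j>, with T = 2**t,
    # followed by the inverse QFT; the outcome peaks at j ~ phi * T.
    T = 2 ** t
    reg = np.exp(2j * np.pi * phi * np.arange(T)) / np.sqrt(T)
    j, k = np.meshgrid(np.arange(T), np.arange(T), indexing="ij")
    inv_qft = np.exp(-2j * np.pi * j * k / T) / np.sqrt(T)
    return np.argmax(np.abs(inv_qft @ reg) ** 2) / T

print(phase_estimate(0.625, t=3))   # phi = 0.101 in binary: recovered exactly, 0.625
print(phase_estimate(0.3, t=8))     # ~0.3 to 8-bit precision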

11.5.4 D. Order Finding Problem

Given a, find r such that a^r = 1 mod N, where N is the number to factor. Consider U_a|y〉 = |ay mod N〉. What are the eigenvectors of U_a? Note that U_a^r = I.

|Ψ_k〉 = (1/√r) Σ_{j=0}^{r−1} e^{−2πi(k/r)j} |a^j mod N〉 (11.17)

with respective eigenvalues e^{2πi k/r}. Estimate k/r by phase estimation and find r. Thus we know how to go from:

|0〉|Ψ_k〉 −FT_N→ Σ_{x=0}^{N−1} |x〉|Ψ_k〉 (11.18)

−U_a^x→ Σ_{x=0}^{N−1} e^{2πi(k/r)x} |x〉|Ψ_k〉 (11.19)

−FT_N^{−1}→ |k/r〉|Ψ_k〉 (11.20)

Measure |k/r〉 and estimate r. But we do not know |Ψ_k〉! Use |1〉 = (1/√r) Σ_{k=0}^{r−1} |Ψ_k〉.

|0〉|1〉 = (1/√r) Σ_{k=0}^{r−1} |0〉|Ψ_k〉 (11.21)

Apply FT_N:

Σ_{x=0}^{N−1} |x〉|1〉 = (1/√r) Σ_{k=0}^{r−1} Σ_{x=0}^{N−1} |x〉|Ψ_k〉 (11.22)

Apply U_a^x:

Σ_{x=0}^{N−1} |x〉|a^x mod N〉 = (1/√r) Σ_{k=0}^{r−1} Σ_{x=0}^{N−1} e^{2πi(k/r)x} |x〉|Ψ_k〉 (11.23)

Apply FT_N^{−1}:

Σ_{x=0}^{N−1} |x〉|a^x mod N〉 → (1/√r) Σ_{k=0}^{r−1} |k/r〉|Ψ_k〉 (11.24)

The first register is then in the mixture Σ_k |k/r〉〈k/r|, from which r can be estimated efficiently (by continued fractions).

11.5.5 E. Factoring

Given N, find its factors. Solution: compute gcd(a^{r/2} ± 1, N), where r is the period of a^x mod N, i.e. a^r = 1 mod N.

11.5.6 F. Example of Order Finding

Factoring 91. a = 4 and N = 91.

(1/√2^n) Σ_{k=0}^{2^n−1} |k〉|1〉 −→ (1/√2^n) Σ_{k=0}^{2^n−1} |k〉|a^k mod N〉 (11.25)

= (1/√2^n) [|0〉|1〉 + |1〉|4〉 + |2〉|16〉 + |3〉|64〉 + |4〉|74〉 + |5〉|23〉 + |6〉|1〉 + . . .]

Measure the second register; suppose we get 74. The first register is left in √(6/2^n) [|4〉 + |10〉 + |16〉 + |22〉 + . . .], with period r = 6. Applying F.T.^{−1} to this we get Σ_l d_l |l〉, with the weights d_l peaked at multiples of 2^n/6.
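For this small example the period can of course be found classically by brute force, which checks the arithmetic above (a sketch; order is an illustrative helper replacing the quantum step):

import math

def order(a, N):
    # Period r of a^x mod N, i.e. the smallest r > 0 with a^r = 1 mod N.
    x, r = a % N, 1
    while x != 1:
        x, r = (x * a) % N, r + 1
    return r

a, N = 4, 91
r = order(a, N)                        # r = 6, matching 1, 4, 16, 64, 74, 23, 1, ...
print(r)
print(math.gcd(a ** (r // 2) - 1, N),  # gcd(63, 91) = 7
      math.gcd(a ** (r // 2) + 1, N))  # gcd(65, 91) = 13, and 7 x 13 = 91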


Chapter 12

More on Entanglement and Quantum Measurements

12.1 Search Optimality from Entanglement

– Ambainis' Method: how quickly does the black box (BBox) entangle the query and memory registers?

Step n:

Σ_{ij} α_ij |i〉|j〉 → ρ = Σ_{ijkl} α_ij α*_kl |ij〉〈kl|

tr_A ⇒ Σ_{ijl} α_ij α*_il |j〉〈l| ← the memory's density matrix

So the off-diagonal elements are

c(j, l) = Σ_i α_ij α*_il

Step n+1: the black box flips the sign of the components with i = j,

ρ′ = Σ_{ij} α_ij (−1)^{δ_ij} |i〉|j〉 (and its conjugate bra), so

tr_A ⇒ Σ_{ijl} α_ij α*_il (−1)^{δ_ij + δ_il} |j〉〈l|

is the memory's density matrix after the BBox.


Now, compute the change in the off-diagonal elements:

Σ_{j≠l} |c(j, l)| − |c′(j, l)| ≤ Σ_{j≠l} |c(j, l) − c′(j, l)|

≤ Σ_{i, j≠l} |α_ij α*_il| (1 − (−1)^{δ_ij + δ_il})

= 2 Σ_{i≠l} |α_ii α*_il| + 2 Σ_{i≠j} |α_ij α*_jj|

≤ 4 Σ_{i≠l} |α_ii| |α_il| ∼ N² × (1/N) × (1/√N) ≤ 4√N

∴ the off-diagonal sum has to go from ∼ N down to 0, and each query changes it by at most ∼ √N, so at least ∼ √N steps are needed.

So, the number of queries scales as √N, and no black-box search can do better; Grover's algorithm saturates this bound. This is a very general way of proving various efficiencies using entanglement and black boxes.

12.2 Model for Quantum Measurement

• One qubit:

(a|0〉 + b|1〉)|m〉 → a|0〉|m_0〉 + b|1〉|m_1〉, with 〈m_0|m_1〉 = ξ

• Measurement ≡ establishment of correlations between system and apparatus (von Neumann)

Grover ≡ discretized measurement:

(|1〉 + |2〉 + . . . + |N〉)(|1〉 + |2〉 + . . . + |N〉) −Oracle→
|1〉(−|1〉 + |2〉 + . . . + |N〉) + |2〉(|1〉 − |2〉 + . . . + |N〉) + . . . + |N〉(|1〉 + |2〉 + . . . − |N〉)

so the registers become correlated: the reduced state has non-zero entropy.

• Mixed apparatus?


Chapter 13

Quantum Error Correction

13.1 Introduction

In the previous chapters we saw how to quantify entanglement, and how to use entanglement to compute. Now we turn our attention to realistic situations involving entanglement manipulations. In most realistic cases entanglement is gradually lost due to the detrimental interactions with an environment. In this chapter we focus on methods of protection of quantum states in dissipative and decoherent environments. In particular, with the discovery of an algorithm to factorize a large number on a quantum computer in polynomial time, instead of the exponential time required by a classical computer, the question of how to implement such a quantum computer has received considerable attention. We have already stressed that this exponential increase crucially depends on being able to maintain large entangled states for sufficiently long periods of time. However, realistic estimates soon showed that decoherence processes and spontaneous emission severely limit the bit size of the number that can be factorized, by destroying entanglement. It has become clear that the solution to the problem does not lie in an increase in the lifetime of the transitions used in the computation. Attention has now shifted towards the investigation of methods to encode qubits such that the correction of errors due to interaction with the environment becomes possible. In a number of recent publications, possible encoding schemes have been considered and theoretical work has been undertaken to elucidate the structure of quantum error correction codes.

13.2 Simple Example

A single qubit can suffer 3 errors, each represented by one Pauli operator. It is tempting to think that we cannot have quantum error correction, as cloning of quantum states is not possible. On the other hand, cloning of classical states is the key to error correction. The way it works classically is that instead of using one bit to encode zero or one, we use, for example, three bits. Then if there is an error, we just need to look at all three bits and see what the majority of them encode: if two are in the state one, then it's clear that the initial encoded state was one. Redundancy in information is the key to error correction (this is true in Nature as well, for example, in DNA encoding amino acids). Contrary to this logic, it turns out that we can quantum error correct; it's just that we are not cloning states, but repeating bits in each superposition element, and this is allowed by the rules.

It is easy to protect against a single flip. Encode with two more qubits:

a|0〉 + b|1〉 → a|000〉 + b|111〉 (13.1)

then apply a "coherent majority vote". To protect against phase errors, we again encode with two extra qubits:

a|0〉 + b|1〉 → a|−−−〉 + b|+++〉 (13.2)

and then again a majority vote (in the |±〉 basis). The remaining error is a product of the above two errors and can therefore be corrected with 2 + 2 = 4 extra qubits in total. If we have five qubits in total, then there are 3 × 5 = 15 single-qubit errors, plus the identity (no error), i.e. 16 possibilities. The 2^5 = 32-dimensional space contains 2^4 = 16 mutually orthogonal two-dimensional subspaces to be occupied after the errors. So all errors are distinguishable and can therefore be corrected.
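A minimal simulation of the three-qubit flip code, with the "coherent majority vote" done via the two parity (syndrome) operators Z_1 Z_2 and Z_2 Z_3 (the syndrome bookkeeping and helper names are illustrative choices, not fixed by the notes):

import numpy as np

X, Z, I2 = np.array([[0., 1.], [1., 0.]]), np.diag([1., -1.]), np.eye(2)

def kron3(a, b, c):
    return np.kron(np.kron(a, b), c)

def encode(a, b):                       # a|0> + b|1> -> a|000> + b|111>
    psi = np.zeros(8, dtype=complex)
    psi[0b000], psi[0b111] = a, b
    return psi

def correct(psi):
    # Parities of qubits (1,2) and (2,3); for a single flip the encoded state
    # is an eigenstate of both, so the expectation values give the syndrome.
    s12 = bool(np.real(psi.conj() @ kron3(Z, Z, I2) @ psi) < 0)
    s23 = bool(np.real(psi.conj() @ kron3(I2, Z, Z) @ psi) < 0)
    where = {(True, False): 0, (True, True): 1, (False, True): 2}.get((s12, s23))
    if where is not None:               # flip the offending qubit back
        psi = kron3(*[X if q == where else I2 for q in range(3)]) @ psi
    return psi

damaged = kron3(I2, X, I2) @ encode(0.6, 0.8)            # flip on the middle qubit
print(np.allclose(correct(damaged), encode(0.6, 0.8)))   # True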

13.3 General Conditions

We now describe an alternative way of manipulating quantum states, which is best handled using the language of quantum computation. An advantage of quantum computation lies in the fact that the input can be in a coherent superposition of qubit states, which are then simultaneously processed. The practical realisation of a qubit can be constructed from any two-state quantum system, e.g. a two-level atom in an ion trap, where the unitary transformations are implemented through interaction with a laser. The computation is completed by making a measurement on the output. However, a major problem is that the coherent superpositions must be maintained throughout the computation. In reality, the main source of coherence loss is due to dissipative coupling to an environment with a large number of degrees of freedom, which must be traced out of the problem. This loss is often manifested as some form of spontaneous decay, whereby quanta are randomly lost from the system. Each interaction with, and hence dissipation to, the environment can be viewed in information theoretic terms as introducing an error in the measurement of the output state. There are, however, techniques for 'correcting' errors in quantum states. The basic idea of error-correction is to introduce an excess of information, which can then be used to recover the original state after an error. These quantum error correction procedures are in themselves quantum computations, and as such also susceptible to the same errors. This imposes limits on the nature of the 'correction codes', which are explored in this section.

First we derive general conditions which a quantum error correction code has to satisfy. Assume that q qubits are encoded in terms of n ≥ q qubits to protect against a certain number of errors, d. We construct 2^q code–words, each being a superposition of states having n qubits. These code–words must satisfy certain conditions, which are derived in this section. There are three basic errors (i.e. all other errors can be written as a combination of these): amplitude, A, which acts as a NOT gate; phase, P, which introduces a minus sign to the upper state; and their combination, AP. A subscript shall designate the position of the error, so that P_{1001} means that the first and the fourth qubit undergo a phase error.

We consider an error to arise due to the interaction of the system with a 'reservoir' (any other quantum system), which then become entangled. This procedure is the most general way of representing errors, which are not restricted to discontinuous 'jump' processes, but encompass the most general type of interaction. Error correction is thus seen as a process of disentangling the system from its environment back to its original state. The operators A and P are constructed to operate only on the system, and are defined in the same way as the operators for a complete measurement, eq. (??). In reality, each qubit would couple independently to its own environment, so the error on a given state could be written as a direct product of the errors on the individual qubits. A convenient error basis for a single error on a single qubit is {1, σ_i}, where the σ_i's are the Pauli matrices. In this case, the error operators are Hermitian, and square to the identity operator, and we assume this property for convenience throughout the following analysis.

In general the initial state can be expressed as

|ψ_i〉 = Σ_{k=1}^{2^q} c_k |C_k〉 |R〉 (13.3)

where the |C_k〉 are the code–words for the states |k〉 and |R〉 is the initial state of the environment. The state after a general error is then a superposition of all possible errors acting on the above initial state

|ψ_f〉 = Σ_{αβ} Â_α P̂_β Σ_k c_k |C_k〉 |R_{α,β}〉 , (13.4)

where |R_{α,β}〉 is the state of the environment. (We keep the hat notation to designate operators in this section in order to avoid any confusion.) Note that |R_{α,β}〉 depends only on the nature of the errors, and is independent of the code–words. The above is, in general, not in the Schmidt form, i.e. the code–word states after the error are not necessarily orthogonal (to be shown), and neither are the states of the environment. Now, since we have no information about the environment, we must trace it out using an orthogonal basis for the environment {|R_n〉, n = 1, . . . , d}. The resulting state is a mixture of the form η_i = Σ_n |ψ_n〉〈ψ_n|, where

|ψ_n〉 = Σ_{αβ} x_{αβn} Â_α P̂_β Σ_k c_k |C_k〉 , (13.5)

and x_{αβn} = 〈R_n|R_{αβ}〉. To detect an error, one then performs a measurement on the state η to determine whether it has an overlap with one of the following subspaces

H_{αβ} = {Â_α P̂_β |C_k〉, k = 1, . . . , 2^q} . (13.6)

The initial space after the error is given by the direct sum of all the above subspaces, H = ⊕_{αβ} H_{αβ}. Each time we perform an overlap and obtain a zero result, the state space H reduces in dimension, eliminating that subspace as containing the state after the error. Eventually, one of these overlap measurements will give a positive result, which is mathematically equivalent to projecting on to the corresponding subspace. The state after this projection is then given by the mixture η_f = Σ_n |ψ_n^{Proj αβ}〉〈ψ_n^{Proj αβ}|, where

|ψ_n^{Proj αβ}〉 = Σ_{kl} Σ_{γδ} x_{γδn} Â_α P̂_β |C_k〉〈C_k| P̂_β Â_α Â_γ P̂_δ |C_l〉 c_l . (13.7)

The successful projection will effectively take us to the state generated by a superposition of certain types of error. One might expect that to distinguish between various errors the different subspaces H_{αβ} would have to be orthogonal. However, we will show that this is not, in fact, necessary.

After having projected onto the subspace H_{αβ} we now have to correct the corresponding error by applying the operator P̂_β Â_α onto |ψ^{Proj αβ}〉, since P̂_β Â_α Â_α P̂_β = 1. In order to correct the error successfully, the resulting state has to be proportional to the initial state of code–words in |ψ_i〉. This leads to the condition

Σ_{kl} Σ_{γδ} x_{γδn} |C_k〉〈C_k| P̂_β Â_α Â_γ P̂_δ |C_l〉 c_l = z_{αβn} Σ_m c_m |C_m〉 . (13.8)

where z_{αβn} is an arbitrary complex number. Now we use the fact that all code–words are mutually orthogonal, i.e. 〈C_k|C_l〉 = δ_{kl}, to obtain

Σ_l Σ_{γδ} c_l x_{γδn} 〈C_k| P̂_β Â_α Â_γ P̂_δ |C_l〉 = z_{αβn} c_k (13.9)

for all k and arbitrary c_k. This can be written in matrix form as

F^{αβn} c = z_{αβn} c , (13.10)


where the elements of the matrix F^{αβn} are given by

F^{αβn}_{kl} := Σ_{γδ} x_{γδn} 〈C_k| P̂_β Â_α Â_γ P̂_δ |C_l〉 . (13.11)

As eq. (13.10) is valid for all c, it follows that

F^{αβn}_{kl} = z_{αβn} δ_{kl} for all k, l . (13.12)

However, we do not know the form of the x_{γδn}'s, as we have no information about the state of the environment. Therefore, for the above to be satisfied for any form of x's, we need each individual term in eq. (13.11) to satisfy

〈C_k| P̂_β Â_α Â_γ P̂_δ |C_l〉 = y_{αβγδ} δ_{kl} (13.13)

where y_{αβγδ} is any complex number. From eqs. (13.11), (13.12), (13.13) we see that the numbers x, y, and z are related through

Σ_{γδ} x_{γδn} y_{αβγδ} = z_{αβn} . (13.14)

Eq. (13.13) is the main result in this section, and gives a general, and in fact the only, constraint on the construction of code–words, which may then be used for encoding purposes. If we wish to correct for up to d errors, we have to impose a further constraint on the subscripts α, β, γ, and δ; namely, wt(supp(α) ∪ supp(β)), wt(supp(γ) ∪ supp(δ)) ≤ d, where supp(x) denotes the set of locations where the n–tuple x is different from zero and wt(x) is the Hamming weight, i.e. the number of digits in x different from zero. This constraint on the indices of errors simply ensures that they do not contain more than d logical '1's altogether, which is, in fact, equivalent to no more than d errors occurring during the process.

We emphasise that these conditions are the most general possible. By substituting y_{αβγδ} = δ_{αγ} δ_{βδ} in eq. (13.13), we obtain the conditions

〈C_k| P̂_β Â_α Â_γ P̂_δ |C_l〉 = δ_{βδ} δ_{αγ} δ_{kl} (13.15)

These conditions show the main difference between quantum and classical error correction: it is possible for two different errors to lead to the same state, provided that the overlap is the same for all the code–words.

13.4 Reliable Quantum Computation from Unreliable Components

We have seen how to protect qubits against general errors, and in particular how to protect an atom against spontaneous emission. However, this protection is rather "static", i.e. our qubits are not evolving while errors occur. Suppose we would like to implement a Controlled-NOT between two qubits which can undergo an error during this operation. Is there a point to encoding these qubits in the first place, since the encoding and decoding procedures are just composed of a number of CNOTs (and other gates) which themselves can undergo errors? It appears that if we (realistically) allow encoding and decoding to undergo errors, then there is no point in protecting gates, since this action introduces even more errors. The conclusion would be that quantum error correction cannot be used in quantum computation! The same conclusion was reached in the 1930s about classical computation. Then, however, von Neumann showed this to be a completely erroneous conclusion and he proved that reliable computation (classical of course, as von Neumann did not know about quantum computation) is possible from unreliable components. His argument can be directly translated into quantum computing, and this gives rise to fault tolerant quantum computation, i.e., in von Neumann's jargon, reliable quantum computation from unreliable components. We now present a sketch of this argument. This is intended only as a qualitative argument that the quantum error correction we have studied in this chapter can be applied to quantum computing in general, and no details will be given.

The idea of fault tolerant quantum computation is to encode the qubits in such a way that the encoding does not introduce more errors than previously present. If the error stays at the same level, we then keep performing error correction until the error has decreased in magnitude. The present state of the art requires 5–10 qubits to encode a single qubit against a single error. It is the iterative application "in depth" of the encoding that will enable us to reduce the error to an arbitrarily small level, provided it is below a certain level to start with. In other words, we will be encoding the encoding bits. Before we give more details, let us just recapitulate the main points about a quantum computer.

An input to a quantum computer is a string of qubits. For this calculation a quantum computer is viewed as consisting of two main parts: quantum gates and quantum wires. By basic quantum gates we mean any set of quantum gates which can perform any desired quantum computation. A universal quantum gate is one whose combinations can be used to simulate any other quantum gate. A quantum wire is used as a representation of that part of the computation of any qubit where the evolution is a simple identity operation (i.e. no gate operates on the qubit), as well as the time the qubit spends during the gate operation.

For stable quantum computation, obviously, we require that the probability of error after the fault–tolerantly encoded basic gate is of higher order (i.e. the error is smaller) than the probability of error after the unencoded gate (that is the whole point of encoding and fault–tolerant error correction!). From this we derive the bound on the size of allowed errors in the wires and in the gates. When we encode the encoding bits again, we reduce the error further, and can reduce the error arbitrarily for an arbitrarily long computation. Therefore, given certain initial limits on the error rate in the gates and wires, we can stabilize any computation to a desirably small error rate, given an unlimited amount of time. Consider a two-input two-output quantum gate. The probability of having any of the three basic errors in the first as well as in the second wire is η, giving the overall first order wire error of 2η. The error in the gate itself is ε. We assume that the overall error of the whole basic gate is ≤ 2η + ε. Suppose that the basic gate is now encoded fault tolerantly against a single error of any kind, using l qubits. Then the overall second order error at the end of the gate is:

η*(η, ε, l) = (1 − (l(l−1)/2) l^4 η²) l² ε + (l(l−1)/2) l^4 η² (1 − l² ε) , (13.16)

i.e. equal to having errors in the wires (this time in second order) and not in the gates, plus having errors in the gates and not in the wires. The term l(l−1)/2 comes from choosing two out of l gates to err, and the factor l^4 derives from the use of l² gates, so that the error is transformed according to η → l²η and is of second order. We require that the fault tolerant error correction reduces the error. Hence:

(1 − (l(l−1)/2) l^4 η²) l² ε + (l(l−1)/2) l^4 η² (1 − ε) ≤ 2η + ε . (13.17)

As the LHS is > η, we simplify the above without great loss of generality to:

(1 − (l(l−1)/2) l^4 η²) l² ε + (l(l−1)/2) l^4 η² (1 − ε) ≤ η . (13.18)

The solutions to the equation derived from the above are:

η_± = (1 ± √(1 − 2(l^8 ε − 2 l^{10} ε²))) / (l^6 − 2 l^8 ε) , (13.19)

We require that η ∈ ℝ (and that 0 ≤ η ≤ 1/2), so that we have the following two regimes of error:

1. 0 < η < η_+ and ε ≤ ε_−.

2. 0 < η < η_− and ε ≥ ε_+.

where ε_± = (1/(2l²)) (1 ± √(1 − 2l^{−6})). The output of the first encoded basic gate is fed into the next one (or part of the output into one next basic gate and the rest into another next basic gate). It is evident that if condition 1 holds, further encoding can only decrease the error. The residual error not taken into account is ∼ l³(l²η)³ = l^9 η³ (i.e. the second order error is not corrected by our encoding). In the worst case, when ε = ε_− ∼ l^{−8}, we get η ∼ l^{−6}, which means that the residual uncorrected error is ∼ l^{−9}. This error can accumulate over time if the computation is sufficiently long. However, the residual error after n in-depth encodings is l^{−O(n)}, which can be made arbitrarily small using sufficiently large n. Therefore, if the initial error per gate is sufficiently small, these gates can be used to perform arbitrarily large quantum computations. If we need l = 10 qubits to fault-tolerantly encode one qubit, then the tolerable error rate is 10^{−6}, which a more careful analysis shows to be correct. By using some further tricks we can make this estimate 10^{−3}.
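As a quick numerical check of the small-ε behaviour of these thresholds under the rough model above (a sketch, nothing more):

import numpy as np

# eps_- = (1 - sqrt(1 - 2 l^-6)) / (2 l^2) ~ l^-8 / 2 for large l
for l in (5, 7, 10):
    eps_minus = (1 - np.sqrt(1 - 2 * l ** -6.0)) / (2 * l ** 2)
    print(l, eps_minus, l ** -8.0 / 2)   # for l = 10: ~5e-9, consistent with ~l^-8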


Chapter 14

Appendix

14.1 Measuring tr ρ^2?

[Figure: interferometric circuit for measuring tr(Uρ): an ancilla passes through a Hadamard, controls U applied to ρ, acquires a phase ϕ, and passes through a second Hadamard before being measured.]

tr(Uρ) = v e^{iα}, where v represents the reduction in the visibility of the interference fringes and α represents the phase shift.

[Figure: the same circuit with the controlled operation replaced by the swap V acting on two copies, ρ ⊗ ρ.]


The state "swap" is defined by V|α〉|β〉 = |β〉|α〉, and

tr[V(ρ ⊗ ρ)] = tr ρ^2

14.2 Generalization of this to tr ρ^k

We need the cyclic shift

V^(k)|α_1〉|α_2〉 . . . |α_k〉 = |α_k〉|α_1〉 . . . |α_{k−1}〉

Thus

tr[V^(k) ρ_1 ⊗ . . . ⊗ ρ_k] = tr(ρ_1 ρ_2 . . . ρ_k) ⇒ tr ρ^k

One can infer the eigenvalues of ρ from tr ρ^k for all k; not all are needed, just as many as the dimensionality.
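Both identities are easy to verify numerically; the sketch below constructs V^(k) explicitly as a permutation matrix, following the convention above (the helper names are mine):

import numpy as np

def random_density_matrix(d, rng):
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho)

def cyclic_shift(d, k):
    # V(k)|a_1>...|a_k> = |a_k>|a_1>...|a_{k-1}> on (C^d)^(tensor k)
    V = np.zeros((d ** k, d ** k))
    for idx in np.ndindex(*([d] * k)):
        out = (idx[-1],) + idx[:-1]
        V[np.ravel_multi_index(out, [d] * k), np.ravel_multi_index(idx, [d] * k)] = 1
    return V

rng = np.random.default_rng(0)
rho = random_density_matrix(3, rng)
for k in (2, 3):
    copies = rho
    for _ in range(k - 1):
        copies = np.kron(copies, rho)
    lhs = np.trace(cyclic_shift(3, k) @ copies)
    print(np.allclose(lhs, np.trace(np.linalg.matrix_power(rho, k))))   # True, True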

14.3 Checking for Entanglement

We need to know whether [I ⊗ Λ](ρ) ≥ 0 for positive maps Λ. But Λ is only positive, not completely positive ⇒ I ⊗ Λ is NOT physical.

However, it can be made physical by mixing in the depolarizing map

D(ρ) = (1/m) I, where ρ is m × m:

Λ̃ = pD + (1 − p)Λ; for a suitable p, Λ̃ is completely positive.

Prescription: implement [I ⊗ Λ̃](ρ) and compute its eigenvalues as described. Check whether the lowest eigenvalue, once the shift due to D is accounted for, is lower than 0. If so ⇒ ρ is entangled.

14.4 Measuring tr(ρ^{T_2})^k

There is a simple method of measuring the eigenvalues of the partially transposed state, and therefore of checking the Peres–Horodecki criterion: the moments tr(ρ^{T_2})^k can be obtained with the same cyclic-shift trick as above.
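In a finite-dimensional numerical example the partial transpose can simply be computed directly; here is a minimal sketch of the Peres–Horodecki test (the index-reshaping convention is my choice):

import numpy as np

def partial_transpose(rho, d1, d2):
    # Transpose the second subsystem: rho[(i,j),(k,l)] -> rho[(i,l),(k,j)]
    r = rho.reshape(d1, d2, d1, d2)
    return r.transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)

bell = np.zeros(4); bell[0] = bell[3] = 1 / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
rho = np.outer(bell, bell)
print(np.round(np.linalg.eigvalsh(partial_transpose(rho, 2, 2)), 3))
# [-0.5  0.5  0.5  0.5]: a negative eigenvalue, so the state is entangled

sep = np.kron(np.diag([0.5, 0.5]), np.diag([0.3, 0.7]))  # a product state
print(np.linalg.eigvalsh(partial_transpose(sep, 2, 2)).min() >= 0)   # True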

14.5 Measuring fidelity between ρ and σ

F = tr(ρσ) = tr(σρ)

With the swap V|i〉|j〉 = |j〉|i〉 we have tr[V(σ ⊗ ρ)] = tr(σρ).


[Figure: the controlled-swap circuit applied to ρ ⊗ σ, whose interference pattern yields tr(ρσ).]

14.6 Properties of von Neumann Entropy

1. Concavity: S(λρ_1 + (1 − λ)ρ_2) ≥ λS(ρ_1) + (1 − λ)S(ρ_2). Mixing increases uncertainty (ignorance).

2. Additivity: S(ρ_1 ⊗ ρ_2) = S(ρ_1) + S(ρ_2). Independent uncertainties add up.

3. Subadditivity: S(ρ_12) ≤ S(ρ_1) + S(ρ_2). Uncertainty in a whole is less than the sum of its parts (because the parts might be correlated); the Araki–Lieb inequality supplies the matching lower bound |S(ρ_1) − S(ρ_2)| ≤ S(ρ_12).

4. S(UρU†) = S(ρ). Unitaries cannot change entropy (as they represent only a change of basis).

5. CP-maps can both increase and decrease S. This is why the key quantity is the relative entropy, as it is semi-directional: it can only decrease under CP maps.
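Properties 1–3 are easy to probe numerically on random states (a sketch; the base-2 logarithm and the helper names are my choices):

import numpy as np

def entropy(rho):
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-(lam * np.log2(lam)).sum())

def random_state(d, rng):
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = A @ A.conj().T
    return m / np.trace(m)

rng = np.random.default_rng(1)
r1, r2 = random_state(2, rng), random_state(2, rng)
lam = 0.3
print(entropy(lam * r1 + (1 - lam) * r2)
      >= lam * entropy(r1) + (1 - lam) * entropy(r2))                   # concavity: True
print(np.isclose(entropy(np.kron(r1, r2)), entropy(r1) + entropy(r2)))  # additivity: True

bell = np.zeros(4); bell[0] = bell[3] = 1 / np.sqrt(2)        # a correlated state
rho12 = 0.5 * np.outer(bell, bell) + 0.5 * np.eye(4) / 4
rho1 = rho12.reshape(2, 2, 2, 2).trace(axis1=1, axis2=3)      # partial traces
rho2 = rho12.reshape(2, 2, 2, 2).trace(axis1=0, axis2=2)
print(entropy(rho12) <= entropy(rho1) + entropy(rho2))        # subadditivity: True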

Theorem. If ρ′ = Σ_i A_i† ρ A_i and Σ_i A_i† A_i = Σ_i A_i A_i† = I (the map is trace preserving and unital), then S(ρ′) ≥ S(ρ).

Proof. Since Σ_i A_i† I A_i = I, the map leaves the identity invariant, so by the monotonicity of relative entropy S(ρ||I) ≥ S(ρ′||I), and since S(ρ||I) = −S(ρ) this gives S(ρ) ≤ S(ρ′).

Q.E.D.

14.7 Properties of Relative Entropy

1. S(σ||σ) = 0. Distance.


2. S(σ_1 ⊗ σ_2||ρ_1 ⊗ ρ_2) = S(σ_1||ρ_1) + S(σ_2||ρ_2). Inherits additivity from the entropy.

3. S(λσ_1 + (1 − λ)σ_2||ρ) ≤ λS(σ_1||ρ) + (1 − λ)S(σ_2||ρ). Convexity: mixing decreases distance (indistinguishability).

4. S(UσU†||UρU†) = S(σ||ρ).

5. S(tr_p σ||tr_p ρ) ≤ S(σ||ρ). Partial trace decreases distinguishability.

6. Donald: Σ_k p_k S(ρ_k||σ) = Σ_k p_k S(ρ_k||ρ) + S(ρ||σ), where ρ = Σ_k p_k ρ_k.
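Since Donald's relation is an exact identity, a numerical check on random states must hold to machine precision (a sketch with illustrative helpers):

import numpy as np

def log2m(m):                                   # matrix logarithm (base 2) of m > 0
    lam, U = np.linalg.eigh(m)
    return U @ np.diag(np.log2(lam)) @ U.conj().T

def rel_entropy(rho, sigma):                    # S(rho||sigma)
    return float(np.real(np.trace(rho @ (log2m(rho) - log2m(sigma)))))

def random_state(d, rng):
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = A @ A.conj().T
    return m / np.trace(m)

rng = np.random.default_rng(2)
states = [random_state(2, rng) for _ in range(3)]
p = [0.2, 0.3, 0.5]
rho = sum(pk * rk for pk, rk in zip(p, states))
sigma = random_state(2, rng)
lhs = sum(pk * rel_entropy(rk, sigma) for pk, rk in zip(p, states))
rhs = sum(pk * rel_entropy(rk, rho) for pk, rk in zip(p, states)) + rel_entropy(rho, sigma)
print(np.isclose(lhs, rhs))   # True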

[Figure: states ρ_1, ρ_2, ρ_3, . . . , ρ_n scattered around their average ρ, with a distinguished state σ at relative-entropy distance S(ρ||σ).]

Average distance to σ = average distance to ρ + distance from ρ to σ.
