
Lectures on Quantum Computing

Dan C. Marinescu and Gabriela M. Marinescu
Computer Science Department
University of Central Florida

Email: [dcm,magda]@cs.ucf.edu

October 13, 2003


Contents

1 Preface

2 Introduction
2.1 Computing and the Laws of Physics
2.2 Quantum Information
2.3 Quantum Computers
2.4 The Wave and the Corpuscular Nature of Light
2.5 Deterministic versus Probabilistic Photon Behavior
2.6 State Description, Superposition, and Uncertainty
2.7 Measurements in Multiple Bases
2.8 Measurements of Superposition States
2.9 An Augmented Probabilistic Model. The Superposition Probability Rule
2.10 A Photon Coincidence Experiment
2.11 A Three Beam Splitter Experiment
2.12 BB84, the Emergence of Quantum Cryptography
2.13 A Qubit of History
2.14 Summary and Further Readings
2.15 Exercises and Problems

3 Quantum Mechanics, a Mathematical Model of the Physical World
3.1 Vector Spaces
3.2 n-Dimensional Real Euclidean Vector Space
3.3 Linear Operators and Matrices
3.4 Hermitian Operators in a Complex n-Dimensional Euclidean Vector Space
3.5 n-Dimensional Hilbert Spaces. Dirac’s Notations
3.6 The Inner Product in an n-Dimensional Hilbert Space
3.7 Tensor and Outer Products
3.8 Quantum States
3.9 Quantum Observables. Quantum Operators
3.10 Spectral Decomposition of a Quantum Operator
3.11 The Measurement of Observables
3.12 More about Measurements. The Density Operator
3.13 Young’s Double-Slit Experiments
3.14 Stern-Gerlach Type Experiments
3.15 The Spin as an Intrinsic Property
3.16 Schrödinger’s Wave Equation
3.17 Heisenberg’s Uncertainty Principle
3.18 A Brief History of Quantum Ideas
3.19 Summary and Further Readings
3.20 Exercises and Problems

4 Qubits and Their Physical Realization
4.1 One Qubit, a Very Small Bit
4.2 The Bloch Sphere Representation of One Qubit
4.3 Rotation Operations on the Bloch Sphere
4.4 The Measurement of a Qubit
4.5 Pure and Impure States of a Qubit
4.6 Two Qubits. Entanglement
4.7 The Fragility of Quantum Information. Schrödinger’s Cat
4.8 Qubits, from Hilbert Spaces to Physical Implementation
4.9 Qubits as Spin 1/2 Particles
4.10 The Measurement of the Spin
4.11 The Qubit as a Polarized Photon
4.12 Entanglement
4.13 The Exchange of Information Using Entangled Particles
4.14 Summary and Further Readings
4.15 Exercises and Problems

5 Quantum Gates and Quantum Circuits
5.1 Classical Logic Gates and Circuits
5.2 One Qubit Gates
5.3 The Hadamard Gate, Beam Splitters and Interferometers
5.4 Two Qubit Gates. The CNOT Gate
5.5 Can we Build Quantum Copy Machines?
5.6 Three Qubit Gates. The Fredkin Gate
5.7 The Toffoli Gate
5.8 Quantum Circuits
5.9 The No Cloning Theorem
5.10 Qubit Swapping and Full Adder Circuits
5.11 Unitary Operations on a Single Qubit. Rotation Matrices
5.12 Single Qubit Controlled Operations
5.13 Multiple Qubit Controlled Operations
5.14 A Quantum Circuit for the Walsh-Hadamard Transform
5.15 Mathematical Models of a Quantum Computer
5.16 Errors, Uniformity Conditions, and Time Complexity
5.17 Summary and Further Readings
5.18 Exercises and Problems

6 Quantum Algorithms
6.1 Introduction to Quantum Algorithms
6.2 Quantum Turing Machines
6.3 Quantum Parallelism
6.4 Deutsch’s Algorithm
6.5 Quantum Fourier Transform
6.6 Simon’s Algorithm for Phase Estimation
6.7 Order Finding
6.8 Quantum Algorithms for Integer Factoring
6.9 The Hidden Subgroup Problem
6.10 Quantum Search Algorithms
6.11 Quantum Simulation
6.12 Summary and Further Readings
6.13 Exercises and Problems

7 Reversible Computations
7.1 Turing Machines, Reversibility, and Entropy
7.2 Thermodynamic Entropy
7.3 Maxwell Demon
7.4 Energy Consumption. Landauer Principle
7.5 Low Power Computing. Adiabatic Switching
7.6 Bennett Information Driven Engine
7.7 Logically Reversible Turing Machines and Physical Reversibility
7.8 Summary and Further Readings
7.9 Exercises and Problems

8 The “Entanglement” of Computing and Communication with Quantum Mechanics
8.1 Uncertainty and Locality
8.2 Possible Explanations of the EPR Paradox
8.3 The Bell Inequality. Local Realism
8.4 EPR Pairs and Bell States
8.5 Quantum Teleportation with Maximally Entangled Particles
8.6 Anti-Correlation and Teleportation
8.7 Dense Coding
8.8 Quantum Key Distribution
8.9 Summary and Further Readings
8.10 Exercises and Problems

9 Appendix I: Modular Arithmetic
9.1 Elementary Number Theory Concepts
9.2 Euclid’s Algorithm for Integers
9.3 Euclid’s Algorithm for Polynomials
9.4 The Chinese Remainder Theorem and its Applications
9.5 Computer Arithmetic for Large Integers
9.6 Summary and Further Readings
9.7 Exercises and Problems

10 Appendix II: Walsh-Hadamard Transform
10.1 Hadamard Matrices
10.2 The Fast Hadamard Transform
10.3 Further Readings
10.4 Exercises and Problems

11 Glossary


Notations

$c$   The speed of light in vacuum: $c = 3 \times 10^{10}$ cm s$^{-1}$.

$h$   Planck’s constant: $h = 6.6262 \times 10^{-34}$ J s.

$\hbar$   Reduced Planck’s constant: $\hbar = h/2\pi = 1.054 \times 10^{-34}$ J s.

$k_B$   Boltzmann’s constant: $k_B = 1.381 \times 10^{-23}$ J K$^{-1}$.

$G$   The universal gravitational constant: $G = 6.672 \times 10^{-8}$ cm$^3$ g$^{-1}$ s$^{-2}$.

$\mathbb{R}$   The field of real numbers.

$\mathbb{C}$   The field of complex numbers.

$Z_n$   The finite field of integers modulo $n$ with $n$ a prime number.

$GF(q)$   The Galois field with $q$ elements with $q = p^n$, with $p$ a prime number and $n$ an integer. The finite field $GF(p^n)$ has characteristic $p$.

$i = \sqrt{-1}$   The imaginary unit, a square root of $-1$.

$\alpha_0, \alpha_1, \ldots$   Complex numbers. $\alpha_j = \mathrm{Real}(\alpha_j) + i \times \mathrm{Imaginary}(\alpha_j)$.

$\alpha_0^*, \alpha_1^*, \ldots$   Complex conjugates. $\alpha_j^* = \mathrm{Real}(\alpha_j) - i \times \mathrm{Imaginary}(\alpha_j)$.

$|\alpha_j|$   The modulus of the complex number $\alpha_j$: $|\alpha_j| = \sqrt{[\mathrm{Real}(\alpha_j)]^2 + [\mathrm{Imaginary}(\alpha_j)]^2}$.

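$e^{i\alpha}$   Euler’s formula: $e^{i\alpha} = \cos(\alpha) + i\sin(\alpha)$.

$\mathbb{C}^n$   $n$-dimensional vector space over the field of complex numbers.

$\mathcal{H}_2$   Two-dimensional Hilbert space.

$\mathcal{H}_n$   $n$-dimensional Hilbert space.

$|\psi\rangle, |\phi\rangle$   ket vector (Dirac’s notation); e.g., $|\psi\rangle, |\phi\rangle$ column vectors in $\mathbb{C}^3$:

$$|\psi\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \end{pmatrix}, \qquad |\phi\rangle = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}$$

$\langle\psi|$   bra vector (Dirac’s notation); the dual of $|\psi\rangle$, i.e., $|\psi\rangle^\dagger$. The row vector $\langle\psi|$ is the transpose of the complex conjugate of $|\psi\rangle$. If

$$|\psi\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \end{pmatrix} \quad \text{then} \quad \langle\psi| = |\psi\rangle^\dagger = (|\psi\rangle^*)^T = (\alpha_0^*, \alpha_1^*, \alpha_2^*).$$

$\langle\psi|\phi\rangle$   The scalar (inner) product of $|\psi\rangle$ and $|\phi\rangle$; it is a complex number.

$$\langle\psi|\phi\rangle = (|\psi\rangle^*)^T |\phi\rangle = (\alpha_0^*, \alpha_1^*, \alpha_2^*) \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = \alpha_0^*\beta_0 + \alpha_1^*\beta_1 + \alpha_2^*\beta_2.$$

$|\psi\rangle \otimes |\phi\rangle$   The tensor product of $|\psi\rangle$ and $|\phi\rangle$; it is a vector.

$$|\psi\rangle \otimes |\phi\rangle = |\psi\rangle|\phi\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \end{pmatrix} \otimes \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} \alpha_0\beta_0 \\ \alpha_0\beta_1 \\ \alpha_0\beta_2 \\ \alpha_1\beta_0 \\ \alpha_1\beta_1 \\ \alpha_1\beta_2 \\ \alpha_2\beta_0 \\ \alpha_2\beta_1 \\ \alpha_2\beta_2 \end{pmatrix}$$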
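The bra, inner-product, and tensor-product definitions above can be checked numerically. The following is a minimal sketch in plain Python, assuming kets are stored as lists of complex amplitudes; the helper names `bra`, `inner`, and `tensor` and the sample amplitudes are illustrative choices, not notation from the text.

```python
# Kets are modeled as plain Python lists of complex amplitudes (column vectors).

def bra(ket):
    """Return <psi|: the complex conjugate of each amplitude, read as a row."""
    return [a.conjugate() for a in ket]

def inner(psi, phi):
    """Inner product <psi|phi> = sum_j conj(alpha_j) * beta_j."""
    return sum(a_conj * b for a_conj, b in zip(bra(psi), phi))

def tensor(psi, phi):
    """Tensor product |psi>|phi>, ordered a0*b0, a0*b1, a0*b2, a1*b0, ..."""
    return [a * b for a in psi for b in phi]

# Illustrative amplitudes (not normalized, chosen only for easy arithmetic).
psi = [1 + 1j, 2 + 0j, 0j]
phi = [1j, 1 + 0j, 3 + 0j]

print(bra(psi))               # row vector of conjugates
print(inner(psi, phi))        # a single complex number
print(len(tensor(psi, phi)))  # 3 * 3 = 9 components
```

Swapping the arguments of `inner` yields the complex conjugate of the original result, which is a quick sanity check on the conjugation in the bra.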


$|\psi\rangle\langle\phi|$   The outer product of $|\psi\rangle$ and $|\phi\rangle$; it is a linear operator or a matrix.

$$|\psi\rangle\langle\phi| = |\psi\rangle (|\phi\rangle^*)^T = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \end{pmatrix} (\beta_0^*, \beta_1^*, \beta_2^*) = \begin{pmatrix} \alpha_0\beta_0^* & \alpha_0\beta_1^* & \alpha_0\beta_2^* \\ \alpha_1\beta_0^* & \alpha_1\beta_1^* & \alpha_1\beta_2^* \\ \alpha_2\beta_0^* & \alpha_2\beta_1^* & \alpha_2\beta_2^* \end{pmatrix}$$

$\| |\psi\rangle \|$   The norm of vector $|\psi\rangle$: $\| |\psi\rangle \| = \sqrt{\langle\psi|\psi\rangle} = \sqrt{|\alpha_0|^2 + |\alpha_1|^2 + |\alpha_2|^2}$.

$A$   Linear operator or matrix.

$\partial A/\partial a_i$   Partial derivative of operator $A$.

$\mathrm{tr}(A)$   The trace of matrix $A$; the sum of its diagonal elements.

$\det(A) = |A|$   The determinant of matrix $A$.

$M^A_{ij}$   Minor obtained by eliminating row $i$ and column $j$ from $A$.

$A^T$   Transpose of matrix $A_{mn}$; row $i$, $1 \le i \le m$, of $A_{mn}$ becomes column $i$ of $A^T_{mn}$.

$A^*$   Complex conjugate of matrix $A_{mn}$; $a_{ij} \to a_{ij}^*$, $1 \le i \le m$, $1 \le j \le n$.

$A^\dagger$   Hermitian conjugate, or dual of $A$. $A^\dagger = (A^*)^T$.

$\rho$   Density matrix.

$\delta_{ij}$   Kronecker’s delta function.

$$\delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}$$

$\Delta x \, \Delta p_x \ge \hbar/2$   A formulation of Heisenberg’s uncertainty principle; $\Delta x$ and $\Delta p_x$ are indeterminacies in particle position and momentum along direction $x$, respectively.

$i\hbar \frac{d}{dt}\Psi = H\Psi$   Schrödinger equation.

$H$   Hamiltonian operator.

$p = h/\lambda$   De Broglie’s equation. $p$ is the momentum of a particle and $\lambda$ is the wavelength of the wave associated with it.

$S = k_B \ln(W)$   The thermodynamic entropy $S$. $k_B$ is Boltzmann’s constant. $W$ is the probability of a given state.

$u \oplus v$   Sum modulo 2 (Exclusive OR). Given binary $n$-tuples $u = (u_0, u_1, \ldots, u_{n-1})$ and $v = (v_0, v_1, \ldots, v_{n-1})$, then $u \oplus v = (u_0 \oplus v_0, u_1 \oplus v_1, \ldots, u_{n-1} \oplus v_{n-1})$. $u_i \oplus v_i = 1$ if $u_i = 0$ and $v_i = 1$, or if $u_i = 1$ and $v_i = 0$.

$\lfloor a \rfloor$   Largest integer smaller than or equal to the real number $a$.

$\gcd(m, n)$   The greatest common divisor of integers $m$ and $n$. The largest integer dividing both $m$ and $n$.

$\mathrm{lcm}(m, n)$   The least common multiple of integers $m$ and $n$.

$|A|$   The cardinality of the set $A$. The number of elements of $A$.

$F^*$   The set of non-zero elements of a field $F$, $F - \{0\}$.

$\mathrm{ord}(\alpha)$   The order of the element $\alpha \in F$, with $F$ a finite field.

$a^{-1} \pmod{n}$   The multiplicative inverse of integer $a$ modulo $n$. $a \cdot a^{-1} \equiv 1 \pmod{n}$.

$Z_p[x]$   Polynomials in $x$ with coefficients from the finite field $Z_p$.

$[g(x)]$   The equivalence class containing the polynomial $g(x)$.

$(a_0, a_1, a_2, a_3)$   Vector representing the polynomial $g(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3$.

$V_k(F)$   Vector space of $k$-tuples over the field $F$.

$[n, M]$-code   Block code with $M$ codewords each of length $n$.

$(n, k)$-code   Linear code with $k$ information symbols and blocks of length $n$.

$(n, k, d)$-code   Linear code with $k$ information symbols, blocks of length $n$, and distance $d$.

$C^\perp$   Orthogonal complement (dual) of code $C$.
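Several of the arithmetic notations above map directly onto Python's standard library. The sketch below is only an illustration; the helper names `xor_tuples`, `lcm`, and `mod_inverse` and the sample values are assumptions introduced here, and the three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
from math import floor, gcd

def xor_tuples(u, v):
    """Component-wise sum modulo 2 (exclusive OR) of two binary n-tuples."""
    return tuple(ui ^ vi for ui, vi in zip(u, v))

def lcm(m, n):
    """Least common multiple via lcm(m, n) * gcd(m, n) = m * n (positive m, n)."""
    return m * n // gcd(m, n)

def mod_inverse(a, n):
    """Multiplicative inverse of a modulo n; requires gcd(a, n) == 1."""
    return pow(a, -1, n)

print(xor_tuples((1, 0, 1, 1), (0, 0, 1, 0)))  # component-wise XOR
print(floor(2.7), floor(-2.7))                 # floor rounds toward minus infinity
print(gcd(12, 18), lcm(12, 18))
print(mod_inverse(3, 7))                       # a value x with 3 * x = 1 (mod 7)
```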


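$\{n, k, d_1, d_2\}$   Quantum code where $n$ qubits are used to store or transmit $k$ bits of information and allow correction of up to $\lfloor (d_1 - 1)/2 \rfloor$ amplitude errors and simultaneously up to $\lfloor (d_2 - 1)/2 \rfloor$ phase errors.

$\sigma_0, \sigma_1, \sigma_2, \sigma_3$   Pauli matrices.

$$\sigma_0 = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma_1 = \sigma_x = X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \sigma_y = Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \sigma_z = Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

$H, S, T$   The Hadamard ($H$), phase ($S$) and $\pi/8$ ($T$) matrices for one qubit gates.

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \quad T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}$$

$G_{CNOT}$   The matrix of the controlled-NOT, CNOT, a two qubit gate.

$$G_{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$

$G_{Toffoli}$   The matrix of the Toffoli gate, a three qubit gate.

$$G_{Toffoli} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}$$

$W_j$   The Walsh-Hadamard transform of $j$ qubits. $W_j$ is defined recursively as: $W_1 = H$, $W_{j+1} = H \otimes W_j$.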
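The gate matrices above can be exercised with a few lines of plain Python. This is a hedged sketch rather than a simulator: the helpers `kron`, `matmul`, `apply`, and `walsh_hadamard` are names introduced here, and the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ with the first qubit as control is an assumption chosen to match the CNOT matrix as printed.

```python
from math import sqrt

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Ordinary matrix product of a and b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(m, v):
    """Apply matrix m to the column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

s = 1 / sqrt(2)
H = [[s, s], [s, -s]]
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

def walsh_hadamard(j):
    """Build W_j from the recursion W_1 = H, W_{j+1} = H (x) W_j."""
    w = H
    for _ in range(j - 1):
        w = kron(H, w)
    return w

HH = matmul(H, H)                    # equals the identity up to rounding
flipped = apply(CNOT, [0, 0, 1, 0])  # |10> is mapped to |11>
W2 = walsh_hadamard(2)               # 4 x 4 matrix, every entry +1/2 or -1/2
print(HH, flipped, W2[0])
```

Applying `H` twice returns the identity, which reflects the fact that the Hadamard matrix is its own inverse.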

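$X$   Discrete or continuous random variable.

$p_X(x)$   The probability density function of the random variable $X$.

$p_{X,Y}(x, y)$   The joint probability density function of random variables $X$ and $Y$.

$p_{X|Y}(x \mid y)$   The conditional probability density function of $X$ conditioned by $Y$.

$F_X(x)$   The cumulative distribution function of the random variable $X$.

$$F_X(x) = \mathrm{Prob}(X < x) = \int_{-\infty}^{x} p_X(t)\,dt \quad \text{(continuous case)}$$

$E[X]$   The expected value of the random variable $X$.

$$E[X] = \begin{cases} \sum_{i=1}^{n} x_i \, p_X(x_i) & \text{discrete case} \\ \int_{-\infty}^{+\infty} x \, p_X(x)\,dx & \text{continuous case} \end{cases}$$

$Var[X]$   The variance of the random variable $X$.

$$Var[X] = \begin{cases} \sum_{i=1}^{n} (x_i - E[X])^2 \, p_X(x_i) & \text{discrete case} \\ \int_{-\infty}^{+\infty} (x - E[X])^2 \, p_X(x)\,dx & \text{continuous case} \end{cases}$$

$H(X)$   The Shannon entropy of random variable $X$.

$$H(X) = \begin{cases} -\sum_{i=1}^{n} p_X(x_i) \log_2 p_X(x_i) & \text{discrete case} \\ -\int p_X(x) \log_2 p_X(x)\,dx & \text{continuous case} \end{cases}$$

$H(X, Y)$   The joint entropy of $X$ and $Y$.

$$H(X, Y) = \begin{cases} -\sum_{x_i} \sum_{y_j} p_{X,Y}(x_i, y_j) \log_2 p_{X,Y}(x_i, y_j) & \text{discrete case} \\ -\int_X \int_Y p_{X,Y}(x, y) \log_2 p_{X,Y}(x, y)\,dx\,dy & \text{continuous case} \end{cases}$$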
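For the discrete case, the expected value, variance, and entropy formulas above reduce to short sums. The sketch below evaluates them for a fair coin and for two independent fair coins; the function names, the sample distributions, and the convention that terms with zero probability contribute nothing to the entropy are assumptions made for this illustration.

```python
from math import log2

def expected_value(xs, ps):
    """E[X] = sum_i x_i * p_X(x_i) for a discrete random variable."""
    return sum(x * p for x, p in zip(xs, ps))

def variance(xs, ps):
    """Var[X] = sum_i (x_i - E[X])^2 * p_X(x_i)."""
    mu = expected_value(xs, ps)
    return sum((x - mu) ** 2 * p for x, p in zip(xs, ps))

def entropy(ps):
    """Shannon entropy in bits; zero-probability outcomes are skipped."""
    return -sum(p * log2(p) for p in ps if p > 0)

# A fair coin with outcomes 0 and 1.
coin_values = [0, 1]
coin_probs = [0.5, 0.5]

# Joint distribution of two independent fair coins.
joint = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

print(expected_value(coin_values, coin_probs))
print(variance(coin_values, coin_probs))
print(entropy(coin_probs))       # H(X)
print(entropy(joint.values()))   # H(X, Y)
```

Because the two coins are independent, the joint entropy comes out as the sum of the two individual entropies.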


1 Preface

Tremendous progress has been made in the area of quantum computing and quantum information theory during the past decade. Thousands of research papers, a few solid reference books, and many popular-science books have been published in recent years in this area. The growing interest in quantum computing and quantum information theory is motivated by the incredible impact this discipline could have on how we store, process, and transmit data and knowledge in this information age.

Computer and communication systems using quantum effects have remarkable properties. Quantum computers enable efficient simulation of the most complex physical systems we can envision. Quantum algorithms allow efficient factoring of large integers with applications to cryptography. Quantum search algorithms considerably speed up the process of identifying patterns in apparently random data. We can guarantee the security of our quantum communication systems because eavesdropping on a quantum communication channel can always be detected.

It is true that we are years, possibly decades, away from actually building a quantum computer requiring little if any power at all, filling up the space of a grain of sand, and computing at speeds that are unattainable today even by covering tens of acres of floor space with clusters made from tens of thousands of the fastest processors built with current state-of-the-art solid state technology. All we have at the time of this writing is a seven-qubit quantum computer able to compute the prime factors of a small integer, 15. Building a quantum computer faces tremendous technological and theoretical challenges. At the same time, we witness a faster rate of progress in quantum information theory, where applications of quantum cryptography seem ready for commercialization. Recently, a successful quantum key distribution experiment over a distance of some 100 km has been announced.

It is very difficult to predict how much time will elapse from the moment of a great discovery until it materializes into a device that profoundly changes our lives. The first atomic bomb was exploded in 1945, less than 10 years after the discovery of nuclear fission by Lise Meitner and Otto Hahn [74]. The first microprocessor was built in the early 1970s, some 25 years after the invention of the transistor on December 23, 1947 by William Shockley, John Bardeen, and Walter Brattain. Francis Harry Compton Crick and James Dewey Watson discovered the double helix structure of the genetic material in 1953 and the full impact of their discovery will continue to reverberate for years to come.

We believe that the time is ripe to spread the knowledge about quantum computing and quantum information outside the circle of quantum computing researchers and students majoring in physics. Students and professionals interested in information sciences should get acquainted with a way of thinking very different from the one used to construct today’s algorithms. This certainly poses tremendous challenges, since, for many years, computer science students have been led to believe that they can get by with some knowledge of discrete mathematics and little if any understanding of physics at all$^1$. But times are changing; in some sense we are going back to the age when a strong connection between physics and computers existed.

$^1$ This seems to be a perennial problem. When James II, the king of Great Britain, insisted that a Benedictine monk be given a degree without taking any examinations or swearing the required oaths, Isaac Newton, who was the Lucasian professor at Trinity College at Cambridge, wrote to the Vice-Chancellor “Be courageous and steady to the Laws and you cannot fail.” The Vice-Chancellor took Newton’s advice and ... was dismissed from his post.

The present volume, “Lectures on Quantum Computing”, is devoted to quantum computing. The first chapter introduces the reader to the quantum world by way of several experiments. The second chapter provides the most basic concepts of quantum mechanics and of the supporting mathematical apparatus. The third chapter introduces the qubit and hints at simple physical realizations of a qubit. The next chapter is devoted to quantum gates and quantum circuits. The fifth chapter presents quantum algorithms. The sixth chapter is devoted to reversible computations. The last chapter introduces the reader to quantum teleportation, quantum key distribution, and dense coding. The text is intended to be self-contained; concepts, definitions, and theorems from linear algebra, necessary to develop the mathematical apparatus of quantum mechanics, are introduced in Chapter 2. Appendix 1 presents modular arithmetic necessary for understanding the factoring algorithms. Appendix 2 is devoted to the Walsh-Hadamard transform.

We treat a quantum computer as a mathematical abstraction. Yet, we discuss in some depth the fundamental properties of a quantum system necessary to understand the subtleties of counterintuitive quantum phenomena such as entanglement.

“Lectures on Quantum Computing” is intended as a textbook for a one-semester first course in quantum computing. The timetable we suggest for covering the material is: two weeks for Chapter 1, two weeks for Chapter 2, two weeks for Chapter 3, three weeks for Chapter 4, three weeks for Chapter 5 and Appendices 1 and 2, one week for Chapter 6, and two weeks for Chapter 7. Any graduate or undergraduate student with a solid background in linear algebra and calculus should be able to do well in the class.

A second volume, “Lectures on Quantum Information Theory”, is devoted to quantum information theory. The volume starts with a deeper discussion of the postulates of quantum mechanics, necessary to understand the rather difficult subject of quantum measurements and the Bell inequality. Then we cover classical information theory and classical error correcting and error detecting codes. The next chapters expose the reader to the quantum information theory concepts and to quantum error correcting codes. We devote a chapter to quantum cryptography and conclude the book with a chapter on the physical realization of quantum computing and communication systems. An appendix provides a summary of concepts, definitions, and theorems related to algebraic structures and to finite fields in particular, necessary to understand coding theory.

A fair number of well-written “popular science” books devoted to qualitative explanation of quantum physics and quantum computing are available. The term “popular” simply means that such books have a larger audience than traditional scientific books, yet, probably, those acquainted with such books form a relatively small segment of the “populus”. As science makes new discoveries at a fairly rapid pace, the gap between those who are acquainted to some degree with the scientific knowledge and those who are oblivious to science becomes larger and larger. This fact should be a cause of great concern for all of us.

The two volumes on quantum computing and quantum information theory combine a qualitative presentation with a more rigorous, quantitative analysis. Whenever possible we attempt to avoid the sometimes difficult mathematical apparatus, the trademark of quantum mechanics. In his marvellous book “A Brief History of Time” [61], Stephen Hawking, the astrophysicist who is now the Lucasian professor, shares with his readers the warning he got from his editor: “expect the sales to be cut in half for every equation in your book”. There are $k \times 10^2$ equations in this series of lectures and $2^{100} \approx 1{,}000^{10}$ is a very large number.

Peter Shor has made several constructive suggestions and signaled some of the problems in an early version of the manuscript. P.K. Aravind, Dan Burghelea, and Boris Zel’dovich have gone with a fine-tooth comb over a more evolved version of the first volume. We are greatly indebted to the four of them. Robert E. Lynch from the Mathematics and the Computer Science Departments at Purdue University has reviewed an early version of the manuscript and made a number of recommendations. Of course, the authors are responsible for all the remaining errors. Tom Robbins, our editor from Prentice Hall, communicated with us his vision and helped shape the book.


2 Introduction

Two of the greatest scientific discoveries of the twentieth century, quantum mechanics and the general theory of relativity, target physical phenomena which we rarely, if ever, experience in our daily lives. After all, no human being has ever travelled at a speed approaching the speed of light, nor are we often in the position to observe quantum effects on earth.

Quantum is a Latin word meaning some quantity. In physics it is used with the same meaning as the word discrete in mathematics, i.e., some quantity or variable that can take only sharply defined values as opposed to a continuously varying quantity. The concepts continuum and continuous are known from geometry and calculus. For example, on a segment of a line there are infinitely many points; the segment consists of a continuum of points. This means that we can cut the segment in half, and then cut each half in half, and continue the process indefinitely.

It is extremely difficult to accept the non-determinism governing the quantum world. Even some of the greatest minds of our time, including Albert Einstein, had doubts about such unsettling ideas. Even more troubling for most of us is the concept of non-locality, the fact that two quantum objects, even when separated by a distance possibly measured in light years, could influence one another, and that the change of state of one determines an instantaneous change of state of the other.

Most laws governing the physical phenomena studied by quantum physics are counterintuitive and can only be presented with the aid of a rather sophisticated mathematical apparatus. We have to free our thinking from sensory information and step into a very different world. For example, we believe that we can measure the position first and then the velocity (thus the momentum) of an object, or we can measure the velocity first and then the position, and expect exactly the same results in both experiments. This is true for a car, an airplane, a rocket, or a bullet in flight, but it is false for quantum particles such as electrons, or for photons. Measuring the position of a quantum particle makes it impossible to determine its velocity with an arbitrarily high level of accuracy. The reverse is also true: if we measure the velocity first, it is impossible to determine the position of the particle with an arbitrarily high level of accuracy. The explanation, very simple for the layperson with some knowledge of linear algebra, has its roots in a rather trivial property of matrix multiplication: the product of matrices is non-commutative$^2$. Measuring a property (the velocity or the position) of a quantum particle corresponds to a mathematical operation, namely, applying the operator associated with that property (observable) to the vector describing the state of the system. A linear operator has a matrix associated with it, and the non-commutativity of the product of two matrices implies that the order of the measurements is important.
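The role of operator ordering can be illustrated with two small matrices. The sketch below uses the Pauli matrices $X$ and $Z$ from the Notations as stand-ins; they are not the position and momentum operators discussed in the text, and the helper names `matmul` and `apply` are introduced here only for illustration.

```python
def matmul(a, b):
    """Ordinary matrix product of a and b (lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(m, v):
    """Apply matrix m to the column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

X = [[0, 1], [1, 0]]    # Pauli X
Z = [[1, 0], [0, -1]]   # Pauli Z

XZ = matmul(X, Z)
ZX = matmul(Z, X)
print(XZ, ZX, XZ == ZX)   # the two products differ

state = [1, 0]
z_then_x = apply(X, apply(Z, state))   # Z applied first, then X
x_then_z = apply(Z, apply(X, state))   # X applied first, then Z
print(z_then_x, x_then_z)              # the results differ by a sign
```

For this particular pair the two products are exact negatives of each other, so exchanging the order flips the sign of the result.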

$^2$ If $A$ and $B$ are matrices then their product is non-commutative: in general $AB \neq BA$.

It is clear, as we shall see shortly, that the laws of physics, and the laws of quantum mechanics in particular, limit our ability to process information increasingly faster and cheaper using present day solid state technologies. The question addressed by the new discipline of quantum information processing is whether the strange world of quantum phenomena can be harnessed and eventually be put to “good use”. It turns out that communication and computer systems using quantum effects have remarkable properties. A quantum communication channel transmits information using quantum effects, e.g., when the information is encoded into the spin of an electron, or in the polarization of a photon. Eavesdropping on a quantum communication channel can always be detected. Quantum computing enables efficient simulation of the most complex physical systems that we can envision. Quantum algorithms have been discovered that allow efficient factoring of large integers. This particular problem is of immense practical interest because efficient factoring algorithms would allow us to decrypt with ease communication encrypted with today’s state-of-the-art encryption techniques. Parallel search quantum algorithms are equally promising for data mining and other important applications.

In this chapter we first discuss the laws of physics that limit our ability to build faster computers using current technologies. Then, we discuss several experiments that provide some insight regarding quantum effects and discuss a practical application of quantum information theory. To conclude the chapter we outline the milestones leading to the new discipline of quantum computing.

2.1 Computing and the Laws of Physics

Computers are systems subject to the laws of physics. One of these laws, the finite speed of light, limits the potential reliability of future computing systems. The components of a computer exchange information among themselves, e.g., the processor reads and writes information from/to memory, data is transferred from the internal registers to the Arithmetic and Logic Unit (ALU), and so on. Transmission of information is associated with a transport of energy from the source to the destination, but no physical phenomenon may propagate with a speed larger than the speed of light. In one nanosecond, $10^{-9}$ seconds, light travels a distance of 30 cm in vacuum, and a signal covers about 20 cm in a metallic conductor. Therefore, the speed of a computer is limited by the size of its components. It is inevitable that in our quest to increase the speed, at some point the components of a computer will approach atomic dimensions$^3$. When this happens, switching, the change of the state of a component, will be governed by Heisenberg’s uncertainty principle$^4$. It follows that we may not be able to determine the state of that component with absolute certainty and the results of a computation, though carried out by a very fast computer, will be unreliable.
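The distances quoted above follow from multiplying a speed by one nanosecond. The short sketch below reproduces that arithmetic; the conductor speed of $2 \times 10^8$ m/s is inferred from the "about 20 cm per nanosecond" figure, and the 2 cm component size and the function names are hypothetical values chosen only for illustration.

```python
C_VACUUM = 3.0e8      # m/s, rounded as in the text (3 x 10^10 cm/s)
V_CONDUCTOR = 2.0e8   # m/s, inferred from "about 20 cm" per nanosecond
NANOSECOND = 1e-9     # seconds

def distance_cm(speed_m_per_s, seconds):
    """Distance covered at constant speed, converted to centimetres."""
    return speed_m_per_s * seconds * 100.0

def max_clock_hz(speed_m_per_s, size_m):
    """Rough upper bound on clock rate if a signal must cross size_m once per cycle."""
    return speed_m_per_s / size_m

print(distance_cm(C_VACUUM, NANOSECOND))     # about 30 cm
print(distance_cm(V_CONDUCTOR, NANOSECOND))  # about 20 cm
print(max_clock_hz(V_CONDUCTOR, 0.02))       # hypothetical 2 cm component
```

Halving the component size doubles the bound returned by `max_clock_hz`, which is the sense in which speed is limited by size.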

The technology enabling us to build smaller and faster computing engines encounters other physical limitations as well. We are limited in our ability to increase the density and the speed of a computing engine. Indeed, the heat produced by a super dense computing engine is proportional to the number of elementary computing circuits, thus, to the volume of the engine. To prevent the destruction of the engine we have to remove the heat through a surface surrounding the device, see Figure 1(b). Let us assume that we pack the logic gates as densely as possible, say in a sphere of radius $r$. The heat dissipated is then proportional to the number of gates, thus to the volume of the sphere, $V = (4/3)\pi r^3$. But the heat can only be removed through the surface of the sphere, $A = 4\pi r^2$. Hence, the amount of heat increases with the cube of the radius of the sphere while our ability to remove heat increases only as the square of the radius.
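The mismatch between heat generated and heat removed can be made concrete by comparing the volume and the surface of the sphere. In this sketch the ratio $V/A = r/3$ grows linearly with the radius; the function names and the sample radii are illustrative assumptions, and the units are arbitrary.

```python
from math import pi

def sphere_volume(r):
    """V = (4/3) * pi * r^3, proportional to the heat generated."""
    return 4.0 / 3.0 * pi * r ** 3

def sphere_area(r):
    """A = 4 * pi * r^2, proportional to the heat that can be removed."""
    return 4.0 * pi * r ** 2

for r in (1.0, 2.0, 4.0):
    ratio = sphere_volume(r) / sphere_area(r)   # equals r / 3
    print(r, sphere_volume(r), sphere_area(r), ratio)
```

Doubling the radius multiplies the volume by eight but the surface only by four, so the heat load per unit of cooling surface doubles.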

$^3$ Atoms are 1 to $2 \times 10^{-10}$ m in radius.

$^4$ Heisenberg’s uncertainty principle says we cannot determine both the position and the momentum of a quantum particle with arbitrary precision. In his Nobel prize lecture on December 11, 1954 Max Born says about this fundamental principle of Quantum Mechanics: “... It shows that not only the determinism of classical physics must be abandoned, but also the naive concept of reality which looked upon atomic particles as if they were very small grains of sand. At every instant a grain of sand has a definite position and velocity. This is not the case with an electron. If the position is determined with increasing accuracy, the possibility of ascertaining its velocity becomes less and vice versa.”

[Figure 1. Panel (a): energy consumption of a logic circuit, E, versus speed of individual logic gates, S. Panel (b): heat removal for a circuit with densely packed logic gates poses tremendous challenges.]

Figure 1: Energy consumption and heat dissipation of classical logic gates. (a) If there is a minimum amount of energy $\varepsilon$ necessary to perform an elementary logic operation, the best we can hope for is a linear increase of the amount of energy used by the device as the speed of the logic gates increases. Thus, the energy required at speed $S$ is $E = \varepsilon S$. (b) To increase the speed of the device we need to pack the logic gates as densely as possible, say in a sphere of radius $r$. The heat dissipated is then proportional to the number of gates, thus to the volume of the sphere, $V = (4/3)\pi r^3$. But the heat can only be removed through the surface of the sphere, $A = 4\pi r^2$.

Moreover, if there is a minimum amount of energy dissipated to perform an elementary operation, then the increase in speed (in the number of operations performed each second by the computing engine) requires at least a linear increase of the amount of energy dissipated by the device, as shown in Figure 1(a). Computer technology of vintage year 2000 requires some $\varepsilon = 3 \times 10^{-18}$ Joules per elementary operation. Even if this limit is reduced, say 100-fold, we shall see a tenfold increase in the amount of power needed by devices operating at a speed $10^3$ times larger than the speed of today’s devices. Moreover, the heat will be dissipated within a much smaller volume and our ability to cool the system will be diminished.

In 1992 Ralph Merkle from Xerox PARC calculated that a 1 GHz computer operating at room temperature, with $10^{18}$ gates packed in a volume of about 1 cm³, the size of a sugar cube, would dissipate 3 MW of power [28]. A small city with 1,000 homes each using 3 kW would require the same amount of power; a 500 MW nuclear reactor could only power some 166 such circuits.
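The arithmetic behind these figures can be checked with a short script. This is a sketch under an assumption the text does not state: each gate is taken to dissipate about $kT \ln 2$ per operation at $T = 300$ K, a value chosen here because it reproduces the quoted 3 MW. The function names are illustrative.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant, J/K


def energy_per_operation(temperature_k: float) -> float:
    """Assumed energy per gate operation, k*T*ln(2), in joules."""
    return BOLTZMANN_J_PER_K * temperature_k * math.log(2)


def dissipated_power(gates: float, clock_hz: float, energy_j: float) -> float:
    """Power in watts if every gate dissipates energy_j once per clock cycle."""
    return gates * clock_hz * energy_j


epsilon = energy_per_operation(300.0)            # room temperature (assumed 300 K)
power_w = dissipated_power(1e18, 1e9, epsilon)   # 10^18 gates at 1 GHz
circuits_per_reactor = int(500e6 // 3e6)         # 500 MW reactor, 3 MW per circuit

print(f"energy per operation: {epsilon:.2e} J")
print(f"dissipated power:     {power_w / 1e6:.2f} MW")
print(f"circuits per reactor: {circuits_per_reactor}")
```

With the energy per operation taken as a parameter, the same two functions can be reused for the relation $E = \varepsilon S$ of Figure 1(a).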

The laws of classical physics limit our ability to compute faster and cheaper, thus we need to look elsewhere and consider a revolutionary rather than an evolutionary approach to computing. One of the directions to explore is that of quantum physics.

Does quantum theory play only a supporting role by defining the limitations of present day physical systems used for computing and communication? The answer is a resounding no. Quantum properties such as uncertainty, interference, and entanglement form the foundation of a new brand of theory, the quantum information theory, where computational and communication processes rest upon fundamental physics.

Quantum computing is not fiction. We have a reasonable hope to build quantum computers during the next few decades and we expect amazing results.


2.2 Quantum Information

Until recently, information and computation models had a feeble connection with physics. Complexity theory addressed the time and space complexity of algorithms. Time and space are physical attributes, thus a connection with physical reality is still maintained. Information theory was concerned with entropy as a measure of the uncertainty associated with a random event and its relationship with information transmission over communication channels. It is fair to say that our information and computation models lacked physical awareness and required little understanding of the basic laws of physics. The laws of quantum mechanics were viewed more as an annoyance than a necessity; they could be needed at a distant point in the future when the physical systems used to store and transfer information would consist only of a few atoms.

Now this view is challenged by very significant results from the new discipline of quantum computing. Quantum computing and quantum information theory are concerned with the transmission and processing of quantum states and the interactions of such quantum information with the “classical” one.

A quantum bit, or qubit for short, is a quantum system used to store information. As opposed to a bit which can be in one of two states, “0” and “1”, a qubit can exist in a continuum of states. Moreover, we can measure the value of a bit with certainty without affecting its state, while the result of measuring a qubit is non-deterministic and the measurement alters its state. In existing computers, if a bit is, for instance, in state “0”, then when it is measured the result is “0” and the state of this bit remains “0” after the measurement.

A classical communication channel allows electromagnetic waves or optical phenomena to propagate and it is characterized by its capacity, the maximum quantity of information that can be transmitted through the channel per unit of time. A quantum communication channel is a physical system capable of delivering quantum systems more or less intact from one place to another.

The path from classical information to quantum information is a process of extension, refinement, and completion of our knowledge. It follows the evolution of our thinking in other areas of science. Consider the example of number theory; one started with a concept inspired by the physical reality, positive integers; then it was realized that one needs to define the additive inverse of a positive integer and negative integers were born; soon one discovered that the multiplicative inverse of an integer is needed and the rational numbers were introduced; after a while irrational numbers and then real numbers were added to the family of numbers.

2.3 Quantum Computers

Let us take a closer look at our computing machines and get a glimpse of the potential advantages of their quantum incarnation. A classical computing engine is a deterministic system evolving from an input state to an output state. All the initial states the system could be in at the beginning of the computation, as well as all the states traversed by the system during its dynamic evolution, have some canonical labelling; the label of each state can be measured by an external observer. The measured output state label is a deterministic function $f$ of the input state label; we say that the engine computes a “function” $f$. Two classical computing engines are equivalent if, given the same labelling of their input and output states, they compute the same function $f$ [39].


Quantum computers are stochastic engines because the state of a quantum system is uncertain; a certain probability is associated with any possible state the system can be in. The output states of a stochastic engine are random: the label of the output state cannot be discovered. All we can do is to label a set of pairs consisting of an output state of an observable and a measured value of that observable. In layman’s terms, observable means a characteristic, or attribute; in quantum mechanics we say that each pair consists of an eigenstate of a Hermitian operator and one of its eigenvalues, as we shall see in Chapter 3. There is an asymmetry between the input and the output of a quantum engine reflecting the asymmetry between preparation and measurement in quantum mechanics. We put more information into the preparation of a quantum system than we can get back out in a single measurement of that system.

Now, a few words about quantum mechanics. It is a mathematical model of the physical world. This model allows us to specify states, observables, measurements, and the dynamics of quantum systems. As we shall see later, a Hilbert space, a space of n-dimensional complex vectors, is the center stage for quantum mechanics. A Hilbert space is a mix of Trafalgar Square, Place Pigalle, and Times Square... where you could meet Heisenberg, von Neumann, Schrödinger and other luminaries.

A Hilbert space is indeed a very large space... and we need every bit of it. Let us revisit the statement made earlier that numerical simulation of physical processes was and continues to be the motivation for the development of increasingly more powerful computing engines. Whether the scientists look up at the sky and try to answer fundamental questions related to the evolution of the Universe or try to decipher the structure of the matter, they need to simulate increasingly more complex systems. One measure of the complexity of a system is the number of states the system can be in. For example, the theory of black hole thermodynamics predicts [39] that a system enclosed by a surface with an area $A$ has a number $N(A)$ of observable states given by:

$$ N(A) = e^{A c^3 / 4 \hbar G} $$

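with $c = 3 \times 10^{10}$ cm/second the speed of light, $\hbar = 1.054 \times 10^{-34}$ Joules × seconds (that is, $1.054 \times 10^{-27}$ erg × seconds) the reduced Planck constant, and $G = 6.672 \times 10^{-8}$ cm³ g⁻¹ s⁻² the gravitational constant. Even for a relatively small spherical object with a radius of 1 km the exponent $A c^3 / 4 \hbar G$ is of the order of $10^{76}$, so $N(A)$ is a very large number.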
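The size of the exponent can be evaluated directly. This is a minimal sketch which assumes all constants are expressed in cgs units, so $\hbar$ is converted from Joules × seconds to erg × seconds; the function name is illustrative. Under these assumptions the exponent for the 1 km sphere comes out near $10^{76}$.

```python
import math

C_CM_PER_S = 3.0e10      # speed of light, cm/s
HBAR_ERG_S = 1.054e-27   # reduced Planck constant, erg*s (1.054e-34 J*s)
G_CGS = 6.672e-8         # gravitational constant, cm^3 g^-1 s^-2


def log_number_of_states(radius_cm: float) -> float:
    """Return ln N(A) = A c^3 / (4 hbar G) for a sphere of the given radius."""
    area = 4.0 * math.pi * radius_cm ** 2
    return area * C_CM_PER_S ** 3 / (4.0 * HBAR_ERG_S * G_CGS)


exponent = log_number_of_states(1.0e5)  # radius of 1 km = 1e5 cm
print(f"ln N(A) is about {exponent:.1e}")
```

Working with the logarithm avoids overflow: $N(A)$ itself is far too large to represent as a floating-point number.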

Let us now turn briefly our attention to the concept of numerical simulation of a physical system. To simulate the behavior of a physical system we first construct a model of the system. Then we design a program that reflects the essential properties of the physical system captured by the model.

The model abstracts for us the physical system and the results of the simulation can only be as good as the model is. For example, an atomic model of a virus structure describes the geometric position and the type of each atom. The atomic model of a virus structure is useful for virological and immunological studies and for the design of antiviral drugs. Antiviral drugs block the binding site of a virus and prevent it from infecting healthy cells. Typically a macromolecular structure, such as a virus, has several million atoms. If instead of describing the position and the type of each atom we describe only groups of atoms (say groups of $10^2$ atoms), then we reduce the complexity of the model (instead of say $10^6$ objects we describe only $10^4$), but our model is less accurate. If the model of the virus is less accurate, then the antiviral drugs designed based upon this model are less likely to be effective.

Suppose the model of the system has $n$ states. The simulation program must contain a description of each state, thus the space complexity of the simulation program is at least $O(n)$. The model should also describe the dynamics of the system. This implies that the simulation program should describe the set of transitions among the $n$ states and the action(s) associated with every transition. Assuming that each state is reachable from every other state, the dynamics of the system requires the description of the action(s) taken for each of the $O(n^2)$ transitions. Multiple actions may be associated with a transition from a state $s_i$ to a state $s_j$, depending upon the past history of the system, reflected by the path followed by the system to reach state $s_i$.

We conclude that the time (and the space) complexity of a simulation program is at least $O(n)$, where $n$ represents the number of states of the model. To carry out a numerical simulation we need a physical computer with sufficient resources (mainly CPU rate and primary memory) to complete the simulation in a reasonable time, namely hours or days, rather than years, centuries, or even longer.

We have to balance the accuracy of the model versus the feasibility of the numerical simulation. With today’s computers we simply cannot simulate a system with a very large state space, thus an accurate simulation of a physical system is rarely possible. Quantum mechanics allows us to accommodate an extremely large state space, and may lead to an exact simulation of a physical system.

Mathematically, $|\psi\rangle$, the state of a quantum bit, or qubit, is represented as a vector in a two-dimensional complex vector space⁵. In this space a vector has two components and the projections of the state vector on the basis vectors are complex numbers.

While a classical bit can be in one of two states, “0” or “1”, the qubit can be in the states $|0\rangle$ and $|1\rangle$, called computational basis states, and also in any state that is a linear combination of these states. This phenomenon is called superposition.

Consider now a system consisting of $n$ such particles whose individual states are described by vectors in the two-dimensional vector space. In classical physics the individual states of particles combine through the Cartesian product. The possible states of the classical system of $n$ particles form a vector space of $2n$ dimensions; given $n$ bits, we can construct $2^n$ n-tuples and describe a system with $2^n$ states.

Individual state spaces of $n$ particles combine quantum mechanically through the tensor product. If $X$ and $Y$ are vectors, then their tensor product $X \otimes Y$ is also a vector, but its dimension is $\dim(X) \times \dim(Y)$ while the Cartesian product $X \times Y$ has dimension $\dim(X) + \dim(Y)$. For example, if $\dim(X) = \dim(Y) = 10$, then the tensor product of the two vectors has dimension 100 while the Cartesian product has dimension 20.
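The difference between the two ways of combining state spaces can be seen in a few lines of code. This is a sketch for vectors stored as plain Python lists; the function names are illustrative, and the direct (Cartesian) product is modelled simply as concatenation of components, so that its dimension is the sum of the two dimensions.

```python
def tensor_product(x, y):
    """Tensor (Kronecker) product of two vectors given as lists."""
    return [a * b for a in x for b in y]


def cartesian_product(x, y):
    """Cartesian (direct) product: the components of x followed by those of y."""
    return list(x) + list(y)


x = [1.0] * 10
y = [2.0] * 10
print(len(tensor_product(x, y)))     # dimension dim(X) * dim(Y)
print(len(cartesian_product(x, y)))  # dimension dim(X) + dim(Y)

# A two-qubit basis state built from single-qubit basis states: |0> (x) |1> = |01>
ket0, ket1 = [1, 0], [0, 1]
print(tensor_product(ket0, ket1))
```

Applying `tensor_product` repeatedly to $n$ two-component vectors yields a vector with $2^n$ components, which is the exponential growth of the state space discussed here.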

In a quantum system having $n$ qubits, the state space has $2^n$ dimensions. There are $2^n$ basis states forming a computational basis and there are superposition states resulting from the superposition of basis states. The catch is that even though one quantum bit, a system with $2^1$ basis states, can be in one of infinitely many superposition states, when the qubit is measured, the measurement changes the state of the quantum system to one of the two basis states. From one qubit we can only extract a single classical bit of information.
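The following sketch illustrates this catch for a single qubit $\alpha_0 |0\rangle + \alpha_1 |1\rangle$. It assumes the standard rule that the outcome "0" occurs with probability $|\alpha_0|^2$ and the outcome "1" with probability $|\alpha_1|^2$; the function name and the fixed seed are illustrative choices.

```python
import math
import random


def measure(alpha0: complex, alpha1: complex, rng: random.Random):
    """Measure a qubit alpha0|0> + alpha1|1> in the computational basis.

    Returns the classical bit observed and the post-measurement state.
    """
    p0 = abs(alpha0) ** 2
    p1 = abs(alpha1) ** 2
    if not math.isclose(p0 + p1, 1.0, rel_tol=1e-9):
        raise ValueError("state must be normalized")
    if rng.random() < p0:
        return 0, (1.0, 0.0)  # the state collapses to |0>
    return 1, (0.0, 1.0)      # the state collapses to |1>


rng = random.Random(2003)     # fixed seed so that the run is repeatable
amp = 1 / math.sqrt(2)
counts = [0, 0]
for _ in range(10_000):
    bit, _state = measure(amp, amp, rng)
    counts[bit] += 1
print(counts)
```

Each call returns one classical bit; the two amplitudes themselves are never revealed by a single measurement, only estimated from the statistics of many identically prepared qubits.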

In quantum systems the amount of parallelism increases exponentially with the size of the system, thus with the number of qubits. This means that the price to pay for an exponential increase in the power of a quantum computer is a linear, and very small to start with, increase in the amount of matter and space needed to build the larger quantum computing engine.

The access to the results of a quantum computation is restrictive because any interaction of an outside observer with a quantum system disturbs the quantum state. The process of disturbing the quantum state due to the interaction with the environment is called decoherence. This is a major problem we have to overcome in designing quantum algorithms.

⁵ A detailed explanation of the terms used now can be found in Chapter 3.

Two photons, or two electrons, can be in a superposition state of close coupling with each other, an intimately fused state known as an entangled state, a state with no classical analogy. Entanglement is the exact translation of the German term Verschränkung used by Schrödinger, who was the first to recognize this quantum effect. It means that the state of a two-particle quantum system cannot be written as a tensor product of the states of the individual particles. The state of an entangled system cannot be decomposed into contributions of individual particles.
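For two qubits the factorization condition can be tested directly. The sketch below assumes a state written as $a_{00}|00\rangle + a_{01}|01\rangle + a_{10}|10\rangle + a_{11}|11\rangle$ and uses the fact that such a state is a tensor product of two single-qubit states exactly when $a_{00} a_{11} - a_{01} a_{10} = 0$; the function name and tolerance are illustrative.

```python
import math


def is_product_state(a00, a01, a10, a11, tol=1e-12):
    """True if a00|00> + a01|01> + a10|10> + a11|11> factors into
    (b0|0> + b1|1>) (x) (c0|0> + c1|1>).

    The state factors exactly when the 2x2 matrix of amplitudes has
    rank one, that is, when its determinant vanishes.
    """
    return abs(a00 * a11 - a01 * a10) < tol


s = 1 / math.sqrt(2)
print(is_product_state(0.5, 0.5, 0.5, 0.5))  # (|0> + |1>)(|0> + |1>) / 2
print(is_product_state(0, s, -s, 0))         # (|01> - |10>) / sqrt(2)
```

The first state factors into two identical single-qubit states; the second, the antisymmetric combination, does not, so it is entangled.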

Two quantum particles may be in an entangled state even if they are not in close proximity of each other. Even when the entangled particles are separated from one another, a change of state of one of the entangled particles instantaneously affects the other particle and determines its state. Later, in Chapter 3, we see that entanglement occurs for other quantum systems as well. For example, the singlet state⁶, the antisymmetric state of a pair of electrons, is an entangled state. The two electrons find themselves in the same orbital (three quantum numbers are identical) but, according to the Pauli exclusion principle, they have their spins oriented in opposite directions, i.e., $|1\uparrow\rangle\,|2\downarrow\rangle$ or $|1\downarrow\rangle\,|2\uparrow\rangle$. If, as a result of an experiment, one of the electrons is made to change the orientation of its spin, then a simultaneous measurement of the other one finds it in a state with an opposite spin.

It is very difficult to provide an intuitive and simple explanation of this subtle phenomenon which has no classical analogy. At the risk of oversimplification, consider an example of entanglement taken from... our daily lives! When Alice and Bob, two characters of many cryptography texts, marry one another, their lives become entangled. After a few months they find out that Alice is pregnant. Shortly afterwards, Bob takes off for an intergalactic voyage to Andromeda. The very moment Alice gives birth to their child named Samantha, Bob’s state changes instantly, though he may be half a light-year away somewhere in our galaxy. Bob becomes a father. An external observer could see the baby and decide that Bob’s state has changed⁷.

Entanglement is a very puzzling phenomenon. Feynman writes [47]: “A description of the world in which an object can apparently be in more than one place at the same time, in which a particle can penetrate a barrier without breaking it, in which widely separated particles can cooperate in an almost psychic fashion, is bound to be both thrilling and bemusing.”

In a recent paper [17], Charles Bennett and Peter Shor, two of the pioneers of quantum computing and quantum information theory, discuss the similarities and dissimilarities between classical and quantum information. They point out that “classical information can be copied freely, but can only be transmitted forward in time, to a receiver in the sender’s forward light cone. Entanglement, by contrast cannot be copied, but can connect any two points in space-time. Conventional data-processing operations destroy entanglement but quantum operations can create it, preserve it and use it for various purposes, notably speeding up certain computations and assisting in the transmission of classical data (quantum superdense coding) or intact quantum states (teleportation) from a sender to a receiver.”

These facts are intellectually pleasing but two questions come immediately to mind. Can such a quantum computer be built? Are there algorithms capable of exploiting the unique possibilities opened by quantum computing?

⁶ A singlet electron state corresponds to a pair of electrons with anti-parallel spins, $|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle$, i.e., the electrons have different spin quantum numbers, +1/2 and −1/2. The total spin of the state is zero.

⁷ There is no physical change in Bob’s state but only a (somewhat abstract) change in his family status as a result of his wife’s (very physical) birthing process. Of course, Bob learns of his new state only when a message sent through a classical communication channel reaches him, but this is beside the point.

The answer to the first question is that only a seven-qubit quantum computer has been built so far [119]; several proposals to build quantum computers using nuclear magnetic resonance, optical and solid state techniques, and ion traps have been studied. The answer to the second question is that problems in integer arithmetic, cryptography, or search problems have surprisingly efficient solutions in quantum computing.

We have a decent hope that quantum computers can be built. When this time comes, quantum computers will be able to solve computational problems that are unsolvable today, and will remain unsolvable using classical computers, even if we assume that Moore’s law⁸, governing the rate of increase of the speed of classical devices, will continue to hold for the next few decades.

There are also skeptics who have serious doubts about the future of quantum computers. One of them, Dyakonov, contrasts the excitement generated by recent results in quantum information theory with the enormously difficult problems posed by the actual physical realization of a quantum computing device [45]. Talking about the universal quantum gate that can transform an arbitrary state of the quantum computing engine into another state via a unitary transformation, he tells an old joke: “During the first months of World War II, an inventor kept telling anybody willing to listen to him that he had an idea of extreme military value. Eventually, the rumor reached Stalin who decided to listen to the inventor.

Inventor: I propose that you should have three buttons on your desk, colored green, blue, and white. When you push the green button, all enemy ground forces shall disappear; when you push the blue button, all enemy U-boats shall sink to the bottom of the oceans; and, when you push the white button, all airplanes shall blow up in smoke.

Stalin: How will it work?

Inventor: Well, it is up to your scientists and engineers to figure that out. I just gave you the idea.”

2.4 The Wave and the Corpuscular Nature of Light

Sophisticated quantum phenomena such as interference and entanglement are critical for understanding quantum computing and quantum information theory. It is true that we do not need quantum mechanics to explain most of the phenomena we observe in our daily life, but there are many phenomena involving light that cannot be explained unless we accept the fact that light comes as corpuscles, as finite grains of energy. Look through the window and besides the objects in your garden you may see faint reflections of yourself. This happens because some of the photons, the discrete particles carrying light, bounce off the glass of the window.

The nature of light was a constant source of interest for philosophers and then for physicists. Aristotle believed that white light was a basic single entity. An early treatise on the subject, Kepler’s “Optics”, did not challenge this idea. Isaac Newton’s first work as the Lucasian Professor at Cambridge was in optics; in January 1670 he delivered his first lecture on the subject. The chromatic aberration in a telescope lens convinced Newton that Aristotle’s hypothesis was false. Newton argued that white light is a mixture of many different types of rays which are refracted at slightly different angles, and that each type of ray produces a different spectral color. Newton’s “Opticks” appeared in 1704; it dealt with the theory of light and color and covered the diffraction of light and “Newton’s rings”. To explain some of his observations he had to use a wave theory of light in conjunction with his corpuscular theory.

⁸ Moore’s law states that the speed of microprocessors doubles every 1.8 years.

In the second half of the 19th century, James Clerk Maxwell constructed a consistent theory of electromagnetism and showed that light is a form of electromagnetic radiation. Yet, as Maxwell’s theory was gaining universal acceptance, in 1905 Einstein used the new quantum theory of Planck to explain the photoelectric effect.

Einstein’s explanation, involving photons, seemed to signal a return to the Newtonian model of light. Photons, though massless, are like any other particles: they carry both momentum and energy and act like billiard balls. When colliding with the electrons at the surface of a metal the photons can knock them free and leave behind a positive charge. This is the simplified explanation for the photoelectric effect. The photoelectric effect is often used to measure the intensity of light. A device called a photomultiplier allows us to detect light of very low intensity, and even individual photons.

In 1923 Louis de Broglie proposed that a wave should be associated with every kind of particle. He provided an equation that linked the momentum $p$ of a particle to the wavelength $\lambda$ of the wave associated with it, and the Planck constant $h$:

$$ p = \frac{h}{\lambda}. $$
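As a numerical illustration of de Broglie's relation, the sketch below computes the wavelength associated with an electron. The electron mass and the speed of $10^6$ m/s are assumed example values that do not appear in the text, and the momentum is taken as the non-relativistic $p = mv$.

```python
PLANCK_J_S = 6.626e-34        # Planck constant h, J*s
ELECTRON_MASS_KG = 9.109e-31  # electron rest mass (assumed value, not from the text)


def de_broglie_wavelength(mass_kg: float, speed_m_per_s: float) -> float:
    """Wavelength lambda = h / p for a non-relativistic particle, p = m * v."""
    momentum = mass_kg * speed_m_per_s
    return PLANCK_J_S / momentum


wavelength = de_broglie_wavelength(ELECTRON_MASS_KG, 1.0e6)
print(f"{wavelength:.2e} m")
```

For these values the wavelength is a few times $10^{-10}$ m, comparable to atomic dimensions.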

De Broglie’s assumption opened a new chapter in quantum physics and in our understanding of the atomic and subatomic world.

There are experiments involving light with very intriguing results that cannot be explained without invoking the principles of quantum mechanics. We discuss several experiments to put us in the frame of mind needed for understanding quantum information. First, we present a simple experiment revealing the granular nature of light. Next, we discuss a simple model to address the non-deterministic effects. Then, we challenge this model to explain the results of successive measurements performed upon a beam of quantum particles using different bases. We augment the model with the superposition probability rule for events that may occur on alternative paths and discuss several experiments with results consistent with this rule.

2.5 Deterministic versus Probabilistic Photon Behavior

The contemporary theory regards light as a flux of photons and, at the same time, supports the fact that light exhibits the properties of a wave. To understand this duality consider a device called a beam splitter, a half-silvered mirror, see Figure 2(a). A beam of light falling on a beam splitter is split into two components, one transmitted and one reflected. A beam splitter may transmit a larger or a smaller fraction of the incident light depending upon the characteristics of the silver deposition. For the experiments discussed in this section the two components are of equal intensity. The color, i.e., the wavelength, of the light is not altered by a beam splitter, a behavior consistent with a wave.

If we decrease the intensity of the incident light we are able to observe the granular nature of light. Imagine that we send a single photon. Then either detector D1 or detector D2 in Figure 2(a) senses the photon. If we repeat the experiment involving a single photon over and over again we observe that each one of the two detectors records about the same number of events (one event is the detection of a photon).


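[Figure 2 image: (a) an incident beam of light reaching a beam splitter, with the reflected beam going to detector D1 and the transmitted beam to detector D2; (b) cascaded beam splitters with detectors D1, D3, D5, D7, and D2.]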

Figure 2: (a) One beam splitter. We send a single photon and observe that either detector D1 or detector D2 senses the photon. If we repeat the experiment many times, then we observe that each one of the two detectors records about the same number of events. (b) Cascaded beam splitters. All detectors have a chance of detecting the single photon sent in one experiment. The cascaded beam splitter experiment convinces us to dismiss the idea that a photon carries a “gene” and exhibits a deterministic behavior.

This is puzzling; could there be hidden information which controls the behavior of a photon? Does each photon carry a “transmit” or a “reflect” gene, so that one with a “transmit” gene continues and reaches detector D2 and another with a “reflect” gene ends up at D1 [77]? If this is true, the two genes should have an equal probability of occurrence throughout the entire population of photons.

It is not difficult to dismiss this genetic view of photon behavior: consider the setup in Figure 2(b) with a cascade of beam splitters. As before, we send a single photon, repeat the experiment many times, and count the number of events registered by each detector. According to our theory we expect the first beam splitter to decide the fate of an incoming photon; the photon is either reflected by the first beam splitter or transmitted by all of them. Thus, only the first and last detectors in the chain are expected to register events, and in equal numbers.

The experiment shows that all the detectors have a chance to register an event. This result discredits our theory of deterministic behavior caused by a gene and leads us to seek another possible explanation of the strange effects revealed by this experiment, one based upon a probabilistic behavior of quantum particles. The term “probabilistic behavior” in this context means that multiple outcomes of an experiment are possible and there is a certain probability for each of these outcomes to occur. For example, we may assume that a photon tosses a coin when reaching a beam splitter and the result of the toss determines the fate of the photon emerging from that beam splitter.

If a photon tosses an unbiased coin, the probability of flipping four heads in a row to reach detector D2 is $p_{D2} = (1/2)^4$. Thus, if we repeat this experiment 1,000,000 times, then detector D2 will register about 1,000,000/16 = 62,500 events. For this discussion we assumed that a photon is transmitted by a beam splitter if it flips a head, and it is reflected if it flips a tail, and we considered independent events.
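The coin-tossing picture is easy to simulate. The sketch below assumes a chain of four 50/50 beam splitters, with the reflected photons caught by detectors D1, D3, D5, and D7 in turn and the photons transmitted by all four splitters reaching D2; this detector layout is read off Figure 2(b), and the function name and seed are illustrative.

```python
import random


def run_cascade(num_photons: int, num_splitters: int = 4, seed: int = 1):
    """Send photons one at a time through a chain of 50/50 beam splitters.

    counts[k] is the number of photons reflected at splitter k (k = 0 is the
    first splitter); counts[num_splitters] counts the photons transmitted by
    every splitter in the chain.
    """
    rng = random.Random(seed)
    counts = [0] * (num_splitters + 1)
    for _ in range(num_photons):
        exit_index = num_splitters      # transmitted by all, unless reflected
        for k in range(num_splitters):
            if rng.random() < 0.5:      # "tail": reflected at splitter k
                exit_index = k
                break
        counts[exit_index] += 1
    return counts


labels = ["D1", "D3", "D5", "D7", "D2"]
counts = run_cascade(1_000_000)
for label, count in zip(labels, counts):
    print(label, count)
```

Every detector registers events, with counts close to 1/2, 1/4, 1/8, 1/16, and 1/16 of the photons sent, in contrast with the “gene” picture in which only the first and the last detector would fire.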

2.6 State Description, Superposition, and Uncertainty

Let us now try to construct a simple mathematical model for the state of a two-dimensional quantum system able to capture what we have observed so far. When measured by an external observer the state should reveal one of two values that are opposite to each other, for example, “0” or “1”. Other names for the two opposite values⁹ are possible, e.g., “transmitted” and “reflected”, “black” and “white”, “hard” and “soft”.

[Figure 3 image: unit vectors $\vec{v}$ drawn from the origin O in the plane of the axes “0” and “1”; (a) at 45° with equal projections $q$; (b) with a 30° angle and projections $q_0$ and $q_1$.]

Figure 3: A model for the measurement of a quantum system in a superposition state; the binary quantum state is represented by a vector $\vec{v}$ of unit length. The projections of the vector determine the probabilities of the outcomes. (a) The two possible outcomes have equal probabilities, $p_0 = p_1 = q^2$ with $q = 1/\sqrt{2}$. (b) One of the outcomes, “1”, has a larger probability than the other, “0”, $p_1 = q_1^2 > p_0 = q_0^2$ because $q_1 > q_0$.

Let us consider that the state is represented by a vector $\vec{v}$ connecting the origin of the coordinate system to a point on the periphery of a unit circle. The two axes are labelled “0” and “1” and the projection of the vector on each of them determines the probability of observing the outcome “0” or the outcome “1”, respectively.

The projections of the vector are called probability amplitudes. The probability of an outcome is the square of the number giving its probability amplitude. If $q$ is the projection on the “0” axis, the two possible outcomes of the experiment occur with probabilities $p_0 = q^2$ and $p_1 = 1 - q^2$. The normalization condition requires that $p_0 + p_1 = 1$; thus we have to restrict ourselves to vectors of length one.

In Figure 3(a) we illustrate the case when the two projections are equal, $q = 1/\sqrt{2}$, and thus $p_0 = p_1 = 1/2$. In Figure 3(b) we observe that one of the projections is larger than the second, $p_1 > p_0$, thus we are more likely to observe the outcome “1” than “0”.
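The relation between the direction of the state vector and the probabilities of the two outcomes can be sketched as follows. The convention that the angle $\theta$ is measured from the “0” axis, so that the projections are $\cos\theta$ and $\sin\theta$, is an assumption made here for illustration; the angles are example values.

```python
import math


def outcome_probabilities(theta_deg: float):
    """Probabilities (p0, p1) for a unit vector at angle theta from the "0" axis."""
    theta = math.radians(theta_deg)
    q0 = math.cos(theta)  # projection (probability amplitude) on the "0" axis
    q1 = math.sin(theta)  # projection (probability amplitude) on the "1" axis
    return q0 ** 2, q1 ** 2


print(outcome_probabilities(45.0))  # equal projections, equal probabilities
print(outcome_probabilities(60.0))  # larger projection on the "1" axis
```

Because the vector has unit length, the two probabilities add up to one for any angle.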

This very simple model is extremely powerful, has very profound implications, and, as we shall see in the next chapter, it is able to explain experimental results that classical physics fails to explain. A salient feature of the model is that the state represented by $\vec{v}$ is a superposition of two possible basis states “0” and “1”. Here $\vec{v}$ is the actual state of the system before we make an observation. To measure (observe) the system means to project the state vector on the two basis states.

⁹ We often use the mathematical term orthogonal states when talking of opposite valued outcomes for reasons that shall become apparent later.

Here the outcomes corresponding to projections on these basis states are all we can learn about the system as a result of a measurement, and this leaves us uncertain about the actual state of the system before we made our observations. However, after we performed our observation, the system is no longer in an uncertain state, but it is precisely in one of the two possible states, “0” or “1”.

Obviously, there are other models which could explain the behavior we have seen so far, but it turns out that this model was not chosen arbitrarily; it is a model consistent with other gedanken (thought) experiments, as well as with real experiments in quantum physics.

This model explains the behavior we noticed in the cascaded beam splitter experiment presented in Figure 2(b). Call

$$ |\psi_0\rangle = \alpha_t^0 \, |t\rangle + \alpha_r^0 \, |r\rangle $$

the state of an incoming photon, with $\alpha_t^0$ the probability amplitude of the photon to be transmitted and $\alpha_r^0$ the probability amplitude of the photon to be reflected. In our case $\alpha_t^0 = \alpha_r^0 = 1/\sqrt{2}$, thus the probability of a photon to be transmitted is equal to the probability to be reflected, $p_t = p_r = (1/\sqrt{2})^2 = 1/2$. All cascaded beam splitters are identical, and the probabilities of the two observable events, transmission and reflection, are equal.

In this case the quantum behavior is consistent with the classical probabilistic behavior discussed in Section 2.5. If we repeat the experiment 1,000,000 times then D1 will register about 500,000 events, D3 will register about 250,000 events, D5 will register about 125,000 events, and D7 and D2 will register about 62,500 events each.

We shall see in the next chapter that uncertainty and probabilistic behavior are quintessential properties of quantum systems. In fact, one of the pillars of the quantum model of the physical world is Heisenberg’s uncertainty (indetermination) principle which states that one cannot measure both the position $x$ and the momentum $p$ along direction $x$ of a particle with arbitrary precision. The uncertainty in the measurement of the position, $\Delta x$, and the uncertainty in the measurement of the momentum, $\Delta p$, must satisfy the following inequality:

$$\Delta x \times \Delta p \geq \frac{1}{2}\hbar,$$

with $\hbar = h/2\pi = 1.054 \times 10^{-34}$ Joules × second and $h = 6.626 \times 10^{-34}$ Joules × second$^{10}$.

This inequality reflects the impossibility of knowing precisely the complete state of a quantum system. It also reflects the fact that any measurement disturbs the system being measured.
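To get a feeling for the orders of magnitude involved, the inequality can be evaluated numerically. The sketch below uses the value of $\hbar$ quoted above; the confinement length of $10^{-10}$ m (roughly an atomic diameter) and the electron mass are illustrative choices that do not come from the text, and the function name is hypothetical.

```python
HBAR = 1.054e-34           # J*s, the value of h/(2*pi) quoted in the text
ELECTRON_MASS = 9.109e-31  # kg, illustrative particle (not from the text)


def min_momentum_uncertainty(delta_x):
    """Smallest delta_p (kg*m/s) allowed by delta_x * delta_p >= hbar / 2."""
    if delta_x <= 0:
        raise ValueError("delta_x must be positive")
    return HBAR / (2.0 * delta_x)


if __name__ == "__main__":
    delta_x = 1e-10  # metres, an assumed confinement length
    delta_p = min_momentum_uncertainty(delta_x)
    print(f"delta_p >= {delta_p:.3e} kg*m/s")
    print(f"delta_v >= {delta_p / ELECTRON_MASS:.3e} m/s for an electron")
```

Halving the position uncertainty doubles the minimum momentum uncertainty, which is the trade-off the inequality expresses.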

2.7 Measurements in Multiple Bases

Let us now test the model we have developed in a more complex setting. We discuss the case when we measure properties of a quantum system using distinct vector bases.

Assume that we have a quantum particle with two properties, “color” and “hardness” [4]. The “color” of a particle may be either “white” or “black” and a particle may be “hard” or “soft”. The two properties are attributed to a particle in a totally random fashion: about 50% of the particles are “white” and about 50% are “black”; similarly, about 50% are “hard” and about 50% are “soft”.

10 For practical purposes, Planck’s constant can be expressed in units of Joules/Hz.


Figure 4: (a) A “color separation system”. (b) A “hardness separation system”. (c) Three measurement systems in cascade. We observe an equal number of “white” and “black” particles emerging from the third box though we expected that no “black” particles enter the second box and the second box cannot fabricate “black” particles.

Imagine that we can construct a “color separation system”, a box with a slit on the left hand side, one on the right hand side and one at the top, see Figure 4(a). A beam of particles enters the box through the slit on the left and the beam is split by a color-based beam splitter; the “white” particles continue along their path and exit the box through the right slit, while the “black” particles are deflected by an angle of 90° and exit through the upper slit. We also construct a “hardness separation system”, shown in Figure 4(b), similar to the “color” separation system. This time the beam splitter lets the “soft” particles continue and deflects the “hard” ones.

When we use the “color separation system” about 50% of the particles, the “white” ones, are allowed to continue; when we use the “hardness separation system” only about 50% of the incoming particles, the “soft” ones, are allowed to continue.

Now we design a more sophisticated experiment involving a sequence of three separation systems: first color, then hardness, and finally one more color separation system, see Figure 4(c). They are aligned, allowing the beam of “white” particles emerging from the first box to enter the second separation box and then the “soft” particles emerging from the second box to enter the third one. Amazingly enough, we observe an equal number of “white” and “black” particles emerging from the third box. It looks as if the hardness separation system is able to “fabricate” “black” particles, even though we have taken all possible precautions in building the hardness box so that it only deflects the “hard” particles and lets the “soft” ones continue along their path.

Can we explain this strange behavior using our model? The answer is yes. First, recall from the description of our model in Section 2.6 that each measurement represents a projection of the state vector on the vectors forming the basis of a coordinate system. We perform three measurements using two distinct bases rotated with respect to one another, each basis corresponding to one of the two separation systems, see Figure 5.


Figure 5: We perform three successive measurements and use two vector bases. The first has $|\leftrightarrow\rangle$ and $|\updownarrow\rangle$ as basis vectors, while the second has $|\nwarrow\rangle$ and $|\nearrow\rangle$ as basis vectors. The first measurement represents the projection of the original state vector $|\psi\rangle$ on the basis formed by $|\leftrightarrow\rangle$ and $|\updownarrow\rangle$. The projections are $\alpha_0$ and $\alpha_1$. As a result of the first measurement we obtain either $|\leftrightarrow\rangle$ or $|\updownarrow\rangle$. Let us assume that we obtained $|\updownarrow\rangle$. The second measurement means to project the result of the first measurement, $|\updownarrow\rangle$, onto the second coordinate system formed by $|\nwarrow\rangle$ and $|\nearrow\rangle$. Now we get two distinct projections $\beta_0$ and $\beta_1$. Assume that we observe the result $|\nwarrow\rangle$. For the third measurement we project $|\nwarrow\rangle$, the result of the second measurement, on the basis of the first coordinate system, formed by $|\leftrightarrow\rangle$ and $|\updownarrow\rangle$. The two projections are $\gamma_0$ and $\gamma_1$.

Let us move closer to the traditional notations used in quantum mechanics and denote the state vector $\vec{v}$ as $|\psi\rangle$. Assume that when we use the first separation system, associated with property $P_1$, we perform the measurement on a basis formed by two orthogonal vectors $|\leftrightarrow\rangle$ and $|\updownarrow\rangle$. Call $\alpha_0$ and $\alpha_1$ the projections, or the “amplitudes”, of $|\psi\rangle$ on these basis vectors.

Recall that the result of a measurement can only be one of the two basis vectors and assume that the result of the first measurement is the state vector $|\updownarrow\rangle$. Now we use the second separation system, associated with property $P_2$, and we measure on a new basis formed by the vectors $|\nwarrow\rangle$ and $|\nearrow\rangle$. Call $\beta_0$ and $\beta_1$ the projections of $|\updownarrow\rangle$. Assume that the result of the second measurement is the $|\nwarrow\rangle$ state vector. After that we measure again using the separation system associated with property $P_1$; this time the state vector $|\nwarrow\rangle$ is projected on the basis vectors $|\leftrightarrow\rangle$ and $|\updownarrow\rangle$ and its projections are $\gamma_0$ and $\gamma_1$.

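The model explains why the hardness separation system seems to fabricate “black” particles: the state $|\nwarrow\rangle$ that emerges from the second box is not one of the “color” basis vectors, so both of its projections $\gamma_0$ and $\gamma_1$ on that basis are nonzero and the third box finds particles of both colors.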

No refinement of the experimental set-up discussed in this section can possibly alter this unexpected outcome. We have identified a practical manifestation of the fact that the quantum state is a superposition of state projections on a set of basis vectors. The state of the particles emerging from the second box is a superposition of projections on “white” and “black” basis vectors.
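The three successive projections can be worked out numerically. The sketch below assumes that the second basis is rotated by 45° with respect to the first; the text does not state the angle, but this choice reproduces the observed 50-50 statistics. The variable and function names are hypothetical.

```python
import math


def projection(basis_vector, state):
    """Projection (amplitude) of a real state vector on a basis vector."""
    return sum(b * s for b, s in zip(basis_vector, state))


S = 1 / math.sqrt(2)

# First basis, used by the separation system for property P1.
H_ARROW = (1.0, 0.0)   # the horizontal-arrow basis vector
V_ARROW = (0.0, 1.0)   # the vertical-arrow basis vector

# Second basis, used for property P2. ASSUMPTION: rotated by 45 degrees.
NE_ARROW = (S, S)      # the north-east arrow basis vector
NW_ARROW = (-S, S)     # the north-west arrow basis vector

# The first measurement is assumed to have produced the vertical-arrow state.
state_after_first = V_ARROW
beta = (projection(NW_ARROW, state_after_first),
        projection(NE_ARROW, state_after_first))

# The second measurement is assumed to have produced the north-west state.
state_after_second = NW_ARROW
gamma = (projection(H_ARROW, state_after_second),
         projection(V_ARROW, state_after_second))

# Probabilities of the two outcomes at the third box.
third_box_probabilities = tuple(g * g for g in gamma)

if __name__ == "__main__":
    print("beta  =", beta)
    print("gamma =", gamma)
    print("third-box probabilities =", third_box_probabilities)
```

Both entries of `gamma` have magnitude $1/\sqrt{2}$, so each outcome of the third measurement occurs with probability 1/2, even though the particles entering the second box all had the same value of the first property.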

We have to free our thinking from the notations used and grasp the essence of the phenomena. We have a vector describing the state and we can use the standard notation $\vec{r}$, or we may use the traditional notation of quantum mechanics [43], introduced by Dirac, $|\psi\rangle$:

$$|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$$

with $\alpha_0$ and $\alpha_1$ complex numbers and with $|0\rangle$ and $|1\rangle$ two vectors forming an orthonormal basis for the vector space.

We use the words “measurement” and “observation” rather loosely to describe an interaction of a quantum particle with its environment. Indeed, a beam splitter, or a color separation system, in fact interacts with a photon or another particle, rather than observing or measuring it.

2.8 Measurements of Superposition States

We now reinforce what we have just learned, in the context of an experiment emphasizing the superposition states of light particles, photons. Light is a form of electromagnetic radiation$^{11}$.

As an electromagnetic radiation, light consists of an electric and a magnetic field perpendicular to each other and, at the same time, perpendicular to the direction in which the energy is transported by the electromagnetic (light) wave. The electric field oscillates in a plane perpendicular to the direction of flight and the way the electric field vector travels in this plane defines the polarization of the light. When the electric field oscillates along a straight line we say that the light is linearly polarized. When the end of the electric field vector moves along an ellipse, the light is elliptically polarized. When the end of the electric field vector moves around a circle, the light is circularly polarized. If the light comes toward us and the end of the electric field vector moves around in a counterclockwise direction, we say that the light has right-hand polarization; if the end of the electric field vector moves in a clockwise direction, we say that the light has left-hand polarization. A polarization filter is a partially transparent material that transmits light of a particular polarization.

This experiment is discussed in [93]. We have a source S capable of generating randomly polarized light and a screen E where we measure the intensity of the light. We also have three polarization filters: A, for vertical polarization, $\updownarrow$; B, for horizontal polarization, $\leftrightarrow$; and C, for polarization at 45°, $\nearrow$. Using this experimental setup we make the following observations:

(i) Without any filter the intensity of the light measured at E is I, see Figure 6(b).

(ii) If we interpose filter A between S and E, the intensity of the light measured at E is $I' \approx I/2$, see Figure 6(c).

(iii) If between A and E in the previous setup we interpose filter B, then the intensity of the light measured at E is $I'' = 0$, see Figure 6(d).

(iv) If between filters A and B in the previous setup we interpose filter C, the intensity of the light measured at E is $I''' \approx I/8$, see Figure 6(e).

11 The wavelength of the radiation in the visible light spectrum varies from 780 nm (red) to 390 nm (violet).

Figure 6: (a) The polarization of a photon is described by a unit vector $|\psi\rangle = \alpha_0 |\updownarrow\rangle + \alpha_1 |\leftrightarrow\rangle$ on a two-dimensional space with basis $|\updownarrow\rangle$ and $|\leftrightarrow\rangle$. Measuring the polarization is equivalent to projecting the random vector $|\psi\rangle$ onto one of the two basis vectors. The two projections are $\alpha_0$ and $\alpha_1$. (b) Source S sends randomly polarized light to the screen; the measured intensity is $I$. (c) The filter A with vertical polarization is inserted between the source and the screen and the intensity of the light measured at E is about $I/2$. (d) Filter B with horizontal polarization is inserted between A and E. The intensity of the light measured at E is now 0. (e) Filter C with a 45° polarization is inserted between A and B. The intensity of the light measured at E is about $I/8$.

The photons emitted by the source S have random polarizations; this means that the vectors describing the polarization of the photons have random orientations, see Figure 6(a).

Let us denote by $|\psi\rangle$ a 2-dimensional vector representing the random polarization of a photon$^{12}$. The vector $|\psi\rangle$ can be expressed as a linear combination of a pair of orthogonal basis vectors. We can use different orthonormal bases and we choose one denoted as $|\updownarrow\rangle$ and $|\leftrightarrow\rangle$:

$$|\psi\rangle = \alpha_0 |\updownarrow\rangle + \alpha_1 |\leftrightarrow\rangle, \qquad |\alpha_0|^2 + |\alpha_1|^2 = 1.$$

Measuring the polarization of a photon is equivalent to projecting the random vector $|\psi\rangle$ onto one of the two basis vectors. The measurement performed by a vertical polarization filter, $|\updownarrow\rangle$, provides an answer to the question: “does the incoming photon have a vertical polarization?”; similar statements can be made about horizontal filters, 45° filters, or filters with any other polarization.

After the measurement in Figure 6(c), the superposition state $|\psi\rangle$ is resolved as one of the basis states, $|\updownarrow\rangle$ or $|\leftrightarrow\rangle$; a photon is forced to choose either the vertical, $|\updownarrow\rangle$, or the horizontal, $|\leftrightarrow\rangle$, polarization state. The probability that a photon in state $|\psi\rangle$ with random polarization is forced to choose the $|\updownarrow\rangle$ state is

$$p_{\updownarrow} = |\alpha_0|^2.$$

The probability that it is forced to choose the $|\leftrightarrow\rangle$ state is

$$p_{\leftrightarrow} = |\alpha_1|^2.$$

The sum of the two probabilities must be one, since each photon is forced to choose one of the two basis states:

$$p_{\updownarrow} + p_{\leftrightarrow} = 1.$$

Once the choice is made, only those photons which have made a choice agreeing with the polarization of the filter are allowed to pass. This explains the results of (ii) and (iii) above, see Figure 6(c) and (d). Indeed, due to the random polarization of the photons emitted by the source only about 50% of them emerge as vertically polarized from filter A. Clearly, none of them can make it through filter B; all of them have a vertical polarization, $|\updownarrow\rangle$, thus the projection of their polarization on the $|\leftrightarrow\rangle$ basis vector is zero. The probability of each photon to reach the screen E is zero.

Until now everything seems clear and reasonable. The interesting fact is the introduction of filter C with a 45° polarization between A and B. Filter C measures the quantum state with respect to a different basis than filters A and B; the new basis consists of vectors at 45° and 135°, given by:

$$\left\{ \frac{1}{\sqrt{2}}\left(|\updownarrow\rangle + |\leftrightarrow\rangle\right),\ \frac{1}{\sqrt{2}}\left(|\updownarrow\rangle - |\leftrightarrow\rangle\right) \right\}.$$

Filter C has a 45° polarization. It forces incoming photons to choose between the two basis states, 45° and 135°. About 50% of the photons emerge from filter C with a 45° polarization and continue along their path. The other 50% end up with a 135° polarization and are stopped by filter C.

12 A randomly polarized photon is described by its density matrix as discussed in Section 3. The state vector representation is only valid when the phases of the coefficients $\alpha_0$ and $\alpha_1$ are random with respect to each other.


Recall that only about 50% of the photons emitted by the source reach C because of the filtering done by A; therefore only about 25% of the photons emitted by the source reach filter B in Figure 6(e).

Now, once again the basis used to measure the polarization of the photons changes; filter B has a horizontal polarization and forces a measurement using the original basis vectors, vertical, $|\updownarrow\rangle$, or horizontal, $|\leftrightarrow\rangle$. Again the incoming photons with a 45° orientation are forced to make a choice, based upon the projection on the new basis. Again only about 50% make it through. Thus only about 1/8 = 12.5% of the original number of photons make it to the screen, and all have horizontal polarization.
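The intensities observed in (i) through (iv) follow from repeated projections, and the arithmetic can be sketched in a few lines. The code below assumes ideal filters: the first filter passes half of the randomly polarized photons, and each later filter passes a fraction equal to the squared projection of the incoming polarization on the filter direction, $\cos^2$ of the angle between them. The function name and the angle convention (0° horizontal, 90° vertical) are our own.

```python
import math


def transmitted_fraction(filter_angles_deg):
    """Fraction of randomly polarized photons that passes a chain of filters.

    filter_angles_deg lists the polarization angle of each ideal filter, in
    the order in which the light meets them.
    """
    if not filter_angles_deg:
        return 1.0          # no filter: the full intensity I reaches the screen
    fraction = 0.5          # random polarization: half pass the first filter
    for previous, current in zip(filter_angles_deg, filter_angles_deg[1:]):
        # Squared projection of the previous polarization on the new direction.
        fraction *= math.cos(math.radians(current - previous)) ** 2
    return fraction


if __name__ == "__main__":
    A, B, C = 90.0, 0.0, 45.0   # vertical, horizontal, 45 degrees
    print("no filter:", transmitted_fraction([]))
    print("A        :", transmitted_fraction([A]))
    print("A, B     :", transmitted_fraction([A, B]))
    print("A, C, B  :", transmitted_fraction([A, C, B]))
```

Up to floating-point rounding, the four cases give $I$, $I/2$, $0$ and $I/8$, matching Figure 6(b) through (e).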

What did we learn from this simple experiment? First, we identified a candidate to store and transport information; we can use the polarization of a photon to store a bit of information. But this bit is unusual: it may take not only the values “0” and “1” but an infinite set of values. Second, there is something strange about the measurement process; once we measure a bit we affect its state. Even though this bit may take infinitely many values prior to the measurement, when we interact with it during the measurement process, we force it to take one of the two possible values, “0” or “1”.

2.9 An Augmented Probabilistic Model. The Superposition Probability Rule

Now let us augment our model with an additional rule to describe even more complex experiments when an outcome may be reached through several different paths. As before, we consider a state vector

$$|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$$

describing a superposition of two basis state vectors, denoted as $|0\rangle$ and $|1\rangle$. Here $\alpha_0$ and $\alpha_1$ are the probability amplitudes corresponding to the two possible outcomes. The probability of an outcome is the square of the modulus of the complex number giving its probability amplitude.

The behavior of quantum systems in such cases is governed by the superposition probability rule: if an event may occur in two or more indistinguishable ways then the probability amplitude of the event is the sum of the probability amplitudes of each case considered separately.

To illustrate the new probability rules consider the experiment in Figure 7(a). A photon emitted by S1 or by S2 could be either reflected (R) or transmitted (T) by BS1 and BS2 and will eventually be detected either by D1 or by D2. In certain conditions, we observe experimentally that a photon emitted by S1 is always detected by D1 and never by D2 and one emitted by S2 is always detected by D2 and never by D1.

A photon incident on one of the beam splitters may come either from direction1 or from direction2. The state vector of a photon coming to one of the beam splitters from direction1 is described by the vector

$$|\psi_1\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle,$$

while the state vector of the same photon coming to one of the beam splitters from direction2 is described by the vector

$$|\psi_2\rangle = \alpha_0 |0\rangle - \alpha_1 |1\rangle.$$


Figure 7: (a) Two sources, S1 and S2, generate photons, one at a time. There are two beam splitters, BS1 and BS2, two reflecting mirrors, U and L, and two detectors, D1 and D2. The beam splitters BS1 and BS2 are constructed so that the probability of a photon to be reflected is equal to the probability of the photon to be transmitted, thus $q = 1/\sqrt{2}$. We observe experimentally that all photons generated by S1 are detected by D1; none reaches D2. Conversely, all photons generated by S2 are detected by D2. The experimental observations are consistent with the superposition probability rule. (b) The state vector of a photon coming to one of the beam splitters from direction1 is described by the vector $|\psi_1\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$. The projection of this state vector on $|0\rangle$ is the probability amplitude of the event that the photon is transmitted, $\alpha_0 = +q$. The probability amplitude of the photon to be reflected is $\alpha_1 = +q$. (c) The state vector of a photon coming to one of the beam splitters from direction2 is described by the vector $|\psi_2\rangle = \alpha_0 |0\rangle - \alpha_1 |1\rangle$. The projection of this state vector on $|0\rangle$ is the probability amplitude of the photon to be transmitted, $\alpha_0 = +q$. The projection on $|1\rangle$ is the probability amplitude of the photon to be reflected, $-\alpha_1 = -q$.

The two state vectors must be different. Indeed, the phenomena we describe must be reversible. This implies that we should be able to trace back each photon to its source.

A photon emitted by one of the sources (S1 or S2) may take one of four different paths (see Figure 8) depending on whether it is transmitted or reflected by each of the two beam splitters.


Figure 8: When a photon is emitted by S1 there are four possible events. (a) The TT case. The probability amplitude of a photon coming from direction1 to be transmitted by BS1 is $+q$ and the probability amplitude of a photon coming from direction2 to be transmitted by BS2 is also $+q$. Thus, TT, the event that the photon is transmitted by both beam splitters, has a probability amplitude $(+q)(+q) = q^2$. (b) The RR case. The probability amplitude of a photon coming from direction1 to be reflected by BS1 is $+q$ and the probability amplitude of a photon coming also from direction1 to be reflected by BS2 is also $+q$. Thus the probability amplitude of RR is again $(+q)(+q) = q^2$. (c) The TR case: the photon is transmitted by the first beam splitter and reflected by the second with a probability amplitude $(+q)(-q) = -q^2$. (d) The RT case: the photon is reflected by the first beam splitter and transmitted by the second with a probability amplitude $(+q)(+q) = q^2$. As before, $q = 1/\sqrt{2}$. The probability amplitude of a photon emitted by S1 to reach D1 is the sum of the two amplitudes of TT and RR, thus it is $2q^2$. Then the probability of a photon emitted by S1 to reach D1 is $p_{S_1 \to D_1} = (2q^2)^2 = 4q^4 = 4(1/\sqrt{2})^4 = 1$.

For example, for a photon emitted by S1 the paths are

TT - transmitted by both,

RR - reflected by both,

TR - transmitted by BS1 and reflected by BS2, and

RT - reflected by BS1 and transmitted by BS2.

A photon emitted by S1 will reach D1 following either the TT or the RR path. Similarly, a photon emitted by S2 will reach D2 following either the TT or the RR path. A photon emitted by S1 will reach D2 following either the TR or the RT path. Similarly, a photon emitted by S2 will reach D1 following either the TR or the RT path.

The probability amplitude of a photon coming from direction1 to be transmitted by BS1 is $+q$ and the probability amplitude of a photon coming from direction2 to be transmitted by BS2 is also $+q$, with $q = 1/\sqrt{2}$. Thus, TT has a probability amplitude $q^2$, as shown in Figure 8(a). Notice that in this case the photon comes to the two beam splitters from different directions.

Now let us examine the RR case. Notice that this time the photon comes to the two beam splitters from the same direction. The probability amplitude of a photon coming from direction1 to be reflected by BS1 is $+q$ and the probability amplitude of a photon coming also from direction1 to be reflected by BS2 is also $+q$. Thus, the probability amplitude of RR is again $(+q)(+q) = q^2$, as shown in Figure 8(b).

The probability amplitude of a photon emitted by S1 to reach D1 is the sum of the two amplitudes of TT and RR, thus it is $q^2 + q^2 = 2q^2$. Therefore, the probability of a photon emitted by S1 to reach D1 is $p_{S_1 \to D_1} = (2q^2)^2 = 4q^4 = 4(1/\sqrt{2})^4 = 1$.

Now the probability amplitude of a photon coming from direction1 to be transmitted by BS1 is $+q$ and the probability amplitude of a photon coming from direction2 to be reflected by BS2 is $-q$. Thus, TR has a probability amplitude $(+q)(-q) = -q^2$, as shown in Figure 8(c). The probability amplitude of a photon coming from direction1 to be reflected by BS1 is $+q$ and the probability amplitude of a photon coming from direction1 to be transmitted by BS2 is $+q$. Thus, RT has a probability amplitude $(+q)(+q) = q^2$, as shown in Figure 8(d). The probability amplitude of a photon emitted by S1 to reach D2 is the sum of the two amplitudes of TR and RT, thus it is $q^2 - q^2 = 0$. Therefore, the probability of a photon emitted by S1 to reach D2 is $p_{S_1 \to D_2} = 0$.

Similar arguments can be provided for a photon emitted by S2. From this discussion it follows that a beam splitter is characterized by four numbers, the probability amplitudes of a photon coming from direction1 and direction2 to be transmitted or to be reflected.

Thus the superposition probability rule enables us to show that

$$p_{S_1 \to D_1} = 1, \qquad p_{S_1 \to D_2} = 0$$

and

$$p_{S_2 \to D_2} = 1, \qquad p_{S_2 \to D_1} = 0$$

and to explain the results of the experiment presented in this section. We can now comprehend the meaning of the word “indistinguishable” in the formulation of the superposition probability rule. Once a photon reaches detector D1 we know that it originated from S1, but we have no idea whether it followed path TT or RR; the two paths are indistinguishable. The system is reversible because a photon detected by D1 can always be traced back to its source, S1, regardless of the path the photon followed.
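The bookkeeping for a photon emitted by S1 can be written down directly from the four path amplitudes of Figure 8. The sketch below also computes, for contrast, what one would obtain by adding the probabilities of the paths instead of their amplitudes; the dictionary layout and the function name are our own and only illustrate the rule.

```python
import math

q = 1 / math.sqrt(2)

# For each path of a photon emitted by S1: the amplitudes at BS1 and BS2
# (taken from Figure 8) and the detector that the path leads to.
PATHS_FROM_S1 = {
    "TT": ((+q, +q), "D1"),
    "RR": ((+q, +q), "D1"),
    "TR": ((+q, -q), "D2"),
    "RT": ((+q, +q), "D2"),
}


def detector_probabilities(paths):
    """Return (superposition-rule result, sum-of-path-probabilities result)."""
    amplitude_sum = {}
    probability_sum = {}
    for (amp_first, amp_second), detector in paths.values():
        path_amplitude = amp_first * amp_second
        amplitude_sum[detector] = amplitude_sum.get(detector, 0.0) + path_amplitude
        probability_sum[detector] = (
            probability_sum.get(detector, 0.0) + path_amplitude ** 2
        )
    # Superposition rule: add the amplitudes first, then square the sum.
    quantum = {d: a ** 2 for d, a in amplitude_sum.items()}
    return quantum, probability_sum


if __name__ == "__main__":
    quantum, path_probability_sum = detector_probabilities(PATHS_FROM_S1)
    print("amplitudes added   :", quantum)
    print("probabilities added:", path_probability_sum)
```

Adding amplitudes gives probability 1 for D1 and 0 for D2 (up to rounding), in agreement with the experiment, whereas adding the path probabilities would give 1/2 for each detector.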


Finally, we stress the fact that probability rules are different for classical systems, where Bayes’ rule for conditional probabilities applies. Assume that it is known that event $A$ occurred, but it is not known which one of the set of mutually exclusive and collectively exhaustive events $B_1, B_2, \ldots, B_n$ has occurred. Then the conditional probability that one of these events, $B_j$, occurs, given that $A$ occurs, is given by

$$P(B_j | A) = \frac{P(A | B_j) P(B_j)}{\sum_i P(A | B_i) P(B_i)}.$$

$P(B_j | A)$ is called the a posteriori probability. For example, assume that there are two different routes to reach the summit of Everest, call them $B_1, B_2$, and denote by $P(A)$ the probability of reaching the summit and by $P(B_i)$ the probability of taking route $i$, $i \in \{1, 2\}$. Then, by the law of total probability (the denominator in Bayes’ rule):

$$P(A) = P(A | B_1) \times P(B_1) + P(A | B_2) \times P(B_2),$$

with $P(A | B_i)$ the conditional probability of reaching the summit via route $i$. In the general case when there are $n$ different alternatives:

$$P(A) = \sum_{i=1}^{n} P(A | B_i) \times P(B_i).$$
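The classical rule can be illustrated with the two-route example. All numbers below are hypothetical and chosen only to make the arithmetic concrete; the function names are our own.

```python
def total_probability(priors, likelihoods):
    """P(A) = sum_i P(A | B_i) * P(B_i) for exclusive, exhaustive B_i."""
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in zip(priors, likelihoods))


def posteriors(priors, likelihoods):
    """A posteriori probabilities P(B_j | A) for every j."""
    p_a = total_probability(priors, likelihoods)
    return [p_b * p_a_given_b / p_a
            for p_b, p_a_given_b in zip(priors, likelihoods)]


if __name__ == "__main__":
    # HYPOTHETICAL numbers: route 1 is taken 70% of the time and succeeds
    # with probability 0.2; route 2 is taken 30% of the time and succeeds
    # with probability 0.5.
    route_priors = [0.7, 0.3]
    success_given_route = [0.2, 0.5]
    print("P(A)       =", total_probability(route_priors, success_given_route))
    print("P(B_j | A) =", posteriors(route_priors, success_given_route))
```

Here the probabilities of the alternatives are added directly and every term is non-negative, so two alternatives can never cancel each other. This is the key difference from the superposition probability rule, where amplitudes of opposite sign may add up to zero.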

2.10 A Photon Coincidence Experiment

Another experiment is illustrated in Figure 9. A group at the University of Rochester used the phenomenon of photon parametric down-conversion$^{13}$ to generate two photons simultaneously, in two separate beams; this is by no means trivial but, to keep the presentation focused, we do not discuss such details. Each beam is directed by a reflecting mirror to a beam splitter [76]. When reaching the beam splitter each photon has a fifty-fifty chance of being transmitted or reflected.

In the realm of classical physics there are two possible outcomes:

(i) each of the two photons has a different fate; one is reflected and the other one is transmitted by the beam splitter. Both photons end up either at detector D1 or at detector D2. When the photon coming from mirror U is reflected by the beam splitter and the one coming from mirror L is transmitted, both photons end up at detector D1. When the scenario is reversed, both photons end up at detector D2.

(ii) both photons have identical fate; they are either reflected or transmitted by the beam splitter. In this case detectors D1 and D2 get one photon each and signal a coincidence. If both photons are transmitted by the beam splitter, the one coming from mirror U ends up at detector D2 and vice-versa. When both photons are reflected by the beam splitter, the one coming from mirror U ends up at detector D1 and the one coming from mirror L ends up at detector D2.

13 Parametric down-conversion is an elementary quantum process of decay of a photon of frequency $\omega_p$ into two new photons of lower frequencies $\omega_1$ (signal photon) and $\omega_2$ (idler photon), so that $\omega_p = \omega_1 + \omega_2$; the emission of the two highly correlated (entangled) photons (the photons have the same polarization) is simultaneous. This decay process appears when photons from an incident laser beam interact with a nonlinear medium, for example a crystal of ammonium dihydrogen phosphate (ADP), and split into two lower frequency signal and idler photons.


Figure 9: A source of light generates two photons simultaneously. Each photon is reflected by a mirror (U or L) and directed to a beam splitter. When reaching the beam splitter each photon has a fifty-fifty chance of being transmitted or reflected. Two detectors, D1 and D2, are used to determine the outcome of the experiment. Both photons may reach the same detector - when one of the photons was reflected and the other one transmitted, or each detector gets one photon (we observe a coincidence) - when both photons are either reflected or transmitted. Surprise, a coincidence never occurs!

Well, to our surprise the second outcome, the coincidence, is very hard to observe! But now we can explain why: the beam splitter is characterized by the four numbers giving the probability amplitudes for the two photons to be transmitted and reflected, $((p_T, p_R)_1, (p_T, p_R)_2) = (+q, +q, +q, -q)$. Call TT the event that both photons are transmitted by the beam splitter and RR the event that both are reflected. Figure 10(a) shows that the probability amplitude for the TT event is $1/2$ and Figure 10(b) shows that the one for the RR event is $-1/2$. The two photons are indistinguishable after the beam splitter so the probability amplitude to have both transmitted or both reflected is the sum of the two probability amplitudes, so it is zero. We never observe a coincidence in a gedanken experiment$^{14}$. When only one photon is emitted there is an equal chance that it is detected either by D1 or by D2, as expected.

Milburn [76] calls this a quantum two-up experiment. He describes a game of chance, “Australia’s very own way to part a fool and his money”. Two fair and identical coins are tossed up in the air. Four outcomes are possible: two heads (HH) with probability $p_{HH} = 1/4$, two tails (TT) with $p_{TT} = 1/4$, or one head and one tail (HT or TH) with $p_{HT} = p_{TH} = 1/4$. The spinner aims to toss the HH combination three times before tossing either a TT or five consecutive odds (HT or TH). If (s)he succeeds (s)he wins, if not (s)he loses.

14 In reality, in a carefully designed experiment coincidences can be detected with a probability of about $10^{-4}$.


Figure 10: (a) The TT case: the probability amplitude is $(+q)(+q) = q^2$. (b) The RR case: the probability amplitude is $(+q)(-q) = -q^2$. There are two indistinguishable ways for a coincidence to occur, TT and RR, thus the probability amplitude of this event is zero. Here $q = 1/\sqrt{2}$.

2.11 A Three Beam Splitter Experiment

Let us now test our model augmented with the superposition probability rule in a slightly different setup, one involving three beam splitters, such as the one in Figure 11.

We want to show that a photon emitted by S1 has an equal probability of being sensed by detector D1 or by D2. Then, it is trivial to show that the same is true for a photon emitted by S2.

As before, we assume that we have two particles with different state vectors. To make things easier to understand and to demystify our notations, instead of using the basis vectors $|0\rangle$ and $|1\rangle$ we use $|t\rangle$ and $|r\rangle$, where “t” stands for “transmitted” and “r” for “reflected”. The two particles moving in direction1 and respectively in direction2 have the following state vectors

$$|\psi_1\rangle = q |t\rangle + q |r\rangle$$

and respectively

$$|\psi_2\rangle = q |t\rangle - q |r\rangle$$

with $q = 1/\sqrt{2}$ the probability amplitude for either transmission or reflection. We assume 50-50 beam splitters, thus the two probability amplitudes are equal in magnitude.

This time we use a three-tuple to denote the events at the three beam splitters. Thus RTR means that a photon is reflected (R) by BS1, transmitted (T) by BS2 and reflected (R) by BS3. A probability amplitude of $(+q)(+q)(-q)$ associated with the path RTR means that the probability amplitude of reflection at the first beam splitter is $(+q)$, the probability amplitude of transmission at the second beam splitter is $(+q)$, and finally, the probability amplitude of reflection at the third beam splitter is $(-q)$. The following table summarizes the eight possible paths taking a photon from S1 to D1 or to D2, the probability amplitude (PA) of transmission or reflection at each beam splitter, the total probability amplitude for each path and the probability of the photon to reach D1 or to reach D2.


Figure 11: A system of three beam splitters. In addition to the three beam splitters, BS1, BS2, and BS3, we have two sources S1 and S2, two detectors D1 and D2, two upper reflecting mirrors, URM1 and URM2, and two lower reflecting mirrors, LRM1 and LRM2. In this experimental setup a photon emitted by S1 in direction1 has an equal probability of being sensed by detector D1 or by D2. Similarly, a photon emitted by S2 in direction2 has an equal probability of being detected by D1 or by D2.

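| S ⟹ D | Path | Individual PAs | Path PA | Total PA | Probability |
|---|---|---|---|---|---|
| S1 ⟹ D1 | TTR | $(+q)(+q)(+q)$ | $+q^3$ | | |
| | RRR | $(+q)(+q)(+q)$ | $+q^3$ | | |
| | TRT | $(+q)(-q)(+q)$ | $-q^3$ | | |
| | RTT | $(+q)(+q)(+q)$ | $+q^3$ | $2q^3 = 1/\sqrt{2}$ | $p_{S_1 \to D_1} = 1/2$ |
| S1 ⟹ D2 | TTT | $(+q)(+q)(+q)$ | $+q^3$ | | |
| | RRT | $(+q)(+q)(+q)$ | $+q^3$ | | |
| | TRR | $(+q)(-q)(-q)$ | $+q^3$ | | |
| | RTR | $(+q)(+q)(-q)$ | $-q^3$ | $2q^3 = 1/\sqrt{2}$ | $p_{S_1 \to D_2} = 1/2$ |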

It is left as an exercise to the reader to prove that if we have an experimental setup with an odd number of beam splitters, $1, 3, 5, \ldots, (2k+1), \ldots$ we get the result presented in this section

$$p_{S_1 \to D_1} = p_{S_1 \to D_2} = 1/2, \qquad p_{S_2 \to D_1} = p_{S_2 \to D_2} = 1/2.$$

When we have an even number of beam splitters, $2, 4, 6, \ldots, (2k), \ldots$ we get

$$p_{S_1 \to D_1} = 1, \qquad p_{S_1 \to D_2} = 0, \qquad p_{S_2 \to D_1} = 0, \qquad p_{S_2 \to D_2} = 1.$$
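Before attempting the proof, the claim can be checked numerically by enumerating all paths through a chain of beam splitters. The sketch below rests on a bookkeeping convention that we infer from Figure 8 and from the table for three beam splitters, and that is therefore an assumption: a photon from source $S_k$ reaches the first beam splitter from direction $k$; a transmission switches the direction from which the photon reaches the next element, while a reflection preserves it; and a photon leaving the last beam splitter with direction label $d$ is registered by detector $D_d$. The function and variable names are hypothetical.

```python
import itertools
import math

Q = 1 / math.sqrt(2)

# Probability amplitudes of a beam splitter, indexed by (direction, event).
AMPLITUDE = {
    (1, "T"): +Q, (1, "R"): +Q,   # photon arriving from direction1
    (2, "T"): +Q, (2, "R"): -Q,   # photon arriving from direction2
}


def detection_probabilities(n_splitters, source=1):
    """Probabilities of reaching D1 and D2 for a photon emitted by S<source>."""
    amplitude_at = {1: 0.0, 2: 0.0}
    for path in itertools.product("TR", repeat=n_splitters):
        direction, amplitude = source, 1.0
        for event in path:
            amplitude *= AMPLITUDE[(direction, event)]
            if event == "T":
                direction = 3 - direction   # ASSUMED: transmission swaps 1 and 2
        # Paths ending at the same detector are indistinguishable,
        # so their amplitudes are added before squaring.
        amplitude_at[direction] += amplitude
    return {f"D{d}": a * a for d, a in amplitude_at.items()}


if __name__ == "__main__":
    for n in range(1, 7):
        for source in (1, 2):
            print(n, f"S{source}", detection_probabilities(n, source))
```

For $n = 2$ and $n = 3$ the enumeration reproduces the path amplitudes listed in Figure 8 and in the table above; for larger $n$ it only tests the stated odd/even pattern and is not a substitute for the proof.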

2.12 BB84, the Emergence of Quantum Cryptography

Before closing this introductory chapter we want to stress the idea that quantum computing and quantum information theory have very important applications, some of them feasible even today. The simple and easy to grasp application we have selected to illustrate the power of quantum information theory is in the area of computer security.

The explosive developments in the communication area, and in computer networks in particular, have affected virtually all aspects of human activity including, but not limited to, economic, defense, scientific, educational, and social activities. The Internet supports financial transactions, enables distance learning and collaborative research, and allows large companies to share databases of material properties and various design tools. Specialized networks support defense applications, and tele-medicine allows physicians to share medical knowledge. In all these examples confidential information, such as credit card numbers, personal records, proprietary code and data, medical records, and highly classified defense data, flows through different communication media.

To ensure confidentiality, data is often encrypted. The most reliable encryption techniques are based upon one-time pads, whereby the encryption key is used for one session only and then discarded. Thus, there exists the need for reliable and effective methods for the distribution of the encryption keys. The problem rests on the physical difficulty of detecting the presence of an intruder when communicating through a classical communication channel. To date, secure and reliable methods for cryptographic key distribution have largely eluded the cryptographic community in spite of considerable research effort and ingenuity.

Could quantum information help solve this problem once and for all? The answer is provided by an idea published by Bennett and Brassard in 1984, thus the name BB84 [14]. Let us consider communication using photons prepared in different states of polarization. Figure 12(a) shows photons with vertical/horizontal (VH), and diagonal (DG) polarization, at 45° and 135°. We use a calcite crystal to separate photons of different polarization, as shown in Figure 12(b).

From the previous experiments we already know that the vertical-horizontal and diagonal (45° − 135°) states of the photon polarization are pairs of orthogonal states. In other words, if the measuring system is made to distinguish states in one basis (in our case the crystal is oriented to separate the photons with vertical from those with horizontal polarization) it cannot reliably distinguish the states corresponding to the other basis (in our case the crystal cannot distinguish between photons with 45° polarization and those with 135° polarization). We also know that any measurement alters the quantum state. Thus, if we position the crystal to separate photons with vertical polarization from those with horizontal polarization and a photon with diagonal polarization arrives, its state will be randomly altered by the crystal.

This preliminary discussion gives us some hint that Alice and Bob can communicate using polarized photons and detect if Eve, the perennial eavesdropper, has attempted to intercept the exchange of photons carrying information. In our setup, in addition to the quantum communication channel, Alice and Bob are also connected via a classical channel, say a telephone line, see Figure 12(c). Eve is able to intercept all communications on both the quantum and the classical channel. The quantum key distribution protocol is fully described in Section 8.8; here we only present an outline of the protocol.

[Figure 12, panel labels. (a) Vertical/Horizontal (VH): vertical, horizontal; Diagonal (DG): 45 deg, 135 deg. (c) Source of polarized photons, photon separation system, classical communication channel, quantum communication channel, classical wiretap, quantum wiretap, Alice, Bob, Eve.]

Figure 12: The quantum key distribution algorithm of Bennett and Brassard. (a) The photons prepared by Alice may have vertical/horizontal (VH) or diagonal (DG) polarization. The photons with vertical/horizontal (VH) polarization may be used to transmit binary information as follows: a photon with vertical polarization may transmit a 1 while one with horizontal polarization may transmit a 0. Similarly, those with diagonal (DG) polarization may transmit binary information, 1 encoded as a photon with 45° polarization, and 0 encoded as a photon with 135° polarization. (b) Bob uses a calcite crystal to separate photons with different polarization. Shown here is the case when the crystal is set up to separate vertically polarized photons from the horizontally polarized ones. To perform a measurement in the DG basis the crystal is oriented accordingly. (c) Alice and Bob are connected via two communication channels, a quantum and a classical one. Eve eavesdrops on both.

First, let us assume that Eve is very persistent and unsophisticated and intercepts all the photons sent by Alice to Bob, as well as their exchanges over the voice line. Alice and Bob communicate using the following algorithm:

(i) Alice selects n, the approximate length of the encryption key. Alice generates two random strings a and b, each of length (4 + δ)n. By choosing δ sufficiently large Alice and Bob can ensure that the number of bits kept is close to 2n with a very high probability. A subset of the bits in string a will be used as the encryption key and the bits in string b will be used by Alice to select the basis (VH) or (DG) for each photon sent to Bob.

(ii) Alice encodes the binary information in string a based upon the corresponding values of the bits in string b. For example, if the i-th bit of string b is 1 then Alice selects VH polarization, and if the i-th bit is 0 she selects DG polarization. If VH is selected, then a 1 in the i-th position of string a is sent as a photon with vertical polarization and a 0 as a photon with horizontal polarization; if DG is selected a 1 is sent as a photon with 45° polarization and a 0 as a photon with 135° polarization. Both Alice and Bob use the same encoding convention for each of the bases.

(iii) In turn, Bob randomly picks (4 + δ)n bits to form a string b′. He uses one of the two bases for the measurement of each incoming photon, which carries one bit of string a, based upon the corresponding value of the bit in string b′. For example, following the same convention as Alice, a 1 in the i-th position of b′ implies that the i-th photon is measured in the VH basis, while a 0 requires that the photon is measured in the DG basis. As a result of this measurement Bob constructs the string a′.

(iv) Bob uses the classical communication channel to request the string b and Alice responds on the same channel with b. Then Bob sends Alice string b′ on the classical channel.

(v) Alice and Bob keep only those bits in the set {a, a′} for which the corresponding bits in the set {b, b′} are equal. Let us assume that Alice and Bob keep only 2n bits.

(vi) Alice and Bob perform several tests to determine the level of noise and eavesdropping on the channel. The set of 2n bits is split into two sub-sets of n bits each. One sub-set will be the check bits used to estimate the level of noise and eavesdropping, and the other consists of the data bits used for the quantum key. Alice selects n check bits at random and sends the positions and values of the selected bits over the classical channel to Bob. Then Alice and Bob compare the values of the check bits. If more than, say, t bits disagree then they abort and re-try the protocol.

For example, if the sequence sent by Alice and the one used by Bob is

Photon #      1    2    3    4    5    6    7    8    . . .
Alice >>>     VH   VH   DG   VH   VH   DG   DG   VH   . . .
Bob <<<       VH   DG   DG   VH   DG   VH   DG   DG   . . .
Match (Y/N)   Y    N    Y    Y    N    N    Y    N    . . .

then Alice tells Bob that photons numbered 1, 3, 4, 7, . . . were sent and received using the same basis. Alice consults her log and finds out which polarization was selected for each of these photons. In turn, Bob examines the results of his measurements performed on these photons and finds out the polarization of each of them. The measurements were performed in the same basis the photons were prepared in, thus the results of Alice’s and Bob’s investigations should be the same. For example, if Alice uses vertical polarization for photon #1, 135° polarization for photon #3, horizontal polarization for photon #4, and 45° polarization for photon #7, then the binary string transmitted by Alice, and received correctly by Bob, is 1001 . . .

In the absence of eavesdropping, roughly 50% of the photons are measured correctly, namely those for which Bob happens to select the basis used by Alice. First of all, Alice selects each of the two bases randomly with a probability of p^{Alice}_{VH} = p^{Alice}_{DG} = 1/2. Bob selects the basis independently with probabilities p^{Bob}_{VH} = p^{Bob}_{DG} = 1/2. Four events are possible when selecting the basis, (VH)(VH), (VH)(DG), (DG)(VH), and (DG)(DG); the encoding of events reflects the selection of Alice and that of Bob, thus (VH)(DG) means that Alice has selected VH and Bob DG. The probabilities of the four events are

p_{(VH)(VH)} = p^{Alice}_{VH} × p^{Bob}_{VH} = 1/4
p_{(DG)(DG)} = p^{Alice}_{DG} × p^{Bob}_{DG} = 1/4
p_{(VH)(DG)} = p^{Alice}_{VH} × p^{Bob}_{DG} = 1/4
p_{(DG)(VH)} = p^{Alice}_{DG} × p^{Bob}_{VH} = 1/4

It follows that the probabilities of Alice and Bob selecting the same basis and of selecting different bases are

p_{agreement} = p_{(VH)(VH)} + p_{(DG)(DG)} = 1/2
p_{disagreement} = p_{(VH)(DG)} + p_{(DG)(VH)} = 1/2

When Eve measures every single photon, she alters the state of all the photons for which she had used the wrong basis. Arguments similar to those presented above show that the polarization of about 50% of the photons sent by Alice is altered by Eve. Under these conditions Bob and Alice may agree on only half of the photons whose state was not altered by Eve, and this means on some 25% of the total number of photons. This is observed only after Alice and Bob perform the test (vi) to determine the level of noise and intrusion.
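The fractions quoted above can be checked by enumerating all equally likely choices. The sketch below assumes a simple intercept-and-resend model that the text does not spell out: Eve picks her basis at random, measures, and forwards a photon prepared in her own basis with the bit she obtained; a measurement in the wrong basis yields 0 or 1 with probability 1/2. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product

BASES = ("VH", "DG")
HALF = Fraction(1, 2)


def measure(photon_basis, photon_bit, measuring_basis):
    """Return {outcome_bit: probability} for the measurement of one photon."""
    if photon_basis == measuring_basis:
        return {photon_bit: Fraction(1)}   # same basis: the bit is read reliably
    return {0: HALF, 1: HALF}              # other basis: the outcome is random


def intercept_resend_statistics():
    same_basis = Fraction(0)          # Alice and Bob selected the same basis
    same_and_untouched = Fraction(0)  # ... and Eve used that basis as well
    same_and_error = Fraction(0)      # ... but Bob's bit differs from Alice's
    for bit, a_basis, e_basis, b_basis in product((0, 1), BASES, BASES, BASES):
        weight = Fraction(1, 16)      # uniform choice of the bit and of the three bases
        if a_basis != b_basis:
            continue
        same_basis += weight
        if e_basis == a_basis:
            same_and_untouched += weight
        # Assumed model: Eve measures, then resends a photon prepared in her basis.
        for e_bit, p_e in measure(a_basis, bit, e_basis).items():
            for b_bit, p_b in measure(e_basis, e_bit, b_basis).items():
                if b_bit != bit:
                    same_and_error += weight * p_e * p_b
    return same_basis, same_and_untouched, same_and_error / same_basis


same, untouched, error_rate = intercept_resend_statistics()
print(same, untouched, error_rate)
```

The first two values reproduce the 50% basis agreement and the 25% of all photons that are both measured in Alice's basis by Bob and left unaltered by Eve. The third value, the fraction of kept bits on which Alice and Bob disagree, is not stated in the text; under the assumed model it is also 1/4, which is what test (vi) would reveal.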

Eve could be less intrusive and measure only a small fraction of the photons passing through the quantum communication channel. This lowers the probability of her being detected by Alice and Bob, but it also lowers her chances of getting enough information about the encryption key. Bob and Alice could preempt this by including a number of parity check bits in their message. For example, Alice may add to each group of, say, 15 bits a parity check bit to ensure that each group of 16 bits contains an even number of 1’s. In this case even a subtle intrusion is unlikely to go unnoticed, since any odd number of altered bits in a group violates the parity check.
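The parity scheme mentioned above is easy to illustrate. The sketch below is an assumption-laden toy, not part of the protocol description: the 15 data bits are made up, and the helper names are hypothetical.

```python
def add_even_parity(bits):
    """Append one bit so that the whole group holds an even number of 1's."""
    return bits + [sum(bits) % 2]


def parity_ok(group):
    """True when the group contains an even number of 1's."""
    return sum(group) % 2 == 0


data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0]  # 15 hypothetical data bits
group = add_even_parity(data)                          # 16 bits sent by Alice

one_flip = group.copy()
one_flip[3] ^= 1          # a single bit altered in transit

two_flips = one_flip.copy()
two_flips[9] ^= 1         # a second bit altered in the same group

print(parity_ok(group), parity_ok(one_flip), parity_ok(two_flips))
```

A single parity bit flags any odd number of altered bits in a group, while an even number of alterations within the same group leaves the parity unchanged; this is one reason to combine the parity check with the comparison of check bits in test (vi).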

Many research teams work diligently on practical applications of quantum cryptography. In June 2003, Andrew Shields and colleagues revealed a record-breaking cryptographic link over a distance slightly larger than 100 km [126]. In view of the facts discussed in this section it is not surprising that quantum cryptography is the first example of a quantum information theory application ready for commercialization.

2.13 A Qubit of History

Quantum computing is the result of a marriage between two great discoveries of the twentieth century, quantum mechanics and the general-purpose computer.

It all started more than one hundred years ago when in 1900 Max Planck proposed an amazing solution to a puzzling problem, the so-called ultraviolet catastrophe. Contrary to experimental evidence and even to common sense, classical physics predicted that the intensity of radiation emitted by a hot body increases without any limit as the frequency of radiation increases. A hot body in equilibrium would radiate an infinite amount of energy, and, since this is a physical impossibility, it followed that thermal equilibrium was impossible and this was absurd. Planck calculated the so-called blackbody radiation spectrum assuming that the body emitted energy in discrete packets called quanta and his calculations agreed with the experiment and avoided the contradiction of the classical theory. Shortly afterwards, in 1905, Albert Einstein used Planck’s quantum hypothesis to explain what happens when light shines on a negatively charged metal plate, the phenomenon known as the photoelectric effect. Then, in 1913, Niels Bohr proposed a quantum model of the atom. In 1925 Werner Heisenberg developed an astoundingly new formulation of what Max Born would be the first to call “quantum mechanics”; Heisenberg’s work was followed shortly, in 1926, by the introduction of a wave equation by Erwin Schrödinger. During the years 1925 and 1926 Born published, with Heisenberg and Jordan, investigations on the principles of quantum mechanics (sometimes called matrix mechanics) and soon after this, his own studies on the statistical interpretation of quantum mechanics. Six of them received the Nobel prize for physics for their revolutionary discoveries: Planck in 1918, Einstein in 1921, Bohr in 1922, Heisenberg in 1932, Schrödinger in 1933, and Max Born in 1954. The Nobel prize lectures of these great scientists are fascinating readings and can be found at the Nobel prize Web site, http://www.nobel.se/physics.

In 1935 an eccentric young fellow (don) of King’s College at Cambridge by the name of Alan Turing dreamed up an imaginary typewriter-like contraption called the Universal Turing Machine; Turing conceived the principle of the modern computer. He described an abstract digital computing machine consisting of a limitless memory and a scanner that moves back and forth through the memory, symbol by symbol, reading what it finds and writing further symbols. The actions of the scanner are dictated by a program of instructions that is stored in the memory in the form of symbols. This is Turing’s stored-program concept.

The Universal Turing Machine embodies the essential principle of a computer: a single machine which can be turned to any well-defined task by being supplied with the appropriate program. Turing envisioned a machine that could do anything with a few simple instructions, an idea we take for granted today. He believed that an algorithm could be developed for almost any problem and the most difficult task was to break down a problem into a sequence of simple instructions that the machine could perform.

Turing had reached the conclusion that “every function which can be regarded as computable can be computed by an universal computing machine” at about the same time as the work of the American logician Alonzo Church was published. Turing’s paper, “On Computable Numbers with an Application to the Entscheidungsproblem”¹⁵, referred to Church’s work, and was published in August 1936 [118]. In his answer to this question, Turing made a triple correspondence between logical instructions, the action of the mind, and a machine which could in principle be embodied in a practical physical form. He was the first to use as “definite method” a concept which is called an algorithm in modern language.

Almost ten years later, in the fall of 1945, the world’s first general purpose computer, the brainchild of J. Presper Eckert and John Mauchly, became operational after several years of development at the Moore School of the University of Pennsylvania. The ENIAC (Electronic Numerical Integrator and Computer) could perform 5,000 addition cycles a second and do the work of 50,000 people computing by hand; it could calculate the trajectory of a projectile in 30 seconds, instead of the 20 hours necessary with a desk calculator. The ENIAC required 174 kW of power for the 17,468 vacuum tubes, 70,000 resistors, and 10,000 capacitors. Even when the computer was not operational the cost of electricity to keep the filaments of the vacuum tubes heated and the fans running to dissipate the heat was about $650 per hour [73].

Arguably, in the mid 1940’s, computer simulation of physical systems and phenomena was the motivating force for the development of general-purpose computers. It is not a coincidence that general-purpose computers are based upon the so-called von Neumann architecture, named after the renowned mathematician and physicist. In the early 1940’s scientists associated with the Manhattan Project at Los Alamos, including John von Neumann and Richard Feynman, were feverishly developing a fission device. Sophisticated calculations were necessary and the hundreds of “human calculators” employed to solve large numerical problems were simply not sufficient. The Manhattan Project first resorted to the use of tabulating machines and calculators and then got access to the ENIAC. Nicholas Metropolis and Stanley Frankel, two other physicists from the Manhattan Project, had the honor of running the first test programs once the ENIAC became operational; the problem they solved remains classified even today. Later that year, Stanislaw Ulam, who doubted Edward Teller’s design of a thermonuclear device, used the ENIAC to compute the results of a thermonuclear reaction at increments of one ten-millionth of a second.

¹⁵ German term for the “decision problem”, posed in fact by Hilbert: “Could there exist, at least in principle, a definite method or process, by which it could be decided whether any mathematical assertion was provable?”

Some question the paternity of the ideas presented in the 1946 report, co-authored by John von Neumann [29], which proposed the development of a new computer called the EDVAC (Electronic Discrete Variable Automatic Computer). This report is probably the reason why we talk today about von Neumann architecture rather than Eckert-Mauchly architecture, but this is beside the point for our topic. We only want to stress that numerical simulation became a new investigative tool in the mid twentieth century. Numerical simulation complements the two traditional exploratory methods of science: experimental work and theoretical modelling.

In 1948 Claude Shannon published “A Mathematical Theory of Communication” in the Bell System Technical Journal. This paper founded a new discipline, information theory, and proposed a linear model of a communications system. Shannon considered a source of information which generates words composed of a finite number of symbols transmitted through a channel; if x_n is the n-th symbol produced by the source, then x_n is a stationary stochastic process [99].

The first commercial computer, UNIVAC I, capable of performing 1,900 additions/second, was introduced in 1951; the first supercomputer, the CDC 6600, designed by Seymour Cray, was announced in 1963; IBM launched System/360 in 1964; a year later DEC unveiled the first commercial minicomputer, the PDP 8, capable of performing some 330,000 additions/second. A very large percentage of the cycles of all these systems were devoted to numerical simulation. In 1977 the first personal computer, the Apple II, was marketed and the IBM PC, rated at about 240,000 additions/second, was introduced 4 years later, in 1981.

Today’s computers are very different from the ENIAC. In 2001 a high-end PC had a 1.5 GHz CPU and 512 MB of memory. What about the future?¹⁶ Changes in the VLSI technologies and computer architecture are projected to lead to a 10-fold increase in computational capabilities over the next 5 years and a 100-fold increase over the next 10 years. Towards the end of 2003 the same PC is projected to have a 2.3 GHz processor and a 1–2 GB memory. By 2011 the CPU speed is projected to be 11.5 GHz and the main memory to increase to 16 GB.

By 2003 the minimum feature size has reached 0.09 µm and it is expected [127] to decrease to 0.05 µm in 2011 (see Table 1). As a result, during this period the density of memory bits will increase 64-fold and the cost per memory bit will decrease 5-fold. It is projected that during this period the density of transistors will increase 7-fold, the density of bits in logic circuits will increase 15-fold, and the cost per transistor will decrease 20-fold.

Table 1: Projected evolution of VLSI technology.

Year                                                     2003   2005   2008   2011
Minimum feature size (µm, 10^−6 meter)                   0.10   0.08   0.07   0.05
Memory; bits per chip (billions, 10^9)                   1      2      6      16
Logic; transistors per cm^2 (millions, 10^6)             24     44     109    269
Microprocessor; transistors per chip (millions, 10^6)    95.2   190    539    1523

While the solid-state technology continued to improve at a very fast pace, making our computers faster and cheaper, theoretical physicists became restless and started asking questions about the physical limitations of our computational models. They reasoned that if there is a minimum amount of power dissipated for the execution of a logical step, then the faster computers become, the more power is needed for their operation. The amount of power dissipated increases as the computers become faster and faster and it becomes harder and harder to deal with the heat generated during this process. This motivated Rolf Landauer to investigate the heat generation during the computational process starting from the basic laws of thermodynamics. His results were published in 1961 [70].

¹⁶ Predicting the future is a business involving considerable risks and potential ridicule. Cases in point: in 1943, discussing the future of the computer industry, Thomas J. Watson, the chairman of IBM corporation, said: “I believe that there is a market for maybe five computers.”; in 1949 a widely circulated popular science magazine speculated: “Computers in the future may weigh no more than 1.5 tons.”

Following in Landauer’s footsteps, in 1973 Charles Bennett studied the logical reversibility of computations. This concept is discussed in Chapter 7. In layman’s terms logical reversibility means that once a computation is finished, one can retrace every step and reconstruct the data used as input of every step. Bennett argued that Turing machines and any other general-purpose computing automata are logically irreversible [12]. A device is said to be irreversible if its transfer function does not have a single-valued inverse. Bennett developed a theoretical framework proving that reversible general-purpose computing automata can be built and that their construction makes plausible the possibility of building thermodynamically reversible computers. There is no positive lower bound on the energy dissipated per logical step by a thermodynamically reversible computer, thus, in principle, such a device could compute dissipating little, if any, energy at all.
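The definition of irreversibility given above, a transfer function without a single-valued inverse, can be tested by brute force for small gates. The sketch below is an illustration only; it compares the classical AND gate with the Toffoli gate named in this section, written here in its usual form, where the third bit is flipped only when the first two bits are both 1. The helper name `has_single_valued_inverse` is hypothetical.

```python
from itertools import product


def has_single_valued_inverse(gate, n_inputs):
    """True when no two distinct inputs are mapped to the same output."""
    outputs = [gate(*bits) for bits in product((0, 1), repeat=n_inputs)]
    return len(set(outputs)) == len(outputs)


def and_gate(a, b):
    # Two input bits, one output bit: information about the inputs is lost.
    return (a & b,)


def toffoli_gate(a, b, c):
    # The target c is flipped only when both controls a and b are 1.
    return (a, b, c ^ (a & b))


print(has_single_valued_inverse(and_gate, 2))
print(has_single_valued_inverse(toffoli_gate, 3))
```

Three of the four inputs of the AND gate produce the same output, so its inputs cannot be reconstructed; the Toffoli gate permutes the eight three-bit inputs and applying it twice returns the original bits.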

The widespread interest in quantum computing was probably generated by the contributions of Richard Feynman. In 1981 he gave a talk with the title “Simulating Physics with Computers” at a meeting held at MIT [28]. Feynman argued that in traditional numerical simulations such as weather forecasting or aerodynamic calculations, computers model physical reality only approximately. He advanced the idea that physics was computational and that a quantum computer could do an exact simulation of a physical system, even of a quantum system. He identified quantum mechanics as the most important ingredient for constructing computational models of physics.

Feynman speculated that in many instances computation can be done more efficiently by using quantum effects [51]. His ideas were inspired by previous work of Bennett [12, 13] and Benioff [10]. Starting from basic principles of thermodynamics and quantum mechanics, Feynman suggested that problems for which polynomial time algorithms do not exist could be solved; computations for which polynomial algorithms exist could be sped up considerably and even made reversible.

Ed Fredkin, Tommaso Toffoli, and Norman Margolus, who were associated with the Laboratory for Computer Science at MIT, contributed to the field of quantum computing. Fredkin and Toffoli gates are some of the most common building blocks of quantum circuits.

In 1985 David Deutsch reinterpreted the Church-Turing conjecture as “every finitely realizable physical system can be perfectly simulated by a universal model computing machine operating by finite means” and conceived a universal quantum computer [39]. In 1993 Charles Bennett, Gilles Brassard, Claude Crépeau, R. Jozsa, A. Peres, and W. K. Wootters discovered quantum teleportation.

In 1994 Peter Shor developed a clever algorithm for factoring large numbers [100] and generated a wave of excitement for the newly founded discipline of quantum computing. A year later Robert Calderbank, Peter Shor, and Andrew Steane addressed the problem of reliability of quantum computing and communication and developed quantum codes. A summary of the milestones in quantum physics, quantum computing, and quantum information theory is presented in Table 2.

Table 2: Milestones in quantum physics, quantum computing and information theory.

≈ 1800  Thomas Young (1773-1829) conducts the “double-slit experiment”.
1900    Max Planck presents the black body radiation theory; the quantum theory is born.
1905    Albert Einstein develops the theory of the photoelectric effect.
1911    Ernest Rutherford develops the planetary model of the atom.
1913    Niels Bohr develops the quantum model of the hydrogen atom.
1923    Louis de Broglie relates the momentum p of a particle with the wavelength λ of the wave associated with it, p = h/λ.
1925    Werner Heisenberg formulates the matrix quantum mechanics.
1925    Max Born and Pascual Jordan use infinite matrices to represent basic physical quantities and develop a complete formalism for quantum mechanics.
1926    Erwin Schrödinger proposes the equation for the dynamics of the wave function.
1926    Erwin Schrödinger and Paul Dirac show the equivalence of Heisenberg’s matrix formulation and Dirac’s algebraic one with Schrödinger’s wave function.
1926    Paul Dirac and, independently, Max Born, Werner Heisenberg, and Pascual Jordan obtain a complete formulation of quantum dynamics.
1926    John von Neumann introduces Hilbert spaces to quantum mechanics.
1927    Werner Heisenberg formulates the uncertainty principle.
1927    Davisson and Germer observe the diffraction of electrons by a crystal and confirm the wave character of an electron and thus, de Broglie’s theory.
1928    Paul Dirac develops the relativistic quantum mechanics.
1932    John von Neumann publishes Mathematical Foundations of Quantum Mechanics.
1936    Alan Turing dreams up the Universal Turing Machine, UTM.
1936    Alonzo Church publishes a paper asserting that “every function which can be regarded as computable can be computed by a universal computing machine”.
1945    ENIAC, the world’s first general purpose computer, the brainchild of J. Presper Eckert and John Mauchly, becomes operational.
1946    John von Neumann co-authors a report outlining the principles of the stored-program computer, the von Neumann architecture. Proposes EDVAC.
1946    John Wheeler makes the assumption that the two photons produced through electron-positron annihilation have opposite polarizations.
1948    Claude Shannon publishes A Mathematical Theory of Communication.
1949    Madame Chien-Shiung Wu and Irving Shaknov confirm Wheeler’s hypothesis and produce for the first time a pair of entangled photons.
1951    UNIVAC I, the first commercial computer, is delivered.
1961    Rolf Landauer decrees that computation is physical and studies heat generation.
1973    Charles Bennett studies the logical reversibility of computations.
1981    Richard Feynman suggests that physical systems including quantum systems can be simulated exactly with quantum computers.
1982    Paul Benioff develops quantum mechanical models of Turing machines.
1984    Charles Bennett and Gilles Brassard introduce quantum cryptography.
1985    David Deutsch reinterprets the Church-Turing conjecture.
1993    Bennett, Brassard, Crépeau, Jozsa, Peres, Wootters discover quantum teleportation.
1994    Peter Shor develops a clever algorithm for factoring large numbers.
1996    Robert Calderbank, Peter Shor, and Andrew Steane develop quantum codes.

2.14 Summary and Further Readings

In this section we first discuss the physical limitations of classical electronic circuits used in modern computer systems. Heat dissipation restricts our ability to compute faster and cheaper, while quantum effects will ultimately limit the reliability of future computing systems built with current technologies. We conclude that we need fundamentally new ideas for building systems able to store and process information with higher speed, and at a lower cost. Systems using quantum particles, governed by the laws of quantum mechanics, seem ideally suited to store and process large volumes of information using little, if any, energy at all.

Next, we present the milestones in the evolution of quantum ideas and the unprecedented developments in quantum computing and quantum information theory we have witnessed during the past two decades. We introduce the reader to the world of quantum effects by way of several experiments involving beams of photons, or light in layman’s terms. These experiments are meant to put us in the frame of mind needed for understanding quantum information. First, we present a simple experiment revealing the granular nature of light. Next, we discuss a simple model to address the non-deterministic effects. Then, we challenge this model to explain the results of successive measurements performed upon a beam of quantum particles using different bases. We augment the model with Feynman’s probability rule for events that may occur on alternative paths and discuss several experiments with results consistent with this rule.

The sometimes strange and nonintuitive outcomes of such experiments are relatively easy to grasp based upon elementary probability arguments. We conclude the section with an application of quantum information theory to computer security, namely the quantum key distribution.

From the experiments discussed in this section we learn that a photon can be used to store and to transmit information. Measuring the polarization of the photon, to retrieve the classical information (“0” or “1”) from the quantum information reflected by its quantum state, produces a reliable outcome if and only if we use for the measurement the same basis used to encode the information. Moreover, the measurement process alters the state of a quantum system; the state vector is projected on one of the two basis states. This effect is exploited by the quantum key distribution algorithm of Bennett and Brassard to detect the presence of an intruder.

Roland Omnès points out in [82]: “When a theory is so strange that it must be interpreted, whether it be relativity theory or quantum mechanics, the aim of this interpretation is to reconcile the fundamental, outrageously abstract concepts with plain empirism.” This is precisely the reason why in this book we discuss several experiments revealing quantum effects, and, whenever possible, we provide intuitive explanations and present analogies, even when they fail to convey the full complexity of the physical phenomena.

Throughout this section we avoid the sometimes difficult mathematical apparatus, the trademark of quantum mechanics. The only mathematical concepts used are the vectors and the tensor product. The readers unfamiliar with these concepts are encouraged to refresh their knowledge of linear algebra and consult a text such as “Lectures on Linear Algebra” by Gelfand [53].


A fair number of “popular science” books attempt to provide intuitive explanations of some of the phenomena of interest to quantum computing and quantum information, with minimal references to mathematical equations. The list should start with Richard Feynman’s book, “QED - The Strange Theory of Light and Matter” [50]. David Deutsch’s book “The Fabric of Reality” should also be high on the reading list of the curious reader [41]. A book by Aczel provides a very vivid encounter with some of the great physicists of last century and an entertaining presentation of entanglement, one of the most puzzling and nonintuitive physical phenomena [2]. Closer to the subject of quantum computing are Brown’s book “The Quest for the Quantum Computer” [28] and Milburn’s “The Feynman Processor” [76] and “Schrödinger’s Machines” [77].

Feynman’s “Lecture Notes on Computation” should be required reading for anyone interested in the subject of quantum computing and quantum information theory [51]. We found the discussion of reversibility, the presentation of quantum gates, and information theory particularly impressive. The paper by Charles Bennett “Quantum Information and Computation” provides a very insightful discussion of some of the critical aspects of the subject [16]. A more advanced and rigorous, yet comprehensible introduction to quantum computing is the paper of Rieffel and Polak [93].

The most authoritative text to date is Nielsen and Chuang’s book “Quantum Computation and Quantum Information” [80]. Several lecture notes are available on the Web. Among them, the ones of John Preskill stand out [92].

2.15 Exercises and Problems

(1) Read the Nobel lectures of Max Planck, Albert Einstein, Niels Bohr, Werner Heisen-berg, Erwin Schrodinger, Max Born, and Richard Feynman and discuss how each of themviewed his contributions to science as reflected in his lecture. These lectures can be found athttp://www.nobel.se/physics/laureates.

(2) How did the political turmoil in Europe of 1930’s and 1940’s influence the human relationsbetween the great physicists Max Planck, Albert Einstein, Niels Bohr, Werner Heisenberg,and John von Neumann, as well as their lives (according to [2]).

(3) Present and analyze critically the main ideas and concepts related to “positivism”, “reductionism”, and “holism”, as approaches to explain physical phenomena, see [41] pages 4-28.

(4) What is the justification of the statement in [41] page 197, “Complexity theory has not yet been sufficiently well integrated with physics to give many quantitative answers”?

(5) Analyze critically Fredkin’s idea that the universe is a gigantic digital computer processing information [52]. Note that the physicist Philip Morrison remarked that “the only reason Fredkin thought that the universe was a computer was because he was a computer scientist, in the same way that if he had been a cheese maker, he would have claimed that it was made out of cheese” [28] page 59.

(6) Discuss the benefits of computer simulation as an alternative method of science and engineering, complementing the traditional approaches, theoretical modelling and experiments. Give examples from your own area and outline the benefits of computer simulation for your specific examples.

(7) Relate the number of states of a system subject to computer simulation to the execution time and the space complexity of the algorithms used for simulation.

(8) Is it feasible to simulate a quantum system using a classical computer? Justify your answer.

(9) Consider the multiple beam splitter experiment discussed in Section 2.4. Assuming that a beam splitter tosses a fair ternary coin to decide if a photon should be transmitted, reflected, or absorbed, what are the probabilities for the five detectors in Figure 2(b) to detect a photon? If the experiment is carried out 100,000 times how many counts will each detector register?

(10) Draw the four diagrams revealing the path of a photon emitted by S2 in Section 2.9 and calculate the probability of a photon from S2 to reach D1 and D2.

(11) Consider the experiment discussed in Section 2.9 and presented in Figures 7 and 8. Assume that a beam of electrons crosses the paths of the photons coming from sources S1 and S2 before they reach the beam splitter BS1. How will the outcome of this experiment be affected?

(12) Write a Java program to simulate the system discussed in Section 2.9 and presented in Figures 7 and 8.

(13) Write a Java program to simulate the BB84 protocol. You have to simulate

(a) a source of polarized photons which randomly picks either a vertical/horizontal (VH) basis and then a vertical or a horizontal polarization, or a diagonal (DG) basis and then a 45° or 135° polarization;

(b) the photon separation system which randomly picks a vertical/horizontal (VH) or a diagonal (DG) orientation for the crystal and then measures without any error an incoming photon if its basis is identical with the crystal orientation;

(c) a quantum communication channel transporting a photon from the sender to the receiver with the possible eavesdropping component; and

(d) a classical noiseless communication channel allowing the sender and the receiver to exchange binary information.

(14) Consider a setup like the one in Sections 2.9 and 2.11. Show that if we have an experimental setup with an odd number of beam splitters, 1, 3, 5, . . . (2k + 1) . . ., we get the result presented in this section

$$p_{S1 \to D1} = p_{S1 \to D2} = 1/2, \qquad p_{S2 \to D1} = p_{S2 \to D2} = 1/2.$$

When we have an even number of beam splitters, 2, 4, 6, . . . (2k) . . . we get

$$p_{S1 \to D1} = 1, \quad p_{S1 \to D2} = 0, \quad p_{S2 \to D1} = 0, \quad p_{S2 \to D2} = 1.$$

(15) Consider an experimental setup like the one in Section 2.9 and assume that the state vectors of the photons coming from direction1 and direction2 respectively are

$$|\psi_1\rangle = +q\,|0\rangle + q\,|1\rangle$$

and

$$|\psi_2\rangle = -q\,|0\rangle + q\,|1\rangle.$$

Show that

$$p_{S1 \to D1} = 0, \quad p_{S1 \to D2} = 1, \quad p_{S2 \to D1} = 1, \quad p_{S2 \to D2} = 0.$$

3 Quantum Mechanics, a Mathematical Model of the Physical World

The material presented in this section represents a minimal set of concepts necessary to understand the physical phenomena the new discipline of quantum computing is based upon. A number of experiments, some of them discussed in the previous chapter, have shown that the classical theory cannot explain some of the phenomena involving light, as well as atomic and subatomic particles. During the first decades of the twentieth century it became obvious that we needed a more refined model of the physical world. Quantum mechanics represents a mathematical model of the physical world able to explain phenomena, such as interference, which classical mechanics fails to explain. The predictions of quantum mechanics have been confirmed by a plethora of experimental evidence.

We first introduce the basic definitions and concepts necessary to understand the mathematical apparatus of quantum mechanics, starting with basic concepts from linear algebra. We present vector spaces, Hermitian operators, Hilbert spaces, as well as inner products, tensor products, and outer products of vectors in a Hilbert space.

Then we present some of the milestones in the evolution of quantum ideas and briefly discuss two of the pillars of quantum mechanics, Schrodinger’s equation and Heisenberg’s uncertainty principle. We present in some depth the double slit experiments imagined by Young to demonstrate the interference phenomena associated with the wave-like behavior of light and the Stern-Gerlach experiments revealing the spin.

The postulates of quantum mechanics are then introduced gently. We discuss the representation of quantum states and observables, outline the properties of quantum operators and the spectral decomposition of an operator, and conclude with a section devoted to the measurement of observables.

3.1 Vector Spaces

To define the concept of a vector space we first need to introduce two basic algebraic structures, the group and the field. A group G is a set equipped with one binary operation “·”, called multiplication, which satisfies three conditions

(i) Associative law: ∀(a, b, c) ∈ G, a · (b · c) = (a · b) · c.

(ii) Identity element: there is an element e ∈ G such that a · e = e · a = a, ∀a ∈ G.

(iii) Inverse element: ∀a ∈ G, ∃a⁻¹ ∈ G such that a · a⁻¹ = a⁻¹ · a = e.

A group G whose operation satisfies the commutative law (i.e. a · b = b · a) is a commutative, or Abelian, group.

A field is a set F equipped with two binary operations, addition and multiplication, such that

(i) under addition, F is an Abelian group with identity (or neutral) element 0: 0 + a = a, ∀a ∈ F.

(ii) under multiplication, the nonzero elements form an Abelian group with neutral element 1: 1 · a = a, ∀a ∈ F, and 0 · a = 0, ∀a ∈ F.

(iii) the distributive law holds: a · (b + c) = a · b + a · c.

A vector space A consists of three objects:

(1) An Abelian group (A, +) whose elements are called “vectors” and whose binary operation “+” is called addition,

(2) A field F of numbers (either R, the real numbers, or C, the complex numbers), whose elements are called “scalars”, and

(3) An operation called “multiplication with scalars” and denoted by “·”, which associates to any scalar c ∈ F and vector α ∈ A a new vector c · α ∈ A. The multiplication with scalars operation has the following properties

c · (α + β) = c · α + c · β

(c + c′) · α = c · α + c′ · α

(c · c′) · α = c · (c′ · α), 1 · α = α.

where α, β ∈ A and c, c′ ∈ F.

Observations:

(a) Often we omit the “·” symbol and write the product of two scalars as cc′ instead of c · c′ and the product of a scalar with a vector as cα instead of c · α.

(b) In this volume we are only concerned with either R, the field of real numbers, or C, the field of complex numbers. In the second volume we discuss other fields, e.g., finite fields.

Given n scalars c0, c1, c2, . . . cn−1 ∈ R, the n vectors e0, e1, e2, . . . en−1 ∈ R^n are linearly independent if

c0e0 + c1e1 + c2e2 + . . . + cn−1en−1 = 0 =⇒ c0 = c1 = c2 = . . . = cn−1 = 0.

Vectors that are not linearly independent are called linearly dependent.
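The definition of linear independence can be checked mechanically. The following is a minimal Python sketch, not part of the original notes: it assumes vectors given as tuples of numbers and decides independence by computing the rank with Gaussian elimination over exact fractions. The function names `rank` and `linearly_independent` are illustrative choices.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    pivots = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Look for a row at or below the pivot position with a nonzero entry.
        pivot = next(
            (r for r in range(pivots, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        # Eliminate the entries below the pivot.
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
    return pivots


def linearly_independent(vectors):
    """True when only the trivial combination gives the zero vector."""
    return rank(vectors) == len(vectors)


standard_basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
dependent_set = [(1, 1, 0), (0, 1, 1), (1, 2, 1)]  # third = first + second
print(linearly_independent(standard_basis), linearly_independent(dependent_set))
```

Exact fractions are used so that the test for a zero pivot is not affected by rounding.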

A subspace S of a vector space A is a subset of A which is closed with respect to the operations of addition and scalar multiplication. This means that the sum of two vectors in S is in S and for any vector α ∈ S and scalar c ∈ R the vector cα is in S.

Examples of subspaces

(i) The set of polynomials of degree at most m is a subspace of the vector space of all polynomials which is a subspace in the vector space of complex valued continuous functions CR(R).

(ii) The set of all continuous functions f(x) defined for 0 ≤ x ≤ 2 is a subspace of the linear space of all functions defined on the same domain.

The set of all linear combinations of any set of vectors of a vector space A is a subspace of A. Given c′, c1, . . . cm, c′1, . . . c′m ∈ F the following two identities allow us to prove this statement

$$(c_1\alpha_1 + c_2\alpha_2 + \dots + c_m\alpha_m) + (c'_1\alpha_1 + c'_2\alpha_2 + \dots + c'_m\alpha_m) = (c_1 + c'_1)\alpha_1 + (c_2 + c'_2)\alpha_2 + \dots + (c_m + c'_m)\alpha_m$$

$$c'(c_1\alpha_1 + c_2\alpha_2 + \dots + c_m\alpha_m) = (c'c_1)\alpha_1 + (c'c_2)\alpha_2 + \dots + (c'c_m)\alpha_m$$

The subspace consisting of all linear combinations of a set of vectors of A is the smallest subspace containing all the given vectors. The set of vectors spans the subspace.

A linearly independent subset of vectors which spans the whole space is called a basis of a vector space. A vector space A is finite dimensional if and only if it has a finite basis.

The number of vectors in a basis of a finite-dimensional vector space A, which is the same for every basis, is called the dimension of the vector space, dim(A). For example, the ordinary space R^3 can be spanned by three vectors, (1, 0, 0), (0, 1, 0), and (0, 0, 1).

3.2 n-Dimensional Real Euclidean Vector Space

Consider now an n-dimensional real vector space, R^n. If for every pair of vectors α, β ∈ R^n we have an associated real number (α, β) such that the following four conditions are satisfied

(1) (α, β) = (β, α)

(2) (cα, β) = c(α, β), if c ∈ R

(3) (α + γ, β) = (α, β) + (γ, β)

(4) (α, α) ≥ 0 and (α, α) = 0 if and only if α = 0

then we say that we have an n-dimensional Euclidean space and that (α, β) is the inner product of vectors α and β. All the facts known from Euclidean geometry can be established in a Euclidean space. The length of a vector α in a Euclidean space is defined to be the real number

$$|\alpha| = \sqrt{(\alpha, \alpha)}.$$

The angle between two vectors α and β is the real number

$$\varphi = \arccos \frac{(\alpha, \beta)}{|\alpha|\,|\beta|} \quad\Longrightarrow\quad \cos\varphi = \frac{(\alpha, \beta)}{|\alpha|\,|\beta|}.$$

If (α, β) = 0, where α ≠ 0 and β ≠ 0, then cos ϕ = 0 and ϕ = π/2. Two vectors α and β in a Euclidean space are orthogonal if

(α, β) = 0.

The n vectors e0, e1, e2, . . . en−1 form an orthogonal basis in an n-dimensional Euclidean space if they are pairwise orthogonal. If, in addition, each of them has unit length, then they form an orthonormal basis and satisfy the condition

$$(e_i, e_j) = \delta_{i,j} = \begin{cases} 0 & i \neq j \\ 1 & i = j \end{cases} \qquad 0 \le i, j \le n-1,$$

with δi,j the Kronecker delta function.

It is relatively easy to show [53] that every n-dimensional Euclidean space contains orthogonal bases. Given an orthonormal basis e0, e1, e2, . . . en−1 of an n-dimensional Euclidean space we can express any two vectors as

α = a0e0 + a1e1 + a2e2 + . . . + an−1en−1

and

β = b0e0 + b1e1 + b2e2 + . . . + bn−1en−1.

Then we use the fact that (ei, ej) = δi,j to show that

(α, ei) = ai

and that

$$(\alpha, \beta) = \sum_{i=0}^{n-1} a_i b_i.$$

The last two equations express the fact that

(a) the projection of the vector α on the basis vector ei is the inner product (α, ei), i.e., the real number ai, and

(b) the inner product of two vectors in a Euclidean vector space is the sum of the products of the projections of the two vectors on all vectors of an orthonormal basis.
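The component formulas for the inner product, the length, and the angle translate directly into code. The sketch below is an illustration only; it assumes real vectors given by their components in an orthonormal basis, and the function names are hypothetical.

```python
import math


def inner(a, b):
    """Inner product as the sum of products of components."""
    return sum(x * y for x, y in zip(a, b))


def length(a):
    """Length of a vector: the square root of its inner product with itself."""
    return math.sqrt(inner(a, a))


def angle(a, b):
    """Angle between two nonzero vectors, in radians."""
    cosine = inner(a, b) / (length(a) * length(b))
    # Guard against rounding that pushes the cosine slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cosine)))


def projection(a, basis_vector):
    """Projection of a on a unit basis vector, i.e. the component a_i."""
    return inner(a, basis_vector)


alpha = (1, 2, 3)
beta = (4, 5, 6)
print(inner(alpha, beta))             # 1*4 + 2*5 + 3*6
print(projection(alpha, (0, 1, 0)))   # second component of alpha
print(angle((1, 0), (0, 1)))          # orthogonal vectors
```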

The simplest functions defined on vector spaces are the linear transformations. The function A(α) is a linear form (function) if

A(α + β) = A(α) + A(β)

and

A(cα) = cA(α).

Assume that A and B are two vector spaces over the same field F, (α, β) ∈ A, c ∈ F, and A(α) ∈ B. A maps vectors in A to vectors in B.

The function A(α; β) is said to be a bilinear form (function) of vectors α and β if

(i) For any fixed β, A(α; β) is a linear function of α,

(ii) For any fixed α, A(α; β) is a linear function of β.

This implies that if A and B are two vector spaces over the same field F and we consider variable vectors α, α′ ∈ A and β, β′ ∈ B and scalars a, b, c, d ∈ F the function A(α; β) with values in F is a bilinear function if

A(aα + bα′; β) = aA(α; β) + bA(α′; β)

and

A(α; cβ + dβ′) = cA(α; β) + dA(α; β′)

A bilinear function is symmetric if

A(α; β) = A(β; α).

The inner product of two vectors in a Euclidean vector space is an example of a symmetric bilinear function.

If A(α; β) is a symmetric bilinear form, then the function A(α; α) is called a quadratic form. A quadratic form A(α; α) is positive definite if for every vector α ≠ 0

A(α; α) > 0.

The bilinear form A(α; β) is called the polar form associated with the quadratic form A(α; α). It can be shown that A(α; β) is uniquely determined by its quadratic form [53].

Now we can provide an alternative definition of a Euclidean vector space as a vector space with a positive definite quadratic form A(α; α). The inner product (α, β) of two vectors is the value of the bilinear form A(α; β) associated with A(α; α).

3.3 Linear Operators and Matrices

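A rectangular array of elements of a field F with m rows and n columns, A, is called a matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}$$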

Matrix A can be interpreted as a linear map from the vector space of dimension n, F^n, to the vector space of dimension m, F^m, each equipped with the canonical basis. If n = m, then A is a linear map from F^n to itself.

Let α1 = (a11 a12 . . . a1n), α2 = (a21 a22 . . . a2n), . . . αm = (am1 am2 . . . amn) be the rows of A. They span a subspace of dimension at most m of the vector space Vn(F), called the row space of the m × n matrix A.

The elementary row operations on a matrix are

(i) the interchange of any two rows,

(ii) multiplication of all the elements of a row by a nonzero constant c ∈ F, and

(iii) addition of any multiple of a row to any other row.

Two matrices are row-equivalent if one is obtained from the other by a finite sequence of row operations.

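In the case n = m the identity matrix I = [aij] is the matrix with aii = 1 and aij = 0 if i ≠ j. For example, the identity matrix in a vector space of dimension 8 is

$$I_8 = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$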

A permutation matrix P is an identity matrix with rows and columns permuted.

The determinant of an n × n matrix A = [aij] is the polynomial

$$\det(A) = |A| = \sum_{\varphi} \mathrm{sgn}(\varphi)\; a(1, 1\varphi)\, a(2, 2\varphi) \cdots a(n, n\varphi).$$

Here φ denotes one of the n! different permutations of the integers 1, 2, . . . n, a(i, j) = aij, and iφ is the image of i under φ. If φ is an even permutation then sgn(φ) = +1, and sgn(φ) = −1 for an odd permutation.

The determinant can be written as:

det(A) = Ai1ai1 + Ai2ai2 + . . . Ainain

Here the coefficient Aij of aij is called the cofactor of aij. A cofactor is a polynomial in the remaining rows of A and can be described as the partial derivative (∂ | A |/∂aij) of | A |. The cofactor polynomial contains only entries from an (n − 1) × (n − 1) matrix Mij (also called a “minor”) obtained from A by eliminating row i and column j.

If we permute two rows of A then the sign of the determinant | A | changes. The determinant of the transpose of a matrix is equal to the determinant of the original matrix:

| AT |=| A | .

If two rows of A are identical then:

| A |= 0.

A square matrix (n × n) is triangular if all entries below the diagonal are zero. The determinant of a triangular matrix is the product of its diagonal elements. To compute the determinant | A | one should perform elementary row operations on matrix A and reduce it to a triangular form.
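The reduction to triangular form described above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the notes' own code: it uses only row interchanges (each of which flips the sign of the determinant) and additions of multiples of one row to another (which leave it unchanged), with exact fractions to avoid rounding. The name `determinant` is an arbitrary choice.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant of a square matrix via reduction to triangular form."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    sign = 1
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot: the determinant vanishes
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row interchange changes the sign
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= a[i][i]  # product of the diagonal of the triangular form
    return result


print(determinant([[1, 2], [3, 4]]))   # -2
print(determinant([[3, 4], [1, 2]]))   # rows interchanged: 2
print(determinant([[1, 2], [1, 2]]))   # identical rows: 0
```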

The characteristic polynomial of matrix A is:

c(λ) ≡ det(A − λI).

The trace of matrix A is the sum of its diagonal elements,

$$\mathrm{Tr}(A) = \sum_i a_{ii}.$$

The trace of an operator A is the trace of the matrix representation of A. Given any two square matrices A and B over F and a scalar c ∈ F it is easy to show that the trace has the following properties:

(i) It is cyclic

Tr(AB) = Tr(BA).

(ii) Linearity

Tr(A + B) = Tr(A) + Tr(B) Tr(cA) = c× Tr(A).

(iii) Invariance under a unitary similarity transformation S (that is, S†S = I)

Tr(SAS†) = Tr(S†SA) = Tr(A).
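The cyclic and linearity properties are easy to confirm numerically. The following is a small sketch with matrices stored as lists of rows; the helper names `matmul` and `trace` and the sample matrices are assumptions made for the illustration.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]

# Cyclic property: AB and BA differ as matrices but have the same trace.
print(matmul(A, B), matmul(B, A))
print(trace(matmul(A, B)), trace(matmul(B, A)))

# Linearity: Tr(A + B) = Tr(A) + Tr(B) and Tr(cA) = c Tr(A).
A_plus_B = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]
three_A = [[3 * x for x in row] for row in A]
print(trace(A_plus_B), trace(A) + trace(B), trace(three_A), 3 * trace(A))
```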

The determinant, the trace, and the characteristic polynomial are quantities associated with any linear map from a finite dimensional vector space into itself.

Now we can express a bilinear form A(β; γ) in terms of the projections of the two vectors, β and γ, namely b0, b1, b2 . . . bn−1 and c0, c1, c2 . . . cn−1, on the orthonormal basis e0, e1, e2, . . . en−1.

A(β; γ) = A(b0e0 + b1e1 + b2e2 + . . . + bn−1en−1; c0e0 + c1e1 + c2e2 + . . . + cn−1en−1).

But A is a bilinear function, thus

$$A(\beta; \gamma) = \sum_{i,j=0}^{n-1} A(e_i; e_j)\, b_i c_j.$$

Here A(ei; ej) is a real number that we denote by aij and

$$A(\beta; \gamma) = \sum_{i,j=0}^{n-1} a_{ij}\, b_i c_j.$$

The matrix with elements aij, A = [aij], 0 ≤ (i, j) ≤ n − 1, is called the matrix of the bilinear form A(β; γ) relative to the basis e0, e1, e2, . . . en−1.

3.4 Hermitian Operators in a Complex n-Dimensional Euclidean Vector Space

Later in this chapter we shall use vectors to describe the state of a quantum system. Quantum systems evolve in time and, in order to capture the dynamics of a system in a mathematical model, we have to study transformations of states represented as mathematical operators applied to vectors. Note that the concepts “form”, “transformation”, and “operator” are used interchangeably throughout this chapter.

It is also necessary to consider vector spaces over fields other than R, the field of real numbers. C^n, the n-dimensional vector space over C, the field of complex numbers, is of particular interest to us. But first we need to introduce the complex numbers.

The equation x² = −1 has no root among real numbers. An imaginary number i satisfying the equality i² = −1 was introduced. Informally, the complex numbers are the smallest field which contains the field of real numbers as a subfield and the imaginary number i.

The underlying set of C consists of ordered pairs of real numbers (x, y) with x called the real component and y called the imaginary component. It is convenient to write a complex number as z = x + iy. Addition and multiplication of complex numbers are defined as

z ± z′ = (x + iy)± (x′ + iy′) = (x± x′) + i(y ± y′)

z · z′ = (x + iy) · (x′ + iy′) = (x · x′ − y · y′) + i(x · y′ + y · x′).

The complex conjugate of z = x + iy is

z∗ = x− iy.

The modulus | z | of a complex number z = x + iy is the non-negative real number defined by

| z |² = z · z∗ = x² + y².

The concepts of linear and bilinear transformations introduced for an n-dimensional Euclidean vector space over the field of reals extend to finite-dimensional vector spaces over other fields. For example, the inner product in a vector space C^n over the field C assigns a complex number (α, β) to every α ∈ C^n and β ∈ C^n and, in addition to the bilinearity property (more precisely, linearity in one argument and conjugate linearity in the other), satisfies three conditions

(α, β) = (β, α)∗

(α, α) ≥ 0

and

(α, α) = 0 ⇐⇒ α = 0.

Recall that c = (α, β) is a complex number, c = a + ib, and that c∗ = a − ib with i = √−1.

The inner product in a finite-dimensional vector space induces a norm, but a norm may exist even if an inner product is not defined. A finite-dimensional vector space with a norm is a Banach space.

The inner product allows us to measure the “length” of a vector and the “angle” between two vectors. If α, β ∈ C^n and c ∈ C, then the norm is a non-negative function with the following properties

|| α + β || ≤ || α || + || β ||

|| c · α || = | c | || α ||

and

|| α || = 0 if and only if α = 0.

We define orthogonality and orthonormal vector bases in an n-dimensional Euclidean vector space over the field of complex numbers similarly to the ones defined for an n-dimensional Euclidean vector space over the field of real numbers.

We now introduce Hermitian operators, a concept analogous to the symmetric bilinear form in a real n-dimensional Euclidean space. If α, β ∈ C^n then the bilinear form A(α; β) is called Hermitian if

A(α; β) = A∗(β; α)

A bilinear form has a matrix associated with it and the necessary and sufficient condition for an operator A to be Hermitian is that its matrix A = [aij] relative to some basis e0, e1, e2 . . . en−1 satisfies the condition

aij = a∗ji

The adjoint operator associated with a bilinear form A is denoted as A†(α; β) = A∗(β; α). If α ∈ C^n then by definition α† = (α∗)^T.

If A is the matrix representation of the linear operator A and we want to obtain the adjoint matrix, we first construct the complex conjugate matrix A∗ and then take its transpose

A† = (A∗)^T.

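For example, the adjoint of matrix

$$A = \begin{pmatrix} 1 - 5i & 1 + i \\ 1 + 3i & 7i \end{pmatrix}$$

is

$$A^\dagger = \begin{pmatrix} 1 - 5i & 1 + i \\ 1 + 3i & 7i \end{pmatrix}^\dagger = \left[ \begin{pmatrix} 1 - 5i & 1 + i \\ 1 + 3i & 7i \end{pmatrix}^* \right]^T$$

Thus

$$A^\dagger = \begin{pmatrix} 1 + 5i & 1 - i \\ 1 - 3i & -7i \end{pmatrix}^T = \begin{pmatrix} 1 + 5i & 1 - 3i \\ 1 - i & -7i \end{pmatrix}.$$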

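The adjoint of the matrix

$$I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

is

$$I_3^\dagger = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

and the adjoint of matrix A with real elements, (a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p) ∈ R,

$$A = \begin{pmatrix} a & b & c & d \\ e & f & g & h \\ i & j & k & l \\ m & n & o & p \end{pmatrix}$$

is

$$A^\dagger = \begin{pmatrix} a & e & i & m \\ b & f & j & n \\ c & g & k & o \\ d & h & l & p \end{pmatrix}$$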

An operator A is normal if

AA† = A†A.

If A is a Hermitian (self-adjoint) operator, then it is also a normal operator. A matrix U is unitary if

U†U = In

where In is the identity matrix in an n-dimensional vector space

$$I_n = \begin{pmatrix} 1 & 0 & 0 & \dots & 0 \\ 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 \end{pmatrix}$$

Unitary operators preserve the inner product of vectors. Let (α, β) ∈ Cn. Then

(Uα,Uβ) = (α, β).

Usually we write the inner product using the “·” symbol. With this convention the previous equation becomes

Uα · Uβ = α · β.
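The definitions of the adjoint, of Hermitian and unitary matrices, and the preservation of the inner product can be checked on small examples. The sketch below is illustrative only: matrices are lists of rows of Python complex numbers, the inner product is taken conjugate-linear in its first argument, and the test matrix `Y` (the Pauli Y matrix, which is both Hermitian and unitary) is an example chosen here, not taken from the text.

```python
def adjoint(m):
    """Conjugate transpose: entry (i, j) is the conjugate of entry (j, i)."""
    return [
        [complex(m[j][i]).conjugate() for j in range(len(m))]
        for i in range(len(m[0]))
    ]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices of the same shape."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def is_hermitian(m):
    return close(adjoint(m), m)


def is_unitary(u):
    n = len(u)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return close(matmul(adjoint(u), u), identity)


def inner(a, b):
    """Inner product of two complex vectors, conjugating the first one."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def apply(u, v):
    """Apply the matrix u to the vector v."""
    return [sum(u[i][k] * v[k] for k in range(len(v))) for i in range(len(u))]


A = [[1 - 5j, 1 + 1j], [1 + 3j, 7j]]   # the matrix from the adjoint example
Y = [[0, -1j], [1j, 0]]                # Hermitian and unitary
alpha, beta = [1 + 1j, 2], [3, 1 - 2j]

print(adjoint(A))
print(is_hermitian(A), is_hermitian(Y), is_unitary(Y))
print(inner(apply(Y, alpha), apply(Y, beta)), inner(alpha, beta))
```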

3.5 n-Dimensional Hilbert Spaces. Dirac’s Notations

Traditionally, a Hilbert space is defined as an infinite-dimensional vector space with an inner product and a norm [69]. The infinite-dimensional vector spaces that are important in quantum mechanics are analogous with the finite-dimensional vector spaces and can be spanned by a countable basis (footnote 17). They are called separable Hilbert spaces [75].

The quantum computing literature [80] has inherited the convention to call an n-dimensional complex Euclidean vector space an n-dimensional Hilbert space, Hn. We shall follow this convention throughout this book.

With this convention an n-dimensional Hilbert space Hn is an n-dimensional vector space over the field of complex numbers with an inner product. The elements of Hn are n-dimensional vectors. These vectors can be added together, or multiplied by scalars, and the results of these operations are also elements of the Hilbert space. Hn is isomorphic with C^n. An n-dimensional Hilbert space Hn is also called a unitary space.

All the concepts discussed in the previous section apply to Hn. A collection of vectors {e0, e1, . . . en−1} ∈ Hn is called an orthonormal basis if the inner product of any two distinct vectors is zero, ei · ej = 0, ∀ i ≠ j with 0 ≤ i, j ≤ n − 1, and the inner product of any of them with itself is one, ei · ei = 1, ∀ i with 0 ≤ i ≤ n − 1.

A set of n orthonormal vectors forms an orthonormal basis in Hn. For the following discussion we select one of many possible choices of orthonormal basis vectors. We use the traditional notation of quantum mechanics for vectors in Hn introduced by Dirac and represent the vectors of this particular orthonormal basis as kets

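$$|0\rangle,\ |1\rangle,\ \dots,\ |i\rangle,\ \dots,\ |n-1\rangle,$$

or, as bras

$$\langle 0|,\ \langle 1|,\ \dots,\ \langle i|,\ \dots,\ \langle n-1|.$$

In matrix representation each ket vector | i〉 is expressed as a column vector with 1 in the ith position and 0 in all the others. For example,

$$|0\rangle = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ \vdots \\ 0 \end{pmatrix},\quad |1\rangle = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \\ \vdots \\ 0 \end{pmatrix},\ \dots,\ |i\rangle = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix},\ \dots,\ |n-1\rangle = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \vdots \\ 1 \end{pmatrix}.$$

Footnote 17: The set of vectors forming an orthonormal basis is countable.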

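In turn, each bra vector 〈i | is expressed as a row vector with 1 in the ith position and 0 in all the others:

$$\begin{aligned} \langle 0| &= \begin{pmatrix} 1 & 0 & \dots & 0 & \dots & 0 \end{pmatrix}, \\ \langle 1| &= \begin{pmatrix} 0 & 1 & \dots & 0 & \dots & 0 \end{pmatrix}, \\ &\ \,\vdots \\ \langle i| &= \begin{pmatrix} 0 & 0 & \dots & 1 & \dots & 0 \end{pmatrix}, \\ &\ \,\vdots \\ \langle n-1| &= \begin{pmatrix} 0 & 0 & \dots & 0 & \dots & 1 \end{pmatrix}. \end{aligned}$$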

An n-dimensional ket vector | ψ〉 can be expressed in this basis as a linear combination of the orthonormal ket vectors

$$|\psi\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle + \dots + \alpha_i|i\rangle + \dots + \alpha_{n-1}|n-1\rangle$$

where α0, α1, . . . , αi, . . . , αn−1 are complex numbers.

For each ket vector | ψ〉 there is a dual, the bra vector denoted by 〈ψ |. The bra and ket vectors are related by Hermitian conjugation

| ψ〉 = (〈ψ |)†, 〈ψ | = (| ψ〉)†

The bra vector 〈ψ |, the dual of the ket vector | ψ〉, is expressed as a linear combination of the orthonormal bra vectors

$$\langle\psi| = \alpha_0^*\langle 0| + \alpha_1^*\langle 1| + \dots + \alpha_i^*\langle i| + \dots + \alpha_{n-1}^*\langle n-1|$$

where α∗0, α∗1, . . . , α∗i, . . . , α∗n−1 are the complex conjugates of α0, α1, . . . , αi, . . . , αn−1.

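In matrix representation, the ket vector is expressed as the column matrix

$$|\psi\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_i \\ \vdots \\ \alpha_{n-1} \end{pmatrix}$$

and the dual bra vector is expressed as the row matrix

$$\langle\psi| = \begin{pmatrix} \alpha_0^* & \alpha_1^* & \dots & \alpha_i^* & \dots & \alpha_{n-1}^* \end{pmatrix}$$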

3.6 The Inner Product in an n-Dimensional Hilbert Space

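The inner product 〈ψa | ψb〉 of two vectors (| ψa〉, | ψb〉) ∈ Hn is a complex number. The inner product in a Hilbert space has the following properties

(i) The inner product of a vector with itself is a non-negative real number

$$\langle\psi|\psi\rangle \in \mathbb{R}, \qquad \langle\psi|\psi\rangle \begin{cases} = 0 & \text{if } |\psi\rangle = 0 \\ > 0 & \text{otherwise} \end{cases}$$

(ii) Linearity. If (| ψa〉, | ψb〉, | ψc〉) ∈ Hn and a, b, c ∈ C

$$\langle\psi_a| \big( c\,|\psi_b\rangle \big) = c\,\langle\psi_a|\psi_b\rangle$$

$$\big( a\langle\psi_a| + b\langle\psi_b| \big)\,|\psi_c\rangle = a\langle\psi_a|\psi_c\rangle + b\langle\psi_b|\psi_c\rangle$$

$$\langle\psi_c| \big( a\,|\psi_a\rangle + b\,|\psi_b\rangle \big) = a\langle\psi_c|\psi_a\rangle + b\langle\psi_c|\psi_b\rangle$$

(iii) Skew symmetry

$$\langle\psi_a|\psi_b\rangle = \langle\psi_b|\psi_a\rangle^*.$$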

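Let us observe that the skew symmetry, together with the linearity in the second factor, implies a skew linearity (conjugate linearity) in the first factor. If | ψ〉 = b | ψb〉 + c | ψc〉 then

$$\langle\psi|\psi_a\rangle = \langle\psi_a|\psi\rangle^* = \big( b\langle\psi_a|\psi_b\rangle + c\langle\psi_a|\psi_c\rangle \big)^* = b^*\langle\psi_b|\psi_a\rangle + c^*\langle\psi_c|\psi_a\rangle.$$

The inner product maps an ordered pair of vectors in Hn to complex numbers in C. For example, if | ψa〉, | ψb〉 ∈ H3 and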

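$$|\psi_a\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle + \alpha_2|2\rangle, \qquad |\psi_b\rangle = \beta_0|0\rangle + \beta_1|1\rangle + \beta_2|2\rangle$$

then

$$\langle\psi_a|\psi_b\rangle = \begin{pmatrix} \alpha_0^* & \alpha_1^* & \alpha_2^* \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = \alpha_0^*\beta_0 + \alpha_1^*\beta_1 + \alpha_2^*\beta_2$$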

For example, if

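$$|\psi_a\rangle = (1 + i)\,|0\rangle + (2 - 3i)\,|1\rangle, \qquad |\psi_b\rangle = (1 - 2i)\,|0\rangle + (3 + 2i)\,|1\rangle$$

then

$$\begin{aligned} \langle\psi_a|\psi_b\rangle &= (1 + i)^*(1 - 2i) + (2 - 3i)^*(3 + 2i) \\ &= (1 - i)(1 - 2i) + (2 + 3i)(3 + 2i) \\ &= (-1 - 3i) + 13i \\ &= -1 + 10i \end{aligned}$$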

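Recall that the inner product of a vector with itself is a real number. Indeed

$$\langle\psi_a|\psi_a\rangle = \begin{pmatrix} \alpha_0^* & \alpha_1^* & \alpha_2^* \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \end{pmatrix} = \alpha_0^*\alpha_0 + \alpha_1^*\alpha_1 + \alpha_2^*\alpha_2 = |\alpha_0|^2 + |\alpha_1|^2 + |\alpha_2|^2$$

with | αi |², the square of the modulus of the complex number αi, being equal to the sum of the squares of its real and imaginary components

$$|\alpha_i|^2 = [\mathrm{Re}(\alpha_i)]^2 + [\mathrm{Im}(\alpha_i)]^2$$

For example, if

$$|\psi_a\rangle = (1 + 2i)\,|0\rangle + (4 - 3i)\,|1\rangle$$

then

$$\langle\psi_a|\psi_a\rangle = |1 + 2i|^2 + |4 - 3i|^2 = (1^2 + 2^2) + (4^2 + 3^2) = 5 + 25 = 30.$$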
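The two numerical examples above can be reproduced with Python's built-in complex numbers. This is a minimal sketch; the function name `braket` is a hypothetical choice, and state vectors are assumed to be given as lists of amplitudes in the basis | 0〉, | 1〉, . . .

```python
def braket(psi_a, psi_b):
    """Inner product <psi_a|psi_b>: conjugate the amplitudes of the bra."""
    return sum(complex(a).conjugate() * b for a, b in zip(psi_a, psi_b))


# First example: <psi_a|psi_b> for the two vectors in H2.
psi_a = [1 + 1j, 2 - 3j]
psi_b = [1 - 2j, 3 + 2j]
print(braket(psi_a, psi_b))   # (-1+10j)
print(braket(psi_b, psi_a))   # the complex conjugate, (-1-10j)

# Second example: the inner product of a vector with itself is real.
psi_c = [1 + 2j, 4 - 3j]
print(braket(psi_c, psi_c))   # (30+0j)
```

Swapping the two arguments conjugates the result, which is the skew symmetry property stated earlier.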

A few comments regarding the notations: 〈ψa | ψb〉 is an abbreviated notation for 〈ψa | | ψb〉; a complete bracket expression 〈 | 〉 denotes either a real or a complex number while an incomplete bracket expression such as 〈 |, or | 〉 denotes a vector. The notation P1 =⇒ P2 means “proposition P1 implies proposition P2”. The notation | αi | denotes the modulus of the complex number αi, while || | α〉 || denotes the norm of the vector | α〉.

Two vectors | ψa〉 and | ψb〉 in Hn are orthogonal (we write | ψa〉 ⊥ | ψb〉) if their inner product is zero

〈ψa | ψb〉 = 0 =⇒ | ψa〉 ⊥ | ψb〉

The skew symmetry implies that orthogonality is a symmetric relation

| ψa〉 ⊥ | ψb〉 =⇒ | ψb〉 ⊥ | ψa〉.

A normal unitary basis of an n-dimensional space is a set of n vectors | ψ1〉, | ψ2〉, · · · | ψi〉, · · · | ψn〉 where each vector has the norm (or “length”) equal to one

|| | ψ1〉 || = || | ψ2〉 || = · · · = || | ψi〉 || = · · · = || | ψn〉 || = 1

and any two vectors are orthogonal

〈ψi | ψj〉 = 0 ∀(| ψi〉, | ψj〉) ∈ Hn and i ≠ j.

The unit vectors 〈0 | = (1, 0, . . . , 0), . . . , 〈n − 1 | = (0, 0, . . . , 1) have unit length and are mutually orthogonal; they form a normal unitary basis. It can be proven that any set of m < n mutually orthogonal vectors of length one of a unitary space forms part of a normal unitary basis of the space.

The inner product satisfies the Schwarz inequality

〈ψa | ψa〉〈ψb | ψb〉 ≥ | 〈ψa | ψb〉 |2

A non-zero vector | a〉 is an eigenvector and the scalar a is an eigenvalue of the linear operator A if the following equation is satisfied:

A | a〉 = a | a〉

The eigenvalues of the operator A are the solutions of the characteristic equation

c(λ) = 0

where the characteristic polynomial is

c(λ) ≡ det(A − λI).

According to the fundamental theorem of algebra a non-constant polynomial over C has at least one complex root. Thus every operator A in a finite dimensional vector space over C has at least one eigenvalue (actually it has n eigenvalues, counted with multiplicity, if n is the dimension of the vector space).

The eigenspace corresponding to an eigenvalue a of operator A is the set of vectors {| ai〉} which have eigenvalue a, together with the zero vector. It is a subspace of the vector space that A operates on.

Any operator A can be written as

$$A = \sum_i a_i\, |b_i\rangle\langle c_i|.$$

A diagonal representation of the operator A is a representation of the form

$$A = \sum_i a_i\, |b_i\rangle\langle b_i|$$

where the vectors | bi〉 form an orthonormal set of eigenvectors of A corresponding to the eigenvalues ai. When such representation exists, it is unique.

It is easy to prove that a normal matrix is Hermitian if and only if it has real eigenvalues. The proof is left as an exercise to the reader, and here we only show that if a matrix in H2 is Hermitian it has real eigenvalues.

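Let

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$

Then

$$A^\dagger = \begin{pmatrix} a_{11}^* & a_{21}^* \\ a_{12}^* & a_{22}^* \end{pmatrix}$$

The eigenvalues of A are the solutions of the equation

$$\det(A - \lambda I_2) = 0$$

or

$$(a_{11} - \lambda)(a_{22} - \lambda) - a_{12}a_{21} = 0$$

The quadratic equation

$$\lambda^2 - (a_{11} + a_{22})\lambda + (a_{11}a_{22} - a_{12}a_{21}) = 0$$

has real roots if its coefficients are real and

$$(a_{11} + a_{22})^2 - 4(a_{11}a_{22} - a_{12}a_{21}) \ge 0.$$

This is the case because the fact that A is Hermitian (A = A†) implies that

$$a_{11} = a_{11}^*, \qquad a_{22} = a_{22}^*, \qquad a_{12} = a_{21}^*, \qquad a_{21} = a_{12}^*.$$

Thus a11 and a22 are real, a12a21 = | a12 |² is real and non-negative, and the left-hand side of the inequality equals (a11 − a22)² + 4 | a12 |² ≥ 0.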
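The argument for the 2 × 2 case can be tried on a concrete matrix. The sketch below solves the characteristic equation with the quadratic formula; the function name `eigenvalues_2x2` and both sample matrices are assumptions made for the illustration. For the Hermitian sample the trace is 5 and the determinant is 6 − (1 − i)(1 + i) = 4, so the characteristic equation is λ² − 5λ + 4 = 0.

```python
import cmath


def eigenvalues_2x2(m):
    """Roots of lambda^2 - (a11 + a22) lambda + (a11 a22 - a12 a21) = 0."""
    (a11, a12), (a21, a22) = m
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2


# A Hermitian matrix: real diagonal, a12 equal to the conjugate of a21.
hermitian = [[2, 1 - 1j], [1 + 1j, 3]]
print(eigenvalues_2x2(hermitian))       # both roots are real: 1 and 4

# A matrix that is not Hermitian can have non-real eigenvalues.
not_hermitian = [[0, 1], [-1, 0]]
print(eigenvalues_2x2(not_hermitian))   # the roots are -i and +i
```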

3.7 Tensor and Outer Products

If A(F) and B(F) are vector spaces over the field F, then the set F(A, B) of all bilinear functions A(α, β) with α ∈ A and β ∈ B is also a vector space over the field F.

An m × n matrix A is regarded as a linear operator from an n-dimensional Hilbert space, Hn, to an m-dimensional Hilbert space, Hm, namely A : Hn −→ Hm.

The tensor product A ⊗ B of two vector spaces A and B over the same field F is the dual F(A, B)∗ of the space F(A, B) of bilinear functions from A and B to F.

The tensor product of two finite dimensional Hilbert spaces, Hn and Hm, is Hn ⊗ Hm = Hnm. If (e0, e1, . . . en−1) is an orthonormal basis for Hn and (f0, f1, . . . fm−1) is an orthonormal basis for Hm, then the nm vectors ei ⊗ fj, with 0 ≤ i ≤ n − 1 and 0 ≤ j ≤ m − 1, form an orthonormal basis for Hnm.

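Similarly, one defines the tensor product of two linear operators, in particular the tensor product of two matrices. For example, if A is an m × n matrix and B is a p × q matrix then using the Kronecker product representation for the tensor product we have

$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \dots & a_{1n}B \\ a_{21}B & a_{22}B & \dots & a_{2n}B \\ a_{31}B & a_{32}B & \dots & a_{3n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \dots & a_{mn}B \end{pmatrix}$$

with

$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ a_{31} & a_{32} & \dots & a_{3n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} \qquad B = \begin{pmatrix} b_{11} & b_{12} & \dots & b_{1q} \\ b_{21} & b_{22} & \dots & b_{2q} \\ b_{31} & b_{32} & \dots & b_{3q} \\ \vdots & \vdots & & \vdots \\ b_{p1} & b_{p2} & \dots & b_{pq} \end{pmatrix}.$$

Here aijB, 1 ≤ i ≤ m, 1 ≤ j ≤ n, is a sub-matrix whose entries are the elements of matrix B multiplied by aij.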

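The tensor product of an m × n matrix and a p × q matrix is an mp × nq matrix. For example, the tensor product of vectors (a, b) and (c, d) is the vector

$$\begin{pmatrix} a \\ b \end{pmatrix} \otimes \begin{pmatrix} c \\ d \end{pmatrix} = \begin{pmatrix} ac \\ ad \\ bc \\ bd \end{pmatrix}.$$

The tensor product of the vectors (| 0〉, | 1〉) ∈ H2 is

$$|0\rangle \otimes |1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}.$$

The tensor product of two vectors, one in Hp, and the other in Hq, is a vector in Hpq. In the previous example the tensor product of two vectors in H2 is a vector in H4.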
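The Kronecker product representation can be written in a few lines of Python. This is an illustrative sketch: matrices are lists of rows, column vectors are matrices with one column, and the name `kron` is a choice made here.

```python
def kron(a, b):
    """Kronecker product: each entry a_ij of a is replaced by the block a_ij * b."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


ket0 = [[1], [0]]   # |0> as a column vector
ket1 = [[0], [1]]   # |1> as a column vector
print(kron(ket0, ket1))   # [[0], [1], [0], [0]]

# An m x n matrix times a p x q matrix gives an mp x nq matrix.
A = [[1, 2, 3], [4, 5, 6]]   # 2 x 3
B = [[0, 1], [1, 0]]         # 2 x 2
product = kron(A, B)
print(len(product), len(product[0]))   # 4 6
```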

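The outer product of a ket vector and a bra vector, | ψa〉〈ψb |, is a linear operator, de facto a matrix. In H3 we have

$$|\psi_a\rangle\langle\psi_b| = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \end{pmatrix} \begin{pmatrix} \beta_0^* & \beta_1^* & \beta_2^* \end{pmatrix} = \begin{pmatrix} \alpha_0\beta_0^* & \alpha_0\beta_1^* & \alpha_0\beta_2^* \\ \alpha_1\beta_0^* & \alpha_1\beta_1^* & \alpha_1\beta_2^* \\ \alpha_2\beta_0^* & \alpha_2\beta_1^* & \alpha_2\beta_2^* \end{pmatrix}$$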
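A short Python sketch of the outer product follows, under the same conventions as the matrix above: entry (i, j) of | ψa〉〈ψb | is αi times the conjugate of βj. The function names and sample vectors are illustrative assumptions. The last lines check the standard identity (| ψa〉〈ψb |) | ψc〉 = 〈ψb | ψc〉 | ψa〉, which shows how the operator acts on a vector.

```python
def outer(psi_a, psi_b):
    """Matrix of |psi_a><psi_b|: entry (i, j) is alpha_i * conj(beta_j)."""
    return [[a * complex(b).conjugate() for b in psi_b] for a in psi_a]


def braket(psi_a, psi_b):
    """Inner product <psi_a|psi_b>."""
    return sum(complex(a).conjugate() * b for a, b in zip(psi_a, psi_b))


def apply(m, v):
    """Apply the matrix m to the vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


psi_a = [1, 2j]
psi_b = [1j, 1]
psi_c = [3, 1 - 1j]

operator = outer(psi_a, psi_b)
print(operator)

# Acting on |psi_c> gives <psi_b|psi_c> times |psi_a>.
left = apply(operator, psi_c)
right = [braket(psi_b, psi_c) * a for a in psi_a]
print(left, right)
```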

3.8 Quantum States

A state is a complete description of a physical system. In quantum mechanics, a state is represented by a vector of length one in the Hilbert space Hn. Vectors which differ only by a factor that is a complex number of modulus equal to one represent the same state. The traditional notation for a state in quantum mechanics is

$$|\Psi_a\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle + \dots + \alpha_i|i\rangle + \dots + \alpha_{n-1}|n-1\rangle$$

and we shall follow this notation rather than | ψ〉 for the rest of this chapter. The length of a ket vector | Ψa〉 or of the corresponding bra vector 〈Ψa | is defined as the square root of the positive number 〈Ψa | Ψa〉.

By convention, state vectors are assumed to be normalized, i.e. 〈Ψa | Ψa〉 = 1. Therefore,

$$\sum_i |\alpha_i|^2 = 1.$$

For a given state, the corresponding ket and bra vectors are defined only by their direction; their length is determined up to a factor. This factor is usually chosen so that the vector length is equal to unity. Even when the vector length is unity the vector is not uniquely determined, because it can be multiplied by a phase factor, a quantity of modulus 1, such as the complex number $e^{i\gamma}$, where γ is real.

In fact a quantum state is a ray in Hilbert space, a mathematical abstraction that exhibits only direction. It can be represented as a straight line through the origin of the coordinate system. While a vector has a well defined magnitude and direction, a ray has only relative direction.

A ray is an equivalence class of vectors that differ by multiplication by a nonzero complex scalar. For any non-vanishing vector we can choose an element of this class to have unit norm:

〈Ψa | Ψa〉 = 1.

For such a normalized vector we can say that | Ψa〉 and $e^{i\gamma}$ | Ψa〉, where $|e^{i\gamma}| = 1$, describe the same physical state. The factor $e^{i\gamma}$ is an overall (global) phase and has no observable consequence.

The inner product of two state vectors | Ψa〉 and | Ψb〉 represents the “generalized angle” between the states and gives an estimate of the overlap between the states | Ψa〉 and | Ψb〉. The interpretation of 〈Ψa | Ψb〉 = 0 as representing orthogonal states, and the implication of | 〈Ψa | Ψb〉 | = 1 that Ψa and Ψb are one and the same state, are immediately evident. The inner product of two state vectors is a complex number and its squared modulus, $|\langle\Psi_a|\Psi_b\rangle|^2$, can be considered a quantitative measure of the “relative orthogonality” between these states.
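As an illustration of the overlap and of the irrelevance of an overall phase factor, here is a minimal sketch in plain Python. The helper names `inner` and `overlap` and the sample states are assumptions introduced for this example only.

```python
import cmath
import math

def inner(bra, ket):
    """<bra|ket>: the components of the first vector are complex-conjugated."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def overlap(a, b):
    """Squared modulus of the inner product of two normalized states."""
    return abs(inner(a, b)) ** 2

s = 1 / math.sqrt(2)
plus = [s, s]                       # (|0> + |1>)/sqrt(2)
minus = [s, -s]                     # (|0> - |1>)/sqrt(2)

# Multiplying a state by exp(i*gamma) leaves every overlap unchanged.
gamma = 0.7
plus_with_phase = [cmath.exp(1j * gamma) * c for c in plus]

orthogonal_overlap = overlap(plus, minus)            # orthogonal states
same_state_overlap = overlap(plus, plus_with_phase)  # same ray
```

The two sample states are orthogonal, so their overlap vanishes, while a state and its phase-rotated copy have overlap one, consistent with both vectors lying on the same ray.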

If the state of a dynamical system is the result of a superposition of other states, its corresponding ket vector can be expressed as a linear combination of the ket vectors corresponding to the states entering the superposition. The states involved in a superposition are said to be dependent. The “superposition principle” can be formulated as: every ray in Hn corresponds to a possible state, so that given two states | Ψa〉 and | Ψb〉, we can form another state as (a | Ψa〉 + b | Ψb〉), a superposition of the original two states. The relative phase between the two terms of such a superposition is physically significant: (a | Ψa〉 + b | Ψb〉) and $e^{i\gamma}$ (a | Ψa〉 + b | Ψb〉) describe the same state, whereas (a | Ψa〉 + $e^{i\gamma}$ b | Ψb〉) is, in general, a different state.

There is a fundamental difference between a quantum superposition and a classical superposition. For example, a superposition of a classical membrane vibration state with itself results in a different state with a different magnitude of the oscillation. There is no physical characteristic of a quantum state corresponding to the magnitude of the classical oscillation. A classical state with amplitude of oscillation zero everywhere is a membrane at rest. No corresponding state exists for a quantum system since a zero ket vector corresponds to no state at all.

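A set of mutually orthogonal unit vectors | 0〉, | 1〉, . . . | i〉, . . . | n − 1〉 forms an orthonormal basis in the n-dimensional state vector space. For example, in H3 a state can be represented by the vector

\[
|\Psi_a\rangle \longrightarrow \alpha_0 |0\rangle + \alpha_1 |1\rangle + \alpha_2 |2\rangle
\]

where

\[
|0\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad
|1\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad
|2\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.
\]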

We can work with bra vectors instead of ket vectors. Then

\[
\langle\Psi_a| \longrightarrow \alpha_0^* \langle 0| + \alpha_1^* \langle 1| + \alpha_2^* \langle 2|
\]

The unit vectors satisfy the relation

〈i | j〉 = δi,j

where δi,j is the Kronecker delta function previously defined.

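The same physical state | Ψa〉 can be expressed in different bases. For example, the same state | Ψa〉 ∈ H2 can be expressed in the basis

\[
\{ |0\rangle, |1\rangle \}
\]

as

\[
|\Psi_a\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle
\]

and in the basis

\[
\left\{ |x\rangle = \tfrac{1}{\sqrt{2}} (|0\rangle + |1\rangle), \quad |y\rangle = \tfrac{1}{\sqrt{2}} (|0\rangle - |1\rangle) \right\}
\]

as

\[
|\Psi_a\rangle = \tfrac{1}{\sqrt{2}} (\alpha_0 + \alpha_1)\, |x\rangle + \tfrac{1}{\sqrt{2}} (\alpha_0 - \alpha_1)\, |y\rangle.
\]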
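The change of basis can be verified numerically. This is a small sketch in plain Python; the function names and the sample amplitudes are hypothetical, chosen only to illustrate the two expansions of the same state.

```python
import math

S = 1 / math.sqrt(2)

def to_xy_basis(alpha0, alpha1):
    """Amplitudes of alpha0|0> + alpha1|1> in the {|x>, |y>} basis."""
    return S * (alpha0 + alpha1), S * (alpha0 - alpha1)

def from_xy_basis(cx, cy):
    """Back to the {|0>, |1>} basis, using |x> = S(|0>+|1>), |y> = S(|0>-|1>)."""
    return S * (cx + cy), S * (cx - cy)

alpha0, alpha1 = 0.6, 0.8j          # a normalized sample state
cx, cy = to_xy_basis(alpha0, alpha1)
back0, back1 = from_xy_basis(cx, cy)
norm_xy = abs(cx) ** 2 + abs(cy) ** 2
```

Going to the {| x〉, | y〉} basis and back recovers the original amplitudes, and the state remains normalized in either basis, as expected for a change between orthonormal bases.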

3.9 Quantum Observables. Quantum Operators

An observable is a property of a physical system that, in principle, can be measured. The formalism of quantum mechanics associates an observable with a self-adjoint, or Hermitian, operator.

An operator O maps state vectors to state vectors in Hn. If we take a quantum system in state Ψa and apply to it a transformation described by the operator O, we get a different state Ψb. This transformation, regardless of whether it is a rotation, waiting for some time ∆t, or performing a measurement, is described mathematically as

| Ψb〉 = O | Ψa〉.

Recall that an operator O is

Hermitian (self-adjoint) if O = O†
Unitary if OO† = O†O = I
Normal if [O, O†] = OO† − O†O = 0

with O† the adjoint of O and I the identity operator. Clearly, both Hermitian and unitary operators are normal.

An operator O acts on a ket state vector from the left and on a bra state vector from the right

| Ψa〉 → O | Ψa〉
〈Ψa | → 〈Ψa | O.

We are only concerned with linear transformations and thus with linear operators. Then for any two state vectors | Ψa〉, | Ψb〉 ∈ Hn, and any complex numbers a ∈ C and b ∈ C

\[
O(a |\Psi_a\rangle + b |\Psi_b\rangle) = a\, O |\Psi_a\rangle + b\, O |\Psi_b\rangle.
\]

The state vector | Ψa〉 can be expressed as a linear combination of the basis states | 0〉, | 1〉, . . . | i〉, . . . | n − 1〉

\[
|\Psi_a\rangle = \sum_{i=0}^{n-1} \alpha_i |i\rangle
\]

with αi, 0 ≤ i ≤ n − 1, complex numbers representing the amplitudes. It is easy to show that

\[
\alpha_j = \langle j|\Psi_a\rangle.
\]

Indeed, if we multiply the expansion of | Ψa〉 in terms of the basis vectors from the left by the bra state vector 〈j | we obtain

\[
\langle j|\Psi_a\rangle = \langle j| \sum_{i=0}^{n-1} \alpha_i |i\rangle = \sum_{i=0}^{n-1} \alpha_i \langle j|i\rangle = \alpha_j.
\]

Recall that the unit vectors satisfy the relation

\[
\langle j|i\rangle =
\begin{cases}
0 & \text{if } j \neq i \\
1 & \text{if } j = i
\end{cases}
\]

Thus the amplitudes 〈j | Ψa〉 give the amount of each basis state | j〉 we find in | Ψa〉.

Now let us consider | Ψb〉 = O | Ψa〉 as an algebraic equation and multiply it from the left by 〈j |

\[
\langle j|\Psi_b\rangle = \langle j| O |\Psi_a\rangle.
\]

If we express the state vector | Ψa〉 in terms of the basis vectors and substitute the value of αi we obtain

\[
\langle j|\Psi_b\rangle = \langle j| O \sum_i \alpha_i |i\rangle = \sum_i \langle j| O\, \alpha_i |i\rangle = \sum_i \langle j| O\, \langle i|\Psi_a\rangle\, |i\rangle.
\]

We change the order of the last two elements, the complex number 〈i | Ψa〉 and the ket vector | i〉, and obtain

\[
\langle j|\Psi_b\rangle = \sum_i \langle j| O |i\rangle \langle i|\Psi_a\rangle
\]

where the states | j〉 are from the same set as | i〉. In this algebraic equation the complex numbers 〈j | O | i〉 are the inner products of the bra state vector 〈j | with the ket state vector O | i〉 and they express how much of each amplitude 〈i | Ψa〉 goes into each term. The operator O has an associated matrix O = [Oji] with the complex elements

Oji = 〈j | O | i〉.

This matrix can be transformed from one vector basis to another, i.e. from one representation to another. When we want to obtain the results of a particular transformation in a specific basis we have to give the components of the state vector and we have to identify the operator O by its matrix O = [Oji] in terms of the same set of basis states. Once we know the matrix for one particular set of basis states, we can calculate the corresponding matrix for another basis.

In general we do not have to specify a particular set of basis states; any basis will do. Thus, the original equation where the basis states are not specified, | Ψb〉 = O | Ψa〉, is convenient to use.

We now discuss the procedure to construct an operator given its desired function. Consider a two-dimensional Hilbert space with basis vectors (| 0〉, | 1〉). We introduce an operator O whose function is to interchange the projection of a state vector on | 0〉 with the one on | 1〉. Thus O is defined as

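\[
O = |0\rangle\langle 1| + |1\rangle\langle 0|.
\]

The matrix expression of this operator is

\[
O =
\begin{pmatrix} 1 \\ 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \end{pmatrix}
+ \begin{pmatrix} 0 \\ 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]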

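When we apply O to a ket state vector | Ψa〉 we obtain another state vector | Ψb〉

\[
|\Psi_b\rangle = O |\Psi_a\rangle = (|0\rangle\langle 1| + |1\rangle\langle 0|)\, (\alpha_0 |0\rangle + \alpha_1 |1\rangle)
\]

then

\[
|\Psi_b\rangle = O |\Psi_a\rangle = \alpha_0 \left[ (|0\rangle\langle 1| + |1\rangle\langle 0|)\, |0\rangle \right] + \alpha_1 \left[ (|0\rangle\langle 1| + |1\rangle\langle 0|)\, |1\rangle \right].
\]

Knowing that 〈0 | 1〉 = 〈1 | 0〉 = 0 and that 〈0 | 0〉 = 〈1 | 1〉 = 1 it is easy to show that

\[
\begin{aligned}
(|0\rangle\langle 1|)\, |0\rangle &= |0\rangle (\langle 1|0\rangle) = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \\
(|1\rangle\langle 0|)\, |0\rangle &= |1\rangle (\langle 0|0\rangle) = |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \\
(|0\rangle\langle 1|)\, |1\rangle &= |0\rangle (\langle 1|1\rangle) = |0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\
(|1\rangle\langle 0|)\, |1\rangle &= |1\rangle (\langle 0|1\rangle) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\end{aligned}
\]

thus,

\[
|\Psi_b\rangle = O |\Psi_a\rangle = \alpha_0 |1\rangle + \alpha_1 |0\rangle.
\]

Similarly, for a bra state vector we obtain

\[
\langle\Psi_b| = \langle\Psi_a| O = (\alpha_0^* \langle 0| + \alpha_1^* \langle 1|)\, (|0\rangle\langle 1| + |1\rangle\langle 0|) = \alpha_0^* \langle 1| + \alpha_1^* \langle 0|.
\]

The same results could be obtained using the matrix formalism

\[
|\Psi_b\rangle = O |\Psi_a\rangle \Longrightarrow
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix}
= \begin{pmatrix} \alpha_1 \\ \alpha_0 \end{pmatrix}
\Longrightarrow |\Psi_b\rangle = \alpha_1 |0\rangle + \alpha_0 |1\rangle
\]

and

\[
\langle\Psi_b| = \langle\Psi_a| O \Longrightarrow
\begin{pmatrix} \alpha_0^* & \alpha_1^* \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} \alpha_1^* & \alpha_0^* \end{pmatrix}
\Longrightarrow \langle\Psi_b| = \alpha_1^* \langle 0| + \alpha_0^* \langle 1|.
\]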
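The construction of O from outer products, and its action on a state, can be reproduced in a few lines. The sketch below uses plain Python lists; the helper names `outer`, `add`, and `apply`, and the sample amplitudes, are assumptions made for this illustration.

```python
def outer(ket, bra_source):
    """Matrix of |ket><bra_source|; the bra components are complex-conjugated."""
    return [[k * b.conjugate() for b in bra_source] for k in ket]

def add(m1, m2):
    """Entry-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def apply(m, v):
    """Matrix times column vector."""
    return [sum(m_ji * v_i for m_ji, v_i in zip(row, v)) for row in m]

ket0, ket1 = [1, 0], [0, 1]

# O = |0><1| + |1><0|
O = add(outer(ket0, ket1), outer(ket1, ket0))

# Sample amplitudes alpha_0 = 0.6, alpha_1 = 0.8i (a normalized state).
psi_a = [0.6, 0.8j]
psi_b = apply(O, psi_a)   # the two amplitudes are interchanged
```

The matrix `O` has the entries 〈j | O | i〉 derived above, and `psi_b` carries the amplitude α1 on | 0〉 and α0 on | 1〉.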

Let us now turn our attention to another type of operator that plays an important role in the measurement process, the projection operator (projector). Recall that any state vector has unit length. The outer product of any state vector with itself is a projection operator

\[
|\Psi_a\rangle\langle\Psi_a| = P_{\Psi_a}
\]

A projection operator has the following property

\[
(P_{\Psi_a})^2 = |\Psi_a\rangle\langle\Psi_a|\Psi_a\rangle\langle\Psi_a| = |\Psi_a\rangle\langle\Psi_a| = P_{\Psi_a}
\]

Two projectors Pi, Pj are orthogonal if, for every state | Ψa〉 in the Hilbert space Hn,

\[
P_i P_j |\Psi_a\rangle = 0.
\]

This condition is often written as PiPj = 0.

We now give several examples of quantum mechanics operators. In these examples x, y, and z represent indexing variables, rather than the coordinates of a three-dimensional Hilbert space.

(i) The rotation operator Ry(θ) takes a state | Ψa〉 into a new state, in fact the old state as seen in a rotated coordinate system.

(ii) The inversion (parity) operator P creates a new state by reversing all coordinates.

(iii) The spin 1/2 operators given by the Pauli matrices σx, σy, σz.

(iv) The displacement operator Dx(L) along the x-axis, by distance L. If we cause a small displacement ∆x along the x-axis, the state | Ψa〉 is transformed into another state | Ψb〉

| Ψb〉 = Dx(∆x) | Ψa〉

If ∆x goes to zero, then | Ψb〉 should be the initial state | Ψa〉. Thus

Dx(0) = 1.

For infinitesimally small displacement distances, δx, the change of Dx should be proportional to δx. The coefficient of proportionality is the product of a constant, which turns out to be $i/\hbar$, with the momentum operator for the x-component, px. Thus

\[
D_x(\delta x) = 1 + \frac{i}{\hbar}\, p_x\, \delta x.
\]

This expression serves as the definition of the momentum operator px.

We now give several examples of rotation matrices.

(a) Rotation matrices for spin 1/2 about the z-axis, Rz(φ). There are two basis states, | +〉, spin “up”, and | −〉, spin “down”. The spin numbers in these states are respectively s = +1/2 and s = −1/2.

\[
\begin{array}{c|cc}
R_z(\phi) & |+\rangle & |-\rangle \\ \hline
\langle +| & e^{+i\phi/2} & 0 \\
\langle -| & 0 & e^{-i\phi/2}
\end{array}
\]

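(b) Rotation matrices for spin 1 around the z- and y-axis, Rz(φ) and Ry(θ). There are three basis states | +〉, | 0〉 and | −〉. The spin numbers in these states are respectively s = +1, s = 0, and s = −1.

\[
\begin{array}{c|ccc}
R_z(\phi) & |+\rangle & |0\rangle & |-\rangle \\ \hline
\langle +| & e^{+i\phi} & 0 & 0 \\
\langle 0| & 0 & 1 & 0 \\
\langle -| & 0 & 0 & e^{-i\phi}
\end{array}
\]

The Ry(θ) matrix depends upon the particular choice of phase

\[
\begin{array}{c|ccc}
R_y(\theta) & |+\rangle & |0\rangle & |-\rangle \\ \hline
\langle +| & \tfrac{1}{2}(1 + \cos\theta) & +\tfrac{1}{\sqrt{2}}\sin\theta & \tfrac{1}{2}(1 - \cos\theta) \\
\langle 0| & -\tfrac{1}{\sqrt{2}}\sin\theta & \cos\theta & +\tfrac{1}{\sqrt{2}}\sin\theta \\
\langle -| & \tfrac{1}{2}(1 - \cos\theta) & -\tfrac{1}{\sqrt{2}}\sin\theta & \tfrac{1}{2}(1 + \cos\theta)
\end{array}
\]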

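(c) Rotation matrices for photons of circular polarization in the xy-plane, Rz(φ). The xy-plane is perpendicular to the flight path of the photons (the z-direction). The photons could have Right Hand Circular (RHC) or Left Hand Circular (LHC) polarization. There are two basis states

\[
\begin{aligned}
|R\rangle &= \tfrac{1}{\sqrt{2}} (|x\rangle + i |y\rangle), \quad m = +1 \ \text{(RHC polarized)} \\
|L\rangle &= \tfrac{1}{\sqrt{2}} (|x\rangle - i |y\rangle), \quad m = -1 \ \text{(LHC polarized)}
\end{aligned}
\]

\[
\begin{array}{c|cc}
R_z(\phi) & |R\rangle & |L\rangle \\ \hline
\langle R| & e^{+i\phi} & 0 \\
\langle L| & 0 & e^{-i\phi}
\end{array}
\]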
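A rotation must preserve the length of state vectors, so each of the rotation matrices above should be unitary. As a sanity check, the sketch below builds the spin 1 matrix Ry(θ) with the entries tabulated above and verifies that it is orthogonal (the matrix is real, so unitarity reduces to orthogonality). The function names are hypothetical and the check is illustrative only.

```python
import math

def ry_spin1(theta):
    """Spin 1 rotation matrix about the y-axis, rows <+|, <0|, <-|."""
    c, s = math.cos(theta), math.sin(theta)
    r = s / math.sqrt(2)
    return [
        [(1 + c) / 2, r, (1 - c) / 2],
        [-r, c, r],
        [(1 - c) / 2, -r, (1 + c) / 2],
    ]

def times_transpose(m):
    """M M^T for a real square matrix; the identity when M is orthogonal."""
    n = len(m)
    return [
        [sum(m[i][k] * m[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def max_deviation_from_identity(m):
    """Largest absolute difference between m and the identity matrix."""
    n = len(m)
    return max(
        abs(m[i][j] - (1.0 if i == j else 0.0))
        for i in range(n)
        for j in range(n)
    )

deviation = max_deviation_from_identity(times_transpose(ry_spin1(0.9)))
deviation_at_zero = max_deviation_from_identity(ry_spin1(0.0))
```

For any angle the deviation stays at the level of rounding error, and for θ = 0 the matrix reduces to the identity, as a rotation by zero angle should.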

3.10 Spectral Decomposition of a Quantum Operator

Let | Ψ〉 be a vector in Hn and O a normal operator. | Ψ〉 is called an eigenvector and λ an eigenvalue of O if

\[
O |\Psi\rangle = \lambda |\Psi\rangle.
\]

The identity operator I has unity (1) as an eigenvalue

\[
I |\Psi\rangle = 1\, |\Psi\rangle.
\]

The equation defining the eigenvector and the eigenvalue of O can be rewritten as

\[
O |\Psi\rangle = \lambda I |\Psi\rangle
\]

or

\[
(O - \lambda I)\, |\Psi\rangle = 0.
\]

Let | u1〉, | u2〉, . . . | ui〉, . . . | un〉 be an orthonormal basis in Hn and express the state vector as

\[
|\Psi\rangle = \sum_{i=1}^{n} \gamma_i |u_i\rangle.
\]

Then the previous equation becomes

\[
(O - \lambda I) \sum_{i=1}^{n} \gamma_i |u_i\rangle = 0.
\]

Both operators O and I can be expressed as matrices, O = [Oij] and I = [δij]. Then the previous equation involving O and I is transformed into a matrix equation

\[
\sum_{j=1}^{n} (O_{ij} - \lambda \delta_{ij})\, \gamma_j = 0, \qquad 1 \le i \le n,
\]

which has a non-trivial solution if and only if det(O − λI) = 0. The eigenvector of the operator O corresponding to the eigenvalue λ can be found by solving the system of n linear equations above.

By definition an observable is any Hermitian operator whose eigenvectors form a basis. The following statements are true but will not be proved here:

(i) The eigenvalues of a Hermitian operator are real numbers.

(ii) Eigenvectors corresponding to different eigenvalues are mutually orthogonal.

(iii) If two Hermitian operators commute they have a common basis of orthonormal eigenvectors, a so-called eigenbasis. If they do not commute, then no common eigenbasis exists.

(iv) A complete set of commuting observables is the minimal set of Hermitian operators with a unique common eigenbasis.

In a finite-dimensional vector space, every normal operator has a complete set of orthonormal eigenvectors. Thus in Hn every normal operator N has n eigenvectors, | ni〉, 1 ≤ i ≤ n, each corresponding to an eigenvalue λi

\[
N |n_i\rangle = \lambda_i |n_i\rangle.
\]

Every state vector | Ψa〉 ∈ Hn can be expressed in the basis formed by the n eigenvectors of the normal operator N

\[
|\Psi_a\rangle = \sum_i \alpha_i |n_i\rangle.
\]

Since the eigenvectors are orthonormal, the normalization of | Ψa〉 implies that the squared moduli of the amplitudes sum to unity

\[
\sum_i |\alpha_i|^2 = 1.
\]

Now the transformation of the state vector performed by the normal operator N can be expressed as

\[
N |\Psi_a\rangle = N \sum_i \alpha_i |n_i\rangle = \sum_i \alpha_i N |n_i\rangle = \sum_i \alpha_i \lambda_i |n_i\rangle.
\]

Recall that the outer product of any state vector with itself is a projection operator

\[
|\Psi_a\rangle\langle\Psi_a| = P_{\Psi_a}
\]

Let us construct the projection operators of the basis vectors | ni〉

\[
P_i = |n_i\rangle\langle n_i|.
\]

Now we apply the projection operator to the state vector | Ψa〉

\[
P_i |\Psi_a\rangle = |n_i\rangle\langle n_i| \sum_j \alpha_j |n_j\rangle = \sum_j \alpha_j |n_i\rangle\langle n_i|n_j\rangle = \sum_j \alpha_j |n_i\rangle (\langle n_i|n_j\rangle).
\]

But 〈ni | nj〉 = δij, hence

\[
P_i |\Psi_a\rangle = \alpha_i |n_i\rangle.
\]

We substitute Pi | Ψa〉 for αi | ni〉 in the expression of N | Ψa〉

\[
N |\Psi_a\rangle = \sum_i \alpha_i \lambda_i |n_i\rangle = \sum_i \lambda_i P_i |\Psi_a\rangle.
\]

From this expression we conclude that

\[
N = \sum_i \lambda_i P_i
\]

and this means that this spectral decomposition of the normal operator N is true for any state and it is independent of the basis. An observable specifies an exhaustive measurement through its spectral decomposition.

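Finally, we give an example of spectral decomposition of an operator in H2. Consider an operator N with two eigenvalues λa and λb and with the corresponding orthonormal eigenvectors | a〉 and | b〉

\[
|a\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle \leftrightarrow \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix},
\qquad
|b\rangle = \beta_0 |0\rangle + \beta_1 |1\rangle \leftrightarrow \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}.
\]

The projection operators corresponding to these eigenvectors can be written, respectively

\[
P_a = |a\rangle\langle a| =
\begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix}
\begin{pmatrix} \alpha_0^* & \alpha_1^* \end{pmatrix}
= \begin{pmatrix} |\alpha_0|^2 & \alpha_0\alpha_1^* \\ \alpha_1\alpha_0^* & |\alpha_1|^2 \end{pmatrix}
\]

\[
P_b = |b\rangle\langle b| =
\begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}
\begin{pmatrix} \beta_0^* & \beta_1^* \end{pmatrix}
= \begin{pmatrix} |\beta_0|^2 & \beta_0\beta_1^* \\ \beta_1\beta_0^* & |\beta_1|^2 \end{pmatrix}
\]

We can use the spectral decomposition to write the operator N using the matrix notation

\[
N \leftrightarrow
\lambda_a \begin{pmatrix} |\alpha_0|^2 & \alpha_0\alpha_1^* \\ \alpha_1\alpha_0^* & |\alpha_1|^2 \end{pmatrix}
+ \lambda_b \begin{pmatrix} |\beta_0|^2 & \beta_0\beta_1^* \\ \beta_1\beta_0^* & |\beta_1|^2 \end{pmatrix}.
\]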
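The H2 example can be made concrete. The sketch below, in plain Python, assembles an operator from chosen eigenvalues and orthonormal eigenvectors and checks that the eigenvalue equation holds. The eigenvalues ±1, the eigenvectors (| 0〉 ± | 1〉)/√2, and all function names are assumptions picked for this illustration.

```python
import math

def projector(v):
    """Matrix of |v><v| for a normalized vector v."""
    return [[vi * complex(vj).conjugate() for vj in v] for vi in v]

def spectral_operator(eigenpairs):
    """N = sum_i lambda_i |n_i><n_i| for orthonormal eigenvectors n_i."""
    dim = len(eigenpairs[0][1])
    n = [[0j] * dim for _ in range(dim)]
    for lam, vec in eigenpairs:
        p = projector(vec)
        for r in range(dim):
            for c in range(dim):
                n[r][c] += lam * p[r][c]
    return n

def apply(m, v):
    """Matrix times column vector."""
    return [sum(m_rc * v_c for m_rc, v_c in zip(row, v)) for row in m]

s = 1 / math.sqrt(2)
a = [s, s]      # eigenvector with eigenvalue +1 (assumed for this example)
b = [s, -s]     # eigenvector with eigenvalue -1 (assumed for this example)

N = spectral_operator([(1.0, a), (-1.0, b)])
Na = apply(N, a)    # equals +1 * a up to rounding
Nb = apply(N, b)    # equals -1 * b up to rounding
```

With this particular choice the assembled matrix coincides, up to rounding, with the operator | 0〉〈1 | + | 1〉〈0 | constructed in Section 3.9, which interchanges the two basis amplitudes.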

3.11 The Measurement of Observables

So far we have learned that the numerical outcome of a measurement of the observable O is an eigenvalue, λi, of the operator O. Immediately after the measurement, the quantum state is the eigenstate | oi〉 of O with the measured eigenvalue λi. Quantum mechanics postulates that mutually exclusive measurement outcomes correspond to orthogonal projection operators (projectors) {P0, P1, . . .}.

Given the Hilbert space Hn, a complete set of orthogonal projectors is a set {P0, P1, . . . Pi . . . Pm−1} such that

\[
\sum_{i=0}^{m-1} P_i = I.
\]

It follows that the number of projectors in a complete orthogonal set must be less than, or equal to, the dimension of the Hilbert space

m ≤ n.

The postulate can be reformulated as: a complete set of orthogonal projectors specifies an exhaustive measurement. If we take a closer look at the spectral decomposition of an operator we conclude that an observable specifies an exhaustive measurement. Whenever we say that “we measure an observable” we mean that we measure the set of projectors $P^q_i$ and we associate the eigenvalue $\lambda^q_i$ with the occurrence of the i-th outcome.

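We now compute the probability to obtain the value λi as the outcome of the measurement, if the quantum state just prior to the measurement is | Ψa〉:

\[
\begin{aligned}
\mathrm{Prob}(\lambda_i) &= |\, P_i |\Psi_a\rangle \,|^2 \\
&= (P_i |\Psi_a\rangle)^\dagger\, P_i |\Psi_a\rangle \\
&= \langle\Psi_a| P_i^\dagger P_i |\Psi_a\rangle \\
&= \langle\Psi_a| (P_i)^2 |\Psi_a\rangle \\
&= \langle\Psi_a| P_i |\Psi_a\rangle
\end{aligned}
\]

In this derivation we used the fact that Pi is Hermitian and that

\[
P_i^2 = (|n_i\rangle\langle n_i|)\, (|n_i\rangle\langle n_i|) = |n_i\rangle (\langle n_i|n_i\rangle) \langle n_i| = |n_i\rangle\langle n_i| = P_i.
\]

The total probability for all possible outcomes of a measurement is

\[
\sum_{i=0}^{m-1} \mathrm{Prob}(\lambda_i) = 1
\]

due to the completeness of the set of projectors. The normalized pure quantum state after a measurement with outcome λi is

\[
\frac{P_i |\Psi_a\rangle}{\sqrt{\langle\Psi_a| P_i |\Psi_a\rangle}}.
\]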
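The probability rule and the post-measurement state can be sketched in plain Python for rank-one projectors | n〉〈n |. The function name `measure_outcome` and the sample state are hypothetical, introduced only to illustrate the formulas above.

```python
import math

def measure_outcome(basis_vec, state):
    """Return (probability, post-measurement state) for the projector |n><n|.

    Uses Prob = <state|P|state> = |<n|state>|^2 and the normalized state
    P|state> / sqrt(Prob). Returns None as the state if the outcome
    has probability zero.
    """
    amp = sum(complex(n).conjugate() * s for n, s in zip(basis_vec, state))
    prob = abs(amp) ** 2
    if prob == 0:
        return 0.0, None
    projected = [amp * n for n in basis_vec]      # P|state> = <n|state> |n>
    norm = math.sqrt(prob)
    return prob, [p / norm for p in projected]

state = [0.6, 0.8j]                               # normalized sample state
p0, post0 = measure_outcome([1, 0], state)        # outcome associated with |0>
p1, post1 = measure_outcome([0, 1], state)        # outcome associated with |1>
total = p0 + p1
```

The two probabilities add up to one because the projectors | 0〉〈0 | and | 1〉〈1 | form a complete set, and each post-measurement state has unit length.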

As an example consider the case of a two-dimensional Hilbert space with an orthonormal basis formed by the vectors | x〉 and | y〉. Consider two state vectors | Ψa〉, | Ψb〉 ∈ H2

\[
|\Psi_a\rangle = a_x |x\rangle + a_y |y\rangle \quad \text{and} \quad |\Psi_b\rangle = b_x |x\rangle + b_y |y\rangle.
\]

On this system we perform a large number M of measurements corresponding to the projectors

\[
P_x = |x\rangle\langle x| \quad \text{and} \quad P_y = |y\rangle\langle y|.
\]

We assume that before the measurement the system is a mixed ensemble of quantum states: it could be in state | Ψa〉 with probability p or in state | Ψb〉 with probability 1 − p

\[
\mathrm{Prob}(|\Psi_a\rangle) = p \quad \text{and} \quad \mathrm{Prob}(|\Psi_b\rangle) = 1 - p.
\]

The probability that a system in state | Ψa〉 before the measurement will produce the outcome λx is

\[
\mathrm{Prob}(\lambda_x \mid \Psi_a) = \langle\Psi_a| P_x |\Psi_a\rangle = |a_x|^2
\]

while the probability that a system in state | Ψb〉 before the measurement will produce the outcome λx is

\[
\mathrm{Prob}(\lambda_x \mid \Psi_b) = \langle\Psi_b| P_x |\Psi_b\rangle = |b_x|^2.
\]

We try to predict the number of times mx, out of the total number M of measurements, in which we expect to obtain the measurement outcome corresponding to the basis vector | x〉.

According to probability theory

\[
\begin{aligned}
m_x &= M\, [\mathrm{Prob}(|\Psi_a\rangle)\, \mathrm{Prob}(\lambda_x \mid \Psi_a) + \mathrm{Prob}(|\Psi_b\rangle)\, \mathrm{Prob}(\lambda_x \mid \Psi_b)] \\
&= M\, [p\, \langle\Psi_a| P_x |\Psi_a\rangle + (1 - p)\, \langle\Psi_b| P_x |\Psi_b\rangle] \\
&= M\, [p\, |a_x|^2 + (1 - p)\, |b_x|^2].
\end{aligned}
\]

Since 0 ≤ p ≤ 1, the quantity mx/M is bounded from below by the smaller of $|a_x|^2$ and $|b_x|^2$

\[
\frac{m_x}{M} = p\, |a_x|^2 + (1 - p)\, |b_x|^2.
\]

Now we consider that the system is in a coherent superposition of the states | Ψa〉 and | Ψb〉 with amplitudes γa and γb. | Ψ(γa, γb)〉 is a pure state

\[
\begin{aligned}
|\Psi(\gamma_a, \gamma_b)\rangle &= \gamma_a |\Psi_a\rangle + \gamma_b |\Psi_b\rangle \\
&= (\gamma_a a_x + \gamma_b b_x)\, |x\rangle + (\gamma_a a_y + \gamma_b b_y)\, |y\rangle.
\end{aligned}
\]

Here, γa and γb are chosen such that the state | Ψ(γa, γb)〉 is normalized. In this case, the probability of a measurement outcome corresponding to the basis vector | x〉 is

\[
\begin{aligned}
p_x &= \langle\Psi(\gamma_a, \gamma_b)| P_x |\Psi(\gamma_a, \gamma_b)\rangle \\
&= |\gamma_a a_x + \gamma_b b_x|^2.
\end{aligned}
\]

In certain cases it is possible to choose γa and γb such that

\[
\gamma_a a_x = -\gamma_b b_x \neq 0.
\]

Then px = 0 due to the phenomenon of destructive interference, even though $|a_x|^2, |b_x|^2 > 0$. This reflects an important distinction between coherent superpositions and incoherent admixtures. A coherent superposition produces a single pure state, in which case destructive interference is possible; an incoherent admixture produces a mixed ensemble of quantum states, and destructive interference is not possible in that case.
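The contrast between an incoherent mixture and a coherent superposition can be illustrated numerically. In the sketch below (plain Python) the two states and the amplitudes γa, γb are sample values chosen so that γa ax = −γb bx; the function names are hypothetical.

```python
import math

def prob_x_mixture(p, ax, bx):
    """Probability of outcome x for a mixture: p|ax|^2 + (1-p)|bx|^2."""
    return p * abs(ax) ** 2 + (1 - p) * abs(bx) ** 2

def prob_x_superposition(ga, gb, a, b):
    """Probability of outcome x for ga|Psi_a> + gb|Psi_b>.

    a = (ax, ay), b = (bx, by). The superposition is normalized here,
    so ga and gb need only be given up to a common factor.
    """
    cx = ga * a[0] + gb * b[0]
    cy = ga * a[1] + gb * b[1]
    norm2 = abs(cx) ** 2 + abs(cy) ** 2
    return abs(cx) ** 2 / norm2

s = 1 / math.sqrt(2)
psi_a = (s, s)       # ax = ay = 1/sqrt(2)
psi_b = (s, -s)      # bx = 1/sqrt(2), by = -1/sqrt(2)

mixed = prob_x_mixture(0.5, psi_a[0], psi_b[0])
coherent = prob_x_superposition(s, -s, psi_a, psi_b)   # ga*ax = -gb*bx
```

For the equal-weight mixture the outcome x occurs half of the time, while for the coherent superposition with opposite amplitudes it never occurs, although each component state alone would produce it with probability 1/2.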

3.12 More about Measurements. The Density Operator

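Let us assume that we have N independent versions of a quantum system, each in some quantum state | ψi〉, with 1 ≤ i ≤ N. We measure the same observable A on all N versions of the system and we want to compute the ensemble average of A, which we call 〈A〉 [1].

Let {φa} denote the set of vectors of an orthonormal basis and assume that these orthonormal vectors are the eigenstates of a Hermitian operator.

〈A〉 is an ensemble average:

\[
\langle A\rangle = \frac{1}{N} \sum_{i=1}^{N} \langle\psi_i| A |\psi_i\rangle.
\]

But

\[
\langle\psi_i| A |\psi_i\rangle = \sum_a \langle\psi_i|\phi_a\rangle \langle\phi_a| A |\psi_i\rangle = \sum_a \langle\phi_a| A |\psi_i\rangle \langle\psi_i|\phi_a\rangle,
\]

and

\[
\langle A\rangle = \frac{1}{N} \sum_{i=1}^{N} \sum_a \langle\phi_a| A |\psi_i\rangle \langle\psi_i|\phi_a\rangle
= \sum_a \langle\phi_a|\, A \left( \frac{1}{N} \sum_{i=1}^{N} |\psi_i\rangle\langle\psi_i| \right) |\phi_a\rangle.
\]

If we denote

\[
\rho = \frac{1}{N} \sum_{i=1}^{N} |\psi_i\rangle\langle\psi_i|
\]

then the ensemble average 〈A〉 can be written as

\[
\langle A\rangle = \sum_a \langle\phi_a| A \rho |\phi_a\rangle = \mathrm{Tr}(A\rho) = \mathrm{Tr}(\rho A).
\]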

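If Pa = | φa〉〈φa | is the projector onto the eigenstate φa of the Hermitian operator, then the probability to find the system in a particular state φa is

\[
\mathrm{Prob}(\phi_a) = \frac{1}{N} \sum_{i=1}^{N} |\langle\psi_i|\phi_a\rangle|^2 = \mathrm{Tr}(\rho P_a).
\]

When {φa} form an orthonormal basis, the density matrix is Hermitian. Indeed, its elements are

\[
\rho_{ba} = \frac{1}{N} \sum_{i=1}^{N} \langle\psi_i|\phi_a\rangle \langle\psi_i|\phi_b\rangle^*,
\]

so that $\rho_{ab} = \rho_{ba}^*$.

The trace of ρ is the expected value of the identity operator

\[
\mathrm{Tr}(\rho) = 1.
\]

If all N independent systems are in the same state | ψi〉 = | ψ〉 then

\[
\mathrm{Tr}(\rho^2) = \mathrm{Tr}(\rho) = 1.
\]

Indeed,

\[
\rho_{ba} = \frac{1}{N} \sum_{i=1}^{N} \langle\psi|\phi_a\rangle \langle\psi|\phi_b\rangle^* = \langle\psi|\phi_a\rangle \langle\psi|\phi_b\rangle^*,
\]

that is, ρ = | ψ〉〈ψ | is a projector and ρ² = ρ.

The eigenvalue λa corresponding to the eigenvector | φa〉 of ρ, defined by the relation

\[
\rho |\phi_a\rangle = \lambda_a |\phi_a\rangle,
\]

is non-negative

\[
\lambda_a = \rho_{aa} = \frac{1}{N} \sum_{i=1}^{N} \langle\psi_i|\phi_a\rangle \langle\psi_i|\phi_a\rangle^* = \frac{1}{N} \sum_{i=1}^{N} |\langle\psi_i|\phi_a\rangle|^2 \ge 0.
\]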

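It is also easy to see that

\[
0 \le \mathrm{Tr}(\rho^2) \le 1.
\]

Indeed, in the eigenbasis of ρ, where all λa ≥ 0,

\[
\mathrm{Tr}(\rho^2) = \sum_a (\rho^2)_{aa} = \sum_a \lambda_a^2 \le \left[ \sum_a \lambda_a \right]^2 = 1.
\]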
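The properties Tr(ρ) = 1 and Tr(ρ²) ≤ 1 can be checked on small ensembles. The following is a minimal sketch in plain Python; the function names and the two sample ensembles are assumptions made for illustration.

```python
import math

def density_matrix(states):
    """rho = (1/N) sum_i |psi_i><psi_i| for a list of normalized state vectors."""
    n, dim = len(states), len(states[0])
    rho = [[0j] * dim for _ in range(dim)]
    for psi in states:
        for r in range(dim):
            for c in range(dim):
                rho[r][c] += psi[r] * complex(psi[c]).conjugate() / n
    return rho

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def matmul(a, b):
    return [
        [sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
        for r in range(len(a))
    ]

def purity(rho):
    """Tr(rho^2); equal to 1 only when all systems are in the same state."""
    return trace(matmul(rho, rho)).real

s = 1 / math.sqrt(2)
rho_pure = density_matrix([[s, s]] * 4)          # four copies of one state
rho_mixed = density_matrix([[1, 0], [0, 1]])     # half in |0>, half in |1>

# Ensemble average of an observable A as Tr(rho A); A = diag(1, -1) is a sample choice.
A = [[1, 0], [0, -1]]
average_A = trace(matmul(rho_pure, A)).real
```

The identical-state ensemble has Tr(ρ²) = 1, while the equal mixture of | 0〉 and | 1〉 has Tr(ρ²) = 1/2; both have unit trace.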


3.13 Young’s Double-Slit Experiments

To understand the difference between corpuscular and wave behavior we consider classical systems first and briefly discuss two double-slit experiments, one involving bullets and the other water waves. In the first experiment, a gun shoots bullets at random at a barrier with two slits a and b. The bullets go through the two slits, some of them may bounce off the edges of the slits, and they are collected by a backstop. We repeat the experiment three times, first with slit a open, then with slit b open, and finally with both slits open. For each experiment we place a mobile detector at the lower edge of the backstop, x = xmin, fire n bullets, and count the number of bullets reaching the detector. Then we move the detector to a new position, fire the same number of bullets, count them, and repeat the process until we reach the upper edge of the backstop, x = xmax.

[Figure 13 schematic: gun, barrier with slit a and slit b, backstop with mobile detector at position x between xmin and xmax; curves Ia(x), Ib(x), and Ia+b(x) = Ia(x) + Ib(x).]

Figure 13: The double slit experiment with bullets. If Ia(x), Ib(x), and Ia+b(x) are the number of bullets recorded at position x when only slit a is open, only slit b is open, and both slits are open, respectively, then Ia+b(x) = Ia(x) + Ib(x) ∀x ∈ (xmin, xmax).

Call Ia(x), Ib(x), and Ia+b(x) the number of bullets recorded at position x when only slit a is open, only slit b is open, and both slits are open, respectively. We observe that

Ia+b(x) = Ia(x) + Ib(x) ∀x ∈ (xmin, xmax).

The three functions Ia(x), Ib(x), and Ia+b(x) are plotted in Figure 13. We do not observe any interference pattern in Ia+b.

A similar setup is provided for a second experiment shown in Figure 14. In this case, water waves produced by a wave source arrive at the barrier, penetrate through the slits and reach the mobile detector. The detector is mounted on the absorber wall behind the barrier to avoid reflection of the waves. The mathematical description of the wave reaching the absorber wall is

\[
w(t) = \beta e^{i\omega t} = \beta [\cos(\omega t) + i \sin(\omega t)]
\]

with β a complex number.

The detector is a device which measures the instantaneous height of the wave and converts it to the intensity of the wave, $I(x) = |\beta|^2$, as a function of position. When both slits are open, the result of the measurement is a curve I(x) with maxima and minima.

[Figure 14 schematic: wave source, barrier with slit a and slit b, absorbing wall with mobile detector at position x between xmin and xmax; curves Ia(x), Ib(x), and Ia+b(x).]

Figure 14: The double slit experiment with water waves. The wave coming through slit a is $\beta_a e^{i\omega t}$ and its intensity is $I_a = |\beta_a|^2$; the wave coming through slit b is $\beta_b e^{i\omega t}$ and its intensity is $I_b = |\beta_b|^2$. When both slits are open the height of the wave reaching the detector is the sum of the individual waves coming through slit a and slit b, i.e., $(\beta_a + \beta_b) e^{i\omega t}$, and its intensity is $I_{a+b} = |\beta_a + \beta_b|^2$. In this case $I_{a+b} = I_a + I_b + 2\sqrt{I_a I_b} \cos\delta \neq I_a + I_b$.

As before, we repeat the experiment three times, once with only slit a open, then with only slit b open, and finally with both slits open, and obtain three curves, Ia(x), Ib(x), and Ia+b(x). Now we observe that Ia and Ib have a maximum centered at the respective slit position. The maxima and minima of Ia+b are the result of constructive and, respectively, destructive interference of the waves. We also see that Ia+b ≠ Ia + Ib.

We assume that the wave coming through slit a has the height $\beta_a e^{i\omega t}$ and the intensity $I_a = |\beta_a|^2$, while the wave coming through slit b has the height $\beta_b e^{i\omega t}$ and the intensity $I_b = |\beta_b|^2$. When both slits are open the height of the wave reaching the detector is the sum of the individual waves coming through slit a and slit b, i.e., $(\beta_a + \beta_b) e^{i\omega t}$, and its intensity is $I_{a+b} = |\beta_a + \beta_b|^2$. Here,

\[
|\beta_a + \beta_b|^2 = |\beta_a|^2 + |\beta_b|^2 + 2\, |\beta_a|\, |\beta_b| \cos\delta
\]

where δ is the phase difference between βa and βb. Hence

\[
I_{a+b} = I_a + I_b + 2\sqrt{I_a I_b}\, \cos\delta.
\]
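The interference formula can be checked against the direct sum of complex amplitudes. This is a small sketch in plain Python; the function names and the sample amplitudes are hypothetical.

```python
import cmath
import math

def intensity_direct(beta_a, beta_b):
    """|beta_a + beta_b|^2, the intensity with both slits open."""
    return abs(beta_a + beta_b) ** 2

def intensity_formula(beta_a, beta_b):
    """I_a + I_b + 2 sqrt(I_a I_b) cos(delta), delta = phase difference."""
    i_a, i_b = abs(beta_a) ** 2, abs(beta_b) ** 2
    delta = cmath.phase(beta_a) - cmath.phase(beta_b)
    return i_a + i_b + 2 * math.sqrt(i_a * i_b) * math.cos(delta)

beta_a = 1.5 * cmath.exp(0.3j)     # sample complex amplitudes
beta_b = 0.8 * cmath.exp(2.1j)

direct = intensity_direct(beta_a, beta_b)
formula = intensity_formula(beta_a, beta_b)

# Equal magnitudes and opposite phases give complete destructive interference.
dark = intensity_formula(1, -1)
```

The two expressions agree for arbitrary amplitudes, and the intensity vanishes when the waves have equal magnitude and a phase difference of π.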

In this experiment we observe the interference among the waves diffracted by slit a and those diffracted by slit b.

Let us now consider a third experiment, this time involving electrons, see Figure 15, where we expect to notice quantum effects. The electrons produced by an electron gun pass through one of the slits or through both and then are recorded by a Geiger counter. The electrons arrive at the Geiger counter individually, very much like the bullets of the first experiment. Indeed, the Geiger counter emits a sound, a “click”, whose intensity is proportional to the electric charge. The fact that we hear distinct clicks of the same intensity proves that the electrons are indistinguishable and that they arrive individually, rather than in packets of more than one, or as fractional entities.

First, we measure electrons coming through one slit at a time, either slit a or slit b. The number of electrons measured at a given spot gives the current intensity, which is proportional to the rate of clicks. The intensities for electrons coming either through slit a or slit b are Ia and Ib, respectively. The result of the measurement when both slits are open, the curve Ia+b, shows an interference pattern, though the electrons are recorded as individual entities, like the bullets in the first experiment.

We observe experimentally that Ia+b(x) ≠ Ia(x) + Ib(x): the number of electrons that arrive at a particular point when both slits are open is not equal to the number of electrons coming through slit a plus the number of electrons coming through slit b. At certain positions corresponding to the minima of the Ia+b curve the number of electrons detected by the Geiger counter is clearly smaller than the number expected to come through slits a and b.

The only possible explanation is that the electrons arrive like particles and the probability of arrival of these particles is distributed like the intensity of a wave, as if each electron were travelling through both slits at the same time and then interfering with itself upon arrival at the detector.

The probability P of an event in a quantum experiment is given by the square of the absolute value of a complex number α which is called the probability amplitude:

\[
P = |\alpha|^2
\]

When an event can occur in several alternative ways, the probability amplitude for the event is the sum of the probability amplitudes for each way considered separately (the superposition probability rule).

Assume that $I_a = |\alpha_1|^2$ and $I_b = |\alpha_2|^2$, where α1 and α2 are complex numbers. The combined effect of the two slits is $I_{a+b} = |\alpha_1 + \alpha_2|^2$ and the resultant curve is similar to that obtained in the waves case.

Feynman discusses a variation of the last experiment in Chapter 3 of Volume 3 of the “Lecture Notes on Physics” [47]. The aim is to identify the slit each electron passes through. A strong light source is placed behind the wall between the two slits. The electrons have electric charges and scatter the photons of the light beam. In this experiment we hear the click of the Geiger counter signaling an incoming electron and, at the same time, we see a flash of light indicating if the electron has passed through slit a or through slit b.

[Figure 15 schematic: electron gun, wall with slit a and slit b, backstop with Geiger counter at position x; curves I1, I2, and I12.]

Figure 15: The double slit experiment showing electron interference. We observe experimentally that Ia+b(x) ≠ Ia(x) + Ib(x): the number of electrons that arrive at a particular point when both slits are open is not equal to the number of electrons coming only through slit a plus the number of electrons coming through slit b.

Let us call I′a(x), I′b(x), and I′a+b(x) the counterparts of Ia(x), Ib(x), and Ia+b(x) defined for the previous experiment. We notice that I′a(x) and I′b(x) are similar to Ia(x) and Ib(x). Whether the slits are open one at a time or both at the same time, the electrons we see coming through slit a are distributed in the same way, and likewise for those we see coming through slit b. In this case

I′a+b = I′a + I′b = Ia + Ib.

When we identify which slit the electrons are coming through, I′a+b does not look like Ia+b; it shows no interference effect. The photons colliding with the electrons change the momentum and the trajectory of the electrons, not very much, but enough to “smear” the total probability distribution.

We can change the intensity or the wavelength (frequency) of the light. When we reduce the light intensity, we reduce the number of photons flying into the paths of electrons; we can reduce it so far that some electrons coming either through slit a or slit b are not seen before being detected (their presence being marked by a click) because the probability of an electron-photon collision decreases significantly. Assume we record separately the electrons which have been detected and seen as coming either through slit a or slit b, and the electrons which are detected without being seen. The distributions for the electrons seen at slit a and slit b are similar to Ia and Ib, as before, but the distribution I′a+b of the electrons that were not seen is similar to Ia+b, showing an interference pattern.

When we reduce the frequency of the light (thus increase its wavelength) we do not notice any change at first; we record a curve I′a+b without interference until we reach values of the wavelength larger than the distance between the two slits. At that moment we begin to see big fuzzy flashes of light, we can no longer distinguish which slit the electron went through, and we notice the appearance of some interference effect. As the wavelength continues to increase, the change in the electron momentum becomes small enough to observe a curve similar to Ia+b.

In summary, when an event may occur in several alternative ways, the probability amplitude of the event, α, is the sum of the probability amplitudes for each way considered separately. In this case we observe the interference between the separate paths. For example, when there are two possible ways for an event to occur, the probability amplitude and the actual probability of the occurrence of the event are respectively

\[
\alpha = \alpha_1 + \alpha_2
\]

\[
P = |\alpha_1 + \alpha_2|^2 \neq P_1 + P_2
\]

with $P_1 = |\alpha_1|^2$ and $P_2 = |\alpha_2|^2$ the probability of the occurrence of each path.

After we have determined that one of the alternatives was actually taken, the probability of the event is the sum of the probabilities for each alternative. The interference is lost. In the two path case we have

P = P1 + P2.

Heisenberg had suggested that the laws of quantum mechanics could be consistent only if there were some basic limitations, previously not recognized, on our experimental capabilities. His uncertainty principle sets a lower limit, of the order of Planck's constant h, on the product between the position uncertainty and the momentum uncertainty. In fact, the entire theory of quantum mechanics depends on the correctness of the uncertainty principle.

In terms of this experiment Feynman proposes to state Heisenberg’s uncertainty principle in the following way: “It is impossible to design an apparatus to determine which slit the electron passes through, that will not at the same time disturb the electrons enough to destroy the interference pattern”.

The correct interpretation of this experiment would be: if one has an experimental setup capable of determining whether the electrons go through slit a or slit b, then one can say that they go either through slit a or slit b. But, when one does not try to tell which way the electrons go and there is nothing in the experiment to disturb the electrons, then one may not say that an electron goes either through slit a or through slit b.

The motion of all forms of matter (even bullets) must be described in terms of waves. In the experiment with bullets no interference patterns could be observed because the wavelengths of the bullets are extremely short and the finite size of the detector would not allow us to distinguish the individual maxima and minima. The result was an average over all the rapid oscillations of the total probability, hence, the classical curve.

3.14 Stern - Gerlach Type Experiments

An experiment first performed in 1922 by Otto Stern and Walther Gerlach to measure the magnetic moment of an atom played a crucial role in signaling the existence of a new intrinsic property of atoms and particles, the spin. At that time the physicists understood that the electrons orbiting around the nucleus represented an electric current. The existence of this orbital electric current meant that the atom had a magnetic field, characterized by a magnetic dipole moment.

Therefore, atoms placed in a magnetic field would behave like little bar magnets and they should be deflected by the applied field. This is precisely what Stern and Gerlach expected to see during their experiment.

The Stern-Gerlach experimental setup uses a magnet with asymmetric polar caps to create a non-uniform magnetic field. One polar cap is quasi-planar and the other one is wedge-shaped. This configuration results in a non-uniform magnetic field whose z-axis component is normal to the planar cap. The components of the magnetic field along the x and y axes are negligible. Silver atoms evaporated from an oven are collimated with the aid of a slit in a screen and then are “beamed” perpendicular to the gradient of the magnetic field, see Figure 16.

[Figure 16 schematic: an oven emits a beam of silver atoms through a collimator into the non-uniform magnetic field between the N and S poles of the magnet; the beam lands on a photographic plate. Insets: the pattern recorded on the photographic plate when the magnetic field is zero, the classical expectation, and the experimental result recorded with non-zero magnetic field. Axes: x, y, z.]

Figure 16: The Stern-Gerlach experiment with silver atoms. The image recorded on the photographic plate when a non-zero magnetic field is present differs from the classical expectation. The only possible explanation is that the atoms have spins which interact with the non-uniform magnetic field. In the absence of the magnetic field, the image observed on the photographic plate is a darkened image of the slit.

The deflection of each atom depends on the atom’s magnetic moment and the magnetic field generated by the two magnets (the component along the z axis). The atoms emitted by the oven are expected to have their magnetic moments oriented randomly in every direction. Thus, according to the classical view, the atoms were expected to be deflected by the magnetic field at all angles, resulting in a continuous distribution of the number of atoms versus the deflection angles. Instead, the atoms were deflected at a discrete set of angles.

The experiment was repeated in 1927 with hydrogen atoms which have only one electron orbiting around the nucleus. What was very surprising was the number of peaks in the electron distribution seen in that experiment. The hydrogen atoms were considered to have zero magnetic dipole moment and were expected to exit the magnetic field as an undeflected beam. Instead, two beams were observed, one deflected upwards and the other deflected downwards by the magnetic field.

[Figure 17 schematic: a source of hydrogen atoms, a magnet with N and S poles, and a screen.]

Figure 17: The Stern-Gerlach experiment with hydrogen atoms. Hydrogen atoms are expected to exit the magnetic field as an undeflected beam. In this experiment two beams emerge from the magnetic field, one deflected upwards and the other deflected downwards by the magnetic field.

The result was difficult to explain unless it was postulated that the electron in the hydrogen atom had associated with it a quantity called spin, representing an internal rotation of the electron itself. Thus, the magnetic dipole moment of the hydrogen atom has two components, one due to the rotational motion of the electron, and the second one due to its spin.

The spin is the intrinsic angular momentum of the electron and it is in no way associated with the rotation of the electron around the nucleus. The spin (the intrinsic angular momentum) of the electron has an intrinsic magnetic moment associated with it; this magnetic moment is proportional to the value of the spin.

It can be shown that a non-uniform magnetic field acts upon a magnetic moment with a force aligned with the direction of the gradient of the magnetic field. The value of the force is proportional to the gradient of the field and to the component of the magnetic moment (which is proportional to the spin) in the direction of the field gradient. The value of the spin can be estimated from the beam deflection distance on the screen.

In the experimental setup the gradient of the magnetic field was vertical and the initial direction of the electron beam was horizontal. Thus, the electrons had to be deflected upwards or downwards according to the component of their spin in the vertical direction. The electrons whose vertical spin component was “up”, i.e. positive, were deflected upward and those whose vertical spin component was “down”, i.e. negative, were deflected downwards.
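The proportionality between force, field gradient, and magnetic moment component can be illustrated with a rough estimate. This is only a sketch under stated assumptions: the field gradient, magnet length, and atom speed are hypothetical values, not the parameters of the historical experiments. The force is taken as $F_z = \mu_z \, \partial B / \partial z$, and the atom is assumed to accelerate uniformly while it crosses the magnet.

```python
BOHR_MAGNETON = 9.2740100783e-24  # J/T
HYDROGEN_MASS = 1.6735575e-27     # kg


def force_on_moment(mu_z: float, field_gradient: float) -> float:
    """Force along z on a magnetic moment component mu_z in a field gradient dB/dz."""
    return mu_z * field_gradient


def deflection(mu_z: float, field_gradient: float, mass: float,
               magnet_length: float, speed: float) -> float:
    """Vertical displacement accumulated while crossing the magnet.

    Assumes constant force inside the magnet and constant horizontal speed.
    """
    acceleration = force_on_moment(mu_z, field_gradient) / mass
    transit_time = magnet_length / speed
    return 0.5 * acceleration * transit_time ** 2


if __name__ == "__main__":
    # Hypothetical apparatus parameters, chosen only for illustration.
    gradient = 1.0e3   # T/m
    length = 0.05      # m
    speed = 2.0e3      # m/s
    for label, mu_z in (("+mu_B", BOHR_MAGNETON), ("-mu_B", -BOHR_MAGNETON)):
        z = deflection(mu_z, gradient, HYDROGEN_MASS, length, speed)
        print(f"mu_z = {label}: deflection = {z:.3e} m")
```

Because only two values of the moment component occur, the sketch yields exactly two deflections of equal size and opposite sign, rather than a continuous spread.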

3.15 The Spin as an Intrinsic Property

At the time of its discovery, the spin represented a new physical quantity. What is the spin of a particle, for example, an electron? When Pauli concluded that a series of unexpected experimental observations of electrons’ behavior could be explained by accepting the existence of a new quantum number, i.e. a new degree of freedom, Uhlenbeck and Goudsmit associated that new degree of freedom with an internal motion, a spinning, of the electron, different from the motion described by the dynamical variables position and momentum. The introduction of this two-valued quantum number for the electron also led Pauli to the postulation of his exclusion principle. According to Pauli’s exclusion principle, no more than two electrons can occupy the same orbital and those two electrons must have anti-parallel spins. The non-relativistic electron was described by a two-component wave function, where the two components represented the states of spin $\pm 1/2$.

The spin of an electron was predicted as an intrinsic property by Paul Dirac. Dirac was trying to develop a relativistic quantum mechanical wave equation for a relativistic free particle 18. He proposed a differential equation of first order in the space and time derivatives, satisfying all the requirements of special relativity and quantum mechanics. For a non-relativistic ($v \ll c$) electron in a magnetic field this equation reduces to the Schrodinger equation with a correction term. The correction term takes into account the interaction of the intrinsic magnetic moment of the electron with the external magnetic field and predicts the electron spin as an intrinsic property.

Protons and neutrons are also characterized by spin 1/2 as an intrinsic property. The spin of an atom is related to the spins of the electrons orbiting around its nucleus and to the spin of the nucleus itself.

The rotation of the electron represents physically the rotation of an electrical charge which generates a magnetic field parallel to the axis of rotation and, therefore, randomly distributed in space. The electron is characterized by a spin magnetic moment

$$\mu_z = -\frac{e}{2m_e}\, g\, S$$

where $e$ and $m_e$ are the electron charge and mass, respectively, $g$ is the spin gyromagnetic factor, and $S$ is the spin angular momentum. The z-component of the spin angular momentum is

$$S_z = \pm\frac{1}{2}\hbar$$

where + or − specify the orientation “up” or “down”. The value of the z-component of the magnetic moment can be written as

$$\mu_z = \pm\frac{e}{2m_e}\, g\, S_z = \pm\frac{e}{2m_e}\, g\, \frac{1}{2}\hbar = \pm\frac{1}{2}\, g\, \mu_B$$

where the natural constant $\mu_B = e\hbar/2m_e = 9.2740154 \times 10^{-24}$ J/T is called the Bohr magneton. In classical electromagnetism, a spinning sphere of charge can produce a magnetic moment, but the magnitude of the electron spin magnetic moment calculated above cannot be reasonably modelled by considering the electron as a spinning sphere. According to high energy electron-electron scattering, the size of the electron is $< 10^{-18}$ m. A sphere of this size would have to spin at a preposterously high rate of some $10^{32}$ radian/s to match the observed angular momentum.

18 A relativistic particle moves at a speed $v$ close to the speed of light $c$, i.e. $v \approx c$.
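Both numbers quoted above can be checked with a few lines of arithmetic. The sketch below is an illustration, not part of the original text. It uses current CODATA values for the constants, which differ slightly in the last digits from the value printed above, and it assumes a uniform solid sphere with moment of inertia $(2/5) m r^2$ carrying angular momentum $\hbar/2$; the choice of a uniform sphere is an assumption made only to get an order of magnitude.

```python
E_CHARGE = 1.602176634e-19      # C
HBAR = 1.054571817e-34          # J s
M_ELECTRON = 9.1093837015e-31   # kg


def bohr_magneton() -> float:
    """mu_B = e * hbar / (2 * m_e), in J/T."""
    return E_CHARGE * HBAR / (2.0 * M_ELECTRON)


def classical_spin_rate(radius_m: float) -> float:
    """Angular velocity (rad/s) of a uniform solid sphere of electron mass
    and the given radius whose angular momentum equals hbar / 2."""
    moment_of_inertia = 0.4 * M_ELECTRON * radius_m ** 2
    return (HBAR / 2.0) / moment_of_inertia


if __name__ == "__main__":
    print(f"Bohr magneton: {bohr_magneton():.6e} J/T")
    print(f"Spin rate for r = 1e-18 m: {classical_spin_rate(1e-18):.2e} rad/s")
```

The estimated rate comes out in the range of $10^{32}$ radian/s, which is the order of magnitude stated in the text.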

In the Stern-Gerlach experiment, the direction of the measurement was determined by the selection of the magnets and the orientation of the applied magnetic field. The particles used in the experiment were neutral silver atoms. Since the magnetic moments of the electrons in a silver atom cancel out in pairs except for one electron, the resultant magnetic moment is equal to that of the unpaired electron. When only spin is considered, each silver atom behaves like an electron.

The (neutral) atoms are separated into two beams, one deflected upward, the other downward, recorded as two distinct spots on the photographic plate being used as detector, with nothing in between. If the atoms were behaving as classical particles, the image would have been smeared into a single big spot. These two beams correspond to the only two states of spin possible along any given axis.

In quantum mechanics the intrinsic angular momentum, the spin, is quantized and the values it may take are multiples of the rationalized Planck constant $\hbar$ ($\hbar = h/2\pi$). The spin of an atom or of a particle is characterized by the spin quantum number $s$, which may assume integer and half-integer values. For a given value of $s$, the projection of the spin on any axis may assume $2s + 1$ values ranging from $-s$ to $+s$ by unit steps, in other words the spin is quantized. The electron has spin $s = 1/2$ and the spin projection can assume the values $+1/2$, referred to as spin up, and $-1/2$, referred to as spin down.
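The counting rule for the spin projections is easy to enumerate. A minimal sketch follows; the function name `spin_projections` is hypothetical, and exact fractions are used so that half-integer values are represented without rounding.

```python
from fractions import Fraction


def spin_projections(s) -> list:
    """Return the 2s + 1 allowed projections, from -s to +s in unit steps,
    in units of hbar. s must be a non-negative integer or half-integer."""
    s = Fraction(s)
    if s < 0 or (2 * s).denominator != 1:
        raise ValueError("s must be a non-negative integer or half-integer")
    count = int(2 * s) + 1
    return [-s + k for k in range(count)]


if __name__ == "__main__":
    for s in (Fraction(1, 2), 1, Fraction(3, 2)):
        values = ", ".join(str(m) for m in spin_projections(s))
        print(f"s = {s}: {values}")
```

For the electron, `spin_projections(Fraction(1, 2))` yields the two values corresponding to spin down and spin up.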

The electron spin is one of the possible physical implementations of the quantum bit, the qubit. The spin is a measurable quantity, an observable; the spin states for which the probability of obtaining a particular result is unity play a central role in defining the states of a qubit.

3.16 Schrodinger’s Wave Equation

To study the properties of a quantum system we have to know its state as well as the evolution in time of its state. Schrodinger’s equation allows us to describe the state of a stationary system (a system whose state is independent of time) and the evolution in time of a non-stationary system. Erwin Schrodinger proposed the following equation for the wave function $\Psi_n(q)$ of a stationary state of energy $E_n$ of an atom

$$H\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)\Psi_n(q) = E_n \Psi_n(q).$$

The Hamilton function $H(q, p)$ represents the classical energy of the atom when expressed in terms of position $q$ and momentum $p$ coordinates. Schrodinger replaced the momentum variable $p$ by the differential operator $(h/2\pi i)(\partial/\partial q)$. Now, $H$ is the Hamiltonian operator, $\Psi_n$ is the wave function associated with the atom in a stationary state of energy $E_n$.

Schrodinger assumed that the time evolution, the dynamics of the wave function, is governed by another partial differential equation

$$i\frac{h}{2\pi}\frac{\partial}{\partial t}\Psi(q, t) = H\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)\Psi(q, t).$$


Here we have to make two important observations:

(i) The presence of the complex number $i$ in this equation implies that the wave function is complex,

(ii) The solution to this equation, $\Psi(q, t)$, is a function of time and represents the time evolution of the stationary wave function $\Psi_n(q)$

$$\Psi(q, t) = e^{-i\frac{2\pi}{h}E_n t}\,\Psi_n(q).$$
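The role of the phase factor can be checked numerically. The sketch below is an illustration under explicit assumptions that do not come from the text: it uses a particle in a one-dimensional box as the stationary state, natural units with $\hbar = h/2\pi = 1$, and unit mass and box length. It verifies that $i\hbar\,\partial\Psi/\partial t$ equals $E_n \Psi$ up to the error of a finite difference, and that $|\Psi(q, t)|^2$ does not change in time.

```python
import cmath
import math

HBAR = 1.0        # natural units (assumption)
MASS = 1.0        # hypothetical particle mass
BOX_LENGTH = 1.0  # hypothetical box length


def psi_n(q: float, n: int = 1) -> float:
    """Stationary wave function of a particle in a box (illustrative choice)."""
    return math.sqrt(2.0 / BOX_LENGTH) * math.sin(n * math.pi * q / BOX_LENGTH)


def energy_n(n: int = 1) -> float:
    """Energy of the n-th stationary state of the box."""
    return (n * math.pi * HBAR) ** 2 / (2.0 * MASS * BOX_LENGTH ** 2)


def psi(q: float, t: float, n: int = 1) -> complex:
    """Time-dependent solution: phase factor exp(-i E_n t / hbar) times psi_n(q)."""
    return cmath.exp(-1j * energy_n(n) * t / HBAR) * psi_n(q, n)


def schrodinger_residual(q: float, t: float, n: int = 1, dt: float = 1e-6) -> float:
    """|i hbar dPsi/dt - E_n Psi|, with the time derivative taken by central difference."""
    d_psi_dt = (psi(q, t + dt, n) - psi(q, t - dt, n)) / (2.0 * dt)
    return abs(1j * HBAR * d_psi_dt - energy_n(n) * psi(q, t, n))


if __name__ == "__main__":
    q, t = 0.3, 0.7
    print("residual:", schrodinger_residual(q, t))
    print("|Psi(q,t)|^2:", abs(psi(q, t)) ** 2, " |Psi_n(q)|^2:", psi_n(q) ** 2)
```

Since the phase factor has unit modulus, the probability density of a stationary state is the same at all times.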

Schrodinger realized that the wave function of a many-electron atom was not defined in the ordinary, physical, three-dimensional space, as de Broglie had used, but in a more abstract configuration space, i.e., the space described by the coordinates of the positions of all the electrons. The wave associated with this system is different from an electromagnetic wave; the wave function exists in a formal space and its values are complex. In the beginning, the physical interpretation of such a complex function was a problem.

By the end of the same year, 1926, Max Born suggested the probabilistic meaning of the wave function 19. According to Born, the probability density that a particle can be found at a given location is equal to the square of the amplitude of the wave function at that location

$$\mathrm{Prob} = |\Psi(q, t)|^2$$

Born made an analogy between the scattering of an electron colliding with an atom and the diffraction of X-rays and concluded that the electron can be anywhere in the space where the wave function is different from zero, but there is no way to pinpoint its position because this is a random event.

This formula is extremely important and it represents the essence of what the quantum theory can give us. In the world of classical physics the position and the speed of an object can be measured or predicted with 100% certainty (in principle), while in the world of quantum physics our predictions are only statistical in nature. Schrodinger’s equation allows us to make probabilistic predictions; we can determine where a particle will be (if the position observable is considered) only in terms of probabilities of different outcomes, or, equivalently, what proportion of a large number of particles will be found at a specific location. But, the quantum theory does not give any indication as to a specific outcome.

The superposition principle is an essential element of the quantum theory brought forth by Schrodinger’s equation. The equation is linear, therefore, when two functions $\Psi_1$ and $\Psi_2$ are among its solutions, their sum $\Psi_1 + \Psi_2$ is another solution. The corresponding probability is proportional to $|\Psi_1 + \Psi_2|^2$ and it can show interference effects. For example, an electron can be found in a state that is a superposition of two other states. A solution of Schrodinger’s equation for the electron is a sine wave and, as we know, a sum of such sine waves is also a sine wave and thus another solution.

The superposition can be extended to a single particle, i.e., a particle can be superposed with itself. As we shall see shortly, in Young’s experiment when the light is reduced to one photon emitted at a time, we still find an interference pattern on the screen (after enough photons have been detected). The explanation is that a single photon goes through both slits and then it interferes with itself, as two waves do by superposition. Similar experiments were later performed with electrons, neutrons, atoms, and even bucky balls 20. All these particles behaved like waves and created interference patterns.

19 Einstein had the intuition of the probabilistic nature of the wave function at about the same time, but had reservations about the possible randomness of the physical world.

Schrodinger himself realized that if a quantum system contains more than one particle, the superposition principle gives rise to the phenomenon of entanglement, the system’s interference with itself. The result is an entangled system. The term entanglement was first used by Schrodinger himself in 1935 [95] in his discussion of the Einstein, Podolsky, and Rosen (EPR) paper [46].

3.17 Heisenberg’s Uncertainty Principle

The concept of probability in quantum mechanics is very different from the one in classical physics. Classical probabilities only express some lack of information about the fine details of a given situation. The randomness is due to uncontrolled causes that are recognized to exist and that, if known better, would make the predictions better. For example, we can predict very accurately the arrival time of a transatlantic flight if we know precisely the speed of the tail or head wind throughout the entire journey. But if the atmospheric conditions change during the flight, so does the accuracy of the predicted arrival time.

On the other hand, the quantum probabilities assume as a matter of principle that a more precise knowledge at the quantum level is impossible. This limitation cannot be avoided as shown by many experiments performed over the years. Einstein, who doubted that “God is playing dice”, questioned the truth of such an indeterminacy.

The uncertainty principle 21 discovered by Werner Heisenberg in 1927 is at the heart of the special nature of the probabilistic aspect of the quantum theory. We use the term observable for a physical property of a quantum system. The mathematical equivalent to measuring a quantum system is to apply a measurement operator to the vector describing the state of the system.

Consider two such observables, $X$ the position of a particle and $P_X$ the momentum of the same particle at position $X$. We assume that $X$ and $P_X$ are three dimensional vectors. The operators corresponding to these observables, $X$ and $P_X$, are non-commutative. This means that if operator $X$ is applied first to the state vector and the operator $P_X$ is second (the state vector is projected first to a basis corresponding to property $X$ and then to a basis corresponding to property $P_X$) the results are not the same as when the operator $P_X$ is applied first and the operator $X$ is second (now, the state vector is projected first to a basis corresponding to property $P_X$ and then to a basis corresponding to property $X$).

The uncertainty principle states that the uncertainty in determining the position $\Delta X$ and the uncertainty in determining the momentum $\Delta P_X$ at position $X$ are constrained by the inequality

$$\Delta X \, \Delta P_X \geq \frac{\hbar}{2}$$

where $\hbar = h/2\pi$ is a modified form of Planck’s constant.

The uncertainty is an intrinsic property of quantum systems. The precise knowledge (knowledge beyond a given limit) of basic physical properties such as position and momentum along the direction where position is measured is simply forbidden in a quantum system.
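To get a feeling for the size of this limit, the inequality can be turned into a lower bound on the momentum uncertainty for a given position uncertainty. The sketch below is illustrative only; the function names are hypothetical and the confinement length of $10^{-10}$ m (roughly the size of an atom) is an assumed example value.

```python
HBAR = 1.054571817e-34          # J s
M_ELECTRON = 9.1093837015e-31   # kg


def min_momentum_uncertainty(delta_x: float) -> float:
    """Smallest momentum uncertainty (kg m/s) allowed for a position uncertainty delta_x (m)."""
    if delta_x <= 0:
        raise ValueError("delta_x must be positive")
    return HBAR / (2.0 * delta_x)


def min_velocity_uncertainty(delta_x: float, mass: float) -> float:
    """Corresponding lower bound on the velocity uncertainty (m/s), non-relativistic."""
    return min_momentum_uncertainty(delta_x) / mass


if __name__ == "__main__":
    delta_x = 1e-10  # m, assumed confinement length
    print(f"min delta p: {min_momentum_uncertainty(delta_x):.3e} kg m/s")
    print(f"min delta v for an electron: {min_velocity_uncertainty(delta_x, M_ELECTRON):.3e} m/s")
```

For an electron confined to atomic dimensions the bound on the velocity uncertainty is of the order of hundreds of kilometers per second, which is why the limitation cannot be ignored at the atomic scale.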

20 A bucky ball is a molecule (also called fullerene) of sixty or seventy carbon atoms arranged in a structure resembling the geodesic domes built by the architect Buckminster Fuller.

21 Heisenberg called it the indeterminacy principle.


3.18 A Brief History of Quantum Ideas

The ideas behind quantum mechanics appeared at the very beginning of the twentieth century. During the following three decades they evolved rapidly into a new mathematical model which provides a description of the physical world, consistent with the experimental evidence collected over the years. For the past twenty years, or so, quantum mechanics, once considered a domain of interest only to physicists, has revealed that it may have a profound effect on how we manipulate information. Thus, it seems appropriate to discuss the evolution of quantum ideas before introducing the basic concepts of quantum computing.

Quantum mechanics is the description of the behavior of matter and light, in particular, of what is happening on the atomic and sub-atomic scale. Quantum physics would be a more appropriate name because it provides a general framework for the whole of physics rather than dealing with only the field of mechanics.

Things on a very small scale do not behave like waves, nor do they behave like particles; they behave like none of the objects or phenomena we encounter in our surrounding, macroscopic world.

Isaac Newton thought that light was made up of particles, though he was aware of the diffraction and interference phenomena which were indications that it behaves like waves. Sometime at the beginning of the 19th century, Thomas Young (1773 - 1829), a British physician and physicist, conducted the now famous “double-slit experiment” on light, demonstrating the wave-theory effect of interference. In his experiment Young used a light source, a barrier with two slits in it and a screen behind the barrier. He shone light from the source on the barrier with two slits and obtained an interference pattern on the screen. An interference pattern is the characteristic signature of wave behavior. Particles, as we know them from daily experience, do not interfere with each other.

In physics the notion of quanta as elementary units of energy was introduced in 1900, when Max Planck presented his theory of blackbody radiation. The blackbody radiation is the electromagnetic radiation emitted by a body in thermodynamical equilibrium, or the radiation contained in a cavity when its walls are at a uniform temperature. The radiation is allowed to escape through a small aperture so that its frequency spectrum and energy density can be measured. Classical thermodynamics predicted that the intensity of the radiation emitted in a small frequency interval $\Delta\nu$ should be proportional to the square of the frequency, $\nu^2$; when integrated over all frequencies that gives an infinite total intensity. The theoretical predictions were in contradiction with the experimental results at higher frequencies, where the measured intensity, rather than increasing like $\nu^2$, was decreasing exponentially.

Max Planck assumed, as a new postulate of physics, that the energy of the emitted radiation does not vary continuously, but by small amounts which are multiples of a basic quantum. Planck denoted this quantum of energy by $h\nu$, where $\nu$ is the frequency of the radiation (which is a wave) and $h$ is a fundamental constant, now known as Planck’s constant. (The value of this constant is $h = 6.6262 \times 10^{-34}$ Joule × second and represents the product Energy × Time). Planck proposed the following formula for the energy levels of the blackbody radiation (approximated as a Maxwell-Hertz oscillator)

$$E = 0,\ h\nu,\ 2h\nu,\ 3h\nu,\ 4h\nu,\ \ldots,\ nh\nu,$$

where $n$ is a non-negative integer.

In 1905 Albert Einstein showed that the empirical properties of the photoelectric effect 22 could be explained by assuming that light consists of particles, each one of them having an energy $h\nu$ and moving at the speed of light. Einstein’s light particle became known as a photon 23. According to Einstein, the existence of quanta was not due to the emission process but was a property of the light itself. The best evidence of this idea came later, when Arthur Compton found that the scattering of gamma rays by electrons has the kinematical characteristics of the collision between two particles (or billiard balls). Different regions of the energy spectrum of the electromagnetic radiation have different names such as gamma rays, X-rays, and (visible) light.

22 Einstein received the Nobel prize for his theory of the photoelectric effect and not for his special and general relativity theories for which he is so well known.

23 Photon is derived from the Greek word “photos” meaning light.

Based on results produced by alpha particle scattering experiments, Ernest Rutherford established in 1911 that an atom consists of electrons surrounding a positively charged heavy nucleus. A planetary model of the atom was proposed at that time.

In 1913 Niels Bohr modified the planetary model of the atom based on the 1908 discovery of Walter Ritz that all the frequencies in the spectrum of a given atom can be obtained with a simple formula $\nu = \nu_n - \nu_m$, where the frequencies $\nu_i$ ($i = 1, 2, 3, \ldots$) characterize the atom. Bohr noted that the angular momentum of the orbiting electron in his model of the hydrogen atom 24 has the same dimensions as Planck’s constant $h$.

24 The hydrogen atom has one electron orbiting the nucleus which contains one proton.

Bohr postulated that the angular momentum of the orbiting electron must be a multiple of Planck’s constant divided by $2\pi$, that is

$$mvr = \frac{h}{2\pi},\ 2\frac{h}{2\pi},\ 3\frac{h}{2\pi},\ \ldots$$

where $mvr$ is the classical definition of the angular momentum, $m$ is the electron mass, $v$ is the electron velocity, and $r$ is the radius of the electron orbit.

The quantization of the angular momentum led Bohr to postulate the quantization of the atom energy. Bohr also postulated that when the hydrogen atom made a transition from one energy state (“level”) to a lower one, the difference between its initial and final energies was emitted in the form of a quantum of energy

$$E_a - E_b = h\nu_{ab}$$

Here, $E_a$ is the initial energy level of the electron around the nucleus, $E_b$ is the final energy level after the transition from its prior state, $h$ is Planck’s constant, and $\nu_{ab}$ is the frequency of the “light” quantum (photon) emitted during the electron’s jump from the first to the second energy level.

The quantum mechanics developments in the 1920s and the 1930s reinforced the view that light behaves as a collection of particles as well as waves. In its interaction with matter light exhibits phenomena that are characteristic of waves, such as interference and diffraction, as well as phenomena that are characteristic of particles, as it happens in the case of the photoelectric effect.

The next significant step was made by the French physicist Louis (Prince) de Broglie in 1923. Drawing an analogy with light and its dual character as both a wave and a particle, de Broglie proposed that a wave should be associated with every kind of particle and particularly with the electron. De Broglie also proposed an equation that linked the momentum, $p$, of a particle to the wavelength, $\lambda$, of the wave associated with it and the Planck constant

$$p = \frac{h}{\lambda}.$$

This assumption was fundamentally new 25; it was no longer a correction, or a new chapter of classical physics imposed by quantum constraints. The experimental confirmation of de Broglie’s assumption came in 1927 when Clinton Davisson and Lester Germer observed the diffraction of electrons by crystals which proved the wavelike characteristics of electrons.
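De Broglie’s relation also explains why wave effects show up for electrons but not for everyday objects. The following sketch computes $\lambda = h/p$ for two cases; it assumes non-relativistic momentum $p = mv$, and the speeds and the bullet mass are hypothetical example values.

```python
PLANCK_H = 6.62607015e-34       # J s
M_ELECTRON = 9.1093837015e-31   # kg


def de_broglie_wavelength(mass_kg: float, speed_m_s: float) -> float:
    """Wavelength (m) associated with a particle of non-relativistic momentum p = m v."""
    momentum = mass_kg * speed_m_s
    if momentum <= 0:
        raise ValueError("momentum must be positive")
    return PLANCK_H / momentum


if __name__ == "__main__":
    # Hypothetical example values.
    electron = de_broglie_wavelength(M_ELECTRON, 1.0e6)  # electron at 10^6 m/s
    bullet = de_broglie_wavelength(0.01, 300.0)          # 10 g bullet at 300 m/s
    print(f"electron: {electron:.3e} m")
    print(f"bullet:   {bullet:.3e} m")
```

The electron wavelength is comparable to the spacing of atoms in a crystal, which is what makes diffraction by crystals observable, while the wavelength of the bullet is many orders of magnitude smaller than any detector could resolve.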

The equation describing the evolution of the wave associated with a particle was proposed in 1926 by Erwin Schrodinger, who, thus, gave a precise formulation to de Broglie’s wave hypothesis. In 1925 Heisenberg developed a theory of quantum mechanics based on matrices, equivalent to Schrodinger’s theory which was based on the wave equation. In Heisenberg’s more abstract approach, infinite matrices represent properties of observable entities and the mathematics used is that of matrix algebra. The matrix multiplication is non-commutative and that has important consequences in quantum mechanics. Heisenberg’s leading idea was that physics should use only observable quantities, quantities that could be measured. In his opinion, such quantities as the classical “orbits” should not even be mentioned, since no experiment has ever shown their existence.

Max Born, Pascual Jordan, and Paul Dirac were the first to realize that the use of non-commutative quantities to replace position and momentum was an essential feature of Heisenberg’s theory and they recognized the rules of matrix calculus in their own theories. They concentrated upon a new kind of mechanics where the dynamical variables are not ordinary numbers, but new non-commutative mathematical objects.

In 1925 Born and Jordan developed a complete formulation of this new mechanics using infinite matrices to represent the basic physical quantities. The same year, 1925, Dirac introduced abstract mathematical objects, such as $Q_j$ for the position coordinates and $P_j$ for the momentum coordinates, without trying to specify them; the multiplication rules were those of the new mechanics. He postulated a general form for the commutator between two quantum dynamical variables using Poisson brackets. Dirac’s abstract quantities $Q$ and $P$ can be identified with some operators acting upon the wave functions. Heisenberg’s matrices representing position and momentum can be obtained from the wave functions.

In 1926 Schrodinger and Dirac showed the equivalence of Heisenberg’s matrix formulation and Dirac’s algebraic one with Schrodinger’s wave function. In 1926 Dirac and, independently, Born, Heisenberg, and Jordan obtained a complete formulation of quantum dynamics that could be applied to any physical system; the first calculations were done for the hydrogen atom.

In 1926 John von Neumann introduced the concept of Hilbert space to quantum mechanics. The idea came to him while attending a lecture of Heisenberg who was presenting his matrix mechanics and the difference between his model and the one based upon Schrodinger’s wave equation. During the presentation, David Hilbert, considered the greatest mathematician of the time, asked for clarifications. Then, John von Neumann decided to formulate the quantum theory in terms familiar to the great mathematician. He wrote a note for Hilbert, explaining Heisenberg’s version of quantum theory in terms of what would later be known as Hilbert spaces. A Hilbert space is a vector space with a measure of the distance, called the norm, and the property of completeness. Von Neumann expanded this explanation into a book, The Mathematical Foundations of Quantum Mechanics, published in 1932.

25 De Broglie received the Nobel Prize for his contributions to quantum theory in 1929.


Von Neumann demonstrated that the geometry of vectors over the complex plane has the same formal properties as the states of a quantum mechanical system. The states of the quantum systems, the wave functions, are represented as vectors in a Hilbert space and the operations associated with the position and the momentum act like matrices upon these vectors. He also derived a theorem, using some assumptions about the physical world, which proved that there are no “hidden variables”. The inclusion of such hidden variables would completely eliminate the uncertainty inherent in the quantum mechanical description of physical systems. In 1966 John Bell (successfully) challenged von Neumann’s assumptions and proved his own theorem establishing that local hidden variables cannot reproduce all the predictions of quantum mechanics.

In 1928 Paul Dirac developed the relativistic quantum mechanics that combined quantum mechanics with relativity. He introduced corrections for relativistic effects to the quantum mechanics equations for particles moving at close to the speed of light. This allowed the properties of spin to be obtained in a natural way from the relativistic Schrodinger equation. In January 1925 Wolfgang Pauli had suggested the existence of a fourth quantum number of the electron with values ($+1/2$, $-1/2$) and formulated the “exclusion principle” bearing his name. At that time, George Uhlenbeck and Samuel Goudsmit had interpreted the new quantum number as the spin of the electron, associated with the intrinsic rotation (spin) of the electron. In 1930, Dirac predicted the existence of anti-electrons, particles with an opposite charge to that of the electron.

In 1932 Carl Anderson discovered the positron, the anti-electron, in cosmic radiation. In 1949 Madame Chien-Shiung Wu and Irving Shaknov of Columbia University produced positronium, an artificial association between an electron and a positron circling each other. This element lives for a fraction of a second, then the electron and the positron spiral toward each other, causing mutual annihilation. Two photons of gamma radiation, each with an energy of 0.511 MeV, are emitted in this process.

This experiment verified an assumption made by John Wheeler in 1946 that the two photons produced when an electron and a positron annihilate each other have opposite polarizations; if one is vertically polarized, the other must be horizontally polarized. What is remarkable about this experiment is the fact that it is the first one in history to produce the so-called entangled photons. This important fact was recognized only eight years later, in 1957, by David Bohm and Yakir Aharonov.


3.19 Summary and Further Readings

Phenomena such as interference and diffraction of light can only be explained on the basis of a wave theory, while other phenomena, including the photo-electric emission, show that light is composed of small particles called photons. Classical mechanics could not accept this duality and a new model of the physical world was necessary.

Quantum theory is such a mathematical model developed by Werner Heisenberg and Erwin Schrodinger in the mid 1920’s. The cornerstones of quantum mechanics are Schrodinger’s wave equation and Heisenberg’s uncertainty principle. The quantum mechanical model is characterized by the way it represents the states of a physical system, the observables of the system, the measurements of these observables, and the dynamics of the system.

The stage for quantum mechanics is the n-dimensional Hilbert space. An n-dimensional Hilbert space $H_n$ is a vector space over the field of complex numbers with an inner product that helps define the norm, which in turn gives the “length” of a vector. The elements of $H_n$ are n-dimensional vectors. Traditionally, quantum mechanics texts use Dirac’s notation for vectors. In addition to inner products of two vectors, quantum mechanics uses the tensor product of two vectors or matrices and the outer product of two vectors.

The state vector $|\Psi_a\rangle \in H_n$ can be expressed as a linear combination of the basis states $|0\rangle, |1\rangle, \ldots, |i\rangle, \ldots, |n-1\rangle$ as

$$|\Psi_a\rangle = \sum_{i=0}^{n-1} \alpha_i |i\rangle,$$

with $\alpha_i$, $0 \le i \le n-1$, complex numbers representing the amplitudes, with the property that $\alpha_j = \langle j | \Psi_a\rangle$. The “superposition principle” tells us that given two states $|\Psi_a\rangle$ and $|\Psi_b\rangle$, we can form another state as $(a|\Psi_a\rangle + b|\Psi_b\rangle)$, a superposition of the original two states.

Transformations of states in $H_n$ are described by Hermitian operators. An operator $O$ maps state vectors to state vectors in $H_n$. If we take a quantum system in state $|\Psi_a\rangle$ and apply to it a transformation described by the operator $O$, we get a different state $|\Psi_b\rangle = O|\Psi_a\rangle$. Each operator $O$ has an associated matrix $O = [O_{ij}]$ with elements given by $O_{ij} = \langle i | O | j\rangle$, where $|i\rangle$ and $|j\rangle$ are unit vectors and $\langle i | j\rangle = \delta_{ij}$.

A projection operator is given by the outer product of any state vector $|\Psi_a\rangle$ with itself, $P_{\Psi_a} = |\Psi_a\rangle\langle\Psi_a|$. A complete set of orthogonal projectors in $H_n$ is a set $\{P_0, P_1, \ldots, P_i, \ldots, P_{m-1}\}$ such that

$$\sum_{i=0}^{m-1} P_i = 1.$$

The number of projectors in a complete orthogonal set must be less than or equal to the dimension of the Hilbert space, $m \le n$.

In $H_n$ every normal operator $N$ has n eigenvectors $|n_i\rangle$ and, correspondingly, n eigenvalues $\lambda_i$. If $P_i$ are the projectors corresponding to these eigenvectors, $P_i = |n_i\rangle\langle n_i|$, then the operator $N$ has the spectral decomposition

$$N = \sum_i \lambda_i P_i.$$

An observable $O$ is a property of a physical system that, in principle, can be measured. The observable $O$ has an associated Hermitian operator $O$. The outcome of the measurement of the observable $O$ of a system in state $|\Psi_a\rangle$ is an eigenvalue $\lambda_i$ corresponding to the eigenvector $|o_i\rangle$ of the operator $O$. Immediately after the measurement the state of the system becomes $|\Psi_b\rangle = O|\Psi_a\rangle = |o_i\rangle$. The probability to observe outcome $\lambda_i$ is equal to $\langle\Psi_a | P_i | \Psi_a\rangle$, with $P_i$ the projector corresponding to the eigenvector $|o_i\rangle$.

The density matrix contains all the information regarding the results of measurements of an ensemble of N independent versions of a quantum system and gives the expected value of any observable of the system. The density matrix does not uniquely determine the states of individual particles. The density matrix is Hermitian and its eigenvalues are nonnegative.

John Preskill’s notes [91] summarize the quantum concepts directly related to quantum computing. There are extremely well written books on quantum mechanics and it is very difficult to single out only a few of them.

The first three chapters of Dirac’s book “The Principles of Quantum Mechanics” [42] are accessible to a large audience, while another classical book, “Mathematical Foundations of Quantum Mechanics” [122] by John von Neumann, seems accessible only to the most mathematically sophisticated students. The third volume of Feynman’s “Lectures on Physics” provides a comprehensive and clear exposition of quantum mechanics, vintage 1960s.

Among the more modern texts we recommend “The Interpretation of Quantum Mechanics” by Roland Omnes [82], a very readable account of “classical” as well as more “modern” ideas and concepts of quantum mechanics, such as the density matrix.

An excellent reference is Merzbacher’s “Quantum Mechanics” [75]. A very accessible and modern text is also “Quantum Mechanics” by E.S. Abers [1]. Nielsen and Chuang’s book “Quantum Computation and Quantum Information” provides a concise review of the basic concepts of quantum mechanics for quantum computing.


A very clear and concise book on linear algebra is “Lectures on Linear Algebra” by Gelfand [53]. An excellent reference on functions and functional analysis is Kolmogorov’s book “Elements of the Theory of Functions and Functional Analysis” [69].


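3.20 Exercises and Problems

(1) Show that any vector $x = (x_1, x_2, \ldots, x_n) \in \mathbb{C}^n$ is a linear combination of n unit vectors:

$$\varepsilon_1 = (1, 0, 0, \ldots, 0), \quad \varepsilon_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad \varepsilon_n = (0, 0, 0, \ldots, 1).$$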

(2) Let n vectors span a vector space A containing r linearly independent vectors. Prove that $n \ge r$.

(3) Prove that all bases of any finite-dimensional vector space A have the same finite number of elements.

(4) Prove that if a vector space A has dimension n, then (i) any set of (n + 1) elements of A is linearly dependent, and (ii) no set of (n − 1) elements can span A.

(5) Prove that row-equivalent matrices have the same row space.

(6) Prove the three properties of the trace of a matrix from Section 3.3.

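(7) Consider two vectors over $\mathbb{C}^2$:

$$|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$$

$$|\phi\rangle = \beta_0 |0\rangle + \beta_1 |1\rangle$$

Show that:

$$|\psi\rangle \otimes |\phi\rangle = \alpha_0\beta_0 |00\rangle + \alpha_0\beta_1 |01\rangle + \alpha_1\beta_0 |10\rangle + \alpha_1\beta_1 |11\rangle.$$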

(8) Show that

(A†)† = A.

(9) Show that a normal matrix is Hermitian if and only if it has real eigenvalues.

(10) Consider vectors ψ, φ, ξ and scalars m,n. Show that

mψ ⊗ nφ = mn(ψ ⊗ φ)

ξ ⊗ (ψ + φ) = ξ ⊗ ψ + ξ ⊗ φ

(ψ + φ)⊗ ξ = ψ ⊗ ξ + φ⊗ ξ


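(11) Consider matrices $V, W, X, Y, Z$ and vectors $\psi, \phi$. Show that

$$(V \otimes W)(X \otimes Y) = VX \otimes WY$$

$$(V \otimes W)(\psi \otimes \phi) = (V\psi) \otimes (W\phi)$$

$$\begin{pmatrix} V & W \\ X & Y \end{pmatrix} \otimes Z = \begin{pmatrix} V \otimes Z & W \otimes Z \\ X \otimes Z & Y \otimes Z \end{pmatrix}$$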

(12) Consider n unitary matrices $V_i$, $1 \le i \le n$. Let $W = V_1 \otimes V_2 \otimes \ldots \otimes V_n$. Show that $W$ is unitary.

(13) Show that in H2

(| 0〉〈1 |) | 0〉 =| 0〉(〈1 | 0〉)

(| 1〉〈0 |) | 0〉 =| 1〉(〈0 | 0〉) =| 1〉

(| 0〉〈1 |) | 1〉 =| 0〉(〈1 | 1〉) =| 0〉

(| 1〉〈0 |) | 1〉 =| 1〉(〈0 | 1〉)

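(14) Given the operator

$$O = |0\rangle\langle 1| + |1\rangle\langle 0|$$

and the vector

$$|\Psi_a\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle,$$

show that for all $|\Psi_a\rangle \in H_2$ we have

$$\langle\Psi_a| O = (\alpha_0^* \langle 0| + \alpha_1^* \langle 1|)(|0\rangle\langle 1| + |1\rangle\langle 0|) = \alpha_0^* \langle 1| + \alpha_1^* \langle 0|.$$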

(15) Show that $|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $|-\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$ form an orthonormal basis in $H_2$. Construct the projectors $P_+$ and $P_-$ corresponding to these basis vectors.

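(16) Show that $|0\rangle, |1\rangle, |2\rangle, |3\rangle$ form an orthonormal basis in $H_4$. Construct the operator $\Pi_4$ that permutes circularly the basis in $H_4$ as follows: $|0\rangle \to |1\rangle$, $|1\rangle \to |2\rangle$, $|2\rangle \to |3\rangle$, and $|3\rangle \to |0\rangle$. Construct its matrix representation. Calculate $\Pi_4 |\psi\rangle$ and $\langle\psi| \Pi_4$ with $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle + \alpha_2 |2\rangle + \alpha_3 |3\rangle$.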

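(17) Consider an operator $O$ in $H_3$ with three eigenvalues $\lambda_0$, $\lambda_1$, and $\lambda_2$ and with the corresponding orthonormal eigenstates $|o_0\rangle$, $|o_1\rangle$, and $|o_2\rangle$

$$|o_0\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle + \alpha_2 |2\rangle$$

$$|o_1\rangle = \beta_0 |0\rangle + \beta_1 |1\rangle + \beta_2 |2\rangle$$

$$|o_2\rangle = \gamma_0 |0\rangle + \gamma_1 |1\rangle + \gamma_2 |2\rangle$$

Construct the three projectors corresponding to these eigenvectors. Express the matrix $O$ corresponding to the operator $O$ in terms of the eigenvalues and the projectors.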


4 Qubits and Their Physical Realization

A classical bit is an abstraction of a physical system capable of being in one of two states, “1” or “0”. To perform any computation we should be able to: (a) store information in the physical system implementing a bit, (b) retrieve the information from the physical system, and (c) change the state of the physical system.

To perform a meaningful computation we need a fair number of bits and the ability to access and modify them very fast. That is why a computer using electro-mechanical relays is more useful than the traditional abacus, computers using electronic circuits are more capable than ones based upon electro-mechanical relays, and so on.

From solid state studies we already know that the “smaller” and “simpler” the physical systems used to realize a bit are, the less energy is necessary to perform the operations described above, and the faster the resulting computing engine is. By pushing the limits of the physical systems used to realize a bit we inevitably end at the level of atomic or even sub-atomic particles, where we enter the realm of quantum systems governed by quantum mechanics.

4.1 One Qubit, a Very Small Bit

A quantum bit or qubit is an elementary quantum object used to store information. Since it is difficult at this stage to explain what a quantum object is, for the time being we view a qubit as a mathematical abstraction and then hint at possible physical implementations of this abstract object. We review the basic facts revealed by the simple model discussed in Chapter 2 and reinforced by the elements of Quantum Mechanics presented in Chapter 3.

A qubit $\psi$ is a vector in a two-dimensional complex vector space. In this space a vector has two components and the projections of the vector on the basis are complex numbers. We use Dirac’s notation for the vector $|\psi\rangle$, with $\alpha_0$ and $\alpha_1$ complex numbers and with $|0\rangle$ and $|1\rangle$ two vectors forming an orthonormal basis for this vector space [43]

$$|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle.$$

While a classical bit can be in one of two states, 0 or 1, the qubit $\psi$ can be in the states $|0\rangle$ and $|1\rangle$, called computational basis states, and also in any state that is a linear combination of these states; $\alpha_0$ and $\alpha_1$ are the coefficients of the linear expression describing the actual state of the qubit. This phenomenon is called superposition.

When we observe or measure a classical bit we determine its state with probability 1: the bit is either in state 0 or in state 1, and the result of a measurement is strictly deterministic. On the other hand, when we observe or measure a qubit we get the result

$|0\rangle$ with probability $|\alpha_0|^2$

$|1\rangle$ with probability $|\alpha_1|^2$.

For these expressions to be true we need the vector length, or the norm of the vector, to be one; otherwise the probabilities do not sum to unity. This means that

$$|\alpha_0|^2 + |\alpha_1|^2 = 1.$$

We say that a qubit can be in a continuum of states between $|0\rangle$ and $|1\rangle$ until we measure it. For example a qubit can be in state

$$\frac{1}{\sqrt{2}} |0\rangle + \frac{1}{\sqrt{2}} |1\rangle$$

and we get the result $|0\rangle$ with probability 1/2 and $|1\rangle$ with probability 1/2; alternatively the qubit can be in the state

$$\frac{1}{2} |0\rangle + \frac{\sqrt{3}}{2} |1\rangle$$

and we get the result $|0\rangle$ with probability 1/4 and $|1\rangle$ with probability 3/4.

The superposition and the effect of the measurement of a quantum state (the state of the qubit) really mean that there is hidden information that is preserved in a closed quantum system until it is forced to reveal itself to an external observer. We say that the system is closed until it interacts with the outside world, e.g., until we decide to perform an observation of the system. The fundamental question for us is how to use this hidden information.
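The two example states above can be checked numerically. The following is a minimal sketch in Python, using only the standard library; the function name `measurement_probabilities` is an illustrative choice and does not come from the text.

```python
import math

def measurement_probabilities(alpha0, alpha1):
    """Return (p0, p1), the probabilities of observing |0> and |1>."""
    p0 = abs(alpha0) ** 2
    p1 = abs(alpha1) ** 2
    # The amplitudes must describe a unit vector, otherwise the two
    # values cannot be interpreted as probabilities.
    if not math.isclose(p0 + p1, 1.0, rel_tol=1e-9):
        raise ValueError("state is not normalized")
    return p0, p1

# The two example states from the text.
equal_state = measurement_probabilities(1 / math.sqrt(2), 1 / math.sqrt(2))
skewed_state = measurement_probabilities(1 / 2, math.sqrt(3) / 2)
print(equal_state)
print(skewed_state)
```

Up to floating-point rounding, the first pair is (1/2, 1/2) and the second is (1/4, 3/4). Amplitudes that do not satisfy the normalization condition are rejected.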

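So far we have used the states $|0\rangle$ and $|1\rangle$ to represent a qubit. But this is one of many choices; we can choose a different set of vectors as an orthonormal basis. For example, we can choose the basis vectors

$$|+\rangle \equiv \frac{|0\rangle + |1\rangle}{\sqrt{2}} \quad \text{and} \quad |-\rangle \equiv \frac{|0\rangle - |1\rangle}{\sqrt{2}}.$$

In this case a qubit can be represented as

$$|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle = \alpha_0 \frac{|+\rangle + |-\rangle}{\sqrt{2}} + \alpha_1 \frac{|+\rangle - |-\rangle}{\sqrt{2}}$$

$$|\psi\rangle = \frac{\alpha_0 + \alpha_1}{\sqrt{2}} |+\rangle + \frac{\alpha_0 - \alpha_1}{\sqrt{2}} |-\rangle$$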
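The change of basis above amounts to two small formulas, sketched here in Python. The function names `to_plus_minus` and `to_computational` are hypothetical, chosen only for this illustration.

```python
import math

SQRT2 = math.sqrt(2)

def to_plus_minus(alpha0, alpha1):
    """Coefficients of |+> and |-> for the state alpha0|0> + alpha1|1>."""
    return (alpha0 + alpha1) / SQRT2, (alpha0 - alpha1) / SQRT2

def to_computational(a_plus, a_minus):
    """Inverse change of basis, back to the coefficients of |0> and |1>."""
    return (a_plus + a_minus) / SQRT2, (a_plus - a_minus) / SQRT2

a_plus, a_minus = to_plus_minus(0.6, 0.8j)
# The norm of the state does not depend on the basis used to describe it.
print(abs(a_plus) ** 2 + abs(a_minus) ** 2)
```

Applying `to_computational` to the result of `to_plus_minus` recovers the original amplitudes, which is one way to confirm that both bases describe the same state.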

Let us now discuss briefly the transformations of a qubit by means of operators given by the Pauli matrices (the terms operator and matrix are used interchangeably throughout this chapter and the rest of the book). The traditional notations for the Pauli matrices in quantum mechanics are: $\sigma_0$ or $I$; $\sigma_1$ or $\sigma_x$ or $X$; $\sigma_2$ or $\sigma_y$ or $Y$; and $\sigma_3$ or $\sigma_z$ or $Z$.

It is probably now the time to reinforce our understanding of the formalism, the Dirac ket and bra notation, by deriving the Pauli matrices from the descriptions of the transformations they perform. Throughout this derivation we use the outer product of vectors defined in Section 3.7.

Let us start with the basics and show the basis vectors and their duals in matrix notation

$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \langle 0| = \begin{pmatrix} 1 & 0 \end{pmatrix}$$

$$|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad \langle 1| = \begin{pmatrix} 0 & 1 \end{pmatrix}$$

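The $I$ operator performs an identity transformation, $|0\rangle \to |0\rangle$ and $|1\rangle \to |1\rangle$. This can be written as

$$I = |0\rangle\langle 0| + |1\rangle\langle 1| = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \end{pmatrix}$$

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$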

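The $X$ operator negates, or flips, a qubit, $|0\rangle \to |1\rangle$ and $|1\rangle \to |0\rangle$. This can be written as

$$X = |0\rangle\langle 1| + |1\rangle\langle 0| = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \end{pmatrix}$$

$$X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$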

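The $Z$ operator performs a phase shift operation, $|0\rangle \to |0\rangle$ and $|1\rangle \to -|1\rangle$. This can be written as

$$Z = |0\rangle\langle 0| - |1\rangle\langle 1| = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \end{pmatrix} - \begin{pmatrix} 0 \\ 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \end{pmatrix}$$

$$Z = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$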

The Hadamard operator $H$ will also be frequently used

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

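We now display the $I$ matrix and the Pauli matrices, together with the result of transforming the original state of the qubit, $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$, into a new state $|\phi\rangle = \sigma |\psi\rangle$ under their action.

$$\sigma_0 = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} \longrightarrow |\phi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle.$$

$$\sigma_1 = X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_0 \end{pmatrix} \longrightarrow |\phi\rangle = \alpha_1 |0\rangle + \alpha_0 |1\rangle.$$

$$\sigma_2 = Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = i \begin{pmatrix} -\alpha_1 \\ \alpha_0 \end{pmatrix} \longrightarrow |\phi\rangle = -i\alpha_1 |0\rangle + i\alpha_0 |1\rangle.$$

$$\sigma_3 = Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = \begin{pmatrix} \alpha_0 \\ -\alpha_1 \end{pmatrix} \longrightarrow |\phi\rangle = \alpha_0 |0\rangle - \alpha_1 |1\rangle.$$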
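The four transformations listed above can be reproduced with a few lines of Python. This is a sketch only: the matrices are stored as nested tuples, a state is the pair (alpha0, alpha1), and the helper name `apply` is an assumption made for the illustration.

```python
# 2x2 matrices as nested tuples; a state is the pair (alpha0, alpha1).
I = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Y = ((0, -1j), (1j, 0))
Z = ((1, 0), (0, -1))

def apply(op, state):
    """Multiply the 2x2 matrix `op` by the column vector `state`."""
    (a, b), (c, d) = op
    alpha0, alpha1 = state
    return (a * alpha0 + b * alpha1, c * alpha0 + d * alpha1)

psi = (0.6, 0.8)
for name, op in (("I", I), ("X", X), ("Y", Y), ("Z", Z)):
    print(name, apply(op, psi))
```

With this input, X swaps the two amplitudes, Z changes the sign of the second one, and Y produces the pair (−i·0.8, i·0.6), in agreement with the formulas above.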

Now let us briefly mention the physical incarnation of a qubit. A qubit can be realized as the polarization of a photon, as we have seen in the example discussed in Section 2.1. A laser and a polarizing lens form a source of polarized photons. Another possible physical realization of a qubit is the spin state “up” or “down” of an electron orbiting inside an atom. The energy “ground state” or “excited state” of a bound electron may be another physical support of qubit state information. Assuming that the electron in an atom can be in one of two states, one can provide enough energy to move the electron from a “ground” state to an “excited” state, e.g., by shining light on it, and thus change the qubit state. Physical means to realize a qubit are discussed later in more detail.


4.2 The Bloch Sphere Representation of One Qubit

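It is always useful to have a pictorial, or graphic, representation of an abstract concept and this is what we are going to attempt next. To this end we express the state of a qubit using three real numbers, $\theta$, $\varphi$, $\gamma$

$$|\psi\rangle = e^{i\gamma}\left[\cos\frac{\theta}{2}\, |0\rangle + e^{i\varphi} \sin\frac{\theta}{2}\, |1\rangle\right] = \alpha_0 |0\rangle + \alpha_1 |1\rangle$$

with

$$\alpha_0 = e^{i\gamma} \cos\frac{\theta}{2}, \qquad \alpha_1 = e^{i\gamma} e^{i\varphi} \sin\frac{\theta}{2}.$$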

In this representation $e^{i\gamma}$ is an overall phase factor that is not observable, and thus it is generally ignored. From Euler’s formula

$$e^{i\gamma} = \cos\gamma + i \sin\gamma$$

and the fact that $\sin^2\gamma + \cos^2\gamma = 1$ it follows immediately that $|e^{i\gamma}| = |e^{i\varphi}| = 1$. Thus

$$|\alpha_0|^2 + |\alpha_1|^2 = \cos^2\frac{\theta}{2} + \sin^2\frac{\theta}{2} = 1.$$

Figure 18: A qubit is represented as a vector from the origin to a point on the three-dimensional sphere with a radius of one, the so-called Bloch sphere. $\theta$ is the angle of the vector $\vec{r}$ with the z-axis, and $\varphi$ is the angle of the projection of the vector in the x−y plane with the x axis; $\gamma$ has no observable effect.

This representation lends itself to an interesting geometrical interpretation: a qubit $|\psi\rangle$ is a vector $\vec{r}$ from the origin to a point on the three-dimensional sphere with a radius of one, the so-called Bloch sphere, see Figure 18. In this representation $\theta$ is the angle of the vector $\vec{r}$ with the z-axis, and $\varphi$ is the angle of the projection of the vector in the x−y plane with the x axis, and, as mentioned earlier, $\gamma$ has no observable effect. If the spin of the qubit (regarded as a spin-half particle) is measured along the direction $(\theta, \varphi)$, then it is found to be up with unit probability. This also justifies the use of $\theta/2$ rather than $\theta$ in the expression of $\alpha_0$ and $\alpha_1$ above.

Figure 19: A qubit in a superposition state $|\psi\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$, drawn on the Bloch sphere with axes x, y, and z. In this case $\theta = 90^\circ$ and $\varphi = 0^\circ$.

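Figure 19 shows the Bloch sphere representation of one qubit in a superposition state $|\psi\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$. In this case $\alpha_0 = \alpha_1 = 1/\sqrt{2}$. Then we see that

$$\alpha_0 = \cos\frac{\theta}{2} = \frac{1}{\sqrt{2}} \implies \frac{\theta}{2} = 45^\circ \implies \theta = 90^\circ.$$

$$\alpha_1 = e^{i\varphi} \sin\frac{\theta}{2} = \frac{1}{\sqrt{2}} \implies e^{i\varphi} = 1 \implies \varphi = 0^\circ.$$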
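The same computation can be carried out for any normalized pair of amplitudes. The sketch below is one possible way to recover the two Bloch angles in Python; the function name `bloch_angles` and the convention of dividing out the phase of the first amplitude are assumptions of this illustration.

```python
import cmath
import math

def bloch_angles(alpha0, alpha1):
    """Return (theta, phi) in radians for a normalized qubit state."""
    # Dividing out the phase of alpha0 removes the global phase gamma.
    gamma = cmath.phase(alpha0) if alpha0 != 0 else 0.0
    relative = alpha1 * cmath.exp(-1j * gamma)
    # |alpha0| = cos(theta/2) and |alpha1| = sin(theta/2).
    theta = 2 * math.atan2(abs(alpha1), abs(alpha0))
    phi = cmath.phase(relative) if relative != 0 else 0.0
    return theta, phi

theta, phi = bloch_angles(1 / math.sqrt(2), 1 / math.sqrt(2))
print(math.degrees(theta), math.degrees(phi))
```

For the state of Figure 19 this yields an angle of 90 degrees with the z-axis and a zero azimuthal angle, up to rounding. Using `atan2` on the two moduli avoids domain errors that `acos` could raise when rounding pushes a modulus slightly above one.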

A classical bit can be in one of two states, “0” or “1”, see Figure 20(a), while a qubit can be in a continuum of states represented as points on the Bloch sphere, Figure 20(b). The state space of a qubit contains the two “basis”, or “logical”, states, $|0\rangle$ and $|1\rangle$. The initial state of a qubit is always one of the basis states.

4.3 Rotation Operations on the Bloch Sphere

Single qubit operations are defined as rotations on the Bloch sphere. When a qubit is represented as a spin 1/2 particle, the Pauli spin matrices, $\vec{\sigma} = \{\sigma_x, \sigma_y, \sigma_z\}$, discussed in the previous section describe spin rotations on the Bloch sphere. To derive the rotation matrices we first prove the following proposition.

If $\gamma$ is a real number and if the matrix $A$ is such that $A^2 = I$ then

$$e^{i\gamma A} = \cos(\gamma) I + i \sin(\gamma) A.$$

Recall the following Taylor series expansions for $x \in \mathbb{R}$ [56]

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots + \frac{x^k}{k!} + \ldots = \sum_{k=0}^{\infty} \frac{x^k}{k!}$$

Figure 20: (a) One bit, with states 0 and 1. (b) One qubit, with the basis (logical) states 0 and 1 and the superposition states. A bit can be in one of two states, 0 or 1. A qubit $|\psi\rangle$ is represented as a vector from the origin to a point on the Bloch sphere. A qubit can be in a basis state, $|0\rangle$ or $|1\rangle$, or in a superposition state $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$ with $|\alpha_0|^2 + |\alpha_1|^2 = 1$.

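$$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots + (-1)^k \frac{x^{2k+1}}{(2k+1)!} + \ldots = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!}$$

$$\cos(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \ldots + (-1)^k \frac{x^{2k}}{(2k)!} + \ldots = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!}$$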

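Given a matrix $A$ we can expand $e^A$ as

$$e^A = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \ldots + \frac{A^k}{k!} + \ldots$$

Therefore

$$e^{i\gamma A} = I + (i\gamma A) + \frac{(i\gamma A)^2}{2!} + \frac{(i\gamma A)^3}{3!} + \ldots + \frac{(i\gamma A)^k}{k!} + \ldots$$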

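But $A^2 = I$ and $i = \sqrt{-1}$, so every even power of $A$ reduces to $I$ and every odd power reduces to $A$, and we can group the terms of the sum as follows

$$\left(1 - \frac{\gamma^2}{2!} + \frac{\gamma^4}{4!} - \ldots + (-1)^k \frac{\gamma^{2k}}{(2k)!} + \ldots\right) I + i \left(\gamma - \frac{\gamma^3}{3!} + \frac{\gamma^5}{5!} - \ldots + (-1)^k \frac{\gamma^{2k+1}}{(2k+1)!} + \ldots\right) A$$

Thus

$$e^{i\gamma A} = \cos(\gamma) I + i \sin(\gamma) A.$$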
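The proposition can also be checked numerically by summing the Taylor series directly and comparing with the closed form. The sketch below does this in plain Python for a 2 × 2 matrix; the helper names (`mat_mul`, `exp_series`, `closed_form`, `max_difference`) and the truncation at 40 terms are choices made for this illustration.

```python
import math

IDENTITY = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))  # sigma_x, which satisfies X*X = I

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def exp_series(A, gamma, terms=40):
    """Sum the first `terms` terms of the Taylor series of exp(i*gamma*A)."""
    total = [[1, 0], [0, 1]]
    term = IDENTITY  # (i*gamma*A)^0 / 0!
    for k in range(1, terms):
        # term_k = term_{k-1} * (i*gamma*A) / k
        term = tuple(
            tuple(1j * gamma * x / k for x in row) for row in mat_mul(term, A)
        )
        for i in range(2):
            for j in range(2):
                total[i][j] += term[i][j]
    return total

def closed_form(A, gamma):
    """cos(gamma) I + i sin(gamma) A, valid when A squared is the identity."""
    return [
        [math.cos(gamma) * IDENTITY[i][j] + 1j * math.sin(gamma) * A[i][j]
         for j in range(2)]
        for i in range(2)
    ]

def max_difference(m1, m2):
    return max(abs(m1[i][j] - m2[i][j]) for i in range(2) for j in range(2))

print(max_difference(exp_series(X, 0.7), closed_form(X, 0.7)))
```

The printed difference is at the level of floating-point rounding. The same check works for any other matrix whose square is the identity, for instance the Pauli matrix σz.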


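A finite rotation through an angle $\gamma$ about a given vector $\vec{n}$ on the Bloch sphere is defined as

$$R_{\vec{n}}(\gamma) = \exp\left(-i \frac{\gamma}{2}\, \vec{n} \cdot \vec{\sigma}\right) = \cos(\gamma/2) I - i \sin(\gamma/2)\, \vec{n} \cdot \vec{\sigma}.$$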

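Recall that

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

The rotations about the $\vec{x}$, $\vec{y}$, and $\vec{z}$ axes with the same angle $\gamma$ are defined as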

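$$R_{\vec{x}}(\gamma) = \cos\left(\frac{\gamma}{2}\right) I - i \sin\left(\frac{\gamma}{2}\right) \sigma_x = \begin{pmatrix} \cos(\frac{\gamma}{2}) & 0 \\ 0 & \cos(\frac{\gamma}{2}) \end{pmatrix} + \begin{pmatrix} 0 & -i \sin(\frac{\gamma}{2}) \\ -i \sin(\frac{\gamma}{2}) & 0 \end{pmatrix}.$$

Thus

$$R_{\vec{x}}(\gamma) = \begin{pmatrix} \cos(\frac{\gamma}{2}) & -i \sin(\frac{\gamma}{2}) \\ -i \sin(\frac{\gamma}{2}) & \cos(\frac{\gamma}{2}) \end{pmatrix}.$$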

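Now

$$R_{\vec{y}}(\gamma) = \cos\left(\frac{\gamma}{2}\right) I - i \sin\left(\frac{\gamma}{2}\right) \sigma_y = \begin{pmatrix} \cos(\frac{\gamma}{2}) & 0 \\ 0 & \cos(\frac{\gamma}{2}) \end{pmatrix} + \begin{pmatrix} 0 & -\sin(\frac{\gamma}{2}) \\ \sin(\frac{\gamma}{2}) & 0 \end{pmatrix}.$$

Thus

$$R_{\vec{y}}(\gamma) = \begin{pmatrix} \cos(\frac{\gamma}{2}) & -\sin(\frac{\gamma}{2}) \\ \sin(\frac{\gamma}{2}) & \cos(\frac{\gamma}{2}) \end{pmatrix}.$$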

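Finally

$$R_{\vec{z}}(\gamma) = \cos\left(\frac{\gamma}{2}\right) I - i \sin\left(\frac{\gamma}{2}\right) \sigma_z = \begin{pmatrix} \cos(\frac{\gamma}{2}) & 0 \\ 0 & \cos(\frac{\gamma}{2}) \end{pmatrix} + \begin{pmatrix} -i \sin(\frac{\gamma}{2}) & 0 \\ 0 & i \sin(\frac{\gamma}{2}) \end{pmatrix}.$$

Thus

$$R_{\vec{z}}(\gamma) = \begin{pmatrix} \cos(\frac{\gamma}{2}) - i \sin(\frac{\gamma}{2}) & 0 \\ 0 & \cos(\frac{\gamma}{2}) + i \sin(\frac{\gamma}{2}) \end{pmatrix} = \begin{pmatrix} e^{-i\gamma/2} & 0 \\ 0 & e^{i\gamma/2} \end{pmatrix}$$

The composition of the two rotations with angles $\delta$ and $\gamma$ is

$$R_{\vec{z}}(\delta) R_{\vec{x}}(\gamma) = \begin{pmatrix} e^{-i\delta/2} \cos(\frac{\gamma}{2}) & -i e^{-i\delta/2} \sin(\frac{\gamma}{2}) \\ -i e^{i\delta/2} \sin(\frac{\gamma}{2}) & e^{i\delta/2} \cos(\frac{\gamma}{2}) \end{pmatrix}$$

Any rotation on the Bloch sphere can be reduced to the previous expression for some angles $\delta$ and $\gamma$.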

It is easy to see that the rotation operations are reversible: a rotation about the $\vec{x}$, $\vec{y}$, or $\vec{z}$ axis with an angle $\gamma$ followed by a rotation with an angle $-\gamma$ about the same axis is equivalent to an identity transformation; it leaves the qubit in the original state

$$R_{\vec{x}}(\gamma) R_{\vec{x}}(-\gamma) = R_{\vec{y}}(\gamma) R_{\vec{y}}(-\gamma) = R_{\vec{z}}(\gamma) R_{\vec{z}}(-\gamma) = I$$

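Recall that $\cos(\gamma)$ is an even function, while $\sin(\gamma)$ is an odd function, $\cos(\gamma) = \cos(-\gamma)$ and $\sin(\gamma) = -\sin(-\gamma)$. We only show that the equality is true for the rotation about the $\vec{x}$ axis

$$R_{\vec{x}}(\gamma) R_{\vec{x}}(-\gamma) = \begin{pmatrix} \cos(\frac{\gamma}{2}) & -i \sin(\frac{\gamma}{2}) \\ -i \sin(\frac{\gamma}{2}) & \cos(\frac{\gamma}{2}) \end{pmatrix} \begin{pmatrix} \cos(\frac{\gamma}{2}) & i \sin(\frac{\gamma}{2}) \\ i \sin(\frac{\gamma}{2}) & \cos(\frac{\gamma}{2}) \end{pmatrix}$$

$$= \begin{pmatrix} \cos^2(\frac{\gamma}{2}) + \sin^2(\frac{\gamma}{2}) & i \cos(\frac{\gamma}{2}) \sin(\frac{\gamma}{2}) - i \cos(\frac{\gamma}{2}) \sin(\frac{\gamma}{2}) \\ -i \cos(\frac{\gamma}{2}) \sin(\frac{\gamma}{2}) + i \cos(\frac{\gamma}{2}) \sin(\frac{\gamma}{2}) & \cos^2(\frac{\gamma}{2}) + \sin^2(\frac{\gamma}{2}) \end{pmatrix}$$

Thus

$$R_{\vec{x}}(\gamma) R_{\vec{x}}(-\gamma) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$$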

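An alternative method to prove the reversibility of rotation operations is to apply to the qubit

$$|\psi\rangle = \begin{pmatrix} \cos(\frac{\theta}{2}) \\ e^{i\varphi} \sin(\frac{\theta}{2}) \end{pmatrix}$$

a rotation with an angle $\gamma$ around the $\vec{x}$ axis, $R_{\vec{x}}(\gamma)$, followed by a rotation with an angle $-\gamma$ around the $\vec{x}$ axis, $R_{\vec{x}}(-\gamma)$. The result of composing these two transformations should be the original state $|\psi\rangle$. Call $|\psi_\gamma\rangle$ the state of the qubit after the rotation with angle $\gamma$ about the $\vec{x}$ axis

$$|\psi_\gamma\rangle = R_{\vec{x}}(\gamma) |\psi\rangle = \begin{pmatrix} \cos(\frac{\gamma}{2}) & -i \sin(\frac{\gamma}{2}) \\ -i \sin(\frac{\gamma}{2}) & \cos(\frac{\gamma}{2}) \end{pmatrix} \begin{pmatrix} \cos(\frac{\theta}{2}) \\ e^{i\varphi} \sin(\frac{\theta}{2}) \end{pmatrix}$$

It follows that

$$|\psi_\gamma\rangle = \begin{pmatrix} \cos(\frac{\gamma}{2}) \cos(\frac{\theta}{2}) - i e^{i\varphi} \sin(\frac{\gamma}{2}) \sin(\frac{\theta}{2}) \\ -i \sin(\frac{\gamma}{2}) \cos(\frac{\theta}{2}) + e^{i\varphi} \cos(\frac{\gamma}{2}) \sin(\frac{\theta}{2}) \end{pmatrix}$$

Now

$$|(\psi_\gamma)_{-\gamma}\rangle = R_{\vec{x}}(-\gamma) |\psi_\gamma\rangle$$

$$|(\psi_\gamma)_{-\gamma}\rangle = \begin{pmatrix} \cos(\frac{\gamma}{2}) & i \sin(\frac{\gamma}{2}) \\ i \sin(\frac{\gamma}{2}) & \cos(\frac{\gamma}{2}) \end{pmatrix} \begin{pmatrix} \cos(\frac{\gamma}{2}) \cos(\frac{\theta}{2}) - i e^{i\varphi} \sin(\frac{\gamma}{2}) \sin(\frac{\theta}{2}) \\ -i \sin(\frac{\gamma}{2}) \cos(\frac{\theta}{2}) + e^{i\varphi} \cos(\frac{\gamma}{2}) \sin(\frac{\theta}{2}) \end{pmatrix}$$

$$= \begin{pmatrix} \cos^2(\frac{\gamma}{2}) \cos(\frac{\theta}{2}) - i e^{i\varphi} \cos(\frac{\gamma}{2}) \sin(\frac{\gamma}{2}) \sin(\frac{\theta}{2}) + \sin^2(\frac{\gamma}{2}) \cos(\frac{\theta}{2}) + i e^{i\varphi} \cos(\frac{\gamma}{2}) \sin(\frac{\gamma}{2}) \sin(\frac{\theta}{2}) \\ i \sin(\frac{\gamma}{2}) \cos(\frac{\gamma}{2}) \cos(\frac{\theta}{2}) + e^{i\varphi} \sin^2(\frac{\gamma}{2}) \sin(\frac{\theta}{2}) - i \sin(\frac{\gamma}{2}) \cos(\frac{\gamma}{2}) \cos(\frac{\theta}{2}) + e^{i\varphi} \cos^2(\frac{\gamma}{2}) \sin(\frac{\theta}{2}) \end{pmatrix}$$

Thus

$$|(\psi_\gamma)_{-\gamma}\rangle = \begin{pmatrix} \cos(\frac{\theta}{2}) \\ e^{i\varphi} \sin(\frac{\theta}{2}) \end{pmatrix} = |\psi\rangle.$$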


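The composition of two rotations with angles $\gamma_1$ and $\gamma_2$ about the same axis is a rotation with angle $\gamma_1 + \gamma_2$ about that axis

$$R_{\vec{x}}(\gamma_1) R_{\vec{x}}(\gamma_2) = R_{\vec{x}}(\gamma_1 + \gamma_2)$$

$$R_{\vec{y}}(\gamma_1) R_{\vec{y}}(\gamma_2) = R_{\vec{y}}(\gamma_1 + \gamma_2)$$

$$R_{\vec{z}}(\gamma_1) R_{\vec{z}}(\gamma_2) = R_{\vec{z}}(\gamma_1 + \gamma_2).$$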

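We only discuss the second identity and leave the derivation of the others as an exercise. Recall that [56]

$$\sin(\gamma_1 \pm \gamma_2) = \sin(\gamma_1) \cos(\gamma_2) \pm \cos(\gamma_1) \sin(\gamma_2)$$

$$\cos(\gamma_1 \pm \gamma_2) = \cos(\gamma_1) \cos(\gamma_2) \mp \sin(\gamma_1) \sin(\gamma_2)$$

We have

$$R_{\vec{y}}(\gamma_1) R_{\vec{y}}(\gamma_2) = \begin{pmatrix} \cos(\gamma_1/2) & -\sin(\gamma_1/2) \\ \sin(\gamma_1/2) & \cos(\gamma_1/2) \end{pmatrix} \begin{pmatrix} \cos(\gamma_2/2) & -\sin(\gamma_2/2) \\ \sin(\gamma_2/2) & \cos(\gamma_2/2) \end{pmatrix}$$

$$= \begin{pmatrix} \cos(\gamma_1/2)\cos(\gamma_2/2) - \sin(\gamma_1/2)\sin(\gamma_2/2) & -[\sin(\gamma_1/2)\cos(\gamma_2/2) + \cos(\gamma_1/2)\sin(\gamma_2/2)] \\ \sin(\gamma_1/2)\cos(\gamma_2/2) + \cos(\gamma_1/2)\sin(\gamma_2/2) & \cos(\gamma_1/2)\cos(\gamma_2/2) - \sin(\gamma_1/2)\sin(\gamma_2/2) \end{pmatrix}$$

$$= \begin{pmatrix} \cos[(\gamma_1 + \gamma_2)/2] & -\sin[(\gamma_1 + \gamma_2)/2] \\ \sin[(\gamma_1 + \gamma_2)/2] & \cos[(\gamma_1 + \gamma_2)/2] \end{pmatrix} = R_{\vec{y}}(\gamma_1 + \gamma_2)$$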
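Both properties of the rotation operators discussed in this section, reversibility and the additivity of angles about a fixed axis, can be verified numerically. The following Python sketch builds the three rotation matrices from their closed forms; the function names `rx`, `ry`, `rz`, `mat_mul`, and `close` are illustrative, and the test angles are arbitrary.

```python
import math

def rx(g):
    c, s = math.cos(g / 2), math.sin(g / 2)
    return ((c, -1j * s), (-1j * s, c))

def ry(g):
    c, s = math.cos(g / 2), math.sin(g / 2)
    return ((c, -s), (s, c))

def rz(g):
    c, s = math.cos(g / 2), math.sin(g / 2)
    # Diagonal entries are exp(-i g/2) and exp(+i g/2).
    return ((c - 1j * s, 0), (0, c + 1j * s))

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def close(m1, m2, tol=1e-12):
    return all(abs(m1[i][j] - m2[i][j]) < tol
               for i in range(2) for j in range(2))

IDENTITY = ((1, 0), (0, 1))
g1, g2 = 0.4, 1.1
for rot in (rx, ry, rz):
    reversible = close(mat_mul(rot(g1), rot(-g1)), IDENTITY)
    additive = close(mat_mul(rot(g1), rot(g2)), rot(g1 + g2))
    print(rot.__name__, reversible, additive)
```

Each line of output reports, for one axis, whether the two identities hold within the chosen tolerance. Note that additivity holds only for rotations about the same axis; rotations about different axes do not commute in general.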

4.4 The Measurement of a Qubit

Quantum measurements of a qubit in state

$$|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$$

are characterized by a set of operators, $\{M_i\}$. The probability that the outcome with index $i$ occurs as a result of the measurement is

$$p(i) = \langle\psi | M_i^\dagger M_i | \psi\rangle.$$

The sum of the probabilities of all possible outcomes of the measurement must be 1

$$\sum_i p(i) = \sum_i \langle\psi | M_i^\dagger M_i | \psi\rangle = 1.$$

The measurement causes the qubit to change its state. If the state of the qubit immediately prior to the measurement is $|\psi\rangle$, then, after the measurement $M_i$, it becomes

$$|\psi\rangle \implies |\varphi_i\rangle = \frac{M_i |\psi\rangle}{\sqrt{\langle\psi | M_i^\dagger M_i | \psi\rangle}}$$

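There are only two possible outcomes of a measurement in the $|0\rangle$, $|1\rangle$ basis; we can only observe the basis states $|0\rangle$ or $|1\rangle$, see Figure 21.

Figure 21: The effect of a measurement upon the state of one qubit (left: possible states of one qubit before the measurement; right: the state of the qubit after the measurement, reached with probabilities $p_0$ and $p_1$). A measurement forces a qubit in a superposition state to one of the two basis states. A qubit in the superposition state $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$ behaves like $|0\rangle$ with probability $p_0 = |\alpha_0|^2$ and like $|1\rangle$ with probability $p_1 = |\alpha_1|^2$.

The corresponding measurement operators are

$$M_0 = |0\rangle\langle 0| = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad M_1 = |1\rangle\langle 1| = \begin{pmatrix} 0 \\ 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$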

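It is easy to see that these two operators are Hermitian. A direct computation of the adjoints of $M_0$ and $M_1$ shows that

$$M_0^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = M_0,$$

and

$$M_1^\dagger = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = M_1.$$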


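Using this result we show that $M_0^\dagger M_0 = M_0$ and $M_1^\dagger M_1 = M_1$.

$$M_0^\dagger M_0 = M_0^2 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = M_0$$

and

$$M_1^\dagger M_1 = M_1^2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = M_1.$$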

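The probability of the outcome corresponding to the eigenvector $|0\rangle$ is

$$p_0 = \langle\psi | M_0^\dagger M_0 | \psi\rangle = \langle\psi | M_0 | \psi\rangle$$

We calculate first the effect of applying the operator $M_0$ to the qubit state $|\psi\rangle$

$$M_0 |\psi\rangle = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = \begin{pmatrix} \alpha_0 \\ 0 \end{pmatrix}$$

The bra vector $\langle\psi|$ is the dual of the original state, $|\psi\rangle = (\alpha_0 \;\; \alpha_1)^T$

$$\langle\psi| = \begin{pmatrix} \alpha_0^* & \alpha_1^* \end{pmatrix}$$

thus

$$p_0 = \langle\psi| (M_0 |\psi\rangle) = \begin{pmatrix} \alpha_0^* & \alpha_1^* \end{pmatrix}\begin{pmatrix} \alpha_0 \\ 0 \end{pmatrix} = |\alpha_0|^2$$

But

$$M_0 |\psi\rangle = \begin{pmatrix} \alpha_0 \\ 0 \end{pmatrix} = \alpha_0 \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \alpha_0 |0\rangle.$$

Thus, the state of the qubit after applying the measurement operator $M_0$ is

$$|\varphi_0\rangle = \frac{M_0 |\psi\rangle}{\sqrt{\langle\psi | M_0^\dagger M_0 | \psi\rangle}} = \frac{\alpha_0 |0\rangle}{|\alpha_0|} = \frac{\alpha_0}{|\alpha_0|} |0\rangle$$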

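Similarly, the probability of the outcome corresponding to $|1\rangle$ is

$$p_1 = \langle\psi | M_1^\dagger M_1 | \psi\rangle = \langle\psi | M_1 | \psi\rangle$$

The effect of applying the operator $M_1$ to the state $|\psi\rangle$ is

$$M_1 |\psi\rangle = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = \begin{pmatrix} 0 \\ \alpha_1 \end{pmatrix}$$

$$M_1 |\psi\rangle = \alpha_1 |1\rangle$$

Then

$$p_1 = \langle\psi| (M_1 |\psi\rangle) = \begin{pmatrix} \alpha_0^* & \alpha_1^* \end{pmatrix}\begin{pmatrix} 0 \\ \alpha_1 \end{pmatrix} = |\alpha_1|^2$$

The state of the qubit after applying the measurement operator $M_1$ is

$$|\varphi_1\rangle = \frac{M_1 |\psi\rangle}{\sqrt{\langle\psi | M_1^\dagger M_1 | \psi\rangle}} = \frac{\alpha_1 |1\rangle}{|\alpha_1|} = \frac{\alpha_1}{|\alpha_1|} |1\rangle$$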

The completeness condition is satisfied

$$p(0) + p(1) = |\alpha_0|^2 + |\alpha_1|^2 = 1.$$
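The measurement rules derived in this section can be summarized in a short Python sketch. It computes, for each of the two operators, the outcome probability and the normalized post-measurement state; the function name `measure` and the sample state are assumptions made for the illustration.

```python
import math

M0 = ((1, 0), (0, 0))  # |0><0|
M1 = ((0, 0), (0, 1))  # |1><1|

def measure(M, psi):
    """Return (probability, post-measurement state) for the operator M."""
    (a, b), (c, d) = M
    m_psi = (a * psi[0] + b * psi[1], c * psi[0] + d * psi[1])
    # p = <psi| M^dagger M |psi> is the squared norm of M|psi>.
    p = abs(m_psi[0]) ** 2 + abs(m_psi[1]) ** 2
    if p == 0:
        return 0.0, None  # this outcome never occurs
    norm = math.sqrt(p)
    return p, (m_psi[0] / norm, m_psi[1] / norm)

psi = (0.6, 0.8j)
p0, post0 = measure(M0, psi)
p1, post1 = measure(M1, psi)
print(p0, post0)
print(p1, post1)
```

For this state the two probabilities are 0.36 and 0.64 up to rounding, and the post-measurement states are the basis states multiplied by the unit-modulus factors α0/|α0| and α1/|α1|, as in the derivation above.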

4.5 Pure and Impure States of a Qubit

We can define the Bloch sphere more rigorously using the density matrix of a single qubit. Informally, the elements of the density matrix give the probabilities of the possible outcomes of a measurement performed on a quantum system. The density operator is

$$\rho = \sum_\mu p_\mu |\psi_\mu\rangle\langle\psi_\mu|$$

with $p_\mu$ the probability of state $|\psi_\mu\rangle$. The individual states in the expression of the density matrix are

$$|\psi_\mu\rangle = \sum_i \alpha_{\mu i} |\phi_i\rangle$$

where $\{|\phi_i\rangle\}$ is the set of basis vectors for some observable operator $M$ and $\alpha_{\mu i}$ are the inner products $\langle\phi_i | \psi_\mu\rangle$.

The density operator is Hermitian, $\rho = \rho^\dagger$. Indeed, for a pure state $|\psi\rangle$

$$\rho = |\psi\rangle\langle\psi|$$

and for a mixed state, $\rho$ is a sum of Hermitian operators with real weights, thus it is Hermitian. The trace of a matrix is the sum of its main diagonal elements, and the trace of the density matrix is $\mathrm{Tr}(\rho) = 1$. Indeed,

$$\mathrm{Tr}(\rho) = \sum_i p_i = 1.$$

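Using these constraints, it is relatively easy to prove that the most general form of the density matrix for a single qubit is a 2 × 2 matrix obtained as the sum of the Pauli matrices $(\sigma_x, \sigma_y, \sigma_z)$ and the identity matrix $(I)$ with real coefficients $(\beta_x, \beta_y, \beta_z)$

$$\rho = \frac{1}{2}(I + \beta_x \sigma_x + \beta_y \sigma_y + \beta_z \sigma_z) = \frac{1}{2}\begin{pmatrix} 1 + \beta_z & \beta_x - i\beta_y \\ \beta_x + i\beta_y & 1 - \beta_z \end{pmatrix}$$

The density matrix must have a non-negative determinant

$$\begin{vmatrix} 1 + \beta_z & \beta_x - i\beta_y \\ \beta_x + i\beta_y & 1 - \beta_z \end{vmatrix} \ge 0$$

This means that

$$(1 + \beta_z)(1 - \beta_z) - (\beta_x - i\beta_y)(\beta_x + i\beta_y) = 1 - \beta_x^2 - \beta_y^2 - \beta_z^2 \ge 0.$$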


In turn, this inequality can be written as

$$\beta^2 = \beta_x^2 + \beta_y^2 + \beta_z^2 \le 1.$$

It implies that there is a one-to-one correspondence between the possible density matrices of a single qubit and the points of the unit ball $\beta \le 1$, bounded by the Bloch sphere. At the same time we get a criterion for distinguishing the pure states of a single qubit from the impure states.

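Let us first compute $\rho^2$

$$\rho^2 = \frac{1}{2}\begin{pmatrix} 1 + \beta_z & \beta_x - i\beta_y \\ \beta_x + i\beta_y & 1 - \beta_z \end{pmatrix} \frac{1}{2}\begin{pmatrix} 1 + \beta_z & \beta_x - i\beta_y \\ \beta_x + i\beta_y & 1 - \beta_z \end{pmatrix}$$

Then

$$\rho^2 = \frac{1}{4}\begin{pmatrix} 1 + \beta_x^2 + \beta_y^2 + \beta_z^2 + 2\beta_z & 2(\beta_x - i\beta_y) \\ 2(\beta_x + i\beta_y) & 1 + \beta_x^2 + \beta_y^2 + \beta_z^2 - 2\beta_z \end{pmatrix}$$

Now

$$\mathrm{Tr}(\rho^2) = \frac{1}{4}\left[(1 + \beta_x^2 + \beta_y^2 + \beta_z^2 + 2\beta_z) + (1 + \beta_x^2 + \beta_y^2 + \beta_z^2 - 2\beta_z)\right] = \frac{1}{2}(1 + \beta^2).$$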

Pure states are represented by points on the Bloch sphere; in that case $\beta^2 = 1$ and this implies that $\mathrm{Tr}(\rho^2) = 1$. Impure states are represented by points inside the sphere, $\beta^2 < 1$, and this implies that $\mathrm{Tr}(\rho^2) < 1$.
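The relation between the trace of the squared density matrix and the length of the vector (βx, βy, βz) is easy to test numerically. The sketch below, in plain Python, builds the density matrix from the three coefficients and compares the two sides of the identity; the function names `density_matrix` and `purity` are illustrative choices.

```python
def density_matrix(bx, by, bz):
    """rho = (I + bx*sigma_x + by*sigma_y + bz*sigma_z) / 2."""
    return (((1 + bz) / 2, (bx - 1j * by) / 2),
            ((bx + 1j * by) / 2, (1 - bz) / 2))

def purity(rho):
    """Tr(rho^2), computed as the sum over i, j of rho[i][j] * rho[j][i]."""
    total = sum(rho[i][j] * rho[j][i] for i in range(2) for j in range(2))
    return total.real

for beta in ((0, 0, 1), (0.3, 0.4, 0), (0, 0, 0)):
    b2 = sum(b * b for b in beta)
    print(beta, purity(density_matrix(*beta)), (1 + b2) / 2)
```

The first vector lies on the sphere and corresponds to a pure state; the other two lie inside it and correspond to impure states, with the center of the sphere giving the smallest possible value, 1/2.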

4.6 Two Qubits. Entanglement.

We now consider a system of two qubits. We extend our mathematical support system to a four-dimensional complex vector space, $H_4$. In this Hilbert space of dimension 4 we choose as a basis four vectors, $|00\rangle$, $|01\rangle$, $|10\rangle$, and $|11\rangle$. A vector describing the state $|\psi\rangle$ of a two-qubit system is a linear combination of the basis vectors with complex coefficients $\alpha_{00}$, $\alpha_{01}$, $\alpha_{10}$, and $\alpha_{11}$

$$|\psi\rangle = \alpha_{00} |00\rangle + \alpha_{01} |01\rangle + \alpha_{10} |10\rangle + \alpha_{11} |11\rangle.$$

When we measure/observe a pair of qubits we project the state $|\psi\rangle$ of the system onto one of the four basis states $|00\rangle$, $|01\rangle$, $|10\rangle$, and $|11\rangle$ with probabilities $|\alpha_{00}|^2$, $|\alpha_{01}|^2$, $|\alpha_{10}|^2$, and $|\alpha_{11}|^2$, respectively. The normalization condition reflects the fact that the norm of the vector $|\psi\rangle$ is one, thus the sum of probabilities must be one

$$|\alpha_{00}|^2 + |\alpha_{01}|^2 + |\alpha_{10}|^2 + |\alpha_{11}|^2 = 1.$$

Note that before the measurement of the two qubits, the state is uncertain. After the measurement the state is certain; it is $|00\rangle$, $|01\rangle$, $|10\rangle$, or $|11\rangle$, similar to the case of a classical two-bit system.

But nobody forces us to observe both qubits. What if we observe only the first qubit? What conclusions can we draw? Intuitively, we expect the system to be left in an uncertain state, because we did not measure the second qubit, which can still be in a continuum of states. The first qubit can be

(i) 0 with probability $|\alpha_{00}|^2 + |\alpha_{01}|^2$, or

(ii) 1 with probability $|\alpha_{10}|^2 + |\alpha_{11}|^2$.

The normalization condition is satisfied; the sum of the two probabilities is unity.

Call $|\psi_0^I\rangle$ the post-measurement state when the first qubit is measured to be 0 and $|\psi_1^I\rangle$ when the first qubit is measured to be 1. The two post-measurement states are

$$|\psi_0^I\rangle = \frac{\alpha_{00} |00\rangle + \alpha_{01} |01\rangle}{\sqrt{|\alpha_{00}|^2 + |\alpha_{01}|^2}} \quad \text{and} \quad |\psi_1^I\rangle = \frac{\alpha_{10} |10\rangle + \alpha_{11} |11\rangle}{\sqrt{|\alpha_{10}|^2 + |\alpha_{11}|^2}}.$$

Now let us measure the second qubit only. The second qubit can be

(i) 0 with probability $|\alpha_{00}|^2 + |\alpha_{10}|^2$, or

(ii) 1 with probability $|\alpha_{01}|^2 + |\alpha_{11}|^2$.

Call $|\psi_0^{II}\rangle$ the post-measurement state when the second qubit is measured to be 0 and $|\psi_1^{II}\rangle$ when the second qubit is measured to be 1. The two post-measurement states are

$$|\psi_0^{II}\rangle = \frac{\alpha_{00} |00\rangle + \alpha_{10} |10\rangle}{\sqrt{|\alpha_{00}|^2 + |\alpha_{10}|^2}} \quad \text{and} \quad |\psi_1^{II}\rangle = \frac{\alpha_{01} |01\rangle + \alpha_{11} |11\rangle}{\sqrt{|\alpha_{01}|^2 + |\alpha_{11}|^2}}.$$
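The rules for measuring a single qubit of a two-qubit system can be sketched in Python as follows. The state is represented, for the purpose of this illustration only, as a dictionary from the labels "00", "01", "10", "11" to amplitudes; the function name `measure_qubit` and the sample amplitudes are assumptions.

```python
import math

def measure_qubit(state, position):
    """Measure one qubit of a two-qubit state.

    `state` maps the labels "00", "01", "10", "11" to amplitudes and
    `position` is 0 for the first qubit or 1 for the second. The result
    maps each outcome "0"/"1" to (probability, post-measurement state).
    """
    results = {}
    for outcome in "01":
        kept = {k: a for k, a in state.items() if k[position] == outcome}
        p = sum(abs(a) ** 2 for a in kept.values())
        post = None
        if p > 0:
            # Renormalize the surviving amplitudes.
            post = {k: a / math.sqrt(p) for k, a in kept.items()}
        results[outcome] = (p, post)
    return results

state = {"00": 0.5, "01": 0.5j, "10": -0.5, "11": 0.5}
first = measure_qubit(state, 0)
second = measure_qubit(state, 1)
print(first["0"])
print(second["1"])
```

Each post-measurement state keeps only the two basis states compatible with the observed value and is rescaled to unit norm, which is exactly the role of the square roots in the denominators above.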

Let us now consider a special state of a two-qubit system when

$$\alpha_{00} = \alpha_{11} = 1/\sqrt{2} \quad \text{and} \quad \alpha_{01} = \alpha_{10} = 0.$$

This state is called a Bell state and the pair of qubits is called an EPR pair. In this state, when we measure the first qubit the two possible outcomes are

(i) 0 with probability 1/2, and

(ii) 1 with probability 1/2.

The corresponding post-measurement states are

$$|\psi_0^I\rangle = |00\rangle \quad \text{and} \quad |\psi_1^I\rangle = |11\rangle.$$

When we measure the second qubit, the two possible outcomes are

(i) 0 with probability 1/2, and

(ii) 1 with probability 1/2.

The corresponding post-measurement states are

$$|\psi_0^{II}\rangle = |00\rangle \quad \text{and} \quad |\psi_1^{II}\rangle = |11\rangle.$$

This is quite an amazing result! The two measurements are correlated: once we measure the first qubit we get exactly the same result as when we measure the second one. The two qubits need not be physically constrained to be at the same location and yet, because of the strong coupling between them, measurements performed on the second one allow us to determine the state of the first.

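There are four special states $|\beta_{00}\rangle$, $|\beta_{01}\rangle$, $|\beta_{10}\rangle$, and $|\beta_{11}\rangle$, called the Bell states, which form an orthonormal basis

$$|\beta_{00}\rangle = \frac{|00\rangle + |11\rangle}{\sqrt{2}}$$

$$|\beta_{01}\rangle = \frac{|01\rangle + |10\rangle}{\sqrt{2}}$$

$$|\beta_{10}\rangle = \frac{|00\rangle - |11\rangle}{\sqrt{2}}$$

$$|\beta_{11}\rangle = \frac{|01\rangle - |10\rangle}{\sqrt{2}}$$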
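That the four Bell states are mutually orthogonal unit vectors can be confirmed by computing all sixteen inner products. The Python sketch below does so; the amplitude ordering, the dictionary keys, and the helper name `inner` are conventions adopted for this illustration.

```python
import math

S = 1 / math.sqrt(2)
# Amplitudes listed in the order |00>, |01>, |10>, |11>.
BELL = {
    "b00": (S, 0, 0, S),
    "b01": (0, S, S, 0),
    "b10": (S, 0, 0, -S),
    "b11": (0, S, -S, 0),
}

def inner(phi, psi):
    """<phi|psi>: conjugate the first argument, then sum the products."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

gram = {(j, k): inner(BELL[j], BELL[k]) for j in BELL for k in BELL}
is_orthonormal = all(
    abs(value - (1 if j == k else 0)) < 1e-12 for (j, k), value in gram.items()
)
print(is_orthonormal)
```

The table `gram` is the Gram matrix of the four states; it equals the identity matrix, up to rounding, exactly when the states form an orthonormal set.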

The Bell states can be distinguished from one another. All four states are called maximally entangled. The last one is called an anti-correlated state.

An EPR pair is in one of the four Bell states. Einstein, Podolsky, and Rosen (thus, the name EPR pair) were the first to discover the strange behavior of states like the Bell states. This strange behavior hints at possible applications of quantum information that are well beyond what we could possibly envision in a classical universe. This is the basis of a phenomenon called teleportation.

4.7 The Fragility of Quantum Information. Schrodinger’s Cat

By now we should be convinced that classical information can be encoded into a quantum system. Having a potentially infinite number of states, a qubit offers dazzling possibilities; two qubits are truly amazing, as we have seen in the case of EPR pairs. This seems too good to be true; is there a catch? Yes, the catch is that quantum information is encoded into very fragile nonlocal correlations of different parts of a physical system. Why do we say that these correlations are “fragile”? Because a quantum system interacts continually with its environment and these interactions with the environment destroy the correlations encoded into the quantum system. In time, the internal correlations of the quantum system are transferred into correlations between the quantum system and the environment and the information encoded into the quantum system is lost.

Schrodinger gave an extreme example of the apparent ambiguity of quantum information and of superposition states. Consider the quantum description of a physical entity we are familiar with, a cat

$$|\text{cat}\rangle = \frac{1}{\sqrt{2}}(|\text{dead}\rangle + |\text{alive}\rangle).$$

Since all cats we have ever seen are either dead or alive, a layperson would view Schrodinger’s cat as a definitive proof of foolishness; even Schrodinger himself considered this example a blemish on his theory [91]. Today we are more sophisticated and realize that in fact the state $|\text{cat}\rangle$, though possible, is extremely rare; it can be constructed as a superposition of two possible states a cat can be in, dead and alive, but it would be immediately transferred to correlations between the cat and the environment and become inaccessible. All the cats we have ever seen are in fact projections of a cat generated by the environment into either the dead or the alive state.

What can possibly go wrong with quantum information? A very serious problem is that we disturb the state of the quantum system when we attempt to measure it, as we discussed above. Another concern is that quantum information cannot be copied with fidelity. What about errors; what types of errors should we expect?

With classical bits we are aware of bit-flip errors: a 0 becomes a 1 and a 1 becomes a 0. We should expect the same to happen to qubits

$$| 0\rangle \leftrightarrow | 1\rangle \quad \text{and} \quad | 1\rangle \leftrightarrow | 0\rangle.$$

In addition, a qubit may experience a phase error; the phase may flip and then

$$| 0\rangle \leftrightarrow | 0\rangle \quad \text{and} \quad | 1\rangle \leftrightarrow - | 1\rangle.$$

Quantum information is continuous. If the state of a qubit is

$$| \psi\rangle = \alpha_0 | 0\rangle + \alpha_1 | 1\rangle$$

then either $\alpha_0$ or $\alpha_1$, or both, may change by an infinitesimal quantity, $\epsilon$, and then we experience an error of a new type, one we cannot encounter when dealing with classical bits.
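The three kinds of errors listed above can be illustrated with a minimal Python sketch. The function names (`bit_flip`, `phase_flip`, `small_rotation`) are hypothetical, and the small rotation is only one assumed model of a continuous error; a qubit state is stored as a pair of amplitudes $(\alpha_0, \alpha_1)$.

```python
import math

def bit_flip(state):
    """Bit-flip error: swap the amplitudes of |0> and |1>."""
    a0, a1 = state
    return (a1, a0)

def phase_flip(state):
    """Phase-flip error: |0> is unchanged, |1> picks up a minus sign."""
    a0, a1 = state
    return (a0, -a1)

def small_rotation(state, eps):
    """Assumed model of a continuous error: rotate the amplitudes by a
    small angle eps. The norm of the state is preserved."""
    a0, a1 = state
    c, s = math.cos(eps), math.sin(eps)
    return (c * a0 - s * a1, s * a0 + c * a1)

psi = (math.sqrt(0.8), math.sqrt(0.2))
print(bit_flip(psi))
print(phase_flip(psi))
print(small_rotation(psi, 1e-3))
```

The first two errors have discrete analogues or near analogues; the third changes the amplitudes by an arbitrarily small amount, which has no counterpart for a classical bit.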

4.8 Qubits, from Hilbert Spaces to Physical Implementation

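Let us now summarize what we have learned so far about qubits. Recall that an n-dimensional Hilbert space $\mathcal{H}_n$ is a vector space over the field of complex numbers equipped with an inner product. If $| \psi\rangle, | \phi\rangle \in \mathcal{H}_n$, the inner product is denoted by $\langle\phi | \psi\rangle$ and the norm is defined by

$$\big\| \, | \phi\rangle \, \big\| = \sqrt{\langle\phi | \phi\rangle}.$$

We use Dirac's notation for the inner product between two n-dimensional vectors $| \phi\rangle = \alpha_0 | e_0\rangle + \alpha_1 | e_1\rangle + \ldots + \alpha_{n-1} | e_{n-1}\rangle$ and $| \psi\rangle = \beta_0 | e_0\rangle + \beta_1 | e_1\rangle + \ldots + \beta_{n-1} | e_{n-1}\rangle$, with $| e_0\rangle, | e_1\rangle, \ldots, | e_{n-1}\rangle$ the vectors of an orthonormal basis [43]. Then the inner product of the two vectors is

$$\langle\phi | \psi\rangle = \sum_{i=0}^{n-1} \alpha_i^{*} \beta_i,$$

where the asterisk, (*), denotes complex conjugation. We can think of this as the product of the row vector $\langle\phi | = (\alpha_0^{*}, \alpha_1^{*}, \ldots, \alpha_{n-1}^{*})$ by the column vector

$$| \psi\rangle = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{n-1} \end{pmatrix}.$$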
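As a small illustration of the inner product and the norm defined above, here is a sketch in Python; the helper names `inner` and `norm` and the two sample vectors are chosen for this example only.

```python
import math

def inner(phi, psi):
    """<phi|psi> = sum_i conj(alpha_i) * beta_i."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def norm(phi):
    """|| |phi> || = sqrt(<phi|phi>); <phi|phi> is always real."""
    return math.sqrt(inner(phi, phi).real)

s = 1 / math.sqrt(2)
phi = [s, 1j * s]    # (|e0> + i|e1>)/sqrt(2)
psi = [s, -1j * s]   # (|e0> - i|e1>)/sqrt(2)

print(norm(phi))        # both vectors have unit norm
print(inner(phi, psi))  # and they are orthogonal to each other
```

Note that the conjugation is applied to the components of the first vector, exactly as in the row-vector form of $\langle\phi |$.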

A qubit is a microscopic system, e.g., the spin of an electron or the polarization of a photon, and may exist in a continuum of "intermediate states", or "superpositions". We have already seen that individual states of a quantum system consisting of one qubit can be represented as unit vectors in a two-dimensional complex vector space denoted as $\mathcal{H}_2$.

We can choose different basis vectors associated with "basis states" to describe the "intermediate states" of a single qubit. For example, the polarization of a photon can be described using as basis vectors $| 0\rangle$ and $| 1\rangle$, or $\frac{1}{\sqrt{2}}(| 0\rangle + | 1\rangle)$ and $\frac{1}{\sqrt{2}}(| 0\rangle - | 1\rangle)$ (see footnote 27), or any other pair of orthogonal vectors in $\mathcal{H}_2$.

Our ability to distinguish between the states of a single qubit is limited [17]. We have already seen that the superposition $| \psi\rangle = \alpha_0 | 0\rangle + \alpha_1 | 1\rangle$ behaves like $| 0\rangle$ with probability $| \alpha_0 |^2$ and like $| 1\rangle$ with probability $| \alpha_1 |^2$.

This gets even more complicated; two quantum states of one qubit can be reliably distinguished only if they are orthogonal, that is, only if they are members of the same orthonormal basis. We can distinguish $| 0\rangle$ from $| 1\rangle$ but we cannot reliably distinguish $| 0\rangle$ from $\frac{1}{\sqrt{2}}(| 0\rangle - | 1\rangle)$. One last observation: quantum states form equivalence classes; multiplication by a phase factor $e^{i\gamma}$ does not alter the state represented by a unit vector. Therefore, the quantum state of one qubit is a "ray" in $\mathcal{H}_2$, the equivalence class of a vector under multiplication by a nonzero complex constant.

We expect more complications for quantum systems with more than one qubit. We have seen in Section 4.6 that a two-qubit system may be in one of four basis states, $| 00\rangle$, $| 01\rangle$, $| 10\rangle$ and $| 11\rangle$, or in a superposition of them, in a Hilbert space with four dimensions, $\mathcal{H}_4$. The two qubits can either be in a state in which each qubit has a well defined state, such as

$$\frac{1}{\sqrt{2}} | 1\rangle \left[ | 0\rangle + | 1\rangle \right] = \frac{1}{\sqrt{2}} \left[ | 10\rangle + | 11\rangle \right]$$

or in an entangled state in which neither qubit has a defined state, though the pair does have a well defined state

$$\frac{1}{\sqrt{2}} ( | 00\rangle + | 11\rangle ).$$

In general a system of n qubits is represented by a complex unit vector in a $2^n$-dimensional Hilbert space, $\mathcal{H}_{2^n}$, defined as a tensor product of n two-dimensional Hilbert spaces

$$\mathcal{H}_{2^n} = (\mathcal{H}_2)^{\otimes n}.$$
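The difference between the two kinds of two-qubit states can be made concrete with a short Python sketch. The helpers `kron` and `is_product_state` are hypothetical names; the test used is the standard fact that a two-qubit pure state with amplitudes $(c_{00}, c_{01}, c_{10}, c_{11})$ factors into single-qubit states exactly when $c_{00}c_{11} - c_{01}c_{10} = 0$.

```python
import math

def kron(u, v):
    """Tensor product of two state vectors given as lists of amplitudes."""
    return [a * b for a in u for b in v]

def is_product_state(state, tol=1e-12):
    """True when the two-qubit state (c00, c01, c10, c11) is a tensor
    product of two single-qubit states, i.e. c00*c11 - c01*c10 = 0."""
    c00, c01, c10, c11 = state
    return abs(c00 * c11 - c01 * c10) < tol

s = 1 / math.sqrt(2)
ket1 = [0, 1]
plus = [s, s]                # (|0> + |1>)/sqrt(2)

product = kron(ket1, plus)   # (|10> + |11>)/sqrt(2)
bell = [s, 0, 0, s]          # (|00> + |11>)/sqrt(2)

print(is_product_state(product))  # each qubit has its own state
print(is_product_state(bell))     # entangled: no such factorization

# The state vector of n qubits has 2**n amplitudes.
three_qubits = kron(kron(plus, plus), plus)
print(len(three_qubits))
```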

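27 In the example in Section 2.8 we used the more intuitive notation for the basis vectors, $| \updownarrow\rangle$ and $| \leftrightarrow\rangle$ instead of $| 0\rangle$ and $| 1\rangle$, and $| \nearrow\rangle$ and $| \nwarrow\rangle$ instead of $\frac{1}{\sqrt{2}}(| 0\rangle + | 1\rangle)$ and $\frac{1}{\sqrt{2}}(| 0\rangle - | 1\rangle)$.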

Classical systems consisting of many components can be described by providing separately the state of each individual component. It follows that the complexity of the description of a classical system, that is, the number of parameters needed to specify its state, grows linearly with the number of components, or bits. For a quantum system the relation $\mathcal{H}_{2^n} = (\mathcal{H}_2)^{\otimes n}$ shows that the dimensionality of the state space grows exponentially with the number of components, or qubits. The majority of the states of a quantum system are entangled and they are responsible both for the immense power of a quantum computer and, at the same time, for the difficulties of building one.

The evolution of a quantum system in isolation is unitary; it is linear and conserves the inner product in a Hilbert space. This means that the evolution of the system preserves superposition and distinguishability of the system states. A superposition of the input states of a quantum system consisting of a number n > 1 of qubits evolves into a corresponding superposition of output states, as we see in the next chapter devoted to quantum gates and quantum circuits. Most gates map unentangled initial states of the quantum system into entangled output states. Conventional computations and communication destroy the entanglement, while quantum operations can create entanglement, preserve it, and use it to speed up computations and to transmit information over quantum channels.

Suppose that we have a qubit in a general normalized (unknown) state

$$| \psi\rangle = \alpha_0 | 0\rangle + \alpha_1 | 1\rangle.$$

If we want to determine the computational value of this qubit, we have to perform a measurement of the qubit or, more exactly, of that particular property (observable) of the quantum system through which the qubit is implemented. The choice of the observable determines the frame of reference for the measurement; it specifies a basis, such as $\{| 0\rangle, | 1\rangle\}$, in the Hilbert space of the system. The basis vectors are eigenvectors of the observable operator. Mathematically, the physical measurement represents a projection of the state of the qubit onto this basis. The system randomly changes from the initial state characterized by the state vector $| \psi\rangle$ either to the state with state vector $| 0\rangle$, with probability $| \alpha_0 |^2$, or to the state with state vector $| 1\rangle$, with probability $| \alpha_1 |^2$. The computational value of the qubit obtained as a result of the measurement is a real number; it is one of the eigenvalues $\lambda_0$ and $\lambda_1$ associated with the eigenvectors $| 0\rangle$ and $| 1\rangle$ of the observable operator, respectively.

As a result of the measurement we are not able to determine the initial state of the qubit. The coefficients $\alpha_0$ and $\alpha_1$ cannot be determined with this single measurement; what we do is prepare the qubit in a known state which, in general, is different from the initial one. The state vector of this known state is now one of the eigenvectors ($| 0\rangle$ or $| 1\rangle$) of the measured observable. A qubit in a known state is what we need before we do any manipulation of it, i.e., before we apply any of the quantum gates.
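The measurement rule described above can be simulated classically. The following Python sketch is only an illustration: `measure` and `estimate_p0` are hypothetical helpers, and a pseudo-random number generator stands in for the intrinsic randomness of the measurement.

```python
import random

def measure(state, rng=random):
    """Measure alpha0|0> + alpha1|1> in the basis {|0>, |1>}.
    Returns the outcome (0 or 1) and the post-measurement state."""
    a0, a1 = state
    p0 = abs(a0) ** 2
    if rng.random() < p0:
        return 0, (1, 0)   # state projected onto |0>
    return 1, (0, 1)       # state projected onto |1>

def estimate_p0(state, shots, seed=0):
    """Estimate |alpha0|^2 from many identically prepared qubits."""
    rng = random.Random(seed)
    zeros = sum(1 for _ in range(shots) if measure(state, rng)[0] == 0)
    return zeros / shots

psi = (0.6, 0.8j)   # |alpha0|^2 = 0.36, |alpha1|^2 = 0.64
print(measure(psi))
print(estimate_p0(psi, 20000))
```

A single call to `measure` returns only one bit and a basis state; the amplitudes can be estimated only from the statistics of many measurements on identically prepared qubits, and even then only their moduli.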

Two physical systems leading to the simplest possible embodiments of a qubit are

(a) the electron with two independent spin values, ±1/2, and

(b) the photon, with two independent polarizations, say horizontal and vertical, or right-hand and left-hand (in the case of circular polarization).

The spin is an intrinsic angular momentum (see footnote 28) of a quantum particle, related to rotation about an intrinsic arbitrary direction. The observable associated with this property is characterized by the spin quantum number.

28 The intrinsic angular momentum of a quantum particle should be distinguished from the orbital angular momentum.

There are two classes of quantum particles: those with half-integer spin, such as spin 1/2, the so-called fermions, and those with integer spin, such as spin 1, the so-called bosons. The spin quantum number of a spin-1/2 fermion can be s = +1/2 or s = −1/2. The spin quantum number of a spin-1 boson can be s = +1, s = −1, or s = 0. The spin of a quantum particle can be observed as an interaction of the intrinsic angular momentum of the particle with an external magnetic field $\vec{B}$.

A beam of particles is

(a) completely polarized - if all particles are in a pure state,

(b) unpolarized - if particles have an equal probability to have any of the possible values of the spin, or

(c) partially polarized - if particles have unequal probabilities to have any of the possible values of the spin.

4.9 Qubits as Spin 1/2 Particles

The first embodiment of a qubit is the spin state of a particle with spin 1/2, such as the electron (see footnote 29). The concept of "spin" does not have a correspondent in classical mechanics. Classical mechanics operates with the concept of an "angular momentum" arising from a rotation around a well defined axis of the body. A quantum particle, such as the electron, rotates intrinsically about directions randomly oriented in space.

The observable associated with the electron rotation is the intrinsic angular momentum, also called the spin angular momentum of the electron. The "spin" is the quantum number characterizing the intrinsic angular momentum of the electron. The electron spin is found to have either the value s = +1/2 or s = −1/2 along the measurement axis, regardless of what that axis is; see Figure 22(a).

The $| 0\rangle$ and $| 1\rangle$ states of a qubit correspond to the "spin up" $| \uparrow\rangle$ and "spin down" $| \downarrow\rangle$ states of an electron along a chosen axis. Such an axis could be the x-, y-, or z-axis. It is convenient to represent the spin states as orthogonal unit vectors

$$| \uparrow\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad | \downarrow\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

No classical analogy can be used to study the spin. The theory of the spin is based upon the assumption that the components of the spin angular momentum are connected with the rotation operators $\sigma_x$, $\sigma_y$ and $\sigma_z$ [43]. The self-adjoint operators $S_x$, $S_y$, $S_z$ associated with the components of the intrinsic angular momentum of the electron can be expressed in a two-dimensional representation as

$$S_i = \frac{1}{2} \sigma_i, \quad i = x, y, z$$

where $\sigma_i$ are the Pauli matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ introduced in Section 4.1; they represent the components of the rotation operator along the x, y, and z axes in real space. The eigenvalues of $S_i$ are ±1/2 in units of $\hbar = 1$. Rotations about distinct axes do not commute, i.e., a series of rotations performed in different sequences will not have the same result. The operators associated with rotations about distinct axes are therefore noncommutative. Recall that distinct Pauli matrices mutually anticommute (a negative sign appears in the equalities relating them)

29 Protons and neutrons are other particles with spin 1/2.

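$$\sigma_x \sigma_y - \sigma_y \sigma_x = 2i\sigma_z,$$

$$\sigma_y \sigma_z - \sigma_z \sigma_y = 2i\sigma_x,$$

$$\sigma_z \sigma_x - \sigma_x \sigma_z = 2i\sigma_y,$$

and their squares are equal to I

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = I.$$

These four equations can be reduced to a simpler form

$$\sigma_y \sigma_z = -\sigma_z \sigma_y = i\sigma_x,$$

$$\sigma_z \sigma_x = -\sigma_x \sigma_z = i\sigma_y,$$

$$\sigma_x \sigma_y = -\sigma_y \sigma_x = i\sigma_z,$$

and $\sigma_x \sigma_y \sigma_z = iI$.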
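These relations are easy to check numerically. The sketch below uses plain Python lists for 2 × 2 matrices; the helper names (`matmul`, `scale`, `sub`, `close`, `commutator`) are introduced for this illustration only.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

print(close(commutator(SX, SY), scale(2j, SZ)))           # [sx, sy] = 2i sz
print(close(matmul(SX, SY), scale(-1, matmul(SY, SX))))   # sx sy = -sy sx
print(close(matmul(SX, SX), I2))                          # sx^2 = I
print(close(matmul(matmul(SX, SY), SZ), scale(1j, I2)))   # sx sy sz = i I
```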

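In Section 4.3 we have seen that finite rotations about a given unit vector $\vec{n}$ in the Bloch sphere can be represented as

$$R_{\vec{n}}(\theta) = \exp\left(-i \frac{\theta}{2} \, \vec{n} \cdot \vec{\sigma}\right) = \cos\left(\frac{\theta}{2}\right) I - i \sin\left(\frac{\theta}{2}\right) \vec{n} \cdot \vec{\sigma},$$

where $\vec{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$. It is easy to show that the matrices $R_{\vec{x}}(\theta)$, $R_{\vec{y}}(\theta)$, $R_{\vec{z}}(\theta)$ representing rotations about the individual axes $\vec{x}$, $\vec{y}$, $\vec{z}$ are unitary and have determinant 1. Moreover, any general 2 × 2 unitary matrix with determinant 1 can be expressed in this form.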
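The unitarity and unit determinant of the rotation matrices can be checked with a short Python sketch. The functions `rotation`, `det` and `is_unitary` are hypothetical helpers; the matrix entries follow from writing out $\vec{n} \cdot \vec{\sigma}$ for a unit vector $\vec{n} = (n_x, n_y, n_z)$.

```python
import math

def rotation(n, theta):
    """R_n(theta) = cos(theta/2) I - i sin(theta/2) n.sigma, n a unit vector."""
    nx, ny, nz = n
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c - 1j * s * nz, -1j * s * (nx - 1j * ny)],
            [-1j * s * (nx + 1j * ny), c + 1j * s * nz]]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_unitary(m, tol=1e-12):
    """Check that m times its conjugate transpose is the identity."""
    for i in range(2):
        for j in range(2):
            entry = sum(m[i][k] * m[j][k].conjugate() for k in range(2))
            if abs(entry - (1 if i == j else 0)) > tol:
                return False
    return True

R = rotation((0.6, 0.0, 0.8), 0.7)
print(is_unitary(R), det(R))

# A rotation by 2*pi about any axis gives -I rather than I.
print(rotation((1.0, 0.0, 0.0), 2 * math.pi))
```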

[Figure 22, panel (a): the spin component $S_z$ with the two values $+\frac{1}{2}\hbar$ and $-\frac{1}{2}\hbar$; panel (b): the rotations $R_n(0)$ and $R_n(180)$.]

Figure 22: The spin of the electron. (a) Electrons and other particles have intrinsic angular momentum characterized by the quantum number 1/2. (b) There are two rotation operators; the first keeps the spin unchanged, and the second flips the spin to an orthogonal state.

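When a qubit is represented as the spin state of an electron, or of any spin-1/2 particle, arbitrary unitary transformations acting on the state are, up to a global phase, rotations of the spin. Two special cases in $\mathcal{H}_2$ are the rotations by the angles θ = 0 and θ = 2π: the first is the identity and keeps the spin unchanged, while the second multiplies the state vector by −1, a global phase

$$\theta = 0 \implies R_{\vec{n}}(0) = \cos(0) I - i \sin(0) \, \vec{n} \cdot \vec{\sigma} = I$$

$$\theta = 2\pi \implies R_{\vec{n}}(2\pi) = \cos(\pi) I - i \sin(\pi) \, \vec{n} \cdot \vec{\sigma} = -I.$$

The rotation that flips the spin to an orthogonal state, as in Figure 22(b), is the rotation by θ = π (180°) about an axis perpendicular to the spin direction; for example, $R_{\vec{x}}(\pi) = -i\sigma_x$ maps $| \uparrow\rangle$ to $-i | \downarrow\rangle$.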

4.10 The Measurement of the Spin

Formally, measuring the spin along an axis $\vec{n}$ is equivalent to first applying a rotation transformation that rotates the $\vec{n}$ axis to the $\vec{z}$ axis and then measuring along $\vec{z}$. In the act of taking a measurement along a defined direction, chosen by the observer in the laboratory frame of reference, the electron assumes a definite axis of rotation.

How do we measure the spin? We use the intrinsic magnetic moment and let it interact with an external magnetic field. In a Stern-Gerlach type of experiment we can measure the spin component along the vertical axis by observing which way the spin-1/2 particle is deflected and by how much, as seen in Figure 23.

[Figure 23 labels: source of spin-1/2 particles (silver atoms); magnet with poles N and S; photographic plate used as detector.]

Figure 23: The measurement of the spin in a Stern-Gerlach type setup. The inhomogeneous magnetic field exerts a force on the electrons while they traverse the magnet. The force acts upward on an electron if its spin is up and downward if its spin is down. Once the electrons exit the magnet they continue on a rectilinear trajectory.

The applied magnetic field interacts with the spin magnetic moment of the unpaired electron in each silver atom, defines its axis of rotation, and separates the states of given spin. At the same time the magnetic field exerts a force on the electrons; the force acts upward on the electron if the spin points up and downward if the spin points down.

At any instant before a measurement, an electron's spin state $| \psi\rangle$ can be represented by a linear combination of the two possible observable states. The choice of a measurement direction is equivalent to choosing a basis for expressing the spin components of the state $| \psi\rangle$.

In the case of a spin s = 1/2, there are two such states, which are generally denoted by $| \uparrow\rangle$ and $| \downarrow\rangle$, according to the sign of the corresponding value, $| +\frac{1}{2}\rangle$ and $| -\frac{1}{2}\rangle$, respectively. These two states are vectors in a complex Hilbert space and have all the properties associated with such a space. They form an orthonormal basis in a two-dimensional Hilbert space. A general normalized state, for which positive and negative results are possible, can be constructed by linear superposition of these two states:

$$| \psi\rangle = c_{+} | \uparrow\rangle + c_{-} | \downarrow\rangle.$$

A qubit, as a state (vector) in a two-dimensional Hilbert space, can take any value of the form mentioned above. The coefficients $c_{+}$ and $c_{-}$ are arbitrary complex numbers, subject to the normalization condition

$$| c_{+} |^2 + | c_{-} |^2 = 1$$

and they contain all the information we can ever obtain about a quantum state of spin 1/2.

If this maximum information is available, the system is said to be in a pure state. In such a state, the probabilities for the two possible results of a measurement like the one in the Stern-Gerlach experiment are

$$P_{+} = | c_{+} |^2 \quad \text{and} \quad P_{-} = | c_{-} |^2$$

and they correspond to the two components of the beam split by the magnetic field, one deflected upwards and the other downwards, respectively. We may have two detectors, one for the upward deflected beam and the other for the downward deflected beam. Depending on which detector clicks, the initial general pure state of the system (electron or atom) is reduced to the spin state $| +\rangle$ or $| -\rangle$ through the measurement. In fact, we do not need two detectors. If we have one detector mounted on the upward-deflection path, then if the detector does not click, we conclude that the electron went downward and, therefore, it had downward spin and was in the $| -\rangle$ spin state.

The measuring apparatus consists of multiple components. Each of these components contributes to the overall definition of the measurement direction, and has its own vector basis. Therefore, the travel of the electron through the apparatus is associated with measurements in several bases. Each one of these measurements contributes to the overall probability of finding the electron in a particular final state.

A modified Stern-Gerlach apparatus containing more than one set of magnets can be used to filter an incident beam of atoms, so that after the last magnet we end up with a polarized beam consisting of atoms in a definite pure spin state.

In quantum mechanics, each possible measurement basis is associated with an operator whose eigenvalues represent the possible outcomes of the corresponding measurement. Consider three orthogonal axes x, y and z. Then the three principal components of the spin angular momentum of the atomic particle, $S_x$, $S_y$, and $S_z$, corresponding to measurements along the three axes x, y and z, are

$$S_x = \frac{1}{2} \sigma_x = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad S_y = \frac{1}{2} \sigma_y = \frac{1}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad S_z = \frac{1}{2} \sigma_z = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Here $\sigma_x$, $\sigma_y$, $\sigma_z$ are the Pauli matrices. It is easy to prove that any two-by-two matrix A, such as the Hamiltonian of any two-state system, can be represented as a linear combination of the Pauli matrices and the identity matrix I

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = c_0 I + c_1 \sigma_x + c_2 \sigma_y + c_3 \sigma_z.$$

In the case of an electron in a magnetic field the Pauli matrices have a special geometrical significance, but in the general case they can be used simply as matrices.
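The coefficients of the decomposition of a two-by-two matrix follow from comparing matrix entries: $c_0 = (a_{11} + a_{22})/2$, $c_1 = (a_{12} + a_{21})/2$, $c_2 = i(a_{12} - a_{21})/2$, and $c_3 = (a_{11} - a_{22})/2$. A minimal Python sketch, with hypothetical helper names `pauli_coefficients` and `rebuild`, verifies this on an arbitrary sample matrix.

```python
def pauli_coefficients(a):
    """Return (c0, c1, c2, c3) with A = c0 I + c1 sx + c2 sy + c3 sz."""
    (a11, a12), (a21, a22) = a
    c0 = (a11 + a22) / 2
    c1 = (a12 + a21) / 2
    c2 = 1j * (a12 - a21) / 2
    c3 = (a11 - a22) / 2
    return c0, c1, c2, c3

def rebuild(c0, c1, c2, c3):
    """Assemble c0 I + c1 sx + c2 sy + c3 sz entry by entry."""
    return [[c0 + c3, c1 - 1j * c2],
            [c1 + 1j * c2, c0 - c3]]

A = [[1, 2 - 1j], [3j, 4]]   # arbitrary sample matrix
coeffs = pauli_coefficients(A)
print(coeffs)
print(rebuild(*coeffs))      # reproduces A
```

For a Hermitian matrix, such as a Hamiltonian, all four coefficients come out real.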

The eigenvalues of the operator associated with the measurement (the spin angular momentum operator, in this case) along whichever direction we choose (x, y, or z) are the possible outcomes of the measurement (+1/2 or −1/2). The eigenvectors of the spin operators along the three principal directions, $S_x$, $S_y$, $S_z$, corresponding to the eigenvalues −1/2 and +1/2 are

$$\begin{array}{c|cc}
 & -\frac{1}{2} & +\frac{1}{2} \\ \hline
S_x & \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ 1 \end{pmatrix} & \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \\
S_y & \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -i \end{pmatrix} & \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix} \\
S_z & \begin{pmatrix} 0 \\ 1 \end{pmatrix} & \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\end{array}$$

Each pair of eigenvectors constitutes a basis for the state space. The state vector of the atomic particle can be expressed as a linear combination of the basis vectors for the desired measurement (along one of the principal axes), and the squared moduli of the coefficients give the probabilities of that measurement yielding the spin up or the spin down eigenvalue.

We may consider measuring the spin of an atomic particle along an arbitrary axis relative to the state vector of the particle. We assume that the magnetic field $\vec{B}$ in a Stern-Gerlach type experiment is not along the z-direction, but in a direction defined by the polar angles θ and ϕ. Here θ is the angle between the direction of the magnetic field and the positive z-axis, and ϕ is the azimuthal angle between the projection of the magnetic field in the xy-plane and the x-axis, as in Figure 24.

We assume that the particle has been prepared with its spin pointing along the magnetic field. We want to express the state of the electron in the vector basis corresponding to the z-axis as

$$| \psi_{\theta, \varphi}\rangle = \alpha_0 | \uparrow_z\rangle + \alpha_1 | \downarrow_z\rangle$$

where $| \uparrow_z\rangle$ and $| \downarrow_z\rangle$ are the spin up and spin down states along the z-axis.

We wish to determine the values of the coefficients $\alpha_0$ and $\alpha_1$, which give the probabilities of finding the atomic particle with its spin up, respectively down, along the z-axis, knowing that its spin is oriented along the axis defined by the angles θ and ϕ. Solving the Schrödinger equation for an atomic particle in a magnetic field (see footnote 30) shows that

$$\alpha_0 = \cos\frac{\theta}{2} \, e^{-i\varphi/2}, \qquad \alpha_1 = \sin\frac{\theta}{2} \, e^{+i\varphi/2},$$

up to an exponential factor with imaginary exponent which does not count in the evaluation of the probabilities.

Figure 24: The spin S of an atomic particle oriented along an axis defined by the polar angles θ and ϕ. Here θ is the angle between the direction of the magnetic field B and the positive z-axis, and ϕ is the azimuthal angle between the projection of the magnetic field in the xy-plane and the x-axis.

30 A clear presentation of the calculations to solve the Schrödinger equation for an atomic particle in a magnetic field can be found in [47].

We notice that the value of the magnetic field B does not appear in these two expressions; hence, the result is the same for the case when B = 0. This result is the solution of the more general problem of a particle of spin 1/2 whose spin is along an arbitrary axis. For example, assume that $| \uparrow_z\rangle$ represents a state with spin up along the z-axis and $| \downarrow_z\rangle$ represents the spin down state along the z-axis. Given another vector $| \uparrow_{z'}\rangle$ representing a state of "spin up" along a different z′-axis defined by the polar angles θ and ϕ relative to the z-axis, we can write its projections on the z-axis vector basis as

$$\langle \uparrow_z | \uparrow_{z'}\rangle = \cos\frac{\theta}{2} \, e^{-i\varphi/2},$$

$$\langle \downarrow_z | \uparrow_{z'}\rangle = \sin\frac{\theta}{2} \, e^{+i\varphi/2}.$$
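As a consistency check of these projections, the following Python sketch verifies that the state with amplitudes $\cos(\theta/2) e^{-i\varphi/2}$ and $\sin(\theta/2) e^{+i\varphi/2}$ is indeed the "spin up" eigenvector of the spin operator along the direction (θ, ϕ). The function names and the sample angles are assumptions made for this illustration; the operator is written in units of 1/2, i.e., as $\vec{n} \cdot \vec{\sigma}$ with $\vec{n} = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$.

```python
import cmath
import math

def spin_up_along(theta, phi):
    """Amplitudes (alpha0, alpha1) in the z-basis of spin up along (theta, phi)."""
    return (math.cos(theta / 2) * cmath.exp(-1j * phi / 2),
            math.sin(theta / 2) * cmath.exp(+1j * phi / 2))

def n_dot_sigma(theta, phi):
    """n.sigma for the unit vector n with polar angles (theta, phi)."""
    return [[math.cos(theta), math.sin(theta) * cmath.exp(-1j * phi)],
            [math.sin(theta) * cmath.exp(+1j * phi), -math.cos(theta)]]

def apply(m, v):
    """Apply a 2x2 matrix to a two-component vector."""
    return tuple(m[i][0] * v[0] + m[i][1] * v[1] for i in range(2))

theta, phi = 1.1, 0.4   # arbitrary sample direction
state = spin_up_along(theta, phi)
image = apply(n_dot_sigma(theta, phi), state)   # equals state: eigenvalue +1

p_up_z = abs(state[0]) ** 2     # cos^2(theta/2)
p_down_z = abs(state[1]) ** 2   # sin^2(theta/2)
print(image, state)
print(p_up_z, p_down_z)
```

The azimuthal angle ϕ enters only through phase factors, so it drops out of the two probabilities.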

Let us assume that we perform an experiment, similar to the Stern-Gerlach experiment, where the electrons are moving along the y-axis and their spins are measured in the z-direction; that means the z-component of the electron spin vector is pinned down. If we filter out the electrons with "spin down" in the z-direction, we are left with electrons with "spin up" in the z-direction. Assume that now we perform a new spin measurement on the remaining "spin up" electrons along a direction in the xz-plane at an angle θ with the positive z-axis. The spin operator for a measurement in this direction, $S_\theta$, is given by its projections on the $S_x$ and $S_z$ operators (the probabilities of obtaining the spin values along the basis axes are given by the squared moduli of the projections of the electron state vector onto the corresponding basis vectors)

$$S_\theta = \sin(\theta) S_x + \cos(\theta) S_z = \frac{1}{2} \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}.$$

The eigenvalues of this operator are +1/2 and −1/2 in units of $\hbar = 1$ and the corresponding eigenvectors are, respectively,

$$\begin{pmatrix} \cos(\theta/2) \\ \sin(\theta/2) \end{pmatrix}, \qquad \begin{pmatrix} -\sin(\theta/2) \\ \cos(\theta/2) \end{pmatrix}.$$

Each electron, after being measured in the first stage of the experiment, now has the state vector "spin up" in the z-direction; this state vector is the initial state vector for the measurement along the axis in the xz-plane and it can be expressed as a linear combination of the new basis vectors

$$\begin{pmatrix} 1 \\ 0 \end{pmatrix} = c_1 \begin{pmatrix} \cos(\theta/2) \\ \sin(\theta/2) \end{pmatrix} + c_2 \begin{pmatrix} -\sin(\theta/2) \\ \cos(\theta/2) \end{pmatrix}.$$

The coefficients are found to be $c_1 = \cos(\theta/2)$ and $c_2 = -\sin(\theta/2)$. The probabilities of spin up and spin down for the measurement of such an electron along the θ-direction are $| \cos(\theta/2) |^2$ and $| \sin(\theta/2) |^2$, respectively, where θ is the angle between the two measurement directions.
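The two-stage measurement can be reproduced numerically by projecting the "spin up along z" vector onto the eigenvectors of the spin operator along the tilted axis. This Python sketch uses hypothetical helper names and real-valued vectors, which suffices for directions in the xz-plane.

```python
import math

def eigenvectors(theta):
    """Spin-up and spin-down eigenvectors for an axis in the xz-plane
    at angle theta from the positive z-axis."""
    h = theta / 2
    up = (math.cos(h), math.sin(h))
    down = (-math.sin(h), math.cos(h))
    return up, down

def probabilities(theta):
    """Probabilities of spin up / spin down along the tilted axis for an
    electron prepared with spin up along z."""
    spin_up_z = (1.0, 0.0)
    up, down = eigenvectors(theta)
    c1 = up[0] * spin_up_z[0] + up[1] * spin_up_z[1]        # cos(theta/2)
    c2 = down[0] * spin_up_z[0] + down[1] * spin_up_z[1]    # -sin(theta/2)
    return c1 ** 2, c2 ** 2

for angle in (0.0, math.pi / 2, math.pi):
    print(angle, probabilities(angle))
```

At θ = 0 the second measurement repeats the first and always gives spin up; at θ = π/2 both outcomes are equally likely; at θ = π the outcome is always spin down.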

4.11 The Qubit as a Polarized Photon

In the classical theory light is described as electromagnetic radiation consisting of an electric and a magnetic field. The electric field oscillates in the xy-plane, perpendicular to the direction of propagation, the z-axis. The electric field oscillates either vertically, and we say that the light is x-polarized, as in Figure 25(a), or horizontally, and then we say that the light is y-polarized, as shown in Figure 25(b).

If the electric field has an arbitrary orientation in the xy-plane, then it has both x and y components. If these components are out of phase by 90°, the electric field rotates and the light is said to be elliptically polarized. When the x and the y components are equal and out of phase by 90° the light is said to be circularly polarized. Circularly polarized light can be right-hand polarized or left-hand polarized, depending on the sense in which the electric field rotates relative to the direction of propagation along the z-axis.

If we look at the individual photons participating in the "light" we cannot talk about an electric field associated with a single photon, but a single photon must have a property that is the analog of the classical phenomenon of polarization. A photon is characterized by its vector momentum (the vector momentum determines the frequency) and its polarization.

Figure 25: Linear photon polarization. (a) Vertical polarization, the polarization vector v along the x-axis. (b) Horizontal polarization, the polarization vector h along the y-axis.

Photons differ from the spin-1/2 electrons in two ways: (1) they are massless and (2) they have spin 1. A photon can have two independent polarizations and it is another important two-state system able to represent a qubit.

A photon can be described as a two-state system; a photon can be in state $| h\rangle$ or in state $| v\rangle$. All photons in a classically y-polarized beam of light are said to be in polarization state $| h\rangle$ and, similarly, all photons in a classically x-polarized beam of light are said to be in polarization state $| v\rangle$. The states $| h\rangle$ and $| v\rangle$ can be used as base states to describe the polarization of a photon with given momentum oriented along the z-direction.

Light contains photons in these two states of polarization and we can experimentally observe the photons in each of the two states. If we use a polarization filter (or polarization analyzer) and set its axis to let y-polarized light pass, then all photons in the state $| v\rangle$ will be absorbed in the filter and only the photons in state $| h\rangle$ will pass through. If the axis of the polarization filter is set to let x-polarized light pass, then all photons in state $| h\rangle$ will be absorbed and only photons in state $| v\rangle$ will pass through.

Figure 26: The effect of a polarization filter. (a) The polarization filter PF1 is set at angle θ with respect to the coordinate system of the incoming beam of light. The emerging photons are in a superposition state, $| h'\rangle = \alpha_0 | h\rangle + \alpha_1 | v\rangle$, with $\alpha_0 = \cos\theta$ and $\alpha_1 = \sin\theta$. (b) The polarization filter at angle θ, PF1, is followed by a second filter, PF2, which lets only h-polarized photons pass; a photon emerging from PF1 passes PF2 with probability $\cos^2\theta$.

A material called calcite has the inverse action on a beam of light: it splits the light into an $| h\rangle$ state beam and a $| v\rangle$ state beam. That is similar to the action of a Stern-Gerlach apparatus, which separates the spin s = +1/2 states from the spin s = −1/2 states. If the polarization filter is set in a new position and the light passing through is polarized in a y′-direction which makes an angle θ with the y-direction, as in Figure 26(a), then the photons coming out of the filter will be in a $| h'\rangle$ polarization state. This state can be expressed as a linear combination of the base states $| h\rangle$ and $| v\rangle$ as

$$| h'\rangle = \cos\theta \, | h\rangle + \sin\theta \, | v\rangle.$$

Assume now that these photons in state $| h'\rangle$, i.e., polarized along the direction y′, have to go through a second polarization analyzer set in position θ = 0, as in Figure 26(b). This filter will let pass only photons in state $| h\rangle$. We have to estimate the probability that a photon in state $| h'\rangle$ is also in state $| h\rangle$, i.e., we have to do the projection

$$\langle h | h'\rangle = \cos\theta \, \langle h | h\rangle + \sin\theta \, \langle h | v\rangle$$

where $| h\rangle$ and $| v\rangle$ are base states, so that $\langle h | h\rangle = 1$ and $\langle h | v\rangle = 0$. Thus, the probability is

$$| \langle h | h'\rangle |^2 = \cos^2\theta.$$
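The projection just computed can be sketched in a few lines of Python. The names `rotated_h` and `pass_probability` are hypothetical; a polarization state is stored as its pair of real components along $| h\rangle$ and $| v\rangle$.

```python
import math

def rotated_h(theta):
    """Components of |h'> = cos(theta)|h> + sin(theta)|v> in the (h, v) basis."""
    return (math.cos(theta), math.sin(theta))

def pass_probability(state, analyzer_angle):
    """Probability that a photon in `state` passes an analyzer whose
    axis is rotated by analyzer_angle from the h direction."""
    axis = rotated_h(analyzer_angle)
    amplitude = axis[0] * state[0] + axis[1] * state[1]
    return amplitude ** 2

theta = math.pi / 3
photon = rotated_h(theta)             # photon emerging from PF1
print(pass_probability(photon, 0.0))  # cos^2(theta) for the analyzer PF2
```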

We have seen that under a rotation of angle θ about the axis of propagation the horizontal polarization state $| h\rangle$ transforms as

$$| h\rangle \rightarrow | h'\rangle = \cos(\theta) \, | h\rangle + \sin(\theta) \, | v\rangle.$$

At the same time the vertical polarization state $| v\rangle$ transforms as

$$| v\rangle \rightarrow | v'\rangle = -\sin(\theta) \, | h\rangle + \cos(\theta) \, | v\rangle.$$

The matrix for this transformation is

$$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$

and it has the eigenstates

$$| R\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}, \qquad | L\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} i \\ 1 \end{pmatrix}.$$

It is easy to see that the eigenvalues are $e^{i\theta}$ and $e^{-i\theta}$, respectively, for these eigenstates, which we call right and left polarization states. These states are also eigenstates of the rotation operator

$$\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$$

with eigenvalues ±1. Because the eigenvalues are ±1 we say that the photon has spin 1.

Remember the experiment with polarized light (photons) presented in Section 2. We had a polarization filter that allowed only one of the two linear photon polarizations, say h, to pass through. Only 1/2 of the photons could get through. We added a filter for polarization v and no photon could get through. Next, we interposed a 45° rotated polarizer between the h and v analyzers. As a result, an h-polarized photon coming out of the first analyzer had probability 1/2 of passing through the 45° polarizer and a 45° polarized photon coming out of the 45° polarizer had probability 1/2 of passing through the v-polarizing filter.
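The numbers quoted for the three-filter experiment follow from applying the $\cos^2$ rule at each filter in turn. In the Python sketch below, `transmission` and `chain` are hypothetical helper names, filter orientations are given as angles measured from the h direction, and the incoming light is assumed to be unpolarized so that the first filter passes half of the photons.

```python
import math

def transmission(angle_in, angle_out):
    """Probability that a photon polarized at angle_in passes a filter
    whose axis is at angle_out: cos^2 of the angle between them."""
    return math.cos(angle_out - angle_in) ** 2

def chain(filter_angles):
    """Fraction of unpolarized incoming photons that pass a sequence of
    filters; after each filter the photon is polarized along its axis."""
    fraction = 0.5
    for previous, current in zip(filter_angles, filter_angles[1:]):
        fraction *= transmission(previous, current)
    return fraction

h, diag, v = 0.0, math.pi / 4, math.pi / 2
print(chain([h]))           # first filter alone
print(chain([h, v]))        # crossed filters: essentially nothing passes
print(chain([h, diag, v]))  # interposed 45-degree polarizer
```

With the 45° polarizer interposed, each of the last two stages passes a photon with probability 1/2, so 1/8 of the original unpolarized photons reach the far side, whereas the crossed h and v filters alone pass none.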

A device can be constructed that rotates the linear polarization of a photon. Such a device applies the first transformation mentioned above to our qubit and sets the qubit in a superposition state. We can use another device that alters ω, the relative phase of the two orthogonal linear polarization states, and performs the following transformations

$$| h\rangle \implies e^{i\omega/2} \, | h\rangle$$

$$| v\rangle \implies e^{-i\omega/2} \, | v\rangle.$$

When the two devices are used together they apply a 2 × 2 unitary transformation to the photon polarization state.

4.12 Entanglement

Assume that an experiment is set up to measure the spins of two particles of spin 1/2, in our case electrons, emitted in opposite directions following the decay of a singlet state with zero total spin. A singlet electron state corresponds to a pair of electrons with anti-parallel spins (see footnote 31), i.e., the antisymmetric superposition $\frac{1}{\sqrt{2}}(| \uparrow\downarrow\rangle - | \downarrow\uparrow\rangle)$ in which the electrons have different quantum numbers, +1/2 and −1/2. The total spin of the singlet state is zero.

For such a state the conservation of the angular momentum requires that the spin vectors of the two particles are oriented in opposite directions. Hence, if we measure the spin of one of these two particles along a certain direction and find it in the state "spin up", then, along that direction, the other particle must be in the state "spin down". By measuring the spin of one particle and thus reducing its state vector to one of the eigenvectors of the measurement basis, we automatically project the state vector (collapse the wave function) of the other particle onto the same basis. Instead of a set of probabilistically possible states we obtain one well defined state.

Now, assume that we perform the following experiment with a pair of particles in a singlet state. First, we measure the spin of one particle (call it particle I) along a direction in the xz-plane and find it in the "spin down" state; then, the other particle (call it particle II) is in a pure "spin up" state along that direction. Next, we measure the spin of particle II along a direction at an angle θ with the direction of the first measurement. We find that particle II is in a combination of "spin up" and "spin down" states with probabilities $\cos^2(\theta/2)$ and $\sin^2(\theta/2)$, respectively. If the first measurement on particle I yields "spin up", then particle II is in a pure "spin down" state along that direction. A measurement of the spin of particle II along a new direction at an angle θ relative to the first measurement finds particle II in a combination of "spin up" and "spin down" states with probabilities $\sin^2(\theta/2)$ and $\cos^2(\theta/2)$, respectively.

Let us now perform successive measurements of each of the two particles along directions at an angle θ relative to each other. The probability that the successive measurements give the same result (up-up or down-down) is $\sin^2(\theta/2)$ while the probability that they yield opposite results (up-down or down-up) is $\cos^2(\theta/2)$.
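These joint probabilities can be computed directly from the singlet state. The Python sketch below assumes the antisymmetric singlet $\frac{1}{\sqrt{2}}(| \uparrow\downarrow\rangle - | \downarrow\uparrow\rangle)$, takes both measurement axes in the xz-plane (so real-valued vectors suffice), and uses hypothetical helper names; index 0 stands for "up" and index 1 for "down".

```python
import math

def axis_basis(theta):
    """Spin-up and spin-down vectors (z-basis components) for an axis in
    the xz-plane at angle theta from the z-axis."""
    h = theta / 2
    return (math.cos(h), math.sin(h)), (-math.sin(h), math.cos(h))

def joint_probabilities(theta):
    """Probabilities of the four joint outcomes when particle I is measured
    along z and particle II along an axis tilted by theta."""
    s = 1 / math.sqrt(2)
    singlet = {(0, 1): s, (1, 0): -s}   # amplitudes of |up down>, |down up>
    basis_one = axis_basis(0.0)
    basis_two = axis_basis(theta)
    probs = {}
    for i, b1 in enumerate(basis_one):
        for j, b2 in enumerate(basis_two):
            amplitude = sum(b1[a] * b2[b] * c for (a, b), c in singlet.items())
            probs[(i, j)] = amplitude ** 2
    return probs

def p_same(theta):
    """Probability of up-up or down-down."""
    p = joint_probabilities(theta)
    return p[(0, 0)] + p[(1, 1)]

theta = math.pi / 3
print(p_same(theta), math.sin(theta / 2) ** 2)
print(1 - p_same(theta), math.cos(theta / 2) ** 2)
```

For θ = 0 the two results are always opposite, as required by the conservation of angular momentum.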

The two particles emitted from the singlet are said to be entangled; regardless of how far apart they travel before the spin measurements are made, the joint results will exhibit these joint probabilities.

These probabilities can be evaluated from the results of multiple measurements; such measurements have to be performed over several particle pairs, not just one pair. In principle a large number of particle pairs can be prepared in an identical way, in space-separated locations (in a string-like formation), and the measurements can be performed independently on the pairs. According to quantum mechanics the results are expected to satisfy the same correlations.

31 The state of a pair of electrons with parallel spins $| \uparrow\uparrow\rangle$, i.e., with equal spin quantum numbers +1/2 and +1/2, is called a triplet. The total spin of a triplet state is +1.

At this time we can imagine a quantum computer as a collection of prepared pairs ofparticles whose entanglement (correlations) satisfy all the requirements for solving a certainparallel algorithm, or the algorithm can be designed to take advantage of a specific entangle-ment. Two entangled particles in a singlet state have total spin 0 because the spins of theindividual particles about any axis are always opposite; their spin will exist only as poten-tialities until an observer chooses a definite axis and performs the measurement.

The measurement axis defines the vector basis for both electrons, and the instant particle I is measured, particle II acquires a definite spin along the chosen axis, instantaneously. If we let a vertical axis represent the binary value 1 and a horizontal axis represent 0, it is possible to pass 1's and 0's apparently “instantaneously” across large distances by selecting the appropriate axis of measurement. We use the word “instantaneously” in quotes to emphasize the fact that the transmission of information does not really happen with a speed greater than the speed of light, an impossibility according to relativity theory. At the receiving end the measurement must not only be performed after the initial measurement at the sending end, but it must also be carried out in a matching vector basis. Thus, the sender must use some classical means of communication to inform the receiver about the vector basis used for the measurement. The delay between the moments the two results are obtained prevents a truly instantaneous transmission of information.

Entanglement is an elegant, almost exact translation of the German term Verschränkung used by Schrödinger, who was the first to recognize this quantum effect. An entangled pair is a single quantum system in a superposition of equally possible states. The entangled state contains no information about the individual particles, only that they are in opposite states. The important property of an entangled pair is that the measurement of one particle influences the state of the other particle. Einstein called that “spooky action at a distance”.

4.13 The Exchange of Information Using Entangled Particles

The entanglement of a pair of qubits can be exploited to transmit information, as discussed in depth in Chapter 8. Here we only sketch the basic idea of communication using qubits in antisymmetric states. In this case, when one of the qubits is forced into one state, the second member of the pair is forced into the opposite state. The conceptual idea is strikingly simple, but poses immense practical challenges. We are able to create pairs of entangled particles, but separating them while maintaining their entanglement is very challenging.

The method described in [15] can be used in conveying information in a secure way at a distance. Assume that we have prepared three particles. The first, particle1, is located in New York and its initial state is $|\psi_1\rangle$. The second and the third particles, called particle2 and particle3, are entangled and separated; particle2 is sent to New York together with particle1, while particle3 is sent to London. The entire process is depicted in Figure 27.

In the following discussion we use the abbreviated notation $|01\rangle$ to mean that particle1 is in state $|0\rangle$ and particle2 is in state $|1\rangle$. The essential point is to perform specific measurements on particle1 and particle2 which project them onto the entangled state

$$|\psi_{12}\rangle = \frac{1}{\sqrt{2}}\left(|01\rangle - |10\rangle\right).$$

This is one of the four possible maximally entangled states of a pair of particles, as discussed in Section 4.6.

Figure 27: Communication with entangled particles. Given three particles, particle1, particle2, and particle3, we wish to transmit the state of particle1 in New York to particle3 in London. We first entangle particle2 and particle3. Then we separate the two entangled particles; particle2 goes to New York and particle3 to London. Once in New York we entangle particle1 with particle2. Then we measure the entangled system by projecting it into the state $|\psi_{12}\rangle$. Automatically, this measurement process forces particle2 into a state antisymmetric to the one of particle1. But particle2 is also in an antisymmetric entanglement relation with particle3, thus particle3 is forced into a state opposite to particle2. But this is precisely the state of particle1. (Panels, in order: initial state; entangle particle2 and particle3; particle2 and particle3 in an anti-symmetric entangled state; separate the entangled particles; entangle particle1 and particle2; measure the entangled system (particle1, particle2).)

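The other three maximally entangled states are

$$\frac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right), \qquad \frac{1}{\sqrt{2}}\left(|01\rangle + |10\rangle\right), \qquad \frac{1}{\sqrt{2}}\left(|00\rangle - |11\rangle\right).$$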

The $|\psi_{12}\rangle$ state is distinguished from the other three by the fact that it changes sign when particle1 and particle2 are interchanged. This anti-symmetric feature plays an important role in the experimental identification of this state.

Quantum physics predicts that once the entangled particle1 and particle2 are projected onto $|\psi_{12}\rangle$, particle3 is instantaneously projected into the initial state of particle1.

What is the explanation? Since we have entangled particle1 and particle2, no matter what state particle1 is in, particle2 must be in the opposite state, a state which is orthogonal to the state of particle1. Initially, particle2 and particle3 were prepared in state $|\psi_{23}\rangle$ and that means that the state of particle2 is also orthogonal to the state of particle3. That is only possible if particle3 in London is in the same state as particle1 was initially.
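The argument above can be verified with a short calculation. The sketch below assumes that $|\psi_{23}\rangle$ has the same antisymmetric form as $|\psi_{12}\rangle$, that is $(|01\rangle - |10\rangle)/\sqrt{2}$, and it treats only the measurement outcome in which particle1 and particle2 are found in $|\psi_{12}\rangle$; the other three outcomes are not modelled. The function name is hypothetical.

```python
import math

def project_onto_singlet(alpha, beta):
    """State of particle3 (unnormalized) after particle1 and particle2 are
    projected onto (|01> - |10>)/sqrt(2).

    particle1 starts in alpha|0> + beta|1>; particle2 and particle3 start in
    (|01> - |10>)/sqrt(2).
    """
    s = 1 / math.sqrt(2)
    psi1 = [alpha, beta]
    psi23 = [0, s, -s, 0]   # index = 2*q2 + q3
    psi12 = [0, s, -s, 0]   # index = 2*q1 + q2; real, so no complex conjugate is needed
    particle3 = [0j, 0j]
    for q1 in range(2):
        for q2 in range(2):
            for q3 in range(2):
                amplitude = psi1[q1] * psi23[2 * q2 + q3]
                particle3[q3] += psi12[2 * q1 + q2] * amplitude
    return particle3

alpha, beta = 0.6, 0.8j
result = project_onto_singlet(alpha, beta)
print(result)                              # proportional to (alpha, beta)
print(sum(abs(x) ** 2 for x in result))    # probability of this measurement outcome
```

In this sketch the state of particle3 comes out as $-\frac{1}{2}(\alpha|0\rangle + \beta|1\rangle)$: the initial state of particle1 up to a global factor, whose squared norm, 1/4, is the probability of obtaining this particular outcome.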

It is important to realize that as a result of this communication process we do not have two identical copies of particle1, one in New York and one in London. During the entanglement with particle2, particle1 loses its identity; its state cannot be extricated from the state of the entangled system. The state $|\psi_1\rangle$, where the information was “inscribed”, is destroyed during the measurement process on the New York side, but the information has been “communicated” to particle3 on the London side of the message exchange.


A quantum bit or qubit is a mathematical abstraction for an elementary quantum object used to store information. A qubit $|\psi\rangle$ is a vector in a two-dimensional complex vector space, the Hilbert space of dimension 2, $\mathcal{H}_2$.

The state space of a qubit contains the two “basis”, or “logical”, states $|0\rangle$ and $|1\rangle$. The initial state of a qubit is always one of the basis states. Using the transformations discussed in this chapter we can obtain states which are “superpositions” of the basis states, see Figure 20. Superpositions can be expressed as sums over the basis states with complex coefficients

$$|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle.$$

In addition to the ket notation of Dirac displayed above, the state of a qubit can be expressed as a vector

$$|\psi\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix}.$$

Throughout this book the two notations are used interchangeably. A superposition is a pure state if the corresponding vector $|\psi\rangle$ has length 1, that is, if $|\alpha_0|^2 + |\alpha_1|^2 = 1$. Such a superposition, or vector, is said to be “normalized”.

The amplitudes of a qubit in a superposition state can be expressed using three real numbers, $\theta$, $\varphi$, $\gamma$, such that

$$\alpha_0 = e^{i\gamma} \cos\frac{\theta}{2}, \qquad \alpha_1 = e^{i\gamma} e^{i\varphi} \sin\frac{\theta}{2}.$$
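As an illustration of this parametrization, the following minimal sketch computes the two amplitudes from the three real numbers and checks the normalization condition. The function names are illustrative and not part of the text.

```python
import cmath
import math

def amplitudes(theta, phi, gamma=0.0):
    """Return (alpha0, alpha1) for the angles theta, phi and the global phase gamma."""
    alpha0 = cmath.exp(1j * gamma) * math.cos(theta / 2)
    alpha1 = cmath.exp(1j * gamma) * cmath.exp(1j * phi) * math.sin(theta / 2)
    return alpha0, alpha1

def is_normalized(alpha0, alpha1, tol=1e-12):
    """Check |alpha0|^2 + |alpha1|^2 = 1 within a tolerance."""
    return abs(abs(alpha0) ** 2 + abs(alpha1) ** 2 - 1.0) < tol

a0, a1 = amplitudes(theta=1.1, phi=0.7, gamma=2.3)
print(a0, a1, is_normalized(a0, a1))
```

Any choice of the three angles yields a normalized state, since $\cos^2(\theta/2) + \sin^2(\theta/2) = 1$ and the phase factors have modulus 1.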


The diagonal elements of the density matrix give the probabilities of the possible outcomes of a measurement performed on a qubit in the basis $\{|0\rangle, |1\rangle\}$. The most general form of the density matrix for a single qubit is

$$\rho = \frac{1}{2}\left(I + \beta_x \sigma_x + \beta_y \sigma_y + \beta_z \sigma_z\right) = \frac{1}{2}\begin{pmatrix} 1 + \beta_z & \beta_x - i\beta_y \\ \beta_x + i\beta_y & 1 - \beta_z \end{pmatrix}$$

with $I$ the identity $2 \times 2$ matrix and $\beta_x, \beta_y, \beta_z$ real numbers. There is a one-to-one correspondence between the possible density matrices of a single qubit and the points on or inside the Bloch sphere, $\beta^2 \le 1$, with $\beta^2 = \beta_x^2 + \beta_y^2 + \beta_z^2$. Pure states are represented by points on the Bloch sphere, $\beta^2 = 1$, while impure states are represented by points inside the sphere.
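A short numerical sketch can make the distinction between points on the sphere and points inside it concrete. It builds $\rho$ from $(\beta_x, \beta_y, \beta_z)$ and evaluates $\mathrm{Tr}(\rho^2)$, which for this form of $\rho$ equals $(1 + \beta^2)/2$; this quantity is used here as an illustrative test and is not introduced in the text. The function names are hypothetical.

```python
def density_matrix(bx, by, bz):
    """rho = (I + bx*sigma_x + by*sigma_y + bz*sigma_z) / 2 as nested lists."""
    return [
        [(1 + bz) / 2, (bx - 1j * by) / 2],
        [(bx + 1j * by) / 2, (1 - bz) / 2],
    ]

def purity(rho):
    """Tr(rho^2): equal to 1 on the Bloch sphere, smaller than 1 inside it."""
    total = sum(rho[i][j] * rho[j][i] for i in range(2) for j in range(2))
    return total.real

print(purity(density_matrix(0.0, 0.0, 1.0)))   # point on the sphere
print(purity(density_matrix(0.3, 0.0, 0.4)))   # point inside the sphere
print(purity(density_matrix(0.0, 0.0, 0.0)))   # centre of the sphere
```

The diagonal entries of the returned matrix, $(1 + \beta_z)/2$ and $(1 - \beta_z)/2$, are the probabilities of measuring 0 and 1.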

Single qubit operations are defined as rotations on the Bloch sphere described by the Pauli spin matrices $\{\sigma_x, \sigma_y, \sigma_z\}$.

Our ability to distinguish between the states of a single qubit is limited. The superposition states of one qubit cannot be reliably distinguished from basis states. The superposition $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$ behaves like $|0\rangle$ with probability $|\alpha_0|^2$ and like $|1\rangle$ with probability $|\alpha_1|^2$. When measured, a qubit in a superposition state is found in one of the two basis states, as shown in Figure 21.

Two quantum states can be reliably distinguished if and only if their vector representations are orthogonal. Quantum states form equivalence classes: multiplication by $e^{i\gamma}$ does not alter the state represented by a unit vector. Therefore, the quantum state of one qubit is a “ray” in $\mathcal{H}_2$.

The state of a system of two qubits is a linear combination of the basis vectors with complex coefficients $\alpha_{00}, \alpha_{01}, \alpha_{10}, \alpha_{11}$:

$$|\psi\rangle = \alpha_{00} |00\rangle + \alpha_{01} |01\rangle + \alpha_{10} |10\rangle + \alpha_{11} |11\rangle.$$

When we measure a pair of qubits we decide that the system is in one of four basis states $|00\rangle$, $|01\rangle$, $|10\rangle$, and $|11\rangle$ with probabilities $|\alpha_{00}|^2$, $|\alpha_{01}|^2$, $|\alpha_{10}|^2$, and $|\alpha_{11}|^2$, respectively. The sum of probabilities must be one:

$$|\alpha_{00}|^2 + |\alpha_{01}|^2 + |\alpha_{10}|^2 + |\alpha_{11}|^2 = 1.$$

Before the measurement the state is uncertain. The post-measurement state of the pair of qubits is $|00\rangle$, $|01\rangle$, $|10\rangle$, or $|11\rangle$.

When $\alpha_{00} = \alpha_{11} = 1/\sqrt{2}$ and $\alpha_{01} = \alpha_{10} = 0$ we have a Bell state and the pair of qubits is called an EPR pair. In this case when we measure only the first qubit we get the same result as when we measure only the second qubit.
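The measurement rule for two qubits, and the perfect agreement of the two results for an EPR pair, can be illustrated with a few lines of code. This is a minimal sketch with illustrative function names.

```python
import math

def joint_probabilities(a00, a01, a10, a11):
    """Probabilities of the outcomes 00, 01, 10, 11 for a normalized two-qubit state."""
    return {
        "00": abs(a00) ** 2,
        "01": abs(a01) ** 2,
        "10": abs(a10) ** 2,
        "11": abs(a11) ** 2,
    }

def probability_qubits_agree(probabilities):
    """Probability that both qubits give the same measurement result."""
    return probabilities["00"] + probabilities["11"]

s = 1 / math.sqrt(2)
bell = joint_probabilities(s, 0, 0, s)
print(bell)
print(sum(bell.values()))              # the probabilities add up to one
print(probability_qubits_agree(bell))  # the two results always coincide
```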

A system of $n$ qubits is represented by a complex unit vector in a $2^n$-dimensional Hilbert space, $\mathcal{H}_{2^n}$, defined as a tensor product of $n$ two-dimensional Hilbert spaces

$$\mathcal{H}_{2^n} = (\mathcal{H}_2)^{\otimes n}.$$

This equation shows that the dimensionality of the state space grows exponentially with the number of components or qubits. For a classical system the number of values needed to describe the state grows linearly with the number of components.

The evolution of a quantum system in isolation is unitary: it is linear and conserves the inner product in a Hilbert space. The evolution of the system preserves superposition and distinguishability of the system states. A superposition of the input states of a quantum system consisting of a number $n > 1$ of qubits evolves into a corresponding superposition of output states. Conventional computations and communication destroy the entanglement, while quantum operations can create, preserve and use the entanglement to speed up computations and to transmit information over quantum channels.

The physical systems leading to the simplest possible embodiments of a qubit are: the electron with two independent spin values, ±1/2, and the photon, with two independent polarizations.

The spin is the quantum number characterizing the intrinsic angular momentum of the electron. The electron spin is found to have either the value +1/2 or −1/2 along the measurement axis, regardless of what that axis is.

A photon can have two independent polarizations. Photons differ from the spin-1/2 particles; they are massless and have spin 1. A photon is characterized by its vector momentum (the vector momentum determines the frequency) and its polarization. In the classical theory light is described as having an electric field which oscillates either vertically, the light is x-polarized, or horizontally, the light is y-polarized, in a plane perpendicular to the direction of propagation, the z-axis.

A singlet electron state corresponds to a pair of electrons with anti-parallel spins. Consider two particles of spin 1/2, in our case electrons, emitted in opposite directions following the decay of a singlet state with zero total spin; in this case the conservation of the angular momentum requires that the spin vectors of the two particles are oriented in opposite directions. The two particles emitted from the singlet are said to be entangled.

If we measure the spin of one of these two particles along a certain direction and find it in the state “spin up”, then, along that direction, the other particle must be in the state “spin down”. By measuring the spin of one particle, and thus reducing its state vector to one of the eigenvectors of the measurement basis, we automatically project the state vector (collapse the wave function) of the other particle onto the same basis. Instead of a set of probabilistically possible states we obtain one, well defined state.

The entanglement of a pair of qubits can be exploited to transmit information as discussed in Section 4.13.

The paper of Rieffel and Polak [93] and Nielsen and Chuang's book “Quantum Computation and Quantum Information” [80] supplement the information provided in this chapter on qubits. A recent collection of articles on the physics of quantum information edited by D. Bouwmeester, A. Ekert, and A. Zeilinger [25] contains several insightful papers on the subject of the physical realization of qubits.


(1) Show that the Pauli matrices $\sigma_1$ or $\sigma_x$, $\sigma_2$ or $\sigma_y$, $\sigma_3$ or $\sigma_z$, can be defined as $\sigma_i^\dagger = \sigma_i$, $i = 1, 2, 3$. Show that: (i) they are Hermitian; (ii) $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = I$, they square to unity; (iii) $\sigma_1 \sigma_2 = i\sigma_3$ (this is also true for a cyclic permutation of indices); and (iv) they satisfy the relation $\sigma_1 \sigma_2 + \sigma_2 \sigma_1 = 0$ (this is also true for a cyclic permutation of indices).

(2) Prove the following relations among the $I$, $H$, and Pauli matrices $\sigma_x$, $\sigma_y$, and $\sigma_z$:

$$\sigma_x \sigma_x = I; \qquad H \sigma_x H = \sigma_z; \qquad H \sigma_z H = \sigma_x; \qquad H \sigma_y H = -\sigma_y.$$

(3) Show that the composition of two rotations with angles $\theta_1$ and $\theta_2$ is a rotation with angle $\theta_1 + \theta_2$ along the same axis:

$$R_{\vec{r}}(\theta_1) R_{\vec{r}}(\theta_2) = R_{\vec{r}}(\theta_1 + \theta_2).$$

(4) Show that the rotation matrices $R_{\vec{r}}$ are unitary. Recall that a unitary operator preserves the distance. What is the implication of this property for the representation of a qubit using a Bloch sphere?

(5) Prove that any two-by-two matrix $A$ can be represented as a linear combination of Pauli matrices and the identity matrix $I$:

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = c_0 I + c_1 \sigma_x + c_2 \sigma_y + c_3 \sigma_z.$$

(6) Write a Java applet with a graphical user interface (GUI) which displays a qubit on the Bloch sphere. The program requires as input the values of the complex coefficients $\alpha_0$ and $\alpha_1$ of the state vector, $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$.
Hint: Once the user clicks on the link to the object stored on the Web server, a form should be displayed allowing the user to enter the necessary data, verify that $|\alpha_0|^2 + |\alpha_1|^2 = 1$, carry out the necessary calculations, then download the applet.

(7) Augment the program in the previous assignment to compute the effect of a transformation performed on a qubit by: the three Pauli matrices, the Hadamard ($H$), the phase ($S$), the $\pi/8$ ($T$), the phase-shift with angle $\theta$, $P_\theta$, $R_\theta$, and $R_k$ matrices (see Section 5). The GUI should allow the user to specify: (i) the values of the complex coefficients $\alpha_0$ and $\alpha_1$ of the state vector, $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$, (ii) the transformation, and (iii) the values of the elements of the transformation matrix for the $P_\theta$, $R_\theta$, and $R_k$ matrices. The program should display the original and the transformed qubit on the same Bloch sphere, and the matrix of the corresponding transformation.


5 Quantum Gates and Quantum Circuits

In this chapter we discuss quantum gates, the building blocks of a quantum computer. First, we review familiar gates and logic circuits used to transform the information in a classical computer. A classical circuit is characterized by a truth table relating its input to the output. To maintain consistency for those familiar with classical logic circuits we use the same formulation when talking about quantum gates; we say that a gate transforms its input to its output according to the rules hardwired into the truth table.

The physical reality is that a quantum gate transforms a quantum state into a new state. A quantum system interacts with the quantum gate or circuit and, as a result of this interaction, the state of the system changes. The state transformations performed by quantum gates and circuits are described by linear transformations. Such linear transformations are described by matrices and we impose the condition that these matrices be unitary. When we say that a transformation is applied to the input vector we mean that a change of state is forced upon the quantum system. To maintain consistency with the familiar concepts from circuit theory we call the matrix describing the state transformation a transfer matrix.

We introduce single qubit gates and two qubit gates, including an ample discussion of the CNOT gate. We move on to three qubit gates, discuss the Fredkin and Toffoli gates, and continue with the no-cloning theorem. Then we present quantum circuits. We discuss single and multiple qubit controlled operations and then the Walsh-Hadamard transform. We conclude the chapter with a section on mathematical models and quantum gate arrays and discuss errors in quantum computing.

5.1 Classical Logic Gates and Circuits

Boolean variables have values of either 0 or 1. The Boolean operations are: NOT, AND, NAND, OR, NOR, and XOR. Boolean algebra deals with Boolean variables and Boolean operations.

Logic gates are the active elements of a computer; they transform the information using the laws of Boolean algebra, see Figure 28. Logic gates implement Boolean operations. We only describe standard logic gates with one or two inputs and one output; gates with more than two inputs exist. Figure 28 presents six classic logic gates. For each gate we show the output as a Boolean function of the input. Each logic function is characterized by a truth table giving its output for different combinations of inputs.

We denote by ⊕ the addition modulo two; the output is 1 when the two inputs differ and it is 0 if they are identical. If one of the inputs is 0, say y = 0, then x ⊕ y = x. The truth table of a modulo two adder is the same as the one of an XOR gate:

x  y  x ⊕ y
0  0    0
0  1    1
1  0    1
1  1    0

Figure 28: Classic logic gates: the NOT gate, y = NOT(x), and the AND, NAND, OR, NOR, and XOR gates, z = (x) AND (y), z = (x) NAND (y), z = (x) OR (y), z = (x) NOR (y), and z = (x) XOR (y). The truth table of each logic gate gives its output as a function of its input(s).

It is not very difficult to prove that NAND gates are universal: any logic function can be expressed using only NAND operations, thus one can construct any logic circuit using only NAND gates. By contrast, XOR is not universal; indeed, it does not change the parity of its input. If the input has odd parity (01 or 10) so does the output, it is 1; if the input has even parity (00 or 11) so does the output, it is 0. Thus the class of Boolean functions constructed with XOR alone is limited.
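The universality of NAND can be illustrated by building the other five gates from it. The constructions below are standard ones, written as a minimal sketch; the trailing underscores in the function names only avoid clashes with Python keywords.

```python
def nand(x, y):
    """NAND of two bits (0 or 1)."""
    return 1 - (x & y)

def not_(x):
    return nand(x, x)

def and_(x, y):
    return not_(nand(x, y))

def or_(x, y):
    return nand(not_(x), not_(y))

def nor_(x, y):
    return not_(or_(x, y))

def xor_(x, y):
    shared = nand(x, y)
    return nand(nand(x, shared), nand(y, shared))

# Print the truth tables of the gates built from NAND alone.
for x in (0, 1):
    for y in (0, 1):
        print(x, y, and_(x, y), or_(x, y), nor_(x, y), xor_(x, y))
```

Each derived gate uses only calls to `nand`, so any circuit written with these functions could be rewired with NAND gates alone.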

All gates with two inputs from Figure 28 are irreversible, or non-invertible. This means that knowing the output we cannot determine the input for all possible combinations of input values. For example, knowing that the output of an AND gate is 0 we cannot identify the input combination; 0 can be produced by three possible combinations of inputs, 00, 01 and 10. Clearly, the NOT gate is reversible and two cascaded NOT gates recover the input of the first one.

The irreversibility of classical gates means that there is an irretrievable loss of information and this has very serious consequences regarding the energy consumption of classical gates.

Now we give an example of a logic circuit using some of the logic gates presented in this section, the full adder, Figure 29. This circuit has three inputs, a, b, and CarryIn, and two outputs, Sum and CarryOut, see Figure 29(a).

Figure 29: (a) A full one bit adder with inputs a, b, and CarryIn and outputs Sum and CarryOut. (b) The circuit for the CarryOut.

The truth table of the full adder is:

a  b  CarryIn  Sum  CarryOut
0  0     0      0      0
0  0     1      1      0
0  1     0      1      0
0  1     1      0      1
1  0     0      1      0
1  0     1      0      1
1  1     0      0      1
1  1     1      1      1

You may recall from an introductory course in computer architecture that one can derive the Boolean equations giving the outputs of a logic circuit as a function of the inputs from the truth table of the circuit, as a sum of products [83]. Here sum stands for the Boolean OR and product stands for the Boolean AND. In this section $\bar{a}$ denotes the negation of the Boolean variable $a$. Each term of the sum corresponds to an entry in the truth table where the output variable is 1; each term is the product of the input variables in that row, negated if the value of the variable is 0, or without negation if the value is 1.

From the truth table of the full adder it is easy to see that:

$$\mathrm{Sum} = \bar{a}\,\bar{b}\,\mathrm{CarryIn} + \bar{a}\,b\,\overline{\mathrm{CarryIn}} + a\,\bar{b}\,\overline{\mathrm{CarryIn}} + a\,b\,\mathrm{CarryIn}$$

and

$$\mathrm{CarryOut} = \bar{a}\,b\,\mathrm{CarryIn} + a\,\bar{b}\,\mathrm{CarryIn} + a\,b\,\overline{\mathrm{CarryIn}} + a\,b\,\mathrm{CarryIn}.$$

A manipulation of the last Boolean expression shows that:

$$\mathrm{CarryOut} = a\,b + a\,\mathrm{CarryIn} + b\,\mathrm{CarryIn}.$$

Indeed, the truth table of the last expression is:

a  b  CarryIn  ab  aCarryIn  bCarryIn  CarryOut
0  0     0      0      0         0         0
0  0     1      0      0         0         0
0  1     0      0      0         0         0
0  1     1      0      0         1         1
1  0     0      0      0         0         0
1  0     1      0      1         0         1
1  1     0      1      0         0         1
1  1     1      1      1         1         1

If we compare the last column of the two truth tables we verify that the two Boolean expressions are equivalent. Figure 29(b) shows the circuit implementing the last Boolean expression for the CarryOut.
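The equivalence of the two expressions for the CarryOut can also be checked exhaustively. The sketch below encodes the sum-of-products expressions read off the truth table of the full adder and the simplified expression, then compares them on all eight input combinations; the function names are illustrative.

```python
from itertools import product

def full_adder_sum_of_products(a, b, carry_in):
    """Sum and CarryOut written as sums of products taken from the truth table."""
    na, nb, nc = 1 - a, 1 - b, 1 - carry_in   # negated inputs
    total = (na & nb & carry_in) | (na & b & nc) | (a & nb & nc) | (a & b & carry_in)
    carry_out = (na & b & carry_in) | (a & nb & carry_in) | (a & b & nc) | (a & b & carry_in)
    return total, carry_out

def carry_out_simplified(a, b, carry_in):
    """CarryOut = ab + a CarryIn + b CarryIn."""
    return (a & b) | (a & carry_in) | (b & carry_in)

for a, b, carry_in in product((0, 1), repeat=3):
    total, carry_out = full_adder_sum_of_products(a, b, carry_in)
    print(a, b, carry_in, total, carry_out, carry_out_simplified(a, b, carry_in))
```

The last two printed columns coincide in every row, and the pair (Sum, CarryOut) is the binary representation of a + b + CarryIn.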

5.2 One Qubit Gates

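A qubit gate is a black box transforming an input qubit $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$ into an output qubit $|\phi\rangle = \alpha'_0 |0\rangle + \alpha'_1 |1\rangle$.

Mathematically, a gate $G$ is represented by a $2 \times 2$ transfer matrix with complex elements $g_{ij}$, $i, j \in \{1, 2\}$:

$$G = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix}.$$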

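Recall that the normalization condition requires that $|\alpha_0|^2 + |\alpha_1|^2 = 1$ and, similarly, $|\alpha'_0|^2 + |\alpha'_1|^2 = 1$. This implies that $G$ must be a unitary matrix, in other words that $G^\dagger G = I$. Here $G^\dagger$ is the adjoint of $G$, a matrix obtained from $G$ by first constructing $G^T$, the transpose of $G$, and then taking the complex conjugate of each element (or by first taking the complex conjugate of each element and then transposing the matrix). The complex conjugate $g_{ij}^*$ of an element $g_{ij}$ is obtained by changing the sign of its imaginary part:

$$g_{ij} = \mathrm{Re}(g_{ij}) + i\,\mathrm{Im}(g_{ij}), \qquad g_{ij}^* = \mathrm{Re}(g_{ij}) - i\,\mathrm{Im}(g_{ij}).$$

The transpose of a matrix has as rows the columns of the original matrix, thus:

$$G^T = \begin{pmatrix} g_{11} & g_{21} \\ g_{12} & g_{22} \end{pmatrix}.$$

It follows that:

$$G^\dagger = \begin{pmatrix} g_{11}^* & g_{21}^* \\ g_{12}^* & g_{22}^* \end{pmatrix}.$$

The condition for $G$ to be unitary is:

$$G^\dagger G = \begin{pmatrix} g_{11}^* g_{11} + g_{21}^* g_{21} & g_{11}^* g_{12} + g_{21}^* g_{22} \\ g_{12}^* g_{11} + g_{22}^* g_{21} & g_{12}^* g_{12} + g_{22}^* g_{22} \end{pmatrix} = I.$$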


The inverse of a unitary matrix $G$, $G^{-1} = G^\dagger$, is also unitary, and $G G^{-1} = I$ with $I$ the identity matrix. This implies that a quantum gate with a unitary transfer matrix can always be inverted by another quantum gate. This is extremely important: it shows that quantum gates are reversible, as opposed to classical gates such as AND and OR, which are irreversible.
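The unitarity condition is easy to test numerically for a given $2 \times 2$ matrix. The following minimal sketch forms the adjoint, multiplies it with the matrix, and compares the product with the identity within a tolerance. The helper names and the three sample matrices are chosen for illustration.

```python
import math

def adjoint(g):
    """Conjugate transpose of a 2x2 matrix given as nested lists."""
    return [[complex(g[j][i]).conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def is_unitary(g, tol=1e-12):
    """True if adjoint(g) * g equals the identity within tol."""
    product = matmul(adjoint(g), g)
    identity = [[1, 0], [0, 1]]
    return all(abs(product[i][j] - identity[i][j]) < tol for i in range(2) for j in range(2))

s = 1 / math.sqrt(2)
HADAMARD = [[s, s], [s, -s]]
PAULI_Y = [[0, -1j], [1j, 0]]
NOT_UNITARY = [[1, 1], [0, 1]]

print(is_unitary(HADAMARD), is_unitary(PAULI_Y), is_unitary(NOT_UNITARY))
print(is_unitary(adjoint(PAULI_Y)))   # the inverse of a unitary matrix is unitary
```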

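Given the transfer matrix of a quantum gate, $G$, and the input and the output qubits represented as column vectors, the transformation performed by the gate is given by the equation:

$$|\phi\rangle = G |\psi\rangle.$$

For a single qubit gate this equation can be written as:

$$\begin{pmatrix} \alpha'_0 \\ \alpha'_1 \end{pmatrix} = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix}.$$

Thus

$$\alpha'_0 = g_{11}\alpha_0 + g_{12}\alpha_1 \qquad \text{and} \qquad \alpha'_1 = g_{21}\alpha_0 + g_{22}\alpha_1.$$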

We examine a few important one qubit gates:

(i) I, the identity gate; leaves a qubit unchanged.

(ii) X or NOT gate; transposes the components of an input qubit.

(iii) Y gate.

(iv) Z gate; flips the sign of the | 1〉 component of a qubit.

(v) H, the Hadamard gate.

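The transfer matrices of the first four gates, I, X, Y, and Z, are the Pauli matrices $\sigma_0$, $\sigma_1$ or $\sigma_x$, $\sigma_2$ or $\sigma_y$, $\sigma_3$ or $\sigma_z$. The transfer matrices and the outputs of these gates, $|\phi\rangle = G |\psi\rangle$, given the input $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$, are listed below.

$$\sigma_0 = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} \longrightarrow |\phi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle.$$

$$\sigma_1 = X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_0 \end{pmatrix} \longrightarrow |\phi\rangle = \alpha_1 |0\rangle + \alpha_0 |1\rangle.$$

$$\sigma_2 = Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = i \begin{pmatrix} -\alpha_1 \\ \alpha_0 \end{pmatrix} \longrightarrow |\phi\rangle = -i\alpha_1 |0\rangle + i\alpha_0 |1\rangle.$$

$$\sigma_3 = Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = \begin{pmatrix} \alpha_0 \\ -\alpha_1 \end{pmatrix} \longrightarrow |\phi\rangle = \alpha_0 |0\rangle - \alpha_1 |1\rangle.$$

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \end{pmatrix} \longrightarrow |\phi\rangle = \alpha_0 \frac{|0\rangle + |1\rangle}{\sqrt{2}} + \alpha_1 \frac{|0\rangle - |1\rangle}{\sqrt{2}}.$$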

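The Hadamard gate, $H$, when applied to a basis state, $|0\rangle$ or $|1\rangle$, creates a superposition state: $|0\rangle \rightarrow \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $|1\rangle \rightarrow \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$.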
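The action of the five gates on a qubit can be reproduced with a small program that applies the equations $\alpha'_0 = g_{11}\alpha_0 + g_{12}\alpha_1$ and $\alpha'_1 = g_{21}\alpha_0 + g_{22}\alpha_1$. This is a minimal sketch; the dictionary `GATES` and the function `apply_gate` are names chosen for illustration.

```python
import math

s = 1 / math.sqrt(2)

# Transfer matrices of the one qubit gates discussed above.
GATES = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
    "H": [[s, s], [s, -s]],
}

def apply_gate(name, qubit):
    """Return the output amplitudes (alpha0', alpha1') of the named gate."""
    (g11, g12), (g21, g22) = GATES[name]
    alpha0, alpha1 = qubit
    return (g11 * alpha0 + g12 * alpha1, g21 * alpha0 + g22 * alpha1)

qubit = (0.6, 0.8)
for name in GATES:
    print(name, apply_gate(name, qubit))

print(apply_gate("H", (1, 0)))   # |0> becomes (|0> + |1>)/sqrt(2)
print(apply_gate("H", (0, 1)))   # |1> becomes (|0> - |1>)/sqrt(2)
```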


In Section 4.1 we derived the Pauli matrices from the description of the transformation carried out by each one of them. Recall that single qubit gate operations correspond to rotations and reflections on the Bloch sphere, as pointed out in Section 4.2.

5.3 The Hadamard Gate, Beam splitters and Interferometers

In Chapter 2 we discussed at length experiments involving a very simple device called a beam splitter. Let us note first that beam splitters have been constructed not only for photons, but also for other types of quantum particles.

We consider now a 50/50 beam splitter where an incident particle coming from above or from below has the same probability of emerging in the upward or in the downward beam, as seen in Figure 30(a). It turns out that the transformation performed by the beam splitter is described by the Hadamard transform [25].

Figure 30: (a) A beam splitter performs a transformation described by a Hadamard gate. (b) An interferometer performs a transformation described by two cascaded Hadamard gates.

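Indeed, let us call the input to a Hadamard gate $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$ and call its output $|\phi\rangle = H |\psi\rangle$. We have seen earlier that

$$|\phi\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \end{pmatrix}$$

or

$$|\phi\rangle = \frac{1}{\sqrt{2}}(\alpha_0 + \alpha_1) |0\rangle + \frac{1}{\sqrt{2}}(\alpha_0 - \alpha_1) |1\rangle = \alpha_0 \frac{|0\rangle + |1\rangle}{\sqrt{2}} + \alpha_1 \frac{|0\rangle - |1\rangle}{\sqrt{2}}.$$

The probability amplitude for finding the particle in the outgoing beam directed upwards is $\frac{1}{\sqrt{2}}(\alpha_0 + \alpha_1)$ and the probability amplitude for finding the particle in the outgoing beam directed downwards is $\frac{1}{\sqrt{2}}(\alpha_0 - \alpha_1)$.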

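Let us now consider a system consisting of two cascaded beam splitters. The setup in Figure 30(b) reminds us of the experiment presented in Figure 7 in Section 2.9.

In this case the output $|\phi\rangle$ is

$$|\phi\rangle = HH |\psi\rangle.$$

It is easy to see that $HH = I$. Indeed,

$$HH = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = I.$$

Thus

$$|\phi\rangle = HH |\psi\rangle = I |\psi\rangle = |\psi\rangle.$$

This result can be generalized. If we apply $n = 2k$ successive Hadamard transformations to a qubit in state $|\phi\rangle$ the qubit ends up in the same state

$$|\psi\rangle = H^{2k} |\phi\rangle = [H^2]^k |\phi\rangle = I^k |\phi\rangle = I |\phi\rangle = |\phi\rangle.$$

If $n = 2k + 1$, then $n$ successive Hadamard transformations of a qubit in state $|\phi\rangle$ lead to

$$|\psi\rangle = H^{2k+1} |\phi\rangle = [H^2]^k H |\phi\rangle = I^k H |\phi\rangle = I H |\phi\rangle = H |\phi\rangle.$$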
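The parity rule for repeated Hadamard transformations can be checked with a few lines of code. This is a minimal sketch with illustrative function names; the comparison uses a small tolerance because $1/\sqrt{2}$ is not represented exactly in floating point.

```python
import math

def hadamard(qubit):
    """Apply H to the amplitudes (alpha0, alpha1)."""
    alpha0, alpha1 = qubit
    s = 1 / math.sqrt(2)
    return (s * (alpha0 + alpha1), s * (alpha0 - alpha1))

def hadamard_n_times(qubit, n):
    """Apply n successive Hadamard transformations."""
    for _ in range(n):
        qubit = hadamard(qubit)
    return qubit

def close(u, v, tol=1e-9):
    """Compare two amplitude pairs within a tolerance."""
    return all(abs(x - y) < tol for x, y in zip(u, v))

phi = (0.6, 0.8j)
print(close(hadamard_n_times(phi, 6), phi))             # even n: the state is unchanged
print(close(hadamard_n_times(phi, 7), hadamard(phi)))   # odd n: same as a single H
```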

5.4 Two Qubit Gates. The CNOT Gate

Now we describe a gate with two inputs and two outputs called CNOT, the controlled-NOT gate, see Figure 31. One of the inputs is called the control input, the other one is the target input. The first output is called the control and the second the target.

The classical equivalent of a quantum CNOT gate is the XOR gate: its target output is the sum modulo two, ⊕, of its two inputs. For a classical CNOT gate the target output is equal to the target input if the control input is 0 and flipped if the control input is 1.

A quantum CNOT gate has two inputs as well; the control input is a qubit in state $|\psi\rangle$ and the target input is a qubit in state $|\phi\rangle$. The operation of the CNOT quantum gate is informally described as follows: the control input is transferred directly to the control output of the gate. The target output is equal to the target input if the control input is $|0\rangle$ and it is flipped if the control input is $|1\rangle$.

Figure 31: A classical CNOT gate with control input a, target input b, control output a, and target output a ⊕ b, where ⊕ denotes addition modulo 2. The target output is equal to the target input if the control input is 0 and flipped if the control input is 1. A quantum CNOT gate has two inputs as well; the control input is a qubit in state $|\psi\rangle$ and the target input is a qubit in state $|\phi\rangle$.

Flipping a bit $a$ means complementing it, transforming it into $\bar{a}$: if $a = 0$, it becomes 1 and vice versa. Flipping a qubit $|\phi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$ results in $|\phi'\rangle = \alpha_1 |0\rangle + \alpha_0 |1\rangle$; the projections on the two basis vectors are swapped.

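The input and the output qubits of a CNOT quantum gate can be represented as vectors in a four dimensional vector space, the $\mathcal{H}_4$ Hilbert space. Recall that the two qubits applied to the input of the CNOT gate in Figure 31 are a control qubit $|\psi\rangle$ and a target qubit $|\phi\rangle$

$$|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle, \qquad |\phi\rangle = \beta_0 |0\rangle + \beta_1 |1\rangle.$$

The input vector of the quantum CNOT gate is:

$$|V_{CNOT}\rangle = |\psi\rangle \otimes |\phi\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} \otimes \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} = \begin{pmatrix} \alpha_0\beta_0 \\ \alpha_0\beta_1 \\ \alpha_1\beta_0 \\ \alpha_1\beta_1 \end{pmatrix}.$$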

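The components of the input vector are transformed by the CNOT quantum gate as follows:

$$|00\rangle \rightarrow |00\rangle, \qquad |01\rangle \rightarrow |01\rangle, \qquad |10\rangle \rightarrow |11\rangle, \qquad |11\rangle \rightarrow |10\rangle.$$

Thus the transfer matrix of the CNOT quantum gate, $G_{CNOT}$, is:

$$G_{CNOT} = |00\rangle\langle 00| + |01\rangle\langle 01| + |10\rangle\langle 11| + |11\rangle\langle 10|.$$

Let us start from the basics and construct the basis vectors $|00\rangle$, $|01\rangle$, $|10\rangle$, and $|11\rangle$ in the 4-dimensional Hilbert space $\mathcal{H}_4$:

$$|00\rangle = |0\rangle \otimes |0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \qquad |01\rangle = |0\rangle \otimes |1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix},$$

$$|10\rangle = |1\rangle \otimes |0\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \qquad |11\rangle = |1\rangle \otimes |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$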

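The outer products of the basis vectors give us the four operators that appear in $G_{CNOT}$. The outer products of a basis vector with itself, $|00\rangle\langle 00|$ and $|01\rangle\langle 01|$, are projectors corresponding to the outcomes 00 and 01 of a measurement of the state of a two-qubit system; the operators $|10\rangle\langle 11|$ and $|11\rangle\langle 10|$ map $|11\rangle$ to $|10\rangle$ and $|10\rangle$ to $|11\rangle$, respectively.

$$|00\rangle\langle 00| = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

$$|01\rangle\langle 01| = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

$$|10\rangle\langle 11| = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

$$|11\rangle\langle 10| = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \begin{pmatrix} 0 & 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$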

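Therefore the transfer matrix of the circuit is

$$G_{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$

It is easy to determine the output $|W_{CNOT}\rangle$ given the input $|V_{CNOT}\rangle$ and the transfer matrix of a CNOT gate:

$$|W_{CNOT}\rangle = G_{CNOT} |V_{CNOT}\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} \alpha_0\beta_0 \\ \alpha_0\beta_1 \\ \alpha_1\beta_0 \\ \alpha_1\beta_1 \end{pmatrix} = \begin{pmatrix} \alpha_0\beta_0 \\ \alpha_0\beta_1 \\ \alpha_1\beta_1 \\ \alpha_1\beta_0 \end{pmatrix}.$$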
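The construction of the CNOT transfer matrix from outer products, and its action on the tensor product of a control and a target qubit, can be reproduced step by step. The following is a minimal sketch; the helper names are illustrative, and since the basis vectors are real the bra of a basis vector has the same components as the ket.

```python
def ket(bits):
    """Basis vector of the 4-dimensional space labelled by a string such as '10'."""
    vector = [0, 0, 0, 0]
    vector[int(bits, 2)] = 1
    return vector

def outer(ket_vector, bra_vector):
    """Outer product |ket><bra| for real vectors."""
    return [[k * b for b in bra_vector] for k in ket_vector]

def matrix_sum(matrices):
    return [[sum(m[i][j] for m in matrices) for j in range(4)] for i in range(4)]

# G_CNOT = |00><00| + |01><01| + |10><11| + |11><10|
G_CNOT = matrix_sum([
    outer(ket("00"), ket("00")),
    outer(ket("01"), ket("01")),
    outer(ket("10"), ket("11")),
    outer(ket("11"), ket("10")),
])

def tensor(psi, phi):
    """Tensor product of the control qubit psi and the target qubit phi."""
    return [psi[0] * phi[0], psi[0] * phi[1], psi[1] * phi[0], psi[1] * phi[1]]

def apply(matrix, vector):
    return [sum(matrix[i][j] * vector[j] for j in range(4)) for i in range(4)]

for row in G_CNOT:
    print(row)

control, target = (0.6, 0.8), (1.0, 0.0)
print(apply(G_CNOT, tensor(control, target)))
```

Applying `apply(G_CNOT, ...)` twice returns the input vector, which is a quick way to see that the gate undoes itself.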

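This result can be written as

$$|W_{CNOT}\rangle = \alpha_0\beta_0 |00\rangle + \alpha_0\beta_1 |01\rangle + \alpha_1\beta_1 |10\rangle + \alpha_1\beta_0 |11\rangle.$$

We see that the circuit in Figure 31 preserves the control qubit (the first and the second component of the input vector are replicated in the output vector) and flips the target qubit when the control qubit is $|1\rangle$ (the third and fourth component of the input vector, $|V_{CNOT}\rangle$, become the fourth and respectively the third component of the output vector).

The output can also be written as

$$|W_{CNOT}\rangle = \alpha_0 |0\rangle \left[\beta_0 |0\rangle + \beta_1 |1\rangle\right] + \alpha_1 |1\rangle \left[\beta_1 |0\rangle + \beta_0 |1\rangle\right].$$

The CNOT gate is reversible. Indeed, applying the gate twice gives the identity:

$$G_{CNOT}\, G_{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$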

The control qubit, $|\psi\rangle$, is replicated at the output and knowing it we can reconstruct the target input qubit $|\phi\rangle$ given the target output qubit of the CNOT gate, $|\psi \oplus \phi\rangle$, see Figure 31.

Later, we show that CNOT is a universal quantum gate: any multiple qubit gate can be constructed from single qubit and CNOT gates.

5.5 Can we Build Quantum Copy Machines?

There are several ways to replicate an input signal using classical gates. In Figure 32(a) we see a classical analog of a CNOT gate, a binary circuit with two inputs, a control bit x and a target bit y, and two outputs, x and x ⊕ y. The second output is produced by an XOR gate. If the target bit is zero, y = 0, the circuit simply replicates input x on both output lines, as shown in Figure 32(b).

Now that we have a two qubit CNOT gate and have figured out how to replicate an input classical bit, let us see if we can copy qubits. In Figure 32(c) we show a CNOT with an arbitrary control qubit as input:

$$|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle.$$

We wish to replicate $|\psi\rangle$ on its output lines and try to determine the target qubit that will allow us to do so. To replicate the input implies that the output of this gate should be a vector $|W\rangle$ in $\mathcal{H}_4$

$$|W\rangle = |\psi\rangle |\psi\rangle = (\alpha_0 |0\rangle + \alpha_1 |1\rangle)(\alpha_0 |0\rangle + \alpha_1 |1\rangle) = \alpha_0^2 |00\rangle + \alpha_0\alpha_1 |01\rangle + \alpha_0\alpha_1 |10\rangle + \alpha_1^2 |11\rangle.$$

Alternatively:

$$|W\rangle = |\psi\rangle |\psi\rangle = |\psi\rangle \otimes |\psi\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} \otimes \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} = \begin{pmatrix} \alpha_0^2 \\ \alpha_0\alpha_1 \\ \alpha_1\alpha_0 \\ \alpha_1^2 \end{pmatrix}.$$

Figure 32: (a) A classical binary circuit with two inputs x and y and two outputs x and x ⊕ y. (b) When y = 0 the circuit in (a) simply replicates input x on both output lines. (c) A quantum CNOT with an arbitrary input $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$; we would like it to replicate $|\psi\rangle$ on its output lines. We know its desired output state but we do not know yet what the second input should be. (d) If we select the second input to be $|0\rangle$ then the output is $\alpha_0 |00\rangle + \alpha_1 |11\rangle$, not exactly what we wished for.

We do not know yet what the input should be, but, based upon the analogy with theclassical case, we suspect that | 0〉 may do it. Let us try to determine the actual output stateof the CNOT gate in Figure 32(d). First we determine the components of its input vector:

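\[ |V\rangle = |\psi\rangle |0\rangle = (\alpha_0 |0\rangle + \alpha_1 |1\rangle) |0\rangle = \alpha_0 |00\rangle + \alpha_1 |10\rangle, \qquad |V\rangle = \begin{pmatrix} \alpha_0 \\ 0 \\ \alpha_1 \\ 0 \end{pmatrix}. \]

The actual output vector is:

\[ |W\rangle = G_{\text{CNOT}} |V\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} \alpha_0 \\ 0 \\ \alpha_1 \\ 0 \end{pmatrix} = \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ \alpha_1 \end{pmatrix}. \]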

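The actual output vector, $\alpha_0 |00\rangle + \alpha_1 |11\rangle$, is different from the desired one; the two coincide only when $|\psi\rangle$ is one of the basis states $|0\rangle$ or $|1\rangle$. The only conclusion we can draw from the exercise described above is that the CNOT gate in Figure 32(d) cannot be used to copy qubits. As we shall see when we discuss the no-cloning theorem, we simply cannot copy unknown quantum states.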

An informal explanation of the limitations of the circuit in Figure 32(d) is that once we measure one of the qubits of $\alpha_0 |00\rangle + \alpha_1 |11\rangle$ we obtain “0” with probability $|\alpha_0|^2$ or “1” with probability $|\alpha_1|^2$, and once we have measured one qubit the other is completely determined.
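The comparison between the desired and the actual output can also be checked numerically. The following is a minimal sketch in plain Python, assuming the illustrative amplitudes α0 = 0.6 and α1 = 0.8 (they are not taken from the text); the helper names `kron`, `apply_gate` and `same` are hypothetical.

```python
import math

def kron(u, v):
    """Tensor product of two state vectors given as flat lists."""
    return [a * b for a in u for b in v]

def apply_gate(gate, state):
    """Multiply a transfer matrix (list of rows) by a state vector."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]

def same(u, v):
    """Component-wise comparison with a small tolerance."""
    return all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(u, v))

# Transfer matrix of the CNOT gate in the basis |00>, |01>, |10>, |11>.
G_CNOT = [[1, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 1, 0]]

alpha0, alpha1 = 0.6, 0.8              # assumed amplitudes, normalized
psi, zero, one = [alpha0, alpha1], [1, 0], [0, 1]

actual = apply_gate(G_CNOT, kron(psi, zero))   # alpha0|00> + alpha1|11>
desired = kron(psi, psi)                       # |psi>|psi>

print("actual :", actual)
print("desired:", desired)
print("superposition copied?", same(actual, desired))

# A basis state, by contrast, is replicated on both output lines.
copied_one = apply_gate(G_CNOT, kron(one, zero))
print("basis state copied?", same(copied_one, kron(one, one)))
```

The circuit behaves like the classical replicator of Figure 32(b) on basis states only; for a genuine superposition the two vectors differ.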

5.6 Three Qubit Gates. The Fredkin Gate

Three qubit gates have three inputs and three outputs. One or two of the inputs are referred to as control qubit(s) and are transferred directly to the output.

The gate in Figure 33 is called a Fredkin gate. There are quantum Fredkin gates as well as classical ones that perform the same function; both are reversible. We use the classical version of this gate to construct the truth table and derive the Boolean expressions relating the inputs and outputs.

The Fredkin gate has two regular inputs, a and b, a control input c, and three outputs, a′, b′ and c′. The control input c is transferred directly to the output, c′ = c. The control input determines the regular outputs as follows:

(i) When c = 0 the two regular inputs are transferred without modification to the output, a′ = a and b′ = b, see Figure 33(a). The truth table for this case has four entries, one for each possible combination of values of a and b.

(ii) When c = 1 the two regular inputs are swapped, a′ = b and b′ = a, see Figure 33(b) for the circuit and its truth table.

The full truth table of the Fredkin gate in Figure 33(c) is obtained by concatenating the truth tables for the two configurations in Figures 33(a) and (b). The logic expressions for the outputs of the Fredkin gate are derived following the rules we used for the CarryOut of the full adder presented at the beginning of this section. The last equation below confirms that the control input is transferred directly to the output:

\[ a' = a\bar{b}\bar{c} + ab\bar{c} + \bar{a}bc + abc = a\bar{c}(\bar{b} + b) + bc(\bar{a} + a) = a\bar{c} + bc. \]

\[ b' = \bar{a}b\bar{c} + ab\bar{c} + a\bar{b}c + abc = b\bar{c}(\bar{a} + a) + ac(\bar{b} + b) = b\bar{c} + ac. \]

\[ c' = \bar{a}\bar{b}c + \bar{a}bc + a\bar{b}c + abc = \bar{a}c(\bar{b} + b) + ac(\bar{b} + b) = \bar{a}c + ac = c(\bar{a} + a) = c. \]

Now we discuss several properties of the Fredkin gate. First, we show that the Fredkin gate is indeed reversible: knowing a′, b′ and c′ we can determine a, b and c. This is easy to prove; for example, one can express a and b as functions of a′, b′ and c′ using the truth table in Figure 33(c). If we apply two consecutive Fredkin gates, with inputs and outputs (a, b, c) and (a′, b′, c′) for the first and (a′, b′, c′) and (a′′, b′′, c′′) for the second, the second gate has as input the output of the first gate. Then a′′ = a, b′′ = b, and c′′ = c; the output of the second gate is the same as the input of the first one.

Figure 33: The Fredkin gate has three inputs, a, b, and control, or c; it also has three outputs, a′, b′, and c′ = c. (a) When c = 0 the inputs appear at the output, a′ = a and b′ = b. (b) When c = 1 the inputs are swapped, a′ = b and b′ = a. (c) The truth table for the Fredkin gate. (d) The Fredkin gate becomes an AND gate when a = 0; then $a' = bc$ and $b' = b\bar{c}$. (e) The Fredkin gate becomes a NOT gate when a = 1 and b = 0; then $a' = \bar{c}$ and $b' = c$.


We examine the truth table once again and observe that the Fredkin gate conserves the number of 1’s at its input and for this reason is called a conservative logic gate. This property suggests analogies with physical laws regarding the conservation of mass, energy, and momentum.

Last, but not least, we show that the Fredkin gate is universal. The Fredkin gate can simulate an AND gate and a NOT gate. Consider the case shown in Figure 33(d) when a = 0. In this case we have an AND gate (recall that the product of b and c corresponds to the Boolean AND):

\[ a' = bc \quad \text{and} \quad b' = b\bar{c}. \]

When a = 1 and b = 0, see Figure 33(e), we have a NOT gate:

\[ a' = \bar{c} \quad \text{and} \quad b' = c. \]

The configuration in Figure 33(e) generates two copies of the input c, on the outputs b′ and c′, and can be used as a FANOUT gate. The Fredkin gate can perform a switching function called CROSSOVER when it swaps its two inputs as in Figure 33(b).
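The properties of the classical Fredkin gate discussed above can be checked by enumerating its eight input combinations. The following is a small sketch in plain Python; the function name `fredkin` and the variable names are chosen here for illustration.

```python
from itertools import product

def fredkin(a, b, c):
    """Classical Fredkin gate: swap a and b when the control c is 1."""
    return (b, a, c) if c else (a, b, c)

# Full truth table, keyed by the input triple (a, b, c).
table = {bits: fredkin(*bits) for bits in product((0, 1), repeat=3)}

# Boolean expressions: a' = a*not(c) + b*c, b' = b*not(c) + a*c, c' = c.
matches_expressions = all(
    out == ((a & (1 - c)) | (b & c), (b & (1 - c)) | (a & c), c)
    for (a, b, c), out in table.items()
)

# Reversible: applying the gate twice returns the original input.
is_reversible = all(fredkin(*out) == bits for bits, out in table.items())

# Conservative: the number of 1's is the same at input and output.
is_conservative = all(sum(bits) == sum(out) for bits, out in table.items())

# a = 0 gives a' = b AND c; a = 1, b = 0 gives a' = NOT c.
and_outputs = {(b, c): fredkin(0, b, c)[0] for b, c in product((0, 1), repeat=2)}
not_outputs = {c: fredkin(1, 0, c)[0] for c in (0, 1)}

for bits, out in table.items():
    print(bits, "->", out)
print(matches_expressions, is_reversible, is_conservative)
```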

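The input and output vectors of a quantum Fredkin gate are related by a set of linear equations:

\[ |W_{\text{Fredkin}}\rangle = G_{\text{Fredkin}} |V_{\text{Fredkin}}\rangle \]

where $G_{\text{Fredkin}}$ is the transfer matrix of the gate.

A system of three qubits requires an eight-dimensional complex vector space with a basis consisting of eight vectors: $|000\rangle$, $|001\rangle$, $|010\rangle$, $|011\rangle$, $|100\rangle$, $|101\rangle$, $|110\rangle$, and $|111\rangle$. In this space a vector $|V\rangle$ is a linear combination of the basis vectors with complex coefficients $\alpha_{000}, \alpha_{001}, \alpha_{010}, \alpha_{011}, \alpha_{100}, \alpha_{101}, \alpha_{110}, \alpha_{111}$:

\[ |V_{\text{Fredkin}}\rangle = \alpha_{000} |000\rangle + \alpha_{001} |001\rangle + \alpha_{010} |010\rangle + \alpha_{011} |011\rangle + \alpha_{100} |100\rangle + \alpha_{101} |101\rangle + \alpha_{110} |110\rangle + \alpha_{111} |111\rangle. \]

Thus $G_{\text{Fredkin}} = [g_{ij}]$, $1 \le i, j \le 8$, is an $8 \times 8$ matrix.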

5.7 The Toffoli Gate

While the Fredkin gate has only one control input, the Toffoli gate has two control inputs, a and b, and one target input c. The outputs are a′ = a, b′ = b and c′, see Figure 34(a). The Toffoli gate is a universal gate and it is reversible.

There are both classical and quantum versions of a Toffoli gate. The truth table and the transfer matrix of classical and quantum Toffoli gates are identical. For the sake of clarity we describe first the function of a classical Toffoli gate. If both control bits are 1 then c is flipped, otherwise its state is unchanged. The truth table of the classical Toffoli gate reflects the functional description, $c' = c$ for the first six entries and $c' = \bar{c}$ for the last two, when a = b = 1:

a  b  c  |  a′ b′ c′
0  0  0  |  0  0  0
0  0  1  |  0  0  1
0  1  0  |  0  1  0
0  1  1  |  0  1  1
1  0  0  |  1  0  0
1  0  1  |  1  0  1
1  1  0  |  1  1  1
1  1  1  |  1  1  0

Figure 34: (a) The Toffoli gate has three inputs, two control inputs a and b and a target, c. The outputs are a′ = a, b′ = b and c′ = c ⊕ ab. If both control bits or qubits are 1 then c is flipped, otherwise its state is unchanged. (b) A classical Toffoli gate can be used to implement a NAND gate; when c = 1 then c′ = 1 ⊕ (a AND b) = NOT(a AND b). If c = 0 then c′ = a AND b. (c) A quantum Toffoli gate performs a FANOUT function.

From the truth table we see that c′ = c ⊕ (a AND b). From the definition of addition modulo two (denoted by ⊕) it follows immediately that for any binary variable a, $1 \oplus a = \bar{a}$ and $0 \oplus a = a$.

If c = 1 then c′ = a NAND b. If c = 0 then c′ = a AND b. The Toffoli gate implements both the NAND and the AND functions, see Figure 34(b). Figure 34(c) shows a FANOUT circuit; when a = 1 and c = 0, the target output is c′ = c ⊕ (a AND b) = 0 ⊕ b = b. The second control input bit is replicated.
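The three uses of the classical Toffoli gate described above (NAND, AND and FANOUT) follow directly from c′ = c ⊕ (a AND b). A brief sketch in plain Python, with the hypothetical function name `toffoli`:

```python
from itertools import product

def toffoli(a, b, c):
    """Classical Toffoli gate: flip the target c when both controls are 1."""
    return a, b, c ^ (a & b)

pairs = list(product((0, 1), repeat=2))

# Target preset to 1: c' = NOT(a AND b), a NAND gate.
nand_table = {(a, b): toffoli(a, b, 1)[2] for a, b in pairs}

# Target preset to 0: c' = a AND b.
and_table = {(a, b): toffoli(a, b, 0)[2] for a, b in pairs}

# a = 1 and c = 0: the outputs (b', c') are two copies of b (FANOUT).
fanout = {b: toffoli(1, b, 0)[1:] for b in (0, 1)}

# The gate is its own inverse, hence reversible.
self_inverse = all(
    toffoli(*toffoli(*bits)) == bits for bits in product((0, 1), repeat=3)
)

print(nand_table)
print(and_table)
print(fanout, self_inverse)
```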

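Let us now examine the quantum Toffoli gate. Let the three input qubits corresponding to a, b, and c be:

\[ |\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle, \qquad |\phi\rangle = \beta_0 |0\rangle + \beta_1 |1\rangle, \qquad |\zeta\rangle = \gamma_0 |0\rangle + \gamma_1 |1\rangle. \]

Then the input is $|V_{\text{Toffoli}}\rangle = |\psi\rangle |\phi\rangle |\zeta\rangle$. If we substitute the expressions for the three qubits and carry out the vector multiplication we get:

\[ |V_{\text{Toffoli}}\rangle = \alpha_0\beta_0\gamma_0 |000\rangle + \alpha_0\beta_0\gamma_1 |001\rangle + \alpha_0\beta_1\gamma_0 |010\rangle + \alpha_0\beta_1\gamma_1 |011\rangle + \alpha_1\beta_0\gamma_0 |100\rangle + \alpha_1\beta_0\gamma_1 |101\rangle + \alpha_1\beta_1\gamma_0 |110\rangle + \alpha_1\beta_1\gamma_1 |111\rangle. \]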

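The vector describing the input is:

\[ |V_{\text{Toffoli}}\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} \otimes \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} \otimes \begin{pmatrix} \gamma_0 \\ \gamma_1 \end{pmatrix} = \begin{pmatrix} \alpha_0\beta_0 \\ \alpha_0\beta_1 \\ \alpha_1\beta_0 \\ \alpha_1\beta_1 \end{pmatrix} \otimes \begin{pmatrix} \gamma_0 \\ \gamma_1 \end{pmatrix} = \begin{pmatrix} \alpha_0\beta_0\gamma_0 \\ \alpha_0\beta_0\gamma_1 \\ \alpha_0\beta_1\gamma_0 \\ \alpha_0\beta_1\gamma_1 \\ \alpha_1\beta_0\gamma_0 \\ \alpha_1\beta_0\gamma_1 \\ \alpha_1\beta_1\gamma_0 \\ \alpha_1\beta_1\gamma_1 \end{pmatrix}. \]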

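The transfer matrix of a Toffoli gate is:

\[ G_{\text{Toffoli}} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}. \]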

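The output of the Toffoli gate is $|W_{\text{Toffoli}}\rangle = G_{\text{Toffoli}} |V_{\text{Toffoli}}\rangle$:

\[ |W_{\text{Toffoli}}\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} \alpha_0\beta_0\gamma_0 \\ \alpha_0\beta_0\gamma_1 \\ \alpha_0\beta_1\gamma_0 \\ \alpha_0\beta_1\gamma_1 \\ \alpha_1\beta_0\gamma_0 \\ \alpha_1\beta_0\gamma_1 \\ \alpha_1\beta_1\gamma_0 \\ \alpha_1\beta_1\gamma_1 \end{pmatrix} = \begin{pmatrix} \alpha_0\beta_0\gamma_0 \\ \alpha_0\beta_0\gamma_1 \\ \alpha_0\beta_1\gamma_0 \\ \alpha_0\beta_1\gamma_1 \\ \alpha_1\beta_0\gamma_0 \\ \alpha_1\beta_0\gamma_1 \\ \alpha_1\beta_1\gamma_1 \\ \alpha_1\beta_1\gamma_0 \end{pmatrix}. \]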

5.8 Quantum Circuits

We learned from the previous sections that quantum particles can be used to store information. The state space of n qubits has dimension $2^n$, considerably larger than that of a classical system with n bits. We could simulate very complex physical systems with an n-qubit quantum computer able to manipulate qubits. Now we ask ourselves what a quantum computer might be and how the qubits can be transformed inside such a quantum computing device.

According to Andrew Steane [111] “the quantum computer is first and foremost a machine, which is a theoretical construct, like a thought experiment, whose purpose is to allow quantum information processing to be formally analyzed.” In the same paper [111], Steane provides a more precise definition of a quantum computer put forward by David Deutsch: “A quantum computer is a set of n qubits in which the following operations are feasible

(i) Each qubit can be prepared in some known state $|0\rangle$.

(ii) Each qubit can be measured in the basis $\{|0\rangle, |1\rangle\}$.

(iii) A universal quantum gate (or set of gates) can be applied at will to any fixed-size subset of the qubits.

(iv) The qubits do not evolve other than via the above transformations.”

Quantum circuits are built by interconnecting quantum gates. Several limitations are imposed in the realization of quantum circuits. First, the circuits are acyclic; feedback from one part of the circuit to another is not allowed, there are no loops. Second, we cannot copy qubits.

5.9 The No Cloning Theorem

The transformations carried out by quantum circuits are unitary. As a consequence of this fact unknown quantum states cannot be copied or cloned. Several proofs of this theorem are available [80].

Here we present a proof by contradiction presented first in [125]. Let us assume that a two input gate capable of cloning one of its inputs exists. By now we know that a gate performs a linear transformation of its input vector V into an output vector W = GV with G the transfer matrix of the gate. We already know that transformations performed by matrices can be represented by linear operators. Let us call U the unitary transformation corresponding to the unitary matrix G of a two input gate that allows us to replicate an input qubit.

Let a and b be two orthogonal quantum states, or qubits. If we apply each one of them at the first input of the gate independently, the fact that the gate clones its input means that

\[ U(|a0\rangle) = |aa\rangle \]

and

\[ U(|b0\rangle) = |bb\rangle. \]

These two equations simply state that when the second input qubit is $|0\rangle$ the output vector of the gate (a vector in $\mathcal{H}_4$ obtained by composing the two output qubits) consists of two replicas of the first qubit, $|a\rangle$ for the first case and $|b\rangle$ for the second case.

Now consider another state $|c\rangle = \frac{1}{\sqrt{2}}(|a\rangle + |b\rangle)$. U is a linear transformation, thus

\[ U(|c0\rangle) = \frac{1}{\sqrt{2}} [U(|a0\rangle) + U(|b0\rangle)] = \frac{1}{\sqrt{2}} [|aa\rangle + |bb\rangle]. \]

We could try to apply c at the input of our cloning gate. Then we expect c to be cloned:

\[ U(|c0\rangle) = |cc\rangle. \]

But

\[ |cc\rangle = \left[\frac{1}{\sqrt{2}}(|a\rangle + |b\rangle)\right] \left[\frac{1}{\sqrt{2}}(|a\rangle + |b\rangle)\right] = \frac{1}{2}(|aa\rangle + |ab\rangle + |ba\rangle + |bb\rangle). \]

This contradicts the expression we got earlier assuming linearity. We have to conclude that there is no linear operation that reliably clones unknown quantum states. It is possible to clone known states, after a measurement has been performed. Before the measurement of a quantum system in an unknown state the outcome of the measurement is uncertain. After the measurement the state is the projection of the original state on one of the basis vectors, thus it is well determined.
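The contradiction can be made concrete with numbers. The sketch below, in plain Python, assumes the particular orthogonal pair $|a\rangle = |0\rangle$ and $|b\rangle = |1\rangle$ (any orthogonal pair would do) and compares the vector that linearity forces upon $U(|c0\rangle)$ with the vector that cloning would require; all names are illustrative.

```python
import math

def kron(u, v):
    """Tensor product of two state vectors given as flat lists."""
    return [x * y for x in u for y in v]

# Assumed concrete choice of the two orthogonal states.
a, b = [1.0, 0.0], [0.0, 1.0]
r = 1 / math.sqrt(2)
c = [r * (x + y) for x, y in zip(a, b)]          # |c> = (|a> + |b>)/sqrt(2)

# Linearity forces U|c0> = (|aa> + |bb>)/sqrt(2).
forced_by_linearity = [r * (x + y) for x, y in zip(kron(a, a), kron(b, b))]

# Cloning would require U|c0> = |cc>.
required_by_cloning = kron(c, c)

# Inner product of the two candidate outputs; it would be 1 if they agreed.
overlap = sum(x * y for x, y in zip(forced_by_linearity, required_by_cloning))

print("forced  :", forced_by_linearity)
print("required:", required_by_cloning)
print("overlap :", overlap)
```

Both vectors have unit norm, so an overlap smaller than 1 means they are different states.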


5.10 Qubit Swapping and Full Adder Circuits

In this section we give examples of quantum circuits. Figure 35 presents a two bit adder circuit constructed with reversible gates. Figure 36 displays a circuit for swapping two qubits.

Figure 35: A two-bit adder made of reversible gates. The inputs are a, b and c = 0; the outputs are a′′ = a, b′′ = Sum = a XOR b and c′′ = CarryOut = a AND b.

The two bit adder consists of a Toffoli gate followed by a CNOT gate. The two control inputs to the Toffoli gate are a and b and its target input is c = 0. The target output of the Toffoli gate is c′ = c ⊕ (a AND b) = a AND b, because c = 0. This propagates to the output and c′′ = a AND b. Then a′′ = a′ = a as the control input of the CNOT propagates to its output unchanged. Finally, b′ = b and this becomes the target input of the CNOT gate. Then the target output of the CNOT gate is b′′ = a′ XOR b′ = a XOR b. Thus the two outputs of the circuit are the Sum and the CarryOut of the two inputs, a and b.

The circuit for swapping two qubits consists of three CNOT gates.

Figure 36: A circuit for swapping two qubits. The inputs and outputs of each stage are shown: a′ = a and b′ = a ⊕ b after the first stage; a′′ = a′ ⊕ b′ = a ⊕ (a ⊕ b) = b and b′′ = b′ = a ⊕ b after the second stage; a′′′ = b and b′′′ = a′′ ⊕ b′′ = a after the third stage.

Given Boolean variables a and b we note that a ⊕ (a ⊕ b) = (a ⊕ a) ⊕ b = b and b ⊕ (a ⊕ b) = (b ⊕ b) ⊕ a = a. It is easy to see that a ⊕ a = 0 and 0 ⊕ b = b.

To show that the two inputs are swapped we determine the output of each stage. The inputs to the second stage are a′ = a and b′ = a ⊕ b. The inputs to the third stage are a′′ = a′ ⊕ b′ = b and b′′ = b′ = a ⊕ b. Based upon the previous observations we see that the outputs of the third stage are a′′′ = a′′ = b and b′′′ = a′′ ⊕ b′′ = a, thus the circuit swaps its two inputs.
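The stage-by-stage reasoning for both circuits of this section can be replayed on classical bits. The following sketch in plain Python uses hypothetical helper names (`cnot`, `swap`, `two_bit_adder`) and follows the stage order described in the text.

```python
from itertools import product

def cnot(control, target):
    """Classical CNOT: returns (control, control XOR target)."""
    return control, control ^ target

def swap(a, b):
    """Three CNOT stages; the middle one has control and target reversed."""
    a1, b1 = cnot(a, b)        # stage 1: a' = a,       b' = a XOR b
    b2, a2 = cnot(b1, a1)      # stage 2: a'' = a' XOR b' = b, b'' = b'
    a3, b3 = cnot(a2, b2)      # stage 3: a''' = b,     b''' = a'' XOR b'' = a
    return a3, b3

def two_bit_adder(a, b):
    """Toffoli gate (target preset to 0) followed by a CNOT gate."""
    carry = 0 ^ (a & b)        # Toffoli: c' = c XOR (a AND b) with c = 0
    _, total = cnot(a, b)      # CNOT: b'' = a XOR b
    return total, carry

for a, b in product((0, 1), repeat=2):
    print((a, b), "swap ->", swap(a, b), " (Sum, CarryOut) ->", two_bit_adder(a, b))
```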

Example. We show that the circuit in Figure 37 constructed with Toffoli and CNOT gates implements a full one qubit adder.

First, we observe that the first three qubits from the top, namely $|c\rangle = |CarryIn\rangle$, $|x\rangle$ and $|y\rangle$, are control qubits on all gates; thus, they are transferred without any change to the output. We only have to compute expressions for $|Sum\rangle$ and $|CarryOut\rangle$.

For simplicity we drop the ket notations and denote qubit | x〉 simply as x.

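Figure 37: A full one qubit adder constructed with Toffoli and CNOT gates. The inputs are $|CarryIn\rangle$, $|x\rangle$, $|y\rangle$ and two qubits in state $|0\rangle$; the outputs are $|CarryIn\rangle$, $|x\rangle$, $|y\rangle$, $|Sum\rangle$ and $|CarryOut\rangle$. The stages of the circuit are marked 1 to 4.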

If we call $s_1$ and $s_2$ the results of the CNOT transformations at the stages marked as 1 and 2 in Figure 37 we see that

\[ s_1 = c \oplus 0 = c, \]

\[ s_2 = s_1 \oplus x = c \oplus x, \]

\[ Sum = s_2 \oplus y = (c \oplus x) \oplus y. \]

It is easy to show that

\[ (c \oplus x) \oplus y = \bar{x}\bar{y}c + \bar{x}y\bar{c} + x\bar{y}\bar{c} + xyc. \]

This is precisely the expression for the Sum derived in Section 5.1. To prove this equality we use the fact that $x \oplus y = x\bar{y} + \bar{x}y$ and we also apply de Morgan’s Laws, $\overline{x + y} = \bar{x}\,\bar{y}$ and $\overline{xy} = \bar{x} + \bar{y}$.

If we call $c_3$ and $c_4$ the results of the transformations performed by the Toffoli gates at the stages marked as 3 and 4 in Figure 37 we see that

c3 = 0⊕ (xy) = xy

c4 = c3 ⊕ (cx) = (xy)⊕ (cx),

CarryOut = c4 ⊕ (cy) = (xy)⊕ (cx)⊕ (cy).

It is easy to show that

\[ (xy) \oplus (cx) \oplus (cy) = xy\bar{c} + x\bar{y}c + \bar{x}yc + xyc. \]

This is precisely the expression for the CarryOut derived in Section 5.1.
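Both identities can be verified exhaustively, since there are only eight combinations of c, x and y. A minimal sketch in plain Python, where the function `full_adder` (a hypothetical name) mirrors the CNOT chain for the Sum and the Toffoli chain for the CarryOut:

```python
from itertools import product

def full_adder(c, x, y):
    """Sum via three CNOTs and CarryOut via three Toffoli gates, on bits."""
    s = 0
    for control in (c, x, y):          # CNOT chain: s = c XOR x XOR y
        s ^= control
    carry = 0
    for p, q in ((x, y), (c, x), (c, y)):   # Toffoli chain
        carry ^= p & q                 # carry = xy XOR cx XOR cy
    return s, carry

def sum_of_products(c, x, y):
    """The sum-of-products forms of Sum and CarryOut, with 1 - v for NOT v."""
    nc, nx, ny = 1 - c, 1 - x, 1 - y
    s = (nx & ny & c) | (nx & y & nc) | (x & ny & nc) | (x & y & c)
    carry = (x & y & nc) | (x & ny & c) | (nx & y & c) | (x & y & c)
    return s, carry

for bits in product((0, 1), repeat=3):
    print(bits, full_adder(*bits), sum_of_products(*bits))
```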

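Example. The transfer matrix of the circuit in Figure 38 is

\[ G_Q = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}. \]

Figure 38: A three qubit gate with a CNOT. The inputs are $|a\rangle$, $|b\rangle$, $|c\rangle$ and the outputs are $|a'\rangle$, $|b'\rangle$, $|c'\rangle$.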

We observe that the topmost qubit $|a\rangle$ is the control qubit of a CNOT gate, thus it is transferred to the output unchanged, $|a'\rangle = |a\rangle$. The second qubit is not affected, $|b'\rangle = |b\rangle$. The third qubit is flipped when $|a\rangle = |1\rangle$ and left alone when $|a\rangle = |0\rangle$. Hence, the truth table of this circuit is

a  b  c  |  a′ b′ c′
0  0  0  |  0  0  0
0  0  1  |  0  0  1
0  1  0  |  0  1  0
0  1  1  |  0  1  1
1  0  0  |  1  0  1
1  0  1  |  1  0  0
1  1  0  |  1  1  1
1  1  1  |  1  1  0

This means that the quantum circuit maps its input to the output as follows

\[ |000\rangle \to |000\rangle, \quad |001\rangle \to |001\rangle, \quad |010\rangle \to |010\rangle, \quad |011\rangle \to |011\rangle, \]

\[ |100\rangle \to |101\rangle, \quad |101\rangle \to |100\rangle, \quad |110\rangle \to |111\rangle, \quad |111\rangle \to |110\rangle. \]

Therefore

\[ G_Q = |000\rangle\langle 000| + |001\rangle\langle 001| + |010\rangle\langle 010| + |011\rangle\langle 011| + |100\rangle\langle 101| + |101\rangle\langle 100| + |110\rangle\langle 111| + |111\rangle\langle 110|. \]

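We compute the individual basis vectors

\[ |000\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = (1, 0, 0, 0, 0, 0, 0, 0)^T \]

\[ |001\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = (0, 1, 0, 0, 0, 0, 0, 0)^T \]

\[ |010\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = (0, 0, 1, 0, 0, 0, 0, 0)^T \]

\[ |011\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = (0, 0, 0, 1, 0, 0, 0, 0)^T \]

\[ |100\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = (0, 0, 0, 0, 1, 0, 0, 0)^T \]

\[ |101\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = (0, 0, 0, 0, 0, 1, 0, 0)^T \]

\[ |110\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = (0, 0, 0, 0, 0, 0, 1, 0)^T \]

\[ |111\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = (0, 0, 0, 0, 0, 0, 0, 1)^T \]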

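It is easy to compute the outer products in the expression for $G_Q$. We only compute the last term of this sum.

\[ |111\rangle\langle 110| = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}. \]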
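The remaining seven outer products and their sum can be computed mechanically. The sketch below, in plain Python with illustrative helper names, builds $G_Q$ as the sum of $|\text{output}\rangle\langle\text{input}|$ over the eight basis states, where the output is obtained by flipping the third qubit whenever the first qubit is 1.

```python
def basis(index, dim=8):
    """Column of the basis vector |index> as a flat list."""
    return [1 if i == index else 0 for i in range(dim)]

def outer(ket, bra):
    """Outer product |ket><bra| as a list of rows."""
    return [[k * b for b in bra] for k in ket]

def add(A, B):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def circuit(index):
    """Index abc in binary: flip c (lowest bit) when a (highest bit) is 1."""
    a = (index >> 2) & 1
    return index ^ a

G_Q = [[0] * 8 for _ in range(8)]
for inp in range(8):
    G_Q = add(G_Q, outer(basis(circuit(inp)), basis(inp)))

for row in G_Q:
    print(row)
```

The printed rows can be compared directly with the transfer matrix given at the start of this example.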

5.11 Unitary Operations on a Single Qubit. Rotation Matrices

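In the previous sections we stressed the fact that single qubit transformations must be unitary to preserve the norm. We also discussed Pauli matrices

\[ X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]

We denote by I the identity matrix

\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]

The Hadamard (H), the phase (S), the π/8 (T), the phase-shift with angle θ, $P_\theta$, $R_\theta$, and $R_k$ are also unitary matrices

\[ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \qquad S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} \qquad T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix} \]

\[ P_\theta = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{i\theta} \end{pmatrix} \qquad R_\theta = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\theta} \end{pmatrix} \qquad R_k = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i / 2^k} \end{pmatrix} \]

The state of a qubit can be represented as a unit vector on the Bloch sphere, see Section 4.2. When we apply a transformation to a qubit the corresponding vector on the Bloch sphere is transformed. In particular, we are interested in rotations with an angle θ about the x, y, and z axes. The three rotations are performed respectively by the following operators obtained by the exponentiation of the X, Y, and Z Pauli matrices.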

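For simplicity we drop the vector notations used when discussing rotation operations and the Bloch sphere in Section 4.3. We now denote the rotation operators $R_x(\theta), R_y(\theta), R_z(\theta)$ instead of $R_{\vec{x}}(\theta), R_{\vec{y}}(\theta), R_{\vec{z}}(\theta)$

\[ R_x(\theta) = \cos(\theta/2) I - i \sin(\theta/2) X = \begin{pmatrix} \cos(\theta/2) & -i \sin(\theta/2) \\ -i \sin(\theta/2) & \cos(\theta/2) \end{pmatrix} \]

\[ R_y(\theta) = \cos(\theta/2) I - i \sin(\theta/2) Y = \begin{pmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{pmatrix} \]

\[ R_z(\theta) = \cos(\theta/2) I - i \sin(\theta/2) Z = \begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{pmatrix} \]

We have seen that the rotation matrices satisfy the following properties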

Ry(θ1)Ry(θ2) = Ry(θ1 + θ2)

Rz(θ1)Rz(θ2) = Rz(θ1 + θ2)

XRy(θ)X = Ry(−θ)

XRz(θ)X = Rz(−θ)

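Now we show that every unitary 2 × 2 matrix A can be expressed as

\[ A = \begin{pmatrix} e^{i\delta} & 0 \\ 0 & e^{i\delta} \end{pmatrix} \begin{pmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{pmatrix} \begin{pmatrix} \cos(\theta/2) & \sin(\theta/2) \\ -\sin(\theta/2) & \cos(\theta/2) \end{pmatrix} \begin{pmatrix} e^{i\beta/2} & 0 \\ 0 & e^{-i\beta/2} \end{pmatrix} \]

with $\alpha, \beta, \delta, \theta \in \mathbb{R}$. Recall that a matrix is unitary if and only if its row and column vectors are orthonormal.

If U is a unitary 2 × 2 matrix then there exist unitary matrices A, B, and C such that ABC = I and U = AXBXC with X the Pauli matrix defined earlier.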


Let us consider matrices A, B, and C defined by

A = Rz(β)Ry(γ/2)

B = Ry(−γ/2)Rz(−(δ + β)/2)

C = Rz((δ − β)/2)

It is easy to see that ABC = I. Indeed

ABC = Rz(β)Ry(γ/2)Ry(−γ/2)Rz(−(δ + β)/2)Rz((δ − β)/2).

We use the properties of the rotation matrices

Ry(γ/2)Ry(−γ/2) = Ry(γ/2 − γ/2) = Ry(0) = I

Rz(β)Rz(−(δ + β)/2)Rz((δ − β)/2) = Rz(β − (δ + β)/2 + (δ − β)/2) = Rz(0) = I.

To compute AXBXC we recall that the Pauli matrix X has the property XX = I, thus

AXBXC = A(XRy(−γ/2)X)(XRz(−(δ + β)/2)X)C.

But we showed earlier that

XRy(−γ/2)X = Ry(γ/2)

and

XRz(−(δ + β)/2)X = Rz((δ + β)/2).

It follows that for our choices of matrices A, B, and C we have

AXBXC = Rz(β)Ry(γ)Rz(δ).

Sometimes the unitary matrix U from the previous proposition is $U = e^{i\alpha} AXBXC$ with $\alpha \in \mathbb{R}$, where $e^{i\alpha}$ is an overall phase factor, as discussed in Section 4.2.
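The two identities ABC = I and AXBXC = Rz(β)Ry(γ)Rz(δ) can be checked numerically for any choice of angles. The sketch below uses plain Python complex arithmetic; the angle values are arbitrary assumptions and the helper names (`matmul`, `close`) are hypothetical.

```python
import cmath
import math
from functools import reduce

def matmul(*matrices):
    """Product of any number of 2x2 matrices given as lists of rows."""
    def mul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]
    return reduce(mul, matrices)

def Ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def Rz(theta):
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

X = [[0, 1], [1, 0]]
I2 = [[1, 0], [0, 1]]

beta, gamma, delta = 0.3, 1.1, -0.7      # arbitrary test angles (assumption)

A = matmul(Rz(beta), Ry(gamma / 2))
B = matmul(Ry(-gamma / 2), Rz(-(delta + beta) / 2))
C = Rz((delta - beta) / 2)

ABC = matmul(A, B, C)
AXBXC = matmul(A, X, B, X, C)
target = matmul(Rz(beta), Ry(gamma), Rz(delta))

print("ABC = I:", close(ABC, I2))
print("AXBXC = Rz(beta) Ry(gamma) Rz(delta):", close(AXBXC, target))
```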

5.12 Single Qubit Controlled Operations

We address now the flow of control in quantum circuits, enabling us to implement “if ... then ... else ...” constructs. More precisely, we need quantum circuits able to execute a unitary transformation G depending upon the state of one qubit.

The CNOT is one of the gates in this family of quantum circuits that allows us to control the target qubit depending upon the value of the control qubit. Figure 39 shows the diagram of a generic quantum circuit behaving as follows: when the control qubit is $|c\rangle = |0\rangle$ then the target qubit is transferred directly to the output; when the control qubit is set, then the transformation described by G is applied to the target qubit. Throughout this chapter we use the following convention: control qubits are shaded while the target qubits are not.

Figure 39: A controlled-G operation with control input $|c\rangle$ and target input $|t\rangle$; the target output is $G^c |t\rangle$. When the control qubit is $|c\rangle = |0\rangle$ then the target qubit is transferred directly to the output. When the control qubit is set, then the transformation described by G is applied to the target qubit.

Example. The controlled-Z operation. Figure 40(a) shows a quantum circuit for the controlled-Z operation. Recall that the Z gate flips the sign of the projection of a qubit on $|1\rangle$. It maps

\[ |0\rangle \to |0\rangle, \qquad |1\rangle \to -|1\rangle. \]

Figure 40: (a) A quantum circuit for the controlled-Z operation. (b) An equivalent circuit for controlled-Z.

Thus, the controlled-Z gate maps its input to the output as follows

\[ |00\rangle \to |00\rangle, \qquad |01\rangle \to |01\rangle, \qquad |10\rangle \to |10\rangle, \qquad |11\rangle \to -|11\rangle. \]

The transfer matrix of the circuit in Figure 40(a), a controlled-Z, is

\[ G_{\text{controlled-}Z} = |00\rangle\langle 00| + |01\rangle\langle 01| + |10\rangle\langle 10| - |11\rangle\langle 11| = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \]

It is easy to see that the controlled-Z gate in Figure 40(b) maps its input to the output exactly like the one in Figure 40(a), therefore the two circuits are equivalent.

The roles of control and target qubits can be reversed by an appropriate change of basis.

Let us first examine the circuit in Figure 41(b). This circuit performs the following mappings

\[ |00\rangle \to |00\rangle, \qquad |01\rangle \to |11\rangle, \qquad |10\rangle \to |10\rangle, \qquad |11\rangle \to |01\rangle. \]

Figure 41: The roles of control and target qubits can be reversed by an appropriate change of basis. (a) A circuit where the bases of the control and target qubits are changed by a pair of Hadamard gates before and after a CNOT gate. (b) A reversed CNOT gate equivalent to the circuit in (a).

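Its transfer matrix is

\[ G_b = |00\rangle\langle 00| + |01\rangle\langle 11| + |10\rangle\langle 10| + |11\rangle\langle 01| \]

or

\[ G_b = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}. \]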

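To compute the transfer matrix of the circuit in Figure 41(a) we first construct

\[ G_{2H} = H \otimes H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}. \]

Now

\[ G_{12} = G_{\text{CNOT}} G_{2H} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \frac{1}{2} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \end{pmatrix}. \]

Finally

\[ G_a = G_{2H} G_{12} = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 4 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 \\ 0 & 0 & 4 & 0 \\ 0 & 4 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}. \]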

This completes the proof that the circuits in Figures 41(a) and (b) are equivalent.
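The same matrix products can be reproduced in a few lines. The following sketch in plain Python (helper names `kron` and `matmul` are illustrative) builds H ⊗ H, sandwiches the CNOT between two such layers, and compares the result with the reversed CNOT.

```python
import math

def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    """Matrix product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

h = 1 / math.sqrt(2)
H = [[h, h], [h, -h]]
G_2H = kron(H, H)

G_CNOT = [[1, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 1, 0]]

# Reversed CNOT: the second qubit controls the first one.
G_b = [[1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0],
       [0, 1, 0, 0]]

G_a = matmul(G_2H, matmul(G_CNOT, G_2H))

equivalent = all(abs(G_a[i][j] - G_b[i][j]) < 1e-12
                 for i in range(4) for j in range(4))
print("circuits (a) and (b) equivalent:", equivalent)
```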

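Example. The controlled-H operation. Figure 42 shows a quantum circuit for the controlled-H operation.

Figure 42: A quantum circuit for the controlled-H operation.

The controlled-H gate maps its input to the output as follows

\[ |00\rangle \to |00\rangle, \qquad |01\rangle \to |01\rangle, \qquad |10\rangle \to \frac{1}{\sqrt{2}}(|10\rangle + |11\rangle), \qquad |11\rangle \to \frac{1}{\sqrt{2}}(|10\rangle - |11\rangle). \]

The transfer matrix of the circuit in Figure 42 is

\[ G_{\text{controlled-}H} = |00\rangle\langle 00| + |01\rangle\langle 01| + \frac{1}{\sqrt{2}} |10\rangle (\langle 10| + \langle 11|) + \frac{1}{\sqrt{2}} |11\rangle (\langle 10| - \langle 11|), \]

\[ G_{\text{controlled-}H} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix}. \]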

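Example. The controlled-$R_k$ operation. Figure 43 shows a quantum circuit for the controlled-$R_k$ operation. Recall that

\[ R_k = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i / 2^k} \end{pmatrix}. \]

Figure 43: A quantum circuit for the controlled-$R_k$ operation.

The controlled-$R_k$ gate maps its input to the output as follows

\[ |00\rangle \to |00\rangle, \qquad |01\rangle \to |01\rangle, \qquad |10\rangle \to |10\rangle, \qquad |11\rangle \to e^{2\pi i / 2^k} |11\rangle. \]

The transfer matrix of the circuit in Figure 43 is

\[ G_{\text{controlled-}R_k} = |00\rangle\langle 00| + |01\rangle\langle 01| + |10\rangle\langle 10| + e^{2\pi i / 2^k} |11\rangle\langle 11| = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & e^{2\pi i / 2^k} \end{pmatrix}. \]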

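Example. Show that the controlled-P circuit in Figure 44(a) is equivalent to the phase shift circuit in Figure 44(b).

Figure 44: (a) A quantum circuit for the controlled-P operation. (b) An equivalent circuit.

$P_\alpha$ is a phase shift and $T_\alpha$ is a phase transformation

\[ P_\alpha = \begin{pmatrix} e^{i\alpha} & 0 \\ 0 & e^{i\alpha} \end{pmatrix} \qquad T_\alpha = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\alpha} \end{pmatrix}. \]

It is easy to see that the quantum circuit in Figure 44(a), a controlled-P circuit, performs the following transformation of its input qubits

\[ |00\rangle \to |00\rangle, \qquad |01\rangle \to |01\rangle, \qquad |10\rangle \to e^{i\alpha} |10\rangle, \qquad |11\rangle \to e^{i\alpha} |11\rangle. \]

Thus, the transfer matrix of the controlled-P quantum circuit is

\[ G_{\text{controlled-}P} = |00\rangle\langle 00| + |01\rangle\langle 01| + e^{i\alpha} |10\rangle\langle 10| + e^{i\alpha} |11\rangle\langle 11| = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & e^{i\alpha} & 0 \\ 0 & 0 & 0 & e^{i\alpha} \end{pmatrix}. \]

On the other hand the transfer matrix of the circuit in Figure 44(b) is

\[ G_{T_\alpha} = T_\alpha \otimes I = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\alpha} \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & e^{i\alpha} & 0 \\ 0 & 0 & 0 & e^{i\alpha} \end{pmatrix}. \]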

Figure 45: (a) A generic single qubit controlled gate with matrix U = AXBXC. (b) A circuit simulating the generic single qubit controlled gate in (a).

The quantum circuit in Figure 45(b) simulates the general one qubit controlled gate in Figure 45(a).

When the control qubit is $|c\rangle = |0\rangle$ the target qubit becomes

\[ |t\rangle \longrightarrow ABC |t\rangle = I |t\rangle = |t\rangle. \]

When $|c\rangle = |1\rangle$, the target qubit becomes

\[ |t\rangle \longrightarrow AXBXC |t\rangle = U |t\rangle \]

because ABC = I and U = AXBXC according to the last proposition of the previous section.
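All the controlled gates of this section share one pattern: the transfer matrix is the identity on the subspace where the control is $|0\rangle$ and U on the subspace where it is $|1\rangle$. The sketch below, in plain Python, builds that matrix with a hypothetical helper `controlled` and checks two of the examples; the angle α = 0.9 is an arbitrary assumption.

```python
import cmath

def controlled(U):
    """4x4 transfer matrix of a controlled-U gate (control is the first qubit)."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, U[0][0], U[0][1]],
            [0, 0, U[1][0], U[1][1]]]

def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def close(A, B, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

alpha = 0.9                                  # arbitrary phase angle (assumption)
e = cmath.exp(1j * alpha)

Z = [[1, 0], [0, -1]]
P = [[e, 0], [0, e]]                         # phase shift P_alpha
T = [[1, 0], [0, e]]                         # phase transformation T_alpha
I2 = [[1, 0], [0, 1]]

controlled_Z = controlled(Z)
controlled_P = controlled(P)
T_on_control = kron(T, I2)                   # T_alpha acting on the control line

print("controlled-Z diagonal:", [controlled_Z[i][i] for i in range(4)])
print("controlled-P equals T (x) I:", close(controlled_P, T_on_control))
```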

5.13 Multiple Qubit Controlled Operations

Consider a circuit with n control qubits

\[ |c_0\rangle, |c_1\rangle, \ldots, |c_{n-1}\rangle \]

and k target qubits

\[ |t_0\rangle, |t_1\rangle, \ldots, |t_{k-1}\rangle \]

at the input of a circuit performing a unitary transformation U. In Figure 46(a) we present the case when k = 1 and the unitary operator U is given by

\[ U = \begin{pmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{pmatrix}. \]

When all control qubits of the circuit in Figure 46(a) are set to 1 we apply to the target qubit, $|t\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$, the transformation U. When the logical product, AND, of all control qubits is not 1 we transfer the qubits directly to the output of the circuit.

Figure 46: (a) An n-qubit controlled, single target qubit unitary function quantum circuit. (b) An n-qubit controlled, k-target qubit quantum circuit. The unitary transformation U is applied to the k target qubits.

Using Feynman’s notation [50] we define the operator for this quantum circuit acting upon n + 1 qubits with n = 0, 1, 2, . . . as follows

\[ \wedge_n(U)(|c_0 c_1 \ldots c_{n-1}\rangle |t\rangle) = \begin{cases} |c_0 c_1 \ldots c_{n-1}\rangle |t\rangle & \text{if } \wedge_{j=0}^{n-1} c_j = 0 \\ |c_0 c_1 \ldots c_{n-1}\rangle \, U |t\rangle & \text{if } \wedge_{j=0}^{n-1} c_j = 1 \end{cases} \]

where $U |t\rangle = (u_{00}\alpha_0 + u_{01}\alpha_1) |0\rangle + (u_{10}\alpha_0 + u_{11}\alpha_1) |1\rangle$.

The transfer matrix of the circuit is

\[ \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 & 0 \\ 0 & 0 & \cdots & 0 & u_{00} & u_{01} \\ 0 & 0 & \cdots & 0 & u_{10} & u_{11} \end{pmatrix} \]

Figure 47 shows a quantum circuit using Toffoli gates for n = 6 control qubits and one target qubit. The circuit is general and can be used for any number n of control qubits. It has an additional n − 1 work qubits initially in state $|0\rangle$. At the end of the computation the work qubits are returned to state $|0\rangle$.

The circuit in Figure 47 consists of three stages. The first stage produces the logical product, AND, of all six control qubits in one of the work qubits, $w_4$. In the second stage we perform a single qubit controlled-U transformation of the target qubit. The last stage returns the work qubits $w_0, w_1, w_2, w_3, w_4$ to their original state, $|0\rangle$. To see that the first stage operates as described we observe that the first Toffoli gate from the left ANDs $c_0$ and $c_1$ and changes the state of $w_0$ from $|0\rangle$ to $|c_0 c_1\rangle$. The next Toffoli gate changes the state of $w_1$ from $|0\rangle$ to $|c_0 c_1 c_2\rangle$, and so on.

Figure 47: A 6-qubit controlled, 5-work qubit and one target qubit circuit subject to the unitary transformation U, implemented with Toffoli gates. The control qubits are $|c_0\rangle, \ldots, |c_5\rangle$, the work qubits $|w_0\rangle, \ldots, |w_4\rangle$ start and end in state $|0\rangle$, and the target qubit is $|t\rangle$.

Recall that we are only interested in reversible quantum circuits, therefore we should return the work qubits to their original state. We only show that the last work qubit is set to $|0\rangle$. Indeed, the first Toffoli gate of the third stage performs the following operation

\[ w_4 \oplus [(c_0 \text{ AND } c_1 \ldots \text{ AND } c_4) \text{ AND } c_5] = (c_0 \text{ AND } c_1 \ldots \text{ AND } c_5) \oplus (c_0 \text{ AND } c_1 \ldots \text{ AND } c_5) = 0. \]
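The compute, apply, uncompute pattern of the three stages can be followed on classical bits. The sketch below is a classical stand-in in plain Python, not a quantum simulation: it assumes the controlled operation U is a simple NOT of the target bit, and the function name `multi_controlled_not` is hypothetical.

```python
from itertools import product

def multi_controlled_not(controls, target):
    """Flip `target` when all control bits are 1, using n - 1 work bits.

    Classical stand-in for the three-stage circuit; assumes n >= 2 controls
    and takes U to be a NOT gate.
    """
    n = len(controls)
    work = [0] * (n - 1)

    # Stage 1: a chain of Toffoli gates accumulates the AND of the controls.
    work[0] ^= controls[0] & controls[1]
    for i in range(1, n - 1):
        work[i] ^= work[i - 1] & controls[i + 1]

    # Stage 2: the last work bit controls the operation on the target.
    target ^= work[-1]

    # Stage 3: the same Toffoli gates in reverse order restore the work bits.
    for i in range(n - 2, 0, -1):
        work[i] ^= work[i - 1] & controls[i + 1]
    work[0] ^= controls[0] & controls[1]

    return target, work

flipped = [c for c in product((0, 1), repeat=6)
           if multi_controlled_not(c, 0)[0] == 1]
print("control patterns that flip the target:", flipped)
print("work bits afterwards:", multi_controlled_not((1,) * 6, 0)[1])
```

Only the all-ones control pattern flips the target, and the five work bits end in 0 for every input, which is what reversibility of the overall circuit requires.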

A Toffoli gate can be implemented using only H, T, and S one qubit gates together with CNOT gates, see Figure 48. Combined with the circuit presented in Figure 47, which can be generalized for any number n of control qubits, this result shows that we can construct an arbitrary n-qubit controlled circuit using CNOT gates and only Hadamard, H, phase, S, and π/8, T, single qubit gates where

\[ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \qquad S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} \qquad T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix} \qquad T^{\dagger} = \begin{pmatrix} 1 & 0 \\ 0 & e^{-i\pi/4} \end{pmatrix} \]

Figure 48: A quantum circuit using only H, S, and T one qubit gates to simulate a Toffoli gate.

Figure 46(b) presents a quantum circuit with k > 1 target qubits. An alternative notation for multiple qubit controlled operation is $C^n(U_k)$ where n is the number of control qubits and k is the number of target qubits the unitary transformation $U_k$ is applied to when the control condition is satisfied. The transformation is then described as

\[ C^n(U_k) |c_0 c_1 \ldots c_{n-1}\rangle |t_0 t_1 \ldots t_{k-1}\rangle = |c_0 c_1 \ldots c_{n-1}\rangle \, U^{c_0 c_1 \cdots c_{n-1}} |t_0 t_1 \ldots t_{k-1}\rangle \]

with $U^{c_0 c_1 \cdots c_{n-1}} |t_0 t_1 \ldots t_{k-1}\rangle$ meaning that the transformation U is applied to the target qubits only if the logical product of $c_0 c_1 \ldots c_{n-1}$ is one.

5.14 A Quantum Circuit for the Walsh-Hadamard Transform

Recall from Section 5.11 that the Hadamard gate describes a unitary quantum “fair coin flip” performed upon a single qubit

\[ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \]

The Hadamard gate transforms an input qubit in state $|0\rangle$ into a superposition state $(|0\rangle + |1\rangle)/\sqrt{2}$ and $|1\rangle$ into $(|0\rangle - |1\rangle)/\sqrt{2}$. This transformation is its own inverse. Performing H twice on an initial state $|0\rangle$ or $|1\rangle$ gives $|0\rangle$ or $|1\rangle$ respectively, with probability 1.

The Walsh-Hadamard transform is presented in more depth in Appendix II. This linear transformation is defined recursively as

\[ W_1 = H \]

\[ W_{j+1} = H \otimes W_j \]

When applied to n qubits in state $|0\rangle$ individually, the Walsh-Hadamard transform creates a superposition of $2^n$ states

\[ (H \otimes H \otimes H \ldots \otimes H) |000 \ldots 0\rangle = \frac{1}{2^{n/2}} [(|0\rangle + |1\rangle) \otimes (|0\rangle + |1\rangle) \ldots \otimes (|0\rangle + |1\rangle)]. \]

In the general case, the n-qubit Walsh-Hadamard transform implemented by the circuit in Figure 49 is described by a $2^n \times 2^n$ matrix $G_{H_n}$.

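Proposition. The entries of the matrix $G_{H_n}$ are

\[ h_{jk} = \left(\frac{1}{\sqrt{2}}\right)^n (-1)^{j \cdot k} \]

where $j \cdot k$ is a shorthand notation for the scalar product of the bit vectors of j and k, that is, the number of positions in which the binary representations of the integers j and k both have a 1:

\[ j = j_0 2^{n-1} + j_1 2^{n-2} + \cdots + j_{n-2} 2 + j_{n-1} 2^0 = (j_0, j_1, \ldots, j_{n-2}, j_{n-1}) \]

\[ k = k_0 2^{n-1} + k_1 2^{n-2} + \cdots + k_{n-2} 2 + k_{n-1} 2^0 = (k_0, k_1, \ldots, k_{n-2}, k_{n-1}) \]

\[ j \cdot k = j_0 k_0 + j_1 k_1 + \cdots + j_{n-1} k_{n-1}. \]

Figure 49: A quantum circuit for the Walsh-Hadamard transform. A Hadamard gate is applied to each of the input qubits $|j_0\rangle, |j_1\rangle, \ldots, |j_{n-1}\rangle$; the output qubits are labeled $|k_0\rangle, |k_1\rangle, \ldots, |k_{n-1}\rangle$.

Let $|j\rangle$ be a basis state at the input of the circuit in Figure 49 and let $|k\rangle$ denote the basis states at its output

\[ |j\rangle = |j_0\rangle \otimes |j_1\rangle \otimes \cdots \otimes |j_{n-1}\rangle, \qquad |k\rangle = |k_0\rangle \otimes |k_1\rangle \otimes \cdots \otimes |k_{n-1}\rangle. \]

Then

\[ G_{H_n} |j\rangle = (H \otimes H \otimes \cdots \otimes H)(|j_0\rangle \otimes |j_1\rangle \otimes \cdots \otimes |j_{n-1}\rangle) \]

or

\[ G_{H_n} |j\rangle = \left(\frac{1}{\sqrt{2}}\right)^n [|0\rangle + (-1)^{j_0} |1\rangle] \otimes \cdots \otimes [|0\rangle + (-1)^{j_{n-1}} |1\rangle]. \]

Expanding the tensor product we obtain one term for each of the $2^n$ output basis states

\[ G_{H_n} |j\rangle = \left(\frac{1}{\sqrt{2}}\right)^n \sum_{k=0}^{2^n - 1} (-1)^{j_0 k_0} |k_0\rangle \otimes \cdots \otimes (-1)^{j_{n-1} k_{n-1}} |k_{n-1}\rangle. \]

Finally

\[ G_{H_n} |j\rangle = \left(\frac{1}{\sqrt{2}}\right)^n \sum_{k=0}^{2^n - 1} (-1)^{j \cdot k} |k\rangle. \]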

Each row of the $2^n \times 2^n$ matrix describing a single Hadamard gate applied to one of the n qubits corresponds to a different state of the quantum system and has two non-zero entries, taken from either the top row, or from the bottom row of the Hadamard matrix H. Each column has only two non-zero entries taken either from the left column, or from the right column of H.

The result of performing a sequence of n Hadamard transformations is a superposition of all $2^n$ possible binary strings of length n. The final amplitude of each string has magnitude $2^{-n/2}$. Simon points out [108] that “as the transformations are applied in turn, the phase of the resulting state is changed.” The output vector for an input vector | j〉 is the superposition

\[ 2^{-n/2} \sum_{k} (-1)^{j \cdot k} |k\rangle. \]

5.15 Mathematical Models of a Quantum Computer

Several mathematical models for a quantum computer have been proposed, including the quantum Turing machine model [6], and the quantum cellular automata model [38]. In a recent publication Peter Shor provides a succinct description of the quantum circuit model [106].

A quantum computer consists of input and output wires carrying qubits and quantum circuits, which in turn are made out of quantum gates. The number of input and output bits of a classical gate may differ, while a quantum gate maps q input qubits into precisely q output qubits. This is a necessary but not a sufficient condition for reversibility.

The mathematical representation of such a quantum gate is a unitary matrix with $2^q$ rows and $2^q$ columns, $G = [g_{i,j}]$, $1 \le i, j \le 2^q$. If V is an input vector, then the output vector W is given by

W = GV.

The input vector V is the product of q two-dimensional vectors, each one of them representing the state of one qubit. Each qubit can be in a superposition state $|\psi\rangle = \alpha_0 |B_0\rangle + \alpha_1 |B_1\rangle$ where $B_0$ and $B_1$ are orthonormal basis vectors in $H_2$. | 0〉 and | 1〉 are often used as basis vectors of $H_2$.

The input vector of the gate represents a superposition state of a quantum system in a Hilbert space $H_{2^q}$, see Section 4.8. Similarly, the output of the gate is a vector representing the superposition state of a quantum system in the Hilbert space $H_{2^q}$ and can be regarded as the product of q qubits.

Let us now consider an n-qubit quantum computer. The joint space of n qubits is the tensor product of the individual state spaces of the n qubits; $H_{2^n}$ is a tensor product of n two-dimensional Hilbert spaces

\[ H_{2^n} = (H_2)^{\otimes n}. \]

If $b_1 b_2 b_3 \ldots b_n$ is a binary string and $|V_{b_1}\rangle, |V_{b_2}\rangle, |V_{b_3}\rangle, \ldots, |V_{b_n}\rangle$ are vectors in $H_2$ then the tensor product of these vectors is

\[ V_{b_1 b_2 b_3 \ldots b_n} = |V_{b_1}\rangle \otimes |V_{b_2}\rangle \otimes |V_{b_3}\rangle \otimes \cdots \otimes |V_{b_n}\rangle. \]

An equivalent notation is

\[ V_{b_1 b_2 b_3 \ldots b_n} = |V_{b_1} V_{b_2} V_{b_3} \ldots V_{b_n}\rangle. \]

The basis vectors of the $H_{2^n}$ Hilbert space are

\[ B_0 = |000\ldots000\rangle,\; B_1 = |000\ldots001\rangle,\; B_2 = |000\ldots010\rangle,\; B_3 = |000\ldots011\rangle,\; \ldots,\; B_{2^n-2} = |111\ldots110\rangle,\; B_{2^n-1} = |111\ldots111\rangle. \]

The input to a quantum computer is classical information represented as a binary string of say k ≤ n bits. This string is mapped to an input vector with the last n − k bits set to zero

\[ V = b_0 b_1 \ldots b_{k-1} 0 0 \ldots 0. \]

The output of the n-qubit quantum computer is

\[ W = \sum_{i=0}^{2^n-1} \alpha_i B_i. \]

The $\alpha_i$ are complex numbers called probability amplitudes and they must satisfy the condition

\[ \sum_{i=0}^{2^n-1} |\alpha_i|^2 = 1. \]

According to Heisenberg’s uncertainty principle we cannot measure the complete state of a quantum system. If we observe the output of the quantum computer we decide that the result is i (represented by the binary string $b_0 b_1 \ldots b_{n-1}$) with probability $|\alpha_i|^2$. For example, if n = 3, the output value i = 6 corresponds to the binary string 110 with $b_0 = 1$, $b_1 = 1$, $b_2 = 0$. If $\alpha_{110} = 0.3 + 0.4i$ then the probability of observing the value 6 of the output is $|\alpha_{110}|^2 = 0.3^2 + 0.4^2 = 0.25$.

Each gate of a quantum computer acting on q qubits induces a transformation of the state space $H_{2^n}$ of the entire quantum computer. For example, the action of a 2-qubit quantum gate with a 4 × 4 transfer matrix $G = [g_{k,l}]$, $1 \le k, l \le 4$, acting on two qubits, say i and j, of an n-qubit quantum computer is represented by the tensor product of the matrix G with n − 2 identity matrices, each acting on one of the remaining qubits.
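The measurement rule and the embedding of a gate into the full state space can be sketched as follows. The sketch assumes, for simplicity, that the two qubits acted upon are adjacent (positions `position` and `position + 1`, counted from the most significant qubit); the function names are hypothetical and a CNOT matrix is used as the example gate.

```python
def probabilities(amplitudes):
    """Measurement probabilities |alpha_i|^2 for a list of amplitudes."""
    return [abs(a) ** 2 for a in amplitudes]

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def embed_two_qubit_gate(gate, position, n):
    """I (x) ... (x) gate (x) ... (x) I for a 4x4 gate acting on the
    adjacent qubits position and position + 1 of an n-qubit register."""
    identity = [[1, 0], [0, 1]]
    factors = [identity] * position + [gate] + [identity] * (n - position - 2)
    result = [[1]]
    for factor in factors:
        result = kron(result, factor)
    return result

# Probability of observing the outcome whose amplitude is 0.3 + 0.4i.
p = probabilities([0.3 + 0.4j])[0]

# A CNOT on qubits 0 and 1 of a 3-qubit register: an 8x8 matrix.
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]
full = embed_two_qubit_gate(CNOT, 0, 3)
print(p, len(full), full[4][6])
```

In this example the embedded gate maps the basis state | 110〉 (index 6) to | 100〉 (index 4), so the entry in row 4, column 6 equals 1.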

We now discuss the statement “given any classical circuit capable of computing a function f(x) we can build a reversible quantum circuit capable of computing f(x)”.

Recall from Section 5.13 that a reversible circuit needs some work qubits, or ancillary qubits, to contain intermediary values required by the computation. For example, the circuit in Figure 47 needs five such qubits to store partial products of the control qubits. Ultimately, the product of all control qubits, stored in one of the work qubits, controls the gate that conditionally transforms the target qubit. These work qubits are initially set to | 0〉 and the reversible circuit returns them to the same state before completing its function.

The Fredkin gate discussed in Section 5.6 has a control input and two target qubits. One of the target qubits can be thought of as a work qubit and the other as the target qubit proper. The Fredkin gate is universal: it can simulate an AND and a NOT gate. Given a function f(x) we can construct a quantum circuit consisting of Fredkin gates only, capable of transforming two qubits | x〉 and | y〉 into | x y ⊕ f(x)〉, see Figure 50. We stress the fact that the function f(x) is hardwired into the circuit.

In the general case we can have multiple control and work qubits and the transformation carried out by the quantum circuit is

\[ |x\; w\rangle \longrightarrow |f(x)\; g(x)\rangle \]

with f(x) the result and g(x) some undesirable output that should be cleaned up. As before, | x w〉 means the tensor product of | x〉 and | w〉. If we have m control qubits then $|x\rangle = |x_0 x_1 \ldots x_{m-1}\rangle$ with $x_i = 0, 1$. Therefore, x may take all possible values in the range $0, 1, \ldots, 2^m - 1$. If we have k work qubits then $|w\rangle = |w_0 w_1 \ldots w_{k-1}\rangle$ with $w_i = 0, 1$. The work qubits should start in state | 0〉. In this case the transformation is $|x_0 x_1 \ldots x_{m-1}, 00 \ldots 0\rangle \longrightarrow |f(x), g(x)\rangle$, or in a compact form

\[ |x\; 0\rangle \longrightarrow |f(x)\; g(x)\rangle. \]

Figure 50: A two qubit quantum gate array performing the transformation $U_f : |x\; y\rangle \to |x\; y \oplus f(x)\rangle$. When | y〉 = | 0〉 then the transformation is | x 0〉 → | x f(x)〉.

We can gradually augment our circuit to carry out a reversible computation, see Figure 51. First, we create a copy of the input x using a CNOT gate. This copy will not be altered during the computation. Now, the transformation carried out by the circuit is

\[ |x\; 0\; 0\rangle \longrightarrow |x\; f(x)\; g(x)\rangle. \]

Now we need to reverse the status of the work qubits to their original state. We construct a circuit with four input registers: the first contains the input x, the second is originally | 0〉 and at the end of the computation will contain the result, the third contains the work qubits in state | 0〉 and at some point will contain the undesirable intermediate results g(x), and the fourth starts in some state | y〉. This four register circuit is used to compute f and carries out the following transformation

\[ |x\; 0\; 0\; y\rangle \longrightarrow |x\; f(x)\; g(x)\; y\rangle. \]

In the next step we augment the quantum circuit with CNOT gates to add bitwise f(x) obtained in the second register to the fourth register y. As a result we reach the state

\[ |x\; 0\; 0\; y\rangle \longrightarrow |x\; f(x)\; g(x)\; y\rangle \longrightarrow |x\; f(x)\; g(x)\; y \oplus f(x)\rangle. \]

But all the steps to compute f(x) are reversible and we can return the second and third register (the work register containing the garbage g(x)) to zero and finally reach the state

\[ |x\; 0\; 0\; y\rangle \longrightarrow |x\; f(x)\; g(x)\; y\rangle \longrightarrow |x\; f(x)\; g(x)\; y \oplus f(x)\rangle \longrightarrow |x\; 0\; 0\; y \oplus f(x)\rangle. \]

In a compact form the previous result becomes

\[ |x\; y\rangle \longrightarrow |x\; y \oplus f(x)\rangle. \]
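The compute, copy, uncompute sequence can be illustrated on classical basis states. The sketch below assumes a hypothetical function f(x) = x0 AND x1 on a two-bit input, and a garbage value g(x) = x0 XOR x1 accumulated in a single work bit; neither choice comes from the original text.

```python
def cnot(bits, control, target):
    """Flip the target bit when the control bit is 1."""
    bits[target] ^= bits[control]

def toffoli(bits, c1, c2, target):
    """Flip the target bit when both control bits are 1."""
    bits[target] ^= bits[c1] & bits[c2]

# Register layout: input x0, x1; result register; work register; y.
X0, X1, RESULT, WORK, Y = range(5)

# Gates that compute f(x) = x0 AND x1 and leave garbage in WORK.
COMPUTE = [
    (cnot, (X0, WORK)),
    (cnot, (X1, WORK)),           # WORK now holds g(x) = x0 XOR x1
    (toffoli, (X0, X1, RESULT)),  # RESULT now holds f(x)
]

def run(x0, x1, y):
    bits = [x0, x1, 0, 0, y]
    for gate, wires in COMPUTE:             # |x 0 0 y> -> |x f g y>
        gate(bits, *wires)
    cnot(bits, RESULT, Y)                   # y -> y XOR f(x)
    for gate, wires in reversed(COMPUTE):   # each gate is its own inverse
        gate(bits, *wires)
    return bits                             # |x 0 0 y XOR f(x)>

print(run(1, 1, 0))
```

Running the compute gates in reverse order restores the result and work registers to zero, because CNOT and Toffoli gates are their own inverses.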

Figure 51: A reversible quantum gate array. We start with a quantum gate array with two input registers for | x〉 and | y〉. As a first step towards reversibility we add two new input registers, one for the result and one for the ancillary bits. Then we add CNOT gates to produce the bitwise sum (XOR) of | y〉 and | f(x)〉. Finally we add CNOT gates to reverse the computation and set the second and third output registers to zero. We end up with the input | x〉 and with | y ⊕ f(x)〉. When | y〉 = | 0〉 then the second output is | f(x)〉.

Thus, we can construct a reversible quantum circuit computing an arbitrary function f .

5.16 Errors, Uniformity Conditions, and Time Complexity

We follow Shor’s approach [106] regarding the class of functions that can be computed in polynomial time on a quantum computer and the definition of the Bounded-Error Quantum Polynomial (BQP) time complexity class.

Let us briefly review what we already know about the errors in quantum computations. Closed quantum systems are systems isolated from the environment. All state transformations of closed quantum systems are unitary and the superposition states are maintained during the evolution of a closed system. All the qubits of a quantum system evolve simultaneously as described by the Hamiltonian of the system.

Unfortunately, it is unfeasible to maintain a system in total isolation from the environment and the physical reality forces us to consider open systems where a certain degree of interaction of the system with the environment occurs. The evolution of an open system is not unitary. In an open system the superposition states decay very quickly and we witness the decoherence phenomenon. This explains why in our everyday life we rarely observe superposition states and also hints at potential problems in quantum computing.

Errors due to decoherence are inevitable in a quantum computer. If no measures are taken, and an error of order 1/e is introduced when a gate transforms a quantum state, then after O(e) such state transformations, the quantum state becomes so noisy that a wrong answer is almost certain [20]. Thus, error correction becomes a necessity for quantum computers. Fault-tolerant quantum computing is even more intricate: it requires measurements during the evolution of a computation, while the quantum circuit model defers all measurements to the end of a computation.

In our discussion of quantum circuits we assumed a constant, as opposed to a variable, number of bits for the input, a condition we refer to as circuit uniformity. This is a very reasonable assumption inherited from classical circuits and traditional computer architecture; a 32-bit microprocessor accepts as input 32-bit integer and floating point representations, and its internal registers, arithmetic and logic unit (ALU), as well as data busses are 32 bits wide. It is conceptually feasible to allow arbitrary length inputs but in that case one could hide a non-computable function in the design of the quantum circuits for each input length. For non-uniform quantum circuits we allow at most a polynomial amount of extra information hidden into the circuit design.

Definition. The Bounded-error Quantum Polynomial (BQP) functions are all functions whose domain is the set of binary strings and which are computable by uniform quantum circuits whose number of gates is polynomial in the number of input qubits, and which give the correct answer at least 2/3 of the time.

The corresponding family of functions, when we allow non-uniform quantum circuits and a polynomial amount of extra information hidden into the circuit design, is called BQP/Poly.


A quantum computer consists of input and output wires carrying qubits and quantum circuits, which in turn are made out of quantum gates. The number of input and output bits of a classical gate may differ, while a quantum gate maps q input qubits into precisely q output qubits (footnote 32).

Footnote 32: This is a necessary, but not a sufficient condition for reversibility.

The input state V of a quantum computer is the tensor product of n two-dimensional vectors, each one of them representing the state of one qubit. The input to a quantum computer is classical information represented as a binary string of say k ≤ n bits. This string is mapped to an input vector with the last n − k bits set to zero

\[ V = b_0 b_1 \ldots b_{k-1} 0 0 \ldots 0. \]

The output state of an n-qubit quantum computer is

\[ W = \sum_{i=0}^{2^n-1} \alpha_i B_i. \]

The $\alpha_i$ are complex numbers called probability amplitudes and they must satisfy the condition

\[ \sum_{i=0}^{2^n-1} |\alpha_i|^2 = 1. \]

Steane [111] defines a quantum computer as a set of n qubits in which the following operations are feasible

(i) Each qubit can be prepared in some known state | 0〉.
(ii) Each qubit can be measured in the basis {| 0〉, | 1〉}.
(iii) A universal quantum gate (or set of gates) can be applied at will to any fixed-size subset of qubits.

The qubits do not evolve other than via the above transformations.

We first discuss one-qubit gates. The identity gate I leaves a qubit unchanged. The X or NOT gate transposes the components of an input qubit. The Z gate flips the sign of the | 1〉 component of a qubit. The Hadamard gate H, when applied to a basis state, | 0〉 or | 1〉, creates a superposition state, | 0〉 → (| 0〉 + | 1〉)/√2 and | 1〉 → (| 0〉 − | 1〉)/√2.

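Then, we introduce the CNOT gate which has as input a control qubit and a target qubit. The control input is transferred directly to the control output of the gate. The target output is equal to the target input if the control input is | 0〉 and it is flipped if the control input is | 1〉.

If the control qubit | ψ〉 and the target qubit | φ〉 are

\[ |\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle, \qquad |\varphi\rangle = \beta_0 |0\rangle + \beta_1 |1\rangle, \]

then the input state of the quantum CNOT gate is

\[ |V_{CNOT}\rangle = |\psi\rangle \otimes |\varphi\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} \otimes \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} = \begin{pmatrix} \alpha_0 \beta_0 \\ \alpha_0 \beta_1 \\ \alpha_1 \beta_0 \\ \alpha_1 \beta_1 \end{pmatrix}. \]

The transfer matrix of the CNOT gate is

\[ G_{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \]

The output state of the CNOT gate is

\[ |W_{CNOT}\rangle = G_{CNOT} |V_{CNOT}\rangle = \begin{pmatrix} \alpha_0 \beta_0 \\ \alpha_0 \beta_1 \\ \alpha_1 \beta_1 \\ \alpha_1 \beta_0 \end{pmatrix}. \]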
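As a numerical illustration of the CNOT equations, the sketch below applies the transfer matrix to the tensor product of two single-qubit states. The amplitudes 0.6 and 0.8 for the control qubit are arbitrary example values (chosen so that $0.6^2 + 0.8^2 = 1$), and the helper names are hypothetical.

```python
def kron_vec(u, v):
    """Tensor product of two state vectors."""
    return [a * b for a in u for b in v]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

G_CNOT = [[1, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 1, 0]]

psi = [0.6, 0.8]   # control qubit: alpha_0 = 0.6, alpha_1 = 0.8
phi = [1.0, 0.0]   # target qubit:  beta_0 = 1,   beta_1 = 0

v_in = kron_vec(psi, phi)     # (a0*b0, a0*b1, a1*b0, a1*b1)
w_out = apply(G_CNOT, v_in)   # (a0*b0, a0*b1, a1*b1, a1*b0)
print(v_in, w_out)
```

With the target in | 0〉 and the control in a superposition, the output 0.6 | 00〉 + 0.8 | 11〉 can no longer be written as a product of two single-qubit states.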

The Fredkin gate has two regular inputs, a and b, a control input c, and three outputs, a′, b′ and c′. The control input c is transferred directly to the output, c′ = c. The control input determines the regular outputs as follows:

(i) When c = 0 the two regular inputs are transferred without modification to the output, a′ = a and b′ = b, see Figure 33(a). The truth table for this case has four entries, one for each possible combination of values of a and b.

(ii) When c = 1 the two regular inputs are swapped, a′ = b and b′ = a, see Figure 33(b) for the circuit and its truth table.

The Fredkin gate is reversible; it is a conservative logic gate, as it conserves the number of 1’s at its input; and it is universal, as it can simulate an AND gate and a NOT gate.

The Toffoli gate has two control inputs, a and b, and one target input c. The outputs are: a′ = a, b′ = b and c′ = c ⊕ (a AND b), see Figure 34(a). The Toffoli gate is a universal gate and it is reversible.

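The input state of a Toffoli gate is

\[ |V_{Toffoli}\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} \otimes \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} \otimes \begin{pmatrix} \gamma_0 \\ \gamma_1 \end{pmatrix} = \begin{pmatrix} \alpha_0 \beta_0 \\ \alpha_0 \beta_1 \\ \alpha_1 \beta_0 \\ \alpha_1 \beta_1 \end{pmatrix} \otimes \begin{pmatrix} \gamma_0 \\ \gamma_1 \end{pmatrix} = \begin{pmatrix} \alpha_0 \beta_0 \gamma_0 \\ \alpha_0 \beta_0 \gamma_1 \\ \alpha_0 \beta_1 \gamma_0 \\ \alpha_0 \beta_1 \gamma_1 \\ \alpha_1 \beta_0 \gamma_0 \\ \alpha_1 \beta_0 \gamma_1 \\ \alpha_1 \beta_1 \gamma_0 \\ \alpha_1 \beta_1 \gamma_1 \end{pmatrix}. \]

The transfer matrix of a Toffoli gate is:

\[ G_{Toffoli} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}. \]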
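Transfer matrices of reversible gates such as the Toffoli and Fredkin gates are permutation matrices and can be generated from the truth table. The sketch below is illustrative only: the function names are hypothetical, and for the Fredkin gate the control bit is assumed to be the most significant bit of the basis-state index, an ordering the text does not fix.

```python
def permutation_matrix(n_bits, rule):
    """Transfer matrix of a reversible gate given its truth-table rule.
    Column = input basis state, row = output basis state."""
    size = 2 ** n_bits
    matrix = [[0] * size for _ in range(size)]
    for index in range(size):
        bits = [(index >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]
        out_bits = rule(*bits)
        out_index = int("".join(str(b) for b in out_bits), 2)
        matrix[out_index][index] = 1
    return matrix

def toffoli_rule(a, b, c):
    """a' = a, b' = b, c' = c XOR (a AND b)."""
    return (a, b, c ^ (a & b))

def fredkin_rule(c, a, b):
    """Swap a and b when the control c is 1 (control taken as first bit)."""
    return (c, b, a) if c else (c, a, b)

def is_unitary_real(m):
    """Check M^T M = I for a real matrix."""
    size = len(m)
    return all(
        sum(m[r][i] * m[r][j] for r in range(size)) == (1 if i == j else 0)
        for i in range(size) for j in range(size)
    )

G_toffoli = permutation_matrix(3, toffoli_rule)
G_fredkin = permutation_matrix(3, fredkin_rule)
print(is_unitary_real(G_toffoli), is_unitary_real(G_fredkin))
```

The Toffoli matrix produced this way is the identity except for the exchange of the last two basis states, | 110〉 and | 111〉.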

Quantum circuits are built by interconnecting quantum gates. Several limitations are imposed in the realization of quantum circuits. The circuits are acyclic: feedback from one part of the circuit to another is not allowed, and there are no loops. We cannot copy qubits.

We discuss quantum circuits able to execute a unitary transformation G depending upon the state of one or more control qubits.

Reference [5] provides the background material for one and multiple qubit controlled operations. Chapter 4 in [80] covers the subject in depth and it is accessible to a large audience.


(1) Prove that the two expressions in Section 5.1 giving the CarryOut are equivalent by Boolean algebra manipulation rather than truth tables.

(2) Prove de Morgan’s Laws

\[ \overline{a + b} = \bar{a}\, \bar{b} \]

\[ \overline{a b} = \bar{a} + \bar{b} \]

using the truth table method. a and b are two Boolean variables.

(3) Consider the matrix $G = [g_{ij}]$, $1 \le i, j \le 2$, defined in Section 5.2. Derive the four equations relating the real and imaginary parts of $g_{ij}$ implied by the condition that G is unitary.

(4) Verify that the transfer matrices of the X, Z, and H gates are unitary. Verify that the outputs of the three gates are the ones given in Section 5.2.

(5) Prove that

\[ P_{\theta_1} P_{\theta_2} = P_{\theta_1 + \theta_2}. \]

(6) Prove that $H^2 = I$, where I is the unity matrix. Thus applying H twice to an input does not change it.

(7) Prove the following relationship between the CNOT gate and the I and X single qubit gates

\[ G_{CNOT} = |0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes X. \]

(8) Construct $G_{Fredkin}$, the transfer matrix of a Fredkin gate.

(9) Show that $G_{Fredkin}$ is unitary. This implies that the normalization condition holds

\[ |\alpha_{000}|^2 + |\alpha_{001}|^2 + |\alpha_{010}|^2 + |\alpha_{011}|^2 + |\alpha_{100}|^2 + |\alpha_{101}|^2 + |\alpha_{110}|^2 + |\alpha_{111}|^2 = 1. \]

(10) Prove the following relationship between the Fredkin gate, the swap gate, and the I gate

\[ G_{Fredkin} = I \otimes |0\rangle\langle 0| + G_{swap} \otimes |1\rangle\langle 1| \]

with

\[ G_{swap} = |00\rangle\langle 00| + |01\rangle\langle 10| + |10\rangle\langle 01| + |11\rangle\langle 11|. \]

(11) Show that the quantum circuit in Figure 48 simulates a Toffoli gate.

(12) Prove the following relationship between the Toffoli gate, the CNOT gate, and the I gate

\[ G_{Toffoli} = I \otimes |0\rangle\langle 0| + G_{CNOT} \otimes |1\rangle\langle 1|. \]

(13) Write a Java simulator for the CNOT, the Fredkin, and the Toffoli gates. The simulator should have a GUI allowing the user to select the gate and then the inputs to that gate. Once the gate is selected, the GUI should display the gate, allow the user to specify the input to that gate, and then display the output of the gate. The simulator should be written as an applet.

(14) Write a Java simulator for the full one qubit adder constructed with Toffoli and CNOT gates in Figure 37. Display the values of the individual qubits at every stage of the circuit, as discussed in Section 5.10.

6 Quantum Algorithms

In Chapter 7 we discuss models of computations and introduce Turing machines. To establish if a function f(x) is computable or not, we have to find a Turing machine able to carry out the computation prescribed by the function f(x).

Church’s thesis, “all computing devices can be simulated by a Turing machine”, has profound implications for computability theory: it tells us that it is sufficient to restrict ourselves to Turing machines, instead of investigating a potentially infinite set of computing devices. A quantitative version of this thesis is: “any physical computing device can be simulated by a Turing machine in a number of steps polynomial in the resources used by the computing device”. While no one has been able to find counter-examples for this thesis, the search has been limited to systems constructed based upon the laws of classical mechanics.

But the universe is essentially quantum mechanical, therefore there is a possibility that the computing power of quantum mechanical computing devices might be greater [102]. If this is true, then problems such as factoring integers (footnote 33), or finding discrete logarithms, for which no polynomial time classical algorithms are known, could be solved in polynomial time by a quantum device. Therefore, the investigation of quantum computing devices and of quantum algorithms is well motivated.

6.1 Introduction to Quantum Algorithms

In the early 1980’s Paul Benioff, of Argonne National Laboratory, established the connection between quantum mechanics and computations [9, 10, 11]. He showed that quantum mechanics leads to a computational model as powerful as Turing machines. A few years later, Feynman [48, 49] suggested that quantum mechanics might be even more powerful computationally than Turing machines. He showed that simulating quantum mechanical systems on a classical computer is unfeasible. In 1980 Yurii Manin published a paper in Russian arguing along the same line [72].

In 1985 David Deutsch recognized that a quantum computer has capabilities well beyond a classical computer and suggested that such capabilities can be exploited by cleverly crafted algorithms. Deutsch realized that a quantum computer can evaluate a function f(x) for many values of x simultaneously and called this strikingly new feature quantum parallelism [39].

In 1992 Deutsch and Jozsa [40] and Berthiaume and Brassard [21, 22] showed that some problems that can only be solved with high probability using a random number generator can be solved exactly with quantum computers. The next step was to show that there are problems that can be solved in quantum polynomial time but for which only exponential algorithms on a classical computer are known. This milestone was achieved by Peter Shor.

In 1994 Peter Shor [100] found a polynomial time algorithm for factoring n-bit numbers on quantum computers and generated a wave of enthusiasm for quantum computing. Like most factoring algorithms, Shor’s algorithm reduces the factoring problem to the problem of finding the period of a function, but uses quantum parallelism to find a superposition of all values of the function in one step. Then the algorithm calculates the quantum Fourier transform of the function, which sets the amplitudes into multiples of the fundamental frequency, the reciprocal of the period. To factor an integer the Shor algorithm measures the period of the function.

Footnote 33: A paper presenting a polynomial time algorithm which determines if a number is prime or composite was posted on a Web site on August 6, 2002 [3].

In a recent paper [107] Peter Shor classifies the quantum algorithms known to offer a significant speed-up over their classical counterparts into three broad categories:

(i) Algorithms that find the periodicity of a function using Fourier transform methods. Simon’s algorithm [108], Shor’s algorithms for factoring and for computing discrete logarithms [104], and Hallgren’s algorithm to solve Pell’s equation are all members of this class.

(ii) Search algorithms which can perform an exhaustive search of N items in √N time. Grover’s algorithms [57, 58, 59] belong to this class.

(iii) Algorithms for simulating quantum systems, as suggested by Feynman. This is a potentially large class of algorithms, but not many algorithms in this class have been developed so far. Once quantum computers become a reality we should expect the development of a considerable number of programs to simulate quantum systems of interest.

Yet, quantum algorithms have not witnessed the same effervescent developments we have seen in quantum information theory, or quantum complexity. Clearly, the development of quantum algorithms requires a very different way of thinking than the one we are accustomed to for classical computers. To offer a considerable speed-up a quantum algorithm must rely on superposition states and this is a foreign concept for those unaccustomed to quantum mechanical thinking.

In addition to this obvious reason, Peter Shor speculates [107] that the number of problems for which quantum algorithms may offer a substantial speedup over the classical algorithms may be very limited. To see the spectacular speedups we have to concentrate on problems not in the classical computational class P. Many believe that quantum algorithms solving NP-complete problems in polynomial time do not exist, even though no proof of this assertion is available at this time. Shor argues that if we assume that no polynomial time quantum algorithms exist for solving NP-hard problems, then the class of problems we have to search for is neither NP-hard, nor P, and the population of this class is relatively small.

6.2 Quantum Turing Machines

A classical computation carried out by a Probabilistic Turing Machine (PTM) can be represented as a tree where each node corresponds to a state of the PTM. The root of the tree reflects the starting configuration and each level corresponds to a step of the computation. An edge represents a transition from one state to another; a state can be reached from a parent state (situated one level up in the tree) with a certain probability. The same state may appear multiple times at any level of the tree, or at different tree levels. The probability of reaching a state appearing multiple times at level i of the tree is the sum of the probabilities of reaching individual instances of that state. The probability of reaching an instance of a state of the PTM is equal to the product of probabilities assigned to all edges leading from the root to the node corresponding to that state. The sum of the probabilities of all states at any given level of the tree should be one.

A computation is said to be well defined if and only if the probabilities on the edges between a parent node and its children states do not depend upon the position of the node in the tree. Well-defined configurations can also be represented as Markov chains. A Markov chain has a stochastic matrix associated with it whose entries are the transition probabilities among states.

The quantum model of computation proposed by David Deutsch, the Quantum Turing Machine (QTM) model, is similar to the PTM model, but considerably more powerful. The traditional laws of probabilities obeyed by classical systems are replaced by different laws for quantum systems [108].

A computation on a QTM is represented by a computational tree where each edge has an associated amplitude, a complex number with a magnitude at most 1. The amplitude associated with a node (state) is the product of the amplitudes of all edges on the path from the root to that node. The amplitude corresponding to a state appearing multiple times at any level i of the tree is the sum of the amplitudes of all nodes corresponding to that state at that particular level of the tree. The probability of an instance of a state is the squared magnitude of the amplitude of the corresponding node. The sum of the probabilities of all states at any given level of the tree should be one. For example, “the probability of a particular final state is the square of the sum (not the sum of the squares) of all leaf nodes corresponding to that state” [108].

The condition that the squares of the amplitudes associated with all edges emerging from a given node sum to 1 is no longer a sufficient condition. We require that a QTM always execute unitary steps. The evolution of a QTM is unitary and reversible. At each step, the amplitudes of possible states are determined by the amplitude of the current state according to a fixed, local, unitary transformation, which plays a role similar to that of the stochastic matrix of a Markov process.
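The difference between the two rules, summing probabilities in a PTM tree versus summing amplitudes in a QTM tree, can be seen on a small example. The two-level tree below is a hypothetical illustration (it corresponds to applying a Hadamard transformation twice); the edge amplitudes and function names are assumptions made for the sketch.

```python
import math

s = 1 / math.sqrt(2)

# Each path: (amplitude of first edge, amplitude of second edge, final state).
paths = [
    (s,  s, 0),   # root -> A -> state 0
    (s,  s, 1),   # root -> A -> state 1
    (s,  s, 0),   # root -> B -> state 0
    (s, -s, 1),   # root -> B -> state 1
]

def classical_probabilities(paths):
    """PTM rule: multiply edge probabilities along a path, then add paths."""
    result = {0: 0.0, 1: 0.0}
    for a1, a2, state in paths:
        result[state] += abs(a1) ** 2 * abs(a2) ** 2
    return result

def quantum_probabilities(paths):
    """QTM rule: multiply edge amplitudes along a path, add the amplitudes
    of all leaves with the same state, then take the squared magnitude."""
    amplitude = {0: 0.0, 1: 0.0}
    for a1, a2, state in paths:
        amplitude[state] += a1 * a2
    return {state: abs(a) ** 2 for state, a in amplitude.items()}

print(classical_probabilities(paths))
print(quantum_probabilities(paths))
```

In the classical tree both final states are equally likely, while in the quantum tree the two paths leading to state 1 cancel and state 0 is reached with certainty.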

6.3 Quantum Parallelism

We have seen in Section 5.15 that given a function f(x) we can construct a quantum circuit consisting of Fredkin gates only, capable of transforming two qubits | x〉 and | y〉 into | x y ⊕ f(x)〉, as shown in Figure 52(a). We stress the fact that the function f(x) is hardwired into the circuit.

If the second qubit is set to | 0〉, as seen in Figure 52(b), then the transformation carried out by the circuit is:

\[ |x\; 0\rangle \longrightarrow |x\; f(x)\rangle. \]

Now instead of | x〉 consider a | 0〉 applied first to a Hadamard gate, and the output of the Hadamard gate, (| 0〉 + | 1〉)/√2, applied to the previous circuit (Figure 52(c)). Since the circuit is linear, the resulting state of the system is then:

\[ U_f \left( \frac{|0\rangle + |1\rangle}{\sqrt{2}} \otimes |0\rangle \right) = \frac{|0\; f(0)\rangle + |1\; f(1)\rangle}{\sqrt{2}}. \]

The output state contains information about f(0) and about f(1). This remarkable property of quantum circuits is called quantum parallelism.
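The single-qubit case of quantum parallelism can be simulated directly. The sketch below builds the 4 × 4 matrix of $U_f$ for a one-bit function and applies it to the state (| 0〉 + | 1〉)/√2 ⊗ | 0〉; the function f(x) = 1 − x and the helper names are hypothetical choices for the example.

```python
import math

def u_f(f):
    """4x4 matrix of |x y> -> |x, y XOR f(x)> in the basis
    |00>, |01>, |10>, |11> (x is the first qubit)."""
    m = [[0] * 4 for _ in range(4)]
    for x in (0, 1):
        for y in (0, 1):
            m[2 * x + (y ^ f(x))][2 * x + y] = 1
    return m

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

def f(x):
    """Example function, assumed for illustration: f(0) = 1, f(1) = 0."""
    return 1 - x

s = 1 / math.sqrt(2)
state_in = [s, 0.0, s, 0.0]            # (|0> + |1>)/sqrt(2) (x) |0>
state_out = apply(u_f(f), state_in)    # (|0 f(0)> + |1 f(1)>)/sqrt(2)
print(state_out)
```

For this choice of f the non-zero amplitudes sit on | 0 1〉 and | 1 0〉, that is, on | 0 f(0)〉 and | 1 f(1)〉.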

Figure 53 illustrates a quantum gate array characterized by a linear transformation given by $U_f$. The inputs to this gate array are $|x\rangle \in H_m$, an m-dimensional vector acting as a control input, and $|y\rangle \in H_k$, a k-dimensional vector. The outputs are | x〉 and $|y \oplus f(x)\rangle \in H_n$, with n = m + k. When | y〉 = | 0〉 then the second output becomes | y ⊕ f(x)〉 = | f(x)〉.

Figure 52: (a) A reversible quantum circuit transforming | x y〉 → | x (y ⊕ f(x))〉. (b) The circuit in (a) with | y〉 = | 0〉. (c) The first qubit is in a superposition state, (| 0〉 + | 1〉)/√2, and the second qubit is | 0〉. The resulting state of the circuit is (| 0 f(0)〉 + | 1 f(1)〉)/√2.

Figure 53: The transformation $U_f$ : | x, y〉 → | x, y ⊕ f(x)〉 performed by the quantum gate array. The input | x〉 is m-dimensional, the input | y〉 is k-dimensional, and the output is (n = m + k)-dimensional.

Now assume that the input vector | x〉 is in a superposition state and can be expressed as a linear combination of $2^m$ vectors forming an orthonormal basis in $H_m$. The gate array performs a linear transformation. Hence, the transformation is applied simultaneously to all basis vectors used to express the input superposition, and it generates a superposition of results. In other words, the values of the function f(x) for the $2^m$ possible values of its argument x are computed simultaneously. The quantum parallelism justifies our statement in Section 2.3 that quantum computers can provide an exponential amount of computational space in a linear amount of physical space.

Quantum parallelism allows us to construct the entire truth table of a quantum gate array having $2^n$ entries in one step. In a classical system we can compute the truth table in one time step with $2^n$ gate arrays running in parallel, or we need $2^n$ time steps with a single gate array.

Typically we start with n qubits, each in state | 0〉, and we apply a Walsh-Hadamard transformation. Each qubit is transformed by a Hadamard gate; recall that the Hadamard gate transforms a | 0〉 as follows:

\[ H : |0\rangle \to \frac{1}{\sqrt{2}} (|0\rangle + |1\rangle). \]

Thus:

\[ (H \otimes H \otimes \cdots \otimes H)\, |00\ldots0\rangle = \frac{1}{\sqrt{2^n}} \left[ (|0\rangle + |1\rangle) \otimes (|0\rangle + |1\rangle) \otimes \cdots \otimes (|0\rangle + |1\rangle) \right] = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle. \]

The output of the gate array when we add a k-bit register to the superposition state of integers in the range 0 to $2^n - 1$ is:

\[ U_f \left( \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x, 0\rangle \right) = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} U_f (|x, 0\rangle) = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x, f(x)\rangle. \]

When we measure the output of the quantum gate array we can only observe one value. Hence, we need some level of algorithmic sophistication to exploit the quantum parallelism.

6.4 Deutsch’s Algorithm

Figure 54: Classical and quantum parallelism. (a) Sequential solution to Deutsch’s problem using a classical computer; f(0) and f(1) are evaluated one after the other in time 2T. (b) A parallel solution to Deutsch’s problem using a classical computer; two replicas evaluate f(0) and f(1) in time T. (c) The quantum black box with a transfer function $U_f$, with inputs | x〉 and | y〉 and outputs | x〉 and | y ⊕ f(x)〉. It evaluates f(0) and f(1) simultaneously in time T.

Quantum parallelism is best illustrated by the solution to the so-called “Deutsch’s problem”. Consider a black box characterized by a transfer function that maps a single input bit x into an output, f(x). The transformation performed by the black box, f(x), is a general function and might not be invertible. We assume that it takes the same amount of time, T, to carry out each of the four possible mappings performed by the transfer function f(x) of the black box:

\[ f(0) = 0 \qquad f(0) = 1 \qquad f(1) = 0 \qquad f(1) = 1. \]

The problem posed is to distinguish if f(0) = f(1) or f(0) ≠ f(1).

Using a classical computer one alternative is to compute sequentially f(0) and f(1) and then compare the results, see Figure 54(a), with a total time 2T. A classical parallel solution is illustrated in Figure 54(b) where we feed 0 as input to one of the replicas of the black box and 1 to the other and then compare the partial results. In this case we obtain the answer after time T but we need two systems rather than one.

Consider now a quantum computer with a transfer function $U_f$ that takes as input two qubits | x〉 (control) and | y〉 (target) and has two outputs, | x〉 and | y ⊕ f(x)〉. We have the choice of selecting the states of the two qubits | x〉 and | y〉. We choose the first qubit in the state | x〉 = (| 0〉 + | 1〉)/√2 and the second qubit in the state | y〉 = (| 0〉 − | 1〉)/√2.

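Figure 55: A quantum circuit for solving Deutsch’s problem. The inputs | 0〉 and | 1〉 each pass through a Hadamard gate H and are applied as | x〉 and | y〉 to the black box $U_f$, whose outputs are | x〉 and | y ⊕ f(x)〉; a final Hadamard gate acts on the first qubit. The four cuts through the circuit are labeled 0, 1, 2 and 3.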

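Figure 55 illustrates the circuit for solving Deutsch’s problem. In this figure there are four cuts and we examine the four vectors, $|\xi_0\rangle$, $|\xi_1\rangle$, $|\xi_2\rangle$ and $|\xi_3\rangle$, describing the state of the system at these cuts.

The input vector is

\[
|\xi_0\rangle = |01\rangle =
\begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} =
\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}.
\]

The transfer matrix of the first stage is

\[
G_1 = H \otimes H =
\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes
\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} =
\frac{1}{2}
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1
\end{pmatrix}.
\]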

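Now

\[
|\xi_1\rangle = G_1 |\xi_0\rangle =
\frac{1}{2}
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1
\end{pmatrix}
\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} =
\frac{1}{2} \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix}
\]

or

\[
|\xi_1\rangle = \frac{1}{2}\left(|00\rangle - |01\rangle + |10\rangle - |11\rangle\right) =
\left[\frac{|0\rangle + |1\rangle}{\sqrt{2}}\right] \left[\frac{|0\rangle - |1\rangle}{\sqrt{2}}\right].
\]

The two qubits applied to the input of the black box are

\[
|x\rangle = \left[\frac{|0\rangle + |1\rangle}{\sqrt{2}}\right], \qquad
|y\rangle = \left[\frac{|0\rangle - |1\rangle}{\sqrt{2}}\right].
\]

We know that $|0 \oplus f(x)\rangle = |f(x)\rangle$, thus:

\[
|y \oplus f(x)\rangle = \frac{|0 \oplus f(x)\rangle - |1 \oplus f(x)\rangle}{\sqrt{2}} =
\frac{|f(x)\rangle - |1 \oplus f(x)\rangle}{\sqrt{2}}.
\]

But $1 \oplus f(x)$ is equal to 0 when $f(x) = 1$ and it is equal to 1 when $f(x) = 0$, thus:

\[
|y \oplus f(x)\rangle = (-1)^{f(x)} \, \frac{|0\rangle - |1\rangle}{\sqrt{2}}.
\]

Then

\[
|y \oplus f(x)\rangle =
\begin{cases}
\dfrac{|0\rangle - |1\rangle}{\sqrt{2}} & \text{if } f(x) = 0 \\[2ex]
-\dfrac{|0\rangle - |1\rangle}{\sqrt{2}} & \text{if } f(x) = 1
\end{cases}
\]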

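It is easy to see that when $f(0) = f(1)$, the output of the black box, $|x\rangle \otimes |y \oplus f(x)\rangle$, is

\[
|\xi_2\rangle =
\begin{cases}
\left[\dfrac{|0\rangle + |1\rangle}{\sqrt{2}}\right] \left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] =
\dfrac{1}{2} \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix} & \text{if } f(0) = f(1) = 0 \\[4ex]
-\left[\dfrac{|0\rangle + |1\rangle}{\sqrt{2}}\right] \left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] =
-\dfrac{1}{2} \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix} & \text{if } f(0) = f(1) = 1
\end{cases}
\]

We leave as an exercise the proof that if $f(0) \neq f(1)$, then the output of the black box is

\[
|\xi_2\rangle =
\begin{cases}
\left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] \left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] =
\dfrac{1}{2} \begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix} & \text{if } f(0) = 0 \text{ and } f(1) = 1 \\[4ex]
-\left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] \left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] =
-\dfrac{1}{2} \begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix} & \text{if } f(0) = 1 \text{ and } f(1) = 0
\end{cases}
\]

Combining these two results we have

\[
|\xi_2\rangle =
\begin{cases}
\pm\left[\dfrac{|0\rangle + |1\rangle}{\sqrt{2}}\right] \left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] =
\pm\dfrac{1}{2} \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix} & \text{if } f(0) = f(1) \\[4ex]
\pm\left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] \left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] =
\pm\dfrac{1}{2} \begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix} & \text{if } f(0) \neq f(1)
\end{cases}
\]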

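The transfer matrix of the third stage of the quantum circuit in Figure 55 is:

\[
G_3 = H \otimes I =
\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} =
\frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & -1 & 0 \\
0 & 1 & 0 & -1
\end{pmatrix}.
\]

If $f(0) = f(1)$ then

\[
|\xi_3\rangle = \pm \frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & -1 & 0 \\
0 & 1 & 0 & -1
\end{pmatrix}
\frac{1}{2} \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix} =
\pm \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix} =
\pm |0\rangle \left[\frac{|0\rangle - |1\rangle}{\sqrt{2}}\right].
\]

If $f(0) \neq f(1)$ then

\[
|\xi_3\rangle = \pm \frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & -1 & 0 \\
0 & 1 & 0 & -1
\end{pmatrix}
\frac{1}{2} \begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix} =
\pm \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 0 \\ 1 \\ -1 \end{pmatrix} =
\pm |1\rangle \left[\frac{|0\rangle - |1\rangle}{\sqrt{2}}\right].
\]

We observe that

\[
f(0) \oplus f(1) =
\begin{cases}
0 & \text{if } f(0) = f(1) \\
1 & \text{if } f(0) \neq f(1)
\end{cases}
\]

Then finally we rewrite $|\xi_3\rangle$ as

\[
|\xi_3\rangle = \pm |f(0) \oplus f(1)\rangle \left[\frac{|0\rangle - |1\rangle}{\sqrt{2}}\right].
\]

This expression tells us that by measuring the first output qubit of the circuit in Figure 55 we are able to determine $f(0) \oplus f(1)$ by performing a single evaluation of the function.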
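The four cuts of Figure 55 can be followed numerically. The sketch below is only an illustration, not part of the original derivation: it uses plain Python lists, and the helper names `kron`, `apply`, `oracle` and `deutsch` are hypothetical. It builds $G_1 = H \otimes H$, the permutation matrix of $U_f$ and $G_3 = H \otimes I$, and returns the probability that the first output qubit is measured as 1.

```python
import math

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def apply(m, v):
    """Matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def oracle(f):
    """Matrix of U_f: |x, y> -> |x, y XOR f(x)>, with basis index 2*x + y."""
    u = [[0] * 4 for _ in range(4)]
    for x in (0, 1):
        for y in (0, 1):
            u[2 * x + (y ^ f(x))][2 * x + y] = 1
    return u

def deutsch(f):
    """Probability that the first output qubit of the circuit is measured as 1."""
    s = 1 / math.sqrt(2)
    H = [[s, s], [s, -s]]
    I = [[1, 0], [0, 1]]
    xi0 = [0, 1, 0, 0]                  # |0 1>
    xi1 = apply(kron(H, H), xi0)        # after G1 = H (x) H
    xi2 = apply(oracle(f), xi1)         # after the black box U_f
    xi3 = apply(kron(H, I), xi2)        # after G3 = H (x) I
    return xi3[2] ** 2 + xi3[3] ** 2    # weight on |10> and |11>
```

Calling `deutsch` with a constant function such as `lambda x: 0` should give a probability close to 0, and with `lambda x: x` a probability close to 1, in agreement with $|\xi_3\rangle = \pm |f(0) \oplus f(1)\rangle (|0\rangle - |1\rangle)/\sqrt{2}$.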

6.5 Quantum Fourier Transform

Let $v$ and $w$ be two vectors in an $N$-dimensional complex vector space, $v, w \in \mathbb{C}^N$, $v = (v_0, v_1, \ldots, v_{N-1})$ and $w = (w_0, w_1, \ldots, w_{N-1})$. We restrict our discussion to the case when $N = 2^n$. The Discrete Fourier Transform (DFT) maps $v$ into $w$ as follows:

\[
w_k \equiv \frac{1}{\sqrt{N}} \sum_{j=0}^{N-1} v_j \, e^{2\pi i jk/N}.
\]

The Quantum Fourier Transform (QFT) is a linear operator that transforms a state $|v\rangle$ of a quantum system into another state $|w\rangle$:

\[
|v\rangle \longrightarrow |w\rangle
\]

with:

\[
|v\rangle = \sum_{j=0}^{N-1} v_j |j\rangle
\]

and

\[
|w\rangle = \sum_{k=0}^{N-1} w_k |k\rangle.
\]

The QFT transforms an orthonormal basis $|0\rangle, \ldots, |k\rangle, \ldots, |N-1\rangle$ as follows:

\[
|j\rangle \longrightarrow \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} e^{2\pi i jk/N} |k\rangle.
\]

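We now consider the binary representation of the integers $j$ and $k$ and obtain another expression for the QFT:

\[
j = j_0 2^{n-1} + j_1 2^{n-2} + \cdots + j_{n-2} 2 + j_{n-1} 2^0,
\]

\[
k = k_0 2^{n-1} + k_1 2^{n-2} + \cdots + k_{n-2} 2 + k_{n-1} 2^0.
\]

Since $N = 2^n$, we have $k/N = \sum_{l=0}^{n-1} k_l 2^{-(l+1)}$, and the definition of the QFT can be rewritten as:

\[
|j_0, j_1, \ldots, j_{n-1}\rangle \longrightarrow
\frac{1}{2^{n/2}} \sum_{k_0 \in \{0,1\}} \sum_{k_1 \in \{0,1\}} \cdots \sum_{k_{n-1} \in \{0,1\}}
e^{2\pi i j \sum_{l=0}^{n-1} k_l 2^{-(l+1)}} \, |k_0 k_1 \ldots k_{n-1}\rangle
\]

\[
|j_0, j_1, \ldots, j_{n-1}\rangle \longrightarrow
\frac{1}{2^{n/2}} \sum_{k_0 \in \{0,1\}} \sum_{k_1 \in \{0,1\}} \cdots \sum_{k_{n-1} \in \{0,1\}}
\bigotimes_{l=0}^{n-1} e^{2\pi i j k_l 2^{-(l+1)}} \, |k_l\rangle
\]

\[
|j_0, j_1, \ldots, j_{n-1}\rangle \longrightarrow
\frac{1}{2^{n/2}} \bigotimes_{l=0}^{n-1}
\left\{ \sum_{k_l \in \{0,1\}} e^{2\pi i j k_l 2^{-(l+1)}} \, |k_l\rangle \right\}
\]

The bit $k_l$ may only take two values, 0 and 1, thus:

\[
|j_0, j_1, \ldots, j_{n-1}\rangle \longrightarrow
\frac{1}{2^{n/2}} \bigotimes_{l=0}^{n-1}
\left\{ |0\rangle + e^{2\pi i j 2^{-(l+1)}} \, |1\rangle \right\}.
\]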

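We want to write this expression as a product of ket vectors and we use the following notation [80]:

\[
0.j_l j_{l+1} \ldots j_{n-1} = j_l 2^{-1} + j_{l+1} 2^{-2} + \cdots + j_{n-1} 2^{-(n-l)}.
\]

Because $e^{2\pi i m} = 1$ for any integer $m$, only the fractional part of $j 2^{-(l+1)}$ contributes to each phase. Then

\[
|j_0, j_1, \ldots, j_{n-1}\rangle \longrightarrow
\frac{\left(|0\rangle + e^{2\pi i \, 0.j_{n-1}} |1\rangle\right)
\left(|0\rangle + e^{2\pi i \, 0.j_{n-2} j_{n-1}} |1\rangle\right) \cdots
\left(|0\rangle + e^{2\pi i \, 0.j_0 j_1 \ldots j_{n-1}} |1\rangle\right)}{2^{n/2}}.
\]

In the state produced by the circuit discussed next, the bits of the binary number labeling a basis vector appear in reverse order, so we need to reverse the bits.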
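The equivalence between the definition of the QFT and the product of single-qubit factors can be checked numerically. The following is a minimal sketch, assuming the sign convention $e^{+2\pi i jk/N}$ used above; the function names `qft_definition` and `qft_product_form` are hypothetical and chosen only for this illustration.

```python
import cmath
import math

def qft_definition(j, n):
    """Amplitudes of QFT|j> from the definition: e^{2 pi i j k / N} / sqrt(N)."""
    N = 2 ** n
    return [cmath.exp(2 * math.pi * 1j * j * k / N) / math.sqrt(N)
            for k in range(N)]

def qft_product_form(j, n):
    """Amplitudes from the tensor product of (|0> + e^{2 pi i j / 2^(l+1)} |1>)."""
    amps = [1.0 + 0j]
    for l in range(n):                       # l = 0 is the leftmost factor (bit k_0)
        phase = cmath.exp(2 * math.pi * 1j * j / 2 ** (l + 1))
        amps = [a * b for a in amps for b in (1.0, phase)]
    return [a / math.sqrt(2 ** n) for a in amps]

def max_difference(n):
    """Largest deviation between the two forms over all basis states |j>."""
    return max(abs(x - y)
               for j in range(2 ** n)
               for x, y in zip(qft_definition(j, n), qft_product_form(j, n)))
```

For small $n$, `max_difference(n)` should be of the order of floating-point rounding error, which is what the algebraic derivation predicts.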

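Figure 56: A circuit for Quantum Fourier Transform. The inputs are $|j_0\rangle, |j_1\rangle, \ldots, |j_{n-2}\rangle, |j_{n-1}\rangle$ and the outputs are labeled $|k_0\rangle, |k_1\rangle, \ldots, |k_{n-2}\rangle, |k_{n-1}\rangle$; each qubit passes through a Hadamard gate $H$ followed by controlled rotations $R_2, \ldots, R_{n-1}, R_n$.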

Figure 56 shows a circuit for the quantum Fourier Transform based upon the expression for the transformation of each basis vector we derived earlier. Each qubit is first transformed into a superposition state by a Hadamard gate and then it goes through one or more $R_k$ gates. Recall from Section 5.12 that the $R_k$ gate transforms a qubit by multiplying its projection on $|1\rangle$ by $e^{2\pi i/2^k}$:

\[
R_k = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i/2^k} \end{pmatrix}.
\]

An equivalent expression for the QFT can be obtained if we denote by $H_p$ the Hadamard transform applied to qubit $p$ and by $S_{p,q}$ the joint transformation of qubits $p$ and $q$, $p < q$, with:

\[
S_{p,q} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & e^{i\pi/2^{q-p}}
\end{pmatrix}.
\]

Then the QFT is given by:

\[
H_0 (S_{0,1} \ldots S_{0,n-1}) \, H_1 (S_{1,2} \ldots S_{1,n-1}) \ldots
H_{n-3} (S_{n-3,n-2} S_{n-3,n-1}) \, H_{n-2} (S_{n-2,n-1}) \, H_{n-1}
\]

followed by a bit reversal transformation. In Shor’s algorithm and in other applications the QFT is followed by a measurement and, in that case, the bit reversal can be done using classical algorithms.

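Let us now follow the state transformations produced by the circuit in Figure 56. The Hadamard gate applied to the first qubit produces the following change of state:

\[
|j_0, j_1, j_2 \ldots j_{n-1}\rangle \longrightarrow
\frac{1}{2^{1/2}} \left(|0\rangle + e^{2\pi i \, 0.j_0} |1\rangle\right) |j_1, j_2 \ldots j_{n-1}\rangle
\]

with

\[
e^{2\pi i \, 0.j_0} = e^{2\pi i j_0/2} =
\begin{cases}
+1 & j_0 = 0 \\
-1 & j_0 = 1
\end{cases}
\]

Each controlled-$R$ gate applied to the first qubit adds an extra bit to the phase of the projection on $|1\rangle$. As a result we observe the successive changes of state:

\[
\begin{aligned}
\frac{1}{2^{1/2}} \left(|0\rangle + e^{2\pi i \, 0.j_0} |1\rangle\right) |j_1, j_2 \ldots j_{n-1}\rangle
&\longrightarrow \frac{1}{2^{1/2}} \left(|0\rangle + e^{2\pi i \, 0.j_0 j_1} |1\rangle\right) |j_1, j_2 \ldots j_{n-1}\rangle \\
&\longrightarrow \frac{1}{2^{1/2}} \left(|0\rangle + e^{2\pi i \, 0.j_0 j_1 j_2} |1\rangle\right) |j_1, j_2 \ldots j_{n-1}\rangle \\
&\;\;\vdots \\
&\longrightarrow \frac{1}{2^{1/2}} \left(|0\rangle + e^{2\pi i \, 0.j_0 j_1 j_2 \ldots j_{n-1}} |1\rangle\right) |j_1, j_2 \ldots j_{n-1}\rangle
\end{aligned}
\]

The transformations of the second qubit are:

\[
\begin{aligned}
&\frac{1}{2^{2/2}} \left(|0\rangle + e^{2\pi i \, 0.j_0 j_1 \ldots j_{n-1}} |1\rangle\right) \left(|0\rangle + e^{2\pi i \, 0.j_1} |1\rangle\right) |j_2 \ldots j_{n-1}\rangle \\
\longrightarrow\; &\frac{1}{2^{2/2}} \left(|0\rangle + e^{2\pi i \, 0.j_0 j_1 \ldots j_{n-1}} |1\rangle\right) \left(|0\rangle + e^{2\pi i \, 0.j_1 j_2} |1\rangle\right) |j_2 \ldots j_{n-1}\rangle \\
&\;\;\vdots \\
\longrightarrow\; &\frac{1}{2^{2/2}} \left(|0\rangle + e^{2\pi i \, 0.j_0 j_1 \ldots j_{n-1}} |1\rangle\right) \left(|0\rangle + e^{2\pi i \, 0.j_1 j_2 \ldots j_{n-1}} |1\rangle\right) |j_2 \ldots j_{n-1}\rangle
\longrightarrow \cdots
\end{aligned}
\]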

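If $N \neq 2^n$ then the QFT gives only approximate results. From the previous expression and from Figure 56 it is easy to see that the circuit uses $n$ Hadamard gates and $n(n-1)/2$ controlled-$R$ gates, so the total number of gates required by the QFT is:

\[
n_{gates} = n + \frac{n(n-1)}{2} = \frac{n(n+1)}{2}.
\]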

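Let us now give a simple, yet non-trivial example of a QFT calculation. We consider the case $N = 2^3$ and sketch the path to construct the transfer matrix of the circuit. Recall that according to Euler’s formula:

\[
e^{i\theta} = \cos\theta + i \sin\theta
\]

and that, for any integer $k$,

\[
\cos\left((2k+1)\frac{\pi}{2}\right) = 0, \qquad \sin\left((2k+1)\frac{\pi}{2}\right) = (-1)^k.
\]

Thus:

\[
e^{i\frac{\pi}{2}} = i \qquad \text{and} \qquad e^{i\frac{\pi}{4}} = \sqrt{i}.
\]

If we denote $\omega = \sqrt{i} = e^{2\pi i/8}$ we can write

\[
R_2 = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\frac{2\pi}{2^2}} \end{pmatrix} =
\begin{pmatrix} 1 & 0 \\ 0 & \omega^2 \end{pmatrix}, \qquad
R_3 = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\frac{2\pi}{2^3}} \end{pmatrix} =
\begin{pmatrix} 1 & 0 \\ 0 & \omega \end{pmatrix}.
\]

The transfer matrix of a controlled-$R_2$ circuit is based upon the derivation in Section 5.12:

\[
G_{controlledR_2} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & e^{\frac{2\pi i}{2^2}}
\end{pmatrix} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & \omega^2
\end{pmatrix}.
\]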

Figure 57: A circuit for quantum Fourier Transform for $N = 2^3$ consists of six stages (three Hadamard gates $H$, two controlled-$R_2$ gates and one controlled-$R_3$ gate). The transfer matrices of the individual stages are $G_1$ to $G_6$.

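Figure 57 shows the circuit obtained from the one in Figure 56 when $n = 3$. We see that the circuit consists of six stages and we now compute the transfer matrix of each stage. For the first stage we have:

\[
G_1 = H \otimes I \otimes I =
\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} =
\frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & -1 & 0 \\
0 & 1 & 0 & -1
\end{pmatrix} \otimes
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\]

\[
G_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} I_4 & I_4 \\ I_4 & -I_4 \end{pmatrix}, \qquad
I_4 =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\]

For the second stage we have:

\[
G_2 = G_{controlledR_2} \otimes I =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & \omega^2
\end{pmatrix} \otimes
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} =
\begin{pmatrix} I_4 & 0 \\ 0 & N_4^2 \end{pmatrix}
\]

with

\[
N_4^2 =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \omega^2 & 0 \\
0 & 0 & 0 & \omega^2
\end{pmatrix}.
\]

For the third stage we observe that the circuit performs the following mapping:

\[
\begin{array}{llll}
|000\rangle \to |000\rangle & |001\rangle \to |001\rangle & |010\rangle \to |010\rangle & |011\rangle \to |011\rangle \\
|100\rangle \to |100\rangle & |101\rangle \to \omega |101\rangle & |110\rangle \to |110\rangle & |111\rangle \to \omega |111\rangle
\end{array}
\]

Thus, the transfer matrix of the third stage is:

\[
G_3 = \begin{pmatrix} I_4 & 0 \\ 0 & N_4^3 \end{pmatrix}
\]

with

\[
N_4^3 =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \omega & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & \omega
\end{pmatrix}.
\]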

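The transfer matrix of the fourth stage is:

\[
G_4 = I \otimes H \otimes I =
\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} =
\frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 1 & 0 & 0 \\
1 & -1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & 1 & -1
\end{pmatrix} \otimes
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]

Thus,

\[
G_4 = \frac{1}{\sqrt{2}} \begin{pmatrix} N_4^4 & 0 \\ 0 & N_4^4 \end{pmatrix}
\]

with

\[
N_4^4 =
\begin{pmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & -1 & 0 \\
0 & 1 & 0 & -1
\end{pmatrix}.
\]

The transfer matrix of the fifth stage is:

\[
G_5 = I \otimes G_{controlledR_2} =
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & \omega^2
\end{pmatrix} =
\begin{pmatrix} N_4^5 & 0 \\ 0 & N_4^5 \end{pmatrix}
\]

with $N_4^5 = G_{controlledR_2}$.

Finally, for the last stage we have:

\[
G_6 = I \otimes I \otimes H = I_4 \otimes H =
\frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} \otimes
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} =
\frac{1}{\sqrt{2}} \begin{pmatrix} N_4^6 & 0 \\ 0 & N_4^6 \end{pmatrix}
\]

with

\[
N_4^6 =
\begin{pmatrix}
1 & 1 & 0 & 0 \\
1 & -1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & 1 & -1
\end{pmatrix}.
\]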

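In the following example we show that the transfer matrix of the circuit in Figure 57 is:

\[
G_{QFT_3} = \left(\frac{1}{\sqrt{2}}\right)^3
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & \omega^1 & \omega^2 & \omega^3 & \omega^4 & \omega^5 & \omega^6 & \omega^7 \\
1 & \omega^2 & \omega^4 & \omega^6 & 1 & \omega^2 & \omega^4 & \omega^6 \\
1 & \omega^3 & \omega^6 & \omega^1 & \omega^4 & \omega^7 & \omega^2 & \omega^5 \\
1 & \omega^4 & 1 & \omega^4 & 1 & \omega^4 & 1 & \omega^4 \\
1 & \omega^5 & \omega^2 & \omega^7 & \omega^4 & \omega^1 & \omega^6 & \omega^3 \\
1 & \omega^6 & \omega^4 & \omega^2 & 1 & \omega^6 & \omega^4 & \omega^2 \\
1 & \omega^7 & \omega^6 & \omega^5 & \omega^4 & \omega^3 & \omega^2 & \omega^1
\end{pmatrix}.
\]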

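Solution. Matrices $G_1, \ldots, G_6$ are:

\[
G_1 = \frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & -1
\end{pmatrix}
\qquad
G_2 =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \omega^2 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \omega^2
\end{pmatrix}
\]

\[
G_3 =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \omega & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \omega
\end{pmatrix}
\qquad
G_4 = \frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & -1
\end{pmatrix}
\]

\[
G_5 =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \omega^2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \omega^2
\end{pmatrix}
\qquad
G_6 = \frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1
\end{pmatrix}
\]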

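An input $|v\rangle$ to the circuit illustrated in Figure 57 is transformed to $G_1 |v\rangle$ and then to $G_2 G_1 |v\rangle$, etc. After passing through the six stages, it becomes

\[
|w\rangle = G_6 G_5 G_4 G_3 G_2 G_1 |v\rangle = G_{temp} |v\rangle.
\]

It is easy to carry out the steps of the multiplication to get

\[
G_{temp} = \left(\frac{1}{\sqrt{2}}\right)^3
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & \omega^2 & -1 & -\omega^2 & 1 & \omega^2 & -1 & -\omega^2 \\
1 & -\omega^2 & -1 & \omega^2 & 1 & -\omega^2 & -1 & \omega^2 \\
1 & \omega^1 & \omega^2 & \omega^3 & -1 & -\omega^1 & -\omega^2 & -\omega^3 \\
1 & -\omega^1 & \omega^2 & -\omega^3 & -1 & \omega^1 & -\omega^2 & \omega^3 \\
1 & \omega^3 & -\omega^2 & -\omega^5 & -1 & -\omega^3 & \omega^2 & \omega^5 \\
1 & -\omega^3 & -\omega^2 & \omega^5 & -1 & \omega^3 & \omega^2 & -\omega^5
\end{pmatrix}
\]

Because $-1 = e^{2\pi i/2} = \omega^4$ and $\omega^8 = 1$, we also have

\[
G_{temp} = \left(\frac{1}{\sqrt{2}}\right)^3
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & \omega^4 & 1 & \omega^4 & 1 & \omega^4 & 1 & \omega^4 \\
1 & \omega^2 & \omega^4 & \omega^6 & 1 & \omega^2 & \omega^4 & \omega^6 \\
1 & \omega^6 & \omega^4 & \omega^2 & 1 & \omega^6 & \omega^4 & \omega^2 \\
1 & \omega^1 & \omega^2 & \omega^3 & \omega^4 & \omega^5 & \omega^6 & \omega^7 \\
1 & \omega^5 & \omega^2 & \omega^7 & \omega^4 & \omega^1 & \omega^6 & \omega^3 \\
1 & \omega^3 & \omega^6 & \omega^1 & \omega^4 & \omega^7 & \omega^2 & \omega^5 \\
1 & \omega^7 & \omega^6 & \omega^5 & \omega^4 & \omega^3 & \omega^2 & \omega^1
\end{pmatrix}
\]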

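We observe that the permutation matrix

\[
P =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]

transforms $G_{temp}$ to the matrix $G_{QFT_3}$:

\[
G_{QFT_3} = P G_{temp} = \left(\frac{1}{\sqrt{2}}\right)^3
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & \omega^1 & \omega^2 & \omega^3 & \omega^4 & \omega^5 & \omega^6 & \omega^7 \\
1 & \omega^2 & \omega^4 & \omega^6 & 1 & \omega^2 & \omega^4 & \omega^6 \\
1 & \omega^3 & \omega^6 & \omega^1 & \omega^4 & \omega^7 & \omega^2 & \omega^5 \\
1 & \omega^4 & 1 & \omega^4 & 1 & \omega^4 & 1 & \omega^4 \\
1 & \omega^5 & \omega^2 & \omega^7 & \omega^4 & \omega^1 & \omega^6 & \omega^3 \\
1 & \omega^6 & \omega^4 & \omega^2 & 1 & \omega^6 & \omega^4 & \omega^2 \\
1 & \omega^7 & \omega^6 & \omega^5 & \omega^4 & \omega^3 & \omega^2 & \omega^1
\end{pmatrix}
\]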
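The multiplication of the six stage matrices is tedious by hand but easy to check by machine. The sketch below is an illustration only, written with plain Python lists and hypothetical helper names (`kron`, `matmul`, `diag`, `bit_reverse`). It builds $G_1, \ldots, G_6$ as defined above, forms $G_{temp} = G_6 G_5 G_4 G_3 G_2 G_1$, applies the bit-reversal permutation $P$, and compares the result with the matrix whose entry in row $r$ and column $c$ is $\omega^{rc}/\sqrt{8}$.

```python
import cmath
import math

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    """Product of two square matrices."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def diag(entries):
    """Diagonal matrix with the given diagonal entries."""
    return [[e if i == j else 0 for j in range(len(entries))]
            for i, e in enumerate(entries)]

def bit_reverse(m, n=3):
    """Integer obtained by reversing the n-bit binary representation of m."""
    return int(format(m, f"0{n}b")[::-1], 2)

omega = cmath.exp(2j * math.pi / 8)
s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]
I2 = [[1, 0], [0, 1]]
cR2 = diag([1, 1, 1, omega ** 2])            # controlled-R2 on two qubits

stages = [
    kron(kron(H, I2), I2),                       # G1 = H (x) I (x) I
    kron(cR2, I2),                               # G2
    diag([1, 1, 1, 1, 1, omega, 1, omega]),      # G3 (controlled-R3)
    kron(kron(I2, H), I2),                       # G4 = I (x) H (x) I
    kron(I2, cR2),                               # G5
    kron(kron(I2, I2), H),                       # G6 = I (x) I (x) H
]

G_temp = stages[0]
for G in stages[1:]:
    G_temp = matmul(G, G_temp)                   # later stages multiply on the left

P = [[1 if col == bit_reverse(row) else 0 for col in range(8)] for row in range(8)]
G_qft = matmul(P, G_temp)

expected = [[omega ** (r * c) / math.sqrt(8) for c in range(8)] for r in range(8)]
max_err = max(abs(G_qft[r][c] - expected[r][c]) for r in range(8) for c in range(8))
```

If the stage matrices are entered correctly, `max_err` should be of the order of floating-point rounding error.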

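Note. One can verify the expression for $G_{QFT_3}$ directly from the definition of the QFT.

We have $n = 3$ and the three input qubits are

\[
|a\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle, \qquad
|b\rangle = \beta_0 |0\rangle + \beta_1 |1\rangle, \qquad
|c\rangle = \gamma_0 |0\rangle + \gamma_1 |1\rangle,
\]

thus the initial state of the system is

\[
|a\rangle \otimes |b\rangle \otimes |c\rangle =
\begin{pmatrix}
\alpha_0 \beta_0 \gamma_0 \\
\alpha_0 \beta_0 \gamma_1 \\
\alpha_0 \beta_1 \gamma_0 \\
\alpha_0 \beta_1 \gamma_1 \\
\alpha_1 \beta_0 \gamma_0 \\
\alpha_1 \beta_0 \gamma_1 \\
\alpha_1 \beta_1 \gamma_0 \\
\alpha_1 \beta_1 \gamma_1
\end{pmatrix}
\]

Then $\omega = e^{2\pi i/8}$ and, labeling the three bits of $j$ as $j_1, j_2, j_3$,

\[
0.j_3 = \frac{j_3}{2} = \frac{4 j_3}{8}, \qquad
0.j_2 j_3 = \frac{j_2}{2} + \frac{j_3}{4} = \frac{4 j_2 + 2 j_3}{8}, \qquad
0.j_1 j_2 j_3 = \frac{4 j_1 + 2 j_2 + j_3}{8}
\]

and the right-hand side of the general equation for the QFT becomes

\[
\begin{pmatrix} \alpha_0 \\ \omega^{4 j_3} \alpha_1 \end{pmatrix} \otimes
\begin{pmatrix} \beta_0 \\ \omega^{4 j_2 + 2 j_3} \beta_1 \end{pmatrix} \otimes
\begin{pmatrix} \gamma_0 \\ \omega^{4 j_1 + 2 j_2 + j_3} \gamma_1 \end{pmatrix}
\]

which is equal to

\[
\begin{pmatrix}
\alpha_0 \beta_0 \\
\omega^{4 j_2 + 2 j_3} \alpha_0 \beta_1 \\
\omega^{4 j_3} \alpha_1 \beta_0 \\
\omega^{4 j_3} \omega^{4 j_2 + 2 j_3} \alpha_1 \beta_1
\end{pmatrix} \otimes
\begin{pmatrix} \gamma_0 \\ \omega^{4 j_1 + 2 j_2 + j_3} \gamma_1 \end{pmatrix} =
\begin{pmatrix}
\alpha_0 \beta_0 \gamma_0 \\
\omega^{4 j_1 + 2 j_2 + j_3} \alpha_0 \beta_0 \gamma_1 \\
\omega^{4 j_2 + 2 j_3} \alpha_0 \beta_1 \gamma_0 \\
\omega^{4 j_2 + 2 j_3} \omega^{4 j_1 + 2 j_2 + j_3} \alpha_0 \beta_1 \gamma_1 \\
\omega^{4 j_3} \alpha_1 \beta_0 \gamma_0 \\
\omega^{4 j_3} \omega^{4 j_1 + 2 j_2 + j_3} \alpha_1 \beta_0 \gamma_1 \\
\omega^{4 j_2 + 6 j_3} \alpha_1 \beta_1 \gamma_0 \\
\omega^{4 j_2 + 6 j_3} \omega^{4 j_1 + 2 j_2 + j_3} \alpha_1 \beta_1 \gamma_1
\end{pmatrix}
\]

This gives the transformation performed by the QFT (the normalization factor $(1/\sqrt{2})^3$ is omitted):

\[
\begin{pmatrix}
\alpha_0 \beta_0 \gamma_0 \\
\alpha_0 \beta_0 \gamma_1 \\
\alpha_0 \beta_1 \gamma_0 \\
\alpha_0 \beta_1 \gamma_1 \\
\alpha_1 \beta_0 \gamma_0 \\
\alpha_1 \beta_0 \gamma_1 \\
\alpha_1 \beta_1 \gamma_0 \\
\alpha_1 \beta_1 \gamma_1
\end{pmatrix}
\longrightarrow
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & \omega^1 & \omega^2 & \omega^3 & \omega^4 & \omega^5 & \omega^6 & \omega^7 \\
1 & \omega^2 & \omega^4 & \omega^6 & 1 & \omega^2 & \omega^4 & \omega^6 \\
1 & \omega^3 & \omega^6 & \omega^1 & \omega^4 & \omega^7 & \omega^2 & \omega^5 \\
1 & \omega^4 & 1 & \omega^4 & 1 & \omega^4 & 1 & \omega^4 \\
1 & \omega^5 & \omega^2 & \omega^7 & \omega^4 & \omega^1 & \omega^6 & \omega^3 \\
1 & \omega^6 & \omega^4 & \omega^2 & 1 & \omega^6 & \omega^4 & \omega^2 \\
1 & \omega^7 & \omega^6 & \omega^5 & \omega^4 & \omega^3 & \omega^2 & \omega^1
\end{pmatrix}
\begin{pmatrix}
\alpha_0 \beta_0 \gamma_0 \\
\alpha_0 \beta_0 \gamma_1 \\
\alpha_0 \beta_1 \gamma_0 \\
\alpha_0 \beta_1 \gamma_1 \\
\alpha_1 \beta_0 \gamma_0 \\
\alpha_1 \beta_0 \gamma_1 \\
\alpha_1 \beta_1 \gamma_0 \\
\alpha_1 \beta_1 \gamma_1
\end{pmatrix}
\]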

6.6 Simon’s Algorithm for Phase Estimation

Consider a function $f$ given as a “black box”. We can observe the results of the computation but we cannot examine the code of the function. The function $f$ maps $f : F_2^n \to F_2^m$ and need not be efficiently computable. We know that $f$ is periodic and wish to determine its period $s$:

\[
f(x) = f(y) \longleftrightarrow x \equiv y \pmod{s}.
\]

This is a so-called “oracle” problem. We provide a pair $x, s$; the oracle returns the results of the evaluation of the function, $f(x)$ and $f(x + s)$; we compare them and, if they are not equal, we provide a different value of $s$, and the process continues until we find the right value of $s$.

The best known algorithm for determining the period of a function takes exponential time on a classical computer. $x$ and $s$ may take any of the $2^n$ possible values in $F_2^n$. At least half of the time one must try more than half of the possible values of $s$, thus we need $O(2^{n/2})$ evaluations of the function $f$.

In the general case we need to perform the phase estimation procedure many times to solve problems such as integer factorization.

Dan Simon created in 1994 an algorithm for phase estimation that requires quadratic time on a quantum computer. Simon formulates the phase estimation problem as follows [108]: “We are given a function $f : \{0,1\}^n \to \{0,1\}^m$ with $m > n$ and we know that either (a) $f$ is a one-to-one function, or (b) there exists a non-trivial $s$ such that

\[
\forall x \neq x' \quad f(x) = f(x') \iff x' = x \oplus s.
\]

We wish to determine which one of the two statements, (a) or (b), is true. If (b) is true we wish to find the value $s$.”

The following algorithm provides a solution with time complexity $O(n T_f(n) + G(n))$, with $T_f(n)$ the time to compute the function $f$ for an input of size $n$, and $G(n)$ the time to solve a linear system of $n \times n$ equations over $\mathbb{Z}_2$.

Given a function $f(x)$ we wish to establish whether $f(x)$ is a periodic function and, if so, to determine its period. The following steps are taken:

Step 1: Apply the Quantum Fourier Transform (QFT) to a register of $n$ qubits in state $|0\rangle$. The result is

\[
2^{-\frac{n}{2}} \sum_x |x\rangle.
\]

Step 2: Compute $f(x)$ and concatenate the result with $|x\rangle$ to produce

\[
2^{-\frac{n}{2}} \sum_x |x, f(x)\rangle.
\]

Step 3: Apply the QFT on $x$ to produce

\[
2^{-n} \sum_y \sum_x (-1)^{x \cdot y} |y, f(x)\rangle.
\]

Proof: Assume first that (a) is true. Then every time we carry out Steps 1 to 3 we obtain a superposition of distinct states $|y, f(x)\rangle$, each with an amplitude of magnitude $2^{-n}$ and thus each occurring with probability $2^{-2n}$. Assume that we repeat Steps 1 to 3 $k$ times. Then, the $k$ values obtained in Step 3 have the same amplitude.

Assume now that (b) is true. For each pair $y, x$ the states $|y, f(x)\rangle$ and $|y, f(x \oplus s)\rangle$ are identical and the amplitude of this state is

\[
\alpha(x, y) = 2^{-n} \left( (-1)^{x \cdot y} + (-1)^{(x \oplus s) \cdot y} \right).
\]

We observe that if $y \cdot s \equiv 0 \pmod 2$ then

\[
(x \oplus s) \cdot y \pmod 2 = (x \cdot y) \oplus (s \cdot y) \pmod 2 = (x \cdot y) \oplus 0 \pmod 2 = (x \cdot y) \pmod 2.
\]

When $(x \oplus s) \cdot y \pmod 2 = (x \cdot y) \pmod 2$ then

\[
\alpha(x, y) = 2^{-n} \left( (-1)^{x \cdot y} + (-1)^{x \cdot y} \right) = 2^{-n+1} (-1)^{x \cdot y}.
\]

When $(x \oplus s) \cdot y \pmod 2 \neq (x \cdot y) \pmod 2$ then the two terms in the sum below have opposite signs and their sum is 0:

\[
\alpha(x, y) = 2^{-n} \left( (-1)^{x \cdot y} + (-1)^{(x \oplus s) \cdot y} \right) = 0.
\]

Assume that we repeat the algorithm $O(n)$ times. Then we obtain many independent values of $y$ whose dot products with an unknown string, say $s^*$, are even. We can determine $s^*$ by solving the resulting linear system of equations.

Therefore the time complexity of the algorithm reflects the sum of the time to repeat Steps 1 to 3 $n$ times, $O(n T_f(n))$, and the time to solve a linear system of $n \times n$ equations over $\mathbb{Z}_2$, $O(G(n))$.
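The classical post-processing step, solving for $s^*$ from values of $y$ with $y \cdot s^* \equiv 0 \pmod 2$, can be sketched as follows. This is an illustration under stated assumptions, not a simulation of the quantum circuit: the hypothetical function `sample_y` stands in for one run of Steps 1 to 3 by drawing a uniformly random $y$ orthogonal to a chosen $s$ (the proof above shows that only such $y$ have non-zero amplitude), bit strings are represented as Python integers, and the linear system is solved by Gaussian elimination modulo 2.

```python
import random

def dot2(a, b):
    """Dot product modulo 2 of two bit strings stored as integers."""
    return bin(a & b).count("1") % 2

def sample_y(s, n, rng):
    """Stand-in for Steps 1-3: a uniformly random y with y.s = 0 (mod 2)."""
    while True:
        y = rng.randrange(2 ** n)
        if dot2(y, s) == 0:
            return y

def solve_for_s(ys, n):
    """Non-zero s* with y.s* = 0 for all y, or None if it is not yet unique."""
    basis = [0] * n                      # basis[b]: row whose highest set bit is b
    for y in ys:
        for b in reversed(range(n)):
            if not (y >> b) & 1:
                continue
            if basis[b]:
                y ^= basis[b]            # eliminate bit b and keep reducing
            else:
                basis[b] = y             # new independent equation
                break
    free = [b for b in range(n) if basis[b] == 0]
    if len(free) != 1:
        return None                      # rank is not n - 1: more samples needed
    s = 1 << free[0]                     # set the single free variable to 1
    for b in range(n):                   # back substitution, low bits first
        if basis[b] and dot2(basis[b] & ~(1 << b), s):
            s |= 1 << b
    return s

def find_period(s, n, seed=0):
    """Repeat the sampling until s* is determined; return (s*, number of runs)."""
    rng = random.Random(seed)
    ys = []
    while True:
        ys.append(sample_y(s, n, rng))
        found = solve_for_s(ys, n)
        if found is not None:
            return found, len(ys)
```

For a non-zero $s$, `find_period(s, n)` returns $s$ once $n - 1$ linearly independent values of $y$ have been collected, which typically happens after a number of runs proportional to $n$.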

An equivalent formulation of the phase estimation problem is: “Given a unitary operator $U_f$ with an eigenvector $|u\rangle$ and an eigenvalue $e^{2\pi i \varphi}$, with $\varphi$ unknown, estimate $\varphi$” [80].

The operator $U_f$ implementing the transformation described by the function $f$ is related to the eigenvector $|u\rangle$ and the eigenvalue $e^{2\pi i \varphi}$ by

\[
U_f |u\rangle = e^{2\pi i \varphi} |u\rangle.
\]

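Figure 58: A quantum circuit for phase estimation. Each of the $n$ qubits of the first register starts in state $|0\rangle$, passes through a Hadamard gate $H$ and is coupled to one of the boxes $U_f^{2^0}, U_f^{2^1}, \ldots, U_f^{2^{n-2}}, U_f^{2^{n-1}}$; the outputs are $|0\rangle + \exp(2 i \pi (2^{n-1} \varphi)) |1\rangle$, $|0\rangle + \exp(2 i \pi (2^{n-2} \varphi)) |1\rangle$, $\ldots$, $|0\rangle + \exp(2 i \pi (2^1 \varphi)) |1\rangle$, $|0\rangle + \exp(2 i \pi (2^0 \varphi)) |1\rangle$. The second register holding $|u\rangle$ acts as control. The normalization factors ($1/\sqrt{2}$) for the $n$ target qubits are omitted.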

Figure 58 describes the quantum circuit used for phase estimation. The circuit has two registers; the first one holds $n$ qubits, initially in state $|0\rangle$. The choice of $n$ depends upon the desired accuracy for the estimation of the phase $\varphi$. The second register holds $|u\rangle$.

Recall from Section 5.12 that the roles of the control and target qubits can be reversed by a change of basis. Thus, the second register in the circuit in Figure 58 acts as control and the first as target. Each of the boxes representing the unitary transformations $U_f^{2^q}$ contributes a phase shift of the second component of

\[
e^{2\pi i (2^q \varphi)}.
\]

Thus, the final state of the first register is

\[
\frac{1}{2^{n/2}}
\left(|0\rangle + e^{2\pi i (2^{n-1} \varphi)} |1\rangle\right)
\left(|0\rangle + e^{2\pi i (2^{n-2} \varphi)} |1\rangle\right) \cdots
\left(|0\rangle + e^{2\pi i (2^1 \varphi)} |1\rangle\right)
\left(|0\rangle + e^{2\pi i (2^0 \varphi)} |1\rangle\right).
\]

An additional stage applies an inverse QFT to the results in the first register. Finally we read out the state of the first register by doing a measurement in the new computational basis.
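The effect of the inverse QFT on the first register can be illustrated numerically. The sketch below makes two assumptions that are not spelled out in the text: the product state above is expanded as $2^{-n/2} \sum_k e^{2\pi i \varphi k} |k\rangle$, and the inverse QFT uses the conjugate phases $e^{-2\pi i km/N}$ of the QFT defined in Section 6.5. The function names are hypothetical.

```python
import cmath
import math

def measurement_probabilities(phi, n):
    """Probability of reading each m in the first register after the inverse QFT."""
    N = 2 ** n
    # State of the first register before the inverse QFT.
    state = [cmath.exp(2 * math.pi * 1j * phi * k) / math.sqrt(N) for k in range(N)]
    probs = []
    for m in range(N):
        amp = sum(state[k] * cmath.exp(-2 * math.pi * 1j * k * m / N)
                  for k in range(N)) / math.sqrt(N)
        probs.append(abs(amp) ** 2)
    return probs

def estimate_phase(phi, n):
    """Most likely estimate m / 2^n of the phase phi with n qubits."""
    probs = measurement_probabilities(phi, n)
    m = max(range(len(probs)), key=probs.__getitem__)
    return m / 2 ** n
```

When $\varphi$ has an exact $n$-bit binary expansion, for example `estimate_phase(0.375, 3)`, a single outcome carries essentially all the probability; otherwise the most likely outcome is the $n$-bit fraction closest to $\varphi$, which is why the choice of $n$ controls the accuracy.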

6.7 Order Finding

6.8 Quantum Algorithms for Integer Factoring

6.9 The Hidden Subgroup Problem

6.10 Quantum Search Algorithms

6.11 Quantum Simulation



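(1) Show that if $f(0) \neq f(1)$, then the output of the black box in Deutsch’s problem is

\[
|\xi_2\rangle =
\begin{cases}
\left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] \left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] =
\dfrac{1}{2} \begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix} & \text{if } f(0) = 0 \text{ and } f(1) = 1 \\[4ex]
-\left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] \left[\dfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right] =
-\dfrac{1}{2} \begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix} & \text{if } f(0) = 1 \text{ and } f(1) = 0
\end{cases}
\]

(2) In 1995 Griffiths and Niu proposed a network of only one-qubit gates for performing quantum Fourier transforms. Analyze and discuss the merits of the solution proposed by the two researchers.

(3) Read the paper “A Pseudo-Simulation of Shor’s Quantum Factoring Algorithm” by J. F. Schneiderman, M. E. Stanley, and P. K. Aravind and construct a Java program able to simulate Shor’s algorithm.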

7 Reversible Computations

7.1 Turing Machines, Reversibility, and Entropy

We are now concerned with the relationships between the wonderful abstractions used by theoretical computer science and the physical systems surrounding us. We want to understand the physical support of information and determine the energy required to store and to transform information.

A general purpose computer is a system capable of storing and of transforming information. Information is a primitive concept and we defer a formal definition of this concept to the second volume of this lecture series. For the time being we consider familiar objects such as strings of characters, images, and numbers, and refer to them collectively as information.

Models of computations and computing machines have been developed that abstract the properties of the thought process necessary to solve a problem, as well as the properties of a physical system able to carry out these steps automatically. Turing machines, effective procedures, finite-state machines, virtual memory, and virtual machines are familiar examples of such high level models. These abstractions generalize knowledge extracted from particular cases.

A Turing machine is an abstraction, a system with a finite number of internal states and a read-write head that moves over a tape consisting of cells. Each cell contains a symbol. The Turing machine starts in a certain state, looks at the symbol currently under the head and, depending upon the internal state of the machine and the current symbol, it might erase that symbol and replace it with another symbol, or leave it as it is, and then move either to the left or to the right and change its internal state. Let $Q_i$ and $S_i$ denote the state and the symbol read at time $t$; then $Q_j$, $S_j$, and $D$, the new state, the symbol written, and the direction of movement, are given by

\[
Q_j = F(Q_i, S_i)
\]

\[
S_j = G(Q_i, S_i)
\]

\[
D = D(Q_i, S_i).
\]

The evolution of a computation carried out by a Turing machine is given by the set of quintuples $(Q_i, S_i, Q_j, S_j, D)$ and it is completely specified by the original tape and this set.
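The description by quintuples translates almost directly into a short simulator. The sketch below is an illustration only; the function name `run_turing_machine`, the blank symbol `_`, the explicit halting state and the step limit are assumptions made for this example, and the sample machine, which inverts every bit of a binary string, is hypothetical.

```python
def run_turing_machine(quintuples, tape, state, halt_state, blank="_", max_steps=10_000):
    """Run a Turing machine given as quintuples (Qi, Si, Qj, Sj, D), D in {"L", "R"}."""
    # The functions F, G and D of the text, stored as one lookup table.
    rules = {(qi, si): (qj, sj, d) for qi, si, qj, sj, d in quintuples}
    cells = dict(enumerate(tape))        # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == halt_state:
            break
        symbol = cells.get(head, blank)
        state, cells[head], move = rules[(state, symbol)]
        head += 1 if move == "R" else -1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)

# A hypothetical machine that inverts each bit and halts at the first blank cell.
INVERT = [
    ("scan", "0", "scan", "1", "R"),
    ("scan", "1", "scan", "0", "R"),
    ("scan", "_", "halt", "_", "L"),
]
```

For instance, `run_turing_machine(INVERT, "10110", "scan", "halt")` returns the string `"01001"`. The step limit is only a practical safeguard, since a machine need not halt on every input.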

There exists a Universal Turing machine capable of mimicking any other Turing machine. Assume that the set of quintuples describing the computation carried out by the original Turing machine is available. We feed the Universal Turing machine the description of the original Turing machine (the set of quintuples) as well as the input tape of the original Turing machine and the indication where to start and where to end.

Church’s thesis is that all computing devices can be simulated by a Turing machine. Though not a mathematical theorem, this thesis greatly simplifies the study of computations; it reduces the potentially infinite set of computing devices to a relatively simple model, the Universal Turing machine.

Let us turn our attention to the functions that can be computed by a Turing machine. We expect that the process of finding a solution to some problems cannot be automated. Formulas to calculate the integrals of many functions exist, see for example [56], but no general rules to obtain the analytic expression of the integral of an arbitrary function are known. Proving theorems and solving Euclidean geometry problems is an art; no general rules to prove theorems and to construct the solution of a Euclidean geometry problem exist.

On the other hand, whenever a function has a derivative, the analytic expression of its derivative can be obtained by applying a finite set of rules to the original expression. Analytical geometry reduces Euclidean geometry problems to problems in a branch of algebra.

There are “effective procedures” to solve some problems, while “effective procedures” may not exist for other classes of problems. Having an “effective procedure” to do some computation amounts to finding a Turing machine able to carry out the same computation. Let $F(x)$ be a function of $x$. $F(x)$ is Turing computable if there is a Turing machine $T_F$ which, fed a tape containing a description of $x$, will eventually halt with a description of $F(x)$ on the tape.

Understanding how a computer works is a non-trivial task, while a description of a Turing machine needs only a few paragraphs and yet allows us to reason about fundamental issues such as the “halting problem”$^{34}$.

The distinction between computable and non-computable functions is not sufficient. Complexity theory is concerned with the efficiency with which a specific function $F(x)$ is computed, that is, with the time and space complexity of algorithms. Yet, when addressing the problem of the resources necessary for computations, besides time and space, we also have to examine the energy consumed by the physical devices performing a computation.

Entangled in abstractions, we have moved further and further from the physical realities and now is the time to come back; information must have a physical substrate, so a tantalizing question is the relationship between information and energy. The connection between energy dissipation and computations was examined by Leo Szilard in 1929 [114] and later by von Neumann in a lecture given in 1949 [121].

The models discussed above view a computation as a unidirectional process taking a system from an initial state to a final state, transforming some input data into results. As such, a computation is an irreversible process inexorably condemned to consume energy, in strong contrast with physical processes, which are reversible.

A physical process is said to be reversible if it can evolve forwards as well as backwards in time. A reversible system can always be forced back to its original state from a new state reached during an evolutionary process.

Reversibility is a fundamental property of nature. This may not be obvious at the macroscopic level: a crystal glass dropped on the floor breaks and the pieces cannot be glued back together, rare red wine spilled on a silk tablecloth cannot be recovered, a missile fired from a launcher cannot be brought back. Yet, at the atomic level, all processes are reversible. This means that all equations of classical and quantum mechanics are symmetric in time. If we replace time, $t$, with $-t$, the equations are not altered.

Macroscopic systems could behave reversibly as well. Of course, if you drop a piano from the 25th floor of an apartment building to the street below, it will break into pieces. But if you have a pulley and lower it slowly, it will reach street level in one piece and its potential energy will be gradually transferred to the counterweight. The trick is to make slow, subtle changes in the environment rather than sudden, dramatic changes. The system must be in equilibrium with its environment at all times.

Sadi Nicolas Léonard Carnot published in June 1824 a book, “Réflexions sur la puissance motrice du feu et sur les machines propres à développer cette puissance”$^{35}$, showing that a heat engine can behave reversibly. Carnot’s engine consists of an idealized gas in a cylinder with a piston. The system can be heated and cooled by placing it in contact with hot and cold reservoirs, respectively. In contact with the hot reservoir the gas within the cylinder expands, pushes the piston and does some useful work. In contact with the cold reservoir, the gas contracts, restoring the piston to its most compressed state. The motion of the piston must be very slow to ensure that the gas in the cylinder is always in equilibrium with its surroundings.

$^{34}$ For some input $x$ a Turing machine may not halt. It is not possible to construct a computable function which predicts whether or not the Turing machine $T_F$ with $x$ as input will ever halt.

$^{35}$ Reflections upon the motive power of fire and upon the machines capable of developing this power.

Carnot came to the conclusion that if an engine is reversible, it makes no difference how the engine is actually designed. The amount of work obtained if the engine absorbs a given amount of heat at temperature $T_1$ and delivers heat at temperature $T_2$ is a property of the world and not of that particular engine [47].

In 1865, in conjunction with the study of heat engines, the German physicist Rudolf Clausius abandoned the idea that heat was conserved and stated formally the First Law of Thermodynamics. He reconciled the results of Joule with the theories of Sadi Carnot.

Clausius defined entropy as a measure of energy unavailable for doing useful work. He discovered the fact that entropy can never decrease in a physical process and can only remain constant in a reversible process. This result became known as the Second Law of Thermodynamics. Amazingly enough, our Universe started in a state of perfect order and its entropy is steadily increasing, leading to a “heat death”. Of course, this sad perspective could be billions or trillions of years away.

Statistical arguments show that systems tend to become more disordered. Indeed, our immediate experience and intuition show that disordered states vastly outnumber the highly ordered states of a system. After any change of state a system is more likely to settle into a disordered state than into an ordered one. In the days of punched computer cards, a deck of cards falling on the floor almost never remained in order.

Any student of history recognizes that revolutions and wars contribute greatly to an increase of social disorder and destruction of civilization, thus to an increase of “social entropy”. The Roman Empire imposed order over a significant portion of the Western world and contributed greatly to Western civilization. After the fall of the Roman Empire, in 476 AD, Europe was thrown into a period of bloody fights, and little progress in arts and science is noticeable for almost a millennium.

The higher the entropy of a system, the less information we have about the system. Hence, information is a form of negative entropy. Claude Shannon recognized the relationship between the two and, on von Neumann’s advice, he called the negative logarithm of the probability of an event entropy. It is rumored that von Neumann told Shannon: “It is already in use under that name... and besides it will give you a great edge in debates because nobody really knows what entropy is anyway” [28].

7.2 Thermodynamic Entropy

We now turn back from abstractions to properties of matter. The question we pose is whether there is a price to pay for the lack of physical awareness of our computational models and what the limitations of such models are. We ask ourselves what the relationship of information with energy and matter is.

Fortunately, some properties of materials do not require a detailed knowledge of the structure of matter and they were studied during the XIX century, well before the atomic structure of matter was fully understood. The subject of the branch of physics called thermodynamics is the statistical behavior of ensembles of molecules. For example, we wish to characterize a gas by familiar macroscopic properties such as volume, pressure, and temperature without knowing its microscopic properties, e.g., the vectors describing the velocities of individual molecules of the gas. A comprehensive discussion of the subject is well beyond our intentions; the interested reader will certainly enjoy the presentation in physics texts, our favorite being Feynman’s “Lectures on Physics” [47]. Here we only present several concepts useful to illustrate the relationship between information and energy dissipation.

The laws of thermodynamics are relationships between macroscopic quantities such as the temperature, $T$, the pressure, $p$, the heat, $Q$, the energy of the system, $U$, the free energy, $F$, and the work done on the system, $W$, all of them defined as statistical averages. The First Law of Thermodynamics is a conservation law; it states that the total change in the energy of a system is the sum of the heat put into the system and the work done on the system:

\[
\Delta U = \Delta Q + \Delta W.
\]

Throughout this section $\Delta$ signifies a finite change while $\delta$ is an infinitesimal change of a variable.

The thermodynamic entropy of a gas, $S$, is also defined statistically but it does not reflect a macroscopic property. The entropy quantifies the notion that a gas is a statistical ensemble and it measures the randomness, or the degree of disorder, of the ensemble. The entropy is larger when the vectors describing the individual movements of the molecules of gas are in a higher state of disorder than when all of them are well organized and moving in the same direction with the same speed:

\[
S = k_B \ln(W)
\]

where $k_B$ is Boltzmann’s constant, $k_B = 1.381 \times 10^{-23}$ Joules per degree Kelvin, and $W$ is the probability of a given state. A version of this equation is engraved on Ludwig Boltzmann’s tombstone$^{36}$.

The Second Law of Thermodynamics tells us that the entropy of an isolated system never decreases. Indeed, differentiating the previous equation we get

\[
\delta S = k_B \frac{\delta W}{W} \geq 0.
\]

It is relatively easy to see that when we compress a volume containing $N$ molecules of gas from $V_1$ to $V_2$, maintaining the temperature of the system constant (isothermal compression), the work done on the system is

\[
W = \int_{V_1}^{V_2} \frac{N k_B T}{V} \, dV = N k_B T \ln \frac{V_2}{V_1}.
\]

Indeed, the pressure, the variation of volume $\delta V$, and the work are related by $\delta W = p \, \delta V$. But for an ideal gas at temperature $T$, pressure $p$, and with volume $V$ we know that $pV = N k_B T$.

It follows immediately that

\[
\Delta S = N k_B \ln \frac{V_2}{V_1}.
\]

$^{36}$ Boltzmann took his life in 1906 not knowing the impact of his findings upon the world of physics.

Let us now consider a gas consisting of a single molecule. When N = 1 and we reduce thevolume of the gas in half, V2 = V1/2 the previous expression becomes

∆S = −kBln(2).

The reduction of an ensemble to a single molecule requires a leap of faith; yet it allows usto relate information and entropy.

By reducing the volume where the molecule can be located we have increased our infor-mation about the system and we have decreased the entropy. Now the molecule can hide ina volume twice as small as before.

Was the Second Law of Thermodynamics violated by this experiment? No, because we donot have an isolated system, we have in fact increased the amount of free energy by kBT ln(2)and decreased the entropy by kBln(2).

Now we can use the gas cylinder with a molecule of gas to store information. We reset our “bit” by compressing the volume and then let the volume expand; the molecule will be in one or the other half of the cylinder depending upon its energy, and the bit will be either 0 or 1. But we need to expend energy to compress the gas, and this means that erasing information is the moment when we expend energy.

Two terms frequently used in thermodynamics are adiabatic, meaning “without transfer of heat”, and isothermal, meaning “at a constant temperature”. For example, if we open the valve of a gas canister, the gas rushes out and expands adiabatically, without having the time to equalize its temperature with the environment. The rushing gas feels cool.

7.3 Maxwell Demon

According to the Second Law of Thermodynamics the entropy of a system is a non-decreasing function of time. Now we describe an experiment that seems to violate this law.

James Clerk Maxwell is best known for the equations of the electromagnetic field and for the kinetic theory of gases. By treating gases statistically, in 1866 he formulated, independently of Ludwig Boltzmann, the Maxwell-Boltzmann kinetic theory of gases. This theory showed that temperature and heat involve only molecular movement.

In 1871 Maxwell proposed a “thought” experiment with puzzling results. Imagine the molecules of a gas in a cylinder separated in two by a slit covered with a door controlled by a little demon. The demon examines every molecule of gas and determines its velocity; those with high velocity on the left side are allowed to migrate to the right side and those with low velocity on the right side are allowed to migrate to the left side. As a result of these measurements, the demon separates hot from cold in blatant violation of the Second Law of Thermodynamics. According to the Second Law of Thermodynamics, the entropy of a system can only increase. The entropy is a measure of the degree of disorder of a system, and the Maxwell Demon creates order by separating hot molecules (those with high velocity) from cold molecules (those with low velocity).

For more than a century physicists tried to spot the flaw in Maxwell’s argument without great success. In 1929 Leo Szilard had the intuition to relate the demon to binary information. He imagined a simplified version of Maxwell’s thought experiment. Consider a horizontal cylinder with a piston and a single molecule of gas inside the cylinder. If the demon waits until the molecule is on the right side, the piston is pushed halfway without any energy consumption, Figure 60(a). As the molecule moves to the left, the piston is pushed and lifts the weight, Figure 60(b).


Figure 59: The Maxwell Demon separates fast-moving molecules of gas from slow-moving ones; the fast-moving ones end up on the right hand side of the box and the slow-moving ones on the left hand side. The demon separates hot from cold in blatant violation of the Second Law of Thermodynamics... or so it seems.

Figure 60: Szilard’s gedanken experiment considers a single molecule of gas. The demon performs a measurement of the position of the molecule in the cylinder. (a) If the molecule is on the right the piston is pushed halfway without any energy consumption. (b) As the molecule moves to the left, the piston is pushed and lifts the weight.

The great insight brought by Leo Szilard is that the demon has to expend energy in order to reduce the entropy of the system by performing the measurements. Many regard Leo Szilard as the father of information theory; he identified the measurement, the information, and the memory as critical aspects of the thorny problem posed by Maxwell.

In 1950 Leon Brillouin advanced the idea that the demon needed a torchlight to see where the molecule is located and that the energy of the photons emitted by the torchlight should exceed the energy of the background photons.

7.4 Energy Consumption. Landauer Principle

Is there an analogy between a heat engine and a computing engine? Is it sensible to think that only irreversible processes in computations require energy consumption? These were the main questions addressed by Rudolf Landauer, a physicist working at IBM Research.

The precise physical phenomenon leading to energy consumption in a computing device was identified in 1961 by Landauer [70]. Landauer discovered that erasing information is the process that requires energy dissipation. At first sight this seems a strange statement, so we need a simple model to justify it. We already know that if a process is irreversible it requires some energy consumption. Thus, we only need to prove that erasing information is an irreversible process. This is much easier to grasp: we all know that once we have erased the blackboard in our office the information on it is lost, and that we need to back up our files because once a disk drive fails the information stored on it cannot be retrieved.

Let us consider a slightly more formal justification and consider a storage system for binary information. The double well in Figure 61 allows us to store a bit of information; if the ball is on the left, then we have a “0” and if it is on the right we have stored a “1”. Erasing information means that wherever the ball happens to be in the current state, we should end in a state with the ball on the left side of the well. In other words, we have a mapping from two different initial states, the ball in the left well and the ball in the right well, to a single final state, the ball in the left well. Clearly, such a process is irreversible; hence erasing information is always associated with energy dissipation.
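The many-to-one character of erasure can be made concrete with a minimal sketch. The dictionaries below are a hypothetical encoding of the double well (keys are initial states, values are final states), and `is_reversible` is an illustrative helper that tests whether a map on a finite set of states is one-to-one.

```python
def is_reversible(mapping):
    """A map on a finite set of states is logically reversible iff it is one-to-one."""
    outputs = list(mapping.values())
    return len(outputs) == len(set(outputs))


# Erasure: both initial positions of the ball end up in the left well ("0").
erase = {"0": "0", "1": "0"}

# For contrast, flipping the bit (NOT) sends distinct states to distinct states.
flip = {"0": "1", "1": "0"}

print(is_reversible(erase))  # False
print(is_reversible(flip))   # True
```

Given only the final state “0” of the erase map, there is no way to tell which initial state produced it; this is the irreversibility the argument relies on.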

Figure 61: Landauer’s double well information storage model. The left well is labeled “0” and the right well is labeled “1”.

There are two equivalent formulations of the so-called Landauer principle, one in terms of energy consumption and the other in terms of entropy:

Landauer principle: Suppose a computer erases a bit of information. Then the amount of energy dissipated into the environment is at least kBT ln(2), with kB the Boltzmann constant and T the temperature of the environment.

Landauer principle: Suppose a computer erases a bit of information. Then the entropy of the environment increases by at least kB ln(2), with kB the Boltzmann constant.

Landauer’s principle traces the energy consumption in a computation to the need to erase information. This seems a bit counterintuitive; we would expect that writing information also requires energy dissipation, and that some symmetry between these two operations exists.

Landauer’s principle provides only a lower bound on the energy consumption. It turns out that the logic circuits in microprocessors of vintage year 2000 need roughly 500 kBT ln(2) in energy for each logic operation [80]. Since 1970 the computing power of microprocessors has doubled every 18 to 24 months, following very closely the prediction of Intel’s Gordon Moore. To limit the energy consumption of increasingly more powerful microprocessors, the energy consumed for every logical operation had to decrease at a rate comparable to, or exceeding, the one given by Moore’s law.
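To get a feeling for the magnitudes, the following sketch evaluates the Landauer bound and the factor of 500 quoted above. The temperature of 300 K is an assumed value for “room temperature”, and the function name is illustrative.

```python
import math

K_B = 1.381e-23  # Boltzmann constant in J/K


def landauer_bound(temperature_kelvin):
    """Minimum energy kB * T * ln(2) dissipated when one bit is erased."""
    return K_B * temperature_kelvin * math.log(2)


ROOM_TEMPERATURE = 300.0  # kelvin, an assumed value

bound = landauer_bound(ROOM_TEMPERATURE)
year_2000_logic_op = 500 * bound  # the estimate quoted in the text

print(f"Landauer bound at 300 K: {bound:.2e} J per bit")   # roughly 2.9e-21 J
print(f"500 x bound:             {year_2000_logic_op:.2e} J per operation")
```

The bound scales linearly with the temperature, so cooling a device lowers the minimum cost of erasing a bit.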

A consequence of Landauer’s principle is that no strictly positive lower bound on energy consumption exists for a reversible computer. If we could build a reversible computer that does not erase any information, then we could compute, in principle, without any energy loss. This is in itself less shocking than it seems at first glance; all laws of physics are reversible, and if we know the final state of a closed physical system then we can determine its initial state.

After understanding Landauer’s principle the flaw in Maxwell’s demon gedanken experiment is obvious: the demon has to perform some measurement to allow a molecule of gas approaching the slit to cross over to the other side. The results of the measurements must be stored in the demon’s memory. But his memory is finite, so the demon must start erasing information after a while. According to Landauer’s principle the entropy of the entire system, including the gas cylinder and the demon, increases as a result of erasing information. This increase should be large enough to compensate for the decrease in entropy caused by the separation of molecules with high velocity from the ones with low velocity and to vindicate the Second Law of Thermodynamics.

Leo Szilard noticed in 1929 the role of the measurement process in this gedanken experiment. Bennett was the first to point out that the erasure of information, and not the measurement, is the source of the entropy created in the process of separating high velocity molecules from the low velocity ones in this experiment. His model for the demon’s behavior allows the demon to make the measurements with zero energy expenditure: the demon is initially in a state of uncertainty; let us call this state U. After measuring the velocity of a molecule, the demon enters a state H for an approaching high velocity molecule, or a state L for a low velocity one, and overwrites U with either H or L. This can be done without any energy expenditure; the energy is dissipated in the next step, when the demon has to erase the H or L and set the value to U to prepare for the next measurement.

The fact that erasing information requires consumption of energy is in strict agreement with the Second Law of Thermodynamics. Erasing information reduces our knowledge about the state of a system; thus, it leads to an increase of the entropy, and we need to consume energy to compensate for the increase of the entropy.

7.5 Low Power Computing. Adiabatic Switching

Von Neumann [121] was the first to reflect on the absolute minimum amount of energy required for an elementary operation of an abstract computing device capable of making binary decisions and of transmitting information. He advanced the idea that this energy is of the order of kBT. At room temperature kBT ≈ 4 × 10^−21 Joules. Von Neumann reasoned that if a capacitor is used to store a bit of information one would need an amount of energy large enough to guarantee that the level of the signal is above the noise level.

As early as 1978 Fredkin and Toffoli discussed a scheme to implement reversible logic circuits using switches, capacitors, and inductors. Under normal circumstances a capacitor with capacitance C at voltage V, if charged or discharged instantly, dissipates an amount of energy equal to CV^2/2 as heat. In their scheme, a capacitor used to store a bit of information could be discharged without losing the energy. The electricity is transferred to an inductor and from there to another capacitor. Unfortunately, the scheme is not practical; inductors cannot be accommodated on a silicon substrate.

Adiabatic switching is a term coined for a switching device that does not produce heat. Charles Seitz from Caltech invented a scheme called hot-clocking, in which energy is saved by varying the power supply voltages. Ralph Merkle from Xerox PARC, William Athas from USC, and Storrs Hall from Rutgers pioneered a reversible adiabatic switching scheme.


7.6 Bennett Information Driven Engine

Bennett imagined an information driven engine; instead of using electricity or gas, Bennett’s engine consumes a tape with information stored on it and converts this information into energy, see Figure 62. This sounds very exciting, but do not jump to conclusions yet; it may be a while until you will be able to create a tape recording your kids in the evening and feed the tape next morning to your Honda Civic instead of filling it up with gas, hydrogen, or electricity.

[Figure labels: Input Tape with Information on it; Randomized Output Tape; Heat Bath.]

Figure 62: Bennett information-driven engine. The input tape contains information, the output tape is randomized. The entropy of the system increases and the energy produced is used to move a piston.

The engine vaguely resembles a Turing machine. The input tape contains cells; each cell is similar to the cylinders with one molecule of gas in it. The machine spits out a “randomized tape” where the molecule can be anywhere in the cylinder. This system is just the opposite of the one presented earlier, where we used energy to force the molecule of gas into one half of the cylinder and reduced the entropy. In this engine the entropy, which is a measure of randomness, increases and this means that some energy is produced.

Let us now describe the setup carefully. The engine itself is submerged into a heat bath. A heat bath is a system able to keep the temperature T of the engine constant. When a cell enters the engine its contents are spilled into a cylinder with a piston halfway into the cylinder. The molecule heats up to the temperature of the heat bath. When the system has reached thermal equilibrium the molecule pushes the piston isothermally, and with some imagination we are able to extract the energy of this movement. If we have a tape with n bits of information on it, then the work produced by the engine is nkBT ln(2).
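The value nkBT ln(2) can be recovered from the ideal gas law used earlier in this chapter. The derivation below is a sketch under the assumption that each cell initially confines its single molecule to a known half of the cylinder, of volume V/2, and that the expansion to the full volume V is isothermal; the dummy variable V' is introduced here for illustration.

```latex
% Work extracted from one cell: isothermal expansion of a single molecule (N = 1)
W_{\text{cell}} = \int_{V/2}^{V} \frac{k_B T}{V'}\, dV'
                = k_B T \ln\frac{V}{V/2}
                = k_B T \ln(2).

% A tape with n independent cells yields n times this amount
W_{\text{tape}} = n\, W_{\text{cell}} = n\, k_B T \ln(2).
```

This is the reverse of the compression considered earlier: there the entropy of the molecule decreased by kB ln(2), here it increases by the same amount for every cell consumed.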

7.7 Logically Reversible Turing Machines and Physical Reversibility

An ordinary Turing machine has a control unit and a read-write head; it performs a sequence of read-write-shift operations on an infinite tape divided into squares. The dynamics of the Turing machine is described by quintuples (A, T, A′, T′, σ) of the form

$$A\,T \longrightarrow T'\,\sigma\,A'.$$

The significance of this notation is that when the control unit is in state A and the symbol currently scanned by the read-write head is T, then the machine will first write T′ in place of T and then shift left one square, right one square, or remain on the same square depending upon the value of σ (σ = −, +, 0 respectively), and the new state of the control unit will be A′. An n-tape Turing machine is one where T, T′, and σ of each quintuple are themselves n-tuples.
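The read-write-shift cycle described by a quintuple can be sketched in a few lines. The rule table below is a hypothetical machine that flips every bit of its input and halts on the first blank; the blank symbol "b", the state names "A" and "HALT", and the helper names are all assumptions made for illustration.

```python
from collections import defaultdict

# Head displacement for each value of sigma, following the convention in the text.
SHIFT = {"-": -1, "0": 0, "+": +1}


def step(rules, state, tape, head):
    """Apply one quintuple A T -> T' sigma A': write T', shift by sigma, enter A'."""
    new_symbol, sigma, new_state = rules[(state, tape[head])]
    tape[head] = new_symbol
    return new_state, head + SHIFT[sigma]


def run(rules, state, tape, head=0, halt_state="HALT", max_steps=1000):
    """Iterate read-write-shift cycles until the halt state or the step limit."""
    for _ in range(max_steps):
        if state == halt_state:
            break
        state, head = step(rules, state, tape, head)
    return state, head


# Hypothetical machine: flip bits while moving right, halt on a blank square.
rules = {
    ("A", "0"): ("1", "+", "A"),
    ("A", "1"): ("0", "+", "A"),
    ("A", "b"): ("b", "0", "HALT"),
}

tape = defaultdict(lambda: "b", enumerate("0110"))  # unbounded tape of blanks
final_state, final_head = run(rules, "A", tape)
result = "".join(tape[i] for i in range(4))
print(final_state, final_head, result)  # HALT 4 1001
```

Keying the rule table on the pair (state, scanned symbol) means that at most one quintuple applies at any time, which is what makes the machine deterministic.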

[Figure panels: (a) Universal Reversible Computer (U) with Input Tape, History Tape (blank), Results Tape, and History Tape (garbage). (b) Stage One: Universal Computer (U); Stage Two: Copy Unit; Stage Three: Universal Computer (U^-1); tapes labeled Input Tape, History Tape (blank), Results Tape, History Tape, Copy of Results, and Modified Input Tape.]

Figure 63: (a) A Universal Reversible Computer. (b) A Zero Entropy Loss Reversible Computer.

A Turing machine performs a mapping of its entire current state into a successor state given by its transition function. The entire state of a Turing machine is given by the state of its control unit, the tape contents, and the position of the read-write head. When a Turing machine traverses a set of states we have a set of mappings associated with this evolution.

A Turing machine is deterministic if and only if the quintuples defining the mappings have non-overlapping domains. This is guaranteed by requiring that the portion of the quintuple to the left of the arrow, the current state and the symbol scanned by the read-write head, be different for different quintuples.

A Turing machine is reversible if and only if the mappings have non-overlapping ranges. An ordinary Turing machine is not reversible. Indeed, the write and shift operations in a cycle do not commute; the inverse of a read-write-shift cycle is shift-read-write.


The problem of constructing a reversible computing automaton is non-trivial. A tempting solution is to add to the Turing machine a history tape, initially blank, and save on this tape the details of every operation performed. Then we would be able to retrace the steps of the direct computation; starting from the last record on the tape we would determine the previous state and keep going back until we reached the first record on the tape.

The history tape must be left blank, as it was when we started the process. We know by now that erasing information requires energy dissipation. Thus, we require that the reversible computer, if it halts, has erased all intermediate results, leaving only the original input and the desired output; the process of building a reversible automaton is therefore more intricate than we have anticipated.

Charles Bennett was able to prove in 1973 that given an ordinary Turing machine S one can construct a reversible three-tape machine R which emulates S on any standard input, and which leaves behind, at the end of its computation, only the original input and the desired output. The formal proof of this statement can be found in [12].

Here we present Bennett’s informal argument. Imagine that at the end of the original computation, which with the help of the history tape is deterministic and reversible, we continue with a stage when the machine uses the inverse of the original transition function and carries out the entire computation backwards. Like the forward computation, the backward computation is deterministic and reversible.

We have to be a bit careful and create a copy of the results on a new tape immediately after the completion of the forward computation and before the backward one starts, because the backward computation destroys the results. We also have to stop recording on the history tape during the process of copying the results. The copy operation of the results can be done reversibly if we start with a blank tape.

After this three stage process the system will consist of a copy of the results obtained during the first phase, the copy being made during the second phase, and the original input tape reconstructed during the third phase. A vast amount of storage on the history tape was used, but the tape is returned to its original blank condition after the third stage.
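The bookkeeping of the three stages can be illustrated with a toy computation. The sketch below is not Bennett’s construction itself; it picks a hypothetical irreversible task (counting the 1 bits of an integer by repeatedly discarding its lowest bit) and uses a Python list as a stand-in for the history tape. All names are illustrative.

```python
def forward(x):
    """Stage one: count the 1 bits of x, recording each discarded bit on the history tape."""
    acc, history = 0, []
    while x:
        bit = x & 1
        history.append(bit)  # without this record the step x -> x >> 1 loses information
        acc += bit
        x >>= 1
    return x, acc, history


def backward(x, acc, history):
    """Stage three: undo the forward steps in reverse order, consuming the history tape."""
    while history:
        bit = history.pop()
        x = (x << 1) | bit
        acc -= bit
    return x, acc, history


def reversible_bit_count(x):
    work, result, history = forward(x)         # stage one
    copy_of_result = result                    # stage two: copy onto a blank tape
    work, result, history = backward(work, result, history)  # stage three
    assert result == 0 and history == []       # scratch storage is blank again
    return work, copy_of_result


print(reversible_bit_count(0b101101))  # (45, 4)
```

At the end only the original input and the copy of the result remain; the history list and the working register are back in their initial blank state without any explicit erasure step.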

Once convinced that logically reversible automata exist, we can think of thermodynamically reversible physical computers operating very slowly near thermodynamic equilibrium. For example, a chemical reversible computer could consist of DNA encoding logical states and reactants able to change the logical state.

Figures 63(a) and (b) depict a Universal Reversible Computer and a Zero Entropy Loss Reversible Computer based upon Bennett’s arguments discussed above. They are inspired by Feynman [51].




8 The “Entanglement” of Computing and Communication with Quantum Mechanics

8.1 Uncertainty and Locality

Earlier we noted that Heisenberg’s uncertainty principle reflects the fact that measurements of quantum systems disturb the system being measured. This fundamental law of physics is even more profound. Outcomes of experiments, we call them observables, have a minimum level of uncertainty that cannot possibly be removed with the help of a theoretical model.

Let us assume that we are interested in two observables of a quantum particle, call them A and B, for example, the position and the momentum of an electron. We prepare a large number of quantum systems (in our example electrons) in identical states | ψ〉 and measure first the observable A and then the observable B on all particles. Since all systems were initially in the same state, which we choose to be an eigenstate of A, we obtain the same value for the observable A and we see a large standard deviation of observable B.

Alternatively, we can measure the observable A on some systems and the observable B on the others. Call ∆(A) and ∆(B) the standard deviations of the measurements of A and B respectively. Heisenberg’s uncertainty principle states that

$$\Delta(A) \times \Delta(B) \geq \frac{1}{2}\, \left| \langle \psi | [A, B] | \psi \rangle \right|.$$
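The inequality can be checked numerically on the smallest possible example. The sketch below does not use position and momentum; as an assumption made for illustration it takes A and B to be the Pauli matrices X and Y acting on a single qubit in the state cos(θ/2)| 0〉 + e^{iφ} sin(θ/2)| 1〉, and all helper names are hypothetical.

```python
import cmath
import math

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def expectation(op, psi):
    """Return <psi| op |psi> for a normalized state vector psi."""
    n = len(psi)
    return sum(psi[i].conjugate() * op[i][j] * psi[j] for i in range(n) for j in range(n))


def std_dev(op, psi):
    """Standard deviation sqrt(<op^2> - <op>^2) of a Hermitian observable."""
    mean = expectation(op, psi).real
    mean_sq = expectation(matmul(op, op), psi).real
    return math.sqrt(max(mean_sq - mean ** 2, 0.0))


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def uncertainty_sides(a, b, psi):
    """Return (Delta(A) * Delta(B), |<psi|[A, B]|psi>| / 2)."""
    lhs = std_dev(a, psi) * std_dev(b, psi)
    rhs = 0.5 * abs(expectation(commutator(a, b), psi))
    return lhs, rhs


def qubit_state(theta, phi):
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]


lhs, rhs = uncertainty_sides(X, Y, qubit_state(1.0, 0.7))
print(lhs >= rhs)  # True
```

For the state | 0〉 (θ = 0) both sides are equal to 1, so the bound is attained; for other angles the left hand side is strictly larger.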

Uncertainty is a fundamental property of quantum systems. The classical interpretation of probabilities as lack of knowledge regarding future events is no longer true for quantum systems, where probabilities reflect our limited ability to acquire knowledge about the present. After observing the weather in Orlando, Florida, for many years we may determine that the conditional probability of rain for a fifth consecutive day, given that there were already four consecutive rainy days, is, say, one in one hundred. So after four rainy days we can say with high probability, 0.99, that there will be no rain during the fifth day. But, very seldom, this prediction of the future will turn out to be false. On the other hand, for quantum systems, once we have measured the position of a particle there is a certain distribution of its momentum; it cannot be precisely determined now, regardless of our ingenuity.

Leading the camp of those who did not accept the uncertainty at the core of quantum mechanics there is no more imposing figure than Albert Einstein. He is one of the pioneers of the new theory; in 1905 the little known public servant of the patent office in Bern published three papers that attracted the world’s attention: one on the photoelectric effect, one on the special theory of relativity, and one on statistical thermodynamics. In his explanation of the photoelectric effect Einstein considered the photons as discrete packets carrying an energy E = hν, with h the Planck constant and ν the frequency of the light. His explanation was in perfect agreement with the experiments and with Max Planck’s expression for the energy produced by a light-emitting system, E = nhν with n = 0, 1, 2, ....

Until his death, in 1955, Albert Einstein argued with Bohr, Heisenberg, Schroedinger, Dirac, and other supporters of the Copenhagen doctrine that non-determinism cannot possibly be the basis for the laws of nature. The famous pronouncement “God does not play dice” reflects Einstein’s strong conviction that the quantum theory was missing something, that it was an incomplete scientific theory. Einstein believed that some “hidden variables” are probably missing from quantum mechanics and that if one could discover the values of these variables the randomness would disappear.


Albert Einstein and other physicists believed that there are three fundamental principles for a theory attempting to accurately describe nature: (i) A deterministic view allowing variables to be known with great precision, like the one supported by Isaac Newton’s theory. Probabilities are acceptable to describe the outcomes of experiments, but only under special conditions, e.g., when boundary conditions limit our ability to get a complete description of reality. (ii) Locality, the fact that systems far apart in space can influence each other only by exchanging signals subject to the limitations caused by the finite speed of light. (iii) Completeness, the inclusion of elements of reality, such as the position and the momentum of a particle.

For years Einstein, together with the most preeminent scientists of the time, met regularly at the Solvay Conferences (see footnote 37). At these meetings Einstein repeatedly attempted to illustrate the fallacy of Heisenberg’s uncertainty through the so-called “gedanken” or “thought” experiments, to the exasperation of his friends and foes alike. Bohr & Co. were forced to find explanations for the apparent contradictions brought forth by Einstein. One of these experiments is outlined below. A box containing radiation is weighed both before and after releasing several photons. The time of the release is measured precisely by a clock controlling the door of the box and the energy of the photons can be deduced precisely using Einstein’s formula E = mc^2, where c = 3 × 10^10 cm/sec is the speed of light. “Thus both the time of the release of the photons and their energy can be precisely determined”, argued Einstein. It took Niels Bohr a sleepless night in Brussels to find the flaws in Einstein’s argument. A recent discussion of this scientific duel is found in [116].

But the real test for uncertainty came later, in 1933, when Albert Einstein imagined the following “gedanken” experiment designed to put the matter to rest once and for all. The idea of this experiment is to have two particles related to each other, measure one, and gather knowledge about the other.

Consider two particles “A” and “B” with known momentum flying towards each other and interacting with each other at a known position, for a very brief period of time. An observer, far away from the place where the two particles interacted with each other, measures the momentum of particle “A” and based upon this measurement is able to deduce the momentum of particle “B”. The observer may choose to measure the position of particle “A” instead of its momentum. According to the principles of quantum mechanics this would be a perfectly legitimate proposition, but in flagrant violation of common sense. How could the final state of particle “B” be influenced by a measurement performed on particle “A” long after the physical interaction between the two particles has terminated?

In 1935 Einstein, Podolsky, and Rosen (thus the abbreviation EPR) wrote a paper published in the Physical Review that stimulated great interest and seemed at that time to have definitely settled the dispute between Niels Bohr and Albert Einstein in favor of the author of the relativity theory [46]. The ingenuity of the EPR experiment is that the position and the momentum of one particle are determined precisely by measurements performed on its “entangled twin”. The authors of the EPR paper write “If, without disturbing the system, we can predict with certainty (i.e., with probability one) the value of physical quantity, then there is an element of physical reality corresponding to this physical quantity.” They concluded that, since the position and the momentum of the “entangled twin” are elements of the physical reality and quantum mechanics does not allow both to be part of the description of the state of the particle, quantum mechanics is an incomplete theory.

37 Ernest Solvay was an industrialist from Brussels who financed scientific meetings with the hope of having an audience for his own scientific theories.


In 1952 David Bohm, a former student of Robert Oppenheimer at Berkeley, suggested a change of the EPR thought experiment, this time involving two particles but with one variable of interest, the spin with two possible values, “Up” and “Down”, instead of two variables of interest, the position and the momentum. Two entangled particles, “A” and “B”, are generated by the same source and move away from each other. The two particles are entangled in their spins, and if one has spin “Up” the other has spin “Down” along the same direction. Once Alice measures the spin of particle “A” along the x-axis and finds it to be “Up”, the spin measured by Bob on particle “B”, along the same direction x, must be “Down”.

8.2 Possible Explanations of the EPR Paradox

In a previous chapter we learned that an EPR experiment consists of generating a pair of maximally entangled particles and demonstrating their strange properties with the aid of the main characters of cryptographic texts, Alice and Bob, with Caroll in a supporting role. Alice and Bob are at different locations and need to communicate with one another. Caroll sends one of the entangled particles to Alice and the other to Bob. The state of these particles is described by the vector

$$| \psi\rangle = \frac{1}{\sqrt{2}} \left( | 00\rangle + | 11\rangle \right).$$

Let us assume that when Alice measures her qubit she observes the state | 0〉. This means that the combined state is | 00〉 and Bob will observe the same state, | 0〉, on his qubit. When Alice observes her qubit in state | 1〉, the combined state is | 11〉 and Bob observes the state of his qubit to be | 1〉.

This behavior appears to suggest that non-local effects may occur. Einstein, Podolsky, and Rosen proposed a hidden variable theory. They argued that the state of the particle is hidden from us; both particles are either in state | 0〉 or in state | 1〉, but we do not know which. Later, John Bell showed that the hidden variable theory predicts that measurements performed on any system must satisfy the so-called Bell inequality [7, 8]. But measurements performed on quantum systems indicate that the Bell inequality is violated; thus the hidden variable theory must be false. The results of measurements performed with respect to different bases confirm the fallacy of the hidden variable theory.

Another attempt to explain the EPR paradox is that the measurement on one of the particles affects the other. But this contradicts the relativity theory, as we shall see shortly. Imagine two external characters, say Samantha and Hector, who are moving relative to each other while observing Alice and Bob. Samantha reports that Alice measures her particle first (Alice may have observed the state | 1〉 of her particle, forcing the same state on Bob’s particle). In turn, Hector reports that Bob measures his particle first (Bob may have measured the state | 0〉 of his particle, forcing the same state on Alice’s particle). Yet, the laws of physics must be independent of the observer. In our case we must provide equally consistent explanations of the observations reported by both Samantha and Hector. Therefore, causality does not explain the EPR paradox either.

8.3 The Bell Inequality. Local Realism

We now describe an experiment similar to EPR that sets the stage for deriving the Bell inequality. This new experiment is consistent with common sense and does not involve any reference to quantum mechanics. The only assumptions we need to derive our results are intuitive, common sense ones, and are called local realism:


(i) Locality - measurements of different physical properties of different objects, carried out by different individuals, at distinct locations, cannot influence each other.

(ii) Realism - physical properties are independent of observations.

Assume that Caroll prepares two particles and sends one of them to Alice and the other to Bob, see Figure 64. There are two physical properties that Alice could measure on her particle, Q and R. The results of these measurements are the values of the two physical properties, Q and R respectively. We impose the condition that the results of the measurements can only be Q, R = ±1.

Similarly, Bob can measure on his particle two physical properties, S and T. The results of these measurements can be S, T = ±1. All four properties (Q, R, S, T) are objective and the results of the measurements, (Q, R, S, T), have a well defined physical interpretation.

[Figure labels: Caroll; Alice: Q = ±1, R = ±1; Bob: S = ±1, T = ±1.]

Figure 64: The experimental set-up for the Bell inequality. Caroll prepares two particles and sends one of them to Alice and the other to Bob. Alice can measure on her particle properties Q and R while Bob can measure S and T on his particle. The results of the measurements can only be Q, R, S, T = ±1. Bell’s inequality requires that | E(QS) + E(RS) + E(RT) − E(QT) | ≤ 2.

Neither Alice nor Bob knows in advance which property they will measure. They perform the measurements simultaneously. Before a measurement, each one of them tosses a fair coin to decide which property they are going to measure. Q, R, S, T are random variables with a joint probability distribution function p(q, r, s, t) = Prob(Q = q, R = r, S = s, T = t). We shall prove that

$$| E(QS) + E(RS) + E(RT) - E(QT) | \leq 2$$

where E(κ) means the expected value of the random variable κ and | E(κ) | means the absolute value of E(κ).

Define

$$\kappa = QS + RS + RT - QT = S(R + Q) + T(R - Q).$$

It is relatively easy to see that κ = ±2. Indeed, either the first term, S(R + Q), or the second one, T(R − Q), must be equal to zero because Q, R, S, T = ±1. If one of the terms is zero, then the other one must be equal to ±2.
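The claim κ = ±2 can also be verified by brute force over the sixteen possible assignments of Q, R, S, T. The sketch below does this and, as an additional illustration, evaluates E(κ) for a randomly generated joint distribution; the random distribution and the seed are arbitrary choices made here.

```python
import random
from itertools import product


def kappa(q, r, s, t):
    return q * s + r * s + r * t - q * t


outcomes = list(product((-1, 1), repeat=4))
values = {kappa(q, r, s, t) for q, r, s, t in outcomes}
print(sorted(values))  # [-2, 2]

# An arbitrary joint distribution p(q, r, s, t) over the 16 outcomes.
rng = random.Random(0)
weights = [rng.random() for _ in outcomes]
total = sum(weights)
p = {outcome: w / total for outcome, w in zip(outcomes, weights)}

expected_kappa = sum(p[o] * kappa(*o) for o in outcomes)
print(abs(expected_kappa) <= 2)  # True
```

Because every outcome contributes either +2 or −2, any weighted average of κ with non-negative weights summing to one is confined to the interval [−2, 2].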


Now

$$E(\kappa) = \sum_{(q,r,s,t)} p(q, r, s, t)\,(qs + rs + rt - qt) \leq 2 \times \sum_{(q,r,s,t)} p(q, r, s, t).$$

But

$$\sum_{(q,r,s,t)} p(q, r, s, t) = 1,$$

thus E(κ) ≤ 2. The same argument applied to −κ shows that E(κ) ≥ −2, hence | E(κ) | ≤ 2.

The results of the measurements performed by Alice, Q and R, and the ones performed by Bob, S and T, are random variables with the joint distribution p(q, r, s, t). The expected value of a sum is the sum of the expected values, whether or not the products QS, RS, RT, and QT are independent, thus

$$E(QS + RS + RT - QT) = E(QS) + E(RS) + E(RT) - E(QT).$$

This completes our proof of the Bell inequality and it is the time to leave the worldof classical systems and turn again our attention to quantum systems. Consider a pair ofentangled qubits in the state ψ = 1/

√2(| 01〉 − | 10〉). Caroll sends one of the entangled

particles to Alice and the other to Bob. We have several one qubit gates that can be used toobserve a qubit; among them are the X gate (it transposes the components of a qubit) andthe Z gate (it flips the sign of a qubit). Alice and Bob measure the following observables

$$\text{Alice} \Longrightarrow Q = Z_1, \qquad R = X_1$$

$$\text{Bob} \Longrightarrow S = \frac{1}{\sqrt{2}}(-Z_2 - X_2), \qquad T = \frac{1}{\sqrt{2}}(Z_2 - X_2)$$

The observable Q is the output of a Z gate and R that of an X gate with Alice’s particle as input. A similar interpretation holds for Bob’s observables.

Let 〈QS〉 denote the average value of the observable QS. The average values of pairs of observables are

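$$\langle QS\rangle = \langle RS\rangle = \langle RT\rangle = \frac{1}{\sqrt{2}}, \qquad \langle QT\rangle = -\frac{1}{\sqrt{2}}.$$

It follows immediately that

$$\langle QS\rangle + \langle RS\rangle + \langle RT\rangle - \langle QT\rangle = 2\sqrt{2}.$$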

This means that quantum mechanics predicts a value for the sum of averages of observables that violates the Bell inequality. When two different theoretical models give contradictory results, at least one of the models must be wrong. Then we must turn to experiments to determine which one is wrong.

In this case the experiments agree with the predictions of quantum mechanics. This means that at least one of the two common sense assumptions presented at the beginning of this section is wrong. Sandu Popescu discusses issues related to non-locality in [87].
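The four averages quoted above can be reproduced numerically. The following is a minimal sketch in plain Python, assuming the state (| 01〉 − | 10〉)/√2 and the observables Q = Z, R = X, S = (−Z − X)/√2, T = (Z − X)/√2 given in the text; the helper names `kron`, `expectation`, and `combine` are illustrative only.

```python
from math import sqrt

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def expectation(op, state):
    """<state| op |state> for a real state vector and a real matrix."""
    image = [sum(m * v for m, v in zip(row, state)) for row in op]
    return sum(v * w for v, w in zip(state, image))

X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def combine(cz, cx):
    """The one-qubit observable cz*Z + cx*X."""
    return [[cz * Z[i][j] + cx * X[i][j] for j in range(2)] for i in range(2)]

h = 1 / sqrt(2)
Q, R = Z, X                       # Alice's observables
S, T = combine(-h, -h), combine(h, -h)   # Bob's observables
psi = [0, h, -h, 0]               # (|01> - |10>)/sqrt(2)

pairs = {"QS": (Q, S), "RS": (R, S), "RT": (R, T), "QT": (Q, T)}
averages = {name: expectation(kron(a, b), psi) for name, (a, b) in pairs.items()}
chsh = averages["QS"] + averages["RS"] + averages["RT"] - averages["QT"]

print(averages)
print(chsh, 2 * sqrt(2))
```

The sum evaluates to 2√2, which exceeds the classical bound of 2 derived earlier.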


8.4 EPR Pairs and Bell States

Some of the applications presented in this chapter are based upon the “miraculous” properties of entangled particles, the so-called EPR pairs. An EPR pair can be in one of four states called Bell states. These states form an orthonormal basis

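$$| \beta_{00}\rangle = \frac{| 00\rangle + | 11\rangle}{\sqrt{2}} \qquad | \beta_{01}\rangle = \frac{| 01\rangle + | 10\rangle}{\sqrt{2}}$$

$$| \beta_{10}\rangle = \frac{| 00\rangle - | 11\rangle}{\sqrt{2}} \qquad | \beta_{11}\rangle = \frac{| 01\rangle - | 10\rangle}{\sqrt{2}}$$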

The four Bell states are states of maximum entanglement between the two particles; any one of them can be transformed into any other by a purely local rotation on one of the particles. The last state is usually called the anti-correlated state.

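[Figure 65 diagram: (a) inputs | a〉 and | b〉, with a Hadamard gate H on the | a〉 line feeding the control input of a CNOT gate; (b) stage 1 applies H to | a〉 and I to | b〉, producing | a′〉 and | b′〉; stage 2 is the CNOT gate with input | V〉 and output | W〉.]

Figure 65: (a) A quantum circuit to create Bell states. (b) The two stages of the circuit on the left, showing the I gate for the second qubit in the first stage.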

The circuit in Figure 65(a) takes as input two particles, each in one of the pure states | 0〉 or | 1〉, and creates a pair of entangled particles. The circuit consists of a CNOT gate with a Hadamard gate on its control input.

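It is easy to show that the truth table of the quantum circuit in Figure 65(a) is

$$\begin{array}{ll}
\text{In} & \text{Out} \\
| 00\rangle & (| 00\rangle + | 11\rangle)/\sqrt{2} = | \beta_{00}\rangle \\
| 01\rangle & (| 01\rangle + | 10\rangle)/\sqrt{2} = | \beta_{01}\rangle \\
| 10\rangle & (| 00\rangle - | 11\rangle)/\sqrt{2} = | \beta_{10}\rangle \\
| 11\rangle & (| 01\rangle - | 10\rangle)/\sqrt{2} = | \beta_{11}\rangle
\end{array}$$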

The drawing in Figure 65(b) helps us understand how the output of the circuit that generates entangled states is obtained. Here we distinguish two stages:


(i) A first stage when the two input qubits are transformed separately. The first qubit, | a〉, is applied at the input of a Hadamard gate, H, and produces an output | a′〉. The second qubit, | b〉, is applied at the input of an identity gate, I, and produces an output | b′〉 = | b〉.

(ii) A second stage consisting of a CNOT gate with input | V〉 and output | W〉. Here

$$| V\rangle = | a'\rangle \otimes | b'\rangle, \qquad | W\rangle = G_{CNOT} | V\rangle.$$

Thus the output of the circuit is | W〉 = GCNOT | V〉 = GCNOT (| a′〉 ⊗ | b′〉).

Recall that the output of the Hadamard gate with | ψ〉 = α0 | 0〉 + α1 | 1〉 as input is

$$\alpha_0 \frac{| 0\rangle + | 1\rangle}{\sqrt{2}} + \alpha_1 \frac{| 0\rangle - | 1\rangle}{\sqrt{2}}.$$

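We now derive the output of the first and the second stage of the circuit in Figure 65(b). There are four possible cases depending upon the input qubits | a〉 and | b〉.

(i) | a〉 = | 0〉 and | b〉 = | 0〉.

$$| a'\rangle = \frac{| 0\rangle + | 1\rangle}{\sqrt{2}}, \qquad | b'\rangle = | 0\rangle$$

$$| V\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}$$

$$| W\rangle = G_{CNOT} | V\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = | \beta_{00}\rangle.$$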

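(ii) | a〉 = | 0〉 and | b〉 = | 1〉.

$$| a'\rangle = \frac{| 0\rangle + | 1\rangle}{\sqrt{2}}, \qquad | b'\rangle = | 1\rangle$$

$$| V\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix}$$

$$| W\rangle = G_{CNOT} | V\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix} = | \beta_{01}\rangle.$$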

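(iii) | a〉 = | 1〉 and | b〉 = | 0〉.

$$| a'\rangle = \frac{| 0\rangle - | 1\rangle}{\sqrt{2}}, \qquad | b'\rangle = | 0\rangle$$

$$| V\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ -1 \\ 0 \end{pmatrix}$$

$$| W\rangle = G_{CNOT} | V\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ -1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ -1 \end{pmatrix} = | \beta_{10}\rangle.$$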

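(iv) | a〉 = | 1〉 and | b〉 = | 1〉.

$$| a'\rangle = \frac{| 0\rangle - | 1\rangle}{\sqrt{2}}, \qquad | b'\rangle = | 1\rangle$$

$$| V\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 0 \\ -1 \end{pmatrix}$$

$$| W\rangle = G_{CNOT} | V\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 0 \\ -1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ -1 \\ 0 \end{pmatrix} = | \beta_{11}\rangle.$$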

This completes the derivation of the output of the quantum circuit used to create entangled particles from particles in pure states.
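The two-stage derivation above can be replayed mechanically. The sketch below, written in plain Python with illustrative helper names (`tensor`, `apply`, `bell_circuit` are assumptions of this example), computes | V〉 = H| a〉 ⊗ I| b〉 and then | W〉 = GCNOT | V〉 for the four basis inputs.

```python
from math import sqrt

def tensor(u, v):
    """Tensor product of two state vectors."""
    return [x * y for x in u for y in v]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

h = 1 / sqrt(2)
H = [[h, h], [h, -h]]                       # Hadamard gate
I = [[1, 0], [0, 1]]                        # identity gate
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
KET = {0: [1, 0], 1: [0, 1]}                # |0> and |1>

def bell_circuit(a, b):
    """Output |W> of the circuit for basis inputs |a> and |b>."""
    v = tensor(apply(H, KET[a]), apply(I, KET[b]))   # stage 1: |V> = |a'> (x) |b'>
    return apply(CNOT, v)                            # stage 2: |W> = G_CNOT |V>

for a in (0, 1):
    for b in (0, 1):
        # amplitudes of |00>, |01>, |10>, |11>
        print(a, b, [round(x, 4) for x in bell_circuit(a, b)])
```

Each printed vector lists the amplitudes of | 00〉, | 01〉, | 10〉, | 11〉 and can be compared line by line with cases (i) to (iv).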

8.5 Quantum Teleportation with Maximally Entangled Particles

The Einstein-Podolsky-Rosen (EPR) paradox reveals that if two particles interact they become correlated, and when measuring one particle we gather information about the wave function of the other.

In 1992 a group of scientists were discussing the impact of entanglement upon information transmission, with application to the distribution of encryption keys. As you might already know, Alice and Bob are the traditional names used in cryptography for two individuals wishing to exchange information in a secure manner. An embellished version of the problem follows. Assume that Alice and Bob are given as a wedding present an EPR pair, a pair of entangled particles called “particle 1” and “particle 2”. After several years Bob alone takes part in an expedition on K9³⁸ and takes “particle 2” with him, while Alice remains in London to take care of their newborn infant and keeps “particle 1” with her. A third party, Caroll, asks Alice to deliver a secret message from the Royal K9 Society to Bob. The message is encoded in the state of “particle 3”.

Alice cannot send the quantum state of “particle 3” directly because of the risks associated with sending quantum information over fiber optics or quantum channels. Someone confined to classical thinking might suggest that Alice perform a measurement of “particle 3” and deliver the information to Bob via a classical communication channel. Then Bob could reconstruct the quantum state by manipulating a particle similar to “particle 3”. This is not going to work because Alice can only get partial state information as a result of a measurement on “particle 3”. For example, let us assume that the information is in the polarization of a photon. Alice may be given a photon polarized at 45 deg and, not knowing the orientation, she might measure its horizontal polarization. In this case Alice would not only get the wrong answer, but she would also alter the quantum state of the photon.

³⁸ K9 is a fictitious mountain not far from K2. A number of canine expeditions to the summit are planned.

The solution for Alice is to perform a joint measurement on her own half of the EPR pair, “particle 1”, and on the particle given by Caroll, “particle 3”. Then she sends Bob the result of her measurement over a classical communication channel. At his end, upon receiving Alice’s result, Bob performs on his own particle, “particle 2”, one of four transformations, the one indicated by Alice’s result. The four transformations are carried out by applying his qubit to the input of an I, X, Y, or Z gate. The last three transformations are in fact 180 deg rotations about the x, y, and z axes. As a result of these transformations “particle 2” will be a perfect replica of “particle 3”.

At first sight, this seems to violate the principle of no-cloning discussed earlier. This is not true because when Alice measures the joint state of “particle 1” and “particle 3”, the state of “particle 3” is altered; thus, the original copy of the particle given to Alice by Caroll is destroyed.

In summary, Alice is able to transfer the quantum state, not the actual particle, by sending only classical information to Bob. The transfer of hidden quantum information appears to happen instantly, though Bob must first receive classical information regarding the result of Alice’s measurement.

As we have seen earlier there are several entangled states of two particles. Let us assume first that the particles of Alice and Bob (“particle 1” and “particle 2”) are in the maximally entangled state

$$| \psi^+\rangle = \frac{| 00\rangle + | 11\rangle}{\sqrt{2}}$$

and that “particle 3” is in state

$$| \varphi\rangle = \alpha_0 | 0\rangle + \alpha_1 | 1\rangle \quad \text{with} \quad | \alpha_0 |^2 + | \alpha_1 |^2 = 1.$$

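The joint state of “particle 3” and the entangled pair, “particle 1” and “particle 2”, is a vector in H8

$$| \xi\rangle = | \varphi\rangle \otimes | \psi^+\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} \otimes \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ \alpha_0 \\ \alpha_1 \\ 0 \\ 0 \\ \alpha_1 \end{pmatrix}$$

or | ξ〉 = (α0 | 000〉 + α0 | 011〉 + α1 | 100〉 + α1 | 111〉)/√2.

Alice carries out two operations on the two qubits in her possession, the particle from Caroll and her own half of the entangled pair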

(i) Alice applies a CNOT to the pair; she uses Caroll’s qubit as a control and her own as a target. She applies GCNOT ⊗ I to the state | ξ〉.

[Figure 66 diagram, recoverable labels: Caroll, Alice, Bob; “Pair of entangled qubits” (particle 1, particle 2); particle 3; “CNOT: particle 1 - target qubit, particle 3 - control qubit”; “Measurement: particle 3 - measured, particle 1 - unchanged”; “The measurement on the pair (1&3) changes the state of particle 2 to one of four states: S1, S2, S3, S4”; “Send to Bob results of measurement: 00 01 10 11”; Classical Channel; Quantum Channel; “Receive from Alice results of measurements: 00 01 10 11”; gates I, X, Z, iY; “Particle 2 is in the same state as particle 3”.]

Figure 66: Caroll wants to send Bob a secret message encoded as the state of “particle 3”. Alice and Bob share a pair of entangled particles, “particle 1” and “particle 2”. Caroll gives “particle 3” to Alice. Alice performs a CNOT on the pair, using “particle 3” as the control qubit and “particle 1” as the target qubit. Alice then measures “particle 1” and sends over the classical communication channel the result of the measurement, “00”, “01”, “10”, or “11”. Then Bob applies one of four transformations to transform the state of “particle 2”.

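$$| \kappa\rangle = (G_{CNOT} \otimes I) | \xi\rangle = \left[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right] | \xi\rangle.$$

$$| \kappa\rangle = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0
\end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ \alpha_0 \\ \alpha_1 \\ 0 \\ 0 \\ \alpha_1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ \alpha_0 \\ 0 \\ \alpha_1 \\ \alpha_1 \\ 0 \end{pmatrix}$$

or | κ〉 = (α0 | 000〉 + α0 | 011〉 + α1 | 101〉 + α1 | 110〉)/√2.

(ii) Alice applies a Hadamard gate to the first qubit in preparation for her measurement and leaves the second (the one entangled with Bob’s) untouched. This means that she applies to | κ〉 the transformation H ⊗ I ⊗ I and obtains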

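$$| \zeta\rangle = (H \otimes I \otimes I) | \kappa\rangle = \left[ \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right] | \kappa\rangle.$$

$$| \zeta\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & -1
\end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ \alpha_0 \\ 0 \\ \alpha_1 \\ \alpha_1 \\ 0 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_1 \\ \alpha_0 \\ \alpha_0 \\ -\alpha_1 \\ -\alpha_1 \\ \alpha_0 \end{pmatrix}$$

$$| \zeta\rangle = \frac{1}{2} \left[ \alpha_0 (| 000\rangle + | 011\rangle + | 100\rangle + | 111\rangle) + \alpha_1 (| 001\rangle + | 010\rangle - | 101\rangle - | 110\rangle) \right].$$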

We can rewrite the expression for | ζ〉 and isolate the first two qubits in the expression of the new state

$$| \zeta\rangle = \frac{1}{2} \left[ | 00\rangle (\alpha_0 | 0\rangle + \alpha_1 | 1\rangle) + | 01\rangle (\alpha_0 | 1\rangle + \alpha_1 | 0\rangle) + | 10\rangle (\alpha_0 | 0\rangle - \alpha_1 | 1\rangle) + | 11\rangle (\alpha_0 | 1\rangle - \alpha_1 | 0\rangle) \right].$$

From this expression it follows that when Alice performs a joint measurement of her two qubits, she gets the results | 00〉, | 01〉, | 10〉, | 11〉 with equal probability. She then sends Bob the result, 00, 01, 10, or 11, over a classical communication channel. At the same time, the measurement performed by Alice forces the qubit in Bob’s possession into one of four states

(i) | η〉 = α0 | 0〉+ α1 | 1〉 when the result is 00.

(ii) | η〉 = α0 | 1〉+ α1 | 0〉 when the result is 01.

(iii) | η〉 = α0 | 0〉 − α1 | 1〉 when the result is 10.

(iv) | η〉 = α0 | 1〉 − α1 | 0〉 when the result is 11.

Bob applies to his qubit the transformation from the following table, depending upon the string received from Alice

String received by Bob    Gate applied by Bob    The qubit becomes
00                        I                      I(η) = ϕ
01                        X                      X(η) = ϕ
10                        Z                      Z(η) = ϕ
11                        Y                      Y(η) = ϕ, up to a global phase factor


Now the qubit in Bob’s possession is in the same state as Caroll’s original qubit. As pointed out earlier, the state of Caroll’s qubit has been altered, hence the no-cloning theorem is not violated.

Note that the new state obtained as a result of the transformation of the original state of “particle 1” and “particle 3” can also be written as

$$| \zeta\rangle = (H \otimes I \otimes I)(G_{CNOT} \otimes I) | \xi\rangle.$$
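The protocol can be traced numerically for a concrete state. The sketch below is a plain-Python illustration under stated assumptions: the amplitudes passed to `teleport` are arbitrary example values, the helper names are invented for this example, and Bob's correction for the string 11 uses the real matrix iY (as labelled in Figure 66) so that his qubit equals ϕ exactly rather than up to a global phase.

```python
from math import sqrt

def tensor(u, v):
    """Tensor product of two state vectors."""
    return [x * y for x in u for y in v]

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

h = 1 / sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
iY = [[0, 1], [-1, 0]]                      # i times the Pauli Y matrix
H = [[h, h], [h, -h]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

def teleport(alpha0, alpha1):
    """Map each of Alice's results to Bob's qubit after his correction."""
    xi = tensor([alpha0, alpha1], [h, 0, 0, h])          # |phi> (x) |psi+>
    kappa = apply(kron(CNOT, I2), xi)                    # step (i)
    zeta = apply(kron(H, kron(I2, I2)), kappa)           # step (ii)
    corrections = {"00": I2, "01": X, "10": Z, "11": iY}
    results = {}
    for index, (bits, gate) in enumerate(corrections.items()):
        # Amplitudes of |bits>|0> and |bits>|1>: Bob's state for this result.
        eta = [zeta[2 * index], zeta[2 * index + 1]]
        norm = sqrt(sum(abs(c) ** 2 for c in eta))       # 1/2, i.e. probability 1/4
        results[bits] = apply(gate, [c / norm for c in eta])
    return results

for bits, qubit in teleport(0.6, 0.8j).items():
    print(bits, qubit)
```

For every one of the four measurement results, the corrected qubit printed for Bob carries the amplitudes that were supplied for Caroll's qubit.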

The discussion of teleportation with maximally entangled particles gives us the opportunity to observe some of the subtleties of handling entangled particles. We already know that the state of a pair of entangled particles is a vector in H4. Even though Alice possesses only two particles, Caroll’s and her own half of the entangled pair, her own particle cannot be described separately from Bob’s, so the state she transforms is that of all three particles, a vector in H8. This explains why she applies H ⊗ I ⊗ I to transform the first qubit.

Figure 67 shows a circuit involving several CNOT and Hadamard gates able to perform teleportation. The circuit has three inputs, a, b, c, and three outputs, a′, b′, c′. The unknown input a appears at output c′ = a.

[Figure 67 diagram: a three-qubit circuit with inputs | a〉 (top), | b〉 = | 0〉 (middle), and | c〉 = | 0〉 (bottom) and outputs | a′〉, | b′〉, and | c′〉 = | a〉, built from Hadamard (H) and CNOT gates; the stages are numbered 1 to 10, with Alice’s part on the left and Bob’s part on the right.]

Figure 67: A quantum teleportation circuit. The left side shows Alice’s transformations, the right side Bob’s. Caroll’s particle, “particle 3” (shown at the top), is initially in the state | a〉. Bob’s particle, “particle 2” (shown at the bottom), is initially in the state | c〉. Alice’s particle, “particle 1” (shown in the middle), is initially in the state | b〉. The final state of Bob’s particle is identical with the initial state of Caroll’s particle, | c′〉 = | a〉.

We denote | a〉 = α0 | 0〉 + α1 | 1〉 and identify ten stages in the circuit in Figure 67. The corresponding state vectors are ξ1, ξ2, . . . , ξ10. We compute the state vectors stage by stage as follows


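$$| \xi_1\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ 0 \\ \alpha_1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$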

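$$| \xi_2\rangle = (I \otimes H \otimes I) | \xi_1\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & -1
\end{pmatrix} \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ 0 \\ \alpha_1 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$

$$| \xi_2\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 \\ 0 \\ \alpha_0 \\ 0 \\ \alpha_1 \\ 0 \\ \alpha_1 \\ 0 \end{pmatrix}.$$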

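$$| \xi_3\rangle = (I \otimes G_{CNOT}) | \xi_2\rangle = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 \\ 0 \\ \alpha_0 \\ 0 \\ \alpha_1 \\ 0 \\ \alpha_1 \\ 0 \end{pmatrix}$$

$$| \xi_3\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ \alpha_0 \\ \alpha_1 \\ 0 \\ 0 \\ \alpha_1 \end{pmatrix}.$$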

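$$| \xi_4\rangle = (G_{CNOT} \otimes I) | \xi_3\rangle = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0
\end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ \alpha_0 \\ \alpha_1 \\ 0 \\ 0 \\ \alpha_1 \end{pmatrix}$$

$$| \xi_4\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ \alpha_0 \\ 0 \\ \alpha_1 \\ \alpha_1 \\ 0 \end{pmatrix}.$$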

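$$| \xi_5\rangle = (H \otimes I \otimes I) | \xi_4\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & -1
\end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} \alpha_0 \\ 0 \\ 0 \\ \alpha_0 \\ 0 \\ \alpha_1 \\ \alpha_1 \\ 0 \end{pmatrix}$$

Thus

$$| \xi_5\rangle = \frac{1}{2} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_1 \\ \alpha_0 \\ \alpha_0 \\ -\alpha_1 \\ -\alpha_1 \\ \alpha_0 \end{pmatrix}$$

or

$$| \xi_5\rangle = \frac{1}{2} (\alpha_0 | 000\rangle + \alpha_1 | 001\rangle + \alpha_1 | 010\rangle + \alpha_0 | 011\rangle + \alpha_0 | 100\rangle - \alpha_1 | 101\rangle - \alpha_1 | 110\rangle + \alpha_0 | 111\rangle).$$

Note that ξ5 can be written as

$$| \xi_5\rangle = \frac{1}{2} \left( | 00\rangle (\alpha_0 | 0\rangle + \alpha_1 | 1\rangle) + | 01\rangle (\alpha_0 | 1\rangle + \alpha_1 | 0\rangle) + | 10\rangle (\alpha_0 | 0\rangle - \alpha_1 | 1\rangle) + | 11\rangle (\alpha_0 | 1\rangle - \alpha_1 | 0\rangle) \right).$$

Now

$$| \xi_6\rangle = | \xi_5\rangle.$$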

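$$| \xi_7\rangle = (I \otimes G_{CNOT}) | \xi_6\rangle = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix} \frac{1}{2} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_1 \\ \alpha_0 \\ \alpha_0 \\ -\alpha_1 \\ -\alpha_1 \\ \alpha_0 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_0 \\ \alpha_1 \\ \alpha_0 \\ -\alpha_1 \\ \alpha_0 \\ -\alpha_1 \end{pmatrix}.$$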

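$$| \xi_8\rangle = (I \otimes I \otimes H) | \xi_7\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1
\end{pmatrix} \frac{1}{2} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_0 \\ \alpha_1 \\ \alpha_0 \\ -\alpha_1 \\ \alpha_0 \\ -\alpha_1 \end{pmatrix}$$

$$| \xi_8\rangle = \frac{1}{2\sqrt{2}} \begin{pmatrix} \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \end{pmatrix}.$$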

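We now use a result from Section 5.10 giving the transfer matrix of the gate involved in this stage, GQ

$$| \xi_9\rangle = G_Q | \xi_8\rangle = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix} \frac{1}{2\sqrt{2}} \begin{pmatrix} \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \end{pmatrix} = \frac{1}{2\sqrt{2}} \begin{pmatrix} \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \end{pmatrix}.$$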

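$$| \xi_{10}\rangle = (I \otimes I \otimes H) | \xi_9\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1
\end{pmatrix} \frac{1}{2\sqrt{2}} \begin{pmatrix} \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \\ \alpha_0 - \alpha_1 \end{pmatrix}$$

$$| \xi_{10}\rangle = \frac{1}{2} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_0 \\ \alpha_1 \\ \alpha_0 \\ \alpha_1 \\ \alpha_0 \\ \alpha_1 \end{pmatrix}$$

$$| \xi_{10}\rangle = \frac{1}{2} \left[ | 00\rangle (\alpha_0 | 0\rangle + \alpha_1 | 1\rangle) + | 01\rangle (\alpha_0 | 0\rangle + \alpha_1 | 1\rangle) + | 10\rangle (\alpha_0 | 0\rangle + \alpha_1 | 1\rangle) + | 11\rangle (\alpha_0 | 0\rangle + \alpha_1 | 1\rangle) \right]$$

$$| \xi_{10}\rangle = \frac{1}{2} (| 00\rangle + | 01\rangle + | 10\rangle + | 11\rangle)(\alpha_0 | 0\rangle + \alpha_1 | 1\rangle).$$

The third qubit, Bob’s, is thus left in the state α0 | 0〉 + α1 | 1〉 regardless of the values of the first two qubits.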
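The stage-by-stage computation can be replayed by multiplying the stage operators in sequence. The sketch below is a plain-Python illustration: the list `stages` simply collects the operators used in the equations for ξ2 to ξ10, the matrix GQ is entered as the permutation written out above (it exchanges the amplitudes of | 100〉 and | 101〉, and of | 110〉 and | 111〉), and the amplitudes 0.6 and 0.8 are example values chosen for this illustration.

```python
from math import sqrt

def tensor(u, v):
    """Tensor product of two state vectors."""
    return [x * y for x in u for y in v]

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def permutation_matrix(perm):
    """Matrix whose row r has a single 1 in column perm[r]."""
    n = len(perm)
    return [[1 if perm[r] == c else 0 for c in range(n)] for r in range(n)]

h = 1 / sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[h, h], [h, -h]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
G_Q = permutation_matrix([0, 1, 2, 3, 5, 4, 7, 6])

# Operators taking xi_1 to xi_2, xi_2 to xi_3, ..., xi_9 to xi_10.
stages = [
    kron(I2, kron(H, I2)),    # xi_2
    kron(I2, CNOT),           # xi_3
    kron(CNOT, I2),           # xi_4
    kron(H, kron(I2, I2)),    # xi_5
    kron(I2, kron(I2, I2)),   # xi_6 = xi_5
    kron(I2, CNOT),           # xi_7
    kron(I2, kron(I2, H)),    # xi_8
    G_Q,                      # xi_9
    kron(I2, kron(I2, H)),    # xi_10
]

def run_circuit(alpha0, alpha1):
    """Return the list [xi_1, ..., xi_10] for |a> = alpha0|0> + alpha1|1>."""
    state = tensor([alpha0, alpha1], tensor([1, 0], [1, 0]))
    history = [state]
    for gate in stages:
        state = apply(gate, state)
        history.append(state)
    return history

history = run_circuit(0.6, 0.8)
print([round(x, 4) for x in history[-1]])   # amplitudes of xi_10
```

The final vector alternates α0/2 and α1/2, matching the column vector given for ξ10.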

8.6 Anti-Correlation and Teleportation

The two entangled qubits could be in another EPR state. Now we discuss briefly the so-called anti-correlation. Consider the following two-qubit state called a spin singlet

$$| \psi^-\rangle = \frac{| 01\rangle - | 10\rangle}{\sqrt{2}}.$$

This is an entangled state and we shall prove later that a measurement reveals that the two particles involved are in opposite states. For example, if we measure the spin of the first particle and find it to be “Up”, then the spin of the second is “Down” and vice versa. Miraculously, the second qubit changes its state, as if knowing the result of the measurement on the first qubit. This state is distinguished from the three other entangled states; it changes sign when “particle 1” and “particle 2” of an entangled pair are interchanged.

Let us analyze the anti-correlation case described above. For the sake of clarity we consider the three particles to be photons and use the symbol → for horizontal polarization and ↑ for vertical polarization. The corresponding basis vectors are |→〉 and |↑〉. The subscript identifies the particle in the pair. The entangled pair (1, 2) forms a single quantum system with a shared state

$$| \psi^-\rangle_{12} = \frac{1}{\sqrt{2}} \left( | \rightarrow\rangle_1 | \uparrow\rangle_2 - | \uparrow\rangle_1 | \rightarrow\rangle_2 \right).$$

This entangled state indicates only that the two particles are in opposite states but provides no information about the state of each particle of the pair. Once we make a measurement of one of the particles by projecting it onto one of the basis vectors, say |→〉, the state of the other particle becomes instantaneously |↑〉. If “particle 1” is measured and found to have vertical polarization, then “particle 2” will have horizontal polarization.

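Now we perform a specific joint measurement on “particle 1” and “particle 3” which projects them onto the entangled state

$$| \psi^-\rangle_{13} = \frac{1}{\sqrt{2}} \left( | \rightarrow\rangle_1 | \uparrow\rangle_3 - | \uparrow\rangle_1 | \rightarrow\rangle_3 \right).$$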

Now “particle 1” and “particle 3” are anti-correlated. If “particle 3” has horizontal polarization it forces “particle 1” to have vertical polarization. In turn, “particle 1” forces “particle 2” to have the opposite polarization, thus “particle 2” ends up with the same polarization as “particle 3”.

A demonstration of quantum teleportation was carried out in 1997 at the University of Rome by Francesco de Martini, based upon an idea of Sandu Popescu, and, at about the same time, at Innsbruck, by Zeilinger. In both experiments the quantum state was teleported a few meters.

[Figure 68 diagram, recoverable labels: Source, Reflecting mirror, Polarizer, Alice, Bob, Caroll; paths A, B, C, D; polarizations h and v.]

Figure 68: The teleportation experiment at the University of Rome. The source generates a photon with horizontal polarization for Alice and one with vertical polarization for Bob. The entanglement is in the path selection. If Alice gets her photon via path A, then Bob gets his via path C; if Alice gets the photon via path B, then Bob gets his via path D. Caroll encodes her quantum information using a polarizer. Alice measures the polarization of the photon she receives and sends this classical information to Bob.

The experiment of de Martini is illustrated in Figure 68 [28]. In this experiment the information is double encoded into a single photon instead of two. The source generates two parametric downconverted³⁹ photons with opposite polarization, “photon 1”, with horizontal polarization, h, for Alice and “photon 2”, with vertical polarization, v, for Bob. The polarization entanglement of the two photons sent to Alice and Bob is converted into an entanglement of the paths followed by the two photons. A calcite crystal performs this conversion.

If “photon 1” travels to Alice via path A, then “photon 2” travels to Bob via path C; if “photon 1” travels via path B, then “photon 2” travels via path D. Caroll encodes her message in the polarization of the photon sent to Alice, “photon 1”. Alice measures the polarization of the photon she receives from the source and sends the classical result to Bob. Finally, Bob performs the measurement suggested by Alice’s result and he gets a photon with the polarization imposed by Caroll.

In this experiment the polarizer forces a certain polarization on “photon 1” and, because of the anti-correlation of “photon 1” and “photon 2”, the latter is forced to the opposite polarization.

8.7 Dense Coding

Coding is the process of transforming information during a communication process. The sender of a message encodes the message, then transmits the encoded information over a classical communication channel. The recipient of the message decodes the encoded information. The question we address now is whether there is an advantage in exchanging quantum information, qubits sent over a quantum communication channel, instead of sending classical bits over a classical communication channel.

The main characters of the following example are also Alice and Bob, with Caroll in a supporting role. Alice and Bob have been married for some time and Alice, inspired by the wonderful pictures taken by Bob during his last trip to K9, decides to join a new expedition to that remote part of the world. They want to exchange daily messages and compress them as much as possible to reduce communication costs. They agree prior to Alice’s departure to exchange daily information about the temperature and the cloud cover on K9. They decide that Alice will construct two-bit messages. The first bit will describe the temperature (0 if the temperature is below 0 degrees F, 1 if it is above 0 degrees F); the second bit will describe the cloud cover (0 if there are no clouds and 1 if there is a cloud cover). The sentence “At noon today the temperature on the summit of K9 is below zero and there are no clouds. Love Alice” will be encoded as the binary string “00”. The other possible messages are “01” (“...below zero... clouds..”), “10” (“...above zero... no clouds..”), and “11” (“...above zero...clouds..”).

To send one of the four messages Alice must transmit two bits of classical information. But Alice and Bob are already acquainted with quantum information. They decide to exchange a single qubit of information over a quantum channel to encode and decode the four possible messages. Here is the intricate story of how they succeed.

Assume that there is a source of entangled particles and Caroll is able to send to Alice one qubit (one of the two entangled particles) and to Bob the other entangled particle of the pair, see Figure 69. By now we know that the pair of particles is in the state

$$| \psi\rangle = \frac{| 00\rangle + | 11\rangle}{\sqrt{2}}, \qquad | \psi\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$

[Figure 69 diagram, recoverable labels: Caroll, Alice, Bob; “Pair of entangled qubits” (Alice’s qubit, Bob’s qubit); gates I, Z, X, iY applied by Alice for the messages 00, 01, 10, 11; “Alice’s modified but still entangled qubit”; Quantum channel; “Qubit from Alice”; CNOT and H applied by Bob; “First qubit”, “Second qubit”.]

Figure 69: Dense coding. Alice sends to Bob one qubit instead of two bits.

³⁹ A parametric downconversion source uses a UV laser beam which, upon an interaction with a non-linear medium, a crystal of ammonium dihydrogen phosphate (ADP), generates two photons for one input photon.

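Alice uses the one-qubit transformations given by the Pauli matrices, I, X, Y, and Z. Recall from Section 5.2 that these matrices are

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$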

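Alice prepares her qubit as follows

(i) to send “00” she applies to her qubit the transformation produced by I, the identity matrix, and transmits it; the pair is then in the state

$$| \varphi_{00}\rangle = \frac{| 00\rangle + | 11\rangle}{\sqrt{2}}, \qquad | \varphi_{00}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$

(ii) to send “01” she applies to her qubit the transformation produced by the Z matrix and transmits it; the pair is then in the state

$$| \varphi_{01}\rangle = \frac{| 00\rangle - | 11\rangle}{\sqrt{2}}, \qquad | \varphi_{01}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ -1 \end{pmatrix}.$$

(iii) to send “10” she applies to her qubit the transformation produced by the X matrix and transmits it; the pair is then in the state

$$| \varphi_{10}\rangle = \frac{| 01\rangle + | 10\rangle}{\sqrt{2}}, \qquad | \varphi_{10}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}.$$

(iv) to send “11” she applies to her qubit the transformation produced by the iY matrix and transmits it; the pair is then in the state

$$| \varphi_{11}\rangle = \frac{| 01\rangle - | 10\rangle}{\sqrt{2}}, \qquad | \varphi_{11}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ -1 \\ 0 \end{pmatrix}.$$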

We already know that X, Y, Z, and I are one-qubit gates used to transform vectors in H2, but this time we have an entangled pair of qubits, whose state is a vector in H4. Each of the four matrices used to transform the first qubit of the entangled pair, the qubit in Alice’s possession, is obtained as the tensor product of the corresponding single-qubit gate and the identity matrix. We transform only the first qubit and leave the second qubit of the entangled pair untouched.

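The transfer matrices and the outputs for the four transformations Alice is expected to carry out on her qubit are

$$00 \longrightarrow G_{00} = I \otimes I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

$$| \varphi_{00}\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$

$$01 \longrightarrow G_{01} = Z \otimes I = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$

$$| \varphi_{01}\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ -1 \end{pmatrix}.$$

$$10 \longrightarrow G_{10} = X \otimes I = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}$$

$$| \varphi_{10}\rangle = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}.$$

$$11 \longrightarrow G_{11} = Y \otimes I = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \\ i & 0 & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix}$$

$$| \varphi_{11}\rangle = i \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \\ i & 0 & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \frac{i}{\sqrt{2}} \begin{pmatrix} 0 \\ -i \\ i \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ -1 \\ 0 \end{pmatrix}.$$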

After getting the qubit from Alice, Bob is in possession of both qubits of the entangled pair. Let us see how he decodes the message. First he applies a CNOT to the pair and gets the following results

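$$00 \longrightarrow | \xi_{00}\rangle = G_{CNOT} | \varphi_{00}\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}$$

$$| \xi_{00}\rangle = \frac{| 00\rangle + | 10\rangle}{\sqrt{2}} = \frac{| 0\rangle + | 1\rangle}{\sqrt{2}} | 0\rangle.$$

$$01 \longrightarrow | \xi_{01}\rangle = G_{CNOT} | \varphi_{01}\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ -1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ -1 \\ 0 \end{pmatrix}$$

$$| \xi_{01}\rangle = \frac{| 00\rangle - | 10\rangle}{\sqrt{2}} = \frac{| 0\rangle - | 1\rangle}{\sqrt{2}} | 0\rangle.$$

$$10 \longrightarrow | \xi_{10}\rangle = G_{CNOT} | \varphi_{10}\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix}$$

$$| \xi_{10}\rangle = \frac{| 01\rangle + | 11\rangle}{\sqrt{2}} = \frac{| 0\rangle + | 1\rangle}{\sqrt{2}} | 1\rangle.$$

$$11 \longrightarrow | \xi_{11}\rangle = G_{CNOT} | \varphi_{11}\rangle = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ -1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 0 \\ -1 \end{pmatrix}$$

$$| \xi_{11}\rangle = \frac{| 01\rangle - | 11\rangle}{\sqrt{2}} = \frac{| 0\rangle - | 1\rangle}{\sqrt{2}} | 1\rangle.$$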

Amazingly enough, Bob can now measure the second qubit without affecting the state of the first qubit, because after the CNOT the pair is in a product state. If the second qubit is “0” it means that Alice sent either “00” or “01”. If the second qubit is “1” it means that Alice sent either “10” or “11”.

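Now Bob applies the Hadamard gate to the first qubit. Writing only the first-qubit factor of each | ξ〉, the results are

$$00 \longrightarrow | \zeta_{00}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 2 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad | \zeta_{00}\rangle = | 0\rangle.$$

$$01 \longrightarrow | \zeta_{01}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 0 \\ 2 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad | \zeta_{01}\rangle = | 1\rangle.$$

$$10 \longrightarrow | \zeta_{10}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 2 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad | \zeta_{10}\rangle = | 0\rangle.$$

$$11 \longrightarrow | \zeta_{11}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 0 \\ 2 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad | \zeta_{11}\rangle = | 1\rangle.$$

Now Bob knows precisely which one of the four messages was sent. If the second qubit is “0” and the result of the transformation (performed by the Hadamard gate on the first qubit of the pair) is “0”, then the message sent is “00”; if the result of the H transformation is “1”, then the message sent is “01”. If the second qubit is “1” and the result of the transformation is “0”, then the message sent is “10”; if the result of the H transformation is “1”, then the message sent is “11”.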
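The whole exchange, Alice's encoding followed by Bob's CNOT and Hadamard, can be checked for all four messages. The sketch below is a plain-Python illustration with invented helper names (`dense_coding`, `kron`, `apply`); it reads the message off the final basis state, taking the first message bit from the second qubit and the second message bit from the first qubit, as in the derivation above.

```python
from math import sqrt

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

h = 1 / sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
iY = [[0, 1], [-1, 0]]                       # i times the Pauli Y matrix
H = [[h, h], [h, -h]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
ENCODERS = {"00": I2, "01": Z, "10": X, "11": iY}

def dense_coding(message):
    """Encode a two-bit message on Alice's qubit and decode it at Bob's end."""
    pair = [h, 0, 0, h]                                  # (|00> + |11>)/sqrt(2)
    phi = apply(kron(ENCODERS[message], I2), pair)       # Alice acts on the first qubit
    xi = apply(CNOT, phi)                                # Bob: CNOT, first qubit is control
    zeta = apply(kron(H, I2), xi)                        # Bob: Hadamard on the first qubit
    # zeta is a single basis state |first second>; locate its index.
    index = max(range(4), key=lambda k: abs(zeta[k]))
    first, second = divmod(index, 2)
    return f"{second}{first}"

for message in ENCODERS:
    print(message, "->", dense_coding(message))
```

In every case the two measured qubits determine the two classical bits, although Alice sent only one qubit.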

8.8 Quantum Key Distribution

System security is a critical concern in the design of networked computer systems. Confidentiality is a property of a system that guarantees that only agents with proper credentials have access to information. Confidentiality can be compromised while information is transmitted over insecure communication channels or while it is stored on sites that allow multiple agents to modify it.


The common method to support confidentiality is based on encryption. Data, or plaintext in cryptographic terms, is mathematically transformed into ciphertext, and only agents with the proper key are able to decrypt the ciphertext and transform it into plaintext.

The algorithms used to transform plaintext into ciphertext and back form a cipher. A symmetric cipher uses the same key for encryption and decryption. Asymmetric or public key ciphers involve a public key that can be freely distributed and a secret private key. Data is encrypted by the sender using the public key of the intended recipient of the message, and it is decrypted using the private key of the recipient, see Figure 70.

[Figure 70 diagram: (a) Plaintext, “Encrypt with secret key”, Ciphertext, “Decrypt with secret key”, Plaintext. (b) Plaintext, “Encrypt with public key of the recipient”, Ciphertext, “Decrypt with the private key of the recipient”, Plaintext.]

Figure 70: (a) Secret key cryptography. (b) Public key cryptography.

The problem of distributing the key of a symmetric cipher is known as the key distribution problem. Decryption using an asymmetric cipher is time consuming. Typically, the parties involved use the public key system to exchange a session key, and then they use a symmetric cipher based upon this key to communicate. To make it harder for a third party to break the cipher, the encryption key should be changed as frequently as possible.

The key distribution problem has an ingenious solution when, in addition to the classical communication channel, a quantum communication channel is available. We show now that the two parties involved in the exchange of the cipher key can detect if a third party eavesdrops.

As before, the main characters are Alice, who wants to send the encryption key to Bob, and Caroll, who attempts to eavesdrop.

The exchange without a third party is captured in Figure 71(a). Alice sends over the quantum communication channel n qubits, each encoded randomly in one of two bases, (|→〉, |↑〉) and (|↗〉, |↖〉). Bob randomly chooses one of the two bases when receiving and measuring each qubit. Bob ends up guessing correctly the bases for about n/2 of the qubits, but he does not know which qubits are measured correctly. Then Alice uses the classical communication channel to send Bob the bases used for each qubit, and Bob sends Alice the bases he guessed for each qubit. Now both Alice and Bob know precisely the qubits they have agreed upon. They use the approximately n/2 qubits as the encryption key.

[Figure 71 diagram: (a) Alice sends n qubits to Bob; the two also share a classical communication channel. (b) The same exchange with Caroll between Alice and Bob, intercepting the n qubits and re-sending them, and listening to the n bits on the classical communication channel.]

Figure 71: (a) Alice sends over the quantum communication channel n qubits encoded randomly in one of two bases. Bob randomly chooses one of the two bases when receiving one qubit. Bob ends up guessing correctly the bases for about n/2 of the qubits. Then Alice and Bob exchange over the classical communication channel the basis used by each of them for each qubit. Now both Alice and Bob know precisely on which qubits they have agreed and use them as the encryption key. (b) Caroll intercepts each of the n qubits. She randomly picks one of the two bases to measure each qubit and then she re-sends the qubit to Bob. Approximately n/2 of the qubits received by Bob have their state altered by Caroll. Bob guesses the correct bases for only about half of the qubits whose state has not been altered by Caroll. When Alice and Bob exchange information over the classical communication channel they realize that they agree on considerably fewer than n/2 qubits. Thus they detect the presence of Caroll.

The exchange when Caroll eavesdrops on both the quantum and the classical communication channels is illustrated in Figure 71(b). Caroll intercepts each of the n qubits. She randomly picks one of the two bases to measure each qubit and then she re-sends the qubit to Bob. Approximately n/2 of the qubits received by Bob have their state altered by Caroll. When Alice and Bob exchange information over the classical communication channel they realize that they agree on considerably fewer than n/2 qubits. Thus they detect the presence of Caroll.
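The effect of an intercept-and-resend attack can be estimated with a small classical simulation. The sketch below is a simplified model and not the full protocol: it assumes a noiseless channel, represents each basis by a bit, and treats a measurement in the wrong basis as a fair coin toss; the function names `measure` and `exchange` and the seed are choices made for this illustration.

```python
import random

def measure(bit, prepared_basis, measuring_basis, rng):
    """Outcome of measuring a qubit that encodes `bit` in `prepared_basis`."""
    if prepared_basis == measuring_basis:
        return bit                 # matching basis: the encoded bit is recovered
    return rng.randint(0, 1)       # wrong basis: both outcomes equally likely

def exchange(n, eavesdrop, seed=0):
    """Return (qubits with matching bases, bit disagreements among them)."""
    rng = random.Random(seed)
    agreed = errors = 0
    for _ in range(n):
        bit, basis_alice = rng.randint(0, 1), rng.randint(0, 1)
        sent_bit, sent_basis = bit, basis_alice
        if eavesdrop:
            basis_caroll = rng.randint(0, 1)
            sent_bit = measure(bit, basis_alice, basis_caroll, rng)
            sent_basis = basis_caroll          # Caroll re-sends in her own basis
        basis_bob = rng.randint(0, 1)
        received = measure(sent_bit, sent_basis, basis_bob, rng)
        if basis_alice == basis_bob:           # kept after the classical exchange
            agreed += 1
            errors += received != bit
    return agreed, errors

print(exchange(4000, eavesdrop=False))
print(exchange(4000, eavesdrop=True))
```

In this model Alice and Bob still choose the same basis for about n/2 qubits, but with Caroll present roughly a quarter of those qubits yield different bit values at the two ends, so the number of qubits on which they truly agree drops to about 3n/8. Comparing a sample of the kept bits is one way to expose the disagreement.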

We give without proof the following proposition: In any attempt to distinguish between two non-orthogonal quantum states, information can be gained only at the expense of introducing a disturbance of the system.

Protocols for quantum key distribution are based upon the idea of transmitting non-orthogonal qubit states and then checking for the disturbance in their transmitted states. To prove the correctness of a quantum key distribution (QKD) protocol we have to show that Alice and Bob agree on a key about which Caroll can obtain only an exponentially small amount of information.

Eavesdropping must be distinguished from the noise on the communication channel. To do so, “check” bits must be interspersed randomly among the “data” bits used to construct the encryption key.

We now outline the first QKD protocol, BB84, proposed by Bennett and Brassard in 1984 [14]. A proof of security of this protocol can be found in [105]. Here we follow the description of the protocol in [80].

(i) Alice selects n, the approximate length of the desired encryption key. Then she generates two random strings of bits a and b of length (4 + δ)n.

(ii) Alice encodes the bits in string a using the bits in string b to choose the basis (either X or Z) for each qubit in a. She generates $|\psi\rangle$, a block of $(4+\delta)n$ qubits

$$|\psi\rangle = \bigotimes_{k=1}^{(4+\delta)n} |\psi_{a_k b_k}\rangle$$

where $a_k$ and $b_k$ are the k-th bits of strings a and b respectively. Each qubit is in one of four pure states in two bases, $[\,|0\rangle, |1\rangle\,]$ and $[\,\tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle),\ \tfrac{1}{\sqrt{2}}(|0\rangle - |1\rangle)\,]$. The four states used are

$$\psi_{00} = |0\rangle, \qquad \psi_{10} = |1\rangle, \qquad \psi_{01} = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle), \qquad \psi_{11} = \tfrac{1}{\sqrt{2}}(|0\rangle - |1\rangle).$$

(iii) If $\mathcal{E}$ describes the combined effect of the channel noise and Caroll’s interference then the block of qubits received by Bob is $\mathcal{E}(|\psi\rangle\langle\psi|)$.

(iv) Bob constructs a random string of bits, b′, of length (4 + δ)n. He then measures every qubit either in basis X or in basis Z depending upon the value of the corresponding bit of b′. As a result of his measurement he constructs the binary string a′. He tells Alice over the classical channel that he now expects information about b.

(v) Alice uses the classical channel to disclose b.

(vi) Alice and Bob exchange information over the classical channel and keep only the bits in the set {a, a′} for which the corresponding bits of the strings b and b′ are equal. Let us assume that Alice and Bob keep only 2n bits. By choosing δ sufficiently large Alice and Bob can ensure that the number of bits kept is close to 2n with a very high probability.

(vii) Alice and Bob perform several tests to determine the level of noise and eavesdropping on the channel. The set of 2n bits is split into two sub-sets of n bits each. One sub-set will be the check bits used to estimate the level of noise and eavesdropping, and the other consists of the data bits used for the quantum key. Alice selects n check bits at random and sends the position of the selected bits over the classical channel to Bob. Then Alice and Bob compare the values of the check bits. If more than, say, t bits disagree then they abort and re-try the protocol.
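The sifting steps of the protocol can be illustrated with a small classical simulation. This is only a sketch under simplifying assumptions that are ours: the channel is noiseless, Caroll uses a simple intercept-and-resend attack, and a measurement in the wrong basis is modeled as a fair coin toss. The names `bb84_sift` and `error_rate` are hypothetical helpers, not part of the protocol description.

```python
import random

def bb84_sift(n_qubits, eavesdrop=False, rng=None):
    """Simulate basis choice, measurement and sifting for n_qubits qubits.

    Returns (alice_key, bob_key): the bits kept at positions where b and b' agree.
    """
    if rng is None:
        rng = random.Random()
    a = [rng.randint(0, 1) for _ in range(n_qubits)]        # Alice's data bits (string a)
    b = [rng.randint(0, 1) for _ in range(n_qubits)]        # Alice's bases (string b): 0 = Z, 1 = X
    b_prime = [rng.randint(0, 1) for _ in range(n_qubits)]  # Bob's bases (string b')

    def measure(bit, prep_basis, meas_basis):
        # Same basis: the outcome is certain. Different basis: fair coin toss.
        return bit if prep_basis == meas_basis else rng.randint(0, 1)

    a_prime = []
    for bit, basis, bob_basis in zip(a, b, b_prime):
        if eavesdrop:
            caroll_basis = rng.randint(0, 1)
            bit = measure(bit, basis, caroll_basis)  # Caroll measures the qubit ...
            basis = caroll_basis                     # ... and re-sends it in her own basis
        a_prime.append(measure(bit, basis, bob_basis))

    keep = [k for k in range(n_qubits) if b[k] == b_prime[k]]
    return [a[k] for k in keep], [a_prime[k] for k in keep]

def error_rate(x, y):
    """Fraction of positions at which the two sifted keys disagree."""
    return sum(u != v for u, v in zip(x, y)) / len(x)

quiet_a, quiet_b = bb84_sift(4000)
tapped_a, tapped_b = bb84_sift(4000, eavesdrop=True)
```

Without eavesdropping the two sifted keys coincide; under this model of Caroll's attack roughly one quarter of the sifted bits disagree, which is the kind of disturbance the check bits are meant to reveal.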

In summary, the attempt of an intruder to eavesdrop increases the level of disturbance of a signal on a quantum communication channel. The two parties wishing to communicate in a secure manner establish an upper bound for the level of disturbance tolerable. They use a set of check bits to estimate the level of noise and/or eavesdropping. Then they reconcile their information and distil a shared secret key.



(1) Find out what were the arguments of Bohr, or provide your own arguments showing that the experiment suggested by Einstein and discussed in Section 8.1 does not contradict the uncertainty principle.

(2) Assume that G1 is the transfer matrix of “stage 1” consisting of the H and I gate and G2 = GCNOT is the transfer matrix of “stage 2” of the quantum circuit in Figure 65(b). Construct GBell = G2G1. Apply GBell to an input to the circuit in Figure 65(a) and calculate the output. Explain the results.

(3) Can the strategy for encoding and decoding two bits of classical information into one qubit be extended? If you believe that this is possible, describe an algorithm for dense coding of 3 bits into one qubit. If not, justify your answer.


9 Appendix I: Modular Arithmetic

Factoring large integers into primes is a hard computational problem and because of this property has been used as the basis of several cryptosystems. The first known polynomial time quantum algorithm for prime factorization was developed by Peter Shor in 1994 [100]. This justifies our interest in modular arithmetic.

9.1 Elementary Number Theory Concepts

If x is a real number then $\lfloor x \rfloor$ denotes the floor of x, the greatest integer less than or equal to x. If x and y are real numbers we define the remainder when x is divided by y as the binary operation

$$x \bmod y = \begin{cases} x - y\lfloor x/y \rfloor & \text{if } y \neq 0 \\ x & \text{if } y = 0 \end{cases}$$

such that

$$y > 0 \implies 0 \le x \bmod y < y \qquad \text{and} \qquad y < 0 \implies 0 \ge x \bmod y > y.$$

A prime number is an integer greater than one which has no exact divisor other than 1 and itself. The “fundamental theorem of arithmetic” states that any positive integer n can be expressed as a product of prime numbers with unique nonnegative exponents

$$n = 2^{n_2} \cdot 3^{n_3} \cdot 5^{n_5} \cdots = \prod_{p\ \mathrm{prime}} p^{n_p}.$$

A proof by induction of the existence of such a factorization follows. The statement is true for n = 2. Let us assume that the statement is also true for all integers up to n, namely 2, 3, 4, . . . , n, and prove that it is also true for n + 1. From the definition of a prime number it follows that n + 1 is either prime or a product of some smaller numbers

$$n + 1 = p_1 \cdot p_2 \cdots p_q.$$

Each factor satisfies $2 \le p_i \le n$. According to our assumption $p_1, p_2, \ldots, p_q$ can be expressed as products of prime numbers. It follows that n + 1 can be expressed as a product of prime numbers. The uniqueness of the exponents requires a separate argument, which we omit.

Integers n and m are said to be relatively prime if they do not have any common factors other than 1. Given two integers n and m their greatest common divisor, gcd(n, m), is the largest integer that divides evenly both n and m. Their least common multiple, lcm(n, m), is the smallest positive integer that is a multiple of both n and m. According to the fundamental theorem of arithmetic

$$n = \prod_{p\ \mathrm{prime}} p^{n_p} \qquad \text{and} \qquad m = \prod_{p\ \mathrm{prime}} p^{m_p}.$$

It follows that

$$\gcd(n, m) = \prod_{p\ \mathrm{prime}} p^{\min(n_p, m_p)}$$

$$\mathrm{lcm}(n, m) = \prod_{p\ \mathrm{prime}} p^{\max(n_p, m_p)}$$

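Given any integers a, b, c the following identities are true

$$a \cdot b = \gcd(a, b) \cdot \mathrm{lcm}(a, b) \quad \text{if } a, b \ge 0$$

$$\gcd(a, b) \cdot c = \gcd(a \cdot c, b \cdot c) \quad \text{if } c \ge 0$$

$$\mathrm{lcm}(a, b) \cdot c = \mathrm{lcm}(a \cdot c, b \cdot c) \quad \text{if } c \ge 0$$

$$\gcd[a, \gcd(b, c)] = \gcd[b, \gcd(a, c)] = \gcd[\gcd(a, b), c]$$

$$\mathrm{lcm}[a, \mathrm{lcm}(b, c)] = \mathrm{lcm}[b, \mathrm{lcm}(a, c)] = \mathrm{lcm}[\mathrm{lcm}(a, b), c]$$

$$\gcd[\mathrm{lcm}(a, b), \mathrm{lcm}(a, c)] = \mathrm{lcm}[a, \gcd(b, c)]$$

$$\mathrm{lcm}[\gcd(a, b), \gcd(a, c)] = \gcd[a, \mathrm{lcm}(b, c)]$$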
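A numerical spot check is not a proof, but it is a quick way to catch a mistyped identity. The sketch below tests the product rule, the two scaling rules, the associativity of gcd and of lcm, and the two distributive laws on all triples of small positive integers. It assumes Python 3.9 or later for `math.lcm`; the function name `identities_hold` is ours.

```python
from itertools import product
from math import gcd, lcm  # math.lcm requires Python 3.9 or later

def identities_hold(a, b, c):
    """Return True if all seven gcd/lcm identities hold for non-negative a, b, c."""
    return all([
        a * b == gcd(a, b) * lcm(a, b),
        gcd(a, b) * c == gcd(a * c, b * c),
        lcm(a, b) * c == lcm(a * c, b * c),
        gcd(a, gcd(b, c)) == gcd(b, gcd(a, c)) == gcd(gcd(a, b), c),
        lcm(a, lcm(b, c)) == lcm(b, lcm(a, c)) == lcm(lcm(a, b), c),
        gcd(lcm(a, b), lcm(a, c)) == lcm(a, gcd(b, c)),
        lcm(gcd(a, b), gcd(a, c)) == gcd(a, lcm(b, c)),
    ])

# Exhaustive check over 1 <= a, b, c <= 30 (27000 triples).
all_ok = all(identities_hold(a, b, c) for a, b, c in product(range(1, 31), repeat=3))
```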

Consider the set of q non-negative integers $(p_1, p_2, \ldots, p_i, \ldots, p_q)$. The notation $k \mid m$ means that k divides m exactly. The greatest common divisor (gcd) of $(p_1, p_2, \ldots, p_i, \ldots, p_q)$ is the largest integer that divides all of them

$$\gcd(p_1, p_2, \ldots, p_i, \ldots, p_q) = \max\{m : m \mid p_i,\ 1 \le i \le q\}.$$

The least common multiple of $(p_1, p_2, \ldots, p_i, \ldots, p_q)$ is the smallest integer divisible by all of them

$$\mathrm{lcm}(p_1, p_2, \ldots, p_i, \ldots, p_q) = \min\{m : m > 0 \text{ and } p_i \mid m,\ 1 \le i \le q\}.$$

Euclid’s algorithm, discussed in the next section, allows us to compute the gcd of two integers n and m without factoring them. Alternatively, we can factor n and m into prime numbers, select the common prime factors, each at the smallest power at which it appears, and multiply them. For example, if

$$n = 94829 = 7 \cdot 19 \cdot 23 \cdot 31$$

$$m = 120745 = 5 \cdot 19 \cdot 31 \cdot 41$$

then

$$\gcd(n, m) = \gcd(94829, 120745) = 19 \cdot 31 = 589.$$

Both 19 and 31 divide 94829 and 120745, but 589 = 19 · 31 is the largest integer dividing both 94829 and 120745.

To compute the lcm of two integers n and m we factor the integers. Then we include every prime number appearing in the factorization of either m or n, at the largest power at which it appears, and compute the product. For example, if

$$94829 = 7 \cdot 19 \cdot 23 \cdot 31$$

$$120745 = 5 \cdot 19 \cdot 31 \cdot 41$$

then

$$\mathrm{lcm}(94829, 120745) = 5 \cdot 7 \cdot 19 \cdot 23 \cdot 31 \cdot 41 = 19439945.$$

Other integers, such as 11450127605 = 94829 · 120745, and all multiples of it, are divisible by 94829 and by 120745, but 19439945 is the smallest integer divisible both by 94829 and by 120745.
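The factor-and-compare recipe can be sketched in a few lines. The two integers are built directly from the prime factorizations $7 \cdot 19 \cdot 23 \cdot 31$ and $5 \cdot 19 \cdot 31 \cdot 41$ used in the example; `factorize` and `gcd_lcm_by_factoring` are hypothetical helper names, and trial division is chosen only because it is short, not because it is efficient.

```python
from collections import Counter
from math import gcd, prod

def factorize(n):
    """Trial-division factorization; returns a Counter mapping prime -> exponent."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        factors[n] += 1
    return factors

def gcd_lcm_by_factoring(n, m):
    fn, fm = factorize(n), factorize(m)
    primes = set(fn) | set(fm)
    g = prod(p ** min(fn[p], fm[p]) for p in primes)  # smallest exponent of each prime
    l = prod(p ** max(fn[p], fm[p]) for p in primes)  # largest exponent of each prime
    return g, l

n = 7 * 19 * 23 * 31
m = 5 * 19 * 31 * 41
g, l = gcd_lcm_by_factoring(n, m)   # g is the gcd, l is the lcm
same_as_euclid = (g == gcd(n, m))   # compare with the library gcd
```

Factoring is far more expensive than Euclid's algorithm for large inputs, which is why the latter is preferred in practice.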


Consider p1 = 3, p2 = 5, p3 = 7. Then

lcm(3, 5, 7) = 3 · 5 · 7 = 105

lcm(5, 7) = 5 · 7 = 35 and lcm(3, 35) = 3 · 35 = 105

lcm(3, 5) = 3 · 5 = 15 and lcm(15, 7) = 15 · 7 = 105

Two integers, n and m, are said to be congruent modulo another integer p if (n − m) is an integer multiple of p. We write

$$n \equiv m \pmod{p}$$

The congruence relation has the following properties

(i) If $n \equiv m$ and $a \equiv b$ then $n \cdot a \equiv m \cdot b$ and $n \pm a \equiv m \pm b$. All congruences are modulo p.

(ii) If $n \cdot a \equiv m \cdot b \pmod{p}$ and $n \equiv m \pmod{p}$ and if n is relatively prime to p then $a \equiv b \pmod{p}$.

(iii) $n \equiv m \pmod{p}$ if and only if $m \cdot k \equiv n \cdot k \pmod{p \cdot k}$, for $k \neq 0$.

(iv) If p is relatively prime to q, then $n \equiv m \pmod{p \cdot q}$ if and only if $n \equiv m \pmod{p}$ and $n \equiv m \pmod{q}$.

Theorem. If p is a prime number, then $m^p \equiv m \pmod{p}$ for any integer m (Fermat’s theorem, dated 1640).

Proof: If m is a multiple of p then $m^p \equiv 0 \pmod{p}$ and $m \equiv 0 \pmod{p}$ and the statement of the theorem is trivially satisfied. We therefore consider only the case when m and p are relatively prime; since p is prime this means that $m \bmod p \neq 0$.

First, we observe that the p integers

$$0 \bmod p, \quad m \bmod p, \quad 2m \bmod p, \quad \ldots, \quad (p-1)m \bmod p$$

are all distinct. Indeed, by property (ii), $a \cdot m \equiv b \cdot m \pmod{p}$ is only possible if $a \equiv b \pmod{p}$, and this is not the case for two distinct multipliers in the range 0 to p − 1. Moreover, all the (p − 1) non-zero distinct integers in the sequence above are smaller than p, therefore they must be the integers 1, 2, . . . , (p − 1) in some order. Thus, we have the following congruence relationship

$$m \cdot 2m \cdot 3m \cdots (p-1)m \equiv 1 \cdot 2 \cdot 3 \cdots (p-1) \pmod{p}.$$

We multiply both sides of the above relation by m and obtain

$$m^p\,[1 \cdot 2 \cdot 3 \cdots (p-1)] \equiv m\,[1 \cdot 2 \cdot 3 \cdots (p-1)] \pmod{p}.$$

The product $1 \cdot 2 \cdot 3 \cdots (p-1)$ is relatively prime to p, so by property (ii) it can be cancelled from both sides. This proves Fermat’s theorem.

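Using similar arguments one could prove the following theorem

Theorem. Given any positive integer n let a be relatively prime to n. Then $a^{\varphi(n)} \equiv 1 \pmod{n}$, where $\varphi(n)$ is the number of integers between 1 and n that are relatively prime to n (Euler’s theorem).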
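Both theorems are easy to test numerically with modular exponentiation. The sketch below uses Python's built-in three-argument `pow`; the function names `phi`, `fermat_holds` and `euler_holds` are ours, and `phi` is computed by brute-force counting, which is adequate only for small n.

```python
from math import gcd

def phi(n):
    """Euler's totient: how many k in 1..n are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def fermat_holds(p):
    """Check m^p = m (mod p) for every residue m modulo p."""
    return all(pow(m, p, p) == m % p for m in range(p))

def euler_holds(n):
    """Check a^phi(n) = 1 (mod n) for every a in 1..n relatively prime to n."""
    t = phi(n)
    return all(pow(a, t, n) == 1 % n for a in range(1, n + 1) if gcd(a, n) == 1)

fermat_ok = all(fermat_holds(p) for p in (2, 3, 5, 7, 11, 13))
euler_ok = all(euler_holds(n) for n in range(1, 51))
```

For a prime p we have $\varphi(p) = p - 1$, so Euler's theorem reduces to Fermat's theorem for m relatively prime to p.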


9.2 Euclid’s Algorithm for Integers

The algorithm to compute the greatest common divisor of two integers was most likely known a few hundred years before it appeared, around 300 B.C., in Book 7 of Euclid’s Elements.

Let a, b ∈ I be two positive integers. The quotient, q, and the remainder, r, when a is divided by b are defined by

$$a = q \cdot b + r, \qquad 0 \le r < b.$$

If r = 0 then we say that b divides a.

If gcd(a, b) = c then there are two integers d and e such that a = c · d and b = c · e.

The greatest common divisor of positive integers a and b is the last nonzero remainder, $r_k$, in the sequence of applications of the division algorithm

$$\begin{aligned}
a &= q_2 \cdot b + r_2, & 0 < r_2 < b \\
b &= q_3 \cdot r_2 + r_3, & 0 < r_3 < r_2 \\
r_2 &= q_4 \cdot r_3 + r_4, & 0 < r_4 < r_3 \\
&\ \ \vdots \\
r_{k-3} &= q_{k-1} \cdot r_{k-2} + r_{k-1}, & 0 < r_{k-1} < r_{k-2} \\
r_{k-2} &= q_k \cdot r_{k-1} + r_k, & 0 < r_k < r_{k-1} \\
r_{k-1} &= q_{k+1} \cdot r_k + 0.
\end{aligned}$$

The extended Euclid algorithm allows us to determine integers s and t such that

$$s \cdot a + t \cdot b = \gcd(a, b)$$

We start with:

$$s_0 = 1, \quad t_0 = 0, \quad r_0 = a$$

$$s_1 = 0, \quad t_1 = 1, \quad r_1 = b.$$

For $i \ge 2$ we apply the division algorithm to $r_{i-2}$ and $r_{i-1}$ to obtain $q_i$ and $r_i$

$$r_{i-2} = q_i \cdot r_{i-1} + r_i, \qquad 0 \le r_i < r_{i-1}.$$

Then we compute $s_i$ and $t_i$

$$s_i = s_{i-2} - q_i \cdot s_{i-1}$$

$$t_i = t_{i-2} - q_i \cdot t_{i-1}.$$

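Example. We apply the extended Euclid algorithm to integers 78 and 63

i     s_i     t_i     r_i     q_i
0       1       0      78      −
1       0       1      63      −
2       1      −1      15      1
3      −4       5       3      4
4      21     −26       0      5

The last nonzero remainder is $r_3 = 3$, thus gcd(78, 63) = 3 with s = −4 and t = 5. Indeed, (−4) · 78 + 5 · 63 = −312 + 315 = 3.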
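The recurrences for $s_i$, $t_i$ and $r_i$ translate directly into a loop. The following is a minimal sketch; the function name `extended_euclid` and the `rows` list, which records one tuple $(i, s_i, t_i, r_i, q_i)$ per step so that the table can be printed, are our own choices.

```python
def extended_euclid(a, b):
    """Return (g, s, t, rows) with s*a + t*b == g == gcd(a, b), for positive a and b."""
    s_prev, t_prev, r_prev = 1, 0, a   # row i = 0
    s_cur, t_cur, r_cur = 0, 1, b      # row i = 1
    rows = [(0, s_prev, t_prev, r_prev, None), (1, s_cur, t_cur, r_cur, None)]
    i = 2
    while r_cur != 0:
        q, r = divmod(r_prev, r_cur)                     # r_{i-2} = q_i * r_{i-1} + r_i
        s_prev, s_cur = s_cur, s_prev - q * s_cur        # s_i = s_{i-2} - q_i * s_{i-1}
        t_prev, t_cur = t_cur, t_prev - q * t_cur        # t_i = t_{i-2} - q_i * t_{i-1}
        r_prev, r_cur = r_cur, r
        rows.append((i, s_cur, t_cur, r_cur, q))
        i += 1
    # The loop stops when the remainder is 0; the previous row holds the gcd.
    return r_prev, s_prev, t_prev, rows

g, s, t, rows = extended_euclid(78, 63)   # g = 3, s = -4, t = 5
```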


9.3 Euclid’s Algorithm for Polynomials

9.4 The Chinese Remainder Theorem and its Applications

Consider the following problem posed sometime during the 4-th century AD in the Chinese text Sun Tsu Suan-Ching [123]: “There are certain things whose number is unknown. Repeatedly divided by 3 the remainder is 2, by 5 the remainder is 3, and by 7 the remainder is 2. What is the number?” Similar problems appear in the Hindu text Brahma-Sphuta-Siddhanta in the 6-th century AD.

The more general formulation of these problems is: find the integer n that satisfies simultaneously the congruences:

$$\begin{aligned}
n &\equiv r_1 \pmod{p_1} \\
n &\equiv r_2 \pmod{p_2} \\
&\ \ \vdots \\
n &\equiv r_i \pmod{p_i} \\
&\ \ \vdots \\
n &\equiv r_q \pmod{p_q}
\end{aligned}$$

We assume that the moduli $p_k$, $1 \le k \le q$, are distinct prime numbers.

First, we prove that a solution to this problem exists and it is unique modulo the least common multiple of the integers $p_i$, $1 \le i \le q$. Then, we present an algorithm to solve the problem.

Theorem. (The Chinese Remainder Theorem) Let $p_1, p_2, \ldots, p_i, \ldots, p_q$ be positive integers that are pairwise coprime, i.e. $\gcd(p_i, p_j) = 1$ if $i \neq j$. Let $P = p_1 \cdot p_2 \cdots p_i \cdots p_q$ and let $r_1, r_2, \ldots, r_k, \ldots, r_q$ be integers. Then there is exactly one integer n such that

$$0 \le n < P \qquad \text{and} \qquad n \equiv r_i \pmod{p_i}, \quad 1 \le i \le q.$$

Proof: Let $Q_i$, $1 \le i \le q$, be integers such that

$$Q_i \equiv 1 \pmod{p_i} \qquad \text{and} \qquad Q_i \equiv 0 \pmod{p_j}, \quad j \neq i.$$

Such integers exist because $p_i$ and $P/p_i$ are relatively prime; therefore, according to Euler’s theorem, we can take

$$Q_i = (P/p_i)^{\varphi(p_i)}.$$

The integer

$$n = (r_1 Q_1 + r_2 Q_2 + \ldots + r_i Q_i + \ldots + r_q Q_q) \bmod P$$

satisfies $n \equiv r_i \pmod{p_i}$ for all $p_i$, $1 \le i \le q$. We leave to the reader the proof of the uniqueness of the solution.

The algorithm for solving the problem is:

Step I - Compute the product of the moduli

$$P = p_1 \cdot p_2 \cdots p_k \cdots p_q.$$

Step II - Compute the integers $Q_k$, $1 \le k \le q$, the inverse modulo $p_k$ of each of the quantities $P/p_k$, using the following expression

$$Q_k = (P/p_k)^{p_k - 2} \pmod{p_k}$$

Step III - The solution is

$$n = (P/p_1) \cdot r_1 \cdot Q_1 + (P/p_2) \cdot r_2 \cdot Q_2 + \ldots + (P/p_k) \cdot r_k \cdot Q_k + \ldots + (P/p_q) \cdot r_q \cdot Q_q \pmod{P}.$$

The calculation of quantities $Q_k$ in Step II is justified by the following arguments. According to Fermat’s theorem $a^{p_k - 1} \equiv 1 \pmod{p_k}$ for any integer a that is not a multiple of the prime $p_k$. This can be written as

$$a \cdot a^{p_k - 2} \equiv 1 \pmod{p_k} \implies a^{-1} \equiv a^{p_k - 2} \pmod{p_k}.$$

By definition

$$Q_k = (P/p_k)^{-1} \pmod{p_k}$$

Thus

$$Q_k = (P/p_k)^{p_k - 2} \pmod{p_k}$$

Let us now apply this algorithm to solve the problem posed at the beginning of this section:

$$n \equiv 2 \pmod{3}, \qquad n \equiv 3 \pmod{5}, \qquad n \equiv 2 \pmod{7}$$

In this case P = 3 · 5 · 7 = 105 and we have to compute $Q_i = (P/p_i)^{p_i - 2} \bmod p_i$. The three integers are:

$$Q_1 = (105/3)^{3-2} = 35^1 = 11 \cdot 3 + 2 \equiv 2 \pmod{3}$$

$$Q_2 = (105/5)^{5-2} = 21^3 = 9261 = 1852 \cdot 5 + 1 \equiv 1 \pmod{5}$$

$$Q_3 = (105/7)^{7-2} = 15^5 = 759375 = 108482 \cdot 7 + 1 \equiv 1 \pmod{7}$$

The result is

$$n = 35 \cdot 2 \cdot 2 + 21 \cdot 3 \cdot 1 + 15 \cdot 2 \cdot 1 = 233 = 2 \cdot 105 + 23 \equiv 23 \pmod{105}$$

Indeed

$$23 = 7 \cdot 3 + 2 = 4 \cdot 5 + 3 = 3 \cdot 7 + 2.$$
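Steps I to III can be sketched as a short function. The sketch assumes, as the algorithm does, that the moduli are distinct primes, so that the inverse of $P/p_k$ modulo $p_k$ can be obtained from Fermat's theorem; the name `crt_prime_moduli` is ours.

```python
from math import prod

def crt_prime_moduli(residues, moduli):
    """Solve n = r_k (mod p_k) for distinct prime moduli p_k; returns n with 0 <= n < P."""
    P = prod(moduli)                 # Step I: product of the moduli
    n = 0
    for r, p in zip(residues, moduli):
        M = P // p                   # the quantity P / p_k
        Q = pow(M, p - 2, p)         # Step II: inverse of P / p_k modulo p_k, via Fermat
        n += M * r * Q               # Step III: one term of the sum
    return n % P

n = crt_prime_moduli([2, 3, 2], [3, 5, 7])   # the problem posed at the start of the section
```

For moduli that are pairwise relatively prime but not prime, the exponent $p_k - 2$ no longer yields the inverse; one would use the extended Euclid algorithm instead.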

9.5 Computer Arithmetic for Large Integers

The number of bits in a computer word limits the range of integers that can be manipulated by a particular processor, as well as the accuracy of a floating point number. If we have n-bit internal registers, then we can only represent $2^n$ distinct integers, for example those in the range $0 \le i \le 2^n - 1$. When signed integers are needed, about half of this range is allocated to positive integers and half to negative integers. We use one bit for the sign, and the two’s complement to represent a negative integer. The largest positive integer that can be represented by a 32-bit processor is $2^{31} - 1 = 2{,}147{,}483{,}647$, a very large number, but certainly smaller than the US deficit expressed in dollars, the number of atoms in the universe, or the time in seconds since the extinction of dinosaurs.

The problem we address now is how to do arithmetic on large integers given the physical limitation imposed by the architecture of a processor. Number theory provides an elegant solution to the problem. Recall that the “modulus m” relation, with m a positive integer, partitions the set of positive integers into m equivalence classes, namely integers whose residues, or remainders, when divided by m are 0, 1, 2, . . . , m − 1. For example, 113 mod 3 = 1234567892 mod 3 = 2.

Now consider a set of “moduli” that contain no common factors, $p_1, p_2, \ldots, p_q$. Given an integer n we can divide n by these moduli and compute the corresponding residues:

$$r_1 = n \bmod p_1, \quad r_2 = n \bmod p_2, \quad \ldots, \quad r_{q-1} = n \bmod p_{q-1}, \quad r_q = n \bmod p_q.$$

This is a reversible process as long as $0 \le n < p_1 \cdot p_2 \cdots p_q$: we can always compute n given the set $(r_1, r_2, \ldots, r_{q-1}, r_q)$ using the algorithm in Section 9.4. Thus, the set $(r_1, r_2, \ldots, r_{q-1}, r_q)$ can be thought of as another internal representation of the positive integer n. It turns out that this internal representation allows us to perform addition, subtraction, and multiplication of large, positive integers in a very convenient manner. It is easy to see that if we have two integers a and b with the internal representations $a = (a_1, a_2, \ldots, a_{q-1}, a_q)$ and $b = (b_1, b_2, \ldots, b_{q-1}, b_q)$ then:

$$a + b = ((a_1 + b_1) \bmod p_1,\ (a_2 + b_2) \bmod p_2,\ \ldots,\ (a_{q-1} + b_{q-1}) \bmod p_{q-1},\ (a_q + b_q) \bmod p_q)$$

$$a - b = ((a_1 - b_1) \bmod p_1,\ (a_2 - b_2) \bmod p_2,\ \ldots,\ (a_{q-1} - b_{q-1}) \bmod p_{q-1},\ (a_q - b_q) \bmod p_q)$$

$$a \cdot b = ((a_1 \cdot b_1) \bmod p_1,\ (a_2 \cdot b_2) \bmod p_2,\ \ldots,\ (a_{q-1} \cdot b_{q-1}) \bmod p_{q-1},\ (a_q \cdot b_q) \bmod p_q)$$
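The componentwise rules are easy to try out. In the sketch below the moduli (3, 5, 7, 11) are an arbitrary choice of ours, giving a range of 0 to 1154; all function names are hypothetical. The conversion back to positional form uses `pow(M, -1, p)` for the modular inverse, which assumes Python 3.8 or later.

```python
from math import prod

MODULI = (3, 5, 7, 11)   # hypothetical pairwise relatively prime moduli; product is 1155

def to_residues(n, moduli=MODULI):
    """Positional -> modular representation."""
    return tuple(n % p for p in moduli)

def add(x, y, moduli=MODULI):
    return tuple((a + b) % p for a, b, p in zip(x, y, moduli))

def sub(x, y, moduli=MODULI):
    return tuple((a - b) % p for a, b, p in zip(x, y, moduli))

def mul(x, y, moduli=MODULI):
    return tuple((a * b) % p for a, b, p in zip(x, y, moduli))

def from_residues(x, moduli=MODULI):
    """Modular -> positional representation, by the Chinese Remainder Theorem."""
    P = prod(moduli)
    return sum((P // p) * r * pow(P // p, -1, p) for r, p in zip(x, moduli)) % P

a, b = 29, 17
ra, rb = to_residues(a), to_residues(b)
total = from_residues(add(ra, rb))       # a + b
difference = from_residues(sub(ra, rb))  # a - b
product_ab = from_residues(mul(ra, rb))  # a * b
```

Each component is processed independently of the others, so no carry has to propagate between components. The results are correct only while they stay within the range 0 to P − 1.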

The major advantage of the modular representation of integers is that it leads to relatively simple hardware implementation. You may recall from Section 5 that the classical one-bit full adder must include a circuit to compute the CarryOut and must deal with the CarryIn. This is no longer necessary because the modular representation is carry-free.

There are also some disadvantages of using modular arithmetic. For example, given the modular representation, it is difficult to test if a number is positive or negative, or to perform integer division. It is also difficult to test if a > b. The idea of using modular arithmetic for the hardware implementation is attributed by Knuth [68] and others [113] to A. Svoboda and M. Valach.


Knuth provides a comprehensive discussion of all the topics covered in this section [67, 68]. Vanstone and van Oorschot include a brief presentation of the Euclid algorithms and the Chinese Remainder Theorem [120].


[Figure 72 diagram: positional representation of large integers → conversion from positional to modular representation → modular representation of large integers → arithmetic operations with large integers using modular representation → modular representation of large integers → conversion from modular to positional representation → positional representation of large integers]

Figure 72: The schematics of performing integer arithmetic on large integers based upon number theory. The positional representation of integers is first converted to a modular representation, then the arithmetic operations are performed using this modular representation, and finally the results are converted back to positional representation.


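(1) Given three integers a, b, c prove that the following identities are true

$$a \cdot b = \gcd(a, b) \cdot \mathrm{lcm}(a, b) \quad \text{if } a, b \ge 0$$

$$\gcd(a, b) \cdot c = \gcd(a \cdot c, b \cdot c) \quad \text{if } c \ge 0$$

$$\mathrm{lcm}(a, b) \cdot c = \mathrm{lcm}(a \cdot c, b \cdot c) \quad \text{if } c \ge 0$$

$$\gcd[a, \gcd(b, c)] = \gcd[b, \gcd(a, c)] = \gcd[\gcd(a, b), c]$$

$$\mathrm{lcm}[a, \mathrm{lcm}(b, c)] = \mathrm{lcm}[b, \mathrm{lcm}(a, c)] = \mathrm{lcm}[\mathrm{lcm}(a, b), c]$$

$$\gcd[\mathrm{lcm}(a, b), \mathrm{lcm}(a, c)] = \mathrm{lcm}[a, \gcd(b, c)]$$

$$\mathrm{lcm}[\gcd(a, b), \gcd(a, c)] = \gcd[a, \mathrm{lcm}(b, c)]$$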

(2) Prove that if we have two integers a and b with the internal representations $a = (a_1, a_2, \ldots, a_{r-1}, a_r)$ and $b = (b_1, b_2, \ldots, b_{r-1}, b_r)$ then the following three equations are true:

$$a + b = ((a_1 + b_1) \bmod m_1,\ (a_2 + b_2) \bmod m_2,\ \ldots,\ (a_{r-1} + b_{r-1}) \bmod m_{r-1},\ (a_r + b_r) \bmod m_r)$$

$$a - b = ((a_1 - b_1) \bmod m_1,\ (a_2 - b_2) \bmod m_2,\ \ldots,\ (a_{r-1} - b_{r-1}) \bmod m_{r-1},\ (a_r - b_r) \bmod m_r)$$

$$a \cdot b = ((a_1 \cdot b_1) \bmod m_1,\ (a_2 \cdot b_2) \bmod m_2,\ \ldots,\ (a_{r-1} \cdot b_{r-1}) \bmod m_{r-1},\ (a_r \cdot b_r) \bmod m_r)$$

Hint: use the fact that $a \bmod m_i = b \bmod m_i$ for all i $\iff a = b$, for $0 \le a, b < m_1 \cdot m_2 \cdots m_r$.


10 Appendix II: Walsh-Hadamard Transform

Hadamard matrices and the Walsh-Hadamard transform are used to describe quantum circuits and to define special classes of linear codes. In this section we define the Hadamard matrix, discuss its properties and introduce the Hadamard transform.

10.1 Hadamard Matrices

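A Hadamard matrix of order n, $H_n = [h_{ij}]$, $1 \le i \le n$, $1 \le j \le n$, is an $n \times n$ matrix with $h_{ij}$ either +1 or −1. The row vectors of $H_n$, $h_1, h_2, \ldots, h_n$, are pairwise orthogonal, $h_k \cdot h_l = 0$ for all $k \neq l$, $1 \le k, l \le n$. If $H_n$ is a Hadamard matrix of order n then a Hadamard matrix of order 2n is

$$H_{2n} = \begin{pmatrix} H_n & H_n \\ H_n & -H_n \end{pmatrix}$$

A Hadamard matrix $H_n$ has the following properties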

(i) The product of a Hadamard matrix and its transpose is

$$H_n H_n^T = n I_n, \qquad H_n^T H_n = n I_n$$

(ii) The exchange of rows or columns transforms one Hadamard matrix into another one.

(iii) The multiplication of rows or columns by −1 transforms one Hadamard matrix into another one.

(iv) If $n = 2^{q-1}$ then $H_{2n}$ can be expressed as the tensor product of q 2 × 2 matrices

$$H_{2n} = H_2 \otimes H_2 \otimes \ldots \otimes H_2 \otimes H_2$$

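Proof of (i): $H_n$ and $H_n^T$ can be written as

$$H_n = \begin{pmatrix} h_1 \\ h_2 \\ \vdots \\ h_n \end{pmatrix}, \qquad H_n^T = \begin{pmatrix} h_1^T & h_2^T & \ldots & h_n^T \end{pmatrix}$$

Then

$$H_n H_n^T = \begin{pmatrix} h_1 \cdot h_1 & h_1 \cdot h_2 & \ldots & h_1 \cdot h_n \\ h_2 \cdot h_1 & h_2 \cdot h_2 & \ldots & h_2 \cdot h_n \\ \vdots & \vdots & \ddots & \vdots \\ h_n \cdot h_1 & h_n \cdot h_2 & \ldots & h_n \cdot h_n \end{pmatrix} = \begin{pmatrix} n & 0 & \ldots & 0 \\ 0 & n & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & n \end{pmatrix} = n I_n.$$

Now multiply the previous equation on the left with $H_n^{-1}$ and then on the right with $H_n$

$$H_n^{-1} H_n H_n^T = n H_n^{-1} I_n \implies H_n^T = n H_n^{-1} \implies H_n^T H_n = n H_n^{-1} H_n \implies H_n^T H_n = n I_n.$$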

Properties (ii), (iii), and (iv) follow immediately.

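For example, given that $H_1 = [1]$, then the Hadamard matrices of order 2, 4 and 8 are:

$$H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \qquad H_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}$$

$$H_8 = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\
1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\
1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
\end{pmatrix}$$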

Consider all the binary q-tuples, $b_1, b_2, \ldots, b_{2^q}$. The proper ordering, $\pi_q$, of binary q-tuples is defined recursively as

$$\pi_1 = [0, 1]$$

If $\pi_i = [b_1, b_2, \ldots, b_{2^i}]$ then $\pi_{i+1} = [b_1 0, b_2 0, \ldots, b_{2^i} 0, b_1 1, b_2 1, \ldots, b_{2^i} 1]$, ∀i ∈ {1, q − 1}.

For example, given that $\pi_1 = [0, 1]$

$$\pi_2 = [00, 10, 01, 11]$$

$$\pi_3 = [000, 100, 010, 110, 001, 101, 011, 111]$$

$$\pi_4 = [0000, 1000, 0100, 1100, 0010, 1010, 0110, 1110, 0001, 1001, 0101, 1101, 0011, 1011, 0111, 1111]$$

Let $n = 2^q$ and let the n q-tuples under the proper order be

$$\pi_q = [u_0, u_1, \ldots, u_{n-1}].$$

Let us enforce the convention that the leftmost bit of $u_i$ is the least significant bit of the integer represented by $u_i$ as a q-tuple. For example, consider the following mappings of 4-tuples to integers:

0000 → 0, 1000 → 1, 0100 → 2, 1100 → 3, 0010 → 4, 1010 → 5, . . . , 0111 → 14, 1111 → 15.

The matrix $H = [h_{ij}]$ with

$$h_{ij} = (-1)^{u_i \cdot u_j} \qquad \forall (i, j) \in \{0, n-1\}$$

where $u_i$ and $u_j$ are members of

$$\pi_q = [u_0, u_1, \ldots, u_{n-1}]$$

is the Hadamard matrix of order $n = 2^q$. Observe that the rows and the columns of the matrix H are numbered from 0 to n − 1.

As an example consider the case q = 3. It is easy to see that the matrix elements of the first row, $h_{0i}$, and the first column, $h_{i0}$, are all +1 because $u_0 \cdot u_i = u_i \cdot u_0 = 0$, ∀i ∈ {0, 7}, and $(-1)^0 = +1$. It is also easy to see that $h_{ii} = -1$ when the weight of i (the number of 1’s in the binary representation of integer i) is odd and $h_{ii} = +1$ when the weight of i is even. Individual calculation of $h_{ij}$, e.g., $h_{32} = (-1)^{(110) \cdot (010)} = (-1)^1 = -1$, is trivial and shows that $H = H_8$.
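The two descriptions of the Hadamard matrix, the recursive doubling rule and the formula $h_{ij} = (-1)^{u_i \cdot u_j}$, can be compared numerically. The sketch below is ours, with hypothetical function names. It uses the observation that the inner product $u_i \cdot u_j$ counts the bit positions that are 1 in both i and j, so it can be computed as the number of 1's in the bitwise AND of i and j regardless of the order in which the bits are written.

```python
def sylvester(q):
    """Hadamard matrix of order 2**q built by repeated doubling, starting from [1]."""
    H = [[1]]
    for _ in range(q):
        # New matrix is [[H, H], [H, -H]].
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def by_inner_product(q):
    """Entry (i, j) is (-1)**(u_i . u_j), with rows and columns numbered from 0."""
    n = 2 ** q
    return [[(-1) ** bin(i & j).count("1") for j in range(n)] for i in range(n)]

def is_hadamard(H):
    """Check that all entries are +1 or -1 and that H times its transpose is n I."""
    n = len(H)
    entries_ok = all(x in (1, -1) for row in H for x in row)
    rows_ok = all(
        sum(H[i][k] * H[j][k] for k in range(n)) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )
    return entries_ok and rows_ok

same = sylvester(3) == by_inner_product(3)   # compare the two constructions for q = 3
```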


Let $c = (c_0 c_1 c_2 \ldots c_i \ldots c_{2^q - 1})$ be a binary $2^q$-tuple, $c_i \in \{0, 1\}$, and $B_q$ a $q \times 2^q$ matrix whose columns are all $2^q$ possible q-tuples $b_i$. We define c(b) to be the component of c selected by the q-vector b according to the matrix $B_q$: if b is the i-th column of $B_q$ then c(b) is the i-th element of c. c(b) can be either 0 or 1.

For example, let q = 3 and $B_3$ be given by

$$B_3 = \begin{pmatrix}
1 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 1 & 1 & 0
\end{pmatrix}$$

Let c = (0 1 1 1 1 0 0 1). Then c(1 1 0) = 1 because (1 1 0) is the 4-th column of $B_3$ and the 4-th element of c is 1. Similarly c(1 1 1) = 0 because (1 1 1) is the 7-th column of $B_3$ and the 7-th element of c is 0.

We can define a vector R(c) whose components, $(-1)^{c(b)}$, can be either +1 or −1. In our example

R(c) = (+1 −1 −1 −1 −1 +1 +1 −1).

Indeed:

$$\begin{aligned}
R_1(c) &= (-1)^{c(100)} = (-1)^0 = +1 \\
R_2(c) &= (-1)^{c(010)} = (-1)^1 = -1 \\
R_3(c) &= (-1)^{c(001)} = (-1)^1 = -1 \\
R_4(c) &= (-1)^{c(110)} = (-1)^1 = -1 \\
R_5(c) &= (-1)^{c(011)} = (-1)^1 = -1 \\
R_6(c) &= (-1)^{c(101)} = (-1)^0 = +1 \\
R_7(c) &= (-1)^{c(111)} = (-1)^0 = +1 \\
R_8(c) &= (-1)^{c(000)} = (-1)^1 = -1
\end{aligned}$$

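Let d be a binary q-tuple and let c be a binary $2^q$-tuple. Let R(c), with components $(-1)^{c(b)}$, be a $2^q$-tuple with entries either +1 or −1 as defined earlier. Then the Walsh-Hadamard transform of R(c), which we denote by $\hat{R}(d)$, is

$$\hat{R}(d) = \sum_{b \in B_q} (-1)^{d \cdot b}\,(-1)^{c(b)}$$

or

$$\hat{R}(d) = \sum_{b \in B_q} (-1)^{d \cdot b + c(b)}.$$

For example, let q = 3, $d = (1\ 1\ 1)^T$ and c = (0 1 1 1 1 0 0 1). Then

$$\hat{R}(1\ 1\ 1) = \sum_{b \in B_3} (-1)^{(1\,1\,1) \cdot b + c(b)}$$

$$\begin{aligned}
\hat{R}(1\ 1\ 1) ={}& (-1)^{(1\,1\,1) \cdot (1\,0\,0) + c(1\,0\,0)} + (-1)^{(1\,1\,1) \cdot (0\,1\,0) + c(0\,1\,0)} + (-1)^{(1\,1\,1) \cdot (0\,0\,1) + c(0\,0\,1)} \\
&+ (-1)^{(1\,1\,1) \cdot (1\,1\,0) + c(1\,1\,0)} + (-1)^{(1\,1\,1) \cdot (0\,1\,1) + c(0\,1\,1)} + (-1)^{(1\,1\,1) \cdot (1\,0\,1) + c(1\,0\,1)} \\
&+ (-1)^{(1\,1\,1) \cdot (1\,1\,1) + c(1\,1\,1)} + (-1)^{(1\,1\,1) \cdot (0\,0\,0) + c(0\,0\,0)}
\end{aligned}$$

With the inner products reduced modulo 2 this is

$$(-1)^{1+0} + (-1)^{1+1} + (-1)^{1+1} + (-1)^{0+1} + (-1)^{0+1} + (-1)^{0+0} + (-1)^{1+0} + (-1)^{0+1} = -2.$$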

Consider the binary vector

$$t = c + \sum_{i=1}^{q} d_i v_i$$

with $d = (d_1 d_2 \ldots d_q)^T$ a binary q-tuple and $v_i$ the i-th row of $B_q$. Then the transform $\hat{R}(d)$ is the number of 0’s minus the number of 1’s in t.
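The definition of the transform and its interpretation as a count of 0's minus 1's can both be checked on the running example with q = 3. The sketch below is ours: the list `B3` holds the columns of $B_3$ in order, and the function names `wht` and `zeros_minus_ones` are hypothetical.

```python
from itertools import product

# Columns of B_3, in the order in which they appear in the matrix.
B3 = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0),
      (0, 1, 1), (1, 0, 1), (1, 1, 1), (0, 0, 0)]
c = (0, 1, 1, 1, 1, 0, 0, 1)

def dot(d, b):
    """Ordinary inner product of two binary tuples."""
    return sum(x * y for x, y in zip(d, b))

def wht(d, c, columns):
    """Sum over all columns b of (-1)**(d.b + c(b))."""
    return sum((-1) ** (dot(d, b) + cb) for b, cb in zip(columns, c))

def zeros_minus_ones(d, c, columns):
    """Number of 0's minus number of 1's in t = c + sum_i d_i v_i (addition modulo 2)."""
    t = [(cb + dot(d, b)) % 2 for b, cb in zip(columns, c)]
    return t.count(0) - t.count(1)

R = [(-1) ** cb for cb in c]                 # the vector R(c)
value = wht((1, 1, 1), c, B3)                # transform at d = (1 1 1)
agree = all(wht(d, c, B3) == zeros_minus_ones(d, c, B3)
            for d in product((0, 1), repeat=3))
```

The component of t at column b is $c(b) + d \cdot b$ modulo 2, which is why each 0 in t contributes +1 to the sum and each 1 contributes −1.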

10.2 The Fast Hadamard Transform

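If q is a positive integer and

$$M^{(i)}_{2^q} = I_{2^{q-i}} \otimes H_2 \otimes I_{2^{i-1}}, \qquad 1 \le i \le q,$$

then

$$H(2^q) = M^{(1)}_{2^q} M^{(2)}_{2^q} \cdots M^{(q)}_{2^q}$$

where $H(2^q)$ denotes the Hadamard matrix of order $2^q$.

Proof: By induction. For q = 1 we need to prove that $H(2) = M^{(1)}_2$. But by definition $M^{(1)}_2 = I_{2^{1-1}} \otimes H_2 \otimes I_{2^{1-1}} = H_2$.

Assume that this is true for q = k and consider the case q = k + 1. For q = k we have:

$$H(2^k) = M^{(1)}_{2^k} M^{(2)}_{2^k} \cdots M^{(k)}_{2^k}$$

Now for q = k + 1 we have to prove that:

$$H(2^{k+1}) = M^{(1)}_{2^{k+1}} M^{(2)}_{2^{k+1}} \cdots M^{(k)}_{2^{k+1}} M^{(k+1)}_{2^{k+1}}$$

But for $1 \le i \le k$

$$M^{(i)}_{2^{k+1}} = I_{2^{k+1-i}} \otimes H_2 \otimes I_{2^{i-1}} = I_2 \otimes I_{2^{k-i}} \otimes H_2 \otimes I_{2^{i-1}} = I_2 \otimes M^{(i)}_{2^k}$$

Thus the product on the right-hand side becomes

$$(I_2 \otimes M^{(1)}_{2^k})(I_2 \otimes M^{(2)}_{2^k}) \cdots (I_2 \otimes M^{(k)}_{2^k})\, M^{(k+1)}_{2^{k+1}}$$

We know from Chapter 3 that the tensor product of matrices V, W, X, Y has the following property: $(V \otimes W)(X \otimes Y) = VX \otimes WY$. Applying this property repeatedly and then substituting $H(2^k)$ for $M^{(1)}_{2^k} M^{(2)}_{2^k} \cdots M^{(k)}_{2^k}$ we find that the product is equal to

$$(I_2 \otimes M^{(1)}_{2^k} M^{(2)}_{2^k} \cdots M^{(k)}_{2^k})\, M^{(k+1)}_{2^{k+1}} = (I_2 \otimes H(2^k))\, M^{(k+1)}_{2^{k+1}}.$$

But from the definition of $M^{(k+1)}_{2^{k+1}}$ we see that

$$M^{(k+1)}_{2^{k+1}} = I_1 \otimes H_2 \otimes I_{2^k} = H_2 \otimes I_{2^k}.$$

Thus the product is

$$(I_2 \otimes H(2^k))(H_2 \otimes I_{2^k}) = (I_2 H_2) \otimes (H(2^k) I_{2^k}) = H_2 \otimes H(2^k) = H(2^{k+1}).$$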

The fast Hadamard transform allows a speedup of

$$S_{FHT/HT} = \frac{2^{q+1} - 1}{3q}$$

versus the direct Walsh-Hadamard transform. For example, for q = 16 we have $S_{FHT/HT} = (2^{17} - 1)/48 \approx 131{,}000/48 \approx 2731$.

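For example, given R(c) = (+1 −1 −1 −1 −1 +1 +1 −1) let us compute the transform $\hat{R} = RH$ with $H = M^{(1)}_8 M^{(2)}_8 M^{(3)}_8$. First we calculate $M^{(1)}_8$, $M^{(2)}_8$, and $M^{(3)}_8$ as follows:

$$M^{(1)}_8 = I_4 \otimes H_2 \otimes I_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

Thus

$$M^{(1)}_8 = \begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1
\end{pmatrix}$$

Then

$$M^{(2)}_8 = I_2 \otimes H_2 \otimes I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

Thus

$$M^{(2)}_8 = \begin{pmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & -1
\end{pmatrix}$$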

Finally

$$M^{(3)}_8 = I_1 \otimes H_2 \otimes I_4 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Thus

$$M^{(3)}_8 = \begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & -1
\end{pmatrix}$$

Then

$$R M^{(1)}_8 = (+1\ \ {-1}\ \ {-1}\ \ {-1}\ \ {-1}\ \ {+1}\ \ {+1}\ \ {-1})\, M^{(1)}_8$$

or

$$R M^{(1)}_8 = (0\ \ {+2}\ \ {-2}\ \ 0\ \ 0\ \ {-2}\ \ 0\ \ {+2}).$$

Next

$$(R M^{(1)}_8) M^{(2)}_8 = (0\ \ {+2}\ \ {-2}\ \ 0\ \ 0\ \ {-2}\ \ 0\ \ {+2})\, M^{(2)}_8$$

or

$$(R M^{(1)}_8) M^{(2)}_8 = ({-2}\ \ {+2}\ \ {+2}\ \ {+2}\ \ 0\ \ 0\ \ 0\ \ {-4}).$$

Finally

$$(R M^{(1)}_8 M^{(2)}_8) M^{(3)}_8 = ({-2}\ \ {+2}\ \ {+2}\ \ {+2}\ \ 0\ \ 0\ \ 0\ \ {-4})\, M^{(3)}_8$$

or

$$\hat{R} = (R M^{(1)}_8 M^{(2)}_8) M^{(3)}_8 = ({-2}\ \ {+2}\ \ {+2}\ \ {-2}\ \ {-2}\ \ {+2}\ \ {+2}\ \ {+6}).$$
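Each factor $M^{(i)}_{2^q}$ combines the entries of the vector in pairs that are $2^{i-1}$ positions apart, replacing a pair (x, y) by (x + y, x − y). This gives the usual in-place formulation of the fast Hadamard transform, sketched below. The function names `fht` and `direct` are ours; `direct` multiplies by the full matrix with entries $(-1)^{u_i \cdot u_j}$ and is included only as a reference for comparison.

```python
def fht(vector):
    """Fast Hadamard transform of a row vector of length 2**q; returns vector * H."""
    v = list(vector)
    n = len(v)
    h = 1
    while h < n:                               # stage i uses pairs h = 2**(i-1) apart
        for start in range(0, n, 2 * h):
            for k in range(start, start + h):
                x, y = v[k], v[k + h]
                v[k], v[k + h] = x + y, x - y  # one application of H_2
        h *= 2
    return v

def direct(vector):
    """Reference: multiply by H entry by entry, with H[i][j] = (-1)**popcount(i & j)."""
    n = len(vector)
    return [sum(vector[i] * (-1) ** bin(i & j).count("1") for i in range(n))
            for j in range(n)]

R = [+1, -1, -1, -1, -1, +1, +1, -1]
fast_result = fht(R)
reference = direct(R)
```

The fast version performs q stages of $2^q$ additions or subtractions each, whereas the direct product touches all $2^q \times 2^q$ matrix entries.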

10.3 Further Readings


(1) Show that a direct computation of the Walsh-Hadamard transform requires $2^q(2^{q+1} - 1)$ operations. Hint: multiplication of R with a column of H requires $2^q$ multiplications and $2^q - 1$ additions.

(2) Prove that $\hat{R}(d)$, the Walsh-Hadamard transform of R(c), is the number of 0’s minus the number of 1’s in $t = c + \sum_{i=1}^{q} d_i v_i$.

(3) Show that a computation of the fast Hadamard transform requires $3 \times 2^q \times q$ operations. Hint: to compute the product $R M^{(1)}_{2^q}$ we need 2 multiplications and one addition for each of the $2^q$ columns.


References

[1] E. S. Abers. Quantum Mechanics. Prentice Hall, Upper Saddle River, NJ, ISBN 0-13-146100-1, 2003.

[2] A. D. Aczel. Entanglement: The Greatest Mystery in Physics. Four Walls Eight Windows Publishing House, New York, N.Y., ISBN 1-56858-232-3, 2001.

[3] M. Agrawal, N. Kayal, and N. Saxena. PRIMES is in P. http://www.cse.iitk.ac.in, 2002.

[4] D. Z. Albert. Quantum Mechanics and Experience. Harvard University Press, Cambridge, Mass, ISBN 0-674-74112-9, 1992.

[5] A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator, J. Smolin, and H. Weinfurter. Elementary Gates for Quantum Computation. Preprint, http://arxiv.org/archive/quant-ph/9503016 v1, March 1995.

[6] E. Bernstein and U. Vazirani. Quantum Complexity Theory. SIAM J. Computing 26, 1411–1473, 1997.

[7] J. S. Bell. On the Einstein-Podolsky-Rosen Paradox. Physics, 1:195–200, 1964.

[8] J. S. Bell. Speakable and Unspeakable in Quantum Mechanics: Collected Papers on Quantum Philosophy. Cambridge University Press, Cambridge, 1987.

[9] P. Benioff. The Computer as a Physical System. A Microscopic Quantum Mechanical Hamiltonian Model of Computers as Represented by Turing Machines. J. Stat. Phys., 22:563–591, 1980.

[10] P. Benioff. Quantum Mechanical Models of Turing Machines that Dissipate no Energy. Physical Review Letters, 48:1581–1584, 1982.

[11] P. Benioff. Quantum Mechanical Models of Turing Machines. J. Stat. Phys., 29:515–546, 1982.

[12] C. H. Bennett. Logical Reversibility of Computation. IBM Journal of Research and Development, 17:525–535, 1973.

[13] C. H. Bennett. The Thermodynamics of Computation – A Review. International Journal of Theoretical Physics, 21:905–928, 1982.

[14] C. H. Bennett and G. Brassard. Quantum Cryptography: Public Key Distribution and Coin Tossing. Proc. IEEE Conf. on Computers, Systems, and Signal Processing, IEEE Press, 175–179, 1984.

[15] C. H. Bennett, G. Brassard, C. Crepeau, R. Jozsa, A. Peres, and W. K. Wootters. Teleporting an Unknown State via Dual Classical and Einstein-Podolsky-Rosen Channels. Physical Review Letters, 70(13):1895–1899, 1993.

[16] C. H. Bennett. Quantum Information and Computation. Physics Today, 24–30, October 1995.


[17] C. H. Bennett and P.W. Shor. Quantum Information Theory. IEEE Trans. on Information Theory, 44(6):2724–2742, 1998.

[18] C. H. Bennett, P. W. Shor, J. A. Smolin, and A. V. Thapliyal. Entanglement-Assisted Capacity of a Quantum Channel and the Reverse Shannon Theorem. IEEE Trans. on Information Theory, 48(10):2637–2655, 2002.

[19] C. H. Bennett, T. Mor, and J. A. Smolin. The Parity Bit in Quantum Cryptography. arXiv.quant-ph/9604040, July 5, 2002.

[20] E. Bernstein and U. Vazirani. Quantum Complexity Theory. Proc. 25th ACM Symp. of Theory of Computing, ACM New York, 11–20, 1993.

[21] A. Berthiaume and G. Brassard. The Quantum Challenge to Structural Complexity Theory. Proc. 7-th Annual Conf. on Structure in Complexity Theory, IEEE Press, Los Alamitos Ca, 132–137, 1992.

[22] A. Berthiaume and G. Brassard. Oracle Quantum Computing. Proc. Workshop on Physics of Computation, IEEE Press, Los Alamitos Ca, 195–199, 1992.

[23] G. Birkhoff and S. Mac Lane. A Survey of Modern Algebra. Macmillan Publishing, New York, N.Y., 1965.

[24] M. Born. The Statistical Interpretations of Quantum Mechanics. Nobel Lecture, December 11, 1954. From Nobel Lectures, Physics 1942-1962, 256–267. http://www.nobel.se/physics/laureates/1954/born-lecture.html.

[25] D. Bouwmeester, A. Ekert, and A. Zeilinger, Eds. The Physics of Quantum Information. Springer Verlag, Heidelberg, ISBN 3-540-66778-4, 2003.

[26] G. K. Brennen, C. M. Caves, P. S. Jessen, and I. H. Deutsch. Quantum Logic Gates in Optical Lattices. Physical Review Letters, 82(5):1060–1063, 1999.

[27] L. de Broglie. The wave nature of the Electron. Nobel Lecture, December 12, 1929. From Nobel Lectures, Physics 1922-1941, 244–256. http://www.nobel.se/physics/laureates/1929/broglie-lecture.html.

[28] J. Brown. The Quest for Quantum Computer. Simon and Schuster, New York, N.Y., ISBN 0-648-87004-5, 1999.

[29] A. W. Burks, H.H. Goldstine, and J. von Neumann. Preliminary Discussion of the Logical Design of an Electronic Computer Instrument. Report to the US Army Ordnance Department, 1946. Also in: Papers of John von Neumann, W. Aspray and A. W. Burks, editors, MIT Press, Cambridge, Mass., 97–146, 1987.

[30] A. R. Calderbank and P. W. Shor. Good Quantum Error-Correcting Codes Exist. Physical Review A, 54(42):1098–1105, 1996.

[31] A. M. Childs and I. L. Chuang. Universal Quantum Computation with Two-Level Trapped Ions. Physical Review A, 63(1), 012306, 2000.

[32] J. I. Cirac and P. Zoller. Quantum Computation with Cold Trapped Ions. Physical Review Letters, 74(20):4091–4094, 1995.


[33] J. I. Cirac and P. Zoller. A Scalable Quantum Computer with Ions in an Array of Microtraps. Nature, 404:579-581, 2000.

[34] D. G. Cory, A. F. Fahmy, and T. F. Havel. Nuclear Magnetic Resonance Spectroscopy: An Experimentally Accessible Paradigm for Quantum Computing. Proc. PhysComp96, T. Toffoli, M. Biafore, and J Leao Eds. New England Complex Systems Institute, 87–91, (1996).

[35] D. G. Cory, A. F. Fahmy, and T. F. Havel. Ensemble Quantum Computing by NMR Spectroscopy. Proc. Nat. Acad. Sci, 94(5):1634, 1997.

[36] D. G. Cory, M. D. Price, and T. F. Havel. Nuclear Magnetic Resonance Spectroscopy: An Experimentally Accessible Paradigm for Quantum Computing. Physica D, 120:82–101, 1998.

[37] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley Series in Telecommunications. John Wiley & Sons, New York, N.Y., ISBN 0-471-06259-6, 1991.

[38] W. van Dam. A Universal Quantum Cellular Automaton. Proc. PhysComp96, T. Toffoli, M. Biafore, and J Leao Eds. New England Complex Systems Institute, 323–331, (1996).

[39] D. Deutsch. Quantum Theory, the Church-Turing Principle and the Universal Quantum Computer. Proc. R. Soc. London A, 400:97–117, 1985.

[40] D. Deutsch and R. Jozsa. Rapid Solution of Problems by Quantum Computations. Proc. R. Soc. London A, 439:553–558, 1992.

[41] D. Deutsch. The Fabric of Reality. Penguin Books, New York, N.Y., ISBN 0-14-027541-x, 1997.

[42] P. A. M. Dirac. Theory of Electrons and Positrons. Nobel Lecture, December 12, 1933. From Nobel Lectures, Physics 1922-1941, 320–325. http://www.nobel.se/physics/laureates/1933/dirac-lecture.html.

[43] P. A. M. Dirac. The Principles of Quantum Mechanics. Fourth ed. Oxford: Clarendon Press, Sec. 2, 4–7, 1967.

[44] D. P. DiVincenzo. The Physical Implementation of Quantum Computation. Fortschritte der Physik, 48(9-11):771–783, 2000.

[45] M. I. Dyakonov. Quantum Computing. A View from the Enemy Camp. Preprint, http://arxiv.org/archive/quant-ph/0110326 v1, October 2001.

[46] A. Einstein, B. Podolsky, and N. Rosen. Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? Physical Review, 47:777, 1935.

[47] R.P. Feynman, R. B. Leighton, and M. Sands. The Feynman Lectures on Physics, Vol-umes 1,2, and 3. Addison-Wesley, Reading, Mass., ISBN 0-201-02116-1, 1977.

[48] R.P. Feynman. Simulating Physics with Computers. Int. J. Theoret. Phys. 21:467–488,1982.

245

[49] R.P. Feynman. Quantum Mechanical Computers Computers. Found. Phys. 16:507–531,1986.

[50] R.P. Feynman. QED - The Strange Theory of Light and Matter. Princeton UniversityPress, Princeton NJ, ISBN 0-691-02417-0 1985.

[51] R.P. Feynman. Lectures on Computation. Addison Wesley, Reading, Mass., ISBN 9-780201-48991-0, 1996.

[52] Ed. Fredkin. Digital Machnies: An Informational Process Based on Reversible UniversalCA. Physica D 45:254-270, 1990 (on line at http://digitalphilosophy.org/dm paper.htm).

[53] I. M. Gelfand. Lectures on Linear Algebra. Dover Publishers, ISBN 0486660826, 1989.

[54] N. A. Gershenfeld and I. L. Chuang. Bulk Spin-Resonance Quantum Computation Sci-ence, 275(5298):350–356, 1997.

[55] D. Gottesman. An Introduction to Quantum Error Correction. Proc. Symp. in AppliedMath, also Preprint, http://arxiv.org/archive/quant-ph/00040072 v1, April 2000.

[56] I. S. Gradshteyn and I. M. Ryzhik. Table of Integrals, Series, and Products. AcademicPress, Orlando, Fl., ISBN 0-12-294760-6, 1980.

[57] L. K. Grover. A Fast Quantum Algorithm for Database Search. Proc. ACM Symp. onTheory of Computing, ACM Press, 212–219, 1996.

[58] L. K. Grover. Quantum Mechanics Helps in Searching for a Needle in a Haystack. Phys.Rev. Lett. 78, 325–328, 1997.

[59] L. K. Grover. A Framework for Fast Quantum Mechanical Algorithms. Proc. Symp. onTheory of Computing, ACM Press, 53–62, 1998.

[60] Y. Hardey and W-H. Steeb. Classical and Quantum Computing, Birkhauser, Boston,Ma., ISBN 3-7643-6610-9, 2001.

[61] S. W. Hawking A Brief History of Time. Bantam Books, New York, N.Y., ISBN0-553-05340-X, 1988.

[62] W. Heisenberg. The Development of Quantum Mechanics. Nobel Lecture, December 11,1933. From Nobel Lectures, Physics 1922-1942, 290–301.

[63] D. Jaksch, H.-J. Briegel, J. I. Cirac, and P. Zoller. Entanglement of Atoms via ColdControlled Collisions. Physical Review Letters, 82(9):1975–1978, 1999.

[64] G. Johnson. A Shortcut through Time. Alfred A. Knopf, New York, N.Y., ISBN 0-375-41193-3, 2003.

[65] B. E. King, C. S. Wood, C. J. Myatt, Q. A. Turchette, D. Leibfried, W. M. Itano, C.Monroe, and D. J. Wineland. Cooling the Collective Motion of Trapped Ions to Initializea Quantum Register. Physical Review Letters,81:1525–1528, 1998.

[66] E. Knill, R. Laflame, H. Barnum, D. Dalvit, J. Dziarmaga, J. Gubernatis, L.Gurvis, G. Ortiz, L. Viola, and W. H. Zurek. Quantum Information Processing.Kluwer Encyclopedia of Mathematics, Suplement III, 2002.

246

[67] D. Knuth. The Art of Computer Programming. Vol 1, Second Edition, Addison Wesley,Reading, Mass., 1969.

[68] D. Knuth. The Art of Computer Programming. Vol 2, Second Ediion, Addison Wesley,Reading, Mass., ISBN 0201-03822-6, 1981.

[69] A. N. Kolmogorov and S. V. Fomin. Elements of the Theory of Functions and FunctionalAnalysis. Dover Publications, Mineola, N.Y. (replication of the work published in 1957by Graylock Press, Rochester, N.Y.), ISBN 0-486-40683-0, 1999.

[70] R. Landauer. Irreversibility and Heat Generation in the Computing Process. IBM Journalof Research and Development, 5:182–192, 1961.

[71] S. Lloyd. a Potentially Realizable Quantum Computer. Science, 261:1569–1571, 1993.

[72] Y. Manin. Classical Computing, Quantum Computing, and Shor’s Algorithm. Talk atthe Bourbaki Seminar, June 1999. Preprint, http://arxiv.org/archive/quant-ph/9903008v1, March 1999.

[73] S. McCartney. ENIAC; The Triumphs and Tragedies of the World’s FirstComputer.Walker and Company Publishing House, New York, NY., ISBN 0-8027-1348-3, 1999.

[74] L. Meitner and O.R. Frisch. Disintegration of Uranium by Neutrons: a New Type ofNuclear Reaction. Nature, 143, 239-240, 1939.

[75] E. Mertzbacher. Quantum Mechanics, 3rd Edition, Willey, New York, 1998.

[76] G. J. Milburn. Schroedinger’s Machines. Perseus Books, Cambridge, Mass., ISBN 0-7382-0173-1, 1998.

[77] G. J. Milburn. The Feynman Processor. W.H. Freeman and Company, New York, N.Y.,ISBN 0-7167-3106-1, 1996.

[78] C. Monroe, D. M. Meekhof, B. E. King, W. M. Itano, and D. J. Wineland. Demonstrationof a Fundamental Quantum Logic Gate. Physical Review Letters, 74(25):4714–4718, 1995.

[79] C. Monroe, D. Liebfried, B. E. King, D. M. Meekhof, W. M. Itano, and D. J. Wineland.Simplified Quantum Logic with Trapped Ions. Physical Review A, 55:R2489–2491, 1997.

[80] M.A. Nielsen and Isaac L. Chuang. Quantum Computing and Quantum Information.Cambridge University Press, ISBN 0-521-63245-8, 2000.

[81] B. W. Ogburn and J. Preskill. Topological Quantum Computation. Lecture Notes inComputer Science, Springer - Verlag, 1509:341–359, 1999.

[82] R. Omnes. The Interpretation of Quantum Mechanics. Princeton Series in Physics.Princeton University Press, Princeton, NJ, ISBN 0-691-03336-6, 1994.

[83] D. A. Patterson and J. L. Hennesy Computer Organization and Design; The Hard-ware/Software Interface, Second Edition. Morgan Kaufmann, San Francsisco, Ca., ISBN1-55860-428-6, 1998.

247

[84] W. Pauli. Exclusion Principle and Quantum Mechanics. Nobel Lec-ture, December 13, 1946. From Nobel Lectures, Physics 1942-1962, 27–43.http://www.nobel.se/physics/laureates/1945/pauli-lecture.html.

[85] A. O. Pittinger. An Introduction to Quantum Algorithms. Birkhauser, Boston, Ma.,ISBN 0-8176-4127-0, 1999.

[86] M. K. E. L. Planck. The Genesis and Present State of Development of the Quan-tum Theory. Nobel Lecture, June 2, 1920. From Nobel Lectures, Physics 1901-1922.http://www.nobel.se/physics/laureates/1918/planck-lecture.html.

[87] S. Popescu. Bell’s Inequalities and Density Matrices. Revealing “Hidden” Nonlocality.Preprint, http://arxiv.org/archive/quant-ph/9502005 v1, February 1995.

[88] J. F. Poyatos, J. I. Cirac, and P. Zoller. Quantum Gates with “Hot” Trapped Ions.Physical Review Letters, 81(6):1322–1325, 1998.

[89] J. F. Poyatos, J. I. Cirac, and P. Zoller. Schemes of Quantum Computations with TrappedIons. Fortschritte der Physik, 48(9-11):785–799, 2000.

[90] J. Preskill. Fault Tolerant Quantum Computation. Preprint,http://arxiv.org/archive/quant-ph/9712048 v1, December 1997.

[91] J. Preskill. Lecture Notes for Physics 229: Quantum Information and Computing Cali-formia Institute of Technology, 1998.

[92] J. Preskill. Quantum Clock Synchronization and Quantum Error Correction. Preprint,http://arxiv.org/archive/quant-ph/0010098 v1, October 2000.

[93] E. Rieffel and W. Polak. An Introduction to Quantum Computing for Non-Physicists.ACM Computing Surveys, 32(3):300–335, 2000.

[94] E. Schrodinger. The Fundamental Idea of Wave Mechanics. Nobel Lecture, December12, 1933. From Nobel Lectures, Physics 1922-1941, 305–314.

[95] E. Schrodinger. The Present Situation in Quantum Mechanics. Die Naturwissenschaften,23:807-812; 823-828; 944-849, 1935. See also Proceedings of the Cambridge PhilosophicalSociety, 31:555-563, 1935 and 32:446-452, 1936.

[96] B. Schumacher. Quantum Coding. Physical Review A, 51(4):2738–2747, 1995.

[97] C. E. Shannon. Communication in the Presence of Noise. Proceedings of the IRE,37:10–21, 1949.

[98] C. E. Shannon. Certain Results in Coding Theory for Noisy Channels. Information andControl, 1(1):6–25, 1957.

[99] C. E. Shannon and W. Weaver. The Mathematical Theory of Communication. Universityof Illinois Press, Urbana,Il., 1963.

[100] P. W. Shor. Algorithms for Quantum Computation: Discrete Log and Factoring. Proc.35 Annual Symp. on Foundations of Computer Science, pages 124–134, IEEE Press,Piscataway, New Jersey, 1994.

248

[101] P. W. Shor. Scheme for Reducing Decoherence in Quantum Computer Memory. PhysicalReview A, 52(4):2493–2496, 1995.

[102] P. W. Shor. Polynomial - Time Algorithms for Prime Factorization and Discrete Loga-rithms on a Quantum Computer. Preprint, http://arxiv.org/archive/quant-ph/9508027v2, January 1996.

[103] P. W. Shor. Fault - Tolerant Quantum Computation. 37th Ann. Symp. on Foundationsof Computer Science, 56–65, IEEE Press, Piscataway, New Jersey, N.J., 1996.

[104] P. W. Shor. Polynomial-Time Algorithms for Prime Factorization and Discrete Loga-rithms on a Quantum Computer. SIAM J. Computing 26:1484–1509, 1997.

[105] P. W. Shor and J. Preskill. Simple Proof of Security of the BB84 Quantum Key Distri-bution Protocol. arXiv.quant-ph/0003004, May, 2000.

[106] P. W. Shor. Introduction to Quantum Algorithms. http://www.arXiv/quant-ph Preprint0005003 July 6, 2001.

[107] P. W. Shor. Why Haven’t More Quantum Algorithms Been Found. Journal of the ACM,50(1): 87–90, 2003.

[108] D. R. Simon. On the Power of Quantum Computation. SIAM J. Computing 26:1474–1483, 1997.

[109] A. Sørensen and K. Mølmer. Quantum Computation with Ions in Thermal Motion.Physical Review Letters, 82(9):1971–1974, 1999.

[110] A. Steane. The Ion Trap Quantum Information Processor. Preprint,http://arxiv.org/archive/quant-ph/9608011 v2, August 1996.

[111] A. Steane. Quantum Computing. Reports Prog. Phys. 61, 117, 1998. Preprint,http://arxiv.org/archive/quant-ph/97080222 v2, September 1997.

[112] O. Stern. The Method of Molecular Rays Nobel Lecture, December 12, 1946. From NobelLectures, Physics 1942-1961, 8–16. http://www.nobel.se/physics/laureates/1943/stern-lecture.html.

[113] N. S. Szabo and R. I. Tanaka. Residue Arithmetic and Its Applications to ComputerTechnology Mc Graw Hill, New York, N.Y., 1967.

[114] L. Szilard. Uber die Entropieverminderung in einem Thermodynamichen System beiEingriffen Intelligenter Wesen Zeitschrifft fur Physik, 53:840–856, 1929.

[115] A. J. Thomasian. The Structure of Probability Theory with Applications. McGraw-HillNew York, N.Y., 1969.

[116] A. C. de la Tore, A. Daleo, and I. Garcia-Mata. The Photon-Box Bohr-Einstein DebateDemythologized. Preprint, http://arxiv.org/archive/quant-ph/9910040 v1, October 1999.

[117] Q. A. Turchette, C. S. Wood, B. E. King, C. J. Myatt, D. Leibfried, W. M. Itano, C.Monroe,and D. J. Wineland. Deterministic Entanglement of Two Ions. Physical ReviewLetters, 81:3631–3634, 1998.

249

[118] A. M. Turing. On Computable Numbers with Application to the Entscheidungsproblem.Proc. London Math. Soc. 2, 42:230, 1936.

[119] L. M. K. Vandersypen, M. Steffen, G. Breyta, C. S. Yannoni, M. H. Sherwood,and I. S. Chuang. Experimental Realization of Shor’s Quantum Factoring AlgorithmUsing Nuclear Magnetic Resonance. Nature, 414, 883:887, 2001. Preprint, http://arxiv.org/archive/quant-ph/0112176 v1, Decemberv 2001

[120] S. A. Vanstone and P. C. van Oorschot. An Introduction to Error Correcting Codes withApplications. Kluwer Academic Publishers, Boston, Mass., ISBN 0-7923-9017-2, 1987.

[121] J. von Neumann. Fourth University of Illinois Lecture. In A. W. Burks, editor, Theoryof Self-Reproduced Automata, page 66, University of Illinois Press, Urbana, Il., 1966.

[122] J. von Neumann. Mathematical Foundations of Quantum Mechanics. Trans. R. T. Bayer.Princeton University Press, Princeton, 1955. (First published in German in 1932).

[123] D. Wells. The Penguin Book of Curious and Interesting Puzzels. Penguin Books, ISBN:0140148752, 1992.

[124] D. J. Wineland, M. Barrett, J. Britton, J. Chiaverini, B. DeMarco, W. M. Itano, B.Jelencovic, C. Langer, D. Leibfried, V. Meyer, T. Rosenband, and T. Schatz. QuantumInformation Processing with Trapped Ions. Preprint, http://arxiv.org/archive/quant-ph/0212079 v2, March 2003.

[125] W. K. Wotters and W. H. Zurek. A Single Quantum Cannot Be Clonned. Nature,299:802–803, 1982.

[126] Conference on Lasers and Electron Optics, http://optics.org/articles/news/9/6/3/1.

[127] Semiconductor Industry Association Roadmap 2000-2001. http://public.itrs.net.

250

11 Glossary

Abelian group Algebraic structure; a group with a commutative binary operation. See also group.

adiabatic Term frequently used in thermodynamics meaning without transfer of heat between a system and its environment. For example, if we open the valve of a gas canister, the gas rushes out and expands without having the time to equalize its temperature with the environment. The rushing gas feels cool.

alphabet Set of symbols accepted as input by an automaton.

adjoint matrix $A^{\dagger}$ The adjoint of a matrix $A = [a_{ij}]$ is obtained by transposing $A$ and then taking the complex conjugate of each of its elements:

$$A^{\dagger} = [a^{*}_{ji}]$$

The order of the two operations can be reversed. See also the dual operator.
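As a minimal sketch of this definition, the Python fragment below computes the conjugate transpose of a small matrix stored as a list of rows. The function name `adjoint` and the sample matrix `A` are illustrative choices, not part of the lecture notes.

```python
def adjoint(matrix):
    """Return the conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(matrix), len(matrix[0])
    # Entry (j, i) of the result is the complex conjugate of entry (i, j).
    return [[complex(matrix[i][j]).conjugate() for i in range(rows)]
            for j in range(cols)]


# Hypothetical 2 x 2 example matrix.
A = [[1 + 2j, 3j],
     [4, 5 - 1j]]
A_dagger = adjoint(A)
print(A_dagger)
```

Applying `adjoint` twice returns the original matrix, which is one quick way to check an implementation.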

amplitude, probability amplitude Given a set of basis or logical states $|i\rangle$, a quantum system may be in a superposition of these states, $|\psi\rangle = \sum_i \alpha_i |i\rangle$. The complex numbers $\alpha_i$ are called amplitudes, or probability amplitudes.

ancillas Auxiliary qubits used to assist in a computation.

basis states of a quantum system A complete set of orthonormal, and therefore linearly independent, state vectors in the vector space of the quantum system.

basis states of a qubit Two orthonormal states such as $|0\rangle$ and $|1\rangle$.

Bayes rule Probability rule. Assume that it is known that event $A$ occurred, but it is not known which one of the set of mutually exclusive and collectively exhaustive events $B_1, B_2, \ldots, B_n$ has occurred. Then the conditional probability that one of these events, $B_j$, occurs, given that $A$ occurs, is

$$P(B_j \mid A) = \frac{P(A \mid B_j)\,P(B_j)}{\sum_i P(A \mid B_i)\,P(B_i)}.$$

$P(B_j \mid A)$ is called the a posteriori probability.
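The rule can be illustrated with a short numerical sketch. The priors and likelihoods below are made-up values chosen only for illustration, and the function name `posterior` is hypothetical.

```python
def posterior(priors, likelihoods):
    """Return the list of P(B_j | A) given P(B_j) and P(A | B_j)."""
    # Denominator of Bayes rule: total probability of A.
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]


# Illustrative numbers: three mutually exclusive, exhaustive events B_1..B_3.
priors = [0.5, 0.3, 0.2]        # P(B_j), sums to 1
likelihoods = [0.1, 0.4, 0.5]   # P(A | B_j)
post = posterior(priors, likelihoods)
print(post)
```

The a posteriori probabilities always sum to one, because the denominator is exactly the sum of the numerators over all j.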

BB84 A quantum key distribution protocol proposed by Charles Bennett and Gilles Brassard in 1984.

beam splitter An optical device, e.g., a half-silvered mirror, that splits an incoming beam of photons into a reflected and a transmitted beam. The color of the light, and thus its wavelength, is not altered by a beam splitter, a behavior consistent with a wave.

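Bell states Four distinct quantum states of a two-particle system with a very strong coupling of the individual states of the particles. These states form an orthonormal basis:

$$|\beta_{00}\rangle = \frac{|00\rangle + |11\rangle}{\sqrt{2}}, \qquad |\beta_{01}\rangle = \frac{|01\rangle + |10\rangle}{\sqrt{2}},$$

$$|\beta_{10}\rangle = \frac{|00\rangle - |11\rangle}{\sqrt{2}}, \qquad |\beta_{11}\rangle = \frac{|01\rangle - |10\rangle}{\sqrt{2}}.$$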
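A quick numerical check that the four Bell states are mutually orthogonal and normalized can be sketched as follows. The ordering of the amplitudes as (00, 01, 10, 11) and the names `bell` and `inner` are conventions assumed for this example.

```python
from math import sqrt

s = 1 / sqrt(2)

# Amplitudes in the order |00>, |01>, |10>, |11> (an assumed convention).
bell = {
    "beta00": [s, 0, 0, s],
    "beta01": [0, s, s, 0],
    "beta10": [s, 0, 0, -s],
    "beta11": [0, s, -s, 0],
}


def inner(u, v):
    """Inner product <u|v>: conjugate the first argument, then sum."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


# Gram matrix: close to 1 on the diagonal and 0 elsewhere for an orthonormal set.
gram = {(a, b): inner(bell[a], bell[b]) for a in bell for b in bell}
for (a, b), value in gram.items():
    print(a, b, round(abs(value), 12))
```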

Bell inequality A constraint on the sum of averages of measured observables, based on assumptions of local realism, i.e., assumptions of locality and of existence of hidden variables. Quantum mechanics predicts a value for the sum of averages of observables which is in violation of the Bell inequality. Consider two photons with the same polarization, moving apart from one another at the speed of light.

“If the coherence of the system can be maintained, a measurement made at one location will assign a single definite state to the system, which should be verified by any subsequent measurements, no matter where they occur. Whether the first photon passes through the polarizer or is absorbed completely determines the polarization of both photons from that point onward. The polarizer can be thought of as asking the system are you polarized at angle θ? and accepting only a yes/no answer, which the system is then obliged to repeat consistently thereafter. In the Copenhagen interpretation, nothing can be said about the angle of polarization of the photons before the measurement. The angle of polarization has no objective reality until it is measured, and the measurement result is not predetermined by anything, it is intrinsically random. After it is measured, then the polarization state of both photons is completely known and will not change. In a hidden variable theory, there is a variable or variables that determine the real polarization angle of the photons; these variables have definite values from the moment the photons are formed, and they determine what the result of the polarization measurement will be. Let us make the polarization measurement on the second particle at a different angle from our measurement on the first. Now the two photons have different probabilities of passing through their respective polarizers. We can expect that sometimes photon A will pass through its polarizer, but photon B will be absorbed by its polarizer. We call such events errors, in the sense that there is a disagreement between the measurements of the two polarizers. Suppose the polarizer at A is rotated by an angle θ relative to the polarizer at B. A certain number of errors E are produced. Now what happens if we rotate the B polarizer through an angle θ in the opposite direction, so that the total angle between them is 2θ? We expect more errors, but how much more? Twice as many? Bell demonstrated that the error rate with an angle 2θ must be less than or equal to twice the error rate at angle θ: E(2θ) ≤ 2E(θ), provided that the photons have a definite polarization (as postulated by the hidden variable theories) and that the A photon cannot affect the B photon’s state instantaneously (this is the requirement of locality)” (http://www.telp.com/philosophy/qw3.htm).

When two theoretical models yield contradictory results, at least one of them must be wrong, and we must turn to experiments to determine which one. In this case the experiments show that the Bell inequality is violated; they therefore rule out local hidden-variable theories and agree with the predictions of quantum mechanics.
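The violation can be made concrete with a small numerical sketch. It assumes, as an addition not stated in the glossary, that quantum mechanics predicts an error (disagreement) rate of sin²θ for two photons of identical polarization measured with polarizers that differ by an angle θ. The function name `quantum_error_rate` and the choice θ = 30° are illustrative.

```python
from math import radians, sin


def quantum_error_rate(theta_deg):
    """Assumed quantum prediction for the disagreement rate: sin^2(theta)."""
    return sin(radians(theta_deg)) ** 2


theta = 30  # degrees, an arbitrary illustrative angle
e_theta = quantum_error_rate(theta)
e_2theta = quantum_error_rate(2 * theta)

# Local hidden-variable theories require E(2*theta) <= 2 * E(theta).
bell_inequality_holds = e_2theta <= 2 * e_theta
print(e_theta, e_2theta, bell_inequality_holds)
```

Under this assumption the error rate at 60° is about 0.75, larger than twice the rate at 30° (about 0.25), so the inequality E(2θ) ≤ 2E(θ) fails for the quantum prediction.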

John Stewart Bell, British physicist (1928-1990). His work led to the possibility of exploring seemingly philosophical questions in quantum mechanics, such as the nature of reality, directly through experiments. Bell started from locality and argued for the existence of deterministic hidden variables. He considered measurements of spin components along arbitrary directions corresponding to each of the two EPR particles (see the EPR paradox). Bell calculated what happened when the measurement direction was kept constant for one particle and varied for the other. He was able to show that the behavior predicted by quantum theory could not be duplicated by a hidden-variable theory if the hidden variables acted locally. As shown by Bell and others, local realist theories (i.e., theories with hidden variables) satisfy a so-called Bell inequality. This is a constraint on the relationship between the joint probability densities of the signals recorded in the two wings of the apparatus; it involves the four distinct cases that may be obtained by having two settings in each wing. Quantum theory, on the other hand, does not obey the Bell inequality. In this way Bell opened up the possibility of experimental philosophy, the study of what are normally thought of as philosophical issues in experiments.

binary alphabet An alphabet with only two symbols, usually denoted as 0 and 1.

binary symmetric channel Abstraction for a communication channel. The input and output alphabets consist of two symbols only. There is a well-defined mapping of the two symbols in the input alphabet to the two symbols of the output alphabet. A noiseless binary symmetric channel maps a 0 at the input into a 0 at the output and a 1 into a 1. A noisy symmetric channel maps a 0 into a 1, and a 1 into a 0, with probability p; an input symbol is mapped into itself with probability 1 − p.
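The behavior of such a channel can be sketched as a small simulation. The function name `binary_symmetric_channel`, the optional `rng` parameter, and the sample message are illustrative assumptions.

```python
import random


def binary_symmetric_channel(bits, p, rng=None):
    """Flip each bit independently with probability p; otherwise pass it through."""
    if rng is None:
        rng = random.Random()
    # rng.random() lies in [0, 1), so p = 0 never flips and p = 1 always flips.
    return [bit ^ 1 if rng.random() < p else bit for bit in bits]


message = [0, 1, 1, 0, 1, 0, 0, 1]
noiseless = binary_symmetric_channel(message, 0.0)
inverted = binary_symmetric_channel(message, 1.0)
noisy = binary_symmetric_channel(message, 0.1, random.Random(42))
print(noiseless, inverted, noisy)
```

With p = 0 the channel is noiseless, and with p = 1 every symbol is inverted; values in between model a noisy channel.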

bit The basic unit of information. It can take one of two values, 0 or 1.

Bloch sphere A qubit is represented as a vector $\vec{r}$ from the origin to a point on the unit sphere in three-dimensional space, the so-called Bloch sphere. $\theta$ is the angle of the vector $\vec{r}$ with the z-axis, and $\varphi$ is the angle of the projection of the vector onto the x-y plane with the x-axis; $\gamma$ has no observable effect. The state of a qubit can then be expressed using three real numbers, $\theta$, $\varphi$, $\gamma$:

$$|\psi\rangle = e^{i\gamma}\left[\cos\frac{\theta}{2}\,|0\rangle + e^{i\varphi}\sin\frac{\theta}{2}\,|1\rangle\right] = \alpha_0 |0\rangle + \alpha_1 |1\rangle$$

with

$$\alpha_0 = e^{i\gamma}\cos\frac{\theta}{2}, \qquad \alpha_1 = e^{i\gamma}e^{i\varphi}\sin\frac{\theta}{2}.$$

In this representation $e^{i\gamma}$ is an overall phase factor that is not observable and is therefore generally ignored.
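The following sketch converts the three angles into the amplitudes α0 and α1 and checks numerically that the overall phase γ does not change the measurement probabilities. The function names and the sample angles are illustrative assumptions.

```python
import cmath
import math


def bloch_to_amplitudes(theta, phi, gamma=0.0):
    """Return (alpha0, alpha1) for the Bloch angles theta, phi and global phase gamma."""
    phase = cmath.exp(1j * gamma)
    alpha0 = phase * math.cos(theta / 2)
    alpha1 = phase * cmath.exp(1j * phi) * math.sin(theta / 2)
    return alpha0, alpha1


def probabilities(alpha0, alpha1):
    """Probabilities of observing 0 and 1 when measuring in the basis |0>, |1>."""
    return abs(alpha0) ** 2, abs(alpha1) ** 2


theta, phi = math.pi / 3, math.pi / 4   # arbitrary sample angles
p_without_phase = probabilities(*bloch_to_amplitudes(theta, phi))
p_with_phase = probabilities(*bloch_to_amplitudes(theta, phi, gamma=1.234))
print(p_without_phase, p_with_phase)
```

Both calls give the same pair of probabilities, which is why γ is usually dropped and a qubit state is specified by θ and φ alone.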

block code A code where a group of information symbols is encoded into a fixed-length code word by adding a set of parity check or redundancy symbols.

Boolean algebra A Boolean algebra is a set B with two binary operations, ∪ and ∩, that are commutative, associative and each distributes over the other, plus a unary operation ¬. The identity elements Φ, U ∈ B satisfy the relations b ∪ Φ = b, b ∩ U = b, b ∪ ¬b = U, and b ∩ ¬b = Φ for all elements b ∈ B. One interpretation of Boolean algebra is the collection of subsets of a fixed set X. We take ∪, ∩, ¬, Φ, and U to be set union, set intersection, complementation, the empty set and the set X, respectively. Equality here means the usual equality of sets. Another interpretation is the calculus of propositions in symbolic logic. Here we take ∪, ∩, ¬, Φ, and U to be disjunction, conjunction, negation, a fixed contradiction and a fixed tautology, respectively. In this setting equality means logical equivalence.

George Boole, British mathematician (1815-1864). Best known for his contribution to symbolic logic (Boolean algebra) but also active in probability theory, algebra, analysis, and differential equations. Boole founded symbolic logic. He lived, taught, and is buried in Cork City, Ireland.

cellular automaton A cellular automaton (CA) is a discrete and deterministic system made up of cells like the points in a lattice. It follows a simple digital rule. A well-known CA is the Game of Life invented by John Conway.

certificate A document containing the public key of an entity, signed by an authorized party.

circuit complexity The smallest number of gates necessary to implement an operation on a fixed number of qubits.

channel, also communication channel An abstraction for the medium used by two entities to communicate with each other.

channel capacity Maximum data rate through a communication channel.

Church-Turing principle Every function which can be regarded as computable can be computed by a universal computing machine.

checksum An error detection method. The sender of a message typically performs a 1’s complement sum over all the bytes of a protocol data unit and appends it to the message. The receiver recomputes the sum, compares it with the one in the message, and decides that there is no error if the two agree.
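The procedure can be sketched as follows. The sketch assumes 16-bit words with end-around carry, as used by Internet protocols; the glossary entry itself does not fix a word size, and the function names `ones_complement_sum`, `send`, and `receive_ok` are hypothetical.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum of the data (zero-padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]
        total += word
        # End-around carry: fold any overflow back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return total


def send(payload: bytes) -> bytes:
    """Sender side: append the 2-byte sum to the payload."""
    return payload + ones_complement_sum(payload).to_bytes(2, "big")


def receive_ok(frame: bytes) -> bool:
    """Receiver side: recompute the sum and compare it with the received one."""
    payload, received = frame[:-2], int.from_bytes(frame[-2:], "big")
    return ones_complement_sum(payload) == received


frame = send(b"quantum")
corrupted = b"Q" + frame[1:]   # simulate an error in the first byte
print(receive_ok(frame), receive_ok(corrupted))
```

A checksum of this kind detects many errors but not all of them; for example, swapping two 16-bit words leaves the sum unchanged.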

ciphertext Plain text encoded with a secret key.

closed quantum system An idealization of a system of quantum particles. The system is assumed to be isolated and the interaction of the particles with the environment nonexistent. In reality we can only construct quantum systems with a very weak interaction with the environment.

CNOT gate, controlled-NOT gate A two-qubit gate. It has as input a control qubit and a target qubit. The control input is transferred directly to the control output of the gate. The target output is equal to the target input if the control input is $|0\rangle$, and it is flipped if the control input is $|1\rangle$. When the control qubit is in a superposition of $|0\rangle$ and $|1\rangle$, the gate acts linearly on each component.
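A minimal sketch of the gate is given below, first on classical basis values and then on the four amplitudes of a two-qubit state. The amplitude ordering (00, 01, 10, 11) with the control qubit written first is an assumed convention, and the function names are illustrative.

```python
from math import sqrt


def cnot_basis(control, target):
    """Action on basis states: the target is flipped exactly when control is 1."""
    return control, target ^ control


def cnot_state(amplitudes):
    """Action on amplitudes ordered |00>, |01>, |10>, |11> (control qubit first)."""
    a00, a01, a10, a11 = amplitudes
    # The |10> and |11> amplitudes are exchanged; the others are unchanged.
    return [a00, a01, a11, a10]


truth_table = {(c, t): cnot_basis(c, t) for c in (0, 1) for t in (0, 1)}

# Control in (|0> + |1>)/sqrt(2), target in |0>.
s = 1 / sqrt(2)
output = cnot_state([s, 0, s, 0])
print(truth_table, output)
```

In the second example the output amplitudes are those of the state (|00⟩ + |11⟩)/√2, so a CNOT applied to a superposed control produces an entangled pair.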

code In coding theory, the set of all valid code words; the code is known to sender and receiver; if the message received is not a code word, then the receiver decides that an error has occurred. See also code word.

code word An n-tuple constructed by adding r parity check bits to k information symbols to support error correction, error detection, or both.

coding theory Study of error correcting and error detecting codes.

coherent superposition of states The state vector $|\psi\rangle$ of a system can be written as

$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|\psi_a\rangle + |\psi_b\rangle\right),$$

where $|\psi_a\rangle$, $|\psi_b\rangle$ can describe states of individual particles in a two-particle system, or the states with slit a open or slit b open in a double-slit experiment. For a qubit, the state $|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle$ is a coherent superposition if there is always a basis in which the state (value) of the qubit is well defined.

compression Encoding a data stream to reduce its redundancy and the amount of data transferred.

conservative logic gate A logic gate whose output contains the same number of 1s as its input, e.g., the Fredkin gate.

conjugate matrix The conjugate $A^{*}$ of a matrix $A = [a_{ij}]$ is obtained by taking the complex conjugate of each element:

$$A^{*} = [a^{*}_{ij}]$$

Copenhagen interpretation The Copenhagen school is the name given to the group of theoreticians who shared the views of Niels Bohr regarding the interpretation of quantum mechanics. In the Copenhagen interpretation, nothing can be said about a property of a quantum system before the property is measured. Such a property has no objective reality until it is measured, and the measurement result is not predetermined by anything; it is intrinsically random. After it is measured, the property is completely known and will not change. Through a measurement one cannot learn anything about the past state of the system; the measurement can only provide information about the future state of the system.

cyclic redundancy check (CRC) Error detecting code; the parity check symbols are computed over the characters of the message and are then appended to the packet by the networking hardware.

decoding The process of restoring encoded data to its original format.

decoherence The destruction of the superposition of pure quantum states due to the interaction of the quantum system with the environment.

decompression The process of restoring compressed data to its original format in case of lossless compression, or to a format very close to the original one in case of lossy compression.

decryption The process of recovering encrypted data; the reverse of encryption.

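de Morgan Laws If x and y are Boolean variables then $\overline{x + y} = \bar{x} \times \bar{y}$ and $\overline{x \times y} = \bar{x} + \bar{y}$, where + and × represent the Boolean operations OR and AND, respectively, and $\bar{x}$ is the negation of x.

dense coding Transmitting two classical bits by sending a single qubit, using an entangled pair of qubits shared in advance by the sender and the receiver.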
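Because each law involves only two Boolean variables, both can be verified exhaustively over the four possible input combinations. The sketch below does so, with Python's `or` and `and` standing for the Boolean OR and AND; the function name `de_morgan_holds` is illustrative.

```python
from itertools import product


def de_morgan_holds():
    """Check both de Morgan laws for every combination of two Boolean values."""
    for x, y in product([False, True], repeat=2):
        # NOT (x OR y) equals (NOT x) AND (NOT y)
        first = (not (x or y)) == ((not x) and (not y))
        # NOT (x AND y) equals (NOT x) OR (NOT y)
        second = (not (x and y)) == ((not x) or (not y))
        if not (first and second):
            return False
    return True


print(de_morgan_holds())
```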

density matrix of a quantum system The density matrix, or density operator, is a notation that represents pure and mixed states in a uniform way. For a pure state $|\phi\rangle$, the density operator is $\rho = |\phi\rangle\langle\phi|$; for mixed states, it is a probabilistic expression

$$\rho = \sum_i \mu_i\, |\phi_i\rangle\langle\phi_i|$$

with $\sum_i \mu_i = 1$, and with more than one $\mu_i \neq 0$.
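The difference between the two cases can be seen numerically through the purity Tr(ρ²), a standard diagnostic that is not part of the glossary entry: it equals 1 for a pure state and is smaller than 1 for a mixed state. The sketch below uses plain Python lists; the function names and the two sample states are illustrative assumptions.

```python
from math import sqrt


def outer(v):
    """Projector |v><v| as a list of rows."""
    return [[a * complex(b).conjugate() for b in v] for a in v]


def density_matrix(states, weights):
    """Return the sum over i of mu_i |phi_i><phi_i|."""
    n = len(states[0])
    rho = [[0j] * n for _ in range(n)]
    for state, mu in zip(states, weights):
        projector = outer(state)
        for i in range(n):
            for j in range(n):
                rho[i][j] += mu * projector[i][j]
    return rho


def purity(rho):
    """Tr(rho^2), computed as the sum over i, j of rho_ij * rho_ji."""
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n)).real


s = 1 / sqrt(2)
pure = density_matrix([[s, s]], [1.0])                  # (|0> + |1>)/sqrt(2)
mixed = density_matrix([[1, 0], [0, 1]], [0.5, 0.5])    # equal mixture of |0>, |1>
print(purity(pure), purity(mixed))
```

Both matrices have trace one and the same diagonal entries; only the off-diagonal terms of the pure state distinguish the coherent superposition from the classical mixture.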

Dirac ket and bra notation Notation used in quantum mechanics for state vectors. An n-dimensional ket vector $|\psi\rangle$ can be expressed as a linear combination of the orthonormal ket vectors $|0\rangle, |1\rangle, \ldots, |i\rangle, \ldots, |n-1\rangle$:

$$|\psi\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle + \ldots + \alpha_i |i\rangle + \ldots + \alpha_{n-1} |n-1\rangle$$

where $\alpha_0, \alpha_1, \ldots, \alpha_i, \ldots, \alpha_{n-1}$ are complex numbers. In matrix representation, the ket vector is expressed as the column matrix

$$|\psi\rangle = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_i \\ \vdots \\ \alpha_{n-1} \end{pmatrix}$$

For each ket vector $|\psi\rangle$ there is a dual, the bra vector denoted by $\langle\psi|$. The bra and ket vectors are related by Hermitian conjugation:

$$|\psi\rangle = (\langle\psi|)^{\dagger}, \qquad \langle\psi| = (|\psi\rangle)^{\dagger}$$

The bra vector $\langle\psi|$, the dual of the ket, is expressed as a linear combination of the orthonormal bra vectors $\langle 0|, \langle 1|, \ldots, \langle i|, \ldots, \langle n-1|$:

$$\langle\psi| = \alpha_0^{*} \langle 0| + \alpha_1^{*} \langle 1| + \ldots + \alpha_i^{*} \langle i| + \ldots + \alpha_{n-1}^{*} \langle n-1|$$

where $\alpha_0^{*}, \alpha_1^{*}, \ldots, \alpha_i^{*}, \ldots, \alpha_{n-1}^{*}$ are the complex conjugates of $\alpha_0, \alpha_1, \ldots, \alpha_i, \ldots, \alpha_{n-1}$. The dual bra vector is expressed as the row matrix

$$\langle\psi| = \begin{pmatrix} \alpha_0^{*} & \alpha_1^{*} & \ldots & \alpha_i^{*} & \ldots & \alpha_{n-1}^{*} \end{pmatrix}$$

Paul Adrien Maurice Dirac, British theoretical physicist (1902-1984). His work was concerned with the mathematical and theoretical aspects of quantum mechanics. He used a noncommutative algebra for calculating atomic properties, leading to the relativistic theory of the electron (1928) and the theory of holes (1930). This latter theory required the existence of a positive particle having the same mass as the known (negative) electron and a charge of the same magnitude. This particle, the positron, was discovered experimentally at a later date (1932) by C. D. Anderson. The importance of Dirac’s work lies essentially in his famous wave equation, which introduced special relativity into the Schrödinger equation. In 1932 he became Lucasian Professor of Mathematics at Cambridge. In 1933 Dirac shared the Nobel Prize for physics with Schrödinger.

distinguishable states of a quantum system Two states of a quantum system are distinguishable if they are orthogonal. If two states are distinguishable, then a measurement exists that is guaranteed to determine which one of the two states the system is in.

Deutsch principle Extends the Church-Turing principle. Every finitely realizable physical system can be perfectly simulated by a universal computing machine operating by finite means.

David Deutsch, physicist born in Haifa, Israel, and educated at Cambridge and Oxford Universities in the UK. Member of the Quantum Computation and Cryptography group at Clarendon Laboratory at Oxford University, and a leading proponent of the theory of parallel universes.

dual operator/matrix $A^{\dagger}$, the dual, or the adjoint, of a matrix $A$ is obtained by transposing the matrix and then taking the complex conjugate of each element. The order of the two operations can be reversed. See also adjoint matrix.

eigenstate In quantum mechanics, an eigenvector of an observable (Hermitian operator). See eigenvector.

eigenvalue A scalar $\lambda_i$ associated with an eigenvector $|\psi_i\rangle$ of a linear operator (observable) $O$ as

$$O\,|\psi_i\rangle = \lambda_i\,|\psi_i\rangle$$

In quantum mechanics, the eigenvalues of an operator represent those values of the corresponding observable that have non-zero probability of occurring. The set of all the eigenvalues is called the operator (matrix) spectrum.

eigenvector A state vector $|\psi_i\rangle$ is an eigenvector of a linear operator (observable) $O$ if, when operated on by the operator, the result is a scalar multiple of itself:

$$O\,|\psi_i\rangle = \lambda_i\,|\psi_i\rangle$$

The scalar $\lambda_i$ is called the eigenvalue associated with the eigenvector. If $|\psi_i\rangle$ is an eigenvector with eigenvalue $\lambda_i$, then any non-zero multiple of $|\psi_i\rangle$ is also an eigenvector with eigenvalue $\lambda_i$. If the state vectors $|\psi_0\rangle, |\psi_1\rangle, \ldots, |\psi_{n-1}\rangle$ are eigenvectors with distinct eigenvalues $\lambda_0, \lambda_1, \ldots, \lambda_{n-1}$, then they are necessarily linearly independent.
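These statements can be checked on a small example. The sketch below uses the 2 × 2 matrix that exchanges the two components of a vector (the Pauli X, or NOT, operator), chosen here only for illustration; the helper names are hypothetical.

```python
from math import sqrt


def apply(op, v):
    """Matrix-vector product for an operator stored as a list of rows."""
    return [sum(op[i][j] * v[j] for j in range(len(v))) for i in range(len(op))]


def is_eigenvector(op, v, lam, tol=1e-12):
    """True if op applied to v equals lam times v, component by component."""
    return all(abs(w - lam * x) < tol for w, x in zip(apply(op, v), v))


X = [[0, 1],
     [1, 0]]
s = 1 / sqrt(2)
plus = [s, s]     # eigenvector with eigenvalue +1
minus = [s, -s]   # eigenvector with eigenvalue -1

checks = {
    "plus": is_eigenvector(X, plus, 1),
    "minus": is_eigenvector(X, minus, -1),
    "scaled plus": is_eigenvector(X, [3j * a for a in plus], 1),
    "not an eigenvector": is_eigenvector(X, [1, 0], 1),
}
# Eigenvectors with distinct eigenvalues are linearly independent:
# the determinant of the matrix with columns plus and minus is non-zero.
determinant = plus[0] * minus[1] - plus[1] * minus[0]
print(checks, determinant)
```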

electromagnetic field An electromagnetic field consists of an electric and a magneticfield perpendicular to each other and oscillating in a plane perpendicular to the directionof propagation of the electromagnetic wave.

EPR paradox Experiment proposed by Einstein, Podolsky, and Rosen in 1935 to show that quantum mechanics is not a complete theory. This version of the Einstein-Podolsky-Rosen (EPR) paradox is due to David Bohm. Consider a spin-0 particle that decays into two spin-1/2 particles. The z component of the spin, sz, of particle 1 has two eigenstates, α+1/2 and α−1/2. Similarly particle 2 has two eigenstates, β+1/2 and β−1/2. The combined state vector of the two-particle system is y = (1/√2)[α+1/2β−1/2 − α−1/2β+1/2]. The minus sign corresponds to the fact that the total spin is zero. This state vector is said to be entangled; that is, the plus state for particle 1, α+1/2, is always associated with the minus state of particle 2, β−1/2, and vice versa. However, in the orthodox view of quantum theory, neither spin has a specific value of sz until a measurement is made. According to von Neumann, a measurement of sz on particle 1 causes the state vector to collapse to either y = α+1/2β−1/2 or y = α−1/2β+1/2. Therefore, if the sz of particle 1 is measured to be +ħ/2, then that of particle 2 will be −ħ/2, and vice versa. Both spins now have specific values of sz. That is, the measurement has an instantaneous effect on the spin of particle 1 and the spin of particle 2, even though the particles are spatially separated. But locality insists that a measurement on the spin of particle 1 cannot have an instantaneous effect on the spin of particle 2 because nothing can travel faster than the speed of light. If one wishes to retain locality, one must dispute the orthodox view that the individual spins do not have values of sz before a measurement is made.

entanglement The translation of the German term Verschränkung used by Schrödinger, who was the first to recognize this quantum effect. It means that a two-particle quantum system is in a state that cannot be written as a tensor product of the states of the individual particles. The two quantum particles share a joint state and it is not possible to describe one of the particles in isolation.

entropy Measure of the uncertainty, or the degree of disorder, in a system.

ergodic process Stochastic process for which time averages and ensemble (set) averages are identical.

error-correcting code Code allowing the receiver to reconstruct the code word sent, in the presence of transmission errors.

error-detecting code Code allowing the receiver to detect transmission errors.

field Algebraic structure consisting of a set F equipped with two binary operations, addition and multiplication, such that: (i) under addition F is an Abelian group with identity (or neutral element) 0; (ii) under multiplication, the nonzero elements form an Abelian group; and (iii) the distributive law holds: a(b + c) = ab + ac, ∀ a, b, c ∈ F.

Fredkin gate A three-qubit gate with two target inputs a, b and one control input c, and three outputs, a′, b′, and c′ = c. When the control input is not set, c = 0, the target inputs appear unchanged at the output, a′ = a and b′ = b. When the control is set, c = 1, the target inputs are swapped, a′ = b and b′ = a. The control input is always transferred to the output unchanged. There are both classical and quantum versions of the Fredkin gate.
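The classical version of the gate described above can be sketched in a few lines of Python. The function name `fredkin` is an illustrative choice, not something defined in these notes; the sketch only covers classical bits, not superpositions.

```python
def fredkin(a, b, c):
    """Classical Fredkin gate: swap the targets a, b when the control c is 1."""
    if c == 1:
        a, b = b, a
    return a, b, c  # the control is always passed through unchanged


# Print the full truth table as (a, b, c) -> (a', b', c').
for c in (0, 1):
    for a in (0, 1):
        for b in (0, 1):
            print((a, b, c), "->", fredkin(a, b, c))
```

Applying the function twice returns the original inputs, which illustrates that the gate is reversible.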

Edward Fredkin, professor of Computer Science at Carnegie Mellon University. Sometime around 1960 he suggested that the Universe is some kind of computational device, a highly parallel computational machine known as a cellular automaton (CA). This idea was first published in a scientific journal in 1990 [52].

frequency Characteristic of a periodic signal, the number of cycles per unit of time. It is measured in Hertz (Hz), or cycles per second.

Game of Life A game played on a rectangular 2D lattice (like a checker-board), with two states per cell, one (live) or zero (dead). At each tick of a clock, the value at each cell: (a) if zero (dead), becomes one (birth) if exactly 3 of its nearest 8 neighbors are ones (live); (b) if one (live), remains one (survival) if 2 or 3 neighbors are ones (live); or (c) becomes zero (death) or stays zero (dead) in all other cases. The number of neighbors that are ones (live) is based on the value of the cells before the rule is applied.
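The three rules above can be sketched as a single update function. This is a minimal illustration, assuming a finite lattice whose outside cells are treated as dead (the definition above does not say how the border is handled); the name `life_step` is hypothetical.

```python
def life_step(grid):
    """Return the next generation of a finite Game of Life lattice.

    grid is a list of rows holding 0 (dead) or 1 (live).
    Assumption: cells outside the lattice count as dead.
    """
    rows, cols = len(grid), len(grid[0])

    def live_neighbors(r, c):
        # Count the live cells among the 8 nearest neighbors of (r, c).
        return sum(
            grid[r + dr][c + dc]
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= r + dr < rows
            and 0 <= c + dc < cols
        )

    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            n = live_neighbors(r, c)  # counted on the old grid
            if grid[r][c] == 0:
                new[r][c] = 1 if n == 3 else 0       # rule (a): birth
            else:
                new[r][c] = 1 if n in (2, 3) else 0  # rule (b), else (c)
    return new


blinker = [[0, 0, 0],
           [1, 1, 1],
           [0, 0, 0]]
print(life_step(blinker))
```

The neighbor counts are always taken from the old grid, which matches the requirement that the rule uses the cell values before the update.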

gedanken experiment “Gedanken” is the German word for “thought”. A thought experiment enables us to prove or disprove a conjecture when the experiment enabling us to study the physical phenomena is not feasible. We construct the result of the thought experiment according to a set of assumptions and a model of the system, all based upon universally accepted laws of physics.

gate (quantum gate) Building block of a quantum circuit. A physical system capable of transforming one or more qubits.

group Algebraic structure consisting of a set closed under a single-valued binary operation “·”, called multiplication, which satisfies three conditions: (i) it is associative, (ii) it has a neutral or identity element, and (iii) each element has an inverse under multiplication. See also Abelian group.

Hadamard gate The H gate describes a unitary quantum “fair coin flip” performed upon a single qubit. For example, it transforms an input qubit in state | 0〉 into the superposition state (| 0〉 + | 1〉)/√2, and an input qubit in state | 1〉 into the superposition state (| 0〉 − | 1〉)/√2.

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

Jacques Hadamard, French mathematician (1865-1963) who proved the prime number theorem, developed Hadamard matrices and did work on the calculus of variations.
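As a small numerical check of the Hadamard gate entry, the sketch below applies the matrix H to the basis states, represented as plain Python lists of amplitudes. The helper `apply` and the variable names are illustrative assumptions, not notation from these notes.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S],
     [S, -S]]  # Hadamard matrix, (1/sqrt(2)) * [[1, 1], [1, -1]]


def apply(gate, state):
    """Multiply a square matrix by a state vector (list of amplitudes)."""
    return [sum(gate[i][j] * state[j] for j in range(len(state)))
            for i in range(len(gate))]


ket0 = [1.0, 0.0]
ket1 = [0.0, 1.0]

plus = apply(H, ket0)    # amplitudes of (|0> + |1>)/sqrt(2)
minus = apply(H, ket1)   # amplitudes of (|0> - |1>)/sqrt(2)
probs = [abs(x) ** 2 for x in plus]  # measurement probabilities, the "fair coin"
back = apply(H, plus)    # applying H twice restores |0>

print(plus, minus, probs, back)
```

Both outcomes have probability close to 1/2, and applying H a second time returns the original basis state, up to floating-point rounding.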

Hamiltonian Similar to classical mechanics, the Hamilton function H(q, p) represents the energy of a quantum system expressed in terms of dynamical variables, such as position q and momentum p.

Hamming bound The minimum number of parity check bits necessary to construct a code with a certain error correction capability, e.g., a code capable of correcting all single-bit errors.

Hamming distance The number of positions where two code words differ.

Richard Wesley Hamming, an American mathematician (1915-1998). B.S. in 1937 from the University of Chicago, Ph.D. in mathematics in 1942 from the University of Illinois at Urbana-Champaign with a thesis on “Some Problems in the Boundary Value Theory of Linear Differential Equations”. In 1945 he joined the Manhattan Project, and one year later started working for the Bell Laboratories. At Bell Labs he worked with Claude Shannon and John Tukey. Hamming is best known for his work on error-detecting and error-correcting codes. His fundamental paper on this topic appeared in 1950.
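The Hamming distance defined above is simple to compute. A minimal sketch, assuming code words are given as equal-length strings or sequences of symbols (the name `hamming_distance` is illustrative):

```python
def hamming_distance(u, v):
    """Number of positions in which two equal-length code words differ."""
    if len(u) != len(v):
        raise ValueError("code words must have the same length")
    return sum(x != y for x, y in zip(u, v))


print(hamming_distance("10110", "11100"))  # the words differ in positions 2 and 4
```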

Heisenberg uncertainty principle Intrinsic property of quantum systems: the precise simultaneous knowledge of some pairs of basic physical properties, such as position and momentum, is simply forbidden. The uncertainty principle states that the uncertainty in determining the position ∆X and the uncertainty in determining the momentum ∆PX at position X are constrained by the inequality:

$$\Delta X \, \Delta P_X \geq \frac{\hbar}{2}$$

where ħ = h/2π is the reduced Planck constant.

Werner Heisenberg, German physicist, the founder of quantum mechanics (1901-1976). Heisenberg and his fellow student Pauli started their study of theoretical physics under Sommerfeld in 1920. In 1924-1925 he worked with Niels Bohr at the University of Copenhagen. In 1925 Heisenberg formulated matrix mechanics, the first coherent mathematical version of quantum mechanics. Matrix mechanics was further developed in 1926 in a paper co-authored with Born and Jordan. He is perhaps best known for the Uncertainty Principle, discovered in 1927. In 1928 Heisenberg published “The Physical Principles of Quantum Theory”. In 1932 he was awarded the Nobel Prize in physics for the creation of quantum mechanics. During the Second World War Heisenberg headed the unsuccessful German nuclear weapons project. In 1946 he was appointed director of the Max Planck Institute for Physics and Astrophysics at Göttingen.

Hermitian operator or matrix A linear operator or a matrix A with the property that A = A†, where A† is the dual, or the adjoint, of A. See also adjoint matrix.

Hertz The unit to measure the frequency of a periodic phenomenon, abbreviated as Hz. Named after the great German physicist Heinrich Hertz. 1 Hz = 1 cycle/second. Multiples of this unit are 1 kHz = 10³ cycles/second, 1 MHz = 10⁶ cycles/second, 1 GHz = 10⁹ cycles/second, etc.

Heinrich Hertz, a German physicist (1857-1894), the first to demonstrate experimentally the production and detection of Maxwell's electromagnetic waves. The photoelectric effect was first discovered in 1887 by Hertz, accidentally, while carrying out investigations on electromagnetic waves.

hidden variable theory A theory based upon the assumption that there is a variable or variables that determine the real properties of quantum particles. These variables have definite values from the moment the particle is created, and they determine the result of the measurement performed on that property of the quantum particle.

n-dimensional Hilbert space An n-dimensional complex vector space with an inner product and thus with a norm, denoted as Hn. An n-dimensional Hilbert space is isomorphic with Cⁿ. Each vector in Hn can be thought of as a column vector with n complex components. The norm of a vector a ∈ Hn satisfies || a ||² = 〈a, a〉, with 〈a, a〉 the inner product of a with itself. See also inner product in an n-dimensional Hilbert space.

David Hilbert, eminent German mathematician (1862-1943). In 1895, he was appointed to the chair of mathematics at the University of Göttingen and he remained there for the rest of his career. Hilbert’s first work was on invariant theory and, in 1888, he proved his famous Basis Theorem. He published Grundlagen der Geometrie in 1899, putting geometry in a formal axiomatic setting. He delivered the speech The Problems of Mathematics at the Second International Congress of Mathematicians in Paris, challenging mathematicians to solve fundamental questions such as: the continuum hypothesis, the well ordering of the reals, the Goldbach conjecture, the transcendence of powers of algebraic numbers, the Riemann hypothesis, and the extension of the Dirichlet principle. Hilbert’s work in integral equations led to the research in functional analysis and established the basis for his work on infinite-dimensional space, later called Hilbert space.

irreversible/non-invertible gate A gate characterized by the fact that, knowing the output, we cannot determine the input for all possible combinations of input values. The irreversibility of classical gates, other than NOT, means that there is an irretrievable loss of information and this has very serious consequences regarding the energy consumption of classical gates.

impure states of a single qubit Impure, or mixed, states are represented by points inside the Bloch sphere. This implies that the trace of the square of their density matrix is strictly less than one, Tr(ρ²) < 1. See also pure state of a single qubit.

inner product of two vectors If A = Fⁿ is an n-dimensional vector space over the field of complex numbers, F = C, then the scalar product of two vectors is also called the inner product of the two vectors. See also scalar product.

inner product in an n-dimensional Hilbert space The inner product of two vectors a = (a1, a2, . . . ai . . . an)T and b = (b1, b2, . . . bj . . . bn)T, a, b ∈ Hn, is denoted as 〈a, b〉. Then

$$\langle a, b \rangle = a^{\dagger} b = \sum_i a_i^{*} b_i$$

with ai* the complex conjugate of ai. In Dirac notation the inner product of two vectors | ψa〉, | ψb〉 ∈ Hn is written as 〈ψa | ψb〉. See also inner product.
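The formula for the inner product can be illustrated with Python's built-in complex numbers. The function name `inner` and the sample vectors are illustrative choices; note that the first argument is the one that gets conjugated, as in the definition above.

```python
def inner(a, b):
    """Inner product <a, b> = sum_i conj(a_i) * b_i of two complex vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x.conjugate() * y for x, y in zip(a, b))


a = [1 + 1j, 2 + 0j]
b = [1j, 1 - 1j]

print(inner(a, b))  # <a, b>
print(inner(b, a))  # <b, a>, the complex conjugate of <a, b>
print(inner(a, a))  # <a, a>, real and non-negative: the squared norm of a
```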

isothermal Term frequently used in thermodynamics meaning at a constant temperature.

key distribution Mechanism for distribution of cryptographic keys, in particular of public keys.

Landauer principle When a computer erases one bit of information, the amount of energy dissipated into the environment is at least kBT ln(2), with kB the Boltzmann constant and T the temperature of the environment. (An equivalent formulation: the entropy of the environment increases by at least kB ln(2) when one bit of information is erased.) The Landauer principle traces the energy consumption in a computation to the act of erasing information.
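To give a feeling for the size of the Landauer bound kBT ln(2), the sketch below evaluates it numerically. The temperature of 300 K is an assumed example value (roughly room temperature), and the function name `landauer_limit` is illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact value in the SI)


def landauer_limit(temperature_kelvin, bits=1):
    """Minimum energy in joules dissipated when erasing `bits` bits at temperature T."""
    return bits * K_B * temperature_kelvin * math.log(2)


e_bit = landauer_limit(300.0)           # one bit at an assumed T = 300 K
e_gigabyte = landauer_limit(300.0, 8e9)  # 10^9 bytes erased at the same temperature
print(e_bit, e_gigabyte)
```

At 300 K the bound is roughly 2.9 × 10⁻²¹ J per erased bit, many orders of magnitude below the energy dissipated per operation by conventional logic circuits.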

latency The time needed for an activity to complete.

light A form of electromagnetic radiation.

linear operator A linear operator A between two vector spaces 𝒜 and ℬ over the field F is any function from 𝒜 to ℬ, A : 𝒜 → ℬ, linear in its inputs

$$A\Big(\sum_i c_i \alpha_i\Big) = \sum_i c_i A(\alpha_i)$$

with αi ∈ 𝒜, ci ∈ F, and A(αi) ∈ ℬ.

Manhattan project A United States government research project to produce an atomic bomb. It was called the Manhattan Project because the first research had been done at Columbia University in Manhattan. The main research and development activity was later transferred to Los Alamos, in New Mexico, site of the Los Alamos National Laboratory.

maximum likelihood decoding Decoding strategy in which a received n-tuple is decoded into the code word that minimizes the probability of error.

matrix in a Hilbert space An m × n matrix A (m rows and n columns) is regarded as a linear operator from an n-dimensional Hilbert space, Hn, to an m-dimensional Hilbert space, Hm, namely

A : Hn −→ Hm.

Maxwell demon In 1871 Maxwell proposed the following “thought” experiment. Imagine the molecules of a gas in a cylinder divided in two by a wall which has a slit; the slit is covered with a door controlled by a little demon. The demon examines every molecule of gas and determines its velocity; those of high velocity on the left side are allowed to migrate to the right side and those with low velocity on the right side are allowed to migrate to the left side. As a result of these measurements, the demon separates hot from cold in blatant violation of the Second Law of Thermodynamics.

measurement of a quantum system The process which makes a connection between the quantum and the classical worlds; generally considered as an irreversible operation which destroys quantum information about an observable (property) of a quantum system and replaces it with classical information. In quantum mechanics, the measurement of an observable of a quantum system, such as momentum, energy, or spin, is associated with a Hermitian operator, A, on the Hilbert space of state vectors of the system. If v is an eigenvector of A with eigenvalue λ, then measuring the system in a state described by state vector v will always give the result λ. If the state vector is not an eigenvector of A, the measurement process forces the system to jump (“collapse”) randomly to a state corresponding to a state vector vi, an eigenvector of A. The result of the measurement is the corresponding eigenvalue of A, λi.

See also projective measurement.

mixed state A quantum system whose state | ψ〉 is not known precisely is said to be in a mixed state. A mixed state is a statistical mixture (ensemble) of different pure states, not a coherent superposition of them; the system is in a state | φi〉 with probability pi. The density operator of a quantum system in a mixed state is ρ = ∑i pi | φi〉〈φi |. The trace of the square of the density operator satisfies Tr(ρ²) < 1. A mixed state is also called an impure state. See also pure state of a single qubit.
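The difference between a mixture and a superposition can be checked numerically by building ρ = ∑i pi | φi〉〈φi | and computing Tr(ρ²). The sketch below uses plain Python lists; the helper names `density` and `purity` and the three example ensembles are illustrative assumptions.

```python
import math


def density(ensemble):
    """Density matrix rho = sum_i p_i |phi_i><phi_i| for a list of (p_i, phi_i) pairs."""
    dim = len(ensemble[0][1])
    rho = [[0j] * dim for _ in range(dim)]
    for p, phi in ensemble:
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += p * phi[i] * phi[j].conjugate()
    return rho


def purity(rho):
    """Tr(rho^2) = sum_{i,k} rho_ik * rho_ki."""
    n = len(rho)
    return sum(rho[i][k] * rho[k][i] for i in range(n) for k in range(n)).real


s = 1 / math.sqrt(2)
pure_basis = density([(1.0, [1.0, 0.0])])                      # the state |0>
superposition = density([(1.0, [s, s])])                       # (|0> + |1>)/sqrt(2)
mixture = density([(0.5, [1.0, 0.0]), (0.5, [0.0, 1.0])])      # |0> or |1>, each with p = 1/2

print(purity(pure_basis), purity(superposition), purity(mixture))
```

The superposition is still a pure state, so its purity is 1 up to rounding, whereas the equal mixture of | 0〉 and | 1〉 gives Tr(ρ²) = 1/2.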

model of a physical system Abstraction used to study the properties of a system. For example, a Turing machine is a high level model capturing essential properties of a computer. A model captures only the “relevant” properties of the system. Here “relevant” means important for the particular type of study the model is designed for. We cannot draw any conclusion regarding the heat dissipation of a computer from the Turing machine model because the model does not take into account physical characteristics of a computer, such as the energy required for an elementary operation. Sometimes, we construct a scale model of a physical system. Such a model allows us to study some properties of the system without actually building it. A scale model can be used to perform experiments that are unfeasible, very difficult, or very costly to carry out. For example, the model of an airplane wing allows us to draw conclusions about its lift properties.

n-tuple Vector with n components. Each component is a symbol from an input alphabet; for example, a binary 4-tuple is a vector with four components, each one being either 0 or 1.

norm of a vector in a Hilbert space A non-negative function which measures the “length” of a vector. If | ψ〉 ∈ Hn then the norm, || | ψ〉 ||, can be computed from the inner product of the state vector with itself: || | ψ〉 ||² = 〈ψ | ψ〉.

normal operator in a Hilbert space An operator O on Hn with the property that [O, O†] = OO† − O†O = 0.

normal unitary basis of an n-dimensional Hilbert space Hn A set of n vectors | ψ0〉, | ψ1〉, · · · | ψi〉, · · · | ψn−1〉 where each vector has the norm (or “length”) equal to one, || | ψ0〉 || = || | ψ1〉 || = . . . = || | ψn−1〉 || = 1, and any two of them are orthogonal, 〈ψi | ψj〉 = 0 for i ≠ j.

normalization condition in quantum mechanics Requirement that the squares of the moduli of the projections of a state vector on the orthonormal basis sum to 1. This translates into the condition that the sum of probabilities of all possible outcomes of a measurement of the quantum system must be equal to 1.

numerical simulation (exact/approximate) of a physical system New methodology of science which became available after the introduction of the stored program computer in the mid 1940s. We use a computer to study the behavior of a physical system under different conditions. For numerical simulation we first construct a model of the physical system, then we design a program that reflects the essential properties of the physical system captured by the model. The results of simulation are only as good as the model is. In 1982 Richard Feynman argued that in traditional numerical simulations such as weather forecasting or aerodynamic calculations, computers model physical reality only approximately. He advanced the idea that physics was computational and that a quantum computer could do an exact simulation of a physical system, even of a quantum system. Numerical simulation complements the two traditional scientific methods, theoretical modelling and experiments.

Richard Phillips Feynman, American physicist (1918-1988). He received his doctorate from Princeton in 1942 under J. A. Wheeler (he was also advised by E. Wigner) and worked on the atomic bomb project at Princeton University (1941-42) and then at Los Alamos (1943-45). Feynman’s main contribution was to quantum mechanics. He introduced diagrams (now called Feynman diagrams) that are graphic analogues of the mathematical expressions needed to describe the behavior of systems of interacting particles. He was awarded the Nobel Prize in 1965, jointly with Schwinger and Tomonaga, for fundamental work in quantum electrodynamics and physics of elementary particles. His later work led to the current theory of quarks, fundamental in pushing forward an understanding of particle physics. He made significant contributions to the field of quantum computing as well. According to his obituary published in the Boston Globe “He was widely known for his insatiable curiosity, gentle wit, brilliant mind and playful temperament.”

observable A physical property of a quantum system that can be measured by an externalobserver. Mathematically each observable X has an associated operator MX .

orthogonal state vectors Two vectors | ψa〉 and | ψb〉 in Hn are orthogonal, and we write | ψa〉 ⊥ | ψb〉, if their inner product is zero

〈ψa | ψb〉 = 0 =⇒ | ψa〉 ⊥ | ψb〉

orthonormal basis in a Hilbert space Hn A basis consisting of a set of n orthonormal vectors such as | 0〉, | 1〉, . . . , | i〉, . . . , | n − 1〉. Any two vectors from the set are orthogonal and all have a norm equal to 1. See also normal unitary basis in a Hilbert space Hn.

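outer product in a Hilbert space Hn The outer product of a ket vector and a bra vector, | ψa〉〈ψb |, is a linear operator. For example, in H3 we have

$$|\psi_a\rangle\langle\psi_b| =
\begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \end{pmatrix}
\begin{pmatrix} \beta_0^{*} & \beta_1^{*} & \beta_2^{*} \end{pmatrix}
=
\begin{pmatrix}
\alpha_0\beta_0^{*} & \alpha_0\beta_1^{*} & \alpha_0\beta_2^{*} \\
\alpha_1\beta_0^{*} & \alpha_1\beta_1^{*} & \alpha_1\beta_2^{*} \\
\alpha_2\beta_0^{*} & \alpha_2\beta_1^{*} & \alpha_2\beta_2^{*}
\end{pmatrix}$$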
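The matrix form of the outer product can be reproduced with a short sketch. The function name `outer` and the sample amplitudes are illustrative; the entry in row i, column j is αi βj*, as in the H3 example above.

```python
def outer(a, b):
    """Matrix of the operator |a><b|: entry (i, j) is a_i * conj(b_j)."""
    return [[x * y.conjugate() for y in b] for x in a]


def apply(matrix, v):
    """Apply a matrix to a column vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in matrix]


psi_a = [1 + 0j, 1j, 0j]
psi_b = [0j, 1 + 0j, 1j]
op = outer(psi_a, psi_b)

# Acting on a vector |c>, the operator |a><b| returns <b|c> |a>.
c = [0j, 2 + 0j, 0j]
print(op)
print(apply(op, c))
```

Here 〈ψb | c〉 = 2, so the result of applying the operator to | c〉 is 2 | ψa〉.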

parity check symbols Symbols added to a message to increase the redundancy and support error correcting and/or error detecting capabilities of a code.

Pauli matrices Two by two matrices σ1, σ2, σ3 describing transformations (rotations) of a single qubit. They are Hermitian, σi† = σi for i = 1, 2, 3; they square to unity, σ1² = σ2² = σ3² = 1; and they satisfy the relation σ1σ2 = iσ3 (this is also true for a cyclic permutation of indices), as well as the relation σ1σ2 + σ2σ1 = 0 (this is also true for a cyclic permutation of indices).
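The algebraic relations listed for the Pauli matrices can be verified directly. The explicit matrices below are the standard representation, which is an assumption here since this entry does not write them out; the helper names are illustrative.

```python
# Standard representation of the Pauli matrices (assumed; not given in this entry).
SIGMA1 = [[0, 1], [1, 0]]
SIGMA2 = [[0, -1j], [1j, 0]]
SIGMA3 = [[1, 0], [0, -1]]
IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def dagger(a):
    """Adjoint: transpose, then complex-conjugate each element."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


for s in (SIGMA1, SIGMA2, SIGMA3):
    print(dagger(s) == s, matmul(s, s) == IDENTITY)   # Hermitian, squares to unity

print(matmul(SIGMA1, SIGMA2) == scale(1j, SIGMA3))    # sigma1 sigma2 = i sigma3
print(add(matmul(SIGMA1, SIGMA2), matmul(SIGMA2, SIGMA1)) == ZERO)  # anticommute
```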

Pauli exclusion principle Consider a pair of electrons in the same orbital around the nucleus of an atom. The Pauli exclusion principle dictates that the two electrons cannot be in identical states, including the spin. They must have their spins oriented in opposite directions because they already share three quantum numbers. If, as a result of an experiment, one of the electrons is made to change the orientation of its spin, then a simultaneous measurement of the other finds it in a state with opposite spin.

Wolfgang Ernst Pauli, Austrian-born physicist (1900-1958). In 1924 Pauli proposed a quantum spin number for electrons. Best known for his (Pauli) exclusion principle, proposed in 1925, which states that no two electrons in an atom can have the same four quantum numbers. He predicted mathematically, in 1931, that conservation laws required the existence of a new particle which he proposed to call the “neutron”. In 1933 he published his prediction and he made the claim that the particle had zero mass. The particle which we now know as the neutron has a non-zero mass and had been discovered by Chadwick in 1932. Pauli’s particle was named the “neutrino” by Fermi in 1934 and at that time he correctly stated that it was not a constituent of the nucleus of an atom. The neutrino was later found experimentally.

photoelectric effect When light shines on a negatively charged metal plate, the surface emits electrons. This phenomenon was explained by Albert Einstein in 1905 based upon Planck’s quantum theory. Einstein got the Nobel Prize in 1921 for explaining the photoelectric effect.

photon From the Greek word “photos” meaning light. A light particle.

polarization of light As an electromagnetic radiation, light consists of an electric and a magnetic field perpendicular to each other and, at the same time, perpendicular to the direction the energy is transported by the electromagnetic (light) wave. The electric field oscillates in a plane perpendicular to the direction of flight and the way the electric field vector travels in this plane defines the polarization of the light. When the electric field oscillates along a straight line we say that the light is linearly polarized. When the end of the electric field vector moves along an ellipse, the light is elliptically polarized. When the end of the electric field vector moves around a circle, the light is circularly polarized. If the light comes toward us and the end of the electric field vector moves around in a counterclockwise direction, we say that the light has right-hand polarization; if the end of the electric field vector moves in a clockwise direction, we say that the light has left-hand polarization.

polarization measurements Polarization is measured by passing a photon through a polarizer. If the polarizer axis is oriented parallel to the polarization of the photon, the photon passes through unimpeded. If it is oriented perpendicular to the polarization of the photon, the photon is absorbed. At an intermediate angle, the photon will have a certain probability of being transmitted.

polarization filter A partially transparent material that transmits light of a particular polarization.

projective measurement A projective measurement is characterized by a set of projectors Mi such that ∑i Mi = I and MiMj = δijMi. The outcome of the measurement is the index i associated with Mi. The probability of outcome i for a system in state | ψ〉 is pi = || Mi | ψ〉 ||² = 〈ψ | Mi | ψ〉. Given the outcome i, the quantum state “collapses” to the state Mi | ψ〉/√pi.
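The probability and collapse rules of a projective measurement can be traced on a single qubit. In this sketch the projectors M0 = | 0〉〈0 | and M1 = | 1〉〈1 | and the state with amplitudes 0.6 and 0.8i are assumed example values, and the helper names are illustrative.

```python
import math


def apply(matrix, v):
    """Apply a matrix to a column vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in matrix]


def measure_outcome(projector, state):
    """Return (p_i, post-measurement state) for one projector M_i."""
    projected = apply(projector, state)            # M_i |psi>
    p = sum(abs(x) ** 2 for x in projected)        # p_i = || M_i |psi> ||^2
    post = [x / math.sqrt(p) for x in projected] if p > 0 else None
    return p, post


M0 = [[1, 0], [0, 0]]   # projector on |0>
M1 = [[0, 0], [0, 1]]   # projector on |1>
psi = [0.6, 0.8j]       # normalized: 0.36 + 0.64 = 1

p0, post0 = measure_outcome(M0, psi)
p1, post1 = measure_outcome(M1, psi)
print(p0, post0)
print(p1, post1)
```

The two probabilities add up to one because M0 + M1 = I, and each post-measurement state is again normalized.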

projection operator, projector The outer product of a unit vector with itself

$$P_a = |\Psi_a\rangle\langle\Psi_a|$$

It has the defining property Pa² = Pa.

A complete set of orthogonal projectors in Hn is a set {P0, P1, . . . Pi . . . Pm−1} such that

$$\sum_{i=0}^{m-1} P_i = I$$

with I the identity operator.

public key cryptography Communicating entities have both a private and a public key. A secure message is sent to an entity E by encrypting it with E’s public key; E decrypts the message with its own private key.

pure state of a single qubit Pure states of a single qubit are represented by points on the Bloch sphere. This implies that the trace of the square of their density matrix is one, Tr(ρ²) = 1. See also mixed or impure state of a single qubit.

quantum Latin word meaning some defined quantity. In physics it is used with the same meaning as discrete in mathematics.

quantum computing New discipline emerging during the last three decades of the 20th century at the intersection of computing and quantum physics.

quantum communication channel Physical media allowing two parties to exchange quantum information. For example, an optic fiber allowing photons with a certain polarization to circulate from a source to a destination.

quantum computer Device able to store and transform quantum information.

quantum information Information based upon quantum mechanics. The information is encoded as a property of a quantum particle, e.g., the spin of an electron, or the polarization of a photon.

quantum mechanics Discipline of modern physics founded by Heisenberg.

quantum parallelism Term capturing the fact that a quantum computer can manipulate an exponential set of inputs simultaneously. For example, consider a function f(x) whose argument x is a binary vector of n bits, so that x can take l = 2ⁿ distinct values x1, x2, . . . xl. A quantum computer can evaluate f(x1), f(x2), . . . f(xl) simultaneously by applying f to a superposition of all the inputs.

quantum particle Atomic or sub-atomic particle obeying the laws of quantum mechanics.

qubit Quantum bit. Mathematical abstraction for a quantum system capable of storing one bit of information.

ray in a Hilbert space A mathematical abstraction that exhibits only direction. It can be represented as a straight line through the origin of the coordinate system.

RSA algorithm Public key encryption algorithm named after its inventors, Rivest, Shamir, and Adleman.

scalar product of two vectors Given an n-dimensional vector space A = Fⁿ over the field F, the scalar product of α, β ∈ A is denoted by 〈α, β〉. The scalar product is a bilinear map with the property 〈α, β〉 = 〈β, α〉. If F = C, the field of complex numbers, then the scalar product is also called the inner product.

Schrödinger equation Equation introduced by Erwin Schrödinger for the wave function Ψn(q) of a stationary state of energy En of an atom in terms of the Hamiltonian function H(q, p)

$$H\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)\Psi_n(q) = E_n \Psi_n(q).$$

The dynamics of the wave function is governed by the following partial differential equation

$$i\,\frac{h}{2\pi}\,\frac{\partial}{\partial t}\Psi(q,t) = H\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)\Psi(q,t).$$

Erwin Schrödinger, Austrian-born physicist (1887-1961). He is one of the founders of quantum physics, and has made significant contributions to statistical mechanics and the general theory of relativity. Schrödinger transferred the idea of a wave associated to a particle, predicted by de Broglie, to Bohr’s atomic model. In 1927 he showed the mathematical equivalence of his equation HΨ = EΨ and Heisenberg's matrix mechanics. The same year he became Max Planck’s successor for the theoretical physics chair at the University of Berlin. He was awarded the Nobel Prize in Physics in 1933 (together with Paul Dirac) for his contributions to the development of Quantum Mechanics.

Schwartz inequality Inequality satisfied by any state vectors (| ψa〉, | ψb〉) ∈ Hn

$$\langle\psi_a|\psi_a\rangle\,\langle\psi_b|\psi_b\rangle \geq |\langle\psi_a|\psi_b\rangle|^2$$

Second Law of Thermodynamics The entropy of an isolated system is a non-decreasing function of time.

Shannon theorem Relates the channel capacity for a noisy channel to the signal-to-noise ratio and the bandwidth of the noiseless channel.

Claude Elwood Shannon, American mathematician and electrical engineer (1916-2001), founder of modern information theory. He graduated from the University of Michigan in 1936 with bachelor’s degrees in mathematics and electrical engineering. In 1940 he earned both a master’s degree in electrical engineering and a Ph.D. in mathematics from the Massachusetts Institute of Technology (MIT). Shannon joined the mathematics department at Bell Labs in 1941 and remained affiliated with the Labs until 1972. He became a permanent member of the faculty at MIT in 1958, and a professor emeritus in 1978. In 1948 Shannon published his landmark work “A Mathematical Theory of Communication” and in 1949 he published the “Communication Theory of Secrecy Systems”, generally credited with transforming cryptography from an art to a science.

singlet electron state The antisymmetric state of a pair of electrons with anti-parallel spins, (1/√2)(|↑↓〉 − |↓↑〉). The electrons have different spin quantum numbers, +1/2 and −1/2, and the total spin of the state is zero. See also triplet electron state.

spectral decomposition of a normal operator in a Hilbert space In Hn every normal operator N has n orthonormal eigenvectors | ni〉 and, correspondingly, n eigenvalues λi. If Pi are the projectors corresponding to these eigenvectors, Pi = | ni〉〈ni |, then the operator N has the spectral decomposition

$$N = \sum_i \lambda_i P_i$$
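As a worked instance of the spectral decomposition N = ∑i λi Pi, the sketch below rebuilds the operator with matrix ((0, 1), (1, 0)) from its eigenvalues +1 and −1 and the eigenvectors (| 0〉 ± | 1〉)/√2. The choice of this operator and the helper names are illustrative assumptions.

```python
import math


def projector(v):
    """P = |v><v| for a normalized vector v."""
    return [[x * complex(y).conjugate() for y in v] for x in v]


def spectral_sum(pairs):
    """Build N = sum_i lambda_i |n_i><n_i| from (lambda_i, n_i) pairs."""
    dim = len(pairs[0][1])
    n = [[0j] * dim for _ in range(dim)]
    for lam, vec in pairs:
        p = projector(vec)
        for i in range(dim):
            for j in range(dim):
                n[i][j] += lam * p[i][j]
    return n


s = 1 / math.sqrt(2)
plus = [s, s]     # eigenvector with eigenvalue +1
minus = [s, -s]   # eigenvector with eigenvalue -1

N = spectral_sum([(+1, plus), (-1, minus)])
print(N)  # close to [[0, 1], [1, 0]] up to floating-point rounding
```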

spin The observable associated with the intrinsic rotation of the electron is the intrinsic angular momentum, also called the spin angular momentum. The “spin” is the quantum number characterizing the intrinsic angular momentum of the electron and, for that matter, of other quantum particles.

superposition probability rule If an event in quantum mechanics may occur in two or more indistinguishable ways, then the probability amplitude of the event is the sum of the probability amplitudes of each case considered separately.

superposition state of a quantum system If the states | 1〉, | 2〉, . . . | n〉 of a quantum system are distinguishable and if the complex numbers αi satisfy the condition ∑i | αi |² = 1, then the state ∑i αi | i〉 is a valid quantum state called a superposition state. See also basis state of a quantum system.

Stern-Gerlach experiment Experiment revealing the spin of quantum systems.

teleportation In a science fiction context: making an object or person disintegrate in one place and have it reembodied as the same object or person somewhere else. In the context of quantum information theory: “a way to scan out part of the information from an object A, which one wishes to teleport, while causing the remaining, unscanned, part of the information to pass, via the Einstein-Podolsky-Rosen effect, into another object C which has never been in contact with A. Later, by applying to C a treatment depending on the scanned-out information, it is possible to maneuver C into exactly the same state as A was in before it was scanned.” (http://www.research.ibm.com/quantuminfo/teleportation). In this process the original state is destroyed.

tensor product of two vector spaces The tensor product A ⊗ B of two vector spaces A and B over the same field F is the dual G(A, B)∗ of the space G(A, B) of bilinear functions from A and B to F.

tensor product of two Hilbert spaces The tensor product of Hn and Hm is Hn ⊗ Hm = Hnm: given that (e0, e1, . . . en−1) is an orthonormal basis for Hn and (f0, f1, . . . fm−1) is an orthonormal basis for Hm, the nm vectors ei ⊗ fj, with 0 ≤ i ≤ n − 1 and 0 ≤ j ≤ m − 1, form an orthonormal basis for Hnm.

tensor product of two linear operators or matrices in a Hilbert space The tensor product of an m × n matrix and a p × q matrix is an mp × nq matrix. For example, the tensor product of vectors (a, b) and (c, d) is the vector

$$\begin{pmatrix} a \\ b \end{pmatrix} \otimes \begin{pmatrix} c \\ d \end{pmatrix} = \begin{pmatrix} ac \\ ad \\ bc \\ bd \end{pmatrix}.$$
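The tensor (Kronecker) product of two matrices can be sketched with nested lists; column vectors are represented as matrices with a single column. The function name `kron` and the sample values are illustrative.

```python
def kron(a, b):
    """Tensor product of an m x n matrix a and a p x q matrix b (nested lists).

    The result has m*p rows and n*q columns; each entry of a multiplies
    a full copy of b.
    """
    return [[x * y for x in row_a for y in row_b]
            for row_a in a
            for row_b in b]


# Column vectors (a, b) = (1, 2) and (c, d) = (3, 4) give (ac, ad, bc, bd).
v = kron([[1], [2]], [[3], [4]])
print(v)

# A 2 x 2 matrix tensored with the 2 x 2 identity gives a 4 x 4 matrix.
X = [[0, 1], [1, 0]]
I2 = [[1, 0], [0, 1]]
print(kron(X, I2))
```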

Toffoli gate A three-qubit gate with two control inputs, a and b, and one target input c. The outputs are a′ = a, b′ = b, and c′ = c ⊕ (a AND b): the target is flipped when both control inputs are 1 and is left unchanged otherwise. In particular, when c = 1 then c′ = 1 ⊕ (a AND b) = NOT(a AND b). The Toffoli gate is a universal gate and it is reversible. There are both classical and quantum versions of the Toffoli gate.
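The classical Toffoli gate is a one-line function on bits. The sketch below (the name `toffoli` is illustrative) also shows the two properties mentioned in the entry: with the target preset to 1 the third output is NAND of the controls, and applying the gate twice restores the inputs.

```python
def toffoli(a, b, c):
    """Classical Toffoli gate: flip the target c when both controls a, b are 1."""
    return a, b, c ^ (a & b)


# With c = 1 the third output is NOT(a AND b), i.e. a NAND gate.
for a in (0, 1):
    for b in (0, 1):
        print((a, b), "NAND ->", toffoli(a, b, 1)[2])

# Reversibility: the gate is its own inverse.
print(toffoli(*toffoli(1, 1, 0)))
```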

Tommaso Toffoli, professor of Electrical Engineering at Boston University.

transpose matrix A matrix whose rows are the columns of the original matrix (the rows of the original matrix become the columns of the transpose).


triplet electron state The state of a pair of electrons with parallel spins, |↑〉 |↑〉 or |↓〉 |↓〉, or in the symmetric superposition of anti-parallel spins (1/√2)(|↑〉 |↓〉 + |↓〉 |↑〉). The total spin of a triplet state is 1.

truth table A method to characterize the output of a logic circuit. One constructs a table with one entry for every possible combination of input values.

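universal Turing machine A Turing machine is an abstract machine which moves from one state to another using a precise finite set of rules (given by a finite table), depending on a single symbol it reads from a tape. A universal Turing machine is a Turing machine able to simulate any other Turing machine whose description is written on its tape.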

Alan Mathison Turing, British mathematician (1912-1954) regarded as the founder of modern Computer Science. He entered King’s College, Cambridge in 1931 and graduated in 1934 with a degree in mathematics. He was elected a fellow of King’s College in 1935. In 1936 he published “On Computable Numbers, with an Application to the Entscheidungsproblem”. Turing defined a computable number as a real number whose decimal expansion could be produced by a Turing machine starting with a blank tape. In 1939 Turing started to work full-time at the Government Code and Cypher School at Bletchley Park. Together with W. G. Welchman, Turing developed a machine which from late 1940 was able to decipher all messages sent by the German Enigma encryption machines of the Luftwaffe (the German air force during the Second World War). After the war Turing was invited by the National Physical Laboratory in London to design a computer. In March 1946 he submitted a report proposing the Automatic Computing Engine (ACE), an original design for a modern computer. In 1948 he moved to Manchester and in 1950 he published a paper, “Computing Machinery and Intelligence”, in the journal Mind, where he proposed the Turing Test. This test is still widely cited in discussions of whether a computer can be intelligent. Alan Turing was elected a Fellow of the Royal Society of London in 1951.

trace of a matrix The trace of a matrix is the sum of its diagonal elements.

trace of a linear operator The trace of an operator is the trace of the matrix associated with the operator.

unitary matrix A matrix A = [aij] with complex elements, aij ∈ C, is said to be unitary if A†A = I, where I is the identity matrix. Here A† is the adjoint of A, a matrix obtained from A by first constructing AT, the transpose of A, and then taking the complex conjugate of each element (or by first taking the complex conjugate of each element and then transposing the matrix). The determinant of a unitary matrix is a complex number of modulus 1.
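The defining condition can be tested numerically for small matrices. The following is a minimal sketch in plain Python; the helper names (`adjoint`, `matmul`, `is_unitary`, `det2`) and the tolerance are assumptions made for this illustration. It uses the Hadamard matrix H and the phase matrix S as examples; both satisfy A†A = I, and their determinants (−1 and i) have modulus 1 without being equal to 1.

```python
import math

def adjoint(A):
    """Conjugate transpose: entry (j, i) is the complex conjugate of A[i][j]."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    """Ordinary matrix product of two lists-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def is_unitary(A, tol=1e-12):
    """True when A†A equals the identity up to the tolerance tol."""
    P = matmul(adjoint(A), A)
    n = len(P)
    return all(abs(P[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]      # Hadamard matrix
S = [[1, 0], [0, 1j]]      # phase matrix

print(is_unitary(H), det2(H))   # unitary, determinant close to -1
print(is_unitary(S), det2(S))   # unitary, determinant equal to i
print(is_unitary([[1, 1], [0, 1]]))  # a shear matrix is not unitary
```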

unitary operator A linear operator A on a Hilbert space that preserves the inner product,thus the distance. See also unitary matrix.

universal set of quantum gates A set of gates with the property that there exists a network of them capable of implementing any unitary operation.

vector space An algebraic structure consisting of: (i) an Abelian group (V, +) whose elements are called “vectors” and whose binary operation “+” is called addition; (ii) a field F of numbers (either R, the real numbers, or C, the complex numbers), whose elements are called “scalars”; and (iii) an operation called “multiplication with scalars” and denoted by “·”, which associates to any scalar c ∈ F and vector α ∈ V a new vector c · α ∈ V.

wave function Function describing the state of a stationary system and the evolution in time of a non-stationary system. See also Schrodinger equation.

Young double slit experiment Experiment revealing the interference phenomena related to the wave-like behavior of light (photons).


