Introduction to Quantum Machine Learning
M. Hilke (Quantum Nano Electronics Laboratory)
Why Quantum Machine Learning?
Hype curve? Money?
Structure:
• Machine Learning
• Quantum Machine Learning
Machine Learning:
Matlab demo
Idea of Machine Learning:
[Figure: "me" (photo) vs. "not me" (a neuron, with its axon)]
Neural network: 10^11 neurons and 10^14 synapses in the human brain; 3×10^11 neurons in an elephant brain (a six-core i7 has ~10^9 transistors).
Principle of a Deep Convolutional Neural Network for Face Recognition
Input layer: feed the input picture into the neural network.
Add a neuron layer (layer 1), then add more layers (layer 2, …): a Deep Neural Network is a network with more than 1 layer.
Finally add an output layer: input layer → layer 1 → layer 2 → output layer.
Artificial Neuron
[Diagram: layers 1 and 2; $L_1(n)$ = value of neuron $n$ at layer 1, $n = 1,\dots,5$]
General Computation Flow
[Diagram: the layer-1 neurons $L_1(1),\dots,L_1(5)$ feed one layer-2 neuron through axon weights $W_{12}(1),\dots,W_{12}(5)$]
• $L_1(n)$: value of neuron $n$ at layer 1
• $W_{12}(n)$: axon weight between layer 1 (neuron $n$) and layer 2
• Neuron input (layer 2): $x = \sum_{n=1}^{5} W_{12}(n)\, L_1(n)$
• $b_2$: threshold value of the neuron
• Neuron output: $L_2 = \frac{1}{1 + e^{-x + b_2}}$, with $0 < L_2 < 1$
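As a minimal sketch of this computation (Python/NumPy; the neuron values, weights and threshold are made-up numbers, not from the deck):

```python
import numpy as np

# Values of the five layer-1 neurons, L1(1)...L1(5) (hypothetical numbers)
L1 = np.array([0.2, 0.9, 0.1, 0.5, 0.7])
# Axon weights W12(n) between layer-1 neuron n and the layer-2 neuron
W12 = np.array([0.4, -0.3, 0.8, 0.1, -0.6])
b2 = 0.5  # threshold value of the layer-2 neuron

x = np.dot(W12, L1)                  # neuron input: x = sum_n W12(n) L1(n)
L2 = 1.0 / (1.0 + np.exp(-x + b2))   # neuron output (sigmoid), so 0 < L2 < 1
print(x, L2)
```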
Convolution layer
Fully Connected Layers
[Diagram: input vector $\vec I = (I_1, I_2, I_3, I_4)$ → layer 1 → layer 2 → output layer, with weight matrices $W_{I1}$, $W_{12}$, $W_{2O}$ and biases $b_1$, $b_2$]
In vector form, each fully connected layer computes, e.g., $\vec L_2 = \frac{1}{1 + \exp(-W_{12}\vec L_1 + \vec b_2)}$.
At the output layer $\vec O$: the largest output wins!
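A sketch of the full forward pass under the same conventions (the layer sizes match the diagram: 4 inputs, 5 and 6 hidden neurons, 3 outputs; the random weights are placeholders):

```python
import numpy as np

def layer(W, L_prev, b):
    """One fully connected layer: L = 1 / (1 + exp(-W L_prev + b))."""
    return 1.0 / (1.0 + np.exp(-W @ L_prev + b))

rng = np.random.default_rng(0)
I = np.array([0.1, 0.8, 0.3, 0.5])                      # input vector (I1..I4)
W_I1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)  # input -> layer 1
W_12, b2 = rng.normal(size=(6, 5)), rng.normal(size=6)  # layer 1 -> layer 2
W_2O, bO = rng.normal(size=(3, 6)), rng.normal(size=3)  # layer 2 -> output

L1 = layer(W_I1, I, b1)
L2 = layer(W_12, L1, b2)
O = layer(W_2O, L2, bO)
print("winning output:", np.argmax(O))  # the largest output wins
```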
The learning phase of a neural network requires large amounts of training data and can take a lot of processing time; recognizing one picture afterwards is fast (< 0.1 s). Typically, on a laptop: training > 1 hr and evaluating < 0.1 s, for a "simple" (few-task) neural network.
Monkey faces training data (~1,000 pictures)
Female faces training data (~1,000 pictures)
Male faces training data (~1,000 pictures)
[Diagram: input layer → layer 1 → layer 2 → output layer, with every axon weight ($W_{I1}$, $W_{12}$, $W_{2O}$) and every bias ($b_I(1..4)$, $b_1(1..5)$, $b_2(1..6)$, $b_O(1..3)$) labeled]
Training the network means finding the optimal b and W.
Training of a Deep Convolutional Neural Network
Minimize a cost function C, e.g. quadratic, $C = \sum (\text{desired output} - \text{NN output})^2$ (or cross-entropy). Then update the NN parameters over many iterations by gradient descent, $W \to W - \eta\,\partial C/\partial W$, where $\eta$ is the learning rate.
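A minimal sketch of such a training loop for a single sigmoid neuron (NumPy; quadratic cost and a hypothetical two-example training set; the gradient follows from the deck's $L_2 = 1/(1+e^{-x+b_2})$):

```python
import numpy as np

# Toy training set: layer-1 activations -> desired outputs (made-up data)
X = np.array([[0.2, 0.9, 0.1, 0.5, 0.7],
              [0.9, 0.1, 0.8, 0.2, 0.3]])
y = np.array([1.0, 0.0])

W, b, eta = np.zeros(5), 0.0, 0.5        # weights, threshold, learning rate

for it in range(1000):                   # iterations
    x = X @ W                            # neuron inputs
    a = 1.0 / (1.0 + np.exp(-x + b))     # NN outputs
    # Quadratic cost C = 1/2 sum (NN output - desired output)^2; chain rule:
    delta = (a - y) * a * (1.0 - a)
    W -= eta * X.T @ delta               # W -> W - eta dC/dW
    b += eta * delta.sum()               # dC/db has the opposite sign (e^{-x+b})
print("outputs after training:", a)
```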
Once the network is trained, it is very powerful for specific tasks:
2014: DeepFace (Facebook AI Research) comes close to human performance in face recognition.
2016: AlphaGo, developed by the Google DeepMind team, beats humans at Go.
But: it takes a lot of time to find the best ~100 million parameters!
Good resources to do it yourself:
1) http://neuralnetworksanddeeplearning.com/ (a good introduction with ML code in Python)
2) http://www.deeplearningbook.org/ (by the masters: Goodfellow, Bengio and Courville)
3) Open-source Matlab CNN code
What about quantum? (Source: Backreaction)
Classical image: 16×16 pixels with 256 grey tones = 65,536 bits of data, or 8 kB uncompressed (BMP); ~1 kB compressed.
Quantum image: 16×16 spin-½ particles (qubits: spin up / spin down) = 256 qubits, i.e. 2^(16×16) = 2^256 ≈ 10^77 worth of data. To describe 256 qubits classically one would need ~10^76 classical bytes (1 byte = 8 bits); for comparison, the world now holds ~10^23 (≈ 10 billion TB) of digital data (internet, hard drives, DVDs, …). Quantum images (states) are therefore very hard to store or to compute with a classical computer.
Quantum Machine Learning
1) Quantum data – classical machine
2) Classical data – quantum machine
3) Quantum data – quantum machine
1) Quantum data – classical machine
[Diagram: a quantum state |φ⟩ fed into a classical neural network: input layer → layer 1 → layer 2 → output layer]
A simple example of quantum-data/classical-machine learning: the flow of electrons (charge q = −e) through a disordered conductor [diagram: contacts V+, V−, probes V0, current I].
A microscope for electrons (scanning probe, at probe voltage V0; images: the Ginger Lab).
Disorder potential → electron density: a quantum calculation (solving the Schrödinger equation) computes the local density of states (LDOS) at E0. That direction is easy. The inverse problem, given the electron density, what is the corresponding potential?, is hard. Can quantum machine learning help?
[Figure: four disorder potentials (Potential 1–4) and the corresponding LDOS maps for different disorder configurations at the same disorder amplitude]
A Matlab machine-learning demo identifies the underlying potential from the LDOS with over 90% accuracy.
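The deck's demo is in Matlab; below is a stand-in sketch of the same pipeline in Python (scikit-learn; the names and the random placeholder "LDOS maps" are this sketch's assumptions, so the printed accuracy is meaningless here, while the deck reports over 90% on real LDOS data):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
n_per_class, npix = 250, 16 * 16   # 16x16 LDOS maps, 4 candidate potentials

# Placeholder data: in the real problem each sample is the LDOS computed at E0
# for one of the four known disorder potentials (same disorder amplitude).
X = rng.random((4 * n_per_class, npix))
y = np.repeat(np.arange(4), n_per_class)      # label = which potential (1..4)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))   # ~0.25 on random placeholders
```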
1) Quantum data – classical machine: classicize the quantum state, φ → |φ|² (the measured density), feed it into the classical network [diagram: input layer → layer 1 → layer 2 → output layer], and read out the potential V. WORKS!
1) Quantum data – classical machine (another example): particle identification with a photoionization detector (source: Mike Williams). [Figure: input → output; a neural-network ML discriminant compared to the traditional Δ log(likelihood) method]
2) Classical data – quantum machine
(Similar goal to quantum computing: enhance efficiency by using a quantum computer)
Quantum principal component analysis (an example)
Comparing stocks (CISCO, Chevron, Exxon Mobil) using yesterday's data: $v_t^n$ = change of stock n at time t. From these changes one builds the covariance matrix of the stock changes (figure: jsandatascience.com).
With $\vec v_t$ the vector of the N stock changes at time t, encode it as a quantum state $|v_t\rangle$; the data are then summarized by the density matrix $\rho = \frac{1}{T}\sum_t |v_t\rangle\langle v_t|$ (the standard QPCA construction). QPCA finds its eigenvalues in $O((\log N)^2)$ instead of the classical PCA's $O(N^2)$, a speed-up usable inside quantum machine learning software.
[Diagram: quantum principal component analysis, input → output]
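For comparison, the classical computation that QPCA accelerates, sketched in NumPy with made-up stock changes:

```python
import numpy as np

rng = np.random.default_rng(2)
T, N = 250, 3                     # T time steps, N stocks (e.g. CSCO, CVX, XOM)
v = rng.normal(size=(T, N))       # v[t, n] = change of stock n at time t

C = v.T @ v / T                   # N x N covariance matrix of the stock changes
eigvals, eigvecs = np.linalg.eigh(C)   # classical PCA: diagonalize C
print("principal-component variances:", eigvals[::-1])
```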
3) Quantum data – quantum machine: $|\varphi\rangle \to |\psi\rangle$
(3.1) Superposition of memorized states (Quantum Associative Memory, Ventura and Martinez '98)
Idea: create a superposed memory state $|M\rangle$ of the learned states. This requires many copies of $|M\rangle$, since the state is destroyed after the probabilistic measurement.
(3.2) Time evolution (e.g. interacting quantum dots, Behrman and co-workers '99, or Perus '00)
The trained system maps input to output through its Green's function. Example: interacting quantum dots.
(3.3) Time-flow approaches (Kak '95, Zak and Williams '98, Gupta and Zia '01, …)
(a) Quantum measurement: after some time a quantum measurement is performed, then time evolution, then measurement, …
(b) Dissipative operator: after some time a dissipative operator is applied, followed by successive time evolutions and dissipative operators, …
(c) Successive entanglement: Panella and Martinelli '11
(3.4) Quantum Boltzmann Machine (quantization of the classical Boltzmann machine)
Classical Restricted Boltzmann Machine [diagram: visible units v, hidden units h, weights w]: a probabilistic machine in which the probability of each unit is set by its local energy $E_i = z_i + \sum_j W_{ij} z_j$ (with $z = v$ or $h$), via $P(z_i = 1) = \frac{1}{1 + e^{-E_i}}$. This will eventually minimize the global energy.
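A sketch of this probabilistic update in NumPy; one assumption is that the unit's own term $z_i$ in the slide's local energy is read as a bias $b_i$ (the usual RBM convention):

```python
import numpy as np

rng = np.random.default_rng(3)
n_v, n_h = 6, 4                             # visible and hidden units
W = rng.normal(scale=0.1, size=(n_v, n_h))  # weights w
b_h = np.zeros(n_h)                         # hidden biases (assumed, see text)

def sample_hidden(v):
    """P(h_i = 1) = 1 / (1 + exp(-E_i)), with E_i = b_i + sum_j W_ji v_j."""
    E = b_h + v @ W                         # local energies of the hidden units
    p = 1.0 / (1.0 + np.exp(-E))
    return (rng.random(n_h) < p).astype(float), p

v = rng.integers(0, 2, n_v).astype(float)   # a random visible configuration
h, p = sample_hidden(v)
print("P(h=1):", p, " sampled h:", h)
```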
(3.4) Quantum Boltzmann Machine (quantization of the classical Boltzmann Machine)
Learning in a Restricted Boltzmann Machine [diagram: input and output clamped on the visible layer]:
1. Clamp the input and the desired output (visible layer) => find the global minimum.
2. Clamp only the input => find the global minimum => compare the output with the desired output; adjust the weights and biases by optimizing the difference between output and desired output (a sketch of such an update follows this list).
3. Use your machine.
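A minimal sketch of the weight adjustment in step 2, in the style of contrastive divergence (an assumed update rule, the deck does not spell one out; `v_clamped` is hypothetical data):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(4)
n_v, n_h, eta = 6, 4, 0.1
W = rng.normal(scale=0.1, size=(n_v, n_h))

v_clamped = rng.integers(0, 2, n_v).astype(float)  # clamped visible data
for step in range(100):
    p_h = sigmoid(v_clamped @ W)          # clamped phase: hidden given data
    v_free = sigmoid(W @ p_h)             # unclamped phase: one reconstruction
    p_h_free = sigmoid(v_free @ W)
    # Move weights toward the clamped statistics, away from the free ones
    W += eta * (np.outer(v_clamped, p_h) - np.outer(v_free, p_h_free))
```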
Deep machine [diagram: stacked layers between input and output; from Crawford et al. '16]
Quantization of the Boltzmann Machine: Amin, with Jason Rolfe, Roger Melko, Bohdan Kulchytskyy and Evgeny Andriyash, arXiv:1601.02036 (slide from Amin; figure from Crawford et al. '16).
Transverse Ising Hamiltonian (slide from Amin): $H = -\sum_a \Gamma_a \sigma^x_a - \sum_a b_a \sigma^z_a - \sum_{a,b} w_{ab}\, \sigma^z_a \sigma^z_b$ (as in arXiv:1601.02036).
Quantum Boltzmann Distribution (slide from Amin)
Boltzmann probability distribution: $P_v = \mathrm{Tr}[\Lambda_v\, \rho]$, with density matrix $\rho = e^{-H} / \mathrm{Tr}\, e^{-H}$ and projection operator $\Lambda_v = |v\rangle\langle v| \otimes \mathbb{1}$ ($\mathbb{1}$ = identity matrix on the hidden qubits).
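These quantities can be checked numerically for a tiny machine (SciPy sketch: two qubits, one visible and one hidden, with an arbitrarily chosen Hamiltonian):

```python
import numpy as np
from scipy.linalg import expm

sz = np.diag([1.0, -1.0])
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)

# Arbitrary 2-qubit Hamiltonian: qubit 0 is visible, qubit 1 is hidden
H = np.kron(sz, I2) + 0.5 * np.kron(sx, I2) + 0.3 * np.kron(sz, sz)

rho = expm(-H) / np.trace(expm(-H))       # density matrix e^{-H} / Tr e^{-H}

for v in range(2):                        # visible states |0> and |1>
    proj = np.zeros((2, 2)); proj[v, v] = 1.0
    Lam = np.kron(proj, I2)               # Lambda_v = |v><v| (x) identity
    print(v, np.trace(Lam @ rho).real)    # P_v = Tr[Lambda_v rho]; sums to 1
```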
Training (slide from Amin): the gradient of the log-likelihood is estimated from the difference between a clamped average (visible qubits fixed to the data) and an unclamped average.
Copyright © 2016, D-Wave Systems Inc.
Quantum Boltzmann Machine: train a Boltzmann machine using the quantum Boltzmann distribution (Amin, Andriyash, et al., arXiv:1601.02036; slide from Amin). [Plot: classical BM vs. QBM trained with the bound gradient (fixed D = 2) and with the exact gradient (D is trained, final D = 2.5)]
In general: some collection of interacting qubits,
$H = \sum_{i=1}^{N} \mathbb{1}\otimes\cdots\otimes H_i\otimes\cdots\otimes\mathbb{1} + \sum_{ij} V_{ij}, \qquad H_i = \varepsilon_i \sigma_z + t_i \sigma_x = \begin{pmatrix} \varepsilon_i & t_i \\ t_i & -\varepsilon_i \end{pmatrix}$
For quantum machine learning one needs an input and an output subset of the qubits, connected to quantum states [diagram: input qubits and output qubits within the interacting collection]. How can this be modeled?
$H = \sum_{i=1}^{N} \mathbb{1}\otimes\mathbb{1}\otimes\mathbb{1}\otimes S_i\otimes\mathbb{1}\otimes\mathbb{1} + \sum_{ij} V_{ij}, \qquad S_i = \varepsilon_i \sigma_z + t_i \sigma_x = \begin{pmatrix} \varepsilon_i & t_i \\ t_i & -\varepsilon_i \end{pmatrix}$
$\Longleftrightarrow\qquad H = \sum_{i=1}^{2^N} \epsilon_i\, |i\rangle\langle i| + \sum_{ij}^{2^N} t_{ij}\, |i\rangle\langle j|$
A collection of qubits (here 6 qubits) is thus equivalent to a highly connected tight-binding model (on $2^6 = 64$ sites), which can be computed classically.
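A sketch of this equivalence in NumPy (random $\varepsilon_i$ and $t_i$; the $V_{ij}$ couplings are omitted for brevity): the N-qubit Hamiltonian built from Kronecker products is just a $2^N \times 2^N$ matrix that a classical computer can diagonalize directly.

```python
import numpy as np

sz = np.diag([1.0, -1.0])
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
N = 6                                    # 6 qubits -> 2^6 = 64 "sites"
rng = np.random.default_rng(5)
eps, t = rng.normal(size=N), rng.normal(size=N)

H = np.zeros((2**N, 2**N))
for i in range(N):
    Hi = eps[i] * sz + t[i] * sx         # single-qubit term [[eps, t], [t, -eps]]
    term = np.eye(1)
    for j in range(N):                   # identity on every slot except i
        term = np.kron(term, Hi if j == i else np.eye(2))
    H += term                            # (V_ij couplings omitted in this sketch)

# H is a 64 x 64 tight-binding matrix; diagonalize it classically:
print("lowest eigenvalues:", np.linalg.eigvalsh(H)[:4])
```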
Quantum Machine Learning
1) Quantum data – classical machine: many useful applications; one can use powerful classical ML codes (deep convolutional NNs), which often outperform non-ML approaches.
2) Classical data – quantum machine: some powerful algorithms exist, but many questions remain, particularly for the learning phase.
3) Quantum data – quantum machine: many different preliminary approaches, but it's just the beginning; there is no clear emerging winning candidate, and a lot of fundamental work remains to be done.
Thanks!