Introduction to Quantum Machine Learning
M. Hilke (Quantum Nano Electronics Laboratory)
Why Quantum Machine Learning?
Hype curve? Money?
Structure:
• Machine Learning
• Quantum Machine Learning
Machine Learning:
Matlab demo
Idea of Machine Learning:
[Figure: "me" (photo) vs. "not me" (a neuron, with its axon)]
Neural network: 10^11 neurons and 10^14 synapses in the human brain; 3×10^11 neurons in an elephant brain (a six-core i7 has ~10^9 transistors).
Principle of a Deep Convolutional Neural Network for Face Recognition
Input layer: feed the input picture into the neural network.
Add a neuron layer (layer 1), then add more layers (layer 2, …): a Deep Neural Network is a network with more than 1 layer.
Finally add an output layer: input layer → layer 1 → layer 2 → output layer.
Artificial Neuron
[Diagram: layers 1 and 2; $L_1(n)$ = value of neuron $n$ at layer 1, $n = 1,\dots,5$]
General Computation Flow
[Diagram: the layer-1 neurons $L_1(1),\dots,L_1(5)$ feed one layer-2 neuron through axon weights $W_{12}(1),\dots,W_{12}(5)$]
• $L_1(n)$: value of neuron $n$ at layer 1
• $W_{12}(n)$: axon weight between layer 1 (neuron $n$) and layer 2
• Neuron input (layer 2): $x = \sum_{n=1}^{5} W_{12}(n)\, L_1(n)$
• $b_2$: threshold value of the neuron
• Neuron output: $L_2 = \frac{1}{1 + e^{-x + b_2}}$, with $0 < L_2 < 1$
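As a minimal sketch of this computation (Python/NumPy; the neuron values, weights and threshold are made-up numbers, not from the deck):

```python
import numpy as np

# Values of the five layer-1 neurons, L1(1)...L1(5) (hypothetical numbers)
L1 = np.array([0.2, 0.9, 0.1, 0.5, 0.7])
# Axon weights W12(n) between layer-1 neuron n and the layer-2 neuron
W12 = np.array([0.4, -0.3, 0.8, 0.1, -0.6])
b2 = 0.5  # threshold value of the layer-2 neuron

x = np.dot(W12, L1)                  # neuron input: x = sum_n W12(n) L1(n)
L2 = 1.0 / (1.0 + np.exp(-x + b2))   # neuron output (sigmoid), so 0 < L2 < 1
print(x, L2)
```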
Convolution layer
Fully Connected Layers
[Diagram: input vector $\vec I = (I_1, I_2, I_3, I_4)$ → layer 1 → layer 2 → output layer, with weight matrices $W_{I1}$, $W_{12}$, $W_{2O}$ and biases $b_1$, $b_2$]
In vector form, each fully connected layer computes, e.g., $\vec L_2 = \frac{1}{1 + \exp(-W_{12}\vec L_1 + \vec b_2)}$.
At the output layer $\vec O$: the largest output wins!
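A sketch of the full forward pass under the same conventions (the layer sizes match the diagram: 4 inputs, 5 and 6 hidden neurons, 3 outputs; the random weights are placeholders):

```python
import numpy as np

def layer(W, L_prev, b):
    """One fully connected layer: L = 1 / (1 + exp(-W L_prev + b))."""
    return 1.0 / (1.0 + np.exp(-W @ L_prev + b))

rng = np.random.default_rng(0)
I = np.array([0.1, 0.8, 0.3, 0.5])                      # input vector (I1..I4)
W_I1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)  # input -> layer 1
W_12, b2 = rng.normal(size=(6, 5)), rng.normal(size=6)  # layer 1 -> layer 2
W_2O, bO = rng.normal(size=(3, 6)), rng.normal(size=3)  # layer 2 -> output

L1 = layer(W_I1, I, b1)
L2 = layer(W_12, L1, b2)
O = layer(W_2O, L2, bO)
print("winning output:", np.argmax(O))  # the largest output wins
```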
The learning phase of a neural network requires large amounts of training data and can take a lot of processing time; recognizing one picture afterwards is fast (< 0.1 s). Typically, on a laptop: training > 1 hr and evaluating < 0.1 s, for a "simple" (few-task) neural network.
Monkey faces training data (~1,000 pictures)
Female faces training data (~1,000 pictures)
Male faces training data (~1,000 pictures)
[Diagram: input layer → layer 1 → layer 2 → output layer, with every axon weight ($W_{I1}$, $W_{12}$, $W_{2O}$) and every bias ($b_I(1..4)$, $b_1(1..5)$, $b_2(1..6)$, $b_O(1..3)$) labeled]
Training the network means finding the optimal b and W.
Training of a Deep Convolutional Neural Network
Minimize a cost function C, e.g. quadratic, $C = \sum (\text{desired output} - \text{NN output})^2$ (or cross-entropy). Then update the NN parameters over many iterations by gradient descent, $W \to W - \eta\,\partial C/\partial W$, where $\eta$ is the learning rate.
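A minimal sketch of such a training loop for a single sigmoid neuron (NumPy; quadratic cost and a hypothetical two-example training set; the gradient follows from the deck's $L_2 = 1/(1+e^{-x+b_2})$):

```python
import numpy as np

# Toy training set: layer-1 activations -> desired outputs (made-up data)
X = np.array([[0.2, 0.9, 0.1, 0.5, 0.7],
              [0.9, 0.1, 0.8, 0.2, 0.3]])
y = np.array([1.0, 0.0])

W, b, eta = np.zeros(5), 0.0, 0.5        # weights, threshold, learning rate

for it in range(1000):                   # iterations
    x = X @ W                            # neuron inputs
    a = 1.0 / (1.0 + np.exp(-x + b))     # NN outputs
    # Quadratic cost C = 1/2 sum (NN output - desired output)^2; chain rule:
    delta = (a - y) * a * (1.0 - a)
    W -= eta * X.T @ delta               # W -> W - eta dC/dW
    b += eta * delta.sum()               # dC/db has the opposite sign (e^{-x+b})
print("outputs after training:", a)
```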
Once the network is trained, it is very powerful for specific tasks:
2014: DeepFace (Facebook AI Research) comes close to human performance in face recognition.
2016: AlphaGo, developed by the Google DeepMind team, beats humans at Go.
But: it takes a lot of time to find the best ~100 million parameters!
Good resources to do it yourself:
1) http://neuralnetworksanddeeplearning.com/ (a good introduction with ML code in Python)
2) http://www.deeplearningbook.org/ (by the masters: Goodfellow, Bengio and Courville)
3) Open-source Matlab CNN code
What about quantum? (Source: Backreaction)
Classical image: 16×16 pixels with 256 grey tones = 65,536 bits of data, or 8 kB uncompressed (BMP); ~1 kB compressed.
Quantum image: 16×16 spin-½ particles (qubits: spin up / spin down) = 256 qubits, i.e. 2^(16×16) = 2^256 ≈ 10^77 worth of data. To describe 256 qubits classically one would need ~10^76 classical bytes (1 byte = 8 bits); for comparison, the world now holds ~10^23 (≈ 10 billion TB) of digital data (internet, hard drives, DVDs, …). Quantum images (states) are therefore very hard to store or to compute with a classical computer.
Quantum Machine Learning
1) Quantum data – classical machine
2) Classical data – quantum machine
3) Quantum data – quantum machine
1) Quantum data – classical machine
[Diagram: a quantum state |φ⟩ fed into a classical neural network: input layer → layer 1 → layer 2 → output layer]
A simple example of quantum-data/classical-machine learning: the flow of electrons (charge q = −e) through a disordered conductor [diagram: contacts V+, V−, probes V0, current I].
A microscope for electrons (scanning probe, at probe voltage V0; images: the Ginger Lab).
Disorder potential → electron density: a quantum calculation (solving the Schrödinger equation) computes the local density of states (LDOS) at E0. That direction is easy. The inverse problem, given the electron density, what is the corresponding potential?, is hard. Can quantum machine learning help?
[Figure: four disorder potentials (Potential 1–4) and the corresponding LDOS maps for different disorder configurations at the same disorder amplitude]
A Matlab machine-learning demo identifies the underlying potential from the LDOS with over 90% accuracy.
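The deck's demo is in Matlab; below is a stand-in sketch of the same pipeline in Python (scikit-learn; the names and the random placeholder "LDOS maps" are this sketch's assumptions, so the printed accuracy is meaningless here, while the deck reports over 90% on real LDOS data):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
n_per_class, npix = 250, 16 * 16   # 16x16 LDOS maps, 4 candidate potentials

# Placeholder data: in the real problem each sample is the LDOS computed at E0
# for one of the four known disorder potentials (same disorder amplitude).
X = rng.random((4 * n_per_class, npix))
y = np.repeat(np.arange(4), n_per_class)      # label = which potential (1..4)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))   # ~0.25 on random placeholders
```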
1) Quantum data – classical machine: classicize the quantum state, φ → |φ|² (the measured density), feed it into the classical network [diagram: input layer → layer 1 → layer 2 → output layer], and read out the potential V. WORKS!
1) Quantum data – classical machine (another example): particle identification with a photoionization detector (source: Mike Williams). [Figure: input → output; a neural-network ML discriminant compared to the traditional Δ log(likelihood) method]
2) Classical data – quantum machine
(Similar goal to quantum computing: enhance efficiency by using a quantum computer)
Quantum principal component analysis (an example)
Comparing stocks (CISCO, Chevron, Exxon Mobil) using yesterday's data: $v_t^n$ = change of stock n at time t. From these changes one builds the covariance matrix of the stock changes (figure: jsandatascience.com).
With $\vec v_t$ the vector of the N stock changes at time t, encode it as a quantum state $|v_t\rangle$; the data are then summarized by the density matrix $\rho = \frac{1}{T}\sum_t |v_t\rangle\langle v_t|$ (the standard QPCA construction). QPCA finds its eigenvalues in $O((\log N)^2)$ instead of the classical PCA's $O(N^2)$, a speed-up usable inside quantum machine learning software.
[Diagram: quantum principal component analysis, input → output]
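For comparison, the classical computation that QPCA accelerates, sketched in NumPy with made-up stock changes:

```python
import numpy as np

rng = np.random.default_rng(2)
T, N = 250, 3                     # T time steps, N stocks (e.g. CSCO, CVX, XOM)
v = rng.normal(size=(T, N))       # v[t, n] = change of stock n at time t

C = v.T @ v / T                   # N x N covariance matrix of the stock changes
eigvals, eigvecs = np.linalg.eigh(C)   # classical PCA: diagonalize C
print("principal-component variances:", eigvals[::-1])
```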
3) Quantum data – quantum machine: $|\varphi\rangle \to |\psi\rangle$
(3.1) Superposition of memorized states (Quantum Associative Memory, Ventura and Martinez '98)
Idea: create a superposed memory state $|M\rangle$ of the learned states. This requires many copies of $|M\rangle$, since the state is destroyed after the probabilistic measurement.
(3.2) Time evolution (e.g. interacting quantum dots, Behrman and co-workers '99, or Perus '00)
The trained system maps input to output through its Green's function. Example: interacting quantum dots.
(3.3) Time-flow approaches (Kak '95, Zak and Williams '98, Gupta and Zia '01, …)
(a) Quantum measurement: after some time a quantum measurement is performed, then time evolution, then measurement, …
(b) Dissipative operator: after some time a dissipative operator is applied, followed by successive time evolutions and dissipative operators, …
(c) Successive entanglement: Panella and Martinelli '11
(3.4) Quantum Boltzmann Machine (quantization of the classical Boltzmann machine)
Classical Restricted Boltzmann Machine [diagram: visible units v, hidden units h, weights w]: a probabilistic machine in which the probability of each unit is set by its local energy $E_i = z_i + \sum_j W_{ij} z_j$ (with $z = v$ or $h$), via $P(z_i = 1) = \frac{1}{1 + e^{-E_i}}$. This will eventually minimize the global energy.
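A sketch of this probabilistic update in NumPy; one assumption is that the unit's own term $z_i$ in the slide's local energy is read as a bias $b_i$ (the usual RBM convention):

```python
import numpy as np

rng = np.random.default_rng(3)
n_v, n_h = 6, 4                             # visible and hidden units
W = rng.normal(scale=0.1, size=(n_v, n_h))  # weights w
b_h = np.zeros(n_h)                         # hidden biases (assumed, see text)

def sample_hidden(v):
    """P(h_i = 1) = 1 / (1 + exp(-E_i)), with E_i = b_i + sum_j W_ji v_j."""
    E = b_h + v @ W                         # local energies of the hidden units
    p = 1.0 / (1.0 + np.exp(-E))
    return (rng.random(n_h) < p).astype(float), p

v = rng.integers(0, 2, n_v).astype(float)   # a random visible configuration
h, p = sample_hidden(v)
print("P(h=1):", p, " sampled h:", h)
```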
(3.4) Quantum Boltzmann Machine (quantization of the classical Boltzmann Machine)
Learning in a Restricted Boltzmann Machine [diagram: input and output clamped on the visible layer]:
1. Clamp the input and the desired output (visible layer) => find the global minimum.
2. Clamp only the input => find the global minimum => compare the output with the desired output; adjust the weights and biases by optimizing the difference between output and desired output (a sketch of such an update follows this list).
3. Use your machine.
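A minimal sketch of the weight adjustment in step 2, in the style of contrastive divergence (an assumed update rule, the deck does not spell one out; `v_clamped` is hypothetical data):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(4)
n_v, n_h, eta = 6, 4, 0.1
W = rng.normal(scale=0.1, size=(n_v, n_h))

v_clamped = rng.integers(0, 2, n_v).astype(float)  # clamped visible data
for step in range(100):
    p_h = sigmoid(v_clamped @ W)          # clamped phase: hidden given data
    v_free = sigmoid(W @ p_h)             # unclamped phase: one reconstruction
    p_h_free = sigmoid(v_free @ W)
    # Move weights toward the clamped statistics, away from the free ones
    W += eta * (np.outer(v_clamped, p_h) - np.outer(v_free, p_h_free))
```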
Deep machine [diagram: stacked layers between input and output; from Crawford et al. '16]
Quantization of the Boltzmann Machine: Amin, with Jason Rolfe, Roger Melko, Bohdan Kulchytskyy and Evgeny Andriyash, arXiv:1601.02036 (slide from Amin; figure from Crawford et al. '16).
Transverse Ising Hamiltonian (slide from Amin): $H = -\sum_a \Gamma_a \sigma^x_a - \sum_a b_a \sigma^z_a - \sum_{a,b} w_{ab}\, \sigma^z_a \sigma^z_b$ (as in arXiv:1601.02036).
Quantum Boltzmann Distribution (slide from Amin)
Boltzmann probability distribution: $P_v = \mathrm{Tr}[\Lambda_v\, \rho]$, with density matrix $\rho = e^{-H} / \mathrm{Tr}\, e^{-H}$ and projection operator $\Lambda_v = |v\rangle\langle v| \otimes \mathbb{1}$ ($\mathbb{1}$ = identity matrix on the hidden qubits).
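These quantities can be checked numerically for a tiny machine (SciPy sketch: two qubits, one visible and one hidden, with an arbitrarily chosen Hamiltonian):

```python
import numpy as np
from scipy.linalg import expm

sz = np.diag([1.0, -1.0])
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)

# Arbitrary 2-qubit Hamiltonian: qubit 0 is visible, qubit 1 is hidden
H = np.kron(sz, I2) + 0.5 * np.kron(sx, I2) + 0.3 * np.kron(sz, sz)

rho = expm(-H) / np.trace(expm(-H))       # density matrix e^{-H} / Tr e^{-H}

for v in range(2):                        # visible states |0> and |1>
    proj = np.zeros((2, 2)); proj[v, v] = 1.0
    Lam = np.kron(proj, I2)               # Lambda_v = |v><v| (x) identity
    print(v, np.trace(Lam @ rho).real)    # P_v = Tr[Lambda_v rho]; sums to 1
```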
Training (slide from Amin): the gradient of the log-likelihood is estimated from the difference between a clamped average (visible qubits fixed to the data) and an unclamped average.
Copyright © 2016, D-Wave Systems Inc.
Quantum Boltzmann Machine: train a Boltzmann machine using the quantum Boltzmann distribution (Amin, Andriyash, et al., arXiv:1601.02036; slide from Amin). [Plot: classical BM vs. QBM trained with the bound gradient (fixed D = 2) and with the exact gradient (D is trained, final D = 2.5)]
In general: some collection of interacting qubits,
$H = \sum_{i=1}^{N} \mathbb{1}\otimes\cdots\otimes H_i\otimes\cdots\otimes\mathbb{1} + \sum_{ij} V_{ij}, \qquad H_i = \varepsilon_i \sigma_z + t_i \sigma_x = \begin{pmatrix} \varepsilon_i & t_i \\ t_i & -\varepsilon_i \end{pmatrix}$
For quantum machine learning one needs an input and an output subset of the qubits, connected to quantum states [diagram: input qubits and output qubits within the interacting collection]. How can this be modeled?
$H = \sum_{i=1}^{N} \mathbb{1}\otimes\mathbb{1}\otimes\mathbb{1}\otimes S_i\otimes\mathbb{1}\otimes\mathbb{1} + \sum_{ij} V_{ij}, \qquad S_i = \varepsilon_i \sigma_z + t_i \sigma_x = \begin{pmatrix} \varepsilon_i & t_i \\ t_i & -\varepsilon_i \end{pmatrix}$
$\Longleftrightarrow\qquad H = \sum_{i=1}^{2^N} \epsilon_i\, |i\rangle\langle i| + \sum_{ij}^{2^N} t_{ij}\, |i\rangle\langle j|$
A collection of qubits (here 6 qubits) is thus equivalent to a highly connected tight-binding model (on $2^6 = 64$ sites), which can be computed classically.
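A sketch of this equivalence in NumPy (random $\varepsilon_i$ and $t_i$; the $V_{ij}$ couplings are omitted for brevity): the N-qubit Hamiltonian built from Kronecker products is just a $2^N \times 2^N$ matrix that a classical computer can diagonalize directly.

```python
import numpy as np

sz = np.diag([1.0, -1.0])
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
N = 6                                    # 6 qubits -> 2^6 = 64 "sites"
rng = np.random.default_rng(5)
eps, t = rng.normal(size=N), rng.normal(size=N)

H = np.zeros((2**N, 2**N))
for i in range(N):
    Hi = eps[i] * sz + t[i] * sx         # single-qubit term [[eps, t], [t, -eps]]
    term = np.eye(1)
    for j in range(N):                   # identity on every slot except i
        term = np.kron(term, Hi if j == i else np.eye(2))
    H += term                            # (V_ij couplings omitted in this sketch)

# H is a 64 x 64 tight-binding matrix; diagonalize it classically:
print("lowest eigenvalues:", np.linalg.eigvalsh(H)[:4])
```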
Quantum Machine Learning
1) Quantum data – classical machine: many useful applications; one can use powerful classical ML codes (deep convolutional NNs), which often outperform non-ML approaches.
2) Classical data – quantum machine: some powerful algorithms exist, but many questions remain, particularly for the learning phase.
3) Quantum data – quantum machine: many different preliminary approaches, but it's just the beginning; there is no clear emerging winning candidate, and a lot of fundamental work remains to be done.
Thanks!