Recurrent Neural Network: The Deepest of Deep Learning

Chen Liang

RNNs and Tensorflow

Feb 14, 2017

Transcript
Page 1: RNNs and Tensorflow

Recurrent Neural Network: The Deepest of Deep Learning

Chen Liang

Page 2: RNNs and Tensorflow

Deep Learning

Page 3: RNNs and Tensorflow

Deep Learning

Page 4: RNNs and Tensorflow

Does deep learning work like the human brain?

Demystify Deep Learning

Page 5: RNNs and Tensorflow

Does deep learning work like the human brain?

Demystify Deep Learning

Page 6: RNNs and Tensorflow

Does deep learning work like the human brain?

Demystify Deep Learning

Page 7: RNNs and Tensorflow

Deep Learning: Building Blocks

Page 8: RNNs and Tensorflow

Deep Learning: Deep Composition

Page 9: RNNs and Tensorflow

Deep Learning: Gradient Descent

Page 10: RNNs and Tensorflow

Deep Learning: Gradient Descent

Page 12: RNNs and Tensorflow

Deep Learning: Weight Sharing

Page 13: RNNs and Tensorflow

Recurrent Neural Network: The Deepest of Deep Learning?

● Can be infinitely deep: the same cell is applied at every time step (weight sharing, Page 12), so a long sequence unrolls into an arbitrarily deep network, as the sketch below illustrates

$h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t + b)$
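A minimal sketch of this recurrence unrolled in NumPy (the tanh activation matches the equation above; the toy sizes, weight scales, and random inputs are my assumptions):

import numpy as np

# One RNN step: the same weights W_hh, W_xh, b are reused at every
# time step, so a length-T sequence unrolls into a depth-T network.
def rnn_step(h_prev, x, W_hh, W_xh, b):
    return np.tanh(W_hh @ h_prev + W_xh @ x + b)

hidden_size, input_size, seq_len = 4, 3, 5       # assumed toy sizes
rng = np.random.RandomState(0)
W_hh = 0.1 * rng.randn(hidden_size, hidden_size)
W_xh = 0.1 * rng.randn(hidden_size, input_size)
b = np.zeros(hidden_size)

h = np.zeros(hidden_size)                        # initial hidden state
for t in range(seq_len):                         # one "layer" per time step
    x_t = rng.randn(input_size)                  # stand-in input
    h = rnn_step(h, x_t, W_hh, W_xh, b)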

Page 14: RNNs and Tensorflow

BPTT: Backpropagation Through Time

Page 15: RNNs and Tensorflow

Recurrent Neural Network: Short-term dependency

Long-term dependency

Exploding/vanishing gradient
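To see where the exploding/vanishing gradient comes from, write out the BPTT Jacobian for the recurrence $h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t + b)$ (a standard derivation, not taken from the slides):

$$\frac{\partial h_T}{\partial h_t} = \prod_{k=t+1}^{T} \frac{\partial h_k}{\partial h_{k-1}} = \prod_{k=t+1}^{T} \mathrm{diag}\left(1 - h_k^2\right) W_{hh}$$

The same $W_{hh}$ is multiplied in once per time step, so the product's norm behaves roughly like $\sigma_{\max}(W_{hh})^{T-t}$: it explodes when the largest singular value is above 1 and vanishes when it is below 1, which is exactly why long-term dependencies are hard to learn.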

Page 16: RNNs and Tensorflow

LSTM: Long Short-Term Memory

Add a direct pathway for the gradient

Page 17: RNNs and Tensorflow

LSTM: Long Short-Term Memory

Forget gate

Page 18: RNNs and Tensorflow

LSTM: Long Short-Term Memory

Input gate

Page 19: RNNs and Tensorflow

LSTM: Long Short-Term Memory

Update the memory using the forget gate and input gate

Page 20: RNNs and Tensorflow

LSTM: Long Short-Term Memory

Output gate

Page 21: RNNs and Tensorflow

LSTM: Long Short-Term Memory

Putting it all together
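For reference, the standard LSTM equations that Pages 16-20 build up gate by gate (the usual formulation from Colah's blog, cited in the references, with $\odot$ denoting elementwise multiplication):

$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$  (forget gate, Page 17)
$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$  (input gate, Page 18)
$\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C)$  (candidate memory)
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$  (memory update, Page 19)
$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$  (output gate, Page 20)
$h_t = o_t \odot \tanh(C_t)$  (new hidden state)

The additive update of $C_t$ is the "direct pathway for the gradient" from Page 16: backpropagating through it multiplies by the forget gate $f_t$ rather than repeatedly by a weight matrix.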

Page 22: RNNs and Tensorflow

RNN: A General Framework

Machine Translation

Speech recognition

Language Modeling

Sentiment Analysis

Image Recognition

Image Caption Generation

Page 23: RNNs and Tensorflow

Char-RNN: How does it work?

Vocabulary:

[“h”, “e”, “l”, “o”]

Training sequence:

“hello”
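A minimal sketch of how this training pair is set up (the one-hot encoding and shift-by-one targets are the standard char-RNN recipe; the variable names are mine):

import numpy as np

vocab = ["h", "e", "l", "o"]
char_to_ix = {ch: i for i, ch in enumerate(vocab)}

text = "hello"
# Each input character's target is simply the next character:
# "h"->"e", "e"->"l", "l"->"l", "l"->"o".
inputs  = [char_to_ix[ch] for ch in text[:-1]]  # [0, 1, 2, 2]
targets = [char_to_ix[ch] for ch in text[1:]]   # [1, 2, 2, 3]

# One-hot encode the inputs; at each step the RNN is trained to put
# high probability on the target character.
one_hot_inputs = np.eye(len(vocab))[inputs]     # shape (4, 4)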

Page 24: RNNs and Tensorflow

Char-RNN

Linux

LaTeX

Wikipedia

Music

Check out the blog:

The Unreasonable Effectiveness of Recurrent Neural Networks

Page 25: RNNs and Tensorflow

TensorFlow

TensorFlow™ is an open source software library for numerical computation using data flow graphs.

Page 26: RNNs and Tensorflow

TensorFlow: Computation Graph

Import TensorFlow and NumPy:

import tensorflow as tf
import numpy as np

# Create 100 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3

# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but Tensorflow will
# figure that out for us.)
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)
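Note that nothing has been computed at this point: the lines above only add nodes to the default graph, and values flow only when a Session runs them (Page 30). A quick check, assuming the code above has been run (output shown is typical for TF 1.x; the exact node name may differ):

# `loss` is a symbolic Tensor, not a number, until a Session runs it.
print(loss)   # Tensor("Mean:0", shape=(), dtype=float32)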

Page 27: RNNs and Tensorflow

TensorFlow: Computation Graph (same code as Page 26)

Synthesize some noisy data from a linear model (the x_data and y_data lines above).

Page 28: RNNs and Tensorflow

TensorFlow: Computation Graph (same code as Page 26)

[Diagram: the forward part of the graph; nodes W, x_data, and b feed the * and + ops that produce y.]

Page 29: RNNs and Tensorflow

TensorFlow: Computation Graph (same code as Page 26)

[Diagram: the full graph; W, x_data, and b feed * and +, and the result is compared against y_data to form the Loss, which the Optimizer minimizes.]

Page 30: RNNs and Tensorflow

TensorFlow: Session

# Before starting, initialize the variables. We will 'run' this first.
init = tf.initialize_all_variables()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line.
for step in xrange(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# Learns best fit is W: [0.1], b: [0.3]
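A small portability note if you run this on a later TensorFlow 1.x release or Python 3 (both substitutions below are the documented replacements, not from the slides):

init = tf.global_variables_initializer()   # replaces the deprecated tf.initialize_all_variables()

for step in range(201):                    # Python 3 has no xrange
    sess.run(train)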

Page 31: RNNs and Tensorflow

TensorFlow: Session (same code as Page 30)

Page 32: RNNs and Tensorflow

TensorBoard Demo
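To reproduce the demo, the rough recipe is: dump the graph with a tf.summary.FileWriter, then point the tensorboard CLI at the log directory (the directory path here is an arbitrary choice):

# Write the graph definition where TensorBoard can find it.
writer = tf.summary.FileWriter("/tmp/linreg_logs", sess.graph)
writer.close()

Then run `tensorboard --logdir=/tmp/linreg_logs` in a shell and open the printed URL in a browser to see the computation graph.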

Page 33: RNNs and Tensorflow

TensorBoard Demo

Page 34: RNNs and Tensorflow

TensorBoard Demo

Page 35: RNNs and Tensorflow

Now the part that everybody hates...

Page 36: RNNs and Tensorflow

Jon Snow is dead

Page 37: RNNs and Tensorflow

Homework

Part 1: Backpropagation and gradient check

● NumPy

Part 2: Char-RNN

● Undergrad/Grad Descent

○ Gradient descent => graduate descent

○ Systematic search of hyperparameters

● Do something fun with it!

Use gradients to find the best parameters.

Use a graduate student to find the best hyperparameters (or use the systematic search sketched below).
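A minimal sketch of the systematic-search option, using random search over hyperparameters (train_and_eval is a hypothetical stand-in for your Part 2 training loop; the ranges are illustrative):

import random

def random_search(train_and_eval, num_trials=20):
    # Try random hyperparameters, keep the best validation loss.
    best_loss, best_hp = float("inf"), None
    for _ in range(num_trials):
        hp = {
            "learning_rate": 10 ** random.uniform(-4, -1),  # log-uniform
            "hidden_size": random.choice([64, 128, 256]),
        }
        val_loss = train_and_eval(**hp)  # hypothetical training function
        if val_loss < best_loss:
            best_loss, best_hp = val_loss, hp
    return best_loss, best_hp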

Page 38: RNNs and Tensorflow

References

Christopher Olah's Blog: http://colah.github.io/

Andrej Karpathy's Blog: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

David Silver's Talk: http://videolectures.net/rldm2015_silver_reinforcement_learning/

Geoffrey Hinton's Coursera Lectures: https://class.coursera.org/neuralnets-2012-001/lecture