Page 1:

Neural Nets Using Backpropagation

Chris Marriott, Ryan Shirley, CJ Baker, Thomas Tannahill

Page 2:

Agenda

Review of Neural Nets and Backpropagation
Backpropagation: The Math
Advantages and Disadvantages of Gradient Descent and other algorithms
Enhancements of Gradient Descent
Other ways of minimizing error

Page 3:

Review

Approach that developed from an analysis of the human brain
Nodes created as an analog to neurons
Mainly used for classification problems (e.g. character recognition, voice recognition, medical applications, etc.)

Page 4:

Review

Neurons have weighted inputs, a threshold value, an activation function, and an output

[Diagram: weighted inputs feed the neuron, which produces an output]

Activation function = f(Σ(input_i × weight_i))

Page 5:

Review

4 Input AND

All weights = 1 and all outputs = 1 if active, 0 otherwise
Each neuron has threshold = 1.5
[Diagram: the four inputs feed two threshold neurons in a first layer; their outputs feed a third threshold neuron that produces the final output]
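A minimal Python sketch of this network (not from the slides; the function names are illustrative). Each node fires when its weighted input sum exceeds the 1.5 threshold, so the output is 1 only when all four inputs are 1:

def neuron(inputs, weights, threshold=1.5):
    # Fire (output 1) when the weighted sum of inputs exceeds the threshold.
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total > threshold else 0

def and4(a, b, c, d):
    # 4-input AND built from three 2-input threshold neurons, all weights = 1.
    h1 = neuron([a, b], [1, 1])       # fires only if a AND b
    h2 = neuron([c, d], [1, 1])       # fires only if c AND d
    return neuron([h1, h2], [1, 1])   # fires only if both hidden nodes fire

print(and4(1, 1, 1, 1))  # 1
print(and4(1, 1, 0, 1))  # 0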

Page 6:

Review

Output space for AND gate

Decision boundary: 1.5 = w1*I1 + w2*I2
[Plot: the four input points (0,0), (0,1), (1,0), (1,1) on Input 1 vs Input 2 axes; the boundary line separates (1,1) from the other three points]

Page 7:

Review

Output space for XOR gate
Demonstrates the need for a hidden layer: no single straight line can separate the XOR-true points (0,1) and (1,0) from the XOR-false points (0,0) and (1,1)

[Plot: the four input points on Input 1 vs Input 2 axes, illustrating that XOR is not linearly separable]

Page 8:

Backpropagation: The Math

General multi-layered neural network

[Diagram: input layer with nodes 0–9, hidden layer with nodes 0, 1, …, i, and output layer with nodes 0 and 1; the X weights (e.g. X_{0,0}, X_{9,0}) connect the input layer to the hidden layer and the W weights (e.g. W_{0,0}, W_{1,0}) connect the hidden layer to the output layer]

Page 9:

Backpropagation: The Math

Backpropagation
Calculation of hidden layer activation values
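The equation itself did not survive in this transcript. A standard form, assuming the X (input-to-hidden) weights from the diagram, inputs I_i, and a sigmoid activation f, would be:

a_j = f( Σ_i X_{i,j} · I_i ),   where f(x) = 1 / (1 + e^(−x))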

Page 10:

Backpropagation: The Math

Backpropagation
Calculation of output layer activation values
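Again, the slide's equation is missing here. The corresponding output-layer form, assuming hidden activations a_j and the W (hidden-to-output) weights, would be:

O_k = f( Σ_j W_{j,k} · a_j )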

Page 11:

Backpropagation: The Math

Backpropagation
Calculation of error at output node k

δ_k = f(D_k) − f(O_k)   (the difference between the desired and the computed activation at output node k)

Page 12:

Backpropagation: The Math

Backpropagation
Gradient Descent objective function
Gradient Descent termination condition
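Neither equation is preserved in the transcript. A common choice consistent with the error defined on the previous slide is the summed squared output error as the objective, with training stopped once it falls below a chosen tolerance ε (or after a maximum number of iterations):

E = 1/2 · Σ_k (D_k − O_k)²,   terminate when E < ε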

Page 13:

Backpropagation: The Math

Backpropagation
Output layer weight recalculation
(the slide labels two terms of the update: the Learning Rate, e.g. 0.25, and the Error at k)
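The update formula itself is not in the transcript; one standard form consistent with those labels, with η the learning rate, δ_k the error at output node k, and a_j the activation of hidden node j, is:

W_{j,k}(new) = W_{j,k}(old) + η · δ_k · a_j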

Page 14:

Backpropagation: The Math

Backpropagation
Hidden layer weight recalculation
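This equation is also missing from the transcript. A standard form, assuming a sigmoid activation (so f′ can be written as a_j(1 − a_j)), first backpropagates the output errors to each hidden node and then updates the input-to-hidden weights:

δ_j = a_j · (1 − a_j) · Σ_k W_{j,k} · δ_k
X_{i,j}(new) = X_{i,j}(old) + η · δ_j · I_i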

Page 15:

Backpropagation Using Gradient Descent

Advantages
Relatively simple implementation
Standard method and generally works well

Disadvantages
Slow and inefficient
Can get stuck in local minima, resulting in sub-optimal solutions

Page 16:

Local Minima

[Plot: an error curve with a shallow local minimum and a deeper global minimum]

Page 17:

Alternatives To Gradient Descent

Simulated Annealing

Advantages
Can (with a sufficiently slow cooling schedule) converge to the optimal solution (global minimum)

Disadvantages
May be slower than gradient descent
Much more complicated implementation

Page 18:

Alternatives To Gradient Descent

Genetic Algorithms / Evolutionary Strategies

Advantages
Faster than simulated annealing
Less likely to get stuck in local minima

Disadvantages
Slower than gradient descent
Memory intensive for large nets

Page 19:

Alternatives To Gradient Descent

Simplex Algorithm

Advantages
Similar to gradient descent but faster
Easy to implement

Disadvantages
Does not guarantee a global minimum

Page 20:

Enhancements To Gradient Descent

Momentum
Adds a percentage of the last movement to the current movement

Page 21:

Enhancements To Gradient Descent

Momentum
Useful to get over small bumps in the error function
Often finds a minimum in fewer steps

Δw(t) = −η · δ · y + α · Δw(t−1)

Δw is the change in weight
η is the learning rate
δ is the error
y is different depending on which layer we are calculating
α is the momentum parameter
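A small Python sketch of this update (not from the slides; the gradient values and the momentum parameter 0.9 are illustrative). Here grad stands for the δ · y term of the formula above:

def momentum_step(grad, prev_delta_w, eta=0.25, alpha=0.9):
    # delta_w(t) = -eta * grad + alpha * delta_w(t-1)
    return -eta * grad + alpha * prev_delta_w

delta_w = 0.0
for grad in [0.4, 0.35, 0.30]:     # made-up gradients for three consecutive steps
    delta_w = momentum_step(grad, delta_w)
    print(delta_w)                 # the previous movement keeps contributing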

Page 22:

Enhancements To Gradient Descent

Adaptive Backpropagation Algorithm
It assigns each weight its own learning rate
That learning rate is determined by the sign of the gradient of the error function from the last iteration
If the signs are equal, it is more likely to be a shallow slope, so the learning rate is increased
The signs are more likely to differ on a steep slope, so the learning rate is decreased
This will speed up the advancement on gradual slopes (a sketch of the rule follows)
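A minimal sketch of that per-weight rule (the growth and decay factors 1.1 and 0.5 are assumptions, not values from the slides):

def adapt_learning_rate(lr, grad, prev_grad, up=1.1, down=0.5):
    # Same sign on consecutive iterations suggests a shallow slope: grow the rate.
    # A sign change suggests a steep slope (overshoot): shrink the rate.
    if grad * prev_grad > 0:
        return lr * up
    if grad * prev_grad < 0:
        return lr * down
    return lr

print(adapt_learning_rate(0.25, 0.4, 0.3))    # 0.275  (signs agree)
print(adapt_learning_rate(0.25, 0.4, -0.3))   # 0.125  (signs differ)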

Page 23:

Enhancements To Gradient Descent

Adaptive Backpropagation

Possible Problems:
Since we minimize the error for each weight separately, the overall error may increase

Solution:
Calculate the total output error after each adaptation, and if it is greater than the previous error, reject that adaptation and calculate new learning rates

Page 24:

Enhancements To Gradient Descent

SuperSAB (Super Self-Adapting Backpropagation)
Combines the momentum and adaptive methods
Uses the adaptive method and momentum so long as the sign of the gradient does not change
This is an additive effect of both methods, resulting in a faster traversal of gradual slopes
When the sign of the gradient does change, the momentum will cancel the drastic drop in learning rate
This allows the function to roll up the other side of the minimum, possibly escaping local minima (a sketch follows)
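One possible reading of this combination, sketched in Python (the rate factors, momentum parameter, and function name are assumptions, not values from the slides):

def supersab_step(lr, prev_delta, grad, prev_grad,
                  up=1.05, down=0.5, alpha=0.9):
    # Adapt the per-weight learning rate by gradient sign, as on slide 22.
    if grad * prev_grad >= 0:
        lr = lr * up          # sign unchanged: keep accelerating
    else:
        lr = lr * down        # sign changed: cut the learning rate
    # The momentum term is kept in both cases, so a sign change does not stop
    # the movement outright and the step can roll past a local minimum.
    delta = -lr * grad + alpha * prev_delta
    return lr, delta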

Page 25:

Enhancements To Gradient Descent

SuperSAB
Experiments show that SuperSAB converges faster than gradient descent
Overall this algorithm is less sensitive (and so is less likely to get caught in local minima)

Page 26:

Other Ways To Minimize Error

Varying training data
Cycle through input classes
Randomly select from input classes

Add noise to training data
Randomly change the value of an input node (with low probability)

Retrain with expected inputs after initial training
E.g. speech recognition
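A minimal sketch of the noise idea (assuming binary input vectors, e.g. pixel data; the 5% flip probability is an assumption):

import random

def add_noise(inputs, p=0.05):
    # With low probability p, flip the value of each binary input node.
    return [(1 - x) if random.random() < p else x for x in inputs]

noisy = add_noise([1, 0, 1, 1, 0, 0, 1, 0])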

Page 27:

Other Ways To Minimize Error

Adding and removing neurons from layers
Adding neurons speeds up learning but may cause loss in generalization
Removing neurons has the opposite effect

Page 28:

Resources

Artificial Neural Networks, Backpropagation, J. Henseler

Artificial Intelligence: A Modern Approach, S. Russell & P. Norvig

501 notes, J.R. Parker
www.dontveter.com/bpr/bpr.html
www.dse.doc.ic.ac.uk/~nd/surprise_96/journal/vl4/cs11/report.html