10701 Recitation 3 - Backpropagation, CNN
SVM, Kernel, Backpropagation, CNN
Transcript
Page 1: 10701 Recitation 3 - Backpropagation, CNN (source PDF: 10701/slides/10-701_Fall_2017_Recitation_6_CNN.pdf)

10701 Recitation 3 - Backpropagation, CNN

SVM, Kernel, Backpropagation, CNN

Page 2:

Backpropagation and CNN

● Simple neural network with demo of backpropagation
  ○ XOR (need to search for it)
● Why is backpropagation helpful in neural networks?
● LeNet implementation
  ○ What are k, s, p, … in the convolutional layer and pooling layer
  ○ Demo of LeNet in action

Page 3:

How many layers do you need to construct a neural network that achieves XOR?

Backpropagation simple example: XOR


Page 5:

Backpropagation simple example: XOR

[Figure: a 2-2-1 XOR network. Inputs x1 and x2 feed two sigmoid (σ) hidden units through weights w1-w4; the hidden units feed a sigmoid output unit through w5 and w6; constant inputs of 1 carry the biases b1-b3.]
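The live demo from the recitation is not in the transcript. Below is a minimal NumPy sketch of the same idea, a 2-2-1 sigmoid network trained on XOR with hand-written backpropagation. The squared-error loss, learning rate, and initialization are my assumptions; the slides do not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # the four XOR inputs
t = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

# 2-2-1 network: W1 ~ w1..w4 and bh ~ b1, b2 on the slide; W2 ~ w5, w6 and bo ~ b3
W1 = rng.normal(size=(2, 2)); bh = np.zeros(2)
W2 = rng.normal(size=(2, 1)); bo = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0  # assumed learning rate
for step in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + bh)             # hidden activations
    y = sigmoid(h @ W2 + bo)             # network output
    # backward pass: apply the chain rule from the loss downward
    dy = (y - t) * y * (1 - y)           # error at the output pre-activation
    dW2 = h.T @ dy / len(X); dbo = dy.mean(0)
    dh = (dy @ W2.T) * h * (1 - h)       # error propagated to the hidden layer
    dW1 = X.T @ dh / len(X); dbh = dh.mean(0)
    # gradient-descent step
    W2 -= lr * dW2; bo -= lr * dbo
    W1 -= lr * dW1; bh -= lr * dbh

print(np.round(y.ravel(), 3))  # should approach [0, 1, 1, 0]; some seeds can get stuck
```

This also answers the question on Page 3: a single hidden layer (two layers of weights) suffices, since XOR is not linearly separable and a network with no hidden layer cannot represent it.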

Page 6:

Derivation

Page 7:

Derivation

Page 8:

Derivation
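The equations on these three "Derivation" slides did not survive extraction. As a hedged reconstruction for the Page 5 network (assuming squared-error loss L = ½(y − t)² and that w1, w2 feed the first hidden unit; the exact weight indexing is my guess):

```latex
% Forward pass (sigma = sigmoid):
h_1 = \sigma(w_1 x_1 + w_2 x_2 + b_1), \quad
h_2 = \sigma(w_3 x_1 + w_4 x_2 + b_2), \quad
y = \sigma(w_5 h_1 + w_6 h_2 + b_3)

% Output weight: one chain-rule step, using sigma'(a) = sigma(a)(1 - sigma(a)):
\frac{\partial L}{\partial w_5}
  = \frac{\partial L}{\partial y}\,\frac{\partial y}{\partial w_5}
  = (y - t)\, y(1 - y)\, h_1

% Hidden weight: reuse the output error term computed above:
\frac{\partial L}{\partial w_1}
  = (y - t)\, y(1 - y)\, w_5\, h_1(1 - h_1)\, x_1
```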

Page 9:

Why backpropagation?

[Figure: a toy feed-forward graph. Inputs x1 and x2 at the bottom; hidden nodes z1, z2, then z3, z4, then z5, z6; output y; the Loss at the top. Weight groups w1-w4, w5-w8, w9-w12, and w13, w14 connect successive layers.]

Interpretation 1: by the chain rule, differentiation proceeds from the outer function to the inner function. For a network this means differentiating the upper (output-side) layers first, hence the backward order of propagation.

Interpretation 2: in the toy example, the number of terms computed by the backward pass is linear in the number of nodes (or weights), whereas computing each weight's gradient along the forward direction takes roughly quadratic work.
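A worked instance of Interpretation 2 on the graph above (the δ notation, and the assumption that w9 feeds z5 from z3, are mine): the backward pass caches one error term per node and reuses it for every weight below.

```latex
% One cached delta per node, computed top-down:
\delta_y = \frac{\partial L}{\partial y}, \qquad
\delta_{z_5} = \delta_y \frac{\partial y}{\partial z_5}, \qquad
\delta_{z_3} = \delta_{z_5}\frac{\partial z_5}{\partial z_3}
             + \delta_{z_6}\frac{\partial z_6}{\partial z_3}, \quad \dots

% Each weight gradient then costs one extra multiplication:
\frac{\partial L}{\partial w_9} = \delta_{z_5}\,\frac{\partial z_5}{\partial w_9}
```

So the total backward work is proportional to the number of edges, while differentiating each weight separately along the forward direction re-multiplies the whole chain of terms, which is what the next two slides count.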

Page 10:

Why backpropagation?

[Figure: the same toy graph as on Page 9.]

In the backward pass, each layer computes some constant number of terms (including the terms carried over from the layer above).

Page 11:

Why backpropagation?

[Figure: the same toy graph as on Page 9.]

Going forward instead, each layer computes 8 more terms than the previous layer, so the total count grows roughly quadratically with depth.

Page 12:

Demo of the convolution operation

Stolen from f16 10601 slides
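The animated demo is not reproducible here; as a substitute, a minimal NumPy sketch of the operation CNN layers actually compute (valid cross-correlation; the function name and the example kernel are illustrative):

```python
import numpy as np

def conv2d(image, kernel, stride=1, padding=0):
    """Slide `kernel` over `image` and sum elementwise products.
    Assumes square 2-D input and kernel for brevity."""
    if padding:
        image = np.pad(image, padding)               # zero-pad the border
    k = kernel.shape[0]
    out_size = (image.shape[0] - k) // stride + 1
    out = np.empty((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            patch = image[i*stride:i*stride+k, j*stride:j*stride+k]
            out[i, j] = np.sum(patch * kernel)       # one receptive field
    return out

# 5x5 input, 3x3 vertical-edge kernel -> 3x3 output (stride 1, no padding)
img = np.arange(25, dtype=float).reshape(5, 5)
ker = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=float)
print(conv2d(img, ker))
```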

Page 13:

What are the stride, the padding, and the size of the receptive field?

https://adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks-Part-2/

Stride: the step size by which the receptive field moves across the input.

Padding: zeros added around the border of the input, so that the filter can cover edge pixels and the output size can be controlled.
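These quantities determine the output size via a standard formula (not printed on the slide, but it is what the k, s, p notation from Page 2 refers to): for input width n, kernel size k, stride s, and padding p, the output width is ⌊(n + 2p − k) / s⌋ + 1. A quick check in Python:

```python
def conv_output_size(n, k, s=1, p=0):
    """Output width of a conv/pool layer: input n, kernel k, stride s, padding p."""
    return (n + 2 * p - k) // s + 1

print(conv_output_size(32, 5))       # 28: LeNet's first 5x5 conv on a 32x32 input
print(conv_output_size(28, 2, s=2))  # 14: 2x2 pooling with stride 2
```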

Page 14:

Layer structure

http://scs.ryerson.ca/~aharley/vis/conv/
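The linked page visualizes a LeNet-style network on MNIST digits. As a rough sketch of the layer structure it depicts (channel counts follow the classic LeNet-5 description; the visualization's exact variant may differ), using the output-size formula from Page 13:

```python
def out_size(n, k, s=1, p=0):
    """Output width: input n, kernel k, stride s, padding p."""
    return (n + 2 * p - k) // s + 1

# LeNet-5-style stack on a 32x32 grayscale input: (name, out_channels, k, s)
layers = [("conv 6@5x5", 6, 5, 1), ("pool 2x2", 6, 2, 2),
          ("conv 16@5x5", 16, 5, 1), ("pool 2x2", 16, 2, 2)]
n = 32
for name, c, k, s in layers:
    n = out_size(n, k, s)
    print(f"{name:>12}: {c} x {n} x {n}")
# -> 6x28x28, 6x14x14, 16x10x10, 16x5x5; then fully connected 400 -> 120 -> 84 -> 10
```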