Understanding Deep Image Representations by Inverting Them

Paper by Aravindh Mahendran, Andrea Vedaldi

Presentation by Anthony Chen

Source: web.cs.ucdavis.edu/~yjlee/teaching/ecs289g-fall2015/anthony2.pdf

Jul 27, 2020




Background

● Feature extraction methods such as SIFT, HOG, and CNNs are widely used, but they are difficult to understand from an information-preservation standpoint.


Contributions

● Novel method to invert representations.

– That is, given a function and its output, recover the original input.

● Analysis of the information preservation of different types of representation (CNN, HOG, SIFT).


Related Work

● DeConvNets

Your thoughts on similarities/differences?


Related Work (2)

● DeConvNets – My thoughts

– DeConvNet reconstructions are encouraged to look like the original image, while this paper enforces no such constraint.

– Therefore, while both can be thought of as inverses, DeConvNet studies how results are obtained, whereas this paper studies information representation/preservation.


Inverting Images

● $\Phi$ is the function representing the CNN; it maps an image $x$ to its representation $\Phi(x)$.

● Let $x_0$ be the original image.

● Goal: Find an $x$ such that $\Phi(x)$ is close to $\Phi_0 = \Phi(x_0)$.


Inverting Images (2)

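● We want to find an $x$, which we will call $x^*$, s.t.

$$x^* = \operatorname{argmin}_{x} \; \ell(\Phi(x), \Phi_0) + \lambda \mathcal{R}(x)$$

where $\Phi_0 = \Phi(x_0)$ is the representation of the original image and $\ell$ is the reconstruction error.

● Here, we add a regularizer $\mathcal{R}(x)$ to ensure that the optimization only searches over “natural images”.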


Inverting Images (3)

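● Given an image reconstruction $x$ and the target representation $\Phi_0 = \Phi(x_0)$, the reconstruction error is given by:

$$\ell(\Phi(x), \Phi_0) = \|\Phi(x) - \Phi_0\|^2$$

● Additional modification: to ensure that the loss near the solution lies in a [0, 1) range, it is normalized:

$$\frac{\|\Phi(\sigma x) - \Phi_0\|^2}{\|\Phi_0\|^2}$$

where $\sigma$ is the average Euclidean norm of natural images in a training set.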
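
The two error measures on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: `phi` is a hypothetical toy representation (a fixed random linear map followed by tanh) standing in for the CNN, and `W`, `x0`, and the default `sigma=1.0` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((8, 16))  # hypothetical weights of a toy "network"

def phi(x):
    """Toy stand-in for the representation (assumption, not the paper's CNN)."""
    return np.tanh(W @ x)

def euclidean_loss(x, phi0):
    """Squared Euclidean distance between phi(x) and the target code phi0."""
    return float(np.sum((phi(x) - phi0) ** 2))

def normalized_loss(x, phi0, sigma=1.0):
    """Same error divided by ||phi0||^2; x is rescaled by sigma before phi."""
    return float(np.sum((phi(sigma * x) - phi0) ** 2) / np.sum(phi0 ** 2))

x0 = rng.standard_normal(16)  # plays the role of the original image
phi0 = phi(x0)                # target representation

print(normalized_loss(x0, phi0))            # zero at the original image
print(normalized_loss(np.zeros(16), phi0))  # one at the all-zero image, since phi(0) = 0
```

With this toy `phi`, the normalized loss is 0 at the original image and 1 at the zero image, which shows why the normalization gives the loss a fixed dynamic range near the solution.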


Regularizers

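● Let $x$ be a mean-subtracted image vector.

● $\alpha$-norm: $\mathcal{R}_\alpha(x) = \|x\|_\alpha^\alpha$ keeps the pixel values within a bounded range.

● Total variation: $\mathcal{R}_{V^\beta}(f) = \int_\Omega \left( \left(\frac{\partial f}{\partial u}\right)^2 + \left(\frac{\partial f}{\partial v}\right)^2 \right)^{\beta/2} du\, dv$

– Penalizes images with large total gradients.

– Discrete version: $\mathcal{R}_{V^\beta}(x) = \sum_{i,j} \left( (x_{i,j+1} - x_{ij})^2 + (x_{i+1,j} - x_{ij})^2 \right)^{\beta/2}$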
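
A minimal sketch of the two regularizers on this slide, assuming a single-channel image stored as a 2-D NumPy array. The default `alpha=6`, the cropping at the image border, and the two test images are assumptions made for the example.

```python
import numpy as np

def alpha_norm(x, alpha=6):
    """Range regularizer: sum_i |x_i|^alpha for a mean-subtracted image x."""
    return float(np.sum(np.abs(x) ** alpha))

def total_variation(x, beta=2.0):
    """Discrete total variation: sum of (dh^2 + dv^2)^(beta/2) over pixels.

    Differences are taken on the (H-1, W-1) region where both neighbours
    exist (a boundary-handling assumption).
    """
    dh = x[:-1, 1:] - x[:-1, :-1]   # x[i, j+1] - x[i, j]
    dv = x[1:, :-1] - x[:-1, :-1]   # x[i+1, j] - x[i, j]
    return float(np.sum((dh ** 2 + dv ** 2) ** (beta / 2.0)))

flat = np.full((4, 4), 0.5)                     # constant image, no gradients
checker = np.indices((4, 4)).sum(axis=0) % 2.0  # checkerboard of 0s and 1s

print(total_variation(flat), total_variation(checker))
print(alpha_norm(flat), alpha_norm(checker))
```

The constant image has zero total variation while the checkerboard is penalized heavily, which is the sense in which this term discourages images with large total gradients.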


Regularizers (2)

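● $\lambda_\alpha$ allows us to set the range of the pixel values. If we want to set the range between [-B, B], then $\lambda_\alpha \approx \sigma^\alpha / (H W B^\alpha)$, where $H \times W$ is the image size and $\sigma$ is the image scaling factor.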

● $\lambda_{V^\beta}$ allows us to say how much variability the reconstruction should have.


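Final objective function

$$\frac{\|\Phi(\sigma x) - \Phi_0\|^2}{\|\Phi_0\|^2} + \lambda_\alpha \mathcal{R}_\alpha(x) + \lambda_{V^\beta} \mathcal{R}_{V^\beta}(x)$$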


Optimization

● Momentum-based gradient descent is used to minimize the objective function.

● The momentum uses a decay factor of 0.9.

● Because the CNN's function is differentiable, its gradient is readily available for this optimization; that is not the case for standard HOG and SIFT implementations. Therefore, HOG and SIFT are implemented as CNNs.
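
The optimization loop can be sketched as follows. This is an illustration only, under several assumptions: `phi` is a hypothetical toy representation (linear map plus tanh) rather than a real CNN, a plain squared-norm penalty stands in for the regularizers, and the learning rate, step count, and `lam` are made-up values. Only the momentum decay of 0.9 comes from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((8, 16))  # hypothetical toy weights

def phi(x):
    """Toy differentiable representation standing in for the CNN."""
    return np.tanh(W @ x)

def objective_and_grad(x, phi0, lam):
    """Normalized loss plus lam * ||x||^2, with its analytic gradient."""
    p = phi(x)
    r = p - phi0                    # residual in feature space
    scale = np.sum(phi0 ** 2)
    value = np.sum(r ** 2) / scale + lam * np.sum(x ** 2)
    # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2
    grad = 2.0 * W.T @ (r * (1.0 - p ** 2)) / scale + 2.0 * lam * x
    return float(value), grad

def invert(phi0, steps=500, lr=0.1, momentum=0.9, lam=1e-3):
    """Gradient descent with momentum, starting from an all-zero image."""
    x = np.zeros(W.shape[1])
    velocity = np.zeros_like(x)
    for _ in range(steps):
        _, grad = objective_and_grad(x, phi0, lam)
        velocity = momentum * velocity - lr * grad  # decayed running direction
        x = x + velocity
    return x

x0 = rng.standard_normal(16)  # plays the role of the original image
phi0 = phi(x0)
x_star = invert(phi0)
start, _ = objective_and_grad(np.zeros(16), phi0, 1e-3)
end, _ = objective_and_grad(x_star, phi0, 1e-3)
print(start, end)  # the objective should drop well below its starting value
```

Note that `x_star` need not equal `x0`: the toy representation maps 16 inputs to 8 outputs, so many inputs share the same code, which mirrors the point that inversion reveals what the representation does and does not preserve.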


Representations: CNN


Representations: SIFT and HOG

● DSIFT and HOG are implemented with a CNN architecture, which makes it easy to compute gradients.

● Orientation binning is approximated using a ReLU layer.

● Pooling into cell histograms is done by a linear filter.

● Cell blocks are then normalized by a normalization layer.

● Values are then clamped to a maximum using a ReLU unit.
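
The pipeline in these bullets can be sketched with NumPy as a chain of linear filters, ReLUs, and a normalization step. This is a simplified illustration, not the paper's exact HOG or DSIFT network: the soft binning rule, the non-overlapping cell pooling, the per-cell L2 normalization, and the values `num_bins=4`, `cell=2`, and `clamp=0.2` are all assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def hog_like(image, num_bins=4, cell=2, clamp=0.2):
    """Simplified HOG-style descriptor built only from CNN-like operations."""
    # 1. Linear filters: finite-difference gradients, cropped to a common shape.
    gx = image[:-1, 1:] - image[:-1, :-1]
    gy = image[1:, :-1] - image[:-1, :-1]
    # 2. Soft orientation binning: project on K directions, then ReLU.
    angles = 2.0 * np.pi * np.arange(num_bins) / num_bins
    bins = relu(gx[..., None] * np.cos(angles) + gy[..., None] * np.sin(angles))
    # 3. Linear pooling: sum responses inside non-overlapping cell x cell blocks.
    h = (bins.shape[0] // cell) * cell
    w = (bins.shape[1] // cell) * cell
    cells = bins[:h, :w].reshape(h // cell, cell, w // cell, cell, num_bins)
    cells = cells.sum(axis=(1, 3))
    # 4. Normalization layer: L2-normalize each cell histogram.
    cells = cells / (np.linalg.norm(cells, axis=-1, keepdims=True) + 1e-8)
    # 5. Clamp from above using a ReLU: min(v, c) = c - relu(c - v).
    return clamp - relu(clamp - cells)

ramp = np.tile(np.arange(5.0), (5, 1))  # intensity increases left to right
desc = hog_like(ramp)
print(desc.shape)   # (2, 2, 4): a 2 x 2 grid of cells with 4 orientation bins
print(desc[0, 0])   # only the bin aligned with the horizontal gradient is active
```

Every step is either linear or a ReLU (plus one normalization), so gradients with respect to the input image can be obtained by back-propagation, which is what the inversion method requires.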


Results

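● Normalized reconstruction error: $\|\Phi(x^*) - \Phi(x_i)\|_2 / N_\Phi$

● $N_\Phi$ is the normalization constant: the average pairwise Euclidean distance between the representations $\Phi(x_i)$ across the 100 test images.

● $\lambda_\alpha = 2.16 \times 10^8$, $\lambda_{V^\beta} = 5$, $\beta = 2$.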
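
As a small worked example of this metric, the sketch below computes the average pairwise Euclidean distance over a set of codes and uses it to normalize a reconstruction error. The three 2-D codes are made-up stand-ins for the representations of the test images.

```python
import numpy as np

def average_pairwise_distance(codes):
    """Mean Euclidean distance over all unordered pairs of rows in codes."""
    diffs = codes[:, None, :] - codes[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    i, j = np.triu_indices(len(codes), k=1)  # each pair counted once
    return float(dists[i, j].mean())

def normalized_error(code_reconstructed, code_original, n_phi):
    """Feature-space distance divided by the normalization constant."""
    return float(np.linalg.norm(code_reconstructed - code_original) / n_phi)

# Hypothetical codes of three "test images" (pairwise distances 5, 10, 5).
codes = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
n_phi = average_pairwise_distance(codes)
err = normalized_error(codes[1], codes[0], n_phi)
print(n_phi, err)
```

An error of 1 under this normalization means the reconstruction's code is as far from the target as a typical other test image's code, so values well below 1 indicate a faithful inversion.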


Results: SIFT and HOG

● Using bilinear orientation assignment (the HOGb variant) greatly improves HOG reconstruction.


Results: SIFT and HOG (2)


Results: CNN

● Experiments are run allowing different levels of total variation.

– λ1 = 0.5, λ2 = 5, λ3 = 50


Test Images


Results: CNN (2)


Results: CNN (3)

● Reconstructing from a subset of the network illustrates what that subset encodes.


Results (4): Variance in Reconstruction


Effects of parameter tuning

Decreasing the regularization constant leads to higher-variance reconstructions. These images are harder to recognize, yet they still achieve low reconstruction error.


Future Work

● Use this inverse technique to improve CNN architecture.

● Use this technique on other forms of neural networks (LSTM)?


Conclusion

● This paper provides a novel method to study and visualize information preservation in a CNN.

● Formalizes relationship between CNN and shallow feature representation.