comp150dl Lecture 12: Activity Recognition and Unsupervised Learning 1 Tuesday April 4, 2017

Lecture 12: Activity Recognition and Unsupervised Learning · - Vote for Final Day and Location on Doodle, if you didn’t get a Doodle link let me know - Complain about AWS availability

Aug 13, 2020


dariahiddleston
Page 1

Lecture 12: Activity Recognition and Unsupervised Learning

Tuesday April 4, 2017

Page 2

Announcements!

- International Max Planck Research School for Intelligent Systems, with director Michael Black: applications open for 100 new PhD students

- Final Project milestones due today

- Vote for Final Day and Location on Doodle; if you didn’t get a Doodle link, let me know

- Complain about AWS availability to t-staff

Page 3

* Original slides borrowed from Andrej Karpathy and Li Fei-Fei, Stanford cs231n

Activity Recognition

Page 4

Latest Iteration: Video Segmentation via Object Flow, Tsai et al., 2016

Classic Video Segmentation: Optical Flow

[G. Farnebäck, “Two-frame motion estimation based on polynomial expansion,” 2003] [T. Brox and J. Malik, “Large displacement optical flow: Descriptor matching in variational motion estimation,” 2011]

Page 6

Case Study: AlexNet [Krizhevsky et al. 2012]

Input: 227x227x3 images

First layer (CONV1): 96 11x11 filters applied at stride 4 => Output volume [55x55x96]

Q: What if the input is now a small chunk of video? E.g. [227x227x3x15]?

A: Extend the convolutional filters in time, perform spatio-temporal convolutions! E.g. can have 11x11xT filters, where T = 2..15.
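The CONV1 numbers can be checked with the standard output-size formula (N - F + 2P) / S + 1. The sketch below is only an illustration: the helper name `conv_output_size`, zero padding, and a temporal stride of 1 are assumptions, not details taken from the slides.

```python
def conv_output_size(n, f, stride=1, pad=0):
    """Output length along one axis: (n - f + 2*pad) / stride + 1."""
    span = n - f + 2 * pad
    if span < 0 or span % stride != 0:
        raise ValueError("filter does not tile the input evenly")
    return span // stride + 1


# Spatial axes of AlexNet CONV1: 227 input, 11x11 filter, stride 4.
side = conv_output_size(227, 11, stride=4)
print([side, side, 96])  # [55, 55, 96]

# Assumed extension to a 15-frame clip: an 11x11xT filter slides over
# time with temporal stride 1, leaving 15 - T + 1 time steps.
for t in (2, 8, 15):
    print(t, conv_output_size(15, t))
```

With T = 15 the filter spans the whole clip and the temporal axis collapses to a single step; smaller T keeps some temporal resolution for later layers.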

Page 7

Spatio-Temporal ConvNets

[3D Convolutional Neural Networks for Human Action Recognition, Ji et al., 2010]

Page 8

Spatio-Temporal ConvNets

[Large-scale Video Classification with Convolutional Neural Networks, Karpathy et al., 2014]

Learned filters on the first layer

Page 9

Long-time Spatio-Temporal ConvNets

[Sequential Deep Learning for Human Action Recognition, Baccouche et al., 2011]

LSTM way before it was cool

(This paper was ahead of its time. Cited 65 times.)

Page 11

Spatio-Temporal ConvNets

[Two-Stream Convolutional Networks for Action Recognition in Videos, Simonyan and Zisserman 2014]

[T. Brox and J. Malik, “Large displacement optical flow: Descriptor matching in variational motion estimation,” 2011]

Two-stream version works much better than either alone.

Page 12

Long-time Spatio-Temporal ConvNets

All 3D ConvNets so far used local motion cues to get extra accuracy (e.g. half a second or so).

Q: What if the temporal dependencies of interest are much, much longer? E.g. several seconds?

[Figure labels: event 1, event 2]

Page 13

Long-time Spatio-Temporal ConvNets

[Long-term Recurrent Convolutional Networks for Visual Recognition and Description, Donahue et al., 2015]

Page 14

Venugopalan et al., “Sequence to Sequence -- Video to Text,” 2015.

Page 15

Long-time Spatio-Temporal ConvNets

[Delving Deeper into Convolutional Networks for Learning Video Representations, Ballas et al., 2016]

All neurons in the ConvNet are recurrent.

Only requires (existing) 2D CONV routines. No need for 3D spatio-temporal CONV.

Update to vanilla RNN (aka GRU)

[Equation labels: update gate, reset gate]
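A minimal sketch of one GRU update, showing where the update gate and reset gate enter. It uses one common convention for the final blend, and plain matrix products stand in for the 2D convolutions used in the convolutional variant. The sizes, parameter names, and random inputs are all hypothetical.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def gru_step(x, h, p):
    """One GRU update; dense products stand in for 2D convolutions."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)            # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)            # reset gate
    h_tilde = np.tanh(p["W"] @ x + p["U"] @ (r * h))  # candidate state
    return (1.0 - z) * h + z * h_tilde                # blend old and new


rng = np.random.default_rng(0)
d_in, d_h = 4, 3  # hypothetical input and hidden sizes
params = {k: rng.normal(scale=0.1, size=(d_h, d_in)) for k in ("Wz", "Wr", "W")}
params.update({k: rng.normal(scale=0.1, size=(d_h, d_h)) for k in ("Uz", "Ur", "U")})

h = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):  # five time steps of made-up input
    h = gru_step(x, h, params)
print(h.shape)  # (3,)
```

The update gate decides how much of the previous state is overwritten, and the reset gate decides how much of it feeds the candidate state.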

Page 16

Propagation

Graph Cut for Video:

Bilateral Space Video Segmentation, Marki et al., 2016

Page 17

Unsupervised Learning

Page 18

Unsupervised Learning Overview

- Autoencoders
  - Vanilla
  - Variational

- Adversarial Networks

Page 20

Supervised vs Unsupervised

Supervised Learning

- Data: (x, y), where x is data, y is label

- Goal: Learn a function to map x -> y

- Examples: Classification, regression, object detection, semantic segmentation, image captioning, etc.

Unsupervised Learning

- Data: x. Just data, no labels!

- Goal: Learn some structure of the data

- Examples: Clustering, dimensionality reduction, feature learning, generative models, etc.

Page 21

Unsupervised Learning

- Autoencoders
  - Traditional: feature learning
  - Variational: generate samples

- Generative Adversarial Networks: generate samples

Page 24

Autoencoders

[Diagram: Input data x -> Encoder -> Features z]

Encoder:

- Originally: Linear + nonlinearity (sigmoid)

- Later: Deep, fully-connected

- Later: ReLU CNN

z usually smaller than x (dimensionality reduction). Prevents trivial solution.

Page 27

Autoencoders

[Diagram: Input data x -> Encoder -> Features z -> Decoder -> Reconstructed input data x̂]

Encoder: 4-layer conv

Decoder: 4-layer upconv

Goal: Train for reconstruction with no labels!

Encoder / decoder sometimes share weights

Example: dim(x) = D, dim(z) = H, w_e: H x D, w_d: D x H = w_e^T
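The weight-sharing example can be sketched as a single-layer autoencoder whose decoder matrix is the transpose of the encoder matrix. This is a minimal illustration, not the 4-layer conv / upconv model on the slide. The sizes, the sigmoid encoder, the linear decoder, and the squared-error loss are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 8, 3                                # dim(x) = D, dim(z) = H (made-up sizes)
w_e = rng.normal(scale=0.1, size=(H, D))   # encoder weights, H x D
w_d = w_e.T                                # decoder weights, D x H, tied to w_e


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def encode(x):
    return sigmoid(w_e @ x)                # features z, shape (H,)


def decode(z):
    return w_d @ z                         # reconstruction, shape (D,)


x = rng.normal(size=D)                     # one unlabeled data point
z = encode(x)
x_hat = decode(z)
l2_loss = np.sum((x_hat - x) ** 2)         # reconstruction error, no labels used
print(z.shape, x_hat.shape)                # (3,) (8,)
```

Because `w_d` is a transposed view of `w_e`, any update to the encoder weights changes the decoder as well, which is what sharing weights means here.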

Page 28

[Diagram: Input data x -> Encoder -> Features z -> Decoder -> Reconstructed input data x̂; loss function (often L2) between x̂ and x]

Train for reconstruction with no labels!

Page 29

Autoencoders

[Diagram: Input data x -> Encoder -> Features z -> Decoder -> Reconstructed input data x̂]

After training, throw away decoder!

Page 30

[Diagram: Input data x -> Encoder -> Features z -> Classifier -> Predicted label ŷ; loss function (softmax, etc.) between ŷ and label y]

[Example labels: plane, dog, deer, bird, truck]

Use encoder to initialize a supervised model

Train for final task (sometimes with small data)

Fine-tune encoder jointly with classifier

Page 31

Autoencoders: Greedy Training

Hinton and Salakhutdinov, “Reducing the Dimensionality of Data with Neural Networks”, Science 2006

In the mid 2000s, layer-wise pretraining with Restricted Boltzmann Machines (RBM) was common

Training deep nets was hard in 2006!

Not common anymore

With ReLU, proper initialization, batchnorm, Adam, etc., deep nets easily train from scratch

Page 32

Alternatives

• Siamese Networks

• Triplet Networks

• Pretraining on unrelated supervised task (aka Transfer Learning)

Creation of a Deep Convolutional Auto-Encoder in Caffe, Volodymyr Turchenko, Artur Luczak. arXiv 2015

Page 33

Generating Samples

• What if you want to make new examples?

• Need Generative Model

• MCMC?
  • Too slow, hard to scale

• MAP / Maximization?
  • Strong overfitting of high dimensional data: won’t generate a large variety of interesting things

Page 34

Variational Autoencoder: a Generative Method

- A Bayesian spin on an autoencoder - lets us generate data!

- Assume our data is generated like this:

[Diagram: z -> x. z: sample from true prior; x: sample from true conditional]

Intuition: x is an image, z gives class, orientation, attributes, etc.

Problem: Estimate 𝜃 without access to latent states!

Kingma and Welling, “Auto-Encoding Variational Bayes”, ICLR 2014

Page 35

Variational Autoencoder: Encoder

- By Bayes Rule the posterior is: p_𝜃(z | x) = p_𝜃(x | z) p_𝜃(z) / p_𝜃(x)
  - p_𝜃(x | z): use decoder network =)
  - p_𝜃(z): Gaussian =)
  - p_𝜃(x): intractable integral =(

Approximate posterior with encoder network q_𝜙(z | x)

[Diagram: Data point x -> Encoder network with parameters 𝜙 (fully-connected or convolutional) -> 𝜇z, Σz: mean and (diagonal) covariance of q_𝜙(z | x)]

Kingma and Welling, ICLR 2014

Page 36

Solution: Approximate posterior with encoder network

Page 37

Variational Autoencoder: a Generative Method

Kingma and Welling, “Auto-Encoding Variational Bayes”, ICLR 2014

Page 38

[Equation labels: 𝜃 = Decoder Network Parameters, 𝜙 = Encoder Network Parameters]
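For reference, here is a hedged restatement of the variational lower bound from the cited Kingma and Welling paper, with 𝜃 as the decoder network parameters and 𝜙 as the encoder network parameters. The notation follows the paper and is an assumption about what the slide showed.

```latex
\log p_\theta(x) \;\ge\; \mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  - D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p_\theta(z) \right)
```

The first term rewards accurate reconstruction through the decoder. The second keeps the encoder's approximate posterior close to the prior.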

Page 39

Variational Autoencoder

[Diagram, from input to output:]

- x: Data point

- Encoder network

- 𝜇z, Σz: Mean and (diagonal) covariance of q_𝜙(z | x) (should be close to prior p_𝜃(z))

- z: Sample from q_𝜙(z | x)

- Decoder network

- 𝜇x, Σx: Mean and (diagonal) covariance of p_𝜃(x | z) (should be close to data x)

- x̂: Reconstructed; sample from p_𝜃(x | z)

Training like a normal autoencoder: reconstruction loss at the end, regularization toward prior in middle

Kingma and Welling, ICLR 2014
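The two training terms, reconstruction loss at the end and regularization toward the prior in the middle, can be sketched for a diagonal Gaussian encoder as follows. The standard normal prior, the log-variance parameterization, the squared-error reconstruction term (a Gaussian decoder with fixed variance, up to constants), and all function names are assumptions for illustration.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))


def vae_loss(x, x_mean, mu, log_var):
    """Reconstruction term plus the pull toward the prior."""
    reconstruction = np.sum((x - x_mean) ** 2)  # assumed fixed-variance Gaussian decoder
    return reconstruction + kl_to_standard_normal(mu, log_var)


rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0])        # made-up encoder mean
log_var = np.array([0.0, -0.2])   # made-up encoder log-variance
z = reparameterize(mu, log_var, rng)
print(z.shape)  # (2,)
```

The KL term is zero exactly when the encoder outputs zero mean and unit variance, that is, when the approximate posterior matches the assumed prior.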

Page 40

Autoencoder Overview

- Traditional Autoencoders
  - Try to reconstruct input
  - Used to learn features, initialize supervised model
  - Not used much anymore

- Variational Autoencoders
  - Bayesian meets deep learning
  - Sample from model to generate images

Page 41

Generative Adversarial Networks

Page 46

Generative Adversarial Nets

Can we generate images with less math?

[Diagram: Random noise z -> Generator -> Fake image x; fake image x or real image x -> Discriminator -> y: Real or fake?]

Fake examples: from generator

Real examples: from dataset

Train generator and discriminator jointly

After training, easy to generate images

Goodfellow et al, “Generative Adversarial Nets”, NIPS 2014
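Joint training of generator and discriminator is usually written as a two-player game over the value function E[log D(x)] + E[log(1 - D(G(z)))] from the cited Goodfellow et al. paper. The sketch below only evaluates that quantity on made-up discriminator outputs. The function name `gan_value` and the clipping constant are assumptions, and no networks are trained.

```python
import numpy as np


def gan_value(d_real, d_fake, eps=1e-12):
    """Monte Carlo estimate of E[log D(x)] + E[log(1 - D(G(z)))]."""
    d_real = np.clip(d_real, eps, 1.0 - eps)   # avoid log(0)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))


# Made-up discriminator outputs: probability that each input is real.
confident = gan_value(np.array([0.9, 0.8]), np.array([0.1, 0.2]))
fooled = gan_value(np.array([0.5, 0.5]), np.array([0.5, 0.5]))

# The discriminator tries to raise this value, the generator to lower it.
print(confident > fooled)  # True
```

When the discriminator outputs 0.5 everywhere it cannot tell real from fake, and the value equals -2 log 2.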

Page 47

Generative Adversarial Nets

[Figure labels: (Decoder), (Encoder)]

Page 48

[Figure labels: Random Input, Generative Network, Generated Image]

Radford et al, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016

Page 49

[Figure labels: Real Training Image, Discriminative Network, Classified Label Vector]

This is just a CNN!

Page 50

Generative Adversarial Nets: Simplifying

Radford et al, ICLR 2016

Samples from the model look amazing!

Page 51

Generative Adversarial Nets: Simplifying

Radford et al, ICLR 2016

Interpolating between random points in latent space
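A minimal sketch of interpolating between two random latent points. A straight-line path, a 100-dimensional uniform latent space, and nine steps are all assumptions. Rendering the frames would need a trained generator, which is not included here.

```python
import numpy as np


def interpolate(z0, z1, steps):
    """Evenly spaced points on the straight line from z0 to z1."""
    alphas = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - alphas) * z0 + alphas * z1


rng = np.random.default_rng(0)
z0, z1 = rng.uniform(-1.0, 1.0, size=(2, 100))  # two random latent points
path = interpolate(z0, z1, steps=9)
print(path.shape)  # (9, 100)
# Each row would be fed to the generator to produce one image in the sequence.
```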

Page 52

Generative Adversarial Nets: Vector Math

Samples from the model: smiling woman, neutral woman, neutral man

Average Z vectors, do arithmetic: smiling woman - neutral woman + neutral man = smiling man

Radford et al, ICLR 2016
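The "average Z vectors, do arithmetic" step can be sketched as below, using the smiling woman - neutral woman + neutral man example from the cited Radford et al. paper. The random vectors are stand-ins for the latent codes of real samples. The latent size and the three samples per concept are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 100  # hypothetical latent size

# Stand-ins for the Z vectors of three samples per concept; in practice
# these would be the latent codes that produced the chosen samples.
smiling_woman = rng.normal(size=(3, dim))
neutral_woman = rng.normal(size=(3, dim))
neutral_man = rng.normal(size=(3, dim))

# Average each group first, then do the arithmetic on the means.
z_smiling_man = (smiling_woman.mean(axis=0)
                 - neutral_woman.mean(axis=0)
                 + neutral_man.mean(axis=0))
print(z_smiling_man.shape)  # (100,)
```

Averaging before the arithmetic smooths out variation between individual samples. The resulting vector would then be passed through the generator.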

Page 53

Generative Adversarial Nets: Vector Math

Glasses man - no glasses man + no glasses woman = woman with glasses

Radford et al, ICLR 2016

Page 54

Learning what to Ignore

Tzeng et al, “Adversarial Discriminative Domain Adaptation”, arXiv 2017.

Page 55

Interaction

Sangkloy et al, “Scribbler: Controlling Deep Image Synthesis with Sketch and Color”, Siggraph 2017.

Page 56

Deep Learning and Generalization

Page 57

(super short) primer on generalization

Page 58

Central finding of Zhang et al (2017):

deep neural nets are able to fit random labels and data

So how are Deep Nets achieving good generalization?
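A minimal sketch of how a random-label experiment might be set up: each training label is replaced by a uniformly random class with some probability. The function name, the `corrupt_prob` parameter, and the synthetic label array are assumptions for illustration, not the authors' code.

```python
import numpy as np


def randomize_labels(labels, num_classes, corrupt_prob, rng):
    """Replace each label by a uniform random class with probability corrupt_prob."""
    labels = np.asarray(labels).copy()
    mask = rng.random(labels.shape) < corrupt_prob
    labels[mask] = rng.integers(0, num_classes, size=int(mask.sum()))
    return labels


rng = np.random.default_rng(0)
y = rng.integers(0, 10, size=50000)  # stand-in for 10-class training labels
y_random = randomize_labels(y, num_classes=10, corrupt_prob=1.0, rng=rng)

# With full corruption, new and old labels agree only by chance (about 1 in 10).
print(np.mean(y_random == y))
```

A network that still reaches low training error on `y_random` must be memorizing, since those labels carry no information about the inputs.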

Page 59

datasets and models

• CIFAR10 dataset: 60000 images (50000 train, 10000 validation), 10 categories

• ImageNet dataset: 1,281,167 training images, 50000 validation images, 1000 categories

• AlexNet, Inception, multilayer perceptrons

Page 60

randomization tests:

•

Page 61

performance on randomized tests

Page 62

explicit regularization does not help much