Page 1

CVPR12 Tutorial on Deep Learning

Sparse Coding

Kai Yu

[email protected]

Department of Multimedia, Baidu

Page 2

Relentless research on visual recognition


Caltech 101

PASCAL VOC

80 Million Tiny Images

ImageNet

Page 3

The pipeline of machine visual perception


Low-level sensing

Pre-processing

Feature extract.

Feature selection

Inference: prediction, recognition

• Most critical for accuracy
• Accounts for most of the computation at test time
• Most time-consuming in the development cycle
• Often hand-crafted in practice

Most Efforts in Machine Learning

Page 4

Computer vision features

SIFT, Spin image, HoG, RIFT, GLOH

Slide credit: Andrew Ng

Page 5

Learning features from data


Low-level sensing

Pre-processing

Feature extract.

Feature selection

Inference: prediction, recognition

Feature Learning: instead of designing features, let's design feature learners

Machine Learning

Page 6

Learning features from data via sparse coding


Low-level sensing

Pre-processing

Feature extract.

Feature selection

Inference: prediction, recognition

Sparse coding offers an effective building block to learn useful features

Page 7

Outline

1. Sparse coding for image classification

2. Understanding sparse coding

3. Hierarchical sparse coding

4. Other topics: e.g. structured model, scale-up, discriminative training

5. Summary


Page 8

“BoW representation + SPM” Paradigm - I

Figure credit: Fei-Fei Li

Bag-of-visual-words representation (BoW) based on VQ coding

Page 9

“BoW representation + SPM” Paradigm - II

Figure credit: Svetlana Lazebnik

Spatial pyramid matching: pooling at different scales and locations (a pooling sketch follows)
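To make the pooling step concrete, here is a minimal Python sketch of spatial-pyramid pooling, assuming numpy; the function name spm_pool, the grid levels, and the pooling operator are illustrative choices, not code from the tutorial.

```python
import numpy as np

def spm_pool(codes, xy, levels=(1, 2, 4), pool=np.mean):
    """Pool per-descriptor codes over a spatial pyramid.

    codes: (n, k) array, one code per local descriptor
    xy:    (n, 2) descriptor coordinates, normalized to [0, 1)
    Returns one pooled k-vector per grid cell, concatenated.
    """
    feats = []
    k = codes.shape[1]
    for g in levels:                                    # 1x1, 2x2, 4x4 grids
        cell = np.minimum((xy * g).astype(int), g - 1)  # cell index per point
        idx = cell[:, 0] * g + cell[:, 1]
        for c in range(g * g):
            members = codes[idx == c]
            feats.append(pool(members, axis=0) if len(members)
                         else np.zeros(k))
    return np.concatenate(feats)                        # (21 * k,) for (1, 2, 4)
```

With VQ codes and pool=np.mean this reduces to the per-cell histograms of the BoW pipeline.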

Page 10

Image Classification using “BoW + SPM”


Dense SIFT → VQ Coding → Spatial Pooling → Classifier

Page 11

The Architecture of “Coding + Pooling”

• e.g., convolutional neural net, HMAX, BoW, …

Coding → Pooling → Coding → Pooling

Page 12

“BoW+SPM” has two coding+pooling layers

Local Gradients → Pooling (e.g., SIFT, HOG) → VQ Coding → Average Pooling (obtain histogram) → SVM

The SIFT feature itself follows a coding+pooling operation.

Page 13

Develop better coding methods

Better Coding → Better Pooling → Better Coding → Better Pooling → Better Classifier

- Coding: a nonlinear mapping of data into another feature space
- Better coding methods: sparse coding, RBMs, auto-encoders

Page 14

What is sparse coding?

Sparse coding (Olshausen & Field, 1996) was originally developed to explain early visual processing in the brain (edge detection).

Training: given a set of random patches x, learn a dictionary of bases [Φ1, Φ2, …]

Coding: for a data vector x, solve the LASSO problem min_a ‖x − Σ_j a_j Φ_j‖² + λ‖a‖₁ to find the sparse coefficient vector a

Page 15

Sparse coding: training time

Input: image patches x1, x2, …, xm (each in R^d)
Learn: a dictionary of bases Φ1, …, Φk (each also in R^d)

Alternating optimization:

1. Fix the dictionary Φ1, …, Φk, optimize the activations a (a standard LASSO problem)

2. Fix the activations a, optimize the dictionary Φ1, …, Φk (a convex QP problem)
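A minimal sketch of this alternating scheme, assuming numpy and scikit-learn; the function name, dictionary size, and penalty value are illustrative, not the tutorial's settings.

```python
import numpy as np
from sklearn.linear_model import Lasso

def learn_dictionary(X, k=64, lam=0.1, n_iters=20, seed=0):
    """X: (m, d) matrix of image patches, one patch per row."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((k, X.shape[1]))
    D /= np.linalg.norm(D, axis=1, keepdims=True)        # unit-norm bases
    for _ in range(n_iters):
        # Step 1: fix the dictionary, solve one LASSO problem per patch
        # (all patches at once via multi-target Lasso).
        lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=2000)
        A = lasso.fit(D.T, X.T).coef_                    # (m, k) sparse codes
        # Step 2: fix the activations, solve the convex least-squares
        # problem for the dictionary, then re-normalize each basis.
        D, *_ = np.linalg.lstsq(A, X, rcond=None)
        D /= np.maximum(np.linalg.norm(D, axis=1, keepdims=True), 1e-8)
    return D, A
```

In practice one would reach for an off-the-shelf solver such as sklearn.decomposition.DictionaryLearning, which implements the same alternation.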

Page 16

Sparse coding: testing time

Input: a novel image patch x (in R^d) and the previously learned bases Φ1, …, Φk
Output: the sparse representation a = [a1, a2, …, ak] of the patch x

x ≈ 0.8 * [basis] + 0.3 * [basis] + 0.5 * [basis] (basis images omitted)

Represent x as: a = [0, 0, …, 0, 0.8, 0, …, 0, 0.3, 0, …, 0, 0.5, …]
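At test time, encoding reduces to one LASSO solve per patch; a hedged sketch, where D is a learned (k, d) dictionary as above:

```python
import numpy as np
from sklearn.decomposition import sparse_encode

def encode(x, D, lam=0.1):
    """x: (d,) patch, D: (k, d) dictionary -> sparse code a of shape (k,)."""
    a = sparse_encode(x.reshape(1, -1), D, algorithm='lasso_lars', alpha=lam)
    return a.ravel()   # mostly zeros, e.g. 0.8, 0.3, 0.5 at three positions
```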

Page 17

Sparse coding illustration

Natural images → learned bases (Φ1, …, Φ64): “edges”

[Figure: natural image patches and the 64 learned edge-like bases; axis ticks omitted]

Test example:

x ≈ 0.8 * Φ36 + 0.3 * Φ42 + 0.5 * Φ63

[a1, …, a64] = [0, 0, …, 0, 0.8, 0, …, 0, 0.3, 0, …, 0, 0.5, 0] (feature representation)

Compact & easily interpretable

Slide credit: Andrew Ng

Page 18

Self-taught Learning [Raina, Lee, Battle, Packer & Ng, ICML 07]

Motorcycles / Not motorcycles / Unlabeled images

Testing: What is this?

Slide credit: Andrew Ng

Page 19

Classification Result on Caltech 101

Caltech-101: 9K images, 101 classes

- SIFT VQ + nonlinear SVM: 64%
- Pixel sparse coding + linear SVM: ~50%

Page 20

Sparse Coding on SIFT – the ScSPM algorithm

Local Gradients → Pooling (e.g., SIFT, HOG) → Sparse Coding → Max Pooling → Linear Classifier

[Yang, Yu, Gong & Huang, CVPR09]
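The shape of the pipeline, as a hedged sketch: it reuses the spm_pool sketch from the SPM slide with max pooling; the dictionary D, the descriptor arrays, and the penalty are assumptions, not the authors' code.

```python
import numpy as np
from sklearn.decomposition import sparse_encode
from sklearn.svm import LinearSVC

def scspm_feature(sift, xy, D, lam=0.15):
    """sift: (n, 128) dense SIFT; xy: (n, 2) in [0, 1); D: (k, 128)."""
    codes = np.abs(sparse_encode(sift, D, algorithm='lasso_lars', alpha=lam))
    return spm_pool(codes, xy, pool=np.max)    # spatial max pooling per cell

# X = np.stack([scspm_feature(s, p, D) for s, p in zip(sift_list, xy_list)])
# clf = LinearSVC().fit(X, labels)             # a linear SVM suffices on these codes
```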

Page 21

Sparse Coding on SIFT – the ScSPM algorithm [Yang, Yu, Gong & Huang, CVPR09]

Caltech-101:

- SIFT VQ + nonlinear SVM: 64%
- SIFT sparse coding + linear SVM (ScSPM): 73%

Page 22

Summary: Accuracies on Caltech 101

Dense SIFT → VQ Coding → Spatial Pooling → Nonlinear SVM: 64%

Sparse Coding (on raw pixels) → Spatial Max Pooling → Linear SVM: ~50%

Dense SIFT → Sparse Coding → Spatial Max Pooling → Linear SVM: 73%

Key message:
- Deep models are preferred
- Sparse coding is a better building block

Page 23

Outline

1. Sparse coding for image classification

2. Understanding sparse coding

3. Hierarchical sparse coding

4. Other topics: e.g. structured model, scale-up, discriminative training

5. Summary


Page 24

Outline

1. Sparse coding for image classification

2. Understanding sparse coding
– Connections to RBMs, autoencoders, …
– Sparse activations vs. sparse models, …
– Sparsity vs. locality – local sparse coding methods

3. Hierarchical sparse coding

4. Other topics: e.g. structured model, scale-up, discriminative training

5. Summary


Page 25

Classical sparse coding

- a is sparse
- a is often higher-dimensional than x
- the activation a = f(x) is a nonlinear, implicit function of x
- the reconstruction x’ = g(a) is linear and explicit

[Diagram: x → f(x) → a (encoding); a → g(a) → x’ (decoding)]

Page 26

RBM & autoencoders

- also involve an activation and a reconstruction
- but have an explicit f(x)
- do not necessarily enforce sparsity on a
- but if sparsity is imposed on a, results often improve [e.g. sparse RBM, Lee et al. NIPS08]

[Diagram: x → f(x) → a (encoding); a → g(a) → x’ (decoding)]

Page 27

Sparse coding: A broader view

Any feature mapping from x to a, i.e. a = f(x), where

- a is sparse (and often higher-dimensional than x)
- f(x) is nonlinear
- a reconstruction x’ = g(a) exists, such that x’ ≈ x

[Diagram: x → f(x) → a; a → g(a) → x’]

Therefore, sparse RBMs, sparse auto-encoders, and even VQ can be viewed as forms of sparse coding.

Page 28

Outline

1. Sparse coding for image classification

2. Understanding sparse coding
– Connections to RBMs, autoencoders, …
– Sparse activations vs. sparse models, …
– Sparsity vs. locality – local sparse coding methods

3. Hierarchical sparse coding

4. Other topics: e.g. structured model, scale-up, discriminative training

5. Summary


Page 29

Sparse activations vs. sparse models

For a general function learning problem a = f(x):

1. Sparse model: f(x)’s parameters are sparse
   - example: LASSO f(x) = <w, x>, where w is sparse
   - the goal is feature selection: all data points select a common subset of features
   - a hot topic in machine learning

2. Sparse activations: f(x)’s outputs are sparse
   - example: sparse coding a = f(x), where a is sparse
   - the goal is feature learning: different data points activate different feature subsets

A toy sketch contrasting the two follows.
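A toy numpy/scikit-learn sketch contrasting the two notions; all names and values are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.decomposition import sparse_encode

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
y = 0.2 * X[:, 1] + 0.1 * X[:, 3]            # only features 1 and 3 matter

# Sparse model: one sparse weight vector w, shared by every data point.
w = Lasso(alpha=0.01, fit_intercept=False).fit(X, y).coef_
print(w)                                      # same non-zero positions for all x

# Sparse activations: each row of A is sparse, with different non-zeros per row.
D = rng.standard_normal((16, 6))
D /= np.linalg.norm(D, axis=1, keepdims=True)
A = sparse_encode(X, D, algorithm='lasso_lars', alpha=0.3)
print((A != 0).sum(axis=1))                   # few active bases, varying per input
```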


Page 30

Example of sparse models

f(x) = <w, x>, where w = [0, 0.2, 0, 0.1, 0, 0]

• Because the 2nd and 4th elements of w are non-zero, these are the two selected features in x
• Globally-aligned sparse representation: every data point uses the same non-zero dimensions

x1 [ | | | | | | ]  →  [ 0 | 0 | 0 0 ]
x2 [ | | | | | | ]  →  [ 0 | 0 | 0 0 ]
x3 [ | | | | | | ]  →  [ 0 | 0 | 0 0 ]
…
xm [ | | | | | | ]  →  [ 0 | 0 | 0 0 ]

Page 31

Example of sparse activations (sparse coding)

• Different x’s have different dimensions activated

• Locally-shared sparse representation: similar x’s tend to have similar non-zero dimensions

a1 [ 0 | | 0 0 … 0 ]
a2 [ | | 0 0 0 … 0 ]
a3 [ | 0 | 0 0 … 0 ]
…
am [ 0 0 0 | | … 0 ]

Page 32

Example of sparse activations (sparse coding)

• Another example: preserving manifold structure

• More informative in highlighting richer data structures, e.g. clusters and manifolds

a1 [ | | 0 0 0 … 0 ]
a2 [ 0 | | 0 0 … 0 ]
a3 [ 0 0 | | 0 … 0 ]
…
am [ 0 0 0 | | … 0 ]

Page 33

Outline

1. Sparse coding for image classification

2. Understanding sparse coding
– Connections to RBMs, autoencoders, …
– Sparse activations vs. sparse models, …
– Sparsity vs. locality
– Local sparse coding methods

3. Hierarchical sparse coding

4. Other topics: e.g. structured model, scale-up, discriminative training

5. Summary


Page 34

Sparsity vs. Locality

[Diagram: sparse coding vs. local sparse coding]

• Intuition: similar data should get similar activated features

• Local sparse coding:
  - data in the same neighborhood tend to have shared activated features;
  - data in different neighborhoods tend to have different features activated.

Page 35

Sparse coding is not always local: example

Case 1: independent subspaces
• Each basis is a “direction”
• Sparsity: each datum is a linear combination of only several bases

Case 2: data manifold (or clusters)
• Each basis is an “anchor point”
• Sparsity: each datum is a linear combination of neighboring anchors
• Sparsity is caused by locality

Page 36

Two approaches to local sparse coding

Approach 1: coding via local anchor points

Approach 2: coding via local subspaces

Page 37

Classical sparse coding is empirically local

When it works best for classification, the learned codes often turn out to be local.

It is preferable to let similar data have similar non-zero dimensions in their codes.


Page 38

MNIST Experiment: Classification using SC

• 60K training examples, 10K test examples

• Dictionary size k = 512

• Linear SVM on the sparse codes

Try different values of the sparsity penalty λ (a sweep sketch follows)
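A hedged sketch of the sweep's shape in scikit-learn; the data loader, iteration counts, and solver choices are assumptions, and a full-scale run is slow, so subsample for a quick check.

```python
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.decomposition import DictionaryLearning
from sklearn.svm import LinearSVC

X, y = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
X = X / 255.0
Xtr, ytr = X[:60000], y[:60000]               # standard 60K/10K split
Xte, yte = X[60000:], y[60000:]

for lam in (0.0005, 0.005, 0.05, 0.5):
    dl = DictionaryLearning(n_components=512, alpha=lam,
                            transform_algorithm='lasso_lars',
                            transform_alpha=lam, max_iter=10)
    Atr = dl.fit_transform(Xtr)               # sparse codes of training digits
    Ate = dl.transform(Xte)
    acc = LinearSVC().fit(Atr, ytr).score(Ate, yte)
    print(f'lambda={lam}: test accuracy {acc:.4f}')
```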

Page 39

MNIST Experiment: Lambda = 0.0005

Each basis is like a part or direction.

Page 40

MNIST Experiment: Lambda = 0.005

Again, each basis is like a part or direction.

Page 41

MNIST Experiment: Lambda = 0.05

Now, each basis is more like a digit!

Page 42

MNIST Experiment: Lambda = 0.5

Like VQ now!

Page 43

Geometric view of sparse coding

Error: 4.54%   Error: 3.75%   Error: 2.64%

• When sparse coding achieves its best classification accuracy, the learned bases look like digits – each basis has a clear local class association.

Page 44

Distribution of coefficients (MNIST)

Neighboring bases tend to get non-zero coefficients.

Page 45

Distribution of coefficients (SIFT, Caltech-101)

A similar observation here!

Page 46

Outline

1. Sparse coding for image classification

2. Understanding sparse coding
– Connections to RBMs, autoencoders, …
– Sparse activations vs. sparse models, …
– Sparsity vs. locality
– Local sparse coding methods

3. Hierarchical sparse coding

4. Other topics: e.g. structured model, scale-up, discriminative training

5. Summary


Page 47

Why develop local sparse coding methods?

Since locality is a preferred property in sparse coding, let's ensure it explicitly.

The new algorithms can be theoretically well justified.

The new algorithms have computational advantages over classical sparse coding.


Page 48

Two approaches to local sparse coding

Approach 1: coding via local anchor points – local coordinate coding

Approach 2: coding via local subspaces – super-vector coding

References:
- Nonlinear learning using local coordinate coding, Kai Yu, Tong Zhang, and Yihong Gong. NIPS 2009.
- Learning locality-constrained linear coding for image classification, Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas Huang. CVPR 2010.
- Image classification using super-vector coding of local image descriptors, Xi Zhou, Kai Yu, Tong Zhang, and Thomas Huang. ECCV 2010.
- Large-scale image classification: fast feature extraction and SVM training, Yuanqing Lin, Fengjun Lv, Shenghuo Zhu, Ming Yang, Timothee Cour, Kai Yu, Liangliang Cao, Thomas Huang. CVPR 2011.

Page 49

A function approximation framework to understand coding

• Assumption: image patches x follow a nonlinear manifold, and f(x) is smooth on that manifold.

• Coding: a nonlinear mapping x → a; typically, a is high-dimensional and sparse

• Nonlinear learning: f(x) = <w, a>

Page 50

Local sparse coding

Approach 1: Local coordinate coding

Page 51

Function Interpolation based on LCC

[Figure: data points and bases; the function is approximated as locally linear]

Yu, Zhang & Gong, NIPS 09

Page 52

Local Coordinate Coding (LCC): connect coding to nonlinear function learning

If f(x) is (α, β)-Lipschitz smooth, the function approximation error is bounded by a coding error term plus a locality term.

The key message: a good coding scheme should
1. have a small coding error,
2. and also be sufficiently local.
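The bound itself, a hedged reconstruction from the cited NIPS'09 LCC paper (notation: C is the set of anchor points, a_v the coefficients for x, and γ(x) the coding of x):

```latex
\left| f(x) - \sum_{v \in C} a_v f(v) \right|
\;\le\;
\underbrace{\alpha \left\| x - \gamma(x) \right\|}_{\text{coding error}}
+
\underbrace{\beta \sum_{v \in C} |a_v| \left\| v - \gamma(x) \right\|^{2}}_{\text{locality term}},
\qquad
\gamma(x) = \sum_{v \in C} a_v v .
```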

Page 53

Local Coordinate Coding (LCC)

• Dictionary learning: k-means (or hierarchical k-means)

• Coding for x, to obtain its sparse representation a:

Step 1 – ensure locality: find the K nearest bases

Step 2 – ensure low coding error: minimize the reconstruction error of x over those K bases (a sketch follows)

Yu, Zhang & Gong, NIPS 09; Wang, Yang, Yu, Lv & Huang, CVPR 10
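A hedged sketch of the coding step in the spirit of LLC (Wang et al., CVPR 10), restricting the code to the K nearest bases and solving a small constrained least-squares problem there; K, the regularizer, and the sum-to-one constraint follow that paper, and the names are illustrative.

```python
import numpy as np

def local_code(x, D, K=5, eps=1e-6):
    """x: (d,), D: (k, d) k-means dictionary -> sparse code a of shape (k,)."""
    near = np.argsort(np.linalg.norm(D - x, axis=1))[:K]  # Step 1: locality
    Z = D[near] - x                                       # shifted neighbor bases
    C = Z @ Z.T + eps * np.eye(K)                         # local covariance
    w = np.linalg.solve(C, np.ones(K))                    # analytic LLC-style solution
    w /= w.sum()                                          # enforce sum(a) = 1
    a = np.zeros(D.shape[0])
    a[near] = w                                           # Step 2: low coding error
    return a
```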

Page 54

Local sparse coding

Approach 2: Super-vector coding

Page 55

Function approximation via super-vector coding:

[Figure: data points and cluster centers]

Piecewise locally linear (first-order); local tangents

Zhou, Yu, Zhang, and Huang, ECCV 10

Page 56

Super-vector coding: justification

If f(x) is β-Lipschitz smooth, the function approximation error is bounded in terms of the quantization error of the local tangent approximation (equation image omitted).

Page 57

Super-Vector Coding (SVC)

• Dictionary learning: k-means (or hierarchical k-means)

• Coding for x, to obtain its sparse representation a:

Step 1 – find the nearest basis of x, obtain its VQ coding, e.g. [0, 0, 1, 0, …]

Step 2 – form the super-vector coding, e.g. [0, 0, 1, 0, …, 0, 0, (x − m3), 0, …]

Zhou, Yu, Zhang, and Huang, ECCV 10

Zero-order part + local tangent (first-order) part
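A minimal numpy sketch of the two steps for one descriptor; the scaling constant s on the zero-order part is an illustrative assumption.

```python
import numpy as np

def super_vector(x, M, s=1.0):
    """x: (d,) descriptor, M: (k, d) k-means centers -> (k * (d + 1),) code."""
    j = int(np.argmin(np.linalg.norm(M - x, axis=1)))  # nearest center (VQ)
    k, d = M.shape
    sv = np.zeros(k * (d + 1))
    sv[j * (d + 1)] = s                                # zero-order: s at block j
    sv[j * (d + 1) + 1:(j + 1) * (d + 1)] = x - M[j]   # first-order: (x - m_j)
    return sv
```

Image-level features are then obtained by pooling these super-vectors over the image, as in the earlier pipelines.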

Page 58

Results on ImageNet Challenge Dataset

ImageNet Challenge: 1.4 million images, 1000 classes

- VQ + intersection-kernel SVM: 40%
- LCC + linear SVM: 62%
- SVC + linear SVM: 65%

Page 59

Summary: local sparse coding

Approach 1: local coordinate coding

Approach 2: super-vector coding

- Sparsity achieved by explicitly ensuring locality
- Sound theoretical justifications
- Much simpler to implement and compute
- Strong empirical success

Page 60

Outline

1. Sparse coding for image classification

2. Understanding sparse coding

3. Hierarchical sparse coding

4. Other topics: e.g. structured model, scale-up, discriminative training

5. Summary


Page 61

Hierarchical sparse coding

Sparse Coding → Pooling → Sparse Coding → Pooling

Learning from unlabeled data

Yu, Lin & Lafferty, CVPR 11
Zeiler, Taylor & Fergus, ICCV 11
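The stacked shape, as a hedged scikit-learn sketch; layer sizes, penalties, and the pooling groups are illustrative, not the papers' settings.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

def coding_pooling_layer(patches, k, lam, pool_groups):
    """patches: (n, d); pool_groups: list of index arrays (spatial neighborhoods).

    Sparse-codes the inputs, then max-pools the codes within each group.
    """
    dl = DictionaryLearning(n_components=k, alpha=lam,
                            transform_algorithm='lasso_lars',
                            transform_alpha=lam)
    codes = dl.fit_transform(patches)                  # (n, k) sparse codes
    return np.stack([np.abs(codes[g]).max(axis=0) for g in pool_groups])

# Layer 2 repeats the same coding+pooling operation on the pooled layer-1 outputs.
```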

Page 62

A two-layer sparse coding formulation


Yu, Lin, & Lafferty, CVPR 11

Page 63

MNIST results – classification

HSC vs. CNN: HSC delivers even better performance than CNN; more remarkably, HSC learns its features in an unsupervised manner!

Yu, Lin, & Lafferty, CVPR 11

Page 64

MNIST results – learned dictionary

A hidden unit in the second layer is connected to a group of units in the first layer, giving invariance to translation, rotation, and deformation.

Yu, Lin, & Lafferty, CVPR 11

Page 65

Caltech-101 results – classification

Learned descriptor: performs slightly better than SIFT + SC

Yu, Lin, & Lafferty, CVPR 11

Page 66

Adaptive Deconvolutional Networks for Mid and High Level Feature Learning

• Hierarchical convolutional sparse coding

• Trained to reconstruct the image from all layers (L1–L4)

• Pooling both spatially and among features

• Learns invariant mid-level features

Matthew D. Zeiler, Graham W. Taylor, and Rob Fergus, ICCV 2011

[Architecture diagram: Image → L1 feature maps → L2 feature maps → L3 feature maps → L4 feature maps, with feature-group selection between layers]

Page 67

Outline

1. Sparse coding for image classification

2. Understanding sparse coding

3. Hierarchical sparse coding

4. Other topics: e.g. structured model, scale-up, discriminative training

5. Summary


Page 68

Other topics of sparse coding

Structured sparse coding, for example:
– Group sparse coding [Bengio et al., NIPS 09]
– Learning hierarchical dictionaries [Jenatton, Mairal et al., 2010]

Scaling up sparse coding, for example:
– Feature-sign algorithm [Lee et al., NIPS 07]
– Feed-forward approximation [Gregor & LeCun, ICML 10]
– Online dictionary learning [Mairal et al., ICML 09]

Discriminative training, for example:
– Backprop algorithms [Bradley & Bagnell, NIPS 08; Yang et al., CVPR 10]
– Supervised dictionary training [Mairal et al., NIPS 08]


Page 69

Summary of Sparse Coding

Sparse coding is an effective approach to (unsupervised) feature learning

A building block for deep models

Sparse coding and its local variants (LCC, SVC) have pushed the boundary of accuracies on Caltech101, PASCAL VOC, ImageNet, …

Challenge: discriminative training is not straightforward