Classification
Derek Hoiem, CS 598, Spring 2009
Jan 27, 2009
Transcript
Page 1

Classification

Derek Hoiem, CS 598, Spring 2009

Jan 27, 2009

Page 2

Outline

• Principles of generalization

• Survey of classifiers

• Project discussion

• Discussion of Rosch

Page 3

Pipeline for Prediction

Imagery → Representation → Classifier → Predictions

Page 4

No Free Lunch Theorem

• No single classifier is best for every possible problem; good generalization requires assumptions that match the data

Page 5

Bias and Variance

[Figure: error vs. model complexity; low complexity corresponds to high bias and low variance, high complexity to low bias and high variance]
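
For reference (not spelled out on the slide), the standard bias-variance decomposition of expected squared error, assuming $y = f(x) + \varepsilon$ with noise variance $\sigma^2$ and expectations taken over training sets:

$$\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2} + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}} + \sigma^2$$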

Page 6

Overfitting

• Need validation set (see the split sketch below)
• Validation set not same as test set
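
A minimal sketch of the three-way split the bullets imply (not from the slides; the 60/20/20 proportions and function name are illustrative assumptions):

    import numpy as np

    def train_val_test_split(X, y, seed=0):
        # Shuffle once, then carve off 60% train, 20% validation, 20% test.
        # The validation set is for tuning hyperparameters; the test set is
        # touched only once, for the final error estimate.
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(y))
        n_train = int(0.6 * len(y))
        n_val = int(0.2 * len(y))
        train = idx[:n_train]
        val = idx[n_train:n_train + n_val]
        test = idx[n_train + n_val:]
        return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])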

Page 7

Bias-Variance View of Features

• More compact = lower variance, potentially higher bias
• More features = higher variance, lower bias
• More independence among features = simpler classifier, lower variance

Page 8

How to reduce variance

• Parameterize model
  – E.g., linear vs. piecewise

Page 9

How to measure complexity?

• VC dimension

Upper bound on generalization error, holding with probability $1 - \eta$:

$$\text{error}_{\text{test}} \le \text{error}_{\text{train}} + \sqrt{\frac{h\left(\log(2N/h) + 1\right) - \log(\eta/4)}{N}}$$

N: size of training set; h: VC dimension; $\eta$: 1 − probability that the bound holds

Page 10

How to reduce variance

• Parameterize model
• Regularize (see the penalized objective below)
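
As a concrete illustration (not on the slide), L2 regularization adds a penalty on weight magnitude to the training objective, with $\lambda$ controlling the bias-variance trade-off:

$$\min_{\mathbf{w}} \; \sum_{i=1}^{N} L\big(y_i, f(\mathbf{x}_i; \mathbf{w})\big) + \lambda \lVert \mathbf{w} \rVert^2$$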

Page 11

How to reduce variance

• Parameterize model
• Regularize
• Increase number of training examples

Page 12

Effect of Training Size

[Figure: error vs. number of training examples]

Page 13

Risk Minimization

• Margins

[Figure: two classes of points (x and o) in the x1–x2 plane, separated with a margin]

Page 14

Classifiers

• Generative methods
  – Naïve Bayes
  – Bayesian Networks
• Discriminative methods
  – Logistic Regression
  – Linear SVM
  – Kernelized SVM
• Ensemble methods
  – Randomized Forests
  – Boosted Decision Trees
• Instance based
  – K-nearest neighbor
• Unsupervised
  – K-means

Page 15

Components of classification methods

• Objective function
• Parameterization
• Regularization
• Training
• Inference

Page 16

Classifiers: Naïve Bayes

• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: graphical model with class y as parent of features x1, x2, x3]
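
For reference (not spelled out in the transcript), Naïve Bayes assumes the features are conditionally independent given the class, so inference picks the class maximizing

$$P(y \mid x_1, \ldots, x_n) \propto P(y) \prod_{j=1}^{n} P(x_j \mid y)$$

and training amounts to estimating $P(y)$ and each $P(x_j \mid y)$ from counts or fitted densities.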

Page 17

Classifiers: Logistic Regression

• Objective
• Parameterization
• Regularization
• Training
• Inference
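
As a reminder (not spelled out in the transcript), logistic regression parameterizes the class posterior directly as a linear function passed through a sigmoid, and is trained by maximizing the (optionally regularized) log-likelihood:

$$P(y = 1 \mid \mathbf{x}) = \frac{1}{1 + \exp\left(-\mathbf{w}^{\top} \mathbf{x}\right)}$$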

Page 18

Classifiers: Linear SVM

• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: two-class scatter (x and o) in the x1–x2 plane]

Page 19

Classifiers: Linear SVM

• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: the same two-class scatter (x and o) in the x1–x2 plane]

Page 20

Classifiers: Linear SVM

• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: two-class scatter with an o point lying among the x's, so a linear separator needs slack]
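
A standard way to write the soft-margin objective the slide alludes to: slack variables $\xi_i$ absorb margin violations, and $C$ trades margin size against training error:

$$\min_{\mathbf{w}, b, \boldsymbol{\xi}} \; \frac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1 - \xi_i, \;\; \xi_i \ge 0$$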

Page 21

Classifiers: Kernelized SVM

• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: 1-D data on x1 that no threshold separates becomes linearly separable after mapping each point to (x1, x1²)]
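
The kernel trick the figure builds toward: the dual SVM only needs inner products between examples, so replacing them with a kernel $K$ computes the inner product in the mapped space without forming $\varphi$ explicitly. For the mapping above, $\varphi(x) = (x, x^2)$:

$$K(x, x') = \varphi(x)^{\top}\varphi(x') = x\,x' + x^2 x'^2$$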

Page 22

Classifiers: Decision Trees

• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: two-class scatter (x and o) in the x1–x2 plane]
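
Training is typically greedy (not detailed in the transcript): each node chooses the split that maximizes information gain, i.e. the drop in label entropy $H$ when the $n$ examples at the node are divided into children of sizes $n_k$:

$$\text{IG} = H(y) - \sum_{k} \frac{n_k}{n}\, H(y_k)$$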

Page 23

Ensemble Methods: Boosting

figure from Friedman et al. 2000
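
A minimal sketch of discrete AdaBoost with decision stumps, in the additive-logistic-regression view of Friedman et al. 2000; the function names and the brute-force stump learner are illustrative, not the lecture's code:

    import numpy as np

    def fit_stump(X, y, w):
        # Weak learner: exhaustive search over (feature, threshold, sign)
        # for the decision stump with lowest weighted error.
        best = None
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = np.where(s * (X[:, j] - t) > 0, 1, -1)
                    err = np.sum(w * (pred != y))
                    if best is None or err < best[0]:
                        best = (err, j, t, s)
        return best

    def adaboost(X, y, n_rounds=50):
        # y must be in {-1, +1}; w is a distribution over examples.
        n = len(y)
        w = np.full(n, 1.0 / n)
        ensemble = []
        for _ in range(n_rounds):
            err, j, t, s = fit_stump(X, y, w)
            err = max(err, 1e-10)                  # avoid log(0)
            alpha = 0.5 * np.log((1 - err) / err)  # weight of this stump
            pred = np.where(s * (X[:, j] - t) > 0, 1, -1)
            w *= np.exp(-alpha * y * pred)         # upweight mistakes
            w /= w.sum()
            ensemble.append((alpha, j, t, s))
        return ensemble

    def adaboost_predict(ensemble, X):
        # Sign of the weighted vote of all stumps.
        score = sum(a * np.where(s * (X[:, j] - t) > 0, 1, -1)
                    for a, j, t, s in ensemble)
        return np.sign(score)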

Page 24

Boosted Decision Trees

[Figure: two example boosted decision trees over image cues (Gray?, High in Image?, Many Long Lines?, Very High Vanishing Point?, Smooth?, Green?, Blue?), each branching Yes/No toward the labels Ground, Vertical, and Sky; the ensemble outputs P(label | good segment, data)] [Collins et al. 2002]

Page 25

Boosted Decision Trees

• How to control bias/variance trade-off
  – Size of trees
  – Number of trees

Page 26

K-nearest neighbor

[Figure: two-class scatter (x and o) in the x1–x2 plane]

• Objective
• Parameterization
• Regularization
• Training
• Inference (see the sketch below)
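
A minimal K-NN sketch (illustrative, not the lecture's code): training is just storing the data, and inference is a majority vote over the k nearest stored examples:

    import numpy as np

    def knn_predict(X_train, y_train, x_query, k=3):
        # Distance from the query to every stored training point
        dists = np.linalg.norm(X_train - x_query, axis=1)
        nearest = np.argsort(dists)[:k]
        # Majority vote among the k nearest labels
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        return labels[np.argmax(counts)]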

Page 27

Clustering

[Figure: points in the x1–x2 plane, first shown with class labels (x and o) and then unlabeled (+), to be grouped by proximity]
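
To match the "K-means" entry in the classifier survey, a minimal K-means (Lloyd's algorithm) sketch; the function name, random initialization, and convergence test are illustrative assumptions:

    import numpy as np

    def kmeans(X, k, n_iters=100, seed=0):
        # Initialize centers by sampling k distinct data points
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iters):
            # Assignment step: each point joins its nearest center
            labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
            # Update step: each center moves to the mean of its points
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break  # assignments stable: converged
            centers = new_centers
        return centers, labels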

Page 28

References

• General
  – Tom Mitchell, Machine Learning, McGraw Hill, 1997
  – Christopher Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995
• Adaboost
  – Friedman, Hastie, and Tibshirani, "Additive logistic regression: a statistical view of boosting", Annals of Statistics, 2000
• SVMs
  – http://www.support-vector.net/icml-tutorial.pdf

Page 29

Project ideas?

Page 30

Discussion of Rosch