PATTERN RECOGNITION AND MACHINE LEARNING
CHAPTER 1: INTRODUCTION

5/17/2012

feihu.eng.ua.edu/NSF_TUES/w8_1and2.pdf

Jul 09, 2018


doanphuc
Transcript
Example

Handwritten Digit Recognition

Polynomial Curve Fitting

Sum‐of‐Squares Error Function
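As a rough sketch of what this slide refers to: for a polynomial $y(x, \mathbf{w}) = \sum_{j=0}^{M} w_j x^j$, the sum-of-squares error is $E(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N}\{y(x_n, \mathbf{w}) - t_n\}^2$. The NumPy code below fits polynomials of several orders by least squares and evaluates $E(\mathbf{w})$ on the training points. The synthetic data (noisy samples of $\sin(2\pi x)$ with $N = 10$ and noise level 0.3), the seed, and all function names are assumptions made for illustration; they do not come from the transcript.

```python
import numpy as np

def design_matrix(x, M):
    # Columns are x**0, x**1, ..., x**M.
    return np.vander(x, M + 1, increasing=True)

def fit_polynomial(x, t, M):
    # Least-squares solution minimizes the sum-of-squares error E(w).
    w, *_ = np.linalg.lstsq(design_matrix(x, M), t, rcond=None)
    return w

def sum_of_squares_error(w, x, t):
    y = design_matrix(x, len(w) - 1) @ w
    return 0.5 * np.sum((y - t) ** 2)

# Assumed toy data: noisy samples of sin(2*pi*x).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.3, size=x_train.shape)

errors = {}
for M in (0, 1, 3, 9):
    w = fit_polynomial(x_train, t_train, M)
    errors[M] = sum_of_squares_error(w, x_train, t_train)
    print(f"M = {M}: E(w) = {errors[M]:.6f}")
```

With 10 training points, the order-9 polynomial has enough freedom to pass through every point, so its training error is essentially zero even though the curve itself can oscillate wildly.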

0th Order Polynomial

1st Order Polynomial

3rd Order Polynomial

9th Order Polynomial

Over‐fitting

Root-Mean-Square (RMS) Error: $E_{\mathrm{RMS}} = \sqrt{2E(\mathbf{w}^\star)/N}$
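One way to see over-fitting numerically is to compare the RMS error, $E_{\mathrm{RMS}} = \sqrt{2E(\mathbf{w}^\star)/N}$, on the training set with the same quantity on held-out data. The sketch below does this for polynomial orders 0 to 9. The data generator (noisy $\sin(2\pi x)$), the set sizes (10 training and 100 test points), and the seed are assumptions for illustration only.

```python
import numpy as np

def fit_polynomial(x, t, M):
    Phi = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

def rms_error(w, x, t):
    # E_RMS = sqrt(2 E(w) / N), with E(w) the sum-of-squares error.
    y = np.vander(x, len(w), increasing=True) @ w
    E = 0.5 * np.sum((y - t) ** 2)
    return np.sqrt(2.0 * E / len(x))

rng = np.random.default_rng(1)

def make_data(n):
    # Assumed toy data: noisy samples of sin(2*pi*x).
    x = np.linspace(0.0, 1.0, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)

x_train, t_train = make_data(10)
x_test, t_test = make_data(100)

train_rms, test_rms = [], []
for M in range(10):
    w = fit_polynomial(x_train, t_train, M)
    train_rms.append(rms_error(w, x_train, t_train))
    test_rms.append(rms_error(w, x_test, t_test))
    print(f"M = {M}: train {train_rms[-1]:.4f}, test {test_rms[-1]:.4f}")
```

The training error can only decrease as the order grows, so it is the gap between the two columns, not the training error alone, that signals over-fitting.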

Polynomial Coefficients   

Data Set Size: $N = 15$

9th Order Polynomial

Data Set Size: $N = 100$

9th Order Polynomial

Regularization

Penalize large coefficient values
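A common way to penalize large coefficients is to add a quadratic penalty to the error, $\tilde{E}(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N}\{y(x_n, \mathbf{w}) - t_n\}^2 + \frac{\lambda}{2}\|\mathbf{w}\|^2$, whose minimizer satisfies $(\lambda \mathbf{I} + \boldsymbol{\Phi}^{\mathsf T}\boldsymbol{\Phi})\,\mathbf{w} = \boldsymbol{\Phi}^{\mathsf T}\mathbf{t}$. The sketch below assumes this quadratic form, penalizes every coefficient including $w_0$, and uses assumed toy data and two illustrative values of $\lambda$; none of these specifics are stated in the transcript.

```python
import numpy as np

def fit_regularized(x, t, M, lam):
    # Solve (lam * I + Phi^T Phi) w = Phi^T t for the penalized fit.
    Phi = np.vander(x, M + 1, increasing=True)
    A = lam * np.eye(M + 1) + Phi.T @ Phi
    return np.linalg.solve(A, Phi.T @ t)

# Assumed toy data: noisy samples of sin(2*pi*x).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.shape)

w_weak = fit_regularized(x, t, 9, np.exp(-18.0))  # very light penalty
w_strong = fit_regularized(x, t, 9, 1.0)          # heavy penalty

print("||w|| with light penalty:", np.linalg.norm(w_weak))
print("||w|| with heavy penalty:", np.linalg.norm(w_strong))
```

Increasing $\lambda$ shrinks the coefficient vector, trading a worse fit to the training points for a smoother curve.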

Regularization: $\ln \lambda = -18$

Regularization: $\ln \lambda = 0$

Regularization: $E_{\mathrm{RMS}}$ vs. $\ln \lambda$

Polynomial Coefficients   

Probability Theory

Apples and Oranges

Probability Theory

Marginal Probability

Joint Probability

Conditional Probability

Probability Theory

Sum Rule

Product Rule

The Rules of Probability

Sum Rule

Product Rule

Bayes’ Theorem

posterior ∝ likelihood × prior
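The sum rule, $p(X) = \sum_Y p(X, Y)$, the product rule, $p(X, Y) = p(Y \mid X)\,p(X)$, and Bayes' theorem, $p(Y \mid X) = p(X \mid Y)\,p(Y) / p(X)$, can be checked on a small fruit-and-boxes example. The numbers below (a red box with 2 apples and 6 oranges picked with probability 4/10, a blue box with 3 apples and 1 orange picked with probability 6/10) are assumed here, following the well-known textbook version of the example; the transcript itself does not list them.

```python
from fractions import Fraction

# Assumed numbers: prior over boxes and fruit counts per box.
prior = {"red": Fraction(4, 10), "blue": Fraction(6, 10)}
contents = {
    "red": {"apple": 2, "orange": 6},
    "blue": {"apple": 3, "orange": 1},
}

def likelihood(fruit, box):
    # p(F = fruit | B = box)
    counts = contents[box]
    return Fraction(counts[fruit], sum(counts.values()))

def marginal(fruit):
    # Sum rule over boxes, with each joint term given by the product rule.
    return sum(likelihood(fruit, box) * prior[box] for box in prior)

def posterior(box, fruit):
    # Bayes' theorem: p(B | F) = p(F | B) p(B) / p(F)
    return likelihood(fruit, box) * prior[box] / marginal(fruit)

p_orange = marginal("orange")
p_red_given_orange = posterior("red", "orange")
print(p_orange, p_red_given_orange)  # 9/20 2/3
```

Observing an orange raises the probability of the red box from the prior 4/10 to a posterior of 2/3, because oranges are much more common in the red box.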

Probability Densities

Transformed Densities

Note on slide 24 (Markus Svensén, 11/14/2007): This figure was taken from Solution 1.4 in the web-edition of the solutions manual for PRML, available at http://research.microsoft.com/~cmbishop/PRML. A more thorough explanation of what the figure shows is provided in the text of the solution.

Expectations

Conditional Expectation (discrete)

Approximate Expectation (discrete and continuous)
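The approximate expectation replaces the sum or integral with an average over samples drawn from the distribution, $\mathbb{E}[f] \simeq \frac{1}{N}\sum_{n=1}^{N} f(x_n)$. A minimal sketch, assuming a standard normal $p(x)$ and $f(x) = x^2$ (both chosen only for illustration, since then the exact answer is 1):

```python
import random

def approximate_expectation(f, sampler, n):
    # E[f] is approximated by the sample mean of f over n draws.
    return sum(f(sampler()) for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = approximate_expectation(
    lambda x: x * x,             # f(x) = x^2
    lambda: rng.gauss(0.0, 1.0), # draws from an assumed N(0, 1)
    100_000,
)
print(estimate)  # close to the exact value 1
```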

Variances and Covariances

The Gaussian Distribution

Gaussian Mean and Variance

The Multivariate Gaussian

Gaussian Parameter Estimation

Likelihood function

Maximum (Log) Likelihood

Properties of $\mu_{\mathrm{ML}}$ and $\sigma^2_{\mathrm{ML}}$
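For a univariate Gaussian, maximizing the log likelihood gives the sample mean, $\mu_{\mathrm{ML}} = \frac{1}{N}\sum_n x_n$, and the sample variance with a $1/N$ factor, $\sigma^2_{\mathrm{ML}} = \frac{1}{N}\sum_n (x_n - \mu_{\mathrm{ML}})^2$. The mean estimate is unbiased, while $\mathbb{E}[\sigma^2_{\mathrm{ML}}] = \frac{N-1}{N}\sigma^2$, so rescaling by $N/(N-1)$ removes the bias. The function names and the four data points below are made up for illustration.

```python
def gaussian_ml(samples):
    # Maximum likelihood estimates of the mean and variance.
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var

def unbiased_variance(samples):
    # Rescale the ML variance by N / (N - 1) to correct its bias.
    n = len(samples)
    _, var_ml = gaussian_ml(samples)
    return n / (n - 1) * var_ml

data = [1.0, 2.0, 3.0, 4.0]
mu_ml, sigma2_ml = gaussian_ml(data)
sigma2_unbiased = unbiased_variance(data)
print(mu_ml, sigma2_ml, sigma2_unbiased)
```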

Curve Fitting Re‐visited

Maximum Likelihood

Determine $\mathbf{w}_{\mathrm{ML}}$ by minimizing sum-of-squares error, $E(\mathbf{w})$.
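The link between maximum likelihood and the sum-of-squares error can be written out as follows. This is a sketch that assumes Gaussian noise with precision $\beta$ around the polynomial, $p(t \mid x, \mathbf{w}, \beta) = \mathcal{N}(t \mid y(x, \mathbf{w}), \beta^{-1})$; the equations on the slide did not survive in the transcript, so the notation here is an assumption.

```latex
\begin{align}
\ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)
  &= -\frac{\beta}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2
     + \frac{N}{2} \ln \beta - \frac{N}{2} \ln (2\pi) \\
  &= -\beta\, E(\mathbf{w}) + \frac{N}{2} \ln \beta - \frac{N}{2} \ln (2\pi), \\
\frac{1}{\beta_{\mathrm{ML}}}
  &= \frac{1}{N} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}_{\mathrm{ML}}) - t_n \}^2 .
\end{align}
```

Only the first term depends on $\mathbf{w}$, so maximizing the likelihood over $\mathbf{w}$ is the same as minimizing $E(\mathbf{w})$.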

Predictive Distribution

MAP: A Step towards Bayes

Determine $\mathbf{w}_{\mathrm{MAP}}$ by minimizing regularized sum-of-squares error, $\tilde{E}(\mathbf{w})$.

Bayesian Curve Fitting

Bayesian Predictive Distribution

Model Selection

Cross‐Validation
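In S-fold cross-validation the data are partitioned into S groups; each group is held out once for validation while the remaining S − 1 groups are used for training, and the scores are averaged over the S runs. A minimal index-splitting sketch, where the function name and the interleaved (unshuffled) assignment of points to folds are illustrative choices rather than anything prescribed by the slides:

```python
def s_fold_splits(n_samples, n_folds):
    # Assign indices to folds in an interleaved fashion: 0, S, 2S, ... go to fold 0, etc.
    indices = list(range(n_samples))
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        validation = folds[held_out]
        training = [i for k, fold in enumerate(folds) if k != held_out for i in fold]
        yield training, validation

splits = list(s_fold_splits(10, 4))
for training, validation in splits:
    print("train:", training, "validate:", validation)
```

In practice the indices would usually be shuffled first so that any ordering in the data does not end up concentrated in a single fold.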

Curse of Dimensionality

Curse of Dimensionality

Polynomial curve fitting, M = 3

Gaussian Densities in higher dimensions

Decision Theory

Inference step

Determine either $p(t \mid \mathbf{x})$ or $p(\mathbf{x}, t)$.

Decision step

For given x, determine optimal t.

Minimum Misclassification Rate

Minimum Expected Loss

Example: classify medical images as ‘cancer’ or ‘normal’

Decision

Truth
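With a loss matrix $L_{kj}$ (true class $k$, decision $j$), the minimum-expected-loss rule picks the decision $j$ that minimizes $\sum_k L_{kj}\, p(\mathcal{C}_k \mid \mathbf{x})$. The sketch below assumes a loss of 1000 for calling a cancer case normal and 1 for calling a normal case cancer; these values and the posterior probabilities are illustrative assumptions, since the table on the slide is not present in the transcript.

```python
CLASSES = ("cancer", "normal")

# Assumed loss matrix, keyed by (truth, decision).
LOSS = {
    ("cancer", "cancer"): 0,
    ("cancer", "normal"): 1000,
    ("normal", "cancer"): 1,
    ("normal", "normal"): 0,
}

def expected_losses(posterior):
    # Expected loss of each decision: sum over true classes of L[truth, decision] * p(truth | x).
    return {
        decision: sum(LOSS[(truth, decision)] * posterior[truth] for truth in CLASSES)
        for decision in CLASSES
    }

def decide(posterior):
    losses = expected_losses(posterior)
    return min(losses, key=losses.get)

print(decide({"cancer": 0.01, "normal": 0.99}))      # cancer
print(decide({"cancer": 0.0005, "normal": 0.9995}))  # normal
```

Even a 1% posterior probability of cancer leads to the 'cancer' decision here, because a missed cancer is assumed to cost far more than a false alarm.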

Minimum Expected Loss

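Regions $\mathcal{R}_j$ are chosen to minimize $\mathbb{E}[L] = \sum_k \sum_j \int_{\mathcal{R}_j} L_{kj}\, p(\mathbf{x}, \mathcal{C}_k)\, \mathrm{d}\mathbf{x}$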

Reject Option

Why Separate Inference and Decision?

• Minimizing risk (loss matrix may change over time)

• Reject option

• Unbalanced class priors

• Combining models

Decision Theory for Regression

Inference step

Determine $p(\mathbf{x}, t)$.

Decision step

For given x, make optimal prediction, y(x), for t.

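Loss function: $\mathbb{E}[L] = \iint L(t, y(\mathbf{x}))\, p(\mathbf{x}, t)\, \mathrm{d}\mathbf{x}\, \mathrm{d}t$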

The Squared Loss Function
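For the squared loss $L(t, y(\mathbf{x})) = \{y(\mathbf{x}) - t\}^2$, the standard result is that the optimal prediction is the conditional mean of $t$. The equations on this slide were lost in the transcript; the block below is a reconstruction of that standard derivation, not a copy of the slide.

```latex
\begin{align}
\mathbb{E}[L] &= \iint \{ y(\mathbf{x}) - t \}^2\, p(\mathbf{x}, t)\, \mathrm{d}\mathbf{x}\, \mathrm{d}t, \\
y(\mathbf{x}) &= \int t\, p(t \mid \mathbf{x})\, \mathrm{d}t = \mathbb{E}_t[t \mid \mathbf{x}], \\
\mathbb{E}[L] &= \int \{ y(\mathbf{x}) - \mathbb{E}[t \mid \mathbf{x}] \}^2\, p(\mathbf{x})\, \mathrm{d}\mathbf{x}
               + \int \operatorname{var}[t \mid \mathbf{x}]\, p(\mathbf{x})\, \mathrm{d}\mathbf{x}.
\end{align}
```

The first term in the last line vanishes for the optimal $y(\mathbf{x})$; the second is the intrinsic noise in the targets and cannot be reduced by any choice of $y(\mathbf{x})$.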

Generative vs Discriminative

Generative approach: 

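Model $p(t, \mathbf{x}) = p(\mathbf{x} \mid t)\, p(t)$

Use Bayes’ theorem: $p(t \mid \mathbf{x}) = p(\mathbf{x} \mid t)\, p(t) / p(\mathbf{x})$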

Discriminative approach: 

Model $p(t \mid \mathbf{x})$ directly

Entropy

Important quantity in

• coding theory

• statistical physics

• machine learning

Entropy

Coding theory: x discrete with 8 possible states; how many bits to transmit the state of x?

All states equally likely
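With 8 equally likely states the entropy is $\mathrm{H}[x] = -\sum_i p_i \log_2 p_i = -8 \times \frac{1}{8}\log_2\frac{1}{8} = 3$ bits. A small sketch that checks this, together with a non-uniform distribution over the same 8 states (the second distribution is chosen here for illustration and needs only 2 bits on average):

```python
import math

def entropy_bits(probabilities):
    # H = -sum_i p_i log2 p_i; terms with p_i = 0 contribute nothing.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

uniform = [1 / 8] * 8
skewed = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 64, 1 / 64, 1 / 64, 1 / 64]

print(entropy_bits(uniform))  # 3.0
print(entropy_bits(skewed))   # 2.0
```

The uniform case gives the largest entropy for 8 states; any departure from uniformity lets a well-designed code use fewer bits on average.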

Entropy

Entropy

In how many ways can N identical objects be allocated to M bins?

Entropy maximized when $p_i = 1/M$ for all $i$

Entropy

Differential Entropy

Put bins of width ∆ along the real line

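Differential entropy maximized (for fixed $\sigma^2$) when $p(x) = \mathcal{N}(x \mid \mu, \sigma^2)$,

in which case $\mathrm{H}[x] = \frac{1}{2}\{1 + \ln(2\pi\sigma^2)\}$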

Conditional Entropy

The Kullback‐Leibler Divergence
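For discrete distributions the Kullback-Leibler divergence is $\mathrm{KL}(p \,\|\, q) = \sum_i p_i \ln (p_i / q_i)$. It is non-negative, zero only when $p = q$, and not symmetric in its arguments. A minimal sketch with two made-up distributions; it assumes $q_i > 0$ wherever $p_i > 0$:

```python
import math

def kl_divergence(p, q):
    # KL(p || q) = sum_i p_i ln(p_i / q_i), in nats; terms with p_i = 0 are skipped.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.4, 0.1]
q = [1 / 3, 1 / 3, 1 / 3]

kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
print(kl_pq, kl_qp)
```

The two printed values differ, which is why the divergence is not a true distance between distributions.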

Mutual Information
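Mutual information is the KL divergence between the joint distribution and the product of its marginals, $\mathrm{I}[x, y] = \sum_{x, y} p(x, y) \ln \frac{p(x, y)}{p(x)\,p(y)}$, so it is zero exactly when $x$ and $y$ are independent. A minimal sketch for a discrete joint distribution given as a table; the two example tables are made up for illustration:

```python
import math

def mutual_information(joint):
    # joint[i][j] = p(x = i, y = j); marginals come from the sum rule.
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        pxy * math.log(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

independent = [[0.25, 0.25], [0.25, 0.25]]  # p(x, y) = p(x) p(y)
dependent = [[0.5, 0.0], [0.0, 0.5]]        # y is fully determined by x

print(mutual_information(independent))  # 0.0
print(mutual_information(dependent))    # ln 2, about 0.693
```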