PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 1: INTRODUCTION
PATTERN RECOGNITION AND MACHINE LEARNING · 2018-01-04 · AND MACHINE LEARNING CHAPTER 1: INTRODUCTION. Example Handwritten Digit Recognition. Polynomial Curve Fitting. Sum-of-Squares

Jun 12, 2018

hathuy

PATTERN RECOGNITION AND MACHINE LEARNING
CHAPTER 1: INTRODUCTION

Example

Handwritten Digit Recognition

Polynomial Curve Fitting

Sum-of-Squares Error Function
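
A minimal sketch of this error function, assuming the standard form $E(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N}\{y(x_n, \mathbf{w}) - t_n\}^2$ with a polynomial $y(x, \mathbf{w}) = \sum_j w_j x^j$. The function names and the three toy data points are illustrative and do not come from the slides.

```python
def poly(x, w):
    """Evaluate y(x, w) = w[0] + w[1]*x + ... + w[M]*x**M."""
    return sum(w_j * x**j for j, w_j in enumerate(w))


def sum_of_squares_error(w, xs, ts):
    """E(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2."""
    return 0.5 * sum((poly(x, w) - t) ** 2 for x, t in zip(xs, ts))


# Hypothetical toy data: three points lying on t = 4x - 4x^2.
xs = [0.0, 0.5, 1.0]
ts = [0.0, 1.0, 0.0]

print(sum_of_squares_error([0.0], xs, ts))             # constant fit y = 0
print(sum_of_squares_error([0.0, 4.0, -4.0], xs, ts))  # quadratic through all points
```

The error is zero only when the curve passes through every target, which is why higher-order polynomials can always drive it down on the training set.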

0th Order Polynomial

1st Order Polynomial

3rd Order Polynomial

9th Order Polynomial

Over-fitting

Root-Mean-Square (RMS) Error: $E_{\mathrm{RMS}} = \sqrt{2E(\mathbf{w}^\star)/N}$
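
A short sketch of how the RMS error could be computed, assuming the usual definition in which the sum-of-squares error is scaled by the data set size N before taking the square root. This puts training and test sets of different sizes on an equal footing. The helper names and data are hypothetical.

```python
import math


def poly(x, w):
    """Evaluate the polynomial with coefficients w at x."""
    return sum(w_j * x**j for j, w_j in enumerate(w))


def rms_error(w, xs, ts):
    """E_RMS = sqrt(2 * E(w) / N), with E(w) the sum-of-squares error."""
    e = 0.5 * sum((poly(x, w) - t) ** 2 for x, t in zip(xs, ts))
    return math.sqrt(2.0 * e / len(xs))


# Hypothetical toy data.
xs = [0.0, 0.5, 1.0]
ts = [0.0, 1.0, 0.0]
print(rms_error([0.0], xs, ts))
```

Over-fitting shows up as a training RMS error near zero alongside a much larger RMS error on held-out data.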

Polynomial Coefficients

Data Set Size: $N = 15$

9th Order Polynomial

Data Set Size: $N = 100$

9th Order Polynomial

Regularization

Penalize large coefficient values
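
One way to sketch the penalty, assuming the common quadratic regularizer $\widetilde{E}(\mathbf{w}) = \frac{1}{2}\sum_n \{y(x_n, \mathbf{w}) - t_n\}^2 + \frac{\lambda}{2}\lVert\mathbf{w}\rVert^2$. Here the penalty covers every coefficient including $w_0$, which is a simplifying assumption. The names and data are illustrative.

```python
def poly(x, w):
    """Evaluate the polynomial with coefficients w at x."""
    return sum(w_j * x**j for j, w_j in enumerate(w))


def regularized_error(w, xs, ts, lam):
    """Sum-of-squares data term plus (lam / 2) * ||w||^2 penalty."""
    data_term = 0.5 * sum((poly(x, w) - t) ** 2 for x, t in zip(xs, ts))
    penalty = 0.5 * lam * sum(w_j**2 for w_j in w)
    return data_term + penalty


# Hypothetical toy data; the quadratic fits exactly but has large coefficients.
xs = [0.0, 0.5, 1.0]
ts = [0.0, 1.0, 0.0]
print(regularized_error([0.0, 4.0, -4.0], xs, ts, lam=0.5))
print(regularized_error([0.0], xs, ts, lam=0.5))
```

With the penalty switched on, an exact fit with large coefficients can score worse than a cruder fit with small ones, which is the trade-off that $\lambda$ controls.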

Regularization: $\ln \lambda = -18$

Regularization: $\ln \lambda = 0$

Regularization: $E_{\mathrm{RMS}}$ vs. $\ln \lambda$

Polynomial Coefficients

Probability Theory

Apples and Oranges

Probability Theory

Marginal Probability

Conditional Probability

Joint Probability

Probability Theory

Sum Rule

Product Rule

The Rules of Probability

Sum Rule: $p(X) = \sum_{Y} p(X, Y)$

Product Rule: $p(X, Y) = p(Y \mid X)\, p(X)$

Bayes’ Theorem

posterior ∝ likelihood × prior
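
To make the relation concrete, here is a small sketch that applies the sum and product rules and then Bayes' theorem to a two-box fruit setup in the spirit of the apples-and-oranges slide. The specific probabilities are assumed for illustration and are not recovered from the slides. Exact fractions are used so the arithmetic can be checked by hand.

```python
from fractions import Fraction

# Assumed illustrative numbers: prior over boxes and fruit proportions per box.
prior = {"red": Fraction(4, 10), "blue": Fraction(6, 10)}
likelihood = {
    "red": {"apple": Fraction(1, 4), "orange": Fraction(3, 4)},
    "blue": {"apple": Fraction(3, 4), "orange": Fraction(1, 4)},
}


def marginal(fruit):
    """Sum rule over boxes, with the product rule inside: p(F) = sum_B p(F|B) p(B)."""
    return sum(likelihood[box][fruit] * prior[box] for box in prior)


def posterior(box, fruit):
    """Bayes' theorem: p(B|F) = p(F|B) p(B) / p(F)."""
    return likelihood[box][fruit] * prior[box] / marginal(fruit)


print(marginal("apple"))           # 11/20
print(posterior("red", "orange"))  # 2/3
```

Observing an orange raises the probability of the red box from the prior 4/10 to 2/3, because oranges are more common in that box under these assumed proportions.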

Probability Densities

Transformed Densities

Expectations

Conditional Expectation (discrete)

Approximate Expectation (discrete and continuous)
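
The approximate expectation can be sketched as a sample average, assuming the usual estimator $\mathbb{E}[f] \simeq \frac{1}{N}\sum_{n=1}^{N} f(x_n)$ with the $x_n$ drawn from $p(x)$. The uniform sampler and the choice $f(x) = x^2$ are illustrative assumptions.

```python
import random


def approximate_expectation(f, sampler, n_samples):
    """Estimate E[f] by averaging f over n_samples draws from sampler()."""
    return sum(f(sampler()) for _ in range(n_samples)) / n_samples


# Assumed example: x ~ Uniform(0, 1), f(x) = x^2, whose exact expectation is 1/3.
rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = approximate_expectation(lambda x: x * x, rng.random, 100_000)
print(estimate)
```

The same function works for discrete and continuous variables, since it only needs a way to draw samples.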

Variances and Covariances

The Gaussian Distribution

Gaussian Mean and Variance

The Multivariate Gaussian

Gaussian Parameter Estimation

Likelihood function

Maximum (Log) Likelihood

Properties of $\mu_{\mathrm{ML}}$ and $\sigma^2_{\mathrm{ML}}$
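
A brief sketch of the maximum likelihood estimates for a univariate Gaussian, assuming the standard results: the sample mean, and the sample variance taken about that mean with a divisor of N. The ML variance is biased low by a factor of (N - 1)/N, so a corrected estimate is shown as well. Function names and data are hypothetical.

```python
def gaussian_ml(xs):
    """Return (mu_ML, sigma2_ML) for i.i.d. samples xs."""
    n = len(xs)
    mu_ml = sum(xs) / n
    var_ml = sum((x - mu_ml) ** 2 for x in xs) / n  # divides by N, so biased
    return mu_ml, var_ml


def unbiased_variance(xs):
    """Rescale sigma2_ML by N / (N - 1) to remove the bias."""
    n = len(xs)
    _, var_ml = gaussian_ml(xs)
    return var_ml * n / (n - 1)


data = [1.0, 2.0, 3.0, 4.0]     # hypothetical observations
print(gaussian_ml(data))        # (2.5, 1.25)
print(unbiased_variance(data))
```

The bias matters most for small N and vanishes as N grows.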

Curve Fitting Re-visited

Maximum Likelihood

Determine $\mathbf{w}_{\mathrm{ML}}$ by minimizing sum-of-squares error, $E(\mathbf{w})$.

Predictive Distribution

MAP: A Step towards Bayes

Determine $\mathbf{w}_{\mathrm{MAP}}$ by minimizing regularized sum-of-squares error, $\widetilde{E}(\mathbf{w})$.

Bayesian Curve Fitting

Bayesian Predictive Distribution

Model Selection

Cross-Validation
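
A minimal sketch of S-fold cross-validation index splitting, assuming each fold is held out once while the remaining points are used for training. The interleaved fold assignment and the function name are choices made for this illustration.

```python
def s_fold_splits(n_points, n_folds):
    """Yield (training_indices, held_out_indices) for each of n_folds folds."""
    indices = list(range(n_points))
    for k in range(n_folds):
        held_out = indices[k::n_folds]  # every n_folds-th point, starting at k
        training = [i for i in indices if i % n_folds != k]
        yield training, held_out


for training, held_out in s_fold_splits(n_points=10, n_folds=4):
    print(held_out, training)
```

Each point is held out exactly once, so every model is scored on data it was not fitted to. The cost is S training runs per candidate model.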

Curse of Dimensionality

Curse of Dimensionality

Polynomial curve fitting, M = 3

Gaussian Densities in higher dimensions
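
To illustrate the polynomial case, the sketch below counts the independent coefficients of an order-M polynomial in D input dimensions. It assumes the count equals the number of monomials of total degree at most M, which is the binomial coefficient C(D + M, M). For M = 3 this grows like D^3.

```python
import math


def n_polynomial_coefficients(n_dims, order):
    """Number of monomials of total degree <= order in n_dims variables."""
    return math.comb(n_dims + order, order)


for d in (1, 2, 10, 100):
    print(d, n_polynomial_coefficients(d, 3))
```

In one dimension a cubic needs 4 coefficients, while in 100 dimensions it needs well over a hundred thousand, which is one face of the curse of dimensionality.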

Decision Theory

Inference step

Determine either $p(t \mid x)$ or $p(x, t)$.

Decision step

For given x, determine optimal t.

Minimum Misclassification Rate

Minimum Expected Loss

Example: classify medical images as ‘cancer’ or ‘normal’

[Loss matrix figure; axes labeled Truth and Decision]

Minimum Expected Loss

Regions are chosen to minimize
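
As a sketch of the resulting decision rule: for each input, pick the decision j that minimizes $\sum_k L_{kj}\, p(\mathcal{C}_k \mid x)$. The loss values below are assumed for the cancer example (a missed cancer is taken to cost 1000 times more than a false alarm) and are not recovered from the slide.

```python
# Assumed loss matrix: rows are the truth, columns are the decision.
LOSS = [
    [0.0, 1000.0],  # truth = cancer: deciding "normal" is very costly
    [1.0, 0.0],     # truth = normal: deciding "cancer" costs a little
]


def expected_losses(posterior, loss):
    """Expected loss of each decision j: sum_k loss[k][j] * posterior[k]."""
    n_decisions = len(loss[0])
    return [
        sum(loss[k][j] * posterior[k] for k in range(len(posterior)))
        for j in range(n_decisions)
    ]


def decide(posterior, loss):
    """Index of the decision with the smallest expected loss."""
    losses = expected_losses(posterior, loss)
    return min(range(len(losses)), key=losses.__getitem__)


# Even a 1% posterior probability of cancer leads to the "cancer" decision (index 0).
print(decide([0.01, 0.99], LOSS))
```

With an asymmetric loss the decision boundary moves away from the point where the posteriors are equal.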

Reject Option
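
A small sketch of a reject rule, assuming the common formulation in which an input is classified only when its largest posterior probability exceeds a threshold θ, and is otherwise deferred (for example to a human expert). The function name and the use of `None` to signal rejection are illustrative choices.

```python
def classify_with_reject(posterior, theta):
    """Return the most probable class index, or None if its posterior <= theta."""
    best = max(range(len(posterior)), key=posterior.__getitem__)
    return best if posterior[best] > theta else None


print(classify_with_reject([0.9, 0.1], theta=0.8))    # confident: class 0
print(classify_with_reject([0.55, 0.45], theta=0.8))  # ambiguous: None
```

Setting θ = 1 rejects every input, while for K classes any θ below 1/K rejects none.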

Why Separate Inference and Decision?

• Minimizing risk (loss matrix may change over time)

• Reject option

• Unbalanced class priors

• Combining models

Decision Theory for Regression

Inference step

Determine $p(x, t)$.

Decision step

For given x, make optimal prediction, y(x), for t.

Loss function: $\mathbb{E}[L] = \iint L(t, y(x))\, p(x, t)\, \mathrm{d}x\, \mathrm{d}t$

The Squared Loss Function

Generative vs Discriminative

Generative approach:

Model $p(t, x) = p(x \mid t)\, p(t)$

Use Bayes’ theorem

Discriminative approach:

Model $p(t \mid x)$ directly

Entropy

Important quantity in

• coding theory

• statistical physics

• machine learning

Entropy

Coding theory: x discrete with 8 possible states; how many bits to transmit the state of x?

All states equally likely
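
The bit count can be checked with a short sketch, assuming the entropy in bits is $\mathrm{H}[x] = -\sum_x p(x) \log_2 p(x)$. The non-uniform distribution in the second call is an assumed example added for contrast.

```python
import math


def entropy_bits(probs):
    """Entropy in bits; zero-probability states contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


uniform = [1 / 8] * 8
skewed = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 64, 1 / 64, 1 / 64, 1 / 64]

print(entropy_bits(uniform))  # 3.0
print(entropy_bits(skewed))   # 2.0
```

Eight equally likely states need 3 bits, and a skewed distribution over the same eight states needs fewer on average because short codes can be given to the likely states.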

Entropy

Entropy

In how many ways can N identical objects be allocated to M bins?

Entropy maximized when $p_i = 1/M$ for all $i$

Entropy

Differential Entropy

Put bins of width $\Delta$ along the real line

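Differential entropy maximized (for fixed $\sigma^2$) when $p(x) = \mathcal{N}(x \mid \mu, \sigma^2)$

in which case $\mathrm{H}[x] = \frac{1}{2}\{1 + \ln(2\pi\sigma^2)\}$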

Conditional Entropy

The Kullback-Leibler Divergence
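
A sketch for discrete distributions, assuming the definition $\mathrm{KL}(p \,\|\, q) = \sum_i p_i \ln (p_i / q_i)$ in nats. The code further assumes $q_i > 0$ wherever $p_i > 0$, and the example distributions are made up for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as lists."""
    return sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q) if p_i > 0)


p = [0.5, 0.5]
q = [0.25, 0.75]
print(kl_divergence(p, q))
print(kl_divergence(q, p))  # generally different: the divergence is not symmetric
print(kl_divergence(p, p))  # 0.0
```

The divergence is non-negative and equals zero only when the two distributions coincide.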

Mutual Information
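
A sketch for a discrete joint distribution, assuming mutual information is defined as the KL divergence between the joint and the product of its marginals, $\mathrm{I}[x, y] = \sum_{x,y} p(x, y) \ln \frac{p(x, y)}{p(x)\,p(y)}$, in nats. The joint tables are made-up examples.

```python
import math


def mutual_information(joint):
    """I[x, y] in nats for a joint table joint[i][j] = p(x=i, y=j)."""
    px = [sum(row) for row in joint]        # marginal over y
    py = [sum(col) for col in zip(*joint)]  # marginal over x
    total = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0:  # zero cells contribute nothing
                total += p_xy * math.log(p_xy / (px[i] * py[j]))
    return total


independent = [[0.25, 0.25], [0.25, 0.25]]
identical = [[0.5, 0.0], [0.0, 0.5]]
print(mutual_information(independent))  # 0.0
print(mutual_information(identical))    # ln 2
```

Mutual information is zero exactly when the variables are independent, and here it reaches ln 2 (one bit) when one binary variable determines the other.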