Bayesian Classification
with a brief introduction to pattern recognition

Modified from slides by Michael L. Raymer, Ph.D.


The pattern recognition paradigm

• Fruit on an assembly line: oranges, grapefruit, lemons, cherries, apples
• Sensors measure: red intensity, yellow intensity, mass (kg), approximate volume
• At the end of the line, a gate switches to deposit the fruit into the correct bin

Training the algorithm

[Figure: sensors, scales, etc. measure a single fruit (Red = 2.125, Yellow = 6.143, Mass = 134.32, Volume = 24.21), paired with its class label, "Apple".]

Training (2)

[Figure: many such labeled examples, each a feature vector (Red, Yellow, Mass, Volume) together with its class label, are fed into the Classifier to train it.]

Testing

[Figure: a new, unlabeled measurement (Red = 2.125, Yellow = 6.143, Mass = 134.32, Volume = 24.21) goes into the trained Classifier, which outputs a predicted class.]

Pattern Matrix

        V1    V2    V3    V4    V5    Class
Ex 1   3.06  2.05  6.39  7.84  6.75     1
Ex 2   8.25  0.72  2.52  0.50  9.08     1
Ex 3   2.72  9.32  5.68  7.83  7.86     1
Ex 4   7.37  1.30  2.97  0.61  3.49     2
Ex 5   0.73  1.46  6.60  6.08  0.78     2
Ex 6   4.85  5.08  4.87  8.06  8.65     2
Ex 7   5.89  1.23  6.38  2.81  6.84     3
Ex 8   0.52  6.57  4.08  3.62  0.59     3
Ex 9   5.66  3.65  6.87  6.90  7.93     3
Ex 10  3.92  0.73  1.01  3.57  2.47     4
Ex 11  8.84  1.42  2.79  3.40  3.19     4
Ex 12  5.63  4.32  8.08  0.82  4.74     4

Nearest Neighbor Classification

[Figure: scatter plot of training examples, Mass (normalized, 0–10) versus Red Intensity (normalized, 0–10); a query point "?" is assigned the class of the nearest training example. A minimal code sketch follows below.]
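Not part of the original slides: a minimal nearest-neighbor sketch in Python (the feature values and class names are made up for illustration). It classifies a query point by majority vote among its k closest training examples under Euclidean distance.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=1):
    """Classify x_query by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x_query, axis=1)  # Euclidean distance to every example
    nearest = np.argsort(dists)[:k]                    # indices of the k closest examples
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                   # majority class among the neighbors

# Hypothetical normalized (mass, red intensity) training data
X_train = np.array([[2.0, 8.1], [2.5, 7.6], [7.8, 2.2], [8.3, 3.0]])
y_train = np.array(["apple", "apple", "orange", "orange"])

print(knn_predict(X_train, y_train, np.array([7.0, 2.5]), k=3))  # -> orange
```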

Evaluating Accuracy

[Figure: the same Mass versus Red Intensity scatter plot, now split into training data (used to fit the classifier) and testing data (held out to estimate accuracy).]

Problems with KNN classifiers

• Lots of memorization: the entire training set must be stored
• Slow: lots of distance calculations for every query
• Incorrect features cause problems
• Features are assumed to all be of equal importance in classification
• Odd exemplars (e.g., green/yellow apples) cause problems
• What value should k take?

Distributions

• Bayesian classifiers start with an estimate of the distribution of the features

[Figure: left, a binomial distribution (discrete): P(N) versus N = # heads in 20 tosses; right, a Gaussian distribution (continuous).]

Density Estimation

• Parametric: assume a particular distribution, e.g. a Gaussian, and estimate its parameters (μ, σ).
• Non-parametric: histogram sampling; bin size is critical; Gaussian smoothing can help.

A sketch of both approaches follows below.
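To make the two approaches concrete, here is a small sketch on synthetic data (NumPy assumed; the diameter values are invented). The parametric branch fits μ and σ under a Gaussian assumption; the non-parametric branch builds a normalized histogram.

```python
import numpy as np

rng = np.random.default_rng(0)
diameters = rng.normal(loc=3.3, scale=0.2, size=500)  # synthetic apple diameters

# Parametric: assume a Gaussian and estimate its parameters from the sample
mu, sigma = diameters.mean(), diameters.std()

# Non-parametric: histogram estimate; density=True normalizes to unit area
density, edges = np.histogram(diameters, bins=20, density=True)

print(f"estimated mu = {mu:.3f}, sigma = {sigma:.3f}")
print("density estimate in the first bin:", density[0])
```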

The Gaussian distribution

Univariate:

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Multivariate (d-dimensional):

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\!\left[-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\mathsf{T}}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right]$$

A parametric Bayesian classifier must estimate μ and Σ from the training samples.
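A sketch of the multivariate formula above, with μ and Σ estimated from synthetic training samples; `gaussian_pdf` is a hypothetical helper written for this example, not something from the slides.

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    """Evaluate the d-dimensional Gaussian density at x."""
    d = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm

# Estimate mu and Sigma from (synthetic) samples of one class: (diameter, mass)
rng = np.random.default_rng(1)
X = rng.normal([3.3, 140.0], [0.2, 10.0], size=(200, 2))
mu_hat = X.mean(axis=0)
cov_hat = np.cov(X, rowvar=False)  # sample covariance matrix

print(gaussian_pdf(np.array([3.4, 138.0]), mu_hat, cov_hat))
```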

Making decisions

• Once you have the distributions for each feature and each class…
• …you can ask questions like:

If I have an apple, what is the probability that the diameter will be between 3.2 and 3.5 inches?

More decisions…

Non-parametric (from a histogram of diameter counts):

$$P(3.1 \le d \le 3.5) \approx \frac{\text{count in bins representing 3.1 through 3.5 inches}}{\text{count in all bins}}$$

Parametric (from the fitted Gaussian):

$$P(3.1 \le d \le 3.5) = \int_{3.1}^{3.5} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx$$
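A sketch of both calculations on synthetic diameter data. The Gaussian integral is evaluated via the error function, and the non-parametric estimate is taken as the fraction of samples in the interval, which is equivalent to summing the relevant histogram bins.

```python
import numpy as np
from math import erf, sqrt

def gaussian_cdf(x, mu, sigma):
    """Closed-form Gaussian CDF via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

rng = np.random.default_rng(2)
diam = rng.normal(3.3, 0.2, size=1000)  # synthetic apple diameters

# Parametric: integrate the fitted Gaussian over [3.1, 3.5]
mu, sigma = diam.mean(), diam.std()
p_param = gaussian_cdf(3.5, mu, sigma) - gaussian_cdf(3.1, mu, sigma)

# Non-parametric: fraction of samples falling in the interval
p_hist = np.mean((diam >= 3.1) & (diam <= 3.5))

print(f"parametric: {p_param:.3f}, non-parametric: {p_hist:.3f}")
```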

A Simple Example

• You are given a fruit with a diameter of 4” – is it a pear or an apple?
• To begin, we need to know the distributions of diameters for pears and apples.

Maximum Likelihood

[Figure: the class-conditional distributions P(x | apple) and P(x | pear), plotted against x = diameter (1” to 6”).]

What are we asking?

• If the fruit is an apple, how likely is it to have a diameter of 4”?
• If the fruit is a xenofruit from planet Xircon, how likely is it to have a diameter of 4”?

Is this the right question to ask?

A Key Problem

• We based this decision on the class-conditional probability, P(x | pear)
• What we really want to use is the posterior probability, P(pear | x)
• What if we found the fruit in a pear orchard?
• We need to know the prior probability of finding an apple or a pear!

Statistical decisions…

• If a fruit has a diameter of 4”, how likely is it to be an apple?

[Figure: the set of apples and the set of 4” fruit, shown as overlapping regions.]

“Inverting” the question

• Given an apple, what is the probability that it will have a diameter of 4”?  P(x = 4.0 | apple)
• Given a 4” diameter fruit, what is the probability that it is an apple?  P(apple | x = 4.0)

Prior Probabilities

• Prior probability + evidence → posterior probability
• Without evidence, what is the “prior probability” that a fruit is an apple?

The heart of it all

• Bayes Rule:

$$P(\text{class} \mid \text{evidence}) = \frac{P(\text{evidence} \mid \text{class})\, P(\text{class})}{\sum_{\text{all classes}} P(\text{evidence} \mid \text{class})\, P(\text{class})}$$

For the fruit example:

$$P(\text{apple} \mid d = 4'') = \frac{p(d = 4'' \mid \text{apple})\, P(\text{apple})}{p(d = 4'' \mid \text{apple})\, P(\text{apple}) + p(d = 4'' \mid \text{pear})\, P(\text{pear})}$$

Bayes Rule

$$P(\omega_j \mid x) = \frac{p(x \mid \omega_j)\, P(\omega_j)}{\sum_{j=1}^{c} p(x \mid \omega_j)\, P(\omega_j)}$$

or

$$P(\omega_j \mid x) = \frac{p(x \mid \omega_j)\, P(\omega_j)}{p(x)}$$
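The rule translates directly into a few lines of Python. This hypothetical `posteriors` helper (not from the slides) takes the per-class likelihoods p(x | ω_j) and priors P(ω_j) and returns the normalized posteriors; it is reused conceptually in the worked examples that follow.

```python
def posteriors(likelihoods, priors):
    """Bayes rule: P(w_j | x) = p(x | w_j) P(w_j) / sum_k p(x | w_k) P(w_k)."""
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    evidence = sum(joint)                 # p(x): the shared denominator
    return [j / evidence for j in joint]  # one posterior per class

print(posteriors([0.4, 0.05], [0.1, 0.9]))  # apple vs. pear example below
```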

Example Revisited

• Is it an ordinary apple or an uncommon pear?

$$p(d = 4'' \mid \text{apple}) = 0.4 \qquad p(d = 4'' \mid \text{pear}) = 0.05$$

$$P(\text{apple}) = 0.1 \qquad P(\text{pear}) = 0.9$$

Bayes Rule Example

$$P(\text{apple} \mid d = 4'') = \frac{p(d = 4'' \mid \text{apple})\, P(\text{apple})}{p(d = 4'' \mid \text{apple})\, P(\text{apple}) + p(d = 4'' \mid \text{pear})\, P(\text{pear})}$$

$$= \frac{0.4 \times 0.1}{0.4 \times 0.1 + 0.05 \times 0.9} = \frac{0.04}{0.085} \approx 0.47$$

Bayes Rule Example

$$P(\text{pear} \mid d = 4'') = \frac{p(d = 4'' \mid \text{pear})\, P(\text{pear})}{p(d = 4'' \mid \text{apple})\, P(\text{apple}) + p(d = 4'' \mid \text{pear})\, P(\text{pear})}$$

$$= \frac{0.05 \times 0.9}{0.4 \times 0.1 + 0.05 \times 0.9} = \frac{0.045}{0.085} \approx 0.53$$
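Checking the arithmetic of the two slides above in plain Python:

```python
# Likelihoods and priors from the example
joint_apple = 0.4 * 0.1    # p(d=4"|apple) * P(apple) = 0.040
joint_pear  = 0.05 * 0.9   # p(d=4"|pear)  * P(pear)  = 0.045
evidence = joint_apple + joint_pear  # 0.085, the shared denominator

print(joint_apple / evidence)  # ~0.47
print(joint_pear / evidence)   # ~0.53 -> predict pear
```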

Solution

(The same rule applied to a different problem: the probability of guilt given a positive test result.)

$$P(\text{guilt} \mid \text{pos}) = \frac{p(\text{pos} \mid \text{guilt})\, P(\text{guilt})}{p(\text{pos} \mid \text{guilt})\, P(\text{guilt}) + p(\text{pos} \mid \text{innocent})\, P(\text{innocent})}$$

$$= \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.0001 \times 0.999} = \frac{0.00099}{0.0010899} \approx 0.91$$

Marginal Distributions

[Figure: per-feature class-conditional densities P(x₁ | apple), P(x₁ | pear), P(x₂ | apple), and P(x₂ | pear).]

Combining Marginals

• Assuming independent features:

$$P(\mathbf{x} \mid \omega_j) = P(x_1 \mid \omega_j)\, P(x_2 \mid \omega_j) \cdots P(x_d \mid \omega_j)$$

• If we assume independence and use Bayes rule, we have a Naïve Bayes decision maker (classifier). (A small sketch follows below.)
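The independence assumption reduces the joint class-conditional probability to a product of per-feature marginals; a two-feature sketch with hypothetical values:

```python
import math

# Hypothetical marginals, e.g. P(red = 8.2 | orange) and P(mass = 7.6 | orange)
marginals = [0.08, 0.12]
p_x_given_class = math.prod(marginals)  # naive product: P(x | class)
print(p_x_given_class)                  # 0.0096
```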

Bayes Decision Rule

$$\text{Predict class } \omega_i \text{ such that } P(\omega_i \mid x) \ge P(\omega_j \mid x) \text{ for all } j$$

• Provably optimal when the modeled class-conditional distributions match reality; here, when the features (evidence) truly follow Gaussian distributions and are independent.
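The decision rule itself is just an argmax over the per-class scores (normalized posteriors or not); a minimal sketch:

```python
def bayes_decision(scores):
    """Return the class whose posterior (or unnormalized score) is largest."""
    return max(scores, key=scores.get)

print(bayes_decision({"apple": 0.47, "pear": 0.53}))  # -> pear
```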

Likelihood Ratios

• When deciding between two possibilities, we don’t need the exact probabilities; we only need to know which one is greater.
• The denominator is the same for every class:
  it can be eliminated, which is especially useful when there are many possible classes.

Likelihood Ratio Example

$$P(\text{apple} \mid d = 4'') = \frac{p(d = 4'' \mid \text{apple})\, P(\text{apple})}{p(d = 4'' \mid \text{apple})\, P(\text{apple}) + p(d = 4'' \mid \text{pear})\, P(\text{pear})}$$

$$P(\text{pear} \mid d = 4'') = \frac{p(d = 4'' \mid \text{pear})\, P(\text{pear})}{p(d = 4'' \mid \text{apple})\, P(\text{apple}) + p(d = 4'' \mid \text{pear})\, P(\text{pear})}$$

Likelihood Ratio Example

The denominators are identical, so it suffices to compare:

$$p(d = 4'' \mid \text{apple})\, P(\text{apple}) \quad \text{vs.} \quad p(d = 4'' \mid \text{pear})\, P(\text{pear})$$
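For the apple/pear numbers, the comparison amounts to a single ratio; a sketch:

```python
score_apple = 0.4 * 0.1   # p(d=4"|apple) * P(apple) = 0.040
score_pear  = 0.05 * 0.9  # p(d=4"|pear)  * P(pear)  = 0.045

ratio = score_pear / score_apple          # 1.125: greater than 1, so pear wins
print("pear" if ratio > 1 else "apple")
```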

In-class example: Oranges vs. Grapefruit

[Figure: four histograms (Frequency versus Bin): Red Intensity and Mass for one class of fruit, and Red Intensity and Mass for the other.]

Example (cont’d)

• After observing several hundred fruit pass down the assembly line, we observe that:
  72% are oranges
  28% are grapefruit
• Fruit ‘x’: red intensity = 8.2, mass = 7.6

What shall we predict for the class of fruit ‘x’?

The whole enchilada

$$P(\text{orange} \mid 8.2, 7.6) = \frac{p(8.2, 7.6 \mid \text{orange})\, P(\text{orange})}{p(8.2, 7.6 \mid \text{orange})\, P(\text{orange}) + p(8.2, 7.6 \mid \text{grapefruit})\, P(\text{grapefruit})}$$

and…

$$P(8.2, 7.6 \mid \text{orange}) = P(\text{red} = 8.2 \mid \text{orange}) \times P(\text{mass} = 7.6 \mid \text{orange})$$

(Naïve assumption)

Repeat for grapefruit and predict the more probable class.

The whole enchilada (2)

$$P(\text{orange} \mid 8.2, 7.6) = \frac{P(\text{red} = 8.2 \mid \text{orange})\, P(\text{mass} = 7.6 \mid \text{orange})\, P(\text{orange})}{\sum_{f \,\in\, \{\text{orange},\, \text{grapefruit}\}} P(\text{red} = 8.2 \mid f)\, P(\text{mass} = 7.6 \mid f)\, P(f)}$$

$$= \frac{0.08 \times 0.12 \times 0.72}{0.08 \times 0.12 \times 0.72 + 0.19 \times 0.20 \times 0.28} \approx 0.39$$

The whole enchilada (3)

$$P(\text{grapefruit} \mid 8.2, 7.6) = \frac{P(\text{red} = 8.2 \mid \text{grapefruit})\, P(\text{mass} = 7.6 \mid \text{grapefruit})\, P(\text{grapefruit})}{\sum_{f \,\in\, \{\text{orange},\, \text{grapefruit}\}} P(\text{red} = 8.2 \mid f)\, P(\text{mass} = 7.6 \mid f)\, P(f)}$$

$$= \frac{0.19 \times 0.20 \times 0.28}{0.08 \times 0.12 \times 0.72 + 0.19 \times 0.20 \times 0.28} \approx 0.61$$

Conclusion

$$P(\text{orange} \mid x) \approx 0.39 \qquad P(\text{grapefruit} \mid x) \approx 0.61$$

Predict that fruit ‘x’ is a grapefruit, despite the relative scarcity of grapefruit on the conveyor belt.

Abbreviated

• Since the denominator is the same for all classes, we can just compare:

$$P(\text{red} = 8.2 \mid \text{orange})\, P(\text{mass} = 7.6 \mid \text{orange})\, P(\text{orange})$$

and

$$P(\text{red} = 8.2 \mid \text{grapefruit})\, P(\text{mass} = 7.6 \mid \text{grapefruit})\, P(\text{grapefruit})$$

Likelihood comparison

$$P(\text{red} = 8.2 \mid \text{orange})\, P(\text{mass} = 7.6 \mid \text{orange})\, P(\text{orange}) = 0.08 \times 0.12 \times 0.72 \approx 0.0069$$

$$P(\text{red} = 8.2 \mid \text{grapefruit})\, P(\text{mass} = 7.6 \mid \text{grapefruit})\, P(\text{grapefruit}) = 0.19 \times 0.20 \times 0.28 \approx 0.0106$$
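Putting the whole in-class example together, a short sketch reproducing both the abbreviated scores and the normalized posteriors from the earlier slides:

```python
# Class-conditional probabilities read off the histograms, plus the priors
p_red_orange, p_mass_orange, p_orange = 0.08, 0.12, 0.72
p_red_grape,  p_mass_grape,  p_grape  = 0.19, 0.20, 0.28

score_orange = p_red_orange * p_mass_orange * p_orange  # ~0.0069
score_grape  = p_red_grape  * p_mass_grape  * p_grape   # ~0.0106

evidence = score_orange + score_grape
print(f"P(orange | x)     = {score_orange / evidence:.2f}")   # ~0.39
print(f"P(grapefruit | x) = {score_grape  / evidence:.2f}")   # ~0.61
print("predict:", "orange" if score_orange > score_grape else "grapefruit")
```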
