Pattern Classification, Chapter 2 (Part 2)
All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John Wiley & Sons, 2000, with the permission of the authors and the publisher.

Dec 17, 2015


Eleanore Henry
Transcript
Page 1

Pattern Classification

All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John Wiley & Sons, 2000 with the permission of the authors and the publisher

Page 2

Chapter 2 (Part 2): Bayesian Decision Theory

(Sections 2.3-2.5)

• Minimum-Error-Rate Classification

• Classifiers, Discriminant Functions and Decision Surfaces

• The Normal Density

Page 3

Minimum-Error-Rate Classification

• Actions are decisions on classes. If action $\alpha_i$ is taken and the true state of nature is $\omega_j$, then the decision is correct if $i = j$ and in error if $i \neq j$.

• Seek a decision rule that minimizes the probability of error, which is the error rate.

Page 4

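• Introduction of the zero-one loss function:

$$\lambda(\alpha_i \mid \omega_j) = \begin{cases} 0 & i = j \\ 1 & i \neq j \end{cases} \qquad i, j = 1, \ldots, c$$

Therefore, the conditional risk is:

$$R(\alpha_i \mid x) = \sum_{j=1}^{c} \lambda(\alpha_i \mid \omega_j)\, P(\omega_j \mid x) = \sum_{j \neq i} P(\omega_j \mid x) = 1 - P(\omega_i \mid x)$$

“The risk corresponding to this loss function is the average probability of error”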
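A minimal sketch of the conditional risk computation, assuming the posteriors are already known. The function names and the three posterior values are hypothetical, chosen only for illustration. It checks numerically that under the zero-one loss the risk of each action equals one minus the posterior of the matching class.

```python
def conditional_risk(loss, posteriors):
    """Return R(alpha_i | x) = sum_j loss[i][j] * P(omega_j | x) for every action i."""
    return [sum(l_ij * p_j for l_ij, p_j in zip(row, posteriors)) for row in loss]


def zero_one_loss(c):
    """Build the c x c zero-one loss matrix: 0 on the diagonal, 1 elsewhere."""
    return [[0 if i == j else 1 for j in range(c)] for i in range(c)]


# Hypothetical posteriors P(omega_j | x) for c = 3 classes (they sum to 1).
posteriors = [0.2, 0.5, 0.3]
risks = conditional_risk(zero_one_loss(3), posteriors)

# Under the zero-one loss, each risk should equal 1 - P(omega_i | x).
for risk, p in zip(risks, posteriors):
    assert abs(risk - (1.0 - p)) < 1e-12
```

The same `conditional_risk` function accepts any loss matrix, so the zero-one case is just one choice of `loss`.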

Page 5

• Minimizing the risk requires maximizing $P(\omega_i \mid x)$

(since $R(\alpha_i \mid x) = 1 - P(\omega_i \mid x)$)

• For minimum error rate:

• Decide $\omega_i$ if $P(\omega_i \mid x) > P(\omega_j \mid x)$ for all $j \neq i$
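The minimum-error-rate rule can be sketched as follows, assuming the class-conditional likelihoods and the priors at one observed x are given. All names and numbers here are illustrative assumptions, not values from the slides. Posteriors come from Bayes' rule, and the decision is the class with the largest posterior.

```python
def posteriors(likelihoods, priors):
    """Bayes' rule: P(omega_i | x) = P(x | omega_i) P(omega_i) / P(x)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)  # P(x), the normalizing constant
    return [j / evidence for j in joint]


def decide_min_error(likelihoods, priors):
    """Return the index i that maximizes P(omega_i | x) (first index on ties)."""
    post = posteriors(likelihoods, priors)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical values of P(x | omega_i) at one observed x, and priors P(omega_i).
likelihoods = [0.6, 0.3, 0.1]
priors = [0.1, 0.6, 0.3]
decision = decide_min_error(likelihoods, priors)  # 0-based class index
```

Note that the class with the largest likelihood (index 0) is not chosen here, because its small prior pulls its posterior below that of index 1.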

Page 6

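• Regions of decision and the zero-one loss function. Let

$$\theta_\lambda = \frac{\lambda_{12} - \lambda_{22}}{\lambda_{21} - \lambda_{11}} \cdot \frac{P(\omega_2)}{P(\omega_1)}$$

then decide $\omega_1$ if:

$$\frac{P(x \mid \omega_1)}{P(x \mid \omega_2)} > \theta_\lambda$$

• If $\lambda$ is the zero-one loss function, which means

$$\lambda = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \text{then} \quad \theta_\lambda = \frac{P(\omega_2)}{P(\omega_1)} = \theta_a$$

if

$$\lambda = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}, \quad \text{then} \quad \theta_\lambda = \frac{2\,P(\omega_2)}{P(\omega_1)} = \theta_b$$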
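The likelihood-ratio test on this slide can be sketched as below. The priors and likelihood values are hypothetical, and the sketch assumes the usual case where an error costs more than a correct decision, so that the denominator is positive. The loss matrix is indexed so that `loss[i][j]` is the cost of deciding class i+1 when class j+1 is true.

```python
def threshold(loss, prior1, prior2):
    """theta = (l12 - l22) / (l21 - l11) * P(omega_2) / P(omega_1).

    Assumes l21 > l11, i.e. a wrong decision costs more than a right one.
    """
    (l11, l12), (l21, l22) = loss
    return (l12 - l22) / (l21 - l11) * prior2 / prior1


def decide(lik1, lik2, loss, prior1, prior2):
    """Return 1 if P(x | omega_1) / P(x | omega_2) exceeds the threshold, else 2."""
    return 1 if lik1 / lik2 > threshold(loss, prior1, prior2) else 2


# Hypothetical priors; the two loss matrices are the ones named on the slide.
p1, p2 = 2 / 3, 1 / 3
theta_a = threshold([[0, 1], [1, 0]], p1, p2)  # zero-one loss
theta_b = threshold([[0, 2], [1, 0]], p1, p2)  # deciding omega_1 when omega_2 is true costs double
```

With these priors `theta_b` is twice `theta_a`, so the region where class 1 is chosen shrinks when mistaking class 2 for class 1 becomes more expensive.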

Page 7

Page 8

Classifiers, Discriminant Functions and Decision Surfaces

•The multi-category case

• Set of discriminant functions $g_i(x)$, $i = 1, \ldots, c$

• The classifier assigns a feature vector $x$ to class $\omega_i$ if: $g_i(x) > g_j(x)$ for all $j \neq i$

Page 9

Page 10

• Let $g_i(x) = -R(\alpha_i \mid x)$

(max. discriminant corresponds to min. risk!)

• For the minimum error rate, we take $g_i(x) = P(\omega_i \mid x)$

(max. discriminant corresponds to max. posterior!)

$g_i(x) \equiv P(x \mid \omega_i)\, P(\omega_i)$

$g_i(x) = \ln P(x \mid \omega_i) + \ln P(\omega_i)$

(ln: natural logarithm!)
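The three forms of the discriminant listed above differ only by a positive scale factor or a monotonically increasing transformation, so they should select the same class. The following sketch checks this on hypothetical likelihoods and priors (the numbers are assumptions made for illustration).

```python
import math


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical P(x | omega_i) at one x and priors P(omega_i) for c = 3 classes.
likelihoods = [0.05, 0.20, 0.10]
priors = [0.5, 0.2, 0.3]

evidence = sum(l * p for l, p in zip(likelihoods, priors))  # P(x)

g_posterior = [l * p / evidence for l, p in zip(likelihoods, priors)]
g_joint = [l * p for l, p in zip(likelihoods, priors)]
g_log = [math.log(l) + math.log(p) for l, p in zip(likelihoods, priors)]

# Collect the class chosen by each form; a single element means they agree.
choices = {argmax(g_posterior), argmax(g_joint), argmax(g_log)}
```

The log form is often the most convenient in practice because products of small probabilities become sums.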

Page 11

• Feature space divided into c decision regions

if $g_i(x) > g_j(x)$ for all $j \neq i$ then $x$ is in $\mathcal{R}_i$

($\mathcal{R}_i$ means assign $x$ to $\omega_i$)

• The two-category case

• A classifier is a “dichotomizer” that has two discriminant functions $g_1$ and $g_2$

Let $g(x) \equiv g_1(x) - g_2(x)$

Decide $\omega_1$ if $g(x) > 0$; otherwise decide $\omega_2$

Page 12

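• The computation of $g(x)$. Two convenient forms, which have the same sign and therefore give the same decision:

$$g(x) = P(\omega_1 \mid x) - P(\omega_2 \mid x)$$

$$g(x) = \ln \frac{P(x \mid \omega_1)}{P(x \mid \omega_2)} + \ln \frac{P(\omega_1)}{P(\omega_2)}$$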
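A small sketch of the two-category dichotomizer, using hypothetical likelihood and prior values. It computes g(x) both as a difference of posteriors and as a sum of log ratios; the two numbers differ, but they share the same sign, so the decision is identical.

```python
import math


def g_posterior_diff(lik1, lik2, prior1, prior2):
    """g(x) = P(omega_1 | x) - P(omega_2 | x)."""
    evidence = lik1 * prior1 + lik2 * prior2  # P(x)
    return (lik1 * prior1 - lik2 * prior2) / evidence


def g_log_ratio(lik1, lik2, prior1, prior2):
    """g(x) = ln[P(x | omega_1) / P(x | omega_2)] + ln[P(omega_1) / P(omega_2)]."""
    return math.log(lik1 / lik2) + math.log(prior1 / prior2)


def decide(g_value):
    """Decide class 1 if g(x) > 0, otherwise class 2."""
    return 1 if g_value > 0 else 2


# Hypothetical inputs: class 2 fits x better, but class 1 has a much larger prior.
args = (0.2, 0.5, 0.8, 0.2)
d_diff = decide(g_posterior_diff(*args))
d_log = decide(g_log_ratio(*args))
```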

Page 13

Page 14

The Normal Density

• Univariate density

• Density which is analytically tractable

• Continuous density

• A lot of processes are asymptotically Gaussian

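• Handwritten characters and speech sounds can be viewed as an ideal or prototype pattern corrupted by a random process (central limit theorem)

$$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right]$$

Where: $\mu$ = mean (or expected value) of $x$; $\sigma^2$ = expected squared deviation or variance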
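As an illustration, the univariate normal density can be evaluated directly from its definition: the exponential of minus one half the squared standardized deviation, divided by sigma times the square root of 2 pi. The function name, the grid, and the step size below are assumptions made for this sketch. The crude Riemann sum is only a sanity check that the density integrates to about 1.

```python
import math


def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma > 0."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


# Peak value of the standard normal, which equals 1 / sqrt(2 * pi).
peak = normal_pdf(0.0, 0.0, 1.0)

# Riemann sum over [-8, 8] with step 0.01 as a rough check of the total area.
step = 0.01
area = sum(normal_pdf(-8.0 + k * step, 0.0, 1.0) for k in range(1601)) * step
```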

Page 15

Page 16

• Multivariate density

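• Multivariate normal density in d dimensions is:

$$P(x) = \frac{1}{(2\pi)^{d/2}\, |\Sigma|^{1/2}} \exp\left[ -\frac{1}{2} (x - \mu)^t\, \Sigma^{-1} (x - \mu) \right]$$

where:

$x = (x_1, x_2, \ldots, x_d)^t$ (t stands for the transpose vector form)

$\mu = (\mu_1, \mu_2, \ldots, \mu_d)^t$ is the mean vector

$\Sigma$ is the d × d covariance matrix

$|\Sigma|$ and $\Sigma^{-1}$ are its determinant and inverse respectively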
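To keep the example free of external libraries, the sketch below evaluates the multivariate normal density only for the special case d = 2, where the determinant and inverse of the covariance matrix have simple closed forms. The function name and the test values are assumptions made for illustration. A general-d version would need a linear algebra routine for the inverse and determinant.

```python
import math


def normal_pdf_2d(x, mu, cov):
    """Bivariate normal density; cov is a 2 x 2 positive definite matrix."""
    (s00, s01), (s10, s11) = cov
    det = s00 * s11 - s01 * s10  # |Sigma|
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^t Sigma^{-1} (x - mu), using Sigma^{-1} = [[s11, -s01], [-s10, s00]] / det
    quad = (s11 * d0 * d0 - (s01 + s10) * d0 * d1 + s00 * d1 * d1) / det
    # For d = 2 the normalizer (2 pi)^{d/2} |Sigma|^{1/2} is 2 pi sqrt(det).
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


# Density at the mean with identity covariance, which equals 1 / (2 * pi).
at_mean = normal_pdf_2d((0.0, 0.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]])

# With a diagonal covariance the density factors into two univariate normals.
diag = normal_pdf_2d((1.0, 2.0), (0.0, 0.0), [[4.0, 0.0], [0.0, 9.0]])
```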

Page 17

Appendix

• Variance = $S^2$

• Standard deviation = $S$

$$S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
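A short sketch of the sample variance with the n - 1 denominator, cross-checked against Python's standard `statistics.variance`, which uses the same definition. The data values are made up for the example.

```python
import math
import statistics


def sample_variance(xs):
    """S^2 = sum((x_i - mean)^2) / (n - 1); needs at least two observations."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Hypothetical data with mean 5 and squared deviations summing to 32.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s2 = sample_variance(data)  # variance S^2
s = math.sqrt(s2)           # standard deviation S
```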

Page 18

Bayes' theorem

        A            ¬A
B       A and B      ¬A and B
¬B      A and ¬B     ¬A and ¬B

$$P(A \mid B) = \frac{P(A)\, P(B \mid A)}{P(A)\, P(B \mid A) + P(\neg A)\, P(B \mid \neg A)}$$

$$P(A \mid B) = \frac{P(A)\, P(B \mid A)}{P(B)}$$
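Bayes' theorem for a single event A and evidence B can be sketched as below, with P(B) expanded over A and not-A by the law of total probability. The function name and the probabilities are hypothetical and serve only as a worked example.

```python
def bayes_posterior(p_a, p_b_given_a, p_b_given_not_a):
    """P(A | B) = P(A) P(B | A) / [P(A) P(B | A) + P(not A) P(B | not A)]."""
    p_b = p_a * p_b_given_a + (1.0 - p_a) * p_b_given_not_a  # total probability of B
    return p_a * p_b_given_a / p_b


# Hypothetical numbers: A is rare, and B is far more likely under A than under not-A.
posterior = bayes_posterior(p_a=0.01, p_b_given_a=0.9, p_b_given_not_a=0.05)
```

Even though B is eighteen times more likely under A, the posterior stays well below one half because the prior P(A) is small.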

Page 19