Machine Learning 2015.08.01. Naïve Bayes

Naïve Bayes 𝑖 𝜶 - Kangwon cs.kangwon.ac.kr/.../2015_MachineLearning/07_naive_bayes.pdf · 2016. 6. 17. · Naïve Bayes • Applying Bayes' rule means all of the data must be considered

Oct 05, 2020

Page 1

Machine Learning

Naïve Bayes

𝑠𝑖𝑔𝑚𝑎 𝜶

2015.08.01.

Page 2

Probability Basics

• Prior, conditional and joint probability for random variables

• Prior probability: $P(X)$

• Conditional probability: $P(X_1 \mid X_2)$, $P(X_2 \mid X_1)$

• Joint probability: $\mathbf{X} = (X_1, X_2)$, $P(\mathbf{X}) = P(X_1, X_2)$

• Relationship: $P(X_1, X_2) = P(X_2 \mid X_1)\, P(X_1) = P(X_1 \mid X_2)\, P(X_2)$

• Independence: $P(X_2 \mid X_1) = P(X_2)$, $P(X_1 \mid X_2) = P(X_1)$, $P(X_1, X_2) = P(X_1)\, P(X_2)$

• Bayesian Rule
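The relationship and independence identities listed on this slide can be checked numerically. The sketch below is only an illustration: the joint table `joint` and the helper names are invented for this example and do not come from the slides.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(X1, X2) over two binary variables
# (values chosen for illustration; they are not from the slides).
joint = {
    (0, 0): F(3, 10), (0, 1): F(1, 10),
    (1, 0): F(3, 10), (1, 1): F(3, 10),
}

def marginal(index, value):
    """P(X_index = value), obtained by summing the joint over the other variable."""
    return sum(p for xs, p in joint.items() if xs[index] == value)

def conditional_x2_given_x1(x2, x1):
    """P(X2 = x2 | X1 = x1) = P(X1 = x1, X2 = x2) / P(X1 = x1)."""
    return joint[(x1, x2)] / marginal(0, x1)

# Product rule: P(X1, X2) = P(X2 | X1) P(X1) must hold in every cell.
product_rule_holds = all(
    conditional_x2_given_x1(x2, x1) * marginal(0, x1) == p
    for (x1, x2), p in joint.items()
)

# Independence would require P(X1, X2) = P(X1) P(X2) in every cell.
independent = all(
    marginal(0, x1) * marginal(1, x2) == p
    for (x1, x2), p in joint.items()
)

print(product_rule_holds, independent)
```

The product rule holds for any joint distribution, whereas independence is a property of a particular table; this assumed table does not have it.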

Page 3

Probabilistic Classification

• Establishing a probabilistic model for classification

• Discriminative model: $P(C \mid \mathbf{X})$, where $C = c_1, \dots, c_L$ and $\mathbf{X} = (X_1, \dots, X_n)$

[Figure: a discriminative probabilistic classifier takes the input $\mathbf{x} = (x_1, x_2, \dots, x_n)$ and outputs $P(c_1 \mid \mathbf{x}), P(c_2 \mid \mathbf{x}), \dots, P(c_L \mid \mathbf{x})$]

Page 4

Probabilistic Classification

• Establishing a probabilistic model for classification (cont.)

• Generative model

• Classifies according to the patterns in the data

• Examines the data given a label to learn the relationship between data and label

Page 5

Bayes' Theorem

• Bayes' theorem (alternatively Bayes' law or Bayes' rule) describes the probability of an event, based on conditions that might be related to the event.

• It expresses the relationship between the prior and posterior probabilities of two random variables

• It determines how the posterior probability is updated when new evidence is presented

• $P(A)$ = prior probability of hypothesis $A$

• $P(B)$ = prior probability of training data $B$

• $P(A \mid B)$ = probability of $A$ given $B$

• $P(B \mid A)$ = probability of $B$ given $A$

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
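A small numeric sketch may make the update concrete. All numbers below are hypothetical and chosen only for illustration; the evidence term $P(B)$ is computed with the law of total probability, which the slide does not spell out.

```python
def posterior(prior_a, likelihood_b_given_a, evidence_b):
    """Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / evidence_b

# Hypothetical numbers for illustration (not from the slides).
p_a = 0.01              # P(A): prior probability of hypothesis A
p_b_given_a = 0.9       # P(B|A): probability of data B if A holds
p_b_given_not_a = 0.05  # P(B|not A): probability of data B if A does not hold

# P(B) by the law of total probability.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

p_a_given_b = posterior(p_a, p_b_given_a, p_b)
print(round(p_a_given_b, 4))
```

With these assumed values the posterior is roughly 0.15, much larger than the prior of 0.01 but still far from certain, because the data $B$ is also fairly likely when $A$ is false.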

Page 6

Bayes' Theorem

• MAP classification rule

• MAP: Maximum A Posteriori

• Assign $x$ to $c^*$ if

$$P(C = c^* \mid X = x) > P(C = c \mid X = x), \quad c \neq c^*, \; c = c_1, \dots, c_L$$

• Generative classification with the MAP rule

• Apply Bayesian rule

$$P(C = c_i \mid X = x) = \frac{P(X = x \mid C = c_i)\, P(C = c_i)}{P(X = x)} \propto P(X = x \mid C = c_i)\, P(C = c_i) \quad \forall\, c_i$$
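The MAP rule above amounts to an argmax over the classes of likelihood times prior. The following is a minimal sketch under assumed inputs: the class names and probability values are hypothetical, and `map_classify` is an illustrative helper, not something defined in the slides.

```python
def map_classify(likelihoods, priors):
    """Return the class c maximizing P(X = x | C = c) * P(C = c).

    P(X = x) is the same for every class, so dropping it
    does not change which class attains the maximum.
    """
    scores = {c: likelihoods[c] * priors[c] for c in priors}
    return max(scores, key=scores.get), scores

# Hypothetical values for one fixed input x (not from the slides).
priors = {"c1": 0.5, "c2": 0.3, "c3": 0.2}          # P(C = c)
likelihoods = {"c1": 0.10, "c2": 0.30, "c3": 0.25}  # P(X = x | C = c)

best, scores = map_classify(likelihoods, priors)
print(best, scores)
```

Here `c2` wins even though `c1` has the largest prior, because its likelihood for this input is three times higher.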

Page 7

Naïve Bayes

• Applying Bayes' rule means every combination of the data must be considered → learning the joint probability $P(X_1, \dots, X_n \mid C)$ is difficult

• 10 binary features → $2^{10}$ data combinations

• Thus, assume that all input features are conditionally independent given the class → Naïve Bayes rule

• The conditional probability of each feature is assumed to be independent of the others

• Number of cases for the conditional probabilities: $2^n$ → $2n$
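To see how quickly the two counts diverge, the comparison on this slide can be tabulated for a few values of n. This sketch counts cases per class label for binary features; the function names are made up for the example.

```python
def joint_cases(n):
    """Number of distinct value combinations of n binary features: 2^n."""
    return 2 ** n

def naive_cases(n):
    """Per-feature cases under conditional independence: 2 values x n features."""
    return 2 * n

for n in (2, 10, 20):
    print(n, joint_cases(n), naive_cases(n))
```

For the slide's case of 10 binary features this gives 1024 combinations for the full joint against 20 per-feature cases under the independence assumption.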

Page 8

Naïve Bayes

• Naïve Bayes

$$
\begin{aligned}
P(X_1, X_2, \dots, X_n \mid C) &= P(X_1 \mid X_2, \dots, X_n, C)\, P(X_2, \dots, X_n \mid C) && \text{(probability chain rule)} \\
&= P(X_1 \mid C)\, P(X_2, \dots, X_n \mid C) && \text{(conditional independence)} \\
&= P(X_1 \mid C)\, P(X_2 \mid C) \cdots P(X_n \mid C) \\
&= \prod_i P(X_i \mid C)
\end{aligned}
$$

• MAP classification rule: for $x = (x_1, x_2, \dots, x_n)$, assign $x$ to $c^*$ if

$$[P(x_1 \mid c^*) \cdots P(x_n \mid c^*)]\, P(c^*) > [P(x_1 \mid c) \cdots P(x_n \mid c)]\, P(c), \quad c \neq c^*, \; c = c_1, \dots, c_L$$

Page 9

Example

• Example: Play Tennis

Page 10

Example

• Learning Phase

Outlook      Play=Yes  Play=No
Sunny        2/9       3/5
Overcast     4/9       0/5
Rain         3/9       2/5

Temperature  Play=Yes  Play=No
Hot          2/9       2/5
Mild         4/9       2/5
Cool         3/9       1/5

Humidity     Play=Yes  Play=No
High         3/9       4/5
Normal       6/9       1/5

Wind         Play=Yes  Play=No
Strong       3/9       3/5
Weak         6/9       2/5

P(Play=Yes) = 9/14
P(Play=No) = 5/14
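The learning phase is just relative-frequency counting. The sketch below shows one way to obtain such tables. Note the assumption: the transcript does not reproduce the training data, so the 14 rows in `DATA` are the commonly used Play Tennis table, included here because they yield the same counts as the tables above. The function `learn` and its return format are illustrative choices.

```python
from collections import Counter, defaultdict
from fractions import Fraction

FEATURES = ("Outlook", "Temperature", "Humidity", "Wind")

# ASSUMED training data (not in the transcript):
# (Outlook, Temperature, Humidity, Wind, Play)
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]

def learn(rows):
    """Estimate P(Play) and P(feature = value | Play) by relative frequency."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (feature, label) -> Counter of values
    for row in rows:
        label = row[-1]
        # zip stops after the four feature columns, skipping the label.
        for feature, value in zip(FEATURES, row):
            value_counts[(feature, label)][value] += 1
    priors = {c: Fraction(k, len(rows)) for c, k in class_counts.items()}
    likelihoods = {
        (feature, value, label): Fraction(count, class_counts[label])
        for (feature, label), counter in value_counts.items()
        for value, count in counter.items()
    }
    return priors, likelihoods

priors, likelihoods = learn(DATA)
print(priors["Yes"], likelihoods[("Outlook", "Sunny", "Yes")])
```

`Fraction` reduces its values, so an entry shown as 3/9 in the tables appears as 1/3 here. Combinations that never occur, such as Overcast with Play=No, are simply absent from `likelihoods`; a lookup with `likelihoods.get(key, Fraction(0))` treats them as 0/5.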

Page 11

Example

• Test Phase

• Given a new instance, predict its label

x' = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)

• Look up the tables obtained in the learning phase

P(Outlook=Sunny|Play=Yes) = 2/9

P(Temperature=Cool|Play=Yes) = 3/9

P(Humidity=High|Play=Yes) = 3/9

P(Wind=Strong|Play=Yes) = 3/9

P(Play=Yes) = 9/14

P(Outlook=Sunny|Play=No) = 3/5

P(Temperature=Cool|Play=No) = 1/5

P(Humidity=High|Play=No) = 4/5

P(Wind=Strong|Play=No) = 3/5

P(Play=No) = 5/14

• Decision making with the MAP rule

P(Yes|x') ∝ [P(Sunny|Yes)P(Cool|Yes)P(High|Yes)P(Strong|Yes)]P(Play=Yes) = 0.0053

P(No|x') ∝ [P(Sunny|No)P(Cool|No)P(High|No)P(Strong|No)]P(Play=No) = 0.0206

Since P(Yes|x') < P(No|x'), we label x' as "No".
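The arithmetic of the test phase can be reproduced with exact fractions. This is a minimal sketch: the probabilities are copied from the lookup values on this slide, while the dictionary layout and the `score` helper are illustrative choices rather than anything prescribed by the slides.

```python
from fractions import Fraction as F

# Values copied from the lookup tables for x'.
PRIOR = {"Yes": F(9, 14), "No": F(5, 14)}
LIKELIHOOD = {
    "Yes": {"Outlook=Sunny": F(2, 9), "Temperature=Cool": F(3, 9),
            "Humidity=High": F(3, 9), "Wind=Strong": F(3, 9)},
    "No": {"Outlook=Sunny": F(3, 5), "Temperature=Cool": F(1, 5),
           "Humidity=High": F(4, 5), "Wind=Strong": F(3, 5)},
}

def score(label):
    """Unnormalized posterior: product of P(x_i | label) times P(label)."""
    result = PRIOR[label]
    for p in LIKELIHOOD[label].values():
        result *= p
    return result

scores = {label: score(label) for label in PRIOR}
prediction = max(scores, key=scores.get)

# Dividing by the sum of the scores recovers the normalized posterior.
posterior_no = scores["No"] / sum(scores.values())

print({k: round(float(v), 4) for k, v in scores.items()}, prediction)
```

The two scores are unnormalized, so they do not sum to 1. Normalizing them gives a posterior of about 0.795 for "No", which leads to the same decision.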


Page 13

Q&A

Thank you.

박천음, 박찬민, 최재혁

𝑠𝑖𝑔𝑚𝑎 𝜶, Kangwon National University

Email: [email protected]