Graphical Models for Machine Learning and Computer Vision

Transcript
Page 1: Graphical Models for Machine Learning and Computer Vision

Graphical Models for Machine Learning and Computer Vision

Page 2: Graphical Models for Machine Learning and Computer Vision

Statistical Models

• Statistical models describe observed ‘DATA’ via an assumed likelihood:

L(DATA | Θ)

• with Θ denoting the ‘parameters’ needed to describe the data.
• Likelihoods measure how likely what was observed was; they implicitly assume an error mechanism (in the translation between what was observed and what was ‘supposed’ to be observed).
• Parameters may describe model features or even specify different models.

Page 3: Graphical Models for Machine Learning and Computer Vision

An Example of a Statistical Model

• A burglar alarm is affected by both earthquakes and burglaries. It has a mechanism to communicate with the homeowner if activated. It went off at Judea Pearl’s house one day. Should he:
• a) immediately call the police, under suspicion that a burglary took place, or
• b) go home and immediately transfer his valuables elsewhere?

Page 4: Graphical Models for Machine Learning and Computer Vision

A Statistical Analysis

• Observation: the burglar alarm went off (i.e., a = 1);
• Parameter 1: the presence or absence of an earthquake (i.e., e = 1, 0);
• Parameter 2: the presence or absence of a burglary at Judea’s house (i.e., b = 1, 0).

Page 5: Graphical Models for Machine Learning and Computer Vision

LIKELIHOODS/PRIORS IN THIS CASE

• The likelihood associated with the observation is:

L(DATA | Θ) = P(a = 1 | b, e)

• with b, e = 0, 1 (depending on whether a burglary or earthquake has taken place).
• The priors specify the probabilities of a burglary or earthquake happening:

P(b = 1) = ?;  P(e = 1) = ?

Page 6: Graphical Models for Machine Learning and Computer Vision

Example Probabilities

• Here are some probabilities indicating something about the likelihood and prior:

P(b = 0) = .9;  P(b = 1) = .1;  P(e = 0) = .9;  P(e = 1) = .1;
P(a = 1 | e = b = 0) = .001;  P(a = 1 | b = 1, e = 0) = .368;
P(a = 1 | e = 1, b = 0) = .135;  P(a = 1 | b = e = 1) = .607

Page 7: Graphical Models for Machine Learning and Computer Vision

LIKELIHOOD/PRIOR INTERPRETATION

• Burglaries are as likely (a priori) as earthquakes.
• It is unlikely that the alarm just went off by itself.
• The alarm goes off more often when a burglary happens but an earthquake does not than (the reverse) when an earthquake happens but a burglary does not.
• If both a burglary and an earthquake happen, then it is (virtually) twice as likely that the alarm will go off.

Page 8: Graphical Models for Machine Learning and Computer Vision

Probability Propagation Graph

Page 9: Graphical Models for Machine Learning and Computer Vision

PROBABILITY PROPAGATION

• There are two kinds of probability propagation (see Frey 1998):
a) marginalization, i.e., Σ_b P(b → B), and
b) multiplication, i.e., Π_b P(b → B).
• Marginalization sums over terms leading into the node; multiplication multiplies over terms leading into the node.
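As a small concrete illustration, here is how the two operations combine on the alarm network to produce the burglary posterior (a minimal sketch using the numbers from the "Example Probabilities" slide; the variable names are ours, not Frey's):

```python
# Messages on the alarm network: marginalize over e, then multiply into b.
# Probabilities are the ones from the "Example Probabilities" slide.
P_b = {1: 0.1, 0: 0.9}                                # prior P(b)
P_e = {1: 0.1, 0: 0.9}                                # prior P(e)
P_a1 = {(0, 0): 0.001, (1, 0): 0.368,                  # P(a=1 | b, e)
        (0, 1): 0.135, (1, 1): 0.607}

# Marginalization: sum over the terms leading into the node
# (the message from the alarm into b, with e summed out).
msg_a_to_b = {b: sum(P_a1[(b, e)] * P_e[e] for e in (0, 1)) for b in (0, 1)}

# Multiplication: multiply the terms leading into the node.
unnormalized = {b: P_b[b] * msg_a_to_b[b] for b in (0, 1)}

Z = sum(unnormalized.values())
posterior_b = {b: w / Z for b, w in unnormalized.items()}
print(posterior_b)  # b=1 gets ~0.751, b=0 gets ~0.249
```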

Page 10: Graphical Models for Machine Learning and Computer Vision

CAUSAL ANALYSIS

• To analyze the causes of the alarm going off, we calculate the probability that it was a burglary (in this case) and compare it with the probability that it was an earthquake:

P(b = 1 | a = 1) ∝ P(b = 1) Σ_e P(a = 1 | e, b = 1) P(e)
= .1 * (.368 * .9 + .607 * .1) = .1 * .3919

Page 11: Graphical Models for Machine Learning and Computer Vision

CAUSAL ANALYSIS II

• So, after normalization:

P(b = 1 | a = 1) = .751

• Similarly,

P(e = 1 | a = 1) = .349

• So, if we had to choose between burglary and earthquake as a cause of making the alarm go off, we should choose burglary.
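These posteriors can be checked by brute-force enumeration of the four (b, e) configurations; a minimal sketch (the variable names are ours):

```python
# Verify the posteriors by enumerating the joint P(b, e, a=1).
P_b = {1: 0.1, 0: 0.9}
P_e = {1: 0.1, 0: 0.9}
P_a1 = {(0, 0): 0.001, (1, 0): 0.368, (0, 1): 0.135, (1, 1): 0.607}  # P(a=1|b,e)

joint = {(b, e): P_b[b] * P_e[e] * P_a1[(b, e)] for b in (0, 1) for e in (0, 1)}
Z = sum(joint.values())  # = P(a = 1)

p_b1 = sum(w for (b, e), w in joint.items() if b == 1) / Z
p_e1 = sum(w for (b, e), w in joint.items() if e == 1) / Z
print(round(p_b1, 3), round(p_e1, 3))  # 0.751 0.349
```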

Page 12: Graphical Models for Machine Learning and Computer Vision

Markov Chain Monte Carlo for the Burglar Problem

• For the current value e = e*, calculate

P(b = 1 | a = 1, e = e*), P(b = 0 | a = 1, e = e*)

• or

P(A → b | e = e*) P(B → b | e = e*)

• Simulate b from this distribution. Call the result b*. Now calculate:

P(e = 1 | b = b*, a = 1), P(e = 0 | b = b*, a = 1)

• or

P(A → e | b*) * P(E → e | b*)
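A minimal Gibbs-sampling sketch of this scheme (our own illustrative code; each conditional follows from Bayes' rule applied to the probability table above):

```python
import random

P_b = {1: 0.1, 0: 0.9}
P_e = {1: 0.1, 0: 0.9}
P_a1 = {(0, 0): 0.001, (1, 0): 0.368, (0, 1): 0.135, (1, 1): 0.607}  # P(a=1|b,e)

def sample_b(e):
    # P(b | a=1, e) ∝ P(b) P(a=1 | b, e)
    w = {b: P_b[b] * P_a1[(b, e)] for b in (0, 1)}
    return 1 if random.random() < w[1] / (w[0] + w[1]) else 0

def sample_e(b):
    # P(e | a=1, b) ∝ P(e) P(a=1 | b, e)
    w = {e: P_e[e] * P_a1[(b, e)] for e in (0, 1)}
    return 1 if random.random() < w[1] / (w[0] + w[1]) else 0

e, b_hits, e_hits, n = 0, 0, 0, 20000
for _ in range(n):
    b = sample_b(e)   # draw b given the current e*
    e = sample_e(b)   # draw e given the new b*
    b_hits += b
    e_hits += e
print(b_hits / n, e_hits / n)  # ≈ .751 and ≈ .349
```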

Page 13: Graphical Models for Machine Learning and Computer Vision

Independent Hidden Variables: A Factorial Model

• In statistical modeling it is often advantageous to treat variables which are not observed as ‘hidden’. This means that they themselves have distributions. In our case suppose b and e are independent hidden variables:

• Then optimally:

P(b = 1) = β;  P(b = 0) = 1 - β;
P(e = 1) = ε;  P(e = 0) = 1 - ε

P(b = 1 | a = 1) = .951
P(e = 1 | a = 1) = .186

Page 14: Graphical Models for Machine Learning and Computer Vision

Nonfactorial Hidden Variable Models

• Suppose b and e are dependent hidden variables:

• Then a similar analysis yields a related result

P(b = 1, e = 1) = p_{1,1};  P(b = 1, e = 0) = p_{1,0};
P(b = 0, e = 1) = p_{0,1};  P(b = 0, e = 0) = 1 - p_{1,1} - p_{1,0} - p_{0,1}

Page 15: Graphical Models for Machine Learning and Computer Vision

INFORMATION

• The difference in information available from parameters after observing the alarm versus before the alarm was observed is:

• This is the Kullback-Leibler ‘distance’ between the prior and posterior distributions.

• Parameters are chosen to optimize this distance.

I(β, ε) = D(Q ‖ P) = Σ_{b,e} L(b, e | β, ε) log [ L(b, e | β, ε) / L(b, e, a = 1) ]

Page 16: Graphical Models for Machine Learning and Computer Vision

INFORMATION IN THIS EXAMPLE

• The information available in this example is calculated using:

L(b, e | β, ε) = β^b (1 - β)^{1-b} ε^e (1 - ε)^{1-e}
L(b, e, a = 1) = P(a = 1 | b, e) · .9^{1-b} · .1^b · .1^e · .9^{1-e}

so that

I(β, ε) = -H(β) - H(ε) + ⟨ -log P(a = 1 | b, e) - (b + e) log(.1) - (2 - b - e) log(.9) ⟩

where the expectation ⟨·⟩ is taken over L(b, e | β, ε) and H denotes the binary entropy.
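Read this way, choosing β and ε to minimize I(β, ε) is a mean-field (factorial) variational fit to the posterior. A minimal coordinate-update sketch under that interpretation (the code and names are ours; small numerical differences from the slide values come from rounding in the probability table):

```python
import math

P_a1 = {(0, 0): 0.001, (1, 0): 0.368, (0, 1): 0.135, (1, 1): 0.607}  # P(a=1|b,e)
prior = {1: 0.1, 0: 0.9}                                             # P(b), P(e)

def log_joint(b, e):
    # log L(b, e, a=1) = log P(a=1|b,e) + log P(b) + log P(e)
    return math.log(P_a1[(b, e)]) + math.log(prior[b]) + math.log(prior[e])

beta, eps = 0.5, 0.5
for _ in range(50):  # alternate the two mean-field coordinate updates
    # log q(b) = E_e[log L(b, e, a=1)] + const, with e ~ Bernoulli(eps)
    lb = {b: eps * log_joint(b, 1) + (1 - eps) * log_joint(b, 0) for b in (0, 1)}
    beta = 1.0 / (1.0 + math.exp(lb[0] - lb[1]))
    # log q(e) = E_b[log L(b, e, a=1)] + const, with b ~ Bernoulli(beta)
    le = {e: beta * log_joint(1, e) + (1 - beta) * log_joint(0, e) for e in (0, 1)}
    eps = 1.0 / (1.0 + math.exp(le[0] - le[1]))

print(round(beta, 3), round(eps, 3))  # ≈ .95 and ≈ .19 (slide 13: .951, .186)
```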

Page 17: Graphical Models for Machine Learning and Computer Vision

Markov Random Fields

• Markov Random Fields are simply graphical models set in a 2- or higher-dimensional field. Their fundamental criterion is that the distribution of a point x conditional on all of those that remain (i.e., -x) is identical to its distribution given a neighborhood N_x of it:

L(x | -x) = L(x | N_x)

Page 18: Graphical Models for Machine Learning and Computer Vision

EXAMPLE OF A RANDOM FIELD

• Modeling a video frame is typically done via a random field. Parameters identify our expectations of what the frame looks like.

• We can ‘clean up’ video frames or related media using a methodology which distinguishes between what we expect and what was observed.
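The slides do not commit to a particular field; a standard concrete illustration of this ‘clean-up’ idea is binary image denoising under an Ising-style MRF prior, optimized by iterated conditional modes (a minimal sketch; the function name and parameter values are our own):

```python
import random

def denoise_icm(noisy, n_sweeps=5, coupling=1.0, fidelity=2.0):
    """Binary denoising with an Ising-style MRF prior via iterated conditional modes.

    noisy: 2D list with pixels in {-1, +1}. `coupling` rewards agreement with
    the 4-neighborhood N_x (what we expect); `fidelity` rewards agreement
    with the observed pixel (what was observed).
    """
    h, w = len(noisy), len(noisy[0])
    x = [row[:] for row in noisy]  # initialize the estimate at the data
    for _ in range(n_sweeps):
        for i in range(h):
            for j in range(w):
                nbrs = sum(x[i2][j2]
                           for i2, j2 in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                           if 0 <= i2 < h and 0 <= j2 < w)
                # Pick the value s that maximizes the local conditional score;
                # by the Markov property it depends only on N_x and the data.
                x[i][j] = max((-1, 1),
                              key=lambda s: coupling * s * nbrs + fidelity * s * noisy[i][j])
    return x

# Toy usage: a flat +1 'frame' with roughly 10% of its pixels flipped by noise.
random.seed(0)
noisy = [[-1 if random.random() < 0.1 else 1 for _ in range(16)] for _ in range(16)]
restored = denoise_icm(noisy)
print(sum(p == 1 for row in restored for p in row), "of 256 pixels set to +1")
```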

Page 19: Graphical Models for Machine Learning and Computer Vision

GENERALIZATION

• This can be generalized to non-discrete likelihoods with non-discrete parameters.
• More generally (sans data), assume that a movie (consisting of many frames, each of which consists of grey-level pixel values over a lattice) is observed. We would like to ‘detect’ ‘unnatural’ events.

Page 20: Graphical Models for Machine Learning and Computer Vision

GENERALIZATION II

• Assume a model for frame i (given frame i-1) taking the form

L(Frame[i] | Θ, Frame[i-1])

• The parameters Θ typically denote invariant features for pictures of cars, houses, etc.
• The presence or absence of unnatural events can be described by hidden variables.
• The (frame) likelihood describes the natural evolution of the movie over time.
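The slides leave the frame model unspecified; as one deliberately simple stand-in (entirely our own illustration), take a per-pixel Gaussian transition likelihood and a hidden per-frame indicator that flags frames whose likelihood under ‘natural’ evolution falls too low:

```python
import math

def frame_log_likelihood(frame, prev_frame, sigma=5.0):
    # log L(Frame[i] | Theta, Frame[i-1]) for an illustrative model in which
    # each pixel is its previous value plus Gaussian noise (sigma plays Theta).
    ss = sum((x - y) ** 2 for x, y in zip(frame, prev_frame))
    return -0.5 * ss / sigma ** 2 - len(frame) * math.log(sigma * math.sqrt(2 * math.pi))

def flag_unnatural(frames, threshold=-5.0):
    # Hidden variable u[i] = 1 ('unnatural event') when frame i's average
    # per-pixel log-likelihood given frame i-1 drops below the threshold.
    flags = [0]
    for prev, cur in zip(frames, frames[1:]):
        flags.append(1 if frame_log_likelihood(cur, prev) / len(cur) < threshold else 0)
    return flags

# Toy movie: three steady frames, then an abrupt change at frame 3.
frames = [[100.0] * 64, [101.0] * 64, [99.0] * 64, [180.0] * 64]
print(flag_unnatural(frames))  # [0, 0, 0, 1]
```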

Page 21: Graphical Models for Machine Learning and Computer Vision

GENERALIZATION III

• Parameters are estimated by optimizing the information they provide. This is accomplished by ‘summing or integrating over’ the hidden variables.