Markov Random Fields Presented by: Vladan Radosavljevic
Feb 11, 2016
Outline
• Intuition
• Simple Example
• Theory
• Simple Example - Revisited
• Application
• Summary
• References
Intuition
• Simple example
• Observation
– noisy image – pixel values are -1 or +1
• Objective
– recover the noise-free image
Intuition
• An idea
• Represent pixels as random variables
• y - observed variable
• x - hidden variable
• x and y are binary variables (-1 or +1)
• Question:
– Is there any relation among those variables?
Intuition
• Building a model
• Values of observed and original pixels should be correlated (small level of noise) - make connections!
• Values of neighboring pixels should be correlated (large homogeneous areas, objects) - make connections!
Final model
Intuition
• Why do we need a model?
• y is given, x has to be found
• The objective is to find an image x that maximizes p(x|y)
• Use the model:
– penalize connected pairs in the model that have opposite signs, as they are not correlated
• Assume the distribution
p(x, y) ∝ exp(-E)
E = -β Σ_(i,j) x_i x_j - η Σ_i x_i y_i
where the first sum runs over all pairs of connected nodes and the second over corresponding hidden-observed pairs
• If x_i and x_j have the same sign, the probability will be higher
• The same holds for x_i and y_i
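This energy can be sketched on a tiny hypothetical example (a 2x2 image with β = η = 1; these values are illustrative, not from the slides) to confirm that configurations with agreeing neighbors get lower energy, hence higher probability:

```python
import numpy as np

def energy(x, y, beta=1.0, eta=1.0):
    """E = -beta * sum over neighbor pairs x_i x_j - eta * sum_i x_i y_i."""
    # horizontal and vertical 4-neighbor pairs of the image grid
    pair_term = np.sum(x[:, :-1] * x[:, 1:]) + np.sum(x[:-1, :] * x[1:, :])
    data_term = np.sum(x * y)
    return -beta * pair_term - eta * data_term

y = np.array([[1, 1], [1, -1]])        # observed noisy image
x_smooth = np.array([[1, 1], [1, 1]])  # homogeneous hypothesis
x_noisy = y.copy()                     # keep the flipped pixel

# the smooth image has lower energy despite disagreeing with one observation
print(energy(x_smooth, y))  # -6.0
print(energy(x_noisy, y))   # -4.0
```

Flipping the one disagreeing pixel loses 1 unit from the data term but gains 2 from the neighbor pairs, so the smooth image wins.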
Intuition
• How to find an image x which maximizes the probability p(x|y)?
• Assume x = y
• Take one node x_i at a time and evaluate E for x_i = +1 and x_i = -1
• Set x_i to the value with the lower E (higher probability)
• Iterate through all nodes until convergence
• This method finds a local optimum
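The coordinate-wise procedure above (a form of iterated conditional modes) can be sketched as follows; the 4-neighborhood, the β and η values, and the iteration cap are illustrative assumptions. For each pixel, only the terms of E involving x_ij change, so x_ij should take the sign of β·(neighbor sum) + η·y_ij:

```python
import numpy as np

def icm_denoise(y, beta=2.0, eta=1.0, max_iters=10):
    """Greedily set each pixel to the value (+1 or -1) that minimizes
    the energy, sweeping the image until no pixel changes."""
    x = y.copy()
    H, W = x.shape
    for _ in range(max_iters):
        changed = False
        for i in range(H):
            for j in range(W):
                # sum of the 4-connected neighbors of pixel (i, j)
                nb = 0
                if i > 0:     nb += x[i - 1, j]
                if i < H - 1: nb += x[i + 1, j]
                if j > 0:     nb += x[i, j - 1]
                if j < W - 1: nb += x[i, j + 1]
                # pick the state with the lower local energy
                best = 1 if beta * nb + eta * y[i, j] >= 0 else -1
                if best != x[i, j]:
                    x[i, j] = best
                    changed = True
        if not changed:  # local optimum reached
            break
    return x

y = np.ones((5, 5), dtype=int)
y[2, 2] = -1                 # one pixel corrupted by noise
print(icm_denoise(y))        # the flipped pixel is restored to +1
```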
Intuition
• Result
Theory
• Graphical models - a general framework for representing and manipulating joint distributions defined over sets of random variables
• Each variable is associated with a node in a graph
• Edges in the graph represent dependencies between random variables
– Directed graphs represent causative relationships (Bayesian Networks)
– Undirected graphs represent correlative relationships (Markov Random Fields)
• Representational aspect: efficiently represent complex independence relations
• Computational aspect: efficiently infer information about data using independence relations
Theory
• Recall
• If there are M variables in the model, each with K possible states, a straightforward inference algorithm will be exponential in the size of the model (K^M)
• However, inference algorithms (whether computing distributions, expectations, etc.) can exploit the structure of the graph for efficient computation
Theory
• How to use information from the structure?
• Markov property: if all paths that connect nodes in set A to nodes in set B pass through nodes in set C, then we say that A and B are conditionally independent given C:
p(A, B|C) = p(A|C) p(B|C)
• The main idea is to factorize the joint probability, then use the sum and product rules for efficient computation
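The Markov property can be verified numerically on a toy three-node chain A - C - B, where every path from A to B passes through C (the agreement-favoring pairwise potential here is a hypothetical choice):

```python
import itertools
import numpy as np

# Chain A - C - B: joint p(a, b, c) proportional to psi(a, c) * psi(c, b)
def psi(u, v):
    return np.exp(0.5 * u * v)   # favors neighbors with the same sign

states = [-1, 1]
p = {(a, b, c): psi(a, c) * psi(c, b)
     for a, b, c in itertools.product(states, repeat=3)}
Z = sum(p.values())              # partition function
p = {k: v / Z for k, v in p.items()}

def marg(pred):
    """Sum the joint over all configurations matching a predicate."""
    return sum(v for k, v in p.items() if pred(*k))

# Check p(A, B | C) = p(A | C) * p(B | C) for every configuration
for a, b, c in itertools.product(states, repeat=3):
    pc = marg(lambda A, B, C: C == c)
    lhs = p[(a, b, c)] / pc
    rhs = (marg(lambda A, B, C: A == a and C == c) / pc) * \
          (marg(lambda A, B, C: B == b and C == c) / pc)
    assert abs(lhs - rhs) < 1e-12
```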
Theory
• General factorization
• If two nodes x_i and x_k are not connected, then they have to be conditionally independent given all other nodes
• There is no link between those two nodes, and all other links pass through nodes that are observed
• This can be expressed as
p(x_i, x_k | x_rest) = p(x_i | x_rest) p(x_k | x_rest), where x_rest denotes all the other nodes
• Therefore, the joint distribution must be factorized such that unconnected nodes do not appear in the same factor
• This leads to the concept of a clique: a subset of nodes such that all pairs of nodes in the subset are connected
• The factors are defined as functions on the possible cliques
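Clique enumeration can be sketched by brute force on a small hypothetical graph (nodes 1-4; nodes 1, 2, 3 mutually connected, node 4 attached to node 3 only):

```python
from itertools import combinations

edges = {(1, 2), (1, 3), (2, 3), (3, 4)}
nodes = {1, 2, 3, 4}

def adjacent(u, v):
    return (u, v) in edges or (v, u) in edges

def is_clique(subset):
    # every pair inside the subset must be connected
    return all(adjacent(u, v) for u, v in combinations(subset, 2))

cliques = [set(s) for r in range(2, len(nodes) + 1)
           for s in combinations(sorted(nodes), r) if is_clique(s)]
print(cliques)  # [{1, 2}, {1, 3}, {2, 3}, {3, 4}, {1, 2, 3}]
```

The maximal cliques here are {1, 2, 3} and {3, 4}; factorizing over maximal cliques is enough, since a potential over a clique can absorb potentials over its sub-cliques.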
Theory
• Example
• Factorization
p(x) = (1/Z) Π_C ψ_C(x_C)
where ψ_C(x_C) is the potential function on the clique C, and Z = Σ_x Π_C ψ_C(x_C) is the partition function
Theory
• Since potential functions have to be strictly positive, we can define them as:
ψ_C(x_C) = exp(-E(x_C))
(there is also a theorem, the Hammersley-Clifford theorem, that proves the correspondence between this distribution and Markov Random Fields; E is the energy function)
• Recall
E = -β Σ_(i,j) x_i x_j - η Σ_i x_i y_i
Theory
• Why is this useful?
• Inference algorithms can take advantage of such a representation to significantly increase computational efficiency
• Example: inference on a chain of N nodes, each with K states
• Finding a marginal distribution directly requires ~ K^N operations
Theory
• If we rearrange the order of summations and multiplications, the cost drops to ~ N K^2
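The rearrangement can be demonstrated by comparing brute-force summation over all K^N configurations against forward/backward message passing, which pushes each sum inside the products (the random pairwise potentials here are placeholders):

```python
import itertools
import numpy as np

K, N = 3, 6                       # K states per node, chain of N nodes
rng = np.random.default_rng(0)
psi = rng.random((N - 1, K, K))   # psi[n][i, j]: potential between node n (state i) and n+1 (state j)

def brute_marginal(node):
    """Sum over all K**N configurations - exponential cost."""
    p = np.zeros(K)
    for x in itertools.product(range(K), repeat=N):
        w = np.prod([psi[n][x[n], x[n + 1]] for n in range(N - 1)])
        p[x[node]] += w
    return p / p.sum()

def chain_marginal(node):
    """Forward/backward message passing - O(N K^2) cost."""
    fwd = np.ones(K)
    for n in range(node):                  # messages arriving from the left
        fwd = psi[n].T @ fwd
    bwd = np.ones(K)
    for n in range(N - 2, node - 1, -1):   # messages arriving from the right
        bwd = psi[n] @ bwd
    p = fwd * bwd
    return p / p.sum()

# both routes give the same marginal distribution
assert np.allclose(brute_marginal(2), chain_marginal(2))
```

Each `psi[n].T @ fwd` is one K x K matrix-vector product, done once per link, which is where the N K^2 count comes from.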
Application
Summary
• Advantages
– Graphical representation
– Computational efficiency
• Disadvantages
– Parameter estimation
– How to define a model?
– Computing probability is sometimes difficult
References
[1] Alexander T. Ihler, Sergey Kirshner, Michael Ghil, Andrew W. Robertson, and Padhraic Smyth, "Graphical models for statistical inference and data assimilation", Physica D: Nonlinear Phenomena, Volume 230, Issues 1-2, June 2007, Pages 72-87.