Sum-product and related algorithms for inference

Post on 11-Sep-2021


Factor GraphsAlgorithms

Motivation

JPDF

Factorizationof JPDF

Graphicalmodels

Sum-ProductAlgorithm

Sum-product and related algorithms for inference

Manuel Yguel (1)

Person in charge: Olivier Aycard (2)

prenom.nom@inrialpes.fr

(1) Institut National Polytechnique de Grenoble
(2) Université Joseph Fourier, Grenoble

Master II, IVR, 3I, I.C.A.


Outline

1 Motivation

2 JPDF
  Definitions
  Definitions and rules for JPDF

3 Factorization of JPDF
  Product rule
  Independencies

4 Graphical models
  Bayesian networks
  Factor Graphs

5 Sum-Product Algorithm
  Single marginal function
  Marginal for a chain
  Marginal for a tree


Probabilistic modelling

The modelling of phenomena is almost surely uncertain:

• communication between entities is subject to random perturbations,
• sensor records are uncertain: pixels of an image, range measurements of a laser range-finder, etc.
• knowledge is approximate: camera extrinsic and intrinsic parameters, sensor or robot localization, goals of people, etc.
• algorithms are approximate: approximations for real-time, first-order approximations for optimization and control, numerical precision, etc.


Framework

• $x_1, x_2, \dots, x_n$ is a set of variables,
• for all $i$, $x_i$ takes values in some (usually finite) domain (or alphabet) $A_i$,
• let $g(x_1, \dots, x_n)$ be a $[0; 1]$-valued function of $x_1, \dots, x_n$; $g$ is called the joint probability distribution function (JPDF),
• the domain of $g$ is $S = A_1 \times A_2 \times \dots \times A_n$ and is called the configuration space,
• each element of $S$ is a particular configuration of the variables, also called an event.


Example: the robot start problem

The robot does not start. The possible causes are:

1 the battery is down,
2 a wire is disconnected.

Furthermore, observations can be made on the battery voltage. 4 variables can be defined:

Variable          Alphabet or domain
Start?            {yes, no}
Power State?      {up, down}
Connected?        {connected, disconnected}
Voltage Measure   {[iV; (i + 1)V[ | i = 0, ..., 199}

e = (no, up, disconnected, [24V; 25V[) is a configuration of the 4 variables.
g(e) ∈ [0; 1] is defined for every possible event; it is also written P(e), a probability.


Variable partition

For each problem, the set of variables is partitioned into three subsets:

1 the set of questioned variables Q,
2 the set of known variables K (possibly empty),
3 the set of unknown variables U (possibly empty).


Example: the robot start problem

Power State evaluation:

1 questioned variables Q = {Power State?},
2 known variables K = {Start?, Voltage Measure},
3 unknown variables U = {Connected?}.

Connection evaluation:

1 questioned variables Q = {Connected?},
2 known variables K = {Start?, Voltage Measure},
3 unknown variables U = {Power State?}.


Goal

The goal of a probabilistic model is to calculate the conditional JPDF

$$B := P(x_{q(1)}, \dots, x_{q(p)} \mid x_{k(1)}, \dots, x_{k(q)})$$

where, for all $(i, j)$, $x_{q(i)} \in Q$ and $x_{k(j)} \in K$.

It is a set of functions, each one indexed by a different configuration of the known variables:

$$(a_{k(1)}, \dots, a_{k(q)}) \longmapsto \left(\text{pdf} : A_{q(1)} \times \dots \times A_{q(p)} \longrightarrow [0; 1]\right)$$


Example: the robot start problem

Power State evaluation:

$$B_1 := P(PS \mid St, VM)$$

The variables Start?, Power State?, Connected?, Voltage Measure are abbreviated St, PS, C, VM respectively.

For each configuration of Start? and Voltage Measure (∈ {yes, no} × [0.0V; 200V]), it defines a probability function over the possible values of Power State?.

[Figure: bar chart of the distribution that the configuration (no, 24V) maps to, over the values up and down of Power State?; vertical axis from 0 to 1.]


Use of probabilistic definitions

              brown hair   blond hair   red hair   light brown hair
brown eyes       22%           5%          3%           15%
blue eyes         8%          11%          9%            7%
green eyes        6%           2%          6%            6%


Marginal probability: calculating the probability of having blond hair.

$$P(\text{blond hair}) = \sum_{\text{eye color}} P(\text{blond hair}, \text{eye color}) = 5\% + 11\% + 2\% = 18\%$$


Conditional probability of eye color given blond hair: P(eye color | blond hair).

              blond hair
brown eyes     5% / 18%
blue eyes     11% / 18%
green eyes     2% / 18%
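The marginal and the conditional computed from the hair/eye table can be reproduced with a few lines of Python. This is only a sketch: the names `JOINT`, `marginal_hair` and `conditional_eyes_given_hair` are chosen here for illustration, and the table is stored in percent as on the slide.

```python
from fractions import Fraction

# Joint distribution P(eye color, hair color) from the table, in percent.
HAIR = ["brown", "blond", "red", "light brown"]
JOINT = {
    "brown eyes": [22, 5, 3, 15],
    "blue eyes": [8, 11, 9, 7],
    "green eyes": [6, 2, 6, 6],
}

def marginal_hair(hair):
    """Sum rule: sum the joint over every eye color."""
    col = HAIR.index(hair)
    return sum(row[col] for row in JOINT.values())

def conditional_eyes_given_hair(hair):
    """Conditional probability: joint divided by the marginal of the known variable."""
    col = HAIR.index(hair)
    total = marginal_hair(hair)
    return {eyes: Fraction(row[col], total) for eyes, row in JOINT.items()}

blond = marginal_hair("blond")                      # in percent
eyes_given_blond = conditional_eyes_given_hair("blond")
print(blond, eyes_given_blond)
```

Using `Fraction` keeps the ratios 5/18, 11/18 and 2/18 exact, so the conditional distribution sums to exactly 1.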


Probabilistic definitions

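Conditional probability:

$$P(x_{q(1)}, \dots, x_{q(p)} \mid x_{k(1)}, \dots, x_{k(q)}) = \frac{P(x_{q(1)}, \dots, x_{q(p)}, x_{k(1)}, \dots, x_{k(q)})}{P(x_{k(1)}, \dots, x_{k(q)})}$$

Marginalization (also called sum rule):

• $$P(x_{q(1)}, \dots, x_{q(p)}, x_{k(1)}, \dots, x_{k(q)}) = \sum_{(a_{u(1)}, \dots, a_{u(r)}) \in A_{u(1)} \times \dots \times A_{u(r)}} g(x_1, \dots, x_n)$$

• $$P(x_{k(1)}, \dots, x_{k(q)}) = \sum_{(a_{q(1)}, \dots, a_{q(p)}) \in A_{q(1)} \times \dots \times A_{q(p)}} \; \sum_{(a_{u(1)}, \dots, a_{u(r)}) \in A_{u(1)} \times \dots \times A_{u(r)}} g(x_1, \dots, x_n)$$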


Expanded expression of the goal function

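Following the conditional and marginal probability definitions, B equals:

$$B = \frac{\displaystyle\sum_{(a_{u(1)}, \dots, a_{u(r)})} g(x_1, \dots, x_n)}{\displaystyle\sum_{(a_{q(1)}, \dots, a_{q(p)})} \; \sum_{(a_{u(1)}, \dots, a_{u(r)})} g(x_1, \dots, x_n)}$$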


Example: the robot start problem

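Power State evaluation:

$$B_1 := P(PS \mid St, VM) = \frac{\displaystyle\sum_{c \in \{\text{connected}, \text{disconnected}\}} g(St, PS, c, VM)}{\displaystyle\sum_{ps \in \{\text{up}, \text{down}\}} \; \sum_{c \in \{\text{connected}, \text{disconnected}\}} g(St, ps, c, VM)}$$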


Worst inference complexity

If each variable takes values in a finite alphabet of size $K$, then $K^{p+r}$ sums are required, where $p$ and $r$ are the numbers of variables in the questioned and unknown sets respectively.

EXPONENTIAL COMPLEXITY in the number of variables.


Product rule

Let $o(i)$, $i \in \{1, \dots, n\}$, be any permutation of the variable indices. Then

$$g(x_1, \dots, x_n) = P(x_{o(1)}) \prod_{i=2}^{n} P(x_{o(i)} \mid x_{o(1)}, \dots, x_{o(i-1)})$$

(easy to demonstrate by recursion: just replace the conditional probabilities by their definition)

Example: the robot start problem

$$g(St, PS, C, VM) = P(PS)\, P(C \mid PS)\, P(VM \mid PS, C)\, P(St \mid PS, C, VM)$$
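The product rule can be checked numerically on a small joint table. The sketch below assumes three binary variables and a randomly generated joint `g` (both are illustrative choices, not part of the slides); each conditional is computed as a ratio of marginals, exactly as in the definition of conditional probability.

```python
import itertools
import random

random.seed(0)
STATES = (0, 1)
# Hypothetical joint table g(x1, x2, x3), normalized to sum to 1.
raw = {cfg: random.random() for cfg in itertools.product(STATES, repeat=3)}
total = sum(raw.values())
g = {cfg: v / total for cfg, v in raw.items()}

def prob_prefix(prefix):
    """P(x1..xk = prefix): sum g over the remaining variables (sum rule)."""
    k = len(prefix)
    return sum(p for cfg, p in g.items() if cfg[:k] == prefix)

def chain_product(cfg):
    """P(x1) * prod_i P(xi | x1..x(i-1)), each conditional being a ratio of marginals."""
    result = prob_prefix(cfg[:1])
    for i in range(2, len(cfg) + 1):
        result *= prob_prefix(cfg[:i]) / prob_prefix(cfg[:i - 1])
    return result

# Largest gap between the joint and its product-rule expansion.
max_error = max(abs(chain_product(cfg) - g[cfg]) for cfg in g)
print(max_error < 1e-12)
```

The product telescopes: every marginal in a denominator cancels the numerator of the previous factor, which is the recursion argument mentioned above.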


Probabilistic independence and conditional independence

Two variables $x_i$ and $x_j$ are said to be independent if and only if:

$$\forall (a_i, a_j) \in A_i \times A_j, \quad p(a_i, a_j) = p(a_i)\, p(a_j)$$

Two variables $x_i$ and $x_j$ are said to be conditionally independent given $x_k$ if and only if:

$$\forall (a_i, a_j, a_k) \in A_i \times A_j \times A_k, \quad p(a_i, a_j \mid a_k) = p(a_i \mid a_k)\, p(a_j \mid a_k)$$


Example: the robot start problem

Most of the time, many independencies or conditional independencies arise as reasonable hypotheses in a probabilistic model.

Reasonable hypotheses:

• Power State? and Connected? are independent,
• Voltage Measure and Connected? are conditionally independent given Power State?,
• Start? and Voltage Measure are conditionally independent given Power State? and Connected?.

The struck-out conditioning variables drop out of the product-rule factorization:

$$g(St, PS, C, VM) = P(PS)\, P(C \mid \cancel{PS})\, P(VM \mid PS, \cancel{C})\, P(St \mid PS, C, \cancel{VM})$$

$$= P(PS)\, P(C)\, P(VM \mid PS)\, P(St \mid PS, C)$$


the robot start problem

Simple substitutions give these simplifications from the hypotheses:

1 Power State? and Connected? are independent: $P(PS, C) = P(PS)\, P(C)$.

$$P(C \mid PS) = \frac{P(C, PS)}{P(PS)} = \frac{\cancel{P(PS)}\, P(C)}{\cancel{P(PS)}} = P(C)$$


2 Voltage Measure and Connected? are conditionally independent given Power State?: $P(VM, C \mid PS) = P(VM \mid PS)\, P(C \mid PS)$.

Use of the conditional probability definition:

$$P(VM \mid PS, C) = \frac{P(VM, PS, C)}{P(PS, C)}$$

Use of the product rule:

$$P(VM, PS, C) = P(PS)\, P(VM, C \mid PS) = P(PS)\, P(VM \mid PS)\, P(C \mid PS)$$


Substituting, then writing $P(C \mid PS) = P(C, PS) / P(PS)$:

$$P(VM \mid PS, C) = \frac{P(PS)\, P(VM \mid PS)\, P(C \mid PS)}{P(PS, C)}$$

$$P(VM \mid PS, C) = \frac{\cancel{P(PS)}\, P(VM \mid PS)}{\cancel{P(PS, C)}} \cdot \frac{\cancel{P(C, PS)}}{\cancel{P(PS)}}$$


P(VM|PS, C) = P(VM|PS)


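3 Start? and Voltage Measure are conditionally independent given Power State? and Connected?: $P(St, VM \mid PS, C) = P(St \mid PS, C)\, P(VM \mid PS, C)$. The derivation is the same as for hypothesis (2), with the pair (St, VM) in the role of (VM, C) and the group (PS, C) in the role of PS; it gives $P(St \mid PS, C, VM) = P(St \mid PS, C)$.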


Independencies immediate utility: memory gain

If each variable takes values in a finite alphabet of size $K$ and no independence assumption is made, $g(x_1, \dots, x_n)$ requires a grid of size $K^n$.

The memory size of $P(x_i \mid x_1, \dots, x_{i-1})$ is $K \times K^{i-1} = K^i$.

If $p$ conditional independence assumptions are made on this factor, its memory size reduces to $K \times K^{i-1-p} = K^{i-p}$.

                 no independencies      with independencies
factorization    g(St, PS, C, VM)       P(PS) P(C) P(VM|PS) P(St|PS, C)
memory size      2^3 × 200 = 1600       2 + 2 + 200 × 2 + 2 × 4 = 412
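The two memory sizes can be recomputed from the alphabet sizes alone. In this sketch, `SIZES` and `table_size` are names introduced here for illustration; the sizes themselves come from the robot start example (three binary variables and 200 voltage bins).

```python
from math import prod

# Alphabet sizes from the robot start example.
SIZES = {"St": 2, "PS": 2, "C": 2, "VM": 200}

def table_size(child, parents=()):
    """Number of entries needed to store P(child | parents)."""
    return SIZES[child] * prod(SIZES[p] for p in parents)

# Full joint table: one entry per configuration of the four variables.
full_joint = prod(SIZES.values())

# Factored form: P(PS) P(C) P(VM|PS) P(St|PS, C).
factored = (table_size("PS") + table_size("C")
            + table_size("VM", ["PS"]) + table_size("St", ["PS", "C"]))

print(full_joint, factored)
```

Most of the saving comes from the voltage variable: it appears in a single factor with one binary parent, instead of multiplying the size of every other table.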


Example: the robot start problem

[Figure: Bayesian network over the nodes Power State?, Voltage Measure, Start? and Connected?, built up one factor at a time.]

The graph is first built factor by factor, following the product rule:

P(PS) P(C|PS)
P(PS) P(C|PS) P(VM|PS, C)
Full graph: P(PS) P(C|PS) P(VM|PS, C) P(St|PS, C, VM)

Each independence hypothesis then removes a conditioning variable, and the corresponding edge:

P(PS) P(C) P(VM|PS, C) P(St|PS, C, VM)      (PS removed from P(C|PS))
P(PS) P(C) P(VM|PS) P(St|PS, C, VM)         (C removed from P(VM|PS, C))
P(PS) P(C) P(VM|PS) P(St|PS, C)             (VM removed from P(St|PS, C, VM))


Bayesian networks (BN): definition (1)

• BNs are Directed Acyclic Graphs (DAGs) that express a certain factorization of a JPDF.
• The graph has a polytree structure; since it is acyclic, it is possible to define an order $o$ over the nodes such that, if there is a directed path from $x_i$ to $x_j$ in the graph, then $o(j) > o(i)$. Let $x_1, \dots, x_n$ be ordered following $o$.


Bayesian networks: definition (2)

• $o$ is used in the factorization of the JPDF induced by the product rule:

$$g(x_1, \dots, x_n) = P(x_1) \prod_{i=2}^{n} P(x_i \mid x_1, \dots, x_{i-1})$$

If there is no edge from $x_k$ to $x_j$, then $x_k$ can be removed from the right-hand side of the factor $P(x_j \mid x_1, \dots, x_{j-1})$:

$$P(x_j \mid x_1, \dots, x_k, \dots, x_{j-1}) := P(x_j \mid x_1, \dots, \cancel{x_k}, \dots, x_{j-1})$$


Bayesian networks: definition (3)

• The JPDF is equivalently defined by the BN or by the following factorization:

$$g(x_1, \dots, x_n) := \prod_{j=1}^{n} P(x_j \mid pa_j)$$

where $pa_j$ is the set of parents of $x_j$: the set of variables $x_i$ such that there exists an edge from $x_i$ to $x_j$ in the BN.
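As an illustration of this factorization, the robot start network can be written as one table per node given its parents, and the Power State posterior $P(PS \mid St, VM)$ obtained by summing out Connected?. Everything numeric below is an assumption made for the sketch: the probabilities are invented, and the voltage is reduced to two coarse bins (`high`, `low`) instead of the 200 bins of the slides.

```python
# Hypothetical conditional probability tables (all numbers are made up).
P_PS = {"up": 0.9, "down": 0.1}
P_C = {"connected": 0.95, "disconnected": 0.05}
# Voltage reduced to two coarse bins instead of the 200 bins of the slides.
P_VM_given_PS = {"up": {"high": 0.8, "low": 0.2},
                 "down": {"high": 0.1, "low": 0.9}}
P_St_given_PS_C = {("up", "connected"): {"yes": 0.99, "no": 0.01},
                   ("up", "disconnected"): {"yes": 0.0, "no": 1.0},
                   ("down", "connected"): {"yes": 0.0, "no": 1.0},
                   ("down", "disconnected"): {"yes": 0.0, "no": 1.0}}

def g(st, ps, c, vm):
    """Joint as the product of each node's table given its parents."""
    return P_PS[ps] * P_C[c] * P_VM_given_PS[ps][vm] * P_St_given_PS_C[(ps, c)][st]

def power_state_posterior(st, vm):
    """P(PS | St, VM): sum out C (unknown), then normalize over PS (questioned)."""
    unnorm = {ps: sum(g(st, ps, c, vm) for c in P_C) for ps in P_PS}
    z = sum(unnorm.values())
    return {ps: v / z for ps, v in unnorm.items()}

posterior = power_state_posterior("no", "low")
print(posterior)
```

With these invented numbers, a robot that does not start and shows a low voltage makes the `down` state far more probable than its prior of 0.1.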


Interests and drawbacks of Bayesian networks

+ each term of the factorization is a probability distribution:
    1 clear semantics,
    2 normalized;
− does not represent every possible decomposition into probability distributions:
    $$P(A, B, C, D, E) = P(C)\, P(D \mid C)\, P(A, B \mid C, D)\, P(E \mid C, B);$$
− does not represent every possible factorization of the JPDF.


Factor Graphs

Hypothesis: $g(x_1, \dots, x_n)$ factors into a product of several local functions, each having some subset of $\{x_1, \dots, x_n\}$ as arguments:

$$g(x_1, \dots, x_n) = \prod_{j \in J} f_j(X_j)$$

where $J$ is a discrete index set, $X_j$ is a subset of $\{x_1, \dots, x_n\}$ and $f_j(X_j)$ is a function that depends only on the variables in $X_j$. If $X_j = \{v_1, \dots, v_p\}$, then $f_j(X_j) = f_j(v_1, \dots, v_p)$.

Factor graphs represent all possible factorizations of the JPDF.


Factor Graphs

Sometimes factor graphs for a JPDF are expressed as follows:

$$g(x_1, \dots, x_n) = \frac{1}{Z} \prod_{j \in J} f_j(X_j)$$

where $Z = \sum_{(x_1, \dots, x_n)} \prod_{j \in J} f_j(X_j)$, so that $g$ is normalized.

It is possible to consider a special factor node $f_0 = \frac{1}{Z}$ with $X_0 = \emptyset$, so that $f_0$ is not linked to any variable node.

In those cases, factors are not necessarily normalized anymore and thus are not necessarily probability distributions.
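The role of $Z$ can be seen on a toy model. The sketch below assumes three binary variables and two invented non-negative local functions `f1` and `f2` (they are not probability tables); `Z` is obtained by brute-force summation over the configuration space.

```python
import itertools

STATES = (0, 1)

# Hypothetical non-negative local functions; they are not probability tables.
def f1(x1, x2):
    return 2.0 if x1 == x2 else 1.0

def f2(x2, x3):
    return 3.0 if x2 == x3 else 0.5

def unnormalized(x1, x2, x3):
    """Product of the local functions, before normalization."""
    return f1(x1, x2) * f2(x2, x3)

# Z sums the product of factors over the whole configuration space.
Z = sum(unnormalized(*cfg) for cfg in itertools.product(STATES, repeat=3))

def g(x1, x2, x3):
    """Normalized joint: the constant 1/Z plays the role of the factor f0."""
    return unnormalized(x1, x2, x3) / Z

check = sum(g(*cfg) for cfg in itertools.product(STATES, repeat=3))
print(Z, check)
```

Computing `Z` this way costs one term per configuration, which is exactly the exponential cost that the sum-product algorithm is meant to avoid on chains and trees.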


Factor Graphs

Definition: a factor graph is a bipartite graph that expresses the structure of the factorization hypothesis. A factor graph has a variable node $x_i$ for each variable $x_i$ and a factor node $f_j$ for each local function $f_j$. The edges of the graph only connect a variable node to a factor node (bipartite property). A variable node $x_i$ is edge-connected to a factor node $f_j$ if and only if $x_i$ is an argument of $f_j$, i.e. $x_i \in X_j$.


Example: the robot start problem

[Figure: factor graph of the robot start problem, with variable nodes Voltage Measure, Power State?, Start?, Connected? and factor nodes P(VM|PS), P(PS), P(St|PS, C), P(C).]
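The bipartite structure of this example follows mechanically from the argument list of each factor. A minimal sketch, in which the factor labels and the helper `build_factor_graph` are names chosen here for illustration:

```python
# Factor scopes for the robot start factorization; the factor names are labels chosen here.
FACTOR_SCOPES = {
    "P(PS)": ("PS",),
    "P(C)": ("C",),
    "P(VM|PS)": ("VM", "PS"),
    "P(St|PS,C)": ("St", "PS", "C"),
}

def build_factor_graph(scopes):
    """Return the edge set and the neighbors of every variable node."""
    # One edge per (variable, factor) pair such that the variable is an argument.
    edges = {(var, name) for name, scope in scopes.items() for var in scope}
    neighbors = {}
    for var, name in edges:
        neighbors.setdefault(var, set()).add(name)
    return edges, neighbors

edges, var_neighbors = build_factor_graph(FACTOR_SCOPES)
print(sorted(var_neighbors["PS"]))
```

Every edge joins a variable to a factor, never two variables or two factors, so the bipartite property holds by construction.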


Bayesian inference for factor graphs: the sum-product algorithm

Reminder: the goal of a probabilistic model is to calculate the conditional JPDF

$$B := P(x_{q(1)}, \dots, x_{q(p)} \mid x_{k(1)}, \dots, x_{k(q)})$$

NOW: exploit the factorization of the JPDF to speed up Bayesian inference.


Factorization property

Idea: reorganize sums of products into products of sums, following the distributive law:

ab + ac = a(b + c)


2 MULT, 1 ADD become 1 MULT, 1 ADD


With more terms:

$$ab + ac + ad + ae + \dots + az = a(b + c + \dots + z)$$

25 MULT, 24 ADD become 1 MULT, 24 ADD


Marginal for a chain: chain definition (1)

A path without cycle, or chain, describes a JPDF in which each variable has at most one parent and at most one child, from a BN point of view. It leads to the following factorization, and hence factor graph:

$$g(x_1, \dots, x_n) = P(x_1) \prod_{j=2}^{n} P(x_j \mid x_{j-1}) = f_1(x_1)\, f_2(x_2, x_1) \cdots f_n(x_n, x_{n-1})$$


Marginal for a chain: chain definition (2)

[Figure: chain factor graph with variable nodes x1, ..., x6 and factor nodes f1, ..., f6.]

Definition: a chain without cycle is a sequence of vertices and edges in a graph:

$$c = v_0, e_1, v_1, e_2, \dots, v_{n-1}, e_n, v_n$$

such that the edge $e_i$ joins the vertices $v_{i-1}$ and $v_i$ and each vertex appears only once.


Marginal for a chain: sum rearrangements and message definition (1)

Example: $g(x_1, \dots, x_6) = f_1(x_1) \prod_{j=2}^{6} f_j(x_j, x_{j-1})$.

Marginal for $x_4$:

$$P(x_4) = \sum_{x_1, x_2, x_3, x_5, x_6} g(x_1, \dots, x_6) = \sum_{\sim\{x_4\}} g(x_1, \dots, x_6)$$

where the notation $\sim\{x_i\}$ stands for $x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n$, i.e. all variables except $x_i$.


Marginal for a chain: x4 is a pivot for factorization

The factors $f_5$ and $f_6$ do not depend on $x_1, x_2, x_3$, so their sum can be pulled out of the sum over $x_1, x_2, x_3$:

$$\sum_{\sim\{x_4\}} g(x_1, \dots, x_6) = \sum_{x_1, x_2, x_3} f_1(x_1)\, f_2(x_2, x_1)\, f_3(x_3, x_2)\, f_4(x_4, x_3) \left\{ \sum_{x_5, x_6} f_5(x_5, x_4)\, f_6(x_6, x_5) \right\}$$

$$= \left\{ \sum_{x_1, x_2, x_3} f_1(x_1)\, f_2(x_2, x_1)\, f_3(x_3, x_2)\, f_4(x_4, x_3) \right\} \times \left\{ \sum_{x_5, x_6} f_5(x_5, x_4)\, f_6(x_6, x_5) \right\} = \mu_\alpha(x_4) \times \mu_\beta(x_4)$$

The same rearrangement is then applied inside each bracket, one variable at a time:

$$\sum_{\sim\{x_4\}} g(x_1, \dots, x_6) = \left\{ \sum_{x_3} f_4(x_4, x_3) \left\{ \sum_{x_2} f_3(x_3, x_2) \left\{ \sum_{x_1} f_1(x_1)\, f_2(x_2, x_1) \right\} \right\} \right\} \times \left\{ \sum_{x_5} f_5(x_5, x_4) \left\{ \sum_{x_6} f_6(x_6, x_5) \right\} \right\}$$

[Figure: chain factor graph with variable nodes x1, ..., x6 and factor nodes f1, ..., f6; the brackets are labelled µα(x2), µα(x3), µα(x4) on the left of x4 and µβ(x5), µβ(x4) on the right.]


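Marginal for a chain: graphical definition of messages using recursion

$$\sum_{\sim\{x_4\}} g(x_1, \dots, x_6) = \left\{ \sum_{x_3} f_4(x_4, x_3) \left\{ \sum_{x_2} f_3(x_3, x_2) \left\{ \sum_{x_1} f_1(x_1)\, f_2(x_2, x_1) \right\} \right\} \right\} \times \left\{ \sum_{x_5} f_5(x_5, x_4) \left\{ \sum_{x_6} f_6(x_6, x_5) \right\} \right\}$$

[Figure: chain factor graph with variable nodes x1, ..., x6 and factor nodes f1, ..., f6, annotated with the messages µα(x2), µα(x3), µα(x4), µβ(x5) and µβ(x4).]

Each bracket is a message, defined recursively from the bracket nested inside it:

$$\mu_\alpha(x_2) = \sum_{x_1} f_1(x_1)\, f_2(x_2, x_1), \quad \mu_\alpha(x_3) = \sum_{x_2} f_3(x_3, x_2)\, \mu_\alpha(x_2), \quad \mu_\alpha(x_4) = \sum_{x_3} f_4(x_4, x_3)\, \mu_\alpha(x_3)$$

$$\mu_\beta(x_5) = \sum_{x_6} f_6(x_6, x_5), \quad \mu_\beta(x_4) = \sum_{x_5} f_5(x_5, x_4)\, \mu_\beta(x_5)$$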
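The recursion can be tried out on a small chain and compared with the brute-force sum. This is a sketch under stated assumptions: all six variables share an alphabet of size `K = 3`, the factor tables are random non-negative numbers (so the result is the unnormalized sum over all variables except $x_i$), and the names `marginal_by_messages` and `marginal_brute_force` are chosen here.

```python
import itertools
import random

random.seed(1)
K = 3   # alphabet size shared by all variables (an assumption of this sketch)
N = 6   # chain length, as in the x1..x6 example
# f[1] is a vector over x1; f[j][a][b] stands for f_j(x_j = a, x_{j-1} = b), j >= 2.
f = {1: [random.random() for _ in range(K)]}
for j in range(2, N + 1):
    f[j] = [[random.random() for _ in range(K)] for _ in range(K)]

def g(x):
    """Chain product f1(x1) * prod_j fj(xj, x(j-1)); x is a 0-indexed tuple."""
    value = f[1][x[0]]
    for j in range(2, N + 1):
        value *= f[j][x[j - 1]][x[j - 2]]
    return value

def marginal_by_messages(i):
    """Sum of g over all variables except x_i, using left and right messages."""
    alpha = f[1][:]                                  # message reaching x1 from f1
    for j in range(2, i + 1):                        # builds mu_alpha(x_j)
        alpha = [sum(f[j][a][b] * alpha[b] for b in range(K)) for a in range(K)]
    beta = [1.0] * K                                 # nothing to the right of xN
    for j in range(N, i, -1):                        # builds mu_beta(x_{j-1})
        beta = [sum(f[j][a][b] * beta[a] for a in range(K)) for b in range(K)]
    return [alpha[k] * beta[k] for k in range(K)]

def marginal_brute_force(i):
    """Same quantity, summing g over every configuration."""
    out = [0.0] * K
    for x in itertools.product(range(K), repeat=N):
        out[x[i - 1]] += g(x)
    return out

fast, slow = marginal_by_messages(4), marginal_brute_force(4)
max_gap = max(abs(a - b) for a, b in zip(fast, slow))
print(max_gap < 1e-9)
```

The message version touches each factor table once, while the brute-force version evaluates the full product for every one of the `K ** N` configurations.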


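Marginal for a chain: mathematical insight into the message operations (1)

• $\sum_{\sim\{x_4\}} g(x_1, \dots, x_6) = \mu_\alpha(x_4) \times \mu_\beta(x_4)$
• $x_4$ can take $K$ values: $\{a_4^1, \dots, a_4^K\}$
• the product of two messages is a vector:

$$\begin{bmatrix} \mu_\alpha(x_4)^1 \times \mu_\beta(x_4)^1 \\ \vdots \\ \mu_\alpha(x_4)^K \times \mu_\beta(x_4)^K \end{bmatrix}$$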


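Marginal for a chain: mathematical insight into the message operations (2)

• $\mu_\beta(x_4) = \sum_{x_5} f_5(x_5, x_4) \sum_{x_6} f_6(x_6, x_5) = \sum_{x_5} f_5(x_5, x_4)\, \mu_\beta(x_5)$
• $x_5$ can take $P$ values: $\{a_5^1, \dots, a_5^P\}$
• ($f_5(x_5, x_4)$ is discretized) the next message is obtained by a matrix-vector operation:

$$\begin{bmatrix} \mu_\beta(x_4)^1 \\ \vdots \\ \mu_\beta(x_4)^K \end{bmatrix} = \begin{bmatrix} f_5(a_5^1, a_4^1) & f_5(a_5^2, a_4^1) & \dots & f_5(a_5^P, a_4^1) \\ f_5(a_5^1, a_4^2) & f_5(a_5^2, a_4^2) & \dots & f_5(a_5^P, a_4^2) \\ \vdots & & \ddots & \vdots \\ f_5(a_5^1, a_4^K) & f_5(a_5^2, a_4^K) & \dots & f_5(a_5^P, a_4^K) \end{bmatrix} \begin{bmatrix} \mu_\beta(x_5)^1 \\ \mu_\beta(x_5)^2 \\ \vdots \\ \mu_\beta(x_5)^P \end{bmatrix}$$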
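The matrix-vector form of a message update can be written directly with nested lists. In this sketch the table `F5` and the incoming message are invented numbers, with $K = 2$ values for $x_4$ (rows) and $P = 3$ values for $x_5$ (columns); `message` is a helper name chosen here.

```python
def message(matrix, incoming):
    """mu_beta(x4)[k] = sum over p of F[k][p] * mu_beta(x5)[p]."""
    return [sum(row[p] * incoming[p] for p in range(len(incoming))) for row in matrix]

# Hypothetical 2 x 3 table: K = 2 values for x4 (rows), P = 3 values for x5 (columns).
# F5[k][p] stands for f5(x5 = a5^p, x4 = a4^k).
F5 = [[0.2, 0.5, 0.3],
      [0.6, 0.1, 0.3]]
mu_beta_x5 = [1.0, 2.0, 0.5]

mu_beta_x4 = message(F5, mu_beta_x5)
print(mu_beta_x4)
```

The outgoing message has one entry per value of $x_4$, whatever the size of the alphabet of $x_5$: the variable being summed out disappears from the result.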

44 / 64

Factor GraphsAlgorithms

Motivation

JPDF

Factorizationof JPDF

Graphicalmodels

Sum-ProductAlgorithmSingle marginalfunction

Marginal for a chain

Marginal for a tree

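Marginal for a chain: Factor node information representation

In a discretized factor node, the matrix $f_5(x_5, x_4)$ is stored ($K$ rows, $P$ columns):

\[
K \left\{
\underbrace{
\begin{matrix}
f_5(a_5^1, a_4^1) & f_5(a_5^2, a_4^1) & \cdots & f_5(a_5^P, a_4^1) \\
f_5(a_5^1, a_4^2) & \cdots & \cdots & f_5(a_5^P, a_4^2) \\
\vdots & & \ddots & \vdots \\
f_5(a_5^1, a_4^K) & f_5(a_5^2, a_4^K) & \cdots & f_5(a_5^P, a_4^K)
\end{matrix}
}_{P}
\right.
\]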


Marginal for a chain: Inference complexity

For discretized variables (all with $K$ values):

\[
\begin{bmatrix} \mu_\beta(x_4)_1 \\ \mu_\beta(x_4)_2 \\ \vdots \\ \mu_\beta(x_4)_K \end{bmatrix}
=
\begin{bmatrix}
f_5(a_5^1, a_4^1) & f_5(a_5^2, a_4^1) & \cdots & f_5(a_5^K, a_4^1) \\
f_5(a_5^1, a_4^2) & \cdots & \cdots & f_5(a_5^K, a_4^2) \\
\vdots & & \ddots & \vdots \\
f_5(a_5^1, a_4^K) & f_5(a_5^2, a_4^K) & \cdots & f_5(a_5^K, a_4^K)
\end{bmatrix}
\begin{bmatrix} \mu_\beta(x_5)_1 \\ \mu_\beta(x_5)_2 \\ \vdots \\ \mu_\beta(x_5)_K \end{bmatrix}
\]

• for a message: $K^2$ sums and $K^2$ products,
• marginalizing over one variable among $N$: $N - 1$ messages,

inference for a chain:

\[
O((N - 1) K^2) \text{ operations}
\]

to be compared with $O(K^{N-1})$ in the general case.
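The complexity argument above can be checked on a small case. The sketch below is an illustration only: the function names, the indexing convention `F[i][a][b]` and the numerical factor values are assumptions made for this example, not part of the course material. It computes an unnormalized marginal on a chain once by message passing and once by brute-force enumeration, and counts the multiplications used by message passing.

```python
from itertools import product

def message_passing_marginal(F, K):
    """Unnormalized marginal of x_0 on the chain x_0 - x_1 - ... - x_{N-1}.

    F[i][a][b] holds the pairwise factor f(x_i = a, x_{i+1} = b).
    Returns the marginal and the number of multiplications performed.
    """
    mu = [1.0] * K          # end message coming from the far end of the chain
    mults = 0
    for Fi in reversed(F):  # N - 1 messages, from x_{N-1} back towards x_0
        new_mu = []
        for a in range(K):  # one matrix-vector product: K * K multiplications
            new_mu.append(sum(Fi[a][b] * mu[b] for b in range(K)))
            mults += K
        mu = new_mu
    return mu, mults

def brute_force_marginal(F, K):
    """Same marginal by enumerating all K**N joint assignments."""
    N = len(F) + 1
    out = [0.0] * K
    for x in product(range(K), repeat=N):
        term = 1.0
        for i, Fi in enumerate(F):
            term *= Fi[x[i]][x[i + 1]]
        out[x[0]] += term
    return out

# Hypothetical factor values, chosen only to have something to multiply.
K, N = 3, 5
F = [[[1.0 + ((i + 2 * a + 3 * b) % 5) for b in range(K)]
      for a in range(K)] for i in range(N - 1)]

fast, mults = message_passing_marginal(F, K)
slow = brute_force_marginal(F, K)
```

Here `mults` equals $(N - 1) K^2$, while the brute-force loop visits $K^N$ assignments in total, that is $K^{N-1}$ terms for each value of the kept variable.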


Marginal for a chain: Continuous factor nodes

Where a variable is continuous, sums become integrals.

→ computational overhead

BUT only the definition of the function $f_5(x_5, x_4)$ is needed (instead of matrices).

Example:

\[
f_5(x_5, x_4) = \mathcal{N}(x_4, \sigma_4)(x_5) = \frac{1}{\sigma_4 \sqrt{2\pi}}\, e^{-\frac{1}{2} \left( \frac{x_5 - x_4}{\sigma_4} \right)^2}
\]

Warning: storage at a factor node = function code definition + parameters (here $\sigma_4$).
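As a rough illustration of a continuous factor node, the sketch below stores the Gaussian factor as a function plus its parameter and approximates the message integral numerically. The names `gaussian_factor` and `message_to_x4`, the integration bounds and the trapezoidal rule are assumptions made for this example; the slides do not prescribe an integration scheme.

```python
import math

def gaussian_factor(x5, x4, sigma4):
    """f5(x5, x4): normal density in x5, centred on x4, standard deviation sigma4."""
    z = (x5 - x4) / sigma4
    return math.exp(-0.5 * z * z) / (sigma4 * math.sqrt(2.0 * math.pi))

def message_to_x4(x4, incoming, sigma4, lo, hi, steps=2000):
    """Approximate the integral over x5 in [lo, hi] of f5(x5, x4) * incoming(x5).

    Uses the trapezoidal rule; lo and hi must cover the region where the
    integrand is not negligible.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x5 = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gaussian_factor(x5, x4, sigma4) * incoming(x5)
    return total * h

# With a constant incoming message, the Gaussian factor integrates to about 1.
flat = message_to_x4(x4=0.0, incoming=lambda x5: 1.0, sigma4=1.0, lo=-10.0, hi=10.0)

# With a Gaussian incoming message centred on 2, the result is a convolution of two Gaussians.
bump = message_to_x4(x4=2.0, incoming=lambda x5: gaussian_factor(x5, 2.0, 1.0),
                     sigma4=1.0, lo=-10.0, hi=14.0)
```

Each evaluation of the message costs one numerical integration, which is the computational overhead mentioned above; in exchange, nothing but the function and $\sigma_4$ has to be stored.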


Marginal for a tree: Sum rearrangements (1)

Extension of the chain algorithm to a tree is possible.
Let the graph of $g(x_1, \dots, x_n) = \prod_j f_j(X_j)$ be a tree.

For a marginal over $x_i$:

\[
P(x_i) = \sum_{\sim\{x_i\}} \prod_j f_j(X_j)
\]

Pick a variable node, $x_i$, as the root of the tree (that is always possible with trees).


Marginal for a tree: Example: the robot start problem

For a marginal calculation over Start?:

[Figure: factor graph of the robot start problem, shown in two layouts. Variable nodes: Voltage Measure, Power State?, Start?, Connected?. Factor nodes: P(VM | PS), P(PS), P(St | PS, C), P(C).]


Marginal for a tree: Sum rearrangements (2)

Let $st(x_i)$ be the set of all the subtrees connected to $x_i$.
They are disjoint subtrees, and therefore:

\[
P(x_i) = \sum_{\sim\{x_i\}} \prod_j f_j(X_j) = \sum_{\sim\{x_i\}} \prod_{s \in st(x_i)} F_s(x_i, Y_s)
\]

• $Y_s$ is the set of all the variables in the subtree $s$,
• $F_s(x_i, Y_s)$ is the product of all the factors in the subtree $s$,
• in a tree there is at most one path that links one node to another, so

\[
\forall (s_1, s_2) \in st(x_i)^2,\ s_1 \neq s_2: \quad Y_{s_1} \cap Y_{s_2} = \emptyset
\]

the factors of different subtrees work on disjoint variables (apart from the root $x_i$).


Marginal for a tree: Example: the robot start problem

[Figure: factor graph of the robot start problem. Variable nodes: Voltage Measure, Power State?, Start?, Connected?. Factor nodes: P(VM | PS), P(PS), P(St | PS, C), P(C).]

\begin{align*}
\sum_{\sim\{St\}} g(St, PS, C, VM)
&= \sum_{\sim\{St\}} \{P(VM \mid PS)\}\, \{P(C)\, P(St \mid PS, C)\}\, \{P(PS)\} \\
&= \sum_{\sim\{St\}} F_j(VM, PS)\, F_b(St, PS, C)\, F_g(PS) \\
&= \sum_{\sim\{St\}} f_j(VM, PS)\, f_{b2}(C)\, f_{b1}(St, PS, C)\, f_g(PS)
\end{align*}


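Marginal for a tree: Sum rearrangements (3)

As the factors of different subtrees work on disjoint variables, it is possible to exchange sums and products locally:

\begin{align}
P(x_i) &= \sum_{\sim\{x_i\}} \prod_j f_j(X_j) \tag{1} \\
&= \sum_{\sim\{x_i\}} \prod_{s \in st(x_i)} F_s(x_i, Y_s) \tag{2} \\
&= \prod_{s \in st(x_i)} \sum_{Y_s} F_s(x_i, Y_s) \tag{3} \\
&= \prod_{s \in st(x_i)} \mu_{F_s \to x_i} \tag{4}
\end{align}

where $\mu_{F_s \to x_i} := \sum_{Y_s} F_s(x_i, Y_s)$.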
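The exchange of sums and products can be verified numerically on a toy case. In the sketch below, the two subtree functions `F1` and `F2`, their domains and their values are hypothetical and chosen only for illustration: a binary root $x$ has two subtrees whose remaining variables $y_1$ and $y_2$ are disjoint.

```python
from itertools import product

Y1 = (0, 1, 2)  # values of the variable in subtree 1
Y2 = (0, 1)     # values of the variable in subtree 2

# Hypothetical subtree products: F1 depends on (x, y1), F2 on (x, y2).
F1 = {(x, y): 0.1 + 0.2 * x + 0.3 * y for x in (0, 1) for y in Y1}
F2 = {(x, y): 0.5 + 0.1 * x * y for x in (0, 1) for y in Y2}

def sum_of_products(x):
    """One global sum over all non-root variables of the product of subtrees."""
    return sum(F1[x, y1] * F2[x, y2] for y1, y2 in product(Y1, Y2))

def product_of_sums(x):
    """One local sum per subtree (a message), then the product of the messages."""
    mu1 = sum(F1[x, y1] for y1 in Y1)
    mu2 = sum(F2[x, y2] for y2 in Y2)
    return mu1 * mu2

for x in (0, 1):
    assert abs(sum_of_products(x) - product_of_sums(x)) < 1e-12
```

The global sum evaluates $|Y_1| \cdot |Y_2|$ products for each value of the root, whereas the rearranged form needs only $|Y_1| + |Y_2|$ terms and one multiplication. The equality relies on $y_1$ and $y_2$ being distinct variables; it would not hold if the two subtrees shared a summed variable.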


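Marginal for a tree: Factor to node messages definition (1)

Each subtree $s$ is connected to the variable node $x_i$ through a unique factor node $f_s$, owing to the bipartite property of factor graphs, so the message from factor $s$ to node $i$ is defined as:

\[
\mu_{f_s \to x_i} := \mu_{F_s \to x_i}.
\]

[Figure: factor graph of the robot start problem (variable nodes Voltage Measure, Power State?, Start?, Connected?; factor nodes P(VM | PS), P(PS), P(St | PS, C), P(C)), annotated with the messages $\mu_{f_j \to PS}$, $\mu_{f_b \to PS}$ and $\mu_{f_v \to PS}$.]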


Marginal for a tree: Node to factor messages definition (1)

For each subtree, the process of pushing the sums deeper is continued:

\begin{align*}
\mu_{f_s \to x_i} &:= \sum_{Y_s} F_s(x_i, Y_s) \\
&= \sum_{Y_s} f_s(x_i, X_s) \prod_{m \in st(f_s)} F_m(Y_m)
\end{align*}

where:
• $st(f_s)$ is the set of all the subtrees connected to the factor node $f_s$;
• each subtree $m$ is connected to $f_s$ through a unique variable node $x_m$;
• $F_m(Y_m)$ is the product of all the factors in the subtree $m$;
• $Y_m$ is the set of all the variables in the subtree $m$.


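Marginal for a tree: Node to factor messages definition (2)

\begin{align*}
\mu_{f_s \to x_i} &:= \sum_{Y_s} F_s(x_i, Y_s) \\
&= \sum_{Y_s} f_s(x_i, X_s) \prod_{m \in st(f_s)} F_m(Y_m) \\
&= \sum_{X_s \setminus \{x_i\}} f_s(x_i, X_s) \prod_{m \in st(f_s)} \sum_{Y_m \setminus X_s} F_m(Y_m) \\
&= \sum_{X_s \setminus \{x_i\}} f_s(x_i, X_s) \prod_{m \in st(f_s)} \mu_{x_m \to f_s}
\end{align*}

where $\mu_{x_m \to f_s} := \sum_{Y_m \setminus X_s} F_m(Y_m)$ is the message from node $m$ to factor $s$.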


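Marginal for a tree: Node to factor messages definition (3)

\[
Y_s = \bigcup_{m \in st(f_s)} Y_m \quad \text{and} \quad Y_m \setminus X_s = Y_m \setminus \{x_m\}
\]

[Figure: factor graph fragment with variable nodes $x_a$, $x_b$, $x_t$ and factor nodes $f_1$, $f_2$, $f_3$, $f_i$, $f_j$.]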


Marginal for a tree: Factor to node messages definition (2)

Finally, from the previous expansion, the factor to node message from $f_s$ to $x_i$ can be written fully recursively in terms of $x_i$, $X_s$ and the messages sent to $f_s$ by the nodes other than $x_i$:

\begin{align*}
\mu_{f_s \to x_i} &:= \sum_{Y_s} F_s(x_i, Y_s) \\
&= \sum_{X_s \setminus \{x_i\}} f_s(x_i, X_s) \prod_{x_m \in X_s \setminus \{x_i\}} \mu_{x_m \to f_s}
\end{align*}


Marginal for a tree: Node to factor messages definition (4)

It is possible to expand the messages from node to factor as done previously.

\[
\mu_{x_m \to f_s} := \sum_{Y_m \setminus \{x_m\}} F_m(Y_m)
\]

As:
• $F_m(Y_m) = \prod_{k \in st(x_m)} F_k(Y_k)$, considering all subtrees attached to the variable node $x_m$,
• $Y_m \setminus \{x_m\} = \bigcup_k Y_k$,
• all the sets of variables of the subtrees are disjoint,

\begin{align*}
\mu_{x_m \to f_s} = \sum_{Y_m \setminus \{x_m\}} F_m(Y_m) &= \prod_{k \in st(x_m)} \sum_{Y_k} F_k(Y_k) \\
&= \prod_{k \in st(x_m)} \mu_{f_k \to x_m}
\end{align*}

The recursion is complete: node to factor messages are products of factor to node messages.


Marginal for a tree: Node to factor messages definition (5)

It is worth noting that the factors leading to the subtrees attached to $x_m$ are all the factors attached to $x_m$ except the preceding one on the path, $f_s$.
In general, $ne(v)$ denotes the set of all the neighbour nodes of the node $v$ in the factor graph, so that:

\[
\mu_{x_m \to f_s} = \prod_{f_k \in ne(x_m) \setminus \{f_s\}} \mu_{f_k \to x_m}
\]


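Marginal for a tree: Factor graph messages formulae: update rules

FACTOR TO NODE MESSAGE FORMULA:

\[
\mu_{f_s \to x_i} = \sum_{x_m \in ne(f_s) \setminus \{x_i\}} f_s(x_i, X_s) \prod_{x_m \in ne(f_s) \setminus \{x_i\}} \mu_{x_m \to f_s}
\]

NODE TO FACTOR MESSAGE FORMULA:

\[
\mu_{x_m \to f_s} = \prod_{f_k \in ne(x_m) \setminus \{f_s\}} \mu_{f_k \to x_m}
\]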


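Factor graph messages definition: End messages

Definition of end messages:

• for a node to factor message (leaf variable node $x$ attached to a factor $f$): a vector of ones,

\[
\mu_{x \to f}(x) = 1, \qquad \text{i.e. } \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}
\]

• for a factor to node message (leaf factor node $f$ attached to a variable $x$): a vector of function values,

\[
\mu_{f \to x}(x) = f(x), \qquad \text{i.e. } \begin{bmatrix} f_i(a_l^1) \\ \vdots \\ f_i(a_l^K) \end{bmatrix}
\]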


Sum-Product algorithm for trees

1 start from the leaves: broadcast end messages to the respective neighbours,

2 apply the message update rules recursively,

3 at the root: multiply all the incoming messages.
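The three steps can be sketched as a pair of mutually recursive functions, one per update rule. This is a minimal illustration on the robot start problem, not a reference implementation: the binary domains, all probability values and the names `node_to_factor`, `factor_to_node`, `marginal` and `brute_force` are assumptions introduced for this example. The recursion starts at the root and reaches the leaves, where the end messages appear naturally (a product over no factor is a vector of ones, a sum over no variable is the factor itself).

```python
from itertools import product

# Hypothetical binary domains and probability tables (illustrative values only).
domains = {"VM": (0, 1), "PS": (0, 1), "St": (0, 1), "C": (0, 1)}
p_ps = {0: 0.2, 1: 0.8}
p_c = {0: 0.1, 1: 0.9}
p_vm_given_ps = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.2, (1, 1): 0.8}  # key: (vm, ps)

def p_st(st, ps, c):
    p_start = 0.95 if (ps == 1 and c == 1) else 0.01
    return p_start if st == 1 else 1.0 - p_start

# Each factor: (names of its variables, function of an assignment dict).
factors = {
    "f_vm": (("VM", "PS"), lambda a: p_vm_given_ps[a["VM"], a["PS"]]),
    "f_ps": (("PS",), lambda a: p_ps[a["PS"]]),
    "f_st": (("St", "PS", "C"), lambda a: p_st(a["St"], a["PS"], a["C"])),
    "f_c": (("C",), lambda a: p_c[a["C"]]),
}

def node_to_factor(var, target):
    """Product of the messages from every neighbouring factor except target."""
    msg = {v: 1.0 for v in domains[var]}  # stays all ones for a leaf variable
    for name, (scope, _) in factors.items():
        if name != target and var in scope:
            incoming = factor_to_node(name, var)
            for v in msg:
                msg[v] *= incoming[v]
    return msg

def factor_to_node(name, target):
    """Sum over the other variables of the factor times their incoming messages."""
    scope, fn = factors[name]
    others = [v for v in scope if v != target]
    incoming = {v: node_to_factor(v, name) for v in others}
    msg = {}
    for t in domains[target]:
        total = 0.0
        for values in product(*(domains[v] for v in others)):
            a = dict(zip(others, values))
            a[target] = t
            term = fn(a)
            for v in others:
                term *= incoming[v][a[v]]
            total += term
        msg[t] = total  # for a leaf factor this is just f(t)
    return msg

def marginal(var):
    """Root step: multiply all the incoming factor to node messages."""
    return node_to_factor(var, target=None)

def brute_force(var):
    """Reference result: sum the full product over every joint assignment."""
    names = list(domains)
    out = {v: 0.0 for v in domains[var]}
    for values in product(*(domains[n] for n in names)):
        a = dict(zip(names, values))
        p = 1.0
        for _, fn in factors.values():
            p *= fn(a)
        out[a[var]] += p
    return out

start = marginal("St")
```

The recursion terminates only because the factor graph is a tree; on a graph with cycles these functions would call each other indefinitely. Messages are recomputed on every call, which is acceptable for a single marginal but wasteful if all marginals are needed.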


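Exercise: factor graph message definition for a chain

Chain example:

[Figure: chain factor graph with variable nodes $x_1, \dots, x_6$ and factor nodes $f_1, \dots, f_6$, annotated with the messages listed below.]

• α messages: $\mu^\alpha_{f_1 \to x_1}$, $\mu^\alpha_{x_1 \to f_2}$, $\mu^\alpha_{f_2 \to x_2}$, $\mu^\alpha_{x_2 \to f_3}$, $\mu^\alpha_{f_3 \to x_3}$, $\mu^\alpha_{x_3 \to f_4}$, $\mu^\alpha_{f_4 \to x_4}$
• β messages: $\mu^\beta_{x_6 \to f_6}$, $\mu^\beta_{f_6 \to x_5}$, $\mu^\beta_{x_5 \to f_5}$, $\mu^\beta_{f_5 \to x_4}$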


Solution: factor graph message definition for a chain

Factor to node message for a chain:

\[
\mu^\beta_{f_i \to x_{i-1}} := \sum_{x_i} f_i(x_i, x_{i-1})\, \mu^\beta_{x_i \to f_i}
\]

It is a sum of products, with no real simplification, except that there is no product over several children, since $f_i$ has only one child.

Node to factor message for a chain:

\[
\mu^\beta_{x_i \to f_i} := \mu^\beta_{f_{i+1} \to x_i}
\]

In this case there is a big simplification: the message is exactly the one sent by the unique child. Again, there is no product over several children, since $x_i$ has only one child.

The same remarks hold for the α messages.
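As a worked instance of these two rules, the β messages of the chain example can be unrolled from the right-hand end. This is a sketch under one assumption: $x_6$ is taken to be a leaf, so its message to $f_6$ is the end message equal to 1.

\begin{align*}
\mu^\beta_{x_6 \to f_6}(x_6) &= 1 \\
\mu^\beta_{f_6 \to x_5}(x_5) &= \sum_{x_6} f_6(x_6, x_5)\, \mu^\beta_{x_6 \to f_6}(x_6) = \sum_{x_6} f_6(x_6, x_5) \\
\mu^\beta_{x_5 \to f_5}(x_5) &= \mu^\beta_{f_6 \to x_5}(x_5) \\
\mu^\beta_{f_5 \to x_4}(x_4) &= \sum_{x_5} f_5(x_5, x_4)\, \mu^\beta_{x_5 \to f_5}(x_5) = \sum_{x_5} f_5(x_5, x_4) \sum_{x_6} f_6(x_6, x_5)
\end{align*}

Each line uses exactly one of the two rules, and the last line has the form of the matrix-vector update $\mu_\beta(x_4) = \sum_{x_5} f_5(x_5, x_4)\, \mu_\beta(x_5)$ used for the chain.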


