Message Passing Algorithms for Optimization
Nicholas Ruozzi
Advisor: Sekhar Tatikonda
Yale University

Feb 11, 2016

Page 1: Message Passing Algorithms for Optimization

Message Passing Algorithms for Optimization

Nicholas Ruozzi

Advisor: Sekhar Tatikonda

Yale University

Page 2: Message Passing Algorithms for Optimization

The Problem

Minimize a real-valued objective function that factorizes as a sum of potentials

(the potentials are indexed by a multiset whose elements are subsets of the indices 1,…,n)
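
The formula on this slide did not survive extraction. A form consistent with the surrounding text is sketched below; the symbol names \phi, \psi, and A are assumptions made here, not recovered from the slide.

```latex
f(x_1,\dots,x_n) \;=\; \sum_{i=1}^{n} \phi_i(x_i) \;+\; \sum_{\alpha \in A} \psi_\alpha(x_\alpha)
```

Here x_\alpha denotes the sub-vector of x indexed by the subset \alpha of {1,…,n}.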

Page 3: Message Passing Algorithms for Optimization

Corresponding Graph

[Figure: graph with nodes labeled 1, 2, 3]

Page 4: Message Passing Algorithms for Optimization

Local Message Passing Algorithms

Pass messages on this graph to minimize f

Distributed message passing algorithm

Ideal for large scientific problems, sensor networks, etc.

Page 5: Message Passing Algorithms for Optimization

The Min-Sum Algorithm

[Figure: graph with nodes labeled 1, 2, 3, 4]

Messages at time t:
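
The update equations are missing from the transcript. The textbook factor-graph form of min-sum is sketched below with assumed symbols: m for messages, \phi and \psi for the unary and higher-order potentials, and \partial i for the set of factors containing variable i. The slide may have used a different normalization.

```latex
m^{t}_{i \to \alpha}(x_i) \;=\; \phi_i(x_i) + \sum_{\beta \in \partial i \setminus \alpha} m^{t-1}_{\beta \to i}(x_i)

m^{t}_{\alpha \to i}(x_i) \;=\; \min_{x_{\alpha \setminus i}} \Big[ \psi_\alpha(x_\alpha) + \sum_{k \in \alpha \setminus i} m^{t-1}_{k \to \alpha}(x_k) \Big]
```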

Page 6: Message Passing Algorithms for Optimization

Computing Beliefs

The min-marginal corresponding to the ith variable is given by

Beliefs approximate the min-marginals:

Estimate the optimal assignment as
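
The formulas for the min-marginals, beliefs, and estimate are lost. As a minimal sketch, assuming a pairwise model and the standard definitions (belief = unary cost plus incoming factor messages, estimate = argmin of each belief), the following compares min-sum beliefs on a small chain with brute-force min-marginals. The function names and the example potentials are hypothetical.

```python
import itertools

def min_sum(phi, psi, iters):
    """Synchronous min-sum on a pairwise factor graph; returns beliefs b[i][xi].

    phi[i][xi] is a unary cost; psi[(i, j)][xi][xj] is a pairwise cost.
    """
    k = [len(p) for p in phi]
    keys = [(e, i) for e in psi for i in e]          # one entry per (factor, variable)
    f2v = {key: [0.0] * k[key[1]] for key in keys}   # factor-to-variable messages
    v2f = {key: [0.0] * k[key[1]] for key in keys}   # variable-to-factor messages
    for _ in range(iters):
        new_v2f, new_f2v = {}, {}
        for e, i in keys:
            # Variable to factor: unary cost plus messages from the other factors.
            others = [f2v[(g, j)] for g, j in keys if j == i and g != e]
            new_v2f[(e, i)] = [phi[i][x] + sum(m[x] for m in others) for x in range(k[i])]
            # Factor to variable: minimize out the other endpoint of the edge.
            j = e[1] if e[0] == i else e[0]
            t = psi[e] if e[0] == i else [list(col) for col in zip(*psi[e])]
            new_f2v[(e, i)] = [min(t[x][y] + v2f[(e, j)][y] for y in range(k[j]))
                               for x in range(k[i])]
        v2f, f2v = new_v2f, new_f2v
    return [[phi[i][x] + sum(f2v[(e, j)][x] for e, j in keys if j == i)
             for x in range(k[i])] for i in range(len(phi))]

def min_marginals(phi, psi):
    """Brute-force min-marginals: best[i][xi] = min of f over x with x_i = xi."""
    n = len(phi)
    best = [[float("inf")] * len(p) for p in phi]
    for x in itertools.product(*[range(len(p)) for p in phi]):
        val = sum(phi[i][x[i]] for i in range(n))
        val += sum(t[x[i]][x[j]] for (i, j), t in psi.items())
        for i in range(n):
            best[i][x[i]] = min(best[i][x[i]], val)
    return best

# Hypothetical three-variable chain 0 - 1 - 2 with binary variables.
phi = [[0.0, 1.0], [0.5, 0.0], [2.0, 0.0]]
psi = {(0, 1): [[0.0, 3.0], [1.0, 0.0]], (1, 2): [[0.0, 1.0], [2.0, 0.0]]}
beliefs = min_sum(phi, psi, iters=4)
estimate = [min(range(len(b)), key=b.__getitem__) for b in beliefs]
```

On this tree-structured example the beliefs coincide with the min-marginals after a few iterations, and each argmin is unique.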

Page 7: Message Passing Algorithms for Optimization

Min-Sum: Convergence Properties

Iterations do not necessarily converge

Always converges when the factor graph is a tree

Converged estimates need not correspond to the optimal solution

Performs well empirically

Page 8: Message Passing Algorithms for Optimization

Previous Work

Prior work focused on two aspects of message passing algorithms:

Convergence:

Coordinate ascent schemes

Not necessarily local message passing algorithms

Correctness:

No combinatorial characterization of failure modes

Concerned only with global optimality

Page 9: Message Passing Algorithms for Optimization

Contributions

A new local message passing algorithm

Parameterized family of message passing algorithms

Conditions under which the estimate produced by the splitting algorithm is guaranteed to be a global optimum

Conditions under which the estimate produced by the splitting algorithm is guaranteed to be a local optimum

Page 10: Message Passing Algorithms for Optimization

Contributions

What makes a graphical model “good”?

Combinatorial understanding of the failure modes of the splitting algorithm via graph covers

Can be extended to other iterative algorithms

Techniques for handling objective functions for which the known convergent algorithms fail

Reparameterization centric approach

Page 11: Message Passing Algorithms for Optimization

Publications

Convergent and correct message passing schemes for optimization problems over graphical models. Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI), July 2010.

Fixing Max-Product: A Unified Look at Message Passing Algorithms (invited talk). Proceedings of the Forty-Eighth Annual Allerton Conference on Communication, Control, and Computing, September 2010.

Unconstrained minimization of quadratic functions via min-sum. Proceedings of the Conference on Information Sciences and Systems (CISS), Princeton, NJ/USA, March 2010.

Graph covers and quadratic minimization. Proceedings of the Forty-Seventh Annual Allerton Conference on Communication, Control, and Computing, September 2009.

s-t paths using the min-sum algorithm. Proceedings of the Forty-Sixth Annual Allerton Conference on Communication, Control, and Computing, September 2008.

Page 12: Message Passing Algorithms for Optimization

Outline

Reparameterizations
Lower Bounds
Convergent Message Passing
Finding a Minimizing Assignment
Graph covers
Quadratic Minimization

Page 13: Message Passing Algorithms for Optimization

The Problem

Minimize a real-valued objective function that factorizes as a sum of potentials

(the potentials are indexed by a multiset whose elements are subsets of the indices 1,…,n)

Page 14: Message Passing Algorithms for Optimization

Factorizations

Some factorizations are better than others

If each x_i takes one of k values, this requires at most 2k^2 + k operations
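
The objective on this slide is missing from the transcript. As an illustration only, assume a three-variable chain f(x1, x2, x3) = psi12(x1, x2) + psi23(x2, x3). The minimum can then be pushed inside the sum, so the work is about 2k^2 table lookups plus a final minimum over k candidates, instead of k^3 full evaluations. The function names below are hypothetical.

```python
import itertools
import random

def brute_force_min(psi12, psi23, k):
    # Evaluate f at all k**3 assignments.
    return min(psi12[a][b] + psi23[b][c]
               for a, b, c in itertools.product(range(k), repeat=3))

def factored_min(psi12, psi23, k):
    # For each value b of x2, minimize the two pairwise terms independently
    # (2 * k * k lookups in total), then minimize over b (k candidates).
    inner = [min(psi12[a][b] for a in range(k)) + min(psi23[b][c] for c in range(k))
             for b in range(k)]
    return min(inner)

random.seed(0)
k = 5
psi12 = [[random.randint(0, 20) for _ in range(k)] for _ in range(k)]
psi23 = [[random.randint(0, 20) for _ in range(k)] for _ in range(k)]
result = factored_min(psi12, psi23, k)
```

Both routines return the same minimum value; only the amount of work differs.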

Page 15: Message Passing Algorithms for Optimization

Factorizations

Some factorizations are better than others

Suppose

Only need k operations to compute the minimum value!

Page 16: Message Passing Algorithms for Optimization

Reparameterizations

We can rewrite the objective function as

This does not change the objective function as long as the messages are real-valued at each x

The objective function is reparameterized in terms of the messages
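
The rewritten objective is missing from the transcript. One standard way to write such a reparameterization, with assumed symbols m_{\alpha \to i} for the messages and \partial i for the factors containing variable i, adds each message to a variable term and subtracts it from the corresponding factor term, so the sum is unchanged:

```latex
f(x) \;=\; \sum_{i} \Big[ \phi_i(x_i) + \sum_{\alpha \in \partial i} m_{\alpha \to i}(x_i) \Big]
      \;+\; \sum_{\alpha \in A} \Big[ \psi_\alpha(x_\alpha) - \sum_{k \in \alpha} m_{\alpha \to k}(x_k) \Big]
```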

Page 17: Message Passing Algorithms for Optimization

Reparameterizations

We can rewrite the objective function as

The reparameterization has the same factor graph as the original factorization

Many message passing algorithms produce a reparameterization upon convergence

Page 18: Message Passing Algorithms for Optimization

The Splitting Reparameterization

Let c be a vector of non-zero reals

If c is a vector of positive integers, then we could view this as a factorization in two ways:

Over the same factor graph as the original potentials

Over a factor graph where each potential has been “split” into several pieces
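
The second view can be sketched in a few lines, assuming that "splitting" a potential c times means replacing it with c copies that each carry 1/c of the original cost (an assumption consistent with the text, not a formula recovered from the slide). The names `objective` and `split` and the example table are hypothetical.

```python
from fractions import Fraction
import itertools

def objective(factors, x):
    # factors: list of (scope, table); table maps a tuple of values to a cost.
    return sum(table[tuple(x[i] for i in scope)] for scope, table in factors)

def split(factors, c):
    # Replace every factor by c copies, each carrying 1/c of the original cost.
    out = []
    for scope, table in factors:
        piece = {key: Fraction(val, c) for key, val in table.items()}
        out.extend((scope, piece) for _ in range(c))
    return out

# Hypothetical pairwise potentials on a three-variable cycle.
pair = {(0, 0): 0, (0, 1): 2, (1, 0): 2, (1, 1): 1}
factors = [((0, 1), pair), ((1, 2), pair), ((0, 2), pair)]
split_factors = split(factors, 3)
```

The split model has three times as many factors but defines the same objective at every assignment, which is why it is a reparameterization rather than a new problem.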

Page 19: Message Passing Algorithms for Optimization

The Splitting Reparameterization

[Figure: a factor graph on nodes 1, 2, 3, and the factor graph resulting from “splitting” each of the pairwise potentials 3 times]

Page 20: Message Passing Algorithms for Optimization

The Splitting Reparameterization

Beliefs:

Reparameterization:

Page 21: Message Passing Algorithms for Optimization

Outline

Reparameterizations
Lower Bounds
Convergent Message Passing
Finding a Minimizing Assignment
Graph covers
Quadratic Minimization

Page 22: Message Passing Algorithms for Optimization

Lower Bounds

Can lower bound the objective function with these reparameterizations:

Find the collection of messages that maximizes this lower bound

Lower bound is a concave function of the messages

Use coordinate ascent or subgradient methods
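
The bound itself is missing from the transcript. A standard bound of this kind, assumed here, writes f as a sum of variable terms (unary cost plus incoming messages) and factor terms (pairwise cost minus outgoing messages) and minimizes each term independently. The sketch below checks numerically that this never exceeds the true minimum, for random messages on a hypothetical pairwise model.

```python
import itertools
import random

def lower_bound(phi, psi, msg):
    # msg[(e, i)][xi] is the message from pairwise factor e to variable i.
    total = 0.0
    for i, p in enumerate(phi):
        # Variable term: unary cost plus all incoming messages, minimized alone.
        total += min(p[x] + sum(m[x] for (e, j), m in msg.items() if j == i)
                     for x in range(len(p)))
    for (i, j), t in psi.items():
        # Factor term: pairwise cost minus its outgoing messages, minimized alone.
        total += min(t[a][b] - msg[((i, j), i)][a] - msg[((i, j), j)][b]
                     for a in range(len(phi[i])) for b in range(len(phi[j])))
    return total

def true_min(phi, psi):
    return min(sum(phi[i][x[i]] for i in range(len(phi)))
               + sum(t[x[i]][x[j]] for (i, j), t in psi.items())
               for x in itertools.product(*[range(len(p)) for p in phi]))

random.seed(1)
phi = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
psi = {e: [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
       for e in [(0, 1), (1, 2), (0, 2)]}
bounds = []
for _ in range(100):
    msg = {(e, i): [random.uniform(-2, 2) for _ in range(3)] for e in psi for i in e}
    bounds.append(lower_bound(phi, psi, msg))
```

Each term is a minimum of functions that are affine in the messages, which is why the bound is concave in the messages.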

Page 23: Message Passing Algorithms for Optimization

Lower Bounds and the MAP LP

Equivalent to minimizing f

Dual provides a lower bound on f

Messages are a side-effect of certain dual formulations

Page 24: Message Passing Algorithms for Optimization

Outline

Reparameterizations
Lower Bounds
Convergent Message Passing
Finding a Minimizing Assignment
Graph covers
Quadratic Minimization

Page 25: Message Passing Algorithms for Optimization

The Splitting Algorithm

A local message passing algorithm for the splitting reparameterization

Contains the min-sum algorithm as a special case

For the integer case, can be derived from the min-sum update equations

Page 26: Message Passing Algorithms for Optimization

The Splitting Algorithm

For certain choices of c, an asynchronous version of the splitting algorithm can be shown to be a block coordinate ascent scheme for the lower bound:

For example:

Page 27: Message Passing Algorithms for Optimization

Asynchronous Splitting Algorithm

[Figure: graph with nodes labeled 1, 2, 3]

Page 28: Message Passing Algorithms for Optimization

Asynchronous Splitting Algorithm

[Figure: graph with nodes labeled 1, 2, 3]

Page 29: Message Passing Algorithms for Optimization

Asynchronous Splitting Algorithm

[Figure: graph with nodes labeled 1, 2, 3]

Page 30: Message Passing Algorithms for Optimization

Coordinate Ascent

Guaranteed to converge

Does not necessarily maximize the lower bound

Can get stuck in a suboptimal configuration

Can be shown to converge to the maximum in restricted cases

Pairwise-binary objective functions

Page 31: Message Passing Algorithms for Optimization

Other Ascent Schemes

Many other ascent algorithms are possible over different lower bounds:

TRW-S [Kolmogorov 2007]

MPLP [Globerson and Jaakkola 2007]

Max-Sum Diffusion [Werner 2007]

Norm-product [Hazan 2010]

Not all coordinate ascent schemes are local

Page 32: Message Passing Algorithms for Optimization

Outline

Reparameterizations
Lower Bounds
Convergent Message Passing
Finding a Minimizing Assignment
Graph covers
Quadratic Minimization

Page 33: Message Passing Algorithms for Optimization

Constructing the Solution

Construct an estimate, x*, of the optimal assignment from the beliefs by choosing

For certain choices of the vector c, if each argmin is unique, then x* minimizes f

A simple choice of c guarantees both convergence and correctness (if the argmins are unique)

Page 34: Message Passing Algorithms for Optimization

Correctness

If the argmins are not unique, then we may not be able to construct a solution

When does the algorithm converge to the correct minimizing assignment?

Page 35: Message Passing Algorithms for Optimization

Outline

Reparameterizations
Lower Bounds
Convergent Message Passing
Finding a Minimizing Assignment
Graph covers
Quadratic Minimization

Page 36: Message Passing Algorithms for Optimization

Graph Covers

A graph H covers a graph G if there is a graph homomorphism from H to G that is a bijection on neighborhoods

[Figure: a graph G on nodes 1, 2, 3 and a 2-cover of G on nodes 1, 2, 3, 1’, 2’, 3’]
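
The definition can be checked mechanically. The sketch below assumes, for illustration, that G is a triangle and that its 2-cover is a 6-cycle (the figure is lost, so both graphs are assumptions); `is_cover` is a hypothetical helper that tests whether a node map sends the neighbors of every node of H one-to-one onto the neighbors of its image in G.

```python
def is_cover(h_edges, g_edges, proj):
    """Return True if proj (dict: node of H -> node of G) is a covering map.

    Graphs are given as lists of undirected edges; isolated nodes are not handled.
    """
    def neighbors(edges):
        nbrs = {}
        for u, v in edges:
            nbrs.setdefault(u, set()).add(v)
            nbrs.setdefault(v, set()).add(u)
        return nbrs

    h_nbrs, g_nbrs = neighbors(h_edges), neighbors(g_edges)
    for u, nbrs in h_nbrs.items():
        images = [proj[v] for v in nbrs]
        # Injective on the neighborhood, and onto the neighborhood of proj(u).
        if len(set(images)) != len(images) or set(images) != g_nbrs[proj[u]]:
            return False
    return True

triangle = [(1, 2), (2, 3), (3, 1)]
hexagon = [("1", "2"), ("2", "3"), ("3", "1'"), ("1'", "2'"), ("2'", "3'"), ("3'", "1")]
proj = {"1": 1, "2": 2, "3": 3, "1'": 1, "2'": 2, "3'": 3}
path = [("1", "2"), ("2", "3")]  # not a cover: node "1" is missing a neighbor
```

Two disjoint copies of the triangle would also pass this test; the 6-cycle is the connected 2-cover.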

Page 37: Message Passing Algorithms for Optimization

Graph Covers

Potential functions are “lifts” of the nodes they cover

[Figure: a graph G on nodes 1, 2, 3 and a 2-cover of G on nodes 1, 2, 3, 1’, 2’, 3’]

Page 38: Message Passing Algorithms for Optimization

Graph Covers

The lifted potentials define a new objective function

Objective function:

2-cover objective function

Page 39: Message Passing Algorithms for Optimization

Graph Covers

Indistinguishability: for any cover and any choice of initial messages on the original graph, there exists a choice of initial messages on the cover such that the messages passed by the splitting algorithm are identical on both graphs

For choices of c that guarantee correctness, any assignment that uniquely minimizes each belief must also minimize the objective function corresponding to any finite cover

Page 40: Message Passing Algorithms for Optimization

Maximum Weight Independent Set

[Figure: a graph G on nodes 1, 2, 3 and a 2-cover of G on nodes 1, 2, 3, 1’, 2’, 3’]

Page 41: Message Passing Algorithms for Optimization

Maximum Weight Independent Set

[Figure: graph G with node weights 5, 2, 2 and a 2-cover of G with node weights 5, 2, 2, 5, 2, 2]

Page 42: Message Passing Algorithms for Optimization

Maximum Weight Independent Set

[Figure: graph G with node weights 5, 2, 2 and a 2-cover of G with node weights 5, 2, 2, 5, 2, 2]

Page 43: Message Passing Algorithms for Optimization

Maximum Weight Independent Set

[Figure: graph G with node weights 3, 2, 2 and a 2-cover of G with node weights 3, 2, 2, 3, 2, 2]

Page 44: Message Passing Algorithms for Optimization

Maximum Weight Independent Set

[Figure: graph G with node weights 3, 2, 2 and a 2-cover of G with node weights 3, 2, 2, 3, 2, 2]
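
The figures for this example are lost. Assuming G is a triangle and its 2-cover is a 6-cycle (both assumptions), a brute-force search shows what the two weightings illustrate. The helper names are hypothetical.

```python
import itertools

def max_weight_independent_set(weights, edges):
    # Brute force over all subsets; fine for the tiny graphs used here.
    nodes = list(weights)
    best_weight, best_set = 0, frozenset()
    for r in range(len(nodes) + 1):
        for subset in itertools.combinations(nodes, r):
            chosen = set(subset)
            if any(u in chosen and v in chosen for u, v in edges):
                continue  # two chosen nodes share an edge
            w = sum(weights[v] for v in chosen)
            if w > best_weight:
                best_weight, best_set = w, frozenset(chosen)
    return best_weight, best_set

triangle = [(1, 2), (2, 3), (3, 1)]
# Nodes 4, 5, 6 play the role of the copies 1', 2', 3'.
hexagon = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]

def compare(w1, w2, w3):
    base = {1: w1, 2: w2, 3: w3}
    cover = {1: w1, 2: w2, 3: w3, 4: w1, 5: w2, 6: w3}
    return (max_weight_independent_set(base, triangle),
            max_weight_independent_set(cover, hexagon))

heavy = compare(5, 2, 2)
light = compare(3, 2, 2)
```

With weights 5, 2, 2 the best set on the cover is both copies of the heavy node (weight 10), which is a lift of the best set on G (weight 5). With weights 3, 2, 2 the best set on G still has weight 3, but the cover admits an alternating set of weight 7, which is more than the lifted solution's 6, so the cover and the base graph disagree.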

Page 45: Message Passing Algorithms for Optimization

More Graph Covers

If covers of the factor graph have different solutions:

The splitting algorithm cannot converge to the correct answer for choices of c that guarantee correctness

The min-sum algorithm may converge to an assignment that is optimal on a cover

There are applications for which the splitting algorithm always works

Minimum cuts, shortest paths, and more…

Page 46: Message Passing Algorithms for Optimization

Graph Covers

Suppose f factorizes over a set with corresponding factor graph G and the choice of c guarantees correctness

Theorem: the splitting algorithm can only converge to beliefs that have unique argmins if:

f is uniquely minimized at the assignment x*

The objective function corresponding to every finite cover H of G has a unique minimum that is a lift of x*

Page 47: Message Passing Algorithms for Optimization

Graph Covers

This result suggests that

There is a close link between “good” factorizations and the difficulty of a problem

Convergent and correct algorithms are not ideal for all applications

Convex functions can be covered by functions that are not convex

Page 48: Message Passing Algorithms for Optimization

Outline

Reparameterizations
Lower Bounds
Convergent Message Passing
Finding a Minimizing Assignment
Graph covers
Quadratic Minimization

Page 49: Message Passing Algorithms for Optimization

Quadratic Minimization

Γ symmetric positive definite implies a unique minimum

Minimized at
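
The formulas on this slide are missing. The usual setup for this problem, with \Gamma the matrix named later in the talk and h an assumed symbol for the linear term, is:

```latex
f(x) \;=\; \tfrac{1}{2}\, x^{T} \Gamma x \;-\; h^{T} x,
\qquad
\nabla f(x) = \Gamma x - h = 0
\;\;\Longrightarrow\;\;
x^{*} = \Gamma^{-1} h
```

Positive definiteness of \Gamma makes f strictly convex, so this stationary point is the unique minimum.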

Page 50: Message Passing Algorithms for Optimization

Quadratic Minimization

For a positive definite matrix, min-sum convergence implies a correct solution:

Min-sum is not guaranteed to converge for all symmetric positive definite matrices

Page 51: Message Passing Algorithms for Optimization

Quadratic Minimization

A symmetric matrix is scaled diagonally dominant if there exists w > 0 such that for each row i:

Theorem: Γ is scaled diagonally dominant iff every finite cover of Γ is positive definite
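
The row condition is missing from the transcript; the usual definition, assumed here, is |Γ_ii| w_i > sum over j ≠ i of |Γ_ij| w_j. The sketch below tests that condition for a given w only (it does not search for w), and builds one particular 2-cover of a 3x3 matrix, assuming the matrix's graph is a triangle covered by a 6-cycle. All function names are hypothetical.

```python
def is_scaled_diagonally_dominant(gamma, w):
    # Checks |gamma[i][i]| * w[i] > sum_{j != i} |gamma[i][j]| * w[j] for this w.
    n = len(gamma)
    return all(wi > 0 for wi in w) and all(
        abs(gamma[i][i]) * w[i]
        > sum(abs(gamma[i][j]) * w[j] for j in range(n) if j != i)
        for i in range(n))

def is_positive_definite(a):
    # Attempt a Cholesky factorization; a symmetric matrix is PD iff it succeeds.
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 1e-12:
                    return False
                low[i][i] = s ** 0.5
            else:
                low[i][j] = s / low[j][j]
    return True

def hexagon_cover(gamma):
    # Indices 3, 4, 5 are second copies of 0, 1, 2; the off-diagonal entries
    # are placed along the 6-cycle 0-1-2-3-4-5-0 that covers the triangle.
    cover = [[0.0] * 6 for _ in range(6)]
    for u in range(6):
        cover[u][u] = gamma[u % 3][u % 3]
        v = (u + 1) % 6
        cover[u][v] = cover[v][u] = gamma[u % 3][v % 3]
    return cover

strong = [[1.0, 0.4, 0.4], [0.4, 1.0, 0.4], [0.4, 0.4, 1.0]]
weak = [[1.0, 0.6, 0.6], [0.6, 1.0, 0.6], [0.6, 0.6, 1.0]]
```

Both example matrices are positive definite, but only `strong` passes the dominance test with w = (1, 1, 1), and only its cover stays positive definite; the cover of `weak` is indefinite, which is the behavior the theorem describes.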

Page 52: Message Passing Algorithms for Optimization

Quadratic Minimization

Scaled diagonal dominance is a sufficient condition for the convergence of other iterative methods: Gauss-Seidel, Jacobi, and min-sum

Suggests a generalization of scaled diagonal dominance for arbitrary convex functions

Purely combinatorial!

Empirically, the splitting algorithm can always be made to converge for this problem
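
For comparison with the iterative methods named on this slide, here is a minimal sketch of the Jacobi iteration for solving Γx = h, which is the stationarity condition of the quadratic when it is written as (1/2) xᵀΓx − hᵀx (that form and the symbol h are assumptions). The example matrix is hypothetical and strictly diagonally dominant, a special case of scaled diagonal dominance with w = (1, 1, 1).

```python
def jacobi(gamma, h, iters=200):
    # Each sweep solves row i for x[i] while holding the other coordinates
    # at their values from the previous sweep.
    n = len(h)
    x = [0.0] * n
    for _ in range(iters):
        x = [(h[i] - sum(gamma[i][j] * x[j] for j in range(n) if j != i)) / gamma[i][i]
             for i in range(n)]
    return x

gamma = [[4.0, 1.0, 1.0], [1.0, 3.0, 0.0], [1.0, 0.0, 2.0]]
h = [6.0, 4.0, 3.0]
solution = jacobi(gamma, h)
```

For this example the exact solution is x = (1, 1, 1), and the iterates approach it geometrically.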

Page 53: Message Passing Algorithms for Optimization

Conclusion

General strategy for minimization:

Reparameterization

Lower bounds

Convergent and correct message passing algorithms

Correctness is too strong:

Algorithms cannot distinguish graph covers

Can fail to hold even for convex problems

Page 54: Message Passing Algorithms for Optimization

Conclusion

Open questions

Deep relationship between “hardness” of a problem and its factorizations

Convergence and correctness criteria for the min-sum algorithm

Rates of convergence

Page 55: Message Passing Algorithms for Optimization

Questions?

A draft of the thesis is available online at: http://cs-www.cs.yale.edu/homes/nruozzi/Papers/ths2.pdf