A dual ascent framework for Lagrangean decomposition of combinatorial problems

Paul Swoboda (1), Jan Kuske (2), Bogdan Savchynskyy (3)
(1) IST Austria, (2) Heidelberg University, Germany, (3) TU Dresden, Germany
e-mail: [email protected]

Goal and contribution

Goal: Efficient dual block coordinate ascent solvers for a wide range of large-scale combinatorial problems from computer vision and beyond.

Contribution:
- We propose a new class of general LP-relaxations of ILP-problems, called Integer Relaxed Pairwise Separable Linear Programs (IRPS-LP).
- (IRPS-LP) generalizes the local polytope relaxation for CRFs.
- (IRPS-LP) allows explicit modelling of the allowed configurations, instead of excluding forbidden configurations with ∞-costs. This leads to more compact formulations and lower computational cost.
- Relaxations of the multicut [4] (poster #75) and graph matching [5] (poster #69 tomorrow afternoon) problems can be written as (IRPS-LP).
- We analyze dual block coordinate ascent for (IRPS-LP) and prove monotonic improvement of a dual lower bound.
- The popular algorithms TRWS [2], SRMP [3] and MPLP [1] for inference in CRFs are special cases.

Practical impact: Efficient algorithms for multicut and graph matching have been written in our framework (for details see the corresponding posters). A C++ implementation is available at https://github.com/pawelswoboda/LP_MP.

Problem formulation (IRPS-LP)

Factor graph $\mathbb{G} = (\mathbb{F}, \mathbb{E})$:
- factors $i \in \mathbb{F}$ with variables $\mu_i \in X_i \subseteq \{0,1\}^{d_i}$,
- couplings $ij \in \mathbb{E} \subseteq \mathbb{F}^2$ with constraints $A_{(i,j)} \mu_i = A_{(j,i)} \mu_j$.

(IRPS-LP):
$$\min_{\mu \in \Lambda_{\mathbb{G}}} \sum_{i=1}^{k} \langle \theta_i, \mu_i \rangle, \qquad \Lambda_{\mathbb{G}} := \left\{ (\mu_1, \ldots, \mu_k) \;:\; \begin{array}{ll} \mu_i \in \mathrm{conv}(X_i) & \forall i \in \mathbb{F} \\ A_{(i,j)} \mu_i = A_{(j,i)} \mu_j & \forall ij \in \mathbb{E} \end{array} \right\}.$$

[Figure: factor graph with four factors $\mu_1 \in X_1, \ldots, \mu_4 \in X_4$ and the couplings $A_{(1,2)}\mu_1 = A_{(2,1)}\mu_2$, $A_{(1,3)}\mu_1 = A_{(3,1)}\mu_3$, $A_{(2,4)}\mu_2 = A_{(4,2)}\mu_4$, $A_{(3,4)}\mu_3 = A_{(4,3)}\mu_4$.]

Example: MAP-inference in CRFs as (IRPS-LP)

CRF: Graph $G = (\mathcal{V}, \mathcal{E})$, label space $X = \prod_{u \in \mathcal{V}} X_u$, unary costs $\theta_u : X_u \to \mathbb{R}$ for all $u \in \mathcal{V}$, pairwise costs $\theta_{uv} : X_u \times X_v \to \mathbb{R}$ for all $uv \in \mathcal{E}$.
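To make the CRF notation above concrete, here is a minimal Python sketch that evaluates the total cost of a labeling as the sum of unary and pairwise costs and minimizes it by enumeration, which is the MAP-inference problem on a toy scale. The chain u - v - w, the label sets and all cost values are hypothetical and chosen only for illustration; enumeration is exponential in the number of nodes and stands in for no algorithm from the poster.

```python
from itertools import product

# Hypothetical toy CRF (not from the poster): chain u - v - w, two labels per node.
nodes = ["u", "v", "w"]
labels = {"u": [0, 1], "v": [0, 1], "w": [0, 1]}

# Unary costs theta_u(x_u), indexed by label.
unary = {"u": [0.0, 2.0], "v": [0.5, 0.0], "w": [1.0, 0.0]}

# Pairwise costs theta_uv(x_u, x_v): cost 1 whenever the two labels differ.
pairwise = {
    ("u", "v"): [[0.0, 1.0], [1.0, 0.0]],
    ("v", "w"): [[0.0, 1.0], [1.0, 0.0]],
}


def energy(x):
    """Sum of unary and pairwise costs of the labeling x (dict node -> label)."""
    total = sum(unary[u][x[u]] for u in nodes)
    total += sum(costs[x[u]][x[v]] for (u, v), costs in pairwise.items())
    return total


def brute_force_map():
    """Enumerate the whole label space X and return (minimal energy, labeling)."""
    best = None
    for assignment in product(*(labels[u] for u in nodes)):
        x = dict(zip(nodes, assignment))
        e = energy(x)
        if best is None or e < best[0]:
            best = (e, x)
    return best


best_energy, best_labeling = brute_force_map()
```

On this instance the unique minimizer is u = 0, v = 1, w = 1 with energy 1.0.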
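MAP-inference:
$$\min_{x \in X} \sum_{u \in \mathcal{V}} \theta_u(x_u) + \sum_{uv \in \mathcal{E}} \theta_{uv}(x_{uv}).$$

(IRPS-LP): $\mathbb{F} = \mathcal{V} \cup \mathcal{E}$, $\mathbb{E} = \{\{u, uv\}, \{v, uv\} : uv \in \mathcal{E}\}$.

Simplex: $\Delta_n = \{x \in \mathbb{R}^n_+ : \sum_{i=1}^{n} x_i = 1\}$.

$$\min_{\mu \in \mathcal{L}_G} \langle \theta, \mu \rangle := \sum_{u \in \mathcal{V}} \langle \theta_u, \mu_u \rangle + \sum_{uv \in \mathcal{E}} \langle \theta_{uv}, \mu_{uv} \rangle$$

The feasible set $\mathcal{L}_G$ is given by the following constraints (general (IRPS-LP) form on the left, its CRF instance on the right):
- $\mu_u \in \mathrm{conv}(X_u)$: $\mu_u \in \Delta_{|X_u|}$, $u \in \mathcal{V}$,
- $\mu_{uv} \in \mathrm{conv}(X_{uv})$: $\mu_{uv} \in \Delta_{|X_{uv}|}$, $uv \in \mathcal{E}$,
- $A_{(uv,u)} \mu_{uv} = A_{(u,uv)} \mu_u$: $\sum_{x_v \in X_v} \mu_{uv}(x_u, x_v) = \mu_u(x_u)$, and analogously for $v$.

[Figure: a CRF on the nodes $u, v, w$ and the same CRF as (IRPS-LP). Unary factors $\mu_u \in \Delta_{|X_u|}$, $\mu_v \in \Delta_{|X_v|}$, $\mu_w \in \Delta_{|X_w|}$ and pairwise factors $\mu_{uv} \in \Delta_{|X_u| \cdot |X_v|}$, $\mu_{vw} \in \Delta_{|X_v| \cdot |X_w|}$, $\mu_{uw} \in \Delta_{|X_u| \cdot |X_w|}$ are coupled by marginalization constraints such as $\mu_u(x_u) = \sum_{x_v} \mu_{uv}(x_{uv})$ and $\mu_v(x_v) = \sum_{x_u} \mu_{uv}(x_{uv})$.]

Example: graph matching

Given: CRF. The label spaces $X_u \subset \mathcal{L}$ are part of a universe $\mathcal{L}$.
$$\min_{x \in X} \underbrace{\sum_{u \in \mathcal{V}} \theta_u(x_u) + \sum_{uv \in \mathcal{E}} \theta_{uv}(x_{uv})}_{\text{CRF}} \quad \text{s.t.} \quad \underbrace{x_u \neq x_v \;\; \forall u \neq v}_{\text{uniqueness constraints}}$$

Example: $\mathcal{V} = \{u, v, w\}$, $\mathcal{L} = \{l, l', l''\}$.
- Unary factors: each node takes a label.
- Label factors: each label can be taken at most once.

[Figure: the nodes $u, v, w$ and the labels $l, l', l''$; each of the six factors is annotated with a constraint $\sum = 1$.]

(IRPS-LP):
$$\min_{\mu \in \mathcal{L}_G,\, \tilde{\mu}} \langle \theta, \mu \rangle \quad \text{s.t.}$$
- $\forall s \in \mathcal{L}: \sum_{u \in \mathcal{V}} \tilde{\mu}_s(u) \le 1$ (additional (IRPS-LP) factors for the uniqueness constraints),
- $\mu_u(s) = \tilde{\mu}_s(u)$, $s \in X_u$ (couplings between the CRF and the uniqueness constraints).

Example: multicut

Optimization problem: Weighted graph $G = (V, E)$. Find a partition of $G$.
$$\min_{x \in \{0,1\}^{|E|}} \sum_{e \in E} \theta_e x_e \quad \text{s.t.} \quad \underbrace{\forall \text{ cycles } C,\; \forall e' \in C: \sum_{e \in C \setminus \{e'\}} x_e \ge x_{e'}}_{\text{cycle inequalities}}$$

(IRPS-LP):
- Factors: $\mathbb{F} = E \cup \text{3-cycle}(E)$.
- Couplings: $\mathbb{E} = \{\{e, C\} : C \in \text{3-cycle}(E),\ e \in C\}$.

Motivation: Dual block coordinate ascent for CRFs

Observation: TRWS [2] is state of the art among LP-based methods for MAP-inference in CRFs.

Contribution: We develop dual block coordinate ascent methods for (IRPS-LP), inspired by TRWS [2].

Dual lower bound

Dualize the primal problem w.r.t. the coupling constraints:

Primal variables: $P := \{\mu = (\mu_1, \ldots, \mu_k) : \mu \in \mathrm{conv}(X_1) \times \cdots \times \mathrm{conv}(X_k)\}$.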
Coupling constraints: $A\mu = (A_{(i,j)} \mu_i - A_{(j,i)} \mu_j)_{ij \in \mathbb{E}} = 0$.

Maximize the dual lower bound: $\max_{\phi} \left[ D(\phi) := \min_{\mu \in P} \langle \theta, \mu \rangle + \langle \phi, A\mu \rangle \right]$.

Reparametrization: $\theta^{\phi} := \theta + A^{\top} \phi$. Since $\langle \phi, A\mu \rangle = \langle A^{\top}\phi, \mu \rangle$, the bound decomposes over the factors: $D(\phi) = \min_{\mu \in P} \langle \theta^{\phi}, \mu \rangle = \sum_{i \in \mathbb{F}} \min_{x_i \in X_i} \langle \theta_i^{\phi}, x_i \rangle$.

Elementary updates – Admissible messages

- For a factor $i \in \mathbb{F}$ take a subset of its neighbors $N \subseteq \{j : ij \in \mathbb{E}\}$.
- Consider the dual variables $\phi_{ij}$ related to the couplings with $j \in N$.
- Change $(\phi_{ij})_{j \in N}$ by $\Delta$ such that $D(\phi - \Delta) \ge D(\phi)$.

Illustration in the figure: factor 1, neighbors $N = \{2, 3\}$.

[Figure: the factor graph with factors $\mu_1 \in X_1, \ldots, \mu_4 \in X_4$; the dual variables $\phi_{12}$ and $\phi_{13}$ on the couplings of factor 1 are highlighted.]

Computation of $\Delta$ (conditions for admissibility):
- $\hat{x}_i \in \arg\min_{x_i \in X_i} \langle \theta_i^{\phi}, x_i \rangle$: optimal configuration before the update.
- $\hat{x}_i \in \arg\min_{x_i \in X_i} \langle \theta_i^{\phi - \Delta}, x_i \rangle$: the same configuration is optimal after the update.
- $\langle \theta_i^{\phi - \Delta}, \hat{x}_i \rangle \ge \langle \theta_i^{\phi}, \hat{x}_i \rangle$: costs do not decrease.

$$\Delta = \arg\max_{\Delta \text{ admissible}} \langle \delta, \theta_i^{\phi - \Delta} \rangle, \qquad \delta(s) \begin{cases} > 0, & \hat{x}_i(s) = 1 \\ < 0, & \hat{x}_i(s) = 0 \end{cases}$$

Algorithm

Visit all factors $i \in \mathbb{F}$ and perform elementary updates.

Special cases: TRWS [2], SRMP [3] and MPLP [1] for MAP-inference in CRFs.

Results: graph matching

[Figure: log(energy - lower bound) versus runtime (s) on the datasets car, motor and worms, comparing AMP, HUNGARIAN-BP and DD.]

For more details please see [5] and the corresponding poster #69 on Tuesday afternoon.

Results: multicut

[Figure: energy versus runtime (s) on the datasets knott-3d-150, knott-3d-300 and knott-3d-450, comparing MP-C, CGC, MC-ILP, CC-Fusion-RWS and CC-Fusion-RHC.]

For more details please see [4] and the corresponding poster #75.

References

[1] A. Globerson and T. S. Jaakkola. Fixing max-product: Convergent message passing algorithms for MAP LP-relaxations. In NIPS, pages 553–560, 2007.
[2] V. Kolmogorov. Convergent tree-reweighted message passing for energy minimization. IEEE Trans. Pattern Anal. Mach. Intell., 28(10):1568–1583, 2006.
[3] V. Kolmogorov. A new look at reweighted message passing. IEEE Trans. Pattern Anal. Mach. Intell., 37(5):919–930, 2015.
[4] P. Swoboda and B. Andres. A message passing algorithm for the minimum cost multicut problem. In CVPR, 2017.
[5] P. Swoboda, C. Rother, H. Abu Alhaija, D. Kainmueller, and B. Savchynskyy. Study of Lagrangean decomposition and dual ascent solvers for graph matching. In CVPR, 2017.

Acknowledgements. The first author was supported by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no 616160, the second author by DFG Grant GRK 1653, and the last author by the European Research Council under the European Union's Horizon 2020 research and innovation programme (grant agreement No 647769) and the DFG Grant "ERBI" SA 2640/1-1.
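Supplementary sketch. As an illustration of the dual lower bound and of monotone elementary updates described on this poster, the following Python sketch runs a simple block coordinate ascent on a toy CRF. It is only a sketch under stated assumptions: the instance (chain u - v - w with hypothetical costs) is invented, and the update rule is a min-marginal "collect and redistribute" step in the spirit of MPLP [1], not the poster's computation of Δ and not the LP_MP implementation. Costs are stored directly in reparametrized form, so the dual variables are implicit.

```python
from itertools import product


def lower_bound(unary, pairwise):
    """Dual bound: every (reparametrized) factor is minimized independently."""
    lb = sum(min(costs) for costs in unary.values())
    lb += sum(min(min(row) for row in table) for table in pairwise.values())
    return lb


def collect(unary, pairwise, node):
    """Move the min-marginals of all incident pairwise factors into the unary factor."""
    for (u, v), table in pairwise.items():
        if node == u:
            msg = [min(row) for row in table]        # min over x_v for each x_u
            for a in range(len(table)):
                table[a] = [c - msg[a] for c in table[a]]
        elif node == v:
            msg = [min(col) for col in zip(*table)]  # min over x_u for each x_v
            for a in range(len(table)):
                table[a] = [c - msg[b] for b, c in enumerate(table[a])]
        else:
            continue
        unary[node] = [c + m for c, m in zip(unary[node], msg)]


def distribute(unary, pairwise, node):
    """Push the unary costs back, split evenly over the incident pairwise factors."""
    incident = [edge for edge in pairwise if node in edge]
    if not incident:
        return
    share = [c / len(incident) for c in unary[node]]
    for (u, v) in incident:
        table = pairwise[(u, v)]
        for a in range(len(table)):
            for b in range(len(table[a])):
                table[a][b] += share[a] if node == u else share[b]
    unary[node] = [0.0] * len(share)


def min_energy(unary, pairwise):
    """Brute-force optimum of the integer problem (tiny instances only)."""
    nodes = list(unary)
    best = float("inf")
    for assignment in product(*(range(len(unary[u])) for u in nodes)):
        x = dict(zip(nodes, assignment))
        e = sum(unary[u][x[u]] for u in nodes)
        e += sum(t[x[u]][x[v]] for (u, v), t in pairwise.items())
        best = min(best, e)
    return best


def dual_ascent(unary, pairwise, sweeps=3):
    """Visit all nodes repeatedly; record the bound after every elementary step."""
    bounds = [lower_bound(unary, pairwise)]
    for _ in range(sweeps):
        for node in list(unary):
            collect(unary, pairwise, node)
            bounds.append(lower_bound(unary, pairwise))
            distribute(unary, pairwise, node)
            bounds.append(lower_bound(unary, pairwise))
    return bounds


# Hypothetical toy instance: chain u - v - w with two labels per node.
unary = {"u": [0.0, 2.0], "v": [0.5, 0.0], "w": [1.0, 0.0]}
pairwise = {
    ("u", "v"): [[0.0, 1.0], [1.0, 0.0]],
    ("v", "w"): [[0.0, 1.0], [1.0, 0.0]],
}
optimum = min_energy(unary, pairwise)
bounds = dual_ascent(unary, pairwise)
```

Each step only shifts cost between a unary factor and its incident pairwise factors, so the energy of every labeling is unchanged while the recorded bounds never decrease and never exceed the optimum. Because this toy graph is a tree, the bound becomes tight here; on graphs with cycles such updates can stall below the optimum.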