All material will be online after the conference: http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/
Advanced Topics – Optimizing Higher-Order MRFs
Carsten Rother
Microsoft Research Cambridge
Challenging Optimization Problems
• How to solve higher-order MRFs:
• Possible Approaches:
- Convert to Pairwise MRF (Pushmeet has explained)
- Branch & MinCut (Pushmeet has explained)
- Add global constraint to LP relaxation
- Dual Decomposition
Add global constraints to LPBasic idea:
References:[K. Kolev et al. ECCV’ 08] silhouette constraint [Nowizin et al. CVPR ‘09+ connectivity prior[Lempitsky et al ICCV ‘09+ bounding box prior (see talk on Thursday)
T
∑i Є T
Xi ≥ 1
See talk on Thursday: [Lempitsky et al ICCV ‘09+ bounding box prior
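To make the effect of such a constraint concrete, here is a minimal brute-force sketch (my own toy, not the LP relaxation or any of the cited solvers; the unary costs and pixel set T are made up):

```python
from itertools import product

def minimize_with_global_constraint(unary, T):
    """Minimize sum_i unary[i] * x_i over binary labelings x, subject to
    the global constraint sum_{i in T} x_i >= 1 (e.g. at least one pixel
    on a silhouette ray must be labeled foreground)."""
    n = len(unary)
    best_x, best_e = None, float("inf")
    for x in product((0, 1), repeat=n):
        if sum(x[i] for i in T) < 1:      # violates the global constraint
            continue
        e = sum(c * xi for c, xi in zip(unary, x))
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e

# All-positive unaries favour the all-zero labeling, but the constraint
# on T = {1, 2} forces the cheapest pixel in T (pixel 1) to switch on.
x, e = minimize_with_global_constraint([3.0, 1.0, 2.0, 5.0], {1, 2})
```

Without the constraint the minimizer would be all-background; with it, exactly the cheapest pixel of T becomes foreground.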
Dual Decomposition
• Well known in the optimization community [Bertsekas ’95, ’99]
• Other names: “Master-Slave” [Komodakis et al. ’07, ’09]
• Examples of dual-decomposition approaches:
– Solving the LP of TRW [Komodakis et al. ICCV ’07]
– Image segmentation with connectivity prior [Vicente et al. CVPR ’08]
– Feature matching [Torresani et al. ECCV ’08]
– Optimizing higher-order clique MRFs [Komodakis et al. CVPR ’09]
– Marginal Probability Field [Woodford et al. ICCV ’09]
– Jointly optimizing appearance and segmentation [Vicente et al. ICCV ’09]
Dual Decomposition
min_x E(x) = min_x [ E1(x) + θTx + E2(x) – θTx ]   (hard to optimize)
≥ min_x1 [E1(x1) + θTx1] + min_x2 [E2(x2) – θTx2] = L(θ)   (“lower bound”; each subproblem possible to optimize)
• θ is called the dual vector (same size as x)
• Goal: max_θ L(θ) ≤ min_x E(x)
• Properties:
– L(θ) is concave (the optimal bound can be found)
– If x1 = x2 then the problem is solved (not guaranteed)
Why is the lower bound a concave function?
L(θ) = min_x1 [E1(x1) + θTx1] + min_x2 [E2(x2) – θTx2],   L(θ): Rn → R
L1(θ) = min_x1 [E1(x1) + θTx1] is the pointwise minimum of the linear functions θ → E1(x’1) + θTx’1, one per labeling (θTx’1, θTx’’1, θTx’’’1, …), i.e. a lower envelope of planes, so L1 is concave; the same holds for L2(θ).
L(θ) is concave since it is a sum of concave functions.
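The lower-envelope argument can be checked numerically. Below, E1 is a made-up toy energy on length-3 binary labelings (an illustration, not any energy from the talk), and the midpoint test verifies that L1 is concave:

```python
from itertools import product
import random

def E1(x):
    # made-up toy energy on binary labelings of length 3
    return 2.0 * x[0] - 1.5 * x[1] + x[0] * x[2]

def L1(theta):
    """L1(theta) = min_x [E1(x) + theta . x]: the pointwise minimum
    (lower envelope) of one linear function per labeling x, hence concave."""
    return min(E1(x) + sum(t * xi for t, xi in zip(theta, x))
               for x in product((0, 1), repeat=3))

random.seed(0)
for _ in range(100):
    a = [random.uniform(-3, 3) for _ in range(3)]
    b = [random.uniform(-3, 3) for _ in range(3)]
    mid = [(ai + bi) / 2 for ai, bi in zip(a, b)]
    # concavity: the value at the midpoint dominates the average of the ends
    assert L1(mid) >= (L1(a) + L1(b)) / 2 - 1e-12
```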
How to maximize the lower bound?
If L(θ) were differentiable we could use gradient ascent; L(θ) is not differentiable, so we use a subgradient approach [Shor ’85].
At θ’, a subgradient of L1 is the minimizer x’1 of E1(x1) + θ’Tx1, giving the step θ’’ = θ’ + λ x’1; for the full bound L the subgradient is g = x1 – x2, so the update is
θ’’ = θ’ + λ (x1 – x2)
Dual Decomposition
L(θ) = min_x1 [E1(x1) + θTx1] + min_x2 [E2(x2) – θTx2]
“Master”: subgradient optimization, θ ← θ + λ (x1 – x2)
“Slaves”:
Subproblem 1: x1 = argmin_x1 [E1(x1) + θTx1]
Subproblem 2: x2 = argmin_x2 [E2(x2) – θTx2]
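The master-slave loop can be sketched end to end. Everything below is a toy of my own: the slaves are brute-force enumerators standing in for graph cut etc., and the diminishing step size λ = 1/t is one standard choice, not the schedule from any cited paper:

```python
from itertools import product

def brute_min(f, n):
    """Slave oracle: exact minimizer of f over binary labelings (toy sizes only)."""
    return min(product((0, 1), repeat=n), key=f)

def dual_decomposition(E1, E2, n, iters=100):
    """Master: subgradient ascent on
    L(theta) = min_x1 [E1(x1) + theta.x1] + min_x2 [E2(x2) - theta.x2]."""
    theta = [0.0] * n
    best_bound, best_x, best_energy = float("-inf"), None, float("inf")
    for t in range(1, iters + 1):
        dot = lambda v, x: sum(vi * xi for vi, xi in zip(v, x))
        x1 = brute_min(lambda x: E1(x) + dot(theta, x), n)   # slave 1
        x2 = brute_min(lambda x: E2(x) - dot(theta, x), n)   # slave 2
        L = E1(x1) + dot(theta, x1) + E2(x2) - dot(theta, x2)
        best_bound = max(best_bound, L)          # L may go down; keep the best
        for x in (x1, x2):                       # best primal proposal so far
            if E1(x) + E2(x) < best_energy:
                best_x, best_energy = x, E1(x) + E2(x)
        lam = 1.0 / t                            # diminishing step size
        theta = [ti + lam * (a - b) for ti, a, b in zip(theta, x1, x2)]
    return best_x, best_energy, best_bound

# Toy energies: E1 = unaries, E2 = a Potts chain penalizing label changes.
unary = [1.0, -2.0, 0.5, -1.0]
E1 = lambda x: sum(u * xi for u, xi in zip(unary, x))
E2 = lambda x: sum(1.0 for a, b in zip(x, x[1:]) if a != b)
x, e, bound = dual_decomposition(E1, E2, 4)
```

By weak duality the returned bound never exceeds the returned energy; for this toy the slaves quickly propose the exhaustive optimum E = –1.5.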
Example optimization
• Guaranteed to converge to the optimal bound L(θ*)
• The step size λ must be chosen correctly ([Bertsekas ’95])
• Pick as solution x the better of x1 and x2
• Both E and L can increase and decrease during the optimization
• Each step brings θ closer to the optimal θ*
Why can the lower bound go down?
L(θ) is the lower envelope of planes in 3D: a subgradient step moves θ towards θ*, yet the value at the new point can be lower, i.e. L(θ’) ≤ L(θ) is possible.
Analyse the model
L(θ) = min_x1 [E1(x1) + θTx1] + min_x2 [E2(x2) – θTx2]
Update step: θ’’ = θ’ + λ (x1 – x2)
Look at pixel p:
Case 1: x1p = x2p, then θ’’p = θ’p
Case 2: x1p = 1, x2p = 0, then θ’’p = θ’p + λ  (pushes x1p towards 0 and x2p towards 1)
Case 3: x1p = 0, x2p = 1, then θ’’p = θ’p – λ  (pushes x1p towards 1 and x2p towards 0)
Example 1: Segmentation and Connectivity
[Vicente et al. ’08]
The foreground object must be connected:
(Figure: user input; standard MRF; standard MRF + h; zoom-in)
E(x) = ∑i θi (xi) + ∑i,j θij (xi,xj) + h(x)
h(x) = { ∞ if x is not 4-connected; 0 otherwise }
Example 1: Segmentation and Connectivity
Write E(x) = E1(x) + h(x), where E1(x) collects the unary and pairwise terms.
Derive the lower bound:
min_x E(x) = min_x [ E1(x) + θTx + h(x) – θTx ]
≥ min_x1 [E1(x1) + θTx1] + min_x2 [h(x2) – θTx2] = L(θ)
Subproblem 1: unary + pairwise terms; global minimum via GraphCut
Subproblem 2: unary terms + connectivity constraint; global minimum via Dijkstra
But: the lower bound was not tight for any example.
Example 1: Segmentation and Connectivity
Refined decomposition: let x’ be the indicator vector of all pairwise terms, with dual vector θ’. Since h(x) Є {0, ∞}, h can be used twice:
min_x E(x) = min_x [ E1(x) + θTx + θ’Tx’ + h(x) – θTx + h(x) – θ’Tx’ ]
≥ min_x1,x’1 [E1(x1) + θTx1 + θ’Tx’1] + min_x2 [h(x2) – θTx2] + min_x3,x’3 [h(x3) – θ’Tx’3] = L(θ,θ’)
Subproblem 1: unary + pairwise terms; global minimum via GraphCut
Subproblem 2: unary terms + connectivity constraint; global minimum via Dijkstra
Subproblem 3: pairwise terms + connectivity constraint; lower bound based on minimal paths on a dual graph
Results: Segmentation and Connectivity
[Vicente et al. ’08]
Global optimum in 12 out of 40 cases.
(Figure: image, extra input, GraphCut, GlobalMin)
A heuristic method, DijkstraGC, is faster and gives empirically the same or better results.
Example 2: Dual of the LP Relaxation (from Pawan Kumar’s part)
[Wainwright et al. 2001]
Decompose the grid graph (nodes Va … Vi) into trees i = 1, …, 6 (the row and column chains), with one dual vector θi per tree.
Notation: θ = (θa0, θa1, …, θab00, θab01, …),   x = (xa0, xa1, …, xab00, xab01, …)
min_x θTx = min_x ∑i θiTx   (“original problem”, using ∑i θi = θ)
≥ ∑i min_xi θiTxi = ∑i q*(θi) = L({θi})   (“i different trees”; “lower bound”)
where q*(θi) = min_xi θiTxi is the optimum of tree i.
Dual of the LP: max over {θi} of L({θi}), subject to ∑i θi = θ.
Projected subgradient method:
θi ← [θi + λ xi]Ω,   Ω = { {θi} | ∑i θi = θ }
Why use a subgradient? q*(θi) = min_xi θiTxi is concave with respect to θi (the lower envelope of the linear functions θiTx’i), but not differentiable.
Guaranteed to get the optimal lower bound!
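A sketch of the projection step [·]Ω with hypothetical small vectors: the Euclidean projection onto ∑i θi = θ just spreads the constraint violation equally over the trees (which makes the update equivalent to the familiar θi ← θi + λ(xi – mean of the xi)):

```python
def project(thetas, theta):
    """Euclidean projection of dual vectors {theta_i} onto
    Omega = { {theta_i} : sum_i theta_i = theta }: subtract an equal
    share of the constraint violation from every theta_i."""
    m, n = len(thetas), len(theta)
    excess = [sum(t[j] for t in thetas) - theta[j] for j in range(n)]
    return [[t[j] - excess[j] / m for j in range(n)] for t in thetas]

def projected_subgradient_step(thetas, theta, minimizers, lam):
    """One update theta_i <- [theta_i + lam * x_i]_Omega, where x_i is
    the 0/1 minimizer reported by tree i."""
    stepped = [[tj + lam * xj for tj, xj in zip(t, x)]
               for t, x in zip(thetas, minimizers)]
    return project(stepped, theta)

# Two trees sharing two coordinates of theta; step size 0.5.
theta = [1.0, 2.0]
thetas = [[0.5, 1.0], [0.5, 1.0]]
new = projected_subgradient_step(thetas, theta, [[1, 0], [0, 1]], 0.5)
```

After every step the duals again sum to θ, so the bound L({θi}) remains valid.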
Example 2: optimize the LP of TRW
[Komodakis et al. ’07]
Compared to TRW-S:
• TRW-S is not guaranteed to get the optimal bound (DD is)
• The TRW-S lower bound always goes up (the DD bound may not)
• TRW-S needs min-marginals (DD does not)
• DD is parallelizable (every tree in DD can be optimized separately)
Not NP-hard (Pushmeet Kohli’s part)
Example 3: A global perspective on low-level vision
[Woodford et al. ICCV ’09] (see poster on Friday)
Add a global term which enforces a match with the marginal statistics:
E(x) = ∑i θi (xi) + ∑i,jЄN θij (xi,xj) + f(∑i xi)
where f is a cost on the foreground count ∑i xi, ranging from 0 to n.
(Figure: global unary, err. 12.8%)
Split E = E1 + E2, with E2 the global cost f, and “solve with dual-decomposition”.
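The global slave min_x [f(∑i xi) – θTx] can be solved exactly by scanning the count: for a fixed number k of ones, –θTx is minimized by switching on the k pixels with the largest θ. A sketch with a toy f and θ of my own (it mirrors the structure of the decomposition above, not Woodford et al.’s implementation):

```python
def solve_count_slave(theta, f):
    """Exactly minimize f(sum_i x_i) - theta . x over binary x by trying
    every count k and, for each k, taking the k largest entries of theta."""
    n = len(theta)
    order = sorted(range(n), key=lambda i: -theta[i])   # largest theta first
    best_k, best_val, prefix = 0, f(0), 0.0
    for k in range(1, n + 1):
        prefix += theta[order[k - 1]]                   # sum of top-k thetas
        if f(k) - prefix < best_val:
            best_k, best_val = k, f(k) - prefix
    x = [0] * n
    for i in order[:best_k]:
        x[i] = 1
    return x, best_val

# Toy marginal-statistic cost preferring exactly two foreground pixels.
x, val = solve_count_slave([2.0, -1.0, 3.0, 0.5], lambda k: float((k - 2) ** 2))
```

The scan costs O(n log n) for the sort, so this slave stays cheap even though f couples all pixels.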
Example 3: A global perspective on low-level vision
(Figures: image synthesis with a global color-distribution prior [Kwatra ’03], input vs. result; image de-noising with a global gradient prior: noisy input, pairwise MRF, global gradient prior, ground truth, gradient strength)
Example 4: Solve GrabCut globally optimal
[Vicente et al. ICCV ’09] (see poster on Tuesday)
E(x,w) = ∑i θi (xi,w) + ∑i,j θij (xi,xj),   E(x,w): {0,1}n x {GMMs} → R
where w is the color model. Minimizing w out, E’(x) = min_w E(x,w), turns the highly connected MRF over (x,w) into a higher-order MRF over x alone.
Example 4: Solve GrabCut globally optimal
After minimizing out the color models, the energy has the form
E(x) = g(∑i xi) + ∑b fb(∑i xib) + ∑i,jЄN θij (xi,xj)
• g is convex on [0, n] with its minimum at n/2: it prefers an “equal area” segmentation
• each fb is concave on [0, max]: each color bin b prefers to be either fore- or background
Split into E1 and E2 and “solve with dual-decomposition”.
Example 4: Solve GrabCut globally optimal
[Vicente et al. ICCV ’09] (see poster on Tuesday)
Globally optimal in 60% of cases, such as…
Summary
• Dual Decomposition is a powerful technique for challenging MRFs
• Not guaranteed to give globally optimal energy
• … but for several vision problems we get tight bounds