Approximability of Combinatorial Optimization Problems with Submodular Cost Functions
Pushkar Tripathi, Georgia Institute of Technology
Based on joint work with Gagan Goel, Chinmay Karande, and Wang Lei
Slide 2
Motivation: Network Design Problem
Objective: Find a minimum spanning tree that can be built collaboratively by these agents.
[Figure: network with agents labeled f, g, and h]
Slide 3
Additive Cost Function
 - cost(a) = 1, cost(b) = 1, cost(a,b) = 2
Functions which capture economies of scale
 - cost(a) = 1, cost(b) = 1, cost(a,b) = 1.5
How to mathematically model these functions?
 - We use submodular functions as a starting point.
Can one design efficient approximation algorithms under submodular cost functions?
General Framework
 - Ground set X and collection C ⊆ 2^X
   (C: the set of all tours, the set of all spanning trees, ...)
 - k agents; agent i specifies f_i : 2^X → R_+
 - Each f_i is submodular and monotone
 - Find S_1, ..., S_k such that ∪_i S_i ∈ C and ∑_i f_i(S_i) is minimized
 - Each function is accessed through a value ORACLE: query S, receive f(S)
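A minimal sketch of the value-oracle model, assuming a tiny ground set so that monotonicity and submodularity can be checked by brute force. The helper names (`powerset`, `is_monotone_submodular`, `table_oracle`, `total_cost`) are hypothetical and not from the talk; the two example tables reuse the cost(a), cost(b), cost(a,b) values from the economies-of-scale slide.

```python
from itertools import combinations

def powerset(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_monotone_submodular(f, ground):
    """Brute-force check; only sensible for very small ground sets."""
    sets = list(powerset(ground))
    for A in sets:
        for B in sets:
            if A <= B and f(A) > f(B):
                return False          # monotonicity violated
            if f(A | B) + f(A & B) > f(A) + f(B):
                return False          # submodularity violated
    return True

def table_oracle(table):
    """Value oracle backed by an explicit table: query S, receive f(S)."""
    return lambda S: table[frozenset(S)]

def total_cost(oracles, assignment):
    """Multi-agent objective: sum over agents i of f_i(S_i)."""
    return sum(f(S) for f, S in zip(oracles, assignment))

# Economies of scale: cost(a) = cost(b) = 1, cost(a,b) = 1.5.
scale = table_oracle({frozenset(): 0, frozenset("a"): 1,
                      frozenset("b"): 1, frozenset("ab"): 1.5})
# Additive variant: cost(a,b) = 2.
additive = table_oracle({frozenset(): 0, frozenset("a"): 1,
                         frozenset("b"): 1, frozenset("ab"): 2})
```

Both tables pass the check, since an additive function is a special case of a submodular one. The feasibility constraint (the union of the S_i must lie in C) is left out of this sketch.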
Slide 6
Our Results
 - Lower bounds: information theoretic
 - Upper bounds: rounding of configurational LPs, approximating submodular functions, and greedy

Problem            Single Agent                  Multiple Agents
                   Upper Bound    Lower Bound    Upper Bound    Lower Bound
Vertex Cover       2              2 - ε          2 log n        Ω(log n)
Shortest Path      O(n^{2/3})     Ω(n^{2/3})     O(n^{2/3})     Ω(n^{2/3})
Spanning Tree      n              Ω(n)           n              Ω(n)
Perfect Matching   n              Ω(n)           n              Ω(n)
Slide 7
Selected Related Work
 - [Grötschel, Lovász, Schrijver 81] Minimizing a non-monotone submodular function is poly-time.
 - [Feige, Mirrokni, Vondrák 07] Maximizing a non-monotone function is hard; 2/5-approximation algorithm.
 - [Calinescu, Chekuri, Pál, Vondrák 08] Maximizing a monotone function subject to a matroid constraint: (1 - 1/e)-approximation.
 - [Svitkina, Fleischer 09] Upper and lower bounds for submodular load balancing, sparsest cut, balanced cut.
 - [Iwata, Nagano 09] Bounds for submodular vertex cover, set cover.
 - [Chekuri, Ene 10] Bounds for submodular multiway partition.
Slide 8
In this talk
Submodular Shortest Path with a single agent
 - O(n^{2/3})-approximation algorithm
 - Matching hardness of approximation
Slide 10
Submodular Shortest Path
Given:
 - Graph G = (V, E) with |V| = n, |E| = m
 - Two nodes s and t
 - f : 2^E → R_+, submodular and monotone
Goal: Find an s-t path P such that f(P) is minimized.
Slide 11
Attempt 1: Approximate by an Additive Function
Let w_e = f({e}).
Idea: for every e ∈ OPT, w_e ≤ f(OPT) ≤ ∑_{e ∈ OPT} w_e
 1. Guess e* = argmax{ w_e | e ∈ OPT }
 2. Pruning: Remove edges costlier than e*
 3. Search: Find the shortest-length (fewest edges) s-t path in the residual graph
Analysis: ALG ≤ diameter(G) · w_{e*} ≤ diameter(G) · OPT
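The three steps above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the graph is an undirected edge list, f is a value oracle on frozensets of edges, and "guessing" e* is simulated by trying every distinct singleton weight as the pruning threshold. The function names are hypothetical.

```python
from collections import deque

def fewest_edge_path(edges, s, t):
    """BFS on an undirected edge list; returns the path as a list of edges, or None."""
    adj = {}
    for (u, v) in edges:
        adj.setdefault(u, []).append((v, (u, v)))
        adj.setdefault(v, []).append((u, (u, v)))
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            break
        for v, e in adj.get(u, []):
            if v not in parent:
                parent[v] = (u, e)
                queue.append(v)
    if t not in parent:
        return None
    path, node = [], t
    while parent[node] is not None:
        prev, e = parent[node]
        path.append(e)
        node = prev
    return path[::-1]

def additive_attempt(edges, f, s, t):
    """Try every guess for w_{e*}, prune costlier edges, search by BFS."""
    w = {e: f(frozenset([e])) for e in edges}           # w_e = f({e})
    best_path, best_cost = None, float("inf")
    for threshold in sorted(set(w.values())):            # each possible guess
        kept = [e for e in edges if w[e] <= threshold]   # pruning
        path = fewest_edge_path(kept, s, t)               # search
        if path is not None:
            cost = f(frozenset(path))
            if cost < best_cost:
                best_path, best_cost = path, cost
    return best_path, best_cost
```

Trying all thresholds costs at most |E| BFS runs plus one oracle call per edge and per candidate path, so the sketch stays polynomial in the number of oracle queries.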
Slide 12
Attempt 2: Ellipsoid Approximation
John's theorem: For every polytope P, there exists an ellipsoid contained in it that can be scaled by a factor of O(√n) to contain P.
[GHIM 09]: If the convex body is a polymatroid, then there is a poly-time algorithm to compute the ellipsoid.
[Figure: polytope P with its inner and outer ellipsoids]
Slide 13
Attempt 2: Ellipsoid Approximation
[GHIM 09]: If the convex body is a polymatroid, then there is a poly-time algorithm to compute the ellipsoid.
Polymatroid P (f: submodular, monotone):
 - ∀ S: ∑_{e ∈ S} x(e) ≤ f(S)
 - ∀ e: x(e) ≥ 0
Slide 14
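Approximating Submodular Functions
 - f: monotone submodular function on ground set X, |X| = n
 - g(S) = √( ∑_{e ∈ S} d_e ), with the weights d_1, ..., d_n computable in polynomial time
 - g(S) ≤ f(S) ≤ √n · g(S)
[Figure: ground set X with a weight d_e attached to each element]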
Slide 15
Attempt 2: Ellipsoid Approximation
f : 2^E → R_+, submodular and monotone
STEP 1: Compute the weights {d_e} of the ellipsoidal approximation g(S) := √( ∑_{e ∈ S} d_e ) [GHIM 09]
STEP 2: Min g(S) s.t. S ∈ PATH(s,t)
 * Minimizing over g(S) is equivalent to minimizing just the additive part ∑_{e ∈ S} d_e
Analysis: f(P) ≤ √|E| · g(P) ≤ √|E| · g(O) ≤ √|E| · f(O)
 - P: optimum path under g
 - O: optimum path under f
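Because the square root is monotone, STEP 2 reduces to an ordinary shortest-path computation with edge lengths d_e. A minimal sketch, assuming the weights d_e are already given as a dictionary (computing them via [GHIM 09] is not shown) and using the hypothetical name `min_g_path`:

```python
import heapq
import math

def min_g_path(d, s, t):
    """d maps undirected edges (u, v) to nonnegative weights d_e.
    Returns (list of path edges, g(path)), with g = sqrt of the summed d_e."""
    adj = {}
    for (u, v), w in d.items():
        adj.setdefault(u, []).append((v, w, (u, v)))
        adj.setdefault(v, []).append((u, w, (u, v)))
    dist = {s: 0.0}
    parent = {s: None}
    heap = [(0.0, s)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > dist.get(u, math.inf):
            continue                      # stale heap entry
        for v, w, e in adj.get(u, []):
            nd = du + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                parent[v] = (u, e)
                heapq.heappush(heap, (nd, v))
    if t not in dist:
        return None, math.inf
    path, node = [], t
    while parent[node] is not None:
        prev, e = parent[node]
        path.append(e)
        node = prev
    return path[::-1], math.sqrt(dist[t])
```

Dijkstra minimizes the additive part ∑ d_e; taking the square root at the end gives g for the returned path.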
Slide 16
Recap
 - Approximating by linear functions: works for graphs with small diameter
 - Approximating by ellipsoid functions: works for sparse graphs
[Figure: dense graph with large diameter, annotated n/2]
Slide 17
Algorithm for Shortest Path
STEP 1: Pruning
 - Guess edge e* = argmax { w_e | e ∈ OPT path }
 - Remove edges costlier than w_{e*}
Slide 18
Algorithm for Shortest Path
STEP 1: Pruning
 - Guess edge e* = argmax { w_e | e ∈ OPT path }
 - Remove edges costlier than w_{e*}
STEP 2: Contraction
 - If ∃ v s.t. degree(v) > n^{1/3}, contract the neighborhood of v
 - Repeat
Slide 19
[Figure: s-t graph before and after contraction; a dense connected component is highlighted]
Slide 20
Algorithm for Shortest Path
STEP 1: Pruning
 - Let w_e = f({e})
 - Guess edge e* = argmax { w_e | e ∈ OPT path }
 - Remove edges costlier than w_{e*}
STEP 2: Contraction
 - If ∃ v s.t. degree(v) > n^{1/3}, contract the neighborhood of v
 - Repeat
STEP 3: Ellipsoid Approximation
 - Calculate the ellipsoidal approximation (d, g) for the residual graph
Slide 21
Algorithm for Shortest Path
STEP 1: Pruning
 - Let w_e = f({e})
 - Guess edge e* = argmax { w_e | e ∈ OPT path }
 - Remove edges costlier than w_{e*}
STEP 2: Contraction
 - If ∃ v s.t. degree(v) > n^{1/3}, contract the neighborhood of v
 - Repeat
STEP 3: Ellipsoid Approximation
 - Calculate the ellipsoidal approximation (d, g) for the residual graph
STEP 4: Search
 - Find the shortest s-t path according to g
Slide 22
[Figure: graph with terminals s and t]
Slide 23
Algorithm for Shortest Path
STEP 1: Pruning
 - Let w_e = f({e})
 - Guess edge e* = argmax { w_e | e ∈ OPT path }
 - Remove edges costlier than w_{e*}
STEP 2: Contraction
 - If ∃ v s.t. degree(v) > n^{1/3}, contract the neighborhood of v
 - Repeat
STEP 3: Ellipsoid Approximation
 - Calculate the ellipsoidal approximation (d, g) for the residual graph
STEP 4: Search
 - Find the shortest s-t path according to g
STEP 5: Reconstruction
 - Replace the path through each contracted vertex with one having the fewest edges
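The contraction step can be sketched as below. It is one plausible reading of "contract neighborhood of v", assumed here to mean merging v and all of its current neighbours into a single supervertex, dropping self-loops and parallel edges. The function name `contract_high_degree` and the adjacency-set representation are illustrative choices, and `threshold` (a nonnegative number) stands in for n^{1/3}.

```python
def contract_high_degree(adj, threshold):
    """adj: dict vertex -> set of neighbours (undirected, simple graph).
    Repeatedly merges any vertex of degree > threshold with its neighbours.
    Returns (contracted_adj, groups), where groups maps each surviving
    vertex to the set of original vertices merged into it."""
    adj = {v: set(nb) for v, nb in adj.items()}      # work on a copy
    groups = {v: {v} for v in adj}
    while True:
        heavy = next((v for v in adj if len(adj[v]) > threshold), None)
        if heavy is None:
            return adj, groups                       # every degree <= threshold
        neighbours = set(adj[heavy])
        merged = neighbours | {heavy}
        outside = set()
        for u in merged:
            outside |= adj[u]
        outside -= merged                            # vertices bordering the blob
        for u in neighbours:
            groups[heavy] |= groups.pop(u)           # remember what was merged
            del adj[u]
        for w in outside:
            adj[w] -= merged
            adj[w].add(heavy)                        # one edge to the supervertex
        adj[heavy] = outside
```

After the loop every remaining vertex has degree at most `threshold`, which is the property the analysis of the residual graph relies on. The `groups` map is what a reconstruction step would need in order to expand each supervertex back into a fewest-edge path.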
Slide 24
[Figure: s-t graph; inside each contracted component the path having the fewest edges is used]
Slide 25
Analysis
[Figure: s-t path with parts labeled P1 and P2; region labeled R]
Slide 26
Bounding the cost of P1
[Figure: s-t path with parts labeled P1 and P2]
 - R has at most n^{4/3} edges
 - f(P_1) ≤ √|E(R)| · g(P_1) ≤ √|E(R)| · g(OPT) ≤ √|E(R)| · f(OPT) ≤ n^{2/3} · f(OPT)
Slide 27
Bounding the cost of P2
[Figure: s-t path through contracted components G_1, G_2, G_3]
 - Diam(G_i) ≤ |G_i| / n^{1/3}
 - f(P_2) ≤ (dia(G_1) + ... + dia(G_k)) · w_{e*}
          ≤ (|G_1| / n^{1/3} + ... + |G_k| / n^{1/3}) · w_{e*}
          ≤ (n / n^{1/3}) · w_{e*}
          ≤ n^{2/3} · f(OPT)
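The two bounds combine into the overall guarantee. A short derivation, under the assumption (not spelled out on the slides) that the output path P is the union of P_1, its edges in the residual graph R, and P_2, its edges inside the contracted components. The second line uses submodularity together with nonnegativity of f, which give f(A ∪ B) ≤ f(A) + f(B):

```latex
\begin{align*}
f(P) &= f(P_1 \cup P_2) \\
     &\le f(P_1) + f(P_2) \\
     &\le n^{2/3}\, f(\mathrm{OPT}) + n^{2/3}\, f(\mathrm{OPT}) \\
     &= 2\, n^{2/3}\, f(\mathrm{OPT}) \;=\; O(n^{2/3}) \cdot f(\mathrm{OPT}).
\end{align*}
```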
Slide 28
In this talk
Submodular Shortest Path with a single agent
 - O(n^{2/3})-approximation algorithm
 - Matching hardness of approximation
Slide 29
Information Theoretic Lower Bound
 - Polynomial number of queries to the oracle
 - Algorithm is allowed an unbounded amount of time to process the results of the queries
 - Not contingent on P vs NP
[Figure: oracle for f answering queries S1, S2, S3 with f(S1), f(S2), f(S3)]
Slide 30
General Technique
 - Cost functions f, g satisfying OPT(f) >> OPT(g)
 - f(S) = g(S) for most sets S
 - A: any randomized algorithm
 - f(Q) = g(Q) with high probability for every query Q made by A (probability over the random bits in A)
Slide 31
Yao's Lemma
 - Goal: f(Q) = g(Q) with high probability for every query Q made by a randomized algorithm A.
 - It suffices to exhibit f and a distribution D from which we choose g, such that for an arbitrary query Q, f(Q) = g(Q) with high probability.
Slide 32
Non-combinatorial Setting
 - X: ground set
 - f(S) = min{ |S|, α }
 - D: random R ⊆ X with |R| = α
 - g_R(S) = min{ |S ∩ R^c| + min(|S ∩ R|, β), α }
Slide 33
Optimal Query
Claim: The optimal query has size α.
Case 1: |Q| < α
 - The probability that f(Q) ≠ g_R(Q) can only increase if we increase |Q|.
Slide 34
Case 2: |Q| > α
 - The probability that f(Q) ≠ g_R(Q) can only increase if we decrease |Q|.
The optimal query size to distinguish f and g_R is α.
Slide 35
Distinguishing f and g_R
 - β = (1 + δ) · E[|Q ∩ R|]
 - By Chernoff bounds, f and g_R are hard to distinguish.
Slide 36
Hardness of learning submodular functions
 - Set α = n^{1/2} log n
 - Optimal query size = α = n^{1/2} log n
 - |R| = α = n^{1/2} log n
 - E[|Q ∩ R|] = log^2 n
 - β = (1 + δ) · E[|Q ∩ R|] = (1 + δ) log^2 n (super-logarithmic), so f and g_R are indistinguishable
 - f(R) = min{ |R|, α } = |R| = α = n^{1/2} log n
 - g_R(R) = min{ |R ∩ R^c| + min(|R ∩ R|, β), α } = β = (1 + δ) log^2 n
Corollary: It is hard to learn a submodular function to a factor better than n^{1/2} / log n with polynomially many value queries.
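The pair of functions in the non-combinatorial setting is easy to experiment with. The sketch below assumes the reconstruction f(S) = min{|S|, α} and g_R(S) = min{|S ∩ R^c| + min(|S ∩ R|, β), α}, with α and β passed as plain integers. The names `make_f`, `make_g`, and `fraction_distinguished` are hypothetical, and the Monte Carlo estimate is only an illustration, not part of the proof.

```python
import random

def make_f(alpha):
    """f(S) = min{|S|, alpha}."""
    return lambda S: min(len(S), alpha)

def make_g(R, alpha, beta):
    """g_R(S) = min{ |S outside R| + min(|S inside R|, beta), alpha }."""
    R = frozenset(R)
    def g(S):
        S = frozenset(S)
        inside = len(S & R)
        outside = len(S) - inside
        return min(outside + min(inside, beta), alpha)
    return g

def fraction_distinguished(n, alpha, beta, trials, rng):
    """Estimate Pr over random R (|R| = alpha) that a fixed query Q
    of size alpha satisfies f(Q) != g_R(Q)."""
    f = make_f(alpha)
    Q = frozenset(range(alpha))       # by symmetry, any fixed Q of this size
    hits = 0
    for _ in range(trials):
        R = rng.sample(range(n), alpha)
        if f(Q) != make_g(R, alpha, beta)(Q):
            hits += 1
    return hits / trials
```

For a query of size α the two functions differ exactly when |Q ∩ R| exceeds β, which is the event the Chernoff bound controls. Querying R itself gives f(R) = α and g_R(R) = β, the gap behind the corollary.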
Slide 37
Difficulty in Combinatorial Setting
 - A randomly chosen set may not be a feasible solution in the combinatorial setting.
   E.g., a randomly chosen set of edges rarely yields an s-t path.
Solution:
 1. Do not choose R randomly from the entire domain X.
 2. Use a subset of R as a proxy for the solution.
Slide 38
Base Graph G
[Figure: layered graph from s to t with n^{2/3} levels of n^{1/3} vertices each]
Slide 39
Functions f and g
[Figure: layered s-t graph with edge sets Y and B marked]
f(S) = f(S ∩ B) and g(S) = g(S ∩ B)
Slide 40
Functions f and g
[Figure: layered s-t graph with edge sets Y and B marked]
f(S) = min( |S ∩ B|, α )
Slide 41
Functions f and g
[Figure: layered s-t graph with edge sets Y and B marked]
g_R(S) = min{ |S ∩ R^c ∩ B| + min(|S ∩ R ∩ B|, β), α }
R: uniform random subset of B of size α
Slide 42
Functions f and g
[Figure: layered s-t graph with edge sets Y and B marked]
g_R(S) = min{ |S ∩ R^c ∩ B| + min(|S ∩ R ∩ B|, β), α }
|R| = n^{2/3} log^2 n
Slide 43
Setting the constants
 - Set α = n^{2/3} log^2 n
 - Optimal query size = α = n^{2/3} log^2 n
 - β = log^2 n, so f and g_R are indistinguishable
 - f(OPT) = min{ |R|, α } = |R| = α = O(n^{2/3} log^2 n)
 - g_R(OPT) = min{ |R ∩ R^c| + min(|R ∩ R|, β), α } = β = log^2 n
Theorem: The Submodular Shortest Path problem is hard to approximate to a factor better than O(n^{2/3}).
Slide 44
Summary

Problem            Single Agent                  Multiple Agents
                   Upper Bound    Lower Bound    Upper Bound    Lower Bound
Vertex Cover       2              2 - ε          2 log n        Ω(log n)
Shortest Path      O(n^{2/3})     Ω(n^{2/3})     O(n^{2/3})     Ω(n^{2/3})
Spanning Tree      n              Ω(n)           n              Ω(n)
Perfect Matching   n              Ω(n)           n              Ω(n)

n: # of vertices in graph G
What's the right model to study economies of scale?
Slide 45
Newer Models
Discount Models
[Figure: agents f, g, h; labels E and R]
Slide 46
Task: Minimize the sum of payments
[Figure: plot with axes Cost and Payment; total f(a) + f(b) + f(c) + ...]
Submodular functions
Shortest Path: O(log^c n) hardness
[Figure: set cover instance (U, S) and the corresponding s-t graph with agents]
 - Agents: cost of every edge is 1
Claim: Set cover of size |S| ⇔ shortest path of length |S|
Slide 49
Hardness Gap Amplification
[Figure: original instance and harder instance, each an s-t graph]
 - Replace each edge by a copy of the original graph.
 - Edges of the same color get the same copy.
 - Edges of different colors get copies with new colors (agents).
Slide 50
Claim: The new instance has a solution of cost α^2 iff the original instance has a solution of cost α.
For any fixed constant c, iterate this construction c times to further amplify the lower bound to O(log^c n).
Slide 51
Q.E.F
Slide 52
Why is it so hard to distinguish f and g?
Observation: f_R(S) is at most g(S) for any set S.
Case 1: Small size queries - |Q| ≤ √n
 - This probability can only increase if we increase |Q|.
Slide 53
Case 2: Large size queries - |Q| ≥ √n
 - This probability can only increase if we decrease |Q|.
Slide 54
Combinatorial Optimization
 - C: ground set
 - f: valuation function over subsets of C
 - X: collection of some subsets of C having a special property
 - Task: Find the set in X that has minimum cost under a given valuation function.
Slide 55
General Technique cont.
[Figure: queries S1, S2, S3 with f(S1) = g(S1), f(S2) = g(S2), f(S3) = g(S3)]
 - A cannot distinguish between f and g
 - Output is at least OPT(g)
 - OPT(g) << OPT(f)
Slide 56
Plan
 - Fix a cost function f
 - Fix a distribution D of functions such that for every g in D, OPT(f) >> OPT(g)
 - For an arbitrary query Q, f(Q) = g(Q) with high probability