Binary Contingency Tables in Theory and Practice
Ivona Bezáková (Rochester Institute of Technology)
Based on joint work with Nayantara Bhatnagar, Alistair Sinclair, Daniel Štefankovič, and Eric Vigoda.
DIMACS Workshop on Markov Chain Monte Carlo: Synthesizing Theory and Practice, June 5, 2007
Binary Contingency Tables

Input: marginals (row sums r1, r2, ..., rm; column sums c1, c2, ..., cn)
Sample space: 0/1 tables satisfying the marginals
Goal: count / sample

[Figure: an example 0/1 table together with its row and column sums]
Different Approaches

Theory (Markov chain Monte Carlo with simulated annealing)
• Jerrum-Sinclair-Vigoda '01: approximates the permanent in O*(n^10) time; yields an O*((mn)^10) algorithm for m x n binary contingency tables
• Bezáková-Bhatnagar-Vigoda '06: O*((mn)^3 (m+n)^5)

Practice (sequential importance sampling, Chen-Diaconis-Holmes-Liu '05)
• Bezáková-Sinclair-Štefankovič-Vigoda '06: negative example
• Blanchet '06: SIS works if the marginals are O(n^{1/4})
• Bayati-Kim-Saberi '07: alternative importance sampling method; works if the marginals are O(n^{1/4})

Practice (the switching Markov chain, Diaconis-Gangolli '94)
• Kannan-Tetali-Vempala '97, Cooper-Dyer-Greenhill '05: works for regular marginals
Permanent: Broder Chain [Broder '88]

Perfect matching: a subset of vertex-disjoint edges covering all vertices
Permanent: the number of all perfect matchings

What for: uniform sampling of perfect matchings
How: Markov chain on perfect + near-perfect matchings

At a perfect matching:
- remove a random edge
At a near-perfect matching:
- choose a random vertex
  • if unmatched: match it to the other hole
  • if matched: slide the adjacent edge
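To make these transitions concrete, here is a minimal Python sketch of one step of the chain. The representation (left/right vertex sets, an adjacency dict, the matching stored in both directions) and the function name are illustrative assumptions, and the analyzed chain additionally uses lazy/self-loop moves that are omitted here.

    import random

    def broder_step(left, right, adj, M):
        """One step of the Broder chain (sketch). adj maps each left vertex
        to its set of right neighbours; M is a perfect or near-perfect
        matching, stored in both directions (M[u] = v and M[v] = u)."""
        if len(M) == 2 * len(left):             # perfect matching:
            u = random.choice(list(left))       # remove a uniformly random edge
            N = dict(M)
            N.pop(N.pop(u))
            return N
        hole_l = next(v for v in left if v not in M)
        hole_r = next(v for v in right if v not in M)
        v = random.choice(list(left) + list(right))   # near: random vertex
        if v in (hole_l, hole_r):               # unmatched: match the two
            if hole_r in adj[hole_l]:           # holes, if that edge exists
                N = dict(M)
                N[hole_l], N[hole_r] = hole_r, hole_l
                return N
            return M
        z = M[v]                                # matched: slide v's edge so v
        N = dict(M)                             # gets matched to the opposite hole
        N.pop(v)
        N.pop(z)
        if v in left and hole_r in adj[v]:
            N[v], N[hole_r] = hole_r, v         # z becomes the new right hole
            return N
        if v in right and v in adj[hole_l]:
            N[hole_l], N[v] = v, hole_l         # z becomes the new left hole
            return N
        return M                                # slide not possible: stay put

    # A few steps on K_{2,2}:
    left, right = {0, 1}, {"a", "b"}
    adj = {0: {"a", "b"}, 1: {"a", "b"}}
    M = {0: "a", "a": 0, 1: "b", "b": 1}
    for _ in range(5):
        M = broder_step(left, right, adj, M)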
Broder Chain

Does it mix in polynomial time? Even if it did...
A graph can have 1 perfect matching but ≥ 2^{n/4} near-perfect matchings.

Thm [Jerrum-Sinclair '89]: Rapid mixing if the number of perfect matchings is polynomially related to the number of near-perfect matchings.

Jerrum-Sinclair-Vigoda '01: Change the weights so that perfect matchings take a polynomial fraction.
Simulated Annealing for Permanent

Jerrum-Sinclair-Vigoda '01: Change the weights so that perfect matchings take a polynomial fraction.

Originally: n^2 + 1 regions (one per hole pair (u,v), plus the perfect matchings) of very different sizes.
Want: all regions the same size.

Ideal weights (for a near-perfect matching with holes u,v):
    w(u,v) = (# perfects) / (# nears with holes u,v)

Good: a perfect matching is sampled with probability 1/(n^2 + 1).
Bad: computing the ideal weights is as hard as the original problem.
Solution: Approximate
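A quick check of the "Good" claim: with the ideal weights, each of the n^2 hole classes carries total weight w(u,v) · (# nears with holes u,v) = (# perfects), the same as the class of perfect matchings itself, so

    Pr[perfect] = (# perfects) / ( (# perfects) + Σ_{u,v} w(u,v) · (# nears with holes u,v) )
                = (# perfects) / ( (n^2 + 1) · (# perfects) )
                = 1/(n^2 + 1).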
Simulated Annealing for Permanent

Ideal weights (for a matching with holes u,v):
    w(u,v) = (# perfects) / (# nears with holes u,v)

Solution: Approximate, by annealing from the unordered state (EASY: the complete bipartite graph K_{n,n}) to the ordered state (DIFFICULT: the input graph G), with λ = 1 ... ~0.
Simulated Annealing for Permanent

λ = 1 ... ~0

The ideal weights (# perfects) / (# nears with holes u,v) need to be λ-weighted:
    w(u,v) = λ(P) / λ(N(u,v)),
where λ(M) = λ^(# λ-edges in M) and λ(S) = Σ_{M in S} λ(M).
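A short sketch making the weighting concrete; the representation (a matching as a set of (left, right) pairs, real_edges as the edge set of G) is an assumption for illustration. The λ-edges are the edges of K_{n,n} that are not real edges of G.

    def lam_weight(M, lam, real_edges):
        """lambda(M) = lam ** (# lambda-edges in M)."""
        return lam ** sum(1 for e in M if e not in real_edges)

    def lam_total(S, lam, real_edges):
        """lambda(S) = sum of lambda(M) over M in S."""
        return sum(lam_weight(M, lam, real_edges) for M in S)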
Simulated Annealing for Permanent

λ = 1, then 0.7, 0.6, ... ~0

    w(u,v) = λ(P) / λ(N(u,v))

Initially λ = 1, thus w(u,v) = n!/(n-1)! = n.

Algorithm (sketch):
Start at λ = 1 with the exact weights. Later, given an approximation of the w(u,v), run the chain to improve the approximation, then decrease λ (until ~0).
(The improved approximation at the old λ serves as the starting approximation at the new λ: since λ changes only slightly, e.g. a 2-approximation at the old λ is still a 4-approximation at the new one.)

Thm [Jerrum-Sinclair-Vigoda '01]: The weighted Broder chain mixes if each w(u,v) is approximated within a constant factor.
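A schematic driver for this bootstrap, as a Python sketch only; the refine subroutine is hypothetical (in the real algorithm it runs the weighted Broder chain at activity lam and re-estimates each w(u,v) from samples).

    def anneal(n, lambdas, refine):
        """Bootstrap the weights along a slowly decreasing lambda schedule."""
        w = {(u, v): float(n) for u in range(n) for v in range(n)}  # exact at lambda = 1
        for lam in lambdas:        # schedule 1 -> ~0
            w = refine(lam, w)     # improved estimates at the old lambda serve
        return w                   # as starting estimates at the nearby new one

    # e.g. anneal(3, [1.0, 0.9, 0.8], lambda lam, w: w) exercises the loop
    # with a no-op refiner.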
BCT: Bipartite Graphs with Given Degrees

             columns
             2 3 1 2
    rows  2  0 1 0 1
          2  1 1 0 0
          2  1 0 1 0
          2  0 1 0 1

A 0/1 table is the adjacency matrix of a bipartite graph: rows and columns are the two vertex sets, the 1-entries are edges, and the marginals are the degrees.

"Sliding" Markov chain on perfect and near-tables:
- Perfect table: remove a random edge (a random 1-entry)
- Near-table: slide edges or match, as in the Broder chain

[Animation: a sequence of slide moves on the example table, moving the holes until the marginals are restored]
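A minimal sanity check of the table/degree correspondence on the example above (plain Python, just a testing aid):

    T = [[0, 1, 0, 1],
         [1, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1]]

    row_sums = [sum(r) for r in T]          # degrees of the row vertices
    col_sums = [sum(c) for c in zip(*T)]    # degrees of the column vertices
    edges = [(i, j) for i, r in enumerate(T) for j, x in enumerate(r) if x]

    assert row_sums == [2, 2, 2, 2] and col_sums == [2, 3, 1, 2]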
Simulated Annealing for BCT? [Bezáková-Bhatnagar-Vigoda '06]

Ideal weights: (# perfects) / (# nears with holes u,v)

As for the permanent, anneal from an unordered state (EASY) to the ordered state (DIFFICULT). Here the target is the given marginals on K_{n,n}; the easy endpoint is the same marginals on a specific realization G* of the degree sequence.

Recall that
    w(u,v) = λ(P) / λ(N(u,v)),
where λ(T) = λ^(# λ-edges in T) and λ(S) = Σ_{T in S} λ(T).

λ runs from ~0 (on G*) to 1 (on K_{n,n}).
Simulated Annealing for BCT?

Recall that
    w(u,v) = λ(P) / λ(N(u,v)),
where λ(T) = λ^(# λ-edges in T) and λ(S) = Σ_{T in S} λ(T).

The catch: what if, for some u,v, there is no near-table that uses only real edges? Then λ(N(u,v)) = 0 for λ = 0.
Simulated Annealing for BCT?

Thm [Bezáková-Bhatnagar-Vigoda '06]:
There exists a graph G* with the given degree sequence such that between any two vertices there is an "alternating" path of length ≤ 5.

Corollary: For every pair u,v there exists a (u,v)-near-table similar to G* (differing from it only along such a path); hence λ(N(u,v)) is "easy" to compute for λ ~ 0.
Simulated Annealing for BCT?

Thm [Bezáková-Bhatnagar-Vigoda '06]:
It is possible to sample/count bipartite graphs with a given degree sequence (which are subgraphs of a given graph H, generalizing H = K_{n,n}) in time O*((nm)^3 (n+m)^5).
Different Approaches

Theory (Markov chain Monte Carlo with simulated annealing)
• Jerrum-Sinclair-Vigoda '01: approximates the permanent in O*(n^10) time; yields an O*((mn)^10) algorithm for m x n binary contingency tables
• Bezáková-Bhatnagar-Vigoda '06: O*((mn)^3 (m+n)^5)

Practice (sequential importance sampling, Chen-Diaconis-Holmes-Liu '05)
• Bezáková-Sinclair-Štefankovič-Vigoda '06: negative example
• Blanchet '06: SIS works if the marginals are O(n^{1/4})
• Bayati-Kim-Saberi '07: alternative importance sampling method; works if the marginals are O(n^{1/4})

Practice (the switching Markov chain, Diaconis-Gangolli '94)
• Kannan-Tetali-Vempala '97, Cooper-Dyer-Greenhill '05: works for regular marginals
Importance Sampling for Counting Problems

Probability distribution σ on the points of the set plus a failure symbol ♦, with σ(x) > 0 for every point x in the set.

Random variable:
    η(s) = 1/σ(s)  if s is in the set
    η(s) = 0       if s = ♦

Unbiased estimator:
    E[η] = Σ_x σ(x) · 1/σ(x) = size of the set
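A minimal runnable sketch of this estimator in Python. The concrete set (0/1 vectors with exactly k ones), the coin-flip proposal, and all parameter values are illustrative assumptions, not from the talk; the exact count math.comb(n, k) is used only to check the output.

    import math
    import random

    def sample_once(n=10, k=4, p=0.4):
        """Propose a 0/1 vector by independent biased coin flips; points
        of the set are vectors with exactly k ones, anything else is the
        failure symbol (None)."""
        s = tuple(random.random() < p for _ in range(n))
        if sum(s) != k:
            return None                      # failure symbol, eta = 0
        sigma = p**k * (1 - p)**(n - k)      # proposal probability of s
        return s, sigma

    def estimate_set_size(trials=200_000):
        total = 0.0
        for _ in range(trials):
            draw = sample_once()
            if draw is not None:
                total += 1.0 / draw[1]       # eta = 1/sigma(s)
        return total / trials

    # E[eta] = number of 0/1 vectors of length 10 with exactly 4 ones
    print(estimate_set_size(), "vs exact", math.comb(10, 4))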
Sequential Importance Sampling for BCT
[Chen-Diaconis-Holmes-Liu '05]

A specific σ:
• fill the table column by column
• assign each column ignoring the other column sums

Assign a column with probability proportional to
    ∏ r_i / (n - r_i),
where the product ranges over the rows i that receive a 1 in this column and r_i is the current (residual) row sum.

[Worked example: the columns are assigned one by one, and the residual row sums decrease accordingly until all reach 0]
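A runnable Python sketch in the spirit of this procedure, not the CDHL implementation: within each column, the set of rows receiving a 1 is drawn with probability proportional to ∏ r_i/(n - r_i) by explicitly enumerating all candidate subsets (CDHL use conditional-Bernoulli sampling instead), so it is suitable for small tables only. Rows whose residual sum equals the number of remaining columns are forced, and their common factor is dropped from the weights.

    import itertools
    import random

    def sis_estimate(row_sums, col_sums, trials=20_000):
        """SIS estimate of the number of 0/1 tables with the given marginals."""
        m, ncols = len(row_sums), len(col_sums)
        total = 0.0
        for _ in range(trials):
            r, sigma, failed = list(row_sums), 1.0, False
            for j, c in enumerate(col_sums):
                left = ncols - j                    # columns still unfilled
                subsets, weights = [], []
                for S in itertools.combinations(range(m), c):
                    if any(r[i] == 0 for i in S):
                        continue
                    if any(r[i] >= left and i not in S for i in range(m)):
                        continue                    # a forced row was skipped
                    w = 1.0
                    for i in S:
                        if r[i] < left:             # forced rows contribute a
                            w *= r[i] / (left - r[i])   # common factor; drop it
                    subsets.append(S)
                    weights.append(w)
                if not subsets:
                    failed = True                   # the failure symbol
                    break
                idx = random.choices(range(len(subsets)), weights)[0]
                sigma *= weights[idx] / sum(weights)
                for i in subsets[idx]:
                    r[i] -= 1
            if not failed and all(x == 0 for x in r):
                total += 1.0 / sigma                # eta = 1/sigma(table)
        return total / trials

    # Illustrative check on the marginals of the earlier small example:
    # print(sis_estimate([2, 2, 2, 2], [2, 3, 1, 2]))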
A Counterexample for SIS

[Figure: an m x m instance with first row sum γm, first column sum βm, and all other marginals 1]

Thm [Bezáková-Sinclair-Štefankovič-Vigoda '06]:
For any β ≠ γ, the SIS output after any subexponential number of trials is off by an exponential factor (with high probability).

Simpler example: first column sum βm, all other marginals 1; here the same holds for any β.

Intuition: In a uniformly random table, the βm ones of the first column fall into a random set of rows, so a fixed set of αm rows is expected to contain ~αβm of them; SIS places asymptotically fewer ones there. Hence the tables seen by SIS (whp) lie outside the set of tables with ~αβm such ones, and that set carries almost all of the uniform measure.

[Figure: all tables ⊇ tables with ~αβm ones; the tables seen by SIS whp lie outside]

Result holds for any order of rows/columns.
Alternating rows and columns?
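For experiments like the ones below, a brute-force exact counter is useful for validating the samplers on tiny instances. This helper is a testing aid, not part of the talk, and takes exponential time.

    import itertools

    def exact_count(row_sums, col_sums):
        """Exact number of 0/1 tables with the given marginals."""
        m = len(row_sums)
        def rec(j, r):
            if j == len(col_sums):
                return 1 if all(x == 0 for x in r) else 0
            total = 0
            for S in itertools.combinations(range(m), col_sums[j]):
                if all(r[i] > 0 for i in S):
                    r2 = list(r)
                    for i in S:
                        r2[i] -= 1
                    total += rec(j + 1, r2)
            return total
        return rec(0, list(row_sums))

    # e.g. exact_count([2, 2, 2, 2], [2, 3, 1, 2]) gives the true table
    # count to compare against sis_estimate on the same marginals.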
SIS – Experimental Results

Bad example, m = 300, β = 0.6, γ = 0.7

[Plot: log-scale of the SIS estimate (y-axis) vs. the number of SIS steps (x-axis), with the correct value marked]