Random walks and spectral segmentation
CSE 291, Fall 2001 (10/11/2001)
Marina Meila and Jianbo Shi: "Learning Segmentation by Random Walks" / "A Random Walks View of Spectral Segmentation"
Markus Herrgard, UCSD Bioengineering and Bioinformatics
Overview
Introduction: Why random walks?
Review of the Ncut algorithm
Finite Markov chains
Spectral properties of Markov chains
Conductance of a Markov chain
Block-stochastic matrices
Application: Supervised segmentation
Introduction
Why bother with mapping a segmentation problem to a random walk problem?
Utilize strong connections between:
Graph theory
Theory of stochastic processes
Matrix algebra
Applications of random walks
Markov chain Monte Carlo: approximate high-dimensional integration
e.g. in Bayesian inference
How to sample efficiently from a complex distribution?
Randomized algorithms: approximate counting in high-dimensional spaces
How to sample points efficiently inside a convex polytope?
Segmentation as graph partitioning
Consider an image I with a similarity function S_ij defined between all pairs of pixels i, j ∈ I
Represent S as a graph G = (I, S):
Pixels are the nodes of the graph
S_ij is the weight of the edge between nodes i and j
Degree of node i: $d_i = \sum_j S_{ij}$
Volume of a set A ⊆ I: $\mathrm{vol}\,A = \sum_{i \in A} d_i$
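To make the definitions concrete, here is a minimal NumPy sketch; the 1-D pixel features and the Gaussian similarity are illustrative assumptions, not from the talk:

```python
import numpy as np

# Toy "image": six pixels with 1-D features (two well-separated groups).
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])

# Similarity S_ij: a Gaussian kernel on feature distance (an assumed choice).
S = np.exp(-(x[:, None] - x[None, :])**2 / 2.0)

d = S.sum(axis=1)   # degree of node i: d_i = sum_j S_ij
A = [0, 1, 2]       # a candidate subset A of pixels
volA = d[A].sum()   # volume of A: vol A = sum_{i in A} d_i
```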
Simple example
Data with both distance and color cues
(Figure: data points and the resulting similarity matrix)
The normalized cut criterion
Partitioning of G into A and its complement Ā is found by minimizing the normalized cut criterion:

$\mathrm{Ncut}(A, \bar{A}) = \sum_{i \in A,\, j \in \bar{A}} S_{ij} \left( \frac{1}{\mathrm{vol}\,A} + \frac{1}{\mathrm{vol}\,\bar{A}} \right)$

Produces more balanced partitions than a regular graph cut
An approximate solution can be found through spectral methods
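Continuing the toy sketch above, the criterion can be evaluated directly for any candidate partition (a straightforward transcription of the formula, not the authors' code):

```python
def ncut(S, mask):
    """Ncut(A, A-bar) for a boolean mask selecting the pixels in A."""
    d = S.sum(axis=1)
    cut = S[mask][:, ~mask].sum()   # sum of S_ij over i in A, j in A-bar
    return cut * (1.0 / d[mask].sum() + 1.0 / d[~mask].sum())

mask = np.array([True, True, True, False, False, False])
print(ncut(S, mask))   # small value: the two groups are only weakly connected
```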
The normalized cut algorithm
Define: diagonal matrix D with D_ii = d_i, and the Laplacian of the graph G: L = D − S
Solve the generalized eigenvalue problem: $Lx = \lambda Dx$
Let $x_{\lambda_2}$ be the eigenvector corresponding to the 2nd smallest eigenvalue $\lambda_2$
Partition $x_{\lambda_2}$ into two sets containing roughly equal values → graph partition
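A compact SciPy sketch of these steps; splitting the eigenvector at its median is one common heuristic for "roughly equal values" (an assumption on my part):

```python
from scipy.linalg import eigh

def two_way_ncut(S):
    d = S.sum(axis=1)
    D = np.diag(d)
    # Generalized eigenproblem L x = lambda D x; eigh returns ascending eigenvalues.
    vals, vecs = eigh(D - S, D)
    x2 = vecs[:, 1]              # eigenvector of the 2nd smallest eigenvalue
    return x2 > np.median(x2)    # threshold into two groups

print(two_way_ncut(S))           # recovers the two pixel groups of the toy data
```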
What does this actually mean?
Spectral methods are easy to apply, but notoriously hard to understand intuitively
Some questions:
Why does it work? (see Shi & Malik)
Why this particular eigenvector?
Why would $x_{\lambda_2}$ be piecewise constant?
What if there are more than two segments?
What if $x_{\lambda_2}$ is not piecewise constant? (see Kannan, Vempala & Vetta)
Interlude: Finite Markov chains
Discrete time, finite state random process
State of the system at time t_n: x_n
Probability of being in state i at time t_n given by: $\pi_i^{(n)} = p(x_n = i)$
Probability distribution over all states represented by the column vector $\pi^{(n)}$
Markov property: $p(x_n \mid x_0, \dots, x_{n-1}) = p(x_n \mid x_{n-1})$
Transition matrix
Transition matrix: $P_{ij} = p(x_{n+1} = j \mid x_n = i)$
P is a (row) stochastic matrix:
$P_{ij} \ge 0$
$\sum_j P_{ij} = 1$
If at t_n the distribution is $\pi^{(n)}$, at t_{n+1} the distribution is given by:
$\pi^{(n+1)T} = \pi^{(n)T} P$
Example of a Markov chain
States: Play, Work, Sleep

$P = \begin{pmatrix} 0.5 & 0.3 & 0.2 \\ 0.1 & 0.7 & 0.2 \\ 0.1 & 0 & 0.9 \end{pmatrix}$

Starting from $\pi^{(0)T} = (1, 0, 0)$:
$\pi^{(1)T} = \pi^{(0)T} P = (0.5, 0.3, 0.2)$
$\pi^{(24)T} = \pi^{(0)T} P^{24} \approx (1/6, 1/6, 2/3)$
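A few lines of NumPy reproduce these numbers (the state order Play, Work, Sleep and the displayed vectors are my reading of the garbled slide):

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],    # Play  -> (Play, Work, Sleep)
              [0.1, 0.7, 0.2],    # Work  -> ...
              [0.1, 0.0, 0.9]])   # Sleep -> ...

pi0 = np.array([1.0, 0.0, 0.0])               # start in the first state
print(pi0 @ P)                                # one step: [0.5, 0.3, 0.2]
print(pi0 @ np.linalg.matrix_power(P, 24))    # ~ [1/6, 1/6, 2/3]
```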
Some terminology
Stationary distribution $\pi^\infty$ is given by: $\pi^{\infty T} P = \pi^{\infty T}$
Markov chain is reversible if the detailed balance condition holds: $\pi_i^\infty P_{ij} = \pi_j^\infty P_{ji}$
A reversible finite Markov chain is called a random walk
Spectra of stochastic matrices
For reversible Markov chains the eigenvalues of P are real and the eigenvectors orthogonal
Spectral radius $\rho(P) = 1$ (i.e. $|\lambda| \le 1$)
Right (left) eigenvector corresponding to $\lambda_1 = 1$ is $x_1 = \mathbf{1}$ ($x_1 = \pi^\infty$)
Back to Ncut
How is Ncut related to random walks on graphs?
Transform the similarity matrix S into a stochastic matrix: $P = D^{-1} S$
P_ij is the probability of moving from pixel i to pixel j in the graph representation of the image in one step of a random walk
Relationship to random walks
Spectrum of P: $P x_P = \lambda_P x_P$
The generalized eigenvalue problem in Ncut can be written as:

$(D - S)x = \lambda D x \iff (I - D^{-1}S)x = \lambda x \iff (I - P)x = \lambda x$

How are the spectra related?
Same eigenvectors: $x = x_P$
Eigenvalues: $\lambda = 1 - \lambda_P$
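The correspondence is easy to confirm numerically; a small check on the toy similarity matrix from earlier (the verification code is mine, the identity is from the slide):

```python
from scipy.linalg import eig, eigh

d = S.sum(axis=1)
P = S / d[:, None]                 # P = D^{-1} S, row-stochastic

lam_P = np.sort(eig(P)[0].real)                   # eigenvalues of P (real here)
lam_L = eigh(np.diag(d) - S, np.diag(d))[0]       # generalized eigenvalues of (L, D)

print(np.allclose(np.sort(1.0 - lam_P), lam_L))   # lambda = 1 - lambda_P
```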
Simple example
(Figure: similarity matrix S)
(Figure: transition matrix $P = D^{-1}S$)
Eigenvalues and eigenvectors of P
Why the second eigenvector?
The smallest eigenvalue in Ncut corresponds to the largest eigenvalue of P
The corresponding eigenvector $x_1 = \mathbf{1}$ carries no information about the partitioning
Conductance and the Ncut criterion
Assume that the random walk is started from its stationary distribution $\pi_i^\infty = d_i / \mathrm{vol}\,I$
Using this and $P_{ij} = S_{ij}/d_i$ we can write the probability of moving from A to its complement in one step:

$P_{A\bar{A}} = \frac{\sum_{i \in A,\, j \in \bar{A}} \pi_i^\infty P_{ij}}{\sum_{i \in A} \pi_i^\infty} = \frac{\sum_{i \in A,\, j \in \bar{A}} (d_i/\mathrm{vol}\,I)(S_{ij}/d_i)}{\mathrm{vol}\,A / \mathrm{vol}\,I} = \frac{\sum_{i \in A,\, j \in \bar{A}} S_{ij}}{\mathrm{vol}\,A}$
Interpretation of the Ncut criterion
Alternative representation of the Ncut criterion:

$\mathrm{Ncut}(A, \bar{A}) = \frac{\sum_{i \in A,\, j \in \bar{A}} S_{ij}}{\mathrm{vol}\,A} + \frac{\sum_{i \in A,\, j \in \bar{A}} S_{ij}}{\mathrm{vol}\,\bar{A}} = P_{A\bar{A}} + P_{\bar{A}A}$

Minimum Ncut is equivalent to minimizing the conductance between set A and its complement, i.e. minimizing the probability of moving between set A and its complement
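In code, the escape probability and the identity above read as follows (reusing ncut, S and mask from the earlier sketch):

```python
def escape_prob(S, mask):
    """P_{A,A-bar}: probability of leaving A in one step, at stationarity."""
    d = S.sum(axis=1)
    return S[mask][:, ~mask].sum() / d[mask].sum()

# Ncut(A, A-bar) = P_{A,A-bar} + P_{A-bar,A}
print(escape_prob(S, mask) + escape_prob(S, ~mask), ncut(S, mask))
```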
Block-stochastic matrices
Let Δ = (A_1, A_2, …, A_k) be a partition of I
P is a block-stochastic matrix, or equivalently the Markov chain is aggregatable, iff

$P_{is'} = \sum_{j \in A_{s'}} P_{ij} = R_{ss'} \quad \forall i \in A_s,\ s, s' = 1, \dots, k$
Aggregation
A Markov chain defined by P with state space i ∈ I can be aggregated to a Markov chain with a smaller state space A_s ∈ Δ and a transition matrix R
The k eigenvalues of R are the same as the k largest eigenvalues of P
Aggregation can be performed as a linear transformation R = UPV
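A sketch of the linear-transformation view: with V an indicator matrix for the blocks and U an averaging matrix over blocks, R = UPV recovers the block transition probabilities. Uniform averaging within blocks, used here, is one valid choice when P is exactly block-stochastic; the 4-state example is mine:

```python
import numpy as np

def aggregate(P, blocks):
    """R = U P V for a block-stochastic P; blocks is a list of index lists."""
    n, k = P.shape[0], len(blocks)
    V = np.zeros((n, k))
    U = np.zeros((k, n))
    for s, idx in enumerate(blocks):
        V[idx, s] = 1.0               # column s indicates membership in A_s
        U[s, idx] = 1.0 / len(idx)    # average the rows belonging to A_s
    return U @ P @ V

# A 4-state chain that is exactly block-stochastic with blocks {0,1} and {2,3}.
P = np.array([[0.50, 0.30, 0.10, 0.10],
              [0.40, 0.40, 0.20, 0.00],
              [0.05, 0.05, 0.60, 0.30],
              [0.10, 0.00, 0.50, 0.40]])
print(aggregate(P, [[0, 1], [2, 3]]))   # R = [[0.8, 0.2], [0.1, 0.9]]
```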
Aggregation example
(Figure: transition matrix P)
(Figure: aggregated transition matrix R)
Why piecewise constant eigenvectors?
If P is block-stochastic with k blocks then its first k eigenvectors are piecewise constant
Ncut is exact for block-stochastic matrices, in addition to block-diagonal matrices
Ncut groups pixels by the similarity of their transition probabilities to subsets of I
Block-stochastic matrix example
(Figure: transition matrix P)
(Figure: piecewise constant eigenvector x)
The modified Ncut algorithm
Finds k segments in one pass
Requires that the k eigenvalues of R are larger than the other n−k spurious eigenvalues of P
1. Compute the eigenvalues of P
2. Select the k largest eigenvectors
3. Use k-means to obtain a segmentation based on the k eigenvectors
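A sketch of the k-way procedure with NumPy and scikit-learn (taking real parts and the k-means settings are implementation assumptions):

```python
import numpy as np
from scipy.linalg import eig
from sklearn.cluster import KMeans

def modified_ncut(S, k):
    d = S.sum(axis=1)
    P = S / d[:, None]                   # P = D^{-1} S
    vals, vecs = eig(P)
    order = np.argsort(-vals.real)       # largest eigenvalues first
    X = vecs[:, order[:k]].real          # embed each pixel by the k top eigenvectors
    return KMeans(n_clusters=k, n_init=10).fit_predict(X)

print(modified_ncut(S, k=2))             # labels for the toy data from earlier
```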
Supervised image segmentation
Training data: based on a human-segmented image, define target probabilities

$P^*_{ij} = \begin{cases} 1/|A| & \text{for } j \in A \\ 0 & \text{for } j \notin A \end{cases} \quad i \in A$

Features: different criteria $f^q_{ij}$, q = 1, …, Q, that measure similarity between pixels i and j
Supervised segmentation criterion
Model: parametrized similarity function

$S_{ij}(\theta_1, \dots, \theta_Q) = \exp\Big(\sum_q \theta_q f^q_{ij}\Big)$

Optimization criterion: minimize the Kullback-Leibler divergence between the target transition matrix P* and $P(\theta) = D^{-1} S(\theta)$
Corresponds to maximizing the cross-entropy:

$J = \frac{1}{|I|} \sum_{i \in I} \sum_{j \in I} P^*_{ij} \log P_{ij}(\theta)$
Supervised segmentation algorithm
This can be done by using gradient ascent in θ:

$\theta_q^{(n+1)} = \theta_q^{(n)} + \eta \left. \frac{\partial J}{\partial \theta_q} \right|_{\theta^{(n)}}$

where

$\frac{\partial J}{\partial \theta_q} = \frac{1}{|I|} \sum_{ij} \big( P^*_{ij} - P^{(n)}_{ij} \big) f^q_{ij}$
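A minimal sketch of this update loop; the feature layout (an array of Q matrices), learning rate, and iteration count are assumptions:

```python
import numpy as np

def learn_theta(f, P_star, eta=0.1, n_iter=200):
    """Gradient ascent on J; f has shape (Q, n, n), P_star has shape (n, n)."""
    Q, n, _ = f.shape
    theta = np.zeros(Q)
    for _ in range(n_iter):
        S = np.exp(np.einsum('q,qij->ij', theta, f))   # S_ij = exp(sum_q theta_q f^q_ij)
        P = S / S.sum(axis=1, keepdims=True)           # P(theta) = D^{-1} S(theta)
        grad = np.einsum('ij,qij->q', P_star - P, f) / n
        theta += eta * grad                            # ascend the cross-entropy J
    return theta
```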
Toy example
Distance: $f^1_{ij} = \|\mathbf{x}_i - \mathbf{x}_j\|^2$
Color (or intensity): $f^2_{ij} = (c_i - c_j)^2$
Training segmentation 1 (by distance): θ_1 = −1.19, θ_2 = 1.04
Training segmentation 2 (by color): θ_1 = −0.19, θ_2 = −4.55
Toy example results
(Figures: test data, segmentation with parameters from training segmentation 1 (by distance), and segmentation with parameters from training segmentation 2 (by color))
Application: real image segmentation
Cues:
Intervening contour: $f^{IC}_{ij} = \max_{k \in l(i,j)} \mathrm{Edge}(k)$, where l(i,j) is the straight line between pixels i and j
Edge flow: $f^{EF}_{ij}$, a cue based on the orientations of the edges at i and j relative to the line joining them
Training
Testing
Conclusions I
The random walks perspective provides new insights into the Ncut algorithm:
Relating the Ncut algorithm to spectral properties of random walks
Interpreting the Ncut criterion in terms of the conductance of a random walk
Proving that Ncut is exact for block-stochastic matrices
Conclusions II
Is any of this useful in practice?
Supervised segmentation method
Comparing different spectral clustering methods in terms of the underlying random walks
Choosing the kernel to allow for effective clustering (approximately block-stochastic)
New clustering criteria, e.g. bipartite clustering
References
Kemeny JG, Snell JL: Finite Markov Chains. Springer, 1976.
Stewart WJ: Introduction to the Numerical Solution of Markov Chains. Princeton University Press, 1994.
Lovász L: Random Walks on Graphs: A Survey.
Jerrum M, Sinclair A: The Markov Chain Monte Carlo Method: An Approach to Approximate Counting and Integration.