Graph-based image segmentation Václav Hlaváč Czech Technical University in Prague Faculty of Electrical Engineering Department of Cybernetics Prague, Czech.

Graph-based image segmentation

Václav Hlaváč

Czech Technical University in Prague

Faculty of Electrical Engineering

Department of Cybernetics

Prague, Czech Republic

http://cmp.felk.cvut.cz/~hlavac

Based on the presentations of D. Hoiem, S. Lazebnik, and Jianbo Shi




2

Types of segmentations

Oversegmentation Undersegmentation

Multiple Segmentations

3

Major processes for segmentation

• Bottom-up: group tokens with similar features
• Top-down: group tokens that likely belong to the same object

[Levin and Weiss 2006]

4

5

Main Ideas

Convert image into a graph
• Vertices for the pixels
• Edges between the pixels
• Additional vertices and edges to encode other constraints

Manipulate the graph to segment the image

6

Papers

Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images
• Boykov and Jolly
• Minimize an energy function

Efficient Graph-based Segmentation
• Felzenszwalb and Huttenlocher
• Cluster the vertices based on edge weight

7

Boykov and Jolly

Binary image segmentation
• Classify pixels as object or background
• Their contribution is adding interactivity

Minimise an energy function
• E(A) = B(A) + λR(A)

A: the segmentation (assigns each pixel to object or background)

B(A): the cost of all the edges between object pixels and background pixels

R(A): the total cost of deciding each pixel to be object or background
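The energy E(A) = B(A) + λR(A) can be evaluated directly for a candidate segmentation. The sketch below is only an illustration: the region cost (absolute distance to an assumed object or background mean intensity), the boundary cost (a Gaussian of the intensity difference), and the names `energy`, `obj_mean`, `bkg_mean`, and `sigma` are assumptions made for this example, not the costs used by Boykov and Jolly.

```python
import math

def energy(image, labels, lam=1.0, obj_mean=200.0, bkg_mean=50.0, sigma=30.0):
    """E(A) = B(A) + lam * R(A) for a binary labelling (1 = object, 0 = background)."""
    rows, cols = len(image), len(image[0])
    region = 0.0    # R(A): per-pixel cost of the chosen label (assumed form)
    boundary = 0.0  # B(A): cost of edges joining differently labelled pixels
    for r in range(rows):
        for c in range(cols):
            target = obj_mean if labels[r][c] == 1 else bkg_mean
            region += abs(image[r][c] - target)
            # Visit each 4-neighbour edge once (right and down).
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and labels[r][c] != labels[rr][cc]:
                    diff = image[r][c] - image[rr][cc]
                    boundary += math.exp(-diff * diff / (2.0 * sigma * sigma))
    return boundary + lam * region

image = [[50, 50, 200],
         [50, 200, 200]]
good = [[0, 0, 1],
        [0, 1, 1]]  # follows the intensity edge
bad = [[1, 0, 0],
       [0, 0, 1]]   # ignores the image
print(energy(image, good) < energy(image, bad))
```

The labelling that follows the intensity edge pays almost nothing in either term, while the other one pays a large region cost.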

8

Creating the Graph

• Each pixel has a corresponding vertex
• Additionally, a source (“object”) and a sink (“background”)
• Each pixel vertex has an edge to its neighbours (e.g. 8 adjacent neighbours in 2D), an edge to the source, an edge to the sink
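The construction above can be sketched as an edge list. This is a minimal illustration with hypothetical names (`build_graph`, `neighbor_edges`, `terminal_edges`); edge weights are left out, and undirected neighbour edges are stored once.

```python
def build_graph(rows, cols):
    """Return (neighbor_edges, terminal_edges) for a rows x cols image."""
    source, sink = "source", "sink"
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]  # 8-neighbourhood
    neighbor_edges = set()
    terminal_edges = []
    for r in range(rows):
        for c in range(cols):
            p = (r, c)
            terminal_edges.append((source, p))  # edge to the "object" terminal
            terminal_edges.append((p, sink))    # edge to the "background" terminal
            for dr, dc in offsets:
                q = (r + dr, c + dc)
                if 0 <= q[0] < rows and 0 <= q[1] < cols:
                    # frozenset makes (p, q) and (q, p) the same edge
                    neighbor_edges.add(frozenset((p, q)))
    return neighbor_edges, terminal_edges

neighbor_edges, terminal_edges = build_graph(3, 3)
print(len(neighbor_edges), len(terminal_edges))
```

For a 3 x 3 image this gives 20 neighbour edges (6 horizontal, 6 vertical, 8 diagonal) and 18 terminal edges (two per pixel).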

9

An image represented as a graph

10

Undirected/directed graph

11

12

13

14

Intelligent Scissors

Mortensen and Barrett (SIGGRAPH 1995)

15

Intelligent Scissors

A good image boundary has a short path through the graph.

Mortensen and Barrett (SIGGRAPH 1995)

[Figure: graph with weighted edges and a path from Start to End]

16

Intelligent Scissors

Formulation: find good boundary between seed points

Challenges
• Minimize interaction time
• Define what makes a good boundary
• Efficiently find it

17

Intelligent Scissors: method

1. Define boundary cost between neighboring pixels

2. User specifies a starting point (seed)

3. Compute lowest cost from seed to each other pixel

4. Get path from seed to cursor, choose new seed, repeat

18

Intelligent Scissors: method

1. Define boundary cost between neighboring pixels

a) Lower if edge is present (e.g., with edge(im, 'canny'))

b) Lower if gradient is strong

c) Lower if gradient is in direction of boundary
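As a rough illustration of criterion b), the sketch below turns gradient magnitude into a per-pixel cost that is low where the gradient is strong. It is an assumption-laden simplification: the function names are hypothetical, the gradient uses unnormalized finite differences clamped at the image border, and criteria a) and c) are not modelled.

```python
import math

def gradient_magnitude(image):
    """Finite-difference gradient magnitude (indices clamped at the borders)."""
    rows, cols = len(image), len(image[0])
    mag = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            gx = image[r][min(c + 1, cols - 1)] - image[r][max(c - 1, 0)]
            gy = image[min(r + 1, rows - 1)][c] - image[max(r - 1, 0)][c]
            mag[r][c] = math.hypot(gx, gy)
    return mag

def boundary_cost(mag):
    """Per-pixel cost in [0, 1]: 0 at the strongest gradient, 1 where it is flat."""
    g_max = max(max(row) for row in mag) or 1.0  # avoid dividing by zero
    return [[1.0 - g / g_max for g in row] for row in mag]

image = [[10, 10, 90, 90],
         [10, 10, 90, 90],
         [10, 10, 90, 90]]
cost = boundary_cost(gradient_magnitude(image))
print(cost[1])
```

Pixels next to the vertical intensity step get cost 0, so a shortest path is pulled along the step.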

19

Gradients, Edges, and Path Cost

Gradient Magnitude

Edge Image

Path Cost

20



2. User specifies a starting point (seed)
• Snapping

21





• Dijkstra’s shortest path algorithm

22

Dijkstra’s shortest path algorithm

Initialize, given seed s:
• Compute cost2(q, r) % cost for boundary from pixel q to neighboring pixel r
• cost(s) = 0 % total cost from seed to this point
• A = {s} % set to be expanded
• E = { } % set of expanded pixels
• P(q) % pointer to pixel that leads to q

Loop while A is not empty:
1. q = pixel in A with lowest cost
2. Remove q from A and add it to E
3. For each pixel r in the neighborhood of q that is not in E:
   a) cost_tmp = cost(q) + cost2(q, r)
   b) if (r is not in A) OR (cost_tmp < cost(r)):
      i. cost(r) = cost_tmp
      ii. P(r) = q
      iii. Add r to A
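A runnable sketch of the loop above, using a binary heap as the set A. Two simplifications are assumptions of this example rather than part of the slide: the step cost cost2(q, r) depends only on the destination pixel r, and diagonal steps are not scaled. The names `shortest_paths` and `trace_path` are hypothetical.

```python
import heapq

def shortest_paths(cost, seed):
    """Dijkstra from seed over an 8-connected pixel grid.

    cost[r][c] is the assumed cost of stepping onto pixel (r, c).
    Returns (total, parent): lowest path cost per pixel and back-pointers P.
    """
    rows, cols = len(cost), len(cost[0])
    total = {seed: 0.0}
    parent = {}
    expanded = set()      # the set E
    heap = [(0.0, seed)]  # the set A, ordered by cost
    while heap:
        d, q = heapq.heappop(heap)
        if q in expanded:
            continue  # stale entry: q was already expanded at a lower cost
        expanded.add(q)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                r = (q[0] + dr, q[1] + dc)
                if r == q or r in expanded:
                    continue
                if not (0 <= r[0] < rows and 0 <= r[1] < cols):
                    continue
                tmp = d + cost[r[0]][r[1]]
                if r not in total or tmp < total[r]:
                    total[r] = tmp
                    parent[r] = q
                    heapq.heappush(heap, (tmp, r))
    return total, parent

def trace_path(parent, seed, target):
    """Follow the back-pointers from target to seed and reverse the result."""
    path = [target]
    while path[-1] != seed:
        path.append(parent[path[-1]])
    return path[::-1]

cost = [[1, 9, 1],
        [1, 9, 1],
        [1, 1, 1]]
total, parent = shortest_paths(cost, (0, 0))
path = trace_path(parent, (0, 0), (0, 2))
print(total[(0, 2)], path)
```

Because the costs from the seed are computed once for every pixel, the path to the cursor can be read off the back-pointers as the cursor moves, without rerunning the search.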

23





4. Get new seed, get path between seeds, repeat

24

Intelligent Scissors: improving interaction

1. Snap when placing first seed

2. Automatically adjust to boundary as user drags

3. Freeze stable boundary points to make new seeds

25

A cut of a graph

26

Flow network, flow

27

28

29

30

31

32

33

34

35

Pixel-based statistical model

Modeling each pixel independently has limitations, because a pixel's label also depends on its relations to other pixels.

P(foreground | image)

36

Solution: encode dependences between pixels

P(foreground | image)

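$$P(\mathbf{y}; \theta, \text{data}) = \frac{1}{Z} \prod_{i=1..N} p_1(y_i; \theta, \text{data}) \prod_{i,j \in \text{edges}} p_2(y_i, y_j; \theta, \text{data})$$

• y: labels to be predicted
• p_1: individual predictions
• p_2: pairwise predictions
• Z: normalizing constant called “partition function”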

37

Writing likelihood as an energy

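$$P(\mathbf{y}; \theta, \text{data}) = \frac{1}{Z} \prod_{i=1..N} p_1(y_i; \theta, \text{data}) \prod_{i,j \in \text{edges}} p_2(y_i, y_j; \theta, \text{data})$$

- log(.)

$$\text{Energy}(\mathbf{y}; \theta, \text{data}) = \sum_i \psi_1(y_i; \theta, \text{data}) + \sum_{i,j \in \text{edges}} \psi_2(y_i, y_j; \theta, \text{data})$$

• ψ_1: cost of assignment y_i
• ψ_2: cost of pairwise assignment y_i, y_j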
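The step from likelihood to energy can be written out as follows. This is a sketch under the assumption that the potentials are defined as \(\psi_1 = -\log p_1\) and \(\psi_2 = -\log p_2\); the symbols \(\psi_1\), \(\psi_2\), and \(\theta\) are the ones assumed for the unary term, the pairwise term, and the model parameters.

```latex
\begin{align*}
-\log P(\mathbf{y}; \theta, \text{data})
  &= \log Z - \sum_{i} \log p_1(y_i; \theta, \text{data})
     - \sum_{i,j \in \text{edges}} \log p_2(y_i, y_j; \theta, \text{data}) \\
  &= \log Z + \sum_{i} \psi_1(y_i; \theta, \text{data})
     + \sum_{i,j \in \text{edges}} \psi_2(y_i, y_j; \theta, \text{data})
\end{align*}
```

Since \(\log Z\) does not depend on \(\mathbf{y}\), it can be dropped: the labelling that maximizes the likelihood is the one that minimizes the energy.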

38

Notes on energy-based formulation

Primarily used when you only care about the most likely solution (not the confidences).

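Can think of it as a general cost function.

Can have larger “cliques” than 2. The clique is the set of variables that go into a potential function.

$$\text{Energy}(\mathbf{y}; \theta, \text{data}) = \sum_i \psi_1(y_i; \theta, \text{data}) + \sum_{i,j \in \text{edges}} \psi_2(y_i, y_j; \theta, \text{data})$$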

40

Markov Random Fields

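$$\text{Energy}(\mathbf{y}; \theta, \text{data}) = \sum_i \psi_1(y_i; \theta, \text{data}) + \sum_{i,j \in \text{edges}} \psi_2(y_i, y_j; \theta, \text{data})$$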

Node yi: pixel label

Edge: constrained pairs

Cost to assign a label to each pixel

Cost to assign a pair of labels to connected pixels

41

Label smoothing grid example

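Unary potential

0: -log P(y_i = 0 | data)
1: -log P(y_i = 1 | data)

Pairwise potential (K > 0)

          y_j = 0   y_j = 1
y_i = 0      0         K
y_i = 1      K         0

$$\text{Energy}(\mathbf{y}; \theta, \text{data}) = \sum_i \psi_1(y_i; \theta, \text{data}) + \sum_{i,j \in \text{edges}} \psi_2(y_i, y_j; \theta, \text{data})$$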
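The label smoothing example can be tried on a tiny grid. The sketch below assumes a 4-connected grid and made-up per-pixel probabilities `p_fg` for P(y_i = 1 | data); the unary term is the negative log probability of the chosen label, and the pairwise term costs K whenever two neighbouring labels differ. The function name `mrf_energy` is hypothetical.

```python
import math

def mrf_energy(labels, p_fg, K=1.0):
    """Energy of a binary labelling on a 4-connected grid.

    p_fg[r][c] is an assumed per-pixel probability P(y_i = 1 | data).
    """
    rows, cols = len(labels), len(labels[0])
    e = 0.0
    for r in range(rows):
        for c in range(cols):
            p1 = p_fg[r][c]
            # Unary potential: -log P(y_i = label | data)
            e += -math.log(p1 if labels[r][c] == 1 else 1.0 - p1)
            # Pairwise potential: K for each neighbouring pair with different labels
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                e += K
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                e += K
    return e

p_fg = [[0.9, 0.9, 0.4],
        [0.9, 0.9, 0.9]]
smooth = [[1, 1, 1], [1, 1, 1]]
noisy = [[1, 1, 0], [1, 1, 1]]  # follows the unary term at every pixel
print(mrf_energy(smooth, p_fg, K=1.0), mrf_energy(noisy, p_fg, K=1.0))
```

With K = 0 the pixel-wise best labelling (`noisy`) has the lower energy; with K = 1 the two label changes it introduces cost more than the unary gain, so the smooth labelling wins.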

42

Creating the graph

• Each pixel has a corresponding vertex
• Additionally, a source (“object”) and a sink (“background”)
• Each pixel vertex has an edge to its neighbors (e.g. 8 adjacent neighbors in 2D), an edge to the source, an edge to the sink

43

Edge weights between pixels

Weights of edges between pixel vertices are determined by a function expressing the dependence between two pixels

Low score when boundary is likely to pass between the vertices

High score when vertices are probably part of the same element

E.g. the difference in pixel intensities, the image gradient
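One common way to get such a score from the difference in pixel intensities is a Gaussian of that difference. The sketch below is an assumed choice for illustration, not the weight prescribed by the slides; `edge_weight` and `sigma` are hypothetical names.

```python
import math

def edge_weight(intensity_p, intensity_q, sigma=10.0):
    """Near 1 for similar pixels, near 0 across a likely boundary."""
    diff = intensity_p - intensity_q
    return math.exp(-(diff * diff) / (2.0 * sigma * sigma))

same_region = edge_weight(100, 102)
across_boundary = edge_weight(100, 160)
print(same_region > across_boundary)
```

The parameter sigma sets how large an intensity difference has to be before the edge becomes cheap to cut.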

44

Solving MRFs with graph cuts

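$$\text{Energy}(\mathbf{y}; \theta, \text{data}) = \sum_i \psi_1(y_i; \theta, \text{data}) + \sum_{i,j \in \text{edges}} \psi_2(y_i, y_j; \theta, \text{data})$$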

Source (Label 0)

Sink (Label 1)

Cost to assign to 1

Cost to assign to 0

Cost to split nodes


46

Graph cut, Boykov & Jolly 2001

Image Min Cut

Cut: Separating source and sink; Energy: collection of edges

Min Cut: Global minimal energy in polynomial time

Foreground (source)

Background (sink)

47

Max flow

• Directed graph with one source & one sink node
• Directed edge = pipe
• Edge label = capacity
• What is the max flow from source to sink?

[Figure: flow network from Source to Sink; each edge is labeled with its capacity]

48

Max flow

• Graph with one source & one sink node
• Edge = pipe
• Edge label = capacity
• What is the max flow from source to sink?

[Figure: flow network from Source to Sink; each edge is labeled flow/capacity]

49

Max flow

What is the max flow from source to sink? Look at the residual graph:
• remove saturated edges (green here)
• min cut is at the boundary between the 2 connected components

[Figure: flow network from Source to Sink with edges labeled flow/capacity; the min cut is marked]

50

Equivalence of min cut/max flow

The three following statements are equivalent:
• The maximum flow is f
• The minimum cut has weight f
• The residual graph for flow f contains no directed path from source to sink
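The equivalence can be checked on a small network. The sketch below uses the Edmonds-Karp variant (breadth-first augmenting paths), which is one standard way to compute a maximum flow and is chosen here only for brevity; the function name and the example capacities are made up for illustration. When no augmenting path remains, the nodes still reachable from the source in the residual graph form the source side of a minimum cut.

```python
from collections import deque

def max_flow_min_cut(capacity, source, sink):
    """Return (max flow value, source side of a min cut).

    capacity: dict mapping (u, v) to the capacity of the directed edge u -> v.
    """
    residual = dict(capacity)
    neighbors = {}
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)  # reverse edge, used to undo flow
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    def reachable():
        """BFS parents of all nodes reachable from source in the residual graph."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable()
        if sink not in parent:
            return flow, set(parent)  # no directed path left: flow is maximal
        # Find the bottleneck capacity along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck

capacity = {("s", "a"): 4, ("s", "b"): 2, ("a", "b"): 1,
            ("a", "t"): 2, ("b", "t"): 3}
flow, source_side = max_flow_min_cut(capacity, "s", "t")
cut_weight = sum(w for (u, v), w in capacity.items()
                 if u in source_side and v not in source_side)
print(flow, source_side, cut_weight)
```

In this example the flow value and the weight of the edges leaving the source side are both 5, as the equivalence states.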

[Figure: flow network from Source to Sink with edges labeled flow/capacity; the min cut is marked]

51

Normalized cut

• A minimum cut penalizes large segments
• This can be fixed by normalizing the cut by component size
• The normalized cut cost is:

$$\text{Ncut}(A, B) = \frac{\text{cut}(A, B)}{\text{assoc}(A, V)} + \frac{\text{cut}(A, B)}{\text{assoc}(B, V)}$$

assoc(A, V) = sum of weights of all edges in V that touch A

• The exact solution is NP-hard but an approximation can be computed by solving a generalized eigenvalue problem

J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI 2000

http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf
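A small numeric example shows why the normalization matters. The sketch below evaluates the cut and the normalized cut cost for two candidate partitions of a made-up graph (two tight clusters joined by a weak bridge, plus one weakly attached outlier node). It follows the slide's wording for assoc, with each undirected edge stored and counted once; the function names and weights are assumptions of this example, and no eigenvalue solver is involved.

```python
def cut(weights, A, B):
    """Sum of weights of edges with one end in A and the other in B."""
    return sum(w for (u, v), w in weights.items()
               if (u in A and v in B) or (u in B and v in A))

def assoc(weights, A):
    """Sum of weights of all edges that touch A."""
    return sum(w for (u, v), w in weights.items() if u in A or v in A)

def ncut(weights, A, B):
    """Normalized cut cost: cut/assoc(A) + cut/assoc(B)."""
    c = cut(weights, A, B)
    return c / assoc(weights, A) + c / assoc(weights, B)

weights = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
           ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
           ("c", "d"): 0.2,   # weak bridge between the two clusters
           ("f", "g"): 0.1}   # weakly attached outlier
outlier = ({"g"}, {"a", "b", "c", "d", "e", "f"})
balanced = ({"a", "b", "c"}, {"d", "e", "f", "g"})
print(cut(weights, *outlier), cut(weights, *balanced))
print(ncut(weights, *outlier), ncut(weights, *balanced))
```

The plain cut is smaller for the partition that splits off the single outlier, while the normalized cut cost is much smaller for the balanced partition between the two clusters.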

52

GrabCut segmentation

User provides rough indication of foreground region.

Goal: Automatically provide a pixel-level segmentation.
• Less user input, rectangle only.
• Handles color

Carsten Rother (professor at TU Dresden now) et al. 2004

53

Grab cuts and graph cuts

User Input

Result

Magic Wand (198?)

Intelligent Scissors, Mortensen and Barrett (1995)

GrabCut

Regions

Boundary

Source: K. Rother

54

Color model

Gaussian Mixture Model (typically 5-8 components)

[Figure: color distributions plotted in the R-G plane; labels: Foreground & Background, Background]

Source: K. Rother

55

Color model 2

Gaussian Mixture Model (typically 5-8 components)

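[Figure: two plots of color distributions in the R-G plane, connected by the label "Iterated graph cut"; labels: Foreground & Background, Background (first plot); Foreground, Background (second plot)]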

Source: K. Rother

56

GrabCut segmentation

1. Define graph
• usually 4-connected or 8-connected
• Divide diagonal potentials by sqrt(2)

2. Define unary potentials
• Color histogram or mixture of Gaussians for background and foreground

$$\text{unary\_potential}(x) = -\log\left(\frac{P(c(x); \theta_{\text{foreground}})}{P(c(x); \theta_{\text{background}})}\right)$$

3. Define pairwise potentials

$$\text{edge\_potential}(x, y) = k_1 + k_2 \exp\left\{\frac{-\lVert c(x) - c(y)\rVert^2}{2\sigma^2}\right\}$$

4. Apply graph cuts

5. Return to 2, using current labels to compute foreground, background models
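The two potentials can be sketched as small functions. This is an illustration under stated assumptions: the edge potential is taken to be k1 + k2 * exp(-||c(x) - c(y)||^2 / (2 sigma^2)) and the unary potential the negative log ratio of the foreground and background color likelihoods; the likelihoods are passed in as plain numbers (in GrabCut they would come from the fitted color models), and the default values of `k1`, `k2`, and `sigma` are arbitrary.

```python
import math

def edge_potential(color_x, color_y, k1=0.0, k2=1.0, sigma=10.0):
    """k1 + k2 * exp(-||c(x) - c(y)||^2 / (2 sigma^2)) for two color tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(color_x, color_y))
    return k1 + k2 * math.exp(-sq_dist / (2.0 * sigma ** 2))

def unary_potential(p_foreground, p_background):
    """-log(P(c(x); foreground) / P(c(x); background)).

    Negative when the pixel's color is more likely under the foreground model.
    """
    return -math.log(p_foreground / p_background)

print(edge_potential((10, 10, 10), (10, 10, 10)))     # identical colors
print(edge_potential((10, 10, 10), (200, 180, 90)))   # very different colors
print(unary_potential(0.8, 0.2), unary_potential(0.2, 0.8))
```

Separating two pixels of similar color is expensive, while separating pixels across a strong color change is nearly free, which is what pulls the cut toward image edges.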

57

What is easy or hard about these cases for graphcut-based segmentation?

58

Easier examples

GrabCut – Interactive Foreground Extraction 10

59

More difficult Examples

Camouflage & Low Contrast

Harder Case

Fine structure

Initial Rectangle

Initial Result


60

Using graph cuts for recognition

TextonBoost (Shotton et al. 2009 IJCV)
