Page 1

Segmentation

Course web page: vision.cis.udel.edu/~cv

May 2, 2003 Lecture 29

Page 2

Announcements

• Read Forsyth & Ponce Chapter 14.4 and Chapter 25 on clustering and digital libraries, respectively

Page 3

Outline

• Definition of segmentation
• Grouping strategies
• Segmentation applications
  – Detecting shot boundaries
  – Background subtraction

Page 4

What is Segmentation?

• Clustering image elements that “belong together”
  – Partitioning: Divide into regions/sequences with coherent internal properties
  – Grouping: Identify sets of coherent tokens in the image
• Tokens: Whatever we need to group
  – Pixels
  – Features (corners, lines, etc.)
  – Larger regions (e.g., arms, legs, torso)
  – Discrete objects (e.g., people in a crowd)
  – Etc.

Page 5

Example: Partitioning by Texture

courtesy of University of Bonn

Page 6

Fitting

• Associate model(s) with tokens
  – Estimation: What are the parameters of the model for a given set of tokens? (least squares, etc.)
  – Correspondence: Which token belongs to which model? (RANSAC, etc.)
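The estimation/correspondence split can be made concrete with a minimal RANSAC line-fitting sketch; the function names, iteration count, and inlier tolerance below are illustrative choices, not values from the lecture:

```python
import numpy as np

def fit_line_least_squares(points):
    """Estimation step: least-squares line y = a*x + b through the points."""
    x, y = points[:, 0], points[:, 1]
    a, b = np.polyfit(x, y, 1)
    return a, b

def ransac_line(points, n_iters=200, inlier_tol=0.5, rng=None):
    """Correspondence step: RANSAC decides which tokens belong to the line."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if np.isclose(x1, x2):
            continue                      # skip degenerate vertical pairs
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = residuals < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit by least squares on the winning consensus set
    return fit_line_least_squares(points[best_inliers]), best_inliers
```

The random sampling solves correspondence (which points belong to the line), and the final least-squares fit solves estimation on the agreed-upon tokens.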

Page 7

Approaches to Grouping

• Bottom-up segmentation
  – Tokens belong together because they are locally coherent
• Top-down segmentation
  – Tokens belong together because they lie on the same object; must recognize the object first
  – RANSAC implements this in a very basic form
    • Not clear how to apply it to higher-level concepts (i.e., objects for which we lack analytic models)
• Not mutually exclusive: successful algorithms generally require both

Page 8

Gestalt Theory of Grouping

• Psychological basis for why/how things are grouped
• Figure-ground discrimination
  – Grouping can be seen in terms of allocating tokens to figure or ground
• Factors affecting token coherence
  – Proximity
  – Similarity: Based on color, texture, orientation (aka parallelism), etc.
  – Common fate: Parallel motion (i.e., segmentation of optical flow by similarity)
  – Common region: Tokens that lie inside the same closed region tend to be grouped together
  – Closure: Tokens or curves that tend to lead to closed curves tend to be grouped together
  – Symmetry: Curves that lead to symmetric groups are grouped together
  – Continuity: Tokens that lead to “continuous” curves (“joining up nicely,” rather than continuity in the formal sense) tend to be grouped
  – Familiar configuration: Tokens that, when grouped, lead to a familiar object (e.g., the top-down recognition that allows us to see the Dalmatian from Forsyth & Ponce)

Page 9

Gestalt Grouping Factors

from Forsyth & Ponce

Page 10

Example: Bottom-Up Segmentation

Segmenting cheese curds by texture (note the importance of scale!)

Page 11

Example: Top-Down Segmentation

from Forsyth & Ponce

Page 12

Application: Shot Boundary Detection

• The problem: Divide video footage into a set of shots
  – Each shot is a continuous sequence of frames from one camera
• Types
  – Cut: Shot changes in one frame
  – Fade, wipe, dissolve, etc.: Multi-frame transition
• Applications
  – Video editing is easier since shots become tokens
  – Can summarize video with key frames from each shot

from M. Smith & T. Kanade

Page 13

Shot Boundary Detection

• Basic approach: Threshold the inter-frame difference
• Possible metrics
  – Raw: SSD, correlation, etc.
    • More sensitive to camera motion
  – Histogram
  – Edge comparison
  – Break into blocks
• Use hysteresis to handle gradual transitions

[Figure: graph of frame-to-frame histogram difference, from M. Smith & T. Kanade]
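A minimal sketch of the histogram metric combined with a two-threshold hysteresis rule; the bin count and both thresholds are illustrative assumptions, not values from the lecture:

```python
import numpy as np

def _norm_hist(frame, bins=64):
    """Normalized grayscale histogram of one frame."""
    h, _ = np.histogram(frame, bins=bins, range=(0, 255))
    h = h.astype(float)
    return h / h.sum()

def histogram_difference(frame_a, frame_b, bins=64):
    """L1 distance between the frames' normalized histograms (0 to 2)."""
    return np.abs(_norm_hist(frame_a, bins) - _norm_hist(frame_b, bins)).sum()

def detect_shot_boundaries(frames, t_high=0.5, t_low=0.2):
    """Hard cuts exceed t_high in one step; a gradual transition is flagged
    when the difference first rises above the lower hysteresis threshold."""
    boundaries = []
    in_transition = False
    for i in range(1, len(frames)):
        d = histogram_difference(frames[i - 1], frames[i])
        if d > t_high:
            boundaries.append(i)          # cut: shot changes in one frame
            in_transition = False
        elif d > t_low:
            if not in_transition:         # start of a fade/wipe/dissolve
                boundaries.append(i)
                in_transition = True
        else:
            in_transition = False
    return boundaries
```

The histogram metric is less sensitive to camera motion than raw SSD, at the cost of missing cuts between shots with similar color statistics.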

Page 14

Application: Background Subtraction

• The problem: Assuming a static camera, discriminate moving foreground objects from the background
• Applications
  – Traffic monitoring
  – Surveillance/security
  – User interaction

[Figure: current image, background image, and foreground pixels, from C. Stauffer and W. Grimson; Pfinder result courtesy of C. Wren]

Page 15

Background Subtraction: Simple Approaches

• Adjacent frame difference: Each image is subtracted from the previous image in the sequence; absolute pixel differences greater than a threshold are marked as foreground (|I_t − I_{t−1}| > τ)
• Mean & threshold: Pixel-wise mean values are computed during a training phase; pixels within a fixed threshold of the mean are considered background

adapted from K. Toyama et al.
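The two simple approaches above can be sketched in a few lines; the threshold value is an arbitrary illustrative choice:

```python
import numpy as np

def frame_difference_mask(frame, prev_frame, tau=25.0):
    """Adjacent frame difference: foreground where |I_t - I_{t-1}| > tau."""
    return np.abs(frame - prev_frame) > tau

def train_background_mean(training_frames):
    """Pixel-wise mean over a (foreground-free) training sequence."""
    return np.mean(np.stack(training_frames), axis=0)

def mean_threshold_mask(frame, background_mean, tau=25.0):
    """Mean & threshold: foreground where a pixel strays from its mean."""
    return np.abs(frame - background_mean) > tau
```

Note the weakness the next slides address: a single fixed threshold per pixel cannot cope with multimodal backgrounds or changing illumination.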

Page 16

Results & Problems for Simple Approaches

from K. Toyama et al.

Page 17

Background Subtraction: Issues

• Noise models
  – Unimodal: Pixel values vary over time even for static scenes
  – Multimodal: Features in the background can “oscillate,” requiring models that can represent disjoint sets of pixel values (e.g., waving trees against the sky)
• Gross illumination changes
  – Continuous: Gradual illumination changes alter the appearance of the background (e.g., time of day)
  – Discontinuous: Sudden changes in illumination and other scene parameters alter the appearance of the background (e.g., flipping a light switch)
• Bootstrapping
  – Is a training phase with “no foreground” necessary, or can the system learn what’s static vs. dynamic online?

Page 18

Pixel RGB Distributions over time

Perceived color values of solid objects (e.g., a tree trunk) have roughly Gaussian distributions due to CCD noise, etc. Leaf and monitor pixels have bimodal distributions because of waving and flickering, respectively.

courtesy of J. Buhmann

Page 19

Improved Approaches to Background Subtraction

• Mean & covariance: The mean and covariance of pixel values are updated continuously
  – A moving average adapts to slowly changing illumination (a low-pass temporal filter)
  – Foreground pixels are determined using a threshold on the Mahalanobis distance
• Mixture of Gaussians: A pixel-wise mixture of multiple Gaussians models the background

adapted from K. Toyama et al.
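One way to sketch the running mean-and-covariance model with a Mahalanobis test; the learning rate and threshold below are illustrative assumptions, not values from Toyama et al.:

```python
import numpy as np

def update_background(mean, cov, pixel, alpha=0.05):
    """Low-pass temporal filter: exponential moving average of the
    per-pixel mean and covariance. alpha is an illustrative rate."""
    diff = pixel - mean
    new_mean = mean + alpha * diff
    new_cov = (1.0 - alpha) * cov + alpha * np.outer(diff, diff)
    return new_mean, new_cov

def is_foreground(pixel, mean, cov, tau=3.0):
    """Classify as foreground if the Mahalanobis distance from the
    background model exceeds tau standard deviations."""
    diff = pixel - mean
    d2 = diff @ np.linalg.inv(cov) @ diff
    return float(np.sqrt(d2)) > tau
```

A mixture-of-Gaussians model generalizes this by keeping several (mean, covariance, weight) triples per pixel, which handles the multimodal "waving trees" case a single Gaussian cannot.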

Page 20

Ellipsoids of Constant Probability for Gaussian Distributions

from Duda et al.

Page 21

Fitting Gaussians to Color Distributions

Can parametrize the scaling, rotation, and translation of the ellipsoid with the SVD of the covariance matrix
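As a sketch of that parametrization, the SVD of the covariance matrix yields the ellipsoid's axes (rotation) and per-axis radii (scaling):

```python
import numpy as np

def ellipsoid_from_covariance(cov):
    """Decompose a covariance matrix: U's columns give the ellipsoid's
    axes (rotation); the square roots of the singular values give the
    radii (per-axis scaling, in standard deviations)."""
    u, s, _ = np.linalg.svd(cov)
    return u, np.sqrt(s)
```

Translation is simply the distribution's mean, which shifts the ellipsoid's center.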

Page 22

Mahalanobis Distance

• Distance of a point from a Gaussian distribution
  – Measured along the axes of the fitted ellipsoid
  – In units of standard deviations (i.e., scaled by the covariance matrix)

[Figure: sample point x relative to a fitted Gaussian, adapted from Duda & Hart]
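Written out, for a point x, mean μ, and covariance Σ, the Mahalanobis distance is:

```latex
D_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^{\top} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})}
```

When Σ is the identity this reduces to ordinary Euclidean distance; otherwise the inverse covariance rescales each axis of the fitted ellipsoid.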

Page 23

Example: Background Subtraction for Surveillance

courtesy of Elgammal et al.