Fitting a transformation: feature-based alignment

Thursday, September 24th, 2015
Devi Parikh
Virginia Tech

Slide credit: Kristen Grauman

Disclaimer: Many slides have been borrowed from Kristen Grauman, who may have borrowed some of them from others. Any time a slide did not already have a credit on it, I have credited it to Kristen. So there is a chance some of these credits are inaccurate.

Announcements

• Project proposals
  – Due Tuesday
  – Teams of > 2
  – Look at class webpage for guidelines

• PS2 out
  – Due October 5th

• PS1 graded
  – Grades will be released soon

Slide credit: Adapted by Devi Parikh from Kristen Grauman

Last time: Deformable contours

a.k.a. active contours, snakes

Given: initial contour (model) near desired object

Goal: evolve the contour to fit exact object boundary

Main idea: elastic band is iteratively adjusted so as to
• be near image positions with high gradients, and
• satisfy shape “preferences” or contour priors

[Snakes: Active contour models, Kass, Witkin, & Terzopoulos, ICCV 1987]

Figure credit: Yuri Boykov

Slide credit: Kristen Grauman

Last time: Deformable contours

Image from http://www.healthline.com/blogs/exercise_fitness/uploaded_images/HandBand2-795868.JPG Kristen Grauman

4

Recap: deformable contour

• A simple elastic snake is defined by:
  – A set of n points,
  – An internal energy term (tension, bending, plus optional shape prior)
  – An external energy term (gradient-based)

• To use to segment an object:
  – Initialize in the vicinity of the object
  – Modify the points to minimize the total energy

Kristen Grauman

5

Viterbi algorithm

Main idea: determine optimal position (state) of predecessor, for each possible position of self. Then backtrack from best state for last vertex.

$$E_{total} = E_1(v_1, v_2) + E_2(v_2, v_3) + \dots + E_{n-1}(v_{n-1}, v_n)$$

[Figure: trellis with one column per vertex $v_1, v_2, v_3, v_4, \dots, v_n$ and one row per state $1, 2, \dots, m$. Column $k$ holds the running costs $E_k(1), E_k(2), E_k(3), \dots, E_k(m)$, with $E_1(1) = E_1(2) = E_1(3) = \dots = E_1(m) = 0$. Neighboring columns are linked by the pairwise terms $E_1(v_1, v_2)$, $E_2(v_2, v_3)$, $E_3(v_3, v_4)$, $E_4(v_4, v_n)$.]

Complexity: $O(nm^2)$ vs. brute force search ____?

Example adapted from Y. Boykov
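The backtracking idea above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `viterbi_snake` and the layout of `pairwise_costs` (one m-by-m table per consecutive vertex pair) are assumptions made for the example.

```python
def viterbi_snake(num_states, pairwise_costs):
    """Minimize E_1(v_1,v_2) + ... + E_{n-1}(v_{n-1},v_n) over an open chain.

    pairwise_costs[k][a][b] is the (hypothetical) cost of putting vertex k in
    state a and vertex k+1 in state b, with 0-based indices.
    Returns (minimum total energy, list with one chosen state per vertex).
    """
    m = num_states
    cost = [0.0] * m          # running cost for each state of the first vertex
    back = []                 # back[k][b]: best state of vertex k given vertex k+1 = b
    for table in pairwise_costs:
        new_cost, pointers = [], []
        for b in range(m):
            # Best predecessor state for "self" being in state b.
            best_a = min(range(m), key=lambda a: cost[a] + table[a][b])
            pointers.append(best_a)
            new_cost.append(cost[best_a] + table[best_a][b])
        back.append(pointers)
        cost = new_cost

    # Backtrack from the best state of the last vertex.
    state = min(range(m), key=lambda s: cost[s])
    best_energy = cost[state]
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return best_energy, path


# Three vertices, two states each.
E1 = [[4, 1], [2, 5]]
E2 = [[3, 6], [7, 0]]
energy, states = viterbi_snake(2, [E1, E2])
```

Each of the n-1 tables is scanned once with m x m work, which is where the O(nm^2) cost comes from.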

6


Energy minimization: dynamic programming

DP can be applied to optimize an open ended snake:

$$E_1(v_1, v_2) + E_2(v_2, v_3) + \dots + E_{n-1}(v_{n-1}, v_n)$$

For a closed snake, a “loop” is introduced into the total energy:

$$E_1(v_1, v_2) + E_2(v_2, v_3) + \dots + E_{n-1}(v_{n-1}, v_n) + E_n(v_n, v_1)$$

[Figure: snake vertices 1, 2, 3, 4, ..., n-1, n drawn as an open chain and as a closed loop]

Work around:

1) Fix v1 and solve for the rest.

2) Fix an intermediate node at its position found in (1), solve for the rest.


Aspects we need to consider

• Representation of the contours
• Defining the energy functions
  – External
  – Internal
• Minimizing the energy function
• Extensions:
  – Tracking
  – Interactive segmentation

9


Tracking via deformable contours

1. Use final contour/model extracted at frame t as an initial solution for frame t+1

2. Evolve initial contour to fit exact object boundary at frame t+1

3. Repeat, initializing with most recent frame.

Tracking Heart Ventricles (multiple frames)

Kristen Grauman

10

Tracking via deformable contours

Applications:
• Traffic monitoring
• Human-computer interaction
• Animation
• Surveillance
• Computer assisted diagnosis in medical imaging

Visual Dynamics Group, Dept. Engineering Science, University of Oxford.
http://www.robots.ox.ac.uk/~vdg/~vdg/

Kristen Grauman

3D active contours

• Jörgen Ahlberg
• http://www.cvl.isy.liu.se/ScOut/Masters/Papers/Ex1708.pdf

Kristen Grauman

12

Limitations

• May over-smooth the boundary

• Cannot follow topological changes of objects

13


Limitations

• External energy: snake does not really “see” object boundaries in the image unless it gets very close to them.

[Figure: image gradients $\nabla I$ are large only directly on the boundary]

14


Distance transform

• External image can instead be taken from the distance transform of the edge image.

[Figure panels: original, -gradient, edges, distance transform]

Value at (x,y) tells how far that position is from the nearest edge point (or other binary image structure)

>> help bwdist
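To make the definition concrete, here is a brute-force Python sketch of a Euclidean distance transform. It only illustrates the definition (distance to the nearest nonzero pixel) and is far slower than a real implementation such as MATLAB's bwdist; the function name `distance_transform` is made up for this example.

```python
import math


def distance_transform(edges):
    """Brute-force Euclidean distance transform of a binary edge image.

    edges is a 2D list of 0/1 values; the result has the same shape and holds,
    for each position, the distance to the nearest pixel whose value is 1.
    """
    edge_points = [(r, c)
                   for r, row in enumerate(edges)
                   for c, value in enumerate(row) if value]
    if not edge_points:
        raise ValueError("edge image contains no edge pixels")
    return [[min(math.hypot(r - er, c - ec) for er, ec in edge_points)
             for c in range(len(row))]
            for r, row in enumerate(edges)]


edges = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
dist = distance_transform(edges)
```

Unlike the raw gradient, this map changes smoothly with distance from the edge, so a snake that starts far from the boundary still receives a signal about which way to move.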

Kristen Grauman

15

Deformable contours: pros and cons

Pros:
• Useful to track and fit non-rigid shapes
• Contour remains connected
• Possible to fill in “subjective” contours
• Flexibility in how energy function is defined, weighted.

Cons:
• Must have decent initialization near true boundary, may get stuck in local minimum
• Parameters of energy function must be set well based on prior information

Kristen Grauman

16

Summary

• Deformable shapes and active contours are useful for
  – Segmentation: fit or “snap” to boundary in image
  – Tracking: previous frame’s estimate serves to initialize the next

• Fitting active contours:
  – Define terms to encourage certain shapes, smoothness, low curvature, push/pulls, …
  – Use weights to control relative influence of each component cost
  – Can optimize 2d snakes with Viterbi algorithm.

• Image structure (esp. gradients) can act as attraction force for interactive segmentation methods.

Kristen Grauman

17

Today

• Interactive segmentation
• Feature-based alignment
  – 2D transformations
  – Affine fit
  – RANSAC

18




Interactive forces

How can we implement such an interactive force with deformable contours?

Kristen Grauman

20

Interactive forces

• An energy function can be altered online based on user input – use the cursor to push or pull the initial snake away from a point.

• Modify external energy term to include a term such that

$$E_{push} = \sum_{i=0}^{n-1} \frac{r^2}{|v_i - p|^2}$$

Nearby points get pushed hardest

Adapted by Devi Parikh from Kristen Grauman

21

Intelligent scissors

[Mortensen & Barrett, SIGGRAPH 1995, CVPR 1999]

Another form of interactive segmentation:

Compute optimal paths from every point to the seed based on edge-related costs.
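One way to picture the "optimal paths to the seed" step is Dijkstra's algorithm on the pixel grid. The sketch below is an assumption-laden illustration rather than Mortensen and Barrett's actual formulation: it uses a single per-pixel cost (low on edges, high elsewhere), 4-connected neighbors, and the made-up names `paths_to_seed` and `trace`.

```python
import heapq


def paths_to_seed(cost, seed):
    """Dijkstra from the seed over a 4-connected pixel grid.

    cost[r][c] is a non-negative cost paid when a path enters pixel (r, c).
    Returns (dist, parent): dist holds the cheapest cumulative cost to every
    pixel, and parent[r][c] is the next pixel on the way back to the seed.
    """
    rows, cols = len(cost), len(cost[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    parent = [[None] * cols for _ in range(rows)]
    dist[seed[0]][seed[1]] = 0.0
    heap = [(0.0, seed)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    parent[nr][nc] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist, parent


def trace(parent, start):
    """Follow parent pointers from start back to the seed."""
    path = [start]
    while parent[path[-1][0]][path[-1][1]] is not None:
        path.append(parent[path[-1][0]][path[-1][1]])
    return path


# Cheap pixels (cost 1) form a "U"-shaped edge; the middle column is expensive.
cost = [[1, 9, 1],
        [1, 9, 1],
        [1, 1, 1]]
dist, parent = paths_to_seed(cost, (0, 0))
path = trace(parent, (0, 2))
```

Because all paths are computed once per seed, moving the cursor only requires a cheap `trace` call, which is what makes the interaction feel live.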

22

Adapted by Devi Parikh from Kristen Grauman
Demo: http://www.luberth.com/java/scissors/

http://rivit.cs.byu.edu/Eric/Eric.html




Beyond boundary snapping…

• Another form of interactive guidance: specify regions
• Usually taken to suggest foreground/background color distributions

Boykov and Jolly (2001)

[Figure: User Input, Result]

How to use this information?

Kristen Grauman

25

Recall: Images as graphs

Fully-connected graph
• node for every pixel
• link between every pair of pixels, p,q
• similarity $w_{pq}$ for each link
  » similarity is inversely proportional to difference in color and position

[Figure: pixels p and q joined by a link with weight $w_{pq}$]

Steve Seitz

26

Recall: Segmentation by Graph Cuts

Break graph into segments
• Delete links that cross between segments
• Easiest to break links that have low similarity
  – similar pixels should be in the same segments
  – dissimilar pixels should be in different segments

[Figure: graph with link weights w cut into segments A, B, C]

Steve Seitz

27

Graph cuts for interactive segmentation

Adding hard constraints:

Add two additional nodes, object and background “terminals”

Link each pixel
• To both terminals
• To its neighboring pixels

Yuri Boykov


Adding hard constraints:

Let the edge weight to object or background terminal reflect similarity to the respective seed pixels.

Yuri Boykov

Intelligent Scissors: Mortensen and Barrett (1995)

GrabCut: Rother et al. (2004)

Graph Cuts: Boykov and Jolly (2001)


Another interaction modality: specify bounding box

32


“Grab Cut”

• Loosely specify foreground region
• Iterated graph cut

Rother et al (2004)

[Figure: user initialisation, then K-means for learning colour distributions, then graph cuts to infer the segmentation]

33

“Grab Cut”

• Loosely specify foreground region
• Iterated graph cut

Rother et al (2004)

Gaussian Mixture Model (typically 5-8 components)

[Figure: colour distributions plotted in the R-G plane. Initially: “Foreground & Background” vs. “Background”. After iterated graph cut: “Foreground” vs. “Background”.]

34

“Grab Cut”

Rother et al (2004)

Topics overview

• Features & filters
• Grouping & fitting
  – Segmentation and clustering
  – Hough transform
  – Deformable contours
  – Alignment and 2D image transformations
• Multiple views and motion
• Recognition
• Video processing

36




Motivation: Recognition

Figures from David Lowe

38


Motivation: medical image registration

39


Motivation: mosaics

Image from http://graphics.cs.cmu.edu/courses/15-463/2010_fall/

(In detail next week)

40


Alignment problem

• We have previously considered how to fit a model to image evidence
  – e.g., a line to edge points, or a snake to a deforming contour

• In alignment, we will fit the parameters of some transformation according to a set of matching feature pairs (“correspondences”).

[Figure: transformation T maps each feature $x_i$ to its match $x_i'$]

41


Parametric (global) warping

Examples of parametric warps:
• translation
• rotation
• aspect
• affine
• perspective

Source: Alyosha Efros

42

Parametric (global) warping

Transformation T is a coordinate-changing machine:

p’ = T(p)

[Figure: T maps p = (x,y) to p’ = (x’,y’)]

What does it mean that T is global?
• Is the same for any point p
• Can be described by just a few numbers (parameters)

Let’s represent T as a matrix:

p’ = Mp

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = M \begin{bmatrix} x \\ y \end{bmatrix}$$


43

Scaling

Scaling a coordinate means multiplying each of its components by a scalar.

Uniform scaling means this scalar is the same for all components:

[Figure: uniform scaling by a factor of 2]


44

Scaling

Non-uniform scaling: different scalars per component:

[Figure: X scaled by 2, Y scaled by 0.5]


45

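Scaling

Scaling operation:

$$x' = ax, \qquad y' = by$$

Or, in matrix form:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \underbrace{\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}}_{\text{scaling matrix } S} \begin{bmatrix} x \\ y \end{bmatrix}$$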


46

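What transformations can be represented with a 2x2 matrix?

2D Rotate around (0,0)?

$$x' = \cos\Theta \, x - \sin\Theta \, y, \qquad y' = \sin\Theta \, x + \cos\Theta \, y$$

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\Theta & -\sin\Theta \\ \sin\Theta & \cos\Theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

2D Shear?

$$x' = x + sh_x \, y, \qquad y' = sh_y \, x + y$$

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & sh_x \\ sh_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

2D Scaling?

$$x' = s_x \, x, \qquad y' = s_y \, y$$

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$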

47

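What transformations can be represented with a 2x2 matrix?

2D Mirror about Y axis?

$$x' = -x, \qquad y' = y$$

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

2D Mirror over (0,0)?

$$x' = -x, \qquad y' = -y$$

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

2D Translation?

$$x' = x + t_x, \qquad y' = y + t_y$$

NO!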

48

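2D Linear Transformations

Only linear 2D transformations can be represented with a 2x2 matrix.

Linear transformations are combinations of …
• Scale,
• Rotation,
• Shear, and
• Mirror

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$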


49

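Homogeneous coordinates

To convert to homogeneous coordinates:

$$(x, y) \Rightarrow \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \quad \text{(homogeneous image coordinates)}$$

Converting from homogeneous coordinates:

$$\begin{bmatrix} x \\ y \\ w \end{bmatrix} \Rightarrow (x/w, \; y/w)$$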

50


Homogeneous Coordinates

Q: How can we represent 2d translation as a 3x3 matrix using homogeneous coordinates?

$$x' = x + t_x, \qquad y' = y + t_y$$

A: Using the rightmost column:

$$\text{Translation} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}$$


51

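Homogeneous Coordinates

Translation

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + t_x \\ y + t_y \\ 1 \end{bmatrix}$$

[Figure: translation with $t_x = 2$, $t_y = 1$]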
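The translation example (t_x = 2, t_y = 1) can be checked numerically. The following NumPy sketch uses helper names of my own choosing (`translation`, `rotation`, `apply`); it builds the 3x3 matrices, appends a 1 to each point, and divides by w on the way back.

```python
import numpy as np


def translation(tx, ty):
    """3x3 homogeneous translation matrix (offsets in the rightmost column)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])


def rotation(theta):
    """3x3 homogeneous rotation about the origin, theta in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def apply(M, point):
    """Lift (x, y) to [x, y, 1], multiply by M, then divide by w."""
    x, y = point
    xh, yh, w = M @ np.array([x, y, 1.0])
    return (xh / w, yh / w)


moved = apply(translation(2, 1), (3, 4))            # translate only
turned = apply(rotation(np.pi / 2), (1, 0))         # rotate only
# Matrix products compose right to left: rotate first, then translate.
both = apply(translation(2, 1) @ rotation(np.pi / 2), (1, 0))
```

The practical payoff of the extra coordinate is the last line: translation now composes with the linear transformations by plain matrix multiplication.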


52

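Basic 2D Transformations

Basic 2D transformations as 3x3 matrices

Translate:

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

Scale:

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

Rotate:

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\Theta & -\sin\Theta & 0 \\ \sin\Theta & \cos\Theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

Shear:

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & sh_x & 0 \\ sh_y & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$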


53

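2D Affine Transformations

Affine transformations are combinations of …
• Linear transformations, and
• Translations

Parallel lines remain parallel

$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ w \end{bmatrix}$$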

54



Image alignment

• Two broad approaches:
  – Direct (pixel-based) alignment
    • Search for alignment where most pixels agree
  – Feature-based alignment
    • Search for alignment where extracted features agree
    • Can be verified using pixel-based alignment


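Fitting an affine transformation

• Assuming we know the correspondences, how do we get the transformation?

[Figure: matched points $(x_i, y_i)$ and $(x_i', y_i')$ in the two images]

$$\begin{bmatrix} x_i' \\ y_i' \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \end{bmatrix}$$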

58


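An aside: Least Squares Example

Say we have a set of data points (X1,X1’), (X2,X2’), (X3,X3’), etc. (e.g. person’s height vs. weight)

We want a nice compact formula (a line) to predict X’s from Xs: Xa + b = X’

We want to find a and b

How many (X,X’) pairs do we need?

$$X_1 a + b = X_1', \qquad X_2 a + b = X_2'$$

$$\begin{bmatrix} X_1 & 1 \\ X_2 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} X_1' \\ X_2' \end{bmatrix} \qquad Ax = B$$

What if the data is noisy?

$$\begin{bmatrix} X_1 & 1 \\ X_2 & 1 \\ X_3 & 1 \\ \vdots & \vdots \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} X_1' \\ X_2' \\ X_3' \\ \vdots \end{bmatrix} \qquad \text{(overconstrained)}$$

$$\min \|Ax - B\|^2$$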
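As a small numerical check of the overconstrained case, the sketch below stacks one row [X_i, 1] per data point and lets NumPy's `np.linalg.lstsq` minimize ||Ax - B||^2. The function name `fit_line` and the sample data are made up for illustration.

```python
import numpy as np


def fit_line(xs, x_primes):
    """Least-squares estimate of (a, b) in X*a + b = X'."""
    A = np.column_stack([xs, np.ones(len(xs))])   # one row [X_i, 1] per point
    B = np.asarray(x_primes, dtype=float)
    (a, b), *_ = np.linalg.lstsq(A, B, rcond=None)
    return a, b


# Exact data on the line X' = 2X + 1: any two points would already suffice.
a_exact, b_exact = fit_line([0, 1, 2, 3], [1, 3, 5, 7])

# Noisy data: no line passes through all three points, so we get the best fit.
a_noisy, b_noisy = fit_line([0, 1, 2], [0, 1, 1])
```

With two unknowns, two pairs determine the line exactly; extra pairs only matter when the data is noisy, in which case they average the noise out.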


59

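Fitting an affine transformation

• Assuming we know the correspondences, how do we get the transformation?

[Figure: matched points $(x_i, y_i)$ and $(x_i', y_i')$ in the two images]

$$\begin{bmatrix} x_i' \\ y_i' \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \end{bmatrix}$$

Stacking two rows per correspondence gives a linear system in the six unknowns:

$$\begin{bmatrix} & & \cdots & & & \\ x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \\ & & \cdots & & & \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_1 \\ t_2 \end{bmatrix} = \begin{bmatrix} \cdots \\ x_i' \\ y_i' \\ \cdots \end{bmatrix}$$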
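The stacked system can be built and solved directly. The following NumPy sketch assumes the correspondences are already known and outlier-free; `fit_affine` is an illustrative name, and the test transformation is invented for the example.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares affine fit from matched points.

    src, dst: sequences of (x, y) pairs, where dst[i] is the match of src[i].
    Returns (M, t) with dst ~= M @ src + t, M = [[m1, m2], [m3, m4]].
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)
    A = np.zeros((2 * n, 6))
    A[0::2, 0:2] = src      # rows [x_i, y_i, 0, 0, 1, 0]
    A[0::2, 4] = 1.0
    A[1::2, 2:4] = src      # rows [0, 0, x_i, y_i, 0, 1]
    A[1::2, 5] = 1.0
    b = dst.reshape(-1)     # [x_1', y_1', x_2', y_2', ...]
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    m1, m2, m3, m4, t1, t2 = params
    return np.array([[m1, m2], [m3, m4]]), np.array([t1, t2])


# Synthetic check: generate matches from a known transformation, then recover it.
M_true = np.array([[2.0, 0.5], [-1.0, 3.0]])
t_true = np.array([4.0, -2.0])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
dst = src @ M_true.T + t_true
M_est, t_est = fit_affine(src, dst)

# Mapping a new point is then just M @ p + t.
new_point = M_est @ np.array([1.0, 1.0]) + t_est
```

Each match contributes two equations, so three non-collinear matches are the minimum for the six parameters; additional matches turn the solve into a least-squares fit.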

60


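Fitting an affine transformation

$$\begin{bmatrix} & & \cdots & & & \\ x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \\ & & \cdots & & & \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_1 \\ t_2 \end{bmatrix} = \begin{bmatrix} \cdots \\ x_i' \\ y_i' \\ \cdots \end{bmatrix}$$

• How many matches (correspondence pairs) do we need to solve for the transformation parameters?

• Once we have solved for the parameters, how do we compute the coordinates of the corresponding point for $(x_{new}, y_{new})$?

• Where do the matches come from?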

Kristen Grauman

61

What are the correspondences?

• Compare content in local patches, find best matches.
  e.g., simplest approach: scan with template, and compute SSD or correlation between list of pixel intensities in the patch

• Later in the course: how to select regions according to the geometric changes, and more robust descriptors.
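The "scan with template and compute SSD" idea looks roughly like this in plain Python. It is a deliberately naive sketch on small nested lists; the names `ssd` and `best_match` are illustrative, and a practical version would use vectorized array operations.

```python
def ssd(patch_a, patch_b):
    """Sum of squared differences between two equally sized patches."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(patch_a, patch_b)
               for a, b in zip(row_a, row_b))


def best_match(image, template):
    """Slide the template over the image; return (lowest SSD, (row, col))."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best = None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = [row[c:c + tw] for row in image[r:r + th]]
            score = ssd(window, template)
            if best is None or score < best[0]:
                best = (score, (r, c))
    return best


image = [[0, 0, 0, 0],
         [0, 0, 5, 6],
         [0, 0, 7, 8],
         [0, 0, 0, 0]]
template = [[5, 6],
            [7, 8]]
score, location = best_match(image, template)
```

Raw SSD is sensitive to lighting and geometric changes between the two images, which is the motivation for the more robust descriptors mentioned above.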

Kristen Grauman

62

Fitting an affine transformation

Figures from David Lowe, ICCV 1999

Affine model approximates perspective projection of planar objects.

63




Outliers

• Outliers can hurt the quality of our parameter estimates, e.g.,
  – an erroneous pair of matching points from two images
  – an edge point that is noise, or doesn’t belong to the line we are fitting.

Kristen Grauman

65

Outliers affect least squares fit

66




RANSAC

• RANdom Sample Consensus

• Approach: we want to avoid the impact of outliers, so let’s look for “inliers”, and use those only.

• Intuition: if an outlier is chosen to compute the current fit, then the resulting line won’t have much support from rest of the points.

68


RANSAC: General form

• RANSAC loop:

1. Randomly select a seed group of points on which to base transformation estimate (e.g., a group of matches)

2. Compute transformation from seed group

3. Find inliers to this transformation

4. If the number of inliers is sufficiently large, re-compute estimate of transformation on all of the inliers

• Keep the transformation with the largest number of inliers

69


RANSAC for line fitting example

Source: R. Raguram
Lana Lazebnik

[Figure sequence, one panel per step:]

Least-squares fit

1. Randomly select minimal subset of points

2. Hypothesize a model

3. Compute error function

4. Select points consistent with model

5. Repeat hypothesize-and-verify loop

Uncontaminated sample

RANSAC for line fitting

Repeat N times:
• Draw s points uniformly at random
• Fit line to these s points
• Find inliers to this line among the remaining points (i.e., points whose distance from the line is less than t)
• If there are d or more inliers, accept the line and refit using all inliers
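The loop above can be sketched in plain Python for s = 2. Several choices here are assumptions rather than part of the slides: lines are stored as (a, b, c) with ax + by + c = 0 and a^2 + b^2 = 1, the refit is a total least-squares fit, the two sampled points are counted among the inliers for simplicity, and the function names and the `seed` parameter are made up.

```python
import math
import random


def line_through(p, q):
    """Unit-normal line (a, b, c), a*x + b*y + c = 0, through two distinct points."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def fit_total_least_squares(points):
    """Line minimizing the sum of squared perpendicular distances to the points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)   # direction of the line
    a, b = -math.sin(theta), math.cos(theta)       # unit normal to that direction
    return a, b, -(a * mx + b * my)


def distance(line, point):
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c)


def ransac_line(points, n_iters, t, d, seed=0):
    """Return (line, inliers) for the hypothesis with the most inliers, or (None, [])."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(n_iters):
        p, q = rng.sample(points, 2)               # minimal sample, s = 2
        if p == q:
            continue                               # duplicate coordinates: no line
        candidate = line_through(p, q)
        inliers = [pt for pt in points if distance(candidate, pt) < t]
        if len(inliers) >= d and len(inliers) > len(best_inliers):
            best_line = fit_total_least_squares(inliers)   # refit on all inliers
            best_inliers = inliers
    return best_line, best_inliers


# Ten points on y = x plus three gross outliers.
points = [(float(i), float(i)) for i in range(10)] + [(0.0, 9.0), (9.0, 0.0), (2.0, 8.0)]
line, inliers = ransac_line(points, n_iters=50, t=0.1, d=5)
```

The four knobs N, s, t and d in the slide map onto `n_iters`, the sample size of 2, `t` and `d`; choosing them well is exactly the tuning burden listed under the cons of RANSAC.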

Lana Lazebnik

80

RANSAC pros and cons

• Pros
  • Simple and general
  • Applicable to many different problems
  • Often works well in practice

• Cons
  • Lots of parameters to tune
  • Doesn’t work well for low inlier ratios (too many iterations, or can fail completely)
  • Can’t always get a good initialization of the model based on the minimum number of samples

Lana Lazebnik

81



Coming up: alignment and image stitching


Questions?

See you Tuesday!

Slide credit: Devi Parikh
