Computer Vision Image pyramids Ali Borji UWM
95

Computer Visioncrcv.ucf.edu/people/faculty/xBorji/www/lectures/6.pdf1/2 1/4 (2x zoom) 1/8 (4x zoom) Slide credit: ... filter the image with eye patch ... x2 = G1 x1! x3 = G2 x2!

Mar 14, 2018


buitram
Transcript
Page 1:

Computer Vision

Image pyramids

Ali Borji, UWM

Page 2:

• This image is too big to fit on the screen. How can we reduce it?

• How to generate a half-sized version?

Slide credit: S. Seitz

Image Scaling

Page 3:

Image Sub-Sampling

Throw away every other row and column to create a 1/2 size image

- called image sub-sampling

1/4

1/8

Slide credit: S. Seitz
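Naive sub-sampling as described on this slide can be sketched in a few lines. This is a minimal illustration assuming the image is a NumPy array; the function name `subsample` and the toy 8x8 array are illustrative and do not come from the slides.

```python
import numpy as np

def subsample(image: np.ndarray) -> np.ndarray:
    """Keep every other row and column (no pre-filtering)."""
    return image[::2, ::2]

image = np.arange(64).reshape(8, 8)   # toy 8x8 "image"
half = subsample(image)               # 1/2 size: 4x4
quarter = subsample(half)             # 1/4 size: 2x2
```

Applying the function repeatedly gives the 1/2, 1/4, 1/8 sequence shown on the slide.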

Page 4:

Image Sub-Sampling

1/2    1/4 (2x zoom)    1/8 (4x zoom)

Slide credit: S. Seitz

Page 5:

Good and Bad Sampling

Good sampling:
• Sample often, or
• Sample wisely

Bad sampling:
• Aliasing!

Slide credit: S. Narasimhan

Page 6:

Aliasing

Slide credit: F. Durand

Page 7:

Aliasing

• Occurs when your sampling rate is not high enough to capture the amount of detail in your image

• Can give you the wrong signal/image: an alias

• To do sampling right, you need to understand the structure of your signal/image

• Enter Monsieur Fourier…

• To avoid aliasing:
  - sampling rate ≥ 2 * max frequency in the image
  - said another way: ≥ two samples per cycle
  - this minimum sampling rate is called the Nyquist rate

Slide credit: L. Zhang
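A small 1D experiment makes the Nyquist condition concrete. The sketch below assumes a sampling rate of 8 samples per second (an arbitrary choice for illustration): a 7 Hz cosine would need at least 14 samples per second, so its samples coincide with those of a 1 Hz cosine, which is its alias.

```python
import numpy as np

fs = 8.0                      # sampling rate in samples per second (assumed)
n = np.arange(16)             # sample indices
t = n / fs                    # sample times

high = np.cos(2 * np.pi * 7.0 * t)   # 7 Hz signal: would need fs >= 14
alias = np.cos(2 * np.pi * 1.0 * t)  # 1 Hz signal: fine at fs = 8

# At fs = 8 the two sampled sequences are indistinguishable.
same_samples = np.allclose(high, alias)
```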

Page 8:

Aliasing

• When downsampling by a factor of two
  - Original image has frequencies that are too high

• How can we fix this?

Slide credit: N. Snavely

Page 9:

Gaussian pre-filtering

Gaussian 1/2    G 1/4    G 1/8

• Solution: filter the image, then subsample

Slide credit: S. Seitz

Page 10:

Subsampling with Gaussian pre-filtering

Gaussian 1/2    G 1/4    G 1/8

• Solution: filter the image, then subsample

Slide credit: S. Seitz

Page 11:

Compare with...

1/2    1/4 (2x zoom)    1/8 (4x zoom)

Slide credit: S. Seitz

Page 12:

Gaussian pre-filtering

• Solution: filter the image, then subsample

F0 → blur (F0 * H) → subsample → F1 → blur (F1 * H) → subsample → F2 → …

Slide credit: N. Snavely

Page 13:

F0 → blur (F0 * H) → subsample → F1 → blur (F1 * H) → subsample → F2 → …

Gaussian pyramid: F0, F1, F2, …

Slide credit: N. Snavely
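The blur-then-subsample chain F0, F1, F2, … can be sketched as follows. This is a minimal version under stated assumptions: the low-pass filter H is taken to be the separable binomial kernel [1 4 6 4 1]/16 used later in the lecture, borders are handled by reflect padding, and the function names are illustrative.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # binomial approximation to a Gaussian

def blur(image: np.ndarray) -> np.ndarray:
    """Separable low-pass filter: rows first, then columns (reflect padding)."""
    padded = np.pad(image, 2, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, KERNEL, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, KERNEL, mode="valid")

def gaussian_pyramid(image: np.ndarray, levels: int) -> list:
    """Return [F0, F1, ...] where F(i+1) = subsample(blur(Fi))."""
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid

pyramid = gaussian_pyramid(np.random.rand(64, 64), levels=4)
shapes = [level.shape for level in pyramid]
```

Each level has half the width and height of the previous one, so the whole pyramid costs only about 4/3 of the storage of the original image.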

Page 14:

Template matching

• Goal: find [template patch] in image

• Main challenge: What is a good similarity or distance measure between two patches?
  – Correlation
  – Zero-mean correlation
  – Sum of squared differences (SSD)
  – Normalized cross-correlation

Page 15:

Matching with filters

• Goal: find [template patch] in image
• Method 0: filter the image with eye patch

Input    Filtered Image

$$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$$

f = image, g = filter

What went wrong?

Page 16:

Matching with filters

• Goal: find [template patch] in image
• Method 1: filter the image with zero-mean eye

Input    Filtered Image (scaled)    Thresholded Image

$$h[m,n] = \sum_{k,l} \left(g[k,l] - \bar{g}\right) f[m+k,\, n+l]$$

$\bar{g}$ = mean of the template g (f = image, g = filter)

True detections    False detections
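Method 1 can be sketched with NumPy as below. This is an illustrative version, not the code behind the slide: it evaluates only positions where the template fits entirely inside the image, and the names `zero_mean_correlation`, `f`, and `g` follow the slide's convention of f = image and g = filter.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def zero_mean_correlation(f: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Correlate image f with the mean-subtracted template g ("valid" positions only)."""
    windows = sliding_window_view(f, g.shape)        # shape (M, N, K, L)
    return np.einsum("mnkl,kl->mn", windows, g - g.mean())

rng = np.random.default_rng(0)
f = rng.random((20, 20))          # image
g = f[5:10, 7:12].copy()          # template cut from the image at (5, 7)
h = zero_mean_correlation(f, g)
```

Because the filter sums to zero, adding a constant to the image leaves the response unchanged, but bright high-contrast regions can still produce large responses, which is one source of the false detections noted on the slide.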

Page 17:

Matching with filters

• Goal: find [template patch] in image
• Method 2: SSD

Input    1 - sqrt(SSD)    Thresholded Image

$$h[m,n] = \sum_{k,l} \left(g[k,l] - f[m+k,\, n+l]\right)^2$$

True detections
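A direct, unoptimized sketch of the SSD measure, assuming NumPy arrays and evaluating only positions where the template lies fully inside the image (the function name `ssd` is illustrative):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def ssd(f: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Sum of squared differences between template g and every window of image f."""
    windows = sliding_window_view(f, g.shape)        # shape (M, N, K, L)
    return ((windows - g) ** 2).sum(axis=(2, 3))

rng = np.random.default_rng(0)
f = rng.random((20, 20))          # image
g = f[5:10, 7:12].copy()          # template cut from the image at (5, 7)
h = ssd(f, g)
best = np.unravel_index(h.argmin(), h.shape)   # location of the smallest SSD
```

Unlike correlation, a good match is a minimum of h, which is why the slide displays 1 - sqrt(SSD).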

Page 18:

Matching with filters

• Goal: find [template patch] in image
• Method 2: SSD

Input    1 - sqrt(SSD)

$$h[m,n] = \sum_{k,l} \left(g[k,l] - f[m+k,\, n+l]\right)^2$$

What’s the potential downside of SSD?

Page 19:

Matching with filters

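• Goal: find [template patch] in image
• Method 3: Normalized cross-correlation

$$h[m,n] = \frac{\sum_{k,l} \left(g[k,l] - \bar{g}\right)\left(f[m+k,\, n+l] - \bar{f}_{m,n}\right)}{\left(\sum_{k,l} \left(g[k,l] - \bar{g}\right)^2 \sum_{k,l} \left(f[m+k,\, n+l] - \bar{f}_{m,n}\right)^2\right)^{0.5}}$$

$\bar{g}$ = mean template, $\bar{f}_{m,n}$ = mean image patch

Matlab: normxcorr2(template, im)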
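For readers without Matlab, normalized cross-correlation can be sketched in NumPy as below. This is not a reimplementation of `normxcorr2`: it computes only the positions where the template fits fully inside the image (Matlab returns a larger, padded result), and it assumes no image window is perfectly constant, which would make the denominator zero.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def ncc(f: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Normalized cross-correlation of template g with every window of image f."""
    windows = sliding_window_view(f, g.shape)                  # shape (M, N, K, L)
    gz = g - g.mean()                                          # zero-mean template
    wz = windows - windows.mean(axis=(2, 3), keepdims=True)    # zero-mean image patches
    numerator = (wz * gz).sum(axis=(2, 3))
    denominator = np.sqrt((gz ** 2).sum() * (wz ** 2).sum(axis=(2, 3)))
    return numerator / denominator

rng = np.random.default_rng(0)
f = rng.random((20, 20))          # image
g = f[5:10, 7:12].copy()          # template cut from the image at (5, 7)
h = ncc(f, g)
best = np.unravel_index(h.argmax(), h.shape)
```

Scores lie in [-1, 1], and rescaling the image intensities (for example `2 * f + 3`) leaves them unchanged, which is the invariance to local average intensity and contrast.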

Page 20:

Matching with filters

• Goal: find [template patch] in image
• Method 3: Normalized cross-correlation

Input    Normalized X-Correlation    Thresholded Image

True detections

Page 21:

Matching with filters

• Goal: find [template patch] in image
• Method 3: Normalized cross-correlation

Input    Normalized X-Correlation    Thresholded Image

True detections

Page 22:

Q: What is the best method to use?

A: Depends
• SSD: faster, sensitive to overall intensity
• Normalized cross-correlation: slower, invariant to local average intensity and contrast
• But really, neither of these baselines is representative of modern recognition.

Page 23:

Image pyramids

Image information occurs at all spatial scales

• Gaussian pyramid
• Laplacian pyramid
• Wavelet/QMF pyramid
• Steerable pyramid

Slide credit: B. Freeman and A. Torralba

Page 24:

The Gaussian pyramid

• Smooth with Gaussians, because
  – a Gaussian * Gaussian = another Gaussian
• Gaussians are low-pass filters, so the representation is redundant.

Slide credit: B. Freeman and A. Torralba
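The closure property can be stated concretely. For continuous, zero-mean Gaussian kernels with standard deviations $\sigma_1$ and $\sigma_2$ (notation introduced here for illustration, not taken from the slide), convolution adds the variances:

```latex
G_{\sigma_1} * G_{\sigma_2} = G_{\sigma}, \qquad \sigma^2 = \sigma_1^2 + \sigma_2^2
```

So blurring an already blurred level is equivalent to a single, wider Gaussian blur of the original image.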

Page 25:

The computational advantage of pyramids

[Burt and Adelson, 1983]

Slide credit: B. Freeman and A. Torralba

Page 26:

The Gaussian Pyramid

[Burt and Adelson, 1983]

Slide credit: B. Freeman and A. Torralba

Page 27:

Slide credit: B. Freeman and A. Torralba

Page 28:

Convolution and subsampling as a matrix multiply (1D case)

1 4 6 4 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 1 4 6 4 1 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 1 4 6 4 1 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 1 4 6 4 1 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 1 4 6 4 1 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 1 4 6 4 1 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 1 4 6 4 1 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 4 6 4 1 0

(Normalization constant of 1/16 omitted for visual clarity.)

Slide credit: B. Freeman and A. Torralba
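The matrix on this slide can be generated and checked numerically. The sketch below builds it row by row and confirms that multiplying by it matches filtering with [1 4 6 4 1] followed by keeping every other sample. The helper name `blur_downsample_matrix` and the test signal are illustrative; the 1/16 normalization is omitted, as on the slide.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1])  # 1/16 normalization omitted, as on the slide

def blur_downsample_matrix(n_in: int, n_out: int) -> np.ndarray:
    """Row i holds the kernel starting at column 2*i (clipped at the right edge)."""
    matrix = np.zeros((n_out, n_in), dtype=int)
    for i in range(n_out):
        for j, weight in enumerate(KERNEL):
            if 2 * i + j < n_in:
                matrix[i, 2 * i + j] = weight
    return matrix

G1 = blur_downsample_matrix(20, 8)           # the 8 x 20 matrix of this slide
x = np.arange(20) ** 2                       # any 1D signal of length 20
via_matrix = G1 @ x
via_filter = np.convolve(x, KERNEL, mode="valid")[::2]
```

Shifting each row by two columns instead of one is what implements the subsampling.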

Page 29:

Next pyramid level

1 4 6 4 1 0 0 0

0 0 1 4 6 4 1 0

0 0 0 0 1 4 6 4

0 0 0 0 0 0 1 4

Slide credit: B. Freeman and A. Torralba

Page 30:

The combined effect of the two pyramid levels

1 4 10 20 31 40 44 40 31 20 10 4 1 0 0 0 0 0 0 0

0 0 0 0 1 4 10 20 31 40 44 40 31 20 10 4 1 0 0 0

0 0 0 0 0 0 0 0 1 4 10 20 31 40 44 40 30 16 4 0

0 0 0 0 0 0 0 0 0 0 0 0 1 4 10 20 25 16 4 0

Slide credit: B. Freeman and A. Torralba
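The combined matrix is simply the product of the two single-level matrices. A short check, assuming the same unnormalized [1 4 6 4 1] kernel and the right-edge clipping visible in the 4 x 8 matrix of the previous slide (the helper name is illustrative):

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1])  # normalization omitted, as on the slides

def blur_downsample_matrix(n_in: int, n_out: int) -> np.ndarray:
    """Row i holds the kernel starting at column 2*i (clipped at the right edge)."""
    matrix = np.zeros((n_out, n_in), dtype=int)
    for i in range(n_out):
        for j, weight in enumerate(KERNEL):
            if 2 * i + j < n_in:
                matrix[i, 2 * i + j] = weight
    return matrix

G1 = blur_downsample_matrix(20, 8)   # first pyramid level: 20 -> 8 samples
G2 = blur_downsample_matrix(8, 4)    # next pyramid level: 8 -> 4 samples
combined = G2 @ G1                   # 4 x 20 matrix mapping the signal to level 2
```

Each row of `combined` is a wider, roughly Gaussian-shaped kernel, which illustrates that repeated blurring with a small filter acts like one blur with a larger filter.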

Page 31:

Slide credit: B. Freeman and A. Torralba

Page 32:

Gaussian pyramids used for

• Up- or down-sampling images
• Multi-resolution image analysis
  – Look for an object over various spatial scales
  – Coarse-to-fine image processing: form a blur estimate or perform motion analysis on a very low-resolution image, upsample, and repeat. Often a successful strategy for avoiding local minima in complicated estimation tasks.

Slide credit: B. Freeman and A. Torralba

Page 33:

1D Gaussian pyramid matrix, for [1 4 6 4 1] low-pass filter

full-band image, highest resolution

lower-resolution image

lowest resolution image

Slide credit: B. Freeman and A. Torralba

Page 34:

Image pyramids

Image information occurs at all spatial scales

• Gaussian pyramid
• Laplacian pyramid
• Wavelet/QMF pyramid
• Steerable pyramid

Slide credit: B. Freeman and A. Torralba

Page 35:

The Laplacian Pyramid

• Synthesis
  – Compute the difference between the upsampled Gaussian pyramid level and the Gaussian pyramid level.
  – Band-pass filter: each level represents spatial frequencies (largely) unrepresented at other levels.

Slide credit: B. Freeman and A. Torralba

Page 36:

The Laplacian Pyramid


Page 37:

Upsampling

6 1 0 0

4 4 0 0

1 6 1 0

0 4 4 0

0 1 6 1

0 0 4 4

0 0 1 6

0 0 0 4

Insert zeros between pixels, then apply a low-pass filter, [1 4 6 4 1]

Slide credit: B. Freeman and A. Torralba
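The 8 x 4 upsampling matrix on this slide can be reproduced by centering the kernel on every second output sample. The sketch below also checks that it equals zero insertion followed by filtering. Names are illustrative, and the normalization (1/8 would preserve the mean away from the borders) is omitted, as on the slide.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1])  # normalization omitted, as on the slide

def upsample_matrix(n_in: int) -> np.ndarray:
    """Matrix that inserts zeros between samples and then low-pass filters."""
    n_out = 2 * n_in
    matrix = np.zeros((n_out, n_in), dtype=int)
    for c in range(n_in):
        for j, weight in enumerate(KERNEL):
            r = 2 * c + j - 2          # kernel centered on output sample 2*c
            if 0 <= r < n_out:
                matrix[r, c] = weight
    return matrix

F = upsample_matrix(4)                      # the 8 x 4 matrix of this slide
x = np.array([3, 1, 4, 1])                  # any low-resolution signal
zero_stuffed = np.zeros(8, dtype=int)
zero_stuffed[::2] = x                       # [3, 0, 1, 0, 4, 0, 1, 0]
via_filter = np.convolve(zero_stuffed, KERNEL, mode="same")
```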

Page 38:

Showing, at full resolution, the information captured at each level of a Gaussian (top) and Laplacian (bottom) pyramid.

Slide credit: B. Freeman and A. Torralba

Page 39:

Laplacian pyramid reconstruction algorithm: recover x1 from L1, L2, L3 and x4

G# is the blur-and-downsample operator at pyramid level #
F# is the blur-and-upsample operator at pyramid level #

Laplacian pyramid elements:
L1 = (I – F1 G1) x1
L2 = (I – F2 G2) x2
L3 = (I – F3 G3) x3
x2 = G1 x1
x3 = G2 x2
x4 = G3 x3

Reconstruction of original image (x1) from Laplacian pyramid elements:
x3 = L3 + F3 x4
x2 = L2 + F2 x3
x1 = L1 + F1 x2

Slide credit: B. Freeman and A. Torralba
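The algebra on this slide can be verified numerically on a 1D signal. The sketch below assumes particular operators, a centered [1 4 6 4 1]/16 kernel for G and F = 2 Gᵀ for the upsampler; these are illustrative choices, and reconstruction is exact for any pair G, F because each L stores exactly what F G discards.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0

def down_op(n: int) -> np.ndarray:
    """Blur-and-downsample operator G mapping length n to length n // 2."""
    m = np.zeros((n // 2, n))
    for i in range(n // 2):
        for j, w in enumerate(KERNEL):
            col = 2 * i + j - 2            # kernel centered on input sample 2*i
            if 0 <= col < n:
                m[i, col] = w
    return m

def up_op(n: int) -> np.ndarray:
    """Blur-and-upsample operator F mapping length n // 2 back to length n."""
    return 2 * down_op(n).T

x1 = np.random.default_rng(0).random(32)
x2 = down_op(32) @ x1                      # x2 = G1 x1
x3 = down_op(16) @ x2                      # x3 = G2 x2
x4 = down_op(8) @ x3                       # x4 = G3 x3
L1 = x1 - up_op(32) @ x2                   # L1 = (I - F1 G1) x1
L2 = x2 - up_op(16) @ x3                   # L2 = (I - F2 G2) x2
L3 = x3 - up_op(8) @ x4                    # L3 = (I - F3 G3) x3

r3 = L3 + up_op(8) @ x4                    # reconstruct coarse to fine
r2 = L2 + up_op(16) @ r3
r1 = L1 + up_op(32) @ r2
```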

Page 40:

Laplacian pyramid reconstruction algorithm: recover x1 from L1, L2, L3 and g3

+

++

Slide credit: B. Freeman and A. Torralba

Page 41:

Slide credit: B. Freeman and A. Torralba

Page 42:

Slide credit: B. Freeman and A. Torralba

Page 43:

1D Laplacian pyramid matrix, for [1 4 6 4 1] low-pass filter

high frequencies

mid-band frequencies

low frequencies

Slide credit: B. Freeman and A. Torralba

Page 44:

Laplacian pyramid applications

• Texture synthesis
• Image compression
• Noise removal

Slide credit: B. Freeman and A. Torralba

Page 45:

Image blending

Slide credit: B. Freeman and A. Torralba

Page 46:

Szeliski, Computer Vision, 2010

Slide credit: B. Freeman & A. Torralba

Page 47:

Image blending

• Build Laplacian pyramid for both images: LA, LB
• Build Gaussian pyramid for mask: G
• Build a combined Laplacian pyramid: L(j) = G(j) LA(j) + (1-G(j)) LB(j)
• Collapse L to obtain the blended image

Slide credit: B. Freeman and A. Torralba
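The four blending steps can be sketched on 1D signals, which keeps the code short; the 2D case applies the same steps with a separable filter. Everything here is an illustrative assumption rather than the lecture's code: the [1 4 6 4 1] kernel, reflect padding, signal lengths divisible by 2^(levels-1), and all function names.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=float)

def down(x):
    """Blur with KERNEL/16 (reflect padding), then keep every other sample."""
    return np.convolve(np.pad(x, 2, mode="reflect"), KERNEL / 16, mode="valid")[::2]

def up(x, n):
    """Insert zeros to reach length n, then blur with KERNEL/8."""
    stuffed = np.zeros(n)
    stuffed[::2] = x
    return np.convolve(np.pad(stuffed, 2, mode="reflect"), KERNEL / 8, mode="valid")

def gaussian_pyramid(x, levels):
    pyr = [np.asarray(x, dtype=float)]
    for _ in range(levels - 1):
        pyr.append(down(pyr[-1]))
    return pyr

def laplacian_pyramid(x, levels):
    g = gaussian_pyramid(x, levels)
    bands = [g[i] - up(g[i + 1], len(g[i])) for i in range(levels - 1)]
    return bands + [g[-1]]                 # last entry is the low-pass residual

def collapse(pyr):
    x = pyr[-1]
    for band in reversed(pyr[:-1]):
        x = band + up(x, len(band))
    return x

def blend(a, b, mask, levels=4):
    """L(j) = G(j) LA(j) + (1 - G(j)) LB(j), then collapse."""
    la, lb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    gm = gaussian_pyramid(mask, levels)
    return collapse([m * pa + (1 - m) * pb for m, pa, pb in zip(gm, la, lb)])

n = 64
a, b = np.ones(n), np.zeros(n)
mask = (np.arange(n) < n // 2).astype(float)   # 1 on the left half, 0 on the right
blended = blend(a, b, mask)
```

Because the mask is blurred more at coarser levels, low frequencies are mixed over a wide region and high frequencies over a narrow one, which avoids a visible seam.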

Page 48:


Page 49:

Image pyramids

Image information occurs at all spatial scales

• Gaussian pyramid
• Laplacian pyramid
• Wavelet/QMF pyramid
• Steerable pyramid

Slide credit: B. Freeman and A. Torralba

Page 50:

Linear transforms

Linear transform

Vectorized image

transformed image

$\vec{f} = U^{-1} \vec{F}$

Note: not all important transforms need to have an inverse

Slide credit: A. Torralba

Page 51:

U = [ 1  1 ]      U^-1 = [ 0.5  0.5 ]
    [ 1 -1 ]             [ 0.5 -0.5 ]

The simplest set of functions:

Haar transform

Slide credit: A. Torralba

Page 52:

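The simplest set of functions:

U = [ 1  1 ]      U^-1 = [ 0.5  0.5 ]
    [ 1 -1 ]             [ 0.5 -0.5 ]

To code a signal, repeat at several locations:

U = [ 1  1  0  0  0  0  0  0 ]
    [ 1 -1  0  0  0  0  0  0 ]
    [ 0  0  1  1  0  0  0  0 ]
    [ 0  0  1 -1  0  0  0  0 ]
    [ 0  0  0  0  1  1  0  0 ]
    [ 0  0  0  0  1 -1  0  0 ]
    [ 0  0  0  0  0  0  1  1 ]
    [ 0  0  0  0  0  0  1 -1 ]

U^-1 = ½ U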

Haar transform

Slide credit: A. Torralba

Page 53:

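[ 1  1  0  0  0  0  0  0 ]
[ 1 -1  0  0  0  0  0  0 ]
[ 0  0  1  1  0  0  0  0 ]
[ 0  0  1 -1  0  0  0  0 ]
[ 0  0  0  0  1  1  0  0 ]
[ 0  0  0  0  1 -1  0  0 ]
[ 0  0  0  0  0  0  1  1 ]
[ 0  0  0  0  0  0  1 -1 ]

Reordering rows:

[ 1  1  0  0  0  0  0  0 ]   Low pass
[ 0  0  1  1  0  0  0  0 ]
[ 0  0  0  0  1  1  0  0 ]
[ 0  0  0  0  0  0  1  1 ]
[ 1 -1  0  0  0  0  0  0 ]   High pass
[ 0  0  1 -1  0  0  0  0 ]
[ 0  0  0  0  1 -1  0  0 ]
[ 0  0  0  0  0  0  1 -1 ]

Apply the same decomposition to the Low pass component:

[ 1  1  0  0 ]   [ 1 1 0 0 0 0 0 0 ]   [ 1  1  1  1  0  0  0  0 ]
[ 1 -1  0  0 ] × [ 0 0 1 1 0 0 0 0 ] = [ 1  1 -1 -1  0  0  0  0 ]
[ 0  0  1  1 ]   [ 0 0 0 0 1 1 0 0 ]   [ 0  0  0  0  1  1  1  1 ]
[ 0  0  1 -1 ]   [ 0 0 0 0 0 0 1 1 ]   [ 0  0  0  0  1  1 -1 -1 ]

And repeat the same operation to the low pass component, until length 1.

Note: each subband is sub-sampled and has aliased signal components.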

Haar transform

Slide credit: A. Torralba

Page 54:

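The entire process can be written as a single matrix:

[ 1  1  1  1  1  1  1  1 ]   Average
[ 1  1  1  1 -1 -1 -1 -1 ]   Multiscale derivatives
[ 1  1 -1 -1  0  0  0  0 ]
[ 0  0  0  0  1  1 -1 -1 ]
[ 1 -1  0  0  0  0  0  0 ]
[ 0  0  1 -1  0  0  0  0 ]
[ 0  0  0  0  1 -1  0  0 ]
[ 0  0  0  0  0  0  1 -1 ]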

Haar transform

Slide credit: A. Torralba

Page 55:

U =
[ 1  1  1  1  1  1  1  1 ]
[ 1  1  1  1 -1 -1 -1 -1 ]
[ 1  1 -1 -1  0  0  0  0 ]
[ 0  0  0  0  1  1 -1 -1 ]
[ 1 -1  0  0  0  0  0  0 ]
[ 0  0  1 -1  0  0  0  0 ]
[ 0  0  0  0  1 -1  0  0 ]
[ 0  0  0  0  0  0  1 -1 ]

U^-1 =
[ 0.125  0.125  0.25  0     0.5  0    0    0   ]
[ 0.125  0.125  0.25  0    -0.5  0    0    0   ]
[ 0.125  0.125 -0.25  0     0    0.5  0    0   ]
[ 0.125  0.125 -0.25  0     0   -0.5  0    0   ]
[ 0.125 -0.125  0     0.25  0    0    0.5  0   ]
[ 0.125 -0.125  0     0.25  0    0   -0.5  0   ]
[ 0.125 -0.125  0    -0.25  0    0    0    0.5 ]
[ 0.125 -0.125  0    -0.25  0    0    0   -0.5 ]

Properties:
• Orthogonal decomposition
• Perfect reconstruction
• Critically sampled

Haar transform

Slide credit: A. Torralba
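The 8 x 8 Haar matrix and its inverse can be generated recursively, mirroring the "apply the same decomposition to the low pass component" construction. This is a sketch with an illustrative function name; it uses the unnormalized rows shown on the slides, so the rows are orthogonal but not unit length.

```python
import numpy as np

def haar_matrix(n: int) -> np.ndarray:
    """Unnormalized Haar matrix for n a power of two (average first, then details coarse to fine)."""
    if n == 1:
        return np.ones((1, 1))
    coarse = np.kron(haar_matrix(n // 2), [1, 1])   # previous level applied to pair sums
    detail = np.kron(np.eye(n // 2), [1, -1])       # finest-scale pair differences
    return np.vstack([coarse, detail])

U = haar_matrix(8)
U_inv = np.linalg.inv(U)
gram = U @ U.T          # diagonal because the rows are mutually orthogonal
```

Since `gram` is diagonal, the inverse is just the transpose with each column divided by the squared length of the corresponding row (8, 8, 4, 4, 2, 2, 2, 2), which is where the entries 0.125, 0.25, and 0.5 come from.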

Page 56:

2D Haar transform

Basic elements: column vectors [1 1]^T and [1 -1]^T, row vectors [1 1] and [1 -1]

Page 57:

2D Haar transform

Basic elements: column vectors [1 1]^T and [1 -1]^T, row vectors [1 1] and [1 -1]

[1 1]^T [1 1] = [1 1; 1 1]    Low pass (shown with a factor of 2 on the slide)

Page 58:

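2D Haar transform

Basic elements: column vectors [1 1]^T and [1 -1]^T, row vectors [1 1] and [1 -1]

[1 1]^T  [1 1]  = [1 1; 1 1]    Low pass (shown with a factor of 2 on the slide)
[1 1]^T  [1 -1] =
[1 -1]^T [1 1]  =
[1 -1]^T [1 -1] =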

Page 59:

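2D Haar transform

Basic elements: column vectors [1 1]^T and [1 -1]^T, row vectors [1 1] and [1 -1]

Each product is shown with a factor of 2 on the slide:

[1 1]^T  [1 1]  = [1 1; 1 1]      Low pass
[1 1]^T  [1 -1] = [1 -1; 1 -1]
[1 -1]^T [1 1]  = [1 1; -1 -1]
[1 -1]^T [1 -1] = [1 -1; -1 1]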

Page 60:

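2D Haar transform

Basic elements: column vectors [1 1]^T and [1 -1]^T, row vectors [1 1] and [1 -1]

Each product is shown with a factor of 2 on the slide:

[1 1]^T  [1 1]  = [1 1; 1 1]      Low pass
[1 1]^T  [1 -1] = [1 -1; 1 -1]    High pass vertical
[1 -1]^T [1 1]  = [1 1; -1 -1]    High pass horizontal
[1 -1]^T [1 -1] = [1 -1; -1 1]    High pass diagonal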

Page 61:

2D Haar transform

Basis images (each shown with a factor of 2 on the slide):

[1 1; 1 1]      Horizontal low pass, vertical low pass
[1 -1; 1 -1]    Horizontal high pass, vertical low pass
[1 1; -1 -1]    Horizontal low pass, vertical high pass
[1 -1; -1 1]    Horizontal high pass, vertical high pass

Sketch of the Fourier transform

Slide credit: B. Freeman and A. Torralba
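One level of the 2D Haar transform can be sketched by projecting every 2 x 2 block of the image onto the four basis images. The scaling by 1/2 is an assumption made here so that the basis is orthonormal; the function and key names are illustrative.

```python
import numpy as np

low = np.array([1, 1])
high = np.array([1, -1])

# Outer products of the 1D elements give the four 2 x 2 basis images.
basis = {
    "low-low": np.outer(low, low),                          # [[1, 1], [1, 1]]
    "horizontal high, vertical low": np.outer(low, high),   # [[1, -1], [1, -1]]
    "horizontal low, vertical high": np.outer(high, low),   # [[1, 1], [-1, -1]]
    "high-high": np.outer(high, high),                      # [[1, -1], [-1, 1]]
}

def haar2d_level(image):
    """Project each 2x2 block onto the four basis images (scaled by 1/2)."""
    blocks = image.reshape(image.shape[0] // 2, 2, image.shape[1] // 2, 2)
    return {name: np.einsum("iajb,ab->ij", blocks, b) / 2 for name, b in basis.items()}

def inverse_haar2d_level(subbands):
    """Rebuild the image: each block is the sum of coefficient * basis / 2."""
    first = next(iter(subbands.values()))
    out = np.zeros((first.shape[0], 2, first.shape[1], 2))
    for name, coeff in subbands.items():
        out += coeff[:, None, :, None] * basis[name][None, :, None, :] / 2
    return out.reshape(first.shape[0] * 2, first.shape[1] * 2)

image = np.random.default_rng(0).random((8, 8))
subbands = haar2d_level(image)
restored = inverse_haar2d_level(subbands)
```

The four 4 x 4 subbands together hold exactly 64 numbers, the same as the 8 x 8 input, so this representation is critically sampled.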

Page 62:

Simoncelli and Adelson, in “Subband coding”, Kluwer, 1990.

Pyramid cascade

Slide credit: B. Freeman and A. Torralba

Page 63:

Wavelet/QMF representation

[1 -1; 1 -1]    [1 1; -1 -1]    [1 -1; -1 1]

Same number of pixels!

Slide credit: B. Freeman and A. Torralba

Page 64:

Image pyramids

Image information occurs at all spatial scales

• Gaussian pyramid
• Laplacian pyramid
• Wavelet/QMF pyramid
• Steerable pyramid

Slide credit: B. Freeman and A. Torralba

Page 65:

Steerable Pyramid

Images from: http://www.cis.upenn.edu/~eero/steerpyr.html

2 Level decomposition of white circle example:

Low pass residual

Subbands

Slide credit: B. Freeman and A. Torralba

Page 66:

Steerable Pyramid

Images from: http://www.cis.upenn.edu/~eero/steerpyr.html

… …

Slide credit: B. Freeman and A. Torralba

We may combine Steerability with Pyramids to get a Steerable Laplacian Pyramid as shown below

Decomposition Reconstruction

Page 67:

Steerable Pyramid

We may combine Steerability with Pyramids to get a Steerable Laplacian Pyramid as shown below

Images from: http://www.cis.upenn.edu/~eero/steerpyr.html

Decomposition Reconstruction

Slide credit: B. Freeman and A. Torralba

Page 68:

Steerable Pyramid

But we need to get rid of the corner regions before starting the recursive circular filtering

Simoncelli and Freeman, ICIP 1995

Slide credit: B. Freeman and A. Torralba

Page 69:

Reprinted from “Shiftable MultiScale Transforms,” by Simoncelli et al., IEEE Transactions on Information Theory, 1992, copyright 1992, IEEE

There is also a high pass residual…

Slide credit: B. Freeman and A. Torralba

Page 70:


Page 71:

Monroe


Page 72:

Dog or cat?


Page 73:

Almost no dog information


Page 74:

• Summary of pyramid representations


Page 75:

Image pyramids

• Gaussian: Progressively blurred and subsampled versions of the image. Adds scale invariance to fixed-size algorithms.

• Laplacian: Shows the information added in Gaussian pyramid at each spatial scale. Useful for noise reduction & coding.

• Wavelet/QMF: Bandpassed representation, complete, but with aliasing and some non-oriented subbands.

• Steerable pyramid: Shows components at each scale and orientation separately. Non-aliased subbands. Good for texture and feature analysis. But overcomplete and with HF residual.

Page 76:

Schematic pictures of each matrix transform

Shown for 1-d images. The matrices for 2-d images are the same idea, but more complicated, to account for vertical, as well as horizontal, neighbor relationships.

Fourier transform, or Wavelet transform, or Steerable pyramid transform

Vectorized image    transformed image

Page 77:

Gaussian pyramid

= *pixel image

Overcomplete representation. Low-pass filters, sampled appropriately for their blur.

Gaussian pyramid

Slide credit: B. Freeman and A. Torralba

Page 78:

Laplacian pyramid

= *pixel image

Overcomplete representation. Transformed pixels represent bandpassed image information.

Laplacian pyramid

Slide credit: B. Freeman and A. Torralba

Page 79:

Wavelet (QMF) transform

= *pixel image

Ortho-normal transform (like Fourier transform), but with localized basis functions.

Wavelet pyramid

Slide credit: B. Freeman and A. Torralba

Page 80:

= *pixel image

Over-complete representation, but non-aliased subbands.

Steerable pyramid

Multiple orientations at one scale

Multiple orientations at the next

scale

the next scale…

Steerable pyramid

Slide credit: B. Freeman and A. Torralba

Page 81:

Why use image pyramids?

• Handle real-world size variations with a constant-size vision algorithm.
• Remove noise
• Analyze texture
• Recognize objects
• Label image features
• Image priors can be specified naturally in terms of wavelet pyramids.

Slide credit: B. Freeman and A. Torralba

Page 82:

Image representation

• Pixels: great for spatial resolution, poor access to frequency

• Fourier transform: great for frequency, not for spatial info

• Pyramids/filter banks: balance between spatial and frequency information

Slides credit: James Hayes

Page 83:

Major uses of image pyramids

• Compression

• Object detection
  – Scale search
  – Features

• Detecting stable interest points

• Registration
  – Coarse-to-fine

Slides credit: James Hayes

Page 84:

Application: Representing Texture

Source: Forsyth

Page 85:

Texture and Material

http://www-cvr.ai.uiuc.edu/ponce_grp/data/texture_database/samples/

Page 86:

Texture and Orientation

http://www-cvr.ai.uiuc.edu/ponce_grp/data/texture_database/samples/

Page 87:

Texture and Scale

http://www-cvr.ai.uiuc.edu/ponce_grp/data/texture_database/samples/

Page 88:

What is texture?

Regular or stochastic patterns caused by bumps, grooves, and/or markings

Page 89:

How can we represent texture?

• Compute responses of blobs and edges at various orientations and scales

Page 90:

Overcomplete representation: filter banks

LM Filter Bank

Code for filter banks: www.robots.ox.ac.uk/~vgg/research/texclass/filters.html

Page 91:

Filter banks

• Process image with each filter and keep responses (or squared/abs responses)

Page 92:

How can we represent texture?

• Measure responses of blobs and edges at various orientations and scales

• Idea 1: Record simple statistics (e.g., mean, std.) of absolute filter responses
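Idea 1 can be sketched with a toy filter bank. The three 3 x 3 filters below are hypothetical stand-ins chosen for brevity, not the LM filter bank mentioned in the lecture, and the function name `texture_descriptor` is illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# A tiny, hypothetical filter bank (not the LM bank): two edge filters and one blob filter.
FILTERS = {
    "vertical edge": np.array([[-1, 0, 1]] * 3, dtype=float),
    "horizontal edge": np.array([[-1, 0, 1]] * 3, dtype=float).T,
    "blob": np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=float),
}

def texture_descriptor(image: np.ndarray) -> np.ndarray:
    """Mean and std of absolute filter responses, concatenated over the bank."""
    stats = []
    for kernel in FILTERS.values():
        windows = sliding_window_view(image, kernel.shape)
        response = np.abs(np.einsum("mnkl,kl->mn", windows, kernel))
        stats += [response.mean(), response.std()]
    return np.array(stats)

stripes = np.tile([0.0, 0.0, 1.0, 1.0], (16, 4))   # 16 x 16 image of vertical stripes
descriptor = texture_descriptor(stripes)
```

For the striped test image the vertical-edge statistics are large while the horizontal-edge statistics are zero, so the descriptor encodes the dominant orientation of the texture.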

Page 93:

Can you match the texture to the response?

Mean abs responses

Filters

A

B

C

1

2

3

Page 94:

Representing texture by mean abs response

Mean abs responses

Filters

Page 95:

Representing texture

• Idea 2: take vectors of filter responses at each pixel and cluster them, then take histograms (more on this in later weeks)