
What should be done at the Low Level? 16-721: Learning-Based Methods in Vision, A. Efros, CMU, Spring 2009

Page 1

What should be done at the Low Level?

16-721: Learning-Based Methods in Vision, A. Efros, CMU, Spring 2009

Page 2

Class Introductions

• Name:
• Research area / project / advisor:
• What do you want to learn in this class?
• When I am not working, I ______________
• Favorite fruit:

Page 3

Analysis Projects / Presentations

Wed: Varun (note-taker: Dan)

Next Wed: Dan (note-taker: Edward)

Dan and Edward need to meet with me ASAP

Varun needs to meet a second time

Page 4

Four Stages of Visual Perception

© Stephen E. Palmer, 2002

[Diagram: four stages of visual processing (image-based, surface-based, object-based, and category-based processing), situated between sensory input (light/vision, sound/audition, odor, movement, etc.), short- and long-term memory (STM, LTM), and motor output; example percept: a ceramic cup on a table. After David Marr, 1982.]

Page 5

Four Stages of Visual Perception

© Stephen E. Palmer, 2002

The Retinal Image

[Figure panels: an image, a blow-up of it, and the corresponding receptor output.]

Page 6

Four Stages of Visual Perception

© Stephen E. Palmer, 2002

[Diagram: image-based processes map the retinal image to an image-based representation, the primal sketch (Marr): edges, lines, blobs, etc. Illustrated with an image and its line drawing.]

Page 7

Four Stages of Visual Perception

© Stephen E. Palmer, 2002

[Diagram: surface-based processes (stereo, shading, motion, etc.) map the image-based representation (primal sketch) to a surface-based representation, the 2.5-D sketch.]

Page 8

Koenderink’s trick

Page 9

Four Stages of Visual Perception

© Stephen E. Palmer, 2002

[Diagram: object-based processes (grouping, parsing, completion, etc.) map the surface-based representation (2.5-D sketch) to an object-based representation, the volumetric sketch.]

Page 10

Geons (Biederman, 1987)

Page 11

Four Stages of Visual Perception

© Stephen E. Palmer, 2002

[Diagram: category-based processes (pattern recognition, spatial description) map the object-based representation (volumetric sketch) to a category-based representation, a basic-level category. Example: category: cup; color: light-gray; size: 6”; location: table.]

Page 12

We likely throw away a lot

Page 13

Line drawings are universal

Page 14

However, things are not so simple…

• Problems with the feed-forward model of processing…

Page 15

Two-tone images

Page 16
Page 17
Page 18

[Annotated two-tone image: hair (not shadow!); inferred external contours; “attached shadow” contour; “cast shadow” contour.]

Page 19
Page 20

Finding 3D structure in two-tone images requires distinguishing cast shadows, attached shadows, and areas of low reflectivity

The images do not contain this information a priori (at the low level)

Cavanagh's argument

Page 21

Feedforward vs. feedback models

[Diagram: Marr's model (circa 1980) is feedforward: stimulus → primal sketch → 2½-D sketch → 3D model → object; reconstruction of shape from image features, then object recognition by matching 3D models. Cavanagh's model (circa 1990s) adds feedback: stimulus → 2D shape (basic recognition with 2D primitives) → memory → 3D shape, with feedback from memory to the earlier stages.]

Page 22

A Classical View of Vision

[Diagram, a feed-forward pipeline: low-level (pixels, features, edges, etc.) → mid-level (grouping / segmentation, figure/ground organization) → high-level (object and scene recognition).]

Page 23

A Contemporary View of Vision

[Diagram: the same components — pixels, features, edges, etc. (low-level); figure/ground organization and grouping / segmentation (mid-level); object and scene recognition (high-level).]

But where do we draw this line?

Page 24

Question #1: What (if anything) should be done at the “Low-Level”?

N.B. I have already told you everything that is known. From now on, there aren’t any answers… only questions…

Page 25

Who cares? Why not just use pixels?

Pixel differences vs. Perceptual differences
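A toy numerical illustration (mine, not from the slides) of why pixel distance is a poor stand-in for perceptual distance: a uniform brightness shift produces a large pixel-wise difference yet looks like the same picture, while displacing an edge by one pixel produces a much smaller pixel-wise difference yet is clearly a different structure.

```python
import numpy as np

# A toy 8x8 "image": darker left half, brighter right half (a vertical edge).
img = np.full((8, 8), 0.1)
img[:, 4:] = 0.6

# Perceptually mild change: a uniform brightness shift.
brighter = img + 0.4

# Perceptually obvious change: the edge moves one pixel to the right.
shifted = np.full((8, 8), 0.1)
shifted[:, 5:] = 0.6

def rms(a, b):
    """Root-mean-square pixel difference."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

print(rms(img, brighter))  # 0.40  -- "large" pixel difference, same picture but brighter
print(rms(img, shifted))   # ~0.18 -- "small" pixel difference, visibly different structure
```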

Page 26

Eye is not a photometer!

"Every light is a shade, compared to the higher lights, till you come to the sun; and every shade is a light, compared to the deeper shades, till you come to the night."

— John Ruskin, 1879

Page 27

Cornsweet Illusion

Page 28

Campbell-Robson contrast sensitivity curve

[Figure: sine-wave grating.]
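The chart itself is easy to synthesize: a sinusoidal grating whose spatial frequency increases along one axis while its contrast decreases along the other; where the grating visually fades out traces the viewer's contrast sensitivity. A rough sketch (the exact frequency and contrast ramps are arbitrary choices of mine):

```python
import numpy as np

h, w = 256, 512
y, x = np.mgrid[0:h, 0:w].astype(float)

# Spatial frequency grows roughly exponentially from left to right,
# contrast shrinks from bottom (full) to top (nearly invisible).
freq = 0.01 * np.exp(4.0 * x / w)
contrast = np.exp(-5.0 * (h - 1 - y) / h)

chart = 0.5 + 0.5 * contrast * np.sin(2.0 * np.pi * freq * x)
# 'chart' is a float image in [0, 1]; at normal viewing distance, the visible
# upper boundary of the grating traces out your own contrast sensitivity curve.
```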

Page 29

Metamers

Page 30

Question #1: What (if anything) should be done at the “Low-Level”?

i.e., what changes in the input stimulus should we be invariant to?

Page 31

Invariant to:

• Brightness / Color changes?

small brightness / color changes
low-frequency changes

But one can be too invariant
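A common way to get (partial) invariance to brightness and slow, low-frequency changes is to normalize each local patch by its own mean and standard deviation (or, similarly in spirit, to high-pass filter first). A minimal sketch; the scheme and the epsilon are my choices, not something the slide prescribes:

```python
import numpy as np

def normalize_patch(patch, eps=1e-6):
    """Subtract the mean (brightness) and divide by the std (contrast) of a patch."""
    return (patch - patch.mean()) / (patch.std() + eps)

rng = np.random.default_rng(0)
patch = rng.random((16, 16))
altered = 1.7 * patch + 0.3        # brightened and contrast-stretched copy

# Both map to (numerically) the same descriptor:
print(np.allclose(normalize_patch(patch), normalize_patch(altered)))  # True

# The flip side of being invariant: a nearly constant patch (blank sky plus
# sensor noise) gets its noise blown up to unit variance -- "too invariant".
```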

Page 32

Invariant to:

• Edge contrast / reversal?

I shouldn’t care what background I am on!

But be careful not to exaggerate noise
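Using the gradient magnitude instead of signed gradients buys exactly this kind of invariance: flipping figure and background (contrast reversal) leaves it unchanged. A small check on a stand-in image (the random image is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.random((32, 32))       # stand-in image
flipped = 1.0 - img              # contrast-reversed version (light-on-dark vs. dark-on-light)

def grad_mag(im):
    gy, gx = np.gradient(im)
    return np.hypot(gx, gy)

print(np.allclose(grad_mag(img), grad_mag(flipped)))   # True
# Signed gradients change sign under reversal; their magnitude does not.
# The caveat from this slide: low-contrast noise also produces (small) gradient
# magnitudes, which later thresholding can exaggerate.
```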

Page 33

Representation choices

• Raw pixels
• Gradients
• Gradient magnitude
• Thresholded gradients (edge + sign)
• Thresholded gradient magnitude (edges)
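A minimal NumPy sketch of how these choices relate to one another (the threshold and the crude x-only treatment of "edge + sign" are my own simplifications):

```python
import numpy as np

def representations(img, thresh=0.1):
    """Compute the representation choices listed above for a grayscale float image."""
    gy, gx = np.gradient(img)              # signed gradients
    mag = np.hypot(gx, gy)                 # gradient magnitude
    edges = mag > thresh                   # thresholded magnitude: binary edge map
    edges_signed = np.sign(gx) * edges     # crude "edge + sign" (x-direction only)
    return {
        "raw pixels": img,
        "gradients": (gx, gy),
        "gradient magnitude": mag,
        "edges (thresholded magnitude)": edges,
        "edges + sign (thresholded gradients)": edges_signed,
    }
```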

Page 34

Typical filter bank
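The slide shows a typical hand-designed bank (oriented edge/bar filters at several scales plus blob-like filters). As an illustrative sketch, here is one way to build a small bank of oriented first-derivative-of-Gaussian filters; the sizes, scales, and orientations below are arbitrary choices of mine:

```python
import numpy as np

def gaussian_derivative_filter(size, sigma, theta):
    """First derivative of a Gaussian, oriented at angle theta (radians)."""
    r = (size - 1) / 2.0
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    # Rotate coordinates so the derivative is taken along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    f = -xr / sigma**2 * g               # derivative of the Gaussian along theta
    return f - f.mean()                  # zero-mean, so flat regions respond ~0

# A small bank: 6 orientations x 2 scales.
bank = [gaussian_derivative_filter(15, s, t)
        for s in (1.5, 3.0)
        for t in np.linspace(0, np.pi, 6, endpoint=False)]
```

Convolving an image with each filter in the bank gives a stack of response maps, one per filter, which together form a per-pixel feature vector.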

Page 35

[Figure: an input image passed through a set of filters to produce a pyramid (e.g., wavelet, steerable, etc.).]
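A multi-scale pyramid is just a repeated blur-and-subsample; wavelet and steerable pyramids additionally keep oriented band-pass coefficients at each level. A minimal Gaussian-pyramid sketch (the level count and sigma are arbitrary; uses SciPy's gaussian_filter):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(img, levels=4, sigma=1.0):
    """Repeatedly blur and subsample by 2; returns images from fine to coarse."""
    pyr = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        blurred = gaussian_filter(pyr[-1], sigma)
        pyr.append(blurred[::2, ::2])      # keep every other row and column
    return pyr
```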

Page 36

What does it capture?

v = F * Patch (where F is the filter matrix)
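Read as: stack each (flattened) filter as one row of a matrix F, flatten the patch into a vector, and multiply; the result v holds one response per filter. A small sketch with random stand-in filters (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
k = 15                                    # filter size (k x k)
filters = rng.standard_normal((8, k, k))  # a bank of 8 random stand-in filters

F = filters.reshape(8, -1)                # each row: one flattened filter
patch = rng.standard_normal((k, k))       # an image patch of the same size

v = F @ patch.ravel()                     # v[i] = response of filter i to the patch
# Convolving the whole image with the bank computes this same product
# at every pixel location.
```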

Page 37

Why these filters?

Page 38

Learned filters
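This slide contrasts hand-designed banks with filters learned from image data (sparse coding / ICA on natural image patches famously yields oriented, Gabor-like filters). As a much simpler stand-in for the idea, here is a PCA-on-patches sketch; its leading components come out as low-frequency oriented patterns rather than true Gabors, but it shows the "learn the filters from the data" recipe. All sizes and counts are arbitrary:

```python
import numpy as np

def learn_filters_pca(image, patch_size=12, n_patches=5000, n_filters=16, seed=0):
    """Learn filters as principal components of random patches from a grayscale image."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    ps = patch_size
    ys = rng.integers(0, h - ps, n_patches)
    xs = rng.integers(0, w - ps, n_patches)
    patches = np.stack([image[y:y + ps, x:x + ps].ravel() for y, x in zip(ys, xs)])
    patches -= patches.mean(axis=1, keepdims=True)    # remove per-patch brightness
    # Principal components of the patch collection = learned "filters".
    _, _, vt = np.linalg.svd(patches - patches.mean(axis=0), full_matrices=False)
    return vt[:n_filters].reshape(n_filters, ps, ps)

# e.g. filters = learn_filters_pca(some_grayscale_float_image)
```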

Page 39

Spatial invariance

• Rotation, Translation, Scale
• Yes, but not too much…

• In the brain: complex cells – partial invariance

• In computer vision: histogram-binning methods (SIFT, GIST, Shape Context, etc.) or, equivalently, blurring (e.g., Geometric Blur); we will discuss this later (a simplified sketch follows below)
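The histogram-binning idea stripped to its core: pool gradient orientations (weighted by magnitude) over a cell, so that shifting the content by a pixel or two barely changes the descriptor. This is the SIFT/HOG-style trick in simplified form; the bin count and cell size are my own choices:

```python
import numpy as np

def orientation_histogram(cell, n_bins=8):
    """Histogram of gradient orientations in a cell, weighted by gradient magnitude."""
    gy, gx = np.gradient(cell)
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi          # orientation in [0, pi)
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-6)

# A diagonal-edge cell and the same cell shifted by one pixel:
y, x = np.mgrid[0:16, 0:16]
cell = (x + y > 16).astype(float)
shifted = (x + y > 17).astype(float)

h1, h2 = orientation_histogram(cell), orientation_histogram(shifted)
print(np.abs(h1 - h2).sum())   # close to zero: the pooled descriptor barely moves
```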

Page 40

Many lives of a boundary

Page 41

Often, context-dependent…

[Figure: input image, Canny edge map, human-marked boundaries.]
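For reference, a Canny edge map like the middle panel can be produced with scikit-image (an external library; the sigma here is an arbitrary choice, not the setting behind the slide's figure):

```python
from skimage import data, feature

img = data.camera().astype(float) / 255.0   # a built-in grayscale test image
edges = feature.canny(img, sigma=2.0)       # boolean edge map
# Comparing such a map to human-marked boundaries (as the Berkeley Segmentation
# Dataset does) exposes exactly the mismatch this slide is pointing at.
```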

Maybe low-level is never enough?