Computer Vision CSPP 56553 Artificial Intelligence March 3, 2004
Dec 20, 2015


Computer Vision

CSPP 56553

Artificial Intelligence

March 3, 2004

Roadmap

• Motivation
  – Computer vision applications
• Is a picture worth a thousand words?
  – Low level features
    • Feature extraction: intensity, color
  – High level features
    • Top-down constraint: shape from stereo, motion, ...
• Case Study: Vision as Modern AI
  – Fast, robust face detection (Viola & Jones 2002)

Perception

• From observation to facts about world
  – Analogous to speech recognition
  – Stimulus (Percept) S, World W
    • S = g(W)
  – Recognition: Derive world from percept
    • W = g’(S), where g’ is the inverse of g
• Is this possible?

Key Perception Problem

• Massive ambiguity
  – Optical illusions

• Occlusion

• Depth perception

• “Objects are closer than they appear”

• Is it full-sized or a miniature model?

Image Ambiguity

Handling Uncertainty

• Identify single perfect correct solution
  – Impossible!
    • Noise, ambiguity, complexity
• Solution:
  – Probabilistic model
  – P(W|S) = α P(S|W) P(W)
    • Maximize image probability and model probability
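
The rule P(W|S) = α P(S|W) P(W) can be illustrated with a minimal sketch. The two world hypotheses and all the probabilities below are made-up values chosen for illustration; they are not from the lecture.

```python
def posterior(prior, likelihood):
    """Return P(W|S) for each world hypothesis W, given P(W) and P(S|W)."""
    unnormalized = {w: likelihood[w] * prior[w] for w in prior}
    alpha = 1.0 / sum(unnormalized.values())  # normalizing constant
    return {w: alpha * p for w, p in unnormalized.items()}

# Hypothetical numbers: is the observed object full-sized or a miniature model?
prior = {"full-sized": 0.9, "miniature": 0.1}        # P(W), the model probability
likelihood = {"full-sized": 0.2, "miniature": 0.6}   # P(S|W), the image probability

post = posterior(prior, likelihood)
best = max(post, key=post.get)  # most probable world given the percept
```

Here the percept is more likely under the miniature hypothesis, yet the prior still makes the full-sized interpretation the more probable one overall.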

Handling Complexity

• Don’t solve the whole problem
  – Don’t recover every object/position/color…
• Solve restricted problem
  – Find all the faces
  – Recognize a person
  – Align two images

Modern Computer Vision Applications

• Face / Object detection

• Medical image registration

• Face recognition

• Object tracking

Vision Subsystems

Image Formation

Images and Representations

• Initially pixel images
  – Image as NxM matrix of pixel values
  – Alternate image codings
    • Grey-scale intensity values
    • Color encoding: intensities of RGB values
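
As a rough sketch of these two codings, an image can be held as a list of rows. The pixel values are invented, and the luminance weights in `to_grey` are one common convention assumed here, not something the slides specify.

```python
# A 2x3 grey-scale image: N rows by M columns of intensities in 0..255.
grey = [
    [0, 128, 255],
    [64, 64, 64],
]

# The same layout for color: each pixel is an (R, G, B) triple.
color = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(255, 255, 255), (0, 0, 0), (128, 128, 128)],
]

def to_grey(rgb_image, weights=(0.299, 0.587, 0.114)):
    """Collapse RGB triples to one intensity per pixel (weights are an assumption)."""
    return [
        [round(sum(w * c for w, c in zip(weights, pixel))) for pixel in row]
        for row in rgb_image
    ]
```

Calling `to_grey(color)` returns a matrix with the same shape as `grey`, one intensity per pixel.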

Images

Grey-scale Images

Color Images

Image Features

• Grey-scale and color intensities
  – Directly access image signal values
  – Large number of measures
    • Possibly noisy
    • Only care about intensities as cues to world
• Image Features:
  – Mid-level representation
  – Extract from raw intensities
  – Capture elements of interest for image understanding

Edge Detection

Edge Detection

• Find sharp demarcations in intensity
• 1) Apply spatially oriented filters
    • E.g. vertical, horizontal, diagonal
• 2) Label above-threshold pixels with edge orientation
• 3) Combine edge segments with same orientation: line
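
The three steps can be sketched as follows. The slides do not name a filter, so the Sobel kernels, the threshold, and the column-wise grouping in `vertical_segments` are illustrative assumptions; the grouping is deliberately crude and does not check that pixels are contiguous.

```python
# Sobel kernels: one common choice of oriented filter (an assumption here).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges

def filter_response(image, kernel, r, c):
    """Step 1: apply a 3x3 kernel centered on pixel (r, c)."""
    return sum(
        kernel[i][j] * image[r + i - 1][c + j - 1]
        for i in range(3) for j in range(3)
    )

def label_edges(image, threshold):
    """Step 2: return {(row, col): orientation} for interior pixels above threshold."""
    labels = {}
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx = abs(filter_response(image, SOBEL_X, r, c))
            gy = abs(filter_response(image, SOBEL_Y, r, c))
            if max(gx, gy) > threshold:
                labels[(r, c)] = "vertical" if gx >= gy else "horizontal"
    return labels

def vertical_segments(labels):
    """Step 3 (simplified): collect vertically labeled pixels by column."""
    columns = {}
    for (r, c), orientation in sorted(labels.items()):
        if orientation == "vertical":
            columns.setdefault(c, []).append(r)
    return columns

# Toy image: dark left half, bright right half, so one vertical edge.
image = [[0, 0, 255, 255] for _ in range(4)]
edges = label_edges(image, threshold=100)
```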

Top-down Constraints

• Goal: Extract objects from images
  – Approach: apply knowledge about how the world works to identify coherent objects

Motion: Optical Flow

• Find correspondences in sequential images
  – Units which move together represent objects

Stereo

Stereo Depth Resolution

Texture and Shading

Edge-Based 2-3D Reconstruction

Assume world of solid polyhedra with 3-edge vertices
Apply Waltz line labeling – via Constraint Satisfaction

Basic Object Recognition

• Simple idea:
  – extract 3-D shapes from image
  – match against “shape library”
• Problems:
  – extracting curved surfaces from image
  – representing shape of extracted object
  – representing shape and variability of library object classes
  – improper segmentation, occlusion
  – unknown illumination, shadows, markings, noise, complexity, etc.
• Approaches:
  – index into library by measuring invariant properties of objects
  – alignment of image feature with projected library object feature
  – match image against multiple stored views (aspects) of library object
  – machine learning methods based on image statistics

Hand-written Digit Recognition

Summary

• Vision is hard:
  – Noise, ambiguity, complexity
• Prior knowledge is essential to constrain problem
  – Cohesion of objects, optics, object features
• Combine multiple cues
  – Motion, stereo, shading, texture
• Image/object matching:
  – Library: features, lines, edges, etc.
• Apply domain knowledge: Optics
• Apply machine learning: NN, NN, CSP, etc.

Computer Vision Case Study

• “Rapid Object Detection using a Boosted Cascade of Simple Features”, Viola/Jones ’01
• Challenge:
  – Object detection:
    • Find all faces in arbitrary images
  – Real-time execution
    • 15 frames per second
  – Need simple features, classifiers

Rapid Object Detection Overview

• Fast detection with simple local features
  – Simple fast feature extraction
    • Small number of computations per pixel
    • Rectangular features
  – Feature selection with AdaBoost
    • Sequential feature refinement
  – Cascade of classifiers
    • Increasingly complex classifiers
    • Repeatedly rule out non-object areas

Picking Features

• What cues do we use for object detection?
  – Not direct pixel intensities
  – Features
    • Can encode task specific domain knowledge (bias)
      – Difficult to learn directly from data
      – Reduce training set size
    • Feature system can speed processing

Rectangle Features

• Treat rectangles as units
  – Derive statistics
• Two-rectangle features
  – Two similar rectangular regions
    • Vertically or horizontally adjacent
  – Sum pixels in each region
    • Compute difference between regions

Rectangle Features II

• Three-rectangle features
  – 3 similar rectangles: horizontally/vertically
    • Sum outside rectangles
    • Subtract from center region
• Four-rectangle features
  – Compute difference between diagonal pairs
• HUGE feature set: ~180,000
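
A minimal sketch of the two-, three-, and four-rectangle features, computed by summing pixels directly. The function names, the (top, left, height, width) parameterization, and the sign conventions are assumptions made for illustration; only the horizontal variants are shown.

```python
def region_sum(image, top, left, height, width):
    """Sum the pixels in a rectangle by direct iteration."""
    return sum(
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )

def two_rect_horizontal(image, top, left, height, width):
    """Difference between two side-by-side rectangles of equal size."""
    left_sum = region_sum(image, top, left, height, width)
    right_sum = region_sum(image, top, left + width, height, width)
    return right_sum - left_sum

def three_rect_horizontal(image, top, left, height, width):
    """Sum of the two outer rectangles subtracted from the center one."""
    outer = (region_sum(image, top, left, height, width)
             + region_sum(image, top, left + 2 * width, height, width))
    center = region_sum(image, top, left + width, height, width)
    return center - outer

def four_rect(image, top, left, height, width):
    """Difference between the two diagonal pairs of a 2x2 block of rectangles."""
    a = region_sum(image, top, left, height, width)                   # top-left
    b = region_sum(image, top, left + width, height, width)           # top-right
    c = region_sum(image, top + height, left, height, width)          # bottom-left
    d = region_sum(image, top + height, left + width, height, width)  # bottom-right
    return (a + d) - (b + c)
```

Each call here costs time proportional to the rectangle area, which is too slow when a very large feature set must be evaluated at many positions and scales.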

Rectangle Features

Computing Features Efficiently

• Fast detection requires fast feature calculation
• Rapidly compute intermediate representation
  – “Integral image”
  – Value for point (x,y) is sum of pixels above and to the left, inclusive
  – ii(x,y) = Σ_{x’≤x, y’≤y} i(x’,y’)
  – Computed by recurrence
    • s(x,y) = s(x,y-1) + i(x,y), where s(x,y) is the cumulative row sum
    • ii(x,y) = ii(x-1,y) + s(x,y)
• Compute rectangle sum with 4 array references
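
The recurrence and the four-reference rectangle sum can be sketched as below. The row-major `image[y][x]` layout and the helper names are assumptions; values outside the image are treated as 0, which plays the role of the boundary conditions s(x,-1) = 0 and ii(-1,y) = 0.

```python
def integral_image(image):
    """ii[y][x] = sum of image[y'][x'] for all y' <= y and x' <= x."""
    height, width = len(image), len(image[0])
    ii = [[0] * width for _ in range(height)]
    for x in range(width):
        s = 0  # running sum s(x, y) over y' <= y at fixed x
        for y in range(height):
            s += image[y][x]                                # s(x,y) = s(x,y-1) + i(x,y)
            ii[y][x] = (ii[y][x - 1] if x > 0 else 0) + s   # ii(x,y) = ii(x-1,y) + s(x,y)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum over a rectangle using at most four lookups into ii."""
    def at(y, x):
        return ii[y][x] if y >= 0 and x >= 0 else 0  # outside the image counts as 0
    bottom, right = top + height - 1, left + width - 1
    return (at(bottom, right) - at(top - 1, right)
            - at(bottom, left - 1) + at(top - 1, left - 1))
```

Once `ii` is built in a single pass, `rect_sum` takes the same time for any rectangle size, so a two-rectangle feature is just the difference of two such calls.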

Rectangle Feature Summary

• Rectangle features
  – Relatively simple
  – Sensitive to bars, edges, simple structure
    • Coarse
  – Rich enough for effective learning
  – Efficiently computable

Learning an Image Classifier

• Supervised training: +/- examples
• Many learning approaches possible
• AdaBoost:
  – Selects features AND trains classifier
  – Improves performance of simple classifiers
    • Training error guaranteed to approach zero exponentially in the number of rounds
  – Basic idea: Simple classifier
    • Boosts performance by focusing on previous errors
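
"Focusing on previous errors" can be sketched as one round of reweighting. This follows the β-based update used in the Viola/Jones formulation of AdaBoost; the function names are hypothetical, and the sketch assumes the weak classifier's weighted error is strictly between 0 and 0.5.

```python
import math

def boost_round(weights, predictions, labels):
    """One AdaBoost round: return (alpha, new_weights) for a weak classifier.

    predictions and labels are lists of 0/1 values; weights sum to 1.
    """
    error = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    beta = error / (1.0 - error)          # < 1 when the classifier beats chance
    alpha = math.log(1.0 / beta)          # vote strength of this classifier
    # Shrink the weight of correctly classified examples; errors keep theirs.
    updated = [w * beta if p == y else w
               for w, p, y in zip(weights, predictions, labels)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

def strong_classify(alphas, weak_outputs):
    """Weighted vote: 1 if the weighted sum reaches half the total vote."""
    score = sum(a * h for a, h in zip(alphas, weak_outputs))
    return 1 if score >= 0.5 * sum(alphas) else 0
```

After renormalization the misclassified examples carry a larger share of the weight, so the next weak classifier is pushed toward getting them right.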

Feature Selection and Training

• Goal: Pick only useful features from 180,000
  – Idea: Small number of features effective
• Learner selects single feature that best separates positive/negative examples
  – Learner selects optimal threshold for each feature
  – Classifier h(x) = 1 if p f(x) < p θ, 0 otherwise (p = ±1 sets the direction of the inequality)
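
A brute-force sketch of this selection step: for every feature, try candidate thresholds θ and both polarities p, and keep the combination with the lowest weighted error. The `feature_table` layout and the choice of midpoints as candidate thresholds are assumptions for illustration, not the authors' implementation.

```python
def stump(feature_value, threshold, polarity):
    """h(x) = 1 if p * f(x) < p * theta, else 0 (polarity p is +1 or -1)."""
    return 1 if polarity * feature_value < polarity * threshold else 0

def best_stump(feature_table, labels, weights):
    """Pick (error, feature index, threshold, polarity) with least weighted error.

    feature_table[j][i] is the value of feature j on example i.
    """
    best = None
    for j, values in enumerate(feature_table):
        distinct = sorted(set(values))
        # Candidate thresholds: midpoints between neighbors, plus one past each end.
        candidates = [distinct[0] - 1] + [
            (a + b) / 2 for a, b in zip(distinct, distinct[1:])
        ] + [distinct[-1] + 1]
        for theta in candidates:
            for p in (+1, -1):
                error = sum(
                    w for v, y, w in zip(values, labels, weights)
                    if stump(v, theta, p) != y
                )
                if best is None or error < best[0]:
                    best = (error, j, theta, p)
    return best
```

For example, with `feature_table = [[1, 2, 3, 4], [5, 1, 4, 2]]`, `labels = [1, 1, 0, 0]`, and uniform weights, feature 0 with θ = 2.5 and p = +1 separates the examples perfectly, while feature 1 cannot.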

Basic Learning Results

• Initial classification: Frontal faces
  – 200 features
  – Detects 95% of faces, with 1/14000 false positives
  – Very fast
    • Adding features adds to computation time
• Features interpretable
  – Darker region around eyes than nose/cheeks
  – Eyes are darker than bridge of nose

Primary Features

“Attentional Cascade”

• Goal: Improved classification, reduced time
  – Insight: Small – fast – classifiers can reject
    • But have very few false negatives
  – Reject majority of uninteresting regions quickly
  – Focus computation on interesting regions
• Approach: “Degenerate” decision tree
    • Aka “cascade”
  – Positive results passed to high detection classifiers
  – Negative results rejected immediately

Cascade Schematic

All Sub-window Features -> CL 1 -T-> CL 2 -T-> CL 3 -T-> More Classifiers
                            |F        |F        |F
                            v         v         v
                                Reject Sub-Window
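
The control flow in the schematic can be sketched in a few lines. The stage functions, feature names, and thresholds below are entirely hypothetical; only the early-exit structure is the point.

```python
def run_cascade(stages, window):
    """Return True only if every stage accepts the sub-window.

    stages is an ordered list of functions window -> bool, cheapest first.
    """
    for stage in stages:
        if not stage(window):
            return False  # F branch: reject immediately, skip later stages
    return True           # T at every stage: report a detection

# Hypothetical stages over a window summarized as a dict of feature values.
stages = [
    lambda w: w["eye_cheek"] > 10,                               # CL 1: one cheap test
    lambda w: w["eye_cheek"] + w["eye_nose"] > 30,               # CL 2: two features
    lambda w: w["eye_cheek"] + w["eye_nose"] + w["mouth"] > 45,  # CL 3: three features
]
```

A window that fails CL 1 never has its later features looked at, which is where the savings come from when most windows are rejected early.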

Cascade Construction

• Each stage is a trained classifier
  – Tune threshold to minimize false negatives
  – Good first stage classifier
    • Two feature strong classifier – eye/cheek + eye/nose
    • Tuned: Detect 100%; 40% false positives
  – Very computationally efficient
    • 60 microprocessor instructions

Cascading

• Goal: Reject bad regions quickly
  – Most regions are bad
    • Reject early in processing, little effort
  – Good regions will trigger full cascade
    • Relatively rare
• Classification is progressively more difficult
  – Rejected the most obvious cases already
    • Deeper classifiers more complex, more error-prone

Cascade Training

• Tradeoffs: Accuracy vs Cost
  – More accurate classifiers: more features, complex
  – More features, more complex: Slower
  – Difficult optimization
• Practical approach
  – Each stage reduces false positive rate
  – Bound reduction in false positives, increase in misses
  – Add features to each stage until it meets its target
  – Add stages until overall effectiveness targets met
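
One way to see why per-stage targets add up to an overall target: a window is reported only if every stage accepts it, so the overall detection and false positive rates are products of the per-stage rates. The sketch below assumes each stage's rates are measured on the windows that reach it; the function names are hypothetical.

```python
def cascade_rates(stage_rates):
    """Combine per-stage (detection_rate, false_positive_rate) pairs.

    A window is reported only if every stage accepts it, so the overall
    rates are the products of the per-stage rates.
    """
    detection, false_positive = 1.0, 1.0
    for d, f in stage_rates:
        detection *= d
        false_positive *= f
    return detection, false_positive

def stages_needed(per_stage_fp, target_fp):
    """Count identical stages required to push false positives to target or below."""
    count, rate = 0, 1.0
    while rate > target_fp:
        rate *= per_stage_fp
        count += 1
    return count
```

With illustrative per-stage values of 0.99 detection and 0.3 false positives, `cascade_rates([(0.99, 0.3)] * 10)` gives an overall detection rate of 0.99^10 (about 0.9) and a false positive rate of 0.3^10 (about 6 in a million).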

Results

• Task: Detect frontal upright faces
  – Face/non-face training images
    • Face: ~5000 hand-labeled instances
    • Non-face: ~9500 random web-crawl, hand-checked
  – Classifier characteristics:
    • 38 layer cascade
    • Increasing number of features per layer: 1, 10, 25, …; 6061 in total
  – Classification: Average 10 features per window
    • Most rejected in first 2 layers
    • Process 384x288 image in 0.067 secs

Detection Tuning

• Multiple detections:
  – Many subwindows around face will alert
  – Create disjoint subsets
    • For overlapping boundaries, only report one
  – Return average of corners
• Voting:
  – 3 similarly trained detectors
    • Majority rules
  – Improves overall performance
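
A minimal sketch of both post-processing ideas: overlapping detections are grouped into disjoint subsets and each subset is replaced by the average of its corners, and several detectors are combined by majority vote. The (left, top, right, bottom) box format and the function names are assumptions for illustration.

```python
def overlaps(a, b):
    """True if boxes (left, top, right, bottom) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def merge_detections(boxes):
    """Partition boxes into overlap-connected groups; return one average box each."""
    groups = []
    for box in boxes:
        # Find every existing group this box touches, then fuse them together.
        touching = [g for g in groups if any(overlaps(box, other) for other in g)]
        fused = [box] + [b for g in touching for b in g]
        groups = [g for g in groups if not any(g is t for t in touching)] + [fused]
    return [
        tuple(sum(b[i] for b in g) / len(g) for i in range(4))
        for g in groups
    ]

def majority_vote(decisions):
    """Combine boolean outputs of several detectors by simple majority."""
    return sum(decisions) * 2 > len(decisions)
```

For instance, two overlapping boxes around one face collapse into a single averaged box, while a distant third box is reported on its own.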

Conclusions

• Fast, robust facial detection
  – Simple, easily computable features
  – Simple trained classifiers
  – Classification cascade allows early rejection
    • Early classifiers also simple, fast
  – Good overall classification in real-time

Some Results

Vision in Modern AI

• Goals:
  – Robustness
  – Multidomain applicability
  – Automatic acquisition
  – Speed: Real time
• Approach:
  – Simple mechanisms, feature selection
  – Machine learning: Tune features, classification