Subhransu Maji, CMPSCI 670: Computer Vision
October 25, 2016
Recognition
Subhransu Maji (UMass, Fall 16), CMPSCI 670
Agenda for the next few lectures
• Overview of recognition
• Image representations
• Machine learning
• Deep learning
Scene categorization
• outdoor/indoor
• city/forest/factory/etc.
Image annotation/tagging
• street
• people
• building
• mountain
• …
Object detection
• find pedestrians
Activity recognition
• walking
• shopping
• rolling a cart
• sitting
• talking
• …
Image parsing
[Figure: an example street scene segmented into labeled regions — mountain, building, tree, banner, market, people, street lamp, sky, building]
Visual question answering
• How many people are walking on the street?
• Where was this picture taken? (external knowledge)
History of ideas in recognition
• 1960s – early 1990s: the geometric era
• 1990s: appearance-based models
• Late 1990s: local features
• Early 2000s: parts-and-shape models
• Mid-2000s: bags-of-features, learning-based techniques
• Present trends: big data, recognition + X (X = geometry, robotics, language), deep learning, getting AI to work, and many applications: health care, autonomous driving, face recognition, image/video search, etc.
Recognition by learning
The machine learning framework
y = f(x)
Apply a prediction function f to a feature representation x of the image to get the desired output y:
• f(image of an apple) = “apple”
• f(image of a tomato) = “tomato”
• f(image of a cow) = “cow”
The machine learning framework
• Training: given a training set of labeled examples {(x1, y1), …, (xN, yN)}, estimate the prediction function f by minimizing the prediction error on the training set
• Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)
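The two steps above can be sketched with the simplest possible learner, a 1-nearest-neighbor classifier; the 2-D "feature vectors" and fruit/animal labels below are made up purely for illustration. Training just stores the labeled examples, and f(x) returns the label of the closest stored example.

```python
import numpy as np

def train_1nn(X_train, y_train):
    """'Training' for 1-NN: simply store the labeled examples {(x_i, y_i)}."""
    return X_train, y_train

def predict_1nn(model, x):
    """f(x): return the label of the training example closest to x."""
    X_train, y_train = model
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
    return y_train[int(np.argmin(dists))]

# Hypothetical 2-D features for three labeled training images.
X = np.array([[1.0, 0.2], [0.9, 0.1], [0.1, 0.9]])
y = np.array(["apple", "apple", "cow"])

model = train_1nn(X, y)
print(predict_1nn(model, np.array([0.95, 0.15])))  # → apple
```

Real systems replace both pieces — richer features instead of these toy vectors, and a learned classifier instead of raw nearest-neighbor lookup — but the train/test contract is the same.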
[Pipeline figure — Training: training images → image features, combined with training labels → training → learned model. Testing: test image → image features → learned model → prediction.]
Slide credit: D. Hoiem
Ingredients for learning
Whole idea: inject your knowledge into a learning system.
Sources of knowledge:
1. Feature representation
➡ Not typically a focus of machine learning
➡ Typically seen as “problem specific”
➡ However, it’s hard to learn from bad representations
2. Training data: labeled examples
➡ Often expensive to label lots of data
➡ Sometimes data is available for “free”
3. Model
➡ No single learning algorithm is always good (“no free lunch”)
➡ Different learning algorithms work with different ways of representing the learned classifier
Features (examples)
• Raw pixels (and simple functions of raw pixels)
• Gradient histograms
• GIST descriptors
• Bags of features
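As a rough sketch of the gradient-histogram idea behind HOG/SIFT-style descriptors (the function below is illustrative, not any library's API): bin gradient orientations over the image, weighting each pixel's vote by its gradient magnitude.

```python
import numpy as np

def gradient_histogram(img, n_bins=8):
    """Toy gradient-orientation histogram over a grayscale image."""
    gy, gx = np.gradient(img.astype(float))        # per-pixel gradients
    mag = np.hypot(gx, gy)                         # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), np.pi)        # orientation in [0, pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    return hist / (hist.sum() + 1e-8)              # normalize to sum ~1

# A tiny synthetic image with a vertical edge: horizontal gradients dominate,
# so all the histogram mass lands in the first orientation bin.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
h = gradient_histogram(img)
print(np.argmax(h))  # → 0
```

Real descriptors add local cells, block normalization, and interpolation between bins, but the core representation is this orientation histogram.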
Recognition task and supervision
Images in the training set must be annotated with the “correct answer” that the model is expected to produce, e.g., “contains a motorbike”.
Unsupervised → “weakly” supervised → fully supervised
Definition depends on the task
Generalization
How well does a learned model generalize from the data it was trained on to a new test set?
Training set (labels known) vs. test set (labels unknown)
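One way to see generalization concretely is to evaluate the same learned model on its training set and on a held-out test set. The sketch below uses synthetic 2-D data and a least-squares linear classifier purely for illustration; all names are hypothetical, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """2-D points labeled by a ground-truth linear rule (synthetic data)."""
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y

X_train, y_train = make_data(200)   # labels known at training time
X_test, y_test = make_data(100)     # held out: never seen during training

# Training: fit a linear scoring function on the training set only.
w, *_ = np.linalg.lstsq(X_train, y_train - 0.5, rcond=None)

def f(X):
    """Prediction function y = f(x): threshold the learned linear score."""
    return (X @ w > 0).astype(int)

# Generalization: compare accuracy on seen vs. unseen data.
train_acc = (f(X_train) == y_train).mean()
test_acc = (f(X_test) == y_test).mean()
print(round(train_acc, 2), round(test_acc, 2))
```

Here the true rule is linear, so train and test accuracy are both high; with a model that overfits (e.g., one memorizing noise), train accuracy stays high while test accuracy drops — exactly the gap generalization measures.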
Datasets over time:
• Circa 2001: five categories, hundreds of images per category
• Circa 2004: 101 categories
• Today: up to thousands of categories, millions of images