Beyond Mindless Labeling: Really Leveraging Humans to Build Intelligent Machines Devi Parikh Virginia Tech

Dec 16, 2015


Sybil Chandler
Transcript
Page 1

Beyond Mindless Labeling: Really Leveraging Humans to Build Intelligent Machines

Devi Parikh

Virginia Tech

Page 2

Image Understanding

Slide credit: Devi Parikh

“Color College Avenue”, Blacksburg, VA, May 2012

Page 3

State of Affairs

[Bar chart: accuracy of Machine vs. Human]

Slide credit: Devi Parikh

Page 4

How do we teach machines today?

Slide credit: Devi Parikh

Page 5

Slide credit: Devi Parikh

Page 6

Slide credit: Devi Parikh

Page 7

Slide credit: Devi Parikh

Page 8

Slide credit: Devi Parikh

Page 9

Slide credit: Devi Parikh

Page 10

Slide credit: Devi Parikh

Page 11

Slide credit: Devi Parikh

Page 12

And on, and on, and on…

Slide credit: Devi Parikh

Page 13

Slide credit: Devi Parikh

Page 14

How do machines behave?

Slide credit: Devi Parikh

Page 15

[Example scene categories: Airplane Cabin, Amusement Park, Aquarium, Badminton Court, Bedroom]

Xiao et al., CVPR 2010

Slide credit: Devi Parikh

Page 16

Clarifai, April 10th 2014

Page 17

Interacting with Vision Systems

Need a better mode of communication!

Slide credit: Devi Parikh

Page 18

Attributes

• Examples: furry, natural, young, etc.
• Mid-level
• Shareable across concepts
• Human understandable
• Machine detectable
• Allow for human-machine communication

Slide credit: Devi Parikh
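To make the idea of attributes as a shared human-machine vocabulary concrete, here is a minimal sketch, not the method from any of the cited papers. It assumes a supervisor has described each class by a set of attributes and that some detector (not shown) has already produced a set of detected attributes for an image. The class names, attribute sets, and the Jaccard-overlap scoring rule are all hypothetical choices made for illustration.

```python
from typing import Dict, Set

# Hypothetical class descriptions supplied by a human supervisor.
CLASS_ATTRIBUTES: Dict[str, Set[str]] = {
    "polar bear": {"white", "furry", "large"},
    "rabbit": {"white", "furry", "small"},
    "crow": {"black", "feathered", "small"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two attribute sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_by_attributes(detected: Set[str]) -> str:
    """Pick the class whose described attributes best match the detected ones."""
    return max(CLASS_ATTRIBUTES, key=lambda name: jaccard(CLASS_ATTRIBUTES[name], detected))


def explain(detected: Set[str], label: str) -> str:
    """Phrase the decision in terms of the attributes shared with the chosen class."""
    shared = sorted(CLASS_ATTRIBUTES[label] & detected)
    return f"I think this is a {label} because it is " + " and ".join(shared) + "."


# Assumed detector output for one image (made up for this example).
detected_attributes = {"white", "furry", "large"}
label = classify_by_attributes(detected_attributes)
message = explain(detected_attributes, label)
```

Because the same attribute names appear in both the human-written descriptions and the machine's output, the machine can report its reasoning in words a person can check, which is the property the slide highlights.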

Page 19

Role of the Human

[Diagram: a grid crossing the communicator (Human or Machine) with the role of the human (Supervisor or User)]

Applications shown: Image Search; Instilling Domain Knowledge; Characterizing Failure Modes; Interpretable Models; Active and Interactive Learning

Example statements:

• “My missing brother is fuller-faced than this boy.”
• “Polar bears are white and larger than rabbits.”
• “If the image is blurry or the face is not frontal, I may fail.”
• “I think this is a polar bear because this is a white and furry animal.”

References:

[Parikh and Grauman, ICCV 2011]

[Parkash and Parikh, ECCV 2012]

[Biswas and Parikh, CVPR 2013]

[Lad and Parikh, ECCV 2014]

[Kovashka, Parikh and Grauman, CVPR 2012]

[Parikh and Grauman, ICCV 2013]

[Bansal, Farhadi and Parikh, ECCV 2014]

Slide credit: Devi Parikh

Page 20

Accessing Common Sense

• Direct communication

• Learn by observing structure in our visual world?

Slide credit: Devi Parikh

Page 21

Two professors converse in front of a blackboard.

Slide credit: Larry Zitnick

Page 22

Two professors stand in front of a blackboard.

Slide credit: Larry Zitnick

Page 23

Two professors converse in front of a blackboard.

Slide credit: Larry Zitnick

Page 24

Challenges

• Lacking visual density
• Annotations are expensive (and boring)
• Computer vision doesn’t work well enough

Slide credit: Devi Parikh

Page 25

Is photorealism necessary?

Slide credit: Larry Zitnick

Page 26

Jenny Mike

Slide credit: Larry Zitnick

Page 27

Slide credit: Larry Zitnick

Page 28

Slide credit: Larry Zitnick

Page 29

Interface

2x

Page 30

Mike fights off a bear by giving him a hotdog while Jenny runs away.

Slide credit: Larry Zitnick

Page 31

Dataset

1,000 classes of semantically similar scenes: Class 1, Class 2, …, Class 1,000

1,000 classes × 10 scenes per class = 10,000 scenes

Dataset online [Zitnick and Parikh, CVPR 2013]

Slide credit: Larry Zitnick

Page 32

Slide credit: Larry Zitnick

Visual Features

Page 33

Visual Features

[Figure: scene annotated with visual features: Cloud, Cat, Basketball, Smile, Gaze, Person sitting, Tree, Person standing]

Slide credit: Larry Zitnick

Page 34

Visual Features

[Figure: scene annotated with visual features: Cloud, Cat, Basketball, Smile, Gaze, Person sitting, Tree, Person standing]

Which visual features are important for semantic meaning?

Which words correlate with specific visual features?

Slide credit: Devi Parikh

Page 35

Generate and Retrieve Scenes

Input: Jenny is catching the ball. Mike is kicking the ball. The table is next to the tree.

Tuples: <<Jenny>,<catch>,<ball>> <<Mike>,<kick>,<ball>> <<table>,<be>,<>>

[Figure: Automatically Generated scene vs. Human Generated scene]

Retrieval: score a database of scenes

Slide credit: Devi Parikh [Zitnick, Parikh and Vanderwende, ICCV 2013]
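As a rough illustration of the retrieval step, the sketch below represents each sentence as a (primary object, relation, secondary object) tuple, as on the slide, and ranks a small database of scenes by how many query tuples each scene contains. The slide does not say how scenes are scored, so exact-match tuple overlap is a stand-in chosen here for simplicity. The scene names and their annotations are hypothetical.

```python
from typing import Dict, List, Tuple

# (primary object, relation, secondary object); an empty string marks a missing slot.
Triple = Tuple[str, str, str]

# Tuples from the slide's example input.
query: List[Triple] = [
    ("Jenny", "catch", "ball"),
    ("Mike", "kick", "ball"),
    ("table", "be", ""),
]

# Hypothetical scene database: each scene is annotated with the tuples it depicts.
database: Dict[str, List[Triple]] = {
    "scene_a": [("Jenny", "catch", "ball"), ("Mike", "kick", "ball")],
    "scene_b": [("Mike", "kick", "ball"), ("Jenny", "sit", "")],
    "scene_c": [("cat", "sit", "")],
}


def score(query_tuples: List[Triple], scene_tuples: List[Triple]) -> float:
    """Fraction of query tuples found in the scene (stand-in scoring rule)."""
    if not query_tuples:
        return 0.0
    scene_set = set(scene_tuples)
    return sum(t in scene_set for t in query_tuples) / len(query_tuples)


def retrieve(query_tuples: List[Triple], scenes: Dict[str, List[Triple]]) -> List[Tuple[str, float]]:
    """Return (scene name, score) pairs, best match first."""
    scored = [(name, score(query_tuples, tuples)) for name, tuples in scenes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = retrieve(query, database)
```

Exact matching ignores everything the visual features could contribute (pose, gaze, relative position), so it should be read only as a picture of the data flow: sentences become tuples, and tuples are used to score scenes.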

Page 36

Slide credit: Devi Parikh [Antol, Zitnick and Parikh, ECCV 2014]

Learning Fine-grained Interactions

3x

Page 37

Learning Fine-grained Interactions

Train on abstract, test on real

Page 38

Results: 60 categories

[Bar chart: Accuracy (%), axis from 0 to 14, for Chance, Today's PoseDet, Today's PoseDet ++, and Perfect PoseDet]

• Domain adaptation
• Learn explicit mapping from abstract to real world
• Multi-label problem

[Antol, Zitnick and Parikh, ECCV 2014]
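The "train on abstract, test on real" protocol can be sketched as follows. This is a toy illustration rather than the model from the cited paper: it fits a nearest-centroid classifier on features from abstract scenes and evaluates it on features from real images. The 2-D feature vectors and the two category names are made up, and the chance level assumes 60 equally likely categories.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List


def train_centroids(features: List[List[float]], labels: List[str]) -> Dict[str, List[float]]:
    """Average the training feature vectors of each category."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids: Dict[str, List[float]], x: List[float]) -> str:
    """Assign the category whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


def accuracy(centroids: Dict[str, List[float]], features: List[List[float]], labels: List[str]) -> float:
    """Fraction of test examples assigned their true category."""
    hits = sum(predict(centroids, x) == y for x, y in zip(features, labels))
    return hits / len(labels)


# Made-up features: abstract scenes for training, real images for testing.
abstract_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
abstract_y = ["holding hands", "holding hands", "kicking", "kicking"]
real_x = [[0.1, 0.2], [0.8, 1.0], [0.6, 0.6]]
real_y = ["holding hands", "kicking", "holding hands"]

centroids = train_centroids(abstract_x, abstract_y)
real_accuracy = accuracy(centroids, real_x, real_y)

# Chance level for 60 balanced categories, in percent.
chance_percent = 100.0 / 60
```

The third real example falls closer to the wrong centroid, a small picture of the gap between the two domains that the slide's "domain adaptation" bullet refers to. With 60 balanced categories the chance baseline is about 1.7%, which is the reference point for reading the bar chart.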

Page 39

Visual Abstraction For…

• Studying mappings between images and text
• Zero-shot learning
• Studying image memorability, specificity, etc.
• Learning common sense knowledge
• Rich annotation modality
  – Ask for descriptions
  – Ask for scenes
  – Show scene and ask for modification

Goes beyond “Jenny and Mike.”

Study high-level image understanding tasks without waiting for lower-level vision tasks to be solved

Page 40

[Xinlei Chen]

Page 41

Conclusion

[Bar chart: accuracy of Machine vs. Human]

• Give computer vision systems access to common-sense knowledge
  – Communication with humans via attributes (text)
  – Visual abstraction
• Use humans for more than just “labels”

Slide credit: Devi Parikh

Page 42

Thank you.

Slide credit: Devi Parikh