Capturing Human Insight for Visual Learning Kristen Grauman Department of Computer Science University of Texas at Austin Work with Sudheendra Vijayanarasimhan, Adriana Kovashka, Devi Parikh, Prateek Jain, Sung Ju Hwang, and Jeff Donahue Frontiers in Computer Vision Workshop, MIT August 22, 2011
Transcript
Capturing Human Insight for Visual Learning
Kristen Grauman, Department of Computer Science
University of Texas at Austin
Work with Sudheendra Vijayanarasimhan, Adriana Kovashka, Devi Parikh, Prateek Jain, Sung Ju Hwang,
and Jeff Donahue
Frontiers in Computer Vision Workshop, MIT, August 22, 2011
Problem: how to capture human insight about the visual world?
The complex space of visual objects, activities, and scenes.
[tiny image montage by Torralba et al.]
Annotator
Our approach:
Listen: explanations, comparisons, implied cues
Ask: actively learn
Traditional active learning
Unlabeled data
Labeled data
Current Model
Active Selection
Annotator
At each cycle, obtain label for the most informative or uncertain example. [Mackay 1992, Freund et al. 1997, Tong & Koller 2001, Lindenbaum et al. 2004, Kapoor et al. 2007,…]
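The selection step above can be sketched with the classic "simple margin" criterion: query the unlabeled example closest to the current decision boundary. This is a minimal illustration, not the speakers' code; the weight vector and feature pool here are hypothetical stand-ins for a learned classifier and image features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical current model: a linear classifier with weights w,
# plus a pool of unlabeled feature vectors.
w = np.array([1.0, -0.5])
X_unlabeled = rng.normal(size=(200, 2))

# Uncertainty (simple-margin) sampling, as in Tong & Koller 2001:
# the most informative query is the example whose score is closest
# to the decision boundary w . x = 0.
margins = np.abs(X_unlabeled @ w)
query_idx = int(np.argmin(margins))

# The selected example goes to the annotator; its label is added to
# the labeled set and the model is retrained, closing the loop.
```

In practice the loop alternates selection, annotation, and retraining until the labeling budget is exhausted.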
[Diagram: the same active selection loop (unlabeled data, active selection, annotator, labeled data, current model), now annotated with varying annotation costs ($).]
• Annotation tasks vary in cost and informativeness
• Multiple annotators working in parallel
• Massive unlabeled pools of data
[Vijayanarasimhan & Grauman NIPS 2008, CVPR 2009, Vijayanarasimhan et al. CVPR 2010, CVPR 2011, Kovashka et al. ICCV 2011]
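The cost-sensitive idea behind these papers can be illustrated with a toy value-per-cost rule: among candidate (example, annotation-type) pairs, ask the question with the best expected informativeness per unit annotation cost. The candidate list, gain estimates, and costs below are entirely hypothetical; the cited work estimates both quantities from data rather than hard-coding them.

```python
# Hypothetical candidates: (example_id, annotation_type,
#                           estimated_info_gain, cost_in_seconds).
candidates = [
    (0, "image label",    0.30,  5.0),
    (0, "object outline", 0.55, 30.0),
    (1, "image label",    0.40,  5.0),
    (2, "object outline", 0.80, 30.0),
]

def value_per_cost(c):
    """Expected informativeness per second of annotator effort."""
    _, _, gain, cost = c
    return gain / cost

# A cheap image label can beat a more informative but costly outline.
best = max(candidates, key=value_per_cost)
print(best)  # → (1, "image label", 0.4, 5.0)
```

Note that the raw-gain winner (the 0.80-gain outline) loses once cost is factored in, which is the point of cost-sensitive question asking.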
Challenges in active visual learning
Current classifier
Unlabeled data
Sub-linear time active selection
[Jain, Vijayanarasimhan, Grauman, NIPS 2010]
[Diagram: hash table with buckets indexed by binary codes such as 110, 111, 101.]
We propose a novel hashing approach to identify the most uncertain examples in sub-linear time.
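The retrieval step can be sketched as follows. This is a simplified illustration in the spirit of the two-bit hyperplane hash of Jain et al. (NIPS 2010), not the paper's exact scheme: database points hash with two random directions per bit pair, while the hyperplane query flips the sign of the second bit, so points nearly perpendicular to the classifier's normal (i.e., near the boundary) tend to collide with the query. All dimensions, sizes, and data here are hypothetical.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(1)
d, n, k = 16, 5000, 3            # feature dim, pool size, hash bit-pairs

X = rng.normal(size=(n, d))      # unlabeled pool (stand-in features)
w = rng.normal(size=d)           # current classifier's hyperplane normal

# Random projection directions; each of the k bit pairs uses (u, v).
U = rng.normal(size=(k, d))
V = rng.normal(size=(k, d))

def hash_point(x):
    # Database points: [sign(u.x), sign(v.x)] for each pair.
    return tuple(np.concatenate([U @ x, V @ x]) >= 0)

def hash_hyperplane(w):
    # Hyperplane query: flip the sign of the second projection, so
    # near-perpendicular points fall in the matching bucket.
    return tuple(np.concatenate([U @ w, -(V @ w)]) >= 0)

# Build the table once over the unlabeled pool.
table = defaultdict(list)
for i, x in enumerate(X):
    table[hash_point(x)].append(i)

# Query touches one bucket instead of scanning all n points; colliding
# candidates are then verified exactly by distance to the hyperplane.
candidates = table[hash_hyperplane(w)]
if candidates:
    best = min(candidates, key=lambda i: abs(X[i] @ w))
```

With a handful of bit pairs the bucket holds a small fraction of the pool, which is where the sub-linear query time comes from; the paper gives the collision-probability guarantees.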
Actively selected examples
For 4.5 million unlabeled instances, selection takes 10 minutes of machine time per iteration, vs. 60 hours for a naïve linear scan.
Live active learning results on Flickr test set
Outperforms status quo data collection approach
[Vijayanarasimhan & Grauman, CVPR 2011]
Summary
• Humans are not simply “label machines”
• Widen access to visual knowledge
– New forms of input, often requiring associated new learning algorithms
• Manage large-scale annotation efficiently
– Cost-sensitive active question asking