Analysis of Large Scale Visual Recognition Fei-Fei Li and Olga Russakovsky Olga Russakovsky, Jia Deng, Zhiheng Huang, Alex Berg, Li Fei-Fei Detecting avocados to zucchinis: what have we done, and where are we going? ICCV 2013 http://image-net.org/challenges/LSVRC/2012/analysis
Transcript
Analysis of Large Scale Visual Recognition
Fei-Fei Li and Olga Russakovsky
Olga Russakovsky, Jia Deng, Zhiheng Huang, Alex Berg, Li Fei-Fei. Detecting avocados to zucchinis: what have we done, and where are we going? ICCV 2013. http://image-net.org/challenges/LSVRC/2012/analysis
What happens under the hood on classification+localization?
Preliminaries:
  • ILSVRC-500 (2012) dataset – similar to PASCAL
  • Leading algorithms
• A closer look at small objects
• A closer look at textured objects
• 630M connections
• Rectified Linear Units, max pooling, dropout trick
• Randomly extracted 224x224 patches for more data
• Trained with SGD on two GPUs for a week, fully supervised
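The "randomly extracted 224x224 patches" bullet can be illustrated with a small sketch. This is not the authors' code: the function name `random_crop_box`, the 256x256 input size, and the use of Python's `random` module are assumptions made for illustration. It only picks the crop window; applying it to pixel data is left to whatever image library is in use.

```python
import random

def random_crop_box(height, width, crop=224, rng=random):
    """Pick a random crop x crop window inside a height x width image.

    Returns (top, left, bottom, right) with exclusive bottom/right.
    Hypothetical helper, not taken from the slides.
    """
    if height < crop or width < crop:
        raise ValueError("image is smaller than the crop size")
    top = rng.randint(0, height - crop)    # randint is inclusive on both ends
    left = rng.randint(0, width - crop)
    return top, left, top + crop, left + crop

# Assumed 256x256 training image; each call yields a different 224x224 view.
rng = random.Random(0)
for _ in range(4):
    top, left, bottom, right = random_crop_box(256, 256, rng=rng)
    print(top, left, bottom - top, right - left)
```

Every distinct window acts as an additional training example drawn from the same labeled image, which is the sense in which cropping gives "more data".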
What happens under the hood on classification+localization?
Preliminaries:
  • ILSVRC-500 (2012) dataset – similar to PASCAL
  • Leading algorithms: SV and VGG
• A closer look at small objects
• A closer look at textured objects
Results on ILSVRC-500
[Figure: classification+localization accuracy. SV: 54.3%; VGG: 45.8%]
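For readers unfamiliar with the cls+loc metric, here is a minimal sketch of how a single prediction might be scored. The slides do not spell out the criterion; the 0.5 intersection-over-union threshold and the single-guess setup below are assumptions based on common ILSVRC practice, and the function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def cls_loc_correct(pred_label, pred_box, true_label, true_box, threshold=0.5):
    """Count a prediction as correct only if the label matches AND the box
    overlaps the ground truth by at least `threshold` IoU (assumed 0.5)."""
    return pred_label == true_label and iou(pred_box, true_box) >= threshold

# Right label, good box: correct. Right label, poor box: a localization miss.
print(cls_loc_correct("steel drum", (0, 0, 10, 10), "steel drum", (0, 0, 10, 8)))
print(cls_loc_correct("steel drum", (0, 0, 2, 2), "steel drum", (1, 1, 3, 3)))
```

Under this kind of criterion a method can classify well and still lose cls+loc accuracy on poorly placed boxes, which is the gap the following slides examine.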
Difference in accuracy: SV versus VGG
Classification-only
[Figure: per-class difference in classification accuracy (SV - VGG) plotted against object scale. Example classes labeled on the plot: folding chair, Persian cat, loud speaker, steel drum, picket fence. SV better on 452 classes; VGG better on 34 classes. In the final build, asterisks annotate the plot in two groups labeled "SV beats VGG" and "VGG beats SV".]
Difference in accuracy: SV versus VGG
Classification+Localization
[Figure: per-class difference in cls+loc accuracy (SV - VGG) plotted against object scale, shown alongside the classification-only panel. SV better on 338 classes; VGG better on 150 classes.]
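The "SV better / VGG better" class counts come from comparing per-class accuracies. A minimal sketch of that comparison follows; the function name and the four accuracy values are made up for illustration and are not results from the talk.

```python
def compare_per_class(acc_sv, acc_vgg):
    """Return per-class differences (SV - VGG) and how many classes each wins.

    acc_sv, acc_vgg: dicts mapping class name -> accuracy in [0, 1].
    Ties are counted for neither method.
    """
    diffs = {c: acc_sv[c] - acc_vgg[c] for c in acc_sv if c in acc_vgg}
    sv_better = sum(1 for d in diffs.values() if d > 0)
    vgg_better = sum(1 for d in diffs.values() if d < 0)
    return diffs, sv_better, vgg_better

# Hypothetical accuracies, chosen only to exercise the function.
acc_sv = {"folding chair": 0.75, "Persian cat": 0.5, "picket fence": 0.5}
acc_vgg = {"folding chair": 0.5, "Persian cat": 0.75, "picket fence": 0.5}
diffs, sv_better, vgg_better = compare_per_class(acc_sv, acc_vgg)
print(sv_better, vgg_better)  # 1 1
```

Plotting each difference against the class's average object scale gives the kind of scatter shown on these slides.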
Cumulative accuracy across scales
[Figure: two panels, Classification-only and Classification+Localization. x-axis: object scale; y-axis: cumulative cls. accuracy (left) and cumulative cls+loc accuracy (right); curves for SV and VGG. In the final build, the classification+localization panel marks object scale 0.24 and the 205 smallest object classes, with the label VGG.]
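One way to read a cumulative-accuracy curve is sketched below. The slides do not define the quantity, so this assumes it is the mean per-class accuracy over all classes up to a given object scale, with classes sorted from smallest to largest; the function name and the toy numbers are hypothetical.

```python
def cumulative_accuracy(scales, accuracies):
    """Sort classes by object scale and return (scale, running mean accuracy).

    Assumed definition: the value at the k-th point is the average accuracy
    of the k smallest-scale classes.
    """
    order = sorted(range(len(scales)), key=lambda i: scales[i])
    curve = []
    running_sum = 0.0
    for rank, i in enumerate(order, start=1):
        running_sum += accuracies[i]
        curve.append((scales[i], running_sum / rank))
    return curve

# Toy data: three classes with object scales 0.5, 0.1, 0.3.
print(cumulative_accuracy([0.5, 0.1, 0.3], [1.0, 0.0, 0.5]))
# [(0.1, 0.0), (0.3, 0.25), (0.5, 0.5)]
```

Computing this curve once per method and overlaying the two would show where, along the scale axis, one method overtakes the other.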
What happens under the hood on classification+localization?
Preliminaries:
  • ILSVRC-500 (2012) dataset – similar to PASCAL
  • Leading algorithms: SV and VGG
• SV always great at classification, but VGG does better than SV at localizing small objects
• A closer look at textured objects
WHY?
Textured objects (ILSVRC-500)
Amount of texture: Low to High

                No texture   Low texture   Medium texture   High texture
# classes       116          189           143              52
Object scale    20.8%        23.7%         23.5%            25.0%
Textured objects (ILSVRC-500)
Amount of texture: Low to High
No texture, Low texture, Medium texture, High texture
# classes: 116 189 149 143 115 52 35
Localizing textured objects (416 classes, same average object scale at each level of texture)
[Figure: localization accuracy versus level of texture, with curves for SV and VGG. A second version of the plot is labeled "On correctly classified images".]
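The two versions of the texture plot differ in which images are counted. A rough sketch of that distinction follows, assuming each image is summarized by its texture level, whether it was classified correctly, and whether its box was localized correctly. The record layout, the function name, and the sample records are all hypothetical.

```python
from collections import defaultdict

def localization_accuracy_by_texture(records, only_correctly_classified=False):
    """records: iterable of (texture_level, classified_ok, localized_ok).

    With only_correctly_classified=False, an image counts as a hit only if it
    is both classified and localized correctly (assumed reading of the first
    plot). With True, misclassified images are dropped before averaging.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for level, classified_ok, localized_ok in records:
        if only_correctly_classified and not classified_ok:
            continue
        totals[level] += 1
        hits[level] += int(classified_ok and localized_ok)
    return {level: hits[level] / totals[level] for level in totals}

# Made-up records, for illustration only.
records = [
    ("none", True, True), ("none", False, True),
    ("high", True, False), ("high", True, True),
]
print(localization_accuracy_by_texture(records))
print(localization_accuracy_by_texture(records, only_correctly_classified=True))
```

Restricting to correctly classified images separates localization quality from classification quality, so a trend across texture levels cannot be attributed to classification alone.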
What happens under the hood on classification+localization?
Preliminaries:
  • ILSVRC-500 (2012) dataset – similar to PASCAL
  • Leading algorithms: SV and VGG
• SV always great at classification, but VGG does better than SV at localizing small objects
• Textured objects easier to localize, especially for SV
ILSVRC 2013 with large-scale object detection
http://image-net.org/challenges/LSVRC/2013/
Fully annotated 200 object classes across 60,000 images
Allows evaluation of generic object detection in cluttered scenes at scale