Visual Saliency Prediction using Deep Learning Techniques
Author: Junting Pan
Advisor: Xavier Giró-i-Nieto
20/07/2014
Aug 12, 2015
2
OUTLINE
1. Motivation
2. Related Works
3. Methodology
4. Results
5. Conclusions
3
Let’s play a game!
4
SALIENCY PREDICTION
5
SALIENCY PREDICTION
What have you seen?
6
Tower
SALIENCY PREDICTION
7
Tower
SALIENCY PREDICTION
House
8
SALIENCY PREDICTION
Tower House
Rocks
9
SALIENCY PREDICTION
10
SALIENCY PREDICTION
Eye Tracker Mouse Click
11
LSUN SALIENCY CHALLENGE
12
LSUN SALIENCY CHALLENGE
13
LSUN SALIENCY CHALLENGE
14
OUTLINE
1. Motivation
2. Related Works
3. Methodology
4. Results
5. Conclusions
15
RELATED WORK: Deep Learning
@jponttuset
16
RELATED WORK: Deep Learning
Deep Learning
http://insights.venturescanner.com/category/artificial-intelligence-2/
17
RELATED WORK: ConvNet
A Krizhevsky, I Sutskever, GE Hinton “Imagenet classification with deep convolutional neural networks” Part of: Advances in Neural Information Processing Systems 25 (NIPS 2012)
Orange
18
RELATED WORK: ConvNet
A Krizhevsky, I Sutskever, GE Hinton “Imagenet classification with deep convolutional neural networks” Part of: Advances in Neural Information Processing Systems 25 (NIPS 2012)
19
RELATED WORK: ConvNet
20
RELATED WORK: ConvNet
=Downsampling
A Krizhevsky, I Sutskever, GE Hinton “Imagenet classification with deep convolutional neural networks” Part of: Advances in Neural Information Processing Systems 25 (NIPS 2012)
21
RELATED WORK: ConvNet
A Krizhevsky, I Sutskever, GE Hinton “Imagenet classification with deep convolutional neural networks” Part of: Advances in Neural Information Processing Systems 25 (NIPS 2012)
ReLU (non-linearity)
f(x) = max(0,x)
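The rectifier above can be written in a couple of lines. This is a minimal sketch in plain Python; the function name `relu` and the sample activations are illustrative only and do not come from the slides.

```python
def relu(x):
    """Rectified linear unit: f(x) = max(0, x)."""
    return max(0.0, x)


# Negative activations are clipped to zero, positive ones pass through.
activations = [-2.0, -0.5, 0.0, 1.5, 3.0]
rectified = [relu(a) for a in activations]
print(rectified)
```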
22
RELATED WORK: ConvNet
Dot Product
A Krizhevsky, I Sutskever, GE Hinton “Imagenet classification with deep convolutional neural networks” Part of: Advances in Neural Information Processing Systems 25 (NIPS 2012)
23
RELATED WORK: Conventional Saliency
Jianming Zhang, Stan Sclaroff. Saliency detection: a boolean map approach [ICCV 2013]
24
RELATED WORK: Deep Saliency
Kümmerer, Matthias, Lucas Theis, and Matthias Bethge. "Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet." arXiv preprint arXiv:1411.1045 (2014).
25
RELATED WORK: Deep Saliency
Vig, Eleonora, Michael Dorr, and David Cox. "Large-scale optimization of hierarchical features for saliency prediction in natural images." Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014.
26
RELATED WORK: End-to-end Architecture
Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional networks for semantic segmentation." Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015.
27
OUTLINE
1. Motivation
2. Related Works
3. Methodology
4. Results
5. Conclusions
28
SALIENCY PREDICTION: JuntingNet
29
SALIENCY PREDICTION: JuntingNet
30
SALIENCY PREDICTION: Data
Dataset              TRAIN    VALIDATION  TEST
SALICON              10,000   5,000       5,000
iSUN                 6,000    926         2,000
CAT2000 [Borji’15]   2,000    -           2,000
MIT300 [Judd’12]     300      -           -
Large scale: SALICON and iSUN
31
SALIENCY PREDICTION: JuntingNet
32
SALIENCY PREDICTION: Architecture
Upsample + filter
2D map
96x96 2304 = 48x48
IMAGE INPUT(RGB)
33
SALIENCY PREDICTION: Architecture
Upsample + filter
2D map
96x96 2304 = 48x48
3 CONV LAYERS
34
SALIENCY PREDICTION: Architecture
Upsample + filter
2D map
96x96 2304 = 48x48
2 DENSE LAYERS
35
SALIENCY PREDICTION: Architecture
Upsample + filter
2D map
96x96 2304 = 48x48
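The last stage of the diagram, where the outputs of the final dense layer are arranged as a 48x48 map and then upsampled and filtered, could look roughly like the sketch below. The slides only say "Upsample + filter", so the nearest-neighbour upsampling, the factor of 2 (chosen here only so the result matches the 96x96 input size) and the 3x3 mean filter are all assumptions, and the helper names are hypothetical.

```python
def to_map(vector, side=48):
    """Arrange a flat list of side*side values as a side x side 2D map."""
    assert len(vector) == side * side
    return [vector[r * side:(r + 1) * side] for r in range(side)]


def upsample(sal_map, factor):
    """Nearest-neighbour upsampling: repeat every pixel `factor` times per axis."""
    out = []
    for row in sal_map:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def box_filter(sal_map):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(sal_map), len(sal_map[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [sal_map[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            out[r][c] = sum(window) / len(window)
    return out


# Stand-in for the network outputs (assumed data): a single active unit.
outputs = [0.0] * (48 * 48)
outputs[24 * 48 + 24] = 1.0

saliency = box_filter(upsample(to_map(outputs), factor=2))
print(len(saliency), len(saliency[0]))
```

The smoothing step spreads the single active unit over its neighbours, which is the kind of blob-like map a saliency model is expected to output.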
36
SALIENCY PREDICTION: JuntingNet
37
SALIENCY PREDICTION: Overfitting
Overfitting: more than 20 million parameters
Only 10,000 images for training
38
SALIENCY PREDICTION: Training
Data augmentation with horizontal mirroring.
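Horizontal mirroring doubles the training set at almost no cost. A minimal sketch, assuming images and saliency maps are stored as lists of rows; the function names and the tiny arrays are hypothetical.

```python
def mirror(rows):
    """Flip a 2D array (list of rows) left to right."""
    return [row[::-1] for row in rows]


def augment(pairs):
    """Return the original (image, saliency_map) pairs plus mirrored copies.

    The saliency map is flipped together with its image, otherwise the
    target would no longer match the input.
    """
    mirrored = [(mirror(img), mirror(sal)) for img, sal in pairs]
    return pairs + mirrored


image = [[1, 2, 3],
         [4, 5, 6]]
sal_map = [[0.0, 0.1, 0.9],
           [0.0, 0.2, 0.8]]

dataset = augment([(image, sal_map)])
print(len(dataset))
print(dataset[1][0])
```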
39
SALIENCY PREDICTION: Training
We split the total training data into TWO parts:
80% Training
20% Validation (simultaneous testing)
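The 80/20 split can be sketched as follows. Shuffling before the cut and the fixed seed are assumptions made here for reproducibility; the slides do not say how the samples were assigned to each part, and the sample IDs are placeholders.

```python
import random


def split_train_validation(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into training and validation parts."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(samples)))
    train = [samples[i] for i in indices[:cut]]
    validation = [samples[i] for i in indices[cut:]]
    return train, validation


samples = list(range(1000))  # placeholder sample IDs
train, validation = split_train_validation(samples)
print(len(train), len(validation))
```

The validation part is never used to update the weights; it is only evaluated during training to monitor how well the model generalises.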
40
SALIENCY PREDICTION: Training
Training curve of iSUN Database
41
SALIENCY PREDICTION: Training
Lower is better!
42
SALIENCY PREDICTION: Training
Training curve of iSUN Database
Number of iterations (Training time)
43
SALIENCY PREDICTION: Training
Number of iterations (Training time)
Longer is better?
Training curve of iSUN Database
44
SALIENCY PREDICTION: Training
Number of iterations (Training time)
If the validation loss stops decreasing...
Training curve of iSUN Database
45
SALIENCY PREDICTION: Training
Number of iterations (Training time)
If the validation loss stops decreasing...
DANGER OF OVERFITTING!
The model is learning from the data, NOT the problem itself
Training curve of iSUN Database
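One common way to act on this observation is early stopping: keep the model from the iteration with the lowest validation loss. The sketch below is illustrative only. The slides do not state that early stopping was used, the `patience` parameter is an assumption, and the loss values are made up to mimic the shape of the curves.

```python
def best_iteration(val_losses, patience=3):
    """Return the index of the lowest validation loss seen before the loss
    fails to improve for `patience` consecutive checks."""
    best_idx, best_loss, waited = 0, float("inf"), 0
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss, waited = i, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss stopped decreasing
    return best_idx


# Made-up curves: training loss keeps falling, validation loss turns upward.
train_losses = [0.90, 0.60, 0.45, 0.35, 0.28, 0.22, 0.17, 0.13]
val_losses = [0.95, 0.70, 0.58, 0.52, 0.53, 0.55, 0.58, 0.62]

stop_at = best_iteration(val_losses)
print(stop_at, train_losses[stop_at], val_losses[stop_at])
```

After the selected iteration the training loss still improves while the validation loss grows, which is the overfitting pattern the slide warns about.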
46
SALIENCY PREDICTION: Training
Training curve of SALICON Database
47
SALIENCY PREDICTION: Training
A: I have just shown you our best model.
B: Why is this the best model?
48
SALIENCY PREDICTION: Trial and Error
We tried many architectures, too many to be listed here.
49
SALIENCY PREDICTION: Trial and Error
We tried many architectures, too many to be listed here.
50
SALIENCY PREDICTION: Trial and Error
We tried many architectures, too many to be listed here.
51
SALIENCY PREDICTION: Trial and Error
52
SALIENCY PREDICTION: Training
Loss function          Mean Squared Error (MSE)
Weight initialization  Gaussian distribution
Learning rate          0.03 to 0.0001
Mini-batch size        128
Training time          7 h (SALICON) / 4 h (iSUN)
Acceleration           SGD + Nesterov momentum (0.9)
Regularisation         Maxout norm
GPU                    NVIDIA GTX 980
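Three of the settings above (MSE loss, a learning rate decayed from 0.03 to 0.0001, and SGD with Nesterov momentum 0.9) can be illustrated on a toy problem. This is a sketch under stated assumptions, not the training code behind the slides: the linear decay shape, the scalar model, the toy data and the number of epochs are all hypothetical.

```python
def mse(predictions, targets):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def learning_rate(epoch, n_epochs, start=0.03, end=0.0001):
    """Learning rate for `epoch`, decayed linearly from `start` to `end`.

    The linear shape is an assumption; the slide only gives the endpoints.
    """
    if n_epochs == 1:
        return start
    return start + (end - start) * epoch / (n_epochs - 1)


def grad_mse(w, xs, ys):
    """Gradient of mse(w * x, y) with respect to the scalar weight w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


# Toy problem (hypothetical data): recover w = 2 from y = 2x.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * x for x in xs]

w, velocity, momentum, n_epochs = 0.0, 0.0, 0.9, 200
for epoch in range(n_epochs):
    lr = learning_rate(epoch, n_epochs)
    # Nesterov momentum: evaluate the gradient at the look-ahead point.
    g = grad_mse(w + momentum * velocity, xs, ys)
    velocity = momentum * velocity - lr * g
    w += velocity

final_loss = mse([w * x for x in xs], ys)
print(f"w = {w:.4f}, loss = {final_loss:.2e}")
```

Starting with a large step and finishing with a small one lets the optimiser move quickly at first and settle near the minimum at the end.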
53
OUTLINE
1. Motivation
2. Related Works
3. Methodology
4. Results
5. Conclusions
54
RESULTS: Qualitative (iSUN)
JuntingNet   Ground Truth   Pixels
55
RESULTS: Qualitative (iSUN)
JuntingNet   Ground Truth   Pixels
56
RESULTS: Qualitative (iSUN)
JuntingNet   Ground Truth   Pixels
57
RESULTS: Qualitative (iSUN)
JuntingNet   Ground Truth   Pixels
58
RESULTS: Quantitative (iSUN)
Results from CVPR LSUN Challenge 2015
59
RESULTS: Qualitative (SALICON)
JuntingNet   Ground Truth   Pixels
60
RESULTS: Qualitative (SALICON)
JuntingNet   Ground Truth   Pixels
61
RESULTS: Qualitative (SALICON)
JuntingNet   Ground Truth   Pixels
62
RESULTS: Qualitative (SALICON)
JuntingNet   Ground Truth   Pixels
63
RESULTS: Quantitative (SALICON)
Results from CVPR LSUN Challenge 2015
64
RESULTS: First Position at LSUN Challenge
65
RESULTS: MIT Saliency Benchmark
Method                    Similarity  CC      AUC_shuffled  AUC_Borji  AUC_Judd
Baseline: infinite human  1           1       0.80          0.87       0.91
Deep Gaze                 0.39        0.48    0.66          0.85       0.84
eDN                       0.41        0.45    0.62          0.81       0.82
Our work                  0.4708      0.4285  0.5075        0.7416     0.7720
Torralba, Antonio, and Alexei Efros. "Unbiased look at dataset bias." Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011
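Two of the columns in the table, Similarity and CC, are simple to compute from a predicted map and a ground-truth map. The sketch below uses their usual definitions (histogram intersection of the sum-normalised maps, and Pearson's linear correlation coefficient). It is not the benchmark's own evaluation code, and the 2x2 maps are made-up values.

```python
import math


def flatten(sal_map):
    """Turn a list of rows into one flat list of values."""
    return [v for row in sal_map for v in row]


def similarity(pred, truth):
    """Histogram intersection: sum of element-wise minima after each map
    is normalised to sum to 1. A value of 1 means identical distributions."""
    p, t = flatten(pred), flatten(truth)
    sp, st = sum(p), sum(t)
    return sum(min(a / sp, b / st) for a, b in zip(p, t))


def correlation(pred, truth):
    """Pearson linear correlation coefficient (CC) between two maps."""
    p, t = flatten(pred), flatten(truth)
    mp, mt = sum(p) / len(p), sum(t) / len(t)
    cov = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    var_p = sum((a - mp) ** 2 for a in p)
    var_t = sum((b - mt) ** 2 for b in t)
    return cov / math.sqrt(var_p * var_t)


# Made-up 2x2 maps for illustration.
truth = [[0.0, 0.1], [0.2, 0.7]]
pred = [[0.1, 0.1], [0.3, 0.5]]
print(round(similarity(pred, truth), 3), round(correlation(pred, truth), 3))
```

Both scores reach 1 when the prediction equals the ground truth, which is why the "infinite human" baseline row shows 1 in these two columns.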
66
Future Work
Method                    Similarity  CC      AUC_shuffled  AUC_Borji  AUC_Judd
Baseline: infinite human  1           1       0.80          0.87       0.91
Deep Gaze                 0.39        0.48    0.66          0.85       0.84
SalNet                    0.52        0.58    0.69          0.82       0.83
eDN                       0.41        0.45    0.62          0.81       0.82
Our work                  0.4708      0.4285  0.5075        0.7416     0.7720
Torralba, Antonio, and Alexei Efros. "Unbiased look at dataset bias." Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011
K. McGuinness
67
RESULTS: Dissemination
Preprint: http://arxiv.org/abs/1507.01422
Open Source Software & Models: http://bit.ly/juntingnet
68
RESULTS: Dissemination
Article highlighted at www.upc.edu
on 17 July 2015
69
OUTLINE
1. Motivation
2. Related Works
3. Methodology
4. Results
5. Conclusions
70
LSUN SALIENCY CHALLENGE: A Déjà vu ?
John Markoff, “Scientists See Promise in Deep-Learning Programs”, The New York Times (Nov 2012).
Photo: Keith Penner
71
ACKNOWLEDGMENTS
Xavier Giró Nieto
Carlos Segura
Carles Fernández
Albert Gil
Victor Campos
Enric Monte
Elisa Sayrol
Edu Fontdevila
Míriam Bellver
Amaia Salvador
Marc Carné
Javier Hernando
Javier Vera
All my family members and friends
72
Thank you!