Saliency prediction using deep learning techniques

Visual Saliency Prediction using Deep Learning Techniques

Author: Junting Pan

Advisor: Xavier Giró-i-Nieto

20/07/2014

2

OUTLINE

1. Motivation
2. Related Works
3. Methodology
4. Results
5. Conclusions

3

Let’s play a game!

4

SALIENCY PREDICTION

5

SALIENCY PREDICTION

What have you seen?

6

Tower

SALIENCY PREDICTION

7

Tower

SALIENCY PREDICTION

House

8

SALIENCY PREDICTION

Tower House

Rocks

9

SALIENCY PREDICTION

10

SALIENCY PREDICTION

Eye Tracker Mouse Click

11

LSUN SALIENCY CHALLENGE

12


13


14

OUTLINE

1. Motivation
2. Related Works
3. Methodology
4. Results
5. Conclusions

15

RELATED WORK: Deep Learning

@jponttuset

https://twitter.com/jponttuset


16

RELATED WORK: Deep Learning

Deep Learning

http://insights.venturescanner.com/category/artificial-intelligence-2/

17

RELATED WORK: ConvNet

A Krizhevsky, I Sutskever, GE Hinton “Imagenet classification with deep convolutional neural networks” Part of: Advances in Neural Information Processing Systems 25 (NIPS 2012)

Orange

https://scholar.google.es/citations?user=x04W_mMAAAAJ&hl=es&oi=sra

https://scholar.google.es/citations?user=JicYPdAAAAAJ&hl=es&oi=sra

http://papers.nips.cc/paper/4824-imagenet-classification-w



http://papers.nips.cc/book/advances-in-neural-information-processing-systems-25-2012



18











A Krizhevsky, I Sutskever, GE Hinton “Imagenet classification with deep convolutional neural networks” Part of: Advances in Neural Information Processing Systems 25 (NIPS 2012)

19










20


=Downsampling
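A minimal sketch of downsampling, assuming this slide refers to a pooling layer as used in the Krizhevsky et al. network. The 2x2 window, the function name `max_pool_2x2` and the toy feature map are illustrative choices, not taken from the slides.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Downsample a 2D feature map by keeping the maximum of each 2x2 block."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Drop a trailing odd row/column so the map tiles exactly into 2x2 blocks.
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    # Axes 1 and 3 index the position inside each 2x2 block.
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

fm = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 feature map
pooled = max_pool_2x2(fm)                      # 2x2 result, half the resolution
```

Each output value summarises four input values, so the spatial resolution is halved in both directions.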










21



ReLU (non-linearity)

f(x) = max(0,x)
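The formula above can be written directly in NumPy. This is only an illustrative sketch; the name `relu` and the sample values are assumptions.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)

# Negative inputs are clipped to zero, positive inputs pass through unchanged.
activations = relu(np.array([-2.0, -0.5, 0.0, 1.5]))
```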









22


Dot Product
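A small sketch of how a layer output reduces to dot products, assuming this slide illustrates a fully connected (dense) layer. The sizes, the variable names and the random values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(8)        # input features (hypothetical size)
W = rng.standard_normal((3, 8))   # one row of weights per output unit
b = np.zeros(3)                   # biases

# Each output unit is the dot product of its weight row with the input.
y = W @ x + b
```

The same idea applies to a convolutional filter: its response at one location is the dot product between the filter weights and the image patch under it.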










23

RELATED WORK: Conventional Saliency

Jianming Zhang, Stan Sclaroff. Saliency detection: a boolean map approach [ICCV 2013]

http://cs-people.bu.edu/jmzhang/BMS/BMS_iccv13_preprint.pdf


24

RELATED WORK: Deep Saliency

Kümmerer, Matthias, Lucas Theis, and Matthias Bethge. "Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet." arXiv preprint arXiv:1411.1045 (2014).

http://arxiv.org/abs/1411.1045

25

RELATED WORK: Deep Saliency

Vig, Eleonora, Michael Dorr, and David Cox. "Large-scale optimization of hierarchical features for saliency prediction in natural images." Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014.

http://dx.doi.org/10.1109/CVPR.2014.358

26

RELATED WORK: End-to-end Architecture

Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional networks for semantic segmentation." Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015.

http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf

27

OUTLINE


28

SALIENCY PREDICTION: JuntingNet

29


http://vision.princeton.edu/projects/2014/iSUN/



http://salicon.net/

http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Jiang_SALICON_Saliency_in_2015_CVPR_paper.html


30

SALIENCY PREDICTION: Data

Dataset               TRAIN     VALIDATION   TEST
SALICON               10,000    5,000        5,000
iSUN                  6,000     926          2,000
CAT2000 [Borji’15]    2,000     -            2,000
MIT300 [Judd’12]      300       -            -

Large Scale

http://salicon.net/




http://saliency.mit.edu/results_cat2000.html

http://saliency.mit.edu/results_mit300.html

31





http://salicon.net/

32

SALIENCY PREDICTION: Architecture

Upsample + filter

2D map

96x96    2304 = 48x48

IMAGE INPUT(RGB)

33


Upsample + filter

2D map

96x96    2304 = 48x48

3 CONV LAYERS

34


Upsample + filter

2D map

96x96    2304 = 48x48

2 DENSE LAYERS

35


Upsample + filter

2D map

96x96    2304 = 48x48
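The last stage of the architecture can be sketched as follows: the network output, a vector of 2304 values (48 x 48 = 2304), is reshaped into a 2D map, upsampled and filtered. The slides do not state the upsampling method, the filter or the target size, so nearest-neighbour upsampling, a Gaussian filter with `sigma=2.0` and a 96x96 output are assumptions, as is the function name.

```python
import numpy as np

def vector_to_saliency_map(vec, out_size=96, sigma=2.0):
    """Reshape a 2304-vector to 48x48, upsample it and smooth the result.

    out_size must be a multiple of 48 (assumption of this sketch).
    """
    small = np.asarray(vec, dtype=float).reshape(48, 48)
    factor = out_size // 48
    # Nearest-neighbour upsampling: repeat every pixel factor x factor times.
    up = np.kron(small, np.ones((factor, factor)))

    # Normalised 1D Gaussian kernel, applied along rows and then columns.
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()

    def blur_rows(m):
        return np.apply_along_axis(
            lambda row: np.convolve(row, kernel, mode="same"), 1, m)

    return blur_rows(blur_rows(up).T).T

saliency = vector_to_saliency_map(np.ones(2304))  # shape (96, 96)
```

In practice the map would be resized to the resolution of the original image; 96x96 is used here only because it matches the network input size on the slide.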

36





http://salicon.net/

http://www.iro.umontreal.ca/~lisa/pointeurs/theano_scipy2010.pdf

http://arxiv.org/pdf/1211.5590.pdf

http://deeplearning.net/software/theano/

37

SALIENCY PREDICTION: Overfitting

Overfitting: more than 20 million parameters

10,000 images for training

38

SALIENCY PREDICTION: Training

Data augmentation with horizontal mirroring.
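A minimal sketch of horizontal mirroring. The key point is that the image and its saliency map must be flipped together, otherwise the target no longer matches the input. The function name and the toy arrays are illustrative.

```python
import numpy as np

def mirror_pair(image, saliency_map):
    """Flip an (H, W, 3) image and its (H, W) saliency map left-to-right together."""
    return image[:, ::-1, :].copy(), saliency_map[:, ::-1].copy()

image = np.zeros((4, 6, 3))
saliency_map = np.zeros((4, 6))
saliency_map[1, 0] = 1.0  # a fixation on the left edge
flipped_image, flipped_map = mirror_pair(image, saliency_map)
# After mirroring, the fixation sits on the right edge.
```

Adding the mirrored copies doubles the number of training pairs at no annotation cost.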

39

SALIENCY PREDICTION: Training

We split the total training data into TWO parts:

80% Training

20% Validation (simultaneous testing)
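The 80/20 split can be sketched with a shuffled index array. The random shuffle, the seed and the helper name `split_indices` are assumptions; the slides only give the proportions.

```python
import numpy as np

def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

# With 10,000 images this yields 8,000 for training and 2,000 for validation.
train_idx, val_idx = split_indices(10000)
```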

40


Training curve of iSUN Database


41

SALIENCY PREDICTION: Training

Lower is better!

42



Number of iterations (Training time)

43



Longer is better?


44



If the validation loss stops decreasing...


45



If the validation loss stops decreasing...

DANGER OF OVERFITTING!

The model is learning from the data, NOT the problem itself
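One common response to a stalled validation loss is early stopping: keep the weights from the best validation epoch and stop once no improvement has been seen for a few epochs. The slides do not say whether this exact rule was used, so the sketch below, including the `patience` parameter and the sample losses, is an assumption.

```python
def best_epoch_with_patience(val_losses, patience=3):
    """Return (epoch, loss) of the best validation result before stopping.

    Training is considered stopped once `patience` epochs pass
    without a new minimum of the validation loss.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # validation loss has stopped decreasing
    return best_epoch, best_loss

# Hypothetical validation curve: improves, then flattens and rises.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
stop = best_epoch_with_patience(losses, patience=3)
```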


46


Training curve of SALICON Database

47


A: I have just shown you our best model.

B: Why is this the best model?

48

SALIENCY PREDICTION: Trial and Error

We tried many architectures, too many to list here.

49

SALIENCY PREDICTION: Trial and Error

50

SALIENCY PREDICTION: Trial and Error


51

SALIENCY PREDICTION: Trial and Error

52


Loss function           Mean Square Error (MSE)
Weight initialization   Gaussian distribution
Learning rate           0.03 to 0.0001
Mini-batch size         128
Training time           7 h (SALICON) / 4 h (iSUN)
Acceleration            SGD + Nesterov momentum (0.9)
Regularisation          Maxout norm
GPU                     NVIDIA GTX 980
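Some of these settings can be illustrated on a toy problem: an MSE loss, SGD with Nesterov momentum 0.9, a mini-batch of 128 samples and a learning rate going from 0.03 to 0.0001. The slides do not specify the shape of the learning-rate decay, so a linear schedule is assumed; the one-parameter model, the 200 steps and all function names are hypothetical.

```python
import numpy as np

def mse(pred, target):
    """Mean square error between prediction and target."""
    return float(np.mean((pred - target) ** 2))

def linear_schedule(start, stop, n_steps):
    """Learning rates decaying linearly from start to stop (assumed shape)."""
    return np.linspace(start, stop, n_steps)

def nesterov_step(w, velocity, grad_fn, lr, momentum=0.9):
    """One SGD step with Nesterov momentum.

    The gradient is evaluated at the look-ahead point w + momentum * velocity.
    """
    lookahead = w + momentum * velocity
    velocity = momentum * velocity - lr * grad_fn(lookahead)
    return w + velocity, velocity

# Toy problem: learn w such that w * x matches y = 2 * x.
x = np.linspace(-1, 1, 128)  # one mini-batch of 128 samples
y = 2.0 * x

def grad_fn(w):
    # Derivative of mean((w * x - y) ** 2) with respect to w.
    return float(np.mean(2 * (w * x - y) * x))

w, v = 0.0, 0.0
for lr in linear_schedule(0.03, 0.0001, 200):
    w, v = nesterov_step(w, v, grad_fn, lr)
final_loss = mse(w * x, y)
```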

53

OUTLINE


54

RESULTS: Qualitative (iSUN)

JuntingNet    Ground Truth    Pixels

55



56



57



58

RESULTS: Quantitative (iSUN)

Results from CVPR LSUN Challenge 2015

59

RESULTS: Qualitative (SALICON)


60



61



62



63

RESULTS: Quantitative (SALICON)

Results from CVPR LSUN Challenge 2015

64

RESULTS: First Position at LSUN Challenge

65

RESULTS: MIT Saliency Benchmark

Method                     Similarity   CC       AUC_shuffled   AUC_Borji   AUC_Judd
Baseline: infinite human   1            1        0.80           0.87        0.91
Deep Gaze                  0.39         0.48     0.66           0.85        0.84
eDN                        0.41         0.45     0.62           0.81        0.82
Our work                   0.4708       0.4285   0.5075         0.7416      0.7720
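Two of the metrics in the table, Similarity and CC, compare a predicted saliency map with a ground-truth map directly. The sketch below follows their usual definitions (histogram intersection and Pearson correlation) and assumes non-negative, non-constant maps; it is not the benchmark's own evaluation code, and the function names are illustrative.

```python
import numpy as np

def correlation_coefficient(pred, gt):
    """CC: Pearson linear correlation between two saliency maps."""
    p = (pred - pred.mean()) / pred.std()
    g = (gt - gt.mean()) / gt.std()
    return float(np.mean(p * g))

def similarity(pred, gt):
    """Similarity: normalise each map to sum to 1, then sum the element-wise minima."""
    p = pred / pred.sum()
    g = gt / gt.sum()
    return float(np.minimum(p, g).sum())

rng = np.random.default_rng(0)
gt_map = rng.random((48, 48))      # hypothetical ground-truth map
cc_same = correlation_coefficient(gt_map, gt_map)
sim_same = similarity(gt_map, gt_map)
```

Both scores reach 1 when the prediction equals the ground truth, which is why the table lists 1 for the human baseline in those two columns.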

Torralba, Antonio, and Alexei Efros. "Unbiased look at dataset bias." Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5995347&tag=1

66

Future Work

Method                     Similarity   CC       AUC_shuffled   AUC_Borji   AUC_Judd
Baseline: infinite human   1            1        0.80           0.87        0.91
Deep Gaze                  0.39         0.48     0.66           0.85        0.84
SalNet                     0.52         0.58     0.69           0.82        0.83
eDN                        0.41         0.45     0.62           0.81        0.82
Our work                   0.4708       0.4285   0.5075         0.7416      0.7720

Torralba, Antonio, and Alexei Efros. "Unbiased look at dataset bias." Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011

K. McGuinness

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5995347&tag=1

67

RESULTS: Dissemination

http://bit.ly/juntingnet

Preprint Open Source Software & Models






68

RESULTS: Dissemination

Article highlighted at www.upc.edu on 17 July 2015

http://www.upc.edu

69

OUTLINE


70

LSUN SALIENCY CHALLENGE: A Déjà vu ?

John Markoff, “Scientists See Promise in Deep-Learning Programs”, The New York Times (Nov. 2012).

Photo: Keith Penner

http://www.nytimes.com/2012/11/24/science/scientists-see-advances-in-deep-learning-a-part-of-artificial-intelligence.html?_r=0

71

ACKNOWLEDGMENTS

Xavier Giró Nieto
Carlos Segura
Carles Fernández
Albert Gil
Victor Campos
Enric Monte
Elisa Sayrol
Edu Fontdevila
Míriam Bellver
Amaia Salvador
Marc Carné
Javier Hernando
Javier Vera
All my family members and friends

72

Thank you!

