Visual Memorability for Egocentric Cameras Marc Carné Herrera Advisors: Xavier Giró-i-Nieto and Cathal Gurrin

Apr 08, 2017

Page 1: Visual Memorability for Egocentric Cameras

Visual Memorability for Egocentric Cameras

Marc Carné Herrera

Advisors: Xavier Giró-i-Nieto and Cathal Gurrin

Page 2: Visual Memorability for Egocentric Cameras

Acknowledgements

Petia Radeva
Maite Garolera
Albert Gil
Josep Pujal

Page 3: Visual Memorability for Egocentric Cameras

Outline

➔ Introduction
➔ Contributions
  ◆ Annotation tool for visual memorability
  ◆ EgoMemNet: visual memorability adaptation to egocentric images
  ◆ Visual memorability and physiological signals
➔ Conclusions

Page 5: Visual Memorability for Egocentric Cameras

Introduction

“Brain is designed to forget in order to survive”

● Lifelogger → a person who captures their daily life in order to create a virtual, digital memory.
● Wearable cameras → capture first-person vision.
● Big data → 1,400 to 2,000 images/day.
● Challenge → retrieval!

Page 6: Visual Memorability for Egocentric Cameras

Introduction

What do we want to remember?

Page 7: Visual Memorability for Egocentric Cameras

Motivation

● Cognitive therapy → Alzheimer's patients, reminiscence therapy.

Page 8: Visual Memorability for Egocentric Cameras

Image set

Relevant images with low-level features from a CNN

Relevant images with object detection, face detection… (based on content)

Page 9: Visual Memorability for Egocentric Cameras

Visual memorability

[Isola, CVPR 2011]

Page 10: Visual Memorability for Egocentric Cameras

Visual memorability

[Isola, CVPR 2011]

Figure panels: More memorable, Less memorable

Page 11: Visual Memorability for Egocentric Cameras

Domain adaptation

Figure panels: Human-taken, Egocentric

[Khosla, ICCV 2015]

Page 12: Visual Memorability for Egocentric Cameras

Outline

➔ Introduction
➔ Contributions
  ◆ Annotation tool for visual memorability
  ◆ EgoMemNet: visual memorability adaptation to egocentric images
  ◆ Visual memorability and physiological signals
➔ Conclusions

Page 13: Visual Memorability for Egocentric Cameras

Why an annotation tool?

Input (image) → Convolutional Neural Network → Output (label)

Image + label pairs are needed to train the model

Page 14: Visual Memorability for Egocentric Cameras

Annotation tool for visual memorability

● Inspired by MIT research work [1]
● Visual memory game:
  ○ Simple task → press ‘d’ when a repeated image is found
  ○ Duration: 9 minutes
  ○ Output: text file with detections

UTEgocentric

Insight Centre for Data Analytics

[1] A. Khosla, A. S. Raju, A. Torralba and A. Oliva. Understanding and Predicting Image Memorability at a Large Scale. ICCV 2015.

Page 15: Visual Memorability for Egocentric Cameras

Annotation

[Khosla, ICCV 2015]

Page 16: Visual Memorability for Egocentric Cameras

Annotation tool

Page 17: Visual Memorability for Egocentric Cameras

Annotation tool implementation: why use Docker?

● Docker:
  ○ Container with an operating system and the required software.
  ○ Always runs the same in any environment.
● Simple implementation → Dockerfile

First Docker implementation in GPI for research

Page 18: Visual Memorability for Egocentric Cameras

Annotation tool results

● Memorability score → [0,1]
● Result:
  ○ Dataset → 50 annotated images (25 users)
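A common way to turn repeat detections into a score in [0,1] is the hit rate: the fraction of players who pressed the key when an image was repeated. The sketch below assumes that definition. The session layout and the name `memorability_scores` are hypothetical, because the slides do not describe the format of the text file written by the tool.

```python
from collections import defaultdict


def memorability_scores(sessions):
    """Return {image_id: hit rate} from a list of game sessions.

    Each session is a dict mapping an image id to True when the player
    detected the repeat and False when the repeat was missed.
    (Hypothetical layout, not the tool's real file format.)
    """
    hits = defaultdict(int)
    shown = defaultdict(int)
    for session in sessions:
        for image_id, detected in session.items():
            shown[image_id] += 1
            if detected:
                hits[image_id] += 1
    # Hit rate is always between 0 (never detected) and 1 (always detected).
    return {image_id: hits[image_id] / shown[image_id] for image_id in shown}


# Made-up sessions for illustration only.
sessions = [
    {"img_01": True, "img_02": False},
    {"img_01": True, "img_02": True},
    {"img_01": False, "img_02": False},
    {"img_01": True},
]
scores = memorability_scores(sessions)
```

Images that more players recognise on the second showing end up closer to 1.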

Page 19: Visual Memorability for Egocentric Cameras

Outline

➔ Introduction
➔ Contributions
  ◆ Annotation tool for visual memorability
  ◆ EgoMemNet: visual memorability adaptation to egocentric images
  ◆ Visual memorability and physiological signals
➔ Conclusions

Page 20: Visual Memorability for Egocentric Cameras

Convolutional neural network: definition

● Machine learning paradigm inspired by how the human brain works
● Interconnected neurons that work together to generate an output stimulus or activation

Page 21: Visual Memorability for Egocentric Cameras

Convolutional neural network: layers

Figure panels: Convolutional layer, Fully connected layer

Page 22: Visual Memorability for Egocentric Cameras

EgoMemNet: visual memorability adaptation to egocentric images

● MemNet → CNN for memorability prediction
  ○ 5 conv layers + 2 fully connected layers + linear regression

MemNet CNN [Khosla, ICCV 2015]

Structure: AlexNet
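To make the layer count concrete, the sketch below traces feature-map sizes through five convolutional layers, two fully connected layers and a single-output regression. The kernel sizes, strides, channel counts and the 227×227 input are the commonly published AlexNet values and are an assumption here; the slides only state that the structure is AlexNet.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, kind, out_channels, kernel, stride, pad)
# Assumed standard AlexNet hyperparameters, not taken from the slides.
LAYERS = [
    ("conv1", "conv", 96, 11, 4, 0),
    ("pool1", "pool", None, 3, 2, 0),
    ("conv2", "conv", 256, 5, 1, 2),
    ("pool2", "pool", None, 3, 2, 0),
    ("conv3", "conv", 384, 3, 1, 1),
    ("conv4", "conv", 384, 3, 1, 1),
    ("conv5", "conv", 256, 3, 1, 1),
    ("pool5", "pool", None, 3, 2, 0),
]


def trace_shapes(size=227, channels=3):
    """Return per-layer (name, channels, height, width) and the flattened size."""
    shapes = []
    for name, kind, out_channels, kernel, stride, pad in LAYERS:
        size = conv_out(size, kernel, stride, pad)
        if kind == "conv":
            channels = out_channels  # pooling keeps the channel count
        shapes.append((name, channels, size, size))
    flat = channels * size * size  # input width of the first fully connected layer
    shapes.append(("fc6", 4096, 1, 1))
    shapes.append(("fc7", 4096, 1, 1))
    shapes.append(("regression", 1, 1, 1))  # one memorability score per image
    return shapes, flat


shapes, flat = trace_shapes()
```

Only the last entry differs from a classification network: a single regression output replaces the class scores.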

Page 23: Visual Memorability for Egocentric Cameras

CNN fine-tuning

MemNet [Khosla, ICCV 2015] + Insight dataset (egocentric dataset) → EgoMemNet

Page 24: Visual Memorability for Egocentric Cameras

Data augmentation strategies

● No augmentation
● Spatial data augmentation → common method
● Temporal data augmentation → egocentric feature

Figure panels: Spatial data augmentation, Temporal data augmentation
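The two augmentation strategies can be sketched as follows. The slides do not give the details, so both functions are assumptions: spatial augmentation is shown as the usual corner and centre crops with mirroring, and temporal augmentation as giving the frames around an annotated frame the same score. The names `spatial_crops` and `temporal_neighbours` and the `radius` parameter are hypothetical.

```python
def spatial_crops(width, height, crop):
    """Return (left, top, mirrored) for four corner crops, the centre crop,
    and the horizontally mirrored version of each (10 variants in total)."""
    xs = [0, width - crop]
    ys = [0, height - crop]
    boxes = [(x, y) for y in ys for x in xs]
    boxes.append(((width - crop) // 2, (height - crop) // 2))
    return [(x, y, mirrored) for (x, y) in boxes for mirrored in (False, True)]


def temporal_neighbours(sequence, index, score, radius=2):
    """Pair the annotated frame and its neighbours within `radius`
    with the annotated frame's memorability score (assumed strategy)."""
    first = max(0, index - radius)
    last = min(len(sequence), index + radius + 1)
    return [(sequence[i], score) for i in range(first, last)]


# Made-up frame names for illustration only.
frames = [f"frame_{i:04d}.jpg" for i in range(10)]
crops = spatial_crops(256, 256, 227)
augmented = temporal_neighbours(frames, index=5, score=0.8, radius=2)
```

Temporal augmentation relies on a property of wearable cameras: consecutive frames usually show almost the same scene.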

Page 25: Visual Memorability for Egocentric Cameras

Quantitative results

Spearman’s rank correlation

Measures how similar the positions of the same items are in two ranked lists.

Memorability rank vs. ground truth rank
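Spearman's rank correlation can be computed as the Pearson correlation of the two rank lists. The sketch below is a minimal pure-Python version that averages tied ranks; the function names and the example values are illustrative and are not the evaluation code or results of this work.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(predicted, ground_truth):
    """Spearman's rho: Pearson correlation between the two rank lists."""
    rx, ry = ranks(predicted), ranks(ground_truth)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Made-up scores: two neighbouring pairs are swapped in the second list.
rho = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

A value of 1 means the predicted ranking matches the ground truth exactly, and -1 means it is fully reversed.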

Page 26: Visual Memorability for Egocentric Cameras

Quantitative results

Spearman’s rank correlation

Page 27: Visual Memorability for Egocentric Cameras

Qualitative results

Page 28: Visual Memorability for Egocentric Cameras

Memorability maps

● Heat maps that highlight the most memorable regions.
● Methods:
  ○ Grid-and-forward → obtain a memorability score per patch
  ○ EgoMemNet → fully convolutional version

Figure panels: Grid-and-forward, EgoMemNet
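The grid-and-forward idea can be sketched as a sliding window: each patch is scored on its own and the scores are averaged wherever patches overlap. The patch size, stride and the `predict` callable are assumptions; `mean_brightness` is only a stand-in for the CNN so that the example runs without a trained model.

```python
def grid_and_forward(image, patch, stride, predict):
    """Build a heat map by scoring every patch of a 2D image.

    image: list of rows of pixel values.
    predict: callable that maps a patch (list of rows) to a score.
    """
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            window = [row[left:left + patch] for row in image[top:top + patch]]
            score = predict(window)
            # Spread the patch score over every pixel the patch covers.
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    total[y][x] += score
                    count[y][x] += 1
    # Average overlapping patches; uncovered pixels stay at 0.
    return [[t / c if c else 0.0 for t, c in zip(total_row, count_row)]
            for total_row, count_row in zip(total, count)]


def mean_brightness(window):
    """Stand-in scorer used instead of a memorability network."""
    values = [v for row in window for v in row]
    return sum(values) / len(values)


# Toy 8x8 image: dark left half, bright right half.
image = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
heat = grid_and_forward(image, patch=4, stride=2, predict=mean_brightness)
```

Every patch needs its own forward pass, which is the cost that a fully convolutional version avoids by producing the whole map in one pass.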

Page 29: Visual Memorability for Egocentric Cameras

Memorability maps: grid-and-forward pass

Page 30: Visual Memorability for Egocentric Cameras

Memorability maps: EgoMemNet

[Zhou, CVPR 2016]

Page 31: Visual Memorability for Egocentric Cameras

Memorability vs. saliency maps

Original image

Saliency map (SalNet CNN) [Pan, CVPR 2016]

Memorability map (EgoMemNet* CNN)

Binarized maps with learned threshold:
● Green: regions shared between the saliency and memorability maps.
● Blue: memorable regions that are not salient.
● Red: salient regions that are not memorable.
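The colour coding can be reproduced by binarizing both maps and comparing them pixel by pixel. In the sketch below the thresholds are plain parameters; how they are learned is not described in the slides, and the function names and toy values are illustrative.

```python
def binarize(heat_map, threshold):
    """Turn a heat map into a boolean mask (True where value >= threshold)."""
    return [[value >= threshold for value in row] for row in heat_map]


def compare_maps(saliency, memorability, sal_threshold, mem_threshold):
    """Label each pixel with the colour used in the comparison figure."""
    salient = binarize(saliency, sal_threshold)
    memorable = binarize(memorability, mem_threshold)
    labels = []
    for sal_row, mem_row in zip(salient, memorable):
        row = []
        for is_salient, is_memorable in zip(sal_row, mem_row):
            if is_salient and is_memorable:
                row.append("green")  # shared by both maps
            elif is_memorable:
                row.append("blue")   # memorable but not salient
            elif is_salient:
                row.append("red")    # salient but not memorable
            else:
                row.append("none")   # below both thresholds
        labels.append(row)
    return labels


# Made-up 2x2 maps for illustration only.
saliency = [[0.9, 0.8], [0.1, 0.2]]
memorability = [[0.7, 0.1], [0.6, 0.2]]
labels = compare_maps(saliency, memorability, 0.5, 0.5)
```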

Page 32: Visual Memorability for Egocentric Cameras

Outline

➔ Introduction
➔ Contributions
  ◆ Annotation tool for visual memorability
  ◆ EgoMemNet: visual memorability adaptation to egocentric images
  ◆ Visual memorability and physiological signals
➔ Conclusions

Page 33: Visual Memorability for Egocentric Cameras

The Insight dataset

● Multimodal homemade dataset:
  ○ Images
  ○ Memorability score
  ○ Heart rate value (during image acquisition)
  ○ Galvanic skin response (during image acquisition)

Publicly available!

Page 34: Visual Memorability for Egocentric Cameras

Heart rate correlation

Memory scores quantized into 8 bins

Mean heart rate in each bin

Page 35: Visual Memorability for Egocentric Cameras

Galvanic skin response

Memory scores quantized into 8 bins

Mean GSR in each bin
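The binning behind the heart rate and GSR plots can be sketched as follows, assuming 8 equal-width bins over [0,1]; the slides do not state how the bin edges were chosen. The function name and the sample values are illustrative, not data from the Insight dataset.

```python
def mean_signal_per_bin(scores, signal, bins=8):
    """Average a physiological signal inside each memorability-score bin.

    scores: memorability scores in [0, 1].
    signal: one signal value (heart rate or GSR) per score.
    Returns a list of `bins` means, with None for empty bins.
    """
    sums = [0.0] * bins
    counts = [0] * bins
    for score, value in zip(scores, signal):
        # A score of exactly 1.0 is placed in the last bin.
        index = min(int(score * bins), bins - 1)
        sums[index] += value
        counts[index] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Made-up values for illustration only.
scores = [0.05, 0.10, 0.50, 1.0]
heart_rate = [60, 70, 80, 90]
means = mean_signal_per_bin(scores, heart_rate)
```

The same function works for GSR by passing the GSR samples as `signal`.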

Page 36: Visual Memorability for Egocentric Cameras

Physiological signals for memorability prediction

SNAP!

Page 37: Visual Memorability for Egocentric Cameras

Detect snap points

● Prior approach → efficient capture without image processing

Linear regression

Page 38: Visual Memorability for Egocentric Cameras

Adding physiological signals for memorability prediction

● Post approach

Linear regression

EgoMemNet score
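A linear regression that combines the EgoMemNet score with physiological signals can be sketched with ordinary least squares. The feature order (EgoMemNet score, heart rate, GSR), the function names and all numbers are assumptions made for illustration; the targets are generated from known weights so that the fit can be checked, and they are not experimental results.

```python
def solve(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [b] for row, b in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_linear(features, targets):
    """Least-squares weights [bias, w1, w2, ...] from the normal equations."""
    rows = [[1.0] + list(f) for f in features]  # prepend a bias column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature):
    """Apply the fitted weights to one feature vector."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], feature))


# Synthetic data: (EgoMemNet score, heart rate, GSR). Not real measurements.
TRUE_WEIGHTS = [0.1, 0.8, 0.001, 0.05]
features = [(0.2, 60, 1.0), (0.5, 72, 0.5), (0.9, 65, 2.0),
            (0.4, 80, 1.5), (0.7, 90, 0.2)]
targets = [predict(TRUE_WEIGHTS, f) for f in features]
weights = fit_linear(features, targets)
```

Dropping the EgoMemNet score from the feature tuples gives a regression on physiological signals alone, which needs no image processing.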

Page 39: Visual Memorability for Egocentric Cameras

New feature: EEG signals

● EEG → electroencephalographic signals
● Hands-free visual memory game

Page 40: Visual Memorability for Egocentric Cameras

EEG data extraction

Peak at 400 ms

P3@Pz → average 350-600 ms
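The P3 feature at electrode Pz can be sketched as the mean amplitude of an epoch between 350 and 600 ms after stimulus onset. The sampling rate, the function names and the synthetic triangular waveform below are assumptions for illustration; they are not the recorded EEG data.

```python
def mean_amplitude(epoch, sampling_rate, start_ms, end_ms):
    """Mean amplitude of one channel between start_ms and end_ms (inclusive).

    epoch: samples of a single channel, with index 0 at stimulus onset.
    sampling_rate: samples per second.
    """
    first = int(round(start_ms * sampling_rate / 1000))
    last = int(round(end_ms * sampling_rate / 1000))
    window = epoch[first:last + 1]
    return sum(window) / len(window)


def peak_latency_ms(epoch, sampling_rate):
    """Latency in milliseconds of the largest sample in the epoch."""
    index = max(range(len(epoch)), key=lambda i: epoch[i])
    return index * 1000 / sampling_rate


# Synthetic Pz epoch: 100 Hz, 800 ms long, triangular bump peaking at 400 ms.
SAMPLING_RATE = 100
epoch = [max(0.0, 5.0 - abs(i * 1000 / SAMPLING_RATE - 400) / 40)
         for i in range(80)]
p3 = mean_amplitude(epoch, SAMPLING_RATE, 350, 600)
peak = peak_latency_ms(epoch, SAMPLING_RATE)
```

Averaging over a window is less sensitive to noise in single samples than reading the amplitude at the peak alone.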

Page 41: Visual Memorability for Egocentric Cameras

P3@Pz, peak at 400 ms

Page 42: Visual Memorability for Egocentric Cameras

Outline

➔ Introduction
➔ Contributions
  ◆ Annotation tool for visual memorability
  ◆ EgoMemNet: visual memorability adaptation to egocentric images
  ◆ Visual memorability and physiological signals
➔ Conclusions

Page 43: Visual Memorability for Egocentric Cameras

Conclusions

● The new annotation tool makes it possible to create a novel dataset for egocentric memorability.

● Egocentric (first-person vision) dataset containing 50 annotated images.

● EgoMemNet, a model adapted for memorability prediction on egocentric images, improves on MemNet, a convolutional neural network model trained with human-taken images.

● Physiological signals for memorability prediction.

Page 44: Visual Memorability for Egocentric Cameras

Extended abstract

Carné-Herrera M, Giró-i-Nieto X, Gurrin C. EgoMemNet: Visual Memorability Adaptation to Egocentric Images. 4th Workshop on Egocentric (First-Person) Vision, CVPR 2016, Las Vegas, NV, USA.

Page 45: Visual Memorability for Egocentric Cameras

Spotlight

Full spotlight on YouTube!

https://www.youtube.com/watch?v=qwM5NNW37YE

Page 46: Visual Memorability for Egocentric Cameras

Poster presentation

Page 47: Visual Memorability for Egocentric Cameras

Open Research

Dataset, model and annotation tool:

http://imatge-upc.github.io/memory-2016-fpv/

Page 48: Visual Memorability for Egocentric Cameras

Hope this presentation has been memorable!

Thanks for your attention!