Life-logging: what’s it about?

ICMV'2014, Milano, Italy, 20 of November, 2014.

Petia Radeva

www.cvc.uab.es/~petia

Barcelona Perceptual Computing Laboratory (BCNPCL), Universitat de Barcelona (www.bcnpcl.wordpress.com) &

Computer Vision Center (www.cvc.uab.es)

Index

The life-logging trend

Life-logging and egocentric vision

Video segmentation for events extraction

Motion-based video segmentation towards activities recognition

Human tracking, towards social interaction and key-frame extraction

Active learning for object recognition

Object discovery for lifestyle characterization

Life-logging

Definition: Life-logging consists of acquiring images related to an individual through a wearable camera.

Benefits:

A digital memory of people you met, conversations you had, places you visited, and events you participated in.

This memory would be searchable, retrievable, and shareable.

A 24/7/365 monitoring of daily activities.

This data could serve as a warning system and also as a personal base upon which to diagnose illness and to prescribe medicines.

A way of organizing, shaping, and “reading” your own life.

A complete archive of your work and play, and your work habits. Deep comparative analysis of your activities could assist your productivity, creativity, and consumptivity.

To the degree this life-log is shared, this archive of information can be leveraged to help others work, amplify social interactions, and in the biological realm, shared medical logs could rapidly advance medical discoveries.

Technology is “running”!

Evolution of life-logging apparatus, including wearable computer, camera, and viewfinder with wireless Internet connection. Early apparatus used separate transmitting and receiving antennas. Later apparatus evolved toward the appearance of ordinary eyeglasses in the late 1980s and early 1990s.

“Quantified Self & life-logging Meets Internet of Things (IOT)”, Mazlan Abbas.

Is it feasible to record everything that happens in a person’s life?

In 1970, a disk that stored 20 MB was the size of a washing machine and cost $20,000.

Today a TB (one trillion bytes) costs $100 and is the size of a paperback book.

By 2020 a TB will cost the same as a good cup of coffee and will probably be in your cell phone.

$100 will then buy around 250 TB of storage, enough to hold tens of thousands of hours of video and tens of millions of photographs.

This should satisfy most life-loggers’ recording needs for an entire life.
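A quick sanity check of the storage figures above, as a minimal sketch. The TB size follows the slide (one trillion bytes); the 5 MB per photograph and 5 GB per hour of video are assumptions chosen only for illustration, not values from the talk.

```python
# Rough check of the storage claim: what does 250 TB hold?
TB = 10**12                        # bytes, "one trillion bytes" as on the slide
budget_bytes = 250 * TB            # storage that $100 is projected to buy

photo_bytes = 5 * 10**6            # ASSUMPTION: 5 MB per photograph
video_bytes_per_hour = 5 * 10**9   # ASSUMPTION: 5 GB per hour of video

photos = budget_bytes // photo_bytes
video_hours = budget_bytes // video_bytes_per_hour

print(f"{photos:,} photos or {video_hours:,} hours of video")
```

Under these assumed sizes the result lands in the "tens of millions of photographs" and "tens of thousands of hours" ranges quoted above.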

In fact, digital storage capacity is increasing faster than our ability to pull information back out.

From 2000 it became trivial and cheap to sock away tremendous piles of data.

“Total Recall: How the E-Memory Revolution Will Change Everything”, Gordon Bell and Jim Gemmell, 2009, Dutton, Penguin Group.

Moore’s Law (1965): “Transistor density that can be etched onto the silicon wafer of a microchip doubles every two years”.

The hard part is no longer deciding what to hold on to, but how to efficiently organize it, sort it, access it, and find patterns and meaning in it.

This is a primary challenge for the engineers that will fully unleash the power of Total Recall.”

Will life-logging and the Internet of Things help us know ourselves better?

Don’t you love to know…

• Where you’re going?! Who you’ve interacted with?!

• How long you’ve spoken to friends?! The affinity to connections?!

• How long it takes to get to work?!

• The tone of your messages?! The amount you text, tweet, or update?!

• How much exercise you’re getting?!

• How much you get distracted?! Where is your time most spent?

(Figure: 30 days report of Max Knoblauch)

Life-logging everything you see, do, feel, speak, experience and hear is almost here.

The really big issue here is that you might, individually, not worry about publishing details of your personal life. But you are publishing your friends’, family’s and business contacts’ details at the same time.

Ethical guidelines for wearable cameras

Anonymity and confidentiality: Researchers coding image data should:

not discuss the content with anyone outside of the team,

not identify anyone they recognize in the images,

be aware of how sensitive the data are.

Data encryption: Confidentiality can be protected by configuring devices and using specialist viewing software so that the images are accessible only to the research team, even when a device is lost. Devices should be configured so that data can only be retrieved by the research team. It should be impossible for participants or third parties who find devices to access the images.

Data storage: Collected images should be stored securely and password-protected, according to national regulations.

Understanding user privacy requirements and risks from emerging technologies

People are bad at:

1. understanding the future value of revealing private information today,

2. understanding the risks from technology they have not yet used or heard of.

Wearable cameras can be very useful:

An estimated one million Russian motorists have dashboard video cameras installed in their cars.

Police officers carrying video camera units and using Velcro to place these cameras in police wagons; helmet cams, ear cams, chest cams with audio capability, GPS locators, taser cams.

• “Even with only half of the 54 uniformed patrol officers wearing cameras at any given time, a department in the USA had an 88% decline in the number of complaints filed against officers, compared with the 12 months before the study”, The New York Times, 4th of July, 2013.

The world’s leading police body-worn video camera, deployed by over 4000 agencies in 16 countries.

Benefits and potential applications

It will take quite some time for people to feel comfortable with ‘always connected’ devices that can discreetly take photos or videos. Will the benefits outweigh the negatives?

Wearable cameras can provide many benefits, such as assistive technologies to help people:

• see better,

• store and remember better,

• work better,

• function better,

• remember and recognize names and faces, etc.

“Quantified Self & life-logging Meets Internet of Things (IOT)”, Dr. Mazlan Abbas, MIMOS Berhad

Index

The life-logging trend

Egocentric vision for life-logging

Video segmentation for events extraction

Motion-based video segmentation towards activities recognition

Human tracking, towards social interaction and key-frame extraction

Active learning for object recognition

Object discovery for lifestyle characterization

Motivation

“Taking photos can impair your ability to remember” (1)

Next time you're at a museum or an event, think before you snap a photo!

“SenseCam has already made an impact in memory enhancement” (2)

What do we have?

Self-centric, automatically captured images of our life

Objective photos versus Subjective (traditional) photos

Most similar photos to our memory

What are we looking for?

Semantic information (life-logs, NOT data)

A Search Engine for the Self is required! (life-logging)

Life-logging data

What we want:

Events to be extracted from life-logging images

Life-logging data

What we have:

Wealth or Hell of life-logging data?

We propose an energy-based approach for motion-based event segmentation of life-logging sequences of low temporal resolution.

- The segmentation is obtained by integrating different kinds of image features and classifiers into a graph-cut framework to ensure consistent treatment of the sequence.

Complete dataset of a day captured with SenseCam (more than 4,100 images).

The choice of device depends on: 1) where it is worn: a camera hung around the neck has the advantage of being less obtrusive for the user; and 2) its temporal resolution: a camera with a low fps captures less motion information, but there is less data to process. We chose the SenseCam and the Narrative, cameras hung around the neck or pinned on the clothes that capture 2-4 fps.

Towards lifestyle characterization

We want to extract life-logging information about:

Events

Activities

Social interactions, etc.

Memorable moments

Personal habits and context

Lifestyle… healthy lifestyle…

Our egocentric vision research:

Video segmentation for events extraction

Motion-based video segmentation towards activities recognition

Human tracking, towards social interaction and key-frame extraction

Active learning for object recognition

Object discovery for lifestyle characterization, etc, etc.

What? Where? When? Who?

Egocentric vision

Video segmentation for events extraction

Motion-based video segmentation towards activities recognition

Human tracking, towards social interaction and key-frame extraction

Active learning for object recognition

Object discovery for lifestyle characterization

Events segmentation

Life-logging (LL) devices easily collect huge amounts of images.

One of the challenges of life-logging is how to organize the large amount of acquired image data into semantically meaningful segments, so that they can be stored and reviewed later with the focus on just the most important aspects.

Life-logging Video Segments Extraction

Methodology

Pipeline (figure): data set of thousands of images; features extraction (RGB+HOG, CNN); clustering (Ward, k-Means, Spectral Clustering); event extraction; event characterization; daily video summarization.
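The event extraction step of the pipeline can be illustrated with a minimal sketch. It assumes that a clustering step has already assigned one cluster label per frame (the label sequence below is hypothetical) and that an event is simply a run of consecutive frames sharing a label; the talk does not state the exact rule.

```python
from itertools import groupby

def labels_to_events(labels):
    """Group consecutive frames that share a cluster label into events.

    Returns a list of (label, first_frame, last_frame) tuples.
    """
    events = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)          # number of frames in this run
        events.append((label, start, start + length - 1))
        start += length
    return events

# Hypothetical per-frame cluster labels produced by any clustering method.
frame_labels = [0, 0, 0, 2, 2, 1, 1, 1, 1, 0]
events = labels_to_events(frame_labels)
print(events)
```

Note that the same label can yield several events when it reappears later in the day, which is why the grouping is done on runs rather than on label values.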

Methodology

Features (figure labels): RGB + HOG; Convolutional Neural Networks (4096 features); colour; structure; time.

Clustering Techniques

Spectral Clustering

WARD

K-Means

Validation

Results

Qualitative

Validation

Results

Moments of change

Validation

Results

Quantitative

Weighted features:

Features    K-Means    Ward    Spectral Clustering
CNN         56         60      50
RGB+HOG     35         37      42

$JC = \frac{|C_{auto} \cap C_{man}|}{|C_{auto} \cup C_{man}|}$
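The Jaccard coefficient used for validation can be sketched as follows. This is an illustration only: it assumes that the automatic and the manual segmentations are each represented as a set of frame indices, which the slides do not specify.

```python
def jaccard(c_auto, c_man):
    """Jaccard coefficient |A ∩ M| / |A ∪ M| between two collections of frame indices."""
    a, m = set(c_auto), set(c_man)
    union = a | m
    if not union:               # two empty segments are treated as identical
        return 1.0
    return len(a & m) / len(union)

# Hypothetical event: frames 10..19 found automatically, 15..24 annotated by hand.
score = jaccard(range(10, 20), range(15, 25))
print(score)
```

A value of 1 means both segmentations agree exactly; 0 means they share no frame.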

Egocentric vision

Video segmentation for events extraction

Motion-based video segmentation

Human tracking, towards social interaction and key-frame extraction

Active learning for object recognition

Object discovery for lifestyle characterization

Motion-based video segmentation

Event segmentation in LL data is characterized by the movement of the person wearing the device.

Group consecutive frames in three general event classes:

• “Static” (person is not moving)

• “In Transit” (person is moving or running)

• “Moving Camera” (person is in the same area performing some action).

Motion-based video segmentation

Combine a set of features and classifiers in a Graph Cut (GC) formulation for spatially coherent treatment of consecutive LL frames.

1) Extract motion, colour and blurriness information from the images and apply a classifier to obtain a rough approximation of the class labels in single frames.

2) Apply an energy-minimization technique (GC) to achieve spatial coherence of labels assigned by the classifier and separate the sequences of consecutive images in events.

Feature Extraction

The three classes are distinguished by the motion of the camera and the big difference between frames (low fps).

Robust event segmentation needs motion features that do not assume smooth image transition.

Image features:

Colour difference

Blurriness

SIFT flow data

Example of two consecutive images (left and center) and their relative SIFT-Flow field (right)
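Two of the image features listed above can be sketched in a few lines. The exact definitions used in the talk are not given, so both measures here are assumptions: colour difference is taken as the distance between the mean colours of two frames, and blurriness is approximated by the variance of a Laplacian response (low variance suggests a blurry frame). SIFT flow is not covered by this sketch.

```python
def mean_colour(img):
    """Mean (R, G, B) of an image given as rows of (r, g, b) tuples."""
    pixels = [p for row in img for p in row]
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def colour_difference(img_a, img_b):
    """Euclidean distance between the mean colours of two frames (assumed measure)."""
    ma, mb = mean_colour(img_a), mean_colour(img_b)
    return sum((x - y) ** 2 for x, y in zip(ma, mb)) ** 0.5

def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the interior of a grayscale image."""
    h, w = len(gray), len(gray[0])
    lap = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1] - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    mean = sum(lap) / len(lap)
    return sum((v - mean) ** 2 for v in lap) / len(lap)

# Tiny synthetic frames: a uniform red one and a uniform blue one.
red = [[(255, 0, 0)] * 4 for _ in range(4)]
blue = [[(0, 0, 255)] * 4 for _ in range(4)]
flat = [[128] * 4 for _ in range(4)]                                   # no edges
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]

print(colour_difference(red, blue), laplacian_variance(flat), laplacian_variance(checker))
```

Because neither feature assumes a smooth transition between frames, both remain usable at low frame rates.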

Graph Cuts Event Segmentation

GCs are based on the minimization of the energy:

$$E(f) = \sum_{i \in S} U_i(f_i) + w \sum_{\{i,n\} \in C} P_{i,n}(f_i, f_n)$$

The unary term U_i is set to 1 − LH, where LH is the likelihood for each image to belong to one of the classes (according to blurriness, colour difference and SIFT flow).

The pairwise term P_{i,n} is a similarity measure for each sample on each clique that determines the likelihood for each neighbouring pair of images to have the same label (according to colour and HOG). Result of GC:
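The effect of minimizing this energy can be illustrated with a small sketch. It is not the graph-cut solver of the talk: it assumes that only consecutive frames are neighbours and that the pairwise term is paid when two neighbours take different labels, in which case the energy can be minimized exactly with dynamic programming. The likelihoods, the similarities and the weight w below are hypothetical.

```python
def minimise_chain_energy(unary, similarity, w):
    """Minimise sum_i U_i(f_i) + w * sum_i P_i * [f_i != f_{i+1}] over a chain of frames.

    unary[i][k]   = U_i(k), the cost of giving label k to frame i
    similarity[i] = P_i, the cost factor when frames i and i+1 get different labels
    Returns (labels, energy).
    """
    n, k = len(unary), len(unary[0])
    cost = list(unary[0])               # best energy of a labelling ending in each label
    back = []                           # back-pointers for recovering the labelling
    for i in range(1, n):
        new_cost, pointers = [], []
        for label in range(k):
            def step(prev):
                return cost[prev] + (w * similarity[i - 1] if prev != label else 0.0)
            best_prev = min(range(k), key=step)
            new_cost.append(step(best_prev) + unary[i][label])
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)
    label = min(range(k), key=lambda p: cost[p])
    energy = cost[label]
    labels = [label]
    for pointers in reversed(back):     # walk the back-pointers from last to first frame
        label = pointers[label]
        labels.append(label)
    labels.reverse()
    return labels, energy

# Hypothetical classifier likelihoods LH for 5 frames and 3 classes (T, S, M).
likelihood = [
    [0.80, 0.10, 0.10],
    [0.70, 0.20, 0.10],
    [0.30, 0.50, 0.20],   # isolated frame where the classifier disagrees
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
]
unary = [[1.0 - lh for lh in frame] for frame in likelihood]   # U_i = 1 - LH
similarity = [1.0, 1.0, 1.0, 1.0]                              # assumed equal for all pairs

smoothed, energy = minimise_chain_energy(unary, similarity, w=0.5)
framewise, _ = minimise_chain_energy(unary, similarity, w=0.0)
print(framewise, smoothed, energy)
```

With w = 0 each frame keeps the classifier's own decision, while a positive w removes the isolated label change in the middle frame, which is the coherence effect the GC step is meant to provide.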

Segmentation results

Final segmentation result of Transit, Static and Moving sequences for the 10th dataset split in classes.

Quantitative results

Accuracy for each class (T,S,M) and average accuracy for the classifiers (SVM,KNN) and the GC.

Improvement in accuracy using different weights for the GC with respect to the KNN with cosine metrics; tests on the 10th dataset.

Egocentric vision

Video segmentation for events extraction

Motion-based video segmentation towards activities recognition

Human tracking for social analysis

Active learning for object recognition

Object discovery for lifestyle characterization

Our goal on social analysis

Who was with us? For how long?

Humans are important

Video segmentation based on presence of people

Each segment is an event related to a specific person

The intuitive solution is to track visible people

Restrictions

What makes tracking even more difficult on life-logging data?

Low spatial and temporal resolution of images

Free motion of the camera

Frequent scene occlusion and distortion

Wide variation in appearance, scale and location of people along the videos

Even the best state-of-the-art tracking methods fail!

Our proposal: Bag-of-Tracklets

Our strategy: Study the restrictions of current tools to design a new method

A tracking method independent from background changes and temporal resolution could fit this data

What we propose:

Treat every detection individually

Track every detection (as much as we can!)

Try to find one reliable track of the same person among many (one tracklet per detection) by grouping similar tracklets.

Advantage: Confront the tracking problem from a higher level of information

Tracklets instead of detections

Getting rid of false alarms by excluding unreliable tracklets using group information (bag-of-tracklets)

Accuracy and robustness improved!
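The grouping idea can be sketched as follows. This is only an illustration under stated assumptions, not the method of the cited paper: a tracklet is a hypothetical mapping from frame index to bounding box, similarity is taken as the mean box overlap (IoU) on shared frames, grouping is greedy, and a bag is considered unreliable when it holds fewer than `min_size` tracklets.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def tracklet_similarity(t1, t2):
    """Mean IoU over the frames both tracklets cover (assumed similarity measure)."""
    shared = t1.keys() & t2.keys()
    if not shared:
        return 0.0
    return sum(iou(t1[f], t2[f]) for f in shared) / len(shared)

def group_tracklets(tracklets, threshold=0.5, min_size=2):
    """Greedily put similar tracklets in the same bag and drop small, unreliable bags."""
    bags = []
    for t in tracklets:
        for bag in bags:
            if tracklet_similarity(bag[0], t) >= threshold:
                bag.append(t)
                break
        else:                               # no similar bag found: start a new one
            bags.append([t])
    return [bag for bag in bags if len(bag) >= min_size]

# Two tracklets following the same person and one isolated false alarm.
a = {0: (0, 0, 10, 10), 1: (1, 0, 11, 10)}
b = {0: (1, 0, 11, 10), 1: (2, 0, 12, 10)}
false_alarm = {0: (50, 50, 60, 60)}

bags = group_tracklets([a, b, false_alarm])
print(len(bags), [len(bag) for bag in bags])
```

The false alarm ends up alone in its bag and is discarded, which mirrors the way group information is used above to exclude unreliable tracklets.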

M. Aghaei, P. Radeva, “Bag-of-Tracklets for Person Tracking in Life-Logging Data”, CCIA’2014.

Our Proposed Method

Pipeline: Video; Seed Generation; Tracklet Creation; Bag-of-Tracklets Formation; Density Calculation; Finding Reliable BOTs; Prototype Extraction; Summarized Video

Seed Generation

Tracklet Creation

Forward CT, Backward CT

Tracklet generated by compressive tracking

Bag-of-Tracklets Formation

Similarity between tracklets:

Likelihood that a tracklet belongs to a model:

Density Calculation

NB = number of tracklets in the bag; Bti = tracklet i in the bag B

Finding Reliable BOTs

Prototype Extraction

Results

Dataset:

# Days    # Frames    # Frames with Person(s)    # Trackable Persons
10        24,000      11,000                     65

Results:

Accuracy: a measure similar to the Jaccard distance:

# True Positive tracked frames / # Frames for that track in the ground truth

An average accuracy of 84% has been obtained.

Figure: a sequence of 10 frames, with detected people indicated.

Figure: tracking results using Bag-of-Tracklets.

Comparison

Compared to the performance of Compressive Tracking (without using BOT) on SenseCam images [48%], a 75% improvement has been achieved!
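The two figures can be related with a short calculation. The sketch below assumes that the 75% is a relative improvement of the 84% average accuracy over the 48% baseline, which is consistent with the numbers but is not stated explicitly; the frame counts in the accuracy example are hypothetical.

```python
def track_accuracy(true_positive_frames, ground_truth_frames):
    """Accuracy of one track: correctly tracked frames / frames of that track in the ground truth."""
    return true_positive_frames / ground_truth_frames

def relative_improvement(new, baseline):
    """Improvement of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline

# Hypothetical track: 42 of its 50 ground-truth frames were tracked correctly.
print(track_accuracy(42, 50))

# Accuracies in percent as reported: 84 with Bag-of-Tracklets, 48 with Compressive Tracking alone.
gain = relative_improvement(84, 48)
print(gain)
```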

False alarms in detection and compressive tracking

True track of the person using Bag-of-Tracklets

Tracklet A

Tracklet B

Advantages of the Bag-of-Tracklets approach

Advantages

Efficient method to track persons in low spatial and temporal resolution images

Human-based segmentation approach of visual life-logging data

Detect events based on human presence

Address the tracking problem using a higher level of information

Assignment problem

Limitations and current work

Occlusion

Cost

Pose recovery

Egocentric vision

Video segmentation for events extraction

Motion-based video segmentation towards activities recognition

Human tracking, towards social interaction and key-frame extraction

Active learning for object recognition towards life characterization

Object discovery for lifestyle characterization

Towards lifestyle characterization

Our next steps are directed towards visualizing summarized lifestyle data to ease the management of the user’s healthy habits (sedentary lifestyles, nutritional activity of obese people, etc.).

Life-logging can help us accomplish our goals: taking photos of our everyday life and being able to analyse what we eat, starting with dish recognition.

We need a food-related objects classifier!

Life-logging and healthy lifestyle

But, how can we automatically detect every instance of a dish in all of its variants, shapes and positions and in such a large number of images?

The main problems that arise are:

• Complexity and variability of the data.

• Huge amounts of data to analyse (up to 100,000 images per month).

Any efficient supervised classifier needs a huge training set!

Using human time to label millions of images is too expensive.

We need an automatic aid for labelling huge amounts of images.

Active Learning

Goal: Minimize human effort by guiding the process of creating the training set.

Effect of active learning: improvement of learning performance (left), improvement of training time (right).

Issues of Active Learning

What is the optimal set of examples to give to the classifier?!

What is the minimal set of examples to be labeled in order to achieve optimal classification results?

Two directions to optimize the learning process:

by choosing the points to query that shrink the space of possible classifiers as much as possible, or

by exploiting the structure of the data distribution to find clusters of unlabeled data.

Figure legend: Class 1, Class 2, Unlabeled; clusters C1, C2, C3; training data; new data.

Active learning

Goal: Given a large amount of data, how to guide the labeling process in order to achieve optimal performance with minimal cost?

In each step:

◦ Fit a classifier

◦ Select criteria (biased sampling)

Closest to boundary

Most uncertain

Most likely to decrease overall uncertainty

...

Biased sampling: the labeled points do not represent the underlying distribution.
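The "most uncertain" selection criterion can be sketched in a few lines. It assumes a binary classifier that outputs a positive-class probability for each unlabeled sample; the probabilities below are hypothetical.

```python
def most_uncertain(probabilities, n_queries=1):
    """Indices of the unlabeled samples whose positive-class probability is closest to 0.5."""
    ranked = sorted(range(len(probabilities)), key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:n_queries]

# Hypothetical classifier outputs for five unlabeled images.
probs = [0.95, 0.52, 0.10, 0.45, 0.80]
queries = most_uncertain(probs, n_queries=2)
print(queries)
```

Since every query is drawn from the region near the current boundary, the labeled set concentrates there, which is the sampling bias noted above.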

What is the alternative?

Represent data with clusters...

Idea: Use clusters...

If the data can be described with pure clusters, then only one annotation per cluster is needed!

Problem: How to find the right granularity?

Partition-based Active Learning

[Figure: clusters of the partition at iterations 0, 1, 2 and 3.]

► Goal: Label the data with minimal cost (user’s clicks)

► Begin with a partition P0 = {C}0 consisting of a single cluster that contains the whole data set

► While i < m (maximum number of iterations):

► Select clusters of the current partition and sample data from them according to a criterion

► Query the labels of the sampled data

► Compute the clusters of Pi = {C}i and their mislabelling error

► Search for a better partition

► Assign the majority label to each cluster.
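The loop above can be sketched on a 1-D toy problem. This is an illustration under stated assumptions, not the implementation used in the talk. The splitting rule (cut the worst cluster in half along the sorted values), the selection criterion (the middle unlabeled point of each cluster) and all function names are hypothetical. The sketch assumes at least one point and `max_iter >= 1`.

```python
from collections import Counter


def cluster_labels(cluster, queried):
    """Labels obtained so far for the points of one cluster."""
    return [queried[i] for i in cluster if i in queried]


def mislabelling_error(cluster, queried):
    """Fraction of queried points that disagree with the cluster's majority label."""
    labels = cluster_labels(cluster, queried)
    if not labels:
        return 0.0
    majority_count = Counter(labels).most_common(1)[0][1]
    return 1.0 - majority_count / len(labels)


def partition_active_learning(points, oracle, max_iter):
    # P0: a single cluster holding every point index, sorted by value.
    clusters = [sorted(range(len(points)), key=points.__getitem__)]
    queried = {}  # point index -> label given by the user (one click each)
    for _ in range(max_iter):
        # Search for a better partition: split the cluster with the largest
        # estimated error (assumed rule: cut it in half).
        errors = [mislabelling_error(c, queried) for c in clusters]
        worst = max(range(len(clusters)), key=errors.__getitem__)
        if errors[worst] > 0:
            c = clusters.pop(worst)
            clusters += [c[:len(c) // 2], c[len(c) // 2:]]
        # Select one unlabeled point per cluster (assumed criterion: the
        # middle candidate) and query its label.
        for c in clusters:
            candidates = [i for i in c if i not in queried]
            if candidates:
                i = candidates[len(candidates) // 2]
                queried[i] = oracle(i)
    # Assign to each cluster the majority label among its queried points.
    predicted = [None] * len(points)
    for c in clusters:
        majority = Counter(cluster_labels(c, queried)).most_common(1)[0][0]
        for i in c:
            predicted[i] = majority
    return predicted, len(queried)


points = [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4]
truth = [0, 0, 0, 0, 1, 1, 1, 1]
predicted, clicks = partition_active_learning(points, truth.__getitem__, max_iter=3)
```

On this toy set, the two groups are well separated, so the partition becomes pure after one split and every point receives a label from far fewer clicks than points.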

Data sets

• Images from about 15 days

• Each day has up to 4,500 pictures.

• Total: 43,750 images.

• 2,900 images per day on average.

From Active Tree to Active Forest

Computing a tree with nearly 90,000 samples is infeasible with present technology (classical Dasgupta’s hierarchical sampling).

Active forest reduces the number of clicks needed to label the training set by 70%.

Active learning for nutrition-related objects annotation

Marc Bolaños, Maite Garolera, Petia Radeva, "Active Labeling Application Applied to Food-Related Object Recognition", 5th Workshop on Multimedia for Cooking and Eating Activities (CEA2013), ACM International Conference on Multimedia 2013, Barcelona, October 2013.

Egocentric vision

Video segmentation for events extraction

Motion-based video segmentation towards activities recognition

Human tracking, towards social interaction and key-frame extraction

Active learning for object recognition

Object discovery for lifestyle characterization

Object discovery

What is the context of a person day-by-day?

Do these sets belong to the same person?

[Figure: three sets of images, labeled Person 1, Person 1 and Person 2.]

M. Bolaños, P. Radeva, “Object Discovery for Egocentric Videos Based on Convolutional Neural Network Features”, Workshop on Story telling, ECCV, 2014. (submitted)

Main Goal

Goal: to develop automatic techniques for object discovery to characterize the environment of the person wearing the camera.

Our proposal: to iteratively discover the most relevant objects from egocentric videos for a particular user, based on:

1. Clustering of object region candidates

2. CNN feature extraction

3. A new refill methodology applied to the clusters of already discovered instances.

Images acquired by a life-logging device, in which objects of interest appear, such as a mobile phone, a person, or a TV monitor.

Algorithm

Before starting the discovery iterations, 40% of all the object candidates are inserted into the “previous knowledge” set.

The selected easiest samples are complemented with a certain percentage (20%) of samples from the previous knowledge, equally divided among all the known classes except “NoObject”.

[Figure: algorithm diagram with scores attached to object candidates (values between 0.09 and 0.92).]
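The refill step described on this slide can be sketched as follows. This is a hedged illustration: the slide does not say what the 20% is measured against, so the code assumes it is relative to the number of selected samples. The function name `refill`, the sample identifiers and the class names are hypothetical.

```python
import random


def refill(selected, previous_knowledge, fraction=0.20, rng=None):
    """Complement `selected` with samples drawn from the previous knowledge.

    Assumption: `fraction` is relative to len(selected). The refill is split
    equally among the known classes, skipping the "NoObject" class.
    """
    rng = rng or random.Random(0)
    classes = [c for c in previous_knowledge if c != "NoObject"]
    if not classes:
        return list(selected)
    per_class = round(fraction * len(selected)) // len(classes)
    extra = []
    for c in classes:
        pool = previous_knowledge[c]
        extra += rng.sample(pool, min(per_class, len(pool)))
    return list(selected) + extra


# Toy data: 50 selected candidates and a small "previous knowledge" set.
selected = ["cand%d" % i for i in range(50)]
previous_knowledge = {
    "phone": ["phone%d" % i for i in range(10)],
    "person": ["person%d" % i for i in range(10)],
    "NoObject": ["bg%d" % i for i in range(30)],
}
training = refill(selected, previous_knowledge)
```

With 50 selected samples and two known object classes, this adds 10 refill samples, 5 per class, and none from “NoObject”.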

We used a pre-trained CNN provided by Hinton et al., built as a succession of convolutional and pooling layers and trained on millions of ImageNet images.

We removed the last (supervised) layer and used the output of the penultimate layer as our features (4096 variables).
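The idea of dropping the final layer and keeping the penultimate activations as features can be illustrated with a toy fully connected network. This is only a stand-in under stated assumptions: the weights are random rather than pre-trained, the layers are dense rather than convolutional, and the penultimate layer has 4 units instead of 4096. All names are hypothetical.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def dense(weights, bias, v):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def make_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(layers, x):
    """Full network: ReLU after every layer except the final score layer."""
    for k, (weights, bias) in enumerate(layers):
        x = dense(weights, bias, x)
        if k < len(layers) - 1:
            x = relu(x)
    return x


def extract_features(layers, x):
    """Skip the last (classification) layer and return the penultimate activations."""
    for weights, bias in layers[:-1]:
        x = relu(dense(weights, bias, x))
    return x


rng = random.Random(0)
# Stand-in network: input (8) -> hidden (16) -> penultimate (4) -> class scores (3).
layers = [make_layer(8, 16, rng), make_layer(16, 4, rng), make_layer(4, 3, rng)]
image = [rng.random() for _ in range(8)]
features = extract_features(layers, image)  # used as the descriptor of the candidate
```

The resulting feature vectors can then be passed to the clustering step in place of the class scores.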

Dataset

Our dataset consists of 1,000 images from a person's work day, from which 50,000 object candidates were extracted. To validate our method, we used the labels of the most frequently appearing objects.

Tests Comparison

[Figures: final F-Measure, Purity and Accuracy for each setting; F-Measure evolution for each setting.]

As expected, more than 76% of the samples were labeled as “No Objects”.

We defined three different test settings to evaluate our proposal:

1. CNN Features

2. CNN Features with the Refill methodology

3. Features of [13] (Lee and Grauman's work)
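Two of the measures named on this slide, Purity and F-Measure, can be computed from cluster assignments and ground-truth labels as sketched below. The talk does not state which F-Measure variant was used. The code assumes one common clustering definition, the class-size-weighted best-match F1. The function names and toy labels are illustrative.

```python
from collections import Counter


def purity(clusters, classes):
    """Fraction of samples that carry the majority class of their cluster."""
    by_cluster = {}
    for k, c in zip(clusters, classes):
        by_cluster.setdefault(k, Counter())[c] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(classes)


def f_measure(clusters, classes):
    """Class-size-weighted best-match F1 between classes and clusters."""
    pair = Counter(zip(classes, clusters))
    class_size = Counter(classes)
    cluster_size = Counter(clusters)
    total = 0.0
    for c, n_c in class_size.items():
        best = 0.0
        for k, n_k in cluster_size.items():
            n_ck = pair[(c, k)]
            if n_ck:
                precision, recall = n_ck / n_k, n_ck / n_c
                best = max(best, 2 * precision * recall / (precision + recall))
        total += n_c / len(classes) * best
    return total


# Toy example: six object candidates, two discovered clusters.
clusters = [0, 0, 0, 1, 1, 1]
classes = ["phone", "phone", "tv", "tv", "tv", "tv"]
p = purity(clusters, classes)
f = f_measure(clusters, classes)
```

Both measures equal 1 when every cluster contains exactly one class and every class falls in exactly one cluster.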

Application of Life-logging for MCI treatment

Goal: to develop tools for memory reinforcement in people with mild cognitive impairment (MCI) and Alzheimer's disease.

To develop, for subjects with MCI, a program based on life-logging with a wearable camera that records specific autobiographical episodes, which are later used to stimulate episodic memory function, known to be deficient in MCI.

To explore the association between biomarkers and changes in cognitive, functional and emotional outcomes.

To learn more about the underlying biological mechanisms by which effective behavioural interventions improve cognitive and functional outcomes.

Life-logging for welfare

• how to extract semantic units related to the lifestyle and their context relation,

• how to segment life-log data into meaningful events,

• what are the semantic units that characterize the lifestyle of individuals,

• what is their relation and how the context affects them,

• how to extract and characterize lifestyle patterns,

• what is the healthstyle, etc.

To derive lifestyle patterns from visual life-logs and to conduct a study on the feasibility of automatically generating lifestyle patterns and interpretations, to be used in the future to improve the lifestyle of individuals.

Conclusions

Life-logging is a very recent, fast-growing trend with huge potential.

Technology for data acquisition and storage is ready.

Life-loggers are waiting for tools of egocentric vision to process the huge amount of data.

Algorithms are urgently needed for:

Shot boundary detection

Scene segmentation

Event detection

Data mining

Object and Activity recognition

Video annotation

Key frame extraction, memorability and esthetics

Video summarization and browsing

Query and retrieval

Numerous applications to health & welfare, memory, safety, leisure, etc.

Thank you for your attention!

How else can life-logging (LL) be useful?

http://ideas.ted.com/2014/11/17/the-economic-impact-of-bad-meetings/