inds The Conference 2012: Wesley de Neve

Tackling the Digital Video Overload

Wesley De Neve

8/11/2012 1

Context (1/2)

Increasing consumption of online video content: easy-to-use devices and online services; cheap storage and bandwidth; more and more people going online.

Increasing availability of online video content: digitization of professional video archives; popularity of user-generated video content.

Context (2/2)

Some statistics

Professional video content: BBC Motion Gallery (as of January 2009) offers over 2.5 million hours of video content, dating back 60 years.

User-generated video content: on YouTube (as of October 2012), people watch 4 billion hours of video content each month and upload 72 hours of video content each minute.
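To put the two figures side by side, the following sketch converts the YouTube upload rate to hours per day and estimates how many days of uploads equal the size of the BBC Motion Gallery archive. It uses only the numbers quoted above; the variable names are illustrative, and the archive size is taken as exactly 2.5 million hours although the slide says "over".

```python
# Figures quoted on the slide.
UPLOAD_HOURS_PER_MINUTE = 72   # YouTube, as of October 2012
ARCHIVE_HOURS = 2_500_000      # BBC Motion Gallery, as of January 2009 (lower bound)

MINUTES_PER_DAY = 60 * 24

# Hours of video uploaded to YouTube in one day.
upload_hours_per_day = UPLOAD_HOURS_PER_MINUTE * MINUTES_PER_DAY

# Days of uploads needed to equal the size of the professional archive.
days_to_match_archive = ARCHIVE_HOURS / upload_hours_per_day

print(f"Uploaded per day: {upload_hours_per_day:,} hours")
print(f"Days of uploads to match the archive: {days_to_match_archive:.1f}")
```

Under these assumptions, roughly three and a half weeks of user uploads match an archive that covers 60 years.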

Digital Video Overload (1/2)

Problem description: our ability to manage video content cannot keep up with our ability to create video content.

Cause: to facilitate text-based video search, we need to manually annotate video content with textual labels.

Digital Video Overload (2/2)

Real cause: people experience manual video annotation as time-consuming and cumbersome, and thus forgo the effort.

Solution: automatic video content understanding, that is, the computerized translation of pixels into text.

“Curiosity on Mars”

Automatic Video Content Understanding

Traditionally: video content analysis. Works reasonably well in highly controlled environments; room for improvement in terms of applicability and effectiveness.

Nowadays: video content analysis, enhanced with unstructured knowledge from the Social Web and/or structured knowledge from the Semantic Web.

two use cases

Social Video Face Annotation (1/2)

Description: improving face annotation for personal video collections by harvesting online social network context.

Goal of video face annotation: search for people

[Figure labels: person 1, person 2, person 3]

Social Video Face Annotation (2/2)

[Figure: video face recognition using visual features + online social network context (contact list with contact 1 to contact 6, occurrence probabilities, co-occurrence probabilities, labeled face images), giving robust video face recognition using visual and social features]

[ published in IEEE ToMM, 2011 ]
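As a rough illustration of how visual and social features might be combined, the sketch below weights per-contact visual match scores by a social prior and renormalizes. This is a simple Bayes-style product chosen for illustration and is not claimed to be the method of the cited paper. The function names, the `floor` parameter, and all numbers are hypothetical.

```python
from typing import Dict, Optional, Tuple


def social_prior(
    occurrence: Dict[str, float],
    cooccurrence: Dict[Tuple[str, str], float],
    known_person: Optional[str] = None,
) -> Dict[str, float]:
    """Prior over contacts: occurrence probabilities, or co-occurrence
    probabilities with `known_person` once someone is already identified."""
    if known_person is None:
        return dict(occurrence)
    return {c: cooccurrence.get((known_person, c), 0.0) for c in occurrence}


def fuse(
    visual: Dict[str, float], prior: Dict[str, float], floor: float = 1e-6
) -> Dict[str, float]:
    """Weight each visual score by the social prior and renormalize to sum to 1.
    `floor` keeps a contact with zero prior from being ruled out entirely."""
    weighted = {c: s * max(prior.get(c, 0.0), floor) for c, s in visual.items()}
    total = sum(weighted.values())
    if total <= 0.0:
        raise ValueError("visual scores must contain at least one positive value")
    return {c: w / total for c, w in weighted.items()}


# Hypothetical inputs: visual match scores and social statistics for three contacts.
visual = {"contact 1": 0.40, "contact 2": 0.35, "contact 3": 0.25}
occurrence = {"contact 1": 0.2, "contact 2": 0.6, "contact 3": 0.2}
cooccurrence = {("contact 3", "contact 1"): 0.7, ("contact 3", "contact 2"): 0.3}

fused = fuse(visual, social_prior(occurrence, cooccurrence))
fused_given_3 = fuse(visual, social_prior(occurrence, cooccurrence, "contact 3"))

print("visual only:          ", max(visual, key=visual.get))
print("with occurrence prior:", max(fused, key=fused.get))
print("contact 3 in the shot:", max(fused_given_3, key=fused_given_3.get))
```

With these made-up numbers, the occurrence prior overturns a narrow visual decision in favor of the contact who appears most often in the collection.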

Annotation of Live Soccer Video (1/2)

Description: annotation of live soccer video by harvesting collective knowledge from Twitter.

Goal of annotating soccer video: search for events

[Figure labels: logo, logo, attack, goal, trainer]

Annotation of Live Soccer Video (2/2)

soccer event detection using visual features

Twitter-assisted annotation of live soccer video

[Plot: Tweets/s (axis 0 to 6) against Time (s) (axis 0 to 10)]

What is happening? What are people saying?

[ submitted to IEEE ToMM, 2012 ]
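One way to read "What is happening?" off the tweet stream is to look for sudden bursts in the tweet rate. The sketch below flags seconds whose rate is well above the recent average. It is a generic sliding-window detector under assumed parameters (`window`, `k`, `min_spread`), not the approach of the submitted paper, and the sample rates are invented.

```python
from statistics import mean, pstdev
from typing import List


def detect_bursts(
    tweets_per_second: List[float],
    window: int = 5,
    k: float = 2.0,
    min_spread: float = 1.0,
) -> List[int]:
    """Return the time indices whose tweet rate exceeds the mean of the
    preceding `window` samples by more than `k` standard deviations.
    `min_spread` floors the deviation so a flat history does not make
    tiny fluctuations look like bursts."""
    bursts = []
    for t in range(window, len(tweets_per_second)):
        history = tweets_per_second[t - window:t]
        threshold = mean(history) + k * max(pstdev(history), min_spread)
        if tweets_per_second[t] > threshold:
            bursts.append(t)
    return bursts


# Hypothetical tweet rates, one sample per second.
rate = [1, 1, 2, 1, 1, 1, 6, 5, 2, 1, 1]
burst_times = detect_bursts(rate)
print(burst_times)
```

The text of the tweets posted around each flagged time would then be the source for "What are people saying?", for example by counting frequent terms such as "goal".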

Other Use Cases

Movie actor recognition

Semantic video copy detection

Audiovisual enrichment of text documents

Research Challenges (1/2)

Design of techniques that jointly take advantage of unstructured and structured knowledge. Unstructured knowledge: collective knowledge. Structured knowledge: Linked Data Cloud.

cf. “Everything is Connected” for video content enrichment: http://everythingisconnected.be/

Design of techniques for translating unstructured knowledge into structured knowledge. These must cope with velocity, volume, and variety, as well as sparsity, ambiguity, and complexity.

Research Challenges (2/2)

Design of effective semantic similarity metrics

Design of user-oriented performance metrics: need to go beyond the use of precision and recall; need to better capture whether the needs of users have been met by a video content retrieval system.
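For reference, the sketch below computes the two baseline metrics named above for a single query. The clip identifiers and the function name are hypothetical. Both numbers depend only on the overlap between the retrieved set and the relevant set, so they say nothing about ranking order or about whether the user actually found what was needed.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision: fraction of retrieved items that are relevant.
    Recall: fraction of relevant items that were retrieved."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result of one video search query.
retrieved = {"clip_a", "clip_b", "clip_c", "clip_d"}
relevant = {"clip_a", "clip_c", "clip_e"}

p, r = precision_recall(retrieved, relevant)
print(f"precision = {p:.2f}, recall = {r:.2f}")
```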

visual distance

semantic distance

Thank you!
