MULTIMEDIA ANALYTICS: SYNERGY BETWEEN HUMAN AND MACHINE BY VISUALIZATION

Marcel Worring, Jan Zahalka, Stevan Rudinac

Intelligent Systems Lab Amsterdam, Amsterdam Data Science, University of Amsterdam

INTRODUCTION

- Multimedia data increasingly important
- Valuable sources of knowledge, for example:
  - Forensics: analyze multimedia data for evidence of ISIS involvement
  - Travel industry: analyze social media data to map trending places of interest
  - ...

MULTIMEDIA AS A KNOWLEDGE SOURCE

- Night Watch by Rembrandt. How to describe it?

MULTIMEDIA AS A KNOWLEDGE SOURCE

- Art? Painting? People? Military unit? Amsterdam? ...
  Content, technical parameters, geo location, ...

MULTIMEDIA AS A KNOWLEDGE SOURCE

- Description depends on context provided by the analyst
  Analyst needs to interact with the system

MULTIMEDIA AS A KNOWLEDGE SOURCE

[Figure: a multimedia item with its image, tags, comments, metadata, ...]

- Multimedia items contain multiple types of data
  Integrating them improves the information gain

MULTIMEDIA AS A KNOWLEDGE SOURCE

- What if we have millions of images, tags, metadata, ...?
  Intelligent navigation capabilities required from the system

MULTIMEDIA ANALYTICS

- How do we move towards interactive, intelligent, and integrated multimedia systems?
- Possible answer: multimedia analytics

[Diagram relating multimedia analytics to multimedia analysis, InfoVis, and visual analytics]

RELATED WORK

- Extensive survey work involving ~800 references
- Covered relevant work from the last 10 years:
  - Multimedia analytics
  - Multimedia visualization
  - Information visualization
  - Visual analytics
  - Automated multimedia analysis
- Multimedia Analytics Article Library (MAAL):
  - staff.fnwi.uva.nl/j.zahalka/maal.html
  - 374 catalogued references

PIPELINE

[Diagram labels: Data (MM collection: images, annotations, metadata); Model; Visualization; Knowledge; interactive model update; navigation directions. Example output: Category 1 "people", 61 items; Category 2 "nature", 93 items; ...]

- Multimedia instantiation of the visual analytics process (Keim et al., Visual analytics: Scope and challenges, 2008)

TASK MODEL

[Diagram labels: Exploration, Search, Start, End, Categorization]

- Exploration: uncovering the overall structure
- Search: finding particular items
- Exploration-search axis: E-S ratio changes dynamically
- Mental model attributes: semantic → categorical

CATEGORIZATION

- Categorization: assigning individual multimedia items to categories defined by the analyst

CHALLENGE: THE GAPS

Semantic gap
  Human:   complex and abstract semantics; recognized instantly; put in context
  Machine: limited semantics; takes time, computationally costly; no context

Pragmatic gap
  Human:   new categories on the fly; non-exclusive categories; dynamic category semantics
  Machine: static no. of classes; exclusive classes; static class semantics

- Multimedia analysis capabilities are very different for humans and machines
- Semantic gap [Smeulders et al. 2000]: richness of semantics
- Pragmatic gap (our work): flexibility of the model
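
The human side of the pragmatic gap (new categories on the fly, non-exclusive categories) can be pictured as a small data structure. This is only an illustrative sketch, not the authors' model: the class name `CategoryStore`, its methods, and the item names are hypothetical.

```python
from collections import defaultdict


class CategoryStore:
    """Analyst-defined categories: created on demand and non-exclusive."""

    def __init__(self):
        # category name -> set of item ids
        self._members = defaultdict(set)

    def assign(self, item_id, category):
        """Put an item into a category, creating the category if needed."""
        self._members[category].add(item_id)

    def unassign(self, item_id, category):
        """Take an item out of a category again."""
        self._members[category].discard(item_id)

    def categories_of(self, item_id):
        """All categories an item belongs to; there may be several."""
        return sorted(c for c, items in self._members.items() if item_id in items)

    def size(self, category):
        """Number of items currently in a category (0 if it does not exist)."""
        return len(self._members.get(category, ()))


store = CategoryStore()
store.assign("night_watch.jpg", "art")
store.assign("night_watch.jpg", "military unit")  # same item, second category
store.assign("canal.jpg", "Amsterdam")            # category created on the fly
```

A classifier with a fixed, exclusive label set offers none of these operations, which is the mismatch the slide calls the pragmatic gap.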

SIMILARITY BROWSER

FORK BROWSER

PHOTO CUBE

MULTIMEDIA PIVOT TABLES

STATE OF THE ART

[Chart: systems placed along two axes, semantic gap and pragmatic gap, each graded Limited / Intermediate / Advanced, with a "Goal" marker. Systems shown: I-SI, Newdles, Visit, Informedia, Canopy, Similarity browser, INA browser, MediaTable]

- Systems advance with respect to the gaps
- Algorithms and techniques allow realization of our model

INSTANTIATING THE MODEL

NEW YORKER MELANGE

- Interactive New York venue recommender
- "Explore the city through the eyes of social media users that share interests with you."
- newyorkermelange.com
- ACM Multimedia Grand Challenge 2014 1st Prize

NEW YORKER MELANGE: INGREDIENTS

[Diagram labels: visual & text features for venues & users; grid, map; SVM; interesting venues to visit. Interaction loop: indicate relevant users & venues / suggest more relevant users & venues. Exploration-search axis marker: NY Melange]

NEW YORKER MELANGE

DATASET

[Diagram labels: New York venues; query Q(venue name, geo); venue images; images, metadata]

DATASET

- >1M New York venue images with metadata
- Real dataset with a purpose
- Query strategy designed to reduce noise
- Exploitable size-noise tradeoff
- Each image has a venue category label → ready for classification
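
One plausible reading of the query Q(venue name, geo) is a filter that keeps an image only when its text mentions the venue name and its geotag lies close to the venue. The sketch below assumes exactly that; the matching rule, the `radius_km` parameter, the dictionary fields, and the example venue are all hypothetical and not taken from the slides.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def matches_venue(image, venue, radius_km=0.2):
    """Keep an image only if both the name and the location agree (assumed rule)."""
    name_hit = venue["name"].lower() in image["text"].lower()
    distance = haversine_km(image["lat"], image["lon"], venue["lat"], venue["lon"])
    return name_hit and distance <= radius_km


# Hypothetical venue and images, for illustration only.
venue = {"name": "Example Diner", "lat": 40.7500, "lon": -73.9900}
near = {"text": "Brunch at Example Diner", "lat": 40.7501, "lon": -73.9901}
far = {"text": "Example Diner merch", "lat": 40.7900, "lon": -73.9900}
off_topic = {"text": "street art mural", "lat": 40.7501, "lon": -73.9901}
```

Under this reading, `radius_km` is one knob for the size-noise tradeoff: a larger radius admits more images per venue, and more of them are unrelated.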

VENUE/USER TOPICS

[Diagram labels: Dataset (images, annotations; Foursquare, Flickr, Picasa); Features (ConvNet: 1000 visual concepts; LDA: 100 latent topics); Clustering; Venue topics (visual, text); User topics (visual, text)]
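
The slides do not say which clustering algorithm turns features into topics. As a stand-in, the sketch below runs plain k-means on tiny hypothetical feature vectors (three dimensions instead of the 1000 visual concepts and 100 latent topics named in the diagram). The function names and data are illustrative assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic start (the first k points)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical venue feature vectors; venues 0 and 2 resemble each other,
# as do venues 1 and 3.
venue_features = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.8, 0.2, 0.0],
    [0.0, 0.9, 0.1],
]
assignment, centroids = kmeans(venue_features, k=2)
```

Each resulting cluster would play the role of one venue topic; the same procedure applied to user vectors would give user topics.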

USER PREFERENCE LEARNING

[Diagram, built up step by step. Labels in reading order: Initial interface; Negatives; Positives; empty; +relevant venues; User topics (random sample); Linear SVM; User ranking; Venue selection; Venue topics; Map interface; +relevant users; +non-relevant users]
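
A minimal sketch of one feedback round as the diagram suggests it: positives come from the user's selections, negatives start as a random sample, a linear SVM is trained, and candidates are ranked by its score. The SVM here is a from-scratch Pegasos-style stochastic subgradient solver used as a stand-in, since the slides do not name a solver. All names, parameters, and topic vectors are hypothetical.

```python
import random


def sample_negatives(pool, positives, k, rng):
    """Draw k items not marked relevant as provisional negatives."""
    candidates = [name for name in pool if name not in positives]
    return rng.sample(candidates, min(k, len(candidates)))


def train_linear_svm(X, y, lam=0.01, epochs=200, rng=None):
    """Stochastic subgradient descent on the L2-regularized hinge loss.

    Labels in y are +1 or -1; there is no bias term.
    """
    rng = rng or random.Random(0)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularizer)
            if margin < 1.0:  # example violates the margin
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def rank(candidates, w):
    """Sort candidate names by decreasing SVM score."""
    def score(name):
        return sum(wj * xj for wj, xj in zip(w, candidates[name]))
    return sorted(candidates, key=score, reverse=True)


# Hypothetical two-dimensional topic vectors.
topics = {
    "u1": [1.0, 0.0], "u2": [0.9, 0.1],  # marked relevant by the user
    "u3": [0.0, 1.0], "u4": [0.1, 0.9], "u5": [0.2, 0.8], "u6": [0.05, 0.95],
}
rng = random.Random(0)
positives = ["u1", "u2"]
negatives = sample_negatives(topics, positives, k=2, rng=rng)
X = [topics[name] for name in positives + negatives]
y = [1] * len(positives) + [-1] * len(negatives)
w = train_linear_svm(X, y, rng=rng)
ranking = rank({"new_a": [0.8, 0.2], "new_b": [0.3, 0.7]}, w)
```

In later rounds the "+relevant users" and "+non-relevant users" feedback would extend `positives` and `negatives` before retraining.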

EVALUATION: SCHEME

- Real user data
- 25% of the visited venues withheld, rest used to seed the system
- 10 interaction rounds
- Measure: average recall of the withheld venues
- Only exact withheld venues count as a match
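
The scheme above can be sketched as follows. The split helper, the function names, and the example users are assumptions for illustration; in particular, the slides do not say how the 25% was chosen, so a random split is assumed.

```python
import random


def split_visits(visited, withheld_fraction=0.25, rng_seed=0):
    """Split one user's visited venues into a seeding list and a withheld set."""
    rng = random.Random(rng_seed)
    shuffled = list(visited)
    rng.shuffle(shuffled)
    n_withheld = max(1, round(len(shuffled) * withheld_fraction))
    return shuffled[n_withheld:], set(shuffled[:n_withheld])


def recall(suggested, withheld):
    """Fraction of withheld venues among the suggestions (exact matches only)."""
    if not withheld:
        return 0.0
    return len(set(suggested) & withheld) / len(withheld)


def average_recall(suggested_per_user, withheld_per_user):
    """Mean recall over users, for example after one interaction round."""
    scores = [
        recall(suggested_per_user[user], withheld_per_user[user])
        for user in withheld_per_user
    ]
    return sum(scores) / len(scores)


seeding, withheld = split_visits([f"v{i}" for i in range(8)])

# Hypothetical suggestions and withheld venues for two users.
suggested = {"alice": ["v1", "v7", "v9"], "bob": ["v3", "v4"]}
held_out = {"alice": {"v1", "v2"}, "bob": {"v3", "v4"}}
```

Calling `average_recall` once per round on the suggestions made so far would give one point per interaction round.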

EVALUATION: RESULTS

[Plot: average recall (y axis, 0 to 0.35) against interaction round (x axis, 1 to 10) for Baseline, NYM-V, NYM-T, and NYM-VT]

FUTURE OF MELANGE: SOFTWARE

- Consolidated Melange deployable everywhere
  - Amsterdam
  - Hong Kong
  - Beijing
  - Washington, D.C.
  - Prague
  - Rennes
  - ...

CONCLUSION

- A model of multimedia analytics integration, tasks and challenges
- Based on extensive survey work
- Multimedia Analytics Article Library: staff.fnwi.uva.nl/j.zahalka/maal.html
- Current state-of-the-art techniques allow realization
- Ample research opportunities in closing the gaps
- Model already successfully instantiated
- New Yorker Melange: newyorkermelange.com

[Figure labels: Images, Text, Metadata]
