Research and Development at Sound and Vision

Post on 29-Jun-2015

Category:

Education

DESCRIPTION

Slides for guest lecture about R&D at the Netherlands Institute for Sound and Vision for the lecture series "Introduction to IMM" at VU Amsterdam. With slides by Lotte Belice Baltussen, Maarten Brinkerink, Johan Oomen, Bouke Huurnink and Victor de Boer

Transcript

Research and Development at Sound and Vision

Victor de Boer

vdboer@beeldengeluid.nl v.de.boer@vu.nl

With slides by Johan Oomen, Lotte Belice Baltussen, Maarten Brinkerink, Bouke Huurnink

Mediapark Hilversum

Founded in 1997 from:
• 3 archives (RVD, AVAC/broadcasters, Stichting Film en Wetenschap)
• 1 museum (Omroepmuseum)

Largest audiovisual archive

The Mission

1. “Managing and safeguarding the audiovisual treasure trove of the Netherlands”

Cultural-historical institute
• Archive
• Museum
• Knowledge centre

2. “Access to this treasure trove for everyone!”
• Archive collection for programme makers and education
• Media Experience for the general public

14 April 2023

Nederlands Instituut voor Beeld en Geluid

70% of audiovisual heritage, > 800,000 hours
• 250,000 hours of television
• 150,000 hours of radio
• 300,000 hours of music
• 50,000 hours of documentaries, film, commercials, etc.
• 2 million photos
• 20,000 objects

The archival task in the digital domain

Fulfilling the archival task adequately in a non-linear media landscape as well;

The internet and mobile technology have completely changed the way we communicate;

Audiovisual material: from rare historical document to ‘preferred’ information carrier;

The archiving process is very different in the digital era.

“The Archive as a Laboratory”

Research and Development

• Exchange
– Interoperability between collections
– Open Data
– Linking collection to semantic web
• Access
– Annotation
– Retrieval
– Contextualization
• Knowledge about end-users
– Interviews
– Experiments
– Log studies

Thanks to the proliferation of channels and the portability of new computing and telecommunications technologies, we are entering an era where media will be everywhere. (Jenkins, 2006)

The core design principle underlying the Web’s usefulness and growth is openness and universality. (Oomen & Aroyo, 2011)

• A lot of cultural heritage has been digitised or is being digitised.
• Consequently, it is brought online. (Or?)
• The archival turn: “[...] material that was until recently locked into archival vaults and mainly used by professionals has now become available and accessible to non-industry users.” -- De Leeuw, 2012

INTRODUCTION

TYPES OF VALUE

1. Economic (income)
2. Public reach and access
3. Re-use
4. Participation

The focus of most cultural heritage (CH) organisations is on 2, 3 and 4

2. PUBLIC REACH AND ACCESS

• Number of (online) visitors
• Number of app downloads
• Examples: website visits, project website visits, engagement on Twitter and Facebook, repins on Pinterest.

3. RE-USE

• Re-use of collections by an organisation itself, and by the public.

• Examples: recent developments between the CH world and open initiatives; the Europeana API

4. PARTICIPATION

• Participatory culture
• From engagement to creative re-use

PARTICIPATION

Oomen & Aroyo (2011)

BEING OPEN: CONTRIBUTING TO THE COMMONS

Many of the projects that we and others in the CH field run to increase access, re-use and participation are ‘open’:

“A piece of content or data is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.” --http://opendefinition.org/

BEING OPEN: A GENERAL TREND

• The open data movement => New EU directive on open data

• Global mass digitisation
• Example: Comité des Sages report
• Example: Europeana API

BEING OPEN: CREATIVE COMMONS LICENSES

Open Images is an open media platform that offers online access to audiovisual archive material to stimulate creative reuse.

Built by Sound and Vision & Knowledgeland but designed for participation by others.

• Open source (MMBase, FFmpeg, LAMP)

• Open media formats (Ogg Theora, WebM)

• Open standards (Dublin Core, CC-REL, HTML 5)

• Open API (OAI-PMH, CC-0)

• Open content (CC-licenses, PD Mark)

OPEN, OPEN, OPEN!
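A minimal sketch of what the "Open API (OAI-PMH)" bullet can mean in practice. OAI-PMH is a plain HTTP protocol, so harvesting comes down to requesting `verb=ListRecords` with a `metadataPrefix` and reading the Dublin Core fields from the XML that comes back. The base URL `https://example.org/oai` and the sample record are invented for illustration; they are not the actual Open Images endpoint or data, and no network call is made here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response, shaped like an OAI-PMH ListRecords answer.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example newsreel</dc:title>
          <dc:rights>https://creativecommons.org/licenses/by-sa/3.0/nl/</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a (hypothetical) OAI-PMH endpoint."""
    if resumption_token:
        # A resumption token is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier, titles and rights statements from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "titles": [e.text for e in rec.iterfind(".//dc:title", NS)],
            "rights": [e.text for e in rec.iterfind(".//dc:rights", NS)],
        })
    return records


if __name__ == "__main__":
    print(list_records_url("https://example.org/oai"))
    for record in parse_records(SAMPLE_RESPONSE):
        print(record)
```

Reading `dc:rights` alongside the title is deliberate: for re-use, the license attached to each item matters as much as its description.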

OPEN, OPEN, OPEN!

Openbeelden.nl / openimages.eu

• CC BY-SA as the preferred license
• 3000 items
• “Internet quality”

∼800,000h

∼110h

Objectives
• Public outreach by embracing new technologies and ‘participatory culture’
• Contextualization by interlinking with other platforms
• Exploring new services and distribution models
• Supporting a National and European Audiovisual Commons

MW2012

Hackathons

http://www.beeldvoorbeeld.nl/tv/

Research and Development

• Exchange
– Interoperability between collections
– Open Data
– Linking collection to semantic web
• Access
– Annotation
– Retrieval
– Contextualization
• Knowledge about end-users
– Interviews
– Experiments
– Log studies

Annotation today

• INSERT PICTURE OF IMMIX
– ±15 archivists, with a changing role
– Speech recognition and radio streaming in

From 2015 this should happen automatically…

Current situation: video is ingested from broadcasters; documentalists annotate it using thesaurus terms

… ? “Ezel” (Dutch for “donkey”)

How could you do this?

How?
• Image recognition
• Speaker recognition
• Speech recognition
• Broadcasting guides
• Crowdsourcing
• TT888 subtitles

Term extraction from TT888 subtitles

Algorithm

Dutch word frequencies

Named Entity Recognition

B&G thesaurus

“Ezel”, “Amsterdam”, “Jos Brink”
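The term-extraction idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not the algorithm used at Sound and Vision: a word becomes a candidate when it is much more frequent in the subtitles than in general Dutch, and it is kept only if it maps to a thesaurus entry. The function names, the `min_ratio` threshold, the background frequencies and the tiny thesaurus are all made up for the example, and the Named Entity Recognition step is left out, so multi-word names such as “Jos Brink” are not handled.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def extract_terms(subtitles, background_freq, thesaurus,
                  min_ratio=5.0, default_freq=1e-6):
    """Return thesaurus labels for words that are unusually frequent.

    background_freq: relative frequency of each word in general Dutch (toy data).
    thesaurus: maps a lowercase word to its preferred label (toy data).
    """
    counts = Counter(tokenize(subtitles))
    total = sum(counts.values())
    scored = []
    for word, n in counts.items():
        if word not in thesaurus:
            continue  # only controlled-vocabulary terms are kept
        # How much more frequent is the word here than in general Dutch?
        ratio = (n / total) / background_freq.get(word, default_freq)
        if ratio >= min_ratio:
            scored.append((ratio, thesaurus[word]))
    # Most distinctive terms first.
    return [label for _, label in sorted(scored, reverse=True)]


if __name__ == "__main__":
    subtitles = "De ezel loopt door Amsterdam. De ezel eet."
    background = {"de": 0.05, "ezel": 0.0001, "amsterdam": 0.001,
                  "loopt": 0.002, "door": 0.01, "eet": 0.001}
    thesaurus = {"ezel": "Ezel", "amsterdam": "Amsterdam", "fiets": "Fiets"}
    print(extract_terms(subtitles, background, thesaurus))
```

Filtering against the thesaurus keeps the output compatible with the terms documentalists already use, at the cost of missing anything the thesaurus does not contain.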

Evaluation: How well does the algorithm work?

• Compare to manual terms from documentalists
• What (type of) terms are extracted (or not)?
• Are the automatically extracted ones better?
• End-user test
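One simple way to carry out the comparison with manual terms is to compute precision and recall of the extracted terms against the documentalists' terms. The sketch below assumes both are available as plain sets of labels; the function name and the example terms are illustrative. Because the slide also asks whether the automatic terms might be better, the manual terms are a reference point here rather than a ground truth, which is why an end-user test is listed as well.

```python
def precision_recall(extracted, manual):
    """Compare automatically extracted terms with manually assigned ones.

    precision: share of extracted terms that a documentalist also assigned.
    recall: share of manual terms that the algorithm also found.
    """
    extracted, manual = set(extracted), set(manual)
    overlap = extracted & manual
    precision = len(overlap) / len(extracted) if extracted else 0.0
    recall = len(overlap) / len(manual) if manual else 0.0
    return precision, recall


if __name__ == "__main__":
    extracted = {"Ezel", "Amsterdam", "Jos Brink", "Fiets"}  # made-up output
    manual = {"Ezel", "Amsterdam", "Dieren"}                 # made-up reference
    p, r = precision_recall(extracted, manual)
    print(f"precision={p:.2f} recall={r:.2f}")
    # Terms found only by the algorithm deserve a manual look: they may be
    # noise, or useful terms the documentalist did not assign.
    print(sorted(extracted - manual))
```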

Crowdsourcing

geluidvannederland.nl

Amateur films

COGNITIVE SURPLUS

The so-called “cognitive surplus” that used to be spent on passive activities (notably watching television) can now be used in a profoundly different way, for new kinds of creativity and problem-solving. (Oomen & Aroyo, 2011; Shirky, 2010)

WAISDA? (What’s That?)

- Game With a Purpose (GWAP)
- Allows internet users to annotate audiovisual archive material in the form of a (serious) game
- The goal of the game is consensus between players (which also works as a filter)
- Fun and competition as motivation
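Consensus as a filter can be sketched like this, assuming that a tag counts as matched when two different players enter the same text for the same video within a short time window. The 10-second window, the `Tag` fields and the example tags are assumptions made for illustration; the slides do not specify the actual matching rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    player: str    # who entered the tag
    text: str      # what was typed
    time_s: float  # position in the video, in seconds


def matched_tags(tags, window_s=10.0):
    """Return the tags confirmed by a different player within window_s seconds."""
    ordered = sorted(tags, key=lambda t: t.time_s)
    matched = set()
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.time_s - a.time_s > window_s:
                break  # later tags are even further away in time
            if a.player != b.player and a.text.lower() == b.text.lower():
                matched.add(a)
                matched.add(b)
    return matched


if __name__ == "__main__":
    tags = [
        Tag("p1", "ezel", 12.0),
        Tag("p2", "Ezel", 15.5),       # confirms p1's tag
        Tag("p1", "boerderij", 20.0),  # nobody else entered this
        Tag("p3", "ezel", 40.0),       # same word, but too late to match
    ]
    for tag in sorted(matched_tags(tags), key=lambda t: t.time_s):
        print(tag)
```

Because each tag keeps its position in the video, matched tags double as time-related metadata for the fragment in which they were entered.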

GOALS AND ADDED VALUE

- Investigate the added value of social tagging
- Experiment with new forms of services for the public (serious games)

Which results in:
- Time-related metadata
- Social tagging (bridging the semantic gap)
- Interaction between the archive/broadcaster and the public

GAME MECHANICS 2/5

GAME MECHANICS 4/5

GAME MECHANICS 5/5

RESULTS AND FINDINGS 1/2

- Three implementations resulted in over a million social tags (ongoing)
- On average 40-50% of the social tags are matched tags
- On average 10-20% of the social tags are unique
- ‘Super taggers’ are responsible for the vast majority of the social tags that are added

RESULTS AND FINDINGS 2/2

- The extent to which expert cataloguers deem the social tags to be useful depends heavily on the type of content
- The same is true for the balance between social tags that correspond with terms from a controlled vocabulary and terms invented by users themselves
- First experiments suggest that the social tags enable high-recall fragment retrieval.

RETRIEVAL AND ACCESS

Retrieval reality

Retrieval research

• Search with all sorts of annotations
– Game, automatic
– A/B tests
• Log analysis (Huurnink et al.)
– Most queries are: [name of program] + [proper name]
• New forms of access
– LinkedTV
– DIVE

CONTEXTUALISATION: AN AV-ARCHIVE DEFINITION

All information that can be used to interpret and understand the production, publication and reception of audiovisual heritage from multiple standpoints.

metadata ≠ context source

...TO CONTEXT: MUTUALLY CONNECTED COLLECTIONS...

Connecting collections: topics, people, genres, etc.
• Catalogue
• Photos
• B&G Wiki
• Programme guides

Video hyperlinking

Method to link a video to other multimedia sources

Applications:
• Search: connections between sources based on a query (clustering, storytelling)
• Detail-on-demand: zooming in on specific elements through linked information sources (contextualisation)

12 February 2013

Networked heritage

Concept: Jan Sluijters (painter)

DBpedia

Related items

Links

• Styles (Expressionism, Cubism, Fauvism)

• Period (contemporaries)

LinkedTV: Example of contextualization

LinkedTV – SmartTV

Cultural heritage scenario: Tussen Kunst & Kitsch

Thanks to an agreement with AVRO!

DIVE

• Linked Data
– B&G videos
– Koninklijke Bibliotheek
• Event-based
– Context, narratives
• New paradigms of explorative search and browsing

DIVE

Evaluation

• How well can we integrate these heterogeneous collections?
– How many (correct) links?
• How well can we extract knowledge from the descriptions with ensemble methods?
• How workable is this type of interface?
– Do people find other, new media objects?
  • Professionals
  • Amateurs
– Are they satisfied with it?
– Do they get a better picture of events?

Research and Development

• Exchange
– Interoperability between collections
– Open Data
– Linking collection to semantic web
• Access
– Annotation
– Retrieval
– Contextualization
• Knowledge about end-users
– Interviews
– Experiments
– Log studies

vdboer@beeldengeluid.nl | v.de.boer@vu.nl

beeldengeluid.nl

http://www.beeldengeluid.nl/blogs/Research-and-Development
