Research and Development at Sound and Vision
Victor de Boer
[email protected] [email protected]
With slides by Johan Oomen, Lotte Belice Baltussen, Maarten Brinkerink, Bouke Huurnink
Jun 29, 2015
Mediapark Hilversum
Founded in 1997: 3 archives (RVD, AVAC/broadcasters, Stichting Film en Wetenschap)
1 museum (Omroepmuseum)
Largest audiovisual archive
The Mission
1. “Managing and safeguarding the audiovisual treasury of the Netherlands”
Cultural-historical institute
• Archive
• Museum
• Knowledge centre
2. “Access to this treasury for everyone!”
• Archive collection for programme makers, education
• Media Experience for the general public
14 April 2023
Nederlands Instituut voor Beeld en Geluid
70% of audiovisual heritage: > 800,000 hours
• 250,000 hours of television
• 150,000 hours of radio
• 300,000 hours of music
• 50,000 hours of documentaries, film, commercials, etc.
Archival task in the digital domain
Fulfilling the archival task adequately in a non-linear media landscape as well;
The internet and mobile technology have completely changed the way we communicate;
Audiovisual material: from rare historical document to ‘preferred’ information carrier;
The archiving process is very different in the digital era.
“The Archive as a Laboratory”
Research and Development
• Exchange
  – Interoperability between collections
  – Open Data
  – Linking collection to semantic web
• Access
  – Annotation
  – Retrieval
  – Contextualization
• Knowledge about end-users
  – Interviews
  – Experiments
  – Log studies
Thanks to the proliferation of channels and the portability of new computing and telecommunications technologies, we are entering an era where media will be everywhere. (Jenkins, 2006)
The core design principle underlying the Web’s usefulness and growth is openness and universality. (Oomen & Aroyo, 2011)
• A lot of cultural heritage is digitised or is being digitised.
• Consequently, it is brought online. (Or?)
• The archival turn: “[...] material that was until recently locked into archival vaults and mainly used by professionals has now become available and accessible to non-industry users.” -- De Leeuw, 2012
INTRODUCTION
TYPES OF VALUE
1. Economic (income)
2. Public reach and access
3. Re-use
4. Participation
Focus of most CH organisations is on 2, 3 and 4
2. PUBLIC REACH AND ACCESS
• Number of (online) visitors
• Number of app downloads
• Examples: website visits, project website visits, engagement on Twitter and Facebook, repins on Pinterest.
3. RE-USE
• Re-use of collections by an organisation itself, and by the public.
• e.g. developments in recent years between the CH world and open initiatives, such as the Europeana API
4. PARTICIPATION
• Participatory culture
• From engagement to creative re-use
PARTICIPATION
Oomen & Aroyo (2011)
BEING OPEN: CONTRIBUTING TO THE COMMONS
Many of the projects that we and others in the CH field run to increase access, re-use and participation are ‘open’:
“A piece of content or data is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.” --http://opendefinition.org/
BEING OPEN: A GENERAL TREND
• The open data movement => New EU directive on open data
• Global mass digitisation
• Example: Comité des Sages report
• Example: Europeana API
BEING OPEN: CREATIVE COMMONS LICENSES
Open Images is an open media platform that offers online access to audiovisual archive material to stimulate creative reuse.
Built by Sound and Vision & Knowledgeland but designed for participation by others.
• Open source (MMBase, FFmpeg, LAMP)
• Open media formats (Ogg Theora, WebM)
• Open standards (Dublin Core, CC-REL, HTML 5)
• Open API (OAI-PMH, CC-0)
• Open content (CC-licenses, PD Mark)
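The "Open API" bullet names OAI-PMH. As a rough sketch of what harvesting Dublin Core records over that protocol could look like, the Python below builds a standard `ListRecords` request and parses a response. The endpoint URL and the sample record are assumptions made up for illustration and do not come from the slides; only the `verb` and `metadataPrefix` parameters and the XML namespaces follow the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint: substitute the real OAI-PMH base URL of the platform.
BASE_URL = "https://example.org/oai"

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL as defined by OAI-PMH 2.0."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Return (title, rights) pairs for every record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iterfind(".//oai:record", NS):
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        rights = record.findtext(".//dc:rights", default="", namespaces=NS)
        results.append((title, rights))
    return results


# Made-up sample response, trimmed to the elements the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example newsreel</dc:title>
          <dc:rights>https://creativecommons.org/licenses/by-sa/3.0/nl/</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url(BASE_URL))
print(parse_records(SAMPLE))
```

Reading the `dc:rights` field alongside the title is one way a re-user could check the license of each item before downloading it.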
OPEN, OPEN, OPEN!
Openbeelden.nl / openimages.eu
• CC BY-SA as preferred license
• 3000 items
• “Internet quality”
∼800,000h
∼110h
Objectives
• Public outreach by embracing new technologies and ‘participatory culture’
• Contextualization by interlinking with other platforms
• Exploring new services and distribution models
• Supporting a National and European Audiovisual Commons
MW2012
Hackathons
http://www.beeldvoorbeeld.nl/tv/
Annotation today
• INSERT PICTURE OF IMMIX
  – ±15 archivists, with changing role
  – Speech recognition and radio streaming in
From 2015 this should happen automatically…
Current situation: video ingested from broadcasters, documentalists annotate using thesaurus terms
[Figure: … ? → “Ezel” (donkey)]
How could you do this?
How?
• Image recognition
• Speaker recognition
• Speech recognition
• Broadcasting guides
• Crowdsourcing
• TT888 subtitles
Term extraction from TT888 subtitles
Algorithm
Word frequencies (Dutch)
Named Entity Recognition
Thesaurus B&G
“Ezel”, “Amsterdam”, “Jos Brink”
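The slide only names the ingredients of the term extraction (subtitles, Dutch word frequencies, named entity recognition, the B&G thesaurus). As a minimal sketch of how two of them could be combined, the Python below looks up thesaurus terms in subtitle text and ranks them by how much more often they occur than a background frequency would predict. The frequency table, the thesaurus entries, the scoring formula and the function name `extract_terms` are all assumptions for illustration, and the named entity recognition step is left out entirely.

```python
import math
import re
from collections import Counter

# Toy stand-ins (assumptions): relative frequencies of a few common Dutch
# words and a handful of thesaurus terms. Real resources would be far larger.
DUTCH_FREQ = {"de": 0.05, "het": 0.03, "een": 0.03, "in": 0.02, "en": 0.03}
THESAURUS = {"ezel", "amsterdam", "jos brink"}
UNSEEN_FREQ = 1e-6  # assumed frequency for words missing from DUTCH_FREQ


def extract_terms(subtitles, top_n=5):
    """Rank thesaurus terms found in the subtitles by how unusual they are."""
    text = subtitles.lower()
    words = re.findall(r"[^\W\d_]+", text)  # letter-only tokens
    counts = Counter(words)
    total = len(words)
    scores = {}
    for term in THESAURUS:
        # Multi-word terms are counted on the raw text, single words on tokens.
        n = text.count(term) if " " in term else counts[term]
        if n == 0:
            continue
        # Score: observed relative frequency versus expected background frequency.
        expected = min(DUTCH_FREQ.get(w, UNSEEN_FREQ) for w in term.split())
        scores[term] = math.log((n / total) / expected)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


subs = "De ezel loopt in Amsterdam. Jos Brink ziet de ezel en een ezel ziet Jos Brink."
print(extract_terms(subs))
```

Restricting the output to thesaurus terms keeps the result comparable with what documentalists assign by hand, at the cost of missing anything the thesaurus does not yet contain.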
Evaluation: How well does the algorithm work?
• Compare to manual terms from documentalists
• What (type of) terms are extracted (or not)?
• Are the automatically extracted ones better?
• End user test
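The comparison with manual terms can be made concrete with set overlap. The sketch below computes precision, recall and F1 of extracted terms against the documentalists' terms for one programme; the term lists are hypothetical and the choice of these three measures is an assumption, since the slides do not say which measures were used.

```python
def evaluate(extracted, manual):
    """Precision, recall and F1 of extracted terms against manual terms."""
    extracted, manual = set(extracted), set(manual)
    overlap = extracted & manual
    precision = len(overlap) / len(extracted) if extracted else 0.0
    recall = len(overlap) / len(manual) if manual else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1


# Hypothetical terms for a single programme.
auto_terms = ["ezel", "amsterdam", "jos brink", "straat"]
manual_terms = ["ezel", "jos brink", "dieren"]

p, r, f = evaluate(auto_terms, manual_terms)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Overlap with manual terms treats the documentalists as ground truth, so it cannot show whether an automatic term is better than a manual one; that question is what the end user test is for.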
Crowdsourcing
COGNITIVE SURPLUS
The so-called “cognitive surplus” that used to be spent on passive activities (notably watching television) can now be used in a profoundly different way, for new kinds of creativity and problem-solving. (Oomen & Aroyo, 2011; Shirky, 2010)
WAISDA? (What’s That?)
- Game-With-a-Purpose (GWAP)
- Allows internet users to annotate audiovisual archive material in the form of a (serious) game
- The goal of the game is consensus between players (which also works as a filter)
- Fun and competition as motivation
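Consensus as a filter can be sketched as follows: a tag entry counts as matched when a different player enters the same tag for the same video at roughly the same moment. The 10-second window, the case-insensitive comparison and the names `TagEntry` and `matched_entries` are assumptions for illustration; the slides do not specify the matching rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagEntry:
    player: str
    video: str
    tag: str
    time_s: float  # position in the video where the tag was entered


def matched_entries(entries, window_s=10.0):
    """Return entries confirmed by a different player with the same tag
    on the same video within window_s seconds (assumed matching rule)."""
    matched = set()
    for a in entries:
        for b in entries:
            if (a.player != b.player and a.video == b.video
                    and a.tag.lower() == b.tag.lower()
                    and abs(a.time_s - b.time_s) <= window_s):
                matched.add(a)
                break
    return matched


entries = [
    TagEntry("anna", "v1", "ezel", 12.0),
    TagEntry("bram", "v1", "Ezel", 15.5),
    TagEntry("anna", "v1", "boerderij", 30.0),
    TagEntry("bram", "v1", "ezel", 95.0),
]
confirmed = matched_entries(entries)
print(sorted((e.player, e.tag, e.time_s) for e in confirmed))
```

The pairwise comparison is quadratic in the number of entries, which is fine for a sketch; grouping entries by video and tag first would be the obvious step for larger volumes.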
GOALS AND ADDED VALUE
- Investigate the added value of social tagging
- Experimenting with new forms of services for the public (serious games)
Which results in:
- Time-related metadata
- Social tagging (bridging the semantic gap)
- Interaction between the archive/broadcaster and the public
GAME MECHANICS 1/5
GAME MECHANICS 2/5
GAME MECHANICS 4/5
GAME MECHANICS 5/5
RESULTS AND FINDINGS 1/2
- Three implementations resulted in over a million social tags (ongoing)
- On average, 40-50% of the social tags are matched tags
- On average, 10-20% of the social tags are unique
- ‘Super taggers’ are responsible for the vast majority of the social tags that are added
RESULTS AND FINDINGS 2/2
- The extent to which expert cataloguers deem the social tags to be useful depends heavily on the type of content
- The same is true for the balance between social tags that correspond with terms from a controlled vocabulary and terms invented by users themselves
- First experiments suggest that the social tags enable high recall fragment retrieval.
RETRIEVAL AND ACCESS
Retrieval reality
Retrieval research
• Search with all sorts of annotations
  – Game, automatic
  – A/B tests
• Log analysis (Huurnink et al.)
  – Most queries are: [name of program] + [proper name]
• New forms of access
  – LinkedTV
  – DIVE
CONTEXTUALISATION: AN AV-ARCHIVE DEFINITION
All information that can be used to interpret and understand the production, publication and reception of audiovisual heritage from multiple standpoints.
metadata ≠ context source
...TO CONTEXT: MUTUALLY CONNECTED COLLECTIONS...
Connecting collections: topics, people, genres, etc.
• Catalogue
• Photos
• B&G Wiki
• Programme guides
Video hyperlinking
Method to link a video to other multimedia sources
Applications:
• Search: connections between sources based on a query (clustering, storytelling)
• Detail-on-demand: zooming in on specific elements through linked information sources (contextualisation)
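One simple way to read "linking a video to other multimedia sources" is to connect items from different collections whenever they share a topic, person or genre. The sketch below does that with an inverted index from entity to items. The item identifiers and their annotations are hypothetical, and the function name `link_items` is an assumption; real hyperlinking would work on video fragments and on entities resolved against a shared vocabulary.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical items from different collections, each annotated with entities.
ITEMS = {
    "video:kunst-en-kitsch-01": {"Jan Sluijters", "Expressionism"},
    "photo:atelier-1920": {"Jan Sluijters"},
    "wiki:expressionisme": {"Expressionism", "Cubism"},
    "guide:1975-week-12": {"Jos Brink"},
}


def link_items(items):
    """Link every pair of items that share at least one entity."""
    by_entity = defaultdict(set)
    for item_id, entities in items.items():
        for entity in entities:
            by_entity[entity].add(item_id)
    links = defaultdict(set)  # (item_a, item_b) -> shared entities
    for entity, ids in by_entity.items():
        for a, b in combinations(sorted(ids), 2):
            links[(a, b)].add(entity)
    return dict(links)


for (a, b), shared in sorted(link_items(ITEMS).items()):
    print(a, "<->", b, "via", sorted(shared))
```

Keeping the shared entities on each link supports the detail-on-demand use: the interface can show why two sources are connected, not only that they are.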
12 February 2013
Networked heritage
Concept: Jan Sluijters (painter)
DBpedia
Related items
Links
• Styles (Expressionism, Cubism, Fauvism)
• Period (contemporaries)
LinkedTV: Example of contextualization
LinkedTV – SmartTV
Cultural heritage scenario, Tussen Kunst & Kitsch
Thanks to an agreement with AVRO!
DIVE
• Linked Data
  – B&G videos
  – Koninklijke Bibliotheek
• Event based
  – Context, Narratives
• New paradigms of explorative search and browsing
DIVE
Evaluation
• How well can we integrate these heterogeneous collections?
  – How many (correct) links?
• How well can we extract knowledge from the descriptions with ensemble methods?
• How workable is this type of interface?
  – Do people find other, new media objects?
    • Professionals
    • Amateurs
  – Are they satisfied with it?
  – Do they get a better picture of events?
[email protected] | [email protected]
beeldengeluid.nl
http://www.beeldengeluid.nl/blogs/Research-and-Development