Breaking the Waves
Alastair Dunning
(The European Library /
Europeana)
Discovery Summit
London, Feb 2013
@alastairdunning
“There is a tsunami of data that is crashing
onto the beaches of the civilized world. This
is a tidal wave of unrelated, growing data
formed in bits and bytes, coming in an
unorganized, uncontrolled, incoherent
cacophony of foam…. we see graphic
designers and government officials, all
getting their shoes wet and slowly
submerging in the dense trough of stuff….
they walk stupidly into the water, smiling—a
false smile of confidence and control.
The tsunami is a wall of data—data
produced at a greater and greater speed …
in amounts that double, it seems, with each
sunset ....
... [Thankfully]
Google mastered the
technical art of the
search.”
http://intdev.stc.org/2012/02/question-information-quest-inform/
But Europeana’s
aims are different
Trusted data from
European cultural
heritage
Before that - a
quick aside
• 26m (Feb 2013) metadata
records from 2,200
European galleries,
museums, archives and
libraries
• Books, newspapers,
journals, letters, diaries,
archival papers... Paintings,
maps, drawings,
photographs… Music,
spoken word, radio
broadcasts…
• Only links to digitised
content; 31 languages
• Started in 2007
• Based in National Library of
Netherlands
Europeana - Europe’s cultural heritage portal
• Centrally indexes 115m
bibliographic records, plus
16m digital links
• 48 National Libraries of
Europe
• Plus 19 research libraries
• Links to digitised content
and bibliographic records at
libraries
• Started in 1990s - ‘Mother’ of
Europeana. Now aggregates
content for Europeana
• Also hosted in National
Library of Netherlands
The European Library (TEL) - Europe’s library aggregator
[Diagram: Europeana and its aggregators]
Domains: Libraries; Museums; Film & Sound; Archives; Archaeological Heritage; Other Cultural Heritage
Aggregators: The European Library; ATHENA; Euro. Film Gateway; ApeNet; CARARE; Culture Grid (UK); National Aggregators; Other Aggregators
The European Library: British Library; Spanish National Library; German National Library; Italian National Libraries; French National Library; and another 43 national libraries; 19 research libraries and RLUK
End of aside
Europeana Portal
Europeana API
Europeana Linked
Open Data
Europeana SPARQL
Endpoint
Linked Data &
aggregation of
data for others -
source and quality
of data is
paramount
Europeana and
TEL are testing the
waters of resource
discovery
Are we making
progress? And what is impeding progress?
The European Library to
release >115m
bibliographic records
as CC0 this year
Working with
RLUK to release
members’
metadata as linked
data
API and Linked
Data to be
published this year
as well
22m+ metadata
records released
as CC0 by
Europeana
c.2,200 institutions
Some of the largest
cultural datasets in
the world
but ...
Cool URIs
97% of links
resolve properly
660,000 (c.3%) of
records have
broken links
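As a rough illustration of how a figure like this could be measured, the sketch below computes the share of record links that fail to resolve. It is not Europeana's tooling: the function names `resolves` and `broken_share` are hypothetical, and the `probe` argument is an assumption added so the calculation can be tried without network access.

```python
from urllib.error import URLError
from urllib.request import Request, urlopen


def resolves(url, timeout=10):
    """Return True if a HEAD request to `url` succeeds (hypothetical helper)."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (URLError, ValueError, OSError):
        # HTTP errors, malformed URLs and timeouts all count as "broken".
        return False


def broken_share(urls, probe=resolves):
    """Fraction of `urls` for which `probe` reports a failure."""
    urls = list(urls)
    if not urls:
        return 0.0
    broken = sum(1 for u in urls if not probe(u))
    return broken / len(urls)
```

With the slide's own numbers, 660,000 broken links out of roughly 22m records gives 660,000 / 22,000,000 = 0.03, i.e. the c.3% quoted above.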
Licensing
64% of
records do not
come with
clear licensing
about the
content
Current licence distribution in Europeana
Europeana has launched a rights labelling campaign to improve this
and even when
metadata is
technically well
formed ... it might
not help user
discovery
Quality of
metadata?
User path?
Lack of
context?
Multiple
records for
one item
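To make the duplicate-record problem concrete, here is a minimal sketch that groups records sharing a normalised title and creator. It is a deliberately naive illustration, not the clustering approach used by Europeana, TEL or OCLC; the field names `id`, `title` and `creator` and the functions `match_key` and `cluster` are assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def match_key(record):
    """Build a crude matching key from title and creator (assumed field names)."""
    text = f"{record.get('title', '')} {record.get('creator', '')}"
    # Strip accents, lowercase, and collapse punctuation and whitespace.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def cluster(records):
    """Return lists of record ids that share the same matching key."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec)].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]
```

Exact-key matching like this only catches trivial variants (case, accents, punctuation); records that differ in language or cataloguing practice would need the richer semantic matching discussed later in the talk.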
The perils of
basic search
And of course
semantic differences
Much of this is
‘basics’ - licensing,
permanent URIs,
quality of metadata,
intelligent URIs
More complex
issues such as
semantics,
clustering of
records and
relevancy ...
... are being
addressed by the
Europeana Data
Model
Europeana
enriches with
GEMET thesaurus;
GeoNames;
Semium for dates;
DBpedia;
…while TEL uses
MACS for subject
headings; VIAF for
persons, orgs;
GeoNames
and working with
OCLC on
clustering records
No point aggregating if you
can’t reuse
For Europeana and TEL, finding
(re)users is critical
Europeana dataset: 77
prototypes based on
Europeana data
http://popcorn.webmadecontent.org/je9 HTML5 Music Player
How do I get
involved?
Europeana Network
Adopting open licensing
Clear and documented APIs
Ensuring data currency and
accuracy
Optimising data for reuse
So rather than
Canute holding
back the waves ...
HMS
Discovery
charting new
waters.
Thank
you!
What is going to disrupt us and how will we
react?
There are a lot of us working in this area; how do we work
effectively together?
What changes do libraries, museums and archives need
to make to support better resource discovery?
http://www.theeuropeanlibrary.org/tel4/record/2000085285616?query=canute&link-level=THUMBNAIL
http://www.kb.dk/images/billed/2010/okt/billeder/object131728/en/
http://www.europeana.eu/portal/record/09405y/8CA4B71EC49BB62BC393766CD6545DBFECFC988B.html
http://www.freezeframe.ac.uk/collection/photos-british-arctic-expedition-1875-76/ls99-3-9?mode=giant
http://www.africamuseum.be/collections/browsecollections/europeana
http://www.digitalnz.org/records?i%5Bcontent_partner%5D=Europeana
http://www.geheugenvannederland.nl/?/zoom/index/&language=nl&i=http%3A%2F%2Fresolver.kb.nl%2Fresolve%3Furn%3Durn%3Agvn%3ANIOD01:AE0810&size%3Dlargels
http://opac.nebis.ch/F/VIJSXF7GE9YNUIG95HFXLJR26FEGPA73LSMTYS36MTLBGNFQAH-13374?func=service&doc_library=EBI01&doc_number=005289411&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA&pds_handle=GUEST