W3C Library Linked Data Group A Summary Antoine Isaac Europeana Vrije Universiteit Amsterdam Talis Linked Data and Libraries day, London, July 14th 2011
Jan 26, 2015
W3C Library Linked Data Group A Summary
Antoine Isaac
Europeana, Vrije Universiteit Amsterdam
Talis Linked Data and Libraries day, London, July 14th 2011
W3C incubator (XG) activity
• Short-lived working groups: around 1 year
• No delivery of W3C Recommendations, but “innovative ideas for specifications, guidelines, and applications that are not (or not yet) clear candidates as Web standards”
http://www.w3.org/2005/Incubator/
Library Linked Data incubator
• May 2010 – August 2011
• 51 participants
• 23 W3C member organizations
VU Amsterdam, INRIA, Library of Congress, JISC, Deutsche Nationalbibliothek, DERI Galway, OCLC, Talis, LANL, Helsinki University of Technology, University of Edinburgh, etc.
• Invited experts from other organizations
BnF, National Library of Latvia, German National Library of Economics, etc.
Up-to-date list at http://www.w3.org/2000/09/dbwg/details?group=44833
Mission
To help increase global interoperability of library data on the Web, by
bringing together people involved in Semantic Web activities, focusing on Linked Data, in the library community and beyond,
building on existing initiatives, and identifying collaboration tracks for the future.
Linked Library Cloud 2008
[Ross Singer, Code4Lib2010]
http://code4lib.org/conference/2010/singer
2010
[Ross Singer, Code4Lib2010] http://code4lib.org/conference/2010/singer
Now
Technological bits and pieces
• Vocabularies/schemas
Dublin Core, SKOS, BIBO, FRBR
• Web services
• Semantic Web search engines
• Ontology editors
Etc.
Need for mapping the landscape
Investigate answers to higher-level questions
NAHSL2009
Stuart Weibel
What’s this I hear about the Semantic Web?
• What is the Semantic Web?
• What does it have to do with bibliography?
• Does it make life better for patrons?
• Does it strengthen libraries?
• Is it practical?
• Where can we get some?
http://www.slideshare.net/stuartweibel/semantic-web-technologies-changing-bibliographic-descriptions
Various activities
• Discussions
• Presentations in various fora – libraries and beyond
• Writing papers, blog posts
• Gathering use cases and implementation examples
• Identifying relevant technology pieces
• Publishing linked data!
Deliverables
• Side deliverable on use cases
• Side deliverable on available data
• Main report
Use Cases
• Identify business cases and example implementations
• Over 50 cases from XG participants and community
• Grouped into 8 topical clusters
Bibliographic data, vocabulary alignment, citations, digital objects, social and new uses…
http://www.w3.org/2005/Incubator/lld/wiki/UseCaseReport
Available Data
Document surveying
• Datasets
• Value vocabularies
• Element sets
CKAN LLD group
http://ckan.net/group/lld
http://www.w3.org/2005/Incubator/lld/wiki/Vocabulary_and_Dataset
Main report
• Intended for a general library audience: decision-makers, developers, metadata librarians, etc.
• Tries to expand on general benefits, issues and recommendations
• An entry point into more specific resources
LLD XG side deliverables, many external links
Benefits
• General benefits of linked data
• Benefits to researchers, students and patrons
• Benefits to cultural institutions
• Benefits to librarians, archivists and curators
• Benefits to developers
Relevant technologies
• Linked data front-ends to existing data stores
• Web Application Frameworks
• Web services for library linked data
• Microformats, Microdata and RDFa
• Tools for data designers
Etc.
Implementation challenges and barriers to adoption
• Designed for stability, the library ecosystem resists (technological) change
• ROI is difficult to calculate
• Data may have rights issues that prevent (open) publication
• Data in library-specific formats is not easily shared outside the library community
Recommendations
• Assess
• Facilitate
• Design and prepare
• Curate, identify and link
Still one week for feedback!
• Wiki page
http://www.w3.org/2005/Incubator/lld/wiki/DraftReportWithTransclusion
• Comments can be sent to the public LLD list [email protected]
• Blog for fine-grained comments
http://blogs.ukoln.ac.uk/w3clld/
Or wait till we have finished…
The future?
Discussions and collaboration should continue
• Existing groups within libraries or with wider scope
– IFLA Semantic Web special interest group
– LOD-LAM
• A new W3C Community group?
A long-term effort
Libraries are in a unique position for this
Questions?
Links
• Official page @ W3C
http://www.w3.org/2005/Incubator/lld/
• Wiki site
http://www.w3.org/2005/Incubator/lld/wiki/
• LLD community mailing list [email protected]
http://lists.w3.org/Archives/Public/public-lld/
Some slides adapted from William Waites, http://eris.okfn.org/ww/2011/06/nls/
Pictures:
http://www.flickr.com/photos/nationalarchives/3048286070/
http://www.europeana.eu/portal/record/03903/78FA3F8B4299B45C25C395345D3D16ED24EA7F4F.html
http://www.europeana.eu/portal/record/03912/E9666896A50FDDE5F7F15A17C11219A7FBCBBC50.html
http://europeana.eu/portal/record/09405o/651D82BEC748FF421B4252C699CC2498EF57E466.html
(Europeana links give access to resources on original sites, with copyright info)
General benefits of linked data
• Shareable
– Globally unique resolvable identifiers - URI
– Libraries can make trusted metadata descriptions for common use
• Extensible
– "Open world" - no description is complete, anybody can add descriptive information from within their own publishing space
• Re-usable
– Descriptions from diverse sources talking about the same thing
– Annotations, enrichments, etc.
• Internationalisable
– Full support for translations of terms to other languages
– Natural language strings are not used as identifiers
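The four properties above can be illustrated with a very small sketch. It assumes triples are modelled as plain Python tuples rather than with an RDF library, and every example.org URI is hypothetical; only the SKOS and Dublin Core Terms property URIs are real.

```python
# Triples are modelled as plain (subject, predicate, object) tuples.
# The example.org URIs are hypothetical; the SKOS and DC Terms URIs are real.
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
DCT_SUBJECT = "http://purl.org/dc/terms/subject"
CONCEPT = "http://example.org/subjects/cats"
BOOK = "http://example.org/books/42"

# A library publishes labels. Each literal carries a language tag,
# so the label text is never used as the identifier.
library_graph = {
    (CONCEPT, SKOS_PREF_LABEL, ("Cats", "en")),
    (CONCEPT, SKOS_PREF_LABEL, ("Chats", "fr")),
}

# Another publisher adds statements about the same URI from its own space.
annotation_graph = {
    (CONCEPT, SKOS_PREF_LABEL, ("Katzen", "de")),
    (BOOK, DCT_SUBJECT, CONCEPT),
}


def merge(*graphs):
    """Union of triple sets: sharing a URI is enough to combine descriptions."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def label(graph, uri, lang):
    """Return the preferred label of `uri` in language `lang`, or None."""
    for subject, predicate, obj in graph:
        if (subject == uri and predicate == SKOS_PREF_LABEL
                and isinstance(obj, tuple) and obj[1] == lang):
            return obj[0]
    return None


merged = merge(library_graph, annotation_graph)
print(label(merged, CONCEPT, "de"))
```

Because both sources use the same URI for the concept, merging is a plain set union and neither description has to be complete on its own.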
Benefits to researchers, students and patrons
• Greater discovery and use capabilities, across library and non-library resources, across disciplines
• Information seekers can extract and re-mix the parts of the data they need, and add their own annotations to the library global graph
• Semantics in HTML allow resources to be better discovered from websites they use routinely
• Library items and data can be fully integrated into research documents and bibliographies
Benefits to cultural institutions
• Use of mainstream technologies rather than formats and integrated systems specific to libraries
• Sharing data, particularly for items/works and authority data, means less duplication of effort, lower infrastructure costs
• Clarification of metadata licensing
• Greater visibility on the web and reuse
Benefits to librarians, archivists and curators
• Use of web-based identifiers makes resources immediately available and up-to-date
• Pull together data from outside their domain environment
– across cultural heritage datasets
– from the web at large
• Concentrate on their domain of local expertise rather than re-creating existing descriptions
Benefits to developers
• Use of well known standard protocols and techniques instead of domain-specific software
– HTTP instead of Z39.50
– RDF instead of MARC or EAD
– REST
• Freely mix or mash-up data from libraries with other sources
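As a rough illustration of "HTTP instead of Z39.50", the sketch below prepares an ordinary HTTP GET that asks for an RDF serialisation through the Accept header. It only builds the request and does not send it. The resource URI and the helper name `build_rdf_request` are hypothetical, and whether a given server honours content negotiation is an assumption.

```python
import urllib.request


def build_rdf_request(uri, media_type="application/rdf+xml"):
    """Prepare a GET request asking the server for an RDF representation."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


# Hypothetical resource URI, used for illustration only.
request = build_rdf_request("http://example.org/id/book/42")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual retrieval; any generic HTTP client could do the same, which is the point of the slide.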
Challenges and barriers
Designed for stability, the library ecosystem resists (technological) change
• Tendency to engage only with well-established standards and practices
• Standardization processes are long-term, top-down
Bottom-up can be successful but garners little recognition
• Tech. expertise lies mostly with a small number of software vendors or in large academic libraries
Libraries are understaffed in the technology area
Challenges and barriers
…
• Sharing of data traditionally happens amongst libraries, not with the wider world
There is fear that data will need to be "dumbed down" in order to interact with other communities; few see the possibility of "smarting up" data
• Cooperative metadata creation is economical but centralised
Challenges and barriers
ROI is difficult to calculate
• Cost of current practice is not well known
• LD requires tech. staff with specific expertise in library data
• Library-specific data formats require niche systems solutions
Challenges and barriers
Data may have rights issues that prevent (open) publication
• Some data cannot be opened
• Rights have perceived value
• Ownership of rights can be unmanageably complex
Challenges and barriers
Data in library-specific formats is not easily shared outside the library community
• Data is expressed primarily as text strings, not "linkable" URIs
• Self-contained records differ from open-world graphs
• Best practices or standardisation for using RDF with library data are needed
• The library and LD communities lack shared terminology for metadata concepts
statement, heading, authority control
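The gap between text strings and linkable URIs can be sketched as follows. This is an illustration only: the flat record, the `AUTHORITY` lookup table and all example.org URIs are hypothetical, and a real conversion from library formats would be far more involved.

```python
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_CREATOR = "http://purl.org/dc/terms/creator"

# Hypothetical authority file: heading string -> URI of the entity.
AUTHORITY = {
    "Austen, Jane, 1775-1817": "http://example.org/authority/austen-jane",
}


def record_to_triples(record_uri, record):
    """Turn a self-contained record into statements about `record_uri`.

    A heading found in the authority table is replaced by its URI;
    an unknown heading is kept as a plain string.
    """
    heading = record["author"]
    return [
        (record_uri, DCT_TITLE, record["title"]),
        (record_uri, DCT_CREATOR, AUTHORITY.get(heading, heading)),
    ]


record = {"title": "Emma", "author": "Austen, Jane, 1775-1817"}
triples = record_to_triples("http://example.org/books/emma", record)
for triple in triples:
    print(triple)
```

Once the creator is a URI rather than a heading string, other datasets can point at the same entity without having to match on the exact text.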
Recommendations
Assess
• Identify candidate datasets for early exposure as linked data
• For each dataset, determine ROI of current practices, and costs and ROI of exposing as LD
• Evaluate migration strategies
• Foster a discussion about open data and rights
Recommendations
Facilitate
• Cultivate an ethos of innovation
Small scale R&D within individual library organisations
• Identify Linked Data literacy needed for different staff roles in the library
• Include metadata design in library and information science education
• Increase participation in linked-data standardisation efforts
Recommendations
Design and prepare
• Translate library data and standards into linked data
• Develop best practices and design patterns for data
• Directly use or map to commonly understood LD vocabularies
• Design user stories and exemplar UIs
• Identify tools supporting the creation and use of LLD
Recommendations
Curate, identify and link
• Apply library experience in curation and long-term preservation to linked data (and other) datasets
• Ensure preservation of relevant linked data vocabularies
• Assign unique identifiers (URIs) for all significant things in library data
• Create explicit links between library datasets and to other well-used datasets
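The last two recommendations can be sketched in a few lines. This is only one possible approach: the base namespace, the identifier scheme and the external URI are hypothetical, and `owl:sameAs` is used here as one common linking property, not as the choice the report prescribes.

```python
from urllib.parse import quote

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
BASE = "http://example.org/id/"  # hypothetical namespace owned by the library


def mint_uri(kind, local_id):
    """Build a URI from a local identifier, escaping unsafe characters."""
    return BASE + kind + "/" + quote(local_id, safe="")


def link(local_uri, external_uri):
    """State that a local resource and an external one are the same thing."""
    return (local_uri, OWL_SAME_AS, external_uri)


person = mint_uri("person", "austen jane")
statement = link(person, "http://other.example.net/people/jane-austen")
print(person)
print(statement)
```

Keeping the minting rule in one function makes it easier to guarantee that the same local identifier always yields the same URI.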