Being a Good Data Provider
Alastair Dunning, JISC Programme Manager - Digitisation
a.dunning AT jisc.ac.uk, 0203 006 6065
November 2011, Oxford
This presentation is intended to give some brief advice for those publishing digital content (digital images, cultural heritage, scholarly information etc.) on the Internet
Making sure your content is licensed and discoverable
A presentation from the JISC Programme Meeting for its Content Programme for 2011 http://www.jisc.ac.uk/whatwedo/programmes/digitisation/econtent11.aspx
Being a Good Data Provider: A simple thing gets complex
Cool URIs
Being Friends with Google, Is Google Enough?
International Portals
Geographies
Re-Use and APIs
Licensing
Cool URIs
http://www.ariadne.ac.uk/issue31/web-focus/
URI (Uniform Resource Identifier) refers to the "generic set of all names/addresses that are short strings that refer to resources" whereas URL (Uniform Resource Locator) is "an informal term (no longer used in technical specifications) associated with popular URI schemes: http, ftp, mailto, etc."
Keep them stable, memorable and consistent – develop a short URI policy
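A short URI policy can be expressed as a small rule that every published address follows. The sketch below is only an illustration: the base address, the function names and the particular rules (lowercase, hyphen-separated, no file extensions or query strings) are assumptions, not something prescribed in the talk.

```python
import re
import unicodedata

# Hypothetical base address; a real policy would fix this once and never change it.
BASE = "https://collections.example.org"


def slugify(text: str) -> str:
    """Return a lowercase ASCII slug: letters and digits joined by single hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def item_uri(collection: str, identifier: str) -> str:
    """Build a URI with no file extension, query string or software name in it."""
    return f"{BASE}/{slugify(collection)}/{slugify(identifier)}"


print(item_uri("Sound Archive", "Item 0042"))
```

Keeping implementation details (such as .php or a database key) out of the address is what lets it survive a change of software.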
No need to explain the importance of exposing content and metadata to search engines – many users have Google as their principal springboard for digital information
Even if using authentication, expose metadata
Make sure your database is easily queried by robots like Google
Optimisation is complex and depends on a good communications process
– Use established URIs
– Ensure your website is trusted
– Get incoming links from other trusted sources – this drives up traffic via Google and via the original sites themselves
Strategic Content Alliance / Netskills training and documentation
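One way to check that crawlers can reach metadata pages while protected content stays closed is to test the site's robots.txt rules. This is a minimal sketch using Python's standard urllib.robotparser; the robots.txt contents, the /records/ and /fulltext/ paths and the example.org address are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: metadata records are open, authenticated full text is not.
ROBOTS_TXT = """\
User-agent: *
Allow: /records/
Disallow: /fulltext/
"""


def crawler_can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for path in ("/records/123", "/fulltext/123"):
    url = "https://example.org" + path
    print(path, crawler_can_fetch(ROBOTS_TXT, "Googlebot", url))
```

A check like this only covers robots.txt. Pages that can be reached solely through a search form may still be invisible to crawlers.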
Being friends with Google
Give a distinctive <title> to each page – helps with clarity on Google
Use Google Sitemaps to upload details of your pages
Google Analytics can help with measuring web usage
Google Maps, Google Scholar?
http://www.google.com/publicsector
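The first two points can be partly automated. The sketch below builds a sitemap in the sitemaps.org XML format and flags page titles that are used more than once. The function names, page addresses and titles are made up for illustration, and how the resulting file is submitted to Google is left out.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per address."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def duplicate_titles(pages):
    """Given a mapping of URL to <title> text, return titles shared by several pages."""
    counts = Counter(pages.values())
    return sorted(title for title, n in counts.items() if n > 1)


# Hypothetical pages and titles.
pages = {
    "https://example.org/": "Home",
    "https://example.org/maps": "Home",
    "https://example.org/sounds": "Sound recordings",
}
print(build_sitemap(pages))
print(duplicate_titles(pages))
```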
Is Google everything?
Recommendation by peers and other respected persons gets resources used
Marketing a resource requires an integrated strategy that involves both technical and ‘academic’ integration
Workshop will be held in this area for all JISC projects in this programme
Source – Lesly Huxley et al (2007): Gathering evidence: Current ICT use and future needs for arts and humanities researchers
Is Google everything?
How is your collection integrated into the library catalogue?
How does your resource fit in with other resources?
Source – Mark Greengrass et al (2007): RePAH: A User Requirements Analysis for Portals in the Arts and Humanities
“Resource discovery and use would be increased by separate collections being aggregated logically based on their content”
Recommendation 3 – Daisy Abbott (2008): Digital Repositories and Archives Inventory
Working with Aggregators
CultureGrid - http://www.culturegrid.org.uk/
– UK aggregator of cultural heritage material
– Large-scale harvest of digital resources
– Works well for images and multimedia
– CultureGrid then exposes metadata to Europeana
WorldCat - http://www.worldcat.org/librarians/default.jsp
– Bibliographic data - both digital and non-digital
– Metadata exposed via Registry of Digital Masters
– Requires membership – so best done via institution
Aggregators
Other options
– Archives Hub, http://archiveshub.ac.uk/
– Connected Histories, British History 1500 – 1900
– JISC Historic Books, JISC MediaHub
Other options exist and will emerge, particularly within specific subject fields and areas of interest.
Key is to have easily exposable or transferable metadata
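As a rough illustration of "transferable" metadata, the sketch below serialises one catalogue record as XML using Dublin Core element names. Dublin Core is an assumption here (the slides do not name a schema), and the record, the wrapper element and the function name are invented for the example. A given aggregator will have its own required format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def record_to_dc(record: dict) -> str:
    """Serialise a flat record as XML, one dc:<field> element per key."""
    root = ET.Element("record")  # hypothetical wrapper element
    for field, value in record.items():
        ET.SubElement(root, f"{{{DC_NS}}}{field}").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; the keys are Dublin Core element names.
example = {"title": "Map of Oxford", "date": "1878", "rights": "CC BY"}
print(record_to_dc(example))
```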
Geographies
“80% of data has a geographical component” … possibly
Lists, text and words can be confusing to navigate
Maps have a simplicity which many, but not all, find engaging
Examples - BL Sound Archive, Population Reports online, Flickr
It’s about visualising your data in different ways … time is also a powerful metaphor
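If records already carry coordinates, putting them on a map is mostly a matter of reshaping the data. The sketch below converts records to GeoJSON, a format most web mapping tools can read. The record fields (title, year, lat, lon) and the sample values are assumptions for illustration.

```python
import json


def to_geojson(records):
    """Convert records with lat/lon fields into a GeoJSON FeatureCollection string.

    Records without coordinates are skipped rather than guessed at.
    """
    features = [
        {
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [r["lon"], r["lat"]]},
            # Keeping a year alongside the place allows a timeline view as well.
            "properties": {"title": r["title"], "year": r.get("year")},
        }
        for r in records
        if r.get("lat") is not None and r.get("lon") is not None
    ]
    return json.dumps({"type": "FeatureCollection", "features": features})


# Hypothetical records.
records = [
    {"title": "Dialect recording", "year": 1955, "lat": 51.752, "lon": -1.2577},
    {"title": "Undated pamphlet"},
]
print(to_geojson(records))
```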
Geographies
Application Programming Interfaces (API)
“The best use of your data will be thought of by someone else”
Separating data from its interface
Publishing each strand of metadata as a separate URI
Allows others to build interfaces over your data (and edit / annotate your data, if you want)
Requires a certain amount of technical knowledge to set up, and institutional belief
Good example – http://www.vam.ac.uk/api
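To make "separating data from its interface" and "each strand of metadata as a separate URI" concrete, here is a minimal sketch of the routing logic behind a read-only API. It is not how the V&A API works. The paths, the record and the function name are hypothetical, and a real service would sit behind a web server rather than a plain function.

```python
import json

# Hypothetical in-memory store standing in for the collection database.
RECORDS = {"O1234": {"title": "Tea caddy", "date": "1780", "place": "London"}}


def resolve(path: str):
    """Map a URI path to an (HTTP status, JSON body) pair.

    /object/<id>          returns the whole record
    /object/<id>/<field>  returns a single strand of metadata
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "object" and parts[1] in RECORDS:
        record = RECORDS[parts[1]]
        if len(parts) == 2:
            return 200, json.dumps(record)
        if len(parts) == 3 and parts[2] in record:
            return 200, json.dumps({parts[2]: record[parts[2]]})
    return 404, json.dumps({"error": "not found"})


print(resolve("/object/O1234"))
print(resolve("/object/O1234/title"))
```

Because the function returns plain JSON and knows nothing about page layout, any number of interfaces (a map, a timeline, a third-party mash-up) can be built over the same data.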
Licensing
A different challenge for re-use – making sure people know what they can do with your content
Licensing in – clearing third party rights
Licensing out – what can your users do
– Possibilities – re-use in educational context, remashing (including editing,