We Must All Be Curators Now: from Ingest to Service Delivery, in Data Library & National Data Centre. Peter Burnhill, Director, EDINA, JISC National Data Centre, University of Edinburgh, Scotland UK. 10 October 2006. Roles & Responsibilities

We Must All Be Curators Now

from Ingest to Service Delivery, in Data Library & National Data Centre

Peter Burnhill

Director, EDINA

JISC National Data Centre, University of Edinburgh, Scotland UK

10 October 2006

Roles & Responsibilities


Three different voices / roles

1. Director, EDINA National Data Centre
   – serving researchers, lecturers and students across the UK
     * so something about what EDINA is & what EDINA does
   – EDINA is funded by the JISC
     * so something about the JISC & the JISC IE
2. A time-served data person & fellow professional, from the University of Edinburgh
   – building on the past, planning for the future
3. A substitute for another guy …
   – trying to make sense of what is going on
   – working towards shared understanding
   – proposing a framework of verbs & nouns


Joint Information Systems Committee (JISC) …

… of all the UK funding councils for higher and further education

Mission:

“world-class leadership in the innovative use of ICT for support of education & research”

Information Communication Technology

Income mix of ‘top-slice’ recurrent funding + capital grants


Funding Councils, the JISC and EDINA

UK National Data Centres

Higher Ed funding councils

Further Ed funding bodies (Learning & Skills Council)

Research Councils as ‘Partners’

NDCs are now HEFCE-related bodies


Organisational infrastructure for JISC Services

• UKERNA – runs Joint Academic Network (JANET)

• EDINA & MIMAS – national data centres

+

• Arts & Humanities Data Service (AHDS)

• Economic and Social Data Service (ESDS)

+

• UKOLN; Centre for Educational Technology Interoperability Standards (CETIS); Digital Curation Centre (DCC); British Universities Film & Video Council (BUFVC); Technical Advisory Service on Images (TASI); Open Source Advisory Service; Nat. Centre for Text Mining; Plagiarism Advisory Service

• JISC Legal/Monitoring/…TechDis ; Regional Support Centres; UK Access Management / Athens

* most located in universities across UK *


What is EDINA?

• A National Data Centre, designated by the JISC in 1995/96
  – based on Edinburgh University Data Library, est. 1983/84

Mission to enhance productivity of research, learning & teaching in UK higher and further education

• part of JISC Information Environment
  – Keywords have been Accessibility/Outreach/Inter-working/Inter-operability …
• range of development projects and 24/7 services
  – Geo-spatial, about which more later ..
  – Scholarly communication & Multimedia
    * films & images; spoken word
  – Infrastructure for Digital Library
    * certificates; rights; middleware
    * SDSS -> UK Access Management Federation
• And the name, what’s that stand for?
  – Edinburgh Data Information Access
  – ‘Edina’ is the poetic name for Edinburgh …


Delivering online services, 24/7 …

http://edina.ac.uk



Biog: as data person these past 25+ years …

• Moved to the University of Edinburgh in 1979
  – formerly science staff at Social Science Research Council (ESRC), 1974/77
  – then medical statistician at Queen Charlotte’s Maternity Hospital, 1978/79
• first as statistician & researcher (& senior lecturer)
  – with Scottish Education Data Archive, from 1979
    * making survey data at Govt-funded research centre (CES)
  – from design, data creation and documentation, onto analysis
    * as survey methodologist in Edinburgh Survey Methodology Group
• then recruited to do R&D for service delivery
  – setting up & managing Edinburgh University Data Library, 1984 -
  – Co-director, ESRC Regional Research Laboratory, Scotland 1986/90
    * early days of Geographical Information Systems (GIS)
    * member of Data Task Force, Inter-Agency Global Env. Change
  – European Secretary (1993/95); President (1996/2001) of IASSIST
    * international assoc. for (social science) data librarians and archivists
• Now EDINA & IS Directorate at Univ. of Edinburgh
  – Was Set-up Director for Digital Curation Centre, 2003/4 to 2004/5


• Scottish Education Data Archive, late 1970s – mid ‘80s
  – Database of surveys of school leavers & cohorts of young people (16-19)
    * derived data, trend datasets over time, changing classifiers (eg Social Class)
  – integrating data from different sources, eg census ‘small area’ statistics
  – made available online but under ‘privileged’ not ‘open’ access
• Edinburgh University Data Library, mid-‘80s & on
  – Wider variety of datasets, obtained from others, often via others
    * A ‘local’ library of datasets
    * Easing access to data held elsewhere (eg UKDA)
  – made available online across ERCC wide area network and beyond
    * building databases, sometimes with special software
• ESRC Regional Research Laboratory, Scotland 1986/90
  – early days of Geographical Information Systems (GIS)
  – Integrating ‘large-scale’ data, much geographic or geo-spatial
• EDINA national data centre, mid-1990s & on
  – National online access to wider range of reference and source data
    * obtained under licence
  – required value-added ‘curation’
    * Digimap as but one example

… maybe I’ve been a ‘data curator’ all along


one example of ‘data curation’

OS digital data

Software + application of cartographic skill/rules

Value added component

11000152100913Playing Field 0901103 120001016400000%2100000010001004040097130 0%15000155 0321 0901103 0000000%2100000010001055810075820 0%15000156 0321 0901103 0000000%2100000010001057130076690 0%15000157 0321 0901103 0000000%2100000010001060110075460 0%15000158 0321 0901103 0000000%2100000010001063260074650 0%15000159 0321 8010619 0000000%2100000010001063370071760 0%15000160 0321 0901103 0000000%2100000010001066730076700 0%15000161 0321 0901103 0000000%2100000010001058910068550 0%15000162 0321 0901103 0000000%2100000010001064490069040 0%15000164 0321 0901103 0000000%2100000010001055710052730 0%15000173 0321 0901103 0000000%2100000010001058730050390 0%15000174 0321 0901103 0000000%2100000010001059520050430 0%15000175 0321 0901103 0000000%2100000010001056430049210 0%15000176 0321 0901103 0000000%

Software + default rules


• Scottish Education Data Archive, late 1970s – mid ‘80s
  – Database of surveys of school leavers & cohorts of young people (16-19)
    * derived data, trend datasets over time, changing classifiers (eg Social Class)
  – integrating data from different sources, eg census ‘small area’ statistics
  – made available online but under ‘privileged’ not ‘open’ access
• Edinburgh University Data Library, mid-‘80s & on
  – Wider variety of datasets, obtained from others, often via others
    * A ‘local’ library of datasets
    * Easing access to data held elsewhere (eg UKDA)
• ESRC Regional Research Laboratory, Scotland 1986/90
  – early days of Geographical Information Systems (GIS)
  – Integrating ‘large-scale’ data, much geographic or geo-spatial
• EDINA national data centre, mid-1990s & on
  – National online access to wider range of reference and source data
    * obtained under licence
  – required value-added ‘curation’
    * Digimap as but one example
  – national repositories of digital content: Jorum, GRADE, TheDepot
• Digital Curation Centre, 2004 & 2005
  – strategic role: ‘data curation’ & ‘digital preservation’
  – even wider range of databases (e-science), held by others

… maybe I’ve been a ‘data curator’ all along


[Diagram: licensing and funding model. Nodes: Data Provider (e.g. Ordnance Survey); Licensing Agent (JISC Collections); HE & FE funding councils; Institution (Licence); Value-added Service Provider; end user (staff/student). Link labels: £, ££, £, access.]

Authorising Institutions for free-at-point of use

Key role for Authentication (is-member of Institution) and Authorisation (is-licensed Institution)
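
The two checks named on this slide can be sketched in a few lines of Python. This is only an illustration of the distinction, not a description of how EDINA or any access management federation implements it; the tables `MEMBERS` and `LICENSED`, the function names, and the sample users and institutions are all hypothetical.

```python
# Hypothetical data: who belongs to which institution, and which
# institutions hold a licence for which service.
MEMBERS = {
    "University of Edinburgh": {"alice"},
    "Unlicensed University": {"bob"},
}
LICENSED = {
    "digimap": {"University of Edinburgh"},
}


def is_member(user: str, institution: str) -> bool:
    """Authentication: is the user a member of the institution?"""
    return user in MEMBERS.get(institution, set())


def is_licensed(institution: str, service: str) -> bool:
    """Authorisation: does the institution hold a licence for the service?"""
    return institution in LICENSED.get(service, set())


def may_access(user: str, institution: str, service: str) -> bool:
    """Free-at-point-of-use access needs both checks to pass."""
    return is_member(user, institution) and is_licensed(institution, service)
```

Keeping the two questions in separate functions mirrors the slide: the institution answers "is-member", while the licence register answers "is-licensed".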


EDINA as national data centre

• http://edina.ac.uk

• 50% direct funding from JISC for delivering services

– Good reputation for helpdesk, user interfaces, FAQs etc

– 24/7, 99% uptime

• 50% is extra awarded for Development activity

– Developing services; developing JISC IE; working with Researchers

– Acknowledged project competence for R&D

• Strategic role as Geographic Data Centre

– For JISC (Digimap etc), for ESRC (UKBORDERS)

– Building Spatial Data Infrastructure with NERC and internationally (OGC)


Existing Geo-data Services


Where are we with GIS?

• University of Edinburgh & its Data Library have long-run interest & experience
  – Geography Department (Coppock/Hotson; Waugh/GIMMS) & PLU
    * first MSc GIS course, and much else
  – ESRC Regional Research Laboratory for Scotland, 1987-
  – Launch of UKBORDERS in 1994
• EDINA has continued and extended that for geo-spatial data
  – JISC eLib project: access to Ordnance Survey mapping, 1996-
  – Launch of Digimap service, 2000-
  – Extension of UKBORDERS, 2001-
• ‘Shared Services’ provision
  – Go-Geo! (geo-data portal)
  – geoXwalk
  – GRADE – Geospatial Repository for Academic Deposit and Extraction
• Not all (only a fraction) of geo-referenced data at EDINA
• Strategic importance of interoperability
  – GI web services
• Interested in furthering the use of GI data across disciplines
  – Geo-parsing & mark-up; geo-finding; geoXwalk (vocabularies)


Disciplinary data-centres

* Something’s special about the spatial *

EDINA role as Geographic Data Centre?

Slide ‘borrowed’ from Liz Lyon, & curated ..


2. Getting back to Problem Statement

‘roles & responsibilities’

Some Thoughts, and Questions…

• What resources, and how should we share?
  – What are ‘scholarly resources’?
• What is special about scholarship?
• What is different about digital?
• Who should do what?
  – A division of labour that leverages
    * ‘responsibility’ and ‘expertise’ for curation
    * Means of service delivery

I. Find our place – in old and new geography

• ‘words, numbers, pictures, sounds all to be digital & accessed from afar’


Scholarship: Services and Stewardship

• Services, in support of scholarship
  – Libraries have traditionally focussed on the formal part of scholarly communication
  – Relevance: searching strategies
  – new challenges: how to cope with digital everything?
• Stewardship
  – Was ‘Special Collections’, now ‘Collections, inc. the digital’
  – Ensuring provenance & continuing access
    * Digital curation, preservation & archiving
    * Sharing with future scholarship
    * Sharing with wider world
• Research
  – What do researchers do, and what do they want/need?
  – eScience, Data, and ‘scholar workstation’ and the VRE
• Learning and Teaching
  – What do students need?
  – What do teachers/lecturers need?
  – e-learning and the VLE (virtual learning environment)


Infrastructure to support four ‘demand-side’ verbs

discover: information object of interest
  e.g. article referenced in database, A&I, eToC, etc

locate: organisation offering service
  e.g. library (union catalogue/OPAC) or document delivery service

request: use of service
  via payment of money or privilege of membership

access: object of interest
  via personal visit, document delivery, online access

based on MODELS workshops (UKOLN/JISC eLib)
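
The four demand-side verbs can be read as a chain of function calls. The Python sketch below is one hypothetical way to express that chain. It is not taken from the MODELS workshops or from any JISC service; `INDEX`, `SERVICES`, `Service`, `fetch` and the sample data are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    provider: str        # organisation offering the service
    holdings: frozenset  # identifiers of the objects it can supply
    members_only: bool   # True: needs privilege of membership


# Hypothetical discovery index and services, for illustration only.
INDEX = {"curation": ["article-1", "article-2"]}
SERVICES = [
    Service("Library A", frozenset({"article-1"}), members_only=True),
    Service("Document delivery B", frozenset({"article-1", "article-2"}),
            members_only=False),
]


def discover(keyword):
    """Discover: find information objects of interest."""
    return INDEX.get(keyword, [])


def locate(object_id):
    """Locate: find organisations offering a service on that object."""
    return [s for s in SERVICES if object_id in s.holdings]


def request(service, is_member):
    """Request: may this user use the service? (payment is not modelled)"""
    return is_member or not service.members_only


def access(service, object_id):
    """Access: obtain the object of interest from the chosen service."""
    return f"{object_id} from {service.provider}"


def fetch(keyword, is_member):
    """Chain the four verbs; return the first object a service will deliver."""
    for object_id in discover(keyword):
        for service in locate(object_id):
            if request(service, is_member):
                return access(service, object_id)
    return None
```

A member is served by the first service that holds the object, while a non-member falls through to the service that does not require membership.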


Simplified workflow

Discover

Locate

Access

Use

‘Publish*’

Fit for purpose?

Curate

*Issue


Dataset publishing

• Re-examine concept of Dataset Publishing (Callahan, Johnson, and Shelley 1996)
  – analogous to publishing papers
  – rewards for publishing datasets (e.g. promotion, RAE)
  – procedures (e.g. standards to use, peer review) & resources to manage procedures
    * Should minimise time and effort required
  – need tools to assist in creation, maintenance and dissemination of dataset descriptions
• Means of ‘putting’ into a public/community
  – Deposit and Share are too cosy
  – to ‘publicate’, to issue
• Terms of access and use
  – Open?
  – Privilege of membership
  – Payment of money
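
As a rough illustration of what a tool for dataset descriptions might record, the Python sketch below pairs a minimal description with one of the three terms of access listed above. The field names and the `to_json` helper are hypothetical and follow no particular metadata standard.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class AccessTerms(Enum):
    OPEN = "open"
    MEMBERSHIP = "privilege of membership"
    PAYMENT = "payment of money"


@dataclass
class DatasetDescription:
    title: str
    creator: str
    issued: str          # year or date of issue, kept as free text
    terms: AccessTerms


def to_json(desc: DatasetDescription) -> str:
    """Serialise a description so it can be disseminated with the data."""
    record = asdict(desc)
    record["terms"] = desc.terms.value  # store the readable label
    return json.dumps(record, sort_keys=True)
```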


Repositories of digital content

• So what is a digital repository?
  – I like (user) verbs, not (supply-side) nouns …
• A repository is a noun that meets a set of (user) verbs/tasks, by supporting delivery of [services] for a given/designated client community:
  – Put [ingest service]
  – Keep-safe [storage service]
  – Get [access service]

Motivation:

• for the record? preservation; prospect of access
• for re-use? curation; current access
• Can we say, “Behind every great service, there is a wonderful managed repository”?

No, not if access service does not have corresponding ingest service.
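
The verbs Put, Keep-safe and Get can be written down as an interface, which also makes the closing point concrete: a service that only offers Get does not satisfy it. The Python sketch below is a hypothetical illustration; the class and method names are assumptions, and `keep_safe` merely reports a count as a stand-in for real storage management.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Repository(Protocol):
    """The three user verbs a repository is expected to support."""

    def put(self, key: str, content: bytes) -> None: ...  # ingest service
    def keep_safe(self) -> int: ...                       # storage service
    def get(self, key: str) -> bytes: ...                 # access service


class InMemoryRepository:
    """Toy repository that supports all three verbs."""

    def __init__(self):
        self._store = {}

    def put(self, key, content):
        self._store[key] = bytes(content)

    def keep_safe(self):
        # Stand-in for storage management: report how many objects are held.
        return len(self._store)

    def get(self, key):
        return self._store[key]


class AccessOnlyService:
    """Delivers content but offers no way to ingest it."""

    def __init__(self, preloaded):
        self._store = dict(preloaded)

    def get(self, key):
        return self._store[key]
```

With `runtime_checkable`, `isinstance(obj, Repository)` checks only that the three methods exist, which is enough to separate the two classes here.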


Repositories & OAIS Reference Model

?? In a classic Repository, the DIP is the same as the SIP ??

In a data centre, and many data libraries, it rarely is.

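[Figure: OAIS functional entities. The PRODUCER submits a SIP to Ingest; Ingest passes an AIP to Archival Storage and Descriptive Info to Data Management; the CONSUMER sends queries and orders to Access and receives result sets and a DIP; Access draws on the AIP from Archival Storage and Descriptive Info from Data Management. Administration and Preservation Planning sit alongside, with MANAGEMENT above.]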
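
The question on this slide, whether the DIP equals the SIP, can be shown with a small Python sketch. The package names come from the OAIS Reference Model; the fields, the function names and the `transform` argument are assumptions made for illustration, not part of OAIS or of any EDINA system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SIP:  # Submission Information Package, from the producer
    content: bytes
    producer: str


@dataclass(frozen=True)
class AIP:  # Archival Information Package, held in archival storage
    content: bytes
    descriptive_info: dict


@dataclass(frozen=True)
class DIP:  # Dissemination Information Package, sent to the consumer
    content: bytes


def ingest(sip: SIP) -> AIP:
    """Ingest: wrap the submitted content with descriptive information."""
    return AIP(sip.content, {"producer": sip.producer, "size": len(sip.content)})


def classic_access(aip: AIP) -> DIP:
    """Classic repository: hand back what was submitted."""
    return DIP(aip.content)


def value_added_access(aip: AIP, transform) -> DIP:
    """Data centre: deliver a derived product. `transform` stands in for
    whatever processing the service applies before delivery."""
    return DIP(transform(aip.content))
```

In the classic case the consumer receives the same bytes the producer submitted; in the value-added case the delivered content differs, which is the situation the slide describes for data centres and many data libraries.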


Support for Research & research-led learning

• Data, software and facilities
  – Data as ‘evidence’
  – Data curation and digital preservation: continuing access
• Data Archives and Data Libraries
  – Social surveys, and much more
  – IASSIST
    * International Association for data professionals (1972 -)
    * Members in Philippines and Vietnam
• Census Programme
  – Small area statistics [MIMAS]
  – UKBORDERS (boundaries for thematic mapping) [EDINA]
• EDINA Digimap Collection
  – Topographic mapping data, from national mapping agency
  – Marine & Geological mapping data
• then there is the challenge of scientific visualisation, and observational images and documentary films!


Scholarly Communication

1. Access to commercial services & resources
   – Consortium licensing
   – ‘local’ hosting licensed data at National Data Centres (NDCs)
2. Focus on community-generated resources
   – Union catalogues (& links to ILL/docdel) - SUNCAT
   – digital library developments
   – Open Access repositories
     * “Put it in The Depot” (www.depot.ac.uk)
3. Need for Access Control as Middleware development
   – Shibboleth framework, developed as part of Internet2
     * UK Access Management Federation for Education & Research
     * Managed by UKERNA, based on work by EDINA SDSS
   – replacing vendor’s UserID & password with community scheme


Scholarly Communication

Author

Reader

writes to be recognised by peer community & for institutional Research Assessment Exercise (RAE) purposes

… perhaps to be read

Key User (Reader) Verbs:

• Discover article of interest
• Locate service on those articles
• Request permission to use service
• Access to service/article

(content of) article is the ‘information object of desire’


Scholarly Communication (simple model: focus on article-length work published in journals)

[Diagram nodes: Author (article); Publisher (article, serial issue); Library (serial); Reader (article). Link labels: Licence, £.]

Libraries and Publishers provide framework … the traditional ‘middleware’/‘infrastructure’

... with Licence(s) for electronic (online) and print (on-shelf)

P.Burnhill, EDINA/JISC, 2005


Scholarly Communication & Open Access (Access to article-length work)

[Diagram nodes: Author (article); Publisher (article, serial issue); learned society; Library (serial); Reader (article); repositories; E-prints. Link labels: peer review; peer exchange; Institutional arrangement; Licence; Licensed Online Access; ILL/docdel; free2web access; ‘Open Access’; ‘Digital Preservation’; ££.]

Informal: ‘invisible college’ and the ‘gift economy’

Formal (£) economy


Research Data

Creator

Researcher

Generates (curates) data for own purpose, or as part of team

… wants/has to ‘put’ it somewhere for use by others

(perhaps to be recognised by a peer community)

Key User (Researcher) Verbs:

• Discover data of interest
• Locate service on that data, with documentation on provenance etc
• Request permission to use service
• Access to service/data

Evidential value of data in analysis as ‘object of desire’


Data (simple model)

[Diagram nodes: Creator (dataset); Data Centre (database); (Data) Library; Researcher (data). Link labels: Licence, £ ??]

who provides framework? … the ‘middleware’/‘infrastructure’

... with what kind of Licence(s) for access?

P.Burnhill, EDINA/JISC, 2006


Doing Data

[Diagram nodes: Creator (dataset); Researcher; Institution; Data Centre; learned society; repositories; datasets. Link labels: peer review; peer exchange; Institutional arrangement; Licence; Authorised Online Access; free2web access; ‘Open Access’; ‘Digital Preservation’; ££.]

Informal: ‘invisible college’ and the ‘gift economy’

Formal (£) economy


All Curators Now …

Thank you

[email protected]

http://edina.ac.uk

http://jisc.ac.uk


JISC Information Environment Architecture

(Idealised) Technical Infrastructure for Services

Andy Powell, 2005


Disciplinary data-centres

* Something’s special about the spatial *

EDINA has role as Geographic Data Centre

Slide ‘borrowed’ from Liz Lyon, & curated ..


Support for Research & research-led learning

• Data, software and facilities
  – Data as ‘evidence’
  – Data curation and digital preservation: continuing access
    * Digital Curation Centre established (Edinburgh-led)
• Data Archives and Data Libraries
  – Social surveys, and much more
  – IASSIST
    * International Association for data professionals (1972 -)
    * Members in Philippines and Vietnam
• Census Programme
  – Small area statistics [MIMAS]
  – UKBORDERS (boundaries for thematic mapping) [EDINA]
• EDINA Digimap Collection
  – Topographic mapping data, from national mapping agency
  – Marine & Geological mapping data
  – I could say very much more about Digimap!!
• And then there are images and documentary films!


Focus on community-generated resources

1. ‘traditional ground for libraries’
   – Union catalogues (& links to ILL/docdel) – SUNCAT
   – [Ask me about SUNCAT]
2. ‘digital library developments’
   * Resource Discovery Network
   * Inter-operability – not just http, but m2m interfaces
   * Digitisation
     – Newspapers, NewsFilm, Manuscripts …
     – DIWAN: digitising Islamic Materials in UK university collections
3. New challenge: Open Access repositories
   * International development – UK active
   * Institutional Repositories
     – ‘put it in The Depot’ – www.depot.ac.uk [not yet launched]

need Access Management Federation for Education & Research
– Shibboleth framework, developed as part of Internet2