EGU, 23 April 2012, Najla Rettberg, OpenAIRE, University of Göttingen, Linking Data to Open Access Publications

Dec 25, 2015


Tobias Floyd
Transcript
Page 1: EGU, 23 April 2012, Najla Rettberg, OpenAIRE, University of Göttingen, Linking Data to Open Access Publications.

EGU, 23 April 2012, Najla Rettberg, OpenAIRE, University of Göttingen,

Linking Data to Open Access Publications


In 12 Minutes…

OpenAIRE – Publications and Data

Demonstrators for Enhanced Publications

Use Case Scenarios

Services for Users


OpenAIRE – Second Phase

Open Access, participatory infrastructure for scientific information linking publications, datasets, funding

Disseminates OA/RDM information in Europe

Opens up its content (search, browse, statistics), also to third-party service providers

Capitalizes on the OpenAIRE infrastructure, built for the Open Access pilot on FP7-funded articles (measuring the impact of EC SC39)


Portal: Search, Access, Deposit


Past, present and OpenAIREplus


Publication repositories network: Institutional & Thematic

FP7 publications

EC Project metadata

National Project metadata

National funding publications

DRIVER Guidelines

OpenAIRE Guidelines v1.0

OpenAIRE Guidelines v2.0

Dataset repositories

Metadata on data sets

OpenAIRE+ Guidelines for Data Providers

OpenAIREplus


5,600,000 OA publications

311 validated repositories


OA Publication Infrastructure

Open Data Infrastructures

[Diagram: ES, FR and EU-wide infrastructures]

Covering ‘European Knowledge’


A ‘Static’ publication

(Slide from Jens Klump)


Enhanced Publications (EPs)

Compound information objects: represent the aggregation of distinct information objects through meaningful relationships

Example of SURF-EPs: textual publications enhanced with links to datasets

OpenAIREplus provides EP services:

– Management: creation and curation

– Visualization, browsing, querying

– Import: OAI-PMH/ORE harvesting of EPs from external providers

– Export: OAI-PMH/ORE publishing of EPs, Linked Data representation
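
To make the compound-object idea on this slide more concrete, here is a minimal sketch that models an EP as an aggregation of parts joined by typed relationships and dumps it as a plain dictionary loosely shaped like an ORE aggregation. The class names, the `isSupplementedBy` label and all identifiers are illustrative assumptions, not the OpenAIREplus data model or a real ORE serialization.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Part:
    """One information object inside an Enhanced Publication."""
    identifier: str   # hypothetical identifier, e.g. a DOI or URL
    kind: str         # "publication" or "dataset"


@dataclass
class EnhancedPublication:
    """A compound object: parts plus typed relationships between them."""
    identifier: str
    parts: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (source, label, target)

    def add(self, part):
        self.parts.append(part)
        return part

    def relate(self, source, label, target):
        # Both ends must already be aggregated by this EP.
        if source not in self.parts or target not in self.parts:
            raise ValueError("both parts must belong to the aggregation")
        self.relations.append((source.identifier, label, target.identifier))

    def to_aggregation(self):
        """Return a plain dict loosely shaped like an ORE aggregation."""
        return {
            "aggregation": self.identifier,
            "aggregates": [p.identifier for p in self.parts],
            "relations": list(self.relations),
        }


# Made-up identifiers, for illustration only.
ep = EnhancedPublication("ep:example-1")
paper = ep.add(Part("doi:10.1234/paper", "publication"))
data = ep.add(Part("doi:10.1234/dataset", "dataset"))
ep.relate(paper, "isSupplementedBy", data)
print(ep.to_aggregation())
```

Keeping the relationships as explicit triples, rather than nesting datasets inside the publication, is what would let the same dataset take part in several EPs.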


‘Information in Context’


Attempt at a generic workflow

No one-size-fits-all for data

– Use different data types, PIs, policies, access levels, standards

Look at research-driven disciplines, different communities

Incremental, based on prototypes

“… any roadmap for OA infrastructure must address this natural tension between diversity and infrastructure”

C. Meier zu Verl & W. Horstmann (Eds.), 2011. Studies on Subject-Specific Requirements for Open Access Infrastructure.

Cross-discipline approach


Subject-specific pilots

Learning lessons from interoperation of data infrastructures

– Interoperability pilots between OpenAIREplus and subject-specific infrastructures: in the Life Sciences, in the Social Sciences

– Exploitation in modelling and implementation for the OpenAIRE data model

– Relationship entities: projects, publications, datasets
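
As a rough sketch of how the three relationship entities named above could be tied together, the example below stores links as (source, relation, target) triples and walks from a project through its publications to the datasets they reference. The relation names `produces` and `references` and every identifier are hypothetical; the slides do not specify the actual OpenAIRE data model.

```python
from collections import defaultdict

# Each link is (source_id, relation, target_id); all ids are made up.
links = [
    ("project:P1", "produces", "publication:A"),
    ("project:P1", "produces", "publication:B"),
    ("publication:A", "references", "dataset:X"),
    ("publication:B", "references", "dataset:Y"),
    ("publication:B", "references", "dataset:X"),
]


def build_index(triples):
    """Index targets by (source, relation) for quick lookups."""
    index = defaultdict(list)
    for source, relation, target in triples:
        index[(source, relation)].append(target)
    return index


def datasets_for_project(index, project_id):
    """Follow project -> publication -> dataset, without duplicates."""
    found = []
    for publication in index[(project_id, "produces")]:
        for dataset in index[(publication, "references")]:
            if dataset not in found:
                found.append(dataset)
    return found


index = build_index(links)
print(datasets_for_project(index, "project:P1"))
```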


The Challenges

Aggregation and Discovery of resources

Representation of diverse disciplines in a ‘generic’ infrastructure

Access restrictions/reuse policies

User-friendly way for researchers to link research results with project information

Machine-readable (Linked Open Data)


Two disciplines…

SSH – DANS/EASY

– Produce handmade EPs at file level

– Experienced data modelling and research work (Veteran tapes)

Life Sciences – EMBL-EBI

– Text mine abstracts/full texts

– Link bio-entities to database

– Enriched information could be transferred to generic infrastructure
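
The Life Sciences steps above (text mining, then linking bio-entities to a database) can be sketched as a pattern match followed by a link lookup. This is only an illustration under stated assumptions: it looks for UniProt-style accession numbers using the accession pattern given in the UniProt help pages, `RESOLVER_BASE` is a placeholder rather than a real service, and a bare pattern match can produce false positives that a real text-mining pipeline would need to filter.

```python
import re

# Accession pattern as published in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"\b(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})\b"
)

# Hypothetical resolver base; swap in the real database URL scheme.
RESOLVER_BASE = "https://example.org/uniprot/"


def link_bio_entities(text):
    """Return {accession: url} for each distinct accession found in text."""
    found = {}
    for accession in UNIPROT_ACCESSION.findall(text):
        found.setdefault(accession, RESOLVER_BASE + accession)
    return found


abstract = "Binding of P04637 to Q9Y6K9 was reduced; P04637 levels fell."
print(link_bio_entities(abstract))
```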


Demonstrator

Data model – Generalised

Extract citation info for datasets

– from e.g. UniProt and full text

Derive Persistent Identifiers

– from URLs (URNs and PMC-Ids)

Transfer of linked entities

– community services and OpenAIRE infrastructure
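
The step "derive persistent identifiers from URLs" might look roughly like the sketch below. The two URL shapes it recognises (a PMC article path and a URN embedded in a resolver URL) and the example URLs in the usage lines are assumptions chosen for illustration; the demonstrator's actual rules are not described in the slides.

```python
import re

# A PMC identifier embedded in an article path, e.g. .../pmc/articles/PMC123/
PMC_ID = re.compile(r"/pmc/articles/(PMC\d+)", re.IGNORECASE)
# A URN embedded anywhere in the URL, e.g. .../urn:nbn:de:...
URN = re.compile(r"(urn:[a-z0-9][a-z0-9-]{0,31}:[^\s?#]+)", re.IGNORECASE)


def derive_pid(url):
    """Return a (scheme, identifier) pair, or None if nothing is recognised."""
    match = PMC_ID.search(url)
    if match:
        return ("pmcid", match.group(1).upper())
    match = URN.search(url)
    if match:
        return ("urn", match.group(1))
    return None


# Made-up example URLs.
print(derive_pid("http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1234567/"))
print(derive_pid("http://nbn-resolving.de/urn:nbn:de:gbv:7-webdoc-1234-5"))
print(derive_pid("http://example.org/some/page"))
```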


Use Cases

1. Import EP created in DANS or SURF

– Proof of Services Interoperability


2. Manual composition of EP in OpenAIRE

– Proof of Tools: Editor, Discovery of Research data in OpenAIRE


3. Automatic generation of EP by extracting citation information (or mining), auto-linking

– Proof that rich metadata can be represented in a user-friendly way

– Possible Linked Open Data compliance
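
For the Linked Open Data point, one possible way to expose an automatically generated publication-to-dataset link is as an RDF statement. The sketch below writes a single N-Triples line using the Dublin Core term `dcterms:references`; the choice of that predicate and the two DOIs are assumptions for illustration, since the slides do not say which vocabulary OpenAIREplus would use.

```python
DCTERMS_REFERENCES = "http://purl.org/dc/terms/references"


def ntriple(subject, predicate, obj):
    """Format one RDF statement whose three terms are all IRIs."""
    for term in (subject, predicate, obj):
        # Reject characters that would break the <...> IRI syntax.
        if any(ch in term for ch in '<>" \n'):
            raise ValueError(f"not a plain IRI: {term!r}")
    return f"<{subject}> <{predicate}> <{obj}> ."


# Made-up DOIs standing in for an extracted citation pair.
line = ntriple(
    "https://doi.org/10.1234/paper",
    DCTERMS_REFERENCES,
    "https://doi.org/10.1234/dataset",
)
print(line)
```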


Use Cases

4. Reuse and enrichment: annotations added by users to datasets or publications

– An EP is used by researcher in publication

– Adequate documentation

– Test legal framework

– Study into licensing of publications and data: analyse requirements of legal protection of research data; legal prototype of restraints


Research Scenario 1

1. You are an EC-project researcher

– OA publication

– Dataset with a DOI

– Generate the link in OpenAIRE

2. Researcher completes data output with paper

– No data repository

– Submit dataset to OpenAIRE ‘orphan’ repository
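
The first case in this scenario (an OA publication, a dataset with a DOI, and an EC project) could be captured in a small link record like the one sketched below. The function name, the record layout and the example values are hypothetical; the `info:eu-repo/grantAgreement/EC/FP7/...` form for the project follows the project identifier convention of the OpenAIRE Guidelines as far as I am aware, and the DOI check is only a loose shape test.

```python
import re

# Loose shape check only; it does not prove that a DOI resolves.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def make_claim(publication_doi, dataset_doi, grant_number):
    """Bundle a publication, a dataset and an FP7 grant into one link record."""
    for doi in (publication_doi, dataset_doi):
        if not DOI_PATTERN.match(doi):
            raise ValueError(f"does not look like a DOI: {doi!r}")
    if not grant_number.isdigit():
        raise ValueError("grant number is expected to be numeric")
    return {
        "publication": f"doi:{publication_doi}",
        "dataset": f"doi:{dataset_doi}",
        "project": f"info:eu-repo/grantAgreement/EC/FP7/{grant_number}",
    }


# All three values are made up.
claim = make_claim("10.1234/paper.1", "10.1234/data.1", "123456")
print(claim)
```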


Research Scenario 2

You search for ‘mouse genome literature’ in OpenAIRE

– Find a citation for publication

– Funding details of project

– Related data, say a protein link to GenBank

– Create your own links to this


Service activities

For publication providers - OpenAIRE’s Guidelines for repository managers

– Metadata: (DC) and Protocols: (OAI etc.)

For data providers: accessing (metadata of) datasets from providers while minimizing effort to comply

– Metadata: indications on minimal metadata about datasets (e.g., identifiers, date of creation, title, URLs) and best practices for interlinking datasets and publications

– Access protocols: no requirements for adopting precise protocols (e.g., OAI, FTP) or ID/URL frameworks (e.g., OpenURL, DOI) to comply
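
A data provider could check its records against the minimal metadata listed above with something as simple as the following sketch. The field names are assumptions derived from the slide's examples (identifier, date of creation, title, URL); they are not the field names of the OpenAIRE+ Guidelines for Data Providers.

```python
# Hypothetical field names mirroring the slide's examples.
REQUIRED_FIELDS = ("identifier", "title", "date_created", "url")


def missing_fields(record):
    """List the required fields that are absent or blank in a record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Made-up dataset record with two fields left out.
record = {"identifier": "doi:10.1234/data.1", "title": "Survey tapes"}
print(missing_fields(record))
```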


Service activities: Users

Registered end-users (e.g., EC personnel, project coordinators, researchers, authors)

– Search, browse and access statistics

– Deposit files and metadata of publications and datasets into the Orphan Repository

– Ingest (claim) metadata into the information space

– Create EP by combining datasets from different communities

– Reuse of datasets as secondary data (with respect to IPR)


Service activities: Users

Content provider managers (e.g. datasets and publications repository managers)

– Registration and validation (OpenAIREplus guidelines) of publication and dataset repositories

Data curators (administrative tasks)

– Collect and aggregate publications, project data and dataset metadata

Third-party application developers

– Bulk-fetch content from the (curated) information space


The Future…

“Forget PDFs, imagine an ideal publication where you click on tables to get through to raw data, where you can contribute and discuss some aspects and later update or correct parts of a paper in subsequent versions. The latter is similar to Wikipedia, actually.”

– PhD Student, UGOE


Thank you…

– [email protected]

– @openaire_eu


Linking: Publication to Database


Author-supplied supplementary info: TIFF, MOV


PLoS: O’Toole, Greenan, Lange, Srayko, Müller-Reichert


Research Impact

OpenAIRE lays the foundations for measuring research impact per publication, researcher, project, institution, country, …


Data Management Issues

Good data practices

Data policies, standards

Drivers for deposit? What’s in it for researchers?

Work with publishers, DOIs

Where do researchers deposit data? Figshare?


• Potential issues: unstructured data with different kinds of media files

• Persistent IDs: resolvable and managed by the originator of resource

• Preservation: responsibility lies in the trusted repositories


Demonstrators

Demonstrators for Enhanced Publications

– Explore how links are managed between publications and research data in Life Sciences and SSH

– How data can be mutually complemented and exchanged in generic infrastructures

– Example: how a publication ‘reported’ in OpenAIRE is enriched via UKPMC with links to databases

Report: “Connection Data and Publications through e-Infrastructure”
