Opportunities for Data Exchange: optimising the conditions for data sharing Susan Reilly LERU Doctoral Summer School, 9th Jul, 2012
Research Data Sharing LERU

May 10, 2015

LIBER Europe

Presentation from LERU Doctoral Summer School 2012, Barcelona
Transcript
Page 1: Research Data Sharing LERU

Opportunities for Data Exchange: optimising the conditions for data sharing

Susan Reilly

LERU Doctoral Summer School, 9th Jul, 2012

Page 2: Research Data Sharing LERU

Thank you!

Page 3: Research Data Sharing LERU

LIBER (Association of European Research Libraries) - Projects:

Content
• Europeana Libraries
• Europeana Newspapers

Policy
• MEDOANET

Infrastructure
• APARSEN
• AAA Study
• ODE

LIBER & the European Research Infrastructure

Page 4: Research Data Sharing LERU

Ready to ride the wave… ?

Page 5: Research Data Sharing LERU
Page 6: Research Data Sharing LERU

Rule #11: Don’t Publicize!

Unless the break is a well known spot, like for e.g. Lahinch, Bundoran, or Strandhill, taking photo’s and posting them on the Internet is regarded as unacceptable in the surfing community. If you publicize a break in this manner you draw attention to it, which in turns draws more people to it, which means a place gets more crowded and there is more aggro in the water. The more you talk about a break to those who haven’t surfed it the more damage you do to it, and yourself in the long run because the more people there are in the water the less waves there are for you. Think about it.

http://www.boards.ie/vbulletin/showthread.php?s=fc082712ef1354ecf7cb0e53dc71d519&t=2055828999

Page 7: Research Data Sharing LERU

Reasons not to share surf info

• Other people will steal my wave

• Unethical to share, e.g. inexperienced surfers on dangerous breaks get hurt

• We won’t get recognition, e.g. local surfers lose out to visiting pros

• .............

Page 8: Research Data Sharing LERU
Page 9: Research Data Sharing LERU

15 petabytes (15 million gigabytes) of data annually – enough to fill more than 1.7 million dual-layer DVDs a year!
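
The DVD comparison can be checked with a quick calculation. The sketch below is only a rough check: it uses decimal units (1 PB = 1,000,000 GB, as the slide states) and assumes a nominal dual-layer DVD capacity of 8.5 GB, a figure that is not given on the slide. The function name is illustrative.

```python
# Rough check of the slide's comparison between annual data volume and DVDs.
PETABYTES_PER_YEAR = 15
GB_PER_PB = 1_000_000       # decimal units, matching "15 million gigabytes"
DVD_DL_CAPACITY_GB = 8.5    # assumption: nominal dual-layer DVD capacity


def dvds_needed(petabytes: float, dvd_gb: float = DVD_DL_CAPACITY_GB) -> float:
    """Return how many discs of capacity dvd_gb are needed to hold the volume."""
    return petabytes * GB_PER_PB / dvd_gb


if __name__ == "__main__":
    print(f"{dvds_needed(PETABYTES_PER_YEAR):,.0f} DVDs per year")
```

Under these assumptions the result is roughly 1.76 million discs, which is consistent with "more than 1.7 million".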

Page 10: Research Data Sharing LERU

The Vision

“With a proper scientific e-infrastructure, researchers in different domains can collaborate on the same data set, finding new insights. They can share a data set easily across the globe, but also protect its integrity and ownership. They can use, re-use and combine data, increasing productivity. They can more easily solve today’s Grand Challenges, such as climate change and energy supply. Indeed, they can engage in whole new forms of scientific inquiry, made possible by the unimaginable power of the e-infrastructure to find correlations, draw inferences and trade ideas and information at a scale we are only beginning to see.”

Page 11: Research Data Sharing LERU

Now and Next

• Authentication & authorisation

• New skills

Page 12: Research Data Sharing LERU

The Opportunities for Data Exchange Project

• Identify, collate, interpret and deliver evidence of emerging best practices in sharing, re-using, preserving and citing data, the drivers for these changes and barriers impeding progress, in forms suited to each audience

• Audiences: policy makers, funders, infrastructure operators, data centres, data providers and users, libraries and publishers

Page 13: Research Data Sharing LERU

Steps to creating the conditions for data sharing

• Understand data sharing today
  • Collection of “success stories”, “near misses” and “honourable failures” in data sharing, re-use and preservation

• Data & scholarly communications
  • Integrating data and publications
  • Best practice in data citation
  • New roles

• Identify drivers and barriers
  • Interviews with stakeholders to seek consensus

Foto "Bell", Noordewierweg 116, Amersfoort.

Page 14: Research Data Sharing LERU

Tales of Data sharing

• 21 stories
  • scientific communities
  • infrastructure initiatives
  • management
  • other relevant stakeholders

Page 15: Research Data Sharing LERU
Page 16: Research Data Sharing LERU

The Astronomical Importance of Discoverability

• Galaxy Zoo (Carolin Liefke)

• Pre-processed data shared with the public to carry out specific tasks (e.g. classifying galaxies)

• Discoverability a major challenge in data sharing - easier, more sophisticated data mining, more complex automated processing

Page 17: Research Data Sharing LERU

Hypotheses

“Without the infrastructure that helps scientists manage their data in a convenient and efficient way, no culture of data sharing will evolve.”

Stefan Winkler-Nees (German Research Foundation, DFG)

Page 18: Research Data Sharing LERU

Hypotheses Expected

Category: Infrastructure

“An international research community needs an international data infrastructure and international support.”

“After decades of reports with data in their titles the community found inadequate services, almost no international support and few solutions.”

Page 19: Research Data Sharing LERU

Tension between hypotheses

Category: Legislation, Education, Behaviour

“Premature data releases should not be enforced, but the mere possibility of data misinterpretation is no reason for not sharing data.”

“To avoid misuse and lack of acknowledgement of very special data, access should be restricted to skilled persons trained by the data creator.”

Page 20: Research Data Sharing LERU

Hypotheses by Category

4.Attitudes

6.Policies

8.Infrastructure

10.DMPs, Citability

11.Dependency on discipline

Page 21: Research Data Sharing LERU

Barriers & Drivers

[Diagram labels: data sharing; education; legislation; funding; culture & attitude; quality; policies; cooperation; infrastructure; publishing & visibility; data flow improvements; disciplines; accreditation & certification; career; efficiency]

Page 22: Research Data Sharing LERU

Integrating Data & Publications

• 3 stakeholder groups
  • Publishers
  • Researchers
  • Libraries & data centres

Page 23: Research Data Sharing LERU

How stakeholders interact

Page 24: Research Data Sharing LERU

The Data Publication Pyramid

(1) Data contained and explained within the article

(2) Further data explanations in any kind of supplementary files to articles

(3) Data referenced from the article and held in data centers and repositories

(4) Data publications, describing available datasets

(5) Data in drawers and on disks at the institute

Page 25: Research Data Sharing LERU

Where do you currently store your research data? (multiple answers possible)

Source: PARSE.Insight survey 2009, N = 1202

Page 26: Research Data Sharing LERU

The Pyramid’s likely short term reality:

(1) Top of the pyramid is stable but small

(2) Risk that supplements to articles turn into Data Dumping places

(3) Too many disciplines lack a community endorsed data archive

(4) Estimates are that at least 75 % of research data is never made openly available

Page 27: Research Data Sharing LERU

The Ideal Pyramid

(1) More integration of text and data, viewers and seamless links to interactive datasets

(2) Only if data cannot be integrated in article, and only relevant extra explanations

(3) Seamless links (bi-directional) between publications and data, interactive viewers within the articles

(4) More Data Journals that describe datasets, data mgt plans and data methods

Page 28: Research Data Sharing LERU

A famous paper in Nature: DNA structure - 1953

• 1 page
• 2 authors
• 1 figure
• no data

Source: V. Kiermer, Nature Publishing Group, 2011

Page 29: Research Data Sharing LERU

Nature in 2001: The human genome issue

• 62 pages, 49 figures, 27 tables

Source: V. Kiermer, Nature Publishing Group, 2011

Page 30: Research Data Sharing LERU

A thousand genomes – 2010

http://www.nature.com/nature/journal/v467/n7319/full/nature09534.html

Raw data: 12,145 SRA run ids submitted to Short Read Archive

Source: V. Kiermer, Nature Publishing Group, 2011

Page 31: Research Data Sharing LERU

Elsevier offers gene and protein viewers from within the article, to data stored elsewhere:

Page 32: Research Data Sharing LERU

Articles: the currency of Science

Page 33: Research Data Sharing LERU

Issues for researchers

• Researchers need somewhere to put data and make it safe for reuse

• Researchers need to control its sharing and access

• Researchers need the ability to integrate data and publication

• Researchers need to get credit for data as a first class research object

• Researchers need someone to pay for the costs of data availability and re-use

Page 34: Research Data Sharing LERU

Library support for the researcher

Libraries and data centres must support…

• data as first class research object: publishing, persistent identification/citation of datasets

• data description, metadata, standards documentation and retrieval

• proper documentation of data

• long-term data archiving including data curation and preservation

Availability

Findability

Interpretability

Re-usability

Page 35: Research Data Sharing LERU

7 Areas of Opportunity

• Availability

• Findability

• Interpretability

• Reusability

• Citability

• Curation

• Preservation

Page 36: Research Data Sharing LERU

Researcher Opportunities

Data issue: Researchers’ opportunities

Availability
• Researchers demand their data be treated as first class research objects
• Researchers loosen control over data
• Define roles of responsibility and control

Findability
• Agree convention to propose to publishers regarding data citation
• Use of persistent identifiers such as DOIs
• Ensure common citation practices

Interpretability
• Recognize that data require metadata and work towards community best practice in metadata development

Re-usability
• Be concerned about the long term ability for secondary use and consider or seek out responsible preservation actions

Citability
• Agree a convention for data citation
• Follow metadata standards for datasets
• Use of persistent identifiers such as DOIs

Curation
• Develop sustainable and realistic data management plans
• Collaboration with public data archives

Preservation
• Develop sustainable realistic preservation plans
• Active engagement with public data archives

Page 37: Research Data Sharing LERU

Publishers’ Opportunities

Data issue: Publishers’ opportunities (Chapter 3)

Availability
• Articles with data provide richer content and higher usage
• Impose stricter editorial policies about availability of underlying data, in line with general funders’ trends
• Ensure data is stored in a safe place, preferably a public repository
• Be transparent about curation and preservation of submitted data

Findability
• Ensure bi-directional links between data and publications
• Ensure common citation practices

Interpretability
• Provide services around data such as viewer apps for underlying data from within the article or interactive graphs, tables and images
• Data Publications

Re-usability
• Interactive data from within articles
• Links to the relevant datasets, not just to the database
• Data Publications

Citability
• Establish uniform data citation standards
• Follow metadata standards for datasets
• Use of persistent identifiers such as DOIs
• Data Publications

Curation
• Transparency about curation of submitted data
• Collaboration with public data archives

Preservation
• Transparency about preservation of submitted data
• Collaboration with public data archives

Page 38: Research Data Sharing LERU

Libraries’ Opportunities

Data issue: Libraries’ and data centres’ opportunities (Chapter 4)

Availability
• Lower barriers to researchers to make their data available.
• Integrate data sets into retrieval services.

Findability
• Support of persistent identifiers.
• Engage in developing common metadescription schemas and common citation practices.
• Promote use of common standards and tools among researchers.

Interpretability
• Support crosslinks between publications and datasets.
• Provide and help researchers understand metadescriptions of datasets.
• Establish and maintain knowledge base about data and their context.

Re-usability
• Curate and preserve datasets.
• Archive software needed for re-analysis of data.
• Be transparent about conditions under which data sets can be re-used (expert knowledge needed, software needed).

Citability
• Engage in establishing uniform data citation standards.
• Support and promote persistent identifiers.

Curation/Preservation
• Transparency about curation of submitted data.
• Promote good data management practice.
• Collaborate with data creators.
• Instruct researchers on discipline specific best practices in data creation (preservation formats, documentation of experiment,…)

Page 39: Research Data Sharing LERU

Q. What exactly should the role of the library be and what are the skills we need?

Page 40: Research Data Sharing LERU

Data Citation: Getting Credit!

• Challenges:
  • granularity: which bits inside the dataset are being referred to
  • versioning: in case of dynamic or regularly updated data, which version is cited
  • retrievability: indicate via DOIs or accession numbers where the data are retrievable

Overview of best practices reported in literature and through interviews with experts
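
The three challenges above can be illustrated with a small record that carries one field per challenge. This is a minimal sketch, not a citation standard: the class name, field names, output layout and the example DOI are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical data citation record; layout is illustrative only."""
    creators: str
    year: int
    title: str
    repository: str
    identifier: str                 # retrievability: DOI or accession number
    version: Optional[str] = None   # versioning: which release is cited
    subset: Optional[str] = None    # granularity: which part of the dataset

    def render(self) -> str:
        """Build a single-line citation, skipping fields that are not set."""
        parts = [f"{self.creators} ({self.year}). {self.title}."]
        if self.version:
            parts.append(f"Version {self.version}.")
        parts.append(f"{self.repository}.")
        if self.subset:
            parts.append(f"Subset: {self.subset}.")
        parts.append(self.identifier)
        return " ".join(parts)


if __name__ == "__main__":
    citation = DataCitation(
        creators="Example, A.; Sample, B.",
        year=2012,
        title="Hypothetical survey dataset",
        repository="Example Data Archive",
        identifier="https://doi.org/10.5555/example",  # placeholder DOI
        version="2.1",
        subset="table 3, rows 100-250",
    )
    print(citation.render())
```

Keeping the version and subset as separate optional fields means a citation to the whole dataset and a citation to a specific slice of a specific release can share the same identifier while remaining distinguishable.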

Page 41: Research Data Sharing LERU

Some Findings

• Citations with persistent identifiers should be listed in the references/bibliography to enable tracking of citation metrics.

• Publishers need to provide guidance for authors and referees on citation of data.

• Researchers need to nurture awareness in their community of the benefits of data citation, and follow citation guidelines given by publishers and data centres.

• Many researchers do not appear to see the value and benefits of data citation. How different communities can work together to promote this activity and the status of datasets as primary research outputs and publishable works in their own right, is an issue that still needs to be addressed.

Page 42: Research Data Sharing LERU

Our Relationship

Many researchers do not appear to see the value and benefits of data citation. There is a gap, which could be filled by libraries, in advocacy for data sharing, the use of subject specific repositories, and best practice in data citation. Filling this gap would increase the number of researchers sharing and reusing data.

The issue still to be addressed is how different communities can work together to promote this activity and the status of datasets as primary research outputs and publishable works in their own right.

Page 43: Research Data Sharing LERU

Now & Next

• For ODE:
  • Verify hypotheses as drivers and barriers
  • Translate findings for various target groups

• For LIBER:
  • Continue to find ways of supporting data sharing
  • Return to the framework for the collaborative data infrastructure

Page 44: Research Data Sharing LERU

Now and Next

• Authentication & authorisation

• New skills

Page 45: Research Data Sharing LERU

Addressing Trust and Data Curation

• AAA Study
  • Authentication and authorisation infrastructure for European researchers
  • On the Riding the Wave wish list: “Distributed and collaborative authentication, authorisation and accounting”
  • Safe depositing of data
  • Authenticity and provenance
  • Ensure recognition
  • Safe environments for collaboration

Page 46: Research Data Sharing LERU

Addressing Trust and Data Curation

• Alliance for Permanent Access to the Record of Science in Europe Network (APARSEN)

• Aims to look across the excellent work in digital preservation which is carried out in Europe and to bring it together under a common vision

• Trust, Sustainability, Usability, Access

Page 47: Research Data Sharing LERU

Back to surfing…

What was the result of all this sharing?

Page 48: Research Data Sharing LERU
Page 49: Research Data Sharing LERU
Page 50: Research Data Sharing LERU

http://www.brain-cloud.net/wp-content/uploads/2011/05/fergal-smith.jpg

Page 51: Research Data Sharing LERU

Has enabled surfers to do things they only dreamed about

• Big wave hunters….

http://theweek.com/article/index/227955/the-biggest-wave-ever-surfed-the-mind-blowing-video

Page 52: Research Data Sharing LERU

Further Reading

Riding the Wave (2011)

http://www.cordis.europa.eu/fp7/ict/e.../hlg-sdi-report.pdf  

ODE/APARSEN Publications

http://www.alliancepermanentaccess.org/index.php/community/current-projects/ode/

AAA Study

https://confluence.terena.org/display/aaastudy/AAA+Study+Home+Page

Page 53: Research Data Sharing LERU

Credits

Slides reused from presentations by:

Salvatore Mele (CERN)

Eefke Smit (STM)

Hans Pfeiffenberger (Helmholtz)

Most images sourced through The European Library

Page 54: Research Data Sharing LERU

Thank you again!