ARIADNE is funded by the European Commission's Seventh Framework Programme Requirements for Open Sharing of Archaeological Research Data EAA 2016 Vilnius Session: Open Access and Open Data in Archaeology - Following the ARIADNE Thread 1 September 2016 Guntram Geser Salzburg Research
ARIADNE is funded by the European Commission's Seventh Framework Programme
Requirements for Open Sharing of Archaeological Research Data
EAA 2016 Vilnius
Session: Open Access and Open Data in Archaeology - Following the ARIADNE Thread
1 September 2016
Guntram Geser
Salzburg Research
ARIADNE
• Advanced Research Infrastructure for Archaeological Dataset Networking in Europe
  – EU FP7-Infrastructures project
  – Type "Integrating Activity", 2/2013–1/2017
  – Focus on archaeological datasets: help overcome data fragmentation and foster a culture of sharing and re-use
  – 23 partners from 18 European countries
  – Website: www.ariadne-infrastructure.eu
  – Data portal: http://portal.ariadne-infrastructure.eu
• Open Data – criteria, expectations and drivers
• Current barriers to open data sharing
• How to benefit from open data publication
• Issue of actual data re-use
• Takeaway points
Open Data – criteria
• Accessible online
  – not necessarily without registration
• Reusable
  – open format (e.g. not PDF documents)
  – not summarized data (i.e. figures, charts, etc.) canned in publications
  – state: raw, cleaned, normalized, … (according to practice)
• Machine-readable
• Openly licensed
  – e.g. CC-BY or ODC-BY, importance of attribution!
• For free
  – yes, but somebody must pay to ensure sustainability
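The criteria above can be read as a simple checklist. The following is a minimal sketch, assuming a hypothetical `DatasetRecord` description of a dataset; the class, its fields and the two small lookup sets are illustrative assumptions, not part of ARIADNE or of any repository's API. Machine-readability is not tested separately here, since it depends on the actual file content.

```python
from dataclasses import dataclass

# Assumption: illustrative sets only, limited to what the slide names.
OPEN_LICENSES = {"CC-BY", "ODC-BY"}
NON_REUSABLE_FORMATS = {"pdf"}  # the slide gives PDF as a counter-example


@dataclass
class DatasetRecord:
    """Hypothetical description of a published dataset."""
    url: str          # where the data can be obtained
    file_format: str  # e.g. "csv", "pdf"
    license: str      # e.g. "CC-BY"
    is_summary: bool  # True if only figures/charts taken from a publication


def open_data_issues(record: DatasetRecord) -> list[str]:
    """Return the criteria from the list above that the record fails."""
    issues = []
    if not record.url.startswith(("http://", "https://")):
        issues.append("not accessible online")
    if record.file_format.lower() in NON_REUSABLE_FORMATS:
        issues.append("format not reusable")
    if record.is_summary:
        issues.append("summarized data only")
    if record.license.upper() not in OPEN_LICENSES:
        issues.append("not openly licensed")
    return issues
```

An empty result means the record passes these (deliberately coarse) checks; a real assessment would need a fuller list of licenses and formats.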
Open Data – expectations
• The scientific method: provide evidence for knowledge claims ("show us your data")
• Output of publicly funded research should be openly available, and preserved properly
• Better return-on-investment, e.g.
  – no/less duplication of data collection
  – better exploitation of available data
  – therefore emphasis on data re-use!
• More open data sharing => better analysis => better decision-making / solutions
• Innovation through "data-intensive", "data-driven", "big data" research (incl. archaeology?)
Open Data – drivers /1
• High-level policies
  – OECD Declaration on Access to Research Data from Public Funding (2004; Principles and Guidelines, 2007) … many others
• Research funding agencies
  – Open Access mandates (for publications) extended to data
  – Mandatory data management plans, i.e. data sharing must be considered already at application stage
  – e.g. European Commission Open Research Data Pilot in Horizon 2020 programme
Neelie Kroes, EC Vice-President, 2012: "Taxpayers should not have to pay twice for scientific research and they need seamless access to raw data. We want to bring dissemination and exploitation of scientific research results to the next level."
Open Data – drivers /2
• Data repositories/archives are being put in place
  – General
    • Zenodo (related to OpenAIRE)
    • Figshare and Mendeley Data (commercial background)
  – Archaeology
    • Archaeology Data Service (UK, since 1996!)
    • E-depot Nederlandse Archeologie (NL)
    • MAPPA (Pisa, Italy)
    • IANUS (Germany, in preparation)
    • OpenContext and tDAR (USA)
• Community data archives missing in many countries
Current data (non-)sharing behaviour
• Contrary to what advocates of proper management and sharing of data would like researchers to do
• According to representative surveys (PARSE.Insight 2009, Science 2011): Most data remains locked away
  – On personal computers, portable storage carriers, restricted access servers… eventually discarded as "obsolete" or lost otherwise
• Mostly not considered: Potential value of the data for alternate and new uses by others
• Only 6-8% of researchers sometimes deposit data in a community open data repository/archive
Archiving/storing data for future use /1
PARSE.Insight survey 2009: 1202 respondents – different research domains and countries
Archiving/storing data for future use /2
"Science" journal 2011 survey of peer reviewers: 1700 responses – different research domains and countries
Where do you archive most of the data generated in your lab or for your research?
• 50.2% in our lab
• 38.5% university server
• 7.6% community repository
• 3.2% "other"
• 0.5% not stored
Data archiving/publication
ARIADNE online survey, Nov./Dec. 2013
Barriers to open data sharing
• Many obstacles to providing open access to reusable data
  – Priority of published papers
  – Little academic reward for development and sharing of datasets/DB
  – Required effort to share re-usable data (incl. formatting, metadata creation, licensing etc.)
  – Existing copyrights, confidential and sensitive data
  – Concerns that data could be scooped, misused or misinterpreted
  – Potential reputational risk (e.g. data quality, errors, …)
• Overall a bad ratio of additional effort & risks to potential benefits
Barriers to data deposit/publication
ARIADNE online survey, Nov./Dec. 2013
"How would you rate the importance of the following potential barriers to enhancing access to research data?"
European Commission: Online survey on scientific information in the digital age (2012); 1140 participants from around Europe.
Need to make the benefits of open data publication clear!
Authors' benefit focus
• Goal = recognition and academic reward for data providers, same as for other publications
• Core mechanism = citation of published data/set
  – Confirms value of the data contributed
  – Indicates providers of good data
  – Promotes further use of the data (i.e. more citations)
  – Allows the use and impact of the data to be tracked and measured
• But data citation metrics not implemented yet
• Some indications of higher citation rates of publications that make underlying data available
How to reap the benefits? /1
• Deposit data that underpins your research results in a reliable, community-recognised repository
  – See: Data Seal of Approval; Trusted Repositories Audit & Certification (TRAC) and other checklists
  – Should provide unique persistent identifiers (e.g. DOIs)
  – Should require users to follow a citation standard as part of the user agreement (e.g. DataCite; citation in reference list)
• Provide good metadata: "no pain, no gain"
  – Key for data re-use without direct contact with creator
  – Costs of preparing data and metadata for publication should be included in project funding
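To illustrate how a persistent identifier and basic metadata combine into a citable reference-list entry, here is a minimal sketch. The element order (creators, year, title, publisher, identifier) is an assumption modelled loosely on common data citation recommendations such as DataCite's, not a quotation of any standard; the function name, the metadata keys, the example dataset and its DOI are all hypothetical placeholders.

```python
def format_data_citation(meta: dict) -> str:
    """Build a reference-list entry for a dataset.

    Assumed element order: creators, year, title, publisher, DOI link.
    Raises ValueError if a required metadata field is missing or empty.
    """
    required = ("creators", "year", "title", "publisher", "doi")
    missing = [key for key in required if not meta.get(key)]
    if missing:
        raise ValueError(f"metadata incomplete, missing: {', '.join(missing)}")
    creators = "; ".join(meta["creators"])
    return (f"{creators} ({meta['year']}): {meta['title']}. "
            f"{meta['publisher']}. https://doi.org/{meta['doi']}")


# Hypothetical dataset; the DOI is a placeholder, not a real identifier.
example = {
    "creators": ["Doe, J.", "Roe, R."],
    "year": 2016,
    "title": "Excavation records of an example site",
    "publisher": "Example Data Archive",
    "doi": "10.1234/example.5678",
}
print(format_data_citation(example))
```

The check for missing fields reflects the "no pain, no gain" point: without complete metadata the dataset cannot be cited properly, so the sketch refuses to produce a partial citation.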
How to reap the benefits? /2
• Apply a license not impeding re-use (e.g. CC-BY, ODC-BY)
• Demand proper citation by others who use your published data
• Publish a "data paper" (about your data)
• Promote/cite your data when appropriate
• Seek collaboration and co-authoring of papers with data re-users
Issue of actual data re-use
• No re-use => no citation => no recognition/rewards
• Few studies on re-use outside of science (best known: Piwowar & Vision 2013 on genomics data)
• DIPIR project included archaeologists (Faniel et al. 2013), main results:
  – Key importance of sufficient contextual information
  – Especially research design and data collection procedures
  – Other criteria (less important):
    • Institutional background of data producers (training, reputation)
    • Good practices of the data repository
• Will there be a lot of re-use if more data is openly available?
Takeaway points /1
• Researchers as open data publishers and consumers
  – Publish open data to reap benefits, individually and as research community
  – Recognise colleagues who share data, cite their datasets properly
• Research institutions
  – Reward researchers who publish data
  – Change mind-sets by doing (not toothless mandates)
  – Offer skills development and support
• Data archives/repositories
  – Important research infrastructure: need sustained funding, but also demonstrate usage/impact
  – Repositories and researchers are partners
Takeaway points /2
• Research funders
  – Open Data mandates should come with financial coverage of extra effort for open data publication
• All: It's not about data management (plans) to comply with policies
• It's about …
  – making published data part of the scholarly record: persistent, citable, rewarded
  – demonstrating tangible benefits of open data publication and re-use
References and other literature /1
• ADS – Archaeology Data Service, http://archaeologydataservice.ac.uk
• ARIADNE online survey Nov./Dec. 2013, results presented in: ARIADNE First report on users needs (D2.1, April 2014), http://www.ariadne-infrastructure.eu/Resources/D2.1-First-report-on-users-needs
• Charles Beagrie Ltd.: Keeping Research Data Safe (KRDS) Benefits Framework, http://beagrie.com/krds-i2s2.php
• Borgman, C.L: Research Data: Who will share what, with whom, when, and why? Fifth China – North America Library Conference 2010, Beijing, 8-12 September 2010, http://works.bepress.com/cgi/viewcontent.cgi?article=1237&context=borgman
• CODATA - ICSTI Task Group on Data Citation Standards and Practices (2013): Out of cite, out of mind: The current state of practice, policy, and technology for the citation of data. In: Data Science Journal, vol.12, https://www.jstage.jst.go.jp/article/dsj/12/0/12_OSOM13-043/_article
• CODATA (2015): The Value of Open Data Sharing. A CODATA Report for the Group on Earth Observations. Living Document, V1, November 2015, https://zenodo.org/record/33830#.V6TSdzX74ac
• Costas R., Meijer I., Zahedi Z. & Wouters P. (2013): The Value of Research Data. Metrics for datasets from a cultural and technical point of view. Center for Science and Technology Studies, Leiden University, http://www.knowledge-exchange.info/Default.aspx?ID=586
• Data Seal of Approval, http://www.datasealofapproval.org
References and other literature /2
• DataCite, http://www.datacite.org
• Digital Object Identifier (DOI), http://www.doi.info
• DIPIR project, http://dipir.org
• E-depot Nederlandse Archeologie, http://www.edna.nl
• European Commission: Online survey on scientific information in the digital age, Brussels,
• European High-level Expert Group on Scientific Data (2010): Riding the wave. How Europe can gain from the rising tide of scientific data. A submission to the European Commission, October 2010, http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf
• Faniel I., Kansa E., Whitcher-Kansa S., Barrera-Gomez J. & Yakel E. (2013): The Challenges of Digging Data: A Study of Context in Archaeological Data Reuse. JCDL 2013 Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, 295-304 (preprint), http://www.oclc.org/content/dam/research/publications/library/2013/faniel-archae-data.pdf
• Force 11: Joint Declaration of Data Citation Principles, https://www.force11.org/datacitation/
• Gibney E. & Van Noorden R. (2013): Scientists losing data at a rapid rate. Nature, 19.12.2013, http://www.nature.com/news/scientists-losing-data-at-a-rapid-rate-1.14416
• Goodman A., Pepe A., Blocker A. W. et al. (2014): Ten Simple Rules for the Care and Feeding of
References and other literature /3
• Harley, Diane et al. (2010): Assessing the Future Landscape of Scholarly Communication: An Exploration of Faculty Values and Needs in Seven Disciplines. – Archaeology Case Study. University of California Berkeley, http://escholarship.org/uc/item/15x7385g#page-37
• Heidorn, P.B: Shedding Light on the Dark Data in the Long Tail of Science. Library Trends 57(2), 2008, http://hdl.handle.net/2142/9127
• IANUS - Research Data Centre for Archaeology and Ancient Studies (Germany), http://www.ianus-fdz.de
• Knowledge Exchange (2013): "Making Data Count: Research data availability and research assessment", workshop, Berlin, http://www.knowledge-exchange.info/Default.aspx?ID=577
• Internet Archaeology: Data Papers, http://intarch.ac.uk/authors/data-papers.html
• Journal of Open Archaeology Data, http://openarchaeologydata.metajnl.com
• JISC / Curtis G., Hammond M. & Oppenheim C. (2013): Access to Citation Data: Cost-benefit and Risk Review and Forward Look. JISC, September 2013, http://repository.jisc.ac.uk/5371/1/Access-to-Citation-data-report-final.pdf
• Key Perspectives (2010): Data Dimensions: Disciplinary Differences in Research Data Sharing, Reuse and Long Term Viability. Digital Curation Center, http://www.dcc.ac.uk/sites/default/files/SCARP%20SYNTHESIS_FINAL.pdf
• Kvalheim V. & Kvamme T. (2014): Policies for Sharing Research Data in Social Sciences and Humanities. A survey about research funders’ data policies. International Federation of Data Organisations (IFDO), http://www.ada.edu.au/documents/ifdo-report-on-policies-for-data-sharing
References and other literature /4
• Lawrence B., Jones C., Matthews B., Pepler S. & Callaghan S. (2011): Citation and peer review of data: Moving towards formal data publication. International Journal of Digital Curation, 6(2): 4-37, http://www.ijdc.net/index.php/ijdc/article/viewFile/181/265
• MAPPA Open Data (University of Pisa, Italy), http://mappaproject.arch.unipi.it/?lang=en
• OECD: Declaration on Access to Research Data from Public Funding (30.01.2004),
• OECD: Principles and Guidelines for Access to Research Data from Public Funding (2007), http://www.oecd.org/science/sci-tech/38500813.pdf
• OpenAIRE: Open access to research data: the Open Research Data Pilot, https://www.openaire.eu/h2020-oa-data-pilot
• Open Context (Alexandria Archive Institute, USA), http://opencontext.org
• Opportunities for Data Exchange (ODE) project / Kotarski R. et al. (2012). Report on best practices for citability of data and on evolving roles in scholarly communication, http://www.alliancepermanentaccess.org/index.php/community/current-projects/ode/outputs/
• PARSE.Insight: Insight into digital preservation of research output in Europe. Project deliverable D3.4: Survey Report, 9 December 2009, http://www.parse-insight.eu/downloads/PARSE-Insight_D3-4_SurveyReport_final_hq.pdf
References and other literature /5
• Parsons M.A. & Fox P.A. (2013): Is data publication the right metaphor? In: Data Science Journal, vol.12, February 2013, https://www.jstage.jst.go.jp/article/dsj/12/0/12_WDS-042/_pdf
• Piwowar H.A. & Vision T.J. (2013): Data reuse and the open data citation advantage. In: PeerJ, 1:e175 http://dx.doi.org/10.7717/peerj.175
• Pryor, Graham (2009): Multi-scale Data Sharing in the Life Sciences: Some Lessons for Policy Makers. International Journal of Digital Curation, Vol. 4, No 3, http://ijdc.net/index.php/ijdc/article/view/135/178
• Research Data Alliance, https://www.rd-alliance.org
• RIN - Research Information Network & British Library (2010): Patterns of information use and exchange: case studies of researchers in the life sciences, http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/patterns-information-use-and-exchange-case-studie
• RIN - Research Information Network & Key Perspectives (2008): To Share or not to Share: Publication and Quality Assurance of Research Data Outputs. Main Report and Annex. RIN: London, http://www.rin.ac.uk/our-work/data-management-and-curation/share-or-not-share-research-data-outputs
• RIN & NESTA - National Endowment for Science, Technology, and the Arts (2010): Open to All? Case Studies of Openness in Research. Report prepared by the Digital Curation Centre. London: RIN, http://www.rin.ac.uk/our-work/data-management-and-curation/open-science-case-studies
References and other literature /6
• Science magazine: Science Staff introduction to the Special Issue "Dealing with Data", Science, Vol. 331 no. 6018, 11 February 2011, pp. 692-693, http://www.sciencemag.org/content/331/6018/692.short
• tDAR - The Digital Archaeological Record (Digital Antiquity consortium, USA), http://www.tdar.org
• Tenopir C., Allard S., Douglass K. et al. (2011): Data Sharing by Scientists: Practices and Perceptions. PLoS ONE 6(6): e21101, http://www.plosone.org/article/info:doi/10.1371/journal.pone.0021101
• The Royal Society (2012): Science as an Open Enterprise, June 2012, http://royalsociety.org/policy/projects/science-public-enterprise/report/
• Thessen, A.E & Patterson, D.J (2011): Data issues in the life sciences. In: ZooKeys 150: 15–51, http://www.pensoft.net/journals/zookeys/article/1766/data-issues-in-the-life-sciences
• Uhlir, Paul F. (ed., 2012): For attribution: Developing scientific data attribution and citation practices and standards: Summary of an international workshop. Washington, D.C.: National Academies Press, http://www.nap.edu/catalog.php?record_id=13564
• Vines T.H., Andrew R.L., Bock D.G. et al. (2013): Mandated data archiving greatly improves access to research data. FASEB Journal, 27(4), http://www.fasebj.org/content/27/4/1304
• Zenodo (data repository at CERN, related to OpenAIRE) http://www.zenodo.org
ARIADNE is a project funded by the European Commission under the Community’s Seventh Framework Programme, contract no. FP7-INFRASTRUCTURES-2012-1-313193. The views and opinions expressed in this presentation are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission.