MxCube/ISPyB Meeting

Data Management @ESRF

Alex de Maria Antolinos
Software Engineer, Data Manager @ Data Analysis Unit, Software Group, ESRF

Trieste, 12/09/2018

Why a data management plan?

● Data is the raw material of science and is our main product

Why an ESRF Data Policy?

Data needs to be properly managed to allow:
● Linking to publications (increasingly requested by publishers)
● Reanalysis
● Verification
● New research
● Preservation of unique data sets

Data Policy General Principles

Metadata Catalogue | 16 December 2016 | Alejandro DE MARIA ANTOLINOS

● Automatic capture of data and metadata

● ESRF is the keeper (custodian) of the raw data and associated metadata

● Raw data and metadata will be selected, organized and looked after in well-defined formats (curation)

● Raw data and metadata will be READ-ONLY for the duration of their lifetime

● Proprietary (commercial) research data is owned exclusively by the client who purchased the access and is not covered by the data policy

● Access is restricted to the experimental team during an embargo period of 3 years (EMBARGO)

● Access to raw data and associated metadata is foreseen to be via a searchable online catalogue (ICAT)
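
The 3-year embargo rule can be illustrated with a small date calculation. This is a minimal sketch, not ESRF code: the function names are hypothetical, and it assumes the embargo is counted from the end of the experiment session, which the slides do not state.

```python
from datetime import date

EMBARGO_YEARS = 3  # embargo length stated in the data policy slide


def embargo_end(session_end: date, years: int = EMBARGO_YEARS) -> date:
    """Return the first day on which the data is no longer under embargo.

    Assumption: the embargo is counted from the end of the session.
    """
    try:
        return session_end.replace(year=session_end.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 1 March.
        return date(session_end.year + years, 3, 1)


def is_public(session_end: date, today: date) -> bool:
    """True once the embargo period has elapsed."""
    return today >= embargo_end(session_end)
```

For example, under these assumptions `embargo_end(date(2018, 9, 12))` gives `date(2021, 9, 12)`, and `is_public` returns `False` for any earlier day.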

Data Policy

● About data and metadata
○ Only keep data generated at the ESRF
○ Data must be in a format known to the ESRF
○ Data must be traceable and verifiable as coming from the ESRF

● After the embargo, the data will be released under the CC-BY-4.0 license

Benefits of Data Management

FAIR Principles

https://doi.esrf.fr

Applications at the ESRF

ISPyB

ESRF paleontological DB

TomoDB

● Raw Data
○ Data are deleted from disk after 50 days
○ Full backups are kept for 2 years
○ No data management plan
○ No persistent identifiers

● Metadata
○ Not collected systematically
○ No online metadata catalogue for all beamlines
○ Experiment report is not public

ICAT

● ICAT is an open source metadata management system designed for large facilities

Implementation Overview

HDF5 + NeXus + ICAT (https://icat.esrf.fr)

ICAT

NeXus Implementation@ESRF

HDF5 + NeXus + ICAT (https://icat.esrf.fr)

● HDF5 as a mirror of ICAT on the local beamline file system
● Following the NeXus convention
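
To make the idea of an HDF5 file mirroring catalogue content more concrete, here is a minimal sketch of how catalogue fields could map onto NeXus-style paths. It uses only the standard library and returns a flat dictionary rather than writing a real file; in practice an HDF5 library would be used. The function name, the `path@attribute` notation and the `parameters` group are assumptions made for illustration, not the layout used at the ESRF. `NXentry`, `NXsample`, `NXcollection` and the `NX_class` attribute are standard NeXus names.

```python
from typing import Dict, Union

Value = Union[str, float, int]


def nexus_layout(proposal: str, sample: str, dataset: str,
                 metadata: Dict[str, Value]) -> Dict[str, Value]:
    """Map catalogue fields onto HDF5-style paths using NeXus base classes.

    Keys ending in "@NX_class" stand for the NX_class attribute of a group
    (an ad hoc notation used only in this sketch).
    """
    entry = f"/{dataset}"
    layout: Dict[str, Value] = {
        f"{entry}@NX_class": "NXentry",
        f"{entry}/title": dataset,
        f"{entry}/experiment_identifier": proposal,
        f"{entry}/sample@NX_class": "NXsample",
        f"{entry}/sample/name": sample,
    }
    if metadata:
        # Free-form parameters are grouped under an NXcollection (assumption).
        layout[f"{entry}/parameters@NX_class"] = "NXcollection"
    for key, value in metadata.items():
        layout[f"{entry}/parameters/{key}"] = value
    return layout


layout = nexus_layout("MD7890", "CdTe", "datasetName", {"Edge": "K"})
```

Because the file and the catalogue are built from the same fields, either one can in principle be regenerated from the other, which is the point of the "mirror" wording.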

Software involved in Data Management

CENTRAL SERVICE: ICAT

● Data is preserved at least 10 years
● Metadata is stored forever
● DOI
● Web Portal
● Electronic Logbook
● Open Data compliant with ESRF data Policy

BEAMLINE: SPEC

1956.RH> mdatanewproposal MD7890 /data/id01/{}/metadata
1984.RH> mdata_set_sample("CdTe","kmap","test of CdTe sample")
1986.RH> mdata_start "datasetName"
1986.RH> mdata_put("AcquisitionMode", "Transmission")
1986.RH> mdata_put("Element", "Fe")
1986.RH> mdata_put("Edge", "K")
1986.RH> mdata_upload("saxs", "/data/visitor/../XAS.png")
1986.RH> mdata_save
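
The order of the SPEC macros above (proposal, sample, start, put, save) suggests a small state machine. The sketch below models that sequence in Python to show the flow of information. `MetadataSession` and `Dataset` are hypothetical names, not an ESRF API, and the rule that a sample must be set before a dataset is started is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Dataset:
    proposal: str
    sample: str
    name: str
    parameters: Dict[str, str] = field(default_factory=dict)


class MetadataSession:
    """Hypothetical Python analogue of the mdata_* macro sequence."""

    def __init__(self, proposal: str) -> None:
        self.proposal = proposal
        self.sample: Optional[str] = None
        self.current: Optional[Dataset] = None
        self.saved: List[Dataset] = []

    def set_sample(self, name: str) -> None:
        self.sample = name

    def start(self, dataset_name: str) -> None:
        # Assumption: a sample must be declared before a dataset starts.
        if self.sample is None:
            raise RuntimeError("set a sample before starting a dataset")
        self.current = Dataset(self.proposal, self.sample, dataset_name)

    def put(self, key: str, value: str) -> None:
        if self.current is None:
            raise RuntimeError("no dataset started")
        self.current.parameters[key] = value

    def save(self) -> Dataset:
        if self.current is None:
            raise RuntimeError("no dataset started")
        done, self.current = self.current, None
        self.saved.append(done)
        return done


# Same values as in the SPEC session shown on the slide.
session = MetadataSession("MD7890")
session.set_sample("CdTe")
session.start("datasetName")
session.put("AcquisitionMode", "Transmission")
session.put("Element", "Fe")
session.put("Edge", "K")
dataset = session.save()
```

Closing the dataset on `save` keeps each set of parameters attached to exactly one dataset name, which is what a catalogue entry needs.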

User Interface

E-Logbook

Architecture https://icat.esrf.fr

[Architecture diagram. Labels: PRODUCERS, CONSUMERS, VALIDATION, INGESTION, ICAT, ENRICHMENT, ICAT TAPE INTERFACE]

Dataset fields shown in the diagram:
● Investigation
● Sample
● Dataset Name
● Metadata
● Raw data file paths
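
The architecture slide names producers, consumers, a validation stage and an ingestion stage, together with the fields a dataset carries (investigation, sample, dataset name, metadata, raw data file paths). A minimal in-process sketch of that pattern follows. The real system's message broker, stage order and validation rules are not given in the slides, so the queue, the `validate` rule and all names here are assumptions.

```python
from dataclasses import dataclass, field
from queue import Queue
from typing import Dict, List


@dataclass
class DatasetMessage:
    investigation: str
    sample: str
    dataset_name: str
    metadata: Dict[str, str] = field(default_factory=dict)
    file_paths: List[str] = field(default_factory=list)


def validate(msg: DatasetMessage) -> bool:
    """Accept only messages whose identifying fields are all non-empty.

    Assumption: the actual validation rules are not described in the slides.
    """
    return all([msg.investigation, msg.sample, msg.dataset_name, msg.file_paths])


def consume(messages: "Queue[DatasetMessage]",
            catalogue: List[DatasetMessage],
            rejected: List[DatasetMessage]) -> None:
    """Drain the queue: valid messages are ingested, the rest set aside."""
    while not messages.empty():
        msg = messages.get()
        (catalogue if validate(msg) else rejected).append(msg)


# Producer side: a beamline pushes dataset descriptions onto the queue.
messages: "Queue[DatasetMessage]" = Queue()
messages.put(DatasetMessage("MD7890", "CdTe", "datasetName",
                            {"Edge": "K"},
                            ["/data/example/file_0001.h5"]))  # hypothetical path
messages.put(DatasetMessage("MD7890", "", "unnamed"))  # missing sample and files

# Consumer side: validation followed by ingestion into a stand-in catalogue.
catalogue: List[DatasetMessage] = []
rejected: List[DatasetMessage] = []
consume(messages, catalogue, rejected)
```

Decoupling producers from consumers through a queue means a beamline can keep acquiring data even when the central catalogue is temporarily unavailable.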

Status (http://www.esrf.fr/datapolicy)

Implementation Coordination Team Members

● Alejandro de Maria (ISDD) – Data Manager
● Bruno Lebayle (TID) – IT infrastructure
● Joanne McCarthy (EXPD) – User Office
● Armando Solé (ISDD) – Metadata + data
● Jens Meyer (ISDD) – Beamline controls
● Dominique Porte (TID) – User IDs
● Rudolf Dimper (TID) – Data policy
● Andy Götz (ISDD) – Implementation

Thanks for your attention!!