Going FAIR: highlights from the life sciences Susanna-Assunta Sansone, PhD @SusannaASansone ORCiD: 0000-0001-5306-5690 Consultant, Founding Academic Editor Associate Director, Principal Investigator RDA Europe Science Workshop, Wellcome Trust, London, 25-26 April 2017
Going FAIR: premises, promises and challenges of interoperability standards

Jan 21, 2018

Transcript
Page 1: Going FAIR: premises, promises and challenges of interoperability standards

Going FAIR: highlights from the life sciences

Susanna-Assunta Sansone, PhD

@SusannaASansone
ORCiD: 0000-0001-5306-5690

Consultant, Founding Academic Editor

Associate Director, Principal Investigator

RDA Europe Science Workshop, Wellcome Trust, London, 25-26 April 2017

Page 2: Going FAIR: premises, promises and challenges of interoperability standards

Interoperability standards: premises, promises and challenges

Page 3: Going FAIR: premises, promises and challenges of interoperability standards

A set of principles, for those wishing to enhance the value of their data holdings

Designed and endorsed by a diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers.

Page 4: Going FAIR: premises, promises and challenges of interoperability standards

Wider adoption by policies, e.g.

Page 5: Going FAIR: premises, promises and challenges of interoperability standards

Wider adoption by research and infrastructure programmes, e.g.

Page 6: Going FAIR: premises, promises and challenges of interoperability standards

Wider adoption by pharmas, e.g.

The world's biggest public-private partnership in the life sciences, a partnership between the European Commission and the European pharmaceutical industry.

Funds research and infrastructure projects to improve health by speeding up the development of, and patient access to, innovative medicines.

Page 7: Going FAIR: premises, promises and challenges of interoperability standards

NOTE: The Principles are high-level; they do not suggest any specific technology, standard, or implementation solution.

Beyond the nice acronym…

The Principles put emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals.

Page 8: Going FAIR: premises, promises and challenges of interoperability standards

Interoperability standards – invisible machinery

• Identifiers and metadata to be implemented by technical experts in tools, registries, catalogues, databases, services
  § to find, store, manage (e.g., mint, track provenance, version) and aggregate (e.g., interlink and map etc.) digital objects

• It is essential to make standards ‘invisible’ to lay users, who often have little or no familiarity with them
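
The management tasks named on this slide (mint, track provenance, version, interlink, find) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Registry` class, the `example:` identifier prefix and the metadata keys are hypothetical and are not modelled on any real registry or identifier service.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class DigitalObject:
    identifier: str
    versions: list = field(default_factory=list)  # metadata snapshots, oldest first
    links: set = field(default_factory=set)       # identifiers of related objects


class Registry:
    """Hypothetical in-memory registry (illustration only)."""

    def __init__(self, prefix="example"):
        self.prefix = prefix
        self._counter = count(1)
        self.objects = {}

    def mint(self, metadata):
        """Mint a new identifier and store the first metadata version."""
        identifier = f"{self.prefix}:{next(self._counter):06d}"
        self.objects[identifier] = DigitalObject(identifier, [dict(metadata)])
        return identifier

    def update(self, identifier, **changes):
        """Append a new version; earlier versions are kept as provenance."""
        obj = self.objects[identifier]
        obj.versions.append({**obj.versions[-1], **changes})
        return len(obj.versions)

    def link(self, first, second):
        """Interlink two registered objects in both directions."""
        self.objects[first].links.add(second)
        self.objects[second].links.add(first)

    def find(self, **criteria):
        """Return identifiers whose latest metadata matches all criteria."""
        return [
            identifier
            for identifier, obj in self.objects.items()
            if all(obj.versions[-1].get(k) == v for k, v in criteria.items())
        ]


registry = Registry()
dataset = registry.mint({"title": "Example dataset", "type": "dataset"})
tool = registry.mint({"title": "Example tool", "type": "software"})
registry.update(dataset, title="Example dataset (revised)")
registry.link(dataset, tool)
print(registry.find(type="dataset"))
```

A lay user of such a service would only ever see the search box and the results; the identifier scheme and version history stay behind the scenes, which is the sense of 'invisible' used above.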

Page 9: Going FAIR: premises, promises and challenges of interoperability standards

Metadata standards – fundamentals

• Descriptors for a digital object that help to understand what it is, where to find it, how to access it etc.

• The type of metadata depends also on the type of digital object (e.g. software, dataset)

• The depth and breadth of metadata vary according to their purpose
  § e.g. reproducibility requires richer metadata than citation
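
As a rough sketch of how purpose drives metadata depth, the snippet below checks one record against two profiles. The field names and the two profiles are assumptions chosen for illustration; they are not taken from any published metadata standard.

```python
# Hypothetical profiles: the fields each purpose is assumed to require.
REQUIRED_FIELDS = {
    "citation": {"identifier", "title", "creators", "publication_year"},
    "reproducibility": {
        "identifier", "title", "creators", "publication_year",
        "protocol", "software_version", "parameters",
    },
}


def missing_fields(record, purpose):
    """Return the sorted names of required fields that are absent or empty."""
    required = REQUIRED_FIELDS[purpose]
    present = {key for key, value in record.items() if value not in (None, "", [], {})}
    return sorted(required - present)


record = {
    "identifier": "example:000001",
    "title": "Example dataset",
    "creators": ["A. Researcher"],
    "publication_year": 2017,
}

print(missing_fields(record, "citation"))
print(missing_fields(record, "reproducibility"))
```

The same record is complete enough to cite but falls short for reproducibility, where the protocol, software version and parameters would also be needed.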

Page 10: Going FAIR: premises, promises and challenges of interoperability standards

• Domain-level descriptors that are essential for interpretation, verification and reproducibility of datasets

• The depth and breadth of descriptors vary according to the domain, broadly covering the what, who, when, how and why

Content standards – deeper metadata for datasets

Page 11: Going FAIR: premises, promises and challenges of interoperability standards

• Domain-level descriptors that are essential for interpretation, verification and reproducibility of datasets

• The depth and breadth of descriptors vary according to the domain, broadly covering the what, who, when, how and why, allowing:
  § experimental components (e.g., design, conditions, parameters),
  § fundamental biological entities (e.g., samples, genes, cells),
  § complex concepts (such as bioprocesses, tissues and diseases),
  § analytical process and the mathematical models, and
  § their instantiation in computational simulations (from the molecular level through to whole populations of individuals)
  to be harmonized with respect to structure, format and annotation

Content standards – deeper metadata for datasets

Page 12: Going FAIR: premises, promises and challenges of interoperability standards

Content standards in the life/biomedical sciences

[Figure: Formats, Terminologies, Guidelines; counts shown: 220+, 115+, 548+; 882 -> ~1000]

Guidelines, e.g.: MIAME, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE, REMARK, CONSORT, …

Formats, e.g.: SRAxml, SOFT, FASTA, DICOM, mzML, SBRML, SEDML, GELML, ISA, CML, MITAB, …

Terminologies, e.g.: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO, TEDDY, PRO, XAO, DO, VO, …

Page 13: Going FAIR: premises, promises and challenges of interoperability standards

Variety of community efforts, just a few examples:

[Figure: logos of de jure standard organizations and de facto grass-roots groups; one label reads "Nanotechnology Working Group"]

• Formal authorities
  § openness to participation varies
  § standards are sold or licensed (at a cost or at no cost)
  § charges apply to advanced training or programmatic access

• Bottom-up communities
  § open to interested parties
  § standards are free for use
  § volunteering efforts
  § minimal or little funds to carry out the work, let alone provide training

Formats Terminologies Guidelines

Page 14: Going FAIR: premises, promises and challenges of interoperability standards

• Perspective and focus vary, ranging:
  § from standards with a specific biological or clinical domain of study (e.g. neuroscience) or significance (e.g. model processes)
  § to the technology used (e.g. imaging modality)

• Motivation is different, spanning:
  § creation of new standards (to fill a gap)
  § mapping and harmonization of complementary or contrasting efforts
  § extensions and repurposing of existing standards

• Stakeholders are diverse, including those:
  § involved in managing, serving, curating, preserving, publishing or regulating data and/or other digital objects
  § from academia, industry, governmental sectors, and funding agencies
  § producers but also consumers of the standards, as domain (and not just technical) expertise is a must

A complex landscape

Page 15: Going FAIR: premises, promises and challenges of interoperability standards

[Figure: technologically-delineated views of the world (arrays and scanning, columns, gels, MS, FTIR, NMR; transcriptomics, proteomics, metabolomics) set against biologically-delineated views of the world (plant biology, epidemiology, microbiology)]

Generic features (‘common core’)
- description of source biomaterial
- experimental design components

Fragmentation of content standards

Page 16: Going FAIR: premises, promises and challenges of interoperability standards
Page 17: Going FAIR: premises, promises and challenges of interoperability standards

Working in/across multiple domains is challenging

• Requires:
  § Mapping between/among heterogeneous representations
  § Conceptual modelling framework to encompass the domain specific content standards
  § Tools to handle customizable annotation, multiple conversions and validation
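
The mapping and validation requirements above can be sketched as follows. The two source formats (`format_a`, `format_b`), their field names and the mapping tables are hypothetical; the target fields borrow the 'common core' idea of source biomaterial and experimental design from the fragmentation slide, under assumed names.

```python
# Assumed common-core fields shared across domains.
COMMON_CORE = ("source_biomaterial", "experimental_design")

# Hypothetical per-format mappings from local field names to the common core.
FIELD_MAPS = {
    "format_a": {"sample_source": "source_biomaterial", "design": "experimental_design"},
    "format_b": {"Organism part": "source_biomaterial", "Study design": "experimental_design"},
}


def to_common_core(record, source_format):
    """Split a record into mapped common-core fields and unmapped leftovers."""
    mapping = FIELD_MAPS[source_format]
    core, unmapped = {}, {}
    for key, value in record.items():
        if key in mapping:
            core[mapping[key]] = value
        else:
            unmapped[key] = value
    return core, unmapped


def validate(core):
    """Return one message per common-core field that is missing or empty."""
    return [f"missing {name}" for name in COMMON_CORE if not core.get(name)]


core_a, extra_a = to_common_core(
    {"sample_source": "liver", "design": "time course", "array_type": "two-colour"},
    "format_a",
)
core_b, extra_b = to_common_core({"Organism part": "leaf"}, "format_b")

print(core_a, extra_a, validate(core_a))
print(core_b, extra_b, validate(core_b))
```

Keeping the unmapped fields rather than discarding them is a deliberate choice in this sketch: technology-specific detail is preserved while the shared core becomes comparable across domains.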

Page 18: Going FAIR: premises, promises and challenges of interoperability standards
Page 19: Going FAIR: premises, promises and challenges of interoperability standards

Map of the landscape, monitoring development and evolution of data and metadata standards, their use in databases and the adoption of both in data policies

Page 20: Going FAIR: premises, promises and challenges of interoperability standards

is also a WG of the

Page 21: Going FAIR: premises, promises and challenges of interoperability standards

Data deposition: ENA, EGA, PDBe, EuropePMC, …

Bioinformatics tools: Bio.tools

Data Interoperability: BioSharing, identifiers.org, OLS

Compute: Secure data transfer, cloud computing, AAI

Industry: Innovation and SME programme; Bespoke collaborations

Training: TeSS, Data Carpentry, eLearning

Data management: Genome annotation; Data management plans

Added value data: UniProt, Ensembl, OrphaNet, …

is part of the services

Page 22: Going FAIR: premises, promises and challenges of interoperability standards

Standard developing groups:

Journals, publishers:

Cross-links, data exchange:

Societies and organisations:

Institutional RDM services:

Projects, programmes:

Page 23: Going FAIR: premises, promises and challenges of interoperability standards

• Pain points include:
  § Fragmentation
  § Coordination, harmonization, extensions
  § Credit, incentives for contributors
  § Governance, ownership
  § Funding streams
  § Indicators and evaluation methods
  § Implementations: infrastructures, tools, services
  § Outreach and engagement with all stakeholders
  § Synergies between basic and clinical/medical areas
  § Education, documentation and training
  § Business models for sustainability

Interoperability standards - technical & social engineering

Page 24: Going FAIR: premises, promises and challenges of interoperability standards

“As Data Science culture grows, digital research outputs (such as data, computational analysis and software) are being established as first-class citizens.

This cultural shift is required to go one step further: to recognize interoperability standards as digital objects in their own right, with their associated research, development and educational activities”.