FAIR digital research assets: beyond the acronym Susanna-Assunta Sansone, PhD @SusannaASansone ORCiD 0000-0001-5306-5690 Consultant, Founding Academic Editor Associate Director, Principal Investigator Neuroinformatics, Kuala Lumpur, 20-21 August, 2017

FAIR and metadata standards - FAIRsharing and Neuroscience

Jan 21, 2018

Transcript
Page 1: FAIR and metadata standards - FAIRsharing and Neuroscience

FAIR digital research assets: beyond the acronym

Susanna-Assunta Sansone, PhD @SusannaASansone

ORCiD 0000-0001-5306-5690

Consultant, Founding Academic Editor

Associate Director, Principal Investigator

Neuroinformatics, Kuala Lumpur, 20-21 August, 2017

Page 2: FAIR and metadata standards - FAIRsharing and Neuroscience

To do better science, more efficiently, we need data that are…

• Available in a public repository

• Findable through some sort of search facility

• Retrievable in a standard format

• Self-described so that third parties can make sense of it

• Intended to outlive the experiment for which they were collected

Page 3: FAIR and metadata standards - FAIRsharing and Neuroscience

A set of principles, for those wishing to enhance the value of their data holdings

Page 4: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 5: FAIR and metadata standards - FAIRsharing and Neuroscience

Wider adoption of the FAIR principles, by research infrastructure programmes, e.g.

Page 6: FAIR and metadata standards - FAIRsharing and Neuroscience

Defining FAIRness

Page 7: FAIR and metadata standards - FAIRsharing and Neuroscience

Defining a framework for evaluating FAIRness

By the fairmetrics.org Working Group

Page 8: FAIR and metadata standards - FAIRsharing and Neuroscience

Interoperability standards – the pillars of FAIR

NOTE: The Principles are high-level; they do not suggest any specific technology, standard, or implementation-solution

Principles put emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals

Page 9: FAIR and metadata standards - FAIRsharing and Neuroscience

The invisible machinery

• Identifiers and metadata to be implemented by technical experts in tools, registries, catalogues, databases, services

• It is essential to make standards ‘invisible’ to lay users, who often have little or no familiarity with them

Page 10: FAIR and metadata standards - FAIRsharing and Neuroscience

http://nometadata.org/logo

Page 11: FAIR and metadata standards - FAIRsharing and Neuroscience

Metadata standards – fundamentals

• Descriptors for a digital object that help to understand what it is, where to find it, how to access it etc.

• The type of metadata depends also on the type of digital object (e.g. software, dataset)

• The depth and breadth of metadata vary according to their purpose
  ▪ e.g. reproducibility requires richer metadata than citation
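
As a minimal sketch of the points on this slide, and not a schema taken from the talk, the descriptors can be pictured as a small record whose required fields depend on the purpose. All field names and the two purpose profiles below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

# Hypothetical purpose profiles: which descriptors each purpose needs.
# Reproducibility asks for everything citation does, plus more.
REQUIRED_FIELDS = {
    "citation": {"identifier", "title", "creators", "access_url"},
    "reproducibility": {
        "identifier", "title", "creators", "access_url",
        "object_type", "protocol", "software_version",
    },
}


@dataclass
class DigitalObjectMetadata:
    """Illustrative descriptors: what it is, where to find it, how to access it."""
    descriptors: dict = field(default_factory=dict)

    def missing_for(self, purpose: str) -> set:
        """Return the descriptors still needed for the given purpose."""
        present = {key for key, value in self.descriptors.items() if value}
        return REQUIRED_FIELDS[purpose] - present


# Placeholder values; none of these refer to a real dataset.
record = DigitalObjectMetadata({
    "identifier": "doi:10.0000/example",
    "title": "Example dataset",
    "creators": ["A. Researcher"],
    "access_url": "https://example.org/data/1",
    "object_type": "dataset",
})
```

Calling `record.missing_for("citation")` returns an empty set for this record, while `record.missing_for("reproducibility")` still reports the protocol and software version, which is the "richer metadata" point in miniature.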

Page 12: FAIR and metadata standards - FAIRsharing and Neuroscience

Metadata standards - datasets

• Domain-level descriptors that are essential for interpretation, verification and reproducibility of datasets

• The depth and breadth of descriptors vary according to the domain, broadly covering the what, who, when, how and why

Page 13: FAIR and metadata standards - FAIRsharing and Neuroscience

Metadata standards - datasets

• Domain-level descriptors that are essential for interpretation, verification and reproducibility of datasets

• The depth and breadth of descriptors vary according to the domain, broadly covering the what, who, when, how and why, allowing:
  ▪ experimental components (e.g., design, conditions, parameters),
  ▪ fundamental biological entities (e.g., samples, genes, cells),
  ▪ complex concepts (such as bioprocesses, tissues and diseases),
  ▪ analytical processes and the mathematical models, and
  ▪ their instantiation in computational simulations (from the molecular level through to whole populations of individuals)
  to be harmonized with respect to structure, format and annotation

Page 14: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 15: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 16: FAIR and metadata standards - FAIRsharing and Neuroscience

Metadata for discovery

Page 17: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 18: FAIR and metadata standards - FAIRsharing and Neuroscience

model and related formats

Metadata for discovery, but not only

Page 19: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 20: FAIR and metadata standards - FAIRsharing and Neuroscience

…..

Page 21: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 22: FAIR and metadata standards - FAIRsharing and Neuroscience

Domain-specific metadata standards for datasets

Guidelines: MIAME, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE, REMARK, CONSORT, …

Formats: SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML, GELML, ISA, CML, MITAB, …

Terminologies: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO, TEDDY, PRO, XAO, DO, VO, …

de jure standard organizations

de facto grass-roots groups

Formats Terminologies Guidelines

220+   115+   548+   ~1000

Page 23: FAIR and metadata standards - FAIRsharing and Neuroscience

https://doi.org/10.6084/m9.figshare.3795816.v2

https://doi.org/10.6084/m9.figshare.4055496.v1

Page 24: FAIR and metadata standards - FAIRsharing and Neuroscience

A complex landscape

• Perspective and focus vary, ranging:
  ▪ from standards with a specific biological or clinical domain of study (e.g. neuroscience) or significance (e.g. model processes)
  ▪ to the technology used (e.g. imaging modality)

• Motivation is different, spanning:
  ▪ creation of new standards (to fill a gap)
  ▪ mapping and harmonization of complementary or contrasting efforts
  ▪ extensions and repurposing of existing standards

• Stakeholders are diverse, including those:
  ▪ involved in managing, serving, curating, preserving, publishing or regulating data and/or other digital objects
  ▪ in academia, industry, governmental sectors, and funding agencies
  ▪ who are producers but also consumers of the standards, as domain (and not just technical) expertise is a must

Page 25: FAIR and metadata standards - FAIRsharing and Neuroscience

Standards’ life cycle

• Formulation
  ▪ use cases, scope, prioritization and expertise

• Development
  ▪ iterations, tests, feedback and evaluation
  ▪ harmonization of different perspectives and available options

• Maintenance
  ▪ (exemplar) implementations, technical documentation, education material, metrics
  ▪ sustainability, evolution (versions) and conversion modules

Page 26: FAIR and metadata standards - FAIRsharing and Neuroscience

Fragmentation, duplications and gaps

[Figure: technologically-delineated views of the world (transcriptomics, proteomics, metabolomics, with technologies such as Arrays & Scanning, Columns, Gels, MS, FTIR, NMR, …) set against biologically-delineated views of the world (plant biology, epidemiology, neuroscience)]

Generic features (‘common core’)
- description of source biomaterial
- experimental design components

Page 27: FAIR and metadata standards - FAIRsharing and Neuroscience

Modularization to combine and validate

[Figure: technology modules (Arrays & Scanning, Columns, Gels, MS, FTIR, NMR, … for transcriptomics, proteomics, metabolomics) combined with domain modules (plant biology, epidemiology, neuroscience)]

• Proteomics-based investigations of neurodegenerative diseases

• Proteomics and metabolomics-based investigations of neurodegenerative diseases

Page 28: FAIR and metadata standards - FAIRsharing and Neuroscience

Working in/across multiple domains is challenging

• Requires
  ▪ Mapping between/among heterogeneous representations
  ▪ Conceptual modelling framework to encompass the domain-specific metadata standards
  ▪ Tools to handle customizable annotation, multiple conversions and validation

Page 29: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 30: FAIR and metadata standards - FAIRsharing and Neuroscience

Technical and social engineering required

• Pain points include
  ▪ Fragmentation
  ▪ Coordination, harmonization, extensions
  ▪ Credit, incentives for contributors
  ▪ Governance, ownership
  ▪ Indicators and evaluation methods
  ▪ Outreach and engagement with all stakeholders
  ▪ Synergies between basic and clinical/medical areas
  ▪ Implementations: infrastructures, tools, services
  ▪ Education, documentation and training
  ▪ Funding streams
  ▪ Business models for sustainability

Page 31: FAIR and metadata standards - FAIRsharing and Neuroscience

Too many cooks in the standards’ kitchen?

Page 32: FAIR and metadata standards - FAIRsharing and Neuroscience

Standards fusion… anyone?

Page 33: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 34: FAIR and metadata standards - FAIRsharing and Neuroscience

Doing my fair share

doi: 10.1126/science.1180598

doi: 10.1038/nbt1346

OBO Portal and Foundry

doi: 10.1038/nbt.1411

Page 35: FAIR and metadata standards - FAIRsharing and Neuroscience

Improving discoverability of standards

• Consumers:
  ▪ How do I find the standards appropriate for my case?

• Producers:
  ▪ How do I make my standards visible to others?

Page 36: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 37: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 38: FAIR and metadata standards - FAIRsharing and Neuroscience

Monitors the development and evolution of standards, their use in databases and the adoption of both in data policies, to inform and educate the user community

Page 39: FAIR and metadata standards - FAIRsharing and Neuroscience

Working with and for producers and consumers

Standard developing groups, incl:

Journals, publishers, incl:

Cross-links, data exchange, incl:

Societies and organisations, incl:

Institutional RDM services, incl:

Projects, programmes:

Page 40: FAIR and metadata standards - FAIRsharing and Neuroscience

Databases/data repositories

Metadata standards

Formats Terminologies Guidelines

Interlink standards among themselves and with repositories

Data policies by funders, journals and other organizations

Page 41: FAIR and metadata standards - FAIRsharing and Neuroscience

Formats Terminologies Guidelines

…and to indicate ‘adoption’

Databases/data repositories

Data policies by funders, journals and other organizations

Metadata standards

Page 42: FAIR and metadata standards - FAIRsharing and Neuroscience

Assign ‘indicators’ to describe their status…

• Ready for use, implementation, or recommendation

• In development

• Status uncertain

• Deprecated as subsumed or superseded

[Chart of record counts by status; values as extracted: 270, 48232, 97, 87 4, 204, 9 6 8]

All records are manually curated in-house and verified by the community behind each resource

Paper in preparation, preliminary information as of July 2017
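
The four status indicators on this slide can be sketched as a small enumeration. This is an illustrative model only: the record names are hypothetical placeholders and the structure is an assumption, not the registry's actual data model.

```python
from enum import Enum


class Status(Enum):
    """The four indicators named on the slide."""
    READY = "Ready for use, implementation, or recommendation"
    IN_DEVELOPMENT = "In development"
    UNCERTAIN = "Status uncertain"
    DEPRECATED = "Deprecated as subsumed or superseded"


# Hypothetical records: (name, status). Not real registry entries.
records = [
    ("Standard A", Status.READY),
    ("Standard B", Status.IN_DEVELOPMENT),
    ("Standard C", Status.DEPRECATED),
]


def recommendable(items):
    """Keep only the records whose status is READY."""
    return [name for name, status in items if status is Status.READY]
```

A policy author could use a filter like `recommendable(records)` to shortlist candidates before looking at adoption figures.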

Page 43: FAIR and metadata standards - FAIRsharing and Neuroscience

Help us map the neuroscience standards landscape

Page 44: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 45: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 46: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 47: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 48: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 49: FAIR and metadata standards - FAIRsharing and Neuroscience
Page 50: FAIR and metadata standards - FAIRsharing and Neuroscience

Activating the decision-making chain

Journal Recommendations: number of standards recommended by 68 journals/publishers policies (the top one)

• Models/Formats: 6 out of 223 (ISA-Tab)

• Reporting Guidelines: 26 out of 118 (MIAME)

• Terminology Artifacts: 8 out of 343 (NCBI Tax)

Paper in preparation, preliminary information as of July 2017

Page 51: FAIR and metadata standards - FAIRsharing and Neuroscience

Activating the decision-making chain

Journal Recommendations: number of standards recommended by 68 journals/publishers policies (the top one)

• Models/Formats: 6 out of 223 (ISA-Tab)

• Reporting Guidelines: 26 out of 118 (MIAME)

• Terminology Artifacts: 8 out of 343 (NCBI Tax)

Database Implementations: number of standards implemented by 544 databases/repositories (the top one)

• Models/Formats: 146 out of 223 (FASTA)

• Reporting Guidelines: 59 out of 116 (MIAME)

• Terminology Artifacts: 121 out of 343 (GO)

Paper in preparation, preliminary information as of July 2017
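
To make the gap between journal recommendation and database implementation easier to see, the counts on this slide can be turned into percentages. This is a rough sketch: it assumes "N out of M" means N of the M registered standards of that type are adopted, and it uses the slide's preliminary July 2017 figures exactly as transcribed (including the 116 versus 118 totals for guidelines).

```python
# Figures transcribed from the slide (preliminary, as of July 2017).
# Each entry: (adopted count, total standards of that type, top standard).
journal_recommendations = {
    "Models/Formats": (6, 223, "ISA-Tab"),
    "Reporting Guidelines": (26, 118, "MIAME"),
    "Terminology Artifacts": (8, 343, "NCBI Tax"),
}
database_implementations = {
    "Models/Formats": (146, 223, "FASTA"),
    "Reporting Guidelines": (59, 116, "MIAME"),
    "Terminology Artifacts": (121, 343, "GO"),
}


def adoption_share(adopted, total):
    """Percentage of standards of one type that are adopted, to one decimal."""
    return round(100 * adopted / total, 1)


def summarize(table):
    """Map each standard type to its adoption percentage."""
    return {kind: adoption_share(a, t) for kind, (a, t, _top) in table.items()}


journal_share = summarize(journal_recommendations)
database_share = summarize(database_implementations)
```

Under that reading, about 2.7% of formats are recommended in journal policies while about 65.5% are implemented by databases, which illustrates how far recommendation lags behind implementation.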

Page 52: FAIR and metadata standards - FAIRsharing and Neuroscience

Philippe Rocca-Serra, PhD, Senior Research Lecturer

Alejandra Gonzalez-Beltran, PhD, Research Lecturer

Milo Thurston, DPhil, Research Software Engineer

Massimiliano Izzo, PhD, Research Software Engineer

Peter McQuilton, PhD, Knowledge Engineer

Allyson Lister, PhD, Knowledge Engineer

Eamonn Maguire, DPhil, Contractor

David Johnson, PhD, Research Software Engineer

Melanie Adekale, PhD, Biocurator Contractor

Delphine Dauga, PhD, Biocurator Contractor

Susanna-Assunta Sansone, PhD, Principal Investigator, Associate Director

Page 53: FAIR and metadata standards - FAIRsharing and Neuroscience

The (long) road to FAIR

Interoperability standards are digital objects in their own right, with their associated research, development and educational activities