FAIR digital research assets: beyond the acronym
Susanna-Assunta Sansone, PhD, @SusannaASansone
ORCiD 0000-0001-5306-5690
Consultant, Founding Academic Editor
Associate Director, Principal Investigator
Neuroinformatics, Kuala Lumpur, 20-21 August 2017
To do better science, more efficiently, we need data that are…
• Available in a public repository
• Findable through some sort of search facility
• Retrievable in a standard format
• Self-described so that third parties can make sense of it
• Intended to outlive the experiment for which they were collected
A set of principles, for those wishing to enhance the value of their data holdings
Wider adoption of the FAIR principles, by research infrastructure programmes, e.g.
Defining FAIRness
Defining a framework for evaluating FAIRness
By the fairmetrics.org Working Group
NOTE: The Principles are high-level; they do not suggest any specific technology, standard, or implementation-solution
Principles put emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals
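As a rough illustration of what "machine-findable" can look like in practice, the sketch below builds a dataset description as JSON-LD. It assumes the schema.org `Dataset` vocabulary, which is only one possible choice, since the Principles themselves prescribe no technology. The function name `dataset_jsonld` and all example values and URLs are hypothetical.

```python
import json


def dataset_jsonld(name, description, identifier, download_url, encoding_format):
    """Build a schema.org Dataset description as a JSON-LD dictionary.

    The choice of schema.org is an assumption made for illustration only.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,                # what it is
        "description": description,
        "identifier": identifier,    # where to find it
        "distribution": {            # how to access it
            "@type": "DataDownload",
            "contentUrl": download_url,
            "encodingFormat": encoding_format,
        },
    }


# Hypothetical values, used only to show the structure.
record = dataset_jsonld(
    name="Example recordings (hypothetical)",
    description="Placeholder dataset used only to illustrate the structure.",
    identifier="https://example.org/datasets/123",
    download_url="https://example.org/datasets/123/data.csv",
    encoding_format="text/csv",
)
print(json.dumps(record, indent=2))
```

A record like this can be embedded in a dataset landing page so that a crawler reads the same description a person sees.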
Interoperability standards – the pillars of FAIR
The invisible machinery
• Identifiers and metadata to be implemented by technical experts in tools, registries, catalogues, databases, services
• It is essential to make standards ‘invisible’ to lay users, who often have little or no familiarity with them
http://nometadata.org/logo
Metadata standards – fundamentals
• Descriptors for a digital object that help to understand what it is, where to find it, how to access it etc.
• The type of metadata depends also on the type of digital object (e.g. software, dataset)
• The depth and breadth of metadata vary according to their purpose
§ e.g. reproducibility requires richer metadata than citation
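The point that metadata depth depends on purpose can be sketched as a small completeness check. The descriptor names and the two purpose profiles below are hypothetical and are not taken from any particular standard; the sketch only shows that a record rich enough for citation may still fall short for reproducibility.

```python
# Hypothetical descriptor profiles; the field names are illustrative only.
REQUIRED = {
    "citation": {"title", "creators", "identifier", "year"},
    "reproducibility": {
        "title", "creators", "identifier", "year",
        "protocol", "software_version", "parameters",
    },
}


def missing_descriptors(metadata, purpose):
    """Return, sorted, the descriptors `purpose` needs but `metadata` lacks or leaves empty."""
    present = {
        key for key, value in metadata.items()
        if value not in (None, "", [], {})
    }
    return sorted(REQUIRED[purpose] - present)


# Hypothetical record: complete for citation, incomplete for reproducibility.
record = {
    "title": "Example dataset",
    "creators": ["A. Researcher"],
    "identifier": "https://example.org/datasets/123",
    "year": 2017,
    "protocol": "",  # present as a key but empty, so it counts as missing
}

print(missing_descriptors(record, "citation"))         # []
print(missing_descriptors(record, "reproducibility"))  # ['parameters', 'protocol', 'software_version']
```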
Metadata standards - datasets
• Domain-level descriptors that are essential for interpretation, verification and reproducibility of datasets
• The depth and breadth of descriptors vary according to the domain, broadly covering the what, who, when, how and why, allowing:
§ experimental components (e.g., design, conditions, parameters),
§ fundamental biological entities (e.g., samples, genes, cells),
§ complex concepts (such as bioprocesses, tissues and diseases),
§ analytical process and the mathematical models, and
§ their instantiation in computational simulations (from the molecular level through to whole populations of individuals)
to be harmonized with respect to structure, format and annotation
Metadata for discovery
model and related formats
Metadata for discovery, but not only
…..
Domain-specific metadata standards for datasets
Formats: SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML, GELML, ISA, CML, MITAB, …
Terminologies: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO, TEDDY, PRO, XAO, DO, VO, …
Guidelines: MIAME, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE, REMARK, CONSORT, …
de jure standard organizations
de facto grass-roots groups
220+, 115+, 548+, ~1000
https://doi.org/10.6084/m9.figshare.3795816.v2
https://doi.org/10.6084/m9.figshare.4055496.v1
• Perspective and focus vary, ranging:
§ from standards with a specific biological or clinical domain of study (e.g. neuroscience) or significance (e.g. model processes)
§ to the technology used (e.g. imaging modality)
• Motivation is different, spanning:
§ creation of new standards (to fill a gap)
§ mapping and harmonization of complementary or contrasting efforts
§ extensions and repurposing of existing standards
• Stakeholders are diverse, including those:
§ involved in managing, serving, curating, preserving, publishing or regulating data and/or other digital objects
§ in academia, industry, governmental sectors, and funding agencies
§ who are producers but also consumers of the standards, as domain (and not just technical) expertise is a must
A complex landscape
Standards’ life cycle
• Formulation
§ use cases, scope, prioritization and expertise
• Development
§ iterations, tests, feedback and evaluation
§ harmonization of different perspectives and available options
§ Conceptual modelling framework to encompass the domain specific metadata standards
§ Tools to handle customizable annotation, multiple conversions and validation
Technical and social engineering required
• Pain points include
§ Fragmentation
§ Coordination, harmonization, extensions
§ Credit, incentives for contributors
§ Governance, ownership
§ Indicators and evaluation methods
§ Outreach and engagement with all stakeholders
§ Synergies between basic and clinical/medical areas
§ Implementations: infrastructures, tools, services
§ Education, documentation and training
§ Funding streams
§ Business models for sustainability
Too many cooks in the standards’ kitchen?
Standards fusion… anyone?
doi: 10.1126/science.1180598
doi: 10.1038/nbt1346
OBO Portal and Foundry
doi: 10.1038/nbt.1411
Doing my fair share
• Consumers:
§ How do I find the standards appropriate for my case?
• Producers:
§ How do I make my standards visible to others?