Consultant, Honorary Academic Editor
Associate Director, Principal Investigator
Susanna-Assunta Sansone, PhD
Alan Turing Institute Symposium, Oxford, 6-7 April 2016
A Data Discovery Index prototype that:
• Helps users find and access shared data
• Interoperates in the NIH Commons (biomedical digital assets)
[Figure: prototype architecture. Data sources: repositories, online datasets, publishers, funding agencies, data producers. Stages: ingestion, indexing, searching. Components: metadata ingestion, ElasticSearch, terminology server, user interface.]
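The three stages named in the prototype diagram (ingestion, indexing, searching) can be illustrated with a minimal in-memory sketch. This is not the prototype's code: the diagram names ElasticSearch as a component, while the sketch below swaps in a plain Python inverted index so that it runs without any services. The source names, record fields, and function names are all hypothetical.

```python
from collections import defaultdict


def ingest(sources):
    """Flatten records from several sources into one list, tagging each with its source."""
    records = []
    for source_name, source_records in sources.items():
        for record in source_records:
            records.append({"source": source_name, **record})
    return records


def build_index(records):
    """Map each lowercase token in title/description to the positions of matching records."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        text = f"{record.get('title', '')} {record.get('description', '')}"
        for token in text.lower().split():
            index[token].add(position)
    return index


def search(index, records, query):
    """Return the records that contain every token of the query (AND semantics)."""
    tokens = query.lower().split()
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return [records[i] for i in sorted(hits)]


# Hypothetical metadata records from two hypothetical repositories.
sources = {
    "repository_a": [
        {"title": "Mouse liver proteomics",
         "description": "Mass spectrometry of liver tissue"},
    ],
    "repository_b": [
        {"title": "Human liver RNA-seq",
         "description": "Transcriptome of liver biopsies"},
    ],
}

records = ingest(sources)
index = build_index(records)

for hit in search(index, records, "mouse liver"):
    print(hit["source"], "-", hit["title"])
```

A real deployment would delegate tokenization, ranking, and storage to a search engine; the sketch only shows how the stages hand data to one another.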
[Figure: the Data Discovery Index linked to aggregators (A, B, C) and their data.]
Organizing framework and portal for data
Dashed lines: mapping of metadata standards, links to aggregators, data
Aggregators: repositories or various indices
Data: digital research objects
Pilot projects* Core development team
* There is work for everyone (and more)
Designed as an element of the ecosystem
Use cases: a community-driven effort
The ‘right’ level of metadata elements
Examples of competency questions, derived from the use cases
The ‘appropriate’ metadata standards
Mapping the landscape of standards and databases in the life sciences
Mapped a variety of metadata standards and database schemas
Generic schemas:
• schema.org
• DataCite
• RIF-CS
• DCAT
• PROV
• VOID
• Dublin Core
• etc.

Life/biomedical schemas:
• BioProject
• BioSample
• MINiML
• PRIDE-ml
• GA4GH metadata schema
• SRA xml
• CDISC SDM / BRIDG model
• etc.
We aimed for maximum coverage of the use cases with a minimal number of data elements.
We foresee that not all questions can be answered in full.
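The idea of covering several schemas with a small set of shared data elements can be sketched as a crosswalk table. The example below is an illustration only, not the project's model or mappings: the three common elements and the `to_common` helper are hypothetical, and the per-schema field names are simplified to the basic `name`/`title`, `description`, and `identifier` properties of schema.org and Dublin Core.

```python
# Hypothetical crosswalk: common element -> field name in each source schema.
CROSSWALK = {
    "schema.org": {
        "title": "name",
        "description": "description",
        "identifier": "identifier",
    },
    "Dublin Core": {
        "title": "title",
        "description": "description",
        "identifier": "identifier",
    },
}


def to_common(record, schema):
    """Translate a source record into the common elements and report what is missing."""
    mapping = CROSSWALK[schema]
    common = {element: record.get(field) for element, field in mapping.items()}
    missing = [element for element, value in common.items() if value is None]
    return common, missing


# Hypothetical schema.org-style record with no description.
common, missing = to_common({"name": "Liver atlas", "identifier": "EX-001"}, "schema.org")
print(common)
print("Missing elements:", missing)
```

Reporting the missing elements makes explicit which questions a given record cannot answer, instead of silently dropping the gap.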
Prototype, model, mappings, documentation and more at https://biocaddie.org and https://github.com/biocaddie
Supported by the NIH grant 1U24 AI117966-01 to the University of California, San Diego