Consultant, Honorary Academic Editor Associate Director, Principal Investigator Susanna-Assunta Sansone, PhD Alan Turing Institute Symposium Oxford, 6-7 April, 2016

NIH BD2K bioCADDIE DataMed: Data Discovery Index

Apr 13, 2017

Transcript
Page 1: NIH BD2K bioCADDIE DataMed: Data Discovery Index

Consultant, Honorary Academic Editor

Associate Director, Principal Investigator

Susanna-Assunta Sansone, PhD

Alan Turing Institute Symposium, Oxford, 6-7 April 2016

Page 2: NIH BD2K bioCADDIE DataMed: Data Discovery Index

A Data Discovery Index prototype that:
•  Helps users find and access shared data
•  Interoperates in the NIH Commons (biomedical digital assets)

Page 3: NIH BD2K bioCADDIE DataMed: Data Discovery Index
Page 4: NIH BD2K bioCADDIE DataMed: Data Discovery Index

Prototype architecture (diagram labels):
Data Sources: Repositories, Online datasets, Publishers, Funding Agencies, Data producers
Stages: Ingestion, Indexing, Searching
Components: Metadata Ingestion, ElasticSearch, Terminology server, User Interface
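The three stages named on this slide (ingestion, indexing, searching) can be illustrated with a minimal in-memory sketch. This is not the DataMed implementation: the prototype uses ElasticSearch, whereas the code below swaps in a plain Python inverted index, and the records, field names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical metadata records, standing in for what data sources might expose.
RECORDS = [
    {"id": "ds1", "source": "repository-a",
     "title": "Mouse liver RNA-seq",
     "description": "Gene expression in liver tissue"},
    {"id": "ds2", "source": "repository-b",
     "title": "Human plasma proteomics",
     "description": "Protein abundance in plasma"},
]


def ingest(records):
    """Ingestion: keep records that have an id and a title; normalise whitespace."""
    cleaned = []
    for rec in records:
        if rec.get("id") and rec.get("title"):
            cleaned.append({k: " ".join(str(v).split()) for k, v in rec.items()})
    return cleaned


def build_index(records):
    """Indexing: map each lower-cased token to the ids of records containing it."""
    index = defaultdict(set)
    for rec in records:
        text = f"{rec['title']} {rec.get('description', '')}"
        for token in text.lower().split():
            index[token].add(rec["id"])
    return index


def search(index, query):
    """Searching: return the ids of records that contain every query token."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    hits = [index.get(token, set()) for token in tokens]
    return set.intersection(*hits)


index = build_index(ingest(RECORDS))
print(sorted(search(index, "liver expression")))
```

A real deployment would replace `build_index` and `search` with calls to a search engine; the sketch only shows how the stages hand data to one another.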

Page 5: NIH BD2K bioCADDIE DataMed: Data Discovery Index

Diagram labels: Data Discovery Index; aggregators A, B, C; data

Organizing framework and portal for data

Dashed lines: mapping of metadata standards, links to aggregators, data
Aggregators: repositories or various indices
Data: digital research objects

Pilot projects* Core development team

* There is work for everyone (and more)

Designed as an element of the ecosystem

Page 6: NIH BD2K bioCADDIE DataMed: Data Discovery Index

Use cases: a community-driven effort

Page 7: NIH BD2K bioCADDIE DataMed: Data Discovery Index

The ‘right’ level of metadata elements

Examples of competency questions, derived from the use cases

Page 8: NIH BD2K bioCADDIE DataMed: Data Discovery Index

The ‘appropriate’ metadata standards

Mapping the landscape of standards and databases in the life sciences

Page 9: NIH BD2K bioCADDIE DataMed: Data Discovery Index

mapped a variety of metadata standards and database schemas

Generic schemas:
•  schema.org
•  DataCite
•  RIF-CS
•  DCAT
•  PROV
•  VoID
•  Dublin Core
•  etc.

Life/biomedical schemas:
•  BioProject
•  BioSample
•  MINiML
•  PRIDE-ml
•  GA4GH metadata schema
•  SRA xml
•  CDISC SDM / BRIDG model
•  etc.

We aimed for maximum coverage of the use cases with a minimal number of data elements.

We do foresee that not all questions can be answered in full.
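As an illustration of mapping different schemas onto a small set of shared data elements, here is a minimal sketch. The four target elements (`identifier`, `title`, `description`, `creators`) are a hypothetical core chosen for this example, not the actual DataMed model; the input keys follow schema.org `Dataset` properties and an assumed DataCite-style JSON layout (`doi`, `titles`, `descriptions`, `creators`).

```python
def from_schema_org(doc):
    """Map a schema.org Dataset (as a JSON-LD dict) to the hypothetical core elements."""
    creator = doc.get("creator", [])
    if isinstance(creator, dict):  # schema.org allows a single object or a list
        creator = [creator]
    return {
        "identifier": doc.get("identifier"),
        "title": doc.get("name"),
        "description": doc.get("description"),
        "creators": [c.get("name") for c in creator],
    }


def from_datacite(doc):
    """Map a DataCite-style JSON record (assumed layout) to the same core elements."""
    titles = doc.get("titles", [])
    descriptions = doc.get("descriptions", [])
    return {
        "identifier": doc.get("doi"),
        "title": titles[0]["title"] if titles else None,
        "description": descriptions[0]["description"] if descriptions else None,
        "creators": [c.get("name") for c in doc.get("creators", [])],
    }


# Hypothetical example records; the identifiers are placeholders, not real DOIs.
schema_org_doc = {
    "@type": "Dataset",
    "identifier": "10.1234/example-a",
    "name": "Mouse liver RNA-seq",
    "creator": {"@type": "Person", "name": "A. Researcher"},
}
datacite_doc = {
    "doi": "10.1234/example-b",
    "titles": [{"title": "Human plasma proteomics"}],
    "creators": [{"name": "B. Researcher"}],
}

for record in (from_schema_org(schema_org_doc), from_datacite(datacite_doc)):
    print(record)
```

Both example records lack a description, so that element comes out as `None`: a small instance of a question the shared elements cannot answer in full for every source.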


Page 10: NIH BD2K bioCADDIE DataMed: Data Discovery Index

Prototype, model, mappings, documentation and more at https://biocaddie.org and https://github.com/biocaddie

Supported by the NIH grant 1U24 AI117966-01 to the University of California, San Diego