Toward interoperable bioscience data. Susanna-Assunta Sansone, PhD, Principal Investigator, Team Leader, University of Oxford e-Research Centre, Oxford, UK. @isatools @biosharing. ISMB 2012, Long Beach, California, USA, July 15-17. ISMB hashtag: #PP44. Highlights Track: Databases and Ontologies.

ISA Commons / BioSharing - Susanna-Assunta Sansone - ISMB 2012

Jan 27, 2015



 
Transcript
  • 1. Toward interoperable bioscience data. Susanna-Assunta Sansone, PhD, Principal Investigator, Team Leader, University of Oxford e-Research Centre, Oxford, UK. @isatools @biosharing. ISMB 2012, Long Beach, California, USA, July 15-17. ISMB hashtag: #PP44. Highlights Track: Databases and Ontologies.

  • 2. What is this presentation about? ISA Commons, a grassroots collaborative that works to facilitate collection, curation and sharing of experiments in an increasingly diverse set of life science domains, using a common, structured representation of the experiments that transcends individual biological and technological domains, follows the appropriate community norms and standards (many listed in the BioSharing catalogue), and is implemented by several curation, storage and data sharing tools. TOWARDS INTEROPERABLE BIOSCIENCE DATA, doi:10.1038/ng.1054. Sansone SA, Rocca-Serra P, Field D, Maguire E, Taylor C, Hofmann O, Fang H, Neumann S, Tong W, Amaral-Zettler L, Begley K, Booth T, Bougueleret L, Burns G, Chapman B, Clark T, Coleman LA, Copeland J, Das S, de Daruvar A, de Matos P, Dix I, Edmunds S, Evelo C, Forster M, Gaudet P, Gilbert J, Goble C, Griffin J, Jacob D, Kleinjans J, Harland L, Haug K, Hermjakob H, Sui S, Laederach A, Liang S, Marshall S, Merrill E, McGrath A, Reilly D, Roux M, Shamu C, Shang C, Steinbeck C, Trefethen A, Williams-Jones B, Wolstencroft K, Xenarios J, Hide W. Feb 2012. www.biosharing.org www.isacommons.org
  • 3. From reusable data to reproducible research. To make the datasets comprehensible, interoperable and reusable, underpinning future investigations, we need common ways to report and share the experimental details and the associated results. Consistent reporting will have a positive and long-lasting impact on the value of collective scientific outputs. The International Conference on Systems Biology (ICSB), 22-28 August, 2008. Susanna-Assunta Sansone. www.ebi.ac.uk/net-project
  • 4. Structured description of datasets. Capture all salient features of the experimental workflow. Make annotation explicit and discoverable. Structure the descriptions for consistency, tracking independent variables and dependent variables, using cross references and resolvable identifiers.
  • 5. Not too much, not too little, just right. We must strike a balance between depth and breadth of information; and sufficient information required to reuse the data.
  • 6. Experimental design, sample characteristic(s), experimental variable(s), technology(s), measurement(s), protocol(s), data file(s), ... Example of experiments by InnoMed PredTox, a FP6 public-private consortium.
  • 7. A general mobilization to develop standards, e.g.: use the same word and refer to the same thing; allow data to flow from one system to another; report the same core, essential information. Challenges: different communities, different norms and standards, lack of coordination, fragmentation and uneven coverage.
  • 8. Growing number of reporting standards. Counts shown: +303, +150, +130. Sources: MIBBI, EQUATOR; BioPortal; estimated. Each one focuses on a particular biological or technological domain. Examples: MAGE-Tab, AAO, MIAME, GCDML, MIAPA, CHEBI, SRAxml, OBI, MIRIAM, VO, SOFT, MIQAS, FASTA, PATO, MIX, CML, ENVO, REMARK, DICOM, MIGEN, GELML, MOD, SBRML, MIAPE, MIQE, TEDDY, MITAB, MzML, XAO, CIMR, CONSORT, BTO, ISA-Tab, SEDML, DOPRO, IDO, MIASE, MISFISHIE.
  • 9. A catalogue to map the landscape of standards: over 400 bio-standards (public and in curation). Field*, Sansone* et al., Omics data sharing. Science 326, 234-36 (2009), doi:10.1126/science.1180598
  • 10-13. Example of a multi-assay study: how many standards are applicable to this?
  • 14. User community.
  • 15. Metadata tracking framework, designed to support the use of several standards (checklists, terminologies) and conversions to (a growing number of) other metadata formats used by public repositories, e.g. MAGE-Tab, Pride-xml, SRA-xml, SOFT. Currently finalizing conversion to RDF to explore the growing Linked Data universe (in collaboration with the W3C HCLSIG).
  • 16. ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level (Rocca-Serra et al, 2010). A collaborative effort of international research/service groups: University of Oxford, EBI, Harvard School of Public Health, NERC Environmental Bioinformatics Centre, Genomic Standards Consortium, US FDA Center for Bioinformatics, Leibniz Institute of Plant Biochemistry and more.
  • 17. To mint DOIs. Empowering researchers to use standards.
  • 18. Maguire E, Rocca-Serra P, Sansone SA, Davies J and Chen M. Taxonomy-based Glyph Design -- with a Case Study on Visualizing Workflows of Biological Experiments, IEEE Transactions on Visualization and Computer Graphics, volume 18, 2012 (in press).
  • 19-20. Ontology Search and Tagging in Google Spreadsheets.
  • 21. A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of investigations in an increasingly diverse set of life science domains, including: environmental health, stem cell discovery, environmental genomics, system biology, metabolomics, transcriptomics, metagenomics, toxicogenomics, nanotechnology, proteomics; also by communities working to build a library of cellular signatures. We aim to achieve a common representation of experimental content that transcends individual bioscience domains.
  • 22. A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework (text as on slide 21). Some of the public groups/resources and internal projects shown include Stem Cell Commons and the Nanotechnology Informatics Working Group.
  • 23. Implementations at Harvard. ISA.
  • 24. Implementations at Harvard. Importance of a local community.
  • 25. Implementations at Harvard. Data sharing in ISA-Tab. Importance of a local community.
  • 26. Implementation at the EBI.
  • 27. Data papers.
  • 28. Extensions. Nanotechnology Informatics Working Group.
  • 29. @isatools @biosharing isacommons.org biosharing.org
  • 30. TOWARDS INTEROPERABLE BIOSCIENCE DATA, doi:10.1038/ng.1054. Sansone SA, Rocca-Serra P, Field D, et al. Feb 2012. www.biosharing.org www.isacommons.org. Development timeline (2007-2012), in three tracks. Community involvement and uptake: 1st, 2nd and 3rd ISA-Tab workshops; user workshops/visits start; other tools implement ISA-Tab; 1st public instance: Harvard Stem Cell Discovery Engine; a growing number of systems start to adopt the ISA framework. Core developments: strawman ISA-Tab spec; final ISA-Tab spec; ISA software v1; database instance at EBI; conversions to Pride-XML/SRA-XML/MAGE-Tab and more; links to analysis tools start; RDF format starts. Publications: workshop reports; Omics data sharing (Science); ISA-Tab and ISA software suite (Bioinformatics); Stem Cell Discovery Engine (NAR); ISA Commons (Nature Genetics).