oreChem: Planning and Enacting Chemistry on the Semantic Web Microsoft Research eScience Workshop 2010 Berkeley, CA USA Mark Borkum, Simon Coles and Jeremy Frey 12 October 2010

oreChem: Planning and Enacting Chemistry on the Semantic Web

Nov 01, 2014


Mark Borkum

 
Transcript
Page 1: oreChem: Planning and Enacting Chemistry on the Semantic Web

oreChem: Planning and Enacting Chemistry on the Semantic Web

Microsoft Research eScience Workshop 2010

Berkeley, CA USA

Mark Borkum, Simon Coles and Jeremy Frey

12 October 2010

Page 2: oreChem: Planning and Enacting Chemistry on the Semantic Web

Overview

• Introduction

• Ontology

• Case Study: X-ray Crystallography

• Future Work

• Summary

Page 3: oreChem: Planning and Enacting Chemistry on the Semantic Web

The Scientific Method

• A systematic process for knowledge acquisition

• Becoming increasingly data-intensive

Page 4: oreChem: Planning and Enacting Chemistry on the Semantic Web

The Data Deluge

• In Haiku:

– Lots of producers;
  Generating more data
  than ever before.

• 40 years ago, a PhD student would determine 3 structures over the entire course of their study!

The Great Wave off Kanagawa by Katsushika Hokusai

Page 5: oreChem: Planning and Enacting Chemistry on the Semantic Web


The Scientific Method (on the Web)

Page 6: oreChem: Planning and Enacting Chemistry on the Semantic Web

Provenance (The Elephant in the Room)

• The 7 W’s [Goble 2002]

– Who, What, Where, Why, When, Which, & (W)How

• The Why aspect is often ignored

Page 7: oreChem: Planning and Enacting Chemistry on the Semantic Web

The oreChem Project

• Funded by Microsoft Research

• Investigating the design and deployment of a semantic-based eScience infrastructure for Chemistry

• Project website:

– http://research.microsoft.com/en-us/projects/orechem/

oreChem

Dublin Core, FOAF, SIOC, OWL Time, GeoNames, etc…

Page 8: oreChem: Planning and Enacting Chemistry on the Semantic Web


oreChem Core Ontology

Page 9: oreChem: Planning and Enacting Chemistry on the Semantic Web

Planning

• Prospective provenance

• Describes a scientific experiment that will be enacted (in the future)

• Three entity types:

– Plan

– Plan Stage

– Plan Object

Page 10: oreChem: Planning and Enacting Chemistry on the Semantic Web

Enactment

• Retrospective provenance

• Describes a scientific experiment that was enacted

• Three entity types:

– Run

– Stage

– Object

Page 11: oreChem: Planning and Enacting Chemistry on the Semantic Web

“In theory, there is no difference between theory and practice. But, in practice, there is.”

Unknown (possibly Yogi Berra)

Page 12: oreChem: Planning and Enacting Chemistry on the Semantic Web

Realisation (is not Instantiation)

• Each ‘run thing’ is linked to zero or one ‘plan thing’

– Deviation from the plan is allowed
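The planning and enactment entity types, and the optional link between them, can be sketched as plain data classes. This is a minimal illustration in Python, not the oreChem ontology itself: the class and field names, the helper unplanned, and the file notes.txt are assumptions made for the example. Only the plan name Ecrystals, the plan objects HKL and CIF, and the 02sot126 file names appear elsewhere in the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Prospective provenance: what is intended to happen.
@dataclass
class PlanObject:
    name: str


@dataclass
class PlanStage:
    name: str


@dataclass
class Plan:
    name: str
    stages: List[PlanStage] = field(default_factory=list)
    objects: List[PlanObject] = field(default_factory=list)


# Retrospective provenance: what actually happened.
@dataclass
class Object:
    name: str
    plan_object: Optional[PlanObject] = None  # zero or one 'plan thing'


@dataclass
class Stage:
    name: str
    plan_stage: Optional[PlanStage] = None  # zero or one 'plan thing'


@dataclass
class Run:
    name: str
    plan: Optional[Plan] = None
    stages: List[Stage] = field(default_factory=list)
    objects: List[Object] = field(default_factory=list)


def unplanned(run: Run) -> List[str]:
    """Names of run things with no plan counterpart (deviations from the plan)."""
    stages = [s.name for s in run.stages if s.plan_stage is None]
    objects = [o.name for o in run.objects if o.plan_object is None]
    return stages + objects


hkl, cif = PlanObject("HKL"), PlanObject("CIF")
plan = Plan("Ecrystals", objects=[hkl, cif])
run = Run(
    "eCrystal_20_Run",
    plan=plan,
    objects=[
        Object("02sot126.hkl", plan_object=hkl),
        Object("02sot126.cif", plan_object=cif),
        Object("notes.txt"),  # hypothetical file that the plan did not foresee
    ],
)
print(unplanned(run))
```

Because the link is optional rather than mandatory, a run can record things the plan never mentioned, which is the sense in which realisation differs from instantiation.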

Page 13: oreChem: Planning and Enacting Chemistry on the Semantic Web


X-RAY CRYSTALLOGRAPHY

Case Study

Page 14: oreChem: Planning and Enacting Chemistry on the Semantic Web

Current Practice in Crystallography

• Crystallography data is highly structured

– The de facto standard adopted by the community is the CIF (Crystallographic Information File)

• Relatively few crystal structures are openly available online

http://www.rin.ac.uk/our-work/data-management-and-curation/share-or-not-share-research-data-outputs

Page 15: oreChem: Planning and Enacting Chemistry on the Semantic Web


Crystallography and Fraud

Page 16: oreChem: Planning and Enacting Chemistry on the Semantic Web

The eCrystals Federation

• JISC project

• Network of crystallography resources

• All published records are available as Open Data

• Based on EPrints repository

http://ecrystals.chem.soton.ac.uk/

Page 17: oreChem: Planning and Enacting Chemistry on the Semantic Web

eCrystal #20

• Each eCrystals record contains:

– Bibliographic metadata

– Fundamental and derived data (excluding raw images)

– Final structure solution

Page 18: oreChem: Planning and Enacting Chemistry on the Semantic Web

Single Crystal Structure Determination

1. Take powder specimen of chemical substance

2. Measure diffraction of X-rays

3. Compute electron densities

4. Solve for crystal structure
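Step 3 can be illustrated with a one-dimensional toy Fourier synthesis. This is only a sketch under strong assumptions: the two "atoms", their positions and weights are invented, the unit cell has length 1, and the phases of the structure factors are taken as known, whereas a real diffraction experiment measures intensities only and the phases must be recovered separately.

```python
import cmath


def structure_factors(atoms, h_max):
    """F_h = sum_j f_j * exp(2*pi*i*h*x_j) for h in [-h_max, h_max]."""
    return {
        h: sum(f * cmath.exp(2j * cmath.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }


def electron_density(factors, n_points):
    """rho(x) = sum_h F_h * exp(-2*pi*i*h*x), sampled across one unit cell."""
    density = []
    for k in range(n_points):
        x = k / n_points
        rho = sum(F * cmath.exp(-2j * cmath.pi * h * x) for h, F in factors.items())
        density.append(rho.real)  # imaginary parts cancel for real-valued densities
    return density


# Hypothetical atoms: (fractional position, scattering weight).
atoms = [(0.2, 6.0), (0.7, 8.0)]
factors = structure_factors(atoms, h_max=10)
density = electron_density(factors, n_points=100)

# The highest density sample should sit on the heavier atom.
peak = max(range(len(density)), key=density.__getitem__)
print(peak / len(density))
```

The density map peaks at the atom positions, which is what step 4 then interprets as a structure.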

Page 19: oreChem: Planning and Enacting Chemistry on the Semantic Web

oreChem Plan for eCrystals

• Machine-readable representation of methodology

• Describes requirements for software and data products

• Available online at:

– http://ecrystals.chem.soton.ac.uk/plan.rdf

Page 20: oreChem: Planning and Enacting Chemistry on the Semantic Web

oreChem Run for eCrystal #20

• Exported by “oreChem” plug-in for EPrints 3.1

– RDF/XML serialisation

– Uses SWRL rules to infer causal relationships

• Describes:

– Software

– Data products

http://ecrystals.chem.soton.ac.uk/cgi/export/20/ORE_Chem/ecrystals-eprint-20.xml?include_xsl=1

Page 21: oreChem: Planning and Enacting Chemistry on the Semantic Web


Retrospective Provenance Graphs for eCrystal #20

[Figure: two graphs. “Stages and Objects” with used (dashed) and emitted (solid) edges; “Objects” with derivedFrom (solid) edges.]

Inference rule: used(?s, ?o1) & emitted(?s, ?o2) → derivedFrom(?o2, ?o1)
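The rule on this slide says that if a stage ?s used object ?o1 and emitted object ?o2, then ?o2 is derived from ?o1. A minimal Python sketch of that single inference step follows. It is not the SWRL implementation used by the plug-in: the function name, the stage names "solve" and "report", and the pairing of files to stages are assumptions for illustration. Only the file names come from the slides.

```python
def infer_derived_from(used, emitted):
    """Apply used(s, o1) & emitted(s, o2) -> derivedFrom(o2, o1).

    `used` and `emitted` are sets of (stage, object) pairs.
    Returns a set of (derived_object, source_object) pairs.
    """
    derived = set()
    for stage_in, source in used:
        for stage_out, product in emitted:
            if stage_in == stage_out:  # both facts must refer to the same stage
                derived.add((product, source))
    return derived


# Hypothetical stages; which stage touches which file is assumed, not sourced.
used = {("solve", "02sot126.hkl"), ("report", "02sot126.res")}
emitted = {("solve", "02sot126.res"), ("report", "02sot126.cif")}

for product, source in sorted(infer_derived_from(used, emitted)):
    print(f"{product} derivedFrom {source}")
```

The rule is applied once and is not transitive, so in this sketch the CIF is linked to the RES file but not directly to the HKL file.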

Page 22: oreChem: Planning and Enacting Chemistry on the Semantic Web

Crystallography and Fraud – SPARQL

PREFIX orechem: <http://www.openarchives.org/2010/05/24-orechem-ns#>
PREFIX ecrystals: <http://ecrystals.chem.soton.ac.uk/plan.rdf#>

SELECT ?run ?raw ?derived ?reported
WHERE {
  ?run a orechem:Run ;
       orechem:hasPlan ecrystals:Ecrystals ;
       orechem:containsObject ?raw ;
       orechem:containsObject ?derived ;
       orechem:containsObject ?reported .
  ?raw a orechem:File ;
       orechem:hasPlanObject ecrystals:HKL .
  ?derived a orechem:File ;
       orechem:derivedFrom ?raw .
  ?reported a orechem:File ;
       orechem:hasPlanObject ecrystals:CIF ;
       orechem:derivedFrom ?derived .
}

Page 23: oreChem: Planning and Enacting Chemistry on the Semantic Web


Crystallography and Fraud – SPARQL (2)

Page 24: oreChem: Planning and Enacting Chemistry on the Semantic Web

Crystallography and Fraud – SPARQL (3)

[Figure: provenance graph annotated with the query variables ?run, ?raw, ?derived and ?reported]

http://ecrystals.chem.soton.ac.uk/cgi/export/20/ORE_Chem/ecrystals-eprint-20.xml?include_xsl=1

Page 25: oreChem: Planning and Enacting Chemistry on the Semantic Web

Crystallography and Fraud – SPARQL (4)

?run               ?raw          ?derived      ?reported
_:eCrystal_20_Run  02sot126.hkl  02sot126.prp  02sot126.cif
_:eCrystal_20_Run  02sot126.hkl  02sot126.lst  02sot126.cif
_:eCrystal_20_Run  02sot126.hkl  02sot126.res  02sot126.cif
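The graph pattern behind these rows can be mimicked in plain Python to show what the query is matching. This is a sketch, not the actual query engine or the actual export: predicates are shortened to local names, the function find_reported_chains is hypothetical, and the triples are reconstructed so that they are consistent with the result rows above. The real RDF export may contain more.

```python
from itertools import product

RUN = "_:eCrystal_20_Run"
FILES = ["02sot126.hkl", "02sot126.prp", "02sot126.lst", "02sot126.res", "02sot126.cif"]

# Toy (subject, predicate, object) triples, assumed for illustration.
triples = {(RUN, "a", "Run"), (RUN, "hasPlan", "Ecrystals")}
for name in FILES:
    triples.add((RUN, "containsObject", name))
    triples.add((name, "a", "File"))
triples.add(("02sot126.hkl", "hasPlanObject", "HKL"))
triples.add(("02sot126.cif", "hasPlanObject", "CIF"))
for name in ("02sot126.prp", "02sot126.lst", "02sot126.res"):
    triples.add((name, "derivedFrom", "02sot126.hkl"))
    triples.add(("02sot126.cif", "derivedFrom", name))


def find_reported_chains(graph):
    """Find (run, raw, derived, reported) where reported <- derived <- raw."""
    rows = []
    runs = [s for (s, p, o) in graph if p == "a" and o == "Run"]
    for run in runs:
        if (run, "hasPlan", "Ecrystals") not in graph:
            continue
        contained = [o for (s, p, o) in graph if s == run and p == "containsObject"]
        for raw, derived, reported in product(contained, repeat=3):
            pattern = [
                (raw, "a", "File"),
                (raw, "hasPlanObject", "HKL"),
                (derived, "a", "File"),
                (derived, "derivedFrom", raw),
                (reported, "a", "File"),
                (reported, "hasPlanObject", "CIF"),
                (reported, "derivedFrom", derived),
            ]
            if all(t in graph for t in pattern):
                rows.append((run, raw, derived, reported))
    return sorted(rows)


rows = find_reported_chains(triples)
for row in rows:
    print(*row)
```

A reported CIF with no chain back to a raw HKL file would produce no row, which is presumably the signal the fraud check looks for.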

Page 26: oreChem: Planning and Enacting Chemistry on the Semantic Web

Future Work

• oreChem Core Ontology

– Support for conditionals and continuations

• oreChem Lower Ontology

– Specialised for Physical and Computational Chemistry

• Applications and Services

– oreChem Plan Designer and Enactor

– oreChem Run Inspector

Page 27: oreChem: Planning and Enacting Chemistry on the Semantic Web

Summary

• <summary/>

Page 28: oreChem: Planning and Enacting Chemistry on the Semantic Web

Acknowledgements

• Microsoft Research

– Tony Hey

– Lee Dirks

– Savas Parastatidis

– Alex Wade

• oreChem Project

– Carl Lagoze, Theresa Velden

– Jeremy Frey, Simon Coles

– Peter Murray-Rust, Nick Day, Jim Downing

– C. Lee Giles, Prasenjit Mitra, William Brouwer, Na Li

– Marlon Pierce, Sashi Kiran Challa

Page 29: oreChem: Planning and Enacting Chemistry on the Semantic Web

Thank You

• Questions?