SMART Protocols: SeMAntic RepresenTation for Experimental Protocols Olga Giraldo [email protected] Ontology engineering group (OEG) Universidad Politécnica de Madrid
Page 1: SMART Protocols in LISC-2014

SMART Protocols: SeMAntic RepresenTation for Experimental Protocols

Olga Giraldo

[email protected]

Ontology engineering group (OEG)

Universidad Politécnica de Madrid

Page 2: SMART Protocols in LISC-2014

Agenda

• What is a lab protocol

• Motivation

• Our general research question

• Our assumption

• Our proposal

• Preliminary results

• Future work

Page 3: SMART Protocols in LISC-2014

What is a lab protocol

• Laboratory protocols are like cooking recipes:
  • They have ingredients: reagents and samples.
  • They have appliances: equipment.
  • They have a total time.
  • They have a list of instructions.
  • They have critical steps.

• Laboratory protocols describe “how to do” an experiment.

Page 4: SMART Protocols in LISC-2014

Some problems in lab protocols

• Some protocols present insufficient granularity.

• Instructions can be imprecise or ambiguous because they are written in natural language, for example:

• Incubate the centrifuge tubes in a water bath.

• Incubate the samples for 5 min with gentle shaking.

• Rinse DNA briefly in 1-2 ml of wash.

• Incubate at -20C overnight.
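
The kind of imprecision shown in these instructions can be flagged mechanically. The following is a minimal sketch, not part of the original slides: the regular expressions, the list of vague terms, and the function name `review_instruction` are all assumptions chosen for illustration, and a real rule set would need much broader coverage.

```python
import re

# Hypothetical patterns: an explicit temperature (e.g. "-20C", "37 °C")
# and an explicit duration (e.g. "5 min", "2 h").
TEMPERATURE = re.compile(r"-?\d+(?:\.\d+)?\s*°?\s*C\b")
DURATION = re.compile(r"\b\d+(?:\.\d+)?\s*(?:s|sec|min|h|hours?)\b", re.IGNORECASE)

# Hypothetical list of words that leave a parameter open to interpretation.
VAGUE_TERMS = ("briefly", "gentle", "overnight")


def review_instruction(instruction):
    """Return a list of possible sources of ambiguity in one instruction."""
    issues = []
    text = instruction.lower()
    # Assumption: an incubation needs both a temperature and a duration.
    if "incubate" in text:
        if not TEMPERATURE.search(instruction):
            issues.append("no temperature")
        if not DURATION.search(instruction):
            issues.append("no duration")
    issues.extend(f"vague term: {term}" for term in VAGUE_TERMS if term in text)
    return issues


for line in (
    "Incubate the centrifuge tubes in a water bath.",
    "Incubate the samples for 5 min with gentle shaking.",
    "Rinse DNA briefly in 1-2 ml of wash.",
    "Incubate at -20C overnight.",
):
    print(line, "->", review_instruction(line))
```

Each of the four instructions from the slide yields at least one flag under these assumed rules, which is the point the slide makes informally.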

Page 5: SMART Protocols in LISC-2014

Motivation

Why do we need to formalize and extract information from lab protocols?

Because we want a recommendation system…
• That matches protocols to my situation, for instance:
  • the samples I have,
  • the availability of equipment, reagents, and lab conditions,
  • my expertise.

We also want content-based information retrieval:
• Meaningful sentences, sample used, purpose of the protocol, applicability, critical steps, etc. Also, identification of instructions.
• Example query: find all protocols for DNA extraction that have been used in Oryza sativa and that are suitable for processing a large number of samples with a low execution time.
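
Once protocol information is available in structured form, the DNA extraction query from this slide becomes a simple filter. The sketch below is illustrative only: the `Protocol` record, its field names, the `recommend` function, and all five records are invented for this example and do not come from the SMART Protocols data.

```python
from dataclasses import dataclass


@dataclass
class Protocol:
    """Hypothetical structured description of one lab protocol."""
    title: str
    purpose: str
    organisms: set
    equipment: set
    total_time_min: int
    high_throughput: bool


# Invented records for illustration only; they do not describe real protocols.
PROTOCOLS = [
    Protocol("Protocol A", "DNA extraction", {"Oryza sativa"},
             {"centrifuge", "water bath"}, 90, True),
    Protocol("Protocol B", "DNA extraction", {"Oryza sativa", "Zea mays"},
             {"centrifuge"}, 45, True),
    Protocol("Protocol C", "DNA extraction", {"Zea mays"},
             {"centrifuge"}, 30, True),
    Protocol("Protocol D", "RNA extraction", {"Oryza sativa"},
             {"centrifuge"}, 40, True),
    Protocol("Protocol E", "DNA extraction", {"Oryza sativa"},
             {"centrifuge", "bead mill"}, 20, True),
]


def recommend(protocols, purpose, organism, available_equipment, max_time_min):
    """Return matching protocols, fastest first."""
    matches = [
        p for p in protocols
        if p.purpose == purpose
        and organism in p.organisms
        # Every required piece of equipment must be available in the lab.
        and p.equipment <= available_equipment
        and p.high_throughput
        and p.total_time_min <= max_time_min
    ]
    return sorted(matches, key=lambda p: p.total_time_min)


results = recommend(PROTOCOLS, "DNA extraction", "Oryza sativa",
                    {"centrifuge", "water bath"}, max_time_min=120)
print([p.title for p in results])
```

The hard part, and the motivation for the rest of the talk, is filling such records automatically from semi-structured and unstructured protocol text.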

Page 6: SMART Protocols in LISC-2014

Currently…

Semi-structured information

Unstructured information

How to formalize the information from laboratory protocols as a knowledge base?

Ontologies + NLP tools

Page 7: SMART Protocols in LISC-2014

Our assumption

“Experimental protocols are fundamental information structures that should support the description of the processes by means of which results are generated in experimental research”

Page 8: SMART Protocols in LISC-2014

Our proposal

Page 9: SMART Protocols in LISC-2014

Methods to represent and extract information

• Gazetteer-based method: use existing lists of named entities (lists of proper nouns that refer to real-life entities)

• Rule-based approaches: write manual extraction rules

• Combination of the above

• Ontology model representing lab protocols

work in progress
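
The combination of gazetteer lookup and hand-written rules listed above can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' pipeline: the gazetteer entries, the two regular-expression rules, the function name `annotate`, and the example sentence are all hypothetical.

```python
import re

# Hypothetical gazetteers: small hand-made lists of known entity names.
GAZETTEERS = {
    "reagent": {"ethanol", "chloroform", "ctab buffer"},
    "equipment": {"water bath", "centrifuge", "vortex"},
    "sample": {"leaf tissue", "dna"},
}

# Hypothetical rule 1: a number followed by a unit is a quantity.
QUANTITY_RULE = re.compile(r"\b(\d+(?:\.\d+)?)\s*(ml|ul|min|h|rpm)\b", re.IGNORECASE)
# Hypothetical rule 2: an instruction starts with a capitalized imperative verb.
ACTION_RULE = re.compile(r"^\s*([A-Z][a-z]+)\b")


def annotate(instruction):
    """Combine gazetteer lookup with manual rules for one instruction."""
    text = instruction.lower()
    entities = []
    for label, terms in GAZETTEERS.items():
        for term in sorted(terms):
            # Whole-word match so that "dna" does not fire inside other words.
            if re.search(rf"\b{re.escape(term)}\b", text):
                entities.append((label, term))
    action = ACTION_RULE.match(instruction)
    return {
        "action": action.group(1) if action else None,
        "entities": entities,
        "quantities": QUANTITY_RULE.findall(instruction),
    }


print(annotate("Add 2 ml of ethanol and incubate in a water bath for 10 min."))
```

A plain gazetteer match has known limits: in "centrifuge tubes", for example, it would label "centrifuge" as equipment even though the phrase names a container, which is one reason to combine it with rules.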

Page 10: SMART Protocols in LISC-2014
Page 11: SMART Protocols in LISC-2014

Ontology development

Page 12: SMART Protocols in LISC-2014

Methodology used to develop SMART Protocols

Kick-off

• Gathering use cases.
• Gathering competency questions.

Conceptualization & Formalization

• DAKA - Domain Analysis and Knowledge Acquisition: analysis of 175 experimental protocols. [1]

• LISA - Linguistic and Semantic Analysis:
  • Identification of key metadata for reporting protocols. [2]
  • Determination of workflow aspects in protocols (the implicit order in the instructions, following the input-output structure).
  • Extraction of elements pertaining to domain knowledge, e.g. classification of protocols into groups according to their purpose. Within each group, basic steps (or common patterns) were identified according to the type of protocol.

• IO - Iterative Ontology building: design of conceptual maps and draft ontologies. The ontology modules were gathered from the DAKA and LISA activities and exchanged with domain experts.

Evaluation & Evolution

• OWL

• Correction of syntactic inconsistencies by using OWLViz [3] and OOPS [4]

• The ontology model evolves as new knowledge goes through the whole cycle.

[1] http://goo.gl/MC4mR9
[2] goo.gl/gAVnn
[3] http://protegewiki.stanford.edu/wiki/OWLViz
[4] http://oeg-lia3.dia.fi.upm.es/oops/index-content.jsp

Page 13: SMART Protocols in LISC-2014

SMART Protocols - document

• It is an extension of the IAO ontology.

• It supports rhetorical and structural components (e.g. introduction, materials, and methods).

• It supports information such as the application of the protocol, advantages and limitations, list of reagents, and critical steps.

SMART Protocols ontology is available here:

http://vocab.linkeddata.es/SMARTProtocols/

Page 14: SMART Protocols in LISC-2014

SMART Protocols - wf

• It is an extension of the P-Plan Ontology.

• It represents the workflow aspects of protocols: the implicit order in the instructions, following the input-output structure.

SMART Protocols ontology is available here:

http://vocab.linkeddata.es/SMARTProtocols/
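
The idea that step order is implicit in the input-output structure can be illustrated with a small sketch. It uses plain Python tuples as stand-ins for RDF triples and the P-Plan property names `hasInputVar`, `hasOutputVar`, and `isStepOfPlan`; the `ex:` steps and variables, the toy plan, and the function `step_order` are invented for this example and are not taken from the SMART Protocols ontology.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical triples for a toy plan, deliberately listed out of order.
TRIPLES = [
    ("ex:Wash", "p-plan:isStepOfPlan", "ex:DNAExtraction"),
    ("ex:Lysis", "p-plan:isStepOfPlan", "ex:DNAExtraction"),
    ("ex:Precipitation", "p-plan:isStepOfPlan", "ex:DNAExtraction"),
    ("ex:Wash", "p-plan:hasInputVar", "ex:DNAPellet"),
    ("ex:Wash", "p-plan:hasOutputVar", "ex:CleanDNA"),
    ("ex:Precipitation", "p-plan:hasInputVar", "ex:Lysate"),
    ("ex:Precipitation", "p-plan:hasOutputVar", "ex:DNAPellet"),
    ("ex:Lysis", "p-plan:hasInputVar", "ex:LeafTissue"),
    ("ex:Lysis", "p-plan:hasOutputVar", "ex:Lysate"),
]


def step_order(triples, plan):
    """Order the steps of a plan so that each input is produced before use."""
    steps = {s for s, p, o in triples
             if p == "p-plan:isStepOfPlan" and o == plan}
    # Map each variable to the step that outputs it.
    producer = {o: s for s, p, o in triples
                if p == "p-plan:hasOutputVar" and s in steps}
    # A step depends on the producers of its input variables.
    depends_on = {step: set() for step in steps}
    for s, p, o in triples:
        if p == "p-plan:hasInputVar" and s in steps and o in producer:
            depends_on[s].add(producer[o])
    return list(TopologicalSorter(depends_on).static_order())


print(step_order(TRIPLES, "ex:DNAExtraction"))
```

In this toy plan the steps form a single chain, so the order is unique. With independent branches, several valid orders would exist and the result would be only one of them.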

Page 15: SMART Protocols in LISC-2014

New and reused terms

• Reused classes = 52

Resource         No. of terms
OBI              15
NCI thesaurus    9
CHEBI            7
IAO              7
MGED Ontology    3
P-Plan           3
NPO              3
EXACT            2
SO               2
MeSH             1

• Reused properties = 4

Property            Origin    Reused in
isManufacturedBy    OBI       SMART Protocols-Document
hasInputVar         P-Plan    SMART Protocols-Workflow
hasOutputVar        P-Plan    SMART Protocols-Workflow
isStepOfPlan        P-Plan    SMART Protocols-Workflow

• New terms

Ontology                    No. of classes    No. of properties
SMART Protocols-Document    60                7
SMART Protocols-Workflow    44                1
Total                       104               8

Page 16: SMART Protocols in LISC-2014

Future work

Page 17: SMART Protocols in LISC-2014

• Analysis of the protocols, focusing on the identification of keywords and/or constructs in English (e.g. instructions, actions).

• Writing rules.

• Executing, testing and debugging the rules.

Work in progress

Page 18: SMART Protocols in LISC-2014

Summarizing…

Our purpose is the formalization of lab protocols by using ontologies and NLP tools to intelligently extract information.

Page 19: SMART Protocols in LISC-2014

Special thanks…

Supervisors: Oscar Corcho, Alexander Garcia

OEG’s colleagues: Daniel Garijo, María Poveda, Pablo Calleja, Nandana Mihindukulasooriya

Olga Giraldo

[email protected]

[email protected]

Ontology engineering group (OEG)

Universidad Politécnica de Madrid