http://img.cs.man.ac.uk/ stevens 1 Building and Using Ontologies Robert Stevens Department of Computer Science University of Manchester Manchester UK

Dec 18, 2015


Building and Using Ontologies

Robert Stevens
Department of Computer Science
University of Manchester
Manchester UK


Introduction

• The nature of bioinformatics resources
• What is knowledge?
• What is an ontology?
• What are the uses of ontologies?
• Components of an ontology
• Building an ontology (in brief)


The Nature of Bioinformatics Resources

• Over 500 databanks and analysis tools that work over resources

• Repositories of knowledge and data and generation of new knowledge

• Knowledge often held as free text; some use made of controlled vocabularies

• Enormous amount of semantic heterogeneity and poor query facilities

• Knowledge about services not always apparent


What is Knowledge?

• Knowledge – all information and an understanding to carry out tasks and to infer new information

• Information -- data equipped with meaning

• Data -- un-interpreted signals that reach our senses

Example: the raw string PATRICIAGRACEKENNEDYSAIDMINEISAPINT can be read in two ways.

• As English: "Patricia Grace Kennedy said mine is a pint" (name, noun, verb). Knowledge: Pat Baker is a Manchester bioinformatician who drinks beer.

• As a protein sequence: …CEKENN… in single letter amino acid codes (C – cysteine, K – lysine). Knowledge: Protein that acts as a tyrosine kinase in the liver of primates.
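
The step from data to information in the protein reading can be sketched in a few lines of Python. This is only an illustration: the function name and the lookup table are assumptions made here, and the table lists only the standard single-letter codes needed for the fragment CEKENN.

```python
# Standard single-letter amino acid codes, restricted to the letters
# that occur in the slide's fragment. The table name is hypothetical.
AMINO_ACIDS = {
    "C": "cysteine",
    "E": "glutamic acid",
    "K": "lysine",
    "N": "asparagine",
}


def interpret_as_protein(fragment):
    """Turn raw letters (data) into named residues (information)."""
    return [AMINO_ACIDS[letter] for letter in fragment]


residues = interpret_as_protein("CEKENN")
print(residues)
```

The same letters carry no such meaning until an interpretation (here, the code table) is supplied, which is the point the slide makes about data versus information.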


Capturing Knowledge

• Capturing knowledge for both humans and computer applications

• A set of vocabulary definitions that capture a community’s knowledge of a domain

• `An ontology may take a variety of forms, but necessarily it will include a vocabulary of terms, and some specification of their meaning. This includes definitions and an indication of how concepts are inter-related which collectively impose a structure on the domain and constrain the possible interpretations of terms.'


What Does an Ontology Do?

• Captures knowledge
• Creates a shared understanding – between humans and for computers
• Makes knowledge machine processable
• Makes meaning explicit – by definition and context


What is an Ontology?

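[Figure: a spectrum of ontology kinds, running from informal to formal. Labels on the slide:]

• Catalog/ID
• Terms/glossary
• Thesauri, "narrower term" relation
• Informal is-a
• Formal is-a
• Formal instance
• Frames (properties)
• Value Restrs.
• Disjointness, Inverse, part-of…
• General Logical constraints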


Roles of Ontologies in Bioinformatics

• We can divide ontology use into three types:

• Domain-oriented, which are either domain specific (e.g. E. coli) or domain generalisations (e.g. gene function or ribosomes);

• Task-oriented, which are either task specific (e.g. annotation analysis) or task generalisations (e.g. problem solving);

• Generic, which capture common high level concepts, such as Physical, Abstract and Substance. Important in ontology management and language applications.


Uses of Ontology

• Community reference -- neutral authoring.

• Either defining database schema or defining a common vocabulary for database annotation -- ontology as specification.

• Providing common access to information. Ontology-based search by forming queries over databases.

• Understanding database annotation and technical literature.

• Guiding and interpreting analyses and hypothesis generation
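
Ontology-based search can be illustrated with a small Python sketch. Everything here is hypothetical: the is-a links, the entry ids and the function names are invented for illustration and do not describe any real database. The idea is that a query for a term also returns entries annotated with narrower terms.

```python
# Hypothetical is-a links: child term -> parent term.
IS_A = {
    "tRNA": "RNA",
    "mRNA": "RNA",
    "RNA": "nucleic acid",
    "DNA": "nucleic acid",
}

# Hypothetical annotated database entries: (entry id, annotation term).
RECORDS = [
    ("entry-1", "tRNA"),
    ("entry-2", "DNA"),
    ("entry-3", "enzyme"),
    ("entry-4", "RNA"),
]


def is_kind_of(term, ancestor):
    """Follow is-a links upwards from term until ancestor or the root."""
    while term is not None:
        if term == ancestor:
            return True
        term = IS_A.get(term)
    return False


def search(query_term):
    """Return ids of entries annotated with query_term or a narrower term."""
    return [rid for rid, term in RECORDS if is_kind_of(term, query_term)]


rna_hits = search("RNA")
nucleic_acid_hits = search("nucleic acid")
print(rna_hits)
print(nucleic_acid_hits)
```

A plain keyword search for "RNA" would miss the entry annotated only as "tRNA"; the shared vocabulary is what lets the query reach it.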


Components of an Ontology

• Concepts: Class of individuals – The concept Protein and the individual 'human cytochrome C'
• Relationships between concepts
• The "is a kind of" relationship forms a taxonomy
• Other relationships give further structure – "is a part of"
• Axioms – Disjointness, covering, equivalence,…
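
These components can be sketched as a small Python data structure. This is a minimal illustration, not a real ontology language: the class, its field names, the part-of pair and the "mislabelled entry" individual are all assumptions made for the example. Only Protein and 'human cytochrome C' come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    is_a: dict = field(default_factory=dict)         # concept -> parent concept
    part_of: set = field(default_factory=set)        # (part, whole) pairs
    instance_of: dict = field(default_factory=dict)  # individual -> set of concepts
    disjoint: list = field(default_factory=list)     # pairs of disjoint concepts, as sets

    def ancestors(self, concept):
        """All concepts reached by following is-a links upwards."""
        found = []
        while concept in self.is_a:
            concept = self.is_a[concept]
            found.append(concept)
        return found

    def violations(self):
        """Individuals that fall under two concepts declared disjoint."""
        bad = []
        for individual, concepts in self.instance_of.items():
            kinds = set(concepts)
            for concept in concepts:
                kinds.update(self.ancestors(concept))
            if any(pair <= kinds for pair in self.disjoint):
                bad.append(individual)
        return bad


onto = Ontology(
    is_a={"Enzyme": "Protein", "Protein": "Macromolecule",
          "DNA": "NucleicAcid", "NucleicAcid": "Macromolecule"},
    part_of={("Atom", "Molecule")},                   # hypothetical part-of link
    instance_of={"human cytochrome C": {"Protein"},
                 "mislabelled entry": {"Enzyme", "DNA"}},  # deliberately wrong
    disjoint=[{"Protein", "NucleicAcid"}],
)
print(onto.ancestors("Enzyme"))
print(onto.violations())
```

The disjointness axiom is what lets the structure flag the deliberately wrong individual; a bare taxonomy would accept it silently.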


Knowledge Representation

• Ontologies are best delivered in some computable representation
• Variety of choices with different:
  – Expressiveness
    • The range of constructs that can be used to formally, flexibly, explicitly and accurately describe the ontology
  – Ease of use
  – Computational complexity
    • Is the language computable in real time?
  – Rigour -- Satisfiability and consistency of the representation
    • Systematic enforcement mechanisms
  – Unambiguous, clear and well defined semantics


Languages

• Vocabularies using natural language
  – Hand crafted, flexible but difficult to evolve, maintain and keep consistent, with weak semantics
  – Gene Ontology

• Object-based KR: frames
  – Extensively used, good structuring, intuitive. Semantics defined by OKBC standard
  – EcoCyc (uses Ocelot) and RiboWeb (uses Ontolingua)

• Logic-based: Description Logics
  – Very expressive, model is a set of theories, well defined semantics
  – Automatically derived classification taxonomies
  – Concepts can be defined or primitive


Building Ontologies

• No field of Ontological Engineering equivalent to Knowledge or Software Engineering;

• No standard methodologies for building ontologies;

• Such a methodology would include:
  – a set of stages that occur when building ontologies;
  – guidelines and principles to assist in the different stages;
  – an ontology life-cycle which indicates the relationships among stages.


The Development Lifecycle

• Two kinds of complementary methodologies emerged:
  – Stage-based, e.g. TOVE [Uschold96]
  – Iterative evolving prototypes, e.g. MethOntology [Gomez Perez94].

• Most have TWO stages:
  1. Informal stage
     • ontology is sketched out using either natural language descriptions or some diagram technique
  2. Formal stage
     • ontology is encoded in a formal knowledge representation language, that is machine computable
  – the informal representation helps the former
  – the formal representation helps the latter.


A Provisional Methodology

• A skeletal methodology and life-cycle for building ontologies;

• Inspired by the software engineering V-process model;

• The overall process moves through a life-cycle.

The left side charts the processes in building an ontology

The right side charts the guidelines, principles and evaluation used to ‘quality assure’ the ontology


The V-model Methodology

[Figure: V-model diagram. Labels recovered from the slide:]

• Building processes: Identify purpose and scope; Knowledge acquisition; Conceptualisation; Integrating existing ontologies; Encoding; Representation

• Quality assurance:
  – Evaluation: coverage, verification, granularity
  – Conceptualisation Principles: commitment, conciseness, clarity, extensibility, coherency
  – Encoding/Representation principles: encoding bias, consistency, house styles and standards, reasoning system exploitation

• Models: User Model; Conceptualisation Model; Implementation Model

• Ontology in Use


The ontology building life-cycle

[Figure: life-cycle diagram. Labels recovered from the slide:]

• Identify purpose and scope
• Knowledge acquisition
• Evaluation
• Language and representation
• Available development tools
• Building: Conceptualisation; Integrating existing ontologies; Encoding


Starting Concept List

• Chemicals – atom, ion, molecule, compound, element;
• Molecular-compound, ionic-compound, ionic-molecular-compound, …;
• Ionic-macromolecular-compound and ionic-small-macromolecular-compound;
• Protein, peptide, polyprotein, enzyme, holoprotein, apoprotein,…
• Nucleic acid – DNA, RNA, tRNA, mRNA, snRNA, …


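Conceptualisation Sketch

[Figure: hierarchy sketch rooted at Chemical. Node labels recovered from the slide:]

• Chemical
• Atom, Element, Compound, Molecule, Ion
• Metal, Non-Metal, Metalloid
• Molecular Compound
• Molecular Element
• Ionic Compound
• Ionic Molecule
• Ionic Molecular Compound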


Molecule Conceptualisation Sketch

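[Figure: hierarchy sketch for Molecule. Node labels recovered from the slide:]

• Macromolecule, SmallMolecule
• Ionic MacromolecularCompound
• NucleicAcid, Protein, Polysaccharide
• DNA, RNA, Enzyme, Peptide
• mRNA, tRNA, rRNA, snRNA
• Starch, Glycogen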


Initial Encoding

class-def chemical
  subclass-of substance

class-def molecule
  subclass-of chemical

class-def compound
  subclass-of chemical

class-def molecular-compound
  subclass-of molecule and compound
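
The class definitions above can be mirrored in Python to show what the taxonomy implies. This is a sketch under stated assumptions: the dictionary layout and the function name are choices made here, not part of the slide's encoding language. The links themselves are copied from the class-defs.

```python
# Direct subclass-of links copied from the slide's class-defs.
SUBCLASS_OF = {
    "chemical": ["substance"],
    "molecule": ["chemical"],
    "compound": ["chemical"],
    "molecular-compound": ["molecule", "compound"],
}


def superclasses(name):
    """Return every class reachable by following subclass-of links."""
    seen = set()
    stack = list(SUBCLASS_OF.get(name, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(SUBCLASS_OF.get(parent, []))
    return seen


mc_supers = superclasses("molecular-compound")
print(sorted(mc_supers))
```

Because molecular-compound has two parents, the structure is a graph rather than a strict tree, which is why the traversal tracks visited classes.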


Molecules Revisited

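[Figure: revised hierarchy sketch for Molecule, now including Non-Ionic MacromolecularCompound. Node labels recovered from the slide:]

• Macromolecule, SmallMolecule
• Ionic MacromolecularCompound, Non-Ionic MacromolecularCompound
• NucleicAcid, Protein, Polysaccharide
• DNA, RNA, Enzyme, Peptide
• mRNA, tRNA, rRNA, snRNA
• Starch, Glycogen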


More Encoding

class-def chemical
  subclass-of substance

class-def defined molecule
  subclass-of chemical
  slot-constraint contains-bond min-cardinality 1 has-value covalent-bond

class-def defined compound
  subclass-of chemical
  slot-constraint has-atom-types greater-than 1

class-def defined molecular-compound
  subclass-of molecule and compound
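
A defined class states conditions that are sufficient for membership, so membership can be computed from a description rather than asserted by hand. The Python sketch below imitates this for individual chemicals. It is a toy, not a description logic reasoner: it tests individuals against the conditions instead of computing subsumption between classes, and the `Chemical` record, its field names and the three example chemicals are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chemical:
    name: str
    covalent_bonds: int  # stands in for the contains-bond slot
    atom_types: int      # stands in for the has-atom-types slot


def is_molecule(c):
    # contains-bond min-cardinality 1 has-value covalent-bond
    return c.covalent_bonds >= 1


def is_compound(c):
    # has-atom-types greater-than 1
    return c.atom_types > 1


DEFINED = {
    "molecule": is_molecule,
    "compound": is_compound,
    "molecular-compound": lambda c: is_molecule(c) and is_compound(c),
}


def classify(c):
    """Names of all defined classes whose conditions the chemical meets."""
    return sorted(name for name, test in DEFINED.items() if test(c))


water = Chemical("water", covalent_bonds=2, atom_types=2)
oxygen = Chemical("oxygen gas", covalent_bonds=1, atom_types=1)
salt = Chemical("sodium chloride", covalent_bonds=0, atom_types=2)

for chemical in (water, oxygen, salt):
    print(chemical.name, classify(chemical))
```

Nothing states directly that water is a molecular-compound; that follows from its description, which is the benefit the defined keyword is meant to bring.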


Expansion

• Sketch and encode in cycles
• Build a taxonomy of a small portion
• Then build links to other portions
• Add more detail
• Document sources, author, date and argumentation.


Summary

• An ontology captures knowledge for a shared understanding

• The important question is not whether an artefact is an ontology, but whether it does any good

• Making our understanding of a domain explicit, consistent and processable

• Bioinformatics resources are knowledge resources – they need to be both human and machine understandable