Use of Uberon in the Bgee database: How to deal with a complex, large, dynamic ontology? Frederic Bastian Biocuration 2013

Use of Uberon in the Bgee database: How to deal with a complex, large, dynamic ontology?

May 18, 2015

fbastian

Presentation of the methods used to simplify the display of the Uberon ontology, and to maintain up-to-date annotations to it.
Presented at the Biocuration 2013 conference.
Transcript
Page 1: Use of Uberon in the Bgee database: How to deal with a complex, large, dynamic ontology?

Use of Uberon in the Bgee database:

How to deal with a complex, large, dynamic ontology?

Frederic Bastian

Biocuration 2013

Page 2

© 2013 SIB

A biocurator nightmare?

Ontologies now regularly include thousands of terms.

Complex relations are used, e.g., “transitively proximally connected to”.

Curators are expected to provide complex annotations, e.g., post-composition of terms.

=> How can we simplify the use of complex ontologies?

Page 6

The Bgee database

Description of anatomy and development

Expression data

Homology

http://bgee.unil.ch

Page 7

The Bgee database

http://tinyurl.com/bgee12-hoxa5a

Page 11

Use of anatomical ontologies in Bgee

Several species-specific ontologies were used:

• ZFA

• XAO

• FBbt

• EMAPA, MA

• EHDAA, EV

=> Limits the addition of new species

=> Inconsistent anatomical descriptions, different formalisms adopted, etc.

Page 12

Homology relations between anatomical ontologies

To perform automated comparisons:

• We built groups of homologous organs

• We organized these groups into an ontology

VHOG:0000157 brain

EHDAA:2629 brain
EHDAA:300 brain
EHDAA:830 future brain
EMAPA:16089 future brain
EMAPA:16894 brain
EV:0100164 brain
MA:0000168 brain
XAO:0000010 brain
ZFA:0000008 brain
ZFA:0000146 presumptive brain

Page 13

Homology relations between anatomical ontologies

=> vHOG ontology

vHOG, a multispecies vertebrate ontology of homologous organs groups. Bioinformatics (2012) 28(7): 1017-1020.

Page 14

Homology relations between anatomical ontologies

To add a species to vHOG:

• All groups need to be re-evaluated

• The graph structure needs to be updated

=> Not maintainable in the long run

Page 15

And then came Uberon …

[Diagram: the fruit fly FBbt ‘tibia’ and the human FMA ‘tibia’ shown together with UBERON: tibia and UBERON: bone. Taxa shown: Drosophila melanogaster, Homo sapiens, Vertebrata. Edge labels: is_a, part_of, only_in_taxon.]

Page 17

And then came Uberon …

Uberon also provides a composite ontology:

It merges in terms from species-specific ontologies when they are not present in Uberon.

=> Allows data to be imported from Model Organism Databases.

....
  is_a UBERON:0003059 ! presomitic mesoderm
    devf UBERON:0002329 ! somite
      is_a ZFA:0000073 ! somite 5 (zebrafish)
      is_a ZFA:0000982 ! somite 6 (zebrafish)
      is_a EHDAA2:0001853 ! somite 05 (embryonic human)
      is_a EHDAA2:0001854 ! somite 06 (embryonic human)

Page 20

And then came Uberon … BUT

Uberon is complex:

• About 22 000 terms in the composite ontology

• Use of advanced constructs, supported only in OWL

• Use of high-level abstract terms for interoperability

• Frequently updated, highly responsive

• Structure changes whenever any imported species-specific ontology changes => updated even more often

Page 21

Uberon cannot be easily browsed

Page 23

First step: ontology simplification

1. Simplification of the relations

Keep only is_a, part_of, develops_from.

Map all relations to their ancestors, e.g.:

develops_directly_from => develops_from
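
The relation mapping above can be sketched in Python as follows. This is a minimal illustration, not the Bgee implementation: the PARENT_RELATION table and both function names are hypothetical, and only the develops_directly_from => develops_from entry comes from the slide. Relations with no kept ancestor are assumed to be dropped.

```python
# Relations kept in the simplified ontology (from the slide).
KEPT = {"is_a", "part_of", "develops_from"}

# Hypothetical relation hierarchy: child relation -> parent relation.
# Only the first entry is taken from the slide; the others are assumed
# here purely for illustration.
PARENT_RELATION = {
    "develops_directly_from": "develops_from",
    "immediate_transformation_of": "transformation_of",
    "transformation_of": "develops_from",
}


def simplify_relation(rel):
    """Walk up the relation hierarchy until a kept relation is found.

    Returns the kept ancestor, or None when the relation has no kept
    ancestor (such relations are dropped in this sketch).
    """
    seen = set()
    while rel not in KEPT:
        if rel in seen or rel not in PARENT_RELATION:
            return None
        seen.add(rel)
        rel = PARENT_RELATION[rel]
    return rel


def simplify_edges(edges):
    """Map (child, relation, parent) triples onto kept relations."""
    simplified = set()
    for child, rel, parent in edges:
        mapped = simplify_relation(rel)
        if mapped is not None:
            simplified.add((child, mapped, parent))
    return simplified
```

Returning a set means that two edges collapsing onto the same kept relation are stored only once.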

Page 25

First step: ontology simplification

2. Removal of redundant relations

A is_a B; B is_a C;

=> A is_a C is redundant.

However, we treat part_of and is_a relations as equivalent.

A part_of B; B is_a C

=> A part_of C and A is_a C are both considered redundant.

This removes almost all “is_a anatomical entity” relations.
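
One way to express this rule is as a transitive reduction over the combined is_a/part_of graph. The sketch below is an assumption about how it could be done, not the code used in Bgee: the function name is hypothetical, the graph is assumed to be acyclic, and develops_from edges are left untouched.

```python
from collections import defaultdict

# is_a and part_of are treated as one relation type, as on the slide.
EQUIVALENT = {"is_a", "part_of"}


def remove_redundant(edges):
    """Drop is_a/part_of edges implied by a longer is_a/part_of path.

    edges is an iterable of (child, relation, parent) triples.
    """
    edges = set(edges)
    parents = defaultdict(set)
    for child, rel, parent in edges:
        if rel in EQUIVALENT:
            parents[child].add(parent)

    def reachable_indirectly(start, target):
        # Depth-first search that starts from the other parents of
        # `start`, so the direct start -> target edge is never used.
        stack = [p for p in parents.get(start, ()) if p != target]
        seen = set(stack)
        while stack:
            node = stack.pop()
            if node == target:
                return True
            for p in parents.get(node, ()):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return False

    return {
        (c, r, p)
        for c, r, p in edges
        if r not in EQUIVALENT or not reachable_indirectly(c, p)
    }
```

With the slide's example (A part_of B; B is_a C), both A part_of C and A is_a C are removed, because C is already reachable from A through B.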

Page 27

First step: ontology simplification

3. Removal of relations to upper_level terms

upper_level subset: “abstract upper-level terms not directly useful for analysis”

Almost all terms useful for analysis sit under “upper_level” terms, which makes the hierarchy confusing.

=> Remove relations to “upper_level” terms, unless this would leave the term orphaned.

[Term]
id: MA:0000747
name: lymph organ (mouse)
is_a: UBERON:0001062 ! anatomical entity
relationship: part_of UBERON:0002465 ! lymphoid system

Page 29

First step: ontology simplification

[Term]
id: UBERON:0007502
name: epithelial plexus
is_a: UBERON:0000480 ! anatomical group
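
The two stanzas illustrate both outcomes of the rule: lymph organ keeps its part_of relation and loses the one to anatomical entity, while epithelial plexus keeps its only relation so that it is not orphaned. A minimal Python sketch, assuming that both UBERON:0001062 and UBERON:0000480 belong to the upper_level subset (as the slides suggest) and using a hypothetical function name:

```python
def drop_upper_level_edges(edges, upper_level):
    """Remove edges to upper_level terms unless the child would be orphaned.

    edges is a list of (child, relation, parent) triples and upper_level
    is the set of term ids in the upper_level subset.
    """
    # Children that keep at least one parent outside the upper_level subset.
    has_other_parent = {c for c, _, p in edges if p not in upper_level}
    return [
        (c, r, p)
        for c, r, p in edges
        if p not in upper_level or c not in has_other_parent
    ]


# Term ids taken from the two stanzas on the slides.
UPPER_LEVEL = {"UBERON:0001062", "UBERON:0000480"}  # assumed subset members
EDGES = [
    ("MA:0000747", "is_a", "UBERON:0001062"),      # lymph organ -> anatomical entity
    ("MA:0000747", "part_of", "UBERON:0002465"),   # lymph organ -> lymphoid system
    ("UBERON:0007502", "is_a", "UBERON:0000480"),  # epithelial plexus -> anatomical group
]
SIMPLIFIED = drop_upper_level_edges(EDGES, UPPER_LEVEL)
```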

Page 30

First step: ontology simplification

4. Generate species-specific versions

To simplify the “composite-metazoan” ontology even further, generate a version for each species used in Bgee.
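
The slides do not say how the species-specific versions are produced. One possible approach, sketched here purely as an assumption, is to keep a term when it has no taxon restriction or when its only_in_taxon restriction covers the species. The tables, term labels and function name below are all hypothetical.

```python
# Hypothetical taxon restrictions (term -> taxon it is restricted to).
# Terms absent from this table are assumed to be valid in every species.
ONLY_IN_TAXON = {
    "UBERON bone": "Vertebrata",
    "UBERON tibia": "Vertebrata",
    "FMA tibia": "Homo sapiens",
    "FBbt tibia": "Drosophila melanogaster",
}

# Hypothetical, heavily abbreviated lineages (species -> taxa it belongs to).
LINEAGE = {
    "Homo sapiens": {"Homo sapiens", "Vertebrata"},
    "Drosophila melanogaster": {"Drosophila melanogaster"},
}


def species_version(terms, edges, species):
    """Return the terms valid for `species` and the edges between them."""
    taxa = LINEAGE[species]
    kept = {
        t for t in terms
        if t not in ONLY_IN_TAXON or ONLY_IN_TAXON[t] in taxa
    }
    kept_edges = [(c, r, p) for c, r, p in edges if c in kept and p in kept]
    return kept, kept_edges


# Illustrative input only.
TERMS = {"UBERON bone", "UBERON tibia", "FMA tibia", "FBbt tibia"}
EDGES = [
    ("FMA tibia", "is_a", "UBERON tibia"),
    ("UBERON tibia", "is_a", "UBERON bone"),
]
```

A real pipeline would more likely rely on an OWL reasoner and the taxon constraints shipped with the ontology; the dictionary lookup here only conveys the idea.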

Page 32

Second step: track ontology changes

1. Store annotation status

- “Perfect” annotation: would not need to be refined as long as the term used is not obsoleted.

- “Missing granularity” annotation: a term is missing in the ontology, e.g., vastus lateralis.

If a new child is later added to the term, refine the annotation.

Page 33

Second step: track ontology changes

2. Track ontology changes

- Compare the versions used between two annotation cycles.

- If a term used in a “missing granularity” annotation has new children, refine the annotation.
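
The comparison between two annotation cycles can be sketched as below. The data layout and function name are assumptions made for illustration: annotations carry one of the two statuses described on the slides, and each ontology version is reduced to a mapping from a term to its set of children. Annotations to obsoleted terms are also flagged, since even a “perfect” annotation needs attention in that case.

```python
def annotations_to_review(annotations, old_children, new_children,
                          obsolete=frozenset()):
    """Return (annotation_id, reason) pairs that need curator attention.

    annotations: dict annotation_id -> (term, status), where status is
        "perfect" or "missing granularity".
    old_children, new_children: dict term -> set of child terms in the
        ontology versions of the previous and current annotation cycle.
    obsolete: terms obsoleted in the new version.
    """
    to_review = []
    for ann_id, (term, status) in sorted(annotations.items()):
        if term in obsolete:
            to_review.append((ann_id, "term obsoleted"))
        elif status == "missing granularity":
            added = new_children.get(term, set()) - old_children.get(term, set())
            if added:
                reason = "new children: " + ", ".join(sorted(added))
                to_review.append((ann_id, reason))
    return to_review
```

For example, an annotation made to a parent muscle term because vastus lateralis was missing would be flagged as soon as vastus lateralis appears among that term's children, while “perfect” annotations are skipped.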

Page 34

Conclusion 1/2

To manage a complex, frequently updated ontology:

1. Provide a formal version for the reasoning, and a simplified view for the end-user.

2. Store annotation status, to focus only on annotations which need to be updated.

Page 36

Conclusion 2/2

Major update of Bgee coming in fall 2013:

- All expression data annotations are being transferred to Uberon.

- All homology information is being transferred from vHOG to Uberon, using an external file.

And also:

- Besides present/absent calls, Bgee will include: overexpression calls; biologically significant expression.

- Revamped interfaces, webservices, APIs, …

Page 37

Advertisement! Other Bgee-related work

Poster 145:

Average rank IQR: a new improved method for Affymetrix microarray quality control for meta-analyses and database curation.

Marta Rosikiewicz

Database biocuration virtual issue:

Uncovering hidden duplicated content in public transcriptomics data
Marta Rosikiewicz, Aurélie Comte, Anne Niknejad, Marc Robinson-Rechavi, and Frederic B. Bastian
Database Vol. 2013, bat010; doi:10.1093/database/bat010

Page 38

Thank You

Marta Rosikiewicz

Sébastien Moretti

Komal Sanjeev

Anne Niknejad

Aurélie Comte

Mathieu Seppey

Marc Robinson-Rechavi

And also:

• Melissa Haendel

• Chris Mungall