Uberon – a multi-species ontology for phenomics and evo-devo analyses Chris Mungall, LBNL Melissa Haendel, OHSU Lausanne Feb 2012
May 11, 2015
Outline
• Introduction to Bio-Ontologies
– Ontologies for data analysis and data integration
– Anatomy ontology re-use vs variation in nature
• Uberon
– Integration with species anatomy ontologies
– Interoperation with non-anatomy ontologies
– Reasoning and validation
– Handling taxonomic variation
– Applications
• Homology
• Conclusions
[Figure: lung anatomy graph. Nodes: lung; thoracic cavity organ; thoracic cavity; lung alveolus; organ system; respiratory system; lung bud; respiratory primordium. Edge types: is_a (SubClassOf), part_of, develops_from]
Ontologies abstract over repeated patterns in nature
Logical semantics: the difference between ontologies and graphs
x instance-of lung ⇒ x instance-of ‘thoracic cavity organ’
Logical semantics: the difference between ontologies and graphs
x instance-of lung ⇒ exists y: y instance-of ‘lung bud’ and x develops-from y
Formal semantics allows for more precise queries
✔ x expressed in y & y part of z ⇒ x expressed in z
✗ x expressed ubiquitously in y & y part of z ⇒ x expressed ubiquitously in z
[Figure labels: Plunc; expressed in; (inferred)]
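The first rule above can be sketched in a few lines of Python. This is only an illustration: the part_of edges and the assertion that Plunc is expressed in the lung are toy data assumed for the example, and the function names are hypothetical.

```python
# Toy part_of edges (assumed for illustration, not taken from any ontology release)
PART_OF = {
    "lung alveolus": "lung",
    "lung": "respiratory system",
}

def ancestors_via_part_of(structure, part_of=PART_OF):
    """Return every structure that `structure` is transitively part of."""
    result = []
    while structure in part_of:
        structure = part_of[structure]
        result.append(structure)
    return result

def infer_expressed_in(gene, structure):
    """Apply: x expressed in y & y part of z => x expressed in z."""
    sites = [structure] + ancestors_via_part_of(structure)
    return [(gene, "expressed in", site) for site in sites]

# Asserted: Plunc expressed in lung. Inferred: Plunc expressed in respiratory system.
inferred = infer_expressed_in("Plunc", "lung")
```

The same propagation would be wrong for "expressed ubiquitously in", which is why the relation has to be chosen with its logical semantics in mind.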
Ontology Languages
• Web Ontology Language (OWL)
– Standard set of logical constructs for building an ontology
– Many syntaxes
• OWL-RDF/XML
• OWL-XML
• Manchester
– Many reasoners
• OBO-Format
– Currently formalized by mapping to a subset of OWL
• can be treated as another OWL syntax
Many perspectives, many ontologies
[Figure labels: chemical entities; gross anatomy; tissues; cells; cell anatomy; proteins; phenotypes; clinical disorders; processes; physiological processes; development; reactions; cellular processes; behavior; evolutionary characters; nervous system]
[Figure: the lung and related classes as represented in GO, MPO, MA, FMA and EHDAA2. Labels: lung; respiratory gaseous exchange; lobular organ; parenchymatous organ; solid organ; pleural sac; thoracic cavity organ; thoracic cavity; multicellular organismal process; abnormal lung morphology; abnormal respiratory system morphology; abnormal pulmonary acinus morphology; abnormal pulmonary alveolus morphology; lung alveolus; respiratory system process; organ system; respiratory system; lower respiratory tract; alveolar sac; pulmonary acinus; lung bud; respiratory primordium; pharyngeal region. Edge types: is_a (SubClassOf), part_of, develops_from, surrounded_by]
The problem: Data Silos
The OBO Foundry
• Avoid silo-ization via ontologies that are
– open
– documented
– reusable
– interoperable
– built according to shared principles
– reuse core relations and patterns
• Problem:
– How do we re-use in the presence of variability?
http://obofoundry.org
Ontologies built for one species will not work for others
http://fme.biostr.washington.edu:8080/FME/index.html
http://ccm.ucdavis.edu/bcancercd/22/mouse_figure.html
Generalization leads to complexity
[Figure labels: cell; nucleate cell; enucleate cell; erythrocyte; nucleate erythrocyte; enucleate erythrocyte; human erythrocyte; zebrafish erythrocyte]
Variables:
V : Variability of entities in domain
P : Logical precision of queries, TP/(TP+FP*c)
L : “Latticeyness” of class hierarchy (‘exception hierarchy’)
Hypothesis:
L = kPV
Anatomy Ontology Menagerie
• Mouse:
– MA (adult)
– EMAP / EMAPA (embryonic)
• Human
– FMA (adult)
– EHDAA2 (CS1-CS20)
• Amphibian
– AAO
– XAO
• Fish
– ZFA
– TAO
• Nematode
– WBbt
• Arthropod
– FBbt (Drosophila)
– HAO
– Arthropod anatomy ontology
Reduced taxonomic scope = Reduced complexity
Contrast to: Gene Ontology (GO) (all kingdoms of life)
Historically little coordination
Sept 2011
Semantic Similarity of Phenotypes
"Linking Human Diseases to Animal Models Using Ontology-Based Phenotype Annotation." PLoS Biol 7(11): e1000247. doi:10.1371/journal.pbio.1000247 Washington NL, Haendel MA, Mungall CJ, Ashburner M, Westerfield M, Lewis SE
FMA+PATO MP ZFA+PATO FBbt+PATO
The problem with mappings
Class A                              Class B              In Bioportal?   Useful?
FMA extensor retinaculum of wrist    MA retina            Yes             No
FMA portion of blood                 MA blood             No              Yes
ZFA Macula                           MA macula            Yes             No
ZFA aortic arch                      MA arch of aorta     Yes             Dubious
ZFA hypophysis                       MA pituitary         No              Yes
FMA tibia                            FBbt tibia           Yes             No
FMA colon                            GAZ Colón, Panama    Yes             No
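A toy matcher shows how rows like these can arise. It is a hypothetical exact-label matcher written for illustration, not the software that produced the mappings in the table. The labels come from the table; everything else is assumed.

```python
# Labels copied from the table above; the matcher itself is hypothetical
LABELS = {
    "FMA": ["tibia", "colon", "portion of blood"],
    "FBbt": ["tibia"],
    "MA": ["blood"],
}

def exact_label_matches(source, target, labels=LABELS):
    """Pair classes whose labels are identical after lower-casing."""
    target_set = {label.lower() for label in labels[target]}
    return [label for label in labels[source] if label.lower() in target_set]

# False positive: a vertebrate bone is matched to an insect leg segment
fly_matches = exact_label_matches("FMA", "FBbt")

# False negative: "portion of blood" and "blood" denote the same thing but do not match
mouse_matches = exact_label_matches("FMA", "MA")
```

Label identity alone neither guarantees nor exhausts useful mappings, which is the motivation for curated grouping classes.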
Our solution (2008-2009)
• Create grouping classes for mappings
– Used our own software for entity matching
– Manually split/merge in OboEdit using curator knowledge
– Internal joke name: Uberon
• Used in phenotype analysis
– Washington et al
– http://owlsim.org
• We kept on tweaking
– Used for GO logical definitions
– Used in cell ontology
– Used to clarify and align existing AOs
– Integrated logic-based methods
– We got criticized, we got better
• Fast forward to 2012…
Uberon in 2012
• Size:
– >6500 classes
– >19000 relationships (50 relations)
– >2000 logical definitions
• Scope:
– Metazoa
• vertebrate bias, in particular mammals
• Availability
– many versions, in obo and owl
• http://uberon.org
– Source version is obo, compiled to owl using Oort
• What does it look like?
[Figure: Uberon lung classes and their links to other ontologies. Labels: anatomical structure; organ; organ part; respiration organ; lung; alveolus; alveolus of lung; pulmonary acinus; alveolar sac; lung bud; lung primordium; respiratory primordium; swim bladder; foregut; endoderm; endoderm of foregut; FMA:lung; MA:lung; MA:lung alveolus; FMA: pulmonary alveolus; EHDAA:lung bud; GO: respiratory gaseous exchange; NCBITaxon: Mammalia; NCBITaxon:Actinopterygii. Edge types: is_a (SubClassOf), is_a (taxon equivalent), part_of, develops_from, capable_of, only_in_taxon]
Uberon classes generalize species-specific ones, and connect to other ontologies via a variety of relations
Inter-ontology bridging axioms
• Equivalence axioms:
– lung (FMA) EquivalentTo lung (Ubr) and ‘part of’ some NCBITaxon_9606
– lung (MA) EquivalentTo lung (Ubr) and ‘part of’ some NCBITaxon_10090
• Subclass axioms:
– lung (EMAPA) SubClassOf lung (Ubr)
• Axioms are maintained as xrefs
– Translated to full axioms in obo2owl translation (header tags)
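The xref-to-axiom expansion can be pictured as follows. This is a minimal sketch that only renders strings in a Manchester-like notation. It is not the obo2owl translator, and the prefix-to-taxon table is an assumption based on the two equivalence examples above.

```python
# Assumed prefix -> taxon table, taken from the two equivalence examples above
TAXON_FOR_PREFIX = {"FMA": "NCBITaxon_9606", "MA": "NCBITaxon_10090"}

def bridge_axiom(uberon_label, xref_prefix):
    """Render a bridging axiom for one xref as a Manchester-like string."""
    external = f"{uberon_label} ({xref_prefix})"
    taxon = TAXON_FOR_PREFIX.get(xref_prefix)
    if taxon is None:
        # No taxon configured for this prefix: emit a plain subclass axiom
        return f"{external} SubClassOf {uberon_label} (Ubr)"
    return f"{external} EquivalentTo {uberon_label} (Ubr) and 'part of' some {taxon}"

axioms = [bridge_axiom("lung", prefix) for prefix in ("FMA", "MA", "EMAPA")]
```

Keeping the xrefs as the source of truth means the axioms can be regenerated whenever the taxon table or the xrefs change.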
Import closure of Uberon ‘collector’ ontologies
Different ontology modules
ontology    contents
basic       simple relationships
uberon      main ontology
merged      main ontology + links to GO, CL, NCBITaxon, NBO

taxon    collected    merged    basic
metazoan ✔ ✔
vertebrate ✔ ✔
amniote ✔
aves ✔
euarchontoglires ✔
Collected vs Merged
[Figure labels: somite (Ubr); somite (ZFA); somite 1 (ZFA); somite (Ubr) [includes ZFA axioms as GCIs]; somite 1 (ZFA)]
Logical definitions in GO using Uberon
GO:notochord formation: The formation of the notochord from the chordamesoderm. The notochord is composed of large cells packed within a firm connective tissue sheath and is found in all chordates at the ventral surface of the neural tube. In vertebrates, the notochord contributes to the vertebral column.
Cross-Product Extensions of the Gene Ontology Journal of Biomedical Informatics 2010. Christopher J. Mungall and Michael Bada and Tanya Z. Berardini and Jennifer Deegan and Amelia Ireland and Midori A. Harris and David P. Hill and Jane Lomax
Uberon and phenotype ontologies
[Figure labels: MA:blood vessel; MA: retina; UBERON: retinal blood vessel; MP:abnormal retinal blood vessel morphology; HP: Central retinal artery vascular tortuosity; FMA:central retinal artery. Edge types: inheres_in, is_a]
Logical definitions in CL using Uberon
[Figure labels: UBERON: trachea; CL: tracheal epithelial cell; CL: epithelial cell. Edge types: is_a, part_of]
Uberon trachea: A trachea held open by up to 20 C-shaped rings of cartilage. The trachea is the portion of the airway that attaches to the bronchi as it branches.
Terrence Meehan, Anna Maria Masci, Amina Abdulla, Lindsay Cowell, Judith Blake, C J Mungall, Alexander Diehl (2011) Logical Development of the Cell Ontology, 6. In BMC Bioinformatics 12 (1)
[Figure labels: UBERON: epithelium. Edge type: part_of]
Uberon logical definitions represent functional, developmental, spatial, etc., axes of classification
Logical definitions in Uberon using external ontologies
[Figure labels: UBERON: trachea; UBERON: respiratory airway; UBERON: respiratory system; CL: tracheal epithelial cell; CL: epithelial cell; GO: respiratory gaseous exchange. Edge types: is_a, part_of, capable_of]
J Deegan, E Dimmer, C J Mungall (2010) Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development, 530. In BMC bioinformatics 11 (1).
Uberon taxon constraints
[Figure labels: UBERON: trachea; UBERON: respiratory airway; UBERON: respiratory system; CL: tracheal epithelial cell; CL: epithelial cell; GO: respiratory gaseous exchange; Vertebrata. Edge types: is_a, part_of, capable_of, only_in_taxon]
[Cycle diagram]
• Text matching: stem and synonym matching
• Curation: manual adding of new classes; obsoletion, merging, splitting
• Reasoning
– Keep axioms that are consistent across AOs
– Automated consistency checks for disjointness violations
• Axioms encoded in OWL provide explicit semantics
Uberon iterative development cycle
Developmental Biology, Scott Gilbert, 6th ed.
Using reasoners to detect errors
[Figure labels: Fruit fly FBbt ‘tibia’; Human FMA ‘tibia’; UBERON: tibia; UBERON: bone; Vertebrata; Drosophila melanogaster; Homo sapiens. Edge types: is_a, part_of, only_in_taxon, disjoint with. The fruit fly mapping is marked ✗]
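One reading of the tibia example is that a taxon constraint on a superclass exposes a bad mapping. The sketch below checks that with plain dictionaries rather than an OWL reasoner. All axioms are toy data assumed for illustration, and the function names are hypothetical.

```python
# Toy axioms mirroring the tibia example; assumed for illustration only
IS_A = {
    "FBbt:tibia": "UBERON:tibia",
    "FMA:tibia": "UBERON:tibia",
    "UBERON:tibia": "UBERON:bone",
}
ONLY_IN_TAXON = {"UBERON:bone": "Vertebrata"}
PART_OF_TAXON = {"FBbt:tibia": "Drosophila melanogaster", "FMA:tibia": "Homo sapiens"}
TAXON_IS_A = {"Homo sapiens": "Vertebrata"}  # Drosophila has no path to Vertebrata

def chain(start, edges):
    """Follow single-parent edges from `start` and return the whole path."""
    path = [start]
    while path[-1] in edges:
        path.append(edges[path[-1]])
    return path

def violations(cls):
    """List (superclass, required taxon) pairs that the class's taxon fails."""
    lineage = chain(PART_OF_TAXON[cls], TAXON_IS_A)
    return [
        (sup, ONLY_IN_TAXON[sup])
        for sup in chain(cls, IS_A)
        if sup in ONLY_IN_TAXON and ONLY_IN_TAXON[sup] not in lineage
    ]

fly_errors = violations("FBbt:tibia")
human_errors = violations("FMA:tibia")
```

An OWL reasoner reaches the same conclusion by finding the fly class unsatisfiable, given that the two taxa are declared disjoint.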
Spatial disjointness axioms
• Example:
– (part_of some midbrain) DisjointWith (part_of some hindbrain)
• Note: part_of implies all parts are part of
– Brain spatial axioms derived from ABA
– Used to find problems in existing mouse ontologies
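A spatial disjointness check of this kind can be sketched without a reasoner. The structure names below are placeholders, and the part_of lists are assumed to be already transitively closed.

```python
# Pairs of regions declared spatially disjoint (from the example above)
DISJOINT_REGIONS = [("midbrain", "hindbrain")]

# Placeholder classes with their (already transitively closed) part_of targets
PART_OF = {
    "structure A": ["midbrain", "brain"],
    "structure B": ["midbrain", "hindbrain", "brain"],
}

def disjointness_violations(part_of=PART_OF, disjoint=DISJOINT_REGIONS):
    """Return classes asserted to be part of two regions declared disjoint."""
    found = []
    for cls, wholes in part_of.items():
        for region_a, region_b in disjoint:
            if region_a in wholes and region_b in wholes:
                found.append((cls, region_a, region_b))
    return found

problems = disjointness_violations()
```

Each reported class is a candidate curation error: either one of its part_of links is wrong or the disjointness axiom is too strong.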
Differences in bone and bone tissue representation
Ontology alignment
Using Uberon for alignment facilitates identification of missing classes
Ontology alignment
Managing variation: named subtypes
• ‘mammary gland’ part of some ‘female thoracic region’
– humans ✔
– other mammals ✗
• Solution:
– mammary gland
• thoracic mammary gland
• abdominal mammary gland
• inguinal mammary gland
Managing variation: general axioms
• adenohypophysis develops from some ‘Rathke’s pouch’
– tetrapoda ✔
– teleost ✗
• Named subtypes solution
– ‘Rathke’s pouch-derived adenohypophysis’
• ugly!
• Alternative:
– use anonymous classes / OWL general axioms:
• (adenohypophysis and part of some tetrapoda) develops from some ‘Rathke’s pouch’
the adenohypophysis has different developmental origins in different species - while in most basal fish and tetrapods the adenohypophyseal anlagen invaginates to form Rathke’s pouch, in teleost fish the adenohypophyseal placode does not invaginate but rather maintains its initial organization forming a solid structure in the head
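The effect of a taxon-restricted general axiom can be sketched as a lookup that only fires when the species falls under the stated taxon. The lineage table is a deliberately simplified assumption, and the function names are hypothetical.

```python
# Simplified, assumed lineage: each species points straight at the clade of interest
TAXON_IS_A = {"Homo sapiens": "Tetrapoda", "Danio rerio": "Teleostei"}

# (class, taxon condition, relation, filler), following the general axiom above
GENERAL_AXIOMS = [
    ("adenohypophysis", "Tetrapoda", "develops_from", "Rathke's pouch"),
]

def in_taxon(species, taxon):
    """True if `taxon` is the species itself or one of its ancestors."""
    current = species
    while current is not None:
        if current == taxon:
            return True
        current = TAXON_IS_A.get(current)
    return False

def develops_from(cls, species):
    """Fillers of develops_from that hold for `cls` in the given species."""
    return [
        filler
        for axiom_cls, taxon, relation, filler in GENERAL_AXIOMS
        if axiom_cls == cls and relation == "develops_from" and in_taxon(species, taxon)
    ]

human_origin = develops_from("adenohypophysis", "Homo sapiens")
zebrafish_origin = develops_from("adenohypophysis", "Danio rerio")
```

No named subtype is needed: the single class keeps its name and the condition lives in the axiom.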
Pharyngeal derivatives
• Pharyngeal pouches 1-5
– dorsal and ventral parts
• Give rise to different structures in different clades
– E.g.
• parathyroid from ventral pouch 3 & 4 in many vertebrates
• in humans, from dorsal pouches 3 and 4
– Kardong, Vertebrates
• All encoded in Uberon using general axioms
A logic for developmental relationships
• Most AOs use a single generic develops from relationship
– FBbt distinguishes between direct and transitive development
– EHDAA2 includes ‘develops in’
• Different structures give different contributions
– E.g. neural crest
– Modeled explicitly in EHDAA2
• develops from relationships at very specific leaf nodes
• Relation composition
– has_part o develops_from -> has_developmental_contribution_from
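Relation composition can be illustrated with sets of pairs. The three names below are placeholders rather than ontology classes, and `compose` is a hypothetical helper.

```python
def compose(first, second):
    """Compose two binary relations given as sets of (subject, object) pairs."""
    return {(a, d) for (a, b) in first for (c, d) in second if b == c}

# Placeholder assertions: a structure has a component, and that component
# develops from some precursor
HAS_PART = {("structure", "component")}
DEVELOPS_FROM = {("component", "precursor")}

# has_part o develops_from -> has_developmental_contribution_from
HAS_DEVELOPMENTAL_CONTRIBUTION_FROM = compose(HAS_PART, DEVELOPS_FROM)
```

The composed relation is weaker than develops_from: it says only that some part of the structure derives from the precursor.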
Credit: Osumi-Sutherland, Haendel and Bard
Provenance for relations
[Term]
id: UBERON:0005562
name: thymus primordium
…
relationship: has_developmental_contribution_from UBERON:0010023 {gci_relation="part_of", gci_filler="NCBITaxon:7778", notes="Elasmobranchii", source="ISBN10:0073040584-table13.1"} ! dorsal part of pharyngeal pouch 2

OWL:
(‘thymus primordium’ and part_of some NCBITaxon_7778) SubClassOf has_developmental_contribution_from ‘dorsal part of pharyngeal pouch 2’
Annotations: source "ISBN10:0073040584-table13.1"
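To make the mapping from the OBO line to the OWL general axiom concrete, here is a small regex-based sketch. It is not the real OBO parser and handles only the shape of the single line shown above. The function names and the output notation are assumptions.

```python
import re

LINE = (
    'relationship: has_developmental_contribution_from UBERON:0010023 '
    '{gci_relation="part_of", gci_filler="NCBITaxon:7778", notes="Elasmobranchii", '
    'source="ISBN10:0073040584-table13.1"} ! dorsal part of pharyngeal pouch 2'
)

def parse_relationship(line):
    """Split a relationship line into relation, target and trailing qualifiers."""
    match = re.match(r'relationship:\s+(\S+)\s+(\S+)\s*(?:\{(.*?)\})?', line)
    relation, target, qualifier_text = match.groups()
    qualifiers = dict(re.findall(r'(\w+)="([^"]*)"', qualifier_text or ""))
    return relation, target, qualifiers

def to_gci(subject, line):
    """Render the relationship as a (possibly taxon-restricted) subclass axiom."""
    relation, target, qualifiers = parse_relationship(line)
    left = subject
    if "gci_relation" in qualifiers:
        left = f"({subject} and {qualifiers['gci_relation']} some {qualifiers['gci_filler']})"
    return f"{left} SubClassOf {relation} some {target}"

axiom = to_gci("UBERON:0005562", LINE)
source = parse_relationship(LINE)[2]["source"]
```

The `source` qualifier is kept separately so that it can be attached to the axiom as an annotation.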
Use of Uberon enhances species-specific ontologies
• Many ontologies lack develops from relationships
– mouse
• MA ✗
• EMAPA ✗
– human
• FMA ✗
• EHDAA2 ✔
• SNOMED-CT ✗
• These can be enhanced by the develops from relationships in Uberon
– E.g. find all pharyngeal arch derivatives
• Combine with Bgee expression data for powerful queries
– E.g. compare gene expression patterns for pharyngeal arch derivatives
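A "find all derivatives" query amounts to a transitive walk over develops_from. The sketch below uses a two-step lung lineage as assumed toy data in place of the pharyngeal arch, and `derivatives` is a hypothetical helper.

```python
# Assumed toy develops_from edges: structure -> list of direct precursors
DEVELOPS_FROM = {
    "lung": ["lung bud"],
    "lung bud": ["respiratory primordium"],
}

def derivatives(precursor, develops_from=DEVELOPS_FROM):
    """Return every structure that develops, directly or indirectly, from `precursor`."""
    found = set()
    changed = True
    while changed:
        changed = False
        for structure, sources in develops_from.items():
            if structure in found:
                continue
            if any(src == precursor or src in found for src in sources):
                found.add(structure)
                changed = True
    return found

lung_lineage = derivatives("respiratory primordium")
```

The resulting set could then be used to select expression records annotated to any of the derived structures.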
Use of Uberon as building block for other ontologies
• Basic science
– CL
– GO
– NBO (behavior)
– Phenotype (MP, HP)
• Applications
– OBI
– eagle-i
Applications of Uberon in bioinformatics analyses
• Crucial lynchpin in a number of phenotype analyses
– Washington, Haendel et al
– Mousefinder
– Phenomenet
• Expression analyses
– FANTOM5
Uberon and homology
• Uberon classes do not need to be homologous
• We try to state necessary and sufficient conditions for all classes
– Genus: parent class
– Differentia may be any mix of:
• Locational
• Histological
• Structural
• Functional
• Developmental
• Or homology!
• This is essentially essentialist
– ‘essentialist’ may make evo-devo folks uncomfortable, but it’s how most ontologies work
Eyes
• Eye: organ and has function in go:visual perception
– Compound eye: has part ommatidia
– Camera-type eye: equivalent to vHOG eye
• vertebrate-type*
• cephalopod-type*
*Not yet in ontology
adrenal gland – interrenal gland
• Single class in vHOG
• Distinct classes in Uberon
– Score highly on semantic similarity measures due to has_part relationships to cell types
• Homology can be handled separately
• Open question:
– interrenal gland vs bodies?
– Homology at the level of gland or cortex?
Using Uberon and vHOG together
• UberHOG?
Using Uberon and vHOG together
• vHUG?
Proposal
• Separation of concerns
– essentialist definitions
– homology relationships
• Create ‘homology knowledgebase’
– Statements anchored to Uberon classes
• E.g.
– lung (Ubr) has property: homologous, has_evidence …
– head kidney + bone marrow, has property: homologous, has_evidence …
– Use homology ontology
– Contributions from vHOG and Phenoscape
• Automatically aggregate for powerful queries
Conclusions
Anatomy ontologies have been developed independently and do not integrate well without additional help
• Uberon generalizes over species-specific anatomy classes
• Includes detailed anatomical knowledge via a variety of relationships
• designed for reasoning
• Highly interconnected with other ontologies
• Homology is largely separated
• Growing number of applications
• For more info:
• http://uberon.org
http://genomebiology.com/2012/13/1/R5
Acknowledgments
• Uberon: Melissa Haendel, George Gkoutos, Carlo Torniai, Suzanna Lewis
• Ontologies: Jonathan Bard (EHDAA2), Terry Meehan (CL), Alex Diehl (CL), Terry Hayamizu (MA/CL), Onard Mejino (FMA), David Hill (GO), David Osumi Sutherland (FBbt/CARO), Paul Schofield (MPATH), Wasilla Dahdul (TAO/VAO), Paula Mabee (TAO/VAO), Erik Segerdell (XAO), Monte Westerfield (ZFA), Cynthia Smith (MP), Maryanne Martone (NIF), Frederic Bastian (vHOG), Marc Robinson-Rechavi (vHOG)
• Contributions: Alan Ruttenberg, Rob Hoehndorf, Wacek Kusnierczyk, Harry Hochheiser