November 1st, 2016

The Human Phenotype Ontology Project (HPO)

History of HPO

Current status

Use cases

HVP/HUGO Variant Detection Training Course: Variant Effect Prediction

Dr. Sebastian Köhler

drseb.github.io

Why do phenotypes matter?

❖ Phenotypic abnormality = clinical feature

❖ Constellation/pattern of phenotypes/clinical features defines a disease:

❖ … is a rare developmental disorder defined by the combination of aplasia cutis congenita of the scalp vertex and terminal transverse limb defects. In addition, vascular anomalies such as cutis marmorata telangiectatica … are recurrently seen.

MIM

❖ The single most valuable resource for human genetics

❖ 12 printed editions between 1966 and 1998

❖ Online for almost two decades

Victor McKusick with printed MIM www.hhmi.org/biointeractive/museum/exhibit98/content/h1_7info.html

OMIM

• Contains Clinical Synopsis (CS) section

• Free text phenotypic description

• Very expressive

(Un)Controlled Vocabularies

❖ Non-standardized method for describing phenotypes

❖ Not designed to be easily machine interpretable

❖ Spelling problems

Incomplete: Fulltext contains phenotype information; absent in CS

Inconsistent: No handling of synonyms, e.g. ‘Height: short stature’, ‘Reduced adult height’, ‘Final adult height, 84-128 cm’

CS contains symptoms such as: ‘Heart: Prolonged QTc interval’ or ‘T-wave abnormalities’

Imagine a query for ‘ECG abnormalities’: how can we ensure that the examples above are found?

E.g.:

hypereflexia → hyperreflexia

congential → congenital

defeciency → deficiency

Homonyms

... fibrillation ... = muscle fibrillation

... fibrillation ... = ventricular fibrillation

fibrillation ≠ fibrillation

Motivation

OMIM query               Number of results
large bones              264
large bone               785
enlarged bones           87
enlarged bone            156
big bones                16
huge bones               4
massive bones            28
hyperplastic bones       12
hyperplastic bone        40
bone hyperplasia         134
increased bone growth    612

Washington et al. Linking human diseases to animal models using ontology-based phenotype annotation. PLoS Biology (2009)

Goal of HPO

❖ We want computer-interpretable clinical features!

❖ Compare diseases based on clinical features

❖ Compare patients based on clinical features

❖ Compare patient with diseases based on clinical features

❖ …

❖ Easy to use and freely available

The Human Phenotype Ontology (HPO)

❖ Description of phenotypic abnormalities (or clinical features) in humans

❖ Synonyms merged into one term

❖ Creation of textual and logical definitions for each term

[Figure: excerpt of the HPO hierarchy with the terms phenotypic abnormality, abnormality of the nervous system, abnormality of the central nervous system, abnormality of movement, cerebral inclusion bodies, neurofibrillary tangles, incoordination, ataxia, gait disturbance, gait ataxia]

The Human Phenotype Ontology (HPO)

id: HP:0002185
name: Neurofibrillary tangles
def: Pathological protein aggregates formed by hyperphosphorylation of a microtubule-associated protein known as tau, causing it to aggregate in an insoluble form. [HPO:sdoelken]
synonym: Neurofibrillary tangles may be present EXACT []
synonym: Paired helical filaments EXACT []
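The term record above uses a tag-value layout, one `tag: value` pair per line, where a tag such as `synonym` may repeat. As a minimal sketch (not the HPO project's own tooling), such a record could be read into a dictionary as follows. The function name `parse_stanza` is hypothetical, and the text is taken as printed on the slide, without the quotation marks that OBO files normally place around definitions and synonyms.

```python
from collections import defaultdict

# The record as shown on the slide (one tag-value pair per line).
STANZA = """\
id: HP:0002185
name: Neurofibrillary tangles
def: Pathological protein aggregates formed by hyperphosphorylation of a microtubule-associated protein known as tau, causing it to aggregate in an insoluble form. [HPO:sdoelken]
synonym: Neurofibrillary tangles may be present EXACT []
synonym: Paired helical filaments EXACT []
"""


def parse_stanza(text):
    """Collect 'tag: value' lines; repeated tags (e.g. synonym) accumulate in a list.

    Hypothetical helper for illustration only, not part of any HPO library.
    """
    record = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at the first ': ' so that IDs such as 'HP:0002185' stay intact.
        tag, _, value = line.partition(": ")
        record[tag].append(value)
    return dict(record)


term = parse_stanza(STANZA)
print(term["id"][0], term["name"][0], len(term["synonym"]))
```

Keeping every value in a list means single-valued tags (`id`, `name`) and multi-valued tags (`synonym`) are handled the same way.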

The Human Phenotype Ontology (HPO)

❖ Semantic relations (‘subclass of’, ‘is a’)

❖ From top to bottom terms get more specific

[Figure: the hierarchy excerpt shown earlier (phenotypic abnormality at the top, down to neurofibrillary tangles and gait ataxia), with every edge labelled ‘is a’]

Annotation of diseases

❖ Terms of the HPO are used to annotate (describe) diseases

❖ E.g. neurofibrillary tangles is used to annotate Alzheimer Disease

❖ Note: Annotation with neurofibrillary tangles induces annotation to all ancestor terms (transitive)

❖ We provide annotations to common and rare diseases
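The transitive rule in the note above can be sketched in a few lines. The term names come from the slide's figure, but the exact ‘is a’ edges in `PARENTS` are an assumption made for illustration, since the figure's edges are not reproduced in this transcript; `ancestors` is a hypothetical helper.

```python
# Assumed 'is a' edges (child -> parents); term names from the slide, edges illustrative.
PARENTS = {
    "neurofibrillary tangles": ["cerebral inclusion bodies"],
    "cerebral inclusion bodies": ["abnormality of the central nervous system"],
    "abnormality of the central nervous system": ["abnormality of the nervous system"],
    "gait ataxia": ["ataxia", "gait disturbance"],
    "ataxia": ["incoordination"],
    "incoordination": ["abnormality of movement"],
    "gait disturbance": ["abnormality of movement"],
    "abnormality of movement": ["abnormality of the nervous system"],
    "abnormality of the nervous system": ["phenotypic abnormality"],
}


def ancestors(term):
    """Return the term itself plus every term reachable via 'is a' edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


# Direct annotation as on the slide; the induced set adds all ancestor terms.
direct = {"Alzheimer Disease": {"neurofibrillary tangles"}}
induced = {
    disease: set().union(*(ancestors(t) for t in terms))
    for disease, terms in direct.items()
}

for term in sorted(induced["Alzheimer Disease"]):
    print(term)
```

With these assumed edges, a query for ‘abnormality of the nervous system’ finds Alzheimer Disease even though the disease was annotated only with the more specific term.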

Köhler et al. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. NAR (2014)

Current Status

❖ 4 root classes:

❖ Phenotypic abnormality, Mode of Inheritance, Clinical modifier, Mortality/Ageing

❖ 11,813 classes/terms in HPO

❖ ~124,000 annotations of 7,700 rare diseases from OMIM, Orphanet, DECIPHER

❖ ~133,000 annotations of 3,145 common diseases

Other Resources?

❖ Do we really need HPO?

Winnenburg and Bodenreider. Coverage of phenotypes in standard terminologies. Proceedings of the ISMB (2014)

Recent projects

❖ Over 6,000 layperson synonyms added

❖ Hypoplastic kidney → Small kidneys

❖ Nephropathy → Kidney damage

❖ Lobulated tongue → Bumpy tongue

❖ Important for patient-reported phenotype data

See slides: http://www.slideshare.net/NicoleVasilevsky/enhancing-the-human-phenotype-ontology-for-use-by-the-layperson-64669468

Recent projects

❖ Translations of labels, synonyms and textual definitions (crowd-sourcing)

Adoption of HPO

Köhler et al. The Human Phenotype Ontology in 2017. NAR (2016), to appear in a few days

Clinical genetics

❖ Requires: Conversion tables for legacy vocabularies

❖ Done by text-mining followed by manual curation

❖ LDDB (✓)

❖ Orphanet (✓) (they are now using HPO directly)

❖ Possum (?)

❖ MedDRA

❖ UMLS (completely incorporated now)

Applications of HPO

Semantic similarity

❖ Basic idea of ontological search: no exact match is needed; semantically similar diseases still score well.

❖ Imagine a BLAST search for sets of clinical features (Phenomizer).

❖ More in the practical session on Wednesday (Room: Mouses)
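To illustrate how a search can score well without an exact match, here is a minimal sketch of one common approach: information-content (Resnik-style) similarity over the shared ancestors of two terms. The slides do not specify the measure, so this choice is an assumption, as are the ‘is a’ edges, the disease names, and their annotations, all of which are invented for the example.

```python
import math

# Assumed 'is a' edges (child -> parents); illustrative only.
PARENTS = {
    "neurofibrillary tangles": ["cerebral inclusion bodies"],
    "cerebral inclusion bodies": ["abnormality of the central nervous system"],
    "abnormality of the central nervous system": ["abnormality of the nervous system"],
    "gait ataxia": ["ataxia", "gait disturbance"],
    "ataxia": ["incoordination"],
    "incoordination": ["abnormality of movement"],
    "gait disturbance": ["abnormality of movement"],
    "abnormality of movement": ["abnormality of the nervous system"],
    "abnormality of the nervous system": ["phenotypic abnormality"],
}

# Hypothetical diseases with hypothetical direct annotations.
DISEASES = {
    "disease A": {"neurofibrillary tangles"},
    "disease B": {"gait ataxia"},
    "disease C": {"neurofibrillary tangles", "gait disturbance"},
}


def ancestors(term):
    """Return the term itself plus every term reachable via 'is a' edges."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


# Annotations propagated to all ancestors, per disease.
INDUCED = {
    name: set().union(*(ancestors(t) for t in terms))
    for name, terms in DISEASES.items()
}


def information_content(term):
    """Rarer terms are more informative: log(N / number of diseases with the term)."""
    n = sum(term in terms for terms in INDUCED.values())
    return math.log(len(INDUCED) / n)


def term_similarity(query_term, disease_term):
    """Information content of the most informative common ancestor.

    Every common ancestor is an ancestor of an annotated disease term,
    so its disease count is at least 1 and the logarithm is defined.
    """
    common = ancestors(query_term) & ancestors(disease_term)
    return max((information_content(t) for t in common), default=0.0)


def score(query, disease_terms):
    """Average, over the query terms, of the best match among the disease terms."""
    return sum(
        max(term_similarity(q, d) for d in disease_terms) for q in query
    ) / len(query)


query = {"ataxia"}  # no disease is directly annotated with this exact term
ranking = sorted(DISEASES, key=lambda name: score(query, DISEASES[name]), reverse=True)
for name in ranking:
    print(name, round(score(query, DISEASES[name]), 3))
```

In this toy setting, the query term ‘ataxia’ is not a direct annotation of any disease, yet the disease annotated with the more specific ‘gait ataxia’ ranks first, followed by the one sharing only the broader ‘abnormality of movement’.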

Clinical genomics

❖ “Standard” clinical exome pipeline

❖ Predicts causative variant based on information in genome of patient and background genomic data

❖ Each human genome harbors about 100 genuine loss-of-function SNVs with ∼20 genes completely inactivated and around 50-100 CNVs. (DG MacArthur et al., Science 2012)

[Pipeline figure: Whole exome → Filter (e.g. common variants) → Variant score (e.g. allele frequency, conservation)]

Clinical genomics

Zemojtel, Köhler et al. Effective diagnosis of genetic disease by computational phenotype analysis of the disease-associated genome. Science Translational Medicine (2014)

Robinson, Köhler et al. Improved exome prioritization of disease genes through cross-species phenotype comparison. Genome Research (2013)

[Pipeline figure: Filter → Variant score. In parallel, a deep phenotype profile of the patient as HPO terms yields a phenotypic relevance score, based on HPO similarity between the gene of the variant and the patient. The variant score is combined with the phenotypic relevance score (PhenIX, Exomiser, ...).]

Performance

❖ Combination of variant score and phenotype score is key.

❖ Keywords: Exomiser, PhenIX, …

❖ More about this in the practical session on Wednesday.
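Why the combination matters can be seen in a toy ranking. The sketch below uses a plain average of the two scores, which is an assumption made for illustration and not the method used by Exomiser or PhenIX; the gene names and score values are invented.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate gene with two scores, both assumed to be scaled to [0, 1]."""

    gene: str
    variant_score: float    # e.g. derived from allele frequency and conservation
    phenotype_score: float  # e.g. HPO similarity between gene and patient

    @property
    def combined(self) -> float:
        # Plain average: an illustrative assumption, not a published formula.
        return (self.variant_score + self.phenotype_score) / 2


# Hypothetical candidates that survived the variant filter.
candidates = [
    Candidate("GENE_A", 0.95, 0.10),
    Candidate("GENE_B", 0.80, 0.90),
    Candidate("GENE_C", 0.40, 0.95),
]

by_variant = sorted(candidates, key=lambda c: c.variant_score, reverse=True)
by_combined = sorted(candidates, key=lambda c: c.combined, reverse=True)

print("variant score only:", [c.gene for c in by_variant])
print("combined score:    ", [c.gene for c in by_combined])
```

Ranking by variant score alone puts GENE_A first although its phenotype barely matches the patient; the combined score moves GENE_B, which does well on both, to the top.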

Mapping phenotypes across species

❖ “We are able to provide a confident interpretation of the clinical relevance for only a … small proportion of variants in human populations.” Lloyd, Robinson, MacRae, STM 2016

❖ Use model organism?

❖ Each model organism has its own phenotype ontology, e.g.

❖ MPO

❖ ZP

Logical definition of phenotypes

❖ We define phenotypes using atomic ontologies for

❖ Anatomy

❖ Chemicals

❖ Cells

❖ Qualities

❖ Gene Ontology

❖ …

❖ Reasoning:

❖ Major premise: All men are mortal.

❖ Minor premise: Socrates is a man.

❖ Conclusion (by reasoner): Socrates is mortal.

[Figure labels: anatomy-ontology, quality-ontology]

Mungall et al. Integrating phenotype ontologies across multiple species. Genome Biology (2010)
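The same kind of inference can be sketched for phenotypes that are defined from atomic ontologies. In the example below, each phenotype is a (quality, anatomical entity) pair, and one phenotype is taken to be a subclass of another when its quality and its entity each lead up to the other's. This is a simplified structural check standing in for a real OWL reasoner; the two phenotype class names are hypothetical, and only the atomic terms (midface, face, decreased length, quality) are taken from the slides.

```python
# Tiny excerpts of the atomic ontologies, using labels from the slides.
QUALITY_IS_A = {"decreased length": "quality"}  # PATO: subclass of
ANATOMY_PART_OF = {"midface": "face"}           # UBERON: part of


def closure(term, relation):
    """Return the term plus everything reached by following `relation` upwards."""
    result = [term]
    while result[-1] in relation:
        result.append(relation[result[-1]])
    return result


# Hypothetical phenotype classes, each defined as (quality, anatomical entity).
DEFINITIONS = {
    "human: short midface": ("decreased length", "midface"),
    "mouse: abnormal face": ("quality", "face"),
}


def is_subclass(child, parent):
    """Simplified subsumption: the child's quality and entity must each lead to the parent's."""
    child_quality, child_entity = DEFINITIONS[child]
    parent_quality, parent_entity = DEFINITIONS[parent]
    return (
        parent_quality in closure(child_quality, QUALITY_IS_A)
        and parent_entity in closure(child_entity, ANATOMY_PART_OF)
    )


print(is_subclass("human: short midface", "mouse: abnormal face"))
print(is_subclass("mouse: abnormal face", "human: short midface"))
```

Neither phenotype class mentions the other, yet the relation between them follows from the atomic ontologies alone, which is what allows terms from different species ontologies to be linked.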

Reasoner

[Figure: a human phenotype (HPO) and a mouse phenotype (MP-ontology), defined with UBERON (midface, part of, face) and PATO (decreased length, subclass of, quality)]

Köhler et al. Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research. F1000Research (2013)

❖ Quality assurance for HPO

❖ Cross-species phenotype ontology for human, mouse, and zebrafish : Uberpheno

❖ Mechanistic insights (esp. GO)

[Same figure, next build step: the reasoner adds a ‘subclass of’ relation]

Uberpheno: Situation before

[Figure: ontology excerpt whose edges are all labelled ‘subclass of’]

Uberpheno

[Figure: cross-species example. Phenotype terms: Sleep-wake cycle disturbance; prolonged circadian period; abnormally increased duration circadian sleep/wake cycle, sleep. Genes (legend: human gene, mouse gene, zebrafish gene): PER2, PSEN2, hcrtr2, Fbxl3. Edge labels: annotated to; inferred: is a; inferred: annotated to]

Uberpheno for CNV analysis

❖ E.g. in array CGH, a detected CNV often contains multiple genes

❖ Goals:

❖ Automatically assign which genes are related to which clinical feature of the patient

❖ Use model organism phenotypes

❖ Create visualisation

Köhler et al. Clinical interpretation of CNVs with cross-species phenotype data. Journal of Medical Genetics (2014)
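The first goal above can be sketched as a set intersection over propagated annotations. Everything concrete here is an assumption for illustration: the simplified ‘is a’ edges, the gene names, their phenotype annotations (which could come from a model organism), and the hand-picked list of terms treated as too unspecific to count as a match.

```python
# Assumed, simplified 'is a' edges (child -> parents); illustrative only.
PARENTS = {
    "gait ataxia": ["ataxia"],
    "ataxia": ["abnormality of movement"],
    "abnormality of movement": ["abnormality of the nervous system"],
    "neurofibrillary tangles": ["abnormality of the nervous system"],
    "abnormality of the nervous system": ["phenotypic abnormality"],
}

# Terms considered too general to be a meaningful match (hand-picked assumption).
UNSPECIFIC = {"phenotypic abnormality", "abnormality of the nervous system"}


def ancestors(term):
    """Return the term itself plus every term reachable via 'is a' edges."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def induced(terms):
    """Propagate a set of annotations to all ancestor terms."""
    return set().union(*(ancestors(t) for t in terms))


def shared_phenotypes(gene_terms, patient_terms):
    """Terms shared by gene and patient after propagation, minus unspecific ones."""
    return (induced(gene_terms) & induced(patient_terms)) - UNSPECIFIC


patient = {"gait ataxia"}

# Hypothetical genes inside the CNV with hypothetical phenotype annotations.
cnv_genes = {
    "GENE_1": {"ataxia"},
    "GENE_2": {"neurofibrillary tangles"},
    "GENE_3": set(),
}

matches = {gene: shared_phenotypes(terms, patient) for gene, terms in cnv_genes.items()}
for gene, terms in matches.items():
    print(gene, sorted(terms))
```

Without the filter, GENE_2 would also appear to match the patient, but only through very general terms such as ‘abnormality of the nervous system’.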

Phenogram

[Figure legend: gene located in CNV; symptom of the patient; shared phenotype between gene and patient; match based on Uberpheno (mouse). Unspecific matches are filtered.]

Summary

❖ HPO: a controlled vocabulary of phenotypic abnormalities for human genetics

❖ Freely available, open-source

❖ FOR the community, FROM the community (see our papers)

❖ Novel approaches towards:

❖ Differential diagnosis tools (Phenomizer)

❖ Standardized patient description in projects world-wide

❖ Model organism phenotypes

Contribute

Acknowledgements

❖ Berlin:

❖ Peter Robinson (now JAX)

❖ Sandra Dölken

❖ Tomasz Zemojtel

❖ Genomics England:

❖ Damian Smedley

❖ Monarch Initiative:

❖ Chris Mungall

❖ Melissa Haendel

❖ Nicole Vasilevsky

❖ A lot missing! Sorry.

All medical experts supporting HPO!
