-
Olivier Bodenreider
Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA
Biomedical Knowledge Visualization
Bethesda, MD, July 6, 2004
7th International Protégé Conference
2nd Workshop on Visualizing Information in Knowledge Engineering (VIKE’04)
-
UMLS Semantic Navigator (SemNav)
http://umlsks.nlm.nih.gov *
SN Resources: Semantic Navigator
(* free UMLS registration required)
-
Unified Medical Language System®
◆ Developed at NLM since 1990
◆ 15th edition in 2004
◆ Integrates some 60 terminological resources
  ● Clinical vocabularies (including specialties)
  ● Core terminologies (anatomy, drugs, med. devices)
  ● Administrative terminologies, standards
◆ Integration
  ● Synonymous terms are clustered in a concept
  ● Hierarchies (trees) are combined in a graph structure
-
Terminology integration: Terms
Term                                              Sources
Duchenne muscular dystrophy                       MeSH, SNOMED, CTV3, Jablonski, CRISP, DxPlain, MedDRA, LOINC
pseudohypertrophic muscular dystrophy             MeSH, CTV3, SNOMED
X-linked recessive muscular dystrophy             Jablonski
Duchenne de Boulogne muscular dystrophy           Jablonski
Duchenne’s muscular dystrophy                     COSTAR
severe generalized familial muscular dystrophy    SNOMED
Duchenne type progressive muscular dystrophy      SNOMED
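The clustering of synonymous terms into one concept can be sketched as a small lookup structure. This is a minimal Python sketch, not the UMLS implementation: the identifier `DMD` is a hypothetical stand-in for a concept identifier, only four of the terms are included, and terms are matched by lowercased string only.

```python
from collections import defaultdict

# Hypothetical concept identifier (not a real UMLS identifier).
CONCEPT_ID = "DMD"

# A subset of the terms and sources listed on the slide.
TERM_SOURCES = {
    "Duchenne muscular dystrophy": [
        "MeSH", "SNOMED", "CTV3", "Jablonski",
        "CRISP", "DxPlain", "MedDRA", "LOINC",
    ],
    "pseudohypertrophic muscular dystrophy": ["MeSH", "CTV3", "SNOMED"],
    "X-linked recessive muscular dystrophy": ["Jablonski"],
    "Duchenne's muscular dystrophy": ["COSTAR"],
}


def build_term_index(concept_id, term_sources):
    """Map every lowercased synonym to the single concept that clusters them."""
    return {term.lower(): concept_id for term in term_sources}


def terms_by_source(term_sources):
    """Invert the table: which terms does each source contribute?"""
    by_source = defaultdict(list)
    for term, sources in term_sources.items():
        for source in sources:
            by_source[source].append(term)
    return dict(by_source)


index = build_term_index(CONCEPT_ID, TERM_SOURCES)
by_source = terms_by_source(TERM_SOURCES)
print(index["pseudohypertrophic muscular dystrophy"])
print(by_source["Jablonski"])
```

Any synonym resolves to the same concept, while the inverted table keeps track of which vocabulary contributed which term.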
-
Terminology integration: Relationships
[Figure: hierarchies from SNOMED, MeSH, AOD, and Read Codes are combined into a single UMLS graph. Concepts shown: Adrenal Gland Diseases, Adrenal Cortex Diseases, Adrenal Gland Hypofunction, Hypoadrenalism, Adrenal cortical hypofunction, Addison’s Disease.]
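Combining several source trees into one graph can be sketched as a union of parent-child edges that remembers where each edge came from. This is a minimal Python sketch under stated assumptions: the source labels `source_1` to `source_3` are hypothetical, and the edges are made up for illustration from the concept names on the slide. They are not the actual content of SNOMED, MeSH, AOD, or the Read Codes.

```python
from collections import defaultdict

# Hypothetical per-source trees as (parent, child) pairs; illustrative only.
SOURCE_TREES = {
    "source_1": [
        ("Adrenal Gland Diseases", "Adrenal Gland Hypofunction"),
        ("Adrenal Gland Hypofunction", "Addison's Disease"),
    ],
    "source_2": [
        ("Adrenal Gland Diseases", "Adrenal Cortex Diseases"),
        ("Adrenal Cortex Diseases", "Adrenal cortical hypofunction"),
        ("Adrenal cortical hypofunction", "Addison's Disease"),
    ],
    "source_3": [
        ("Adrenal Gland Diseases", "Hypoadrenalism"),
        ("Hypoadrenalism", "Addison's Disease"),
    ],
}


def merge_hierarchies(source_trees):
    """Union the edges of all sources: parent -> {child: set of sources}."""
    graph = defaultdict(dict)
    for source, edges in source_trees.items():
        for parent, child in edges:
            graph[parent].setdefault(child, set()).add(source)
    return graph


def parents_of(graph, node):
    """All parents of a node in the merged graph, sorted by name."""
    return sorted(p for p, children in graph.items() if node in children)


graph = merge_hierarchies(SOURCE_TREES)
parents = parents_of(graph, "Addison's Disease")
print(parents)
```

Each source is a tree, so a node has one parent there; after the merge the same node can have several parents, which is why the result is a graph rather than a tree.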
-
UMLS
◆ Two-level structure
  ● Semantic Network
    ■ 135 Semantic Types (STs)
    ■ 54 types of relationships among STs
  ● Metathesaurus
    ■ >1M concepts
    ■ ~12 M inter-concept relationships
  ● Link = categorization
[Figure: a Concept (Metathesaurus) is linked to a Semantic Type (Semantic Network) by categorization.]
-
[Figure: Metathesaurus concepts (Heart, Esophagus, Left Phrenic Nerve, Heart Valves, Fetal Heart, Mediastinum, Saccular Viscus, Angina Pectoris, Cardiotonic Agents, Tissue Donors) linked to Semantic Types in the Semantic Network (Anatomical Structure; Fully Formed Anatomical Structure; Embryonic Structure; Body Part, Organ, or Organ Component; Pharmacologic Substance; Disease or Syndrome; Population Group). Numeric labels in the figure: 22, 225, 97, 4, 12, 9, 31.]
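The two-level structure can be sketched as two small tables: one for the hierarchy of Semantic Types and one for the categorization links from concepts to Semantic Types. This is a minimal Python sketch; the links below are my reading of the figure and cover only a few of its nodes, so treat them as assumptions rather than as an extract of the UMLS.

```python
# is-a links between Semantic Types (child -> parent), as read from the figure.
ST_PARENT = {
    "Fully Formed Anatomical Structure": "Anatomical Structure",
    "Embryonic Structure": "Anatomical Structure",
    "Body Part, Organ, or Organ Component": "Fully Formed Anatomical Structure",
}

# Categorization links (concept -> Semantic Types), as read from the figure.
CATEGORIZATION = {
    "Heart": ["Body Part, Organ, or Organ Component"],
    "Fetal Heart": ["Embryonic Structure"],
    "Angina Pectoris": ["Disease or Syndrome"],
    "Cardiotonic Agents": ["Pharmacologic Substance"],
    "Tissue Donors": ["Population Group"],
}


def st_ancestors(semantic_type):
    """Return the Semantic Type followed by its ancestors, bottom-up."""
    chain = [semantic_type]
    while chain[-1] in ST_PARENT:
        chain.append(ST_PARENT[chain[-1]])
    return chain


def concepts_under(semantic_type):
    """Concepts categorized by the given type or by one of its descendants."""
    return sorted(
        concept
        for concept, types in CATEGORIZATION.items()
        if any(semantic_type in st_ancestors(t) for t in types)
    )


print(st_ancestors("Body Part, Organ, or Organ Component"))
print(concepts_under("Anatomical Structure"))
```

A query at the Semantic Network level ("everything that is an anatomical structure") is answered by walking up the type hierarchy from each concept's categorization.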
-
MeSH Browser
-
SemNav: Visualization options
-
SemNav: Relationships
[Figure: the concepts Dystrophin and Muscular Dystrophy, Duchenne (label: 55), shown with their Semantic Types: Amino Acid, Peptide, or Protein; Biologically Active Substance; Disease or Syndrome.]
-
Gene Ontology browser
http://mor.nlm.nih.gov/perl/gennav.pl
-
Gene Ontology™
◆ Developed by the GO Consortium
◆ Several components (GO database)
  ● Ontology (~17,000 concepts)
    ■ Molecular functions
    ■ Cellular components
    ■ Biological processes
  ● Gene products (~1.6M)
  ● Associations between Gene products and GO concepts (~6.8M)
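The three components listed above can be sketched as three small collections. This is a minimal Python sketch, not the GO database schema: the class, the field names, the gene product records, and the associations are hypothetical. Only the three term identifiers are the roots of the GO hierarchies.

```python
from dataclasses import dataclass

# Root terms of the three GO hierarchies.
GO_TERMS = {
    "GO:0003674": "molecular_function",
    "GO:0005575": "cellular_component",
    "GO:0008150": "biological_process",
}


@dataclass(frozen=True)
class GeneProduct:
    """Hypothetical record layout for a gene product."""
    gp_id: str
    symbol: str
    organism: str


# Made-up gene products and associations, for illustration only.
GENE_PRODUCTS = {
    "GP1": GeneProduct("GP1", "geneA", "human"),
    "GP2": GeneProduct("GP2", "geneB", "mouse"),
}
ASSOCIATIONS = [
    ("GP1", "GO:0003674"),
    ("GP2", "GO:0003674"),
    ("GP1", "GO:0005575"),
]


def products_for(term_id, organism=None):
    """Gene products associated with a term, optionally for one organism."""
    found = [GENE_PRODUCTS[gp] for gp, term in ASSOCIATIONS if term == term_id]
    if organism is not None:
        found = [gp for gp in found if gp.organism == organism]
    return [gp.symbol for gp in found]


print(products_for("GO:0003674"))
print(products_for("GO:0003674", organism="mouse"))
```

The optional `organism` argument shows one way to restrict the information space to selected organisms.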
-
Material and Methods
-
Technical details
-
Technical details
◆ Simple web/cgi technology (Apache, Perl)
◆ dot (GraphViz)
  ● PNG file (-Tpng)
  ● Client-side map (-Tcmap)
◆ Precompute the transitive closure on hierarchical relations so that closure lookups are fast
◆ Remove cycles (UMLS)
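The pipeline above (remove cycles, precompute the closure, hand the graph to dot) can be sketched as follows. This is a minimal Python sketch and not the Perl code behind the tools. Dropping depth-first-search back edges is one possible way to remove cycles and is an assumption here, as are the function names and the `?node=` link format.

```python
def remove_cycles(edges):
    """Drop back edges found by depth-first search so the result is acyclic."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        children.setdefault(child, [])
    state = {}  # node -> "open" (on the DFS stack) or "done"
    kept = []

    def visit(node):
        state[node] = "open"
        for child in children[node]:
            if state.get(child) == "open":
                continue  # back edge: it would close a cycle, so skip it
            kept.append((node, child))
            if child not in state:
                visit(child)
        state[node] = "done"

    for node in children:
        if node not in state:
            visit(node)
    return kept


def transitive_closure(edges):
    """Map each node to the set of all its descendants (acyclic input)."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
        children.setdefault(child, set())
    closure = {}

    def descendants(node):
        if node not in closure:
            result = set()
            for child in children[node]:
                result.add(child)
                result |= descendants(child)
            closure[node] = result
        return closure[node]

    for node in children:
        descendants(node)
    return closure


def to_dot(edges):
    """Render DOT source; the URL attribute (hypothetical link) feeds the map."""
    lines = ["digraph hierarchy {"]
    for node in sorted({n for edge in edges for n in edge}):
        lines.append(f'  "{node}" [URL="?node={node}"];')
    for parent, child in edges:
        lines.append(f'  "{parent}" -> "{child}";')
    lines.append("}")
    return "\n".join(lines)


raw = [("A", "B"), ("B", "C"), ("C", "A")]  # contains the cycle A -> B -> C -> A
dag = remove_cycles(raw)
closure = transitive_closure(dag)
print(to_dot(dag))
```

With the DOT text saved as `graph.dot`, the two outputs named on the slide would come from `dot -Tpng graph.dot -o graph.png` and `dot -Tcmap graph.dot -o graph.map`. The recursive helpers are fine for small graphs; a graph with a million concepts would need an iterative traversal.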
-
Discussion: Issues and Challenges
-
Issues
◆ Size
  ● Large number of concepts (>1 million)
◆ Complexity
  ● Polyhierarchical structures
  ● Multiple information sources
  ● Multiple properties
◆ Lack of formality
  ● Redundant relations
  ● Hierarchies vs. hierarchical relations
-
Challenges
◆ Restrict information space
  ● To selected information sources (SemNav)
  ● To selected organisms (GenNav)
◆ Reduce complexity (SemNav)
  ● Group concepts by semantic groups
  ● Transitive reduction on hierarchical relations
  ● Select co-occurring concepts
◆ Reduce the cognitive burden on the user
  ● Use graph-based rather than tree-based representations
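Restricting the information space to selected sources can be sketched as a filter over relations that carry their provenance. This is a minimal Python sketch: the relation triples and the source labels `source_1` and `source_2` are hypothetical and only illustrate the idea.

```python
# Hypothetical (parent, child, source) relations; illustrative only.
RELATIONS = [
    ("Adrenal Gland Diseases", "Adrenal Gland Hypofunction", "source_1"),
    ("Adrenal Gland Hypofunction", "Addison's Disease", "source_1"),
    ("Adrenal Gland Diseases", "Adrenal Cortex Diseases", "source_2"),
    ("Adrenal Cortex Diseases", "Addison's Disease", "source_2"),
]


def restrict_to_sources(relations, selected):
    """Keep only the relations asserted by one of the selected sources."""
    selected = set(selected)
    return [rel for rel in relations if rel[2] in selected]


def nodes_of(relations):
    """All concepts that still appear after the restriction."""
    return sorted({name for parent, child, _ in relations for name in (parent, child)})


subset = restrict_to_sources(RELATIONS, ["source_1"])
print(len(RELATIONS), "relations before,", len(subset), "after")
print(nodes_of(subset))
```

Fewer sources mean fewer edges and fewer nodes to draw, at the cost of hiding what the other vocabularies assert.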
-
SemNav: Semantic groups
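Grouping concepts by semantic group can be sketched as a two-step lookup: concept to Semantic Type, then Semantic Type to semantic group. This is a minimal Python sketch; the tables hold only a handful of entries, and both the type-to-group and the concept-to-type assignments should be read as assumptions for illustration rather than as an extract of the UMLS.

```python
from collections import defaultdict

# Semantic Type -> semantic group (small illustrative subset).
SEMANTIC_GROUP = {
    "Body Part, Organ, or Organ Component": "Anatomy",
    "Embryonic Structure": "Anatomy",
    "Disease or Syndrome": "Disorders",
    "Pharmacologic Substance": "Chemicals & Drugs",
    "Amino Acid, Peptide, or Protein": "Chemicals & Drugs",
    "Biologically Active Substance": "Chemicals & Drugs",
    "Population Group": "Living Beings",
}

# Concept -> Semantic Types (illustrative subset).
CONCEPT_TYPES = {
    "Heart": ["Body Part, Organ, or Organ Component"],
    "Fetal Heart": ["Embryonic Structure"],
    "Muscular Dystrophy, Duchenne": ["Disease or Syndrome"],
    "Dystrophin": [
        "Amino Acid, Peptide, or Protein",
        "Biologically Active Substance",
    ],
}


def group_concepts(concept_types, semantic_group):
    """Collect concepts under the semantic group of each of their types."""
    groups = defaultdict(set)
    for concept, types in concept_types.items():
        for semantic_type in types:
            groups[semantic_group[semantic_type]].add(concept)
    return {group: sorted(members) for group, members in groups.items()}


groups = group_concepts(CONCEPT_TYPES, SEMANTIC_GROUP)
for group, members in sorted(groups.items()):
    print(group, members)
```

A concept with two types in the same group, such as Dystrophin here, appears only once in that group, which is what makes the grouped display smaller than the typed one.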
-
SemNav: Transitive reduction
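Transitive reduction removes every hierarchical edge that is already implied by a longer path, so the drawing keeps only the direct links. This is a minimal Python sketch that assumes an acyclic input; the function name and the example edges are hypothetical, and the quadratic reachability check is chosen for clarity, not speed.

```python
def transitive_reduction(edges):
    """Drop each edge (parent, child) implied by a longer path (acyclic input)."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
        children.setdefault(child, set())

    def reachable(start):
        """All nodes reachable from start by one or more edges."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in children[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reduced = []
    for parent, child in edges:
        # Redundant if another child of the same parent already leads to child.
        redundant = any(
            child in reachable(other)
            for other in children[parent]
            if other != child
        )
        if not redundant:
            reduced.append((parent, child))
    return reduced


# Illustrative edges; the third one is implied by the first two.
EDGES = [
    ("Adrenal Gland Diseases", "Adrenal Gland Hypofunction"),
    ("Adrenal Gland Hypofunction", "Addison's Disease"),
    ("Adrenal Gland Diseases", "Addison's Disease"),
]
reduced = transitive_reduction(EDGES)
print(reduced)
```

The reduced graph has the same transitive closure as the input, so no ancestor or descendant is lost; only the shortcut edge disappears from the picture.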
-
Medical Ontology Research
Olivier Bodenreider
Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA
Contact: [email protected]
Web: mor.nlm.nih.gov