Introductory Review of Current Knowledge Organization Systems/Structures/Services (KOS)
Marcia Lei Zeng
Second International Seminar on Subject Access to Information, Helsinki, Finland, 29-30 November 2007
M.L.Zeng @ ISSAI, Helsinki,2007 2
Purpose of this talk
• Introduce different types of knowledge organization systems/structures/services (KOS)
• Provide a common terminology and background
1. KOS overview (1)
Knowledge organization systems/structures/services (KOS) encompass all types of schemes for organizing information and promoting knowledge management. – (Gail Hodge, 2000)
1. KOS overview (2)
These systems
• model the underlying semantic structure of a domain, and
• provide semantics, navigation, and translation through labels, definitions, typing, relationships, and properties for concepts.
– (Hill et al. 2002; Koch and Tudhope 2004)
A Taxonomy of KOS (arranged by structure and function)
Term Lists: Pick lists, Glossaries/Dictionaries, Synonym Rings
Metadata-like Models: Authority Files, Directories, Gazetteers
Classification & Categorization: Subject Headings, Categorization schemes, Taxonomies, Classification schemes
Relationship Models: Thesauri, Semantic networks, Ontologies
2. Fundamentals of KOS Approaches
• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts
2.1 Eliminating ambiguity
• Ambiguity: terms having the same spelling (homographs) that represent different concepts or meanings
• Ambiguity exists when a given term can be used to represent completely different concepts.
Ambiguity / Homographs
Source: Z39.19-2005, p.25
To eliminate ambiguity (1)
1. Adding a qualifier to a term: one of the major methods used by almost every type of KOS, especially lists of subject headings and thesauri.
• e.g., Mercury (automobile)
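The qualifier approach can be sketched in a few lines of Python. This is only an illustration: the helper name `qualified_forms` and the qualifiers other than "automobile" are assumptions, not taken from any particular vocabulary.

```python
# Hypothetical vocabulary: each homograph maps to its qualifiers.
# Only "Mercury (automobile)" appears on the slide; the others are assumed.
QUALIFIERS = {
    "Mercury": ["automobile", "metal", "planet"],
}

def qualified_forms(term):
    """Return the unambiguous 'Term (qualifier)' forms for a homograph.

    A term with no registered qualifiers is returned unchanged.
    """
    qualifiers = QUALIFIERS.get(term, [])
    if not qualifiers:
        return [term]
    return [f"{term} ({q})" for q in qualifiers]

print(qualified_forms("Mercury"))
```

Each qualified form is then treated as a separate term, so an indexer or searcher has to pick one meaning explicitly.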
To eliminate ambiguity (2)
2. Providing a scope note: another major method used by almost every type of KOS, especially lists of subject headings, classifications, and thesauri.
Screenshot from MeSH, entry: mercury
http://www.nlm.nih.gov/mesh/MBrowser.html
To eliminate ambiguity (3)
3. Providing a context for a term
What are these?
• Flying Horse
• King Fisher
• Royal Challenge
• Heineken
• Budweiser
• Miller-Lite
• Bud-Light
Drinks
• Flying Horse
• King Fisher
• Royal Challenge
• Taj Mahal
• Hayward’s 2000
• Heineken
• Corona
• Budweiser
• Miller-Lite
• Bud-Light
Lists (Picklists)
A type of controlled vocabulary included in the NISO Z39.19 standard
• Lists are used to describe aspects of content objects or entities that have a limited number of possibilities.
• Examples include:
– geography (e.g., country, state, city),
– language (e.g., English, French, Swedish),
– format (e.g., text, image, sound), or
– … …
Lists can be used effectively for both browsing and searching.
• In browsing, items are directly accessed when the list of terms is reviewed and one term is selected.
Source: http://www.ncbi.nlm.nih.gov/genome/guide/human/resources.shtml
• In searching, a list may be used to access content in a single term search, or the terms from the list may be used to limit a retrieved set by another attribute of interest for the user (one or more terms in the search).
Source: Google’s advanced search http://www.google.com
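Limiting a retrieved set with a pick-list term might look like the following Python sketch. The `Item` class, the function `limit_by_format`, and the sample records are hypothetical; only the format values (text, image, sound) come from the examples in this talk.

```python
from dataclasses import dataclass

# Pick list of allowed format values (from the examples in this talk).
FORMATS = ("image", "sound", "text")

@dataclass(frozen=True)
class Item:
    title: str
    format: str  # expected to be one of FORMATS

def limit_by_format(items, fmt):
    """Keep only the items whose format equals the selected pick-list term."""
    if fmt not in FORMATS:
        raise ValueError(f"{fmt!r} is not a term in the pick list")
    return [item for item in items if item.format == fmt]

# Hypothetical retrieved set.
results = [
    Item("Harbour at dusk", "image"),
    Item("Oral history interview", "sound"),
    Item("Parish register", "text"),
]

images = limit_by_format(results, "image")
print([item.title for item in images])
```

Rejecting values that are not in the list is what makes the vocabulary controlled: the user can only choose among the listed possibilities.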
pick lists
Waterford County Image Archive, http://www.waterfordcountyimages.org
List - Definition, Purpose, and Uses
• A list (also called a pick list) is a limited set of terms arranged as a simple alphabetical list or in some other logically evident way.
– A list is a series of terms in some sequential order.
– Terms can be ordered alphabetically, chronologically, numerically, etc.
Exercise: Which list is better?
• The defining characteristics of a list are that the terms:
· are all members of the same set or class of items (e.g., countries, products)
· are not overlapping in meaning
· are equal in terms of specificity (granularity)
Typical applications
• Lists are frequently used to display small sets of terms that are to be used for quite narrowly defined purposes such as a web pull-down list or list of menu choices.
2. Fundamentals of KOS Approaches
• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts
2.2 Controlling synonyms or equivalents
• Synonyms: terms with the same or similar meanings
1. True synonyms (unusual)
– mean exactly the same thing and are used in precisely the same context
2. Near synonyms (most common)
1. True Synonyms
• common and technical names
– salt vs. sodium chloride
• changes in usage of terms over time
– electronic calculating machines vs. computers
• in different languages
– eyeglasses, spectacles, glasses
• acronyms
– BBC, British Broadcasting Corporation; MPG, miles per gallon
• variant spellings
– cancelled, canceled; honor, honour
2. Near Synonyms
• Same stem
– computing, computers, computed, microcomputers, supercomputers
• Overlapping concepts
– medicine, drugs
– fired, laid off
– forest, woods
– arid, dry
• General and specific terms
Coffee
– Double Espresso
– Latte
– Cappuccino
– Short Black
– Macchiato
– Flat White
– etc.
Synonymy
Source: Z39.19-2005, p.25
• Each distinct concept should be represented by a unique linguistic form.
• Information or content that is provided to a user should not spread across the system under multiple access points, but should be gathered together in one place.
Controlling synonyms: there will only be one term used to represent a given concept or entity.
Authority File
… …
150 World War, 1939-1945
450 European War, 1939-1945
450 Second World War, 1939-1945
450 World War 2, 1939-1945
450 World War II, 1939-1945
450 World War Two, 1939-1945
Source: FAST: Faceted Application of Subject Terminology, http://fast.oclc.org/
or:
Thesaurus
World War, 1939-1945
UF European War, 1939-1945
UF Second World War, 1939-1945
UF World War 2, 1939-1945
UF World War II, 1939-1945
UF World War Two, 1939-1945
European War, 1939-1945
USE World War, 1939-1945
Second World War, 1939-1945
USE World War, 1939-1945
World War 2, 1939-1945
USE World War, 1939-1945
World War II, 1939-1945
USE World War, 1939-1945
World War Two, 1939-1945
USE World War, 1939-1945
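The USE and UF references in the World War example can be modeled as two reciprocal lookups. The Python sketch below is an illustration only; the names `USE`, `UF`, and `resolve` are assumptions, while the terms are those of the FAST example.

```python
# Preferred term and its non-preferred variants, as in the FAST example.
PREFERRED = "World War, 1939-1945"
VARIANTS = [
    "European War, 1939-1945",
    "Second World War, 1939-1945",
    "World War 2, 1939-1945",
    "World War II, 1939-1945",
    "World War Two, 1939-1945",
]

# USE references: every variant points to the preferred term.
USE = {variant: PREFERRED for variant in VARIANTS}

# UF (used for) references: the reciprocal of USE.
UF = {PREFERRED: list(VARIANTS)}

def resolve(term):
    """Return the preferred term for an entry term; unknown terms pass through."""
    return USE.get(term, term)

print(resolve("World War II, 1939-1945"))
```

Whichever variant a user enters, indexing and retrieval then happen under the single preferred term, so content is gathered in one place.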
Source: Art and Architecture Thesaurus (AAT)
Source: Medical Subject Headings (MeSH)
Synonym Rings
A type of controlled vocabulary included in the NISO Z39.19 standard
astronaut, spaceman, cosmonaut, spationaut, taikonaut
A synonym ring connects a set of words that are defined as equivalent for retrieval.
An example from International SEMATECH.
A search for Silicon would look like this:
Your search was submitted as “CILICON” or “SI”
Synonym rings are used:
• to expand queries for content objects
– If a user enters any one of these terms as a query to the system, all items are retrieved that contain any of the terms in the cluster.
• in systems where the underlying content objects are left in their unstructured natural language format
– The control is achieved through the interface by drawing together similar terms into these clusters.
• in conjunction with search engines
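Query expansion with a synonym ring can be sketched as follows in Python. The function names and the sample documents are hypothetical; the ring itself is the astronaut example from this talk. Matching is naive substring matching, chosen only to keep the sketch short.

```python
# One ring = one set of terms treated as equivalent for retrieval.
RINGS = [
    {"astronaut", "cosmonaut", "spaceman", "spationaut", "taikonaut"},
]

def expand_query(term):
    """Return every term in the ring containing `term`, or just the term itself."""
    term = term.lower()
    for ring in RINGS:
        if term in ring:
            return set(ring)
    return {term}

def search(term, documents):
    """Retrieve documents containing any term of the expanded query."""
    terms = expand_query(term)
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]

# Hypothetical unstructured content.
docs = [
    "A cosmonaut boarded the station.",
    "The taikonaut returned safely.",
    "Weather report for Helsinki.",
]

print(search("astronaut", docs))
```

Unlike an authority file or thesaurus, a ring has no preferred term: the documents are left untouched and all of the control happens at query time.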
Poverty mitigation
Poverty alleviation
Poverty elimination
Poverty reducation
Poverty eradication
Poverty abatement
Poverty prevention
Poverty reduction
Rings can include all kinds of synonyms - true, misspellings, predecessors, abbreviations
Source: Bedford, 2006 ppt.
Exercise
• Find synonyms of this type of object:
2. Fundamentals of KOS Approaches
• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts
2.3 Making explicit semantic relationships – Hierarchical relationships
Birds
. Cardinals
. Doves
. Robins
. Wrens
All specific names of birds are kinds of birds.
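The "kind of" test behind the bird example can be sketched in Python by following broader-term links upward. The names `NARROWER`, `broader_of`, and `is_kind_of` are assumptions for illustration; the hierarchy is the one from this talk.

```python
# Narrower terms for each broader term (the bird example from this talk).
NARROWER = {
    "Birds": ["Cardinals", "Doves", "Robins", "Wrens"],
}

def broader_of(term):
    """Return the broader term of `term`, or None for a top term."""
    for broader, narrower_terms in NARROWER.items():
        if term in narrower_terms:
            return broader
    return None

def is_kind_of(term, ancestor):
    """True if `ancestor` is reached from `term` by following broader-term links."""
    current = broader_of(term)
    while current is not None:
        if current == ancestor:
            return True
        current = broader_of(current)
    return False

print(is_kind_of("Robins", "Birds"))
```

The relationship is directional: every robin is a bird, but the reverse test fails, which is what separates a hierarchical link from a simple "related" link.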
Scientific Taxonomy
An example: Leatherback turtle
Phylum: Chordata
Class: Reptilia
Subclass: Anapsida
Order: Testudines
Suborder: Cryptodira
Family: Dermochelyidae
Genus: Dermochelys
Species: Dermochelys coriacea (Leatherback turtle)
Classifications
superordinate classes (e.g., parents)
. coordinate classes (e.g., siblings)
. . subordinate classes (e.g., children)
. . subordinate classes
. coordinate classes
. coordinate classes
. . subordinate classes
relationship types: generic, instance, and whole-part
2.3 Making explicit semantic relationships – Associative relationships (not hierarchical)
Relationship                      Example
Part / Whole                      Bicycle / Bicycle Wheel
Cause / Effect                    Accident / Injury
Process / Agent                   Velocity measurement / Speedometer
Action / Product                  Writing / Publication
Action / Patient                  Teaching / Student
Concept or Thing / Properties     Steel alloy / Corrosion resistance
Concept or Thing / Origins        Water / Well
Thing or Action / Counter-agent   Pest / Pesticide
Raw material / Product            Grapes / Wine
Action / Property                 Communication / Communication skills
Antonyms                          Single people / Married people
Source: Z39.19-2005, p.29
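Associative links can be stored as typed pairs and indexed in both directions. The Python sketch below assumes, as is usual for related-term references in a thesaurus, that the link is reciprocal; the function name `build_rt_index` is hypothetical and the three pairs are taken from the associative relationship examples in this talk.

```python
from collections import defaultdict

# (term, related term, relationship type), from the examples in this talk.
RELATED = [
    ("Accident", "Injury", "Cause / Effect"),
    ("Pest", "Pesticide", "Thing or Action / Counter-agent"),
    ("Grapes", "Wine", "Raw material / Product"),
]

def build_rt_index(pairs):
    """Index related terms in both directions, keeping the relationship type."""
    index = defaultdict(list)
    for first, second, relationship in pairs:
        index[first].append((second, relationship))
        index[second].append((first, relationship))
    return dict(index)

rt = build_rt_index(RELATED)
print(rt["Wine"])
```

Keeping the relationship type alongside each pair is optional in a plain thesaurus but becomes essential in the semantic networks and ontologies discussed later.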
KOS in Use at World Bank
• Topic Thesaurus (500,000+ English terms, French and Spanish language versions in progress now)
• Topic Classification Scheme (30 top classes, 700+ subtopics, 300+ subsubtopics)
• Business Function Thesaurus (50,000 terms and growing)
• Business Function Classification Scheme (5 business areas, 30 lines of business, 300+ business processes)
• Country-Region classification scheme (6 regions, ca. 200 countries)
• Content Type Classification Scheme (8 content types, 300+ secondary content types – in refinement now)
• Media-Format Classification Scheme
• Country Name Authority Control (synonym, predecessor, successor sources)
• Edition Statements Authority Control
• Publisher Name Authority Control
• Organization Authority Control
• Language Authority Control
• Series Name/Collection Title Authority Control
• Translation Type Authority Control
Source: Bedford, 2007, ASIST
Vision of An Enterprise Advanced Search
[Figure: search interface annotated with the KOS behind each part: pick lists, hierarchical taxonomy, synonym rings]
Source: Revised based on Bedford, 2006 ppt.
[Figure: search interface annotated with synonym rings, thesaurus, metadata]
Source: Revised based on Bedford, 2006 ppt.
2. Fundamentals of KOS Approaches
• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts
2.4 Presenting relationships as well as properties of concepts
• Entity types
• Relationship types
• Properties
Semantic networks
organize sets of terms representing concepts, modeled as the nodes in a network of variable relationship types.
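A semantic network can be sketched in Python as a list of typed edges between concept nodes. The concepts and relationship names below are hypothetical and are not taken from any real network.

```python
# Each edge is (source node, relationship type, target node).
# Concepts and relationship names are hypothetical.
EDGES = [
    ("Aspirin", "is_a", "Drug"),
    ("Aspirin", "treats", "Headache"),
    ("Headache", "is_a", "Symptom"),
]

def neighbors(node, relation=None):
    """Return targets linked from `node`, optionally filtered by relationship type."""
    return [
        target
        for source, rel, target in EDGES
        if source == node and (relation is None or rel == relation)
    ]

print(neighbors("Aspirin"))
print(neighbors("Aspirin", "treats"))
```

The point of the sketch is that the relationship type is data, not structure: new kinds of links can be added without changing how the network is stored.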
UMLS Semantic Network
135 Semantic Types and 54 Semantic Relation Types
Source: Noy, N. F. and Tu, S.W. (2003).
Ontologies
Classes
attributes
instances
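The three building blocks can be loosely mirrored with a Python dataclass: the class definition plays the role of an ontology class, its fields are attributes, and an object is an instance. This is only an analogy, since ontology languages such as OWL carry formal semantics that Python classes do not; the class name `Species` and its field names are assumptions, with values taken from the leatherback turtle example earlier in this talk.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class Species:               # class
    scientific_name: str     # attribute
    common_name: str         # attribute
    family: str              # attribute

# Instance: one individual member of the class.
leatherback = Species(
    scientific_name="Dermochelys coriacea",
    common_name="Leatherback turtle",
    family="Dermochelyidae",
)

print([f.name for f in fields(Species)])
print(isinstance(leatherback, Species))
```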
The Graph view of relations
A Taxonomy of KOS © 2007 Zeng
[Figure: KOS types plotted by structure (flat, two dimensions, multiple dimensions) against major functions (eliminating ambiguity; controlling synonyms; establishing relationships: hierarchical; establishing relationships: associative; presenting properties)]
KOS types shown: Term Lists (Pick lists, Glossaries/Dictionaries, Synonym Rings); Metadata-like Models (Authority Files, Directories, Gazetteers); Classification & Categorization (Subject Headings, Categorization schemes, Taxonomies, Classification schemes); Relationship Models (Thesauri, Semantic networks, Ontologies)
Networked KOS (NKOS)
• KOS are not used in isolation;
• KOS may be used, re-used, and re-purposed in web-based services;
• KOS are used for:
– organizing, indexing, cataloging, and searching, AND
– learning, knowledge modeling, reasoning, etc.
• NKOS need to be machine-processable, machine-understandable
– (more to discuss later today)
References
• Hodge, Gail. 2000. Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. Washington, DC: Council on Library and Information Resources. http://www.clir.org/pubs/reports/pub91/contents.html, http://www.clir.org/pubs/reports/pub91/pub91.pdf
• Hill, Linda, Buchel, Olha, Janee, Greg, and Zeng, Marcia L. 2002. Integration of knowledge organization systems into digital library architectures. In: Mai, Jens-Erik, et al., eds. Advances in Classification Research, Volume 13: Proceedings of the 13th ASIST SIG/CR Workshop, 17 November 2002, Philadelphia, PA, pp. 62-68.
• Koch, Traugott and Tudhope, Douglas. 2004. User-centred approaches to Networked Knowledge Organization Systems/Services (NKOS): Background. http://www2.db.dk/nkos-workshop/#Background