Exploiting a Thesaurus- Based Semantic Net for Knowledge-Based Search Peter Clark John Thompson Lisbeth Duncan Heather Holmback Knowledge Systems Boeing, Mathematics and Computing Technology

Dec 16, 2015

Page 1:

Exploiting a Thesaurus-Based Semantic Net for Knowledge-Based Search

Peter Clark, John Thompson, Lisbeth Duncan, Heather Holmback

Knowledge Systems
Boeing, Mathematics and Computing Technology

Page 2:

Overview

• Problem: searching for information
– in particular, for human experts
• Approach:
– Search using concepts, not words
– Use a thesaurus as the initial ontology
– Enhance it using simple AI techniques
• The Application:
– Two deployed “Expert Locator” applications

Page 3:

Overall Picture

[Figure: query words (e.g. “tube placement”) are passed to a search engine, which searches databases, human experts, web pages, document repositories, ...]

Page 4:

Problems with word searches...

• Words have many senses (polysemy)
– e.g. “plane” finds both airplanes and geometry
• Many words mean the same thing (synonymy)
– e.g. “tail fin” misses “vertical stabilizer”
• Lack of world knowledge
– e.g. “jet engine” misses “propulsion systems”

Goal: organize search around concepts, not words

Need a conceptual vocabulary (“ontology”)
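The synonymy problem can be illustrated with a small sketch. Everything here is hypothetical: the synonym table, the resource index, and the function names are invented for illustration and are not taken from the Boeing system. It only shows why a literal word match on “tail fin” finds nothing while a lookup through a canonical concept does.

```python
# Hypothetical synonym table: surface phrase -> canonical concept name.
SYNONYMS = {
    "tail fin": "vertical stabilizer",
}

# Hypothetical index: resource -> terms it is described with.
RESOURCES = {
    "expert A": ["vertical stabilizer"],
    "expert B": ["landing gear"],
}


def word_search(query):
    """Literal match: the query must appear verbatim among the terms."""
    return sorted(name for name, terms in RESOURCES.items() if query in terms)


def concept_search(query):
    """Map both the query and the indexed terms to concepts, then match."""
    concept = SYNONYMS.get(query, query)
    return sorted(
        name for name, terms in RESOURCES.items()
        if concept in {SYNONYMS.get(t, t) for t in terms}
    )


print(word_search("tail fin"))     # literal search misses the synonym
print(concept_search("tail fin"))  # concept search finds expert A
```

Polysemy and missing world knowledge need more than a synonym table, which is where the links between concepts in the ontology come in.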

Page 5:

The Ontology Bottleneck

• Massive up-front cost to build an ontology
• The Approach: use a technical thesaurus, enhanced with AI techniques
• Boeing’s Thesaurus:
– Highly customized to aerospace and Boeing
– Massive knowledge repository
    • 37,000 concepts, 18,000 synonyms
    • 100,000 relationships (3 types)
– Many person-years of effort invested

Page 6:

A (tiny) fragment of the ontology...

[Figure: semantic-net fragment with the nodes Jet engines, flameout, combustion, Burning rate, afterburning, Ramjet engines, Hydrogen fuels, engines, Propulsion systems, thrust, lift, Turbojet engines, Engine starters, Flame stability, Combustion stability, Flame propagation, Pneumatic equipment, starting, ignition, spray, Jet spray]

Page 7:

Converting Words to Concepts

• Search word: “jet”

[Figure: the ontology fragment from Page 6, with four question marks marking possible matching concepts]
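A minimal sketch of the ambiguity on this slide, assuming that candidate concepts are found by matching the search word against concept names. The matching strategy and the function name are assumptions; the slides do not say how the deployed system selects candidates. The concept names are taken from the ontology fragment.

```python
# Concept names from the ontology fragment shown on the slides.
CONCEPTS = [
    "Jet engines", "Ramjet engines", "Turbojet engines", "Jet spray",
    "Propulsion systems", "combustion", "ignition", "spray",
]


def candidate_concepts(word, concepts, substring=False):
    """Return concepts whose name contains `word` as a whole token,
    or (if substring=True) inside any token."""
    w = word.lower()
    hits = []
    for concept in concepts:
        tokens = concept.lower().split()
        if w in tokens or (substring and any(w in t for t in tokens)):
            hits.append(concept)
    return hits


print(candidate_concepts("jet", CONCEPTS))
print(candidate_concepts("jet", CONCEPTS, substring=True))
```

Either way the single word “jet” maps to more than one concept, so the search has to either ask the user to choose or search from all candidates.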

Page 8:

Matching Query and Target Concepts

• Semantic distance between “ignition” and “jet engines”?

[Figure: the ontology fragment from Page 6]
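One simple reading of “semantic distance” is the number of links on the shortest path between two concepts. The sketch below uses breadth-first search for that. The edge list is hypothetical (the actual links in the figure cannot be recovered from this transcript), and the slides do not state that the deployed system used exactly this measure.

```python
from collections import deque

# Hypothetical links between concepts from the fragment; the real edges
# in the Boeing thesaurus may differ.
EDGES = [
    ("jet engines", "engines"),
    ("jet engines", "combustion"),
    ("combustion", "ignition"),
    ("engines", "engine starters"),
    ("engine starters", "starting"),
    ("starting", "ignition"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of concept pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def link_distance(graph, start, goal):
    """Number of links on the shortest path, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


graph = build_graph(EDGES)
print(link_distance(graph, "ignition", "jet engines"))
```

With these made-up edges the shortest route runs ignition, combustion, jet engines, so the distance is 2. A weighted variant could charge different costs per relation type.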

Page 9:

Expert Locator Demo

(see end of this presentation for the demo in PowerPoint form)

Page 10:

Enhancing the Thesaurus: 1. Increase connectivity using subsumption

• 100,000 links are not enough!
– 40% of concepts are “orphans”
• But: Many concept names are phrases
– Can add links by analyzing these phrases

[Figure: “Space Shuttle Main Engine” gains a generalization link to “Engine” and a related-to link to “Space Shuttle”]

Page 11:

Subsumption Computation Algorithm

1. Compute all possible generalizations by “word chopping” and “word generalization”...

[Figure: lattice of candidate generalizations of “Space Shuttle Main Engine”: Space Shuttle Engine, Space Engine, Space Vehicle Main Engine, Vehicle Main Engine, Vehicle Engine, Engine, Space Shuttle Main, Space Shuttle, Space Vehicle, Space, Shuttle, Vehicle, Vehicle Main]
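Step 1 can be sketched as follows, assuming that “word chopping” means dropping a word and “word generalization” means replacing a word with a more general one. The generalization table (shuttle to vehicle) is a hypothetical stand-in for whatever the thesaurus supplies, and the function name is invented here.

```python
from itertools import product

# Hypothetical word-level generalizations (not taken from the thesaurus).
WORD_GENERALIZATIONS = {"shuttle": "vehicle"}


def candidate_generalizations(phrase, word_generalizations):
    """Enumerate every phrase obtainable by keeping, chopping, or
    generalizing each word of `phrase` (excluding the phrase itself)."""
    words = phrase.lower().split()
    options = []
    for word in words:
        choices = [word, None]  # keep the word, or chop it
        if word in word_generalizations:
            choices.append(word_generalizations[word])  # or generalize it
        options.append(choices)

    results = set()
    for combo in product(*options):
        kept = [w for w in combo if w is not None]
        if kept and kept != words:
            results.add(" ".join(kept))
    return results


cands = candidate_generalizations("Space Shuttle Main Engine",
                                  WORD_GENERALIZATIONS)
print(len(cands))
print(sorted(cands))
```

This exhaustive version yields 22 candidates, including every phrase shown in the lattice as well as a few the figure omits (such as “main engine”). Most candidates are not real concepts, which is why the next step filters them against the thesaurus.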

Page 12:

Subsumption Computation Algorithm

2. Identify existing Thesaurus concepts and links within these

[Figure: the lattice of candidate generalizations of “Space Shuttle Main Engine”, with existing Thesaurus concepts and links marked]

Page 13:

Subsumption Computation Algorithm

3. Add missing connections to nearest existing concepts

[Figure: the same lattice, with new links added from “Space Shuttle Main Engine” to its nearest existing concepts]
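Steps 2 and 3 can be sketched together. The thesaurus contents and the word generalization table below are hypothetical. The rule for labeling a link (a candidate that keeps the head word is a generalization, anything else is related-to) is an assumption that happens to agree with the “Space Shuttle Main Engine” example on Page 10; the slides do not spell the rule out.

```python
# Hypothetical word-level generalizations (not taken from the thesaurus).
WORD_GENERALIZATIONS = {"shuttle": "vehicle"}

# Hypothetical set of concepts already present in the thesaurus.
THESAURUS = {"space shuttle", "space vehicle", "vehicle", "engine",
             "combustion"}


def is_generalization(general, specific):
    """True if `general` can be reached from `specific` by chopping words
    and/or replacing words with their generalizations."""
    g, s = general.lower().split(), specific.lower().split()
    if not g or g == s:
        return False
    i = 0  # index of the next word of `general` still to be matched
    for word in s:
        if i < len(g) and g[i] in (word, WORD_GENERALIZATIONS.get(word)):
            i += 1
    return i == len(g)


def nearest_existing(phrase, thesaurus):
    """Step 2: keep candidates that already exist in the thesaurus.
    Step 3: keep only the most specific ones and label each new link."""
    existing = [c for c in thesaurus if is_generalization(c, phrase)]
    nearest = [c for c in existing
               if not any(is_generalization(c, other) for other in existing)]
    head = phrase.lower().split()[-1]
    links = []
    for concept in sorted(nearest):
        # Assumption: keeping the head (last) word means generalization;
        # otherwise the concept is only related-to the phrase.
        kind = "generalization" if concept.split()[-1] == head else "related-to"
        links.append((concept, kind))
    return links


print(nearest_existing("Space Shuttle Main Engine", THESAURUS))
```

With this toy thesaurus, “vehicle” and “space vehicle” are skipped because the more specific “space shuttle” already exists, leaving a generalization link to “engine” and a related-to link to “space shuttle”.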

Page 14:

Some Example Inferred Links

[Figure: inferred links among the concepts Equipment, Measuring Instruments, Optical Measuring Instruments, Distance Measuring Equipment, Range Finders, Optical Range Finders; and among Halogen Compounds, Fluorine Compounds, Nitrogen Fluorine Compounds, Fluorides, Nitrogen Fluorides]

• 21,000 generalization/specialization and 37,000 related-to links added

• Number of “orphans” down from 40% to 13%

Page 15:

Enhancing the Thesaurus: 2. Use NLP to refine the “related-to” links

Current: Metal Tube related-to Metal
New: Metal Tube made-of Metal

• 27 relationship types chosen (causes, location, …)
• Heuristic noun-noun rules select the relationship, e.g.

For compound “X Y” (e.g. “metal tube”):
IF X is a Material
AND Y is a Physical-Object
THEN Y made-of X

• Can use relation type to help compute semantic distance
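The noun-noun rule above translates almost directly into code. In this sketch the type assignments are a hypothetical flat table; in practice the types would presumably come from the generalization links in the thesaurus. Only the single made-of rule from the slide is implemented.

```python
# Hypothetical type assignments for individual words.
CONCEPT_TYPES = {
    "metal": "Material",
    "wood": "Material",
    "tube": "Physical-Object",
    "panel": "Physical-Object",
}


def refine_related_to(compound, types=CONCEPT_TYPES):
    """Apply the rule: for compound "X Y", if X is a Material and Y is a
    Physical-Object, then Y made-of X. Returns a triple or None."""
    words = compound.lower().split()
    if len(words) != 2:
        return None  # the rule only covers two-word compounds
    x, y = words
    if types.get(x) == "Material" and types.get(y) == "Physical-Object":
        return (y, "made-of", x)
    return None  # no rule fired; the link stays a generic related-to


print(refine_related_to("Metal Tube"))
print(refine_related_to("Jet spray"))
```

A fuller version would hold one such rule per relationship type and try them in order, falling back to related-to when none applies.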

Page 16:

Enhancing the Thesaurus: 3. Knowledge from Text

Definition: “Flap: A movable airfoil attached to an airplane’s wing, and used to increase lift or drag.”

NLP:

Flap
  isa: Airfoil
  attribute: Movable
  attached-to: Wing
  part-of: Airplane
  purpose: Increase
    object: Lift, Drag

[Figure: the resulting semantic-net fragment. Flap is linked to Airfoil (isa), Movable (attribute), Wing (attached-to), Airplane (part-of), and two Increase nodes (purpose) whose objects are Lift and Drag; links labeled rt and bt also appear.]
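Once a definition has been parsed into a frame, turning it into semantic-net links is mechanical. The sketch below assumes the frame is already available as a dictionary (the NLP step itself is not shown) and flattens it into (source, relation, target) triples. The data layout and function name are illustrative assumptions.

```python
# The Flap frame from the slide, written as nested dictionaries.
# The slide's figure draws two separate Increase nodes (one per object);
# this sketch merges them into one for brevity.
FRAMES = {
    "Flap": {
        "isa": ["Airfoil"],
        "attribute": ["Movable"],
        "attached-to": ["Wing"],
        "part-of": ["Airplane"],
        "purpose": ["Increase"],
    },
    "Increase": {
        "object": ["Lift", "Drag"],
    },
}


def frames_to_links(frames):
    """Flatten frames into (source, relation, target) triples."""
    return [
        (head, slot, value)
        for head, slots in frames.items()
        for slot, values in slots.items()
        for value in values
    ]


for link in frames_to_links(FRAMES):
    print(link)
```

Each triple can then be added to the net alongside the thesaurus's own links.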

Page 17:

Status and Evaluation

• The Applications
– Two “Expert Locators” deployed and in use
– Sustained usage (~20 searches / day)
– Plans to quickly expand them further
    • more experts
    • also cover projects and work groups
    • add in attribute filters (years at Boeing, location, …)
• How do the Thesaurus Enhancements Affect Search?
– Study: Expert assessed relevance of “hit” concepts
– Recall increased (44% → 75%) with only minimal effect on precision (58% → 57%)
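For reference, the standard definitions of the two measures can be written as a short function. The slides do not describe how the study computed its figures, so this is only the textbook formulation, and the example sets are made up.

```python
def precision_recall(retrieved, relevant):
    """Precision = relevant hits / everything retrieved.
    Recall    = relevant hits / everything relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: 4 concepts returned, 3 concepts judged relevant,
# 2 of the returned ones are among the relevant ones.
p, r = precision_recall({"a", "b", "c", "d"}, {"a", "b", "e"})
print(p, r)
```

Adding links to the thesaurus lets a search reach more concepts, which tends to raise recall; precision drops only if the newly reached concepts are mostly irrelevant.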

Page 18:

Discussion

• Does “number N of links” indicate “relevance”?
– only for very small N!
• The useful bias of a domain-specific Thesaurus:
– only contains relevant concepts
    • massively reduces errors in Thesaurus enhancement
– only contains relevant links
    • provides very domain-specific search
• Limitations:
– ignored “quality” of expert, social issues, etc.
– what if the concept you want isn’t there?
• Generality: Applies to any resource, not just experts

Page 19:

Summary

• Search using concepts, not words
• Use of a thesaurus as an initial ontology:
– Can leverage many years of work by librarians
– Made viable using simple AI techniques of
    • search
    • subsumption computation
    • language processing
• Domain-specific thesauri provide valuable bias

Page 20:

End - demo in PPT follows
