Ontology-Based Free-Form Query Processing for the Semantic Web Mark Vickers Brigham Young University MS Thesis Defense Supported by:

Dec 21, 2015

Page 1

Ontology-Based Free-Form Query Processing for the Semantic Web

Mark Vickers

Brigham Young University

MS Thesis Defense

Supported by:

Page 2

Presentation Overview

Web Queries

Explanation of AskOntos

Demo

Evaluation

Future Work and Conclusion

Page 3

Web Queries: Challenges

Example: Searching for a car

Cannot specify constraints

Documents returned (usually too many)

Takes time to read through documents

Determine relevance

Find information (price, year, etc.)

Page 4

Web Queries: Opportunities

Semantic web

Proposed ontology-based framework for making information machine-readable

Uses markup languages to identify information

“[A] search program can look for only those pages that refer to a precise concept…”

-Tim Berners-Lee

How should semantic web be searched?

Page 5

Solution: AskOntos – a Query System for the Semantic Web

Allows free-form queries over semantically annotated pages

Processes queries using information extraction

Returns tables of extracted values

Page 6

AskOntos Overview

Page 7

Extraction Ontologies

Object sets

Relationship sets

Participation constraints

Lexical

Non-lexical

Primary object set

Aggregation

Generalization/Specialization

Page 8

Extraction Ontologies

Data Frame:

Internal Representation: float

Value Phrase

Value Expression: \s*[$]\s*(\d{1,3})*(\.\d{2})?

Left Context: $

Key Word Phrase

Key Word Expression: ([Pp]rice)|([Cc]ost)| …

Operation Phrase

Operator: >

Expression: (more\s*than)|(more\s*costly)|…
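The data frame on this slide can be read as a small bundle of regular-expression recognizers. The sketch below is one possible way to model it, not the thesis's implementation: the `DataFrame` class, its field names, and the `recognize` helper are assumptions made for illustration. The expressions are copied from the slide, with the elided "…" alternatives left out.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DataFrame:
    """Recognizers for one lexical object set (hypothetical structure)."""
    name: str
    internal_representation: type
    value_expression: str
    keyword_expression: str
    operations: dict = field(default_factory=dict)  # operator -> phrase regex


# Expressions copied from the slide; the elided alternatives are omitted.
PRICE = DataFrame(
    name="Price",
    internal_representation=float,
    value_expression=r"\s*[$]\s*(\d{1,3})*(\.\d{2})?",
    keyword_expression=r"([Pp]rice)|([Cc]ost)",
    operations={">": r"(more\s*than)|(more\s*costly)"},
)


def recognize(frame, text):
    """Report which parts of the data frame fire on the given text."""
    value = re.search(frame.value_expression, text)
    keyword = re.search(frame.keyword_expression, text)
    operators = [op for op, phrase in frame.operations.items()
                 if re.search(phrase, text)]
    return {
        "value": value.group(0).strip() if value else None,
        "keyword": keyword.group(0) if keyword else None,
        "operators": operators,
    }


print(recognize(PRICE, "cost more than $500"))
```

For the text "cost more than $500", all three recognizers fire: the value phrase finds the dollar amount, the key word phrase finds "cost", and the operation phrase maps "more than" to the `>` operator.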

Page 9

Annotating Web Pages

Page 10

Annotating Web Pages

Page 11

Step 1. Parse Query

“Find me the price and mileage of all red Nissans – I want a 1996 or newer”

Words and phrases matched in the query:

price

mileage

red

Nissan

1996

or newer (>= Operator)
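The parsing step can be sketched as running every recognizer of an ontology over the query and recording what matched. The recognizer list below is hypothetical and far simpler than real extraction-ontology data frames; only the example query comes from the slides.

```python
import re

# Hypothetical, heavily simplified recognizers for a car-ads ontology:
# (object set, kind of phrase, regular expression).
CAR_RECOGNIZERS = [
    ("Price", "keyword", r"\b(price|cost)\b"),
    ("Mileage", "keyword", r"\b(mileage|miles)\b"),
    ("Color", "value", r"\b(red|blue|green|black|white)\b"),
    ("Make", "value", r"\b(Nissan|Toyota|Honda|Ford)"),
    ("Year", "value", r"\b(19|20)\d{2}\b"),
    ("Year", "operator>=", r"\bor\s+newer\b"),
]

QUERY = "Find me the price and mileage of all red Nissans – I want a 1996 or newer"


def parse_query(query, recognizers):
    """Return (object set, kind, matched text) triples in query order."""
    marks = []
    for object_set, kind, pattern in recognizers:
        for match in re.finditer(pattern, query, flags=re.IGNORECASE):
            marks.append((match.start(), object_set, kind, match.group(0)))
    marks.sort()  # order by position in the query
    return [(object_set, kind, text) for _, object_set, kind, text in marks]


for mark in parse_query(QUERY, CAR_RECOGNIZERS):
    print(mark)
```

On the example query this marks the same six items listed on the slide: two key words (price, mileage), three values (red, Nissan, 1996), and the phrase "or newer" as a `>=` operator on Year.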

Page 12

Step 2. Find Related Ontology

Similarity value: 5

Similarity value: 2

“Find me the price and mileage of all red Nissans – I want a 1996 or newer”
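One way to picture the selection step is to score each ontology by how many of its object sets are recognized in the query and pick the highest score. The slides do not say how the similarity value is computed, so the counting rule here is an assumption, and both toy ontologies (`CarAds`, `HouseAds`) and their patterns are hypothetical. They were chosen so the scores happen to match the 5 and 2 shown on the slide.

```python
import re

# Hypothetical recognizers: ontology -> {object set: regex}.
ONTOLOGIES = {
    "CarAds": {
        "Price": r"\bprice\b",
        "Mileage": r"\bmileage\b",
        "Color": r"\bred\b",
        "Make": r"\bNissan",
        "Year": r"\b(19|20)\d{2}\b",
    },
    "HouseAds": {
        "Price": r"\bprice\b",
        "Bedrooms": r"\bbedrooms?\b",
        "YearBuilt": r"\b(19|20)\d{2}\b",
    },
}

QUERY = "Find me the price and mileage of all red Nissans – I want a 1996 or newer"


def similarity(query, recognizers):
    """Assumed measure: number of object sets with at least one match."""
    return sum(1 for pattern in recognizers.values()
               if re.search(pattern, query, re.IGNORECASE))


def find_related_ontology(query, ontologies):
    """Return the best-scoring ontology name and all scores."""
    scores = {name: similarity(query, recs) for name, recs in ontologies.items()}
    best = max(scores, key=scores.get)
    return best, scores


print(find_related_ontology(QUERY, ONTOLOGIES))
```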

Page 13

Step 3. Formulate XQuery Expression

Conjunctive and aggregate queries run over selected ontology’s extracted values

Value-phrase-matching words determine conditions

Conditions:

Color = “red”

Make = “Nissan”

Year >= 1996 (>= Operator)
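Deriving the conditions from the matched phrases can be sketched as follows. The input format (triples of object set, kind, and text) and the rule that an operator phrase applies to the value of the same object set are assumptions for illustration; a matched value with no operator phrase defaults to equality.

```python
def formulate_conditions(marks):
    """Turn matched phrases into (object set, operator, value) conditions.

    marks: list of (object_set, kind, text), where kind is 'keyword',
    'value', or 'operator<op>' (hypothetical encoding).
    """
    # Collect any operator recognized for each object set.
    operators = {}
    for object_set, kind, _ in marks:
        if kind.startswith("operator"):
            operators[object_set] = kind[len("operator"):]

    # Each matched value becomes a condition; equality is the default.
    conditions = []
    for object_set, kind, text in marks:
        if kind == "value":
            conditions.append((object_set, operators.get(object_set, "="), text))
    return conditions


MARKS = [
    ("Price", "keyword", "price"),
    ("Mileage", "keyword", "mileage"),
    ("Color", "value", "red"),
    ("Make", "value", "Nissan"),
    ("Year", "value", "1996"),
    ("Year", "operator>=", "or newer"),
]

print(formulate_conditions(MARKS))
```

Key words such as "price" and "mileage" produce no condition here; in the example they only identify what the user wants returned.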

Page 14

Step 3. Formulate XQuery Expression

For

Let

Where

Return
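As a rough illustration of the Where and Return parts, the sketch below builds clause strings in the shape of the XQuery example shown on the Query Translation Metrics slide (a condition that also accepts an empty value, and a `Record` element with an `ID` of `{$id}`). The For and Let clauses are left out because the slides elide them. The function names are hypothetical.

```python
def where_clause(conditions):
    """Build a where clause; each condition also accepts a missing value."""
    parts = [f'(${name}{op}"{value}" or empty(${name}))'
             for name, op, value in conditions]
    return "where " + " and ".join(parts)


def return_clause(names):
    """Build a return clause with one child element per requested name."""
    fields = " ".join(f"<{n}>{{${n}}}</{n}>" for n in names)
    return f'return <Record ID="{{$id}}"> {fields} </Record>'


conditions = [("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", ">=", "1996")]
print(where_clause(conditions))
print(return_clause(["Price", "Mileage", "Color", "Make", "Year"]))
```

The values are compared as quoted strings, as in the slide's example; a real translator would presumably use each data frame's internal representation to decide on typed comparison.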

Page 15

Step 4. Run XQuery Expression Over Ontology’s Extracted Data

Uses Qexo 1.7, GNU’s XQuery engine for Java

Orders results according to number of values
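The ordering rule can be sketched in a few lines. This assumes that a result is a record of named values, that missing values are `None` or empty strings, and that records with more values come first; the slide states only that results are ordered by number of values, so the direction is a guess.

```python
def order_by_value_count(records):
    """Sort records so those with the most non-empty values come first."""
    def value_count(record):
        return sum(1 for value in record.values() if value not in (None, ""))
    # sorted() is stable, so ties keep their original relative order.
    return sorted(records, key=value_count, reverse=True)


results = [
    {"Price": "4500", "Mileage": None, "Color": "red"},
    {"Price": "6200", "Mileage": "81000", "Color": "red"},
    {"Price": None, "Mileage": None, "Color": "red"},
]

for record in order_by_value_count(results):
    print(record)
```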

Page 16

Demo

Page 17

Evaluation of AskOntos

Success Measure: ability to translate free-form queries into formal queries

Extraction ontologies: car ads, house ads, countries, movies, and diamond ads

3 rounds of testing

50 queries each (gathered from other CS students)

1st round discarded due to queries

Minor improvements on system between rounds

Page 18

Query Translation Metrics

“Find me the price and mileage of all red Nissans – I want a 1996 or newer.”

Automated conversion

for $doc in document("file:///.../Car.OWL")/rdf:RDF
for $Record in $doc/owl:Thing
…
where($Color="red" or empty($Color)) and ($Make="Nissan" or empty($Make)) and ($Year="1996" or empty($Year))
return <Record ID="{$id}"> <Price>{$Price}</Price> <Color>{$Color}</Color> <Make>{$Make}</Make> <Year>{$Year}</Year> </Record>

Return-Clause Names: {Price, Color, Make, Year}

Conditions: {(Color,=,“red”), (Make,=,“Nissan”), (Year,=,“1996”)}

Human conversion

Return-Clause Names: {Price, Mileage, Color, Make, Year}

Conditions: {(Color,=,“red”), (Make,=,“Nissan”), (Year,>=,“1996”)}

                      Precision  Recall
Return-Clause Names        100%     80%
Conditions                  66%     66%
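The precision and recall figures on this slide follow from comparing the automated conversion's sets against the human conversion's sets. A minimal sketch of that arithmetic, with a hypothetical helper name and the sets taken from the slide:

```python
def precision_recall(automated, human):
    """Precision and recall of the automated set against the human set."""
    automated, human = set(automated), set(human)
    correct = automated & human
    precision = len(correct) / len(automated) if automated else 0.0
    recall = len(correct) / len(human) if human else 0.0
    return precision, recall


auto_names = {"Price", "Color", "Make", "Year"}
human_names = {"Price", "Mileage", "Color", "Make", "Year"}

auto_conditions = {("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", "=", "1996")}
human_conditions = {("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", ">=", "1996")}

print(precision_recall(auto_names, human_names))            # names
print(precision_recall(auto_conditions, human_conditions))  # conditions
```

All four automated return-clause names are correct (precision 4/4) but Mileage is missing (recall 4/5). For conditions, two of three match in each direction, which is 2/3; the slide appears to show this truncated to 66%.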

Page 19

Results

Page 20

Result Analysis

Common reasons for errors:

1. Word not in lexicon:

“5 Bedrooms, 3 Bath, study, game room, 2 car garage, and < $250,000”

Page 21

Result Analysis

2. Mistakes in regular expressions

“Which countries use the euro?”

Page 22

Result Analysis

3. Not enough context:

“What are the models from 2005”

Page 23

Conclusion/Contributions

AskOntos

Is a free-form query system for the semantic web

Applies information extraction for query processing

Answers questions with extracted data values

Contributions

Web queries that use semantic annotations

Web queries returning answers from extracted data

Processing free-form queries using ontologies

Page 24

Future Work

Disjunction and negation

Fuzzy queries

Spellchecker

Page 25

Page 26

TREC 2004 QA Question Topics

Page 27

Related Research

QUEST (1999)

Similarities:
• Uses ontologies

Differences:
• Graphic-based interface
• Returns generated documents and graphs

SHOE (2000)

Similarities:
• Returns tables of data

Differences:
• Form-based interface

AQUA (2004)

Similarities:
• Natural language interface
• Uses ontology as part of query translation process

Differences:
• For single domain environment
• Part-of-speech recognition
• Uses ontology for term replacement
• Returns passages

Page 28

Related Research

Bernstein et al. (2005)

Similarities:
• Natural language interface

Differences:
• Allows only subset of English (Attempto Controlled English) queries

SWSE (2005)

Similarities:
• Natural language interface
• Returns semantically annotated data
• No part-of-speech recognition

Differences:
• Query context found by matching RDF labels, comments and literals
• Uses WordNet

NaLIX (2006)

Similarities:
• Converts natural language query to same XML query language

Differences:
• Limited to parsing ability of MINIPAR
• For XML database
• Query terms expanded with WordNet

Page 29

Simple Multiple-Record Documents

Genealogy Domain – from Troy Walker’s thesis

VSM Separator Highest-Fanout Separator

           records  returned  correct  precision   recall
simple1         19        20       19     95.00%  100.00%
simple2         19        17       17    100.00%   89.47%
simple3         11        11       11    100.00%  100.00%
simple4          9         9        9    100.00%  100.00%
simple5         12        13       11     84.62%   91.67%
simple6         12        11       10     90.91%   83.33%
simple7         14        10       10    100.00%   71.43%
simple8          5         7        5     71.43%  100.00%
simple9         14        14       14    100.00%  100.00%
simple10        15        15       15    100.00%  100.00%
Total          130       127      121     95.28%   93.08%

           records  returned  correct  precision   recall
simple1         19        22       19     86.36%  100.00%
simple2         19        20        0      0.00%    0.00%
simple3         11        14       11     78.57%  100.00%
simple4          9        10        9     90.00%  100.00%
simple5         12        16       12     75.00%  100.00%
simple6         12        23        9     39.13%   75.00%
simple7         14        22       13     59.09%   92.86%
simple8          5        10        0      0.00%    0.00%
simple9         14        16       14     87.50%  100.00%
simple10        15        16        0      0.00%    0.00%
Total          130       169       87     51.48%   66.92%

Page 30

Complex Multiple-Record Documents

            records  returned  missed  extra  correct  precision   recall
complex1         10        10       0      0       10    100.00%  100.00%
complex2         15        15       0      0       15    100.00%  100.00%
complex3         12        12       0      0       12    100.00%  100.00%
complex4          7         9       1      3        6     66.67%   85.71%
complex5         16        15       1      0       15    100.00%   93.75%
complex6         15        16       2      3       13     81.25%   86.67%
complex7         13        12       1      0       12    100.00%   92.31%
complex8         10        10       0      0       10    100.00%  100.00%
complex9         19        20       1      2       18     90.00%   94.74%
complex10        10        10       1      1        9     90.00%   90.00%
complex11        15        11       4      0       11    100.00%   73.33%
complex12        15        15       0      0       15    100.00%  100.00%
complex13        11        11       0      0       11    100.00%  100.00%
complex14        16        18       1      3       15     83.33%   93.75%
complex15         8         8       2      2        6     75.00%   75.00%
complex16         8         9       0      1        8     88.89%  100.00%
complex17        10        11       0      0       11    100.00%  110.00%
complex18         4         1       3      0        1    100.00%   25.00%
complex19         8        11       0      3        8     72.73%  100.00%
complex20        16        13       4      1       12     92.31%   75.00%
Total           238       237      21     19      218     91.98%   91.60%
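The precision and recall columns in these tables follow from the counts: precision is correct divided by returned, and recall is correct divided by records. The sketch below recomputes them and adds a consistency flag, since a row cannot have more correct records than it has records or returned records. The function name and the flag are assumptions for illustration. Row complex17 as transcribed (10 records, 11 correct) fails that check and yields the 110.00% recall shown; the sketch reports this rather than correcting it.

```python
def score(records, returned, correct):
    """Return (precision %, recall %, counts_are_consistent)."""
    precision = correct / returned if returned else 0.0
    recall = correct / records if records else 0.0
    # correct can never exceed the records present or the records returned.
    consistent = correct <= min(records, returned)
    return round(100 * precision, 2), round(100 * recall, 2), consistent


print(score(238, 237, 218))  # Total row
print(score(7, 9, 6))        # complex4
print(score(10, 11, 11))     # complex17 as transcribed
```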

Page 31

Scaling to the Web

Ontologies crawl and harvest web pages

Ontologies extract values from pages

Ontologies indexed

Queries extracted by relevant ontologies

Rely on Google-like technology