Question-Answering: Overview Ling573 Systems & Applications March 31, 2011

Dec 19, 2015


Question-Answering: Overview
Ling573 Systems & Applications
March 31, 2011

Quick Schedule Notes

No Treehouse this week!
CS Seminar: Retrieval from Microblogs (Metzler), April 8, 3:30pm; CSE 609

Roadmap

Dimensions of the problem
A (very) brief history
Architecture of a QA system
QA and resources
Evaluation
Challenges
Logistics check-in

Dimensions of QA

Basic structure:
  Question analysis
  Answer search
  Answer selection and presentation

Rich problem domain: tasks vary on
  Applications
  Users
  Question types
  Answer types
  Evaluation
  Presentation

Applications

Applications vary by:

Answer sources
  Structured: e.g., database fields
  Semi-structured: e.g., database with comments
  Free text
    Web
    Fixed document collection (typical TREC QA)
    Book or encyclopedia
    Specific passage/article (reading comprehension)

Media and modality:
  Within or cross-language; video/images/speech

Users

Novice
  Needs to understand capabilities/limitations of system

Expert
  Assumed familiar with capabilities
  Wants efficient information access
  May be desirable/willing to set up profile

Question Types

Could be factual vs opinion vs summary

Factual questions:
  Yes/no; wh-questions
  Vary dramatically in difficulty
    Factoid, list
    Definitions
    Why/how...
    Open ended: ‘What happened?’
  Affected by form
    ‘Who was the first president?’ vs ‘Name the first president’

Answers

Like tests!

Form:
  Short answer
  Long answer
  Narrative

Processing:
  Extractive vs generated vs synthetic

In the limit -> summarization
  What is the book about?

Evaluation & Presentation

What makes an answer good?
  Bare answer
  Longer with justification

Implementation vs usability
  QA interfaces still rudimentary
  Ideally should be interactive, support refinement, dialogic

(Very) Brief History

Earliest systems: NL queries to databases (1960s-70s)
  BASEBALL, LUNAR
  Linguistically sophisticated:
    Syntax, semantics, quantification, ...
  Restricted domain!

Spoken dialogue systems (Turing!, 70s-current)
  SHRDLU (blocks world), MIT’s Jupiter, lots more

Reading comprehension: (~2000)

Information retrieval (TREC); Information extraction (MUC)


General Architecture

Basic Strategy

Given a document collection and a query, execute the following steps:
  Question processing
  Document collection processing
  Passage retrieval
  Answer processing and presentation
  Evaluation

Systems vary in detailed structure and complexity

AskMSR: Shallow Processing for QA

[Figure: AskMSR system diagram with components labeled 1-5]


Deep Processing Technique for QA

LCC (Moldovan, Harabagiu, et al)

Query Formulation

Convert question to a form suitable for IR

Strategy depends on document collection
  Web (or similar large collection):
    ‘Stop structure’ removal:
      Delete function words, q-words, even low-content verbs
  Corporate sites (or similar smaller collection):
    Query expansion
      Can’t count on document diversity to recover word variation
      Add morphological variants, WordNet as thesaurus
      Reformulate as declarative: rule-based
        Where is X located -> X is located in
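
The two query formulation strategies above can be sketched in a few lines of Python. This is only an illustration: the stop list, the single rewrite rule, and the function names `keyword_query` and `reformulate` are assumptions made for this example, not part of any system discussed in the lecture.

```python
import re
from typing import List, Optional

# Hypothetical "stop structure" list: question words, function words,
# and a few low-content verbs. A real system would use a larger list.
STOP_STRUCTURE = {
    "who", "what", "where", "when", "which", "why", "how",
    "is", "are", "was", "were", "do", "does", "did",
    "the", "a", "an", "of", "in", "on", "to", "for",
}


def keyword_query(question: str) -> List[str]:
    """Stop-structure removal: keep only the content-bearing tokens."""
    tokens = re.findall(r"[A-Za-z0-9']+", question.lower())
    return [t for t in tokens if t not in STOP_STRUCTURE]


# One illustrative rewrite rule: "Where is X located?" -> "X is located in"
WHERE_LOCATED = re.compile(
    r"^where\s+is\s+(?P<x>.+?)\s+located\s*\??$", re.IGNORECASE
)


def reformulate(question: str) -> Optional[str]:
    """Rule-based declarative reformulation; None if no rule applies."""
    match = WHERE_LOCATED.match(question.strip())
    if match is None:
        return None
    return f"{match.group('x')} is located in"
```

For a web-scale collection the keyword list would be sent to the search engine as-is; the declarative string can be issued as a phrase query so that the answer is likely to appear immediately after it.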

Question Classification

Answer type recognition
  Who -> Person
  What Canadian city -> City
  What is surf music -> Definition

Identifies type of entity (e.g. Named Entity) or form (biography, definition) to return as answer
  Build ontology of answer types (by hand)
  Train classifiers to recognize
    Using POS, NE, words
    Synsets, hyper/hypo-nyms
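
As a rough illustration of answer type recognition, the sketch below uses hand-written rules as a stand-in for the trained classifier mentioned above. The type labels, the tiny head-noun table, and the function name `answer_type` are all hypothetical; a trained classifier would replace the rules with features such as POS tags, named entities, and WordNet synsets.

```python
import re

# Hypothetical, hand-built mapping from head nouns to answer types.
HEAD_NOUN_TYPES = {
    "city": "CITY",
    "country": "COUNTRY",
    "year": "DATE",
    "president": "PERSON",
}


def answer_type(question: str) -> str:
    """Guess the expected answer type from the question's surface form."""
    tokens = re.findall(r"[a-z]+", question.lower())
    if not tokens:
        return "UNKNOWN"
    first = tokens[0]
    if first in ("who", "whom"):
        return "PERSON"
    if first == "when":
        return "DATE"
    if first == "where":
        return "LOCATION"
    if first in ("what", "which"):
        # A typed head noun ("What Canadian city ...") decides the type.
        for tok in tokens[1:]:
            if tok in HEAD_NOUN_TYPES:
                return HEAD_NOUN_TYPES[tok]
        # "What is X" with no typed head noun is treated as a definition.
        if len(tokens) > 1 and tokens[1] in ("is", "are"):
            return "DEFINITION"
    return "UNKNOWN"
```

The rules are deliberately brittle (for instance, "What is a president?" would be labeled PERSON), which is one reason to train a classifier instead.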

Passage Retrieval

Why not just perform general information retrieval?
  Documents too big, non-specific for answers

Identify shorter, focused spans (e.g., sentences)
  Filter for correct type: answer type classification
  Rank passages based on a trained classifier
    Features:
      Question keywords, Named Entities
      Longest overlapping sequence
      Shortest keyword-covering span
      N-gram overlap b/t question and passage

For web search, use result snippets
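
Some of the passage ranking features listed above are easy to compute directly. The following is a minimal sketch, assuming pre-tokenized, lowercased input; the function names and the feature dictionary layout are illustrative, and the resulting values would be fed to whatever ranker is trained.

```python
from typing import Dict, List, Optional, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Set of distinct n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap(question: List[str], passage: List[str], n: int = 2) -> int:
    """Number of distinct n-grams shared by question and passage."""
    return len(ngrams(question, n) & ngrams(passage, n))


def shortest_cover(keywords: Set[str], passage: List[str]) -> Optional[int]:
    """Length of the shortest window containing every keyword, else None."""
    if not keywords or not keywords.issubset(passage):
        return None
    counts: Dict[str, int] = {}
    missing = len(keywords)
    best: Optional[int] = None
    left = 0
    for right, tok in enumerate(passage):
        if tok in keywords:
            counts[tok] = counts.get(tok, 0) + 1
            if counts[tok] == 1:
                missing -= 1
        # Shrink the window from the left while it still covers all keywords.
        while missing == 0:
            width = right - left + 1
            if best is None or width < best:
                best = width
            left_tok = passage[left]
            if left_tok in keywords:
                counts[left_tok] -= 1
                if counts[left_tok] == 0:
                    missing += 1
            left += 1
    return best


def passage_features(question: List[str], passage: List[str],
                     keywords: Set[str]) -> Dict[str, Optional[int]]:
    """Bundle the features for one (question, passage) pair."""
    return {
        "keyword_hits": sum(1 for t in passage if t in keywords),
        "bigram_overlap": ngram_overlap(question, passage, 2),
        "cover_span": shortest_cover(keywords, passage),
    }
```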

Answer Processing

Find the specific answer in the passage

Pattern extraction-based:
  Include answer types, regular expressions
  Similar to relation extraction:
    Learn relation b/t answer type and aspect of question
      E.g. date-of-birth/person name; term/definition
    Can use bootstrap strategy for contexts, like Yarowsky
      <NAME> (<BD>-<DD>) or <NAME> was born on <BD>
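
The two date-of-birth patterns above translate naturally into regular expressions. The sketch below hand-writes them rather than learning them by bootstrapping; the name and date sub-patterns and the function `birth_date` are simplifying assumptions for illustration only.

```python
import re
from typing import Optional

# Simplified assumption: a name is two or more capitalized words.
NAME = r"(?P<name>[A-Z][a-z]+(?: [A-Z][a-z]+)+)"

PATTERNS = [
    # <NAME> (<BD>-<DD>), with four-digit years
    re.compile(NAME + r" \((?P<bd>\d{4})\s*-\s*(?P<dd>\d{4})\)"),
    # <NAME> was born on <BD>, with dates like "December 10, 1815"
    re.compile(NAME + r" was born on (?P<bd>[A-Z][a-z]+ \d{1,2}, \d{4})"),
]


def birth_date(name: str, passage: str) -> Optional[str]:
    """Return the birth date matched for `name`, or None if no pattern fires."""
    for pattern in PATTERNS:
        for match in pattern.finditer(passage):
            if match.group("name") == name:
                return match.group("bd")
    return None
```

In a bootstrapped setting, known (name, birth date) pairs would be used to harvest contexts like these from text, and the most reliable contexts would then be kept as patterns.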

Resources

System development requires resources
  Especially true of data-driven machine learning

QA resources:
  Sets of questions with answers for development/test
    Specifically manually constructed/manually annotated
    ‘Found data’
      Trivia games!!!, FAQs, answer sites, etc.
      Multiple choice tests (IP???)
      Partial data: Web logs – queries and click-throughs

Information Resources

Proxies for world knowledge:
  WordNet: Synonymy; IS-A hierarchy
  Wikipedia
  Web itself...

Term management:
  Acronym lists
  Gazetteers
  ...

Software Resources

General: Machine learning tools

Passage/Document retrieval:
  Information retrieval engine:
    Lucene, Indri/Lemur, MG
  Sentence breaking, etc.

Query processing:
  Named entity extraction
  Synonymy expansion
  Parsing?

Answer extraction:
  NER, IE (patterns)

Evaluation

Candidate criteria:
  Relevance
  Correctness
  Conciseness:
    No extra information
  Completeness:
    Penalize partial answers
  Coherence:
    Easily readable
  Justification

Tension among criteria

Evaluation

Consistency/repeatability:
  Are answers scored reliably?

Automation:
  Can answers be scored automatically?
  Required for machine learning tune/test
    Short answer answer keys
      Litkowski’s patterns
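
Automatic scoring against pattern-based answer keys can be sketched as follows. The question ids and regular expressions here are made up for illustration and are not taken from any actual pattern file; the function name `is_correct` is likewise an assumption.

```python
import re
from typing import Dict, List

# Hypothetical answer key: question id -> regex patterns accepted as correct.
ANSWER_PATTERNS: Dict[str, List[str]] = {
    "q1": [r"\bGeorge\s+Washington\b", r"\bWashington\b"],
    "q2": [r"\b1809\b"],
}


def is_correct(qid: str, answer: str) -> bool:
    """An answer counts as correct if any pattern for its question matches."""
    patterns = ANSWER_PATTERNS.get(qid, [])
    return any(re.search(p, answer, re.IGNORECASE) for p in patterns)
```

Pattern matching is only a proxy for human judgment: it can accept a string that matches for the wrong reason and reject a correct answer phrased in an unexpected way.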

Evaluation

Classical:
  Return ranked list of answer candidates
  Idea: Correct answer higher in list => higher score
  Measure: Mean Reciprocal Rank (MRR)
    For each question,
      Get reciprocal of rank of first correct answer
        E.g. first correct answer at rank 4 => 1/4
        None correct => 0
    Average over all questions
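
The MRR computation described above is short enough to write out. This is a minimal sketch assuming correctness is decided by exact membership in a gold answer set; the function names and the input layout (a list of candidate-list/gold-set pairs) are choices made for this example.

```python
from typing import List, Set, Tuple


def reciprocal_rank(candidates: List[str], gold: Set[str]) -> float:
    """1/rank of the first correct candidate, or 0.0 if none is correct."""
    for rank, candidate in enumerate(candidates, start=1):
        if candidate in gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[List[str], Set[str]]]) -> float:
    """Average the reciprocal rank over all questions."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(c, g) for c, g in runs) / len(runs)
```

For example, three questions whose first correct answers appear at rank 1, at rank 4, and not at all give (1 + 1/4 + 0) / 3, roughly 0.417.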

Dimensions of TREC QA

Applications
  Open-domain free text search
  Fixed collections
  News, blogs

Users
  Novice

Question types
  Factoid -> List, relation, etc.

Answer types
  Predominantly extractive, short answer in context

Evaluation:
  Official: human; proxy: patterns

Presentation:
  One interactive track