Jan 04, 2016
1
Multi-Perspective Question Answering Using the OpQA Corpus (HLT/EMNLP 2005)
Veselin Stoyanov Claire Cardie Janyce Wiebe
Cornell University University of Pittsburgh
2/18
Abstract
Introduction:
  Multi-Perspective Question Answering (MPQA)
  OpQA Corpus
Analysis of characteristics of opinion answers:
  Answer length
  Partial answers
  Syntactic constituent of the answer
Experiments on Filters:
  Subjectivity filters
  Opinion source filters
Conclusion
3/18
Introduction of MPQA
Fact-based QA:
  “When did McDonald’s open its first restaurant?”
  A lot of research has been done already
Multi-Perspective QA (MPQA):
  “How do the Chinese regard the human rights record of the United States?”
  Relatively little research has been done here
  (Will successful approaches in fact-based QA work well for MPQA?)
4/18
Introduction of OpQA Corpus
98 documents: June 2001~May 2002
Phrase-level opinion info
4 general and controversial topics:
  President Bush’s alternative to the Kyoto protocol (kyoto)
  The US annual human rights report (humanrights)
  The 2002 coup d’etat in Venezuela (venezuela)
  The 2002 elections in Zimbabwe and Mugabe’s reelection (mugabe)
  19~33 docs per topic
Questions:
  6~8 Qs per topic, distributed evenly
  30 Qs in total
5/18
Introduction of OpQA Corpus
Answers: every text segment that contributes to an answer to any of the 30 questions
Mark the minimum answer spans:
  “a Tokyo organization representing about 150 Japanese groups” is marked as “a Tokyo organization”
Partial answer:
  Lacks the specificity needed to constitute a full answer
    Q: “When was the Kyoto protocol ratified?”
    A: “before May 2004.” (when a specific date is known)
  Needs to be combined with at least one additional answer segment to fully answer the question
    Q: “Are the Japanese unanimous in their opposition of Bush’s position on the Kyoto protocol?”
    Answered only partially by a segment expressing a single opinion
6/18
Characteristics of opinion answers
Use the OpQA corpus to analyze and compare the characteristics of fact vs. opinion Qs.
Traditional QA architectures:
  IR module
  Linguistic filters
    Semantic filters: when → date/time; who → person/organization
    Syntactic filters: who → noun phrase
7/18
Answer length
Approximately twice as long as those of fact questions
Likely to span more than a single syntactic constituent, rendering the syntactic filters and the semantic filters less effective
8/18
Partial answers
Much more likely to represent partial answers rather than complete answers
Answer generator:
  Distinguish between partial and full answers
  Recognize redundant partial answers
  Identify which subset of the partial answers constitutes a full answer
  Determine whether additional documents need to be examined to find a complete answer
  Assemble the final answer from partial pieces of information
9/18
Syntactic constituent of the answer
Use Abney’s (1996) CASS partial parser, and count the number of times an answer segment for the question matches each constituent type
4 constituent types:
  noun phrase (n)
  verb phrase (v)
  prepositional phrase (p)
  clause (c)
10/18
Syntactic constituent of the answer
3 matching criteria:
  ex: the answer segment's span exactly corresponds to a constituent in the CASS output
  up: the constituent completely contains the answer and no more than three additional (non-answer) tokens
  up/dn: the answer matches according to the up criterion, or the answer completely contains the constituent and no more than three additional tokens
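The three matching criteria can be sketched as span comparisons. This is only an illustration, assuming answers and constituents are given as half-open token spans; the `Span` class and the function names are hypothetical and are not taken from the CASS parser or the OpQA tools.

```python
from dataclasses import dataclass

# Assumption: the slides' "no more than three additional tokens" limit.
MAX_EXTRA_TOKENS = 3


@dataclass(frozen=True)
class Span:
    """Hypothetical half-open token span [start, end) within a sentence."""
    start: int
    end: int

    @property
    def size(self) -> int:
        return self.end - self.start

    def contains(self, other: "Span") -> bool:
        return self.start <= other.start and other.end <= self.end


def matches_ex(answer: Span, constituent: Span) -> bool:
    # ex: the spans are identical.
    return answer == constituent


def matches_up(answer: Span, constituent: Span) -> bool:
    # up: the constituent covers the answer plus at most three extra tokens.
    extra = constituent.size - answer.size
    return constituent.contains(answer) and extra <= MAX_EXTRA_TOKENS


def matches_up_dn(answer: Span, constituent: Span) -> bool:
    # up/dn: up holds, or the answer covers the constituent
    # plus at most three extra tokens.
    extra = answer.size - constituent.size
    covers = answer.contains(constituent) and extra <= MAX_EXTRA_TOKENS
    return matches_up(answer, constituent) or covers


if __name__ == "__main__":
    answer = Span(2, 5)
    print(matches_ex(answer, Span(2, 5)))     # identical spans
    print(matches_up(answer, Span(1, 7)))     # three extra tokens
    print(matches_up(answer, Span(0, 7)))     # four extra tokens
    print(matches_up_dn(answer, Span(3, 5)))  # answer contains constituent
```

Under this reading, every ex match is also an up match, and every up match is also an up/dn match, so the criteria become progressively more lenient.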
Results
11/18
Characteristics of opinion answers: Overview
Approximately twice as long as those of fact questions
Much more likely to represent partial answers rather than complete answers
Vary much more widely with respect to syntactic category; in contrast, fact answers are overwhelmingly associated with noun phrases
Roughly half as likely to correspond to a single syntactic constituent type
12/18
Subjectivity Filters for MPQA Systems
3 subjectivity filters:
  Manual: consider a sentence to be opinion if it contains at least one opinion of intensity medium or higher, and to be fact otherwise
  Rulebased: use a bootstrapping algorithm to perform a sentence-based opinion classification
  Naive Bayes: train a Naive Bayes subjectivity classifier on the labeled set
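The Manual filter's rule can be written down directly. The sketch below is a minimal illustration, assuming each sentence comes with the list of intensities of its annotated opinions and that the intensity scale is low < medium < high < extreme; the scale ordering, the data layout, and the function names are assumptions, not the corpus API.

```python
# Assumed intensity scale, ordered from weakest to strongest.
INTENSITY_ORDER = ("low", "medium", "high", "extreme")


def is_opinion_sentence(opinion_intensities, threshold="medium"):
    """Return True if at least one annotated opinion reaches the threshold."""
    min_rank = INTENSITY_ORDER.index(threshold)
    return any(
        INTENSITY_ORDER.index(intensity) >= min_rank
        for intensity in opinion_intensities
    )


def manual_filter(sentences):
    """Keep the text of sentences classified as opinion.

    `sentences` is a list of (text, opinion_intensities) pairs
    (a hypothetical layout chosen for this example).
    """
    return [
        text
        for text, intensities in sentences
        if is_opinion_sentence(intensities)
    ]


if __name__ == "__main__":
    sentences = [
        ("The report was released in March.", []),
        ("Officials mildly questioned the report.", ["low"]),
        ("Beijing strongly condemned the report.", ["low", "high"]),
    ]
    print(manual_filter(sentences))
```

A sentence with no opinion annotations, or with only low-intensity ones, is treated as fact and removed.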
13/18
Experiments on Subjectivity Filters
Answer rank experiments:
  Can subjectivity filters improve the answer identification phase?
  For each opinion Q, do the following:
Results
14/18
Experiments on Subjectivity Filters
Answer probability experiments:
  Can opinion information be used in an answer generator?
  Compute the probabilities:
Results: [table of probability comparisons not recovered]
15/18
Opinion Source Filters for MPQA Systems
Source filter: removes all sentences that do not have an opinion annotation with a source that matches the source of the question (manually identified)
Use Manual source annotation only
Answer rank experiment
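A source filter of this kind, followed by the rank lookup used in an answer rank experiment, might look like the sketch below. It assumes each ranked sentence carries the set of sources of its opinion annotations and that sources match by case-insensitive string equality; the `Sentence` class, its fields, and the matching rule are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sentence:
    """Hypothetical ranked sentence with its opinion-annotation sources."""
    text: str
    opinion_sources: frozenset
    is_answer: bool = False


def source_filter(ranked, question_source):
    """Drop sentences with no opinion whose source matches the question's."""
    target = question_source.strip().lower()
    return [
        s for s in ranked
        if any(src.strip().lower() == target for src in s.opinion_sources)
    ]


def rank_of_first_answer(ranked):
    """Return the 1-based rank of the first answer sentence, or None."""
    for rank, sentence in enumerate(ranked, start=1):
        if sentence.is_answer:
            return rank
    return None


if __name__ == "__main__":
    ranked = [
        Sentence("s1", frozenset({"South Africa"})),
        Sentence("s2", frozenset()),
        Sentence("s3", frozenset({"Mugabe"}), is_answer=True),
        Sentence("s4", frozenset({"mugabe"})),
    ]
    print(rank_of_first_answer(ranked))
    print(rank_of_first_answer(source_filter(ranked, "Mugabe")))
```

Because the filter only removes sentences, the first answer can only move up the list or disappear entirely, which is why a question with no clear source is a risk for this filter.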
16/18
Opinion Source Filters for MPQA Systems
Results:
  Outperforms the baseline on some questions and performs worse on others
  MRR is worse than the baseline (0.4633 vs. 0.4911)
  MRFA is the best (11.26 vs. 61.33): the ability to recognize the answers to the hardest questions
    M7: What did South Africa want Mugabe to do after the 2002 election? (rank: 153 → 21)
    M8: What is Mugabe’s opinion about the West’s attitude and actions towards the 2002 Zimbabwe election? (rank: 182 → 11)
  Exception:
    V3: Did anything surprising happen when Hugo Chavez regained power in Venezuela after he was removed by a coup?
    No clear source, only a single answer, opinion not clear...
  Always ranked an answer within the first 25 answers
  Especially useful in the additional processing phase
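The two measures can move in opposite directions, which a small computation makes concrete. The sketch below assumes MRR is the mean reciprocal rank of the first answer and takes MRFA to be the mean rank of the first answer; the ranks in the example are made up for illustration and are not the experiment's data.

```python
def mean_reciprocal_rank(first_answer_ranks):
    """Mean of 1/rank over questions (ranks are 1-based)."""
    return sum(1.0 / r for r in first_answer_ranks) / len(first_answer_ranks)


def mean_rank_of_first_answer(first_answer_ranks):
    """Mean of the raw first-answer ranks over questions."""
    return sum(first_answer_ranks) / len(first_answer_ranks)


if __name__ == "__main__":
    # Hypothetical first-answer ranks for three questions.
    baseline_ranks = [1, 2, 150]
    filtered_ranks = [2, 4, 10]

    for name, ranks in (("baseline", baseline_ranks),
                        ("filtered", filtered_ranks)):
        mrr = mean_reciprocal_rank(ranks)
        mrfa = mean_rank_of_first_answer(ranks)
        print(f"{name}: MRR={mrr:.4f} MRFA={mrfa:.2f}")
```

In this made-up case the filter loses a little on the easy questions, which lowers MRR, but rescues the hard question from rank 150, which lowers MRFA sharply. MRR is dominated by questions answered near the top of the list, while MRFA is dominated by the worst-ranked questions.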
17/18
Conclusion
Use the OpQA corpus to compare the characteristics of answers to fact and opinion questions
Surmise that traditional QA approaches may not be as effective for MPQA as they have been for fact-based QA
Investigate the use of machine learning and rule-based opinion filters and show that they can be used to guide MPQA systems
18/18
Q & A
19/18
Questions in the OpQA collection by topic
20/18
Syntactic Constituent Type
The % of correct answers that would remain after filtering
Roughly half as likely to correspond to a single syntactic constituent type
Vary much more widely with respect to syntactic category
21/18
Results for the subjectivity filters
No filtering at least as high as in the baseline