Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding Delphine Bernhard and Iryna Gurevych Ubiquitous Knowledge Processing (UKP) Lab Computer Science Department Technische Universität Darmstadt, Hochschulstraße 10 D-64289 Darmstadt, Germany ACL 2009 Reporter: Kan-Wen Tien Date: 2009.10.22

Dec 29, 2015


Joy Byrd
Page 1

Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding

Delphine Bernhard and Iryna Gurevych
Ubiquitous Knowledge Processing (UKP) Lab
Computer Science Department
Technische Universität Darmstadt, Hochschulstraße 10
D-64289 Darmstadt, Germany

ACL 2009

Reporter: Kan-Wen Tien
Date: 2009.10.22

Page 2

Outline

• Introduction
• Related Work
• Parallel Datasets
• Semantic Relatedness Experiments
• Answer Finding Experiments
• Conclusion

Page 3

• Introduction
• Related Work
• Parallel Datasets
• Semantic Relatedness Experiments
• Answer Finding Experiments
• Conclusion

Page 4

Introduction

• Lexical gap between queries and documents or questions and answers

• Several solutions:
– Query reformulation, query paraphrasing
– Query expansion
– Semantic information retrieval

Page 5

Introduction

• Several solutions:
– Integrate monolingual statistical translation models in the retrieval process (Berger and Lafferty, 1999)
• Drawback: limited availability of truly parallel monolingual corpora
• Training data often consist of question-answer pairs, usually extracted from the evaluation corpus itself

Page 6

• Introduction
• Related Work
• Parallel Datasets
• Semantic Relatedness Experiments
• Answer Finding Experiments
• Conclusion

Page 7

Related Work

• Statistical translation models for retrieval
• Built synthetic training data
• Train translation models on Q&A pairs
– Answers -> source language
– Questions -> target language

• Select the most important terms to build compact translation models

Page 8

• Introduction
• Related Work
• Parallel Datasets
• Semantic Relatedness Experiments
• Answer Finding Experiments
• Conclusion

Page 9

Parallel Datasets

• Different data resources:

(1) Manually-tagged question reformulations and question-answer pairs from the WikiAnswers social Q&A site

(2) Glosses from WordNet, Wiktionary, Wikipedia and Simple Wikipedia

Page 10
Page 11

Parallel Datasets

(1) Manually-tagged question reformulations and question-answer pairs

• From social Q&A sites: WikiAnswers (WA)
– Question-Answer Pairs (WAQA)

– Question Reformulations (WAQ)

[URL]

Page 12

Parallel Datasets

(2) Glosses from WordNet, Wiktionary, Wikipedia and Simple Wikipedia

• Lexical Semantic Resources (LSR)
– Word sense alignment

• Example!

Page 13

Parallel Datasets

• Example: “moon”
– WordNet (sense 1): The natural satellite of the Earth.
– English Wiktionary: The Moon, the satellite of planet Earth.
– English Wikipedia: The Moon (Latin: Luna) is Earth’s only natural satellite and the fifth largest natural satellite in the Solar System.

Page 14

Parallel Datasets

Three datasets:
• Question-Answer Pairs (WAQA): 1,227,362 parallel pairs
• Question Reformulations (WAQ): 4,379,620 parallel pairs
• Lexical Semantic Resources (LSR): 397,136 pairs

Page 15

Parallel Datasets

• Translation Model Training
– Pre-processing steps

– GIZA++ SMT Toolkit -> word-to-word translation probabilities

– IBM translation model 1
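
To illustrate what this training step estimates, here is a minimal Python sketch of IBM Model 1 trained with expectation-maximization on tokenized parallel pairs. It is not GIZA++ and not the authors' pipeline: the function name `train_ibm_model1`, the toy pairs, and the omission of the NULL word and of all pre-processing are simplifying assumptions.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate word-to-word probabilities P(target | source) with EM.

    pairs: list of (source_tokens, target_tokens) tuples.
    Returns a dict-like table keyed by (target_word, source_word).
    Simplification (assumption): no NULL source word is added.
    """
    target_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialization

    for _ in range(iterations):
        count = defaultdict(float)  # expected counts for (target, source)
        total = defaultdict(float)  # expected counts per source word
        for src, tgt in pairs:
            for w in tgt:
                # E-step: spread one unit of mass for w over the source words.
                norm = sum(t[(w, s)] for s in src)
                for s in src:
                    frac = t[(w, s)] / norm
                    count[(w, s)] += frac
                    total[s] += frac
        # M-step: renormalize the expected counts per source word.
        t = defaultdict(float, {(w, s): c / total[s] for (w, s), c in count.items()})
    return t


# Toy monolingual "parallel" pairs (hypothetical data): answer -> question.
toy_pairs = [
    (["moon", "satellite"], ["moon", "orbit"]),
    (["moon", "earth"], ["moon", "planet"]),
    (["star", "earth"], ["sun", "planet"]),
]
table = train_ibm_model1(toy_pairs)
print(table[("moon", "moon")], table[("orbit", "moon")])
```

In a monolingual setting the source and target sides share a vocabulary, so the table also contains self-translation entries such as P(moon | moon).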

Page 16

Parallel Datasets

• Combination of the datasets
– Lin (combination of models after training)

– Pool (concatenating the corpora before training)
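
The difference between the two strategies can be sketched in a few lines of Python. The function names, the table layout `{(target, source): probability}` and the equal default weights are assumptions made for illustration; the slide does not state how the models were weighted.

```python
from collections import defaultdict


def combine_linear(tables, weights=None):
    """'Lin': mix already-trained probability tables with a weighted sum.

    tables: list of dicts {(target_word, source_word): probability}.
    weights: one weight per table; equal weights are assumed if omitted.
    """
    if weights is None:
        weights = [1.0 / len(tables)] * len(tables)
    combined = defaultdict(float)
    for table, weight in zip(tables, weights):
        for pair, prob in table.items():
            combined[pair] += weight * prob
    return dict(combined)


def pool_corpora(corpora):
    """'Pool': concatenate the parallel corpora; one model is trained afterwards."""
    pooled = []
    for corpus in corpora:
        pooled.extend(corpus)
    return pooled


# Hypothetical tables from two separately trained models.
waq = {("moon", "luna"): 0.4}
lsr = {("moon", "luna"): 0.8, ("satellite", "luna"): 0.2}
print(combine_linear([waq, lsr]))
```

With Lin, a small dataset keeps its full weight in the mix; with Pool, a large corpus can dominate the counts seen during training.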

Page 17

Parallel Datasets

Page 18

Parallel Datasets

Page 19

Parallel Datasets

Page 20

• Introduction
• Related Work
• Parallel Datasets
• Semantic Relatedness Experiments
• Answer Finding Experiments
• Conclusion

Page 21

Semantic Relatedness Experiments

• Goal: Word translation probabilities vs. concept vector based measure

• Concept vector based measure relying on Explicit Semantic Analysis (Gabrilovich and Markovitch, 2007)

• Compare with traditional semantic relatedness measures
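
A concept vector based measure represents each word as a weighted vector over concepts (in Explicit Semantic Analysis, Wikipedia articles) and scores relatedness as the cosine between two such vectors. The sketch below shows only that cosine step; the concept names and weights are invented, and building the real vectors is outside its scope.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors given as {concept: weight}."""
    dot = sum(weight * v.get(concept, 0.0) for concept, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_u * norm_v)


# Hypothetical concept vectors (concept names and weights are made up).
moon = {"Moon": 0.9, "Natural satellite": 0.7, "Apollo program": 0.4}
satellite = {"Natural satellite": 0.8, "Moon": 0.5, "Sputnik 1": 0.6}
print(cosine(moon, satellite))
```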

Page 22
Page 23

Semantic Relatedness Experiments

Page 24

Semantic Relatedness Experiments

• Testing data set: 353 word-to-word pairs
– Created by Finkelstein et al. (2002)
– Fin1-153: 153 pairs
– Fin2-200: 200 pairs

Page 25

Semantic Relatedness Experiments

• Testing data set: 353 word-to-word pairs
– Created by Finkelstein et al. (2002)
– Fin1-153: 153 pairs
– Fin2-200: 200 pairs

Page 26

Semantic Relatedness Experiments

• Use Spearman’s rank correlation coefficient, which ranges from -1 (reversed ranking) through 0 (no correlation) to +1 (identical ranking)

[URL]

Page 27

Semantic Relatedness Experiments

• Use Spearman’s rank correlation coefficient, which ranges from -1 (reversed ranking) through 0 (no correlation) to +1 (identical ranking)

[URL]
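
For reference, Spearman's coefficient is the Pearson correlation computed on ranks instead of raw scores. The following is a small self-contained Python sketch; the function names and the toy scores are illustrative, and tied values receive the average of their ranks.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if one input is constant (correlation undefined).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Hypothetical human judgements vs. model scores for four word pairs.
human = [7.5, 3.0, 9.1, 1.2]
model = [0.30, 0.12, 0.25, 0.01]
print(spearman(human, model))
```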

Page 28

• Introduction
• Related Work
• Parallel Datasets
• Semantic Relatedness Experiments
• Answer Finding Experiments
• Conclusion

Page 29

Answer Finding Experiments

• Goal: provide an extrinsic evaluation of the translation probabilities by employing them in an answer finding task.

• Using a ranking function to perform retrieval

Page 30

Answer Finding Experiments

• Ranking function (β = 0.8, λ = 0.5)

Page 31

Answer Finding Experiments

• Ranking function (β = 0.8, λ = 0.5)

Page 32

Answer Finding Experiments

• Ranking function (β = 0.8, λ = 0.5)

Query likelihood model
Translation model
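
The ranking formula was an image on the slide and did not survive in this transcript. The Python sketch below assumes a common translation-based language model form: β mixes a maximum-likelihood term with a translation term, and λ smooths the result with a collection model. That form is consistent with the query likelihood baseline being the β = 0 case, but the exact formula, the function name `score_answer` and the toy data are assumptions.

```python
import math
from collections import Counter


def score_answer(question, answer, coll_counts, coll_size, trans_prob,
                 beta=0.8, lam=0.5):
    """Log-probability of generating the question from the answer (assumed form).

    P(w|D)    = (1 - lam) * P_mx(w|D) + lam * P_ml(w|C)
    P_mx(w|D) = (1 - beta) * P_ml(w|D) + beta * sum_t P(w|t) * P_ml(t|D)
    trans_prob maps (question_word, answer_word) to P(question_word | answer_word).
    """
    answer_counts = Counter(answer)
    answer_len = len(answer)
    log_score = 0.0
    for w in question:
        p_ml = answer_counts[w] / answer_len            # query likelihood part
        p_tr = sum(trans_prob.get((w, t), 0.0) * c / answer_len
                   for t, c in answer_counts.items())   # translation part
        p_mx = (1.0 - beta) * p_ml + beta * p_tr
        p_coll = coll_counts.get(w, 0) / coll_size      # collection smoothing
        p = (1.0 - lam) * p_mx + lam * p_coll
        if p == 0.0:
            return float("-inf")
        log_score += math.log(p)
    return log_score


# Hypothetical toy data.
trans = {("luna", "moon"): 0.5}
collection = {"luna": 1, "moon": 3, "car": 2}
a1 = ["moon", "satellite", "earth"]
a2 = ["car", "engine"]
print(score_answer(["luna"], a1, collection, 10, trans))
print(score_answer(["luna"], a2, collection, 10, trans))
```

With β = 0 the translation term vanishes and only exact word matches plus collection smoothing contribute to the score.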

Page 33

Answer Finding Experiments

• Testing data: Microsoft Research QA Corpus
• 1,364 questions, 9,780 answers
• 5 levels of relevance judgements:
0: No Judgement Made
1: Exact Answer
3: Off Topic
4: On Topic, Off Target
5: Partial Answer

Page 34

Answer Finding Experiments

• Testing data: Microsoft Research QA Corpus
• 1,364 questions, 9,780 answers
• 5 levels of relevance judgements:
0: No Judgement Made
1: Exact Answer
3: Off Topic
4: On Topic, Off Target
5: Partial Answer

Page 35

Answer Finding Experiments

Page 36

Answer Finding Experiments

• Mean Average Precision (MAP)
• Mean R-Precision (R-prec)
• Baselines:
– Query likelihood model (QLM) ---> β = 0
– Lucene

Query likelihood model
Translation model
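
The two evaluation measures can be computed as in the Python sketch below, using their standard definitions. The function names and the toy rankings are illustrative; how relevance labels are mapped to "relevant" is not specified here and is left to the caller.

```python
def average_precision(ranked, relevant):
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def r_precision(ranked, relevant):
    """Precision within the top R results, where R = number of relevant items."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for item in ranked[:r] if item in relevant) / r


def mean_over_queries(metric, runs):
    """Average a per-query metric over (ranked_list, relevant_set) pairs."""
    return sum(metric(ranked, relevant) for ranked, relevant in runs) / len(runs)


# Hypothetical rankings for two questions.
runs = [
    (["a1", "a2", "a3", "a4"], {"a1", "a3"}),
    (["b1", "b2", "b3"], {"b2"}),
]
print(mean_over_queries(average_precision, runs))
print(mean_over_queries(r_precision, runs))
```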

Page 37

Answer Finding Experiments

Page 38

Answer Finding Experiments

Page 39

Answer Finding Experiments

Page 40

Answer Finding Experiments

Page 41

Answer Finding Experiments

Page 42
Page 43
Page 44
Page 45

• Introduction
• Related Work
• Parallel Datasets
• Semantic Relatedness Experiments
• Answer Finding Experiments
• Conclusion

Page 46

Conclusion

• Propose new kinds of datasets for training
• Provide the first intrinsic evaluation of word translation probabilities with respect to human relatedness rankings for reference word pairs

• Models based on translation probabilities for answer finding

Page 47

Thank you!