Lecture 24: Distributional Word Similarity II

Topics: distributional word similarity example; PMI; context = syntactic dependencies

Readings: NLTK book Chapter 2 (WordNet); Text Chapter 20
April 15, 2013
CSCE 771 Natural Language Processing
– 2 – CSCE 771 Spring 2013
Overview

Last Time:
Finish up thesaurus-based similarity …
Distributional word similarity

Today: last lecture's slides 21 onward; distributional word similarity II; syntax-based contexts

Readings: Text Chapters 19, 20; NLTK Book Chapter 10

Next Time: Computational Lexical Semantics II
– 3 – CSCE 771 Spring 2013
Pointwise Mutual Information (PMI)

Mutual information (Church and Hanks 1989), eq. 20.36:

    I(X;Y) = \sum_x \sum_y P(x,y) \log_2 \frac{P(x,y)}{P(x)P(y)}

Pointwise mutual information (Fano 1961), eq. 20.37:

    \mathrm{PMI}(x,y) = \log_2 \frac{P(x,y)}{P(x)P(y)}

assoc-PMI, eq. 20.38:

    \text{assoc-PMI}(w,f) = \log_2 \frac{P(w,f)}{P(w)P(f)}
– 4 – CSCE 771 Spring 2013
Computing PPMI

Matrix F with W rows (words) and C columns (contexts); f_{ij} is the frequency of word w_i in context c_j.

    p_{ij} = \frac{f_{ij}}{\sum_{i=1}^{W}\sum_{j=1}^{C} f_{ij}} \qquad
    p_{i*} = \frac{\sum_{j=1}^{C} f_{ij}}{\sum_{i}\sum_{j} f_{ij}} \qquad
    p_{*j} = \frac{\sum_{i=1}^{W} f_{ij}}{\sum_{i}\sum_{j} f_{ij}}

    \mathrm{PPMI}_{ij} = \max\left(0,\ \log_2 \frac{p_{ij}}{p_{i*}\, p_{*j}}\right)
– 5 – CSCE 771 Spring 2013
Example computing PPMI

              computer   data   pinch   result   salt
apricot           0        0      1       0       1
pineapple         0        0      1       0       1
digital           2        1      0       1       0
information       1        6      0       4       0

(Word Similarity: Distributional Similarity I, NLP, Jurafsky & Manning)

p(w = information, c = data) = 6/19 = .32
p(w = information) = 11/19 = .58
p(c = data) = 7/19 = .37
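The computation above is mechanical enough to script. Below is a minimal sketch (not the lecture's code; plain Python 3) that reproduces these probabilities and the PPMI weights for the table:

# ppmi_example.py -- sketch reproducing the PPMI example above
from math import log2

words    = ["apricot", "pineapple", "digital", "information"]
contexts = ["computer", "data", "pinch", "result", "salt"]
F = [[0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1],
     [2, 1, 0, 1, 0],
     [1, 6, 0, 4, 0]]

total = sum(sum(row) for row in F)                     # 19
p_w = [sum(row) / total for row in F]                  # row (word) marginals
p_c = [sum(F[i][j] for i in range(len(F))) / total
       for j in range(len(contexts))]                  # column (context) marginals

def ppmi(i, j):
    """PPMI_ij = max(0, log2(p_ij / (p_i* * p_*j))); defined as 0 when f_ij = 0."""
    if F[i][j] == 0:
        return 0.0
    p_ij = F[i][j] / total
    return max(0.0, log2(p_ij / (p_w[i] * p_c[j])))

i, j = words.index("information"), contexts.index("data")
print(F[i][j] / total, p_w[i], p_c[j])   # ~0.32, ~0.58, ~0.37
print(ppmi(i, j))                        # log2(.32 / (.58 * .37)) ~ 0.57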
– 6 – CSCE 771 Spring 2013
Associations
– 8 – CSCE 771 Spring 2013
PMI: More data trumps smarter algorithms

"More data trumps smarter algorithms: Comparing pointwise mutual information with latent semantic analysis," Indiana University, 2009.
http://www.indiana.edu/~clcl/Papers/BSC901.pdf

"We demonstrate that this metric benefits from training on extremely large amounts of data and correlates more closely with human semantic similarity ratings than do publicly available implementations of several more complex models."
Figure 20.10 Co-occurrence vectors based on syntactic dependencies

Dependency-based parser: a special case of shallow parsing. Identify dependencies from "I discovered dried tangerines." (20.32)

Defining context using syntactic info:
• dependency parsing
• chunking

discover (subject I)          S -> NP VP
I (subject-of discover)
tangerine (obj-of discover)   VP -> Verb NP
tangerine (adj-mod dried)     NP -> Det? ADJ N
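None of the lecture's extraction code survives in the transcript. As an illustration, here is a small sketch that pulls (word, relation, head) context features from the example sentence using spaCy (an assumption; the lecture does not name a parser implementation, and spaCy's labels nsubj/dobj/amod differ from the slide's subject-of/obj-of/adj-mod):

# dep_contexts.py -- sketch: syntactic-dependency contexts with spaCy
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("I discovered dried tangerines.")

for token in doc:
    if token.dep_ in ("ROOT", "punct"):
        continue
    # Each dependency contributes two directed features, as in Fig. 20.10:
    # child(relation-of head) and head(relation child).
    print("%s (%s-of %s)" % (token.text, token.dep_, token.head.text))
    print("%s (%s %s)" % (token.head.text, token.dep_, token.text))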
– 11 – CSCE 771 Spring 2013
Figure 20.11 Objects of the verb drink (Hindle 1990, ACL)

• Frequencies: "it", "much", and "anything" are more frequent objects than "wine"
• PMI-Assoc: "wine" is more drinkable

Object          Count   PMI-Assoc
tea               4       11.75
Pepsi             2       11.75
champagne         4       11.75
liquid            2       10.53
beer              5       10.20
wine              2        9.34
water             7        7.65
anything          3        5.15
much              3        2.54
it                3        1.25
<some amount>     2        1.22
http://acl.ldc.upenn.edu/P/P90/P90-1034.pdf
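Values like those in the table come straight from eq. 20.38 applied to verb-object counts. A minimal sketch (the counts and corpus size below are made-up placeholders, not Hindle's data):

# assoc_pmi.py -- sketch: assoc-PMI for verb-object pairs (eq. 20.38)
from math import log2

def assoc_pmi(count_vo, count_v, count_o, n_pairs):
    """log2( P(v,o) / (P(v) * P(o)) ) estimated over verb-object pairs."""
    p_vo = count_vo / n_pairs
    p_v  = count_v / n_pairs
    p_o  = count_o / n_pairs
    return log2(p_vo / (p_v * p_o))

# Hypothetical counts, for illustration only:
print(assoc_pmi(count_vo=4, count_v=400, count_o=25, n_pairs=1_000_000))  # ~8.6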
– 12 – CSCE 771 Spring 2013
Vectors review

Dot product:
    \vec{v} \cdot \vec{w} = \sum_{i=1}^{N} v_i w_i

Length:
    |\vec{v}| = \sqrt{\sum_{i=1}^{N} v_i^2}

Cosine similarity:
    \mathrm{sim}_{\mathrm{cosine}}(\vec{v}, \vec{w}) = \frac{\vec{v} \cdot \vec{w}}{|\vec{v}|\,|\vec{w}|}
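A minimal sketch of these three operations in plain Python 3 (not the lecture's code):

# cosine.py -- dot product, vector length, cosine similarity
from math import sqrt

def dot(v, w):
    return sum(vi * wi for vi, wi in zip(v, w))

def length(v):
    return sqrt(dot(v, v))

def sim_cosine(v, w):
    return dot(v, w) / (length(v) * length(w))

# e.g. the "digital" and "information" rows from the PPMI example:
print(sim_cosine([2, 1, 0, 1, 0], [1, 6, 0, 4, 0]))  # ~0.67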
– 13 – CSCE 771 Spring 2013
Figure 20.12 Similarity of Vectors
– 14 – CSCE 771 Spring 2013
Fig 20.13 Vector Similarity Summary
– 15 – CSCE 771 Spring 2013
Figure 20.14 Hand-built patterns for hypernyms (Hearst 1992)

Finding hypernyms (IS-A links):
(20.58) One example of red algae is Gelidium.
Pattern: "one example of *** is a ***" (500,000 hits on Google)

Caveat: semantic drift in bootstrapping
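A toy sketch of matching one Hearst-style pattern with a regular expression (an illustration only, not Hearst's or the lecture's implementation; real systems match over parse trees, as in Snow et al. on the next slide):

# hearst_sketch.py -- toy regex for the "one example of X is Y" pattern
import re

PATTERN = re.compile(r"[Oo]ne example of (?P<hypernym>[\w ]+?) is (?P<hyponym>\w+)")

m = PATTERN.search("One example of red algae is Gelidium.")
if m:
    # Gelidium IS-A red algae
    print("%s IS-A %s" % (m.group("hyponym"), m.group("hypernym")))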
– 16 – CSCE 771 Spring 2013
Hyponym Learning Algorithm (Snow 2005)

Rely on WordNet to learn large numbers of weak hyponym patterns.

Snow's algorithm (a classifier sketch follows the list):
1. Collect all pairs of WordNet noun concepts <c_i IS-A c_j>.
2. For each pair, collect all sentences containing the pair.
3. Parse the sentences and automatically extract every possible Hearst-style syntactic pattern from the parse trees.
4. Use the large set of patterns as features in a logistic regression classifier.
5. Given each pair, extract features and use the classifier to determine whether the pair is a hypernym/hyponym.

New patterns learned:
    NP_H like NP        NP is a NP_H
    NP_H called NP      NP, a NP_H (appositive)
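Step 4 amounts to standard binary classification over pattern-indicator features. A minimal sketch with scikit-learn (an assumption; Snow et al. describe logistic regression but not this library), using made-up feature rows:

# snow_sketch.py -- logistic regression over Hearst-pattern features
from sklearn.linear_model import LogisticRegression

# One row per noun pair; one column per syntactic pattern
# (1 = pattern observed linking the pair somewhere in the corpus).
# Toy data, for illustration only:
X = [[1, 0, 1, 0],
     [0, 1, 0, 0],
     [1, 1, 1, 1],
     [0, 0, 0, 0]]
y = [1, 0, 1, 0]     # 1 = WordNet says c_i IS-A c_j

clf = LogisticRegression().fit(X, y)
print(clf.predict([[1, 0, 0, 1]]))        # hypernym/hyponym decision
print(clf.predict_proba([[1, 0, 0, 1]]))  # classifier confidence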
– 17 – CSCE 771 Spring 2013
Vector Similarities from Lin 1998

hope (N): …
# from the NLTK WordNet howto:
# http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html
motorcar = wn.synset('car.n.01')
types_of_motorcar = motorcar.hyponyms()
print sorted([lemma.name for synset in types_of_motorcar
              for lemma in synset.lemmas])
### 'a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds'
for lemma in wn.synset('stretch.v.02').lemmas:
    print lemma, lemma.frame_ids
    print lemma.frame_strings
– 28 – CSCE 771 Spring 2013
wn05-Similarity.py
### Section 5 Similarity
import nltk
from nltk.corpus import wordnet as wn
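The body of wn05-Similarity.py is not in the transcript. A minimal sketch of the WordNet similarity calls that section of the howto covers (standard NLTK methods, kept in the same Python 2 style as the scripts above):

from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
cat = wn.synset('cat.n.01')
print dog.path_similarity(cat)   # shortest hypernym-path similarity, in (0, 1]
print dog.lch_similarity(cat)    # Leacock-Chodorow (path length + taxonomy depth)
print dog.wup_similarity(cat)    # Wu-Palmer (depth of lowest common subsumer)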
wn06-AccessToAllSynsets.py
### Section 6 access to all synsets
import nltk
from itertools import islice
from nltk.corpus import wordnet as wn

for synset in list(wn.all_synsets('n'))[:10]:
    print synset

for synset in islice(wn.all_synsets('n'), 5):
    print synset, synset.hypernyms()
– 31 – CSCE 771 Spring 2013
wn07-Morphy.py
# WordNet in NLTK
# http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html
import nltk
from nltk.corpus import wordnet as wn
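The Morphy calls themselves are missing from the transcript. A minimal sketch of wn.morphy (a real NLTK function), showing the lemmatization this script presumably demonstrated:

from nltk.corpus import wordnet as wn

print wn.morphy('denied')         # 'deny'
print wn.morphy('churches')       # 'church'
print wn.morphy('book', wn.NOUN)  # 'book' (restrict lookup to a part of speech)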
wn.ic(corpus, weight_senses_equally=False, smoothing=1.0): creates an information content lookup dictionary from a corpus.
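For reference, the usual alternative to building IC from a corpus with wn.ic is NLTK's prebuilt wordnet_ic corpus. A minimal sketch of Resnik similarity with it (standard NLTK calls; requires nltk.download('wordnet_ic')):

from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

brown_ic = wordnet_ic.ic('ic-brown.dat')  # IC counts from the Brown corpus
print wn.synset('dog.n.01').res_similarity(wn.synset('cat.n.01'), brown_ic)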