Generative Distributional Models
Siva Reddy
Lexical Computing Ltd, UK
http://sketchengine.co.uk
University of Edinburgh
May 22 2012
in collaboration with Diana McCarthy
and also with: Spandana Gella, Ioannis Klapaftis, Suresh Manandhar
Compositionality in language
Data (rationale)
compound nouns containing two words
  no existing dataset with compositionality
  relatively simple since no morphological or syntactic variations
constituent scores with phrase level compositionality scores; examine the relation
balance data; examine score distribution
Compound Noun Set
90 compounds from four different classes - extracted semi-automatically
1 Both words are literal: swimming pool
2 First word is literal and second is non-literal: night owl
3 First word is non-literal and second literal: zebra crossing
4 Both words non-literal: smoking gun
Experimental Setup
Three tasks per compound
1 is the phrase literal?
2 is the first constituent used literally in the given phrase?
3 is the second constituent used literally in the given phrase?
Each task annotated by 30 random annotators out of 151 annotators
Total 8100 annotations (90 * 3 * 30 = 8100)
5 random examples from ukWaC (Ferraresi et al., 2008)
How literal is this phrase?
Sample examples at http://tinyurl.com/is-it-lit
web site:
Definitions:
1. a computer connected to the internet that maintains a series of web pages on the World Wide Web
Examples:
1. can simply update the firmware and modem drivers by downloading patches from the modem manufacturers web site . It may be best to contact the manufacturers of your modem in the first
2. up with the Government position here ( mainly pro-badger killing ) , visit the DEFRA web site , and use the search function to trace papers about badgers and tuberculosis . Action
3. of galaxy formation and evolution and of the enrichment of the intergalactic medium . This web site is part of a research project by Graham Thurgood who is a senior lecturer .
4. of use represent the complete and only statement of the terms of use of this web site . 4 . MyPortfolio within the Financial Organiser Friends Provident receives its data feed
5. Courts . If you require to contact us in regard to the content of this web site or with a view to obtaining consent from the University to use the material contained
Note: Please select the answers below carefully based on the definition which occurs frequently in the examples
Step 1: score of 0-5 for how literal is the use of "web" in the phrase "web site"
0 1 2 3 4 5
Please provide any comments in case you want to tell us about your judgement or any other queries/suggestions! Not Mandatory but helpful.
Submit
Annotation
No. of turkers participated    260
No. of them qualified          151
‘Spammers’ ρ <= 0              21
Turkers with ρ >= 0.6          81
annotations rejected           383
Table: Compounds with their constituent and phrase level mean±st. dev scores
Agreement: Spearman’s correlation
                                  highest ρ   avg. ρ
ρ for phrase compositionality     0.741       0.522
ρ for first word’s literality     0.758       0.570
ρ for second word’s literality    0.812       0.616
ρ all three tasks                 0.788       0.589
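The agreement figures above are Spearman rank correlations. The following is a minimal sketch of how one such ρ could be computed. It assumes (the slide does not say) that one annotator's scores are compared with the mean scores of the other annotators; the function names and the toy scores are illustrative and not taken from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Toy data (invented): one annotator's 0-5 scores for five compounds,
# and the mean score of the remaining annotators for the same compounds.
annotator = [5, 4, 1, 0, 3]
mean_of_others = [4.6, 4.1, 1.2, 0.4, 2.0]
rho = spearman_rho(annotator, mean_of_others)
```

Because only ranks matter, the annotator and the group mean agree perfectly here even though the raw scores differ.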
2 Compositionality in language
  Analysis on the Data
  Computational Models
3 Context Aware Composition
4 Generative Distributional Grammar
Compositionality in language Computational Models
Computational Models for Compositionality
Constituent based models
  determine the literality of each constituent
  use literality score of each constituent to predict phrase compositionality score
Composition function based models
  build a compositional model of a phrase using its constituents
  similarity between the composed model and phrase model gives phrase compositionality score
[Figure: w1 model, w2 model and phrase model for word1, word2 and the phrase; f, constituent score; sim = compositional score]
Constituent Based Models
s3 = f(s1, s2)
If a constituent word is used literally in a given compound it is likely that the compound and the constituent share common co-occurrences e.g. swimming in swimming pool.
Literality of a Constituent
s1 = sim(v1, v3); s2 = sim(v2, v3)
sim is Cosine Similarity.
        human judgments
        first constituent   second constituent
s1      0.616               –
s2      –                   0.707
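The literality scores s1 and s2 can be illustrated with a small sketch. It assumes each vector is a sparse map from co-occurring context words to weights; the counts below are invented for illustration, and the helper name `cosine` is not from the slides.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(feature, 0.0) for feature, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector shares nothing with any other vector
    return dot / (norm_u * norm_v)


# Invented co-occurrence counts, for illustration only.
v1 = {"water": 4.0, "swim": 3.0, "race": 2.0}    # swimming
v2 = {"water": 3.0, "table": 2.0, "car": 2.0}    # pool
v3 = {"water": 5.0, "swim": 2.0, "hotel": 3.0}   # swimming pool

s1 = cosine(v1, v3)  # literality of the first constituent
s2 = cosine(v2, v3)  # literality of the second constituent
```

With these toy counts the first constituent shares more co-occurrences with the compound than the second, so s1 comes out higher than s2.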
Composition Function based models
s3 = sim(v1 ⊕ v2, v3)
Mitchell and Lapata (2008); Widdows (2008); Erk and Padó (2008)
e.g. Traffic⊕Light is the meaning composed from Traffic and Light
⊕ is the composition function
simple addition and simple multiplication (Mitchell and Lapata, 2008)
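A minimal sketch of the two composition functions named above, assuming sparse dict vectors, weighted addition a·v1 + b·v2 and element-wise multiplication. The weights default to 1, and the Traffic and Light counts are invented; none of the function names come from the slides.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(feature, 0.0) for feature, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compose_add(v1, v2, a=1.0, b=1.0):
    """Weighted addition: every feature of either word survives."""
    features = set(v1) | set(v2)
    return {f: a * v1.get(f, 0.0) + b * v2.get(f, 0.0) for f in features}


def compose_mult(v1, v2):
    """Element-wise multiplication: only shared features survive."""
    return {f: v1[f] * v2[f] for f in set(v1) & set(v2)}


# Invented co-occurrence counts, for illustration only.
traffic = {"road": 4.0, "car": 3.0, "jam": 2.0}
light = {"lamp": 3.0, "road": 1.0, "red": 2.0}
traffic_light = {"road": 3.0, "red": 3.0, "car": 2.0}

s3_add = cosine(compose_add(traffic, light), traffic_light)
s3_mult = cosine(compose_mult(traffic, light), traffic_light)
```

Multiplication keeps only the contexts both words share, which makes it much sparser than addition on small vectors like these.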
Results for Computational Models
Phrase level correlations

Model       ρ
Constituent Based Models s3 = f(s1, s2)
ADD         0.686
MULT        0.670
COMB        0.682
WORD1       0.669
WORD2       0.515
Composition Function Based Models s3 = sim(v1 ⊕ v2, v3)
av1+bv2     0.714
v1v2        0.650
RAND        0.002
Findings
both types of models competitive
additive composition models best
Possible reasons
  constituent based models use contextual information of each constituent independently
  composition function models use contexts of both the constituents simultaneously
  perhaps contexts salient to both the words are important?
annotator  N                 N′                 rating
4          phone call        committee meeting  2
25         phone call        committee meeting  7
11         football club     league match       6
11         health service    bus company        1
14         company director  assistant manager  7
Table: Evaluation dataset of (Mitchell and Lapata, 2010)
108 compound noun pairs
7 annotators judge each pair for phrase similarity
Score range: 0-7
Context Aware Composition
Evaluation Setting: Phrase Similarity Task
Model’s phrase similarity prediction sim(⊕(N), ⊕(N′))
  i.e. the similarity between composed vectors
  sim is Cosine similarity
Correlation between model prediction scores and mean of human judgments
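The two quantities being correlated can be sketched as follows. This assumes simple addition as the composition function ⊕ (the talk also evaluates other choices) and uses invented word vectors. The ratings are one reading of the sample rows shown for the evaluation dataset, and all function names are illustrative.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(feature, 0.0) for feature, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compose_add(v1, v2):
    """Simple addition of two sparse vectors (assumed choice of ⊕)."""
    return {f: v1.get(f, 0.0) + v2.get(f, 0.0) for f in set(v1) | set(v2)}


def predict_similarity(n, n_prime, vectors):
    """sim(⊕(N), ⊕(N′)) for two-word compounds given as 'w1 w2' strings."""
    first = compose_add(*(vectors[w] for w in n.split()))
    second = compose_add(*(vectors[w] for w in n_prime.split()))
    return cosine(first, second)


def mean_ratings(judgments):
    """Average the human ratings given to each (N, N′) pair."""
    by_pair = defaultdict(list)
    for n, n_prime, rating in judgments:
        by_pair[(n, n_prime)].append(rating)
    return {pair: sum(rs) / len(rs) for pair, rs in by_pair.items()}


# Invented context vectors, for illustration only.
vectors = {
    "phone": {"ring": 3.0, "number": 2.0},
    "call": {"ring": 2.0, "make": 2.0},
    "committee": {"member": 3.0, "agenda": 2.0},
    "meeting": {"agenda": 3.0, "make": 1.0},
}
judgments = [
    ("phone call", "committee meeting", 2),
    ("phone call", "committee meeting", 7),
    ("football club", "league match", 6),
]

prediction = predict_similarity("phone call", "committee meeting", vectors)
human_mean = mean_ratings(judgments)[("phone call", "committee meeting")]
```

Repeating this for every pair gives two score lists, one from the model and one from the annotators, whose correlation is the evaluation figure.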
Simple Add
Simple Mult
Static Prototypes (not sense based)
Generative Distributional Grammar
Bibliography I
Bannard, C., Baldwin, T., and Lascarides, A. (2003). A statistical approach to the semantics of verb-particles. In Proceedings of the ACL 2003 workshop on Multiword expressions: analysis, acquisition and treatment - Volume 18, MWE ’03, pages 65–72, Stroudsburg, PA, USA. Association for Computational Linguistics.
Biemann, C. and Giesbrecht, E. (2011). Distributional semantics and compositionality 2011: Shared task description and results. In Proceedings of DISCo-2011 in conjunction with ACL 2011.
Clark, S. and Pulman, S. (2007). Combining symbolic and distributional models of meaning. In Proceedings of the AAAI Spring Symposium on Quantum Interaction, Stanford, CA, 2007, pages 52–55.
Curran, J. R. (2003). From distributional to semantic similarity. Technical report, PhD Thesis, University of Edinburgh.
Bibliography II
Erk, K. and Padó, S. (2008). A structured vector space model for word meaning in context. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP ’08, pages 897–906, Stroudsburg, PA, USA. Association for Computational Linguistics.
Ferraresi, A., Zanchetta, E., Baroni, M., and Bernardini, S. (2008). Introducing and evaluating ukWaC, a very large web-derived corpus of English. In Proceedings of the WAC4 Workshop at LREC 2008, Marrakesh, Morocco.
Firth, J. R. (1957). A Synopsis of Linguistic Theory, 1930-1955. Studies in Linguistic Analysis, pages 1–32.
Grefenstette, E. and Sadrzadeh, M. (2011). Experimental Support for a Categorical Compositional Distributional Model of Meaning. Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing.
Harris, Z. S. (1954). Distributional structure. Word, 10:146–162.
Bibliography III
Kilgarriff, A. (1997). I don’t believe in word senses. In Computers and the Humanities, 31(2):91–113.
Korkontzelos, I. and Manandhar, S. (2009). Detecting compositionality in multi-word expressions. In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, ACLShort ’09, pages 65–68, Stroudsburg, PA, USA. Association for Computational Linguistics.
McCarthy, D., Keller, B., and Carroll, J. (2003). Detecting a continuum of compositionality in phrasal verbs. In Proceedings of the ACL 2003 workshop on Multiword expressions: analysis, acquisition and treatment - Volume 18, MWE ’03, pages 73–80, Stroudsburg, PA, USA. Association for Computational Linguistics.
Mitchell, J. and Lapata, M. (2008). Vector-based Models of Semantic Composition. In Proceedings of ACL-08: HLT, pages 236–244, Columbus, Ohio. Association for Computational Linguistics.
Bibliography IV
Mitchell, J. and Lapata, M. (2010). Composition in distributional models of semantics. Cognitive Science.
Montague, R. (1970). Universal grammar. Theoria, 36(3):373–398.
Montague, R. (1973). The Proper Treatment of Quantification in Ordinary English. pages 221–242.
Partee, B. (1995). Lexical semantics and compositionality. L. Gleitman and M. Liberman (eds.) Language, which is Volume 1 of D. Osherson (ed.) An Invitation to Cognitive Science (2nd Edition), pages 311–360.
Pustejovsky, J. (1991). The generative lexicon. Computational Linguistics, 17.
Reddy, S., Klapaftis, I. P., McCarthy, D., and Manandhar, S. (2011a). Dynamic and static prototype vectors for semantic composition. In Proceedings of The 5th International Joint Conference on Natural Language Processing 2011 (IJCNLP 2011), Chiang Mai, Thailand.
Bibliography V
Reddy, S., McCarthy, D., and Manandhar, S. (2011b). An empirical study on compositionality in compound nouns. In Proceedings of The 5th International Joint Conference on Natural Language Processing 2011 (IJCNLP 2011), Chiang Mai, Thailand.
Rychlý, P. and Kilgarriff, A. (2007). An efficient algorithm for building a distributional thesaurus (and other sketch engine developments). In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, ACL ’07, pages 41–44, Stroudsburg, PA, USA. Association for Computational Linguistics.
Schütze, H. (1998). Automatic Word Sense Discrimination. Computational Linguistics, 24(1):97–123.
Bibliography VI
Venkatapathy, S. and Joshi, A. K. (2005). Measuring the relative compositionality of verb-noun (v-n) collocations by integrating features. In Proceedings of the joint conference on Human Language Technology and Empirical methods in Natural Language Processing, pages 899–906, Vancouver, B.C., Canada.
Widdows, D. (2008). Semantic vector products: Some initial investigations. In Second AAAI Symposium on Quantum Interaction, Oxford.