Adaptor Grammars Ehsan Khoddammohammadi Recent Advances in Parsing Technology WS 2012/13 Saarland University 1

Dec 30, 2015

Page 1

Adaptor Grammars

Ehsan Khoddammohammadi

Recent Advances in Parsing Technology, WS 2012/13

Saarland University

Page 2

Outline

• Definition and motivation behind unsupervised grammar learning

• Non-parametric Bayesian statistics

• Adaptor grammars vs. PCFG

• A short introduction to the Chinese Restaurant Process

• Applications of adaptor grammars

Page 3

Unsupervised Learning

• How many categories of objects?

• How many features does an object have?

• How many words and rules are in a language?

Page 4

Grammar Induction

Goal:
– Study how a grammar and parses can be learnt from terminal strings alone

Motivation:
– Help us understand human language acquisition
– Induce parsers for low-resource languages

Page 5

Nonparametric Bayesian statistics

• Learning the things people learn requires using rich, unbounded hypothesis spaces

• Language learning is non-parametric inference: there is no (obvious) bound on the number of words, grammatical rules, or morphemes.

• Use stochastic processes to define priors on infinite hypothesis spaces

Page 6

Nonparametric Bayesian statistics

• Likelihood: how well the grammar describes the data

• Prior: encodes our knowledge or expectations of grammars before seeing the data
– Universal Grammar (very specific)
– Shorter grammars (general constraints)

• Posterior: shows the learner's uncertainty about which grammar is correct (a distribution over grammars)

$$\text{Posterior} \propto \text{Likelihood} \times \text{Prior}$$
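The relation between posterior, likelihood and prior can be made concrete with a small sketch. The two grammar names and all numbers below are hypothetical, chosen only to show a short grammar that the prior favours losing to a longer grammar that fits the data better.

```python
def posterior(prior, likelihood):
    """Normalise prior * likelihood over a finite set of hypotheses."""
    unnormalised = {g: prior[g] * likelihood[g] for g in prior}
    z = sum(unnormalised.values())
    return {g: p / z for g, p in unnormalised.items()}


# Hypothetical values: the prior prefers the shorter grammar,
# the likelihood prefers the longer one.
prior = {"short_grammar": 0.75, "long_grammar": 0.25}
likelihood = {"short_grammar": 0.01, "long_grammar": 0.05}

post = posterior(prior, likelihood)
for grammar, p in post.items():
    print(grammar, round(p, 3))
```

In a real grammar-induction setting the hypothesis space is not a two-element dictionary, so the normalising sum cannot be computed this way; that is why sampling or variational methods are used instead.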

Page 7

Is PCFG good enough for our purpose?

• A PCFG can be learnt in a Bayesian framework, but …

• Set of rules is fixed in standard PCFG estimation

• PCFG rules are “too small” to be effective units of generalization

How can we solve this problem?

Page 8

Two non-parametric Bayesian extensions to PCFGs

1. Let the set of non-terminals grow unboundedly:
– Start with a short, un-lexicalized grammar
– Split-join of non-terminals

2. Let the set of rules grow unboundedly:
– Generate new rules whenever they are needed
– Learn sub-trees and their probabilities (bigger units of generalization)

Page 9

Adaptor Grammars

• CFG rules are used to generate the trees, as in a CFG

• We have two types of non-terminals:
– Un-adapted (normal) non-terminals
  • Pick a rule and recursively expand its children, as in a PCFG
– Adapted non-terminals, which are expanded in one of two ways:
  • Pick a rule and recursively expand its children
  • Regenerate a previously generated tree (with probability proportional to the number of times it has already been generated)

• We have a Chinese Restaurant Process for each adapted non-terminal

Page 10

The Story of Adaptor Grammars

• In a PCFG, rules are applied independently of each other.

• The trees generated by an adaptor grammar are not independent.

• If an adapted sub-tree has been used frequently in the past, it is more likely to be used again.

• An un-adapted non-terminal $A$ expands using $A \to \beta$ with probability proportional to $\theta_{A \to \beta}$.

• An adapted non-terminal $A$ expands:
– to a sub-tree $\tau$ rooted in $A$ with probability proportional to the number of times $\tau$ was previously generated
– using $A \to \beta$ with probability proportional to $\alpha_A \theta_{A \to \beta}$
– $\alpha_A$ is the prior parameter.
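The generative story above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy rules (`Word`, `Stem`, `Suffix`), the class name `Adaptor`, and the helper names are all hypothetical, and the cache keeps one count per distinct tree, which matches "proportional to the number of times it was previously generated".

```python
import random

# Hypothetical toy PCFG: non-terminal -> [(rule probability, right-hand side)]
RULES = {
    "Word": [(1.0, ("Stem", "Suffix"))],
    "Stem": [(0.5, ("c", "a", "t")), (0.5, ("d", "o", "g"))],
    "Suffix": [(0.5, ("s",)), (0.5, ())],
}


class Adaptor:
    """Cache of previously generated trees for one adapted non-terminal."""

    def __init__(self, alpha, base_sampler):
        self.alpha = alpha                # weight given to generating a fresh tree
        self.base_sampler = base_sampler  # draws a fresh tree from the PCFG rules
        self.counts = {}                  # tree -> times generated so far
        self.total = 0

    def sample(self, rng):
        # Reuse a cached tree with weight equal to its count,
        # or draw a fresh tree with weight alpha.
        r = rng.random() * (self.total + self.alpha)
        for tree, n in self.counts.items():
            r -= n
            if r < 0:
                break
        else:
            tree = self.base_sampler(rng)
        self.counts[tree] = self.counts.get(tree, 0) + 1
        self.total += 1
        return tree


def expand_with_rule(symbol, rng, adaptors):
    """Pick a rule for `symbol` and recursively expand its children."""
    probs, rhss = zip(*RULES[symbol])
    rhs = rng.choices(rhss, weights=probs)[0]
    return (symbol,) + tuple(expand(child, rng, adaptors) for child in rhs)


def expand(symbol, rng, adaptors):
    if symbol not in RULES:        # terminal symbol
        return symbol
    if symbol in adaptors:         # adapted non-terminal
        return adaptors[symbol].sample(rng)
    return expand_with_rule(symbol, rng, adaptors)  # un-adapted non-terminal


def leaves(tree):
    """Concatenate the terminals at the leaves of a tree."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[1:])


rng = random.Random(0)
adaptors = {}
adaptors["Word"] = Adaptor(
    alpha=1.0,
    base_sampler=lambda r: expand_with_rule("Word", r, adaptors),
)
words = [leaves(expand("Word", rng, adaptors)) for _ in range(20)]
print(words)
print(adaptors["Word"].counts)
```

Because every draw increments a count, trees that have been generated often become steadily more likely to be generated again, which is the dependence between trees that a plain PCFG lacks.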

Page 11

Properties of Adaptor grammars

• In adaptor grammars:
– The probabilities of adapted sub-trees are learnt separately; they are not just the product of the rule probabilities.
– "Rich get richer" (Zipf distribution)
– Useful compound structures can be more probable than their parts.
– There is no recursion amongst adapted non-terminals (an adapted non-terminal never expands to itself)

Page 12

The Chinese Restaurant Process

Page 13

The Chinese Restaurant Process

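• n customers walk into a restaurant and choose tables $z_i$ with probability

$$P(z_i = j \mid z_1, \ldots, z_{i-1}) = \begin{cases} \dfrac{n_j}{i - 1 + \alpha} & \text{existing table } j \\[4pt] \dfrac{\alpha}{i - 1 + \alpha} & \text{next unoccupied table} \end{cases}$$

where $n_j$ is the number of customers already seated at table $j$ and $\alpha$ is the concentration parameter.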

• Defines an exchangeable distribution over seating arrangements (inc. counts on tables)
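A short simulation may help make the seating rule and the exchangeability claim concrete. This is a sketch under the usual CRP definition, in which a customer joins an existing table with weight equal to its occupancy and opens a new table with weight alpha (the concentration parameter); the function names are illustrative.

```python
import random


def crp_seating(n, alpha, rng):
    """Seat n customers one by one; return the table index chosen by each."""
    counts = []    # counts[j] = customers currently at table j
    seating = []
    for i in range(n):                     # i customers are already seated
        r = rng.random() * (i + alpha)
        for j, n_j in enumerate(counts):
            r -= n_j
            if r < 0:                      # join existing table j
                counts[j] += 1
                break
        else:                              # open the next unoccupied table
            j = len(counts)
            counts.append(1)
        seating.append(j)
    return seating


def crp_probability(seating, alpha):
    """Joint probability of a seating sequence under the CRP."""
    counts = {}
    p = 1.0
    for i, j in enumerate(seating):
        weight = counts[j] if j in counts else alpha
        p *= weight / (i + alpha)
        counts[j] = counts.get(j, 0) + 1
    return p


seating = crp_seating(50, alpha=1.0, rng=random.Random(1))
print(seating)

# Exchangeability: the same table sizes (2 and 1) reached in a different
# order have the same probability.
p_a = crp_probability([0, 0, 1], alpha=1.0)
p_b = crp_probability([0, 1, 0], alpha=1.0)
print(p_a, p_b)
```

The two probabilities agree because the joint probability depends only on the final table sizes, not on the order in which customers arrived.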

Page 14

CRP

Page 15

CRP

Page 16

CRP

Page 17

CRP

Page 18

CRP

Page 19

Applications of adaptor grammars

Not used for parsing, because grammar induction is hard.

1. Word segmentation
2. Learning concatenative morphology
3. Learning the structure of named-entity noun phrases (NE NPs)
4. Topic modeling

Page 20

Unsupervised Word Segmentation

• Input: phoneme sequences with sentence boundaries

• Task: identify words

Page 21

Word segmentation with PCFG

Page 22

Unigram word segmentation

Page 23

Collocation word segmentation

Page 24

Performance

Generalization          Accuracy
Unigram                 56%
+ collocations          76%
+ syllable structure    87%

• Evaluated on the Brent corpus

Page 25

Morphology

• Input: raw text

• Task: identify stems and morphemes, and decompose a word into its morphological components

• Adaptor grammars can only be applied to simple concatenative morphology.

Page 26

CFG for morphological analysis

Page 27

Adaptor grammar for morphological analysis

Generated words:
1. cats
2. dogs
3. cats
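As a worked illustration of why the third word is cheap to generate, assume (the slide's figure did not survive, so this is an assumption) that the whole word is an adapted non-terminal with concentration parameter $\alpha$, and that the first two draws cached one tree for "cats" and one for "dogs". By the Chinese Restaurant Process rule, the third draw then has three options:

```latex
$$P(\text{reuse cached ``cats''}) = \frac{1}{2 + \alpha}, \qquad
  P(\text{reuse cached ``dogs''}) = \frac{1}{2 + \alpha}, \qquad
  P(\text{generate a fresh tree from the rules}) = \frac{\alpha}{2 + \alpha}$$
```

With $\alpha = 1$, for example, each cached tree is reused with probability $1/3$, without paying again for the individual rule applications inside it.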

Page 28

Performance

• For a more sophisticated model:

• 116,129 tokens: 70% correctly segmented

• 7,170 verb types: 66% correctly segmented

Page 29

Inference

• The distribution of adapted trees is exchangeable, so Gibbs sampling can be used

• A variational inference method is also available for learning adaptor grammars.

Covering this part is beyond the scope of this talk.

Page 30

Conclusion

• We are interested in inducing grammars without supervision for two reasons:
– Language acquisition
– Low-resource languages

• PCFG rules are too small to be effective units of generalization

• Learning the things people learn requires using rich, unbounded hypothesis spaces

• Adaptor grammars use the CRP to learn rules from these unbounded hypothesis spaces

Page 31

References

• Adaptor Grammars: A Framework for Specifying Compositional Nonparametric Bayesian Models, M. Johnson et al., Advances in Neural Information Processing Systems, 2007

• Using adaptor grammars to identify synergies in the unsupervised acquisition of linguistic structure, Mark Johnson, ACL-08: HLT, 2008

• Inferring Structure from Data, Tom Griffiths, Machine Learning Summer School, Sardinia, 2010

Page 32

Thank you for your attention!