Prototype-Driven Grammar Induction Aria Haghighi and Dan Klein Computer Science Division University of California Berkeley

Dec 15, 2015

Transcript
Page 1:

Prototype-Driven Grammar Induction

Aria Haghighi and Dan Klein

Computer Science Division

University of California Berkeley

Page 2:

Grammar Induction

DT NN VBD DT NN IN NN
The screen was a sea of red

Page 3:

First Attempt

DT NN VBD DT NN IN NN
The screen was a sea of red

Page 4:

Central Questions

• How do we specify what we want to learn?

• How do we fix observed errors?

What’s an NP?

That’s not quite it!

Page 5:

Experimental Set-up

• Binary Grammar
  • { X1, X2, … Xn } plus POS tags
  • [Figure: binary tree fragment over the symbols Xi, Xj, Xk]

• Data
  • WSJ-10 [7k sentences]
  • Evaluate on Labeled F1

• Grammar Upper Bound: 86.1
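
The slide names Labeled F1 as the metric but does not spell it out. A minimal sketch, assuming the usual definition over labeled brackets (a predicted bracket counts only if its label and span both match a gold bracket); the function name, the bracket encoding, and the example trees are illustrative, not taken from the talk:

```python
from collections import Counter

def labeled_f1(gold, predicted):
    """Labeled F1 over (label, start, end) brackets, compared as multisets."""
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: a bracket matches only if label and span agree.
    matched = sum((gold_counts & pred_counts).values())
    if matched == 0:
        return 0.0
    precision = matched / sum(pred_counts.values())
    recall = matched / sum(gold_counts.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical brackets for "The screen was a sea of red" (7 tokens).
gold = [("S", 0, 7), ("NP", 0, 2), ("VP", 2, 7), ("NP", 3, 7), ("PP", 5, 7)]
pred = [("S", 0, 7), ("NP", 0, 2), ("VP", 2, 7), ("NP", 3, 5), ("PP", 5, 7)]
score = labeled_f1(gold, pred)  # 4 of 5 brackets match on both sides
```

Because induced symbols such as X1 carry no inherent meaning, an evaluation of this kind also needs some mapping from induced symbols to treebank labels; that step is outside this sketch.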

Page 6:

Experiment Roadmap

• Unconstrained Induction
  • Need bracket constraint!

• Gold Bracket Induction
  • Prototypes and Similarity

• CCM Bracket Induction

Phrase    Prototype
NP        DT NN
PP        IN DT NN

Page 7:

Unconstrained PCFG Induction

• Learn PCFG with EM
  • Inside-Outside Algorithm
  • Lari & Young [93]

[Figure: inside and outside regions of a span (i, j) in a sentence spanning positions 0 to n]

• Results (Labeled F1)
  • PCFG: 26.3
  • Upper Bound: 86.1
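
The inside half of the Inside-Outside computation can be sketched as a chart recursion. This is a minimal illustration, assuming a binary grammar whose leaves are the observed POS tags (as in the experimental set-up); the rule table and its probabilities are made up, and the outside pass and the EM re-estimation step are omitted:

```python
from collections import defaultdict

def inside(tags, binary_rules):
    """Inside scores for a binary grammar whose leaves are POS tags.

    binary_rules maps (parent, left, right) -> probability, where left and
    right may be X symbols or POS tags.
    Returns chart[(i, j)][symbol] = P(symbol derives tags[i:j]).
    """
    n = len(tags)
    chart = defaultdict(dict)
    # Base case: each observed POS tag covers its own position with score 1.
    for i, tag in enumerate(tags):
        chart[(i, i + 1)][tag] = 1.0
    # Build wider spans from narrower ones, summing over splits and rules.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = chart[(i, j)]
            for k in range(i + 1, j):
                left_cell, right_cell = chart[(i, k)], chart[(k, j)]
                for (parent, left, right), prob in binary_rules.items():
                    if left in left_cell and right in right_cell:
                        cell[parent] = (cell.get(parent, 0.0)
                                        + prob * left_cell[left] * right_cell[right])
    return chart

# Toy grammar with made-up probabilities (not from the talk).
rules = {
    ("X0", "X1", "X2"): 1.0,
    ("X1", "DT", "NN"): 0.6,
    ("X1", "NN", "NN"): 0.4,
    ("X2", "VBD", "X1"): 1.0,
}
chart = inside(["DT", "NN", "VBD", "DT", "NN"], rules)
sentence_prob = chart[(0, 5)]["X0"]  # 0.6 * (1.0 * 0.6) under the toy rules
```

Looping over every rule at every split keeps the sketch short; a practical implementation would index rules by their children.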

Page 8:

Constrained PCFG Induction

• Gold Brackets
  • Pereira & Schabes [93]

• Result (Labeled F1)
  • PCFG: 26.3
  • PCFG GOLD: 51.6
  • Upper Bound: 86.1
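
One common way to impose a bracket constraint during induction is to let the chart consider only spans that do not cross any given bracket. The sketch below assumes that reading; the slide does not say how the constraint is enforced, and the function names and example brackets are illustrative:

```python
def crosses(span, bracket):
    """True if the two half-open spans overlap without one containing the other."""
    i, j = span
    a, b = bracket
    return (i < a < j < b) or (a < i < b < j)

def allowed_spans(n, brackets):
    """All spans (i, j) over n tokens that are compatible with every bracket."""
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n + 1)
        if not any(crosses((i, j), b) for b in brackets)
    ]

# Hypothetical unlabeled brackets for a 7-token sentence.
brackets = [(0, 2), (2, 7), (3, 7), (5, 7)]
spans = allowed_spans(7, brackets)
```

In a chart-based learner, cells for disallowed spans would simply be skipped, so no probability mass reaches trees that contradict the brackets.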

Page 9:

Encoding Knowledge

What’s an NP?

Semi-Supervised Learning

Page 10:

Encoding Knowledge

What’s an NP?

For instance: DT NN, JJ NNS, NNP NNP

Prototype Learning

Page 11:

Grammar Induction Experiments

• Add Prototypes
  • Manually constructed

Page 12:

How to use prototypes?

[Figure: parse tree over the tagged sentence "⋄ The/DT koala/NN sat/VBD in/IN the/DT tree/NN ⋄", with nodes marked “?” and the labels NP, NP, PP, VP, S]

Page 13:

How to use prototypes?

[Figure: parse tree over "⋄ The/DT koala/NN sat/VBD in/IN the/DT tree/NN ⋄" with nodes labeled NP, PP, VP, S; the word hungry/JJ is added and the affected node is marked “?NP”]

Page 14:

Distributional Similarity

• Context Distribution
  (DT JJ NN) = { ⋄ __ VBD : 0.3, VBD __ ⋄ : 0.2, IN __ VBD : 0.1, … }

• Similarity

[Figure: (DT JJ NN) compared with the NP prototypes (DT NN), (JJ NNS), (NNP NNP)]
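
A context distribution of this kind can be estimated by counting, for every POS sequence in a tagged corpus, the tag immediately to its left and right. The sketch below is one way to do that; the boundary symbol, the `max_len` cutoff, and the two-sentence corpus are assumptions made for illustration:

```python
from collections import Counter, defaultdict

BOUNDARY = "<s>"  # hypothetical stand-in for the sentence-boundary symbol

def context_distributions(tagged_sentences, max_len=3):
    """Map each POS yield (a tuple of tags) to a distribution over
    (left tag, right tag) contexts, estimated by relative frequency."""
    counts = defaultdict(Counter)
    for tags in tagged_sentences:
        padded = [BOUNDARY] + list(tags) + [BOUNDARY]
        last = len(padded) - 1  # index of the closing boundary
        for i in range(1, last):
            for j in range(i + 1, min(i + max_len, last) + 1):
                yield_ = tuple(padded[i:j])
                context = (padded[i - 1], padded[j])
                counts[yield_][context] += 1
    return {
        y: {ctx: c / sum(ctr.values()) for ctx, c in ctr.items()}
        for y, ctr in counts.items()
    }

# Toy corpus, made up for the example.
corpus = [["DT", "NN", "VBD"], ["DT", "JJ", "NN", "VBD"]]
dists = context_distributions(corpus)
# Both (DT NN) and (DT JJ NN) occur only between a boundary and VBD here,
# so their context distributions coincide.
```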

Page 15:

Distributional Similarity

• Prototype Approximation
  (NP) ≈ Uniform( (DT NN), (JJ NNS), (NNP NNP) )

• Prototype Similarity Feature
  • span(DT JJ NN) emits proto=NP
  • span(MD NNS) emits proto=NONE
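
The two bullets above can be sketched together: average the prototypes' context distributions to stand in for a phrase type, then give a span the closest type as its proto feature, or NONE if nothing is close enough. The slides do not name the similarity measure or a cutoff, so the Jensen-Shannon divergence and the `threshold` value here are assumptions, as are all the toy distributions:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two sparse distributions (dicts)."""
    total = 0.0
    for key in set(p) | set(q):
        pk, qk = p.get(key, 0.0), q.get(key, 0.0)
        mk = 0.5 * (pk + qk)
        if pk > 0:
            total += 0.5 * pk * math.log(pk / mk)
        if qk > 0:
            total += 0.5 * qk * math.log(qk / mk)
    return total

def uniform_mixture(distributions):
    """Uniform average of several context distributions."""
    mix = {}
    for dist in distributions:
        for ctx, prob in dist.items():
            mix[ctx] = mix.get(ctx, 0.0) + prob / len(distributions)
    return mix

def proto_feature(span_dist, label_dists, threshold=0.35):
    """Closest phrase type by JS divergence, or "NONE" if none is within threshold."""
    best_label, best_div = "NONE", threshold
    for label, dist in label_dists.items():
        div = js_divergence(span_dist, dist)
        if div < best_div:
            best_label, best_div = label, div
    return best_label

# Toy context distributions over abstract contexts "a".."d" (made up).
label_dists = {
    "NP": uniform_mixture([{"a": 1.0}, {"a": 0.5, "b": 0.5}]),
    "PP": {"c": 1.0},
}
near_np = proto_feature({"a": 0.8, "b": 0.2}, label_dists)
unrelated = proto_feature({"d": 1.0}, label_dists)
```

With these toy inputs the first span lands on NP and the second, which shares no contexts with either type, falls back to NONE.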

Page 16:

Prototype CFG+ Model

[Figure: parse tree for the koala sentence (with hungry/JJ added), nodes labeled NP, PP, VP, S; one NP node is annotated with the two factors below]

P(DT NP | NP) P(proto=NP | NP)
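
Read literally, the annotation says each node contributes a rule probability times a prototype-feature emission probability, both conditioned on the node's label. A minimal sketch of scoring a tree that way, with hypothetical probability tables and made-up numbers:

```python
import math

def tree_log_score(nodes, rule_probs, proto_probs):
    """Sum of log P(children | label) + log P(proto | label) over tree nodes.

    nodes: list of (label, children, proto) triples, one per internal node.
    rule_probs: {(label, children): probability}
    proto_probs: {(label, proto): probability}
    """
    score = 0.0
    for label, children, proto in nodes:
        score += math.log(rule_probs[(label, children)])
        score += math.log(proto_probs[(label, proto)])
    return score

# Illustrative values only; none of these numbers come from the talk.
rule_probs = {("NP", ("DT", "NP")): 0.2, ("NP", ("JJ", "NN")): 0.1}
proto_probs = {("NP", "NP"): 0.7}
nodes = [("NP", ("DT", "NP"), "NP"), ("NP", ("JJ", "NN"), "NP")]
log_score = tree_log_score(nodes, rule_probs, proto_probs)
```

The effect of the extra factor is that a symbol which tends to emit proto=NP is pulled toward spans that are distributionally similar to the NP prototypes.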

Page 17:

Prototype CFG+ Induction

• Experimental Set-Up
  • BLLIP corpus
  • Gold Brackets

• Results (Labeled F1)
  • PCFG GOLD: 51.6
  • PROTO GOLD: 71.1

Page 18:

Summary So Far

• Bracket constraint and prototypes give good performance!

• Results (Labeled F1)
  • PCFG: 26.3
  • PCFG GOLD: 51.6
  • PROTO GOLD: 71.1
  • Upper Bound: 86.1

Page 19:

Constituent-Context Model

Page 20:

Product Model

• Different Aspects of Syntax
  • CCM: Yield and Context properties
  • CFG: Hierarchical properties

• Intersected EM [Klein 2005]
  • Encourages mass on trees compatible with CCM and CFG
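
One way to picture a product model is as a renormalized product of the two models' scores for each candidate tree, so a tree keeps its mass only if both models find it plausible. The sketch below enumerates a handful of candidate trees explicitly; the tree names and scores are made up, and a real system would presumably compute these quantities with dynamic programming over a chart rather than by enumeration:

```python
def product_posterior(cfg_scores, ccm_scores):
    """Posterior over candidate trees proportional to the product of two scores."""
    joint = {tree: cfg_scores[tree] * ccm_scores[tree] for tree in cfg_scores}
    normalizer = sum(joint.values())
    return {tree: score / normalizer for tree, score in joint.items()}

# Hypothetical scores for two candidate trees of one sentence.
cfg_scores = {"tree_a": 0.6, "tree_b": 0.4}
ccm_scores = {"tree_a": 0.2, "tree_b": 0.8}
posterior = product_posterior(cfg_scores, ccm_scores)
# tree_b wins overall: the CFG mildly prefers tree_a, but the CCM strongly disagrees.
```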

Page 21:

Grammar Induction Experiments

• Intersected CFG and CCM
  • No prototypes

• Results (Labeled F1)
  • PCFG: 26.3
  • PCFG CCM: 35.3

Page 22:

Grammar Induction Experiments

• Intersected CFG+ and CCM
  • Add Prototypes

• Results (Labeled F1)
  • PCFG CCM: 35.3
  • PROTO CCM: 62.2

Page 23:

Reacting to Errors

• Possessive NPs

[Figure: “Our Tree” beside “Correct Tree”]

Page 24:

Reacting to Errors

• Add Prototype: NP-POS NN POS

[Figure: “New Analysis”]

Page 25:

Error Analysis

• Modal VPs

[Figure: “Our Tree” beside “Correct Tree”]

Page 26:

Reacting to Errors

• Add Prototype: VP-INF VB NN

[Figure: “New Analysis”]

Page 27:

Fixing Errors

• Supplement Prototypes
  • NP-POS and VP-INF

• Results (Labeled F1)
  • PROTO CCM: 62.2
  • BEST: 65.1

Page 28:

Results Summary

[Figure: bar chart of Labeled F1 for PCFG, PCFG Bracket, PROTO Bracket, Best, and Upper Bound, with separate series for CCM brackets and gold brackets. Values shown on the chart: 62.2, 26.3, 86.1, 71.1, 35.3, 62.2, 65.1]

Page 29:

Conclusion

• Prototype-Driven Learning: Flexible Weakly Supervised Framework

• Merged distributional clustering techniques with supervised structured models

Page 30:

Thank You!

http://www.cs.berkeley.edu/~aria42

Page 31:

Unconstrained PCFG Induction

• Binary Grammar
  • { X1, X2, … Xn }

• Learn PCFG with EM
  • Inside-Outside Algorithm
  • Lari & Young [93]

[Figure: inside and outside regions of a span (i, j) in a sentence spanning positions 0 to n]

[Figure: binary tree fragments over the symbols Xi, Xj, Xk and the terminals V, N]