Prototype-Driven Grammar Induction Aria Haghighi and Dan Klein Computer Science Division University of California Berkeley
Grammar Induction
The screen was a sea of red
DT  NN     VBD DT NN  IN NN
First Attempt
The screen was a sea of red
DT  NN     VBD DT NN  IN NN
Central Questions
• How do we specify what we want to learn?
• How do we fix observed errors?
What’s an NP?
That’s not quite it!
Experimental Set-up
• Binary Grammar
{ X1, X2, … Xn} plus POS tags
• Data
  • WSJ-10 [7k sentences]
  • Evaluate on Labeled F1
• Grammar Upper Bound: 86.1
[Figure: binary rule Xi → Xj Xk]
Experiment Roadmap
• Unconstrained Induction
  • Need bracket constraint!
• Gold Bracket Induction
  • Prototypes and Similarity
• CCM Bracket Induction
Phrase Prototype
NP → DT NN
PP → IN DT NN
Unconstrained PCFG Induction
• Learn PCFG with EM
  • Inside-Outside Algorithm
  • Lari & Young [93]
• Results
[Figure: inside-outside chart — inside score for span (i, j), outside score for the rest of the sentence]
[Chart: Labeled F1 — PCFG 26.3, Upper Bound 86.1]
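The inside pass of the inside-outside algorithm can be sketched as a small dynamic program over spans; a minimal sketch, assuming a hypothetical toy binary grammar (the rule table, lexicon, and root symbol below are illustrations, not the grammar used in the experiments):

```python
from collections import defaultdict

def inside(words, rules, lexicon, root="S"):
    """Inside pass of the inside-outside algorithm for a binary PCFG:
    beta[(i, j)][A] = P(A derives words[i:j])."""
    n = len(words)
    beta = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):                    # width-1 spans: tag emissions
        for tag, p in lexicon.get(w, {}).items():
            beta[(i, i + 1)][tag] += p
    for width in range(2, n + 1):                    # wider spans, bottom-up
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):                # split point
                for (a, b, c), p in rules.items():   # binary rule a -> b c
                    beta[(i, j)][a] += p * beta[(i, k)][b] * beta[(k, j)][c]
    return beta[(0, n)][root]                        # sentence probability

# Toy grammar (hypothetical): S -> NP VB, NP -> DT NN
rules = {("NP", "DT", "NN"): 1.0, ("S", "NP", "VB"): 1.0}
lexicon = {"the": {"DT": 1.0}, "dog": {"NN": 1.0}, "barks": {"VB": 1.0}}
```

A full EM step would also run the symmetric outside pass and renormalize expected rule counts.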
Constrained PCFG Induction
• Gold Brackets
  • Pereira & Schabes [93]
• Result
[Chart: Labeled F1 — PCFG 26.3, PCFG + Gold Brackets 51.6, Upper Bound 86.1]
Encoding Knowledge
What’s an NP?
Semi-Supervised Learning
Encoding Knowledge
What’s an NP?
For instance,
DT NN
JJ NNS
NNP NNP
Prototype Learning
• Add Prototypes
  • Manually constructed
Grammar Induction Experiments
How to use prototypes?
[Figure: parse of "The koala sat in the tree" — tags DT NN VBD IN DT NN; which spans get the labels NP, PP, VP, S?]
How to use prototypes?
[Figure: labeled parse of "The koala sat in the tree" (NP, PP, VP, S); adding JJ "hungry" — is the new span "hungry koala" also an NP?]
Distributional Similarity
• Context Distribution (DT JJ NN) = { ¦ __ VBD : 0.3, VBD __ ¦ : 0.2, IN __ VBD : 0.1, … }
• Similarity
[Figure: (DT JJ NN) is distributionally similar to the NP prototypes (DT NN), (JJ NNS), (NNP NNP)]
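These context distributions can be collected directly from POS-tagged text; a minimal sketch, where the boundary marker `#` stands in for the slide's ¦ symbol and `max_len` is an assumed cap on span length:

```python
from collections import Counter, defaultdict

BOUNDARY = "#"  # stands in for the slide's sentence-boundary mark

def context_distributions(tagged_sentences, max_len=3):
    """For each tag sequence (candidate yield), estimate the distribution
    over (left tag, right tag) contexts in which it occurs."""
    counts = defaultdict(Counter)
    for tags in tagged_sentences:
        padded = [BOUNDARY] + list(tags) + [BOUNDARY]
        for i in range(1, len(padded) - 1):
            for j in range(i + 1, min(i + max_len, len(padded) - 1) + 1):
                counts[tuple(padded[i:j])][(padded[i - 1], padded[j])] += 1
    dists = {}
    for span, ctr in counts.items():
        total = sum(ctr.values())
        dists[span] = {ctx: c / total for ctx, c in ctr.items()}
    return dists
```

For example, from the single tagged sentence DT NN VBD, the yield (DT NN) occurs once, with left context `#` and right context VBD.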
Distributional Similarity
• Prototype Approximation: (NP) ≈ Uniform( (DT NN), (JJ NNS), (NNP NNP) )
• Prototype Similarity Feature
• span(DT JJ NN) emits proto=NP
• span(MD NNS) emits proto=NONE
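Emitting the prototype feature amounts to a nearest-prototype lookup over context distributions. A minimal sketch: the paper uses a divergence-based similarity, so the cosine measure, threshold, and prototype distribution below are simplified stand-ins, not the actual model:

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse distributions (dicts)."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def proto_feature(span_dist, proto_dists, threshold=0.5):
    """Emit proto=LABEL for the most similar prototype label whose
    similarity clears the threshold, else proto=NONE."""
    best_label, best_sim = None, threshold
    for label, dist in proto_dists.items():
        sim = cosine(span_dist, dist)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return f"proto={best_label}" if best_label else "proto=NONE"

# Hypothetical NP approximation: averaged contexts of the NP prototype yields
proto_dists = {"NP": {("#", "VBD"): 0.6, ("VBD", "#"): 0.4}}
```

A span whose contexts look like the NP prototypes then emits proto=NP; one with unrelated contexts emits proto=NONE.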
Prototype CFG+ Model
[Figure: parse of "The hungry koala sat in the tree"; each labeled span also emits its prototype feature]
P(DT NP | NP) · P(proto=NP | NP)
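The product on this slide scores one local tree: the usual rewrite probability times the probability of the span's prototype feature given the parent label. A minimal sketch with hypothetical parameter values:

```python
def cfg_plus_score(parent, children, proto, rule_probs, proto_probs):
    """CFG+ local score: P(children | parent) * P(proto | parent)."""
    return rule_probs[(parent, children)] * proto_probs[(parent, proto)]

# Hypothetical parameters for the NP -> DT NP local tree on the slide
rule_probs = {("NP", ("DT", "NP")): 0.4}
proto_probs = {("NP", "proto=NP"): 0.8}
```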
Prototype CFG+ Induction
• Experimental Set-Up
  • BLLIP corpus
  • Gold Brackets
• Results
[Chart: Labeled F1 — PCFG + Gold Brackets 51.6, PROTO + Gold Brackets 71.1]
Summary So Far
• Bracket constraint and prototypes give good performance!
[Chart: Labeled F1 — PCFG 26.3, PCFG + Gold Brackets 51.6, PROTO + Gold Brackets 71.1, Upper Bound 86.1]
Constituent-Context Model
Product Model
• Different Aspects of Syntax
  • CCM: Yield and Context properties
  • CFG: Hierarchical properties
• Intersected EM [Klein 2005]
  • Encourages mass on trees compatible with CCM and CFG
Grammar Induction Experiments
• Intersected CFG and CCM
  • No prototypes
• Results
[Chart: Labeled F1 — PCFG 26.3, PCFG × CCM 35.3]
Grammar Induction Experiments
• Intersected CFG+ and CCM
  • Add Prototypes
• Results
[Chart: Labeled F1 — PCFG × CCM 35.3, PROTO × CCM 62.2]
Reacting to Errors
• Possessive NPs
[Figure: our tree vs. correct tree for a possessive NP]
Reacting to Errors
• Add Prototype: NP-POS → NN POS
New Analysis
Error Analysis
• Modal VPs
[Figure: our tree vs. correct tree for a modal VP]
Reacting to Errors
• Add Prototype: VP-INF → VB NN
New Analysis
Fixing Errors
• Supplement Prototypes
  • NP-POS and VP-INF
• Results
[Chart: Labeled F1 — PROTO × CCM 62.2, BEST 65.1]
Results Summary
[Chart: Labeled F1 with gold vs. CCM brackets — PCFG 26.3; PCFG + brackets 35.3 (CCM); PROTO + brackets 71.1 (gold), 62.2 (CCM); Best 65.1; Upper Bound 86.1]
Conclusion
• Prototype-Driven Learning
  • Flexible Weakly Supervised Framework
• Merged distributional clustering techniques with supervised structured models
Thank You!
http://www.cs.berkeley.edu/~aria42
Unconstrained PCFG Induction
• Binary Grammar
{ X1, X2, … Xn}
• Learn PCFG with EM
  • Inside-Outside Algorithm
  • Lari & Young [93]
[Figure: inside-outside recursion for span (i, j) — inside score of Xi sums over binary rules Xi → Xj Xk and split points; outside score covers the rest of the sentence]