Slides: homepages.inf.ed.ac.uk/s1270921/res/slides/dense.pdf

Dependency Parsing as Head Selection

Xingxing Zhang, Jianpeng Cheng, Mirella Lapata

Institute for Language, Cognition and Computation
University of Edinburgh

[email protected]

April 6, 2017

Zhang et al. (Univ. of Edinburgh) DeNSe: Dependency Neural Selection April 6, 2017 1 / 18

Dependency Parsing

Dependency Parsing is the task of transforming a sentence $S = (\text{root}, w_1, w_2, \ldots, w_N)$ into a directed tree originating out of root.

Parsing Algorithms

Transition-based Parsing
Graph-based Parsing

Our parser is neither Transition-based nor Graph-based (during training)

Transition-based Parsing

Data Structure

Buffer, Stack, Arc Set

Parsing:

Choose an action from: SHIFT, REDUCE-Left, REDUCE-Right
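The data structures and actions on this slide can be sketched as follows. This is only an illustration: the slide does not spell out what REDUCE-Left and REDUCE-Right do, so the code assumes arc-standard semantics, and the function name `run_transitions`, the example sentence, and the hand-written action sequence are hypothetical.

```python
def run_transitions(words, actions):
    """Apply a sequence of actions and return the resulting arc set.

    Assumed (arc-standard) semantics, not taken from the slides:
      SHIFT        moves the next buffer item onto the stack
      REDUCE-Left  makes the stack top the head of the item below it
      REDUCE-Right makes the item below the top the head of the top
    """
    buffer = list(range(1, len(words)))  # word indices; index 0 is root
    stack = [0]
    arcs = set()                         # (head, dependent) pairs
    for action in actions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "REDUCE-Left":
            top = stack.pop()
            below = stack.pop()
            arcs.add((top, below))
            stack.append(top)
        elif action == "REDUCE-Right":
            top = stack.pop()
            arcs.add((stack[-1], top))
        else:
            raise ValueError(f"unknown action: {action}")
    return arcs


words = ["root", "kids", "love", "candy"]
actions = ["SHIFT", "SHIFT", "REDUCE-Left", "SHIFT", "REDUCE-Right", "REDUCE-Right"]
arcs = run_transitions(words, actions)
for head, dep in sorted(arcs):
    print(f"{words[head]} -> {words[dep]}")
```

A transition-based parser learns to predict the next action from the current buffer and stack; here the actions are simply given by hand.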

Graph-based Parsing

A Sentence → A Directed Complete Graph

(Graphs from Kübler et al., 2009)

Parsing: Finding Maximum Spanning Tree

Chu-Liu-Edmonds algorithm (Chu and Liu, 1965)
Eisner algorithm (Eisner, 1996)

Recent Advances

Mostly replacing discrete features with Neural Network features.

Transition-based Parsers

Feed-Forward NN features (Chen and Manning, 2014)
Bi-LSTM features (Kiperwasser and Goldberg, 2016)
Stack LSTM: Buffer, Stack and Action Sequences modeled by Stack-LSTMs (Dyer et al., 2015)

Graph-based Parsers

Tensor Decomposition features (Lei et al., 2014)
Feed-Forward NN features (Pei et al., 2015)
Bi-LSTM features (Kiperwasser and Goldberg, 2016)

Do we need a transition system or graph algorithm?

root kids love candy

An important fact: Every word has only one head!

Why not just learn to select the head?
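Because every word has exactly one head, a whole parse can be stored as one head index per word. The snippet below is a small sketch of that representation for the example sentence; the analysis it encodes (love attached to root, kids and candy attached to love) is an assumption, since the figure did not survive, and `arcs_from_heads` is a hypothetical helper name.

```python
words = ["root", "kids", "love", "candy"]

# heads[i] is the index of the head of words[i]; root (index 0) has no head.
# Assumed analysis: root -> love, love -> kids, love -> candy.
heads = [None, 2, 0, 2]


def arcs_from_heads(heads):
    """Turn the head list into (head, dependent) arcs, skipping root."""
    return [(h, d) for d, h in enumerate(heads) if h is not None]


for h, d in arcs_from_heads(heads):
    print(f"{words[h]} -> {words[d]}")
```

Parsing then amounts to filling in one entry of `heads` per word, which is what head selection does.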

Dependency Parsing as Head Selection

DeNSe: Dependency Neural Selection

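$$P_{\text{head}}(\text{root} \mid \text{love}, S) = \frac{\exp(\mathrm{MLP}(a_{\text{root}}, a_{\text{love}}))}{\sum_{k=0}^{3} \exp(\mathrm{MLP}(a_k, a_{\text{love}}))}$$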
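The formula is a softmax over all candidate heads of a word. The sketch below computes it in plain Python for the four-token example. Everything numeric is made up: the vectors standing in for a_0 to a_3 are random rather than Bi-LSTM outputs, and the scorer is an untrained one-hidden-layer tanh network used as a stand-in for MLP; `mlp_score` and `head_distribution` are hypothetical names.

```python
import math
import random

random.seed(0)
DIM, HIDDEN = 4, 5  # toy sizes, chosen arbitrarily


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


# Hypothetical, untrained parameters of the scorer.
U = rand_matrix(HIDDEN, DIM)  # applied to the candidate head a_k
W = rand_matrix(HIDDEN, DIM)  # applied to the dependent
v = [random.uniform(-1, 1) for _ in range(HIDDEN)]


def mlp_score(a_head, a_dep):
    """Stand-in for MLP(a_head, a_dep): one tanh layer, then a dot product."""
    hidden = [math.tanh(x + y) for x, y in zip(matvec(U, a_head), matvec(W, a_dep))]
    return sum(w * h for w, h in zip(v, hidden))


def head_distribution(reps, dep):
    """Softmax over every position k of MLP(a_k, a_dep), as in the formula."""
    scores = [mlp_score(a_k, reps[dep]) for a_k in reps]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


words = ["root", "kids", "love", "candy"]
reps = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in words]  # stand-ins for a_0..a_3
p_head = head_distribution(reps, dep=words.index("love"))
for word, p in zip(words, p_head):
    print(f"P_head({word} | love, S) = {p:.3f}")
```

As in the slide's formula, the sum runs over k = 0 to 3, so the word itself is among the candidates.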

Decoding

Greedy Decoding: The output may not be a (projective) tree!

Greedy Decoding

Dataset         #Sent (Dev)   Tree (%)   Proj (%)
PTB (English)   1,700         95.1       86.6
CTB (Chinese)   803           87.0       73.1
Czech           374           87.7       65.5
German          367           96.7       67.3
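Greedy decoding picks the most probable head for each word independently, so nothing prevents a cycle. The sketch below shows both outcomes on hand-made probability tables. The numbers are invented, the function names are hypothetical, and excluding a word from being its own head is an assumption made here for the illustration.

```python
def greedy_heads(probs):
    """probs[d][h] plays the role of P_head(h | d); row 0 (root) is unused."""
    heads = [None]
    for d in range(1, len(probs)):
        candidates = [h for h in range(len(probs)) if h != d]  # assumption: no self-heads
        heads.append(max(candidates, key=lambda h: probs[d][h]))
    return heads


def is_tree(heads):
    """True if every word reaches root (index 0) by following heads, with no cycle."""
    for d in range(1, len(heads)):
        seen = set()
        node = d
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]
    return True


# Invented tables for "root kids love candy".
probs_ok = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.8, 0.1],  # kids  -> love
    [0.7, 0.2, 0.0, 0.1],  # love  -> root
    [0.1, 0.1, 0.8, 0.0],  # candy -> love
]
probs_cycle = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.8, 0.1],  # kids  -> love
    [0.2, 0.7, 0.0, 0.1],  # love  -> kids (creates a cycle)
    [0.1, 0.1, 0.8, 0.0],  # candy -> love
]

for probs in (probs_ok, probs_cycle):
    heads = greedy_heads(probs)
    print(heads, "tree" if is_tree(heads) else "not a tree")
```

A check of this kind is what decides whether a sentence needs a spanning tree algorithm at all.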

Decoding with a Maximum Spanning Tree algorithm (needed relatively rarely, since most greedy outputs are already trees)

Projective Parsing: Eisner Algorithm
Non-projective Parsing: Chu-Liu-Edmonds Algorithm
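To make the projective versus non-projective distinction concrete, here is a small check based on crossing arcs. It is a sketch under two assumptions: the input is already a tree, and root sits at index 0 to the left of all words, in which case "no two arcs cross" is the usual test for projectivity. The function name and both example head lists are hypothetical.

```python
def is_projective(heads):
    """True if no two arcs cross when drawn above the sentence.

    heads[d] is the head index of word d; heads[0] is None for root.
    """
    spans = [tuple(sorted((h, d))) for d, h in enumerate(heads) if h is not None]
    for a, b in spans:
        for c, d in spans:
            if a < c < b < d:  # (c, d) starts inside (a, b) and ends outside it
                return False
    return True


projective = [None, 2, 0, 2]         # arcs (2,1), (0,2), (2,3): no crossing
non_projective = [None, 3, 4, 0, 3]  # arcs (3,1) and (4,2) cross

print(is_projective(projective))
print(is_projective(non_projective))
```

Treebanks with many crossing arcs call for a non-projective decoder, while a projective decoder suffices when crossings are not allowed.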

Labelled Parser

A two-layer Rectifier Network (Glorot et al., 2011)

Dependent Word: Bi-LSTM Feature, Word Embedding, PoS Embedding

Head Word: Bi-LSTM Feature, Word Embedding, PoS Embedding
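The labeller described above can be sketched as a small feed-forward classifier over the six feature vectors. This is an illustration under stated assumptions: "two-layer" is read here as two ReLU hidden layers followed by a softmax output, all weights are random and untrained, the feature vectors are random stand-ins, and the label set and every name in the code are hypothetical.

```python
import math
import random

random.seed(0)

LABELS = ["nsubj", "dobj", "root"]  # hypothetical label inventory
FEAT, HIDDEN = 3, 8                 # toy sizes, chosen arbitrarily


def rand_vec(n):
    return [random.uniform(-1, 1) for _ in range(n)]


def rand_layer(n_out, n_in):
    """Return (weights, bias) for one fully connected layer."""
    return [rand_vec(n_in) for _ in range(n_out)], rand_vec(n_out)


def linear(layer, x):
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, value) for value in x]


def softmax(x):
    top = max(x)
    exps = [math.exp(value - top) for value in x]
    total = sum(exps)
    return [e / total for e in exps]


layer1 = rand_layer(HIDDEN, 6 * FEAT)  # six concatenated feature vectors
layer2 = rand_layer(HIDDEN, HIDDEN)
output = rand_layer(len(LABELS), HIDDEN)


def label_distribution(dep_feats, head_feats):
    """dep_feats / head_feats: [Bi-LSTM feature, word embedding, PoS embedding]."""
    x = [value for vec in dep_feats + head_feats for value in vec]
    hidden = relu(linear(layer2, relu(linear(layer1, x))))
    return softmax(linear(output, hidden))


dep_feats = [rand_vec(FEAT) for _ in range(3)]
head_feats = [rand_vec(FEAT) for _ in range(3)]
p_label = label_distribution(dep_feats, head_feats)
best = LABELS[max(range(len(LABELS)), key=lambda i: p_label[i])]
print(best, [round(p, 3) for p in p_label])
```

The labeller runs after head selection: it is given a (dependent, head) pair and only has to name the relation between them.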

Experiments

Projective Parsing Results (PTB; English)

NN (Chen & Manning, 2014); S-LSTM (Dyer et al., 2015); Bi-LSTM (Kiperwasser & Goldberg, 2016); SynNet (Andor et al., 2016)

Projective Parsing Results (CTB; Chinese)

NN (Chen & Manning, 2014); S-LSTM (Dyer et al., 2015); Bi-LSTM (Kiperwasser & Goldberg, 2016); 3rd-cubic (Zhang & McDonald, 2014)

Non-projective Parsing Results (German)

MST-1st, MST-2nd (McDonald et al., 2005); Turbo-1st, Turbo-3rd (Martins et al., 2013); RBG-1st, RBG-3rd (Lei et al., 2014)

Non-projective Parsing Results (Czech)

MST-1st, MST-2nd (McDonald et al., 2005); Turbo-1st, Turbo-3rd (Martins et al., 2013); RBG-1st, RBG-3rd (Lei et al., 2014)

Unlabeled Exact Match

           PTB              CTB
Parser     Dev     Test     Dev     Test
C&M14      43.35   40.93    32.75   32.20
Dyer15     51.94   50.70    39.72   37.23
DeNSe      51.24   49.34    34.74   33.66
DeNSe+E    52.47   50.79    36.49   35.13

Table: UEM results on PTB and CTB.
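UEM counts a sentence as correct only if every predicted head matches the gold head. A minimal sketch of the metric follows; the toy head lists are invented, the function name is hypothetical, and details of the real evaluation (such as how punctuation is treated) are not modelled.

```python
def unlabeled_exact_match(gold, pred):
    """Percentage of sentences whose predicted head list equals the gold one."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")
    exact = sum(1 for g, p in zip(gold, pred) if g == p)
    return 100.0 * exact / len(gold)


# One head index per word (root excluded); toy data for two sentences.
gold = [[2, 0, 2], [2, 0]]
pred = [[2, 0, 2], [0, 1]]  # first sentence fully correct, second one wrong

uem = unlabeled_exact_match(gold, pred)
print(f"UEM = {uem:.1f}%")
```

Because a single wrong head makes the whole sentence count as wrong, UEM is much lower than per-word attachment accuracy.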

UAS vs. Length

[Figure: UAS (%) against PTB sentence length for C&M14, DeNSe+E and Dyer15.]

[Figure: UAS (%) against sentence length, CTB panel, for C&M14, DeNSe+E and Dyer15.]
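A curve of this kind can be produced by grouping sentences into length buckets and computing UAS within each bucket. The sketch below shows one way to do that. Treating the axis ticks as inclusive upper bounds of the buckets is an assumption, the data are invented, and both function names are hypothetical.

```python
def uas(gold, pred):
    """Unlabeled attachment score: percentage of words with the correct head."""
    correct = sum(g == p for gs, ps in zip(gold, pred) for g, p in zip(gs, ps))
    total = sum(len(gs) for gs in gold)
    return 100.0 * correct / total


def uas_by_length(gold, pred, upper_bounds):
    """Put each sentence in the first bucket whose bound is >= its length."""
    buckets = {bound: ([], []) for bound in upper_bounds}
    for gs, ps in zip(gold, pred):
        for bound in upper_bounds:
            if len(gs) <= bound:
                buckets[bound][0].append(gs)
                buckets[bound][1].append(ps)
                break
    # Skip empty buckets to avoid dividing by zero.
    return {bound: uas(g, p) for bound, (g, p) in buckets.items() if g}


# One head index per word (root excluded); toy data for three sentences.
gold = [[2, 0], [2, 0, 2], [2, 0, 2, 3]]
pred = [[2, 0], [2, 0, 1], [2, 0, 2, 2]]

scores = uas_by_length(gold, pred, upper_bounds=[2, 4])
for bound, score in scores.items():
    print(f"length <= {bound}: UAS = {score:.2f}%")
```

Sentences longer than the largest bound are simply left out in this sketch.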

Conclusions

We propose a dependency parser that greedily selects the head of each word in a sentence.

Combining the greedy model with an MST algorithm can further increase performance.

Code available: https://github.com/XingxingZhang/dense_parser

Thanks

Q & A
