Densely Connected Graph Convolutional Networks for Graph-to-Sequence Learning
Zhijiang Guo
Joint work with Yan Zhang, Zhiyang Teng, Wei Lu

Zhijiang Guo, Densely Connected Graph Convolutional … · 2021. 1. 30. · Zhijiang Guo. Graph-to-Sequence Learning, AMR-to-Text Generation, Synta…

Mar 05, 2021

Transcript
Page 1

Densely Connected Graph Convolutional Networks for Graph-to-Sequence Learning

Zhijiang Guo

Joint work with Yan Zhang, Zhiyang Teng, Wei Lu

Page 2

Graph-to-Sequence Learning

AMR-to-Text Generation

Syntax-Based Machine Translation

Page 3

AMR-to-Text Generation

You guys know what I mean.

Page 4

Sequence Encoder

Ignore the graph structure

(Konstas et al., 2017)

Page 5

Recurrent Graph Encoder

Graph State LSTM (Song et al., 2018)
Gated Graph Neural Networks (Beck et al., 2018)

Page 6

GCNs

Empirically, the best performance of GCNs is achieved with a 2-layer model (Li et al., 2018; Xu et al., 2018).

Page 7

GCNs

The first convolutional layer captures first-order proximity (immediate neighbors) information.

First-Order Proximity

Page 8

GCNs

The second convolutional layer captures second-order proximity information.

Second-Order Proximity
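
The proximity claim on the two GCN slides can be illustrated with a small sketch. It is not the model itself: it only tracks which nodes can contribute information to a given node after a number of stacked neighbor-aggregation layers. The function name `receptive_field`, the toy path graph, and the assumption that each node also keeps its own features (self-loops) are illustrative choices, not taken from the slides.

```python
def receptive_field(adjacency, node, num_layers):
    """Return the set of nodes whose features can reach `node`
    after `num_layers` rounds of neighbor aggregation.

    Assumes each node also keeps its own features (self-loop).
    """
    reached = {node}
    for _ in range(num_layers):
        # One layer: every reached node pulls in its immediate neighbors.
        reached = reached | {nbr for u in reached for nbr in adjacency[u]}
    return reached


# Toy undirected path graph: a - b - c - d (hypothetical example).
adjacency = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}

first_order = receptive_field(adjacency, "a", 1)   # immediate neighbors
second_order = receptive_field(adjacency, "a", 2)  # neighbors of neighbors
```

With one layer, node `a` only sees `b`; a second layer is needed before information from `c` arrives, and `d` stays out of reach until a third layer is added.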

Page 9

Convolutional Graph Encoder

(Bastings et al., 2017) (Damonte and Cohen, 2019)

Page 10

Motivation

Is it possible to build a more expressive GCN model to learn a better graph representation without relying on additional LSTM?

Densely Connected Graph Convolutional Networks (DCGCNs)

Page 11

Dense Connectivity

One layer takes inputs from all preceding layers rather than the previous layer only (Huang et al., 2017).

Densely Connected
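
A minimal sketch of the wiring difference described on this slide, assuming a simple chain of layers. The helper `layer_sources` and the layer names are hypothetical; the sketch only lists which earlier outputs each layer reads and does not perform any graph convolution.

```python
def layer_sources(num_layers, dense=True):
    """Map each layer name to the list of outputs it reads from.

    dense=True : every layer reads the block input and all earlier layers.
    dense=False: every layer reads only the directly preceding output.
    """
    names = ["input"] + [f"layer{i}" for i in range(1, num_layers + 1)]
    sources = {}
    for i in range(1, num_layers + 1):
        sources[names[i]] = names[:i] if dense else [names[i - 1]]
    return sources


dense_wiring = layer_sources(3, dense=True)
plain_wiring = layer_sources(3, dense=False)
```

In the dense wiring, `layer3` reads `input`, `layer1`, and `layer2`; in the plain stack it reads `layer2` only.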

Page 12

Densely Connected GCNs

Stack Identical Blocks

Densely Connected Sub-Block

Linear Combination Layer

Page 13

Densely Connected Sub-Block

Both sub-blocks are densely connected graph convolutional layers with different numbers (m and n) of layers.

Page 14

Densely Connected Sub-Block

Sub-blocks with different numbers of layers capture structural information at different abstract levels, similar to different filters.

Page 15

Densely Connected Sub-Block

For parameter efficiency, the output dimension of each layer in the sub-block is designed to be small.

Page 16

Densely Connected Sub-Block

Input dimension: 300
Sub-block layers: 3
Hidden dimension of each layer: 100 = 300 / 3 (proportional to #layers)
Output dimension: 300 (concatenate output from all 3 layers)
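
The dimension arithmetic on this slide can be checked with a short sketch. The function `sub_block_dims` is a hypothetical helper. The per-layer input widths it reports assume that each layer reads the block input concatenated with the outputs of all earlier layers, as the dense connectivity slide describes; the slide itself only states the input, hidden, and output dimensions.

```python
def sub_block_dims(input_dim, num_layers):
    """Compute the dimensions of one densely connected sub-block.

    Returns (hidden_dim, layer_input_dims, output_dim).
    """
    if input_dim % num_layers != 0:
        raise ValueError("input_dim must be divisible by num_layers")
    # Each layer produces a small slice of the final output.
    hidden_dim = input_dim // num_layers
    # Assumed: layer i reads the block input plus i earlier layer outputs.
    layer_input_dims = [input_dim + i * hidden_dim for i in range(num_layers)]
    # The block output concatenates the outputs of all layers.
    output_dim = hidden_dim * num_layers
    return hidden_dim, layer_input_dims, output_dim


hidden_dim, layer_input_dims, output_dim = sub_block_dims(300, 3)
```

Because the concatenated output has the same width as the block input, identical blocks can be stacked without any extra projection between them.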

Page 17

Linear Combination Layer

This layer assigns different weights to outputs of different layers. Initial inputs of the sub-block are also incorporated by the residual connection.
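
A rough sketch of the two steps named on this slide: a learned linear map over the concatenated layer outputs, followed by a residual connection that adds the sub-block input. The function name, the toy numbers, and the hand-picked weight matrix are all hypothetical, and a bias term is omitted for brevity.

```python
def linear_combination(layer_outputs, weights, block_input):
    """Weight the concatenated layer outputs, then add the block input.

    layer_outputs: list of per-layer feature vectors for one node
    weights:       matrix with one row per output dimension
    block_input:   the node's features at the start of the sub-block
    """
    concat = [value for output in layer_outputs for value in output]
    if len(weights) != len(block_input):
        raise ValueError("output width must match block input width")
    if any(len(row) != len(concat) for row in weights):
        raise ValueError("each weight row must span the concatenated outputs")
    mixed = [sum(w * v for w, v in zip(row, concat)) for row in weights]
    # Residual connection: add the sub-block's initial input.
    return [m + x for m, x in zip(mixed, block_input)]


# Toy values: two layers with two features each (hypothetical).
layer_outputs = [[1, 2], [3, 4]]
block_input = [10, 20, 30, 40]
# Hand-picked weights: layer 1 outputs weighted by 1, layer 2 outputs by 2.
weights = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 2],
]
combined = linear_combination(layer_outputs, weights, block_input)
```

In a trained model the weight matrix would be learned rather than fixed; the diagonal matrix here only makes the per-layer weighting easy to read.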

Page 18

Graph-to-Sequence Model

Page 19

Experiments

AMR-to-Text Generation
AMR 2015
AMR 2017

Syntax-Based Machine Translation
English-Czech (WMT 16)
English-German (WMT 16)

Page 20

Data Statistics

Dataset     Train      Dev      Test
AMR 2015    16,833     1,368    1,371
AMR 2017    36,521     1,368    1,371
En-Cs       181,112    2,656    2,999
En-De       226,822    2,169    2,999

Page 21

AMR 2015

Model         External Data    BLEU
LSTM          No               22.0
GS LSTM       No               23.3
GCN + LSTM    No               24.4
DCGCN         No               25.7

Sequential Encoder: LSTM (Konstas et al., 2017)
Graph Encoder: GS LSTM (Song et al., 2018)
GCN + LSTM (Damonte and Cohen, 2019)

Page 22

AMR 2015

Using External Training Data (0.2M)

Model      External Data    BLEU
LSTM       0.2M             27.4
GS LSTM    0.2M             28.2
DCGCN      0.1M             29.0
DCGCN      0.2M             31.6

Page 23

AMR 2015

Using External Training Data (0.3M)

Model               External Data    BLEU
LSTM                2M               32.3
LSTM                20M              33.8
GS LSTM             2M               33.6
DCGCN (Single)      0.3M             33.2
DCGCN (Ensemble)    0.3M             35.3

Page 24

AMR 2017 (Single)

Model         #Parameters    BLEU    CHRF++
LSTM          28.4M          21.7    49.1
GGNNs         28.3M          23.3    50.4
GCN + LSTM    N/A            24.5    N/A
DCGCN         18.5M          27.6    57.3

Sequential Encoder: LSTM (Beck et al., 2017)
Graph Encoder: GGNNs (Beck et al., 2018)
GCN + LSTM (Damonte and Cohen, 2019)

Page 25

AMR 2017 (Ensemble)

Model    #Parameters    BLEU    CHRF++
LSTM     142.0M         26.6    52.5
GGNNs    141.0M         27.5    53.5
DCGCN    92.5M          30.4    59.6

Sequential Encoder: LSTM (Beck et al., 2017)
Graph Encoder: GGNNs (Beck et al., 2018)

Page 26

English-German

Model          Type      #Param    BLEU    CHRF++
BoW + GCN      Single    N/A       12.2    N/A
CNN + GCN      Single    N/A       13.7    N/A
BiRNN + GCN    Single    N/A       16.1    N/A
Seq2Seq        Single    41.4M     15.5    40.8
GGNNs          Single    41.2M     16.7    42.4
Our DCGCN      Single    29.7M     19.0    44.1

Sequential Encoder: LSTM (Konstas et al., 2017)
Graph Encoder: GGNNs (Beck et al., 2018)
BoW/CNN/RNN + GCN (Bastings et al., 2017)

Page 27

English-Czech

Model          Type      #Param    BLEU    CHRF++
BoW + GCN      Single    N/A       7.5     N/A
CNN + GCN      Single    N/A       8.7     N/A
BiRNN + GCN    Single    N/A       9.6     N/A
Seq2Seq        Single    41.4M     8.9     33.8
GGNNs          Single    41.2M     9.8     33.3
Our DCGCN      Single    29.7M     12.1    37.1

Sequential Encoder: LSTM (Konstas et al., 2017)
Graph Encoder: GGNNs (Beck et al., 2018)
BoW/CNN/RNN + GCN (Bastings et al., 2017)

Page 28

Density of Connection

Model                       BLEU
DCGCN                       25.5
- {4} dense block           24.8
- {3, 4} dense blocks       23.8
- {2, 3, 4} dense blocks    23.2

Page 29

Ablation Test

Model                        BLEU
DCGCN                        25.5
- Global Node (GN)           24.2
- Linear Combination (LC)    23.7
- GN, LC                     22.9

Page 30

Conclusion

DCGCNs allow the encoder to better capture the rich structural information of a graph, especially when it is large.

Future: investigate how other NLP applications can potentially benefit from our proposed approach.

Page 31

Thank You

Code available:

http://www.statnlp.org/research/machine-learning