Source‐side Dependency Tree Reordering Models with Subtree Movements and Constraints. Nguyen Bach, Qin Gao and Stephan Vogel, Carnegie Mellon University

Nguyen Bach, Qin Gao and Stephan Vogel · nbach/papers/MT-Summit-XII-slides.pdf

Sep 21, 2018


Source‐side Dependency Tree Reordering Models with Subtree Movements and Constraints

Nguyen Bach, Qin Gao and Stephan Vogel

Carnegie Mellon University


Overview

• We introduce source‐side dependency tree reordering models

• Inspired by the lexicalized reordering model (Koehn et al., 2005), hierarchical dependency translation (Shen et al., 2008) and cohesive decoding (Cherry, 2008)

• We model reordering events of phrases associated with source‐side dependency trees

• Inside/Outside subtree movements efficiently capture the statistical distribution of the subtree‐to‐subtree transitions in training data

• Subtree movements are used directly at decoding time, alongside cohesive constraints, to guide the search process

• Improvements are shown on English‐Spanish and English‐Iraqi tasks


Outline

• Background & Motivations

• Source‐side dependency tree reordering models

– Modeling

– Training

– Decoding

• Experiments & Analysis

• Conclusions


Background of Reordering Models

Explicitly model phrase reordering distances

Put syntactic analysis of the target language into both modeling and decoding

Use source language syntax


Explicitly model phrase reordering distances

• Distance‐based (Och, 2002; Koehn et al., 2003)

• Lexicalized phrase (Tillmann, 2004; Koehn et al., 2005; Al‐Onaizan and Papineni, 2006)

• Hierarchical phrase (Galley and Manning, 2008)

• MaxEnt classifier (Zens and Ney, 2006; Xiong et al., 2006; Chang et al., 2009)

Put syntactic analysis of the target language into both modeling and decoding

• Direct modeling of target language constituent movement in either constituency trees (Yamada and Knight, 2001; Galley et al., 2006; Zollmann et al., 2008) or dependency trees (Quirk et al., 2005)

• Hierarchical phrase‐based (Chiang, 2005; Shen et al., 2008)

Use source language syntax

• Preprocessing with syntactic reordering rules (Xia and McCord, 2004; Collins et al., 2005; Rottmann and Vogel, 2007; Wang et al., 2007; Xu et al., 2009)

• Use syntactic analysis to provide multiple source sentence reordering options through word lattices (Zhang et al., 2007; Li et al., 2007; Elming, 2008)

• Source‐side Dependency Tree Reordering Models with Subtree Movements and Constraints


What are the differences?

• Instead of using flat word structures to extract reordering events, utilize source‐side dependency structures

– Provide more linguistic cues for reordering events

• Instead of using pre‐defined reordering patterns, learn reordering feature distributions from training data

– Capture reordering events from real data

• Instead of preprocessing the data, discriminatively train the reordering model via MERT

– Tighter integration with the decoder


Cohesive Decoding

• Cohesive decoding (Cherry, 2008; Bach et al., 2009) enforces the cohesive constraint:

– When the decoder begins translating any part of a source subtree, it must cover all words under that subtree before it can translate anything outside.

• Source‐side dependency tree reordering models

– Efficiently capture the statistical distribution of the subtree‐to‐subtree transitions in training data.

– Use this distribution directly at decoding time to guide the search process.
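
The cohesive constraint stated above can be illustrated with a small check. This is only a sketch under stated assumptions, not the decoder's actual implementation: the tree `HEADS`, the function names, and the simplification that a phrase violates cohesion only when it lies entirely outside an open subtree are all hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical tree: each word maps to its head; the root maps to None.
HEADS: Dict[str, Optional[str]] = {
    "g": None, "b": "g", "d": "g", "a": "b", "c": "b", "e": "d", "f": "d",
}


def subtree(heads: Dict[str, Optional[str]], root: str) -> Set[str]:
    """Return `root` together with every word it dominates."""
    nodes = {root}
    changed = True
    while changed:
        changed = False
        for word, head in heads.items():
            if head in nodes and word not in nodes:
                nodes.add(word)
                changed = True
    return nodes


def violates_cohesion(heads: Dict[str, Optional[str]],
                      covered: Set[str],
                      next_phrase: Set[str]) -> bool:
    """True if `next_phrase` lies entirely outside some open subtree.

    A subtree is open when some, but not all, of its words are covered.
    """
    for root in heads:
        nodes = subtree(heads, root)
        is_open = bool(nodes & covered) and not nodes <= covered
        if is_open and not (nodes & next_phrase):
            return True
    return False
```

With only "a" translated, the subtree rooted at "b" is open, so moving on to "d e" would break the constraint while translating "c" would not.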


Outline

• Background of Reordering Models

• Source‐side dependency tree reordering models

– Modeling

– Training

– Decoding

• Experiments & Analysis

• Conclusions


Lexicalized Reordering Models (Tillmann, 2004; Koehn, et.al., 2005; Al‐Onaizan & Papineni, 2006)

$$p(O \mid e, f) = \prod_{i=1}^{n} p(o_i \mid e_i, f_{a_i}, a_{i-1}, a_i)$$

where

• $f$ is the input sentence;

• $e = (e_1, \ldots, e_n)$ is the target language phrases;

• $a = (a_1, \ldots, a_n)$ is the phrase alignments;

• $f_{a_i}$ is a source phrase which has a translated phrase $e_i$ defined by an alignment $a_i$;

• $O$ is the orientation phrase sequence; each $o_i$ takes one of 3 possible values (M, S, D).
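
As an illustration of the three orientation values, the sketch below classifies one phrase transition from the source spans of two consecutively generated target phrases. It uses the usual phrase-based definition of monotone, swap and discontinuous; the slides do not spell the rule out, so the function name and the inclusive `(start, end)` span convention are assumptions.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) source positions


def orientation(prev_span: Span, cur_span: Span) -> str:
    """Classify the move from the previous source phrase to the current one."""
    prev_start, prev_end = prev_span
    cur_start, cur_end = cur_span
    if cur_start == prev_end + 1:
        return "M"  # monotone: current phrase directly follows the previous one
    if cur_end == prev_start - 1:
        return "S"  # swap: current phrase directly precedes the previous one
    return "D"      # discontinuous: anything else


def orientation_sequence(spans: List[Span]) -> List[str]:
    """Orientations for a list of source spans in target generation order."""
    return [orientation(p, c) for p, c in zip(spans, spans[1:])]
```

The resulting sequence plays the role of $O = (o_1, \ldots, o_n)$ in the product above, apart from the first phrase, which has no predecessor in this simplified version.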

[Figure, built up over slides 11 to 16: word alignment matrix with both axes indexed 0 to 16. The Spanish target words on the vertical axis are, from 0 to 16: Por, lo, tanto, quisiera, pedirle, nuevamente, que, se, encargue, de, que, podamos, ver, también, un, canal, neerlandés. Orientation labels appear as phrases are added: Discontinuous (with "pedirle"), Swap (with "nuevamente"), Discontinuous (with "que se encargue de que") and Monotone (with "podamos ver también un canal neerlandés").]


Pros & Cons of Lexicalized Reordering Models

• Pros

– Intuitively model flat word movements

– Well‐defined for the phrase‐based framework

• Cons

– No linguistic structure

– Need the alignment matrix to determine movements


Completed/Open subtrees

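[Figure: example dependency tree over the words a, b, c, d, e, f and g, with a completed subtree highlighted]

A completed subtree: when all words under a node have been translated, we call the subtree rooted at that node a completed subtree.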


[Figure: the same example tree, with an open subtree highlighted]

An open subtree: a subtree that has begun translation but is not yet complete.


Inside/Outside subtree movements

[Figure: the example tree, with “c” moving inside the subtree rooted at “b”]

Inside: “c” is moving inside a subtree rooted at “b”.

A structure is moving inside a subtree if it helps the subtree become completed or less open.


[Figure: the example tree, with “d e” moving outside the subtree rooted at “b”]

Outside: “d e” is moving outside a subtree rooted at “b”.

A structure is moving outside a subtree if it leaves the subtree open.
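
The completed/open and inside/outside definitions can be written down as simple set operations. This is a hedged sketch only: the function names are invented here, the subtree rooted at "b" is assumed to contain the words a, b and c (the figure's edges did not survive extraction), and this is not the interruption check algorithm of Bach et al. (2009).

```python
from typing import Optional, Set


def subtree_state(nodes: Set[str], covered: Set[str]) -> str:
    """Classify a subtree by how many of its words have been translated."""
    if nodes <= covered:
        return "completed"
    if nodes & covered:
        return "open"
    return "untouched"


def movement(nodes: Set[str], covered: Set[str], phrase: Set[str]) -> Optional[str]:
    """Label the translation of `phrase` relative to one subtree.

    "inside":  the phrase covers untranslated words of the subtree,
               so the subtree becomes completed or less open.
    "outside": the subtree is open and the phrase leaves it open.
    None:      the subtree is untouched or completed and the phrase
               does not involve it.
    """
    if phrase & (nodes - covered):
        return "inside"
    if subtree_state(nodes, covered) == "open":
        return "outside"
    return None


# Assumed membership of the subtree rooted at "b".
B_SUBTREE = {"a", "b", "c"}
```

With "a" and "b" already translated, `movement(B_SUBTREE, {"a", "b"}, {"c"})` gives "inside" and `movement(B_SUBTREE, {"a", "b"}, {"d", "e"})` gives "outside", matching the two slide examples under the assumed coverage.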


Source‐side Dependency Tree (SDT) Reordering Models

$$p(D \mid e, f) = \prod_{i=1}^{n} p(d_i \mid e_i, f_{a_i}, a_i, s_{i-1}, s_i)$$

where

• $f$ is the input sentence;

• $e = (e_1, \ldots, e_n)$ is the target language phrases;

• $a = (a_1, \ldots, a_n)$ is the phrase alignments;

• $f_{a_i}$ is a source phrase which has a translated phrase $e_i$ defined by an alignment $a_i$;

• $s_i$ and $s_{i-1}$ are the dependency structures of source phrases $f_{a_i}$ and $f_{a_{i-1}}$;

• $D$ represents the sequence of syntactic phrase movements over the source dependency tree; each $d_i \in \{I, O\}$.

[Figure, built up over slides 23 to 29: the English source sentence "I would therefore once more ask you to ensure that we get a Dutch channel as well", shown with its dependency structure, aligned to the Spanish target "Por lo tanto quisiera pedirle nuevamente que se encargue de que podamos ver también un canal neerlandés". Each of the four highlighted phrase transitions carries both a lexicalized orientation and a subtree movement.]

Lexicalized R.M.: Discontinuous, Swap, Discontinuous, Monotone

Source‐side Dependency Tree R.M.: Inside, Outside, Inside, Inside


Extended Source‐side Dependency Tree (SDT) Reordering Models

$$p(D \mid e, f) = \prod_{i=1}^{n} p\big((o\_d)_i \mid e_i, f_{a_i}, a_{i-1}, a_i, s_{i-1}, s_i\big)$$

where

• $f$ is the input sentence;

• $e = (e_1, \ldots, e_n)$ is the target language phrases;

• $a = (a_1, \ldots, a_n)$ is the phrase alignments;

• $f_{a_i}$ is a source phrase which has a translated phrase $e_i$ defined by an alignment $a_i$;

• $s_i$ and $s_{i-1}$ are the dependency structures of source phrases $f_{a_i}$ and $f_{a_{i-1}}$;

• $D$ represents the sequence of syntactic phrase movements over the source dependency tree; each $(o\_d)_i \in \{M\_I, S\_I, D\_I, M\_O, S\_O, D\_O\}$.

Example labels for the four phrase transitions:

Lexicalized orientation:  Discontinuous  Swap     Discontinuous  Monotone
Subtree movement:         Inside         Outside  Inside         Inside
Combined label (o_d):     D_I            S_O      D_I            M_I


Training 

• Obtain a dependency parse of the source side

• Given a sentence pair and the source‐side dependency tree

– Phrase extraction: also extract the source dependency structures of phrase pairs

– Identify Inside/Outside movements using the Interruption Check Algorithm (Bach et al., 2009)


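$$p\big((o\_d)_{jk} \mid e_i, f_{a_i}, o_j, d_k\big) = \frac{\gamma + \mathrm{count}\big((o\_d)_{jk}\big)}{\sum_{k}\sum_{j}\big(\gamma + \mathrm{count}((o\_d)_{jk})\big)}$$

$$p\big((o\_d)_{jk} \mid e_i, f_{a_i}, d_k\big) = \frac{\gamma + \mathrm{count}\big((o\_d)_{jk}\big)}{\sum_{k}\big(\gamma + \mathrm{count}((o\_d)_{jk})\big)}$$

$$p\big((o\_d)_{jk} \mid e_i, f_{a_i}, o_j\big) = \frac{\gamma + \mathrm{count}\big((o\_d)_{jk}\big)}{\sum_{j}\big(\gamma + \mathrm{count}((o\_d)_{jk})\big)}$$

DO (first equation): a joint probability of subtree movements and lexicalized orientations

DOD (second equation): conditioned on subtree movements

DOO (third equation): conditioned on lexicalized orientations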
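
A minimal sketch of the three estimates for a single phrase pair, using additive smoothing with γ. The function name, the default `gamma=0.5` and the count layout are assumptions. The normalisation groups follow the verbal descriptions: DO over all six events, DOD within each subtree movement, DOO within each lexicalized orientation.

```python
from typing import Dict, Tuple

ORIENTATIONS = ("M", "S", "D")  # monotone, swap, discontinuous
MOVEMENTS = ("I", "O")          # inside, outside

Event = Tuple[str, str]         # (orientation, movement), e.g. ("D", "I") for D_I


def estimate(counts: Dict[Event, int], gamma: float = 0.5):
    """Return (DO, DOD, DOO) probability tables for one phrase pair."""
    # Smoothed count for every one of the six combined events.
    c = {(o, d): counts.get((o, d), 0) + gamma
         for o in ORIENTATIONS for d in MOVEMENTS}

    total = sum(c.values())
    do = {ev: v / total for ev, v in c.items()}

    # DOD: normalise over orientations for a fixed movement.
    dod = {(o, d): c[(o, d)] / sum(c[(o2, d)] for o2 in ORIENTATIONS)
           for (o, d) in c}

    # DOO: normalise over movements for a fixed orientation.
    doo = {(o, d): c[(o, d)] / sum(c[(o, d2)] for d2 in MOVEMENTS)
           for (o, d) in c}
    return do, dod, doo
```

For example, `estimate({("M", "I"): 7, ("D", "I"): 2, ("M", "O"): 1})` yields a DO table summing to 1 over all six events, a DOD table summing to 1 within each movement, and a DOO table summing to 1 within each orientation.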


Decoding

• Without cohesive constraints

– No source dependency tree information is available at decoding time

– Consider both subtree movements, and add them up to the translation model costs

• With cohesive constraints

– The source dependency tree is available at decoding time

– Consider only the inside or the outside movement, depending on the output of the interruption check algorithm
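
The two decoding modes can be sketched as a single cost function. This is one possible reading of the bullets above, not the decoder's documented behaviour: the function name, the use of negative log probabilities as costs, and the choice to add the inside and outside costs together when the movement is unknown are all assumptions.

```python
import math
from typing import Dict, Optional, Tuple


def sdt_cost(probs: Dict[Tuple[str, str], float],
             orientation: str,
             movement: Optional[str] = None) -> float:
    """Reordering cost of one phrase pair as a negative log probability.

    probs maps (orientation, movement) to a model probability.
    movement is "I" or "O" when the interruption check can decide it
    (cohesive decoding), and None when no source tree is available.
    """
    if movement is not None:
        # With cohesive constraints: use only the observed movement.
        return -math.log(probs[(orientation, movement)])
    # Without cohesive constraints: add up the costs of both movements.
    return (-math.log(probs[(orientation, "I")])
            - math.log(probs[(orientation, "O")]))
```

In a log-linear decoder this value would be one more feature whose weight is tuned with the others.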


Outline

• Background of Reordering Models

• Source‐side dependency tree reordering models

– Modeling

– Training

– Decoding

• Experiments & Analysis

• Conclusion


Experimental setup

• Baseline: a phrase‐based MT system with a lexicalized reordering model

• Coh: using cohesive constraints

• DO / DOD / DOO: using the source‐side dependency tree (SDT) reordering model with different parameter estimations

• DO+Coh / DOD+Coh / DOO+Coh: decoding with both the SDT reordering model and cohesive constraints


English‐Spanish (Europarl)

• Source‐side dependency tree reordering models and cohesive constraints obtained improvements over the lexicalized reordering models.

[Figure: two BLEU bar charts. English‐Spanish: nc‐test2007 (BLEU axis 32.6 to 33.8) and English‐Spanish: news‐test2008 (BLEU axis 19.6 to 20.8)]


English‐Iraqi (TransTac)

• Decoding with both source‐side dependency tree reordering models and cohesive constraints often obtains the best performance.

[Figure: two BLEU bar charts. English‐Iraqi: june2008 (BLEU axis 25 to 25.7) and English‐Iraqi: nov2008 (BLEU axis 17.6 to 19.2)]


Where are improvements coming from?


Test set breakdown

• Divide each test set into three portions based on the sentence‐level TER of the baseline system

• μ and σ are the mean and standard deviation of the sentence‐level TER over the whole test set

• Head, Tail and Mid are the sentences whose score is lower than μ − ½σ, higher than μ + ½σ, and the rest, respectively
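
The breakdown can be sketched as follows. The function name is hypothetical, and the use of the population standard deviation is an assumption, since the slides do not say which variant was used.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def split_by_ter(ter_scores: List[float]) -> Tuple[List[int], List[int], List[int]]:
    """Return sentence indices for the (head, mid, tail) portions.

    head: TER below mu - sigma/2 (best baseline translations)
    tail: TER above mu + sigma/2 (worst baseline translations)
    mid:  everything else
    """
    mu = mean(ter_scores)
    sigma = pstdev(ter_scores)  # assumption: population standard deviation
    head, mid, tail = [], [], []
    for idx, score in enumerate(ter_scores):
        if score < mu - 0.5 * sigma:
            head.append(idx)
        elif score > mu + 0.5 * sigma:
            tail.append(idx)
        else:
            mid.append(idx)
    return head, mid, tail
```

Each system can then be scored separately on the three index sets to see where its gains over the baseline come from.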


[Figure: four bar charts with a BLEU axis and series for tail, mid and head. Panels: English‐Spanish: nc‐test2007 (axis −0.8 to 0.7), English‐Spanish: news‐test2008 (axis −1 to 1), English‐Iraqi: june‐2008 (axis −1 to 1.5), English‐Iraqi: nov‐2008 (axis −4 to 4)]

        june‐08   nov‐08   nc‐test2007   news‐test2008
Head     7.92      6.27     20.39         13.07
Mid     12.31     11.09     28.07         22.78
Tail    13.91     14.08     35.29         25.33


What is the most significant effect the source‐tree reordering models contribute?


Numbers of Reorderings

           nc‐test2007   news‐test2008   june‐2008   nov‐2008
Baseline   1507          1684            39          24
Coh        2045          2903            46          21
DO         2189          2113            97          58
DO+Coh     1929          1900            155         88
DOD        1735          2592            123         60
DOD+Coh    2070          2021            148         90
DOO        1735          1785            164         49
DOO+Coh    1818          1959            247         66

• More reorderings can be generated without losing performance.

• The source‐tree reordering models provide a more discriminative mechanism to estimate reordering events.

• Reordering  is more language‐specific than general translation models, and the conditions for a reordering event to happen vary among languages.


Outline

• Background & Motivations

• Source‐side dependency tree reordering models

– Modeling

– Training

– Decoding

• Experiments & Analysis

• Conclusions


Conclusions & Future Work

• Conclusions

– Source‐side dependency tree reordering models are helpful

• Model reordering events with Inside/Outside subtree movements

– Their effectiveness was shown in comparison with a strong reordering model

– Improvements were obtained on 2 language pairs, covering training corpus sizes ranging from 500K up to 1.3M sentence pairs

• Future work

– A hierarchical source‐side dependency reordering model: extend Galley & Manning (2008)

– Packed‐forest dependency tree reordering models


Back up


[Figure: the example dependency tree (words a to g) shown four times, illustrating a completed subtree, an open subtree, an Outside movement (“d e” is moving outside a subtree rooted at “b”) and an Inside movement (“c” is moving inside a subtree rooted at “b”)]

[Figure: four further copies of the example tree, labeled Outside, Inside, Inside and Outside]


What do you mean by introducing Inside/Outside notions?

• The movement of a structure inside or outside a source subtree can be viewed as the decoder leaving the previous source state for the current source state.

• The notions track facts about the subtree‐to‐subtree transitions observed in the source side of word‐aligned training data.


Phrase pair                    Lexicalized    Source‐tree
ask you # pedirle              dis swap       D_I *
ask you # pedirle              mono mono      M_I
ask you # pedirle              mono mono      M_O
once more # nuevamente         swap dis       S_O *
once more # nuevamente         dis swap       D_O
once more # nuevamente que     swap dis       S_O

       M_I     S_I     D_I     M_O     S_O     D_O
DO     0.691   0.003   0.142   0.119   0.009   0.038
DOD    0.827   0.003   0.17    0.719   0.053   0.228
DOO    0.854   0.25    0.79    0.146   0.75    0.21

Inside and outside probabilities for the phrase pair “ask you” / “pedirle” according to the three parameter estimation methods


Distributions of Reordering Events

[Figure: two bar charts over the event types M_I, S_I, D_I, M_O, S_O, D_O, one for En‐Es (vertical axis 0 to 0.4) and one for En‐Ir (vertical axis 0 to 0.7)]

Monotone & inside (M_I) movements are observed more often than any other category.


Presenter notes (explicitly modeling phrase reordering distances)

The first approach is widely used in the phrase-based translation framework. For example:

• Distance-based (Och, 2002; Koehn et al., 2003): base the cost of word movement only on the distance in the source sentence between the previous and the current word or phrase.

• Lexicalized phrase (Tillmann, 2004; Koehn et al., 2005; Al-Onaizan and Papineni, 2006): condition the probability of phrase-to-phrase transitions on the words involved.

• Hierarchical phrase (Galley and Manning, 2008): dynamically determine phrase boundaries using efficient shift-reduce parsing.

• MaxEnt classifier (Zens and Ney, 2006; Xiong et al., 2006; Chang et al., 2009): discriminative reordering models, which showed improvements over the distance-based distortion model.

Presenter notes (syntactic analysis of the target language)

The second approach is widely used in the syntax-based translation framework. One theme in this category is to directly model target language constituent movement in either constituency trees (Yamada and Knight, 2001; Galley et al., 2006; Zollmann et al., 2008) or dependency trees (Quirk et al., 2005). Another theme is hierarchical phrase-based translation (Chiang, 2005; Shen et al., 2008), which implicitly models word movement with SCFG.