Symmetric Probabilistic Alignment Jae Dong Kim Committee: Jaime G. Carbonell Ralf D. Brown Peter J. Jansen

Dec 21, 2015

Symmetric Probabilistic Alignment

Jae Dong Kim

Committee: Jaime G. Carbonell, Ralf D. Brown, Peter J. Jansen

Motivation

In the CMU EBMT system, alignment has been studied less than the other components.

We want to investigate a new sub-sentential aligner which uses translation probabilities in a symmetric fashion.

Outline

Introduction

Symmetric Probabilistic Alignment

Experiments and Results

Conclusions

Future Work

Aligner in the EBMT

Sub-sentential Alignment

The CMU EBMT system refers to translation examples to translate an unknown source sentence

Since it is hard to find an exactly matching example sentence, the system finds the longest match
- Encapsulated local context
- Local reordering

The aligner should work on fragments (sub-sentences)

Need for a new aligner

Relatively less studied compared to the other components

The old aligner
- Heuristic based
- Builds a correspondence table
- Finds the longest target fragment and the shortest target fragment
- Checks every substring of the longest one that includes the shortest one

Fast but doesn’t use probabilities
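The candidate search of the old aligner can be sketched as follows. This is only an illustration of "every substring of the longest fragment that includes the shortest one"; the function name `candidate_spans` and the inclusive `(start, end)` span representation are assumptions, not part of the EBMT system.

```python
def candidate_spans(longest, shortest):
    """Enumerate every contiguous span inside `longest` that contains `shortest`.

    Spans are inclusive (start, end) pairs of target word positions.
    """
    lo_start, lo_end = longest
    sh_start, sh_end = shortest
    if not (lo_start <= sh_start <= sh_end <= lo_end):
        raise ValueError("shortest span must lie inside longest span")
    return [
        (start, end)
        for start in range(lo_start, sh_start + 1)  # may start anywhere up to the shortest span
        for end in range(sh_end, lo_end + 1)        # may end anywhere from the shortest span on
    ]


# Hypothetical positions: longest fragment t2..t6, shortest fragment t3..t4.
spans = candidate_spans(longest=(2, 6), shortest=(3, 4))
print(spans)
```

The number of candidates grows with the product of the slack on each side, which is why this search is fast when the two fragments are close in size.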

Related Work

IBM models (Brown et al, 93)

HMM (Vogel et al, 96)

Competitive link (Melamed, 97)

Explicit Syntactic Information (Yamada et al, 02)

ISA (Zhang, 03)

The SPA differs from the above in that it aligns sub-sentences, using translation probabilities and some heuristics, when the boundaries of the source fragment are given.

Outline

Introduction

Symmetric Probabilistic Alignment

Experiments and Results

Conclusions

Future Work

Basic Algorithm (1)

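Assumptions:
- A bilingual probabilistic dictionary is available
- Contiguous source fragments are translated into contiguous target fragments
- Fragments are translated independently of surrounding context

Given a source fragment s_{i_1}, ..., s_{i_k} and a candidate target fragment t = t_{j_1}, ..., t_{j_l}:

$$
\hat{t} = \arg\max_{t}
\left( \prod_{p=1}^{k} \max\left( \max_{q=1}^{l} p(t_{j_q} \mid s_{i_p}),\ \epsilon \right) \right)^{1/k}
\times
\left( \prod_{q=1}^{l} \max\left( \max_{p=1}^{k} p(s_{i_p} \mid t_{j_q}),\ \epsilon \right) \right)^{1/l}
$$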

Basic Algorithm (2)

Assume that we are considering a candidate target fragment 't2 t3 t4' given a source fragment 's7 s8 s9'

Source -> Target Translation Score

S_tmp = max( p(t2|s7), p(t3|s7), p(t4|s7), ε )
      x max( p(t2|s8), p(t3|s8), p(t4|s8), ε )
      x max( p(t2|s9), p(t3|s9), p(t4|s9), ε )

S_st = S_tmp^{1/3}

Basic Algorithm (3)

Source <- Target Translation Score

S_tmp = max( p(s7|t2), p(s8|t2), p(s9|t2), ε )
      x max( p(s7|t3), p(s8|t3), p(s9|t3), ε )
      x max( p(s7|t4), p(s8|t4), p(s9|t4), ε )

S_ts = S_tmp^{1/3}

Source <-> Target Translation Score

Score = S_st * S_ts
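The bidirectional score can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SPA implementation: the dictionaries, their probabilities, the value of `EPSILON`, and the function names are all made up for the example, and the search simply tries every contiguous target fragment.

```python
EPSILON = 1e-6  # assumed floor value; the slides only call it ε


def directional_score(cond_words, out_words, prob, eps=EPSILON):
    """Geometric mean, over cond_words, of max(best p(out | cond), eps).

    `prob` maps (out_word, cond_word) -> p(out_word | cond_word).
    """
    product = 1.0
    for cond in cond_words:
        best = max(prob.get((out, cond), 0.0) for out in out_words)
        product *= max(best, eps)
    return product ** (1.0 / len(cond_words))


def spa_score(source, target, p_t_given_s, p_s_given_t, eps=EPSILON):
    """Score = S_st * S_ts for one source fragment and one candidate target fragment."""
    s_st = directional_score(source, target, p_t_given_s, eps)  # source -> target
    s_ts = directional_score(target, source, p_s_given_t, eps)  # source <- target
    return s_st * s_ts


def best_target_fragment(source, target_sentence, p_t_given_s, p_s_given_t):
    """Argmax of spa_score over all contiguous fragments of the target sentence."""
    candidates = [
        target_sentence[i:j]
        for i in range(len(target_sentence))
        for j in range(i + 1, len(target_sentence) + 1)
    ]
    return max(candidates, key=lambda t: spa_score(source, t, p_t_given_s, p_s_given_t))


# Toy dictionary entries (hypothetical probabilities).
p_t_given_s = {("t2", "s7"): 0.5, ("t3", "s8"): 0.8, ("t4", "s9"): 0.5}
p_s_given_t = {("s7", "t2"): 0.4, ("s8", "t3"): 1.0, ("s9", "t4"): 0.4}

source = ["s7", "s8", "s9"]
score = spa_score(source, ["t2", "t3", "t4"], p_t_given_s, p_s_given_t)
best = best_target_fragment(source, ["t1", "t2", "t3", "t4", "t5"],
                            p_t_given_s, p_s_given_t)
print(score, best)
```

Because each direction is a geometric mean, a target fragment that is too short is punished by the source -> target term (some source word has no translation and falls back to ε), and one that is too long is punished by the source <- target term.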

Restrictions (1)

Untranslated word penalty

    s7 s8 s9
    t2 t3 t4

Anchor context

    s6 s7 s8 s9 s10        s6 s7 s8 s9 s10
    t1 t2 t3 t4 t5         t1 t2 t3 t4 t5

Restrictions (2)

Length penalty
- “t2 ... t30” for “s7 s8 s9”. Realistic?
- We expect the target fragment length to be proportional to the source fragment length.

Distance penalty
- “t45 t46 t47” for “s7 s8 s9”. Realistic?
- Maybe. Between languages with similar word order, we might expect a proportional position.
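The slides do not give formulas for these two penalties. The sketch below shows only one way the two expectations could be turned into multiplicative factors; the exponential form, the function names, and the `ratio` and `weight` parameters are all assumptions made for illustration.

```python
import math


def length_penalty(src_len, tgt_len, ratio=1.0, weight=0.5):
    """Assumed form: decays as tgt_len departs from the expected ratio * src_len."""
    expected = ratio * src_len
    return math.exp(-weight * abs(tgt_len - expected) / expected)


def distance_penalty(src_start, src_sent_len, tgt_start, tgt_sent_len, weight=0.5):
    """Assumed form: decays as the relative positions in the two sentences diverge."""
    src_rel = src_start / src_sent_len
    tgt_rel = tgt_start / tgt_sent_len
    return math.exp(-weight * abs(src_rel - tgt_rel))


# "t2 ... t30" (29 words) versus a 3-word target for the 3-word source "s7 s8 s9".
print(length_penalty(3, 29), length_penalty(3, 3))

# Target starting at t45 versus t8 for a source starting at s7 (hypothetical 50-word sentences).
print(distance_penalty(7, 50, 45, 50), distance_penalty(7, 50, 8, 50))
```

Either factor would multiply the translation score, so a candidate with a perfectly proportional length or position is left unchanged and implausible candidates are scaled down.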

The SPA CFD

Combined Aligner

Set a threshold for the SPA

The SPA outputs only results whose score is higher than the threshold

For each source fragment
- If there is a result from the SPA -> use the SPA result
- Otherwise, use the IBM result
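The fallback rule can be sketched as below. The data layout is an assumption for illustration: SPA results are stored as `(target_span, score)` per source fragment, IBM results as a target span per source fragment, and the fragment keys, spans, and threshold value are hypothetical.

```python
def combine_alignments(fragments, spa_results, ibm_results, threshold):
    """For each source fragment, use the SPA result if it clears the threshold, else IBM."""
    combined = {}
    for frag in fragments:
        spa = spa_results.get(frag)  # (target_span, score) or None
        if spa is not None and spa[1] > threshold:
            combined[frag] = spa[0]
        else:
            combined[frag] = ibm_results.get(frag)
    return combined


fragments = ["s7 s8 s9", "s1 s2 s3"]
spa_results = {"s7 s8 s9": ((2, 4), 0.31), "s1 s2 s3": ((0, 1), 0.02)}
ibm_results = {"s7 s8 s9": (2, 5), "s1 s2 s3": (0, 2)}

combined = combine_alignments(fragments, spa_results, ibm_results, threshold=0.1)
print(combined)
```

Raising the threshold shifts more fragments to the IBM result, so the threshold controls how much the combined aligner trusts the SPA.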

Outline

Introduction

Symmetric Probabilistic Alignment

Experiments and Results

Conclusions

Future Work

Alignment Accuracy (1)

Evaluation Metrics
- F1 (Precision, Recall), based on positions

Data: English-Chinese
- Xinhua news wire
- Training data: 1m sentence pairs
  - Trained GIZA++ with default parameters
  - For the SPA, used the dictionary produced by GIZA++
- Test data:
  - 366 sentence pairs, 3 copies by 3 people
  - 20 more sentence pairs, 1 copy by another
  - 27286 source fragments, 3-8 words long

Alignment Accuracy (2)

Data: French-English
- Canadian Hansard
- Training data: 1m sentence pairs
  - Trained GIZA++ with default parameters
  - For the SPA, used the dictionary produced by GIZA++
- Test data:
  - 91 sentence pairs
  - 12466 source fragments, 3-8 words long

Alignment Accuracy (3)

Alignments to be compared
- Random: random alignment to a reasonably long target fragment
- Positional: alignment to a proportionally positioned target fragment
- Oracle: the best possible contiguous human alignment
- SPA-uni: unidirectional basic alignment
- SPA-basic: bidirectional basic alignment
- SPA: the best SPA alignment with restrictions
- IBM4: non-contiguous alignment by IBM Model 4
- COMB: the combination of SPA and IBM4 alignments
- SPA-top10: the best of top 10 alignment results of SPA

Alignment Accuracy : En-Cn

SPA-basic outperformed SPA-uni

SPA was the best when we applied untranslated word penalty and length penalty

Our significance test showed that the difference between IBM4 and COMB is significant

            Recall    Precision  F1
Random      0.321979  0.372175   0.345262
Positional  0.582254  0.576207   0.579215
Oracle      0.905602  0.861449   0.882974
SPA-uni     0.942574  0.355970   0.516776
SPA-basic   0.869897  0.473884   0.613538
SPA (u,l)   0.733485  0.693883   0.713135
IBM4        0.738995  0.807471   0.771717
COMB        0.756338  0.804163   0.779517

Alignment Accuracy : Fr-En

               Recall    Precision  F1
Random         0.193939  0.238384   0.213877
Positional     0.668841  0.728991   0.697622
Oracle         0.980509  0.937717   0.958636
SPA-uni        0.880979  0.281680   0.426874
SPA-basic      0.707808  0.712078   0.709936
SPA (u,a,l,d)  0.781466  0.801407   0.791311
IBM4           0.777064  0.965592   0.861130
COMB           0.781734  0.960679   0.862018

SPA-basic outperformed SPA-uni

SPA was the best when we applied all the restrictions

Our significance test showed that the difference between IBM4 and COMB is not significant

Human Alignment Evaluation

              Recall    Precision  F1
ltao/xuwang   0.858758  0.980900   0.915774
ltao/sandy    0.742688  0.982903   0.846076
xuwang/ltao   0.896835  0.976533   0.934989
xuwang/sandy  0.783359  0.987704   0.873742
sandy/ltao    0.959004  0.950798   0.954884
sandy/xuwang  0.968574  0.961476   0.965012

Rough idea about how much humans agree on alignment

EBMT Performance (1)

Data
- French-English (Canadian Hansard)
- 20k training sentence pairs
- Test
  - Development set: 100 sentence pairs
  - 2 reference set: 2 references for 100 source sentences
  - Evaluation set: 10 X 100 sentence pairs

Evaluation Metric
- BLEU

EBMT performance (2)

      Devtest  2refTest  Test
EBMT  0.1632   0.2400    0.13455
SPA   0.2214   0.2896    0.17287
IBM4  0.2197   0.2785    0.17549
COMB  0.2240   0.2815    0.17506

SPA, IBM4 and COMB perform significantly better than EBMT (the old aligner)

For 'Test', SPA outperformed EBMT by 28.5% relative (0.17287 vs. 0.13455 BLEU)

Among SPA, IBM4 and COMB, nothing is significantly better than the others

Outline

Introduction

Symmetric Probabilistic Alignment

Experiments and Results

Conclusions

Future Work

Conclusions

Improvement on EBMT performance

The combined aligner worked best on the English-Chinese set

Bidirectional alignment worked better than unidirectional alignment

Future Work

Incorporating human dictionaries to cover more general domains

Non-contiguous alignment

Co-training of the SPA and a dictionary

Experiments on different data sets and different language pairs

Experiments with different metrics

Speed up

References

Ying Zhang, Stephan Vogel, and Alex Waibel. 2003. Integrated phrase segmentation and alignment model for statistical machine translation. Submitted to Proc. of International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE), Beijing, China.

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2):263-311.

Stephan Vogel, Hermann Ney, and Christoph Tillmann. 1996. HMM-based word alignment in statistical translation. In COLING '96: The 16th Int. Conf. on Computational Linguistics, pages 836-841, Copenhagen, August.

I. Dan Melamed. 1997. A word-to-word model of translational equivalence. In Proc. of ACL '97, pages 490-497, Madrid, Spain.

K. Yamada and K. Knight. 2002. A decoder for syntax-based statistical MT. In ACL '02.

Thank You !!

Questions?

Backup Slides

Alignment Accuracy Calculation

Non-contiguous Alignment

Alignment Accuracy Calculation

Human Answer
... under the unemployment insurance plan of the other country ...

Machine Answer
... under the unemployment insurance plan of the other country ...

Precision: 4/5 = 0.8

Recall: 4/8 = 0.5

F1 = 2 x 0.8 x 0.5 / (0.8 + 0.5) = 0.6154
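A position-based calculation of this kind can be sketched as follows. The word positions are hypothetical (the highlighting on the original slide is not available); they are chosen so that the human answer covers 8 positions, the machine answer covers 5, and 4 positions are shared. The function name is also an assumption.

```python
def position_prf(human_positions, machine_positions):
    """Precision, recall and F1 over sets of aligned target word positions."""
    human, machine = set(human_positions), set(machine_positions)
    correct = len(human & machine)
    precision = correct / len(machine) if machine else 0.0
    recall = correct / len(human) if human else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


human = range(10, 18)    # 8 positions marked by the human annotator (hypothetical)
machine = range(14, 19)  # 5 positions proposed by the aligner (hypothetical)

precision, recall, f1 = position_prf(human, machine)
print(precision, recall, round(f1, 4))
```

Scoring positions rather than whole fragments gives partial credit to a machine alignment that overlaps the human one without matching its boundaries exactly.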

Non-contiguous Alignment