CS 378 Lecture 17
Machine Translation
Today: phrase-based MT, word alignment
Announcements: midterm, FP, A3
Recap: LSTMs. The update from h̄_{i-1} to h̄_i involves gated
connections: h̄_i = f(h̄_{i-1}, x̄_i).
Machine translation today: phrase-based MT (pre-2015).
Next time: neural MT with seq2seq models (post-2015),
following on how we used RNNs for LM.
Input: s̄, a source sentence
Output: t̄, a target-language sentence
Data: bitext, a set of (s̄, t̄) pairs
e.g. French → English: French sentence → English sentence
We don't know how to do this directly!
Bernard Vauquois (1968): the Vauquois triangle.
[Triangle diagram: word-level translation at the bottom (s̄ to t̄
directly), syntax above it, interlingua at the top; translating at
an intermediate level of representation is the "sweet spot".]
Phrase-based MT
bitext → word alignment → phrase table; also train an LM
[Alignment grid: "Je fais un bureau" aligned to "I make a desk"]
These feed into a decoder.
Decoder: searches over the space of phrase-by-phrase translations
to find the one that scores best (including an LM score).
"Je fais un bureau" → many candidates; the LM score helps figure
out which candidate is grammatical English.
Word alignment (focus of today)
Input: bitext, (s̄, t̄) pairs
Output: one-to-many alignments from t̄ to s̄
s̄ = Je vais le faire NULL    (NULL is a placeholder)
t̄ = I am going to do it
ā = (1, 2, 2, 2, 4, 3)    e.g. a_2 = 2: "am" aligns to "vais"
Each word in t̄ aligns to one word in s̄.
Define a vector ā: a_i = index in s̄ that word t_i aligns to.
Alignment models: place a distribution p(t̄, ā | s̄),
a generative model of t̄, ā.
Analogy to HMMs: t̄ ≈ words, ā ≈ tags.
IBM Model 1 (1993)
ā = (a_1, ..., a_n), t̄ = (t_1, ..., t_n)    n target words
s̄ = (s_1, ..., s_m, NULL)                   m source words
Model: P(t̄, ā | s̄) = ∏_{i=1}^{n} P(a_i) P(t_i | s_{a_i})
Generative process: for each target word i, pick a source index a_i.
P(a_i) = 1/(m+1), uniform over these options.
Then generate t_i conditioned on s_{a_i}, the a_i-th source word.
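The two factors above can be multiplied out directly. A minimal sketch (the function name and the `t_table` dict of translation probabilities are illustrative, not from the lecture; indices here are 0-based rather than the 1-based notation in the notes):

```python
def model1_joint(t_words, a, s_words, t_table):
    """P(t, a | s) = prod_i P(a_i) * P(t_i | s_{a_i}) under IBM Model 1.

    s_words includes NULL as one of its entries, so the uniform
    alignment prior P(a_i) is 1/len(s_words) = 1/(m+1).
    t_table maps (target word, source word) -> probability."""
    p = 1.0
    for t_i, a_i in zip(t_words, a):
        p *= (1.0 / len(s_words)) * t_table.get((t_i, s_words[a_i]), 0.0)
    return p
```

For example, with s̄ = (Je, NULL), t̄ = (I), a_1 pointing at Je, and P(I | Je) = 0.8, the joint probability is (1/2) * 0.8 = 0.4.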
Model params: translation dictionary θ, with rows like
P(target word | Je). These are like emissions in an HMM!
Each a_i is like a switch: it tells you which row of θ to use.
For s̄ = "Je fais":
  a_1 = 1: P(t_1 | Je)
  a_1 = 2: P(t_1 | fais)
T = {I, like, eat}    S = {Je, J', mange, aime, NULL}

           I     like   eat
Je, J'     0.8   0.1    0.1
mange      0     0      1.0
aime       0     1.0    0
NULL       0.4   0.3    0.3
What do we want? The posterior p(ā | s̄, t̄).
e.g. s̄ = Je NULL, t̄ = I:
  P(a_1 = Je | t̄, s̄)   ∝ P(I | Je)   = 0.8  ⇒ 2/3
  P(a_1 = NULL | t̄, s̄) ∝ P(I | NULL) = 0.4  ⇒ 1/3
HMM analogy: P(ȳ | x̄), the posterior over tags in a tagging model.
P(ā | s̄, t̄) = P(t̄, ā | s̄) / Σ_ā' P(t̄, ā' | s̄)
argmax_ā P(ā | t̄, s̄): multiply top and bottom by P(t̄ | s̄),
which is constant w.r.t. ā:
  argmax_ā P(ā | t̄, s̄) = argmax_ā P(ā, t̄ | s̄)
                        = argmax_ā ∏_i P(t_i | s_{a_i})   (P(a_i) is constant)
So P(a_i | t̄, s̄) is proportional to P(t_i | s_{a_i}).
e.g. s̄ = J' aime NULL (indices 1 2 3), t̄ = I like
For a_1 (the word "I"):
  a_1 = 1 (J'):   P(I | J')   = 0.8  ⇒ 2/3
  a_1 = 2 (aime): P(I | aime) = 0    ⇒ 0
  a_1 = 3 (NULL): P(I | NULL) = 0.4  ⇒ 1/3
argmax P(a_1 | s̄, t̄): a_1 = 1
argmax P(a_2 | s̄, t̄): a_2 = 2
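Since P(a_i) is uniform, this posterior is just the relevant column of the translation table, renormalized. A short sketch (the function name and `t_table` dict are illustrative):

```python
def align_posterior(t_i, s_words, t_table):
    """P(a_i = j | t, s) proportional to P(t_i | s_j) under Model 1.

    Returns a normalized distribution over positions in s_words."""
    scores = [t_table.get((t_i, s_j), 0.0) for s_j in s_words]
    z = sum(scores)  # normalizer: sum over all source positions
    return [sc / z for sc in scores]
```

On the example above, the scores (0.8, 0, 0.4) normalize to (2/3, 0, 1/3), matching the hand computation.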
Bitext (e.g. "J' aime" ↔ "I like") ⇒ alignments ⇒ phrases
Learning: hard!
Unsupervised: no examples of labeled ā.
Expectation Maximization (EM):
maximizes Σ_{i=1}^{n} log Σ_ā P(ā, t̄^(i) | s̄^(i))
over the data {(s̄^(i), t̄^(i))}, i = 1..n;
the inner sum is P(t̄^(i) | s̄^(i)).
Phrase-based MT: (s̄, t̄) pairs ⇒ learn an aligner ⇒ align our data
⇒ phrase extraction: aligned sentences ⇒ phrase translation options
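Phrase extraction from a word-aligned sentence pair is commonly done with a consistency check: a source span and target span form a phrase pair if they contain at least one alignment link and no link crosses the boundary. A simplified sketch (the `max_len` limit and all names are illustrative assumptions, not from the lecture):

```python
def extract_phrases(s_words, t_words, alignment, max_len=4):
    """alignment: set of (s_idx, t_idx) links (0-based).
    Returns consistent phrase pairs as (source text, target text)."""
    pairs = []
    n_s, n_t = len(s_words), len(t_words)
    for s1 in range(n_s):
        for s2 in range(s1, min(s1 + max_len, n_s)):
            for t1 in range(n_t):
                for t2 in range(t1, min(t1 + max_len, n_t)):
                    inside = [(i, j) for (i, j) in alignment
                              if s1 <= i <= s2 and t1 <= j <= t2]
                    # Violation: a link with exactly one endpoint in the box.
                    crossing = [(i, j) for (i, j) in alignment
                                if (s1 <= i <= s2) != (t1 <= j <= t2)]
                    if inside and not crossing:
                        pairs.append((" ".join(s_words[s1:s2 + 1]),
                                      " ".join(t_words[t1:t2 + 1])))
    return pairs
```

For "J' aime" ↔ "I like" with links J'–I and aime–like, this yields the word pairs plus the full phrase pair ("J' aime", "I like"), but rejects inconsistent boxes like ("J'", "like").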