Page 1: CS 479, section 1: Natural Language Processing

CS 479, section 1: Natural Language Processing

Lecture #36: Alignment and Metrics

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.

Thanks to Dan Klein of UC Berkeley and Uli Germann of ISI for many of the materials used in this lecture.

Page 2: CS 479, section 1: Natural Language Processing

Announcements

Check the schedule:
Plan enough time to succeed! Don't get or stay blocked. Get your questions answered early. Get the help you need to keep moving forward.
No late work accepted after the last day of instruction.

Project Report
Early: Friday
Due: Monday

Reading Report #14: Phrase-based MT paper
Due: next Wednesday (online again)

Page 3: CS 479, section 1: Natural Language Processing

Objectives

Consider additional models for statistical word alignment

See how our alignment models capture real phrases

Understand how to score a word-level alignment with a word alignment model

Discuss how to evaluate alignment quality

Page 4: CS 479, section 1: Natural Language Processing

Quiz

Why do we use parallel corpora (bitexts)?

What is the hidden (unknown) variable in building translation models?

What was the main idea behind IBM Model 1? Model 2?

Page 5: CS 479, section 1: Natural Language Processing

Recall This Example

Des tremblements de terre ont à nouveau touché le Japon jeudi 4 novembre.

On Tuesday Nov. 4, earthquakes rocked Japan once again

What else is going on here that we haven’t tried to model?

Page 6: CS 479, section 1: Natural Language Processing

Models Summary

IBM Model 1: word alignment
IBM Model 2: word alignment, with global position (order) model
HMM Model: word alignment, with local position (order) model
IBM Model 3: adds model of fertility to Model 2; deficient
IBM Model 4: adds relative ordering to Model 3; deficient
IBM Model 5: fixes deficiency of Model 4
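
To make the first line of this summary concrete, here is a minimal sketch (not from the lecture) of IBM Model 1 training with EM on a toy bitext; the corpus and variable names are invented for illustration:

from collections import defaultdict

# Toy parallel corpus: (foreign, English) sentence pairs.
bitext = [
    ("la maison".split(), "the house".split()),
    ("la fleur".split(), "the flower".split()),
]

# Uniform initialization of the translation table t(f|e).
f_vocab = {f for fs, _ in bitext for f in fs}
t = defaultdict(lambda: 1.0 / len(f_vocab))

for _ in range(10):                      # EM iterations
    count = defaultdict(float)           # expected counts c(f, e)
    total = defaultdict(float)           # expected counts c(e)
    for fs, es in bitext:
        for f in fs:
            # E-step: posterior over which English word generated f
            z = sum(t[(f, e)] for e in es)
            for e in es:
                p = t[(f, e)] / z
                count[(f, e)] += p
                total[e] += p
    # M-step: re-estimate t(f|e) from the expected counts.
    for (f, e), c in count.items():
        t[(f, e)] = c / total[e]

print(t[("la", "the")])   # rises toward 1.0 as EM iterates

The NULL word and sentence-length terms are omitted here to keep the sketch short.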

Page 7: CS 479, section 1: Natural Language Processing

Context

Given a source language sentence, the search algorithm must propose possible translations (target language sentences) along with corresponding alignments.

Let’s pretend we’re in the midst of a search and scoring a single hypothesis

How do we use our models to compute such a score?

Page 8: CS 479, section 1: Natural Language Processing

Example: How to score

Spanish source sentence: "Maria no daba una bofetada a la bruja verde". Here f denotes the "foreign" (source) sentence.

During the search, we propose a possible English translation e: "Mary did not slap the green witch"

We consider one possible alignment a. What is the score, according to Model 5?

i.e., what is P(f, a | e)?

Page 9: CS 479, section 1: Natural Language Processing

Example: How to score (Examples from Local Models)

Mary did not slap the green witch
  ↓ fertility: n(3|slap)
Mary not slap slap slap the green witch
  ↓ NULL insertion: P(NULL)
Mary not slap slap slap NULL the green witch
  ↓ translation: t(la|the)
Maria no daba una bofetada a la verde bruja
  ↓ distortion: d(j|i)
Maria no daba una bofetada a la bruja verde

[Al-Onaizan and Knight, 1998]
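
Multiplying the factors attached to these steps gives the score of the hypothesis. Schematically (a reconstruction of the standard Model 3 decomposition, omitting the NULL-generation and combinatorial terms):

P(f, a | e) ≈ ∏_i n(φ_i | e_i) × ∏_j t(f_j | e_{a_j}) × ∏_j d(j | a_j)

where φ_i is the fertility of English word e_i and a_j is the English position aligned to foreign position j. This hypothesis thus pays n(3|slap) for generating three Spanish words from "slap", t(la|the) for each word translation choice, and a d(j|i) factor for each placement decision.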


Page 11: CS 479, section 1: Natural Language Processing

Cascaded Training

Standard practice for training: initialize each model with the previous (simpler) model, then proceed with EM.

Typical order: 1, (2 | HMM), 3, 4, 5 (i.e., either Model 2 or the HMM model at the second stage)
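
Continuing the toy Model 1 sketch above, warm-starting the next model's EM from the previous model's parameters might look like this (a sketch; the q table layout is illustrative, not a real toolkit's data structures):

from collections import defaultdict

def train_model2(bitext, t, iters=5):
    """EM for IBM Model 2, warm-started from a Model 1 t-table t[(f, e)].
    q[(i, j, l, m)] is the probability that English position i
    generates foreign position j, given lengths l and m."""
    q = defaultdict(lambda: 0.1)           # roughly uniform start
    for _ in range(iters):
        tc, tn = defaultdict(float), defaultdict(float)
        qc, qn = defaultdict(float), defaultdict(float)
        for fs, es in bitext:
            l, m = len(es), len(fs)
            for j, f in enumerate(fs):
                z = sum(q[(i, j, l, m)] * t[(f, e)] for i, e in enumerate(es))
                for i, e in enumerate(es):
                    p = q[(i, j, l, m)] * t[(f, e)] / z
                    tc[(f, e)] += p; tn[e] += p
                    qc[(i, j, l, m)] += p; qn[(j, l, m)] += p
        for (f, e), c in tc.items():       # M-step
            t[(f, e)] = c / tn[e]
        for (i, j, l, m), c in qc.items():
            q[(i, j, l, m)] = c / qn[(j, l, m)]
    return t, q

t2, q = train_model2(bitext, t)            # t from the Model 1 sketch above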

Page 12: CS 479, section 1: Natural Language Processing

Insight into these models

Page 13: CS 479, section 1: Natural Language Processing

Examples: Translation and Fertility

Page 14: CS 479, section 1: Natural Language Processing

Example: Idioms

Page 15: CS 479, section 1: Natural Language Processing

Example: Morphology

Page 16: CS 479, section 1: Natural Language Processing

Example: WSD

Word sense disambiguation: word-based MT systems rarely have a WSD step. Why not?

Page 17: CS 479, section 1: Natural Language Processing

Choosing an Alignment

Page 18: CS 479, section 1: Natural Language Processing

Choosing an Alignment
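
As a concrete illustration (a sketch, not the lecture's own code): under IBM Models 1 and 2 the model factorizes over foreign positions, so the best (Viterbi) alignment can be found by an independent argmax for each foreign word:

def best_alignment_model1(fs, es, t):
    # Viterbi alignment under Model 1: for each foreign word,
    # pick the English position with the highest t(f|e).
    return [max(range(len(es)), key=lambda i: t[(fs[j], es[i])])
            for j in range(len(fs))]

# e.g., best_alignment_model1("la maison".split(), "the house".split(), t)

Under the HMM model the argmax instead requires the Viterbi dynamic program over positions, and for Models 3-5 exact search is intractable, so tools typically hill-climb from a simpler model's alignment; this is the efficiency issue the next slide raises.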

Page 19: CS 479, section 1: Natural Language Processing

Efficiency of Choosing Best Alignment

Page 20: CS 479, section 1: Natural Language Processing

Evaluating TMs

How do we measure TM quality?

Measure quality of the alignments produced
Metric: AER (Alignment Error Rate)

Page 21: CS 479, section 1: Natural Language Processing

Alignment Error Rate

[Figure: a predicted alignment compared against the actual (gold-standard) alignment]
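
The standard definition, from Och & Ney (2003), compares a predicted alignment A against gold sure links S and possible links P (with S ⊆ P):

AER(A; S, P) = 1 − (|A ∩ S| + |A ∩ P|) / (|A| + |S|)

A minimal sketch in code (the set-of-pairs representation is an assumption, not the lecture's):

def aer(predicted, sure, possible):
    # predicted, sure, possible: sets of (source_index, target_index)
    # link pairs, with sure a subset of possible.
    a, s, p = set(predicted), set(sure), set(possible)
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))

# e.g., aer({(0, 0), (1, 2)}, {(0, 0)}, {(0, 0), (1, 2)})  ->  0.0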

Page 22: CS 479, section 1: Natural Language Processing

AER

Easy to measure

Problems?
Hard to know what the gold alignments should be
May not correlate with translation quality, just as perplexity does not always correlate with speech recognition accuracy for LMs

Page 23: CS 479, section 1: Natural Language Processing

AER Results

[Och & Ney, 2003]; Canadian Hansards data

Page 24: CS 479, section 1: Natural Language Processing

Next

Decoding

Complexity of Decoding

Evaluating Translation