Automating Translation in the Localisation Factory: An Investigation of Post-Editing Effort

Sharon O’Brien, Dublin City University

Post on 24-Dec-2015



Assumptions about MT

T (MT + PE) < T (Trans)

Do we have proof?

Dated studies:

Pan-American Health Organisation

General Motors

European Union

3-4 times faster than translation

But: no details given

More recently: average daily throughput for PE of 5,250 words per day

Krings (2001): only thorough, published empirical data on PE rates

MT + CL

CL: Relatively young field of research/implementation

Consequently: little empirical data

CL improves “translatability”

The notion of translatability is based on so-called "translatability indicators" where the occurrence of such an indicator in the text is considered to have a negative effect on the quality of machine translation. The fewer translatability indicators, the better suited the text is to translation using MT.

(Underwood and Jongejan, 2001: 363)

Can we prove it - empirically?

By using CL rules to eliminate negative “translatability indicators”, post-editing effort of MT output will be lower than for output where negative translatability indicators have not been removed.

Experimental Set-Up

Validity!

Professional, experienced subjects, native speakers (German)

Homogenous backgrounds and level of experience

Familiar text (user guide)

Familiar working environment

Payment for time

However: limited number of subjects

Framework of Analysis

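How do you measure post-editing “effort”?

Temporal

Technical

Cognitive

Two sentence types: “Snti” (sentences containing negative translatability indicators, NTIs) and “Smin-nti” (sentences in which NTIs have been minimised)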

Framework of Analysis

Temporal Effort: How much time, in seconds, did it take to post-edit each sentence?

Technical Effort: How many deletions, insertions, cut & pastes were made for each sentence?

Cognitive Effort: Combined Temporal & Technical

Additional measurement: Choice Network Analysis
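The temporal and technical measures above could be recorded per sentence along the following lines. This is only a minimal sketch: the class name, field names and sample values are hypothetical, and summing the technical actions is a convenience for illustration, not the study's method for combining measures.

```python
from dataclasses import dataclass


@dataclass
class SentenceEffort:
    """Effort measures for one post-edited sentence (names are illustrative)."""
    sentence_id: str
    seconds: float    # temporal effort: time spent post-editing the sentence
    deletions: int    # technical effort: number of deletions
    insertions: int   # technical effort: number of insertions
    cut_pastes: int   # technical effort: number of cut & paste operations

    @property
    def technical_actions(self) -> int:
        # Simple total of logged actions; an assumption made for this sketch.
        return self.deletions + self.insertions + self.cut_pastes


# Invented values, not data from the study.
record = SentenceEffort("s1", seconds=42.0, deletions=3, insertions=5, cut_pastes=0)
print(record.seconds, record.technical_actions)
```

Keeping the counts separate, rather than storing only a total, leaves room to report deletions, insertions and cut & paste activity individually.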

Analysis Tools

IBM Websphere

Translog

Excel

Translog User Interface

Translog Log File

Results: General Temporal Effort

[Chart: median words per minute, Post-Editor vs. Translator]

Temporal Effort: Individual Variation

[Chart: median words per minute for Translator 1, Translator 2 and Translator 3]

[Chart: median words per minute, Fastest Post-Editor vs. Slowest Post-Editor]

Temporal Effort by Sentence Type

Processing Speed: the total number of source words in each segment divided by the total processing time for that segment
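The processing speed definition above can be sketched in a few lines. The function names and the sample segments are hypothetical and invented purely for illustration; the sketch assumes time is measured in seconds, as for temporal effort, so speed comes out in source words per second.

```python
from statistics import median


def processing_speed(source_words: int, seconds: float) -> float:
    """Source words in a segment divided by its total processing time."""
    if seconds <= 0:
        raise ValueError("processing time must be positive")
    return source_words / seconds


def median_speed_by_type(rows):
    """Group segments by sentence type and return the median speed per type."""
    speeds = {}
    for sentence_type, words, seconds in rows:
        speeds.setdefault(sentence_type, []).append(processing_speed(words, seconds))
    return {sentence_type: median(values) for sentence_type, values in speeds.items()}


# Invented sample data, NOT results from the study:
# (sentence type, source words, processing time in seconds)
segments = [
    ("Snti", 12, 60.0),
    ("Snti", 20, 80.0),
    ("Smin-nti", 10, 25.0),
    ("Smin-nti", 15, 30.0),
]

result = median_speed_by_type(segments)
print(result)
```

The median is used here because the slides report medians throughout; it is also less sensitive than the mean to a single unusually slow segment.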

Processing Speed by Sentence Type

[Chart: median processing speed, Snti vs. Smin-nti]

Technical Effort by Sentence Type

[Chart: median deletions, Snti vs. Smin-nti]

[Chart: median insertions, Snti vs. Smin-nti]

Technical Effort: Cut & Paste

Very little activity!

Retyping of entire phrases rather than cutting & pasting

Less effort to re-type?

Need for training?

Cognitive Effort

On average, eliminating NTIs appears to reduce PE effort.

However, Choice Network Analysis (CNA) shows:

More edits to some NTIs than to others

Even though NTIs have been removed from a sentence, this does not guarantee zero post-editing

High PE Effort

Gerund (“ing” form of verb)

Ungrammatical Phrase

Putting an adjective after the noun

Non-finite verb (no tense marked)

Slang

Misspelling

Long Noun Phrase

Ellipsis

Long Sentence (more than 25 words)

Verbs with particles

Use of Footnotes

Multiple Prepositions

Short Segment (fewer than 4 words)

Medium PE Effort

Multiple Coordinators

Problematic Punctuation

Passive Voice

Phrase not syntactically complete

Use of Personal Pronouns

Use of Slash as a separator

Ambiguous coordination

Use of brackets

Proper Nouns

Missing “that” in a relative clause

Low PE Effort

Abbreviations

Demonstrative Pronouns

Missing “in order to”

Contractions (“Let’s”)
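One way to put the three effort groups above to work, for example when deciding which CL rules to apply first, is a simple lookup table. The sketch below is an assumption about how the grouping might be used, not part of the study; the variable and function names are hypothetical, while the NTI labels are taken from the slides.

```python
# NTI labels grouped by the post-editing effort tier reported on the slides.
HIGH = [
    "Gerund", "Ungrammatical phrase", "Adjective after the noun",
    "Non-finite verb", "Slang", "Misspelling", "Long noun phrase",
    "Ellipsis", "Long sentence", "Verbs with particles",
    "Use of footnotes", "Multiple prepositions", "Short segment",
]
MEDIUM = [
    "Multiple coordinators", "Problematic punctuation", "Passive voice",
    "Phrase not syntactically complete", "Use of personal pronouns",
    "Use of slash as a separator", "Ambiguous coordination",
    "Use of brackets", "Proper nouns", "Missing 'that' in a relative clause",
]
LOW = [
    "Abbreviations", "Demonstrative pronouns",
    "Missing 'in order to'", "Contractions",
]

TIER_RANK = {"low": 0, "medium": 1, "high": 2}

# Flatten the three lists into one label -> tier mapping.
EFFORT_TIER = {
    name: tier
    for tier, names in (("high", HIGH), ("medium", MEDIUM), ("low", LOW))
    for name in names
}


def highest_tier(ntis):
    """Return the highest effort tier among the given NTI labels, or None."""
    tiers = [EFFORT_TIER[name] for name in ntis]
    if not tiers:
        return None
    return max(tiers, key=TIER_RANK.__getitem__)


print(highest_tier(["Abbreviations", "Passive voice"]))
```

A sentence with no listed NTIs returns None here, which fits the point made in the conclusions: the absence of known NTIs does not mean zero post-editing, only that this table has nothing to say about the sentence.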

Conclusions

Taking into account that no QA was performed on the final texts:

On average post-editing can be faster than translation

High degree of individual variation

On average, removing NTIs reduces PE Effort

But some NTIs demand more effort than others

Conclusions

Even if all known NTIs are removed, sentences may still require PE effort.

Conclusions

Not all CL rules will have equal impact

Even if CL is applied, PE effort will not be removed completely

Post-editors are still human and still translators…
