Motivations for transfer-based translation • lexical ambiguity • structural differences See further Ingo 91
Dec 17, 2015

Motivations for transfer-based translation

• lexical ambiguity

• structural differences

See further Ingo 91

Example 1

Sv. Fyll på olja i växellådan.
En. Fill gearbox with oil. (from the Scania corpus)

• fyll på → fill
• obj → adv
• adv → obj

Example 2

Sv. I oljefilterhållaren sitter en överströmningsventil.

En. The oil filter retainer has an overflow valve. (from the Scania corpus)

• sitter → has
• adv → subj
• subj → obj

Transfer-based translation

• intermediary sentence structure
• basic processes
– analysis
– transfer
– generation (synthesis)
• language modules
– dictionary and grammar of SL
– transfer dictionary and transfer rules
– dictionary and grammar of TL

[Figure: translation strategies between SL and TL. Labels: Direct translation, Transfer, Interlingua, Multra, Metal]

Levels of intermediary structure

• cf. J&M, Chapter 21

• word order

Metal

• See H&S

MULTRA

Multilingual Support for Translation and Writing

• translation engine
• transfer-based
– shake-and-bake
• modular
• unification-based
• preference machinery
• traceable

Analysis

• chart parser (Lisp → C)
– procedural formalism
  • unification and other kinds of operations
• sentence structure
– feature structure
– grammatical relations
– surface order implicit via grammatical relations

See further Sågvall Hein & Starbäck (99), Weijnitz (02), Dahllöf (89)
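
Unification of feature structures, the core operation named above, can be sketched with nested Python dictionaries. This is only an illustration and not the Multra parser: the function name `unify` and the dictionary encoding are assumptions made for the example.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts.

    Returns the merged structure, or None when two atomic values clash.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # clash somewhere below this feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    # Atomic values (or an atom against a dict) unify only when identical.
    return a if a == b else None


np_def = {"PHR.CAT": "NP", "DEF": "DEF"}
np_sing = {"PHR.CAT": "NP", "NUMB": "SING"}
print(unify(np_def, np_sing))            # compatible: features are merged
print(unify(np_def, {"PHR.CAT": "CL"}))  # clash on PHR.CAT: None
```

Returning None on failure keeps the sketch short; a real implementation would also handle variables and shared substructures.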

Transfer

• unification-based
• declarative formalism
– Multra transfer formalism (Beskow 93)
  • lexical and structural rules
• rules are partially ordered
• a more specific rule takes precedence over a less specific one
– specificity in terms of number of transfer equations
• all applicable rules are applied
• written in Prolog
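
The precedence principle above (more transfer equations means more specific) can be sketched as a simple ordering. This is a hedged illustration in Python, not the Prolog implementation: the class `TransferRule`, the rule `tänka-think` and the abbreviated equation lists are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferRule:
    label: str
    equations: tuple  # transfer equations, kept as plain strings here

    @property
    def specificity(self):
        # Assumption: specificity is simply the number of equations.
        return len(self.equations)


def order_by_specificity(rules):
    """Most specific rule first; ties keep their original order."""
    return sorted(rules, key=lambda r: r.specificity, reverse=True)


general = TransferRule("tänka-think", ("<* verb lex sym> = tänka.vb.1",))
contextual = TransferRule(
    "tänka.på-think.about",
    ("<* verb lex sym> = tänka.vb.1", "<* obj.prep prep lex sym> = på.pp.1"),
)
for rule in order_by_specificity([general, contextual]):
    print(rule.label, rule.specificity)
```

A total sort is a simplification: the slides describe the rules as only partially ordered.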

Generation

• syntactic generation
– Multra syntactic generation formalism (Beskow 97a)
– PATR-like style
  • unification
  • concatenation
  • typed features
• morphological generation (Beskow 97b)
– lexical insertion rules
– morphological realisation and phonological finish in Prolog
• written in Prolog

An example: Tippa hytten.

Tippa hytten. :

(* = (PHR.CAT = CL
      MODE = IMP
      SUBJ = 2ND
      VERB = (WORD.CAT = VERB
              INFF = IMP
              DIAT = ACT
              LEX = TIPPA.VB.1
              VSURF = +)
      OBJ.DIR = (PHR.CAT = NP
                 NUMB = SING
                 GENDER = UTR
                 CASE = BASIC
                 DEF = DEF
                 HEAD = (LEX = HYTT.NN.1
                         WORD.CAT = NOUN))
      REG = (V1.LEM = TIPPA.VB)
      SEP = (WORD.CAT = SEP
             LEX = STOP.SR.0)))

Transfer structure

[VERB : [WORD.CAT : VERB
         LEX : TILT.VB.0
         DIAT : ACT
         INFF : IMP]
 OBJ.DIR : [PHR.CAT : NP
            DEF : DEF
            NUMB : SING
            HEAD : [WORD.CAT : NOUN
                    LEX : CAB.NN.0]]
 MODE : IMP
 SUBJ : 2ND
 VSURF : +
 SEP : [WORD.CAT : SEP
        LEX : STOP.SR.0]
 PHR.CAT : CL]

Generation

Tilt the cab.  

A grammar rule

defrule legal.obj {
  <?1 phr.cat> = 'np,
  not <?1 case> = 'gen,
  not <?1 case> = 'subj
}

Transfer rules

• copy feature

• delete feature

• transfer feature

• assign feature

Copy feature

LABEL mode
SOURCE <* mode> = ?x1
TARGET <* mode> = ?x2
TRANSFER

Delete feature

LABEL REG
SOURCE <* REG> = ANY
TARGET <*> = <*>
TRANSFER

Transfer feature

LABEL OBJ.DIR
SOURCE <* OBJ.DIR> = ?x1
TARGET <* OBJ.DIR> = ?x2
TRANSFER ?x1 <=> ?x2

Define feature

LABEL trycka.in-press
SOURCE <* lex sym>=trycka.vb+in.ab.1
       <* word.cat>=VERB
TARGET <* lex>=press.vb.1
       <* word.cat>=VERB
TRANSFER
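
The effect of the rule kinds shown so far (copy, delete, recursive transfer, and assignment of a new lexical value) can be imitated procedurally on the "Tippa hytten" structure. This is a rough Python sketch under stated assumptions: the real formalism is declarative and unification-based, and the function `transfer` and the dictionary `lexicon` are hypothetical names. The lexical pairs are taken from the example structures earlier in the slides.

```python
def transfer(source, lexicon):
    """Build a target structure from a source structure (toy version)."""
    target = {}
    for feature, value in source.items():
        if feature == "REG":
            continue  # delete: REG has no counterpart in the target
        if isinstance(value, dict):
            target[feature] = transfer(value, lexicon)  # transfer: recurse
        elif feature == "LEX":
            target[feature] = lexicon[value]  # assign a new lexical value
        else:
            target[feature] = value  # copy: value carried over unchanged
    return target


source = {
    "PHR.CAT": "CL",
    "MODE": "IMP",
    "VERB": {"WORD.CAT": "VERB", "LEX": "TIPPA.VB.1"},
    "OBJ.DIR": {
        "PHR.CAT": "NP",
        "DEF": "DEF",
        "HEAD": {"WORD.CAT": "NOUN", "LEX": "HYTT.NN.1"},
    },
    "REG": {"V1.LEM": "TIPPA.VB"},
}
lexicon = {"TIPPA.VB.1": "TILT.VB.0", "HYTT.NN.1": "CAB.NN.0"}
print(transfer(source, lexicon))
```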

 

A generation rule

LABEL CL.IMP
X1 ---> X2 X3 X4 :
  <X1 PHR.CAT> = CL
  <X1 VERB> = <X2>
  <X1 TYPE> = IMP
  <X1 OBJ.DIR> = <X3>
  <X1 SEP> = <X4>
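
What a rule like CL.IMP achieves, namely linearising the clause as verb, direct object, separator, can be sketched in a few lines of Python. The table `SURFACE` and both function names are hypothetical; morphological generation is replaced here by a lookup, and the definite article is inserted by a hard-coded check.

```python
# Hypothetical surface forms; the real system uses morphological generation.
SURFACE = {"TILT.VB.0": "tilt", "CAB.NN.0": "cab", "STOP.SR.0": "."}


def realise_np(np):
    """Definite NPs get 'the' in front of the head noun."""
    words = ["the"] if np.get("DEF") == "DEF" else []
    return words + [SURFACE[np["HEAD"]["LEX"]]]


def generate_imperative(clause):
    """Order the daughters as VERB OBJ.DIR SEP and join them into a string."""
    verb = [SURFACE[clause["VERB"]["LEX"]]]
    obj = realise_np(clause["OBJ.DIR"])
    sep = SURFACE[clause["SEP"]["LEX"]]
    text = " ".join(verb + obj) + sep
    return text[0].upper() + text[1:]


clause = {
    "PHR.CAT": "CL",
    "MODE": "IMP",
    "VERB": {"WORD.CAT": "VERB", "LEX": "TILT.VB.0"},
    "OBJ.DIR": {
        "PHR.CAT": "NP",
        "DEF": "DEF",
        "HEAD": {"WORD.CAT": "NOUN", "LEX": "CAB.NN.0"},
    },
    "SEP": {"WORD.CAT": "SEP", "LEX": "STOP.SR.0"},
}
print(generate_imperative(clause))  # Tilt the cab.
```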

A contextual lexical rule

LABEL tänka.på-think.about
SOURCE <* verb lex sym> = tänka.vb.1
       <* obj.prep phr.cat> = pp
       <* obj.prep prep> = ?prep
       <* obj.prep prep lex sym> = på.pp.1
       <* obj.prep rect> = ?rect1
TARGET <* obj.prep phr.cat> = pp
       <* obj.prep prep word.cat> = PREP
       <* obj.prep prep lex> = about.pp.1
       <* obj.prep rect> = ?rect2
TRANSFER ?rect1 <=> ?rect2

 

A generation trace

1- Applying Rule cl-sep
1- Applying Rule cl.imp
1- Applying Rule subj2nd-verb-obj.dir
1- Applying Rule verb.main.act
1- Applying Rule np.the-df
1- Applying Rule ng.noun-def
1- Success!

Language resources in the MATS system

• dictionary in a database with different views

• analysis grammar

• transfer grammar
– incl. contextually defined lexical rules

• generation grammar

sv-en_LinkLexicon

en-Inflections

en_LemmaLexicon

en_LexemeLexicon

en_Lexicon

en_StemLexicon

sv_Inflections

sv_LemmaLexicon

sv_LexemeLexicon

sv_Lexicon

sv_StemLexicon

The MATS system

Frozen demo…

Assignment 2: Working with MATS

http://stp.ling.uu.se/~evapet/mt04/assignment2.html

Lexicalistic translation

• Identify (lexical) translation units in the source sentence

• Translate each unit separately (considering the context)

• Order the result in agreement with a model of the target language

Formulation due to Lars Ahrenberg; see further AH (reading list); see also Beaven, John L., Shake-and-Bake Machine Translation. COLING-92, Nantes, 23-28 August 1992.
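
The three steps above can be sketched in Python as follows. Everything here is a toy assumption: the unit lexicon, the bigram scores standing in for "a model of the target language", and the function names. Context-sensitive translation of units (step 2) is left out to keep the sketch short.

```python
from itertools import permutations

UNIT_LEXICON = {"tippa": "tilt", "hytten": "the cab"}  # hypothetical entries
BIGRAM_SCORE = {  # hypothetical target-language model
    ("<s>", "tilt"): 2,
    ("tilt", "the"): 2,
    ("the", "cab"): 2,
    ("cab", "</s>"): 2,
}


def score(words):
    """Sum the bigram scores over the padded word sequence."""
    padded = ["<s>"] + words + ["</s>"]
    return sum(BIGRAM_SCORE.get(pair, 0) for pair in zip(padded, padded[1:]))


def translate(sentence):
    units = sentence.lower().rstrip(".").split()    # 1. identify units
    translated = [UNIT_LEXICON[u] for u in units]   # 2. translate each unit
    candidates = (" ".join(p).split() for p in permutations(translated))
    return " ".join(max(candidates, key=score))     # 3. order by the model


print(translate("Tippa hytten."))
```

Trying every permutation is only feasible for very short sentences; it is used here because it makes the ordering step explicit.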

T4F – a lexicalistic system

• processes in T4F
– tokenisation
– tagging
– transfer
– transposition
– filtering

See further AH (in the reading list)

Interlingua translation

• See SN

Applications of alignment

• translation memories

• translation dictionaries

• lexicalistic translation

• statistical machine translation

• example-based translation

Translation memories

• based on sentence links

• optionally, sub-sentence links

See further Macklovitch, E. (2000)
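
A translation memory built on sentence links can be sketched as a lookup with fuzzy matching. This is a minimal Python illustration, not a description of any particular product: the names `MEMORY` and `lookup` and the 0.8 threshold are assumptions, and the stored pairs are the example sentences from these slides.

```python
from difflib import SequenceMatcher

# Sentence links: source sentence -> stored translation.
MEMORY = {
    "Fyll på olja i växellådan.": "Fill gearbox with oil.",
    "Tippa hytten.": "Tilt the cab.",
}


def lookup(sentence, threshold=0.8):
    """Return (translation, similarity) for the closest stored sentence.

    The translation is None when no stored sentence is similar enough.
    """
    best_source, best_ratio = None, 0.0
    for source in MEMORY:
        ratio = SequenceMatcher(None, sentence, source).ratio()
        if ratio > best_ratio:
            best_source, best_ratio = source, ratio
    if best_ratio >= threshold:
        return MEMORY[best_source], best_ratio
    return None, best_ratio


print(lookup("Tippa hytten."))  # exact match
print(lookup("Tippa hytten"))   # fuzzy match, full stop missing
```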

Translation dictionaries

• based on word links

• refinement of word links

Refinement of word alignment data

• neutralise capital letters where appropriate
• lemmatise or tag source and target units
• identify ambiguities
– search for criteria to resolve them
• identify partial links
– compounds?
– remove or complete them
• manual revision?
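
Two of the refinement steps above, neutralising capitals and identifying ambiguities, can be sketched in Python. The function `refine_links` and the sample links are hypothetical, and lowercasing everything is a simplification of "where appropriate".

```python
from collections import Counter, defaultdict


def refine_links(links):
    """Lowercase word links, count them, and report ambiguous source units."""
    counts = Counter((s.lower(), t.lower()) for s, t in links)
    targets = defaultdict(set)
    for source, target in counts:
        targets[source].add(target)
    ambiguous = {s: sorted(ts) for s, ts in targets.items() if len(ts) > 1}
    return counts, ambiguous


links = [
    ("Motorn", "the motor"),
    ("motorn", "the engine"),
    ("motorn", "the engine"),
    ("hytten", "the cab"),
]
counts, ambiguous = refine_links(links)
print(counts)
print(ambiguous)
```

The counts double as the link frequencies needed later for ranking alternatives.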

Informally about statistical MT

• build a translation dictionary based on word alignment
• aim for as big fragments as possible
• keep information on link frequency
• build an n-gram model of the target language
• implement a direct translation strategy
– including alternatives ordered by length and frequency
• run the output through the n-gram model to select the best alternatives, and adjust the translation accordingly
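
The recipe above can be sketched in Python with a frequency-annotated dictionary and a bigram model. All data and names here (`DICTIONARY`, `BIGRAMS`, `best_translation`, and the word "startar") are invented for illustration; real systems estimate probabilities from large corpora rather than using raw counts.

```python
from itertools import product

# Hypothetical link frequencies from word alignment.
DICTIONARY = {
    "motorn": {"the motor": 3, "the engine": 5},
    "startar": {"starts": 4},
}
# Hypothetical bigram counts for the target language.
BIGRAMS = {
    ("the", "engine"): 6,
    ("engine", "starts"): 5,
    ("the", "motor"): 4,
    ("motor", "starts"): 1,
}


def lm_score(words):
    """Score a word sequence by summing its bigram counts."""
    return sum(BIGRAMS.get(pair, 0) for pair in zip(words, words[1:]))


def best_translation(source_words):
    # Direct translation: alternatives per word, most frequent link first.
    alternatives = [
        sorted(DICTIONARY[w], key=DICTIONARY[w].get, reverse=True)
        for w in source_words
    ]
    candidates = [" ".join(choice).split() for choice in product(*alternatives)]
    # The n-gram model picks the best combination of alternatives.
    return " ".join(max(candidates, key=lm_score))


print(best_translation(["motorn", "startar"]))
```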

Example-based MT

HS (in the reading list)

Some current research topics

• intersentential dependencies
• hybrid systems: data-driven and rule-driven
• improved alignment techniques
• improved language modeling in ST
• automatic learning from post-editing
• translation by structural correspondences
• translation of spoken language
• improved preference strategies
• ambiguity preserving translation

Intersentential dependencies

• pronoun resolution

• lexical ambiguity resolution, such as
– (torkar)motorn → the motor
– (förbrännings)motorn → the engine

• fluency
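
The "motorn" case above can be sketched as a look-back through the preceding discourse for a disambiguating compound. This is a hypothetical Python heuristic for illustration only; the table `COMPOUND_SENSES` and the function name are assumptions, not part of any system described here.

```python
# Hypothetical table: compounds seen earlier fix the reading of bare "motorn".
COMPOUND_SENSES = {
    "torkarmotorn": "the motor",
    "förbränningsmotorn": "the engine",
}


def translate_motorn(previous_words):
    """Pick a translation for 'motorn' from the most recent known compound."""
    for word in reversed(previous_words):
        sense = COMPOUND_SENSES.get(word.lower())
        if sense:
            return sense
    return None  # no antecedent found; the ambiguity stays unresolved


context = "Torkarmotorn M2 är sammankopplad med omkopplare S24".split()
print(translate_motorn(context))
```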

Preserving the information structure

• information structure is expressed in different ways in the source and the target

• syntactic clues are exploited in the analysis to compute the information structure (topic-focus articulation)

• information structure is used to guide the generation

An example

Torkarmotorn M2 är sammankopplad med omkopplare S24 och intervallrelä R22. För att inte motorn skall överbelastas, t.ex. om torkarbladen fastnat, finns en inbyggd termovakt som bryter strömmen till motorn när …

Wiper motor M2 is connected to switch S24 and intermittent relay R22. To prevent motor overload, e.g. if the wiper blade gets stuck, there is an integral thermal sensor which breaks the current to the motor when …

Preferences

• syntactic preferences
– the principle of right association
– the principle of minimal attachment
– two-stage processing
• semantic preferences
– lexical selectional restrictions
– lexical contextual rules
– conceptual taxonomies
– likelihood of occurrence

See further Bennett, P. & Paggio, P., 1993, Preference in Eurotra.

Preferences in Multra

• parsing
– a formalism for expressing syntactic preferences in the parse
  • not fully developed
• transfer
– contextual lexical rules
– rule specificity
• generation
– rule specificity

Hybrid systems

• aims

• components

• problems

• architecture

• scores

Aims of a hybrid system

• simple techniques for simple tasks

• complex techniques for complex tasks

Components of a hybrid system

• component strategies
– translation memory
  • full sentences
  • fragments
• direct translation
– statistical translation
– EBMT

Component strategies, cont’d

• rule-based translation
– simplistic analysis (cf. direct translation)
  • word by word (S → sequence of words)
  • phrase by phrase (S → sequence of phrases)
– partial parsing
– full parsing

Problems of a hybrid system

• how does the system know when a simple technique is appropriate?
– does the source tell?
– does the target tell?

Architecture and scores

• simple first?

• concerting results?

• scoring?

Improved techniques for re-use of translation

• combining clues for word alignment (Tiedemann 2003)

• interactive word alignment (Ahrenberg et al. 2003)

• parallel treebanks

Translation by structural correspondences

• LFG

• HPSG

Translation of spoken language

See

Krauwer, Steven (ed.), 2000, Machine Translation, June 2000. Volume 15, Issue 1-2, Special issue on Spoken Language Translation.