[1]. Handling Structural Divergences and Recovering Dropped Arguments in a Korean/English Machine Translation System [2]. Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns. 2004 Fall. Presented by Yeongmi Jeon
[1].Handling Structural Divergences and Recovering Dropped Arguments

in a Korean/English Machine Translation System

[2].Learning to express motion events

in English and Korean: The influence of language-specific lexicalization patterns

2004 Fall

Presented by Yeongmi Jeon

Handling Structural Divergences and Recovering Dropped Arguments

in a Korean/English Machine Translation System

Chung-hye Han, Martha Palmer

(IRCS/CIS, UPenn)

Benoit Lavoie, Richard Kittredge,

Tanya Korelsky, Myunghee Kim

(CoGenTex, Inc.)

Owen Rambow

(ATT Labs-Research)

Nari Kim

(Konan Technology, Inc.)

AMTA ’2000

Oct. 12 - 14, 2000

Outline of the Talk

• Linguistic issues

• System overview

• Deep Syntactic Structure (DSyntS)

• Parser output conversion

• Handling structural divergences: Transfer

• Dropped argument recovery

Linguistic Issues in Korean/English MT-1

Word Order

SOURCE: chuka kongkwupmul-eul 103 ceonwiciweontaetae-eke saryeongpu-ka cueossta

GLOSS: additional supply-Acc 103rd forward support battalion-Dat headquarters-Nom gave

TARGET: Headquarters gave 103rd forward support battalion additional supplies.

OUTPUT: Headquarters gave an additional supply to a 103rd forward support battalion.

Linguistic Issues in Korean/English MT-2

Dropped arguments and Morphology

SOURCE: IBP hwail-eul keomsaekhaci moshaess-tamyeon cikeum tasi ponaekessta.

GLOSS: IBP file-Acc retrieve could_not-if now again will_send

TARGET: If (NP1) could not retrieve IBP file, (NP2) will send again (NP3) now.

OUTPUT: If one can not retrieve an IBP file, one will send it again now.

Overview of the System

Deep Syntactic Structure-1

• Dependency structure based on Meaning-Text Theory (Mel'čuk 1988).

• Nodes are labeled by lexemes.

• Directed arcs with dependency relation labels: I, II, III, ATTR.

Critical to the success of translation!!!

• Grammatical information is represented as features on the node labels.

• Well suited to MT:

Abstracts away from superficial grammatical differences between languages, such as linear order and the usage of function words.

DSyntS-2: example ‘John often eats beans.’
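The slide's figure is not preserved in this transcript. As a rough sketch of the structure described on the previous slide (lexeme-labeled nodes, arcs labeled I, II, III, ATTR, grammatical information as node features), the example sentence could be encoded as follows. The `Node` class, the `render` helper, and the feature names (`class`, `tense`, `number`) are assumptions made for illustration, not the system's actual notation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One DSyntS node: a lexeme plus grammatical features (hypothetical layout)."""
    lexeme: str
    features: dict = field(default_factory=dict)
    deps: list = field(default_factory=list)  # list of (relation, Node) pairs

    def add(self, relation, child):
        """Attach a dependent under the given relation label (I, II, III, ATTR)."""
        self.deps.append((relation, child))
        return self


def render(node, relation=None, depth=0):
    """Return the tree as indented text lines, one node per line."""
    feats = " ".join(f"{k}:{v}" for k, v in sorted(node.features.items()))
    prefix = f"{relation} " if relation else ""
    lines = ["  " * depth + f"{prefix}{node.lexeme} [{feats}]"]
    for rel, child in node.deps:
        lines.extend(render(child, rel, depth + 1))
    return lines


# 'John often eats beans.' Linear order and inflection are abstracted away:
# the plural and the tense live in features, not in the lexemes.
eat = Node("eat", {"class": "verb", "tense": "pres"})
eat.add("I", Node("John", {"class": "proper_noun"}))
eat.add("II", Node("bean", {"class": "common_noun", "number": "pl"}))
eat.add("ATTR", Node("often", {"class": "adverb"}))

print("\n".join(render(eat)))
```

Note that nothing in the structure records that "often" precedes the verb; word order is left to the realizer.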

Predicate-Argument Lexicon-1: English

• Subcategorization information for verbs and adjectives.

Critical for recovery of dropped arguments!!!

Predicate-Argument Lexicon-2: Korean

• Arguments are listed with case or adverbial postpositions.

-case postpositions: nominative, accusative.

-adverbial postpositions: {e-Ke}(‘to’), {Ro} (‘to’), {e-Seo} (‘from’).

Critical for conversion!!!

Conversion-1

Generic dependency structure (Yoon et al. 1997) → MTT-based DSyntS

-STEP 1: Rewriting feature labels.

Conversion-2

-STEP 2: Making dependency relationships more explicit.

Korean predicate-argument lexicon is used as a guide.
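A minimal sketch of how a lexicon lookup could drive STEP 2, using the word-order example from the earlier slide. The entry format, the verb key `cu-ta`, the marker names `nom`/`acc`, and the fallback to ATTR for unlisted markers are all assumptions for illustration; the slides do not show the real lexicon format.

```python
# Hypothetical Korean predicate-argument lexicon: for each predicate, the
# postposition on a dependent determines its DSyntS relation label.
KOREAN_LEXICON = {
    "cu-ta": {"nom": "I", "acc": "II", "e-Ke": "III"},  # 'give' (assumed entry)
}


def label_dependents(verb, dependents, lexicon=KOREAN_LEXICON):
    """Relabel (lexeme, postposition) pairs with explicit relation labels.

    Dependents whose marker is not listed for the verb fall back to ATTR,
    which is an assumption of this sketch.
    """
    frame = lexicon.get(verb, {})
    return [(frame.get(marker, "ATTR"), lexeme) for lexeme, marker in dependents]


# Dependents in their surface (scrambled) order from the SOURCE sentence.
parsed = [
    ("kongkwupmul", "acc"),
    ("ceonwiciweontaetae", "e-Ke"),
    ("saryeongpu", "nom"),
]
labeled = label_dependents("cu-ta", parsed)
print(labeled)
```

Because the labels come from the postpositions rather than from position, the scrambled source order no longer matters once this step has run.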

Conversion-3

-STEP 3: Promoting features to lexemes and vice versa.

Conversion-4: from Korean Parser Output to DSyntS

Transfer-1

• Based on DSyntS grammars that are independently motivated by source and target languages.

• Transfer rules relate DSyntS subtrees.

• Map source DSyntS subtrees to target DSyntS subtrees.

• Use of variables allows generalization of rule application.

• Features on DSyntS nodes constrain rule application.

Transfer-2

• Simplest case: The related subtrees are reduced to a single node.

• Structural divergence is represented in the transfer lexicon by including contextual information in the related subtrees.
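The two cases can be sketched roughly as follows. The single-node case is a plain lexeme lookup; the divergence case shown here is the predicative adjective, which is a predicate on its own in Korean but needs a copula in English. The tree encoding, the rule shape, and the lexeme pairs (`copta`/`narrow`, `kil`/`road`) are illustrative assumptions, not the system's actual transfer lexicon. The already-transferred child subtrees play the role of rule variables.

```python
# Hypothetical single-node transfer lexicon (Korean lexeme -> English lexeme).
LEXICON = {"copta": "narrow", "kil": "road"}


def node(lex, feats=None, deps=None):
    """Build a DSyntS node as a plain dict (assumed encoding)."""
    return {"lex": lex, "feats": feats or {}, "deps": deps or []}


def transfer(src):
    """Map a source DSyntS to a target DSyntS, bottom-up."""
    deps = [(rel, transfer(child)) for rel, child in src["deps"]]
    target_lex = LEXICON.get(src["lex"], src["lex"])

    is_adjective = src["feats"].get("class") == "adjective"
    has_subject = any(rel == "I" for rel, _ in deps)
    if is_adjective and has_subject:
        # Divergence: the adjective heads the clause in the source, but the
        # target needs 'be' as head, with the adjective as its actant II.
        subject = [(rel, c) for rel, c in deps if rel == "I"]
        others = [(rel, c) for rel, c in deps if rel != "I"]
        adjective = node(target_lex, {"class": "adjective"}, others)
        copula_feats = {**src["feats"], "class": "verb"}  # tense etc. move to 'be'
        return node("be", copula_feats, subject + [("II", adjective)])

    # Simplest case: one node maps to one node.
    return node(target_lex, dict(src["feats"]), deps)


source = node(
    "copta",
    {"class": "adjective", "tense": "pres"},
    [("I", node("kil", {"class": "common_noun"}))],
)
target = transfer(source)
print(target["lex"], [(rel, c["lex"]) for rel, c in target["deps"]])
```

The feature test on the source node is what restricts the divergence rule to predicative uses; an attributive adjective with no actant I falls through to the single-node case.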

Transfer-3: Multi-word

Transfer of predicative adjectives

Transfer-4: from Inflection to a Lexeme

Transfer-5: More Complex Example

Korean complex NP whose head noun is lexicalized as an auxiliary noun {Keos} in the context of a copular English to-infinitive.

Transfer-6: from Korean DSyntS to English DSyntS

Argument Recovery-1

• Dropped arguments must be recovered in order to obtain grammatical English sentences.

• Add default pronouns for missing arguments using grammatical and lexical knowledge.

- English predicate-argument lexicon is critical.

• This is performed just before English realization, by preprocessing the English DSyntS obtained from transfer.

Argument Recovery-2: Rules

• Insertion of Missing Actant I:

• Determining whether pronouns are animate or not:

Argument Recovery-3: Before

‘If (NP1) could not retrieve IBP file, (NP2) will send (NP3) again now.’

Argument Recovery-4: After

‘If one cannot retrieve an IBP file, one will send it again now.’
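The rule figures are not preserved in this transcript, so the following is only a sketch of the idea behind the before/after pair: walk the English DSyntS, look each predicate up in the English predicate-argument lexicon, and add a default pronoun for every required actant that is missing. The lexicon format, the animacy values, and the choice of "one" for animate and "it" for inanimate slots are assumptions inferred from the example output.

```python
# Hypothetical English predicate-argument lexicon: required actants and the
# animacy expected of each (assumed format).
ENGLISH_LEXICON = {
    "retrieve": {"I": "animate", "II": "inanimate"},
    "send": {"I": "animate", "II": "inanimate"},
}

# Default pronouns, inferred from the example output on this slide.
DEFAULT_PRONOUN = {"animate": "one", "inanimate": "it"}


def recover_arguments(tree, lexicon=ENGLISH_LEXICON):
    """Insert default pronouns for missing actants, in place, throughout the tree."""
    present = {rel for rel, _ in tree["deps"]}
    for rel, animacy in lexicon.get(tree["lex"], {}).items():
        if rel not in present:
            pronoun = {
                "lex": DEFAULT_PRONOUN[animacy],
                "feats": {"class": "pronoun"},
                "deps": [],
            }
            tree["deps"].append((rel, pronoun))
    for _, child in tree["deps"]:
        recover_arguments(child, lexicon)
    return tree


# 'If (NP1) could not retrieve IBP file, (NP2) will send (NP3) again now.'
# The conditional clause is simplified to an ATTR dependent of 'send'.
retrieve = {
    "lex": "retrieve",
    "feats": {},
    "deps": [("II", {"lex": "file", "feats": {}, "deps": []})],
}
send = {"lex": "send", "feats": {}, "deps": [("ATTR", retrieve)]}

recover_arguments(send)
print([(rel, c["lex"]) for rel, c in send["deps"]])
print([(rel, c["lex"]) for rel, c in retrieve["deps"]])
```

A default pronoun is only a fallback: it yields a grammatical sentence but does not resolve who or what the dropped argument refers to, which is why the conclusion slide lists a discourse model as future work.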

Conclusion and Future Work

• Transfer based on predicate argument structures of each language.

Allows us to use off-the-shelf parsers.

• The development of a TreeBank for a Korean-English parallel corpus.

• Use syntactically annotated corpus for automatic extraction of transfer rules.

• Explicit annotation of empty arguments as well as the incorporation of a discourse model for a more principled recovery of implicit arguments.

Current Status

• Parallel corpus: military language training manual

-50,000 word tokens, 3800 word types, 5000 sentences.

• Predicate-argument lexicon

-1000 entries.

• Transfer lexicon

-4000 entries.

• Grammatical analysis

-simple clause (declaratives, imperatives, interrogatives),

-complex clause (subordination, coordination),

-scrambling, empty argument, adjective phrase,

-noun phrase (compound nouns, NP modifiers, relative clauses, complex noun phrases),

-verb phrase (auxiliary verbs, light verbs, compound verbs),

-negation, copular sentence, adverb modification, etc.

Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns

Soonja Choi and Melissa Bowerman

Outline of the Talk

• Introduction

• Semantic components of a motion event

• English:

-Conflation of Motion with Manner or Cause

• Korean: Mixed conflation pattern

-Spontaneous motion

-Caused motion

Introduction-1

• Encoding of motion events

-provides core structuring principles for many meanings

-differs across languages

• Language acquisition

-two sources: nonlinguistic knowledge and the semantic organization of the language

-the question is how the two interact as a language is acquired

Introduction-2

• 4 basic components of a (dynamic) motion event

-Motion, Figure, Ground, Path

• Additional components

-Manner, Cause, Deixis

• Fundamental typological differences [Talmy] in how a motion event is expressed

-3 patterns

1> [Motion + [Manner|Cause]] - [Path]

2> [Motion + Path] - [Manner|Cause]

3> [Motion + Figure] - [Path] - [Manner|Cause]

English

• Usual pattern = [Motion + [Manner | Cause]] - [Path]

[Motion + Manner]

The rock SLID/ROLLED/BOUNCED down ( the hill )

[Motion + Cause]

The wind BLEW the napkin off the table

[Motion + Deixis]: (towards vs. away from the speaker)

John CAME/WENT into the room

• The same verb conflations in both intransitive and transitive sentences

• Path is marked in the same way in both intransitive and transitive sentences

Korean-1: Basic

• Different encoding patterns for transitive and intransitive verbs

• Path markers are also verbs: No dedicated system of morphemes

- <cf.> prepositions or particles in English

- 3 locative case endings are suffixed to a Ground nominal and function like prepositions:

-EY “at, to”, -LO “toward”, -EYSE “from”

• Basic word order: subject-object-verb

• Verb phrase: one or more “full” verbs

- The final verb bears all the inflectional suffixes

- Compound verb: verbs connected by “connecting” suffixes

Korean-2: Spontaneous motion

• Main verb: usually KATA “go” or OTA “come”

• Pattern = [Manner] - [Path] - [Motion+Deixis]

• Path verbs

- Do not express posture changes

<cf.> up, down in English for changes of both location and posture

• Posture changes are expressed with monomorphemic verbs

- ANCTA “sit down”, NWUPTA “lie down”

- [Path]-[posture verb]: serialized events

OLLA ANCTA “get onto a higher surface and sit down”
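To make the contrast in slot order concrete, here is a small sketch that lays out one spontaneous motion event under the English pattern [Motion + Manner] - [Path] and the Korean pattern [Manner] - [Path] - [Motion+Deixis]. Only KATA and OTA are taken from the slides; the `MotionEvent` class, the slot names, and the use of English glosses as placeholder values for Manner and Path are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Deictic main verbs named on the slide: KATA 'go', OTA 'come'.
DEICTIC_VERB = {"away": "KATA", "toward": "OTA"}


@dataclass
class MotionEvent:
    """Semantic components of a spontaneous motion event (hypothetical layout)."""
    path: str                     # e.g. "into", given here as an English gloss
    deixis: str                   # "toward" or "away" from the speaker
    manner: Optional[str] = None  # e.g. "run"; optional


def korean_slots(event):
    """[Manner] - [Path] - [Motion+Deixis]: the deictic verb comes last."""
    slots = []
    if event.manner:
        slots.append(("Manner", event.manner))
    slots.append(("Path", event.path))
    slots.append(("Motion+Deixis", DEICTIC_VERB[event.deixis]))
    return slots


def english_slots(event):
    """[Motion + Manner] - [Path]; without Manner, the verb conflates Deixis."""
    if event.manner:
        verb = ("Motion+Manner", event.manner)
    else:
        verb = ("Motion+Deixis", "COME" if event.deixis == "toward" else "GO")
    return [verb, ("Path", event.path)]


event = MotionEvent(path="into", deixis="toward", manner="run")
print(korean_slots(event))
print(english_slots(event))
```

In the English layout, Deixis is simply not expressed once a Manner verb takes the verb slot, whereas the Korean layout keeps all three components in separate verbs.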

Korean-2: Spontaneous motion verb-1

Korean-2: Spontaneous motion verb-2

Korean-3: Caused Motion Verbs-1

Korean-3: Caused Motion Verbs-2

Korean-3: Caused motion-1

• Pattern = [Motion+Path]

• Path

- Different forms - Different meanings: Require finer distinctions among actions

<ex.> KKITA/PPAYTA Path category

“putting in/on/together”

resulting in a fitting relationship = KKITA

loose = NEHTA

surface contact = NOHTA, PWUTHITA

- Incorporate aspects of Figure and Ground also

: different verbs for different Figures or Grounds

Korean-3: Caused motion-2

• Deixis

- No deictic transitive verb

<cf.> take, bring in English; KATA, OTA in Korean (intransitive)

- Special encoding

take = KACY-E "have" - KATA "go"

bring = KACY-E "have" - OTA "come"

Korean-3: Caused motion-3

• [Manner|Cause]-[Path]

- Possible but less frequent than in English

- Reason = Different restrictions on obligatory information

English: Better to spell out Path completely

John threw his keys TO his desk ( x )

John threw his keys ONTO his desk ( o )

Korean: Path can often be omitted

- if Manner or Cause is supplied

- if the relationship between Figure and Ground can be easily inferred: locative case endings are sufficient

Conclusion

English

- The same verb conflation patterns in both spontaneous and caused motion expressions

- Encodes Path separately, with the same markers for both kinds of motion

Korean

- Different lexicalization patterns for spontaneous and caused motion

- Path markers (verbs) differ between the two kinds of motion and have narrower usage ranges