Drop me a mail: [email protected] Visit me at: http://rushdishams.googlepages.com Rushdi Shams, Lecturer, Dept of CSE, KUET, Bangladesh
Page 1: Types of machine translation


Page 2: Types of machine translation

Translation Approach

The translation process may be stated as:

1. Decoding the meaning of the source text
2. Re-encoding this meaning in the target language

Machine translation can use a method based on linguistic rules: words are translated in a linguistic way, so that the most suitable words of the target language replace the ones in the source language.


Page 3: Types of machine translation

Translation Approach

The success of machine translation requires the problem of natural language understanding to be solved first.

Generally, rule-based methods parse a text, usually creating an intermediary, symbolic representation, from which the text in the target language is generated.


Page 4: Types of machine translation

Translation Approach

According to the nature of the intermediary representation, an approach is described as interlingual machine translation or transfer-based machine translation.

These methods require extensive lexicons with morphological, syntactic, and semantic information, and large sets of rules.


Page 5: Types of machine translation

Translation Approach

Machine translation programs often work well enough for a native speaker of one language to get the approximate meaning of what was written by a native speaker of another language.


Page 6: Types of machine translation

Translation Approach

The large multilingual corpus of data needed for statistical methods to work is not necessary for the grammar-based methods.

But then, the grammar-based methods need a skilled linguist to carefully design the grammar that they use.


Page 7: Types of machine translation

Types of Machine Translation

[Figure: diagram of machine translation approaches between Source (Arabic) and Target (English). Labels: Syntactic Parsing, Semantic Analysis, Interlingua, Sentence Planning, Text Generation, Transfer Rules, Direct: SMT, EBMT.]

Page 8: Types of machine translation

Rule based MT

The rule-based machine translation paradigm includes:

1. transfer-based machine translation
2. interlingual machine translation
3. dictionary-based machine translation


Page 9: Types of machine translation


Page 10: Types of machine translation

Transfer based MT

It is necessary to have an intermediate representation that captures the "meaning" of the original sentence in order to generate the correct translation.

In interlingua-based MT this intermediate representation must be independent of the languages in question, whereas in transfer-based MT it has some dependence on the language pair involved.


Page 11: Types of machine translation

Transfer based MT

The original text is first analyzed morphologically and syntactically in order to obtain a syntactic representation.

This representation can then be refined to a more abstract level, putting emphasis on the parts relevant for translation and ignoring other types of information.


Page 12: Types of machine translation

Transfer based MT

The transfer process then converts this final representation (still in the original language) to a representation of the same level of abstraction in the target language.

These two representations are referred to as "intermediate" representations.

From the target language representation, the stages are then applied in reverse.


Page 13: Types of machine translation

Transfer based MT

Page 14: Types of machine translation

Transformation process

Morphological analysis

Surface forms of the input text are classified by:
○ part-of-speech (e.g. noun, verb, etc.)
○ sub-category (number, gender, tense, etc.)
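
As a rough illustration of this stage, the sketch below maps a surface form to a lemma, a part-of-speech and sub-category features. The tiny lexicon and the two suffix rules are assumptions made up for the example; a real analyser would rely on a full morphological lexicon.

```python
# Toy lexicon: an illustrative assumption, not a real linguistic resource.
LEXICON = {
    "cat": "noun",
    "walk": "verb",
}


def analyse(surface):
    """Return (lemma, part-of-speech, sub-category features) for a surface form."""
    word = surface.lower()
    if word in LEXICON:
        pos = LEXICON[word]
        features = {"number": "singular"} if pos == "noun" else {"tense": "present"}
        return word, pos, features
    # Hypothetical suffix rules: plural "-s" on nouns, past "-ed" on verbs.
    if word.endswith("s") and LEXICON.get(word[:-1]) == "noun":
        return word[:-1], "noun", {"number": "plural"}
    if word.endswith("ed") and LEXICON.get(word[:-2]) == "verb":
        return word[:-2], "verb", {"tense": "past"}
    return word, "unknown", {}


print(analyse("cats"))
print(analyse("walked"))
```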


Page 15: Types of machine translation

Transformation process

Lexical categorization

In any given text some of the words may have more than one meaning, causing ambiguity in analysis.

Lexical categorization looks at the context of a word to try to determine the correct meaning in the context of the input.
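
One simple way to picture the use of context is to count how many cue words of each candidate sense appear around the ambiguous word. The sketch below does exactly that; the sense labels and cue-word sets are invented for the example, and real systems use far richer context models.

```python
# Hypothetical sense inventory: each sense lists a few cue words.
SENSES = {
    "bank": {
        "bank/finance": {"money", "loan", "account"},
        "bank/river": {"river", "water", "shore"},
    }
}


def choose_sense(word, context):
    """Pick the sense whose cue words overlap most with the context words."""
    candidates = SENSES.get(word)
    if not candidates:
        return word  # unambiguous or unknown word: nothing to decide
    context_words = {w.lower() for w in context}
    return max(candidates, key=lambda sense: len(candidates[sense] & context_words))


print(choose_sense("bank", "he sat on the bank of the river".split()))
```

When no cue word matches, this sketch simply falls back to the first listed sense, which is a design shortcut rather than a recommendation.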


Page 16: Types of machine translation

Transformation process

Lexical transfer

This is basically dictionary translation: the source language lemma (perhaps with sense information) is looked up in a bilingual dictionary and the translation is chosen.
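
A minimal sketch of such a lookup, assuming a small English-to-Spanish dictionary keyed by lemma and optional sense. The entries and the sense labels are illustrative only.

```python
# Hypothetical bilingual dictionary: (lemma, sense) -> target lemma.
BILINGUAL = {
    ("bank", "finance"): "banco",
    ("bank", "river"): "orilla",
    ("house", None): "casa",
}


def transfer_lemma(lemma, sense=None):
    """Look up a source lemma (with optional sense) and return the target lemma."""
    if (lemma, sense) in BILINGUAL:
        return BILINGUAL[(lemma, sense)]
    # Fall back to a sense-free entry, then to the untranslated lemma.
    return BILINGUAL.get((lemma, None), lemma)


print(transfer_lemma("bank", "river"))
print(transfer_lemma("house"))
```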


Page 17: Types of machine translation

Transformation process

Structural transfer

While the previous stages deal with words, this stage deals with larger constituents.
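
To make the idea of a constituent-level rule concrete, the sketch below applies one hypothetical transfer rule: an adjective followed by a noun is reordered to noun followed by adjective, as would be needed for many English-to-Spanish noun phrases. The tag names and the rule itself are assumptions for illustration.

```python
def swap_adj_noun(tagged):
    """Reorder every ADJ NOUN pair to NOUN ADJ in a list of (word, tag) pairs."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just reordered
        else:
            i += 1
    return out


phrase = [("the", "DET"), ("red", "ADJ"), ("house", "NOUN")]
print(swap_adj_noun(phrase))
```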


Page 18: Types of machine translation

Transformation process

Morphological generation

From the output of the structural transfer stage, the target language surface forms are generated.
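
As a small illustration, the sketch below generates a Spanish noun surface form from a lemma and a number feature. The pluralisation rule (add "s" after a vowel, "es" otherwise) is a deliberate simplification and ignores irregular forms.

```python
def generate_noun(lemma, number):
    """Produce a surface form from a target-language lemma and a number feature."""
    if number == "singular":
        return lemma
    # Simplified rule: "-s" after a vowel, "-es" after a consonant.
    return lemma + ("s" if lemma[-1] in "aeiou" else "es")


print(generate_noun("casa", "plural"))
print(generate_noun("flor", "plural"))
```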


Page 19: Types of machine translation

Transfer Types

Superficial transfer (or syntactic)

This level is characterized by transferring "syntactic structures" between the source and target languages.

It is suitable for languages in the same family or of the same type, for example the Romance languages: Spanish, Catalan, French, Italian, etc.


Page 20: Types of machine translation

Transfer Types

Deep transfer (or semantic)

This level constructs a semantic representation that is dependent on the source language.

This representation can consist of a series of structures which represent the meaning.

In these transfer systems predicates are typically produced. The translation also typically requires structural transfer.

This level is used to translate between more distantly related languages (e.g. Spanish-English or Spanish-Basque, etc.).


Page 21: Types of machine translation

Dependency Grammar

Page 22: Types of machine translation

Case Grammar

Page 23: Types of machine translation


Page 24: Types of machine translation

Interlingual MT

The source language text, i.e. the text to be translated, is transformed into an interlingua, i.e. an abstract language-independent representation.

The target language is then generated from the interlingua.
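
The two steps can be sketched as follows. The analyser only understands sentences of the form "X sees Y", and the frame layout, the concept label `SEE` and the verb lexicons are all invented for the example. The point is that each generator reads only the frame and never looks at the English source.

```python
# Hypothetical verb lexicons keyed by language code; "SEE" is an invented concept label.
VERBS = {"es": {"SEE": "ve a"}, "fr": {"SEE": "voit"}}


def analyse_english(sentence):
    """Toy analyser: turn 'X sees Y' into a language-independent frame."""
    words = sentence.rstrip(".").split()
    if len(words) != 3 or words[1].lower() != "sees":
        raise ValueError("unsupported sentence")
    return {"predicate": "SEE", "agent": words[0], "theme": words[2]}


def generate(frame, lang):
    """Generate a target sentence from the frame alone."""
    verb = VERBS[lang][frame["predicate"]]
    return f"{frame['agent']} {verb} {frame['theme']}"


frame = analyse_english("Mary sees John.")
print(generate(frame, "es"))
print(generate(frame, "fr"))
```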


Page 25: Types of machine translation

Interlingual MT

In the direct approach, words are translated directly without passing through an additional representation.

In the transfer approach the source language is transformed into an abstract, less language-specific representation.


Page 26: Types of machine translation

Interlingual MT

Page 27: Types of machine translation

Advantage and disadvantage

The advantage in multilingual machine translation is that no transfer component has to be created for each language pair.

The obvious disadvantage is that the definition of an interlingua is difficult and may even be impossible for a wider domain.
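
The size of the advantage can be seen with a quick count. Assuming every language should translate into every other, a transfer system needs one transfer component per ordered language pair, while an interlingua system needs only one analyser and one generator per language. The function names below are made up for the illustration.

```python
def transfer_components(n_languages):
    """One transfer component per ordered (source, target) pair."""
    return n_languages * (n_languages - 1)


def interlingua_components(n_languages):
    """One analyser plus one generator per language, no pairwise component."""
    return 2 * n_languages


for n in (3, 10):
    print(n, transfer_components(n), interlingua_components(n))
```

For 10 languages this gives 90 transfer components against 20 analysis and generation modules.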


Page 28: Types of machine translation

Components

○ Dictionaries for analysis and generation
○ A conceptual lexicon, which is the knowledge base about events and entities known in the domain
○ A set of projection rules (specific to the domain and the languages)
○ Grammars for the analysis and generation of the languages involved


Page 29: Types of machine translation


Page 30: Types of machine translation

Dictionary-based MT

The words will be translated as a dictionary does: word by word, usually without much correlation of meaning between them.

Dictionary lookups may be done with or without morphological analysis or lemmatisation.

This approach can be used to expedite manual translation, if the person carrying it out is fluent in both languages and therefore capable of correcting syntax and grammar.
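
A word-by-word translator fits in a few lines. The sketch below uses a made-up four-entry English-to-Spanish dictionary and no morphological analysis; unknown words are passed through unchanged.

```python
# Hypothetical dictionary entries, chosen only for this example.
DICTIONARY = {"the": "la", "house": "casa", "is": "es", "red": "roja"}


def translate_word_by_word(sentence):
    """Replace each word by its dictionary entry, keeping the source word order."""
    return " ".join(DICTIONARY.get(word, word) for word in sentence.lower().split())


print(translate_word_by_word("the red house"))
```

The output keeps the English word order ("la roja casa" rather than "la casa roja"), which is the kind of syntax error a fluent human translator would then correct.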


Page 31: Types of machine translation

Dictionary-based MT

Page 32: Types of machine translation

Dictionary-based MT

Page 33: Types of machine translation

Example-based MT


Page 34: Types of machine translation

Example-based MT

Example-based MT is characterized by its use of a bilingual corpus with parallel texts as its main knowledge base.

It is essentially translation by analogy and can be viewed as an implementation of the case-based reasoning approach of machine learning.

Page 35: Types of machine translation


Page 36: Types of machine translation

Example-based MT

Page 37: Types of machine translation

Example-based MT

Bilingual parallel corpora contain sentence pairs like the following examples:

○ "How much is that X ?" corresponds to "Ano X wa ikura desu ka."
○ "red umbrella" corresponds to "akai kasa"
○ "small camera" corresponds to "chiisai kamera"
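
The pairs above can be combined by analogy: the sentence pattern supplies the frame and the phrase pairs fill the slot X. The sketch below is one possible way to do this with a regular expression; the function name and the matching strategy are assumptions, not a description of any particular EBMT system.

```python
import re

# Pattern and phrase pairs taken from the example pairs; the code around them is a sketch.
TEMPLATE_SRC = re.compile(r"^how much is that (.+?)\s*\?$", re.IGNORECASE)
TEMPLATE_TGT = "Ano {x} wa ikura desu ka."
PHRASES = {"red umbrella": "akai kasa", "small camera": "chiisai kamera"}


def translate(sentence):
    """Translate by matching the sentence pattern and substituting a known phrase."""
    match = TEMPLATE_SRC.match(sentence.strip())
    if match is None:
        return None  # no stored example resembles this sentence
    phrase = match.group(1).lower()
    if phrase not in PHRASES:
        return None  # the slot filler has no known translation
    return TEMPLATE_TGT.format(x=PHRASES[phrase])


print(translate("How much is that red umbrella?"))
```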

Page 38: Types of machine translation

Example-based MT

Given example sentences (with their translations) such as "President Kennedy was shot dead during the parade." and "The convict escaped on July 15th.", we could translate the sentence "The convict was shot dead during the parade." by substituting the appropriate parts of the sentences.

Page 39: Types of machine translation

Statistical MT


Page 40: Types of machine translation

Statistical MT

The idea behind statistical machine translation comes from information theory.

A document is translated according to the probability distribution p(e | f) that a string e in the target language (for example, English) is the translation of a string f in the source language (for example, French).

Page 41: Types of machine translation

Statistical MT

The problem of modeling the probability distribution p(e | f) has been approached in a number of ways. One intuitive approach is to apply Bayes' theorem:

$$p(e \mid f) \propto p(f \mid e)\, p(e)$$

Page 42: Types of machine translation

where the translation model p(f | e) is the probability that the source string is the translation of the target string, and the language model p(e) is the probability of seeing that target language string.

Page 43: Types of machine translation

Statistical MT

Finding the best translation is done by picking the one that gives the highest probability.
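
In other words, the chosen translation is the string e that maximizes p(f | e) p(e); the denominator p(f) from Bayes' theorem is the same for every candidate and can be ignored. The sketch below performs this search over three candidates. All probabilities are invented for the illustration, and a real decoder searches a vastly larger candidate space.

```python
# Toy translation model p(f | e): invented numbers, keyed by (source f, candidate e).
TRANSLATION_MODEL = {
    ("la maison", "the house"): 0.7,
    ("la maison", "the home"): 0.6,
    ("la maison", "house the"): 0.7,
}

# Toy language model p(e): fluent English strings get higher probability.
LANGUAGE_MODEL = {
    "the house": 0.05,
    "the home": 0.03,
    "house the": 0.0001,
}


def best_translation(f):
    """Return the candidate e that maximizes p(f | e) * p(e)."""
    candidates = [e for (source, e) in TRANSLATION_MODEL if source == f]
    return max(candidates, key=lambda e: TRANSLATION_MODEL[(f, e)] * LANGUAGE_MODEL[e])


print(best_translation("la maison"))
```

Note how "house the" scores as well as "the house" under the translation model alone; it is the language model that rules it out.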