Introduction to Computational Linguistics
Frank Richter
fr@sfs.uni-tuebingen.de.
Seminar für Sprachwissenschaft
Eberhard-Karls-Universität Tübingen
Germany
Intro to CL – WS 2006/7 – p.1
Central Goal of the Field
build psychologically adequate models of human language processing capabilities on the basis of knowledge about the way in which humans acquire, store, and process language.
build functionally correct models of human language processing capabilities on the basis of knowledge about the world and about language elicited from people and stored in the system.
Application Areas
machine translation
speech recognition
speech synthesis
man-machine interfaces
intelligent word processing: spelling correction, grammar correction
document management: find relevant documents in collections, establish authorship of documents, catch plagiarism, extract information from documents, classify documents, summarize documents, summarize document collections
A bit of Philosophy of Science
Theory: A set of statements that determine the format and semantics of descriptions of phenomena in the purview of the theory.
Methodology: An effective theory comes with an explicit methodology for acquiring these descriptions.
Application: A theory associated with a methodology can be applied to tasks for which the methodology is appropriate.
Scientific Strategies
Method-Oriented Approach: devise or import a tool, a procedure, or a formalism, apply it to a task, and develop it further. Then (optionally) see whether it works for additional tasks.
Task-Oriented Approach: select a task; devise or import a method or several methods for its solution; integrate the methods as required to improve performance.
Machine Translation
What makes Machine Translation an important application area to study:
historically the first application area, and for at least a decade the only application area, of computational linguistics
requires all steps relevant to linguistic analysis of input sentences and linguistic generation of output sentences
hence, machine translation is scientifically one of the most challenging and most comprehensive tasks in computational linguistics
The Purposes of Translation
Information Acquisition: e.g. gathering information from scientific articles or newspapers written in a foreign language.
Information Dissemination: e.g. translation of technical manuals, legal texts, weather reports, etc.
Literary Translation: e.g. translation of novels, poems, etc.
Relating Translation Purposes to MT
Information Acquisition:
involves translation from a foreign to a native language
typically used by non-linguists with little or no linguistic competence in the source language
pre-processing of the input not feasible due to lack of linguistic competence by the user in the source language
may require special-purpose lexica
low-quality translation is tolerable
Relating Translation Purposes to MT (2)
Information Dissemination:
involves translation from a native to a foreign language
pre- and post-processing of the input feasible due to linguistic competence by the translator in the source language
may involve sublanguage with restricted vocabulary; e.g. translation of weather reports
often involves special terminologies stored in a terminology database; e.g. for translation of technical manuals
purely human translation for such tasks can be time-consuming, inconsistent, or tedious
Relating Translation Purposes to MT (3)
Literary Translation:
requires stylistic elegance, often involves metaphorical and metonymic language
abundance of highly-trained human translators
task rarely performed by machine translation
What Makes Machine Translation Hard
Lexical Ambiguity
Lexical Gaps
Syntactic Divergences between Source and Target Language