An Intuitive Representation of Human Languages for Translation
Gábor Prószéky
MorphoLogic & Faculty of Information Technology, Pázmány University
Kalmár Workshop, Szeged, October 1-2, 2003
Contents
Some words on Prof. Kalmár’s activity in computational linguistics
Problems of human language description with formal tools
A new representation with patterns
Introduction to machine translation methods
Application of patterns to translation
Kalmár & languages
Kalmár’s paper in formal language theory: „An Intuitive Representation of Context-Free Languages”
Kalmár’s activity in machine translation (conference in 1962): „Representation of Languages with the Help of Mathematical Structures”
RG/FSA: not enough
CF/RTN: not enough
CS ?
0/ATN: Turing Machine
Transformations and metarules
Arguments for and against
NL grammar formalisms
Competence and performance?
Kornai number (left-recursion, center-embedding, “respectively” construction)
Gradually from unrestricted to regular
(i) a^n b^n -> a*b* (n is lost!)
(ii) a^n b^n -> {ε, ab, aabb, aaabbb}
“Finitization” by length
No structure in FSA; finite systems,
Categorial grammar: early logical representations of syntax (Kalmár)
DCG: interpretation & representation
Rule-to-rule hypothesis
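The two ways of moving from a^n b^n toward something regular or finite can be illustrated with a minimal Python sketch. The function names and the bound of 3 are assumptions chosen to match the example set on the slide; this is not taken from the talk itself.

```python
import re

def anbn_up_to(max_n):
    """(ii) Finitization by length: list a^n b^n for n = 0..max_n."""
    return {"a" * n + "b" * n for n in range(max_n + 1)}

# (i) Regular approximation: a*b* keeps the order of a's and b's
# but no longer checks that their counts agree (n is lost).
A_STAR_B_STAR = re.compile(r"a*b*")

def in_regular_approximation(s):
    return A_STAR_B_STAR.fullmatch(s) is not None

finite_set = anbn_up_to(3)
print(sorted(finite_set, key=len))      # the four strings from the slide
print(in_regular_approximation("aab"))  # accepted by a*b* ...
print("aab" in finite_set)              # ... but not of the form a^n b^n
```

The finite set is exact but bounded, while a*b* is unbounded but over-accepts; the slide's point is that both give up something the context-free description had.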
Conflict handling
Lexicon meets syntax: who is right?
Lexicon: off-line info coming from past experiences
Which is more important in a specific situation?
Open classes
Open vs. closed classes: that is, features can or cannot be overridden
Proper names, jabbers, folk etymology, loanwords, ...
Grammar of closed classes: minimal grammar
Finite morphology
Finite patterns
Finite number of entries
Descriptions assigned to entries
Finite & open vs. infinite & closed
Underspecified entries for guessing
Finite syntax
“Item and arrangement” (as in morphology)
“Arrangement” describes a rather free constituent-order
Metawords in a meta-dictionary, e.g. ‘(Det (Adj (N)))’ ‘DAN’
Cascades without loop
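One possible reading of "metawords in a meta-dictionary" and "cascades without loop" is sketched below in Python. The one-letter codes, the two dictionary levels, and the longest-match lookup are all assumptions made for illustration; the slide only gives the example ‘(Det (Adj (N)))’ and ‘DAN’.

```python
# Hypothetical one-letter codes for categories, so that a category
# sequence such as Det Adj N can be spelled as the metaword "DAN".
CODES = {"Det": "D", "Adj": "A", "N": "N", "V": "V", "NP": "P"}

# Hypothetical meta-dictionaries: metaword -> category of the whole pattern.
LEVEL_1 = {"DAN": "NP", "DN": "NP", "AN": "NP", "N": "NP"}
LEVEL_2 = {"PVP": "S", "PV": "S"}

# The cascade is a fixed list of levels, each applied exactly once,
# so no level can feed back into itself or an earlier one.
CASCADE = [LEVEL_1, LEVEL_2]

def apply_level(tags, meta_dict):
    """Scan left to right, replacing the longest metaword found at each position."""
    longest = max(len(word) for word in meta_dict)
    out, i = [], 0
    while i < len(tags):
        for length in range(min(longest, len(tags) - i), 0, -1):
            metaword = "".join(CODES.get(t, "?") for t in tags[i:i + length])
            if metaword in meta_dict:
                out.append(meta_dict[metaword])
                i += length
                break
        else:
            out.append(tags[i])  # no pattern starts here: keep the tag
            i += 1
    return out

def parse(tags):
    for level in CASCADE:
        tags = apply_level(tags, level)
    return tags

print(parse(["Det", "Adj", "N", "V", "Det", "N"]))
```

Because every pattern is a finite string looked up in a finite table, the whole analysis stays a dictionary lookup, in the same "item and arrangement" spirit as the morphology.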
The „plastic box”
John is a boy. ”John” is a noun.
Go is a verb. ”Go” is a verb.
□ is a sign. ”□” is a sign.
□ is a □.
(where □ is a ”plastic box”)
Real examples
(a) Unusual use: Go is a verb.
    POS [np]  POS [v]
(b) Metaphor: My car drinks a lot.
    ANIMATE [+]  ANIMATE [-]
(c) Unknown entry: Kalmár is a family name.
    POS [np]
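The three cases above can be mimicked with a small Python sketch of a pattern slot meeting a lexical entry. The lexicon, the feature names, and above all the policy (the slot's requirement overrides an open-class entry, while a closed-class entry cannot be overridden) are assumptions read off the slides on open classes and conflict handling, not a description of the actual system.

```python
# Hypothetical toy lexicon; "open" marks open-class entries.
LEXICON = {
    "go":  {"POS": "v", "open": True},
    "car": {"POS": "n", "ANIMATE": "-", "open": True},
    "the": {"POS": "det", "open": False},
}

def fill_slot(word, required):
    """Try to put `word` into a slot that demands the features in `required`.

    Returns (features, status); features is None if the word is rejected.
    """
    entry = LEXICON.get(word.lower())
    if entry is None:
        # (c) Unknown entry: guess an underspecified entry from the slot.
        return dict(required), "unknown"
    features = {k: v for k, v in entry.items() if k != "open"}
    conflicts = [k for k, v in required.items()
                 if k in features and features[k] != v]
    if not conflicts:
        return {**features, **required}, "consistent"
    if entry["open"]:
        # (a), (b): open-class features give way to the slot's requirement.
        return {**features, **required}, "overridden"
    return None, "rejected"

print(fill_slot("Go", {"POS": "np"}))        # (a) unusual use
print(fill_slot("car", {"ANIMATE": "+"}))    # (b) metaphor
print(fill_slot("Kalmár", {"POS": "np"}))    # (c) unknown entry
```

Under this assumed policy the slot behaves like the "plastic box": whatever is put into it takes the shape the surrounding pattern requires, unless it belongs to a closed class.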
Linguistic frames
Psychology: ”Gestalt”
Morphologically complex structures treated as frames by humans
Frames in AI: ‘shopping’, ‘walking’, ...
As ‘high-level parsing’ relates to ‘detailed on-line analysis’
Translation of human languages
old problems (50’s)
direct (60’s)
interlingual (70’s)
transfer (80’s)
examples (90’s)
Patterns: general linguistic information in lexicalized form
Short, fully specified patterns are:
Single pass: no separate transfer steps
Target structure generation: by-product of parsing
Jabberwocky
‘Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.
‘Twas □, and the □ □s
Did □ and □ in the □:
All □ were the □s,
And the □ □s □.
Translation rules for Jabberwocky
‘twas □ -> □ volt
□, and □ -> □, és □
the □s did □ -> a □ok □tak
□ and □ -> □ és □
in the □ -> a □ban
all □ -> teljesen □
□ were the □s -> □k voltak az □ok
the □s □ -> a □ok □tek
‘Twas □, and the □ □s
Did □ and □ in the □:
All □ were the □s,
And the □ □s □.

□ volt, és a □ □ok
□tak és □tek a □ben:
teljesen □ voltak a □ok
és a □ □ok □tek.
Translation of Jabberwocky
Dzsebervoki
Brillig volt, és a szlájti tóvok
gájertak és gimbeltek a vébben:
teljesen mimszik voltak a borogróvok
és a món rátok autgrébtek.
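The idea that a source/target pattern pair yields the translation as a by-product of matching can be sketched in a few lines of Python. The rule list below is a hypothetical subset adapted from the Jabberwocky rules, with `{0}` and `{1}` standing in for the slots. The sketch copies the slot fillers unchanged, so it does not reproduce the phonetic respelling or the Hungarian vowel harmony visible in the translation above.

```python
import re

# Hypothetical (source pattern, target pattern) pairs; {0}, {1} mark slots.
RULES = [
    ("all {0} were the {1}s", "teljesen {0}k voltak az {1}ok"),
    ("in the {0}", "a {0}ban"),
    ("{0} and {1}", "{0} és {1}"),
]

def compile_source(source):
    """Turn a source pattern into a regex with one named group per slot."""
    parts = re.split(r"\{(\d+)\}", source)  # literals and slot numbers alternate
    regex = "".join(
        re.escape(part) if i % 2 == 0 else rf"(?P<s{part}>\w+)"
        for i, part in enumerate(parts)
    )
    return re.compile(regex, re.IGNORECASE)

def translate(phrase, rules=RULES):
    """Return the target side of the first matching pair, or None."""
    for source, target in rules:
        match = compile_source(source).fullmatch(phrase)
        if match:
            slots = [match.group(f"s{i}") for i in range(len(match.groups()))]
            # Matching the source already fixes the target structure:
            # there is no separate transfer step.
            return target.format(*slots)
    return None  # no known pattern matched this phrase

print(translate("all mimsy were the borogoves"))
print(translate("in the wabe"))
print(translate("gyre and gimble"))
```

A phrase that matches no stored pair returns `None` here; in the terms of the talk this would be a "broken" frame that needs real-time analysis, which the sketch does not attempt.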
An intuitive representation...
1. X-bar based structures
2. Feature-based descriptions
3. Metarules (used off-line)
4. Rule-to-rule principle
5. Lexicon should be finite but open
6. Closed classes belong to the minimal grammar
7. Minimal grammar describes ”basically” linguistic elements
An intuitive representation... (cont’d)
8. Linguistic constructions can be described by finite patterns
9. A huge & finite description set is used rather than a limited & infinite grammar
10. In case of conflict, lexical information is either redundant with or contradictory to the actual description
11. Known constructions need no real-time analysis (Gestalt, frame)
An intuitive representation... (cont’d)
12. ”Broken” frames are analyzed in real time
13. Structural (source/target) pattern pair is assigned to every frame to be translated
14. Target structure is computed while parsing source structure