In two minds: How to teach translation students to learn from parallel corpora

Post on 14-Jan-2016

Tomaž Erjavec
Department of Intelligent Systems
Jožef Stefan Institute
tomaz.erjavec@ijs.si

Špela Vintar
Department of Translation and Interpreting
University of Ljubljana
spela.vintar@guest.arnes.si

Overview

The corpus and concordancer
Using the resource to teach students

The IJS-ELAN parallel corpus

EU MLIS project ELAN: IJS
Slovene-English parallel texts
1 million words, 15 texts
sentence aligned, tokenised
TEI encoded
freely available: http://nl.ijs.si/elan/

Example TU

<tu lang="sl-en" id="spor.902">
<seg lang="sl"><w type=dig>117.</w> <w>&ccaron;len</w></seg>
<seg lang="en"><w>Article</w> <w type=dig>117</w></seg>
</tu>

<tu lang="en-sl" id="gnpo.303">
<seg lang="en"><w>Memory</w> <w>exhausted</w></seg>
<seg lang="sl"><w>zmanjkalo</w> <w>pomnilnika</w></seg>
</tu>
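As a minimal sketch of how such a translation unit can be read programmatically, the Python below parses the second example TU (id gnpo.303) with the standard library and returns its tokens per language. The function name read_tu is hypothetical and is not part of the corpus distribution. The first example TU is deliberately not used: its unquoted attribute (type=dig) and its SGML entity (&ccaron;) would be rejected by a plain XML parser unless the markup is normalised first.

```python
import xml.etree.ElementTree as ET

# Copy of the second example TU; the markup is unchanged.
TU = (
    '<tu lang="en-sl" id="gnpo.303">'
    '<seg lang="en"><w>Memory</w> <w>exhausted</w></seg>'
    '<seg lang="sl"><w>zmanjkalo</w> <w>pomnilnika</w></seg>'
    '</tu>'
)


def read_tu(xml_text):
    """Return (id, {language: [tokens]}) for one translation unit."""
    tu = ET.fromstring(xml_text)
    segments = {
        seg.get("lang"): [w.text for w in seg.findall("w")]
        for seg in tu.findall("seg")
    }
    return tu.get("id"), segments


tu_id, segments = read_tu(TU)
print(tu_id, segments)
```

Keeping the tokens grouped by language preserves the sentence alignment, which is what a parallel concordancer relies on.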

Web concordance

IMS CQP backend
CGI Perl interface
Apache server

Queries

Vanilla queries: dog*, *dog
Full regular expressions: "dog.*"
Positional attributes: [num="dual"]
Expressions over tokens
Constraints on aligned segments
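The relation between vanilla queries and full regular expressions can be sketched as follows. This is an illustration only: it does not call CQP, the helper names are hypothetical, and it assumes case-sensitive matching over whole tokens, which may differ from the behaviour of the actual web concordancer.

```python
import re


def wildcard_to_regex(query):
    """Turn a vanilla query such as 'dog*' into a regular expression.

    Each '*' stands for any run of characters; everything else is literal.
    """
    literal_parts = [re.escape(part) for part in query.split("*")]
    return re.compile(".*".join(literal_parts))


def search(tokens, query):
    """Return the tokens that match the vanilla query as a whole."""
    pattern = wildcard_to_regex(query)
    return [token for token in tokens if pattern.fullmatch(token)]


# Toy token list, invented for this illustration.
TOKENS = ["dog", "dogs", "bulldog", "Dogma", "cat", "hotdogs"]

print(search(TOKENS, "dog*"))
print(search(TOKENS, "*dog"))
print(search(TOKENS, "*dog*"))
```

Seeing that dog* is only shorthand for the expression dog.* can make the step from vanilla queries to full regular expressions less abrupt for students.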

Using the corpus in translator training:

Developing corpus literacy

what is a corpus?
what's in the corpus?
how to find things in the corpus?
how to use the results?

Formulating corpus queries

learning to formalize language
wordform vs. lemma (Slovene!)
using parallel search to filter out unwanted examples
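A rough sketch of both points, written in Python rather than in the concordancer's own query language: a Slovene lemma has to be spelled out as a set of wordforms, and a condition on the aligned segment keeps or discards hits. The sentence pairs are abbreviated from the deck's own "shall" concordance examples; the function parallel_search and the pattern covering present-tense forms of morati ("must") are assumptions made for illustration.

```python
import re

# Aligned (English, Slovene) pairs, abbreviated from the concordance examples.
PAIRS = [
    ("The official language of Slovenia shall be Slovenian.",
     "Uradni jezik v Sloveniji je slovenščina."),
    ("This schedule shall provide for a phasing-out",
     "Ta razpored mora predvideti postopno opuščanje"),
    ("Thus we shall create harmony",
     "Tako bomo ustvarjali ravnovesje"),
]

# One lemma, many wordforms: present-tense forms of "morati".
MORATI = r"\bmora(m|š|va|ta|mo|te|jo)?\b"


def parallel_search(pairs, source_pattern, target_pattern, exclude=False):
    """Keep pairs whose source matches and whose target does (or does not) match."""
    source_re = re.compile(source_pattern)
    target_re = re.compile(target_pattern)
    hits = []
    for source, target in pairs:
        if not source_re.search(source):
            continue
        target_matches = target_re.search(target) is not None
        if target_matches != exclude:
            hits.append((source, target))
    return hits


with_morati = parallel_search(PAIRS, r"\bshall\b", MORATI)
without_morati = parallel_search(PAIRS, r"\bshall\b", MORATI, exclude=True)
print(len(with_morati), len(without_morati))
```

Setting exclude=True is the filtering case: it leaves only those occurrences of "shall" that were not rendered with a form of morati.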

Evaluating the results

critical eye: corpus translations may be false or bad

before relying on quantitative data, consider corpus composition

corpus != dictionary

Types of activities

frontal presentations
group work
individual work - translating with the corpus
seminar assignments

Things to observe

translation (in)equivalence, terminological variety

word-formation strategies
pragmatic/cultural conventions of text types
contrastive analysis
other translation strategies

lokaln* samouprav* ?

kuca: z ustreznim razmerjem med državo in lokalno samoupravo, med središčem države in
A society with an appropriate relationship between the state and local government, between the national centre and individual regions.

parl: obstajati. Specifične oblike lokalne samouprave so Slovenci poznali pod imenom župa,
Specific forms of local self-administration were known to Slovenes by the term župa, which meant one or more villages led by a župan.

ecmr: reforme javne uprave, razvoj lokalne samouprave, pa tudi oceno kadrovskih potreb in
It is therefore an operative document which, apart from strategic goals, defines the areas of reforms, macro - and micro-economic policy measures, development of judicial system, public administration reform, development of local administration, as well as an estimate of the staff and financing requirements for realisation of those reforms.

ekol: okolja33. V ta sklop sodi tudi raven lokalne samouprave s svojimi pristojnostmi na področju
This also includes the level of local self-government with its responsibilities in the area of environmental protection, which otherwise are dealt with in a special chapter.

*bug*

20 bugs
13 bug
9 debugging
8 debug
3 buggers
3 bug-free
2 buggy
2 Debugging
1 tar-bugs@gnu.ai.mit.edu
1 request@bugs.debian.org
1 debuggers
1 debugger
1 bug-wget@gnu.org
1 bug-gnu-utils@gnu.org
1 bug-fixes
1 bug-fileutils@gnu.org

*hrošč*

11 hroščev
6 hrošču
5 razhroščevanje
5 hroščih
4 hrošče
3 hrošč
2 razhroščevanja
2 razhroščevalnega
2 hrošči
2 Razhroščevanje
1 razhroščujejo
1 razhroščiti
1 razhroščevanju
1 razhroščevalniku
1 razhroščevalniki
1 razhroščevalnik
1 razhroščevalnih
1 razhroščevalne
1 hroščem
1 hroščati
1 hroščat
1 hrošča
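Raw frequency lists like these usually need a little tidying before the wordforms are compared. The Python sketch below takes the *bug* list as given on the slide, drops e-mail addresses and merges capitalised variants. The cleaning rules and the function name clean_frequency_list are assumptions made for illustration; they are not features of the concordancer.

```python
from collections import Counter

# Frequency list for the query *bug*, copied from the slide.
RAW = (
    "20 bugs 13 bug 9 debugging 8 debug 3 buggers 3 bug-free 2 buggy "
    "2 Debugging 1 tar-bugs@gnu.ai.mit.edu 1 request@bugs.debian.org "
    "1 debuggers 1 debugger 1 bug-wget@gnu.org 1 bug-gnu-utils@gnu.org "
    "1 bug-fixes 1 bug-fileutils@gnu.org"
)


def clean_frequency_list(raw):
    """Parse 'count form' pairs, skip e-mail addresses, merge case variants."""
    fields = raw.split()
    cleaned = Counter()
    for count, form in zip(fields[0::2], fields[1::2]):
        if "@" in form:  # an e-mail address, not a wordform of interest
            continue
        cleaned[form.lower()] += int(count)
    return cleaned


cleaned = clean_frequency_list(RAW)
for form, count in cleaned.most_common():
    print(count, form)
```

The same function would merge razhroščevanje and Razhroščevanje in the Slovene list, although grouping inflected forms such as hroščev and hrošču under one lemma would need morphological information that this sketch does not have.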

Ways of translating deontic modality - shall

usta: Within its own territory, Slovenia shall protect human rights and fundamental
Država na svojem ozemlju varuje človekove pravice in temeljne svoboščine.

usta: 11 The official language of Slovenia shall be Slovenian. In those areas where
Uradni jezik v Sloveniji je slovenščina.

spor: This schedule shall provide for a phasing-out
Ta razpored mora predvideti postopno opuščanje tako uvedenih carin, s katerim je treba začeti najkasneje dve leti po uvedbi dajatev, in sicer po enakih letnih stopnjah.

orwl: " " Obviously we shall put it off as long as
" Nujno jo morava odložiti za tako dolgo, kot moreva. "

kuca: a state which shall be fair to all,
Je pa v moči vseh državljank in državljanov, da si ustvarijo tako državo, ki bo pravična do vseh, ne glede na njihove poglede na svet, politično prepričanje ali narodno pripadnost.

kuca: world. Thus we shall create harmony
Tako bomo ustvarjali ravnovesje v sebi, z drugimi in z okoljem.

A peek into the log file

~1,900 different queries since 1999
L2 search: prevarication, forfeiture, runlevel, kernel
lexical-gap words: bias, retrieve, prepoznavnost
culturally bound words: potica, kozolec
(multiword) terms: legira.* (alloy steel)
