A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Feb 03, 2016

Transcript
Page 1: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Katrin Menzel

Institute of Applied Linguistics, Translation and Interpreting, Saarland University

Corpus Linguistics Conference – 27 June 2013, St. Petersburg

Page 2: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

GECCo project

http://www.gecco.uni-saarland.de/GECCo/Home.html

Page 3: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

GECCo project

GECCo: German-English Contrasts in Cohesion

• supported by DFG
1st phase 2011-2013
2nd phase 2013-2016

• Project Team: Marilisa Amoia, Kerstin Kunz, Ekaterina Lapshinova, Katrin Menzel, Erich Steiner

Page 4: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Main research questions

Which systemic resources of cohesion are instantiated in English and German texts in different registers/genres?

How frequent are they?

Which cohesive meanings do they express?

Page 5: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Research goals

analyse cohesive resources provided by the language systems and instantiations in texts

explore contrasts in form, frequency, function and meaning relations across and between languages, registers and production types

Page 6: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Motivation

Filling major research gaps:

Comprehensive accounts of cohesion exist only from a monolingual perspective (e.g. Halliday & Hasan 1976)

empirical monolingual or contrastive analyses at text and discourse level mainly deal with individual phenomena

Page 7: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

CORPUS RESOURCES

procedures to extract cohesive phenomena require compilation, annotation and exploitation of the GECCo corpus (written and spoken texts)

assumption: no clear dividing line but a continuum from written to spoken

Page 8: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE
Page 9: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

the written part of GECCo is a translation corpus and consists of various genres (popular-scientific, fictional and tourism texts, prepared speeches, political essays, corporate communication, instruction manuals, websites) of English and German original texts that are aligned with their translations

Page 10: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

the spoken part of the corpus is a comparable corpus of English and German original texts (interviews, academic lectures, web forums, talk shows…)

Page 11: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Corpus resources

Page 12: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

http://fedora.clarin-d.uni-saarland.de/cqpweb/

Page 13: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

http://de.clarin.eu/en

Page 14: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE
Page 15: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE
Page 16: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

http://fedora.clarin-d.uni-saarland.de/cqpweb/doc/Simple_query_language.pdf

Page 17: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Types of cohesive devices

Page 18: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Present Study:

ellipsis

types: nominal, verbal, clausal (cf. Halliday & Hasan 1976)

across:
- languages: English vs. German
- registers: different text types
- production types: originals vs. translations

Page 19: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Research goals

describing ellipsis from a cross-linguistic viewpoint in English and German

enhancing corpus linguistic methods to cover a comprehensive variety of ellipses in different registers of spoken and written language in a bilingual corpus (GECCo) of about 1 million words

Page 20: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Defining cohesive ellipsis

Ellipsis as a cohesive device is the omission of an element normally required by the grammar that can be recovered from the linguistic context.

Halliday/Hasan: nominal, verbal, clausal ellipsis

Examples:
There are two approaches to problem-solving: the empirical [ ] and the rational [ ].
I want to help you, but I can’t [ ].
What is the capital of the Philippines? – Manila [ ].

Page 21: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Ellipsis as a cohesive device

• cohesive ellipsis vs. other types of ellipsis and fragments (e.g. headlines, exophoric ellipsis without textual antecedent, lexicalised ellipsis)

• missing information must be supplied from the surrounding co-text (usually anaphorically)

Page 22: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Some differences between English and German

e.g. nominal ellipsis: in German, the ellipsis remnant has to show strong morphological agreement in order to license the elided noun
ein grünes [Haus], keine [Häuser], keins [?]

in a few cases, this also happens in English (mine, none…)

Page 23: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Verbal ellipsis in English and German

lack of correspondence between the English and German verbal systems: more differences between E/G than with regard to nominal ellipsis

e.g. inclusive imperative: Let’s [go]. / Let’s not. (does not exist in German: *Lass uns!)

English examples in GECCo: many subtypes of verbal ellipsis with varying degrees of complexity
German: mainly ellipses of the modal verb complement (Er muss [ ])

Page 24: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Clausal ellipsis in English and German

Differences G/E: case

Von wem wurde der Junge untersucht? – (Von) Einer Psychologin. * Eine Psychologin.
Who was the boy examined by? – A psychologist.

Sluicing:

Er will jemandem schmeicheln, aber sie wissen nicht wem [ ]
He wants to flatter someone, but they don't know who [ ].

Page 25: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Practical Issues

Annotating / querying ellipsis in corpora

Page 26: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Manual annotation with MMAX2 to compare with automatic annotation

http://www.h-its.org/english/research/nlp/download/mmax.php

Pointer relation can be used to link a bridging expression to its bridging antecedent.

Page 27: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

CQP queries: to query empty elements we have to find syntactically incomplete or deficient structures

German: Stuttgart-Tübingen-TagSet STTS, English: Penn Treebank tagset

Page 28: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Querying corpus with CQP (German: Stuttgart-Tübingen-TagSet STTS, English: Penn Treebank tagset)

Sample patterns of nominal ellipsis, with CQP query design and English examples

1. possessive marker 's not followed by noun

CQP query:
[word="'s"][pos!='nn|ne']

Examples:
Mrs. Wood’s [hat]
That was your dream. Kim’s [dreams] were all nightmares.

2. nominal ellipsis after article/det/numeral/quantifier/possessive marker (+ optional adjective)

CQP queries, e.g.
in German subcorpora:
[pos='adja'][pos='vafin']; (adjective + finite verb)
or
[pos='art'][pos='adja'][pos!='nn|ne']; (article + adjective, not followed by noun/proper noun)
in English subcorpora (different tagset):
[pos='jj'][pos='vv.*']; (adjective + verb)

Examples:
I accept the first argument, but reject the other two [ ] / the third [ ].
While Kim had lots of books, Pat had very few [ ].
I went up that skyscraper in Boston, but the tallest [ ] is in Chicago.

Page 29: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Sample CQP queries

GO:
• [pos='adja'][pos='vafin']; (adjective + finite verb)
• [pos='art'][pos='adja'][pos!='nn|ne']; (article + adjective, not followed by noun/proper noun)

EO/ETRANS (different tagset):
• [pos='jj'][pos='vv.*']; (adjective + verb)
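To make the logic of these token-sequence queries concrete, here is a minimal Python sketch that emulates them on a hand-tagged toy sentence. It is not CQP itself and not the project's tooling: the helper names `tag` and `find`, the lower-cased tags assigned by hand, and the invented English sentence are all assumptions made for illustration. Like CQP, the sketch requires the regular expression to match the whole tag.

```python
import re
from typing import Callable, List, Tuple

Token = Tuple[str, str]  # (word, pos); tags are hand-assigned for illustration

def tag(pattern: str, negate: bool = False) -> Callable[[Token], bool]:
    """Predicate on the POS tag; the regex must match the whole tag."""
    rx = re.compile(pattern)
    def check(token: Token) -> bool:
        matched = rx.fullmatch(token[1]) is not None
        return not matched if negate else matched
    return check

def find(tokens: List[Token], query: List[Callable[[Token], bool]]) -> List[int]:
    """Start indices where all predicates match consecutive tokens."""
    n = len(query)
    return [i for i in range(len(tokens) - n + 1)
            if all(pred(tokens[i + k]) for k, pred in enumerate(query))]

# [pos='art'][pos='adja'][pos!='nn|ne']
ART_ADJ_NO_NOUN = [tag("art"), tag("adja"), tag("nn|ne", negate=True)]
# [pos='adja'][pos='vafin']
ADJ_FINVERB = [tag("adja"), tag("vafin")]
# [pos='jj'][pos='vv.*']
ADJ_VERB = [tag("jj"), tag("vv.*")]

# "Der größte und schönste [ ] ist der Naschmarkt." with assumed STTS-style tags
german = [("Der", "art"), ("größte", "adja"), ("und", "kon"),
          ("schönste", "adja"), ("ist", "vafin"), ("der", "art"),
          ("Naschmarkt", "ne"), (".", "$.")]

# Invented English sentence with assumed tags, only to exercise the third query
english = [("the", "dt"), ("rational", "jj"), ("fails", "vvz"), (".", "sent")]

print(find(german, ART_ADJ_NO_NOUN))  # [0]
print(find(german, ADJ_FINVERB))      # [3]
print(find(english, ADJ_VERB))        # [1]
```

Each hit is only a candidate: the pattern finds a syntactically incomplete noun phrase, and whether it is a cohesive ellipsis still has to be checked against the co-text.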

Page 30: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

some manual correction necessary

difficulty for tagger: in English, many ellipsis remnants have multiple word class membership

Page 31: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

pronouns (e.g. "other": det/adj/pron), words ending in -ing:

the second being very... (to know whether "being" is a verb or a noun, context has to be taken into account, as tagging is sometimes wrong and leads to irrelevant examples in query results)

Page 32: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

e.g. "one": number/pronoun/det/adj/noun (sometimes used with nominal ellipsis, sometimes as nominal substitution):

the green one (= nominal substitution)
we saw one [lion] (= nominal ellipsis)
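The contrast between the two uses of "one" can be approximated with a rough pre-sorting heuristic, sketched below in Python. This is an assumption for illustration, not the procedure used in the project: the function name `classify_one`, the three labels, and the rule itself (adjective before "one" suggests substitution, a following noun means no ellipsis) are hypothetical, and the tags are assigned by hand.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, pos) with lower-cased, hand-assigned tags

def classify_one(tokens: List[Token], i: int) -> str:
    """Heuristically label the token 'one' at position i (assumed rule)."""
    prev_pos = tokens[i - 1][1] if i > 0 else ""
    next_pos = tokens[i + 1][1] if i + 1 < len(tokens) else ""
    if next_pos.startswith("nn"):
        # "one lion": the head noun is present, nothing is elided
        return "numeral before noun"
    if prev_pos.startswith("jj"):
        # "the green one": 'one' stands in for the head noun
        return "substitution candidate"
    # "we saw one [lion]": 'one' is the remnant of an elided noun phrase
    return "ellipsis candidate"

print(classify_one([("the", "dt"), ("green", "jj"), ("one", "nn")], 2))
print(classify_one([("we", "pp"), ("saw", "vvd"), ("one", "cd")], 2))
```

Because the tag of "one" itself is unreliable, the sketch looks only at the neighbouring tags; its output still needs manual correction.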

Page 33: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

sometimes ellipsis remnants are zero derivations (especially in English this additionally contributes to word class ambiguity for taggers, e.g. N/V: salt, ship, Adj./N: modal)

Page 34: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

- some nominalised elements (tagged as adj. / numerals), which often refer to people or abstract concepts + lexicalised / context-free ellipsis also have to be sorted out manually:

- the immoral, the rich
- the elderly, a 1 year old
- the Fantastic Four
- the big two [?] (referring to Oxford and Cambridge university, lexicalised?)
- lexicalised idiomatic ellipsis: eine [ ] rauchen

Page 35: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Register        Nominal ellipsis   Verbal ellipsis   Clausal ellipsis       ∑
GO Interview                62.2               9.7               42.2   114.1
EO Interview               129.3              58.0               42.2   229.5
GO Academic                124.4               9.8               43.9   178.1
EO Academic                131.0              29.7               12.4   173.1
GO Fiction                 114.2              38.1               51.7   204.0
EO Fiction                 154.1              37.8               27.0   218.9
GO Tourism                  24.6              13.7               16.4    54.7
EO Tourism                  52.9               5.6                0      58.5

normalized frequencies of typical ellipsis subtypes per 100,000 words in 4 German & English registers of GECCo
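As a minimal sketch of how such figures are obtained and checked, the Python snippet below normalizes a raw hit count to a rate per 100,000 words and verifies that each ∑ value in the table equals the sum of its three subtype columns. The function name `per_100k` and the raw count in the usage line are hypothetical; the table values are copied from the slide.

```python
def per_100k(raw_hits: int, corpus_tokens: int) -> float:
    """Normalized frequency: hits per 100,000 words."""
    return raw_hits * 100_000 / corpus_tokens

# (nominal, verbal, clausal, total) per 100,000 words, as given in the table
rows = {
    "GO Interview": (62.2, 9.7, 42.2, 114.1),
    "EO Interview": (129.3, 58.0, 42.2, 229.5),
    "GO Academic": (124.4, 9.8, 43.9, 178.1),
    "EO Academic": (131.0, 29.7, 12.4, 173.1),
    "GO Fiction": (114.2, 38.1, 51.7, 204.0),
    "EO Fiction": (154.1, 37.8, 27.0, 218.9),
    "GO Tourism": (24.6, 13.7, 16.4, 54.7),
    "EO Tourism": (52.9, 5.6, 0.0, 58.5),
}

# Each total should equal the sum of the three subtypes (within rounding)
for register, (nominal, verbal, clausal, total) in rows.items():
    assert abs(nominal + verbal + clausal - total) < 0.05, register

# Register with the highest overall ellipsis frequency
densest = max(rows, key=lambda r: rows[r][3])
print(densest)  # EO Interview

# Hypothetical raw count: 25 hits in a 50,000-word subcorpus
print(per_100k(25, 50_000))  # 50.0
```

Normalizing by subcorpus size is what makes registers of different lengths comparable.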

Page 36: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Spoken registers EO/GO GECCo:

Redundant elements were inserted instead of elided, and words were repeated, even in an ungrammatical way, to remind the hearer of items that were mentioned earlier in the text.
- Da machen wir etwas was es absolut verrückt ist.
- Ich war bis 1975 war ich in Stuttgart (GO Interview)
- For me it’s important is identifying where you come from. (EO Interview)

Page 37: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Translation as a cause of linguistic change with regard to cohesive devices

Page 38: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

cohesive devices, especially ellipsis and substitution, are particular elements where translations involve specific shifts and some kind of ‘fingerprints’ (Gellerstam 2005) or ‘shining through’ (Teich 2003) from the source language into the target language

Page 39: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

‘shining through’ (Teich 2003) from source language into target language:

empirically identifiable traces of source language interference in terms of proportional frequencies of constructions that have the potential to spread from translated to non-translated target language texts

Page 40: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

- translation-induced language change is subtle and often overlooked, but in recent years, some interesting studies have demonstrated the significance of translation as a site of language contact (e.g. House 2006)

- lexical and orthographic level is probably affected most frequently as words are sometimes borrowed through translation

Page 41: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

- source language interference with regard to syntactic or discourse-structural patterns, such as the use of cohesive devices, is more complex and less easily perceptible without a quantitative analysis of proportional frequencies in larger text corpora

- using translation and parallel text corpora, House (2011) for instance has demonstrated that textual norms in German are adapted to anglophone ones

Page 42: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

analysis of the GECCo corpus indicates that, compared to English originals, English translations of German texts include a higher frequency of nominal ellipsis after adjectives where we would normally expect, for example, ‘one/s’, ‘of them’, or a general or specific noun:

(1) ein Denken …, das strenger ist als das begriffliche [ ]
translation: a thinking more rigorous than the conceptual [ ]

(2) Der größte und schönste [ ] ist der Naschmarkt.
translation: The largest and most impressive [ ] is Naschmarkt.

Page 43: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

On the other hand, translations into English seem to have a higher frequency of 'one' as a substitute where it is not obligatory (e.g. after 'next', 'second', 'another', 'which').

Page 44: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

translators often insert ‘tun’ in the case of English lexical verb ellipsis or use it as a direct translation of ‘do’

If we do not, no one else will [ ].
translation in corpus: Wenn wir es nicht tun, wird niemand es tun.

just as Ukraine and South Africa had done and as Libya is doing today
translation: so wie es die Ukraine und Südafrika getan haben, und wie Libyen es heute tut

Page 45: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

corpus extraction results show that the number of hits for the lemma ‘tun’ is much higher in German translations (41 per 100,000 words) than in German originals (29 per 100,000 words)

Page 46: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

translations contribute to semantic bleaching of this verb (writers of German original texts usually tend to avoid the verb ‘tun’ as a substitute for a main verb for stylistic reasons)

Page 47: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

depending on various factors such as standardization of the language and genre and amount and prestige of translated texts, language specific structures and innovations may spread from translated to non-translated target language texts

Page 48: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

References:

Evert, S. 2005. The CQP Query Language Tutorial. IMS, Universität Stuttgart.

Gellerstam, M. 2005. Fingerprints in Translation. In: In and out of English: For Better, for Worse, ed. by G. Anderman and M. Rogers. Clevedon: Multilingual Matters, pp. 201-13.

Halliday, Michael A. K. and Ruqaiya Hasan. 1976. Cohesion in English. London: Longman.

House, J. 2006. Covert Translation, Language Contact, Variation and Change. In: SYNAPS 19. 25-47.

House, J. 2011. Using translation and parallel text corpora to investigate the influence of Global English on text norms in other languages. In: A. Kruger et al. (eds.). Corpus-based Translation Studies. London: Continuum.

Page 49: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Kunz, K. & Lapshinova-Koltunski, E. 2011. Tools to Analyse German-English Contrasts in Cohesion. In: Proceedings of GSCL-2011, Hamburg, Germany.

Neumann, S. & S. Hansen-Schirra. 2005. The CroCo Project. Cross-linguistic corpora for the investigation of explicitation in translations. In: Proceedings from the Corpus Linguistics Conference Series (PCLC). Vol. 1 no. 1.

Steiner, E. 2008. Empirical studies of translations as a mode of language contact - “explicitness” of lexicogrammatical encoding as a relevant dimension. In: Siemund, P. & N. Kintana (eds.). Language contact and contact languages. Amsterdam: John Benjamins (Hamburg Studies in Multilingualism Vol. 7). pp. 317-346.

Teich, Elke. 2003. Cross-Linguistic Variation in System and Text: A Methodology for the Investigation of Translations and Comparable Texts. Berlin: Mouton de Gruyter.

Page 50: A CORPUS LINGUISTIC STUDY OF ELLIPSIS AS A COHESIVE DEVICE

Thank you for your attention!

Do you have any questions? Comments?

Katrin [email protected]
