Corpus approaches to discourse analysis 2000891 Text and corpus analysis in English Studies

Jan 17, 2016


Dustin Campbell
Page 1: Corpus approaches to discourse analysis 2000891 Text and corpus analysis in English Studies.

Corpus approaches to discourse analysis 2000891

Text and corpus analysis in English Studies

Page 2

Course format

• Lectures and seminars, group work (40 hours)

• Reading days, tutorials and individual work on MOOC (50 hours)

• Exam: individual corpus projects, either using a corpus you have compiled or assisting in the compilation of a corpus (60 hours)

• Corpus analysis using software

Page 3

Timetable

• Course starts 05 October 2015

• Monday 16:00 – 18:00 Room 447

• Tuesday 14:00 – 16:00 Room 349a

• Wednesday 09:00 – 11:00 Room 447

• Course ends: 17 November 2015

Page 4

Aims

• The aims of this course are to give students the awareness, knowledge, experience and skills needed to analyse naturally occurring language texts through a corpus-assisted discourse studies approach.

Page 5

Aims cont’d

• Awareness of the processes of text production and reception, and of the effects of register, domain and text-type differences.

• Awareness of the use and manipulation of language in society.

• Investigation of social issues: how language in society can be investigated through the construction and analysis of electronic corpora.

Page 6

Aims cont’d

• Experience of a corpus-based approach to discourse studies is gained through reading assignments and seminars, and awareness through participation in group discussions in a blended learning format.

• Skills are developed through seminar work and through undertaking a corpus analysis project using software tools.

Page 7

Text and Corpus:

• Text: “the record of some speaker’s or speakers’ discourse, uttered or written in some context and for some purpose.”

• Corpus: ‘a collection of pieces of text in electronic form, selected according to external criteria to represent as far as possible a language or language variety as a source of data for linguistic research’ (John Sinclair)

Page 8

By text, I mean

• the record of some speaker’s or speakers’ discourse, uttered or written in some context and for some purpose.

• A corpus consists, then, of the records of authentic discourses, of actual uses of a language in their social contexts.

Page 9

Corpus topics in the course

• definitions of corpora, purposes and applications

• background to corpus linguistics

• corpus assisted discourse studies

• reading concordances

• applications of corpora in research

• textlinguistics and sociolinguistics

Page 10

Texts and language system

• The system of language is instantiated in the form of text

• Like the relationship between weather and climate – the same phenomenon seen from different standpoints: climate is weather seen from a greater depth of time.

• Weather can be said to resemble texts, while climate is the equivalent of the system

Page 11

Texts in linguistic investigation

• The use of texts, as records of discourse, is already absolutely central to many types of linguistic investigation, from discourse analysis to conversation analysis, sociolinguistics, ethnomethodology, forensic linguistics, lexicography, etc.

Page 12

Language in use

• To communicate means to use language with a purpose. Language is not just an abstract entity that we can study detached from its users and the contexts in which it is used.

Page 13

Language and function

• This course takes a functional view of language whereby language is seen as having a social, interactive function (establishing human relationships and negotiating communicative goals) as well as an informational/communicative function.

Page 14

Hymes’s notion of communicative competence

• communicative competence includes both linguistic competence (implicit and explicit knowledge of the rules of grammar), and contextual or sociolinguistic knowledge of the rules of language use in context.

Page 15

Four aspects of communicative competence

• Hymes viewed communicative competence as having the following four aspects:

• what is formally possible,

• what is feasible,

• what is the social meaning or value of a given utterance,

• and what actually occurs.

Page 16

Communicative competence

• Grammatical

• Lexical

• Textual

• Organisational

• Sociolinguistic

• Pragmatic

• Strategic

Page 17

Communicative competence

• Throughout their lives, speakers engage in a multitude of different discourses, both as performers and addressees. They do so, typically, one discourse at a time. Over the course of a lifetime, an individual human may participate in thousands of casual conversations, write and read hundreds of postcards, listen to numerous speeches of various kinds, read scores of recipes, instructional leaflets, tax forms etc.

Page 18

Generic competence

• As our communicative competence develops, we not only develop varying degrees of what we might term ‘generic competence’, we also come to have expectations associated with the use of language in a vast range of contexts, expectations which allow us to make judgements about the social meaning or value of a given utterance, and to know what is likely or probable based on our experience of what actually occurs by being exposed to texts.

Page 19

Generic competence

• Particular text types have typical patterns of form and convey particular sets of meanings.

• Part of a native speaker’s competence lies in having been exposed to many examples of language in context and being thus primed for the meanings and effects of particular patterns.

Page 20

The usual and the unusual

• The ability to recognise a particular register, tone or text type is vital for literary appreciation and for translation purposes.

• Some patterns may be deviant or particularly creative when compared to other text types and an awareness of naturalness and unusuality is important in linguistic research.

Page 21

Sources of information about language

• There are two main sources of information about words: introspection and observation.

• introspection means ‘looking inside’ your own brain and trying to remember everything you know about a word

• observation means examining real examples of language in use (in newspapers, novels, blogs, tweets, and so on), so that we can observe how people use words when they are communicating with one another

Page 22

Corpora in English studies

• The premise behind corpus analysis in English Studies is that language patterns can be retrieved (with the aid of software programmes) and can provide insights into language and literature which intuitive approaches to the same objects of study may fail to reveal.

Page 23

Corpus analysis

• Corpus analytic techniques can help provide multi-purpose strategies (in the investigation of literature, linguistics and language teaching) which help corroborate or disconfirm our intuitions about patterns and meanings in texts.

• Corpus analysis and corpus literacy are key skills and provide an essential component in English studies

Page 24

Corpus-based studies look at patterns

• words and word groups

• grammatical units

• meanings

• attitudes

• frequencies

• co-occurrence

in context

Page 25

CL can…

• Provide techniques for building topic-specific corpora (e.g. Gabrielatos, 2007)

• Reveal salient contextual elements (“trigger events” – Gabrielatos et al., 2012)

• Reveal differences as well as similarities (e.g. Taylor, 2013) intertextuality / interdiscursivity

• Pinpoint absence (e.g. Partington, 2014)

Page 26

Text: form of data used for linguistic analysis

• When we study texts we see patterns that some texts share and describe these in terms of text type

• texts vary very systematically according to contextual values.

• Recording technology and computers have made it possible to capture spontaneous speech and to store and access data in increasing quantities

Page 27

The methodology of modern linguistics

• Corpus data is authentic

• It can include spoken language

• A corpus makes it possible to study language in quantitative terms

Page 28

The methodology of modern linguistics

• it examines the relationship between instance and system, between the typical and the exceptional, between signal and noise

• Partington (1998), Patterns and Meanings

• Qualitative and quantitative research

Page 29

Language and society

• a close attention to language data which still has significance for the wider world of social, cultural and political studies

• looking at discourse and rhetorical strategies and seeing how they can be analysed with the aid of corpora and semi-automatic computational tools

Page 30

Example: KWIC

• language often looks very different when you see a lot of it together (Sinclair)

• The concordancer – a collector and collator of examples

Page 31

Concordance lines

• A concordance line is a line of text taken from a corpus, i.e. a collection of language texts which are organised and stored on a computer. The concordance line may come from the beginning, the middle or the end of one of the texts. It may be made up of one sentence, part of a sentence or part of two sentences. Each concordance line in a set includes the target word, i.e. the word being studied. The target word is always in the middle of the concordance line. This means that when we study a word in a set of concordance lines we can see its context, in other words, the words which are used before it and after it.
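
The description above can be illustrated with a minimal sketch of a keyword-in-context (KWIC) display. This is not the software used on the course; the function name `kwic`, the `width` parameter and the sample text are assumptions made purely for illustration.

```python
import re

def kwic(text, node, width=30):
    """Return concordance lines with the target (node) word in the middle."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so every node word starts in the same column.
        lines.append(f"{left:>{width}} {match.group()} {right:<{width}}")
    return lines

# Invented sample text, used only to show the layout.
sample = (
    "The scarf was red. She wore a scarf around her neck. "
    "A silk scarf is typical of the region."
)
for line in kwic(sample, "scarf", width=20):
    print(line)
```

Because each line is cut by character count, a line may start or end in the middle of a sentence, which is what the description of concordance lines leads us to expect.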

Page 32

• A node word with a set of linguistic environments: a span of words to its left and right

• From the environments in which the word finds itself, we can observe common features in the context

• A concordance makes it possible to observe repeated events

• The co-occurrences are observable on the syntagmatic horizontal axis

• Repeated paradigmatic choices are observable on the vertical axis

• Repetitions are made visible by the layout

Page 33

• You can know a word by the company it keeps (Firth)

• We learn meanings through the accumulated effects of our encounters in contexts, our experience of language through texts, spoken and written

Page 34

Meanings through encounters

• we learn the meaning of a word through our encounters with it

• its grammatical category,

• its collocations,

• its colligations,

• its syntactic preferences,

• its textual preferences,

• its pragmatic associations
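
As a rough illustration of how collocations can be observed, the sketch below counts the words that occur within a few tokens of a node word. The function name `collocates`, the `span` parameter, the simple tokenisation and the invented sample sentences are all assumptions; real concordancing software is more sophisticated and usually adds statistical measures.

```python
import re
from collections import Counter

def collocates(text, node, span=4):
    """Count the words found within `span` tokens either side of the node word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

# Invented sample sentences, for illustration only.
sample = (
    "The meeting was fraught with difficulty. "
    "It was a journey fraught with danger. "
    "Their relationship was fraught with tension."
)
top = collocates(sample, "fraught", span=2)
print(top.most_common(3))
```

In this toy sample "with" follows every instance of "fraught", the kind of repeated pattern that would be checked against a much larger corpus.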

Page 35

Try these

• What do you know about these words?

• neck

• scarf

• trouser

• suggestive

• typical

• brimming

• fraught

Page 36

Corpus data

• corpora provide detailed linguistic evidence to enhance the study of discourse features of a particular genre of language.

• You can make an investigation of the communicative strategies used by speakers and writers to achieve their aims

Page 37

Meaning is spread out in all parts of a text.

• It is conveyed in the choices an author makes:

• at lexical level, the words and phrases selected;

• at grammatical level, whether to express a process as a verb or a noun, a description as an adjective or an adverb;

• what roles to give participants, grammatical subject or thematic subject;

• when an utterance contains two ideas, whether to present them as coordinate or to subordinate one to the other;

• which order to present them in (i.e. which to thematise);

• how much and what kind of modality to employ.

Page 38

Non-obvious meanings

• Even authors are sometimes not fully aware of the meanings their texts convey; much of what carries meaning in texts is not open to direct observation.

• We can discover how meaningful choices are by comparing them with those which are normal or usual within the genre; that is, we compare instance with system.

Page 39

Corpus work is comparative

• If texts are not studied in comparison to other bodies or corpora of text it is not possible to know what is normal and what is marked.

• We are not justified in interpreting the significance of a single linguistic event unless we can compare it with other similar events

Page 40

Comparisons

• In choosing our corpora we need to control the variables in a reasoned way.

• To compare like with like

• To pinpoint differences and similarities

• Salience and significance

• Change over time

Page 41

Types of corpora

• Corpora can be either heterogeneric or monogeneric, that is, they may contain texts of many different types, generally as many different types as the compilers can practically and legally obtain, or they may contain texts of a single type.

• heterogeneric corpora are thus intended to be in some way representative of the language in question as a whole.

Page 42

Hetero- and mono-generic

• Monogeneric corpora are compiled as a means of studying one particular text-type, for example, the language of law, of economics, of Parliamentary debates, and so on.

Page 43

Heterogeneric

• Heterogeneric corpora tend to be very large, nowadays typically from 100 million to a billion words in size.

• Their compilation is complex and expensive and tends to be carried out by special organizations attached to Universities or large institutions, such as publishing houses.

Page 44

Monogeneric

• Monogeneric corpora, on the other hand, can be relatively easy to compile and are often created by individual researchers with a special interest in a particular text-type. The favourite source for accessing texts today is the Internet.

Page 45

Heterogeneric

• Heterogeneric corpora, by enabling researchers to take into account vast quantities of language data and therefore obtain an overview of the authentic behaviour of language users not otherwise readily available to the ‘naked ear’, have helped provide a mass of new information about the grammar and lexis of languages, and have led to the compilation of a new generation of dictionaries, of grammatical descriptions, as well as language-teaching materials.

Page 46

Corpus = archive

• A corpus by itself is simply an inert archive. However, it can be ‘interrogated’ using dedicated software. The most important interrogation tools include, first of all, the concordancer, then calculators of frequency, keywords, clusters and dispersion.
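
To give a rough idea of what some of these calculators do, here is a minimal sketch of a frequency list, a cluster (n-gram) count and a very simple dispersion measure. The function names, the tokenisation and the way dispersion is computed (occurrences per equal-sized slice of the corpus) are assumptions for illustration and do not reproduce any particular software package.

```python
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_list(tokens):
    """Count how often each word form occurs."""
    return Counter(tokens)

def clusters(tokens, n=2):
    """Count n-word clusters, i.e. contiguous sequences of n tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))

def dispersion(tokens, word, parts=4):
    """Count occurrences of `word` in each of `parts` equal slices of the corpus."""
    size = max(1, -(-len(tokens) // parts))  # ceiling division
    return [tokens[i:i + size].count(word) for i in range(0, size * parts, size)]

# Invented sample text, for illustration only.
tokens = tokenize("The cat sat on the mat and the dog sat on the rug")
print(frequency_list(tokens).most_common(3))
print(clusters(tokens, n=2).most_common(2))
print(dispersion(tokens, "the", parts=4))
```

A word that is frequent but concentrated in one slice of the corpus behaves differently from one spread evenly throughout, which is why dispersion is reported alongside raw frequency.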

Page 47

Monogeneric corpora and the study of discourse

• Research of this type generally entails the comparison of two or more corpora of a particular text-type and very often also the comparison of the contents of a monogeneric corpus with that of a heterogeneric one. In fact, discourse study is necessarily comparative in two separate but related ways.

Page 48

Meaningful choices

• Firstly, within an individual discourse type, only by comparing the choices being made by speakers or writers at any point in a discourse with those which are normal, that is, usual within the genre, can we discover how meaningful those choices are.

Page 49

Discourse type and specific features

• Secondly, if we are also interested in the characteristics and content of the discourse type itself, it is vital to be able to compare its particular features and patterns with those of other discourse types.

• In this way we discover how it is special, and can go on to consider why.

Page 50

Comparison is key

• All genre or discourse-type analysis is thus properly comparative.

• In the wider field of discourse studies, this requirement has unfortunately not always been observed in practice.

• Corpora provide the means and methodology to enable rigorous and principled comparative study to be performed.
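
One common way of making such a comparison concrete is to compare the frequency of a word in two corpora of different sizes. The sketch below normalises frequencies per million words and computes a log-likelihood score of the kind widely used for keyword analysis. The slides do not specify a statistic, so the choice of log-likelihood, the function names and the figures are assumptions for illustration.

```python
import math

def per_million(freq, size):
    """Normalise a raw frequency to occurrences per million words."""
    return freq * 1_000_000 / size

def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood score for one word's frequency in corpus A versus corpus B."""
    total = freq_a + freq_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero frequency contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll

# Hypothetical figures: a word occurring 50 times in a 10,000-word
# monogeneric corpus and 10 times in a 10,000-word comparison corpus.
print(per_million(50, 10_000), per_million(10, 10_000))
print(round(log_likelihood(50, 10_000, 10, 10_000), 2))
```

A score above about 3.84 is conventionally read as a difference significant at p < 0.05, although a high score only shows that a difference exists; explaining it still requires reading the concordance lines.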

Page 51

CADS

• Throughout the course we will examine different aspects of Corpus Assisted Discourse Studies

• See http://en.wikipedia.org/wiki/Corpus-assisted_discourse_studies

• And http://www3.lingue.unibo.it/blog/clb/

• Follow the Corpus Linguistics group on Facebook

Page 52

For non-attenders

• You should follow the MOOC on Corpus Linguistics:

• Corpus Linguistics, Method, Analysis, Interpretation

• https://www.futurelearn.com/courses/corpus-linguistics.

• You can enrol from September 28.