Corpus Linguistics: Counting words, texts or features Mike Scott, University of Liverpool Corpus Linguistics Summer Institute June-July 2008

Mar 31, 2015

Page 1

Corpus Linguistics: Counting words, texts or features

Mike Scott, University of Liverpool

Corpus Linguistics Summer Institute June-July 2008

Page 2

Aims

to identify what is in principle countable using CL techniques

to consider what it is in principle desirable to count and why

Page 3
Page 4
Page 5

No, not that kind of sentence

Page 6

What have we got, anyway?

electronic texts

is anything missing?

Page 7

What is a text, anyway?

Page 8

What we’re looking at

Words in Texts: sentences, paragraphs, sections, key words, etc.

Words in the Brain: memory (e.g. tip-of-the-tongue), word associations, enjoyment, priming

Words in the Language: lexicography, terminology, phraseology, etc.; patterns of “standard English”

Words in Culture: cultural key words; indicators of class and stance, bias, etc.

Page 9

What is countable?

characters

word-forms

parts of speech

sentences

headings? paragraphs? lines? pages?

other divisions (section, chapter) if marked up

utterances

turns

grammatical sequences
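As a rough illustration of how the simpler units in this list might be counted, here is a minimal Python sketch. It is not how WordSmith or any other tool does it: the function name `count_units`, the regular expressions, and the sample text are all assumptions made for the example, and real texts need more careful handling of abbreviations, numbers, and mark-up.

```python
import re
from collections import Counter

def count_units(text: str) -> dict:
    """Count several surface units in a plain-text string (rough heuristics)."""
    # Word-forms: runs of letters, optionally joined by apostrophes or hyphens.
    tokens = re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text.lower())
    # Sentences: split after ., ! or ? when followed by whitespace (crude).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Paragraphs: blocks separated by at least one blank line.
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    return {
        "characters": len(text),
        "tokens": len(tokens),        # running word-forms
        "types": len(set(tokens)),    # distinct word-forms
        "sentences": len(sentences),
        "paragraphs": len(paragraphs),
        "lines": len(text.splitlines()),
        "frequencies": Counter(tokens),
    }

# Hypothetical sample text, for illustration only.
sample = "The cat sat. The cat slept!\n\nWhat is a text, anyway?"
counts = count_units(sample)
print(counts["tokens"], counts["types"], counts["sentences"])
```

Even here the question marks on the slide apply: what counts as a sentence, a paragraph or a line depends on decisions built into the patterns, and units such as parts of speech, turns or utterances can only be counted if the text has been annotated for them.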

Page 10

What isn’t countable?

metaphors

semantic prosody

patterns

because these are abstractions

Page 11

though we have to try …

by seeking various markers, frames signalling these abstractions

recognising, however, that 1 form ≠ 1 function

Corpus Linguistics is all about pattern-seeking!
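One way to picture "seeking markers" is a small key-word-in-context (KWIC) search. The sketch below is an assumption-laden illustration, not a method taken from the slides: the helper `kwic`, the frame `like a/an <word>` as a possible simile signal, and the sample sentences are all hypothetical.

```python
import re

def kwic(text: str, pattern: str, width: int = 30) -> list[tuple[str, str, str]]:
    """Return (left context, match, right context) for each regex hit."""
    hits = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits

# Hypothetical frame that sometimes signals a simile: "like a/an <word>".
SIMILE_FRAME = r"\blike an? \w+"

# Invented sample text, for illustration only.
sample = "He ran like a hare. I would like a coffee. Her voice was like an alarm."
hits = kwic(sample, SIMILE_FRAME)
for left, node, right in hits:
    print(f"{left:>30} [{node}] {right}")
```

The frame matches three times, but "like a coffee" is not a simile. The count of forms is therefore only a list of candidates, and each concordance line still has to be read and classified, which is the point of 1 form ≠ 1 function.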

Page 12

Why counting, anyway?

search for interpretations

understanding

re-defining categories

via patterns

WordSmith

Page 13

What should we count?

the question of focus

the question of scope

pointfulness: the search for patterns

the POS-trap

metadata are used to forget the data (François Rastier)

Page 14

Reference

Scott, M. & C. Tribble, 2006. Textual Patterns: Key words and corpus analysis in language education. Amsterdam: Benjamins. Chapters 1 & 2.