Stylistics and stylometry


Dec 20, 2015



What is “style”?

• Term not much loved by linguists
  – Too vague
  – Has connotations in neighbouring fields (“style” = good style, ie a value judgment)

• Many books/articles make reference to etymology of the word (Lat. stilus = ‘pen’), so it follows that style is mainly about written language

• Various definitions, some very close to things already seen (especially “register”)

• Two main aspects widely supposed:
  – style is choice
  – style is described by reference to something else


Style as choice

• For any intended meaning there is a range of alternative ways of expressing that meaning

• Different choices express nuances
  – of meaning
  – of other things (style?) eg buy vs purchase

• Example:
  – Visitors are respectfully informed that the coin required for the meter is 50p; no other coin is acceptable
  – 50p pieces only
  – Propositional meaning is the same; difference in expression conveys something else (register etc)


Style as choice

• Style is a choice, but often the “choice” is somewhat predetermined

• ie a choice between appropriate and inappropriate style

• So maybe “style” is just another word for register?


Style and the norm

• Some writers define style as
  – “individual characteristics of a text”
  – “total sum of deviations from a norm”

• But what is the “norm”?
  – Is there some form of the language that is neutral as regards style/register?
  – Note also that the norm shifts: eg Bible AV was written in the vernacular of its time

• Literary stylistics focuses on the exceptional


• Even if there is no norm, we can describe style comparatively
  – Stylistics mainly involves comparing and contrasting texts
  – and associating linguistic variance with contextual explanation

• Some authors see style as being what is added to the text


Stylistic analysis

• Gulf between literary and linguistic stylistics
  – Lit crit focuses on effect on the reader, intended or otherwise, so largely intuitive and subjective
  – Linguistic stylistics looks for characterisations of style (including literary style) in terms of linguistic phenomena at the various levels of linguistic description


Stylistic analysis

• Inventory of linguistic devices and their effect
  – usually in a contrastive way:
  – in contrast with other writers in a similar genre
  – in contrast with other genres

• Linguistic devices described in terms of the usual linguistic levels of description: phonology, morphology, lexis, grammar, etc.

• Effects can be directly expressive, or indirect, by association
  – example: onomatopoeia vs alliteration as a phonological device


Stylistic analysis

Crystal & Davy (1969) Investigating English Style

• Informally identify stylistic features felt to be significant

• Devise a method of analysis which facilitates comparison between usages

• Identify the stylistic function of the features so identified


Types of features

• “Invariable” features due to the individual or the time – usually of little interest

• Discourse features
  – medium (= Halliday’s mode), what features distinguish written language from spoken language
  – participation: eg monologue vs dialogue

• Province (= field) lexis and syntax

• Status (= tenor) features relating to relative social standing of writer/speaker and reader/listener

• Modality (= text type) eg message delivered as a letter, postcard, text message, email, etc

• Singularity: deliberate occasional idiosyncrasies


Method and function

• Methods and features determine each other
  – you can only measure features that you can extract
  – simple counting features are easy to extract
  – more complex features can be extracted thanks to NLP techniques of corpus annotation (tagging, parsing, etc)

• Describing the function of observed differences
  – could be based on intuition
  – or (see later) partially automated (factor analysis)


What to count

• Simple things may characterise different styles
  – average sentence length
  – average word length
  – type:token ratio (vocabulary richness)
    • number of types = number of different words
    • number of tokens = total number of words
  – vocabulary growth (homogeneity of text)
    • number of new types in 1st, 2nd, …, nth 1000 words
    • in rich varied text, number will climb steadily

• Especially when used comparatively
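
The simple counts on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`tokenize`, `split_sentences`, `simple_profile`, `vocabulary_growth`) are invented here, the tokenizer and sentence splitter are deliberately naive, and the input text is assumed to be non-empty.

```python
import re

def tokenize(text):
    """Lower-cased word tokens; internal apostrophes and hyphens are kept."""
    return re.findall(r"[a-z]+(?:['’-][a-z]+)*", text.lower())

def split_sentences(text):
    """Naive split after ., ! or ? followed by whitespace (an assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def simple_profile(text):
    """Average sentence length, average word length and type:token ratio."""
    tokens = tokenize(text)
    sentences = split_sentences(text)
    return {
        "avg_sentence_length": len(tokens) / len(sentences),   # words per sentence
        "avg_word_length": sum(len(t) for t in tokens) / len(tokens),
        "type_token_ratio": len(set(tokens)) / len(tokens),    # types / tokens
    }

def vocabulary_growth(tokens, chunk_size=1000):
    """Number of previously unseen types in each successive chunk of tokens."""
    seen = set()
    new_per_chunk = []
    for start in range(0, len(tokens), chunk_size):
        chunk = set(tokens[start:start + chunk_size])
        new_per_chunk.append(len(chunk - seen))
        seen |= chunk
    return new_per_chunk

profile = simple_profile("The cat sat. The cat ran away!")
# 7 tokens, 2 sentences, 5 types: average sentence length 3.5, type:token ratio 5/7
```

Note that the type:token ratio depends on text length, so it is only meaningful when the texts compared are of similar size or are sampled to the same number of tokens.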


What to count

• More complex analyses can give a more interesting picture
  – specific syntactic structures
  – degree of modification in NPs
  – types of verbs (eg verbs of persuasion, speech verbs, action verbs, descriptive verbs)
  – distribution of pronouns (1st/2nd/3rd person)
  – etc … (anything you can think of)

• Quite sophisticated mathematical techniques can give an overall picture
  – eg factor analysis: identifies from a (big) range of variables which ones best identify/characterize differences
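
As one illustration of the features listed above, the distribution of pronouns by person can be approximated with word lists. This is a rough sketch: the `PRONOUNS` table and the function name `pronoun_distribution` are assumptions made for this example, and ambiguous forms (eg "her" as determiner vs pronoun) are not distinguished.

```python
import re
from collections import Counter

# Hypothetical word lists for English personal pronouns, grouped by person.
PRONOUNS = {
    "1st": {"i", "me", "my", "mine", "myself",
            "we", "us", "our", "ours", "ourselves"},
    "2nd": {"you", "your", "yours", "yourself", "yourselves"},
    "3rd": {"he", "him", "his", "himself", "she", "her", "hers", "herself",
            "it", "its", "itself", "they", "them", "their", "theirs",
            "themselves"},
}

def pronoun_distribution(text):
    """Share of 1st/2nd/3rd person forms among all pronouns found in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for tok in tokens:
        for person, forms in PRONOUNS.items():
            if tok in forms:
                counts[person] += 1
    total = sum(counts.values())
    return {person: (counts[person] / total if total else 0.0)
            for person in PRONOUNS}

dist = pronoun_distribution("I told you that she saw us.")
# two 1st person forms (I, us), one 2nd (you), one 3rd (she)
```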


Normalization and significance

• Always important to compare like with like
  – It is usual when counting things to “normalize” over the length of the text
  – If one text is longer than the other, of course you would expect higher frequencies of everything

• Issue of statistical significance
  – Small differences may not really tell you anything
  – Various measures can confirm whether difference is statistically significant or due to random fluctuation
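
Both points can be made concrete with a small sketch. The slide does not name a particular significance measure; the log-likelihood statistic (G2) used below is one common choice in corpus work and is an assumption of this example, as are the function names `per_thousand` and `log_likelihood`.

```python
import math

def per_thousand(count, text_length):
    """Normalized frequency: occurrences per 1000 words."""
    return 1000 * count / text_length

def log_likelihood(count_a, size_a, count_b, size_b):
    """Log-likelihood (G2) for one feature counted in two texts.

    Expected counts assume the feature is equally frequent in both texts.
    Values above 3.84 correspond to p < 0.05 (chi-square, 1 degree of freedom).
    """
    total = count_a + count_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Raw counts mislead when lengths differ: 60 hits in 12,000 words is a lower
# rate (5 per 1000) than 30 hits in 3,000 words (10 per 1000).
rate_long = per_thousand(60, 12000)
rate_short = per_thousand(30, 3000)
significant = log_likelihood(60, 12000, 30, 3000) > 3.84
```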


How to count

• How to recognize paragraph breaks?

• How to recognize sentence breaks?
  – Headlines don’t end in a full stop
  – Not all sentences end in a full stop
  – Not all full stops are sentence ending (abbreviations)

• How to count words
  – Hyphenated words, contractions e.g. don’t

• How to measure word-length/complexity
  – length only roughly corresponds to complexity
  – number of characters vs number of syllables
  – cf. through vs idea
  – counting syllables implies either a dictionary or an algorithm
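
The counting problems above can be illustrated with a minimal sketch. Everything here is an assumption made for the example: the abbreviation list is a tiny hypothetical sample, the sentence splitter works on whitespace-separated tokens, and the syllable counter is a crude vowel-group heuristic rather than a dictionary lookup.

```python
import re

# Hypothetical, far from complete list of abbreviations ending in a full stop.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "prof.", "etc.", "e.g.", "i.e.", "cf."}

def split_sentences(text):
    """Split after ., ! or ? unless the token is a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token.endswith((".", "!", "?")) and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text with no final punctuation, eg a headline
        sentences.append(" ".join(current))
    return sentences

def count_words(text, split_compounds=False):
    """Count words, treating don't / well-known as one word or as several."""
    if split_compounds:
        pattern = r"[A-Za-z]+"
    else:
        pattern = r"[A-Za-z]+(?:['’-][A-Za-z]+)*"
    return len(re.findall(pattern, text))

def count_syllables(word):
    """Rough estimate: one syllable per group of adjacent vowel letters."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

parts = split_sentences("Dr. Smith arrived late. He left early")
# "through": 7 characters, estimated 1 syllable
# "idea": 4 characters, estimated 2 syllables (it actually has 3)
```

The heuristic gets "through" right but undercounts "idea", which shows why character length and algorithmic syllable counts are both only rough guides to complexity. The splitter also fails when an abbreviation such as "etc." really does end a sentence.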


More sophisticated counting

• Tagging and parsing allow you to look at grammatical and lexical issues
  – Use of particular POSs (conjunctions, pronouns, auxiliaries, modals)
  – Use of particular features (tenses, …)
  – Use of particular constructions (passives, interrogatives)
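
Once a text has been tagged, counts like these reduce to simple bookkeeping. The sketch below assumes the tagger's output is already available as a list of (word, tag) pairs using Penn Treebank style tags; the function names and the passive heuristic (a form of "be" followed by a past participle, with at most one adverb between) are assumptions for illustration, not a full parser.

```python
from collections import Counter

BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}

def tag_rates(tagged, per=1000):
    """Frequency of each POS tag, normalized per `per` tokens."""
    counts = Counter(tag for _, tag in tagged)
    n = len(tagged)
    return {tag: per * c / n for tag, c in counts.items()}

def count_passives(tagged):
    """Heuristic passive count: be-form + optional adverb (RB) + VBN."""
    hits = 0
    for i, (word, _) in enumerate(tagged):
        if word.lower() not in BE_FORMS:
            continue
        j = i + 1
        if j < len(tagged) and tagged[j][1] == "RB":
            j += 1  # skip one intervening adverb, eg "was not accepted"
        if j < len(tagged) and tagged[j][1] == "VBN":
            hits += 1
    return hits

# Hand-tagged example sentence (tags assumed, punctuation counted as a token).
tagged = [("The", "DT"), ("coin", "NN"), ("was", "VBD"), ("not", "RB"),
          ("accepted", "VBN"), ("and", "CC"), ("you", "PRP"), ("must", "MD"),
          ("pay", "VB"), (".", ".")]
modal_rate = tag_rates(tagged)["MD"]   # 1 modal in 10 tokens
passives = count_passives(tagged)      # "was not accepted"
```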


Quantifying register differences

• Much work based on corpora trying to quantify and characterize register differences

• Work pioneered by Douglas Biber

• Simple counts like the ones suggested

• Also, more complex computations


Example

From D. Biber, S. Conrad & R. Reppen, Corpus Linguistics: Investigating Language Structure and Use, Cambridge University Press, 1998. Ch 5: the study of discourse characteristics

[Figure: Exophoric and anaphoric referring expressions. Vertical axis: expressions per 200 words (0 to 40). Horizontal axis: register (conversation, speech, news, academic). Series: anaphoric nouns, anaphoric pronouns, exophoric pronouns.]


Multidimensional analysis

• Collect a huge range of measures of a wide variety
  – some simple word counts
  – syntactic features
  – classes and subclasses of N, V, Adj, Adv

• Factor analysis


~150 features in all


Factor analysis

• Statistical method to take large number of apparently random variables and group them together into “factors”

• Factors will be groups of (+ve and –ve) features

• Linguist might then try to characterize the factors in terms of some psycholinguistic feature
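
A full factor analysis needs a statistics package, but the intuition behind it can be shown with plain correlations: features that rise and fall together across texts tend to end up in the same factor, and features that move in opposite directions appear with opposite signs. The sketch below is not a factor analysis; it only computes the correlations such an analysis starts from. The feature names and all the numbers are invented purely for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def correlation_matrix(features):
    """Correlation for every pair of features, keyed by (name_a, name_b)."""
    names = list(features)
    return {(a, b): pearson(features[a], features[b])
            for a in names for b in names}

# Made-up rates per 1000 words in four hypothetical texts (not real data).
features = {
    "first_person_pronouns": [40, 35, 10, 5],
    "contractions": [30, 28, 6, 2],
    "nominalizations": [5, 8, 30, 35],
}
matrix = correlation_matrix(features)
# pronouns and contractions correlate positively; both correlate
# negatively with nominalizations
```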


Example

• Biber took two Google classifications of text types: “Home” and “Science”

• Harvested ~1500 webpages in each category (3.74m words)
  – originally got ~2500 webpages, but some were not suitable

http://jan.ucc.nau.edu/biber/Web text types.ppt


Summary of analysis
