Differential Possessor Expression in English: Re-evaluating Animacy and Topicality Effects. Catherine O'Connor (Boston University), Arto Anttila (NYU), Vivienne Fong (NYU), Joan Maling (Brandeis University). Annual Meeting of the Linguistic Society of America, January 9 - 11, 2004, Boston, Massachusetts.
Transcript
Page 1: The question:

Differential Possessor Expression in English: Re-evaluating Animacy and Topicality Effects

Catherine O'Connor Boston University Arto Anttila NYU

Vivienne Fong NYU Joan Maling Brandeis University

Annual Meeting of the Linguistic Society of America, January 9 - 11, 2004

Boston, Massachusetts

Page 2: The question:

The question: What are the factors that drive the English alternation between the "Saxon genitive" and the "Of genitive"?

The man's widow (X'S: possessor in Spec)

The widow of the man (OF-X: possessor in Comp)

Page 3: The question:

Hypothesis 1: ANIMACY

The X'S construction tends to attract animate possessors/modifiers, and the OF-X construction tends to attract inanimate possessors/modifiers. (Jespersen, Rosenbach, Stefanowitsch, Anschutz, R. Hawkins...)

Page 4: The question:

This is a statistical tendency, at best:

Walking's many virtues (X'S)

The many virtues of walking (OF-X)

Hemispheres Magazine, 2001

Page 5: The question:

Hypothesis 2: DISCOURSE STATUS

The X'S construction attracts old, topical, or highly accessible modifiers. The OF-X construction attracts newer or less accessible modifiers. (Deane, Anschutz)

Page 6: The question:

This is also a statistical tendency:

its rejection (a bill) (X'S)

...recommend passage of it (OF-X)

A neighbor's car (X'S)

The car of a neighbor (OF-X)

Page 7: The question:

Hypothesis 3: WEIGHT

The X'S prenominal construction attracts lighter modifiers, and the OF-X construction attracts heavier modifiers. (Stefanowitsch; also cf. Arnold et al., J. Hawkins, Wasow)

Page 8: The question:

An analytical problem: These three hypotheses are seriously confounded:

Humans are often topical

Topics are often expressed as pronouns

Pronouns are light

His advocacy of betting on the ponies

Page 9: The question:

Another analytical problem: Which examples can legitimately be expected to alternate between Of-X and X's?

The glass of water (Of-X)

Water's glass (X's)

There are many, many such distractors

Page 10: The question:

Our plan of inquiry:

1. Secure a large number of OF-X and X'S tokens in the Brown Corpus.

2. Exclude tokens of non-reversible types.

3. Code remaining tokens for weight, animacy and discourse status.

4. Control for confounds where possible, and try to model the statistical findings within an OT grammar, following work by Aissen and others.

Page 11: The question:

1. Cleaning the sample

Page 12: The question:

First: exclude non-nominals.

Using F. Karlsson's part-of-speech tagged version of the Brown corpus (1995), we excluded all irrelevant Of-NP and NP's tokens.

A few examples:

Verbal OF-X: He thought of her.

Adjectival OF-X: bald and afraid of women.

Contraction X'S: Kate's all right.

Page 13: The question:

"All NP" sample, after removal of non-nominal examples

X'S: 4744 (47%)

OF-X: 5263 (53%)

N = 10,006

Page 14: The question:

Second: exclude all tokens of non-reversible constructions

A few examples:

Partitives: half of his stirrup guard

Measure and container phrases: a drop of liquor; two saucers of water

Classifier phrases: a grove of trees; a flight of wooden steps

Configuration and constitutive phrases: strips of skin; a...castle of pine boughs

Page 15: The question:

Second: exclude all tokens of non-reversible constructions

and many more:

'Sort' phrases: the crassest kind of materialism

Headless OF-X: that of a frustrated gnome

a man of brooding suspicions

[the] concept of the white-suited big-daddy colonel

the notion of philosophy as Queen Bee. . .

Nominal compounds: dog-eared men's magazines

Page 16: The question:

Partially clean sample after removal of 'strict' non-reversibles

N = 7,443

X'S: 4604 (62%)

OF-X: 2839 (38%)

Page 17: The question:

Third: exclude tokens where reversal substantially alters meaning: 'soft' non-reversibles

(a) Idioms, fixed phrases, and titles

bachelor of science / *science's bachelor

Satan's L'il Lamb / #the L'il Lamb of Satan

(b) Deverbal nominals with argument constraints

fear of him / his fear

(see handout for more examples)

Page 18: The question:

Cleaner sample after removal of 'soft' non-reversibles

N = 6570

X'S: 4585 (70%)

OF-X: 1985 (30%)
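The percentages on the two sample slides can be rechecked from the raw counts. The sketch below does only that arithmetic; the stage labels and function names are informal choices made for this illustration, not terms from the study.

```python
# Counts copied from the "partially clean" and "cleaner" sample slides.
# The dictionary keys are labels chosen for this sketch only.
STAGES = {
    "after strict non-reversibles": {"X'S": 4604, "OF-X": 2839},
    "after soft non-reversibles": {"X'S": 4585, "OF-X": 1985},
}


def shares(counts):
    """Return each construction's share of the stage total, in whole percent."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


def removed(before, after):
    """Return how many tokens of each construction a cleaning step dropped."""
    return {name: before[name] - after[name] for name in before}


for label, counts in STAGES.items():
    print(label, "N =", sum(counts.values()), shares(counts))

print("dropped as soft non-reversibles:",
      removed(STAGES["after strict non-reversibles"],
              STAGES["after soft non-reversibles"]))
```

On these counts, nearly all of the tokens dropped in the 'soft' step are OF-X tokens, which is why the X'S share rises from 62% to 70%.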

Page 19: The question:

2. Coding the sample

Page 20: The question:

My sister's house (X'S)

The house of my sister (OF-X)

For each token, we coded the head and the modifier.

Each was coded for animacy, definiteness, NP form, and weight.

Page 21: The question:

CODING for ANIMACY:

ANIMATE:

• Human(oid)s

• Animals

ORG:

• Human organizations

INANIMATE:

• Concrete objects

• Abstract entities

• Locations

• Temporal entities

Page 22: The question:

CODING for WEIGHT:

Arnold et al., Wasow, and J. Hawkins assert that the [orthographic] word is a reasonable measure of weight for most purposes.

It is also easily automated.

Each head and modifier was coded for weight in words, from 1 through >20.
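A minimal sketch of how a word-count weight measure might be automated. It assumes that an orthographic word is any whitespace-delimited string, so a possessive such as "sister's" or a hyphenated form counts as one word; the slides do not spell out the tokenization, and the function names here are hypothetical.

```python
MAX_WEIGHT = 20  # weights above this are pooled into the ">20" code


def weight(phrase: str) -> int:
    """Count orthographic words, taken here to be whitespace-delimited strings."""
    return len(phrase.split())


def weight_code(phrase: str) -> str:
    """Map a phrase to its weight code: "1" through "20", or ">20"."""
    n = weight(phrase)
    return str(n) if n <= MAX_WEIGHT else ">20"


for modifier in ["his", "my sister's", "Speaker Sam Rayburn's", "the 23rd ward"]:
    print(modifier, "->", weight_code(modifier))
```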

Page 23: The question:

How to code for Discourse Status?

Even simple codes such as 'New', 'Inferrable', and 'Old' are quite time-consuming, although they are clearly desirable.

With thousands of tokens, we chose instead to exploit certain robust relationships between NP form and discourse status / accessibility.

Relying on previous research of Prince, Gundel et al., Ariel, i.a., we coded modifiers and heads for NP form and for morphosyntactic definiteness.

Page 24: The question:

Coding for NP Form and Definiteness:

Pronoun (most accessible, most topical, discourse-old...)

Proper Noun

Common Noun (definite)

Common Noun (indefinite) (least accessible, least topical, discourse-new...)

Page 25: The question:

3. Controlling weights

Page 26: The question:

After we coded our clean sample for weight, we noticed that 99% of our X'S examples had possessive modifiers that were 1, 2, or 3 words in weight.

We controlled for weight effects by limiting OF-X tokens to those of 1, 2, or 3 words in weight.

his only attack on the Republicans

the taxpayers' pockets

Speaker Sam Rayburn's forces

the invasion of Cuba

the rapid growth of juvenile delinquency

the 9th precinct of the 23rd ward

Page 27: The question:

Cleanest sample, after removal of modifiers greater than 3 words in weight

X'S: 4455 (75%)

OF-X: 1485 (25%)

N = 6034

Page 28: The question:

3. Generalizations

Page 29: The question:

We decided to convert the raw numbers of X'S and Of-X tokens into ratios.

For example, of 4177 animate tokens:

X'S: 3909

Of-X: 268

X'S / Of-X = 3909 / 268 ≈ 15 / 1

Page 30: The question:

Let's compare the inanimate tokens:

1359 inanimate tokens

X'S: 357

Of-X: 1002

X'S / Of-X = 357 / 1002 ≈ 1 / 3
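The conversion from raw counts to ratios is simple enough to sketch. The helper names below are illustrative only; the counts are the animate and inanimate figures given on the slides, and the base-10 logarithm is included because the charts that follow are plotted on a log scale.

```python
import math


def ratio(xs_count: int, ofx_count: int) -> float:
    """Ratio of X'S tokens to OF-X tokens."""
    return xs_count / ofx_count


def as_odds(r: float) -> str:
    """Format a ratio as 'n : 1' when it favors X'S and '1 : n' when it favors OF-X."""
    return f"{r:.1f} : 1" if r >= 1 else f"1 : {1 / r:.1f}"


animate = ratio(3909, 268)
inanimate = ratio(357, 1002)

print("animate:  ", as_odds(animate))    # 14.6 : 1
print("inanimate:", as_odds(inanimate))  # 1 : 2.8

# On a log scale, positive values favor X'S and negative values favor OF-X.
print(round(math.log10(animate), 2), round(math.log10(inanimate), 2))
```

Rounded to whole numbers, these are the 15 : 1 and 1 : 3 figures used in the charts.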

Page 31: The question:

Ratio of X'S to OF-X by Animacy category in Cleanest sample (n=6034)

[Bar chart, log scale from 0.1 to 100; ratios above 1 favor X'S, ratios below 1 favor Of-X.]

Animate (N=3937): 15 : 1

Org (N=498): 1.3 : 1

Inanimate (N=1359): 1 : 3

Page 32: The question:

Ratio of X'S to OF-X by NP form type in 'Cleanest' sample (n=6034)

[Bar chart, log scale from 0.1 to 1000; ratios above 1 favor X'S, ratios below 1 favor Of-X.]

Pronoun (N=3577): 297 : 1

Proper (N=971): 1.85 : 1

Comm.Def (N=947): 1 : 7.7

Comm.Indef. (N=539): 1 : 2.3

Page 33: The question:

Both animacy and discourse status seem to have a large effect. How about weight? Do we find effects of similar magnitude when we examine our possessive modifiers by our three weight values--1, 2 and 3 words?

Yes: here, the results span two orders of magnitude.

Page 34: The question:

Ratio of X'S to OF-X by Weight in Cleanest sample (n=6034)

[Bar chart, log scale from 0.1 to 100; ratios above 1 favor X'S, ratios below 1 favor Of-X.]

W=1 (N=4443): 10 : 1

W=2 (N=1174): 1 : 2

W=3 (N=417): 1 : 5

Page 35: The question:

Animacy, discourse status, and weight all show strong effects. If we hold one factor constant, do the other factors disappear?

First we will hold animacy constant and look at the effects of discourse status, through the proxy of NP form.

Page 36: The question:

Ratio of X's to OF-X by NP form, controlling Animacy (n=6034)

[Bar chart, log scale from 0.01 to 10000; ratios above 1 favor X'S, ratios below 1 favor Of-X. Groups: Animate (N=4177), Org (N=498), Inanimate (N=1359). Series: Pronoun, Proper, Comm.Def, Comm.Indef.]

Page 37: The question:

If we hold NP form constant, do the Animacy rankings hold up?

Page 38: The question:

Ratio of X's to OF-X by Animacy category, controlling NP form (n=6034)

[Bar chart, log scale from 0.01 to 10000. Groups: Pron. (n=3577), Prop. (n=971), Com.Def (n=947), Com.Indef (n=539). Series: Animate, Org, Inanimate.]

Page 39: The question:

Do the animacy and discourse status ratios hold the same relative values when we control for weight?

Yes.

If we repeat the process for all tokens with modifiers of weight 1, 2, and 3, the relative ranking of ratios stays the same, and the magnitude of the differences persists.

Page 40: The question:

Just how robust are these results?

The animacy and discourse status effects remained intact no matter what we controlled for. The relative ranking of the animacy and NP form categories was unchanged, although the ratios themselves differed somewhat in magnitude.

What would happen if we computed the same ratios on our original sample of 'All NPs' (n=10,006)? Did all our laborious extractions and exclusions really make a difference?

Page 41: The question:

Return to initial sample, "All NPs"

X'S: 4744 (47%)

OF-X: 5263 (53%)

N = 10,006

Page 42: The question:

Ratio of X'S to OF-X by NP form type: Comparison of Cleanest (n=6034) vs. All NPs (n=9963)

[Bar chart, log scale from 0.01 to 1000.]

Cleanest: Pronoun 297; Proper 1.85; Comm.Def 0.13; Comm.Indef 0.44

All NPs: Pronoun 28; Proper 0.96; Comm.Def 0.04; Comm.Indef 0.15

Page 43: The question:

Ratio of X'S to OF-X by Animacy: Comparison of Cleanest (n=6140) vs. All NPs (n=9963)

[Bar chart, log scale from 0.1 to 100.]

Clean: Animate 14.59; Org 1.29; Inanimate 0.36

All NPs: Animate 5.53; Org 0.59; Inanimate 0.11

Page 44: The question:

4. Interpreting the results

Page 45: The question:

Recall our goal:

4. Control for confounds where possible, and try to model the statistical findings within an OT grammar, following work by Aissen and others.

This is in progress, with very good results. A set of three binary constraints fits the data from our corpus study, and makes predictions that can be tested cross-linguistically.

Page 46: The question:

OT Analysis

In this preliminary phase, we classify possessors in terms of three binary features:

[±animate]

[±human]

[±pronoun]

Page 47: The question:

Input

[+anim, +hum, +pron] = she

[+anim, +hum, –pron] = butler

[–anim, +hum, +pron] = it (organization)

[+anim, –hum, +pron] = it (animal)

[–anim, –hum, +pron] = it (other)

[+anim, –hum, –pron] = dog

[–anim, +hum, –pron] = government

[–anim, –hum, –pron] = table

Page 48: The question:

OT constraints for the Complement (the X in Of-X)

(a) *P/C ‘No [+pron] in Comp.’

*P-NP/C ‘No [±pron] in Comp.’

(b) *A/C ‘No [+anim] in Comp.’

*A-I/C ‘No [±anim] in Comp.’

(c) *H/C ‘No [+hum] in Comp.’

*H-NH/C ‘No [±hum] in Comp.’

Page 49: The question:

OT constraints for the Specifier (the X in X'S)

(a) *NP/S ‘No [−pron] in Spec.’

*NP-P/S ‘No [±pron] in Spec.’

(b) *I/S ‘No [−anim] in Spec.’

*I-A/S ‘No [±anim] in Spec.’

(c) *NH/S ‘No [−hum] in Spec.’

*NH-H/S ‘No [±hum] in Spec.’

Page 50: The question:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

she S S S S S S S S S S S S S S S S S S C

butler S S S S S S S S S S C S S C S C C C C

it (org) S S S S S S S S C S S S C S C S C C C

it (animal) S S S S S S S S S C S C S S C C S C C

it (other) S S S S C S C C C C S C C C C C C C C

dog S S S C S C S C S C C C C C C C C C C

government S S C S S C C S C S C C C C C C C C C

table S C C C C C C C C C C C C C C C C C C

19 predicted languages (out of 256 logically possible ones)
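The count of 19 can be reproduced by brute force over constraint rankings. The sketch below is one possible reading of the twelve constraints, not the authors' implementation: each constraint is violated when a possessor with the banned feature value sits in the named position, and the three general (±) constraints for a position are collapsed into a single constraint because they assign identical violations. All names in the code are chosen for this illustration.

```python
from itertools import permutations

# Possessor types as (anim, hum, pron) triples; True stands for "+".
INPUTS = {
    "she": (True, True, True),
    "butler": (True, True, False),
    "it (org)": (False, True, True),
    "it (animal)": (True, False, True),
    "it (other)": (False, False, True),
    "dog": (True, False, False),
    "government": (False, True, False),
    "table": (False, False, False),
}

# Constraint = (position, feature index, banned value). A feature index of
# None marks the general constraint that bans every possessor from that
# position (one copy stands in for the three identical general constraints).
CONSTRAINTS = [
    ("C", 0, True), ("C", 1, True), ("C", 2, True), ("C", None, None),
    ("S", 0, False), ("S", 1, False), ("S", 2, False), ("S", None, None),
]


def preference(constraint, feats):
    """Return the position the constraint favors for this input, or None."""
    where, idx, banned = constraint
    violated = idx is None or feats[idx] == banned
    if not violated:
        return None
    return "S" if where == "C" else "C"  # a violation in one slot favors the other


PREFS = {name: [preference(k, f) for k in CONSTRAINTS] for name, f in INPUTS.items()}


def language(ranking):
    """For each input, the highest-ranked constraint with a preference decides."""
    return tuple(next(PREFS[name][i] for i in ranking if PREFS[name][i])
                 for name in INPUTS)


LANGUAGES = {language(r) for r in permutations(range(len(CONSTRAINTS)))}
print(len(LANGUAGES), "predicted languages out of", 2 ** len(INPUTS))
```

Under these assumptions the enumeration yields 19 distinct patterns. Every pattern places in Spec a set of possessor types that is closed under adding "+" values, and the one such set that no ranking produces is the one in which a possessor goes to Spec exactly when at least two of its three features are "+".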

Page 51: The question:

The OT grammar imposes a partial ordering on possessor types in terms of their

propensity to appear in the Specifier and Complement positions.

Page 52: The question:

Aissen Lattice

she [+anim, +hum, +pron]

butler [+anim, +hum, -pron]; it (animal) [+anim, -hum, +pron]; it (organization) [-anim, +hum, +pron]

dog [+anim, -hum, -pron]; government [-anim, +hum, -pron]; it (other) [-anim, -hum, +pron]

table [-anim, -hum, -pron]
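The lattice can be read as a partial order on feature bundles: one possessor type sits at or above another when it has "+" wherever the other does. A small sketch of that ordering follows, with 1 for "+" and 0 for "-"; the function names are hypothetical and the encoding is an assumption made for illustration.

```python
# Feature bundles as (anim, hum, pron), with 1 for "+" and 0 for "-".
TYPES = {
    "she": (1, 1, 1),
    "butler": (1, 1, 0),
    "it (animal)": (1, 0, 1),
    "it (organization)": (0, 1, 1),
    "dog": (1, 0, 0),
    "government": (0, 1, 0),
    "it (other)": (0, 0, 1),
    "table": (0, 0, 0),
}


def at_or_above(a: str, b: str) -> bool:
    """True if type a is at least as high in the lattice as type b."""
    return all(x >= y for x, y in zip(TYPES[a], TYPES[b]))


def comparable(a: str, b: str) -> bool:
    """True if the lattice orders the two types in either direction."""
    return at_or_above(a, b) or at_or_above(b, a)


# Lattice edges: pairs that differ in exactly one feature, higher type first.
EDGES = [(a, b) for a in TYPES for b in TYPES
         if at_or_above(a, b) and sum(TYPES[a]) - sum(TYPES[b]) == 1]

print(at_or_above("she", "table"))          # she is above every other type
print(comparable("butler", "it (animal)"))  # same row: the lattice does not rank them
print(len(EDGES), "edges")
```

Types in the same row are not ordered with respect to each other, which is why the grammar imposes only a partial ordering.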

Page 53: The question:

Empirical support?

Do the predicted rankings of affinity for Spec (X'S) hold in our corpus?

We examined the percentages of the relevant NPs in Spec vs. Comp:

Page 54: The question:

Aissen Lattice

she [+anim, +hum, +pron]

butler [+anim, +hum, -pron]; it (animal) [+anim, -hum, +pron]; it (organization) [-anim, +hum, +pron]

dog [+anim, -hum, -pron]; government [-anim, +hum, -pron]; it (other) [-anim, -hum, +pron]

table [-anim, -hum, -pron]

99.9%

100% 100% 75%

37% 97% 50%

9%

Page 55: The question:

Confirmation

The predictions hold nearly perfectly in the 'Cleanest' sample.

Will they hold for the 'All NP' sample?

Page 56: The question:

Aissen Lattice

she [+anim, +hum, +pron]

butler [+anim, +hum, -pron]; it (animal) [+anim, -hum, +pron]; it (organization) [-anim, +hum, +pron]

dog [+anim, -hum, -pron]; government [-anim, +hum, -pron]; it (other) [-anim, -hum, +pron]

table [-anim, -hum, -pron]

98%

99% 100% 31%

18% 76% 35%

3%

Page 57: The question:

Conclusions

• We conducted a large corpus study in which many features were controlled and investigated (relationality);

• We found robust effects for animacy, discourse status and weight; difficult to disaggregate;

• We modelled these using three binary features and six pairs of constraints; we made predictions about crosslinguistic factorial typology.

Page 58: The question:

Conclusions

"Using OT you can probably model a guy frying an egg." (Arto Anttila)

However, the inevitable question: "So what?"

Moreover, our OT model treated discourse status ("±pronoun") and humanness and animacy as independent factors. This is probably wrong, and there is nothing explanatory about it.

Page 59: The question:

Conclusions

The question we would like to answer is a metatheoretical question:

What does it mean for the theory of grammar that these animacy and discourse and weight effects are so robust, and yet so inextricably combined? Is it possible to demonstrate in a principled way whether one derives from another, at least with respect to the alternation between the X'S and OF-X constructions? We haven't answered this question.

Page 60: The question:

Conclusions

(See acknowledgements section of handout for URL and email details)

So now that we have spent several years looking closely at thousands of examples of the two constructions, and now that we have shown that some effects are so robust that you can get them with really unfiltered data, we would like to invite anyone who is interested to try their hand at these questions. We will be putting our coded corpus on the web soon and we invite you to investigate it yourself...

Page 61: The question:

Acknowledgements

This research was supported by NSF grant BCS-008037, "Optimal Typology of Determiner Phrases". The support of the NSF Linguistics Program is gratefully acknowledged. No endorsement of this research is implied.

Many thanks to our graduate research assistants:

Gregory Garretson

Marj Hogan

Barbora Skarabela

and our undergraduate research assistants:

Amy Rose Deal

John Manna

Many thanks also to Joan Bresnan, Annie Zaenen, and Tom Wasow, for discussions of animacy.

Thanks to Boston University students in LS 751, Spring 2002, for discussions of some of this material.

Page 62: The question:

A conjecture:

Claim: The prenominal position favors accessible, topical referents. (Many clause-level constructions tend towards 'old first, new last,' and this can be observed in the NP as well.)

Observation: Discourse topics tend to be human; 1st and 2nd persons are among the most accessible entities in any speech situation.

As speakers discuss events involving animate actors, inanimate NPs introduce background objects, properties, and arguments of many predicates.

Page 63: The question:

A conjecture:

How does this well-worn observation buy us anything?

If Animates are highly topical, and thus highly accessible and frequently mentioned, they will be expressed with pronouns. Lighter entities are favored in initial position; so are older entities. The preference of Animates (particularly Humans) for the initial position is a by-product of their topicality and concomitant length.

Page 64: The question:

A conjecture:

If inanimates are usually mentioned only once or twice, they will predominantly get expressed as definite and indefinite common nouns that must be fully informative, thus longer.

Inanimates are not usually highly topical, and thus they do not favor the X'S slot. Their appearance in the OF-X construction is a by-product of their non-topical status and length, not a fact about animacy per se.

This may explain the redundancy between inanimates, common nouns, higher weights, and the OF-X position.

Page 65: The question:

Hypothesis 4:

LEXICAL SEMANTICS

Some have claimed that features of the head noun (e.g. relationality) or the semantic relation between the head and modifier account for most of the variation (Stefanowitsch, Taylor, Barker, i.a.)

Page 66: The question:

Relational semantics

Recall the lexical semantics hypothesis. We have tried to control for effects of nominal head semantics by excluding as many strict and soft non-reversibles as we can find. However, there is one more issue to deal with.

Barker, Stefanowitsch, and others have claimed that only relational heads are truly reversible. Yet our X'S sample includes many examples of possession of non-relational nouns.

Kim's truck --> ??the truck of Kim

These are said to be irreversible because truck is not relational.

Page 67: The question:

Relational semantics

If we limited our X'S and OF-X sample to only relational heads, would the animacy and discourse status effects disappear or persist?

To test this, we selected a large sample of relational heads from our Clean sample (which excludes all strict non-reversible constructions, e.g. partitives). We disaggregated all the examples that had kinship or body part heads: his cousin or the feet of Fred Astaire.

This included 934 tokens.

Page 68: The question:

Ratio of X'S to OF-X by NP form type. "Clean" sample: Relational Heads only (n=934)

[Bar chart, log scale from 0.1 to 1000.]

Rel.Heads: Pronoun 350; Proper 9.0; Comm.Def .22; Comm.Indef 1.89

Page 69: The question:

Relational semantics

Then we took all the tokens that had Concrete Inanimate heads (excluding body parts, which are relational.)

Not all of these examples are non-relational, but the prototypical examples of non-relational nouns often include concrete objects that may be possessed by humans but do not have any discernible argument structure of their own.

This included 489 tokens.

Page 70: The question:

Ratio of X'S to OF-X by NP form type. Clean sample: Relational Heads (n=934) vs. Concrete Inanimate Heads (n=489)

[Bar chart, log scale from 0.1 to 1000.]

Rel.Heads: Pronoun 350; Proper 9.0; Comm.Def .22; Comm.Indef 1.89

Conc.Heads: Pronoun 327; Proper 1.9; Comm.Def .52; Comm.Indef .72

Page 71: The question:

Text analysis (Brown Corpus): an excerpt from a Western novel.

Some facts:

2000 words

Approximately 650 coded NP 'mentions'

Approximately 150 distinct 'referents'

Dan Morgan; his dry lips; night; his plans and dreams; sleep; a wife ...as fickle as Ann

Page 72: The question:

Core participants: those with the most mentions

Dan Morgan: 165 mentions

The visitors: 63 mentions

Sharon Jones: 87 mentions

Billy Jones: 43 mentions

Peripheral participants: non-core

a trick Al Budd had thought up: 2 mentions

a pathetic, woebegone expression: 1 mention

an idiot: 1 mention

ordinary years: 1 mention

Page 73: The question:

Core participants:

Dan Morgan: 165 mentions

The visitors: 63 mentions

Sharon Jones: 87 mentions

Billy Jones: 43 mentions

How are core participants realized?

More than half of the NPs in the text refer to one of these four "Core" participants. Of these 358 mentions, 83% are pronouns.
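The "more than half" figure can be checked from the mention counts. In this sketch the denominator is an assumption: it adds the 358 core mentions to the 308 peripheral NPs reported for the same text, which gives a total close to the "approximately 650" coded mentions. Variable names are illustrative.

```python
# Mention counts for the four core participants, as listed on the slide.
CORE_MENTIONS = {
    "Dan Morgan": 165,
    "The visitors": 63,
    "Sharon Jones": 87,
    "Billy Jones": 43,
}
PERIPHERAL_NPS = 308  # peripheral NPs reported for the same excerpt

core_total = sum(CORE_MENTIONS.values())
all_mentions = core_total + PERIPHERAL_NPS  # assumed total, close to "approximately 650"
core_share = core_total / all_mentions

print(core_total)                 # 358
print(f"{core_share:.0%} of coded mentions refer to a core participant")

# 83% of core mentions are pronouns: an approximate count, since 83% is rounded.
print(round(0.83 * core_total), "core mentions realized as pronouns (approx.)")
```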

Page 74: The question:

How are peripheral participants realized?

Over 75% of the 308 "Peripheral" participants are mentioned only once or twice.

Of these 308 NPs, over 85% are common nouns.

Peripheral elements:

a trick Al Budd had thought up: 2 mentions

a pathetic, woebegone expression: 1 mention

an idiot: 1 mention

ordinary years: 1 mention