Tracking the able in stable: Toward an understanding of morphological decomposition in processing and representation Alec Marantz, Olla Somolyak, Ehren.

Post on 13-Dec-2015


Tracking the able in stable: Toward an understanding of morphological decomposition in processing and representation

Alec Marantz, Olla Somolyak, Ehren Reilly

Bill Badecker, Asaf Bachrach, Susan Gabrieli, John Gabrieli

KIT/NYU MEG Joint Research Lab (et al.)

Departments of Psychology and Linguistics, NYU

Outline

• Strawmen Burnt
  - cartoon dual route model
  - cartoon obligatory decomposition model

• Real Issues Identified
  - modality specific access “lexicon”?
  - access lexicon contains morphologically complex forms?
  - knowledge contributions to affix-stripping/decomposition?
  - effects of decomposition on stem access?
  - effects of stages of processing on RT in, e.g., lexical decision?

• Experiment Done
  - evidence for, at least, full decomposition or interactive dual route model
  - no support for whole word access route
  - support for early effects of surface frequency of affixed forms relative to stem frequency, at decomposition stage

• Experiment Planned
  - tracking the -able in amiable and stable

Competing Models of Lexical Access – take your pick

(or, the field’s take on Words and Rules)

Saturday Morning Models of Lexical Access

(as most people read Pinker)

• Full storage model: all complex words (walked, taught) stored and accessed as wholes: only surface frequency effects on access predicted

• Full decomposition model: no complex words stored and accessed as wholes: only stem frequency effects on access predicted

• Dual Route Model (hooray!): irregular complex forms are stored and accessed as wholes; regular complex forms (except high frequency regulars) are not:
  - surface frequency effects on access for irregulars and high frequency regulars
  - stem frequency but no surface frequency effects on access for regulars

Facts Support Dual Route Model, if Alternatives are these Cartoon Versions

• Fact: Stem Frequency effects in access for complex words

• Fact: These effects are arguably not attributable to post-access decomposition particularly when considered in connection with masked priming studies showing morphological priming when neither form nor semantic priming are found

• But, fact: Surface frequency effects in lexical access are found in a wide variety of cases, including completely regular morphology (e.g., for most inflected words in Finnish)

Problems for Cartoon Dual Route Model

• The representation of even irregular derived or inflected forms must be complex
  - from the grammatical point of view (morphology, syntax and semantics), felt is as complex as walked, e.g., behavior with respect to do support, further inflection, impossibility of derivation, rigidity of meaning…
  - from the psycho- and neurolinguistic point of view, irregulars contain the stem in the same way that regulars do: taught-teach identity priming in long-lag priming and for M350 brain response

• So, the issue of “whole word” access to complex irregulars must be focused on possible “whole word” representations in something like a modality-specific “access lexicon,” not on the lexical representations of such forms, which must be complex and as complex as the representations of regulars.

Comparing surface and base frequency effects: Level 1 vs Level 2 Morphology

Surface Frequency Effect: Biggest for “Level 1”

Base Frequency Effect: Only for “Level 2”

Groups of Affixes

Base Frequency Effect: Nothing significant for groups of affixes

Surface Frequency Effect

Semantic Transparency Doesn’t Explain Lack of Base Frequency Effects

Whole Word “Representations” for Regulars

• Surface frequency effects on access are seen for a variety of completely regular derivations and inflections.

• This should not surprise any obligatory decompositionalist, since surface frequency effects could be tied to the decomposition (the more you’ve decomposed a particular letter/sound sequence into stem and affix, the faster you are at it) or recombination (the more often you’ve put together a particular stem and affix, the faster you are at it) stages of processing. But any such effects imply representation of the whole word as a complex structure, regardless of regularity.

“Representations”

• Saying that every combination of morphemes in perception or production, no matter how regular, leaves a trace in the language system of the speaker is saying that frequency information is part of the grammar and that all combinations of morphemes are stored in some sense.

• Theories of morphology in particular have made explicit the difference between stored information about combinations of morphemes that may have grammatical effects on syntax, morphology, phonology and certain types of compositional semantics and stored information about combinations that may only have an effect on “idiomatic” or phrasal meanings.

• It’s fairly straightforward to claim that walked is “stored” as a complex form with a certain frequency in the same way that And now for something completely different is. Both must be composed with the grammar when heard or produced, but both may have frequency and special meaning information associated with them that, as far as the kinds of things most linguists study, have no implications for the grammar whatsoever.

Beyond the Cartoons

Realistic Full Decomposition Models Must…

• Recognize that complex words, both regular and irregular, are stored in some sense, leading to possible surface frequency effects

• Investigate the role of surface frequency in
  - decomposition
  - stem access
  - recombination

Realistic Dual Route Models Must…

• Recognize that all complex forms must be representationally complex, containing structures of morphemes and contrasting with monomorphemic constituents

• Focus on the possible existence of stored “whole word” representations at modality-dependent “access lexicons” (whole word word form representations) to distinguish themselves from obligatory decomposition models

[Diagram: levels of representation for unreal]

form code: u n r e a l

modality specific access lexicon(?): un, real, unreal

lemma (lexical entry): [un[real]], “not” REAL

Encyclopedia (stored info about encountered items): And now for something completely different; UN+REAL (??)

Realistic Interactive Dual Route Models

• Discuss possible sources for facilitation or difficulty for identifying the affixes for decomposition

• These include phonotactic clues to morpheme boundaries (cf. sixths) as well as statistical properties of the combinations of morphemes, particularly conditional probabilities.

Effect of “Dominance” on Lexical Access

• Jen Hay has made some claims about the importance of the relative frequency of a morphologically complex form with respect to the frequency of its stem.

• Complex words with high frequencies relative to their stems are “affix-” or “surface dominant”; those with low frequencies are “stem” or “base dominant”

• Hay: affix dominance leads to difficulty in parsing/decomposition, thus reliance on whole-word recognition and suppression of complexity of representation.

Simplistic Prediction of Hay Model

• Affix dominant words should show surface frequency effects since they are accessed via the whole word route.

• Stem dominant words should show stem (cumulative) frequency effects since they are accessed via the decomposition route.

Matched for stem frequency (9); differ in being surface dominant (mere(ly)) or stem dominant (sane(ly))

• mere(ly): mere x 4, merely x 5

• sane(ly): sane x 8, sanely x 1
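The token display above (mere four times and merely five times; sane eight times and sanely once) can be turned into a small calculation. The following is only a sketch: the function names are hypothetical, the counts are the toy tokens from the slide rather than corpus counts, and treating a tie as stem dominant is an assumption the slides do not address.

```python
def classify_dominance(stem_count, affixed_count):
    """Label a stem/affixed pair by which form is more frequent.

    Assumption: ties count as stem dominant; the slides do not say.
    """
    return "affix dominant" if affixed_count > stem_count else "stem dominant"


def cumulative_stem_frequency(stem_count, affixed_count):
    """Stem frequency summed over the bare and the affixed form."""
    return stem_count + affixed_count


# Toy token counts read off the slide (bare form, affixed form).
pairs = {"mere(ly)": (4, 5), "sane(ly)": (8, 1)}

labels = {name: classify_dominance(*counts) for name, counts in pairs.items()}
totals = {name: cumulative_stem_frequency(*counts) for name, counts in pairs.items()}

for name in pairs:
    print(name, totals[name], labels[name])
```

Both pairs come out with the same cumulative stem frequency (9) but opposite dominance labels, which is the contrast the slide is built on.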

Taft (2004): “Morphological Decomposition and the Reverse Base Frequency Effect”

Makes same predictions as Hay for RT with full parsing theory

• Base frequency effects (RT to complex word correlates with freq of stem) reflect accessing the stem of morphologically complex forms, whereas

• Surface frequency effects (RT to complex word correlates with freq of complex word) reflect the stage of checking the recombination of stem and stripped affix for existence and/or well-formedness.

How can we distinguish these accounts of RT differences?

[Figure: schematic timelines dividing RT into Latency of Lexical Access and Post-Access processing (until response), for PL-Dominant PL (High Surface Freq) and SG-Dominant PL (Low Surface Freq), under Full Parsing* and under Full Listing / Parallel Dual Route]

Reilly, Badecker & Marantz 2006 (Mental Lexicon)

Sequential processing of words


Pylkkänen and Marantz, 2003, Trends in Cognitive Sciences

(Pylkkänen, Stringfellow, Flagg, Marantz, Biomag2000 Proceedings, 2000)

Repetition Frequency

[Figure: Behavioral Data: Reaction Time, by Frequency Category 1 to 6 (Frequent -- Infrequent)]

[Figure: Latency of M350 Component, by Frequency Category 1 to 6 (Frequent -- Infrequent)]

Categories (n/Million): 1: 700; 2: 140; 3: 30; 4: 6; 5: 1; 6: .2

(Embick, Hackl, Shaeffer, Kelepir, Marantz, Cognitive Brain Research, 2001)

Latency of M350 sensitive to lexical factors such as lexical frequency and repetition

Experiment: parallel behavioral and MEG processing measures

• Lexical Manipulation (Baayen, Dijkstra & Schreuder, 1997, JML)
  - Lemma frequency (CELEX database)
  - Morphological dominance (surface frequency)

Stem Frequency   Stem Dominant   Affix Dominant
High             desk – desks    crop – crops
Mid              deck – decks    cliff – cliffs
Low              chef – chefs    chord – chords

Stimuli: 3 Lexical Categories

• Nouns: singular/plural bone bones

• Verbs: stem/progressive chop chopping

• Adjectives: adjective/-ly adverb clear clearly

Experiment: behavioral measures

• Reliable effect of stem frequency in RT

[Figure: RT (axis 620 to 760) by Stem Frequency (High, Medium, Low)]

Experiment: behavioral measures

• Interacting effects on RT of affixation (base vs. affixed) and dominance (base-dominant vs. affix-dominant)

[Figure: RT (axis 640 to 780) by Affixation (Unaffixed, Affixed), plotted separately for Base-Dominant and Affix Dominant words]

Analysis of M350 peak latency

• Reliable effect of Stem frequency for unaffixed words and for affixed words

[Figures: M350 peak latency (axis 250 to 400) by Stem Frequency (High, Medium, Low); panels: Unaffixed Words, Affixed Words]

Analysis of M350 peak latency

• Reliable effect of Affixation (base vs. affixed)

[Figure: M350 peak latency (axis 250 to 400) by Affixation (Unaffixed, Affixed)]

Analysis of M350 peak latency

• No effect of Dominance (base-dominant vs. affix-dominant) on M350 peak latency

[Figure: M350 peak latency (axis 250 to 400) for Affixed Words, Affix Dominant vs. Base Dominant]

Analysis of M350 peak latency

• No interaction between Dominance (base-dominant vs. affix-dominant) and Affixation (base vs. affixed)

[Figures: M350 peak latency (axis 335 to 385) and Cumulative Response Time (axis 640 to 780) by Affixation (Unaffixed, Affixed), plotted separately for Base-Dominant and Affix Dominant words]

Analysis of M350 peak latency

• Evidence that early stages of access for affixed words are based on full parsing: whole word frequency affects post-access stages.

[Figure: M350 Peak Latency and Residual RT for Base-Dominant and Affix-Dominant Affixed Words (time axis 0 to 800)]

Problem with this Conclusion

• No acknowledgement of the effects of dominance and/or surface frequency on parsing stage of decomposition

• No acknowledgement of possible effects of finding affix on stem access

Possible effects of dominance at different stages in word recognition

affix dominant (merely)
  - parsing affix: harder, since tighter connection (possibly only at high surface freq values)
  - stem access: based on stem frequency, possibly speeded if high conditional probability of stem given affix
  - recombination and checking: faster, should correlate with surface frequency at high freq values
  - RT: might correlate (-) with surface frequency, given speed up in recombination

stem dominant (sanely)
  - parsing affix: easier than for affix dominant (lower transition probability)
  - stem access: based on stem frequency
  - recombination and checking: at lower surface freq values, no effect of surface freq
  - RT: at lower surface freq value, should correlate (-) with stem freq

• Let’s examine these effects with correlational analysis

• [For affixed words, dominance and surface frequency are correlated in these materials, perhaps unfortunately for our purposes]

Correlations with RT

For all words:
• Stem Frequency: r = -0.11; p << 0.01
• Surface Frequency: r = -0.15; p << 0.01
• Word Frequency: r = -0.18; p << 0.01
• Dominance: r = -0.07; p < 0.01
• Affixation: r = 0.08; p < 0.01

For affixed words only:
• Stem Frequency: r = -0.07; p < 0.05
• Surface (word) Frequency: r = -0.17; p << 0.01
• Dominance: r = -0.13; p << 0.01

For affixed surface dominant words:
• Stem Frequency: r = -0.14; p < 0.01
• Surface Frequency: r = -0.14; p < 0.01

Correlational Analyses

• Within a subject (or two): for each sensor, for each point in time, correlate measured magnetic field strength for each stimulus with the value of some stimulus variable (length, frequency, etc.)

[Figure: correlation with left vs. right hemifield presentation (Left Vs Right). x-axis: time from stimulus onset (ms), -100 to 250; y-axis: strength of correlation, -0.4 to 0.5; Significance Level = 0.090; marked at 150ms]
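The within-subject analysis described above (one correlation per sensor per time point, across stimuli) can be sketched as follows. This is an illustrative sketch, not the lab's analysis code: the function names, the nested-list layout `field[stimulus][sensor][time]`, and the toy numbers are all assumptions, and Pearson's r is assumed as the correlation measure since the slides do not name one.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlation_map(field, variable):
    """Return r[sensor][time]: field strength correlated with a stimulus variable.

    field[stimulus][sensor][time] holds one measured value per cell;
    variable[stimulus] holds the stimulus property (length, frequency, ...).
    """
    n_sensors = len(field[0])
    n_times = len(field[0][0])
    return [
        [pearson([trial[s][t] for trial in field], variable) for t in range(n_times)]
        for s in range(n_sensors)
    ]


# Toy data: 4 stimuli, 2 sensors, 3 time points (invented values).
variable = [3.0, 4.0, 5.0, 6.0]  # e.g. word length
field = [
    [[0.0, 2.0 * v, 1.0], [-v, 0.0, float(k % 2)]]
    for k, v in enumerate(variable)
]

cmap = correlation_map(field, variable)
print(cmap)
```

Plotting one row of `cmap` against time gives a curve of the kind shown in the figure; a significance threshold would have to be added separately.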

RMS Correlations Across Subjects

• For some set of sensors, calculate at each time point in each experimental “epoch” the root mean square (RMS) = the square root of the mean of the squares of the values at each sensor (after normalization of values)

• So, for each subject, for each item, an RMS “wave” can be provided for the correlational analysis

• At each time point, the RMS value for each stimulus is correlated with a stimulus variable
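The RMS definition in the bullets above can be written out directly. A minimal sketch, assuming the sensor values have already been normalized (the slides do not say how) and that one epoch is stored as `epoch[sensor][time]`; the function name and the toy values are hypothetical.

```python
import math


def rms_wave(epoch):
    """Root mean square across sensors at each time point.

    epoch[sensor][time] holds the (already normalized) sensor values for one
    stimulus presentation; the result has one RMS value per time point.
    """
    n_sensors = len(epoch)
    n_times = len(epoch[0])
    return [
        math.sqrt(sum(epoch[s][t] ** 2 for s in range(n_sensors)) / n_sensors)
        for t in range(n_times)
    ]


# Toy epoch: 2 sensors, 2 time points (invented values).
epoch = [[3.0, 0.0], [-4.0, 0.0]]
wave = rms_wave(epoch)
print(wave)
```

Because the values are squared, sensors with opposite field polarity both contribute positively, which is why RMS is a convenient single "wave" per item for the correlational analysis.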

Grand Average All Stimuli All Subjects (11)

M170 Sensors Chosen on the basis of field pattern, subject by subject

M350 sensors chosen subject by subject

M170 RMS Across Subjects: Affixed Words Split Half Divided by Affix Dominance

M170 Correlation with Dominance: Significant “parsing” effect

No M170 Stem Frequency Effect for unaffixed words

No M170 Word/Non-Word “lexicality” effect

Although surface frequency correlates with affix dominance in these words, no surface frequency M170 effect

RMS Averages Support Previous Conclusion that M350 peak lags for Affixed as Compared to Un-Affixed Words

Affix Dominance RMS Averages Suggest Dominance Effects at M350

Base Frequency Correlation for Affixed Surface Dominant Words at M350 Supports Full Decomposition

Effects of Dominance on M350: In Contrast to M170, Now Bigger Effect for Stem Dominant Words (so, dominance affects affix-dominant words early = parsing, stem-dominant words later = recombination?)

Recombination Effect?: Correlation with Conditional Probability of Stem, Given Affix, for Affixed Words

Evidence For Modality-Specific Access Lexicon?

• At M170, where there are large effects of dominance, it’s difficult to find word-form frequency effects or lexicality effects

• However, does “parsing” at M170 require access to “lexicalized” word forms (recall Zweig & Pylkkänen’s winter/farmer contrast) or to high-n n-grams or what?

• Dominance effects at M170 suggest frequency information associated with word-forms, as does winter/farmer contrast
  - dominance reflects the conditional probability of the affix given the stem

Sophisticated Interactive Dual-Route Models predict Whole Word WordForm Effects

• Note that there is no evidence from the current study (or any study I know of) that whole word wordforms are available for complex words

• To the contrary, complex words with high affix dominance – the very words that should have whole word wordforms according to Hay – show the biggest effect of complexity at the M170, where wordform effects are expected (i.e., these don’t act like simplex wordforms)

[Figure: correlation with word length; 2 subjects combined. x-axis: Time from Stimulus Onset (ms), -100 to 400; y-axis: Strength of Correlation, -0.2 to 0.25; Significance Level = 0.057]

Localization of M100 from averaged data

[Figure: one subject, correlation with orthographic neighborhood size (Orthographic Neighbours). x-axis: Time from Stimulus Onset (ms), -100 to 400; y-axis: Strength of Correlation, -0.2 to 0.2; Significance Level = 0.080]

fMRI data, same experiment, same variable


Correlation in One Subject with Word Length

Correlation with Word Frequency

Correlation with Lexicality

if there were a modality specific word form lexicon…

• Since word-form access lexicon effects would be expected at the M170, given the nature of the morphological complexity results for the M170, the whole-word word form hypothesis would predict: if word-form frequency effects are at all visible for unaffixed words at the M170, there should be word-form frequency effects at the M170 for complex forms once effects of parsing are regressed away

Experiment Planned, Continuous Variables: Investigate Factors Influencing Stages of Processing for Morphologically Complex Words AND Provide Neural Indices of Morphological Decomposition for Controversial Cases

• Initial parsing and decomposition
  - affix frequency?
  - transition probabilities, computed over morphemes or over strings?

• stem activation
  - frequency
  - “family” structure (family size, frequency, semantic transparency, etc.)
  - conditional probability from affix recognition?

• re-merger of pieces
  - surface frequency and conditional probabilities?

• evaluation of combined structure
  - semantic transparency?

status of bound stems

• durable: same root in duration; predicts durability

• amiable (stable?): no other uses of root, but predicts amiability (stability)

• cable: false parse, compared to stable

tracking the -able in amiable

• If words like tolerable with a recurring root and amiable with a unique root nevertheless are parsed and computed as is workable with a word root, then
  - M170 “parsing” effects should be visible for these “opaque” words, since effects are strong for stem-dominant words
  - M350 effects should be modulated by morphological family and conditional probability of word given the affix
  - correlational contrasts with “matched” unaffixed words should be seen at all stages of access

Categories of Affixed Words for New Experiment

• 1. Word-Affix: taxable

• 2. Root-Affix: tolerable

• 3. Pseudo-Affix (mono-morphemes): capable

• Morphological parsing as from English Lexicon Project

Nine Affixes

• able • ary • ant • ity • ate • ic • er • al • ion

More Examples

Word-Affix    Root-Affix   Pseudo-Affix
rationality   quantity     vicinity
destroyer     sorcerer     character
classic       specific     empiric

• 6 words per category per affix

• 6 x 3 x 9 = 162 affixed words

The 27 subgroups…

• All groups are equivalent and well-distributed over:
  - length
  - mean bigram count
  - frequency

• All word-affix groups are well-distributed over:
  - surface vs stem frequency dominance

[Figures: Length; Log Frequency; Mean Bigram Count]

Matched Words

• Matched on frequency of ending:
  - Calculated affix frequencies from CELEX database
  - For each affix, found a list of mono-morphemes with roughly the same ending frequency (log frequency within +/- 0.3 of the affix log frequency)
  - Chose 18 matched words for each affix that had the same parameter distributions as the affixed words
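The +/- 0.3 log-frequency window used for matching can be illustrated with a short filter. This is a sketch under stated assumptions: the slides do not give the log base (base 10 is assumed here), the function name is hypothetical, and the ending labels and frequencies are invented placeholders, not CELEX values.

```python
import math


def endings_in_window(affix_freq, ending_freqs, tolerance=0.3):
    """Return endings whose log10 frequency lies within +/- tolerance of the affix's.

    Assumption: log base 10; the slides only say "log frequency".
    """
    target = math.log10(affix_freq)
    return sorted(
        ending
        for ending, freq in ending_freqs.items()
        if freq > 0 and abs(math.log10(freq) - target) <= tolerance
    )


# Invented placeholder frequencies, for illustration only.
affix_frequency = 1000
candidate_endings = {
    "ending_a": 1500,  # log10 about 3.18, inside the window
    "ending_b": 600,   # log10 about 2.78, inside the window
    "ending_c": 100,   # log10 = 2.00, outside
    "ending_d": 2100,  # log10 about 3.32, outside
}

matches = endings_in_window(affix_frequency, candidate_endings)
print(matches)
```

Selecting the final 18 matched words per affix would need a further step that balances length, bigram count, and frequency, which this sketch does not attempt.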

Examples

• er: evening, lightning, mediocre

• ity: caricature, terrain, pertain

• able: sentence, thought, profound

Parsability

• Affixed and Matched words are equally distributed over “parsability” - transitional probability between the last 2 letters of the stem and the first two letters of the affix (or affix-match).

• Example - for “ability” this would be: given that you see “il” 5 letters from the end of a word, what is the probability that “it” will follow it?
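The “ability” example can be made concrete with a position-specific count over a word list. A minimal sketch, assuming the probability is estimated over word types with equal weight (the slides do not say whether types or tokens were used); the function name and the toy word list are hypothetical.

```python
def parsability(words, stem_bigram, affix_bigram, affix_len):
    """Estimate P(affix_bigram follows | stem_bigram at the stem-final position).

    The stem-final bigram is looked for affix_len + 2 letters from the end of
    each word; for "ity" that is 5 letters from the end, as in the example.
    Returns None if the context never occurs.
    """
    offset = affix_len + 2
    context = 0
    followed = 0
    for word in words:
        if len(word) < offset:
            continue
        i = len(word) - offset
        if word[i:i + 2] == stem_bigram:
            context += 1
            if word[i + 2:i + 4] == affix_bigram:
                followed += 1
    return followed / context if context else None


# Toy lexicon (invented for illustration, equal weight per word).
toy_words = ["ability", "agility", "utility", "smiling", "sailors", "stable"]

p_it_given_il = parsability(toy_words, "il", "it", affix_len=3)
print(p_it_given_il)  # 3 of the 5 words with "il" in that position continue with "it"
```

With a real lexicon, each word could be weighted by its frequency instead of counted once, which would give a token-based estimate.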

Non Words

• 324 non-words chosen randomly to have the same distributions over length and mean bigram count as the words.

• 648 total words and non-words

Homographs

• Added 20 homographs to compare the effects of word frequency with word-form frequency (to test for nature of modality specific access lexicon, if any)

• Examples: content, contract, wind

• Matched with a group of words and a group of non-words.

Total = 728 stimuli

• 162 affixed words (3 categories, 54 per category)

• 162 words matched to affixed words

• 20 homographs

• 20 words matched to homographs

• 324 non-words matched to affixed words

• 40 non-words matched to homographs

That’s all, fo..(oh, you know the rest…)
