Pre-attentive detection of vowel contrasts utilizes both phonetic and auditory memory representations

Cognitive Brain Research 7 (1999) 357–369

Research report

Pre-attentive detection of vowel contrasts utilizes both phonetic and auditory memory representations

István Winkler a,b,*, Anne Lehtokoski b, Paavo Alku c,d, Martti Vainio e, István Czigler a, Valéria Csépe a, Olli Aaltonen f, Ilkka Raimo f, Kimmo Alho b, Heikki Lang g, Antti Iivonen e, Risto Näätänen b,h

a Institute for Psychology, Hungarian Academy of Sciences, H-1394 Budapest, P.O. Box 398, Szondi u. 83/85, Hungary
b Cognitive Brain Research Unit, Department of Psychology, University of Helsinki, Helsinki, Finland
c Department of Applied Physics, Electronics, and Information Technology, University of Turku, Turku, Finland
d Acoustics Laboratory, Helsinki University of Technology, Helsinki, Finland
e Department of Phonetics, University of Helsinki, Helsinki, Finland
f Department of Phonetics, University of Turku, Turku, Finland
g Department of Clinical Neurophysiology, University of Turku, Turku, Finland
h BioMag Laboratory, Helsinki University Central Hospital, Helsinki, Finland

Accepted 11 August 1998

Abstract

Event-related brain potentials (ERP) were recorded to infrequent changes of a synthesized vowel (standard) to another vowel (deviant) in speakers of the Hungarian and Finnish languages, which are remotely related to each other and have rather similar vowel systems. Both language groups were presented with identical stimuli. One standard-deviant pair represented an across-vowel category contrast in Hungarian, but a within-category contrast in Finnish, with the other pair having the reversed role in the two languages. Both within- and across-category contrasts elicited the mismatch negativity (MMN) ERP component in the native speakers of either language. The MMN amplitude was larger in across- than within-category contrasts in both language groups. These results suggest that the pre-attentive change-detection process generating the MMN utilized both auditory (sensory) and phonetic (categorical) representations of the test vowels. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Event-related potential; Mismatch negativity; Auditory sensory memory; Phonetic representation; Category boundary effect; Cross-language study

1. Introduction

Based on Broadbent's [7] original scheme of information processing, the classical multi-store models of the human memory system postulate a clear sequential relationship between sensory and categorical forms of representation. Current views based on behavioral research (e.g., Ref. [6]) allow for simultaneous processing of sensory and categorical forms of stimulus representation by separate subsystems. Recent research suggested that sensory as well as categorical sound representations may provide the basis for pre-attentive auditory change detection. The present study tested whether pre-attentive auditory change detection utilizes sensory and categorical representations of the same stimuli in parallel.

* Corresponding author. Fax: +36-1-269-29-72; E-mail: [email protected]

Phonetic stimuli are an obvious choice for such research as they are categorized at an early stage of auditory information processing and, therefore, both types of codes might be accessed by pre-attentive processes. Early work on speech perception showed that discriminating two different phonemes of one's own native language occurs almost instantaneously, whereas discriminating two exemplars of the same phoneme category might be very difficult even if the acoustic differences involved are identical [18]. Traditionally, this form of perception is called categorical perception. The phonetic category boundary effect suggests that the voluntary discrimination of phonetic stimuli is primarily based on categorized information. When this fails, time-consuming extensive reprocessing of the sensory information is assumed to be executed [13]. Thus, according to the results of studies on voluntary phoneme discrimination, sensory and categorical information seem to be managed by separate subprocesses.

0926-6410/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0926-6410(98)00039-1

There exists a discriminative auditory process that does not require the subject's attention. When a sound violates the regularities of the preceding auditory sequence, it elicits an event-related brain potential (ERP) component termed the mismatch negativity (MMN) even when the subject is engaged in some task completely unrelated to the auditory stimulation (for recent reviews of the MMN, see Refs. [20,29,32]). The MMN component appears as a frontocentrally negative wave usually peaking between 100 and 300 ms from the onset of stimulus deviation. A major part of the component is generated in the auditory cortex [15]. In the prototypical MMN paradigm, deviant sounds infrequently replacing a repetitive auditory stimulus (standard) elicit an MMN. MMN elicitation in such situations indicates that the deviant sound was pre-attentively discriminated from the standard stimulus represented by its memory trace. MMN is elicited by changes in synthesized [2,30] as well as naturally produced speech sounds [31].

A large body of evidence supports the notion that the discriminative MMN process relies on sensory representations of the auditory stimuli (for a discussion, see Ref. [8]). For example, auditory recognition masking affects recognition performance and the MMN amplitude similarly [37]. The question addressed by the present study is whether the pre-attentive discrimination reflected by the MMN can utilize in parallel both categorical and sensory stimulus representations of the same stimuli. The phonetic category boundary effect provides a suitable method for this test. However, previous research on MMN elicited by phonetic stimuli yielded mixed results with respect to the present question. Results obtained by increasing the amount of acoustic separation between the deviant and standard stimuli until the deviant stimulus fell into a different phonetic category than the standard showed no category boundary effect on the MMN amplitude [3,19,30,33]. Because these studies confounded acoustic deviation and the separation between phonetic categories, it is quite possible that the large effect of the amount of acoustic deviation on the MMN amplitude [35] drowned out a possible smaller parallel phonetic category boundary effect (see Ref. [19]).

Aulanko et al. [5] showed that MMNm (magnetoencephalographic counterpart of the electric MMN) was elicited by an across-category phonetic contrast (a repetitive CV syllable being occasionally replaced by one starting with a different consonant) even when the fundamental frequency of both standard and deviant CVs was varied. (When the F0 value of a CV is changed the resulting sound is perceived as the same CV voiced in a different pitch.) The authors reasoned that this result demonstrated a phonetic-category-based MMNm. However, since then, MMN was observed also in sequences where several stimulus features were varied: by keeping one feature identical for the majority of the stimuli (e.g., duration), infrequent sounds deviating in this feature elicited the MMN [14,36]. Therefore, the results by Aulanko et al. [5] do not necessarily reflect phonetic representations.

Subjects of Phillips et al. [24] were presented with random sequences of eight CV syllables taken from the /dæ/–/tæ/ continuum (4–4 on both sides of the perceptual boundary, differing from each other only in their voice onset time (VOT)). In the test sequences, the ratio between the synthesized CVs belonging to one or the other phonetic category was 7:1. The elicitation of MMNm by the CVs of the rare phonetic category seemed to demonstrate that MMN(m) can be based on phonetic (categorical) information. However, Phillips et al. [24] also found that the information of the amount of VOT below 30 ms (the upper limit of /dæ/) is lost at an early stage of auditory processing (for compatible evidence based on non-speech stimuli see [16]). Therefore, the auditory sensory memory representations of their different /dæ/ CVs probably did not differ from each other, and thus the acoustical variance employed in this study could not separate phonetic (categorical) and auditory (sensory) effects (at least for the different /dæ/ CVs). Moreover, previous research [36] established that MMN is elicited even when the deviant stimulus differs from a slightly varying standard in the varied feature (e.g., sound intensity). Therefore, the MMNs elicited in the study of Phillips et al. [24] could have been based on auditory sensory traces only.

Cross-language studies represent a further step in pursuing the issue of whether MMN for phonetic material is based on acoustic or phonetic memory codes. Dehaene-Lambertz [9] found that an infrequent change from /da/ to /ba/ elicited a large negativity between 248 and 320 ms in native French speakers. No such component was observed for other contrasts having similar amounts of acoustic difference but falling within the range of the French /ba/ or /da/ category (from dental /da/ to retroflex /Da/, distinguished in Hindi, but not in French, and between two retroflex /Da/s, also both identified as /da/ in French). Voluntary discrimination performance was very low for all within-category discriminations (below 20%), whereas subjects detected the /da/ to /ba/ category change in more than 80% of the trials. The author suggests that this pattern of results was probably due to a genuine within-category reduction in sensory discrimination abilities [9]. In other words, loss of acoustic information may have taken place prior to the stage of the MMN process. In a similar study, Phillips et al. (for preliminary results, see Ref. [24]) showed that no MMNm was elicited in Japanese subjects by /ra/–/la/ contrasts (as the distinction between /l/ and /r/ is not phonetically relevant in Japanese), whereas the same stimuli elicited a sizable MMNm in native English speakers. Again, it is quite possible that the sensory memory representations of /la/ and /ra/ were not sufficiently different in Japanese speakers.

In summary, none of the studies reviewed above provided conclusive evidence of a phonetic category MMN effect. One common problem of these designs was the use of CV syllables, because (for at least some consonants) it is hard to tell whether a process utilized their auditory (sensory) or phonetic (categorical) representations due to the lack of sufficient accuracy of the auditory sensory representations. It has been suggested (e.g., Ref. [34]) that consonants (at least stop consonants) are preserved in a phonetic code (only), whereas vowels are retained in the form of auditory memory traces. This would explain why within-category differences are more discriminable for vowels than for consonants [12]. On the other hand, results of reaction-time (RT) studies demonstrated that vowels are also categorically coded (e.g., Refs. [25,26]). Therefore, the evidence for utilizing both types of codes in parallel may best be found by employing vowels as stimulus material.

In an MMN study using vowel stimuli, Näätänen et al. [22] tested Finnish and Estonian subjects by using contrasts between synthesized isolated vowels. In all conditions, the standard stimulus was a sound perceived as /e/ by speakers of both languages. Occasional deviants differed from the standard in the F2 formant value. Two of the test deviants were perceived similarly in both languages: one as /ö/, the other as /o/. The critical test deviant was close to the Estonian /õ/ vowel, its F2 value falling between those of /ö/ and /o/. However, /õ/ is not present in the Finnish language, therefore the /õ/-like sound was not perceived by Finnish speakers as being typical to any of their native vowels. In Estonians, the MMN amplitude elicited by the /õ/ deviant fell close to those elicited by /ö/ and /o/. However, in Finnish subjects, the amplitude of the corresponding MMN response was smaller than that for either one of the other deviants. It appears that the identification of the deviant as a native-language vowel enhanced the MMN amplitude, suggesting that the phonetic representation of the deviant sound was involved in the discriminative mismatch process.

Magnetoencephalographic measurements showed that the source of the MMNm elicited by native-language vowel deviants was stronger in the left hemisphere, whereas the source of the MMNm to non-native vowel deviants had approximately equal strength in both hemispheres. On this basis, the authors suggested that the MMNs elicited by native-language vowel deviants consisted of two mismatch signals: one based on auditory codes and the other on phonetic representations.

The study of Näätänen et al. [22] included no within-category contrast. All deviant stimuli were selected from F2 ranges outside the standard vowel category. In addition, because of the nature of the two languages, the design was not symmetric: Estonians were not tested by vowels present only in the Finnish language.

2. Experiments

We now report results from a symmetric cross-language design employing two pairs of synthesized isolated vowels. One pair appeared as a within-category contrast in Hungarian while constituting an across-category contrast in Finnish, and the other vice versa. Both Hungarian and Finnish have two different 'e' sounds. The range of the Finnish /e/ vowel (in the four-dimensional F1–F4 formant space) cuts across both Hungarian 'e's (/é/ and /ε/) while the Hungarian /ε/ overlaps the Finnish /e/ and /æ/ (see Fig. 1). Therefore, one can synthesize such a pair of vowel-like sounds that are categorized as /é/ and /ε/ by Hungarian speakers, whereas Finnish speakers perceive both as /e/. And conversely, it is possible to produce a pair of stimuli that are perceived as /e/ and /æ/ by Finnish speakers, whereas Hungarians categorize both as /ε/.

Fig. 1. A schematic illustration of the Hungarian /ε/ and /é/, Finnish /e/ and /æ/, and the common /y/ vowel in the F1–F2 formant space. Notice that the Hungarian /ε/ cuts across the range of both Finnish 'e' vowels while the Finnish /e/ overlaps both Hungarian 'e' vowels. The continuous line between /é/ and /ε/ shows the range of the synthesized Hungarian vowels, the dashed line between /e/ and /æ/ the Finnish continuum. The dots marked on the lines depict the approximate locations of the sounds selected for the EEG experiment.

The present experiment capitalized on this interplay of the two languages. Since infrequent, acoustically deviant sounds elicit the MMN component, both foreign (within-category) and native language (across-category) contrasts could be expected to elicit the MMN in either language group when presented in a passive oddball paradigm (i.e., one 'e' delivered frequently, the other infrequently). However, when the difference between two synthesized 'e' vowels was also phonetically relevant in a given language, we expected this deviance to elicit a phonetic representation based MMN in native speakers of this language. Based on the results by Näätänen et al. [22], we predicted that this phonetic MMN would be additive to the sensory representation related MMN. Thus in each language group, the foreign contrast could only elicit the sensory representation based MMN component, whereas the native contrast was expected to elicit two (sensory and phonetic) MMN subcomponents. If the sensory and phonetic MMN components had similar peak latencies, the resulting MMN should become larger. If the sensory and the phonetic MMN subcomponents peaked in two different latency ranges, the corresponding deviant-minus-standard difference response should display two successive negative peaks. In either case, if the amplitudes of the sensory representation based MMNs elicited by the two 'e' contrasts were approximately equal, the overall size of the MMN response should be larger (either by having a higher amplitude or by being longer) for the native language (across-category) than the foreign language (within-category) contrast in both language groups. This means that Hungarian subjects were expected to show a larger overall MMN to the Hungarian than to the Finnish contrast and, conversely, Finnish subjects were expected to show a larger overall MMN to the Finnish than to the Hungarian contrast.

Such an opposite pattern of results in the two language groups (with regard to the actual vowel pairs) would support the notion that pre-attentive discrimination (reflected by MMN) utilizes both auditory (sensory) and phonetic (categorical) stimulus codes.
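The two predicted outcomes for the native contrast can be illustrated with a toy model. The sketch below is not part of the study: it assumes, purely for illustration, that each MMN subcomponent is a negative Gaussian deflection of equal amplitude, and the width, amplitude, and latencies are hypothetical values. It only shows that additive subcomponents give one deeper peak when their latencies coincide and two successive negative peaks when they do not.

```python
import math


def gaussian_wave(t_ms, peak_ms, amp_uv, width_ms=25.0):
    """Negative-going Gaussian deflection; amplitude and width are illustrative."""
    return amp_uv * math.exp(-0.5 * ((t_ms - peak_ms) / width_ms) ** 2)


def difference_wave(sensory_peak_ms, phonetic_peak_ms=None, amp_uv=-1.0):
    """Toy deviant-minus-standard wave (0-500 ms, 1 ms steps) as a sum of subcomponents.

    phonetic_peak_ms=None stands for the foreign (within-category) contrast,
    which is predicted to elicit the sensory subcomponent only.
    """
    wave = []
    for t in range(501):
        value = gaussian_wave(t, sensory_peak_ms, amp_uv)
        if phonetic_peak_ms is not None:
            value += gaussian_wave(t, phonetic_peak_ms, amp_uv)
        wave.append(value)
    return wave


def negative_peaks(wave, min_depth_uv=-0.1):
    """Latencies (ms) of local minima that are deeper than min_depth_uv."""
    return [
        t for t in range(1, len(wave) - 1)
        if wave[t] < wave[t - 1] and wave[t] < wave[t + 1] and wave[t] < min_depth_uv
    ]


foreign = difference_wave(150)                    # sensory subcomponent only
native_same_latency = difference_wave(150, 150)   # subcomponents overlap in time
native_two_latencies = difference_wave(150, 300)  # subcomponents separated in time

print(negative_peaks(foreign))               # [150]
print(negative_peaks(native_same_latency))   # [150]
print(negative_peaks(native_two_latencies))  # [150, 300]
```

In this toy model the overlapping case doubles the peak amplitude and the separated case lengthens the response, which correspond to the two ways in which the overall MMN could be larger for the native contrast.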

2.1. Experiment 1: categorization and goodness ratings

2.1.1. Subjects and procedure

Twelve native Finnish-speaking (seven females; 20–35 years of age) and 14 native Hungarian-speaking subjects (six females; 18–24 years of age) participated in this experiment. The subject's task was to categorize each sound according to what vowel it resembled most and, subsequently, to evaluate how closely the stimulus matched the 'typical' pronunciation of the selected vowel. A 5-grade scale was used, with 5 representing the perfect match. The subject controlled the delivery of the sounds by pressing a button after marking the vowel category and the goodness score of the previous sound on a prepared form. This task took between 1 and 2 h with relatively large variance across subjects. Finnish subjects were tested in Finland, Hungarians in Hungary. Experiments were conducted using identical stimuli, equipment (NeuroStim stimulation system), procedures, and subject instructions (translated) in the two laboratories.

2.1.2. Stimuli

Stimuli were isolated vowels (intensity: 70 dB SPL, duration: 165 ms, rise/fall times: 2.5 ms) synthesized from the Finnish /e/ to /æ/, and the Hungarian /é/ to /ε/ vowel continuum (Fig. 1). In each continuum, 42 different stimuli were created covering the corresponding range of the first, second, third and fourth formant with equal frequency steps (Table 1). For the sake of simplicity, synthesized vowels of the Hungarian /é/ to /ε/ continuum will be referred to as 'Hungarian vowels', those from the other continuum as 'Finnish vowels' (even though these synthesized vowels are 'equally' Finnish or Hungarian). Two randomized series of 430 stimuli each, one containing the Finnish, the other the Hungarian vowels, were binaurally delivered to subjects via headphones. Each synthesized vowel appeared 10× within the corresponding stimulus sequence. In addition, both stimulus sequences included 10 presentations of a near-prototype synthesized /y/ vowel (/y/ is common to both languages; Fig. 1, Table 1) in order to test how well the subject understood the task. The order of the two stimulus blocks was balanced across subjects within each language group.
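The composition of one rating series can be checked with a short sketch. The labels, the function name, and the fixed seed below are hypothetical, and the plain shuffle is an assumption: the text states only that the series were randomized, not how.

```python
import random


def build_rating_sequence(n_continuum=42, repetitions=10, n_control=10, seed=0):
    """Return a shuffled list of stimulus labels for one continuum block.

    Labels and seed are illustrative; the source does not describe the
    randomization procedure beyond calling the series 'randomized'.
    """
    sequence = [
        f"step_{index:02d}"
        for index in range(n_continuum)
        for _ in range(repetitions)
    ]
    sequence += ["y_control"] * n_control  # near-prototype /y/ presentations
    rng = random.Random(seed)
    rng.shuffle(sequence)
    return sequence


sequence = build_rating_sequence()
print(len(sequence))  # 42 * 10 + 10 = 430
```

The total of 430 stimuli per series thus follows from 42 continuum stimuli presented 10 times each plus the 10 /y/ presentations.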

2.1.3. Vowel synthesis

Test sounds were generated by using a method that produces synthetic vowels from natural glottal excitation in conjunction with a vocal tract model. This approach is based on the separated speech-production model which assumes that speech is produced as a cascade of three independent processes: the glottal excitation, the vocal tract, and the lip radiation effect [11]. In conventional speech synthesis, all three processes are artificially modelled. The stimulus material of the present study was created by a semi-synthetic scheme: the excitation process of the separated speech production model was computed from a human voice, whereas the vocal tract and lip radiation effects were artificially modelled. The natural excitation of the semi-synthetic scheme was determined from a male voice (vowel /a/, normal phonation, F0 = 114 Hz) for all synthesized vowels, using an inverse filtering technique [4]. The vocal tract was approximated by an eighth-order all-pole filter after adjusting the coefficients for the required formant structure. The lip radiation effect was modelled by a fixed differentiator. The formant ranges and frequency step values used to synthesize the test sounds are presented in Table 1.

Table 1
Vowel synthesis

                        F1                  F2                  F3                  F4
                        From  To    Step    From  To    Step    From  To    Step    From  To    Step
Finnish /e/ to /æ/      450   635   4.7     2200  1745  11.7    2700  2470  5.9     3500  3200  7.7
/y/                     290                 1960                2380                3300
Hungarian /é/ to /ε/    400   650   6.4     2260  1610  16.7    3000  2560  11.3    4170  3750  10.8

Formant ranges and frequency step values (all in Hz) used in synthesizing vowels in the Finnish /e/ to /æ/ continuum and the Hungarian /é/ to /ε/ continuum. The resulting test stimuli are equally spaced in frequency along the straight lines connecting the extreme values in the four-dimensional (F1–F4) formant space. The formant frequencies of the /y/ vowel (used in Experiment 3) are also provided.
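The cascade described in this section (excitation, eighth-order all-pole vocal tract filter, fixed differentiator) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the sampling rate and the formant bandwidths are hypothetical, and an impulse train at F0 = 114 Hz stands in for the inverse-filtered natural glottal excitation, which is not available here. The formant values are those reported for the Finnish /e/ selected in Expt. 2.

```python
import math


def all_pole_coefficients(formants_hz, bandwidths_hz, fs_hz):
    """Denominator A(z) of an all-pole filter with one conjugate pole pair per formant."""
    a = [1.0]
    for formant, bandwidth in zip(formants_hz, bandwidths_hz):
        radius = math.exp(-math.pi * bandwidth / fs_hz)  # pole radius from bandwidth
        angle = 2.0 * math.pi * formant / fs_hz          # pole angle from formant frequency
        section = [1.0, -2.0 * radius * math.cos(angle), radius * radius]
        # Multiplying the polynomials cascades the second-order sections.
        product = [0.0] * (len(a) + 2)
        for i, a_i in enumerate(a):
            for j, s_j in enumerate(section):
                product[i + j] += a_i * s_j
        a = product
    return a


def all_pole_filter(x, a):
    """Direct-form recursion: y[n] = x[n] - sum over k >= 1 of a[k] * y[n - k]."""
    y = []
    for n, x_n in enumerate(x):
        acc = x_n
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y


def differentiate(x):
    """First difference, standing in for the fixed lip-radiation differentiator."""
    return [x[0]] + [x[n] - x[n - 1] for n in range(1, len(x))]


def gain_at(a, f_hz, fs_hz):
    """Magnitude response 1 / |A(e^{jw})| of the all-pole filter at f_hz."""
    w = 2.0 * math.pi * f_hz / fs_hz
    real = sum(a_k * math.cos(w * k) for k, a_k in enumerate(a))
    imag = -sum(a_k * math.sin(w * k) for k, a_k in enumerate(a))
    return 1.0 / math.hypot(real, imag)


FS_HZ = 16000                         # hypothetical sampling rate
F0_HZ = 114                           # fundamental frequency given in the text
FORMANTS_HZ = [545, 1967, 2582, 3346]  # Finnish /e/ of Expt. 2
BANDWIDTHS_HZ = [80, 100, 120, 150]    # hypothetical formant bandwidths

n_samples = round(0.165 * FS_HZ)      # 165 ms stimulus duration
period = round(FS_HZ / F0_HZ)
# Impulse train: an assumed placeholder for the natural glottal excitation.
excitation = [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

coefficients = all_pole_coefficients(FORMANTS_HZ, BANDWIDTHS_HZ, FS_HZ)
vowel = differentiate(all_pole_filter(excitation, coefficients))
print(len(coefficients) - 1)  # filter order: 8
```

Four formants with one conjugate pole pair each give the eighth-order filter mentioned in the text. The rise/fall ramps and intensity calibration of the actual stimuli are left out of the sketch.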

2.1.4. Data analysis

The synthesized /y/ vowels were correctly identified by all subjects. Except for a negligible number of instances, the presentations of the other synthesized vowels were always categorized as one or the other 'e' vowel in the subject's native language. Therefore, the goodness scores awarded by a subject to the 10 repetitions of a given test stimulus were summed separately for those instances when the stimulus was categorized as one or the other 'e' vowel of the subject's native language. Thus each synthesized vowel received 2 score values from each subject (one for each 'e' vowel). These scores were then converted to percentages of the theoretical maximum score (50, if all repetitions of the given test stimulus were categorized as the same vowel with a goodness score of 5). The percentage scores were then averaged separately for the two language groups.
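The scoring rule can be written out as a short sketch. The category labels and the example responses below are made up for illustration and are not data from the study; only the rule itself (sum the goodness scores per chosen category over the 10 repetitions, then divide by the maximum of 50) comes from the text.

```python
def percentage_scores(responses, categories, max_score=50):
    """Sum goodness scores per chosen category and express them as % of max_score.

    responses: list of (category, goodness) pairs for the 10 repetitions of one
    stimulus, with goodness on the 1-5 scale.
    """
    totals = {category: 0 for category in categories}
    for category, goodness in responses:
        totals[category] += goodness
    return {category: 100.0 * total / max_score for category, total in totals.items()}


# Hypothetical responses of one Hungarian subject to one stimulus:
# eight times categorized as "e_acute", twice as "epsilon".
example = (
    [("e_acute", 5)] * 6
    + [("e_acute", 4)] * 2
    + [("epsilon", 2)] * 2
)
scores = percentage_scores(example, categories=("e_acute", "epsilon"))
print(scores)  # {'e_acute': 76.0, 'epsilon': 8.0}
```

Because the two percentages share the same 10 repetitions, they can sum to at most 100%, and they reach that value only if every repetition receives a goodness score of 5.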

2.1.5. Results

Fig. 2 presents the group-average score percentages from Expt. 1. The smooth curves crossing over in an intermediate part of each range show that the categorization of the test sounds was unequivocal at the extremes of each continuum in both language groups. Only the identification of the test stimuli around the cross-over points proved to be a harder task. As could be expected, the test sounds receiving the highest score in a given vowel category within the series synthesized for the subjects' native language were regarded better (by ca. 10%) than the 'best' approximations of the same vowel in the other sequence. For example, Hungarian subjects gave an average score of ca. 80% to the test stimulus best approximating the Hungarian /é/ within the Hungarian series, whereas the corresponding 'best' vowel in the Finnish series received only ca. 70% average score from the same subjects. These results demonstrate that phoneme categories are language-specific within certain limits of the physical parameters of the sounds, as could be expected from the literature (see e.g., Refs. [1,27]).

It is important to note that in either language continuum, Finnish and Hungarian subjects have the crossover between their respective vowel categories at different formant values. For example, Finns perceive the synthesized vowels of the Hungarian series as Finnish /e/ below ca. F1 = 554 Hz and above F2 = 1860 Hz, F3 = 2729 Hz, and F4 = 3912 Hz, whereas Hungarians regard the same sounds below F1 = 490 Hz and above F2 = 2027 Hz, F3 = 2842 Hz, and F4 = 4019 Hz as Hungarian /é/. The range between the above two points is perceived by Hungarian speakers as /ε/. Thus, there exists a range of synthesized vowels in each language-continuum, within which the vowel category changes over for native speakers of the given language, whereas subjects of the other language group classify these sounds as members of the same vowel category of their own native language (see Figs. 1 and 2).

2.2. Experiment 2: vowel identification

2.2.1. Subjects

Ten native Finnish-speaking (five females; 20–35 years of age) and 10 native Hungarian-speaking subjects (five females; 22–27 years of age) participated in this experiment. Some of the subjects also participated in Expt. 1. Finnish subjects were tested in Finland, Hungarians in Hungary. Experiments were conducted using identical stimuli, equipment (NeuroStim stimulation system), procedures and subject instructions (translated) in the two laboratories.

2.2.2. Stimuli

Two blocks of 70 test stimuli each, one containing two different Finnish, the other two different Hungarian vowels, were binaurally delivered to subjects via headphones. The Finnish vowel pair consisted of an /e/ (F1 = 545 Hz, F2 = 1967 Hz, F3 = 2582 Hz, F4 = 3346 Hz) and an /æ/ (F1 = 588 Hz, F2 = 1862 Hz, F3 = 2529 Hz, F4 = 3277 Hz), both of which were previously categorized as /ε/ by Hungarian listeners in Expt. 1 (Fig. 2). The Hungarian pair consisted of an /ε/ (F1 = 515 Hz, F2 = 1960 Hz, F3 = 2797 Hz, F4 = 3976 Hz) and an /é/ vowel (F1 = 458 Hz, F2 = 2110 Hz, F3 = 2898 Hz, F4 = 4073 Hz), both of which were previously categorized as /e/ by Finnish listeners in Expt. 1 (Fig. 2). The physical difference between the two vowels was somewhat larger for the Hungarian than for the Finnish pair. Intensity and stimulus duration were as in Expt. 1.
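The formant values of the selected vowels can be related to the continua of Table 1 by linear interpolation between the listed extremes. The sketch below assumes 39 equal intervals between the extremes; this number is not stated in the text but is inferred from the step sizes in Table 1 (for example, (635 − 450) / 4.7 ≈ 39). The position indices are likewise inferred, and the sketch makes no attempt to reconcile them with the 42 stimuli per continuum mentioned for Expt. 1.

```python
# Extreme formant values (Hz) for F1-F4, copied from Table 1.
TABLE_1 = {
    "finnish": [(450, 635), (2200, 1745), (2700, 2470), (3500, 3200)],
    "hungarian": [(400, 650), (2260, 1610), (3000, 2560), (4170, 3750)],
}

# Assumption inferred from the Table 1 step sizes, not stated in the text.
N_INTERVALS = 39


def continuum_point(language, index, n_intervals=N_INTERVALS):
    """F1-F4 (Hz, rounded) at a given position on the straight line between the extremes."""
    return tuple(
        round(start + index * (end - start) / n_intervals)
        for start, end in TABLE_1[language]
    )


print(continuum_point("finnish", 20))    # (545, 1967, 2582, 3346), the Finnish /e/
print(continuum_point("finnish", 29))    # (588, 1862, 2529, 3277), the Finnish /æ/
print(continuum_point("hungarian", 18))  # (515, 1960, 2797, 3976), the Hungarian /ε/
print(continuum_point("hungarian", 9))   # (458, 2110, 2898, 4073), the Hungarian /é/
```

Under this assumption both pairs are separated by nine continuum positions, and since the Hungarian continuum has the larger steps in every formant, this is consistent with the remark that the physical difference was somewhat larger for the Hungarian pair.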

2.2.3. Procedure

Stimulus blocks started with five pairs of sounds. Each pair presented the two synthesized vowels of the block in the same order. The within-pair interval (onset to onset) was 1.2 s, while consecutive pairs were separated by 2.4 s. Subjects were informed that they would hear two different sounds 5× in the same order. They were instructed to listen carefully to these sounds in order to be ready to identify them in the subsequent segment of the stimulus block, where the same two sounds would be presented several times in a random order. The rest of the stimulus block contained a randomized sequence of the two synthesized vowels (50–50%) presented with a constant 1.2 s stimulus onset asynchrony (SOA). Subjects were instructed to press button 1 when hearing the first and button 4 (standard NeuroStim reaction pad) for the second sound of the rehearsal pairs. Accuracy and speed were both emphasized in the instruction given to the subjects. The randomized sequence was started when the subject signalled that he/she was ready to perform the task.

Fig. 2. Categorization of the synthesized vowels. Group-average rating of each synthesized Hungarian (left side) and Finnish vowel (right side) in percent of the maximal possible score (which was 50 if all 10 repetitions of a synthesized sound were categorized identically with the maximal 5 points for closeness to the typical pronunciation of the selected vowel category). Hungarian subjects (first row) categorized the synthesized vowels of both Hungarian and Finnish sequences as either /ε/ (thin continuous line) or /é/ (thick continuous line). Finnish subjects (second row) categorized all synthesized vowels as either /æ/ (thin dashed line) or /e/ (thick dashed line). The vowel pairs selected for the EEG experiment are marked by arrows. The F1–F4 formant parameters of the selected vowels are given under the bottom panel.

2.2.4. Data analysis

Hit rates and RTs were compared with a two-factor ANOVA (subject group × language-continuum of the vowel). All significant (p < 0.05) results are described.
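The group × language interaction in such a design can be sketched as follows. The sketch assumes a 2 × 2 mixed design (subject group between subjects, vowel continuum within subjects), which the text does not spell out. In that case the interaction F equals the squared t statistic of an independent-samples t-test on each subject's difference score (for example, Finnish-continuum minus Hungarian-continuum hit rate). The function name and inputs are illustrative only.

```python
from math import sqrt


def interaction_f(diffs_group_a, diffs_group_b):
    """Group x language interaction F for a 2 x 2 mixed design.

    Each argument holds one within-subject difference score per subject.
    Returns (F, (df_effect, df_error)); F is the squared pooled-variance
    t statistic comparing the two groups' difference scores.
    """
    n_a, n_b = len(diffs_group_a), len(diffs_group_b)
    mean_a = sum(diffs_group_a) / n_a
    mean_b = sum(diffs_group_b) / n_b
    ss_a = sum((d - mean_a) ** 2 for d in diffs_group_a)
    ss_b = sum((d - mean_b) ** 2 for d in diffs_group_b)
    df_error = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df_error
    t = (mean_a - mean_b) / sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return t * t, (1, df_error)
```

With ten subjects per group, the error term has 18 degrees of freedom, which matches the F(1,18) values reported for the interaction.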

2.2.5. Results

Subjects identified the synthesized vowels that belonged to different phonetic categories in their native language faster (RTs of the correct responses: group × language interaction: F(1,18) = 4.28, p < 0.05) and more accurately (the percentage of correct responses: group × language interaction: F(1,18) = 12.90, p < 0.005) than those that fell into the same vowel category (Table 2). The results thus showed the expected phonetic category boundary effect. On the other hand, the subjects’ performance level was relatively high (above 60%) also for the pair of same-category (foreign contrast) vowels (see Table 2).


Table 2
Vowel identification

                     Finnish subjects          Hungarian subjects
                     Hit (%)     RT (ms)       Hit (%)     RT (ms)
Finnish vowels       88 ± 18     381 ± 123     72 ± 27     389 ± 134
Hungarian vowels     61 ± 28     432 ± 94      90 ± 11     355 ± 111

Values are means ± S.D. The percentage of correct identification responses and RTs (in ms) according to subject groups and the ‘language’ (in which the two vowels fall into separate categories) of the vowels.

Therefore, one can assume that both phonetic and auditory stimulus representations of the test sounds were available to the subjects.

2.3. Experiment 3: pre-attentive change detection (MMN)

2.3.1. Subjects and procedure

The same subjects as in Expt. 2 participated in Expt. 3. These two experiments were run in the same session, with Expt. 3 preceding Expt. 2 to avoid the stimuli being attentively processed before the ERP measurements started. Subjects gave informed consent after the nature of the procedure was explained to them. During the recording, the subject was sitting in an electrically and acoustically shielded room reading a self-selected book. The subject was instructed to disregard the sounds presented via headphones. Experiments were conducted using identical stimuli, equipment (NeuroStim stimulation system, NeuroScan ERP recording and analysis system), procedures, and subject instructions (translated) in the two laboratories.

2.3.2. Stimuli

Stimuli were delivered in blocks of 700 sounds. Three different stimuli were randomized together in each block. ‘Finnish’ blocks contained /e/ as a standard (probability of occurrence 82.5%), and /æ/ (15%) and /y/ (2.5%) as deviants. ‘Hungarian’ blocks consisted of /ε/ (82.5%), /é/ (15%), and /y/ (2.5%). The ‘e’ sounds were identical to the ones used in Expt. 2 and the /y/ vowel was as in Expt. 1. Stimuli were presented with a constant 1.2 s SOA. Intensity and duration were the same as in Expt. 1. There were two Finnish and two Hungarian blocks delivered in a counterbalanced order. The very rare /y/ vowels enabled one to make comparisons between the groups/conditions with large physical and categorical separation between the standard and the deviant stimulus.
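The block structure described above can be sketched in a few lines. Because 82.5% and 2.5% of 700 are not whole numbers, the sketch treats the stated probabilities as sampling weights; this is an assumption, as are the absence of any ordering constraints, the generic stimulus labels, and the function names.

```python
import random

BLOCK_LENGTH = 700     # sounds per block
SOA_SECONDS = 1.2      # constant stimulus onset asynchrony

# Probabilities of occurrence shared by the 'Finnish' and 'Hungarian' blocks.
PROBABILITIES = {"standard": 0.825, "e_deviant": 0.15, "y_deviant": 0.025}


def make_block(probabilities, n_sounds=BLOCK_LENGTH, seed=None):
    """Draw a randomized stimulus sequence with the given probabilities."""
    rng = random.Random(seed)
    labels = list(probabilities)
    weights = [probabilities[label] for label in labels]
    return rng.choices(labels, weights=weights, k=n_sounds)


def block_duration_minutes(n_sounds=BLOCK_LENGTH, soa=SOA_SECONDS):
    """Duration of one block given a constant onset-to-onset interval."""
    return n_sounds * soa / 60


block = make_block(PROBABILITIES, seed=1)
minutes = block_duration_minutes()
```

At a 1.2 s SOA, one block of 700 sounds lasts 14 min, so the four blocks amount to roughly 56 min of stimulation.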

2.3.3. EEG recording

EEG was recorded (0.1–100 Hz, −3 dB points, sampling rate 250 Hz) with Ag/AgCl electrodes placed at the midline (Fpz, Fz, Cz, Pz), the two mastoids (Lm and Rm), and the coronal line connecting the left and right mastoid locations via Fz (L1 and L2 over the left, and R1 and R2 over the right hemisphere were positioned at one- and two-thirds, respectively, of the distance between Fz and the mastoid of the corresponding hemisphere). The common reference electrode was attached to the tip of the nose. Changes in the electrooculogram (EOG) due to horizontal eye movements were recorded bipolarly between electrodes positioned near the outer canthi of the two eyes (HEOG). EOG changes caused by vertical eye movements were monitored bipolarly with electrodes below and above the right eye (VEOG).

2.3.4. Data analysis

The EEG was low-pass filtered at 30 Hz. ERPs were averaged separately for each type of stimulus. Epochs started 50 ms before and continued 500 ms after stimulus onset. Trials contaminated by eye movements, blinks, or other extracerebral artifacts exceeding 150 µV at any trace were rejected. For assessing the MMN component, the response to the standard stimulus was subtracted from the corresponding (same stimulus block) deviant responses (‘y’ and ‘e’, separately). Two different latency ranges could be observed for the possible MMN component in the grand-average responses (Fig. 3) and the corresponding difference curves (Fig. 4). An early negative difference wave appeared between 100 and 200 ms (elicited by the Hungarian sequence in both groups and the Finnish sequence in the Finnish group), while another, late negative difference peaked close to 300 ms (elicited by the Finnish sequence in both groups). This was taken as a sign of two MMN subcomponents being elicited (see the predictions in Section 2) and, therefore, beside the overall MMN amplitude measurement (100–300 ms at Fz, which offers the best MMN estimate), the deviant-minus-standard difference amplitude was also calculated in two shorter periods: 100–200 ms (early MMN), and 200–300 ms (late MMN). All amplitude measurements were referred to the average voltage of the pre-stimulus period. The MMN responses to the /y/ deviant were evaluated in the 100–200 ms interval only, because the major part of the deviant-minus-standard (/y/ minus ‘e’) negative difference fell into this time range.

Statistical testing was done by ANOVA (one-way ANOVAs for testing the difference between the deviant and standard response amplitudes and two-way ANOVAs for the MMN (deviant-minus-standard difference) amplitudes for testing group and within- vs. across-category contrast effects; where applicable, repeated measures were employed). All significant (p < 0.05) results are described.
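The measurement steps described above (artifact rejection, averaging, baseline correction, deviant-minus-standard subtraction, and mean amplitude in the three latency windows) can be sketched for a single channel as follows. This is an illustrative outline, not the NeuroScan procedure used in the study: the 30 Hz filtering is omitted, the rejection rule is assumed to be an absolute-amplitude criterion, and all function names are hypothetical.

```python
SAMPLING_RATE_HZ = 250                     # from the EEG recording description
EPOCH_START_MS = -50.0                     # epochs start 50 ms before onset
SAMPLE_STEP_MS = 1000.0 / SAMPLING_RATE_HZ
REJECTION_THRESHOLD_UV = 150.0


def sample_times_ms(n_samples):
    """Time of each sample relative to stimulus onset."""
    return [EPOCH_START_MS + i * SAMPLE_STEP_MS for i in range(n_samples)]


def reject_artifacts(epochs, threshold_uv=REJECTION_THRESHOLD_UV):
    """Drop epochs with any sample beyond +/- threshold (assumed criterion)."""
    return [e for e in epochs if max(abs(v) for v in e) <= threshold_uv]


def average(epochs):
    """Point-by-point mean across epochs."""
    return [sum(column) / len(epochs) for column in zip(*epochs)]


def baseline_correct(wave, times):
    """Refer the waveform to the mean voltage of the pre-stimulus period."""
    baseline = [v for v, t in zip(wave, times) if t < 0]
    offset = sum(baseline) / len(baseline)
    return [v - offset for v in wave]


def mean_amplitude(wave, times, start_ms, end_ms):
    """Mean voltage of the samples falling inside the latency window."""
    window = [v for v, t in zip(wave, times) if start_ms <= t <= end_ms]
    return sum(window) / len(window)


def mmn_amplitudes(deviant_epochs, standard_epochs):
    """Deviant-minus-standard mean amplitudes in the three MMN windows."""
    times = sample_times_ms(len(standard_epochs[0]))
    deviant = baseline_correct(average(reject_artifacts(deviant_epochs)), times)
    standard = baseline_correct(average(reject_artifacts(standard_epochs)), times)
    difference = [d - s for d, s in zip(deviant, standard)]
    return {
        "overall": mean_amplitude(difference, times, 100, 300),
        "early": mean_amplitude(difference, times, 100, 200),
        "late": mean_amplitude(difference, times, 200, 300),
    }
```

Each epoch is a list of voltages in µV sampled every 4 ms from −50 ms onward, so an epoch ending just before 500 ms holds 138 samples.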

2.3.5. Results

2.3.5.1. Responses to the ‘e’ deviants. Fig. 3 summarizes the grand-average frontal (Fz) responses to all stimuli in both groups. The corresponding difference curves are shown in Fig. 4. Both the within- and across-category contrasts elicited significant MMNs in the Hungarian and Finnish speakers. The Hungarian sequence elicited a significant overall MMN in both groups (Table 3). The corresponding MMN peak latencies of the grand-average responses were 180 and 160 ms in the Hungarian and Finnish groups, respectively (Fig. 4). The overall MMN to the Finnish sequence was significant in the Finnish speakers whereas in Hungarian speakers, there was only a tendency for the presence of the overall MMN (Table 3). For the Finnish sequence, two MMN peaks could be discerned in the responses of Finnish subjects, one at 176 ms and the other at 268 ms, while in the Hungarian group, the MMN peak latency was 300 ms (Fig. 4). The early part of the MMN was significant in the same cases where the overall MMN was (i.e., in both groups for the Hungarian sequence and in the Finnish group for the Finnish sequence), whereas the late MMN was significant for the Finnish sequence, but not for the Hungarian one, in both groups (Fig. 3, top panel). The latter result indicates that Hungarians also pre-attentively detected infrequent changes between the two different exemplars of their /ε/ vowel (the Finnish /e/ vs. /æ/ contrast).

Fig. 3. ERP responses to standard and deviant stimuli. Frontal (Fz) group-average responses to deviant (thick line) and standard (thin line) synthesized vowels in the Hungarian (/é/ and /y/ vs. /ε/; left side) and Finnish vowel sequence (/æ/ and /y/ vs. /e/; right side). Top panel: The ‘e’ deviants. Group-averaged responses for the Hungarian subjects are shown on the first row, those for the Finnish subjects on the second row. The early (100–200 ms) and late (200–300 ms) deviant-minus-standard differences (MMN) are marked by black and dark grey filling of the respective areas. Bars under the ERP responses indicate the mean deviant-minus-standard difference amplitude (Standard Error of Mean marked on top) of the corresponding MMN segment. Bottom panel: The ‘y’ deviants. Group-averaged responses for the Hungarian subjects are shown on the first row, those for the Finnish subjects on the second row. The MMN interval measured was filled with light grey shading. Bars under the responses (Standard Error of Mean marked on top) give the mean MMN (deviant-minus-standard) amplitude in the corresponding interval. * p < 0.05 or ** p < 0.01 show the significance level calculated by one-way ANOVAs between the deviant and standard responses in the corresponding interval.

Fig. 4. Deviant-minus-standard difference curves. Frontal (Fz) group-average differences between the responses to deviant and standard synthesized vowels in the Hungarian (/é/ and /y/ vs. /ε/; left side) and Finnish vowel sequence (/æ/ and /y/ vs. /e/; right side). Top panel: The ‘e’ deviants. Group-averaged differences for the Hungarian subjects are shown on the first row, those for the Finnish subjects on the second row. Bottom panel: The ‘y’ deviants. Group-averaged differences for the Hungarian subjects are shown on the first row, those for the Finnish subjects on the second row.

Table 3
Overall MMN amplitudes elicited by ‘e’ deviants

                       Finnish subjects      Hungarian subjects
Finnish sequence       −0.74 ± 0.17 ^c       −0.27 ± 0.15 ^a
Hungarian sequence     −0.49 ± 0.16 ^b       −0.57 ± 0.14 ^c

Grand-average frontal (Fz) MMN amplitudes (in µV ± S.E.M., Standard Error of Mean) elicited by ‘e’ deviants measured from the 100–300 ms (overall MMN) interval. The difference between deviant and standard responses was tested by one-way dependent ANOVA tests (degrees of freedom were 1 and 9; ^a p < 0.1, ^b p < 0.05, ^c p < 0.01).

Across-category contrasts elicited larger overall MMNs than within-category contrasts in both language groups: the relation between the MMN amplitudes elicited in the two vowel sequences was reversed in the two groups. This pattern of results was confirmed by the significant group × language interaction in a two-way ANOVA of the overall MMN amplitudes (group × language interaction: F(1,18) = 4.61, p < 0.05; see also Table 3), which can be explained by the different early–late MMN amplitude patterns found in the two groups (Fig. 3, top panel). In the Hungarian speakers, the Hungarian vowel pair elicited a higher-amplitude MMN than the Finnish pair in the early-MMN range only (one-way ANOVA (language): F(1,9) = 7.27, p < 0.03). In the Finnish subjects, the Finnish pair of ‘e’s elicited a higher-amplitude MMN than the Hungarian pair in the late-MMN range only (F(1,9) = 6.66, p < 0.03).
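The crossover pattern can be checked with simple arithmetic on the grand-average values in Table 3. The sketch below only restates those four means; it assumes that the across-category cells are the ones where the subjects' language matches the sequence's language, and it is not a substitute for the ANOVA, which requires single-subject data. The names are illustrative.

```python
# Table 3 grand-average overall MMN amplitudes in µV,
# keyed by (subject group, vowel sequence).
MMN_UV = {
    ("Finnish", "Finnish"): -0.74,
    ("Finnish", "Hungarian"): -0.49,
    ("Hungarian", "Finnish"): -0.27,
    ("Hungarian", "Hungarian"): -0.57,
}


def contrast_means(table):
    """Mean amplitude for across-category and within-category cells."""
    across = [v for (group, seq), v in table.items() if group == seq]
    within = [v for (group, seq), v in table.items() if group != seq]
    return sum(across) / len(across), sum(within) / len(within)


def interaction_contrast(table):
    """Difference of the Finnish-minus-Hungarian sequence effect between groups."""
    finnish_effect = table[("Finnish", "Finnish")] - table[("Finnish", "Hungarian")]
    hungarian_effect = table[("Hungarian", "Finnish")] - table[("Hungarian", "Hungarian")]
    return finnish_effect - hungarian_effect


across_mean, within_mean = contrast_means(MMN_UV)
crossover = interaction_contrast(MMN_UV)
```

On these means, the across-category MMN averages about −0.66 µV against about −0.38 µV for the within-category contrast, and the sequence effect differs by about −0.55 µV between the two groups.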

2.3.5.2. Responses to the ‘y’ deviants. The rare /y/ sounds, inducing a categorical as well as a large acoustic change, elicited MMNs of approximately equal size in both groups in either of the two sequences (Fig. 3, bottom panel; Fig. 4, bottom panel). Neither the interaction nor any of the main effects were significant in the group × language ANOVA. The peak latencies were 168 and 176 ms in the Hungarian, 152 and 160 ms in the Finnish group for the grand-mean MMN responses in the Hungarian and Finnish sequences, respectively.

2.3.5.3. The positive deviant-minus-standard differences. In some of the difference curves, a positive wave can be observed peaking before 100 ms. This difference did not, however, reach the level of significance for any of the group/sequence/deviant combinations. Another positive wave appears following the MMN response to ‘y’ deviants for both groups/sequences. This latter positivity might reflect P3 (mostly P3a) components elicited in some subjects due to the widely deviant ‘y’ sound (compared to the frequent ‘e’ standard). The P3a component is often elicited when the separation between the standard and deviant stimulus is large. It is a sign of involuntary attention switching to the deviant stimulus [20]. None of the present late positive waves reached the level of significance, however.

3. Discussion

3.1. Summary of the results

The present experiments were aimed at determining whether the pre-attentive change detection process reflected by the MMN component can utilize both auditory sensory memory and phonetic codes of isolated vowels. For this purpose, two series of vowels were synthesized, one from the Hungarian /é/–/ε/, the other from the Finnish /e/–/æ/ continuum. In Expt. 1, Hungarian and Finnish subjects categorized each synthesized vowel and gave scores according to its similarity to the typical pronunciation of the selected vowel category. On the basis of the results obtained, two pairs of synthesized vowels were selected. One pair was predominantly categorized as /é/ and /ε/ by the Hungarian speakers, while the Finnish speakers identified both of them as /e/. The other pair was predominantly categorized as /e/ and /æ/ by the Finnish speakers, while the Hungarians identified both of them as /ε/. In this way, each of the two pairs of synthesized vowels constituted an across-category contrast in one language, and a within-category contrast in the other language.

In Expt. 2, Hungarian and Finnish subjects identified members of the selected synthesized vowel pairs by distinguishing them from each other in separate forced-choice reaction tasks. The results showed the typical phonetic-category boundary effect: both language groups performed faster and more accurately in identifying members of the across- than the within-category contrast pair. Thus the relationship between the performance in the two sequences was reversed in the two groups. Yet the identification performance in the within-category contrast was also well above the chance level.

Finally, in Expt. 3, ERPs were recorded from Hungarian and Finnish subjects in separate passive oddball paradigms employing the synthesized vowel pairs selected in Expt. 1. Both types of infrequent vowel changes elicited an MMN in both language groups. In Hungarian speakers, this MMN had a larger amplitude and earlier latency for the across- than within-category contrast. In the Finnish group, the across-category contrast elicited an MMN with two successive peaks (an early and a late one), whereas the within-category contrast only elicited a single (early) MMN. Overall, the MMN was larger for the across- than within-category contrast. Thus, similarly to the performance levels observed in Expt. 2, the ratio between the overall size of the MMNs elicited in the two sequences was reversed in the two groups.

3.2. MMN based on auditory sensory memory representations

The MMNs elicited by within-category vowel contrasts prove that auditory sensory representations are involved in the pre-attentive detection of infrequent phonetic changes. The MMN elicited by the Finnish contrast in Hungarian subjects as well as the MMN to the Hungarian contrast in Finnish subjects could only be based on auditory sensory representations, because these contrasts fell within the same vowel category in the other language (as was shown in Expt. 1). The notion that the MMN-generating process matches the accuracy of the subject’s individual perceptual discrimination abilities (see e.g. [21]) is supported by the good correspondence found between the subjects’ performance in the identification task and the amplitudes of the MMN responses in the corresponding passive oddball situation. Therefore, the MMNs elicited by within-category deviants in the present experiment reflect pre-attentive change detection based on stimulus representations containing auditory sensory information.

The peak latencies of the sensory representation based MMNs did not show as close a correlation with the reaction times recorded in Expt. 2 as could be expected for sensory-code related MMNs on the basis of previous findings [23,35]. However, the present Expt. 2 employed a forced-choice reaction task with equal probabilities for the two synthesized vowels, whereas in those studies that tested the relationship between the MMN peak latency and RT, subjects had to press a button to rare deviant tones embedded in series of a slightly different standard tone. In these studies, pre-attentive detection of rare deviants (the mismatch process) was a prerequisite of the voluntary response. Therefore, the correlation between the MMN peak latency and RT could be expected. Identification of the synthesized vowels in Expt. 2 required subjects to match each stimulus to one of the two previously presented standards. This task did not rely on the mismatch process (and no MMN would be elicited in this sequence). Therefore, no close correlation between the RTs measured in Expt. 2 and the MMN peak latency obtained in the oddball situation of Expt. 3 should be expected.

3.3. MMN based on categorical phonetic representations

One major result of the present study is that it provides strong support for the notion that MMN can be based on categorical representations. The present results are fully compatible with the assumed categorical processing of speech stimuli. The involvement of phonetic vowel representations in the MMN-generating process is supported by the results showing that the ratio between the overall MMN amplitudes elicited in the two sequences reversed in the two language groups: speakers of each language had an altogether larger MMN to the contrast that crossed a vowel boundary in their native tongue than to the contrast between two identically categorized vowels (see also Table 3). Our prediction for the overall MMN size was thus fulfilled even though the two ‘e’ contrasts elicited somewhat different sensory representation based MMN responses (cf. Section 3.4). As the stimulus sequences presented to the two groups of subjects were identical, non-language related group differences cannot account for these results, especially because the MMNs elicited by the /y/ vowels (common to both languages) were almost identical between the different groups/sequences.

However, it might still be possible that the accuracy of the sensory representation of a given test vowel was lower in those subjects for whom this sound fell close to the center of a native-language vowel category than in those for whom it was near the boundary between two neighboring vowels of their native tongue. This notion is supported by results of Aaltonen et al. [1] who found that the MMN amplitude elicited by a contrast between synthesized isolated vowels close to the prototype of the given vowel category was lower than that to a physically identical amount of vowel deviation near (but still within) the phonetic boundary (the perceptual magnet effect [17]). The conclusions of Dehaene-Lambertz [9] for consonants (see Section 1) are also compatible with this explanation. On this alternative, phonetic (categorical) representations were not necessarily involved in the MMNs elicited by across-category contrasts. However, it should be noted that plastic changes in auditory processing as a consequence of the significance (or its lack) of a range of phonetic auditory features in a given language do not contradict the notion of the existence of categorical phonetic representations. In fact, it is possible that during the first stages of learning to speak a given language, constriction and expansion of the auditory feature space (in accordance with the phonetic structure of that language) is a precursor of true categorical phonetic representations.

The results by Näätänen et al. [22] suggest that the present MMNs elicited by across-category vowel contrasts include a genuine phonetic-code related contribution. Recent results [10,28] further support the distinction between MMNs based on auditory vs. phonetic sensory traces in that they showed somewhat different locations for MMNms (the magnetoencephalographic equivalent of MMN) based on auditory vs. phonetic deviation. Unfortunately, the ERP technique, without the use of dense electrode arrays, does not have sufficient spatial resolution for testing differences in the MMN generator location within a relatively small brain area. The technical background of the present study (common to both laboratories) did not allow us to conduct such analysis. However, the above referred studies strongly suggest that the present difference between the MMNs elicited by within- vs. across-category contrasts also stems from the presence or absence of a genuine phonetic representation based mismatch component. That is, the present MMNs elicited by across-category vowel contrasts are not entirely based on auditory sensory representation related responses (cf. Section 3.4).

3.4. Parallel processing of auditory and phonetic representations

Although our prediction for the overall sizes of the MMN components elicited by the ‘e’ contrasts was fulfilled by the present results, the pattern of MMN responses was somewhat different for the two ‘e’ contrasts. The MMN component elicited by the Hungarian sequence in Finnish subjects peaked early and had a large amplitude, whereas the MMN to the Finnish sequence in the Hungarian group had a long peak latency and a lower amplitude (Fig. 4). These results suggest that the acoustic separation between the two pairs of ‘e’s was not equal, as several studies demonstrated that the MMN peak latency decreases whereas the amplitude increases when the amount of separation between the standard and deviant stimulus is increased [20,35]. Indeed, the physical difference between the Finnish pair of ‘e’s was somewhat larger than that between the Hungarian pair of ‘e’s (see Fig. 2).

Because the latencies of the sensory representation based MMNs were different between the Hungarian and the Finnish pair of ‘e’s, if sensory and phonetic representation based MMN subcomponents were additive, as we expected them to be, different response patterns should be observed also for the MMNs elicited by across-category vowel contrasts (see the predictions described in Section 2). The elicitation of two (sensory- and phonetic-code related) MMN signals was confirmed by the response of the Finnish subjects to the Finnish contrast. In Finnish speakers, the Finnish contrast elicited an MMN with two sequential peaks separated by ca. 100 ms (Fig. 4). The latency of the later peak corresponded well to the sensory representation based MMN elicited by the same sequence in the Hungarian group. This suggests that the early peak of the same response represents the phonetic representation based MMN. The alternative explanation suggesting that the present MMN results only reflect differences in discriminating relevant vs. irrelevant ranges of phonetic features in the two languages (see Section 3.3) cannot account for the elicitation of two successive MMNs by the Finnish contrast in Finnish speakers. This is because if one assumes that the advantage of Finnish speakers compared with the Hungarian ones in discriminating the two Finnish ‘e’s stems only from their better sensory resolution of the feature range separating the two Finnish ‘e’ vowels, then one should expect the Finnish contrast to elicit only an early high-amplitude MMN in the Finnish group rather than two successive medium-amplitude MMNs.

Is it reasonable to suggest that the phonetic representation related MMN could precede the MMN based on the sensory representation of the same stimuli? The categorization process producing phonetic representations for speech stimuli must be very fast and it probably proceeds parallel to (at least a major part of) the auditory sensory analysis. Otherwise one would not be able to understand speech in ‘real-time’. Once two stimuli fall into two distinct categories, contrasting (comparing) them can proceed very quickly. On the other hand, finding some difference between two complex, physically quite similar auditory stimuli (along sensory dimensions) could take somewhat longer. For example, the MMN peak latency elicited by complex missing-fundamental tones differing in their virtual pitch was found to be approximately 250 ms [38,39]. Therefore, it is quite possible that the small acoustic difference between the two Finnish ‘e’s elicited late sensory representation based MMNs in either language group, whereas in Finnish subjects, for whom these synthesized vowels fell across a native language phonetic category boundary, the same stimuli also elicited a significantly earlier phonetic category based MMN response.

The single short-latency MMN peak elicited by the Hungarian contrast in Hungarian subjects, as well as the similar but even higher-amplitude MMN responses to the ‘y’ deviants for either group/sequence, also support the conclusion that phonetic representation based MMNs peak quite early. On the basis of the two MMN responses observed in Finnish speakers to the across-category (native language) contrast, one can conclude that both types of (sensory and phonetic) MMN responses were also elicited in the above cases. Because the Hungarian pair of ‘e’s elicited an early sensory representation MMN (see the responses of the Finnish subjects to this contrast in Fig. 4), in Hungarian subjects the peak latencies of the sensory and phonetic representation based MMNs to the Hungarian contrast fell very close to each other, thus forming a single MMN response. This was also the case for the ‘y’ deviants that were acoustically even more different from either ‘e’ standard than the corresponding ‘e’ deviant. The ‘y’ deviants also represent an across-category phonetic change from the ‘e’ standards in either of the two languages. Therefore, the second main finding of the present study is that the MMNs elicited by auditory (sensory) and phonetic (categorical) deviations can proceed in parallel, which implies that these two forms of stimulus representation coexist (at least for some time) in the human brain.

4. Conclusion

The present results suggest that, for isolated vowels, categorical phonetic and auditory sensory memory representations are both preserved by the human central auditory system and that these codes are utilized in parallel by the pre-attentive change detection process reflected in the MMN component.

Acknowledgements

This study was supported by the Academy of Finland and the Hungarian National Research Fund (OTKA T022800).

References

[1] O. Aaltonen, O. Eerola, Å. Hellström, E. Uusipaikka, A.H. Lang, Perceptual magnet effect in the light of behavioral and psychophysiological data, J. Acoust. Soc. Am. 101 (1997) 1090–1106.

[2] O. Aaltonen, P. Niemi, T. Nyrke, M. Tuhkanen, Event-related brain potentials and the perception of a phonetic continuum, Biol. Psychol. 24 (1987) 197–207.

[3] O. Aaltonen, J. Tuomainen, M. Laine, P. Niemi, Event-related potentials and discrimination of steady-state vowels within phoneme categories: a preliminary study, Scand. J. Logoped. Phonet. 17 (1992) 107–112.

[4] P. Alku, E. Vilkman, Estimation of the glottal pulseform based on discrete all-pole modeling, Proc. Int. Conf. Spoken Lang. Proc., Yokohama, Japan, 1994, pp. 1619–1622.

[5] R. Aulanko, R. Hari, O.V. Lounasmaa, R. Näätänen, M. Sams, Phonetic invariance in the human auditory cortex, NeuroReport 4 (1993) 1356–1358.

[6] A.D. Baddeley, The Psychology of Memory, Basic Books, New York, 1976.

[7] D.E. Broadbent, Perception and Communication, Pergamon, New York, 1958.

[8] N. Cowan, Attention and Memory. An Integrated Framework, Oxford Univ. Press, Oxford, 1995.

[9] G. Dehaene-Lambertz, Electrophysiological correlates of categorical phoneme perception in adults, NeuroReport 8 (1997) 919–924.

[10] E. Diesch, T. Luce, Magnetic mismatch fields elicited by vowels and consonants, Exp. Brain Res. 116 (1997) 139–152.

[11] G. Fant, The Acoustic Theory of Speech Production, Mouton, The Hague, 1960.

[12] D.B. Fry, A.S. Abramson, P.D. Eimas, A.M. Liberman, The identification and discrimination of synthetic vowels, Lang. Speech 5 (1962) 171–189.

[13] H. Fujisaki, T. Kawashima, A model of the mechanisms for speech perception: quantitative analysis of categorical effects in discrimination, Ann. Rep. Eng. Res. Inst. Faculty of Eng., Univ. of Tokyo, Vol. 30, 1971, pp. 59–68.

[14] H. Gomes, W. Ritter, H.G. Vaughan Jr., The nature of preattentive storage in the auditory system, J. Cogn. Neurosci. 7 (1995) 81–94.

[15] E. Halgren, P. Baudena, J.M. Clarke, G. Heit, C. Liégeois, P. Chauvel, A. Musolino, Intracerebral potentials to rare target and distractor auditory and visual stimuli: I. Superior temporal plane and parietal lobe, Electroencephalogr. Clin. Neurophysiol. 94 (1995) 191–220.

[16] M. Joliot, U. Ribary, R. Llinas, Human oscillatory brain activity near 40 Hz coexists with cognitive temporal binding, Proc. Natl. Acad. Sci. U.S.A. 91 (1994) 11748–11751.

[17] P.K. Kuhl, Human adults and infants show a ‘perceptual magnet effect’ for the prototypes of speech categories, monkeys do not, Percept. Psychophys. 50 (1991) 93–107.

[18] A.M. Liberman, K.S. Harris, H.S. Hoffman, B.C. Griffith, The discrimination of speech sounds within and across phoneme categories, J. Exp. Psychol. Human Percept. Perf. 54 (1957) 358–368.

[19] A.C. Maiste, A.S. Wiens, M.J. Hunt, M. Scherg, T.W. Picton, Event-related potentials and the categorical perception of speech sounds, Ear Hear. 16 (1995) 68–90.

[20] R. Näätänen, Attention and Brain Function, Lawrence Erlbaum Associates, Hillsdale, NJ, 1992.

[21] R. Näätänen, K. Alho, Mismatch negativity (MMN): the measure for central sound representation accuracy, Audiol. Neuro-Otol. 2 (1997) 341–353.

[22] R. Näätänen, A. Lehtokoski, M. Lennes, M. Cheour-Luhtanen, M. Huotilainen, A. Iivonen, M. Vainio, P. Alku, R.J. Ilmoniemi, A. Luuk, J. Allik, J. Sinkkonen, K. Alho, Language-specific phoneme representations revealed by electric and magnetic brain responses, Nature 385 (1997) 432–434.

[23] G.P. Novak, W. Ritter, H.G. Vaughan Jr., The chronometry of attention-modulated processing and automatic mismatch detection, Psychophysiology 29 (1992) 412–430.

[24] C. Phillips, A. Marantz, M. McGinnis, D. Pesetsky, K. Wexler, E. Yellin, Brain mechanisms of speech perception: a preliminary report, in: C.T. Schütze, J.B. Ganger, K. Broihier (Eds.), Papers on Language Processing and Acquisition. MIT Working Papers in Linguistics, Vol. 26, MIT Press, Cambridge, MA, 1995, pp. 153–191.

[25] D.B. Pisoni, J. Tash, Reaction times to comparisons within and across category, J. Acoust. Soc. Am. 15 (1974) 285–290.

[26] L. Polka, Linguistic influences in adult perception of non-native vowel contrasts, J. Acoust. Soc. Am. 97 (1995) 1286–1296.

[27] L. Polka, J.F. Werker, Developmental changes in perception of non-native vowel contrasts, J. Exp. Psychol. Human Percept. Perf. 20 (1994) 421–435.

[28] T. Rinne, K. Alho, P. Alku, M. Holi, J. Sinkkonen, J. Virtanen, O. Bertrand, M. Tervaniemi, R. Näätänen, Hemispheric asymmetry of cortical activation as reflected by the mismatch negativity reveals when a sound is processed as speech. Abstracts, 27th Ann. Meeting Soc. Neurosci., New Orleans, October 25–30, 1997, p. 1058.

[29] W. Ritter, D. Deacon, H. Gomes, D.C. Javitt, V.G. Vaughan Jr., The mismatch negativity of event-related potentials as a probe of transient auditory memory: a review, Ear Hear. 16 (1995) 52–67.

[30] M. Sams, R. Aulanko, O. Aaltonen, R. Näätänen, Event-related potentials to infrequent changes in synthesized phonetic stimuli, J. Cogn. Neurosci. 2 (1990) 344–357.

[31] S. Sandridge, A. Boothroyd, Using naturally produced speech to elicit mismatch negativity, J. Am. Acad. Audiol. 7 (1996) 105–112.

[32] E. Schröger, On the detection of auditory deviants: a pre-attentive activation model, Psychophysiology 34 (1997) 245–257.

[33] A. Sharma, N. Kraus, T. McGee, T. Carrell, T. Nicol, Acoustic versus phonetic representation of speech as reflected by the mismatch negativity event-related potential, Electroencephalogr. Clin. Neurophysiol. 88 (1993) 64–71.

[34] M. Studdert-Kennedy, Discovering phonetic function, J. Phon. 21 (1993) 147–155.

[35] H. Tiitinen, P. May, K. Reinikainen, R. Näätänen, Attentive novelty detection in humans is governed by pre-attentive sensory memory, Nature 372 (1994) 90–92.

[36] I. Winkler, P. Paavilainen, K. Alho, K. Reinikainen, M. Sams, R. Näätänen, The effect of small variation of the frequent auditory stimulus on the event-related brain potential to the infrequent stimulus, Psychophysiology 27 (1990) 228–235.

[37] I. Winkler, K. Reinikainen, R. Näätänen, Event-related brain potentials reflect traces of the echoic memory in humans, Percept. Psychophys. 53 (1993) 443–449.

[38] I. Winkler, M. Tervaniemi, M. Huotilainen, R. Ilmoniemi, A. Ahonen, O. Salonen, C.-G. Standertskjöld-Nordenstam, R. Näätänen, From objective to subjective: pitch representation in the human auditory cortex, NeuroReport 6 (1995) 2317–2320.

[39] I. Winkler, M. Tervaniemi, R. Näätänen, Two separate codes for missing-fundamental pitch in the human auditory cortex, J. Acoust. Soc. Am. 102 (1997) 1072–1082.