
Age of Onset and Nativelikeness in a Second Language: Listener Perception Versus Linguistic Scrutiny

Feb 05, 2023

Page 1: Age of Onset and Nativelikeness in a Second Language: Listener Perception Versus Linguistic Scrutiny

Language Learning ISSN 0023-8333

Age of Onset and Nativelikeness in a Second Language: Listener Perception Versus Linguistic Scrutiny

Niclas Abrahamsson and Kenneth Hyltenstam

Stockholm University

The incidence of nativelikeness in adult second language acquisition is a controversial issue in SLA research. Although some researchers claim that any learner, regardless of age of acquisition, can attain nativelike levels of second language (L2) proficiency, others hold that attainment of nativelike proficiency is, in principle, impossible. The discussion has traditionally been framed within the paradigm of a critical period for language acquisition and guided by the question of whether SLA is constrained by the maturation of the brain. The work presented in this article can be positioned among those studies that have focused exclusively on the apparent counterexamples to the critical period. We report on a large-scale study of Spanish/Swedish bilinguals (n = 195) with differing ages of onset of acquisition (<1–47 years), all of whom identify themselves as potentially nativelike in their L2. Listening sessions with native-speaker judges showed that only a small minority of those bilinguals who had started their L2 acquisition after age 12, but a majority of those with an age of onset below this age, were actually perceived as native speakers of Swedish. However, when a subset (n = 41) of those participants who did pass for native speakers was scrutinized in linguistic detail with a battery of 10 highly complex, cognitively demanding tasks and detailed measurements of linguistic performance, representation, and processing, none of the late learners performed within the native-speaker range; in fact, the results also revealed that only a few of the early learners exhibited actual nativelike competence and behavior on all measures of L2 proficiency that were employed. Our primary interpretation of the results is that nativelike ultimate attainment of a second language is, in principle, never attained by adult learners and, furthermore, is much less common among child learners than has previously been assumed.

Keywords adult second language acquisition; age of onset; critical period hypothesis (CPH); listener perception; maturational constraints; multiple-task design; nativelikeness; near-nativeness; L1 Spanish; L2 Swedish

This study was made possible by a research grant to K. H. and N. A. from The Bank of Sweden Tercentenary Foundation (grant No. 1999-0383:01). We are greatly indebted to the participants of the study, who without hesitation agreed to go through the 4-hour-long and quite demanding test session. We would also like to thank Johan Roos for carrying out the testing and data collection, Katrin Stölten for doing the VOT analyses, our colleagues at the Centre for Research on Bilingualism at Stockholm University as well as the anonymous Language Learning reviewers for their comments on earlier versions of this article, and Thomas Lavelle for correcting and improving our English writing.

Correspondence concerning this article should be addressed to Niclas Abrahamsson, Centre for Research on Bilingualism, Stockholm University, SE-106 91 Stockholm, Sweden. Internet: [email protected]

Language Learning 59:2, June 2009, pp. 249–306. © 2009 Language Learning Research Club, University of Michigan

Introduction

In Larry Selinker’s seminal article “Interlanguage” (Selinker, 1972), published during the first phase of theory development in second language acquisition (SLA), a number of central concepts were discussed that together came to play a crucial role in second language (L2) research. Although the term interlanguage had already been introduced a few years earlier (Selinker, 1969), it was through the 1972 article that it became established as a general term referring to the separate linguistic system responsible for the learner’s observable version of the L2. In the same article, the term fossilization was introduced, as well as the idea of a number of psycholinguistic, or cognitive, processes governing the successive growth and change of the interlanguage.

In his article, which focused exclusively on adult L2 learners, Selinker also dealt with the question of the relatively few individuals who despite a late age of onset of acquisition succeed in reaching levels of L2 “competence” comparable to that of native speakers; there, Selinker talked about “absolute success” (1972, p. 212). In this context, he suggested that these individuals may constitute approximately 5% of all adult L2 learners. The reason for mentioning these learners only in passing was that he wanted to exclude them from the domain of SLA research. These individuals, Selinker argued, are so unique and make use of such different psychological processes in their learning that they need not be considered at all in L2 theory building; “these successful learners may be safely ignored” (p. 212).

Despite the obvious arbitrariness of Selinker’s 5% estimate, it has been perpetuated by the SLA literature numerous times over the years. As a result, many students of SLA, including researchers, have treated Selinker’s guess more or less as an established fact. On the other hand, there are researchers who have questioned the 5% figure, for different reasons and from different angles. Although some have indeed suggested that a much larger number (say 10–15%) of adult learners reach nativelike levels in the L2 (see, e.g., Birdsong, 1999, 2005a; Seliger, Krashen, & Ladefoged, 1975), others claim that entirely nativelike adult L2 learners do not exist at all. This latter position is taken by, for example, Bley-Vroman (1989), who holds that adult L2 acquisition comes about through general, cognitive learning strategies, as opposed to the linguistically domain-specific principles that govern children’s acquisition of a first language (L1). To learn a language fully on the basis of general cognitive learning strategies alone is, according to Bley-Vroman, not only difficult but impossible, and, therefore, “virtually no normal adult learner achieves perfect success, if what one means thereby is development of native-speaker competence” (p. 44). However, if absolute success does occur in a few, rare adult L2 learners, this could, according to Bley-Vroman, be given the same pathological status as the exceptional phenomenon of failure in L1 acquisition.

Bley-Vroman’s work (1989) can be said to be representative of the Universal Grammar (UG) paradigm and of those researchers who argue that adult L2 learners no longer have access (or have only partial access) to the innate universal principles and constraints that are responsible for language development (e.g., Epstein, Flynn, & Martohardjono, 1996; Eubank & Gregg, 1999; Schachter, 1989). These researchers all take the theoretical position that adult L2 speakers’ competence is different from that of L1 speakers; adult language learning simply does not lead to absolute nativelikeness. Gregg (1996) formulated this idea quite categorically when claiming that “truly native-like competence in an L2 is never attained” (p. 52). Similarly, there are researchers outside the UG paradigm who, on both theoretical and empirical grounds, have suggested that the number of absolute nativelike adult learners should be zero; for example, Long and Robinson (1998) assumed that the maximal level of L2 attainment should be what is frequently labeled “near-native” rather than “nativelike,” which is an assumption that we ourselves have made previously.¹

The existence or nonexistence of late, nativelike L2 learners has generally been discussed in relation to the critical period hypothesis (CPH; Lenneberg, 1967). If such individuals do exist, many researchers claim that they would constitute the evidence necessary to reject the CPH or any other hypotheses proposing biological/maturational constraints on language learning. In fact, Long (1990) argued that one single post-critical-period L2 learner with an underlying competence indistinguishable from that of native speakers would suffice to reject the CPH. The work presented in this article can be positioned among those studies that have focused exclusively on the apparent counterexamples to the critical period, in order to test the hypothesis that language acquisition is maturationally constrained.


Previous Research on the Incidence of Nativelikeness

Since the late 1960s, a large number of studies have compared groups of early and late learners’ ultimate attainment of an L2 (e.g., Asher & García, 1969; Johnson & Newport, 1989; Munro & Mann, 2005; Patkowski, 1980). In general, such studies have found a strong negative correlation between L2 learners’ age of onset (AO) of acquisition and some measure of their L2 proficiency; whenever nativelike behavior has been observed, this has been associated exclusively with younger learners. Of course, the prospect of finding highly advanced, potentially nativelike adult learners is rather small when using samples of randomly selected individuals with varying degrees of ultimate attainment. For example, the study by Johnson and Newport of 46 Chinese and Korean learners of L2 English demonstrated not only a strong negative correlation (r = −.77) between AO of L2 acquisition and scores on an English 276-item grammaticality judgment test (GJT) but also that no participants with AO beyond 7 years scored within the native-speaker range. Similarly, in their partial replication of the Johnson and Newport study, Bialystok and Miller (1999) found participants in two learner groups (L1 Spanish and L1 Chinese) who performed like English native speakers on the GJT up to AO 8 years, whereas no participants with AO beyond this age were reported to score within the range of native speakers.

However, in various other replications of the Johnson and Newport study (1989; henceforth J&N’89), late learners have indeed been found to perform within the native-speaker range. What these replications have in common is that the selection of participants departs from the original study in some crucial ways, the two most important adjustments being, first, the extension of the minimum length of residence in the host country, from J&N’89’s 5 years to at least 10 years, and, second, the choice of participants with L1s other than Chinese and Korean. In a partial replication using two different groups of Dutch learners of English, all of whom had begun their L2 learning after age 12, Van Wuijtswinkel (1994) reported 8 of 25 learners in one learner group and 7 of the 8 learners in another group with performance scores within the range of native-speaker performance. In their study of 200 Korean learners of English, Flege, Yeni-Komshian, and Liu (1999) found six participants with AO ≥ 12 who performed like natives on a subset of J&N’89 stimulus sentences (although none with an AO beyond 16), but in pronunciation tests, they found no L2 participants with AO above 9 who spoke English without a detectable foreign accent. Furthermore, Birdsong and Molis (2001) found in their J&N’89 replication with Spanish learners of English that more than 20% of the late learners (defined as those with AO ≥ 17 years) performed within the J&N’89 native-speaker range on the GJT; in fact, a majority of the participants with AO ≥ 12 years performed within this range. Finally, in a replication with a subset of the J&N’89 sentences but with Hungarian learners of English and no native controls, DeKeyser (2000) identified 10 out of 42 late learners who scored within the range of child learners. However, in the absence of native control speakers, it is difficult to evaluate the incidence of nativelike performance among DeKeyser’s learners, although a qualified guess is that native speakers would score at or close to ceiling, as has been the case in the original J&N’89 study and in the various other replications.

On the basis of these and similar studies, one might be tempted to conclude that nativelike attainment of an L2 is indeed possible, even common, among individuals who started their acquisition after childhood. However, as we have argued earlier (see Hyltenstam & Abrahamsson, 2003b; see also 2000, 2001), postpuberty (including adult) learners may well attain the same linguistic knowledge and exhibit the same linguistic behavior as native speakers in certain (limited) areas of the target language without thereby being indistinguishable from mother-tongue speakers in all relevant respects. Our claim is that much of the research that is frequently taken as evidence for the existence of late, nativelike L2 learners suffers from Type II errors, either because it has been based on language tests that are too easy and involve quite simple structures (e.g., the J&N’89 GJT) or because language production data have not been analyzed in sufficient detail. Both of these shortcomings tend to result in ceiling effects and unwarranted claims for nativelikeness (Hyltenstam & Abrahamsson, 2003b, p. 570; Long, 2005).

Therefore, what appears to be more compelling evidence for adult-learner nativelikeness can be found in studies that have focused exclusively on late, high-proficiency L2 speakers who have been preselected, or screened, for potentially nativelike verbal behavior. Characteristic of these studies is that they have employed quite sophisticated techniques for linguistic scrutiny, through (a) great stringency and detail of the analyses, (b) demanding tests and tasks (e.g., through the choice of unusual target-language structures that are known to be difficult for learners), and/or (c) the use of multiple-task designs covering various linguistic domains rather than one or a few isolated structures, phenomena, or domains. These methodological features will be illustrated next through a review of a sample of studies.

In a pronunciation study by Moyer (1999), in which 24 highly proficient and highly motivated American learners of German participated, four speech elicitation techniques were used, representing four different speech modes: word-list reading, sentence reading, paragraph reading, and free speech production. Results revealed that the word-list task produced the highest incidence of nativelikeness, as judged by a panel of four native German listeners; this was followed by sentence reading, paragraph reading, and free speech production, on which most learners failed to pass for native speakers. Only one individual among the original 24 advanced learners passed for a native speaker in all four speech modes. The Moyer study highlights an important problem with those pronunciation studies in which conclusions about the critical period have been based solely on late L2 learners’ accent-free reading of rehearsed words, sentences, or short passages (e.g., Bongaerts, Mennen, & van der Slik, 2000; Bongaerts, Planken, & Schils, 1995; Bongaerts, van Summeren, Planken, & Schils, 1997; Neufeld, 2001; for overviews, see Bongaerts, 1999; Long, 2005). The typical result of such studies is that quite a few participants pass for native speakers when their pronunciation is judged by a panel of native listeners. However, as these results concern rehearsed reading (sometimes even imitation; see Neufeld, 1979) rather than free speech production (as in the Moyer study), it is not unlikely that they may reflect “language-like” behavior (Long, pp. 297ff.) rather than actual L2 proficiency.²

In a phonetic study of five intermediate and five advanced English-speaking late L2 learners of Spanish, Colantoni and Steele (2006) investigated one specific area of phonological acquisition: stop-liquid sequences. Despite this obvious limit in scope, the study surpassed most other CPH studies in its high degree of detail and scrutiny. Rather than being rated by native-speaker judges, the readings of 44 words from each participant were analyzed acoustically with regard to three phonetic properties of stop-liquid clusters: stop voicing, rhotic length and voicing, and epenthesis rate and vowel length. Results revealed that only one of the advanced learners (AO 24 years) and none of the intermediate learners exhibited truly nativelike behavior (as defined by the analyzed speech of 10 native control speakers) on all three properties. In a similarly detailed study, Birdsong (2007) reported on aspects of the pronunciation of 22 late L1 English learners of L2 French, all of whom had resided in the Paris area for 11 years on average. It was shown that two participants performed within the range of 17 native speakers of French on three measures: vowel length, voice onset time, and global pronunciation, as rated by three native judges. Although the incidence of nativelikeness was more or less identical in these two studies (9–10% of the samples), interestingly enough the conclusions drawn by the authors diverge significantly: Whereas Colantoni and Steele concluded that although possible, nativelike attainment of an L2 by adults “is clearly exceptional” (p. 71), Birdsong described his results as “impressive rates of nativelike pronunciation” (p. 112).³
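The "9–10% of the samples" figure can be checked directly from the counts given above: one nativelike learner among the ten participants in Colantoni and Steele (2006), and two among the 22 participants in Birdsong (2007). The short Python sketch below reproduces that arithmetic; the dictionary layout and the function name `incidence_percent` are illustrative choices and are not taken from either study.

```python
# Counts as reported in the paragraph above; the data layout is illustrative only.
reported = {
    "Colantoni & Steele (2006)": (1, 10),  # 1 nativelike among 5 intermediate + 5 advanced learners
    "Birdsong (2007)": (2, 22),            # 2 nativelike among 22 late learners
}


def incidence_percent(nativelike: int, sample: int) -> float:
    """Return the share of a sample classified as nativelike, in percent (one decimal)."""
    return round(100 * nativelike / sample, 1)


rates = {study: incidence_percent(k, n) for study, (k, n) in reported.items()}

for study, rate in rates.items():
    print(f"{study}: {rate}%")
```

The two rates come out at 10.0% and 9.1%, which makes the point of the passage concrete: near-identical incidence figures were summarized as "clearly exceptional" in one study and as "impressive" in the other.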

In the area of grammar, Coppieters (1987) found that among 21 apparently nativelike and highly educated adult learners of French as a foreign language, none performed within the range of native controls on a syntactic/semantic judgment task, covering a variety of (UG and non-UG) morphosyntactic constructions. In addition, from analyses of recorded interviews, it was observed that many of them produced errors in structures that were actually mastered in the judgment task. However, in a replication of the Coppieters (1987) study, although using other criteria for participant selection, Birdsong (1992) reported that no fewer than 15 of 20 late foreign language learners (AO 11–28 years) of French, all of whom had also resided in France for a minimum of 3 years, performed within the native-speaker range. Similarly, in a UG-oriented study of the accessibility of Subjacency and the Empty Category Principle (ECP), White and Genesee (1996) found no difference in test performance between a group of near-native speakers of L2 English (including 16 individuals with AO ≥ 12) and a group of native English participants, although significant differences were reported between a nonnative group and the native group. Furthermore, no age effects within the participant groups were observed. The authors concluded that access to these UG principles is unaffected by age, but they remained noncommittal about other linguistic domains (White & Genesee, p. 262). However, a problem with this study is that most of the participants were L1 speakers of French, a language in which Subjacency and the ECP work largely as they do in English. It is therefore not clear why one should expect near-native participants to have any particular difficulties with sentences including these features.

In contrast, Montrul and Slabakova (2003) focused on an area known to be very difficult for L2 learners of Spanish: the morphological and semantic properties of aspectual tenses. With the focus set on highly proficient learners with English as an L1, they investigated three participant groups: 17 near-natives, 23 superior learners, and 24 advanced learners, all of whom had begun their Spanish studies in high school (age ≥ 12 years).⁴ Two linguistic tasks were employed: one sentence-conjunction task and one truth-value judgment task, both of which were reported to be very difficult, even for native speakers. Yet the results showed that 19 out of the total of 64 L2 participants performed within the range of 20 native control speakers on both tasks; 12 of these were found in the group of 17 near-natives. Therefore, these researchers concluded that a nativelike command of the Spanish aspectual system does not become unattainable after a certain age, although, in line with White and Genesee (1996), they acknowledged the possibility of one or several critical periods for other linguistic structures or domains (Montrul & Slabakova, p. 384).

A recent study by van Boxtel, Bongaerts, and Coppen (2005; based on van Boxtel, 2005) is another good example of research that focuses on advanced late learners’ ability to acquire structures or details of the target language that are known to be extremely difficult for L2 learners. In this case, the target area was dummy subject constructions in Dutch, for “which no explicitly formulated rules are available” (p. 376). The L2 proficiency of 43 very advanced late immigrant learners (AO ≥ 12 years) with either German, French, or Turkish as the L1 was scrutinized with two tests: one elicited imitation task and one sentence preference task. The learners’ performances were compared to those of 44 native speakers of Dutch. The study identified eight learners (three German, four French, and one Turkish) who scored within the native-speaker range on both tasks and on all sentence types. It was concluded that implicit acquisition (from L2 input alone) of unusual and difficult structures is indeed possible even for late learners and that results of this kind do not support the CPH.

Finally, and most relevant for the present study, there are a few studies that have tried to identify late L2 learners with nativelike competence and behavior across a wide range of tasks, thereby covering as many linguistic domains as possible. This approach to “global nativelikeness” was first taken by Ioup, Boustagui, El Tigi, and Moselle (1994) in their influential study of two exceptional adult learners of Arabic. The two learners, called Julie and Laura, both had English as an L1 and were chosen for the study because native speakers of Arabic did not usually notice their nonnative background. At the time of the study, both learners were residents of Egypt. Julie had no knowledge of Arabic before moving from Britain to Cairo at the age of 21 years; she was married to an Egyptian man, had two children, and spoke only Arabic with her family and her husband’s relatives. Her length of residence in Egypt was 26 years, and she had learned Arabic exclusively through informal exposure. Laura, on the other hand, was a native speaker of American English and had received extensive formal exposure to Arabic at various universities. She was married to an Egyptian man and had been living in Egypt for 10 years at the time of the study. Thus, both of these L2 speakers could be thought of as being optimally immersed in the L2 as well as in Egyptian society and culture. The two learners were subjected to a large test battery, consisting of six measures that covered speech production (free speech judged by a native-speaker panel), accent identification (two different tests), and grammatical proficiency (translation, grammaticality judgment, and interpretation of anaphora). Although both Julie and Laura performed extremely well on all these tasks (in fact, Julie performed better than many native speakers on the accent identification tasks), both performed significantly below the native-speaker range on aspects of grammatical intuition.

In a recent study of L2 mastery “across-the-board,” Marinova-Todd (2003) investigated 30 highly advanced late learners’ ultimate attainment of English. The participants were selected on the basis of recommendations by native speakers who found them to be highly proficient in English. The learners had varying L1 backgrounds and they had arrived in the United States between the ages of 16 and 31 years. Their ages of first exposure to English were 10–21 years and they had a length of residence between 5 and 20 years. A battery consisting of nine instruments was used, covering four linguistic domains: pronunciation (elicited and spontaneous speech), lexicon (receptive vocabulary size and productive lexical diversity), morphosyntax (grammaticality judgment, production, and sentence comprehension), and language use/pragmatics (politeness strategies and narrative ability). When compared to the performances of 30 native English speakers, most L2 participants did not pass for native speakers on all nine measures. However, two individuals did so, and one additional learner performed within the native-speaker range on all but one measure (vocabulary size). All three learners arrived in the United States at age 18 years and had a length of residence of 5–7 years in the country. Worth pointing out is that these three late learners had, prior to their arrival, received formal English instruction in their home countries for 5 years on average, which means that they were about 13 years old (not adults) when they actively began to learn English, a fact that makes them less comparable to, for example, Julie and Laura in the Ioup et al. (1994) study.

As has become clear from the above review of the research literature, the reports on the incidence of nativelikeness in late L2 learners vary enormously, from quite high rates (e.g., Birdsong, 1992; Birdsong & Molis, 1998; Montrul & Slabakova, 2003; Van Wuijtswinkel, 1994; White & Genesee, 1996), through more moderate rates (e.g., Birdsong, 2007; Bongaerts, 1999; Colantoni & Steele, 2006; Flege et al., 1999 [for grammar]; Marinova-Todd, 2003; Moyer, 1999; van Boxtel et al., 2005), to zero occurrences (e.g., Bialystok & Miller, 1999; Coppieters, 1987; Flege et al. [for accent]; Ioup et al., 1994; Johnson & Newport, 1989). In our own empirical studies of very advanced L2 speakers, in which we have tried to adopt stringent elicitation methods and techniques of analysis, we have consistently failed to identify nativelike late learners of L2 Swedish. So far, we have interpreted this as support for our claim that nativelike L2 learners with an AO of acquisition beyond puberty are extremely difficult, or even impossible, to find (see Abrahamsson & Hyltenstam, 2008; Abrahamsson, Stölten, & Hyltenstam, in press; Hyltenstam, 1992; Hyltenstam & Abrahamsson, 2003a; see also Stölten, 2005, 2006). Here, as in a few other studies, the question of the actual ultimate attainment resulting from childhood learning has also been highlighted; in fact, it now seems clear that differences do exist even between early learners’ ultimate attainment and native-speaker proficiency (cf. the results reported by Bialystok & Miller, 1999; Butler, 2000; Flege, Munro, & MacKay, 1995; Flege et al., 1999; Lee, Guion, & Harada, 2006; McDonald, 2000; Tsukada, Birdsong, Bialystok, Mack, Sung, & Flege, 2005; for a discussion, see Hyltenstam & Abrahamsson, 2003b). Obviously, if very short delays in the onset of acquisition can be shown to have effects on the ultimate level of L2 proficiency, this will have important implications for the CPH or any other theory of maturational constraints in SLA.

The Present Study

A central point of departure for the study to be described here is that as long as there are no accounts of a single adult learner who, in all relevant respects, can be shown to have attained proficiency in an L2 that is identical to a native speaker’s, there can be no claims for the existence of such learners (Hyltenstam & Abrahamsson, 2003b; cf. also Long, 1990, 1993). Therefore, rather than investigating the ultimate attainment of a representative sample of the L2 learner population, the present study aimed at identifying individuals who would potentially constitute the evidence necessary to reject the CPH. In other words, the study positions itself among those previous studies that have focused exclusively on learners who (at least) seem to have attained a nativelike level of L2 proficiency.

The present study was conducted in two consecutive parts, referred to here as Part I and Part II, respectively. Part I concerned native-listener perception of nativelikeness of a large pool (n = 195) of very advanced L2 speakers of Swedish with AOs of L2 acquisition between <1 and 47 years. In addition, the native-listener judgments resulting from this part of the study also served as a screening that formed the basis for participant selection for the second part.

Part II consisted of a detailed linguistic scrutiny of nativelikeness of a subset (n = 41) of L2 speakers with AOs 1–19 years, all of whom had passed for native speakers of Swedish with a majority of the native listeners in Part I.

As described in greater detail below, the study aimed at incorporating three important methodological features that (with a few exceptions) have been lacking in CPH-related research: (a) a specified understanding of the concept of “nativelikeness”; (b) initial screening of participants; and (c) an in-depth scrutiny of actual linguistic nativelikeness. The lack of these features in earlier studies has contributed significantly to what we see as an overestimation of the incidence of nativelikeness.

The first feature concerns the way in which the concept of “nativelikeness” can be understood. In order to address this question in greater detail, we distinguish three different ways in which the concept of “nativelikeness” has been interpreted:

Interpretation 1: to self-identify as a nativelike speaker of the target language (e.g., Piller, 2002; Seliger, 1978; Seliger et al., 1975)

Interpretation 2: to be perceived as a nativelike speaker by native speakers of the target language (e.g., Bongaerts, 1999; Moyer, 1999; Neufeld, 2001)

Interpretation 3: to be a nativelike speaker of the target language (e.g., Birdsong, 1999; Bley-Vroman, 1989; Long, 1990)

It is, of course, difficult, or even impossible, for L2 users to judge for themselves whether they pass for native speakers (i.e., Interpretation 1), a fact dealt with only in passing in this article. Although we acknowledge the psychosocial reality of self-identification as a nativelike speaker (i.e., regardless of what native speakers think or of what a linguistic analysis would reveal), we believe that this interpretation of nativelikeness may be safely disregarded in the following discussion, primarily because it clearly falls outside the scope of what is usually meant by “nativelikeness” or “native speaker” in scholarly discourse. Whether someone is perceived as a native speaker by (actual) native speakers (Interpretation 2), on the other hand, is a central aspect of nativelikeness. These speakers are part of a language community in which they are joined by a reciprocal identification as members of this community based on linguistic characteristics, which is a clearly sociolinguistic issue. The question of whether someone is linguistically like a native speaker (Interpretation 3) constitutes, in our view, a basically psycholinguistic problem, although it may also be viewed from other perspectives (e.g., social or pragmatic).

In our current research, we are primarily interested in Interpretation 3 of nativelikeness, that is, the extent to which L2 learners exist who are nativelike in their language competence and behavior (see, e.g., Abrahamsson & Hyltenstam, 2008; Abrahamsson et al., in press; Hyltenstam & Abrahamsson, 2003a). However, the study reported in this article covers all three interpretations of nativelikeness in the following manner. First, all participants included in Part I of the study are highly advanced L2 learners, all of whom identify themselves as potentially nativelike or near-native speakers of Swedish. Furthermore, Part I of the study investigates the number of these advanced learners who are perceived as nativelike speakers by native listeners. Finally, Part II of the study investigates how many of these perceived nativelike speakers, in fact, behave linguistically like native speakers of Swedish on a broad selection of language proficiency measures and tasks.

The second methodological feature that this study tried to capture is initial screening of participants. Because the research agenda advanced by the study concerns the incidence of actual, linguistic nativelikeness, the selected participants need to be highly advanced L2 speakers. As correctly pointed out by Long (1993), “[t]here is no value in studying obviously non-native-like individuals intensively in order to declare them non-native-like” (p. 204); therefore, screening participants for potential nativelikeness is a crucial procedure prior to language testing. In previous research, such procedures have been adopted in one of two ways. The first is “informal screening,” in which recruitment of participants comes about through impressionistic judgments by teachers or school administrators (Bongaerts, 1999; Hyltenstam, 1992; Moyer, 1999; van Boxtel et al., 2005) and/or the researchers themselves (e.g., Colantoni & Steele, 2006; Hyltenstam & Abrahamsson, 2003a; Marinova-Todd, 2003; Neufeld, 1979), most often as a result of “word-of-mouth” and “friends of friends” networking. The other way to select participants is through more “formal screening” procedures, in which expert judges or larger panels of linguistically naïve native listeners make judgments of recorded speech samples, usually from a large pool of candidates identified through, for example, newspaper advertisements or posters on university campuses (e.g., White & Genesee, 1996). As mentioned earlier, the sessions with native listeners in Part I of the present study functioned as an extensive and careful formal screening procedure for participant selection for Part II.

The third methodological feature that this study tried to incorporate is the in-depth scrutiny of actual, linguistic nativelikeness. Because we are obviously dealing with very advanced, seemingly nativelike individuals, our primary challenge is to avoid Type II errors, that is, claims of nativelikeness for L2 speakers whose linguistic knowledge and behavior ought to be described as near-native rather than nativelike. One way of doing this is to avoid ceiling effects. This calls for testing procedures in which linguistic measurement is characterized by a sufficient degree of scrutiny: The tests and tasks should be demanding, and the linguistic analyses should be made in great detail and with extreme care. Another way to avoid Type II errors and “false positives” is to include diverse measures of nativelikeness, representing different aspects and levels of linguistic proficiency, rather than limit the scope of inquiry to one or a few linguistic domains. One distinction that is sometimes made concerning the relevance of the critical period and maturational constraints is the one between phonology and grammar. Some researchers (e.g., Scovel, 1988) maintain that because of the physiological and muscular basis of articulation, a critical period can be expected only for the ultimate attainment of pronunciation but not necessarily for higher order linguistic phenomena, such as morphology or syntax. However, still others claim that only morphosyntactic features specified by UG, as opposed to non-UG features, constitute the relevant domain of research on nativelikeness and the critical period, and most of these researchers would agree that core UG features, as opposed to peripheral ones, should be the focus of attention (e.g., Eubank & Gregg, 1999). In contrast, there are researchers who suggest that the focus should be on aspects of the L2 or of L2 acquisition that are known to be difficult for learners; for example, Long (1993) suggests that one focus should be on “unusual” structures.

As we see it, the scope of research on nativelikeness must not be limited to any specific aspect of the L2. Rather, studies need to include measurements of various kinds of L2 features, including all linguistic levels (phonology, grammar, lexis, etc.), skills, processing, automaticity, as well as both production and perception. Research conducted by Sorace (see, e.g., 1993, 2003) offers evidence that a fruitful area of investigation should be the way in which L2 speakers may diverge in their grammatical (and lexical) choices, without therefore exhibiting overt errors. Her studies show that near-native L2 speakers frequently diverge from native speakers, although their performance/competence may well be in accordance with formal target-language norms or UG constraints. Thus, the subtle differences between native-speaker and near-native-speaker competence must be searched for also, or even especially, beyond pronunciation and outside the UG domain. Furthermore, Birdsong (2006) suggested that “where nativelikeness is perhaps least likely to be observed is in certain domains of language processing” (p. 21), and in a similar vein, Liu (2006) contended that “despite recent advances in research on successful L2 users’ end-state competence, much remains unknown about their end-state processing ability in the L2” (p. 2).

In order to produce a representative “across-the-board” measurement of nativelikeness, including aspects of language processing, the present study employed 10 instruments for L2 scrutiny, covering phonetic production and perception (voice onset time), perception of words and sentences in white noise and babble noise, grammaticality judgments (written and auditory test modes with latency times), grammatical, lexical, and semantic inferencing (a cloze test), and formulaic language (tests of idiomatic expressions and proverbs) (see below for further details, Part II of the study).

The questions that guided the study were the following: (a) Do late (i.e., adolescent and adult) L2 learners exist who are perceived as native speakers? (Part I); (b) Are most early (i.e., child) L2 learners ultimately perceived as native speakers? (Part I); (c) Do late (i.e., adolescent and adult) L2 learners exist who are nativelike when scrutinized in detail? (Part II); and (d) Are most early (i.e., child) L2 learners ultimately nativelike when scrutinized in detail? (Part II).

Part I: Perceived Nativelikeness

Method

Participants

During the period from September 2002 to March 2004, we identified a total of 195 L2 speakers of Swedish (132 females and 63 males) who had begun their acquisition at various ages and who identified themselves as advanced and potentially nativelike L2 speakers. They were identified through three large advertisements in daily newspapers⁵ (see Appendix A) and a poster campaign at nearly all universities and colleges in the Stockholm area in which we encouraged people to call us on the telephone if they believed that their non-Swedish background was usually not noticed by native speakers of Swedish.

To qualify as participants for the study, respondents had to meet six criteria mentioned in the advertisement. They had to (a) have Spanish as their L1, (b) speak Swedish fluently without a foreign accent or any obvious grammatical deviations, (c) be 19 years of age or older, (d) have lived in Sweden for 10 years or more, (e) have an educational level of no less than senior high school (i.e., minimally 12 years of schooling), and (f) have primarily been exposed to and acquired the variety of Swedish spoken in the greater Stockholm area (for an English translation of the advertisement, see Appendix A). Because the candidates responding to the first advertisement were strongly biased toward lower AOs of acquisition, the second advertisement addressed only those with AOs above 7 years, and the third advertisement addressed only those with AOs above 10 years. In all other respects, the advertisements were identical. The first advertisement resulted in 135 respondents, the second advertisement resulted in 50 respondents, and the third advertisement and the poster campaign together resulted in 10 respondents; that is, there were 195 in total.

In the following analyses, we will distinguish between the learner categories “AO ≤ 11 years” and “AO ≥ 12 years” because these may be thought of as representing L2 learning before and after the closure of a critical period, which traditionally has been associated with the onset of puberty (approximately age 12 years; Lenneberg, 1967). Although puberty has been questioned as a valid upper limit for a critical period for language acquisition (not least by ourselves; see, e.g., Hyltenstam & Abrahamsson, 2003b), we still find a distinction based on a theoretically established hypothesis to be more valuable than a distinction based on theoretically arbitrary grounds, such as AO ≤ 15 and AO ≥ 16 (cf., e.g., Bialystok & Miller, 1999; Birdsong & Molis, 2001; Johnson & Newport, 1989; Patkowski, 1980). Furthermore, age 12 years is a recurring cutoff point that has been used or explicitly explored in previous studies (see, e.g., Bongaerts, 1999; Cranshaw, 1997; Flege et al., 1999; McDonald, 2006; Montrul & Slabakova, 2003; van Boxtel et al., 2005; Van Wuijtswinkel, 1994; White & Genesee, 1996), which justifies further the present division into early (AO ≤ 11) and late (AO ≥ 12) learners.

Table 1 Background information (independent variables) on the 195 respondents; comparisons between respondents with age of onset (AO) ≤ 11 and ≥ 12 years (df = 193)

                        AO ≤ 11 (n = 107)    AO ≥ 12 (n = 88)     t-test (two tailed)
Independent variable    M        SD          M        SD          t         p
AGE (years)             28.6     7.2         41.5     9.1         −11.1     <.0001
LOR (years)             23.1     7.5         21.2     7.3         1.82      .071, ns
L2 EXP (years)          22.4     7.4         20.9     7.3         1.41      >.1, ns
L1 USE (%)              27.4     17.5        30.9     18.1        −1.34     >.1, ns
SEX (% f/m)             69/31                61/39                0.23∗     >.1, ns

∗chi-square test, χ²(1, N = 195) = 0.23, p > .1.

Of the 195 participants, 107 began their acquisition of Swedish before age 12 years⁶ and 88 began to learn Swedish at the age of 12 years or later. For most of the participants, AO of acquisition coincides with age at immigration. A comparison of background variables of the two AO groups is shown in Table 1. The mean chronological age (i.e., age at the time of the study) was 28–29 years in the early-learner group and 41–42 years in the late-learner group, a difference that is statistically significant. In all other respects, however, the two groups are fully comparable: There are no significant differences concerning length of residence (LOR) in Sweden, amount of L2 exposure (operationalized as number of years in Sweden minus number of years spent outside the L2 environment since the time of immigration), frequency of L1 use (operationalized as the informants’ self-reported daily use of Spanish, expressed in percentages), or distribution of women versus men. That the groups differ in chronological age is rather unproblematic because there are no theoretical reasons to believe that a somewhat higher age as such would have an impact, in any direction, on nativelikeness (cf. results in MacKay, Flege, & Imai, 2006; for a discussion of the “age-length-onset problem,” that is, that age of onset logically is confounded with length of residence as well as with chronological age, see Stevens, 2006).

Interview and Speech Elicitation

Our first encounter with the potential participants was by telephone. In response to the newspaper advertisements, the 195 candidates called a project assistant and went through a 15-min interview, which, with the respondent’s consent, was recorded on a SONY TC-D5M cassette recorder. The interview generated the background data given in Table 1 as well as information about knowledge of languages other than Spanish and Swedish, any residency outside the Stockholm area since the time of immigration, formal instruction in Swedish as an L2, mother-tongue instruction, any known hearing impairment, and any history of dyslexia.

At the end of each interview, samples of more or less spontaneous speech were elicited, which would later serve as stimuli in the listening sessions with native speakers of Swedish (see below). The participants were asked to talk freely for 1 min on a certain subject that anyone living in Sweden can relate to, namely Astrid Lindgren, the most famous Swedish author of children’s stories and books (e.g., Pippi Longstocking).

Speech samples were also elicited over the telephone from 20 native speakers of Swedish, 10 females and 10 males, who had a mean age of 28 years (range: 23–40). Of these, 10 had grown up in the Stockholm area and 10 had migrated to Stockholm from other parts of Sweden; these latter native participants had, however, lived in the Stockholm area for many years and exhibited only very subtle dialectal features in their speech. (We will return later to the reasons for including dialectal variation in the material.) The notion “native speaker of Swedish” was operationalized in this study as someone who (a) has spoken only Swedish at home during childhood, (b) has had Swedish as the only language of instruction at school, and (c) has lived his or her whole life in a context in which Swedish has been the majority language. Pure monolingualism was not a requirement.

Preparation of Speech Stimuli

The first 20–30 s of the 1-min speech samples were extracted and used as stimuli in three separate listening sessions with native judges, one session after each newspaper advertisement (see Procedure section). The recordings were only minimally edited. In those cases in which the content of a speech sample revealed the non-Swedish origin of the speaker, this particular information was cut out (e.g., “Back home in Chile, we used to . . .” or “When we moved to Sweden . . .”). Very long pauses were also edited out. However, the final duration of the speech samples was always between 20 and 30 s, which is a sample length that has been demonstrated to be sufficient in speech-judgment tasks involving linguistically naïve listeners (e.g., Cunningham-Andersson & Engstrand, 1989; see also Flege, 1984).

Native Judges

For each of the three listening sessions, 10 different native speakers of Stockholm Swedish were engaged as judges. These were recruited among students at Stockholm University but were linguistically and phonetically naïve and had no knowledge of Spanish. All of the 30 judges (15 females and 15 males) had grown up in the Stockholm area and had a mean age of 25 years (range: 21–30).

The reason for engaging a new panel of native judges for each of the three sessions was twofold. First, it was practically impossible to reassemble the original panel 7 and 18 months after the first session. Second, because the judges were informed about the actual purpose and design of the research after the session, it was necessary to engage a new, objective panel for each session.

Procedure

The listening sessions were run within a few months after each advertisement at a point when new candidates no longer called us on the telephone. The sessions were led by a male native speaker of Stockholm Swedish.

The first listening session included speech samples from the 135 candidates who responded to the first advertisement as well as speech samples from the 20 native speakers of Swedish. The session took 90 min and the judges were paid SEK 150 immediately after the session. The second listening session included speech samples from the 50 candidates who responded to the second advertisement plus samples from 8 of the native speakers (4 from Stockholm, 4 with subtle regional features). This session took 45 min and these judges were paid SEK 100. The third listening session included speech samples from the 10 candidates who responded to the third advertisement and the poster campaign as well as samples from 8 native speakers (not the same individuals as in session 2); furthermore, in order to make this third session more comparable to the previous two sessions with regard to length and content, the speech material was supplemented by 40 randomly selected L2 speech samples from the second session. In addition, using 40 samples in two independent listening sessions with different judges provided a good opportunity to obtain a measure of interrater reliability (see below). The duration of the third session and the payment were the same as for session 2.

Judges were told that the research project concerned people’s ability to differentiate Stockholm pronunciation from regional dialects and foreign accents. This distracting information was given in order to prevent the judges from focusing solely on foreign accents and to make the task resemble an authentic, everyday speaker-judgment situation, in which many sources of phonetic variation may come to people’s minds. In addition to the influence of social, pathological, and personality factors, very mild and occasional deviances in advanced L2 learners’ speech are frequently interpreted by native listeners as a consequence of regional variation rather than a nonnative background (cf. Markham, 1997).⁷ Limiting the task to discrimination between native and nonnative speech, with no opportunity given to reflect on alternative origins of phonetic variation, would thus prompt judges to interpret any kind of deviance as a sign of nonnativeness, which is why the dialect dimension was also included as a possibility. (For a similar method for capturing any possible confusion between regional accent/dialect and minor foreign accent in very advanced L2 speakers, see Marinova-Todd, 2003, p. 62.) After the session, the judges were given correct information about the purpose of the study as well as about the actual speaker distribution.

Instructions were given both orally by the session leader and in writing on the computer screen; thereafter, the judges were encouraged to ask for clarifications if needed. The judges were not instructed to focus on any linguistic feature in particular, such as pronunciation, morphosyntax, or lexical choice, but rather to aim for an overall impression of each sample in order to judge each speaker’s status as native/nonnative speaker of Swedish (for a similar approach, see Montrul & Slabakova, 2003, pp. 367–368).

The listening sessions took place with each judge alone in a sound-treated room. The task was designed and run with the computer software E-Prime.⁸ Speech samples were presented in different random orders for each judge through KOSS KTX/PRO earphones. Since the telephone recordings varied to some extent in sound quality, the judges were able to adjust the volume at any time during the session. During each sample, the following three alternatives were presented on the screen:

(A) This person’s mother tongue is Swedish and he/she is native to the Stockholm area

(B) This person’s mother tongue is Swedish but he/she is not native to the Stockholm area

(C) This person’s mother tongue is not Swedish

Three keys on the computer keyboard were marked with A, B, and C and the judges could make their choice at any time during each speech sample. That is, after only a few seconds if they had decided upon a certain alternative, they could interrupt the sample presentation by pressing the corresponding button. After judging a speech sample, the judges were instructed to hit the space key to generate a new sample or to take a short break; thus, the judges performed the task at their own pace with no time limit. Each sample was, however, presented only once; judges could not go back and reconsider a sample.⁹

Analysis

The distinction “from Stockholm/not from Stockholm” was not included in the analysis; instead, alternatives A and B were both regarded as an indication that a participant was perceived as a mother-tongue speaker of Swedish, with or without subtle (perhaps even nonidentifiable) dialectal features. In other words, alternative C alone represented the judgment “does not pass for native speaker.”

The reason for using a method based on binary alternatives, rather than scalar alternatives in the form of a 1–5 or 1–9 scale (widely used particularly in foreign accent studies; see, e.g., Flege et al., 1999; Munro & Mann, 2005), was that “nativelikeness” (unlike, e.g., “foreign accent”), by definition, is a binary phenomenon similar to, for example, “marriedness” and “deadness.” Thus, our intention was not to have each native listener rate “how” nativelike the participants were but rather to investigate whether some L2 speakers are, in fact, interpreted as native speakers of Swedish.

Nevertheless, in order to operationalize and quantify the collective perception of the 10 native listeners, their judgments were transformed for each speaker into scores of “perceived nativelikeness,” which we will here refer to as PN scores.¹⁰ Thus, a PN score corresponds to the number of judges who chose alternative A or B. For example, a speaker’s PN score of 8 means that 8 out of the 10 native judges believed that this speaker is a native speaker of Swedish, again with or without what the judges may have interpreted as subtle dialectal features.
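The scoring rule just described can be sketched in a few lines of Python. This is only an illustration of the definition given in the text (alternatives A and B both count as "perceived as native," alternative C does not); the function name `pn_score` and the example judgments are hypothetical and are not taken from the study's materials.

```python
from typing import Iterable

# Alternatives A and B both mean "mother tongue is Swedish";
# only alternative C means "does not pass for native speaker".
NATIVE_CHOICES = {"A", "B"}
VALID_CHOICES = {"A", "B", "C"}


def pn_score(judgments: Iterable[str]) -> int:
    """Return the number of judges who chose alternative A or B."""
    choices = [j.strip().upper() for j in judgments]
    invalid = [c for c in choices if c not in VALID_CHOICES]
    if invalid:
        raise ValueError(f"unexpected alternatives: {invalid}")
    return sum(c in NATIVE_CHOICES for c in choices)


# Hypothetical judgments from a panel of 10 listeners for one speaker.
example = ["A", "A", "B", "A", "C", "A", "B", "A", "C", "A"]
print(pn_score(example))  # 8
```

With a panel of 10 judges, the resulting score ranges from 0 (no judge chose A or B) to 10 (all judges did).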

Interrater Reliability

In order to obtain a measure of interrater reliability, here in terms of interpanel agreement, the judgments from sessions 2 and 3 (i.e., from two independent listener panels) of the overlapping 40 L2 speakers were compared. The correlation between the two sets of judgments was exceedingly high, r = .97, df = 38, p < .001, which indicates that the choice between engaging one panel of judges or several different panels is of less concern in studies of this kind (cf. also Cunningham-Andersson & Engstrand, 1989).
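As an illustration of how such an interpanel comparison can be computed, the following Python sketch calculates a Pearson correlation between two sets of PN scores for the same speakers. It assumes that the reported r is a Pearson coefficient over per-speaker PN scores, which the text does not state explicitly; the scores below are invented for demonstration and are not the study's data.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two samples of equal length (at least 3 pairs)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def degrees_of_freedom(n_pairs: int) -> int:
    """Degrees of freedom for testing a Pearson correlation: n - 2."""
    return n_pairs - 2


# Hypothetical PN scores for the same eight speakers from two panels.
panel_2 = [0, 1, 3, 5, 8, 10, 2, 9]
panel_3 = [0, 2, 3, 4, 9, 10, 1, 9]
print(f"r = {pearson_r(panel_2, panel_3):.2f}, df = {degrees_of_freedom(len(panel_2))}")
```

For the 40 overlapping speakers, the degrees of freedom are 40 − 2 = 38, which matches the df reported above.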

Results

A plot of all individual PN scores for the 195 L2 participants and the 20 native-speaker participants is shown in Figure 1. The figures in the plot express, for each AO, the number of participants who the judges believed were native speakers of Swedish. Let us begin by establishing that there was nearly absolute agreement among the judges concerning the 20 native speakers. As many as 18 of the natives were judged by all 10 judges as being mother-tongue speakers of Swedish, whereas only 2 of the judges chose alternative C for one native speaker each.¹¹

However, as is evident from Figure 1, the judgments are more varied concerning the L2 speakers, and a negative correlation between AO and PN score can easily be observed by eye. Table 2 presents a comparison of the mean PN scores for the native speakers, the early L2 learners (AO ≤ 11 years), and the late L2 learners (AO ≥ 12 years). The native speakers received a mean PN score of 9.9 [i.e., ((18 × 10) + (2 × 9))/20 = 9.9], the early L2 learners received a score of 7.9, and the late learners received a score of 2.5. As shown in Table 2, all differences are statistically significant [one-way ANOVA: F(2, 215) = 111.61, p < .0001; comparisons of adjacent groups with Fisher’s Protected LSD post-hoc test¹²]. Age of onset of acquisition is the variable most strongly associated with perceived nativelikeness, r = −.72, df = 193, p < .001, and can therefore explain more than half of the variation: r² = .52. Each of the other variables (see Table 1) explains only about 2–8% of the variation: r² = .024–.076. In other words, AO appears to be the best predictor of perceived nativelikeness.
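The two pieces of arithmetic in this paragraph can be checked with a short Python sketch. It uses only figures stated in the text (18 native speakers with a PN score of 10, 2 with a PN score of 9, and r = −.72); the variable names are illustrative.

```python
# Native-speaker PN scores as described in the text:
# 18 speakers judged native by all 10 judges, 2 speakers by 9 judges.
native_pn_scores = [10] * 18 + [9] * 2
mean_native = sum(native_pn_scores) / len(native_pn_scores)

# Proportion of variance in PN score associated with AO: the squared correlation.
r_ao_pn = -0.72
r_squared = r_ao_pn ** 2

print(round(mean_native, 1))  # 9.9
print(round(r_squared, 2))    # 0.52
```

Squaring the correlation removes its sign, so r² describes only the strength of the association, not its (negative) direction.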

In Figure 1, the 195 L2 participants have been divided into five smaller AO groups: early childhood (AO ≤ 5 years, n = 53), late childhood (AO 6–11 years, n = 54), adolescence (AO 12–17 years, n = 31), early adulthood (AO 18–23 years, n = 33), and later adulthood (AO ≥ 24 years, n = 24). These 6-year intervals are motivated partially by general phases in language development, and on closer examination these divisions are, in fact, reflected in the general pattern in Figure 1. What the two lower AO groups (i.e., early childhood and late childhood) have in common is that a majority of the participants are perceived as mother-tongue speakers of Swedish by most of the judges; as with the native speakers, most L2 speakers in these two groups have received a PN score of 9 or 10. However, these groups differ concerning the distribution of lower PN scores; for example, there is no participant with PN score 0 in the early childhood group, and only a few have received a PN score less than 6. On the other hand, no participants in the two highest AO groups (i.e., early adulthood and later adulthood) were judged as mother-tongue speakers of Swedish to the same degree as the 20 native speakers (i.e., PN score 9–10); rather, in these two groups, the ratings have been concentrated around PN scores 0 and 1. At the same time, there is a larger number of participants in the early adulthood group that received a PN score higher than 1 than is the case with the later adulthood group, in which all participants but one received a PN score of 1 or 0. A clearly higher degree of variation is found in the middle group (i.e., adolescence), with no concentration of PN scores at either end of the scale. At these AOs (12–17 years), there are some (five participants) who are perceived as mother-tongue speakers of Swedish by all or all but one of the judges, and some (six participants) who are perceived as mother-tongue speakers by one or none of the judges; the remaining 20 participants are fairly equally distributed along the scale (with one to five participants at each PN score).

Figure 1 Scatterplot of PN scores versus AO for all 195 participants and the 20 native controls (AO 0 years).

Table 2 Group comparisons of mean PN scores for the native speakers (NS), the AO ≤ 11 learners, and the AO ≥ 12 learners with Fisher’s protected LSD post hoc test based on ANOVA: F(2, 215) = 111.61, p < .0001

Participant group 1   n     M     SD     Participant group 2   n     M     SD     Fisher’s LSD
NS                    20    9.9   0.3    AO ≤ 11               107   7.9   2.9    p = .005
AO ≤ 11               107   7.9   2.9    AO ≥ 12               88    2.5   3.0    p < .001
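The early/late division and the five smaller AO groups described above amount to a simple binning rule, sketched here in Python. The sketch assumes that AO is recorded in whole years, with an onset of "<1" coded as 0; this coding is an assumption made for illustration and is not specified in the text. The group sizes are those reported above.

```python
def early_or_late(ao: int) -> str:
    """Two-way split used in the analyses: AO <= 11 (early) vs. AO >= 12 (late)."""
    if ao < 0:
        raise ValueError("age of onset cannot be negative")
    return "early" if ao <= 11 else "late"


def ao_group(ao: int) -> str:
    """Five-way split: AO <= 5, 6-11, 12-17, 18-23, >= 24."""
    if ao < 0:
        raise ValueError("age of onset cannot be negative")
    if ao <= 5:
        return "early childhood"
    if ao <= 11:
        return "late childhood"
    if ao <= 17:
        return "adolescence"
    if ao <= 23:
        return "early adulthood"
    return "later adulthood"


# Group sizes as reported in the text; check them against the two-way split.
GROUP_SIZES = {
    "early childhood": 53,
    "late childhood": 54,
    "adolescence": 31,
    "early adulthood": 33,
    "later adulthood": 24,
}
early_n = GROUP_SIZES["early childhood"] + GROUP_SIZES["late childhood"]
late_n = sum(GROUP_SIZES.values()) - early_n
print(early_n, late_n)  # 107 88
```

The two childhood groups together make up the early learners (n = 107), and the remaining three groups make up the late learners (n = 88).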

Figure 2 provides a clearer illustration of the relation between the five smaller AO groups and the relation between these learner groups and the native-speaker group. A one-way ANOVA test reveals that there are significant differences between the groups, F(5, 215) = 67.40, p < .0001. As shown in Table 3, Fisher’s protected LSD post hoc test reveals that the main differences can be found between the native group and all other groups (including the earliest learner group¹³) and between the adolescence group and all other groups. However, neither the difference between the two childhood groups nor the one between the two adulthood groups reached significance, which indicates that the major changes in eventual perceived nativelikeness of L2 learners can be associated with adolescence.¹⁴


Figure 2 Average PN scores for all 195 participants, divided into five AO categories, and the 20 native controls. [Figure not reproduced. x-axis: age of onset group (native speakers, n = 20; early childhood, AO <1–5, n = 53; late childhood, AO 6–11, n = 54; adolescence, AO 12–17, n = 31; early adulthood, AO 18–23, n = 33; later adulthood, AO 24–47, n = 24). y-axis: score of perceived nativelikeness, 0–10.]

As we have seen in Figure 1, most early learners (AO ≤ 11 years) are perceived as mother-tongue speakers of Swedish, whereas a majority of the late learners (AO ≥ 12 years) are not. In fact, most of the adult learners (i.e., AO ≥ 18 years) are perceived as native speakers by either one or none of the judges, and this is particularly true of those with AO ≥ 24 years. This pattern is summarized in Table 4. We can see from the right column of Table 4 that when the whole group of L2 speakers is taken into account (i.e., all of the 195 participants), approximately one third of the participants had received a PN score of 9–10 (i.e., they were perceived as native speakers by 9 or 10 of the native listeners, a level that corresponds to the judgment of the native speakers), one third received a PN score of 0–1 (i.e., they were perceived as native speakers by only 1 or none of the judges), whereas the remaining third received judgments somewhere in between (i.e., PN score 2–8). If we then examine the distribution within the two AO groups, we see that approximately one third in both groups are in fact perceived as native speakers by two to eight judges. However, for the highest and lowest distributions of PN scores, the pattern that emerges is entirely the opposite: 62% of the early learners pass for native speakers of Swedish (PN score 9–10) whereas 6% are absolutely not perceived as native speakers (PN score 0–1); in contrast, among the late learners, 6% pass for native speakers with 9–10 of the judges whereas 58% are perceived as native speakers by only 1 or none of the judges.

Table 3 Group comparisons of mean PN scores for the native speakers and the five adjacent learner groups (AO <1–5, 6–11, 12–17, 18–23, and 24–47 years) with Fisher’s protected LSD post hoc test based on ANOVA: F(5, 215) = 67.40, p < .0001

Participant group 1   n    M     SD    Participant group 2   n    M     SD    Fisher’s LSD
Native speakers       20   9.9   0.3   Early childhood       53   8.3   2.4   p < .02
Early childhood       53   8.3   2.4   Late childhood        54   7.6   3.3   p = .136, ns
Late childhood        54   7.6   3.3   Adolescence           31   5.1   3.2   p < .001
Adolescence           31   5.1   3.2   Early adulthood       33   1.6   2.0   p < .001
Early adulthood       33   1.6   2.0   Later adulthood       24   0.4   0.7   p = .087, ns

Table 4 Number and percentage of the participants with the two highest (9–10) and the two lowest (0–1) PN scores and of the participants with PN scores in between (2–8)

PN score   Native speakers (N = 20)   AO ≤ 11 (n = 107)   AO ≥ 12 (n = 88)   All L2 participants (N = 195)
9–10       20 (100%)                  66 (62%)            5 (6%)             71 (36%)
2–8        –                          35 (32%)            32 (36%)           67 (35%)
0–1        –                          6 (6%)              51 (58%)           57 (29%)
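The three PN bands used in Table 4 can be illustrated with a short sketch. The function and variable names below are illustrative assumptions, not part of the study; only the band boundaries and the counts are taken from Table 4.

```python
def pn_band(pn_score: int) -> str:
    """Map a PN score (0-10) to one of the three bands used in Table 4."""
    if not 0 <= pn_score <= 10:
        raise ValueError("PN score must be between 0 and 10")
    if pn_score >= 9:
        return "9-10"
    if pn_score >= 2:
        return "2-8"
    return "0-1"


# Counts per band as reported in Table 4 (dictionary layout is an assumption).
TABLE4_COUNTS = {
    "AO <= 11": {"9-10": 66, "2-8": 35, "0-1": 6},
    "AO >= 12": {"9-10": 5, "2-8": 32, "0-1": 51},
}


def band_shares(counts: dict) -> dict:
    """Return the unrounded percentage of a group falling in each band."""
    total = sum(counts.values())
    return {band: 100 * n / total for band, n in counts.items()}


early = band_shares(TABLE4_COUNTS["AO <= 11"])
late = band_shares(TABLE4_COUNTS["AO >= 12"])
```

The unrounded shares computed this way may differ by a percentage point from the rounded figures printed in the table.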

Summary

Perceived nativelikeness was investigated in a sample (n = 195) of the population of advanced early and late L2 learners who perceive themselves as potentially near-native or even nativelike speakers of Swedish. Their ages of onset were <1–47 years. Among the native Swedish control participants, 18 out of 20 (or 90%) were perceived as native speakers by all 10 native judges, whereas the remaining two were perceived as native speakers by 9 of the 10 judges. Of the 107 early L2 learners (AO ≤ 11 years), 62% were perceived as native speakers by 9 or 10 judges, whereas only three were perceived as nonnative speakers by all 10 judges. In contrast, of the 88 late learners (AO 12–47 years), 6% were perceived as native speakers by 9 or 10 judges (and this result was limited to AOs 12–17 years), whereas 36 (or 41%) were perceived as nonnative speakers by all 10 native judges.

It is important to stress that there were relatively few participants (in total, 39 of 195) who were not perceived as native speakers by any of the judges. This indicates that the majority of the participants in fact did pass for native speakers of Swedish with at least one (and often several) of the judges. For example, as can be seen in Figure 1, five participants in the early adulthood group actually passed for native speakers with as many as 5 or 6 judges; similarly, in the later adulthood group, there is one individual with AO 30 years who convinced 3 of the 10 native judges that he or she was a native speaker of Swedish. This gives us a clear indication of the generally advanced proficiency level of most of the participants, despite the fact that they were, in this study, classified as nonnativelike when having received a PN score lower than 9. Our very strict criterion of nativelikeness is based on how the 20 native speakers were judged, which turned out to be at least 9 of the 10 judges identifying a participant as a mother-tongue speaker of Swedish. If instead we had used an arbitrary or more liberal nativelikeness criterion, say, when at least half (or why not one?) of the native judges chose the alternative “This person’s mother tongue is Swedish . . .”, we see from Figure 1 that a much greater number of the participants would have passed for native speakers, even among those with AO 12–23 years.

Part II: Scrutinized Nativelikeness

Method

Selection of Participants

As mentioned earlier, the listening sessions in Part I of the study served as a formal screening procedure for participant selection in Part II. Our original intention was to include only those L2 speakers who were judged to be mother-tongue speakers by the listeners to the same extent as the 20 Swedish participants; in other words, only those L2 learners whose casual, everyday speech was indistinguishable from that of native speakers. However, because only five participants with AO at or beyond 12 years and no participant with AO beyond 17 years passed for native speakers, it was decided that the criterion level for participant selection in Part II should be somewhat adjusted. A more liberal definition of perceived nativelikeness was therefore adopted and it was decided that potential candidates for inclusion in Part II should be those with PN scores ≥ 6; that is, those L2 speakers who had passed for mother-tongue speakers of Swedish with a majority of the 10 native judges. This criterion resulted in no fewer than 104 potential participants (i.e., 55%) out of the original 195 candidates, although there was still a strong bias toward lower AOs: Whereas 87 (81%) passed for native speakers out of the candidates with AO <1–11 years, only 17 (19%) individuals did so among the candidates with AO 12–47 years.

From these 104 potential candidates, 41 individuals (32 females and 9 males) were selected who met most of the background criteria. Our original intention was to be able to include 60 participants, evenly distributed across an AO span between 1 and 20+ years (i.e., with three participants for each AO) and carefully matched against each other for background factors such as age, sex, and frequency of daily L1 use. However, such selection procedures were possible only among the candidates with AO <1–11 years because nativelikeness (in terms of PN score) was clearly biased toward lower AOs, and several gaps occur in the upper half of the AO continuum (12–20+ years), with no participants selected with AO 12, 18, or 20+ years. Of the 17 late learners who passed for native speakers according to the PN score ≥ 6 criterion in Part I of the study, only the 10 individuals who met the most crucial background criteria could be selected for participation in Part II. The reasons for excluding no fewer than 7 of the 17 late-learner candidates were the following. One candidate (AO: 23; PN score: 6) proved to be an L2 speaker of Spanish with Basque as the L1. By her account, she began to learn Spanish when she entered the Spanish school system at the age of 5 years. Another candidate (AO: 17; PN score: 9), who was initially selected, later declined to participate in the project. Two candidates (AO: both 12; PN scores: 8 and 9) were excluded because they had lived in non-Spanish-speaking countries (Romania and East Germany, respectively) for 7–8 years prior to moving to Sweden. One candidate (AO: 13; PN score: 10) reported that although she grew up in a Spanish-speaking country, her parents were actually native speakers of English. The two remaining candidates (AO: both 14; PN scores: both 6) were excluded because the quota for their AO had already been filled; the three candidates who were actually chosen as representatives of AO 14 years were those with the highest nativelikeness ratings (PN scores: 9, 9, and 8).

The mean age for the 41 selected L2 speakers at the time of testing was 32 years (range: 20–50). Their mean LOR in Sweden was 25 years (range: 12–42) and their mean length of L2 exposure was 24 years (range: 12–42). As shown in Table 5, all differences, except for chronological age, between the early and late learners are statistically nonsignificant, which suggests that the two groups are satisfactorily comparable as far as background variables are concerned, even after screening and after participant selection.

Table 5 Background information (independent variables) on the 41 selected participants; comparisons between participants with age of onset (AO) ≤ 11 and ≥ 13 years (df = 39)

                       AO ≤ 11 (n = 31)   AO ≥ 13 (n = 10)   t-test (two tailed)
Independent variable   M       SD         M       SD         t       p
AGE (years)            30.9    6.6        37.2    6.9        −2.60   <.02
LOR (years)            25.5    7.1        22.3    6.0        1.31    >.1, ns
L2 EXP (years)         25.2    6.9        22.2    5.9        1.22    >.1, ns
L1 USE (%)             23.5    13.4       31.5    12.9       −1.65   >.1, ns
SEX (% f/m)            71/29              100/0              −1.97   >.05, ns
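As a rough check on how the t values in Table 5 relate to the reported means and standard deviations, the following sketch computes a pooled-variance (Student's) t statistic from summary statistics. That the authors used the pooled-variance form is an assumption; it is suggested only by the reported df = 39 (31 + 10 − 2). The function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups, computed from summary statistics.

    Returns the t statistic and its degrees of freedom (n1 + n2 - 2).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# AGE row of Table 5: AO <= 11 (M = 30.9, SD = 6.6, n = 31)
# versus AO >= 13 (M = 37.2, SD = 6.9, n = 10).
t_age, df_age = pooled_t(30.9, 6.6, 31, 37.2, 6.9, 10)
```

For the AGE row this yields a value that rounds to the reported −2.60. Because the table gives means and standard deviations to one decimal only, values recomputed for the other rows can deviate slightly from those printed.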

As for country of origin and Spanish variety, there was a strong bias toward Chilean Spanish (27 participants) because Chileans are, by far, the largest group among Spanish L1 speakers in Sweden. The other countries of origin represented among the participants were Peru (six participants), Colombia (two participants), Spain (two participants), Argentina (one participant), Bolivia (one participant), Mexico (one participant), and Uruguay (one participant).

A group of native-speaker participants was included, consisting of 15 mother-tongue speakers of Swedish. These were selected on the basis of the same definition and operationalization of “native speaker” as in the recruitment of native participants in Part I. This group was matched with the L2 speaker group regarding the skewed sex distribution (11 females and 4 males), educational level (senior high school diploma at a minimum), variety of Swedish (Stockholm), and age (M: 30 years; range: 23–46). None of the native speakers had any experience in the phonetic or linguistic sciences or any academic training in Swedish or other Scandinavian languages.

Procedure

Each participant was tested individually in a sound-treated room at Stockholm University. The session lasted for approximately 4 hr and was divided into three subsessions with two 20-min breaks with food and refreshments in between. Prior to language testing and speech elicitation, participants went through a hearing test with an OSCILLA SM910 screening audiometer, and a decrease of no more than 10 dB for one frequency on one ear was considered acceptable. After the session, participants received financial compensation of SEK 500. Testing and data collection were carried out by the same male native speaker of Stockholm Swedish who had conducted the listening sessions in Part I.

Instruments

As the overarching aim of this part of the study was to scrutinize the participants’ actual nativelikeness in a variety of linguistic phenomena and language abilities, a large set of instruments was employed. These instruments were designed in such a way that they would allow for differentiation between native-speaker proficiency and near-native proficiency, on the one hand, and between different degrees of near-native proficiency, on the other hand. The language tests and tasks were deliberately made highly complex in order to cause a high degree of difficulty and cognitive load even for native speakers. Measurements were carried out with as much care and in as much detail as possible. The rationale behind this design was the absolute need to avoid any possible ceiling effects, which we believe have strongly influenced previous research and theorizing (see Hyltenstam & Abrahamsson, 2000, 2001, 2003b, for discussions; see also Montrul & Slabakova, 2003, p. 385).

Although the test battery contained some 20 different instruments for language testing and speech elicitation, the present article will report only on the 10 measures that have been analyzed so far. However, the present set of results covers speech production, speech perception, morphosyntax, and formulaic language, in other words, a fairly representative sample of the broad spectrum of L2 knowledge and processing abilities. The 10 instruments and methods of analysis are presented next.

Production and perception of voice onset time (VOT). Voiceless stops are usually associated with longer VOT values and voiced stops with shorter values, VOT being the time interval between the onset of the release burst of a stop consonant and the onset of periodicity from vocal fold vibration. Spanish and Swedish differ as to where on the VOT continuum the voiced/voiceless categories separate: The Spanish category boundaries are located at lower (usually negative) VOT values than is the case in languages like Swedish (and English), for which boundaries are found at higher (positive) values (Lisker & Abramson, 1964). The present study included two VOT-related measures based on one production task and one test of categorical perception. In the production task (Instrument 1), the participants read aloud the Swedish words par, tal, and kal.¹⁵ Each participant read each word 10 times, and the readings were recorded. Spectral analyses were then made of the initial voiceless stops /p/, /t/, and /k/ using the Soundswell package (Hitech Development) (for further details, see Abrahamsson et al., in press; Stölten, 2005). The categorical perception test (Instrument 2) was based on the minimal pairs par-bar, tal-dal, and kal-gal,¹⁶ which had been recorded in an anechoic chamber by a native female speaker of Swedish. Using the Soundswell software, a 5-ms-step VOT continuum ranging from −60 to +90 ms was created for all three minimal pairs, and the stimulus items were presented through earphones in different random orders for all participants. Each word was presented together with the carrier phrase Nu hör du . . . “Now you will hear . . .”, and the participants’ task was to decide whether they heard the voiceless or the voiced member of the word pair by pressing one of two buttons. The test was designed and run in E-Prime and took about 5 min to complete (for details, see Stölten, 2006).
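The size of the stimulus set follows from the continuum described above and can be sketched as follows. The function names, the seeding scheme, and the assumption that each continuum step is presented once per minimal pair are illustrative; the source states only the range, the step size, and that the order was randomized per participant.

```python
import random


def vot_continuum(start_ms=-60, stop_ms=90, step_ms=5):
    """Return the VOT values (ms) from start to stop inclusive, in equal steps."""
    if step_ms <= 0:
        raise ValueError("step_ms must be positive")
    return list(range(start_ms, stop_ms + 1, step_ms))


def presentation_order(steps, seed):
    """Return a shuffled copy of the steps; one seed per participant is assumed."""
    order = list(steps)
    random.Random(seed).shuffle(order)
    return order


MINIMAL_PAIRS = ("par-bar", "tal-dal", "kal-gal")
continuum = vot_continuum()  # 31 steps from -60 ms to +90 ms
```

Under these assumptions each minimal pair has 31 continuum steps, giving 93 distinct stimuli across the three pairs.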

Speech perception in noise. Nonnative listeners are generally less able to take advantage of linguistic context to decode speech presented in noise, and the (negative) effect of increasing noise is greater for nonnative than for native listeners (see Bradlow & Bent, 2002; Hyltenstam & Abrahamsson, 2003a; McAllister, 1997; Spolsky, Sigurd, Sato, Walker, & Arterburn, 1968). The present study included two different perception-in-noise tests: one of word perception in babble noise and one of sentence perception in white noise. In the babble noise test (Instrument 3; for details, see McAllister & Brodda, 2002), participants encountered (in earphones) 30 simple, highly frequent bisyllabic stimulus words in increasing babble noise (i.e., noise consisting of multiple voices). The words were randomly and automatically selected from 100 potential stimulus words and were presented together with the carrier phrase Nu hör du . . . “Now you will hear . . .”. The words and the carrier phrase had been recorded by a female native speaker in an anechoic chamber. The participant’s task was to repeat each word, and the experimenter entered the response words into the computer. An in-built metric (see McAllister & Brodda, 2002) calculated the phonological distance between the stimulus word and the response word, and the noise level was automatically adjusted accordingly. This means that whenever the participant responded with the correct word (e.g., solen for solen “the sun”) or with a phonologically similar word (e.g., stolen “the chair” instead of solen “the sun”), the signal-to-noise ratio (SNR) increased by 0.4 dB, but decreased to the same extent whenever the stimulus word and the response word were different and phonologically distant from each other (e.g., katten “the cat” instead of solen “the sun”). In this way, a perceptual threshold level could be established for each participant. The test took about 5 min to complete. In the white noise test (Instrument 4), the participants encountered (through earphones) 28 sentences (recorded by a male native speaker in an anechoic chamber), containing both predictable and unpredictable information, that together formed an informational text on a fictional subject (adapted from Platzack, 1973). The participants’ task was to repeat each sentence verbatim, and their repetitions were recorded by the computer for later analysis. The first seven sentences were presented without noise; thereafter, the level of white noise successively increased after every seventh sentence, giving SNRs of 13 dB, 6 dB, and 2 dB, respectively. The test took about 5 min to complete. A strict scoring procedure was then adopted, in which only exact repetitions were scored as correct repetitions (for a similar approach, cf. Bradlow & Bent, p. 276). The seven noise-free sentences were excluded from the analysis.
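The scoring rule for the white noise test can be sketched as below. This is a simplified illustration, not the authors' procedure: the original repetitions were recorded and scored afterwards, whereas the sketch assumes that stimuli and responses are already available as transcribed strings and treats string equality as a stand-in for "exact repetition". All names are hypothetical.

```python
# Block layout taken from the description above: four blocks of seven
# sentences. None marks the noise-free block; the other values are SNRs in dB.
BLOCK_SNRS_DB = (None, 13, 6, 2)
SENTENCES_PER_BLOCK = 7


def score_white_noise(stimuli, responses):
    """Count exact repetitions among the sentences presented in noise."""
    expected = len(BLOCK_SNRS_DB) * SENTENCES_PER_BLOCK
    if len(stimuli) != expected or len(responses) != expected:
        raise ValueError(f"expected {expected} stimuli and {expected} responses")
    score = 0
    for index, (stimulus, response) in enumerate(zip(stimuli, responses)):
        snr_db = BLOCK_SNRS_DB[index // SENTENCES_PER_BLOCK]
        if snr_db is None:
            continue  # noise-free sentences are excluded from the analysis
        if response.strip() == stimulus.strip():
            score += 1
    return score
```

With 21 sentences presented in noise, the highest attainable score under this rule is 21, which agrees with the maximum listed for Instrument 4 in Table 6.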

Grammaticality judgment. In order to measure the participants’ grammatical L2 intuition and morphosyntactic processing ability, a comprehensive and demanding grammaticality judgment test was administered in two versions: one auditory (Instrument 5) and one in writing (Instrument 6). The test consisted of 80 rather long (mean length: 17 words) and complex sentences based on four morphosyntactic features of Swedish grammar: subject-verb inversion, reflexive possessive pronouns, placement of sentence adverbs in relative clauses, and gender and number agreement (see Appendix B for sample items; for further details, see Abrahamsson & Hyltenstam, 2008). Half of the sentences were grammatically correct and half contained one grammatical error. The sentences (which, in the auditory version, had been recorded in an anechoic chamber by a female native speaker) were given in different random orders for all participants. They were presented through earphones in the auditory version and on the computer screen in the written version. By pressing one of two buttons at any point during or after a sentence, the participants indicated whether they perceived it as grammatically correct or incorrect. Along with YES/NO responses, the auditory GJT also registered reaction times (Instrument 7). Both versions of the test were designed and run in E-Prime and each took 15–20 min to complete.

Grammatical, lexical, and semantic inferencing. A more global measure of the participants’ L2 Swedish proficiency was obtained by the use of a cloze test (Instrument 8). This technique, originally developed by Taylor (1953), has been proven to mobilize the testee’s total grammatical, lexical, contextual, and pragmatic knowledge (McNamara, 2000, p. 15), which in normal language use is used in perception and comprehension of both spoken and written language. L2 speakers, even at very advanced levels, have been observed to have greater difficulties than native speakers in making semantically and syntactically based predictions about a text’s continuation (see, e.g., Hyltenstam & Abrahamsson, 2003b). The cloze test employed here consisted of a 300-word text (adapted from Platzack, 1973) where every seventh word had been removed. The test was an untimed pen-and-paper task in which the participants were to fill in the 42 blanks with words that would fit into the context. The participants’ performances were then blind scored by both authors independently. Responses in the form of words other than the original ones (i.e., those hidden by the blanks) were judged for lexical, morphosyntactic, and semantic appropriateness with respect to their linguistic context; encyclopaedic errors were not considered (e.g., if someone with poor knowledge in modern history would fill in the word started in the sentence World War II ____ in 1945). Disagreements in judgment were settled through discussion and careful consideration.
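The relation between the text length and the number of blanks can be illustrated with a minimal sketch of fixed-ratio deletion. It assumes that counting starts at the first word, so that words 7, 14, 21, and so on are removed; the source does not specify the starting point, but this choice gives 42 blanks for a 300-word text. The function name and blank marker are hypothetical.

```python
def make_cloze(text, n=7, blank="____"):
    """Replace every n-th word of a text with a blank.

    Returns the gapped text and the list of removed words in order.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    gapped, answers = [], []
    for position, word in enumerate(text.split(), start=1):
        if position % n == 0:
            answers.append(word)
            gapped.append(blank)
        else:
            gapped.append(word)
    return " ".join(gapped), answers


# A dummy 300-word text (w1 ... w300) stands in for the actual test passage.
dummy_text = " ".join(f"w{i}" for i in range(1, 301))
gapped_text, removed_words = make_cloze(dummy_text)
```

Note that the scoring described above accepts contextually appropriate alternatives, so the list of removed words would serve only as a reference, not as an exclusive answer key.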

Formulaic language. Native speakers of a language frequently make use of formulaic language, and both L1 and L2 learners rely heavily on prefabricated linguistic chunks in early phases of language development (Wray, 2002). Paradoxically, however, for L2 learners of advanced proficiency, the idiomatic use of formulaic language seems to be “the biggest stumbling block to sounding nativelike” (Wray, p. ix). The present study included two tests of formulaic language: one of idioms (Instrument 9) and one of proverbs (Instrument 10). Both tests were created and run in E-Prime, and they were identical in design and procedure. Each test included 50 items, which were presented in writing on the computer screen (one at a time and in the same order for all participants) with a blank to be filled in (a missing word or chunk; e.g., Hon löpte verkligen [linan] ut, roughly “She really went the whole hog,” and Ju fler kockar [desto sämre soppa] “Too many cooks spoil the broth,” where the words in brackets represent the blank). The participants responded orally by reading the whole idiom or proverb including the missing word or phrase. Responses were recorded and later analyzed. Both tests were timed, and participants were given 10 sec to complete each item. Each test took 7–8 min to complete.

Analysis

For the analysis, we used performance within the native-speaker (NS) range as a way of defining nativelike behavior, and the lowest NS result on each of the 10 measures was defined as the minimum criterion of nativelikeness of that specific aspect of Swedish. This, of course, is a far more inclusive criterion than those based on performance within, say, a 95% confidence interval of native controls or within one or two standard deviations from the NS mean (cf., e.g., Birdsong, 2007; Flege et al., 1999; Piske, MacKay, & Flege, 2001). A nativelikeness criterion based on the NS absolute range should be viewed as a stronger guarantee against Type I errors and “false negatives” (i.e., claims of nonnativelikeness for L2 users who in fact behave like at least some native speakers).

As mentioned earlier, the results generated from each of the 10 instruments will not be presented in any exact detail here but rather in terms of whether the participants passed for native speakers on how many and on which of the instruments. Therefore, by analogy with the use of PN scores (0–10) earlier, a participant’s score of “scrutinized nativelikeness” was indexed by the number of instruments on which performance fell within the NS range and will be referred to here as SN scores (0–10).

Results

The NS ranges and means as well as maximal results for the 10 linguistic test instruments are presented in Table 6. As can be seen, especially for those instruments with fixed maximum scores, the tests and tasks were difficult even for the native speakers. None of the native speakers reached the maximum result on any of Instruments 4–6 and 8–10, which serves as a guarantee against ceiling effects (for similar arguments, see Montrul & Slabakova, 2003, p. 385). For obvious reasons, neither the measures of production and perception of VOT, tolerance for babble noise, nor reaction times allow for a maximum result.

Table 6 The ten instruments with maximum result and native-speaker mean, highest, and lowest results

Instruments                                 Max.    NS Mean   NS High   NS Low
1. VOT production (% of word dur.)  /p/     – (a)   16.9      22.7      11.7
                                    /t/     –       15.7      21.8      10.7
                                    /k/     –       18.3      25.5      14.6
2. VOT perception (ms.)             /p-b/   –       7.2       27.8      −13.0
                                    /t-d/   –       15.3      27.5      5.4
                                    /k-g/   –       24.6      33.3      17.9
3. Babble noise (SNR, dB)                   –       −7.46     −11.53    −5.06
4. White noise (score)                      21      16        18        12
5. Auditory GJT (score)                     80      69        78        57
6. Written GJT (score)                      80      70        78        58
7. RT, auditory GJT (ms)                    –       7,729     7,160     8,888 (b)
8. Cloze test (score)                       42      36        41        30
9. Idioms (score)                           50      43        48        33
10. Proverbs (score)                        50      39        46        33

(a) Not applicable. (b) The lowest NS result is represented by the longest reaction time.


Figure 3 Scatter plot of SN scores versus AO for the 41 selected participants; all 10 linguistic instruments.

Furthermore, the NS ranges are exceedingly wide for many of the measures, which, of course, offers the L2 participants good prospects for performing within those ranges.

The overall results of the L2 participants are presented in Figure 3, which is the equivalent of Figure 1 in Part I. Although the participants were selected on the basis of a nativelikeness criterion (PN score ≥ 6), we still find a significant difference in mean SN scores between the early and late learners, t(39) = 2.80, p < .01 (two tailed), as well as a negative (albeit weak) correlation between the AO and the SN score, r = −.38, df = 39, p < .02. The weakness of this correlation has a great deal to do with the two highest scoring participants in the late-learner group, who, paradoxically, were those with the two highest AOs (see below). By analogy with Figure 2 and Table 3 in Part I, Figure 4 and Table 7 provide a clearer illustration of the relation between the early and late learners when divided into smaller AO groups. A one-way ANOVA test revealed that there are significant differences between the groups, F(2, 41) = 3.892, p < .03. However, a post hoc test (Fisher’s protected LSD) showed that the basis for this result is the differences between the adolescent/adult group and the two early-learner groups; the marginal difference in mean SN scores between the two early-learner groups was not significant.¹⁷ Thus, as was the case with perceived nativelikeness and PN scores in Part I of the study, the major changes in actual nativelikeness and SN scores can be associated with AOs beyond 12 years.

Figure 4 Average SN scores for all 41 selected subjects, divided into three AO categories. [Figure not reproduced. x-axis: age of onset group (early childhood, AO 1–5, n = 15; late childhood, AO 6–11, n = 16; post-puberty, AO 13–19, n = 10). y-axis: score of scrutinized nativelikeness, 0–10.]

Table 7 Group comparisons of mean SN scores (AO 1–5, 6–11, and 13–19 years) with Fisher’s protected LSD post hoc test based on ANOVA: F(2, 41) = 3.892, p < .03

Participant group 1   n    M     SD    Participant group 2   n    M     SD    Fisher’s LSD
AO 1–5                15   6.1   2.4   AO 6–11               16   5.8   2.8   p = .718, ns
AO 1–5                15   6.1   2.4   AO 13–19              10   3.5   2.0   p < .02
AO 6–11               16   5.8   2.8   AO 13–19              10   3.5   2.0   p < .03
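The significance level reported for the correlation between AO and SN score can be related to the standard conversion of a Pearson correlation into a t statistic. The sketch below applies that textbook formula to the reported values; it is offered as a plausibility check, not as a description of the authors' software or procedure, and the function name is hypothetical.

```python
import math


def t_from_r(r, df):
    """Convert a Pearson correlation into a t statistic with df degrees of freedom."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(df / (1 - r ** 2))


# Reported values: r = -.38 with df = 39.
t_value = t_from_r(-0.38, 39)
```

The resulting |t| of about 2.57 lies above the two-tailed .02 critical value for 39 degrees of freedom (roughly 2.43) and below the .01 value (roughly 2.71), which is in line with the reported p < .02.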


Let us turn next to the individual level and the main question of this part of the study, namely the incidence of actual nativelikeness. Of the 41 participants selected, only two, possibly three,¹⁸ received an SN score of 10 (i.e., performed within the range of the 15 native-speaker participants on all 10 measures of Swedish proficiency). These learners’ AOs were 3, 7, and 8 years, respectively. Three participants received an SN score of 9, another two received an SN score of 8, and these learners’ AOs were between 1 and 7 years. Among the 10 late learners, the highest performing participant received an SN score of 7 and another received an SN score of 6. Interestingly, these two were those with the highest AOs: 19 and 17 years, respectively. The remaining eight late learners received SN scores of 1–5. Thus, there was no evidence of actual nativelikeness among any of the late learners, and only a few childhood learners exhibited nativelike results across the board.

Table 8 exhibits individual results for each participant and for each instrument. The results are expressed in terms of +/− within the NS range, where + stands for results at or above the lowest NS result and − stands for results below the lowest NS result. In the few cases where data are missing (due to, e.g., technical problems or uninterpretable responses), results are treated as if they were within the NS range, but they are, for the sake of clarity, expressed with (+). As the results show, the 10 instruments represent different degrees of difficulty; for example, whereas 35 of the 41 participants had GJT reaction times (Instrument 7) within the NS range, only 6 performed within the NS range on the proverb test (Instrument 10). However, there are no linguistic domains or tasks that were never mastered, not even among the late learners; in other words, every instrument is marked with at least one +. For example, participant 070 (AO 19) exhibited nativelike knowledge and behavior on measures of morphosyntax, formulaic language, and sentence repetition in white noise but not on details of speech production and perception, whereas several other late learners had nonnativelike results on the morphosyntactic tests but at the same time exhibited nativelike behavior on at least one of the phonetic measures. Similarly divergent patterns can be discerned among the early learners.
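The +/− coding and the resulting count can be sketched as follows. This is a partial illustration under stated assumptions: it covers only the eight instruments with a single NS Low value in Table 6 (how the three stop-specific values of the two VOT instruments are combined into one + or − is not described here, so they are left out), the instrument keys are invented labels, and the example participant is made up. For babble noise and reaction time, lower values are treated as better, following the NS High and NS Low columns of Table 6.

```python
# (lowest NS result, True if higher values are better); values from Table 6.
NS_CRITERIA = {
    "babble_noise_snr_db": (-5.06, False),
    "white_noise": (12, True),
    "auditory_gjt": (57, True),
    "written_gjt": (58, True),
    "gjt_rt_ms": (8888, False),
    "cloze": (30, True),
    "idioms": (33, True),
    "proverbs": (33, True),
}


def within_ns_range(instrument, result):
    """Return True if a result is at or better than the lowest NS result."""
    criterion, higher_is_better = NS_CRITERIA[instrument]
    if result is None:
        return True  # missing data count as within range, shown as (+) in Table 8
    return result >= criterion if higher_is_better else result <= criterion


def partial_sn_score(results):
    """Count the instruments in NS_CRITERIA on which a participant is within range."""
    return sum(within_ns_range(name, results.get(name)) for name in NS_CRITERIA)


# Invented participant, for illustration only (cloze result missing).
example = {
    "babble_noise_snr_db": -6.0, "white_noise": 11, "auditory_gjt": 60,
    "written_gjt": 58, "gjt_rt_ms": 9000, "cloze": None,
    "idioms": 40, "proverbs": 20,
}
example_score = partial_sn_score(example)
```

Because the sketch omits the two VOT instruments, its count ranges from 0 to 8 rather than over the full SN scale of 0–10.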

Table 8 Overall results for the 41 participants on the 10 measures of L2 Swedish proficiency

| ID | AO | 1. VOT prod. | 2. VOT perc. | 3. Babble noise | 4. White noise | 5. GJT (aud.) | 6. GJT (wri.) | 7. RT (aud.) | 8. Cloze test | 9. Idioms | 10. Proverbs | SN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 122 | 1 | + | + | + | + | + | + | + | + | + | − | 9 |
| 002 | 1 | + | + | − | + | + | + | + | + | + | − | 8 |
| 049 | 1 | − | − | + | + | + | + | + | − | − | − | 5 |
| 030 | 2 | + | − | + | + | + | + | + | + | − | − | 7 |
| 126 | 2 | + | + | + | − | + | + | + | − | − | + | 7 |
| 100 | 2 | + | − | + | − | − | + | + | − | − | − | 4 |
| 012 | 3 | + | + | + | + | + | + | + | + | + | + | 10 |
| 043 | 3 | (+) | − | + | − | − | − | + | + | + | − | (4) |
| 041 | 3 | (+) | + | + | + | − | − | + | − | − | − | (5) |
| 052 | 4 | + | + | + | + | + | + | + | + | + | − | 9 |
| 051 | 4 | − | + | + | − | + | + | + | + | − | − | 6 |
| 090 | 4 | − | + | − | − | − | − | − | − | + | − | 2 |
| 007 | 5 | − | + | + | − | − | − | + | + | − | − | 4 |
| 101 | 5 | + | + | − | − | − | − | + | − | + | − | 4 |
| 127 | 5 | − | (+) | + | + | + | + | + | + | + | − | (8) |
| 013 | 6 | + | + | − | + | + | − | + | + | + | − | 7 |
| 031 | 6 | + | + | + | − | + | + | + | − | + | − | 7 |
| 118 | 6 | + | + | + | + | + | + | − | − | + | − | 7 |
| 089 | 7 | + | + | + | + | + | + | + | + | + | + | 10 |
| 015 | 7 | − | + | + | + | + | + | + | + | + | + | 9 |
| 042 | 7 | + | + | + | − | − | + | + | − | − | − | 5 |
| 096 | 8 | + | − | + | − | − | − | + | − | − | − | 3 |
| 081 | 8 | − | − | − | − | − | − | + | − | + | − | 2 |
| 076 | 8 | + | + | + | (+) | + | + | + | + | (+) | (+) | (10) |
| 086 | 9 | + | − | − | − | − | − | + | − | − | − | 2 |
| 188 | 9 | − | + | − | − | − | − | + | − | − | − | 2 |
| 194 | 9 | (+) | + | − | + | + | + | + | − | + | − | (7) |
| 157 | 10 | + | + | + | − | − | − | + | + | − | − | 5 |
| 045 | 10 | + | − | − | − | − | − | + | − | + | − | 3 |
| 016 | 11 | + | + | − | − | + | + | + | + | + | − | 7 |
| 033 | 11 | + | − | + | + | + | + | + | + | − | − | 7 |
| 107 | 13 | − | − | + | − | − | + | + | − | − | − | 3 |
| 114 | 13 | − | + | − | − | − | − | − | − | − | − | 1 |
| 145 | 14 | + | − | − | − | + | + | + | + | − | − | 5 |
| 001 | 14 | − | − | + | − | + | − | + | + | − | − | 4 |
| 180 | 14 | + | − | − | − | − | − | − | + | − | − | 2 |
| 173 | 15 | − | − | − | + | − | + | − | − | + | − | 3 |
| 102 | 15 | (+) | − | + | − | − | − | + | − | − | − | (3) |
| 103 | 16 | − | + | − | − | − | − | − | − | − | − | 1 |
| 172 | 17 | + | − | − | + | + | + | + | + | − | − | 6 |
| 070 | 19 | − | − | − | + | + | + | + | + | + | + | 7 |

Note. +/− = results within/below NS range; (+) = data missing; ID = participant identification number; AO = age of onset; SN = score of scrutinized nativelikeness (number of tests within NS range).

Nevertheless, some interesting differences can be observed concerning which of the linguistic instruments cause the least and the most difficulty among early and late learners, respectively. In Table 9, the 10 instruments have been rank-ordered according to the percentage of learners within each group who performed within the NS range. As can be seen, both groups exhibited most nativelike behavior where reaction times on the auditory GJT were concerned, whereas both groups showed least nativelike performance on the proverb test. However, the main difference between the ranks concerns the relation between phonetic and grammatical aspects of Swedish. Among the early learners, nativelike VOT production and perception, as well as word perception in babble noise, was more common than nativelike grammatical intuition. The reverse pattern is found for the late learners: The formal grammatical aspects (represented by the two GJTs and the cloze test) are all ranked higher than the pure phonetic aspects, which all appear in the lower half of this group’s rank order. As can be seen in Table 10, the differences between the early and late learners that were statistically significant concern speech production and perception (i.e., Instruments 1–3) as well as reaction times (Instrument 7) and idiomatic expressions (Instrument 9). The differences for the more formal linguistic aspects, such as morphosyntax (Instruments 5, 6, and 8) and proverbs (Instrument 10), did not reach significance.

Table 9 The rate of nativelike attainment within different linguistic domains; the ten instruments rank-ordered per AO group

| AO 1–11 (n = 31) | Percent within NS range | AO 13–19 (n = 10) | Percent within NS range |
|---|---|---|---|
| RTs (GJT, auditory) | 94% | RTs (GJT, auditory) | 60% |
| VOT production | 74% | GJT (written) | 50% |
| VOT categorical perception | 71% | GJT (auditory) | 50% |
| Word percentage in babble noise | 68% | Cloze test (gramm./sem.) | 50% |
| GJT (written) | 65% | VOT production | 40% |
| GJT (auditory) | 58% | Word percentage in babble noise | 30% |
| Idioms | 58% | Sentence percentage in white noise | 30% |
| Cloze test (grammar/semantics) | 52% | Idioms | 20% |
| Sentence percentage in white noise | 48% | VOT categorical perception | 20% |
| Proverbs | 16% | Proverbs | 10% |

Table 10 Differences in rate of nativelikeness between AO ≤ 11 learners and AO ≥ 13 learners on each of the 10 measures; number (%) of participants; χ² test (n = 41, df = 1)

| Instrument | AO 1–11 (n = 31) | AO 13–19 (n = 10) | χ² | p |
|---|---|---|---|---|
| 1. VOT production | 23 (74%) | 4 (40%) | 3.931 | <.05 |
| 2. VOT perception | 22 (71%) | 2 (20%) | 8.092 | <.01 |
| 3. Babble noise test | 21 (68%) | 3 (30%) | 4.437 | <.05 |
| 4. White noise test | 15 (48%) | 3 (30%) | 1.038 | >.1, ns |
| 5. GJT (auditory) | 18 (58%) | 4 (40%) | 0.992 | >.1, ns |
| 6. GJT (in writing) | 19 (65%) | 5 (50%) | 0.397 | >.1, ns |
| 7. RT (aud. GJT) | 29 (94%) | 6 (60%) | 6.812 | <.01 |
| 8. Cloze test | 16 (52%) | 5 (50%) | 0.008 | >.1, ns |
| 9. Idioms | 18 (58%) | 2 (20%) | 4.385 | <.05 |
| 10. Proverbs | 5 (16%) | 1 (10%) | 0.227 | >.1, ns |
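The χ² values in Table 10 can be checked from the reported counts alone. The sketch below assumes that they are Pearson χ² statistics for a 2 × 2 table (AO group by within/below NS range) computed without a continuity correction; the article does not state this explicitly, but the assumption reproduces the reported figures. The function name `pearson_chi2_2x2` is hypothetical.

```python
def pearson_chi2_2x2(early_within, early_n, late_within, late_n):
    """Pearson chi-square (df = 1, no continuity correction) for a 2 x 2 table."""
    a, b = early_within, early_n - early_within  # early: within / below NS range
    c, d = late_within, late_n - late_within     # late: within / below NS range
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Counts within the NS range from Table 10: (early of 31, late of 10).
counts = {
    "1. VOT production": (23, 4),
    "2. VOT perception": (22, 2),
    "7. RT (aud. GJT)": (29, 6),
    "8. Cloze test": (16, 5),
}
for name, (early, late) in counts.items():
    print(f"{name}: chi2 = {pearson_chi2_2x2(early, 31, late, 10):.3f}")
```

With df = 1, the critical values are 3.841 for p = .05 and 6.635 for p = .01, so a statistic of 3.931 falls just past the .05 threshold, whereas 8.092 and 6.812 exceed the .01 threshold.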

Summary

In Part II of the study, a subset of 31 childhood learners (AO 1–11 years) and 10 adolescent and adult learners (AO 13–19 years) who had passed for native speakers with at least 6 of the 10 judges in Part I were selected for a broad and detailed scrutiny of actual (linguistic) nativelikeness. Of these, only two, possibly three, performed within the range of the 15 native participants on all 10 measures of Swedish proficiency. These learners’ AOs were 3, 7, and 8 years. Of the 10 late learners, 1 performed within the range of native speakers on seven measures, 1 on six measures, and the remaining 8 performed within the native-speaker range on one to five measures. In other words, only a few of the early learners and none of the late learners exhibited actual, linguistic nativelikeness across a broad range of tasks when their performance was scrutinized in detail.

Discussion

The first important finding of the present study concerns the distribution and incidence of nativelikeness across AOs of acquisition. At first glance, the results seem to be compatible with those of earlier studies. First, we saw a strong negative correlation (r = −.72) between perceived nativelikeness and the age at which L2 acquisition began, and an overall group comparison revealed a significant difference in perceived nativelikeness between early and late learners (cf., e.g., Asher & García, 1969; DeKeyser, 2000; Hyltenstam & Abrahamsson, 2003a; Flege et al., 2005; Johnson & Newport, 1989; MacKay et al., 2006; Munro & Mann, 2005; Oyama, 1976, 1978; Patkowski, 1980; Seliger et al., 1975). However, there were no differences between early and late childhood learners (AO < 1–5 vs. 6–11) or between early and late adulthood learners (AO 18–23 vs. 24–47); the only significant differences were that between child learners and adolescent learners (AO 12–17) and that between adolescent learners and adult learners. The average perceived nativelikeness began to decrease at or around AO 12 years, but this decrease leveled out at some point after adolescence (cf. the sigmoid function of AO and degree of accent suggested by Munro & Mann). Furthermore, and in accordance with many previous studies (e.g., Johnson & Newport; Oyama 1976, 1978; Patkowski), no clear connections could be observed between nativelikeness and LOR or length of L2 exposure. In other words, and as has been shown in many studies (e.g., Johnson & Newport; Munro & Mann), AO of L2 acquisition stands out as the variable that best predicts ultimate perceived nativelikeness. The present study actually shows this to be the case even when the speaker sample consists exclusively of L2 learners who identify themselves as potentially nativelike or near-native speakers.

Also in accordance with previous research (as well as with most layman observations), a majority of the early learners were perceived as native speakers, whereas most of the late learners were thought to have a native language other than Swedish (cf., e.g., DeKeyser, 2000; Flege, 1999; Flege et al., 1999; Johnson & Newport, 1989; Patkowski, 1980). On the other hand, 5 of the 88 late learners actually passed for native speakers. This undeniably would appear to support the claims of some researchers that nativelike adult learners do exist and, consequently, indicate the nonexistence of a biologically determined critical period for (second) language acquisition. However, such an interpretation of our results would, for several reasons, be most unwarranted and even faulty.

First, it is important to note that these five individuals were found exclusively among those with AOs 12–17 years and that no participant among the 57 candidates with an AO beyond 17 years passed for a mother-tongue speaker of Swedish. This aspect of our results is, at least in part, congruent with the results in a study by Flege et al. (1995), in which only a handful of the late Italian learners of English were judged to speak the L2 without a foreign accent, although none of these had an AO of acquisition above 16 years. In fact, in the study by Flege et al. (1999), none of the Korean participants with an AO beyond 10 years spoke L2 English without a foreign accent.

Second, the results from the 10 different measures of L2 knowledge and processing (representing various aspects of speech production and perception, morphosyntax, and formulaic language) showed that none of the late learners (AO 13–19) exhibited actual, overall linguistic nativelikeness when their performance was scrutinized in detail. These results support the position held by, for example, Bley-Vroman (1989), Long (1990), Gregg (1996), Long and Robinson (1998), as well as ourselves (Hyltenstam & Abrahamsson, 2003b), namely that nativelike L2 proficiency is, in principle, never attained by adult learners. On the other hand, the results do not seem to support the claim that certain linguistic features would be unlearnable after a certain age while the acquisition of other aspects remains unaffected by the age of the learner. Neither do they support the theoretical position (taken by, e.g., Scovel, 1988) that a critical period is of relevance only for the phonetic/phonological domains of language but not for morphosyntax, even if the late learners in this study were actually shown to be less nativelike when it came to speech production and perception as compared to the more formal, grammatical levels of Swedish. However, what our data do show is that even if a nativelike mastery of any linguistic aspect of an L2 is indeed possible, even for late learners, the probability of a late learner developing a nativelike command of all (or even a majority of) relevant linguistic aspects (and across all linguistic domains, too) is close to zero. We therefore believe that one can (and should) remain skeptical toward any claims of absolute nativelikeness in adult learners and toward the rejections of the CPH that tend to follow such claims, especially if based solely on perceived nativelikeness or on the apparent incidence of adult nativelike behavior in certain linguistic domains.

Another central result of the present study concerns the incidence of nativelikeness in early learners. The results of Part I revealed that even if most child learners were perceived as native speakers of Swedish by most judges, this was far from the case with all of them. In fact, as many as 41 of the 107 candidates in this AO category were perceived as having a mother tongue other than Swedish. In addition, even though 25 of the 31 early learners who were selected for participation in Part II were perceived as native speakers by 9 or 10 native judges in Part I, only three of these learners performed within the native-speaker range on all 10 measures of Swedish proficiency. In most previous studies, early learners with less than nativelike L2 proficiency have either not been identified at all or, in a few studies, only as single exceptions (e.g., DeKeyser, 2000; see also Ioup, 1989; Obler, 1989). Only in recent years have researchers been able to present results that suggest much higher rates of nonnativelike early learners (e.g., Hyltenstam & Abrahamsson, 2003a; Butler, 2000; Ekberg, 2004; Flege, Frieda, & Nozawa, 1997; Flege et al., 1999; Lee et al., 2006; MacKay et al., 2006; McDonald, 2000; Tsukada et al., 2005; for a relatively early report in this direction, see Hyltenstam, 1992). For example, Flege et al. (1997) observed in their L2 participants a noticeable foreign accent in English, even when their L2 acquisition had begun at ages 5–6 years and despite them having used the language in an L2 environment for 34 years on average. Furthermore, in a detailed acoustic analysis of the production of unstressed English vowels by early and late Korean and Japanese bilinguals, Lee, Guion & Harada showed that except for the nativelike behavior concerning fundamental frequency (F0), the Korean late and early learners were nonnativelike where duration, intensity and vowel-quality reduction were concerned; the Japanese late and early learners were nonnativelike on vowel-quality reduction only.

These results suggest that one may consider it a myth that L2 learning that begins in childhood easily, automatically, and inevitably results in nativelikeness (cf. also Harley & Wang, 1997, p. 44); in fact, the present study suggests that nativelike L2 proficiency in individuals with low starting ages is considerably less common than has been assumed earlier. Irrespective of whether this fact ought to be explained with a theory of nonmaturational factors or with one of maturational constraints operating successively at early ages (for a detailed discussion, see Hyltenstam & Abrahamsson, 2003b, pp. 553ff, 569ff, and 572ff), one may safely conclude that an early AO of acquisition is a necessary although not sufficient requirement for nativelike ultimate attainment in an L2 (cf. Hyltenstam, 1992; Hyltenstam & Abrahamsson, 2003a, 2003b).

That most early learners in previous studies have uniformly passed for native speakers (e.g., by exhibiting test results comparable to those of native controls) is most probably due to ceiling effects, which, in our view, are also the reason why a handful of individual adult learners have been classified as nativelike speakers in previous research. In other words, we believe that the tests and measures adopted have not been sufficiently demanding and that the analyses used have not been sophisticated enough to allow for discrimination between native and near-native levels of language proficiency. As an example, consider the type of sentences and structures used in Johnson and Newport’s (1989) grammaticality judgment test (also used in the various replications of this study; see above). The 276 test sentences were quite short and structurally simple, as illustrated by the following ungrammatical examples (pp. 73–77): *The farmer bought two pig at the market, *The little boy is speak to a policeman, *Yesterday the hunter shoots a deer, *Susan is making some cookies for we, *Can ride the little girl a bicycle? and *Martha a question asked the policeman.19

Obviously, test items of this kind cannot be used in studies of near-native L2 speakers without resulting in full scores for all participants. Had we decided to administer a Swedish version of the Johnson and Newport grammaticality judgment test to our participants, we would have expected all of them to pass for native speakers, irrespective of age of learning.20 We therefore suggest that many, if not all, of the early learners in Johnson and Newport’s study can be considered “false positives” (cf. Long, 2005).

The question remains, however, why Selinker’s unverified 5% estimation concerning nativelike adult learners is still interpreted as a fact by laymen and researchers, and why even higher rates of nativelikeness are still being suggested. We believe several factors have contributed to this. First, individual perceptions and subjective judgments of nativelikeness may be highly influential. One must bear in mind that even though there were relatively few adolescent or adult learners who passed for native speakers in the present listening sessions, there were relatively few individuals (only 39 of 195 in total) who were not perceived as native speakers by any of the 10 native judges; that is, 156 participants were actually perceived as native speakers by one, sometimes several, of the native listeners. This means that an L2 speaker who most native speakers perceive as a nonnative speaker can still be perceived, interpreted, and described as a nativelike speaker by individual reporters (laymen as well as linguists) and thereby also be presented as evidence against a critical period for language acquisition.

Second, varying definitions of “nativelikeness” certainly have had an influence on the different rates of nativelikeness reported. Studies that have used either self-evaluation or native-speaker judgments typically report quite high rates of nativelikeness (e.g., Bongaerts, 1999; Seliger et al., 1975). However, as revealed by the different results of Part I versus Part II of the present study, neither a definition based on self-identification nor one based on identification by others can reliably approximate the incidence of nativelikeness. Sorace and Robertson (2001) stated that “non-native grammars may exhibit certain subtle features that distinguish them from native grammars” (p. 266) and that “even learners who are capable of native-like performance often have knowledge representations that differ systematically from those of native speakers” (p. 266). In a similar vein, Bley-Vroman (1989) argued that no adult learners attain nativelike levels of L2 competence, “even though some may have a performance difficult to distinguish from that of native speakers” (p. 44). Earlier we have introduced the concept of “non-perceivable non-nativeness” (Hyltenstam & Abrahamsson, 2003b), that is, nonnativelikeness that cannot easily be detected in everyday conversation. Despite the somewhat lumbering terminology, the concept of “non-perceivable non-nativeness” captures an important insight into the various rates of nativelikeness that have been suggested for adult L2 learners (such as Selinker’s 5%), namely that they denote the incidence of near-nativeness rather than nativelikeness (cf. also Hyltenstam & Abrahamsson, 2003b; Long & Robinson, 1998). A majority of native judges in Part I of the present study could not detect the actual nonnativelikeness of the 10 late learners revealed in Part II; nor could the actual nonnativelikeness of the early learners be detected by the judges when the basis for their judgments was spontaneous and casual speech.

Finally, as has already become clear from our discussion, we are convinced that the somewhat “liberal” levels of scrutiny of previous studies (with a few exceptions; e.g., Ioup et al., 1994) have resulted in ceiling effects and thereby Type II errors and “false positives.” Obviously, the target of analysis must not be the most basic rules, structures, or skills because much of the divergence between the very advanced L2 speaker’s and the native speaker’s proficiency consists of nonovert and/or low-frequency phenomena. Therefore, research on nativelikeness and advanced learners’ L2 ultimate attainment (regardless of AO of acquisition) calls for a much higher sensitivity of the tests and instruments than does, for example, research on the initial phases of interlanguage development. Only after the detailed scrutiny in Part II of the present study could the L2 knowledge and behavior of our participants be distinguished from that of native speakers.

To summarize our interpretation, the reason nativelike adult L2 learners are still treated by many SLA researchers as “quite ordinary occurrences” (Bialystok, 1997, p. 134) is a combination of several factors: on one hand, personal, subjective, and unverified observations, and, on the other hand, empirical results based on either inappropriate definitions of nativelikeness or insufficiently sophisticated techniques for linguistic scrutiny.

Some concern, and even some harsh criticism, has been raised against the research agenda represented by the present study, and such criticism has generally targeted the scope and degree of scrutiny in the analysis. For example, Davies (2003) wrote that for the psycholinguist, “no test is ever sufficient to demonstrate conclusively that native speaker and nonnative speaker are discrete: when nonnative speakers have been shown to perform as well as a native speaker on a test, the cry goes up for yet another test” (p. 213). In the same spirit, Birdsong (2005b) warned us that “the acid test of nativelikeness runs the risk of being over-applied [such that] individuals who have demonstrated nativelikeness in several areas of experimental performance could be subjected to even further poking and prodding, until a betraying shibboleth is found” (p. 322). His view is that “it would be a disservice to the scientific process to insulate the CPH/L2A from falsifiability by adding task upon task and measure upon measure to the nativelikeness criterion” (p. 322).

If we were to agree with Birdsong that perhaps some kind of line ought to be drawn somewhere, we would, at the same time, have to realize that we are in no position to draw such a line yet; as we see it, research on nativelikeness has only begun. For the moment, all we can say about that line is that it should be drawn far beyond measures of nativelikeness based on self-identification, far beyond measures of the phonetic ability to imitate native speakers or to pass for a native speaker on the basis of language-like behavior, far beyond linguistic representation and UG constraints, and far beyond crude measures of a limited set of linguistic phenomena. It must be remembered that we are actually dealing with two of the most central and crucial questions in linguistics and SLA, namely “Can L2 learners ever attain nativelike proficiency?” and “Is there a critical period for (second) language acquisition?” Given that the null hypothesis states that there are no differences between native speakers and (adult) seemingly nativelike L2 speakers, it would be a greater disservice to the scientific process if we, as researchers, chose not to do our best in trying to reject it.

Summary and Conclusion

Perceived and actual (linguistic) nativelikeness were investigated in a sample of the population of advanced early and late L2 learners who perceive themselves as potentially near-native or even nativelike speakers of Swedish. This was done in two steps: first through listening sessions with a large sample of L2 speakers (n = 195; AO < 1–47 years) and native judges, and then through broad and detailed linguistic analyses of a subset of participants (n = 41; AO 1–19 years) who had passed for native speakers in the listening sessions. Results revealed, first, that a majority of the early learners but only a few of the late learners were perceived as mother-tongue speakers of Swedish and, second, that only a few of the early learners and none of the late learners exhibited actual, linguistic nativelikeness across the board when their performance was scrutinized in detail. The highest performing late learner exhibited results within the native-speaker range on 7 of the 10 measures of L2 proficiency, and her deviance from native-speaker norms was limited to phonetic aspects of speech production and perception.


Of course, we are in no position to offer any percentages for the occurrence of nativelike ultimate attainment, neither for adult nor child learners and neither for perceived nor actual nativelikeness. What we can offer instead is a number of empirically founded reasons for treating existing estimates and rates of nativelikeness with caution or even skepticism. Actually, we do not see revised nativelikeness rates as something desirable, as estimates in the form of exact percentages run the risk of being circulated as established facts. On the contrary, we would like carefully to concur with Munro and Mann’s (2005) comparatively sober assertion that “[no] model of an age–accent connection should ever hope to claim ‘before age X, a person is guaranteed to develop a native accent and, after age Y, a foreign accent is unavoidable’” (p. 337). Nevertheless, Selinker’s (1972) guess, that as much as 5% of the adult L2 learning population attain “absolute success,” clearly is a gross overestimation of the actual situation; our results point more in the direction that absolute nativelikeness in late learners, in principle, does not occur. This, however, does not mean that we see highly successful adult learners as theoretically uninteresting exceptions; to the contrary, we believe that this population of L2 learners has a highly important, not to say crucial, role to play in SLA theory building, not least concerning the question of a critical period for language acquisition.

Revised version accepted 2 June 2008

Notes

1 In accordance with previous discussions (e.g., Hyltenstam & Abrahamsson, 2000, 2001, 2003b), our definition of “near-nativeness” throughout the present article will be “levels of nonnativeness that are nonperceivable in normal, everyday language use.”

2 As Neufeld (1979) demonstrated, an adult speaker may well learn to imitate utterances in another language (even when being unaware of the meaning or linguistic structure of the strings of speech imitated), and imitations may sometimes be of such a quality that native listeners become convinced that they originate from a native speaker.

3 It is important to note, however, that one of the two high-performing participants in Birdsong’s study had received 1 year of university-level phonetic instruction, whereas the other worked as an actress in Parisian theater, a job that certainly requires an unusually high degree of training in idiomatic pronunciation. Furthermore, although Birdsong saw their academic exposure prior to, or during, their residence in France as insignificant, one of them had, in fact, received 3 years of high-school French, beginning at age 14 years and 1 year of college French at age 21 years; the other participant had 3 years of college-level French in France beginning at age 20 years.

4 The L2 participants were highly educated and “were recruited from among instructors, professors and advanced undergraduate students in Spanish language programs at three major research universities in the United States” (p. 366).

5 The newspapers were Metro, a free morning paper with 625,000–720,000 daily readers (the Stockholm edition), distributed in the Stockholm public transportation system (subway, buses, etc.) as well as in shopping malls and other public places, and Aftonbladet, the leading tabloid newspaper in Sweden, with 325,000 daily readers in the Stockholm area. The two Metro advertisements (September 16, 2002 and April 16, 2003) were both 125 × 182 mm in size, and the Aftonbladet advertisement (March 15, 2004) was 250 × 100 mm.

6 Although the participants with the earliest AOs (<1–2) can be considered to have simultaneously acquired two languages, we have chosen to disregard the distinction simultaneous/successive bilingualism in this study. First, the issue affects only a minority of our participants, but, second, and more importantly, it remains an open empirical question what the effects are of even a minimal delay in L2 exposure.

7 In other words, we believe that the possibility of subtle dialectal variation may be an important confounding factor contributing to the relatively high incidence of nativelike adult learners reported in studies using Interpretation 2 of nativelikeness above (i.e., “nativelikeness as perceived by native speakers”).

8 E-Prime (Psychology Software Tools, Inc.; Schneider, Eschman, & Zuccolotto, 2002a, 2002b) is one of the most user-friendly and widely used PC software packages within psychological/behavioral experimental research; for a review, see Marinis (2003, pp. 157–158).

9 Each choice caused four new alternatives to appear on the screen from which the judges were to indicate their degree of certainty. This was done by pressing one of four keys associated with the following alternatives: (a) “and I’m absolutely sure,” (b) “and I’m fairly sure,” (c) “but I’m fairly unsure,” and (d) “but I’m very unsure.” This part of the procedure was originally included in order to make the task less trivial, to stimulate careful judgments from the listeners, and to produce a more finely grained quantification of the judgments. However, as being “fairly sure/fairly unsure/very unsure” about an alternative does not reveal anything about the specific cause of listener uncertainty (i.e., whether the uncertainty concerns the choice between foreign accent and dialectal variation, between Stockholm pronunciation and dialectal variation, or between Stockholm pronunciation and foreign accent), this measure turned out to be useless for this part of the study and was therefore excluded in the analysis. Nevertheless, we believe that this part of the procedure successfully made the task less trivial and that it might have helped in promoting careful and well-balanced decisions from the judges.

10 See Munro and Mann (2005), who used the notion “degree of perceived accent” (DPA) when focusing on pronunciation only. However, the focus of the present study is not on accent per se but on the broader concept of “nativelikeness,” including various levels of linguistic performance.

11 In both of these cases the degree of certainty was the lowest—that is “. . .but I’mvery unsure.” The fact that all judges consistently perceived the native controlspeakers as mother-tongue speakers of Swedish should be seen as additionalevidence for a high degree of reliability and validity of the listener judgments.Obviously, “native speaker” and “native-speaker proficiency” seem to be conceptsthat carry psychological relevance for native listeners, even if such concepts maybe difficult to define theoretically (see discussions in Birdsong, 2004; Cook, 1999;Davies, 2003).

12 These differences remain significant with Tukey’s HSD and Bonferroni post hoc tests.

13 However, when checked with Tukey’s HSD and Bonferroni post hoc tests (which are somewhat more conservative procedures for multiple comparison), the difference between the native-speaker group mean and the early childhood group mean did not reach significance (p = .154 and .246, respectively).
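As a rough illustration of why such procedures count as more conservative, the Bonferroni adjustment can be sketched as below. This is only a sketch under stated assumptions: the function name `bonferroni_adjust` and the raw p values are hypothetical and are not taken from the study. Tukey's HSD is not shown, because it relies on the studentized range distribution rather than on a simple rescaling of p values.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p values (hypothetical helper).

    Each raw p value is multiplied by the number of comparisons and
    capped at 1.0, so a difference has to be stronger to stay below
    a fixed significance level.
    """
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p values for three pairwise group comparisons
# (illustrative numbers only, not results from the study).
raw = [0.01, 0.04, 0.30]
adjusted = bonferroni_adjust(raw)

for p_raw, p_adj in zip(raw, adjusted):
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"raw p = {p_raw:.3f} -> adjusted p = {p_adj:.3f} ({verdict})")
```

In this made-up case the second comparison would pass an unadjusted .05 threshold but fails after adjustment, which is the kind of change in outcome the note reports.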

14 This result, of course, potentially constitutes a serious challenge to the various critiques of Lenneberg’s (1967) original critical period hypothesis. First, it challenges the abandonment of puberty (commonly referred to as age 12–13 years) as a valid end point of the critical period (cf., e.g., Johnson & Newport, 1989). Second, it challenges the rejections of a critical period that are based on nonobserved discontinuity at a certain age or within a certain age span (cf., e.g., Bialystok & Hakuta, 1999; Flege et al., 1999). Similarly, and most importantly from our point of view, the curve in Figure 2 poses a problem for the hypothesis that a maturationally constrained age function could be depicted as a linear decline from birth (as suggested in Hyltenstam & Abrahamsson, 2003b). An age-nativelikeness function with a marked slope from age 12 throughout adolescence, but with only minor slopes (if not plateaus) during childhood and adulthood, suggests instead that a blend between the critical period model (with a cutoff point at, say, puberty) and the linear decline model (lacking discontinuity at any point) would better describe the present result. In fact, the recent study by Munro and Mann (2005) suggests that the curve describing the relationship between age of learning (or, in their case, age of immigration) and degree of perceived foreign accent, “although heavily linear on a restricted range, seems to be globally sigmoid” (p. 337) (for similar patterns, see Flege et al., 1999). Despite its potentially far-reaching theoretical consequences, this aspect of our data will not, however, be developed further in the present article. A serious reevaluation of puberty as an upper limit of a critical or sensitive period as well as suggestions of a sigmoid decline model clearly require more focused investigation in future studies.
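The contrast between a linear decline and a sigmoid decline can be made concrete with a small sketch. Everything in it is an assumption chosen for illustration: the function names, the logistic form, and the parameter values (a midpoint at age 15, a steepness of 0.5, a linear rate of 0.02 per year) are not estimates from the present data or from Munro and Mann (2005).

```python
import math


def linear_decline(ao, start=1.0, rate=0.02):
    """Nativelikeness falling at a constant rate from birth (floored at 0)."""
    return max(0.0, start - rate * ao)


def sigmoid_decline(ao, midpoint=15.0, steepness=0.5):
    """Logistic curve: near-plateau in childhood, steep fall around the
    midpoint, near-plateau again in adulthood (hypothetical parameters)."""
    return 1.0 / (1.0 + math.exp(steepness * (ao - midpoint)))


def drop(model, ao_from, ao_to):
    """Loss in modelled nativelikeness between two ages of onset."""
    return model(ao_from) - model(ao_to)


# Compare the loss over three six-year spans of age of onset (AO).
for span in [(1, 7), (12, 18), (25, 31)]:
    print(
        f"AO {span[0]}-{span[1]}: "
        f"linear drop = {drop(linear_decline, *span):.3f}, "
        f"sigmoid drop = {drop(sigmoid_decline, *span):.3f}"
    )
```

Under these assumed parameters the linear model loses the same amount in every span, whereas the sigmoid model loses most of its range between AO 12 and 18 and very little before or after, which is the shape described in the note.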

15 English translation: par “pair, couple,” tal “speech, number,” kal “bare, bald.”

16 English translation: bar “bar, carried, naked,” dal “valley,” gal “crow(s) (verb, pres.).”

17 When checked with Tukey’s HSD and Bonferroni post hoc tests, the difference in means between the AO 6–11 and AO 13–19 groups did not reach significance (p = .062 and .074, respectively); however, the difference between AO 1–5 and AO 13–19 remained significant (both p < .04).

18 Due to technical problems, data are missing from the participant with AO 8 for Instruments 4, 9, and 10 (see Table 8).

19 Similarly short and structurally simple sentences were used by White and Genesee (1996) as well as by Marinova-Todd (2003) in their studies of very advanced late L2 learners.

20 In fact, if one was to use the original English version of the Johnson and Newport (1989) test as a measure of nativelikeness, most seventh graders in the Swedish school system would turn out to be nativelike speakers of English, which, of course, would be a quite bizarre claim.

References

Abrahamsson, N., & Hyltenstam, K. (2008). The robustness of aptitude effects in near-native second language acquisition. Studies in Second Language Acquisition, 30(4), 481–509.

Abrahamsson, N., Stölten, K., & Hyltenstam, K. (in press). Effects of age on voice onset time: The production of voiceless stops by near-native L2 speakers. In S. Haberzettl (Ed.), Processes and outcomes: Explaining achievement in language learning. Berlin: Mouton de Gruyter.

Asher, J., & García, G. (1969). The optimal age to learn a foreign language. Modern Language Journal, 38, 334–341.

Bialystok, E. (1997). The structure of age: In search of barriers of second language acquisition. Second Language Research, 13, 116–137.

Bialystok, E., & Hakuta, K. (1999). Confounded age: Linguistic and cognitive factors in age differences for second language acquisition. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 161–181). Mahwah, NJ: Lawrence Erlbaum.

Bialystok, E., & Miller, B. (1999). The problem of age in second-language acquisition: Influences from language, structure, and task. Bilingualism: Language and Cognition, 2, 127–145.

Birdsong, D. (1992). Ultimate attainment in second language acquisition. Language, 68, 706–755.

Birdsong, D. (1999). Introduction: Whys and why nots of the critical period hypothesis for second language acquisition. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 1–22). Mahwah, NJ: Lawrence Erlbaum.

Birdsong, D. (2004). Second language acquisition and ultimate attainment. In A. Davies & C. Elder (Eds.), Handbook of applied linguistics. London: Blackwell.

Birdsong, D. (2005a). Interpreting age effects in second language acquisition. In J. Kroll & A. De Groot (Eds.), Handbook of bilingualism: Psycholinguistic perspectives (pp. 109–127). Cambridge: Cambridge University Press.

Birdsong, D. (2005b). Nativelikeness and non-nativelikeness in L2A research. IRAL, 43, 319–328.

Birdsong, D. (2006). Age and second language acquisition and processing: A selective overview. Language Learning, 56, 9–49.

Birdsong, D. (2007). Nativelike pronunciation among late learners of French as a second language. In O.-S. Bohn & M. Munro (Eds.), Second language speech learning: The role of language experience in speech perception and production (pp. 99–116). Amsterdam: Benjamins.

Birdsong, D., & Molis, M. (2001). On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language, 44, 235–249.

Bley-Vroman, R. (1989). What is the logical problem of foreign language learning? In S. Gass & J. Schachter (Eds.), Linguistic perspectives on second language acquisition (pp. 41–68). Cambridge: Cambridge University Press.

Bongaerts, T., Mennen, S., & van der Slik, F. (2000). Authenticity of pronunciation in naturalistic second language acquisition: The case of very advanced late learners of Dutch as a second language. Studia Linguistica, 54(2), 298–308.

Bongaerts, T., Planken, B., & Schils, E. (1995). Can late learners attain a native accent in a foreign language? A test of the critical period hypothesis. In D. Singleton & Z. Lengyel (Eds.), The age factor in second language acquisition (pp. 30–50). Clevedon: Multilingual Matters.

Bongaerts, T., Planken, B., & Schils, E. (1997). Age and ultimate attainment in the pronunciation of a foreign language. Second Language Research, 19, 447–465.

Bongaerts, T. (1999). Ultimate attainment in L2 pronunciation: The case of very advanced late L2 learners. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 133–159). Mahwah, NJ: Lawrence Erlbaum.

Bradlow, A. R., & Bent, T. (2002). The clear speech effect for non-native listeners. Journal of the Acoustical Society of America, 112, 272–284.

Butler, Y. G. (2000). The age effect in second language acquisition: Is it too late to acquire native-level competence in a second language after the age of seven? In Y. Oshima-Takane, Y. Shirai, & H. Sirai (Eds.), Studies in language sciences 1 (pp. 159–169). Tokyo: The Japanese Society for Language Sciences.

Colantoni, L., & Steele, J. (2006). Native-like attainment in the L2 acquisition of Spanish stop-liquid clusters. In C. A. Klee & T. L. Face (Eds.), Selected proceedings of the 7th conference on the acquisition of Spanish and Portuguese as first and second languages (pp. 59–73). Somerville, MA: Cascadilla Proceedings Project.

Cook, V. (1999). Going beyond the native speaker in language teaching. TESOL Quarterly, 33, 185–209.

Coppieters, R. (1987). Competence differences between natives and near-native speakers. Language, 63, 544–573.

Cranshaw, A. (1997). A study of Anglophone native and near-native linguistic and metalinguistic performance. Unpublished doctoral dissertation, University of Montreal, Canada.

Cunningham-Andersson, U., & Engstrand, O. (1989). Perceived strength and identity of foreign accent in Swedish. Phonetica, 46, 138–154.

Davies, A. (2003). The native speaker: Myth and reality. Clevedon, UK: Multilingual Matters.

DeKeyser, R. M. (2000). The robustness of critical period effects in second language acquisition. Studies in Second Language Acquisition, 22, 499–533.

Ekberg, L. (2004). Grammatik och lexikon i svenska som andraspråk på nästan infödd nivå. In K. Hyltenstam & I. Lindberg (Eds.), Svenska som andraspråk – i forskning, undervisning och samhälle (pp. 221–258). Lund: Studentlitteratur.

Epstein, S., Flynn, S., & Martohardjono, G. (1996). Second language acquisition: Theoretical and experimental issues in contemporary research. Behavioral and Brain Sciences, 19, 677–758.

Eubank, L., & Gregg, K. R. (1999). Critical periods and (second) language acquisition: Divide et impera. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 65–99). Mahwah, NJ: Lawrence Erlbaum.

Flege, J. E. (1984). The detection of French accent by American listeners. Journal of the Acoustical Society of America, 76, 692–707.

Flege, J. E. (1999). Age of learning and second language speech. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 101–132). Mahwah, NJ: Lawrence Erlbaum.

Flege, J. E., Frieda, E. M., & Nozawa, T. (1997). Amount of native-language (L1) use affects the pronunciation of an L2. Journal of Phonetics, 25, 169–186.

Flege, J. E., Munro, M. J., & MacKay, I. R. A. (1995). Factors affecting degree of perceived foreign accent in a second language. Journal of the Acoustical Society of America, 97, 3125–3134.

Flege, J. E., Yeni-Komshian, G. H., & Liu, S. (1999). Age constraints on second-language acquisition. Journal of Memory and Language, 41, 78–104.

Gregg, K. R. (1996). The logical and developmental problems of second language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 49–81). San Diego: Academic Press.

Harley, B., & Wang, W. (1997). The critical period hypothesis: Where are we now? In A. de Groot & J. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 19–51). Mahwah, NJ: Lawrence Erlbaum.

Hyltenstam, K. (1992). Non-native features of near-native speakers. On the ultimate attainment of childhood L2 learners. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 351–368). Amsterdam: Elsevier Science.

Hyltenstam, K., & Abrahamsson, N. (2000). Who can become native-like in a second language? All, some, or none? On the maturational constraints controversy in second language acquisition. Studia Linguistica, 54(2), 150–166.

Hyltenstam, K., & Abrahamsson, N. (2001). Age and L2 learning: The hazards of matching practical “implications” with theoretical “facts.” (Comments on Stefka H. Marinova-Todd, D. Bradford Marshall, and Catherine E. Snow’s “Three misconceptions about age and L2 learning”). TESOL Quarterly, 35(1), 151–170.

Hyltenstam, K., & Abrahamsson, N. (2003a). Age of onset and ultimate attainment in near-native speakers of Swedish. In K. Fraurud & K. Hyltenstam (Eds.), Multilingualism in global and local perspectives. Selected papers from the 8th Nordic conference on bilingualism, November 1–3, 2001, Stockholm Rinkeby (pp. 319–340). Stockholm: Centre for Research on Bilingualism, Stockholm University, and Rinkeby Institute of Multilingual Research.

Hyltenstam, K., & Abrahamsson, N. (2003b). Maturational constraints in SLA. In C. J. Doughty & M. H. Long (Eds.), The handbook of second language acquisition (pp. 539–588). Oxford: Blackwell.

Ioup, G. (1989). Immigrant children who have failed to acquire native English. In S. Gass, C. Madden, D. Preston, & L. Selinker (Eds.), Variation in second language acquisition: Vol. 2. Psycholinguistic issues (pp. 160–175). Clevedon, UK: Multilingual Matters.

Ioup, G., Boustagui, E., El Tigi, M., & Moselle, M. (1994). Reexamining the critical period hypothesis: A case study in a naturalistic environment. Studies in Second Language Acquisition, 16, 73–98.

Johnson, J. S., & Newport, E. L. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21, 60–99.

Lee, B., Guion, S. G., & Harada, T. (2006). Acoustic analysis of the production of unstressed English vowels by early and late Korean and Japanese bilinguals. Studies in Second Language Acquisition, 28, 487–513.

Lenneberg, E. (1967). Biological foundations of language. New York: Wiley.

Lisker, L., & Abramson, A. (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20, 384–422.

Liu, Y.-T. (2006). Specifying the norms of successful L2 users for developing theories on the learning potential in SLA. Teachers College, Columbia University Working Papers in TESOL & Applied Linguistics, 6(1). Retrieved March 26, 2006, from http://journals.tc-library.org/index.php/tesol/article/view/164/162

Long, M. H. (1990). Maturational constraints on language development. Studies in Second Language Acquisition, 12, 251–285.

Long, M. H. (1993). Second language acquisition as a function of age: Research findings and methodological issues. In K. Hyltenstam & A. Viberg (Eds.), Progression and regression in language (pp. 196–221). Cambridge: Cambridge University Press.

Long, M. H. (2005). Problems with supposed counter-evidence to the Critical Period Hypothesis. International Review of Applied Linguistics, 43, 287–317.

Long, M. H., & Robinson, P. (1998). Focus on form: Theory, research, and practice. In C. Doughty & J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 16–41). Cambridge: Cambridge University Press.

MacKay, I. R. A., Flege, J. E., & Imai, S. (2006). Evaluating the effects of chronological age and sentence duration on degree of perceived foreign accent. Applied Psycholinguistics, 27, 157–183.

Marinis, T. (2003). Psycholinguistic techniques in second language acquisition research. Second Language Research, 19, 144–161.

Marinova-Todd, S. H. (2003). Comprehensive analysis of ultimate attainment in adult second language acquisition. Unpublished doctoral dissertation, Harvard University, Massachusetts.

Markham, D. (1997). Phonetic imitation, accent, and the learner. Unpublished doctoral dissertation. Lund: Lund University Press.

McAllister, R. (1997). Perceptual foreign accent: L2 users’ comprehension ability. In A. James & J. Leather (Eds.), Second-language speech: Structure and process (pp. 119–132). Berlin: Mouton de Gruyter.

McAllister, R., & Brodda, B. (2002). Development of a new speech comprehension test with a phonological distance metric. Proceedings of Fonetik 2002, the XVth Swedish Phonetics Conference, Stockholm, May 29–31, 2002. Quarterly Progress and Status Report (Dept of Speech, Music and Hearing and Centre for Speech Technology, KTH, Stockholm) 44(1), 149–151.

McDonald, J. L. (2000). Grammaticality judgments in a second language: Influences of age of acquisition and native language. Applied Psycholinguistics, 21, 395–423.

McDonald, J. L. (2006). Beyond the critical period: Processing-based explanation for poor grammaticality judgment performance by late second language learners. Journal of Memory and Language, 55, 381–401.

McNamara, T. (2000). Language testing. Oxford: Oxford University Press.

Montrul, S., & Slabakova, R. (2003). Competence similarities between native and near-native speakers. An investigation of the preterite-imperfect contrast in Spanish. Studies in Second Language Acquisition, 25, 351–398.

Moyer, A. (1999). Ultimate attainment in L2 phonology: The critical factors of age, motivation and instruction. Studies in Second Language Acquisition, 21, 81–108.

Munro, M., & Mann, V. (2005). Age of immersion as a predictor of foreign accent. Applied Psycholinguistics, 26, 311–341.

Neufeld, G. (1979). Towards a theory of language learning ability. Language Learning, 29, 227–241.

Neufeld, G. (2001). Non-foreign-accented speech in adult second language learners: Does it exist and what does it signify? ITL Review of Applied Linguistics, 133–134, 185–206.

Obler, L. K. (1989). Exceptional second language learners. In S. Gass, C. Madden, D. Preston, & L. Selinker (Eds.), Variation in second language acquisition (pp. 141–149). Clevedon, UK: Multilingual Matters.

Oyama, S. (1976). A sensitive period for the acquisition of a nonnative phonological system. Journal of Psycholinguistic Research, 5, 261–285.

Oyama, S. (1978). The sensitive period and comprehension of speech. Working Papers on Bilingualism, 16, 1–17.

Patkowski, M. (1980). The sensitive period for the acquisition of syntax in a second language. Language Learning, 30, 449–472.

Piller, I. (2002). Passing for a native speaker: Identity and success in second language learning. Journal of Sociolinguistics, 6, 179–206.

Piske, T., MacKay, I. R. A., & Flege, J. E. (2001). Factors affecting degree of foreign accent in an L2: A review. Journal of Phonetics, 29, 191–215.

Platzack, C. (1973). Språket och läsbarheten. Lund: GWK Gleerup.

Schachter, J. (1989). Testing a proposed universal. In S. Gass & J. Schachter (Eds.), Linguistic perspectives on second language acquisition (pp. 73–88). Cambridge: Cambridge University Press.

Schneider, W., Eschman, A., & Zuccolotto, A. (2002a). E-Prime user’s guide. Pittsburgh: Psychology Software Tools, Inc.

Schneider, W., Eschman, A., & Zuccolotto, A. (2002b). E-Prime reference guide. Pittsburgh: Psychology Software Tools, Inc.

Scovel, T. (1988). A time to speak: A psycholinguistic inquiry into the critical period for human speech. New York: Newbury House.

Seliger, H. W. (1978). Implications of a multiple critical periods hypothesis for second language learning. In W. C. Ritchie (Ed.), Second language acquisition research: Issues and implications (pp. 11–19). New York: Academic Press.

Seliger, H., Krashen, S., & Ladefoged, P. (1975). Maturational constraints in the acquisition of second languages. Language Sciences, 38, 20–22.

Selinker, L. (1969). Language transfer. General Linguistics, 9, 67–92.

Selinker, L. (1972). Interlanguage. IRAL, 10, 209–231.

Sorace, A. (1993). Incomplete and divergent representations of unaccusativity in nonnative grammars of Italian. Second Language Research, 9, 22–48.

Sorace, A. (2003). Near-nativeness. In C. J. Doughty & M. H. Long (Eds.), The handbook of second language acquisition. Oxford: Blackwell.

Sorace, A., & Robertson, D. (2001). Measuring development and ultimate attainment in non-native grammars. In C. Elder, A. Brown, E. Grove, K. Hill, N. Iwashita, T. Lumley, et al. (Eds.), Experimenting with uncertainty. Essays in honour of Alan Davies (pp. 264–274). Cambridge: Cambridge University Press.

Spolsky, B., Sigurd, B., Sato, M., Walker, E., & Arterburn, C. (1968). Preliminary studies in the development of techniques for testing overall second language proficiency. In J. A. Upshur & J. Fata (Eds.), Problems in foreign language testing. Language Learning Special Issue (No. 3, pp. 79–98). Ann Arbor, Mich.: Research Club in Language Learning.

Stevens, G. (2006). The age-length-onset problem in research on second language acquisition among immigrants. Language Learning, 56, 671–692.

Stölten, K. (2005). Effects of age of learning on VOT in voiceless stops produced by near-native L2 speakers. In A. Eriksson & J. Lindh (Eds.), Proceedings FONETIK 2005. The XVIII Swedish phonetics conference, May 25–27 2005 (pp. 91–94). Göteborg: Department of Linguistics, Göteborg University.

Stölten, K. (2006). Effects of age on VOT: Categorical perception of Swedish stops by near-native L2 speakers. In G. Ambrazaitis & S. Schötz (Eds.), Proceedings from FONETIK 2006, Lund, June 7–9 2006 (pp. 125–128). Lund: Centre for Languages and Literature, (General Linguistics and Phonetics).

Taylor, W. L. (1953). Cloze procedure: A new tool for measuring readability. Journalism Quarterly, 30, 414–438.

Tsukada, K., Birdsong, D., Bialystok, E., Mack, M., Sung, H., & Flege, J. (2005). A developmental study of English vowel production and perception by native Korean adults and children. Journal of Phonetics, 33, 263–290.

van Boxtel, S. (2005). Can the late bird catch the worm? Ultimate attainment in L2 syntax. Unpublished doctoral dissertation, Radboud University Nijmegen, Utrecht.

van Boxtel, S., Bongaerts, T., & Coppen, P.-A. (2005). Native-like attainment of dummy subjects in Dutch and the role of the L1. IRAL, 43, 355–380.

Van Wuijtswinkel, K. (1994). Critical period effects in the acquisition of grammatical competence in a second language. Nijmegen, The Netherlands: University of Nijmegen.

White, L., & Genesee, F. (1996). How native is near-native? The issue of ultimate attainment in adult second language acquisition. Second Language Research, 12, 233–265.

Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge University Press.

Appendix A

English Translation of the First Newspaper Advertisement in Metro (Stockholm Edition), September 16, 2002, p. 22:

Stockholm University
Centre for Research on Bilingualism

Is Spanish your mother tongue?

Subjects with Spanish as their first learned language wanted for a research project on age and language acquisition

Research on human language and language acquisition has consistently shown that people who begin their acquisition of a second language at an early age eventually reach an ultimate attainment comparable to the proficiency levels of native speakers, while people who begin their acquisition in adulthood typically do not reach such levels of ultimate attainment. But the same research has also shown that there are exceptions to this—that is, persons who have begun their second language acquisition relatively late in life (in their teens or later) and still have reached a proficiency level comparable to that of native speakers.

We are currently looking for people who have begun the acquisition of Swedish as a second language at varying ages—from childhood to adulthood—and who have reached such a level of proficiency that native Swedish speakers usually do not notice their non-Swedish mother-tongue background in everyday communication.

The persons we are looking for must:

- have Spanish as their first learned language (even if they no longer use their Spanish)
- be entirely fluent in Swedish, without exhibiting a noticeable foreign accent or any obvious grammatical deviation
- be adults today (19 years or older)
- have lived in Sweden for at least 10 years
- have at least a high-school education
- have learned the variety of Swedish spoken in the greater Stockholm area

We are looking for people who are using both Spanish and Swedish on a regular basis as well as people who already from the beginning have almost exclusively used Swedish and only rarely—or never—Spanish (i.e. people who, early or late in life, have had reasons to “leave their mother tongue behind”). In other words, we are interested in functionally bilingual people as well as people who have shifted language.

If you think that the above description fits with you, and if you are willing to participate in a research project that potentially can offer interesting answers to questions concerning the human language learning ability, please contact us on the following telephone number: [number removed] (telephone hours Mon, Wed, Fri at 10am–2pm), or send an email to [assistant’s name and email removed] with your name and telephone number. A small remuneration will be given those who are eventually selected as participants for the study.

Welcome and call us!

Kenneth Hyltenstam, Professor
Niclas Abrahamsson, PhD

Appendix B

Eight examples out of 80 grammaticality judgment items, grouped by structure type. (a) = grammatical sentences, (b) = *ungrammatical sentences. Target structures are underlined, and for the ungrammatical items, the correct structure is given in [ ].

1. Subject-verb inversion (V2)

(a) Med tanke på att den förmögenhet familjen förfogade över var ganska betydande förstår man deras negativa inställning till dagens skattesystem.
“Given that the fortune the family controlled was rather significant, one understands their negative stance on today’s tax system.”

(b) *Med tanke på att den högkonjunktur landet gick mot var mycket tydlig man förstår [förstår man] kapitalägarnas uppfattning gällande ekonomiska skyddstullar.
“Given that the economic upturn the country was approaching was very obvious, one understands the capitalists’ position regarding protectionist tolls.”

2. Reflexive possessive pronouns

(a) De återkommande stamgästerna insåg genast att deras restaurangbesök inte skulle vara sig lika efter ägarbytet.
“The returning regular customers realized immediately that their visits to the restaurant would not be the same after the change of owners.”

(b) *De mest rutinerade kroppsbyggarna såg till att sina [deras] benmuskler utvecklades i samma takt som övriga muskler.
“The most experienced body builders made certain that their leg muscles developed at the same rate as their other muscles.”

3. Placement of sentence adverbs in relative clauses

(a) Flygplanet träffade en kraftledning som flygledningen inte fick in på sin skärm vilket var nära att orsaka en katastrof.
“The plane hit a power line that the air-traffic controllers could not pick up on their monitor, which nearly resulted in a catastrophe.”

(b) *Fartyget rammade en eka som styrmannen observerade inte [inte observerade] på sin radar vilket fick katastrofala följder.
“The ship rammed a rowboat that the helmsman hadn’t noticed on his radar, which had catastrophic consequences.”

4. Adjective agreement in predicative position (example: AGR-Num, plural)

(a) Värdena som legat under det normala i flera veckor och därför inte rapporterats till myndigheterna var nu plötsligt starkt förhöjda.
“The levels that had been below normal for several weeks and therefore not reported to the government authorities were now greatly increased.”

(b) *Skjulen som varit skymda av den höga stenmuren och därför inte existerat i folks medvetande blev nu helt blottlagd [blottlagda].
“The sheds that had been hidden by the high stone wall, and therefore non-existent in people’s consciousness, were now suddenly entirely exposed.”
