
3rd tone sandhi in Standard Chinese: A corpus approach

Jiahong Yuan1 and Yiya Chen2

University of Pennsylvania1, Leiden University2

Abstract

In Standard Chinese, a low tone (Tone3) is often realized with a rising F0 contour

before another low tone; this tone change is known as the 3rd tone sandhi. This study

investigated the acoustic characteristics of the 3rd tone sandhi in Standard Chinese in

telephone conversations and broadcast news speech. The sandhi rising tone was found to

be different from the lexical rising tone (Tone2) in disyllabic words in two measures: the

magnitude of the F0 rise and the time span of the F0 rise. We also found that word

frequency affected the realization of the sandhi rising tone. Specifically, the sandhi rising

tone in highly frequent words exhibited a smaller F0 rise (i.e., a greater difference from

the lexical rising tone) than that observed in less frequent words. This result suggests that

different processes may be involved in producing high- vs. low-frequency words in

Chinese.

Key Words: Tone, Tone sandhi, Conversation, Radio news, Corpus

1. Introduction

In lexical tone languages, in which fundamental frequency (F0) changes differentiate

word meanings, tones frequently undergo changes in connected speech, and surface with

F0 contours that differ from the canonical tonal shapes produced in isolation. This tonal

change process is commonly referred to as tone sandhi. During the last two decades, a

significant amount of research has been conducted regarding tone sandhi in various

Chinese dialects; this research culminated in the work by M. Chen (2000). Although

previous studies have greatly improved our understanding of the tone sandhi phenomena

in general, the weakness in most (if not all) studies is that the generalizations are

primarily based on introspective judgments or laboratory speech of a few speakers. Thus,

it is desirable to complement the existing literature by examining the realization of tone

sandhi in large data corpora with naturally occurring speech. The specific sandhi

phenomenon on which we focus in this paper is the 3rd (low) tone sandhi in Standard

Chinese, in which the first tone in a sequence of two low tones surfaces with a rising F0,

which is comparable to or neutralized with the 2nd lexical tone (rising) in the language.

Previous linguistic studies on the 3rd tone sandhi have mainly been concerned with two

aspects of the phenomenon. The first aspect concerns the formation of the tone sandhi

domain (e.g., Shih, 1986; Zhang, 1988; Chen, 2000; Duanmu, 2000). The general

consensus in the literature is that disyllabic words with two low tones form a 3rd tone

sandhi domain, in which the first low tone changes to a sandhi rising (SR) tone. The

application of the 3rd tone sandhi across linguistic boundaries above the word level is

known to be determined by a number of factors such as syntactic structure, information

structure, speech prosody, and speaking rate (Speer et al., 1989; Shen, 1994; Shih, 1997;

Chen, 2003; Kuo et al., 2007).

The second aspect of 3rd tone sandhi concerns the exact phonetic nature of the derived

SR tone as compared with the lexical rising (LR) tone. The first well-known report

pertaining to the 3rd tone sandhi is Chao (1948), who described the change as the

replacement of the low tone with an LR tone. This view was challenged by two reports

that were published during the same period (Hockett, 1947; Martin, 1957); both

researchers described the SR tone in a stressed position as a new category that is similar

but not identical to the LR tone. In recent decades, the debate has been whether there is

indeed complete neutralization between the SR and LR tones, and if complete

neutralization is not present, then what are the acoustic parameters that differentiate these

tones? Zee (1980), who conducted the first instrumental investigation of the 3rd tone

sandhi to our knowledge, demonstrated that derived SR tones are pronounced with a

lower dip as well as a lower ending F0 than LR tones on the basis of two native speakers

of Beijing Mandarin. This subtle difference between SR and LR tones was supported by

later acoustic studies (Kratochvil, 1984; Shen, 1990; Xu, 1993, 1997; Peng, 2000; Kuo et

al., 2007), although varying magnitudes of the difference between the two tones have been

reported. Based on a review of the literature and an acoustic study of the 3rd tone sandhi

in Taiwan Mandarin, Myers and Tsay (2003) proposed that the 3rd tone sandhi is

processed differently by different groups of Mandarin speakers: native speakers of

Beijing Mandarin apply the 3rd tone sandhi by phonetically modifying Tone3 so that it

sounds more similar to Tone2 whereas speakers of other varieties of Mandarin

categorically replace Tone3 with Tone2.

Despite the consistent trend of differences reported between SR and LR tones, it has

remained unclear whether listeners can hear the difference. Wang and Li (1967)

conducted the first perceptual experiment to test the ability of listeners to differentiate

between SR and LR tones. In the experiment, the subjects were asked to identify whether

a prerecorded word was an SR-Tone3 word (e.g., qi3ma3, ‘at least’) or an LR-Tone3 word

(e.g., qi2ma3, ‘to ride on a horse’). Their results demonstrated that the overall percentage

of accuracy ranged from 49.2% to 54.2% for the 14 listeners who did not participate in

the recording of the stimuli, suggesting that listeners cannot differentiate SR and LR

tones in word identification experiments. However, for the two subjects who recorded the

stimuli, the overall percentages of accuracy were above the chance level at 56.9% and

67.3%, respectively. Peng (2000) conducted a similar word identification experiment and

analyzed the identification results based on the signal detection theory (Macmillan and

Creelman, 2005). In her results, the mean sensitivity index A’ of the 15 listeners was

0.50, which suggested a random guess. However, there were two problems with her

conclusion. First, the standard deviation of A’ was very high (0.17); thus, there was a

significant variability in performance among the listeners. The second problem is that she

calculated the ratios of true and false positives in a manner that differs from that typically

applied in signal detection theory (see footnote 1). Speer and Xu (2008) examined the time-course of

the resolution of lexical ambiguity from the 3rd tone sandhi by tracking the eye-

movements of listeners during a word-monitoring task. Surprisingly, they found that

when listeners heard an LR-Tone3 sequence, they made early glances at the character for

an SR tone, and when they heard an SR-Tone3 sequence, they made early glances at the

character for an LR tone. Their result suggested that the listeners were sensitive to the

fine-grained phonetic differences between LR and SR tones.

1 In the study by Peng (2000), both the identification of Tone3 for underlying Tone2 and the identification of Tone2 for underlying Tone3 were considered false alarms. In standard signal detection theory, however, only one of them should be treated as a false alarm, depending on which tone is treated as ‘positive’ or ‘alarm’.
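The sensitivity index A’ mentioned above can be illustrated with a minimal sketch. It assumes the commonly used nonparametric formula for A’ and arbitrarily treats the SR-Tone3 word as the ‘signal’ class, so that only "SR" answers to LR-Tone3 stimuli count as false alarms. The function names and the response format are hypothetical and are not taken from Peng (2000) or from the present study.

```python
def a_prime(hit_rate, fa_rate):
    """Nonparametric sensitivity index A' (0.5 = chance, 1.0 = perfect)."""
    h, f = hit_rate, fa_rate
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance performance: mirror image of the formula above.
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))


def rates(responses):
    """Hit and false-alarm rates from (presented, answered) pairs.

    Each element is 'SR' or 'LR'. 'SR' is treated as the signal class
    (an assumption made for illustration only).
    """
    sr_answers = [ans for pres, ans in responses if pres == "SR"]
    lr_answers = [ans for pres, ans in responses if pres == "LR"]
    hit_rate = sr_answers.count("SR") / len(sr_answers)
    fa_rate = lr_answers.count("SR") / len(lr_answers)
    return hit_rate, fa_rate
```

Under this convention, a listener who answers "SR" equally often to both stimulus types obtains A’ = 0.5, which corresponds to guessing.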

The studies reviewed above were all based on laboratory speech, excluding the work

of Kratochvil (1984), which analyzed only one speaker. While we in general agree with

the importance and validity of laboratory speech in uncovering phonological patterns and

phonetic realizations (Xu, 2010), the small acoustic differences between the SR and LR

tones that were found in the previous studies must be examined using more naturally

occurring speech. The same argument has been offered regarding the nature of the

“incomplete neutralization” of the voicing contrast in a number of languages, such as

Dutch, German, and Catalan, in which underlying voiced word-final obstruents are

devoiced as a phonological process; however, phonetic studies have found small

differences between underlying voiced and voiceless word-final obstruents. There has

been an extensive debate in the literature regarding whether the incomplete neutralization

of final voicing was an experimental artifact of orthography or laboratory speech

(Fourakis and Iverson, 1984; Jassem and Richter, 1989; Port and Crawford, 1989;

Ernestus and Baayen, 2006; Warner et al., 2006; Kleber et al., 2010).

The goal of this study was to examine the acoustic difference between SR and LR

tones in large corpora of natural speech by expanding our preliminary work reported in

Chen and Yuan (2007). The use of large corpora also provides an opportunity to examine

possible word frequency effects on the acoustic realization of SR and LR tones. The

effect of frequency on speech production has been repeatedly reported in corpus studies

(e.g., Bybee 2002, on word-final /t/ and /d/ deletion rates; Patterson and Connine 2001,

on flap production; and Aylett and Turk 2004, on syllable duration). Zhao and Jurafsky

(2009) found that low-frequency words with mid-range tones in Cantonese are produced

with higher F0 than high-frequency words and that the F0 trajectories of less frequent

words are more dispersed than those of their more frequent counterparts. Zhang and Lai

(2010) demonstrated that “wug” words (i.e., pseudowords) are more resistant to the

application of the 3rd tone sandhi than real words for Mandarin speakers. For the purpose

of this paper, we examined the possible effect of word frequency on the acoustic

realization of SR tones as compared with that for LR tones.

2. Method

2.1. Data

Two large speech corpora were utilized in this study: the HKUST Mandarin Telephone

Speech (LDC2005S15) and the HUB4 Mandarin Broadcast News Speech (LDC98S73).

Broadcast news speech is formal read speech that is produced by well-trained

professional speakers of Standard Chinese; telephone conversation speech is produced by

typical speakers of Standard Chinese who may have different dialectal accents. Syllable

boundaries were automatically obtained through forced alignment using the Penn

Phonetics Lab Forced Aligner (Yuan and Liberman 2008). The CALLHOME Mandarin

Chinese Lexicon (LDC96L15) was used to identify words and tonal sequences from the

corpora.

We analyzed disyllabic words with four tonal sequences: low-low (T3+T3), low-rising

(T3+T2), rising-low (T2+T3), and rising-rising (T2+T2). The main comparison in this

paper is the realization of Tone3 and Tone2 when both tones are followed by Tone3. As a

control, we compared T3+T2 and T2+T2 sequences. Table 1 lists the total number of

tonal sequences used in the study.

Table 1: Total number of tokens for different tonal sequences.

Tonal sequence    HKUST (tel. conversations)    HUB4 (radio news)
(T2+T3)word       8,113                         2,592
(T3+T3)word       3,938                         3,090
(T2+T2)word       6,515                         4,685
(T3+T2)word       8,112                         4,852

2.2. Acoustic Measurements

We first extracted the F0 contour of the target tone, located its minimum F0 and the F0

at the offset of the tone-bearing syllable, and then calculated two measurements. One

measurement is the LogRange of the F0 rise, which is the log of the ratio between the F0

at the syllable offset and the minimum F0. The other measurement is the percentage of the

F0 rise duration derived by calculating the percentage of the duration between the

minimum F0 and the syllable offset over the duration of the tone-bearing syllable. All

measurements were automatically extracted using esps/get_f0 and Python scripts.
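The two measurements defined above can be sketched as follows. This is a minimal illustration rather than the scripts used in the study. It assumes that the F0 track is a list of frame times and F0 values in Hz, that unvoiced frames are marked with 0, that the F0 of the last voiced frame in the syllable stands in for the F0 at the syllable offset, and that the logarithm is natural. The function name and arguments are hypothetical.

```python
import math


def rise_measures(times, f0, syl_start, syl_end):
    """Return (log_range, rise_pct) for one tone-bearing syllable.

    times, f0 : parallel lists of frame times (s) and F0 values (Hz)
    syl_start, syl_end : syllable boundaries (s), e.g. from forced alignment
    """
    # Keep voiced frames inside the syllable (0 Hz = unvoiced, an assumption).
    voiced = [(t, f) for t, f in zip(times, f0)
              if f > 0 and syl_start <= t <= syl_end]
    if len(voiced) < 2:
        return None  # not enough F0 data to measure a rise

    t_min, f_min = min(voiced, key=lambda p: p[1])  # F0 minimum
    _, f_off = voiced[-1]                           # F0 near the syllable offset

    # LogRange: log of the ratio between offset F0 and minimum F0.
    log_range = math.log(f_off / f_min)
    # Rise duration: time from the F0 minimum to the syllable offset,
    # as a percentage of the syllable duration.
    rise_pct = 100.0 * (syl_end - t_min) / (syl_end - syl_start)
    return log_range, rise_pct
```

For a syllable whose F0 falls from 120 Hz to 100 Hz at its midpoint and then rises to 130 Hz at the offset, this sketch gives a LogRange of log(1.3) and a rise duration of 50%.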

3. Results

3.1. Acoustic Realizations

We first examined the acoustic realization of the first syllable in the disyllabic words.

Figure 1 shows that in both telephone conversations and broadcast news, when the

following tone was rising (i.e. in X+T2), X differed significantly in the magnitude of the

F0 rise between the rising and low tones. When the following tone was low (i.e. in

X+T3), the low tone exhibited a greater F0 rise (SR) than in the X+T2

context, but the magnitude of the F0 rise still differed significantly between the

LR and SR tones (telephone conversations: t(7865.5) = 3.45, p = 0.001; broadcast news:

t(5439.0) = 7.1, p < 0.001). The F0 peak of the SR tone was lower than that of the LR

tone. This result is similar to what Peng (2000) observed and compatible with the general

impression in the literature that the rise in the SR tone is slightly less steep than that of

the LR tone.
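The fractional degrees of freedom reported above (e.g., t(7865.5)) are what an unequal-variance (Welch) t-test produces. The paper does not state which test was used, so the following is only a sketch under that assumption. It computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom for two samples of LogRange values; the function name is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two samples."""
    na, nb = len(a), len(b)
    # Squared standard errors of the two sample means.
    va = variance(a) / na
    vb = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation; usually not an integer.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df
```

With a as the LR tokens and b as the SR tokens, a positive t indicates a larger mean F0 rise for the LR tone.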

Figure 1: Means (and ± two standard errors) of the LogRange of the F0 rise within rising vs. low tones when

the tone-bearing syllable either precedes a low tone or a rising tone.

In both telephone conversations and broadcast news, we further observed a significant

difference between the SR and LR tones regarding the percentage of the F0 rise duration

(i.e., the distance from the F0 minimum to the end of the tone-bearing syllable as a

percentage of the total duration of the syllable). Figure 2 shows that when the following

tone was a low tone (i.e. X+T3), the underlying low tone became more like a rising tone

(i.e., an SR tone). However, the percentage of the rise duration of the LR tone was greater

than that of the SR tone (telephone conversations: t(8113.5) = 13.8, p < 0.001; broadcast

news: t(5557.0) = 7.1, p < 0.001). This result indicates that the rise onset of an SR tone is

slightly later than that of an LR tone.

Figure 2: Means (and ± two standard errors) of the percentage of the F0 rise duration over the tone-bearing unit

within rising vs. low tones when the tone-bearing syllable either precedes a low tone or a rising tone.

In summary, both the LogRange of the F0 rise and the percentage of the F0 rise

duration suggest that despite the great similarity between the SR and LR tones, they are

indeed different in the contexts of both broadcast news and telephone conversations.

Thus, the results from both laboratory speech and corpus data conjointly suggest that a

fine phonetic difference exists between the SR and LR tones.

3.2. Frequency Effects

We further examined whether word frequency affects the realization of the SR tone vs.

the LR tone. We focused on two tonal sequences: low-low (i.e., T3+T3) and rising-low

(i.e., T2+T3). For each tonal sequence, we separated the disyllabic words into four

frequency bins (0-10, 10-100, 100-1000, and more than 1000), based on the frequency

counts of 3,431,707 words in the Xinhua newswire. The frequency counts were provided

by the CALLHOME Mandarin Chinese Lexicon. Figure 3 shows that for the low-low

tonal sequence, there was a significant decrease in the LogRange F0 rise of the first low

tone (i.e., the SR tone) for words with high frequency (i.e., more than 1000). For the

rising-low tonal sequence (i.e., T2 preceding T3), such a word frequency effect does not

hold for the LR tone.
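The frequency binning described above can be sketched as follows. The bin labels follow Table 2, which suggests that the upper edge of each bin is inclusive; that reading, the treatment of words missing from the lexicon as having a count of 0, and all names in the code are assumptions made for illustration.

```python
from collections import defaultdict


def frequency_bin(count):
    """Map a word's frequency count to one of four bins."""
    if count <= 10:
        return "<= 10"
    if count <= 100:
        return "10_100"
    if count <= 1000:
        return "100_1000"
    return "> 1000"


def group_by_bin(tokens, lexicon_counts):
    """Group LogRange values by the frequency bin of their word.

    tokens : iterable of (word, log_range) pairs
    lexicon_counts : dict mapping word -> frequency count
    """
    bins = defaultdict(list)
    for word, log_range in tokens:
        # Words absent from the lexicon are counted as 0 (an assumption).
        bins[frequency_bin(lexicon_counts.get(word, 0))].append(log_range)
    return dict(bins)
```

Each bin's values can then be averaged separately for the T3+T3 and T2+T3 words.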

Figure 3: Means (and ± two standard errors) of the LogRange of the F0 rise of the lexical rising tone vs. the

sandhi rising tone within different word frequency ranges.

Figure 4 compares the LR and SR tones for different word frequency bins separately.

The SR tone clearly had a smaller F0 rise than the LR tone for high-frequency

words in both telephone conversations and broadcast news. For low-frequency words,

however, the difference between the two tones was not statistically significant. The

results of t-tests comparing the LR and SR tones for different word frequency bins are

presented in Table 2.

Figure 4: Means (and ± two standard errors) of the LogRange of the F0 rise of the lexical rising tone vs. the

sandhi rising tone within different word frequency ranges.

Table 2: The results of t-tests comparing T2 preceding T3 and T3 preceding T3 on the LogRange of the F0 rise.

Corpus                     Word freq.   Number of tokens             t-test
Telephone conversations    <= 10        T2+T3: 2175; T3+T3: 842      t = -0.52, p = 0.6
                           10_100       T2+T3: 2376; T3+T3: 958      t = -1.2, p = 0.22
                           100_1000     T2+T3: 3393; T3+T3: 920      t = 0.82, p = 0.41
                           > 1000       T2+T3: 169; T3+T3: 1218      t = 3.53, p < 0.001
Broadcast news             <= 10        T2+T3: 385; T3+T3: 335       t = -0.85, p = 0.40
                           10_100       T2+T3: 274; T3+T3: 339       t = 1.17, p = 0.24
                           100_1000     T2+T3: 1447; T3+T3: 1171     t = 2.01, p = 0.04
                           > 1000       T2+T3: 486; T3+T3: 1245      t = 8.50, p < 0.001

Regarding the percentage of the F0 rise duration, the effect of word frequency was less

clear. Nonetheless, as shown in Figure 5, there was a greater difference between the SR

and LR tones for high-frequency words, especially in broadcast news speech.

Figure 5: Means (and ± two standard errors) of the percentage of the F0 rise duration of the lexical rising tone

and the sandhi rising tone within different word frequency ranges.

4. Discussion

The SR and LR tones are acoustically different in both spontaneous telephone

conversations and formal broadcast news. This result is consistent with those of previous

studies that used laboratory speech and demonstrates that the difference is not an artifact

of orthography or laboratory speech. Our study also shows that word frequency affects

the acoustic realization of the SR tone. The SR tone differs more from the LR tone in

high-frequency words, especially with respect to the magnitude of the F0 rise.

Although they appear to be sensitive to the fine acoustic difference between the SR

and LR tones at the subconscious or unconscious level (Speer and Xu, 2008), native

listeners frequently fail to distinguish between the two tones at the conscious level (Wang

and Li, 1967; Peng, 2000). This type of mismatch between speech production and

perception is not a rare phenomenon. Many studies have reported a class of situations in

sound change called “near-mergers” (Labov et al., 1972, 1991; Yu 2007). In these

situations, “speakers consistently reported that two classes of sounds were ‘the same,’ yet

consistently differentiated them in production” (Labov et al., 1991: p. 33). Studies on

“incomplete neutralization” also found that listeners often failed to identify the small

acoustic distinction between the voicing contrasts that are not completely neutralized. For

example, Port and O’Dell (1985) reported that German listeners could distinguish the

syllable-final voiced and voiceless pairs, a well-known example of “incomplete

neutralization”, with only about 60% accuracy (although this number was interpreted as

significantly better than chance in the paper).

Why do listeners fail to identify the difference between the SR and LR tones? From

the perspective of the traditional categorical perception theory (Liberman et al., 1957)

and the feature-based model of lexical access (Stevens, 2002), the SR and LR tones are

perceived as belonging to the same category, the mental representation of which may

consist of a set of tonal features (Wang, 1967; Yip, 1980; Bao, 1999) but contains no

detailed phonetic information. In this framework, phonological encoding precedes mental

lexical access. The phonetic details are the input to the phonological encoding process,

and they are not available in the output of the process. From the perspective of an

exemplar-based model (Johnson, 1997; Pierrehumbert, 2001, 2002), however, the mental

lexicon stores rich and detailed acoustic information. In this framework, “each category is

represented in memory by a large cloud of remembered tokens of that category”

(Pierrehumbert, 2001: p. 140). Although the SR and LR tones are slightly different in

terms of their “means” (i.e., the centers of distribution), their probability distributions

greatly overlap. Both frameworks may explain why listeners often fail to differentiate

between the two tones.

How does a native speaker, without consciously perceiving the difference, maintain

the subtle difference between the SR and LR tones? How does a child acquire the two

tones? Much research needs to be done to answer these questions. Our study indicates

that word frequency affects the acoustic realization of the SR tone. This result suggests

that the production of the 3rd tone sandhi may involve word-dependent processes. There

is no doubt that the 3rd tone sandhi is applied on-line in speech production because it

appears across word boundaries, and its application is determined by factors such as

syntactic and information structure. It is, however, unclear whether the production of the

3rd tone sandhi involves only a postlexical process. It is possible that the two syllables in

high-frequency disyllabic words are stored in long term memory together as one unit,

whereas less frequent ones may be stored as two independent syllables and assembled on-

line in speech production (this hypothesis is logical considering that words are not well-

defined in Chinese). Following this hypothesis, high-frequency words with an underlying

low-low tonal sequence are stored in the mental lexicon as a rising-low sequence, the

rising tone of which is slightly different from the LR tone. The postlexical process of the

3rd tone sandhi is, however, a categorical shift from a low tone to an LR tone. This

hypothesis may explain our result that the SR tone differs more from the LR tone in high-

frequency words, but appears to contradict the result of Zhang and Lai (2010), who found

that “wug” words are more resistant to the application of the 3rd tone sandhi than real

words. Another hypothesis, proposed by Loui et al. (2008) in their study on tone-

deafness, is that multiple neural pathways have evolved to combine consciously and

unconsciously obtained information for sound perception and production. Their study

found that tone-deaf individuals, who could not consciously perceive pitch differences,

could produce pitch intervals in target directions. Additional studies are necessary to test

and refine these hypotheses.

5. Conclusions

This paper examined the 3rd tone sandhi phenomenon in large corpora of natural

speech and analyzed both telephone conversations and formal broadcasts. Our results

confirm previous reports and findings that there are indeed low-level acoustic differences

between the sandhi rising and lexical rising tones both in terms of the magnitude of the F0

rise and the rise duration. Our study demonstrates that the difference is not an artifact of

orthography or laboratory speech. We also found that given a disyllabic word, which is a

3rd tone sandhi domain, word frequency affected the realization of the sandhi rising tone.

Specifically, the sandhi rising tone in highly frequent words exhibited a smaller F0 rise

(i.e., it differs more from the lexical rising tone) than in less frequent words. This result

suggests that different processes may be involved in producing high- vs. low-frequency

words in Chinese.

6. Acknowledgements

Both authors contributed equally to the paper. The work is supported by the U.S.

National Science Foundation (IIS-0964556), the Netherlands Organization for Scientific

Research (NWO-VIDI 016084338), and the European Research Council (ERC-Starting

Grant 206198).

7. References

Aylett, M. and Turk, A. 2006. Language redundancy predicts syllabic duration and the

spectral characteristics of vocalic syllable nuclei. Journal of the Acoustical Society of

America, 119: 3048–3058.

Bao, Z. 1999. The Structure of Tone. Oxford: Oxford University Press.

Bybee, J. 2002. Word frequency and context of use in the lexical diffusion of

phonetically conditioned sound change. Language Variation and Change 14: 261–

290.

Chao, Y. R. 1948. Mandarin Primer. Cambridge: Harvard University Press.

Chen, M. 2000. Tone Sandhi. Cambridge: Cambridge University Press.

Chen, Y. 2003. The phonetics and phonology of contrastive focus in Standard Chinese.

PhD dissertation. Stony Brook University.

Chen, Y. and Yuan, J. 2007. A Corpus Study of the 3rd Tone Sandhi in Standard Chinese.

Proceedings of Interspeech 2007. pp. 2749-2752.

Duanmu, S. 2000. The Phonology of Standard Chinese. Oxford: Oxford University Press.

Ernestus, M. and Baayen, H. 2006. The functionality of incomplete neutralization in

Dutch: The case of past-tense formation. In Goldstein, L., Whalen, D. and Best C.

(eds.), Laboratory phonology 8. pp. 27–49. Berlin: Mouton de Gruyter.

Fourakis, M. and Iverson, G. 1984. On the incomplete neutralization of German final

obstruents. Phonetica 41: 140–149.

Hockett, C. F. 1947. Peiping phonology. Journal of the American Oriental Society 67: 253-

267. Reprinted 1964 in M. Joos (ed.) Readings in Linguistics I, fourth edition. pp.

217-228. University of Chicago Press.

Jassem, W. and Richter, L. 1989. Neutralization of voicing in Polish obstruents. Journal

of Phonetics 17: 317–325.

Johnson, K. 1997. The auditory/perceptual basis for speech segmentation. Ohio State

University Working Papers in Linguistics 50: 101-113.

Kleber, F., John, T. and Harrington, J. 2010. The implications for speech perception of

incomplete neutralization of final devoicing in German. Journal of Phonetics 38: 185-

196.

Kratochvil, P. 1984. Phonetic tone sandhi in Beijing dialect stage speech. Cahiers de

Linguistique - Asie Orientale 13:135-174.

Kuo, Y., Xu, Y. and Yip, M. 2007. The phonetics and phonology of apparent cases of

iterative tonal change in Standard Chinese. In Gussenhoven, C. and Riad, T. (eds.)

Experimental Studies in Word and Sentence Prosody. pp. 211-237. Berlin: Mouton de

Gruyter.

Labov, W., Karen, M. and Miller, C. 1991. Near-mergers and the suspension of

phonemic contrast. Language Variation and Change 3: 33–74.

Labov, W., Yaeger M. and Steiner R. 1972. A quantitative study of sound change in

progress. Philadelphia: U.S. Regional Survey.

Liberman, A. M., Harris, K. S., and Hoffman, H. S. 1957. The discrimination of speech

sounds within and across phoneme boundaries. Journal of Experimental Psychology

54: 358-368.

Loui, P., Guenther, F. H., Mathys, C., and Schlaug, G. 2008. Action-perception mismatch

in tone-deafness. Current Biology 18: R331-332.

Macmillan, N. A. and Creelman, C. D. 2005. Detection Theory: A User's Guide (2nd

edition), Lawrence Erlbaum Associates, Inc.

Martin, S. E. 1957. Problems of hierarchy and indeterminacy in Mandarin phonology.

Bulletin of the Institute of History and Philology 29:209-230. Taipei.

Myers, J., and Tsay, J. 2003. Investigating the phonetics of tone sandhi. Taiwan Journal

of Linguistics 1: 29-68.

Patterson, D., and Connine, C. 2001. Variant frequency in flap production. Phonetica 58:

254–275.

Peng, S. 2000. Lexical versus 'phonological' representations of Mandarin Sandhi Tones.

In Broe M. and Pierrehumbert J. (eds.), Papers in laboratory phonology 5:

acquisition and the lexicon. pp. 152-167. Cambridge: Cambridge University Press.

Pierrehumbert, J. 2001. Exemplar dynamics: Word frequency, lenition, and contrast. In

Bybee, J. and Hopper, P. (eds.) Frequency effects and the emergence of lexical

structure. pp. 137-157. John Benjamins, Amsterdam.

Pierrehumbert, J. 2002. Word-specific phonology. Laboratory Phonology 7. pp. 101-139.

Mouton de Gruyter, Berlin.

Port, R. and Crawford, P. 1989. Incomplete neutralization and pragmatics in German.

Journal of Phonetics 17: 257–282.

Port, R. and O’Dell, M. 1985. Neutralization of syllable-final voicing in German. Journal

of Phonetics 13: 455-471.

Shen, J. 1994. Beijinghua shangshen liandu de diaoxing zuhe he jiezou xingshi [F0 and

rhythm of the 3rd tone Sandhi in Beijing Mandarin], Zhongguo Yuwen 4: 274-281.

Shen, X. S. 1990. Tonal coarticulation in Mandarin, Journal of Phonetics 18: 281-295.

Shih, C. 1986. The Prosodic Domain of Tone Sandhi in Chinese. PhD dissertation.

University of California at San Diego.

Shih, C. 1997. Mandarin third tone sandhi and prosodic structure. In J. Wang & N. Smith

(eds.), Studies in Chinese Phonology. pp. 81-124. Dordrecht: Foris.

Speer, S. R., Shih, C.-L., and Slowiaczek, M. L. 1989. Prosodic structure in language

comprehension: Evidence from tone sandhi in Mandarin. Language and Speech 32:

337-354.

Speer, S. R. and Xu, L. 2008. Processing lexical tone in third-tone sandhi, Labphon 11

abstracts. pp. 131-132.

Stevens, K. N. 2002. Toward a model for lexical access based on acoustic landmarks and

distinctive features. J. Acoust. Soc. Am. 111: 1872-1891.

Wang, W. S-Y. and Li, K. P. 1967. Tone 3 in Pekinese. Journal of Speech and Hearing

Research 10: 629-636.

Wang, W. S-Y. 1967. Phonological features of tone. International Journal of American

Linguistics 33:93-105.

Warner, N., Good, E., Jongman, A., and Sereno, J. 2006. Orthographic versus

morphological incomplete neutralization effects. Journal of Phonetics 34: 285-293.

Xu, Y. 1993. Contextual tonal variation in Mandarin Chinese. PhD dissertation.

University of Connecticut.

Xu, Y. 1997. Contextual tonal variations in Mandarin. Journal of Phonetics 25: 61-83.

Xu, Y. 2010. In defense of lab speech. Journal of Phonetics 38: 329-336.

Yip, M. 1980. The Tonal Phonology of Chinese. Ph.D. dissertation. MIT.

Yu, A. 2007. Understanding near mergers: The case of morphological tone in Cantonese.

Phonology 24: 187-214.

Yuan, J. and Liberman, M. 2008. Speaker identification on the SCOTUS corpus.

Proceedings of Acoustics ’08. pp. 5687-5690.

Zhang, J. and Lai, Y. 2010. Testing the role of phonetic knowledge in Mandarin tone

sandhi. Phonology 27: 153-201.

Zhang, Z. 1988. Tone and Tone Sandhi in Chinese. PhD dissertation. Ohio State

University.

Zhao, Y. and Jurafsky, D. 2009. The effect of lexical frequency and Lombard reflex on

tone hyperarticulation. Journal of Phonetics 37: 231-247.

Zee, E. 1980. A spectrographic investigation of Mandarin tone sandhi. UCLA Working

Papers in Phonetics 49: 98-116.
