Aspects of Persian Phonology and Morpho-phonology · 2012-11-03 · ii . Aspects of Persian . Phonology and Morpho-phonology. Elham Rohany Rahbar . Doctor of Philosophy . Department
Aspects of Persian
Phonology and Morpho-phonology
by
Elham Rohany Rahbar
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Appendix 8 - The list of o- and u-final made-up words used in the experiment
Appendix 9 - The list of e-final made-up words used in the experiment
Appendix 10 - The list of real words used in the experiment for all three processes
Chapter 1
Introduction
1.1. Goals
The specific goals of this thesis are twofold. First, I examine the structure of the Persian
vowel system with the goal of understanding the asymmetries in patterning of different
vowels in the system. Second, I examine some suffixation processes in Persian, with the
goal of understanding the seeming irregularities in morphophonemics. The more general
goals are to contribute to the discussions of determining active features of a vowel system
and the processes and interactions of vowels and consonants which occur at a morpheme
boundary, based on evidence from Persian.
Persian, also known as Farsi, is an Iranian language within the Indo-Iranian branch of the
Indo-European family (Trask 1996, Ghomeshi 1996, Lewis 2009). The major dialects of
Persian are Persian spoken in Iran, Dari spoken in Afghanistan, and Tajik spoken in
Tajikistan (Windfuhr 1987, Toosarvandani 2004). The focus of this study is on Persian or
Modern Persian spoken in Iran. The dialect of Modern Persian under study here is
Standard Persian. Within this dialect, there are differences between informal daily speech
and formal speech/written form. The former is under discussion in this study.
The Persian vowel system, a controversial topic in the Persian literature, has attracted
attention and various analyses have been offered for the system. I review and critique
these analyses, and argue that a re-examination of Persian vowels, and of the phonological
processes in which the vowels are involved, sheds light on the structure of the Persian
vowel system.
This thesis also aims to account for particular suffixation processes in which v and ɡ occur at a suffix boundary between vowel-final roots and vowel-initial suffixes. The
occurrence of these consonants has been discussed in the literature on Persian. No
thorough account, however, has been provided for their occurrence. I show that a careful
study, theoretically and experimentally, gives significant insight into the occurrence of
these consonants.
Thus the major empirical goal of this study is to account for some Persian phonological
and morpho-phonological processes which are controversial or less studied.
Theoretically, this work contributes to the field of phonology by arguing that, in the light
of the theory of Modified Contrastive Specification and markedness and contrast as
defined by this theory, in order to determine the active features of a system, phonological
activity in that language should be taken into consideration. This study contributes to our
understanding of phonological theory by analyzing the Persian vowel system considering
markedness and contrast, concepts of central and current importance and controversy in
the theory. In particular, this work contributes to studies on determining active features of
vowel systems. It is sometimes difficult to determine whether contrasts in a vowel system
are based on quantity or quality (see, for instance, van Oostendorp 1995, Odden 2011).
The Persian vowel system presents an interesting example in this respect.
On the morpho-phonological side, the thesis touches upon vowel as well as consonant
occurrence at a suffix boundary and raises issues concerning the interaction of vowels
and consonants in suffixation. Thus it contributes to the field of morpho-phonology with
respect to processes occurring in suffixation.
Another aspect of the thesis is its historical perspective on the processes under study.
Although the goal of this work is to account synchronically for the Persian vowel system
and for some morpho-phonological processes in the language, the discussion of vowels
and morpho-phonological processes benefits from a historical view as well.
In addition to its theoretical dimension, this thesis includes experiments on morpho-
phonological processes. Thus, it also contributes to the field of experimental phonology
and to the growing body of experimental work designed to probe linguistic patterns (e.g.,
Ohala 1987, Albright and Hayes 2003, Hirata 2004, Kang 2007, Alderete and Kochetov
2009, Babel and Johnson 2010, Johnson and Babel 2010, Kawahara 2011, Cohn,
Fougeron, and Huffman 2011).
1.2. Theoretical foundations
Markedness and contrast have been the focus of considerable research and different
proposals have been made about their roles in phonological theory (e.g., Saussure 1916,
2009). Throughout this work, I use the notions of contrast and markedness as defined by
this theory. I discuss these below, presenting examples.
1 The Sound Pattern of English (SPE), for example, takes underlying representations to be fully specified (Chomsky and Halle 1968); radical underspecification does not distinguish contrastive and non-contrastive features in phonological computation, that is, redundant features might be available to some phonological rules (e.g., Archangeli 1984, Archangeli and Pulleyblank 1994); phonetically-based phonological theories take non-contrastive phonetic features into account in phonology (e.g., Steriade 1997, Kirchner 1997). With respect to markedness, on one view, as observed in some OT-based accounts, there are fixed universal hierarchies for markedness of features (e.g., McCarthy and Prince 1994, Urbanczyk 1996, Beckman 1997, Lombardi 2002); another view, however, looks at markedness from a language-particular perspective, so that different features can pattern as marked in different languages (e.g., Rice 2004).
feature is literally not marked or is absent, while the marked element is present. Two
languages with the same surface inventory can have different unmarked features because
the two systems may have different choices of features and also contrasts may be built up
differently in the two systems (see Dresher 2003a, 2003b, Rice 2004, Dresher 2009). As a
consequence of different pathways to the same surface inventory, variation in markedness
is observed even in similar inventories. For example, Yoruba (Niger-Congo) and Gengbe
(Niger-Congo) have identical vowel inventories but behave differently from one another
with respect to markedness. In Yoruba, the vowel /i/ shows the unmarked patterning,
while in Gengbe, the vowel /e/ does so (see Abaglo and Archangeli 1989).
In this theory, feature contrasts and phonological patterning are taken into consideration
to determine which feature is unmarked in a language (Rice 1999). Various diagnostics
are proposed in the literature in order to determine markedness. In particular, unmarked
elements result from neutralization, are likely to be epenthetic, are the target of
assimilation, and are lost in coalescence and deletion (see Rice 1999, 2007 for a
summary). In the theory of Modified Contrastive Specification, markedness is a matter of
structure.2
2 For a different view see de Lacy (2006).
Unmarked elements have less structure than marked ones. Since languages
differ in the amount of structure that they need (depending on relevant contrasts),
different features can play the role of unmarked across languages. How is the amount of
structure decided for a language? Contrast is the determining factor (e.g., Avery and Rice
1989, Dresher 2003a, 2003b, 2003c, 2009). For example, consider vowel height. When
there is no contrast in a vowel system, that is, in a vowel system of only one height, a
vowel can potentially be realized at any height (the language-particular properties
determine which height). In a two-height system one can be [low] and the other non-low
(this non-low height can be phonetically mid or high); or one can be [high] and the other
non-high. In a three-height system, there are various possibilities (Dresher, Piggott, and
Rice 1994). For instance, if [low] and [high] are marked based on the phonological
activities of the language under study, mid vowels will show unmarked patterning (for
example they are targets of harmony or used as epenthetic vowels), as argued by Dyck
(1995) for Iberian Spanish dialects, where metaphony (vowel raising or high harmony)
targets mid vowels. If, however, the features [low] and [mid] are specified, high vowels
would be expected to show unmarked patterning. An example of the latter case is
Yoruba, where /i/ shows the unmarked patterning (see Pulleyblank 1998, 2003).
The key point, thus, is that phonological activities are the main diagnostics for
determining contrast and markedness of features in a system because, as the theory of
Modified Contrastive Specification suggests, only contrastive features are present
underlyingly, and are, therefore, active in the phonology of a language. In this work,
when I use ‘active feature’, ‘phonologically active feature’, ‘dimension/basis of contrast’,
‘contrastive feature’, I mean features which are marked or underlyingly present as
defined in this theory.
An important aspect of this theory is that contrastive features are ordered into a
contrastive hierarchy (e.g., Dresher 2003a, 2003b, 2003c, Dresher and Xi 2005, Dresher
2009). Thus not only is identifying the active features in a system important, but so is the
order in which they enter into a system.
Assume a language with the following vowel system: /u, i, e, a/. Let us further assume
that this language has the following contrastive hierarchy: [low] > [high] > [peripheral]
(following Rice (1995, 2002), I consider [coronal] and [peripheral] − whose phonetic
realization can be [labial] or [dorsal] − as vowel place features). [low] > [high] >
[peripheral] means [low] is ordered before [high] which is ordered before [peripheral].
Thus [low] makes the first cut in the system:
(1) First cut made by [low]
i u
e
______________________________
a [low]
/a/ does not need to be further specified because it is distinguished from all other vowels
by [low], and there are no other low vowels from which /a/ needs to be distinguished. In
the non-low region, we enter [high] given the assumed hierarchy for this language. The
result is shown in (2).
(2) Second cut made by [high]
i u [high]
______________________________
e
______________________________
a [low]
/e/ does not need to be further specified. We do, however, need a feature to distinguish /i/
and /u/ from each other. [peripheral] then makes the third cut, but only in the [high]
region, as shown in (3).
(3) Third cut made by [peripheral]
[high] i u [high], [peripheral]
______________________________
e
______________________________
a [low]
Based on [low] > [high] > [peripheral] (or [high] > [low] > [peripheral] – they yield the
same results), the feature specifications in this language are as follows. The √ shows
where a feature is present/specified.
(4) The feature specifications of /u, i, e, a/ assuming [low] > [high] > [peripheral]
               i    u    e    a
[low]                         √
[high]         √    √
[peripheral]        √
Now assume the contrastive hierarchy in this language is [low] > [peripheral] > [high].
The first cut, [low], gives us the following (the same as in (1)):
(5) First cut made by [low]
i u
e
______________________________
a [low]
Next, we enter [peripheral], as shown in (6).
(6) Second cut made by [peripheral]
i u [peripheral]
e
______________________________
a [low]
/u/ does not need further specification. Now [high] enters to make a cut between /i/ and
/e/.
(7) Third cut made by [high]
[high] i u [peripheral]
____________
e
____________________________
a [low]
The [low] > [peripheral] > [high] order gives the following feature specifications.
(8) The feature specifications of /u, i, e, a/ assuming [low] > [peripheral] > [high]
               i    u    e    a
[low]                         √
[peripheral]        √
[high]         √
Comparing the features of /u/ in (4) and (8), we see that even in the same inventory (e.g.,
/i, u, e, a/) with the same set of active features (e.g., [low], [high], [peripheral]), vowels
can be distinguished differently. But how can we decide which order is the right one in a
language? We need to look at the phonological processes of that language.
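The two derivations in (1)-(8) can be traced mechanically. The following is a minimal sketch in Python, not part of the original analysis: the function name `contrastive_specs` and the dictionary `PHONETIC` are hypothetical, features are treated as privative, and the phonetic properties listed for each vowel (in particular, leaving /a/ without [peripheral]) are assumptions made only for illustration.

```python
# Assumed (illustrative) phonetic properties of the hypothetical inventory /i, u, e, a/.
PHONETIC = {
    "i": {"high"},
    "u": {"high", "peripheral"},
    "e": set(),
    "a": {"low"},
}

def contrastive_specs(inventory, hierarchy):
    """Assign privative features by successive cuts, following the given order."""
    specs = {v: [] for v in inventory}
    cells = [list(inventory)]          # start with one undivided set of vowels
    for feature in hierarchy:
        next_cells = []
        for cell in cells:
            marked = [v for v in cell if feature in inventory[v]]
            unmarked = [v for v in cell if feature not in inventory[v]]
            if marked and unmarked:    # the feature makes a cut in this cell
                for v in marked:
                    specs[v].append(feature)
                next_cells += [marked, unmarked]
            else:                      # no contrast here, so nothing is specified
                next_cells.append(cell)
        cells = next_cells
    return specs

order_a = contrastive_specs(PHONETIC, ["low", "high", "peripheral"])
order_b = contrastive_specs(PHONETIC, ["low", "peripheral", "high"])
print(order_a["u"])  # ['high', 'peripheral'], as in (4)
print(order_b["u"])  # ['peripheral'], as in (8)
```

Under these assumptions, the same inventory and the same three features give /u/ two specifications under one order and one under the other, which is the difference between (4) and (8).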
It should be noted that phonetic measurements need not match
phonological patterning based on Modified Contrastive Specification. The vowel /e/, as
an example, may pattern with /i/ in a language although the former is phonetically mid
and the latter is high. See, for example, Dresher and Zhang (2005). They consider /i/ in
Written Manchu to be phonetically ATR but phonologically neutral with respect to
harmony. When in position of trigger in harmony, /i/ only occurs with non-ATR vowels.
So a distinction is made between being phonetically ATR, but not being so
phonologically. That is why considering phonological activity is so important. One
cannot decide on the features for a segment just based on its phonetics. The phonological
activity of that segment is the crucial factor.
This is the framework within which my analysis of the Persian vowel system is
presented.
1.3. Morpho-phonological issues
As noted above, one of the goals of this thesis is to examine some Persian morpho-
phonological processes which show irregularities and which involve vowels and
consonants. One of these processes involves vowel epenthesis at a suffix boundary
seemingly with particular stem structures. The other two involve the occurrence of
consonants at a suffix boundary between a vowel-final stem and a vowel-initial suffix,
namely the occurrence of v after o-final and u-final words and the occurrence of ɡ after e-
final words. These processes raise discussion of several issues such as the underlying
representation and the historical background of vowel-final words in Persian, the effect of
phonetic environment, vowel epenthesis, consonant epenthesis, minimal word
requirements, and phonologically-conditioned allomorphy, all of which contribute to our
understanding of Persian vowel structure and activity at a suffix boundary and of vowel-
consonant interactions in that environment. Given that I examine these processes both
from a theoretical perspective and an experimental viewpoint, accompanied by a
historical investigation, in-depth insight into these processes, and consequently into
Persian phonology and morpho-phonology, is gained.
1.4. Organization of the thesis
The structure of the thesis is as follows.
Chapter 2 provides background on the Persian vowel system and discusses the active
features of the system, which is a controversial issue in Persian phonology and for which
different views are presented in the literature. I review the literature in this regard and
evaluate the evidence it presents. I examine Persian phonological processes and suggest
that vowel harmony is a decisive process with respect to the phonologically active feature
of the Persian vowel system.
In chapter 3, I propose a featural analysis of the system based on the tense/lax
distinction. I present the contrastive hierarchy of features of the system and discuss
markedness of features in the system. A phonetic study of Persian tense and lax vowels,
harmony across laryngeals, pre-nasal raising, and diphthongs are also discussed in this
chapter.
In chapters 4, 5, and 6, I discuss three processes which appear to provide support for a
quantity-based analysis of the system. Chapter 4 examines epenthesis in suffixation.
Chapter 5 discusses VCC co-occurrence restrictions. Chapter 6 deals with minimal word
requirements. I will argue that none of these processes provides an argument for quantity
in the system and that they are in fact compatible with the underlying tense-based
account.
Chapters 7 and 8 evaluate two morpho-phonological processes occurring at a suffix
boundary between vowel-final bases and vowel-initial suffixes. In chapter 7, I discuss the
occurrence of v after o- and u-final words in suffixation. Chapter 8 presents discussion of
ɡ after e-final words in suffixation.
Chapter 9 presents a summary of the thesis.
Chapter 2
The active features of the Persian vowel system: the problem
In this chapter, the organization of the Persian vowel inventory will be discussed. The
particular question which will be addressed is: what are the phonologically active features
in the system? The main distinguishing feature among vowels in Persian has been a
matter of debate in the literature and different accounts have been provided in this regard.
Some consider the system to be height-based (e.g., Samareh 1977, Pisowicz 1985) and
others consider it to be quantity-based (e.g., Hayes 1979). In addition, a synthetic analysis
which requires both quality and quantity is found in the literature (Toosarvandani 2004).
In this chapter, I will review the literature and discuss the evidence it presents in favor of
each of these views. I will then show the problems each of these encounters and conclude
that the arguments offered in the literature for these views are inconclusive. Following
the assumption that in order to determine the phonologically contrastive feature of a
system one needs to examine the phonological activities of that system, as argued by
Modified Contrastive Specification (see section 1.2), I will argue that there exists an
active phonological process in the language, namely vowel harmony, which strongly
supports a feature-based (i.e., qualitative) analysis for the Persian vowel system. I start
my discussion with the vowel inventory and its background.
2.1. The Persian vowel inventory: background
Modern Persian has the surface vowel system given in (1):
(1) i u
e o
a ɑ
This arrangement of Persian vowels is usually seen in the literature (e.g., Samareh 1985,
Pisowicz 1985, Darzi 1991, Meshkatod Dini 1999). There are differences in the literature
with respect to the symbols chosen for the vowels, in particular the symbols used for the
two low vowels vary. Some examples of the symbols used in the literature for the two
low vowels which are indicated by a and ɑ in (1) are respectively as follows: æ and a (Darzi 1991), æ and ɑ (Zolfaghari Serish and Kambuziya 2005), ɑ and ɑ (Lazard 1992),
a and â (Windfuhr 1979, Najafi 2001), a and ā (Hayes 1979). Toosarvandani (2004)
uses the low vowel symbols given in (1). The symbols used for the other vowels are
more or less agreed upon in the literature. Sometimes “:” or “¯” might be used in the
literature when referring to length (i.e., i:, u:, ɑ:, ī, ū, ɑ̄). I use the IPA symbols for low
(unrounded) vowels and represent the vowels as presented in (1) throughout this work
except for when I quote directly from other studies or when, for the sake of argument, it
is necessary to keep the symbols as they are in the original work.
I now present the different views found in the literature regarding the Persian vowel
system with respect to this phonological contrast. Three possibilities are proposed, as
follows (I show the vowels in (2)-(4) as they are shown in the surface vowel system; a
discussion of the underlying representation of the system based on the three possibilities
is given in section 2.2):
(2) Quantity-based analysis
    short      long
    e   o      i   u
      a          ɑ
(3) Quality-based analysis
high i u
mid e o
low a ɑ
(4) Synthetic analysis (integrating quality and quantity)
    short      long
               i   u      high
    e   o
    a          ɑ          low
By “quantity” I mean phonological length with long vowels bimoraic and short vowels
monomoraic (based on, for example, Hyman 1985, Hayes 1989). Perlmutter (1995),
discussing prosodic theories of quantity, says that “The important point of agreement is
that the understanding of quantity requires consideration of prosodic structure above the
segmental level” (p. 311). Gordon (2004) notes that “Contrasts in segmental length are
represented by assuming that long segments are associated with two weight units, while
short segments are associated with one unit of weight” (p.3). Considering the mora as a
unit of weight (van der Hulst 1984, Hyman 1985, Hayes 1989, McCarthy and Prince
1995, Perlmutter 1995 among others), what makes a vowel long is association to an
additional mora, that is, a vowel linked to two morae is long (e.g., Perlmutter 1995,
Tranel 1995, Fitzgerald (forthcoming)). In this work, assuming a moraic theory of the
syllable, when I use ‘long’ for a vowel I mean a bimoraic vowel and when I use ‘short’ I
mean a monomoraic vowel. Vowel quantity or phonemic length, therefore, is associated
with syllable structure and the number of morae a vowel carries. A vowel system is
quantitative if quantity is the dimension of contrast in that system, that is, if the vowels
are grouped based on the number of morae they occupy.
By “quality” I mean a feature; thus a vowel system is qualitative if a feature (e.g., height,
tenseness, ATR) is the basis of contrast in that system. Note that quantity or length is not
a feature, as discussed in the previous paragraph. In the literature on Persian which argues
for quality in the system, that quality is considered to be height, as shown in (3).
The three views given in (2)-(4) will be discussed in section 2.2. Before turning to these
positions, it is useful to give some historical background on Persian.
2.1.1. Persian historical background
The Old Persian vowel system and the Middle Persian vowel system both were believed
to have been quantitative. Purely quantitative views consider the system of Modern
Persian to remain quantitative, as the former vowel inventory of the language was, while
purely qualitative ones argue that a change of quantity to quality occurred, restructuring
the inventory. In the synthetic analysis, the vowel system of the language still shows a
quantity contrast in addition to a quality contrast which is due to ongoing historical
change from quantity to quality.
Three historical eras are proposed in the evolution of Persian (Bahar 1942, Natel Khanlari
1987 among others), as follows:
i. The old era: from the earliest documents to the end of the Achaemenid empire
(559-331 BC)
ii. The middle era: from the beginning of the Sassanid empire to the Arabs’ conquest
of Iran in 652 AD
iii. The modern era: from the dominance of Islam in Iran to the present time.
It is generally agreed that the Old Persian vowel inventory, presented below in (5), was a
quantity-based system (e.g., Natel Khanlari 1987, Beekes 1997).
(5) The Old Persian vowel inventory
i ī u ū
a ā
Diphthongs: ai
au
As (5) shows, the system had three pairs of vowels, each of which included a short
vowel with its long counterpart. In addition, the system had two diphthongs.
It is believed that the Middle Persian vowel system showed no change from the Old
Persian vowel system except for the monophthongization of the diphthongs, as follows:
the Old Persian diphthongs ai and au became ē and ō3 respectively in Middle Persian
(Salemann 1930, Rastorgueva 1969, Windfuhr 1979, among others). The Middle Persian
vowel system is given in (6):
(6) The Middle Persian vowel inventory
i ī u ū
ē ō
a ā
3 The vowels ē and ō were called majhul meaning ‘unfamiliar’ (Amouzgar and Tafazzoli 1994, Natel Khanlari 1987, Zomorrodian 1999, among others).
It should be noted that, according to Natel Khanlari (1987), “there is no doubt that
quantity was not the only difference between short and long vowels in the Middle
Persian. Quality differences, as we observe in the modern system, existed, too” (p. 255).
Also, in addition to the vowels given in (6), in a few publications (McKenzie 1971,
Amouzgar and Tafazzoli 1994) the vowels e and o are included in the system with the
explanation that their phonemic status is doubtful (McKenzie 1971) and that perhaps they
were allophones (Amouzgar and Tafazzoli 1994)4. The inventory in (6) shows that, as in
the Old Persian vowel system, in Middle Persian, quantity was important. The difference
is that the Middle Persian vowel system has two more long vowels, which were
diphthongs in the old system.
It is generally agreed that the Modern Persian vowel system, presented in (7),
distinguishes its vowels by quality, with quantity playing a secondary role (Samareh
1977, 1985, Pisowicz 1985, Darzi 1991, Meshkatod Dini 1999 among others). That is, a
change occurred in the Persian vowel system from the middle to the modern era which
resulted in a quality-based vowel system for Modern Persian.
(7) The Modern Persian vowel inventory
i u
e o
a ɑ
4 Amouzgar and Tafazzoli (1994) do not mention which phonemes e and o were allophones of. They are given within parentheses beside i and u, which probably means that they were considered to be allophones of i and u.
Several changes must have occurred to change the inventory from (6) to (7): (i) the loss
of two vowels (i.e. ē and ō); (ii) the loss of quantity in the system; (iii) the appearance of
the mid vowels e and o; (iv) backing and fronting in the low vowels.5
The focus of this thesis is on the synchronic status of the Persian vowel system,
particularly with respect to its active features. However, it is important to have some idea
of the historical development since historical arguments are used as evidence in the
synchronic analysis.
I now turn to an investigation of the types of evidence that might distinguish the analyses.
2.2. Literature review
The Modern Persian vowel system, as noted above, is generally considered to be a
quality-based system in which quantity has a secondary role or, in other words, is non-
contrastive. That is, as seen in the literature (e.g., Samareh 1977, Zomorrodian 1999), ɑ, i, u are considered to be non-contrastively long, as displayed in (3). Nevertheless, there
are a few studies (e.g., Hayes 1979, Windfuhr 1979) which consider quantity to still be
the active feature in the system. In this view, ɑ, i, u are phonologically long, or bimoraic,
while a, e, o are phonologically short or monomoraic, as in (2). There are also some
studies which consider quality and quantity to both be active in the system
(Toosarvandani 2004), as in (4). In this view, a synthetic analysis which includes both
quality and quantity is offered and the vowel system of Modern Persian is considered to
be in a “transition state” between the purely quantitative system of classical Persian (see
(6)) and the system of future Persian (as in (3)), which will eliminate any phonological
evidence for quantity altogether and keep quality as the only distinguishing feature. In
this view, a, e, o are short, and ɑ, i, u are long. In addition, i and u are high, a and
ɑ are low, and e and o are mid ([-high, -low]).
5 It is widely believed that i and u lowered to e and o and also ē and ō merged with ī and ū from Middle to Modern Persian (e.g., Pisowicz 1985).
Thus, in general, there are three views in the literature regarding the structurally active
feature of the Persian vowel system, as follows.
(i) In the view according to which quality is active and quantity is not phonologically
relevant, the system is presented in the following way. As suggested in the literature,
height is the qualitative feature involved.
(8) [front] [back]
i u [high]
e o [mid]
a ɑ [low]
It should be noted that following the framework of Modified Contrastive Specification
adopted here, only two of the three height levels observed above can be underlyingly
present or marked, and not all three of them (see section 1.2). I leave this discussion aside
for the moment and return to it in 3.1.1. The point to note is that the vowels are
distinguished from each other by height differences. Given the nature of the system,
having only height is tenable (leaving aside place features).
(ii) In the view according to which quantity is active, the system is presented in the
following way. The phonetic realization of each vowel is given below it. I write features
in square brackets (see (8) and (10)). Length is not written in square brackets since it
involves a quantity distinction, represented with morae (see section 2.1).
(9) short long short long
i ī u ū
[e] [i] [o] [u]
short a ā long
[a] [ɑ]
Thus, the system is essentially a three-vowel system, plus a length contrast. Given the
nature of the system, having only quantity in the system is tenable.
(iii) In the view according to which quantity and quality both are active, the system is as
in (10).
(10) [high], long ī ū [high], long
[mid], short e o [mid], short
[low], short a ā [low], long
Assuming that features enter into a system to show contrasts, as required by the
framework of Modified Contrastive Specification, in the Persian vowel system, if vowels
are distinguished by quality, quantity is not required to be active for the same contrast
and if they are distinguished by quantity, there is no need for quality to be present. Thus,
given the nature of the system, under the theoretical assumptions of the adopted
framework, having both quality and quantity is untenable.
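The redundancy argument can be checked by brute force. The sketch below is an illustration under stated assumptions, not the thesis's own formalization: the helper `distinguishes` and the dictionary `VOWELS` are hypothetical, and the place values [front]/[back] are carried over from (8) because (10) does not list them.

```python
# Each vowel described by place, height, and length, combining (8) and (10).
# The place values are an assumption carried over from (8).
VOWELS = {
    "i": {"place": "front", "height": "high", "length": "long"},
    "u": {"place": "back",  "height": "high", "length": "long"},
    "e": {"place": "front", "height": "mid",  "length": "short"},
    "o": {"place": "back",  "height": "mid",  "length": "short"},
    "a": {"place": "front", "height": "low",  "length": "short"},
    "ɑ": {"place": "back",  "height": "low",  "length": "long"},
}

def distinguishes(dimensions):
    """True if the given dimensions alone keep all six vowels distinct."""
    keys = {tuple(props[d] for d in dimensions) for props in VOWELS.values()}
    return len(keys) == len(VOWELS)

quality_only = distinguishes(["place", "height"])
quality_and_quantity = distinguishes(["place", "height", "length"])
print(quality_only, quality_and_quantity)
```

If height and place already separate all six vowels in this encoding, adding length makes no further cut, which is the sense in which a second contrastive dimension is not needed for the same contrast.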
Note that the system could be mixed with, for instance, low vowels distinguished by
quantity and non-low vowels by quality, as in (11).
(11) i u [high]
e o [mid]
short a ɑ long
This is different from the synthetic system in (10) in which all vowels are distinguished
by both quality and quantity. I set this alternative aside. The synthetic analysis will be
discussed in 2.2.3.
The assumptions provide a research direction: if it is assumed that the view presented
above in (iii) (i.e., synthetic analysis) is not theoretically possible (we will see that
evidence from phonological activity of the language shows that a synthetic analysis is not
required), are there arguments to distinguish between the views presented in (i) (i.e.,
purely qualitative) and (ii) (i.e., purely quantitative)? In the remainder of this chapter, I
examine the evidence in the literature for the phonologically active feature in Modern
Persian. I show that what are widely considered as arguments in favor of quality are, in
fact, inconclusive (section 2.2.1). I also discuss evidence presented in the literature for
Modern Persian being quantitative and show that this evidence also is inconclusive
(section 2.2.2). I further show the problems faced by the synthetic analysis (section
2.2.3). Although the view in (iii) is not a possibility given the assumptions here, in order
to have a review of all three views and to show that there is in fact no need for both
quantity and quality in the Persian system, this perspective will be discussed too.
Having discussed the ambiguities and uncertainties about the active phonological feature
in the Modern Persian vowel inventory based on the literature, I examine a process found
in Modern Persian that distinguishes the hypotheses, namely vowel harmony. I argue that
it plays a decisive role providing strong evidence for a quality-based analysis (section
2.3).
I start my discussion with arguments presented in the literature for quality.
2.2.1. Arguments in favor of quality
It is a widespread claim, as noted earlier, that the Persian vowel system is quality based
(e.g., Samareh 1977, Pisowicz 1985, Zomorrodian 1999). The system, thus, is
represented as in (8), repeated as (12).
(12) [front] [back]
i u [high]
e o [mid]
a ɑ [low]
The arguments in the literature for quality involve phonetic measurements and stress. I
first discuss phonetics of the vowels in section 2.2.1.1. Stress will be discussed in section
2.2.1.2.
2.2.1.1. Phonetics of vowels
The literature on the Modern Persian vowel system suggests that the common idea that
the system is no longer quantity based started with phonetic studies on the vowels.
Acoustic experiments carried out around the middle of the 20th century play an important
role in considering quality, and not quantity, to be the main distinguishing feature in
the system. As discussed below, these phonetic studies argue for a qualitative system
based on the following observations: (i) the length distinction is neutralized in most
contexts and (ii) the distribution of length is contextual. Leaving aside the important
point that in order to recognize phonological contrasts in a system, one needs to look at
phonological activity in the language, these phonetic arguments are not conclusive,
because there is no agreement among the phonetic accounts in the literature on where
one sees a length difference in the vowels, as shown below.
As Windfuhr (1979) writes “the theoretically based short-long interpretation was
seemingly shaken by a phonetic experiment by Sokolova et al (1952); they demonstrated
that the length distinction is neutralized in most contexts” (p. 136). That is, the phonetic
measurements showed that sometimes the so-called long vowels are not longer than the
so-called short ones. Gaprindašvili and Giunašvili (1964), to which some of the literature
refers (e.g., Kramsky 1966, Pisowicz 1985), is one of these studies. According to them, the
so-called short vowels (a, e, o) can be phonetically longer than the long ones (ɑ, i, u); for
example (cited in Kramsky (1966)):6
(13)     ‘short’                         ‘long’
     a.  ɢam = 230 ms. ‘sorrow’          mɑh = 230 ms. ‘moon’
     b.  ɑxor = 230 ms. ‘manger’         ɑʃub = 190 ms. ‘riot’
Recall that /i/, /u/, /ɑ/ correspond to former /ī/ (or /ē/), /ū/ (or /ō/), /ā/ respectively, and
the vowels /e/, /o/, /a/ to former /i/, /u/, and /a/. The examples in (13) show that a in ɢam,
which corresponds to a former short vowel, has the same length as ɑ in mɑh, which
corresponds to a former long vowel, and that o in ɑxor, which corresponds to a former
short vowel is even longer than u in ɑʃub, which corresponds to a former long vowel.
6 A point should be made here about vowel-initial words in Persian. Whether a syllable can start with a vowel in Persian or there is a glottal stop preceding a vowel in initial position of a syllable is controversial in the Persian literature (e.g., see Samareh 1985, Meshkatod Dini 1999). I leave this topic aside in this work and do not use the glottal stop.
Kramsky (1939) and Kramsky (1966) clearly show the turning point in the analyses from
quantity to quality as the main feature for the system. The 1939 paper considers the
Persian vowel inventory to be quantity based. In the 1966 paper, published 27 years later,
he considered quantity to be secondary in the system.7 Kramsky (1966) says that the
length-based account given in his 1939 paper needs to be modified in view of the results
of the phonetic research. According to Kramsky (1966), “an exact picture of quantitative
conditions can be obtained by a phonological approach in a close connection with an
exact phonetic research. The latter has been realized quite recently by Gaprindašvili and
Giunašvili and the results published” (p. 217). He further says that, unlike Gaprindašvili
and Giunašvili, who decided quality to be the distinctive feature based on their acoustic
study, one cannot conclude that quantity is not contrastive solely based on these acoustic
results. What is important, he writes, are phonological criteria, coupled with phonetic
criteria. The phonological criteria he considers are correlation and stress. I will return to
stress in 2.2.1.2. In this section, I focus on correlation. Kramsky writes: “Correlation is,
according to Trubetzkoy, a sum of all correlative pairs which are characterized by the
same correlative mark. Correlative pair is formed by two phonemes which stand mutually
in a logically privative proportional one-dimensional opposition. Correlative mark is then
the phonological feature by the presence or absence of which a series of correlative pairs
is marked” (Kramsky 1966, p. 218). To argue for the position that in Persian the
correlation of quantity is absent, Kramsky refers to the work of Gaprindašvili and
Giunašvili, according to whose study the pairs /i/ - /i:/ and /u/ - /u:/ are not as
proportional as had been assumed based on a difference in their length (or a combination
of length and tension since /i/ and /u/ are realized as lower than their long counterparts in
articulation). Gaprindašvili and Giunašvili, as represented in Kramsky (1966), consider
two levels for high, high1 and high2, and a level of mid. According to them, both ī and ū
are high2; i is high1 while u is mid. So in fact, this criterion, while claimed to be
phonological, is evaluated based on phonetic facts.
7 Note that Kramsky (1966) considered quantity to be secondary in Modern Persian, and therefore gives the impression that quantity is phonologically absent from the system in his view. But the vowel system as he represents it seems to include both quality and quantity (at least on the surface). He writes: “when we consider quantity as secondary in Modern Persian, it does not mean that quantity plays an unimportant part in Persian.” (p. 220). In conclusion, Kramsky says that the system of vowel phonemes of Persian should be changed from the original three-phoneme system (i, u, a vs. ī, ū, ā) to a six-phoneme system. He then presents the following vowel system for Persian (p. 220):
ī ū e o a ā
Whether Kramsky means that secondary features can also have a role in phonology or whether he means that the system needs both is not clear. Whichever is the case is not an issue here. The point is that phonetic findings played an important role in introducing quality to studies on Persian vowels.
Other literature presents similar reasoning for the Modern Persian vowel system to be
quality based. For instance, according to Pisowicz (1985) “the traditional stand-point
referring to diachrony, orthography, and versification bids one to perceive /â/, /i/, and /u/
as long vowels contrasting short /a/, /e/, and /o/ respectively.” He continues, “the above-
mentioned traditional stand-point can be questioned on the basis of experimental data,
which do not confirm the length distinction in the articulation of, on the one hand /â/, /i/,
and /u/, and on the other /a/, /e/, and /o/” (p. 12).
Toosarvandani (2004) says that in Classical Persian the underlying contrast of quantity
was realized on the surface, distinguishing the corresponding short and long vowels,
which had identical quality; in Modern Persian, however, the quantity opposition is
realized only in certain limited environments on the surface. In most environments, the
length of a, e, and o (former short vowels) and the length of ɑ, i, and u (former long
vowels) match. Therefore, one cannot a priori consider a, e, and o to be underlyingly
short. In open, non-final, unstressed syllables, the former short vowels (present a, e, o)
are realized as short; elsewhere, however, they are lengthened. For instance, compare the
first vowels of the words in (14)-(16) with each other (the examples in (14)-(16) are from
Sokolova 1952, cited in Toosarvandani 2004).
A note about the examples in (14)-(16) is important. The present a, e, o are former short
vowels /a/, /i/, /u/ and present ɑ, i, u are former long vowels /ā/, /ī/ (or /ē/), /ū/ (or /ō/)
respectively. I put the vowels under study in each word within brackets and the
historical form to which each of these vowels is related in front of the word which
contains the vowel. Some of the words presented in the examples (14)-(16) are from
Arabic and did not exist in Middle Persian, thus the historical form of the vowels should
not be considered as the historical form of the vowel in that particular word. It is
included simply to show how the present vowels can be matched to historical vowels.
In (14), words with historical short vowels are given. Former short vowels are
phonetically short in open, unstressed syllables and long in closed unstressed syllables.
(14) a. *i s[e].dɑ ‘voice’ *i s[e:]f-tár ‘harder’
b. *u x[o].dɑ ‘God’ *u x[o:]ʃk.tár ‘dryer’
c. *a ɢ[a].bɑ ‘a kind of clothes’ *a ɡ[a:]rm.tár ‘warmer’
In (15a), the vowel e in the first word corresponds to a former short vowel and the vowel
i in the second word corresponds to a former long vowel. The same correlation applies to
o and u (15b) as well as to a and ɑ (15c).
(15) a. *i ʤ[e].dɑ:r ‘wall’ *ī b[i:].dɑ:d ‘oppression’
b. *u ʃ[o].dá:n ‘to become’ *ū b[u:].dá:n ‘to be’
c. *a b[a].dá:n ‘body’ *ā b[ɑ:].dé: ‘wine’
These forms show that in open, non-final, unstressed syllables, the former short vowels
(present a, e, o) and the former long vowels (present ɑ, i, u) contrast in quantity in
addition to quality. They also show that contextual quantity distinctions among the two
sets of vowels are observed in some environments.
In addition to the unstressed cases in (14) and (15), in vocative and imperative forms, in
open, non-final stressed syllables, a quantity distinction is observed. For instance,
compare the first vowels of the following words (see Sokolova 1952,8 Windfuhr 1979,
Toosarvandani 2004). The vowel o in the first word corresponds to a former short
vowel and the vowel u in the second word corresponds to a former long vowel. The
same correlation applies to e and i as well as to a and ɑ below.
(16) a. *u h[ó].sejn ‘Hosseyn!’ *ū h[ú:].ʃanɡ ‘Hushang!’
b. *i b[é]-deh ‘give!’ *ī b[í:]-adab ‘impolite!’
c. *a n[á]-kon ‘don’t!’ *ā l[ɑ:]-maz(h)ab ‘infidel!’
8 Cited in Windfuhr (1979) and Toosarvandani (2004).
The former short vowels, therefore, are realized as short only in certain contexts.
Elsewhere, there is no quantity distinction between them and the former long vowels,
which always retain their long duration.
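The contextual pattern summarized above can be restated as a small decision procedure. The following Python sketch is only an illustration of that generalization (Sokolova 1952, as cited in Toosarvandani 2004); the function name, the parameter names, and the boolean encoding of the contexts are assumptions made for this example and are not part of any cited analysis.

```python
# Illustrative sketch only: encodes the contextual length pattern summarized
# above. All names here are hypothetical, chosen for this example.

FORMER_SHORT = {"a", "e", "o"}   # from earlier /a/, /i/, /u/
FORMER_LONG = {"ɑ", "i", "u"}    # from earlier /ā/, /ī/ (or /ē/), /ū/ (or /ō/)


def surface_length(vowel, open_syllable, final, stressed,
                   vocative_or_imperative=False):
    """Return 'short' or 'long' for a vowel in the described context."""
    if vowel in FORMER_LONG:
        # Former long vowels are reported to keep their long duration.
        return "long"
    if vowel not in FORMER_SHORT:
        raise ValueError(f"not a Modern Persian vowel symbol: {vowel!r}")
    if open_syllable and not final:
        # Unstressed: short, as in (14) and (15).
        # Stressed: short only in vocative and imperative forms, as in (16).
        if not stressed or vocative_or_imperative:
            return "short"
    return "long"


# s[e].dɑ 'voice': open, non-final, unstressed
print(surface_length("e", open_syllable=True, final=False, stressed=False))
# s[e:]f-tár 'harder': closed, unstressed
print(surface_length("e", open_syllable=False, final=False, stressed=False))
# h[ó].sejn 'Hosseyn!': open, non-final, stressed, vocative
print(surface_length("o", True, False, True, vocative_or_imperative=True))
```

The sketch covers only this one account of where the length distinction surfaces; it makes no claim about the phonological status of length.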
Another study, however, claims that in a pair such as the one in
(17a), â (my ɑ) in bâr, which corresponds to a former long vowel (/ā/), is longer than a in
bar, which corresponds to a former short vowel (/a/) (Samareh 1977, p. 92). The same
applies to i and e in (17b) and to u and o in (17c).9
(17) a. bâr ‘load’             b. sir ‘full’        c. dur ‘far’
        bar ‘over, fruit’         ser ‘secret’         dor ‘jewel’
Based on this study, then, in closed syllables a length distinction exists.
These accounts contradict each other with one saying that in open, non-final, unstressed
syllables, and in vocative and imperative forms in open, non-final stressed syllables a
quantity distinction is observed, and the other saying that in closed syllables a quantity
distinction is observed. Thus one cannot draw any conclusion from these studies about
the Modern Persian vowel inventory.
Regardless of their contradictory results, what these studies show is that phonetic
measurements have received great attention in arguing for quality. The observation that
the quantity distinction changes from context to context and is, in fact, observed only in
limited environments (although there is no agreement on what environments), while the
quality difference is always observed, can explain why quality is widely considered to be
the active feature in the Modern Persian vowel system.
9 I should mention that the words ser ‘secret’ and dor ‘jewel’ are both considered in Persian to have a final geminated consonant. Gemination will be discussed in 5.5.
Another phonetic observation that has been used to argue in favor of quality as the
contrastive feature distinguishing Persian vowels is that the length distribution is
contextual, as noted before. The literature suggests that the length of vowels changes
based on the structure they occur in (Samareh 1985). In (18a), the vowel e in ʧeʃm,
which corresponds to a former short vowel, has the same length as the vowel i in sib,
which corresponds to a former long vowel. In (18b), the vowel a in dard, which
corresponds to a former short vowel, is even a little longer than the vowel ɑ in ɡɑz, which
corresponds to a former long vowel.
(18)     ‘short’                         ‘long’
     a.  ʧeʃm = 0.17 sec. ‘eye’          sib = 0.17 sec. ‘apple’
     b.  dard = 0.24 sec. ‘pain’         ɡɑz = 0.23 sec. ‘bite’
It is suggested that former short and long vowels both are long before a consonant and
very long before a consonant cluster (Samareh 1985). As shown in (19), the vowel u
which corresponds to a former long vowel and the vowel a which corresponds to a former
short vowel both are long (based on Samareh, shown below by “v.”) before a consonant
and very long (based on Samareh, shown below by “v:”) before a consonant cluster.
(19) a. ɡu ɡu.ʃ ɡu:ʃt ‘the present stem of ɡoftan ‘to tell’’, ‘ear’, ‘meat’
b. na na.r na:rm ‘no’, ‘male’, ‘soft’
If length distributions are purely contextual and predictable, then it might be argued that
length is not a phonological property.
Nevertheless, I conclude that these phonetic results do not by themselves provide
sufficient evidence for quality being phonologically distinctive: they show that length
does not derive straightforwardly from Middle Persian and that the length distinction is
observed only in some particular contexts. A thorough phonetic study on Persian vowels
needs to be done to show what phonetics reveals about the vowels. Moreover, the
phonetic results need to be supported by phonological evidence. It is the latter that I
pursue in this work.
2.2.1.2. Stress
Stress is another criterion to which some literature refers in considering the Modern
Persian vowel system to be qualitative (e.g., Pisowicz 1985). In order to understand the
argument, I begin with a discussion of Persian stress. In Modern Persian, stress is
accounted for without reference to vowel quantity (e.g., Amini 1997, Kahnemuyipour
2003) – although that does not necessarily mean that quantity is not contrastive, as
discussed below; this is why stress does not provide conclusive evidence about the issue
at hand. In Modern Persian, the final syllable in a word receives stress, and in prefixed
verbs the prefix carries stress. To explain the behavior of prefixes in this regard,
Kahnemuyipour (2003) considers prefixes to be phonological words and makes a
distinction between word- and phrase-level stress rules: the word-level stress rule assigns
stress from the right edge while the phrase-level stress rule assigns stress from the left,
stressing the initial word in a phonological phrase (see Kahnemuyipour 2003 for
discussion).
Kramsky (1966) considers Trubetzkoy’s typology (1939) for stress, which is based on
syllable peak and quantity, and concludes that Persian cannot be included in this
typology. Kramsky says (in addition to correlation, as discussed in 2.2.1.1), “there is
another relevant circumstance that is connected with the problem of quantity of vowel
phonemes. It is word stress” (Kramsky 1966, p. 219). In order to understand its
relevance, I briefly elaborate on Trubetzkoy’s typology. Trubetzkoy categorizes
languages based on their relation to the formation of syllable peaks and quantity. Both of
these phenomena, according to him, can be “free” (distinctive, contrastive) and “bound”
(fixed, predictable). Thus there are four types of languages:
(i) Bound stress and bound quantity: all words with the same number of syllables show
the same distribution of stress and quantity.
(ii) Word differentiating stress and bound quantity: words are distinguished only by
stress. The stressed syllables are longer than the unstressed ones or the duration of
syllables is automatically determined by other factors.
(iii) Word differentiating quantity and bound stress: words are differentiated only by
syllable quantity. The place of stress is automatically determined. There are two
subclasses:
(a) Languages in which all words with the same number of syllables put stress on
the same syllable (e.g., languages with initial or final stress).
(b) Languages in which stress is dependent not only on the word boundary but
also on the quantity of the last or first syllable and therefore all words of the same
number of syllables and of the same distribution of quantity put the stress on the same
syllable.
(iv) Word differentiating formation of the peak and word differentiating quantity: words
are differentiated by both stress and quantity. There is a limitation though: open stressed
final syllables are always long, that is, the opposition of quantity in this position is
neutralized.
According to Kramsky, categorizing Persian in one of these four types is impossible. He
claims that Modern Persian has to be considered to form a transition type between types
(iii) and (iv). This leads us, Kramsky says, to re-examine quantity in Modern Persian, and
if this re-examination shows that quantity is not a relevant feature in Modern Persian,
then the language must be excluded from Trubetzkoy’s prosodic typology. The important
point for our discussion is that if one considers stress to be a determining factor with
respect to quantity (i.e., if one considers the place of stress to be related to vowel length),
then Persian cannot be easily classified. Thus, stress is not an appropriate diagnostic to
conclude that the Persian vowel system is quantity based.
Pisowicz, who considers the system to be quality based, also considers stress as an
argument in favor of quality and against quantity. Pisowicz says that “in a stressed
position, which is particularly reliable in the matter of quantity, the length of all vowels is
more or less identical and comparatively small” (p.12).
It is true that a frequent diagnostic for syllable weight is stress pattern. In many
languages, stress is one of the phenomena which treat some types of syllables as heavy
and others as light (e.g., Allen 1973). For example, stress in Yana (also Yanan, an extinct
language isolate of North America) falls on the leftmost syllable which is closed or
contains a long vowel (Sapir and Swadesh 1960 cited in Gordon 2002, 2004). In words
without closed syllables and long vowels, the language places stress on the first syllable.
That is, in Yana, stress treats closed syllables (CVC) and syllables ending in long vowels
(CVV) as heavy and open syllables ending in short vowels (CV) as light. Sample
representations of weight in Yana in Hayes’s moraic theory (1989) are as follows
(Gordon 2004), abstracting away from the possibility of onset clusters:
(20)      σ            σ            σ
         μ μ          μ μ           μ
        t a:         t a t         t a
        /ta:/        /tat/         /ta/
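The Yana generalization described above can be sketched as a short procedure. This is a minimal Python illustration, assuming syllables are encoded as the shape strings "CV", "CVC", and "CVV" (abstracting away from onset clusters, as in the text); the function names and the encoding are hypothetical and serve only this example.

```python
# Illustrative sketch only: the Yana stress generalization as described above.
# The syllable encoding (shape strings over C and V) is an assumption.

def is_heavy(syllable_shape):
    """CVV (long vowel) and CVC (closed) count as heavy; CV counts as light."""
    return syllable_shape in {"CVV", "CVC"}


def yana_stress_index(syllable_shapes):
    """Return the 0-based index of the stressed syllable."""
    if not syllable_shapes:
        raise ValueError("a word needs at least one syllable")
    for index, shape in enumerate(syllable_shapes):
        if is_heavy(shape):
            return index      # leftmost heavy syllable
    return 0                  # no heavy syllable: stress the first syllable


print(yana_stress_index(["CV", "CVC", "CVV"]))  # 1
print(yana_stress_index(["CV", "CV", "CV"]))    # 0
```

A stress rule of this kind needs access to syllable weight, which is what makes it quantity sensitive.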
As a second example of the attraction of stress to quantity, in Yidiny (a nearly extinct
Australian aboriginal language), a complex interaction of vowel length, syllable count
and stress is observed: syllables alternate between stressed and unstressed, all long
vowels occur in stressed syllables, long vowels must always be separated by an odd
number of syllables, and all words with an odd number of syllables must have a long
vowel in at least one even-numbered syllable (Evans 1995).
Thus, stress and syllable weight interact in many languages. The fact that stress in
Modern Persian does not refer to quantity, however, cannot be taken as an argument for
the system to be qualitative since there are languages with distinctive quantity whose
stress systems do not refer to quantity. According to Hayes (1995), although there is a
tendency in languages with quantity distinctions to have quantity-sensitive stress, this is
not a requirement. Some examples of languages with phonemic quantity in which stress
patterns independently from quantity are given in (21) (taken from Hayes 1995; see also
Fitzgerald (forthcoming)).
(21) language                     place of stress
     Livonian (Baltic-Finnic)     initial syllable
     Mansi (Finno-Ugric)          initial syllable
     Dalabon (Gunwinyguan)        initial syllable
     Piro (Arawakan)              penultimate syllable
     Djingili (Australian)        penultimate syllable
The existence of languages with distinctive quantity but with quantity-insensitive stress
systems indicates that the existence of a quantity-insensitive stress system in Modern
Persian cannot be an argument that quality is the active feature in its vowel system.
So far, I have reviewed the evidence that is presented in the literature for the Persian
vowel system to be qualitative –evidence based on phonetic length and evidence based on
stress– and I have argued that none of this evidence allows one to conclude that quality is
the primary feature in the system, as generally assumed in the literature. I now turn to
arguments in the literature for quantity as the distinctive feature in the system. Here too I
will conclude that these arguments do not provide strong evidence for what they claim.
2.2.2. Arguments in favor of quantity
In addition to the widespread idea that quality is the active feature in the Persian vowel
system, there are studies which consider quantity to be the dimension of contrast in the
system (Hayes 1979, Windfuhr 1979). Considering quantity to be active in the vowel
system of Persian, the system is represented as in (9), repeated here as (22).
(22)    short    long         short    long
        i        ī            u        ū
        [e]      [i]          [o]      [u]
             short    a    ā    long
                     [a]  [ɑ]
The arguments in the literature in favor of quantity are versification and categorization of
vowels. First I examine versification (2.2.2.1), followed by a discussion on categorization
of vowels (2.2.2.2).
2.2.2.1. Versification
In this section, I examine Persian versification, which is referred to in the literature on the
Modern Persian vowels, and in particular is counted as evidence for Modern Persian to be
quantitative. It is, therefore, important to discuss it here. I start from versification in
Modern Persian. Although the main focus of this thesis is a synchronic study of the
Persian vowels, I will examine Middle Persian poetry as well, since the quantitative
system of Modern Persian poetry may be considered a continuation of Middle Persian
poetry.
Before starting the discussion, it should be pointed out that there is no agreement on the
versification of Middle Persian, Persian folk poetry, and how Arabic meters have
influenced Modern Persian poetry, as we will see below. Therefore, there are open
questions and unclear points in what is presented below, which is basically a literature
review. In particular, it is not always clear what the evidence is for a claim that a study
has made about Middle Persian poetry or folk poetry and so on. My goal is not to
investigate Middle Persian or Modern Persian poetry or folk poetry or to present an
analysis of these. The point, which I will get at (as seen in 2.2.2.1.4, the conclusion of
this section) and which is important for the purpose of this research, is that we cannot
argue for quantity in Modern Persian based on versification considering what the
literature on Middle Persian and Modern Persian versifications offers. I start the
discussion with Modern Persian.
There are different views of versification in Modern Persian. Samareh (1977) remarks
that “in Persian prosody, which is entirely based upon the syllable, every meter has a
fixed number of syllables, and the number of syllables is the same, from the rhythmic
point of view, in the two hemistiches of a distich. The number of syllables is determined
by counting the vowels, no matter how many consonants they may have around them”10
(p. 75). Hayes (1979) points out that “although various scholars have attempted to assign
a role to stress in Persian verse (Rypka 1944; Natel Khanlari 1958), none of these
theories has been documented well enough to receive general support (cf. Elwell-Sutton
1976), and it will be assumed here that Persian verse is purely quantitative” (p. 195).
10 Samareh refers to Natel Khanlari (1958) in saying that Persian prosody is based on the syllable and that every meter has a fixed number of syllables.
Let us consider the view which takes versification in Modern Persian to make reference
to quantity. The vowels /ɑ/, /i/, and /u/ are considered as long and the vowels /a/, /e/ and
/o/ as short. The pattern of a Persian poem is expressed as a sequence of macrons ( _ ) for
a long metric position and breves ( ‿ ) for a short metric position. Phonologically, long
vowels are considered as geminates; that is, V is represented as VV. So in assigning
syllable types, CV is considered as a short syllable, represented by a breve, while CVV is
considered as a long syllable, written with a macron or two breves (e.g., see Hayes 1979).
Based on versification, Hayes (1979) considers the Persian vowels as “either short (i, u,
a) or long (ī, ū, ā)” and he adds that “short i and u are phonetically e and o” (p. 195).
This view that i, u, ɑ are long and e, o, a, are short in assigning metric positions is also
found in other studies on Persian versification (Shahri 1991, Mahyar 1994). The question
here is that if Modern Persian is not a quantity-based system, why is a quantitative
system used even for the poetry written in the recent century? That is, why does
versification represent a quantity-based system while it is generally thought that Modern
Persian is a quality-based system?
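The quantitative view of scansion described above can be made concrete with a small sketch. The Python example below is an illustration only: it handles open syllables (CV versus CVV), the syllable types mentioned in the description, and classifies them by vowel alone. The function and constant names are hypothetical, and the treatment of other syllable types is deliberately left out.

```python
# Illustrative sketch only: vowel-based scansion under the quantitative view.
# Only open syllables (CV vs. CVV) are handled, since those are the syllable
# types described above; the function and constant names are hypothetical.

LONG_VOWELS = {"ɑ", "i", "u"}    # treated as VV
SHORT_VOWELS = {"a", "e", "o"}   # treated as V

MACRON = "_"   # long metric position
BREVE = "‿"    # short metric position


def scan_open_syllable(syllable):
    """Map an open syllable ending in a vowel to a macron or a breve."""
    vowel = syllable[-1]
    if vowel in LONG_VOWELS:
        return MACRON
    if vowel in SHORT_VOWELS:
        return BREVE
    raise ValueError(f"not an open syllable: {syllable!r}")


def scan(syllables):
    """Scan a sequence of open syllables into a metric pattern string."""
    return " ".join(scan_open_syllable(s) for s in syllables)


# xo.dɑ 'God', syllabified as in (14b)
print(scan(["xo", "dɑ"]))
```

Under this view the metric value of an open syllable follows directly from which of the two vowel sets its vowel belongs to, which is why versification has been read as evidence for a length contrast.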
It might be argued that versification is conservative, representing a continuation of the
Middle Persian system, a system which is expected to be purely quantitative since the
Middle Persian vowel system is considered to be a quantity-based system. However,
surprisingly, it is claimed that Middle Persian versification is not in fact based on
quantity (see Natel Khanlari 1966 for references). This is important for two reasons.
First, if Modern Persian versification is quantitative, it must not be a continuation of
Middle Persian versification if Middle Persian versification is not quantity based. Second,
if Middle Persian versification is not reliant on quantity, as is argued, as the active feature
in its vowel system, then either (i) the treatment of the Middle Persian vowel system as a
quantity-based system should be revisited if one considers versification as a valid
criterion for identifying the main contrastive feature in the vowel system of a language or
(ii) versification is not a valid criterion for recognizing the active feature of a vowel
system if the Middle Persian vowel system was indeed quantitative. Let us now take a
look at Middle Persian versification.
2.2.2.1.1. Middle Persian versification
Natel Khanlari (1966) reviews the literature on Middle Persian poetry and provides an
account. I first summarize Natel Khanlari’s review of the literature (Nyberg 1929,
Benveniste 1930, 1932, Jackson 1932 are among the references cited in Natel Khanlari
1966), and then review his perspective.
According to the work on Middle Persian poetry (not many examples of Middle Persian
poetry have been found), versification is based on the number of syllables and not on
quantity. It is very important to note that folk poems of the present time in some parts of
Iran are argued to follow the same versification pattern. That is, they are not based on
quantity but rather on the number of syllables.
The Modern Persian versification rules that are referred to in arguments for quantity are
from Arabic and all the terminology used for versification also comes from Arabic.
Arabic meter is quantity based and thus the meters of Modern Persian, like those of
Arabic, are considered to be based on quantity. It is interesting, however, that among the
common meters in Arabic poetry (such as tavil, kamel, moteqareb, etc.) only one of them,
moteqareb meter ( _ _ ‿ | _ _ ‿ | _ _ ‿ | _ _ ‿ ), is common in Persian. The most common
meters in Persian poetry (ramal, hazaj, khafif, etc.) are less common in Arabic, Natel
Khanlari writes (citing Christensen 1936). The creativity that Iranians had with respect to
poetry was to make some adjustments between the syllabic principles of Persian and
quantitative Arabic meters. The oldest and the most complete example of this adjustment
is moteqareb meter. According to one study (Tavadia 1950, cited in Natel Khanlari
1966), many poems found in ‘Deraxt-e Asurik’, a book written in Middle Persian, are
parsed in a way very similar to moteqareb meter.
Natel Khanlari believes that neither quantity nor the number of syllables was the basis of
Middle Persian poetry, but that stress had a role in meter. The approximate equality in the
number of syllables in a line was important, with some minor differences in the number
of syllables acceptable. But the exact number of syllables was not the determining factor
for meter.11
11 There are four types of meters: (i) purely quantitative, (ii) purely syllabic, (iii) purely tonic (stress-based), and (iv) syllabo-tonic (both stress and the number of syllables matter) (see, for instance, Preminger et al 1996). According to Hanson and Kiparsky (1996), although many stress-based meters have fixed syllable counts, there are some in which the number of syllables varies according to constraints on syllable weight (see the reference for discussion). If in Middle Persian, stress and the number of syllables were both important in meters, we can categorize the Middle Persian poetry as syllabo-tonic.
Natel Khanlari says that there is not much evidence to show how meters
were determined in Middle Persian as the principles of Middle Persian are not well
documented, and the pronunciation of words is not always clear; moreover, the
possibility of dialectal differences adds to the problems. Regarding unclear pronunciation
of words, I should point out that, in addition to the usual problems that one may
encounter in studying the former pronunciation of words (due to changes of sounds, etc.),
in Middle Persian short vowels, /a/, /i/, /u/ (Modern Persian /a/, /e/, /o/), were not
reflected in writing (unlike long vowels); therefore, there are uncertainties about the
pronunciation of words at that time with respect to these vowels, to which Natel Khanlari
also refers. For example, a word written as rōʃn can be rōʃn or rōʃan, and ʃkanʤ can be
ʃkanʤ or ʃikanʤ (the examples are from Henning 1950, cited in Natel Khanlari 1966).
How words are considered to have been pronounced in Middle Persian, especially
regarding vowels, can affect our understanding of metrics of that time period.
Natel Khanlari argues, nevertheless, that there are two pieces of evidence from which we
can gain insight into Middle Persian poetry. One is through comparison with other
languages from the same family. It is believed that in Old Persian and also in Avestan
(Old Iranian)12 meter was determined based on stress. The other is folk poems which do
not follow literary Persian meters. I discuss folk poetry next.
12 Avestan, an Old Iranian language, is the language used in Avesta, the Zoroastrian scripture.
2.2.2.1.2. Folk poetry
In this section, folk poetry is discussed. I focus on folk poetry as discussed in Natel
Khanlari (1966). There are two ways to study folk poetry. One is to look at the poems
remaining in the literature from centuries ago. This is difficult, however. The first
problem with these poems is that they do not show the pronunciation of the vowels /a/,
/e/, and /o/ because these vowels are not written in the Arabic script that is used for
Persian. Thus we are faced with the same problem we have with Middle Persian poems
and cannot be sure how words with these vowels were pronounced at the time they were
written. That is, it is not clear if a vowel is present in what is written as a consonant
cluster. The second problem in interpreting Middle Persian folk poems is that since folk
poems were not based on Persian formal meters (which are adopted from Arabic), they
were changed in many cases by scholars and poets who accused folk poets of making
mistakes in poetry. Although there are some untouched poems which we can study, the
best way to study these poems is to look at those folk poems which are still used. In a
study of current folk poems, Natel Khanlari notes that these poems follow meters which
are different from those of the Persian literary poems. One of the differences is that in
formal Persian poems which are based on quantity, those syllables which have /a/, /e/,
and /o/ are short and those with /ɑ/, /i/, and /u/ are long. In folk poems, however, stress is
more important. In folk poems, the quantity of syllables is not fixed: if stress falls
on a short syllable, it behaves like a long syllable with respect to phonetic quantity, and if
a long syllable does not carry stress, it is as short as a short syllable. That is why a
long syllable can be replaced by a short syllable, and vice versa, provided that the
placement of stress is appropriate. He thus suggests that the meter of folk poems is based
on both stress and quantity, but because the quantity of syllables is not fixed, stress is
the more prominent factor.
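The contrast just described can be made concrete with a minimal sketch. It assumes, as a simplification of the description above, that syllable length in formal meter is read directly off the vowel and that in folk meter it is read off stress alone. The function names are hypothetical and are not drawn from Natel Khanlari (1966).

```python
# Vowel sets as given in the text. Treating length as a property of the
# vowel alone is a simplifying assumption of this sketch.
SHORT_VOWELS = {"a", "e", "o"}
LONG_VOWELS = {"ɑ", "i", "u"}


def formal_length(vowel: str) -> str:
    """Length of a syllable in quantity-based (formal) scansion."""
    if vowel in LONG_VOWELS:
        return "long"
    if vowel in SHORT_VOWELS:
        return "short"
    raise ValueError(f"not a Persian vowel: {vowel!r}")


def folk_length(vowel: str, stressed: bool) -> str:
    """Length of a syllable in folk scansion, where stress overrides the vowel."""
    formal_length(vowel)  # called only to validate the vowel symbol
    return "long" if stressed else "short"


# A stressed /a/ and an unstressed /ɑ/ come out differently in the two systems.
for vowel, stressed in [("a", True), ("ɑ", False)]:
    print(vowel, formal_length(vowel), folk_length(vowel, stressed))
```

On this simplified view the two scansions disagree exactly where stress and vowel class diverge, which is the situation in which a long syllable can stand in for a short one and vice versa.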
Natel Khanlari gives an example of a Persian folk poem and its mismatch with Arabic
meter, as follows. Consider a couplet of a very popular folk poem, focusing on the part in
bold (Note: low vowels are shown as a and ɑ. ā and ɑ̄ are these vowels when they
receive a macron):
(23) dĭʃāb ke bɑrun umad jɑr-am lab-e bum umad
last night when rain came sweetheart-1st.sg.gen. edge-Ezafe13 roof came
“Last night when the rain came, my sweetheart came on the roof….” (literary)
“Last night when the rain came, my sweetheart was in the horizon…”
Natel Khanlari assigns this couplet the metric pattern in (24), where stress is shown by
.14
(24) _ _ ‿ | _ ‿ | _ ‿
     mad u run bɑ ke ʃab di
Note the given pattern should be considered from right to left according to Persian
orthography.
Natel Khanlari points out that stress placement is important; if stress is incorrect, the
poem will not read well. He considers various Arabic meters and examines this pattern
against those patterns and concludes that there is a mismatch between the pattern of this
poem and the Arabic ones.
For these poems two origins can be considered, as suggested by Natel Khanlari. Either
they represent a continuation of pre-Islamic poetry, or the form arose after Arab
dominance in Iran, when Iranians who did not deal with formal literature created these
meters in opposition to the Arabic meters adopted by the Persian poets. The latter is very
unlikely and the former is more plausible, so Natel Khanlari concludes that in Middle
Persian both stress and syllable quantity played a role in versification. He considers
quantity to play a role too because, as noted above, the meters require reference to short
and long syllables; since the length of a syllable is in turn determined by stress, both
stress and quantity are important according to him. Recall that other literature considers
syllable number to be the main criterion in Middle Persian versification (e.g., Nyberg
1929, Benveniste 1930, 1932, Jackson 1932; cited in Natel Khanlari 1966).
13 Ezafe literally means ‘addition’. It refers to an unstressed –e in Persian which links elements of certain phrases together (e.g., Ghomeshi 1996). For instance:
sib-e sorx          ketɑb-e Ali
apple-Ezafe red     book-Ezafe Ali
‘red apple’         ‘Ali’s book’
Abbreviations: sg=singular, gen=genitive.
14 It is not clear why stress is not always final in each word. It seems to fall on alternate syllables. Whether the stress pattern was different in the time the poem was made or the non-final stress is due to dialectal differences or due to the metric pattern needs investigation.
What is important for our discussion is that all of these accounts agree that quantity was
not the main element in versification and, more importantly, that the Arabic meters
adopted by Iranians did not suit the characteristics of Persian and were altered. According
to Deo and Kiparsky (to appear), “Persian created a hybrid metrical system by adopting a
subset of Arabic meters and modifying them to conform to Persian constraints, while also
retaining a class of indigenous meters that were analyzed within the Arabic system.
Further, Arabic meters which were not unobtrusively assimilable into Persian despite
modification dropped out of use.” Deo and Kiparsky consider the typical Arabic meters
to be weight sensitive, that is, the contrast between light and heavy syllables marks strong
and weak positions. In contrast, Persian meters tend to be mora-counting, that is, meters
are based on feet with a fixed number of moras and they distribute the constant total
weight of the feet in different ways among their syllables. According to them the
indigenous Persian system, to the extent that it can be reconstructed, seems to exhibit this
property (see Deo and Kiparsky (to appear) for discussion). Utas (1994; cited in Deo and
Kiparsky (to appear)) also considers Modern Persian meters to be a fusion of inherited
pre-Islamic Persian and adopted Arabic elements.
2.2.2.1.3. Adopting Arabic meter
According to Natel Khanlari, differences between Arabic and Persian poetry have been
noticed by Iranians from the time Arabic meters entered Persian poetry and from the
beginning Persian poets realized that they could not follow Arabic meters completely, so
they made changes in those meters (see Hanson and Kiparsky (1996) for a discussion of a
similar situation, how Finnish metrics shows the interplay of linguistic and cultural
pressures, and how borrowed Germanic meters were modified in Finnish). Natel Khanlari
furthermore suggests that new principles for versification should be created for Persian
poetry, disregarding Arabic versification. He points out that stress is one factor which
makes Persian incompatible with Arabic meters.
It is important to recognize that there is another fundamental difference between Arabic
and Persian vowels. Natel Khanlari says that in Arabic there is a quantity difference
between vowels while in Persian the vowels are distinguished based on quality. The
Arabic script, Natel Khanlari writes, misleads many researchers and so they consider
Arabic and Persian similar in this respect. For example, Kramsky (1939) mistakenly
considers there to be three long vowels in Persian with their three short counterparts.
Natel Khanlari refers to Trubetzkoy (1949), who posits six vowels for Persian which are
different in quality and for which quantity is not primary. Trubetzkoy says this system is
not compatible with Persian versification which is based on quantity (i.e., Persian
quantity-based versification is somewhat arbitrary and does not really reflect the native
phonology). Natel Khanlari remarks that it is not clear when quality started to be the
main distinguishing factor for the Persian vowel system. In Modern Persian quantity is
important only in poetry which still follows the traditional meters. Again this refers to the
mismatch between the Modern Persian poetry (quantity-based) and the Modern Persian
vowel system (quality-based).
Let us consider an example of the adjustments of Arabic meters in Persian (taken from
Natel Khanlari 1966). One of the Arabic meters is called Raʤaz and it is as follows (right
to left):
(25) _ ‿ _ _ | _ ‿ _ _ | _ ‿ _ _ | _ ‿ _ _
Most of the Persian poems which are based on this meter are divided into eight parts in
each line (while there are four parts in Arabic), and each part has a stress; the stress can
be on any syllable in that part. (26) shows an example of Raʤaz followed by its metric
pattern as adjusted for Persian; it should be considered from right to left (Note: low
vowels are shown as a and ɑ. ā and ɑ̄ are these vowels when they receive a macron):
(26) ēj sɑre bɑn ma nzḗl mắkon ʤṓz dār dĭjɑr-e jɑr-e man
O! cameleer house do not except in land-Ezafe sweetheart-Ezafe my
“O! Cameleer, do not stop in any land other than my sweetheart’s” (literary)
Some adjustments have been made in Arabic meters adopted by Persian. Instead of four
parts in the Arabic meter ((25)), there are eight parts in the adjusted version used in
Persian ((27)). Moreover, each of these eight parts has a stress but the position of stress
varies in the parts (from right to left: in the first, second, third, sixth, seventh, and eighth
parts stress is on the last syllable while in the fourth and fifth parts stress is on the first
syllable. In part 4, stress is on a light syllable while a heavy syllable is available. Also
note that Ezafe (see footnote 13) can be both short and long). In the Arabic meter ((25)),
however, there are four parts and in each stress is on the second syllable (from right to
left). The adjustments suggest that Arabic meters were not directly suitable for Persian
poetry. That is, Persian did not show compatibility with the quantity-based meters of
Arabic. According to Elwell-Sutton (1976; cited in Deo and Kiparsky (to appear)), Arabic
meters are ill suited for the Persian meters and the profound differences between the two
metrical systems show their independent origins.
2.2.2.1.4. Conclusion
In this section, I discussed versification, an argument in favor of quantity in the literature.
Versification must be taken into consideration because it either can or cannot be treated
as a phonological argument for the contrastive feature in a vowel inventory. If we argue
for quantity to be active in the modern system based on versification,
for which /a/, /e/ and /o/ are monomoraic and /ɑ/, /i/, and /u/ bimoraic, then since in
Middle Persian quantity was not used for versification, we can conclude that the Middle
Persian vowel system was not quantitative. If, however, we cannot consider versification
as the basis for an argument for quantity, then a type of evidence that is considered to
provide support for quantity in Modern Persian is no longer valid as evidence. What is
important for our discussion is that if versification in Modern Persian is not a
continuation of Middle Persian, according to most of the literature, then one cannot use
versification to argue for the system to be “still” the same as the one of the middle era.
And the fact that there were problems realized by the Persian poets from the beginning in
following Arabic meters for Persian is important in showing a difference between the two
languages even in the time that we assume the two languages were similar with respect to
their vowel systems (i.e. Middle Persian and Arabic are both thought to be quantity-based
systems).
2.2.2.2. Categorization of vowels based on phonotactics
Another argument that is introduced in the literature for the Persian vowel system being
quantitative involves categorization of vowels. A classification of a, e, o (former short
vowels) versus ɑ, i, u (former long vowels) is observed in the literature on Persian
vowels regardless of the position these studies take with respect to the active feature of
the system. The reason for this is that these two sets of vowels pattern differently in some
ways, as is discussed in this section.
According to the literature, phonotactic constraints in some cases require a categorization
of Persian vowels according to which a, e, and o are grouped together as opposed to ɑ, i, and u, which form a second group. In this section I start my discussion of phonotactics
with Samareh (1977), who proposes a quality-based system for Modern Persian and
whom other studies (e.g., Darzi 1991) refer to in taking quality to be phonologically
active in the system. According to Samareh, “phonologically speaking, vowel length in
Persian has little or no significance” (p. 92). He adds that “it is true that the historically
long vowels in some situations tend to be longer than the historically short ones”. For
instance, in each of the following pairs (Samareh 1977, p. 92; repeated from (17) for
convenience), the vowel in the first word is longer than the vowel in the second word
(e.g., /i/ in sir is longer than /e/ in ser). Recall that /i/, /u/, /â/ correspond to former /ī/ (or
/ē/), /ū/ (or /ō/), /ā/ respectively, and the vowels /e/, /o/, /a/ to former /i/, /u/, and /a/.15
(28) a. sir ‘full’       b. dur ‘far’       c. bâr ‘load’
        ser ‘secret’        dor ‘jewel’        bar ‘over’
“Yet it is not mainly the length that makes contrast in these situations but rather the
absolute difference in quality involved.” (Samareh 1977, p. 93).
In his discussion of consonant clusters and the vowels that occur preceding them in
monosyllables, Samareh identifies two functionally different groups of vowels: /e, a, o/,
and /i, â, u/ (recall that Samareh uses /â/ where I use /ɑ/) with respect to possible
following consonant clusters. Before going through what Samareh says in this regard, let
us consider the Persian consonant inventory, shown in (29).
15 Recall from footnote 9 that the words ser ‘secret’ and dor ‘jewel’ are both considered in Persian to have a final geminated consonant. Gemination will be discussed in 5.5.
A thorough discussion of Persian vowels, the possible following consonants, and the
distributional details will be presented in chapter 5. The point which should be focused
on here is that there are two groups of vowels, a, e, o vs. ɑ, i, u, with respect to following
consonants.
Dividing the vowels into /e, a, o/, and /i, â, u/, Samareh writes that the first group can
occur before all combinations of consonants as far as the first member of the cluster is
concerned. The only exception is /e/, which cannot occur before clusters starting with /x/.
The vowels of the second group have a very limited occurrence preceding consonant
clusters. They cannot occur before those clusters whose first consonant is /q, ʔ, ǰ, z, h, m/
– as seen, there are differences in the symbols Samareh uses and those which I used for
consonants in (29), which is not an issue for our discussion. Samareh adds /b, t, d, k, n, l,
r/ which occur following the second group of vowels in a few loan words (e.g., kâbl ‘cable’, dubl ‘double’, ritm ‘rhythm’) and only three Persian words (i.e. bâng ‘shout’,
dâng ‘share’, pârs ‘related to Persia, Persian’). Samareh continues that the vowels of the
second group /i, â, u/ can precede /s, f, x/ combinations — the second consonant must be
/t/ with a few exceptions. The consonant /š/ is permitted after /â, u/ but not after /i/. A
summary of these observations is provided in (30).
(30) a. / e, a, o/
No restriction on C1 in C1C2 (/x/ not after /e/)
b. /i, â, u/
*/q, ʔ, ǰ, z, h, m/ as C1 in C1C2
? /b, t, d, k, n, l, r/ as C1 in C1C2
√ /s, f, x/ as C1 followed by /t/ as C2 (with a few exceptions)
√ /š/ (but not after /i/)
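The summary in (30) can be restated as a small lookup, given here only as an illustrative sketch. The function name `cluster_status` and the labels it returns are hypothetical; the consonant sets are copied from (30) in Samareh’s notation, and any C1 that (30) does not mention is reported as "unspecified" rather than guessed.

```python
GROUP_A = {"e", "a", "o"}          # group (30a)
GROUP_B = {"i", "â", "u"}          # group (30b)
EXCLUDED_C1 = {"q", "ʔ", "ǰ", "z", "h", "m"}
MARGINAL_C1 = {"b", "t", "d", "k", "n", "l", "r"}   # loans and three native words
T_CLUSTER_C1 = {"s", "f", "x"}


def cluster_status(vowel: str, c1: str, c2: str) -> str:
    """Classify a V + C1C2 sequence in a monosyllable according to (30)."""
    if vowel in GROUP_A:
        # The only restriction reported: /e/ does not precede /x/-initial clusters.
        return "excluded" if (vowel == "e" and c1 == "x") else "allowed"
    if vowel in GROUP_B:
        if c1 in EXCLUDED_C1:
            return "excluded"
        if c1 in MARGINAL_C1:
            return "marginal"
        if c1 in T_CLUSTER_C1:
            # C2 must be /t/, with a few exceptions noted by Samareh.
            return "allowed" if c2 == "t" else "exceptional"
        if c1 == "š":
            return "excluded" if vowel == "i" else "allowed"
        return "unspecified"
    raise ValueError(f"unknown vowel: {vowel!r}")


print(cluster_status("â", "b", "l"))   # kâbl ‘cable’
print(cluster_status("a", "b", "x"))   # a group (30a) vowel before a cluster
```

The asymmetry between the two groups shows up directly: every branch that restricts C1 sits under the /i, â, u/ group.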
Samareh concludes that ‘the most interesting fact is that these two groups correspond
exactly to the traditional “short” and “long” vowels in Persian and although quantity is
not the basis for contrast, since the behaviors of “long” and “short” vowels are different
before consonant clusters “the traditional labels “long” and “short” may justifiably be
preserved for the two functionally different groups” (pp. 92 and 93).
Regarding the consonants that can follow /a, e, o/ and /ɑ, i, u/, Zolfaghari Serish and
Kambuziya (2005) note that in words with CVCC structure, the sonority sequencing
principle is met when the vowel is /ɑ, i, u/ (e.g., mɑst ‘yoghurt’, bist ‘twenty’, pust ‘skin’), but it need not be met when the vowel is /a, e, o/ (e.g., tabx ‘cooking’, zebr ‘rough’, sobh ‘morning’). They note that in monosyllabic words, the principle is met, for
instance, if the first consonant of the coda is /r, l, j, n/ and [w] or the second consonant of
the coda is /d, ʔ, ʤ, ʃ, k, ɡ, t/ (e.g., nanɡ ‘shame’, ɡerd ‘round’). Zolfaghari Serish and
Kambuziya say that with respect to the sonority sequencing principle two natural classes
of vowels are formed in Persian; these are /ɑ, i, u/ and /a, e, o/. They do not use any
feature for these two classes so it is not clear how they treat these vowels with respect to
their distinguishing characteristic. The point is that the categorization of former long
vowels versus former short vowels is observed with respect to the sonority principle. I
will return to the VC co-occurrence restrictions in chapter 5.
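The sonority observation can be checked mechanically on the examples cited above. The sketch below is only an illustration: the five-step sonority scale (stops and affricates < fricatives < nasals < liquids < glides), the requirement of a strict fall across the coda, and the function name are assumptions made here, not part of Zolfaghari Serish and Kambuziya’s (2005) account.

```python
# Hypothetical sonority scale; a higher rank means a more sonorous segment.
SONORITY = {}
for rank, segments in enumerate(
    ["ptkbdɡgqʔʤʧ", "fvszʃʒxh", "mn", "lr", "jw"], start=1
):
    for segment in segments:
        SONORITY[segment] = rank


def coda_obeys_ssp(word: str) -> bool:
    """True if sonority falls across the last two consonants of a CVCC word."""
    first, second = word[-2], word[-1]
    return SONORITY[first] > SONORITY[second]


# Examples from the text: /ɑ, i, u/ words first, then /a, e, o/ words.
for word in ["mɑst", "bist", "pust", "tabx", "zebr", "sobh", "nanɡ", "ɡerd"]:
    print(word, coda_obeys_ssp(word))
```

Under these assumptions the three /ɑ, i, u/ words all satisfy the principle, while the /a, e, o/ words are split between those that satisfy it and those that do not, in line with the claim that the principle "need not be met" for that group.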
Lazard (1992) points out that grammars usually classify Persian vowels as long and short.
According to him, this classification refers to an etymological difference: ɑ, i, u represent
former long vowels and a, e, o former short vowels. He further says that the distinction
in quantity is retained in poetry and is the basis of traditional versification. In common
pronunciation, however, he notes that quality is the primary feature. Lazard adds that
quantity, although less important than quality, still plays a role in the Modern Persian
system. “Different facts taken together lead to a grouping on the one hand of the vowels
â, i, u and on the other of the vowels a, e, o. Nevertheless, rather than of “long” and
“short”, it is better to speak of “stable” vowels and “unstable” vowels.” (p. 17). Lazard
calls â, i, u “stable” vowels because they have a relatively constant duration and are not
subject to change in quality,16 as opposed to a, e, o, which he calls “unstable” vowels
that are of variable duration and often show changes in quality. To clarify Lazard’s
statement: by “change in quality” Lazard refers to the harmony
processes in which e and o become i and u respectively (e.g., be-bin ~ bi-bin ‘see!’, kelid ~ kilid ‘key’, xorus ~ xurus ‘rooster’), and a becomes ɑ across laryngeals (e.g., bahɑr ~ bɑhɑr ‘spring’). These processes are discussed in section 2.3.
16 The exception is â (my ɑ) becoming u before nasal consonants (e.g., nɑn ‘bread’ → nun). A discussion of this process will be given in 3.8.
Windfuhr (1979) argues that the Modern Persian vowel system is still quantitative. He
considers i, u, â to be long vowels and e, o, a to be short vowels (Windfuhr 1979, p.
129). Presenting examples previously given in (14a) and (16) (see section 2.2.1.1),
Windfuhr says that a quantity distinction between stable and unstable vowels is retained
in open, non-final syllables, both stressed and unstressed. He continues that since a
quantity distinction exists, even if only in restricted contexts, the question as to whether
quantity or quality is the major distinctive feature between the two sets of vowels is
important. “It goes without saying that a rule which would attempt to unify e, o, a vs. i, u, â in qualitative terms would be very complex indeed, since there are no single or even
pairs of features that would accomplish such distinction” (Windfuhr 1979, p. 136). He
further notes: “Based on the phonetic evidence it would seem possible to return to length
as the major distinguishing feature” (Windfuhr 1979, p. 136). Windfuhr proposes a
lowering rule to change i and u to e and o respectively. To this then the lengthening rule
applies. This rule lengthens the three short “unstable” vowels in closed syllables17 and
word-finally, at least in mono- and bi-syllabic words. It partially neutralizes the
length distinction (taken from Windfuhr 1979, p. 136):
(31) [+vocalic, -conson., -length] → [+length] / ___ { CC, (C)# }
17 Recall that according to Samareh (1977), in closed syllables former short vowels are shorter than former long vowels.
Windfuhr continues that “if length is accepted as the primary distinctive feature of
vowels, then quality can be viewed as secondary; and if so, then it would be possible to
posit the short vowels as synchronically rather than diachronically underlying vowels
opposed to the long vowels; i.e., to posit the oppositions as underlying i: ī, u: ū, and a: ā. To these then, would apply a pre-surface rule which lowers short i, u to e, o and the rule
that raises final a to e (e.g., banda → bande ‘servant’)” (Windfuhr 1979, p. 137).
In this section, I discussed a categorization of vowels into ɑ, i, u versus a, e, o based
on phonotactics. Phonotactics motivates this grouping, but it does not show how the
grouping is related to quantity. That is, the categorization could be right without
being an indication of quantity.
2.2.2.3. Summary
In this section I discussed the evidence in the literature for quantity in the vowel system
of Persian. In 2.2.2.1, I discussed versification. Versification, which treats ɑ, i, u as long
and a, e, o as short, is a piece of evidence used in the literature for quantity. Another
piece of evidence for quantity used in the literature is phonotactics, which was examined
in 2.2.2.2. Phonotactics suggests a grouping of vowels into ɑ, i, u versus a, e, o.
From what was reviewed above, it is apparent that some categorization between the
historical short vowels (present a, e, o) and the historical long vowels (present ɑ, i, u), is
still assumed in much of the literature on Persian vowels. I used “some” because there is
no unanimous interpretation of their status in the literature. For instance, Samareh
believes quality to be the active feature in the modern system and uses the “traditional
labels” because they are “functionally” useful. Lazard proposes a less important role for
quantity in Modern Persian compared to quality, which plays the primary role, and that is
why in his view it is better to speak of “stable” and “unstable” vowels rather than “long”
and “short”. Toosarvandani (2004) also adopts Lazard’s terms. Toosarvandani points out
that since the former short vowels are realized as short only in limited environments and
have the same phonetic length as former long vowels in most environments, the term
“short” is confusing when used to refer to the Modern Persian vowels a, e, and o.
Toosarvandani continues that for descriptive purposes we have to recognize that a, e, and
o behave as a group due to their variable length. He further adds that, “therefore, until
such a point as their true status has been revealed, I will refer to these vowels as
“unstable” following Lazard. Conversely, ɑ, i, and u whose durations remain the same in
all environments, I will refer to simply as “stable” vowels” (p. 241). As previously noted,
Windfuhr and Hayes, considering quantity to be still the active feature in the Persian
vowel system, divide the vowels into “short” and “long”.
I should add that the orthography also divides Persian vowels into two categories. In the
Perso-Arabic script, the vowels a, e, o are not shown in writing. They are shown by
diacritics for new learners or sometimes for clarification when a word which could be
read in different ways is written out of context. The vowels ɑ, i, u, however, are shown
by three letters.
The categorization of the vowels as discussed in the literature, based on versification and
phonotactic evidence, is not an argument for quantity. Persian formal meter, as discussed
in 2.2.2.1, is adapted from Arabic meter, with changes that make it compatible with Persian.
Persian folk poetry does not follow Arabic meter and is not quantity based. Having
discussed Modern Persian and Middle Persian meter as well as formal and folk poetry, I
showed that versification is not a strong argument for quantity.
The categorization of vowels based on phonotactics also does not offer a convincing
argument for quantity. As noted above, it shows a categorization of vowels into ɑ, i, u
versus a, e, o but it does not show how this categorization argues for quantity in the
system.
The other categorizations, such as “stable” and “unstable” or using the “traditional
labels” because they are “functionally” useful, which are offered by scholars who argue
for quality (Lazard 1992, Samareh 1977), show there is a grouping of ɑ, i, u versus a, e, o
but it is unclear how these should be interpreted in phonological terms to contribute to the
discussion of the active feature of the system.
So far, I have discussed quality-based and quantity-based analyses of the Persian vowel
system. Next, I discuss the synthetic analysis of the system.
2.2.3. The synthetic analysis
As previously discussed, in addition to the quality-based analysis and the quantity-based
analysis of the Persian vowel system, a synthetic analysis which considers both quality
and quantity to be present in the underlying vowel system of Persian is suggested in the
literature (Toosarvandani 2004).18 As also discussed in section 2.2, having both quality
and quantity present underlyingly in a system such as Persian is not tenable given the
assumptions of the theory adopted here. I will, however, present this analysis in order to
provide a thorough review of the literature on Persian vowels.
18 Toosarvandani (2004) considers Kramsky (1966) as the origin of the synthetic analysis for Persian (see footnote 7 for discussion on Kramsky 1966 in this respect).
Toosarvandani (2004) discusses the purely quantitative and qualitative analyses of the
Persian vowel system, and suggests that Persian “requires oppositions of both quantity
and quality in the underlying vowel system in order to describe the observed
distributional facts and alternations adequately. If one posits underlying quality alone, the
underlying system is attractively concrete but a reasonable explanation of surface vowel
length becomes impossible. On the other hand, positing a primary underlying opposition
of length alone yields an elegant, but abstract vowel system” (pp. 250-251).
Toosarvandani speculates that Classical Persian had a vowel system which was
quantitative both underlyingly and on the surface. By Modern Persian, surface historical
forces had changed the quality of the short vowels, forcing a reanalysis of the qualitative
features of the underlying representation to mirror the ones of the surface system.
Toosarvandani says that “While length distinctions are only realized minimally on the
surface, a number of phonological processes depend on the presence of a length contrast
in the underlying representation for their realization, namely short vowel lengthening and
short vowel height and backness assimilation.” (p. 251). By short vowel lengthening,
Toosarvandani refers to what acoustic studies show about the lengthening of a, e, o in
some environments, as reviewed in section 2.2.1.1; short vowel height and backness
assimilation refer to cases in which e, o, a become i, u, ɑ respectively (e.g., kelid ~ kilid ‘key’, xorus ~ xurus ‘rooster’, bahɑr ~ bɑhɑr ‘spring’) to which Lazard also refers.
These cases of change in quality will be discussed in 2.3.
According to the synthetic view, the purely quantitative analysis must be rejected for the
following reason. In the quantitative analysis of the system vowels are categorized into
two groups which are distinguished from each other by quantity even though on the
surface the length distinction is neutralized in most environments. The variable duration
of “unstable” vowels in Lazard’s terms, a, e, o, is derived by rule (Toosarvandani refers
to Windfuhr 1979). All short vowels become long in closed syllables and word finally;
elsewhere they are short. Toosarvandani continues that complications arise in accounting
for surface qualitative changes of a, e, o by the quantitative analysis. One needs two rules
to account for these changes: a lowering rule which applies to i and u to yield e and o,
and a fronting rule which applies to ɑ, giving a. Both rules lack conditioning
environments and therefore constitute an appeal to absolute neutralization. It would be
difficult, if possible at all, for learners to construct this abstract system. Toosarvandani
argues: “Though the “quantity only” analysis is able to generalize unstable vowel
lengthening as a group process, its qualitative opacity requires us to reject it” (p. 244).
The purely qualitative analysis of the system should also be rejected, according to
Toosarvandani, for the following reason. A division between a, e, o and also ɑ, i, u,
“unstable” and “stable” vowels, is not possible based solely on quality. Therefore, an
account based on quality cannot explain the unstable vowel lengthening process as a
group lengthening process.
Accordingly the synthetic analysis integrates quantity and quality, with both present in
the underlying representation of the system. This analysis, thus, considers two groups of
vowels, as follows (taken from Toosarvandani 2004, p. 245):
(32) i: u:
e o
a ɑ:
The feature specification of the system is as follows according to this account
(Toosarvandani 2004, p.245 - I include the values as they are in the reference):
(33)          e    o    a    i:   u:   ɑ:
     high                    +    +
     low                +              +
     back          +              +    +
In this account, the alternation observed in Persian is represented as given in (34)-(36)
(Toosarvandani 2004, pp. 247-248). I give examples from Toosarvandani (2004) for each
rule. (34) and (35) show that quality is needed in the system. (36) shows quantity is also
needed.
Recall that Toosarvandani says that a number of phonological processes need quantity to
be present underlyingly, namely short vowel lengthening, short vowel height and
backness assimilation. But the rules given for short vowel height and backness
assimilation, (34) and (35) respectively, do not involve a length contrast. Only the rule
for short vowel lengthening, (36), refers to a length contrast.
(34) a. Synthetic analysis height alternation rule (p. 247)
          V          V
        [-low]     [-low]
          =
        [-high]    [+high]
     b. Examples (pp. 246, 247)
        devi:st ~ divi:st ‘two hundred’
        fozul ~ fuzu:l ‘impertinent’
        be-bi:n ~ bi-bi:n ‘see!’
        be-gu: ~ bo-gu: / bu-gu: ‘say!’
(35) a. Synthetic analysis backness alternation rule (p. 248)
          V          V
        [+low]     [+low]
          =
        [-back]    [+back]
b. Example (p. 247, referring to Lazard 1957)
bahɑ:r ~ bɑhɑ:r ‘spring’
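The featural change in (34) and (35) can be illustrated with a minimal sketch built on the specifications in (33). Several things here are assumptions of the sketch rather than part of Toosarvandani’s (2004) account: blank cells in (33) are read as minus values, length is stored as a separate boolean, the function names are hypothetical, and only the change on the target vowel is modeled (the triggering vowel and the autosegmental spreading are left out).

```python
# (high, low, back, is_long) for the six underlying vowels of (32)/(33).
UNDERLYING = {
    "e":  (False, False, False, False),
    "o":  (False, False, True,  False),
    "a":  (False, True,  False, False),
    "i:": (True,  False, False, True),
    "u:": (True,  False, True,  True),
    "ɑ:": (False, True,  True,  True),
}

# Symbol for each (high, low, back) quality, independent of length.
QUALITY = {
    (True,  False, False): "i",
    (True,  False, True):  "u",
    (False, False, False): "e",
    (False, False, True):  "o",
    (False, True,  False): "a",
    (False, True,  True):  "ɑ",
}


def render(high: bool, low: bool, back: bool, is_long: bool) -> str:
    return QUALITY[(high, low, back)] + (":" if is_long else "")


def raise_vowel(vowel: str) -> str:
    """Height alternation as in (34): [-low, -high] becomes [+high]."""
    high, low, back, is_long = UNDERLYING[vowel]
    if not low and not high:
        high = True
    return render(high, low, back, is_long)


def back_vowel(vowel: str) -> str:
    """Backness alternation as in (35): [+low, -back] becomes [+back]."""
    high, low, back, is_long = UNDERLYING[vowel]
    if low and not back:
        back = True
    return render(high, low, back, is_long)


print(raise_vowel("e"), raise_vowel("o"), back_vowel("a"))
```

Neither function consults `is_long` when deciding whether to apply, which mirrors the observation that the rules in (34) and (35) do not themselves refer to a length contrast.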
(36) a. Synthetic analysis lengthening rule (p. 248). This rule lengthens the short vowels
everywhere except in non-final open syllables.
          X          X  X
               →               / { ___C]σ }, { ___]σ }
          V           V
     b. Examples (p. 242). All the examples given in Toosarvandani 2004 related to
this discussion were previously given in (14), (15), and (16). I repeat an example.
        xo.dɑ ‘God’     xo:ʃk.tár ‘dryer’
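The prose statement of (36), that short vowels lengthen everywhere except in non-final open syllables, can be sketched as follows. This is an illustration under stated assumptions: the input is already syllabified with "." as the boundary, stress marks are omitted, each vowel is a single character, and the function name is hypothetical. The sketch applies the rule to every syllable, so it also lengthens a short vowel in a final syllable, whereas the repeated example above marks length only on the first vowel.

```python
SHORT_VOWELS = set("aeo")
VOWELS = set("aeoɑiu")


def lengthen(word: str) -> str:
    """Lengthen a, e, o except in non-final open syllables."""
    syllables = word.split(".")
    result = []
    for index, syllable in enumerate(syllables):
        is_final = index == len(syllables) - 1
        is_open = syllable[-1] in VOWELS
        if is_open and not is_final:
            # Non-final open syllable: the short vowel stays short.
            result.append(syllable)
            continue
        result.append(
            "".join(ch + ":" if ch in SHORT_VOWELS else ch for ch in syllable)
        )
    return ".".join(result)


print(lengthen("xo.dɑ"))      # first syllable is open and non-final
print(lengthen("xoʃk.tar"))   # first syllable is closed
```

The vowel /o/ in xo.dɑ is left short because its syllable is open and non-final, while the same vowel in the closed first syllable of xoʃk.tar is lengthened.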
In this section, I examined the synthetic analysis of the system which considers both
quality and quantity to be underlying in the system. I will argue in chapter 3 that we can
account for various processes in the language with underlying quality alone and that, as
predicted by the framework of Modified Contrastive Specification, a system like that
of Persian does not require both quality and quantity contrastively.
2.2.4. Summary and discussion
The active feature of the Persian vowel system is an issue of controversy in the literature
on Persian vowels, as discussed in this chapter. Some studies consider quality (height) to
be phonologically distinctive in the system based mainly on phonetic measurements as
well as stress. Others consider the system to be phonologically quantitative and they take
versification and phonotactic patterning as arguments for this position. A synthetic
analysis which requires both phonological quality and quantity in the system has also
been proposed. This final view is not tenable given the assumptions of the Modified
Contrastive Specification framework based on which, considering the nature of the
Persian vowel system, if one distinguishes Persian vowels phonologically by quality,
there is no need for quantity to be active and vice versa. Thus the system can be either
qualitative or quantitative (see also (11) on mixed system).
In this section, I examined the arguments which are presented in the literature for quality
and quantity and argued that they do not provide strong support for the position they take.
I summarize here the reasons: (i) Phonetic measurements do not strongly argue for
phonological quality because they do not agree upon where the length difference is
observed or upon whether or not there is a correlation between phonetic length and
phonological length in Persian. (ii) Stress is not a conclusive argument for phonological
quality, as the stress pattern of a quantitative vowel system need not
refer to quantity. (iii) Versification does not argue for phonological quantity
since Persian meters are borrowed from Arabic and adjusted to the Persian system and
are not created based on Persian vowels. That is why, as discussed in the literature, only
Persian formal poetry, and not folk poetry, shows a quantitative system. (iv)
Categorization of vowels, which suggests a, e, o as a group versus ɑ, i, u as another
group, is not a convincing argument for phonological quantity either. The grouping is
found, as I will argue in chapter 3, but no real account in the literature is given for it.
Different studies consider different criteria as the basis of this categorization, and no
analysis provides a convincing account of the nature of this categorization. In the quality-
based studies, no feature is introduced to distinguish a, e, o versus ɑ, i, u, simply because
this two-way distinction is not easily explained based on height. That is why these studies
either consider the vowels to be “functionally” divided into these two groups – it remains
a question as to what “functionally” means in phonological terms – or they consider the
vowels to be “stable” and “unstable” – again it remains a question as to how these terms
can be transferred into phonological features. In the quantity-based studies, this
categorization is assumed to be necessarily based on quantity because, according to these
studies, no qualitative categorization of a, e, o versus ɑ, i, u is possible. However, it is
unclear why this grouping must point to quantity. I will argue in chapter 3 that the
grouping can be accounted for with phonological quality.
Following the view that for recognizing contrasts in a system, one should examine the
phonological processes of the language under study (see 1.2), a common shortcoming of
all of these arguments (i.e., arguments just discussed in (i)-(iv)) is that none of them
refers to an active phonological process of Persian based on which one can draw a
conclusion. In the next section, I will show that such a process exists in the phonology of
Persian. This process is vowel harmony, which strongly argues for phonological quality
in the system, as discussed below.
2.3. Vowel harmony and a featural analysis for the system
Quality as the phonologically active feature in Modern Persian is strongly supported by
the raising harmony observed in the language. Persian shows several patterns of vowel
harmony with a mid vowel raising to a high vowel. While all involve raising of mid
vowels to high, there are slight differences depending upon domain, and I thus organize
this section considering the domains in which vowel harmony takes place.
2.3.1. Harmony across a morpheme boundary
In this section, a harmony pattern in non-low vowels which occurs across a morpheme
boundary will be discussed. Persian has a number of verbal prefixes, as follows:19
19 In addition to these prefixes, the conjugated forms of the infinitive xɑstan, as an auxiliary verb, precede the past stem of a verb to mark the future tense in Persian; for example: xɑham raft ‘I will go’. This form of the future is mostly used in formal speech.
(37) be- (imperative and also subjunctive marker)
na- (negative marker)
mi- (indicative marker)
The prefix be- combines with the present stem of the verb to form the imperative. An
example is given in (38):
(38) Imperative: prefix /be/ + the present stem of the verb
e.g. xɑbidan ‘to sleep’
xɑb the present stem
be + xɑb → bexɑb ‘sleep!’
The prefix vowel /e/ may assimilate to the first vowel of the stem unless it is a low
vowel.20
The forms in (39) through (41) present a schematic representation of this
assimilation.
(39) e prefix……u stem → u prefix / o prefix (place and height assimilation)
a. be + ɡu → beɡu ~ buɡu/boɡu ‘say!’
b. be + xun → bexun ~ buxun/boxun ‘read!’
20 I used “may” because the assimilation of the vowel e in prefix to a high vowel in the stem seems to be influenced by sociolinguistic factors. The assimilation of the prefix vowel e to the vowel o in the stem (e.g., be + ro → boro ‘go!’; be + xor → boxor ‘eat!’) is more common; that is, the assimilated version (e.g., boro) is observed by far more than its non-assimilated counterpart (e.g., bero).
(40) e prefix……i stem → i prefix (height assimilation)
a. be + ɡir → beɡir ~ biɡir ‘get!’
b. be + ʃin → beʃin ~ biʃin ‘sit!’
(41) e prefix……o stem → o prefix (place assimilation)
a. be + ro → boro ‘go!’
b. be + xor → boxor ‘eat!’
Compare (39)-(41) with (42).
(42) e prefix……a/ɑ stem → e prefix (no assimilation)
a. be + xar → bexar ‘buy!’ *baxar
b. be + sɑz → besɑz ‘build!’ *bɑsɑz
The main point to be aware of is the raising of a mid vowel to a high vowel as in (39) and
(40). I include the other cases in order to present the overall picture of this case of
assimilation.21
The examples of assimilation in (39)-(41) all involve consonant-initial stems. With a
vowel-initial stem, a j is inserted between the vowel of the imperative marker and the
stem-initial vowel, and the e is pronounced as [i], as in (43). This can be considered
a case of raising triggered by the epenthetic j, which has the same place of articulation as i.
21 Note that the negative imperative form does not show assimilation. Consider na (negative marker) and see the following examples: na + ɡu → naɡu ‘don’t tell!’ (*noɡu/nuɡu), na + ɡir → naɡir ‘don’t get!’ (*niɡir), na + ro → naro ‘don’t go!’ (*noro), na + sɑz → nasɑz ‘don’t build!’ (*nɑsɑz).
(43) e prefix - j epenthetic - a, ɑ, o stem → i prefix
a. be + andɑz → bijandɑz ‘throw (sth)!’22
b. be + andiʃ → bijandiʃ ‘think!’
c. be + ɑ → bijɑ ‘come!’
d. be + ɑvar → bijɑvar ‘bring!’ (bijɑr in informal speech)
e. be + oft → bijoft ‘fall down!’
Across a morpheme boundary, then, a mid vowel raises to a high vowel before a
high vowel and before the high glide [j].
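The prefix patterns in (38)-(43) can be summarized procedurally. The following is only an illustrative sketch, not part of the analysis: the function names are hypothetical, stems are assumed to be non-empty strings in the transcription used here (one character per segment), and the treatment of stems whose first vowel is e is an assumption, since no such stems are discussed above. For o-stems the sketch mirrors (41) and returns only the assimilated form, although footnote 20 notes that the unassimilated form (e.g., bero) also occurs, far less often.

```python
# Vowel classes in the transcription used in this chapter.
VOWELS = set("aeiouɑ")


def first_vowel(stem):
    """Return the first vowel of a transcribed stem, or None if it has none."""
    for segment in stem:
        if segment in VOWELS:
            return segment
    return None


def imperative_variants(stem):
    """List the imperative forms given in (38)-(43) for be- + present stem."""
    if stem[0] in VOWELS:
        # Vowel-initial stem: j is inserted and the prefix vowel surfaces as i (43).
        return ["bij" + stem]
    vowel = first_vowel(stem)
    if vowel == "u":
        return ["be" + stem, "bu" + stem, "bo" + stem]  # (39)
    if vowel == "i":
        return ["be" + stem, "bi" + stem]  # (40)
    if vowel == "o":
        return ["bo" + stem]  # (41); the rarer form with be- is omitted here
    # (42): low-vowel stems show no assimilation. Stems with e are not
    # discussed in the text; leaving the prefix unchanged is an assumption.
    return ["be" + stem]
```

Because the assimilation is optional for stems with a high vowel, the function returns all reported variants rather than picking one.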
Next, I discuss what is generally considered to be height harmony within stems.
2.3.2. Harmony within stem
Persian shows two patterns of height harmony within the stem, as follows:
(44) o → u / — Cu
a. soɢut ~ suɢut ‘falling’
b. sotun ~ sutun ‘column’
c. xorus ~ xurus ‘rooster’
d. holu ~ hulu ‘peach’
e. koluʧe ~ kuluʧe ‘a type of cookie’
22 Note that bijandɑz is also pronounced as bendɑz in speech, which suggests the deletion of stem vowel in order to resolve vowel hiatus.
(45) e → i / — Ci
a. kelid ~ kilid ‘key’
b. sebil ~ sibil ‘mustache’
c. ɡelim ~ ɡilim ‘a kind of rug’
d. zeɡil ~ ziɡil ‘wart’
e. fetile ~ fitile ‘wick’
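The two rules in (44) and (45) can be stated as string rewrites. The sketch below is only illustrative and makes some assumptions: the function name is hypothetical, each segment is written with a single character, and any character that is not one of the six vowels counts as a consonant. Since the raising is optional, the function returns the raised variant wherever the context is met; it does not predict which variant a speaker will use.

```python
import re

# Any single character that is not one of the six vowels counts as a consonant.
CONSONANT = "[^aeiouɑ]"


def raise_mid_vowel(stem):
    """Return the raised variant of a stem: o becomes u before Cu, e becomes i before Ci."""
    stem = re.sub(f"o(?={CONSONANT}u)", "u", stem, count=1)
    stem = re.sub(f"e(?={CONSONANT}i)", "i", stem, count=1)
    return stem
```

A stem that does not contain either context, such as kuku or dero, is returned unchanged.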
Evidence for the underlying presence of /o/ and /e/ (and not of /u/ and /i/) in the above
forms (and so the occurrence of harmony and consequently raising of /o/ and /e/ to high
vowels) is as follows. First, the existence of words with CuCu and CiCi in the language
argues for /o/ and /e/ in the above examples. These words, some of which are given in
(46) and (47), are never pronounced as CoCu and CeCi even in very formal speech. In
their written forms they have the letters used for [i] and [u] which are pronounced in both
formal and colloquial speech.
(46) CuCu (C) Words
a. kuku ‘name of a food’
b. ʤuʤu ‘birdie’
c. susul ‘dandy’
d. tutun ‘tobacco’
e. kuʧulu ‘small’
(47) CiCi (C) Words
a. ɡiti ‘universe’
b. bini ‘nose’
c. sili ‘slap’
d. ʃirin ‘sweet’
e. kimijɑ ‘alchemy’
Thus surface C-highV-C-highV words have two sources: the vowels may be
phonologically high, as in (46) and (47), in which case vowel height is invariant, or the
first vowel may vary in its height between mid and high, as in (44) and (45). This varying
vowel is phonologically mid under this height-based analysis.
Second, the orthography of the language supports the presence of /o/ and /e/ in forms such
as those in (44) and (45). In Persian /a/, /e/, and /o/ are represented by diacritics (which
are not inserted in writing except in books for new learners, as noted in 2.2.2.3), and /ɑ/,
/i/, and /u/ by three letters of the alphabet. None of the words in (44) and (45) contain the
symbols used for /i/ and /u/ for their first vowel in their written form. It is only in speech
that [i] and [u] are pronounced. In (46) and (47), on the other hand, the vowels are both
represented by the vowel symbols.
Within-stem height harmony does not occur in other sequences of vowels in a stem (see
Appendix 1 for full data). However, the sequence eCo needs comment. Recall that in
prefix-stem vowel harmony in Persian imperatives discussed in section 2.3.1 the most
frequent pattern is the change of /e/ to [o] (see footnote 20). This place assimilation is not
as common within the stem. While there are cases such as ʤelo ~ ʤolo ‘front’ and ʧelo ~ ʧolo ‘steamed rice’ which might be taken as cases of /e/ to [o] assimilation within
stems, there are cases such as dero ‘harvest’, keʃo ‘drawer’ and senobar ‘black poplar’
which do not show any change in /e/ (*doro; *koʃo, *sonobar) even in informal speech.
Whether there is a pattern in this regard, what the underlying form is in the cases such as
ʤelo ~ ʤolo, and in general how CeCo words should be treated are questions for further
research. Comparing harmony within stems with harmony across morphemes shows that
the domain must be considered in discussing harmony patterns. In the prefix-stem
sequence, the change of /e/ to [o] is the most frequent pattern; that is, place harmony
is more frequent than height harmony. Within stems, by contrast, height harmony (i.e.,
the change of /e/ to [i] and of /o/ to [u]), and not place harmony, is more frequent.
The assimilation of e to u is not observed in stems either. The
main point for the purpose of this study is the occurrence of harmony among vowels.
2.3.3. Harmony in loan words
Patterns of vowel harmony are also observed in English or French words that have been
borrowed into Persian. Initial and medial consonant clusters are forbidden in Persian, and
an [e] is inserted to break up these clusters. In words starting with sC, the epenthetic [e] is
inserted at the beginning of the cluster. For example:
(48) English/French Persian
a. star → estɑr
b. small → esmɑl
c. steel → estil
d. ski → eski
e. stop → estop
In other cases, [e] is epenthesized between the two consonants, as in pelɑstik for
‘plastic’, pelɑn for ‘plan’ and kelɑs for ‘class’.
The epenthetic [e] optionally undergoes harmony when it is non-initial, assimilating in
height to a following non-low vowel across one consonant. Some examples are given in
(49)-(51). The raising of the epenthetic vowel to a following high vowel is not
mandatory, although it is frequent, as there are words with two pronunciations, one with a
raised epenthetic vowel and the other without a raised epenthetic vowel. I include the
unraised version, when it is possible.
(49) Assimilation of [e] to [i]
English/French Persian
a. freezer → firizer ~ ferizer
b. grease → ɡiris ~ ɡeris
c. cliché → kiliʃe ~ keliʃe
(50) Assimilation of [e] to [u]
English/French Persian
a. flu → fulu ~ felu
b. blu(-ray) → bulu ~ belu
c. cruise → kuruz
(51) Assimilation of [e] to [o]
English/French Persian
a. project → poroʒe ~ peroʒe
b. profile → porofɑjl ~ perofɑjl
c. blonde → bolond ~ belond
(52) Failure of assimilation of [e] to [a] and [ɑ]
English/French Persian
a. plan → pelan/pelɑn *palan/*pɑlɑn
b. class → kelɑs *kɑlɑs
c. traffic → terɑfik *tɑrɑfik
As in native words, in loan words too [e] changes its height when the potential trigger is a
high vowel.
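The loan word patterns in (48)-(52) combine two steps, cluster repair and optional harmony, which can be sketched as follows. This is a simplified illustration under stated assumptions: the function name is hypothetical, the input is an assumed segmental form of the borrowed word before epenthesis (e.g., frizer for ‘freezer’), and only word-initial clusters of two consonants are handled.

```python
VOWELS = set("aeiouɑ")
LOW = set("aɑ")


def adapt_loan(word):
    """Return the adapted form(s) of a loan word that may begin with a CC cluster.

    The form with plain epenthetic [e] comes first; a harmonized variant follows
    when the epenthetic vowel is non-initial and the next vowel is non-low.
    """
    if len(word) < 2 or word[0] in VOWELS or word[1] in VOWELS:
        return [word]  # no initial cluster to repair
    if word[0] == "s":
        return ["e" + word]  # (48): [e] precedes the sC cluster, no harmony
    plain = word[0] + "e" + word[1:]  # [e] between the two consonants
    following = word[2] if len(word) > 2 else ""
    if following in VOWELS and following not in LOW and following != "e":
        # (49)-(51): the epenthetic vowel may take on the quality of the next vowel.
        return [plain, word[0] + following + word[1:]]
    return [plain]  # (52): no assimilation before a low vowel
```

For instance, the assumed input frizer yields both ferizer and firizer, while klɑs yields only kelɑs.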
One might ask why I take [e] as the epenthetic vowel for loanwords in Persian, since we
see different realizations of epenthetic vowels in the examples in (49)-(51). There are
some reasons to consider [e] as the epenthetic vowel in loan words, putting aside its
similarity to the native assimilation process. In the cases in which no harmony is seen,
that is, when the cluster precedes a low vowel, [e] is always observed in the epenthetic vowel
position (see (52)). In addition, even in cases where harmony normally takes place,
sometimes the foreign word with an [e] as epenthetic vowel can interchangeably be used,
as in [firizer] and [ferizer] for freezer. Finally, the insertion of [e] at the beginning of sC
clusters as the strategy of cluster-breaking (see (48)) also suggests the nature of the
epenthetic vowel. In these words, there is no harmony, perhaps because of the existence
of two consonants between [e] and the next vowel, and [e], as a default epenthetic vowel,
always appears. Thus, in the absence of assimilation, the epenthetic vowel in Persian is
[e].
2.3.4. Preliminary analysis
In this section, I examined three patterns of vowel harmony in Persian, in all of which a
mid vowel is raised to a high vowel: raising across a morpheme (e.g., be + ɡir → beɡir ~
biɡir ‘get!’), raising within a stem (e.g., kelid ~ kilid ‘key’), and raising of an epenthetic
vowel in loan words (e.g., ferizer ~ firizer ‘freezer’).
Persian also has a raising process in which ɑ raises to u before nasal consonants, that is, ɑ → u/ −m, n (e.g., bɑdɑm ‘almond’ → bɑdum; bɑrɑn ‘rain’ → bɑrun), but this is not a
case of harmony. It is raising due to the nasal context. Section 3.8 presents a discussion
of pre-nasal raising.
Given the harmony patterns in Persian, the question is: what are the implications of the
occurrence of these patterns for the analysis of the Persian vowel inventory? That is, what
does harmony tell us with respect to the phonologically active feature of the system
considering the two possibilities: quality or quantity? Harmony is generally assumed to
be feature based (e.g., see van der Hulst and van de Weijer 1995, van Oostendorp 1995),
as suggested by the cross-linguistically attested harmony patterns (this will be discussed
in detail in 2.3.6). Thus if quality is considered as the basis for contrast in the system, the
changes due to harmony are easy to account for. Quantity, however, occupying two
positions, is not a feature and cannot spread. Thus a quantity-based analysis cannot
explain the changes occurring due to harmony. To sum up, harmony patterns strongly
support the presence of a feature in the system and the absence of quantity.
In the next section, I examine an additional harmony pattern, one found in Persian low
vowels.
2.3.5. Harmony in low vowels
In sections 2.3.1-2.3.3, it was seen that low vowels do not participate in the harmony
patterns as either triggers or targets in interaction with non-low vowels. There is,
however, an interaction between the two low vowels. In formal speech, the form before
assimilation is more frequent, while in informal speech the form with harmony is
common.
(53) a → ɑ / C−ʔɑ(C)
a → ɑ / C−hɑ(C)
a. bahɑ → bɑhɑ ‘price’
b. bahɑr → bɑhɑr ‘spring, also a name for girls’
c. ʤahɑn → ʤɑhɑn ‘world, also a name for boys’
d. ʃahɑb → ʃɑhɑb ‘meteor, also a name for boys’
e. mahɑl → mɑhɑl ‘impossible’
f. behbahɑn → behbɑhɑn ‘a city in Iran’
g. maʔɑʃ → mɑʔɑʃ ‘livelihood’
h. saʔɑdat → sɑʔɑdat ‘happiness’
As (53) illustrates, a assimilates to ɑ across a laryngeal consonant.
Evidence for the underlying presence of a in these words comes from various sources.
First, the orthography does not show ɑ for the first vowel in these words. Second, in
formal speech the first vowel of these words is a and not ɑ. Third, there are words such
as those in (54) which do not show a for the first vowel even in very formal speech, and
which include ɑ in written form for both vowels.
(54) a. morɑʔɑt ‘consideration’ *moraʔɑt
b. sɑʔɑt ‘hours’ *saʔɑt
c. farɑhɑn ‘a city in Iran’ *farahɑn
d. mɑhɑn ‘a city in Iran’ *mahɑn
The process only occurs across laryngeal consonants and does not occur across other
consonants, as the following examples show.
(55) a. tabɑr *tɑbɑr ‘lineage’
b. ɢatɑr *ɢɑtɑr ‘train’
c. davɑ *dɑvɑ ‘medication’
d. maʤɑl *mɑʤɑl ‘opportunity’
e. marɑm *mɑrɑm ‘doctrine’
f. salɑh *sɑlɑh ‘advisable’
g. xarɑb *xɑrɑb ‘ruined’
h. kamɑl *kɑmɑl ‘perfection’
The process does not occur in the ɑʔa or ɑha environment, as shown in (56).
(56) a. xɑhar *xahar/*xɑhɑr ‘sister’
b. rɑhat *rahat/*rɑhɑt ‘comfortable’
c. sɑhat *sahat/*sɑhɑt ‘realm’
d. etɑʔat *etaʔat/*etɑʔɑt ‘obedience’
Also, the process involves only low vowels, as the following examples show. That is, h
and ʔ are not transparent if they are not flanked on both sides by low vowels.23
(57) a. mahin *mihin ‘better’ (a high vowel follows h)
b. mihan *mahan ‘country’ (a high vowel precedes h)
c. sɑhel *sehel ‘shore’ (a mid vowel follows h)
23 An exception is sɑheb which can be pronounced as sɑhɑb ‘owner’ in informal speech.
d. kohan *kahan ‘ancient’ (a mid vowel precedes h)
e. moʔin *miʔin ‘helpful’ (ʔ intervenes between a mid and a high vowel)
f. mohit *mihit ‘surrounding’ (h intervenes between a mid and a high vowel)
Is this process a case of laryngeal transparency? Why would it occur only in the low
vowels? How can this phenomenon be related to the feature characteristics of h and ʔ, and also to those of low vowels? These are interesting questions which I leave aside for
now and return to in 3.7. The point relevant to our discussion at the moment is the
occurrence of harmony. Given the feature-based nature of harmony, a feature must be
involved in the assimilation of a to ɑ across laryngeals.
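The conditions on low vowel harmony established in (53)-(57) can be collected in a single context-sensitive rewrite. The sketch below is illustrative only: the names are hypothetical, one character is assumed per segment, and the lexical exception in footnote 23 is not modeled.

```python
import re

# a becomes ɑ when it follows a consonant and precedes a laryngeal (h or ʔ) plus ɑ.
LOW_HARMONY = re.compile("(?<=[^aeiouɑ])a(?=[hʔ]ɑ)")


def low_vowel_harmony(word):
    """Return the informal-speech variant of a word under the rule in (53)."""
    return LOW_HARMONY.sub("ɑ", word)
```

The lookahead requires both the laryngeal and a following ɑ, so words like those in (55)-(57), where the consonant is not laryngeal or the second vowel is not ɑ, come out unchanged.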
So far, the harmony patterns in low vowels and in non-low vowels have been discussed
separately. In the next section, the harmony patterns will be examined putting low and
non-low vowels together.
2.3.6. Harmony in low and non-low vowels: a question
I argued in 2.3.4 that the existence of harmony patterns in Persian strongly supports a
featural analysis for the system. The assumption that harmony is feature-based derives
from a number of surveys of harmony which argue that harmony is based on features
rather than on length. That is, length harmony is unattested (Kramer 2003). The attested
types of harmony found in the languages of the world are as follows (e.g., van der Hulst
and van de Weijer 1995, Kramer 2003):
(58) Attested types of harmony across languages
                        Example
Palatal harmony         Finnish, Hungarian
Labial harmony24        Altaic languages (e.g., Turkic)
Tongue root harmony     African languages (e.g., Yoruba)
Height harmony          Bantu languages (e.g., Shona)
Nasal harmony           Kikongo
Retroflex harmony       Yurok
Tense/lax harmony       Canadian French25, Andalusian Spanish26
Unattested → Length harmony (Kramer 2003, p. 17)
In addition to these harmony types, some mixed systems (two or more harmony patterns)
are also attested. In these mixed cases, no instance of length in combination with another
type of harmony has been found. Some examples of mixed cases are as follows (from van
der Hulst and van de Weijer 1995, Kramer 2003): ATR and height in Klao (Kru) and in
Togo-remnant languages, backness and ATR/RTR in Kalabari (Niger-Congo), backness
and roundness in Turkish, backness and roundness and ATR in kàlɔŋ (Niger-Congo),
etc.
24 Note that Kramer (2003) puts a ‘?’ in front of labial harmony. Kramer writes that the question mark indicates that no clear instances of this type of harmony have been found. The reason is that, according to him, labial or rounding harmony is a by-product of backness harmony and/or is limited to vowels of a particular height. Regarding the need for having both [labial] and [back] for vowels in harmony processes, D’Arcy (2004), reexamining several patterns of harmony in different languages, shows that two place features are sufficient for vowels in accounting for cross-linguistic harmony patterns. She follows Rice (1995, 2002) in considering [coronal] and [peripheral] (the latter covers [labial] and [back] in vowels) as vowel place features. As pointed out in 1.2, I also follow Rice’s two-place model in my analysis of Persian vowels.
25 Douglas Walker (1984), Rachel Walker (2005).
26 Rachel Walker (2005).
I do not go further into the details of these cases of harmony. The point is that length
harmony is not attested27
and in all harmony patterns features are involved. Thus a
feature-based analysis is needed to account for harmony patterns in Persian. Given
harmony patterns in low vowels and non-low vowels, the question is: is there one
harmony system or two harmony systems (one among mid and high vowels, and the other
between the low vowels)? That is, is there one feature involved in all these harmony
patterns or are there two features involved? This will be fully discussed in the next
chapter.
2.4. Summary and conclusion
In this chapter I studied a controversial issue in Persian phonology, namely the active
feature of the vowel system of the language. A review of the literature showed that three
different views are found regarding the active feature of the system. Some studies
consider height, a quality, to be the phonologically contrastive feature. Others consider
quantity to be the basis of phonological contrast in the system. A synthetic account which
considers both quality and quantity to be underlyingly present in the vowel system of
Persian is also found in the literature. The third view is not tenable under the assumptions
of Modified Contrastive Specification because, given the nature of the system, if the
vowels are distinguished by phonological quality, there is no need for quantity to be
involved and vice versa. This leaves us with quantity or quality as the basis of contrast in
the system. I reexamined the arguments in the literature for quantity (versification and
phonotactic patterning) and those for quality (phonetic measurements and stress), and
showed that they are inconclusive. Based on the occurrence of several harmony patterns,
assuming that harmony is feature based, I argued for a featural analysis.
27 I have found only one case where it is argued that length harmony is required, Leggbó (Hyman and Udoh, 2002). The authors point out that this is truly unusual given the cross-linguistically attested patterns of harmony, for which a feature is required. This system may perhaps be open to reanalysis.
I end this chapter with two questions. First, are there two features involved in harmony
patterns (one for harmony in non-low vowels and the other for harmony in low vowels)
or is there one feature by which harmony in low vowels and harmony in non-low vowels
both can be accounted for? Second, considering that the major evidence for quantity
comes from the different distributions of the two classes of vowels (a, e, o vs. ɑ, i, u), can
this be accounted for with a feature-based harmony system? In the next chapter, I address
these questions and present a detailed analysis of Persian harmony patterns.
Chapter 3
The active features of the Persian vowel system: the solution
In chapter 2, I argued that quantity is not the dimension of contrast in Persian as it cannot
account for harmony, which requires a feature. In this chapter, I turn to quality in order to
account for harmony. I first provide an account of harmony assuming that height is the
active feature of the system, as assumed by qualitative and synthetic accounts of Persian
vowels (e.g., Samareh 1985, Pisowicz 1985, Toosarvandani 2004). I show that although
height accounts for harmony patterns, it does not capture another fact about Persian
vowels, the categorization of vowels into two groups: ɑ, i, u vs. a, e, o. I propose an
account of the system based on tenseness and show that both harmony and the
categorization of vowels can be explained by tenseness. I then present a phonetic
experiment on Persian vowels in order to see how Persian tense and lax vowels pattern in
terms of duration and if their patterning in this respect is closer to what we observe in
quantity-based systems or in tense-based systems. I further discuss contrasts and
markedness in the system. In addition, I study two processes which show changes in
vowels in the environment of particular consonants, namely harmony in low vowels
across laryngeals, which will be discussed with respect to characteristics of laryngeals
and low vowels, as well as pre-nasal raising, for which I will offer an account based on
tenseness. I end this chapter with a note on Persian diphthongs and argue that there is no
phonemic diphthong in Persian.
I start this chapter with a height-based account of harmony patterns.
3.1. Harmony patterns: an account based on height
In the previous chapter, I argued that quantity is not the dimension of contrast in the
system based on a widespread assumption about the feature-based nature of harmony (see
2.3.6). This provides strong evidence in favor of qualitative accounts of the system. The
accounts in the literature consider height to be the main feature in the system and in this
section I examine how the harmony patterns can be explained assuming a height-based
system for Persian vowels. In the previous chapter, different patterns of harmony among
non-low vowels and also a harmony pattern between the low vowels were presented (in
2.3). I will first discuss the harmony patterns in non-low vowels, followed by a
discussion of the harmony pattern in low vowels.
3.1.1. Harmony patterns in non-low vowels: an account based on height
In this section, I show how harmony in non-low vowels can be accounted for assuming
that height is the active feature in the system. First, I repeat some examples of the
harmony patterns in non-low vowels, which were discussed in 2.3.1-2.3.3.
(1) Harmony across a morpheme boundary: the vowel e of the prefix be may assimilate
to the following non-low vowel of the stem as in (a-c), and it raises preceding the high
glide [j] as in (d).
a. be + ɡu → beɡu ~ buɡu/boɡu ‘say!’
b. be + ɡir → beɡir ~ biɡir ‘get!’
c. be + ro → boro ‘go!’
d. be + ɑ → bijɑ ‘come!’
(2) Harmony within a stem: o and e may assimilate to u and i respectively
a. soɢut → suɢut ‘falling’
b. hozur → huzur ‘presence’
c. sebil → sibil ‘mustache’
d. zeɡil → ziɡil ‘wart’
(3) Harmony in loan words: the epenthetic e may assimilate to a following non-low
vowel
a. freezer → firizer ~ ferizer
b. blu(-ray) → bulu ~ belu
c. blonde → bolond ~ belond
Before discussing how these patterns can be explained in a height-based account, I need
to comment on the status of these harmony patterns as lexical rather than post-lexical.
One may ask for evidence to show that the harmony process in Persian is lexical, in
particular given that there is the pattern of raising of e to i before j (e.g., be + ɑ → bijɑ
‘come!’) which may suggest a post-lexical status for the process as it occurs after the
epenthetic j is inserted at the prefix boundary. It is important to know whether the
harmony process is lexical or post-lexical, because if it is post-lexical, it is a phonetic
process; if, however, it is lexical it belongs to phonology and involves underlying
features, and for this reason, it helps us to find out what features are phonologically
present in the system.
Based on criteria usually argued to be characteristic of lexical rules (see Kiparsky 1982,
1985, Mohanan 1982, Kaisse and Shaw 1985, D’Arcy 2003 among others), I argue that
harmony across a morpheme boundary in imperatives where the stem is consonant-initial
(e.g. be + ɡir → biɡir ~ beɡir ‘get!’) and harmony within a stem (e.g., kelid ~ kilid
‘key’) are not post-lexical for the following reasons.
In harmony across a morpheme boundary in imperatives, the process has access to
morphology as it occurs across morphemes: it occurs between the imperative marker be
and the stem (see (1)).
In harmony within a stem, the process is category-sensitive. It is usually nouns which
undergo harmony (see (2)) and not verbs (see (4)).
(4) a. keʃid ‘s/he pulled’ *kiʃid
b. resid ‘s/he reached’ *risid
Moreover, there are some exceptions in the noun category itself:
(5) a. neɡin ‘gem’ *niɡin
b. setiɢ ‘ridge’ *sitiɢ
I thus conclude that the harmony is lexical.
I now offer an account of harmony assuming height is the dimension of contrast in the
Persian vowel system. Consider once more the representation of the system if height is
taken to be the contrastive feature, as presented in the literature (e.g., Samareh 1985).28
(6) [coronal] [peripheral]
i u [high]
e o [mid]
a ɑ [low]
Given the system in (6), the feature [high] spreads from a high vowel to a mid vowel.
Recall that, as discussed in chapter 1, in the framework of Modified Contrastive
Specification within which my analysis is presented one does not need three height
features to be active in a system such as the one of Persian. Only two of them (along with
a place feature) are enough to distinguish the vowels from one another in the above
system. The harmony patterns suggest that [mid] is absent underlyingly given the
structural markedness assumed by Modified Contrastive Specification. Recall that one of
the diagnostics to identify marked elements (present in underlying representation) from
28 In the height-based accounts in the literature, the features [back] and [front] are used for vowel place (e.g., Samareh 1985) instead of [peripheral] and [coronal] which I use (see 1.2). This substitution has no effect on the analysis.
unmarked ones (absent in underlying representation) is assimilation. Triggers of harmony
are structurally more complex while targets of harmony are less complex. Thus in these
cases of harmony in Persian, high vowels, triggers of harmony, are more complex than
mid vowels because the former must have the feature [high] to spread. The feature [mid],
on the other hand, must be absent since mid vowels are targets in harmony. For a mid
vowel to become high, it would obtain the feature [high] from a neighboring high vowel.
One might ask why the vowel o can be the trigger of harmony for the vowel e although
they both share the feature [mid] (or both lack a height feature) (e.g., be + ro → boro
‘go!’). The fact that o can be a trigger of harmony for e shows that o is structurally more
complex than e. The vowel o is specified for place but not for height. That is, in e to o
assimilation, [peripheral] spreads from o to e, which is not specified for place ([coronal]
is underlyingly absent). It will be shown in 3.6 that other processes of Persian confirm
that [coronal] is unspecified in this language whereas [peripheral] is specified.
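The height-based account just outlined can be made concrete by listing the marked features of each vowel and treating harmony as the addition of a trigger's features to a target. The sketch below is only an illustration of that reasoning under the assumptions stated in the text, namely that [mid] and [coronal] are underlyingly absent; the names SPEC and spread are hypothetical, and the sketch covers only harmony among non-low vowels, so low vowels are simply excluded as triggers and targets.

```python
# Marked features per vowel, following (6) with [mid] and [coronal] unspecified.
SPEC = {
    "i": frozenset({"high"}),
    "u": frozenset({"high", "peripheral"}),
    "e": frozenset(),
    "o": frozenset({"peripheral"}),
    "a": frozenset({"low"}),
    "ɑ": frozenset({"low", "peripheral"}),
}
# Reverse lookup from a feature set to the vowel it defines.
VOWEL_OF = {features: vowel for vowel, features in SPEC.items()}


def spread(target, trigger, features):
    """Spread the named features from trigger to target and return the result.

    Low vowels are excluded from this sketch, so the target is returned
    unchanged when either vowel is low.
    """
    if "low" in SPEC[target] | SPEC[trigger]:
        return target
    gained = SPEC[trigger] & frozenset(features)
    return VOWEL_OF[SPEC[target] | gained]
```

On this representation, spreading [high] from i to e gives i, spreading [peripheral] from o to e gives o, and a mid vowel can never act as a height trigger because it has no height feature to contribute.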
Another question which might be asked is why low vowels do not interact with non-low
vowels in the harmony processes in Persian. In order to account for the failure of low
vowels to participate in harmony, I assume a distinction between low and non-low
vowels in the Persian inventory, with low vowels being marked by the feature [low]. This
assumption is supported by evidence from the language, and low vowels pattern similarly
in other languages. The two low vowels do not participate in the processes of height
harmony in Persian (i.e. the low vowels neither raise to a non-low vowel nor does any
non-low vowel lower to them). Similar cases occur in other languages. For example, in
Shona, a Bantu language with a three-height vowel system, the low vowel does not
trigger height harmony although the language shows height harmony among mid and
high vowels. Height harmony also fails to apply across the low vowel in this language
(Beckman 1997). Some examples (taken from Beckman 1997) are given in (7). Compare
(7a)-(7d) with (7e) and (7f).
(7) a. pera ‘end’ per-era ‘end in’
b. sona ‘sew’ son-era ‘sew for’
c. oma ‘be dry’ om-esa ‘cause to get dry’
d. bvisa ‘remove’ bvis-ika ‘be easily removed’
e. pamha ‘do again’ pamh-isa ‘make do again’
f. cheyama ‘be twisted’ cheyam-isa ‘make be twisted’
The suffixes in (7) are generally analyzed as beginning with a high vowel; this vowel
lowers to mid after a preceding mid vowel (a-c), remains high after a high vowel (d), and does not assimilate to a preceding low vowel (e, f).
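The Shona pattern in (7) can be sketched in the same procedural style. This is a simplified illustration based only on the six forms above: the function names are hypothetical, the underlying suffix shapes -ira, -isa and -ika are assumptions consistent with the high-vowel analysis just mentioned, and roots are given without the final vowel -a.

```python
VOWELS = set("aeiou")
MID = {"e", "o"}


def last_vowel(root):
    """Return the last vowel of a root, or None if it has none."""
    for segment in reversed(root):
        if segment in VOWELS:
            return segment
    return None


def attach(root, suffix):
    """Attach an i-initial suffix, lowering i to e when the nearest root vowel is mid."""
    if suffix.startswith("i") and last_vowel(root) in MID:
        suffix = "e" + suffix[1:]
    return root + suffix
```

Because only the nearest root vowel is inspected, a low vowel between a mid vowel and the suffix, as in cheyam-, leaves the suffix vowel high.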
Clements (1991) argues that low vowels typically do not trigger height assimilation such
as vowel lowering in Bantu languages. An example of an assimilation rule that does not
apply with low vowels can be found in a height assimilation rule in Kimatuumbi, another
Bantu language, in which any nonlow suffix vowel gains the height of a preceding
nonlow vowel (Clements, 1991). Kimatuumbi has four vowel heights in stems, low, mid,
high and what is identified as super high. Suffixes simply distinguish low and non-low
vowels. The non-low vowel suffix assimilates to non-low vowels, but is realized as super
high following a low vowel. In (8), the cedilla under the vowel indicates a super high
vowel and the capital letters indicate the underlying high vowels of the suffixes.
(8)     underlying   surface    example
    a.  i̧ + I        i̧ + i̧      yi̧pilya ‘thatch with for’
    b.  i + U        i + u      tikulya ‘break with’
    c.  e + I        e + e      cheengeya ‘make build’
    d.  o + I        o + e      boolelwa ‘be de-barked’
    e.  a + U        a + u̧      tyamu̧lya ‘sneeze on’
In (8a) because of the preceding super high vowel, the suffix vowel is realized as super
high. In (8b) a preceding high vowel makes the suffix vowel high. In (8c) and (8d), /I/ is
realized as [e] due to the presence of preceding mid vowels. Clements explains that with
a preceding low vowel this rule does not apply: in (8e) above, /U/ is realized as [u̧], a
super high vowel. These cases indicate that low and high vowels do not interact in a
three-height system. Why the feature [low] shows this exceptional behavior needs
investigation. The point important to our discussion is that the fact that low vowels do not
participate in harmony processes with non-low vowels is not unique to Persian since
cross-linguistically it is not unusual for low vowels to not participate in harmony
processes with vowels of other heights.
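The Kimatuumbi generalization illustrated in (8) can be stated as a small function over height labels. This is only a sketch of the rule as described above; the function name and the string labels for the four stem heights are assumptions made for illustration.

```python
# Sketch of the Kimatuumbi pattern in (8): a non-low suffix vowel copies the
# height of a preceding non-low stem vowel and surfaces as super high after
# a low stem vowel. Height labels are illustrative strings, not a formalism.
STEM_HEIGHTS = {"super-high", "high", "mid", "low"}


def suffix_height(preceding_stem_height: str) -> str:
    """Return the surface height of a non-low suffix vowel."""
    if preceding_stem_height not in STEM_HEIGHTS:
        raise ValueError(f"unknown height: {preceding_stem_height}")
    if preceding_stem_height == "low":
        # The low vowel does not trigger assimilation (row e in (8)).
        return "super-high"
    # Rows a-d in (8): the suffix vowel takes on the stem vowel's height.
    return preceding_stem_height


for height in ["super-high", "high", "mid", "low"]:
    print(height, "->", suffix_height(height))
```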
In addition, according to van der Hulst and van de Weijer (1995), spreading of both
[high] and [low] in a system is unattested. They discuss harmony systems that involve
either [low] or [high] spreading and they point out that “we do not know of systems that
involve spreading of both” (p. 510). In Persian, as shown, [high] spreads and thus, based
on cross-linguistic patterning, [low] is not expected to spread.
To sum up, the harmony patterns in non-low vowels can be neatly accounted for in a
height-based account: [high] spreads from high vowels to mid vowels.
I now turn to the harmony pattern in low vowels.
3.1.2. Harmony pattern in low vowels: an account based on height
In low vowels, a pattern of harmony is seen, as discussed in 2.3.5., in which a becomes ɑ
across a laryngeal consonant. (9) presents some examples:
(9) a. bahɑ → bɑhɑ ‘price’
b. ʤahɑn → ʤɑhɑn ‘world, also a name for boys’
c. maʔɑʃ → mɑʔɑʃ ‘livelihood’
d. saʔɑdat → sɑʔɑdat ‘happiness’
Why the harmony in low vowels occurs only across laryngeals and whether this has
to do with the feature characteristics of low vowels and laryngeals in Persian will be
discussed in 3.7. Let us focus on harmony for now.
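The mapping in (9) can be sketched as a string rewrite. This is a minimal illustration that assumes the harmony is regressive and applies when a is separated from a following ɑ by exactly one laryngeal (h or ʔ), as in the examples above; the function name and the use of a regular expression are assumptions for the sketch, not part of the analysis.

```python
import re

LARYNGEALS = "hʔ"


def low_vowel_harmony(word: str) -> str:
    """Rewrite a as ɑ when it is followed by a laryngeal plus ɑ."""
    # The lookahead leaves the laryngeal and the trigger vowel untouched.
    return re.sub(rf"a(?=[{LARYNGEALS}]ɑ)", "ɑ", word)


for form in ["bahɑ", "ʤahɑn", "maʔɑʃ", "saʔɑdat"]:
    print(form, "→", low_vowel_harmony(form))
```

Note that the second a in saʔɑdat is left unchanged, since it is not followed by a laryngeal and ɑ.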
Considering the representation of the vowel inventory in height-based accounts (see (6)),
the harmony in low vowels is a case of place assimilation: [peripheral] spreads from ɑ to
a. Based on the structural markedness assumed here, this shows that [peripheral] is
marked or present in the system and [coronal] is unmarked or absent, as also shown by
the assimilation of /e/ to [o], as discussed in section 3.1.1.
3.1.3. Harmony patterns in a height-based account: discussion
The underlying representation of the inventory in a height-based account within the
framework of Modified Contrastive Specification will then be as shown in (10). Based on
phonological patterning of the language, the features [peripheral], [high], and [low] are
underlyingly present, while features [coronal] and [mid] are absent.
(10) [peripheral]
i u [high]
e o
a ɑ [low]
Thus the height-based account suggests that two features are involved in harmony
patterns: height and place. That is, there are two harmony systems: one involves
spreading of [high], which is seen in non-low vowels; the other involves spreading of
[peripheral], which is seen in low vowels and in the assimilation of e to o.
But if one considers the patterns of harmony in low vowels and non-low vowels together,
the following pattern is observed.
(11) Harmony patterns under a height-based account
     targets      triggers
     a      →     ɑ           place harmony
     o      →     u           height harmony
     e      →     i           height harmony
The triggers of harmony are ɑ, i, u, the former long vowels (see 2.1), and the targets of
harmony are a, e, o, the former short vowels (within targets, e may change to o as
discussed in 2.3.1 so o can be both target and trigger but the point is that ɑ, i, u are
always triggers and never targets). Recall that a categorization of ɑ, i, u versus a, e, o is
suggested in quantity-based accounts of Persian vowels (see 2.2.2), but it was already
established that due to harmony cases, quantity is not the dimension of contrast in the
system, and a feature should necessarily be involved in the harmony patterns (see 2.3.6).
Can we account for the harmony patterns, which require a feature to be active in the
system, and also for the categorization observed in (11)? I examine this question next.
3.2. Tense/lax distinction
In this section, I provide a feature-based account for the Persian vowel system by which
harmony patterns in low and non-low vowels will find a uniform account and the
categorization of vowels into two groups a, e, o vs. ɑ, i, u, will also be accounted for.
I argue that the system is captured if the contrast between the vowels is explained through
a tenseness feature. That is, I suggest a featural analysis for the system based on tense/lax
distinction. Under this view, the system is represented as follows:
(12)   [tense]   i   u        [lax]   e   o
                   ɑ      [low]         a
I ended the previous chapter with two questions: (i) are there two features involved in
harmony patterns (one for harmony in non-low vowels and the other for harmony in low
vowels) or is there one feature by which harmony in low vowels and non-low vowels
both can be accounted for? (ii) can the different distributions of the two classes of vowels
(a, e, o vs. ɑ, i, u) be accounted for with a feature-based harmony system? An analysis
based on tenseness provides clear answers to these questions. Regarding the
categorization of vowels, the tense/lax distinction categorizes the vowels into two groups:
ɑ, i, u, as a group versus a, e, o as another group, as seen in (12). With respect to
harmony, the harmony processes in low vowels and non-low vowels are proposed to be
cases of tense harmony. That is, the feature [tense] spreads from ɑ, i, u to a, e, o. Thus,
by using a tense/lax distinction, harmony can be neatly accounted for, since [tense] is a
feature and can spread, and at the same time, ɑ, i, u, the triggers of harmony, are in one
group as opposed to a, e, o, the targets of harmony, which form another group.
Represented structurally, tense harmony can be captured as follows:
(13) V V
[tense]
The advantages of a tense-based account over a quantity-based account are that (i)
tenseness, being a feature, can account for harmony while quantity cannot; (ii) tenseness
categorizes the vowels into a, e, o vs. ɑ, i, u based on an active phonological process of
the language, namely vowel harmony, while for such categorization quantitative accounts
rely on evidence such as versification and phonotactics, which are not phonological
processes.
The advantages of a tense-based account over a height-based account are that (i)
tenseness provides a uniform account for harmony patterns in low vowels and non-low
vowels while a height-based account presents two harmony systems, one for low vowels
and the other for non-low vowels; (ii) height cannot account for the categorization of
vowels into a, e, o vs. ɑ, i, u, which is in fact the reason some literature considers
quantity to be active in the system. Tenseness can do so, however.29
The tense-based account also shows that, as predicted by Modified Contrastive
Specification, it is not necessary for both quality and quantity to be underlyingly present
in the Persian vowel system. Thus the synthetic analysis is ruled out.
One may ask why we need a uniform account for harmony patterns in low vowels and in
non-low vowels, given that the environments are not the same. It is true that the domains
of harmony differ among non-low vowels and that the environments in which harmony
occurs in low and non-low vowels are not the same, but what all these harmony processes
share is that ɑ, i, u are always the triggers and never the targets. That makes it
legitimate to look for a uniform account.
In the next section, I will elaborate on the nature of the feature [tense].
3.3. On the nature of tense/lax
I considered the feature involved in harmony patterns in Persian to be [tense] and I
therefore call the process tense (or tensing) harmony. In this section, I discuss this feature
and explain my reasons for this choice.
There are discussions of tensing/laxing harmony and ATR/RTR harmony in the harmony
literature as well as discussion of whether [tense] and [ATR] are equivalent. For instance,
29 One may ask how this is different from “functionality” (recall that discussing the vowels Samareh (1977) says that “the traditional labels “long” and “short” may justifiably be preserved for the two functionally different groups” (see 2.2.2.2)). The point is that functionality is not a phonological feature so even Samareh who discusses functionality has to use a feature (i.e., traditional long and short). But tenseness is a feature.
Halle (1983) considers [tense] and [ATR] to refer to the same phonological property. Van
der Hulst and van de Weijer (1995) say that before the feature [ATR] came into use,
African systems were described in terms of features referring to vowel height
(open/close, high/low), or in terms of a feature [±tense]. Trigo (1991) suggests that
vowel harmony in Turkana involves a tense versus lax distinction. Benus (2005) notes
laxing harmony in Pasiego. D. Walker (1984) discusses laxing harmony in Canadian
French. R. Walker (2005) shows cases of laxing harmony in Andalusian Spanish and also
in Canadian French. In her discussion of Canadian French, R. Walker points out that the
pattern of laxing harmony can be treated as involving spreading of [-ATR]. According to
van Oostendorp (1995), tense is a feature which is not very well defined and is a cover
term for various properties. The same is true for its counterpart, lax. In a discussion of
these terms, he uses [ATR] wherever phonetic explicitness is required and [lax]
everywhere else in his thesis.
Returning to Persian, from the phonological perspective, a feature is needed in order to
account for harmony and the two classes of vowels (a, e, o vs. ɑ, i, u). I choose [tense] as
a cover term, abstracting away from the following: (i) the discussion of whether or not
[ATR] and [tense] are the same –I will explain below why I choose [tense]; (ii) the
phonetic correlates of tenseness in Persian –I also elaborate on this below.
Let us first focus on why I choose [tense] and not [ATR]. For my purpose, I could use
either [ATR] or [tense] (i.e., they are features so they can explain harmony and also they
are able to categorize a, e, o vs. ɑ, i, u). I prefer [tense] because [ATR] is associated with
a particular phonetic gesture, advancement of tongue root, while [tense] does not have
such a correlation; it will be shown below that tenseness can have different phonetic
realizations across languages. Taking the feature involved to be [ATR] instead of [tense]
requires that the low back vowel ɑ also be considered as having [ATR], and taking a low
back vowel like ɑ to be [ATR] is phonetically hard to account for (e.g., ɑ is taken to be
[RTR] in Kramer 2003, van der Hulst and van de Weijer 1995). I should note that I could
use [ATR] as a cover term without worrying about its phonetic properties. Nonetheless, I
prefer using [tense], which does not correspond to a particular phonetic property, as
discussed below. This therefore avoids confusion on what is phonetic and what is
phonological, in particular given the presence of ɑ in the system.
I now turn to tenseness and its phonetic correlates, and review phonetic realizations of
tense/lax across languages to show that what is called [tense] or [lax] can have different
phonetic correlates depending on the language under study. This is, as said above, the
main reason for my choice of [tense] over [ATR] – [tense] is not related to a particular
phonetic property and therefore is a more appropriate cover term for a phonological
analysis.
According to Jessen (1998), phonetic studies on different languages show that what is
called ‘tense’ and ‘lax’ in vowels can have different phonetic manifestations across
languages. He identifies at least three types of languages in this respect, as follows (see
Jessen 1998 for discussion, examples, and references):
(i) The Germanic type, which is characterized by a difference in duration and in vowel
quality between tense and lax vowels. An example is German.
(ii) The African type, in which the tense/lax distinction in vowels is characterized by a
primary distinction in vowel quality, most particularly resulting from an advancement or
retraction of the tongue root. Unlike in the Germanic type, a duration difference is of
little importance.
(iii) The Asian type, such as some languages spoken in China, in which the main
difference between tense and lax vowels is due to voice quality, and therefore vowel
quality and vowel duration are not important as correlates of tense/lax.
Jessen proposes that there is another realization of the feature [tense], as observed in
Thai, in which the tense/lax opposition seems to be almost completely based on phonetic
vowel duration (cf. the Germanic type where both duration and quality matter as
correlates of tense/lax). In 3.4, I will discuss phonetic length in tense-based and quantity-
based languages and how they are different.
It is interesting to find out the phonetic correlate(s) of tenseness in Persian. Given the
cross-linguistically attested phonetic realizations of tenseness, we might expect to see
height, or duration, or a combination of them as phonetic correlates of tenseness in
Persian – as noted above because of ɑ, a low back vowel, advancement of tongue root
does not seem to be a reasonable correlate for tenseness in this language. In Persian, if one
considers the pairs /e, i/ and /o, u/, either duration or quality (height in this system),
or both, could be phonetic correlates of tense/lax. Now if one includes the vowel ɑ,
which patterns along with the two vowels i and u in vowel harmony, a phonetic
characteristic which can put ɑ along with i and u is required. For this reason, duration (as
opposed to height) might be the relevant phonetic correlate for tenseness in Persian. That
is, if [tense] phonologically puts ɑ, i, u in a group, it is possible that duration or length (as
phonetic correlates of tenseness in the language) phonetically puts them in a group. In
fact, evidence from the language (i.e., harmony) makes it clear to us that tenseness is
primary, and duration, or any other phonetic characteristic, should follow from it. What is
in particular an important factor in considering ɑ, i, u as [tense], regardless of their
phonetic description, is the similar phonological behaviour of these vowels (as opposed
to a, e, o).
The fact that ɑ behaves phonologically like i and u, and is thus classified along
with these vowels as tense, makes the presence of tenseness easier to argue for,
because tenseness behaves independently of height in this language. There
are languages like Chamorro in which one does not get a tense/lax distinction
independent of height, so it is harder to know whether phonologically it is tenseness or
height that is involved. Other evidence must be found in such languages to show
which one is indeed active (see van Oostendorp 1999).
To sum up, I use [tense] as a cover term through which the vowel system and the
phonological patterning of the vowels in Persian can be accounted for. I consider
tenseness to be phonologically active in the system and therefore ɑ, i, u are specified by
[tense].
I carried out an experiment on Persian vowels in order to see how the tense and lax
vowels pattern in terms of length. As we will see next, based on studies on other
languages, the durational difference between long and short vowels in quantity-based
languages is more than the durational difference between tense and lax vowels in tense-
based languages. Thus we want to find out how Persian behaves in this respect. I discuss
this next.
3.4. Phonetic experiment
In 2.2.1.1, I discussed what we find in the literature regarding phonetic length of Persian
vowels. It was shown that there is not agreement on where one sees length differences.
For example, according to one study (Samareh 1977), a length distinction exists in closed
syllables (e.g., bɑr ‘load’ vs. bar ‘over, fruit’) while based on another study (Windfuhr
1979) the length distinction is neutralized in closed syllables, due to a lengthening rule
which lengthens a, e, o in closed syllables (see 2.2.1.1 for discussion).
I conducted a phonetic experiment in order to measure the length of Persian vowels.30
The experiment had two goals: (i) to see if Persian tense and lax vowels have different
lengths and if there is a pattern among vowels with respect to length; (ii) to compare the
results with the length of vowels in quantity-based and tense-based languages to see to
which group Persian is similar.
The prediction for Persian, based on what we see in tense-based and quantity-based
languages, as discussed below, is that tense vowels may show more duration compared to
lax vowels in the same environment, as length could be the phonetic correlate of
tenseness in a language, but we do not expect to see a very large length difference
between tense and lax vowels. In quantity-based languages, however, a very large
difference is observed between long and short vowels. I will discuss this below.
30 I am grateful to Alexei Kochetov for his advice and guidance on the experiment, and to Christopher Neufeld for doing the measurements and the tables and graphs (18)-(21). Thank you also to Keren Rice, my supervisor, for her financial support. I would also like to thank my parents for participating in the experiment.
There were two speakers, one female and one male, both speakers of Standard Persian.
Each speaker read 18 tokens, which were embedded in a carrier sentence, 5 times. The
participants did not repeat the same sentence consecutively. The carrier sentence was as
follows:
(14) Un kalame ……. bud.
That word ……… was
“That word was ….”
The tokens included three syllable structures:
(i) CV
(ii) CV.CVC (the first V is target; stress is not on target vowel)
(iii) CVC
Comparing CV in (i) and CV in (ii) with CVC in (iii) shows us the duration of vowels in
open and closed syllables. Comparing (i) and (ii) reveals the effect of stress (CV in (i) is
stressed while CV in (ii) is not). Additionally, CV in (i) is important for the discussion of
minimal words, which is the topic of chapter 6.
For recording, I used an Olympus WS-500M digital voice recorder and a Cyber Acoustics
microphone (CVL-1124R-CW). The sounds were recorded in stereo WMA format at a
bit rate of 128 kbps.
In (15)-(17), I list the tokens based on their structure (the target vowels are in bold). In
the actual experiment the sentences were randomized through random.org.
(15) CV structure
     e/i contrast
         se        ‘three’
         si        ‘thirty’
     o/u contrast
         to        ‘you (sg.)’
         tu        ‘inside’
     a/ɑ contrast
         na        ‘no’
         nɑ        ‘energy’
(16) CV.CVC structure (syllable boundary is shown by ‘.’)
     e/i contrast
         de.ɡar    ‘other, else’ (the literary form of di.ɡar)
         di.ɡar    ‘other, else’
     o/u contrast
         ko.tak    ‘beating’
         ku.tɑh    ‘brief’
     a/ɑ contrast
         ka.tɑn    ‘linen’
         kɑ.teb    ‘writer’
(17) CVC structure
     e/i contrast
         keʃ       ‘elastic band’
         kiʃ       ‘faith’
     o/u contrast
         pok       ‘a puff, a drag’
         puk       ‘hollow’
     a/ɑ contrast
         tak       ‘unique’
         tɑk       ‘vine’
3.4.1. Results
The results show that tense vowels are longer than their lax counterparts in CV.CVC and
CVC structures. In CV structure, however, lax vowels can be longer than tense vowels.
The graphs and tables in (18)-(21) show the mean of vowel nucleus length by tenseness,
vowel quality (i/e = front, ɑ/a = low, u/o = back) and syllable structure. In tables (19) and
(21), numbers are bolded where lax vowels are longer than their tense counterparts. The
results of the female speaker’s performance are given in (18) and (19), followed by the
results of the male speaker’s performance in (20) and (21).
(18) [Figure: Female speaker, duration of vowel nucleus. Mean duration (ms) of lax and
tense vowels, by vowel quality (back, front, low) within each syllable structure (CV,
CV.CVC, CVC).]
(19) Female speaker (mean duration of vowel nucleus, ms)
                      back      front     low
     CV (lax)         141.4     133.4     185.5
     CV (tense)       114.6     122.7     183.3
     CV.CVC (lax)     46.7      77.5      58.1
     CV.CVC (tense)   61.2      117       113.3
     CVC (lax)        76        76        81
     CVC (tense)      93.9      115       127
(20) [Figure: Male speaker, duration of vowel nucleus. Mean duration (ms) of lax and
tense vowels, by vowel quality (back, front, low) within each syllable structure (CV,
CV.CVC, CVC).]
(21) Male speaker (mean duration of vowel nucleus, ms)
                      back      front     low
     CV (lax)         161.75    141.49    194.48
     CV (tense)       156.59    147.12    236.6
     CV.CVC (lax)     57.05     94.09     60.71
     CV.CVC (tense)   82.62     108.78    136.4
     CVC (lax)        101.68    100.13    96.1
     CVC (tense)      134.8     164.17    197.57
To sum up, the results of the experiment show the following:
(i) Tense vowels are longer than their lax counterparts in CV.CVC and CVC structures.
(ii) Lax vowels may be longer than their tense counterparts in CV structure.
(iii) Regardless of tenseness, the vowels in CV structure are usually longer than the
vowels in the other syllable structures.
3.4.2. Discussion
The results of the experiment bring up the following questions:
(i) Why can lax vowels be longer than or as long as tense vowels in CV structures (unlike
in CV.CVC and CVC)? I used ‘longer than or as long as’ because one may say that the
difference between, for example, 185.5 and 183.3 in (19) is not significant.
(ii) If tense vowels are longer than lax vowels in Persian, given that long vowels are
longer than short vowels in quantity-based languages, how do we know that the
difference in phonetic length in Persian is not due to underlying presence of quantity in
the system?
The answer to the first question (why in the CV structure lax vowels can be longer than
their tense counterparts) involves a minimal word requirement which holds on the surface
(not underlyingly). In order to satisfy minimality constraints, the vowels, in particular lax
vowels, are produced with more duration. I will return to this in chapter 6 (see also
Fitzgerald (forthcoming) for lengthening due to minimality constraints).
The response to the second question is as follows: the phonetic length difference in
Persian tense and lax vowels is less than what is expected if the language were truly
quantity-based. Studies show that languages in which quantity is the basis of contrast
exhibit a larger difference between the phonetic length of their long and short vowels.
That is, long vowels are significantly (about twice or so) longer than short vowels (Tranel
1995, Goodman 2005 (cited in Kozasa 2005), among others). According to Tranel
(1995), in Runyambo (a Bantu language of Tanzania), long vowels are about twice as
long as short vowels, and in Luganda (a Bantu language of Uganda), long vowels are two
and a half times longer than short vowels. In a phonetic study of vowels of Sudanese,
Saudi, and Egyptian Arabic, a quantity-based language, Alghamdi (1998) notes that in
terms of quantity, vowels pattern similarly in these dialects. In all three dialects, long
vowels are more than twice as long in duration as their short counterparts. Consider also
the reported ratio of the duration of long vowels to short vowels in the following
quantity-based languages (Lehiste 1970, Hirata 2004a, 2004b, Abramson 2001, Kozasa
2005):
(22) Ratio of long vowels to short vowels in some quantitative languages
Japanese 2.4-3.2
Thai 2.5-2.9
Finnish 2.27
Danish 1.98
Estonian 2.20
Compare these with English, a tense-based language (Hillenbrand et al 1995, Hillenbrand
2003):
(23) Ratio of tense vowels to lax vowels in American English
/i/ > /ɪ/ 1.26
/e/ > /ɛ/ 1.36
/o/ > /ʊ/ 1.23
/u/ > /ʊ/ 1.23
The mean of ratios in English is 1.27, based on the figures in (23).
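The arithmetic behind this comparison can be checked with a short script. The figures are those listed in (22) and (23); the variable names, and the choice to represent a single reported value as a range whose two ends coincide, are assumptions made for the sketch.

```python
from statistics import mean

# Tense/lax duration ratios for American English, as listed in (23).
ENGLISH_TENSE_LAX = [
    ("/i/ > /ɪ/", 1.26),
    ("/e/ > /ɛ/", 1.36),
    ("/o/ > /ʊ/", 1.23),
    ("/u/ > /ʊ/", 1.23),
]

# Long/short duration ratios from (22), stored as (lowest, highest) reported value.
QUANTITY_LONG_SHORT = {
    "Japanese": (2.4, 3.2),
    "Thai": (2.5, 2.9),
    "Finnish": (2.27, 2.27),
    "Danish": (1.98, 1.98),
    "Estonian": (2.20, 2.20),
}

english_mean = round(mean(ratio for _, ratio in ENGLISH_TENSE_LAX), 2)
smallest_quantity_ratio = min(low for low, _ in QUANTITY_LONG_SHORT.values())

print("English mean tense/lax ratio:", english_mean)
print("Smallest long/short ratio in (22):", smallest_quantity_ratio)
```

Even the smallest ratio reported for a quantity-based language in (22) lies well above the English mean, which is the gap the discussion relies on.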
As shown below, the ratio of duration of tense vowels to lax vowels in Persian is 1.385,
which is close to the ratio in English and smaller than those in the quantity-based
languages such as Arabic and Japanese. In (24)-(27), I show in detail how I calculated the
ratio in Persian. In (24), the ratio of duration of tense vowels to lax vowels as pronounced
by the female speaker is shown. In order to clarify, I explain how the column u > o
should be read. In the CV structure, the ratio of duration of u to o is 0.81, which is
obtained by dividing 114.6 by 141.4 (the first and second rows under the back column in
(19)). In the CV.CVC structure, the ratio of duration of u to o is 1.31, which is obtained
by dividing 61.2 by 46.7 (the third and fourth rows under the back column in (19)). In the
CVC structure, the ratio of duration of u to o is 1.24, which is obtained by dividing 93.9
by 76 (the fifth and sixth rows under the back column in (19)). The mean of 0.81, 1.31,
and 1.24 is 1.12 (see the column u > o in (24)). The columns i > e and ɑ > a in (24)
should be read in the same manner.
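The calculation just described can be reproduced as follows. The durations are taken from (19) and (21); the data layout, the function name `pair_means`, and the decision to round each ratio to two decimals before averaging (which mirrors the worked example above) are assumptions made for this sketch.

```python
from statistics import mean

STRUCTURES = ["CV", "CV.CVC", "CVC"]
PAIRS = {"u > o": "back", "i > e": "front", "ɑ > a": "low"}

# (lax, tense) mean durations in ms, copied from tables (19) and (21).
DURATIONS = {
    "female": {
        "CV":     {"back": (141.4, 114.6), "front": (133.4, 122.7), "low": (185.5, 183.3)},
        "CV.CVC": {"back": (46.7, 61.2), "front": (77.5, 117.0), "low": (58.1, 113.3)},
        "CVC":    {"back": (76.0, 93.9), "front": (76.0, 115.0), "low": (81.0, 127.0)},
    },
    "male": {
        "CV":     {"back": (161.75, 156.59), "front": (141.49, 147.12), "low": (194.48, 236.6)},
        "CV.CVC": {"back": (57.05, 82.62), "front": (94.09, 108.78), "low": (60.71, 136.4)},
        "CVC":    {"back": (101.68, 134.8), "front": (100.13, 164.17), "low": (96.1, 197.57)},
    },
}


def pair_means(speaker: str) -> dict:
    """Mean tense/lax ratio per vowel pair across the three syllable structures."""
    result = {}
    for pair, column in PAIRS.items():
        ratios = []
        for structure in STRUCTURES:
            lax, tense = DURATIONS[speaker][structure][column]
            ratios.append(round(tense / lax, 2))  # e.g. 114.6 / 141.4 -> 0.81
        result[pair] = round(mean(ratios), 2)
    return result


female = pair_means("female")
male = pair_means("male")
speaker_totals = {
    "female": round(mean(list(female.values())), 2),
    "male": round(mean(list(male.values())), 2),
}
overall = mean(list(speaker_totals.values()))

print(female, male, speaker_totals, overall)
```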
(24) Ratio of tense vowels to lax vowels in Persian (female speaker)
                 u > o     i > e     ɑ > a
     CV          0.81      0.92      0.99
     CV.CVC      1.31      1.51      1.95
     CVC         1.24      1.51      1.57
     Mean        1.12      1.31      1.50
(25) shows the same information for the male speaker (e.g., in the CV structure, the ratio
of duration of u to o is 0.97, which is obtained by dividing 156.59 by 161.75 (the first and
second rows under the back column in (21))).
(25) Ratio of tense vowels to lax vowels in Persian (male speaker)
                 u > o     i > e     ɑ > a
     CV          0.97      1.04      1.22
     CV.CVC      1.45      1.16      2.25
     CVC         1.33      1.64      2.06
     Mean        1.25      1.28      1.84
Collapsing the means in (24) and (25) of each column gives (26). That is, for example,
1.185 under u > o in (26) is the sum of 1.12 (in (24)) and 1.25 (in (25)) divided by two.
(26) Ratio of tense vowels to lax vowels in Persian (pairs of vowels)
                                          u > o     i > e     ɑ > a
     Female and male speakers together    1.185     1.295     1.67
(27) shows the total ratio of tense vowels to lax vowels in Persian. Putting together the
means in (24), we get 1.31 as the ratio of duration of tense vowels to lax vowels for the
female speaker (i.e., the sum of 1.12, 1.31, and 1.50 (the means in (24)) divided by 3).
Doing the same for (25) gives 1.46 for the male speaker. Thus, as shown in (27), the ratio
of tense vowels to lax vowels in the female speaker is 1.31 and the ratio of tense vowels
to lax vowels in the male speaker is 1.46. The total ratio of tense vowels to lax vowels in
Persian is therefore 1.385 (considering 1.31 and 1.46 together).
(27) Ratio of tense vowels to lax vowels in Persian (total)
                        tense > lax
     Female speaker     1.31
     Male speaker       1.46
     Total              1.385
So in answer to the second question raised above (whether phonetic duration of vowels in
Persian argues for tenseness or quantity), the phonetic length of Persian vowels suggests
that Persian behaves as a tense-based language rather than a quantity-based language, as
the difference in vowel duration is far less than expected in a quantity system.
To sum up, the findings of the phonetic experiment support my proposal that the Persian
vowel system is a tense-based system and that quantity is not the basis of contrast in the
system. Tense vowels tend to be somewhat longer than lax vowels in CV.CVC and
CVC structures, while lax vowels may be longer than tense vowels in CV structures.
I now turn to an analysis of the system based on contrast followed by a discussion of
markedness which presents more evidence for the underlying presence of [tense].
3.5. Contrasts in the Persian vowel system
In this section, I present the contrastive hierarchy of the Persian vowel system. Recall that
contrastive specification involves the ordering of features into a contrastive hierarchy
(Dresher 2003a, 2003b, 2003c, 2009) (see 1.2). Thus, in addition to identifying which
features are phonologically active in the system, one needs to specify the order in which
features enter into the system. The order is determined by phonological activity of
features in the language under study.
Let us begin by reviewing what processes need to be accounted for in Persian. The
following should be taken into consideration in deciding on the order of features in the
Persian vowel system:
(i) Tense harmony (see 3.2)
(ii) Categorization of vowels into ɑ, i, u vs. a, e, o (section 3.2; see also chapters 4, 5,
and 6)
(iii) Pre-nasal raising (i.e., ɑ → u / __ {m, n}) (noted in 2.3.4 and discussed in 3.8) (e.g.,
bɑdɑm ~ bɑdum ‘almond’, bɑrɑn ~ bɑrun ‘rain’).
I propose that the first cut in the Persian vowels involves the feature [peripheral] or
[tense]. It does not matter which one enters the system first.
The absence of [coronal] from the underlying representation of the system was briefly
discussed in 3.1.1, and is further discussed in 3.6. Taking place as the first cut is
supported by evidence from the pre-nasal raising process. I will return to this later in this
section.
In section 3.2, I proposed an analysis of the system based on a contrast of tense/lax,
arguing that [tense] is the active feature based on harmony and that [lax] is unmarked in
the system. The result of introducing [peripheral] and [tense] (or [tense] and [peripheral])
into the system is shown in (29). For the sake of clarity, I enter [peripheral] first as shown
in (28) and then [tense] in (29). But, as noted above, the opposite is possible as well:
[tense] divides the vowels into /u, i, ɑ/ vs. /o, e, a/; then [peripheral] makes a
division between /u, ɑ/ and /i/ as well as between /o/ and /e, a/.
(28) [peripheral]: i e a vs. u o ɑ
                 i     u    [peripheral]
                 e     o    [peripheral]
                 a     ɑ    [peripheral]
(29) [tense]: i vs. e a; u ɑ vs. o
     [tense]     i     u    [peripheral], [tense]
                 e     o    [peripheral]
                 a     ɑ    [peripheral], [tense]
We still need to distinguish ɑ and u from each other as well as a and e from each other. I
divide the vowels into two height classes, low and non-low. This is supported by the
harmony patterns: low vowels do not interact in harmony processes with non-low vowels
(see 3.1.1 and 3.1.2).
(30) [low]: e vs. a; u vs. ɑ
     [tense]     i     u    [peripheral], [tense]
                 e     o    [peripheral]
     [low]       a     ɑ    [peripheral], [tense], [low]
No further features are required to distinguish the vowels. The contrastive hierarchy of
Persian vowels, based on this order of cuts, is as follows (“,” shows that the ordering of
[peripheral] and [tense] is not crucial; “ >” shows that [peripheral] and [tense] are
introduced in the system prior to [low]):
(31) [peripheral] , [tense] > [low]
(32) summarizes the feature values for the Persian vowel system. The √ shows where a
feature is present/specified.
(32)                  a     ɑ     e     i     o     u
     [peripheral]           √                 √     √
     [tense]                √           √           √
     [low]            √     √
(32) results from the hierarchy in (31). If [peripheral] were ordered after [tense] and
[low], as in (33), the pre-nasal raising would be difficult to account for. Under (33), for ɑ
to become u before nasal consonants, ɑ would have to lose [low] and gain [peripheral],
while under (32), the raising of ɑ to u simply involves the loss of [low]. Changes in
height (raising and lowering) due to the nasal context, as we will see in 3.8, are expected
but it is unclear why [peripheral] should be added, that is, why ɑ does not become i.
(33) [peripheral] [peripheral]
[tense] i u [tense] e o
__________________ __________________
[low] [tense] ɑ [low] a
Given the system I proposed in (31), for ɑ to become u before nasal consonants, ɑ would
only lose [low], which is not unexpected in a nasal context, as will be discussed in 3.8.
It is also worthwhile to examine why the order [peripheral] > [low] > [tense] is
inadequate. Such an order would yield the following:
(34) [tense]     i     u    [peripheral], [tense]
                 e     o    [peripheral]
     [low]       a     ɑ    [peripheral], [low]
This order of features accounts for tense harmony in non-low vowels (e→i and o→u). It
also accounts for peripheral harmony (e→o and a→ɑ). But the ɑ, i, u vs. a, e, o
categorization is lost. Thus it is evidence for this categorization that leads to the order in
(31).
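The effect of feature ordering discussed in this section can be checked mechanically. The sketch below assigns a feature to a vowel only when that feature divides the set of vowels the vowel currently belongs to, in the spirit of the contrastive hierarchy (Dresher 2009). The function `specify`, the `MARKED` table, and the set-based representation are assumptions made for illustration and are not part of the analysis itself.

```python
VOWELS = ["i", "u", "e", "o", "a", "ɑ"]

# Vowels that bear the marked value of each feature.
MARKED = {
    "peripheral": {"u", "o", "ɑ"},
    "tense": {"i", "u", "ɑ"},
    "low": {"a", "ɑ"},
}


def specify(order):
    """Assign contrastive features by successive cuts in the given order."""
    specs = {v: set() for v in VOWELS}
    groups = [set(VOWELS)]
    for feature in order:
        next_groups = []
        for group in groups:
            marked = group & MARKED[feature]
            unmarked = group - marked
            if marked and unmarked:
                # The feature is contrastive in this group: specify and split.
                for vowel in marked:
                    specs[vowel].add(feature)
                next_groups += [marked, unmarked]
            else:
                next_groups.append(group)
        groups = next_groups
    return {v: frozenset(features) for v, features in specs.items()}


h31 = specify(["peripheral", "tense", "low"])   # the order in (31)
h33 = specify(["tense", "low", "peripheral"])   # [peripheral] last, as in (33)
h34 = specify(["peripheral", "low", "tense"])   # the order behind (34)

# Under (31), removing [low] from ɑ yields the specification of u.
print(h31["ɑ"] - {"low"} == h31["u"])
# Under (33), removing [low] from ɑ yields the specification of i instead.
print(h33["ɑ"] - {"low"} == h33["i"])
# Under (34), ɑ is not specified for [tense], so ɑ, i, u no longer form a class.
print(sorted(v for v in VOWELS if "tense" in h34[v]))
```

Running the same function with [tense] before [peripheral] gives the same specifications as (31), which matches the statement that the relative order of these two features is not crucial.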
In this section, I discussed contrasts in the Persian vowel inventory and showed that
[peripheral] , [tense] > [low] accounts for vowel patterning (i.e., the categorization of
vowels, harmony patterns, and pre-nasal raising).
In the next section, markedness in the system will be discussed.
3.6. Markedness and vowel features in Persian
In this section, I discuss markedness in the vowel system of Persian. In order to identify
which features are marked (i.e., present) and which ones are unmarked (i.e., absent) in the
underlying representation of a vowel system, it is necessary to consider what diagnostics
can be used to determine this. Various diagnostics are proposed in the literature. In
particular, unmarked elements are typically considered to result from neutralization, are
likely to be epenthetic, are targets of assimilation, and are lost in coalescence and deletion
(see Rice 1999, Rice 2007 for a summary). I discuss these four diagnostics one by one.
3.6.1. Assimilation
Assimilation is a process which involves the submergence of the unmarked. The target of
assimilation is unmarked and the trigger is marked. It was already seen that in the vowel
harmony processes in Persian the lax vowels are the targets of assimilation while the
tense vowels are triggers (see (11)). If targets of assimilation are unmarked, this suggests
that lax vowels are unspecified for tenseness in Persian, with [lax] absent underlyingly
(i.e., [lax] is unmarked).
3.6.2. Epenthesis
Epenthesis is a process which involves the emergence of the unmarked. Recall that the lax
vowel e is the epenthetic vowel in loanwords (e.g., eski ‘ski’, kelɑs ‘class’).
In suffixation, too, [e] is the epenthetic vowel, in both native words and loanwords. Consonant clusters
may be avoided in syllable margins when a suffix is added to a root in Persian. One strategy
to break up consonant clusters involves inserting the vowel [e], as the following examples
show. This process will be discussed in detail in chapter 4.
(35) ʃɑd ‘happy’ + mɑn → ʃɑdmɑn ~ ʃɑdemɑn ‘happy’
kɑr ‘work’ + ɡar → kɑrɡar ~ kɑreɡar ‘worker’
bɑɢ ‘garden’ + bɑn → bɑɢbɑn ~ bɑɢebɑn ‘gardener’
Epenthesis, as noted above, is thought to be a process which involves the emergence of
the unmarked. Epenthetic segments are more likely to have unmarked feature(s) since
they are not present in lexical representation (see, for instance, Rice 2007). The status of the vowel [e]
as the epenthetic segment in Persian supports the unmarked status of [lax] in this language.
Based on its patterning in epenthesis and assimilation, the vowel e can be considered to
be structurally the least complex vowel in the Persian vowel system, being unmarked for
tenseness, place, and height (see (32)). The patterning of e as the least complex vowel is
also observed in other processes in Persian, summarized in what follows.
3.6.3. Deletion
In addition to being the target of assimilation and the epenthetic vowel, unmarked
segments are also argued to be more easily lost in deletion processes than marked
segments (see Rice 1999, Rice 2007 for references). In Persian, the vowel e deletes more
easily than the other vowels. For example, /e/-deletion is observed in the indicative mood
of verbs whose infinitives’ first vowel is [e]. Deletion does not occur with other vowels,
as the following examples show.31 The vowel in parentheses can be deleted. The prefix
mi- in (36) is the indicative marker in Persian, and –am shows agreement (1st.sg). I show
the words as they are syllabified.32
(36) i mi.xi.su.nam ‘I soak’ u mi.pu.ʃu.nam ‘I put something on somebody’
e mi.ʃ(e).ka.nam ‘I break’ o mi.so.rɑ.jam ‘I write poems’
a mi.xa.rɑ.ʃam ‘I scratch’ ɑ mi.xɑ.bu.nam ‘I get somebody to sleep’
In verbs with /e/ in the relevant position, the form without [e] is much more common than
the form with [e] in speech. For other vowels, however, deletion of the vowel is
not possible. The deletion of e but not of the other vowels provides support for the claim that /e/ has a
different status from the other vowels in Persian, being the least marked vowel.
31 The number of syllables is important in conditioning deletion. In verbs with three syllables in indicative form, no deletion occurs. For example: mi.re.sam meaning ‘I reach’ does not become mir.sam.
32 The infinitives of these verbs are as follows: xisɑndan ‘to soak’ (xisundan in speech due to raising before nasals), ʃekastan ‘to break’, xarɑʃidan ‘to scratch’, puʃɑndan ‘to put something on somebody’ (puʃundan in speech), sorɑjidan ‘to write poems’, xɑbɑndan ‘to get somebody to sleep’ (xɑbundan in speech).
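The conditions on /e/-deletion described above and in footnote 31 can be sketched as follows. This is only an illustration under stated assumptions: the function name `optional_e_deletion` is hypothetical, the threshold of four or more syllables generalizes from the four-syllable forms in (36) and the three-syllable case in footnote 31, and the resyllabified output (the stranded onset joining mi-) is assumed by analogy with the unattested *mir.sam.

```python
def optional_e_deletion(syllabified):
    """Return the e-less variant of a syllabified indicative form, or None.

    Input is a dot-separated form such as "mi.ʃe.ka.nam".
    """
    syllables = syllabified.split(".")
    # Footnote 31: three-syllable indicative forms do not undergo deletion.
    if len(syllables) < 4 or syllables[0] != "mi":
        return None
    first_stem_syllable = syllables[1]
    # Only e deletes; the other five vowels in (36) are retained.
    if not first_stem_syllable.endswith("e"):
        return None
    stranded_onset = first_stem_syllable[:-1]
    # Assumption: the stranded consonant is syllabified with the prefix mi-.
    return ".".join([syllables[0] + stranded_onset] + syllables[2:])


print(optional_e_deletion("mi.ʃe.ka.nam"))  # miʃ.ka.nam
print(optional_e_deletion("mi.xi.su.nam"))  # None
print(optional_e_deletion("mi.re.sam"))     # None
```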
3.6.4. Neutralization
Neutralization is another process which provides support for e being the least marked
vowel of the system. A general tendency of Persian in the last millennium is to change
the vowel [a] to [e] (e.g., Natel Khanlari 1987). In final position, this has happened in
almost all words. There are only two words in Modern Persian which end in a: na ‘no’
and va ‘and’. Some examples of this change in final position are given in (37). These
words underwent two changes historically: first, their final ɡ was dropped; later, their
final a became e (Natel Khanlari 1987 among others; the words are taken from
Farahvashi 1967).
(37) Middle Persian Modern Persian
a. pambaɡ pambe ‘cotton’
b. mēwaɡ mive ‘fruit’
c. hamaɡ hame ‘all’
d. waʧʧaɡ baʧʧe ‘child’
These examples all show neutralization to [e]. Since the target of neutralization is
unmarked (see Avery and Rice 1989, Rice 1999, 2004, 2007), [e] is expected to have the
unmarked features.
The historical change of *a (synchronic [a]) to [e] in final position continues
synchronically. The absence of [a] in final position can explain why synchronically there
are words whose final syllable changes from CaC (the formal form) to Ce (the colloquial
form). Consider the following examples:
(38) a. diɡar ~ diɡe ‘else’
b. maɡar ~ maɡe ‘unless’
Final /r/-deletion commonly occurs in Persian. For instance, /r/ often deletes in words in
which /r/ in final position is preceded by a consonant:
(39) a. ʧeɢadr ~ ʧeɢad ‘how much’
b. fekr kon ~ fek kon ‘think!’
c. sabr dɑʃte bɑʃ ~ sab dɑʃte bɑʃ ‘have patience!’
The following examples show this process with words ending in a vowel followed by /r/.
(40) a. ʧetor ~ ʧeto ‘how’
b. ʧekɑr konam ~ ʧikɑ konam ‘what should I do?’
In these examples, /r/ is deleted but since after /r/-deletion the word ends in a vowel
which is allowed in final position ([o] and [ɑ]), no change affects the remaining vowel. In
(38), however, after the deletion of final /r/ what remains is [a]. This vowel cannot occur
in final position, and thus /a/ raises to [e].
Final [d] also tends to delete in Persian in contexts such as the following, where the final d is
followed by a consonant-initial word. For example:
(41) a. ɢand bexar ~ ɢan bexar ‘buy sugar cubes!’
b. band kard ~ ban kard ‘s/he insisted’
Another case in which final CaC changes to Ce is in the third person singular indicative,
in which the final /d/ is deleted and the remaining /a/ becomes [e]; as in the following
examples:
(42) a. mi-xor-ad ~ mixore ‘s/he eats’
b. mi-zan-ad ~ mizane ‘s/he hits’
c. mi-xɑb-ad ~ mixɑbe ‘s/he sleeps’
Compare the above cases with their plural counterparts, in which [d] is deleted but due to
the presence of [n], [a] does not change.
(43) a. mi-xor-and ~ mixoran ‘they eat’
b. mi-zan-and ~ mizanan ‘they hit’
c. mi-xɑb-and ~ mixɑban ‘they sleep’
The historical change from *a to [e] is a shift from marked ([low] is specified) to
unmarked (no specified height). This outcome of the shift is expected since the historical
process of *a becoming [e] is neutralization, resulting in the unmarked. The historical
change accounts for the synchronic change of /a/ to [e] after deletion of [d] and [r]. This synchronic change
is not a case of height harmony, since there is no neighboring vowel to trigger harmony; rather, it
is due to the absence of [a] in final position in the vowel inventory of Persian.
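The interaction between final-consonant deletion and the ban on final [a] in (38)-(43) can be sketched in Python. This is a simplified illustration, not a full account: the function name `colloquial_final` is hypothetical, deletion is treated as if it always applied (in the data it is optional and context-dependent), and forms are given without morpheme boundaries.

```python
def colloquial_final(word):
    """Delete one final r or d; if that exposes a final a, realize it as e."""
    if word.endswith(("r", "d")):
        word = word[:-1]
        # Final [a] is not allowed, so an exposed /a/ surfaces as [e].
        if word.endswith("a"):
            word = word[:-1] + "e"
    return word


for formal in ["diɡar", "ʧetor", "ʧeɢadr", "mixorad", "mixorand"]:
    print(formal, "~", colloquial_final(formal))
```

With these inputs the sketch gives diɡe, ʧeto, ʧeɢad, mixore, and mixoran, matching (38a), (40a), (39a), (42a), and (43a). The vowel changes only where deletion leaves [a] word-final; a remaining consonant, as in the plural forms, or a permitted final vowel blocks it.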
A comparison also should be made between e and o, the two non-low vowels unspecified
for tenseness. The vowel o does not pattern similarly to the vowel e in many of the
processes discussed above regarding markedness of features. This could be explained by
markedness of place: o is specified for place while e is not (e is unspecified for both place
and tenseness).
The common diagnostics of markedness in the literature (i.e., assimilation, epenthesis,
deletion, and neutralization) show that the least complex vowel in the system is the vowel
e, which is not specified for place, tenseness or height. It was shown that [tense] must be
the underlyingly present feature in the system. A comparison of the three non-tense
vowels (i.e., e, o, a) strongly supports the claim that [peripheral] and [low] are marked in
the system.
In the next two sections, I examine two processes which involve changes in vowels in the
environment of particular consonants. First, in 3.7, I discuss harmony in low vowels
across laryngeal consonants. This process was discussed in 3.1.2. from a harmony
viewpoint, and, in 3.7, I address the characteristics of laryngeals and their interaction
with low vowels. Afterwards, in 3.8, pre-nasal raising in Persian will be discussed.
3.7. Harmony in low vowels across laryngeals
Recall the harmony pattern between the two low vowels: across a laryngeal consonant a assimilates to ɑ.
(44) a → ɑ / C−ʔɑ
a → ɑ / C−hɑ
a. bahɑ → bɑhɑ ‘price’
b. ʃahɑb → ʃɑhɑb ‘meteor’
c. maʔɑʃ → mɑʔɑʃ ‘livelihood’
d. saʔɑdat → sɑʔɑdat ‘happiness’
It was shown in 2.3.5 that the process involves only laryngeal consonants (see the Persian
consonant inventory in 2.2.2.2) and does not occur across other consonants (e.g., tabɑr →
*tɑbɑr ‘lineage’). Also, the process involves only low vowels. That is, h and ʔ are not
transparent if they are not flanked from both sides by low vowels (e.g., mahin → *mihin
‘better’).
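The restrictions just described can be stated compactly as a rewrite rule. The following Python sketch, offered only as an illustration, encodes (44) as a regular expression with a lookahead; the function name `low_harmony` is hypothetical.

```python
import re

# a becomes ɑ only when immediately followed by a laryngeal (h or ʔ) plus ɑ.
LOW_HARMONY = re.compile(r"a(?=[hʔ]ɑ)")


def low_harmony(word):
    """Apply the translaryngeal low-vowel harmony in (44)."""
    return LOW_HARMONY.sub("ɑ", word)


print(low_harmony("bahɑ"))     # bɑhɑ
print(low_harmony("saʔɑdat"))  # sɑʔɑdat
print(low_harmony("tabɑr"))    # tabɑr (b is not a laryngeal)
print(low_harmony("mahin"))    # mahin (i is not a low vowel)
```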
The questions raised here are: (i) is this phenomenon related to the feature characteristics
of h and ʔ, and to those of low vowels? That is, why is this pattern of harmony (i.e.,
assimilation of a to ɑ) not observed across other consonants?; and (ii) why does the
harmony across h and ʔ not occur with non-low vowels? The consonants h and ʔ form the
class of laryngeal or glottal in the Persian consonant inventory. There are a number of
arguments in the literature for the interaction between gutturals (including laryngeals h
and ʔ) and low vowels. How to interpret this process depends on the features one assumes
for the consonants and vowels involved in the process. There are three kinds of
arguments found in this regard in the literature (e.g., Steriade 1987, McCarthy 1994,
Pickett 1999, Hume 1992, Rose 1996, Flemming et al 2008): (i) laryngeal consonants are
low; (ii) laryngeal consonants are placeless; (iii) low vowels are laryngeal (specified by
[pharyngeal]).
I simply speculate here since a careful study of Persian laryngeals is required in order to
answer the questions raised above about the harmony process. Considering the feature
characteristics of low vowels in Persian, which are [low] and [tense] (see section 3.5), if
laryngeal consonants in Persian have the feature [low], then the observation that the
process occurs only with low vowels (and not with non-low vowels) and only across
laryngeals (and not across other consonants) might be explained. That is, from the three
possibilities which are found in the literature, the possibility given in (i), that laryngeal
consonants are low, seems to be the direction of research to follow first.
The second possibility, given in (ii), laryngeals are unspecified for a place feature (e.g.,
Steriade 1987), does not seem to offer an explanation for why the process is observed
across laryngeals and with low vowels because the assimilation of a to ɑ across
laryngeals is not a place assimilation –recall that it is a case of tense harmony with [tense]
spreading from ɑ to a. If assimilation of a to ɑ was a case of place assimilation, one could
say that since laryngeals are placeless the spreading of a place feature is possible across
them. But even under that scenario the question would be: why does place assimilation
not occur across laryngeals with other vowels if laryngeals are placeless? Here again the
reference to [low] for laryngeals seems unavoidable.
The third possibility, given in (iii), treats low vowels as laryngeal (specified by
[pharyngeal]). There is evidence that low vowels have some pharyngeal constriction
acoustically similar to the gutturals: high F1 is shared by a and the gutturals (e.g.,
McCarthy 1994), and in pharyngeal environments, F1 normally raises (e.g., Pickett 1999,
Flemming et al 2008). These along with some phonological processes mentioned below
show why [pharyngeal] is argued to be present in low vowels as well. For instance Hume
(1992) considers the low front vowel æ as both [coronal] and [pharyngeal]. Herzallah
(1990) (cited in Hume 1992) takes the low back vowel ɑ of Palestinian Arabic to be both
[dorsal] and [pharyngeal]; and a as simply [pharyngeal]. Rose (1996) assumes that low
vowels are represented as pharyngeal. In SPE, low vowels, laryngeals and pharyngeals
are characterized by [+low].
Let us return to Persian. Considering low vowels to be laryngeals in Persian requires
adding [pharyngeal] to feature specifications of low vowels. Phonologically, this is not
necessary given that [low] and [tense] are sufficient for low vowels to be distinguished
from each other and from other vowels. On the other hand, it might be suggested that
[pharyngeal] could be used instead of [low], since the choice of [low] is somewhat
arbitrary. If low vowels and laryngeal consonants are both [pharyngeal], then the
harmony process, which occurs strictly with low vowels and laryngeal consonants and
not with other vowels and consonants, can easily be explained: the process occurs due to
the common place of articulation, [pharyngeal]. However, evidence from pre-nasal
raising (ɑ → u / – nasal C) suggests that [low] is the appropriate feature because if we
replace [low] by [pharyngeal] in the Persian vowel system (see (30)), then for ɑ
becoming u before nasal consonants, ɑ needs to lose its [pharyngeal]. Changes in vowel
height, raising and lowering, in nasal contexts are attested, as will be discussed in 3.8, but
not changes in place features such as [pharyngeal].
Thus the first possibility, that laryngeal consonants are low, seems most promising. I now
briefly present some processes which involve laryngeal consonants and show their
interaction with low vowels as found in the literature.
The lowering effect of gutturals on vowels in Semitic languages provides evidence for
the relation of this class of consonants and low vowels. The consonants with [pharyngeal]
(including laryngeals) trigger a rule in Arabic which is called “Feminine Vowel
Assimilation” (Hoberman, 1995). By this rule the feminine noun and adjective suffix –i is
lowered to a or ɑ when preceded immediately by a pharyngeal consonant (see also Rose
1996 for discussion on lowering). For example:
(45) a. zɑṛɑɑf-i ‘an ostrich’
b. ћilm-i ‘a dream’
c. zarriiʕ-a ‘plants’
d. fallaaћ-a ‘peasant woman’
e. ʃɑṭћ-ɑ ‘picnic’
Another example is from Hebrew (McCarthy 1994). Hebrew inserts schwa into
unsyllabifiable consonant clusters. The epenthetic schwa is lowered to [a] when it is
preceded by a guttural. I present some examples in (46):33
(46) a. malk → melek ‘king/my king’
b. qudʃ → qōdeʃ ‘holiness’
c. tuʔr → toʔar ‘form’
d. lahb → lahab ‘flame’
The phonetic similarity and phonological interaction of low (or non-high) vowels and
gutturals may explain why the harmony illustrated in (44) takes place in the environment
of low vowels and the laryngeal consonants in Persian. McCarthy (1994) argues that
[pharyngeal] can but need not pattern with the oral places of articulation, [labial],
[coronal] and [dorsal]; that is, there is a division between oral and [pharyngeal] place
features. In addition to presenting a phonetic-based explanation for the affinity between
pharyngeals and low vowels, he provides phonological evidence for this claim. McCarthy
says that there are vowel-to-vowel assimilation rules to which oral consonants are opaque
and pharyngeal consonants are transparent. Laryngeal transparency, in particular, is a
phenomenon which is observed cross-linguistically. Complete assimilation of vowels
across h and ʔ is observed in many languages (Steriade 1987). Whether transparency is
observed only across laryngeals or in general across gutturals varies from language to
language. Thus some languages show laryngeal transparency, or translaryngeal vowel
harmony. Some show guttural transparency, or transguttural vowel harmony. Let us look
at some cases of laryngeal transparency in some languages. Note that they do not
necessarily involve only low vowels.
33 For why the epenthetic schwa is represented as e in (46a) and (46b) see McCarthy (1994). The point relevant to our discussion is that vowel lowering occurs due to the guttural environment.
Japanese shows a typical case of laryngeal transparency (Kawahara, 2003). Echo
epenthesis inserts a copy vowel after allophones of [h] (47a); after other consonants,
however, the default vowel [u] is epenthesized (47b). For example:
(47) a. bahha ‘Bach’
mahha ‘Mach’
kohho ‘Koch’
b. bazu ‘buzz’
kurisumasu ‘Christmas’
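The choice between echo and default epenthesis in (47) can be sketched as follows. This is a minimal illustration under assumptions: the consonant-final input strings (e.g., "bahh", "baz") are hypothetical stand-ins for the adapted loanword before final epenthesis, and the function name `epenthesize` is not from the source.

```python
VOWELS = set("aiueo")


def epenthesize(stem):
    """Add a final vowel: a copy of the last vowel after h, default u otherwise."""
    if stem.endswith("h"):
        for segment in reversed(stem):
            if segment in VOWELS:
                return stem + segment  # echo epenthesis across h
    return stem + "u"  # default epenthetic vowel


print(epenthesize("bahh"))  # bahha
print(epenthesize("kohh"))  # kohho
print(epenthesize("baz"))   # bazu
```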
Mohawk (Iroquoian) is another example of a language which shows laryngeal
transparency (Postal 1969; cited in Kawahara 2003). In Mohawk, leftward echo
epenthesis is observed across [ʔ], as in the following example:
(48) ʌ+wa+atunisʔa+s+hek+ʔ → ɔwadunizaʔashegeʔ ‘It will be ripening repeatedly’
Across other consonants, however, the default [e] is inserted.
(49) wa+o+arʔsʌ+ʔ → yoreʔzʌʔ ‘She is fat’
Arbore, a Cushitic language of Ethiopia, also shows laryngeal transparency, as in (50)
(Hayward 1984, Steriade 1987; cited in Rose 1996). The vowels occurring before and
after laryngeal consonants become identical, while when a non-laryngeal consonant
intervenes, there is no harmony.
(50) a. ma beh-o → ma boho ‘he is not going out’
b. ma beʔ-i → ma biʔi ‘he did not go out’
These examples show that assimilation across laryngeals and an interaction between
laryngeals and low vowels both are attested across languages. The important point for our
discussion of harmony is that it is not, therefore, surprising for the harmony patterns in
low vowels in Persian to occur under the following conditions: (i) for a to assimilate to ɑ, the
transparent h or ʔ is needed; and (ii) for h and ʔ to be transparent, they must be flanked on
both sides by low vowels. As pointed out before, the phonetic characteristics of
laryngeals in Persian await further study.
In the next section, I study a very common process in Persian, the raising of ɑ to u before
nasal consonants, which can be explained by the presence of tenseness in the system.
3.8. Pre-nasal raising
Persian shows a process of vowel raising in informal speech in which ɑ is raised to u
before nasal consonants (m and n), as in (51). In formal speech the version without raising
is common.
(51) ɑ→ u/ — n
ɑ→ u/ — m
(52) provides some examples:
(52) a. bɑrɑn ~ bɑrun ‘rain’
b. ɑsemɑn ~ ɑsemun ‘sky’
c. ʤɑn ~ ʤun ‘soul’
d. ɑrɑm ~ ɑrum ‘calm’
e. tamɑm ~ tamum ‘finish’
f. davɑm ~ davum ‘persistence’
g. ɑmad ~ umad ‘came (3sg)’
h. xɑne ~ xune ‘house’
i. lɑne ~ lune ‘nest’
Evidence for the presence of underlying ɑ in these words comes from the existence of
words in the language with Cum and Cun structures. These words are never pronounced
as Cɑm and Cɑn even in very formal speech. Some examples are in (53):
(53) a. ɢɑnun ‘law’ *ɢɑnɑn
b. halazun ‘snail’ *halazɑn
c. pune ‘spearmint’ *pɑne
d. tumɑr ‘scroll’ *tɑmɑr
e. holɢum ‘throat’ *holɢɑm
This raising is a very common process in the language, and as the data show, it
occurs in both tautosyllabic (52a-52f) and heterosyllabic (52g-52i) sequences of ɑ and a nasal
consonant.
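The rule in (51) can be sketched as a single substitution. The Python below is an illustration only: the function name `raise_pre_nasal` is hypothetical, syllable structure is deliberately ignored (the rule applies in both tautosyllabic and heterosyllabic sequences), and lexical exceptions to the process are not modeled.

```python
import re

# ɑ raises to u when the next segment is a nasal consonant (n or m).
PRE_NASAL = re.compile(r"ɑ(?=[nm])")


def raise_pre_nasal(word):
    """Return the informal variant predicted by the raising rule in (51)."""
    return PRE_NASAL.sub("u", word)


for formal in ["bɑrɑn", "ɑsemɑn", "ɑrɑm", "ɑmad", "xɑne"]:
    print(formal, "~", raise_pre_nasal(formal))
```

Only an ɑ immediately before a nasal is affected, so the first ɑ of bɑrɑn and ɑrɑm is left unchanged.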
The effect of nasal context on vowel height is a cross-linguistically observed
phenomenon both synchronically and historically (e.g., Ohala 1975, Beddor 1982, 1993,
Wright 1986, Krakow et al. 1988, Beddor and Hawkins 1990, Maeda 1993). In what
follows I present some examples of vowel raising or lowering in different languages in
nasal contexts (although the case in Persian involves only raising, in order to present a
thorough picture of nasal effects I include below cases of lowering too).
In many Southern dialects of American English, [ɛ] is raised to [ɪ] before a nasal
consonant so pen and hem become homophonous with pin and him (Kenstowicz, 1994).
In some regions of Virginia, a relic of seventeenth century colonial English is the
“exactly alike” pronunciation of e and i before m and n; which makes empty and general,
for instance, become impty and gineral (Brown 1991). The same applies to /ɛ/, written a,
in the word many (Brown 1991). According to Labov (1994), the most favored environment for
raising of short /a/ in American dialects of English is provided by words that end in final
nasals: man, ham, etc. In Northern cities dialects (in particular Buffalo in Labov’s study),
a following nasal consonant has a strong effect in maximizing height, e.g. a in dance,
hand, etc. is pronounced higher than a in other environments.
Raising of vowels preceding a nasal consonant is observed in Primitive Germanic times
(Prokosch, 1939): e before a cluster with an initial nasal became i (regardless of the
vowel of the following syllable). For example (Go.=Gothic, OE=Old English, OHG=Old High
German, ON=Old Norse, OS=Old Saxon):
(54) IE bhendh- > ON bindɑ, Go. OE OS bindɑn, OHG bintɑn; Go. bindis, ON bindr,
OE bindest, OS OHG bindis
In Tswana (Bantu), /ɛ/ and /ɔ/ are raised to [e] and [o] respectively before syllabic nasal
consonants. Compare [tsʼɛtsʼɛ] ‘tsetse fly’ and [lɔtʃwʼa] ‘request’ with [tsʼentsha] ‘cause to
enter’ and [tɬhompha] ‘respect’ (Beddor 1982).
According to Beddor (1982), in Tewa (Kiowa-Tanoan), /e/ becomes [ɛ] before nasal
consonants; for example, [feh] ‘stick’ versus [fɛn] ‘black’. In Grand Couli (Tinrin and
Mea), /o/ becomes [ɔ] before nasal consonants. Compare oro: ‘having much soil, dirty’
with ɔmi : ‘having much grass’.34
In Slave (Athapaskan), some vowels develop to some extent differently when they
precede a tautosyllabic nasal (Rice, 1989). Compare (55) and (56) — (examples are taken
from pp. 98 and 99): (PA=Proto-Athabaskan, Bl=Bearlake, Hr=Hare).35
34 In some of these languages other vowels may show lowering, too (for these see Beddor 1982).
35 It suffices here to mention examples of two vowels. For the development of other vowels see Rice 1989.
(55) PA Bl Hr
*a∙ a a
*u∙ u u
(56) PA Bl Hr
*a∙n ǫ, ą ǫ, ą
*u∙n ų, ǫ ų, ǫ
In particular in (56), both a and u become o in the pre-nasal position (i.e. both raising and
lowering to a mid vowel occur). The rules which change a and u to o are also observed in
Slave as synchronic rules in cases of alternations (Rice, 1989).
There are different views concerning the effect of nasalization on vowel height. Beddor
(1982) reviews several of these views and I summarize her discussion: while some
theories see lowering as the effect of nasalization (Martinet 1955, Ohala 1974, Wright
1975), for others nasalization has a raising effect on vowels (O’Rahilly 1932, Bhat 1975,
Pandey 1978). According to some approaches, the effect of nasalization on vowel height
is context-dependent (Foley 1975, Ohala 1980) while context-dependency itself is
defined in different ways. For example, a version of this approach says that vowels raise
in the context of a nasal consonant but lower if the nasal consonant is lost. In another
version, phonemic nasalized vowels lower but allophonic ones raise. In addition to
lowering, raising, and context-dependent accounts, there is also an account which sees
centralization as the effect of nasalization; that is, high and mid vowels lower and low
vowels raise (Ohala 1975, Ruhlen 1978, Wright 1980). Each of these views is supported
by examples from some languages. Not all of these approaches are in contrast with each
other. For example, the centralization view is in fact a combination of the restricted
versions of raising and lowering (i.e. high vowels need to lower and low ones need to
raise for centralization to happen).
Based on a study of 75 languages (Beddor 1982), when there is a change in height due to
nasalization, the following patterns are observed with rare exception (ignoring non-
contextual nasal vowels):
i. high nasal vowels lower
ii. low nasal vowels raise
iii. mid back nasal vowels raise
iv. mid front nasal vowels lower, except if nasalization affects both front and
back vowel height, in which case both front and back nasal vowels raise
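The four patterns above amount to a small decision procedure, which can be sketched in Python. The function name `nasal_height_shift` and the parameter `affects_front_and_back` are hypothetical labels introduced for this illustration; the logic only restates patterns (i)-(iv) and does not cover the rare exceptions.

```python
def nasal_height_shift(height, backness, affects_front_and_back=False):
    """Return "raise" or "lower" for a contextually nasalized vowel.

    height: "high", "mid", or "low"; backness: "front" or "back".
    affects_front_and_back: True if nasalization affects the height of both
    front and back vowels in the language (the exception clause in iv).
    """
    if height == "high":
        return "lower"  # pattern (i)
    if height == "low":
        return "raise"  # pattern (ii)
    if height == "mid" and backness == "back":
        return "raise"  # pattern (iii)
    if height == "mid" and backness == "front":
        return "raise" if affects_front_and_back else "lower"  # pattern (iv)
    raise ValueError("unknown height or backness")


print(nasal_height_shift("low", "back"))   # raise
print(nasal_height_shift("mid", "front"))  # lower
```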
Given these patterns of pre-nasal raising of low vowels, the existence of raising of ɑ to u
in Persian is not too surprising. However, given the surface inventory of Persian as
presented in the literature and the commonly agreed view that height is phonologically
primary in the system, two observations are surprising and hard to explain: (i) ɑ skips o
to raise to u; and (ii) o itself does not participate in the raising process. Thus the questions
are: Why does ɑ raise to u and not to o? Why does o not raise to u? I repeat here the
Persian vowel system as given in the height-based accounts.
(57) Persian vowel inventory given in height-based accounts
i u
e o
a ɑ
I present here some examples of the vowel o followed by the nasal consonants. No
raising occurs, as (58) shows.
(58) a. kond *kund ‘slow’
b. bon *bun ‘root’
c. dom *dum ‘tail’
d. nomre *numre ‘number’
e. xonsɑ *xunsɑ ‘neutral’
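The difficulty for a height-based inventory such as (57) can be made concrete with a short Python sketch. It rests on an assumption that is not made explicit in the accounts discussed: that raising in a height-based system moves a vowel one step up the height scale. The names `raise_one_step` and `mismatches` are hypothetical.

```python
# Back vowels of (57), ordered from low to high.
BACK_VOWELS_BY_HEIGHT = ["ɑ", "o", "u"]

# Observed pre-nasal outcomes: ɑ surfaces as u (52); o is unchanged (58).
OBSERVED_BEFORE_NASAL = {"ɑ": "u", "o": "o"}


def raise_one_step(vowel):
    """Assumed height-based raising: move one step up the height scale."""
    index = BACK_VOWELS_BY_HEIGHT.index(vowel)
    top = len(BACK_VOWELS_BY_HEIGHT) - 1
    return BACK_VOWELS_BY_HEIGHT[min(index + 1, top)]


def mismatches():
    """Map each vowel to (predicted, observed) where the two differ."""
    return {
        vowel: (raise_one_step(vowel), observed)
        for vowel, observed in OBSERVED_BEFORE_NASAL.items()
        if raise_one_step(vowel) != observed
    }


print(mismatches())
```

Under this assumption both back vowels are mispredicted: ɑ is expected to surface as o, and o as u, which corresponds to the two questions raised above.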
Historically, according to Pisowicz (1985), a tendency to raise ɑ (transcribed as ā in the
historical literature) preceding n and m in Persian has been observed since the beginning of the
15th century and became clearer in the 17th century. The raising at that time was to both o
and u based on transcriptions of Persian by non-native speakers. For example, in “A
wonder beyond three seas” by Athanasius Nikitian, a 15th century Russian merchant, this
tendency is seen (Pisowicz 1985). Nikitian spelled ā before n in the word kona with the
letter o. This word is the classical kān ‘mine’. Pisowicz says that “[this] points to a
tendency which would be clear in the 17th century and is continued in the modern
colloquial pronunciation like un = classical ān” (p. 79); un (the formal form: ɑn) means
‘that’.
I leave aside the historical background of pre-nasal raising (although it would be
interesting to investigate what the active features of the system were when the raising
first began), and focus on its synchronic status. I said that in a height-based account
where o is on the way of ɑ raising to u, one needs to explain why o is skipped and why o
itself does not raise to u. I have argued that tense/lax, and not height, is the dimension of
contrast in the Persian vowel system. With the tense-based account, the inventory is
underlyingly presented as in (59), presented earlier in (12) – [tense] is present
underlyingly while [lax] is absent:
(59)   [tense]
       i   u        e   o
         ɑ            a
Considering this system, the raising of ɑ to u can be explained in the following way: the
vowel o is not in the path of ɑ going to u. Both ɑ and u are tense and [tense] is the
marked feature, that is, the feature which is present underlyingly. Thus when ɑ undergoes
raising due to the presence of the following nasal consonant it raises to a tense vowel,
losing its height feature, [low].
It should be noted that this process, although very common, has some exceptions. Some
of these could be idiosyncratic since no particular reason or pattern is observed for them.
For example:
(60) a. onvɑn → *onvun ‘title’
b. xɑme → *xume ‘cream’
Some names of cities may show raising although the version without raising is more
common; others do not show raising.
(61) a. tehrɑn ~ tehrun ‘the capital of Iran’
but
b. lɑhiʤɑn → *lɑhiʤun ‘a city in north of Iran’
(cf. bɑdemʤɑn ~ bɑdemʤun ‘eggplant’)
With the suffix -estɑn, a location suffix, if the word is a common noun, it shows raising
in some cases and does not show it in others, as seen in (62). When -estɑn is used in the
name of countries, it does not show raising, as in (63).
The feature [low] is lost before a nasal, as expected in the account provided.
The fact that o does not raise to u before nasal consonants in Persian (see (58)) also
supports the idea that the process is not phonetic: phonetically, o lies on the path from ɑ to
u, so if the raising were a phonetic process, one might expect to observe raising of o to u as
well. Phonologically, however, o is not on the path from ɑ to u, which supports the process
being phonological; on a phonetic account, the behavior of o would be a problem.
It might be asked whether this process is phonologically active or whether the words
which show the raising pattern are lexicalized. The fact that the same word is used
without raising and also with raising depending on its usage suggests that it is an active
process. For instance, consider the word bɑrɑn ‘rain’ (also a girl’s name). As a name it is
never pronounced as Bɑrun; as a noun, it is commonly bɑrun. The word nɑn ‘bread’ is
commonly pronounced as nun but in the title of the book nɑn o ʃarɑb ‘Bread and wine’ raising is not found. The word zabɑn ‘language’ is pronounced as zabun but not in
zabɑnʃenɑsi ‘linguistics’. Thus the raising is an active phonological process.
In this section I examined the pre-nasal raising process in Persian, which is not a case of
harmony. I suggested that with tense/lax opposition being underlyingly active in the
system, this process, which is difficult to account for in a height-based view, can be
explained.
3.9. Summary
I have argued for a featural analysis of the Persian vowel system based on a tense/lax
distinction. I further proposed that [tense] is the underlyingly present feature. This offers
an account of the harmony processes, which indicate the presence of a feature in the
system, and at the same time allows for the maintenance of the categorization of the
vowels ɑ, i, u versus a, e, o (see section 3.2 for more discussion of these two classes). In
addition, contrast and markedness in the Persian vowel system were also discussed (in
3.5 and 3.6). I also discussed laryngeals and low vowels and their interaction in low
harmony (in 3.7) as well as a pre-nasal raising process (in 3.8). In the next section,
Persian diphthongs will be examined.
3.10. Diphthongs
I discuss diphthongs in this section in order to present a complete picture of Persian
vowels, especially because there is controversy in the literature over the phonemic status
of some of these diphthongs.
As seen so far, it is commonly agreed that the vowel system of Persian has six
monophthongs. Persian also has six diphthongs on the surface: ɑj, uj, oj, aj, ej and ow
(e.g., Samareh 1985). (68) presents some examples:
(68) a. nɑj ‘trachea’
b. ruj ‘zinc’
c. xoj ‘name of a city, also perspiration (literary)’
d. ɢajjem ‘guardian’
e. zejtun ‘olive’
f. ɢowl ‘promise’
There is general agreement that four of these six diphthongs, ɑj, uj, oj, aj, are not
phonemic but are in fact a combination of a vowel followed by a glide. There is
controversy over the phonemic status of two of them, ej and ow, as discussed below. In
the next two subsections, I discuss non-phonemic and phonemic diphthongs. I use the
following definitions: phonemic diphthongs are tautosyllabic and occupy the syllable
nucleus, while non-phonemic diphthongs are
those whose glide element functions as onset or coda and is not a part of the nucleus (e.g.,
diphthongs because resyllabification does not affect material in a nucleus, but post-
nuclear glides are affected by resyllabification because segments in a coda can also serve
as the onset of the next syllable (e.g., Booij 1989, Booij and Rubach 1990). I will argue
that there is no phonemic diphthong in Persian.
3.10.1. ɑj, uj, oj, and aj
As pointed out above, ɑj, uj, oj, aj are commonly agreed to be non-phonemic in the
Persian literature. There are two arguments presented in the literature for the non-
phonemic status of these diphthongs (e.g., Samareh 1985): (i) the j can be deleted in some
cases; (ii) the vowel and the following j are separable in syllabification in polysyllabic
words. First I summarize these arguments in the following discussion.
The following examples illustrate the four non-phonemic diphthongs (taken from
Samareh 1985, pp. 96-97).
(69) a. ɑj pɑj ‘foot’
b. uj muj ‘hair’
c. oj xoj ‘name of a city, also perspiration (literary)’
d. aj ɢajjem ‘guardian’
With respect to the phonemic status of these diphthongs, Samareh (1985) claims that
these sequences are not phonemes (neither are ej and ow according to him, as seen
below). He notes that in the case of pɑj ‘foot’ and muj ‘hair’, when one deletes the j no
difference in meaning occurs and in fact the version without j is the more common form.
This, according to him, establishes that the j is not a part of the vowel and is a consonant
which follows the vowel. I add that although Samareh is right about the words pɑj ‘foot’
and muj ‘hair’ being common without the j in today’s language, there are some words
containing ɑj and uj whose j cannot be deleted even in very informal speech, such as:
(70) a. ruj ‘zinc’
b. nɑj ‘trachea’
Compare these with the following words:
(71) a. ru ‘face’ (also ruj ‘face’ literary and formal form)
b. nɑ ‘energy, mustiness’
As (70) shows, it is not the case that the j can always delete after ɑ and u. A careful study
of the lexicon might reveal why deletion is sometimes not allowed; it is possible that
deletion is allowed only where the resulting form does not contrast with another word, as ruj ‘zinc’ would with ru ‘face’.
This remains for future work.
In fact, deletion of the glide may not be a good argument for uj and ɑj being a sequence
of a vowel followed by a consonant for those who argue for the non-phonemic status of ej
as well (like Samareh) because as seen below, j cannot be deleted after e in ej.
(72) a. mej ‘wine’ *me
b. pej ‘foundation’ *pe
That is, if deletion of the j in these diphthongs is an indication of the non-phonemic status
of these diphthongs, then are those cases which do not show deletion phonemic
diphthongs?
Discussing the diphthongs which are considered non-phonemic (i.e., ɑj, uj, oj, aj), Samareh notes that in the case of oj, one cannot delete the j but the fact that in suffixation
by a vowel-initial suffix the vowel and the consonant separate from each other and are
syllabified in different syllables shows that the j is not a part of the vowel.
Heterosyllabicity is also observed in the case of uj and ɑj. Samareh gives the following
examples for these (p. 99) and compares them with consonants like r and z. The words
are followed by Ezafe (see footnote 13 for Ezafe).
(73) a. xoj + -e → xo.je36 ‘Xoj (name of a city)-Ezafe, perspiration-Ezafe’
b. pɑj + -e → pɑ.je ‘foot-Ezafe’
c. muj + -e → mu.je ‘hair-Ezafe’
d. sar + -e → sa.re ‘head-Ezafe’
e. miz + -e → mi.ze ‘table-Ezafe’
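The resyllabification pattern in (73) can be illustrated with a minimal sketch in Python. The function name attach_vowel_suffix and the one-character-per-segment transcription are my own illustrative assumptions, not part of Samareh's analysis; the sketch only covers stems that end in a vowel followed by a single consonant.

```python
VOWELS = set("aeoɑiu")  # the six Persian monophthongs

def attach_vowel_suffix(stem: str, suffix: str) -> str:
    """Attach a vowel-initial suffix to a stem.

    The stem-final consonant (including a post-vocalic glide j) is
    syllabified as the onset of the syllable headed by the suffix vowel.
    Simplifying assumptions (hypothetical, for illustration only):
    one character per segment, stem ends in V + one consonant.
    """
    if not suffix or suffix[0] not in VOWELS:
        raise ValueError("suffix must begin with a vowel")
    if len(stem) < 2 or stem[-1] in VOWELS or stem[-2] not in VOWELS:
        raise ValueError("stem must end in a vowel followed by one consonant")
    # The syllable boundary (.) falls before the stem-final consonant.
    return stem[:-1] + "." + stem[-1] + suffix

for stem in ["xoj", "pɑj", "muj", "sar", "miz"]:
    print(stem, "+ -e →", attach_vowel_suffix(stem, "e"))
```

In this sketch the glide j is handled by exactly the same step as r and z, which is the point of Samareh's comparison.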
In the case of aj, Samareh points out that given the environment in which this diphthong
can occur, as shown in Arabic-origin words such as ɢajjem ‘guardian’, moʔajjan ‘fixed’,
sajjɑd ‘hunter’, the j after the vowel cannot be deleted nor can it be syllabified in the next
syllable. It is in fact a geminated j after the vowel a. Therefore, Samareh concludes that
in the case of aj, the j is certainly a consonant.
I should add that there are cases of occurrence of aj in medial position in Persian without
geminated j, as in (74). These are considered to fall in two different syllables; the first
syllable is Ca and the second starts with j as a consonant followed by a vowel. That is, aj here is not considered to be a diphthong.
(74) a. pa.jɑm ‘message’
b. a.jɑr ‘standard, the degree of purity of a precious metal’
c. ba.jɑn ‘expression’
d. ba.jɑt ‘stale’
e. ha.jɑ ‘shame’
f. ha.jɑt ‘life’
36 I think xo.ji ‘pertaining to xoj (the city), also a last name in Iran’ is a better example. xoj in its ‘perspiration’ meaning is very literary and formal and so is xo.je.
It should be noted that the former final aj changed to ej over time (e.g., Najafi 2001) as in
maj ‘wine’ which is mej today. This is similar to final a which historically became e (e.g.,
xāna ‘home’ which is xɑne today –see (37)). Some of the final aj-ending words in the
language are from Arabic; for example, hajj ‘alive’ which has a geminated j and is not
pronounced with ej. Sometimes both of the ej and aj versions of Arabic-origin words are
used for a word with the aj version being more formal than the ej version, which is more
commonly used (e.g., tajj and tejj ‘during’). Regardless of the choice of vowel (a or e),
final gemination (e.g., j or jj in final position), unlike gemination in medial position, is
hard to perceive. When words such as tajj ~tejj are followed by a vowel-initial suffix, for
example Ezafe, the final j is heard as geminated (e.g., tejj-e). I will return to gemination
in 5.5.
In this section I discussed the four diphthongs which are commonly thought to have non-
phonemic status (i.e., ɑj, uj, oj, aj) and the arguments presented in the literature in this
regard. These arguments are: (i) the possibility of deletion of j; and (ii) the separation of j
from the vowel in suffixed forms. Next I discuss the two diphthongs whose phonemic
status is controversial.
3.10.2. ej and ow
As noted before, the phonemic status of two of the six diphthongs, ej and ow, is less clear
(e.g., Meshkatod Dini 1999, Najafi 2001). I start with some examples of these
diphthongs:
(75) a. ej mej ‘wine’
mejl ‘willingness’
mejmun ‘monkey’
b. ow ʤow ‘barley’
ʤowr ‘torture’
ʤowlɑn ‘parading’
Between ej and ow, the status of ow is particularly unclear, not only because its
phonemic status is disputed but also because the presence of w in ow is itself a topic of debate (e.g.,
Samareh 1977, 1985, Pisowicz 1985, Najafi 2001), as discussed below.
Regarding the phonemic status of ej and ow as diphthongs, there are two different views
in the literature (e.g., Samareh 1985, Meshkatod Dini 1999, Najafi 2001). Some consider
them as being phonemically diphthongs (acting as one vowel with two parts which are
not separable) and others as being a sequence of a vowel followed by a consonant.
The literature which considers ej and ow to be phonemic finds support for this position
from historical data (according to Pisowicz 1985, Najafi 2001, among others): historical
aj became present ej and also av/aw became present ow. That is, for the words which
synchronically include ej and ow, instead of presenting a synchronic analysis, a
diachronic account is presented based on the former status of these diphthongs (aj and
av/aw). The historical status of diphthongs is a topic in itself to investigate. Recall from
2.1 that Old Persian had two diphthongs ai and au which became ē and ō in Middle
Persian. So how diphthongs changed over time from Old Persian to the present time and
what historical status the literature exactly refers to in arguing for present ej and ow being
phonemic are unclear and need to be examined.
The studies which consider ej and ow to be non-phonemic consider them as a sequence of
a vowel followed by a consonant (e.g., Samareh 1985). According to Samareh, a
phonemic diphthong is a diphthong in which two parts of the diphthong act as a single
vowel, that is, the second part is not separable from the first part. As for ej, the latter
explanation (i.e., a sequence of a vowel followed by a consonant) is easy to argue for. It
is, however, problematic in the case of ow, since w is not a phoneme in the consonant
inventory of Persian. As Najafi (2001) writes, this is in fact one of the reasons that ow is
considered in some studies as a diphthong phonemically (that is, one cannot consider it as
being a vowel followed by a consonant since there is no w in the consonant inventory of
Persian). But those studies which consider ow as a vowel followed by the consonant w,
and not as a phonemic diphthong, treat ow as underlyingly ov. In this view, w after o in
ow is an allophone of v which occurs only after o, as suggested by the fact that v and w do
not contrast with each other as well as by the suffixed form of the ow-final words, such as
those in (76). Thus, under this view, the presence of w in ow is not problematic or an
indication of the phonemic status of ow. That is, w can occur only on the surface, as an
allophone of v, without being in the phonemic inventory. Regarding w being an allophone
of v, according to some studies (e.g., Hayes 1986), the labiodental fricative [v] and the
labio-velar approximant [w] are in complementary distribution: [w] occurs in codas after
o (for example, pɑltow ‘overcoat’ and dowr ‘era’), and [v] occurs elsewhere.
Morphological alternations such as mi-ra[v]-am ‘I am going’ and bo-ro[w] ‘go!’ or
no[w]-ruz ‘new year’ and no[v]-in ‘new kind’ indicate that [v] and [w] are allophonically
related.
(76) a. now ‘new’ + -in → novin ‘modern’
b. ʤow ‘barley’ + -in → ʤovin ‘made of barley’
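The allophonic view of w described above can be stated as a minimal sketch in Python. The function name realize_v, the pre-syllabified input, and the underlying forms with /v/ (e.g., /nov/ for now) are illustrative assumptions that follow the ov analysis reported here; they are not claims about any particular published formalization.

```python
from typing import List

def realize_v(syllables: List[str]) -> List[str]:
    """Spell out underlying /v/ as [w] in a coda after o, and as [v] elsewhere.

    Input is a list of syllables in underlying form, one character per
    segment (a simplifying assumption). Within a syllable, a v that
    directly follows o must be in the coda, so checking the preceding
    segment inside the same syllable is sufficient.
    """
    surface = []
    for syllable in syllables:
        segments = list(syllable)
        for i, segment in enumerate(segments):
            if segment == "v" and i > 0 and segments[i - 1] == "o":
                segments[i] = "w"
        surface.append("".join(segments))
    return surface

print(".".join(realize_v(["nov"])))        # unsuffixed: v is in the coda
print(".".join(realize_v(["no", "vin"])))  # suffixed: v is an onset
print(".".join(realize_v(["dovr"])))       # coda cluster after o
```

Because the suffixed form is syllabified no.vin, the v is no longer in a coda and the rule does not apply, which is how the alternation in (76) falls out under this view.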
There are two processes which are referred to in some literature (e.g., Samareh 1985) as
evidence that ej and ow are not diphthongs but sequences of a vowel followed by a consonant, as
follows.
First, the first and second halves of these diphthongs are syllabified heterosyllabically
before a vowel-initial suffix in suffixation, as shown in (77). In the case of ow, the w
becomes v. Based on this, some literature considers ej and ow not to be phonemic
diphthongs because if they were, the two parts should not be separated (e.g., Samareh
1985, Pisowicz 1985).
(77) a. Rej ‘the name of a city in Iran’ + -i → re.ji ‘pertaining to Rej (the city)’
b. pejrow ‘follower’ + -i → pej.ro.vi ~ pej.ra.vi ‘following’
c. now ‘new’ + -in → no.vin ‘modern’
Second, there is no word in Persian with CejCC or CowCC, while the language has many
CVCC words, some of which are given in (78). The maximum final cluster is CC in
Persian.
(78) a. narm ‘soft’
b. fekr ‘thought’
c. ɢors ‘pill’
d. mɑst ‘yogurt’
e. rixt ‘appearance’
f. pust ‘skin’
The fact that CejC and CowC occur, as seen in (79), but not CejCC and CowCC suggests
that j and w are consonants (w being an allophone) as some literature suggests (e.g.,
Pisowicz 1985).
(79) a. sejl ‘flood’
b. tejf ‘range’
c. xejr ‘goodness’
d. ɢejz ‘anger’
e. ʃowɢ ‘eagerness’
f. dowr ‘turn’
g. ɢowm ‘ethnic group’
h. mowz ‘banana’
According to Pisowicz (1985), who considers ej and ow to be non-phonemic, the absence
of a phonemic contrast between long and short vowels in Persian rules out the mono-
phonemic status of ej and ow.
Pisowicz further comments on w in ow. Recall that the Persian consonant inventory does
not have w and that is why it is considered as an allophone of v, which exists in the
inventory. Pisowicz considers w to be a phoneme which is in the process of being
dropped from the system, under the pressure of the colloquial language, and probably
will remain as a non-phonemic glide. Pisowicz considers the following problems in
interpreting w as a phoneme, concluding that w is losing its phonemic status: (i) it has a
low frequency which is associated with a very restricted distribution (i.e., after –o); (ii) in
the colloquial and frequently in the literary spoken language, w does not occur due to the
shift of ow to o:. I should add a note about colloquial versus formal speech with respect
to ow. Some literature suggests that ow is not heard anymore in colloquial speech and
instead o or a phonetically longer o (shown as o:) is heard (e.g., Pisowicz 1985, Najafi
2001). Meshkatod Dini (1999) also says that o. (longer o) and ow are in free variation in
Modern Persian as in no./now ‘new’, ro.ʃan/rowʃan ‘bright’.
The case of ow needs comment. Its occurrence in final position is in particular of
importance in the minimal word requirement and in suffixation discussions. I will return
to ow in chapter 6. The point about ow which is important for the discussion of
diphthongs in Persian is that no evidence is presented for its being a phonemic diphthong.
In fact, the existence of w in ow itself is under question as reflected in the literature, as
was seen above and will be discussed in chapter 6.
To conclude the discussion of diphthongs, I follow the view that there is no phonemic
diphthong in the language as there is no convincing evidence for the language having
such diphthongs. All of the six diphthongs are syllabified heterosyllabically when a
vowel-initial suffix is present. Moreover, with none of them is the CVdiphthongCC structure
possible.
The overall conclusion with respect to the Persian vowel inventory is that Persian does
not have long vowels or phonemic diphthongs. All vowels in Persian are phonologically
monophthongal and monomoraic.
3.11. Summary
In this chapter, I argued that tenseness is the active feature of the Persian vowel system
(not height or length) and showed that both harmony, which requires a feature to be
phonologically active in the system, and a classification of vowels into two groups a, e, o
vs. ɑ, i, u, which is observed in phonotactics, harmony, etc. are accounted for through
tenseness. I presented a phonetic experiment on the length of Persian vowels. I further
discussed contrasts and markedness in the system. Moreover, I studied two processes
which show changes in vowels in the environment of particular consonants: harmony in
low vowels across laryngeals and pre-nasal raising. I also looked into the status of
Persian diphthongs showing that, as suggested by some literature (e.g., Samareh 1985),
there is no phonemic diphthong in Persian.
Although tenseness neatly accounts for both harmony and categorization of vowels in
Persian, there are processes in the language which potentially provide evidence for
quantity. In the next three chapters, I will discuss these potential arguments for quantity-
based analysis: epenthesis in suffixation, VC co-occurrence restrictions, and minimal
word requirements. I argue that they do not support the presence of underlying quantity.
In fact, they provide support for a tense-based account compared to both a height-based
account and a quantity-based account.
Chapter 4
The epenthetic –e in suffixation: evidence for quantity?
In chapter 3, I argued that contrast in the Persian vowel system is based on a tense/lax
distinction with [tense] being underlyingly present, as suggested by Persian harmony
processes in which [tense] spreads. There is, however, evidence in the language which
seems to provide support for a quantitative analysis of the system. In this and the next
two chapters, I will discuss this evidence and show that it does not provide arguments for
underlying quantity. This potential evidence for a quantity-based analysis involves: (i)
epenthesis in suffixation; (ii) VCC co-occurrence restrictions; (iii) minimal word
requirements. Epenthesis in suffixation is examined in this chapter. VCC co-occurrence
restrictions and minimal word requirements will be discussed in chapter 5 and chapter 6,
respectively. I will show that these three processes/phenomena do not support underlying
quantity, and as much as they are observed in the language, they can be accounted for by
underlying tenseness. In fact, epenthesis, VCC restrictions, and minimal words provide
further support for a tense-based account. An account based on a tense/lax distinction
will be shown to have the merits of being like a featural account in certain respects (e.g.,
harmony processes), and like a quantity analysis in other respects, and the latter is what I
will show in this and the next two chapters. It will be seen that if we consider quantity as
the phonological dimension of contrast in the system, not only is harmony left
unexplained but also processes such as epenthesis, which at first appear to support
quantity, cannot be accounted for. And if we consider height to be the quality of the
system, the categorization, which is observed in harmony and, to some extent, in the
process/phenomena which I will discuss in the present and the next two chapters, is lost.
I first provide a roadmap of the steps I take in discussing epenthesis, VCC restrictions,
and minimal words. Then, I begin my discussion of epenthesis, which is the focus of this
chapter. The first step is to show why epenthesis, VCC restrictions, and minimal words
can be potentially evidence for underlying quantity. The next step is to show how a tense-
based analysis can account for these processes/phenomena to rule out the necessity of
underlying quantity. The last step is to argue that these are not productive processes, and
are observed in the language in a limited way, which could be historical residue or a
surface effect of underlying tenseness. I will return to this at the end of chapter 6, once I
discuss all these apparent pieces of evidence for underlying quantity.
I now discuss epenthesis. Epenthesis has been argued to be driven by prosodic
requirements in general (e.g., Selkirk 1981, Itô 1989), and in Persian this analysis has
some initial appeal since epenthesis appears to be sensitive to weight. The epenthesis
under discussion occurs at a stem-suffix boundary when a CVC stem whose vowel is ɑ, i, u is followed by a consonant-initial suffix. It may also occur with stems whose vowel is
a, e, o but only when the root structure is CVCC. Epenthesis does not occur with CVC-
CV sequences when V of the stem is lax. This may suggest that epenthesis occurs when a
stem is heavy (I will return to this below) due to either the existence of ɑ, i, u, which are
bimoraic under a quantitative account, followed by a consonant, or the sequence of a, e, o
followed by two consonants. Thus the environment for epenthesis might be considered to
be an argument for ɑ, i, u to be bimoraic, and consequently an argument against a purely
qualitative analysis of the Persian vowel system. I will discuss this process in detail and
show that the process does not provide an argument for underlying quantity or against
quality if we consider tenseness as the basis of contrast (I will discuss that if height is
considered to be the basis of contrast in the system, we face a problem in accounting for
epenthesis). As I will show, the process can in fact receive a synchronic account.
However, it is not synchronically productive.
I discuss the synchronic status of the epenthesis in 4.1. Section 4.2 presents a possible
quantitative analysis and a discussion of how tenseness can in fact account for the process
without a need to refer to underlying quantity. Section 4.3 addresses a question about the
limited occurrence of epenthesis. In 4.4, I present an overview of Persian suffixes.
Section 4.5 includes discussion of cluster types, productivity of suffixes, and frequency
of suffixes and the suffixed forms. I then present the historical background of the suffixes
and the suffixed forms in 4.6. I next report on an experiment I did to study the synchronic
status of epenthesis in 4.7. Section 4.8 concludes the chapter.
4.1. Epenthesis in suffixation: synchronically
Persian has an epenthesis process that occurs when a consonant cluster is created at a
stem-suffix boundary. An epenthetic vowel (the vowel –e, and in a few cases –o or –a) is
inserted in order to break up the morphologically derived consonant clusters. This
process occurs only in a limited number of cases and with some stem structures. In
particular, with stems having or ending in the shape CVlaxC, epenthesis never
occurs, while with CVlaxCC, CVtenseC, and CVtenseCC forms, epenthesis occurs in some cases
and not in others.
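The shape condition just stated can be summarized in a minimal sketch in Python. The function name epenthesis_possible and the one-character-per-segment transcription are my own illustrative assumptions. A result of True means only that the stem shape is one in which epenthesis is an option; it does not predict that a given word actually shows epenthesis.

```python
LAX = set("aeo")    # lax vowels
TENSE = set("ɑiu")  # tense vowels

def epenthesis_possible(stem: str) -> bool:
    """Return True if the stem ends in VlaxCC, VtenseC, or VtenseCC,
    and False if it ends in VlaxC.

    Syllable dots in the transcription are ignored. Only the last vowel
    and the consonants after it are inspected (a simplifying assumption).
    """
    segments = stem.replace(".", "")
    vowel_positions = [i for i, s in enumerate(segments) if s in LAX | TENSE]
    if not vowel_positions:
        raise ValueError("stem contains no vowel")
    last = vowel_positions[-1]
    coda = segments[last + 1:]
    if not coda:
        raise ValueError("stem must end in a consonant")
    if segments[last] in TENSE:
        return True            # VtenseC or VtenseCC
    return len(coda) >= 2      # VlaxCC qualifies, VlaxC does not

for stem in ["dar", "dɑ.neʃ", "kɑr", "arʤ", "mɑnd"]:
    print(stem, epenthesis_possible(stem))
```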
I begin the discussion with illustration of when epenthesis does and does not occur. Note
that when I talk about the category in which ‘epenthesis may occur’, I disregard any
differences within that category in terms of the frequency of epenthesis. There are some
words whose version with epenthesis is (by far) more frequent than their non-epenthesis
version and vice versa, and there are some which show similar frequency in their two
versions. The important point for our discussion now is simply that epenthesis is an
option. The data on where epenthesis is possible and where it is not is based on my
intuition, and confirmed by the transcriptions given for the words in Persian dictionaries
(e.g., Emami 2006), by at least two other native speakers of Persian, and also by an
experiment conducted with 10 native speakers of Persian, discussed in section 4.7.
(1) presents examples of stems with (or ending in) CVlaxC form followed by a consonant-
initial suffix. As the examples show, epenthesis is not an option.
(1) CVlaxC (no epenthesis)
a. dar ‘door’ + bɑn → dar.bɑn ‘doorkeeper’ *da.re.bɑn
b. ɢam ‘sadness’ + ɡin → ɢam.ɡin ‘sad’ *ɢa.me.ɡin
c. ɡol ‘flower’ + dɑn → ɡol.dɑn ‘vase’ *ɡo.le.dɑn
d. ʃen ‘gravel’ + zɑr → ʃen.zɑr ‘sandy terrain’ *ʃe.ne.zɑr
e. nam ‘dampness’ + nɑk → nam.nɑk ‘damp’ *na.me.nɑk
f. dɑ.neʃ ‘knowledge’ + var → dɑ.neʃ.var ‘knowledgeable’ *dɑ.ne.ʃe.var
g. ho.nar ‘art’ + mand → ho.nar.mand ‘artist’ *ho.na.re.mand
h. ʃo.tor ‘camel’ + bɑn → ʃo.tor.bɑn ‘cameleer’ *ʃo.to.re.bɑn
In (2) examples of stems with CVlaxCC form followed by consonant-initial suffixes are
when the last vowel of the base is long an intrusive /e/ intervenes between the word final
consonant and the /ɡ/, i.e. /ʔɑmuzeɡɑr/, /pɑjeɡɑh/, /xodɑjeɡɑn/” (p. 137, footnote 1).
Recall that Samareh (1977, 1985) argues that length is not the primary contrastive feature
in Persian vowels; height is distinctive rather than length in his analysis (that is, in the
account he provided for Persian vowels, he considers only height to be the active feature
and does not consider both length and height to be active in the system). Using the term
“long” for the vowels ɑ, i, u seems contradictory to his height-based analysis. I should
mention that the epenthetic e is also seen with suffixes which do not start with -ɡ
(although Samareh does not say anything about other suffixes so maybe his saying that
the intrusive /e/ occurs before /ɡ/ of the suffix is meant to be an explanation only for
those three examples he gives which all start with /ɡ/). Examples of cases with epenthetic
–e with other suffixes include: pɑs + bɑn → pɑ.se.bɑn ~ pɑs.bɑn ‘police officer’, ʃɑd + mɑn → ʃɑd.mɑn ~ ʃɑ.de.mɑn ‘happy’, etc. The point of importance is that Samareh
relates the occurrence of epenthesis to the length of the vowel of the root. Lazard (1992)
also points out that a cluster of two consonants which follow one of the stable vowels
(recall that the stable vowels are ɑ, i, u) may be broken up by the vowel e, as observed in
alternations such as pɑsbɑn ~ pɑsebɑn ‘policeman’, ruzɡɑr ~ ruzeɡɑr ‘time’, kɑrɡar ~ kɑreɡar ‘worker’.
I should add that the insertion of an epenthetic –e is in some cases observed within stems
too, as in ɑʃnɑ ~ ɑʃenɑ ‘familiar’. I will discuss these cases again in 4.6.2. My focus at
this point is on epenthesis in suffixation, of which there are far more cases compared to
epenthesis within stems, of which few examples exist.
Epenthesis may appear to be attributable to length, thus supporting a quantity-based
rather than featural analysis of vowel contrasts. In the next subsection I discuss this in
detail.
4.2. Incorporating quantity
Epenthesis is often argued to result from syllabification demands (e.g., Selkirk 1981, Itô
1989). The Persian epenthesis process, as illustrated in section 4.1, appears to argue in
favor of a quantity-based analysis of vowels, where epenthesis may apply with long
vowels and not with short vowels. Assuming a moraic representation of vowels (Hyman
1985, Hayes 1989, McCarthy and Prince 1995, among many others), the vowels I have
identified as lax vowels would be a single mora, while those I have called tense vowels
would be two morae. Syllable structures for sample words are shown in (8).
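(8) Prosodic structures (tree diagrams; each word is parsed as PWd > Φ > σ > μ)
i. Lax vowel: dar ‘door’, with two morae, one linked to a and one linked to r
ii. Tense vowel: kɑr ‘work’, with two morae, both linked to ɑ; the final r is linked directly to PWd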
Epenthesis can be quite readily accounted for under the quantity hypothesis. While
details of prosodic structure in Persian remain to be developed, the following is an outline
of an account under a quantitative analysis of vowel contrasts. Note that although
Samareh and Lazard comment on epenthesis, they do not provide an account of the
process.
The long (tense) vowels are bimoraic (8ii) and the short (lax) vowels monomoraic (8i).
Assuming that a syllable can accommodate two morae, the postvocalic consonant
following the short vowel can receive a mora, and is thus prosodically licensed in this
way. Following the long vowel, on the other hand, the consonant is not moraic. While it
can be licensed by associating to the prosodic word when it is final, when a suffix is
present, the stem-final consonant syllabifies as an onset, with epenthesis of a vowel to
provide a nucleus.37
37 Note that we could also consider the final consonant after long vowels to receive a mora, which means that a syllable could accommodate more than two morae (see (9) below). The latter analysis of the syllable structure, which makes words such as kɑr ‘work’ trimoraic, does not change the possibility of the occurrence of epenthesis as it creates a superheavy syllable. Epenthesis may or may not occur in this environment, as shown in unsuffixed words such as ɑʃnɑ ~ ɑʃenɑ ‘familiar’, ɑʃkɑr ~ ɑʃekɑr ‘apparent’ (see also 4.6.2). Thus whether one considers the final consonants in CVtenseC to be assigned a mora or to be licensed by the Prosodic Word (henceforth PWd) or directly by the syllable, epenthesis is possible to account for.
For words of the form CVlaxCC and CVtenseCC, the structures in (9) are possible,
assuming no more than one consonant is licensed by the prosodic word. These structures
suggest that Persian syllables can be trimoraic. The trimoraicity of Persian syllables is
suggested by Hayes (1979) and adopted by Darzi (1991). This discussion is to highlight
why epenthesis is potentially a support for a quantitative analysis for Persian.
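(9) Prosodic structures (tree diagrams; each word is parsed as PWd > Φ > σ > μ)
i. Lax vowel: (ʔ)arʤ ‘value’, with two morae, one linked to a and one linked to r; the final ʤ is linked directly to PWd
ii. Tense vowel: mɑnd ‘last’, with three morae, two linked to ɑ and one linked to n; the final d is linked directly to PWd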
Thus, assuming structures in (8) and (9) for the purposes of argument, basically
epenthesis may apply to accommodate a stray consonant. In (8i), there is no stray
consonant. In all other structures, a stray consonant is present. The reason that one can
consider final consonants in the above cases to be extraprosodic (linked directly to PWd)
is that in Persian one only finds consonant clusters morpheme-finally (e.g., Samareh
1985) with a couple of exceptions (surtme ‘sled’, jurtme ‘trot’). If these consonant
sequences were licensed within a syllable, they might be expected to occur word-
internally. Note that medial CC as in CVCCV(C) exists in Persian but it is syllabified as
CVC.CV(C) (e.g., ɢurbɑɢe ‘frog’ is syllabified as ɢur.bɑ.ɢe) –the syllabification will be
discussed in detail in 5.6.
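The reasoning behind (8) and (9) can be made explicit with a minimal sketch in Python. It encodes the assumptions stated above for the purposes of argument: tense vowels carry two morae and lax vowels one, coda consonants receive a mora while the syllable has fewer than two, and at most one remaining final consonant is licensed by PWd (the stray consonant). The function name parse_rhyme and the representation of the rhyme as a vowel plus a coda string are hypothetical conveniences of mine.

```python
from typing import Optional, Tuple

TENSE = set("ɑiu")  # bimoraic under the quantity hypothesis
LAX = set("aeo")    # monomoraic under the quantity hypothesis

def parse_rhyme(vowel: str, coda: str) -> Tuple[int, Optional[str]]:
    """Return (mora count of the syllable, stray consonant or None).

    The stray consonant is the final consonant licensed directly by
    PWd; it is the one that may trigger epenthesis before a
    consonant-initial suffix.
    """
    if vowel not in TENSE | LAX:
        raise ValueError("unknown vowel: " + vowel)
    morae = 2 if vowel in TENSE else 1
    leftover = []
    for consonant in coda:
        if morae < 2:
            morae += 1          # consonant is moraic, as r in dar
        else:
            leftover.append(consonant)
    stray = leftover[-1] if leftover else None
    # Only one consonant may be licensed by PWd, so any other leftover
    # consonant must be moraic, giving a trimoraic syllable as in mɑnd.
    morae += max(len(leftover) - 1, 0)
    return morae, stray

for word, vowel, coda in [("dar", "a", "r"), ("kɑr", "ɑ", "r"),
                          ("arʤ", "a", "rʤ"), ("mɑnd", "ɑ", "nd")]:
    print(word, parse_rhyme(vowel, coda))
```

Only dar, the CVlaxC shape, comes out with no stray consonant, matching the observation that epenthesis is never an option for that shape.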
Adopting a quantitative view, one can thus say that in (9i), when a monomoraic vowel is
followed by a cluster of two consonants in the root, a vowel may be epenthesized when a
consonant-initial suffix is present. The final coda consonant of the root acts as the onset
of the syllable with an epenthetic vowel ((ʔ)arʤ + mand becomes (ʔ)ar.ʤo.mand).
Compare this with the structure in (8i), where the short vowel permits the only coda
consonant to be moraic, therefore there is no need for epenthesis (e.g., dar + bɑn becomes dar.bɑn). For words with the structure given in (9ii), a suffix is added to a root
with a long vowel followed by a cluster of two consonants. To avoid having four morae
in the root after suffixation, epenthesis occurs, with the remaining root trimoraic (e.g.,
mɑnd-ɡɑr becomes mɑn.de.ɡɑr). The heaviness of some structures (heavy due to the
bimoraic vowel or a monomoraic vowel followed by a consonant cluster) can lead one to
consider epenthesis to be conditioned by Persian syllable structure.
The patterning of VlaxC as opposed to VtenseC is one indication of the monomoraicity of
the lax vowels. In addition, the choice of e (and, where not e, a or o) as the
epenthetic vowel can be considered as further evidence for the monomoraicity of these
three vowels in a quantitative account. Since epenthesis is generally viewed as altering
structure minimally, epenthesis of a long vowel is unexpected (e.g., Steriade 1995).
We now face a dilemma. While the patterning of epenthesis might suggest a quantity-
based analysis, harmony patterns suggest a quality-based analysis. If we adopt the
quantity-based prosodic structure of vowels presented above, with paired vowels
differing by mora count but not by feature, we cannot account for the harmony processes
(see 2.3.6). The question is: can epenthesis be accounted for under the tense/lax
hypothesis? If so, how?
In order to account for epenthesis based on the tense/lax distinction, one can follow the
direction of an analysis according to which features play a role in projecting syllable
structure (see van Oostendorp 1995 for discussion). Quantity is not underlying, but
vowels with the feature [tense] project two morae, unlike lax vowels, which project a
single mora. A syllable is bimoraic only if the vowel has the feature [tense]. That is,
[tense] is underlying and tense vowels are redundantly bimoraic. Closed syllable laxing is
a common process cross-linguistically. For example, it is found in Canadian French (e.g.,
Walker 1984, Rose and dos Santos 2008), Ngaju Dayak (Austronesian; Brunelle and
Riehl 2003); see also Hammond 1997 as well as Bermúdez-Otero and McMahon 2006
for discussion of English. Evidence from closed syllable laxing suggests that tense
vowels often pattern as two morae. Thus, the representations in (8) and (9) are possible
surface representations without the implication that quantity is underlying. This is shown
in (10). VV indicates bimoraicity and V monomoraicity. In my account of Persian
vowels, [lax] is absent, so [lax] in (10) can be in fact replaced by [ ], which shows no
specification for tenseness. For the sake of clarity, I put [lax].
(10) [tense] → VV
[lax] → V
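The projection in (10) can be sketched in Python as follows. The names underlying_features and project_morae are illustrative assumptions of mine. The sketch treats [tense] as the only specified feature, so that a lax vowel is simply one with no specification for tenseness, and mora count is derived on the surface rather than stored underlyingly.

```python
TENSE_VOWELS = set("ɑiu")
LAX_VOWELS = set("aeo")

def underlying_features(vowel: str) -> set:
    """Return the underlying tenseness specification of a vowel.

    Tense vowels carry [tense]; lax vowels carry no specification,
    shown here as the empty set, i.e., [ ] rather than [lax].
    """
    if vowel in TENSE_VOWELS:
        return {"tense"}
    if vowel in LAX_VOWELS:
        return set()
    raise ValueError("unknown vowel: " + vowel)

def project_morae(features: set) -> int:
    """Project surface morae from features: [tense] → VV, otherwise V."""
    return 2 if "tense" in features else 1

for vowel in "ɑiuaeo":
    print(vowel, project_morae(underlying_features(vowel)))
```

Nothing in the underlying representation refers to length; the mora count is computed from the feature, which is the sense in which quantity is redundant here.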
I should note that I do not suggest that in Persian tense vowels necessarily project two
morae and lax ones one mora, that is, I do not suggest that we should see the surface
mora-projection effect all the time. What I suggest is that the process can potentially be
accounted for based on underlying tenseness and mora-projection on the surface. As I
will show, we see this surface effect only to some extent in the language considering
discussions in chapters 4, 5, and 6.
As for the choice of e (and sometimes a and o) as epenthetic vowel, I argued in 3.5 that
these three vowels are structurally less complex than the other vowels. Thus a tense-
based account can explain the choice of epenthetic vowel in Persian and there is no need
to appeal to a quantitative analysis in this respect.
4.3. A question about epenthesis
So far, I have shown that where epenthesis is possible, it could be a potential argument
for quantity, but a tense-based analysis can account for the process without an appeal to
underlying quantity.
A question that arises is: why do only a limited number of words allow epenthesis, as
discussed in section 4.1? That is, if the occurrence of epenthesis is related to the prosodic
structure of the root, why in most cases is no epenthesis found in suffixed words whose
roots have the required structure? In fact, the number of words with CVlaxCC, CVtenseC,
CVtenseCC root structure which do not show epenthesis is by far larger than the number of
those with these structures which show it. Thus, on the one hand, all the cases which
allow epenthesis have one of these structures, and, on the other hand, most of the words
which have one of these structures do not allow epenthesis.
In order to determine whether the conditions under which epenthesis occurs are
systematic I examine a variety of factors including frequency, productivity of the
suffixes, and type of clusters created at a stem-suffix boundary (4.5), and I also
investigate the historical status of the suffixes and the suffixed forms (4.6). But before
discussing these, in section 4.4, I present an overview of Persian suffixes.
4.4. An overview of Persian suffixes
Epenthesis is possible with only a small subset of the many stress-bearing suffixes. With
the other stress-bearing suffixes, and with all non-stress-bearing suffixes, epenthesis does
not occur no matter what vowel is present in the stem. Thus stress-bearing status alone
does not predict epenthesis. Even among the stress-bearing suffixes (see
Appendix 2 for a full list of suffixes and examples), epenthesis occurs only with some
roots and some suffixes. It is worthwhile to examine the range of suffixes although the
only relevant ones for the topic under discussion are consonant-initial suffixes following
C- and CC-final stems. A list of Persian suffixes is given in (11) and (12). I consider
stress as the criterion for dividing the suffixes into two categories.38
38 There is disagreement about some of these suffixes regarding their status as derivational or inflectional. For example, the nominal plural markers, and the comparative and superlative markers are considered as inflectional by Kalbasi (1992) but derivational by Kahnemuyipour (2000). Dividing them by stress, which is sufficient for my purpose, allows me to avoid discussion of controversies in this respect.
(ii) comparative and superlative markers (-tar, -tarin)
(iii) definite marker -e
(iv) noun-forming suffix -i
(v) adjective-forming suffix -i
(vi) a large number of suffixes including (some of these have other functions):
- locative: -ɡɑh, -zɑr, -kade
- diminutive: -ʧe, -ak
- agentive: -bɑn, -ɡar
- attributive: -mɑn, -nɑk, -vɑr, -vaʃ
39 For Ezafe see 2.2.2.1.2 (footnote 13).
40 There are also plural markers ɑt, ʤɑt, in, un, most of which are from Arabic and are not as frequently used (e.g., ɑt: hejvɑn ‘animal’ and hejvɑn-ɑt ‘animal (pl.)’, maɢɑle ‘article’ and maɢɑlɑt ‘articles’; ʤɑt: sabzi-ʤɑt ‘vegetable (pl.)’; in: mosɑfer-in ‘traveller (pl.)’; un: rohɑni-j-un ‘clergyman (pl.)’). Among these, -ɑt is the most frequent. I leave them aside.
The suffixes in (12vi) are of interest for this discussion as suffixed forms which show
epenthesis contain one of these suffixes (e.g., kɑr-ɡar ~ kɑr-e-ɡar ‘worker’, ʃɑd-mɑn ~ ʃɑd-e-mɑn ‘happy, joyful’). Some examples were given above and more examples will be
given below (see Appendix 2 for the full list).
I present below the non-stress-bearing suffixes and some examples followed by the
stress-bearing suffixes which do not show epenthesis. Note that some of these suffixes
start with vowels and therefore no cluster is created. I include them here simply to
provide the full list of suffixes. For each suffix, I give examples of CVlaxC, CVtenseC,
CVlaxCC, CVtenseCC structures. For example, consider (13). (13a) shows two examples of
CVC, one with a tense vowel (i.e., bɑɢ ‘garden’) and the other with a lax vowel (i.e., ɡol ‘flower’); and (13b) shows two examples of CVCC, one with a tense vowel (i.e., dust ‘friend’) and the other with a lax vowel (i.e., toxm ‘seed’).
Let us first look at examples of the non-stress-bearing suffixes.
(13) -i indefinite article
a. CVC bɑɢ ‘garden’ bɑɢ-i ‘a garden’
ɡol ‘flower’ ɡol-i ‘a flower’
b. CVCC dust ‘friend’ dust-i ‘a friend’
toxm ‘seed’ toxm-i ‘a seed’
(14) -rɑ specificity marker (-ro in speech and usually -o after consonants)
a. CVC bɑɢ ‘garden’ bɑɢ rɑ ‘the garden’ ~ bɑɢ-o
ɡol ‘flower’ ɡol rɑ ‘the flower’ ~ ɡol-o
b. CVCC dust ‘friend’ dust rɑ ‘the friend’ ~ dust-o
To sum up, one cannot find any particular pattern in cases where epenthesis occurs by
looking at suffixes and roots. All that can be said is that (i) there are suffixes with which
epenthesis almost never occurs regardless of the structure of the root. This set includes
most suffixes of the language, including: -bɑr, -ʧe, -ʧi, -dɑn, -dis, -zɑr, -sɑr, -sɑn, -kɑr, -kade, -ɡun, -ɡin, -nɑk, -vaʃ; (ii) with the suffixes -ɡar and -bɑn, in a few cases
epenthesis occurs and in most cases it does not; (iii) there are a few suffixes which show
epenthesis more frequently than other suffixes. These suffixes are -mɑn and -ɡɑr. The list of words which show epenthesis contains other suffixes, as seen in the
examples. But the number of epenthesis-containing cases with these four suffixes
(i.e., -mɑn, -ɡɑr, -ɡar and -bɑn) is higher.
Note that there are words like parɡɑr ‘a pair of compasses’ where the analysis is not
clear. A root par ‘feather’ exists as a word but whether parɡɑr is a suffixed form of it is a
question. A similar example is darmɑn ‘treatment’. Is it composed of dar ‘door’ and the
suffix -mɑn? I leave aside these and other shaky cases and rely on what is clearly a root
followed by a suffix.
In addition, note that in Persian, the present stems of infinitives are used as suffixes.
Epenthesis is not found in such cases. Some examples follow:
(32) ʤostan ‘to look for’ ʤu present stem
a. dɑ.neʃ ‘knowledge’ + ʤu → dɑ.neʃ.ʤu ‘student’
b. razm ‘war’ + ʤu → razm.ʤu ‘warrior’
c. par.xɑʃ ‘aggression’ + ʤu → par.xɑʃ.ʤu ‘aggressive’
d. ma.dad ‘help’ + ʤu → ma.dad.ʤu ‘someone needing help’
c. fard ‘individual’ + ɡerɑ → fard.ɡerɑ ‘individualist’
d. rɑst ‘right’ + ɡerɑ → rɑst.ɡerɑ ‘rightist’
In conclusion, given the list of suffixes and the large number of words which take these
suffixes, the number of cases where epenthesis is possible is very limited. There is a short
list of words with which epenthesis occurs. A much longer list exists of those which do
not show epenthesis with the same root structure. This might suggest that epenthesis is
not productive: if it were, more cases would be expected to show it.
In order to determine whether the conditions under which epenthesis occurs are
systematic, in section 4.5 I examine several factors, namely types of consonant clusters
created at a stem-suffix boundary when epenthesis occurs, productivity of suffixes, and
frequency of suffixes and the suffixed forms under study, arguing that none of these
offers an account. What is systematic is that epenthesis is never found with VlaxC-C structures
while it is possible with VlaxCC-C, VtenseC-C, and VtenseCC-C structures.
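The structural generalization just stated can be sketched as a small check on the shape of a root. This is only an illustration, not part of the analysis itself: the function names are hypothetical, the tense set {i, u, ɑ} and lax set {a, e, o} are assumed from the examples in this chapter, and every consonant is assumed to be a single IPA character. The check says only whether epenthesis is structurally possible, not whether it actually occurs.

```python
# Assumption: tense vowels are i, u, ɑ and lax vowels are a, e, o.
TENSE = set("iuɑ")
LAX = set("aeo")
VOWELS = TENSE | LAX

def root_template(root: str) -> str:
    """Return the shape of the root's final rhyme, e.g. 'VtenseC' or 'VlaxCC'."""
    # Drop syllable dots and morpheme hyphens used in the transcriptions.
    segments = [ch for ch in root if ch not in ".-"]
    last_vowel = max(i for i, ch in enumerate(segments) if ch in VOWELS)
    quality = "tense" if segments[last_vowel] in TENSE else "lax"
    coda_length = len(segments) - last_vowel - 1
    return "V" + quality + "C" * coda_length

def epenthesis_possible(root: str) -> bool:
    """True if a consonant-initial suffix could in principle trigger epenthesis."""
    return root_template(root) in {"VlaxCC", "VtenseC", "VtenseCC"}

for root in ["bɑɢ", "ɡol", "toxm", "dust"]:
    print(root, root_template(root), epenthesis_possible(root))
```

Of the four roots from (13), only ɡol (VlaxC) is excluded; the other three have one of the structures with which epenthesis is possible.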
4.5. Cluster types, productivity of suffixes, and frequency
In this section, I discuss the following: in 4.5.1, the cluster types created at a suffix
boundary where epenthesis may occur, together with the productivity of suffixes; and
in 4.5.2, the frequency of suffixes and suffixed forms.
4.5.1. Cluster types and productivity of suffixes
I begin the discussion with clusters which may occur at stem-suffix boundaries. It could
be the case that particular sequences of consonants force epenthesis. However, the data
shows that there is no particular tendency in terms of the clusters created in suffixation in
cases where epenthesis occurs. The following combinations, among others, are observed:
(36) With m-initial suffix, words with final: z, t, r, j, d, ʤ
With b-initial suffix, words with final: s, ɢ, r
With ɡ-initial suffix, words with final: d, z, t, r, j
It might be asked whether productivity of suffixes has a role in conditioning epenthesis.
There is no particular correlation between the productivity of a suffix and its tendency to
show the vowel in suffixation. Consider, for instance, the four suffixes with which more
cases of epenthesis are found (i.e., -ɡar, -bɑn, -ɡɑr, -mɑn). According to Kalbasi (1992),
the first three are productive; in particular, -ɡar and -bɑn are very productive. Kalbasi
considers the suffix -mɑn to be unproductive.42 With respect to productivity of the
suffixes and its relation to epenthesis, it should be noted that there are several other
productive suffixes (again according to Kalbasi 1992) which do not show epenthesis,
such as -kade, -ɡɑh, -ʧe, -dɑn, etc. Thus, no particular pattern is seen with respect to
productivity.
4.5.2. Frequency of suffixes and the suffixed forms
It might also be asked whether the suffixed forms which show epenthesis occur more
frequently than their root on its own. In this case the dominant pattern, the non-epenthesis
pattern, would not affect them as they would have their own pattern due to their being
more frequent than their root. The frequency of these forms is worth studying because
there are sound changes which affect words of low frequency and those of high frequency
differently, as discussed in the literature (e.g., Phillips 1984, Bybee and Hopper 2001,
Antilla 2006, Kang 2003, 2005, 2007). Intuitively, it does not seem that a frequency
effect plays a role in epenthesis in Persian, but to confirm whether this intuition is
correct, I performed a Google search on the words which show epenthesis and their root
to see how frequent they are in order to compare their frequencies.
Googling in Persian turns out to be very unreliable in this regard for the following reason:
it is not possible to limit the Google search to the word under study. I give a few
examples here. When one googles a word like pɑs to be compared to pɑsbɑn ~ pɑsebɑn
‘police officer’ to see which one is more frequent, in addition to the Persian word, the
loan word pɑs ‘pass’, which has entered the language with the meanings ‘pass a ball (in
soccer, etc.)’ and ‘pass a course’, also appears. The same happens when one googles the
Persian word rast the past stem of rastan ‘to find relief’ to be compared to rastɡɑr ~
42 It seems to me that the suffix -mɑn is used in forming new words, but this needs to be investigated. In a recent dictionary, Emami 2006 (recent compared to 1992, when Kalbasi’s book was published), the word ɡoftemɑn ‘dialogue’ (consisting of ɡoft, the past stem of ɡoftan ‘to tell’, followed by -mɑn) is marked as a new word; the dictionary marks new words with an ‘N’ in Persian script.
rasteɡɑr ‘salvaged’. The result of googling rast also includes rost as in rost bif ‘roast
beef’. This is because in Persian the vowels a, e, o are not written (except for rare cases
mostly for beginners, as previously noted in 2.2.2.3). Similarly, the result of mehr ‘kindness’ to be compared to mehrbɑn ~ mehrabɑn ‘kind’ includes the Persian words
mohr ‘stamp’ and mahr ‘a sum of money that the bridegroom undertakes to pay the
bride’. The result of googling the word kɑr ‘work’ to be compared to kɑrɡar ~ kɑreɡar ‘worker’ includes compounds or suffixed forms which contain kɑr.
Another problem is that some of the roots are the present stem of a verb and therefore are
almost never used on their own. For example, ɑmuz in ɑmuzɡɑr ~ ɑmuzeɡɑr ‘teacher’ is
the present stem of the infinitive ɑmuxtan ‘to learn something, to teach something to
somebody’ and is always used in combination with a root or is used as a suffix (compare
ɑmuzɡɑr ~ ɑmuzeɡɑr, in which ɑmuz is the root with dɑneʃ-ɑmuz ‘student’ which
consists of dɑneʃ ‘knowledge’ and ɑmuz). There are some words with more than one
meaning such as sɑz, and the suffixed forms under study here are not attributed to all of
them. In addition, epenthetic vowels are not written. I will not belabor these cases. The
point is that finding root frequency in the Persian cases is not straightforward due to
various problems, some of which were noted above.
I present, in (37), a few examples of the results which I got through googling some words
in their Persian script (Persian google). The googling was done in fall 2009. The suffixed
forms given below all allow epenthesis. I do not include the epenthetic vowel here as it
does not appear in the Persian script. Note that googling at different times or on different
days may give slightly different numbers for a word, so the
numbers are approximate.
(37) The word in Persian script Google search
a. bɑɢ ‘garden’ 2,470,000
bɑɢbɑn ‘gardener’ 125,000
b. pɑs ‘watch, guard duty’ 1,830,000
pɑsbɑn ‘policeman’ 70,000
c. jɑd ‘memory’ 8,270,000
jɑdɡɑr ‘memento’ 2,320,000
d. kɑr ‘work’ 23,700,000
kɑrɡar ‘worker’ 2,190,000
e. arʤ ‘value, appreciation’ 1,080,000
arʤmand ‘valuable, appreciated’ 887,000
f. parvard ‘created, nurtured ‘past stem’’ 22,500
parvardɡɑr ‘creator ‘used for God’’ 485,000
g. sɑz ‘a musical instrument, 6,270,000
the present stem of sɑxtan ‘to build, to structure’,
the present stem of sɑxtan ‘to be compatible with’’
sɑzmɑn ‘organization’ 10,700,000
sɑzɡɑr ‘compatible’ 692,000
The data presented above show different patterns. In (a-e), the frequency of the root is
more than that of the suffixed form. In (f), the direction is the opposite, with the root
less frequent than the suffixed form. In (g), the root is more frequent than one of the
suffixed forms but less frequent than the other. Cases like (g) show that the suffixed
forms can have less or greater frequency than the root. The same result can be obtained
from comparing (a-e) with (f). But basically the problem is that the numbers
themselves, in particular those obtained for roots, are not reliable due to the points
discussed above. Therefore, I do not go through more details about this google search.
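The comparison of root and suffixed-form counts in (37) can be restated as a short sketch. The hit counts are copied from (37); the dictionary layout and the function name `direction` are illustrative choices, and the counts carry all the reliability problems discussed above.

```python
# Hit counts copied from (37): root count, then counts of its suffixed forms.
HITS = {
    "a": (2_470_000, {"bɑɢbɑn": 125_000}),
    "b": (1_830_000, {"pɑsbɑn": 70_000}),
    "c": (8_270_000, {"jɑdɡɑr": 2_320_000}),
    "d": (23_700_000, {"kɑrɡar": 2_190_000}),
    "e": (1_080_000, {"arʤmand": 887_000}),
    "f": (22_500, {"parvardɡɑr": 485_000}),
    "g": (6_270_000, {"sɑzmɑn": 10_700_000, "sɑzɡɑr": 692_000}),
}

def direction(root_hits: int, suffixed_hits: dict) -> str:
    """Classify a root by how its count compares with its suffixed forms."""
    above = sum(1 for n in suffixed_hits.values() if n > root_hits)
    if above == 0:
        return "root more frequent"
    if above == len(suffixed_hits):
        return "root less frequent"
    return "mixed"

PATTERNS = {key: direction(root, forms) for key, (root, forms) in HITS.items()}
for key, pattern in PATTERNS.items():
    print(key, pattern)
```

The three outcomes correspond to the three patterns described for (a-e), (f), and (g).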
The fact that in Persian a, e, o are not written is also problematic. The words pɑsbɑn and
pɑsebɑn are both written the same way (no –e occurs in writing) so unless these two
words are pronounced one cannot tell which one is meant. The same is true for kɑrɡar ~ kɑreɡar ‘worker’, ʃɑdmɑn ~ ʃɑdemɑn ‘joyful’, and all other words with two versions. In
order to be able to distinguish between the two versions, I googled these words in their
so-called Pinglish (Persian-English: Persian words written in English script). Pinglish
has an advantage in this case: the –e is present in written forms so one can tell the
difference between, for example, kɑrɡar and kɑreɡar ‘worker’. The result of this test,
although more helpful than the Persian googling, was not entirely reliable either. For
instance, when one googles ʃɑdemɑn ‘joyful’ to compare it with its without-epenthesis
counterpart ʃɑdmɑn, the English ‘shade man’ also appears in the result.
Another example is kɑrɡar which also includes cases like Kar Gar, the name of a person
on Twitter. The Pinglish googling does not show a particular pattern regarding the
frequency of the with-epenthesis versions as opposed to the without-epenthesis versions.
As seen below, there are some words for which the frequency of the without-epenthesis
version is higher than that of their with-epenthesis counterpart and there are some
words which show the opposite direction. Compare (38) with (39). I first give each
word’s Pinglish and then the IPA transcription.
(38) a. shadman /ʃɑdmɑn/ 529,000 ‘happy, joyful’
shademan /ʃɑdemɑn/ 269,000
b. kargar /kɑrɡar/ 128,000 ‘worker’
karegar /kɑreɡar/ 38,000
(39) a. arjmand /arʤmand/ 41,000 ‘appreciated, valued’
arjomand /arʤomand/ 67,400
b. amuzgar /ɑmuzɡɑr/ 11,500 ‘teacher’
amuzegar /ɑmuzeɡɑr/ 88,500
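One way to put the pairs in (38) and (39) on a common scale is to compute, for each word, the share of hits that belong to the with-epenthesis spelling. The sketch below uses the counts given in (38) and (39); the name `epenthesis_share` and the 0.5 threshold reading are illustrative assumptions, not a claim about how the comparison was originally made.

```python
# Counts copied from (38) and (39): (without epenthesis, with epenthesis).
PAIRS = {
    "shadman/shademan": (529_000, 269_000),
    "kargar/karegar": (128_000, 38_000),
    "arjmand/arjomand": (41_000, 67_400),
    "amuzgar/amuzegar": (11_500, 88_500),
}

def epenthesis_share(without_hits: int, with_hits: int) -> float:
    """Proportion of all hits for a word that show the epenthetic vowel."""
    return with_hits / (without_hits + with_hits)

SHARES = {word: epenthesis_share(*counts) for word, counts in PAIRS.items()}
for word, share in SHARES.items():
    # A share below 0.5 means the without-epenthesis spelling dominates.
    print(f"{word}: {share:.3f}")
```

The two words in (38) fall below one half and the two in (39) above it, which is the lack of a uniform direction noted in the text.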
Note that there is a complication here about Pinglish itself. There is no standard way of
writing in Pinglish. For example, the word ‘shadman’ can also be written ‘shaadmaan’ as
ɑ is sometimes written as aa to be distinguished from a, which is written as a. The
google result for ‘shaadmaan’ is 3,170, and for ‘shaademaan’ is 45. I gave the above
examples with their most common Pinglish. Many of these words which are found in
their Pinglish forms are last names or proper names in general and are therefore more often seen
with one a for both a and ɑ. Similarly, u can be written as oo or ou (e.g., ruz ‘day’ can
be written as rouz or rooz).
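Because Pinglish has no standard spelling, a single transcription corresponds to several possible search strings. The following sketch enumerates them under a deliberately small, assumed mapping: only correspondences attested in the examples of this section are listed (for instance ɑ as a or aa, u as u, oo or ou), and the function name `pinglish_variants` is hypothetical. Segments not in the table are passed through unchanged.

```python
from itertools import product

# Assumed mapping, limited to spellings attested in this section's examples.
SPELLINGS = {
    "ɑ": ("a", "aa"),
    "u": ("u", "oo", "ou"),
    "ʃ": ("sh",),
    "ʤ": ("j",),
    "ʧ": ("ch",),
    "ɢ": ("gh", "g"),
    "ɡ": ("g",),
}

def pinglish_variants(ipa: str) -> set:
    """Return every Pinglish spelling the mapping allows for a transcription."""
    options = [SPELLINGS.get(segment, (segment,)) for segment in ipa]
    return {"".join(choice) for choice in product(*options)}

print(sorted(pinglish_variants("ʃɑdmɑn")))
print(sorted(pinglish_variants("ruz")))
```

Even this restricted mapping yields four spellings for ʃɑdmɑn and three for ruz, so any single Pinglish count reflects only one of several competing spellings.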
Returning to the results of googling, the result of each of the words under study may
include words which were not meant to be counted in the googling results. The fact that
no pattern is observed in the suffixed words with two versions in terms of frequency of
one version being higher than the other version is intuitively confirmed as there are some
words whose with-epenthesis version is more frequent (e.g., ɢahremɑn is intuitively by
far more frequent in speech than ɢahrmɑn and the Pinglish google search confirms this:
‘gahrman’ 2,490, and ‘gahreman’ 21,400). But again, the google search for the Persian
data does not give us reliable results. In addition to the problems pointed out so far, it is
not clear how many of these words in the google search are based on Standard Persian
spoken in Iran, which is the focus in my study, and how many are based on Dari and
Tajik or other related languages and dialects spoken inside or outside Iran.
Another search that I did through googling was to compare suffixed words with the same
root but different suffixes, some of which are given below, to see if a pattern in terms of
frequency is observed. That is, I consider two suffixed forms for a root. One of these
suffixed forms shows epenthesis and the other does not. This is done to see whether a
particular pattern is observed between the forms with epenthesis and the ones without it. I
did this both with google search with Persian script (that is the method I used for (37)), as
well as with the Pinglish search (as in (38) and (39)). (40) and (41) show the Persian
google search, and (42) and (43) show the Pinglish search of the same words. Note that
the suffixed form given in (a) for each word is one which can take epenthesis and the
suffixed forms given in (b), (c), etc. are those which do not show epenthesis. For the
Pinglish results, I show two numbers. The first one is the number for the form without
epenthesis and the second one is the one for the form with the epenthetic vowel.
(40) bɑɢ Google search (in Persian script)
a. bɑɢbɑn/ bɑɢebɑn ‘gardener’ 125,000
b. bɑɢdɑr ‘sb who owns a garden’ 24,500
c. bɑɢʧe ‘small garden’ 818,000
(41) kɑr Google search (in Persian script)
a. kɑrɡar/kɑreɡar ‘worker’ 2,190,000
b. kɑrmand ‘employee’ 1,550,000
c. kɑrɡɑh ‘atelier, workshop’ 2,320,000
d. kɑrdɑr ‘chargé d’affaires’ 45,800
(42) bɑɢ Pinglish search
a. bɑɢbɑn/ bɑɢebɑn ‘gardener’ baghban/bagheban 144,000/3,810
b. bɑɢdɑr ‘sb who owns a garden’ baghdar 9,050
c. bɑɢʧe ‘small garden’ baghche/baghcheh43 48,200/29,200
(43) kɑr Pinglish search
a. kɑrɡar/kɑreɡar ‘worker’ kargar/karegar 128,000/38,000
b. kɑrmand ‘employee’ karmand 50,700
c. kɑrɡɑh ‘atelier, workshop’ kargah 38,900
d. kɑrdɑr ‘chargé d’affaires’ kardar 115,000
As the examples show, there is no correlation between the frequency of the words and the
possibility of the occurrence of epenthesis. For example, in (40), the word
bɑɢbɑn/bɑɢebɑn is more frequent than one of the words which do not show epenthesis
(i.e., bɑɢdɑr) but less frequent than the other which does not show epenthesis (i.e.,
bɑɢʧe).
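The same point can be checked by ranking the epenthesis-allowing form among the other suffixed forms of its root. The sketch below uses only the Persian-script counts from (40) and (41); the names `FORMS`, `EPENTHESIS_FORM`, and `rank` are illustrative.

```python
# Persian-script counts copied from (40) and (41).
FORMS = {
    "bɑɢ": {"bɑɢbɑn": 125_000, "bɑɢdɑr": 24_500, "bɑɢʧe": 818_000},
    "kɑr": {"kɑrɡar": 2_190_000, "kɑrmand": 1_550_000,
            "kɑrɡɑh": 2_320_000, "kɑrdɑr": 45_800},
}
# The form listed under (a) for each root, which can take epenthesis.
EPENTHESIS_FORM = {"bɑɢ": "bɑɢbɑn", "kɑr": "kɑrɡar"}

def rank(forms: dict, target: str) -> int:
    """Rank of target by count among its root's forms (1 = most frequent)."""
    ordered = sorted(forms, key=forms.get, reverse=True)
    return ordered.index(target) + 1

RANKS = {root: rank(FORMS[root], EPENTHESIS_FORM[root]) for root in FORMS}
for root, position in RANKS.items():
    print(root, EPENTHESIS_FORM[root], f"rank {position} of {len(FORMS[root])}")
```

In both sets the epenthesis-allowing form is neither the most nor the least frequent form of its root.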
I should note that I also looked at Persian corpora (e.g., Hamshahri corpus (2008),
Bijankhan corpus (2007)). None of them could be considered as a source for my search
because they were based on limited data and also on written Persian (e.g., magazines,
newspapers, etc.), which is not my main focus in this study and as discussed, written
Persian does not show the epenthetic e.
43 In Persian, the words which end in –e are written with a silent –h at the end. In Pinglish, both with-silent h and without-h versions are observed so I searched both versions for the Pinglish of the word bɑɢʧe.
As shown, one cannot build a strong argument based on frequency obtained through a
google search and Persian corpora for epenthesis in suffixation.
So far I have discussed suffixes and the suffixed forms synchronically and shown that
types of clusters, productivity of suffixes and frequency do not provide an account for
epenthesis. Next I examine epenthesis from an historical viewpoint. I argue that historical
investigation offers an account for some suffixed forms which show epenthesis but
cannot fully explain the process.
4.6. Epenthesis in suffixation: a historical perspective
In this section I look into the historical background of the suffixed forms which show
epenthesis. Given that, as I will show below, there are present-day consonant-initial
suffixes which were historically vowel-initial, one may ask whether the suffixes with
which epenthesis may occur today had –e in their initial position in their historical forms,
and then lost their initial –e and became consonant-initial. Under this account, the –e
which is observed today is a consequence of the historical form of these suffixes. This is
a possible explanation for epenthesis and therefore is worth studying.
I argue that the occurrence of –e cannot be fully explained historically. That is, looking at
the former forms of these suffixes in Middle Persian, it is not the case that the suffixes
which show –e today had –e in their initial position in the previous stages of the
language.
4.6.1. An historical investigation of suffixes and epenthesis in suffixed forms
In this section, I discuss historical forms of suffixes based on the following sources:
Kalbasi (1992), Farahvashi’s Middle Persian to Modern Persian dictionary (1967), and
Farahvashi’s Modern Persian to Middle Persian dictionary (1973). In Middle Persian, as
in Modern Persian, the vowels a, e, o were not shown in the script. Thus we cannot be
entirely sure about the transcriptions given for the words where these vowels are
involved. This uncertainty is observed in various versions of the same word or suffix
found in different literature on Middle Persian, as seen below. In addition, not every word
in Middle Persian has a correspondent in Modern Persian; neither does every word in
Modern Persian have a correspondent in Middle Persian. Nonetheless, I look at these
sources to see what I can find with respect to suffixation. I first examine the suffixes to
present a general picture of the status of the suffixes in Middle Persian and its correlation
with their present status. As noted above, the reason that the earlier forms of these
suffixes are worthy of study is the existence of a vowel in initial position in some
suffixes. After some general observations about the suffixes, I will discuss the specific
cases which may show epenthesis in Modern Persian to see if I find an explanation for
epenthesis by comparing these particular words with their former correspondents. As
argued below, it seems that there is no correlation between the occurrence of the vowel
before the suffix in the present time and its occurrence in the past. It suffices to give some
examples of these suffixes.
According to Kalbasi (1992), the present suffix -zɑr was formerly -ezār/-iʧār/-ēʧār. This
suffix does not show the initial vowel today, as in (44) and (45).
(44) Middle Persian Modern Persian
kār-ezār / kār-iʧār / kār-ēʧār kɑr-zɑr ‘the field for battle’
The word kɑr means ‘work’ today. Historically, it meant ‘work’ and ‘battle’. In the word
kɑr-zɑr the meaning of kɑr as ‘battle’ is retained.
However, note that in addition to the above example, Farahvashi’s Middle Persian
dictionary shows cases of the suffix -zɑr as former -zār.
(45) Middle Persian Modern Persian
a. ɡul-zār ɡol-zɑr ‘flower field’ (ɡol ‘flower’)
b. kiʃt-zār keʃt-zɑr ‘farm’ (keʃt ‘planting’)
With respect to the present suffix -zɑr, it seems that it had both vowel-initial and
consonant-initial versions in Middle Persian. In Modern Persian, no vowel is seen before
this suffix even in the case where there was an initial vowel in Middle Persian.
Now consider the suffix -nɑk. The present suffix -nɑk was formerly -ēnɑk but it does not
show the initial vowel today.
(46) Middle Persian Modern Persian
a. tars-ēnɑk tars.nɑk ‘scary’ (tars ‘fear’)
b. bīm-ēnɑk bim.nɑk ‘fearful’ (bim ‘fear’)
Now consider the following suffixes which did not have an initial vowel in Middle
Persian but may show the –e today:
The present suffix -bɑn had the form -bān/-pān in Middle Persian but it may show an
epenthetic vowel today, as in (47):
(47) Middle Persian Modern Persian
a. bāɣ-pān bɑɢ-bɑn ~ bɑɢ-e-bɑn ‘gardener’ (bɑɢ ‘garden’)
b. pās-pān pɑs.bɑn ~ pɑs-e-bɑn ‘policeman’ (pɑs ‘watch,
guard duty’)
There are cases such as the following in which the vowel occurs with -bɑn neither in its
earlier version nor in its current one.
(48) Middle Persian Modern Persian
a. myazd-pān miz-bɑn ‘host/ess’ (miz ‘table’)44
b. marz-pān marz-bɑn ‘border guard’ (marz ‘border’)
Thus, as the data in (44)-(48) show, there is not necessarily a relation between the
occurrence of the vowel before the suffix in the present time and its presence in the
earlier form. That is, there are some suffixes which were vowel initial but they do not
show the vowel today. There were some which were not vowel initial but show the vowel
today. And there are some suffixes which show the vowel neither historically nor
synchronically. This is the general case with suffixes.
Now let us examine some specific words which show epenthesis today and trace them
back to Middle Persian. Looking at the actual words found in the two dictionaries by
Farahvashi, the following cases are found. There are words with a vowel at the stem-
suffix boundary, there are cases where there is not a vowel, and there are cases where
both are found. It is possible that more variation might have existed, but was not recorded
in the sources.
There are some words which show the occurrence of epenthesis at a stem-suffix
boundary and it seems that this is a reflection of the historical situation. For example,
consider (49). I show the morphology and the epenthetic vowel for the current form of
these words in parentheses. The morphology of the historical form of these words will be
discussed shortly.
(49) Middle Persian Modern Persian
a. arʤōmand arʤomand ‘valued’ (arʤ-o-mand)
b. kārīɡar/kārīkar kɑreɡar ‘worker’ (kɑr-e-ɡar)
44 The word myazd in Middle Persian meant anything edible which is put in religious ceremonies for people to eat (Farahvashi’s dictionary).
In Modern Persian, these cases are interpreted as follows. In the case of arʤomand, it is
arʤ ‘value’ followed by the suffix mand, and therefore o is considered as an epenthetic
vowel. In Middle Persian, this suffix was in fact ōmand, which changed to mand over
time. See the following examples:
(50) Middle Persian Modern Persian
a. xrat-ōmand xerad-mand ‘wise’ (xerad ‘wisdom’)
b. dard-ōmand dard-mand ‘suffered’ (dard ‘pain, suffer’)
c. hunar-ōmand honar-mand ‘artist’ (honar ‘art’)
d. sūt-ōmand sud-mand ‘beneficial’ (sud ‘benefit’)
e. hōʃ-ōmand huʃ-mand ‘intelligent’ (huʃ ‘intelligence’)
f. nijāz-ōmand nijɑz-mand ‘needy’ (nijɑz ‘need’)
g. kār-ōmand kɑr-mand ‘employee’ (kɑr ‘work’)
It seems reasonable to consider the vowel at the suffix boundary in arʤomand to be a
historical residue.
As for the case of kɑreɡar ‘worker’, it is now considered to consist of kɑr ‘work’
followed by -ɡar, and therefore e is considered to be an epenthetic vowel. As noted
above, this word was kārīɡar/kārīkar (the suffix -ɡar is also seen as -kar in Middle
Persian). But the word kɑr ‘work’ was formerly kār, so the question is: why is there a
vowel ī between kār and ɡar in kārīɡar? Did the suffix have the vowel ī in its initial
position, as was the case with *ōmand > mand? Middle Persian words such as the following
suggest that the suffix was ɡar/kar in Middle Persian (and not īɡar/īkar).
(51) Middle Persian Modern Persian
a. dāt-kar dɑd-ɡar ‘just’ (dɑd ‘justice’)
b. āmār-kar ɑmɑr-ɡar ‘statistician’ (ɑmɑr ‘statistics’)
c. āhan-ɡar ɑhan-ɡar ‘blacksmith’ (ɑhan ‘iron’)
d. zar-ɡar zar-ɡar ‘goldsmith’ (zar ‘gold’)
e. tuwān-ɡar tavɑn-ɡar ‘rich’ (tavɑn ‘power’)
None of these words have a vowel between the root and the suffix in Modern Persian.
According to Farahvashi’s dictionary, the structure of the word kārīɡar/kārīkar was kār-īk(īɡ)-ar. Note that īk/īɡ was the adjective-forming suffix, which is –i today (e.g., bɑd ‘wind’; bɑdi ‘windy’). Middle Persian kārīk(īɡ) (kār + īk) ‘hard working’ is kɑri (kɑr + i) today. The suffix –ar, according to Kalbasi, is unproductive today; it is
seen in the word anɡoʃt-ar ‘ring’ (anɡoʃt ‘finger’). She refers to Moin’s dictionary,
which states that –ar was formerly -arīɡ. If these interpretations are correct, then the structure
we assume for kɑrɡar today, namely kɑr + ɡar, is different from what the word
is thought to have been historically.
It is important to note that both arʤomand and kɑreɡar have forms which do not show
the vowel at the suffix boundary; that is, arʤ-mand and kɑr-ɡar. These cases where no
vowel is seen at the suffix boundary can be considered as reanalysis of the suffixation
process. That is, since the dominant pattern in Modern Persian is to have no vowel at the
suffix boundary, the vowel, which is a historical residue in these cases, can be eliminated. Thus,
both versions are possible.
There are cases which have both versions today and, as the Middle Persian dictionaries
show, they were variable in Middle Persian too (kār ‘work, battle’). For example:
‘treasure’ (formerly also ɡanʤ). The root varz ‘farm’, now barz, is used only in suffixed
form.
In Modern Persian, the suffixed forms of some words with the suffix –var are as follows.
As (55) shows, no epenthesis occurs.
(55) Modern Persian
a. soxan-var ‘fine speaker’ soxan ‘speech’
b. nɑm-var ‘famous’ nɑm ‘name, fame’
c. bɑr-var ‘fruitful’ bɑr ‘fruit’
So the presence of the vowel in the word ʤɑnevar ‘animal’ (ʤɑn ‘soul, life’+var) is best
considered to be a historical residue.45
The cases that were just discussed, including kɑrevɑn ‘caravan’, arʤomand ‘valuable’,
kɑreɡar ‘worker’, and ʤɑnevar ‘animal’, may suggest that the epenthesis cases today
can be traced back to Middle Persian and therefore they are best treated as historical
residues. More examples in favor of this hypothesis are as follows:
(56) Middle Persian Modern Persian
a. mitrā-pān /mihr-bān mehr-a-bɑn ~ mehr-bɑn ‘kind’
(mehr ‘kindness’)
b. dūtak-mān dud-e-mɑn ~ dud-mɑn ‘family’
In the case of *dutak-mān > dud-e-mɑn, I need to note that final -ak/aɡ in Middle
Persian changed to –e in Modern Persian (*dutak ‘family, lineage’ > dude). The form
dude is not used on its own today (if it is used it is in poetry or literary texts).
However, there are cases with the vowel at the suffix boundary in the present time which
cannot be historically explained. First there are words which did not have the vowel in
Middle Persian and now they may show it, as (57) shows.
45 The word ʤɑn followed by var is morphologically possible. In the experiment I did on epenthesis in suffixation (see 4.7), a speaker asked whether by ʤɑn + var (to be put together to form a suffixed form) I meant the word for ‘animal’, ʤɑnevar, or ʤɑn + var, which could potentially result in ʤɑnvar, a possible last name, for instance, as the speaker noted.
(57) Middle Persian Modern Persian
a. rōʧ-kār ruz-ɡɑr ~ ruz-e-ɡɑr ‘times’ (ruz ‘day’)
b. āmōʒ-kār ɑmuz-ɡɑr ~ ɑmuz-e-ɡɑr ‘teacher’
c. ajāt-kār jɑd-ɡɑr ~ jɑd-e-ɡɑr ‘memento’ (jɑd ‘memory’)
d. bāɣ-pān bɑɢ-bɑn ~ bɑɢ-e-bɑn ‘gardener’ (bɑɢ ‘garden’)
e. pās-pān pɑs.bɑn ~ pɑs-e-bɑn ‘policeman’ (pɑs ‘watch,
guard duty’)
Some of the words which may show the vowel at the suffix boundary today are not found
in Middle Persian dictionaries, such as:
(58) a. ʃɑd-mɑn ~ ʃɑd-e-mɑn ‘happy’
b. bɑd-bɑn ~ bɑd-e-bɑn ‘sail’
Again, as mentioned already, there is a tendency with some suffixes to show epenthesis.
These include -mɑn and -ɡɑr and, to a lesser extent, -ɡar and -bɑn. None of these
suffixes were vowel-initial in Middle Persian. In fact, some suffixes like –ēnāk, which
was vowel-initial, do not show the vowel today. But note that one cannot argue that there
is an inverse correlation between the former form of the suffixes and their current form,
as there are suffixes which show the vowel neither historically nor synchronically, as in
(59):
(59) Middle Persian Modern Persian
a. xiʃm-ɡēn / hiʃm-ɡēn xaʃm-ɡin ‘angry’ (xaʃm ‘anger’)
b. andōh-ɡēn anduh-ɡin ‘sad’ (anduh ‘sadness’)
c. nikun-sār neɡun-sɑr ‘overthrown’46
d. dast-ɡīr dast-ɡir ‘arrested’ (dast ‘hand’)
To sum up, while considering the historical status of the suffixes is informative for some
cases where synchronically one does not expect to get the vowel at stem-suffix boundary
because the suffix is not usually preceded by the vowel (e.g., the suffix var does not show
epenthesis but there is ʤɑnevar ‘animal’), the synchronic status of epenthesis in
suffixation needs to be considered in order to account for other cases.
4.6.2. A note on epenthesis within stems
The historical insertion or deletion of a vowel where there is a consonant cluster is also
observed within stems. The focus here is not on this sort of within-stem historical
change, so I keep the discussion brief. Initial consonant clusters are forbidden in Modern
Persian, but were acceptable in Middle Persian. (60) presents some examples of the
insertion of a vowel in order to break up the former initial consonant clusters.
(60) Middle Persian Modern Persian
a. frēp farib ‘deception’
b. xrat xerad ‘wisdom’
c. ʃkanʤak ʃekanʤe ‘torture’
d. spās sepɑs ‘appreciation’
46 The word neɡun is used today in compounds or suffixed forms, such as neɡun-sɑr ‘overthrown’, sar-neɡun ‘overthrown’, neɡun-baxt ‘unlucky’.
In medial position, the situation is not as consistent as in initial position since there are
cases where a vowel has been inserted, but also there are cases, though few, where the
vowel has been deleted, creating consonant clusters, as shown below:
(61) Middle Persian Modern Persian
vahuman / vohuman bahman ‘avalanche’
Compare this with the following:
(62) Middle Persian Modern Persian
a. āfrīn ɑfarin ‘bravo’
b. ārzōk ɑrezu ‘wish’
There are also cases where medial consonant clusters are tolerated within stems today,
as they were in the past. (63) presents some examples:
(63) Middle Persian Modern Persian
a. almāst almɑs ‘diamond’
b. narɡis narɡes ‘daffodil’
c. nifrīn nefrin ‘curse’
Sometimes a medial consonant cluster was created as a result of metathesis in initial
position in order to break up the initial consonant clusters which are absolutely forbidden
in the language today.
(64) Middle Persian Modern Persian
a. frahanɡ farhanɡ ‘culture’
b. framān farmɑn ‘command’
c. frazand farzand ‘offspring’
There are also cases where a stem has two pronunciations in today’s speech, one with a
vowel so the medial consonant cluster is broken up, and the other without the vowel so
the cluster is tolerated.47 See (65):
(65) Middle Persian Modern Persian
a. āʃnāk ɑʃnɑ ~ ɑʃenɑ ‘familiar’
b. āʃkrāk ɑʃkɑr ~ ɑʃekɑr ‘apparent’
Although in initial position there is a pattern from Middle Persian to Modern Persian with
respect to consonant clusters, in medial position this is not the case. In final position, both
in Middle Persian and in Modern Persian consonant clusters are allowed. Some examples
are presented in (66).
(66) Middle Persian Modern Persian
a. dōst dust ‘friend’
b. narm narm ‘soft’
c. napart nabard ‘war’
d. ɡanʤ ɡanʤ ‘treasure’
47 Lazard (1992) also gives examples of synchronic variation within stems (e.g., ɑftɑb/ɑfetɑb ‘sunshine’).
To sum up, the language permits medial consonant clusters, whether within stems or
across suffix boundaries.
The historical investigation of suffixes and suffixed forms cannot fully provide an
account for epenthesis, which occurs in some particular cases and not in others. Thus a
synchronic investigation needs to be done to explain epenthesis in suffixation.
In order to provide further insight into the synchronic status of epenthesis, I carried out an
experiment, discussed in the following section.
4.7. Epenthesis in suffixation: an experiment
In this section, I discuss an experiment on suffixation which was done in order to
determine whether epenthesis is synchronically productive. The experiment was designed
with two goals in mind. The first was to test the frequency of occurrence of epenthesis in
real words. The second was to see how epenthesis applies in made-up words with the
structure of real words, as the absence or presence of epenthesis in made-up words can
shed light on the structure of Persian vowels.
Let us first review the structures in which epenthesis may or may not occur. Recall that
while epenthesis is never found in the CVlaxC environment, it is possible but not required
in the other environments. Compare (67i) and (67ii):
(67) (i) a. mehr ‘kindness’+bɑn mehrabɑn ~ mehrbɑn ‘kind’ CVlaxCC
b. kɑr ‘work’+ ɡar kɑreɡar ~ kɑrɡar ‘worker’ CVtenseC
c. xɑst ‘desire’ + ɡɑr xɑsteɡɑr ~ xɑstɡɑr ‘suitor’ CVtenseCC
(ii) a. xaʃm ‘anger’ + nɑk xaʃmnɑk ‘angry’ (*xaʃmenɑk) CVlaxCC
b. dɑd ‘justice’ + ɡar dɑdɡar ‘just’ (*dɑdeɡar) CVtenseC
c. (ʔ)ist ‘stop’+ ɡɑh (ʔ)istɡɑh ‘station’ (*(ʔ)isteɡɑh) CVtenseCC
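The structural labels used in (67) can be derived mechanically from a transcribed root. The following Python sketch is only an illustration, not part of the analysis: the names `root_structure` and `epenthesis_excluded` are hypothetical, the lax set (a, e, o) and the tense set (ɑ, i, u) follow the vowel classes assumed in this chapter, and every other symbol is treated as a single consonant, which presupposes that affricates are written with one character (ʧ, ʤ).

```python
LAX = set("aeo")      # lax vowels, as assumed in this chapter
TENSE = set("ɑiu")    # tense vowels, as assumed in this chapter
VOWELS = LAX | TENSE


def root_structure(root: str) -> str:
    """Label a monosyllabic root by its vowel class and final consonants.

    Hypothetical helper: every non-vowel symbol counts as one consonant.
    """
    segments = [s for s in root if s not in "()"]  # drop optional-segment brackets
    vowel_positions = [i for i, s in enumerate(segments) if s in VOWELS]
    if len(vowel_positions) != 1:
        raise ValueError(f"expected exactly one vowel in {root!r}")
    v = vowel_positions[0]
    vowel_class = "lax" if segments[v] in LAX else "tense"
    coda = len(segments) - v - 1  # number of consonants after the vowel
    if coda not in (1, 2):
        raise ValueError(f"expected one or two final consonants in {root!r}")
    return f"CV{vowel_class}" + "C" * coda


def epenthesis_excluded(root: str) -> bool:
    """True for CVlaxC roots, the environment where epenthesis is never found."""
    return root_structure(root) == "CVlaxC"


for root in ["mehr", "kɑr", "xɑst", "xaʃm", "dɑd", "(ʔ)ist"]:
    print(root, root_structure(root))
```

Note that the classifier only separates CVlaxC from the other three structures; it cannot separate (67i) from (67ii), since the roots in both sets receive the same labels.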
The words in (67i) and (67ii) share the same structures (i.e., structures other than
CVlaxC), but do not pattern identically with respect to epenthesis. If the occurrence of
epenthesis is motivated by the properties of vowels and the difference in the syllable
structure is due to these properties, why does epenthesis not occur with all cases which
include roots with those vowels/syllable structures? In order to answer this question, in
section 4.5 I examined a variety of factors including frequency, productivity of the
suffixes, and type of clusters created at a stem-suffix boundary, and found that they do
not provide an account for the variations in epenthesis. I also investigated the historical
status of the suffixes and the suffixed forms involved and concluded that epenthesis
cannot fully be accounted for based on historical facts (section 4.6). As noted above, in
order to determine whether epenthesis is productive in the language, I conducted an
experiment with 10 native speakers of Persian, in both production and perception. The
production part included three tasks: reading, question and answer, and a wug test. The
perception part included an acceptability rating task.
The results of the experiment show that in production the absence of epenthesis under
suffixation is the dominant pattern in Persian regardless of the nature of vowels and
syllable structure. For those words which can take epenthesis, usually both the epenthetic
version and the non-epenthetic version are acceptable. The cases where epenthesis occurs
form a limited number of frozen cases. In perception, the absence of epenthesis is the
favored pattern, leaving aside the limited set of real words, which have both versions. If
the speaker accepts the version with epenthesis, this is more likely with words with
structures other than CVlaxC. I will return to this after discussing the experiment.
I will first discuss methodology. Then different tasks of the experiment will be explained
and their results will be presented. Afterwards, I will present a summary and discussion
of the experiment.
4.7.1. Methodology
The experiment was conducted with 10 native speakers of Persian, three men and seven
women, between 23 and 61 years of age.48 Eight of these people live in Canada; two of
them live in Iran and were in Canada for a trip at the time I conducted the
experiment. I abstract away from sociolinguistic factors such as age and gender in
analyzing this experiment, as a sociolinguistic investigation of this process is not a goal
of this study.
The experiment consisted of four tasks, as follows:
(i) Task 1: Production (reading) (real words)
(ii) Task 2: Production (question and answer) (real and made-up words)
(iii) Task 3: Production (wug test) (made-up words)
(iv) Task 4: Perception (acceptability rating) (real and made-up words)
I consider acceptability rating as a way of evaluating perception. Thus when I use the
term ‘perception’ for task 4 (as opposed to the first three tasks which are ‘production’
tasks), I mean acceptability rating.
See Appendix 3 for a complete version of the experiment with all the tasks and tokens.
The experiment includes testing three different processes, all related to suffixation, one of
which is discussed in this section. The other two processes will be discussed in chapters 7
and 8. In total, in the whole experiment there were 82 tokens for task 1, 137 tokens for
task 2, 96 tokens for task 3, and 182 tokens for task 4 (497 tokens in total). It took about
an hour for each participant to complete the experiment. The advantage of mixing up the
48 I am grateful to Ron Smyth for his advice and guidance on the experiment and statistics. Many thanks to my participants (in the order of participating in the experiment): Ali Esmaeili (AE), Farideh Tajik (FT), Shadi Farshadfar (SF), Vahid Danaee (VD), Shery Shahabi (SS), an anonymous participant (AA), Rana Mohammad Esmaeil (RM), Poopak Haghi (PH), Nikou ParvinNejad (NP), Lili ParvinNejad (LP). Their help and willingness to participate in my experiment are much appreciated. I will use their initials throughout.
words which were studied for three different processes was that the speakers could not
figure out what I was looking for. In tasks 2, 3, and 4, the tokens were randomized
through random.org. As I go through each process which was tested through different
tasks, I will refer the readers to specific appendices related to the process and the task.
Appendix 10 contains all the real words used for the experiments. In this chapter, I focus
on epenthesis in suffixation.
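The pooling and randomization of tokens described above can be sketched as follows. This is an illustration under stated assumptions only: the randomization in the experiment was done through random.org, whereas the sketch substitutes Python's pseudo-random `random` module; the function `mixed_order`, the seed, and the process labels are hypothetical; and the tokens are placeholders, not the actual items listed in the appendices.

```python
import random


def mixed_order(tokens_by_process, seed=0):
    """Pool the tokens of all processes and return them in one shuffled order.

    Stand-in for the random.org randomization; `seed` only makes the
    sketch reproducible.
    """
    pooled = [
        (process, token)
        for process, tokens in tokens_by_process.items()
        for token in tokens
    ]
    rng = random.Random(seed)
    rng.shuffle(pooled)
    return pooled


# Placeholder items, not the real experimental tokens.
tokens_by_process = {
    "epenthesis": ["E1", "E2", "E3"],
    "process_2": ["P1", "P2", "P3"],
    "process_3": ["Q1", "Q2", "Q3"],
}
order = mixed_order(tokens_by_process)
print(order)
```

Because the items of the three processes are interleaved in one list, no block of consecutive trials isolates a single process.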
I now return to the four tasks introduced above. In the reading task, two texts were
provided for the speakers to read. The texts contained several cases of suffixed forms.
The reading task was followed by a question and answer task. In this task, speakers saw a
word and a suffix on the screen and they were asked to put these together to make a
suffixed form, and to say the word they made in answer to a question. The next task was
a wug test. Speakers were asked to make a suffixed form with a made-up word and a real
suffix and put it in a sentence to be read. These were the three production tasks. The last
task involved perception rating. Speakers were asked to rate the acceptability of a
suffixed form they heard.
For recording the speakers I used an Olympus digital voice recorder WS-500M and a
cyber acoustics microphone (CVL-1124R-CW). The sounds were recorded in WMA
format stereo with bit rate 128 Kbps.
I now elaborate on the tasks.
4.7.2. Task 1: Production (reading)
I provided two texts in Persian to be read by the speakers. See Appendix 4 for the two
texts in Persian script and their translations. The appendix also shows the suffixed words
in the two texts. One of the texts was more formal and the other more informal. The
reason to consider two texts with difference in level of formality was to see if there is a
difference in the use of epenthesis when the text is less formal and therefore includes
processes found in informal speech such as raising before nasal consonants (discussed in
section 3.8). Note that the level of formality in itself was not of interest in this study,
which is focused on informal daily speech. I took level of formality into consideration in
preparing the reading texts because the other processes which are dependent on level of
formality may affect epenthesis. Also there are some words which have versions both
with and without epenthesis, where one is used more often in daily speech and the other
is heard more in formal speech, poetry, etc. I will show examples of this later. Both
texts had several suffixed words, including those words which can occur with epenthesis
and those which cannot. Some words were repeated more than once.
I put the reading task as the first task of the experiment because it is natural for the
speakers to read in their native language so it was a good warm-up to start the experiment
and participants read the texts without any sense of the purpose of the experiment. The
advantages of having a reading task were as follows. First, it showed me the basic pattern
for each speaker. This allowed me to determine whether a speaker tends to use the non-
epenthetic version of words in general or whether a particular word is preferred without
epenthesis. Second, the reading task showed me intra-speaker variation as some of the
words were repeated so it was possible to see whether the same word is used by the same
speaker once with and the other time without the epenthetic vowel. Third, it showed in
general how much tendency towards epenthesis exists in the language, that is, in cases
where there is optional epenthesis how much the epenthesis occurs.
Let us now discuss the results. The results for the words with which one can get
epenthesis (CVtenseC, CVtenseCC, and CVlaxCC) are given below:
(68) The result of epenthesis-possible cases (“E” stands for epenthesis; without-epenthesis
versions are shown in the “No E” column and with-epenthesis versions in the “With E”
column; “Misc” stands for miscellaneous).
No E            With E          Misc    Total
187 (47.95%)    203 (52.05%)    0       390 (100%)
To clarify, the ‘No E’ column should be read as: there were 390 tokens in total (Total),
out of which 187 were pronounced without epenthesis (No E), or 47.95% while 203, or
52.05%, showed epenthesis (With E). As (68) shows, there is not much difference
between the epenthesis-including version and the non-epenthesis version. Note that I
abstract away from internal differences among the structures which show epenthesis (e.g.,
whether epenthesis is induced more by a tense vowel before a C or by a tense vowel before
a CC). For my purpose, the internal differences do not really matter. The important point
is whether epenthesis can occur or not.
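The percentages in (68) follow directly from the token counts. A minimal Python sketch of the arithmetic is given here; the function name `shares` is hypothetical, and only the counts are taken from the table.

```python
def shares(counts):
    """Return each count as a percentage of the total, rounded to two decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Counts reported in (68) for the reading task.
table_68 = {"No E": 187, "With E": 203, "Misc": 0}
print(sum(table_68.values()))  # total number of tokens
print(shares(table_68))
```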
It is interesting to look at a few examples. Recall that there are words with both
epenthetic and non-epenthetic versions but one version is more common than the other.
In the experiment, this is seen in, for example, mehrbɑn ~ mehrabɑn ‘kind’ and sɑzmɑn ~ sɑzemɑn ‘organization’. Out of 20 repetitions of mehrbɑn ~ mehrabɑn ‘kind’ we see
mehrabɑn 18 times and mehrbɑn 2 times (this is expected because mehrbɑn is usually
used in poetry or very formal or literary speech). Also sɑzmɑn ~ sɑzemɑn ‘organization’
shows 16 cases of sɑzmɑn and 4 cases of sɑzemɑn. There are words which can have
both versions and they are used (almost) equally, for example, arʤmand ~ arʤomand
‘valued’ shows 10 occurrences of arʤmand and 10 occurrences of arʤomand out of 20
repetitions. As for intra-speaker variation, a speaker may pronounce a word once with
and another time without epenthesis, showing that a speaker can interchangeably use both
versions. For example, one of the speakers (FT) said arʤmand once and arʤomand the
other time; another speaker (SF) said sɑzmɑn and ruzɡɑr once and sɑzemɑn and ruzeɡɑr the other time; and another speaker (NP) said pɑsbɑn once and pɑsebɑn the other time.
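Intra-speaker variation of this kind can be identified by checking, for each speaker and word, whether both versions were produced. The sketch below is illustrative: the record layout and the function `intra_speaker_variation` are assumptions, and the records encode only the cases mentioned in the paragraph above, not the full data set.

```python
from collections import defaultdict

# (speaker, word pair, produced with epenthesis?) for the cases mentioned above.
observations = [
    ("FT", "arʤmand ~ arʤomand", False),
    ("FT", "arʤmand ~ arʤomand", True),
    ("SF", "sɑzmɑn ~ sɑzemɑn", False),
    ("SF", "sɑzmɑn ~ sɑzemɑn", True),
    ("SF", "ruzɡɑr ~ ruzeɡɑr", False),
    ("SF", "ruzɡɑr ~ ruzeɡɑr", True),
    ("NP", "pɑsbɑn ~ pɑsebɑn", False),
    ("NP", "pɑsbɑn ~ pɑsebɑn", True),
]


def intra_speaker_variation(records):
    """Return (speaker, word) pairs produced both with and without epenthesis."""
    seen = defaultdict(set)
    for speaker, word, with_e in records:
        seen[(speaker, word)].add(with_e)
    return sorted(key for key, variants in seen.items() if variants == {True, False})


print(intra_speaker_variation(observations))
```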
I now move on to the non-epenthesis cases. Note that the words with which one does not
expect to get epenthesis include the words with the CVlaxC structure as well as those with
other structures (CVlaxCC, CVtenseC, CVtenseCC); recall that many words in the language
have one of these three structures but do not take epenthesis. The results for the words
with which epenthesis is not expected are as follows:
(69) The result of non-epenthesis cases
No E            With E       Misc    Total
409 (99.76%)    1 (0.24%)    0       410 (100%)
As (69) shows, out of 410 tokens, 409 were pronounced without any epenthesis, that is,
99.76%. Only one speaker said one word, which is considered to be among the non-
epenthetic words, with epenthesis, and that word is pɑs-ɡɑh in which the root has the
CVtenseC structure. The rest of the words were pronounced without epenthesis by all
speakers.
The results of this task, which includes only real words as it was a reading task, strongly
confirm that there is a set of words which can take epenthesis and there are words which
cannot take epenthesis although they share the same root structure. Among those words
which can take epenthesis, there is no particular tendency towards using or not using
epenthesis, considering the overall result 47.95% vs. 52.05%.
4.7.3. Task 2: Production (question and answer)
This production task includes both real and made-up words (in addition to Appendix 3,
which is on the experiment in general, Appendix 5 is related to this task in particular).
Participants were seated in front of a computer. They heard a question and saw a slide on
the screen which consisted of ‘a root + a suffix’. They were asked to put together the root
and the suffix, and put the word they made in a blank space in a frame sentence they saw
on the screen. The question and the frame sentence are given below:
(70) The question: ali ʧi ɡoft?
What did Ali say? (recorded from before)
The frame sentence: fekr mikonam ɡoft…..
I think he said……… (participant’s response)
The speakers could choose to say the whole frame sentence or just the word they made.
There was training with both real and made-up words before doing this task.
An example of what participants saw on the screen is given below:
(71) kɑr + ɡɑh (‘work’ + ɡɑh) (written in Persian script)
fekr mikonam ɡoft….. (I think he said………) (written in Persian script)
Participant’s oral response (in this case: kɑrɡɑh) fits in the blank part.
The words which were examined are categorized into three groups: (i) real words with
which epenthesis is possible; (ii) real words with which it is not expected to see
epenthesis; (iii) made-up words. I did not separate the made-up words into two groups as
with-epenthesis and non-epenthesis since, as seen below, non-epenthesis is by far the
dominant pattern regardless of the structure of the roots.
Now let us look at each of the three categories of words one by one: real words in which
epenthesis is possible, real words in which epenthesis is not possible, made-up words.
The results for the real words where epenthesis is possible are as follows:
(72) The result of real words where epenthesis is possible
No E            With E          Misc    Total
159 (56.79%)    121 (43.21%)    0       280 (100%)
As (72) shows, no significant difference is observed between the epenthesis-including
versions and those without epenthesis. This confirms the results of task 1 (see (68)). Now
let us consider words for which one does not expect to see epenthesis. The result is as
follows:
(73) The result of real words with which epenthesis is not expected
No E          With E    Misc    Total
120 (100%)    0 (0%)    0       120 (100%)
Very clearly, the result is as expected. No epenthesis is found with the words with which
it is not expected to see epenthesis.
Let us now look at the made-up words. As said above, I did not separate the made-up
words into two groups as epenthesis-expected and non-epenthesis-expected since the
non-epenthesis is by far the dominant pattern regardless of the structure of the roots. The
words were made up in a way to include the structures under study, as given in appendix
5. As the appendix shows, there are 24 words. For each of the four root structures,
CVlaxC, CVtenseC, CVlaxCC, CVtenseCC, there are 6 words, 3 words with suffixes with
which there are more cases of epenthesis in real language, and 3 words with suffixes with
which there are no or rare cases of epenthesis in real language. In terms of vowels, the lax
vowels in CVlaxC include one example of each lax vowel, a, e, o. The same holds for the
lax vowels in CVlaxCC. The tense vowels in CVtenseC also include one example of each
tense vowel, ɑ, i, u. The same holds for the tense vowels in CVtenseCC.
Here is the result of the made-up words in this task:
(74) The result of made-up words
No E            With E       Misc    Total
235 (97.92%)    5 (2.08%)    0       240 (100%)
The table shows that out of 240 tokens of made-up words, only 5 cases were pronounced
with epenthesis. Thus 235 tokens were pronounced without epenthesis, which strongly
confirms that non-epenthesis is the dominant pattern.
Let us look at those 5 cases which were pronounced with epenthesis. The made-up words
which were produced with epenthesis are given below –note that in total three speakers
are responsible for these five cases:
(75) a. mekr + mɑn → mekremɑn (by two speakers)
b. bɑrʧ + mɑn → bɑrʧemɑn (by one speaker)
c. fars + ɡar → farseɡar (by one speaker)
d. kaft + ɡin → kaftɑɡin (by one speaker)
All these words show the CVlaxCC or CVtenseCC structure, which are two of the expected
structures that allow epenthesis. If we look at the suffixes, we see that in (a)-(c) the
suffixes are those which tend to show more cases of epenthesis in the language. A point
must be made about the word given in (d). Persian has a suffix -ɑɡin in addition to -ɡin
(e.g., atr ‘scent’ and atr-ɑɡin ‘scented’) so I am not sure whether the speaker intended
epenthesis or whether this vowel appeared due to the influence of or confusion with the
suffix -ɑɡin.
The result of this task strongly suggests that non-epenthesis is the dominant pattern when
speakers are given new words, and if epenthesis occurs, which is very rarely observed, it
is with one of the expected structures.
4.7.4. Task 3: Production (wug test)
This task includes only made-up words (Appendix 6 is related to this task). There was a
training period in which the speakers were given a list of pairs of sentences and a list of
suffixes. They were asked to make a suffixed form out of a word in the first sentence of
the pair using one of the suffixes and put it in the following sentence in the pair. There
was a practice period for made-up words but the list of suffixes and the list of sentences
were the same as those which were used in training with real words. There were different
sentences, one sentence for each suffix, suitable to the suffix’s meaning/function (see
Appendix 3). The idea was to remind the speakers of the way a word is made in Persian,
for example consider mes ‘copper’, from which by adding the suffix -ɡar, there is mesɡar ‘coppersmith’, which can be put in a sentence such as ‘Ali was working/dealing with
mes. Ali was mesɡar’. For a made-up word such as haf, it is possible to make hafɡar to
be put in the sentence. Note that I chose six suffixes for this process: two show more
cases of epenthesis (i.e., -ɡɑr, -mɑn), two show no epenthesis or rarely show it (i.e., -nɑk,
-ɡɑh), and two are in between (i.e., -ɡar, -bɑn). Nonsense roots included the structures
under study. For each suffix, 12 words were created, 3 words for each root structure. The
3 words for lax vowels with CVlaxC structure include one example with a, one with o, and
one with e as the vowel of the root. The same is the case for lax vowels with CVlaxCC
structure. The 3 words for tense vowels with CVtenseC structure include one example with
u, one with i, and one with ɑ. The same is the case for tense vowels with CVtenseCC
structure. See appendix 6.
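The design just described fixes the number of items and, with 10 participants, the number of expected tokens. The Python sketch below merely enumerates the design cells (suffix, root structure, root vowel) to make the count explicit; it does not generate the actual made-up roots, which are listed in the appendix, and the variable names are hypothetical.

```python
from itertools import product

SUFFIXES = ["-ɡɑr", "-mɑn", "-nɑk", "-ɡɑh", "-ɡar", "-bɑn"]
STRUCTURES = {
    "CVlaxC": "aoe",
    "CVlaxCC": "aoe",
    "CVtenseC": "uiɑ",
    "CVtenseCC": "uiɑ",
}
N_SPEAKERS = 10  # participants reported in the methodology

# One cell per suffix, root structure, and root vowel.
cells = [
    (suffix, structure, vowel)
    for suffix, (structure, vowels) in product(SUFFIXES, STRUCTURES.items())
    for vowel in vowels
]

print(len(cells))               # made-up words in the task
print(len(cells) * N_SPEAKERS)  # expected tokens across speakers
```

Six suffixes with twelve words each give 72 items, and 72 items produced by 10 speakers give 720 tokens.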
As shown in (76), the results are categorical and clearly show the non-epenthesis pattern.
(76) The result of made-up words (wug test)
No E          With E    Misc    Total
720 (100%)    0 (0%)    0       720 (100%)
This result, together with the results for the made-up words in the previous task (see
(74)), shows that speakers strongly tend to not use epenthesis if they are given new
words. And if one compares the results of the two tables on made-up words ((74) and
(76)) with those of the tables for real words ((72) and (73)) it is observed that in fact the
occurrence of epenthesis is limited to a set of words, and is not seen with other real
words. This suggests that the occurrence of epenthesis in suffixation is limited
to some frozen cases.
The three tasks discussed so far show what the speakers did in production. Next I discuss
the perception rating task.
4.7.5. Task 4: Perception (acceptability rating)
For this part of the experiment, a list of real and made-up suffixed words was recorded.
For each word, there were two versions, one with epenthesis and the other without. The
words were randomized, so the two versions were not adjacent. The participants (8
speakers for this task) were asked to rate each word on the following scale: √ (good,
acceptable, possible); ? (so-so); X (bad, unacceptable, impossible). The list of the roots
was written on paper and the speakers were asked to put one of the three signs (√, ?, or
X) in front of each word after they heard the suffixed forms. The reason that I wrote the
roots on paper was to show the speakers what the root is in order to make sure they
realize whether the root has the final –e or whether its presence is due to epenthesis; in
particular this is important with the made-up words. That is, if I had not given the written
form of for example fam to have fameɡar recorded for the speakers to hear and to rate,
the speakers may not have known for sure whether the root is fam or fame since there are
many words in Persian which end in –e. The written form clarifies whether the word is
fam or fame (given that, as mentioned in footnote 43, a silent –h is written at the end of e-
final words in Persian). I also pronounced the root before the suffixed form to be sure
they know what the root is. The reason that I did not write the suffixed forms on paper
was to make sure that their judgments on the acceptability of the suffixed forms are based
on their perception and have nothing to do with the orthography, which does not show –e
and thus can be confusing. Thus the speakers heard a suffixed word while they had the
root with which the suffixed word was formed on paper in front of them, and then there
was a pause for them to rate the suffixed word they heard, and then the next word was
heard to be rated.
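The constraint that the two versions of a word should not be adjacent in the stimulus list can be met by reshuffling until no neighbouring items share a root. The Python sketch below illustrates this with simple rejection sampling; it is an assumption about one way to achieve such an order, not a description of the procedure actually used, and the function `stimulus_order`, the seed, and some of the forms (e.g., famɡar, hafeɡar) are hypothetical.

```python
import random


def stimulus_order(pairs, seed=0, max_tries=10_000):
    """Shuffle both versions of every word so that the two versions of the
    same word are never adjacent (simple rejection sampling)."""
    stimuli = [(root, form) for root, forms in pairs.items() for form in forms]
    rng = random.Random(seed)
    for _ in range(max_tries):
        rng.shuffle(stimuli)
        if all(a[0] != b[0] for a, b in zip(stimuli, stimuli[1:])):
            return list(stimuli)
    raise RuntimeError("no valid order found; try more items or more tries")


# root -> (version without epenthesis, version with epenthesis)
pairs = {
    "kɑr": ("kɑrɡar", "kɑreɡar"),
    "fam": ("famɡar", "fameɡar"),
    "haf": ("hafɡar", "hafeɡar"),
    "ruz": ("ruzɡɑr", "ruzeɡɑr"),
}
order = stimulus_order(pairs)
print(order)
```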
The words of this task are categorized into five groups: (i) real words which are expected
to show epenthesis; (ii) real words where epenthesis is not found although the word has
one of the structures with which epenthesis may occur in the language; (iii) real words
with which epenthesis is not expected (CVlaxC structure); (iv) made-up words where
epenthesis can be possible based on the structure; (v) made-up words where epenthesis is
not expected based on the structure. (See Appendix 7 for the list of made-up words in this
task). I here show the results of each of these categories.
For the real words with which epenthesis can occur, cases such as kɑrɡar ~ kɑreɡar ‘worker’, the result of perception rating is given in (77):
(77) The result of real words where epenthesis is possible
          √               ?            X             Total
No E      200 (89.29%)    6 (2.68%)    18 (8.04%)    224 (100%)
With E    205 (91.52%)    2 (0.89%)    17 (7.59%)    224 (100%)
This result is consistent with the results of the previous tasks as both versions are
acceptable without much difference in terms of percentage (89.29% and 91.52%). Let
me go through the chart to explain how the results are shown. Consider the chart in (77)
and recall that the rating scale was as follows: √ (good, acceptable, possible); ? (so-so); X
(bad, unacceptable, impossible). We have the non-epenthesis version of words (i.e., No
E) for which there are three options of rating: √, ?, X. We also have the with-epenthesis
version of words (i.e., With E) for which there are the same three options of rating. The
first row, first column, “No E (√)”, shows the result for the non-epenthesis words which
got √ from speakers. The second row, first column, “With E (√)”, shows the result for the
with-epenthesis words which got √ from speakers, that is, those with-epenthesis words
which were acceptable. The “No E (?)” and “With E (?)” cells respectively show the non-
epenthesis versions which got ? from the speakers and the with-epenthesis ones which
got ?. The “No E (X)” and “With E (X)” cells respectively show the non-epenthesis
versions which got X from the speakers and the with-epenthesis versions which got X
from the speakers. The same method of showing results is used throughout this section.
It is important to note that the total is not necessarily 100% for each No E and its With E
correspondent (e.g., the “No E (√)” and “With E (√)” cells do not add up to 100%). The
reason is that one word can get, for example, √ in both of its versions (e.g., kɑrɡar and
kɑreɡar ‘worker’ both can get √ because they are both accepted in the language). As
seen above in the chart, the non-epenthesis and with-epenthesis versions are both fine for
the speakers. The so-so cases (those with ?) are very few. The bad cases (the ones shown
by X) show that there are not many cases where the non-epenthesis is bad and not many
cases either where the with-epenthesis is bad. In particular, in the chart, the “No E
(√)” and “With E (√)” cells are interesting to note as they show (as shown by other tasks
discussed above) how closely acceptable the non-epenthesis and with-epenthesis cases
are when it comes to the limited set of words which may show epenthesis.
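The way the rating tables are to be read can be made concrete with a short Python sketch. Each row is converted to percentages of its own total, so each row sums to 100% while a column (e.g., the two √ cells) need not. The function name `row_percentages` is hypothetical; the counts are those reported in (77).

```python
def row_percentages(row):
    """Convert one row of rating counts (√, ?, X) into percentages of that row."""
    total = sum(row.values())
    return {mark: round(100 * n / total, 2) for mark, n in row.items()}


# Rating counts reported in (77).
table_77 = {
    "No E": {"√": 200, "?": 6, "X": 18},
    "With E": {"√": 205, "?": 2, "X": 17},
}

for version, row in table_77.items():
    print(version, sum(row.values()), row_percentages(row))

# The two versions are rated separately, so a column need not sum to 100%.
accepted = sum(row_percentages(row)["√"] for row in table_77.values())
print(accepted)
```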
For this task, I also included some real words with which it is not expected to get
epenthesis although they have the structures which can show epenthesis (e.g., xaʃm ‘anger’ + nɑk xaʃm-nɑk ‘angry’ *xaʃmenɑk).
(78) The result of real words where epenthesis does not occur although the word has one
of the structures with which one may get epenthesis in the language
          √            ?            X             Total
No E      24 (100%)    0 (0%)       0 (0%)        24 (100%)
With E    2 (8.33%)    1 (4.17%)    21 (87.5%)    24 (100%)
This result shows that, for real words which do not take epenthesis although their
structure would allow it, the acceptability of the non-epenthesis version
is far more than the acceptability of the with-epenthesis version (compare the “No E (√)”
and “With E (√)” cells), and the unacceptability of the with-epenthesis version is also far
more than the unacceptability of the non-epenthesis version (compare the “No E (X)” and
“With E (X)” cells). Comparing the result of this table with (77) shows that it is not only
the structure which makes epenthesis possible as the words in (78) have structures similar
to the words in (77). The only difference between them is that the group in (77) has an
epenthesis-including version in the language whereas the group in (78) does not.
Familiarity with the epenthesis-including version is therefore very important in speakers’
judgment on acceptability in the perception task.
Now let us look at the real words with the structure with which it is not expected to get
epenthesis (the CVlaxC structure). The result is shown in (79).
(79) The result of real words with which epenthesis is not expected (CVlaxC)
          √            ?            X              Total
No E      32 (100%)    0 (0%)       0 (0%)         32 (100%)
With E    0 (0%)       1 (3.12%)    31 (96.88%)    32 (100%)
This result is consistent with the results of the production tasks for the words with which
one does not expect to see epenthesis. Looking at the “No E (√)” and “With E (√)” cells,
the non-epenthesis version is acceptable in all cases and the with-epenthesis version is
not acceptable at all (100% vs. 0%). For the so-so cases (those with ?), there is only one
case which is a with-epenthesis version. For the bad cases (shown by X), almost all with-
epenthesis cases are, as expected, bad (the only with-epenthesis case which did not get X
got ?). To sum up, real CVlaxC roots strongly show the non-epenthesis pattern.
I have gone through the results of the perception task for the real words. Before going to
the perception rating task for made-up words, I would like to note a few points.
First, recall that in the production tasks, I gave examples of words whose epenthesis-
including versions had higher frequency compared to their non-epenthesis versions (or
vice versa), and this was either in general in the language (that is there are words which
are in general pronounced more often with epenthesis or vice versa although they have
both versions), or it was particular to some speakers who show a tendency towards
articulating the words with epenthesis or vice versa (although the word has two versions).
There were also words which show an (almost or exact) 50-50 distribution regarding their
two versions. In perception, too, the same pattern is observed. There are some words
which got a perfect score for both versions, for example, ruzɡɑr and ruzeɡɑr both got 8
out of 8 for acceptability (i.e., √) without having any ? or X, the same goes for jɑdɡɑr and
jɑdeɡɑr; mehrbɑn and mehrabɑn, zɑjmɑn and zɑjemɑn, etc. There were cases, however,
which showed more acceptability with one version, for example, pɑjɡɑh shows 8 out of 8
for acceptability (√) (so it shows perfect acceptability) but its epenthesis-including
counterpart, pɑjeɡɑh, shows 5 cases of acceptability (√) and 3 cases of unacceptability
(X). The opposite happens for some cases such as mɑndɡɑr, which got 5 cases of
acceptability (√) along with 1 case of so-so (?) as well as 2 cases of unacceptability (X),
while its epenthesis-including version, mɑndeɡɑr, got 8 out of 8 for acceptability (√).
This is expected as in the language some words are pronounced in one of their two
versions more often than in their other version.
Second, if we compare the results of production and perception, it is observed that
sometimes the speakers do not say the epenthesis-including version (or the other version)
themselves, as shown by their production, but in perception they find it fine, giving it a
‘√’. The difference might be due to the different tasks. A speaker may not say a word in
this particular way but if they hear it they find it familiar and therefore acceptable since
other people may say the word that way. That is, a speaker can potentially have both
options in mind but have one preference in performance which is shown in production.
An extreme case is bozorɡ-vɑr and its epenthesis-including version bozorɡ-a-vɑr. The
epenthesis-including version had zero frequency in production (that is no one used the
epenthesis-including version in production tasks). In perception, the non-epenthesis
version got 8 cases of √, while the with-epenthesis version got 6 cases of √, 1 case of
?, and 1 case of X. Sometimes this is due to the level of formality in the sense that, as
noted in 4.7.2, there are some words which have both non-epenthesis and with-epenthesis
versions but one of them is used more in very formal speech and poetry, etc. and the
other in informal daily speech.
For example, consider mehrabɑn ~ mehrbɑn ‘kind’. Its non-epenthesis version is usually
used in formal speech or poetry. Recall that in the reading task, out of 20 repetitions of
mehrbɑn ~ mehrabɑn we see mehrabɑn 18 times and mehrbɑn 2 times. In perception,
both mehrabɑn and mehrbɑn got perfect scores of acceptability (both got 8 out of 8).
This is because both versions are completely acceptable in the language, as the perception
results show, but production shows one version is more favorable in speech. Compare
these with cases which do not take epenthesis in the language. For example, ɡolzɑr ‘flower field’, bɑrɡɑh ‘palace, royal building’ and xaʃmnɑk ‘angry’ are pronounced by all
10 speakers as ɡolzɑr (not ɡolezɑr), bɑrɡɑh (not bɑreɡɑh) and xaʃmnɑk (not xaʃmenɑk);
and in perception, ɡolzɑr, bɑrɡɑh, and xaʃmnɑk got perfect acceptability (i.e., 8 √);
ɡolezɑr, bɑreɡɑh, and xaʃmenɑk, however, got 0 √, 0 ?, and 8 X.
Let us now move on to the made-up words. I first look at the made-up words which have
a structure other than CVlaxC and therefore with which the occurrence of epenthesis is not
unexpected. Recall that in the production tasks speakers overall produce the non-
epenthesis version for these structures.
(80) The result of made-up words where epenthesis can be possible
            √               ?             X               Total
No E        134 (93.06%)    4 (2.77%)     6 (4.17%)       144 (100%)
With E      41 (28.47%)     7 (4.86%)     96 (66.67%)     144 (100%)
Looking at the “No E (√)” and “With E (√)” cells, the non-epenthesis version is far more
acceptable. Compare these two cells with the “No E (√)” and “With E (√)” cells of the
chart in (77), where the result for real words with possible epenthesis is given. The “No E
(√)” cells of (77) and (80) are very close in terms of percentage of acceptance. But there
is an obvious difference between the “With E (√)” cell in (77) and the one in (80). The
with-epenthesis cases for the real words are by far more acceptable than the with-
epenthesis cases for the made-up words. This shows that if Persian speakers have not
heard the words, although these words have roots of the shape that allows epenthesis, the
form with epenthesis is generally unacceptable. This, along with the table in (78), also
suggests that the limited number of real words which allow an epenthetic vowel are
frozen or lexicalized. That is, as is the case with a large number of real words with
CVtenseC, CVtenseCC, CVlaxCC which do not take epenthesis, the made-up words with
these three structures also do not tend to show epenthesis. Epenthesis is thus seen only in
a limited list of words in the language.
Now I discuss made-up words with roots with the CVlaxC structure, that is, words with
which one does not expect to see epenthesis. See (81).
(81) The result of made-up words with which epenthesis is not expected (CVlaxC)
            √               ?             X               Total
No E        45 (93.75%)     2 (4.17%)     1 (2.08%)       48 (100%)
With E      2 (4.17%)       4 (8.33%)     42 (87.5%)      48 (100%)
Again, the result is consistent with the results of production tasks. That is, the without-
epenthesis forms of these words received a very high percentage of acceptability and the
with-epenthesis forms of them a very high percentage of unacceptability. The result given
in (81) is close, in terms of percentages, to the result given in (79) for real words with
which epenthesis is not expected.
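The percentages reported in (80) and (81) follow directly from the raw counts. As a minimal sketch of that arithmetic (illustrative only; the names `ratings` and `shares` are introduced here and are not part of the experiment), the rows can be recomputed as follows:

```python
# Counts copied from tables (80) and (81); each row is (√, ?, X).
ratings = {
    "(80) No E": (134, 4, 6),
    "(80) With E": (41, 7, 96),
    "(81) No E": (45, 2, 1),
    "(81) With E": (2, 4, 42),
}


def shares(counts):
    """Return each count as a percentage of the row total, rounded to 2 places."""
    total = sum(counts)
    return tuple(round(100 * c / total, 2) for c in counts)


for row, counts in ratings.items():
    ok, so_so, bad = shares(counts)
    print(f"{row:12} √ {ok:6.2f}%  ? {so_so:5.2f}%  X {bad:6.2f}%  (n={sum(counts)})")
```

Rounding to two places gives 2.78% for the “No E (?)” cell of (80), which the table reports as 2.77%; the small difference is consistent with truncation rather than rounding. All other cells match the tables.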
Now compare the three charts which show the results for real words. In (77), where the
result for epenthesis-possible real words is given, the non- and with-epenthesis cases are
both acceptable to a high percentage (see the “No E (√)” and “With E (√)” cells) and are
both unacceptable to a very low percentage (see the “No E (X)” and “With E (X)” cells).
In (78), where the result for non-epenthesis real words with structures other than CVlaxC
is given, the acceptability of the non-epenthesis version is far more than the acceptability
of the with-epenthesis version (compare the “No E (√)” and “With E (√)” cells), and the
unacceptability of the with-epenthesis version is also far more than the unacceptability of
the non-epenthesis version (compare the “No E (X)” and “With E (X)” cells). In (79),
where the result for non-epenthesis real words with CVlaxC structure is given, the non-
epenthesis version is acceptable to a high percentage (see the “No E (√)” and “No E (X)”
cells), and the with-epenthesis version is unacceptable to a very high percentage (see the
“With E (√)” and “With E (X)” cells). This is an expected result given the observed
pattern in the language.
Now let us compare the two charts which show the result for made-up words. Comparing
the made-up words with root structures other than CVlaxC in (80) with the result of made-up
words with CVlaxC root structure in (81), there is a difference in the percentage of
acceptability and unacceptability when one considers the epenthesis-including versions.
Compare the “With E (√)” cells of these charts and also the “With E (X)” cells in them
with each other. It is observed that, looking at the “With E (√)” cells, the words with
structures other than CVlaxC are more acceptable with epenthesis than the ones with the
CVlaxC root structure with epenthesis. This means that if speakers are going to accept
epenthesis in made-up words, it is more with the words which do not have CVlaxC as
their root structure. Now compare the “With E (X)” cells of the tables in (80) and (81).
The percentage of unacceptability of the epenthesis-including version is less when the
root has a structure other than CVlaxC. That is, made-up words with structures other than
CVlaxC have a greater chance of being perceptually acceptable with epenthesis compared
to those made-up words with the CVlaxC structure. This tells us that the epenthesis-
including versions, which are not favorable for Persian speakers, are relatively more
acceptable when they have root structures other than CVlaxC and more unacceptable
when they have CVlaxC structure.
In this regard, in particular compare the result of the wug test (see (76)), where there is
0% epenthesis response, with the “With E (√)” cell in (80), where 28.47% of the made-up
words with epenthesis-compatible structure are considered as acceptable. This
observation, which shows that in production no epenthesis is seen with made-up words,
but in acceptability rating the made-up words with epenthesis could be acceptable to
some extent, could be due to a priming effect. That is, in production, when speakers
volunteer the forms spontaneously, the epenthesis form is never used but in the
acceptability rating task, speakers are in essence primed to consider epenthesis as a
credible option and hence they are more likely to accept the forms than otherwise and this
is more likely to happen with the structures with which we get epenthesis in the language.
Such an effect is found in other similar studies where both production and acceptability
rating tasks are conducted (see for example Albright and Hayes 2003).
To sum up, in perception, the without-epenthesis version is considered as acceptable, and
the with-epenthesis version is unacceptable except for the limited set of words which
have epenthesis-including versions in the language. In general the non-epenthesis pattern
is dominant for the speakers regardless of the words being real or made-up and regardless
of their root structure. However, if the speakers accept an epenthesis-including version of
made-up words, there is a slight tendency towards doing this with structures other than
CVlaxC.
4.7.6. Summary and discussion
I ran an experiment including made-up and real words testing Persian native speakers for
both production and perception through different tasks in order to study the synchronic
status of epenthesis in Persian and to determine the underlying generalizations in the
speakers’ minds regarding this suffixation process and epenthesis. The results are as
follows:
Production:
(i) For made-up words, the non-epenthesis pattern is the general pattern (with a very few
exceptions).
(ii) For real words, for those words with which one can get epenthesis, the division
between the non-epenthesis version and the with-epenthesis version is close to equal
(with a difference which is not significant). For those real words with which epenthesis is
not expected, epenthesis does not occur at all (with one exception by one speaker in one
task).
Perception:
(i) For made-up words, non-epenthesis is the acceptable pattern in general both for words
with CVlaxC structures and for words with other structures. If the speakers accept the
version with epenthesis, this is more likely with words with structures other than CVlaxC.
(ii) For real words, for the words with which one can get epenthesis both versions are
fine. For those real words with which it is not expected to see epenthesis, the non-epenthesis
versions are by far more acceptable, and the epenthesis-including versions are
highly unacceptable.
Epenthesis is thus not usually seen when real and made-up words are tested and it occurs
only with a limited number of words which may show epenthesis in the language. Thus it
is not a synchronically active process in the language. The limited occurrence of
epenthesis with some structures and the tendency of speakers towards accepting
epenthesis only with those structures (structures other than CVC when V is lax) can be
accounted for by underlying tenseness. They cannot be accounted for by underlying quantity
because we would then expect to see the process much more consistently and productively, and
in addition, harmony remains unexplained. They cannot be accounted for by underlying
height either because height does not provide a two-way distinction among Persian
vowels to capture a distinction between CVC where V is lax and other structures to begin
with.
4.8. Summary
In this chapter, I discussed epenthesis in suffixation, a process which can be potentially
an argument against a quality-based analysis and for a quantity-based one. In this
process, an epenthetic vowel is inserted in order to break up consonant clusters which are
created due to suffixation. Epenthesis is possible when the root has the CVtenseC,
CVlaxCC, or CVtenseCC structure but is not possible when the root has the CVlaxC
structure. Thus it is only with roots of heavier structures that epenthesis is possible. I
argued that the process does not provide evidence for underlying quantity. First, I
proposed that epenthesis can be handled by the tense-based analysis, with quantity being
derived. Second, in a study of the synchronic status of epenthesis, I argued that it is not
productive in the language and thus does not really give support to any analysis. The
number of words which show epenthesis is limited compared to the number of words
which do not show it in spite of having the same root structures. This brings up a question
as to why one does not see epenthesis with more or all cases with the roots of the
CVtenseC, CVlaxCC, or CVtenseCC structures. I examined the synchronic status (including
frequency, cluster types, productivity of suffixes) as well as the historical status of the
suffixes and the suffixed forms and showed that they cannot explain the process. I also
conducted an experiment in order to test the productivity of the epenthesis process.
Based on the results of the experiment, I conclude that epenthesis is not usually seen
when real and made-up words are tested and therefore those limited words which may
show epenthesis cannot provide support for a quantitative analysis of vowel structure and
can be accounted for by tenseness. Compare the productivity of harmony (both in native
and in loanwords), which argues for a phonologically qualitative analysis of the system,
with the limited occurrence of epenthesis. Epenthesis is thus not a synchronically
productive process.
Chapter 5
VCC co-occurrence restrictions: evidence for quantity?
In the previous chapter, I examined epenthesis in suffixation as a potential argument for a
quantitative analysis of the Persian vowel system. In this chapter, I examine another
potential argument for underlying quantity, namely VCC co-occurrence restrictions. I
will show that there is no need for the underlying presence of quantity in the system in
order to account for VCC co-occurrence restrictions. Rather the restrictions are accounted
for by underlying tenseness from which surface quantity derives. At the same time, the
arguments that suggest an appeal to VCC# co-occurrence restrictions as an argument for
an a, e, o / ɑ, i, u division are themselves brought into question by the existence of
loanwords that do not obey the restrictions. In the first part of this chapter, I examine the
proposed restriction and how it might be accounted for. Then in 5.9 I discuss loanwords,
and propose that, in fact, there is no restriction in Modern Persian.
5.1. VCC co-occurrence restrictions: review of the literature
The literature on Persian phonotactics shows that there is a distinction between a, e, o as
opposed to ɑ, i, u with respect to the word-final consonant clusters that can follow them,
as introduced in 2.2.2.2. The literature raises two issues in this regard: (i) the consonant
clusters that can occur after /a, e, o/ and those after /ɑ, i, u/ are not identical; (ii) the
Sonority Sequencing Principle is met after /ɑ, i, u/ but need not be met after /a, e, o/ in
CVCC syllables in final position. One might argue that these differences are attributed to
a quantity distinction, with monomoraic vowels allowing a wider set of following clusters
than bimoraic vowels. However, I will argue that there is no strong argument in these
observations for underlying quantity or against underlying quality.
Before beginning the discussion, it is useful to recall the Persian vowel and consonant inventories.
I now review the literature on the restrictions on consonants following different vowels,
focusing on word-final position as this is the only position in monomorphemic words in
which tautosyllabic clusters are found, with rare exceptions.
As discussed in 2.2.2.2, Persian vowels are divided into two groups, a, e, o vs. ɑ, i, u, where evidence for this split is based on the consonant clusters that can occur
tautosyllabically after the vowel. Samareh (1977) identifies two
functionally different groups of simple vowels: /e, a, o/, and /i, â, u/ with respect to
possible following consonant clusters (I use his symbols here). The first group can occur
before all combinations of consonants as far as the first member of the cluster is
concerned. The only exception is /e/, which cannot occur before clusters starting with /x/.
The vowels of the second group are of very limited occurrence preceding consonant
clusters. They cannot occur before those clusters whose first consonant is /q, ʔ, ǰ, z, h,
m/. Samareh adds that /b, t, d, k, n, l, r/ occur following the second group of vowels only in
a few loan words (e.g., kâbl ‘cable’, dubl ‘double’, ritm ‘rhythm’) and in three Persian
words (i.e. bâng ‘shout’, dâng ‘share’, pârs ‘related to Persian, Persia’). Samareh
continues that the vowels of the second group /i, â, u/ can precede /s, f, x/ combinations –
the second consonant must be /t/, with a few exceptions. The consonant /š/ is permitted
after /â, u/ but not after /i/. A summary of these observations was given in 2.2.2.2 and is
repeated here for convenience in (3):
(3) a. / e, a, o/
No restriction on C1 in C1C2 (/x/ not after /e/)
b. /i, â, u/
*/q, ʔ, ǰ, z, h, m/ as C1 in C1C2
? /b, t, d, k, n, l, r/ as C1 in C1C2
√ /s, f, x/ as C1 followed by /t/ as C2 (with a few exceptions)
√ /ʃ/ but not after /i/
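The summary in (3) can be read as a lookup on the vowel group and the first consonant of the cluster. The following is a minimal sketch in Python, using Samareh's symbols (with /š/ for the consonant written /ʃ/ in the last line of (3)). The function name and the status labels are hypothetical, and consonants that (3) does not mention are returned as "unlisted" rather than guessed.

```python
GROUP_1 = {"e", "a", "o"}                          # (3a)
GROUP_2 = {"i", "â", "u"}                          # (3b)
BANNED_C1 = {"q", "ʔ", "ǰ", "z", "h", "m"}         # '*' line of (3b)
MARGINAL_C1 = {"b", "t", "d", "k", "n", "l", "r"}  # '?' line of (3b)


def c1_status(vowel, c1, c2):
    """Classify a V + C1C2 sequence according to the summary in (3)."""
    if vowel in GROUP_1:
        # (3a): no restriction on C1, except that /x/ does not follow /e/.
        return "banned" if (vowel == "e" and c1 == "x") else "allowed"
    if vowel not in GROUP_2:
        raise ValueError(f"not a simple vowel listed in (3): {vowel!r}")
    if c1 in BANNED_C1:
        return "banned"
    if c1 in MARGINAL_C1:
        # Attested only in a few loanwords and three native words.
        return "marginal"
    if c1 in {"s", "f", "x"}:
        # Allowed when C2 is /t/; other C2's are the "few exceptions".
        return "allowed" if c2 == "t" else "exceptional"
    if c1 == "š":
        return "banned" if vowel == "i" else "allowed"
    return "unlisted"


for vowel, c1, c2 in [("i", "s", "t"), ("u", "b", "l"), ("i", "m", "t"), ("u", "s", "k")]:
    print(vowel + c1 + c2, c1_status(vowel, c1, c2))
```

The four calls correspond to the rhymes of bist, dubl, a hypothetical imt, and susk.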
Zolfaghari Serish and Kambuziya (2005) note that in words with CVCC structure, the
sonority sequencing principle is met when the vowel is /ɑ, i, u/ (e.g., mɑst ‘yoghurt’, bist ‘twenty’, pust ‘skin’), but it need not be met in monosyllabic CVCC when the vowel is
/a, e, o/49 (e.g., tabx ‘cooking’, zebr ‘rough’, sobh ‘morning’). They add that the sonority
sequencing principle can be met with /a, e, o/, for instance: in monosyllabic words, it is
met if the first consonant of the coda is /r, l, j, n/ and [w] or the second consonant of the
coda is /d, ʔ, ʤ, ʃ, k, g, t/ (with a few exceptions they point out for x and ʃ). There are
also other cases of sonority being met after a, e, o, which are not pointed out by
Zolfaghari Serish and Kambuziya (e.g., a, e, or o followed by any fricative followed by a
stop as in ʧasb ‘glue’, (ʔ)eʃɢ ‘love’; or /m/ followed by a fricative as in lams ‘touch’,
ramz ‘secret’).50
49 According to Zolfaghari Serish and Kambuziya, the principle is also met with a, e, o in polysyllabic words when CVCC is final, with one exception: ɡa.vazn ‘deer’. There are actually other cases of violation of sonority after a, e, o in polysyllabic words, as in es.taxr ‘swimming pool’ and se.pehr ‘sky’, to which they do not refer. I leave the discussion on sonority in polysyllabics aside as sonority in general is not a topic of this work. The only reason I refer to sonority is that it divides the vowels into two groups: ɑ, i, u (after which sonority is met in mono- and polysyllabics) versus a, e, o (after which sonority can be violated, in particular in monosyllabics).
50 Zolfaghari Serish and Kambuziya’s summary of consonant clusters with respect to sonority needs to be completed, as in their data analysis some combinations of consonants which can follow a, e, o and which meet the sonority principle are missing, as shown above. They have words such as nasb ‘installation’, ʔeʃɢ ‘love’, and lams ‘touch’ in their list of CVCC, but in their section of data analysis they do not include fricatives followed by stops, etc. as cases where sonority is met after a, e, o.
Comparing Samareh’s (1985) list and Zolfaghari Serish and Kambuziya’s (2005) list
gives us largely but not exactly the same results. For example, Samareh does not have ʃk in kuʃk
and rk in xɑrk, which Zolfaghari Serish and Kambuziya have; Samareh considers sk in
susk as an exception, and he also has sb in ɡoʃtɑsb, which Zolfaghari Serish and
Kambuziya do not have, and so on. The difference arises largely depending on whether or
not we want to include proper names like ɡoʃtɑsb ‘a name for boys’ (an ancient Persian
king) and xɑrk ‘name of an island’.
Zolfaghari Serish and Kambuziya conclude that with respect to the
sonority sequencing principle two natural classes of vowels are found in CVCC in
Persian; these are /ɑ, i, u/ and /a, e, o/. Note that Zolfaghari Serish and Kambuziya look
into the sonority sequencing principle and are not concerned in particular about VC1 co-occurrence
or about vowel properties.
Nonetheless, the overall finding is clear: after ɑ, i, u only sequences of falling sonority
are allowed, while after a, e, o no such restriction exists. Thus with respect to sonority
and the consonant clusters that can follow the vowels, there is a grouping of vowels as
follows: a, e, o vs. ɑ, i, u.
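The falling-sonority generalization can be made concrete with a small check. The sketch below assumes a conventional five-step sonority scale (stops and affricates < fricatives < nasals < liquids < glides). This scale and the names `SONORITY` and `falls` are assumptions for illustration; they are not taken from Samareh or from Zolfaghari Serish and Kambuziya.

```python
# Assumed sonority scale: a higher rank means a more sonorous segment.
SCALE = [
    ["p", "b", "t", "d", "k", "ɡ", "ɢ", "ʔ", "ʧ", "ʤ"],  # stops and affricates
    ["f", "v", "s", "z", "ʃ", "ʒ", "x", "h"],            # fricatives
    ["m", "n"],                                          # nasals
    ["l", "r"],                                          # liquids
    ["j", "w"],                                          # glides
]
SONORITY = {seg: rank for rank, segs in enumerate(SCALE) for seg in segs}


def falls(c1, c2):
    """True if sonority drops from C1 to C2, as required of a coda cluster."""
    return SONORITY[c1] > SONORITY[c2]


# Final clusters of words cited in the text.
clusters = {
    "mɑst": ("s", "t"), "bist": ("s", "t"), "pust": ("s", "t"),  # after ɑ, i, u
    "tabx": ("b", "x"), "zebr": ("b", "r"), "sobh": ("b", "h"),  # after a, e, o
    "lams": ("m", "s"), "ʧasb": ("s", "b"),                      # after a, met
}

for word, (c1, c2) in clusters.items():
    print(word, "falling" if falls(c1, c2) else "not falling")
```

Under this assumed scale the clusters after ɑ, i, u all fall in sonority, while tabx, zebr and sobh do not, and lams and ʧasb illustrate that falling clusters are also found after a, e, o.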
Based on the literature (Samareh 1977, Zolfaghari Serish and Kambuziya 2005) and
checking a Persian dictionary (Emami 2006), here is the general observation on the data:
(i) a, e, o show no particular restrictions for CC in CVCC. Note that there might
be some unattested sequences in Persian; but not in particular related to vowels.
(ii) ɑ, i, u show the following restrictions:
In native words, in the Persian consonant inventory given below, tense vowels occur
before the consonants in boxes, that is, C1 is one of these consonants (note that some
Arabic or Turkish words might be included here too; these are words which are so well
integrated into Persian that they are not considered foreign words):
(4) Persian consonants (those in box: possible C1’s after tense vowels in native words)
Now, let us look at the second consonant in CC after the tense vowels. C2 can be one of
the following:
(5) Possible C2’s after tense vowels
Mostly t
Also a few k -not with i-
Also:
ɡ in nɡ -note none of these with i and u-
d in nd
and a few
s in rs
ʧ in rʧ
d in rd
and a case of ɢ in fɢ (Bɑfɢ ‘name of a city’)
The result of putting these restrictions together is as follows:
(6) Possible CC combinations after tense vowels
                C2
C1        t    d    k    ɡ    s    ʧ
f         √
s         √         √
ʃ         √         √
x         √
n              √         √
r              √    √         √    √
After lax vowels, no real restriction is observed. I should add that the frequencies of these
combinations are not all the same. For instance, st is far more frequent than rʧ, of which
the language shows only a couple of cases. The same is true for rs, of which only a
couple of words exist. Also, t and k in final position are not the same in terms of
frequency: t is much more frequent than k. There are also some restrictions with tense
vowels (e.g., /i/ is less frequent before CC while /ɑ/ is more frequent).
VCC restrictions present a potential support for a quantity-based analysis, as I will
discuss in 5.4, in particular because, as I will show in 5.2, no restriction in VC co-occurrences
is observed in CVC syllable form, which suggests that the structure CVCC
imposes the restrictions. Recall that the vowels which I consider to be tense are bimoraic
under a quantitative account and those which I consider to be lax are monomoraic. The
sequence of a bimoraic vowel followed by a consonant shows no restrictions because the
result is not too heavy or too long a syllable, counting the number of morae, but when a
bimoraic vowel is followed by two consonants restrictions occur because the syllable is
too heavy. I will return to this in 5.4. Let us then first examine CVC.
5.2. CVC forms
In this section, I examine how the vowels pattern in CVC syllables. The purpose is to see
if a distinction in the distribution of a, e, o vs. ɑ, i, u exists in this syllable type. This
investigation shows whether there is a restriction in the occurrence of single consonants
after ɑ, i, u in general, or whether it is the syllable structure (CVCC) which imposes a
restriction on the consonants which can follow ɑ, i, u. For example, as Samareh notes
(see (3)), the consonant z cannot be C1 in CVC1C2 when V is ɑ, i, u. The question is
whether there is a restriction on the occurrence of z after ɑ, i, u in general, or whether
there is a restriction on the co-occurrence of ɑ, i, u with following z only in CVCC. I will
show that no restrictions between the vowels and following consonants are observed in
CVC.
Recall the list that Samareh gives for forbidden C1 after /ɑ, i, u/, that is, */q, ʔ, ǰ, z, h, m/
as C1 in C1C2, and consider (7).51 The examples in (7) show that it is not the case that
there is a restriction on ɑ, i, u occurring before these consonants. Whatever the restriction is, it is
related to CC after these vowels. In (7), I first give examples of the consonants that
Samareh mentions for restrictions on C1 in C1C2 (i.e., */q, ʔ, ǰ, z, h, m/) to show that these
consonants can occur in CVC where V is ɑ, i, u. These are shown in (7a)-(7f). I then
give examples of other consonants which are marked by a ‘?’ in the summary in (3) (i.e.,
?/b, t, d, k, n, l, r/). These are shown in (7g)-(7m). They are followed by examples of
those consonants which are marked by ‘√’ in (3) (i.e., √/s, f, x/ and √/ʃ/). These are given
in (7n)-(7q). In each line, I start with an example of ɑ, followed by an example of i,
followed by an example of u. There are a few cases for which the language does not have a
word with one or two of these vowels. In (7), I use IPA symbols (not those used by
Samareh).
51 I do not include /a, e, o/ in (7) as the restriction that Samareh refers to is about /ɑ, i, u/ in CVCC, so I want to examine /ɑ, i, u/ in CVC. To show that /a, e, o/ can occur before these consonants, I give some examples here: laɢ ‘loose’, kaʤ ‘crooked’, por ‘full’, dar ‘door’, fer ‘curl’, meh ‘fog’, nam ‘moisture’, dom ‘tail’, tab ‘fever’, xat ‘line’, bot ‘idol’, bad ‘bad’, xod ‘self’, rok ‘blunt’, bon ‘root’, ʃen ‘gravel’, xol ‘crazy’, pol ‘bridge’, del ‘heart’, mes ‘copper’, rox ‘face’, etc.
(7) ɑ, i, u in CVC
a. bɑɢ ‘garden’, tiɢ ‘blade’, duɢ ‘yogurt drink’
b. ʤuʔ ‘hunger’
c. tɑʤ ‘crown’, ɡiʤ ‘dizzy’
d. nɑz ‘cute’, tiz ‘sharp’, ruz ‘day’
e. rɑh ‘road’, pih ‘fat’, kuh ‘mountain’
f. nɑm ‘name’, bim ‘fear’, ʃum ‘ominous’
g. xɑb ‘sleep’, sib ‘apple’, ʧub ‘wood’
h. lɑt ‘hooligan’, xit ‘hopeless’, tut ‘berry’
i. bɑd ‘wind’, bid ‘moth’, zud ‘quick’
j. xɑk ‘soil’, nik ‘good’, xuk ‘pig’
k. nɑn ‘bread’, din ‘religion’, xun ‘blood’
l. xɑl ‘mole’, fil ‘elephant’, pul ‘money’
m. mɑr ‘snake’, dir ‘late’, dur ‘far away’
n. jɑs ‘jasmine’, lis ‘lick’, lus ‘spoiled’
o. sɑf ‘flat’, kif ‘purse’, buf ‘owl’
p. ʃɑx ‘horn’, six ‘skewer’, kux ‘hut’
q. mɑʃ ‘a kind of bean’, niʃ ‘sting’, muʃ ‘mouse’
Thus ɑ, i, u do not show restrictions with respect to the following consonant in CVC.
In 5.1, the possible consonant clusters after vowels in Persian words were examined. It
was shown that no real restriction is found after lax vowels in CVCC form. Some
restrictions, however, are seen after tense vowels in this structure. In this section, it was
shown that no real restriction is observed in Persian words after tense vowels or lax
vowels in CVC form.
In the next section, 5.3, I will discuss VC co-occurrence restrictions in tense/lax-based
systems in order to establish that the existence of such restrictions is not limited to
quantity-based languages. Both tense-based and quantity-based languages can show VC
restrictions.
5.3. Vowel-consonant co-occurrence restrictions in tense/lax-based vowel systems
In this section, I look at examples of restrictions on the co-occurrence of vowels and
consonants in languages whose vowel systems are tense/lax based. Looking at these
languages shows that restrictions on patterning of vowels with respect to syllable
structure and co-occurrence with following consonants in these systems exist.
In English, for example, lax vowels do not occur in #CV# but tense vowels do (e.g., bee
[bi] and bit [bɪt] exist in English, but [bɪ] does not). This does not necessarily mean that
English has a quantity-based system. We will return to #CV# in chapter 6. For now, let us
focus on tense and lax vowels and the following consonants. The distribution of English
tense and lax vowels has been extensively discussed in the literature (e.g., Chomsky and
Halle 1968, Halle and Mohanan 1985, Borowsky 1989, Green 2001). In English, only lax
vowels occur before consonant clusters with a non-coronal member (tense vowels cannot
occur before CC’s consisting of a sonorant plus a consonant unless the rightmost member
is a coronal). Also only lax vowels occur before ŋ. Thus restrictions on the type of
consonants or consonant clusters are observed in languages with tense-lax distinction.
Féry (2003) presents a study of Southern French dialects and shows that there are
restrictions on the type of consonants which can follow tense and lax vowels in Southern
French. In her study, Féry considers the vowels in “stressed syllables”, which are final
syllables in French. For example, consider ø (a tense vowel) and its lax counterpart, œ.
Only ø is allowed before a coronal obstruent (e.g., before [t] meute ‘pack’, before [z]
danseuse ‘dancer (fem.)’), and only œ before other consonants (e.g., before [f] neuf ‘new’, before [ʀ] heure ‘hour’).
In terms of number of following consonants, in Dutch, tense vowels can occur before
zero or one consonant. Lax vowels, however, can occur before one or two consonants
(van Oostendorp 2006). Compare (8a) and (8b) (a, [a:], is tense and ɑ is lax):
(8) a.  r[a:] ‘yard’    r[a:]m ‘window’    r[a:]p ‘turnip’    *r[a:]mp
    b.  *r[ɑ]           r[ɑ]m ‘ram’        r[ɑ]p ‘quick’      r[ɑ]mp ‘disaster’
van Oostendorp (2006) presents the following relation between tenseness and syllable
structure in Dutch: (i) a tense vowel has to be in an open syllable and a lax vowel has to be in
a closed syllable; (ii) the syllable rhyme contains at most two positions; and (iii) at the end of the
word, a syllable rhyme can be followed by one additional consonant (see van
Oostendorp 2006 for discussion). Therefore, tenseness and syllable structure can have an
interaction resulting in different patterning of tense and lax vowels in CV, CVC, or
CVCC.
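The way these statements jointly derive the pattern in (8) can be sketched in a few lines of Python. This is a deliberately simplified reading, not van Oostendorp's formal analysis: it assumes that a tense vowel leaves no room for a coda consonant, that a lax vowel takes exactly one, and that at most one further consonant is licensed word-finally. The function name and the data layout are hypothetical.

```python
def final_syllable_ok(vowel_is_tense, n_final_consonants):
    """Predict well-formedness of a word-final V(C)(C) sequence in Dutch."""
    # Tense vowel: open syllable, so no consonant inside the rhyme.
    # Lax vowel: closed syllable, so exactly one consonant inside the rhyme.
    coda_in_rhyme = 0 if vowel_is_tense else 1
    # Whatever is left over must fit the single word-final extra position.
    extra = n_final_consonants - coda_in_rhyme
    return 0 <= extra <= 1


# (form, vowel is tense, number of final consonants, attested in (8))
forms = [
    ("r[a:]", True, 0, True),
    ("r[a:]m", True, 1, True),
    ("r[a:]mp", True, 2, False),
    ("r[ɑ]", False, 0, False),
    ("r[ɑ]m", False, 1, True),
    ("r[ɑ]mp", False, 2, True),
]

for form, tense, n_cons, attested in forms:
    predicted = final_syllable_ok(tense, n_cons)
    print(f"{form:8} predicted={predicted!s:5} attested={attested}")
```

Under these assumptions the prediction matches the attested and starred forms in (8): the tense vowel is excluded before two consonants, and the lax vowel is excluded in an open syllable.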
Additionally, there are also other restrictions in terms of the nature of following
consonant(s) in VC and VCC in Dutch, a couple of which are presented below (van
Oostendorp 1995). In the context VFə (F = fricative), F is voiceless if and only if V is
lax. Consider a (a tense vowel) and ɑ (a lax vowel) in bl[ɑfə]n ‘bark’, *bl[afə]n, and
l[avə]n ‘refresh’. In Dutch monomorphemic VCV sequences, lax vowels are almost
always followed by voiceless fricatives and tense vowels are almost always followed by
voiced fricatives. As van Oostendorp notes, there is an interaction between the features
[lax] (in vowels) and [voice] (in fricatives) in Dutch. It is interesting that in Persian, too,
we see an interaction between [tense] of vowels and [voice] of fricatives – recall that C1
in VtenseC1C2# is generally a voiceless fricative in Persian. Furthermore, a restriction on
VCC in Dutch is that only lax vowels occur before word-final consonant clusters, VlaxC1C2#,
where C2 is non-coronal; the dorsal nasal behaves like a consonant cluster in this respect.
For example, compare b[ɑŋ] ‘afraid’ with *b[aŋ], b[ɑlk] ‘beam’ with *b[alk], w[ɛlp] ‘whelp’ with *w[elp]. With respect to C2 in VC1C2, it seems that there is a tendency
towards a coronal when V is tense, as we see in English, Dutch, and also Persian (the
most common C2 in VtenseC1C2# is /t/ in Persian).
This brief review of tense and lax vowels in other languages shows that what we see in
Persian with regard to the proposed VCC co-occurrence restrictions (i) is not unusual for
a tense-based language, because other languages with a tense-lax distinction also exhibit
restrictions in terms of both the nature and the number of following consonants; (ii) is
not an indication of underlying quantity.
Having said this, given that both tense-based and quantity-based languages may show
restrictions on VC co-occurrence, we need to consider the possibility of the restrictions
being an indication of underlying quantity in Persian. In 5.4, I will discuss how VC
restrictions might support quantity in Persian.
5.4. VCC restrictions: why a support for quantity?
In 5.1 and 5.2, I discussed CVC and CVCC in Persian with respect to co-occurrence
restrictions between the vowel and following consonant(s). It was shown that no real
restrictions exist in CVC syllable form regardless of the vowel. In CVCC#, however, ɑ, i, u, unlike a, e, o, show some restrictions regarding following consonants: only CC’s with
falling sonority occur after ɑ, i, u. This observation may seem to provide an argument for
underlying quantity in the system.
Under a quantitative account, ɑ, i, u are underlyingly bimoraic, while a, e, o are
monomoraic (see 4.2). In CVC forms both bimoraic and monomoraic vowels occur. In
CVCC forms when the vowel is monomoraic (a, e, o), no restriction is seen. In CVCC
forms when the vowel is bimoraic (ɑ, i, u), there are restrictions. This suggests that the
status of vowels as monomoraic or bimoraic may have an effect on the CC clusters in
CVCC. That is, weight appears to play a role in the type of consonant clusters that can
follow the vowels as the restriction is seen in CVCC (and not CVC which is a less heavy
structure). Recall the presentation of Persian syllable structure suggested in 4.2, repeated
here in (9) and (10) – see 4.2 for discussion of these structures and the assumptions from
which these follow.
(9) i. Lax vowel ii. Tense vowel
PWd PWd
Φ Φ
σ σ
μ μ μ μ
d a r k ɑ r
‘door’ ‘work’
(10) i. Lax vowel ii. Tense vowel
PWd PWd
Φ Φ
σ σ
μ μ μ μ μ
(ʔ ) a r ʤ m ɑ n d
‘value’ ‘last’
The only structure which shows vowel-consonant co-occurrence restrictions is the one in
(10ii). An analysis along the following lines might be suggested under a quantitative
approach: lax vowels, if monomoraic, would allow two following consonants of any
type; tense vowels, if bimoraic, would allow two following consonants only if the second
one is not a part of the syllable and is linked to the prosodic word. This stray consonant
could be considered to be the onset of the next syllable (in #CVCC# the onset of a
syllable with an empty nucleus; in the epenthesis cases, it is the onset of a vowel-initial
suffix). If the sonority sequencing principle is at play in Persian, the restrictions might
follow. I will return to this in 5.10.
An analysis based on weight and its effect on VCC co-occurrence in
CVCC# also seems to be confirmed by: (i) lexical #CV# being possible with ɑ, i, u, but
not with a, e, o – this will be discussed in chapter 6; (ii) patterning of vowels preceding
geminate consonants, which will be discussed next. I will return to the weight-based
analysis in section 5.10. In this section, I simply wanted to show why the VCC co-
occurrence restrictions can potentially be evidence for quantity and why they are
therefore worth studying.
5.5. Geminates
I now briefly discuss geminate consonants in Persian. Geminate consonants in Persian
and their phonemic status are not well studied and are a matter of controversy and
therefore need careful investigation, which is beyond the scope of this study. The reason
that I discuss geminates is that they show an aspect of VC(C) co-occurrence, the topic of
this chapter. Moreover, they occur more frequently after a, e, o than they do after ɑ, i, u.
The existence of a large number of Arabic-origin words which include geminate
consonants has made the distribution of geminates in Persian difficult to account for.
Mahootian (1997) considers gemination in non-Arabic native words to be often reduced,
that is, the consonants are produced as non-geminate. Hansen (2003) considers a
phonological contrast between geminates and singleton consonants in Persian, as seen in
mɑde ‘female’ as opposed to mɑdde ‘material’. Deyhime (2000), cited in Hansen (2003),
based on an experiment with 16 Persian speakers, says that geminate stop consonants in
Arabic-origin words are pronounced as geminates by all speakers; in native words,
however, the geminate stops are pronounced by some speakers as singletons. Lazard
(1992) points out that Persian has numerous cases of geminates (he calls these
‘doubling’) which are mostly of Arabic origin, giving examples such as naɢɢɑʃ ‘painter’,
kaffɑʃ ‘shoemaker’, etc. Lazard further says that in colloquial speech sometimes
gemination is expressive in nature, as in hammiʃe for hamiʃe ‘always’.
I examine geminate consonants with respect to their occurrence after different vowels.
That is, considering geminates as CC, I would like to see if a restriction in terms of
preceding vowels is observed with final geminate consonants.
In Persian, lax vowels, a, e, o, occur preceding geminate consonants far more than tense
vowels, ɑ, i, u, do. I went through the Persian Emami (2006) dictionary looking for
words with geminate consonants to study the distribution of vowels before geminates.
Preceding geminate consonants in final position, the occurrence of vowels is as follows
(other than the places where I say ‘and more’ at the end of the data, the list is exhaustive):
(11) ɑ, i, u before final geminate consonants
No example with i and u. Some with /ɑ/, as follows:
tɑmm ‘complete’
hɑdd ‘severe’
xɑss ‘special’
dɑll ‘indicative’
ʃɑɢɢ ‘arduous’
(12) a, e, o before final geminate consonants
/e/ hess ‘sense’
deɢɢ ‘grief’
serr ‘secret’
ʃeɢɢ ‘option’
zedd ‘opposite’
tebb ‘medicine’
/o/ hobb ‘love’
dorr ‘jewel’
robb ‘paste of fruits’
ɢodd ‘stubborn’
koll ‘all’
/a/ ʤadd ‘ancestor’
ʤavv ‘atmosphere’
hadd ‘limit’
hazz ‘joy’
haɢɢ ‘right’
hakk ‘engraving’
hall ‘solving’
xatt ‘line’
rabb ‘god’
sadd ‘obstacle’
ʃarr ‘evil’
ʃakk ‘doubt’
ʃamm ‘intuition’
zann ‘suspicion’
farr ‘splendour’
laɢɢ ‘loose’
and more
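The tally behind (11) and (12) can be illustrated with a short script. This is only a minimal sketch: it assumes that each segment is written with a single character (as in the transcriptions used here), the function names are hypothetical, and the /a/ forms are a five-word sample of (12), so the lax count understates the real figure.

```python
from collections import Counter

TENSE = set("ɑiu")
LAX = set("aeo")
VOWELS = TENSE | LAX


def vowel_before_final_geminate(word):
    """Return the vowel preceding a word-final geminate, or None if there is none."""
    if len(word) < 3:
        return None
    vowel, c1, c2 = word[-3], word[-2], word[-1]
    if vowel in VOWELS and c1 == c2 and c1 not in VOWELS:
        return vowel
    return None


def tally_by_class(words):
    """Count final-geminate words by the class (tense or lax) of the preceding vowel."""
    counts = Counter()
    for word in words:
        vowel = vowel_before_final_geminate(word)
        if vowel is not None:
            counts["tense" if vowel in TENSE else "lax"] += 1
    return counts


# Forms copied from (11) and (12); the /a/ set is only a five-word sample.
SAMPLE = [
    "tɑmm", "hɑdd", "xɑss", "dɑll", "ʃɑɢɢ",
    "hess", "deɢɢ", "serr", "ʃeɢɢ", "zedd", "tebb",
    "hobb", "dorr", "robb", "ɢodd", "koll",
    "ʤadd", "hadd", "haɢɢ", "xatt", "zann",
]

print(tally_by_class(SAMPLE))
```

Even with the /a/ set truncated, the lax class is larger than the tense class in this sample, in line with the frequency difference reported here.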
As seen above, the lax vowels (in particular /a/) occur more frequently than the tense
vowels before final geminate consonants. It should be added that, as Lazard also says,
final geminates in Persian usually are not clear unless there is a following vowel, for
example:52
(13) a. zann ‘suspicion’ zann-e ɢavi ‘strong suspicion’
b. zan ‘woman’ zan-e ɢavi ‘strong woman’
52 A note should be made about the z sound in these two words. In pronunciation, the two z’s in these words are pronounced the same. In writing, they are different due to the Perso-Arabic script.
Let us now look at medial geminates. Medial geminates are easy to distinguish, as in:
(14) a. mɑde ‘female’ mɑdde ‘material’
b. farɑr ‘escape’ farrɑr ‘volatile’
c. halɑl ‘permissible (in religion)’ hallɑl ‘solvent’
d. kore ‘sphere’ korre ‘the young of some animals’
e. banɑ ‘building’ bannɑ ‘bricklayer’
The distribution of vowels before medial geminates is similar to that before final
geminates: geminates occur more with a preceding lax vowel than with a preceding tense
vowel.
(15) ɑ, i, u before medial geminate consonants
No words with u
/ɑ/ ɢɑrre ‘continent’
hɑrre ‘tropical’
ʤɑdde ‘road’
mɑdde ‘material’
ʃɑmme ‘the sense of smell’
/i/ nijjat ‘intention’
raʔijjat ‘peasant’
(ʔ)atijje ‘gift’
(16) a, e, o before medial geminate consonants
/e/ pelle ‘stair’
ɢesse ‘story’
ʧelle ‘bowstring’
zelle ‘pestered’
xette ‘territory’
(ʔ)ellat ‘reason’
ʃeddat ‘intensity’
reɢɢat ‘fluidity’
xeffat ‘humiliation’
/o/ korre ‘foal’
ɢolle ‘peak’
ɢosse ‘sorrow’
moddat ‘period’
torre ‘a lock of hair’
ɢodde ‘gland’
ɢovve ‘strength’
(ʔ)ommol ‘old-fashioned’
/a/ darre ‘valley’
barre ‘lamb’
zarre ‘particle’
(ʔ)arre ‘saw’
ɢarre ‘cocky’
lakke ‘spot’
dabbe ‘container’
dabbɑɢ ‘tanner’
ʤarrɑh ‘surgeon’
ɢassɑb ‘butcher’
bannɑ ‘bricklayer’
tarrɑh ‘designer’
baʧʧe ‘child’
(ʔ)amme ‘aunt’
hammɑl ‘porter’
raɢɢɑs ‘dancer’
lappe ‘split pea’
and more
Preceding geminate consonants, regardless of whether they occur finally or medially, lax
vowels are more common, although some words with /ɑ/ and /i/ in this position do occur.
Given the proposed syllable structure under a quantity analysis (see 5.4), a bimoraic
vowel followed by a geminated consonant is a predicted structure. While there are
differences in frequency, both structures occur. The distribution of vowels before
geminates thus does not provide evidence for an analysis based on underlying quantity.
5.6. On Syllabification in Persian
In this section, I provide some background on syllabification of consonants in Persian
followed by some examples to clarify the way I syllabify words. There is agreement on
syllabification in Persian (Samareh 1985, Meshkatod Dini 1999, Emami 2006 among
others).
(i) No syllable-initial consonant clusters exist in the language. Medial and final CC,
however, are permitted, syllabified as C.C medially.
(ii) Words with CVCV(C) structure are syllabified as CV.CV(C)
(iii) Words with CVCCV(C) structure are syllabified as CVC.CV(C)
(iv) Words with CVCCCV(C) structure are syllabified as CVCC.CV(C). This is not a
common pattern within a stem and there are only a very few words in Persian with this
structure.
Note that, as pointed out in section 2.2.1.1 (footnote 6), there is debate in the Persian
literature as to whether a syllable can start with a vowel or whether there is in fact a
glottal stop in initial position of seemingly vowel-initial syllables. I leave this issue aside
in this thesis. I show structures starting with C here, but if one accepts that
Persian has vowel-initial syllables, the initial C can be considered
optional, (C).
I now present some examples illustrating syllabification. Since the focus of our
discussion is on medial consonant(s), I show examples of one, two, and three consonants
in a medial position to show how they are syllabified (three is the maximum number of
consonants one can get in a medial position in Persian).
(17) CVCV CV.CV
a. nedɑ ‘call’ ne.dɑ
b. ʒɑle ‘dew’ ʒɑ.le
c. zire ‘cumin’ zi.re
(18) CVCCV CVC.CV
a. maxfi ‘hidden’ max.fi
b. bɑtlɑɢ ‘swamp’ bɑt.lɑɢ
(19) CVCCCV CVCC.CV (a rare pattern)
a. surtme ‘sled’ surt.me
b. jurtme ‘trot’ jurt.me / jort.me
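The patterns in (17)-(19) all follow from a single statement: every non-initial vowel takes exactly one preceding consonant as its onset, and any remaining medial consonants close the preceding syllable. A minimal sketch of this is given below. It assumes one character per segment, the function name is hypothetical, and the treatment of adjacent vowels (a boundary between them) is an assumption of the sketch, not something discussed here.

```python
VOWELS = set("ɑiuaeo")


def syllabify(word):
    """Insert '.' so that each non-initial vowel has exactly one onset consonant."""
    vowel_positions = [i for i, seg in enumerate(word) if seg in VOWELS]
    cuts = []
    for prev, cur in zip(vowel_positions, vowel_positions[1:]):
        # With at least one intervening consonant, the last one is the onset.
        # With adjacent vowels, the boundary falls between them (an assumption).
        cuts.append(cur - 1 if cur - prev > 1 else cur)
    pieces, start = [], 0
    for cut in cuts:
        pieces.append(word[start:cut])
        start = cut
    pieces.append(word[start:])
    return ".".join(pieces)


for form in ["nedɑ", "maxfi", "bɑtlɑɢ", "surtme"]:
    print(form, "->", syllabify(form))
```

The same function also handles longer forms cited in this chapter, such as ʃenɑxt and ɑzaraxʃ.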
With this background on syllabification, I now consider medial consonant clusters.
5.7. On different syllable structures
Recall the VCC co-occurrence restrictions in CVtenseCC# which are met in the native
vocabulary (see 5.9 on loan vocabulary). The overall picture of CV, CVC, CVCC
monosyllabic structures in Persian with respect to the vowels is presented in (20). ‘Yes’
means the structure is possible with the particular vowels and ‘No’ means it is not. Note
that ‘No’ under a, e, o needs comment (see chapter 6), which is why I qualify it with
‘roughly’.
(20) Vowel restrictions in native Persian vocabulary (monosyllables)
          ɑ, i, u                                                      a, e, o
#CV#      Yes                                                          No (roughly)
#CVC#     Yes                                                          Yes
#CVCC#    Restricted in native words (only falling sonority CC’s)      Yes
In general, considering all possibilities in CVC and CVCC, as I will show below, only
monosyllabic words with CVtenseCC structure (e.g., mɑst ‘yogurt’) and polysyllabic
words with CVtenseCC in final position (e.g., ʃe.nɑxt ‘familiarity, knowledge’) show
restrictions on the sequencing of the following consonants.
Let us first review monosyllabic CVCC words. All vowels can occur in monosyllabic
words of the shape CVCC. No restriction is seen on the consonants following a, e, o.
There are restrictions on the consonant following ɑ, i, u in native words (only consonant
clusters with falling sonority occur after tense vowels –see (6)). Recall that the most
frequent CC after tense vowels is voiceless fricatives followed by t (see (6)). Some of the
words with CVCC are nouns related to infinitives.
(21) a. bɑxt ‘failure’ bɑxt-an ‘to lose’
b. bast ‘clip’ bast-an ‘to bind’
The possible CC combinations do not change by including or excluding this class of
nouns.
One may ask about the distribution of vowels in second syllables in CV(C).CVtenseCC
and CV(C).CVlaxCC, that is, in polysyllabic words ending in CVCC. Checking the
Emami dictionary (2006) for words with the structure CV(C).CVtenseCC and
CV(C).CVlaxCC, I found that words of the shape CV(C).CVlaxCC far
outnumber those of the shape CV(C).CVtenseCC. Words of the form
CV(C).CVtenseCC are few and are mainly limited to nouns
related to infinitives, as follows.
(22) a. ʃe.nɑxt ‘familiarity, knowledge’ ʃenɑxt-an ‘to know’
b. vi.rɑst ‘edition’ virɑst-an ‘to edit’
c. par.dɑxt ‘payment’ pardɑxt-an ‘to pay’
d. an.bɑʃt ‘accumulation’ anbɑʃt-an ‘to accumulate sth’
No additional CC in CVCC is observed for tense vowels in CV(C).CVtenseCC beyond
those found in (6). Again with lax vowels more possibilities are observed, although
clusters are more limited than with monosyllabic CVlaxCC (perhaps because CVCC words
outnumber CV(C).CVCC words). Some examples are presented in
(23):
(23) a. ɡon.ʤeʃk ‘sparrow’
b. an.ɡoʃt ‘finger’
c. ba.nafʃ ‘purple’
d. ɡa.vazn ‘deer’
e. ve.larm ‘tepid’
f. to.fanɡ ‘gun’
g. be.renʤ ‘rice’
h. ʃe.ɡarf ‘great’
i. ɑ.za.raxʃ ‘lightning’
With lax vowels the most common final CC in CV(C). CVlaxCC are nʤ, nɡ, nd, rd, ʃt, ʃk,
st, and then rɡ, ft, xt, and there are some xʃ, rm, nt, xr, rz, hr, br, rs, fs, fʃ, zn, rf.
So far, I have discussed CC’s in final position: whether monosyllabic or polysyllabic, only
VtenseCC# shows restrictions. One possibility to account for this restriction, as noted
above, is that syllabification is as follows: VtenseC.CØ, where Ø is an empty nucleus. If
so, then one might expect medial restrictions after a tense vowel as well. I now examine
CVtenseCC in medial position to examine whether the observed restrictions exist in this
position as well.
Thus, I will examine the sonority of CC in Persian non-final CVCC sequences, in order
to present a complete picture of vowel and consonant(s) sequences in Persian, and, more
importantly, because there are languages which obey the Syllable Contact Law (SCL),
according to which, in a sequence VC1.C2V, the sonority value of C1 must be greater than
the sonority value of C2 (e.g., Vennemann 1988, Clements 1990, Beckman 2004). An
example is Tamil in which SCL is never violated (Beckman 2004).
I disregard the clusters across morpheme boundaries, as in the following words, and
focus on monomorphemic words.
(24) a. xɑr.dɑr ‘barbed’ (xɑr ‘barb’+ dɑr ‘present stem of dɑʃtan ‘to have’’)
b. bɑr.bar ‘porter’ (bɑr ‘baggage’+ bar ‘present stem of bordan ‘to carry’’)
c. pir.mard ‘old man’ (pir ‘old’ + mard ‘man’)
d. xun.sard ‘cold-blooded’ (xun ‘blood’ + sard ‘cold’)
e. xaʃm.ɡin ‘angry’ (xaʃm ‘anger’ + ɡin (a suffix))
Also, I do not consider cases such as pɑnz.dah ‘fifteen’ and ʃɑnz.dah ‘sixteen’ –dah
means ‘ten’. The language does not have pɑnz and ʃɑnz as independent morphemes
(panʤ and ʃeʃ are words for ‘five’ and ‘six’ respectively). There are very few words like
these which I disregard.
In (25), I summarize the possibilities of CV, CVC, and CVCC sequences as well as
CVCV and CVCCV, showing whether medial C’s and CC’s are restricted with
respect to their preceding vowels (syllables are divided by ‘.’). The summary should be
read as follows. Consider, for example, the first CV sequence in (25a). It is
monomorphemic and monosyllabic. The ‘Yes’ under ɑ, i, u for this structure indicates
that this type of CV can occur with these vowels. The comments under a, e, o explain
how this structure behaves with respect to a, e, o. In general, ‘Yes’ indicates that the
vowels in question are possible in that environment; ‘No’ means that they are not.
(25) Possible consonants and consonant clusters following vowels in native vocabulary
a. #CV# sequence (CV#); monomorphemic, monosyllabic
   ɑ, i, u: Yes (e.g., pɑ ‘foot’)
   a, e, o: - No i (in lexical words)
            - No e (in lexical words)
            - No a in final position in Persian
            - o needs discussion (see chapter 6)
b. CVCV sequence; monomorphemic, polysyllabic
   ɑ, i, u: Yes (e.g., pa.tu ‘blanket’, ʤɑ.ri ‘current’)
   a, e, o: Yes (e.g., ka.re ‘butter’)
c. #CVC# sequence (CVC#); monomorphemic, monosyllabic
   ɑ, i, u: Yes; no restriction on C in coda (e.g., sud ‘benefit’)
   a, e, o: Yes; no restriction on C in coda (e.g., por ‘full’)
d. CVCCV(C) sequence; monomorphemic, polysyllabic
   ɑ, i, u: Yes; no restrictions (e.g., fɑx.te ‘stock dove’)
   a, e, o: Yes; no restrictions (e.g., tar.xun ‘tarragon’)
e. CVCC# sequence (CVCC#); monomorphemic, monosyllabic or polysyllabic
   ɑ, i, u: Yes; restrictions on CC (e.g., pust ‘skin’, vi.rɑst ‘edition’)
   a, e, o: Yes; no restrictions on CC (e.g., fekr ‘thought’, ɡa.vazn ‘deer’)
Examples for each of the structures shown in (25) along with some notes are found in the
appendix of this chapter.
Given that our focus here is on medial consonant sequences preceded by tense vowels, I
elaborate on this structure.
(26) presents cases with C.C sequences (note: syllabification breaks up medial CC’s):
(26) ɑ, i, u a, e, o
nɑr.ɡil ‘coconut’ dar.jɑ ‘sea’
bɑt.lɑɢ ‘swamp’ tar.xun ‘tarragon’
kɑs.ni ‘chicory’ ban.dar ‘port’
tus.kɑ ‘alder’ nos.xe ‘prescription’
ʃir.ʤe ‘dive’ keʃ.var ‘country’
All vowels can occur. No particular restriction is seen unlike monosyllabic CVtenseCC
words or polysyllabic words ending in CVtenseCC, which show restrictions. A summary of
combinations of consonants after tense vowels in medial position is given in (27). Note
that “?” in (27) marks cases where I am not sure whether the forms are
suffixed/compound forms or roots. C1’s and C2’s are ordered by sonority (stops,
affricates, fricatives, nasals, liquids, glides).
(27) C1 and C2 in CVtenseC1C2 (monomorphemic; polysyllabic (e.g., bɑt.lɑɢ ‘swamp’))
C2
C1
p b t d k ɡ ɢ ʧ ʤ f v ʃ z h m n r l
p √
b ? √ √ √
t √
d ? ?
k ?
ɢ √
ʧ √
f √
v √
s √ √ √ √ √ √ √
ʃ √ √ √ √ √
z √
x √
m √ ? √ √
n √
r √ √ ? √ ? √ √ √ √ √ √
l √ √ √
The Syllable Contact Law, therefore, is not met between syllables in Persian; it is
restricted to word final position.
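The check against the Syllable Contact Law can be made concrete for the forms in (26). The following is a minimal sketch under stated assumptions: the numeric sonority ranks are hypothetical values that simply follow the ordering used in (27) (stops, affricates, fricatives, nasals, liquids, glides), the assignment of individual consonants to classes is my own simplification, and the function name is illustrative.

```python
# Hypothetical numeric ranks; only their relative order matters.
SONORITY_CLASSES = [
    ("pbtdkɡɢʔ", 1),  # stops
    ("ʧʤ", 2),        # affricates
    ("fvszʃʒxh", 3),  # fricatives
    ("mn", 4),        # nasals
    ("rl", 5),        # liquids
    ("j", 6),         # glides
]
SONORITY = {seg: rank for segs, rank in SONORITY_CLASSES for seg in segs}


def obeys_scl(syllabified):
    """True if sonority falls across every C1.C2 syllable contact in the form."""
    syllables = syllabified.split(".")
    for left, right in zip(syllables, syllables[1:]):
        c1, c2 = left[-1], right[0]
        # Only consonant-consonant contacts are relevant to the law.
        if c1 in SONORITY and c2 in SONORITY and SONORITY[c1] <= SONORITY[c2]:
            return False
    return True


for form in ["nɑr.ɡil", "bɑt.lɑɢ", "kɑs.ni", "tus.kɑ", "ʃir.ʤe"]:
    print(form, obeys_scl(form))
```

On these assumptions, forms such as bɑt.lɑɢ and kɑs.ni have rising sonority across the syllable boundary after a tense vowel, which is the kind of contact the law excludes.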
5.8. VCC# restrictions: a tense-based account
The observation that the group ɑ, i, u is restricted with respect to possible clusters in
VCC# may suggest underlying bimoraicity of ɑ, i, u and monomoraicity of a, e, o, as
noted above. Under this account, in native CVCC’s, where a bimoraic vowel is followed
by a consonant cluster, only clusters with falling sonority can occur.
I argue in this section that the VCC# restrictions need not be attributed to weight, but are
compatible with a quality analysis based on tenseness.
First of all, recall that restrictions on vowel-consonant(s) sequences are observed in
tense/lax-based languages and are not necessarily an indication of the existence of
underlying quantity (see 5.3).
Moreover, recall the discussion of epenthesis in suffixation in chapter 4. I argued that
even if epenthesis was a productive process (it was shown that it is not) we would not
need underlying quantity to account for it, as underlying tense vowels can project two
morae. Here, too, if the restrictions in CVCC# are taken to be systematic, there is no need
for underlying quantity: quantity is projected from tenseness. At word edge, we might
suppose an analysis along the following lines. An underlying tense vowel projects two
morae, and there can be one following consonant; any additional consonant must be
syllabified as an onset of a following syllable (in this case a syllable with an empty
nucleus; recall that in the epenthesis process, the following syllable was a vowel-initial
suffix). Lax vowels, if projecting one mora, would allow two following consonants of
any type. If the sonority sequencing principle is at play in Persian, the restrictions might
follow, as discussed below. Thus the representations given in (9) and (10) (see 5.4) show
only the surface representation of Persian vowels; underlyingly all vowels are
monomoraic, distinguished by [tense].
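The word-edge parse just described can be sketched as follows. This is a simplified illustration, not a full prosodic analysis: it assumes a trimoraic ceiling on the syllable, one mora per coda consonant, two projected morae for tense vowels and one for lax vowels, and one character per segment. The function name and the constant are hypothetical.

```python
TENSE = set("ɑiu")
LAX = set("aeo")
MAX_MORAE = 3  # assumed trimoraic ceiling on the syllable


def parse_final_rhyme(word):
    """Split the consonants after the last vowel into (coda, stray).

    'stray' holds any consonant that does not fit in the syllable and would be
    syllabified as the onset of a following syllable with an empty nucleus.
    """
    last_vowel = max(i for i, seg in enumerate(word) if seg in TENSE | LAX)
    vowel_morae = 2 if word[last_vowel] in TENSE else 1
    following = word[last_vowel + 1:]
    room = MAX_MORAE - vowel_morae  # moraic slots left for coda consonants
    return following[:room], following[room:]


for form in ["mɑst", "dust", "bast", "fekr", "kɑr"]:
    print(form, parse_final_rhyme(form))
```

Under these assumptions, the second consonant after a tense vowel (the t of mɑst) is left outside the syllable, while both consonants after a lax vowel (the st of bast) fit inside it.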
Considering medial CC sequences, we might expect similar patterns internally, with
restrictions on C.C sequences following tense vowels but not following lax vowels. We
do not see these, however, as discussed in 5.7. The question is: Why might this be?
There is a major difference between final and non-final position in Persian: final vowels
are stressed (see 2.2.1.2). Given this, the restrictions could be related to stress. Compare
(28) with (29) (stress is on the vowel in bold; tense vowels and their following CC are
underlined; a syllable boundary is shown by ‘.’). In (28), where stress is not on the tense
vowel which is followed by a CC (i.e., when the vowel is not in the final syllable), any
CC can follow the vowel. In (29), where stress is on the tense vowel (i.e., when the vowel
is in the final syllable) only falling sonority CC’s occur after the vowel.
(28) bɑt.lɑɢ ‘swamp’
ɢɑb.la.me ‘cooking pot’
ɡus.fand ‘sheep’
tus.kɑ ‘alder’
(29) bɑft ‘texture’
dust ‘friend’
vi.rɑst ‘edition’
ʃe.nɑxt ‘familiarity, knowledge’
Thus restrictions are seen in VtenseCC# but not in VtenseCCV. Considering stress and
syllable boundaries (stress is shown by the vowel in bold; a syllable boundary is shown
by ‘.’): VtenseCC in (CV.C)VCC# shows restrictions; VtenseCC in (C)VC.CV(C) does not
show restrictions. This is summarized in (30):
(30)
Underlying vowel    Structure    Stress    Restrictions on CC
Vtense              VCC#         √         √
Vtense              VCCV         X         X
Given that the restrictions occur only in final (stressed) position, it is reasonable to
assume that quantity is not an underlying distinction but is projected.
In the next section I examine restrictions on clusters in loanwords in order to further
examine VCC# restrictions. However, before leaving this section, it is appropriate to
return to words of the structure CVCC.CV. While there are only a couple of these (see
(31)), it is interesting to note that the first vowel is /u/, a tense vowel. Given a
quantitative analysis with a restriction to trimoraic syllables, these forms come as a
surprise: the vowel would occupy two morae, and there is not space for two additional
consonants. Such forms are not unexpected under a qualitative analysis, however,
because the vowel would occupy one mora and there is still space for the two following
consonants.
(31) a. surt.me ‘sled’
b. jurt.me / jort.me ‘trot’
To sum up, I have examined different syllable structures with tense and lax vowels in
Persian native words, and the only structure which shows restrictions is CC in
CVtenseCC#. Let us consider the restrictions seen after tense vowels once more:
(32) Possible consonant clusters after tense vowels in CVCC in native words (C Vtense C1 C2)
C1 = voiceless fricative, C2 = t    The most common pattern (no /ʃ/ after /i/)
C1 = voiceless fricative, C2 = k
C1 = n, C2 = d, ɡ
C1 = r, C2 = d, k, s, ʧ
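The native generalization in (32) amounts to requiring falling sonority in a final CC after a tense vowel. A minimal sketch of that check is given below. The numeric ranks are hypothetical values following the stop-to-glide ordering used in (27), the class membership of individual consonants is my own simplification, and the function names are illustrative. The forms dubl and ritm are loanwords taken up in 5.9 and are included only for contrast.

```python
TENSE = set("ɑiu")
LAX = set("aeo")
VOWELS = TENSE | LAX

# Hypothetical numeric ranks; only their relative order matters.
SONORITY_CLASSES = [
    ("pbtdkɡɢʔ", 1),  # stops
    ("ʧʤ", 2),        # affricates
    ("fvszʃʒxh", 3),  # fricatives
    ("mn", 4),        # nasals
    ("rl", 5),        # liquids
    ("j", 6),         # glides
]
SONORITY = {seg: rank for segs, rank in SONORITY_CLASSES for seg in segs}


def final_cluster_profile(word):
    """Classify a word-final VCC as 'falling', 'rising' or 'level'; None otherwise."""
    if len(word) < 3:
        return None
    vowel, c1, c2 = word[-3], word[-2], word[-1]
    if vowel not in VOWELS or c1 not in SONORITY or c2 not in SONORITY:
        return None
    if SONORITY[c1] > SONORITY[c2]:
        return "falling"
    if SONORITY[c1] < SONORITY[c2]:
        return "rising"
    return "level"


def fits_native_pattern(word):
    """False only when a tense vowel is followed by a final CC that does not fall."""
    profile = final_cluster_profile(word)
    if profile is None or word[-3] not in TENSE:
        return True
    return profile == "falling"


for form in ["mɑst", "dust", "fekr", "dubl", "ritm"]:
    print(form, final_cluster_profile(form), fits_native_pattern(form))
```

Note that fekr has a rising cluster but passes the check, because the restriction is stated only for tense vowels.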
Next I examine loanwords and the clusters seen in them after tense vowels.
5.9. Loanwords
Consonant clusters which follow tense vowels in the native Persian lexicon display
falling sonority (see (20)). With loanwords things are different. Following tense vowels:
(i) there are falling sonority clusters found in loanwords similar to those found in native
words (see (33)); (ii) there are falling sonority clusters found in loanwords, but not in
native words (see (34)); (iii) there are rising sonority clusters which are forbidden in
native words but are found in loanwords (see (35)). Thus, when loanwords are taken into
account, the distribution of clusters is not restricted, no matter whether the vowel is tense
or lax.
The lexicon of Persian includes many loanwords from a variety of languages, mostly
from Arabic and some from Turkish and other languages – such words are very much
integrated into the Persian lexicon and are not considered as loanwords.53 The loanwords
discussed now are from English and French.
53 In order to make sure of the origins of the words, I checked the Emami dictionary (2006) (this dictionary indicates if a word is considered a loan) as well as the Dehkhoda encyclopedia.
Some loanwords from French or English follow the combinations that were seen above
for native words, such as those in (33). The examples all involve tense vowels and the
relevant environment under study.
(33) a. burs ‘bursary’
b. rinɡ ‘ring’54
c. ʃift ‘shift’
d. pɑrkinɡ ‘parking’
e. risk ‘risk’
f. ʒimnɑst ‘gymnast’
g. estɑndɑrd ‘standard’
h. mɑrk ‘mark’
i. mɑsk ‘mask’
Various combinations including the following are also observed in loan words. Some of
these words are commonly used in the language and others are less frequently used.
Some of the very common loanwords are given below. These clusters also show falling
sonority, as expected after the vowels ɑ, i, u.
(34) a. film ‘film, movie’
b. ʃɑns ‘chance’
c. konferɑns ‘conference’
d. lisɑns ‘licence’
e. lɑmp ‘lamp’
f. bɑnk ‘bank’
54 Words such as ranɡ ‘color’ are transcribed as ranɡ or ranŋɡ in the Persian literature (Samareh 1985, Meshkatod Dini 1999). I show the sequence of nɡ as nɡ throughout.
In addition, there are clusters that are unexpected after ɑ, i, u, clusters with rising
sonority, given the pattern in the native lexicon. For instance, there are stops as the first
consonant in the clusters, or a fricative followed by a nasal in very common loan words,
such as the following:
(35) a. dubl ‘double’
b. luks ‘luxe’
c. pudr ‘powder’
d. ʧips ‘chips’
e. zirɑks ‘zirax’
f. kɑdr ‘cadre’
g. ritm ‘rhythm’
h. sikl ‘cycle’
i. litr ‘litre’
j. fiks ‘fix’
k. fɑks ‘fax’
l. titr ‘title’
m. kɑbl ‘cable’
n. fibr ‘fibre’
o. ɡoɑtr ‘goitre’
p. bɑrfiks ‘barfix’
q. ɑsm ‘asthma’
r. espɑsm ‘spasm’
There are also words with –ism and –ist as suffix (note: -st after tense vowels is found in
native words), such as:
(36) a. kɑpitɑlism, kɑpitɑlist ‘capitalism, capitalist’
b. ʒurnɑlism, ʒurnɑlist ‘journalism, journalist’
c. nɑsijonɑlism, nɑsijonɑlist ‘nationalism, nationalist’
To these, some technical terms such as the following can be added. Here there are
consonant clusters like native possibilities as (37c), and some unlike them as (37b) – as
for (37a), nd after tense vowels exists in the language but not nt, but nt also meets the
requirement of falling sonority after tense vowels.
(37) a. tɑnʒɑnt ‘tangent’
b. dijɑfrɑɡm ‘diaphragm’
c. sɑrs ‘the disease SARS’
Thus, there exists a range of consonant clusters occurring after tense vowels in non-native
words. Several of these consonant clusters are also observed in native Persian words (and
in Arabic-origin words very commonly used in Persian), but only after lax vowels, a, e, o. Some
examples are given in (38). These show that the following CC combinations (among
others) are acceptable in the language; it is just that they do not occur after tense vowels
in the native lexicon.
(38) a. zahr ‘poison’
b. mozd ‘wage’
c. ʃemʃ ‘bullion’
d. sabk ‘style’
e. tabl ‘drum’
f. ʧatr ‘umbrella’
g. potk ‘sledge-hammer’
h. bekr ‘intact’
i. fekr ‘thought’
j. vazn ‘weight’
k. seɢl ‘gravity’
l. boɢz ‘grudge’
m. maks ‘pause’
n. sabr ‘patience’
o. hatm ‘certainty’
It is important to ask if loanwords and native words should be treated together. Some of
the loanwords are very much integrated into Persian and, for instance, are used with
Persian native words in compounds. Consider the following example:
(39) ʃɑns ‘chance’ (loan from French)
a. xoʃ ʃɑns b. bad ʃɑns c. kam ʃɑns
good chance bad chance little chance
‘lucky’ ‘unlucky’ ‘not having so much luck’
For some of the loanwords there is no native equivalent Persian item, such as:
(40) a. bɑnk ‘bank’
b. film ‘movie’
c. ʃɑrʒ ‘charge (battery, feelings/energy)’
d. luks ‘luxe’
When loanwords are included in the study of VCC combinations, various
consonant clusters in CVCC occur following a tense V as well as a lax V. An
investigation of possible CC’s in CVCC loanwords that have entered Persian is
important because loanwords shed light on the underlying structure of a language. The
VCC# restrictions are limited to native words; they are not observed in VCC# loanwords.
Let us see what this suggests, comparing the behavior of harmony with that of VCC
co-occurrence restrictions in loanwords. Harmony patterns occur in loanwords, just as in
Persian native words (recall from 2.3.3 that the epenthetic vowel undergoes harmony in
loanwords). By contrast, the types of CC’s seen in loanwords (both rising and falling
sonority are observed after tense vowels) and those seen in native words (only falling
sonority is observed after tense vowels) are not the same. Patterns seen in loanword
adaptation often reveal aspects of native phonology which are not necessarily apparent
from native data (see Kang 2011). The similarity of harmony patterns in loanwords and
native words, set against the difference in VtenseCC co-occurrence possibilities,
suggests that harmony is a productive process, whereas the restriction to falling
sonority of the CC’s after tense vowels is not a systematic generalization. The
comparison thus informs us about the native phonology of Persian,
suggesting that harmony involves the underlying level ([tense] is underlyingly present)
while VCC restrictions are either on the surface (that is, they are not due to an underlying
property of vowels) or are not real restrictions.
Recall the chart in (20), the overall picture of CV, CVC, and CVCC in native words with
respect to vowels. I repeat it in (41).
(41)
          ɑ, i, u                                                      a, e, o
#CV#      Yes                                                          No (roughly)
#CVC#     Yes                                                          Yes
#CVCC#    Restricted in native words (only falling sonority CC’s)      Yes
Looking more broadly at the language and taking loanwords into account, we thus see
that even at an edge, there are no restrictions on clusters following tense vowels.
Therefore, considering the loanwords with the shape of CVtenseCC, with which we see
CC’s with rising sonority, the chart in (41) could be revised, as in (42). Basically, the gap
observed in CVCC# native words is filled by very commonly-used loanwords.
(42)
          ɑ, i, u                                                      a, e, o
#CV#      Yes                                                          No (roughly)
#CVC#     Yes                                                          Yes
#CVCC#    Yes (considering both native words and loanwords)            Yes
Support for the conclusion that falling sonority in final consonant clusters is not a
systematic generalization in Persian comes from loanwords with initial consonant
clusters. Loanwords with final CC’s with rising sonority are accepted in Persian without
any adjustment as discussed in this section. Initial consonant clusters, however, are
absolutely forbidden in Persian as the occurrence of epenthesis shows (see 2.3.3 for
discussion on epenthesis in loanwords). (43) presents examples of loanwords with initial
consonant clusters (‘.’ shows a syllable boundary). The examples show that an epenthetic
vowel is inserted to break up initial consonant clusters.
(43) ski → es.ki
small → es.mɑl
class → ke.lɑs
press → pe.res
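The repairs in (43) can be sketched as a small procedure. This is a hedged illustration only: the rule that s-initial clusters take the vowel before the cluster and other clusters take it between the two consonants is inferred from the four forms in (43), the epenthetic vowel is fixed as e (harmony, discussed in 2.3.3, is not modelled), and the input strings are hypothetical segmental renderings of the source words with one character per segment.

```python
VOWELS = set("ɑiuaeo")


def repair_initial_cluster(form, epenthetic="e"):
    """Break up a word-initial CC with an epenthetic vowel; leave other forms alone."""
    if len(form) >= 2 and form[0] not in VOWELS and form[1] not in VOWELS:
        if form[0] == "s":
            # s + C: the vowel is placed before the cluster (ski -> eski).
            return epenthetic + form
        # Other clusters: the vowel is placed between the consonants (klɑs -> kelɑs).
        return form[0] + epenthetic + form[1:]
    return form


# Hypothetical input transcriptions for 'ski', 'small', 'class', 'press'.
for source in ["ski", "smɑl", "klɑs", "pres"]:
    print(source, "->", repair_initial_cluster(source))
```

Forms that already begin with CV, such as film, pass through unchanged, which matches the observation that only initial clusters trigger adjustment.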
If having consonant clusters with falling sonority after tense vowels were truly related to
the syllable structure of Persian or properties of Persian vowels, we would expect to see
adjustments in loanwords which have final CC’s with rising sonority. But we do not see
such adjustments. They are perfectly accepted in Persian with no adjustments. Forbidding
initial consonant clusters, however, is a systematic generalization in Persian, one that has
to be met, thus adjustments are made in loanwords which deviate from this
generalization, as (43) shows.
Thus both comparing loanwords with Persian native words, and comparing how
loanwords are accepted and used (with or without adjustments) shed light on the
unsystematic status of restrictions on final CC’s following tense vowels.
In the next section, I will conclude that VCC# restrictions do not argue for underlying
quantity.
5.10. VCC co-occurrence restrictions: not a support for quantity
In 5.7-5.9, I discussed VCC# restrictions in native words and loanwords. To conclude, I
suggest that the VCC# restrictions in native words are an accidental gap, and not an
indication of bimoraicity of ɑ, i, u, for the following reasons: (i) the idea that there might
be an appeal to tenseness only in final position is supported by those very rare
occurrences of VCC.CV words (e.g., surt.me ‘sled’); (ii) the restrictions could be
accounted for by surface quantity derived from underlying tenseness in final (stressed)
position; (iii) the language appeals to tense/lax finally but the existence of commonly-
used loanwords of CVCC structure with rising sonority consonant clusters suggests that
the VtenseCC# restrictions are not real.
So far I have discussed the synchronic status of consonant clusters with respect to the
preceding vowels. One may ask whether the patterning of vowels and their following
consonants can be explained historically. It could be speculated that the restriction might
be a case of historical residue given that the Middle Persian vowel system is thought to
be quantitative (see 2.1). I discuss Middle Persian final CC’s and geminate consonants in
the next section. The purpose, as noted above, is to find out whether or not the synchronic
restrictions on CC in CVCC where V is a tense vowel have a historical explanation.
5.11. A note on the historical status of final CC’s and gemination
The historical investigation of final consonant clusters and the restrictions on VCC is a
topic on its own which requires careful attention. I briefly summarize what I have found
based on Middle Persian words as documented in Farahvashi’s dictionary (1967). I first
discuss historical final CC’s after long and short vowels in 5.11.1. I then discuss
geminate consonants in Middle Persian in 5.11.2.
5.11.1. Final CC’s in Middle Persian
Across all the words in Farahvashi’s dictionary (1967), final CC’s with rising sonority and
final CC’s with falling sonority are both possible after former long vowels (ō, ē, ā, ū, ī)
as well as after former short vowels (a, i, u); see (44) and (45). Some examples are given
in (46) and (47); in the examples, the Modern Persian equivalents of the words are given if they exist.
Before going through the CC clusters and examples, a few points should be noted: (i)
there are some Modern Persian words which do not have equivalents in Middle Persian
(either because they entered Persian from Arabic or they are newer words) and vice
versa, that is, some Middle Persian words were lost and therefore do not have Modern
Persian equivalents; (ii) some words in the Middle Persian dictionary are documented in
more than one way, so depending on what version is chosen the vowel can be long or
short (e.g., Middle Persian hiʃm and hēʃm ‘anger’ (today: xaʃm)).
The clusters are given in (44) and (45) followed by some examples in (46) and (47).
(44) Possible final CC’s in Middle Persian after long vowels
C2
C1
p t d k ɡ s ʃ v h m n r l
t √
d √ √
f √
v √
s √ √ √
ʃ √ √ √
z √ √ √ √
ʒ √
x √ √ √ √ √
Ɣ √
h √ √ √ √ √
n √ √ √
r √ √ √ √
(45) Possible final CC’s in Middle Persian after short vowels
C2
C1
p b t d k ɡ s ʃ z ʒ ʧ ʤ v Ɣ h m n r l j
p √ √
t √ √
d √ √ √ √
ɡ √
f √ √ √ √
s √ √ √ √ √
ʃ √ √ √ √ √
z √ √ √ √
ʒ √ √
x √ √ √ √ √ √ √
h √ √ √ √
m √ √
n √ √ √ √ √
r √ √ √ √ √ √ √ √ √ √ √ √
(46) Examples of final CC’s in Middle Persian after long vowels
a. Falling sonority in CC
kārt ‘knife’ (today: kɑrd)
rāst ‘right’ (today: rɑst)
mōzd ‘wage’ (today: mozd)
dōst ‘friend’ (today: dust)
ɡōʃt ‘meat’ (today: ɡuʃt)
dūzd ‘thief’ (today: dozd)
mitūxt ‘lie’
mūzɡ ‘grime’
b. Rising sonority in CC
frāʃm ‘appraisal’
hamāhl ‘similar’
dāsr ‘gift’
ēzm ‘firewood’ (today: hizom)
ʧīhr ‘face’ (today: ʧehr)
pēhn ‘wide’ (today: pahn)
zōhr ‘holy water’
(47) Examples of final CC’s in Middle Persian after short vowels
a. Falling sonority in CC
pazaʃk ‘doctor’ (today: pezeʃk)
buland ‘tall’ (today: boland)
marɡ ‘death’ (today: marɡ)
marz ‘border’ (today: marz)
ɡurɡ ‘wolf’ (today: ɡorɡ)
muʃt ‘fist’ (today: moʃt)
kiʃt ‘planting’ (today: keʃt)
virinʤ ‘rice’ (today: berenʤ)
b. Rising sonority in CC
ʧaʃm ‘eye’ (today: ʧaʃm/ʧeʃm)
abr ‘cloud’ (today: abr)
sahm ‘share’ (today: sahm)
razm ‘war’ (today: razm)
vafr ‘snow’ (today: barf –falling sonority)
taxl ‘bitter’ (today: talx –falling sonority)
mitr ‘kindness, sun’ (today: mehr)
kirm ‘worm’ (today: kerm)
spihr ‘sky’ (today: sepehr)
tuxm ‘seed’ (today: toxm)
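The rising/falling classification used in (46) and (47) can be made explicit with a small sketch. It assumes a coarse sonority ranking (stops and affricates < fricatives, including h < nasals < liquids < glides), which is a common textbook scale and not one stated in this thesis; the names SONORITY and final_cluster_profile are hypothetical. The function only compares the last two segments of a transcribed word.

```python
# Assumed coarse sonority scale (higher rank = more sonorous).
# This ranking is an illustrative assumption, not taken from the thesis.
SONORITY = {}
for rank, segments in enumerate(["pbtdkɡʧʤ", "fvszʃʒxh", "mn", "rl", "j"], start=1):
    for segment in segments:
        SONORITY[segment] = rank


def final_cluster_profile(word):
    """Classify the final CC of a transcribed word as rising, falling, or plateau."""
    c1, c2 = word[-2], word[-1]
    s1, s2 = SONORITY[c1], SONORITY[c2]
    if s2 > s1:
        return "rising"
    if s2 < s1:
        return "falling"
    return "plateau"


# A few forms from (46) and (47), plus the Modern Persian form barf.
EXAMPLES = ["kārt", "ēzm", "vafr", "barf", "virinʤ"]
PROFILES = {word: final_cluster_profile(word) for word in EXAMPLES}
```

Under this assumed scale, vafr and its modern counterpart barf come out as rising and falling respectively, matching the annotation in (47b).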
The pattern observed in the data is summarized in (48):
(48) Middle Persian
Structure      Falling sonority    Rising sonority
CVlongCC#      √                   √
CVshortCC#     √                   √
Short vowels are not of interest here as both falling and rising sonority are observed
today in CC’s after a, e, o, which are the present correspondents of former short vowels.
The case of CC’s after long vowels, however, needs to be examined as with these vowels,
clusters of falling and rising sonority both were possible in Middle Persian and now only
falling sonority clusters occur with the present correspondents of these vowels, ɑ, i, u.
Some sound changes occurred which caused the changes after former long vowels, today
ɑ, i, u. Consider the examples in (49) and compare them with those in (50). In (49), when
the CC shows rising sonority, either a vowel was inserted to break up the CC (as in (49a)
– o is inserted between z and m) or the long vowel preceding the CC changed to a vowel
which does not correspond to that long vowel (as in (49b) and (49c) – ī > e, ē > a), or
both of these processes occurred (as in (49d) – a is inserted between h and r, and also ō >
o), or the word is lost (as in (49e)). Thus some productive activity with respect to CC
after long vowels occurred from Middle Persian to Modern Persian, resulting in the
restrictions we see today. In (50), however, with a falling sonority cluster in Middle
Persian after long vowels these changes did not occur (leaving aside those words with
final CC with falling sonority after long vowels which were lost).
(49) Middle Persian Modern Persian
a. ēzm → hizom ‘firewood’
b. ʧīhr → ʧehr ‘face’
c. pēhn → pahn ‘wide’
d. ɡōhr → ɡohar ‘jewel’
e. dāsr ------- ‘gift’
(50) Middle Persian Modern Persian
a. pōst → pust ‘skin’
b. rāst → rɑst ‘right’
c. ʧāʃt → ʧɑʃt ‘a quick and light meal’
This may suggest an historical change in sonority of final CC’s following previous long
vowels (present tense vowels ɑ, i, u) because today we get the pattern shown in (51).
Compare (48) and (51) – note the difference in the rising sonority column.
(51) Modern Persian (native words)
Structure      Falling sonority    Rising sonority
CVtenseCC#     √                   X
CVlaxCC#       √                   √
The suggestion noted above regarding an historical change in final CC’s needs to be
confirmed by further investigation. There might have been another linguistic period,
responsible for such changes between Middle Persian and Modern Persian. The role of
Arabic and how sonority works in that language might be worth considering, as Persian
has absorbed a large amount of vocabulary from Arabic, and that might
have had an effect on the changes in final CC’s after vowels. It could also be a change
completely independent of Arabic.
In this section, it was seen that the restriction to falling sonority clusters after ɑ, i, u today
is not inherited from Middle Persian and is in fact a consequence of processes which
occurred between Middle Persian and Modern Persian.
5.11.2. Geminates in Middle Persian
In this section I present a brief discussion of geminates in Middle Persian. I leave aside
Arabic words with geminate consonants which entered Persian. Looking at Farahvashi’s
dictionary (1967), it is observed that Middle Persian had some cases of geminates. Some
examples are given below. As seen, some words are documented with both geminate and
single consonants in the dictionary.
There are cases which were geminated in Middle Persian (or have a geminated version
along with a non-geminated version) and still are geminated.
(52) Middle Persian Modern Persian
varrak/varak barre ‘lamb’
ajjār ajjɑr ‘helper’
There were cases which were documented as geminated which are not geminated today
as in (53) and vice versa as in (54).
(53) Middle Persian Modern Persian
hannām/annām andɑm ‘body’
purr/pur por ‘full’
karr/kar kar ‘deaf’
parr/par par ‘feather’
(54) Middle Persian Modern Persian
baʧak/vaʧak baʧʧe ‘child’
Based on words in the dictionary geminate consonants occurred only after short vowels
(in particular after a) in Middle Persian with very few exceptions, where long vowels are
followed by geminate consonants. This could be accounted for based on quantity, which
is thought to be the active feature of the Middle Persian vowel system (see 2.1), as a
geminate consonant following a bimoraic vowel results in a structure that is too heavy or
too long.
(55) Middle Persian Modern Persian
hūrram/hūram xorram ‘happy, verdant’
There are also cases of gemination in Middle Persian which were lost in Modern Persian
(e.g., Middle Persian darrāk ‘ruined’, Middle Persian annāk ‘mean’).
Cases of geminates across morpheme boundaries are also observed in Middle Persian.
As van Oostendorp states, this distinction is not one of length. “The difference between
[ɛ] and [e], or between [ɔ] and [o] clearly corresponds to a difference between closed and
open syllables, but there is no reason to assume that Southern French distinguishes
between ‘long’ and ‘short’ vowels” (van Oostendorp, 2006, p. 8).
Thus, there are quantity-based languages which have a minimal word requirement, but
not all quantity-based languages have this requirement. Moreover, there are languages
which have a minimality requirement which might seem to be correlated to quantity, but
the overall phonological activity of those languages does not support the existence of
quantity, and the requirement relates to quality. For example, consider Southern French
again. Some Southern French dialects display a process of ‘vowel harmony’ regarding
the feature [tense]: if the following vowel is high, the mid vowel is tense, if it is low, the
mid vowel is lax (van Oostendorp 2006):
(2) bête - bêtise [bɛt betiz] ‘stupid, stupidity’
dos - dossard [do dɔzaʁ] ‘back, number’
van Oostendorp (2006, p. 8) concludes: “Harmonic behavior is a clear indication that we
are dealing with a feature under standard assumptions in phonological theory; length
cannot spread, but a feature can”. These examples show that restrictions on syllable
structure and how vowels pattern in this respect, on their own, do not provide strong
evidence for quantity because it is possible that tense-based languages show restrictions.
The overall phonological activity of the language under study should be taken into
account.
With this background on minimal word requirements, I introduce the Persian data.
6.2. Data: description and discussion
In describing constraints on the distribution of vowels, Samareh (1977) says that /a, o/ do
not occur word finally with a few exceptions, as follows (p. 118, Samareh’s
transcription):
(3) /a/ in final position
va ‘and’
na ‘no’
/o/ in final position
monosyllabic:        do ‘two’, čo ‘as’, to ‘thou’
disyllabic complex:  bero ‘go’ (the formal form of borou)55
a few loanwords:     mâyo ‘swimming costume’, râdio ‘radio’
55 It is not clear why bero/borou ‘go!’ is listed here but not, for instance, bedo/bodou ‘run!’. Both are imperative forms consisting of the imperative marker be- which undergoes place harmony due to the vowel of the stem (see 2.3.1).
I would like to note a few points in this regard. In Persian in general, no monosyllabic
word ends in /a, o, e/, except for:
(4) Ca : na ‘no’
va ‘and’
(5) Co : to ‘you (sg)’
do ‘two’
ʧo ‘as’ (not used in speech/used only in poetry, etc.) 56
(6) Ce : be ‘to’
se ‘three’
ʧe ‘what’ (pronounced in speech as ʧi)
ke ‘that’
Looking at the words in (4)-(6), we observe that these are all function words (assuming
numbers are categorized as function words).
I now consider monosyllabic words with /ɑ/, /u/, and /i/ (the list of words is exhaustive).
(7) Cɑ: lɑ ‘fold, ply’
pɑ ‘foot’
nɑ ‘energy, mustiness’
ʤɑ ‘place’
bɑ ‘with’
mɑ ‘we’
tɑ ‘until’
jɑ ‘or’57
56 The loanword ʃo ‘show’ is not included.
(8) Cu: ru ‘face’
bu ‘smell’
su ‘direction’
su ‘eyesight’
ʤu ‘brook’
ɢu ‘swan’
mu ‘hair’
xu ‘disposition, habit’
tu ‘inside’
ku ‘where’
(ʔ)u ‘he/she’
57 There are also:
zɑ ‘childbirth’, ‘the present stem of zɑjidan ‘to give birth’’ − which cannot be used alone
hɑ ‘the plural marker’ −it is not used alone
rɑ ‘the specificity marker’ −used following nouns and pronounced as ro or o in speech.
(9) Ci: ʧi ‘the informal form of ʧe ‘what’’ (see (6))
ki ‘the informal form of ke ‘who’’
si ‘thirty’
bi ‘without’58
While there are some function words with these vowels, there are also lexical items.59
The existence of lexical words with /ɑ/ and /u/ (e.g., pɑ ‘foot’, mu ‘hair’) but not with /a/,
/e/, /o/ might be taken as an argument for /ɑ/, /u/, and /i/ being phonologically long or
bimoraic, meeting a minimal word requirement, and /a/, /o/, and /e/ being short or
monomoraic, and thus unable to meet the minimal word requirement. Those
monosyllables with /a/, /o/, and /e/ are function words escaping this requirement.
However, there are several points to be noted. Let us return to (4)-(6).
First consider the absence of /a/ in monosyllables. For words ending in /a/ as in (4), this
gap is not only about monosyllabicity. No word in Persian, whether monosyllabic or
polysyllabic, ends in /a/ except those in (4). The reason is that, as discussed in 3.6.4,
historically /a/ in final position changed to /e/ in the last millennium (Natel Khanlari 1987
among others).
Monosyllabic words ending in /e/ all are function words (see (6)).
Next consider the lack of monosyllabic words with /o/. For words ending in /o/, as
discussed in 3.10, there is disagreement in the Persian literature as to whether a word
(whether monosyllabic or polysyllabic) can end in o or if in fact such words have ow in
final position. I will return to this in 6.3.
58 There are also: fi ‘price’ (which is probably borrowed from English ‘fee’)
li ‘denim’ (This word seems to be borrowed too – from the brand name Lee).
59 There are also present stems of some infinitives, such as ɡu(j), ʤu(j), ro, and zi, the present stems of ɡoftan ‘to say’, ʤostan ‘to look for’, raftan ‘to go’, and zistan ‘to live’ respectively, but they are not used alone.
Now let us look at (7)-(9), words with ɑ, u, i. I exclude the function words (e.g., bɑ
‘with’, tɑ ‘until’, tu ‘inside’, bi ‘without’) and two loan words (see footnote 58). The
lexical words can be categorized into two groups: those that end in uj and ɑj in formal
and/or literary speech (and in some dictionaries both versions are documented), and those
which end in u and ɑ without having a version that ends in a glide. The first group
includes: ru(j) ‘face’, su(j) ‘direction’, mu(j) ‘hair’. The second group contains: lɑ ‘ply’, nɑ
‘energy’, su ‘eyesight’, ɢu ‘swan’.60
Persian monosyllabic vowel-final lexical words show that there are questions as to the
nature of the vowels in final position. Most of the words ending in ɑ and u have a version
ending in j. There are very few lexical items in any case. It is also unclear whether a word
in Persian can end in o or only ow occurs word finally.
It should be noted that words like pɑ(j) ‘foot’ and
ru(j) ‘face’ are also pronounced without the final j.
In the next section, I discuss why the patterning of vowels in monosyllabic words might
be a potential argument for underlying quantity, and how a tense-based account can
present an analysis for this patterning.
60 A few points should be noted here: (i) Persian vowels show compensatory lengthening when a following adjacent laryngeal consonant (i.e. ʔ and h) is deleted. Thus there are words like ʧɑh ‘well’ which can be pronounced as ʧɑ:, and one may say that Persian has a word like ʧɑ: with final ɑ which can be included in the list. I do not include these cases because, due to the effect of compensatory lengthening, they require different treatment. One question, for instance, is whether ɑ, if it is bimoraic, becomes trimoraic after deletion of /h/ and compensatory lengthening; (ii) I exclude cases of deletion of a final consonant. This is common in Persian speech; for instance, bɑz ‘open’ can be pronounced as vɑz or vɑ. It seems that these words are not usually used alone. Again, these are not included in the list since they need their own account. I also did not include ʃu(j), the short form of ʃohar ‘husband’, which is an old word used only in poetry, etc.; (iii) Note that some of the words given in (7) and (8) have glide-ending counterparts with a different meaning: lɑj ‘silt’, nɑj ‘the trachea’, kuj ‘neighborhood’, ruj ‘copper’ (cf. lɑ ‘fold’, nɑ ‘energy’, ku ‘where’, ru(j) ‘face’); (iv) I exclude Arabic zi and zu/zo ‘owning, having something’, which are not used alone and are seen in a few combinations in Persian.
6.3. Analysis of patterning of Persian vowels in vowel-final monosyllables
In this section, I first discuss how the patterning of vowel-final monosyllables presents
potential support for a quantity-based analysis of Persian vowels. I then argue that
tenseness provides an account of the patterning of vowels in monosyllables.
If in Persian vowel-final monosyllabic lexical items, only ɑ, u occur, then a requirement
for a minimum bimoraic size seems to be reasonable. This requirement explains why in
open monosyllables, only tense vowels are possible: ɑ, i, u are bimoraic and thus can
occur in open monosyllables while a, e, o, being monomoraic, cannot occur in that
position as they do not fulfill the requirement of a minimum bimoraic size, as shown in
(10).
(10) Underlying representation of open monosyllable words under a quantity-based
account
     σ                σ
    μ μ               μ
C  ɑ, i, u        C  a, e, o
I argued based on epenthesis that the representations in (10) are potentially possible
surface representations of tense and lax vowels. As proposed for epenthesis in suffixation
(see 4.2), the distribution of vowels in monosyllables can be accounted for if tense
vowels project two morae in #CV#, while lax vowels project a single mora.
(11) #CVtense# → VV
#CVlax# → V
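The mora projection in (11) and the bimoraic minimum it is meant to satisfy can be sketched as follows. This is only an illustration of the quantity-based hypothesis under discussion, not an endorsed analysis; the names TENSE, LAX, projected_morae and meets_bimoraic_minimum are hypothetical, and the input is assumed to be a #CV# form whose last character is the vowel.

```python
# Vowel classes as used in the chapter.
TENSE = set("ɑiu")
LAX = set("aeo")


def projected_morae(vowel):
    """Return the number of morae a vowel projects in #CV# under (11)."""
    if vowel in TENSE:
        return 2  # #CVtense# -> VV
    if vowel in LAX:
        return 1  # #CVlax# -> V
    raise ValueError(f"not a Persian vowel in this sketch: {vowel!r}")


def meets_bimoraic_minimum(cv_word):
    """Check a #CV# word against a hypothetical two-mora minimal word size."""
    return projected_morae(cv_word[-1]) >= 2
```

On this hypothesis pɑ ‘foot’ and mu ‘hair’ satisfy the minimum, while any lexical Co or Ce word would not, which is why the status of final o matters for the argument.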
However, the question is whether the distribution of the vowels in #CV# words
differentiates tense and lax vowels. Considering only lexical items, it is clear that tense
/ɑ/ and /u/ occur in monosyllabic open syllables; /i/ and /e/ do not (leaving aside /a/ which
does not occur in final position in general); what about /o/? This is a question that I will
address below. Let us leave /o/ aside for the moment.
6.3.1. Phonetic length
Consider the phonetic length of vowels in #CV# syllables. As discussed in chapter 3, I
conducted an experiment in order to measure the length of tense and lax vowels in
Persian (see 3.4). As we saw, lax vowels can be longer than tense vowels in a #CV#
structure. This suggests that there is a surface bimoraic requirement, and that vowels are
lengthened to satisfy this requirement with both tense and lax vowels being long in this
structure. Thus, actual phonetic length does not distinguish tense and lax vowels in this
syllable type.
6.3.2. /o/ in #CV#
The distribution of vowels in the #CV# structure does not suggest a categorization of
vowels in favor of quantity. As shown in (9), there is a gap for lexical words in CV form
where V=i. And if there are words ending in o in Persian then open stems do not provide
evidence for the vowel classes. In this section, I discuss /o/ in #CV#.
Recall from 3.10 that there is disagreement in the Persian literature as to whether a word
(whether monosyllabic or polysyllabic) can end in o or if such words have ow in final
position (e.g., Cowan and Yarmohammadi 1978, Samareh 1985, Keer 1999, Meshkatod
Dini 1999). According to some studies (e.g., Hayes 1986), the labiodental fricative [v]
and the labiodental approximant [w] are in complementary distribution: [w] occurs in
codas after short vowels; for example, pɑltow ‘overcoat’ and dowr ‘era’ and [v] occurs
elsewhere. Morphological alternations such as mi-ra[v]-am ‘I am going’ and bo-ro[w]
‘go!’ or no[w]-ruz ‘new year’ and no[v]-in ‘new kind’ indicate that [v] and [w] are
allophonically related.
One argument for the occurrence of ow in final position of words like pejro ‘follower’
comes from the presence of [v] in words such as pejrovɑn ‘followers’, which consists of
pejro followed by the plural suffix -ɑn. However, when a word like keʃo ‘drawer’
precedes the genitive marker –am the result is not *keʃovam, it is keʃom (informal) or
keʃo-(ʔ/j)am (formal); or after adding the suffix –i to a word like kɑdo ‘gift’ the result is
kɑdoji or kɑdo-(ʔ)i ‘suitable as a gift’–and not *kɑdovi, which is an impossible form– (cf. pejrovi/pejravi ‘obedience’). Thus, it is possible that /o/ is found in final position
considering these suffixation cases. I study the status of o in final position in Persian in
this section for monosyllabic o-ending words, and in the next chapter in general for both
monosyllabic and polysyllabic o-ending words.
Focusing on monosyllabic words, if one takes the position that words in Persian can end
in o and not necessarily ow, then the following words should be added to (5). All of these
are monosyllabic lexical words and their existence argues against the view that /o/-final
monosyllabic words cannot meet the minimal word requirement if they end in [o] rather
than [ow].
(12) ʤo(w) ‘barley’
do(w) ‘running’
lo(w) ‘disclosure’
mo(w) ‘vine, grape leaves’
no(w) ‘new’
Of course, if we do not take this position, then the words in (12) should be considered as
ending in ow and thus they are excluded from (5). The latter is the position that, for
instance, Samareh (1977) seems to take since in his study all the words ending in o are
transcribed as ou (e.g., mou ‘vine’ (p. 13), ǰelou ‘front’, polou ‘cooked rice’ (p. 111)).
Thus the treatment of o is particularly important to clarify whether or not lexical CVlax is
possible in Persian monosyllabic vowel-final words. In this regard, I conducted an
experiment with both real and made-up monosyllabic o-ending words to see whether I
can reach a conclusion on the final o or ow through some suffixation processes. I
conclude based on the experiment that Persian words can end in /o/ and therefore the
distribution of vowels in open monosyllables does not argue for moraicity being present
phonologically. If the language had an underlying requirement disallowing a, e, o in
#CV#, and allowing only ɑ, i, u in this structure, we might expect to find Ci’s, and, more
importantly, to not find Co’s. I used ‘more importantly’, because one may suggest that
the absence of Ci’s is an accidental gap, but note that the existence of Co’s is a problem
for an analysis which suggests Persian has a minimal word requirement motivated by
quantity.
I discuss the experiment on o/ow next.
6.4. Experiment on o/ow in final position
The experiment, looking at o/ow in final position of monosyllables, was included in the
experiment on epenthesis in suffixation, discussed in 4.7. The experiment on o-final
words (made-up and real, monosyllabic and polysyllabic) will also be discussed in
chapter 7. Here, I discuss only monosyllabic o-ending words. The experiment consists of
two parts: production (question and answer) and perception (acceptability rating).
In production, a question recorded in Persian was asked, as follows: ‘What did Ali say?’.
The speakers saw a word followed by a plus sign followed by a suffix in Persian script on
the screen. They were asked to make a suffixed form of the word with the suffix shown
on the screen and to fill in the blank in ‘I think he said…..’. For example: they saw on the
screen ‘ko + in’ (in Persian script). After they heard the question ‘What did Ali say?’
they responded (kojin or kovin). Below I discuss what the choice of kojin or kovin shows.
In perception, two pronunciations of words (suffixed forms) were given (e.g., kojin and
kovin – that is, the result of ko + in once with j at the suffix boundary and the other time
with v at the suffix boundary) in random order. The speakers were asked to rate the word
they heard based on the following scale: √: acceptable; ? so-so; X unacceptable. They
used pen and paper in rating.
If there is an underlying v or w in these words, it is expected to surface in their
suffixed forms. If the words do not have underlying v or w and if the occurrence of v in
suffixation is an indication of its underlying presence, then with made-up words the
default pattern, namely the insertion of a glottal stop or j or hiatus, is expected.
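The competing outputs that the task distinguishes can be enumerated with a short sketch. It lists the default candidates named above (hiatus, j, glottal stop) and adds a v-including candidate; restricting v to stems ending in o or u is an assumption based on the chapter's reference to a labial environment, and the names LABIAL_VOWELS and suffixed_candidates are hypothetical.

```python
# Stem-final vowels assumed to create a labial environment for v (an assumption).
LABIAL_VOWELS = set("ou")


def suffixed_candidates(stem, suffix):
    """List candidate surface forms for a vowel-final stem plus a vowel-initial suffix."""
    # Default pattern: unresolved hiatus, j insertion, or glottal stop insertion.
    candidates = [stem + suffix, stem + "j" + suffix, stem + "ʔ" + suffix]
    # Context-sensitive pattern: v, assumed here only after a labial vowel.
    if stem[-1] in LABIAL_VOWELS:
        candidates.append(stem + "v" + suffix)
    return candidates
```

For the made-up stem ko with the suffix -in, the list contains both kojin and kovin, the two responses contrasted in the production task.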
With real words there are cases such as those in (13) where the presence of v is dominant
in production. The number given next to each form shows how many speakers
produced that form (e.g., 2 next to a form shows that 2 out of 10 speakers
said the word with that pronunciation). For real words, I considered the suffixed forms
which already exist in the language (e.g., no-v-in). I also included a few suffixed forms
which do not exist in the language but make perfect sense considering the meaning of
their base (e.g., ʤo ‘barley’, out of which I made ʤoestɑn ‘the field where barley is
planted and grows’; this is parallel with mo ‘vine, grape leaf’, from which
moestɑn/movestɑn ‘vineyard’ derives).
(13) Cases where v-including versions are more frequent
a. no ‘new’ no + in ‘modern’
nojin 0
novin 10 (as expected)
b. ʤo ‘barley’ ʤo + in ‘made of barley’
ʤojin 2
ʤovin 7
1 miscellaneous
Compare these with the cases in (14) where the same words are followed by a different
suffix (the adjective-forming –i). As seen in (14), the non-v-including version is more
dominant. Thus the same stem can show different patterns with different suffixes
(compare (13) and (14)). The miscellaneous cases reported here involve cases which did
not show one of the two expected options so I did not include them, for example, for ʤo
+ in, one of the speakers said ʤovejn (instead of ʤojin or ʤovin), which shows a change
of i to ej. I could have considered ʤovejn as a case of occurrence of v because it includes
v but in order to be sure of the results I decided to exclude it since it shows a change of
vowel. Another example of a miscellaneous case had to do with the non-stress-bearing
suffix –i (the indefinite marker), which in some cases some speakers used instead of the
stress-bearing suffix –i, so there was a change of stress.
(14) Cases where j-including versions are more frequent
a. noji 7
novi 3
b. ʤoji 9
ʤovi 1
Similarly, we get:
(15) mo ‘grape leaf’
moji 7
movi 1
2 miscellaneous
More examples of the dominance of the default pattern are presented in (16) (with the
suffix -estɑn ‘indicating the place’ (e.g., kudak ‘child’; kudakestɑn ‘kindergarten’)).
Recall that the insertion of j or the glottal stop or tolerance of hiatus is the default pattern,
and when I write the j-including/non-v-including version, a glottal stop can occur or the
hiatus is not resolved. Thus the j-including/non-v-including version is a representative of
the default pattern.
(16) More cases where non-v-including versions are more frequent
a. ʤoestɑn 6
ʤovestɑn 3
1 miscellaneous
b. moestɑn 8
movestɑn 2
These are cases with which one expects to get some v-including suffixed forms. An
obvious case is novin which does not have a version with j in the language, as shown by
the result above (as expected, all speakers said novin and not nojin). In other cases, j-
including versions are observed more, as seen above. Thus if suffixation is a valid
diagnostic for arguing for the existence of an underlying w, then the results given above
suggest that there is no underlying w, as one does not see it consistently or even in the
majority of cases. This may suggest that cases like novin are lexicalized that way. In other cases, two
strategies can be used at a suffix boundary: (i) default pattern; (ii) insertion of v in labial
environment; that is why there are two versions for some suffixed forms of o-ending
words. The same pattern, as I will show in chapter 7, is observed for polysyllabic o-
ending words and u-ending words.
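The real-word production counts in (13)-(16) can be pooled by suffix to make the contrast between suffixes easier to see. The sketch below is only a tally of the figures reported above, with miscellaneous responses left out; the names REAL_WORD_COUNTS and totals_by_suffix are hypothetical.

```python
# (stem, suffix) -> (default-pattern responses, v-including responses),
# copied from (13)-(16); miscellaneous responses are not counted.
REAL_WORD_COUNTS = {
    ("no", "in"): (0, 10),
    ("ʤo", "in"): (2, 7),
    ("no", "i"): (7, 3),
    ("ʤo", "i"): (9, 1),
    ("mo", "i"): (7, 1),
    ("ʤo", "estɑn"): (6, 3),
    ("mo", "estɑn"): (8, 2),
}


def totals_by_suffix(counts):
    """Sum default and v-including responses for each suffix."""
    totals = {}
    for (_stem, suffix), (default, with_v) in counts.items():
        pooled_default, pooled_v = totals.get(suffix, (0, 0))
        totals[suffix] = (pooled_default + default, pooled_v + with_v)
    return totals


SUFFIX_TOTALS = totals_by_suffix(REAL_WORD_COUNTS)
```

Pooled this way, v dominates only with -in, while the default pattern dominates with -i and -estɑn, in line with the observation that the same stem behaves differently with different suffixes.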
Now let us look at made-up words. In production, both j-including and v-including
versions are possible, as shown below. The version with j seems to be more frequent. In
perception of made-up words, both j-including and v-including versions are acceptable,
with the version with v being rated acceptable more often than the version with j. These show that one
can easily get v in suffixation with made-up words where there is neither an underlying
ov/ow nor any historical justification.
Two points should be noted. First, one may ask why there are differences between
perception and production. The differences could be due to differences between the tasks.
It is not surprising if a speaker produces a word in one way and then, hearing another
version of the word, considers it as correct since it could be a version said by other
speakers and therefore sounds familiar.
Second, one may suggest that the studies which consider surface Co words to be Cow
analyze those words that way because final o would be subminimal. But note
that this is not what the studies say. Those studies consider ow/ov to be underlying,
surfacing through suffixation and morphological alternations. In addition to monosyllabic
words, their examples include polysyllabic words (e.g., pɑltow ‘overcoat’) for which a
minimal word requirement or subminimality is not an issue.
(17) shows the results of the production task for o-ending monosyllabic made-up words.
(17) made-up words (production)
words        # of occurrences
poestɑn      6
povestɑn     0
poɑn         5
povɑn        2
kojin        5
kovin        5
pojin        7
povin        3
koji         6
kovi         2
sojin        6
sovin        4
soji         6
sovi         0
koɑn         5
kovɑn        4
sojɑn        5
sovɑn        4
poji         9
povi         0
foestɑn      10
fovestɑn     0
fojin        7
fovin        1
foɑn         7
fovɑn        2
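The production counts in (17) can be summed to check the overall balance between the two patterns in made-up words. The sketch below only adds up the reported figures; the names PRODUCTION and summarize are hypothetical, and responses not listed in (17) are ignored.

```python
# Each row: (default form, count, v-including form, count), copied from (17).
PRODUCTION = [
    ("poestɑn", 6, "povestɑn", 0),
    ("poɑn", 5, "povɑn", 2),
    ("kojin", 5, "kovin", 5),
    ("pojin", 7, "povin", 3),
    ("koji", 6, "kovi", 2),
    ("sojin", 6, "sovin", 4),
    ("soji", 6, "sovi", 0),
    ("koɑn", 5, "kovɑn", 4),
    ("sojɑn", 5, "sovɑn", 4),
    ("poji", 9, "povi", 0),
    ("foestɑn", 10, "fovestɑn", 0),
    ("fojin", 7, "fovin", 1),
    ("foɑn", 7, "fovɑn", 2),
]


def summarize(rows):
    """Return total default responses, total v responses, and the v share."""
    default_total = sum(row[1] for row in rows)
    v_total = sum(row[3] for row in rows)
    return default_total, v_total, v_total / (default_total + v_total)


DEFAULT_TOTAL, V_TOTAL, V_SHARE = summarize(PRODUCTION)
# v-including forms that no speaker produced.
NEVER_PRODUCED = [row[2] for row in PRODUCTION if row[3] == 0]
```

The default pattern accounts for most responses, yet v-including forms still appear for most items even though these made-up stems have no lexical or historical source for v.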
(18) shows the results of the perception task for o-ending monosyllabic made-up words.
(18) made-up words (perception) (recall: the scale is: √: acceptable; ? so-so; X
unacceptable)
words √ ? X
ʒoji
ʒovi
5
7
3
1
sojɑn
sovɑn
5
6
1
1
2
1
foestɑn
fovestɑn
4
7
1
1
3
foin
fovin
5
8
1 2
sojin
sovin
5
6
1
2
2
koji
kovi
6
6
2
2
soji
sovi
5
6
1
1
2
1
ʒojɑn
ʒovɑn
4
8
2 2
kojɑn
kovɑn
5
7
1
3
Recall from chapter 4 that in the perception rating task, there is some overlap, with a
speaker finding both versions of a word to be acceptable.
As noted above and discussed in detail in chapter 7, the existence of both a default
pattern and a context-sensitive pattern (in the labial environment) is responsible for the
variation observed in both real words and made-up words, leaving aside cases which are
historical residue or are lexicalized in one way or the other. The v in suffixation is not
necessarily an indication of underlying presence of ov/ow, as made-up words also show
it, and some real stems show it with some suffixes and not with others. See chapter 7 for
more discussion.
6.5. Minimal words in Persian
Now I return to the discussion of a minimal word requirement in Persian. The language
has o-final words, assuming that suffixation is an appropriate diagnostic. This vowel
patterns with a and e, the vowels that would be monomoraic under a weight analysis.
Thus Persian does not have an underlying synchronic bimoraic requirement for
monosyllables.
The following tables summarize the distribution of vowels in Persian monosyllabic
words. (19) shows #CVtense# and (20) shows #CVlax#. I leave aside #Ca# because final-a
words do not exist in Modern Persian regardless of the number of syllables in a word (see
6.2).
(19)
#CVtense# Lexical items
#Cɑ# √
#Cu# √
#Ci# X
(20)
#CVlax# Lexical items
#Co# √
#Ce# X
It should also be considered whether the distribution of vowels in monosyllables might be
the remnant of historical constraints. According to Pisowicz (1985), short vowels did not
occur in final position in Middle Persian, while long vowels did. This may indicate
that even if there is a gap in final position for some former short vowels, the gap
could be due to the historical status of these vowels and is not necessarily
synchronic. That is, even if one considers o-final words to be impossible in the language,
which based on evidence from suffixation is not the case, this could be a historical
remnant. The historical investigation of vowels in final position remains for the future.61
To summarize, I have argued that there is not a phonological minimal word requirement
in Persian. Although the language roughly provides a list of lexical words ending in long
(tense) vowels versus a list of function words ending in either long or short (lax) vowels,
this distinction does not hold up under closer examination, and the language does not present a
clear-cut distinction between the two groups due to the absence of Ci words and the
presence of Co words.
61 There are words which end in a lax vowel today but used to end in long vowels (which correspond to present tense vowels) or in consonants. Many of the words under study in this discussion are not found in Farahvashi’s Middle Persian dictionary (1967), from which the former forms of the following examples are taken:
Middle Persian Modern Persian
tō to ‘you (sg)’
dō do ‘two’
kē ke ‘that’
nōk no(w) ‘new’
These may suggest that the words ending in the lax vowels a, e, o today used to end in a consonant or in vowels other than the short vowels a, i, u in Middle Persian, and that is why, as Pisowicz says, short vowels did not occur in final position in that era. However, in Farahvashi’s dictionary, some words are vowel-final where the vowel is a short vowel (for example, aʒ/aʒi ‘dragon’, ahu ‘universe’). Even the same word can have two forms in the dictionary, one with a short vowel in final position and the other with a long vowel in that position (e.g., the present word se ‘three’ has both si and sē versions in the dictionary). This shows that the status of lax vowels in final position in Middle Persian requires careful investigation.
6.6. Summary and conclusion
In this chapter, I discussed a minimal word requirement as a potential support for a
quantity-based account of Persian vowels. I showed that there is not a phonological
minimal word requirement, and thus there is no argument for underlying quantity or
against quality because of the way vowels pattern in #CV#.
I also addressed a controversial topic in the Persian literature, the occurrence of o or ow
in final position in monosyllables. Based on an experiment I conducted, I argued that
Persian has o-final words.
In this and the two previous chapters, I studied three phenomena which might seem to
present evidence for underlying quantity in the Persian vowel system: epenthesis in
suffixation (chapter 4), VC co-occurrence restrictions (chapter 5), and minimal word
requirements. I argued that none of these provides evidence for underlying quantity. Let
us, for a moment, return to the other two potential supports for a quantity-based analysis
of Persian vowels. Epenthesis in suffixation and VtenseCC#, as discussed in chapters 4 and
5, are not supported by the distribution of vowels when we closely examine them. The
former is violated by numerous cases of stems with the structure in which epenthesis is
expected and in which it is impossible. Moreover, epenthesis does not actually argue
even for surface quantity as it is an unproductive process. The latter (VtenseCC
restrictions) is violated by VtenseCC’s in non-final position, by the existence of words
such as surt.me ‘sled’, and by loanwords. If the three observations (epenthesis in
suffixation, VCC co-occurrence restrictions, and minimal word) argue for quantity at all,
it is for surface rather than underlying quantity, and for quantity only on stressed vowels.
Compare these with the productivity of harmony which argues for the underlying
presence of quality (tenseness).
The following tables summarize the distribution of vowels in Persian in different
structures, as discussed in chapter 5 and the present chapter. (21) and (22) demonstrate
CV, CVC, CVCC as monosyllables. (23) and (24) demonstrate these structures in non-
final position of polysyllables. ‘free’ means the structure occurs in the language without
restrictions; ‘restricted’ means there are restrictions.
(21)
Structure        Occurrence
#CVtense#        /ɑ, u/
#CVtenseC#       free
#CVtenseCC#      restricted (considering only native words)
                 free (considering both native words and loanwords)
(22)
Structure        Occurrence
#CVlax#          /o/
#CVlaxC#         free
#CVlaxCC#        free
(21) and (22) show that restrictions are found only in native words of the form
#CVtenseCC#: as discussed in chapter 5, only falling sonority CC’s occur after Vtense.
Now consider the following tables (‘.’ shows syllable boundary), which show CV, CVC,
and CVCC in non-final position.
(23)
Structure        Occurrence
CVtense.         free
CVtenseC.        free
CVtenseC.C       free
(24)
Structure        Occurrence
CVlax.           free
CVlaxC.          free
CVlaxC.C         free
No restrictions are observed in any structure in non-final position, as (23) and (24) show.
Let us now return to the three views found in the literature regarding the active feature of
the Persian vowel system: (i) quantitative; (ii) qualitative; (iii) synthetic (see 2.2).
If quantity is considered to be underlyingly present, harmony patterns cannot be
accounted for. Even epenthesis in suffixation, VCC restrictions, and minimal word,
which at first seem to argue for quantity, are difficult to explain: suffixation shows
non-epenthesis as the dominant pattern, the VtenseCC restriction applies only in final position
of native words, and the apparent minimal word requirement does not hold when
investigated more carefully. With underlying tenseness, and surface quantity for stressed
vowels due to tenseness, all of these can be accounted for.
Recall from 2.2 that height has also been considered to be the appropriate feature to
distinguish Persian vowels. If height (instead of tenseness) is considered to be underlying,
although harmony can be accounted for, the following cannot be explained: (i) the
underlying categorization of vowels into ɑ, i, u vs. a, e, o (which is suggested by
harmony: ɑ, i, u are triggers and a, e, o are targets); (ii) the surface categorization of ɑ, i, u vs. a, e, o, inasmuch as it exists in VCC restrictions (in the native lexicon); (iii) the fact
that epenthesis, to the extent that it exists in the language, is observed with structures other than
CVlaxC; (iv) the result of the perception task of the experiment on epenthesis in made-up
words, which shows that if speakers accept the version with epenthesis, this is more
likely with words which have structures other than CVlaxC.
Thus neither height nor length can be the underlying dimension of contrast in the system
based on the phonological activity of the language. Tenseness, however, can account for
the representation of Persian vowels and their activity at both underlying and surface
levels. Based on harmony, with i, u, ɑ as triggers and e, o, a as targets, I proposed that
[tense] is the underlying feature of the system. Given frequent length-like characteristics
of [tense] in tense-based languages (see 5.3), it is not surprising to observe some
length-like properties in Persian, which were discussed in this and the previous two chapters and
are briefly summarized below:
(i) Epenthesis shows some nonproductive moraic effect (CVlaxC vs. CVlaxCC, CVtenseC,
CVtenseCC).
(ii) There are some distributional differences in native vocabulary, but restricted to
stressed vowels.
(iii) Results of my phonetic experiment on Persian are in accord with what is seen in
tense-based languages, with some minimal word effect as observed in surface minimality
requirement in #CV#.
Thus, the three processes are in fact compatible with a tense-based account.
As predicted by the framework of Modified Contrastive Specification, I have argued that
the Persian vowel system does not need both quality and quantity to be phonologically
present. The primary distinction in the Persian vowel system is one of quality (tenseness),
and there is, perhaps, a secondary, predictable quantity difference for tense vowels.
In the next chapter, I will examine two other phenomena in Persian, both of which occur
at a suffix boundary but involve consonants as well.
Chapter 7
More on suffixation: the case of -v
In chapters 4 and 6 I examined some phonological aspects of suffixation in Persian. In
this chapter and the next chapter, I look more into suffixation involving the occurrence of
consonants at a suffix boundary. The first case, which I examine in this chapter, is the
occurrence of a v after some o-final and some u-final words. In chapter 6 I discussed o-
final monosyllables. In this chapter I examine monosyllabic and polysyllabic o-final and
u-final words. The second case, which will be discussed in chapter 8, is the occurrence of
ɡ after e-final bases when the noun-forming suffix –i or the plural marker -ɑn is added to
the base. Thus both v and ɡ occur between a vowel-final base and a vowel-initial suffix.
In (1) and (2), I present the vowels of Persian followed by some vowel-initial suffixes to
show where v and ɡ occur. I assume that the bolded vowels in (1) and (2) are in final
position of the base. As for the vowel a, recall that no word (except for na ‘no’ and va
‘and’) ends in –a in Modern Persian. Also note that in the examples given for words
ending in vowels other than e, [j] in [j]i means the insertion of [j] before the suffix, and
“ii” which is given in “[j]i or ii” for the vowel /i/ means that the hiatus is tolerated or a
glottal stop can be present. Thus the presence of j or nothing shows the default category
as opposed to v or ɡ, which are specific to some environment.
(1) The noun-forming -i
i    [j]i or ii          u    [j]i / ui or [v]i
e    [ɡ]i (*ei)          o    [j]i / oi or [v]i
ɑ    [j]i or ɑi
(2) The plural marker -ɑn
i    [j]ɑn               u    [j]ɑn or [v]ɑn
e    [ɡ]ɑn               o    [v]ɑn
ɑ    [j]ɑn
The occurrence of ɡ is limited to these two suffixes, as we will see in chapter 8. The
occurrence of v, however, is not, that is, there are other vowel-initial suffixes with which
v occurs, as we will see in 7.1.
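The options in (1) and (2) can be read as a lookup from suffix and base-final vowel to the permitted linking consonants. The following is only an illustrative sketch under that reading: the names `LINKERS` and `candidate_forms` are hypothetical, the empty string stands for tolerated hiatus, and the vowel changes discussed later (final o becoming a, final u becoming o) are deliberately not modelled.

```python
# Linking options between a vowel-final base and a vowel-initial suffix,
# transcribed from tables (1) and (2). "" stands for tolerated hiatus.
LINKERS = {
    "i": {"i": ["j", ""], "u": ["j", "", "v"], "e": ["ɡ"],
          "o": ["j", "", "v"], "ɑ": ["j", ""]},
    "ɑn": {"i": ["j"], "u": ["j", "v"], "e": ["ɡ"], "o": ["v"], "ɑ": ["j"]},
}


def candidate_forms(base, suffix):
    """List the suffixed forms that the tables allow for a vowel-final base."""
    options = LINKERS[suffix][base[-1]]
    # Bracket the linking consonant, following the notation used in the text.
    return [base + (f"[{c}]" if c else "") + suffix for c in options]


print(candidate_forms("ʤɑdu", "i"))
print(candidate_forms("pejro", "ɑn"))
```

The lookup gives the options per vowel class, so it overgenerates for individual words: it lists a [v] form for every u-final base with -i, although particular words may not show it. Which words actually take v is the question the rest of this chapter addresses.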
This chapter and the next chapter, therefore, like chapter 4 and part of chapter 6, touch
upon aspects of Persian morpho-phonology, particularly processes which occur at a suffix
boundary.
In this chapter, I focus on the occurrence of v. I first present an overview of v in section
7.1. Section 7.2 presents possible analyses of the occurrence of v. I then discuss an
experiment that I conducted to test if the occurrence of v is a synchronically productive
process in section 7.3. I conclude the chapter in section 7.4.
7.1. The case of v in suffixation: overview
In this section, I examine the occurrence of –v after some o-final and some u-final words
in some suffixation processes. The source of –v has been addressed in the literature on
Persian. In chapter 6, I examined the occurrence of v in suffixation after o-final
monosyllables in a discussion of whether there is a minimal word requirement, and
concluded that there are o-final monosyllabic words in Persian. A more general and
thorough discussion of v is presented here.
The v occurs with some o-final words when a vowel-initial suffix is present but not with
all of them. It also occurs with some u-final words. This raises a set of questions. Why do
some o-final words show the v while others do not? Why do some words show v with
one suffix and not with another suffix? Is the v an epenthetic consonant? If it is
epenthetic, what is the environment? If it is not epenthetic, how can we account for its
distribution?
The v occurs in the following cases among others.
(3) -i (adjective forming), -i (noun forming)
o/u  [j]i or [v]i
(4) plural marker -ɑn
o    [v]ɑn               u    [j]ɑn or o[v]ɑn
(5) and (6) present examples of the occurrence of v with the suffixes given in (3), and (7)
shows examples of the suffix in (4). In (5), the bases are not roots, except for no ‘new’. I
therefore show the morphology by a hyphen. The part before the hyphen might consist of
more than one morpheme (e.g., miɑne consists of miɑn ‘middle’ + -e), which I abstract
away. In (6), the words are synchronically stems (historically there might be some
suffixes involved but they are not productive anymore). The final o may become a when
the suffix is added, as the suffixed forms of the stems which contain –ro in (5) show
(e.g., piʃ-ro ‘pioneer’ → piʃra[v]i).
(5) Examples with the noun-forming -i
pej-ro ‘follower’ → pejro[v]i pejra[v]i ‘following’ (in speech
pijɑde-ro ‘sidewalk’ → pijɑdera[v]i ‘going for a walk, hiking’
harf-ʃeno ‘obedient’ → harfʃena[v]i ‘obedience’ (o[v]i/oi is fine too)
no ‘new’ → no[j]i ‘newness’ (no[v]i is also possible)
(6) Examples with the adjective-forming -i
keʃo ‘drawer’ → keʃo[j]i or keʃo-i ‘sliding’
ʤelo ‘front’ → ʤelo[j]i or ʤelo-i ‘frontal’
xosro62 ‘the name of an ancient Persian king and now a boy’s name’ → xosra[v]i ‘pertaining to xosro, used also as a last name’
parto ‘beam, a ray of light’ → parto[j]i or parto-i or parto[v]i ‘pertaining to parto, used also as a last name’
minu ‘heaven, paradise’ → mina[v]i or minu[j]i ‘heavenly, divine; used as last name’
ʤɑdu ‘magic’ → ʤɑdu[j]i or ʤɑdui ‘magical’
ɑlbɑlu ‘sour cherry’ → ɑlbɑlu[j]i or ɑlbɑlui ‘cherry red’
kɑʔuʧu ‘rubber’ → kɑʔuʧu[j]i or kɑʔuʧui ‘rubbery’
limu ‘lemon’ → limu[j]i or limui ‘pertaining to lemon, light yellow’
62 Note that the word xosro was xosraw in Middle Persian. Here is what is found for this word in a Middle Persian dictionary (MacKenzie 1971): Middle Persian husraw ‘famous, of good repute’, Early Modern Persian xusraw. In the Middle Persian dictionary, there is also husrawīh ‘fame, good repute’. Note that this –i is the Middle Persian noun-forming suffix. I put xosravi above as an example of suffixation with the adjective-forming –i, because today xosro is used as a name for boys and not as a common word with a meaning, that is, not as an adjective which takes –i to become a noun. Perhaps only in poetry or literary texts can one find xosro as an adjective and xosravi as a noun. Whichever way one interprets xosro/xosravi does not matter for this discussion, because the difference between the two –i suffixes is not a concern here.
Now let us examine some examples of the v with the plural marker -ɑn.
(7) Examples with the plural marker -ɑn
pej-ro ‘follower’ + ɑn → pejro[v]ɑn
piʃ-ro ‘pioneer’ + ɑn → piʃro[v]ɑn
banu ‘lady’ + ɑn → bano[v]ɑn
abru ‘eyebrow’ + ɑn → abro[v]ɑn
bɑzu ‘arm’ + ɑn → bɑzo[v]ɑn
ɑhu ‘deer’ + ɑn → ɑho[v]ɑn
tars-u ‘coward’ + ɑn → tarsu[j]ɑn
dɑneʃ-ʤu ‘university student’ + ɑn → dɑneʃʤu[j]ɑn
(note: dɑneʃ itself consists of dɑn ‘present stem of dɑnestan ‘to know’’ + -eʃ)
The examples in (5)-(7) show that there are cases of o- and u-final words with v in
suffixation and others without it. There are also words such as ɑhu which show two
different patterns in hiatus:
(8) ɑhu ‘deer’ + -i → ɑhu[j]i or ɑhui ‘pertaining to deer’
*ɑho[v]i *ɑhu[v]i
ɑhu ‘deer’ + ɑn → ɑho[v]ɑn ‘deer (pl)’
The v may occur with other suffixes as well. For example, consider no ‘new’ and novin
‘modern’. Here the suffix –in is added to no, resulting in novin. I presented those suffixes
in (3) and (4) with examples in (5)-(7) in particular because there are more examples of
these and because of the variation between j and v with these suffixes. Also note that the
final u may become o before ɑn (e.g., abru ‘eyebrow’ becomes abrovɑn ‘eyebrow (pl)’).
Before discussing the possible hypotheses to account for the v, I need to note that in
addition to the o-final words that were discussed above and which show the o/av
alternation, there are words which show v in suffixation with other vowels in final
position. (9) presents some examples.
(9) musɑ ‘Moses, also used as a name for boys’
→ musa[v]i ‘Jewish, mostly used as a last name’
(the word jahudi is the common word for Jewish)
isɑ ‘Jesus, also used as a name for boys’
→ isa[v]i ‘Christian, also used as a last name’
(the word masihi is the common word for Christian)
kimiɑ ‘alchemy, also a name for girls’
→ kimiɑ[j]i / kimiɑ-i or kimiɑ[v]i
‘pertaining to kimiɑ, used as a last name’
dehli ‘Delhi’ → dehla[v]i ‘used as a last name, related to/from Delhi’
mɑni ‘Mani/Manes, the prophet’ → mɑna[v]i ‘Manichean’
nabi ‘prophet’ → naba[v]i ‘pertaining to prophet’
farɑnse ‘France’ → farɑnsa[v]i ‘French’
rije ‘lung’ → rija[v]i ‘pulmonary’
kolje ‘kidney’ → kolja[v]i ‘renal’
xɑʤe ‘an old title for men’ → xɑʤa[v]i ‘used as a last name’
halɢe ‘ring’ → halɢa[v]i ‘ring-shaped’
Cases such as musɑ ‘Moses’ and musavi ‘Jewish’ or mɑni ‘Mani/Manes, the prophet’
and mɑna[v]i ‘Manichean’ may suggest the presence of a suffix –avi. That is, final -ɑ and
–i are deleted and -avi is added. According to Kalbasi (1991), however, the suffix is –i. I will
return to this shortly.
Note also that two homophonous words can show different patterns in suffixation:
(10) Kore ‘Korea’ → Kore[j]i /Kore-i ‘Korean’
Kore ‘sphere’ → Kora[v]i ‘spherical’
The question that these words bring up is: given that none of these words ends in o, why
does v appear here? Kalbasi (1992) identifies a suffix, the geminated –i (she transcribes
it as /-iy/), which comes from Arabic; Persian speakers pronounce it as non-geminated, and
so the result is –i. This suffix, according to her, while used with Arabic-origin words,
may sometimes be used with non-Arabic words. Kalbasi adds that this suffix causes some
changes in the final sound of the base, as follows (the transcriptions are mine):
- If a word ends in ɑ(ʔ)63, the suffix -i either follows without any change (e.g., samɑ(ʔ) ‘heaven, sky’ → samɑ(ʔ)i ‘celestial’) or the final glottal stop (or the epenthetic glottal)
becomes /v/ (e.g., samɑ(ʔ) → samɑvi).
- /t/ in final position of some Arabic words which have entered into Persian either
remains as /t/, or /t/ deletes and /a/ becomes [e]. This /t/ or [e] is deleted when –i is added.
- Also when words that end in -ɑ and –i are followed by this suffix their final vowels
become a followed by v (e.g., isɑ ‘Jesus’ → isavi ‘Christian’ and ali ‘a religious
leader of Muslims, also a name for boys’ → alavi ‘related to/followers of Ali’).
Kalbasi further adds that the rule for change of final sounds of the word is also observed
with some non-Arabic words and with some words which end in /e/ or end in some
Looking at the Arabic dictionary, I found words like kolje ‘kidney’ and rije ‘lung’; the
words koljavi ‘renal’ and rijavi ‘pulmonary’ are also in the Arabic dictionary. The
same holds for kore ‘sphere’ and koravi ‘spherical’. Recall that there are both kore-i ‘Korean’ from kore ‘Korea’ and koravi ‘spherical’ from kore ‘sphere’ (see (10)).
So kore ‘sphere’ and koravi ‘spherical’ are borrowed from Arabic, and it seems that kore
‘Korea’ and kore-i ‘Korean’ are Persian.
It is possible to have both –avi and –i in the cases which come from Arabic. That is, there
are both options for the same word. For instance, for the Arabic word halɢe ‘ring’, I
consulted two Persian dictionaries, one English-Persian, and the other Persian-English. In
the former, for the word ‘ring pull’ the translation is:
(11) (bɑ)     dar-e         halɢe-i
     (with)   door-Ezafe    ring-the suffix
     ‘(with) ring pull’
63 Kalbasi puts the glottal stop within parentheses because there are words which entered the language from Arabic which have the glottal stop at the end but Persian speakers do not pronounce it. In orthography it may or may not be present.
In the Persian-English dictionary, the word halɢavi (and not halɢe-i) is an entry with the
meaning ‘ring-shaped’. Both halɢe-i and halɢavi exist.
One may suggest that in some cases the word (which ends in vowels other than o and u)
before suffixation and its avi-suffixed forms both entered the language from Arabic. Then
the Persian –i (without –avi change) is used to make a Persian i-suffixed form out of it.
An example of both versions is donjɑ ‘world’ which is Arabic and is in the Arabic
dictionary along with donjavi ‘worldly’; in Persian both of these words are used and there
is also donjɑji ‘worldly’.
As I showed there are some words, mostly of Arabic origin, which end in other vowels
and take –v in suffixation. I leave these cases aside in this study and focus only on o-final
and u-final words.
Considering o-final and u-final words, one can ask why v occurs with some o-final and
u-final words but j occurs with others, and why the same word can show variation with
respect to the –v. Several possible answers to these questions come to mind; they are
discussed next.
7.2. Possible analyses
In this section, I examine different hypotheses which may account for the occurrence of
v after o-final and u-final words.
(i) There are two groups of o- and u-final words in the language, one group has
ow or ov/av underlyingly and the other does not. Only with the former does the v occur.
(ii) The insertion of j or glottal stop or tolerance of hiatus is the default in the
language and only some words which are lexicalized or frozen or historical residue show
the v.
(iii) Since –v occurs in o and u environments, it is an epenthetic consonant due to
the phonetic environment of labials. This accounts for why it is v (and not other
consonants) that occurs. This epenthesis is optional, with j epenthesis also being possible.
I discuss these hypotheses one by one in 7.2.1-7.2.3. In 7.2.4, I summarize the discussion
of possible analyses.
7.2.1. Different underlying representations
One possibility to account for the occurrence of v in some cases but not in others is to
suggest that the o-final words in Persian are underlyingly divided into two groups: those
which have –av or –ov underlyingly; and those with underlying –o.
As seen above, there are cases of o-final words which result in –ovi or –avi when the
suffix –i is added (why we can get –avi with some o-final words could be due to some
opaque derivation). In particular we saw ro ‘present stem of raftan ‘to go’’ which shows
this process (see (5) for several compound words with ro). In conjugated forms of the
infinitive raftan ‘to go’, the –av is observed too. Consider the infinitive raftan ‘to go’ as
in (12). The verbal prefix mi- is the indicative marker, and the verbal suffixes are as
The above forms are used in formal speech and writing. In speech, we find mi-ram, mi-ri, mi-re, mirim, mirin (d usually becomes n here), miran (final d is also deleted here). Other cases of suffixation with ro from raftan ‘to go’ are given in (13).
(13) raftan ‘to go’
ro/rav ‘present stem’ (be + ro: bero ‘go!’ due to harmony: boro –see 2.3.1)
raft ‘past stem’
ravand ‘process’ ‘present stem + and’
ravande ‘walker’ ‘present stem + ande’
ravɑl also revɑl ‘procedure’ ‘present stem + ɑl’
raveʃ ‘method’ ‘present stem + eʃ’
ravijje ‘method, procedure’ ‘present stem + ijje’
ravɑne ‘bound for’ ‘present stem + ɑne’
The same pattern is found with ʃenidan ‘to hear’, from which there is ʃenavi in the word
harfʃenavi ‘obedience’ (this word was given in (5)):
The above are the formal forms. In speech, the e after ʃ is deleted, giving miʃnavam, miʃnavi, miʃnave, miʃnavim, miʃnavin, miʃnavan. Again with other cases of suffixation
with ʃeno ‘present stem of the infinitive ʃenidan ‘to hear’’, the same alternation is
observed. See (15).
(15) ʃenidan ‘to hear’
ʃeno ‘present stem’ (be + ʃeno → beʃno ‘listen!’)
ʃenid ‘past stem’
ʃenavɑ ‘with a good hearing’ ‘present stem + -ɑ’
ʃenavɑji ‘hearing’ ‘present stem + -ɑji or ʃenavɑ + -i’
ʃenavande ‘listener’ ‘present stem + -ande’
Similar to raftan ‘to go’ and ʃenidan ‘to hear’ is davidan ‘to run’, as (16) shows.
(16) mi-dav-am ‘I run’               mi-dav-im ‘we run’
     mi-dav-i ‘you (sg) run’         mi-dav-id ‘you (pl) run’
     mi-dav-ad ‘she/he/it runs’      mi-dav-and ‘they run’
In speech, midoam, midoi, midoe, etc. (or sometimes midovam, midovi, midove, etc.) are common. The imperative is be-do ‘run!’ which is pronounced as bodo (due to
harmony –see 2.3.1). Other words built from the present stem do are given in (17):
(17) davidan ‘to run’ (in speech doidan or dovidan)
do ‘present stem’ (be + do → bedo ~ bodo ‘run!’)
david ‘past stem’ (in speech doid (sometimes dovid))
davɑn ‘running (usually used as davɑn davɑn ‘adv.’)’
(do + the adverb-forming -ɑn)
davande ‘runner’ (do + ande)
So far, I have discussed cases which seem to have av/ov underlyingly, as the suffixed forms of
these words show.
There are, however, cases which are expected to show ov/av in suffixation under the
underlying av/ov hypothesis, but they do not. (17) presents examples of words which are
formed from do ‘present stem’ from davidan ‘to run’ and show v in suffixed form.
However, for the word pɑdo (consisting of pɑ ‘foot’ and do ‘run, present stem’)
combined with the noun-forming –i, both versions, one with [j] and one with [v], are
possible. The j version is more frequent, both intuitively and based on the results of the
experiment I conducted: two thirds of the participants pronounced it as pɑdo[j]i and one
third as pɑdo[v]i.
(18) pɑdo ‘an errand boy’ → pɑdo[j]i ‘the job of an errand boy’
pɑdo[v]i
Thus in addition to cases such as no ‘new’, ɑhu ‘deer’, etc. there are related groups of
words (related in a sense that they are formed from a particular stem) which show the –v.
The occurrence of –v in suffixation after some o-final and u-final words recalls the
question of whether o or ow occurs in final position in Persian words. This question was
addressed for monosyllabic o-final words in the discussion of a minimal word
requirement (chapter 6). Recall that there is debate in the Persian literature (e.g., Samareh
1985, Meshkatod Dini 1999) about whether apparent [o]-final words are /o/ or /ow/ final.
It is argued (e.g., Cowan and Yarmohammadi 1978, Hayes 1986) that the labiodental
fricative [v] and the labiodental approximant [w] are in complementary distribution: [w]
occurs only in codas after short vowels; for example, pɑltow ‘overcoat’ and dowr ‘era’
(examples from Keer 1999). Elsewhere [v] occurs; for example: initially as in vali ‘but’,
after consonants as in keʃvar ‘country’, after former long vowels as in ɡɑv ‘bull’
(examples from Keer 1999). Morphological alternations also exist, as in cases like
mira[v]-am ‘I am going’ and bo-ro[w] ‘go!’ or no[w]-ruz ‘new year’ and no[v]-in ‘new
kind’, indicating that [v] and [w] are allophonically related (e.g., Keer 1999). A note is in
order about av/ov. As said before, alternations between av/ov with the same stem are
possible (e.g., mira[v]-am ‘I am going’ and bo-ro[w] ‘go!’, pejro ‘follower’ and pejrovi / pejravi ‘following’).
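The complementary distribution claimed in the works cited above can be stated as a single conditional. The following is a minimal sketch of that claim only, not an analysis endorsed here: the function name `realize_labial` and its two parameters are hypothetical, and syllabification is supplied by hand rather than computed.

```python
# Lax (short) vowels of Persian, as listed in the text.
SHORT_VOWELS = {"a", "e", "o"}


def realize_labial(prev_segment, in_coda):
    """Surface form of the labial under the complementary-distribution claim.

    prev_segment: the segment right before the labial ("" if word-initial).
    in_coda: True if the labial is syllabified in a coda.
    """
    # [w] only in codas after short vowels; [v] elsewhere.
    if in_coda and prev_segment in SHORT_VOWELS:
        return "w"
    return "v"


# (label, preceding segment, labial in coda?) for examples cited in the text.
examples = [
    ("pɑlto_ 'overcoat'", "o", True),
    ("_ali 'but'", "", False),
    ("keʃ_ar 'country'", "ʃ", False),
    ("ɡɑ_ 'bull'", "ɑ", True),
    ("mira_-am 'I am going'", "a", False),
    ("bo-ro_ 'go!'", "o", True),
]
for label, prev, coda in examples:
    print(label, "->", realize_labial(prev, coda))
```

On this statement, the alternation in mira[v]-am and bo-ro[w] follows from resyllabification alone: the labial is an onset before the vowel-initial suffix and a coda word-finally.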
The word pɑlto(w) ‘overcoat’ mentioned above, however, gets –i (not v) when a vowel-
initial suffix is added, as (19) shows. This is the adjective-forming –i. I put the w at the
end of the word pɑlto(w) within parentheses because it is unclear whether the word has w
at the end underlyingly or not; that is, whether the word ends in o or in ow.
Thus the o/av alternation is observed with some items and not with others. As I suggested
above, a possible account for this is that some of these words end in –o and some in –av
or –ov underlyingly. The latter group shows the alternation in suffixation.
But note that with the same word both [v] and [j] are possible, sometimes with the same
suffix and sometimes with different suffixes. For instance, with no ‘new’ the nominal
form is no[j]i ‘newness’, with no[v]i possible, while the adjectival form is only novin
‘modern’ (*no[j]in). These cases seem to be problematic for an account which considers
two groups of o- or u-final words based on their underlying v or w. There are few cases
like this. It is possible that stems where [j] and [v] are both possible are lexicalized. For
instance, the form no[v]in might be lexically listed. The reason for this suggestion is that
the j is found regularly in other cases of suffixation with no ‘new’ and it is not clear why
only the v with the suffix –in in novin ‘modern’ is possible, that is, why no-in/no[j]in is
impossible. A similar example of inconsistency in suffixation which was given in (8) was
ɑhu[j]i ‘related to deer’ and ɑho[v]ɑn ‘deer (pl)’.
The account which considers two groups of o-final and u-final words underlyingly thus
encounters two problems. First, in order to account for cases which show v in some cases
of suffixation and j in other cases, it is difficult to determine the underlying
representation without resorting to an account based on lexicalization. The words which
show variation in terms of the occurrence of v and j, that is, those words which do not
behave consistently across suffixes or even with one suffix, can be underlyingly either
ow/ov-including or without the final w/v. If a word which shows variation is treated as
underlyingly ending in ow/ov then the occurrence of v in suffixation is expected, but an
explanation is needed for why j occurs in some cases. And if a word which shows
variation is treated as underlyingly without w/v, for which the default pattern ([j]) is
expected to occur in suffixation, then an explanation is needed for why v occurs in some
cases. This is not impossible but requires strong arguments to be defensible.
Second, when we say that a word is underlyingly one way or the other, what are our
diagnostics and what do we look at? If what we look at are the active phonological
processes of the language as diagnostics for the underlying form, then in the cases where
both j and v are possible we cannot reach a conclusion. If it is the historical background
of the words which we should take into consideration, then again one cannot conclude
anything as there are cases, some of which are given in (23), which historically had -ɡ/-k
in final position but show j or v synchronically.
(23) Middle Persian       Modern Persian
     āhuɡ ‘deer’          ɑhu       ɑhuji/ɑhovɑn ‘related to deer/deer (pl.)’
     brūɡ ‘eyebrow’       abru      abrovɑn ‘eyebrows’
     pahlūɡ ‘side’        pahlu     pahluji ‘adjacent’
     nōk ‘new’            no        noji/novin ‘newness/modern’
In this section I discussed the possibility of two different underlying representations, o
and ov, to account for the occurrence of the v in some cases and j in other cases with
apparent o (and u) final stems. I suggested that this analysis encounters problems arising
from the fact that with many stems both [j] and [v] are possible. Next, I look into another
possibility in accounting for the v, the occurrence of the default pattern (the insertion of
j), versus some lexicalized items, which show the v.
7.2.2. Default pattern vs. lexicalized items
Another possible account for –v in suffixation is to take the j or glottal stop or hiatus
tolerance to be default, with words with –v lexicalized. This account differs from the
previous one as this one does not consider an underlying division between o- and u-final
words. Under this account, the cases which show [v] are lexically listed with v as
opposed to other cases for which the default pattern applies. There are two problems with
this account. One is that considering some cases to be lexicalized is just speculation. In
the absence of an explanation for the presence of v in the cases which show it, we resort
to lexicalization. The other is that cases which show both [v] and [j] are difficult to
account for under this view because one needs to consider two versions for them, a
lexicalized version and a version to which the default pattern applies.
In the next subsection, I discuss the possibility that the occurrence of the v is due to the
effect of the labial environment.
7.2.3. Phonetic effect
The third possibility to account for the v involves the phonetic effect of the labial
environment. That is, the v is possible at a suffix boundary when it is a labial
environment (following the o and u). The default j can also occur in this environment.
There is one possible piece of support for this hypothesis. As the examples in (24) show,
the morpheme va ‘and’ has variant forms, va and o (informal). There is also a vo form
which is possible only following a vowel-final stem (cf. (24) with (25)). Some consider
the v to be epenthetic (e.g., Najafi 2001). Assuming the correctness of this analysis, again
we see that the insertion of v is triggered by the labial environment.
(24) a. mɑ va ʃomɑ (formal)
        mɑ o ʃomɑ ~ mɑ vo ʃomɑ (informal)
        ‘we and you’
     b. se va nim
        se o nim ~ se vo nim
        ‘three and a half’
     c. kare va nɑn (taken from Lazard 1992)
        kare o nun ~ kare vo nun
        ‘butter and bread’
     d. ʤɑ va makɑn (taken from Najafi 2001)
        ʤɑ o makɑn ~ ʤɑ vo makɑn
        ‘place and location’
     e. xɑne va kɑʃɑne (taken from Najafi 2001)
        xɑne o kɑʃɑne ~ xɑne vo kɑʃɑne
        ‘home and abode’
One might ask why we do not consider vo in (24) to be the informal version of va. In
other words, why is v considered to be epenthetic? The specificity marker rɑ shows what
at first appears to be a similar variation between ro and o (e.g., ketɑb rɑ ~ ketɑb ro ~ ketɑb-o ‘the book’). However, the environment for the consonant-initial forms vo and ro
differs: with va/vo/o, the vo form is restricted to post-vocalic position, while with rɑ/ro/o
the ro form is unrestricted (e.g., ɢazɑ ro ‘the meal’, ketɑb ro ‘the book’).
(25) a. ketɑb va daftar ‘book and notebook’
        ketɑb o daftar *ketɑb vo daftar
     b. ʃir va mɑst ‘milk and yogurt’
        ʃir o mɑst *ʃir vo mɑst
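The contrast between (24) and (25) amounts to a restriction on the informal variants of the conjunction: o is always available, while vo requires a vowel-final preceding word. A minimal sketch of that distribution follows; the name `informal_and_forms` is hypothetical, and the check looks only at the last symbol of the transcription, which suffices for the forms cited here.

```python
# The six vowels of Persian as transcribed in this thesis.
VOWELS = set("ɑiuaeo")


def informal_and_forms(first_word):
    """Informal variants of va 'and' that can follow the given word."""
    forms = ["o"]  # available after any word
    if first_word[-1] in VOWELS:
        forms.append("vo")  # only after a vowel-final word, as in (24)
    return forms


for w1, w2 in [("mɑ", "ʃomɑ"), ("se", "nim"), ("ʃir", "mɑst")]:
    for conj in informal_and_forms(w1):
        print(w1, conj, w2)
```

The sketch is neutral between the two readings discussed in the text: the same distribution results whether v is inserted after vowels or deleted after consonants.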
Cases such as those in (24) may support the position that the occurrence of v is due to
phonetic factors. However, they do not provide conclusive evidence to argue for this
position because (i) one may argue that v is not epenthetic in cases such as (24), but it is
deleted in cases such as (25), when the word before the conjunction is consonant final;
(ii) the cases in (24) do not include the suffixes under study here. Thus stronger evidence
is needed to be able to confidently argue for this position.
7.2.4. Summary
I have introduced three possible hypotheses to account for the appearance of -v in
suffixation. While each has some merit, I have suggested that each encounters some
problems. I summarize the three views below. Let us first roughly consider three
categories of words: those which always take v, those which never take v, and those which
show variation between j and v. I say “roughly” because a precise categorization of
words is not possible, as seen in (17) and (18). Even for cases where mostly v appears in
suffixed forms (as in (17)), there is the possibility of j in another suffixed form (as in
(18)). There are also some words which do not show variation and always take v, such as novin
‘modern’. Now let us see how each of these categories is accounted for in the three views
I discussed in 7.2.1-7.2.3.
-The first view (different underlying representation): the words which always take
v are considered to be underlyingly ow/ov-final; the words which never take v are
underlyingly o-final; the words which show variation between j and v are problematic for
this account.
-The second view (default pattern vs. lexicalized items): the words which always
take v are considered to be lexically listed with v; the words which never take v show the
default pattern ([j]); the words which show variation between j and v are problematic for
this account.
-The third view (phonetic effect): the words which show variation and those
which take j are easy to account for, as the default pattern can still apply while the labial
environment motivates the presence of v. Words like novin ‘modern’, which always
show v, are lexicalized that way. The fact that it is difficult to find consistent presence
of v with the same base (as I explained above regarding the rough categorization of words)
may suggest that something other than the underlying form is at work, and therefore
strengthens the hypothesis that phonetics has a role here. Also, this view is the only one
which can easily account for the variation.
In order to better understand the distribution of v, I conducted an experiment to see
whether Persian speakers use v in suffixed forms of o- and u-final words, and to see in a
perception task, to what extent they rate words with [v] as acceptable compared to those
with [j].
7.3. The experiment
I carried out an experiment to test whether the presence of v after o and u at a suffix
boundary is an active process or is limited to known cases. The experiment was done
with 10 speakers of Persian. These are the same speakers who did the experiment
discussed in chapter 4 for epenthesis in suffixation and in chapter 6 for o-final
monosyllables. As for the experiment itself, as I said in 4.7.1, I conducted one experiment
through which three processes were examined (see Appendix 3 on experiments). Three
tasks were designed for the process on v, as follows:
Task (i): Question and answer task (production) (real and made-up words)
Task (ii): Wug task (production) (made-up words)
Task (iii): Acceptability rating (perception) (real and made-up words)
I did not include the reading task for this process since the v appears in Persian script and
therefore it will be read by speakers.
Below I will go through the tasks for this process.
7.3.1. Task 1: Production (question and answer)
This task includes both real and made-up words. The same question and frame sentence
(with a blank space in which the speaker’s answer fits) which were used for the
epenthesis process in chapter 4 and for the o-final monosyllables in chapter 6 were used
for this process too. I repeat them here for convenience:
(26) The question: ali ʧi ɡoft?
What did Ali say?
The frame sentence: fekr mikon-am ɡoft…..
I think he said……… (participant’s response)
In addition to the stress-bearing –i’s and the plural marker -ɑn, I tested some other
suffixes as well. The occurrence of v is not restricted to only noun-forming suffixes,
adjective-forming suffixes, and the plural marker, as seen in no ‘new’, novin ‘modern’, so
I included the suffixes –in and -estɑn as well. For the list of suffixes and o-/u- final made-
up words used for this process see Appendix 8. The real words used for this process are
given in Appendix 10. Note that the o-/u- final words were mixed with the consonant-
final words which were used to study epenthesis (chapter 4) as well as with e-final words
to study the occurrence of ɡ at a suffix boundary (chapter 8). In addition, I included a few
made-up ɑ-final and i-final words to have a complete list of vowels in final position (e.g.,
timɑ, buni).
Let us now examine the results of the question and answer task for the process of v in
suffixation.
(27) The result of made-up o- and u-final words
No v            With v          Misc           Total
191 (56.18%)    105 (30.88%)    44 (12.94%)    340 (100%)
(28) The result of real o- and u-final words
No v            With v          Misc           Total
55 (35.71%)     92 (59.74%)     7 (4.55%)      154 (100%)
Before going through these tables, I should note that by ‘No v’ I mean the occurrence of
the default pattern which could be j or nothing or the glottal stop, which, as I said before,
I consider as the default category. I use j as the representative of this category. I should
also add a point about miscellaneous cases. As discussed in chapter 6, the miscellaneous
cases are those cases which are different from one of the expected patterns and are not of
concern here. For example, in some cases some speakers used the non-stress-bearing
suffix –i instead of the stress-bearing suffix –i, so there was a change of stress. Such
cases were categorized as miscellaneous.
The tables in (27) and (28) suggest a preference towards using j with made-up words and
towards using v with real words. In order to formally compare the occurrence of j and v
between made-up words and real words, a two-tailed Sign test was conducted. To run this
statistical test, I needed to know the performance of each speaker individually. Below is a
table which shows the performance of each speaker, shown by his/her initial, for real and
made-up words, eliminating miscellaneous cases. The table should be read as follows:
considering the first row, the speaker AE pronounced 58.33% of the real words of this
task with j (and not v) and 74.19% of the made-up words with j (and not v). The means
are given at the end.
(29) The result of the performance of speakers individually (question and answer)
Speakers Real % /j/ Made-up % /j/
AE 58.33 74.19
FT 36.36 78.12
SF 61.53 93.54
VD 20 86.66
SS 64.7 77.42
AA 23.53 67.74
RM 25 51.51
PH 41.18 60.87
NP 41.18 58.82
LP 5.88 12.12
Mean 37.77 66.10
The result strongly indicates that the frequency of /j/ is significantly higher in made-up
words than in real words (p-value = 0.002; highly significant).
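The Sign test reported above can be recomputed from table (29). The following is a minimal sketch, assuming an exact binomial version of the test with ties dropped; the function name and the data layout are illustrative and say nothing about the software actually used for the analysis.

```python
from math import comb

# Per-speaker percentages of j from table (29): (real words, made-up words).
J_RATES = {
    "AE": (58.33, 74.19),
    "FT": (36.36, 78.12),
    "SF": (61.53, 93.54),
    "VD": (20.0, 86.66),
    "SS": (64.7, 77.42),
    "AA": (23.53, 67.74),
    "RM": (25.0, 51.51),
    "PH": (41.18, 60.87),
    "NP": (41.18, 58.82),
    "LP": (5.88, 12.12),
}


def sign_test_two_tailed(pairs):
    """Exact two-tailed sign test on paired observations (ties are dropped).

    Returns (n, k, p): the number of non-tied pairs, the number of pairs in
    which the second value exceeds the first, and the two-tailed p-value.
    """
    diffs = [second - first for first, second in pairs if second != first]
    n = len(diffs)
    k = sum(d > 0 for d in diffs)
    tail = min(k, n - k)
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return n, k, min(p, 1.0)


n, k, p = sign_test_two_tailed(J_RATES.values())
print(n, k, round(p, 3))  # prints: 10 10 0.002
```

All ten speakers show a higher percentage of j with made-up words, so the two-tailed probability under this sketch is 2 × (1/2)^10, which rounds to the reported 0.002.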
Next I discuss a wug test for the v process.
7.3.2. Task 2: Production (wug test)
For this task some o- and u-final made-up words were used. In this task, I tested the
noun-forming –i and the plural marker -ɑn. There were two sentences, each for one of
these suffixes. The English equivalents of the sentences are as follows: (i) for the noun-
forming –i: ‘They are looking for a …..Unfortunately Mina does not know….’ (the
blanks in this sentence could be filled for example with ‘driver’ and ‘driving’
respectively); (ii) for the plural marker -ɑn: ‘There was not only one … in that room. It
was a group of ….. there.’ (the blanks in this sentence could be filled for example with
‘teacher’ and ‘teachers’ respectively). Both sentences were practiced with real words first
so the speakers could get familiar with the sentences and afterwards some made-up words
were practiced with the sentences.
The result is as follows:
(30) The result of made-up o- and u-final words (wug test)
No v           With v         Misc        Total
50 (62.5%)     30 (37.5%)     0 (0%)      80 (100%)
It seems that the version with [j] is more frequent. In order to test this formally, I ran a
two-tailed one sample t-test. Again speakers’ performances were looked at individually,
as given below. The table shows that there is 62.5% occurrence of /j/ and therefore 37.5%
occurrence of /v/.
(31) The result of the performance of speakers individually (wug test)
Speakers Made-up % /j/
AE 75
FT 87.5
SF 100
VD 50
SS 100
AA 62.5
RM 62.5
PH 75
NP 12.5
LP 0
Mean 62.5
As (31) shows, some speakers use j in all cases (speakers SF and SS), some use little or no
j (speakers NP and LP), and others use j in half of the cases or more (speakers AE, FT,
VD, AA, RM, PH). The overall result does not support the frequency of /j/ being
significantly different from that of /v/ for this task, which includes only made-up words
(p-value = 0.2729; not statistically significant).
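The one-sample t-test can likewise be recomputed from table (31). The sketch below rests on an assumption that is not stated in the text: the null value is taken to be 50%, i.e. equal use of j and v. Under that assumption the result agrees with the reported p-value. The function names are illustrative, and the p-value is obtained by numerically integrating the Student t density so that only the standard library is needed.

```python
import math
import statistics

# Per-speaker percentages of j in the wug test, from table (31).
J_PERCENT = [75, 87.5, 100, 50, 100, 62.5, 62.5, 75, 12.5, 0]
NULL_MEAN = 50.0  # assumption: j and v equally frequent under the null


def one_sample_t(values, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / standard_error, n - 1


def t_two_tailed_p(t, df, steps=20000):
    """Two-tailed p-value via Simpson integration of the Student t density."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps is even, as Simpson's rule requires
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return 1 - 2 * area


t_stat, df = one_sample_t(J_PERCENT, NULL_MEAN)
print(round(t_stat, 2), df, round(t_two_tailed_p(t_stat, df), 2))
```

The mean of 62.5% lies above 50%, but the spread across speakers (from 0% to 100%) is large, which is why the difference does not reach significance with ten speakers.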
So in the question and answer task, the frequency of /j/ was higher in made-up words than
in real words. In the wug test, also a production task, no significant difference between the
frequency of /j/ and that of /v/ in made-up words is observed. Putting these together, the
overall conclusion for production tasks is as follows: with made-up words /j/ is used
more often than, or as often as, /v/.
7.3.3. Task 3: Perception (acceptability rating)
For this part of the experiment, the participants listened to a list of recorded words and
rated them on the following scale: √ (good, acceptable, possible), ? (so-so), X (bad,
unacceptable, impossible). The pen and paper process which was explained in chapter 4
and chapter 6 was used here.
(32) The result of made-up o- and u-final words
          √               ?              X               Total
No v      89 (67.42%)     12 (9.09%)     31 (23.48%)     132 (100%)
With v    106 (80.3%)     11 (8.33%)     15 (11.36%)     132 (100%)
The overall result does not support the frequency of /j/ being significantly different from
that of /v/ (p-value = 0.0574; not statistically significant).
(33) The result of real o- and u-final words
          √              ?              X              Total
No v      33 (63.46%)    6 (11.54%)     13 (25%)       52 (100%)
With v    41 (78.85%)    2 (3.85%)      9 (17.31%)     52 (100%)
The result does not support the frequency of /j/ being significantly different from that of
/v/ (p-value = 0.6875; not statistically significant).64
64 Note that the tables including each individual’s performance cannot be provided for the perception task, for two reasons: (i) there is overlap in each speaker’s judgments; that is, both versions of a word could be acceptable for a speaker (as shown by the sum of some components which exceeds 100%; see, for example, “No v √” and “With v √” in (32) and (33)); (ii) there are some so-so’s (those shown by ?), so there is no clear-cut two-way distinction. In production, by contrast, it is either one or the other (and not both or so-so); that is, in production, speakers said a word either with j or with v, so if one deducts the percentage of usage of j, the exact percentage of usage of v is obtained. This is not the case in perception for the reasons just noted.
7.3.4. Discussion
In this section I discussed an experiment that I conducted to examine the occurrence of -v
in suffixation with real and made-up o- and u-final words. The results of the experiment
on v lead us toward the following conclusion: in comparing real and made-up words in a
production task, there are more [j] with made-up words than with real words. This might
suggest that familiarity is a factor in having [v] in suffixation. That is, for made-up words
the default pattern is more frequent. However, given that in another production task there
is no significant difference between the usage of [j] and [v], it can be concluded that the
[v] is not an indication of the underlying presence of ow/ov in real words and is best
explained phonetically. There might be an underlying v in some cases, and also some
frozen or lexicalized cases or some historical residues with v. But there is also an active
v-insertion process which occurs after made-up o-final and u-final words, and this
supports the phonetic effect due to labial environment. The v does not occur all the time
in this labial environment because the default rule is also active, thus yielding variation
between j and v even for the same word. The variation is seen in both real words and
made-up words (e.g., an example for real words: ɑhu, ɑhu[j]i ‘pertaining to deer’, and
ɑho[v]ɑn ‘deer (pl)’; an example of made-up words which show the variation: the word
tɑru by the same speaker gave tɑruji but tɑrovɑn). In perception, in both real and made-
up words, no significant difference is observed between the occurrence of j and v, which
again can be explained by the default rule as well as the environment-dependent v-
occurrence both being active. The observation that v does not occur when the labial
environment is not provided, as shown in (1) and (2) and as also the words with other
vowels in final position in the experiment show (see chapter 8), confirms that v occurs
due to the phonetic effect of labials.
Thus, the conclusion is that the occurrence of v vs. j is governed by rules: the default rule
which inserts j and the rule motivated by labial environment which inserts v. But, then,
there are lexical differences (e.g., some lexical items take v more frequently while others
never or rarely take v) and these should be memorized for each item. This is a good
illustration of how it is hard to draw a firm line between what is derived by rules and
what is listed in the lexicon, i.e., what Zuraw (2000) refers to as ‘patterned exceptions’
and Bybee (2001) refers to as ‘rule/list fallacy’. In other words, these are patterns that
need to be memorized for each known word but are nevertheless productive and can be
applied to novel words.
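The interaction of the two rules can be summarized in a short sketch. It is only an illustration of the conclusion just stated: the function name and the vowel sets are assumptions, item-specific lexical preferences (such as novin always showing v) are not modeled, and the suffix-specific -ɡ of chapter 8 is left out.

```python
# Illustrative sketch only. It lists which consonants the two rules make
# available at a stem-suffix boundary; it does not predict which one a
# speaker will choose for a particular word.

VOWELS = set("ieaouɑ")     # vowel symbols used in the transcriptions
LABIAL_VOWELS = set("ou")  # assumption: o and u supply the labial environment


def boundary_candidates(stem):
    """Consonants available between a vowel-final stem and a vowel-initial suffix."""
    final = stem[-1]
    if final not in VOWELS:
        raise ValueError("this sketch covers vowel-final stems only")
    if final in LABIAL_VOWELS:
        return ["j", "v"]  # default rule and labially motivated rule both active
    return ["j"]           # only the default rule applies


for stem in ["tɑru", "ɑhu", "talɑ"]:
    print(stem, boundary_candidates(stem))
```

Because both candidates are returned for o- and u-final stems, the sketch is compatible with the variation observed for a single word, as with tɑruji and tɑrovɑn from the same speaker.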
7.4. Summary
As noted in chapter 1, one of the major goals of this work is to examine some suffixation
processes in Persian in order to understand the seeming irregularities in
morphophonemics. In this chapter I discussed a suffixation process which involves the
occurrence of v following o- and u-final words at a stem-suffix boundary. I reviewed
three possible hypotheses to account for this process. One possibility is to consider two
groups of o- and u-final words in the language: one group with underlying ow/ov and the
other without having ow/ov underlyingly. A second possibility involves taking cases with
v in suffixation to be frozen or lexicalized as opposed to those cases which show the
default pattern, [j]. The third possibility rests on a phonetic effect of the labial
environment. I conducted an experiment testing the occurrence of v with real and made-
up words. With o-/u-final words, in production j and v both actively occur (in one task j is
more frequent, in the other no significant difference is seen between their occurrences).
In perception of o-/u-ending real and made-up words, the ratings of j and v are not
significantly different. I concluded that the v can be phonetically explained, as follows: in
addition to the occurrence of the default pattern, there is an active v-insertion process
which occurs with made-up o-final and u-final words. Thus two patterns, one the default
pattern and the other the phonetically-driven v insertion, are synchronically active and
participate in the suffixation process. They are, therefore, responsible for the observed
variation regarding the consonant which may occur at a suffix boundary after o- and u-
final words. The process is thus best treated synchronically and phonetically.
In the next chapter, I will discuss another process which involves consonants in
suffixation. It is the occurrence of -ɡ in suffixation.
Chapter 8
More on suffixation: the case of -ɡ
In the previous chapter, I examined a suffixation process which involves the consonant –v
in suffixation after o- and u-final words. In this chapter, another suffixation process
which also involves a consonant is discussed. In this case, the consonant -ɡ occurs after e-
final stems with two suffixes, the noun-forming suffix –i and the plural marker suffix -ɑn.
In this chapter I first discuss the synchronic status of the -ɡ in 8.1. This will be followed
by a review of the literature in 8.2. I then examine different hypotheses, both historical
and synchronic, to account for the occurrence of the -ɡ in 8.3. I conducted an experiment
on -ɡ, which I will discuss in 8.4. Arguing that the occurrence of -ɡi and -ɡɑn is a case of
phonologically controlled allomorphy with particular suffixes, I conclude the chapter in
8.5.
8.1. The synchronic status of -ɡ
In this section, I examine the occurrence of -ɡ at a suffix boundary with the noun-forming
suffix –i and the plural marker suffix -ɑn in Persian. The status of -ɡ is controversial in
the Persian literature. Some (e.g., Meshkatod Dini 1999, Najafi
2001) have claimed that [ɡ] is an epenthetic consonant. Natel Khanlari (1987), on the
other hand, considers the present -ɡ to be a historical -ɡ which synchronically appears
with some suffixes.
Before discussing possible analyses for the occurrence of the -ɡ, let us see some examples
to show where -ɡ occurs.
8.1.1. The -ɡ: overview
Consider the following examples in which the noun-forming –i (a stress-bearing suffix) is
attached to a vowel-ending base:
(1) a. ʃakibɑ ‘patient’ ʃakibɑ-i ‘patience’
b. baʧʧe ‘child’ baʧʧe-ɡ-i ‘childhood’
In (1a) the suffix is clearly of the form –i, while in (1b), the suffix appears to have the
form –ɡi. The presence of the [ɡ] raises interesting questions. Why is the [ɡ] present?
Why is it present only after the vowel [e] (without exception), as we will see? Why, when
a similar suffix is considered, is there no [ɡ] present? Compare (1) with (2) in which the
adjective-forming –i (also stress-bearing) is attached to a vowel-ending base:
(2) a. talɑ ‘gold’ talɑ-i ‘golden’
b. mantaɢe ‘region’ mantaɢe-i ‘regional’
As we see, in (1a) and (2a), when the base ends in ɑ, the suffixes pattern similarly. In (1b)
and (2b), where the base ends in –e, however, the suffixes pattern differently from each
other. (1b), in which the suffix is noun forming, shows ɡ but (2b), in which the suffix is
adjective forming, does not show ɡ; the latter patterns similarly to the words which end in
-ɑ. More examples of the bases with different vowels will be presented in (3) and (4).
In this section I study this ɡ in some depth. The ɡ is of limited distribution, occurring
between an e-final stem or suffix and two vowel-initial suffixes, the noun-forming -i and
the plural marker -ɑn. It should be noted that the plural marker -ɑn is used more in formal
speech. The plural marker -hɑ is the suffix used in informal speech.
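The limited distribution of ɡ just described can be sketched as a small function. The suffix labels and the function name are hypothetical, introduced only because the noun-forming and adjective-forming suffixes share the form –i. The sketch uses j as the representative of the default category after other vowel-final bases, and it does not model the optional v after o and u (chapter 7) or lexical exceptions.

```python
# Illustrative sketch only: suffix labels and names are assumptions.

VOWELS = set("ieaouɑ")  # vowel symbols used in the transcriptions

SUFFIX_FORMS = {
    "noun_i": "i",   # noun-forming, stress-bearing
    "adj_i": "i",    # adjective-forming, stress-bearing
    "plural": "ɑn",  # plural marker
}
G_SUFFIXES = {"noun_i", "plural"}  # the two suffixes that show ɡ after e


def attach(stem, suffix):
    """Attach a suffix, inserting ɡ, the default j, or nothing at the boundary."""
    form = SUFFIX_FORMS[suffix]
    final = stem[-1]
    if final == "e" and suffix in G_SUFFIXES:
        return stem + "ɡ" + form  # e-final base with noun-forming -i or -ɑn
    if final in VOWELS:
        return stem + "j" + form  # default pattern after a vowel-final base
    return stem + form            # consonant-final base: no consonant added


print(attach("baʧʧe", "noun_i"))  # baʧʧeɡi
print(attach("mantaɢe", "adj_i"))  # mantaɢeji
print(attach("dɑnɑ", "plural"))    # dɑnɑjɑn
```

The same e-final base thus receives ɡ or the default consonant depending only on which suffix is attached, which is the pattern contrasted in (1b) and (2b).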
Let us now consider the data. (3) and (4) present examples of suffixation with the
noun-forming –i and the plural marker -ɑn respectively. (3a) and (4a) contain examples of
e-final words with these suffixes. (3b) and (4b) contain examples of words ending in sounds
other than e.
(3) The noun-forming -i
a. The suffix with e-final words
farsude ‘worn-out’ farsudeɡi ‘being worn-out’
ɡorosne ‘hungry’ ɡorosneɡi ‘hunger’
teʃne ‘thirsty’ teʃneɡi ‘thirst’
xaste ‘tired’ xasteɡi ‘tiredness’
zende ‘alive’ zendeɡi ‘life’
baʧʧe ‘child’ baʧʧeɡi ‘childhood’
nevisande ‘author’ nevisandeɡi ‘authorship’
rɑnande ‘driver’ rɑnandeɡi ‘driving’
jeɡɑne ‘unique’ jeɡɑneɡi ‘uniqueness’
sɑde ‘simple’ sɑdeɡi ‘simplicity’
b. The suffix with sounds other than e in final position
[...]
(4) The plural marker -ɑn
[...]
b. The suffix with sounds other than e in final position
deraxt ‘tree’ deraxtɑn
dɑnɑ ‘knowledgeable’ dɑnɑjɑn
ɡedɑ ‘poor’ ɡedɑjɑn
dɑneʃʤu ‘university student’ dɑneʃʤujɑn
pari ‘fairy’ parijɑn
Note that some of these words are roots and some are suffixed forms to which one can
add –i and -ɑn. The presence of j as an epenthetic glide represents the default pattern, as
noted in chapters 6 and 7.
Also relevant here is that there are some words in Persian which may be thought to have j
as a part of the base. For instance, consider dɑneʃʤu ‘university student’ from which
dɑneʃʤuji ‘studentship’ and dɑneʃʤujɑn ‘university students’ are formed. The word
dɑneʃʤu ‘university student’ consists of dɑn (‘the present stem of dɑnestan ‘to know’’)
followed by the suffix -eʃ, giving dɑneʃ ‘knowledge’, to which ʤu(j) (‘the present stem
of ʤostan ‘to look for’’) is added (dɑneʃʤu literally means ‘someone who is looking for
knowledge’ used for university students). Now whether the j at a suffix boundary is a part
of the present stem ʤu(j) from ʤostan ‘to look for’ or whether it is epenthetic is a
question. There are, however, cases where j is clearly epenthetic (e.g., dɑnɑ ‘knowledgeable’, ɡedɑ ‘poor’, pari ‘fairy’). While these are unclear issues, what matters
in our discussion of -ɡ is that no -ɡ appears with these cases.
As shown above, a ɡ appears when an e-final base is followed by the noun-forming –i.
There are a few exceptions which show ɡ in suffixation but the result is an adjective and
not a noun. For example:
(5) xɑne ‘home’ (noun) xɑneɡi ‘homemade, pertaining to home’ (adjective)
From words like ɢavi ‘strong’, one may, however, make ɢavii or ɢavi[j]i in a sentence
like:
(10) ɑdam-i be in ɢavi[j]i na-dide bud-am.
person-indefinite marker to this strong-the suffix neg-seen was-1st.sg
‘I hadn’t seen a person this strong’
But ɢavi[j]i is not a word that one can use in many other cases. For example, one does
not say:
(11) *xorma xeili be-h-et ɢavi[j]i mi-de.
date (the fruit) very to-epenthetic-you strong-the suffix indicative-give-3rd sg
‘Dates (the fruit) give you a lot of strength/energy.’
The nominalized form ɢovvat (and not ɢavi[j]i) is used in (11) and in most cases. The
point relevant to our discussion is that when we can use these i-ending words with the
suffix –i, no ɡ is observed, and j is inserted.65
65 I did an experiment with 10 native speakers of Persian for some suffixation processes discussed in chapters 4, 6, 7, and this chapter. In order to have a complete list of vowels in final position, I included a couple of i-ending words (to which the suffixes –i and -ɑn can be added). The words were furi and lari, and the result strongly confirms what I said above about j being used after i-final words.
An example of the adjective-forming –i is given in (12). Consider the word mɑhi ‘fish’.
The form mɑhi[j]i/mɑhi-i (intended to mean ‘like a fish’, ‘containing fish’, or ‘liking fish’),
while not a common word, is acceptable in the following sentence, which means
that your hand has been in touch with fish and is not clean:
(12) dast-am mɑhi[j]i bud
hand-1st sg genitive fish-the suffix was
‘My hand was dirty with fish’
This sentence can be said, for instance, in a context such as ‘My hand was dirty with fish
so I did not pick up the phone’. So the point is that again j is used with i-final words
regardless of suffix.
In addition, some i-final words take -jɑji (or maybe -ɑji, preceded by an epenthetic j) to
form an adjective. See the following examples:
(13) libi ‘Libya’ libijɑji ‘Libyan’
romɑni ‘Romania’ romɑnijɑji ‘Romanian’
ʃimi ‘chemistry’ ʃimijɑji ‘chemical’
musiɢi ‘music’ musiɢijɑji ‘musical, pertaining to music’
I should also add a note about some e-final words followed by an –i. Some Arabic words
and their versions with the adjective-forming -i are both used in Persian.
(14) kufe ‘the name of a city in Iraq’ kufi ‘related to Kufe (the city)’
hole ‘towel’ holei ‘like a towel regarding the material’
herfe ‘profession’ herfei ‘professional’
mantaɢe ‘region’ mantaɢei ‘regional’
eʤɑre ‘rent’ eʤɑrei ‘rental’
dore ‘period’ dorei ‘periodic’
ɢahve ‘coffee’ ɢahvei ‘brown’
sorme ‘surma’ sormei ‘dark/navy blue’
firuze ‘turquoise (the stone)’ firuzei ‘turquoise (color)’
peste ‘pistachio’ pestei ‘a green like the color of pistachio, or something made of or containing pistachio’
With the adjective-forming suffix –i, e-final words pattern similarly to words with other
vowels in final position (a j can be inserted between /u, ɑ/ and the suffix, too).
(16) ɢuti ‘can, box’ → ɢutiji ‘packed in box’
ʤɑdu ‘magic’ → ʤɑdui ‘magical’
talɑ ‘gold’ → talɑi ‘golden’
Persian has a non-stress-bearing suffix –i as well, which should not be confused with the
two stress-bearing –i suffixes discussed above. The non-stress-bearing suffix –i is the
indefinite article, which was included in the list of suffixes in section 4.4. I mention it
here because it will be referred to later. Both before and after suffixation, the stress in the
following cases is on the final syllable of the base, that is, on final -e. Some examples are
presented in (17) –the stress on final –e of the base is shown in bold:
(17) baʧʧe ‘child’ baʧʧe-i ‘a child’
parande ‘bird’ parande-i ‘a bird’
kise ‘bag’ kise-i ‘a bag’
In informal speech, usually jek ‘a(n), one’ is used to mark indefiniteness (e.g., je(k) baʧʧe ‘a child’). Sometimes in speech jek ‘a(n), one’ precedes the words which get the
indefinite article –i (e.g., je(k) baʧʧe-i ‘a child’). The point relevant to our discussion
here is that the non-stress-bearing -i does not get -ɡ.
I leave aside the non-stress-bearing –i, and focus on the two stress-bearing -i’s. It should
be noted that there is not agreement in the literature on the existence of two suffixes,
noun-forming –i and adjective-forming –i, just as there is not agreement upon the nature
of the ɡ. Thus, before going further, I will review the literature on the ɡ in suffixation and
then continue with my discussion.
8.2. Literature review
A number of authors who have written about Persian address the question of the source
of the ɡ that occurs with the noun-forming suffix and the plural marker after the vowel e.
Some studies propose that the ɡ is epenthetic while some consider it as a form of the
suffixes –i and -ɑn which occurs only after -e. There is also an account which considers
the ɡ to be present historically in final position of some words. According to this account,
the ɡ reappears in the present time when these words are followed by certain suffixes. I
will review these accounts in 8.2.1-8.2.5. In 8.2.6, I conclude the section.
8.2.1. Natel Khanlari’s account
Natel Khanlari (1987) provides a historical account of the ɡ, arguing that it was in final
position in some words in Middle Persian and reappears in some cases of suffixation,
where it is not word-final; for example, with the plural marker -ɑn and the noun-forming
–i. Some of his examples are given in (18) (the transcriptions are mine; in the reference
only the suffixed forms are presented, I also include the unsuffixed form):
He thus proposes two –i suffixes. Only one of them, the noun-forming suffix, becomes
-ɡi after –e.
8.2.3. Kalbasi’s account
Kalbasi (1992) considers the suffix –i to have two forms: -i and -ɡi, the latter form occurs
only after –e. The –i has two historical sources, she argues: (i) One was a noun-forming
suffix: this was –īh in Middle Persian, as in tārīɡīh and wadīh (these words are tɑrik-i ‘darkness’ and bad-i ‘badness’ in Modern Persian (from tɑrik ‘dark’ and bad ‘bad’));
(ii) The other was an adjective-forming suffix: this was -īɡ in Middle Persian, as in pārsīɡ
and nāmīɡ (these words are pɑrsi or fɑrsi ‘Persian, Farsi’ and nɑmi ‘famous’ in Modern
Persian (from pɑrs or fɑrs ‘Iran, today Fars is the name of a province in Iran’ and nɑm
‘name, fame’)).66
66 In Middle Persian dictionaries there are examples of -īh and -īɡ. I present here some examples (taken from Farahvashi’s 1967 dictionary):
Kalbasi further says that this suffix is one of the most productive suffixes in Persian and
gives several examples. Among them (the transcriptions and forms before suffixation are
mine), the following are found:
hamsedɑji ‘agreement’ (hamsedɑ ‘having the same view’)
baʧʧeɡi ‘childhood’ (baʧʧe ‘child’)
zendeɡi ‘life’ (zende ‘alive’)
Kalbasi further adds that the -ɡi version, which occurs only after –e, can be used either (i)
to indicate an infinitive (or maybe gerund), as in xaste-ɡi ‘tiredness’; or (ii) to show
relationship or correspondence (connection), as in xɑne-ɡi ‘homemade, related to house’.
However, according to her, adjectives which show relationship/correspondence and
which do not follow the rule are found. Some of Kalbasi’s examples are:
(24) noɢre ‘silver (the material)’ noɢrei ‘silver (the color) or made of silver’
mɑhiʧe ‘muscle’ mɑhiʧei ‘having muscles’
ɢabile ‘tribe’ ɢabilei ‘tribal’
Kalbasi adds, in a footnote, that in some dialects of Persian such as Isfahani (Isfahan is a
city in Iran) -ɡi is used after other vowels as well, for example:
(25) porru-ɡi ‘cheekiness’ (porru ‘cheeky’) In Standard Persian it is porru-ji or porrui
bɑbɑ-ɡi ‘fatherhood’ (bɑbɑ ‘dad’) In Standard Persian it is bɑbɑ-ji or bɑbɑ-i
Another example from her 1991 book is as follows:
(26) bijɑberu-ɡi ‘scandal’ (bi-ɑberu ‘scandalous’) In Standard Persian it is bi(j)ɑberu-ji or bi(j)ɑberui
There are also words in other dialects of Persian which sometimes (not always) show -ɡi
for the adjective-forming –i. Some examples, taken from Shalchi’s dictionary (1991), are
given in (27). Shalchi’s dictionary of Khorasanian Dialect (1991) is a dictionary which,
as the author describes, includes some words used in The Great Khorasan –the area which
included the current province of Khorasan (in North East of Iran), as well as
Turkmenistan, Afghanistan, Uzbekistan, and Tajikistan. Some of the words in (27) are
used in Standard Persian in their un-suffixed form but not in the suffixed form (see my
notes in (27)) although note that since the adjective-forming –i can be added to any noun
as long as the meaning is fine, one can make the following words in their suffixed forms
too. In all the examples given below, if in Standard Persian, one wants to add the
adjective-forming –i, she/he will add just –i and not -ɡi.
(27) sina ‘chest, breast’ sina-ɡi ‘a suckling baby’ (in Standard Persian: sine)
ʃanba ‘Saturday’ ʃanba-ɡi ‘pertaining to Saturday’ (in Standard Persian: ʃanbe)
ʃira ‘a kind of drug’ ʃira-ɡi ‘addicted to the drug’ (in Standard Persian: ʃire and out of it ʃire-i)
kuʧa ‘alley’ kuʧa-ɡi ‘pertaining to alley’ (in Standard Persian: kuʧe and out of it kuʧe-i)
In continuation of the discussion on Khorasani dialects, I add that in Mashhad (Mashhad
is a city in Iran in Khorasan province) -ɡi is sometimes used for the e-ending words
followed by the adjective-forming suffix in particular with color words (this is my own
observation and has been confirmed by another Persian speaker). This is observed in the
speech of uneducated older people.
(28) ɢahve ‘coffee’ ɢahveɡi ‘brown’
peste ‘pistachio’ pesteɡi ‘pistachio green’
In Standard Persian, these words are pronounced as ɢahvei and pestei.
A study of the dialectal distribution of -ɡ in different dialects of Persian is an interesting
topic that I leave for future research. Kalbasi’s observations on the Isfahani dialect show
that in different dialects, there are different synchronic interpretations of -ɡ although
historically these dialects share the same origin.
Returning to -ɡ in Standard Persian, Kalbasi considers the suffix to be synchronically
one, with a -ɡi form used specifically after –e, although she posits two different historical
origins for the suffix. According to her, as discussed above, sometimes some e-final
words do not take -ɡi (the result is an adjective). It should be noted that the number of
words which end in –e and take –i (with the adjective meaning) without showing -ɡ is so
large in the language that one cannot consider them as an exceptional group which
sometimes show deviation from the -eɡi rule. In fact, with some exceptions (see (5)), all
e-ending words which take the –i to form an adjective do not show -ɡi.
8.2.4. Mahootian’s account
Mahootian (1997) introduces a list of suffixes with their function. In the list, the
following are observed (I do not include other suffixes of her list and give only those I
am dealing with here). It seems that Mahootian considers at least two –i’s and does not
consider -ɡ to be one of the epenthetic consonants in Persian. I will return to this after
reviewing her list.
324
(29)
a. Nouns from nouns
These suffixes are semantically regular.
Suffix           Noun        Noun          Productive
-i (stressed)    mard        mardi         Yes
abstract nom.    man         manliness
-ɡi              barde       bardeɡi       limited (why limited will be discussed below)
abstract nom.    slave       slavery
b. Nouns from adjectives
These are semantically regular.
Suffix           Adjective   Noun          Productive
-i               xub         xubi          Yes
abstract nom.    good        goodness
-ɡi              afsorde     afsordeɡi     limited
                 sad         sadness
c. Nouns from adverbs
Adverbs are not a common source of nouns. One suffix for forming nouns from adverbs is:
Suffix           Adverb      Noun          Productive
-(ɡ)i            tond        tondi         Yes
nominalizer      fast        quickness
d. Adjectives from nouns
These are semantically regular. The words formed with -ɡi are also used adverbially.
Suffix           Noun        Adjective     Productive
-ɡi              hafte       hafteɡi       limited
-ly              week        weekly
-i               ɢahve       ɢahvei        Yes
attributive      coffee      brown
e. Adverbs from nouns
This is typically used with time expressions to form adverbs of time and can also be used
adjectivally.
Suffix           Noun        Adverb
-ɡi              hafte       hafteɡi
like -ly         week        weekly
It seems that Mahootian considers at least two –i’s (whether there are two –i’s or two
functions of the same suffix is not clear): (i) one is an abstract nominalizer (forming
nouns from nouns and adjectives); (ii) the other is attributive (forming adjectives from
nouns). The same holds for -ɡi: (i) one is an abstract nominalizer (forming nouns from
nouns and adjectives); (ii) the other forms adjectives from nouns, and the resulting
adjectives can be used adverbially as well. There is also a –(ɡ)i which forms nouns from
adverbs.
Since Mahootian does not explain how these suffixes are to be treated, I cannot be
entirely sure that my interpretation is correct. Either she takes them to be distinct
suffixes, or she simply points out different functions of the same suffix; her presentation
does not allow one to tell which. It also seems that Mahootian does not consider -ɡ to be
epenthetic, in particular because later, when she refers to epenthesis in sequences of
vowels, she says that j or h or nothing is inserted between two vowels. It is also
important to note that she considers -ɡi to be of limited productivity compared to –i.
This might be interpreted as considering those cases which show -ɡ to be frozen or
lexicalized.
8.2.5. Meshkatod Dini’s account
Meshkatod Dini (1999) considers the -ɡ to be epenthetic (along with j, the glottal stop,
etc.). He gives some examples such as:
(30) hafte-ɡ-i ‘weekly’
xɑne-ɡ-i ‘homemade, related to home’
baʧʧe-ɡ-i ‘childhood’
Meshkatod Dini continues that the epenthetic -ɡ in the above examples distinguishes
them (as a noun or adjective) from the following cases which have the indefinite marker
–i. Recall that Persian has a non-stress-bearing –i, the indefinite marker.
(31) hafte-j-i / hafte-ʔ-i ‘a week’
xɑne-j-i / xɑne-ʔ-i ‘a house’
baʧʧe-j-i / baʧʧe-ʔ-i ‘a child’
Meshkatod Dini then suggests that the place of stress has an influence on the occurrence
of the epenthetic -ɡ. He later proposes that the insertion of -ɡ has a phonological
explanation because, regardless of morphological considerations and the syntactic
category of the base, the -ɡ is found after a morpheme-final e provided that the suffix is
the stress-bearing –i. Meshkatod Dini adds that one can add the suffix -i to a noun or an
adjective and get an adjective or a noun respectively. He gives the following rule to
account for this (I mark the stressed vowel with an acute accent):
(32) ∅ → [ɡ] / e + __ + í
In comparison with this, Meshkatod Dini gives the following rule for the –i which does
not carry stress, when stress is on –e:
(33) ∅ → [ʔ] / é + __ + i
Meshkatod Dini concludes that ɡ is epenthesized between e and a stressed vowel i. It is
clear, he claims, that a phonological analysis is sufficient to account for the presence of ɡ,
and no morphological or syntactic considerations are necessary. However, Meshkatod
Dini does not consider the full range of facts. First, while he accounts for the presence of
ɡ following e with the noun formative, he does not account for its absence with the
adjective formative, which is also stress bearing. Second, if the segmental environment
must be specified, then it is difficult to account for the fact that the ɡ occurs not only with
a suffix of the form –i, but also one of the form –ɑn.
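The first objection can be made concrete by applying rule (32) mechanically. The following Python sketch is only an illustration: the function name `apply_rule_32` and the plain-string representation of the bases are assumptions of the sketch, not part of Meshkatod Dini's analysis. Since the rule refers only to a base-final e and a stress-bearing –i, it cannot distinguish the noun-forming from the adjective-forming suffix, and so it predicts ɡ in both.

```python
# Hypothetical sketch of rule (32): insert ɡ between a morpheme-final e
# and the stress-bearing suffix -i. The rule sees only phonological
# information, so the two stress-bearing -i's look identical to it.

def apply_rule_32(base: str) -> str:
    """Attach stress-bearing -i, inserting ɡ after a base-final e."""
    if base.endswith("e"):
        return base + "ɡi"
    return base + "i"

# (base, suffix type, attested form); forms are taken from (35),
# written here without the morpheme-boundary hyphen.
data = [
    ("zende", "noun-forming", "zendeɡi"),
    ("teʃne", "noun-forming", "teʃneɡi"),
    ("ɢahve", "adjective-forming", "ɢahvei"),
    ("herfe", "adjective-forming", "herfei"),
]

# Collect every case where the rule's output differs from the attested form.
mismatches = [
    (base, kind, apply_rule_32(base), attested)
    for base, kind, attested in data
    if apply_rule_32(base) != attested
]

for base, kind, predicted, attested in mismatches:
    print(f"{base} + {kind} -i: rule gives {predicted}, attested {attested}")
```

In this small sample the only mismatches are the adjective-forming cases, which is the gap in the purely phonological account noted above.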
Samareh (1977) also takes the ɡ to be epenthetic or, as he calls it, an intrusive consonant
which is used when the base final vowel is e and the suffix is either –i or -ɑn. Some of his
examples are: morde ‘dead’ and mordeɡɑn ‘dead (pl)’; teʃne ‘thirsty’ and teʃneɡi ‘thirst’.
Najafi (2001) also considers the ɡ to be epenthetic. Some of his examples are: setɑre
‘star’ and setɑreɡɑn ‘stars’; zende ‘alive’ and zendeɡi ‘life’.
I will return to the ɡ and epenthesis in 8.3.4.
8.2.6. Conclusion
Different accounts have been provided for the occurrence of the -ɡ at a suffix boundary
with certain vowel-initial suffixes. I present below the overall picture of the inventory
with these two suffixes. Consider in particular the difference in patterning of the suffixes
with –e, which are in bold.
(34) The vowel inventory with the two –i suffixes
-i (adjective forming) -i/-ɡi (noun forming)
i [j]i or ii i [j]i or ii
e -i e -ɡi
a ____________ a ____________
u [j]i or [v]i u [j]i or [v]i
o [j]i or [v]i o [j]i or [v]i
ɑ [j]i or ɑi ɑ [j]i or ɑi
Here are again some examples of e-final words with these two suffixes:
(35) -i (adjective forming) -i/-ɡi (noun forming)
ɢahve-i ‘brown’ zende-ɡi ‘life’
dore-i ‘periodic’ teʃne-ɡi ‘thirst’
herfe-i ‘professional’ xaste-ɡi ‘tiredness’
Note the homophonous words which take different –i’s, as follows:
(36) baste ‘package’ baste-i ‘packed in packages’
baste ‘dependent’ basteɡi ‘dependence’
(37) kufte ‘big meatballs’ kufte-i ‘pertaining to kufte/looking like kufte (round, etc.)’
kufte ‘battered, exhausted’ kufteɡi ‘the state of being battered, severe fatigue’
In addition, there are cases such as the following, which in particular suggest that the
hypothesis of -ɡ being epenthetic due to pure phonological factors or due to phonetic
environment is not tenable. That is, either -ɡ is not epenthetic or if it is epenthetic it is due
to some morphological or morpho-phonological factors (e.g., domains, levels of
suffixation).
(38) pɑrʧe ‘fabric’ pɑrʧe-i ‘made of fabric’ *pɑrʧeɡi
jek pɑrʧe ‘integrated’ jek pɑrʧeɡi ‘integrity’
(jek means ‘a(n), one’)
(39) daste ‘group’ daste-i ‘as a group, packed in groups’ *dasteɡi
ʧand daste ‘divided’ ʧand dasteɡi ‘division, lack of unity’
(ʧand means ‘several’)
(40) dande ‘gear, rib’ dande-i ‘operated with a gear mechanism’ *dandeɡi
jek dande ‘stubborn’ jek dandeɡi ‘stubbornness’
Compare these with the following example where the suffix –i is adjective-forming:
(41) resɑne resɑne-i
‘media’ ‘pertaining to media’
ʧand resɑne-i (adj.)
multi- media-suffix
‘pertaining to multimedia’
These cases raise the question of whether the two stress-bearing –i’s belong to the same
level of suffixation. I now address this question as one possible explanation for -ɡ,
among other possibilities which are also discussed in the next section.
8.3. Possible analyses of -ɡ (synchronic and historical)
In this section I examine possible accounts of the occurrence of the -ɡ, as follows:
(i) The two –i’s belong to different levels of suffixation
(ii) The -ɡ is base final (synchronic and historical)
(iii) The -ɡ is suffix initial
(iv) The -ɡ is epenthetic
I discuss these hypotheses one by one in 8.3.1-8.3.4. I summarize in 8.3.5.
8.3.1. Different levels of suffixation
In the literature on the phonology/morphology interface, there has been detailed study of
cases where affixes pattern differently with respect to their phonology. One frequent
account in the literature is that these affixes are added at different levels of word
formation or that they enter into different prosodic relationships with the stem (see, e.g.,
Bermudez-Otero and McMahon 2006 for review). It is worthwhile to investigate whether
such a proposal might account for the different patterning of vowel-initial suffixes in
Persian, with some of them having a ɡ form and others not. Among the vowel-initial
suffixes of Persian, which are listed below, only with the noun-forming –i and the plural
marker -ɑn is the ɡ observed. The vowel-initial suffixes in Persian are as follows: the
indefinite article –i, the relative particle –i, the Ezafe –e, the genitive markers (i.e., am, et, eʃ, emɑn, etɑn, eʃɑn), the plural markers -ɑn, -ɑt, -in-, and -un, the definite marker –e,
the noun-forming –i, the adjective-forming –i, and also a number of suffixes such as -ɑ
(noun and adjective forming), -ɑr (noun and adjective forming), -ɑne (noun, adjective,
and adverb forming), etc. (see Appendix 2 for the full list of suffixes).
There are, in general, phonological, morphological, and semantic reasons for proposing
different levels:
-Phonological: These depend on the language, but might involve prosodic features such
as stress, segmental features such as ability to trigger palatalization, etc.
-Morphological: The suffixes could be sensitive to the morphological structure of the
base which they attach to.
-Semantic: Word-level suffixes show semantic compositionality and usually greater
productivity; stem-level suffixes may be less productive and are often argued to combine
with the base to give an idiosyncratic, unpredictable meaning.
The specific criteria by which one can distinguish different levels of suffixation can vary
across languages. For instance, in English trisyllabic laxing, velar softening, and sonorant
syllabification are argued to distinguish stem-level vs. word level suffixes (e.g., Kaisse
2005, Bermudez-Otero and McMahon 2006); in Hebrew (Meir 2005), stem level suffixes
attract stress while word level suffixes are stress neutral; also in this language the word-
level suffixes are semantically coherent or compositionally contribute to the meaning
while the stem-level suffixes may combine idiosyncratically with the stem in terms of
meaning; in Dutch, the effect that an affix has on syllabification is the main criterion for
deciding which class a suffix falls into (Booij 1977, 1995 cited in Kaisse 2005).
Now let us discuss the two stress-bearing –i suffixes in Persian with respect to levels of
suffixation. The reason that I will focus on comparing the two –i suffixes regarding
suffixation is that they are homophonous. Thus the question is whether the suffixes are
added on two different levels; for instance, one of them may be a stem level suffix and
the other a word level suffix. In answer to this, first of all, if these two suffixes are added
on different levels and for that reason -ɡ appears with only one of them, the question will
be why a difference is observed only with the vowel –e and not with other vowels. Thus
the problem of variable ɡ is not entirely solved even if the two suffixes belong to
different levels since one still needs to find a way to explain why the process happens
only with –e. Second, it seems that the two suffixes pattern similarly in terms of levels of
suffixation, as will be discussed below. With respect to levels of suffixation, we must
also not forget about the plural marker -ɑn whose pattern is similar to the noun-forming -i
regarding -ɡ.
In the case of the two –i suffixes in Persian, the language does not suggest any systematic
differences similar to those noted above for other languages. The two –i’s both pattern
similarly regarding stress: they both carry stress. They exhibit freedom of occurrence or,
in other words, do not have restricted occurrence because as long as the meaning allows,
they can attach to a base. They both contribute compositionally to the meaning. That is,
one of them is not less coherent semantically than the other. They are productive with
respect to phonology in that they can attach to any consonant or vowel in final position
(note that the noun-forming –i/-ɡi are considered as the same suffix). They are productive
regarding morphology as well because both can attach to a root followed by a suffix.
Thus it seems that there is no morphological selection on their part. The point is that
since they are productive and are very frequently used one can add them to different
words to make new words.
Some examples of the morphological and syntactic structures of the bases to which these
suffixes may attach are given in (42) and (43). Both of these suffixes can be added to
roots (e.g., baʧʧe ‘child’ and out of it baʧʧeɡi ‘childhood’ for the noun-forming suffix;
peste ‘pistachio’ and pestei ‘the green of pistachio as a color/having pistachio’ for the
adjective-forming suffix). Note that identifying a word as being a root is not always
straightforward given that there might be zero derivation, and some affixes are no longer
productive and therefore may not be recognizable in the word in the present time. Both of
these suffixes can attach to a root followed by a suffix or to compounds. For example,
consider the noun-forming –i in (42a). The word zan ‘woman’ is a noun to which the
suffix -ɑne can be added to make zanɑne ‘feminine, womanly’. The noun-forming –i can
be added to zanɑne to make zanɑneɡi ‘womanhood’. More examples are given below in
all of which the processes of suffixation are shown.
(42) Examples with the noun-forming -i
a. zan ‘woman’
zan + ɑne: zanɑne ‘feminine, womanly’
zanɑneɡi ‘womanhood’
b. pusid ‘past stem of the infinitive pusidan ‘to rot’’
pusid + e: puside ‘rotten, past participle of the infinitive pusidan ‘to rot’’
pusideɡi ‘rot’
c. nevis ‘present stem of the infinitive neveʃtan ‘to write’’
nevis + ande: nevisande ‘author’
nevisandeɡi ‘authorship’
d. del ‘heart’
dɑde ‘given, past participle of dɑdan ‘to give’’
deldɑde ‘in love, lover, someone who has given his/her heart’
deldɑdeɡi ‘being in love’
It should be noted here that according to Kalbasi (1992) the present suffixes -e, -ɑne, and
-ande were -aɡ, -ɑnaɡ, and -andaɡ respectively in Middle Persian. This recalls the
historical account which was introduced in 8.2.1 and will be discussed in detail in
8.3.2.2.
Now consider the adjective-forming –i. As with the noun-forming –i, with the adjective-
forming –i, a base which is already a suffixed form is possible, as the examples in (43)
show.
(43) Examples with the adjective-forming -i
a. zanʤir ‘chain’
zanʤir + e: zanʤire ‘a chain of sth’
zanʤire + i: zanʤire-i ‘serial, linked together like a chain’
b. kin ‘grudge’ (used usually in poetry, formal forms)
kin + e: kine ‘grudge’ (used in informal and formal speech)
kine + i: kine-i ‘someone who frequently holds a grudge’
c. mɑh ‘moon’
mɑh + vɑre: mɑhvɑre ‘satellite’
mɑhvɑre + i: mɑhvɑre-i ‘via satellite’
d. ʃenɑs ‘familiar, the present stem of ʃenɑxtan ‘to know’’
nɑme ‘letter’
ʃenɑs + nɑme: ʃenɑsnɑme ‘national identity card’
ʃenɑsnɑme + i: ʃenɑsnɑme-i ‘pertaining to national identity card’
Both suffixes can also attach to syntactic units, for example:
(44) with the noun-forming -i
sɑf o sɑde
honest/clear and simple/naive
sɑf o sɑde + i : sɑf o sɑdeɡi ‘simplicity, honesty, naivety’
As in a phrase like: sɑf o sɑdeɡi-j-e motlaɢ
naivety-Epenthetic-Ezafe67 absolute
‘absolute naivety’
Note that sɑf o sɑde is used idiomatically. But one can also use the words independently
with the noun-forming suffix as sɑfi and also sɑdeɡi.
(45) with the adjective-forming –i
ɢom o ɢabile
ethnic group and tribe
ɢom o ɢabile + i : ɢom o ɢabile-i ‘ethnic, tribal’
As in a phrase like: extelɑfɑt-e ɢom o ɢabile-i
disagreements-Ezafe tribal
‘tribal disagreements’
‘used to refer to the disagreements among nations, ethnic groups, etc.’
Note that ɢom o ɢabile is used idiomatically. The individual stems can also be used
independently with the adjective-forming suffix as ɢom-i and also ɢabile-i.
It seems, therefore, that these two suffixes do not belong to different levels, because they
behave similarly considering stress, phonological and morphological conditions on the
part of the base, and their compositional contribution to the meaning.
67 See 2.2.2.1.2 (footnote 13) for Ezafe.
I have examined the possibility of different levels for the two –i’s and ruled it out. I now
examine other possible analyses for the occurrence of -ɡ.
8.3.2. The -ɡ is base final
Another possible account of ɡ is to consider -ɡ to be base final. I examine this hypothesis
from both synchronic and historical perspectives.
8.3.2.1. A synchronic investigation
Under an account which considers -ɡ to be base final, we would have bases such as
/baʧʧeɡ/ ‘child’ that end in /ɡ/ (note baʧʧeɡi ‘childhood’). Those words which do not
show ɡ would end in a vowel (e.g., rahɑ ‘free’ out of which there is rahɑ-i ‘freedom’).
What are the implications of this hypothesis? First, we might expect ɡ-final words to
always show ɡ in suffixation. However, this is not the case, as shown below. Consider
again the word baʧʧe ‘child’ out of which there is baʧʧeɡi ‘childhood’. This word does
not show ɡ with other suffixes such as the indefinite article and the Ezafe:
(46) Examples with the non-stress-bearing suffixes
with the indefinite article –i: baʧʧe-i ‘a child’ or baʧʧe-j-i
with the Ezafe –e: baʧʧe-j-e ‘the child-Epenthesis-Ezafe’
e.g., baʧʧe-j-e Minu
child-Epenthetic-Ezafe Minu
‘Minu’s child’
(47) Example with stress-bearing suffixes
with the definite marker –e: baʧʧe-h-e ‘the child’
(h may be sometimes barely heard in fast speech)
How can one account for the deletion of ɡ in other suffixation patterns if the consonant is
considered to be a part of the base?
Note also that the language has ɡ-final words, such as:
(48) ranɡ ‘color’
farhanɡ ‘culture’
lanɡ ‘lame’
tanɡ ‘tight’
xenɡ ‘stupid’
marɡ ‘death’
bozorɡ ‘big’
bɑnɡ ‘cry, shout’
raɡ ‘blood vessel’
saɡ ‘dog’
riɡ ‘pebble’
diɡ ‘a cooking pot’
suɡ ‘mourning’
The final ɡ of these words remains in suffixation, as the following examples show:
(49) with the noun-forming -i
bozorɡ ‘big, great’ bozorɡi ‘bigness, greatness’
xenɡ ‘stupid’ xenɡi ‘stupidity’
zeranɡ ‘clever’ zeranɡi ‘cleverness’
ɢaʃanɡ ‘beautiful’ ɢaʃanɡi ‘beauty’
(50) with the adjective-forming -i
farhanɡ ‘culture’ farhanɡi ‘cultural’
ranɡ ‘color’ ranɡi ‘colorful’
(51) with some other suffixes
with the indefinite article –i: ranɡ-i ‘a color’ as in ranɡ-i ʃɑd
color-indef. happy
‘a happy color’
with the Ezafe –e: farhanɡ-e ‘the culture-Ezafe’ as in farhanɡ-e ʃarɢi
culture-Ezafe eastern
‘eastern culture’
with the definite marker –e: saɡ-e ‘the dog’
ɑhanɡ-e ‘the song’
Thus there is no synchronic basis for treating the ɡ as the final segment of the words
which show it in suffixation.
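The argument against a base-final ɡ can be sketched as follows. This is only an illustration under stated assumptions: the underlying form /baʧʧeɡ/ is the hypothetical one considered in this section, and the helper `concatenate` and the data layout are inventions of the sketch. If the ɡ belonged to the base, plain concatenation should preserve it before every vowel-initial suffix, as it does for genuinely ɡ-final words.

```python
# Hypothetical sketch: under the base-final hypothesis the suffix is simply
# added to an underlying form that already ends in ɡ.

def concatenate(underlying_base: str, suffix: str) -> str:
    return underlying_base + suffix

# Underlying forms under the base-final hypothesis (baʧʧeɡ is hypothetical).
underlying = {"baʧʧe": "baʧʧeɡ", "ranɡ": "ranɡ", "saɡ": "saɡ"}

# (word, suffix label, suffix, attested form); attested forms follow
# (46), (47) and (49)-(51), written without hyphens.
attested = [
    ("baʧʧe", "noun-forming", "i", "baʧʧeɡi"),
    ("baʧʧe", "Ezafe", "e", "baʧʧeje"),
    ("baʧʧe", "definite marker", "e", "baʧʧehe"),
    ("ranɡ", "adjective-forming", "i", "ranɡi"),
    ("ranɡ", "indefinite article", "i", "ranɡi"),
    ("saɡ", "definite marker", "e", "saɡe"),
]

# Cases where keeping the base-final ɡ yields an unattested form.
wrong = [
    (word, label, concatenate(underlying[word], suffix), form)
    for word, label, suffix, form in attested
    if concatenate(underlying[word], suffix) != form
]

for word, label, predicted, form in wrong:
    print(f"{word} + {label}: predicted *{predicted}, attested {form}")
```

The mispredictions all involve baʧʧe, while the true ɡ-final words come out as attested, which mirrors the contrast between (46)-(47) and (49)-(51).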
Let us now see if the ɡ can be accounted for historically.
8.3.2.2. An historical investigation
Recall that according to the historical account the ɡ which appears in suffixation today
was in final position of these words in Middle Persian (see 8.2.1). This account seems
problematic for the following reasons.
First, why do these words show their former ɡ only with a limited number of vowel-
initial suffixes and not with all of them? Recall the examples in (46) and (47). The word
baʧʧe was waʧʧaɡ in Middle Persian. It was shown that with the noun-forming –i, the ɡ
is observed (i.e., baʧʧeɡi ‘childhood’) but not with the indefinite article, Ezafe, and the
definite marker.
Second, there were words in Middle Persian which ended in ɡ but do not show their ɡ in
suffixation with –i (the Middle Persian data in this section is from Farahvashi’s
But there are cases like moʒde which does not have the plural moʒdeɡɑn. So out of
moʒde, we directly get moʒdeɡɑni. Kalbasi considers cases such as moʒdeɡɑni and
ɑbɑdɑni as having the non-productive suffix -ɡɑni and -ɑni, but cases like ʤesmɑni and
asabɑni (see (63)) with an Arabic suffix -ɑni. That is, according to Kalbasi, there are two
-ɑni’s; one Arabic, the other Persian with a -ɡɑni version which comes only after –e.
I do not go further into details on these suffixes. I showed that there are some ɡ-initial
suffixes which occur after different vowels and consonants. I also showed that there are
suffixes which can be interpreted as having two versions (some of which are no longer
productive), one ɡ-initial and one vowel-initial, of which the former seems to occur after
–e (this depends, of course, on whether one considers, for instance, two -ɡɑne’s as
discussed above or not). The important point for our discussion is that there are ɡ-initial
suffixes which occur after vowels and consonants and their -ɡ is not deleted.
So far I have ruled out the possibility of -ɡ being base final or suffix initial. I have also
ruled out the possibility that the occurrence of -ɡ is due to the suffixes belonging to
different levels. I previously noted (in 8.2.5) that some literature considers the -ɡ to be
epenthetic and I showed the problems that this analysis faces. I briefly review
epenthesis next.
8.3.4. The -ɡ is epenthetic
Under the epenthesis hypothesis, the ɡ is an epenthetic consonant inserted to break up
hiatus. The status of the -ɡ as an epenthetic consonant is suggested in some literature, as
discussed before (e.g., Samareh 1977, Meshkatod Dini 1999, Najafi 2001). It then needs
to be explained what the environment and conditions are for the insertion of the
consonant. That is, based on this account the underlying form of the noun-forming suffix
is –i and a ɡ is inserted only when the base ends in –e. The problem with this account is
that the environment does not force the presence of ɡ, since the same environment is
created by e-ending nouns followed by the adjective-forming -i, but no ɡ is observed
there. In addition, it should be considered that the ɡ appears with the plural marker -ɑn as
well so it appears in both e–i and e–ɑn environments. No phonetic, phonological or
morphological justification for ɡ being epenthetic is seen in the language.
8.3.5. Summary
In this section I examined four hypotheses to account for -ɡ, as follows: (i) the two –i
suffixes belong to different levels of suffixation; (ii) the -ɡ is base final (synchronically
and historically); (iii) the -ɡ is suffix initial; (iv) the -ɡ is epenthetic. I showed that
there are problems with each of them: none provides a systematic and thorough account
of the patterning of the -ɡ. Having rejected these hypotheses, I now consider an
alternative hypothesis, allomorphy with phonological conditioning. I explore a lexically
listed allomorphy account through an experiment on -ɡ, which I discuss next.
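One way to picture what a lexically listed account predicts is the following Python sketch. It is a simplification under stated assumptions: the dictionary `LISTED_G_FORMS` stands in for the speaker's lexicon and contains only a handful of forms cited in this chapter, the function `noun_form` is hypothetical, and only e-final bases are modeled, with j representing the default pattern.

```python
# Hypothetical sketch of a lexically listed allomorphy account.
# LISTED_G_FORMS is an illustrative sample, not a full lexicon.
LISTED_G_FORMS = {
    ("zende", "i"): "zendeɡi",
    ("baʧʧe", "i"): "baʧʧeɡi",
    ("nevisande", "i"): "nevisandeɡi",
    ("setɑre", "ɑn"): "setɑreɡɑn",
}

def noun_form(base: str, suffix: str) -> str:
    """Return the listed ɡ-form if one exists, else the default pattern."""
    listed = LISTED_G_FORMS.get((base, suffix))
    if listed is not None:
        return listed
    if base.endswith("e"):
        # Default hiatus resolution, represented here by j.
        return base + "j" + suffix
    return base + suffix

print(noun_form("nevisande", "i"))  # a listed real word
print(noun_form("liʃe", "ɑn"))      # a made-up word, not listed
```

On this view a word the speaker has never heard has no listed ɡ-form, so the default pattern is expected with made-up words; this is the expectation against which the experiment can be read.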
8.4. The experiment
The experiment conducted for -ɡ was a part of the experiment discussed for epenthesis
(in chapter 4) and for -v in suffixation (in chapters 6 and 7). The experiment had three
tasks, as for -v.
Task (i): Question and answer task (production) (real and made-up words)
Task (ii): Wug task (production) (made-up words)
Task (iii): Acceptability rating (perception) (real and made-up words)
I did not include the reading task for this process since -ɡ appears in the Persian script
and therefore it will be read by speakers. In addition, the -ɡ process is not an optional
process which speakers can do either way.
Below I review the tasks for this process.
8.4.1. Task 1: Production (question and answer)
The question and answer production task includes both real and made-up words. The
same question and frame sentence which I used for epenthesis (chapter 4) and for the –v
process (chapters 6 and 7) were used for this process too. Both the noun-forming suffix
–i and the plural marker -ɑn were tested to see whether a -ɡ appears with these suffixes.
For the list of made-up e-final words which were used for this experiment see Appendix 9.
Note that the e-final words were mixed with the consonant-final words which were used
to study epenthesis (chapter 4) as well as with the o-final and u-final words used to study
the occurrence of v at a suffix boundary (chapters 6 and 7).
The made-up words were the target of study because I wanted to see whether Persian
speakers would produce -ɡi and -ɡɑn if they were given a list of e-final words which they
had never heard before. However, testing the process was not easy, since the language
also has the adjective-forming –i (also stress-bearing), which can be used instead of the
noun-forming –i and which is used without -ɡ. That is, for speakers it is natural to say or
hear an e-final word which is followed by a stress-bearing –i. Even for the real words
which I included as a control, the same point applies to some extent, at least for some
speakers. For example, although there is bandeɡi ‘servitude’ from bande ‘servant’, one of
my speakers said that bande-i is fine too, as it can be, for example, a last name. This
could intuitively be true, as the adjective-forming –i can be attached to any noun as long
as a meaning is possible, so it is not restricted to a lexicon which one can keep track of
through dictionaries or one’s knowledge. Thus both in perception and in production of
made-up words (in particular if the words are presented out of context) -ɡi and –i are
both able to occur. Moreover, the existence of the non-stress-bearing suffix –i (the
indefinite marker) adds to the complexity of testing the process, and this is reflected in
the miscellaneous pronunciations in the tables below. For this reason, the plural marker
-ɑn was to some extent easier to test. There is a complication here too, namely the
existence of the suffix -ɡɑn, shown in (58), which can occur with consonant-final and
vowel-final words (and not only with e-final ones). The wug test, which I will discuss in
8.4.2, was a more reliable task for testing –i, since the words the speakers made were put
in a sentence which makes sense only if a noun, and not an adjective, is put there.
I now discuss the results of the question and answer task for the appearance of ɡ in
suffixation. As discussed in chapters 6 and 7, the miscellaneous cases are those which
differ from the expected patterns and include or show something which is not supposed
to be there. They are not of concern here. For example, in some cases some speakers
used the non-stress-bearing suffix –i instead of the stress-bearing –i, or they changed the
final vowel of the word (e.g., in testing the made-up word liʃe, expected to become either
liʃeɡɑn or liʃejɑn, a speaker said liʃovɑn, so the final e changed to o and then v occurred;
this cannot be included in the discussion of e-final words), or in some cases the speakers
used a strategy for resolving hiatus other than adding a consonant (j or ɡ) (e.g., in testing
the made-up word xɑde, expected to become xɑdeji or xɑdeɡi, a speaker said xɑdi, that
is, deletion of –e occurred). All such cases were put in the miscellaneous category.
In tables (66) and (67) the results for this task for made-up e-final words and real e-final
words are presented respectively.
(66) The result of made-up e-final words
No ɡ        With ɡ      Misc        Total
72 (72%)    18 (18%)    10 (10%)    100 (100%)
(67) The result of real e-final words
No ɡ        With ɡ      Misc        Total
6 (30%)     12 (60%)    2 (10%)     20 (100%)
By ‘No ɡ’ I mean the occurrence of the default pattern, which could be j or nothing or the
glottal stop, which, as noted before, I consider as the default category. I will use j
throughout my discussion here as the representative of this category.
The results shown in (66) and (67) indicate a preference towards using j with the
made-up words68 and towards using ɡ with the real words. In order to formally compare
the occurrence of j and ɡ between the made-up words and the real words, a two-tailed
Sign test was conducted. To run this statistical test, I needed to know the performance of
each speaker individually. Below is a table which shows the performance of each
speaker, identified by his/her initials, for both real words and made-up words in this
task, with the miscellaneous cases eliminated. The table should be read as follows:
looking at the first row, for example, the speaker AE pronounced 50% of the real words
with j (and not with ɡ) and 88.89% of the e-final made-up words with j (and not with ɡ).
At the end of the table the mean usage of j is given for both real words and made-up
words.
68 Note that there were a few cases of ɡ after vowels other than e in made-up words in this task. One speaker was mainly responsible for this (e.g., siro → siroɡi and not the expected sirovi or siroji; buni → buniɡɑn and not the expected bunijɑn). In the case of -ɑn ~ -ɡɑn, there is the possibility of confusing -ɑn with the suffix -ɡɑn, which is independent of the -ɑn ~ -ɡɑn alternation (see (58)). I leave these few cases aside in my account as they were mostly produced by one speaker and only in one task.
(68) The result of the performance of speakers individually (question and answer)
Speakers Real % /j/ Made-up % /j/
AE 50 88.89
FT 0 87.5
SF 50 100
VD 0 57.14
SS 50 88.89
AA 50 100
RM 50 100
PH 0 30
NP 50 100
LP 0 42.86
Mean 30 79.53
The result indicates that the occurrence of /j/ is significantly higher in made-up words
than in real words (p-value = 0.002; highly statistically significant). So, to sum up,
speakers clearly preferred /j/ over /ɡ/ with made-up words as compared with real words
in this task: every speaker shows the same direction (i.e., using /j/ more with made-up
words).
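The comparison can be retraced from table (68) with a short Python sketch of an exact two-tailed sign test. The function name `sign_test_two_tailed` and the data layout are choices made for this illustration; the test used in the study may have been run with a statistics package and may treat ties differently (there are no ties in these data).

```python
from math import comb

# Per-speaker percentages of /j/ from table (68): (real, made-up)
scores = {
    "AE": (50, 88.89), "FT": (0, 87.5), "SF": (50, 100), "VD": (0, 57.14),
    "SS": (50, 88.89), "AA": (50, 100), "RM": (50, 100), "PH": (0, 30),
    "NP": (50, 100), "LP": (0, 42.86),
}

def sign_test_two_tailed(pairs):
    """Exact two-tailed sign test on (first, second) pairs; ties are dropped."""
    pairs = list(pairs)
    plus = sum(1 for a, b in pairs if b > a)
    minus = sum(1 for a, b in pairs if b < a)
    n = plus + minus
    k = min(plus, minus)
    # P(X <= k) under Binomial(n, 0.5), doubled for the two tails
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * tail)

plus, minus, p_value = sign_test_two_tailed(scores.values())
print(plus, minus, round(p_value, 3))  # prints: 10 0 0.002
```

All ten speakers have a higher percentage of /j/ with made-up words, so the exact p-value is 2 × (1/2)^10 ≈ 0.002, in line with the value reported above.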
Next I discuss a wug test for the -ɡ process.
8.4.2. Task 2: Production (wug test)
For this task some e-final made-up words were used (see appendix 9 for the list). The
task was done as for epenthesis (chapter 4) and the –v process (chapters 6 and 7). I need
to point out that I was careful about choosing the carrier sentence for the suffix –i to
make sure that the speakers would consider the noun-forming suffix –i and not the
adjective-forming –i. The carrier sentence for the plural marker -ɑn was also very clear.
As noted in chapter 7 for –v, both sentences were practiced with real words first so the
speakers could get familiar with the sentences and afterwards some made-up words were
practiced with the sentences. The English equivalents of the sentences are as follows –
these were given before in 7.3.2; for convenience I repeat them here: (i) for the noun-
forming –i: ‘They are looking for a …..Unfortunately Mina does not know….’ (the
blanks in this sentence could be filled for example with ‘driver’ and ‘driving’
respectively); (ii) for the plural marker -ɑn: ‘There was not only one … in that room. It
was a group of ….. there.’ (the blanks in this sentence could be filled for example with
‘teacher’ and ‘teachers’ respectively).
With respect to the sentence for the noun-forming –i, as seen, I chose a sentence which
needs a job or skill to be complete. Thus it was very clear to Persian speakers that a
noun-forming –i needs to be added (and not an adjective-forming –i), and this is crucial
for the -ɡ process as only with the noun-forming –i is the -ɡ expected to occur. Some
examples are given in (69).
(69) a. parastɑr ‘nurse’ parastɑri ‘nursing’
b. mohandes ‘engineer’ mohandesi ‘engineering’
c. bannɑ ‘construction worker’ bannɑji ‘construction work’
d. nevisande ‘author’ nevisandeɡi ‘authorship’
The result for this task is as follows:
(70) The result of made-up words (wug test)
No ɡ        With ɡ      Misc        Total
30 (75%)    6 (15%)     4 (10%)     40 (100%)
The result is toward using j and not ɡ. This suggests that the speakers do not use -ɡ if they
do not know the word, even when the context for using the -ɡ is provided for them. In
order to formally test this, I ran a two-tailed one-sample t-test. Again I considered the
speakers’ performances one by one, as given in the following table.
(71) The result of the performance of speakers individually (wug test)
Speakers Made-up% /j/
AE 100
FT 100
SF 100
VD 66.67
SS 75
AA 100
RM 50
PH 50
NP 100
LP 100
Mean 84.167
As seen in (71), most speakers used only [j] (speakers AE, FT, SF, AA, NP, LP), some
speakers used mostly [j] (e.g., speakers VD, SS) and some used [j] and [ɡ] half and half
(speakers RM, PH). The table shows that the occurrence of /j/ is 84.17% and therefore
that of [ɡ] is 15.83%. The result indicates that the frequency of /j/ is significantly higher
than that of /ɡ/ in this task, which tests only made-up words (p-value = 0.001; highly
statistically significant).
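The test described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used for the thesis: the per-speaker percentages are copied from (71), but the null value of 50% (an even split between j and ɡ) is an assumption, since the text does not state the value against which the mean was tested. The function name one_sample_t is likewise hypothetical.

```python
import math
import statistics

# Per-speaker percentages of /j/ responses, copied from table (71).
percent_j = [100, 100, 100, 66.67, 75, 100, 50, 50, 100, 100]

# Assumption (not stated in the text): the null hypothesis is an even
# 50/50 split between j and ɡ.
NULL_MEAN = 50.0


def one_sample_t(sample, mu0):
    """Return (mean, t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return mean, t, n - 1


mean, t, df = one_sample_t(percent_j, NULL_MEAN)
print(f"mean = {mean:.3f}, t({df}) = {t:.2f}")
```

Under this assumed null value, the t statistic comes out at roughly 5 with 9 degrees of freedom. The two-tailed critical value at the 0.01 level for 9 degrees of freedom is about 3.25, so the sketch is in line with the reported result that /j/ is significantly more frequent than /ɡ/.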
After I did the experiment with four speakers, I decided to add a few more words which
are similar in terms of melody to some real and very common words (again jobs), given
below, to see if that might bring the -ɡ out.
(72) real words
a. xɑnande ‘singer’ xɑnandeɡi ‘singing’ xɑnandeɡɑn (pl.)
b. rɑnande ‘driver’ rɑnandeɡi ‘driving’ rɑnandeɡɑn (pl.)
In the pattern of these, I created the following words:
(73) made-up words
a. pɑnande pɑnandeɡi? pɑnandeɡɑn?
b. sɑnande sɑnandeɡi? sɑnandeɡɑn?
The “?” after the words in (73) shows that I wanted to see whether the same results are
found as with the real words based on which these words were formed.
Here is the result for these words:
(74)
No ɡ          With ɡ        Misc         Total
8 (33.33%)    15 (62.5%)    1 (4.17%)    24 (100%)
The result seems to be toward the occurrence of -ɡ. But if we consider the results for each
individual speaker, we see that some speakers affect the result towards using -ɡ. I provide
a table in (75) which shows the performance of each speaker in terms of using j or ɡ for
the words given in (73). As the table in (75) shows, three speakers did not use j at all for
these words and used ɡ instead. One speaker used j throughout and two speakers have
some j’s and some ɡ’s. Thus the performance of some particular speakers changes the
overall result.
(75) The result of the performance of speakers individually
Speakers Made-up % /j/
SS 100
AA 75
RM 33.33
PH 0
NP 0
LP 0
The result does not support a strong tendency towards /ɡ/ overall, although some
individual speakers do use ɡ. This is very important to note as the speakers did not
necessarily use ɡ even when the context and melody together were provided for them.
To sum up, the two production tasks (the question and answer task and also the wug task)
show ɡ is infrequently used with made-up e-final words. That is, the speakers tend to use
j with the noun-forming –i and the plural marker -ɑn when they do not know the e-final
words and have not heard their ɡ-including suffixed forms.
8.4.3. Task 3: Perception (acceptability rating)
For this part of the experiment, the participants listened to a list of recorded words and
rated them on the following scale: √ (good, acceptable, possible), ? (so-so), X (bad,
unacceptable, impossible). The same pen and paper process which was explained before
for the epenthesis process in chapter 4 and the –v process in chapters 6 and 7 was used
here. The results of this part are shown in (76):
(76) The result of made-up e-ending words
          √              ?            X             Total
No ɡ      22 (68.75%)    1 (3.13%)    9 (28.13%)    32 (100%)
With ɡ    32 (100%)      0 (0%)       0 (0%)        32 (100%)
The result indicates that the speakers prefer having ɡ in perception, as the complete
acceptability of ɡ-including made-up words shows (compare the ‘With ɡ’ results with
the ‘No ɡ’ results). Note that ‘No ɡ’ too has a relatively high acceptance rate.
Now let us look at the real e-ending words in the perception task. Again, the result shows
stronger acceptability of ɡ.
(77) The result of real e-ending words
          √            ?         X             Total
No ɡ      6 (37.5%)    0 (0%)    10 (62.5%)    16 (100%)
With ɡ    16 (100%)    0 (0%)    0 (0%)        16 (100%)
To sum up, in perception, the speakers prefer ɡ-including versions overall. If we compare
the ‘No ɡ’ results of (76) and (77) with each other and also with the ‘With ɡ’ results, we
see that ‘No ɡ’ is more acceptable with made-up words than with real words, which makes
sense as the made-up words are not familiar to the speakers.
8.4.4. Summary
I conducted an experiment on -ɡ on both real and made-up e-ending words. The purpose
of the experiment was to test whether the occurrence of -ɡ is observed when a made-up
word, with which speakers are not familiar, is given to them, that is to test if the process
is productive. The experiment included both production and perception tasks. Comparing
the production of real and made-up words, j is by far more frequent
with made-up words. In perception, however, ɡ is more acceptable for both real and
made-up words. The difference between the production and perception results could be due
to the difference in the nature of the tasks (production vs. perception). Speakers do not have
the insertion of ɡ after new e-final words as an active process in their minds, but if
they hear some e-final words with ɡi/ɡɑn, they accept them, as this is a familiar pattern to them.
That is, when they are primed to consider the occurrence of ɡ following e-final words as a
credible option, they rate it as acceptable.
8.5. Summary and discussion
In this section I discussed the ɡ, which occurs with some suffixes at a suffix boundary.
There are different analyses of the ɡ in the literature, including that the ɡ is epenthetic and
that the occurrence of the ɡ has to do with the historical status of the words which show it
today. I considered various hypotheses to account for the presence of ɡ and discussed the
problems with each of them. In order to test the productivity of the process, I conducted an
experiment. I showed that the general pattern is that the speakers do not tend to produce
the ɡ with made-up e-final words even when the context (and melody) is provided for
them. That is, the frequency of /j/ is significantly higher than that of /ɡ/ in made-up
words compared to real words. In perception, the ɡ-including made-up words are
acceptable as it is a familiar sequence at a suffix boundary (i.e., eɡi, eɡɑn).
In the previous chapter and this chapter I examined two different suffixation processes.
Both involve the occurrence of consonants at a stem-suffix boundary. The –v (discussed
in the previous chapter) and the -ɡ (discussed in the present chapter) both occur at a
suffix boundary between some vowel-final bases and vowel-initial suffixes. Their
occurrence, therefore, touches upon Persian morpho-phonology. They, however, have
different accounts.
The -ɡ, which occurs after e-final bases with a couple of suffixes in the language, is not
productively used in unfamiliar e-final cases, as the highly frequent default j-insertion in
these cases shows, and therefore is to a great extent observed with known items. This
shows that the process by which -ɡ occurs is not a productive phonological process, but
phonologically controlled allomorphy with particular suffixes. I argue below why the
occurrence of ɡ is best treated as phonologically conditioned suppletion or suppletive
allomorphy.
The occurrence of suppletive allomorphs cannot be explained by phonetic environment.
They occur in morphologically restricted environments. They show idiosyncrasy. And
they have more than one underlying form (see de Lacy 2006, Bonet et al 2007, Bye 2008,
Paster 2006, 2009, to appear, Nevins 2011). Let us now see how these apply to the ɡ. It
was argued (see 8.3.4) that the occurrence of the ɡ is not motivated by phonetic
environment. The ɡ occurs with some particular suffixes, so it is restricted
morphologically. It also shows idiosyncrasy, as it does not occur with these suffixes
in all environments; it only occurs after –e and it is not clear why. The -i/ɡi and -ɑn/ɡɑn
do not have a single underlying form. If the only underlying forms were -i and -ɑn, then
-ɡi and -ɡɑn would have to be derived by unmotivated ɡ epenthesis. If the only underlying
forms were -ɡi and -ɡɑn, then -i and -ɑn would have to be derived by unmotivated ɡ
deletion. Other possibilities, such as ɡ being historical or stem-final or occurring due to
different levels of suffixation, were also rejected (see 8.3.1-8.3.3). –i and -ɡi, and also -ɑn
and -ɡɑn, are allomorphs whose occurrence is conditioned by phonology: -ɡi and -ɡɑn
occur after e-final words with two suffixes, –i and -ɑn in other cases. In fact, the
occurrence of -ɡi and -ɡɑn, which cannot be accounted for by various hypotheses, is
explained by taking them as allomorphs whose occurrence is idiosyncratically limited to
a particular vowel followed by two particular suffixes. The observation that ɡi and ɡɑn
are not actively used in made-up cases shows that they are lexically listed allomorphs.
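The allomorph selection just described can be illustrated with a small Python sketch. It is a simplification under stated assumptions and not a full analysis: the set LISTED_G is a hypothetical mini-lexicon containing only e-final bases whose ɡ-forms are cited in this chapter, the function name suffix_form is invented for illustration, and the –v that may occur after o-final and u-final bases is deliberately left out. Only the noun-forming –i and the plural marker -ɑn are covered.

```python
# Persian vowels as transcribed in this thesis.
VOWELS = set("aeoɑiu")

# Hypothetical mini-lexicon: e-final bases that are lexically listed
# with the ɡ-allomorphs (-ɡi, -ɡɑn). Only forms cited in the chapter.
LISTED_G = {"nevisande", "xɑnande", "rɑnande", "bande"}


def suffix_form(base, suffix):
    """Attach noun-forming 'i' or plural 'ɑn' to a base.

    Listed e-final bases select the ɡ-allomorph; other vowel-final
    bases receive the default j; consonant-final bases take the bare
    suffix. The v found after o/u-final bases is not modelled here.
    """
    if suffix not in ("i", "ɑn"):
        raise ValueError("only the suffixes 'i' and 'ɑn' are modelled")
    if base in LISTED_G:
        return base + "ɡ" + suffix
    if base[-1] in VOWELS:
        return base + "j" + suffix
    return base + suffix


for base in ["parastɑr", "mohandes", "bannɑ", "nevisande", "pɑnande"]:
    print(base, "->", suffix_form(base, "i"))
```

Because the made-up base pɑnande is not in the listed set, the sketch gives it the default j, which corresponds to the majority pattern for made-up words in (70). Speakers who produced ɡ for such words would, on this view, be extending the listed pattern by analogy with similar real words.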
The –v, which occurs after o-final and u-final words with a range of vowel-initial
suffixes, being motivated by the phonetic effect of labial environment is, however, a
productive process as suggested by its insertion in unfamiliar words and participates in
hiatus resolution along with the default pattern.
The contrast between v and ɡ is quite interesting. In some way, ɡ has a more systematic
well-defined context of occurrence while v is more variable. But nevertheless, speakers
are more willing to use v in novel word production than ɡ. The crucial difference seems
to be, as pointed out above, that the occurrence of ɡ is not phonetically motivated but
more morphologically conditioned compared to v. So, this seems to suggest that
statistical patterning is not the only factor that determines the productivity of a process,
but phonetic naturalness also plays a role.
In chapter 7 and the present chapter, I examined two morpho-phonological processes of
Persian, which are controversial in the Persian literature, and provided an account for
each.
Appendix to chapter 8
The historical status of the -ɡ: Old Persian
(i) The historical status of the ɡ : Old Persian
In addition to Middle Persian, discussed in 8.3.2, I also looked at Old Persian in order to
see if there is a historical explanation for the -ɡ based on that era. The motivation behind
doing this was that, as we will see below, there are some references to the origin of
Middle Persian –k/ɡ or –ak/-aɡ in the literature. I therefore thought that it was worthwhile
to consider Old Persian to see if there is an explanation for why some -ɡ’s (i.e., those
which were -aɡ historically) re-appear today but others do not. Regarding the historical
analysis, the questions were: (i) Why do only aɡ-ending words of Middle Persian show
their -ɡ in suffixation? Words which ended in -āɡ, for instance, do not show their former
-ɡ. (ii) Why don’t those aɡ-ending words of Middle Persian which show their -ɡ in
suffixation show their -ɡ with all suffixes today?
(ii) Underlying versus derived hypothesis
In Old Persian, many words ended in vowels (the sources consulted for Old Persian are
given throughout the discussion below). In Middle Persian, based on dictionaries, the
number of words ending in vowels is much reduced –some of these words end in ē and ō
which were the Old Persian diphthongs (which became ē and ō in Middle Persian), some
end in –īhā, which was an adverb-forming suffix, some of them end in vowels but are
function words, some have more than one version (one ending in j or a consonant and one
without j or any consonant at the end), some of them end in a vowel in one dictionary and
in a consonant in another. The overall result is that not many words end in vowels in
Middle Persian (leaving aside the adverb-forming –īhā which productively formed
adverbs, giving vowel-ending words). So the question is: what happened to the vowel-
final words coming from Old Persian to Middle Persian?
One process that created consonant-final words involves deletion of the final vowel. For
example, consider the following words:
(1) Old Persian Middle Persian Modern Persian
aspa asp asb ‘horse’
tarsa tars tars ‘fear’
tanū tan tan ‘body’
dāta dāt dɑd ‘justice’
stūnā stūn sotun ‘pillar’
I should note that I did not have access to an Old Persian dictionary to easily compare the
words with their Middle and then Modern Persian equivalents. I have found the words in
books on Old Persian which I will refer to below. There have been many changes from
Old Persian to Modern Persian so words might differ dramatically, making it difficult to
recognize the Old Persian words and associate them with the Modern words. In addition,
Old Persian had case suffixes -these were eliminated over time- so judgments on suffixes
and endings need to be made with extra care, requiring some expertise on the language of
that time. Thus I am careful about my comments and conclusions on Old Persian. One
thing that seems to be the case is that there is a suffix –aka in Old Persian, while there
are no suffixes such as –āka, ūka, ōka, īka in Old Persian. Let us see what the literature
says in this regard.
(iii) The –(a)ka suffix in Old Persian
According to Natel Khanlari (1987) there was a suffix –aka in Old Persian which became
the –ak suffix of Middle Persian, which is –e in Modern Persian. Note that it seems that
the k underwent voicing, with the suffix becoming -aɡ (this final voicing process occurs
in other cases as well; for instance Old Persian aspa ‘horse’, Middle Persian asp, Modern
Persian asb; and Old Persian dāta ‘justice’, Middle Persian dāt, Modern Persian dɑd).
Natel Khanlari gives the following example of the changes from Old Persian to Middle
Natel Khanlari adds that whenever this suffix is not final due to the presence of a
following noun-forming –i or plural marker -ɑn the k/ɡ has not been eliminated. This
suggests that perhaps the –ɡ which occurs today is related to the Old Persian –aka.
Kent (1961), discussing the formation of nouns and adjectives in Old Persian, says that a
noun or adjective suffix attached directly to a verbal root is called a primary suffix; one
attached to a noun or adjective stem is called a secondary suffix. Kent later says that noun
and adjective stems with the suffix –ka are adjectives which may assume substantival
meanings. This –ka may be attached directly to a stem, nominal or verbal; it may appear
as –aka or –ika (Old Persian -i might be today’s –e), in which case it often cannot be
determined whether the vowel belongs to the suffix or to the basic stem. Only when -ika-
is attached to an –a- stem is it clear that the i belongs to the suffix. Among his examples
are: ahri-ka ‘evil, faithless’, banda-ka ‘servant’, vazra-ka ‘great’ (note that in Middle
Persian, the word for ‘servant’ above is bandak and in Modern Persian it is bande (from
which there is bande-ɡi ‘servitude’, bande-ɡɑn ‘servant (pl)’). Also the word for ‘great’
in Middle Persian is vazurk and it is bozorɡ in Modern Persian).
Meillet (1915), discussing Old Persian, writes: Le suffixe –ka- qui a servi d’élargissement
à tant de mots au cours de l’histoire de l’indo-iranien, joue déjà son rôle; on a ainsi
ba(n)daka ‘serviteur’ et arik, arika (plutôt que arika) ‘ennemi’. My translation: The
suffix –ka which was used to extend many words in the history of Indo-Iranian already
plays its part, there are thus ba(n)daka ‘servant’ and arik, arika (rather than arika)
‘enemy’.
Thus it seems that there existed a suffix –(a)ka, which lost its final –a as some other
words lost their final vowels from Old to Middle Persian. But it seems that Old Persian
did not have -ūka, -āka, etc. suffixes for which we can apply the same scenario as for -aka. That is, one cannot say –āka of Old Persian became –āk in Middle Persian, and then
–ā in Modern Persian. This can explain why the ɡ occurs only after e today –it is because
of the historical –aka. But where did the -ɡ in -ūɡ or -āɡ, etc. come from in Middle
Persian if no such suffixes as -ūka, -āka, etc. existed in Old Persian?
I give an example of Middle Persian here: Modern Persian has the word dānā ‘wise,
knowledgeable’, from which dānāji ‘wisdom, knowledge’ (*dānāɡi) is formed. These
words are as follows in Middle Persian: dānāɡ and dānāɡīh (with the same meanings as
Modern Persian). Recall that the present noun-forming –i was īh in Middle Persian. So
where did the -ɡ in Middle Persian dānāɡ come from? Here is what we find in the
literature in this regard.
(iv) The possible source of -ɡ
According to Natel Khanlari, a ɡ was inserted in all vowel-final roots in Middle Persian.
Here are his examples:
(2) pariɡ ‘fairy’
āhuɡ ‘deer’
dāruɡ ‘medication’
hynduɡ ‘Hindi’
kadaɡ ‘house, place of something’
Natel Khanlari then says that the consonant is deleted in Modern Persian, and whenever
the vowel before it was –a, this became –e in Modern Persian. I add that in the above list
those words which can take an –i form suffix today take the adjective-forming –i and do
not show the -ɡ (e.g., dɑru ‘medication’ and dɑruji ‘medicinal’).
Now the question is whether one can interpret Natel Khanlari’s explanation as dividing
the Middle Persian ɡ-final words into two groups:
(i) One group had the ɡ in Middle Persian (i.e., aɡ-final words) coming from the
Old Persian suffix –aka, which became –ak in Middle Persian; for instance, as seen
(ii) The other group got the -ɡ by a process of adding a ɡ at the end of vowel-final
words (i.e., so sequences of other vowels followed by inserted -ɡ were created); examples
of this group are given in (2).
Only the first group which has the consonant underlyingly from Old Persian (i.e., the
words which end in –aka in Old Persian and -aɡ/-ak in Middle Persian), shows the ɡ
today, as shown by bandeɡi ‘servitude’.
Note, however, that Natel Khanlari has the word kadak /kadaɡ ‘house, place of
something’ in the list in (2). This word is kade in Modern Persian and is used today in
compounds (e.g., dɑneʃ kade ‘faculty (as in Faculty of Arts)’: this word consists of dɑneʃ ‘science, knowledge’, followed by kade ‘a suffix which shows the place of something’ so
literally the compound means ‘the place of knowledge’). Natel Khanlari argues that Old
Persian –aka became –ak in Middle Persian and then –a and finally –e in Modern Persian.
This appears to indicate that final k in kadak should be from –ak, from the Old Persian -aka. So why does he list kadak along with words which used to end in other vowels and
which according to him originally did not have a final consonant and got it in Middle
Persian (see 2)? One can say that he did not mean that all –ak-ending words in Middle
Persian were originally –aka-ending (that is maybe he meant that all aka-ending words of
Old Persian became –ak-ending in Middle Persian as in banda-aka (Old Persian), bandak
(Middle Persian), not the other way around, that is not all –ak/-aɡ-final words of Middle
Persian came from Old Persian). If this is the case, then maybe there were two groups of
aɡ-ending words in Middle Persian:
(i) One which had the suffix –aka in Old Persian and which became ak-final in
Middle Persian shows the -ɡ in present suffixation (e.g., banda-aka (Old Persian), bandak
(Middle Persian), bande (Modern Persian) ‘servant’ and bandeɡi ‘servitude’).
(ii) The other group, which ended in –a in Old Persian and which then got the k/ɡ
as every vowel-final word did in Middle Persian (see (2)), does not show the -ɡ in the
present processes of suffixation (e.g., kade does not result in kadeɡi).
Natel Khanlari does not say this though. This is my guess, which is obviously not based
on enough information.
From what has been discussed so far, one may speculate that there were two final k/ɡ’s in
Middle Persian. Some final k/ɡ’s came from an Old Persian suffix (-aka) – I call this
underlying k/ɡ. Some final k/ɡ’s were added in Middle Persian to the end of vowel-final
words – I call this derived k/ɡ. The k in bandak (Middle Persian), which came from
banda-aka (Old Persian), is an example of the underlying k/ɡ. The k/ɡ in examples in (2),
as in kadaɡ ‘house, place of something’, represents the derived k/ɡ. The underlying k/ɡ
re-appears today in suffixation (e.g., Modern Persian bande bandeɡi ‘servitude’). The
derived k/ɡ, however, does not re-appear in suffixation today (e.g., Modern Persian kade
*kadeɡi). If this is correct, it may explain why some historical k/ɡ’s re-appear in
suffixation today and some do not. But it fails to explain why the underlying k/ɡ’s
re-appear only with some suffixes (the noun-forming –i and the plural marker -ɑn).
Based on Natel Khanlari’s description of Old Persian and Middle Persian with respect to
k/ɡ, I considered the possibility of underlying vs. derived hypothesis for the historical k/ɡ
and today’s ɡ in suffixation. I now discuss why my speculation about underlying vs.
derived k/ɡ’s is weakened based on Natel Khanlari’s other comments on k/ɡ.
(v) On the underlying vs. derived hypothesis: the problem
I showed that an underlying versus derived hypothesis might be suggested based on the
literature, mainly by Natel Khanlari’s account. I now show why this hypothesis does not
work.
In addition to the suffixes discussed above, Natel Khanlari also considers a suffix –k in
Old Persian. He notes that this suffix is observed in only a few words, such as:
(3) bandaka ‘servant’
pairikā ‘fairy’ (this word was above among those which got the -ɡ in
Middle Persian –see pariɡ ‘fairy’ in (2))
kainikā ‘maid’
Natel Khanlari continues that this suffix changed to -ɡ in Middle Persian and is added to
all vowel-ending words. It is not clear what is meant by this part. Why does he consider it
a “k” suffix rather than –(a)ka? The word bandaka ‘servant’ was considered before as
having the –(a)ka suffix.
Natel Khanlari gives the following examples in another part of his book:
kɑm means ‘desire’ and kɑme is used in suffixation or compounds today as in xod kɑme
(xod ‘self’) ‘autocratic’. Why aren’t these listed with the Old Persian words with a ‘k’
suffix in (3)? In particular, the word dānāka is important if it existed with this
pronunciation in Old Persian because today we have dɑnɑji ‘wisdom’ and this shows that
the underlying/derived hypothesis that I suggested above based on change from Old
Persian to Middle Persian is not possible because there was also an –āka in Old Persian
whose final –a is dropped in Middle Persian. And if we consider the two examples in (4),
we see that one of them shows the -ɡ and the other one does not. Compare dɑnɑji ‘wisdom’ with xod kɑmeɡi ‘autocracy’ (kɑme shows -ɡ today). Thus, the examples in (4)
suggest that because of the existence of dānāka (Old Persian), which is dɑnɑ (Modern
Persian) ‘wise’ from which dɑnɑji ‘wisdom’ (*dɑnɑɡi) results in Modern Persian, one
cannot say that the underlying k/ɡ necessarily re-appears today. In fact, the question
which we are dealing with today as to why the ɡ occurs only after –e is brought up again
with the historical data in (4). This is because, of the two Old Persian examples in (4), only
the one which ends in –aka (kāmaka (Old Persian)) and which ends in –e today (kɑme)
shows ɡ today; the one which ends in –āka (dānāka (Old Persian)) and which ends in
-ɑ today (dɑnɑ) does not. Given (4), referring to Old Persian does not help us with the ɡ.
Taking Old Persian into account would have been helpful if there was an Old Persian –aka suffix, from which the Middle Persian -aɡ came, but no –āka, -ūka, etc. If Old
Persian had –āka, -ūka, etc., whose k/ɡ does not re-appear today in suffixation, and it also
had –aka, whose k/ɡ re-appears today then the puzzle remains unsolved because we are
back to where we were with Middle Persian data, which include final -āɡ, -ūɡ (whose ɡ
does not re-appear in present suffixation) as well as final -aɡ (whose ɡ re-appears in some
suffixation processes in Modern Persian).
I should note that as for adding a consonant in Middle Persian to vowel-final words
coming from Old Persian, in addition to the deletion of final vowels (recall the examples
in (1), e.g., aspa ‘horse’ becomes asp from Old to Middle Persian), insertion of a
consonant was a strategy used from Old Persian to Middle Persian (e.g., taken from Natel
Khanlari, Old Persian brū ‘eyebrow’ became brūɡ in Middle Persian and it is now
abru).69 For this reason adding a ɡ at the end of the words does not seem implausible. But
the explanation for it is not clear given that there was a suffix –aka- in Old Persian.
Note that Natel Khanlari also considers the suffix -āɡ/-āk for Middle Persian. This suffix
is noun forming and is added to the present stem and today its -ɡ has been deleted. For
example, he gives dānāɡ ‘wise’ which as seen above gave dānāɡīh ‘wisdom’ in Middle
Persian. Today it is dɑnɑ and gives dɑnɑji (same meanings).
So far I have mainly focused on Natel Khanlari’s comments with respect to k/ɡ. Other
scholars have also commented on k/ɡ. I pointed out a couple of notes from Kent and
Meillet. I now review other literature in this respect.
69 On a different note: the word abru ‘eyebrow’ today gives abrovɑn as its plural form. But note that it had a ɡ in Middle Persian but no ɡ in Old Persian. So for the underlying form of this word, which gives us the v in suffixation, I have to go back to Old Persian, because based on its Middle Persian form the plural should be abruɡɑn under the historical account of the occurrence of the ɡ. Note that in a Middle Persian dictionary, I found both brūj and brūk. The Old Persian form of this word is based on Natel Khanlari.
Rastorgueva (1969) also considers a –k suffix in Middle Persian along with other suffixes
as follows. Listing the suffixes of Middle Persian, she says:
(5)
a. -āk (it is added to the present stem of the verbs)
e.g., dān (the present stem of dānistan ‘to know’) → dānāk ‘wise,
knowledgeable’ [today: dɑnɑ ‘wise’]
b. -k (Old Persian: -kā)
e.g., zānūk ‘knee’ [today: zɑnu]
bandak ‘servant’ [today: bande]
c. -ūk or ōk
mastūk ‘drunk’ According to the author it was masta in Old Persian
[today: mast]
nerōk ‘energy’ According to the author it was naryava in Old Persian
[today: niru]
d. -ak
hazār ‘thousand’ hazārak ‘millennium’
[today: hezɑr and hezɑre]
kām ‘desire’ kāmak ‘desire’
(according to the author: no difference in meaning)
Salemann (1930) considers the following k-ending suffixes for Middle Persian and other
related languages: -k (ak, āk, ōk, ūk, īk). He then says: in the instance of suffixes ending
in –k the plural form is written –kān, -ɡān, and even -kɡān, most probably under the
influence of the Persian -ɡān, yān. Similarly also before the suffix of abstract nouns –īh:
bandaɡīh ‘servitude’ along with dānākīh ‘knowledge’.
Later in a section on formation of the noun, in a subsection about derivation by means of
a suffix, Salemann considers the suffix –k, Iranian –ka. This suffix, according to him, can
be authenticated in old languages only in a few cases like bandak, parīk, etc. In all
Modern Iranian languages that suffix is widely used and is joined with all the vowel-
roots, whereby the latter become transferred into the a-declension. The old root-terminal
a- is preserved in that instance. The suffix -k in these cases has a purely formal function
and does not in any way modify the original sense of the root-word. Different is the case
of the suffix -ak, which had the following three functions:
-forms diminutives (note that -ak (‘diminutive’) is still used in Modern Persian; it
has not become -aɡ) –I leave this aside since this is not related to our discussion.
-forms adjectives consisting of connected words (the second link can also be a
present-root) (e.g., ēvak-māhak ‘one-monthly’ [in Modern Persian jak mɑhe]). This
suffix can further be followed by the abstract suffix –īh.
-forms nomina instrumenti from present roots.
Salemann then says that the suffix –āk is without any doubt from –āvaka and forms
the present participle. In Persian –āk also forms nomina instrumenti. Salemann also takes
-ōk or -ūk to be seemingly traceable back to Old –avaka (-vaka??).
According to Pisowicz (1985), the glide /j/ appears after the vowels /ɑ/, /u/, /i/ before the
plural suffix –ɑn (e.g., dɑnɑ-j-ɑn ‘wise (pl)’). He then says that sometimes /v/ comes after /u/ in
this suffixation process. Pisowicz continues that the linking /j/ originally belonged to the
noun stem, and now appears where it is historically justified. Then he adds that the
historical explanation of the plural ending –j-ɑn finds its indirect corroboration in the fact
that after –e, the element linking the final vowel of the noun with the suffix -ɑn is still -ɡ
(e.g., bande-ɡɑn ‘servants’) inherited from Middle Persian, where -ɡ belonged to the stem
(cf. Middle Persian bandaɡ ‘servant’). Pisowicz adds that avoiding reference to
diachrony, one might say that before the plural suffix, /j/ appears in the surface structure
in those cases where it corresponds to the final /j/ in the deep structure. Which deep
structure does he refer to? Does he mean that all the words which do not get the -ɡ in
suffixation today have a /j/ underlyingly? This is not the case. There were cases that, as I
said before, may be thought to have a /j/ finally like dɑneʃ ʤu ‘university student’ (dɑneʃ ʤujɑn ‘university student (pl)’) because the ʤu part of it is the present stem and is
thought to be ʤuj. But not all cases which take -ɡɑn and -ɡi can be interpreted this way
as seen before.
(vi) Conclusion
The status of the Middle Persian -ɡ and its history considering Old Persian seem unclear.
Was the -ɡ inserted at the end of the vowel-final words in Middle Persian and then a
suffix like -āɡ was created? That is, did the speakers reinterpret the suffix as -āɡ
(with its -ɡ inserted in Middle Persian), or was this Middle Persian -āɡ from a suffix like
–āka in Old Persian? Given the data and the accounts in the literature, one cannot be sure
what the correct explanation is. Some speculations can be made, as above, but one cannot
go further due to the contradictory, mixed, and insufficient information and data about
Old Persian k/ɡ and even Middle Persian k/ɡ.
Chapter 9
Summary
This thesis examined aspects of Persian phonology and morpho-phonology. To conclude,
I briefly summarize the findings.
After presenting an introduction to the thesis in chapter 1, I discussed the Persian vowel
system and its active feature with respect to quality/quantity in chapter 2. The active
feature of the Persian vowel inventory is a matter of debate in the literature on Persian
phonology. Some studies consider the dimension of contrast to be length and some
consider it to be height. A synthetic analysis which includes both height and length is also
suggested in the literature. I reviewed the arguments presented in the literature for these
positions and showed that they are inconclusive.
Working within the framework of Modified Contrastive Specification, I consider the
phonological activity of a language as the main diagnostic for determining the dimension
of contrast in its vowel system. The dimension of contrast in the Persian vowel system
should account for both vowel harmony, a process which requires a feature, and the
categorization of vowels into two groups: a, e, o vs. ɑ, i, u. Vowel harmony can be
accounted for by height (but not by length), and categorization of vowels into the two
groups can be achieved by length (but not by height). Thus neither length nor height can
explain both the harmony and the categorization. In addition, under the theoretical
assumptions of the adopted framework, having both length and height is untenable for the
Persian vowel system. I proposed, in chapter 3, that the feature [tense] can both account
for the Persian harmony patterns and categorize the vowels into the two groups. I
therefore showed that evidence from phonological activity of Persian vowels confirms
the prediction of the framework of Modified Contrastive Specification that in a vowel
inventory such as Persian there is no need for the underlying presence of both quality and
quantity for a given pair of vowels. A phonetic study of the duration of tense and lax
vowels in Persian confirms the duration of Persian vowels to be similar to tense-based,
rather than quantity-based, languages. I further discussed how contrasts are built into the
vowel system of Persian and suggested a contrastive hierarchy for the system. A
discussion of markedness in the system, considering processes such as assimilation,
deletion, neutralization, and epenthesis, was also presented in chapter 3. In addition, I
examined diphthongs, whose phonemic status is controversial in the Persian literature.
Pre-nasal raising, and harmony across laryngeals, two very common processes in Persian
speech, were also discussed in this chapter.
In a continuation of my discussion of tenseness, I showed that there are some processes
and phenomena in the language which present potential evidence for underlying quantity
in the system. These are epenthesis in suffixation, VCC co-occurrence restrictions, and
minimal word requirements. I discussed these processes in some depth and showed that
they do not contradict tenseness in the system. Effects of this type are in fact
compatible with an analysis based on an underlying [tense] contrast in the system.
The occurrence of epenthesis in suffixation (insertion of a vowel at a stem-suffix
boundary when a consonant cluster is created), as examined in chapter 4, suggests a
division between roots with CVlaxC structure and those with CVtenseC, CVlaxCC, and
CVtenseCC structure. This may suggest that epenthesis occurs with roots of heavier
structures, which means that the environment for epenthesis is conditioned by properties
of the vowel and by syllable structure. I proposed that an analysis of epenthesis based on
underlying tenseness is possible: tense vowels project two morae and lax vowels a single
mora. Nonetheless, I raised the question of why epenthesis does not occur in all cases
involving roots with those vowels and syllable structures, if its occurrence
is motivated by the properties of vowels and by the resulting difference in syllable
structure. In order to determine whether the conditions under which
epenthesis occurs are systematic, I examined a variety of synchronic factors (cluster
types, frequency, productivity) and argued that they do not provide an account for
epenthesis. I also discussed suffixes and the suffixed forms from an historical viewpoint
and showed that although the historical investigation of epenthesis in suffixation offers an
account for some epenthesis-including suffixed forms, it fails to provide an account for
many of them. Given the limited number of words which have epenthesis-including
versions and the results of an experiment I conducted for this process, I concluded that
epenthesis is not a productive synchronic process in Persian. Nevertheless, the
distribution of the epenthetic vowel is compatible with the [tense] analysis.
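The weight-based environment summarized above can be illustrated with a minimal sketch. It assumes the mora assignment proposed in this thesis (tense ɑ, i, u project two morae; lax a, e, o project one) and treats a root as heavier when it has a tense vowel or a final consonant cluster, that is, when it is CVtenseC, CVlaxCC, or CVtenseCC rather than CVlaxC. The function names, and the restriction to monosyllabic roots written in the transcription used here, are assumptions made for illustration. The sketch only identifies the environment in which epenthesis is possible; it does not predict that epenthesis occurs, since the process was argued not to be productive.

```python
# Illustrative sketch only: names and the monosyllabic restriction are assumptions.
LAX_VOWELS = set("aeo")     # a, e, o: one mora each under the [tense] analysis
TENSE_VOWELS = set("ɑiu")   # ɑ, i, u: two morae each under the [tense] analysis


def vowel_morae(vowel):
    """Return the number of morae projected by a vowel."""
    if vowel in TENSE_VOWELS:
        return 2
    if vowel in LAX_VOWELS:
        return 1
    raise ValueError(f"not a Persian vowel in this transcription: {vowel!r}")


def root_shape(root):
    """Return (vowel, number of consonants after it) for a monosyllabic root."""
    positions = [i for i, ch in enumerate(root) if ch in LAX_VOWELS | TENSE_VOWELS]
    if len(positions) != 1:
        raise ValueError("this sketch handles monosyllabic roots only")
    index = positions[0]
    return root[index], len(root) - index - 1


def is_heavier_root(root):
    """True for CVtenseC, CVlaxCC, CVtenseCC roots; False for CVlaxC roots."""
    vowel, final_consonants = root_shape(root)
    return vowel_morae(vowel) == 2 or final_consonants >= 2
```

For example, `is_heavier_root("ɢahr")` and `is_heavier_root("ʃɑd")` return `True`, in line with the attested forms ɢahremɑn and ʃɑdemɑn, whereas `is_heavier_root("ɡol")` returns `False`.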
In chapter 5, I examined the restrictions observed in the co-occurrence of vowels and
their following consonants in VCC# form when V is tense. I showed that the restrictions
do not hold when both native and loan words are considered, and argued that the
restrictions, which are limited to final position in native words, are accounted for by
tenseness, from which surface quantity is derived where VtenseCC is in final position. Thus, the
restrictions do not provide an argument against underlying tenseness or for underlying
quantity.
A minimal word requirement, discussed in chapter 6, appears to support quantity in the
Persian vowel system. I showed that the language does not have a clear-cut list of lexical
words ending in ɑ, i, u versus a list of function words ending in a, e, o. Based on the
results of an experiment I conducted on #Co# words, I argued that a minimal word
requirement does not hold in Persian, given the existence of o-final #CV# words in the
language. Note, however, that there is a surface minimality requirement, given the
phonetic length of vowels in #CV# words discussed in chapter 3.
The conclusion of the discussions of the vowel system and the processes (chapters 2-6),
which at first may appear to support quantity in the system, is that an analysis based on
phonological tenseness, as proposed in this thesis, accounts for all the vowel-related
processes and observations in the language. Neither a phonological height-based nor a
phonological quantity-based analysis is able to fully explain such processes and
observations.
The discussions of epenthesis in suffixation and of minimal word requirements opened up
the discussion of the morpho-phonology of the language. Since both of these involved an
experiment, which I conducted with 10 native speakers of Persian in both production and
perception with both real and made-up words, they in fact introduced the experimental
part of the thesis too. Two more morpho-phonological processes, namely the occurrence of
v and ɡ in suffixation, were further components of the morpho-phonology part of this
work and of its experimental dimension.
I argued that the occurrence of v after o- and u-final words is due to the labial
phonetic environment. This account, as discussed in chapter 7, explains the variation
observed between the occurrence of v and the default pattern j in such words. The results
of the experiment showed that the occurrence of v is an active process in the
language, as it is frequently seen with made-up words.
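The variation between v and j can be sketched as a choice among candidate consonants at the stem-suffix boundary. This is a minimal illustration, assuming vowel-final stems in the transcription used here; the function name is hypothetical. Returning only the default j for stems ending in a vowel other than o or u is a simplifying assumption (it ignores, for instance, the ɡ pattern found with e-final words).

```python
# Illustrative sketch only: the function name and the simplifications are assumptions.
VOWELS = set("aeoɑiu")
LABIAL_VOWELS = {"o", "u"}  # vowels that provide the labial environment for v


def boundary_consonant_options(stem):
    """Return the consonants that may surface after a vowel-final stem
    when a vowel-initial suffix is attached."""
    if not stem or stem[-1] not in VOWELS:
        raise ValueError("this sketch handles vowel-final stems only")
    if stem[-1] in LABIAL_VOWELS:
        # Labial environment: v is available alongside the default j.
        return ("v", "j")
    # Elsewhere only the default pattern is modelled.
    return ("j",)
```

With the real words used in the experiment, `boundary_consonant_options("ɑhu")` and `boundary_consonant_options("no")` both return `("v", "j")`, reflecting the variation reported for o- and u-final words.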
I argued that the occurrence of ɡ after e-final words with two suffixes (the noun-forming
-i and the plural marker -ɑn) is a case of phonologically conditioned allomorphy.
Considering -ɡi and -ɡɑn as allomorphs provides an account for their occurrence, which
is limited to a particular vowel followed by two particular suffixes and which cannot be
explained by other hypotheses, as I showed in chapter 8. The results of the experiment
showed that -ɡi and -ɡɑn are not actively used with made-up words, which suggests that
the ɡ-including cases in the language are lexically listed allomorphs.
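The idea of lexically listed, phonologically conditioned allomorphs can be sketched as a lookup that applies only when three conditions are met: the stem ends in e, the suffix is -i or -ɑn, and the stem is on a lexical list. The function name and the caller-supplied list are hypothetical; the sketch deliberately returns nothing for unlisted stems, since it does not model what speakers produce in those cases.

```python
# Illustrative sketch only: the lexical list is supplied by the caller and is hypothetical.
G_ALLOMORPHS = {"i": "ɡi", "ɑn": "ɡɑn"}  # noun-forming -i and plural -ɑn


def listed_g_form(stem, suffix, listed_stems):
    """Return stem + ɡ-allomorph if the stem is e-final, the suffix is -i or -ɑn,
    and the stem is lexically listed; otherwise return None."""
    if suffix in G_ALLOMORPHS and stem.endswith("e") and stem in listed_stems:
        return stem + G_ALLOMORPHS[suffix]
    return None
```

For instance, with divɑne ‘crazy’ treated as a listed stem, `listed_g_form("divɑne", "i", {"divɑne"})` returns `"divɑneɡi"`; with an empty list the same call returns `None`, mirroring the finding that the ɡ forms are not extended to made-up words.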
This thesis touches upon aspects of Persian phonology and morpho-phonology which are
either controversial or less studied. Therefore, it greatly contributes to our knowledge of
the language. In addition to the theoretical aspect, it contains an experiment which, to the
best of my knowledge, has not previously been carried out for these Persian processes.
Although the thesis is mainly about the synchronic status of the phonology and
morpho-phonology of the language, it also makes a considerable historical contribution,
as many processes were examined historically as well. In addition to its contribution to
the study of the Persian language, it provides discussion of the phonology of vowels and
of morpho-phonology, in particular suffixation, both of which deepen our understanding of
the theory. It also contributes to experimental work on phonological and
morpho-phonological patterns. Moreover, this work raises topics for future research on
Persian phonology, morpho-phonology, and phonetics.
ajɑl ‘one’s family’; ajɑl vɑr ‘having a large family’
suɡ ‘mourning’; suɡ vɑr ‘mourner’
masih ‘Jesus’; masih vɑr ‘Jesus-like’
ʃɑh ‘king, Shah’; ʃɑh vɑr ‘kingly, royal’
divɑne ‘crazy’; divɑne vɑr ‘madly’
sezɑ ‘reward’; sezɑ vɑr ‘deserving’
tuti ‘parrot’; tuti vɑr ‘parrot-like’
24- vɑre
ɡuʃ ‘ear’; ɡuʃ vɑre ‘earring’
mɑh ‘moon’; mɑh vɑre ‘satellite’
ʤaʃn ‘feast’; ʤaʃn vɑre ‘festival’
jɑd ‘memory’; jɑd vɑre ‘memorial gathering/event’
25- vɑn (a few words; e.g., sar ‘head’, sarvɑn ‘a captain’)
26- vɑne (a few words; e.g., poʃt ‘back’; poʃt vɑne ‘support’)
27- vand (not productive)
ʃahr ‘city’; ʃahrvand ‘citizen’
pas ‘after’; pasvand ‘suffix’
piʃ ‘before’; piʃvand ‘prefix’
28- var
nɑm ‘name’; nɑmvar ‘famous’
bɑr ‘fruit, result’; bɑr var ‘fruitful’
soxan ‘speech’; soxan var ‘someone who speaks well’
sar ‘head’; sar var ‘venerable’
honar ‘art’; honar var ‘artist’
ʃoʔle ‘flame’; ʃoʔle var ‘flaming’
ʃenɑ ‘swimming’; ʃenɑvar ‘floating’
29- vaʃ
mah ‘moon’; mahvaʃ ‘like moon’
pari ‘angel’; parivaʃ ‘like an angel’
30- sɑ
mah ‘moon’; mahsɑ ‘like moon’
pari ‘angel’; parisɑ ‘like an angel’
31- sɑr
ʃarm ‘embarrassment’; ʃarmsɑr ‘embarrassed’
sanɡ ‘stone’; sanɡsɑr ‘death by stoning’
rox ‘face’; roxsɑr ‘face’
32- sɑn
mah ‘moon’; mahsɑn ‘like moon’
ɡorbe ‘cat’; ɡorbe sɑn ‘usually used with plural -ɑn: the cat-like animals’
33- sar
sanɡ ‘stone’; sanɡ sar ‘name of a place’
34- sir
sard ‘cold’; sardsir ‘a region with a cold climate’
ɡarm ‘warm’; ɡarmsir ‘a region with a warm climate’
35- zɑr
namak ‘salt’; namak zɑr ‘salt-marsh(es)’
ɡol ‘flower’; ɡol zɑr ‘flower field’
alaf ‘grass’; alaf zɑr ‘grass field’
keʃt ‘cultivating a land’; keʃt zɑr ‘cultivated land’
sabze ‘grass’; sabze zɑr ‘meadow’
ʃɑli ‘rice when in the field’; ʃɑlizɑr ‘rice field’
36- zi (not productive; e.g., marv ‘a city’; marv zi ‘sb who is from marv’)
37- ʃan (not productive)
Only in ɡol ‘flower’; ɡolʃan ‘garden’
There is another word with this suffix, bonʃan ‘pulses’, but it is not
considered a suffixed form today.
38- ʧe
bɑɢ ‘garden’; bɑɢʧe ‘small garden’
bil ‘shovel’; bilʧe ‘trowel’
daftar ‘notebook’; daftarʧe ‘small notebook’
sanduɢ ‘chest, box’; sanduɢʧe ‘small chest’
ɑlu ‘plum’; ɑluʧe ‘cherry plum’
ɢɑli ‘rug’; ɢɑliʧe ‘small rug’
no ‘new’; noʧe ‘a disciple’
-ʒe and -ʒak are also variants of this suffix.
39- ʧi (originally from Turkish)
ʃekɑr ‘hunting’; ʃekɑrʧi ‘hunter’
tofanɡ ‘gun’; tofanɡʧi ‘gunner’
telefon ‘telephone’; telefonʧi ‘telephone operator’
ɢahve ‘coffee’; ɢahve ʧi ‘the tea-house owner’
ɡɑri ‘a horse-driven cart’; ɡɑri ʧi ‘a cart-driver’
tamɑʃɑ ‘watching’; tamɑʃɑ ʧi ‘a viewer’
40- ʤi (only in mijɑnʤi ‘mediator’)
41- mɑn
sɑxt (past stem from sɑxtan ‘to build’); sɑxtemɑn ‘building’
sɑz (present stem from sɑxtan ‘to build’); sɑzmɑn ‘organization’
zɑ(j) (present stem from zɑjidan/zɑʔidan ‘to give birth’); zɑjemɑn
‘childbirth’
ɢahr ‘wrath, anger’; ɢahremɑn ‘hero’
ʃɑd ‘happy’; ʃɑdemɑn ‘happy, joyful’73
42- mand
xerad ‘wisdom’; xerad mand ‘wise’
dard ‘pain’; dard mand ‘painful’
ɑz ‘greed’; ɑz mand ‘greedy’
anduh ‘sorrow’; anduh mand ‘sad’
73 Note that Kalbasi considers -mɑn in the words ɢahremɑn ‘hero’ and ʃɑdemɑn ‘happy, joyful’ to be different from -mɑn in sɑxtemɑn ‘building’, sɑzmɑn ‘organization’, and zɑjmɑn ‘childbirth’. The former -mɑn, according to her, was a noun but the latter a suffix. Since she does not provide a source for this etymological distinction and mɑn as a noun does not exist today (as it cannot be found in Persian dictionaries), I consider the -mɑn in ɢahremɑn ‘hero’ and ʃɑdemɑn ‘happy, joyful’ to be also a suffix.
bahre ‘profit’; bahre mand ‘enjoying the profits of sth’
ɑberu ‘reputation’; ɑberu mand ‘having good reputation’
43- nɑ (a few words; tanɡ ‘tight’; tanɡnɑ ‘bottleneck’)
Real words used in the experiment on epenthesis in suffixation
The suffixed form        The stem/root
arʤ-mand ‘valued’ arʤ ‘value’
ɑfarid-ɡɑr ‘creator (used for God)’ ɑfarid ‘past stem of ɑfaridan ‘to create’’
ɑmorz-ɡɑr ‘forgiving (used for God)’ ɑmorz ‘present stem of ɑmorzidan ‘to forgive’’
ɑmuz-ɡɑr ‘teacher’ ɑmuz ‘present stem of ɑmuxtan ‘to learn sth, to teach sth to sb’’
parvard-ɡɑr ‘creator (used for God)’ parvard ‘past stem of parvardan ‘to raise’’
pɑj-ɡɑh ‘base as in ‘military base’’ pɑ(j) ‘foot’
pɑs-bɑn ‘policeman’ pɑs ‘watch, guard duty’
bɑɢ-bɑn ‘gardener’ bɑɢ ‘garden’
bozorɡ-vɑr ‘magnanimous’ bozorɡ ‘big, great’
dud-mɑn ‘lineage’74 (see the footnote)
kɑr-ɡar ‘worker’ kɑr ‘work’
ɡoft-mɑn ‘discourse, dialog’ ɡoft ‘past stem of ɡoftan ‘to say’’
ɢahr-mɑn ‘hero’ ɢahr ‘wrath, anger’
vɑʒ-ɡun ‘upside down’ vɑʒ ‘upside down’75
sɑxt-mɑn ‘building’ sɑxt ‘past stem of sɑxtan ‘to build’’
sɑz-mɑn ‘organization’ sɑz ‘present stem of sɑxtan ‘to build’’
sɑz-ɡɑr ‘compatible’ sɑz ‘present stem of sɑxtan ‘to get along with sth/sb’’76
zɑj-mɑn ‘baby delivery’ zɑj ‘present stem of zɑjidan ‘to give birth’’
ʃɑd-mɑn ‘happy, joyful’ ʃɑd ‘happy’
xɑst-ɡɑr ‘suitor’ xɑst ‘past stem of xɑstan ‘to want’’
ʤɑn-var ‘animal’ ʤɑn ‘soul, body’
mehr-bɑn ‘kind’ mehr ‘kindness’
mehr-ɡɑn ‘name of a Persian celebration’ mehr ‘kindness’
mɑnd-ɡɑr ‘lasting’ mɑnd ‘past stem of mɑndan ‘to last’’
74 This word seems to be dude + -mɑn, which means that it has lost its -e in its dudmɑn version. This word was formerly dūt-ak-mān and, given the historical change of -ak to -e, it is reasonable to consider dudemɑn as the original form and dudmɑn as its more recent form. This is one of those cases which I considered as historical residue (see (56) in 4.6.1). The Dehkhoda Persian dictionary considers dude ‘family’ as dud + e (http://www.loghatnaameh.com/).
75 This word is not usually used alone; the meaning given above is based on the Dehkhoda dictionary (http://www.loghatnaameh.com/).
76 sɑxtan has two meanings: ‘to build’ and ‘to get along with sth/sb’.
most-mand ‘poor’ most ‘complaint’77
roft-ɡar ‘street sweeper’ roft ‘past stem of roftan ‘to sweep’’
rast-ɡɑr ‘salvaged’ rast ‘past stem of rastan ‘to find relief’’
ruz-ɡɑr ‘times’ ruz ‘day’
jɑd-ɡɑr ‘memento’ jɑd ‘memory’
No-epenthesis words. This list is organized in the same way as the list of
epenthesis-possible words: vowel-initial words are followed by consonant-initial words,
which are ordered by the place of articulation of their first consonant.
The suffixed form        The stem/root
anduh-nɑk ‘sad’ anduh ‘sadness’
ajɑl-vɑr ‘burdened with a large family’ ajɑl ‘one’s wife and children’
77 most is not used alone; the meaning given above is based on the Dehkhoda dictionary, which considers mostmand/mostamand to be made from most (http://www.loghatnaameh.com/).
dɑneʃ-kade ‘faculty as in ‘faculty of arts’’ dɑneʃ ‘knowledge’ (dɑn + eʃ)
kɑr-ɡɑh ‘atelier, workshop’ kɑr ‘work’
kɑr-mand ‘employee’ kɑr ‘work’
ɡandom-zɑr ‘wheat field’ ɡandom ‘wheat’
ɡaʧ-kɑr ‘plasterer’ ɡaʧ ‘plaster’
ɡol-zɑr ‘flower field’ ɡol ‘flower’
ɡoruh-bɑn ‘sergeant’ ɡoruh ‘group’
ɡuʃ-vɑre ‘earrings’ ɡuʃ ‘ear’
ɢalam-dun ‘pen-case’ ɢalam ‘pen’
ɢam-ɡin ‘sad’ ɢam ‘sadness’
ɢazab-nɑk ‘frustrated’ ɢazab ‘frustration’
ɢodrat-mand ‘powerful’ ɢodrat ‘power’
sahar-ɡɑh ‘early morning’ sahar ‘early morning’
sanduɢ-ʧe ‘small chest’ sanduɢ ‘chest, box’
saxt-ɡir ‘strict’ saxt ‘difficult, severe’
servat-mand ‘rich’ servat ‘wealth’
suɡ-vɑr ‘mourner’ suɡ ‘mourning’
ʃamʔ-dun ‘candle holder’ ʃamʔ ‘candle’
ʃarm-ɡin ‘embarrassed’ ʃarm ‘embarrassment’
xerad-mand ‘wise’ xerad ‘wisdom’
xaʃm-ɡin ‘angry’ xaʃm ‘anger’
xaʃm-nɑk ‘angry’ xaʃm ‘anger’
xɑb-ɡɑh ‘dormitory’ xɑb ‘sleep’
xɑst-ɡɑh ‘origin’ xɑst ‘past stem of xɑstan ‘to rise’’78
honar-mand ‘artist’ honar ‘art’
mes-ɡar ‘coppersmith’ mes ‘copper’
nɑm-dɑr ‘famous’ nɑm ‘name’
78 xɑstan ‘to rise’, from which xɑstɡɑh ‘origin’ is formed, has a different spelling from xɑstan ‘to want, to desire’, from which xɑstɡɑr ‘suitor’ is formed.
Real words used in the experiment on the occurrence of -v in suffixation
Followed by the suffix -i
no ‘new’
mo ‘grape leaf’
ʤo ‘barley’
piʃro ‘leader’
pejro ‘follower’
pɑdo ‘errand boy’
ɑhu ‘deer’
Followed by the suffix -ɑn
abru ‘eyebrow’
piʃro ‘leader’
pejro ‘follower’
pɑdo ‘errand boy’
ɑhu ‘deer’
bɑnu ‘lady’
Followed by other suffixes
no ‘new’ + in
ʤo ‘barley’ + in
mo ‘grape leaf’ + estɑn
ʤo ‘barley’ + estɑn
Real words used in the experiment on the occurrence of -ɡ in suffixation
Followed by the suffix -i