Transcript

Speech Extrapolated
Author: David Evan Jones
Source: Perspectives of New Music, Vol. 28, No. 1 (Winter, 1990), pp. 112-142
Published by: Perspectives of New Music
Stable URL: http://www.jstor.org/stable/833346
Accessed: 21-10-2015 01:20 UTC


This content downloaded from 169.229.11.216 on Wed, 21 Oct 2015 01:20:27 UTC. All use subject to JSTOR Terms and Conditions.

SPEECH EXTRAPOLATED

    DAVID EVAN JONES

IT IS LARGELY timbral qualities and timbral transitions which form the acoustic basis of phonetic communication. It is not surprising, then, that those timbres which cue the perception of speech are often taken as points of departure in efforts to structure a wider timbral vocabulary as a vehicle of purely musical communication. Wayne Slawson (1985), for example, has developed a detailed and rigorous approach to the organization of vowels and vowel-like resonance patterns. Fred Lerdahl (1987) asserts that vowels are "central to human timbre perception" and incorporates them prominently in his efforts to construct a hierarchical system of timbral organization.1 Prior to these theoretical efforts, poets such as Filippo Marinetti, Hugo Ball, Kurt Schwitters, Maurice Lemaitre, and composers such as Karlheinz Stockhausen, Herbert Eimert, Luciano Berio, György Ligeti, Kenneth Gaburo, Charles Dodge, Roger Reynolds, Paul Lansky, and many others have composed with speech sound in such a way as to focus the attention of listeners on the sounds of speech not only as carriers of information (verbal meaning), but also as structures of (timbral) information. Many of these composers and poets have utilized speech sound in association with timbrally similar nonspeech sounds in an effort to structure a larger and more varied timbral vocabulary.

I have addressed elsewhere (1987) some of the theoretical issues underlying the strategies composers have utilized to focus listeners' attention on "the sounds of speech-not only as cues for phonetically coded information, but as timbres, pitches, durations." As a composer, I have explored ways of making the structure of speech sound serve as a basis for important aspects of my musical structures. My approach has generally involved organizing speech sounds according to their timbral characteristics (rather than their morphemic function) and highlighting this timbral organization by my use of related instrumental timbres and associated pitch structures. By these means, I have found ways of "mapping" or "extrapolating" aspects of speech structure into a musical domain.2

In this paper I will describe some aspects of the compositional language developed for my Still Life Dancing for four percussion players and computer tape. In particular, I will focus upon several families of "percussion/vowels" I synthesized for this piece: sounds identifiable as percussion instruments but with identifiable vowel resonances. Along the way, I will address several theoretical issues related to the musical organization of speech sounds and of vowels in particular. In the first part of this paper, I will address some of the assumptions underlying the internal organization of the families of percussion/vowels and I will outline the procedures by which these sounds were synthesized. In the second part, I will discuss the musical functions of the percussion/vowels within the overall timbral organization of Still Life Dancing, and I will outline the effects I hope to achieve by the integration of speech and nonspeech timbres.

    I. PERCUSSION/VOWELS

    OVERVIEW

Utilizing the CHANT synthesizer (Rodet, Potard, and Barriere 1984) and an analysis/synthesis technique for impulsive sounds (Potard, Baisnee, and Barriere 1986) at L'Institut de Recherche et de Coordination Acoustique/Musique (IRCAM), I synthesized familiar-sounding sources (percussion timbres) such that each source displayed the characteristics of a set of nine familiar resonances (vowels) to form the matrix in Example 1.
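The matrix idea is simply a cross product of two independently varied dimensions, which can be sketched in a few lines of Python. The source names follow Example 1; the nine-vowel set is an assumed reading of the scan (/æ/ in particular is a guess):

```python
from itertools import product

# Source-by-resonance grid in the spirit of Example 1. The vowel list is an
# assumed reading of the article's nine-vowel scale; /ae/ is a guess.
sources = ["struck wood", "struck metal", "bowed metal"]
vowels = ["u", "o", "œ", "ʌ", "a", "æ", "ɛ", "ɪ", "i"]

matrix = {(s, v) for s, v in product(sources, vowels)}
print(len(matrix))   # 27 source/vowel combinations
```

Each cell of the grid corresponds to one synthesized percussion/vowel sound.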



[Matrix grid: sources (Struck Wood, Struck Metal, Bowed Metal, etc.) as rows against the nine vowels from /u/ to /i/ as columns; the grid itself is not legible in this scan.]

EXAMPLE 1

    The percussion/vowel sounds were synthesized with both wide and narrow bands at the formant center frequencies to produce perception of both vowel (by virtue of the wideband resonances) and pitch (the narrow bands) at these frequencies. The formant structure of the vowel "scale" from /u/ to /i/ is thus closely and consistently associated with a system of pitch relations.
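The wide-band/narrow-band pairing can be illustrated with a toy two-pole resonator. This is a sketch in Python, not the CHANT implementation; the 700 Hz center frequency and the bandwidth values are illustrative assumptions. The same center frequency rings briefly when the band is wide (timbral color) and rings on as a clear pitch when the band is narrow:

```python
import math

def resonator(x, fc, bw, sr=44100.0):
    """Two-pole resonator: center frequency fc (Hz), bandwidth bw (Hz)."""
    r = math.exp(-math.pi * bw / sr)          # pole radius set by bandwidth
    b1 = 2.0 * r * math.cos(2.0 * math.pi * fc / sr)
    b2 = -r * r
    y, y1, y2 = [], 0.0, 0.0
    for xn in x:
        yn = xn + b1 * y1 + b2 * y2
        y.append(yn)
        y1, y2 = yn, y1
    return y

sr = 44100.0
n = int(sr * 0.5)                              # half a second
impulse = [1.0] + [0.0] * (n - 1)

f1 = 700.0                                     # illustrative formant frequency
wide = resonator(impulse, f1, 80.0, sr)        # wideband: vowel-like color
narrow = resonator(impulse, f1, 1.0, sr)       # narrowband: audible pitch

def tail_energy(y, frac=0.5):
    """Energy in the last (1 - frac) of the signal."""
    k = int(len(y) * frac)
    return sum(v * v for v in y[k:])

# The narrow band rings far longer than the wide band.
print(tail_energy(narrow) > 1000.0 * tail_energy(wide))
```

Summing wide and narrow resonances at the same center frequency, as the article describes, yields both percepts at once.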

    Vowel qualities were thus given potential musical functions as

1. Points of cognitive unison between timbres which, in other respects, are very different, and which are identified by listeners as issuing from categorically different sources (e.g. metal percussion, wood percussion, etc.).

    2. Points of direct association between pitch information (the formant structure audible as pitches) and timbral information (the vowels themselves).3

    COGNITIVE UNISONS

I regard two musical features as being in cognitive unison when they effectively represent the same functional unit (or category) in musical context. Two pitches or two timbres (or two independent aspects of timbre) or two vowels may be in cognitive unison when, despite perceptible differences between them along the dimension in question, they are functionally equivalent at the lowest hierarchical level along which that feature is varied-when they are tokens of the same functional category along that dimension. The idea of a cognitive unison is thus an attempt to distinguish between those differences along a given dimension that give rise to the musical structure (functional differences) and those differences that do not.4

Perception of differences along any musical dimension is strongly influenced by the listeners' experience in their everyday acoustic world. Even if harmonies, durations, and aspects of timbre were varied along carefully controlled continua, it is highly unlikely that the listeners' perceptual responses would be continuous: listeners make use of familiar categories to discriminate and identify points along the continuum. (For related experiments, see Rosch 1973, 1975.) In regard to speech sounds in particular, see the excellent review provided by Liberman and Studdert-Kennedy (1978). Composers attempting to create musically effective categories (and therefore effective points of cognitive unison) along any musical "dimension" must take into account the categories already familiar to the listener. Composers must either make use of these familiar categories or take special care to convincingly override them for the purposes of a particular piece.5

The percussion/vowels were synthesized in an effort to make use of familiar categories: to cue the perception of a familiar source (metal or wood percussion instruments) and a familiar resonance (one of nine selected vowels) in the same sound and to vary these two features independently in musically meaningful ways.

    INDEPENDENCE OF SOURCE AND RESONANCE

    Because timbre involves a complex of perceptually separable acoustic characteristics, each of which can be varied independently, the notion of cognitive unisons in the domain of timbre is, at least potentially, a compound issue. Matrices of timbral characteristics can be constructed within which individual dimensions can be independently varied. Following Rodet (1984), Rodet, Potard, and Barriere (1984), Slawson (1985), Potard et al. (1986) and many others, I have selected "source" and "resonance" as timbral dimensions which can be varied independently, and which can be parsed independently into musically functional categories.

When controlled independently, changes in the "source" and "resonance" of a sound can often be discriminated independently. Spoken and whispered vowel sounds provide a clear example of this independence: changes in the "source" (glottal oscillations or air friction) and the vowel (associated with the formant pattern) can be discriminated separately. Experimental evidence for more general examples of the perceptual independence of source and resonance is extensively reviewed by Slawson (1985), who also proposes an elaborate network of rules according to which the dimension of "sound color" (vowel-like resonance patterns) can be varied.6


Potard, Baisnee, and Barriere (1986) have developed an approach to synthesis using extremely detailed models of the resonances of impulsive sounds. I found their (1986) demonstrations of the independence and perceptual saliency of "source" and "resonance" to be remarkably convincing. Grey's (1977) multidimensional perceptual scaling experiments have also demonstrated that dimensions related to source ("instrument family") and resonance ("spectral energy distribution") are amongst the most perceptually salient features of musical timbre.

    VOWEL UNISONS

Vowels (and other "steady-state" voiced speech sounds) are unique among all possible patterns of formant resonance in that listeners have codes, cognitive prototypes, by means of which they identify and remember them. Two of the three formant structures in Example 2 describe vowels. Example 2a is a schematic formant structure of the vowel /u/. Example 2b represents the vowel /i/. Example 2c is not a vowel. Because the resonance pattern described by 2c would not be easily produced by a human vocal tract, it has not become part of a vocabulary of coded resonances associated with a natural language. The nonvowel resonance might (or might not) be remembered if it occurred twice in a musical composition but, because it cannot be related to a prototype and coded in memory, it is likely to be a much less effective category than the vowels, and therefore less effective as a musical form-building element.
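The prototype idea can be caricatured as nearest-neighbor matching in formant space: a resonance pattern is "a vowel" only if it lies near some stored prototype. The F1/F2 prototype values below are illustrative textbook-style figures, not the article's data, and the distance threshold is an arbitrary assumption:

```python
import math

# Illustrative (not the article's) average F1/F2 vowel prototypes in Hz.
PROTOTYPES = {
    "u": (300.0, 870.0), "o": (500.0, 1000.0), "a": (730.0, 1090.0),
    "ɛ": (530.0, 1840.0), "i": (270.0, 2290.0),
}

def classify(f1, f2, max_dist=0.35):
    """Nearest prototype in log-frequency space; None if nothing is close."""
    best, best_d = None, float("inf")
    for v, (p1, p2) in PROTOTYPES.items():
        d = math.hypot(math.log(f1 / p1), math.log(f2 / p2))
        if d < best_d:
            best, best_d = v, d
    return best if best_d <= max_dist else None

print(classify(280.0, 2200.0))   # falls near the /i/ prototype
print(classify(2500.0, 400.0))   # F1 above F2: no prototype nearby
```

A pattern like Example 2c, far from every prototype, classifies to nothing, which is the sense in which it resists coding in memory.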

Identification and retention of sequences of vowels would be only marginally affected if a listener's native language divides the vowel continuum at boundaries different from those chosen by a composer. While the specific prototypes around which listeners organize vowel percepts may vary depending upon the language they speak, the tendency to group vowel resonances according to these prototypes persists even when the vowels presented do not conform to the prototype. If a listener's language has no /ʌ/, for example, s/he might remember it as an "open /œ/." While this variant might not form as strong a perceptual category as a "cardinal" vowel in the listener's language, it would be still more memorable than a resonance pattern which fell outside the region of the vowel continuum structured by the listener's language.7 The vowels at the high and low formant frequency extremes of the vowel continuum (/u/, /a/, and /i/) appear almost universally in languages of the world (see Disner 1983).

    The attempt to construct an instrument/vowel matrix is complicated by the fact that listeners are usually cued to listen for phonetic information in a signal when they identify the source of the sounds as a human voice.8 That is not to say, however, that listeners cannot extract phonetic information


[Three schematic amplitude-versus-frequency plots, 0-3000 Hz (panel C extends to 4000 Hz): (A) the formant structure of /u/; (B) the formant structure of /i/; (C) a nonvowel resonance pattern.]

EXAMPLE 2: VOWEL AND NONVOWEL RESONANCES

(when phonetic information is available) from "un-voice-like" signals. Experiments such as Bailey et al. (1977)9 suggest, however, that un-voice-like speech sounds must be presented in such a way as to draw the listeners' attention to the phonetic features of the sounds.10 Rapid diphthongs and juxtapositions of sounds with markedly different formant structures tend to highlight the formant structure in ways which draw listeners' attention to the phonetic message. Presentation of sounds which clearly carry phonetic messages can also help to influence listeners to attend to associated sounds as speech (Tsunoda 1971).

    SYNTHESIS OF PERCUSSION/VOWEL SOUNDS

The percussion/vowels were synthesized in an effort to cue the perception of two very different-and normally incompatible-sets of categories within the same sound: familiar sources (metal and wood percussion instruments) with familiar resonances (vowel sounds). For this purpose, I made use of the CHANT synthesizer (Rodet, Potard, and Barriere 1984) and a technique developed by Potard, Baisnee, and Barriere (1986) for modeling and synthesizing the resonances of impulsive sounds.

CHANT utilizes formant-wave-function (FOF) synthesis (Rodet 1984) to enable the user to dynamically control the center frequencies, bandwidths, and amplitudes of up to two hundred or more time-varying resonances in a synthesized sound. The resonances may be produced either by synthesis or filtering. The program was initially designed for speech synthesis, but has proven to be extremely flexible and effective in the synthesis of a variety of timbres.
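FOF synthesis itself can be sketched in a few lines: each fundamental period triggers a short enveloped sinusoid (a "grain") at the formant center frequency, and the grain's decay rate sets the effective bandwidth. This is a minimal caricature of the technique, not CHANT's implementation; the grain-length rule and envelope shape are assumptions:

```python
import math

def fof_grain(fc, bw, dur, sr=44100.0):
    """One formant-wave-function grain: a sine at fc with a raised-cosine
    attack and an exponential decay whose rate is set by the bandwidth bw."""
    n = int(dur * sr)
    atk = int(0.25 * n)
    out = []
    for i in range(n):
        env = math.exp(-math.pi * bw * i / sr)
        if i < atk:                           # smooth onset
            env *= 0.5 * (1.0 - math.cos(math.pi * i / atk))
        out.append(env * math.sin(2.0 * math.pi * fc * i / sr))
    return out

def fof_stream(f0, fc, bw, dur, sr=44100.0):
    """Overlap-add one grain per fundamental period, as in FOF synthesis."""
    n, period = int(dur * sr), int(sr / f0)
    grain = fof_grain(fc, bw, 4.0 / bw, sr)   # grain rings a few time constants
    out = [0.0] * n
    for start in range(0, n, period):
        for j, g in enumerate(grain):
            if start + j < n:
                out[start + j] += g
    return out

sig = fof_stream(f0=220.0, fc=880.0, bw=80.0, dur=0.1)
print(len(sig))   # 4410 samples
```

The result is a pitched tone at f0 carrying a single resonance at fc; CHANT runs many such resonances in parallel.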

Potard, Baisnee, and Barriere (1986) have described techniques by means of which a digital recording of an impulsive sound can be analyzed and modeled as a set of resonances, each resonance with a given center frequency, bandwidth, and amplitude. In the form of a text file, each model can then be altered and manipulated by any algorithm the researcher may develop in the UNIX environment or in FORMES (Rodet and Cointe 1984) and synthesized using CHANT. Several models require more than a hundred individual resonances, but the number of resonances can often be systematically reduced without a marked loss in "fidelity." Potard et al. have made a number of models of impulsive sounds available in IRCAM's on-line library.
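A text-file workflow of this kind might look like the following sketch. The three-column format and the numbers are assumptions for illustration, not IRCAM's actual file format or data:

```python
# Hypothetical resonance-model file: one resonance per line,
# "center_freq_hz  bandwidth_hz  amplitude". Values are made up.
MODEL_TEXT = """\
220.0  2.0  1.00
452.5  5.0  0.62
881.0  9.0  0.35
1310.4 14.0 0.21
2764.9 30.0 0.05
"""

def parse_model(text):
    """Read (freq, bandwidth, amplitude) triples from the text form."""
    return [tuple(float(t) for t in line.split()) for line in text.splitlines()]

def transpose(model, ratio):
    """Shift every resonance by a frequency ratio (bandwidths scaled too)."""
    return [(f * ratio, bw * ratio, a) for f, bw, a in model]

def reduce_model(model, keep):
    """Drop the weakest resonances - often possible without much loss."""
    return sorted(model, key=lambda r: -r[2])[:keep]

model = parse_model(MODEL_TEXT)
up_fifth = transpose(model, 1.5)
small = reduce_model(model, 3)
print(len(model), round(up_fifth[0][0], 1), len(small))   # 5 330.0 3
```

Because the model is ordinary text, any such algorithm can be interposed between analysis and resynthesis.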

    Using CHANT and the techniques described by Potard, I synthesized the percussion/vowel sounds with three active formants. Only the first two formants were associated with audible pitches (described below); the fourth and fifth formants were static.

1. I selected a model of a given instrument originally played at a fundamental frequency around 440 Hz, near the center of the range of first-formant vowel frequencies with which I was working. (Specific formant frequencies for the nine selected vowels are given below.)


2. I transposed this model so that the fundamental frequency (F0) of the model of the instrument corresponded to the center frequency of the first formant (F1) of the desired vowel.

3. I transposed the same model (or another model of the same instrument played at a higher pitch) so that the fundamental frequency (F0) of the instrumental model corresponded to the center frequency of the second formant (F2) of the desired vowel.

    4. I combined the two models.

5. I attenuated or eliminated all resonances which fell between ca. 300 Hz and 3000 Hz (the frequency range within which F1 and F2 vary) except for the resonances within a few Hz of the fundamental frequencies of the two combined percussion models (which had been tuned to F1 and F2 of the desired vowel).

6. At F1 and F2 of the desired vowel, I added a number of additional resonances which varied from wide bands (perhaps 50 Hz or more) to very narrow bands (ca. 1 Hz or less).

7. Proceeding empirically, I balanced the relative amplitudes of the wide and narrow bands at each of the formant frequencies to produce both a perception of vowel (the wide bands) and pitch (the narrow bands). I also balanced the relative amplitudes of F1 and F2 (to produce an optimally intelligible vowel) and the relative amplitudes of the formants vs. the remaining percussion resonances (to produce a percept of both vowel and instrument). Amplitudes and resonance times were altered freely to these ends.

8. Having produced individual vowel/percussion "objects," I produced diphthongs by using tools in the FORMES environment to control continuous interpolations between series of two or more such "objects."
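Steps 2 through 6 can be summarized as operations on lists of (center frequency, bandwidth, amplitude) triples. The "cowbell" model and the /a/ formant values below are illustrative stand-ins, not the article's data, and the attenuation and band parameters are assumptions:

```python
# Sketch of steps 2-6 on resonance lists (freq_hz, bandwidth_hz, amplitude).

def transpose(model, ratio):
    """Shift all resonance frequencies by a ratio (step 2/3)."""
    return [(f * ratio, bw, a) for f, bw, a in model]

def percussion_vowel(model, f0, F1, F2, atten=0.05):
    low = transpose(model, F1 / f0)          # step 2: tune f0 to F1
    high = transpose(model, F2 / f0)         # step 3: tune f0 to F2
    combined = low + high                    # step 4: combine the two models
    out = []
    for f, bw, a in combined:                # step 5: clear the F1-F2 band
        in_band = 300.0 <= f <= 3000.0
        near_formant = min(abs(f - F1), abs(f - F2)) < 5.0
        out.append((f, bw, a if (not in_band or near_formant) else a * atten))
    for F in (F1, F2):                       # step 6: wide + narrow bands
        out.append((F, 60.0, 0.5))           # wide band -> vowel percept
        out.append((F, 1.0, 0.5))            # narrow band -> pitch percept
    return out

# Made-up three-resonance "cowbell" and illustrative /a/ formants.
cowbell = [(440.0, 3.0, 1.0), (562.0, 6.0, 0.7), (1580.0, 12.0, 0.4)]
pv = percussion_vowel(cowbell, f0=440.0, F1=700.0, F2=1220.0)
print(len(pv))   # 6 combined resonances + 4 added formant bands = 10
```

Steps 7 and 8 (balancing amplitudes, interpolating diphthongs) were done empirically and fall outside this sketch.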

    The wide bands at F1 and F2 were designed to convey the vowel. The transients below 300 Hz, the narrowband resonances at the formant center frequencies (the fundamental frequencies of the two original models), and the transients and harmonics above 3000 Hz were left unaltered to convey the identity (and sense of pitch) of the original percussion instruments. Thus each percussion/vowel is pitched as a dyad at the formant center frequencies (F1 and F2).


    VOWEL SCALE

I think of the extremes of the vowel space (/u/, /a/, and /i/) as defining the extremes of a range of articulations with which listeners can identify. If a speaker or singer were to continue the articulatory gesture from /a/ to /u/ in an attempt to find a vowel with lower first and second formants than /u/, she would arrive at the articulatory place for the stop consonant /b/ or (if the sound is nasalized) the nasal /m/. If she attempts to continue the articulatory gesture from /a/ to /i/ in an attempt to find a vowel with a higher second formant than /i/, she will arrive at the articulatory place for the stop consonant /d/ or (if the sound is nasalized) the nasal /n/. Similarly, attempts to continue the transitions from /u/ to /a/ or from /i/ to /a/ to produce still more "open" vowels result only in vowels closely related to /a/. The vowels /u/, /a/, and /i/ are thus not arbitrarily selected as extremes of the vowel "scale"; they represent articulatory extremes along the vowel continuum. Their unique articulatory positions may be the reason that the vowels /u/ and /i/ appear almost universally in natural languages. In Still Life Dancing, /u/, /a/, and /i/ are the most common vowels of reference or orientation: /u/ often serves as a point of resolution; /a/ often serves as a secondary (more temporary) "tonic"; and /i/ often serves as a high point.

I divided the vowel continuum from /u/ to /i/ into a scale of nine individual vowels roughly organized according to their second formant frequencies.11 The vowel continuum could, of course, be divided into an infinite number of individual vowels, just as the octave can be divided into an infinite number of pitches. I selected nine as a number which would facilitate a sufficient variety of vowel patterns and still permit the discrimination and identification of the individual vowel qualities.12 These particular vowels were selected because they are familiar to me as a speaker of English and because their formant frequencies are spaced fairly evenly in the vowel space. In Example 3, a range of possible center frequencies is given for F1 and F2 of each of the nine vowels.13
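One simple way to divide such a continuum into nine steps is to space them evenly in log-F2, in the spirit of dividing an octave into equal steps. The endpoint F2 values here are assumptions for illustration, not those of Example 3:

```python
# Nine steps from /u/ to /i/, spaced evenly in log(F2).
# The endpoint frequencies are illustrative assumptions.
F2_U, F2_I, N = 700.0, 2700.0, 9

scale = [F2_U * (F2_I / F2_U) ** (k / (N - 1)) for k in range(N)]
print(len(scale), round(scale[0]), round(scale[-1]))   # 9 700 2700
```

The article's actual vowels were chosen by familiarity and even spacing rather than by a formula, so this is only an analogy.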

    VOWEL/PITCH ARRAY

    As the formant center frequencies determine both vowel qualities and perceived pitch dyads, I designed a pitch array based upon the vowel scale described above.

The pitch array shown in Example 4 was composed intuitively as a succession of dyads, roughly within the constraints of the average formant frequencies given in Example 3. (The phonetic symbol for each vowel in Example 3 is placed exactly at the point in the matrix representing the intersection of the F1 and F2 frequencies selected for that vowel.) The pitches shown in Example 4 were consistently used as F1 and F2 in the synthesis of the indicated vowels in all families of percussion/vowel timbres. Each percussion/vowel is thus consistently associated with a specific pitch interval. (While the F2 pitch is not easily "heard out" in the vowels /u/ through /a/, it is nonetheless audible in the "character" of the timbres.)

[Table mapping the nine vowel symbols against the twelve-tone tempered scale from C4 (261.6 Hz) to A5 (880.0 Hz); the ranges of F1 and F2 center frequencies are not fully legible in this scan.]

EXAMPLE 3: FIRST AND SECOND VOWEL FORMANTS FOR NINE VOWELS (FEMALE) MAPPED AGAINST THE TWELVE-TONE TEMPERED SCALE

[Staff notation of the dyad array for the nine vowels from /u/ to /i/; not legible in this scan.]

EXAMPLE 4: VOWEL FORMANT CENTER-FREQUENCY/PITCH ARRAY


As I wished to distinguish the vowel sounds in pitch as well as in formant pattern, I chose a different pitch class from the range of pitches available for the first formant of each of the nine vowels. The first formant pitch stands out most prominently as a pitch in the vowels from /u/ to /a/. The second formant pitch becomes increasingly prominent in the vowels from /æ/ to /i/, particularly in the metal vowels. Because the percussion/vowels can be transposed only about a major second in either direction without seriously impairing the vowel percept, the vowel scale from /u/ to /i/ is thus roughly associated with a somewhat variable "scale" of formant pitches. Within the limits given, any sequence of vowels or diphthongs thus involves a concomitant pitch gesture.

    DISCRIMINATION AND IDENTIFICATION OF PITCH AND VOWELS

    Listeners bring very different perceptual processes and capabilities to bear in the perception of pitch and vowels. Individual pitches can (within limits) be discriminated from other pitches, and can be placed relative to other pitches (by interval higher or lower). Most listeners, however, cannot absolutely identify individual pitches. (That is to say, most listeners do not have absolute pitch.) Presumably for this reason, appreciation of most musical systems requires only the identification of relative pitch patterns.

Individual vowels, on the other hand, are both discriminated and identified in normal speech perception. Whereas most listeners cannot consistently identify the note G#5, for example, in a non-tonal musical context (or even in a tonal one if they are not first told the key), even musically untrained listeners can consistently identify the vowel /a/ (whose first formant is around G#5 in this system).

Moreover, the capability of listeners to label and code vowel qualities in memory may be a potentially powerful form-building tool. Research has shown that musically trained listeners retain in memory not only a "recording" of diatonic pitch patterns, but also a coded representation of that pattern. The coded representation is retained in memory more accurately and for a longer time than the analog "recording." Phonetic perception entails a similar coding of information. (See Liberman and Studdert-Kennedy (1978) for example.) Listeners' ability to retain the speech information in nonsense syllables suggests that even a non-morphemic organization of vowels (in a musical grammar) might be coded and retained in similar ways.

The concomitance in the organization of vowel and pitch in the percussion/vowels has the effect of reinforcing the memorability of particular pitch/vowel areas as points of departure and arrival between which the large gestures of Still Life Dancing were constructed.


    FAMILIES OF PERCUSSION/VOWELS

I produced several families of percussion/vowels at IRCAM. Each family consists of one source realized as each of the nine individual vowels and (in most families) ten to thirty selected diphthongs. The families include percussion/vowels utilizing models of cowbell, glockenspiel, and piano sounds. Wood/vowels were synthesized empirically ("by ear") without use of a percussion model. Both metal/vowels (glockenspiel) and wood/vowels were realized both by FOF synthesis and, using the same resonances, by filtered white noise. The filtered white noise vowels were altered to produce sustained wideband resonances which I call wood-noise/vowels and metal-noise/vowels. "Bowed metal" vowels were produced by repeatedly exciting the glockenspiel model (as if the glockenspiel bar were being gently struck repeatedly at a frequency audible as a pitch) to give the impression of "bowing."

Of all the percussion/vowels, only cowbell/vowels, glockenspiel/vowels, wood/vowels, wood-noise/vowels, and metal-noise/vowels are used in the composition Still Life Dancing. I plan to use other combinations of families in other pieces. The sound files were recorded at IRCAM on PCM digital tape and sampled onto the Synclavier II at Bregman Electronic Music Studio, Dartmouth College, where they were edited and assembled into the completed tape part.

    II. TIMBRAL ORGANIZATION IN STILL LIFE DANCING

    INSTRUMENTAL GROUPINGS

The "instrumental" resources used in my composition Still Life Dancing for three percussion players and tape are presented in the matrix in Example 5. This figure displays the "instruments" within a configuration of variables arranged so as to suggest some of the ways in which the "instruments"-and the variables-were deployed in the piece.14

Along the horizontal axis of Example 5, the "instruments" of the piece are arranged in three categories representing a stepped transition between speech and percussion sounds (and, coincidentally, according to the means by which the sounds were produced). Because they possess the "source" characteristics of pitched wood and pitched metal percussion instruments and the resonance characteristics of the (unpitched) sampled vowels, the percussion/vowels help to unify the sound world of the piece. They very often function as a hinge or a bridge between the unpitched sampled speech and the pitched percussion or as part of a complex unified texture. (See Example 6, discussed below.)


                 Sampled Speech      Vowels                Live Percussion

Pitched Wood                         wood vowels           marimba
                                     wood-noise vowels     xylophone

Wood                                                       wood blocks
                                                           temple blocks

Vocal Tract      vocal fry vowels                          drums
(no pitch)       whisper vowels                            hand percussion
                 fricatives

Metal                                                      cowbells
                                                           cymbals

Pitched Metal                        metal-noise vowels    vibraphone
                                     metal vowels          orchestral bells

EXAMPLE 5: "INSTRUMENTS" USED IN Still Life Dancing

Along the vertical axis of Example 5, the sound world of the piece is divided according to the "source" of the sounds (wood, vocal tract, metal) and according to specificity of pitch ("pitched" vs. "unpitched").15 At various points in the piece, "unpitched" instruments, "pitched" instruments, "wood" instruments, and "metal" instruments (as defined in Example 5) are projected as individual timbral groupings. Drums, which are neither wood nor metal of course, are rarely projected in Still Life Dancing as a separate "skins" category; instead they serve as a lower-frequency component of the "unpitched" category which also includes metal, wood, and the sampled speech.

Although it is not indicated in the diagram, the "unpitched" category is further divided into vowel-related and fricative-related instruments. Because of their indefinite pitch and short decay, wood blocks, temple blocks, and cowbells are often grouped with vocal fry vowels to create an ambiguous speech/nonspeech texture. In Examples 7 and 8 in particular, the wood blocks and whisper/fry are hocketed in interlocking, often imitative patterns. Cymbals and whisper-sung vowels are often grouped because they are both wideband sounds with indefinite pitch and comparatively long decay. Hand percussion such as maracas, shekere, and tambourine, which can be sustained by rolls, are often grouped with the fricatives (/s/ and /ʃ/)16 and the fricative-vowel transitions (/su/, /sa/, /si/, /ʃu/, /ʃa/, /ʃi/). In Example 2, bar 1, the fricative-vowel transition /ʃi/ comes out of the sound of the shekere. In bars 2-3 of the same example, an extended /ʃ/ joins rolled tambourine and maracas as part of a high-pitched rustle-noise texture.


[Score excerpt, EXAMPLE 6: the notation is illegible in this scan; legible staff labels include Percussion/Diphthongs and Percussion/Vowels.]

EXAMPLE 6 (CONT.)

[Score excerpt, EXAMPLE 7: the notation is illegible in this scan; staff labels include Whisper/Fry, Xylophone, Marimba, Cymbals, Drums, Wood Blocks, Temple Blocks, and Hand Percussion.]

EXAMPLE 7

[Score excerpt: the notation is illegible in this scan.]

    EXAMPLE 8 (CONT.) [music notation not legible in this transcription; staff labels include Whisper/Fry, Percussion/Vowels, Marimba, Wood Blocks, Temple Blocks, Cowbell, Cymbals, Drums]

    The piece evolves from a focus on unpitched sounds at the outset (Example 7) to include more and more pitched wood sounds (Example 8). After a brief concentration on the ensemble of "wood" instruments, the piece moves to a focus on pitched metal (Example 9) and, finally, to a complex mixture of timbral groups.

    VOWEL QUALITIES

    As indicated at the beginning of this paper, vowel qualities are integrated with other musical features in two different ways:

    1. Vowels function as points of direct association between pitch information (the formant structure of the vowels audible as pitches) and timbre (the formant structure audible as vowels).

    Pitch height and vowel quality could be schematically represented as different but associated aspects of a third dimension (not shown) of Example 5. Vowel quality and pitch are associated along this dimension because the percussion/vowel scale from /u/ to /i/ entails a concomitant (though somewhat variable) set of pitches associated with the formant frequencies. Vowel quality and pitch represent different aspects of this dimension because of the differences (some of which are outlined above) between phonetic perception and pitch perception.

    Within the context of Still Life Dancing, the pitches from around D4 (the first formant of /u/) to around F7 (the second formant of /i/) are in a privileged position because they fall within the range of vowel formant frequencies. The memorability of pitches and patterns of pitches within this range can be reinforced and "colored" by direct association with specific vowels. This will be illustrated below in a discussion of Example 6 (bars 93-106).
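The correspondence between formant frequencies and pitch names can be computed directly. The following Python sketch is my own illustration, not part of the piece; it assumes A4 = 440 Hz equal temperament, and the rounded formant values (about 300 Hz for the first formant of /u/, about 2800 Hz for the second formant of /i/) are typical textbook figures, not values quoted from the article:

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def hz_to_pitch(freq, a4=440.0):
    """Return the nearest equal-tempered pitch name for a frequency in Hz."""
    semitones = round(12 * math.log2(freq / a4))  # semitones above/below A4
    midi = 69 + semitones                         # MIDI note number of A4 is 69
    return NOTE_NAMES[midi % 12] + str(midi // 12 - 1)

# Typical first formant of /u/ and second formant of /i/ (approximate values)
print(hz_to_pitch(300))   # near D4
print(hz_to_pitch(2800))  # near F7
```

Any formant center frequency can be mapped to its nearest tempered pitch this way; the D4 and F7 boundaries named in the text correspond to roughly 294 Hz and 2794 Hz.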

    The whisper and vocal fry vowels are organized in a scale of ascending second formants, as are the percussion/vowels. These sounds are sometimes organized according to their specific vowel qualities (see below) and sometimes simply as another unpitched percussion instrument (a percussion instrument the vowel qualities of which can become functional at any time). The percussionists are asked to select the range of wood and temple blocks to approximate the pitch of /i/ at the high end and /u/ at the low end in order to facilitate perceptual associations and interactions between the two sets of "instruments."

    2. Vowel qualities function as points of cognitive unison between timbres which, in other respects, are very different, and which are identified by listeners as issuing from categorically different sources.

    Because the vocal fry vowels and whisper vowels are vaguely and complexly pitched, it is usually the vowel quality and the noise content, not the pitch, that stand out most clearly when a vocal fry vowel or whisper vowel is sounded. Moreover, because the sampled speech sounds were produced by a male vocal tract while the percussion vowels were based upon average formant frequencies for a smaller (female) vocal tract, the pitches associated with each vocal fry vowel are not identical to the formant frequencies of the percussion/vowels. The inclusion of vocal fry vowels and whisper vowels in the musical materials of Still Life Dancing thus gives vowel quality a function separate from, and more independent of, pitch.

    The vocal fry and whisper vowels can be used to match, extend, and imitate patterns of vowel quality heard in the percussion/vowels independently of the pitch or other aspects of the timbre. The fry and whisper vowel patterns often follow the pitched vowels (either as extensions of an individual vowel or as an imitation of a vowel pattern) to convey the impression that the pitch of the sound has decayed, leaving the unpitched vowel. The intent of these imitations and extensions is to persuade listeners to focus upon individual vowels as categories (points of cognitive unison) which can include sounds of remarkably different timbre, and thereby to integrate the pitched and unpitched timbres.

    Both the association of vowels with specific pitch ranges and the use of vowel qualities as cognitive unisons between two categorically different sources (one of them unpitched) are illustrated in Example 6 (bars 93-106). Each of the pitched segments in this excerpt focusses upon at least one clearly audible vowel: /a/ ends the first pitched segment (bars 93-96); /ɛ/ ends the second (bars 98-100). The third pitched segment (bars 101-104) begins with /ɪ/ and descends to /o/ before repeating, in elaborated form, the motion from /a/ (bar 103, bar 104) to /ɛ/ (bar 104). The high point of the phrase is reached at the /i/ of bar 105.

    Each of these structural vowels appears in both the (pitched) percussion/vowels and in the (unpitched) whisper/fry vowels. (Note that the percussion/vowel line is itself a hocket of different sources.) Vowel patterns which appear as percussion/vowel sounds are sometimes imitated in the whisper/fry. (E.g., the descent from /i/ to /u/ in bar 105 is telescoped in the whisper/fry in bars 105-106.) Similar imitations and interactions can be found in Example 9, bars 259-61, and Example 8, bar 43.

    CONCLUSION

    The vowel unisons between the sampled speech and the percussion/vowels point to an interesting and important ambiguity in Example 5. The source category "vocal tract" should certainly include the sampled speech, which issued (quite audibly) from a human vocal tract. But should it include the percussion/vowels, which seem to emanate from wood or metal instruments but behave (in their changing resonances) as if produced by a vocal tract? This ambiguity is maintained throughout the piece by the continual return of sampled speech and percussion/vowels. The speech and/or timbral similarities of the sampled speech, percussion/vowels, and the live percussion are intended to extend this ambiguity to include as many of the sounds of the piece as possible.

    The compositional intent is to invite the listener:

    1. To attend, at times, to the texture of this piece in a speech mode, interpreting (or attempting to interpret) timbres which would normally be thought of as nonspeech in terms of speech sounds which share timbral characteristics, and, conversely ...

    2. To attend to the sounds of speech not only as cues for phonetically coded information, but as timbres, pitches, durations.


    The perception of "vocal tract behavior" and the "extrapolation" of this behavior onto other sounds requires only that the listener recognize characteristically "vocal" formant transitions and other speech qualities as inextricably related elements of the texture. This effect is intensified, however, to whatever extent specific vowel qualities are made identifiable and functional parts of the musical language.

    As Fred Lerdahl (1987) has remarked, the tremendous timbral flexibility provided by electro-acoustic instruments has made the need for a greater understanding of timbral organization extremely acute. I am not as optimistic as Lerdahl, however, about the development, in the foreseeable future, of broad principles of hierarchical timbral organization. In the recent past and for the foreseeable future, I see a variety of "situational" approaches to timbre: approaches unique to individual composers and to individual pieces, approaches which involve elaborate timbral structures which are nonetheless mutually dependent with associated pitch structures. The central problem confronting these "situational" understandings concerns the unfamiliarity of audiences with the assumptions upon which each new system is based. In this regard, the sounds of speech recommend themselves as fascinating, diverse, familiar, and therefore useful points of departure.

    ACKNOWLEDGMENTS

    It is a great pleasure to acknowledge the assistance I received in this project from IRCAM researchers Pierre-François Baisnée, Xavier Rodet, and Jean-Baptiste Barrière, who offered instruction and assistance in the use of CHANT and who (with others cited in the bibliography) developed all of the synthesis tools with which I worked. I am grateful, as well, to my colleague Jamshed Bharucha at Dartmouth College, who read and commented upon an earlier version of this manuscript. Any errors which may remain are my own.

    Funding for this project was generously provided in the form of a Research Fellowship by the Dartmouth Class of 1962. I greatly appreciate their assistance.


    NOTES

    1. While Lerdahl organizes timbre according to judgements of "consonance" and "dissonance" along several timbral dimensions (an approach explicitly related to the structure of tonal music), Slawson organizes his system of "sound colors" according to serial principles.

    2. I have utilized speech sounds as the basis for compositional control of a wide variety of timbres in various contexts. Two examples: In Passages (1981) for chamber choir, two pianos, two percussion, and organ, I utilized a phonetically constrained text composed for the piece by poet Michael Davidson in conjunction with a phonetically constrained artificial language of my own design. The artificial language entailed the rule-based construction of "nonsense" syllables utilizing three classes of phonemes (fricatives, stop consonants, and vowels), the timbral qualities and organization of which were mapped onto the organization of the entire ensemble and the entire piece. In Scritto (1986) for computer tape, I utilized sampled whispers (a vowel scale, two fricatives, three stop/vowel syllables), vocal fry (the same vowel scale), and sung aggregates with percussion attacks. The ambitus of the pitch aggregates consistently associated with the sung vowels varied directly with the height of the second formant frequencies of the vowels to reinforce the sense of "opening" from /u/ through the "vowel scale" to /a/ to /ɪ/.

    3. J. K. Randall (1972-74) proposed a similar association of pitch and vowel as early as 1972 in "Part II: 6 Stimulating Speculations" of his article "Compose Yourself-A Manual for the Young." Randall described a system of "musical-intervallic relations (specifically, the intervallic relations formed by center-pitches of R1 and R2 [the first and second vowel formants]) between and among the timbres ..." which would be "... reproducible at various places" in the vowel square. Slawson (1985) cites a passage in Stockhausen's Kontakte (1963) in which filters set at wide bands to produce a vowel are gradually narrowed until the pitch of the center frequency is audible.

    Some psycholinguists regard speech perception as fundamentally different from perception of timbre in other contexts, and indeed from all other types of perception: "speech is special," they say. I discuss phonetic labeling here as if it were an independent aspect of timbre perception, but I do not mean by that to deny the claim of "specialness" for speech. I am agnostic on most aspects of the controversy. I do maintain, however, that (as experiments in "duplex perception" demonstrate; see, for example, Liberman (1979)) listeners can attend simultaneously to both the timbral quality and phonetic identity of speech sounds.

    4. An example: In the domain of pitch, the simplest, strongest, most elemental intra-category relationship is the unison or prime. Within limits, pitches may differ perceptibly and still be identified by listeners as functionally identical in context. This is reflected in the fact that "bad intonation" or expressive "pitch inflection" is generally distinguished from "wrong notes." (In music which succeeds in establishing distinct functions for, say, thirty-one tones in an octave, the threshold between "bad intonation" and "wrong notes" would be virtually eliminated.) Pitch unisons or primes represent functional relationships in a musical context, rather than absolute identities of pitch or frequency. This is not to say that two C♯s cannot play different functions at a higher hierarchical level; they certainly can. The point here is that, at a lower level of functional interpretation, the note C♯ can be effectively represented by sounds with slightly (but perceptibly) different pitch heights.

    5. Erickson (1975) discusses the importance of recognition and identification at some length. He asserts that "Even if we try very hard we find it difficult to attend to any single parameter of a timbre," and he quotes Schouten (1968) as follows: "Evidently our auditory system does carry out an extremely subtle and multi-varied analysis of these elements, but our perception is cued to the resulting overall pattern. Acute observers may bring some of these elements to conscious perception, like intonation patterns, onsets, harshness, etc., even so, minute differences may remain unobservable in terms of their auditory quality and yet be highly distinctive in terms of recognizing one out of a multitude of potential sound sources."

    This bears directly upon Lerdahl's (1987) efforts to construct timbral hierarchies. My strongest objection to Lerdahl's otherwise intriguing approach is that he oversimplifies the powerful associations (gained by everyday experience) that listeners bring to bear in their listening to synthesized sounds. In a discussion of "timbral prototypes," he refers only to "prototypes" of vibrato, tremolo, amplitude envelope, and other independent variables and makes no reference to traditional instruments or other environmental sounds. Presumably, his definitions of prototypical vibrato and prototypical amplitude envelope are somehow abstracted from listeners' experiences, but it seems clear to me that if such prototypical features of timbre can be said to exist, they are interdependent and strongly related to the listener's judgement as to the source of the sound. I would think, for example, that a wide vibrato (pitch variation, not tremolo) would sound more cognitively "dissonant" when applied to a vibraphone-like sound than to a voice-like sound, because listeners know from previous experience that metal cannot change pitch as flexibly as vocal cords. Lerdahl postulates a prototypically "consonant" vibrato (and tremolo, amplitude envelope, etc.) independently of other aspects of the timbre. A more elaborate version of Lerdahl's initial study may prove fruitful, but any such elaboration must take account of the fact that timbre perception is much more context-dependent and less linear than Lerdahl seems to imply.

    6. Slawson (1985) has given us one of the most thorough theoretical reviews of issues concerning the physical and perceptual relationships between "source and resonance."

    7. The memorability of a set of resonance patterns might be explained in terms of the memorability of the location of a set of visual points on a page. It is easier to categorize and remember the location of points which fall within a familiar grid (the vowel continuum) than the location of points which fall outside any known grid (outside the vowel continuum), even if the points within the grid do not fall exactly in the center of the spaces or on the lines. Differing grids applied to the same set of points will result in the points being positioned and recalled relative to different lines, but it will not change the fact that the locations can be more easily memorized when a grid is applied.

    8. For a discussion of "voice-likeness" and phonetic perception, see Jones (1987).

    9. Bailey et al. (1977) synthesized sine-wave analogues of speech sounds by varying the frequency of two sine waves to correspond to the center frequencies of the first two vowel formants in spoken syllables. Most of the listeners presented with these rapidly varying sine waves did not at first identify the sounds as speech. When they were asked to listen to the sounds as speech, however, all were easily able to decode the phonetic message.
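The sine-wave analogues described in this note are easy to approximate in software. Below is a minimal NumPy sketch of my own (not the Haskins procedure itself); the formant tracks, duration, and sample rate are illustrative assumptions:

```python
import numpy as np

def sine_wave_analogue(f1_track, f2_track, dur=0.3, sr=16000):
    """Two sine waves following F1/F2 center-frequency tracks (linear glides).

    f1_track, f2_track: (start_hz, end_hz) tuples; illustrative values only.
    """
    t = np.linspace(0, dur, int(sr * dur), endpoint=False)
    out = np.zeros_like(t)
    for start, end in (f1_track, f2_track):
        freq = np.linspace(start, end, t.size)    # linear formant glide
        phase = 2 * np.pi * np.cumsum(freq) / sr  # accumulate phase from frequency
        out += 0.5 * np.sin(phase)
    return out

# Rough /ba/-like transition: F1 rises 200->700 Hz, F2 rises 900->1200 Hz
signal = sine_wave_analogue((200, 700), (900, 1200))
```

Played back, a signal like this sounds like whistling glides until one is told to hear it as speech, which is precisely the effect Bailey et al. report.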

    10. This corresponds to the argument above that each of the independent features of any timbre matrix must be presented in such a way as to draw the listeners' attention to the musical function of changes along that dimension.

    11. The organization of vowels into a scale of ascending second formants from /u/ through /a/ to /i/ dates back at least as far as the Seventeenth Century. Ladefoged (1967) cites Robinson's 1617 manuscript "The Art of Pronuntiation" as organizing the vowels /u/, /o/, /ɔ/, /ɛ/, /i/ along an articulatory continuum represented by tongue positions. Stockhausen organized the vowels between /u/ and /i/ into a "vowel-square" in Stimmung (1967), essentially two scales of harmonics associated with vowels he indicates. (The "notes" of the scales are extremely focussed second-formant frequencies; the two scales delineate the front and back vowels respectively.) Slawson (1985) develops his elaborate system of "sound color" variation around traditional phonetic dimensions within the F1 vs. F2 vowel space. (He adds the dimension of "smallness" not used by phoneticians.) "Openness," "acuteness," "laxness," and "smallness" can all be described acoustically, and all (except "smallness") have proven to be useful to phoneticians. But Slawson's sound examples did not convince me that variation along all of these dimensions will function perceptibly in a musical context. Both Stockhausen's and Slawson's organizations of the vowel space are more complex than the simple "scale" I outline in Example 4. I choose this simpler organization of vowels because I find it to be the clearest and most compelling for my purposes.
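The ascending-second-formant "vowel scale" discussed in this note can be illustrated in a few lines of Python. The F2 values below are rounded, illustrative averages of my own choosing (real formant frequencies vary considerably by speaker and language), not figures from the article:

```python
# Rounded, illustrative average second-formant (F2) frequencies in Hz
F2 = {"u": 870, "o": 1000, "a": 1220, "ɛ": 1840, "ɪ": 1990, "i": 2290}

# Sorting by F2 reproduces the /u/ ... /a/ ... /i/ "vowel scale"
vowel_scale = sorted(F2, key=F2.get)
print(vowel_scale)  # → ['u', 'o', 'a', 'ɛ', 'ɪ', 'i']
```

Whatever values are chosen within normal speaker variation, the ordering from /u/ up to /i/ is stable, which is what makes the scale usable compositionally.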

    12. Disner (1983) discusses evidence that the distribution of vowels in natural languages "... is best accounted for by a principle of maximum dispersion ... that is, that they tend to be arranged so as to be maximally far from one another in the available phonetic space." Also see Stevens's (1972) article on "The Quantal Nature of Speech ...."

    13. The ranges of possible center frequencies for F1 and F2 of the nine vowels given in Example 3 were derived from a number of different sources. The most useful of these were Peterson and Barney (1952), whose reported formant frequencies have worked well in a vowel synthesizer I developed in 1979 at the Center for Music Experiment at U.C. San Diego (reported in Jones (1984)), and Ladefoged (1967), who gives formant frequencies reported by several different researchers. The orientation of the formants in Example 3 is a traditional presentation of the vowel space.

    14. Matrices such as the one in Example 5 have become familiar compositional tools for many composers in this century. A sophisticated theoretical discussion of the use of matrices is found in Lewin's (1987) Generalized Musical Intervals and Transformations. My own compositional uses of the above matrix were, however, much more intuitive than the rigorous approaches described by Lewin. I found it useful to think of the dimensions in Example 5 as fields within which gestures (directed motion, discernible shapes) may be constructed.


    15. The term "unpitched" appears in this paper in its "traditional" usage: as a convenient way to refer to indefinitely and/or complexly pitched sounds. Many sounds can serve as "unpitched" or "pitched" depending upon the context. The whisper vowels and vocal fry, for example, sometimes seem to project audible pitch centers (which were occasionally exploited in Still Life Dancing). These sounds, however, were more generally organized according to their relative pitch and are accordingly labeled "unpitched."

    16. Similar instrumental associations with fricatives are also found in Berio's Circles (see Jones 1988), Ligeti's Aventures and Nouvelles Aventures, and many other pieces.

    BIBLIOGRAPHY

    Bailey, Peter J., Quentin Summerfield, and Michael Dorman. 1977. "On the Identification of Sine-wave Analogues of Certain Speech Sounds." Haskins Laboratories Status Report on Speech Research SR-51/52:1-25.

    Berio, Luciano. 1961. Circles, for female voice, harp, and two percussion players. London: Universal Edition.

    Disner, S. F. 1983. "Vowel Quality: The Relation Between Universal and Language Specific Factors." U.C.L.A. Working Papers in Phonetics no. 58.

    Erickson, Robert. 1975. Sound Structure in Music. Berkeley and Los Angeles: University of California Press.

    Grey, John M. 1977. "Multidimensional Perceptual Scaling of Musical Timbres." Journal of the Acoustical Society of America, 61, no. 5 (May): 1270-77.

    Jones, David Evan. 1981. Passages, for chamber choir, two pianos, two percussion, and organ (score forthcoming from American Composers Editions).

    . 1984. "A Composer's View." Electro-Acoustic Music (The Journal of the Electro-Acoustic Music Association of Great Britain) 1, no. 1 (May-June).

    . [1986]. Scritto, for computer tape (compact disc Wergo Records on Digital Music Digital, vol. 4, WER 2024-50).


    . 1987. "Compositional Control of Phonetic/Non-Phonetic Percep- tion." Perspectives of New Music 25, nos. 1 & 2:138-55.

    . 1988. "Text and Music in Berio's Circles." Ex Tempore 4, no. 2 (Spring-Summer): 108-14.

    Ladefoged, Peter. 1967. "The Nature of Vowel Quality." Part two of Three Areas of Experimental Phonetics: Stress and Respiratory Activity; The Nature of Vowel Quality; Units in the Perception and Production of Speech. London: Oxford University Press.

    Lerdahl, Fred. 1987. "Timbral Hierarchies." Contemporary Music Review 2:135-60.

    Lerdahl, Fred, and Ray Jackendoff. 1983. A Generative Theory of Tonal Music. Cambridge, Mass.: M.I.T. Press.

    Lewin, David. 1987. Generalized Musical Intervals and Transformations. New Haven: Yale University Press.

    Liberman, Alvin M. 1979. "Duplex Perception and Integration of Cues: Evidence that Speech is Different from Nonspeech and Similar to Lan- guage." In Ninth International Congress of Phonetic Sciences, Symposium, vol. 8. Copenhagen: Institute of Phonetics, University of Copenhagen.

    Liberman, Alvin M., and Michael Studdert-Kennedy. 1978. "Phonetic Perception." In Handbook of Sensory Physiology, vol. 8: Perception, edited by Richard Held, Herschel W. Leibowitz, and Hans-Lukas Teuber, 143-78. Berlin and Heidelberg: Springer-Verlag.

    Ligeti, György. 1966. Nouvelles Aventures, for three singers and seven instrumentalists. New York: C. F. Peters.

    McAdams, Stephen. 1982. "Spectral Fusion and the Creation of Auditory Images." In Music, Mind and Brain: the Neuropsychology of Music, edited by Manfred Clynes, 279-98. New York: Plenum Press.

    McAdams, Stephen, and Kaija Saariaho. 1985. "Qualities and Functions of Musical Timbre." In Proceedings of the International Computer Music Conference, 1985, edited by Barry Truax, 367-74. San Francisco, CA: Computer Music Association.

    Peterson, Gordon E., and Harold L. Barney. 1952. "Control Methods Used in a Study of the Vowels." Journal of the Acoustical Society of America 24, no. 2 (March): 175-84.

    Plomp, Reinier. 1970. "Timbre as a Multidimensional Attribute of Complex Tones." In Frequency Analysis and Periodicity Detection in Hearing, edited by Reinier Plomp and Guido F. Smoorenburg. Leiden: A. W. Sijthoff.


    Potard, Yves, Pierre-François Baisnée, and Jean-Baptiste Barrière. 1986. "Experimenting with Models of Resonance Produced by a New Technique for the Analysis of Impulsive Sounds." In Proceedings of the International Computer Music Conference 1986, edited by Paul Berg. San Francisco, CA: Computer Music Association.

    Randall, James K. 1972-74. "Compose Yourself-A Manual for the Young." Parts 1-3. Perspectives of New Music 10, no. 2 (Spring-Summer 1972): 1-12; 11, no. 1 (Fall-Winter 1972): 77-91; 12, nos. 1 & 2 (Fall-Winter 1973/Spring-Summer 1974): 233-81.

    Rodet, Xavier. 1984. "Time-Domain Formant-Wave-Function Synthesis." Computer Music Journal 8, no. 3 (Fall): 9-14.

    Rodet, Xavier, and Pierre Cointe. 1984. "FORMES: Composition and Scheduling of Processes." Computer Music Journal 8, no. 3 (Fall): 32-50.

    Rodet, Xavier, Yves Potard, and Jean-Baptiste Barriere. 1984. "The CHANT Project: From Synthesis of the Singing Voice to Synthesis in General." Computer Music Journal 8, no. 3 (Fall): 15-31.

    Rosch, Eleanor H. 1973. "On the Internal Structure of Perceptual and Semantic Categories." In Cognitive Development and the Acquisition of Language, edited by Timothy E. Moore, 111-44. New York: Academic Press.

    . 1975. "Cognitive Reference Points." Cognitive Psychology 7:532-47.

    Schouten, J. F. 1968. "The Perception of Timbre." In Reports of the 6th International Congress on Acoustics, edited by Y. Kohasi. 6 vols. Vol. 3, GP-6-2:35-44, 90. Tokyo: Maruzen Company; Amsterdam: Elsevier.

    Shepard, R. N. 1982. "Geometrical Approximations to the Structure of Musical Pitch." Psychological Review 89:305-33.

    Slawson, Wayne. 1985. Sound Color. Berkeley and Los Angeles: University of California Press.

    Stevens, Kenneth N. 1972. "The Quantal Nature of Speech: Evidence from Articulatory-Acoustic Data." In Human Communication: A Unified View. Inter-University Electronics Series, vol. 15, edited by Edward E. David and Peter B. Denes, 51-66. New York: McGraw-Hill.

    Stockhausen, Karlheinz. 1966. Kontakte. London: Universal Edition.

    . 1967. Stimmung. London: Universal Edition.

    Tsunoda, Tadanobu. 1971. "The Difference of the Cerebral Dominance of Vowel Sounds among Different Languages." The Journal of Auditory Research 11:305-14.

    Wessel, David. 1983. "Timbral Control as a Musical Control Structure." Computer Music Journal 3, no. 2 (Summer): 45-52.

