Formant Tuning Strategies in Professional Male Opera Singers

*Johan Sundberg, †Filipa M. B. Lã, and ‡Brian P. Gill, *Stockholm, Sweden, †Aveiro, Portugal, and ‡New York, New York

Summary: The term "formant tuning" is generally used for the case that one of the lowest formant frequencies coincides with the frequency of a source spectrum partial. Some authors claim that such coincidence is favorable and belongs to the goals of classical opera voice training, whereas other authors have found evidence for advising against it. This investigation analyzes the relationships between formant frequencies and partials in professional singers, who sang scales on the vowels /a/, /u/, /i/, and /ae/ in a pitch range including the passaggio, that is, the fundamental frequency range of approximately 300-400 Hz, applying either of the two singing strategies that are typically used (1) in classical and (2) in nonclassical singing, respectively. Formant frequencies of each note in the scales were measured by inverse-filtering the acoustic signal. In the classical style, the first formant tended to be lower than in the nonclassical style. Neither the first nor the second formant tended to change systematically between scale tones, such that on some scale tones either or both formants was just below, just above, or right on a spectrum partial. In many cases, singers produced similar spectrum characteristics of the top tones of the scales with different first and second formant frequencies. Regardless of whether the first formant was slightly lower, slightly higher, or right on a partial, the properties of the voice source did not seem to be affected.

Key Words: Operatic singing-Non-classical singing-Spectrum-Harmonics-Formant tuning.

Accepted for publication December 4, 2012.
A preliminary version of this investigation was presented at the Physiology and Acoustics of Singing conference at KTH; August 2010; Stockholm, Sweden; at the Seattle meeting of the Acoustical Society of America; May 23-27, 2011; Seattle, Washington; and at the Annual Symposium Care of the Professional Voice; May 28-June 2, 2011; Philadelphia, PA.
From the *Department of Speech, Music and Hearing, School of Computer Science and Communication, KTH Voice Research Centre, Stockholm, Sweden; †Department of Communication and Arts, INET-MD, University of Aveiro, Aveiro, Portugal; and the ‡Department of Music and Performing Arts Professions, Steinhardt School of Culture, Education and Human Development, New York University, New York, New York.
Address correspondence and reprint requests to Johan Sundberg, Department of Speech, Music and Hearing, School of Computer Science and Communication, KTH Voice Research Centre, SE-10044 Stockholm, Sweden. E-mail: [email protected]
Journal of Voice, Vol. 27, No. 3, pp. 278-288
0892-1997/$36.00
2013 The Voice Foundation
http://dx.doi.org/10.1016/j.jvoice.2012.12.002

INTRODUCTION

Sopranos have been found to avoid the situation where the first formant (F1) is lower than the fundamental frequency (F0); if F0 exceeds the normal value of F1, they increase F1 to a frequency very close to F0.1 The strategy of tuning F1 to a specific spectrum partial was strongly recommended by singing teacher Coffin,2 who developed a vowel chart presenting optimum vowels of a given pitch range.2 The terms "formant tuning" or "vocal tract tuning" have been introduced for this strategy, and it has been found to be applied not only by sopranos but also by singers of other classifications. For male voices, vocal tract tuning is mostly observed in and above the passaggio, that is, near the pitches of E4-G4, about 330-400 Hz.3 Vocal tract tuning increases the sound level of the vocal output, and it is assumed to help the singer avoid vocal hyperfunction and register breaks.2,4-7 It is not specific to classically trained singers; it has been found also in some traditional singing.8

Formant tuning has been receiving an ever-increasing amount of attention over the last decade.7,9-13 The literature on formant tuning has recently been described in some detail elsewhere,14 and hence, only a brief overview will be given here (Table 1).

As can be seen in Table 1, different methods have been used in this research. However, all these methods suffer from limitations: (1) it is difficult to interpret spectral data in terms of formant frequencies because the spectrum is strongly dependent on the relationship between F0 and the frequencies of the formants, and when shifting from modal phonation to ingressive or vocal fry phonation, vocal tract shape, and hence the formant frequencies, may change; (2) results of listening tests with synthesized stimuli are very dependent on how natural the stimuli sound; (3) model work offers accurate predictions only if the model accurately reflects all relevant characteristics of the system; (4) broadband excitation at the lip opening during sustained vowel phonation implies that also the subglottal system is to some extent included in the resonator because the glottis is open during part of the glottal vibratory cycle; and (5) in inverse filtering, the formant ringing during the closed phase is eliminated by tuning the frequencies of the inverse filters, so if some of this ringing was caused by factors other than formants, for example, nonlinear source-filter effects, these may be mistakenly eliminated.

In addition, the results of this research diverge, as can be seen in the same Table 1. There are at least three sharply conflicting ideas on vocal tract tuning in singing that have emerged in the world of vocal pedagogy and voice research: singers are recommended to or found to (1) tune F1 and/or F2 to a harmonic partial; (2) keep F1 and F2 constant, independent of F0; and (3) tune F1 or F2 to a frequency just above its nearest partial, so that the coincidence between partial and formant is avoided.

Against this background, it is clear that vocal tract tuning deserves more research. Thereby, a promising strategy seemed to expand our previous, preliminary study of the vowel /a/,14 by including also vowels with lower F1 values.

MATERIALS AND METHODS

Participants and protocol
A total of eight male classically trained singers, 23-42 years old, with varying levels of expertise, ranging from graduate student to international opera singer, voluntarily participated in this study (Table 2). The singers performed ascending/
descending nine-note scales using vowels /ae/, /a/, /u/, and /i/. For each singer, three starting pitches were used, separated by semitone intervals and chosen so that the singer's passaggio range was reached (pitches E4-G4, depending on the voice type). When reaching their passaggios, they applied either of the two different approaches: classical (sung twice) and nonclassical, as in, for example, musical theater (sung once). The Madde vowel synthesizer software (custom made by Svante Granqvist, KTH) was used to supply the reference tone. A total of 224 scales were recorded (this number was mistakenly reported to be 288 in a previous publication, Sundberg et al14): as there were three starting pitches and four vowels, there were 192 classical and 32 nonclassical versions.

Equipment
All recordings were made in a sound-treated studio in the Steinhardt School of Culture, Education and Human Development at New York University. A hybrid system consisting of a Laryngograph (Laryngograph Ltd, Wallington, Greater London, UK) microprocessor and a Glottal Enterprises (Syracuse, NY) MS-110 computer interface was used to record audio and electrolaryngograph (EGG) signals simultaneously. Audio was picked up by a head-mounted electret microphone (Knowles EK3132, Knowles Headquarters, Itasca, IL). The sound level was calibrated by means of a 1 kHz sine wave, the sound pressure level of which was measured in dB(C) next to the recording microphone by means of a sound level meter, and the value observed was announced on the recording. Both signals were recorded using Speech Studio software (Laryngograph, Laryngograph Ltd, Wallington, Greater London, UK) and stored as wav files.

TABLE 1.
Summary Reports of Previous Studies on Formant Tuning Strategies

Authors: Miller and Schutte10
Method: Spectral analysis of ingressive phonation and vocal fry just after a sung tone
Main Conclusion: "...the singer would tend to adjust resonances to match one or another harmonic of the glottal source" (p. 234)

Authors: Carlsson and Sundberg17
Method: Listening tests with synthesized scales on vowel /a/, with and without formant tuning
Main Conclusion: "...subjects did not show any preference for the scales with tuned formants." (p. 259)

Authors: Neumann et al7
Method: Spectrum and EGG analysis of /a/ and // in chest and head register transition
Main Conclusion: "...formant tuning is the most appropriate way of register equalization." (p. 325)

Authors: Titze12
Method: Computerized model allowing inclusion and exclusion of nonlinear source-filter interaction for predicting effects of formant tuning
Main Conclusion: "...a stable harmonic source spectrum is not obtained by tuning harmonics to vocal tract resonances, but rather by placing harmonics into favorable reactance regions" (p. 2733)

Authors: Henrich et al8,16
Method: External broad band excitation of the vocal tract of 22 singers sustaining each tone of a scale in comfortable range and vowels /a/, //, //, and /u/
Main Conclusion: Tenors and baritones generally tune F1 to H2 or H3 over at least part of their range, but intersubject variability is great

Authors: Sundberg et al14
Method: Inverse filtering of scales sung on vowel /a/
Main Conclusion: Professional singers did not show any tendency to tune F1 or F2 to a harmonic partial.

TABLE 2.
Singer Subjects Participating in the Experiment

Singer  Age   Classification  Experience
1       30    Tenor           Internationally touring
2       32    Baritone        Nationally touring
3       23    Baritone        Graduate student
4       25    Tenor           Graduate student
5       24    Baritone        Graduate student
6       42    Tenor           Internationally touring
7       38    Baritone        Nationally touring
8       35    Baritone        Graduate student
Mean    31.1
Analysis
Listening test. Of the recorded scales, 28 were randomly selected for a listening test aimed at identifying typical examples of classical and nonclassical singing styles. The listening test was carried out with a panel of six experienced listeners (university singing teachers in the USA with knowledge of vocal tract tuning). The subjects were given the following written instruction: "Here you will listen to a set of ascending-descending scales sung by different male voices. We ask you to rate how successful the singer is with regard to vocal tract tuning (disregard other issues) in transitioning into the pitches in/above the passaggio in a Classical mode of singing." The subjects gave their ratings along visual analog scales (VAS). The test included 13 replicated stimuli, and all 41 scales were presented in the same randomized order to all subjects, with a 5-second pause in between, lasting a total of approximately 11 minutes. The number of each stimulus was announced in the test file.
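The replicated stimuli make it possible to check how consistently each rater used the scale. A minimal sketch of such a check follows, computing the mean absolute difference, the Pearson correlation, and a least-squares slope and intercept between first and second ratings. The ratings are hypothetical placeholders rather than data from the study, and regressing the second rating on the first is an assumption.

```python
# Test-retest consistency of one rater across replicated stimuli.
# The ratings below are hypothetical placeholders, not data from the study.
from math import sqrt

first = [82.0, 15.0, 64.0, 40.0, 91.0, 23.0]    # first rating, % of VAS length
second = [78.0, 22.0, 60.0, 47.0, 88.0, 30.0]   # rating of the replicate

def mean(values):
    return sum(values) / len(values)

def retest_consistency(x, y):
    """Return mean absolute difference, Pearson r, and slope/intercept of y regressed on x."""
    mad = mean([abs(a - b) for a, b in zip(x, y)])
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = my - slope * mx
    return mad, r, slope, intercept

mad, r, slope, intercept = retest_consistency(first, second)
print(f"mean |diff| = {mad:.1f}%, r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.1f}%")
```

A perfectly consistent rater would give a mean difference of 0, r = 1, a slope of 1, and an intercept of 0.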
Formant frequency analysis. The formant frequencies in all recorded tones in the ascending part of the scale were measured by inverse filtering of the audio signal, using the DeCap software (custom made by Svante Granqvist, KTH). This program displays waveform and spectrum in separate windows (Figure 1). When analyzing an audio signal, it first converts the acoustical signal to a flow signal by integration. The frequencies and bandwidths of the inverse filters are set manually (in the lower window), whereby the classical equations are applied for calculating the transfer function that corresponds to the given combination of formant frequencies and bandwidths.15 The input signal is filtered with the inverse of this transfer function, thus eliminating the effects of the vocal tract transfer function on the input signal, and the resulting waveform and spectrum are displayed in quasi-real time. Provided that the filters are correctly set, the output then displays the waveform and spectrum of the transglottal airflow, including effects of nonlinear source-filter interaction, if any. The program can also display an additional signal, e.g. the EGG recorded on a second channel of the input file, with or without derivation and with an adjustable time delay.

In the upper window of Figure 1, both the transglottal airflow waveform, that is, the flow glottogram, and the derivative of the EGG signal (dEGG) are displayed. The dEGG was delayed by a time interval of about 1 millisecond, corresponding to the travel time of the sound from the glottis to the microphone. The lower window represents the input audio spectrum and the spectrum of the flow glottogram.

For inverse filtering, the formant frequencies and bandwidths were adjusted according to three criteria: (1) ripple-free closed phase; (2) voice source spectrum envelope as void of peaks and valleys as possible; and (3) synchrony between the negative dEGG peak and the maximum declination rate of transglottal flow during closure.

After completing the tuning of the filters, their frequencies and bandwidths were saved in a formant data file. The top note of each of the scales starting on the highest pitch was then synthesized, using the frequencies and bandwidths saved in this formant file. This was done for each of the four vowels, using the Madde software. The bandwidths were about 50, 70, 100, 110, and 120 Hz for the five lowest formants. The purpose of this synthesis experiment was to test the formant values used for the inverse filtering. The values were accepted only if the synthesized vowel and voice quality sounded similar to the vowel sound analyzed, as determined by the authors.

The F0 signal was extracted from the EGG, using the FoX option included in the Soundswell signal workstation software (Hitech Development, Solna, Sweden). For each of the individual scale tones, F0 was averaged over a series of complete vibrato cycles by means of the histogram option of the same software. In this way, F0 and formant frequency data were obtained, which were as reliable as possible.

FIGURE 1. Example of DeCap display. Upper panel shows the waveform of the filtered signal and the derivative of the EGG. Lower panel shows the input audio spectrum and the spectrum of the filtered flow (gray and heavy curve, respectively). Formant bandwidths are given on an arbitrary scale along the ordinate. The arrows show the formant frequencies and bandwidths used for the inverse filtering. The two parallel curves represent realistic bandwidths according to Fant.15

RESULTS

Listening test
A Pearson correlation was used to test the consistency of the participating experts' ratings. For the individual experts, the mean absolute difference between the first and second ratings of replicated stimuli varied between 4.4% and 11.9% and the correlation between 0.779 and 0.979 (Table 3).

To assess whether there were significant differences between the ratings for the classical and nonclassical examples included in the listening test, a nonparametric paired-sample test, Wilcoxon, was performed. This test was used because the data showed a skewed distribution. Classical versions were scored significantly higher than nonclassical versions for all singers, except for the vowel /u/, as sung by singer 1, and the vowel /a/, as sung by singer 3 (Table 4).

TABLE 3.
Mean and SD of the Absolute Difference Between the First and Second Ratings of Replicated Stimuli and the Pearson Correlation (r) Between These Ratings

        Difference (%)       Correlation
Rater   Mean (%)   SD (%)    r       Slope   Intercept (%)
1       11.9       13.4      0.779   0.830   12.2
2       8.3        10.4      0.931   0.989   0.1
3       9.1        17.3      0.848   0.828   5.4
4       9.6        13.6      0.907   0.898   7.4
5       4.4        5.0       0.979   0.993   0.9
6       7.1        7.9       0.858   0.825   7.3

Abbreviation: SD, standard deviation.

Formant frequencies
Six pairs of scales of the same vowel sung by the same singer in both classical and nonclassical versions were included in the listening test: three on /a/, one on /ae/, and two on /i/. No pairs for the vowel /u/ were chosen because none of the singers managed to sing a nonclassical version of the /u/ scale. To increase the analysis material, however, formant frequency
measurements were made also on two more classical versions of /i/ scales and four classical versions of /u/ scales. Thus, a total of 18 scales were selected for formant frequency analysis.

Figure 2A-C show the formant frequencies obtained. Figure 2A pertains to the vowel /a/, described in our previous report,14 complemented by new data for vowel /ae/. As can be seen in the graph, the mean ratings of the classical versions of these scales exceeded 75% of VAS length, whereas the mean ratings for the nonclassical versions varied between 10% and 32%. Especially in and above the passaggio range, F1 and F2 were lower in the classical version than in the nonclassical version of the /ae/, just as in the /a/. In the classical version, all singers lowered F1 for the top scale tone or scale tones, whereas in the nonclassical version, F1 was just above, just below, or right on H2. F2 was just above, just below, or right on H3 for the top pitches in the classical versions. In these versions, all singers tuned F1 almost exactly to H2, at least in one of the scale tones. However, with few exceptions, neither F1 nor F2 changed markedly between scale tones, thus suggesting that the singers did not attempt to tune these formants to the frequencies of spectrum harmonics of the individual scale tones. In other words, the coincidence between formant and harmonic seemed to occur unintentionally. However, singer 2 tuned his F2 exactly to H3 in /a/, both in the classical and nonclassical versions of his highest note.

Figure 2B shows the F1 and F2 values observed for the vowel /u/, all in the classical versions. Singers 1, 2, 3, and 4 all tuned F1 midway between H1 and H2 for the top pitches. Singers 1 and 3 kept F2 rather constant throughout the scale, with F2 coinciding with H3 on the penultimate note for singer 1 and the top note for singer 3. Singer 6, by contrast, changed F2 in the lower part of the scale such that it was close to H4 and H3. Singer 2 provided a clear example of vocal tract tuning; F1 tracked H2 for the four lowest scale tones, whereas for the higher tones, he tuned F2 to H3.

The F1 and F2 values observed for /i/ are shown in Figure 2C. A common trait is that all singers reduced F2 more or less for the top pitches, both in the classical and nonclassical versions. Singers 1 and 2 produced a similar difference between classical and nonclassical for F1. However, the difference was quite small in the case of singer 1, and, interestingly, also the mean ratings of these examples were not very different, 48% and 78%. Singers 4 and 6 produced clear examples of vocal tract tuning, placing F1 just above H1 for the highest scale tones. Their F2, by contrast, remained basically independent of F0. Singers 1 and 2 increased F1 with increasing F0, singer 2 tuning it almost midway between H1 and H2 in the upper part of the scale.

Singers have been found to tune their formant frequencies with an accuracy of about 20 Hz.16 Therefore, if applying formant tuning, they may not tune F1 exactly to a partial but only to the vicinity of a partial. In the investigation just referred to, the criterion for formant tuning applied was that the distance between the formant and its closest partial should not be wider than 50 Hz. This corresponds to about 2 semitones in the range of the pitch of E4. If singers apply formant tuning with this accuracy, one would expect that the average of this distance would be close to 0 semitone in our material. As formant tuning is assumed to be required for tones in the upper part of the passaggio, this should apply at least to the top note of the scales.9 Formant tuning may concern both F1 and F2. For these reasons, we measured the frequency ratios, expressed in semitones, between these formants and their closest partials. The results are shown in the box plots in Figure 3. For the vowel /a/ in the classical version, the median distance was clearly negative for F1 and small and positive for F2. Thus, F1 was typically lower than its nearest partial. For /i/, F1 and F2 were about 1 semitone above and below the nearest partials, respectively. For the vowel /u/, the median was 0 for both F1 and F2, but the scatter was considerable. For the nonclassical versions, the medians for /a/ and /i/ were close to 0. These results show that depending on the vowel, the formant may be below, above, or right on a partial, although the proximity between F1 and F2 and their closest partials tended to be greater in the nonclassical versions.

TABLE 4.
Med, IQR, in % VAS, and Significance Level (Wilcoxon Test, Z) of the Difference in Mean Ratings Between Scales Intentionally Sung in the Classical and Nonclassical Styles; *Indicate Significant Differences

                 Classical       Nonclassical    Wilcoxon Test
Subject  Vowel   Med     IQR     Med     IQR     Z       P
1        /a/     84.3    15.2    7.9     24.2    2.201   0.028*
1        /u/     80.5    22.4    71.1    37.3    1.214   0.225
2        /a/     83.6    9.8     13.0    28.5    2.201   0.028*
2        /ae/    90.0    10.2    5.0     9.6     2.207   0.027*
3        /a/     34.3    31.1    16.4    33.2    1.363   0.173
3        /u/     80.0    16.1    59.6    32.8    2.201   0.028*
4        /a/     76.1    28.3    28.6    35.2    2.201   0.028*
4        /i/     60.7    25.8    36.4    19.5    2.201   0.028*
5        /u/     42.5    47.5    11.6    12.4    1.992   0.046*
6        /i/     56.6    19.4    37.3    34.6    2.201   0.028*
7        /ae/    30.7    41.3    7.7     16.2    2.201   0.028*
8        /ae/    56.1    28.9    20.0    30.4    2.201   0.028*

Abbreviations: Med, median; IQR, interquartile range.
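The value Z = 2.201 that recurs in the Wilcoxon results can be reproduced by hand. It is what the normal approximation of the signed-rank test gives when all six raters' paired differences have the same sign. The sketch below assumes six paired ratings, no zero or tied differences, and no continuity correction; the article does not state these details, and the differences listed are hypothetical.

```python
# Normal approximation to the Wilcoxon signed-rank test (no tie or continuity correction).
from math import erfc, sqrt

def wilcoxon_z(differences):
    """Return (Z, two-sided p) for paired differences, assuming no zeros and no tied magnitudes."""
    n = len(differences)
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs(differences[i]))
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if differences[i] > 0)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p = erfc(abs(z) / sqrt(2))  # two-sided tail probability of a standard normal
    return z, p

# Hypothetical classical-minus-nonclassical rating differences for six raters, all positive.
diffs = [70.1, 55.3, 81.0, 62.4, 48.9, 76.5]
z, p = wilcoxon_z(diffs)
print(f"Z = {z:.3f}, p = {p:.3f}")  # prints "Z = 2.201, p = 0.028"
```

With six raters this is the largest Z the approximation can produce, which may explain why so many rows share the same value. If the smallest of the six differences has the opposite sign, the same formula gives Z of about 1.992 and p of about 0.046.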
FIGURE 2. A. F1 and F2 observed for the indicated vowels in the classical and nonclassical versions of the scale (solid and dashed curves, respectively). Diagonal dashed lines refer to the frequencies of spectrum partials, calculated as n × MF0, where n is an integer and MF0 is F0 averaged over a set of adjacent complete vibrato cycles. The mean ratings are given in % of VAS length. The graphs for the /a/ vowel are taken from Sundberg et al.14 B. New data, plotted the same way, observed for the vowel /u/. C. New data, plotted the same way for the vowel /i/.

FIGURE 3. Distance between F1 and its closest partial and between F2 and its closest partial for the vowels /a/, /i/, and /u/ averaged across singers for the top pitch of the scale exercise. The frequency of the harmonic was calculated as n × MF0, where n is an integer and MF0 is F0 averaged over a set of adjacent complete vibrato cycles.

It also is relevant to compare the formant tuning data with the mean ratings of how successful the singers were with regard to vocal tract tuning, when they transitioned into the pitches in and above the passaggio in the classical mode of singing. This rating can be assumed to apply to the top note of the scale. Moreover, formant tuning would imply that either F1 or F2, or both, are tuned to the close neighborhood of a partial. Hence, we calculated, for the top tone, the distance between F1 and its nearest partial and the distance between F2 and its nearest partial, and then the minimum of these two values was selected. These minimum values can be compared with the mean ratings in Figure 4. No relationship can be seen with the mean ratings, but almost all values lie within a range of 1.5 semitones from the closest partial. Thus, the ratings seemed independent of the frequency separation between F1 or F2 and their closest partial.
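The distance measure used here can be sketched in a few lines. The code below assumes that the closest partial is the harmonic nearest in Hz and that the distance is signed, positive when the formant lies above the partial. The function names and the example frequencies are hypothetical, not taken from the study.

```python
# Signed distance, in semitones, between a formant and its closest harmonic partial,
# and the smaller of the F1 and F2 distances for one tone.
from math import log2

def nearest_partial(formant_hz: float, f0_hz: float) -> int:
    """Harmonic number n (>= 1) whose frequency n * F0 is closest in Hz to the formant."""
    return max(1, int(formant_hz / f0_hz + 0.5))

def semitone_distance(formant_hz: float, f0_hz: float) -> float:
    """Positive when the formant lies above its closest partial, negative when below."""
    n = nearest_partial(formant_hz, f0_hz)
    return 12.0 * log2(formant_hz / (n * f0_hz))

def min_distance(f1_hz: float, f2_hz: float, f0_hz: float) -> float:
    """Smaller absolute distance of F1 and F2 to their respective closest partials."""
    return min(abs(semitone_distance(f1_hz, f0_hz)), abs(semitone_distance(f2_hz, f0_hz)))

# Hypothetical top tone: F0 = 370 Hz (about F#4), F1 = 600 Hz, F2 = 1100 Hz.
f0 = 370.0
print(f"F1 to H{nearest_partial(600.0, f0)}: {semitone_distance(600.0, f0):+.2f} semitones")
print(f"F2 to H{nearest_partial(1100.0, f0)}: {semitone_distance(1100.0, f0):+.2f} semitones")
print(f"minimum distance: {min_distance(600.0, 1100.0, f0):.2f} semitones")
```

Because the measure is a frequency ratio, a fixed offset in Hz corresponds to fewer semitones at a higher partial than at a lower one.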
FIGURE 4. Mean ratings plotted as function of the minimum distance of F1 and F2 to their closest partial, for the top note of the scales. Filled and open symbols refer to tones intended as sung in classical and nonclassical style.

Flow glottogram waveform
When formants coincide with the frequencies of harmonics, nonlinear source-filter interaction can be expected to occur.11 In such cases, the flow glottogram would show deviations from the typical shape, for example, in terms of a ripple in the closed phase or a dent in the flow pulse. Such interaction would also imply that the source spectrum partial that coincides with a formant may become boosted or attenuated, so that such partials can be expected also to deviate from a smoothly falling source spectrum envelope.

Figure 5 shows a typical set of flow glottograms obtained from the top tones of the scales sung on the vowels /a/, /u/, and /i/. The samples exemplify the situation that F1 is just below, right on, or just above H1, H2, or H3. A small ripple in the closed phase occurred for the /a/ when F1 was 58 Hz above H2 (1.5 semitones), for the /u/ when F2 was 60 Hz above H1 (1.1 semitones), and for the /i/ when F1 was 26 Hz above H1 (1.3 semitones). In the last mentioned case, the ripple did not seem related to F1 being just above a harmonic because the ripple remained also when F1 was right on or just below a harmonic. It cannot be excluded that the ripple was caused by some external resonance; the ripple corresponded to a frequency of about 1200 Hz, and in the vowel /i/, this frequency is located in a spectrum region where the partials are weak, thus allowing an external resonance to have a noticeable effect on the flow glottogram.
FIGURE 5. Flow glottograms for the indicated relations between F0 and its closest harmonic observed for the indicated singers and vowels.

Spectra
In pedagogical practice, spectrum characteristics are sometimes used to provide a visual feedback to the student.9 Figure 6 illustrates how F1 and F2 were tuned for the top pitch of the vowel /a/ in five cases, all rated as good examples of the classical style of singing. All these examples shared the characteristic of a rising audio spectrum envelope over partials 1, 2, and 3. However, this characteristic was achieved by means of different combinations of F1 and F2 and their distances to the closest partials. Singers 1, 2, and 4 placed F1 well above H1 and well below H2, whereas singers 3 and 6 tuned it just below and just above H2, respectively. Also with regard to F2, different strategies can be observed: singers 1, 3, and 6 placed F2 almost right on H3, whereas singers 2 and 4 placed it just below this same partial.

One might imagine that singers possessing a strong singer's formant cluster would tend to tune F1 or F2 to a partial to promote a favorable spectral balance. Singers possessing a weak singer's formant cluster would promote spectrum balance by not tuning a formant to a partial. This would imply that the level difference between the strongest spectrum partial and the singer's formant cluster would show a relationship with the smallest distance between F1 or F2 and its closest partial. This relationship was examined: the coefficient of determination was 0.108. Thus, there was no correlation between formant tuning and the level of the singer's formant cluster.

Synthesis
The top tones were synthesized by means of the singing synthesizer Madde, as mentioned. This allowed for a more detailed examination of the contributions of nonlinear source-filter interaction to the audio signal in these tones because this synthesizer represents a completely linear source-filter model. It simply filters a source spectrum, which has an envelope falling with an adjustable number of dB/octave, and its transfer function is calculated from the given frequencies and bandwidths of the formants. Thus, the differences between the real and the synthesized versions of a vowel would show the contributions from nonlinear factors. In these measurements, spectrum sections were taken at the upper turning point of the vibrato cycle, and F0 of the synthesis was adjusted to that F0 value.

Figure 7 compares the levels of the first 10 audio spectrum partials produced by the singer and by the synthesizer, for four cases, where either F1 or F2 coincided with a spectrum partial. In three of the four cases, the singer's fundamental was weaker than that of the synthesizer. It can be noted that in none of these cases, a particularly great discrepancy appeared for the partial that coincided with a formant (circled in the graph). Thus, the source spectrum seemed to be rather unaffected by the resonator in these cases.
FIGURE 6. Audio and voice source spectra obtained from the DeCap software for the vowel /a/ sung at the top tone of the scale by the indicated singers (refer caption of Figure 1). The bold arrows represent the frequencies and bandwidths of the inverse filters.

Figure 8 shows the audio spectra of these same tones. In the case of the /u/ sung by singer 3, the very strong third partial required that the bandwidth of the inverse filter corresponding to F2 was set to no more than 19 Hz, which is an unrealistically low value. A possible explanation is that this partial was boosted by nonlinear source-filter interaction. For the other vowels, the bandwidths of the inverse filters were all within or very near the realistic limits, represented by the two parallel curves in the DeCap graphs (Figure 1).

It can also be noted that the spectrum peak constituting the singer's formant cluster was produced by clustering formants 3, 4, and 5 in the synthesis. The levels of these partials in the singers' spectra did not differ markedly from those of the synthesis. This implies that by and large, the singer's formant cluster could be completely explained by the classical source-filter theory.

DISCUSSION

This investigation leans heavily on the assumption that inverse filtering is applicable also in the presence of a nonlinear source-filter interaction. There seems to be no reason to doubt this assumption. It is a well-established fact that the transfer function of the vocal tract can be accurately predicted, given the frequencies and the bandwidths of the formants.15 Inverse filtering merely computes this transfer function and filters the signal with the transfer function inverted. The transfer function will not be affected by a nonlinear source-filter interaction. Therefore, the source waveform and spectrum will faithfully reflect the effects of such interaction, provided that realistic formant frequencies and bandwidths were used when adjusting the inverse filter. For this reason, we were careful only to use formant frequencies and, with the one single exception mentioned above, also bandwidths typical of the vowels analyzed. It should also be noted that the inverse filter was adjusted such that the ripple during the closed phase was eliminated. During this phase, the folds prevent coupling of the vocal tract to the subglottal airways. In other words, under these conditions, the resonator system includes only the vocal tract. Moreover, the flow glottograms obtained showed a closed phase that started at the moment of vocal fold contact, as evidenced by the spike of the dEGG waveform. These facts support the assumption that our results represent reliable information and that the inverse filtering data would accurately reflect any significant effects of a nonlinear source-filter interaction.

It is obviously important to compare our findings with those reported by others. Echternach18 converted magnetic resonance imaging data for the vowel /a/ to area functions and calculated the formant frequencies of premiere tenors' passaggio. He found that they progressively lowered F1 with rising F0. Henrich et al16 observed a great intersubject variation in their group of singers, which included both amateurs and professionals. In their male singers, they observed formant tuning in the top of the singers' ranges. We observed some examples of this.

The representativeness of the data is another important aspect of our findings. There are three reasons to assume that
-
Journal of Voice, Vol. 27, No. 3, 2013286our subjects were good
representatives of classically trained op-era singers: (1) the
examples selected for analysis were thosereceiving the highest and
lowest mean ratings as classical inthe listening test, whereby
those receiving the lowest mean rat-ings were regarded as typical
of nonclassical; (2) the subjectswere all singing at
professional/semiprofessional level, someat the Metropolitan opera;
and (3) all singers tried to followthe instructions given to
provide the nonclassical and classicalversions.Given these reasons
for assuming that the data observed are
representative, two main questions need to be considered: (1) is
there a common tuning strategy that male singers use to
successfully navigate through their passaggio and (2) where are the
formants in relation to the partials?

Regarding the first question, the answer is in the negative;
although there were spectral similarities, there was no common
formant tuning strategy that our male singers used to successfully
manage their passaggio. Such variability was also observed by
Henrich et al.16 This suggests that, instead of following a formant
tuning rule that all male singers should apply to negotiate
passaggio notes, singers apply personal strategies, presumably
tailored to their own anatomical-physiological characteristics.

On the other hand, three common denominators were
found for vocal techniques perceived as classical and nonclassical.
First, in the top notes of the classical examples, F1 and F2 were
lower than in the nonclassical examples sung by the same singer,
perhaps due to a lowering of the larynx, lip protrusion, and/or
decreased jaw opening. Second, F1 in /a/ was lowered for the top
notes, thus suggesting that a principle of formant detuning rather
than formant tuning was applied. Third, all top notes on /a/
perceived as clearly representing the classical singing technique
shared the characteristic of a rising spectrum envelope over the
three lowest spectrum partials of the audio signal, which possibly
produces a desirable timbral effect in this style of singing.

FIGURE 7. Levels of the first 10 audio spectrum partials produced
by the indicated singer and by the Madde synthesizer. The circles
indicate the values observed for partials coinciding with either F1
or F2.

It is noteworthy, however, that the sensitivity of the audio
spectrum to the vibrato phase is quite large, particularly when a
formant is tuned to the close proximity of a spectrum partial.
Figure 9 presents an example of a spectrum taken at the peak and at
the valley of the same vibrato cycle for an /a/ sung at the pitch
of F#4. For example, assuming a bandwidth of 50 Hz for the vowel
/a/ and an H2 located 25 Hz below F1, a vibrato peak-to-peak extent
of 5% would lead to a sound level modulation of about 6 dB for that
partial. In other words, the spectrum envelope over the three
lowest spectrum partials may vary greatly over a vibrato cycle. It
is not clear how the criterion of a rising spectrum envelope over
harmonics 1, 2, and 3 is applied under such conditions.
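
The size of the level modulation in the vibrato example can be
checked with a rough calculation. The sketch below is not the
authors' procedure. It assumes a single two-pole resonance for F1
(the usual textbook formant model), takes F#4 as roughly 370 Hz in
equal temperament so that the mean H2 is near 740 Hz, and treats the
5% peak-to-peak extent as plus or minus 2.5% around that mean. The
function name and these numerical choices are illustrative
assumptions.

```python
import math

def resonance_level_db(f_hz, formant_hz, bandwidth_hz):
    """Level in dB of one two-pole formant resonance at f_hz (0 dB at 0 Hz)."""
    half_bw = bandwidth_hz / 2.0
    numerator = formant_hz ** 2 + half_bw ** 2
    denominator = math.sqrt(
        ((f_hz - formant_hz) ** 2 + half_bw ** 2)
        * ((f_hz + formant_hz) ** 2 + half_bw ** 2)
    )
    return 20.0 * math.log10(numerator / denominator)

# Assumption: F#4 is about 370 Hz (equal temperament), so mean H2 is about 740 Hz.
h2_mean = 2 * 370.0
f1 = h2_mean + 25.0   # H2 lies 25 Hz below F1 (value from the text)
bandwidth = 50.0      # formant bandwidth in Hz (value from the text)
extent = 0.05         # vibrato peak-to-peak extent (value from the text)

h2_peak = h2_mean * (1 + extent / 2)    # H2 at the vibrato peak
h2_valley = h2_mean * (1 - extent / 2)  # H2 at the vibrato valley

# Difference in H2 level between the peak and the valley of one vibrato cycle.
modulation_db = (resonance_level_db(h2_peak, f1, bandwidth)
                 - resonance_level_db(h2_valley, f1, bandwidth))
print(f"H2 level modulation over one vibrato cycle: {modulation_db:.1f} dB")
```

Under these assumptions the modulation comes out between 5 and 6 dB,
which is consistent with the approximate figure quoted in the text.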
With regard to the second question, where the formants are in
relation to the partials, it is clear that F1 was never below H1.
This finding is in agreement with earlier studies.1,12,16 On the
other hand, in and above the passaggio range, we found only a few
examples of F1 and/or F2 being tuned to a partial. Mostly, the
formant frequencies remained the same or similar between scale
tones, so coincidence between F1 or F2 and a partial appeared
rather to happen by chance. Thus, our results fail to support the
claim that such tuning is an important principle in the classical
style of singing.7,9,16 With regard to F1, it was tuned to a
frequency well above, just above, or right on H1 in /i/. In /a/ and
/ae/, it was lowered to a frequency below H2 for the highest
pitches, which is in accordance with earlier reports.4,7,18 In /u/,
we found F1 to be tuned midway between H1 and H2, an observation
markedly deviating from results reported by Henrich et al.16 With
respect to F2, it coincided with or was in the vicinity of H3 for
/a/ and /ae/, whereas for /i/ and /u/, it seemed independent of F0
in all singers except one (singer 2). Although Neumann et al7 noted
that F2 in /a/ was tuned to H4 in the high range of the chest
register, we found that this situation happened at single pitches,
when F0 was in the appropriate range for this to happen by
coincidence, that is, it occurred without any marked changes of F2
between scale tones.

FIGURE 8. Audio and source spectra of the tones shown in Figure 5.
The bold arrows represent the frequencies and bandwidths of the
inverse filters (refer caption of Figure 1). In the case of the
bottom left spectrum, the bandwidth of the inverse filter
corresponding to F2 was set to no more than 19 Hz, which is an
unrealistically low value.

FIGURE 9. Audio spectra of the vowel /a/ as sung by singer 3 at the
pitch of F#4 in classical style. The left and right panels were
taken at the peak and at the valley of the vibrato curve,
respectively.

As mentioned, F1 or F2 coincided with a harmonic in several scale
tones. According to Titze,12 this may cause instability because of
nonlinear source-filter interaction, at least when no vertical
phase difference can be seen in the vocal fold vibration, such as
in falsetto register. However, instability was not noted in any of
these cases, presumably because the singers used only modal
register. Apparently, singers are able to avoid instabilities
caused by such interaction, as suggested by Titze.12 A relevant
remaining question is how this can be done.

Nonlinear source-filter interaction is likely to boost certain
partials in the radiated spectrum. Inverse filtering merely implies
that the effects of the vocal tract transfer function on the
radiated sound are eliminated, as mentioned. A nonlinear
source-filter interaction would then manifest itself as an
irregularity of the source spectrum envelope obtained from the
inverse filtering analysis. One example of this was illustrated in
Figure 8, where F2 was almost identical with H3 in the /u/ sung by
singer 3. In the inverse filtering, the very strong H3 was here
compensated by an unrealistically narrow bandwidth of F2. No
similar examples were observed in the classical versions of the
scale. On the other hand, irregularities were often noted in the
source spectrum envelope of the nonclassical versions. This again
raises the question of what tricks singers can use to avoid
nonlinear source-filter interaction.

It is noteworthy that the source spectrum
envelope did not show any irregularities in the region of the
singer's formant cluster. The generation of this spectrum envelope
peak was thus compatible with a normal voice source and the
clustering of F3, F4, and F5 assumed in the inverse filtering
analysis. Such clustering does not seem unrealistic; it was also
observed in an acoustical model of the vocal tract, which contained
a representation of the pyriform sinuses and of the larynx tube
including a laryngeal ventricle. Thus, it seems that a singer's
formant cluster can be produced without the help of nonlinear
source-filter interaction.

Formant tuning, meaning tuning
of formants to partials, must be strongly influenced by F0 and the
normal values of F1 and F2. The male passaggio is limited to the F0
range of approximately 300-400 Hz, and F1 for the vowel /i/ is
typically near 300 Hz. This automatically brings it to the vicinity
of H1 in the passaggio. Likewise, F1 for the vowel /a/ is about
700 Hz, which implies that the distance between F1 and H2 will
automatically be small in the passaggio. It is thought-provoking
that in the classical versions, the singers tended to decrease F1
in the passaggio, thus, contrary to formant tuning, expanding the
separation between F1 and H2. In this case, the term formant
detuning seems more appropriate than formant tuning.
CONCLUSIONS
The main results of the present investigation can be summarized as
follows. (1) The classical and nonclassical styles of singing
differed with respect to formant frequencies in a consistent and
clearly perceptible way, and for all vowels, F1 and F2 tended to be
lower in the classical than in the nonclassical style. This
difference was most pronounced at high F0. (2) In two cases, of a
total of 18, examples of formant tuning were found, occurring at
high pitches for the vowel /i/ and both at low and high F0 for the
vowel /u/. (3) A rising spectrum envelope over the three lowest
partials was a common denominator of the highest tones sung on /a/
and /ae/ in classical style, even though it was produced with
slightly differing combinations of formant frequencies and the
spectrum varied greatly during the vibrato cycle. (4) F1 coincided
with H2 at some scale tones in all singers' classical as well as in
their nonclassical versions of the scale. (5) Almost without
exception, inverse filtering analysis of the tones produced in the
classical style showed no clear signs of a nonlinear source-filter
interaction, neither when F1 or F2 coincided with a spectrum
partial nor when F1 was slightly lower than a partial. Thus, in
most cases, the major characteristics of the spectra produced by
these singers with the classical formant tuning strategy could be
explained by the classical linear source-filter theory of voice
production. On the other hand, in examples produced with a
nonclassical formant tuning strategy, some source spectrum
irregularities were found that may reflect such interaction.
Acknowledgments
The kind cooperation of the singers is gratefully acknowledged. The
authors profited greatly from discussions with Dr Svante Granqvist,
KTH, on inverse filtering and are indebted to the Schering Health
Care Ltd. & Bayer Portugal for providing the means to acquire the
equipment and to the Gomes Teixeira Foundation for support.
REFERENCES
1. Sundberg J. Formant technique in a professional female singer.
   Acustica. 1975;32:89-96.
2. Coffin B. Overtones of Bel Canto. New Brunswick, NJ: Scarecrow
   Press; 1980.
3. Doscher BM. The Functional Unity of the Singing Voice. London,
   UK: The Scarecrow Press, Inc.; 1994.
4. Hertegård S, Gauffin J, Sundberg J. A comparison of subglottal
   and intraoral pressure measurements during phonation. J Voice.
   1990;9:149-155.
5. Hirano M, Vennard W, Ohala J. Regulation of register, pitch and
   intensity of voice. An electromyographic investigation of
   intrinsic laryngeal muscles. Folia Phoniatr. 1970;22:1-20.
6. Miller DG, Schutte HK. Formant tuning in a professional
   baritone. J Voice. 1990;4:231-237.
7. Neumann K, Schunda P, Hoth S, Euler HA. The interplay between
   glottis and vocal tract during the male passaggio. Folia
   Phoniatr Logop. 2005;57:308-327.
8. Henrich N, Kiek M, Smith J, Wolfe J. Resonance strategies used
   in Bulgarian women's singing style: a pilot study. Logoped
   Phoniatr Vocol. 2011;32:171-177.
9. Miller DG. Resonance in Singing. Voice Building Through Acoustic
   Feedback. Princeton, NJ: Inside View Press; 2008.
10. Schutte HK, Miller DG, Duijnstee M. Resonance strategies
    revealed in recorded tenor high notes. Folia Phoniatr Logop.
    2005;57:292-307.
11. Titze IR. A theoretical study of F0-F1 interaction with
    application to resonant speaking and singing voice. J Voice.
    2004;18:292-298.
12. Titze IR. Nonlinear source-filter coupling in phonation:
    theory. J Acoust Soc Am. 2008;123:2733-2749.
13. Titze IR, Worley AS. Modeling source-filter interaction in
    belting and high-pitched operatic male singing. J Acoust Soc
    Am. 2009;126:1530-1540.
14. Sundberg J, Lã FMB, Gill BP. Professional male singers' formant
    strategies for the vowel /a/. Logoped Phoniatr Vocol.
    2011;36:156-167.
15. Fant G. Acoustic Theory of Speech Production. 2nd ed. The
    Hague, The Netherlands: Mouton; 1960.
16. Henrich N, Smith J, Wolfe J. Vocal tract resonances in singing:
    strategies used by sopranos, altos, tenors, and baritones.
    J Acoust Soc Am. 2011;129:1024-1035.
17. Carlsson G, Sundberg J. Formant frequency tuning in singing.
    J Voice. 1992;6:256-260.
18. Echternach M. Untersuchungen zu Registerübergängen bei
    männlichen Stimmen (Investigations of register transitions in
    male voices). Bochum, Germany: Projekt verlag; 2010.