The Science of the Singing Voice Overview of the course (HC16) Winter 2008 Pat Keating, Linguistics, UCLA.

The Science of the Singing Voice

Overview of the course (HC16), Winter 2008

Pat Keating, Linguistics, UCLA

Books

• Johann Sundberg, The Science of the Singing Voice. Northern Illinois University Press (1989)

• Peter Ladefoged, Elements of Acoustic Phonetics. Second edition. University of Chicago Press (1996)

• Richard Miller, The Structure of Singing: System and Art in Vocal Technique. Wadsworth Publishing (2001)

• Richard Miller, National schools of singing: English, French, German, and Italian techniques of singing revisited. Scarecrow Press (2002)

• Garyth Nair, Voice – Tradition and Technology: A State-of-the-Art Studio. With CD. Singular (1999)

• Ingo Titze, Principles of Voice Production (2nd printing 2000)

1. Intro: Sundberg’s demo

• Go to: The ugly voice poster

• But we don’t do any synthesis in the course

F0 and pitch

• Vibration, Hz

• Tuning forks, vocal folds

• Relations of Hz to musical notes and intervals (several websites with these) – see next slide

• Tone generator in Audacity is another way to relate Hz to notes

Frequencies of piano white keys
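A minimal sketch of how Hz relates to note names, assuming equal temperament with A4 = 440 Hz (the slides do not state a tuning reference). The names `NOTE_OFFSETS` and `note_frequency` are made up for this illustration.

```python
# Equal-tempered frequencies of piano white keys.
# Assumption: A4 = 440 Hz; each semitone multiplies frequency by 2**(1/12).

# Semitone distance of each white key from A within the same octave number.
NOTE_OFFSETS = {"C": -9, "D": -7, "E": -5, "F": -4, "G": -2, "A": 0, "B": 2}


def note_frequency(name: str, octave: int, a4_hz: float = 440.0) -> float:
    """Frequency in Hz of a white-key note, e.g. note_frequency("G", 3)."""
    semitones_from_a4 = NOTE_OFFSETS[name] + 12 * (octave - 4)
    return a4_hz * 2 ** (semitones_from_a4 / 12)


for octave in (2, 3, 4, 5):
    row = ", ".join(f"{n}{octave}={note_frequency(n, octave):.1f}" for n in "CDEFGAB")
    print(row)
```

Under this assumption G4 comes out near 392 Hz and C5 near 523 Hz, and each octave doubles the frequency.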

Digital audio

• SR, QR, compression

• File formats – more complicated this year than in 2006!

• We need song clips with a single voice (no instruments or other voices)
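A rough sketch of how sampling rate and sample size set the size of an uncompressed clip. It assumes SR means sampling rate and QR means quantization in bits per sample (the slide does not expand the abbreviations); the function names are hypothetical. The audio-CD values (44.1 kHz, 16-bit, stereo) are the standard ones.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Data rate of uncompressed (PCM) audio, as typically stored in a .wav file."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


def highest_representable_hz(sample_rate_hz: int) -> float:
    """Nyquist limit: frequencies above half the sampling rate cannot be captured."""
    return sample_rate_hz / 2


cd_rate = pcm_bytes_per_second(44100, 16, 2)          # audio-CD format
mono_clip = pcm_bytes_per_second(44100, 16, 1) * 30   # a 30-second mono clip
print(cd_rate, "bytes per second for CD-quality stereo")
print(mono_clip, "bytes for a 30 s mono clip, before any compression")
print(highest_representable_hz(44100), "Hz upper frequency limit")
```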

Review questions

1. Which tuning fork has the higher-sounding pitch, 392 Hz or 523 Hz?

2. What part of the body produces the fundamental frequency of the voice?

3. The frequency of the note G2 is 98 Hz. What is the frequency of G3?

4. Is a song on an audio CD or an mp3 player in .wav format?

Lab 1: audio clips

• Ripping CD tracks to .wav (CDex, CLICC)

• Ripping audio from commercial DVDs (DVDFab Decrypter to AnyAudioConverter)

• Ripping audio from YouTube videos (Freecorder)

• Saving .mp3 and various other audio formats as .wav (Audacity, CDex, AudioConverter)

• Splitting and saving mono tracks from stereo (Audacity)

• File clips kept on our ecampus Discussion Board

Examples

• From Worst of AI DVD:

• From AI on YouTube:

2. From Sundberg

• How do you experience your own voice?

• Why does a recording of your voice sound different to you?

• And, why do you sound better in the shower?

Pitch

• Semitone = about 6% frequency difference

• “in tune”: how close to target is close enough (about 20 cents for average listener)

• “in tune”: steadiness

• Transitions between notes: swooping
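The 6% and 20-cent figures can be checked with a short sketch. It assumes equal temperament (100 cents per semitone, 1200 per octave); `cents_off` and `is_in_tune` are illustrative names, and the 20-cent default is the "average listener" tolerance from the slide.

```python
import math

SEMITONE_RATIO = 2 ** (1 / 12)  # equal-tempered semitone, about 1.0595


def cents_off(sung_hz: float, target_hz: float) -> float:
    """Signed distance from the target in cents (positive = sharp)."""
    return 1200 * math.log2(sung_hz / target_hz)


def is_in_tune(sung_hz: float, target_hz: float, tolerance_cents: float = 20.0) -> bool:
    """True if the sung frequency is within the tolerance of the target."""
    return abs(cents_off(sung_hz, target_hz)) <= tolerance_cents


print(f"one semitone = {(SEMITONE_RATIO - 1) * 100:.1f}% frequency difference")
print(f"445 Hz vs A4 (440 Hz): {cents_off(445, 440):.1f} cents, in tune: {is_in_tune(445, 440)}")
print(f"450 Hz vs A4 (440 Hz): {cents_off(450, 440):.1f} cents, in tune: {is_in_tune(450, 440)}")
```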

Example: steadiness, swooping

Vibratos

• Dimensions of vibrato

– Rate, range, amplitude vibrato

• Supposed good classical vibrato

– 5.5 to 7 Hz rate, ±0.5 to 2 semitones range

• What good a vibrato does, doesn’t do for the singer

• Examples next slides

Example: D. Fischer-Dieskau

Example: Leontyne Price

Example: Joan Baez

Example: Kelly Clarkson

Lab 2 and Assn 1: vibratos

• Pitchworks, wavesurfer

• Measuring F0 from pitchtrack

• Calculating vibrato properties
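One way the vibrato calculation could be done from a pitch track is sketched below. This is not the procedure used in Pitchworks or wavesurfer; it assumes the F0 values have already been exported at a fixed frame rate, and it estimates rate from upward crossings of the mean pitch and range as half the peak-to-peak excursion in semitones. The function name and the synthetic track are made up.

```python
import math


def vibrato_properties(f0_track_hz, frame_rate_hz):
    """Estimate vibrato rate (Hz) and extent (+/- semitones) from an F0 track."""
    # Work in semitones so that the extent is a musical interval, not a Hz difference.
    semitones = [12 * math.log2(f / f0_track_hz[0]) for f in f0_track_hz]
    center = sum(semitones) / len(semitones)
    deviation = [s - center for s in semitones]

    # Times of upward crossings of the center line, linearly interpolated.
    crossings = []
    for i in range(1, len(deviation)):
        before, after = deviation[i - 1], deviation[i]
        if before < 0 <= after:
            fraction = -before / (after - before)
            crossings.append((i - 1 + fraction) / frame_rate_hz)
    if len(crossings) < 2:
        raise ValueError("need at least one full vibrato cycle")

    rate_hz = (len(crossings) - 1) / (crossings[-1] - crossings[0])
    extent_semitones = (max(deviation) - min(deviation)) / 2
    return rate_hz, extent_semitones


# Synthetic pitch track: 440 Hz with a 6 Hz vibrato of +/- 1 semitone,
# sampled every 10 ms (all values are invented for the demonstration).
frame_rate = 100
track = [440 * 2 ** (math.sin(2 * math.pi * 6 * i / frame_rate) / 12) for i in range(200)]
rate, extent = vibrato_properties(track, frame_rate)
print(f"rate = {rate:.2f} Hz, extent = +/- {extent:.2f} semitones")
```

Real pitch tracks are noisier than this synthetic one, so some smoothing before counting crossings would probably be needed.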

Tricks in pitchtracking

• Hardest part: keeping track of F0 range and optimizing option settings

• Tuning forks and thin voices: don’t use cepstral method, use autocorrelation

• Problems tracking trills and other fast F0 changes: need to change step size and/or window length
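To show why the F0 range setting matters so much, here is a bare-bones autocorrelation pitch estimator. It is a sketch of the general technique only, not the algorithm in any particular program; the function name, the synthetic signal, and the 150-400 Hz search range are assumptions chosen for the example.

```python
import math


def autocorrelation_f0(samples, sample_rate_hz, f0_min_hz, f0_max_hz):
    """Estimate F0 as the lag with the highest normalized autocorrelation."""
    n = len(samples)
    # The F0 search range translates directly into a range of candidate lags.
    min_lag = max(1, int(sample_rate_hz / f0_max_hz))
    max_lag = min(int(sample_rate_hz / f0_min_hz), n - 2)
    best_lag, best_score = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        head, tail = samples[: n - lag], samples[lag:]
        energy = math.sqrt(sum(x * x for x in head) * sum(y * y for y in tail))
        score = sum(x * y for x, y in zip(head, tail)) / energy if energy else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate_hz / best_lag


# Synthetic 200 Hz "voice" with three harmonics, 50 ms long (one analysis window).
sample_rate = 8000
signal = [
    sum(math.sin(2 * math.pi * 200 * h * i / sample_rate) / h for h in (1, 2, 3))
    for i in range(400)
]
estimate = autocorrelation_f0(signal, sample_rate, 150, 400)
print(f"estimated F0 = {estimate:.1f} Hz")
```

If the search range were widened to include 100 Hz, the lag of two periods would match the signal just as well as the lag of one period, which is how octave errors arise. The window also has to be long enough to hold the longest lag searched, while fast F0 changes such as trills favor shorter windows and smaller steps.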

3. Larynx and phonation

• Laryngeal anatomy: physical model, “Vocal Parts” CD, ASA and Painter videotapes, Youtube videos, (DVDs about source and about phonation)

• Mechanisms of vocal fold vibration

• F0 variation with airflow means pitch and loudness are correlated, which singers need to learn to decouple

4. Spectrum

• The voice source: F0 and overtones

• Line spectrum of source

• FFT of output in Audacity, wavesurfer

• DVD “Human Speech”: a key point of this is that speed of closing of vocal folds determines strength of higher harmonics and thus the brightness of the voice

Partials, overtones?

• Partials = harmonics

• Overtones = partials above F0

Lab 3 and Assn 2: FFT

• FFT, LTAS in Pitchworks or wavesurfer

• FFT in Audacity: View-Plot spectrum (nice for comparing effect of window length; shows musical note of F0)

• Pros, cons of Audacity vs Pworks/wavesurf

• Comparing spectra of different voice qualities by strength of H1, number of harmonics, extent of high-freq energy
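The idea of comparing harmonic strengths can be sketched without any audio software. The block below does not use an FFT library; it evaluates the discrete Fourier transform directly at each harmonic frequency, which gives the same values an FFT would at those points when the window holds a whole number of periods. The function name and the synthetic signal (each harmonic half the amplitude of the one below it) are invented for the example.

```python
import math


def harmonic_levels_db(samples, sample_rate_hz, f0_hz, n_harmonics):
    """Level of each harmonic in dB relative to H1, from a direct DFT projection."""
    n = len(samples)
    amplitudes = []
    for k in range(1, n_harmonics + 1):
        w = 2 * math.pi * k * f0_hz / sample_rate_hz
        re = sum(s * math.cos(w * i) for i, s in enumerate(samples))
        im = sum(s * math.sin(w * i) for i, s in enumerate(samples))
        # Floor the amplitude so that a missing harmonic does not give log10(0).
        amplitudes.append(max(2 * math.hypot(re, im) / n, 1e-12))
    return [20 * math.log10(a / amplitudes[0]) for a in amplitudes]


# Synthetic source: F0 = 200 Hz, harmonic k has amplitude 1 / 2**(k - 1).
sample_rate = 8000
signal = [
    sum(math.sin(2 * math.pi * 200 * k * i / sample_rate) / 2 ** (k - 1) for k in (1, 2, 3, 4))
    for i in range(400)  # exactly 10 periods, so no spectral leakage
]
levels = harmonic_levels_db(signal, sample_rate, 200, 4)
print([f"H{k + 1}: {level:.1f} dB" for k, level in enumerate(levels)])
```

A voice with weaker upper harmonics would show levels that fall off faster than the roughly 6 dB per harmonic built into this synthetic signal.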

5. Resonances

• From Ladefoged on resonance

• Basic source-filter idea

• More of Source-Filter DVD, on filter

• Vowel “covering”: lowering the frequencies of front vowel resonances so that brightness is more matched across vowels

Singers formant

• “Singers formant”: extra energy around 3000 Hz (Sundberg says 2300-3000 Hz for basses, 3000-3800 for tenors), which allows a solo voice to stand out against an orchestra, or other singers

• Sopranos don’t much need a singers formant against an orchestra, because any note above about B4 will stand out by itself. Similarly for amplified singers.

Singers formant

• Not an additional formant, but a clustering of F3, F4, F5; when they are close together in frequency their strengths are mutually enhanced and they give one broad strong spectral peak

• Male singers: enlarge the ventricle (just above the larynx), lower the larynx

• It is not known how altos (or sopranos, if they have one) produce their singers formant

Miller: singers formant

Example: Fischer-Dieskau (last vowel)

Speakers formant

• More like at 3500 Hz than 3000

• Property of speaking voices judged to be good

• Seen in some singing voices, especially in styles that are more like speaking (e.g. country)

Lab 4 and Assn 3: Singers formant

• Looking at own voice and at recordings to see if there is a singers formant

• trying to increase singers formant in own voice

• Emphasized looking at /o/, /u/, where higher formants are expected to be weak so any enhancement will be unambiguous

6. Vowel formants and F0

• Average formant frequencies for different English vowels

• a strong soprano voice matches F0 (H1) to F1, while a weak voice has no formant near F0

• [Good illustration of this on DVD: the good voice and the bad voice samples]

• Sundberg says that tuning F0 to F1 can add up to 30dB to the sound level

• [other strategies in other ranges: Pavarotti’s tenor tuning of F1 to H2 in chest voice, F2 to H3 or H4 on high notes]

When F0 is above F1

• F0 > F1 for many soprano notes

• F1 cannot match F0, so H1 can’t be boosted by a resonance

• vowel qualities are indistinct because F1 is not excited

• trained singers tend to adjust the vowel quality so that the F1 moves up, in the direction of F0
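A small sketch of the harmonic-versus-formant comparison behind these points. The F1 values used (310 Hz and 700 Hz) are placeholders for a close and an open vowel, not measurements from the course, and the function names are hypothetical.

```python
def nearest_harmonic(f0_hz: float, formant_hz: float):
    """Return (harmonic number, harmonic frequency) closest to a formant."""
    number = max(1, round(formant_hz / f0_hz))
    return number, number * f0_hz


def f0_exceeds_f1(f0_hz: float, f1_hz: float) -> bool:
    """True when even the lowest harmonic lies above the first formant."""
    return f0_hz > f1_hz


# Placeholder F1 values for illustration only.
for f0, f1 in [(220.0, 700.0), (880.0, 700.0), (698.0, 310.0)]:
    number, freq = nearest_harmonic(f0, f1)
    note = "F0 is above F1" if f0_exceeds_f1(f0, f1) else "a harmonic can sit near F1"
    print(f"F0={f0:.0f} Hz, F1={f1:.0f} Hz: nearest is H{number} at {freq:.0f} Hz ({note})")
```

When F0 is above F1 the nearest harmonic is always H1, so the only way to bring a resonance onto it is to raise F1.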

F1 and F0

• F1 is raised by opening the mouth more, or shortening the vocal tract (e.g. smiling)

• YouTube videos of Queen of the Night aria singers and their mouth contortions on the high notes

Sundberg: F1 tuning when F0>F1

The soprano challenge

• A few years ago a study of this effect, explicitly testing what Sundberg had said, got a lot of publicity:

http://www.phys.unsw.edu.au/~jw/soprane.html

• They found that a trained soprano singing above about 440 Hz tuned every vowel’s F1 to the F0, where formants were determined by reflection


Dani and Shri at USC – MRI study of vocal tract adjustments that cause these formant shifts

Assn 4: F1 tuning

• Happy Birthday when sung from F4 to F5: not a good match between F0s and F1s

• Assignment was to write new lyrics that would give a better match to my vowel formants in this key

• Full credit for nonsense, but a prize promised for best meaningful lyrics

• Some wild-card vowels allowed where F0 was not near any F1 of mine

The winner

Yay today yay hurray

yay today yay is in

Today (na-me) is a-age

(A-a-age), spring chickin.

Lab 5: a total bust

• Tried to watch video en masse in CLICC

• Had planned to make EGG recordings

Guest lecture

• Gerry Berke from Head & Neck Surgery on their research on neuromuscular control of F0, on vocal pathology, and on care of the voice

7. Consonants

• 2 chapters each in Miller, Nair, on different aspects of consonants in singing

• Miller: oral agility for rapid consonant production

7. Consonants

• Voiced vs. voiceless consonants

• Effects of voiceless consonants on melodic line

• Effect of C voicing on vowel F0

• Lyricist’s choice of consonants already affects the song, independent of artist’s interpretation

Sondheim lyrics example

• Bernadette Peters, Not a day goes by

Consonant “resonance”

• Nair: More vs less sonorous (vowel-like) consonants (“consovowels”) as seen in the narrowband spectrogram

• Consonant duration

• Using consonant articulation artistically, e.g. for emotion

Example: lyrics + articulation

• Bernadette Peters again, 2 clips

Example: lyrics + articulation

• Melinda Doolittle vs. Gregorian chant

Lab 6 and Assn 5: consonants

• Listening to, looking at, and making consonants in different ways

8. Vocal warm-ups

• Titze explains warm-up exercises in terms of bringing all systems up gradually

• Acoustic loading for respiratory warm-up

– increase the acoustic loading on the vocal folds with humming, trills, singing into a straw - lets the vocal folds vibrate with more abduction, and with overall lower Ps for an easy start

– increase F0 so that Ps must increase

• Fun with straws

9. EGG

• Ch. 13 in Nair (1999) = “The Use of the Electroglottograph in the Voice Studio” by D. Miller and H. K. Schutte

• “one of the primary aims of training the classical singing voice will be to establish the habit of complete and abrupt closure, at least in mezzo forte and forte”

• Seeing this in the EGG waveform

Falsetto vs chest voice on [i]: little contact in falsetto

Lab 7 and Assn 6: EGG

• We made individual EGG recordings of students’ voices

• Assn 6 on EGG analysis

Lab 7: webpages

• The course requires a term project, which is presented as a webpage visible to the whole class

• This year the webpages were by default on Googlepages (linked from, but not on, the ecampus site)

• In-class instruction on using Googlepages by our ITC

10. Aerodynamics

• Normal breathing: about 0.5 liters 12 times/minute, with active inspiration and passive expiration.

– Muscles of expansion: external intercostals, diaphragm

• As in speech, in singing expiration is actively controlled, first by holding it back, then by increasing it

– Muscles of contraction: internal intercostals, abs

Breathing in singing

• Trained singers take much longer breaths, and more total air in a breath. More of the air in the lungs is exhaled by professional singers.

• Trained singers have lower airflow rates in singing than do untrained singers, but the same airflow rates in speech.

• Trained singers thus have more efficient phonation: they use less air to get strong vocal fold vibrations.

Sundberg: airflow vs. pitch

sound level (S), subglottal pressure (P) and oral airflow (A) from a professional singer’s ascending scale, showing that pressure increases a lot as pitch increases, even when airflow is fairly constant and sound level increases only somewhat

Air pressure in singing

• Classically trained singers have lower subglottal pressures than do untrained singers, and these pressures are lower in speech as well as singing.

• In singing, subglottal pressure is higher for louder phonation and for higher pitches: A doubling of subglottal pressure gives about a doubling in loudness, and subglottal pressure also about doubles when F0 doubles.

Sundberg: Ps vs. pitch

the clear relation of loudness, pressure and pitch in these quicker triads

The flow glottogram

• Ug, from inverse filtering of Uo signal

2 key aspects of the flow glottogram

1. the maximum amplitude of the flow is directly proportional to H1, the amplitude (in the source, not in the output) of the fundamental component

and this affects the perceived “strength” of the voice, though not necessarily its overall loudness, which instead depends on the strongest partial

2. the maximum closing rate is proportional to the amplitudes of the overtones
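The two measures can be read off a sampled flow waveform as in the sketch below. The triangular pulses are a crude stand-in for a real flow glottogram, and all numbers (0.3 L/s peak flow, 10 kHz sampling, pulse shapes) are invented; the point is only that two pulses with the same peak flow can differ a lot in closing rate.

```python
def flow_measures(flow, sample_rate_hz):
    """Return (peak-to-peak flow amplitude, maximum closing rate) for a sampled Ug."""
    amplitude = max(flow) - min(flow)
    # Closing = flow decreasing; take the steepest negative slope, reported as positive.
    steepest = min(b - a for a, b in zip(flow, flow[1:]))
    return amplitude, -steepest * sample_rate_hz


def triangular_pulse(peak, opening_samples, closing_samples, closed_samples):
    """One glottal cycle: linear opening, linear closing, then a closed phase."""
    rise = [peak * i / opening_samples for i in range(opening_samples)]
    fall = [peak * (1 - i / closing_samples) for i in range(closing_samples)]
    return rise + fall + [0.0] * closed_samples


sample_rate = 10000  # 100 samples per cycle, i.e. F0 = 100 Hz
gradual = triangular_pulse(0.3, 50, 40, 10)  # slow closing
abrupt = triangular_pulse(0.3, 50, 10, 40)   # fast closing, long closed phase

for label, pulse in (("gradual", gradual), ("abrupt", abrupt)):
    amplitude, closing_rate = flow_measures(pulse, sample_rate)
    print(f"{label}: peak flow {amplitude:.2f} L/s, max closing rate {closing_rate:.0f} L/s per s")
```

Following the two points above, both pulses would have a similar H1 in the source, while the abrupt one would have stronger overtones.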

Breathy phonation

• the glottis is somewhat abducted without complete closure

• so some air flows through continuously, and the maximum flow is quite high

• high airflow = a strong H1 in the source

• High airflow also = high-frequency noise

• Slower closing rate = lower-energy higher partials, which are then covered by noise

Pressed phonation

• The glottis is more adducted than normal

• So stronger lung pressure is needed to get vibration

• But the small and brief glottal opening means that little air flows through

• Lower Ug means a weaker H1

• Closing is usually more abrupt, so higher partials are stronger

Sundberg’s flow phonation

• The sweet spot: the most abducted glottis that will still give complete closure

• Most abducted, to give highest flow and thus strongest H1

• H1 in flow phonation can be 15 dB or more greater than in pressed phonation

• Complete closure, to reduce glottal noise and to strengthen higher partials

Loudness control with

a. phonation: the right amount of vocal fold adduction (Sundberg’s flow phonation)

b. the vocal tract: formant tuning, singers formant

c. lung pressure: higher pressure and higher airflow through the glottis. The power of the glottal source increases by 6 dB for every doubling of the lung pressure
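The 6 dB per doubling rule in (c) can be turned into a one-line calculation. This sketch simply applies that rule as stated; the pressure values in cm H2O are made-up examples and `source_gain_db` is an illustrative name.

```python
import math


def source_gain_db(pressure_before: float, pressure_after: float,
                   db_per_doubling: float = 6.0) -> float:
    """Change in glottal source power (dB) for a change in lung pressure."""
    return db_per_doubling * math.log2(pressure_after / pressure_before)


# Hypothetical lung pressures in cm H2O.
print(source_gain_db(10, 20))  # one doubling
print(source_gain_db(10, 40))  # two doublings
print(source_gain_db(20, 10))  # halving the pressure
```

Any pressure unit works as long as both values use the same one, since only the ratio matters.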

Lab 8: aero

• Pressure and flow recording by each student

• Did they show the relation of Ps, Uo, and F0 (with relatively fixed loudness) as in the Sundberg example figures?

Exam week: project presentations

• During the scheduled exam period, students gave 5 minute overviews of their projects to the class, displaying their webpages, which were not due until the end of that day
