ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Dec 17, 2015

Page 1: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

ECE 598: The Speech Chain

Lecture 8: Formant Transitions; Vocal Tract Transfer Function

Page 2: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Today

• Perturbation Theory: A different way to estimate vocal tract resonant frequencies, useful for consonant transitions
• Syllable-Final Consonants: Formant Transitions
• Vocal Tract Transfer Function
  - Uniform Tube (Quarter-Wave Resonator)
  - During Vowels: All-Pole Spectrum
    - Q
    - Bandwidth
  - Nasal Vowels: Sum of two transfer functions gives spectral zeros

Page 3: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Topic #1: Perturbation Theory

Page 4: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Perturbation Theory (Chiba and Kajiyama, The Vowel, 1940)

A(x) is constant everywhere, except for one small perturbation.

Method:
1. Compute formants of the “unperturbed” vocal tract.
2. Perturb the formant frequencies to match the area perturbation.

Page 5: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Conservation of Energy Under Perturbation

Page 6: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Conservation of Energy Under Perturbation

Page 7: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

“Sensitivity” Functions

Page 8: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Sensitivity Functions for the Quarter-Wave Resonator (Lips Open)

L

/AA/ /ER/ /IY/ /W/

• Note: low F3 of /er/ is caused in part by a side branch under the tongue – perturbation alone is not enough to explain it.
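The sensitivity idea can be sketched numerically. The block below is a minimal illustration, not the lecture's own derivation: it assumes the standard quarter-wave mode shapes (velocity proportional to sin, pressure proportional to cos, glottis at x = 0 and lips at x = L) and the first-order perturbation result dF_n/F_n = (1/L) ∫ S_n(x) (ΔA/A) dx with S_n = sin² − cos². The tube length, perturbation width, and area change are hypothetical values chosen for illustration.

```python
import math

def sensitivity(n, x, L):
    """Sensitivity of formant n to an area increase at position x.

    Assumption: quarter-wave tube, closed at the glottis (x = 0) and open at
    the lips (x = L), so k_n = (2n - 1) * pi / (2L). The value is the
    normalised kinetic-minus-potential energy density, sin^2 - cos^2.
    """
    k = (2 * n - 1) * math.pi / (2 * L)
    return math.sin(k * x) ** 2 - math.cos(k * x) ** 2

def relative_shift(n, L, x0, width, dA_over_A, steps=2000):
    """Approximate dF_n/F_n for a rectangular area perturbation.

    The perturbation is centred at x0 and clipped to the tube [0, L];
    the integral is evaluated with the midpoint rule.
    """
    a = max(0.0, x0 - width / 2)
    b = min(L, x0 + width / 2)
    h = (b - a) / steps
    integral = sum(sensitivity(n, a + (i + 0.5) * h, L) for i in range(steps)) * h
    return dA_over_A * integral / L

L = 0.175  # hypothetical vocal tract length in metres

# A constriction (negative area change) at the lips, where F1 has a velocity peak.
squeeze_at_lips = relative_shift(1, L, x0=L, width=0.02, dA_over_A=-0.5)
# The same constriction at the glottis, where F1 has a pressure peak.
squeeze_at_glottis = relative_shift(1, L, x0=0.0, width=0.02, dA_over_A=-0.5)
# Scaling the whole tube area uniformly should leave F1 unchanged.
uniform_change = relative_shift(1, L, x0=L / 2, width=L, dA_over_A=0.1)

print("squeeze at lips:   ", squeeze_at_lips)
print("squeeze at glottis:", squeeze_at_glottis)
print("uniform change:    ", uniform_change)
```

Under these assumptions the sign of the shift follows the rule used in the rest of the lecture: a squeeze near a velocity peak lowers the formant, and a squeeze near a pressure peak raises it.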

Page 9: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Sensitivity Functions for the Half-Wave Resonator (Lips Rounded)

L

/L,OW/ /UW/

• Note: high F3 of /l/ is caused in part by a side branch above the tongue – perturbation alone is not enough to explain it.

Page 10: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Formant Frequencies of Vowels

From Peterson & Barney, 1952

Page 11: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Topic #2: Formant Transitions, Syllable-Final Consonant

Page 12: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Events in the Closure of a Nasal Consonant

Vowel Nasalization

Formant Transitions

Nasal Murmur

Page 13: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Formant Transitions: A Perturbation Theory Model

Page 14: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Formant Transitions: Labial Consonants

“the mom”

“the bug”

Page 15: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Formant Transitions: Alveolar Consonants

“the tug”

“the supper”

Page 16: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Formant Transitions: Post-alveolar Consonants

“the shoe”

“the zsazsa”

Page 17: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Formant Transitions: Velar Consonants

“the gut”

“sing a song”

Page 18: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Topic #3: Vocal Tract Transfer Functions

Page 19: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Transfer Function

• “Transfer Function”: $T(\omega) = \mathrm{Output}(\omega)/\mathrm{Input}(\omega)$
• In speech, it’s convenient to write $T(\omega) = U_L(\omega)/U_G(\omega)$
  - $U_L(\omega)$ = volume velocity at the lips
  - $U_G(\omega)$ = volume velocity at the glottis
  - $T(0) = 1$
• Speech recorded at a microphone = pressure: $P_R(\omega) = R(\omega)\,T(\omega)\,U_G(\omega)$
  - $R(\omega) = j\rho f/r$ = “radiation characteristic”
  - $\rho$ = density of air
  - $r$ = distance to the microphone
  - $f$ = frequency in Hertz

Page 20: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Transfer Function of an Ideal Uniform Tube

• Ideal Terminations:
  - Reflection coefficient at glottis = 1 (zero velocity)
  - Reflection coefficient at lips = -1 (zero pressure)
  - Obviously, this is an approximation, but it gives…
• $T(\omega) = \dfrac{1}{\cos(\omega L/c)} = \dfrac{\omega_1^2\,\omega_2^2\,\omega_3^2\cdots}{(\omega_1^2-\omega^2)(\omega_2^2-\omega^2)(\omega_3^2-\omega^2)\cdots}$
• $\omega_n = n\pi c/L - \pi c/2L$
• $F_n = nc/2L - c/4L$
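As a quick numerical illustration of the resonance formula $F_n = nc/2L - c/4L$, here is a minimal sketch. The tube length (17.5 cm) and speed of sound (350 m/s) are assumed values chosen for illustration, and `uniform_tube_formants` is a hypothetical helper name.

```python
def uniform_tube_formants(L, c=350.0, count=3):
    """Resonances of an ideal tube closed at the glottis and open at the lips.

    L is the tube length in metres and c the speed of sound in m/s
    (both assumed values, not taken from the lecture).
    Implements F_n = n*c/(2L) - c/(4L) for n = 1..count.
    """
    return [n * c / (2 * L) - c / (4 * L) for n in range(1, count + 1)]

formants = uniform_tube_formants(0.175)
for n, F in enumerate(formants, start=1):
    print(f"F{n} = {F:.0f} Hz")
```

With these assumed values the resonances fall at odd multiples of c/4L, which is why the tube is called a quarter-wave resonator.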

Page 21: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Transfer Function of an Ideal Uniform Tube

Peaks are actually infinite in height (figure is clipped to fit the display)

Page 22: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Transfer Function of a Non-Ideal Uniform Tube

• Almost ideal terminations:
  - At glottis: velocity almost zero, reflection coefficient ≈ 1
  - At lips: pressure almost zero, reflection coefficient ≈ -1
• $T(\omega) = \dfrac{1}{j/Q + \cos(\omega L/c)}$
• … at $F_n = nc/2L - c/4L$, …
  - $T(2\pi F_n) = -jQ$
  - $20\log_{10}|T(2\pi F_n)| = 20\log_{10} Q$
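The peak height of the non-ideal tube can be checked directly from $T(\omega) = 1/(j/Q + \cos(\omega L/c))$. The sketch below assumes illustrative values (L = 17.5 cm, c = 350 m/s, Q = 10) that are not from the lecture; `tube_transfer` is a hypothetical helper name.

```python
import math

def tube_transfer(f, L=0.175, c=350.0, Q=10.0):
    """Evaluate T = 1 / (j/Q + cos(omega * L / c)) at frequency f in Hz.

    L, c and Q are assumed values for illustration only.
    """
    omega = 2 * math.pi * f
    return 1.0 / (1j / Q + math.cos(omega * L / c))

# First resonance of the assumed tube: F_1 = c / (4L).
F1 = 350.0 / (4 * 0.175)

peak = tube_transfer(F1)
peak_db = 20 * math.log10(abs(peak))

print("T at F1:", peak)          # close to -jQ, because the cosine term vanishes
print("level at F1 (dB):", peak_db)
```

At a resonance the cosine term is zero, so only the loss term j/Q remains in the denominator and the peak is finite rather than infinite.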

Page 23: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Transfer Function of a Non-Ideal Uniform Tube

Page 24: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Transfer Function of a Vowel: Height of First Peak is $Q_1 = F_1/B_1$

$T(\omega) = \prod_{n=1}^{\infty} \dfrac{(2\pi F_n)^2 + (\pi B_n)^2}{(j\omega + j2\pi F_n + \pi B_n)(j\omega - j2\pi F_n + \pi B_n)}$

$T(2\pi F_1) \approx \dfrac{(2\pi F_1)^2}{j4\pi F_1 \cdot \pi B_1} = -jF_1/B_1$

Call $Q_n = F_n/B_n$

$T(2\pi F_1) \approx -jQ_1$

$20\log_{10}|T(2\pi F_1)| \approx 20\log_{10} Q_1$
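The all-pole product can be evaluated numerically to check both $T(0) = 1$ and the peak height. This is a minimal sketch that truncates the product to a finite list of formants; `vowel_transfer` is a hypothetical helper name, and F1 = 500 Hz with B1 = 80 Hz are the values used in the lecture's synthetic example.

```python
import math

def vowel_transfer(f, formants, bandwidths):
    """Evaluate the all-pole transfer function at frequency f (Hz).

    Each formant contributes |s_n|^2 / ((jw - s_n)(jw - s_n*)),
    with s_n = -pi*B_n + j*2*pi*F_n. The product is truncated to the
    formants supplied (an assumption of this sketch).
    """
    jw = 1j * 2 * math.pi * f
    T = 1.0 + 0j
    for F, B in zip(formants, bandwidths):
        s_n = complex(-math.pi * B, 2 * math.pi * F)
        T *= (s_n * s_n.conjugate()) / ((jw - s_n) * (jw - s_n.conjugate()))
    return T

F1, B1 = 500.0, 80.0
Q1 = F1 / B1

T_dc = vowel_transfer(0.0, [F1], [B1])    # should equal 1
T_peak = vowel_transfer(F1, [F1], [B1])   # should be close to -j*Q1

print("T(0) =", T_dc)
print("T at F1 =", T_peak, " Q1 =", Q1)
```

With a single formant the peak magnitude is close to Q1; adding higher formants to the lists changes the level at F1 somewhat, since each extra factor is near but not exactly 1 there.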

Page 25: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Transfer Function of a Vowel: Bandwidth of a Peak is $B_n$

$T(\omega) = \prod_{n=1}^{\infty} \dfrac{(2\pi F_n)^2 + (\pi B_n)^2}{(j\omega + j2\pi F_n + \pi B_n)(j\omega - j2\pi F_n + \pi B_n)}$

$T(2\pi F_1 + \pi B_1) \approx \dfrac{(2\pi F_1)^2}{(j4\pi F_1)(\pi B_1 + j\pi B_1)} = \dfrac{-jQ_1}{1+j}$

At $f = F_1 + 0.5B_1$, $|T(\omega)| \approx Q_1/\sqrt{2}$, so $|T(\omega)|^2 \approx 0.5\,Q_1^2$

$20\log_{10}|T(\omega)| \approx 20\log_{10} Q_1 - 3$ dB
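The 3 dB bandwidth claim can be checked on a single resonance. The sketch below uses F1 = 500 Hz and B1 = 80 Hz (the values from the lecture's synthetic example) and measures how far the level drops at F1 ± B1/2; `resonance_db` is a hypothetical helper name.

```python
import math

def resonance_db(f, F, B):
    """Level in dB of one pole pair, |s_n|^2 / ((jw - s_n)(jw - s_n*)), at f Hz."""
    jw = 1j * 2 * math.pi * f
    s_n = complex(-math.pi * B, 2 * math.pi * F)
    T = (s_n * s_n.conjugate()) / ((jw - s_n) * (jw - s_n.conjugate()))
    return 20 * math.log10(abs(T))

F1, B1 = 500.0, 80.0

peak_db = resonance_db(F1, F1, B1)
upper_drop = peak_db - resonance_db(F1 + B1 / 2, F1, B1)
lower_drop = peak_db - resonance_db(F1 - B1 / 2, F1, B1)

print("drop at F1 + B1/2 (dB):", upper_drop)
print("drop at F1 - B1/2 (dB):", lower_drop)
```

Both drops come out near 3 dB. They are not exactly equal because the far pole (the conjugate) tilts the peak slightly, which the slide's approximation ignores.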

Page 26: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Amplitudes of Higher Formants: Include the Rolloff

$T(\omega) = \prod_{n=1}^{\infty} \dfrac{(2\pi F_n)^2 + (\pi B_n)^2}{(j\omega + j2\pi F_n + \pi B_n)(j\omega - j2\pi F_n + \pi B_n)}$

At $f$ above $F_1$: $T(2\pi f) \approx (F_1/f)$

$T(2\pi F_2) \approx (-jF_2/B_2)(F_1/F_2)$

$20\log_{10}|T(2\pi F_2)| \approx 20\log_{10} Q_2 - 20\log_{10}(F_2/F_1)$

1/f Rolloff: 6 dB per octave (per doubling of frequency)

Page 27: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Vowel Transfer Function: Synthetic Example

L1 = 20log10(500/80) = 16 dB

L2 = 20log10(1500/240) – 20log10(F2/F1) = 16 dB – 9.5 dB

L3 = 20log10(2500/600) – 20log10(F3/F1) – 20log10(F3/F2)

B1 = 80 Hz

B2 = 240 Hz

B3 = 600 Hz? (hard to measure because rolloff from F1, F2 turns the F3 peak into a plateau)

F4 peak completely swamped by rolloff from lower formants
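The level estimates on this slide are plain arithmetic, so they can be reproduced in a few lines. The sketch below only evaluates the slide's rule-of-thumb formulas (peak height Q_n minus the rolloff from each lower formant); it does not evaluate the full transfer function. The formant and bandwidth values are read off the slide's own expressions.

```python
import math

def db(x):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(x)

# Formant frequencies and bandwidths (Hz) as they appear in the slide's formulas.
F = [500.0, 1500.0, 2500.0]
B = [80.0, 240.0, 600.0]

# L1: height of the first peak, Q1 = F1/B1.
L1 = db(F[0] / B[0])
# L2: Q2 minus the rolloff contributed by F1.
L2 = db(F[1] / B[1]) - db(F[1] / F[0])
# L3: Q3 minus the rolloff contributed by F1 and by F2.
L3 = db(F[2] / B[2]) - db(F[2] / F[0]) - db(F[2] / F[1])

print(f"L1 = {L1:.1f} dB, L2 = {L2:.1f} dB, L3 = {L3:.1f} dB")
```

The estimated F3 level falls below 0 dB, which is consistent with the slide's remark that the F3 peak flattens into a plateau.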

Page 28: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Shorthand Notation for the Spectrum of a Vowel

$T(s) = \prod_{n=1}^{\infty} \dfrac{s_n s_n^*}{(s - s_n)(s - s_n^*)}$

$s = j\omega$

$s_n = -\pi B_n + j2\pi F_n$

$s_n^* = -\pi B_n - j2\pi F_n$

$s_n s_n^* = |s_n|^2 = (2\pi F_n)^2 + (\pi B_n)^2$

$T(0) = 1$

$20\log_{10}|T(0)| = 0$ dB

Page 29: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Another Shorthand Notation for the Spectrum of a Vowel

$T(s) = \prod_{n=1}^{\infty} \dfrac{1}{(1 - s/s_n)(1 - s/s_n^*)}$
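The two shorthand forms are the same function. A short worked step, assuming only that $s_n \neq 0$, divides the numerator and denominator of each factor by $s_n s_n^*$:

```latex
\[
\frac{s_n s_n^*}{(s - s_n)(s - s_n^*)}
  = \frac{1}{\left(\dfrac{s - s_n}{-s_n}\right)\left(\dfrac{s - s_n^*}{-s_n^*}\right)}
  = \frac{1}{(1 - s/s_n)(1 - s/s_n^*)},
\qquad\text{since } (-s_n)(-s_n^*) = s_n s_n^* .
\]
```

Setting $s = 0$ in the right-hand form gives 1 for every factor, which makes $T(0) = 1$ immediate.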

Page 30: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Topic #4: Nasalized Vowels

Page 31: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Vowel Nasalization

Nasalized Vowel

Nasal Consonant

Page 32: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Nasalized Vowel

$P_R(\omega) = R(\omega)\,\bigl(U_L(\omega) + U_N(\omega)\bigr)$

$U_N(\omega)$ = Volume Velocity from Nostrils

$P_R(\omega) = R(\omega)\,\bigl(T_L(\omega) + T_N(\omega)\bigr)\,U_G(\omega) = R(\omega)\,T(\omega)\,U_G(\omega)$

$T(\omega) = T_L(\omega) + T_N(\omega)$

Page 33: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Nasalized Vowel

$T(s) = T_L(s) + T_N(s)$

$= \dfrac{1}{(1-s/s_{Ln})(1-s/s_{Ln}^*)} + \dfrac{1}{(1-s/s_{Nn})(1-s/s_{Nn}^*)}$

$\approx \dfrac{2\,(1-s/s_{Zn})(1-s/s_{Zn}^*)}{(1-s/s_{Ln})(1-s/s_{Ln}^*)(1-s/s_{Nn})(1-s/s_{Nn}^*)}$

$1/s_{Zn} = \tfrac{1}{2}\,(1/s_{Ln} + 1/s_{Nn})$

With this choice of $s_{Zn}$ the numerator matches exactly in its constant and linear terms and approximately in its quadratic term, so the factored form is most accurate when $s_{Ln}$ and $s_{Nn}$ are close together.

$s_{Zn}$ = $n$th spectral zero

$T(s) = 0$ if $s = s_{Zn}$
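The location of the spectral zero can be checked numerically. The sketch below adds one oral and one nasal pole pair, finds the exact root of the summed numerator, and compares it with the slide's formula $1/s_{Zn} = \tfrac{1}{2}(1/s_{Ln} + 1/s_{Nn})$. The pole frequencies and bandwidths are hypothetical values chosen for illustration, and all function names are this sketch's own.

```python
import cmath
import math

def pole(F, B):
    """Complex pole s = -pi*B + j*2*pi*F for frequency F and bandwidth B (Hz)."""
    return complex(-math.pi * B, 2 * math.pi * F)

def pair(s, p):
    """One pole pair evaluated at s: 1 / ((1 - s/p)(1 - s/p*))."""
    return 1.0 / ((1 - s / p) * (1 - s / p.conjugate()))

def exact_zero(s_L, s_N):
    """Upper-half-plane root of the numerator of T_L(s) + T_N(s).

    Numerator: (|a|^2 + |b|^2) s^2 - 2 Re(a + b) s + 2, with a = 1/s_L, b = 1/s_N.
    """
    a, b = 1 / s_L, 1 / s_N
    A = abs(a) ** 2 + abs(b) ** 2
    Bc = -2 * (a + b).real
    disc = cmath.sqrt(Bc * Bc - 4 * A * 2.0)
    root = (-Bc + disc) / (2 * A)
    return root if root.imag > 0 else (-Bc - disc) / (2 * A)

def slide_zero(s_L, s_N):
    """Zero predicted by the slide's formula 1/s_Z = (1/s_L + 1/s_N) / 2."""
    return 1 / (0.5 * (1 / s_L + 1 / s_N))

# Hypothetical oral and nasal poles (Hz), for illustration only.
F_L, B_L = 500.0, 80.0
F_N, B_N = 700.0, 100.0
s_L, s_N = pole(F_L, B_L), pole(F_N, B_N)

z_exact = exact_zero(s_L, s_N)
z_slide = slide_zero(s_L, s_N)
f_exact = z_exact.imag / (2 * math.pi)
f_slide = z_slide.imag / (2 * math.pi)
residual = abs(pair(z_exact, s_L) + pair(z_exact, s_N))

print("exact zero frequency (Hz):", f_exact)
print("slide formula frequency (Hz):", f_slide)
print("|T| at the exact zero:", residual)
```

For these assumed poles both estimates land between the oral and nasal pole frequencies, and they should agree more closely as the two poles move together.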

Page 34: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

The “Pole-Zero Pair”

$20\log_{10}|T(\omega)| = 20\log_{10}\left|\dfrac{1}{(1-s/s_{Ln})(1-s/s_{Ln}^*)}\right| + 20\log_{10}\left|\dfrac{(1-s/s_{Zn})(1-s/s_{Zn}^*)}{(1-s/s_{Nn})(1-s/s_{Nn}^*)}\right| + 20\log_{10} 2$

= original vowel log spectrum

+ log spectrum of a pole-zero pair

+ a constant offset of about 6 dB (from the factor of 2 in the numerator of $T(s)$)

Page 35: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Additive Terms in the Log Spectrum

Page 36: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Transfer Function of a Nasalized Vowel

Page 37: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Pole-Zero Pairs in the Spectrogram

Nasal Pole

Zero

Oral Pole

Page 38: ECE 598: The Speech Chain Lecture 8: Formant Transitions; Vocal Tract Transfer Function.

Summary

• Perturbation Theory:
  - Squeeze near a velocity peak: formant goes down
  - Squeeze near a pressure peak: formant goes up
• Formant Transitions
  - Labial closure: loci near 250, 1000, 2000 Hz
  - Alveolar closure: loci near 250, 1700, 3000 Hz
  - Velar closure: F2 and F3 come together (“velar pinch”)
• Vocal Tract Transfer Function
  - $T(s) = \prod_n s_n s_n^*/\bigl((s-s_n)(s-s_n^*)\bigr)$
  - $|T(\omega = 2\pi F_n)| \approx Q_n = F_n/B_n$
  - 3 dB bandwidth = $B_n$ Hertz
  - $T(0) = 1$
• Nasal Vowels:
  - Sum of two transfer functions gives a spectral zero between the oral and nasal poles
  - Pole-zero pair is a local perturbation of the spectrum