Voice source characterisation Gerrit Bloothooft UiL-OTS Utrecht University



Emasters School Leuven 2002 Voice Source Characterization 2

Voice research

To describe and model the properties of the vocal sound source from the viewpoints of:
– Physiology
– Acoustics
– Perception


Importance of the voice

• Speech synthesis
– Towards natural-sounding synthesis
• Speech recognition
– Using source properties in recognition
• Speaker recognition/identification
– Voice source characteristics are essential
• Diagnosis
– Pathologies, voice classifications


Voice possibilities

Limited use of voice in speech
• Range of the fundamental frequency
• Vocal intensity range
• Spectral variation


Focus in this presentation

How do acoustic voice source characteristics vary as a function of F0 and vocal intensity?


Voice profile measurement

Thirties: intensity range as a function of pitch
– manual measurement

Eighties: automatic computation of F0 and intensity
– computer measurement
– visual feedback
– additional parameters


Measurement unit

• One decibel
• One semitone
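
A minimal sketch of how one measurement could be mapped onto such a unit of one semitone by one decibel. The reference pitch (A4 = 440 Hz, numbered 69 as in MIDI) and the function names are assumptions for illustration; the slides do not say which reference is used.

```python
import math

REF_HZ = 440.0   # assumed reference pitch (A4); not stated in the slides
REF_NOTE = 69    # assumed semitone number of the reference (MIDI convention)


def f0_to_semitone(f0_hz: float) -> int:
    """Round a fundamental frequency to the nearest semitone number."""
    return round(REF_NOTE + 12 * math.log2(f0_hz / REF_HZ))


def intensity_to_db(level_db_spl: float) -> int:
    """Round a sound level to the nearest whole decibel."""
    return round(level_db_spl)


def profile_cell(f0_hz: float, level_db_spl: float) -> tuple:
    """Return the (semitone, dB) cell that a single measurement falls into."""
    return (f0_to_semitone(f0_hz), intensity_to_db(level_db_spl))
```

With these assumptions, a phonation at 220 Hz and 60.6 dB SPL lands in cell (57, 61): one octave below the reference is exactly twelve semitones lower.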


Measurement procedure

• Subject in front of computer screen
• Microphone on headset (30 cm)
• Just phonate, sing, and see the result immediately
• Best results with recording protocol
• Feedback stimulates extreme phonations

Voice profile / density

[Figure: voice profile showing sample density per cell; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]

Voice profile / speech area

[Figure: voice profile with the speech area, showing sample density per cell; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]


Acoustic voice quality parameters

• Jitter
– Stability of periodicity
– Asymmetry in vocal folds
• Crest factor
– Max amplitude divided by average energy
– Relates to spectral slope
• Many more …
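
A rough sketch of the two parameters, under stated assumptions. The slides give no formulas, so jitter is taken here as the common "local" definition (mean absolute difference between consecutive periods, relative to the mean period, in percent), and crest factor as peak amplitude over RMS amplitude, in dB, since later slides quote crest factor in dB. Function names are hypothetical.

```python
import math


def local_jitter_percent(periods):
    """Mean absolute difference of consecutive periods over the mean period, in %."""
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period


def crest_factor_db(samples):
    """Peak amplitude divided by RMS amplitude, expressed in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)
```

Under this definition a pure sine wave has a crest factor of about 3 dB, and perfectly regular periods give 0 % jitter.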

Crest factor

[Figure: crest factor across the voice profile; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]

Jitter

[Figure: jitter across the voice profile, on a scale from regular to irregular; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]


Real time presentation

Screen presentation
• One data point per F0-I cell

Advanced data storage [new]
• Full audio signal
• Full distribution of data per F0-I cell
• Data for screen presentation
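
One way to picture the difference between the two storage modes, as a sketch only. The class name, the cell key, and the choice of the median as the single on-screen value are assumptions; the slides do not say how the screen value is derived, and audio storage is left out.

```python
from collections import defaultdict
from statistics import median


class VoiceProfileStore:
    """Keeps every parameter value per F0-I cell (hypothetical structure)."""

    def __init__(self):
        # Maps a cell, e.g. (semitone, dB), to all values measured in it.
        self._cells = defaultdict(list)

    def add(self, cell, value):
        """Store one measured parameter value for the given cell."""
        self._cells[cell].append(value)

    def distribution(self, cell):
        """Return the full list of values recorded for a cell."""
        return list(self._cells.get(cell, []))

    def screen_value(self, cell):
        """Return a single value per cell for display (assumed: the median)."""
        values = self._cells.get(cell)
        return median(values) if values else None
```

Keeping the full list per cell is what makes later statistics possible; the display only ever needs `screen_value`.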


Advantages

• Reusability of recordings

• Statistical analysis per F0-I cell

• Study of time-varying behavior

Crest factor

[Figure: crest factor across the voice profile; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]

Median smoothing of crest factor

[Figure: crest factor across the voice profile after median smoothing; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]
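
A small sketch of median smoothing over the F0-I grid. The 3 x 3 neighbourhood and the handling of empty cells (simply skipped) are assumptions; the slides do not state the window that was used.

```python
from statistics import median


def median_smooth(grid):
    """Replace each cell value by the median of its filled 3 x 3 neighbourhood.

    grid maps (semitone, dB) cells to a parameter value such as crest factor.
    Cells absent from the grid are treated as empty and ignored.
    """
    smoothed = {}
    for (st, db) in grid:
        neighbours = [
            grid[(st + ds, db + dd)]
            for ds in (-1, 0, 1)
            for dd in (-1, 0, 1)
            if (st + ds, db + dd) in grid
        ]
        smoothed[(st, db)] = median(neighbours)
    return smoothed
```

A single outlying cell surrounded by consistent values is pulled back to its neighbours, while broad trends across the profile are preserved.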


Vocal Registers

Different movement patterns of the vocal folds

• Pulse register (creaky voice)
• Modal register
• Falsetto register


Pulse register

• Less than 50 Hz
• Irregular
• Long closed period

Pulse register

[Figure: pulse register in the voice profile; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]


Modal register

• “Normal” use of voice
• Active role of M. Vocalis
• Vocal folds thick and completely vibrating
• Wide range in F0 and intensity
• Flat spectrum

Modal register

[Figure: modal register in the voice profile; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]


Falsetto register

• Higher pitches
• M. Vocalis passive; vocal ligaments tensed by M. Cricothyroideus
• Edge vibration of vocal folds
• Sound poor in higher harmonics (in untrained subjects)

Falsetto register

[Figure: falsetto register in the voice profile; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]

Register overlap

[Figure: overlap of the registers in the voice profile; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]


Chest and head voice

Refer to secondary vibratory sensations in the body

• Chest voice: loud modal register
• Head voice:
– men: higher, softer modal register in overlap area with falsetto register
– women: falsetto register

Chest voice and head voice

[Figure: areas labelled "chest" and "head" in the voice profile; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]


Registers and voice profiles

With a description using

• Iso-crest factor lines
• Iso-jitter lines

Iso-crest factor lines

[Figure: crest factor across the voice profile, with iso-crest factor lines at 4 dB and 6 dB; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]

Iso-jitter lines

[Figure: jitter (%) across the voice profile, with an iso-jitter line at 3 %; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]


New representation

• Areas defined by iso-parameter lines
– crest factor < 4 dB
– crest factor > 4 dB, < 6 dB
– crest factor > 6 dB
– jitter < 3 %
– [relative rise time < 6 %]
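
The thresholds above can be read as simple tests on the parameter values of a cell. The sketch below is only an illustration: the function names, the label strings, and the treatment of values exactly on a boundary (4 dB and 6 dB go to the middle band) are assumptions.

```python
def crest_band(crest_db):
    """Assign a crest factor (dB) to one of the three bands."""
    if crest_db < 4.0:
        return "crest < 4 dB"
    if crest_db <= 6.0:
        return "crest 4-6 dB"
    return "crest > 6 dB"


def describe_cell(crest_db, jitter_pct, rrt_pct=None):
    """List the areas a cell belongs to, given its parameter values.

    rrt_pct is the relative rise time in percent; it is optional because
    the slide marks that criterion as tentative.
    """
    labels = [crest_band(crest_db)]
    labels.append("jitter < 3 %" if jitter_pct < 3.0 else "jitter >= 3 %")
    if rrt_pct is not None and rrt_pct < 6.0:
        labels.append("relative rise time < 6 %")
    return labels
```

For example, `describe_cell(3.0, 1.0)` places a cell in the lowest crest factor band with jitter below 3 %.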

Areas in the phonetogram

[Figure: areas in the phonetogram; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL). Labelled areas:
– Jitter > 3 %: unstable
– RRT < 6 %: pressed-like
– Crest factor < 4 dB: sine-like]

Vocal registers in the phonetogram

[Figure: register boundaries in the phonetogram; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL). Labelled lines:
– Falsetto upper boundary
– Modal lower boundary
– Chest voice boundary]


Comparison of voice profiles

Characterisation of

• Voice pathologies
• Voice classifications

Reuse stored voice profiles of subjects with known voice history


Important features

• Contour has limited value
– but most research goes in that direction (norm profiles)

• Distribution of acoustical parameters across the voice profile tells much more


We need

• Unit for comparison
– Voice profile unit defined by a small range of F0 and vocal intensity
• Distributions of acoustic voice parameters per unit
– Probability density function per parameter
• Model
– Hidden Markov Model
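
To make the "probability density function per parameter" step concrete, here is a hedged sketch that estimates a density per unit with a plain histogram and scores new values against it. It deliberately leaves out the Hidden Markov Model itself. The bin width, the probability floor for unseen bins, and the function names are all assumptions.

```python
import math
from collections import Counter


def histogram_pdf(values, bin_width=1.0):
    """Estimate a discrete density: fraction of values falling in each bin."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return {b: c / total for b, c in counts.items()}


def log_likelihood(values, pdf, bin_width=1.0, floor=1e-6):
    """Sum of log probabilities of the values under a histogram density.

    Bins never seen in training get the small probability `floor`
    (an assumed smoothing choice) so the logarithm stays defined.
    """
    return sum(
        math.log(pdf.get(math.floor(v / bin_width), floor)) for v in values
    )
```

Comparing the log-likelihood of the same test values under the densities of two subjects gives a simple indication of which stored profile they resemble more.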


Unit model

[Diagram: unit model with IN and OUT connections.]

Two unconnected states per phonetogram unit
• vocal registers
• start and end of phonation


Correspondences

Speech                   Voice profile
• phoneme model          F0/I unit model
• not labeled            labeled by F0 and I
• spectral envelope      acoustic voice parameters
• language model         unrestricted transitions

“forced alignment recognition”

Crest factor distributions

[Figure: four histograms of crest factor: training subject 1, test subject 1, training subject 2, test subject 2. Horizontal axis ticks run from 4 to 15; vertical axis runs from 0 to 500.]

Most distinctive states

[Figure: distinctiveness per state across the voice profile; x-axis: fundamental frequency (Hz), y-axis: vocal intensity (dB SPL).]


Conclusions

• Voice profiles can enhance our understanding of vocal behaviour in a visually attractive way

• Current data storage opens a series of important research topics

• Market opportunities for “light” versions