  • 7/23/2015

    Department of Electrical Engineering, IIT Bombay

    EE679: Speech Processing

    A preview

    Why does signal processing for speech need a special course?

    “Signal processing” is concerned with the mathematical representation of the signal and the algorithmic operations carried out to modify the signal or to extract information from it.

    The representation and the algorithms are application-domain specific, i.e. there are no “generic” methods.

    An understanding of the signal and of the application is crucial to the success of the signal processing methods.

    Everyday speech technology

    Mobile telephony

    Automatic speech recognition (speech to text)

    Speech synthesis (text to speech)

    Understanding speech communication

    Acoustic waves

    Speed = wavelength x frequency
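
    The relation between speed, wavelength and frequency can be rearranged to give the wavelength of a tone. The sketch below is only illustrative: the speed of sound (343 m/s, roughly dry air at 20 degrees C) and the function name `wavelength_m` are assumptions, not values from the slides.

```python
# Illustrative sketch: wavelength = speed / frequency.
# ASSUMPTION: 343 m/s is used for the speed of sound in air (about 20 degrees C).
SPEED_OF_SOUND_M_PER_S = 343.0


def wavelength_m(frequency_hz, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Return the wavelength in metres of a tone at the given frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_per_s / frequency_hz


# Wavelengths for a few frequencies in the speech range.
for f in (100.0, 1000.0, 3430.0):
    print(f"{f:7.1f} Hz -> {wavelength_m(f):.3f} m")
```

    Low frequencies in speech therefore correspond to wavelengths of metres, while the higher frequencies are closer to the size of the vocal tract itself.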

    Information in speech?

    Linguistic (message -> sentences -> words -> phonemes)

    The speech signal is characterised by an enormous range of elementary perceptually contrasting sounds!

    Paralinguistic:

    -- expressive (emotions, mood)

    -- speaker-based (age, gender, accent and style)

    Generating speech*

    Respiration->phonation->articulation

    Vibrating vocal cords create puffs of air giving rise to air pressure variations which reach our ears.

    *HyperPhysics, Sound and Hearing, Georgia State University

    Speech production (Childers, Speech Overview, 1993)

    Vocal tract: Acoustic resonances*

    $f_1 = \frac{c}{4L};\quad f_2 = \frac{3c}{4L};\quad f_3 = \frac{5c}{4L};\quad \ldots$

    *HyperPhysics, Sound and Hearing, Georgia State University

    (http://hyperphysics.phy-astr.gsu.edu/hbase/sound/)
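
    A uniform tube closed at one end and open at the other resonates at odd multiples of c/(4L), where c is the speed of sound and L is the tube length. As a rough sketch, the values below are assumptions chosen for illustration: c = 340 m/s and L = 0.17 m (a commonly quoted approximate adult vocal tract length). The function name `tube_resonances_hz` is hypothetical.

```python
# Illustrative sketch: resonances of a uniform tube closed at one end,
# f_n = (2n - 1) * c / (4 * L) for n = 1, 2, 3, ...
# ASSUMPTIONS: c = 340 m/s and L = 0.17 m are example values, not from the slides.


def tube_resonances_hz(length_m, speed_m_per_s=340.0, count=3):
    """Return the first `count` resonance frequencies of the tube in Hz."""
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [
        (2 * n - 1) * speed_m_per_s / (4.0 * length_m)
        for n in range(1, count + 1)
    ]


resonances = tube_resonances_hz(0.17)
print([round(f) for f in resonances])
```

    With these assumed values the first three resonances fall near 500, 1500 and 2500 Hz. A real vocal tract is not a uniform tube, so its resonances shift as the articulators move.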

    Articulation: producing the various sounds of speech*

    [Figure: schematic of the vocal tract. Labels: trachea connection to lungs; vocal cords; pharyngeal cavity; velum; nasal cavity; oral cavity; articulators (tongue, jaw, lips, teeth, velum): moving muscles which alter the resonant cavities; static cavity; dynamic cavity; vocal cavity; oral sound output; nasal sound output.]

    *Securivox tutorial

    The sound spectrum is modified by the shape of the vocal tract. The resonant frequencies of the vocal tract cause peaks in the spectrum called formants.

    Vocal tract filter*

    *Childers, Speech Overview

    Von Kempelen's talking machine

    1791

    "Briefly, the device was operated in the following manner. The right arm rested on the main bellows

    1875

    Alexander Bell invents the method of, and apparatus for, transmitting vocal or other sounds telegraphically ... by causing electrical undulations, similar in form to the vibrations of the air accompanying the said vocal or other sound.

    => Major impetus to modern speech processing.

    1930s: Electrical synthesis of speech by Dudley's vocoder

    Sound -> electrical form*

    *The Physics Classroom: http://www.glenbrook.k12.il.us/gbssci/phys/Class/sound/u11l2a.html

    Speech waveform

    Speech waveforms from "my speech"

    (a) start of "y" vowel

    (b) "ee" vowel

    (c) "s" consonant

    Low pitch tone: T0 = 10 msec, frequency (F0) = 1/T0 = 100 Hz

    High pitch tone: T0 = 3.3 msec, frequency = 300 Hz

    [Figure: two tone waveforms; vertical axis: air pressure variation]

    1 Hertz = 1 vibration/sec
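
    The two tones on this slide follow from F0 = 1/T0, with the period converted from milliseconds to seconds. A minimal sketch of that arithmetic (the function name `fundamental_frequency_hz` is hypothetical):

```python
# Illustrative sketch: fundamental frequency from the period, F0 = 1 / T0.


def fundamental_frequency_hz(period_ms):
    """Return F0 in Hz for a period T0 given in milliseconds."""
    if period_ms <= 0:
        raise ValueError("period must be positive")
    return 1000.0 / period_ms  # 1000 ms per second


for t0_ms in (10.0, 3.3):
    print(f"T0 = {t0_ms} msec -> F0 = {fundamental_frequency_hz(t0_ms):.1f} Hz")
```

    A period of 10 msec gives 100 Hz, and 3.3 msec gives about 303 Hz, which the slide rounds to 300 Hz.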

    Components of sound

    A sound is usually composed of several frequency components.

    Depending on the relationships of the frequency components, the sound can elicit a sensation of pitch.

    300 Hz

    600 Hz

    900 Hz

    300 Hz + 600 Hz

    300 Hz + 600 Hz + 900 Hz
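
    When the components are harmonically related, as with 300, 600 and 900 Hz, their sum still repeats with the period of the lowest component (1/300 s), which is consistent with hearing a pitch at 300 Hz. The sketch below checks this numerically using only the standard library. Equal amplitudes and zero phases are assumptions, and the helper names are hypothetical.

```python
import math

# Illustrative sketch: a sum of sinusoids at 300, 600 and 900 Hz
# repeats every 1/300 s, the period of the lowest component.
# ASSUMPTION: equal amplitudes and zero starting phase for every component.
FREQS_HZ = (300.0, 600.0, 900.0)
PERIOD_S = 1.0 / 300.0


def harmonic_sum(t_s, frequencies_hz):
    """Value at time t_s of the sum of unit-amplitude sinusoids."""
    return sum(math.sin(2.0 * math.pi * f * t_s) for f in frequencies_hz)


def max_periodicity_error(frequencies_hz, period_s, steps=200):
    """Largest |x(t) - x(t + period_s)| over one candidate period."""
    worst = 0.0
    for k in range(steps):
        t = k * period_s / steps
        diff = abs(
            harmonic_sum(t, frequencies_hz)
            - harmonic_sum(t + period_s, frequencies_hz)
        )
        worst = max(worst, diff)
    return worst


# Close to zero: the waveform repeats after 1/300 s.
print(max_periodicity_error(FREQS_HZ, PERIOD_S))
# Clearly non-zero: the waveform does not repeat after 1/600 s.
print(max_periodicity_error(FREQS_HZ, PERIOD_S / 2.0))
```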

    Classification of speech sounds

    Vowels and Consonants

    Vowels: steady sounds specified by position of the articulators (typically, tongue)

    Consonants: dynamic sounds classified by place and manner of articulation

    Place of articulation (constriction of vocal tract)

    Basic sounds of speech: Phones

    The speech signal can be divided into sound segments with fixed articulation and acoustics over short intervals, i.e. articulatory configuration -> acoustic properties

    Smallest meaningful sound unit: phone (i.e. set of distinctive sounds of a language)

    In Indian written scripts, one symbol represents one phone.

    PRAAT examples

    Physiology (articulator motion)

    Sound with specific acoustic characteristics (seen in waveform and spectrum)

    Perception of certain sound qualities

    Speech production basics

    Vocal cords (larynx) modulate the airflow from the lungs by rapid opening-closing; the rate of vibration is determined by their mass and tension. Pitch frequency ranges: male: 80-160 Hz; female: 160-320 Hz; singers: over 2 octaves.

    Vocal tract shapes the vocal cord vibrations into the intricate sounds of speech via changes in shape to produce various acoustic resonances.
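
    An octave is a doubling of frequency, so the width of a pitch range in octaves is the base-2 logarithm of the ratio of its limits. As a small sketch (the function name `octaves_spanned` is hypothetical), the male and female ranges quoted on this slide each span one octave:

```python
import math

# Illustrative sketch: width of a pitch range in octaves.
# One octave corresponds to a frequency ratio of 2.


def octaves_spanned(low_hz, high_hz):
    """Return the number of octaves between two frequencies."""
    if low_hz <= 0 or high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return math.log2(high_hz / low_hz)


print(octaves_spanned(80.0, 160.0))   # male range from the slide
print(octaves_spanned(160.0, 320.0))  # female range from the slide
print(octaves_spanned(80.0, 320.0))   # both ranges taken together
```

    By the same measure, a singer covering more than two octaves spans more than a factor of four in frequency.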

    Glottal folds in action

    Outline

    Speech production (physiology)

    Classification of sounds: articulatory, acoustic

    Speech analysis (signal processing methods for information extraction)

    Hearing, and speech perception

    Speech technology (speech compression, ASR, TTS)

    Audio/music technology

    Text / References

    Douglas O'Shaughnessy, Speech Communications: Human and Machine, Universities Press (India) Ltd., 2001

    Rabiner and Schafer, Digital Processing of Speech Signals

    IITB Moodle for all course-related hand-outs

    Evaluation

    Computing assignments (Python preferred)

    Exams: mid semester, end semester