Ways to generate computer speech

• Record a human speaking every sentence HAL will ever speak (not likely)

• Make a mathematical model of the human vocal tract (synthesis)

• Record a human speaking a lot of sentences, and come up with some way of making new sentences out of the recorded ones (concatenation)

What goes into synthesizing speech?

• Have some idea of what human speech actually looks/sounds like
– Modeling the shape of a speaker’s mouth
– Fricative noises and noises from stops
– Pitch changes

• Produce sounds that resemble speech sounds
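The three ingredients above (mouth shape, noise, pitch) can be sketched in code as a sound source passed through a filter. This is a minimal illustration, not the method used in the slides' audio examples: the sample rate, the function names, and the three resonance (formant) values are all assumptions chosen for the example.

```python
import math
import random

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

def pulse_train(pitch_hz, n_samples, rate=SAMPLE_RATE):
    """Voiced source: one unit impulse per pitch period."""
    period = round(rate / pitch_hz)
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def noise(n_samples, seed=0):
    """Unvoiced source: white noise, standing in for fricatives and stop bursts."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def resonator(signal, freq_hz, bandwidth_hz, rate=SAMPLE_RATE):
    """Two-pole filter that emphasizes one resonance of the mouth shape."""
    r = math.exp(-math.pi * bandwidth_hz / rate)  # pole radius, always < 1
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / rate)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(source, formants):
    """Pass a source through one resonator per formant, then normalize."""
    signal = source
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]

# Illustrative (assumed) resonances roughly in the range of an "ah"-like vowel.
AH_FORMANTS = [(700, 90), (1200, 110), (2600, 160)]

vowel = synthesize(pulse_train(120, 4000), AH_FORMANTS)  # pitch + mouth shape
hiss = synthesize(noise(4000), [(3000, 800)])            # noise + mouth shape
```

Changing the pitch argument changes the perceived pitch, changing the formant list changes the vowel quality, and swapping the pulse train for noise gives a fricative-like hiss.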

Synthesis: Putting it all together

• Shape of mouth: 1, 2, 3, all 3

• Fricative and burst noises

• Shape of mouth and fricative noises

• Shape of mouth, fricative noises, & pitch

Speech synthesis

• (1980): The Speak & Spell toy used a synthesis process called Linear Predictive Coding (LPC).

• Basically, LPC is a way for a computer to separate a speech signal into its components (the resonances of the vocal tract, the pitch, and the noise) and re-create them using a mathematical model of the vocal tract

• Here’s a better example of LPC (1982):

• LPC-based coding is still used today in GSM phone systems
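The "linear prediction" part of LPC can be sketched as follows: each speech sample is predicted as a weighted sum of the few samples before it, and the weights describe the vocal-tract filter. The code below is a minimal sketch using the autocorrelation method with the Levinson-Durbin recursion, which is one standard way to compute those weights; it is not a claim about how the Speak & Spell or any GSM codec implements it. The test signal and its coefficients (1.3 and -0.6) are made up for the demonstration.

```python
import random

def autocorrelation(signal, max_lag):
    """r[lag] = sum over n of signal[n] * signal[n + lag]."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Find predictor weights a[1..order] so that
    x[n] is approximated by a[1]*x[n-1] + ... + a[order]*x[n-order].
    Returns (weights, remaining prediction-error energy)."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error            # reflection coefficient for this step
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error

# Made-up test signal: noise shaped by a known two-weight filter,
# so the analysis has a known answer to recover.
rng = random.Random(1)
true_weights = [1.3, -0.6]
x = [0.0, 0.0]
for _ in range(5000):
    x.append(true_weights[0] * x[-1] + true_weights[1] * x[-2] + rng.gauss(0.0, 1.0))

estimated, residual = levinson_durbin(autocorrelation(x, 2), 2)
```

Here `estimated` should come out close to `true_weights`. A real LPC system repeats this analysis on short frames of speech (typically with an order around 10), stores the weights plus pitch and loudness, and later drives the same filter with a pulse train or noise to re-create the sound.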

Text-to-Speech (TTS) systems

• Concatenative synthesis
– Record natural speech
– Chop speech up into units
– Recombine units according to the phonetic transcription to be pronounced

• Steps for a TTS system:
– Start w/ written text
– Convert text to phonetic characters
– Find segments of speech in database
– Calculate intonation of sentence
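The four steps above can be sketched as a toy pipeline. Everything here is hypothetical and for illustration only: the tiny `LEXICON`, the `UNIT_DB` of placeholder "recordings" (80 silent samples per phone), and the straight-line falling pitch contour are assumptions, not the design of the AT&T or Rhetorical Systems voices. Real systems often store larger units such as diphones and choose among many recorded candidates.

```python
# Step 2 data: hypothetical pronunciation lexicon (ARPAbet-style symbols).
LEXICON = {
    "the": ["DH", "AH"],
    "north": ["N", "AO", "R", "TH"],
    "wind": ["W", "IH", "N", "D"],
}

# Step 3 data: hypothetical unit database. Each phone maps to a placeholder
# "recording" of 80 silent samples; a real database would hold audio.
UNIT_DB = {phone: [0.0] * 80 for phones in LEXICON.values() for phone in phones}

def text_to_phones(text):
    """Steps 1-2: written text to a flat list of phonetic symbols."""
    phones = []
    for word in text.lower().split():
        phones.extend(LEXICON[word.strip(".,;:!?")])
    return phones

def select_units(phones):
    """Step 3: look up a stored segment for each phone."""
    return [UNIT_DB[p] for p in phones]

def pitch_contour(n_units, start_hz=130.0, end_hz=90.0):
    """Step 4: one pitch target per unit, falling linearly over the sentence
    (an assumed, very simplified intonation model)."""
    if n_units <= 1:
        return [start_hz] * n_units
    step = (end_hz - start_hz) / (n_units - 1)
    return [start_hz + i * step for i in range(n_units)]

def synthesize(text):
    phones = text_to_phones(text)
    units = select_units(phones)
    pitches = pitch_contour(len(units))
    waveform = [sample for unit in units for sample in unit]  # concatenate
    return phones, pitches, waveform

phones, pitches, waveform = synthesize("The north wind.")
```

In this sketch the pitch targets are only computed, not applied; a fuller system would modify each unit's pitch and duration to match them and smooth the joins between units.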

Text-to-Speech (TTS) systems

Examples of text from The North Wind and the Sun (Aesop), circa 2005:

• Mike (AT&T)

• Crystal (AT&T)

• British English (Rhetorical Systems)

• Scottish English (Rhetorical Systems)
