Ways to generate computer speech

Jan 18, 2016


Kelley Holt
Transcript
Page 1: Ways to generate computer speech Record a human speaking every sentence HAL will ever speak (not likely) Make a mathematical model of the human vocal.

Ways to generate computer speech

• Record a human speaking every sentence HAL will ever speak (not likely)

• Make a mathematical model of the human vocal tract (synthesis)

• Record a human speaking a lot of sentences, and come up with some way of making new sentences out of the recorded ones (concatenation)

Page 2

What goes into synthesizing speech?

• Have some idea of what human speech actually looks/sounds like
– Modeling the shape of a speaker’s mouth
– Fricative noises and noises from stops
– Pitch changes

• Produce sounds that resemble speech sounds

Page 3

Synthesis: Putting it all together

• Shape of mouth: 1: 2: 3: all 3:

• Fricative and burst noises:

• Shape of mouth and fricative noises:

• Shape of mouth, fricative noises, & pitch:
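The three ingredients above (mouth shape, noise, and pitch) can be sketched with a simple source-filter model. This is a minimal illustration, not the method used to produce the slide's audio examples: the sample rate, the formant frequencies and bandwidths, and every function name below are assumptions chosen for the sketch.

```python
import math
import random

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)


def pulse_train(pitch_hz, n_samples):
    """Voiced source: one impulse per pitch period (this carries the pitch)."""
    period = int(round(SAMPLE_RATE / pitch_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def noise(n_samples, seed=0):
    """Unvoiced source: white noise, as a stand-in for fricatives and bursts."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def resonator(signal, freq_hz, bandwidth_hz):
    """Two-pole resonator: y[n] = x[n] + b1*y[n-1] + b2*y[n-2].

    Each resonator imitates one resonance (formant) of the mouth shape.
    """
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    b1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / SAMPLE_RATE)
    b2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + b1 * y1 + b2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(pitch_hz, formants, n_samples):
    """Pass a pitched pulse train through one resonator per formant."""
    signal = pulse_train(pitch_hz, n_samples)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz)
    return signal


# Illustrative (frequency, bandwidth) pairs in Hz for an "ah"-like vowel.
AH_FORMANTS = [(700, 80), (1200, 90), (2600, 120)]

n = SAMPLE_RATE // 10                          # 100 ms of audio
vowel = synthesize_vowel(120, AH_FORMANTS, n)  # mouth shape + pitch
hiss = resonator(noise(n), 4000, 1000)         # fricative-like noise
mixed = [v + 0.1 * h for v, h in zip(vowel, hiss)]  # all three together
```

Changing the pitch argument or the formant list changes what is heard, which mirrors the separate "shape of mouth" and "pitch" examples on this slide.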

Page 4

Speech synthesis

• (1980): The Speak & Spell toy used a synthesis process called Linear Predictive Coding (LPC).

• Basically, LPC is a way for a computer to split a speech signal into its component parts (the sound source, such as voicing pitch or noise, and the filtering effect of the vocal tract) and re-create them using a mathematical model of the vocal tract

• Here’s a better example of LPC (1982):

• LPC-based speech coding is still used today in GSM phone systems
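The analysis half of LPC can be sketched as follows. This is a textbook autocorrelation-method formulation solved with the Levinson-Durbin recursion, shown only as an illustration; it is not the code used by the Speak & Spell or by GSM codecs, and the function names and the test signal are assumptions.

```python
def autocorrelation(signal, max_lag):
    """r[lag] = sum over n of signal[n] * signal[n + lag]."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Find a[1..order] so that s[n] is approximated by sum_k a[k] * s[n-k].

    Returns (coefficients, residual prediction-error energy).
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error              # reflection coefficient for this step
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error


def lpc(signal, order):
    """Estimate the all-pole (vocal tract) filter of a signal."""
    return levinson_durbin(autocorrelation(signal, order), order)


def all_pole_impulse_response(coeffs, n_samples):
    """Toy 'vocal tract': y[n] = impulse[n] + sum_k coeffs[k-1] * y[n-k]."""
    out = []
    for n in range(n_samples):
        y = 1.0 if n == 0 else 0.0
        for k, a_k in enumerate(coeffs, start=1):
            if n - k >= 0:
                y += a_k * out[n - k]
        out.append(y)
    return out


# Make a signal from a known filter, then recover the filter from the signal.
true_coeffs = [1.3, -0.6]
signal = all_pole_impulse_response(true_coeffs, 400)
estimated, residual = lpc(signal, 2)
print(estimated)  # should be close to true_coeffs
```

A real coder would run this on short frames of speech, typically with a higher order, and send the coefficients plus pitch and gain instead of the waveform itself.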

Page 5

Text-to-Speech (TTS) systems

• Concatenative synthesis
– Record natural speech
– Chop speech up into units
– Recombine units according to the phonetic transcription to be pronounced

• Steps for a TTS system:
– Start w/ written text
– Convert text to phonetic characters
– Find segments of speech in database
– Calculate intonation of sentence
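The TTS steps above can be sketched as a toy pipeline. Everything here is hypothetical: the tiny lexicon, the placeholder unit database (lists of zeros standing in for recorded audio), the linear falling pitch contour, and all names are assumptions for illustration, not any real TTS system's API.

```python
import re

# Hypothetical pronunciation lexicon (ARPAbet-style symbols, illustrative).
LEXICON = {
    "the": ["DH", "AH"],
    "north": ["N", "AO", "R", "TH"],
    "wind": ["W", "IH", "N", "D"],
    "and": ["AE", "N", "D"],
    "sun": ["S", "AH", "N"],
}

# Hypothetical unit database: each phoneme maps to placeholder samples.
# A real system would store recorded speech segments here.
UNIT_DATABASE = {
    phoneme: [0.0] * 4
    for pronunciation in LEXICON.values()
    for phoneme in pronunciation
}


def text_to_phonemes(text):
    """Step 2: convert written text to phonetic characters."""
    phonemes = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in LEXICON:
            raise KeyError(f"no pronunciation for {word!r}")
        phonemes.extend(LEXICON[word])
    return phonemes


def falling_pitch_contour(n_units, start_hz=130.0, end_hz=90.0):
    """Step 4 (very simplified): pitch falls linearly across the sentence."""
    if n_units == 0:
        return []
    if n_units == 1:
        return [start_hz]
    step = (end_hz - start_hz) / (n_units - 1)
    return [start_hz + i * step for i in range(n_units)]


def synthesize(text):
    """Run the steps: text, phonemes, database lookup, intonation."""
    phonemes = text_to_phonemes(text)
    pitches = falling_pitch_contour(len(phonemes))
    plan = [
        (phoneme, pitch, UNIT_DATABASE[phoneme])  # step 3: find segments
        for phoneme, pitch in zip(phonemes, pitches)
    ]
    # Concatenate the units; a real system would also apply the pitch targets.
    waveform = [sample for _, _, unit in plan for sample in unit]
    return plan, waveform


plan, waveform = synthesize("The North Wind and the Sun")
```

Real concatenative systems choose among many recorded candidates per unit and smooth the joins; this sketch only shows the order of the steps.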

Page 6

Text-to-Speech (TTS) systems

Examples of text from The North Wind and the Sun (Aesop), circa 2005:

• Mike (AT&T)

• Crystal (AT&T)

• British English (Rhetorical Systems)

• Scottish English (Rhetorical Systems)