Building a Limited Domain Voice Using Festvox (Workshop Talk at IIT Kharagpur, Mar 4-5, 2009)
Kishore Prahallad, Email: [email protected]
International Institute of Information Technology (IIIT) Hyderabad, India & Language Technologies Institute, Carnegie Mellon University
Building Limited Domain
• Unit selection is applied to a limited domain with a restricted vocabulary
• High quality speech systems
• Units are words
– Implementation in Festival:
• The units are still phones, but each phone is restricted to come from a specific word
– /p/ from “Pennsylvania” is differentiated from /p/ from “Pittsburgh”
– To synthesize “Pittsburgh”, all the phones should come from the word “Pittsburgh” (there may be many examples of the same word).
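The word-restricted phone units described above can be pictured with a small Python sketch. It is only an illustration of the idea, not Festival's code: the `phone_word` naming scheme, the helper names, and the toy recordings (with shortened phone strings) are all assumptions made for this example.

```python
from collections import defaultdict


def unit_name(phone, word):
    """Tag a phone with the word it came from (hypothetical naming scheme)."""
    return f"{phone}_{word.lower()}"


# Hypothetical toy database: one entry per recorded word token.
# Phone strings are shortened and for illustration only.
recorded = [
    ("Pennsylvania", ["p", "eh", "n"]),
    ("Pittsburgh", ["p", "ih", "t", "s"]),
    ("Pittsburgh", ["p", "ih", "t", "s"]),  # a second example of the same word
]

# Index every phone instance under its word-tagged unit name.
inventory = defaultdict(list)
for token_id, (word, phones) in enumerate(recorded):
    for position, phone in enumerate(phones):
        inventory[unit_name(phone, word)].append((token_id, position))


def candidates(phone, word):
    """Return the (token, position) instances usable for this phone in this word."""
    return inventory.get(unit_name(phone, word), [])


# /p/ for "Pittsburgh" may only come from recordings of "Pittsburgh".
pittsburgh_p = candidates("p", "Pittsburgh")
pennsylvania_p = candidates("p", "Pennsylvania")
unseen_p = candidates("p", "Boston")
```

Because the word is part of the unit name, a word that was never recorded has no candidates at all, which is why the vocabulary of a limited domain voice has to be covered by the training sentences.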
• 1. Set the Environment:
$FESTVOXDIR/src/ldom/setup_ldom iiit time pra
# This gives a talking clock setup.
# To change it to any other domain, replace "etc/time.data" with the domain-specific training sentences.
# For non-English languages, these sentences are transliterated into English.
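Before replacing the training sentences, it can help to check the new file and see which words the domain actually covers. The sketch below assumes the usual Festvox prompt-list layout of one `( utterance_id "sentence" )` entry per line; the two sample sentences and the helper name `parse_prompts` are made up for illustration.

```python
import re

# One prompt per line: ( utterance_id "sentence text" )
PROMPT_LINE = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')


def parse_prompts(text):
    """Return a list of (utterance_id, sentence) pairs from prompt-list text."""
    prompts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = PROMPT_LINE.match(line)
        if match is None:
            raise ValueError(f"not a prompt line: {line!r}")
        prompts.append((match.group(1), match.group(2)))
    return prompts


# Illustrative sentences only, not the contents of the real etc/time.data.
example = '''
( time0001 "The time is now, exactly five past one, in the morning." )
( time0002 "The time is now, just after ten past two, in the morning." )
'''

prompts = parse_prompts(example)

# The domain vocabulary is simply the set of words seen in the prompts.
vocabulary = sorted(
    {word.strip(".,").lower() for _, sentence in prompts for word in sentence.split()}
)
```

Any word outside `vocabulary` cannot be synthesized from word-specific units, so the list is a quick way to judge whether the training sentences cover the intended domain.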
• 2. Generate Prompts
– Synthesize the sentences which *you* are going to speak
– How can you synthesize? – mostly applicable to English only
– Why synthesize at all? – To *prompt* you what to speak
festival -b festvox/build_ldom.scm '(build_prompts "etc/txt.done.data")'
• 3. Record prompts
– For new languages, switch off the *playing of the prompt* by commenting out na_play in bin/prompt_them
bin/prompt_them etc/txt.done.data
• 4. Label Automatically
– Uses dynamic programming for labeling the speech
– Labeling builds the correspondence between the text and the speech
bin/make_labs prompt-wav/*.wav
• 4.1 Manually correct the labeling errors
emulabel etc/emu_lab time0001
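The dynamic programming behind the automatic labeling in step 4 can be illustrated with a generic dynamic time warping (DTW) sketch: align the frames of a prompt whose phone boundaries are known with the frames of the recording, then carry the boundaries across. This is a simplified picture under stated assumptions, not the actual code run by bin/make_labs: features are single numbers per frame instead of real acoustic vectors, and all names and data are hypothetical.

```python
def dtw_path(a, b):
    """Return the minimum-cost monotonic alignment between sequences a and b
    as a list of (index_in_a, index_in_b) pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist = abs(a[i - 1] - b[j - 1])  # local frame distance
            cost[i][j] = dist + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Trace back from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def transfer_labels(prompt_labels, path):
    """Map (phone, last_prompt_frame) labels onto recording frames."""
    last_recording_frame = {}
    for prompt_frame, recording_frame in path:
        # The path is monotonic, so the final write holds the largest frame.
        last_recording_frame[prompt_frame] = recording_frame
    return [(phone, last_recording_frame[end]) for phone, end in prompt_labels]


# Toy data: the recording is a slower rendition of the prompt.
prompt_features = [0, 0, 5, 5, 1, 1]
recording_features = [0, 0, 0, 5, 5, 5, 5, 1, 1]
prompt_labels = [("a", 1), ("b", 3), ("c", 5)]  # phone, last prompt frame

path = dtw_path(prompt_features, recording_features)
recorded_labels = transfer_labels(prompt_labels, path)
```

On real speech the alignment is rarely this clean, which is why step 4.1 asks for a manual pass over the resulting labels.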