New challenge: telephone Text To Speech & audio Speech recognition VoiceXML Homework: sign up on studio.tellme.com

Dec 30, 2015


Denis Doyle

New challenge: telephone

Text To Speech & audio

Speech recognition

VoiceXML

Homework: sign up on studio.tellme.com


Telephone

• Caller to system: speech recognition
  – using grammars (limited vocabulary, general audience, no training)
  – optional use of touch tones (numbers)
• System to caller: recorded audio (wav files) plus TTS (text to speech)
• Limited bandwidth in comparison to other applications, but a very familiar, ubiquitous medium
• Examples: 800 long distance, some airline information systems, others?


Problems in context

• Speech recognition: very difficult if
  – no restrictions on speakers
  – grammar for all of English with aim of 'natural language understanding'
• Text to speech: much easier problem (but English is more difficult than more fully phonetic languages like Spanish, I've been told).

(More next class)


studio.tellme.com

• Company that provides ‘engine’ for applications
• Provides developing environment
  – We are doing the tellme version of VoiceXML, but it appears to be standard.
• Register as a developer:
  – Provide your own id; assigned a PIN
  – Scratchpad for quick testing
    • Put VoiceXML in ScratchPad place (no audio files)
    • 1-800-555-VXML (8965)
      – SAY id and then PIN.
  – Application URL for projects with multiple files
    • To look at someone else's project, you change your Application URL
      – called pointing your account to a new source.


VoiceXML

• XML document (VXML header)
• VoiceXML has tags for flow-of-control and calculations.
  – Also can use <script> for JavaScript
• Grammars come in different varieties. We will use the tellme way.
  – Grammars are included in CDATA tags to prevent XML interpretation.
  – Many grammars constructed for you.
    • <field name="answer" type="boolean"> … will listen for yes or no.
    • <field name="price" type="currency"> … will listen for currency.
  – <menu> <choice> <choice> for list


VoiceXML basics, continued

• <form> element can contain
  – <block> elements, which can contain <audio>, <goto>, and other elements
  – <field>, which can contain
    • <prompt>
    • <grammar> (if not one of built-in grammars)
    • <filled>
• <var> tags can be at different levels (for example, document, block, or higher levels)
• <if> <elseif> <else> tags
• <script> elements for JavaScript (which can also appear in expressions)


VoiceXML basics: typical case

• a form element
  – <field>
    • <prompt>, made up of <audio>, with reference to recorded wav file and backup text
    • <grammar>, if NOT using built-in grammars designated by type attribute of field. This is a CDATA section.
    • <filled> with (follow-on) code using field
    • <catch> for nomatch, noinput cases


Caution

A form contains various elements, including a field.

If a field has a grammar and the grammar is satisfied, control goes to a filled tag.
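
As a rough mental model of that flow, the sketch below simulates one field in plain JavaScript. It is not how a VoiceXML interpreter is implemented: the function name collectField and the phrase-to-value table are made up for illustration, and real recognition works on audio, not on strings.

```javascript
// Hypothetical model of one <field>: reports which handler would get control.
// 'grammar' maps each accepted phrase to the value stored in the field.
function collectField(utterance, grammar) {
  if (utterance === null || utterance === '') {
    return { event: 'noinput' };                         // caller said nothing
  }
  var phrase = utterance.toLowerCase();
  if (Object.prototype.hasOwnProperty.call(grammar, phrase)) {
    return { event: 'filled', value: grammar[phrase] };  // grammar satisfied
  }
  return { event: 'nomatch' };                           // heard, but not in grammar
}

// Stand-in for a yes/no grammar (assumed phrases, for illustration only).
var yesNo = { 'yes': true, 'no': false };

console.log(collectField('Yes', yesNo));
console.log(collectField('maybe', yesNo));
console.log(collectField('', yesNo));
```

Only the first call reaches the 'filled' case; the other two correspond to the nomatch and noinput events that a <catch> would handle.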


obligatory…

<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>
      <audio src="prompt1.wav">Hello, world</audio>
    </block>
  </form>
</vxml>

• prompt1.wav: recorded using tellme studio
• Hello, world: backup using TTS, just in case src file is missing


Preparation: objects

• JavaScript (and other languages) use classes and objects

• Objects (aka object instances) are declared (created, instantiated) as members of a class
• Objects have
  – properties ('the data')
  – methods (functions that you can use 'on' the objects)
  – static methods (called on the class itself, not on an instance)
    • Math.random


Example: tm_date

• var dt = new tm_date; creates a date/time object.
• Use methods to extract/manipulate information held 'in' dt.
    var day = dt.get_day();
• Use static methods supplied to do common tasks:
    var dn = tm_date.to_day_of_week_name(day);
  or directly:
    var dn = tm_date.to_day_of_week_name(dt.get_day());
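
The same ideas (objects, methods, static methods) can be tried out with JavaScript's built-in Date and Math. This is only an analogy: the standard Date methods below are not the tm_date methods, and the dayNames table is a hand-made stand-in for a helper like tm_date.to_day_of_week_name.

```javascript
// Create an object (an instance of the built-in Date class) for a fixed moment.
// Date.UTC is a static method: it is called on the class, not on an instance.
var dt = new Date(Date.UTC(2002, 0, 15, 3, 30));  // 15 Jan 2002, 03:30 GMT

// Methods are called 'on' the object to extract information held in it.
var day = dt.getUTCDay();        // 0 = Sunday ... 6 = Saturday
var hour = dt.getUTCHours();

// Hand-made lookup table standing in for a day-name helper.
var dayNames = ['Sunday', 'Monday', 'Tuesday', 'Wednesday',
                'Thursday', 'Friday', 'Saturday'];
var dn = dayNames[day];

// Math.random is another static method: there is no 'new Math'.
var r = Math.random();           // 0 <= r < 1

console.log(dn + ', hour ' + hour);
```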


outline

• Header stuff

• script with external reference

• script (code) encased in CDATA notation

• Form/Block, with text to speech using value produced by script

• Closing stuff


<?xml version="1.0"?>
<vxml version="2.0">
<script src="http://resources.tellme.com/lib/code/tm_date.js"/>

Will make use of date functions


<script>
<![CDATA[
  var dt = new tm_date();
  var monis = tm_date.to_month_name(dt.get_month());
  var dateis = dt.get_date();
  var dayis = tm_date.to_day_of_week_name(dt.get_day());
  var yearis = tm_date.to_year_name(dt.get_full_year());
  var houris = dt.get_hours() - 4;   // brute force correction from GMT
  var minutesis = dt.get_minutes();
  var whole = 'The date is ' + monis + ' ' + dateis + '. It is ' + dayis + '. The time is ' + houris + ' ' + minutesis;
]]>
</script>


<form>
  <block>
    Hello.
    <value expr="whole"/>
    Good bye.
  </block>
</form>
</vxml>

Can use block for audio
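
One thing to watch in the brute force correction from GMT: subtracting 4 from the hour gives a negative number when the GMT hour is 0 through 3. A possible fix, sketched here in plain JavaScript, wraps the hour around a 24-hour clock and pads the minutes to two digits. The function names are made up for illustration, and the fixed 4-hour offset is simply the one used in the script; the day and date would still need their own adjustment during those early hours.

```javascript
// Shift a GMT hour (0-23) by a fixed offset and wrap around midnight.
function localHour(gmtHour, offsetHours) {
  return ((gmtHour - offsetHours) % 24 + 24) % 24;
}

// Pad minutes to two digits so that 5 becomes '05'.
function twoDigits(n) {
  return (n < 10 ? '0' : '') + n;
}

// Build the time part of the spoken sentence.
function timePhrase(gmtHour, minutes) {
  return 'The time is ' + localHour(gmtHour, 4) + ' ' + twoDigits(minutes);
}

console.log(timePhrase(2, 5));
console.log(timePhrase(18, 30));
```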


Example: my family

• Directed responses to 3 family members:
  – Daniel: question/response on activities
  – Aviva: question/response on number of cranes
  – Esther: response
• Calculations (arithmetic) done using variables
• if tags
  – The cond attribute is a condition test.
• Limited error handling: exit on no-match event
  – alternative is to repeat prompt, generally using count attribute


<vxml version="2.0">
<form>
  <field name="childid">
    <prompt>
      <audio src="whosthis.wav">Hello. Who is calling?</audio>
    </prompt>


    <grammar type="application/x-gsl" mode="voice">
      <![CDATA[
        [
          [dan daniel (daniel meyer) (dan meyer)] {<childid "daniel">}
          [aviva (aviva meyer)] {<childid "aviva">}
          [esther (esther minkin)] {<childid "esther">}
        ]
      ]]>
    </grammar>


    <catch event="noinput nomatch">
      <audio src="sorry.wav">Sorry. I didn't get that.</audio>
      <exit/>
    </catch>

    <filled>
      <if cond="'daniel'==childid">
        <goto next="#danfollowup"/>
      <elseif cond="'aviva'==childid"/>
        <goto next="#avivafollowup"/>
      <elseif cond="'esther'==childid"/>
        <goto next="#estherfollowup"/>
      <else/>
        <reprompt/>
      </if>
    </filled>
  </field>
</form>

• The <else/> branch never happens.
• Note inner, single quote marks.
• Note double ='s.
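
To see how the grammar and the <filled> block work together, here is a plain JavaScript sketch of the same two steps: map what the caller said to a childid value, then pick the next form. The table and function names are invented for illustration; the real matching is done by the speech recognizer against the GSL grammar.

```javascript
// Step 1: each accepted phrase maps to the value placed in childid,
// mirroring the slot assignments in the grammar.
var phraseToChild = {
  'dan': 'daniel', 'daniel': 'daniel',
  'daniel meyer': 'daniel', 'dan meyer': 'daniel',
  'aviva': 'aviva', 'aviva meyer': 'aviva',
  'esther': 'esther', 'esther minkin': 'esther'
};

// Step 2: the same tests as the cond attributes. Inside the XML the strings
// use single quotes because the attribute value itself is in double quotes.
function nextForm(childid) {
  if ('daniel' == childid) {
    return '#danfollowup';
  } else if ('aviva' == childid) {
    return '#avivafollowup';
  } else if ('esther' == childid) {
    return '#estherfollowup';
  }
  return 'reprompt';   // not reached when childid came from the grammar
}

console.log(nextForm(phraseToChild['dan meyer']));
console.log(nextForm(phraseToChild['aviva']));
```

Because the grammar can only produce the three values tested, the final branch is a safety net, which is why the else case never happens. The double == compares; a single = is assignment.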


<form id="danfollowup">
  <field name="today">
    <prompt>
      <audio src="congratsdan.wav">Congratulations on the new job. Did you work on your thesis, or do aikido or jo today?</audio>
    </prompt>
    <grammar type="application/x-gsl" mode="voice">
      <![CDATA[
        [
          [aikido (i key dough)] {<today "aikido">}
          [thesis (work)] {<today "thesis">}
          [jo (joe)] {<today "jo">}
          [both (all) (everything) ((i key dough) jo)] {<today "both">}
          [none nothing (sort of)] {<today "nothing">}
        ]
      ]]>
    </grammar>
    <catch event="noinput nomatch">
      <audio>I didn't quite understand. Call or send e-mail.</audio>
      <exit/>
    </catch>


    <filled>
      <if cond="today=='aikido'">
        <audio>Some aikido is fine.</audio>
      <elseif cond="today=='thesis'"/>
        <audio>Good, but do other things also.</audio>
      <elseif cond="today=='jo'"/>
        <audio>Don't get hit in the head.</audio>
      <elseif cond="today=='both'"/>
        <audio>Doing some of everything is best.</audio>
      <elseif cond="today=='nothing'"/>
        <audio>You deserve a break, but remember you want to be done by September.</audio>
      <else/>
        <audio>See you soon.</audio>
      </if>
    </filled>
  </field>
  <block>
    <audio>Good bye</audio>
  </block>
</form>


<form id="avivafollowup">

<var name="rest" expr="1000"/>

<field name="bcount" type="number">

<prompt>

<audio src="howmanycranes.wav">Hello, Aviva. How many cranes have you made? </audio>

</prompt>

<grammar type="application/x-gsl" mode="voice" >

<![CDATA[

NATURAL_NUMBER_THRU_9999

]]>

</grammar>

<catch event="noinput nomatch"> <audio src="sorry.wav">Sorry. I didn't get that.</audio> <exit/> </catch>


    <filled>
      <assign name="rest" expr="1000-bcount"/>
      <audio><value expr="rest"/></audio>
      <audio src="togo.wav">to go.</audio>
      <if cond="rest&lt;200">
        <audio src="homestretch.wav">You're in the home stretch</audio>
      <elseif cond="rest&lt;500"/>
        <audio src="morethanhalf.wav">More than half way</audio>
      <elseif cond="rest&lt;800"/>
        <audio src="goodstart.wav">Off to a good start</audio>
      <else/>
        <audio>Get a move on</audio>
      </if>
      <audio src="goodbye.wav">Good bye.</audio>
    </filled>
  </field>
</form>

• Can't use < inside the cond attribute: write &lt; instead.
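
The arithmetic and the chain of tests are easier to check outside the XML. The sketch below repeats them in plain JavaScript, where < can be written directly; the function name and the returned object are made up for illustration.

```javascript
// Mirror of the <filled> logic: compute cranes remaining out of 1000
// and choose the message using the same thresholds, tested in order.
function craneReport(bcount) {
  var rest = 1000 - bcount;
  var message;
  if (rest < 200) {
    message = "You're in the home stretch";
  } else if (rest < 500) {
    message = 'More than half way';
  } else if (rest < 800) {
    message = 'Off to a good start';
  } else {
    message = 'Get a move on';
  }
  return { rest: rest, message: message };
}

console.log(craneReport(850));
console.log(craneReport(100));
```

The order of the tests matters: each branch is reached only if the earlier ones failed, so the second test effectively means rest is at least 200 and less than 500.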


<form id="estherfollowup">

<block>

<audio >Hello, Mommy. This is all I can do now. </audio>

</block>

</form>

</vxml>


Application logic

• VoiceXML elements (for example, <if> and <var>)
  – Note: more powerful than XSLT: <assign> tag
• JavaScript code in attributes (for example, cond, expr)
• JavaScript code in <script> </script>
  – Encase in CDATA to avoid problems with certain characters
• External JavaScript code, cited using <script src="file address"/>


Class work

• EVERYONE (who hasn't already): sign up on studio.tellme.com tonight
• Design simple application (you may work in groups):
  – Ask one question
  – Detect and respond to each of 2 or 3 answers
  – Use examples here for models
  – All text to speech
• Pick (at least) one and implement.
• (Do this for a short time and then go on to next lecture. Resume after 9pm when minutes are free.)


Homework

• (Majors requirement overdue: there will be a deduction but better late than never.)
• Go to studio.tellme.com & sign up as developer.
  – try examples (using scratch pad)
  – record some voice samples
  – do tellme tutorials
• ALSO try and report on
  – 800 long distance or some other commercial application