VoiceXML and Internet Telephony

Kundan Singh and Henning Schulzrinne
Columbia University
{kns10,hgs}@cs.columbia.edu

Joint work (in progress) with Daniel, Naho, Visda and Sean.





18 April, 2001 VoiceXML/Kundan Singh/Columbia University


Overview

A language for specifying voice dialogs in interactive voice response (IVR) systems.

• Information retrieval
  – News, sports, traffic, stock quotes
• e-business
  – Customer service, banking, stock trading
• Notification service


PSTN-based IVR Platform

[Diagram: End user, PSTN, IVR platform. The end user dials 1-212-8545224 and hears "Welcome to voice mail. Press 3 to listen to new messages..."]

IVR [1] platform
• Voice and telephony functions (ASR [2], TTS [3], DTMF [4])
• Service logic (application specific)

• Receives incoming PSTN [5] call
• Responds back with prompts
• Accepts user input (DTMF or speech)
• Takes action based on user input

(Usually the service logic is programmed for the specific application, say a weather report.)

[1] Interactive voice response
[2] Automatic speech recognition
[3] Text to speech
[4] Dual tone multi-frequency (touch tone)
[5] Public switched telephone network


Decomposition

[Diagram: End user, PSTN, IVR platform]

IVR platform
• Voice and telephony functions (ASR, TTS, DTMF)
• Service logic (application specific)

[Diagram: End user, Voice gateway, Internet, Web server]

Voice gateway
• Voice and telephony functions

Web server
• Service logic


VoiceXML

[Diagram: End user, PSTN, Voice gateway, Internet, Web server, End user. Labels: VXML, HTML. Web server resources: DB, Multimedia, Audio/grammar, Scripts]

Voice gateway
• Voice and telephony functions
• VoiceXML browser

Web server
• Service logic (CGI, servlet, JSP)


Why VoiceXML

• Alternative: write a C/C++ application on the telephony platform?

• Separates application-specific service logic (HTML, VoiceXML) from user interaction (browser, I/O device)

• Can use existing web development tools

• Can have a single application for both web and voice

• Can use existing infrastructure: HTTP, web servers, etc.

• Programming voice services for telephony platforms


VoiceXML vs HTML

• Phone vs PC; the I/O device is the phone

• Transport: HTTP

• Voice browser vs web browser

• VoiceXML form vs HTML form

HTML:

  <form action="url">
    Enter your Id: <input name="id">
    <input type="submit">
  </form>

VoiceXML:

  <form>
    <field name="id">
      <prompt>Your ID, please.</prompt>
    </field>
    <block>
      <submit next="url"/>
    </block>
  </form>


VoiceXML examples [1]

<?xml version="1.0"?>
<vxml version="1.0">
  <form>
    <block>
      <prompt>
        <emp>Hello</emp>, World!
      </prompt>
    </block>
  </form>
</vxml>


VoiceXML examples [2]

<form id="weather_info">
  <block>Welcome to the weather information service.</block>
  <field name="state">
    <prompt>What state?</prompt>
    <grammar src="state.gram" type="application/x-jsgf"/>
    <catch event="help">
      Please speak the state for which you want the weather.
    </catch>
  </field>


VoiceXML examples [2] (continued)

  <field name="city">
    <prompt>What city?</prompt>
    <grammar src="city.gram" type="application/x-jsgf"/>
    <help>
      Please speak the city for which you want the weather.
    </help>
  </field>
  <block>
    <submit next="/servlet/weather" namelist="city state"/>
  </block>
</form>

Grammar (state.gram):

California | Illinois | New Jersey | New York


VoiceXML examples [3]

<field name="card_type">
  …
  <grammar>
    visa {visa}
    | master [card] {mastercard}
    | amex {amex}
    | american [express] {amex}
  </grammar>
  <help>Please say Visa, Mastercard, or American Express.</help>
  …
</field>
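One way to read the inline grammar above: each alternative is a phrase, a word in square brackets is optional, and the name in braces is the value returned to the field. The following is a minimal Python sketch of that reading, assuming a simplified syntax (no nesting, one tag per alternative). The names `expand_grammar` and `recognize` are hypothetical helpers for illustration, not part of any VoiceXML platform.

```python
import itertools
import re
from typing import Dict, Optional

CARD_GRAMMAR = (
    "visa {visa} | master [card] {mastercard} | "
    "amex {amex} | american [express] {amex}"
)


def expand_grammar(grammar: str) -> Dict[str, str]:
    """Map every phrase the grammar accepts to the tag in braces."""
    phrases: Dict[str, str] = {}
    for alternative in grammar.split("|"):
        tag_match = re.search(r"\{(\w+)\}", alternative)
        if tag_match is None:
            continue  # this sketch skips alternatives without a tag
        tag = tag_match.group(1)
        words = alternative[: tag_match.start()].split()
        # A bracketed word may be spoken or omitted; other words are required.
        choices = [
            (w[1:-1], "") if w.startswith("[") and w.endswith("]") else (w,)
            for w in words
        ]
        for combo in itertools.product(*choices):
            phrases[" ".join(part for part in combo if part)] = tag
    return phrases


def recognize(utterance: str, phrases: Dict[str, str]) -> Optional[str]:
    """Return the field value for an utterance, or None (a nomatch)."""
    return phrases.get(" ".join(utterance.lower().split()))


phrases = expand_grammar(CARD_GRAMMAR)
print(recognize("Master Card", phrases))  # mastercard
print(recognize("american", phrases))     # amex
print(recognize("discover", phrases))     # None
```

Both "amex" and "american express" map to the same value, so the service logic only ever sees three card types.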


VoiceXML examples [4]

<form>
  <field name="drink">
    <prompt>Would you like Coffee, Tea, Milk or Nothing?</prompt>
    <option value="coffee">coffee</option>
    <option value="tea">tea</option>
    <option value="milk">milk</option>
    <option value="nothing">nothing</option>
  </field>
  <block>
    <submit next="http://…/bartender.cgi" namelist="drink"/>
  </block>
</form>


VoiceXML examples [5]

<menu>
  <prompt>Would you like Coffee, Tea, Milk or Nothing?</prompt>
  <choice next="http://…coffee.vxml">coffee</choice>
  <choice next="http://…tea.vxml">tea</choice>
  <choice next="http://…milk.vxml">milk</choice>
  <choice next="http://…blank.vxml">nothing</choice>
  <nomatch count="1">I did not understand what you said.</nomatch>
  <nomatch count="2">Please say one of coffee, tea, milk or nothing.</nomatch>
  <noinput>You must say something.</noinput>
</menu>

Alternatively, the prompt can list the choices automatically: "Would you like <enumerate/>"
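The count attributes on the <nomatch> handlers in the menu above let the reply escalate on repeated failures: the browser keeps a counter per event and picks the handler with the highest count that does not exceed it. A minimal Python sketch of that selection rule follows, with the handlers modeled as (count, message) pairs. The name `select_handler` is hypothetical, and the sketch ignores handler scoping and inheritance.

```python
from typing import List, Optional, Tuple

# (count, message) pairs taken from the <nomatch> handlers in the menu.
NOMATCH_HANDLERS = [
    (1, "I did not understand what you said."),
    (2, "Please say one of coffee, tea, milk or nothing."),
]


def select_handler(handlers: List[Tuple[int, str]],
                   occurrences: int) -> Optional[str]:
    """Pick the message whose count is the largest one <= occurrences."""
    eligible = [(count, msg) for count, msg in handlers if count <= occurrences]
    if not eligible:
        return None
    return max(eligible, key=lambda pair: pair[0])[1]


# The second and every later nomatch gets the more explicit message.
for attempt in (1, 2, 3):
    print(attempt, select_handler(NOMATCH_HANDLERS, attempt))
```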


Form Interpretation Algorithm

• Initialize variables, counters.

• Main loop
  – Select phase: select the next form item to visit
  – Collect phase: prompt and collect input
  – Process phase: process the input or event

• Document: collection of forms

• An application can use multiple documents
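The select, collect and process phases above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the full algorithm: a form is modeled as a list of dictionaries, recognition is an exact match against an `accept` set, and `run_form`, `get_input` and `max_turns` are hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Optional


def run_form(fields: List[dict],
             get_input: Callable[[str], Optional[str]],
             max_turns: int = 20) -> Dict[str, str]:
    """Simplified select / collect / process loop over a form's fields."""
    # Initialize: every field variable starts out unfilled.
    values: Dict[str, Optional[str]] = {f["name"]: None for f in fields}

    for _ in range(max_turns):
        # Select phase: first field whose variable is still unfilled.
        field = next((f for f in fields if values[f["name"]] is None), None)
        if field is None:
            break  # nothing left to visit, the form is complete

        # Collect phase: play the prompt and wait for input.
        heard = get_input(field["prompt"])

        # Process phase: fill the field, or report the event.
        if heard is None:
            print("noinput:", field["name"])
        elif heard.lower() in field["accept"]:
            values[field["name"]] = heard.lower()
        else:
            print("nomatch:", field["name"])

    return {name: v for name, v in values.items() if v is not None}


weather_form = [
    {"name": "state", "prompt": "What state?",
     "accept": {"new york", "illinois"}},
    {"name": "city", "prompt": "What city?",
     "accept": {"new york", "chicago"}},
]
# Scripted caller: one wrong answer, one silence, two good answers.
scripted = iter(["Texas", "New York", None, "New York"])
result = run_form(weather_form, lambda prompt: next(scripted))
print(result)
```

A field that fails with nomatch or noinput stays unfilled, so the select phase simply visits it again on the next turn.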


VoiceXML scope

• Human-Machine Interaction
  – Audio output (TTS, pre-recorded file)
  – Audio input (speech recognition, audio recording)
  – Character input (DTMF)
  – Presentation logic (scripting)

• Basic Connection Control
  – disconnect
  – transfer


Application scope

• General service logic

• State management

• Dialog generation

• Dialog sequencing

• Database operation


VoiceXML features

• Menus, forms, sub-dialogs

• Inputs (grammar, record, dtmf)

• Outputs (audio, text-to-speech)

• Events (error handling: nomatch, noinput, catch-throw)

• Variables and scripting (var, assign, if)

• Transitions or links (goto, submit)

• Transfer to a third party (or add a third party to the call)

• Disconnect the call

• Platform-specific objects and properties

• Pre-fetching


VoiceXML 1.0 <tags>

assign, audio, block, break, catch, choice, clear, disconnect, div, dtmf, else, elseif, emp, enumerate, error, exit, field, filled, form, goto, grammar, help, if, initial, link, menu, meta, noinput, nomatch, object, option, param, property, pros, record, reprompt, return, sayas, script, subdialog, submit, throw, transfer, value, var, vxml

Categories: Telephony, Speech synthesis or audio output, User input and grammar, Program flow, Variables and properties, Error handling, Misc.


Internet Telephony

[Diagram: End user, PSTN, Voice gateway, Internet, End user, Web server]

Voice gateway
• Voice and telephony functions
• VoiceXML browser

Web server
• Service logic (CGI, servlet, JSP)


Internet Telephony

[Diagram: End user, PSTN, Voice gateway (PSTN/SIP), SIP user agent, SIP phone, VoiceXML browser with SIP (new module), Web server]

Web server
• CGI, servlet, JSP


Internet Telephony

[Diagram: SIP phone, gateway, VoiceXML browser with SIP, Web server]

Web server (CGI, servlet, JSP)
Examples: email by phone, voicemail by phone, directory services for a department, web browsing by phone (not WAP), …

SIP phone and gateway
• SIP for signaling
• RTP for audio
• DTMF (either in-band audio tones or RFC 2833)

VoiceXML browser with SIP
• Accept SIP connection
• Fetch XML page over HTTP
• Parse XML
• Interpret VoiceXML tags
• Do text-to-speech
• Receive and detect user input (DTMF, or in future speech)
• Parse according to the grammar
• Fetch audio file from web and play to the user
• ...
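The parse and interpret steps in the browser list above can be illustrated with Python's standard `xml.etree.ElementTree` module. This is only a sketch under stated assumptions: the SIP, HTTP and text-to-speech steps are left out, the page is inlined as a string rather than fetched, and `prompts_to_speak` is a hypothetical name, not part of the browser described in these slides.

```python
import xml.etree.ElementTree as ET

# Stand-in for a page fetched over HTTP (the "Hello, World!" example).
VXML_PAGE = """<?xml version="1.0"?>
<vxml version="1.0">
  <form>
    <block>
      <prompt><emp>Hello</emp>, World!</prompt>
    </block>
  </form>
</vxml>"""


def prompts_to_speak(document: str) -> list:
    """Parse a VoiceXML page and return the text of each prompt, in order."""
    root = ET.fromstring(document)
    if root.tag != "vxml":
        raise ValueError("not a VoiceXML document")
    spoken = []
    for prompt in root.iter("prompt"):
        # itertext() flattens nested markup such as <emp> into plain text.
        text = " ".join("".join(prompt.itertext()).split())
        spoken.append(text)
    return spoken


print(prompts_to_speak(VXML_PAGE))  # ['Hello, World!']
```

A real browser would hand each string to the text-to-speech engine and stream the audio over RTP instead of printing it.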


Status

• Email by phone (using TellMe voice browser)

• VoiceXML browser: ongoing