Developing Spoken Dialogue Systems in the Communicator / RavenClaw Framework Sphinx Lunch Talk Carnegie Mellon University, October 2004 Presented by: Dan Bohus Special appearances: Antoine Raux, Jahanzeb Sherwani, Thomas Harris

Examples

- RoomLine: conference room reservations within SCS; the system can access the schedules of 13 conference rooms in Wean Hall and NSH
- Let’s Go! Bus Information System: bus schedule information system for Port Authority buses in Oakland and Squirrel Hill [Let’s Go! Project]
- Sublime: personalized information management system
- TeamTalk: an investigation into human and multi-robot spoken language communication in unstructured environments


More Systems

- LARRI: multimodal system that assists F/A-18 aircraft maintenance personnel throughout the execution of procedural tasks [Symphony]
- Madeleine: text-based prototype for a medical diagnosis system [MITRE workshop]
- Eureka: dialogue interface to the Vivisimo web search engine


The Communicator / RavenClaw Spoken Dialogue Systems Framework

- Examples
- Overall Architecture
- System Development
- Components & Resources
- Miscellaneous
- Current Research

examples : architecture : development : components : miscellaneous : research


Overall Architecture

Classical pipeline architecture:

- Recognition: SPHINX
- Lang. Understanding: PHOENIX / HELIOS
- Dialog Management: RAVENCLAW
- Back-end: (various)
- Lang. Generation: ROSETTA
- Synthesis: THETA


Galaxy HUB

[Diagram: the same six components (recognition, language understanding, dialog management, back-end, language generation, synthesis), now arranged around a central HUB]

Galaxy

- Generic, centralized, message-passing communication architecture
- Developed at MIT, used in the Communicator program
- Competitor: OAA
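To make "centralized, message-passing" concrete, here is a minimal Python sketch of a hub that routes every message between registered servers. It is not the Galaxy API: the `Hub` class, the server names and the message dictionaries are all hypothetical and chosen only for illustration.

```python
class Hub:
    """Toy centralized router: servers register by name, all traffic goes through the hub."""

    def __init__(self):
        self.servers = {}  # server name -> handler function
        self.log = []      # (sender, receiver) pairs, in routing order

    def register(self, name, handler):
        self.servers[name] = handler

    def send(self, sender, receiver, message):
        # Servers never call each other directly; the hub forwards the message.
        self.log.append((sender, receiver))
        return self.servers[receiver](message)


hub = Hub()
# Hypothetical stand-ins for a parser and a dialog manager.
hub.register("parser", lambda msg: {"parse": msg["text"].upper()})
hub.register("dialog_manager", lambda msg: {"act": "inform", "about": msg["parse"]})

parse = hub.send("recognizer", "parser", {"text": "hello"})
reply = hub.send("parser", "dialog_manager", parse)
print(reply)
```

Because every exchange passes through one place, the hub can log or reroute traffic without the individual servers knowing about each other.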


Getting Even Closer

[Diagram: detailed view of the servers around the HUB]

- Process Monitor
- Recognition Server: multiple, parallel SPHINX decoders
- Lang. Understanding: parsing (PHOENIX), confidence (HELIOS)
- Dialog Management: RAVENCLAW
- Back-end: Galaxy stub + actual Perl back-end
- Lang. Generation: Galaxy stub + ROSETTA (Perl)
- Synthesis: THETA
- Text I/O: TTYServer
- DateTime
- Other domain agents
- Inputs from other modalities


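Building a Spoken Dialogue System

[Diagram: each component of the pipeline, paired with the resource that has to be built for it]

- Recognition (SPHINX): language, acoustic and lexical models
- Lang. Understanding (PHOENIX/HELIOS): grammar
- Dialog Management (RAVENCLAW): RavenClaw dialog task specification
- Back-end (perl): back-end (perl)
- Lang. Generation (ROSETTA): templates
- Synthesis (THETA): (limited-domain) voice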


So How Long Will It Take?

- MITRE Workshop on Dialogue Management (Fall 2003)
- Develop a text-based SDS for medical diagnosis (back-end provided)
- Madeleine (22 hours)

[Pie chart: time spent building Madeleine]

- RavenClaw: 4h, 19%
- Design: 4h, 18%
- Grammar: 3h45, 18%
- Backend: 3h20, 16%
- Templates: 2h45, 13%
- RC fixes: 2h15, 11%
- Setup: 1h10, 5%
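As a quick sanity check on the Madeleine time breakdown, the sketch below adds up the per-task times from the pie chart and recomputes each share. The task names and durations are taken from the slide; the dictionary layout and variable names are just for illustration.

```python
# (hours, minutes) per task, as listed on the slide
breakdown = {
    "RavenClaw": (4, 0),
    "Design": (4, 0),
    "Grammar": (3, 45),
    "Backend": (3, 20),
    "Templates": (2, 45),
    "RC fixes": (2, 15),
    "Setup": (1, 10),
}

minutes = {task: h * 60 + m for task, (h, m) in breakdown.items()}
total_minutes = sum(minutes.values())

# Share of the total for each task, rounded to whole percent
shares = {task: round(100 * m / total_minutes) for task, m in minutes.items()}

hours, rest = divmod(total_minutes, 60)
print(f"total: {hours}h{rest:02d}")
for task, share in shares.items():
    print(f"{task}: {share}%")
```

The listed times add up to a little under the 22 hours quoted on the slide, and the recomputed shares agree with the slide's percentages to within one point.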


Okay, How Long Will It Really Take?

To get a system running with reasonable performance [poll among 3 RavenClaw developers]:

- 1 month to get a working system up and running
- 1 month to fine-tune performance
- Further iterative improvements will continue as more data accumulates


Components & Resources


SPHINX II

- Semi-continuous acoustic models
  - Off-the-shelf 8kHz, 11.025kHz, 16kHz models
  - Scripts for building your own
  - PLSA adapted models perform better
- Language models
  - 2-gram & 3-gram models
  - CMU-Cambridge SLM Toolkit
  - Generate from Phoenix grammar
  - Finite state grammar
  - Sphinx supports state-specific LMs
- Dictionary (lexical models)
  - CMU Dictionary
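For readers new to n-gram language models, here is a minimal Python sketch of a 2-gram model estimated by plain counting. It is an unsmoothed maximum-likelihood estimate over a made-up three-sentence corpus, meant only to show what a bigram probability is; a real toolkit adds smoothing and back-off, and nothing here reflects the CMU-Cambridge SLM Toolkit's interface.

```python
from collections import Counter


def train_bigram(sentences):
    """Return p(prev, word): the maximum-likelihood bigram probability P(word | prev)."""
    context_counts, bigram_counts = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        context_counts.update(words[:-1])              # every word that has a successor
        bigram_counts.update(zip(words[:-1], words[1:]))

    def p(prev, word):
        if context_counts[prev] == 0:
            return 0.0                                 # unseen context, no smoothing
        return bigram_counts[(prev, word)] / context_counts[prev]

    return p


# Hypothetical in-domain sentences
corpus = ["i need a room", "i need a large room", "i want a room"]
p = train_bigram(corpus)
print(p("i", "need"), p("a", "room"), p("a", "projector"))
```

Unseen word pairs get probability zero here, which is exactly the problem smoothing is meant to address.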


Sphinx II - continued

- Multiple parallel decoders [e.g., male + female]
  - Multiple hypotheses forwarded, selection done later
- Typical WER: 15-30%
  - With pronounced differences between native and non-native speakers
  - Lowered by retuning acoustic and language models to the domain
- Migration to SPHINX 3.x in the near future
  - Expected: big improvement in WER
  - Concern: real-time performance
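Since WER figures come up repeatedly in this talk, here is a short Python sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The function name and the example sentences are illustrative, and the sketch assumes a non-empty reference.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


wer = word_error_rate("do you have something a bit larger",
                      "do you have something bigger")
print(f"{wer:.0%}")
```

In this example two words are dropped and one is substituted, so three errors over seven reference words.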


Phoenix Parser / Grammar

- Phoenix: robust parser
- CFG grammar
  - Manually-generated domain-specific grammar rules
  - Reusable, generic sub-grammars: [Yes], [No], [Number], [DateTime], [Help], [Repeat], [Suspend], etc…

Grammar excerpt:

    [room_size_spec]
        ([rss_large])
        ([rss_small])
        ([rss_larger])
        ([rss_smaller])
        ([rss_smallest])
        ([rss_largest])
    ;
    [rss_large]
        (large)
        (big)
        (huge)
    ;
    [rss_larger]
        (*the larger)
        (*the bigger)
        (too small)
    ;
    [rss_largest]
        (*the largest)
        (*the biggest)
    ;
    [rss_small]
        (small)
        (little)
    ;

Example parse:

    DO YOU HAVE SOMETHING A BIT LARGER?
    [NeedRoom] ( [_i_want] (DO YOU HAVE SOMETHING) )
    [RoomSizeSpec] ( [room_size_spec] ( [rss_larger] (LARGER)))

Parses all incoming hypotheses and passes all parses along…
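The example parse above skips "A BIT" and still recovers the size specification. The Python sketch below imitates that behaviour with simple phrase spotting: it scans the utterance, matches the longest known phrase at each position, and skips words it cannot match. This is only an illustration of robust, skip-tolerant matching, not Phoenix's actual algorithm; the `*the` entries are expanded by hand on the assumption that the `*` prefix marks an optional word.

```python
# Slot name -> phrases, loosely based on the grammar excerpt on the slide.
# "*the larger" is written out as both "the larger" and "larger".
GRAMMAR = {
    "rss_large": ["large", "big", "huge"],
    "rss_larger": ["the larger", "larger", "the bigger", "bigger", "too small"],
    "rss_small": ["small", "little"],
}


def spot_phrases(utterance, grammar=GRAMMAR):
    """Return (slot, matched phrase) pairs, skipping words that match nothing."""
    words = utterance.lower().replace("?", "").split()
    # Try longer phrases first so "too small" wins over "small".
    patterns = sorted(
        ((phrase.split(), slot) for slot, phrases in grammar.items() for phrase in phrases),
        key=lambda item: -len(item[0]),
    )
    found, i = [], 0
    while i < len(words):
        for pattern, slot in patterns:
            if words[i:i + len(pattern)] == pattern:
                found.append((slot, " ".join(pattern)))
                i += len(pattern)
                break
        else:
            i += 1  # no phrase starts here: skip the word and keep going
    return found


print(spot_phrases("Do you have something a bit larger?"))
print(spot_phrases("The room is too small"))
```

Skipping unmatched words is what lets a parser of this kind cope with disfluent or partially misrecognized input.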


Helios / Confidence Annotation

- Builds accurate confidence scores using features from 3 sources of knowledge:
  - Speech recognition
  - Language understanding
  - Dialogue management
- Selects the hypothesis with the maximum confidence score
- Research in progress on hypothesis selection and transferability across domains
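A minimal Python sketch of the two steps on this slide: combine features from recognition, understanding and dialogue management into one confidence score, then keep the hypothesis with the highest score. The logistic form, the feature names and the weights are all assumptions made for illustration; the slide does not say how Helios computes its scores.

```python
import math

# Hypothetical weights; in practice these would be learned from labeled data.
WEIGHTS = {
    "bias": -1.0,
    "acoustic_score": 2.0,   # from speech recognition
    "parse_coverage": 1.5,   # from language understanding
    "expected_by_dm": 1.0,   # from dialogue management
}


def confidence(features):
    """Squash a weighted sum of features into a score between 0 and 1."""
    z = WEIGHTS["bias"] + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def select_hypothesis(hypotheses):
    """Pick the hypothesis with the maximum confidence score."""
    return max(hypotheses, key=lambda h: confidence(h["features"]))


hypotheses = [
    {"text": "i have a beaver",
     "features": {"acoustic_score": 0.9, "parse_coverage": 0.4, "expected_by_dm": 0.0}},
    {"text": "i have a fever",
     "features": {"acoustic_score": 0.7, "parse_coverage": 1.0, "expected_by_dm": 1.0}},
]
best = select_hypothesis(hypotheses)
print(best["text"], round(confidence(best["features"]), 2))
```

The second hypothesis wins despite a lower acoustic score, because it parses fully and matches what the dialogue manager expects.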


RavenClaw Architecture

Dialog Task (Specification)

- Captures all domain-specific dialog (task) logic using a hierarchical description
- The authoring effort is focused entirely here

Domain-independent Dialog Engine

- Manages dialog by executing the dialog task specification
- Provides a large number of domain-independent conversational strategies


RavenClaw: Dialogue Task Specification

[Diagram: the Madeleine dialog task tree]

    Madeleine
      I:Welcome
      E:LoadSymptoms
      GeneralFeel
        R:HowAreYou?
        I:Glad
        I:Sorry
      Diagnose
        Fever
          R:AskFever
          E:MeasureTemp
          I:InformFever
        Travel

Concepts: general_feeling, have_fever, diagnostic

- Tree of dialog agents
  - Terminals: Inform, Request, Expect, Execute
  - Non-terminals / dialog agency: plans execution of child nodes
- Basically a Hierarchical Task Execution Network; each agent has:
  - Preconditions & effects
  - Success & failure criteria
  - Trigger (focus) criteria
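To illustrate how an agency can plan the execution of its children from preconditions, here is a minimal Python sketch of the GeneralFeel subtree. The `Agent` class and `next_child` function are hypothetical and far simpler than RavenClaw itself; they only show the idea that the next agent to run is the first child that is not yet completed and whose precondition holds.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Agent:
    name: str
    kind: str = "agency"  # "inform", "request", "expect", "execute" or "agency"
    children: List["Agent"] = field(default_factory=list)
    precondition: Callable[[Dict[str, str]], bool] = lambda concepts: True
    completed: bool = False


def next_child(agency: Agent, concepts: Dict[str, str]) -> Optional[Agent]:
    """Return the first child that is not completed and whose precondition holds."""
    for child in agency.children:
        if not child.completed and child.precondition(concepts):
            return child
    return None


general_feel = Agent("GeneralFeel", children=[
    Agent("HowAreYou", kind="request"),
    Agent("Glad", kind="inform",
          precondition=lambda c: c.get("general_feeling") == "good"),
    Agent("Sorry", kind="inform",
          precondition=lambda c: c.get("general_feeling") not in (None, "good")),
])

concepts: Dict[str, str] = {}
first = next_child(general_feel, concepts)   # the request agent runs first
first.completed = True
concepts["general_feeling"] = "soso"         # pretend the user answered "so-so"
second = next_child(general_feel, concepts)  # Glad is blocked, Sorry is eligible
print(first.name, "->", second.name)
```

In this sketch the ordering falls out of the preconditions alone; success and failure criteria and focus triggers are left out to keep it short.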


Sample DTS Code

    // /Madeleine/GeneralFeel
    DEFINE_AGENCY(CGeneralFeel,
        DEFINE_CONCEPTS(
            STRING_USER_CONCEPT(general_feeling, none))
        DEFINE_SUBAGENTS(
            SUBAGENT(HowAreYou, CHowAreYou)
            SUBAGENT(Glad, CGlad)
            SUBAGENT(Sorry, CSorry))
        SUCCEEDS_WHEN(COMPLETED(Glad) || COMPLETED(Sorry)))

    // /Madeleine/GeneralFeel/HowAreYou
    DEFINE_REQUEST_AGENT(CHowAreYou,
        REQUEST_CONCEPT(general_feeling)
        GRAMMAR_MAPPING("![Yes]>good, ![FeelingGood]>good, "
                        "![FeelingSoSo]>soso, ![FeelingBad]>bad"))

    // /Madeleine/GeneralFeel/Glad
    DEFINE_INFORM_AGENT(CGlad,
        PRECONDITION(C("general_feeling") == CString("good"))
        PROMPT("inform glad_youre_good")
        ON_COMPLETION(FINISH(/Madeleine)))

    // /Madeleine/GeneralFeel/Sorry
    DEFINE_INFORM_AGENT(CSorry,
        PRECONDITION(C("general_feeling") != CString("good"))
        PROMPT("inform sorry_youre_bad"))

[Diagram: the GeneralFeel subtree, with concept general_feeling and children R:HowAreYou?, I:Glad, I:Sorry]


RavenClaw Execution

[Animation over several slides: the Madeleine dialog task tree, shown together with the Dialog Stack and the Expectation Agenda. The frames step through the trace below.]

Concepts: general_feeling, chart, have_fever, diagnostic; headache appears once LoadSymptoms is on the stack.

1. Dialog Stack: Madeleine
2. Dialog Stack: Madeleine > Welcome
   System: "Hi, this is Madeleine, the automated…"
3. Dialog Stack: Madeleine > LoadSymptoms
   The tree gains new request agents (R:Headache and three further R: agents) and the headache concept
4. Dialog Stack: Madeleine > GeneralFeel
5. Dialog Stack: Madeleine > GeneralFeel > HowAreYou
   System: "How are you feeling today?"

RavenClaw Execution / Input Pass

Expectation Agenda:

- HowAreYou: general_feeling: [good], [bad], [soso]
- GeneralFeel: general_feeling: [good], [bad], [soso]
- Madeleine: general_feeling: [good], [bad], [soso]; have_fever: [fever], ![yes], ![no]; headache: [headache], ![yes], ![no]; cough: [cough], ![yes], ![no]; …

User: "Not so good, I think I have a fever"
Parse: [soso](not so good) [fever](I think I have a fever)

6. Dialog Stack: Madeleine > GeneralFeel > Sorry
   System: "Oh, I’m sorry to hear that…"
7. System: "Let me take your temperature…"
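The input pass above can be sketched in a few lines of Python: each agent on the dialog stack contributes the grammar slots it expects, the agenda is assembled from the top of the stack down, and each slot in the user's parse is bound to the first expectation that matches. The data layout, the `"yes"` values and the function names are hypothetical; this is an illustration of the idea, not RavenClaw's implementation.

```python
# Grammar slot -> (concept, value) expected by each agent (illustrative values).
EXPECTATIONS = {
    "HowAreYou": {
        "[good]": ("general_feeling", "good"),
        "[soso]": ("general_feeling", "soso"),
        "[bad]": ("general_feeling", "bad"),
    },
    "Madeleine": {
        "[fever]": ("have_fever", "yes"),
        "[headache]": ("headache", "yes"),
    },
}


def build_agenda(stack):
    """One agenda level per agent, starting from the top of the dialog stack."""
    return [(agent, EXPECTATIONS.get(agent, {})) for agent in reversed(stack)]


def bind(agenda, parse_slots):
    """Bind each parsed slot to the first agenda level that expects it."""
    concepts = {}
    for slot in parse_slots:
        for _agent, expected in agenda:
            if slot in expected:
                concept, value = expected[slot]
                concepts.setdefault(concept, value)
                break
    return concepts


stack = ["Madeleine", "GeneralFeel", "HowAreYou"]  # bottom to top
agenda = build_agenda(stack)
# "Not so good, I think I have a fever" parsed as [soso] and [fever]
concepts = bind(agenda, ["[soso]", "[fever]"])
print(concepts)
```

The answer to the question in focus and the volunteered fever both get bound in one pass, which is why the system can react to the fever right after saying it is sorry.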


RavenClaw – Other features

- Dialogue Engine transparently provides a set of conversational skills
- Universal dialogue mechanisms:
  - Repeat, Suspend / Resume, Quit
  - Help: Help!, Where are we?, What can I say?
- Error handling:
  - Explicit and implicit confirmations
  - Strategies for recovering from non-understandings
- Dynamic dialogue task generation
- Dynamic dialogue control policy
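One simple way to picture the choice between explicit and implicit confirmations is a threshold policy on the confidence score, sketched below in Python. The thresholds, the action names and the idea of using fixed cut-offs are assumptions for illustration only; the slide does not describe how RavenClaw actually decides.

```python
def choose_action(confidence, reject_below=0.3, explicit_below=0.6, implicit_below=0.9):
    """Map a confidence score in [0, 1] to an error-handling action (hypothetical policy)."""
    if confidence < reject_below:
        return "reject"            # treat as a non-understanding and try to recover
    if confidence < explicit_below:
        return "explicit_confirm"  # e.g. "Did you say you have a fever?"
    if confidence < implicit_below:
        return "implicit_confirm"  # e.g. "A fever... Let me take your temperature."
    return "accept"


for score in (0.1, 0.5, 0.8, 0.95):
    print(score, choose_action(score))
```

Fixed thresholds are the simplest possible policy; the research slides later in the talk point toward more adaptable decision processes.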


Backend & Domain Agents

- Various problem-specific solutions
  - RoomLine: connects to a static Perl database or to the CMU CorporateTime server
  - Let’s Go! Bus Information System: connects to a PostgreSQL database
  - Sublime: connects to a MySQL database; also functions as a web server; DTW search domain agent
- Basically, build your own; we provide a stub for interfacing with the Galaxy Hub


Rosetta Language Generation

- Template- and stochastic-based language generation
- Input: (act, object, {slot=value})
- Output: text (tagged with concepts)

Template excerpt (Perl):

    # welcome to the system
    "welcome" => "Welcome to RoomLine, the automated conference room " .
                 "reservation system.",

    # greet user
    "greet_user" => ["Hi, <user_name>.",
                     "Hi, <user_name>, good to hear from you again."],

    # inform the user that the system has misunderstood the times (order)
    "wrong_time_order" => sub {
        my %args = @_;
        my $time_interval_as_string =
            get_wrong_time_interval_as_string(\%args, "room_query.date_time.time");
        my $answer = "I'm sorry, I must have misunderstood the " .
                     "time you needed the room. ";
        $answer .= "I heard $time_interval_as_string. ";
        return ["$answer So, let's see ... ",
                "$answer So, let's try this again ... ",
                "$answer So, let's try this once more ... "];
    },
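The template mechanism above can be mimicked in a few lines of Python: look up a template by (act, object), call it if it is a function, pick one variant at random if there are several, and fill in `<slot>` placeholders. This is a sketch of the general idea and not Rosetta's code; the `TEMPLATES` table and the `generate` function are hypothetical, with the prompt texts borrowed from the slide.

```python
import random
import re

TEMPLATES = {
    ("inform", "welcome"):
        "Welcome to RoomLine, the automated conference room reservation system.",
    ("inform", "greet_user"): [
        "Hi, <user_name>.",
        "Hi, <user_name>, good to hear from you again.",
    ],
    # A function can build its variants from the slot values.
    ("inform", "wrong_time_order"): lambda slots: [
        f"I heard {slots['time']}. So, let's see ...",
        f"I heard {slots['time']}. So, let's try this again ...",
    ],
}


def generate(act, obj, slots, rng=random):
    """Turn (act, object, {slot: value}) into text."""
    template = TEMPLATES[(act, obj)]
    if callable(template):
        template = template(slots)
    if isinstance(template, list):
        template = rng.choice(template)  # vary the wording between turns
    # Replace each <slot_name> placeholder with its value.
    return re.sub(r"<(\w+)>", lambda m: str(slots[m.group(1)]), template)


print(generate("inform", "welcome", {}))
print(generate("inform", "greet_user", {"user_name": "Dan"}, random.Random(0)))
```

Picking among several variants at random is a cheap way to keep a system from sounding repetitive when the same prompt comes up again.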


Synthesis

- Cepstral Theta synthesis
  - Open-domain unit-selection synthesis
  - SSML tags
  - [Currently working on barge-in location]
- Festival synthesis
  - Diphone synthesis; open-domain and limited-domain unit-selection synthesis
  - SABLE tags
  - Server running separately on a Linux box


Miscellaneous – Documentation

- Transmitted largely by oral tradition :)
- A bit of documentation available
  - Research papers, slides
  - WIKI: http://hap.speech.cs.cmu.edu/commwiki
    - Mostly for developers: postings of updates, recent developments
    - Hopefully more introductory materials soon
  - More in the works
  - Tutorials: 2 available, but a bit outdated


Miscellaneous – Portability

- Current systems work on PC Windows platforms
  - Galaxy has a Linux version
  - Components are written in C and C++ (Visual Studio 6.0, Visual Studio .NET) and Perl
- How about using different input / output components?
  - Modify the RavenClaw DMInterface class
  - Has been done for the Gemini parser / language generator


Miscellaneous – Research Platform

- The Communicator / RavenClaw framework is a research platform!
  - Constantly evolving
  - Modular: easy to change, develop and test new technologies
- Research on a variety of topics in a real-world, full-blown system: recognition, language understanding, dialogue management, language generation, synthesis
- Your work can be evaluated / reused easily across multiple existing systems


Miscellaneous - Download

- www.cs.cmu.edu/~dbohus/RavenClaw
- Download a version of RoomLine
- An installation script can seed your own project from this RoomLine version


Miscellaneous – RavenClaw Team

- RavenClaw Team
  - Dan Bohus (dbohus@cs)
  - Antoine Raux (antoine@cs)
  - Jahanzeb Sherwani (jsherwan@cs)
  - Thomas Harris (tkharris@cs)
  - Satanjeev Banerjee (satanjeev@cs)
  - Brian Langner (blangner@cs)
- More users / developers / documentation writers are always welcome!!
- Dialogs on Dialogs Reading Group: www.cs.cmu.edu/~dod


Error awareness and recovery

- Problem: lack of robustness when faced with understanding errors
- Solution: build mechanisms for acting robustly at the dialogue management level
  - Error awareness: building better confidence annotators, hypothesis selection; transference across domains
  - Error recovery strategies: recovery from non-understandings
  - Error handling decision process: scalable, adaptable, task-independent architecture for making error handling decisions


Let’s Go! Research

- Speech Recognition: acoustic adaptation on non-native speech
  - WER: 50% → 30%
- Speech Synthesis: flexible and natural F0 modeling (F0 unit selection)
  - Emphasis on erroneous/uncertain words for utterance confirmation


Sublime

- Interface for personalized information management
- Narrow functionality in unrestricted domains
  - Currently, handle information without understanding it
  - Eventually, learn relationships and a shallow ontology


That’s all, folks!

THANK YOU!