Developing Spoken Dialogue Systems in the Communicator / RavenClaw Framework
Sphinx Lunch Talk, Carnegie Mellon University, October 2004
Presented by: Dan Bohus
Special appearances: Antoine Raux, Jahanzeb Sherwani, Thomas Harris
Apr 02, 2015
Examples
- RoomLine: conference room reservations within SCS; the system can access the schedules of 13 conference rooms in Wean Hall and NSH
- Let’s Go! Bus Information System: bus schedule information for Port Authority buses in Oakland and Squirrel Hill [Let’s Go! Project]
- Sublime: personalized information management system
- TeamTalk: an investigation into human and multi-robot spoken language communication in unstructured environments
More Systems
- LARRI: multimodal system that assists F/A-18 aircraft maintenance personnel throughout the execution of procedural tasks [Symphony]
- Madeleine: text-based prototype of a medical diagnosis system [MITRE workshop]
- Eureka: dialogue interface to the Vivisimo web search engine
The Communicator / RavenClaw Spoken Dialogue Systems Framework
- Examples
- Overall Architecture
- System Development
- Components & Resources
- Miscellaneous
- Current Research
examples : architecture : development : components : miscellaneous : research
Overall Architecture
Classical pipeline architecture:
- Recognition: SPHINX
- Lang. Understand.: PHOENIX/HELIOS
- Dialog Manag.: RAVENCLAW
- Back-end: (various)
- Lang. Generation: ROSETTA
- Synthesis: THETA
Galaxy HUB
[Figure: the same six components (SPHINX, PHOENIX/HELIOS, RAVENCLAW, back-end, ROSETTA, THETA), each connected to a central HUB]
Galaxy
- Generic centralized, message-passing communication architecture
- Developed at MIT, used in Communicator program
- Competitor: OAA
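The centralized, message-passing idea can be illustrated with a minimal sketch. This is not the Galaxy API: the `Hub` class, the server names, and the message fields are all hypothetical, chosen only to show that components talk to the hub rather than to each other.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Hub:
    """Toy centralized hub: servers register by name, and every message passes through it."""

    def __init__(self) -> None:
        self.servers: Dict[str, Callable[[Message], Message]] = {}
        self.log: List[str] = []

    def register(self, name: str, handler: Callable[[Message], Message]) -> None:
        self.servers[name] = handler

    def send(self, target: str, message: Message) -> Message:
        # Servers never call each other directly; the hub does all the routing.
        self.log.append(target)
        return self.servers[target](message)


def run_turn(hub: Hub, audio: str) -> str:
    """Route one user turn through the pipeline, one hub hop per component."""
    frame: Message = {"audio": audio}
    for server in ["recognition", "understanding", "dialog", "generation", "synthesis"]:
        frame = hub.send(server, frame)
    return frame["speech"]


# Hypothetical stand-ins for the real components.
hub = Hub()
hub.register("recognition", lambda m: {"text": m["audio"].lower()})
hub.register("understanding", lambda m: {"concept": "greeting" if "hello" in m["text"] else "unknown"})
hub.register("dialog", lambda m: {"act": "welcome" if m["concept"] == "greeting" else "ask_repeat"})
hub.register("generation", lambda m: {"text": "Welcome!" if m["act"] == "welcome" else "Sorry, can you repeat that?"})
hub.register("synthesis", lambda m: {"speech": "<audio for: " + m["text"] + ">"})

print(run_turn(hub, "HELLO THERE"))
```

Because each component is only known to the hub by name, swapping one out (say, a different synthesizer) means registering a different handler under the same name.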
Getting Even Closer
[Figure: detailed architecture; all servers connect to the Galaxy HUB and run under a Process Monitor]
- Recognition Server: multiple, parallel SPHINX decoders
- Lang. Understand.: parsing (PHOENIX), confidence (HELIOS)
- Dialog Manag.: RAVENCLAW
- Back-end: Galaxy stub + actual Perl back-end
- Lang. Generation: Galaxy stub + ROSETTA (Perl)
- Synthesis: THETA
- Text I/O: TTYServer
- DateTime and other domain agents
- Inputs from other modalities
Building a Spoken Dialogue System
[Figure: each component paired with the resource a developer supplies for it]
- Recognition (SPHINX): language, acoustic, and lexical models
- Lang. Understand. (PHOENIX/HELIOS): grammar
- Dialog Manag. (RAVENCLAW): RavenClaw dialog task specification
- Back-end (perl): back-end (perl)
- Lang. Generation (ROSETTA): templates
- Synthesis (THETA): (limited domain) voice
So How Long Will It Take?
- MITRE Workshop on Dialogue Management (Fall 2003)
- Develop a text-based SDS for medical diagnosis (provided backend)
- Madeleine (22 hours)
Time spent on Madeleine, by activity [pie chart]:
- RavenClaw: 4h, 19%
- Design: 4h, 18%
- Grammar: 3h45, 18%
- Backend: 3h20, 16%
- Templates: 2h45, 13%
- RC fixes: 2h15, 11%
- Setup: 1h10, 5%
Okay, How Long Will It Really Take?
To get a system running with reasonable performance [poll among 3 RavenClaw developers]:
- 1 month to get a working system up and running
- 1 month to fine-tune performance
Further iterative improvements continue as more data accumulates
Components & Resources
- Recognition (SPHINX): language and acoustic models
- Lang. Understand. (PHOENIX/HELIOS): grammar
- Dialog Manag. (RAVENCLAW): RavenClaw dialog task specification
- Back-end (perl): back-end (perl)
- Lang. Generation (ROSETTA): templates
- Synthesis (THETA): limited domain voice
SPHINX II
- Semi-continuous acoustic models
  - Off-the-shelf 8kHz, 11.025kHz, 16kHz models
  - Scripts for building your own
  - PLSA adapted models perform better
- Language models
  - 2-gram & 3-gram models
  - CMU-Cambridge SLM Toolkit
  - Generate from Phoenix grammar
  - Finite state grammar
  - Sphinx supports state-specific LMs
- Dictionary (lexical models)
  - CMU Dictionary
Sphinx II - continued
- Multiple parallel decoders [e.g., male + female]
  - Multiple hypotheses forwarded, selection done later
- Typical WER: 15-30%
  - With pronounced differences between native and non-native speakers
  - Lowered by retuning acoustic and language models to the domain
- Migration to SPHINX 3.x in the near future
  - Expected: big improvement in WER
  - Concern: real-time performance
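For readers unfamiliar with the WER figures quoted here, word error rate is the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. A minimal sketch (the function name and the example sentences are illustrative, not part of Sphinx):

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("do you have something a bit larger",
                      "do you have something bit large")
print(f"{wer:.1%}")  # one deletion and one substitution over 7 reference words
```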
Phoenix Parser / Grammar
- Phoenix: robust parser, CFG grammar
- Manually-generated domain-specific grammar rules
- Reusable, generic sub-grammars: [Yes], [No], [Number], [DateTime], [Help], [Repeat], [Suspend], etc.
Sample grammar:
    [room_size_spec]
        ([rss_large])
        ([rss_small])
        ([rss_larger])
        ([rss_smaller])
        ([rss_smallest])
        ([rss_largest])
    ;
    [rss_large]
        (large)
        (big)
        (huge)
    ;
    [rss_larger]
        (*the larger)
        (*the bigger)
        (too small)
    ;
    [rss_largest]
        (*the largest)
        (*the biggest)
    ;
    [rss_small]
        (small)
        (little)
    ;
Sample parse:
    DO YOU HAVE SOMETHING A BIT LARGER?
    [NeedRoom] ( [_i_want] (DO YOU HAVE SOMETHING) )
    [RoomSizeSpec] ( [room_size_spec] ( [rss_larger] (LARGER)))
Parses all incoming hypotheses and passes all parses along…
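To make the robust-parsing idea concrete, here is a toy matcher over the four nets defined on the slide. It is a sketch, not Phoenix: it only spots phrases anywhere in the utterance and skips everything else, it assumes `*` marks an optional word, and the `GRAMMAR` table and function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

# Nets copied from the slide; "*" is assumed to mark an optional word.
GRAMMAR: Dict[str, List[str]] = {
    "rss_large": ["large", "big", "huge"],
    "rss_larger": ["*the larger", "*the bigger", "too small"],
    "rss_largest": ["*the largest", "*the biggest"],
    "rss_small": ["small", "little"],
}


def expand(pattern: str) -> List[List[str]]:
    """Expand a pattern with optional words into all concrete word sequences."""
    choices = []
    for word in pattern.split():
        if word.startswith("*"):
            choices.append([[word[1:]], []])  # optional: present or absent
        else:
            choices.append([[word]])
    return [[w for part in combo for w in part] for combo in product(*choices)]


def contains(tokens: List[str], phrase: List[str]) -> bool:
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def parse_room_size(utterance: str) -> Optional[Tuple[str, str]]:
    """Return (net, matched words) for the longest matching phrase, ignoring other words."""
    tokens = utterance.lower().replace("?", "").split()
    best: Optional[Tuple[str, List[str]]] = None
    for net, patterns in GRAMMAR.items():
        for pattern in patterns:
            for phrase in expand(pattern):
                if phrase and contains(tokens, phrase):
                    if best is None or len(phrase) > len(best[1]):
                        best = (net, phrase)
    return None if best is None else (best[0], " ".join(best[1]))


print(parse_room_size("DO YOU HAVE SOMETHING A BIT LARGER?"))
```

The words "a bit" match nothing and are simply skipped, which is the sense in which the parse is robust to material the grammar does not cover.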
Helios / Confidence Annotation
Builds accurate confidence scores using features from 3 sources of knowledge:
- Speech recognition
- Language understanding
- Dialogue management
Selects hypothesis with maximum confidence score
Research in progress on hypothesis-selection, and transferability across domains
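A rough sketch of confidence-based hypothesis selection follows. The logistic model, the three features, and the weights are all assumptions made for illustration; the slides do not say which model or features Helios uses.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    text: str
    acoustic_score: float      # speech recognition feature (hypothetical, in [0, 1])
    parse_coverage: float      # language understanding feature: fraction of words parsed
    matches_expectation: bool  # dialogue management feature: was this input expected?


# Hypothetical weights; a real annotator would learn them from labeled data.
WEIGHTS = {"bias": -3.0, "acoustic": 2.0, "coverage": 2.5, "expected": 1.5}


def confidence(h: Hypothesis) -> float:
    """Combine the three knowledge sources into one score in (0, 1)."""
    z = (WEIGHTS["bias"]
         + WEIGHTS["acoustic"] * h.acoustic_score
         + WEIGHTS["coverage"] * h.parse_coverage
         + WEIGHTS["expected"] * (1.0 if h.matches_expectation else 0.0))
    return 1.0 / (1.0 + math.exp(-z))


def select(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Pick the hypothesis with the maximum confidence score."""
    return max(hypotheses, key=confidence)


# One hypothesis per parallel decoder (made-up values).
from_male_decoder = Hypothesis("not so good i think i have a fever", 0.6, 1.0, True)
from_female_decoder = Hypothesis("not so good i sink i have a beaver", 0.7, 0.4, True)
best = select([from_male_decoder, from_female_decoder])
print(best.text, round(confidence(best), 2))
```

Note that the winning hypothesis here has the lower acoustic score; the parse coverage feature is what tips the decision.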
RavenClaw Architecture
- Dialog Task (Specification)
  - Captures all domain-specific dialog (task) logic using a hierarchical description
  - The authoring effort is focused entirely here
- Domain-independent Dialog Engine
  - Manages dialog by executing the dialog task specification
  - Provides a large number of domain-independent conversational strategies
RavenClaw: Dialogue Task Specification
Madeleine
    I:Welcome
    E:LoadSymptoms
    GeneralFeel
        R:HowAreYou?
        I:Glad
        I:Sorry
    Diagnose
        Fever
            R:AskFever
            E:MeasureTemp
            I:InformFever
        Travel
Concepts: general_feeling, have_fever, diagnostic
- Tree of dialog agents
  - Terminals: Inform, Request, Expect, Execute
  - Non-terminals / dialog agencies: plan execution of child nodes
- Basically a Hierarchical Task Execution Network; each agent has:
  - Preconditions & effects
  - Success & failure criteria
  - Trigger (focus) criteria
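The tree-of-agents idea can be sketched in a few lines. This is a simplification for illustration only: the `Agent` class and `run` function are hypothetical, execution is a plain depth-first walk rather than RavenClaw's engine, and user answers are scripted instead of coming from a recognizer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Concepts = Dict[str, str]


def always(_: Concepts) -> bool:
    return True


@dataclass
class Agent:
    name: str
    kind: str = "agency"   # "agency", "inform", "request", "expect" or "execute"
    concept: str = ""      # concept filled by a request agent
    children: List["Agent"] = field(default_factory=list)
    precondition: Callable[[Concepts], bool] = always


def run(agent: Agent, concepts: Concepts, answers: Concepts, trace: List[str]) -> None:
    """Execute an agent if its precondition holds, then plan its children in order."""
    if not agent.precondition(concepts):
        return
    label = agent.name if agent.kind == "agency" else f"{agent.kind[0].upper()}:{agent.name}"
    trace.append(label)
    if agent.kind == "request":
        # A request agent obtains a value for its concept (scripted here).
        concepts[agent.concept] = answers[agent.concept]
    for child in agent.children:
        run(child, concepts, answers, trace)


# The GeneralFeel subtree of Madeleine.
general_feel = Agent("GeneralFeel", children=[
    Agent("HowAreYou", kind="request", concept="general_feeling"),
    Agent("Glad", kind="inform",
          precondition=lambda c: c.get("general_feeling") == "good"),
    Agent("Sorry", kind="inform",
          precondition=lambda c: c.get("general_feeling") != "good"),
])


def simulate(answer: str) -> List[str]:
    trace: List[str] = []
    run(general_feel, {}, {"general_feeling": answer}, trace)
    return trace


print(simulate("soso"))
```

The preconditions on the two inform agents are what make the same tree produce different dialogs for different user answers.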
Sample DTS Code
    // /Madeleine/GeneralFeel
    DEFINE_AGENCY(CGeneralFeel,
        DEFINE_CONCEPTS(
            STRING_USER_CONCEPT(general_feeling, none))
        DEFINE_SUBAGENTS(
            SUBAGENT(HowAreYou, CHowAreYou)
            SUBAGENT(Glad, CGlad)
            SUBAGENT(Sorry, CSorry))
        SUCCEEDS_WHEN(COMPLETED(Glad) || COMPLETED(Sorry)))
    // /Madeleine/GeneralFeel/HowAreYou
    DEFINE_REQUEST_AGENT(CHowAreYou,
        REQUEST_CONCEPT(general_feeling)
        GRAMMAR_MAPPING("![Yes]>good, ![FeelingGood]>good, "
                        "![FeelingSoSo]>soso, ![FeelingBad]>bad"))
    // /Madeleine/GeneralFeel/Glad
    DEFINE_INFORM_AGENT(CGlad,
        PRECONDITION(C("general_feeling") == CString("good"))
        PROMPT("inform glad_youre_good")
        ON_COMPLETION(FINISH(/Madeleine)))
    // /Madeleine/GeneralFeel/Sorry
    DEFINE_INFORM_AGENT(CSorry,
        PRECONDITION(C("general_feeling") != CString("good"))
        PROMPT("inform sorry_youre_bad"))
[Figure: the GeneralFeel subtree with its agents R:HowAreYou?, I:Glad, I:Sorry and the concept general_feeling]
RavenClaw Execution
[Figure, repeated at every step: the Madeleine task tree next to the Dialog Stack and the Expectation Agenda; concepts shown: general_feeling, chart, have_fever, diagnostic]
Dialog Stack, step by step (bottom of the stack first):
- (empty)
- Madeleine
- Madeleine, Welcome
  - System: Hi, this is Madeleine, the automated…
- Madeleine
- Madeleine, LoadSymptoms
  - [the tree now also shows request agents R:Headache R: R: R: and the concept headache]
- Madeleine
- Madeleine, GeneralFeel
RavenClaw Execution / Input Pass
Dialog Stack: Madeleine, GeneralFeel, HowAreYou
System: How are you feeling today?
Expectation Agenda (three levels shown):
- general_feeling: [good], [bad], [soso]
- general_feeling: [good], [bad], [soso]
- general_feeling: [good], [bad], [soso]; have_fever: [fever], ![yes], ![no]; headache: [headache], ![yes], ![no]; cough: [cough], ![yes], ![no]; …
User: Not so good, I think I have a fever
Parse: [soso](not so good) [fever](I think I have a fever)
RavenClaw Execution
Dialog Stack: Madeleine, GeneralFeel
System: How are you feeling today?
User: Not so good, I think I have a fever
Parse: [soso](not so good) [fever](I think I have a fever)
Dialog Stack: Madeleine, GeneralFeel, Sorry
System: Oh, I’m sorry to hear that… Let me take your temperature…
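The input pass in this trace can be sketched as follows, assuming one agenda level per agent on the dialog stack and binding each parsed slot at the level closest to the focus. The data layout, the function names, and the value `"yes"` for `have_fever` are assumptions for illustration; the `!` markers from the slide are not modelled.

```python
from typing import Dict, List, Tuple

# slot name -> (concept, value); one table per agent (hypothetical layout).
Expectations = Dict[str, Tuple[str, str]]

FEELING: Expectations = {
    "good": ("general_feeling", "good"),
    "bad": ("general_feeling", "bad"),
    "soso": ("general_feeling", "soso"),
}

EXPECTATIONS: Dict[str, Expectations] = {
    "HowAreYou": dict(FEELING),
    "GeneralFeel": dict(FEELING),
    # The root also expects the symptom concepts of agents not yet in focus.
    "Madeleine": {**FEELING, "fever": ("have_fever", "yes")},
}


def build_agenda(stack: List[str]) -> List[Tuple[str, Expectations]]:
    """One agenda level per stacked agent, starting from the top of the stack."""
    return [(agent, EXPECTATIONS.get(agent, {})) for agent in reversed(stack)]


def bind(agenda: List[Tuple[str, Expectations]], slots: List[str]) -> Dict[str, Tuple[str, str]]:
    """Bind each parsed slot at the first level that expects it.

    Returns concept -> (value, agent whose expectation matched).
    """
    bindings: Dict[str, Tuple[str, str]] = {}
    for slot in slots:
        for agent, expected in agenda:
            if slot in expected:
                concept, value = expected[slot]
                bindings[concept] = (value, agent)
                break
    return bindings


stack = ["Madeleine", "GeneralFeel", "HowAreYou"]  # bottom of the stack first
agenda = build_agenda(stack)
# Parse of "Not so good, I think I have a fever": [soso] and [fever].
result = bind(agenda, ["soso", "fever"])
print(result)
```

The second binding is the interesting one: the user volunteered a fever before the system asked about it, and the deeper agenda level still captures it.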
RavenClaw – Other features
- Dialogue Engine transparently provides a set of conversational skills
  - Universal dialogue mechanisms: Repeat, Suspend / Resume, Quit
  - Help: Help!, Where are we?, What can I say?
  - Error handling: explicit and implicit confirmations; strategies for recovering from non-understandings
- Dynamic dialogue task generation
- Dynamic dialogue control policy
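As an illustration of how explicit and implicit confirmations might be chosen, here is a simple threshold policy driven by the confidence score. The thresholds, action names, and prompt wording are hypothetical; the slides do not describe the actual decision rule.

```python
# Hypothetical thresholds; in practice these would be tuned or learned.
ACCEPT_THRESHOLD = 0.85
IMPLICIT_THRESHOLD = 0.5


def choose_action(confidence: float, understood: bool = True) -> str:
    """Map a confidence score to an error-handling action."""
    if not understood:
        return "recover_non_understanding"   # e.g. ask the user to repeat or rephrase
    if confidence >= ACCEPT_THRESHOLD:
        return "accept"
    if confidence >= IMPLICIT_THRESHOLD:
        return "implicit_confirm"
    return "explicit_confirm"


def confirmation_prompt(action: str, value: str) -> str:
    """Made-up prompts showing the difference between the two confirmation styles."""
    if action == "explicit_confirm":
        return f"Did you say {value}?"
    if action == "implicit_confirm":
        return f"{value}. "                  # echoed back, then the dialog moves on
    return ""


for score in (0.9, 0.6, 0.2):
    action = choose_action(score)
    print(score, action, repr(confirmation_prompt(action, "a large room")))
```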
Backend & Domain Agents
- Various problem-specific solutions
  - RoomLine: connects to a static Perl database or to the CMU CorporateTime server
  - Let’s Go! Bus Information System: connects to a Postgres database
  - Sublime: connects to a MySQL database; also functions as a web server; DTW search domain agent
- Basically, build your own; we provide a stub for interfacing with the Galaxy Hub
Rosetta Language Generation
- Template- and stochastic-based language generation
  - Input: (act, object, {slot=value})
  - Output: text (tagged with concepts)
    # welcome to the system
    "welcome" => "Welcome to RoomLine, the automated conference room ".
                 "reservation system.",
    # greet user
    "greet_user" => ["Hi, <user_name>.",
                     "Hi, <user_name>, good to hear from you again."],
    # inform the user that the system has misunderstood the times (order)
    "wrong_time_order" => sub {
        my %args = @_;
        my $time_interval_as_string =
            get_wrong_time_interval_as_string(\%args, "room_query.date_time.time");
        my $answer = "I'm sorry, I must have misunderstood the ".
                     "time you needed the room. ";
        $answer .= "I heard $time_interval_as_string. ";
        return ["$answer So, let's see ... ",
                "$answer So, let's try this again ... ",
                "$answer So, let's try this once more ... "];
    },
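The template mechanism can be sketched as a lookup from (act, object) to a string, a list of alternatives, or a function, followed by slot substitution. This is an illustrative reimplementation, not Rosetta: the `TEMPLATES` layout, the `generate` function, the choice of the "inform" act for these entries, and the `time_interval` slot name are all assumptions.

```python
import random
import re
from typing import Callable, Dict, List, Union

Slots = Dict[str, str]
Template = Union[str, List[str], Callable[[Slots], List[str]]]


def wrong_time_order(slots: Slots) -> List[str]:
    """A template computed by code, returning several alternative wordings."""
    answer = ("I'm sorry, I must have misunderstood the time you needed the room. "
              "I heard <time_interval>. ")
    return [answer + "So, let's see ...",
            answer + "So, let's try this again ..."]


TEMPLATES: Dict[str, Dict[str, Template]] = {
    "inform": {
        "welcome": "Welcome to RoomLine, the automated conference room reservation system.",
        "greet_user": ["Hi, <user_name>.",
                       "Hi, <user_name>, good to hear from you again."],
        "wrong_time_order": wrong_time_order,
    },
}


def generate(act: str, obj: str, slots: Slots, rng: random.Random) -> str:
    template = TEMPLATES[act][obj]
    if callable(template):
        template = template(slots)
    if isinstance(template, list):
        template = rng.choice(template)   # pick one of the alternative wordings
    # Replace each <slot_name> with its value.
    return re.sub(r"<(\w+)>", lambda m: slots[m.group(1)], template)


rng = random.Random(0)
print(generate("inform", "welcome", {}, rng))
print(generate("inform", "greet_user", {"user_name": "Dan"}, rng))
```

Picking among alternatives at random is what keeps a frequently used prompt from sounding identical on every turn.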
Synthesis
- Cepstral Theta synthesis
  - Open-domain unit-selection synthesis
  - SSML tags
  - [Currently working on barge-in location]
- Festival synthesis
  - Diphone synthesis; open-domain and limited-domain unit-selection synthesis
  - SABLE tags
  - Server running separately on a Linux box
Miscellaneous – Documentation
- Transmitted largely by oral tradition :)
- A bit of documentation available
  - Research papers, slides
  - Wiki: http://hap.speech.cs.cmu.edu/commwiki (mostly for developers: postings of updates, recent developments; hopefully more introductory materials soon)
  - More in the works
  - Tutorials: 2 available, but a bit outdated
Miscellaneous – Portability
- Current systems work on PC Windows platforms
  - Galaxy has a Linux version
  - Components are C, C++ (Visual Studio 6.0, Visual Studio .NET), Perl
- How about using different input / output components?
  - Modify the RavenClaw DMInterface class
  - Has been done for the Gemini parser / language generator
Miscellaneous – Research Platform
- The Communicator / RavenClaw framework is a research platform!
  - Constantly evolving
  - Modular: easy to change, develop and test new technologies
- Research on a variety of topics in a real-world, full-blown system: recognition, language understanding, dialogue management, language generation, synthesis
- Your work can be evaluated / reused easily across multiple existing systems
Miscellaneous - Download
- www.cs.cmu.edu/~dbohus/RavenClaw
- Download a version of RoomLine
- An installation script can seed your own project from this RoomLine version
Miscellaneous – RavenClaw Team
RavenClaw Team:
- Dan Bohus (dbohus@cs)
- Antoine Raux (antoine@cs)
- Jahanzeb Sherwani (jsherwan@cs)
- Thomas Harris (tkharris@cs)
- Satanjeev Banerjee (satanjeev@cs)
- Brian Langner (blangner@cs)
More users / developers / documentation writers are always welcome!!
Dialogs on Dialogs Reading Group: www.cs.cmu.edu/~dod
Error awareness and recovery
- Problem: lack of robustness when faced with understanding errors
- Solution: build mechanisms for acting robustly at the dialogue management level
  - Error awareness: building better confidence annotators, hypothesis selection; transference across domains
  - Error recovery strategies: recovery from non-understandings
  - Error handling decision process: scalable, adaptable, task-independent architecture for making error handling decisions
Let’s Go! Research
- Speech Recognition: acoustic adaptation on non-native speech
  - WER: 50% → 30%
- Speech Synthesis: flexible and natural F0 modeling (F0 unit selection)
  - Emphasis on erroneous/uncertain words for utterance confirmation
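One simple way to picture emphasis on uncertain words is to wrap low-confidence words of a confirmation prompt in the SSML `emphasis` element. This is only a stand-in for the F0 modeling work mentioned on the slide: the per-word confidence scores, the threshold, and the function name are hypothetical.

```python
from typing import List, Tuple
from xml.sax.saxutils import escape


def confirmation_ssml(words: List[Tuple[str, float]], threshold: float = 0.5) -> str:
    """Build an SSML prompt that emphasizes words recognized with low confidence."""
    parts = []
    for word, confidence in words:
        text = escape(word)  # keep the markup well-formed
        if confidence < threshold:
            parts.append(f"<emphasis>{text}</emphasis>")
        else:
            parts.append(text)
    return "<speak>" + " ".join(parts) + "</speak>"


prompt = confirmation_ssml([("going", 0.9), ("to", 0.9), ("Oakland", 0.3)])
print(prompt)
```

The emphasized word is the one the user most needs to verify, so the confirmation draws attention to the likely error.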
Sublime
- Interface for personalized information management
- Narrow functionality in unrestricted domains
  - Currently, handle information without understanding it
  - Eventually, learn relationships and a shallow ontology
That’s all, folks!
THANK YOU!