Page 1
Wolfgang Wahlster
German Research Center for Artificial Intelligence, DFKI GmbH
Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany
phone: (+49 681) 302-5252/4162
fax: (+49 681) 302-5341
e-mail: [email protected]
WWW: http://www.dfki.de/~wahlster
Verbmobil: Multilingual Processing of Spontaneous Speech
Page 2
Verbmobil Final Symposium, 30 July 2000 © Wolfgang Wahlster, DFKI GmbH
Mobile Speech-to-Speech Translation of Spontaneous Dialogs
As the name Verbmobil suggests, the system supports verbal communication with foreign dialog partners in mobile situations:
1. face-to-face conversations
2. telecommunication
Page 3
Mobile Speech-to-Speech Translation of Spontaneous Dialogs
Solution: Conference Call. The Verbmobil Speech Translation Server is accessed by GSM mobile phones.
Page 4
Verbmobil is a Multilingual System
It supports bidirectional translation between:
- German <-> English (American)
- German <-> Japanese
- German <-> Chinese (Mandarin)
Siemens, Philips, FH Konstanz, 2 Chinese Universities
Final industrial demo at the end of 2000
Page 5
Challenges for Language Engineering
(Complexity increases from the first row to the last; Verbmobil addresses the most complex level.)

Input Conditions | Naturalness | Adaptability | Dialog Capabilities
Close-speaking microphone/headset, push-to-talk | Isolated words | Speaker-dependent | Monolog dictation
Telephone, pause-based segmentation | Read continuous speech | Speaker-independent | Information-seeking dialog
Open microphone, GSM quality | Spontaneous speech | Speaker-adaptive | Multiparty negotiation (Verbmobil)
Page 6
Context-Sensitive Speech-to-Speech Translation
Verbmobil Server
Wann fährt der nächste Zug nach Hamburg ab? / When does the next train to Hamburg depart?
Wo befindet sich das nächste Hotel? / Where is the nearest hotel?
Final Verbmobil Demos: CeBIT-2000 (Hannover), COLING-2000 (Saarbrücken), ECAI-2000 (Berlin)
Page 13
Verbmobil: The First Speech-Only Dialog Translation System
Mobile GSM Phone, Mobile DECT Phone
German Speaker: “Verbmobil” (Voice Dialing)
Connect to the Verbmobil Speech-to-Speech Translation Server: +49 631 3111911
Verbmobil: “Willkommen beim Verbmobil-Sprachserver. Bitte sprechen Sie nach dem Piepton.” (Welcome to the Verbmobil speech server. Please speak after the beep.)
German Speaker: “Verbmobil neuer Teilnehmer hinzufügen.” (Verbmobil, add a new participant. Speech command to initiate a conference call)
Verbmobil: “Bitte sprechen Sie jetzt die Telephonnummer Ihres Gesprächspartners.” (Please speak the telephone number of your dialog partner now.)
German Speaker: “0681/302 5253”
The foreign participant is placed into the conference call.
To the German participant: Verbmobil: “Verbmobil hat eine neue Verbindung aufgebaut. Bitte sprechen Sie jetzt.” (Verbmobil has set up a new connection. Please speak now.)
To the American participant: Verbmobil: “Welcome to the Verbmobil server. Please start your input after the beep.”
Page 14
Verbmobil II: Three Domains of Discourse

Scenario | Questions | Focus | Vocabulary Size
Scenario 1: Appointment Scheduling | When? | Focus on temporal expressions | 2500/6000
Scenario 2: Travel Planning | When? Where? How? | Focus on temporal and spatial expressions | 7000/10000
Scenario 3: Remote PC Maintenance | What? When? Where? How? | Integration of special sublanguage lexica | 15000/30000
Page 15
The Control Panel of Verbmobil
Page 16
From a Multi-Agent Architecture to a Multi-Blackboard Architecture

Verbmobil I: Multi-Agent Architecture (modules M1 to M6 communicate directly)
- Each module must know which module produces what data
- Direct communication between modules
- Each module has only one instance
- Heavy data traffic for moving copies around
- Multiparty and telecooperation applications are impossible
- Software: ICE and ICE Master
- Basic Platform: PVM

Verbmobil II: Multi-Blackboard Architecture (modules M1 to M6 communicate via blackboards BB 1 to BB 3)
- All modules can register for each blackboard dynamically
- No direct communication between modules
- Each module can have several instances
- No copies of representation structures (word lattice, VIT chart)
- Multiparty and telecooperation applications are possible
- Software: PCA and Module Manager
- Basic Platform: PVM
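The difference between the two architectures can be illustrated with a minimal publish/subscribe sketch in Python. This is not the PCA or the Module Manager: the class `BlackboardPool`, its methods `register` and `publish`, the blackboard names and the toy modules are all assumptions made for illustration. The point it shows is that modules only know blackboards, never each other, and that several module instances can register for the same blackboard.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


class BlackboardPool:
    """Hypothetical pool of named blackboards (not the Verbmobil PCA API)."""

    def __init__(self) -> None:
        # blackboard name -> callbacks of the module instances registered for it
        self._subscribers: Dict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def register(self, blackboard: str, callback: Callable[[Any], None]) -> None:
        """A module instance registers dynamically for one blackboard."""
        self._subscribers[blackboard].append(callback)

    def publish(self, blackboard: str, item: Any) -> None:
        """Post one shared item; every registered module receives the same object."""
        for callback in list(self._subscribers[blackboard]):
            callback(item)


pool = BlackboardPool()
results: List[str] = []


def recognizer(audio: str) -> None:
    # Toy recognizer: reads "audio", writes a word list. It never calls a parser.
    pool.publish("word_graph", audio.split())


def chunk_parser(words: List[str]) -> None:
    results.append("chunk:" + str(len(words)))


def hpsg_parser(words: List[str]) -> None:
    results.append("hpsg:" + str(len(words)))


pool.register("audio", recognizer)
pool.register("word_graph", chunk_parser)   # two parsers share one blackboard
pool.register("word_graph", hpsg_parser)

pool.publish("audio", "let us meet on monday")
print(results)  # ['chunk:5', 'hpsg:5']
```

Adding a further parser in this sketch only requires one more `register` call; the recognizer does not change, which mirrors the "no direct communication between modules" property listed above.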
Page 19
A Multi-Blackboard Architecture for the Combination of Results from Deep and Shallow Processing Modules
Data on the blackboards: Audio Data; Word Hypotheses Graph with Prosodic Labels; VITs (Underspecified Discourse Representations)
Processing modules:
- Command Recognizer
- Spontaneous Speech Recognizer
- Channel/Speaker Adaptation
- Prosodic Analysis
- Statistical Parser
- Dialog Act Recognition
- Chunk Parser
- HPSG Parser
- Semantic Construction
- Robust Dialog Semantics
- Semantic Transfer
- Generation
Page 20
Verbmobil as the First Dialog Translation System that Uses Prosodic Information Systematically at All Processing Stages
Speech Signal -> Word Hypotheses Graph -> Multilingual Prosody Module
Prosodic features: duration, pitch, energy, pause
Prosodic information passed on: boundary information, sentence mood, accented words, prosodic feature vector
Use of prosodic information by processing stage:
- Parsing: search space restriction
- Dialog Understanding: dialog act segmentation and recognition
- Translation: constraints for transfer
- Generation: lexical choice
- Speech Synthesis: speaker adaptation
Page 23
Integrating Shallow and Deep Analysis Components in a Multi-Blackboard Architecture
Augmented Word Hypotheses Graph
-> Chunk Parser, Statistical Parser, HPSG Parser (each delivers partial VITs)
-> Chart with a combination of partial VITs
-> Robust Dialog Semantics: combination and knowledge-based reconstruction of complete VITs
-> Complete and Spanning VITs
Page 24
Verbmobil's Massive Data Collection Effort
The so-called Partitur (German word for musical score) orchestrates fifteen strata of annotations:
- Transliteration Variant 1
- Transliteration Variant 2
- Lexical Orthography
- Canonical Pronunciation
- Manual Phonological Segmentation
- Automatic Phonological Segmentation
- Word Segmentation
- Prosodic Segmentation
- Dialog Acts
- Noises
- Superimposed Speech
- Syntactic Category
- Word Category
- Syntactic Function
- Prosodic Boundaries
3,200 dialogs (182 hours) with 1,658 speakers and 79,562 turns, distributed on 56 CDs (21.5 GB)
Page 25
Extracting Statistical Properties from Large Corpora
Machine learning for the integration of statistical properties into symbolic models for speech recognition, parsing, dialog processing, and translation

Corpus data | Learned model
Transcribed speech data | Hidden Markov models
Segmented speech with prosodic labels | Neural nets, multilayered perceptrons
Annotated dialogs with dialog acts | Probabilistic automata
Treebanks & predicate-argument structures | Probabilistic grammars
Aligned bilingual corpora | Probabilistic transfer rules
Page 31
VHG: A Packed Chart Representation of Partial Semantic Representations
- Incremental chart construction and anytime processing
- Rule-based combination and transformation of partial UDRS coded as VITs
- Selection of a spanning analysis using a bigram model for VITs (trained on a treebank of 24 k VITs)
Analysis components feeding Semantic Construction:
- Chart Parser using cascaded finite-state transducers (Abney, Hinrichs)
- Statistical LR parser trained on treebank (Block, Ruland)
- Very fast HPSG parser (see two papers at ACL99, Kiefer, Krieger et al.)
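How a bigram model can pick one spanning analysis out of a chart of partial results can be sketched as a dynamic program over chart edges. The following Python sketch is a simplification under stated assumptions: edges carry a plain string label instead of a VIT, the bigram table and its probabilities are invented, and the function name `best_spanning_path` is hypothetical. It is not the Verbmobil implementation.

```python
import math
from typing import Dict, List, Tuple

Edge = Tuple[int, int, str]  # (start position, end position, label of the partial analysis)


def best_spanning_path(edges: List[Edge], n: int,
                       bigram: Dict[Tuple[str, str], float],
                       floor: float = 1e-4) -> List[Edge]:
    """Return the edge sequence covering positions 0..n with the best bigram log-score."""
    # best[(position, last label)] = (log-score, path) of the best path reaching that state
    best: Dict[Tuple[int, str], Tuple[float, List[Edge]]] = {(0, "<s>"): (0.0, [])}
    # Sorting by start position guarantees that all paths ending at a position
    # are final before an edge starting there is extended (edges have start < end).
    for start, end, label in sorted(edges):
        for (pos, prev), (score, path) in list(best.items()):
            if pos != start:
                continue
            new_score = score + math.log(bigram.get((prev, label), floor))
            key = (end, label)
            if key not in best or new_score > best[key][0]:
                best[key] = (new_score, path + [(start, end, label)])
    finals = [value for (pos, _), value in best.items() if pos == n]
    return max(finals, key=lambda value: value[0])[1] if finals else []


# Invented chart: two competing ways to span a five-word turn.
edges = [(0, 2, "greeting"), (2, 5, "suggest"), (0, 3, "fragment"), (3, 5, "fragment")]
bigram = {("<s>", "greeting"): 0.5, ("greeting", "suggest"): 0.6,
          ("<s>", "fragment"): 0.1, ("fragment", "fragment"): 0.05}

path = best_spanning_path(edges, 5, bigram)
print(path)  # [(0, 2, 'greeting'), (2, 5, 'suggest')]
```

Unseen label pairs fall back to the small `floor` probability, so a spanning path is still found when the model has gaps.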
Page 32
Robust Dialog Semantics: Deep Processing of Shallow Structures
Goals of robust semantic processing (Pinkal, Worm, Rupp):
- Combination of unrelated analysis fragments
- Completion of incomplete analysis results
- Skipping of irrelevant fragments
Method: Transformation rules on the VIT Hypothesis Graph:
- Conditions on VIT structures
- Operations on VIT structures
The rules are based on various knowledge sources:
- lattice of semantic types
- domain ontology
- sortal restrictions
- semantic constraints
Results: the analysis is improved in 20% of the cases and gets worse in 0.6% of the cases
Page 33
Robust Dialog Semantics: Combining and Completing Partial Representations
Let us meet the late afternoon to catch the train to Frankfurt
The preposition 'in' is missing in all paths through the word hypothesis graph. A temporal NP is transformed into a temporal modifier using an underspecified temporal relation:
[temporal_np(V1)] -> [typeraise_to_mod(V1, V2)] & V2
The modifier is applied to a proposition:
[type(V1, prop), type(V2, mod)] -> [apply(V2, V1, V3)] & V3
Let us meet (in) the late afternoon to catch the train to Frankfurt
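The two rules can be mimicked with a small Python sketch. It assumes that analysis fragments are simple typed records rather than VITs; the `Fragment` class, the string encoding of the semantics and the `combine` driver are hypothetical, and only the rule names `typeraise_to_mod` and `apply` (here `apply_mod`) come from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Fragment:
    kind: str     # "prop", "temporal_np" or "mod" (hypothetical type labels)
    content: str  # toy string encoding of the semantics


def typeraise_to_mod(v1: Fragment) -> Optional[Fragment]:
    """[temporal_np(V1)] -> modifier V2 with an underspecified temporal relation."""
    if v1.kind != "temporal_np":
        return None
    return Fragment("mod", f"temp_rel({v1.content})")


def apply_mod(v2: Fragment, v1: Fragment) -> Optional[Fragment]:
    """[type(V1, prop), type(V2, mod)] -> V3, the proposition with the modifier applied."""
    if v1.kind != "prop" or v2.kind != "mod":
        return None
    return Fragment("prop", f"{v1.content} & {v2.content}")


def combine(fragments: List[Fragment]) -> List[Fragment]:
    """Raise temporal NPs to modifiers, then attach all modifiers to the first proposition."""
    raised = [typeraise_to_mod(f) or f for f in fragments]
    props = [f for f in raised if f.kind == "prop"]
    mods = [f for f in raised if f.kind == "mod"]
    others = [f for f in raised if f.kind not in ("prop", "mod")]
    if not props:
        return raised
    result = props[0]
    for mod in mods:
        result = apply_mod(mod, result) or result
    return [result] + props[1:] + others


# "Let us meet" and "the late afternoon" arrive as unrelated fragments.
fragments = [Fragment("prop", "meet(we)"), Fragment("temporal_np", "late_afternoon")]
combined = combine(fragments)
print(combined[0].content)  # meet(we) & temp_rel(late_afternoon)
```

The relation `temp_rel` stays underspecified in the sketch, just as the missing preposition is never guessed in the example above.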
Page 34
The Understanding of Spontaneous Speech Repairs
Input: I need a car next Tuesday oops Monday
Structure of the repair:
- Original utterance with reparandum: Tuesday
- Editing phase with hesitation: oops
- Repair phase with reparans: Monday
Processing steps: recognition of substitutions, then transformation of the word hypothesis graph
Output: I need a car next Monday
Verbmobil technology understands speech repairs and extracts the intended meaning.
Dictation systems like ViaVoice, VoiceXpress, FreeSpeech and Naturally Speaking cannot deal with spontaneous speech and transcribe the corrupted utterances.
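A heavily simplified version of the substitution step can be sketched in Python. The sketch works on a flat word list instead of a word hypothesis graph, uses a hand-written list of editing terms, and treats two words as substitutable only if both are weekdays; all of these choices, and the names `EDITING_TERMS`, `same_class` and `repair`, are assumptions for illustration.

```python
from typing import List

EDITING_TERMS = {"oops", "uh", "sorry"}  # hypothetical editing terms
WEEKDAYS = {"monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"}


def same_class(reparandum: str, reparans: str) -> bool:
    """Toy substitution test: both words must be weekdays."""
    return reparandum.lower() in WEEKDAYS and reparans.lower() in WEEKDAYS


def repair(words: List[str]) -> List[str]:
    """Replace 'reparandum editing-term reparans' by the reparans alone."""
    out: List[str] = []
    i = 0
    while i < len(words):
        word = words[i]
        is_repair = (word.lower() in EDITING_TERMS and out and i + 1 < len(words)
                     and same_class(out[-1], words[i + 1]))
        if is_repair:
            out[-1] = words[i + 1]  # substitute the reparandum by the reparans
            i += 2                  # skip the editing term and the reparans
        else:
            out.append(word)
            i += 1
    return out


fixed = repair("I need a car next Tuesday oops Monday".split())
print(" ".join(fixed))  # I need a car next Monday
```

An editing term that is not followed by a substitutable word is left in place, so the sketch never deletes material it cannot explain.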
Page 36
Integrating a Deep HPSG-based Analysis with Probabilistic Dialog Act Recognition for Semantic Transfer
Components:
- HPSG Analysis
- Robust Dialog Semantics
- Semantic Transfer
- Probabilistic Analysis of Dialog Acts (HMM)
- Recognition of Dialog Plans (Plan Operators)
Information passed between the components: VIT, Dialog Act Type, Dialog Phase
Page 37
Combining Statistical and Symbolic Processing for Dialog Processing
Dialog Module: Statistical Prediction, Context Evaluation, Plan Recognition, Dialog Memory
Information exchanged: Main Propositional Content, Dialog Act, Dialog Act Predictions, Dialog Phase, Focus
Connected components: Dialog-Act based Translation, Transfer by Rules, Generation of Minutes
Page 38
Using Context and World Knowledge for Semantic Transfer
All other dialog translation systems translate word-by-word or sentence-by-sentence.
Example: Platz -> room / table / seat
1. Nehmen wir dieses Hotel, ja. / Let us take this hotel.
   Ich reserviere einen Platz. / I will reserve a room.
2. Machen wir das Abendessen dort. / Let us have dinner there.
   Ich reserviere einen Platz. / I will reserve a table.
3. Gehen wir ins Theater. / Let us go to the theater.
   Ich möchte Plätze reservieren. / I would like to reserve seats.
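The Platz example can be reproduced with a toy context lookup in Python. The sketch assumes that the dialog context is reduced to a single topic found by keyword spotting in the previous turns; the tables `TOPIC_CUES` and `PLATZ_BY_TOPIC`, the function names and the fallback word are invented here and say nothing about how Verbmobil represents context or world knowledge.

```python
from typing import List, Optional

# Hypothetical cue words that set the current topic of the dialog.
TOPIC_CUES = {"Hotel": "hotel", "Abendessen": "dinner", "Theater": "theater"}
# Hypothetical topic-dependent renderings of the German word "Platz".
PLATZ_BY_TOPIC = {"hotel": "room", "dinner": "table", "theater": "seat"}


def current_topic(previous_turns: List[str]) -> Optional[str]:
    """Return the topic of the most recent turn that contains a cue word."""
    for turn in reversed(previous_turns):
        for cue, topic in TOPIC_CUES.items():
            if cue in turn:
                return topic
    return None


def translate_platz(previous_turns: List[str], default: str = "place") -> str:
    """Choose the English word for 'Platz' from the dialog context."""
    return PLATZ_BY_TOPIC.get(current_topic(previous_turns), default)


print(translate_platz(["Nehmen wir dieses Hotel, ja."]))      # room
print(translate_platz(["Machen wir das Abendessen dort."]))   # table
print(translate_platz(["Gehen wir ins Theater."]))            # seat
```

A sentence-by-sentence translator sees only "Ich reserviere einen Platz." and would have to fall back to the `default` rendering in every case.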
Page 39
Automatic Generation of Multilingual Protocols of Telephone Conversations
Dialog Translation by Verbmobil -> Multilingual Generation of Protocols
- German Dialog Partner: HTML document in German, transferred by Internet or fax
- American Dialog Partner: HTML document in English, transferred by Internet or fax
Page 42
Integrating Deep and Shallow Processing: Combining Results from Concurrent Translation Threads
Input segments:
- Segment 1: If you prefer another hotel,
- Segment 2: please let me know.
Concurrent translation threads: Statistical Translation, Dialog-Act Based Translation, Semantic Transfer, Case-Based Translation
The threads deliver alternative translations with confidence values to the Selection Module.
Output of the Selection Module:
- Segment 1: translated by Semantic Transfer
- Segment 2: translated by Case-Based Translation
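The selection step can be sketched in Python as a per-segment choice among candidates that carry confidence values. The confidence values and the placeholder translation strings below are invented so that the outcome matches the slide (Semantic Transfer wins Segment 1, Case-Based Translation wins Segment 2); the `Candidate` tuple and the `select` function are hypothetical names.

```python
from typing import Dict, List, Tuple

Candidate = Tuple[str, str, float]  # (translation thread, translation, confidence value)


def select(candidates: List[Candidate]) -> Candidate:
    """Pick the candidate with the highest confidence value."""
    return max(candidates, key=lambda candidate: candidate[2])


# Invented confidence values; the translations are placeholders, not system output.
segments: Dict[str, List[Candidate]] = {
    "Segment 1": [
        ("Statistical Translation", "<stat-1>", 0.61),
        ("Dialog-Act Based Translation", "<da-1>", 0.40),
        ("Semantic Transfer", "<sem-1>", 0.87),
        ("Case-Based Translation", "<case-1>", 0.52),
    ],
    "Segment 2": [
        ("Statistical Translation", "<stat-2>", 0.70),
        ("Dialog-Act Based Translation", "<da-2>", 0.35),
        ("Semantic Transfer", "<sem-2>", 0.55),
        ("Case-Based Translation", "<case-2>", 0.90),
    ],
}

chosen = {name: select(candidates)[0] for name, candidates in segments.items()}
print(chosen)
```

Taking the plain maximum is the simplest possible choice function. It presupposes that the confidence values of different threads are comparable, which is an assumption of this sketch.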
Page 43
Verbmobil: Long-Term, Large-Scale Funding and Its Impact
Funding by the German Ministry for Education and Research BMBF:
Phase I (1993-1996): $ 33 M
Phase II (1997-2000): $ 28 M
60% Industrial funding according to shared cost model: $ 17 M
Additional R&D investments of industrial partners: $ 11 M
Total: $ 89 M
> 800 Publications (>600 refereed)
> Many Patents
> 17 Commercial Spin-off Products
> 6 Spin-off Companies
> 900 trained Researchers for German Language Industry
> Product Announcement for GSM version in 2001
Philips, DaimlerChrysler and Siemens are leaders in Spoken Dialog Applications
Page 44
More than 80% of Verbmobil's Translations are Approximately Correct
- Large-Scale Web-based Evaluation: 25 345 Translations, 65 Evaluators
- Sentence Length 1 - 60 Words

Percentage of Approximately Correct Translations

Translation Thread | Word Accuracy 50% (5069 Turns) | Word Accuracy 75% (3267 Turns) | Word Accuracy 80% (2723 Turns)
Case-based Translation | 37% | 44% | 46%
Statistical Translation | 69% | 79% | 81%
Dialog-Act based Translation | 40% | 45% | 46%
Semantic Transfer | 40% | 47% | 49%
Substring-based Translation | 65% | 75% | 79%
Automatic Selection | 57% / 78% * | 66% / 83% * | 68% / 85% *
Manual Selection | 88% | 95% | 97%

* After Training with Instance-based Learning Algorithm
Page 45
Checklist for Final Verbmobil System I
Operational Success Criteria:
- Three domains: appointment scheduling, travel planning, PC hotline
- Bi-directional and speaker-independent translation in the domains appointment scheduling and travel planning
- Translation pairs: German <-> English, German <-> Japanese
- Vocabulary size: 10 000 for German, equivalent English lexicon, 2500 for Japanese
- Word recognition rate (16 kHz):
  German: spontaneous: 75% (cooperative: 85%)
  English: spontaneous: 72% (cooperative: 82%)
  Japanese: spontaneous: 75% (cooperative: 85%)
  (8 kHz) spontaneous: 70% (cooperative: 80%)
- 80% of the translations are approximately correct and the dialog task success rate should be around 90%.
- The average end-to-end processing time should be four times real time (length of the input signal)
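The two numeric criteria can be written down as a small check in Python. The sketch reads "four times real time" as an upper bound on the ratio of processing time to signal length, which is an interpretation, and the function names and sample figures are invented.

```python
def real_time_factor(processing_seconds: float, signal_seconds: float) -> float:
    """End-to-end processing time divided by the length of the input signal."""
    if signal_seconds <= 0:
        raise ValueError("signal length must be positive")
    return processing_seconds / signal_seconds


def meets_criteria(correct: int, total: int,
                   processing_seconds: float, signal_seconds: float) -> bool:
    """Check the 80% translation criterion and the assumed 4x real-time bound."""
    if total <= 0:
        raise ValueError("total must be positive")
    accuracy = correct / total
    return accuracy >= 0.80 and real_time_factor(processing_seconds, signal_seconds) <= 4.0


# A 3-second turn processed in 12 seconds runs at exactly four times real time.
print(real_time_factor(12.0, 3.0))           # 4.0
print(meets_criteria(85, 100, 12.0, 3.0))    # True
print(meets_criteria(70, 100, 12.0, 3.0))    # False
```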
Page 46
Checklist for Final Verbmobil System II
- The system can work in the open microphone mode and cope with speech over GSM mobile phones.
- Verbmobil can be controlled by speech commands.
- A spelling mode is integrated into the speech recognizer.
- The speech recognizers can cope with simple non-speech input (like coughing).
- Spontaneous speech phenomena like repairs, hesitations and agreement failures can be handled.
- The language identification and speech recognition components are implemented as separate components.
- A three-party conference call with Verbmobil and a foreign partner can be initiated by one speaker.
- A high-quality speech synthesis for German and American English is realized.
Page 47
Checklist for Final Verbmobil System III
- Prosodic information is used for input segmentation.
- Unknown words can be identified and processed.
- Robust semantic processing integrates partial analysis results of the competing parsing approaches.
- The selection of the translation result is based on a dynamic choice function based on confidence values computed by competing translation threads.
- Some translation ambiguities can be resolved by the exploitation of world and context knowledge, so that the translation quality is improved.
- Verbmobil can generate various forms of dialog protocols in German and English.
Page 48
Results of the Verbmobil Project have been used in 17 Spin-Off Products by the Industrial Partners DaimlerChrysler, Philips and Siemens
- Dictation Systems: 3
- Spoken Dialog Systems: 4
- Dialog Engines: 2
- Command & Control Systems: 5
- Text Classification Systems: 1
- Translation Systems: 2
Page 49
Successful Technology Transfer: 6 High-Tech Spin-Off Companies in the Area of Language Technology have been founded by Verbmobil Researchers
- CLT Sprachtechnologie GmbH (Saarbrücken): Language Technology for Text Processing, www.clt-st.de
- RETIVOX GbR (Bonn): Speech Synthesis Systems, www.retivox.de
- XTRAMIND Technologies (Saarbrücken): Language Technology for Customer Interaction Services, www.xtramind.com
- SYMPALOG GmbH (Nürnberg): Spoken Dialog Systems, www.sympalog.de
- GSDC GmbH (Nürnberg): Multilingual Documentation, www.ic-portal.gsdc.de
- SCHEMA GmbH (Nürnberg): Document Engineering, www.schema.de
Page 50
Verbmobil was the Key Resource for the Education and Training of Researchers and Engineers Needed to Build Up Language Industry in Germany
- Internships: 18
- Master Students: 238
- PhD Students: 164
- Student Research Assistants: 483
- Habilitations: 16
- Total: 919
Page 51
SmartKom: Intuitive Multimodal Interaction
Main Contractor, Project Management, Testbed, Software Integration: DFKI Saarbrücken
The SmartKom Consortium:
- MediaInterface
- European Media Lab
- IMS Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart
- Ludwig-Maximilians-Universität München
Project Budget: $ 34 M
Project Duration: 4 years
Page 52
SmartKom-Public: A Multimodal Communication Booth
- Smartcard/credit card for authentication and billing
- Docking station for PDA/notebook/camcorder
- High-speed and broad-bandwidth Internet connectivity
- Loudspeaker
- Room microphone
- Face-tracking camera
- Virtual touchscreen protected against vandalism
- Multipoint video conferencing
- High-resolution scanner
Page 53
SmartKom-Mobile: A Handheld Communication Assistant
- Camera
- GPS
- Microphone
- Loudspeaker
- Stylus-activated sketch pad
- Wearable compute server
- Docking station for car PC
- Biosensor for authentication & emotional feedback
- GSM for telephone, fax, Internet connectivity
Page 54
Verbmobil is a Very Large Dialog System
69 modules communicate via 224 blackboards
HPSG for German uses a hierarchy of 2,400 types
15,385 entries in the semantic database
22,783 transfer rules and 13,640 microplanning rules
30,000 templates for case-based translation
691,583 alignment templates
334 finite-state transducers
Page 55
Additional Information about Verbmobil during COLING
1. 2 Tutorials:
   - Klüter/Reithinger: Verbmobil Development and Integration
   - Müller: HPSG
2. 11 Presentations at main conference (regular papers and project notes):
   - Probabilistic Parsing
   - Tense Translation
   - Selection of Translation Results
   - Statistical Translation (4)
   - HPSG Parsing
   - Semantic Construction
   - Self Corrections
3. Verbmobil Demos at the COLING exhibition
Page 56
Conclusion I
Real-world problems in language technology like the understanding of spoken dialogs, speech-to-speech translation and multimodal dialog systems can only be cracked by the combined muscle of deep and shallow processing approaches.
In a multi-blackboard architecture based on packed representations on all processing levels (speech recognition, parsing, semantic processing, translation, generation) using charts with underspecified representations (e.g. UDRS), the results of concurrent processing threads can be combined in an incremental fashion.
Page 57
Conclusion II
All results of concurrent processing modules should come with a confidence value, so that a selection module can choose the most promising result at each processing stage.
Packed representations together with formalisms for underspecification capture the uncertainties in each processing phase, so that the uncertainties can be reduced by linguistic, discourse and domain constraints as soon as they become applicable.
Page 58
Conclusion III
Deep processing can be used for merging, completing and repairing the results of shallow processing strategies.
Shallow methods can be used to guide the search in deep processing.
Statistical methods must be augmented by symbolic models (e.g. class-based language modelling, word order normalization as part of statistical translation).
Statistical methods can be used to learn operators or selection strategies for symbolic processes.
It is much more than a balancing act... (see Klavans and Resnik 1996)
Page 59
Additional Results (not promised in the project proposal)
- English speech recognition for telephone input (DaimlerChrysler)
- Two additional translation engines: case-based (ALI, DFKI) and substring-based translation (LTrans, Siemens)
- An additional protocol mode (baseline protocol, DFKI)
Open Problems:
- Integrating top-down knowledge into basic speech recognition processes
- Exploiting more knowledge about human interpretation strategies
- More robust translation of turns with very low word accuracy rates
- More systematic use of expert knowledge about the domain of discourse
Page 60
URL of this Presentation:
www.dfki.de/~wahlster/vm-final