
24-Mar-14 Dashboard image reproduced with the permission of Visteon and 3M Corporation

GENIVI is a registered trademark of the GENIVI Alliance in the USA and other countries Copyright © GENIVI Alliance 2012


Introduction of GENIVI Speech Services

W3C Face to Face in Santa Clara, CA March 17 – 18

Mario Thielert, Continental Automotive AG
Dominique Massonié, Elektrobit Automotive GmbH


Status of Speech in GENIVI

Description: An application can assume a standard interface to implement a speech dialog as well as to output speech. GENIVI application cores and external apps can rely on standard interfaces towards speech stacks.

Scope: Identify requirements for a unified interface for speech components in the system, GENIVI Speech APIs, integration of speech recognizer and TTS engines, and identification of standards for resources (such as phonetic alphabets).

Responsible:
•  David Kämpf (Subproject Lead), Continental Automotive
•  Mario Thielert, Continental Automotive
•  Dominique Massonié, Elektrobit

[Process diagram: Collect requirements, Use Case, Define API, PoC, Compliance Statement]

Basic Speech Architecture – Relations to other Areas

Application Cores may
•  provide data that will be included in dynamic grammars
•  generate prompts that will be spoken by the TTS engine

External Apps may
•  register app-specific dialog/content
•  react to dialog steps
•  generate prompts
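The idea of application cores feeding data into dynamic grammars can be sketched as follows. This is only an illustration under stated assumptions: the class `DynamicGrammar`, its methods and the slot names are hypothetical and do not come from the GENIVI requirements; a real recognizer would compile the entries into its own grammar format.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical container: one named slot per data source,
// for example "contact" for the phone core or "station" for the radio core.
class DynamicGrammar {
public:
    // Called by an application core whenever its data changes;
    // replaces the previous entries of that slot.
    void setSlot(const std::string& slot, const std::set<std::string>& entries) {
        slots_[slot] = entries;
    }

    // True if `word` is currently a valid value for `slot`.
    bool accepts(const std::string& slot, const std::string& word) const {
        auto it = slots_.find(slot);
        return it != slots_.end() && it->second.count(word) > 0;
    }

private:
    std::map<std::string, std::set<std::string>> slots_;
};

// Usage sketch: the phone core publishes its contacts, the dialog checks a name.
bool demoDynamicGrammar() {
    DynamicGrammar grammar;
    std::set<std::string> contacts;
    contacts.insert("Alice");
    contacts.insert("Bob");
    grammar.setSlot("contact", contacts);
    return grammar.accepts("contact", "Alice") && !grammar.accepts("contact", "Carol");
}
```

Keeping one slot per data source means a core can refresh its own entries without touching the data of other cores.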

Status of Speech @ HORIZON – Requirements

•  We have only collected requirements in the Speech Area that
   –  capture non-differentiating aspects
   –  are not specific to a product segment (e.g. high end)
   –  capture KPIs only where usability is affected

Additional information on the Speech requirements can be found in
•  UML model
•  Compliance Document
https://collab.genivi.org/wiki/display/genivi/Compliance+Team#ComplianceTeam-Compliance5.0DraftDocuments

[Slide note: UPDATE REQUIRED. Add screenshot of EA Req. tree]

Status of Speech @ HORIZON – Use Cases

•  Speech is a P2 Placeholder component
•  Use Cases are defined that cover:
   –  Core App
      •  Radio
      •  Media
      •  Navigation
      •  Phone
   –  Speech Dialog
   –  NLU
   –  Server-based recognition

Basic Speech Architecture – Speech Software Stack

•  Speech Dialog
   –  Modeling the user interaction
   –  Handling of resources
   –  Interact with GUI
   –  Interact with business logic
•  Dialog Step Abstraction
   –  Modeling one dialog step
•  Speech Input Service
   –  Integration of the speech recognizer (ASR engine)
   –  Resource handling
•  Speech Output Service
   –  Accessing TTS functionality
   –  Integration of TTS engine

[Figure: speech software stack. Labels: Business logic; Data binding / communication (D-Bus IF); Event/Message handling; UI logic; UI Layer; Speech Dialog; Dialog Step Abstraction; Dialog Framework; Input Service; Output Service; Speech Base Services; IPC; TTS engine; ASR engine; Engines]

Speech Roadmap

•  Level 1 – Placeholder Compliance based on requirements
•  Level 2 – Abstract Compliance based on interfaces

Releases: GEMINI 10/2013, HORIZON 04/2014, INTREPID 10/2014, J* 04/2015

[Roadmap figure: Speech Output Service, Speech Input Service and Speech Dialog Service boxes placed per release and compliance level]

EG HMI Subproject Speech – Status Summary

•  Status of Speech @ HORIZON (04/2014)
   –  Speech Use Cases defined (Output, Input, Dialog)
   –  Speech Requirements defined (Output, Input, Dialog)
   –  Basic Speech Architecture defined
   –  Defined and agreed Speech Output Service API
•  Next Steps
   –  Proof of concept for Speech Output Service API
   –  Define Speech Input Service and Speech Dialog Service APIs
   –  Define interfaces towards application cores
      •  Dynamic data (e.g. media ID3 tags, phonebook contacts, station lists etc.)
      •  Navigation address data

Why should you care about the GENIVI Speech API?

•  GENIVI API based on an API that is used in multiple customer products
   –  Covering the automotive use cases
   –  Proven and mature API
•  Benefits of a GENIVI API for the "Speech Output Service"
   –  Session management to support multiple concurrent clients (prioritization, audio connection handling etc.)
   –  Prompt preparation for low-latency playback
•  Benefits of a GENIVI API for the "Dialog Step Abstraction"
   –  Reduces the effort to implement a dialog
   –  Basis for app development
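Session management with prioritization for concurrent clients might look roughly like the sketch below. It assumes one simple arbitration rule (a strictly higher priority takes over the output, anything else is refused, and a preempted client is dropped rather than resumed). The names `OutputSessionManager` and `Priority`, the priority levels and the rule itself are hypothetical; the slides do not specify how the GENIVI Speech Output Service arbitrates.

```cpp
#include <cassert>
#include <string>

// Hypothetical priority levels; the actual levels are not given in the slides.
enum class Priority { Low, Normal, High };

// Illustrative arbiter: at most one client owns the speech output at a time.
class OutputSessionManager {
public:
    // Returns true if `client` obtained the output. A strictly higher
    // priority preempts the current owner; otherwise the request is refused.
    bool open(const std::string& client, Priority priority) {
        if (hasOwner_ && priority <= ownerPriority_) {
            return false;
        }
        owner_ = client;  // a preempted owner is simply dropped in this sketch
        ownerPriority_ = priority;
        hasOwner_ = true;
        return true;
    }

    // Releases the output if `client` currently owns it.
    void close(const std::string& client) {
        if (hasOwner_ && owner_ == client) {
            hasOwner_ = false;
            owner_.clear();
        }
    }

    // Name of the owning client, or an empty string when idle.
    std::string owner() const { return owner_; }

private:
    bool hasOwner_ = false;
    std::string owner_;
    Priority ownerPriority_ = Priority::Low;
};

// Usage sketch: media starts speaking, e-mail is refused, navigation preempts.
std::string demoArbitration() {
    OutputSessionManager manager;
    bool mediaOk = manager.open("media", Priority::Normal);
    bool mailOk = manager.open("email", Priority::Low);
    bool naviOk = manager.open("navi", Priority::High);
    assert(mediaOk && !mailOk && naviOk);
    return manager.owner();
}
```

A production service would also have to notify the preempted client and manage the audio connection, which this sketch leaves out.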

Next Steps for W3C and GENIVI

•  Converging W3C / GENIVI Speech APIs
   –  Adding automotive capabilities to the W3C proposals
   –  GENIVI could support the W3C standard out of the box
•  Shared development effort / joint meetings
•  Open questions about speech standardization
   –  ASR grammars (NLU, server-based, word lists, …)
   –  Phonetic alphabets and transcription mechanism
   –  Leverage W3C markup languages (VoiceXML, SSML, SRGS, …)


BACKUP

Speech Output API – Reader Interface

•  Applications can read out text easily
•  Connection handling is taken care of by the TTSReader
•  Applications have to provide:
   –  text or multiple text chunks to be spoken
   –  an application priority
   –  a context ID that identifies the domain of the text being spoken (e.g. Navi, E-Mail etc.)

    // Qt signal/slot interface as shown on the slide (excerpt).
    class TTSReader {
    signals:
        // Status notifications sent to the application.
        void notifyConnectionStatus(const TTSAppPrompterConnectionStatus eConnState);
        void notifyPrompterStatus(const TTSAppPrompterStatus ePromptState);
        void notifyChunkMemorySize(const Int memSize);

    public slots:
        // Session control: open with a priority and a context ID, then close or abort.
        void openPrompter(const TTSAppPrio ePrio, Int ctxId);
        void closePrompter();
        void abortPrompter();
        // Text to be spoken, as a single chunk or as several chunks.
        void addTextChunk(const QString chunk);
        void addTextChunk(const QVector<QString> chunks);
        // Return type is missing on the slide; void is assumed here.
        void requestChunkMemorySize();
    };
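The calling sequence implied by the reader interface (open a prompter, add text chunks, close) can be tried out without Qt. The following stand-in is an assumption for illustration only: `MockTTSReader` replaces QString with std::string, replaces the signals with a queryable status, and invents the enum values and the context ID, since the slide names these types but does not define them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for types the slide names but does not define.
enum class TTSAppPrio { Low, Normal, High };
enum class PrompterStatus { Idle, Open, Closed, Aborted };

// Qt-free mock that mirrors the slot names of the TTSReader slide.
class MockTTSReader {
public:
    void openPrompter(TTSAppPrio prio, int ctxId) {
        prio_ = prio;
        ctxId_ = ctxId;
        chunks_.clear();
        status_ = PrompterStatus::Open;
    }
    // Chunks are only accepted while the prompter is open.
    void addTextChunk(const std::string& chunk) {
        if (status_ == PrompterStatus::Open) chunks_.push_back(chunk);
    }
    void addTextChunk(const std::vector<std::string>& chunks) {
        for (const auto& c : chunks) addTextChunk(c);
    }
    void closePrompter() { status_ = PrompterStatus::Closed; }
    void abortPrompter() {
        chunks_.clear();
        status_ = PrompterStatus::Aborted;
    }

    PrompterStatus status() const { return status_; }
    int contextId() const { return ctxId_; }
    TTSAppPrio priority() const { return prio_; }

    // Text queued so far, joined with single spaces.
    std::string queuedText() const {
        std::string out;
        for (const auto& c : chunks_) {
            if (!out.empty()) out += ' ';
            out += c;
        }
        return out;
    }

private:
    TTSAppPrio prio_ = TTSAppPrio::Normal;
    int ctxId_ = 0;
    PrompterStatus status_ = PrompterStatus::Idle;
    std::vector<std::string> chunks_;
};

// Usage sketch: a navigation prompt delivered in several chunks.
std::string demoReadOut() {
    MockTTSReader reader;
    const int kCtxNavi = 1;  // hypothetical context ID for the navigation domain
    reader.openPrompter(TTSAppPrio::High, kCtxNavi);
    reader.addTextChunk("In 300 meters,");
    std::vector<std::string> rest = {"turn left", "onto Main Street."};
    reader.addTextChunk(rest);
    reader.closePrompter();
    assert(reader.status() == PrompterStatus::Closed);
    return reader.queuedText();
}
```

In the real interface the status would arrive asynchronously through the notify signals rather than through a getter; the mock only shows the order of calls an application makes.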