
Information Retrieval using Intelligent Speech Communication Interface. Institute of Informatics of the Slovak Academy of Sciences, Bratislava. trnka@savba.sk

Dec 27, 2015

Page 1: Information Retrieval using Intelligent Speech Communication Interface Institute of Informatics of the Slovak Academy of Sciences, Bratislava trnka@savba.sk.

Information Retrieval using Intelligent Speech Communication Interface

Institute of Informatics of the Slovak Academy of Sciences, Bratislava

trnka@savba.sk

Page 2

WIKT 2006 2

Overview

1. Introduction

2. IRKR system

3. Architecture

4. Pilot applications

5. Realization of service

Page 3

What is a Speech Communication Interface (SCI)?

• An SCI, or Spoken Language Dialog System (SLDS), is a computer system that you can talk to in order to carry out some task

• Contemporary SLDSs are typically of two kinds:

– Transaction-based systems, which let the user carry out a transaction, such as buying or selling stocks, or reserving a seat on a plane

– Information-provision systems, which provide information in response to a query, such as a request for timetable or weather information

• The circle of a typical speech dialog in an SCI also shows the main components of an SCI

Page 4

The Speech Dialog Circle in SLDS

[Figure: the speech dialog circle, with the stages below; the diagram also contains a block labeled "Data, Rules".]

• Speech → Automatic Speech Recognition (ASR) → words spoken: “I need a flight from Košice to Bratislava roundtrip”

• Words spoken → Spoken Language Understanding (SLU) → meaning: ORIGIN_CITY: KOŠICE, DESTINATION_CITY: BRATISLAVA, FLIGHT_TYPE: ROUNDTRIP

• Meaning → Dialog Management (DM) → action: GET DEPARTURE DATE

• Action → Response Generation (RG) → “Which date do you want to fly from Košice to Bratislava?”

• Response text → Text-to-Speech (TTS) → speech
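The SLU, DM and RG stages of the circle can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the IRKR implementation: the keyword rules, the city list, the slot DEPARTURE_DATE and the action names are hypothetical, and ASR and TTS are replaced by plain text.

```python
import re

# Hypothetical city list and slot order, for illustration only.
CITIES = ["Košice", "Bratislava", "Žilina"]
REQUIRED_SLOTS = ["ORIGIN_CITY", "DESTINATION_CITY", "FLIGHT_TYPE", "DEPARTURE_DATE"]


def understand(words):
    """Toy SLU: map recognized words to slot/value pairs with keyword rules."""
    meaning = {}
    cities = "|".join(CITIES)
    origin = re.search(rf"from ({cities})", words)
    destination = re.search(rf"to ({cities})", words)
    if origin:
        meaning["ORIGIN_CITY"] = origin.group(1).upper()
    if destination:
        meaning["DESTINATION_CITY"] = destination.group(1).upper()
    if "roundtrip" in words.lower():
        meaning["FLIGHT_TYPE"] = "ROUNDTRIP"
    return meaning


def manage_dialog(meaning):
    """Toy DM: ask for the first missing slot, otherwise query the backend."""
    for slot in REQUIRED_SLOTS:
        if slot not in meaning:
            return "GET_" + slot
    return "QUERY_FLIGHTS"


def generate_response(action, meaning):
    """Toy RG: turn the chosen action into a sentence for the TTS module."""
    if action == "GET_DEPARTURE_DATE":
        return (f"Which date do you want to fly from "
                f"{meaning['ORIGIN_CITY'].title()} to "
                f"{meaning['DESTINATION_CITY'].title()}?")
    return "Sorry, could you repeat that?"


meaning = understand("I need a flight from Košice to Bratislava roundtrip")
action = manage_dialog(meaning)
response = generate_response(action, meaning)
```

With the utterance from the slide, the three slots are filled, the missing departure date becomes the next action, and the response is the question shown in the diagram.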

Page 5

IRKR

• the first SLDS able to interact in the Slovak language

• developed in the period from July 2003 to June 2006

• supported by the National program for R&D “Building of the information society”

Page 6

IRKR - partners

• Technical University of Košice

• Institute of Informatics, the Slovak Academy of Sciences

• Slovak University of Technology in Bratislava

• University of Žilina

Page 7

IRKR - specification

• natural interaction

• multi-user interaction

• Slovak language

• fixed and mobile telephone networks

• access to distributed information (on the Internet)

Page 8

IRKR - architecture

• DARPA Communicator architecture

• ‘hub-and-spoke’

• each module seeks services from and provides services to the other modules

• modules communicate with each other through the central software router, the Galaxy hub

• communicator.sourceforge.net

Page 9

Galaxy – basic overview

• distributed, message-based, hub-and-spoke infrastructure optimized for constructing spoken dialogue systems

• available under a liberal open source license

• not an end-to-end dialogue system, but provides tools for constructing such a system out of a suite of servers

• provides a sophisticated and general transport layer for connecting servers and Hubs, as well as a message syntax (does not provide specifications about semantics)

• the core Galaxy Communicator infrastructure is written in C

• support for defining server and connection initialization functions in C, Python, Java and Allegro Common Lisp
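The hub-and-spoke idea can be illustrated with a minimal sketch. It does not use the Galaxy Communicator API: the Hub class, the operation names and the dict-based frames are hypothetical stand-ins that only show servers reaching each other through a central router.

```python
class Hub:
    """Toy central router: servers never call each other directly."""

    def __init__(self):
        self.servers = {}

    def register(self, operation, handler):
        # Each server announces which operation it provides.
        self.servers[operation] = handler

    def send(self, operation, frame):
        # Route the frame (a plain dict here) to the providing server.
        if operation not in self.servers:
            raise KeyError(f"no server provides {operation!r}")
        return self.servers[operation](self, frame)


def asr_server(hub, frame):
    # Stand-in for recognition: the "audio" already carries its transcript.
    return {"text": frame["audio"]}


def tts_server(hub, frame):
    # Stand-in for synthesis: return the text tagged as speech.
    return {"speech": frame["text"]}


def dialogue_manager(hub, frame):
    # The dialogue manager obtains ASR and TTS services through the hub.
    words = hub.send("recognize", {"audio": frame["audio"]})["text"]
    reply = f"Did you say {words}?"
    return hub.send("synthesize", {"text": reply})


hub = Hub()
hub.register("recognize", asr_server)
hub.register("synthesize", tts_server)
hub.register("handle_call", dialogue_manager)
result = hub.send("handle_call", {"audio": "Bratislava, tomorrow"})
```

Because every request passes through the hub, a server can be replaced (for example, one recognizer by another) without changing the modules that use it.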

Page 10

IRKR - architecture

[Figure: SLDS - Speech Language Dialog System. Blocks: Telephony server, ASR server, TTS server, Internet Information server and Dialogue manager (VoiceXML), arranged around the HUB; the Telephone network is shown outside the SLDS.]

Page 11

Automatic speech recognition server

• conversion of incoming speech to the corresponding text

• two speech recognizers, both freely available for nonprofit research:

– ATK - htk.eng.cam.ac.uk/develop/atk.shtml

– SPHINX - cmusphinx.sourceforge.net

• phoneme acoustic models:

– built following the REFREC 0.96 training procedure

– acoustic features were conventional 39-dimensional MFCCs, including energy and first- and second-order deltas

– 3-state left-to-right HMMs

– context-dependent (triphone) acoustic models
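How a 39-dimensional feature vector can arise from static coefficients and their deltas is sketched below. The split into 13 static coefficients (for example 12 cepstral coefficients plus energy) is a common convention and an assumption here, since the slides do not state it; the regression window of 2 frames and the dummy input values are also assumptions.

```python
def deltas(frames, window=2):
    """Regression-based time derivatives of a sequence of feature vectors.

    Frames beyond the edges are replaced by the first or last frame.
    """
    n = len(frames)
    norm = 2 * sum(k * k for k in range(1, window + 1))
    result = []
    for t in range(n):
        vector = []
        for i in range(len(frames[t])):
            acc = 0.0
            for k in range(1, window + 1):
                ahead = frames[min(t + k, n - 1)][i]
                behind = frames[max(t - k, 0)][i]
                acc += k * (ahead - behind)
            vector.append(acc / norm)
        result.append(vector)
    return result


def add_deltas(static):
    """Append first- and second-order deltas to each static vector."""
    d1 = deltas(static)
    d2 = deltas(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Five dummy frames of 13 static coefficients each (values are made up).
static = [[float(t + i) for i in range(13)] for t in range(5)]
features = add_deltas(static)
```

Each output frame holds 13 static values, 13 first-order deltas and 13 second-order deltas, which gives the 39 dimensions.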

Page 12

Databases used for ASR training

• SpeechDat-E SK

– 1000 speakers, PSTN (office, home, phone booth)

• MobilDat SK

– 1100 speakers, GSM networks (office, home, street, vehicle, public building)

• both databases are balanced for age, regional accent, and sex of the speakers

• every speaker recorded 50 files: numbers, names, dates, money amounts, embedded command words, geographical names, phonetically balanced words, phonetically balanced sentences, Yes/No answers and one longer, optional spontaneous utterance

Page 13

Text-to-speech synthesis

• TTS converts outgoing information in text form to speech

• intelligibility, naturalness

• we developed two TTS modules using two different approaches:

• diphone

– intelligible speech

– flexible and totally domain-independent

– computationally inexpensive

– small memory footprint

– sounds a bit robotic and tedious

• unit-selection

– better naturalness

– some problems with intelligibility

– limited domain

Page 14

TTS architecture

[Figure: two block diagrams, captioned "Diphone synthesizer" and "Unit selection synthesizer".]

• Labels in the first diagram (TEXT to SPEECH): Text preprocessor; Syntactic-prosodic parser; Prosody generation (F0, Energy, duration, ...); Segment list generation; Index of speech segments DB; text analysis; Index of acousticons; preparation; Orthoepic transcription (SAMPA code); Prosody matching; Segment concatenation; Signal Synthesis; Speech segments DB; Acousticons DB; signal processing

• Labels in the second diagram (TTS server): GALAXY wrapper; Telephony server; HUB; dictionary; processing of numerals and abbreviations; data driven phonetic transcription; TTS control block; corpus/SDB; unit selection; unit concatenation; audio file; broker channel; phrase cache; high level synthesis; low level synthesis

Page 15

Dialogue manager

• The dialogue manager controls the dialogue of the system with the user

• The heart of the dialogue manager is the interpreter of the VoiceXML mark-up language, which:

– simplifies speech application development

– enables distributed application design

– accelerates the development of interactive voice response (IVR) environments
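What interpreting a VoiceXML document involves can be hinted at with a small sketch. It is not the IRKR interpreter: the document below is a hypothetical, simplified form (the XML namespace is omitted), and the loop only visits the fields in order, whereas a real interpreter also handles grammars, events and scripting.

```python
import xml.etree.ElementTree as ET

# A minimal VoiceXML-like document; the form and field names are hypothetical.
DOCUMENT = """
<vxml version="2.0">
  <form id="weather">
    <field name="city">
      <prompt>Please name a city.</prompt>
    </field>
    <field name="day">
      <prompt>For which day do you want the forecast?</prompt>
    </field>
  </form>
</vxml>
"""


def run_form(document, answers):
    """Visit the fields in document order, collect each prompt and fill the
    field from `answers` (a stand-in for the ASR result)."""
    form = ET.fromstring(document).find("form")
    filled, prompts = {}, []
    for field in form.findall("field"):
        name = field.get("name")
        prompts.append(field.findtext("prompt").strip())
        filled[name] = answers[name]
    return prompts, filled


prompts, filled = run_form(DOCUMENT, {"city": "Bratislava", "day": "tomorrow"})
```

The dialogue itself lives in the document, so a new service can be described by writing a new document rather than by changing the interpreter.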

Page 16

Dialogue manager architecture

[Figure: Dialogue manager, connected to the HUB and working with VoiceXML. Blocks: VoiceXML interpreter (core), XML parser, Grammars handling unit, Input interface, Output interface, Document manager, Logging interface, ECMAScript unit.]

Page 17

Audioserver

• provides the whole information system with a reliable multiuser connection to the telephone networks

• supports telephone hardware - Dialogic D120/41JCT-LSEuro card

• direct (broker) connection between the audio server and the ASR server or TTS server

Page 18

Dialogue manager architecture

[Figure: connection of the audio server to the telephone networks. Labels: GSM, ISDN/PSTN and IP networks; GSM-GW; M-GW; PABX switch; interfaces a/b, BRA, BRA/PRA, H.323, SIP; 4..12 a/b; Telephony i/o board; Audioserver; HUB; TTS; ASR.]

Page 19

Information server - IS

• the IS connects the system to information sources and retrieves the information required by the user

• a special IS for every pilot application (a special web wrapper)

• a rule-based, ad hoc IS that searches only a few predefined web servers with a relatively well-known page structure will do a much better job

• returns the data in XML format

• caches results with a user-defined expiration
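Caching with an expiration time can be sketched as follows. The class name, the key format and the injected clock are hypothetical; the slides do not describe how the IRKR cache is built.

```python
import time


class ExpiringCache:
    """Cache of retrieved results; entries expire after `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}

    def get(self, key, fetch):
        """Return the cached value for `key`, calling `fetch()` on a miss
        or when the stored value is older than `ttl`."""
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch()
        self.entries[key] = (now, value)
        return value


# Usage with a made-up key and a stand-in for the web wrapper call.
cache = ExpiringCache(ttl=600)
forecast = cache.get(("weather", "Bratislava", "tomorrow"),
                     lambda: "<forecast>sunny</forecast>")
```

A repeated question within the expiration time is answered from the cache, so the web source is contacted only when the stored result is too old.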

Page 20

IS architecture

[Figure: IS - Backend, attached to the HUB through a Galaxy interface. Inside the backend, an Integrator sits above four web wrappers; each web wrapper is paired with one web source on the Internet.]

Page 21

WEB wrapper

• navigation through the web server

• extraction from the web pages

• mapping onto a structured format (XML)

• data verification

• as robust as possible against changes in the structure of the web pages

[Figure: Web wrapper. Blocks: Navigation module, Extraction module, Mapping module, Data verification. Other labels: Internet, HTML, XML, Database, SQL.]
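The extraction, verification and mapping steps of a web wrapper can be sketched with the Python standard library. The page fragment, the element names and the verification rule are hypothetical, and the navigation step (fetching the page from the web server) is left out.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical page fragment; real pages will differ.
PAGE = """
<table id="forecast">
  <tr><td>Bratislava</td><td>sunny</td><td>32</td></tr>
  <tr><td>Košice</td><td>cloudy</td><td>n/a</td></tr>
</table>
"""


class CellExtractor(HTMLParser):
    """Extraction module: collect the text of <td> cells, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell and self.rows:
            self.rows[-1].append(data.strip())


def is_valid(row):
    """Data verification: three cells and an integer temperature."""
    return len(row) == 3 and row[2].lstrip("-").isdigit()


def to_xml(rows):
    """Mapping module: turn verified rows into a structured XML string."""
    root = ET.Element("forecasts")
    for city, sky, temperature in filter(is_valid, rows):
        item = ET.SubElement(root, "forecast", city=city)
        ET.SubElement(item, "sky").text = sky
        ET.SubElement(item, "temperature").text = temperature
    return ET.tostring(root, encoding="unicode")


extractor = CellExtractor()
extractor.feed(PAGE)
xml_result = to_xml(extractor.rows)
```

The second row fails verification because its temperature is not a number, so only the Bratislava entry reaches the XML output.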

Page 22

Pilot applications

• “Weather forecast in Slovakia”

– www.meteo.sk; www.shmu.sk

– weather forecast for about 80 Slovak district towns

– Place: district town or holiday locality

– Date: relative date / exact date

• “Timetable of Slovak Railways”

– www.cp.sk

– information about the Slovak railways timetable

– Starting place: railway station in Slovakia

– Destination place: railway station in Slovakia

– Date: relative date (today, tomorrow etc.) / absolute date (“the twentieth of December” etc.)

– Time: departure time (hour, minute)
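Turning a relative or absolute date expression into a calendar date can be sketched as below. The English keywords, the simplified absolute form ("20 December" instead of a spoken ordinal) and the rule that a date already passed refers to next year are assumptions made for illustration; the real system works in Slovak.

```python
from datetime import date, timedelta

# Hypothetical English keywords and month names.
RELATIVE_DAYS = {"today": 0, "tomorrow": 1, "the day after tomorrow": 2}
MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]


def resolve_date(expression, today):
    """Map a relative ("tomorrow") or absolute ("20 December") expression
    to a calendar date, relative to `today`."""
    text = expression.strip().lower()
    if text in RELATIVE_DAYS:
        return today + timedelta(days=RELATIVE_DAYS[text])
    day, month_name = text.split()
    month = MONTHS.index(month_name) + 1
    candidate = date(today.year, month, int(day))
    if candidate < today:
        # Assumption: a date that has already passed means next year.
        candidate = date(today.year + 1, month, int(day))
    return candidate


departure = resolve_date("tomorrow", date(2006, 12, 31))
```

The resolved date is what the information server needs when it queries the timetable or forecast source.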

Page 23

Realization of services

• available at: +421 55 602 2297, +421 2 5941 1118 (T-com), +421 911 650 038 (T-Mobile), +421 918 717 491 (Orange), irkr_pub (Skype)

• IRKR on the web - irkr.fei.tuke.sk

A typical dialogue between the user (U) and the system (S):

S: Welcome to the IRKR portal. Would you like to play the introduction?
U: No.
S: Choose one of the services: Weather forecast or Railway timetable.
U: Weather forecast.
S: Please name a city and a day for which you want to get the weather forecast.
U: Bratislava, tomorrow.
S: Did you say Bratislava, tomorrow?
U: Yes.
S: The weather forecast for Bratislava for tomorrow is: sunny, 32 degrees Celsius...

Page 24

Thank you for your attention