
Apr 28, 2018

Page 1

Speech Technology Part I : Automatic Speech Recognition

Rajesh M. Hegde [email protected]

Associate Professor Dept. of EE

Indian Institute of Technology Kanpur

Several pictures used in this presentation have been collected from various sources available on the web and have been acknowledged in the slides.

Page 2

Topics Covered

• What is automatic speech recognition (ASR)?
• What are the challenges in implementing ASR systems on a mobile phone?
• How can speech technology be used for developing applications on a mobile phone?

Page 3

Broad Objectives of Speech Recognition for Machines

Speech to Text (ASR)

Source : Reynolds et al.

Page 4

Speech Recognition for Mobile Phones

• Speech recognition converts a speech signal, acquired by a mobile phone, to a sequence of words.

• The recognition output can be used in command and control, email, search, and communication.

• This output can also be used in dialog management and natural language understanding.

• What you can do with it : Dictation, Call routing, Directory assistance, Travel planning, and Logistics.

Page 5

Overview of the Automatic Speech Recognition (ASR) Technology

Open Source Tools : HTK and CMU Sphinx

Source : Google Image Search

Page 6

Popular Commercial Applications : Siri, Google Voice

Source : Google, Apple

Page 7

Client and Server Based Speech Recognition on the Mobile Phone

Source : Pearce et al., ETSI

Speech Recognition at the Client Mobile Phone

Server based Speech Recognition on the Mobile Phone

Page 8

ASR Issues on Mobile Phones

• Memory Crunching
• Computational Complexity
• Power Requirement

Page 9

ASR Issues on Mobile Phones : Search Complexity

“Their Car” = DH EH R [word] K AA R

[Figure: first phoneme node DH, labelled P(“DH”)]

Source : Slides from Krishna et al., U Michigan

Page 10

ASR Issues on Mobile Phones : Search Complexity

DH EH R [word] K AA R

[Figure: search network with the single phoneme node DH]

Source : Slides from Krishna et al., U Michigan

Page 11

ASR Issues on Mobile Phones : Search Complexity

DH EH R [word] K AA R

[Figure: search network with the single phoneme node DH]

Source : Slides from Krishna et al., U Michigan

Page 12

ASR Issues on Mobile Phones : Search Complexity

DH EH R [word] K AA R

[Figure: search network growing from DH. Phoneme nodes: DH, EH, R, AH, AX, IH, IY. Word hypotheses: “Their”, “The”, “Ear”, with a [word] boundary node]

Source : Slides from Krishna et al., U Michigan

Page 13

ASR Issues on Mobile Phones : Search Complexity

DH EH R [word] K AA R

[Figure: search network growing from DH. Phoneme nodes: DH, EH, R, AH, AX, IH, IY. Word hypotheses: “Their”, “The”, “Ear”, with a [word] boundary node]

Source : Slides from Krishna et al., U Michigan

Page 14

ASR Issues on Mobile Phones : Search Complexity

DH EH R [word] K AA R

[Figure: search network extended to the second word. Phoneme nodes: DH, EH, R, AH, AX, IH, IY, K, AA, R, P, AE, T. Word hypotheses: “Their”, “The”, “Ear”, “Car”, “Cat”, “Cap”, with [word] boundary nodes]

Source : Slides from Krishna et al., U Michigan
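
The word hypotheses on this page share leading phonemes ("Their" and "The" both start with DH; "Car", "Cat", and "Cap" all start with K). A minimal Python sketch of a pronunciation prefix tree illustrates how a decoder can share that work. The lexicon, the phoneme strings, and all function names below are assumptions made for illustration; they are not taken from the slides' source or from any toolkit.

```python
# Toy pronunciation lexicon. The phoneme strings are illustrative
# assumptions loosely following the labels shown on the slides.
LEXICON = {
    "their": ["DH", "EH", "R"],
    "the": ["DH", "AX"],
    "car": ["K", "AA", "R"],
    "cat": ["K", "AE", "T"],
    "cap": ["K", "AE", "P"],
}


def build_prefix_tree(lexicon):
    """Build a nested-dict prefix tree; the key "#word" marks a word end."""
    root = {}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.setdefault(phone, {})
        node["#word"] = word
    return root


def count_phone_nodes(node):
    """Count phoneme nodes in the tree, ignoring word-end markers."""
    return sum(1 + count_phone_nodes(child)
               for key, child in node.items() if key != "#word")


def words_with_prefix(node, prefix):
    """Return all words whose pronunciation starts with the given phonemes."""
    for phone in prefix:
        if phone not in node:
            return []
        node = node[phone]
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        for key, value in current.items():
            if key == "#word":
                found.append(value)
            else:
                stack.append(value)
    return sorted(found)


tree = build_prefix_tree(LEXICON)
flat_size = sum(len(phones) for phones in LEXICON.values())  # no sharing
tree_size = count_phone_nodes(tree)                          # shared prefixes

print("phoneme nodes without sharing:", flat_size)
print("phoneme nodes in prefix tree:", tree_size)
print("words starting with DH:", words_with_prefix(tree, ["DH"]))
```

Sharing prefixes reduces the number of nodes to evaluate, but every word that remains consistent with the phonemes heard so far still has to be kept as a hypothesis.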

Page 15

ASR Issues on Mobile Phones : Search Complexity

DH EH R [word] K AA R

[Figure: search network with further competing branches. Phoneme nodes: DH, EH, R, AH, AX, IH, IY, K, AA, R, P, AE, T, NH, F, S, N, L, EH, OY]

Source : Slides from Krishna et al., U Michigan

Page 16

ASR Issues on Mobile Phones : Search Complexity

DH EH R [word] K AA R

[Figure: search network with further competing branches. Phoneme nodes: DH, EH, R, AH, AX, IH, IY, K, AA, R, P, AE, T, NH, F, S, N, L, EH, OY, TH, SH, T, IY, OW, G]

Source : Slides from Krishna et al., U Michigan

Page 17

ASR Issues on Mobile Phones : Search Complexity

DH EH R [word] K AA R

[Figure: fully expanded search network. Phoneme nodes: DH, EH, R, AH, AX, IH, IY, K, AA, R, P, AE, T, NH, F, S, N, L, EH, OY, TH, SH, T, IY, OW, G, DK, CH, ER, IY, IH, F, K, DUH, OW, IH, Z, OW, JH, V, ZH, AX, G, SH, GH]

Source : Slides from Krishna et al., U Michigan
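
The growth of the network across these pages can be put into rough numbers. The sketch below counts how many distinct phoneme sequences an unconstrained search would have to consider as the sequence gets longer. The inventory size of 40 is an assumption (roughly the size of a typical English phoneme set), not a figure from the slides, and a real decoder is constrained by its lexicon and grammar, so this is an upper bound only.

```python
def num_phone_sequences(inventory_size, length):
    """Number of distinct phoneme sequences of the given length."""
    return inventory_size ** length


INVENTORY_SIZE = 40  # assumed size of the phoneme inventory

# "Their car" spans six phonemes: DH EH R K AA R.
for length in range(1, 7):
    print(length, num_phone_sequences(INVENTORY_SIZE, length))
```

Under this assumption a six-phoneme utterance already admits about four billion unconstrained sequences, which is why the search has to be restricted and pruned.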

Page 18

SEARCH – Computing Requirements on the Mobile Phone

1. Search
• Roughly 50% of the total speech recognition time is spent on search
• Even more for large-vocabulary recognition
• Considerably less for small-vocabulary tasks

2. Solutions
• Network optimization
• Efficient search techniques
• Pruning methods
  i) Look-ahead based strategy
  ii) Pruning threshold dependent on the grammar
• Multi-pass methods
  i) A fast first pass to produce a short list of candidates or a lattice, followed by second-pass rescoring with larger acoustic and language models

Source : Rose et al.
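
Pruning can be illustrated with a small Python sketch. It uses a generic fixed-threshold beam, which is simpler than the look-ahead and grammar-dependent strategies named on this page: after each step, any hypothesis whose log-score falls more than `beam_width` below the current best is dropped. The per-step phoneme probabilities and the function name are assumptions made up for illustration.

```python
import math


def beam_search(step_scores, beam_width):
    """Extend hypotheses one step at a time, pruning after every step.

    step_scores: list of dicts mapping a phoneme label to its
                 log-probability at that step (toy values).
    beam_width:  hypotheses scoring more than this below the current
                 best are discarded.
    Returns the surviving (sequence, score) pairs, best first, and the
    largest number of hypotheses kept after any step.
    """
    hyps = [((), 0.0)]
    max_active = 1
    for scores in step_scores:
        # Extend every surviving hypothesis with every candidate phoneme.
        extended = [(seq + (phone,), total + logp)
                    for seq, total in hyps
                    for phone, logp in scores.items()]
        best = max(total for _, total in extended)
        # Keep only hypotheses within beam_width of the best score.
        hyps = [(seq, total) for seq, total in extended
                if total >= best - beam_width]
        max_active = max(max_active, len(hyps))
    hyps.sort(key=lambda item: item[1], reverse=True)
    return hyps, max_active


# Toy per-step phoneme probabilities (assumed for illustration).
STEPS = [
    {"DH": math.log(0.7), "TH": math.log(0.2), "D": math.log(0.1)},
    {"EH": math.log(0.6), "AX": math.log(0.3), "IY": math.log(0.1)},
    {"R": math.log(0.8), "L": math.log(0.2)},
]

pruned, pruned_active = beam_search(STEPS, beam_width=1.0)
full, full_active = beam_search(STEPS, beam_width=float("inf"))

print("best with pruning:", pruned[0][0], "active:", pruned_active)
print("best without pruning:", full[0][0], "active:", full_active)
```

In this toy case both searches return the same best sequence while the pruned search keeps far fewer hypotheses alive. A beam that is too narrow can discard the correct path, so the threshold trades accuracy against computation.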

Page 19

Speech Recognition Based Access of Agrocommodity Prices in Hindi for Uttar Pradesh

Sponsored by DeitY, Govt. of India

Page 20

Questions?

[email protected]

URL : http://202.3.77.107/mips/