How to integrate automatic speech recognition (ASR) into CALL applications Helmer Strik Department of Linguistics Centre for Language and Speech Technology (CLST) Radboud University Nijmegen, The Netherlands Radboud University Nijmegen
Apr 01, 2015

How to integrate automatic speech recognition (ASR) into CALL applications

Helmer Strik

Department of Linguistics
Centre for Language and Speech Technology (CLST)
Radboud University Nijmegen, The Netherlands

Radboud University Nijmegen

Radboud University Nijmegen
LESLLA, Antwerpen, 24-11-2008

Overview

Introduction
ASR: automatic speech recognition
ASR-based tutoring
ASR-based CALL
ASR-based literacy training
Conclusions

Introduction

Students who receive 1-on-1 instruction perform as well as the top two percent of students who receive traditional classroom instruction [Bloom 1984]
A human tutor for every student is not feasible
  → computer tutors
For language learning: CALL
Many text-based CALL systems
Include speech
  → speech-based CALL system

Speech inside

Many applications with ‘speech’:
  Screen readers
  Reading pen
  Mobile phone: photo + OCR + TTS
Some also (useful) for CALL

Speech inside (cont’d)

Many applications with ‘speech’
  Screen readers, reading pen, etc.
Some also (useful) for CALL
However, usually the learner can
  only listen (TTS: text-to-speech)
  or, also speak, but …
    no assessment, or
    the learner has to carry out the assessment, e.g. by comparing with examples
  → use ASR / speech technology
Is it feasible?

ASR: automatic speech recognition

What is ASR?
  Speech to text conversion
Applications:
  Dictation
  Command and control
  Spoken dialogue systems (information)
  etc.
ASR is not flawless, and it will probably never be, esp. for non-native speech
Note: even humans do not recognize speech flawlessly!

Speech Recognition

[Slide graphic not recovered; surviving labels: cgn2-s, vb, nn, mii]

ASR-based tutoring

ITS: Intelligent Tutoring Systems
Spoken dialogue system for learning
Subject matter: math, physics, etc.
Examples:
  ITSPOKE, Univ. of Pittsburgh, Litman et al.
    Topic: Physics
  SCoT, Stanford Univ., Peters et al.
    Topic (SCoT-DC): shipboard damage control
Communicate with speech
  the subject matter doesn’t have to be speech

ASR-based CALL

The subject matter is speech (language)
Late 1990s:
  1998: STiLL, Marholmen (Sweden); 1st time the CALL and Speech communities met
  1999: Special Issue of CALICO, ‘Tutors that Listen’, focusing on ASR (mainly ‘discrete ASR’)

ASR-based literacy training

What has been done?
Reading tutors (the learner reads, not the PC):
  Listen, CMU, Pittsburgh; Mostow et al. (1994)
  STAR system, UK; Russell et al. (1996)
  SPACE, KU Leuven; Van hamme, Duchateau, et al.
  … and many others
FtL: Foundations to Literacy, Boulder; Cole, Wise, et al.

ASR-based literacy training

Foundations to Literacy
Interactive Books
  Teach fluent reading & comprehension
Foundational Skills Tutors
  Teach underlying reading skills
  Phonics

ASR-based literacy training (cont’d)

What has been done?
Reading tutors:
  Listen, CMU, Pittsburgh; Mostow et al. (1994)
  STAR system, UK; Russell et al. (1996)
  SPACE, KU Leuven; Van hamme, Duchateau, et al.
  … and many others
FtL: Foundations to Literacy, Boulder; Cole, Wise, et al.
Mostly for children
And for adults?
What is needed?
What is possible, and what is not?
…

ASR-based CALL

ASR is not flawless, and it will probably never be, esp. for non-native speech
Be aware of what is (not) possible with ASR technology
Problematic issues and possible solutions:
  Noise, esp. background speech → minimize, headsets
  Disfluencies → minimize, improve automatic handling
  Non-native pronunciation
    Recognizing utterances → utterance verification
    Detect pronunciation errors → classifiers
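
The utterance verification idea on this slide can be illustrated with a small sketch. It is not the method used in the projects mentioned in this talk: the `WordScore` type, the per-word confidence values, and the 0.6 threshold are hypothetical, and a real recognizer would supply its own scores. The sketch accepts an utterance only if the recognized words match the expected prompt and the mean confidence is high enough.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordScore:
    """One recognized word with a hypothetical confidence score in [0, 1]."""
    word: str
    confidence: float


def verify_utterance(expected: List[str],
                     scored: List[WordScore],
                     threshold: float = 0.6) -> bool:
    """Accept the utterance if it matches the prompt with enough confidence.

    The threshold value is an assumption chosen for illustration only.
    """
    recognized = [w.word.lower() for w in scored]
    # Reject when the recognized word sequence differs from the prompt.
    if recognized != [w.lower() for w in expected]:
        return False
    # Reject empty input rather than dividing by zero.
    if not scored:
        return False
    mean_conf = sum(w.confidence for w in scored) / len(scored)
    return mean_conf >= threshold
```

A lower threshold accepts more learner attempts at the cost of more false accepts; where to set it is a pedagogical choice as much as a technical one.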

ASR-based CALL

Our research:
Non-natives
  Assessment of oral proficiency
  Dutch-CAPT – pronunciation
    o ASR / UV – Utterance Verification
    o PED – Pronunciation Error Detection
  DISCO – pronunciation, morphology, syntax
  TST-AAP
People with speech disability
  for training & as communication aid (AAC)
  ASR for dysarthric speech
  EST: E-learning based Speech Therapy

ASR-based CALL

Project Dutch-CAPT (Computer Assisted Pronunciation Training)

ASR-based CALL (cont’d)

Project Dutch-CAPT (CAPT: Computer Assisted Pronunciation Training)
Exp. group: used the Dutch-CAPT system
2 control groups: didn’t use Dutch-CAPT
The reduction in the number of pronunciation errors made was significantly larger for the exp. group
Training: 4 weeks x 1 session of 30–60 minutes

ASR-based CALL (cont’d)

ASR is not flawless, and it will probably never be, esp. for non-native speech
Be aware of what is (not) possible with ASR technology
Problematic issues and possible solutions:
  Noise, esp. background speech → minimize, headsets
  Disfluencies → minimize, improve automatic handling
  Non-native pronunciation
    Recognizing utterances → utterance verification
    Detect pronunciation errors → classifiers
Mix of expertise needed: ASR technology, language acquisition, pedagogy, design, …

ASR-based literacy training

Demonstration project TST-AAP
Existing course
Add speech technology:
  Detect whether words & sounds were pronounced (correctly)

ASR-based literacy training

Listening; PC: produces speech
  Text-To-Speech (TTS); quality good enough?
  Recorded speech, concatenation
Speaking; PC: recognizes speech
  Phonics (see FtL)
  PC: Recognize words, utterances: confidence measures (CMs) for utterance verification
  PC: Recognize sounds: CMs for Phon. Ver. (contrasts)
Reading (reading tutors)
  PC: Recognize words, utterances
  PC: Pointer in the text (‘track’ the reader)
  PC: Help when encountering problems
  PC: Change tempo → read faster
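
The "pointer in the text" idea for reading tutors can be sketched as a simple alignment between the text and the recognized words. This is only an illustration under stated assumptions, not how Listen, STAR, SPACE, or FtL work: the function name `track_reader`, the `lookahead` parameter, and the exact-match comparison are hypothetical, and a real tutor would work on recognizer output with confidence scores.

```python
from typing import List


def track_reader(text_words: List[str],
                 heard_words: List[str],
                 lookahead: int = 2) -> int:
    """Return the index of the next text word the reader is expected to say.

    Each heard word is compared with the next few text words (a small
    lookahead window), so a skipped word does not stall the pointer, while
    unmatched words such as hesitations are ignored.
    """
    pos = 0
    for heard in heard_words:
        window = text_words[pos:pos + lookahead + 1]
        for offset, candidate in enumerate(window):
            if candidate.lower() == heard.lower():
                # Move the pointer just past the matched word.
                pos += offset + 1
                break
    return pos
```

For the text "de kat zit op de mat" and the heard words "de", "kat", "uh", "op", the pointer ends at index 4: the hesitation "uh" is ignored and the skipped word "zit" is passed over. A tutor could use the skipped positions to decide when to offer help.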

ASR-based CALL

Advantages of using speech (vs. writing)
  Self-explanation
  Extra information:
    Prosody (stress, accent)
    Emotions
    Confidence
Other useful techniques:
  VTH

Conclusions

ASR is not flawless
ASR-based tutoring is possible (restricted domain)
  general topics; ITS: ITSPOKE, SCoT
  CALL; many systems: non-natives, disabled, etc.
  Literacy training
    So far mainly for children
    And for adults !?
Needed
  Mix of expertise: technology, language acquisition, pedagogy, design, …
  Improved ASR, speech technology
  Projects, funds

Questions?

Why are there so few ASR-based CALL / literacy applications for adults?
What are, in this context, important differences between children & adults?
What is needed?
  Listening; PC: produces speech
  Speaking; PC: recognizes speech
    Phonics
  Reading (reading tutors)
What else?
