
Rapid and Accurate Spoken Term Detection Owen Kimball BBN Technologies 15 December 2006.

Page 1: Rapid and Accurate Spoken Term Detection

Owen Kimball

BBN Technologies

15 December 2006

Page 2: Overview of Talk

• BBN Levantine system description

• Evaluation results

• Diacritics

• Out-of-vocabulary issues

Page 3: BBN Evaluation Team

Core Team
• Chia-lin Kao
• Owen Kimball
• Michael Kleber
• David Miller

Additional assistance
• Thomas Colthurst
• Herb Gish
• Steve Lowe
• Rich Schwartz

Page 4: BBN System Overview

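[Block diagram] audio -> Byblos STT -> lattices, phonetic transcripts -> indexer -> index -> detector -> scored detection lists -> decider -> final output with YES/NO decisions. Additional inputs: search terms (to the detector), ATWV cost parameters (to the decider).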

Page 5: Levantine STT Configuration

• STT generates a lattice of hypotheses and a phonetic transcript for each input file.

• Word-based system:
– Orthography based on Modern Standard Arabic (MSA), no short vowel diacritics
– Acoustic: 57.3 hours LDC (noise words, no mixture exponents)
– Language: 250 hours of data, 1.3M words

• 38.5K dictionary, grapheme-as-phoneme based plus 100 manual pronunciations
– unknown short vowel (U), 39 phonemes

• 42.32% WER on STD Dev06 CTS data

Page 6: Levantine CTS Results

Data     ATWV
Dev06    0.515
DryRun   0.410
Eval06   0.3467

Page 7: OOV Pipeline: Detector

• Word-based STT produces a 1-best transcript; pronouncing it yields the 1-best phonetic transcript.

• Query is OOV if it contains any OOV word.

• OOV query detection:
– Pronounce query (grapheme-as-phoneme)
– Find minimal edit-distance alignments (agrep)
– Score = 1 - % error = 1 - (edit distance / # phonemes)
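The scoring step above can be illustrated with a minimal sketch. It is not the BBN implementation: the slide names agrep for the alignment, whereas this sketch swaps in a plain dynamic-programming approximate match (minimum edit distance between the query and any contiguous stretch of the transcript). Normalizing by the number of query phonemes is an assumption, since the slide says only "# phonemes", and the function names and phoneme symbols are hypothetical.

```python
def best_match_distance(query, transcript):
    """Minimum edit distance between `query` and any contiguous
    stretch of `transcript` (both are sequences of phoneme symbols)."""
    # prev[j] is the cheapest alignment of the query prefix processed so
    # far against a transcript stretch ending at position j. The first
    # row is all zeros, so a match may start anywhere at no cost.
    prev = [0] * (len(transcript) + 1)
    for i, q in enumerate(query, start=1):
        curr = [i] + [0] * len(transcript)
        for j, t in enumerate(transcript, start=1):
            curr[j] = min(
                prev[j] + 1,             # query phoneme missing from transcript
                curr[j - 1] + 1,         # extra phoneme in transcript
                prev[j - 1] + (q != t),  # match (0) or substitution (1)
            )
        prev = curr
    # The match may also end anywhere in the transcript.
    return min(prev)


def oov_score(query, transcript):
    """Score = 1 - edit distance / number of query phonemes (assumed)."""
    if not query:
        raise ValueError("query must contain at least one phoneme")
    return 1.0 - best_match_distance(query, transcript) / len(query)


# Hypothetical phoneme strings: one substitution (a -> U) in five phonemes.
query = ["k", "i", "t", "a", "b"]
transcript = ["s", "a", "k", "i", "t", "U", "b", "m"]
score = oov_score(query, transcript)
```

With these made-up inputs the best alignment has one substitution in five query phonemes. A full detector would also report where each match ends, which this sketch omits.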

Page 8: OOV Pipeline: Decider

• Need a different YES/NO decision procedure: the IV decider requires posterior probabilities, which the OOV detector's edit-distance score does not provide.

• Simple OOV decision procedure:
– Constant threshold on score (~ 0.7)
– Cap on maximum number of hits (0-3)
– Values set to maximize ATWV on Dev06 data.
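A minimal sketch of such a threshold-plus-cap decision rule follows. The slide does not say how the cap is applied, so this assumes it is enforced per query by keeping the highest-scoring hits, and that a score equal to the threshold counts as YES. The function name, the hit representation, and the example values are hypothetical.

```python
def decide(hits, threshold=0.7, max_hits=3):
    """Attach a YES/NO decision to each candidate hit for one query.

    hits: list of (location, score) pairs, higher score = better match.
    Returns (location, score, decision) triples, best score first.
    """
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    decisions = []
    yes_count = 0
    for location, score in ranked:
        # YES only if the score clears the threshold and the cap on the
        # number of accepted hits has not been reached yet.
        is_yes = score >= threshold and yes_count < max_hits
        if is_yes:
            yes_count += 1
        decisions.append((location, score, "YES" if is_yes else "NO"))
    return decisions


# Hypothetical hits: (time offset in seconds, detector score).
example_hits = [(10.0, 0.90), (42.5, 0.65), (77.1, 0.80),
                (90.0, 0.75), (120.3, 0.71)]
example_decisions = decide(example_hits, threshold=0.7, max_hits=3)
```

In this example the hit scoring 0.71 clears the threshold but is rejected by the cap, and the hit scoring 0.65 is rejected by the threshold. Both parameters would be tuned on held-out data, as the slide describes for Dev06.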

Page 9: OOV Pipeline: Results

• ATWV remained good:
– 0.3450 IV
– 0.3635 OOV

• OOV searches take longer: roughly 10-15x the IV search time on Dev06 and DryRun06, with no attempt at indexing.

Page 10: OOV Directions for Improvement

• Score substitutions using phoneme confusion matrix instead of flat edit distance

• Speed: indexing phonetic transcripts for approximate matching

• Search lattices beyond 1-best transcripts
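As an illustration of the first direction only, the sketch below replaces the flat substitution cost of 1 with a per-pair cost looked up in a table. The slides do not say how costs would be derived from a confusion matrix, so the table values, the default cost of 1.0 for unlisted pairs, and all names here are hypothetical.

```python
# Hypothetical substitution costs: easily confused pairs are cheap.
CONFUSION_COST = {
    ("a", "U"): 0.2,
    ("U", "a"): 0.2,
}


def sub_cost(q, t):
    """Cost of aligning query phoneme q with transcript phoneme t."""
    if q == t:
        return 0.0
    return CONFUSION_COST.get((q, t), 1.0)  # unlisted pairs: flat cost


def weighted_match_cost(query, transcript, indel_cost=1.0):
    """Cheapest alignment of `query` against any contiguous stretch of
    `transcript`, using per-pair substitution costs."""
    # First row is zeros so a match may start anywhere in the transcript.
    prev = [0.0] * (len(transcript) + 1)
    for i, q in enumerate(query, start=1):
        curr = [i * indel_cost] + [0.0] * len(transcript)
        for j, t in enumerate(transcript, start=1):
            curr[j] = min(
                prev[j] + indel_cost,         # query phoneme missing
                curr[j - 1] + indel_cost,     # extra transcript phoneme
                prev[j - 1] + sub_cost(q, t),  # match or weighted substitution
            )
        prev = curr
    return min(prev)


query = ["k", "i", "t", "a", "b"]
likely_confusion = weighted_match_cost(query, ["k", "i", "t", "U", "b"])
unlikely_confusion = weighted_match_cost(query, ["k", "i", "t", "s", "b"])
```

Under flat edit distance both transcripts would be one error away from the query. With the made-up table, the a/U confusion costs much less than the a/s substitution, so the first transcript would score higher.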

Page 11: Levantine Diacritic Issues

• Originally looked at diacritized Levantine

• Trained STT engine using LDC 45 hour set

• Ran STD without knowing WER (no diacritized STT test set to measure WER).
– Found very high false alarm rate

• Examining FAs found hits that were legitimate alternate spellings

Page 12: Levantine Diacritics - Alternate Spellings

• Examining query words found more of same:
– In first 22 terms of dry run term list, 14 are “alternate diacritic” spellings of 5 underlying words, i.e. there were just 13 unique words in the first 22 terms
– Min~ahumo v Minohumo
– AlHayaApi v AlHayaAp
– Waliko v Walika
– qabilo v qabola v qabolo

• LDC training and STD test set had additional pervasive differences

Page 13: No-Diacritic Levantine Issues

• A quick look turned up a smaller number of problems for no-diacritic Levantine
– Looking at 7 top-FA terms in dev set, found
  • “bHky” vs “b>Hky” but no other spelling confusions
  • One ref instance of term with 0 duration

• It would be interesting to QC test sets for inconsistent spellings and other issues