Robust Speech Feature Decorrelated and Liftered Filter-Bank Energies (DLFBE) Proposed by K.K. Paliwal, in EuroSpeech 99.

Jan 01, 2016

Dina Jordan
Transcript
Page 1:

Robust Speech Feature

Decorrelated and Liftered Filter-Bank Energies

(DLFBE)

Proposed by K.K. Paliwal , in EuroSpeech 99

Page 2:

DLFBE ---Preliminary

* MFCC is very successful in speech recognition

* MFCCs are computed from the speech signal in the following three steps:

1. Compute the FFT power spectrum of the speech signal

2. Apply a Mel-spaced filter-bank to the power spectrum to get N energies (N = 20~60)

3. Compute the discrete cosine transform (DCT) of the log filter-bank energies to get uncorrelated MFCCs (M = 10)
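The three steps can be sketched in plain Python as follows. This is only an illustrative sketch, not the exact front end used in the paper: the triangular filter shape, the mel formula `2595 * log10(1 + f / 700)`, the 8 kHz sampling rate, the naive DFT, and all function names are assumptions made for the example.

```python
import cmath
import math

def power_spectrum(frame):
    """Step 1: power spectrum |X[k]|^2 for k = 0..len(frame)//2 (naive DFT)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        xk = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(xk) ** 2)
    return spec

def hz_to_mel(f):
    # Assumed mel-scale formula; other variants exist.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank_energies(spec, sample_rate, num_filters):
    """Step 2: N energies from triangular filters spaced evenly on the mel scale."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    # num_filters + 2 edge frequencies, expressed as fractional FFT-bin positions.
    edges = [mel_to_hz(top * i / (num_filters + 1)) / nyquist * (len(spec) - 1)
             for i in range(num_filters + 2)]
    energies = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        e = 0.0
        for k, p in enumerate(spec):
            if lo < k <= mid:
                e += p * (k - lo) / (mid - lo)   # rising slope
            elif mid < k < hi:
                e += p * (hi - k) / (hi - mid)   # falling slope
        energies.append(e)
    return energies

def mfcc(frame, sample_rate=8000, num_filters=20, num_ceps=10):
    """Step 3: DCT-II of the log filter-bank energies (c_0 is skipped here)."""
    fbe = mel_filterbank_energies(power_spectrum(frame), sample_rate, num_filters)
    log_e = [math.log(e + 1e-10) for e in fbe]  # small floor avoids log(0)
    n = num_filters
    return [sum(log_e[j] * math.cos(math.pi * m * (j + 0.5) / n) for j in range(n))
            for m in range(1, num_ceps + 1)]

# Usage: one 256-sample frame of a 440 Hz tone, N = 20 filters, M = 10 coefficients.
frame = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
coeffs = mfcc(frame)
```

The intermediate list returned by `mel_filterbank_energies` is the FBE vector that the rest of the slides work with directly.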

Page 3:

DLFBE --- Motivation

* MFCC has two drawbacks:

1. It does not have any physical interpretation

2. Liftering of cepstral coefficients has no effect in modern speech recognition (discussed later)

* The two problems (i.e., the number of features and their correlation) of the FBEs used in the 50’s, 60’s and 70’s can be solved today

Page 4:

Liftering --- What and How

* Liftering is the reweighting of cepstral coefficients used in the DTW framework of speech recognition

Euclidean distance:

$$d(x, x') = (x - x')^t (x - x') = \sum_{i=1}^{D} (x_i - x'_i)^2$$

where $d(x, x')$ is the dissimilarity between the test vector $x$ and the mean vector $x'$

Page 5:

Liftering --- What and How (cont’d)

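$$y_i = w_i x_i$$

where $x_i$ is the i-th cepstral coefficient, $y_i$ is the corresponding liftered coefficient and $w_i$ is the lifter weight.

So

$$y = \begin{bmatrix} w_1 & 0 & \cdots & 0 \\ 0 & w_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & w_D \end{bmatrix} x$$

More general form:

$$y = \begin{bmatrix} a & b & c & d & \cdots \\ e & f & g & h & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix} x$$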

Page 6:

Liftering --- What and How (cont’d)

$$d(y, y') = (y - y')^t (y - y') = \sum_{i=1}^{D} (y_i - y'_i)^2 = \sum_{i=1}^{D} [w_i (x_i - x'_i)]^2$$
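A small numerical check may make this concrete: the Euclidean distance between liftered vectors equals a weighted Euclidean distance between the original vectors. The vectors and weights below are made-up values and the function names are hypothetical.

```python
def lifter(x, w):
    """Apply a diagonal lifter: y_i = w_i * x_i."""
    return [wi * xi for wi, xi in zip(w, x)]

def sq_euclidean(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def weighted_sq_distance(x, x_ref, w):
    """Sum over i of [w_i * (x_i - x'_i)]^2, computed without liftering first."""
    return sum((wi * (xi - ri)) ** 2 for wi, xi, ri in zip(w, x, x_ref))

x = [1.0, -0.5, 0.25]        # test vector (illustrative values)
x_ref = [0.8, -0.1, 0.05]    # mean vector (illustrative values)
w = [1.0, 2.0, 3.0]          # linear lifter w_i = i

d_liftered = sq_euclidean(lifter(x, w), lifter(x_ref, w))
d_weighted = weighted_sq_distance(x, x_ref, w)
```

Both quantities agree up to rounding, which is why liftering does change the ranking of templates under a plain Euclidean distance in DTW.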

Page 7:

Liftering --- What and How (cont’d)

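The types of lifters are listed below:

1. Linear lifter: $w_i = i$

2. Statistical lifter: $w_i = 1/\sigma_i$, where $\sigma_i$ is the standard deviation of the i-th cepstral coefficient

3. Sinusoidal lifter: $w_i = 1 + \frac{D}{2} \sin(\pi i / D)$

4. Exponential lifter: $w_i = i^s \exp(-i^2 / (2\tau^2))$, with $s = 1.5$, $\tau = 5$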
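As a rough sketch, the four lifter types can be generated as below. The formulas assumed here are the commonly used forms (w_i = i, w_i = 1/sigma_i, w_i = 1 + (D/2) sin(pi i / D), and w_i = i^s exp(-i^2 / (2 tau^2)) with s = 1.5, tau = 5); the function names are hypothetical.

```python
import math

def linear_lifter(D):
    """w_i = i for i = 1..D."""
    return [float(i) for i in range(1, D + 1)]

def statistical_lifter(sigma):
    """w_i = 1 / sigma_i, given per-coefficient standard deviations."""
    return [1.0 / s for s in sigma]

def sinusoidal_lifter(D):
    """w_i = 1 + (D / 2) * sin(pi * i / D) for i = 1..D."""
    return [1.0 + (D / 2.0) * math.sin(math.pi * i / D) for i in range(1, D + 1)]

def exponential_lifter(D, s=1.5, tau=5.0):
    """w_i = i^s * exp(-i^2 / (2 * tau^2)) for i = 1..D."""
    return [i ** s * math.exp(-i * i / (2.0 * tau * tau)) for i in range(1, D + 1)]

# Usage with D = 12 cepstral coefficients.
D = 12
w_lin = linear_lifter(D)
w_sin = sinusoidal_lifter(D)
w_exp = exponential_lifter(D)
```

The linear lifter grows without bound, while the sinusoidal and exponential lifters rise and then fall again, so they de-emphasize both the lowest and the highest coefficients.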

Page 8:

Liftering --- Discussion and Why

* Multiplicative weighting in the cepstral domain is equivalent to convolution in the spectral domain:

$$c_n \cdot w_n \;\overset{\text{IFFT}}{\longleftrightarrow}\; C_k * W_k$$

Lifter type     Spectral domain    Cepstral domain
Type 1 and 2    HP filter          Emphasizes the higher cepstral coefficients
Type 3 and 4    BP filter          De-emphasizes the higher and lower cepstral coefficients

Page 9:

Liftering --- Experiment on DTW

Page 10:

Liftering on CDHMM (??) --- Why

The Mahalanobis distance measure arises from the output observation probability:

$$d(x; x', \Sigma_{x'}) = (x - x')^t \Sigma_{x'}^{-1} (x - x')$$

Page 11:

Liftering on CDHMM (??) --- Why

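$$y = Wx, \quad y' = Wx', \quad \Sigma_{y'} = W \Sigma_{x'} W^t$$

where $W$ is the $D \times D$ liftering matrix for MFCC:

$$W = \begin{bmatrix} w_1 & 0 & 0 & \cdots & 0 \\ 0 & w_2 & 0 & \cdots & 0 \\ 0 & 0 & w_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & w_D \end{bmatrix}$$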

Page 12:

Liftering on CDHMM (??) --- Why

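$$\begin{aligned}
d(y; y', \Sigma_{y'}) &= (y - y')^t \Sigma_{y'}^{-1} (y - y') \\
&= (Wx - Wx')^t (W \Sigma_{x'} W^t)^{-1} (Wx - Wx') \\
&= (x - x')^t W^t (W^t)^{-1} \Sigma_{x'}^{-1} W^{-1} W (x - x') \\
&= (x - x')^t \Sigma_{x'}^{-1} (x - x') \\
&= d(x; x', \Sigma_{x'})
\end{aligned}$$

Thus, cepstral liftering has no effect on the recognition process when used with continuous-observation Gaussian-density HMMs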
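The invariance can also be checked numerically. The sketch below uses made-up 3-dimensional values and a small hand-written linear solver so that it stays self-contained; the helper names are hypothetical, and any invertible liftering matrix W should give the same result.

```python
def mat_vec(A, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def mat_mul(A, B):
    """Matrix-matrix product."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def mahalanobis_sq(x, mean, cov):
    """(x - mean)^t cov^{-1} (x - mean), computed via a linear solve."""
    diff = [a - b for a, b in zip(x, mean)]
    return sum(d * z for d, z in zip(diff, solve(cov, diff)))

x = [1.0, 2.0, 0.5]                      # test vector (illustrative)
mean = [0.5, 1.0, 1.5]                   # Gaussian mean (illustrative)
cov = [[2.0, 0.3, 0.1],                  # Gaussian covariance (illustrative)
       [0.3, 1.0, 0.2],
       [0.1, 0.2, 0.5]]
w = [1.0, 2.0, 3.0]                      # linear lifter weights
W = [[w[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

d_x = mahalanobis_sq(x, mean, cov)
cov_y = mat_mul(mat_mul(W, cov), transpose(W))   # covariance of liftered data
d_y = mahalanobis_sq(mat_vec(W, x), mat_vec(W, mean), cov_y)
```

Here `d_x` and `d_y` agree up to rounding, because the covariance of the liftered features absorbs the lifter weights.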

Page 13:

Decorrelation of FBE --- Why/How

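* FBEs are correlated => we can’t use CDHMM

* We can use LP techniques to solve this defect:

$$A(z) = 1 - P(z) = 1 - \sum_{i=1}^{p} a_i z^{-i}$$

where $\{a_i\}$ can be obtained by the covariance method of LP analysis applied to the FBE sequence $\{e_n\},\ n = 0, 1, \ldots, N-1$

Filtering the N FBEs with $A(z)$ (order P) gives M decorrelated outputs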
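To illustrate the idea, here is a minimal sketch for a first-order predictor, where the covariance-method solution has a closed form. It treats a single FBE vector on its own; whether the coefficients are estimated per frame or pooled over training data is not stated on the slide, so that choice, like the function names, is an assumption.

```python
def lp_order1_covariance(e):
    """a_1 minimizing sum over n = 1..N-1 of (e_n - a_1 * e_{n-1})^2."""
    num = sum(e[n] * e[n - 1] for n in range(1, len(e)))
    den = sum(e[n - 1] ** 2 for n in range(1, len(e)))
    return num / den

def decorrelate_order1(e, a1):
    """Apply A(z) = 1 - a1 * z^-1 along the filter index; N inputs give N - 1 outputs."""
    return [e[n] - a1 * e[n - 1] for n in range(1, len(e))]

# Usage: a perfectly predictable (geometric) sequence leaves a zero residual.
e = [1.0, 2.0, 4.0, 8.0]
a1 = lp_order1_covariance(e)
residual = decorrelate_order1(e, a1)
```

For higher orders the same least-squares criterion leads to a small system of normal equations instead of a single ratio.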

Page 14:

Liftering of FBE --- How

$$H(z) = \sum_{i=0}^{L} h_i z^{-i}$$

FIR filter applied to $\{e_n\},\ n = 0, 1, \ldots, N-1$, giving M outputs

N = M + L
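The relation N = M + L follows if only the fully overlapping outputs of an order-L FIR filter are kept. A minimal sketch, with a hypothetical function name and an illustrative filter H(z) = 1 - z^-2:

```python
def fir_filter_valid(e, h):
    """Filter e with taps h = [h_0, ..., h_L] along the filter index.

    Only outputs that use L + 1 real input samples are kept,
    so N inputs give M = N - L outputs.
    """
    L = len(h) - 1
    return [sum(h[i] * e[n - i] for i in range(L + 1)) for n in range(L, len(e))]

# Usage: N = 5 energies, L = 2, so M = 3 liftered features.
e = [1.0, 3.0, 6.0, 10.0, 15.0]
h = [1.0, 0.0, -1.0]          # H(z) = 1 - z^-2 (illustrative choice)
features = fir_filter_valid(e, h)
```

Each output is a difference between energies two channels apart, which acts as a high-pass operation along the frequency axis.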

Page 15:

DLFBE --- Experiment

* Speaker-independent (SI) isolated word recognition using the ISOLET spoken letter database

* 90 training utterances from 90 speakers (45 females, 45 males)

* 30 testing utterances from 30 speakers (15 females, 15 males)

Page 16:

DLFBE --- Experiment (cont’d)

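[Table: recognition results for different choices of the decorrelation filter P(z) and the liftering filter H(z); the result values are not recoverable from the transcript]

P(z) settings: no; $a_1 z^{-1}$; $a_1 z^{-1} + a_2 z^{-2}$

H(z) settings: no; $1 - 0.5 z^{-1}$; $1 - 0.75 z^{-1}$; $1 - z^{-1}$; $1 - z^{-2}$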

Page 17:

DLFBE --- Experiment (cont’d)

Page 18:

Robust Speech Feature

Noise-Invariant Representation for Speech Signal

Group Delay Function (GDF) Method

Proposed by Bayya & Yegnanarayana

in EuroSpeech ‘99

Page 19:

GDF --- Motivation

* Background noise is a prominent source of mismatch and is roughly eliminated by the following methods:

1. Compensation

   Pre-processing: SS (spectral subtraction), HP, BP, FN (feature normalization)

   Model adaptation: parameter transform

   These cause overestimation and underestimation side effects

Page 20:

GDF --- Motivation (cont’d)

2. New features: LPC, MEL, PLP (projection concept)

   Not completely noise resistant

* All of the above use power/amplitude as the speech feature

Why don’t we use phase information as features?

Phase information may be helpful in speech recognition.

Page 21:

GDF --- What/How

* GDF is defined from the normalized autocorrelation of a short segment of a signal

$$\log\Big(1 + \sum_{n=1}^{\infty} r(n) e^{-j\omega n}\Big) = \log R(\cdot) = \log\big(|R(\cdot)|\, e^{j \arg(R(\cdot))}\big) = \log|R(\cdot)| + j \arg(R(\cdot)) \quad (\#.1)$$

where $r(n)$ is the normalized autocorrelation of a short segment of a signal

Page 22:

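GDF --- What/How (cont’d)

$$\log\Big(1 + \sum_{n=1}^{\infty} r(n) e^{-j\omega n}\Big) \approx \sum_{n=1}^{\infty} r(n) e^{-j\omega n} = \sum_{n=1}^{\infty} r(n) \cos(\omega n) - j \sum_{n=1}^{\infty} r(n) \sin(\omega n) \quad (\#.2)$$

Comparing (#.1) and (#.2):

$$\arg(R(\cdot)) \approx -\sum_{n=1}^{\infty} r(n) \sin(\omega n), \qquad r(n) = 0,\ n < 0$$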

Page 23:

GDF --- What/How (cont’d)

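$$\text{GDF} = -\frac{\partial \arg(R(\cdot))}{\partial \omega} = \sum_{n=1}^{\infty} r(n) [n \cos(\omega n)]$$

Truncated version of GDF:

$$\text{GDF} \approx \sum_{n=1}^{p} \{r(n) w(n)\} [n \cos(\omega n)], \qquad p = 10 \sim 30$$

Easy to implement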

Page 24:

GDF --- What/How (cont’d)

where

$$w(n) = 0.5 + 0.5 \cos(2\pi n / P), \quad 1 \le n \le p$$

(Hanning window)
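A minimal sketch of the truncated GDF as a windowed cosine sum over autocorrelation lags is given below. Several details are assumptions made for the example: the plain (unmodified) normalized autocorrelation, the window period defaulting to P = 2p so that the taper reaches zero at lag p, and the function names.

```python
import math

def normalized_autocorrelation(x, p):
    """r(n) = R(n) / R(0) for lags n = 0..p of one short segment."""
    r0 = sum(v * v for v in x)
    return [sum(x[t] * x[t + n] for t in range(len(x) - n)) / r0
            for n in range(p + 1)]

def truncated_gdf(r, omegas, P=None):
    """Sum over n = 1..p of r(n) * w(n) * n * cos(omega * n) for each omega.

    r      : [r(0), r(1), ..., r(p)]
    omegas : normalized angular frequencies in radians per sample
    P      : window period; defaults to 2 * p (an assumption)
    """
    p = len(r) - 1
    P = 2 * p if P is None else P
    w = [0.5 + 0.5 * math.cos(2 * math.pi * n / P) for n in range(p + 1)]
    return [sum(n * r[n] * w[n] * math.cos(n * om) for n in range(1, p + 1))
            for om in omegas]

# Usage: a 40-sample segment, p = 10 lags, 8 frequencies between 0 and pi.
segment = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(40)]
r = normalized_autocorrelation(segment, 10)
omegas = [math.pi * k / 8 for k in range(8)]
gdf = truncated_gdf(r, omegas)
```

Because the result is a short cosine sum over at most p lags, the feature is cheap to compute once the autocorrelation is available.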

Page 25:

GDF --- Why & Experiment

* Frame length = 5 ms, frame shift = 1 ms; the modified autocorrelation sequence is averaged over 20 frames, then the GDF is computed as defined above

Page 26:

GDF --- Why & Experiment (cont’d)

Page 27:

GDF --- Experiment

* Isolated-digit recognition

        Clean    Noisy
SI      97%      95%      YES
SD      96.5%    94.5%    NO

Due to large dynamic range?