
Multi-microphone noise reduction and dereverberation techniques for speech applications Simon Doclo Dept. of Electrical Engineering, KU Leuven, Belgium.

Dec 28, 2015

Page 1: Multi-microphone noise reduction and dereverberation techniques for speech applications

Simon Doclo

Dept. of Electrical Engineering, KU Leuven, Belgium

8 July 2003

Page 2: Overview

• Introduction
• Basic principles
• Robust broadband beamforming
• Multi-microphone optimal filtering
• Acoustic transfer function estimation and dereverberation
• Conclusion and further research

Page 3: Overview

• Introduction
  - Motivation and applications
  - Problem statement
  - Contributions
• Basic principles
• Robust broadband beamforming
• Multi-microphone optimal filtering
• Acoustic transfer function estimation and dereverberation
• Conclusion and further research

Page 4: Motivation

• Speech communication applications: hands-free mobile telephony, voice-controlled systems, hearing aids
• Speech acquisition in an adverse acoustic environment:
  - Background noise: fan, radio, other speakers; generally unknown
  - Reverberation: reflections of the signal against walls and objects
• Result: poor signal quality, which affects speech intelligibility and speech recognition

Page 5: Objectives

• Signal enhancement techniques:
  - Noise reduction: reduce amount of background noise without distorting speech signal
  - Dereverberation: reduce effect of signal reflections
  - Combined noise reduction and dereverberation
• Acoustic source localisation: e.g. to steer a video camera or spotlight

Page 6: Applications

• Hands-free mobile telephony:
  - Most important application from economic point of view
  - Hands-free car kit mandatory in many countries
  - Most current systems: 1 directional microphone
• Video-conferencing:
  - Microphone array for source localisation: point camera towards active speaker
  - Signal enhancement by steering of microphone array

Page 7: Applications

• Voice-controlled systems:
  - domotic systems, consumer electronics (HiFi, PC software)
  - added value only when speech recognition system performs reliably under all circumstances
  - signal enhancement as pre-processing step
• Hearing aids and cochlear implants:
  - most hearing-impaired people suffer from perceptual hearing loss: besides amplification, reduction of noise with respect to the useful speech signal is needed
  - multiple microphones + DSP in hearing aid
  - current systems: simple beamforming
  - robustness important due to small inter-microphone distance

Page 8: Algorithmic requirements

• ‘Blind’ techniques: unknown noise sources and acoustic environment
• Adaptive: time-variant signals and acoustic environment
• Robustness:
  - Microphone characteristics (gain, phase, position)
  - Other deviations from assumed signal model (look direction error, VAD)
• Integration of different enhancement techniques
• Computational complexity

Page 9: Problem statement

• Problems of existing techniques:
  - Single-microphone techniques: very limited performance → multi-microphone techniques exploit spatial information; multiple microphones are also required for source localisation
  - A-priori assumptions about position of signal sources and microphone array: large sensitivity to deviations → improve robustness (and performance)
  - Assumption of spatio-temporally white noise → extension to coloured noise
• Objective: development of multi-microphone noise reduction and dereverberation techniques with better performance and robustness for coloured noise scenarios

Page 10: State-of-the-art and contributions

• Single-microphone techniques (only spectral information):
  - spectral subtraction [Boll 79, Ephraim 85, Xie 96]: signal-independent transformation; residual noise problem
  - subspace-based [Dendrinos 91, Ephraim 95, Jensen 95]: signal-dependent transformation; signal + noise subspace
• Multi-microphone techniques (spatial information):
  - fixed beamforming [Dolph 46, Cox 86, Ward 95, Elko 00]: fixed directivity pattern
  - adaptive beamforming [Frost 72, Griffiths 82, Gannot 01]: adapt to different acoustic environments → performance; ‘Generalised Sidelobe Canceller’ (GSC)
  - inverse, matched filtering [Miyoshi 88, Flanagan 93, Affes 97]
  - a-priori assumptions → robustness
• Contributions:
  1. Robust broadband beamforming
  2. Multi-microphone optimal filtering
  3. Blind transfer function estimation and dereverberation

Page 11: Overview

• Introduction
• Basic principles
  - Signal model
  - Signal characteristics and acoustic environment
• Robust broadband beamforming
• Multi-microphone optimal filtering
• Acoustic transfer function estimation and dereverberation
• Conclusion and further research

Page 12: Signal model

• Signal model for microphone signals in time-domain: filtered version of clean speech signal + additive coloured noise (⊗ denotes convolution)

$$y_n[k] = h_n[k] \otimes s[k] + v_n[k] = x_n[k] + v_n[k], \qquad n = 0, \ldots, N-1$$

  - $s[k]$: speech signal
  - $h_n[k]$: acoustic impulse response
  - $v_n[k]$: additive noise
  - $y_0[k], y_1[k], \ldots, y_{N-1}[k]$: microphone signals

Page 13: Signal model

• Multi-microphone signal enhancement: microphone signals are filtered with filters $w_n[k]$ and summed (⊗ denotes convolution)

$$z[k] = \sum_{n=0}^{N-1} w_n[k] \otimes y_n[k] = \underbrace{\Big(\sum_{n=0}^{N-1} w_n[k] \otimes h_n[k]\Big)}_{f[k]} \otimes\, s[k] + \underbrace{\sum_{n=0}^{N-1} w_n[k] \otimes v_n[k]}_{z_v[k]}$$

  - $f[k]$ = total transfer function for speech component
  - $z_v[k]$ = residual noise component

• Techniques differ in calculation of filters:
  - Noise reduction: minimise residual noise $z_v[k]$ and limit speech distortion
  - Dereverberation: $f[k] = \delta[k]$ by estimating acoustic impulse responses $h_n[k]$
  - Combined noise reduction and dereverberation
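The decomposition of the filter-and-sum output into a filtered speech term and a residual noise term follows from linearity alone. The sketch below checks this numerically with random data; the sizes `N`, `K`, `L`, `T` and the helper names `convolve` and `add` are illustrative assumptions, not values from the slides.

```python
import random

def convolve(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def add(a, b):
    """Element-wise sum, zero-padding the shorter sequence."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0.0) + (b[i] if i < len(b) else 0.0)
            for i in range(n)]

random.seed(0)
N, K, L, T = 3, 8, 4, 50   # microphones, impulse response taps, filter taps, samples (assumed)
s = [random.gauss(0, 1) for _ in range(T)]                                   # speech s[k]
h = [[random.gauss(0, 1) for _ in range(K)] for _ in range(N)]               # h_n[k]
v = [[0.1 * random.gauss(0, 1) for _ in range(T + K - 1)] for _ in range(N)] # v_n[k]
w = [[random.gauss(0, 1) for _ in range(L)] for _ in range(N)]               # w_n[k]

# Microphone signals y_n[k] = h_n[k] (*) s[k] + v_n[k]
y = [add(convolve(h[n], s), v[n]) for n in range(N)]

# Filter-and-sum output, total speech transfer function and residual noise
z, f, z_v = [], [], []
for n in range(N):
    z = add(z, convolve(w[n], y[n]))
    f = add(f, convolve(w[n], h[n]))
    z_v = add(z_v, convolve(w[n], v[n]))

# z[k] should equal f[k] (*) s[k] + z_v[k]
z_model = add(convolve(f, s), z_v)
max_diff = max(abs(a - b) for a, b in zip(z, z_model))
```

With these definitions, dereverberation corresponds to choosing the `w` so that `f` approaches a unit impulse, and noise reduction to keeping `z_v` small.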

Page 14: Signal characteristics

• Speech:
  - Broadband (300-8000 Hz)
  - Non-stationary
  - On/off characteristic → speech detection algorithm (VAD)
  - Linear low-rank model: linear combination of basis functions

$$\mathbf{s}[k] = \sum_{i=1}^{R} a_i[k]\, \mathbf{s}_i \qquad (R = 12 \ldots 20)$$

[Figure: speech waveform, amplitude versus time (sec)]

• Noise:
  - unknown signals (no reference available)
  - slowly time-varying (fan) versus non-stationary (radio, speech)
  - localised versus diffuse noise

Page 15: Acoustic environment

• Reverberation time T60: global characterisation

  Environment   T60
  Car           70 ms
  Room          250 ms
  Church        1500 ms

• Acoustic impulse responses:
  - Acoustic filtering between 2 points in a room
  - FIR filter (K = 1000…2000 taps)
  - Non-minimum-phase system → no stable inverse

[Figure: "Impulse response PSK row 9", amplitude versus time (sec)]

• Microphone array:
  - Assumption: point sensors with ideal characteristics
  - Deviations: gain, phase, position
  - Distance speaker - microphone array: far-field versus near-field

Page 16: Overview

• Introduction
• Basic principles
• Robust broadband beamforming
  - Novel design procedures for broadband beamformers
  - Robust beamforming for gain and phase errors
• Multi-microphone optimal filtering
• Acoustic transfer function estimation and dereverberation
• Conclusion and further research

Page 17: Fixed beamforming

• Speech and noise sources with overlapping spectrum at different positions → exploit spatial diversity by using multiple microphones
• Technique originally developed for radar applications:
  - Narrowband: delay compensation → broadband
  - Far-field: planar waves → near-field: spherical waves
  - Known sensor characteristics → deviations
• Advantages: low complexity; robustness at low signal-to-noise ratio (SNR)
• Disadvantages: a-priori knowledge of microphone array characteristics; signal-independent
• FIR filter-and-sum structure: arbitrary spatial directivity pattern for arbitrary microphone array configuration
• Suppress noise and reverberation from certain directions

Page 18: Filter-and-sum configuration

• Objective: calculate filters $w_n[k]$ such that the beamformer performs the desired (fixed) spatial and spectral filtering → 2D filter design in angle and frequency
• Far-field: planar waves, equal attenuation
• Spatial directivity pattern:

$$H(\omega,\theta) = \frac{Z(\omega,\theta)}{S(\omega)} = \mathbf{w}^T \mathbf{g}(\omega,\theta)$$

• Desired spatial directivity pattern: $D(\omega,\theta)$
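As an illustration of the directivity pattern $H(\omega,\theta) = \mathbf{w}^T\mathbf{g}(\omega,\theta)$, the sketch below evaluates it for a linear array under a far-field model. The steering term used here (pure delays $d_n\cos\theta/c$, unit gain), the speed of sound `c = 340` m/s and the function name `directivity` are assumptions made for this example; the array values N=5, d=4 cm, fs=8 kHz are borrowed from the simulation parameters later in the talk.

```python
import cmath
import math

def directivity(w, omega, theta, positions, fs, c=340.0):
    """Far-field directivity H(omega, theta) of an FIR filter-and-sum beamformer.

    w[n][l]   : tap l of the filter on microphone n
    omega     : normalised frequency in rad/sample
    theta     : angle of incidence in rad (0 = endfire, pi/2 = broadside)
    positions : microphone positions along the array axis in metres
    """
    H = 0j
    for w_n, d_n in zip(w, positions):
        tau_n = d_n * math.cos(theta) / c          # assumed far-field delay (s)
        for l, w_nl in enumerate(w_n):
            H += w_nl * cmath.exp(-1j * omega * (l + tau_n * fs))
    return H

N, d, fs = 5, 0.04, 8000.0
positions = [n * d for n in range(N)]
w_das = [[1.0 / N] for _ in range(N)]   # one-tap delay-and-sum weights steered to broadside

omega = 2 * math.pi * 2000.0 / fs       # 2 kHz
gain_broadside = abs(directivity(w_das, omega, math.radians(90), positions, fs))
gain_endfire = abs(directivity(w_das, omega, math.radians(0), positions, fs))
```

The delay-and-sum weights pass the broadside direction with unit gain and attenuate endfire; longer filters `w[n]` give the extra degrees of freedom needed to shape the pattern over both angle and frequency.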

Page 19: Design procedures

• Design filter $\mathbf{w}$ such that spatial directivity pattern $H(\omega,\theta)$ optimally fits $D(\omega,\theta)$ → minimisation of cost function
  - Broadband problem: no design for separate frequencies $\omega_i$ → design over complete frequency-angle region
  - No approximations of integrals by finite Riemann-sum
  - Microphone configuration not included in optimisation
• Cost functions:
  - Least-squares (amplitude and phase) → quadratic function

$$J_{LS}(\mathbf{w}) = \int_\Theta \int_\Omega F(\omega,\theta)\,\big|H(\omega,\theta) - D(\omega,\theta)\big|^2\, d\omega\, d\theta$$

  - Non-linear cost function [Kajala 99] → iterative optimisation = complex!

$$J_{NL}(\mathbf{w}) = \int_\Theta \int_\Omega F(\omega,\theta)\,\Big(\big|H(\omega,\theta)\big|^2 - \big|D(\omega,\theta)\big|^2\Big)^2\, d\omega\, d\theta$$

• Double integrals only need to be calculated once
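Why the double integrals only need to be calculated once can be seen by expanding the least-squares cost function. The following is a sketch, assuming real-valued filter coefficients $\mathbf{w}$ and a directivity pattern of the form $H(\omega,\theta) = \mathbf{w}^T\mathbf{g}(\omega,\theta)$; the symbols $\mathbf{Q}_{LS}$, $\mathbf{a}$ and $d_{LS}$ are introduced here for illustration only.

$$J_{LS}(\mathbf{w}) = \mathbf{w}^T \mathbf{Q}_{LS}\,\mathbf{w} - 2\,\mathbf{w}^T \mathbf{a} + d_{LS}$$

$$\mathbf{Q}_{LS} = \int_\Theta\!\int_\Omega F(\omega,\theta)\,\mathrm{Re}\{\mathbf{g}(\omega,\theta)\,\mathbf{g}^H(\omega,\theta)\}\, d\omega\, d\theta, \qquad \mathbf{a} = \int_\Theta\!\int_\Omega F(\omega,\theta)\,\mathrm{Re}\{D^*(\omega,\theta)\,\mathbf{g}(\omega,\theta)\}\, d\omega\, d\theta$$

$$d_{LS} = \int_\Theta\!\int_\Omega F(\omega,\theta)\,|D(\omega,\theta)|^2\, d\omega\, d\theta, \qquad \mathbf{w}_{LS} = \mathbf{Q}_{LS}^{-1}\,\mathbf{a}$$

None of the integrals depends on $\mathbf{w}$, so under these assumptions they are evaluated once and the design reduces to solving a linear system. The non-linear cost function does not reduce to a quadratic form in $\mathbf{w}$, which is why it requires iterative optimisation.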

Page 20: Design procedures

• 2 non-iterative cost functions, based on eigenfilters:
  - Eigenfilters: 1D and 2D FIR filter design [Vaidyanathan 87, Pei 01]
  - Extension to design of broadband beamformers
• Novel cost functions:
  - Conventional eigenfilter technique → (G)EVD; reference point $(\omega_c,\theta_c)$ required

$$J_{eig}(\mathbf{w}) = \int_\Theta \int_\Omega F(\omega,\theta)\,\Big|\frac{D(\omega,\theta)}{D(\omega_c,\theta_c)}\,H(\omega_c,\theta_c) - H(\omega,\theta)\Big|^2\, d\omega\, d\theta$$

  - Eigenfilter based on TLS-criterion → GEVD

$$J_{TLS}(\mathbf{w}) = \int_\Theta \int_\Omega F(\omega,\theta)\,\frac{\big|H(\omega,\theta) - D(\omega,\theta)\big|^2}{\mathbf{w}^T\mathbf{w} + 1}\, d\omega\, d\theta$$

• Conclusion: TLS-eigenfilter is the preferred non-iterative design procedure

Page 21: Simulations

• Parameters: N=5, d=4 cm, L=20, fs=8 kHz
• Passband: 40°-80°; stopband: 0°-30° and 90°-180°

[Figure: spatial directivity patterns in dB versus angle (deg) and frequency (Hz) for three designs: delay-and-sum, non-linear procedure, TLS-eigenfilter]

Page 22: Near-field configuration

• Near-field: spherical waves + attenuation → take into account distance r between speaker and microphones
• One specific distance: very similar to far-field design (different calculation of double integrals), but deviation for other distances
• Ultimate goal: design for all distances

$$J_{tot}(\mathbf{w}) = \int_R \int_\Theta \int_\Omega F(\omega,\theta,r)\,\big|H(\omega,\theta,r) - D(\omega,\theta,r)\big|^2\, d\omega\, d\theta\, dr$$

• Finite number (R) of distances: trade-off performance for different distances

$$J_{tot}(\mathbf{w}) = \sum_{r=1}^{R} \alpha_r\, J_r(\mathbf{w})$$

• Several distances: trivial extension for most cost functions; for the TLS-eigenfilter = sum of generalised Rayleigh quotients

Page 23: Simulations

• Parameters: N=5, d=4 cm, L=20, fs=8 kHz
• Passband: 70°-110°; stopband: 0°-60° and 120°-180°

[Figure: spatial directivity patterns in dB versus angle (deg) and frequency (Hz). Rows: far-field design, mixed near-field/far-field design. Columns: far-field pattern, near-field pattern (r=0.2 m)]

Page 24: Robust broadband beamforming

• Small deviations from the assumed microphone characteristics (gain, phase, position) → large deviations from desired directivity pattern, especially for small-size microphone arrays

$$A_n(\omega,\theta) = \underbrace{a_n(\omega,\theta)}_{\text{gain}}\ \underbrace{e^{-j\psi_n(\omega,\theta)}}_{\text{phase}}\ \underbrace{e^{-j\omega \cos\theta\, \delta_n f_s / c}}_{\text{position}}$$

• In practice microphone characteristics are never exactly known:
  - measurement or calibration procedure, or
  - incorporate specific (random) deviations in design
• Consider all feasible microphone characteristics and optimise:
  - average performance using probability as weight (requires knowledge about probability density functions)

$$J_{mean} = \int_{A_0} \cdots \int_{A_{N-1}} J(A_0, \ldots, A_{N-1})\, f_A(A_0) \cdots f_A(A_{N-1})\, dA_0 \cdots dA_{N-1}$$

  - worst-case performance → minimax optimisation problem (finite grid of microphone characteristics → high complexity)

Page 25: Simulations

• Non-linear design procedure
• N=3, positions: [-0.01 0 0.015] m, L=20, fs=8 kHz
• Passband = 0°-60°, 300-4000 Hz (endfire); stopband = 80°-180°, 300-4000 Hz
• Robust design - average performance: uniform pdf = gain (0.85-1.15) and phase (-5°-10°)
• Deviation = [0.9 1.1 1.05] and [5° -2° 5°]

  Design         J        Jdev     Jmean    Jmax
  Non-robust     0.1585   87.131   275.40   3623.6
  Average cost   0.2196   0.2219   0.3371   0.4990
  Maximum cost   0.1707   0.1990   0.4114   0.4167

Page 26: Simulations

[Figure: spatial directivity patterns in dB versus angle (deg) and frequency (Hz). Rows: no deviations, deviations (gain/phase). Columns: non-robust design, robust design]

Page 27: Simulations

[Figure: non-robust design versus robust design]

Page 28: Overview

• Introduction
• Basic principles
• Robust broadband beamforming
• Multi-microphone optimal filtering
  - GSVD-based optimal filtering technique
  - Reduction of computational complexity
  - Simulations
• Acoustic transfer function estimation and dereverberation
• Conclusion and further research

Page 29: Multi-microphone optimal filtering

• Objective: optimal estimate of speech components in microphone signals
• Minimise MSE $E\{|x_n[k] - z[k]|^2\}$ → no a-priori assumptions

$$\min_{\mathbf{W}[k]} E\big\{\|\mathbf{x}[k] - \mathbf{z}[k]\|^2\big\} = \min_{\mathbf{W}[k]} E\big\{\|\mathbf{x}[k] - \mathbf{W}^T[k]\,\mathbf{y}[k]\|^2\big\}$$

• Multi-channel Wiener filter:

$$\mathbf{W}_{WF}[k] = \mathbf{R}_{yy}^{-1}[k]\,\mathbf{R}_{yx}[k] = \mathbf{R}_{yy}^{-1}[k]\,\big(\mathbf{R}_{yy}[k] - \mathbf{R}_{vv}[k]\big)$$

• Assumptions for the second expression:
  - speech and noise independent
  - 2nd-order statistics of noise stationary → estimate $\mathbf{R}_{vv}$ during noise periods (VAD)
• Properties: multi-microphone, signal-dependent, robustness
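The multi-channel Wiener filter can be estimated directly from sample correlation matrices. The sketch below is a simplified illustration, not the implementation from the talk: it assumes one filter tap per microphone, a hypothetical instantaneous mixing vector `a` in place of acoustic impulse responses, white noise, and a perfect VAD that supplies a separate noise-only segment `v_only`.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 4, 20000                         # microphones, samples (assumed values)

a = rng.standard_normal(N)              # hypothetical mixing vector (stands in for h_n)
s = rng.standard_normal(K)              # clean speech stand-in
x = np.outer(a, s)                      # speech components x_n[k], shape (N, K)
v = rng.standard_normal((N, K))         # noise during speech periods
v_only = rng.standard_normal((N, K))    # noise-only period, as a VAD would select

y = x + v                               # microphone signals

# Sample correlation matrices
Ryy = y @ y.T / K                       # estimated during speech + noise periods
Rvv = v_only @ v_only.T / K             # estimated during noise-only periods

# W_WF = Ryy^{-1} (Ryy - Rvv), using a linear solve instead of an explicit inverse
W = np.linalg.solve(Ryy, Ryy - Rvv)
z = W.T @ y                             # estimate of the speech components

mse_in = np.mean(np.sum((x - y) ** 2, axis=0))    # error of the unprocessed signals
mse_out = np.mean(np.sum((x - z) ** 2, axis=0))   # error after filtering
```

Only `y` and the noise-only segment enter the filter computation; the clean signal `x` is used solely to evaluate the error, which reflects the absence of a-priori assumptions about source and microphone positions.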

Page 30: Multi-microphone optimal filtering

• Implementation procedure:
  - based on Generalised Eigenvalue Decomposition (GEVD)
    - take into account low-rank model of speech
    - trade-off between noise reduction and speech distortion
  - QRD [Rombouts 2002], subband [Spriet 2001] → lower complexity
• Generalised Eigenvalue Decomposition (GEVD), valid for coloured noise:

$$\mathbf{R}_{yy}[k] = \mathbf{Q}[k]\,\boldsymbol{\Lambda}_y[k]\,\mathbf{Q}^T[k], \qquad \mathbf{R}_{vv}[k] = \mathbf{Q}[k]\,\boldsymbol{\Lambda}_v[k]\,\mathbf{Q}^T[k]$$

  with $\boldsymbol{\Lambda}_y[k] = \mathrm{diag}\{\sigma_i^2[k]\}$ and $\boldsymbol{\Lambda}_v[k] = \mathrm{diag}\{\eta_i^2[k]\}$
• Low-rank model:

$$\sigma_i^2[k] > \eta_i^2[k], \quad i = 1 \ldots R; \qquad \sigma_i^2[k] = \eta_i^2[k], \quad i = R+1 \ldots M$$

• Resulting filter (signal-dependent FIR filterbank):

$$\mathbf{W}_{WF}[k] = \mathbf{Q}^{-T}[k]\ \mathrm{diag}\Big\{1 - \frac{\eta_i^2[k]}{\sigma_i^2[k]}\Big\}\ \mathbf{Q}^T[k]$$

• Speech detection mechanism is the only a-priori assumption: required for estimation of correlation matrices
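The GEVD form of the Wiener filter can be checked against the direct expression $\mathbf{R}_{yy}^{-1}(\mathbf{R}_{yy} - \mathbf{R}_{vv})$. The sketch below computes a joint diagonalisation through a Cholesky factor of $\mathbf{R}_{vv}$ followed by a symmetric eigendecomposition, which is one standard way to obtain a GEVD and not necessarily the procedure used in the talk. The model matrices, the sizes `M` and `R`, and the normalisation $\eta_i^2 = 1$ are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(3)
M, R = 6, 2                                  # filter dimension and speech rank (assumed)

A = rng.standard_normal((M, R))
B = rng.standard_normal((M, M))
Rxx = A @ A.T                                # rank-R speech correlation matrix
Rvv = B @ B.T + 0.1 * np.eye(M)              # coloured, full-rank noise correlation matrix
Ryy = Rxx + Rvv                              # speech and noise assumed independent

# Joint diagonalisation: Rvv = L L^T, then EVD of L^{-1} Ryy L^{-T}
L = np.linalg.cholesky(Rvv)
Linv = np.linalg.inv(L)
C = Linv @ Ryy @ Linv.T
C = (C + C.T) / 2                            # remove rounding asymmetry
sigma2, U = np.linalg.eigh(C)                # generalised eigenvalues, ascending
Q = L @ U                                    # Ryy = Q diag(sigma2) Q^T, Rvv = Q Q^T (eta2 = 1)

# W_WF = Q^{-T} diag{1 - eta_i^2 / sigma_i^2} Q^T with eta_i^2 = 1
W_gevd = np.linalg.inv(Q).T @ np.diag(1.0 - 1.0 / sigma2) @ Q.T
W_direct = np.linalg.solve(Ryy, Ryy - Rvv)
```

In this model the `M - R` smallest generalised eigenvalues equal 1, which is the low-rank condition $\sigma_i^2 = \eta_i^2$; the corresponding diagonal gains are zero, so those modes are removed entirely.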

Page 31: General class of estimators

• Multi-channel Wiener filter: always combination of noise reduction and (linear) speech distortion. Estimation error:

$$\mathbf{e}[k] = \underbrace{\big(\mathbf{I}_M - \mathbf{W}_{WF}^T[k]\big)\,\mathbf{x}[k]}_{\text{speech distortion}} - \underbrace{\mathbf{W}_{WF}^T[k]\,\mathbf{v}[k]}_{\text{residual noise}}$$

• General class: trade-off between noise reduction and speech distortion [Ephraim 95]

$$\mathbf{W}_{WF}[k] = \mathbf{Q}^{-T}[k]\ \mathrm{diag}\Big\{\frac{\sigma_i^2[k] - \eta_i^2[k]}{\sigma_i^2[k] + (\mu - 1)\,\eta_i^2[k]}\Big\}\ \mathbf{Q}^T[k]$$

  - μ = 1: MMSE (equal importance)
  - μ < 1: less speech distortion, less noise reduction
  - μ > 1: more speech distortion, more noise reduction
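The effect of the trade-off parameter can be read off the diagonal gains of the general class of estimators. The sketch below evaluates the gain of one mode for a few parameter values; the function name `mode_gain`, the symbol `mu` for the trade-off parameter and the numerical values are assumptions chosen for illustration.

```python
def mode_gain(sigma2, eta2, mu=1.0):
    """Diagonal gain (sigma_i^2 - eta_i^2) / (sigma_i^2 + (mu - 1) * eta_i^2)."""
    return (sigma2 - eta2) / (sigma2 + (mu - 1.0) * eta2)

sigma2, eta2 = 4.0, 1.0     # a mode that contains speech: sigma_i^2 > eta_i^2

# Gains for less (mu < 1), equal (mu = 1) and more (mu > 1) emphasis on noise reduction
gains = {mu: mode_gain(sigma2, eta2, mu) for mu in (0.5, 1.0, 2.0)}

# A noise-only mode (sigma_i^2 = eta_i^2) is suppressed completely
noise_only_gain = mode_gain(1.0, 1.0, mu=2.0)
```

For μ = 1 the gain reduces to $1 - \eta_i^2/\sigma_i^2$, the Wiener solution; a larger μ lowers the gain of every mode, which removes more noise but also attenuates more of the speech in that mode.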

Page 32: Frequency-domain analysis

• Decomposition in spectral and spatial filtering term:

$$\mathbf{W}_{WF} = \underbrace{\frac{P_x}{P_x + P_v}}_{\text{spectral filtering (PSD)}}\ \underbrace{\boldsymbol{\Gamma}_y^{-1}\,\boldsymbol{\Gamma}_x\,\mathbf{e}_1}_{\text{spatial filtering (coherence)}}$$

• Desired beamforming behaviour for simple scenarios

[Figure labels: Speech, Noise]

Page 33: Complexity reduction

• Recursive version: each time step calculation of GSVD + filter → high computational complexity
• Complexity reduction using:
  - Recursive techniques for recomputing GSVD [Moonen 90]
  - Sub-sampling (stationary acoustic environments)

  (N = 4, L = 20, M = 80, fs = 16 kHz, P = 4000, Q = 20000)

              Batch               Recursive     QRD [Rombouts]
  Complexity  3M²(P+Q) + 16M³     20.5 M²       3.5 M²
  sub = 1     7504 Gflops         2.1 Gflops    358 Mflops
  sub = 20    375 Gflops          105 Mflops    18 Mflops

• Real-time implementation possible
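The figures in the complexity table can be reproduced from the operation counts. The sketch below assumes that the expressions are flops per time step (read here as 3M²(P+Q) + 16M³ for the batch version), that M = N·L, and that sub-sampling by a factor `sub` divides the cost by `sub`; these readings are assumptions that happen to match the tabulated values.

```python
N, L, fs = 4, 20, 16000       # microphones, filter length, sampling frequency (Hz)
M = N * L                     # stacked filter dimension, M = 80
P, Q = 4000, 20000            # data lengths quoted on the slide

# Assumed operation counts per time step
flops_per_step = {
    "batch": 3 * M**2 * (P + Q) + 16 * M**3,
    "recursive": 20.5 * M**2,
    "qrd": 3.5 * M**2,
}

def flops_per_second(name, sub=1):
    """Cost per second when the filter is recomputed every `sub` samples."""
    return flops_per_step[name] * fs / sub

for sub in (1, 20):
    row = {name: flops_per_second(name, sub) for name in flops_per_step}
    print(f"sub = {sub}:",
          ", ".join(f"{name} {value / 1e9:.3f} Gflops" for name, value in row.items()))
```

Under these assumptions the recursive and QRD-based versions are more than three orders of magnitude cheaper than the batch version, and sub-sampling by 20 brings the QRD-based version down to tens of Mflops.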

Page 34: Complexity reduction

• Incorporation in ‘Generalised Sidelobe Canceller’ (GSC) structure: adaptive beamforming
  - Creation of speech reference and noise reference signals
  - Standard multi-channel adaptive filter (LMS, APA)
• Increase noise reduction performance
• Complexity reduction by using shorter filters

[Figure: GSC-type structure. The microphone signals y_0[k], …, y_{N-1}[k] pass through the optimal filter w_0[k], …, w_{N-1}[k] to create the speech reference; the noise reference(s) pass through the adaptive filter and are combined with the delayed speech reference]

Page 35: Simulations

• N=4, SNR=0 dB, 3 noise sources (white, speech, music), fs=16 kHz
• Performance: improvement of signal-to-noise ratio (SNR)

[Figure: unbiased SNR (dB) versus reverberation time (0-1500 msec) for four algorithms:
  - Delay-and-sum beamformer
  - GSC (LANC=400, noise ref=Griffiths-Jim)
  - Recursive GSVD (L=20, LANC=400, all nref)
  - Recursive GSVD (L=20, no ANC)]

Page 36: Simulations

• N=4, SNR=0 dB, 3 noise sources, fs=16 kHz, T60=300 msec
• ‘Power Transfer Functions’ (PTF) for speech and noise component

[Figure: spectrum (dB, 0 to -30) versus frequency (0-8000 Hz), speech and noise PTFs for two algorithms:
  - Recursive GSVD (L=20, no ANC)
  - Recursive GSVD (L=20, LANC=400, all noise ref)]

Page 37: Conclusions

• GSVD-based optimal filtering technique:
  - Multi-microphone extension of single-microphone subspace-based enhancement techniques
  - Signal-dependent low-rank model of speech
  - No a-priori assumptions about position of speaker and microphones
• SNR-improvement higher than GSC for all reverberation times and all considered acoustic scenarios
• More robust to deviations from signal model:
  - Microphone characteristics
  - Position of speaker
  - VAD: only a-priori information!
    - No effect on SNR-improvement
    - Limited effect on speech distortion

Page 38: Advantages - Disadvantages

                     Fixed beamforming   Adaptive beamforming   Optimal filtering
  Signal-dependent   no                  yes                    yes
  Noise reduction    +                   ++                     +++
  Dereverberation    +                   +                      no
  Complexity         low                 average                high
  VAD                no                  yes                    yes
  Robustness         - (+)               -- (+)                 ++

Page 39: Overview

• Introduction
• Basic principles
• Robust broadband beamforming
• Multi-microphone optimal filtering
• Acoustic transfer function estimation and dereverberation
  - Time-domain technique
  - Frequency-domain technique
  - Combined noise reduction and dereverberation
• Conclusion and further research

Page 40: Objective

• Blind estimation of acoustic impulse responses: time-domain and frequency-domain techniques
• Use of the estimated impulse responses:
  - Dereverberation
  - Noise reduction and dereverberation
  - Source localisation

[Figure: microphone signals y_0[k], …, y_{N-1}[k], acoustic impulse response h_1[k], filters w_0[k], …, w_{N-1}[k], output z[k]]

Page 41: Time-domain techniques

• Signal model for N=2 and no background noise:

$$Y_0(z) = H_0(z)\,S(z), \qquad Y_1(z) = H_1(z)\,S(z)$$

$$E(z) = -H_1(z)\,Y_0(z) + H_0(z)\,Y_1(z) = 0$$

• Subspace-based technique: impulse responses can be computed from null-space of speech correlation matrix $\mathbf{R}_{yy}[k]$, up to a scaling factor $\pm\alpha$
  - Eigenvector corresponding to smallest eigenvalue
  - Coloured noise: GEVD
  - Problems occurring in time-domain technique:
    - sensitivity to underestimation of impulse response length
    - low-rank model in combination with background noise
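The null-space idea can be illustrated for N=2 in the noise-free case. The sketch below is a minimal example under idealised assumptions (white excitation, exactly known impulse response length `K`, random short impulse responses without common zeros); it is not the adaptive algorithm from the talk, and all variable names are chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4                                    # impulse response length, assumed known
h0 = rng.standard_normal(K)
h1 = rng.standard_normal(K)
s = rng.standard_normal(2000)            # white excitation as a stand-in for speech

# Noise-free microphone signals y_n[k] = h_n[k] (*) s[k]
y0 = np.convolve(s, h0)[:len(s)]
y1 = np.convolve(s, h1)[:len(s)]

# Stacked data vectors [y0[k] ... y0[k-K+1], y1[k] ... y1[k-K+1]]
rows = [np.concatenate((y0[k - K + 1:k + 1][::-1], y1[k - K + 1:k + 1][::-1]))
        for k in range(K - 1, len(s))]
Y = np.array(rows)
Ryy = Y.T @ Y / len(Y)                   # sample correlation matrix (2K x 2K)

# The eigenvector of the smallest eigenvalue spans the null-space
eigvals, eigvecs = np.linalg.eigh(Ryy)   # eigenvalues in ascending order
u = eigvecs[:, 0]

# Expected null-space direction: [-h1; h0], up to an unknown scaling factor
h_true = np.concatenate((-h1, h0))
alignment = abs(u @ h_true) / np.linalg.norm(h_true)
```

Setting `K` smaller than the true length removes the exact null-space, which is the sensitivity to underestimation of the impulse response length noted above.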

Page 42: Stochastic gradient algorithm

• Batch estimation techniques form basis for deriving adaptive stochastic gradient algorithm

$$\min_{\mathbf{u}}\ \mathbf{u}^T \mathbf{R}_{yy}[k]\,\mathbf{u} \quad \text{subject to} \quad \mathbf{u}^T \mathbf{R}_{vv}[k]\,\mathbf{u} = 1$$

$$e[k] = \mathbf{u}^T[k]\,\mathbf{y}[k]$$

$$\tilde{\mathbf{u}}[k+1] = \mathbf{u}[k] - \rho\, e[k]\,\big(\mathbf{y}[k] - e[k]\,\mathbf{R}_{vv}[k]\,\mathbf{u}[k]\big)$$

$$\mathbf{u}[k+1] = \frac{\tilde{\mathbf{u}}[k+1]}{\sqrt{\tilde{\mathbf{u}}^T[k+1]\,\mathbf{R}_{vv}[k]\,\tilde{\mathbf{u}}[k+1]}}$$

  with step size $\rho$
• Usage:
  - Estimation of partial impulse responses → time-delay estimation for acoustic source localisation
  - For source localisation, the adaptive GEVD algorithm is more robust than the adaptive EVD algorithm (and prewhitening) in reverberant environments with a large amount of noise

Page 43: Multi-microphone noise reduction and dereverberation techniques for speech applications Simon Doclo Dept. of Electrical Engineering, KU Leuven, Belgium.

4343

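Frequency-domain techniques

• Problems of time-domain technique → frequency-domain
• Signal model: rank-1 model

$$\mathbf{Y}(\omega) = \mathbf{H}(\omega)\, S(\omega) + \mathbf{V}(\omega), \qquad
\begin{bmatrix} Y_0(\omega) \\ Y_1(\omega) \\ \vdots \\ Y_{N-1}(\omega) \end{bmatrix} =
\begin{bmatrix} H_0(\omega) \\ H_1(\omega) \\ \vdots \\ H_{N-1}(\omega) \end{bmatrix} S(\omega) +
\begin{bmatrix} V_0(\omega) \\ V_1(\omega) \\ \vdots \\ V_{N-1}(\omega) \end{bmatrix}$$

• Estimation of acoustic transfer function vector $\mathbf{H}(\omega)$ from GEVD of correlation matrices $\mathbf{R}_{yy}(\omega)$ and $\mathbf{R}_{vv}(\omega)$
  – Corresponding to largest generalised eigenvalue → no stochastic gradient algorithm available (yet)
  – Unknown scaling factor in each frequency bin: can be determined only if norm $\|\mathbf{H}(\omega)\|$ is known → algorithm only useful when position of source is fixed (e.g. desktop, car)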
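
For a single frequency bin, the GEVD-based estimate under the rank-1 model can be sketched as follows. This is an illustration under assumptions: both correlation matrices are taken as exactly known, the function name `estimate_atf_gevd` is hypothetical, and the GEVD is computed by Cholesky whitening with NumPy rather than with a dedicated generalised eigensolver. The result is returned with unit norm, which makes the unknown scaling factor explicit.

```python
import numpy as np

def estimate_atf_gevd(Ryy, Rvv):
    """Unit-norm estimate of H from the GEVD of (Ryy, Rvv) in one frequency bin."""
    Lc = np.linalg.cholesky(Rvv)              # Rvv = Lc Lc^H
    B = np.linalg.solve(Lc, Ryy)              # Lc^{-1} Ryy
    A = np.linalg.solve(Lc, B.conj().T)       # Lc^{-1} Ryy Lc^{-H}
    A = 0.5 * (A + A.conj().T)                # enforce Hermitian symmetry
    eigvals, eigvecs = np.linalg.eigh(A)      # eigenvalues in ascending order
    p = eigvecs[:, -1]                        # largest generalised eigenvalue
    H_est = Lc @ p                            # proportional to H under rank-1 model
    return H_est / np.linalg.norm(H_est), eigvals[-1]

def make_bin(seed=0, N=4, sigma_s2=2.0):
    """Build hypothetical exact correlation matrices for one bin."""
    rng = np.random.default_rng(seed)
    H = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    M = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    Rvv = M @ M.conj().T + np.eye(N)          # coloured noise, positive definite
    Ryy = sigma_s2 * np.outer(H, H.conj()) + Rvv
    return H, Ryy, Rvv, sigma_s2

if __name__ == "__main__":
    H, Ryy, Rvv, _ = make_bin()
    H_est, lam = estimate_atf_gevd(Ryy, Rvv)
    print(abs(np.vdot(H_est, H)) / np.linalg.norm(H), lam)
```

The estimate matches H only up to a complex factor per bin, so an additional constraint such as a known norm is needed before the bins can be combined.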

Page 44: Multi-microphone noise reduction and dereverberation techniques for speech applications Simon Doclo Dept. of Electrical Engineering, KU Leuven, Belgium.

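Combined noise reduction and dereverberation

• Filtering operation in frequency domain:

$$Z(\omega) = \mathbf{W}^H(\omega)\,\mathbf{Y}(\omega) = \underbrace{\mathbf{W}^H(\omega)\,\mathbf{H}(\omega)}_{F(\omega)}\, S(\omega) + \underbrace{\mathbf{W}^H(\omega)\,\mathbf{V}(\omega)}_{\text{residual noise}}$$

• Dereverberation: normalised matched filter

$$F(\omega) = 1, \qquad \mathbf{W}_d(\omega) = \frac{\mathbf{H}(\omega)}{\|\mathbf{H}(\omega)\|^2}$$

• Combined noise reduction and dereverberation: $Z(\omega)$ is optimal (MMSE) estimate of $S(\omega)$
  – Optimal estimate of s[k] → integration of multi-channel Wiener-filter with normalised matched filter

$$\hat{\mathbf{X}}(\omega) = \mathbf{H}(\omega)\,\hat{S}(\omega)$$

  – Trade-off between both objectives
• Implementation: overlap-save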
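
The relation between the normalised matched filter and the combined filter can be checked numerically in one frequency bin. The sketch below assumes the rank-1 model with known H, source power `sigma_s2` and noise correlation matrix; all function names and the test values are hypothetical. It forms the multi-channel Wiener filter for the speech component, applies the normalised matched filter to its output, and exposes the resulting overall response F = W^H H.

```python
import numpy as np

def matched_filter(H):
    """Normalised matched filter W_d = H / ||H||^2, so that W_d^H H = 1."""
    return H / np.vdot(H, H).real

def multichannel_wiener_filter(H, sigma_s2, Rvv):
    """W = Ryy^{-1} Rxx, giving the speech-component estimate X_hat = W^H Y."""
    Rxx = sigma_s2 * np.outer(H, H.conj())
    return np.linalg.solve(Rxx + Rvv, Rxx)

def combined_filter(H, sigma_s2, Rvv):
    """Wiener filter followed by matched filter: S_hat = W_c^H Y."""
    return multichannel_wiener_filter(H, sigma_s2, Rvv) @ matched_filter(H)

def make_example(seed=1, N=4, sigma_s2=1.5):
    """Hypothetical transfer function vector and noise statistics for one bin."""
    rng = np.random.default_rng(seed)
    H = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    M = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    Rvv = M @ M.conj().T + np.eye(N)
    return H, sigma_s2, Rvv

if __name__ == "__main__":
    H, sigma_s2, Rvv = make_example()
    print(np.vdot(matched_filter(H), H))                   # response F of W_d
    print(np.vdot(combined_filter(H, sigma_s2, Rvv), H))   # response F of W_c
```

In this model the combined filter has a real response F strictly between 0 and 1, which shows the trade-off: it suppresses noise but no longer satisfies F = 1 exactly.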

Page 45: Multi-microphone noise reduction and dereverberation techniques for speech applications Simon Doclo Dept. of Electrical Engineering, KU Leuven, Belgium.

Simulations

• N = 4, d = 2 cm, fs = 16 kHz, SNR = 0 dB, T60 = 400 ms
• FFT-size L = 1024, overlap R = 16
• Performance criteria:
  – Signal-to-noise ratio (SNR)
  – Dereverberation-index (DI):

$$\mathrm{DI} = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left|\, 20 \log_{10} \left| \mathbf{W}^H(\omega)\,\mathbf{H}(\omega) \right| \,\right| d\omega$$

                                               SNR (dB)   DI (dB)
Original microphone signal                       2.88      4.74
Noise reduction                                 16.82      4.73
Dereverberation                                  2.30      0.86
Combined noise reduction and dereverberation    10.12      1.35
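
The frequency-domain filters are applied with overlap-save, which can be sketched generically as below. This is not the author's implementation: the function name `overlap_save`, the single real-valued channel, the filter length and the FFT size in the usage example are assumptions chosen for illustration. Each block keeps only the last fft_size − M + 1 output samples, which are free of circular-convolution wrap-around.

```python
import numpy as np

def overlap_save(x, h, fft_size):
    """Filter x with FIR filter h by FFT-based overlap-save block processing."""
    M = len(h)
    if fft_size < M:
        raise ValueError("fft_size must be at least the filter length")
    hop = fft_size - M + 1                       # new samples per block
    Hf = np.fft.rfft(h, fft_size)                # filter spectrum (zero-padded)
    # M-1 leading zeros provide the history for the first block; trailing
    # zeros make sure the last block is complete.
    x_pad = np.concatenate((np.zeros(M - 1), x, np.zeros(hop)))
    out = []
    for start in range(0, len(x), hop):
        block = x_pad[start:start + fft_size]
        y_block = np.fft.irfft(np.fft.rfft(block) * Hf, fft_size)
        out.append(y_block[M - 1:])              # discard wrapped-around samples
    return np.concatenate(out)[:len(x)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(5000)
    h = rng.standard_normal(100)                 # hypothetical filter
    y = overlap_save(x, h, fft_size=1024)
    print(np.max(np.abs(y - np.convolve(x, h)[:len(x)])))
```

The output equals the first len(x) samples of the linear convolution, up to floating-point rounding.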

Page 46: Multi-microphone noise reduction and dereverberation techniques for speech applications Simon Doclo Dept. of Electrical Engineering, KU Leuven, Belgium.

Simulations

Page 47: Multi-microphone noise reduction and dereverberation techniques for speech applications Simon Doclo Dept. of Electrical Engineering, KU Leuven, Belgium.

Conclusion

• Low signal quality due to background noise and reverberation → signal enhancement to improve speech intelligibility and ASR performance
• Single-microphone techniques: spectral information
• Standard beamforming: a-priori assumptions

[Diagram: boxes "Robust broadband beamforming", "Multi-microphone optimal filtering" and "Blind transfer function estimation and dereverberation", with the labels "Multi-microphone", "Signal-dependent" and "No a-priori assumptions".]

Page 48: Multi-microphone noise reduction and dereverberation techniques for speech applications Simon Doclo Dept. of Electrical Engineering, KU Leuven, Belgium.

Contributions

• Robust broadband beamforming:
  – novel cost functions for broadband far-field design (non-linear, eigenfilter-based)
  – extension to near-field and mixed near-field far-field
  – 2 procedures for robust design against gain and phase deviations
• GSVD-based optimal filter technique for multi-microphone noise reduction:
  – extension of single-microphone subspace-based techniques → multiple microphones
  – integration in GSC-structure
  – better performance and robustness than beamforming
• Acoustic transfer function estimation and dereverberation:
  – stochastic gradient algorithm for estimation of time-delay and acoustic source localisation (coloured noise)
  – combined noise reduction and dereverberation in frequency-domain

Page 49: Multi-microphone noise reduction and dereverberation techniques for speech applications Simon Doclo Dept. of Electrical Engineering, KU Leuven, Belgium.

Further research

• Combination of multi-channel Wiener-filter and fixed beamforming:
  – Low SNR: VAD fails → poor performance of Wiener-filter
  – Combined technique: more robust when VAD fails, better performance than fixed beamformers in other scenarios
• Acoustic transfer function estimation and dereverberation:
  – Time-domain: underlying reason for high sensitivity
  – Frequency-domain: unknown scaling factor → BSS?
  – other blind identification techniques (LP, NL Kalman-filtering)
• Further complexity reduction of multi-channel optimal filtering technique
  – Stochastic gradient algorithms
  – Subband/frequency-domain

Page 50: Multi-microphone noise reduction and dereverberation techniques for speech applications Simon Doclo Dept. of Electrical Engineering, KU Leuven, Belgium.

Relevant publications

• S. Doclo and M. Moonen, “GSVD-based optimal filtering for single and multimicrophone speech enhancement,” IEEE Trans. Signal Processing, vol. 50, no. 9, pp. 2230-2244, Sep. 2002.

• S. Doclo and M. Moonen, “Multi-Microphone Noise Reduction Using Recursive GSVD-Based Optimal Filtering with ANC Postprocessing Stage,” Accepted for publication in IEEE Trans. Speech and Audio Processing, 2003.

• S. Doclo and M. Moonen, “Robust adaptive time delay estimation for speaker localisation in noisy and reverberant acoustic environments,” EURASIP Journal on Applied Signal Processing, Sep. 2003.

• S. Doclo and M. Moonen, “Combined frequency-domain dereverberation and noise reduction technique for multi-microphone speech enhancement,” in Proc. Int. Workshop on Acoustic Echo and Noise Control (IWAENC), Darmstadt, Germany, Sep. 2001, pp. 31-34.

• S. Doclo and M. Moonen, “Design of far-field and near-field broadband beamformers using eigenfilters,” Accepted for publication in Signal Processing, 2003.

• S. Doclo and M. Moonen, “Design of robust broadband beamformers for gain and phase errors in the microphone array characteristics,” IEEE Trans. Signal Processing, Oct. 2003.

Available at http://www.esat.kuleuven.ac.be/~doclo/publications.html
