Modelling compensation for reverberation: work done and planned Amy Beeston and Guy J. Brown Department of Computer Science University of Sheffield EPSRC 12-month meeting, Sheffield: 23rd Oct 2009
Jan 05, 2016

Russell Wells

Modelling compensation for reverberation: work done and planned

Amy Beeston and Guy J. Brown

Department of Computer Science

University of Sheffield

EPSRC 12-month meeting, Sheffield: 23rd Oct 2009

Overview

1. Work done

- sir-stir framework
- 3 Across-channel model configurations

2. Work planned

- Across-channel model
- Within-channel model
- Further questions

Part 1: Work Done

- sir-stir framework
- 3 Across-channel model configurations

Watkins’ sir/stir paradigm

[Figure: Human listeners. x-axis: context distance, m (0.32 and 10); y-axis: category boundary, step (0 to 10). Annotations: effect of reverberation; effect of compensation; more ‘sir’ responses; more ‘stir’ responses.]

Efferent auditory processing

- Reverberating a speech signal reduces its dynamic range
- Reflections fill gaps in the temporal envelope

Efferent system helps control dynamic range (Guinan & Gifford, 1988). Could compensation be characterised as restoration of dynamic range?

[Figure: two waveforms. Dry: mean = small value, mean/peak = 0.1216. Reverberated: mean = larger value, mean/peak = 0.2142.]

mean-to-peak ratio (MPR)

- measured over some time-window
- peak does not vary greatly with source-receiver distance
- mean increases with source-receiver distance
- MPR = mean/peak

therefore MPR increases with distance
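
The mean-to-peak ratio defined above can be sketched in a few lines. This is a minimal illustration, assuming the signal is a non-negative envelope held in a plain list; the function name `mean_to_peak_ratio` and the toy envelopes are hypothetical and not taken from the authors' code.

```python
def mean_to_peak_ratio(envelope):
    """Return mean/peak of a non-negative envelope over the given window."""
    peak = max(envelope)
    if peak <= 0:
        raise ValueError("MPR is undefined for an all-zero window")
    return sum(envelope) / len(envelope) / peak


# Toy envelopes (hypothetical): the gaps in the 'dry' envelope are partly
# filled in the 'reverberated' one, while the peak stays the same.
dry = [1.0, 0.0, 0.0, 0.8, 0.0, 0.0, 1.0, 0.0]
reverberated = [1.0, 0.5, 0.2, 0.8, 0.4, 0.2, 1.0, 0.5]

mpr_dry = mean_to_peak_ratio(dry)
mpr_reverberated = mean_to_peak_ratio(reverberated)
```

With these toy values the peak is unchanged and the mean rises, so the ratio is larger for the reverberated envelope, which is the pattern the slide describes.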

Modelling framework

Stimuli

Watkins, JASA 2005, experiment 5

- forward/reversed speech carrier
- forward/reversed reverberation

[Figure: 2 × 2 grid of conditions, speech carrier (fwd, rev) by reverberation (fwd, rev).]

Watkins, JASA 2005, experiment 4

- reverberate, then flip polarities: noise after
- flip polarities, then reverberate: noise before

[Figure: noise conditions (after, before).]
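
The stimulus manipulations listed above can be sketched as follows. This is an illustration only: it assumes reverberation is applied by convolution with a room impulse response, that "reversed reverberation" means a time-reversed impulse response, and that "flip polarities" means inverting the sign of a random half of the samples. The helper names and toy signals are hypothetical.

```python
import random


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def flip_polarities(signal, seed=0):
    """Invert the sign of a random half of the samples (assumed recipe)."""
    rng = random.Random(seed)
    flipped = list(signal)
    for i in rng.sample(range(len(signal)), len(signal) // 2):
        flipped[i] = -flipped[i]
    return flipped


speech = [1.0, 0.5, -0.3, 0.8, -0.6, 0.2]   # toy 'speech' samples
room = [1.0, 0.6, 0.3, 0.1]                 # toy decaying impulse response

# Experiment 5 style: forward or time-reversed carrier and reverberation.
fwd_fwd = convolve(speech, room)
fwd_speech_rev_reverb = convolve(speech, room[::-1])
rev_speech_fwd_reverb = convolve(speech[::-1], room)

# Experiment 4 style: the order of the two operations matters.
reverberate_then_flip = flip_polarities(convolve(speech, room))
flip_then_reverberate = convolve(flip_polarities(speech), room)
```

Flipping after reverberation leaves each sample's magnitude unchanged, whereas flipping first changes how the reflections add up, so the two orders give different signals.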

Auditory Periphery

Outer/middle ear
- Simulates human data from Huber et al. (2001)

Basilar membrane
- Dual resonance nonlinear (DRNL) filterbank
- Originally proposed by Meddis, O’Mard and Lopez-Poveda (2001)
- Human parameters from Meddis (2006)
- Efferent attenuation introduced by Ferry and Meddis (2007)

Hair cell
- Linear output between threshold and saturated firing rate (Messing 2007)
- Does not model adaptation in the auditory nerve
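
The hair cell stage described above, with linear output between threshold and saturation and no adaptation, might look like the piecewise-linear function below. This is a sketch under stated assumptions: the input is expressed as a level in dB purely for illustration, and every parameter value is hypothetical rather than taken from Messing (2007).

```python
def hair_cell_rate(level_db, threshold_db=0.0, saturation_db=40.0,
                   spontaneous_rate=0.0, saturated_rate=200.0):
    """Piecewise-linear rate-level function (all parameter values hypothetical)."""
    if level_db <= threshold_db:
        return spontaneous_rate
    if level_db >= saturation_db:
        return saturated_rate
    fraction = (level_db - threshold_db) / (saturation_db - threshold_db)
    return spontaneous_rate + fraction * (saturated_rate - spontaneous_rate)


below_threshold = hair_cell_rate(-10.0)
mid_range = hair_cell_rate(20.0)
saturated = hair_cell_rate(60.0)
```

The function is memoryless: the output depends only on the current input, which is one way of reading "does not model adaptation".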

Spectro-temporal excitation patterns

[Figure: Auditory Nerve STEP, three panels. x-axis: time; y-axis: best frequency (Hz, log-spaced, 100 to 8000). Utterance: “… ok, next you’ll get to click on … {sir, stir}”.]

Efferent attenuation based on dynamic range

Idea: to control the amount of efferent attenuation applied in the model according to the dynamic range of the context

Dynamic range measured according to mean-to-peak ratio in AN response

[Figure panels: kurtosis; negative differentials; offsets; mean-to-peak ratio.]

Auditory Nerve response

Across-channel model: the auditory nerve response is summed across all frequency channels prior to implementation of the efferent system component

Within-channel model: the auditory nerve response is NOT summed across all frequency channels

Across Channel

the auditory nerve response is summed across all frequency channels prior to implementation of the efferent system component

[Figure: spectro-temporal pattern (frequency against time) → Σ → MPR → ATT.]

Within Channel

Auditory nerve response in each frequency channel influences the efferent system component

[Figure: spectro-temporal pattern (frequency against time) with a separate MPR → ATT stage for each frequency channel.]
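
The difference between the across-channel and within-channel schemes can be sketched as below, assuming the auditory nerve response is stored as a list of frequency channels, each a list of time frames. The function names and the toy pattern are hypothetical.

```python
def mean_to_peak_ratio(values):
    """Return mean/peak, or NaN when the window holds no activity."""
    peak = max(values)
    return sum(values) / len(values) / peak if peak > 0 else float("nan")


def across_channel_mpr(step):
    """Sum the AN response over channels, then take a single MPR."""
    summed = [sum(frame) for frame in zip(*step)]
    return mean_to_peak_ratio(summed)


def within_channel_mpr(step):
    """One MPR per frequency channel; channels are not pooled."""
    return [mean_to_peak_ratio(channel) for channel in step]


# Toy STEP: rows are frequency channels, columns are time frames.
step = [
    [4.0, 0.0, 0.0, 4.0],   # channel with clear gaps
    [2.0, 2.0, 2.0, 2.0],   # channel whose gaps are filled
]

pooled = across_channel_mpr(step)
per_channel = within_channel_mpr(step)
```

In this toy case the pooled value (2/3) sits between the two per-channel values (0.5 and 1.0), so the within-channel scheme would drive each channel with a different attenuation.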

Efferent attenuation

[Figure: ATT plotted against MPR.]

- Linear map from MPR (of summed AN) to efferent attenuation, ATT
- ATT turns down the gain on the non-linear pathway of DRNL
- The rate-intensity curve shifts to the right
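
A minimal sketch of the linear map and of turning an attenuation in dB into a gain factor follows. The slope and intercept used here are hypothetical placeholders, and applying ATT as a scalar amplitude gain is an assumption about how the attenuation enters the DRNL non-linear pathway.

```python
def mpr_to_attenuation_db(mpr, slope, intercept):
    """Linear map from MPR to efferent attenuation in dB."""
    return slope * mpr + intercept


def attenuation_to_gain(att_db):
    """Convert an attenuation in dB to a linear amplitude gain factor."""
    return 10.0 ** (-att_db / 20.0)


# Hypothetical map parameters, chosen only to give round numbers.
att_db = mpr_to_attenuation_db(0.25, slope=40.0, intercept=10.0)
gain = attenuation_to_gain(att_db)
```

Here an MPR of 0.25 maps to 20 dB, which corresponds to scaling the amplitude by a factor of 0.1; a higher MPR gives more attenuation and a smaller gain.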

Recognition

Efferent attenuation helps to recover the dip in the temporal envelope corresponding to the ‘t’ closure in ‘stir’

Templates: sir, stir

3 Model configurations

Open loop

Semi-closed loop: amount of attenuation is estimated during one second preceding the test-word, and held constant thereafter

Closed loop: amount of attenuation is estimated continually in a sliding time window, and updated on a sample-by-sample basis (or at a specified control rate)

Efferent system: Open loop

Open loop

Many simulations were run: the amount of attenuation applied was varied across a range of values (0-30 dB), and the resulting category boundaries were recorded in calibration charts.

The ‘best-match’ to human results was found (manually) for each condition.

Near contexts match best with low attenuation values, while far contexts match best with higher attenuation values.
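
The calibration sweep can be sketched as a grid search. The slides state that the best match was found manually, so the automated search here is for illustration only; `toy_model_boundary` is a hypothetical stand-in for a full simulation run, and the "human" boundary values are invented.

```python
def attenuation_grid(start=0.0, stop=30.0, step=0.5):
    """Attenuation values to simulate: 0, 0.5, ..., 29.5, 30 dB."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]


def best_match_attenuation(model_boundary, human_boundary, grid):
    """Return the grid attenuation whose model boundary is closest to human."""
    return min(grid, key=lambda att: abs(model_boundary(att) - human_boundary))


def toy_model_boundary(att_db):
    """Stand-in for a simulation: boundary falls as attenuation rises."""
    return 10.0 - 0.25 * att_db


grid = attenuation_grid()
near_att = best_match_attenuation(toy_model_boundary, 7.0, grid)   # invented target
far_att = best_match_attenuation(toy_model_boundary, 4.5, grid)    # invented target
```

With these invented targets the search returns a lower attenuation for the near context than for the far context, mirroring the qualitative pattern reported above.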

Results: Open loop

[Figure: calibration curves for tuning, with panels for near and far contexts. x-axis: attenuation applied (dB), stepped through 0, 0.5, … 29.5, 30; y-axis: category boundary results. Values marked on the panels: 12.5, 9.0, 22.0, 21.5.]

Efferent system: Semi-closed loop

Semi-closed loop

amount of attenuation is estimated during one second preceding the test-word, and held constant thereafter

Semi-closed loop

[Figure: one ATT value, derived from the context “… ok, next you’ll get to click on …”, applied to the test word {sir, stir}.]

- Examine context within time window to derive a metric value
- Use metric value to determine the efferent attenuation
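
The two steps above can be sketched as follows, assuming the summed auditory nerve response is a plain list of samples and the test-word onset is known as a sample index. The function names, the toy sample rate and the linear map are hypothetical.

```python
def mean_to_peak_ratio(values):
    """Return mean/peak of a window that contains some activity."""
    return sum(values) / len(values) / max(values)


def semi_closed_loop_attenuation(summed_an, test_word_onset, sample_rate, mpr_to_att):
    """Estimate ATT from the second before the test word, then hold it."""
    window = int(round(1.0 * sample_rate))           # one second of samples
    start = max(0, test_word_onset - window)
    context = summed_an[start:test_word_onset]
    att = mpr_to_att(mean_to_peak_ratio(context))
    return [att] * (len(summed_an) - test_word_onset)  # constant over the test word


# Toy signal at a 4 Hz 'sample rate': two early samples, a one-second
# context [4, 0, 0, 4], then a three-sample test word.
summed_an = [9.0, 9.0, 4.0, 0.0, 0.0, 4.0, 1.0, 2.0, 3.0]
held = semi_closed_loop_attenuation(
    summed_an, test_word_onset=6, sample_rate=4,
    mpr_to_att=lambda mpr: 40.0 * mpr + 10.0,        # hypothetical linear map
)
```

Only the second before the test word contributes, so the two early samples are ignored and the same attenuation is held for every sample of the test word.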

Metric: Semi-closed loop MPR

Results: Semi-closed loop

Tuned to match near-near and far-far (fwd fwd) conditions

experiment 5 achieves qualitative (not quantitative) match to human data…

…but experiment 4 conditions do not match well

ATTENUATION=(38.36*MPR)+13.77

[Figure: results for speech carrier (fwd, rev) by reverberation (forward, reverse), and for noise contexts (before, after).]

Efferent system: Closed loop

Closed loop

amount of attenuation is estimated continually in a sliding time window, and updated on a sample-by-sample basis (or at a specified control rate)

Closed loop

[Figure: a series of ATT values updated through the context “… ok, next you’ll get to click on …” and the test word {sir, stir}.]

- Examine context within time window to derive a metric value
- Use metric value to determine the efferent attenuation applied
- Window slides forward, process repeats…
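
A sketch of the sliding-window loop follows. It assumes the attenuation in force changes the auditory nerve output that later windows see, which is what makes the loop closed. `toy_periphery` is a hypothetical stand-in for the DRNL and hair cell stages, `hop` is the number of samples between control updates, and the linear map is invented.

```python
def mean_to_peak_ratio(values):
    """Return mean/peak, or None when the window holds no activity."""
    peak = max(values)
    if peak <= 0:
        return None
    return sum(values) / len(values) / peak


def closed_loop(stimulus, periphery, mpr_to_att, window, hop, initial_att=0.0):
    """Run the loop; return the attenuation (dB) in force at each sample."""
    att = initial_att
    an_history = []
    att_track = []
    for n, sample in enumerate(stimulus):
        if n > 0 and n % hop == 0:
            mpr = mean_to_peak_ratio(an_history[-window:])
            if mpr is not None:          # keep the old value over silence
                att = mpr_to_att(mpr)
        an_history.append(periphery(sample, att))
        att_track.append(att)
    return att_track


def toy_periphery(sample, att_db):
    """Hypothetical periphery: magnitude scaled by the current attenuation."""
    return abs(sample) * 10.0 ** (-att_db / 20.0)


track = closed_loop(
    stimulus=[1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    periphery=toy_periphery,
    mpr_to_att=lambda mpr: 20.0 * mpr,   # hypothetical linear map
    window=2,
    hop=2,
)
```

Setting `hop` to 1 gives the sample-by-sample update; a larger `hop` corresponds to a lower control rate.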

Metric: Closed loop MPR

[Figure: Expt. 5, MPR through time, for speech carrier (fwd, rev) by reverberation (forward, reverse).]

Closed loop (expt 5)

- tuned to ‘best’ match near-near and far-far (fwd fwd) conditions
- variation possible due to granularity of model (± 0.5)

ATTENUATION=(45*MPR)+18

ATTENUATION=(45*MPR)+19

[Figure: two results panels, one per mapping, each showing speech carrier (fwd, rev) by reverberation (fwd, rev).]

Closed loop (expt 4)

MPR mapping does not generalise for experiment 4 noise contexts

ATTENUATION=(45*MPR)+18

[Figure: results for speech carrier (fwd, rev) by reverberation (fwd, rev), and for noise contexts (before, after).]

Part 2: Work Planned

Across-channel model

Within-channel model

Further questions

Across-channel model

Practical considerations

- Control rate: specified to speed up the simulation (usually 1 kHz, i.e. the attenuation parameter is updated every 1 ms)
- Time-window over which to determine the metric (usually the previous 1 second; different values under investigation at present)
- Shape of window (rectangular at present; should have a ‘forgetting function’)

Question to Tony et al.

What data can we use to determine the shape and duration of the window?
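
One way a ‘forgetting function’ could replace the rectangular window is sketched below. The exponential shape, the time constant, and the choice to weight the mean while leaving the peak unweighted are all assumptions made for illustration; the slides leave the window shape as an open question.

```python
import math


def rectangular_weights(length):
    """Equal weight for every sample in the window."""
    return [1.0] * length


def forgetting_weights(length, time_constant):
    """Exponential decay into the past; the newest (last) sample has weight 1."""
    return [math.exp(-(length - 1 - i) / time_constant) for i in range(length)]


def weighted_mpr(values, weights):
    """Weighted mean over unweighted peak (one possible definition)."""
    peak = max(values)
    weighted_mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return weighted_mean / peak


# Energy early in the window, silence just before the test word.
context = [1.0, 1.0, 0.0, 0.0]
flat = weighted_mpr(context, rectangular_weights(4))
recent = weighted_mpr(context, forgetting_weights(4, time_constant=1.0))
```

With the rectangular window the result equals the plain MPR; with the decaying weights the older energy counts for less, so the metric follows the most recent part of the context more closely.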

Across-channel model

[Figure: spectro-temporal pattern (frequency against time) → Σ → MPR → ATT, with a weight-against-time curve for the metric window. Window shape/duration?]

Within-channel model

Previously we asked what duration and shape the metric-window should have in time.

Now we ask what duration and shape the metric-window should have in frequency.

/t/ is defined by a sharp onset burst from 2 to 8 kHz (Régnier & Allen, 2008)

template matching over restricted areas of the frequency domain

Within-channel model

Frequency-dependent suppression:
- Feedback from efferent system appears to be fairly narrowly tuned
- fall-off in the effect of efferent-induced threshold shift at low BFs [data from cat, Guinan & Gifford (1988)]
- improves representation of low-frequency speech structure when efferent attenuation is high

Modelling implications:
- Need no longer be a pooled auditory nerve (STEP) response for metric/map to attenuation
- Each channel can react quasi-independently to the audio context it hears

Within-channel model

[Figure: spectro-temporal pattern (frequency against time) with a separate MPR → ATT stage for each frequency channel, and weight-against-time curves for the metric windows. Window shapes/durations?]

Questions

Is there a time-analogy to the frequency gaps in 8-band stimuli?

- imposing gaps so that bits are missing from the freq/time pattern in the context window.

- might allow an importance weighting for time-bands like for the frequency bands.

Implication?

What happens with a silent context?

Physiology predicts that efferent system is not activated

Model predicts small dynamic range:
- maximum mean/peak ratio
- high efferent attenuation
- low category boundary (more stirs)

Specifically, if (when) context is shorter than metric window:
- should we shorten the metric window?
- zero pad the utterances?
- count previous trial as context?
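
The edge cases raised above can be made concrete with a small sketch. It assumes a silent context appears in the model as a flat, non-zero response (for example spontaneous activity only), and it reuses the closed-loop map with slope 45 and intercept 18 quoted earlier in the slides; the toy values and names are hypothetical.

```python
def mean_to_peak_ratio(values):
    """Return mean/peak, or None for an all-zero window (ratio undefined)."""
    peak = max(values)
    if peak <= 0:
        return None
    return sum(values) / len(values) / peak


def attenuation_db(mpr):
    """Closed-loop map quoted earlier in the slides (slope 45, intercept 18)."""
    return 45.0 * mpr + 18.0


window = 8
silent_context = [2.0] * window        # flat response (assumed form of 'silence')
short_context = [4.0, 0.0, 4.0, 0.0]   # context shorter than the metric window

mpr_silent = mean_to_peak_ratio(silent_context)
mpr_shortened = mean_to_peak_ratio(short_context)
padding = [0.0] * (window - len(short_context))
mpr_zero_padded = mean_to_peak_ratio(padding + short_context)
```

A flat context gives the maximum ratio of 1 and therefore the largest attenuation the map can produce. Zero padding halves the ratio in this toy case because it lowers the mean but not the peak, whereas shortening the window leaves the ratio unchanged, so the two options are not equivalent.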

Thanks

Tony Watkins, Simon Makin and Andrew Raimond of Reading University for all the data.

Ray Meddis and Robert Ferry of Essex University for the DRNL program code.

Kalle Palomäki, Hynek Hermansky and Roger Moore for discussion.

The end