Source Localization in Complex Listening Situations: Selection of Binaural Cues Based on Interaural Coherence
Christof Faller, Mobile Terminals Division, Agere Systems
Juha Merimaa, Institut für Kommunikationsakustik, Ruhr-Universität Bochum

Dec 17, 2015

Page 1

Source Localization in Complex Listening Situations:

Selection of Binaural Cues Based on Interaural Coherence

Christof Faller
Mobile Terminals Division, Agere Systems

Juha Merimaa
Institut für Kommunikationsakustik, Ruhr-Universität Bochum

Page 2

Complex listening situations

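[Figure: a listener surrounded by three concurrent sources, labeled "Blaah, blaah, blaah" (speech), "Jazz" (music), and "Hum" (noise)]

Speech source at -15º, good music at 50º, and noise through an open door at -125º azimuth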

Page 3

This work

• A model to extract binaural cues corresponding to human localization performance in several complex listening situations

Page 4

Outline

1. Model description

2. Simulation results
   A) Independent sources in free-field
   B) Precedence effect
   C) Independent sources and reverberation

3. Comparison with earlier models

4. Summary

Page 5

[Figure: block diagram of the model. Labeled blocks: Stimulus 1 to Stimulus N; HRTF/BRIR 1 to HRTF/BRIR N (one set for each ear); left ear input and right ear input; gammatone filterbank; internal noise; model of neural transduction; normalized cross-correlation & level difference calculation; exponential time window 10 ms; Bernstein et al. 1999]

Page 6

Extraction of binaural cues

• Estimated at each time instant:
  – Interaural Time Difference (ITD)
    • Time lag of the maximum of the normalized cross-correlation
  – Interaural Level Difference (ILD)
    • Ratio of signal energies within time window
  – Interaural coherence (IC)
    • Maximum of the normalized cross-correlation
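
The three cue definitions above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `binaural_cues`, the 1 ms lag range, the sign conventions (positive ITD when the left ear leads, ILD as right-ear over left-ear energy in dB), and the evaluation of the ILD at the lag of the cross-correlation maximum are all assumptions. The inputs stand for one critical band after the peripheral stages, which are not modeled here. The 10 ms exponential window is taken from the model diagram.

```python
import math
import random


def binaural_cues(left, right, fs, max_itd_s=1e-3, window_s=0.010):
    """Return (itd_s, ild_db, ic) evaluated at the last sample of the ear signals.

    Running averages use an exponentially decaying window (time constant
    window_s) that looks back from the last sample.
    """
    n_last = len(left) - 1
    max_lag = int(round(max_itd_s * fs))
    decay = 1.0 / (window_s * fs)
    best = None
    for lag in range(-max_lag, max_lag + 1):
        # A positive lag delays the left signal, a negative lag the right one.
        d_left, d_right = max(lag, 0), max(-lag, 0)
        a12 = a11 = a22 = 0.0
        for n in range(max_lag, n_last + 1):
            w = math.exp(-(n_last - n) * decay)
            x1 = left[n - d_left]
            x2 = right[n - d_right]
            a12 += w * x1 * x2
            a11 += w * x1 * x1
            a22 += w * x2 * x2
        if a11 <= 0.0 or a22 <= 0.0:
            continue
        gamma = a12 / math.sqrt(a11 * a22)  # normalized cross-correlation
        if best is None or gamma > best[0]:
            best = (gamma, lag, a11, a22)
    if best is None:
        raise ValueError("signals carry no energy inside the window")
    ic, lag, energy_left, energy_right = best
    itd_s = lag / fs
    ild_db = 10.0 * math.log10(energy_right / energy_left)
    return itd_s, ild_db, ic


# Usage: the right ear receives the left-ear noise 5 samples later at half amplitude.
rng = random.Random(0)
fs = 16000
left = [rng.gauss(0.0, 1.0) for _ in range(2000)]
right = [0.0] * 5 + [0.5 * v for v in left[:-5]]
itd_s, ild_db, ic = binaural_cues(left, right, fs)
```

For this single-source input the sketch returns an ITD of 5 samples (about 0.31 ms), an ILD of about -6 dB, and an IC of essentially 1.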

Page 7

Assumption for correct localization

• The auditory system needs to acquire ITD and ILD cues similar to those evoked by each source separately in an anechoic environment

Page 8

Example: Two active sound sources

• Superposition with different level and phase relations at left and right ears

• For independent or non-stationary source signals:
  – Time-varying binaural cues
  – Reduced IC
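
The drop in IC under superposition can be illustrated with a small sketch. It assumes two independent white-noise sources of equal power with opposite ITDs of 5 samples and no level differences, which is a simplification that is not taken from the slides. The helper names `max_normalized_xcorr` and `delayed` are hypothetical, and no time window or peripheral model is applied.

```python
import math
import random


def max_normalized_xcorr(left, right, max_lag):
    """Maximum of the normalized cross-correlation over lags in [-max_lag, max_lag]."""
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        d_left, d_right = max(lag, 0), max(-lag, 0)
        a12 = a11 = a22 = 0.0
        for n in range(max_lag, len(left)):
            x1, x2 = left[n - d_left], right[n - d_right]
            a12 += x1 * x2
            a11 += x1 * x1
            a22 += x2 * x2
        if a11 > 0.0 and a22 > 0.0:
            best = max(best, a12 / math.sqrt(a11 * a22))
    return best


def delayed(signal, d):
    """Delay a signal by d samples, padding the start with zeros."""
    return [0.0] * d + signal[:len(signal) - d]


rng = random.Random(1)
s1 = [rng.gauss(0.0, 1.0) for _ in range(4000)]
s2 = [rng.gauss(0.0, 1.0) for _ in range(4000)]
d = 5  # assumed ITD magnitude in samples

# One source: the right-ear signal is a delayed copy of the left-ear signal.
ic_one = max_normalized_xcorr(s1, delayed(s1, d), max_lag=10)

# Two sources with opposite ITDs, superimposed at each ear.
left = [a + b for a, b in zip(s1, delayed(s2, d))]
right = [a + b for a, b in zip(delayed(s1, d), s2)]
ic_two = max_normalized_xcorr(left, right, max_lag=10)
```

With one source the IC is essentially 1. With two equally strong independent sources no single lag aligns both of them, so the maximum of the normalized cross-correlation falls to roughly one half in this idealized setting.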

Page 9

How to obtain correct localization cues?

• Simply select ITDs and ILDs only when IC is above a set threshold
  – An adaptive threshold is assumed
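
The selection rule above can be sketched as follows. This is a minimal illustration that uses a fixed threshold `c0`, whereas the slide assumes an adaptive one. The function name `select_cues` and the per-instant `powers` input are hypothetical, and `p0` is read here as the share of signal power at the selected time instants, which is an assumption based on the "selected signal power" figures quoted with the results.

```python
def select_cues(itds, ilds, ics, powers, c0):
    """Keep (ITD, ILD) pairs whose IC exceeds c0.

    Returns the selected pairs and p0, the fraction of the total signal
    power that falls on the selected time instants.
    """
    selected = [(itd, ild) for itd, ild, ic in zip(itds, ilds, ics) if ic > c0]
    total_power = sum(powers)
    selected_power = sum(p for p, ic in zip(powers, ics) if ic > c0)
    p0 = selected_power / total_power if total_power > 0 else 0.0
    return selected, p0


# Usage with made-up values for four time instants.
itds = [400e-6, -100e-6, 390e-6, 50e-6]
ilds = [3.0, -1.0, 3.2, 0.5]
ics = [0.995, 0.90, 0.999, 0.50]
powers = [1.0, 2.0, 1.0, 4.0]
selected, p0 = select_cues(itds, ilds, ics, powers, c0=0.99)
```

Only the first and third time instants pass the threshold here, so two cue pairs are kept and a quarter of the total power is selected.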

Page 10

Simulation results

Page 11

1. Effect of number of sources

• Speech sources at same overall level (Hawley et al. 1999; Drullman & Bronkhorst 2000)
  – One or two distracters have little effect on localization performance
  – Performance is still good for 5 competing sources

• Simulations with different phonetically balanced sentences recorded by the same male speaker

Page 12

Two talkers, ±40º azimuth

• 65 and 58 % selected signal power

Page 13

3 and 5 talkers

• Simulated at 500 Hz critical band

• 3 talkers: 0º and ±40º azimuth

• 5 talkers: 0º, ±40º, and ±80º azimuth

Page 14

[Figure: all cues vs. selected cues. 3 talkers: c0 = 0.99, p0 = 54 %. 5 talkers: c0 = 0.99, p0 = 22 %.]

Page 15

2. Effect of target-to-distracter ratio

• Click-train target in presence of a white noise distracter
  – Target is localizable down to a few dB above detection threshold (Good & Gilkey 1996; Good et al. 1997)
  – High frequencies are more important for localization (Lorenzi et al. 1999)

Page 16

Simulation

• 2 kHz critical band

• White noise at 0º azimuth

• 100 Hz click train at 30º azimuth

• -3, -9, and -21 dB absolute target-to-distracter ratios (T/D)
  – Corresponds to 8, 2, and -10 dB T/D relative to detection threshold, as defined by Good & Gilkey (1996)
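
A stimulus mix at a given T/D can be sketched as below. The sketch assumes that T/D is a ratio of mean signal powers, which may differ in detail from the definition used by Good & Gilkey (1996); the helper names are hypothetical. The detection threshold of -11 dB is not stated on the slide but follows from its value pairs (-3 → 8, -9 → 2, -21 → -10).

```python
import math
import random


def mean_power(signal):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in signal) / len(signal)


def scale_to_td(target, distracter, td_db):
    """Scale the target so that 10*log10(P_target / P_distracter) equals td_db."""
    current_db = 10.0 * math.log10(mean_power(target) / mean_power(distracter))
    gain = 10.0 ** ((td_db - current_db) / 20.0)
    return [gain * v for v in target]


def relative_td(absolute_td_db, detection_threshold_db):
    """T/D expressed relative to the detection threshold."""
    return absolute_td_db - detection_threshold_db


# Threshold implied by the slide's absolute and relative T/D values.
IMPLIED_THRESHOLD_DB = -11.0

# Usage: a crude click train (one unit sample every 4 samples) in white noise.
rng = random.Random(2)
clicks = [1.0, 0.0, 0.0, 0.0] * 100
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]
scaled_clicks = scale_to_td(clicks, noise, td_db=-9.0)
achieved_db = 10.0 * math.log10(mean_power(scaled_clicks) / mean_power(noise))
relative = [relative_td(td, IMPLIED_THRESHOLD_DB) for td in (-3.0, -9.0, -21.0)]
```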

Page 17

[Figure: all cues vs. selected cues. -3 dB T/D: c0 = 0.990, p0 = 3 %. -9 dB T/D: c0 = 0.992, p0 = 9 %. -21 dB T/D: c0 = 0.992, p0 = 99 %.]

Page 18

Precedence effect

• Perception of subsequent sound events
  – Fusion
  – Localization dominance by the first event
  – Suppression of directional discrimination of later events

• Depends on interstimulus delay
  – Summing localization (approx. 0-1 ms)
  – Localization dominance by first event (stimulus dependent, until 2-50 ms)
  – Independent localization

Page 19

1. Click pairs

• Classical precedence effect experiment: Two consecutive clicks with same level from different directions

Page 20

Lead: 40º, lag: -40º, ICI: 5 ms

Page 21

Click pairs as a function of inter-click interval (ICI)

• Simulations for ICI between 0 and 20 ms

• Same click sources: ±40º azimuth

• 500 Hz critical band

• A single threshold did not predict all cases correctly
  – Threshold was determined for each ICI such that the standard deviation of ITD is 15 μs
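
One way to determine such a per-ICI threshold is sketched below. The slides do not describe the search procedure, so this is only an assumption: the sketch scans the observed IC values in ascending order and returns the lowest threshold at which the standard deviation of the selected ITDs does not exceed the 15 μs target. The function name `threshold_for_itd_std` is hypothetical.

```python
import statistics


def threshold_for_itd_std(itds, ics, target_std_s=15e-6):
    """Lowest IC threshold whose selected ITDs have a standard deviation
    of at most target_std_s. Returns None if no threshold qualifies."""
    for c0 in sorted(set(ics)):
        selected = [itd for itd, ic in zip(itds, ics) if ic >= c0]
        if len(selected) >= 2 and statistics.pstdev(selected) <= target_std_s:
            return c0
    return None


# Usage with made-up cues: three coherent estimates near 400 microseconds
# and two low-coherence outliers.
itds = [400e-6, 410e-6, 390e-6, -400e-6, 0.0]
ics = [0.999, 0.998, 0.997, 0.90, 0.95]
c0 = threshold_for_itd_std(itds, ics)
```

In this example the threshold has to rise to 0.997 before both outliers are excluded and the remaining ITDs spread by less than 15 μs.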

Page 22

Click pairs as a function of ICI

Page 23

Click pairs as a function of ICI

Page 24

Note on cross-frequency processing

• At certain small ICIs the required IC threshold gets very high
  – Anomalies of precedence effect have been reported for bandpass filtered clicks (Blauert & Cobben 1978)

• Some characteristic power peaks occur at different ICIs at different critical bands

• Processing across frequency bands would allow extraction of correct cues

Page 25

2. Sinusoidal tones and a reflection

• Steady state cues are a result of coherent summation of sound at the ears of a listener

• Localization depends on onset rate (Rakerd & Hartmann 1986)
  – Correct localization with a fast onset
  – Localization based on misleading steady state cues for tones with a slow onset

Page 26

Sinusoidal tones: Simulation

• 500 Hz sinusoidal tone

• Direct sound from 0º azimuth

• Reflection after 1.4 ms from 30º

• Linear onset ramp

• Steady state level of 65 dB SPL
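
Why the steady-state cues of such a stimulus can mislead is shown by a phasor sketch. The 500 Hz tone and the 1.4 ms reflection delay come from the slide; everything else is an assumption made for illustration: the reflection has the same amplitude as the direct sound, it reaches the far ear 250 μs after the near ear (a hypothetical ITD for a 30º source), and head shadowing is ignored. The function name `steady_state_cues` is hypothetical.

```python
import cmath
import math


def steady_state_cues(freq_hz, refl_delay_s, refl_itd_s, refl_gain=1.0):
    """Steady-state interaural cues of a frontal tone plus one reflection.

    Returns (itd_s, ild_db): the time lead of the ear nearer to the
    reflection over the far ear, and the near-ear level relative to the
    far-ear level in dB.
    """
    omega = 2.0 * math.pi * freq_hz
    # Direct sound from 0 degrees: identical unit phasor at both ears.
    near = 1.0 + refl_gain * cmath.exp(-1j * omega * refl_delay_s)
    far = 1.0 + refl_gain * cmath.exp(-1j * omega * (refl_delay_s + refl_itd_s))
    ipd = cmath.phase(near / far)  # interaural phase difference in radians
    itd_s = ipd / omega
    ild_db = 20.0 * math.log10(abs(near) / abs(far))
    return itd_s, ild_db


itd_s, ild_db = steady_state_cues(500.0, 1.4e-3, 0.25e-3)
```

Under these assumptions the coherent sum yields a steady-state ITD of 125 μs and an ILD of about -3.2 dB. Neither value matches the direct sound (0 μs, 0 dB), and the ILD even points away from the side of the reflection, which is the kind of misleading steady-state cue the slides refer to.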

Page 27

Page 28

Sinusoidal tones: Results

• The model cannot as such explain discounting of the steady state cues

• Dependence on onset rate can be explained by considering cues at the time when signal level gets high enough above internal noise

Page 29

Independent sources and reverberation

• Final test for the model

• Simulation at 2 kHz critical band
  – One speech source at 30º azimuth
  – Two speech sources at ±30º azimuth

• BRIRs measured in a hall with RT = 1.4 s at 2 kHz octave band

Page 30

[Figure: all cues vs. selected cues. 1 talker: c0 = 0.99, p0 = 1 %. 2 talkers: c0 = 0.99, p0 = 1 %.]

Page 31

Comparison with earlier models

Page 32

Weighting of localization cues with signal power

• Not done outside 10 ms analysis window

• Contribution of each time instant to localization is defined by IC

• Model can neglect information corresponding to high power when it is due to concurrent activity of several sources

• Power still affects how often ITDs and ILDs of individual sources are sampled

Page 33

Lindemann (1986)

• Based on contralateral inhibition using a fixed (10 ms) time constant

• Tends to hold cross-correlation peaks with high IC

• Differences
  – Operation of the cue selection method is not limited to the 10 ms time window
  – When necessary (complex situations), the “memory” of past cues can last longer

Page 34

Zurek (1987)

• Localization inhibition controlled by onset detection

• In precedence effect conditions, the cue selection naturally derives most localization cues from onsets

• Differences
  – Cue selection is not limited to getting information from signal onsets

Page 35

Summary

• A method was proposed for modeling auditory localization in the presence of concurrent sound

• ITD and ILD cues are selected only when they coincide with a large IC

• Operation of the model was verified with results of several psychoacoustical studies from the literature

Page 36

Thank you!