MULTIZONE REPRODUCTION OF SPEECH SOUNDFIELDS: A PERCEPTUALLY WEIGHTED APPROACH
Jacob Donley and Christian Ritz
School of Electrical, Computer and Telecommunications Engineering
ICT Research Institute & Global Challenges
University of Wollongong
Jan 10, 2017
Room
How can we perceptually enhance independent listening zones in a room?
Quiet Zone: No reproduced sound
Bright Zone: Listening to speech or music
Loudspeakers
Known as Multizone Reproduction of Soundfields
Aim: derive loudspeaker signals to reproduce the desired sound field in each zone
• Reproduced sound field modelled in the (discrete) space ($\mathbf{x}$), time ($n$), frequency ($k$) domain as:
$$S_w(\mathbf{x}, n, k) = \sum_{l=1}^{L} d_l(n, k, w)\,\frac{j}{4}\,H_0^{(1)}\!\left(k\,\lVert \mathbf{x}_l - \mathbf{x} \rVert\right)$$
• $H_m^{(1)}(\cdot)$ is the $m$th order Hankel function of the first kind
• $d_l(n, k, w)$ are the loudspeaker signals to be derived
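As an illustration of the summation above, the following is a minimal sketch that evaluates the reproduced field at one point for one frequency. It is not the authors' implementation. To stay within the Python standard library it replaces the zeroth-order Hankel function with its standard large-argument approximation, sqrt(2/(πz))·exp(j(z − π/4)), which is only reasonable when the point is several wavelengths from every loudspeaker. The function names, the speed of sound and the use of k as a wavenumber in rad/m are assumptions made for this example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def hankel0_first_kind_far_field(z):
    """Large-argument approximation of H_0^(1)(z); not the exact function."""
    return math.sqrt(2.0 / (math.pi * z)) * cmath.exp(1j * (z - math.pi / 4.0))


def reproduced_field(point, speaker_positions, speaker_weights, frequency_hz):
    """Sum the contribution of every loudspeaker at one point, one frequency."""
    k = 2.0 * math.pi * frequency_hz / SPEED_OF_SOUND  # wavenumber in rad/m
    total = 0j
    for (sx, sy), d_l in zip(speaker_positions, speaker_weights):
        distance = math.hypot(sx - point[0], sy - point[1])
        # (j/4) * H_0^(1)(k * distance) models one loudspeaker as a 2-D point source
        total += d_l * (1j / 4.0) * hankel0_first_kind_far_field(k * distance)
    return total


# Hypothetical example: two loudspeakers with unit weights, field at the origin.
field = reproduced_field((0.0, 0.0), [(1.5, 0.0), (-1.5, 0.0)], [1.0, 1.0], 1000.0)
print(abs(field))
```

Where an exact Hankel function is available (for example from a numerical library), it can be substituted for the approximation without changing the summation.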
[1] J. Donley and C. Ritz, “An efficient approach to dynamically weighted multizone wideband reproduction of speech soundfields,” Proc. IEEE ChinaSIP 2015, pp. 60–64, 12–15 July 2015.
[2] W. Jin, W. B. Kleijn, and D. Virette, “Multizone soundfield reproduction using orthogonal basis expansion,” Proc. IEEE ICASSP 2013, pp. 311–315.
Solution is based on a weighted orthogonal basis expansion approach [1,2]
http://bit.ly/WeightedMultizone
Weighting method controls leakage into quiet zone at cost of quality in bright zone
• Multizone Occlusion problem:
  • Quiet zone in-line with desired bright zone
  • Difficult to control leakage
• Trade-off:
  • Quality in Bright Zone vs Quietness in Quiet Zone
[Figure labels: Small weight; Large weight; weighted actual soundfield function over discrete space, time and frequency]
How quiet does the quiet zone need to be?
• Only need to suppress leakage in the quiet zone down to the threshold in quiet
• Possible only if the acoustic contrast between zones is large enough
Case 1: The Hearing Threshold
Speech
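To make Case 1 concrete, here is a small sketch of a threshold-in-quiet check. It uses Terhardt's closed-form approximation of the absolute hearing threshold, which is common in audio coding literature. Whether the authors used this exact curve is an assumption, and the function names are hypothetical.

```python
import math


def threshold_in_quiet_db_spl(frequency_hz):
    """Terhardt's approximation of the absolute hearing threshold in dB SPL."""
    f_khz = frequency_hz / 1000.0
    return (3.64 * f_khz ** -0.8
            - 6.5 * math.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 1e-3 * f_khz ** 4)


def leakage_is_inaudible(leaked_db_spl, frequency_hz):
    """True when leaked sound sits at or below the threshold in quiet."""
    return leaked_db_spl <= threshold_in_quiet_db_spl(frequency_hz)


# Hypothetical leaked level of 2 dB SPL at 1 kHz in the quiet zone.
print(threshold_in_quiet_db_spl(1000.0))
print(leakage_is_inaudible(2.0, 1000.0))
```

Leakage that already passes this check needs no further suppression, so a smaller weight can be used at that frequency.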
• Key idea: a masker in the quiet zone perceptually hides surrounding frequency components leaked from the bright zone
• Benefit: Less control via weighting needed – improve bright zone quality
Case 2: Spreading functions corresponding to local masking signal
[Figure legend: 2 kHz Masker; Speech]
• Max. SPL - small weight, high bright zone quality
• Min. SPL – large weight, low bright zone quality
• Leaked SPL – masker allowed to remain in quiet zone
Considering masking - reduces spatial error in the bright zone and SPL in quiet zone
Benefit: Perceptually optimised trade-off between quality and leakage
• Weights chosen by comparing reproduced speech with spreading functions
[Figure: speech compared against the spreading function and hearing threshold; spatial error $\epsilon_b(n, k)$ reduction]
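One way to picture the weight selection described on this slide is the sketch below. It is an illustrative assumption and not the authors' algorithm: for a single frequency bin it picks the smallest candidate weight whose leaked level stays at or below the masking threshold (the larger of the hearing threshold and the spreading function at that bin). The function name, the candidate weights and the leakage table are all hypothetical.

```python
def select_weight(candidate_weights, leaked_spl_for_weight, masking_threshold_db):
    """Return the smallest candidate weight whose leakage stays under the threshold.

    Falls back to the largest weight when no candidate is quiet enough.
    """
    for weight in sorted(candidate_weights):
        if leaked_spl_for_weight(weight) <= masking_threshold_db:
            return weight
    return max(candidate_weights)


# Hypothetical leaked SPL (dB) in the quiet zone for three weights at one bin.
leak_table = {1.0: 40.0, 100.0: 25.0, 10000.0: 10.0}
weights = list(leak_table)
print(select_weight(weights, leak_table.get, masking_threshold_db=30.0))  # 100.0
```

Preferring the smallest acceptable weight reflects the trade-off stated earlier: small weights favour bright zone quality, large weights favour quietness.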
Experimental evaluation to validate proposed perceptual approach
Multizone Setup:
• Full circle of 65 loudspeakers
• Loudspeaker array diameter: 3 m
• Zone diameters: 60 cm (enough space for a human head)
• Zone centres are 1.2 m apart
• Reproduction capable of wideband speech
• Direction of speech causes Multizone Occlusion Problem
= Hearing threshold & Spreading function (as used in audio coding standards)
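The geometry of the setup listed above can be checked with a few lines of arithmetic. This sketch places the 65 loudspeakers uniformly on the circle and derives the spacing and clearances. The uniform spacing and the symmetric placement of the two zones about the array centre are assumptions made for the example, and the names are hypothetical.

```python
import math

NUM_LOUDSPEAKERS = 65
ARRAY_RADIUS_M = 1.5     # 3 m array diameter
ZONE_RADIUS_M = 0.3      # 60 cm zone diameter
ZONE_SEPARATION_M = 1.2  # distance between zone centres


def loudspeaker_positions(count=NUM_LOUDSPEAKERS, radius=ARRAY_RADIUS_M):
    """Uniformly spaced (x, y) positions on a full circle (assumed layout)."""
    return [(radius * math.cos(2.0 * math.pi * l / count),
             radius * math.sin(2.0 * math.pi * l / count))
            for l in range(count)]


positions = loudspeaker_positions()
arc_spacing_m = 2.0 * math.pi * ARRAY_RADIUS_M / NUM_LOUDSPEAKERS

# Assumption: zone centres sit symmetrically about the array centre.
gap_between_zones_m = ZONE_SEPARATION_M - 2.0 * ZONE_RADIUS_M
margin_to_array_m = ARRAY_RADIUS_M - (ZONE_SEPARATION_M / 2.0 + ZONE_RADIUS_M)

print(f"arc spacing: {arc_spacing_m:.3f} m")
print(f"gap between zone edges: {gap_between_zones_m:.2f} m")
print(f"zone edge to array: {margin_to_array_m:.2f} m")
```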
Reduced bright zone error from psychoacoustic masking
• 10 dB improvement in MSE
• Still high quality speech in the bright zone
[Figure: Mean Squared Error (MSE); no masking (large weight) vs. with masking (variable weight)]
Reduced bright zone spatial error from psychoacoustic masking
Magnitude difference (A, B):
Phase difference (C, D):
Maximum spatial error reduction: 28 dB
Consequence of smaller weighting: less loudspeaker power (max. reduction = 65%)
Conclusion: Exploiting perceptual weighting within multizone soundfield reproduction results in significant advantages
• Improved error in bright zones with no perceptual cost in adjacent zones
  • MSE of speech: -69.8 dB to -80.3 dB (max)
  • Spatial error: -7.4 dB to -31.5 dB (max)
• Reduced loudspeaker power (up to 65%)
• Improved reproduction when occlusion problem is present
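The figures in the conclusion can be related to linear ratios with simple decibel arithmetic. The sketch below assumes the dB values are 10·log10 of power-like quantities and that the 65% figure is a direct reduction in loudspeaker power. Both are assumptions, and the helper names are hypothetical.

```python
import math


def db_to_power_ratio(db):
    """Convert a decibel difference to a linear power ratio."""
    return 10.0 ** (db / 10.0)


def power_ratio_to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)


mse_gain_db = -69.8 - (-80.3)                   # improvement in MSE of speech
spatial_gain_db = -7.4 - (-31.5)                # improvement in spatial error
power_change_db = power_ratio_to_db(1 - 0.65)   # 65% less loudspeaker power

print(f"MSE: {mse_gain_db:.1f} dB, factor {db_to_power_ratio(mse_gain_db):.1f}")
print(f"Spatial error: {spatial_gain_db:.1f} dB")
print(f"Loudspeaker power change: {power_change_db:.2f} dB")
```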
Questions?