2nd Workshop on Wideband Speech Quality - June 2005
Some Aspects of Wideband Speech in Enterprise Telephony
Eric J. Diethorn ([email protected]) with Gary W. Elko ([email protected]) and Joseph L. Hall ([email protected])
Avaya, Inc., Avaya Labs, Research
233 Mt. Airy Road, Basking Ridge, New Jersey 07920 USA
Some introductory thoughts
• Wideband speech telephony will instantaneously raise the bar of end-user expectation, at least for some applications.
    • Skype
• We have standards for the reproduction of wideband speech, but is wider-band good enough?
    • Maybe [150, 5000] Hz is good enough?
• With greater bandwidth comes a greater range of potential artifacts that the acoustical-signal-processing engineer must address.
    • Low-frequency acoustic echo, earpiece hiss, speech-coder distortion, arbitration of multiple sampling rates.
• The preferences of end users are uncertain.
    • Speech-bandwidth policies (buddy lists, profiles)?
    • Suppose I have a physiological speech impediment. Do I want it emphasized?
Physical acoustics
The physical design of terminal acoustics must change to render wideband speech.
Acoustical signal processing changes, too.
Loudspeakers & enclosures
[Figure: frequency response of a traditional narrowband speakerphone, 80 dB-SPL, 50 cm. Axes: frequency (Hz) vs. sound pressure level (dB).]
Loudspeakers & enclosures
[Figure: total harmonic distortion of a traditional narrowband speakerphone, 80 dB-SPL, 50 cm. Axes: frequency (Hz) vs. THD at harmonics (%).]
• High distortion at the low-frequency end of the wideband-speech spectrum.
• Acoustic echo control is difficult, if not impossible, without acoustical modifications.
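As a rough illustration of how a total-harmonic-distortion figure can be computed from a recorded test tone, the sketch below correlates a signal against its fundamental and its first few harmonics. The function names, the sample rate and the synthetic test signal are assumptions made for illustration; this is not the measurement procedure used for the slide.

```python
import math

def tone_amplitude(x, fs, freq):
    """Amplitude of the sinusoidal component of x at freq (single-bin DFT)."""
    n = len(x)
    re = sum(v * math.cos(2 * math.pi * freq * i / fs) for i, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq * i / fs) for i, v in enumerate(x))
    return 2.0 * math.hypot(re, im) / n

def thd_percent(x, fs, f0, n_harmonics=5):
    """THD in percent: root-sum-square of harmonics 2..n_harmonics over the fundamental."""
    fundamental = tone_amplitude(x, fs, f0)
    harmonics = [tone_amplitude(x, fs, k * f0) for k in range(2, n_harmonics + 1)]
    return 100.0 * math.sqrt(sum(a * a for a in harmonics)) / fundamental

# Synthetic example (assumed values): a 100 Hz tone with a 10% second harmonic.
fs = 16000
f0 = 100.0
n = 1600  # whole number of periods of every harmonic, so there is no leakage
x = [math.sin(2 * math.pi * f0 * i / fs) + 0.1 * math.sin(2 * math.pi * 2 * f0 * i / fs)
     for i in range(n)]
thd = thd_percent(x, fs, f0)
print(f"THD = {thd:.2f} %")
```

With a real loudspeaker measurement, x would be the microphone recording of the driven tone, and the analysis length would need to be chosen (or windowed) so that the harmonics fall on analysis frequencies.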
Earpieces
[Figure: frequency response of a wideband handset. Axes: frequency (Hz) vs. sound pressure level (dB).]
• In order to satisfy wideband standards, acoustical modifications are necessary to extend the low-frequency response of most earpiece designs.
• This is particularly challenging for physical arrangements in which the earpiece is held to the ear with little pressure.
Microphones
• Most low-cost electret microphones used today have a frequency response that is practically flat beyond the range of wideband speech; they are “wideband ready.”
• Multiple-microphone arrangements (arrays) can be exploited to reduce the level of ambient noise at frequencies not present in traditional narrowband telephony.
    • Low-frequency rumble.
    • High-frequency hiss.
• Short-time spectral modification methods of noise reduction can help, but the perception of artifacts from such processing is enhanced by the wider speech band.
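One common member of the short-time spectral modification family is power spectral subtraction. The sketch below shows only its per-bin gain rule, assuming a noise power estimate is already available for each frequency bin. The function name, the gain floor and the example values are hypothetical choices for illustration, not a method described in the slides.

```python
import math

def spectral_subtraction_gains(noisy_power, noise_power, floor=0.1):
    """Per-bin amplitude gains for power spectral subtraction.

    noisy_power: short-time power of the noisy speech in each frequency bin
    noise_power: estimated noise power in each bin (assumed known here)
    floor:       minimum gain, limiting how deeply any bin is attenuated
    """
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        if p_noisy <= 0.0:
            gains.append(floor)
            continue
        speech_fraction = 1.0 - p_noise / p_noisy   # share of power attributed to speech
        gain = math.sqrt(max(speech_fraction, 0.0))  # convert power ratio to amplitude gain
        gains.append(max(gain, floor))
    return gains

# Assumed example: one speech-dominated bin and two noise-dominated bins.
gains = spectral_subtraction_gains([4.0, 1.0, 0.5], [1.0, 1.0, 1.0])
print(gains)
```

In noise-dominated bins the gain jumps between the floor and larger values from frame to frame, which is one commonly cited source of audible processing artifacts; a wider speech band simply contains more such bins.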
Microphones
Omnidirectional microphone (traditional)
• Good pick-up of talkers in all directions.
• But picks up ambient noise from all directions.
Directional microphone
• Reduces off-axis noise.
• Reduces reverberation of talker’s voice.
• Reduces coupling from speakerphone (helping AEC).
• But talkers off axis can’t be heard well.
[Figure: two diagrams, each labeled “Front of phone”.]
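The trade-off between the two microphone types can be made concrete with an idealized first-order directivity pattern. The sketch below assumes the textbook form alpha + (1 - alpha) cos(theta), where alpha = 1 is omnidirectional and alpha = 0.5 is a cardioid; the slides do not state which directional pattern was used, so the cardioid and the function names are assumptions.

```python
import math

def first_order_response(theta_deg, alpha=0.5):
    """Magnitude response of an idealized first-order microphone pattern.

    alpha = 1.0 gives an omnidirectional pattern, alpha = 0.5 a cardioid.
    theta_deg is the arrival angle relative to the microphone axis.
    """
    theta = math.radians(theta_deg)
    return abs(alpha + (1.0 - alpha) * math.cos(theta))

def response_db(theta_deg, alpha=0.5, floor_db=-60.0):
    """Response in dB, clamped at floor_db so a perfect null stays finite."""
    r = first_order_response(theta_deg, alpha)
    if r <= 0.0:
        return floor_db
    return max(20.0 * math.log10(r), floor_db)

# Compare the omnidirectional and cardioid patterns on axis, at the side, and behind.
for angle in (0, 90, 180):
    omni = response_db(angle, alpha=1.0)
    cardioid = response_db(angle, alpha=0.5)
    print(f"{angle:3d} deg  omni {omni:6.1f} dB  cardioid {cardioid:6.1f} dB")
```

The same attenuation that suppresses off-axis noise and loudspeaker coupling also applies to an off-axis talker, which is the drawback noted in the last bullet.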
Echo
• Requirements on echo control may change.
• The art of echo control must evolve to meet the challenge of wideband speech.
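For readers unfamiliar with acoustic echo cancellation, the sketch below shows a normalized LMS (NLMS) adaptive filter, one widely used building block for echo control. It is a toy, single-channel example with an assumed four-tap echo path and white-noise excitation; the function name and all parameter values are illustrative assumptions, not the echo controller discussed in the talk.

```python
import random

def nlms_echo_canceller(far_end, mic, n_taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of far_end from mic (NLMS).

    Returns (residual, weights): the microphone signal after echo subtraction
    and the final estimate of the echo-path impulse response.
    """
    w = [0.0] * n_taps
    buf = [0.0] * n_taps                 # recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # echo estimate
        e = d - y                                     # residual after subtraction
        norm = sum(b * b for b in buf) + eps          # input power normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return residual, w

# Toy example (assumed values): fixed short echo path, white-noise far-end signal.
random.seed(0)
echo_path = [0.5, -0.3, 0.2, 0.1]
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
mic = [sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far_end))]
residual, weights = nlms_echo_canceller(far_end, mic)
print([round(wi, 3) for wi in weights])
```

Real acoustic echo paths are far longer than eight taps, and at a wideband sampling rate the same echo duration needs proportionally more taps than at the narrowband rate, which raises the computational cost.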
Requirements on Talker Echo
Source: Transmission Systems for Communications, Bell Telephone Laboratories, Inc., 5th Edition, 1982.
Roundtrip, mouth-to-ear, echo loss requirements were measured on populations for narrowband speech. How well do these data apply to wideband speech echo paths?
[Figure: echo annoyance as a function of roundtrip, mouth-to-ear loss and delay, for narrowband speech. Axes: acoustic-to-acoustic echo-path loss (dB) vs. percent good-or-better.]
Talker Echo, Continued
Source: Transmission Systems for Communications, Bell Telephone Laboratories, Inc., 5th Edition, 1982.
Being strictly digital, wideband-speech network paths do not suffer from analog circuit noises; however, analog and environmental noises enter calls at the endpoint. Should requirements on talker echo incorporate such (wideband) noise phenomena?
Echo annoyance as a function of roundtrip, mouth-to-ear echo-and-noise loss. Long-haul (~1000 mi.) PSTN connection, circa 1980.
Wideband speech coding
• G.722, G.722.1 and G.722.2
    • G.722 is cheap.
    • G.722.1 often comes with video-on-the-enterprise (Polycom).
• Proprietary codecs
    • Silicon solution providers have their favorites. Some are pretty good.
• Linear 16-bit encoding?
    • Speech-transmission bandwidth (bits-per-second) is becoming a non-issue in the enterprise, at least for wired LANs.
    • Architecturally appealing within the enterprise. Let boundary gateways worry about transcoding.
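A quick back-of-the-envelope calculation shows what linear 16-bit encoding costs in payload bit rate. The sketch assumes the usual 16 kHz wideband sampling rate (the slide states only the 16-bit word length) and compares against the highest G.722 and G.722.1 rates of 64 kbit/s and 32 kbit/s; packet headers are ignored, and the names are illustrative.

```python
def linear_pcm_bitrate(sample_rate_hz, bits_per_sample, channels=1):
    """Payload bit rate of uncompressed linear PCM, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

WIDEBAND_SAMPLE_RATE_HZ = 16000   # assumption: typical wideband sampling rate
pcm_bps = linear_pcm_bitrate(WIDEBAND_SAMPLE_RATE_HZ, 16)

codec_bps = {
    "G.722 (highest mode)": 64000,
    "G.722.1 (highest mode)": 32000,
}

print(f"Linear 16-bit PCM: {pcm_bps / 1000:.0f} kbit/s")
for name, bps in codec_bps.items():
    print(f"{name}: {bps / 1000:.0f} kbit/s, PCM is {pcm_bps / bps:.0f}x larger")

ratio = pcm_bps / codec_bps["G.722 (highest mode)"]
```

Even at several times the coded rate, a single uncompressed wideband stream is a small fraction of a wired LAN link, which is consistent with the slide's remark that bandwidth is becoming a non-issue there.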
Stereo audio conferencing
Hands-free, wideband-speech communications with stereo echo cancellation
[Figure: block diagram of stereo echo cancellation between ROOM 1 and ROOM 2. Labels: talker, echo, g1, g2, h1, h2, h1~, h2~, NL.]
Stereo Conferencing
(Placeholder, video demonstration)
Wideband speech & intelligibility
Siemens – “…wideband transmissions can reduce speech ambiguities by as much as 90 percent, increasing conversational intelligibility and reducing listener fatigue.” (2003 press release)
Polycom – “For single syllables, 3.3 kHz bandwidth yields an accuracy of only 75 percent, as opposed to over 95 percent with 7 kHz bandwidth.” (2003 white paper)
Marketing vs. science – both required
Experimental study*
• Similar to the Diagnostic Rhyme Test and Diagnostic Alliteration Test, except we generated our own word pairs, e.g., “tie” & “pie” (“hot” & “hop”).
• Subject hears one of the two, is shown both, and is asked “Which of these two did you hear?”
• Clean anechoic speech filtered to three bandwidths: [50, 3300], [50, 5000] and [50, 7000] Hz.
• Investigate all nine combinations of three bandwidths and three additive-noise levels (0 dB, +12 dB, +24 dB SNR).
Reference: G.A. Miller and P.E. Nicely, “An analysis of perceptual confusions among some English consonants,” Lincoln Laboratory, MIT, 1955 (J. Acoust. Soc. Amer., Vol. 27, pp. 338-352)
* For questions concerning aspects of this study, contact Joseph L. Hall, Avaya Research, [email protected]
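The nine test conditions and the noise scaling can be sketched as follows. The band-limiting filter is omitted, and the slides do not say whether noise was added before or after filtering, so this only enumerates the conditions and shows how noise can be scaled to a target SNR. The stand-in signals and all names are assumptions for illustration.

```python
import itertools
import math
import random

BANDWIDTHS_HZ = [(50, 3300), (50, 5000), (50, 7000)]
SNRS_DB = [0, 12, 24]

def mean_power(x):
    """Average power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def add_noise_at_snr(speech, noise, snr_db):
    """Scale noise so that speech power over noise power equals snr_db, then add it."""
    noise_power_target = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(noise_power_target / mean_power(noise))
    return [s + scale * n for s, n in zip(speech, noise)]

# All nine bandwidth/SNR combinations.
conditions = list(itertools.product(BANDWIDTHS_HZ, SNRS_DB))
for (low_hz, high_hz), snr_db in conditions:
    print(f"[{low_hz}, {high_hz}] Hz at {snr_db:+d} dB SNR")

# Stand-in signals (assumed): a sine in place of speech, Gaussian noise.
random.seed(1)
speech = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(1600)]
noise = [random.gauss(0.0, 1.0) for _ in range(1600)]
mixed = add_noise_at_snr(speech, noise, 12)
```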
What do they sound like?
“Seed, feed, seed” at different bandwidths and additive noise levels.
[Audio demonstration grid. Columns: 3.3 kHz LP, 5 kHz LP, 7 kHz LP. Rows: clean, 24 dB SNR, 12 dB SNR, 0 dB SNR.]
Representative results
[Figure: Prob(CORRECT), from 0.5 to 1.0, vs. cutoff frequency (3.3 kHz, 5 kHz, 7 kHz) for SNR = 24 dB, 12 dB and 0 dB, with 90% CI. Confusion shown: s (e.g. six) mistaken for f (e.g. fix).]
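A 90% confidence interval on a proportion of correct responses can be computed in several ways; the slides do not say which was used. The sketch below uses the Wilson score interval as one reasonable choice, with hypothetical trial counts. Because each trial is a choice between two words, guessing alone would give a proportion near 0.5.

```python
import math

def wilson_interval(correct, trials, z=1.645):
    """Wilson score interval for a binomial proportion.

    z = 1.645 corresponds to roughly 90% two-sided coverage.
    Returns (lower, upper).
    """
    p = correct / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    centre = (p + z2 / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z2 / (4.0 * trials * trials)) / denom
    return centre - half, centre + half

# Hypothetical counts, for illustration only: 45 correct out of 50 trials.
lower, upper = wilson_interval(45, 50)
print(f"Prob(correct) = {45 / 50:.2f}, 90% CI = [{lower:.3f}, {upper:.3f}]")
```

An interval whose lower bound stays above 0.5 indicates performance distinguishable from guessing at that confidence level.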