Perceptual Audio Coding (PAC) is applied for storage and transmission of audio signals.
Perceptual transparency is achieved when the bitrate is high enough: the original and the coded/decoded signals are indistinguishable when listening in an optimal listening environment. At low bitrates, artifacts can be introduced and the sound quality is reduced. For example, the width of the stereo image is reduced due to a decreased difference signal (M/S coding) or an increased correlation between the channel signals (intensity stereo coding).
The aim is to apply post-processing to improve the sound quality. The processing is single-ended, i.e. it has no information about the coding (codec, bitrate). The criterion is pleasantness, not transparency.
Improve the perceived stereo image by applying artificial decorrelation to the background signal components.
Background sounds: ambient sounds, background music (radio broadcast) and musical accompaniment.
Foreground sounds: singers, talkers, soloists, loud instruments (drums).
Restricting the decorrelation to the background sounds maintains the timbral qualities without introducing coloration and artifacts. Decorrelation can impair the sound quality when applied to foreground sounds (e.g. speech, drums), and it is not required for foreground sounds (directional sounds are locatable).
The intensity of the decorrelation is controlled using a model of reverberance (a perceptual attribute that relates to the intensity of reverberation).
3. Ambient Sound Enhancement: Separation of the Background Sounds
The processing chain is: STFT; spectral weighting, i.e. scaling of the spectral coefficients; inverse STFT. The spectral weighting uses one set of spectral weights (for each time-frequency bin) to attenuate transient signal components and a second set to attenuate tonal signal components. The two sets are combined by taking the minimum of both.
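The combination step can be illustrated with a minimal sketch. It assumes the STFT coefficients and the two weight sets are already available as nested lists indexed `[frame][bin]`; the function names `combine_weights` and `apply_weights` and all numeric values are hypothetical and not taken from the presentation.

```python
def combine_weights(w_transient, w_tonal):
    """Combine two weight matrices [frame][bin] by the element-wise minimum."""
    return [[min(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(w_transient, w_tonal)]


def apply_weights(spectrum, weights):
    """Scale each complex STFT coefficient by its real-valued weight."""
    return [[x * g for x, g in zip(row_x, row_g)]
            for row_x, row_g in zip(spectrum, weights)]


# Toy example: one frame, three bins (made-up values for illustration).
spectrum = [[2 + 2j, 1j, 4 + 0j]]
w_transient = [[1.0, 0.2, 0.5]]   # assumed weights against transients
w_tonal = [[0.5, 0.9, 0.5]]       # assumed weights against tonal components

weights = combine_weights(w_transient, w_tonal)
background = apply_weights(spectrum, weights)
```

Taking the minimum means a bin is attenuated as soon as either criterion flags it, so only bins that are neither transient nor tonal pass with little attenuation.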
3. Ambient Sound Enhancement: Attenuation of Transient Signals
Signal model: in the STFT domain, with time frame index k and frequency bin index m, the input signal is an additive mixture of a transient signal component and a sustained signal component.
The transient signal is attenuated by spectral weighting.
The spectral weights are computed from estimates of the sustained signal and the transient signal.
The sustained signal magnitude is estimated by low-pass filtering the sub-band magnitudes along time and limiting the sustained signal estimate by the input magnitude.
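A minimal sketch of this estimation follows. The presentation does not specify the filter or the weighting rule, so several choices here are assumptions: a one-pole recursive low-pass with a hypothetical coefficient `alpha`, a transient estimate equal to the input magnitude minus the sustained estimate, and a weight equal to the ratio of sustained to input magnitude.

```python
def estimate_sustained(mag, alpha=0.9):
    """Estimate the sustained magnitude from magnitudes mag[frame][bin].

    Each sub-band is smoothed along time with a one-pole low-pass
    (assumed filter type) and then limited by the input magnitude.
    """
    sustained = []
    prev = list(mag[0])  # initialise the filter state with the first frame
    for frame in mag:
        cur = []
        for m, x in enumerate(frame):
            smoothed = alpha * prev[m] + (1.0 - alpha) * x
            cur.append(min(smoothed, x))  # sustained part cannot exceed the input
        sustained.append(cur)
        prev = cur  # design choice: feed back the limited value
    return sustained


def transient_weights(mag, sustained, eps=1e-12):
    """Assumed weighting rule: sustained / (sustained + transient) = sustained / input."""
    return [[s / (x + eps) for x, s in zip(row_x, row_s)]
            for row_x, row_s in zip(mag, sustained)]


# One sub-band with a short burst in the third frame (made-up values).
mag = [[1.0], [1.0], [10.0], [1.0]]
sustained = estimate_sustained(mag, alpha=0.5)
weights = transient_weights(mag, sustained)
```

In this toy input the burst frame receives a weight well below one, while the steady frames keep a weight close to one, which is the intended behaviour of attenuating transients and preserving sustained content.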
3. Ambient Sound Enhancement: Decorrelator Gain Control
The perceived level of decorrelation (and reverberation) depends on both the processing (impulse response) and the input signal. The effect intensity is lower for stationary input signals than for transient signals or frequency-modulated signals (e.g. speech).
The level of decorrelation is controlled using a model for the perceived intensity of decorrelation. It is a modified version of a model of reverberance (Uhle et al., 2011), which is based on a model for partial loudness (Moore et al., 1997).
Partial loudness difference = partial loudness of the decorrelated signal (masked by the dry input) - partial loudness of the dry input (masked by the decorrelated signal).
Extending the width of the stereo image by enhancing inter-channel level differences of direct sound components: 1. Stereo Mid/Side decomposition, 2. Boost the stereo side signal.
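The two steps can be sketched as follows. This is a simplified broadband, time-domain illustration: the presentation applies the enhancement to direct sound components, whereas this sketch boosts the whole side signal. The function name `widen_stereo`, the 0.5 scaling convention for Mid/Side, and the default `side_gain` value are assumptions.

```python
def widen_stereo(left, right, side_gain=1.5):
    """Mid/Side decomposition followed by a boost of the side signal.

    left, right: sequences of samples of equal length.
    side_gain:   hypothetical boost factor; 1.0 leaves the signal unchanged.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)    # sum signal
        side = 0.5 * (l - r)   # difference signal
        side *= side_gain      # boost the stereo side signal
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right


# First sample is panned hard left, second sample is identical in both channels.
wide_left, wide_right = widen_stereo([1.0, 0.5], [0.0, 0.5], side_gain=2.0)
```

Content that is identical in both channels has a zero side signal and is therefore unaffected, while the level difference between the channels grows for panned content.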
Listening test with multiple stimuli using loudspeakers. Conditions: coded signal without any post-processing, as known and hidden “reference”; Stereo Width Enhancement (SWE); Ambient Sound Enhancement (ASE); SWE + ASE.
5 test signals of length between 8 s and 30 s each, loudness normalized (ITU‐R BS.1770).
In perceptual audio coding, audible artifacts can be introduced when the bitrate is too low.
We have proposed a suite of algorithms, each designed to mitigate a common type of artifact.
Listening test: both methods achieved a significant improvement. The combination of both methods is rated higher than the methods in isolation (“slightly better”).
These tools can be used to implement a Low Bitrate Coding Enhancement system.
Future work: assessment of the performance obtained with a combination of all proposed enhancement tools (presented in Part 1 and Part 2).