AUDIO COMPRESSION TOOLS & TECHNIQUES Gautam Bhattacharya

AUDIO COMPRESSION

TOOLS & TECHNIQUES

Gautam Bhattacharya

CD Quality

CD Audio:

• 2 channels (stereo)

• 16-bit encoding

• 44.1 kHz sampling rate

Data Rate: This leads to a data rate of 1.4 - 1.54 Mbps
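The lower end of that range follows directly from the three CD parameters. A minimal Python sketch of the arithmetic (the function name `pcm_bit_rate` is illustrative; the slide does not say how the upper figure of 1.54 is obtained, so it is not reproduced here):

```python
# Raw PCM data rate for CD audio, from the parameters on this slide.
CHANNELS = 2              # stereo
BITS_PER_SAMPLE = 16      # 16-bit encoding
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz sampling rate

def pcm_bit_rate(channels: int, bits: int, rate_hz: int) -> int:
    """Return the uncompressed PCM bit rate in bits per second."""
    return channels * bits * rate_hz

cd_bps = pcm_bit_rate(CHANNELS, BITS_PER_SAMPLE, SAMPLE_RATE_HZ)
print(f"{cd_bps} bit/s = {cd_bps / 1e6:.4f} Mbps")  # 1411200 bit/s = 1.4112 Mbps
```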

AUDIO ENCODER!

Bit rate: as low as 1 bit per sample or less

Based on a Perceptual model

Capable of high fidelity audio

Lossy or Lossless?

‘CD Quality’

Many perceptual tests were conducted to verify the quality of the audio output

Takes advantage of perceptual irrelevancies as well as statistical redundancies.

Moving Picture Experts Group

MPEG

MPEG is a family of encoding standards for digital multimedia information

• MPEG-1: a standard for storage and retrieval of moving pictures and audio on storage media (e.g., CD-ROM).

  • Layer I

  • Layer II

  • Layer III (aka MP3)

• MPEG-2: a standard for digital television, including high-definition television (HDTV), and for addressing multimedia applications.

  • Advanced Audio Coding (AAC)

• MPEG-4: a standard for multimedia applications, with very low bit-rate audio-visual compression for those channels with very limited bandwidths (e.g., wireless channels).

• MPEG-7: a content representation standard for information search

Back to the Encoder!

Figure: Generic audio encoder architecture (Painter, T. & Spanias, A., Perceptual Coding of Digital Audio, Proceedings of the IEEE, 2000)

Psychoacoustic Model

Critical Listening Threshold

The absolute threshold of hearing is defined as the amount of energy needed in a pure tone such that it can be detected by a listener in a noiseless environment.

This criterion assumes that the volume control on the decoder is set so that the smallest possible output signal is presented at 0 dB SPL.
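The threshold curve is often approximated by a closed-form expression, given in the Painter & Spanias paper cited in this deck and attributed there to Terhardt. A minimal Python sketch of that approximation (the function name is illustrative, and the formula describes an average young listener, not any particular one):

```python
import math

def threshold_in_quiet_db_spl(freq_hz: float) -> float:
    """Approximate absolute threshold of hearing in dB SPL at freq_hz."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

# The curve is high at low frequencies, dips around 3-4 kHz where the
# ear is most sensitive, and rises steeply towards the top of the range.
for f in (100, 1000, 3300, 10000, 16000):
    print(f"{f:6d} Hz: {threshold_in_quiet_db_spl(f):6.1f} dB SPL")
```

Any signal component, or any quantization noise, that stays below this curve is inaudible even with no masker present.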

Psychoacoustic Model

Figure: Absolute threshold of hearing (Painter, T. & Spanias, A., Perceptual Coding of Digital Audio, Proceedings of the IEEE, 2000, Vol. 88, No. 4)

Psychoacoustic Model

Critical Bands

The ear has a limited frequency selectivity that varies in acuity from less than 100 Hz for the lowest audible frequencies to more than 4 kHz for the highest.

As a result, the audible spectrum can be partitioned into critical bands that reflect the resolving power of the ear as a function of frequency.

Due to this limited frequency resolving power, the threshold for noise masking at any given frequency is solely dependent on the signal activity within a critical band of that frequency.
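Two widely used approximations make the critical-band idea concrete: a mapping from frequency in Hz to critical-band rate in Bark, and an estimate of the critical bandwidth around a given centre frequency. The Python sketch below uses the Zwicker-style formulas quoted in the Painter & Spanias paper; the function names are illustrative.

```python
import math

def hz_to_bark(freq_hz: float) -> float:
    """Map frequency in Hz to critical-band rate in Bark."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

def critical_bandwidth_hz(freq_hz: float) -> float:
    """Approximate critical bandwidth in Hz centred on freq_hz."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (freq_hz / 1000.0) ** 2) ** 0.69

for f in (100, 500, 1000, 4000, 16000):
    print(f"{f:6d} Hz -> {hz_to_bark(f):5.2f} Bark, "
          f"bandwidth about {critical_bandwidth_hz(f):7.1f} Hz")
```

The bandwidth stays near 100 Hz at the bottom of the range and grows past 4 kHz at the top, which matches the figures quoted on this slide.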

Psychoacoustic Model

Figure: MPEG/Audio filter bank bands vs. critical bands (Pan, D.Y., Digital Audio Compression, Digital Technical Journal, 1993, Vol. 5)

Psychoacoustic Model

Auditory Masking

Auditory masking is a perceptual weakness of the ear that occurs whenever the presence of a strong audio signal makes a spectral neighbourhood of weaker audio signals imperceptible.

Two types of masking:

• Simultaneous masking

• Temporal masking
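For simultaneous masking, the "spectral neighbourhood" is commonly modelled with a spreading function over the Bark scale. The Python sketch below uses the Schroeder-style spreading function quoted in the Painter & Spanias paper. The `offset_db` margin and the function names are assumptions made for illustration and do not come from the slides, and temporal masking is not modelled at all.

```python
import math

def spreading_db(delta_bark: float) -> float:
    """Relative masking level in dB at delta_bark from the masker.

    Positive delta_bark means the probe lies above the masker in frequency.
    """
    shifted = delta_bark + 0.474
    return 15.81 + 7.5 * shifted - 17.5 * math.sqrt(1.0 + shifted ** 2)

def masked_threshold_db(masker_level_db: float, masker_bark: float,
                        probe_bark: float, offset_db: float = 14.5) -> float:
    """Estimated level below which a probe near the masker is inaudible.

    offset_db is a hypothetical, illustrative margin.
    """
    return masker_level_db + spreading_db(probe_bark - masker_bark) - offset_db

# A 70 dB masker at 9 Bark: masking falls off more slowly above the
# masker than below it.
for probe in (6.0, 8.0, 9.0, 10.0, 12.0):
    level = masked_threshold_db(70.0, 9.0, probe)
    print(f"probe at {probe:4.1f} Bark: masked below about {level:5.1f} dB")
```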

Psychoacoustic Model

Figure: Audio masking (Pan, D.Y., Digital Audio Compression, Digital Technical Journal, 1993, Vol. 5)

Psychoacoustic Model

Perceptual Entropy

Johnston at Bell Labs combined notions of psychoacoustic masking with signal quantization principles to define perceptual entropy (PE), a measure of perceptually relevant information contained in any audio record.

Expressed in bits per sample, PE represents a theoretical limit on the compressibility of a particular signal.

Reported PE measurements suggest that a wide variety of CD quality audio source material can be transparently compressed at approximately 2.1 bits per sample.
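To put 2.1 bits per sample in context, it can be compared with the 16 bits per sample of CD audio. A minimal Python sketch of the arithmetic, assuming the figure applies to each channel at the CD sampling rate (the per-channel reading and the variable names are assumptions):

```python
PE_BITS_PER_SAMPLE = 2.1     # approximate transparent rate quoted above
PCM_BITS_PER_SAMPLE = 16     # CD audio
SAMPLE_RATE_HZ = 44_100      # CD audio

# How much smaller than PCM a transparent encoding could be.
compression_ratio = PCM_BITS_PER_SAMPLE / PE_BITS_PER_SAMPLE

# Bit rate per channel at the PE limit.
rate_per_channel_bps = PE_BITS_PER_SAMPLE * SAMPLE_RATE_HZ

print(f"compression ratio: about {compression_ratio:.1f} : 1")
print(f"per channel: about {rate_per_channel_bps / 1000:.1f} kbps")
```

Under these assumptions, that is roughly a 7.6 : 1 reduction, or about 93 kbps per channel.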

Time - Frequency Analysis

Filter Banks

The filter bank divides the signal spectrum into frequency sub-bands and generates a time-indexed series of coefficients representing the frequency-localized signal power within each band.

Masking thresholds are applied to the resulting frequency sub-band signals.

By providing explicit information about the distribution of signal and hence masking power over the time-frequency plane, the filter bank plays an essential role in the identification of perceptual irrelevancies when used in conjunction with a perceptual model.
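The idea of a time-indexed series of per-band powers can be sketched with a block transform standing in for a real filter bank. This is an illustration only: it uses a naive DFT per block in pure Python rather than the polyphase filter banks used by actual MPEG coders, and the function name, block size and band count are chosen for the example.

```python
import cmath
import math

def band_powers(signal, block_size, num_bands):
    """Return one list of per-band powers for each block of the signal."""
    half = block_size // 2                 # bins 0..half-1 span 0..fs/2
    bins_per_band = half // num_bands      # equal-width bands
    frames = []
    for start in range(0, len(signal) - block_size + 1, block_size):
        block = signal[start:start + block_size]
        # Naive O(N^2) DFT, adequate for a small illustration.
        spectrum = [
            sum(block[n] * cmath.exp(-2j * math.pi * k * n / block_size)
                for n in range(block_size))
            for k in range(half)
        ]
        power = [abs(c) ** 2 for c in spectrum]
        frames.append([
            sum(power[b * bins_per_band:(b + 1) * bins_per_band])
            for b in range(num_bands)
        ])
    return frames

# A test tone in DFT bin 20 of a 64-point block; with 8 bands of
# 4 bins each, its power should land in band 5 in every block.
BLOCK, BANDS = 64, 8
tone = [math.sin(2 * math.pi * 20 * n / BLOCK) for n in range(2 * BLOCK)]
frames = band_powers(tone, BLOCK, BANDS)
strongest = [frame.index(max(frame)) for frame in frames]
print(strongest)  # [5, 5]
```

Each row of `frames` is one time step and each column one sub-band, which is the time-frequency picture a perceptual model needs in order to place masking thresholds.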

Time - Frequency Analysis

Pseudo-QMF M-Band Banks

Used in all MPEG-1 encoders

The signal is separated into sub-bands, the widths of which are equal over the entire frequency range

The resulting sub-band signals are then down-sampled in order to conserve bandwidth (they are up-sampled again at the decoder).
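Down-sampling each of the M sub-bands by a factor of M keeps the total number of samples equal to that of the input. The Python sketch below shows only the sample bookkeeping, with the band-splitting and reconstruction filters left out. It assumes 32 bands, as in the MPEG-1 audio filter bank, and a 384-sample block, as in a Layer I frame; the function names and the stand-in signal are illustrative.

```python
def downsample(band_signal, factor):
    """Keep every factor-th sample of an already band-limited signal."""
    return band_signal[::factor]

def upsample(decimated, factor):
    """Insert factor-1 zeros after each sample; a decoder would follow
    this with its synthesis filters (not shown here)."""
    out = []
    for sample in decimated:
        out.append(sample)
        out.extend([0] * (factor - 1))
    return out

NUM_BANDS = 32    # equal-width sub-bands
BLOCK_LEN = 384   # input samples in this illustration

sub_band = list(range(BLOCK_LEN))          # stand-in for one band's output
decimated = downsample(sub_band, NUM_BANDS)
restored = upsample(decimated, NUM_BANDS)

total_coded = NUM_BANDS * len(decimated)   # samples across all bands
print(len(decimated), total_coded, len(restored))  # 12 384 384
```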

Pre Echo Distortion

Pre-echoes occur when a signal with a sharp attack begins near the end of a transform block immediately following a region of low energy.

This situation can arise when coding recordings of percussive instruments such as the triangle, the glockenspiel, or the castanets.
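The effect can be reproduced in a few lines: quantization noise introduced in the transform domain is spread over the whole block by the inverse transform, including the silent part before the attack. The Python sketch below is a toy illustration using a plain DFT and a uniform quantizer, not the transform or quantizer of any particular codec. The block length, attack position, step size and test signal are all assumptions.

```python
import cmath
import math

N = 64        # transform block length (illustrative)
ATTACK = 48   # the attack starts near the end of the block
STEP = 2.0    # coarse quantizer step size (illustrative)

# Silence followed by a sharp, decaying attack.
x = [0.0] * N
for n in range(ATTACK, N):
    m = n - ATTACK
    x[n] = 10.0 * math.exp(-m / 4.0) * math.cos(1.3 * m)

def dft(seq):
    size = len(seq)
    return [sum(seq[n] * cmath.exp(-2j * math.pi * k * n / size)
                for n in range(size)) for k in range(size)]

def idft(spec):
    size = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * n / size)
                for k in range(size)).real / size for n in range(size)]

def quantize(c, step):
    """Round real and imaginary parts to the nearest multiple of step."""
    return complex(round(c.real / step) * step, round(c.imag / step) * step)

coded = [quantize(c, STEP) for c in dft(x)]
y = idft(coded)

# Error energy in the region that was silent in the original signal.
pre_echo_energy = sum((y[n] - x[n]) ** 2 for n in range(ATTACK))
print(f"error energy before the attack: {pre_echo_energy:.4f}")
```

The original signal is exactly zero before sample 48, so any energy reported there is coding noise that precedes the attack.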

Figure: Pre-echo example (Painter, T. & Spanias, A., Perceptual Coding of Digital Audio, Proceedings of the IEEE, 2000)

AUDIO COMPRESSION

THANK YOU!