Introduction of MPEG-2 AAC Audio Coding

Jan 16, 2016

Binh

Page 1: Introduction of MPEG-2 AAC Audio Coding

Introduction of MPEG-2 AAC Audio Coding

Advisor: 蔡宗漢

Student: 劉俊男

Page 2: Introduction of MPEG-2 AAC Audio Coding

Electrical Engineering, National Central University

Why do we need MPEG?

• Low sample and bit rates
• Storage space
  - For example: a CD holds a maximum of 650 MB, which is just 5 or 6 minutes of unencoded video.
  - When the video signal is encoded, the CD can contain up to 74 minutes of video.
• Bandwidths
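The storage figures on this slide imply very different average bitrates. A rough check, assuming 1 MB = 10^6 bytes and taking 5.5 minutes as the midpoint of "5 or 6 minutes" (both are assumptions, not values from the slides):

```python
CD_CAPACITY_BYTES = 650 * 10**6  # assumption: 1 MB = 10^6 bytes


def average_bitrate_mbps(capacity_bytes, minutes):
    """Average bitrate in Mbit/s that fills the disc in the given playing time."""
    return capacity_bytes * 8 / (minutes * 60) / 1e6


unencoded = average_bitrate_mbps(CD_CAPACITY_BYTES, 5.5)  # assumed midpoint
encoded = average_bitrate_mbps(CD_CAPACITY_BYTES, 74)

print(f"unencoded video: {unencoded:.1f} Mbit/s")
print(f"encoded video:   {encoded:.2f} Mbit/s")
print(f"implied compression ratio: about {unencoded / encoded:.1f}:1")
```

The ratio depends only on the two playing times, so the choice of megabyte definition does not affect it.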

Page 3: Introduction of MPEG-2 AAC Audio Coding

MPEG Audio Coding Standards

MPEG-1 (1992)
• Three layers with increasing complexity and performance
• Layer-3 is the highest-complexity mode, optimized to provide the highest quality at low bitrates (around 128 kbit/s for a stereo signal)

MPEG-2 (1994)
• Backwards-compatible multichannel coding
• Coding at lower sampling frequencies: adds sampling frequencies of 16, 22.05, and 24 kHz

MPEG-2 AAC (1997)
• AAC is a second-generation audio coding scheme for generic coding of stereo and multichannel signals

Page 4: Introduction of MPEG-2 AAC Audio Coding

MPEG Audio Coding Standards

MPEG-4 (1998)
• The emphasis in MPEG-4 is on new functionalities rather than better compression efficiency
• Mobile as well as stationary user terminals, database access, and communications will be major applications for MPEG-4
• Consists of a family of audio coding algorithms spanning the range from low-bitrate speech coding (down to 2 kbit/s) up to high-quality audio coding at 64 kbit/s per channel and above
• Generic audio coding at medium to high bitrates is done by AAC

MPEG-7 (2001)
• Does not define compression algorithms
• MPEG-7 is a content representation standard for multimedia information search, filtering, management and processing

Page 5: Introduction of MPEG-2 AAC Audio Coding

Assignment of codecs to bitrate ranges in MPEG-4 natural audio coding

[Figure: bitrate ranges covered by the parametric coder, CELP coder, T/F coder, ITU-T coder, and the scalable coder, with low, medium, and high ranges marked. Axes: channel bitrate from 2 to 64 kbps and above; signal bandwidth of 4 kHz, 8 kHz, and 20 kHz.]

Page 6: Introduction of MPEG-2 AAC Audio Coding

Structure of MPEG-4 Audio

Page 7: Introduction of MPEG-2 AAC Audio Coding

A basic perceptual audio coder

[Figure: Block diagram of a perceptual encoding system. Blocks: Audio in, Analysis Filterbank, Perceptual Model, Quantization & Coding, Encoding of bitstream, Bitstream out.]

[Figure: Block diagram of a perceptual decoding system. Blocks: Bitstream in, Decoding of bitstream, Inverse Quantization, Synthesis Filterbank, Audio out.]

Page 8: Introduction of MPEG-2 AAC Audio Coding

The critical bands

Page 9: Introduction of MPEG-2 AAC Audio Coding

The absolute threshold of hearing in quiet

Across the audio spectrum, it quantifies the sound pressure level (SPL) required at each frequency for an average listener to detect a pure-tone stimulus in a noiseless environment.

Page 10: Introduction of MPEG-2 AAC Audio Coding

The absolute threshold of hearing in quiet

The absolute threshold of hearing characterizes the amount of energy needed in a pure tone such that it can be detected by a listener in a noiseless environment.

The absolute threshold is typically expressed in terms of dB Sound Pressure Level (dB SPL).

The quiet threshold is well approximated by the non-linear function
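The formula itself did not survive in this transcript. A widely used approximation of the threshold in quiet, usually attributed to Terhardt, is T_q(f) = 3.64 (f/1000)^-0.8 - 6.5 exp(-0.6 (f/1000 - 3.3)^2) + 10^-3 (f/1000)^4 dB SPL, with f in Hz. It is assumed here; the slide may have shown a different form. A minimal sketch that evaluates it:

```python
import math


def threshold_in_quiet_db_spl(freq_hz):
    """Approximate absolute threshold of hearing in dB SPL (assumed formula)."""
    f_khz = freq_hz / 1000.0
    return (3.64 * f_khz ** -0.8
            - 6.5 * math.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 1e-3 * f_khz ** 4)


for freq in (100, 1000, 3300, 10000, 16000):
    print(f"{freq:>6} Hz: {threshold_in_quiet_db_spl(freq):7.2f} dB SPL")
```

The curve is high at low frequencies, dips below 0 dB SPL around 3 to 4 kHz where hearing is most sensitive, and rises steeply toward the upper end of the audio band.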

Page 11: Introduction of MPEG-2 AAC Audio Coding

SIMULTANEOUS MASKING

Page 12: Introduction of MPEG-2 AAC Audio Coding

TEMPORAL MASKING

Pre-masking in particular has been exploited in conjunction with adaptive block size transform coding to compensate for pre-echo distortions.

Page 13: Introduction of MPEG-2 AAC Audio Coding

Pre-echo effect

Pre-Echo Example:

(a) Uncoded Castanets.

(b) Transform Coded Castanets, 2048-Point Block Size

Page 14: Introduction of MPEG-2 AAC Audio Coding

The building blocks of MPEG-2 AAC encoder

[Figure: block diagram of the MPEG-2 AAC encoder. Blocks: Input Time Signal, Gain Control, Filter Bank, TNS, Intensity/Coupling, Prediction, Mid/Side Stereo, Scale Factors, Quantizer, Noiseless Coding, Perceptual Model, Rate/Distortion Control Process, Quantized Spectrum of Previous Frame, Bitstream Multiplex, 13818-7 Coded Audio Stream. Legend: Data, Control.]

Callouts on the diagram:
• A high-frequency-resolution filterbank (MDCT), switched between resolutions of 1024 and 128 spectral lines
• The shape of the transform window can be adaptively selected between a sine window and a Kaiser-Bessel-derived (KBD) window, depending on the stationary or transient character of the input signal
• The perceptual model is taken from MPEG-1 (model 2)
• The temporal noise shaping tool controls the time dependence of the quantization noise
• The second-order backward-adaptive predictor improves coding efficiency
• An iterative method is employed so as to keep the quantization noise in all critical bands below the global masking threshold

Page 15: Introduction of MPEG-2 AAC Audio Coding

MPEG-2 AAC Decoder

[Figure: block diagram of the MPEG-2 AAC decoder. Blocks: 13818-7 Coded Audio Stream, Bitstream Demultiplex, Noiseless Decoding, Inverse Quantizer, Scale Factors, M/S, Prediction, Intensity/Coupling, TNS, Filter Bank, Gain Control, Output Time Signal. Legend: Data, Control.]

Page 16: Introduction of MPEG-2 AAC Audio Coding

Channel mapping

• Supports up to 48 channels for various multichannel loudspeaker configurations and other applications
• The default loudspeaker configurations are:
  - the monophonic channel
  - the stereophonic channel
  - the 5.1 system (five channels plus an LFE channel)

Page 17: Introduction of MPEG-2 AAC Audio Coding

Applications for MPEG-2 AAC

Due to its high coding efficiency, AAC is a prime candidate for any digital broadcasting system. The Japanese authorities were the first to decide to use AAC within practically all digital audio broadcasting schemes. As their first services will start in the year 2000, this decision already triggered the development of dedicated AAC decoder chips at a number of manufacturers.

AAC has been selected for use within the Digital Radio Mondiale (DRM) system. Due to its superior performance, AAC will also play a major role in the delivery of high-quality music via the Internet.

Page 18: Introduction of MPEG-2 AAC Audio Coding

Applications for MPEG-2 AAC

Furthermore, AAC (with some modifications) is the only high-quality audio coding scheme used within the MPEG-4 standard, the future "global multimedia language".

Fraunhofer IIS-A offers to contribute to AAC applications at all implementation levels, e.g. licensing software libraries for PC-based applications or for VLSI developments as well as offering DSP-based solutions (e.g. on Motorola’s DSP56300, Texas Instruments’ TMS320C67xx, and Analog Devices’ ADSP21x6x family). The coding methods developed by Fraunhofer IIS-A stand for optimum audio quality at any given bit rate.

Page 19: Introduction of MPEG-2 AAC Audio Coding

Profiles of MPEG-2 AAC

(1) Main profile
• offers the highest quality
• used when memory cost is not significant
• substantial processing power is available

(2) Low-complexity profile (LC)
• used when RAM usage, processing power and compression requirements are all present
• preprocessing and time-domain prediction are not permitted
• TNS order and bandwidth are limited

(3) Scalable sampling rate profile (SSR)
• offers the lowest complexity
• a preprocessing block is added, and prediction is not permitted
• TNS order and bandwidth are limited

Page 20: Introduction of MPEG-2 AAC Audio Coding

Tool usage of AAC Profiles

AAC profile   Tool usage
Main          All tools except gain control
LC            Prediction and gain control are not used; TNS order is limited
SSR           Prediction and coupling channels are not used; TNS order and bandwidth are limited

Profile Interoperability

[Figure: interoperability of the Main, Low Complexity, and Scalable Sampling Rate profiles, with bandwidth marks at 20 kHz, 18 kHz, 12 kHz, and 6 kHz.]

Low Complexity

Page 21: Introduction of MPEG-2 AAC Audio Coding

MPEG-2 AAC audio transport formats

• In MPEG-1, the basic audio format and the transport syntax for synchronization and coding parameters are tied together inseparably
• MPEG-2 AAC defines both, but leaves the actual choice of audio transport syntax to the application

ADIF (Audio Data Interchange Format)
• Puts all data controlling the decoder (such as sampling frequency, mode, etc.) into a single header preceding the actual audio stream
• Useful for file exchange, but does not allow for break-in or start of decoding at any point in time like the MPEG-1 format

ADTS (Audio Data Transport Stream)
• Packs AAC data into frames with headers very similar to the MPEG-1 header format
• Allows start of decoding in the middle of an audio bitstream
• The ADTS format has emerged as the de facto standard for a number of applications using AAC
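Because every ADTS frame carries its own header, a decoder can search for the sync pattern and start from there. The sketch below parses the 7-byte header (the form without CRC). The field layout is written from memory of the ADTS definition and should be checked against ISO/IEC 13818-7 before being relied on; the function name and the example bytes are made up for illustration.

```python
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000]
PROFILES = ["Main", "LC", "SSR", "reserved"]


def parse_adts_header(header):
    """Parse the first 7 bytes of an ADTS frame (hypothetical helper)."""
    if len(header) < 7:
        raise ValueError("need at least 7 bytes")
    # 12-bit sync word: all ones.
    if header[0] != 0xFF or (header[1] & 0xF0) != 0xF0:
        raise ValueError("sync word not found")
    rate_index = (header[2] >> 2) & 0x0F
    return {
        "mpeg2": bool((header[1] >> 3) & 1),          # ID bit: 1 = MPEG-2
        "crc_present": not (header[1] & 1),           # protection_absent = 0
        "profile": PROFILES[(header[2] >> 6) & 0x03],
        "sample_rate": SAMPLE_RATES[rate_index] if rate_index < 12 else None,
        "channel_configuration": ((header[2] & 1) << 2) | (header[3] >> 6),
        # 13-bit frame length in bytes, header included.
        "frame_length": ((header[3] & 0x03) << 11) | (header[4] << 3) | (header[5] >> 5),
        "raw_data_blocks": (header[6] & 0x03) + 1,
    }


example = bytes([0xFF, 0xF9, 0x50, 0x80, 0x2E, 0x7F, 0xFC])  # made-up header
print(parse_adts_header(example))
```

Since `frame_length` gives the distance to the next header, a player can skip from frame to frame without decoding the payload.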

Page 22: Introduction of MPEG-2 AAC Audio Coding

Filterbank and block switching

Standard filterbank
• A straightforward Modified Discrete Cosine Transform (MDCT)
• Supports block lengths of 2048 points and 256 points, which can be switched dynamically
• Supports two different window shapes that can be switched dynamically:
  - sine-shaped window
  - Kaiser-Bessel Derived (KBD) window
• All blocks are overlapped by 50% with the preceding and the following block
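Both window shapes have to satisfy the Princen-Bradley condition w[n]^2 + w[n+M]^2 = 1 so that the 50% overlapped blocks add back up to the original signal. The sketch below builds both windows for a block of 2M points and checks that condition. The KBD construction follows the usual cumulative-sum definition over a Kaiser window; the value alpha = 4 is an assumption (it is the value commonly quoted for long blocks), and the function names are illustrative.

```python
import math


def bessel_i0(x):
    """Zeroth-order modified Bessel function via its power series."""
    total, term = 1.0, 1.0
    for k in range(1, 60):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def sine_window(M):
    """Sine window of length 2M."""
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]


def kbd_window(M, alpha=4.0):
    """Kaiser-Bessel derived window of length 2M (alpha is an assumed value)."""
    kaiser = [bessel_i0(math.pi * alpha *
                        math.sqrt(max(0.0, 1.0 - (2.0 * j / M - 1.0) ** 2)))
              for j in range(M + 1)]
    total = sum(kaiser)
    rising, running = [], 0.0
    for j in range(M):
        running += kaiser[j]
        rising.append(math.sqrt(running / total))
    return rising + rising[::-1]  # second half mirrors the first


def princen_bradley_error(window):
    """Largest deviation of w[n]^2 + w[n+M]^2 from 1."""
    M = len(window) // 2
    return max(abs(window[n] ** 2 + window[n + M] ** 2 - 1.0) for n in range(M))


for name, window in (("sine", sine_window(128)), ("KBD", kbd_window(128))):
    print(f"{name}: Princen-Bradley error {princen_bradley_error(window):.1e}")
```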

Page 23: Introduction of MPEG-2 AAC Audio Coding

MDCT & IMDCT

The MDCT basis functions extend across two blocks in time, leading to virtual elimination of blocking artifacts.

[Figure: MDCT analysis and IMDCT overlap-add. Consecutive frames k, k+1, k+2, k+3 of M samples each are grouped into overlapping blocks of 2M samples; each block is transformed by the MDCT into M coefficients, and the 2M-sample inverse-transform outputs are overlap-added to give frames k+1 and k+2.]
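The analysis and overlap-add chain in the figure can be tried out directly. This is a minimal direct-form sketch (O(M^2) per block, no fast algorithm) using a sine window and the common textbook definition of the MDCT; the 2/M scaling of the inverse is one conventional normalization, chosen here so that the overlap-add sum returns the input without extra gain.

```python
import math


def sine_window(M):
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]


def mdct(block, window):
    """2M windowed time samples -> M coefficients."""
    M = len(block) // 2
    return [sum(window[n] * block[n] *
                math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                for n in range(2 * M))
            for k in range(M)]


def imdct(coeffs, window):
    """M coefficients -> 2M windowed, time-aliased samples."""
    M = len(coeffs)
    return [window[n] * (2.0 / M) *
            sum(coeffs[k] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                for k in range(M))
            for n in range(2 * M)]


def analysis_synthesis(signal, M):
    """Transform 50% overlapped blocks and overlap-add the inverse outputs."""
    window = sine_window(M)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * M + 1, M):
        restored = imdct(mdct(signal[start:start + 2 * M], window), window)
        for n in range(2 * M):
            out[start + n] += restored[n]
    return out


M = 8
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(4 * M)]
rebuilt = analysis_synthesis(signal, M)
worst = max(abs(rebuilt[n] - signal[n]) for n in range(M, 3 * M))
print(f"max error where two blocks overlap: {worst:.2e}")
```

Each inverse transform on its own contains time-domain aliasing; only the samples covered by two neighbouring blocks are reconstructed, which is why the first and last M samples are excluded from the check.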

Page 24: Introduction of MPEG-2 AAC Audio Coding

Block switching and Overlap-add

Page 25: Introduction of MPEG-2 AAC Audio Coding

Temporal noise shaping (TNS)

• The basic idea of TNS relies on the duality of the time and frequency domains
• TNS uses a prediction approach in the frequency domain to shape the quantization noise over time
• It applies a filter to the original spectrum and quantizes this filtered signal
• Quantized filter coefficients are transmitted in the bitstream
• The decoder undoes the filtering performed in the encoder, leading to a temporally shaped distribution of quantization noise in the decoded audio signal
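The filter and inverse-filter pair described above can be sketched with ordinary linear prediction run along the frequency axis. This is an illustration under simplifying assumptions, not the procedure of the standard: the predictor comes from a plain Levinson-Durbin recursion, neither the coefficients nor the residual are quantized, and all function names are made up.

```python
import math


def lpc_coefficients(values, order):
    """Levinson-Durbin on the autocorrelation; returns a[0..order], a[0] = 1."""
    r = [sum(values[i] * values[i + lag] for i in range(len(values) - lag))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def tns_analysis(spectrum, a):
    """Encoder side: prediction-error (FIR) filter across spectral lines."""
    return [sum(a[j] * spectrum[k - j] for j in range(len(a)) if k - j >= 0)
            for k in range(len(spectrum))]


def tns_synthesis(residual, a):
    """Decoder side: the inverse (all-pole) filter undoes the encoder filter."""
    out = []
    for k, e in enumerate(residual):
        out.append(e - sum(a[j] * out[k - j]
                           for j in range(1, len(a)) if k - j >= 0))
    return out


spectrum = [math.cos(0.4 * k) * math.exp(-0.02 * k) for k in range(64)]
coeffs = lpc_coefficients(spectrum, 2)
residual = tns_analysis(spectrum, coeffs)
restored = tns_synthesis(residual, coeffs)
print("round-trip error:", max(abs(x - y) for x, y in zip(spectrum, restored)))
```

In a codec the residual is what gets quantized, so the quantization noise passes through the decoder-side filter and takes on the temporal envelope of the signal.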

Page 26: Introduction of MPEG-2 AAC Audio Coding

Frequency domain prediction

• Improves redundancy reduction of stationary signal segments
• Only supported in AAC Main
• The actual implementation of the predictor is a second-order backward-adaptive lattice structure
• The required processing power of the frequency domain prediction and its sensitivity to numerical imperfections make this tool hard to use on fixed-point platforms

Page 27: Introduction of MPEG-2 AAC Audio Coding

Joint stereo coding

Mid-Side (MS) stereo coding
• Applies a matrix to the left and right channel signals, computing the sum and difference of the two original signals

Intensity stereo coding
• Saves bitrate by replacing the left and the right signal with a single representative signal plus directional information
• Intensity stereo is by definition a lossy coding method, so it is primarily useful at low bitrates. For coding at higher bitrates, only MS stereo is used.
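The sum and difference matrix of MS stereo is invertible, which is why it can stay in use at high bitrates. A minimal sketch, assuming the common scaling by 1/2 on the encoder side (the slides do not state the scaling):

```python
def ms_encode(left, right):
    """Left/right spectral values -> mid (sum) and side (difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Inverse matrix: recover left and right."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [1.0, 0.5, -0.25, 2.0]
right = [1.0, 0.25, -0.25, 1.5]
mid, side = ms_encode(left, right)
print("mid: ", mid)
print("side:", side)
print("decoded:", ms_decode(mid, side))
```

When the two channels are similar, the side signal is close to zero and costs few bits after quantization.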

Page 28: Introduction of MPEG-2 AAC Audio Coding

Scalefactors

• Inherent noise shaping in the non-linear quantizer is usually not sufficient to achieve acceptable audio quality
• Scalefactors are used to amplify the signal in certain spectral regions (the scalefactor bands) to increase the signal-to-noise ratio in these bands
• To properly reconstruct the original spectral values in the decoder, the scalefactors have to be transmitted within the bitstream
• Scalefactors are coded as efficiently as possible: they are differentially encoded and then Huffman coded
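The differential step can be sketched as follows. The starting value, called `global_gain` here, and the function names are assumptions for illustration, and the Huffman coding of the differences is left out.

```python
def encode_scalefactors(scalefactors, global_gain):
    """Replace each scalefactor by its difference from the previous one."""
    deltas, previous = [], global_gain
    for value in scalefactors:
        deltas.append(value - previous)
        previous = value
    return deltas


def decode_scalefactors(deltas, global_gain):
    """Accumulate the differences to recover the scalefactors."""
    scalefactors, previous = [], global_gain
    for delta in deltas:
        previous += delta
        scalefactors.append(previous)
    return scalefactors


scalefactors = [120, 122, 121, 121, 125, 124]
deltas = encode_scalefactors(scalefactors, global_gain=120)
print("differences:", deltas)
print("decoded:    ", decode_scalefactors(deltas, global_gain=120))
```

Neighbouring bands tend to have similar scalefactors, so the differences cluster around zero and suit a short Huffman code.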

Page 29: Introduction of MPEG-2 AAC Audio Coding

Quantization

• A non-linear quantizer is used
• It is the main source of the bitrate reduction
• It assigns a bit allocation to the spectral values according to the accuracy demands determined by the perceptual model
• The main advantage over a conventional linear quantizer is the implicit noise shaping
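The implicit noise shaping comes from a power-law characteristic: the magnitude is compressed with an exponent of 3/4 before rounding, so larger values get larger quantization steps. The sketch below uses the form usually quoted for AAC, with a step size of 2^(scalefactor/4) and a rounding offset of 0.4054. Both are written from memory rather than taken from the slides, and the fixed scalefactor offset used in the standard is left out.

```python
import math

ROUNDING_OFFSET = 0.4054  # assumed value, as commonly quoted for AAC encoders


def quantize(value, scalefactor):
    """Power-law quantizer: compress the magnitude by ^(3/4), then truncate."""
    gain = 2.0 ** (-scalefactor / 4.0)
    magnitude = int((abs(value) * gain) ** 0.75 + ROUNDING_OFFSET)
    return -magnitude if value < 0 else magnitude


def dequantize(index, scalefactor):
    """Inverse: expand the magnitude by ^(4/3) and undo the scaling."""
    gain = 2.0 ** (scalefactor / 4.0)
    return math.copysign(abs(index) ** (4.0 / 3.0) * gain, index)


for value in (3.0, 30.0, 300.0, 3000.0):
    index = quantize(value, scalefactor=0)
    error = abs(dequantize(index, 0) - value)
    print(f"value {value:7.1f} -> index {index:4d}, abs error {error:.2f}")
```

Raising the scalefactor of a band by 4 doubles its step size, which is how the encoder trades accuracy for bits per scalefactor band.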

Page 30: Introduction of MPEG-2 AAC Audio Coding

Noiseless coding

• The noiseless coding tries to optimize the redundancy reduction within the spectral data coding
• The spectral data is encoded using a Huffman code

Codebook number   unsigned_cb   Dimension of codebook   LAV for codebook
0                 -             -                       0
1                 0             4                       1
2                 0             4                       1
3                 1             4                       2
4                 1             4                       2
5                 0             2                       4
6                 0             2                       4
7                 1             2                       7
8                 1             2                       7
9                 1             2                       12
10                1             2                       12
11                1             2                       (16) ESC
12                -             -                       (reserved)
13                -             -                       (reserved)
14                -             -                       intensity out-of-phase
15                -             -                       intensity in-phase

• Codebooks 1 to 11: 11 Huffman codebooks for the spectral data
• Codebooks 14 and 15: 2 Huffman codebooks for the intensity stereo
• Codebook 0: neither spectral coefficients nor a scalefactor are transmitted
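The LAV column (largest absolute value) determines which codebooks can represent a given section of quantized spectral values. The sketch below encodes the table and lists the usable codebooks for a section. The selection rule is a simplification for illustration: a real encoder would also compare the resulting bit counts, and the handling of values of 16 and above through the escape mechanism of codebook 11 is only indicated, not implemented.

```python
# codebook number -> (unsigned_cb, dimension, LAV), copied from the table
SPECTRAL_CODEBOOKS = {
    1: (0, 4, 1), 2: (0, 4, 1), 3: (1, 4, 2), 4: (1, 4, 2),
    5: (0, 2, 4), 6: (0, 2, 4), 7: (1, 2, 7), 8: (1, 2, 7),
    9: (1, 2, 12), 10: (1, 2, 12), 11: (1, 2, 16),
}
ESCAPE_CODEBOOK = 11  # larger values are signalled with an escape sequence


def candidate_codebooks(quantized_section):
    """Codebooks whose LAV covers every value in the section."""
    largest = max((abs(v) for v in quantized_section), default=0)
    if largest == 0:
        return [0]  # all-zero section: nothing is transmitted
    return [number for number, (_, _, lav) in SPECTRAL_CODEBOOKS.items()
            if lav >= largest or number == ESCAPE_CODEBOOK]


for section in ([0, 0, 0, 0], [1, -1, 0, 1], [3, -2, 0, 1], [13, 0], [40, -3]):
    print(section, "->", candidate_codebooks(section))
```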