
Feb 04, 2018


Prof. Dr.-Ing. Karlheinz Brandenburg, [email protected] Page 1

AC-3 and DTS

Prof. Brandenburg

Fraunhofer IDMT & Ilmenau Technical University, Ilmenau, Germany


Dolby Digital
• Dolby Digital (AC-3) was first commercially used in 1992
• Multi-channel digital audio for 35mm movie film material alongside the (optical) analog audio channel
• Perceptual coding with block length of 256 samples
• Additionally it is used in:
– Laser Disc
– ATSC High Definition Digital Television (HDTV)
– DVB/ATSC Standard Definition Digital Television (SDTV)
– DVD-Video/Audio
– Internet, cable, and satellite broadcasting
• For 5.1-channel audio, the bit stream is packed in the AES-EBU transmission format.
• Bit stream defined in ATSC “Digital Audio Compression Standard, Revision B”: Doc. A/52B from June 2005; E-AC-3 is also included
Source: Dolby Labs, Internet


Dolby Digital embedded in a piece of film


Dolby AC-3 (1)
• Predecessors:
– Dolby AC-1: low-cost, based on delta modulation
– Dolby AC-2: transform-based codec
• Lossy coder that uses psychoacoustics
• Special Features:
– Use of a Variable Frequency Resolution Spectral Envelope
– Hybrid Backward/Forward Adaptive Bit Allocation
• Primarily developed as a multi-channel format for HDTV
• Based on ITU-R BS.775, which showed that 5 + 1 channels are enough for a new digital audio system for movies (based on an analog Split-Surround-Format from 1979)
Source: Dolby Labs, „AC-3: Flexible Perceptual Coding for Audio Transmission and Storage“


Dolby AC-3 (2)
• There is a usable data rate of 320 kbps on 35mm movie film, such that:
– Audio compression must be used for 5.1 channel audio
– The peak bit rate cannot surpass 320 kbps
• First film with AC-3: Star Trek VI (Dec. 1991)
• Transform:
– Fielder windowing (aka KBD window)
– Window length 512 samples (10.67 ms at 48 kHz) with 50% overlap: 256 spectral values
– Oddly stacked Time-Domain Aliasing Cancellation (TDAC) filter bank from Princen and Bradley
– With signal transients (attacks), block switching is used to halve the block length.
– Frequency resolution: 93.75 Hz
– No „Critical Bands“ like in MP3
Source: Dolby Labs, „AC-3: Flexible Perceptual Coding for Audio Transmission and Storage“
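The numbers on this slide can be checked with a few lines of Python. The sketch below recomputes the window duration and the frequency resolution from the 512-sample window at 48 kHz, and builds a Kaiser-Bessel-derived window to confirm that it satisfies the Princen-Bradley condition needed for alias cancellation. The function name `kbd_window` is an illustration only, and the alpha value of 5.0 is taken from the later E-AC-3 slide; this is not Dolby's reference code.

```python
import numpy as np

FS = 48_000   # sampling rate in Hz (from the slide)
N = 512       # window length in samples (from the slide)

def kbd_window(n: int, alpha: float) -> np.ndarray:
    """Kaiser-Bessel-derived window of even length n (illustrative helper)."""
    half = n // 2
    kaiser = np.kaiser(half + 1, np.pi * alpha)
    # Running sum of the Kaiser window, normalised by its total sum.
    rising = np.sqrt(np.cumsum(kaiser[:half]) / kaiser.sum())
    # Second half is the mirror image of the first half.
    return np.concatenate([rising, rising[::-1]])

window_ms = 1000.0 * N / FS          # duration of one window
hop = N // 2                         # 50% overlap: 256 new samples per block
bin_spacing_hz = (FS / 2) / hop      # 0..FS/2 split into 256 spectral values

w = kbd_window(N, alpha=5.0)
# Princen-Bradley condition: w[n]^2 + w[n + N/2]^2 = 1 for all n.
pb_error = np.max(np.abs(w[:hop] ** 2 + w[hop:] ** 2 - 1.0))

print(f"window: {window_ms:.2f} ms, resolution: {bin_spacing_hz} Hz, PB error: {pb_error:.1e}")
```

The power-complementary property checked at the end is what lets overlapping MDCT blocks cancel their time-domain aliasing on reconstruction.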


Dolby AC-3: Forward Adaptive Bit Allocation

Source: Dolby Labs, „AC-3: Flexible Perceptual Coding for Audio Transmission and Storage“


Dolby AC-3: Backward Adaptive Bit Allocation

Source: Dolby Labs, „AC-3: Flexible Perceptual Coding for Audio Transmission and Storage“


Dolby AC-3: Hybrid Backward/Forward ABA

Source: Dolby Labs, „AC-3: Flexible Perceptual Coding for Audio Transmission and Storage“


Dolby AC-3: Encoder

Source: Advanced Television Systems Committee: „Digital Audio Compression Standard (AC-3)“, Nov. 94


Dolby AC-3: Decoder

Source: Advanced Television Systems Committee: „Digital Audio Compression Standard (AC-3)“, Nov. 94


Dolby AC-3: Spectral Envelope (1)

Source: Dolby Labs, „AC-3: Flexible Perceptual Coding for Audio Transmission and Storage“


Dolby AC-3: Spectral Envelope (2)

Source: Dolby Labs, „AC-3: Flexible Perceptual Coding for Audio Transmission and Storage“


Dolby AC-3: Bit Allocation

Source: Dolby Labs, „AC-3: Flexible Perceptual Coding for Audio Transmission and Storage“


Dolby Digital Setup

Source: Dolby Labs, Internet


Dolby Digital Enhancement for 6.1-channel audio

Source: Dolby Labs, „AC-3: Flexible Perceptual Coding for Audio Transmission and Storage“


Dolby Digital Plus (E-AC-3, Enhanced AC-3): Main features
• Greater range of data rates: 32 kbps – 6.144 Mbps, fine-grain data rate resolution
• 13.1 channel support
• High resolution hybrid filter bank (AC-3 filter bank combined with 2nd stage DCT -> 1536 coeffs or subbands)
• New quantization tools
• Improved channel coupling (similar to BCC)
• Spectral extension tool (similar to SBR)
• Transient pre-noise processing
• Based on AC-3: low-loss and low-complexity conversion from E-AC-3 to AC-3
Source: L.D. Fielder et al., 117th AES Convention


E-AC-3: Decoder setup example


E-AC-3: Adaptive Hybrid Transform (AHT)

• Based on AC-3 MDCT (256 coeffs or subbands, KBD window with alpha factor 5.0) for easy interoperability, filter length N=512
• 2nd stage DCT Type 2 with M=6 subbands, resulting in 1536 coeffs or subbands
• Result: higher frequency resolution for stationary signals

MDCT:

DCT-2:
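The equations of this slide did not survive extraction, so the following is only a minimal sketch of the idea, not the normative E-AC-3 transform. It assumes that six consecutive MDCT blocks of 256 coefficients are available and applies an orthonormal DCT-II along the time axis for each frequency bin, which yields 6 × 256 = 1536 coefficients. The normalisation and the helper name `dct2_matrix` are assumptions made for illustration.

```python
import numpy as np

BINS = 256   # MDCT coefficients per block (from the slide)
M = 6        # blocks combined by the second stage (from the slide)

def dct2_matrix(m: int) -> np.ndarray:
    """Orthonormal DCT-II matrix of size m x m (normalisation is an assumption)."""
    k = np.arange(m)[:, None]
    n = np.arange(m)[None, :]
    mat = np.sqrt(2.0 / m) * np.cos(np.pi * (2 * n + 1) * k / (2 * m))
    mat[0, :] /= np.sqrt(2.0)
    return mat

rng = np.random.default_rng(0)
mdct_blocks = rng.standard_normal((M, BINS))   # stand-in for 6 real MDCT blocks

C = dct2_matrix(M)
aht = C @ mdct_blocks      # DCT-II across the 6 blocks, separately for each bin
restored = C.T @ aht       # orthonormal matrix, so the transpose inverts it

print(aht.size)            # total number of hybrid coefficients
```

For a stationary signal the six values in each bin are strongly correlated, so most of the energy collects in a few second-stage coefficients, which is where the gain in frequency resolution comes from.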


E-AC-3: Spectral Extension Tool

• The high frequency subband coeffs are not coded directly; they are replaced by a parametric description of the high frequency region, and only this description is transmitted

• Spectral extension bands approx. match Critical Bands

• For each band, an energy ratio and a noise blending parameter are calculated


E-AC-3: Spectral Extension (1)

Fig 1: Original Spectrum

Fig 2: Decoder: Translation

Fig 3: Decoder: Noise spectrum multiplied by blending function

Fig 4: Decoder: Translated spectrum, multiplied by inverse blending function


E-AC-3: Spectral Extension (2)

Fig 5: Decoder: Blended spectrum (blending of noise and translated spectrum)

Fig 6: Decoder: Final spectrum, multiplication by transmitted energy ratios
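The decoder steps named in Figs 2 to 6 can be sketched as follows. This is a simplified illustration under stated assumptions, not the E-AC-3 algorithm: the blending function is reduced to one constant weight per band, the "inverse" blending weight is taken as one minus that weight, and the energy ratio is applied as a plain gain. The function name, the argument layout and the band edges are hypothetical.

```python
import numpy as np

def spectral_extension(baseband, n_high, band_edges, energy_ratios, noise_blend, rng):
    """Rebuild n_high high-frequency coefficients from baseband coefficients."""
    # Fig 2: translation, here a plain copy of baseband coefficients (repeated if needed).
    translated = np.resize(baseband, n_high)
    noise = rng.standard_normal(n_high)
    out = np.empty(n_high)
    for b, (lo, hi) in enumerate(zip(band_edges[:-1], band_edges[1:])):
        blend = noise_blend[b]
        # Figs 3-5: weight the noise and the translated spectrum, then add them.
        mixed = blend * noise[lo:hi] + (1.0 - blend) * translated[lo:hi]
        # Fig 6: scale the band with the transmitted energy ratio.
        out[lo:hi] = energy_ratios[b] * mixed
    return out

rng = np.random.default_rng(1)
baseband = np.arange(1.0, 9.0)
high = spectral_extension(baseband, 12, [0, 4, 12], [1.0, 0.5], [0.0, 0.3], rng)
print(high.shape)
```

With a blend weight of zero the band is a scaled copy of the baseband; larger weights make the regenerated band progressively more noise-like.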


E-AC-3: Enhanced Coupling


E-AC-3: Transient Pre-Noise Processing

• reduces pre-echo artifacts with a time-domain strategy

• a time-scaled part of the signal substitutes quantization noise just before a transient


DTS


DTS: Coherent Acoustics coding or Digital Surround®

• Intended for entertainment and professional use

• Optional coding scheme for DVD
• Part of the Blu-ray audio standard
• Audio data rates from 8 to 512 kbit/s/channel
• Sampling rates up to 192 kHz / 24 bit
• 5.1 core coder with up to 1536 kbit/s


DTS: Encoder overview

• Two main stages: polyphase filtering and subband-ADPCM


DTS: Polyphase Filter Bank

• 32 subbands
• Frames of 256, 512, 1024, 2048 or 4096 samples
• Long frames mainly used for low bit rates (coding efficiency)
• Two filter banks: perfect reconstruction (high bit rates) and near-perfect reconstruction (lower bit rates)

Example: Polyphase filter banks at 48 kHz
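For the 48 kHz case, the subband width and the frame durations follow directly from the numbers on this slide. The short sketch below tabulates them; it assumes a uniform, critically decimated filter bank (each subband receives 1/32 of the samples), which the slide does not state explicitly.

```python
FS = 48_000                                   # sampling rate of the example, in Hz
SUBBANDS = 32                                 # from the slide
FRAME_SIZES = (256, 512, 1024, 2048, 4096)    # from the slide

subband_width_hz = (FS / 2) / SUBBANDS        # uniform split of 0..FS/2

def frame_table(fs=FS, subbands=SUBBANDS, sizes=FRAME_SIZES):
    """Return (frame size, samples per subband, duration in ms) for each frame size."""
    # Assumption: critical decimation, so each subband gets size / subbands samples.
    return [(n, n // subbands, 1000.0 * n / fs) for n in sizes]

print(f"subband width: {subband_width_hz} Hz")
for size, per_band, ms in frame_table():
    print(f"{size:5d} samples -> {per_band:3d} per subband, {ms:6.2f} ms")
```

The longest frame spans roughly 85 ms, which illustrates why long frames help coding efficiency at low bit rates but are a poor fit for transient-rich material.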


DTS: Subband Adaptive DPCM

• Reduce sample-to-sample correlation within each subband

• Can be disengaged per subband if simple PCM gives better results

• Forward prediction based on LPC analysis
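The three bullets above can be illustrated with a small open-loop sketch: estimate predictor coefficients for one subband by LPC analysis, compute the prediction residual, and fall back to plain PCM when prediction does not reduce the energy. The predictor order of 4, the autocorrelation method and all function names are assumptions for illustration. A real ADPCM coder also predicts from quantized, reconstructed samples, which is omitted here.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Solve the autocorrelation normal equations for the predictor coefficients."""
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + (1e-9 * r[0] + 1e-12) * np.eye(order)   # small regularisation
    return np.linalg.solve(R, r[1:])

def prediction_residual(x, a):
    """Residual e[n] = x[n] - sum_k a[k] * x[n-1-k]; the first samples stay unpredicted."""
    order = len(a)
    res = x.copy()
    for n in range(order, len(x)):
        res[n] = x[n] - np.dot(a, x[n - order:n][::-1])
    return res

def choose_mode(x, order=4):
    """Return ('adpcm', coeffs, residual) if prediction helps, else ('pcm', None, x)."""
    x = np.asarray(x, dtype=float)
    a = lpc_coefficients(x, order)
    res = prediction_residual(x, a)
    gain = np.sum(x ** 2) / max(np.sum(res ** 2), 1e-12)
    return ("adpcm", a, res) if gain > 1.0 else ("pcm", None, x)

rng = np.random.default_rng(0)
tone = np.sin(0.1 * np.arange(256)) + 0.01 * rng.standard_normal(256)
mode, coeffs, residual = choose_mode(tone)
print(mode)
```

Because the coefficients are computed from the input block and sent to the decoder, this is forward adaptation; a backward-adaptive scheme would derive them from previously decoded samples instead.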


DTS: Subband Adaptive PCM (block diagram)


DTS: Quantization and Bit Allocation

• 28 different mid-tread quantizers up to 16,777,216 levels

• Psychoacoustically controlled
• Optional table-based entropy coding at low bit rates
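A mid-tread quantizer has a reconstruction level exactly at zero, so small subband samples are quantized to silence rather than to a small nonzero value. The sketch below shows a generic uniform mid-tread quantizer; the step size is a free parameter here, whereas in DTS it would follow from the psychoacoustically controlled bit allocation. The function names are illustrative.

```python
import numpy as np

def midtread_quantize(x, step):
    """Uniform mid-tread quantizer: returns integer indices, zero maps to index 0."""
    return np.round(np.asarray(x, dtype=float) / step).astype(np.int64)

def midtread_dequantize(indices, step):
    """Reconstruct sample values from quantizer indices."""
    return np.asarray(indices) * step

samples = np.array([0.04, -0.04, 0.26])
indices = midtread_quantize(samples, step=0.1)
decoded = midtread_dequantize(indices, step=0.1)

MAX_LEVELS = 2 ** 24    # the slide's upper limit of 16,777,216 levels is 2 to the power of 24
print(indices, decoded)
```

The reconstruction error of such a quantizer is bounded by half the step size, which is the quantity the bit allocation trades against the masking threshold.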


DTS: Quantization and Bit Allocation (block diagram)


DTS: Example for the use of the extension audio data


DTS-ES: discrete 6.1 multi channel coding

• 5.1 channel DTS core + additional Center Surround channel

• Additional channel is transmitted using Extension Audio Data

• Backwards compatible to 5.1 DTS core coder
• Three possible decoder setups:
– 5.1 decoding with phantom source
– Matrix decoding of Center Surround Channel
– Discrete 6.1 decoding by evaluating Extension Audio Data


DTS-ES: Encoder block diagram and Bit Stream

DTS-ES Encoder

DTS-ES Bit Stream


DTS-HD

• DTS Digital Surround (DTS 5.1 core) mandatory for HD DVD and Blu-ray

• DTS-HD is optional for Blu-ray (and for the now outdated HD DVD)

• DTS-HD is a set of extensions to DTS core, encompassing DTS core, DTS-ES, Neo:6 and DTS 96/24

• Lossless audio coding possible


Ogg Vorbis
• Ogg project started in 1993 to provide a license-fee-free audio coder/decoder
• Ogg: file transport protocol
• Vorbis: audio coder
– Psycho-acoustically controlled forward adaptive monolithic codec based on MDCT
– Inherently variable bit rate coder
– Provides no framing, synchronization or error protection by itself (therefore use Ogg for file transport, RTP for multicast)
– Low-complexity decoder, but high memory usage due to non-static probability models
– Huffman and VQ codebooks are transmitted within bit stream header


Windows Media Audio (WMA)
• Proprietary Audio Coder developed by Microsoft
• Collection of profiles for different applications:

– WMA 9: most scenarios, backwards compatible to WMA 8, about 20% lower data rate, VBR possible

– WMA 9 professional: 24 bit/96 kHz audio, 7.1 channels, 128-768 kbps, stereo downmix available

– WMA 9 voice: speech content at low bit rates (<20 kbps)

– WMA 9 lossless: compression depending on input audio, used for high-quality archiving purposes


WMA: main features

• MDCT (or MLT) based
• Multiple numbers of frequency lines (128, 256, 512, 1024, 2048)
• Sinusoidal shaped windows, transition windows and “bridge” windows (“soft” transition between long and short blocks)

• Uniform quantization within scale factor bands

• M/S coding frame-by-frame instead of scale-factor-band-wise

• Bit reservoir available (1-pass and 2-pass coding)


MPEG Spatial Audio Object Coding (SAOC)

Overview
• Concept
• MPEG Surround integration
• Advantages of SAOC
• Applications
• Conclusion


Concept: From MPEG Surround to SAOC (1)

Current Spatial Audio Coding: Channel-oriented (MPEG Surround)

[Block diagram: Chan. #1 … #4 → SAC Encoder → Downmix signal(s) + Side Info → SAC Decoder → Chan. #1 … #4]


Concept: From MPEG Surround to SAOC (2)

Object-oriented Spatial Audio Coding

[Block diagram: Obj. #1 … #4 → SAOC Encoder → Downmix signal(s) + Side Info → SAOC Decoder (obj. #1 … #4) → Renderer, steered by Interaction/Control → Chan. #1, Chan. #2, …]

• Processes object signals instead of channel signals
• Side Info: few kbit/s per audio object
• Mono or stereo downmix
• “Mixing”/rendering parameters vary according to user interaction
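The principle can be sketched for a single frame of per-band values. The encoder below sums the objects into a mono downmix and keeps, as side info, each object's share of the power in every band. The decoder uses these shares to split the downmix parametrically and then applies a rendering matrix that the user can change. This is a strongly simplified illustration and not the standardized SAOC algorithm: the real parameter set and the time/frequency processing differ, and all names are hypothetical.

```python
import numpy as np

def saoc_encode(objects):
    """objects: array (n_objects, n_bands) with per-band values of one frame."""
    objects = np.asarray(objects, dtype=float)
    downmix = objects.sum(axis=0)                             # mono downmix
    power = objects ** 2
    side_info = power / np.maximum(power.sum(axis=0), 1e-12)  # power share per object and band
    return downmix, side_info

def saoc_render(downmix, side_info, render_gains):
    """render_gains: array (n_channels, n_objects), set by user interaction."""
    estimates = side_info * downmix        # parametric object estimates from the downmix
    return np.asarray(render_gains, dtype=float) @ estimates

objects = np.array([[1.0, 2.0],
                    [3.0, 0.0]])
downmix, side_info = saoc_encode(objects)

plain = saoc_render(downmix, side_info, [[1.0, 1.0]])    # all objects at unit gain
no_obj2 = saoc_render(downmix, side_info, [[1.0, 0.0]])  # second object suppressed
print(plain, no_obj2)
```

With unit gains the rendered output equals the downmix itself, which mirrors the backwards-compatibility argument made on the later slides: a legacy device can simply play the downmix and ignore the side info.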


MPEG Surround integration/extension

[Block diagram: Obj. #1 … #4 → SAOC Encoder → Downmix signal(s) + SAOC Bitstream → SAOC Transcoder (with Interaction/Control) → Downmix signal(s) + MPS Bitstream → MPEG Surround Decoder → Chan. #1, Chan. #2, …; the SAOC Transcoder and the MPEG Surround Decoder together form the Combined Decoder]

• MPEG SAOC decoder = MPEG SAOC Transcoder + MPEG Surround decoder


Advantages using MPEG SAOC (1)

Key features
• Highly efficient storage/transport of individual audio objects ...
• ... in a backwards compatible downmix
• User interactive rendering of the audio objects (e.g. move or amplify objects)
• Flexible rendering configurations (e.g. 2.0, 5.1, binaural, ...)


Advantages using MPEG SAOC (2)

Other features
• Low complexity decoding/rendering for a large number of objects compared with individually encoded and rendered objects
• Compatible with any core codec (for the downmix)
• Powerful rendering engine (= MPEG Surround) integrated, no additional solution required


Applications (1)
• Interactive Remix / Karaoke
Examples:
– Suppress / attenuate instruments or vocals (Karaoke)
– Modify the original track to reflect current preference (e.g. “more drums & less strings” for a dance party)
– Choose between different vocal tracks (“female lead vocal vs. male lead vocal”)
– Control the dialog/speech level in movies/news broadcasts for better speech intelligibility.
Main feature:
• Backwards compatibility


Applications (2)
• Gaming / Rich Media
Examples:
– Efficient and flexible audio transport in multi-player games or applications (e.g. Second Life)
– Efficient storage together with flexible rendering of audio in small interactive games
Main feature:
• Storage/bitrate efficiency


Applications (3)
• Teleconferencing
Examples:
– Mobile conference over headphones: Virtual 3D-audio line-up of communication partners all around the listener
– Conference setup with 2 or more loudspeakers: Spatial distribution of communication partners
Main feature:
• Quality improvement:
– Increased speech intelligibility
– Increased listening comfort


Conclusions SAOC

• Highly efficient transport/storage of audio objects and flexible/interactive audio scene rendering

• Backwards compatible downmix for reproduction on legacy devices

• Flexible rendering configurations

• Under standardization within MPEG

• Very interesting applications, e.g.:

– Remixing/Karaoke

– Gaming

– Teleconferencing


Universal Speech and Audio Coding (USAC)
• Problem:
– Speech coders are good at speech but not at music,
– Audio coders are good at music, but not at speech (speech is too non-stationary; the 1024-sample block size smears the quantization noise and makes speech sound reverberant)
• MPEG decided to tackle the problem
• Goal: to come up with a universal coder which handles speech and audio as well as the best speech or audio coder in that bit-rate range


Universal Speech and Audio Coding
• A competition was conducted by MPEG
• The winner of this competition was a joint submission by Fraunhofer IIS and VoiceAge Corp. in Canada
• Their submission was a combination of VoiceAge’s AMR-WB+ coder and Fraunhofer’s HE-AAC coder

• The bit-rate range for the competition was about 12 to 64 kb/s.

• Target is mainly mobile devices (wireless phones, digital radio…)


Universal Speech and Audio Coding
• We already know HE-AAC
• But how does the VoiceAge coder work?
• Answer: It is based on CELP (Code Excited Linear Prediction)
• CELP is based on predictive coding, just as ULD and lossless predictive coders are
• Here: usually prediction of order 12-16 (this was found to be sufficient to model the human vocal tract for speech production)

• The prediction residual is then encoded using codebook vectors, called Code Excitation, using a fixed codebook (innovation) and an adaptive codebook (past samples)


CELP (Code Excited Linear Prediction)
• Structure of the CELP decoder (from Wikipedia, CELP):

Decoder prediction filter (usually order 12-16)

Constantly adapted delay
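The decoder structure named above can be sketched in a few lines: the excitation is the sum of a scaled fixed-codebook vector (innovation) and a scaled copy of the past excitation taken at the current delay (adaptive codebook), and this excitation drives the all-pole prediction filter. The sketch is generic and not taken from any particular standard; the function name, the argument layout and the sign convention of the filter coefficients are assumptions.

```python
import numpy as np

def celp_decode_frame(fixed_vec, fixed_gain, delay, adaptive_gain, lpc, past_exc, past_out):
    """Decode one frame; past_exc needs >= delay samples, past_out >= len(lpc) samples."""
    exc_hist = list(past_exc)
    out_hist = list(past_out)
    order = len(lpc)
    for innovation in fixed_vec:
        # Adaptive codebook: past excitation, `delay` samples back (delay >= 1).
        adaptive = exc_hist[-delay]
        e = adaptive_gain * adaptive + fixed_gain * innovation
        exc_hist.append(e)
        # Synthesis filter: y[n] = e[n] + sum_k lpc[k] * y[n-1-k].
        y = e + sum(lpc[k] * out_hist[-1 - k] for k in range(order))
        out_hist.append(y)
    n = len(fixed_vec)
    return np.array(out_hist[-n:]), np.array(exc_hist[-n:])

# Adaptive codebook only: one past pulse is repeated every 2 samples with gain 0.5.
out, exc = celp_decode_frame(
    fixed_vec=np.zeros(4), fixed_gain=0.0,
    delay=2, adaptive_gain=0.5,
    lpc=[0.0], past_exc=[1.0, 0.0], past_out=[0.0],
)
print(out)
```

The delay corresponds to the pitch period of voiced speech, which is why it has to be re-estimated constantly as the pitch changes.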


Universal Speech and Audio Coding
• ACELP (Algebraic CELP): The codebook is not explicitly stored, but algebraically described by pulses and their distances to the next pulses
• AMR: VoiceAge speech coder (for instance for 3GPP), for about 4.75 to 12.2 kb/s
• AMR-WB: Wideband extension (up to 7 kHz bandwidth), 6.6 to 23.85 kb/s
• AMR-WB+: Used for the MPEG submission, has a transform coding kernel in it too, to obtain higher bandwidth and bit rates up to about 32 kb/s
Source: IEEE Transactions on Speech and Audio Processing, Bessette et al., 2002


Universal Speech and Audio Coding
• AMR-WB+ has a transform based mode called TCX, which is based on an FFT (not an MDCT)
• The TCX mode is switchable: The audio stream is divided into 80 ms “super frames”, each of which consists of two 40 ms frames, and each 40 ms frame consists of two 20 ms frames.
• For each 20 ms frame it is decided whether ACELP or TCX is used
• For TCX it is decided whether it is applied to frames of 20 ms, 40 ms, or 80 ms, to obtain different numbers of subbands
Source: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2005, Bessette et al.
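One way to read the framing rules above is as a small set of allowed mode combinations per 80 ms super frame: each 40 ms half is either one TCX frame of 40 ms or two 20 ms frames that are each ACELP or TCX, and alternatively the whole super frame is one 80 ms TCX frame. The sketch below enumerates the combinations under exactly this reading; the mode labels and function names are illustrative, not identifiers from the codec specification.

```python
from itertools import product

def half_options():
    """Ways to code one 40 ms frame: two 20 ms frames, or a single 40 ms TCX frame."""
    pairs = [tuple(p) for p in product(("ACELP", "TCX20"), repeat=2)]
    return pairs + [("TCX40",)]

def superframe_options():
    """Ways to code one 80 ms super frame under the reading described above."""
    options = [first + second for first, second in product(half_options(), repeat=2)]
    options.append(("TCX80",))
    return options

options = superframe_options()
print(len(options))   # 5 * 5 + 1 combinations
for combo in options[:3]:
    print(combo)
```

An encoder can search these combinations per super frame and keep the one with the best result, which is how the switch between speech-like and music-like coding becomes signal-adaptive.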


Universal Speech and Audio Coding (USAC)
• USAC combines AMR-WB+ with HE-AAC
• An important component is a suitable switch between them, such that for the current audio signal the suitable coder is selected

• Some integration between subband coding modes in AMR-WB+ and HE-AAC.


Universal Speech and Audio Coding

• The MUSHRA tests showed: the resulting codec is indeed at least as good as a virtual coder, which is the best of either HE-AAC or AMR-WB+ (which was a requirement)

• It was tested on speech, audio, and mixed speech and audio (the latter being the most difficult)

• That showed that the goal was reached