Multimedia Object - Audio

May 21, 2015



Explains audio concepts and compression; course material at IMTelkom (

1. Multimedia System - Audio

Nyoman Bogi Aditya Karna, ST, MSEE
Sisfo IM Telkom

2. Multimedia Object

  • Image
    • Introduction
    • Compression
    • Types: GIF/JPEG
  • Sound
    • Introduction
    • Compression
    • Types: WAV/MPEG
  • Video
    • Introduction
    • Compression
    • Types: MPEG

3. MPEG - History

MPEG (Moving Picture Experts Group) was established in 1988 by ISO as a working group to create standards for the coded representation of moving pictures and associated audio stored on digital storage media. Much of its audio coding research was carried out in Germany at the Fraunhofer Institute (IIS). MPEG submitted its work to ISO in phases:

  • 1993 : MPEG phase 1 (IS 11172-3)
  • 1994 : MPEG phase 2 (IS 13818-3)
  • 1997 : MPEG phase 2.5 (IS 13818-7)
  • 1998 : MPEG phase 4 (IS 14496-3)
  • 2001 : MPEG phase 7

MPEG1 is used in VCD (Video Compact Disc) technology, while Super VCD and DVD (Digital Versatile Disc) use MPEG2. MPEG4 emphasizes functionality rather than new compression technology, while MPEG7 is a content representation standard.

4. MPEG phase 1

  • 1993 : MPEG1 (IS 11172)
  • Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s (VCD is using 1.15 Mbit/s)
  • IS 11172-1 : System, describes synchronization and multiplexing of audio and video signals
  • IS 11172-2 : Video, describes compression of non-interlaced video signals
  • IS 11172-3 : Audio, describes compression of audio signals
  • IS 11172-4 : Compliance Testing, describes procedures for determining the characteristics of coded bitstreams and of the decoding process
  • IS 11172-5 : Software Simulation
  • Video format : 352x240 SIF (Source Input Format)
  • Audio : 64/128/192 kbit/s per channel
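
To put the audio bit rates above in perspective, the following sketch compares them with uncompressed PCM. It assumes 16-bit samples at 44.1 kHz (CD audio), which the slide does not state, and the function names are purely illustrative.

```python
# Rough compression ratios for the per-channel audio bit rates listed above.
# ASSUMPTION: the uncompressed source is 16-bit PCM at 44.1 kHz (CD audio);
# the slide itself does not give a sample rate or sample size.

def pcm_kbps(sample_rate_hz=44_100, bits_per_sample=16):
    """Uncompressed bit rate of one PCM channel, in kbit/s."""
    return sample_rate_hz * bits_per_sample / 1000


def compression_ratio(coded_kbps, sample_rate_hz=44_100, bits_per_sample=16):
    """How many times smaller the coded channel is than the PCM channel."""
    return pcm_kbps(sample_rate_hz, bits_per_sample) / coded_kbps


if __name__ == "__main__":
    for rate in (64, 128, 192):
        print(f"{rate} kbit/s per channel -> about {compression_ratio(rate):.1f}:1")
```

Under these assumptions one PCM channel needs 705.6 kbit/s, so 192 kbit/s per channel is a reduction of a little under 4:1 and 64 kbit/s a reduction of about 11:1.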

5. MPEG phase 1

MPEG1 handles video (moving pictures) and audio separately, since the two have different characteristics and are perceived by different senses (the eye for video, the ear for audio), each with its own limitations.

[Diagram: IS 11172-1 synchronizes video (IS 11172-2) with audio (IS 11172-3)]

6. How MPEG1 Audio Works

7. MPEG1 Audio Encoding

  • Mapping: creates a filtered and subsampled representation of the input audio stream.
  • Psychoacoustic model: creates a set of data to control the quantizer and coding.
  • Quantizer & Coding: creates a set of coding symbols from the mapped input samples.
  • Frame packing: assembles the actual bitstream from the output data of the other blocks, and adds other information (e.g. error correction) if necessary.

8. MPEG1 Audio Decoding

  • Frame unpacking: unpacks the bitstream data to recover the various pieces of information, and performs error detection if error checking was applied in the encoder.
  • Reconstruction: reconstructs the quantized version of the set of mapped samples.
  • Inverse mapping: transforms these mapped samples back into uniform PCM.

9. MPEG1 Audio Layer

Depending on the application, different layers of the coding system, with increasing encoder complexity and performance, can be used. An ISO MPEG Audio Layer N decoder can decode bitstream data encoded in Layer N and in all layers below N.

  • Layer I: contains the basic mapping of the digital audio input into 32 sub-bands, fixed segmentation to format the data into blocks, a psychoacoustic model to determine the adaptive bit allocation, and quantization using block companding and formatting.
  • Layer II: provides additional coding of bit allocation, scale factors and samples. Different framing is used.
  • Layer III: increases the frequency resolution (576 frequency lines) using a hybrid filterbank (filterbank + MDCT). It adds a different (non-uniform) quantizer, adaptive segmentation and entropy coding of the quantized values.

10. MPEG1 Audio Layer

  • All Layers use the same analysis filterbank (polyphase with 32 subbands). Layer-3 adds an MDCT to increase the frequency resolution.
  • All Layers use the same "header information" in their bitstream, to support the hierarchical structure of the standard.
  • All Layers use a bitstream structure that contains parts that are more sensitive to bit errors ("header", "bit allocation", "scalefactors", "side information") and parts that are less sensitive ("data of spectral components").
  • All Layers may use 32, 44.1 or 48 kHz sampling frequency.
  • All Layers are allowed to work with similar bitrates:
    • Layer-1: from 32 kbps to 448 kbps
    • Layer-2: from 32 kbps to 384 kbps
    • Layer-3: from 32 kbps to 320 kbps
  • From Layer-1 to Layer-3:
    • Complexity increases
    • Codec Delay increases
    • Performance increases (sound quality per bitrate)
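
The layer rules above can be sketched as a few small checks. This is only an illustration: the names are made up for this example, the bit-rate check tests the overall range quoted on the slide rather than the discrete set of rates a real bitstream allows, and the band width is the simple average obtained by splitting 0 to fs/2 into equal parts.

```python
# Illustrative helpers built from the figures on the slides above.
# All names here are hypothetical; they are not part of any MPEG API.

SAMPLE_RATES_HZ = (32_000, 44_100, 48_000)
BITRATE_RANGE_KBPS = {1: (32, 448), 2: (32, 384), 3: (32, 320)}
POLYPHASE_SUBBANDS = 32        # shared analysis filterbank
LAYER3_FREQUENCY_LINES = 576   # after the additional MDCT in Layer 3


def can_decode(decoder_layer, stream_layer):
    """A Layer N decoder handles Layer N and every layer below N."""
    return 1 <= stream_layer <= decoder_layer <= 3


def bitrate_in_range(layer, kbps):
    """Range check only; real streams use a fixed table of rates."""
    low, high = BITRATE_RANGE_KBPS[layer]
    return low <= kbps <= high


def band_width_hz(sample_rate_hz, bands):
    """Average band width when 0..fs/2 is split into `bands` equal parts."""
    return (sample_rate_hz / 2) / bands


if __name__ == "__main__":
    for fs in SAMPLE_RATES_HZ:
        coarse = band_width_hz(fs, POLYPHASE_SUBBANDS)
        fine = band_width_hz(fs, LAYER3_FREQUENCY_LINES)
        print(f"{fs} Hz: {coarse:.1f} Hz per subband, {fine:.1f} Hz per MDCT line")
```

At 48 kHz this gives 750 Hz per polyphase subband against roughly 42 Hz per Layer 3 frequency line, which is the sense in which Layer 3 increases frequency resolution.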

11. MPEG1 Audio Frame

  • Layer I and II: the part of the bitstream that is decodable by itself. In Layer I it contains information for 384 samples and in Layer II for 1152 samples. It starts with a syncword and ends just before the next syncword. It consists of an integer number of slots (four bytes in Layer I, one byte in Layer II).
  • Layer III: the part of the bitstream that is decodable with the use of previously acquired side and main information. It contains information for 1152 samples. Although the distance between the starts of consecutive syncwords is an integer number of slots (one byte in Layer III), the audio information belonging to one frame is generally not contained between two successive syncwords.

12. MPEG1 Audio Layer 3

13. Control Loop

  • Inner iteration loop (rate control loop): Huffman code tables assign shorter code words to the (more frequent) smaller quantized values. When the number of bits produced by Huffman coding exceeds the bits available for the block, this is corrected by increasing the quantization step size, which leads to smaller quantized values. The adjustment is repeated until the Huffman-coded output is small enough. The loop is called the rate loop because it modifies the overall coder rate until it is small enough.
  • Outer iteration loop (noise control loop): to shape the quantization noise according to the masking threshold (supplied by the perceptual model), scalefactors are applied to each subband. If the quantization noise in a given subband exceeds the masking threshold (the allowed noise), the scalefactor for this subband is adjusted to reduce the quantization noise. Since achieving a smaller quantization noise requires a larger number of quantization steps and thus a higher bit rate, the rate loop has to be repeated every time new scalefactors are used. The noise control loop is executed until the quantization noise is below the masking threshold in every scalefactor band.

14. Comparison

  • Input PCM sampling rate:
    • MPEG1 (Layers 1 to 3): 32, 44.1, 48 kHz
    • MPEG2: 16, 22.05, 24 kHz
    • MPEG2 AAC: 8 to 96 kHz
  • Sound mode:
    • MPEG1 (Layers 1 to 3): mono, dual channel, joint stereo, stereo
    • MPEG2: + 5.1 channel
    • MPEG2 AAC: up to 48 channels
  • Filterbank:
    • MPEG1 Layers 1 and 2: polyphase
    • MPEG1 Layer 3: filterbank + MDCT
  • Decoding:
    • MPEG1 Layers 1 and 2: active frame
    • MPEG1 Layer 3: active + previous frame
  • Output bit rate for near-CD quality:
    • MPEG1 Layer 1: 384 kbps
    • MPEG1 Layer 2: 256 kbps
    • MPEG1 Layer 3: 128 kbps
    • MPEG2 AAC: 96 kbps
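
The frame definitions on the MPEG1 Audio Frame slide (samples per frame and slot size per layer) are enough to estimate how long a frame lasts and how many bytes it occupies. The sketch below is a minimal illustration with made-up function names. The `padding` argument stands for the optional extra slot that some frames carry to keep the average bit rate exact; that detail is not on the slides and is included here as an assumption.

```python
# Frame duration and size from samples-per-frame and slot size per layer.
# Function names are illustrative only.

SAMPLES_PER_FRAME = {1: 384, 2: 1152, 3: 1152}
SLOT_BYTES = {1: 4, 2: 1, 3: 1}


def frame_duration_ms(layer, sample_rate_hz):
    """Playback time covered by one frame, in milliseconds."""
    return 1000 * SAMPLES_PER_FRAME[layer] / sample_rate_hz


def frame_slots(layer, bitrate_bps, sample_rate_hz, padding=0):
    """Whole slots between two syncwords (plus an optional padding slot)."""
    bits_per_slot = 8 * SLOT_BYTES[layer]
    bits_times_fs = SAMPLES_PER_FRAME[layer] * bitrate_bps
    return bits_times_fs // (bits_per_slot * sample_rate_hz) + padding


def frame_bytes(layer, bitrate_bps, sample_rate_hz, padding=0):
    """Frame length in bytes: slots times slot size."""
    slots = frame_slots(layer, bitrate_bps, sample_rate_hz, padding)
    return slots * SLOT_BYTES[layer]


if __name__ == "__main__":
    for layer, bitrate in ((1, 384_000), (2, 256_000), (3, 128_000)):
        ms = frame_duration_ms(layer, 44_100)
        size = frame_bytes(layer, bitrate, 44_100)
        print(f"Layer {layer} at {bitrate // 1000} kbps: {ms:.2f} ms, {size} bytes")
```

For Layer III this is only the spacing between syncwords. As the slide notes, the audio data belonging to one frame is generally not confined to that span.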
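
The two nested loops on the Control Loop slide can be illustrated with a toy model. This is a sketch under strong simplifications, not the Layer 3 encoder: `toy_bits` is a stand-in for real Huffman tables, the quantizer is uniform rather than non-uniform, the growth factors 1.1 and 1.25 are arbitrary, and every name is hypothetical.

```python
# Toy model of the rate loop (inner) and noise control loop (outer).
# ASSUMPTIONS: uniform quantizer, fake bit-cost function, arbitrary factors.

def quantize(values, step, scalefactors, band_of):
    """Amplify each value by its band's scalefactor, then quantize."""
    return [round(v * scalefactors[band_of[i]] / step)
            for i, v in enumerate(values)]


def toy_bits(quantized):
    """Stand-in for Huffman coding: smaller magnitudes cost fewer bits."""
    return sum(1 + 2 * abs(q).bit_length() for q in quantized)


def rate_loop(values, scalefactors, band_of, bit_budget, step=1.0):
    """Inner loop: grow the step size until the coded block fits."""
    if bit_budget < len(values):
        raise ValueError("budget below the toy cost of an all-zero block")
    while True:
        q = quantize(values, step, scalefactors, band_of)
        if toy_bits(q) <= bit_budget:
            return q, step
        step *= 1.1


def band_noise(values, q, step, scalefactors, band_of, n_bands):
    """Squared quantization error accumulated per band."""
    noise = [0.0] * n_bands
    for i, v in enumerate(values):
        b = band_of[i]
        noise[b] += (v - q[i] * step / scalefactors[b]) ** 2
    return noise


def noise_loop(values, band_of, thresholds, bit_budget, max_rounds=20):
    """Outer loop: amplify bands whose noise exceeds the allowed noise."""
    n_bands = len(thresholds)
    scalefactors = [1.0] * n_bands
    for _ in range(max_rounds):
        q, step = rate_loop(values, scalefactors, band_of, bit_budget)
        noise = band_noise(values, q, step, scalefactors, band_of, n_bands)
        too_noisy = [b for b in range(n_bands) if noise[b] > thresholds[b]]
        # Stop when every band is clean, or when amplifying all bands
        # at once could no longer change their relative precision.
        if not too_noisy or len(too_noisy) == n_bands:
            return q, step, scalefactors, noise
        for b in too_noisy:
            scalefactors[b] *= 1.25
    # Out of rounds: re-run the rate loop so the result matches the
    # final scalefactors.
    q, step = rate_loop(values, scalefactors, band_of, bit_budget)
    noise = band_noise(values, q, step, scalefactors, band_of, n_bands)
    return q, step, scalefactors, noise


if __name__ == "__main__":
    result = noise_loop([8.0, 7.5, 0.3, 0.2], [0, 0, 1, 1], [0.5, 0.01], 20)
    print(result)
```

The rate loop is re-run from scratch after every scalefactor change, mirroring the slide's point that new scalefactors change the bit demand. The round limit and the all-bands-noisy exit are safeguards added for this sketch so that it always terminates.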