Ø Called motion JPEG
Ø Compress each frame individually, without reference to any other frames in the sequence
  v and thus does not consider inter-frame redundancies
Ø audio is not supported in an integrated fashion
Ø Motion JPEG hardware (chips, boards) for near real-time compression/decompression is available, but storage and retrieval from a hard disc still takes a second or more.
  v High-quality video requires fast SCSI discs or caching of short video sequences in large memory buffers.
Spatial and temporal redundancy video compression – MPEG
We have seen with JPEG how spatial redundancy can be exploited. MPEG utilises not only spatial redundancy but also the fact that successive frames in a sequence are similar to each other. This is what is known as temporal redundancy.
A few definitions are required here:
Ø Macroblocks
  v a 16x16 pixel block, composed of four 8x8 luminance blocks and two colour-difference blocks
Ø Motion Vectors
  v indicates the spatial translation of a macroblock relative to its position in a reference frame
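The two definitions above can be made concrete with a small sketch (the frame size, contents, and function names here are illustrative, not from the slides): a 16x16 macroblock in the current frame is predicted by copying the block that its motion vector points to in the reference frame, and only the residual needs to be coded.

```python
import numpy as np

MB = 16  # macroblock size: 16x16 luminance pixels

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
# Pretend the whole scene shifted down by 2 and right by 3 pixels.
current = np.roll(np.roll(reference, 2, axis=0), 3, axis=1)

def predict_block(ref, top, left, mv):
    """Fetch the 16x16 block displaced by motion vector mv = (dy, dx)."""
    dy, dx = mv
    return ref[top + dy : top + dy + MB, left + dx : left + dx + MB]

# Macroblock at (16, 16) in the current frame; the motion vector (-2, -3)
# points back to where that content was in the reference frame.
actual = current[16:16 + MB, 16:16 + MB]
predicted = predict_block(reference, 16, 16, (-2, -3))
residual = actual.astype(int) - predicted.astype(int)
print(np.abs(residual).sum())  # 0: a pure shift is predicted perfectly
```

For real video the residual is rarely zero, but it is typically much cheaper to code than the raw macroblock, which is exactly the temporal redundancy MPEG exploits.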
Integrated Coding And Multiplexing)
Ø delays, for VLSI implementation:
  v max. 30 ms encoding
  v max. 10 ms decoding
Ø SW codec delays vary for different layers, implementations, and computers (a rule of thumb may be 50/100/150 ms for layer 1/2/3, which makes MP3 rather inappropriate for real-time conversation)
MPEG - Follow up
Ø MPEG-2:
  v higher data rates for high-quality audio/video
  v multiple layers and profiles
  v studio-quality TV and CD-quality audio channels; 4 to 6 Mbps typically
Ø MPEG-3:
  v initially HDTV
  v MPEG-2 was scaled up to subsume MPEG-3
Ø MPEG-4:
  v initially, lower data rates, e.g. for mobile communication
  v then: focus on coding & additional functionalities based on image contents
  v video conferencing at very low bit rates: 4.8 to 64 Kbps, with 10 fps
Ø MPEG-7 (EC = "experimental core" status):
  v content description
  v basis for search and retrieval
  v see section on databases
Ø MPEG-21 (upcoming):
  v framework for multimedia business, delivery... what's
(two modest) extensions to MPEG-1 audio:
1. "low sample rate extension" (LSE):
   v 1/2 of all MPEG-1 rates: 16, 22.05, 24 kHz
   v quantization down to 8 bits/sample
2. "multichannel extension": more channels, i.e. up to
   v 5 full-bandwidth channels (surround system)
     • left and right front
     • center (in front)
     • left and right back
   v "multilingual extension": 7 more, i.e. up to 12 channels (multiple languages, commentary)
Ø Backward compatibility with MPEG-1 audio
   v only three MPEG-2 audio codecs will not provide backward compatibility (in the range of 256-448 kbps)
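As a quick sanity check, the "low sample rate" figures above are exactly half the three MPEG-1 sampling rates (32, 44.1 and 48 kHz); a trivial sketch:

```python
# The "low sample rate extension" halves the three MPEG-1 sampling rates.
mpeg1_rates = (32.0, 44.1, 48.0)               # kHz
lse_rates = tuple(r / 2 for r in mpeg1_rates)  # -> (16.0, 22.05, 24.0) kHz
print(lse_rates)
```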
MPEG-4: Schedule for Standardization
Ø 1993: Work started
Ø 1997: Committee Draft
Ø 1998: Final Committee Draft
Ø 1998: Draft International Standard
Ø 1999-2000: International Standard
Ø Again:
  v started from the original goal of providing an audio-visual coding standard for very-low-bit-rate channels (e.g., for mobile applications)
  v evolved into a complex tool kit
  v MPEG-4 innovates on the MPEG-2 information production and consumption paradigm by the way audio and video information is represented
  v deals with audio and video no longer as packaged "bitstreams", produced by encoding, but as "audio-visual objects" (AVOs)
Ø Content-Based Scalability
Ø Content-Based Manipulation and Bitstream Editing
Ø Content-Based Multimedia Data Access Tools
Ø Hybrid Natural and Synthetic Data Coding
Ø Coding of Multiple Concurrent Data Streams
Ø Improved Coding Efficiency
Ø Robustness in Error-Prone Environments
Ø Improved Temporal Random Access
Ø MPEG-4 provides the ability to achieve scalability with a fine granularity in content, spatial resolution, temporal resolution, quality and complexity.
Ø Content scalability may imply the existence of a prioritization of the objects in the scene. The combination of more than one scalability case may yield interesting scene representations, where the more relevant objects are represented with higher spatial-temporal resolution.
Ø Example uses:
  v user selection of the decoded quality of individual objects in the scene;
  v database browsing at different scales,
Ø MPEG-4 provides a syntax and coding schemes to support content-based manipulation and bitstream editing without the need for transcoding.
Ø This means the user should be able to access one specific object in the scene/bitstream and perhaps change some of its characteristics.
Ø Example uses:
  v home movie production and editing;
  v interactive home shopping;
  v insertion of sign language interpreter or
Ø MPEG-4 supports efficient methods for combining synthetic scenes with natural scenes (e.g. text and graphics overlays), the ability to code and manipulate natural and synthetic audio and video data, and decoder-controllable methods of mixing synthetic data with ordinary video and audio, allowing for interactivity.
Ø harmonious integration of natural and synthetic audio-visual objects
Ø a first step towards the integration of all types of audio-visual information
Ø Example uses:
  v virtual reality applications;
  v animations and synthetic audio (e.g. MIDI) can be mixed with ordinary audio and video in a game;
  v graphics can be rendered from different viewpoints.
Ø ability to efficiently code multiple views/soundtracks of a scene, as well as sufficient synchronisation between the resulting elementary streams
Ø For stereoscopic and multiview video applications, MPEG-4 shall include the ability to exploit redundancy in multiple views of the same scene, also permitting solutions that allow compatibility with normal (mono) video. This functionality should provide efficient representations of 3D natural objects, provided a sufficient number of views is available. Again, this may require a complex analysis process. It is expected that this functionality could substantially benefit applications such as virtual reality, where until now almost only synthetic objects have been used.
Ø Example uses:
  v multimedia entertainment, e.g. virtual reality games, 3D movies;
  v training and flight simulations;
  v multimedia presentations and education.
Ø the growth of mobile networks provides a strong need for improved coding efficiency
Ø MPEG-4 is required to provide subjectively better audio-visual quality compared to existing or other emerging standards (such as H.263), at comparable bit rates.
Ø The results of the MPEG-4 video subjective tests, held in November 1995, showed however that, in terms of coding efficiency, the available coding standards still perform very well in comparison with most of the other coding techniques proposed.
Ø Example uses:
  v efficient transmission of audio-visual data on low-bandwidth channels;
  v efficient storage of audio-visual data on
Ø universal accessibility implies access to applications over a variety of wireless and wired networks and storage media
Ø MPEG-4 shall provide an error-robustness capability, particularly for low bit-rate applications under severe error conditions.
Ø The idea is not to substitute the error-control techniques implemented by the network, but to provide resilience against the residual errors, e.g. through selective forward error correction, error containment or error concealment.
Ø Example uses:
  v transmitting from a database over a wireless network;
  v communicating with a mobile terminal;
  v gathering audio-visual data from a remote
Ø MPEG-4 shall provide efficient methods to randomly access, within a limited time and with fine resolution, parts of an audio-visual sequence. This includes 'conventional' random access at very low bit rates.
Ø Example uses:
  v audio-visual data can be randomly accessed from a remote terminal over limited-capacity media;
  v a 'fast forward' can be performed on a single
Seven Architectural ‘Elements’ in the Multimedia Framework:
1. Digital Item Declaration
2. Digital Item Representation
3. Digital Item Identification and Description
4. Content Management and Usage
5. Intellectual Property Management and Protection
6. Terminals and Networks
7. Event Reporting
v differential PCM (DPCM) with motion estimation for interframe coding and
v variable word-length entropy coding (such as Huffman)
Ø very high compression ratios for full-color, real-time motion video transmission
Ø combines intraframe and interframe coding
Ø optimized for applications such as
  v video-conferencing, which are not motion-intensive
Ø limited motion search and estimation strategies
Ø compression ratios from 100:1 to 2,000:1
Ø covers the entire ISDN channel capacity (p x 64 kbps, p = 1, 2, ..., 30)
  v for p = 1 or 2: videophone, desktop video-conferencing applications
  v for p = 6 or higher, more complex pictures are
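The p x 64 kbps channel structure above spans a wide bitrate range; a trivial sketch of the arithmetic for a few values of p:

```python
# H.261 targets p ISDN B-channels of 64 kbps each, p = 1..30,
# so the usable bitrate runs from 64 kbps up to 1920 kbps.
for p in (1, 2, 6, 30):
    print(f"p = {p:2d}: {p * 64} kbps")
```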
H.261
Ø Intraframe coding takes no advantage of redundancy between frames.
  v intraframe coding yields a "reference frame" f0
  v each 8x8 block is transformed by DCT
  v the DCT uses the same quantization factor for all AC values
  v this factor may be adjusted by a loopback filter
  v intraframes are rare (bandwidth!; the main application is videophone)
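A minimal sketch of the intraframe step described above (the matrix construction, block contents and quantizer value q=16 are illustrative assumptions, not taken from H.261): build the orthonormal 8x8 DCT-II basis, transform a block, quantize all coefficients with one uniform factor, then invert.

```python
import numpy as np

N, q = 8, 16  # block size; illustrative uniform quantization factor

# Orthonormal DCT-II basis matrix: row k is the k-th cosine basis vector.
k = np.arange(N)
C = np.sqrt(2 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
C[0, :] = 1 / np.sqrt(N)  # DC row scaled for orthonormality

block = np.arange(64).reshape(N, N).astype(float)  # toy 8x8 pixel block

coeffs = C @ block @ C.T        # forward 2D DCT
quant = np.round(coeffs / q)    # uniform quantization, same q everywhere
recon = C.T @ (quant * q) @ C   # dequantize + inverse 2D DCT

print(np.abs(block - recon).max())  # reconstruction error from quantization
```

Without the quantization step the transform is lossless (C is orthonormal, so `C.T @ coeffs @ C` recovers the block exactly); all the loss comes from rounding the coefficients.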
Ø Interframe coding (corresponds to the P-frames of MPEG) → motion estimation
  v interframes f1, f2, f3, ... are coded relative to f0 (differential encoding)
  v search for a similar macroblock (16x16) in the previous image
  v the position of this macroblock defines the motion vector
  v the search range is up to the implementation:
    • max. ±15 pixels
    • but: the motion vector may also always be 0 (a "bad" software encoder)
    • e.g. H.261 also allows a simple implementation that considers only the differences between macroblocks located in the same position, thus a zero motion vector
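The motion search above can be sketched as a brute-force full search (names, frame contents and the ±4 range here are illustrative assumptions; H.261 allows up to ±15): for each candidate displacement, compare the current macroblock against the displaced block of the previous frame using the sum of absolute differences (SAD), and keep the best.

```python
import numpy as np

MB, RANGE = 16, 4  # macroblock size; illustrative search range (H.261: up to 15)

def motion_search(prev, cur, top, left):
    """Return the (dy, dx) minimizing SAD for the macroblock at (top, left)."""
    block = cur[top:top + MB, left:left + MB].astype(int)
    best, best_mv = None, (0, 0)
    for dy in range(-RANGE, RANGE + 1):
        for dx in range(-RANGE, RANGE + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + MB > prev.shape[0] or x + MB > prev.shape[1]:
                continue  # candidate block falls outside the previous frame
            cand = prev[y:y + MB, x:x + MB].astype(int)
            sad = np.abs(block - cand).sum()
            if best is None or sad < best:
                best, best_mv = sad, (dy, dx)
    return best_mv

rng = np.random.default_rng(1)
prev = rng.integers(0, 256, size=(48, 48), dtype=np.uint8)
cur = np.roll(prev, (3, -2), axis=(0, 1))  # scene shifted down 3, left 2
print(motion_search(prev, cur, 16, 16))    # finds (-3, 2): content came from there
```

The "bad software encoder" case from the slide corresponds to skipping this loop entirely and always returning (0, 0), i.e. coding only the co-located difference, which is still syntactically valid H.261.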
Ø Extension to H.261
Ø max. bitrate: H.263 approx. 2.5 x H.261; lowest bitrates suitable for modems
Main Differences between H.261 and H.263
Ø Base-Level Differences (always ON)
  v no filter for HF noise in the feedback loop
  v motion vectors produced with 1/2-pixel resolution
  v picture format for sub-QCIF (128x96)
  v Huffman tables designed specifically for low bit rates
  v JPEG is the still-picture mode
Ø Optional-Level Differences (negotiated)
  v unlimited search space for the motion vector → a fast encoder can do better
  v syntax-based arithmetic coding
  v advanced prediction mode
  v PB-frames (2 combined pictures: 1 B- and 1 P-frame)