Multimedia Compression - INTELLIGENCE: Lab …. Petrakis Multimedia Compression 2 Classification of Techniques Lossless: recover the original representation Lossy: recover a representation

E.G.M. Petrakis Multimedia Compression 1

Multimedia Compression

Audio, image and video require vast amounts of data

320x240x8 bits grayscale image: 77 KB
1100x900x24 bits color image: 3 MB
640x480x24 bits x 30 frames/sec: 27.6 MB/sec

Low network bandwidth does not allow real-time video transmission
Slow storage devices do not allow fast playback
Compression reduces storage requirements
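The figures above follow from width x height x bits per pixel (x frames per second for video). A minimal sketch that reproduces them; the helper name `raw_size_bytes` is an assumption made for illustration only.

```python
def raw_size_bytes(width, height, bits_per_pixel, frames=1):
    """Uncompressed size in bytes of `frames` images of the given dimensions."""
    return width * height * bits_per_pixel * frames // 8

gray = raw_size_bytes(320, 240, 8)          # grayscale image
color = raw_size_bytes(1100, 900, 24)       # color image
video = raw_size_bytes(640, 480, 24, 30)    # one second of video at 30 frames/sec

print(f"{gray / 1e3:.1f} KB, {color / 1e6:.2f} MB, {video / 1e6:.1f} MB/sec")
```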


Classification of Techniques

Lossless: recover the original representation
Lossy: recover a representation similar to the original one
  high compression ratios
  more practical use

Hybrid: JPEG, MPEG, px64 combine several approaches


Furht et al. 96

Compression Standards


Furht et al. 96

Lossless Techniques


Furht et al. 96

Lossy Techniques


JPEG Modes of Operation

Sequential DCT: the image is encoded in one left-to-right, top-to-bottom scan
Progressive DCT: the image is encoded in multiple scans (if the transmission time is long, a rough decoded image can be reproduced)
Hierarchical: encoding at multiple resolutions
Lossless: exact reproduction


Furht et al. 96

JPEG Block Diagrams


JPEG Encoder

Three main blocks:
  Forward Discrete Cosine Transform (FDCT)
  Quantizer
  Entropy Encoder

Essentially the sequential JPEG encoder
Main component of progressive, lossless and hierarchical encoders
For gray level and color images


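Sequential JPEG

Pixels in [0, 2^p - 1] are shifted to [-2^(p-1), 2^(p-1) - 1]
The image is divided into 8x8 blocks
Each 8x8 block is DCT transformed

$$F(u,v) = \frac{C(u)}{2}\,\frac{C(v)}{2} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$

$$C(u) = \begin{cases} \frac{1}{\sqrt{2}} & \text{for } u = 0 \\ 1 & \text{for } u > 0 \end{cases} \qquad C(v) = \begin{cases} \frac{1}{\sqrt{2}} & \text{for } v = 0 \\ 1 & \text{for } v > 0 \end{cases}$$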
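The level shift and the forward DCT can be sketched directly from the formula. This is a plain, unoptimized illustration (real encoders use fast factorizations); the function names `level_shift` and `fdct_8x8` are assumptions, not part of the slides.

```python
import math

N = 8

def c(k):
    """Normalization factor C(k): 1/sqrt(2) for k = 0, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def level_shift(pixels, p=8):
    """Shift samples from [0, 2^p - 1] to [-2^(p-1), 2^(p-1) - 1]."""
    offset = 1 << (p - 1)
    return [[value - offset for value in row] for row in pixels]

def fdct_8x8(block):
    """Forward DCT of an 8x8 block f[x][y]; returns coefficients F[u][v]."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = c(u) / 2 * c(v) / 2 * s
    return out

# A flat block of value 200: only the DC coefficient is non-zero.
flat = [[200] * N for _ in range(N)]
coeffs = fdct_8x8(level_shift(flat))
print(round(coeffs[0][0], 6))
```

For the flat block every shifted sample is 72, so F(0,0) = (1/8) x 64 x 72 = 576 and all AC coefficients vanish up to floating-point error.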


DCT Coefficients

F(0,0) is the DC coefficient: proportional to the average value over the 64 samples (8 times the mean, given the 1/8 factor in the FDCT)
The remaining 63 coefficients are the AC coefficients
Pixels in [-128,127]: DCTs in [-1024,1023]

Most frequencies have values at or near 0 and need not be encoded
This is what makes compression possible


Quantization Step

All 64 DCT coefficients are quantized: Fq(u,v) = Round[F(u,v)/Q(u,v)]
Reduces to 0 the amplitude of coefficients that contribute little or nothing
Discards information which is not visually significant
Quantization coefficients Q(u,v) are specified by quantization tables
A set of 4 tables is specified by JPEG
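A minimal sketch of the quantization step Fq(u,v) = Round[F(u,v)/Q(u,v)]. The slides do not say how ties are rounded, so rounding half away from zero is an assumption here (Python's built-in `round` rounds ties to even); the uniform table of 16s is also purely illustrative.

```python
import math

def round_half_away(x):
    """Round to the nearest integer, ties away from zero (assumed convention)."""
    magnitude = int(math.floor(abs(x) + 0.5))
    return magnitude if x >= 0 else -magnitude

def quantize(F, Q):
    """Quantize an 8x8 coefficient block F with the 8x8 table Q."""
    return [[round_half_away(F[u][v] / Q[u][v]) for v in range(8)]
            for u in range(8)]

F = [[0.0] * 8 for _ in range(8)]
F[0][0], F[0][1], F[7][7] = 576.0, -30.0, 3.0   # hypothetical coefficients
Q = [[16] * 8 for _ in range(8)]                # hypothetical uniform table

Fq = quantize(F, Q)
print(Fq[0][0], Fq[0][1], Fq[7][7])
```

The small high-frequency value 3 is quantized to 0, which is exactly the information the step is meant to discard.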


Quantization Tables

for (i = 0; i < 8; i++)
    for (j = 0; j < 8; j++)
        Q[i][j] = 1 + (1 + i + j) * quality;   /* 8x8 table, one entry per DCT coefficient */

quality = 1: best quality, lowest compression
quality = 25: poor quality, highest compression

Furht et al. 96
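The table formula Q[i,j] = 1 + (1+i+j) x quality can be checked with a short sketch. It assumes an 8x8 table (one entry per coefficient of the 8x8 DCT block); `quant_table` is a hypothetical helper name.

```python
def quant_table(quality, n=8):
    """Build an n x n quantization table; steps grow with frequency index i + j."""
    return [[1 + (1 + i + j) * quality for j in range(n)] for i in range(n)]

q1 = quant_table(1)     # best quality: steps range from 2 to 16
q25 = quant_table(25)   # poor quality: steps range from 26 to 376

print(q1[0][0], q1[7][7], q25[0][0], q25[7][7])
```

Larger `quality` values give larger divisors, so more coefficients are rounded to 0 and compression increases at the cost of fidelity.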


AC Coefficients

The 63 AC coefficients are ordered by a “zig-zag” sequence
Places low frequencies before high frequencies
High frequencies are likely to be 0
Sequences of such 0 coefficients will be encoded by fewer bits

Furht et al. 96
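The zig-zag ordering can be generated by walking the anti-diagonals of the block and alternating direction. A minimal sketch, with (row, column) indexing and the helper names chosen here for illustration:

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col indexes a diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Odd diagonals are walked top-right to bottom-left, even ones reversed.
        order.extend(diag if s % 2 else reversed(diag))
    return order

def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[i][j] for i, j in zigzag_order(len(block))]

order = zigzag_order()
print(order[:6])
```

After quantization most non-zero values sit near the top-left corner, so the scan typically ends in a long run of zeros.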


DC Coefficients

Predictive coding of DC coefficients
Adjacent blocks have similar DC intensities
Coding differences yields high compression
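A sketch of the predictive coding of DC coefficients: each block stores the difference from the previous block's DC value. Starting the predictor at 0 and the sample values are assumptions for illustration.

```python
def dc_differences(dc_values):
    """Replace each DC value by its difference from the previous block's DC."""
    previous = 0          # assumed initial predictor
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def dc_reconstruct(diffs):
    """Invert dc_differences by accumulating the differences."""
    values, previous = [], 0
    for d in diffs:
        previous += d
        values.append(previous)
    return values

dcs = [576, 580, 577, 577, 590]       # hypothetical DC values of adjacent blocks
diffs = dc_differences(dcs)
print(diffs)
```

Apart from the first entry, the differences are small numbers that need far fewer bits than the DC values themselves.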


Entropy Encoding

Encodes sequences of quantized DCT coefficients into binary sequences
AC: (runlength, size) (amplitude)
DC: (size, amplitude)
runlength: number of consecutive 0's, up to 15
  takes up to 4 bits for coding
  (39,4)(12) = (15,0)(15,0)(7,4)(12)
amplitude: first non-zero value
size: number of bits to encode amplitude
  0 0 0 0 0 0 476: (6,9)(476)
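A sketch of how the (runlength, size)(amplitude) symbols can be formed from a zig-zag ordered list of AC coefficients. It assumes, as in the JPEG standard, that the symbol (15,0) stands for a run of 16 zeros, which is what makes the example 39 = 16 + 16 + 7 work out. Trailing zeros are simply dropped, since the end-of-block symbol is not covered in the slides.

```python
def size_of(amplitude):
    """Number of bits needed to represent the magnitude of the amplitude."""
    return abs(amplitude).bit_length()

def ac_symbols(ac):
    """Turn AC coefficients into ((runlength, size), amplitude) pairs."""
    symbols = []
    run = 0
    for a in ac:
        if a == 0:
            run += 1
            continue
        while run > 15:                      # runs longer than 15 are split
            symbols.append(((15, 0), None))  # (15,0): 16 zeros, no amplitude
            run -= 16
        symbols.append(((run, size_of(a)), a))
        run = 0
    return symbols

print(ac_symbols([0] * 39 + [12]))
print(ac_symbols([0] * 6 + [476]))
```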


Huffman coding

Converts each sequence into binary
First the DC coefficient, followed by the ACs
Huffman tables are specified in JPEG
Each (runlength, size) is encoded using Huffman coding
Each (amplitude) is encoded using a variable length integer code
(1,4)(12) => (11111101101100)
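The slides do not spell out the variable length integer code for amplitudes. The sketch below follows the usual JPEG convention, stated here as an assumption: a positive value is written in plain binary using `size` bits, and a negative value is written as the one's complement of its magnitude. The Huffman prefix for (runlength, size) depends on the table in use and is not reproduced.

```python
def amplitude_bits(value):
    """Variable length integer code of an amplitude, as a bit string."""
    size = abs(value).bit_length()
    if size == 0:
        return ""                       # zero amplitude carries no bits
    if value < 0:
        value += (1 << size) - 1        # one's complement of the magnitude
    return format(value, f"0{size}b")

print(amplitude_bits(12), amplitude_bits(-12))
```

For the amplitude 12 this gives the four bits 1100, which are the trailing bits of the code word in the example above.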


Example of Huffman table

Furht et al. 96


Furht et al. 96

JPEG Encoding of an 8x8 block


Compression Measures

Compression ratio (CR): increases with higher compression
  CR = OriginalSize/CompressedSize
Root Mean Square Error (RMS): better quality with lower RMS

$$RMS = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - x_i)^2}$$

X_i: original pixel values
x_i: restored pixel values
n: total number of pixels
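Both measures are easy to compute; a minimal sketch, with pixel values and sizes made up for illustration:

```python
import math

def compression_ratio(original_size, compressed_size):
    """CR = OriginalSize / CompressedSize."""
    return original_size / compressed_size

def rms_error(original, restored):
    """Root mean square error between original and restored pixel values."""
    n = len(original)
    return math.sqrt(sum((X - x) ** 2 for X, x in zip(original, restored)) / n)

original = [10, 20, 30, 40]      # hypothetical original pixels
restored = [12, 18, 30, 44]      # hypothetical pixels after decompression

print(compression_ratio(76800, 9600))
print(rms_error(original, restored))
```

The squared errors here are 4, 4, 0 and 16, so the RMS is the square root of 24/4 = 6.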


Furht et al. 96


JPEG Decoder

The same steps in reverse order
The binary sequences are converted to symbol sequences using the Huffman tables
Dequantization: F'(u,v) = Fq(u,v)Q(u,v)
Inverse DCT:

$$f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\,C(v)\,F'(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$
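The decoder's dequantization and inverse DCT can be sketched the same way as the encoder side. This is an unoptimized illustration; the function names and the uniform table of 16s are assumptions.

```python
import math

N = 8

def c(k):
    """Normalization factor C(k): 1/sqrt(2) for k = 0, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dequantize(Fq, Q):
    """Multiply each quantized coefficient by its quantization step."""
    return [[Fq[u][v] * Q[u][v] for v in range(N)] for u in range(N)]

def idct_8x8(F):
    """Inverse DCT of an 8x8 coefficient block F[u][v]; returns samples f[x][y]."""
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += (c(u) * c(v) * F[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[x][y] = s / 4
    return out

Fq = [[0] * N for _ in range(N)]
Fq[0][0] = 36                                  # only a DC coefficient survives
Q = [[16] * N for _ in range(N)]               # hypothetical uniform table

samples = idct_8x8(dequantize(Fq, Q))
print(round(samples[0][0], 6))
```

A lone DC coefficient of 36 x 16 = 576 decodes to a flat block of value 72; adding back the level shift of 128 gives pixels of value 200.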


Progressive JPEG

When image encoding or transmission takes long, there may be a need to produce an approximation of the original image which is improved gradually

Furht et al. 96


Progressive Spectral Selection

The DCT coefficients are grouped into several bands

Low-frequency bands are sent first
  band1: DC coefficient only
  band2: AC1, AC2 coefficients
  band3: AC3, AC4, AC5, AC6 coefficients
  band4: AC7, AC8 coefficients


Lossless JPEG

Simple predictive encoding
prediction schemes

Furht et al. 96


Hierarchical JPEG

Produces a set of images at multiple resolutions

Begins with small images and continues with larger images (down-sampling)
The reduced image is scaled up to the next resolution and used as predictor for the higher resolution image


Encoding

1. Down-sample the image by 2^a in each of x, y
2. Encode the reduced size image (sequential, progressive ..)
3. Up-sample the reduced image by 2
4. Interpolate by 2 in x, y
5. Use the up-sampled image as predictor
6. Encode differences (predictive coding)
7. Go to step 1 until the full resolution is encoded


Furht et al. 96


JPEG for Color images

Encoding of 3 bands (RGB, HSV etc.) in two ways:

Non-interleaved data ordering: encodes each band separately
Interleaved data ordering: different bands are combined into Minimum Coded Units (MCUs)

Display, print or transmit images in parallel with decompression


Interleaved JPEG

Minimum Coded Unit (MCU): the smallest group of interleaved data blocks (8x8)

Furht et al. 96


Video Compression

Various video encoding standards: QuickTime, DVI, H.261, MPEG etc

Basic idea: compute motion between adjacent frames and transmit only differences
Motion is computed between blocks
Effective encoding of camera and object motion


MPEG

The Moving Picture Coding Experts Group (MPEG) is a working group for the development of standards for compression, decompression, processing, and coded representation of moving pictures and audio
MPEG groups are open and have attracted large participation
http://mpeg.telecomitalialab.com/


MPEG Features

Random access
Fast forward / reverse searches
Reverse playback
Audio-visual synchronization
Robustness to errors
Auditability
Cost trade-off


MPEG-1, 2

At least 4 MPEG standards finished or under construction
MPEG-1: storage and retrieval of moving pictures and audio on storage media
  352x288 pixels/frame, 25 fps, at 1.5 Mbps
  Real-time encoding even on an old PC
MPEG-2: higher quality, same principles
  720x576 pixels/frame, 2-80 Mbps


MPEG-4

Encodes video content as objects
Based on identifying, tracking and encoding object layers which are rendered on top of each other
Enables objects to be manipulated individually or collectively on an audiovisual scene (interactive video)
Only a few implementations
Higher compression ratios


MPEG-7

Standard for the description of multimedia content

XML Schema for content description
Does not standardize extraction of descriptions
MPEG-1, 2, and 4 make content available
MPEG-7 makes content semantics available


MPEG-1,2 Compression

Compression of full motion video, interframe compression, stores differences between frames
A stream contains I, P and B frames in a given pattern
Equivalent blocks are compared and motion vectors are computed and stored as P and B frames

Furht et al. 96


Frame Structures

I frames: self contained, JPEG encoded
  Random access frames in MPEG streams
  Low compression
P frames: predictive coding with reference to the previous I or P frame
  Higher compression
B frames: bidirectional or interpolated coding using past and future I or P frames
  Highest compression


Example of MPEG Stream

B frames 2 3 4 are bi-directionally coded using I frame 1 and P frame 5
P frame 5 must be decoded before B frames 2 3 4
I frame 9 must be decoded before B frames 6 7 8
Frame order for transmission: 1 5 2 3 4 9 6 7 8
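The reordering rule in this example (each I or P frame is sent before the B frames that depend on it) can be sketched as follows. The function name and the string encoding of the frame pattern are assumptions for illustration.

```python
def transmission_order(frame_types):
    """Map a display-order pattern such as 'IBBBPBBBI' to 1-based frame
    numbers in transmission order."""
    order = []
    pending_b = []                      # B frames waiting for their future reference
    for number, kind in enumerate(frame_types, start=1):
        if kind == "B":
            pending_b.append(number)
        else:                           # I or P frame: send it, then the waiting B frames
            order.append(number)
            order.extend(pending_b)
            pending_b = []
    # Assumption: trailing B frames with no later reference are appended as-is.
    order.extend(pending_b)
    return order

print(transmission_order("IBBBPBBBI"))
```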

Furht et al. 96


MPEG Coding Sequences

The MPEG application determines a sequence of I, P, B frames

For fast random access, code the whole video as I frames (MJPEG)
High compression is achieved by using a large number of B frames
Good sequence: (IBBPBBPBB)(IBBPBBPBB)...


Motion Estimation

The motion estimator finds the best matching block in P, B frames

Block: 8x8 or 16x16 pixels
P frames use only forward prediction: a block in the current frame is predicted from a past frame
B frames use forward or backward prediction, or prediction by interpolation: the average of the forward and backward predicted blocks


Motion Vectors

One or two motion vectors per block
One vector for forward predicted P or B frames or backward predicted B frames
Two vectors for interpolated B frames

block: 16x16 pixels

Furht et al. 96


MPEG Encoding

I frames are JPEG compressed
P, B frames are encoded in terms of future or previous frames
Motion vectors are estimated and differences between predicted and actual blocks are computed
  These error terms are DCT encoded
Entropy encoding produces a compact binary code
Special cases: static and intracoded blocks


MPEG encoder

Furht et al. 96

JPEG encoding


MPEG Decoder

Furht et al. 96


Motion Estimation Techniques

Not specified by MPEG
Block matching techniques
Estimate the motion of an nxm block in the present frame in relation to pixels in previous or future frames
The block is compared with a previous or future block within a search area of size (m+2p)x(n+2p)
  m = n = 16
  p = 6


Block Matching

Search area in block matching techniques
Typical case: n=m=16, p=6
F: block in current frame
G: search area in previous (or future) frame

Furht et al. 96


Cost functions

The block has moved to the position that minimizes a cost function

I. Mean Absolute Difference (MAD)

$$MAD(dx,dy) = \frac{1}{mn} \sum_{i=-n/2}^{n/2} \sum_{j=-m/2}^{m/2} \left| F(i,j) - G(i+dx, j+dy) \right|$$

F(i,j): a block in current frame
G(i,j): the same block in previous or future frame
(dx,dy): vector for the search location
  dx = -p..p, dy = -p..p
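A minimal sketch of the MAD cost for one candidate vector. For simplicity it indexes the block from its top-left corner (0..n-1, 0..m-1) rather than from its center as in the formula; this indexing, the function signature, and the synthetic frames are all assumptions for illustration.

```python
def mad(cur, ref, top, left, dx, dy, n=16, m=16):
    """Mean absolute difference between the n x m block of `cur` at
    (top, left) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for i in range(n):
        for j in range(m):
            total += abs(cur[top + i][left + j] - ref[top + i + dx][left + j + dy])
    return total / (n * m)

SIZE = 12
# Synthetic reference frame in which every pixel has a distinct value.
ref = [[r * SIZE + c for c in range(SIZE)] for r in range(SIZE)]
# Current frame: the reference content displaced so that the true vector is (1, 2).
cur = [[ref[r + 1][c + 2] if r + 1 < SIZE and c + 2 < SIZE else 0
        for c in range(SIZE)] for r in range(SIZE)]

print(mad(cur, ref, 4, 4, 1, 2, n=4, m=4))   # cost at the true displacement
print(mad(cur, ref, 4, 4, 0, 0, n=4, m=4))   # cost with no displacement
```

The cost is 0 at the true displacement and 14 at (0,0), since every pixel then differs by one row (12) plus two columns (2).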


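More Cost Functions

II. Mean Squared Difference (MSD)

$$MSD(dx,dy) = \frac{1}{mn} \sum_{i=-n/2}^{n/2} \sum_{j=-m/2}^{m/2} \left[ F(i,j) - G(i+dx, j+dy) \right]^2$$

III. Cross-Correlation Function (CCF)

$$CCF(dx,dy) = \frac{\sum_i \sum_j F(i,j)\, G(i+dx, j+dy)}{\left( \sum_i \sum_j F^2(i,j) \right)^{1/2} \left( \sum_i \sum_j G^2(i+dx, j+dy) \right)^{1/2}}$$

Unlike MAD and MSD, the CCF is maximized at the best match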


More Cost Functions

IV. Pixel Difference Classification (PDC)

$$PDC(dx,dy) = \sum_i \sum_j T(dx,dy,i,j)$$

$$T(dx,dy,i,j) = \begin{cases} 1 & \text{if } \left| F(i,j) - G(i+dx, j+dy) \right| \le t \\ 0 & \text{otherwise} \end{cases}$$

t: predefined threshold
each pixel is classified as a matching pixel (T=1) or a mismatching pixel (T=0)
the matching block maximizes PDC


Block Matching Techniques

Exhaustive: very slow but accurate
Approximation: faster but less accurate
  Three-step search
  2-D logarithmic search
  Conjugate direction search
  Parallel hierarchical 1-D search (not discussed)
  Pixel difference classification (not discussed here)


Exhaustive Search

Evaluates the cost function at every location in the search area

Requires (2p+1)^2 computations of the cost function
For p=6 requires 169 computations per block!!

Very simple to implement but very slow
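A sketch of exhaustive block matching with MAD as the cost. The top-left block indexing, the function names, and the synthetic 32x32 frames are assumptions made for illustration; candidates that would fall outside the reference frame are skipped.

```python
def mad(cur, ref, top, left, dx, dy, n, m):
    """Mean absolute difference for the candidate displacement (dx, dy)."""
    total = 0
    for i in range(n):
        for j in range(m):
            total += abs(cur[top + i][left + j] - ref[top + i + dx][left + j + dy])
    return total / (n * m)

def exhaustive_search(cur, ref, top, left, n=16, m=16, p=6):
    """Evaluate the cost at every location in the search area and return
    (best vector, best cost, number of cost evaluations)."""
    best_cost, best_vec, evaluations = None, None, 0
    for dx in range(-p, p + 1):
        for dy in range(-p, p + 1):
            r, c = top + dx, left + dy
            if r < 0 or c < 0 or r + n > len(ref) or c + m > len(ref[0]):
                continue                 # candidate block leaves the reference frame
            evaluations += 1
            cost = mad(cur, ref, top, left, dx, dy, n, m)
            if best_cost is None or cost < best_cost:
                best_cost, best_vec = cost, (dx, dy)
    return best_vec, best_cost, evaluations

SIZE = 32
ref = [[r * SIZE + c for c in range(SIZE)] for r in range(SIZE)]
# Current frame: reference content displaced so that the true vector is (2, -3).
cur = [[ref[r + 2][c - 3] if r + 2 < SIZE and c - 3 >= 0 else 0
        for c in range(SIZE)] for r in range(SIZE)]

vector, cost, count = exhaustive_search(cur, ref, top=8, left=8)
print(vector, cost, count)
```

With p = 6 and the whole search area inside the frame, the cost is evaluated (2 x 6 + 1)^2 = 169 times.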


Three-Step Search

Computes the cost function at the center and 8 surrounding locations in the search area

The location with the minimum cost becomes the center location for the next step
The search range is reduced by half


Three-Step Motion Vector Estimation (p=6)

Furht et al. 96


Three-Step Search

1. Compute cost (MAD) at 9 locations
   • Center + 8 locations at distance 3 from center
2. Pick min MAD location and recompute MAD at 9 locations at distance 2 from center
3. Pick the min MAD location and do the same at distance 1 from center
• The smallest MAD from all locations indicates the final estimate
• M24 at (dx,dy)=(1,6)
• Requires 25 computations of MAD
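The three steps can be sketched independently of the image data by passing the cost as a function of (dx, dy). In practice the cost would be the MAD between blocks; the synthetic cost used below, with its minimum placed at (1, 6) to mirror the example, is purely an assumption for illustration.

```python
def three_step_search(cost, steps=(3, 2, 1)):
    """Three-step search around (0, 0). `cost(dx, dy)` returns the matching
    cost of a candidate vector. Returns (vector, cost, evaluations)."""
    cx, cy = 0, 0
    best = cost(0, 0)
    evaluations = 1
    for step in steps:
        bx, by = cx, cy                        # best location found so far
        for ox in (-step, 0, step):
            for oy in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue                   # the center was already evaluated
                evaluations += 1
                value = cost(cx + ox, cy + oy)
                if value < best:
                    best, bx, by = value, cx + ox, cy + oy
        cx, cy = bx, by                        # recenter for the next, smaller step
    return (cx, cy), best, evaluations

def synthetic_cost(dx, dy):
    """Hypothetical cost surface with its minimum at (1, 6)."""
    return abs(dx - 1) + abs(dy - 6)

vector, best, count = three_step_search(synthetic_cost)
print(vector, best, count)
```

The count is 9 evaluations in the first step and 8 new ones in each of the next two, 25 in total, compared with 169 for the exhaustive search at p = 6.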


2-D Logarithmic Search

Combines cost function and predefined threshold T
Check cost at M(0,0), 2 horizontal and 2 vertical locations and take the minimum
If cost at any location is less than T then search is complete
If not, then search again along the direction of minimum cost, within a smaller region


if cost at M(0,0) < T then search ends!
compute cost at M1, M2, M3, M4; take their min;
if min cost < cost at M(0,0)
    if (min cost < T) then search ends!
    else compute cost along the direction of minimum cost (M5, M6 in the example);
else compute cost in the neighborhood of min cost within p/2 (M5 in the example)

Furht et al. 96


Conjugate Direction Search

Repeat
  find min MAD along dx=0,-1,1 (y fixed): M(1,0) in example
  find min MAD along dy=0,-1,1 starting from previous min (x fixed): M(2,2)
  search similarly along the direction connecting the above mins

Furht et al. 96


Other Compression Techniques

Digital Video Interactive (DVI)
  similar to MPEG-2
Fractal Image Compression
  Find regions resembling fractals
  Image representation at various resolutions
Sub-band image and video coding
  Split signal into smaller frequency bands

Wavelet-based coding


References

B. Furht, S. W. Smoliar, H-J. Zhang, “Video and Image Processing in Multimedia Systems”, Kluwer Academic Pub, 1996
