Topic for lecture 2
• Topic: video compression
• The ultimate compression task?
• Color image (300 x 300 x 24bit):
– 2.16Mbit/image x 30 images/s = 64.8Mbps
• Motion picture: 90min = 64.8Mbps x 60 x 90 = 349.92Gbit
• 56.6K modem => Raw download time (excl. sound and overhead) ~ 1717 hours or ~ 72 days!!!
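The arithmetic on this slide can be reproduced with a few lines of Python. This is only a sketch; it assumes that "56.6K" means 56,600 bit/s and that Mbit and Gbit are decimal units. All variable names are illustrative.

```python
# Raw (uncompressed) video size and modem download time, using the slide's numbers.
width, height, bits_per_pixel = 300, 300, 24
frames_per_second = 30
movie_minutes = 90
modem_bits_per_second = 56_600  # assumption: "56.6K" read as 56,600 bit/s

bits_per_image = width * height * bits_per_pixel     # bits in one color image
video_bit_rate = bits_per_image * frames_per_second  # bits per second of raw video
movie_bits = video_bit_rate * 60 * movie_minutes     # bits in a 90 minute movie

download_hours = movie_bits / modem_bits_per_second / 3600
download_days = download_hours / 24

print(f"{bits_per_image / 1e6:.2f} Mbit per image")
print(f"{video_bit_rate / 1e6:.1f} Mbps raw video")
print(f"{movie_bits / 1e9:.2f} Gbit per movie")
print(f"{download_hours:.0f} hours, about {download_days:.0f} days")
```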
Agenda for lecture 2
• What makes video compression possible?
• Implementations of motion compensation
– Block matching
• The YCbCr color representation
• MPEG
Video compression
• A sequence of images that needs to be compressed for storage and/or transmission
• Ignore audio, since image data >> audio data
• Straightforward methods:
– Motion JPEG
– 3D DCT
Temporal redundancy
• Less than 10% of the pixels change by more than 1% between frames
• This is called temporal redundancy or interframe correlation
• Temporal redundancy > spatial redundancy
• Origin: slow camera and object movements
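A rough way to measure this on real data is to count the pixels that change noticeably between two frames. The sketch below assumes gray-scale frames stored as lists of rows and reads "1%" as 1% of the full value range (0 to 255). Both are assumptions, and `changed_fraction` is an illustrative name.

```python
def changed_fraction(prev_frame, curr_frame, relative_threshold=0.01, max_value=255):
    """Fraction of pixels whose value changes by more than the threshold."""
    limit = relative_threshold * max_value  # assumption: 1% of the value range
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(c - p) > limit:
                changed += 1
    return changed / total

# Tiny 4 x 4 example: one pixel changes clearly, one only slightly (noise).
prev_frame = [[100] * 4 for _ in range(4)]
curr_frame = [row[:] for row in prev_frame]
curr_frame[1][2] += 10  # above the threshold
curr_frame[3][0] += 2   # below the threshold, not counted

fraction = changed_fraction(prev_frame, curr_frame)
print(fraction)  # 1 of 16 pixels
```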
Motion compensated coding
• Second generation of temporal compression methods
• More efficient (especially with rapid changes) but also more complex:
– OK, since the cost of computing power is decreasing faster than the cost of bandwidth
• Basic idea: the only difference between two images is the moving objects (draw)
• Estimate the motion and simply code this information
• From the prediction and the initial frame we can encode/decode all other frames
Practical issues
• Noise, camera movements, light changes etc. => the object and the background change
– Calculate the prediction error (difference) and code this
• It is very hard to track and describe a general object (contour and texture), so a block of pixels is used as the ’object’ instead
• The estimated motion is represented as pure translation: no rotation and no scaling
– This is justified since we have high frame rates and ’slow’ changes
– Denoted the displacement vector or motion vector
Procedure for motion compensated coding
• Image sequence => image => blocks of pixels
• Step 1: Motion analysis:
– Estimate the motion vector of the current block, i.e. the position of the block in the previous image(s)
• Step 2: Prediction and differencing
– Predict how the block found in the previous image(s) will look in the current image
– Subtract the predicted block from the current block => difference
• Step 3: Entropy encoding of the difference and the motion vector
• Encoded difference and motion vector << raw image => video compression
• Step 3 is already known
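Steps 1 and 2 for a single block can be sketched as follows. The motion vector is taken as given here (finding it is step 1), the prediction is a plain copy of the displaced block from the previous frame, and entropy coding (step 3) is left out. Frames are lists of rows. All function names are illustrative, not part of any standard.

```python
def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]

def encode_block(prev_frame, curr_frame, top, left, size, motion_vector):
    """Return the motion vector and the difference to the predicted block."""
    dy, dx = motion_vector
    predicted = get_block(prev_frame, top + dy, left + dx, size)
    current = get_block(curr_frame, top, left, size)
    difference = [[c - p for c, p in zip(c_row, p_row)]
                  for c_row, p_row in zip(current, predicted)]
    return motion_vector, difference

def decode_block(prev_frame, top, left, size, motion_vector, difference):
    """Rebuild the current block from the previous frame, vector and difference."""
    dy, dx = motion_vector
    predicted = get_block(prev_frame, top + dy, left + dx, size)
    return [[p + d for p, d in zip(p_row, d_row)]
            for p_row, d_row in zip(predicted, difference)]

# A bright 2 x 2 object moves 1 row down and 2 columns right and gets slightly brighter.
prev_frame = [[10] * 6 for _ in range(6)]
curr_frame = [[10] * 6 for _ in range(6)]
for y in range(2):
    for x in range(2):
        prev_frame[1 + y][1 + x] = 200
        curr_frame[2 + y][3 + x] = 205

# The vector points from the current block back to its position in the previous frame.
vector, difference = encode_block(prev_frame, curr_frame, 2, 3, 2, (-1, -2))
decoded = decode_block(prev_frame, 2, 3, 2, vector, difference)
print(difference)  # small values, cheap to entropy code
print(decoded == get_block(curr_frame, 2, 3, 2))
```

Only the vector and the small difference values would have to be transmitted, which is where the compression comes from.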
Motion analysis and prediction
• In general we seek the trajectory of a block so we can predict its current position, e.g. using weights
• In practice this is too complicated, and a 0th order predictor is applied instead:
– Predicted block(x,y,t) = block(a,b,t-1)
– MPEG uses two 0th order predictors
• The only open issue is step 1: how do we find the block in the previous frame that best matches the block in the current frame?
• Three methods:
– Block matching (by far the most applied method)
– Pel-recursion (block = 1 pixel)
– Optical flow (block = 1 pixel)
Block matching (1)
• Principle
• All pixels in a block are assumed to have the same motion vector
• Search window
– Maximum displacement follows from frame rate and context
– Usually a square region
• Block size p x q: usually p=q => square block
• The smaller the block size => the better the prediction, but more overhead (motion vectors)
• Usually block size = 16 x 16
– … image quality but decrease the bit-rate
– Usually non-overlapping blocks are applied
• Block matching via a similarity measure, summed over all pixel pairs (u,v) of the two blocks:
– Sum of squared differences (SSD): S = Σ (u-v)^2
– Mean absolute differences (MAD): S = (1/N) Σ |u-v|, N = number of pixels in the block
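The two similarity measures can be written out as below. This is a minimal sketch that assumes both blocks have the same size and are stored as lists of rows; the function names `ssd` and `mad` are chosen for illustration.

```python
def ssd(block_a, block_b):
    """Sum of squared differences over all pixel pairs (0 = identical blocks)."""
    return sum((u - v) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for u, v in zip(row_a, row_b))

def mad(block_a, block_b):
    """Mean absolute difference over all pixel pairs (0 = identical blocks)."""
    pixel_count = len(block_a) * len(block_a[0])
    total = sum(abs(u - v)
                for row_a, row_b in zip(block_a, block_b)
                for u, v in zip(row_a, row_b))
    return total / pixel_count

block_a = [[1, 2], [3, 4]]
block_b = [[1, 0], [0, 4]]
print(ssd(block_a, block_b))  # 0 + 4 + 9 + 0
print(mad(block_a, block_b))  # (0 + 2 + 3 + 0) / 4
```

Both measures are small for similar blocks. SSD punishes a few large errors harder, while MAD avoids the multiplications.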
Searching strategies
• Full search:
– Finds the global minimum but requires heavy processing!
• Assume only one minimum in the search region => a less computationally demanding search strategy can be used
• Accept a local minimum =>
– Larger difference but less processing
• Searching strategies with one (local) minimum:
– Coarse-fine three-step search
– 2D logarithmic search
– Conjugate direction search
– Etc.
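As a baseline for the faster strategies, a full search can be sketched like this. It tries every displacement inside a square search window and keeps the one with the smallest sum of absolute differences. The frame layout (lists of rows), the choice of measure and the names `sad` and `full_search` are assumptions made for the example.

```python
def sad(prev_frame, curr_frame, top, left, dy, dx, size):
    """Sum of absolute differences between the current block and a displaced block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(curr_frame[top + y][left + x]
                         - prev_frame[top + dy + y][left + dx + x])
    return total

def full_search(prev_frame, curr_frame, top, left, size, search_range):
    """Test every displacement in the window; returns (motion vector, cost)."""
    height, width = len(prev_frame), len(prev_frame[0])
    best_vector, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= top + dy and top + dy + size <= height
                      and 0 <= left + dx and left + dx + size <= width)
            if not inside:
                continue  # candidate block would leave the previous frame
            cost = sad(prev_frame, curr_frame, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost

# A textured 2 x 2 object moves 2 rows down and 3 columns right.
texture = [[200, 210], [220, 230]]
prev_frame = [[10] * 8 for _ in range(8)]
curr_frame = [[10] * 8 for _ in range(8)]
for y in range(2):
    for x in range(2):
        prev_frame[1 + y][1 + x] = texture[y][x]
        curr_frame[3 + y][4 + x] = texture[y][x]

vector, cost = full_search(prev_frame, curr_frame, 3, 4, 2, 3)
print(vector, cost)
```

With a window of ±3 this already means up to 49 block comparisons for one block, which is why the cheaper strategies are attractive.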
Coarse-fine three-step search
• Step 1) Test 9 points within a fixed pattern
• Step 2+3) Centre the pattern around the best match and reduce the distance within the pattern
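One common variant of the three-step search uses distances 4, 2 and 1, which covers displacements up to ±7. The sketch below assumes that variant. To keep it short, the search takes any cost function `cost(dy, dx)`, and the demonstration uses a made-up cost surface with a single minimum in place of a real block comparison. All names are illustrative.

```python
def three_step_search(cost, first_step=4):
    """Coarse-to-fine search; returns (motion vector, cost, number of evaluations)."""
    center = (0, 0)
    best_cost = cost(*center)
    evaluations = 1
    step = first_step
    while step >= 1:
        best = center
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dy == 0 and dx == 0:
                    continue  # the centre point has already been evaluated
                candidate = (center[0] + dy, center[1] + dx)
                candidate_cost = cost(*candidate)
                evaluations += 1
                if candidate_cost < best_cost:
                    best, best_cost = candidate, candidate_cost
        center = best  # centre the pattern around the best match
        step //= 2     # halve the distance within the pattern
    return center, best_cost, evaluations

# Hypothetical cost surface with one minimum at displacement (5, -3).
def example_cost(dy, dx):
    return abs(dy - 5) + abs(dx + 3)

vector, final_cost, evaluations = three_step_search(example_cost)
print(vector, final_cost, evaluations)
```

The search needs 25 cost evaluations here, against 225 for a full search over the same ±7 window. It only finds the global minimum when the cost surface really has a single minimum.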
YCbCr color representation
• A camera captures color in RGB format (show)
• We would like a representation where intensity and color are separated:
– So we can transmit and decode both a color and a gray-scale signal
– [R,G,B]: [50,50,50] is the same color as [100,100,100], only the intensity differs
– HSI (hue-saturation-intensity) is one such representation
– HSI is complex to calculate, so we seek a simpler representation
• The YUV representation is a simple approximation:
– Y = luminance (intensity) = 0.299 R + 0.587 G + 0.114 B
– The non-uniform weighting comes from the HVS (human visual system)
– U = B – intensity = ”pure” blue color = 0.492 (B - Y)
– V = R – intensity = ”pure” red color = 0.877 (R - Y)
– Rough approximation but very simple to compute
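The conversion is a direct transcription of the three formulas. The sketch below uses the weights from the slide and makes no assumption about value ranges or rounding; `rgb_to_yuv` is an illustrative name.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using the weights from the slide."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance (intensity)
    u = 0.492 * (b - y)                    # "pure" blue
    v = 0.877 * (r - y)                    # "pure" red
    return y, u, v

# Two grays: the intensity differs, the color part is (almost exactly) zero in both.
dark_gray = rgb_to_yuv(50, 50, 50)
light_gray = rgb_to_yuv(100, 100, 100)
pure_red = rgb_to_yuv(255, 0, 0)
print(dark_gray)
print(light_gray)
print(pure_red)
```

A gray-scale receiver can simply use Y and ignore U and V.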
YCbCr color representation (3)
• The HVS is more sensitive to intensity (Y) than to color (Cb and Cr), so more bits can be used to represent the intensity
• Formats:
– 4:4:4 (24 bits)
– 4:2:2 (16 bits)
– 4:2:0 (12 bits)
[Figure: sampling grids for the three formats, marking the Y samples and the Cb and Cr samples]
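The bit counts follow from how many chroma samples are kept per pixel. The sketch below assumes 8 bits per sample and describes each format by its horizontal and vertical chroma subsampling factor. The 2 x 2 averaging used for 4:2:0 is one possible choice, not something the slides prescribe. Function names are illustrative.

```python
def bits_per_pixel(horizontal_factor, vertical_factor, bits_per_sample=8):
    """Average bits per pixel: one Y sample plus the shared Cb and Cr samples."""
    chroma_samples_per_pixel = 2 / (horizontal_factor * vertical_factor)
    return bits_per_sample * (1 + chroma_samples_per_pixel)

def subsample_420(chroma_plane):
    """Reduce a chroma plane by 2 in both directions by averaging 2 x 2 groups.

    Assumes an even number of rows and columns.
    """
    rows, cols = len(chroma_plane), len(chroma_plane[0])
    return [[(chroma_plane[y][x] + chroma_plane[y][x + 1]
              + chroma_plane[y + 1][x] + chroma_plane[y + 1][x + 1]) / 4
             for x in range(0, cols, 2)]
            for y in range(0, rows, 2)]

print(bits_per_pixel(1, 1))  # 4:4:4
print(bits_per_pixel(2, 1))  # 4:2:2
print(bits_per_pixel(2, 2))  # 4:2:0

small_plane = subsample_420([[10, 20, 0, 0],
                             [30, 40, 0, 8]])
print(small_plane)
```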
MPEG
• MPEG = Moving Picture Experts Group
• International standard for compression of video (image, sound, and system info.), driven by growth in the digital media market (e.g. CD-ROM, DVD). Covers both transmission and storage
• MPEG-1: 1991
• MPEG-2: 1994
– MPEG-2 is MPEG-1 compatible, hence only MPEG-2 is used today
• MPEG is NOT an algorithm but rather a framework with several algorithms and MANY user settings.
– Fixed protocol, hence fixed decoders (the encoder is not specified!)
– Asymmetrical codec: encoding effort vs. decoding effort ~ 100:1 (JPEG ~1:1)
• MPEG compression is lossy
MPEG-1
• MPEG-2 is an ”add-on” to MPEG-1
• Typical bit rate for MPEG-1 = 1.5Mbps
– Meaning that an MPEG-1 decoder can decode and show real-time video that has been compressed to 1.5Mbps
• MPEG: trade-off between video quality and bandwidth
• Allows resolutions up to 4095 x 4095 at 60Hz
– Most used is the CPB (constrained parameters bitstream)
• Fixed resolutions and frame rates => HW implementations
• Max. resolution = 768 x 576 at 30Hz
• Max. bit rate = 1.856Mbps
MPEG-1 compression rate
• BT.601 (digital TV-signal):
• 704 x 576 x 24bit x 25Hz = 243Mbps
• Compression factor: 243Mbps / 1.5Mbps = 162
• JPEG = 10-20
• YCbCr 4:2:0 format: 12 bit per pixel
• Basic operation: down-scale to SIF (source input format)
• 352 x 288 x 12 x 25Hz = 30.4Mbps => comp. factor = 20
• But can be higher or lower
• In general: less input data => better image quality (for a fixed bit rate)
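The two compression factors can be checked numerically. The sketch assumes a SIF size of 352 x 288, which is the size that reproduces the 30.4Mbps figure, and a target bit rate of exactly 1.5Mbps. Variable names are illustrative.

```python
target_bit_rate = 1.5e6  # typical MPEG-1 bit rate in bit/s

# BT.601 digital TV signal, 24 bit per pixel, 25 frames per second.
bt601_bit_rate = 704 * 576 * 24 * 25
bt601_factor = bt601_bit_rate / target_bit_rate

# Down-scaled to SIF (assumed 352 x 288) in YCbCr 4:2:0, 12 bit per pixel.
sif_bit_rate = 352 * 288 * 12 * 25
sif_factor = sif_bit_rate / target_bit_rate

print(f"BT.601: {bt601_bit_rate / 1e6:.0f} Mbps, factor {bt601_factor:.0f}")
print(f"SIF:    {sif_bit_rate / 1e6:.1f} Mbps, factor {sif_factor:.0f}")
```

So roughly a factor 8 comes from down-scaling and chroma subsampling, and the remaining factor of about 20 has to come from the actual MPEG-1 coding.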
MPEG-1 principle (1)
• Full-motion-compensated DCT and difference coding
• Frames: 1,2,3,4,5,6,7,8,9, …
• 1: (DCT-JPEG)
• 2,3,4,5,6,7,8,9, … : difference coding
– The difference is DCT coded and quantized => lossy compression
– Problems?
– Error propagation
– No random access
MPEG-1 principle (2)
• I-picture: intra-coded
– Similar to JPEG
• P-picture: predictive coded via forward prediction
• B-picture: predictive coded via:
– forward, backward, or bi-directional prediction
• Errors in I and P are limited to max one GOP (group of pictures)
• Errors in B are limited to one picture
• High N and M => good compression but more error propagation.
– Usually: 13<N<16 and 0<M<4
– Recommended: an I-picture each ½ sec. and whenever the scene changes
• Coding order vs. visualisation order
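Coding order differs from visualisation order because a B-picture can only be decoded after the I- or P-picture that follows it on screen. A minimal sketch of the reordering is given below, assuming every B-picture depends on the nearest I- or P-picture on each side. The function name `coding_order` is illustrative.

```python
def coding_order(display_types):
    """Reorder pictures so that each B-picture comes after both of its references.

    display_types is a string such as "IBBP" in visualisation order.
    """
    order = []
    pending_b = []  # B-pictures waiting for their next I- or P-picture
    for number, picture_type in enumerate(display_types, start=1):
        label = f"{picture_type}{number}"
        if picture_type == "B":
            pending_b.append(label)
        else:
            order.append(label)      # send the reference picture first
            order.extend(pending_b)  # then the B-pictures that needed it
            pending_b = []
    order.extend(pending_b)          # trailing B-pictures, if any
    return order

sequence = coding_order("IBBPBBPBBI")
print(" ".join(sequence))
```

For the visualisation order I B B P B B P B B I this gives I1 P4 B2 B3 P7 B5 B6 I10 B8 B9.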
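[Figure: data hierarchy from the entire sequence down to pictures (type: I, P, B), macro blocks (MB) and blocks. In 4:2:0 format one MB = 16 x 16 Y + 8 x 8 Cb + 8 x 8 Cr = 6 blocks of 8 x 8]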
Coding one Block (8x8)
• Similar to JPEG except for adaptive quantization
– DCT, quantization, zig-zag scan, entropy coding
– Adaptive quantization controls the quality/amount of data
– Intra vs. Inter coding:
• I-blocks: Intra
• P,B-blocks: Depending on DIFF: 0, motion vectors, Inter, Intra.
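The first three stages of the block coder can be sketched in plain Python. This is a simplified illustration, not the MPEG-1 specification: it uses an orthonormal 8 x 8 DCT, one uniform quantizer step for all coefficients (the real coder uses a quantization matrix scaled by an adaptive factor) and leaves out entropy coding. All function names are illustrative.

```python
import math

N = 8  # blocks are 8 x 8

def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block (slow reference version)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    result = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            result[u][v] = scale(u) * scale(v) * total
    return result

def quantize(coefficients, step):
    """Uniform quantization; a larger step gives fewer bits and lower quality."""
    return [[round(value / step) for value in row] for row in coefficients]

def zigzag_scan(block):
    """Read the block along anti-diagonals, low frequencies first."""
    positions = sorted(
        ((r, c) for r in range(N) for c in range(N)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))
    return [block[r][c] for r, c in positions]

def code_block(block, step):
    return zigzag_scan(quantize(dct_2d(block), step))

flat_block = [[16] * N for _ in range(N)]
fine = code_block(flat_block, step=16)
coarse = code_block(flat_block, step=32)
print(fine[:4], coarse[:4])
```

For a flat block only the first (DC) value is non-zero, so the long run of zeros after the zig-zag scan is very cheap for the entropy coder.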
Coding one Block (8x8)
• Encoding
• Decoding
What to remember
• Video compression is done by removing the temporal redundancy
• Principle: (at block level)
– Step 1: Motion analysis => motion vector
– Step 2: Calculate the error/difference (subtraction)
– Step 3: Entropy encoding of motion vector and difference
• Motion analysis:
– Pel-recursion
– Optical flow
– Block matching (the currently applied method)
• Block matching
– Block of pixels (16 x 16)
– Similarity measure
– Search region
– Different search strategies to avoid the full search
What to remember
• Video compression is done by removing the temporal redundancy
• Principle: (at (macro)block level)
• MPEG-1:
– Bit rate ~1.5Mbps
– Asymmetrical codec ~ 100:1 (JPEG ~1:1)
– Compression rate < 400 (down scaling + YCbCr 4:2:0 => ~20)
– Coding style: I B B P B B P B B I
• MPEG audio: Layer 3 is the most advanced and the most often applied
– It has a nickname, which?
MPEG-2
• Defined in 1994
• Developed for DTV but has lots of other applications
• Based on MPEG-1 (backward compatible)
• Bit rates: 1.5Mbps – 60Mbps. Target: 2-15Mbps (best: 4)
• Lots of new features including:
– Support for fields, support for 4:4:4 and 4:2:2
– Alternative zig-zag scan, better motion vectors
– Scalability to allow any subset of a stream to be decoded and visualised, etc.
• MPEG-3: Purpose: HDTV
– Merged with MPEG-2 => no MPEG-3 standard
MPEG-4
• Both for real video and synthetic video
• Very low bit rates < 64Kbps => efficient coding
• Content based coding: code the objects