Video Compression (NPUST-MINAR). Professor: Sheau-Ru Tong. Student: Chih-Ming Chen. http://minarlab.mis.npust.edu.tw/

Dec 19, 2015


Diane Webster
Page 1: LOGO Video Compression NPUST-MINAR Professor : Sheau-Ru Tong Student : Chih-Ming Chen  MINAR.

Video Compression
NPUST-MINAR

Professor: Sheau-Ru Tong
Student: Chih-Ming Chen

http://minarlab.mis.npust.edu.tw/

Page 2: Outline

1. Review of basics of image and video compression
2. Scalable video coding
3. Overview of current video compression standards
4. Object-based video coding (MPEG-4)

Page 3: Review of Image Compression

Coding an image (single frame):
• RGB to YUV color-space conversion
• Partition image into 8x8-pixel blocks
• 2-D DCT of each block
• Quantize each DCT coefficient
• Runlength and Huffman code the nonzero quantized DCT coefficients

Basis for the JPEG image compression standard

JPEG-2000 uses wavelet transform and arithmetic coding

[Figure: Original signal -> RGB to YUV -> Block DCT -> Quantization -> Runlength & Huffman coding -> Compressed bitstream]
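The first stages of this pipeline can be sketched in a few lines of Python. This is a minimal illustration and not the JPEG reference method: the function names are hypothetical, the DCT is the naive O(N^4) form, and a single quantizer step size is assumed where JPEG uses a per-frequency table.

```python
import math

N = 8  # block size used by JPEG-style coders


def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using the BT.601 luma weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * s
    return out


def quantize(coeffs, step):
    """Uniform scalar quantization with one step size (an assumption)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# A constant gray block: all of its energy lands in the DC coefficient.
flat = [[100] * N for _ in range(N)]
q = quantize(dct_2d(flat), step=16)
```

For the flat block only `q[0][0]` is nonzero, which is why smooth image regions compress so well after the DCT.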

Page 4: Video Compression

Main addition over image compression: exploit the temporal redundancy
• Predict current frame based on previously coded frames

Three types of coded frames:
• I-frame: Intra-coded frame, coded independently of all other frames
• P-frame: Predictively coded frame, coded based on a previously coded frame
• B-frame: Bi-directionally predicted frame, coded based on both previous and future coded frames

Page 5: MC-Prediction and Bi-Directional MC-Prediction (P- and B-frames)

Motion-compensated prediction: Predict the current frame based on reference frame(s) while compensating for the motion

Examples of block-based motion-compensated prediction (P-frame) and bi-directional prediction (B-frame):

[Figure: a P-frame predicted from the previous frame; a B-frame predicted from the previous frame and the future frame]
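Block-based motion estimation for a P-frame can be sketched as an exhaustive search for the displacement that minimizes the sum of absolute differences (SAD). The block size, search range, and helper names below are assumptions chosen for illustration; the slides do not prescribe a search strategy.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_search(cur, ref, bx, by, size=4, search=2):
    """Full search over a +/- `search` window.
    Returns (best_dx, best_dy, best_sad)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + size <= w
                    and 0 <= by + dy and by + dy + size <= h):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def make_frame(ox, oy, w=8, h=8):
    """Hypothetical test frame: a textured 4x4 square on a black background."""
    frame = [[0] * w for _ in range(h)]
    for j in range(4):
        for i in range(4):
            frame[oy + j][ox + i] = 50 + 4 * j + i
    return frame


ref = make_frame(1, 2)   # previous frame
cur = make_frame(2, 2)   # current frame: the square moved one pixel right
mv = motion_search(cur, ref, bx=2, by=2)
```

Here `mv` is `(-1, 0, 0)`: the best match lies one pixel to the left in the previous frame, and the residual is zero. A B-frame would run the same search against a future frame as well and choose or average the two predictions.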

Page 6: Example Use of I-, P-, B-frames: MPEG Group of Pictures (GOP)

[Figure: GOP diagram; arrows show prediction dependencies between frames]

Page 7: Summary of Temporal Processing

Use MC-prediction (P and B frames) to reduce temporal redundancy

MC-prediction usually performs well; when it performs badly, the coder has a second chance to recover

MC-prediction yields:
• Motion vectors
• MC-prediction error or residual (code the error with a conventional image coder)

Sometimes MC-prediction may perform badly
• Examples: Complex motion, new imagery (occlusions)
• Approach:
1. Identify blocks where prediction fails
2. Code block without prediction
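One way to picture the two-step fallback is a per-block mode decision. The heuristic below is a hypothetical illustration (it compares the prediction residual with the block's deviation from its own mean); the slides do not say how failing blocks are identified, and real encoders use their own criteria.

```python
def choose_mode(block, prediction):
    """Return 'inter' if the MC residual is no larger than the block's
    deviation from its own mean, else 'intra' (hypothetical heuristic)."""
    pixels = [p for row in block for p in row]
    pred = [p for row in prediction for p in row]
    mean = sum(pixels) / len(pixels)
    intra_cost = sum(abs(p - mean) for p in pixels)
    inter_cost = sum(abs(p - q) for p, q in zip(pixels, pred))
    return "inter" if inter_cost <= intra_cost else "intra"


block = [[10, 20], [30, 40]]
good_prediction = [[11, 19], [30, 41]]       # motion search found a close match
bad_prediction = [[200, 200], [200, 200]]    # e.g. newly uncovered imagery

mode_good = choose_mode(block, good_prediction)
mode_bad = choose_mode(block, bad_prediction)
```

With these values `mode_good` is `"inter"` and `mode_bad` is `"intra"`, so only the badly predicted block is coded without prediction.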

Page 8: Basic Video Compression Algorithm

Exploiting the redundancies:
• Temporal: MC-prediction (P and B frames)
• Spatial: Block DCT
• Color: Color space conversion

Scalar quantization of DCT coefficients

Zigzag scanning, runlength and Huffman coding of the nonzero quantized DCT coefficients
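The zigzag scan and runlength step can be sketched as follows. This is a simplified illustration: the `(zero_run, value)` pair format and the `"EOB"` marker are assumptions, the DC coefficient is not treated separately as real coders do, and the final Huffman stage is omitted.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):           # s = row + col indexes one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def run_length(block):
    """Encode quantized coefficients as (zero_run, value) pairs followed by
    an end-of-block marker, scanning in zigzag order."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        v = block[r][c]
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")                  # trailing zeros are never sent
    return pairs


# A small 4x4 example with energy concentrated at low frequencies.
quantized = [
    [12, 0, 0, 0],
    [3, -2, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
symbols = run_length(quantized)
```

`symbols` comes out as `[(0, 12), (1, 3), (1, -2), 'EOB']`; the zigzag order groups the high-frequency zeros at the end of the scan so that a single end-of-block symbol covers them.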

Page 9: Example Video Encoder

[Figure: block diagram of an example video encoder]

Page 10: Example Video Decoder

[Figure: block diagram of an example video decoder]

Page 11: Outline

1. Review of basics of image and video compression
2. Scalable video coding
3. Overview of current video compression standards
4. Object-based video coding (MPEG-4)

Page 12: Motivation for Scalable Coding

Basic situation:
1. Diverse receivers may request the same video
   • Different bandwidths, spatial resolutions, frame rates, computational capabilities
2. Heterogeneous networks and a priori unknown network conditions
   • Wired and wireless links, time-varying bandwidths

When the video is originally coded, the future client and network situation is unknown
• Multiple different situations are likely, each requiring a different compressed bitstream
• Need a different compressed video matched to each situation

Possible solutions:
1. Compress and store many different versions of the same video
2. Real-time transcoding (e.g. decode/re-encode)
3. Scalable coding

Page 13: Scalable Video Coding

Scalable coding:
• Decompose video into multiple layers of prioritized importance
• Code layers into base and enhancement bitstreams
• Progressively combine one or more bitstreams to produce different levels of video quality

Example of scalable coding with base and two enhancement layers: can produce three different qualities (listed in order of increasing quality)
1. Base layer
2. Base + Enh1 layers
3. Base + Enh1 + Enh2 layers

Scalability with respect to: spatial or temporal resolution, bit rate, computation, memory

Page 14: Example of Scalable Coding

Encode image/video into three layers:
• Low-bandwidth receiver: Send only Base layer
• Medium-bandwidth receiver: Send Base & Enh1 layers
• High-bandwidth receiver: Send all three layers

Can adapt to different clients and network situations
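A sender could pick the layers for each receiver with a greedy rule like the one sketched below. The layer names follow the slide; the bit rates and the function name are made-up values for illustration only.

```python
# Hypothetical layer rates in kb/s, listed in priority order.
LAYERS = [("base", 200), ("enh1", 300), ("enh2", 500)]


def layers_for(bandwidth_kbps, layers=LAYERS):
    """Send layers in priority order while the cumulative rate still fits."""
    chosen, used = [], 0
    for name, rate in layers:
        if used + rate > bandwidth_kbps:
            break      # an enhancement layer is useless without the layers below it
        chosen.append(name)
        used += rate
    return chosen


low = layers_for(250)      # low-bandwidth receiver
medium = layers_for(600)   # medium-bandwidth receiver
high = layers_for(1000)    # high-bandwidth receiver
```

The three receivers get `['base']`, `['base', 'enh1']`, and all three layers respectively, all served from one stored encoding.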

Page 15: Scalable Video Coding (cont.)

Three basic types of scalability (refine video quality along three different dimensions):
• Temporal scalability -> temporal resolution
• Spatial scalability -> spatial resolution
• SNR (quality) scalability -> amplitude resolution

Each type of scalable coding provides scalability of one dimension of the video signal
• Can combine multiple types of scalability to provide scalability along multiple dimensions

Page 16: Scalable Coding: Temporal Scalability

Temporal scalability: Based on the use of B-frames to refine the temporal resolution
• B-frames are dependent on other frames
• However, no other frame depends on a B-frame
• Each B-frame may be discarded without affecting other frames
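Because no frame references a B-frame, a lower frame rate can be obtained by simply filtering them out. A minimal sketch, assuming frame types are given as a string in display order (a representation chosen here for illustration):

```python
def drop_b_frames(frame_types):
    """Keep only I- and P-frames; returns (display_index, type) pairs."""
    return [(i, t) for i, t in enumerate(frame_types) if t != "B"]


gop = "IBBPBBPBBPBB"            # 12 frames in display order
base_layer = drop_b_frames(gop)
kept_indices = [i for i, _ in base_layer]
```

`kept_indices` is `[0, 3, 6, 9]`, so the base layer plays at one third of the original frame rate and every remaining frame is still decodable.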

Page 17: Scalable Coding: Spatial Scalability

Spatial scalability: Based on refining the spatial resolution
• Base layer is a low-resolution version of the video
• Enh1 contains the coded difference between the upsampled base layer and the original video
• Also called: pyramid coding
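The two-layer pyramid can be sketched on a tiny image. The 2x2 averaging filter and pixel-replication upsampler are assumptions made to keep the example short, and the layers are left unquantized, so the reconstruction is exact here; a real coder would compress both layers lossily.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 neighbourhood."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def upsample(img):
    """Double the resolution by pixel replication."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def encode_spatial(img):
    """Return (base, enh1): a low-resolution image and the difference
    between the original and the upsampled base layer."""
    base = downsample(img)
    up = upsample(base)
    enh1 = [[img[y][x] - up[y][x] for x in range(len(img[0]))]
            for y in range(len(img))]
    return base, enh1


def decode_spatial(base, enh1=None):
    """Decode the base layer alone, or base plus enhancement."""
    up = upsample(base)
    if enh1 is None:
        return up
    return [[up[y][x] + enh1[y][x] for x in range(len(up[0]))]
            for y in range(len(up))]


image = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 8, 8],
    [4, 4, 8, 8],
]
base, enh1 = encode_spatial(image)
```

A low-resolution client decodes `base` alone, while a full-resolution client adds `enh1` on top of the upsampled base.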

Page 18: Scalable Coding: SNR (Quality) Scalability

SNR (quality) scalability: Based on refining the amplitude resolution
• Base layer uses a coarse quantizer
• Enh1 applies a finer quantizer to the difference between the original DCT coefficients and the coarsely quantized base layer coefficients
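A numeric sketch of the two quantizers makes the refinement concrete. The step sizes (16 and 4) and the sample coefficients are arbitrary assumptions.

```python
def quantize(x, step):
    """Uniform scalar quantizer: index of the nearest multiple of `step`."""
    return int(round(x / step))


def encode_snr(coeffs, coarse=16, fine=4):
    """Base layer: coarse quantization. Enh1: fine quantization of the
    error left by the base layer."""
    base = [quantize(c, coarse) for c in coeffs]
    enh1 = [quantize(c - b * coarse, fine) for c, b in zip(coeffs, base)]
    return base, enh1


def decode_snr(base, enh1=None, coarse=16, fine=4):
    """Reconstruct from the base layer alone or from base plus Enh1."""
    if enh1 is None:
        return [b * coarse for b in base]
    return [b * coarse + e * fine for b, e in zip(base, enh1)]


coeffs = [100, -37, 9, 3]          # hypothetical DCT coefficients
base, enh1 = encode_snr(coeffs)
base_only = decode_snr(base)       # [96, -32, 16, 0]
refined = decode_snr(base, enh1)   # [100, -36, 8, 4]
```

The total absolute error drops from 19 with the base layer alone to 3 once Enh1 is added, without changing the spatial or temporal resolution.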

Page 19: Summary of Scalable Video Coding

Three basic types of scalable coding:
• Temporal scalability
• Spatial scalability
• SNR (quality) scalability

Scalable coding produces different layers with prioritized importance

Prioritized importance is key for a variety of applications:
• Adapting to different bandwidths, or client resources such as spatial or temporal resolution or computational power
• Facilitates error resilience by explicitly identifying the most important and less important bits

Page 20: Outline

1. Review of basics of image and video compression
2. Scalable video coding
3. Overview of current video compression standards
4. Object-based video coding (MPEG-4)

Page 21: Motivation for Standards

Goal of standards:
• Ensuring interoperability: Enabling communication between devices made by different manufacturers
• Promoting a technology or industry
• Reducing costs

Page 22: What do the Standards Specify?

• Not the encoder
• Not the decoder
• Just the bitstream syntax and the decoding process (e.g. use IDCT, but not how to implement the IDCT)

Enables improved encoding & decoding strategies to be employed in a standard-compatible manner

[Figure: Encoder -> Bitstream -> Decoder, with the scope of standardization (decoding process) marked]

Page 23: Current Image and Video Compression Standards

Standard   Application                                            Bit Rate
JPEG       Continuous-tone still-image compression                Variable
H.261      Video telephony and teleconferencing over ISDN         p x 64 kb/s
MPEG-1     Video on digital storage media (CD-ROM)                1.5 Mb/s
MPEG-2     Digital television                                     2-20 Mb/s
H.263      Video telephony over PSTN                              33.6-? kb/s
MPEG-4     Object-based coding, synthetic content, interactivity  Variable
JPEG-2000  Improved still image compression                       Variable
H.26L      Improved video compression                             10's to 100's kb/s

Page 24: Comparing Current Video Compression Standards

Based on the same fundamental building blocks:
• Motion-compensated prediction (I, P, and B frames)
• 2-D Discrete Cosine Transform (DCT)
• Color space conversion
• Scalar quantization, runlengths, Huffman coding

Additional tools added for different applications:
• Progressive or interlaced video
• Improved compression, error resilience, scalability, etc.

MPEG-1/2/4, H.261/3/L: Frame-based coding
MPEG-4: Object-based coding and synthetic video

Page 25: MPEG-1 and MPEG-2

MPEG-1 (1991)
• Goal: Compression for digital storage media (e.g. CD-ROM)
• Achieves VHS-quality video and audio at ~1.5 Mb/s

MPEG-2 (1993)
• Goal: Superset of MPEG-1 to support higher bit rates, higher resolutions, and interlaced pictures
• Original goal was to support interlaced video from conventional television; eventually extended to support HDTV
• Provides: Field-based coding and scalability tools

Page 26: Example Use of I-, P-, B-frames: MPEG Group of Pictures (GOP)

[Figure: GOP diagram; arrows show prediction dependencies between frames]

Page 27: MPEG Group of Pictures (GOP) Structure

• Composed of I, P, and B frames
• Arrows show prediction dependencies
• Periodic I-frames enable random access into the coded bitstream
• Parameters: (1) spacing between I frames, (2) number of B frames between I and P frames
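The two GOP parameters are enough to generate the frame-type pattern, and the prediction dependencies imply a coding order that differs from the display order. The sketch below is illustrative only; the function names are hypothetical, and the handling of trailing B-frames is simplified (in a real stream they would follow the next GOP's I-frame).

```python
def gop_pattern(i_spacing, num_b):
    """Frame types for one GOP in display order.
    i_spacing: distance between I-frames; num_b: B-frames between anchors."""
    types = []
    for k in range(i_spacing):
        if k == 0:
            types.append("I")
        elif k % (num_b + 1) == 0:
            types.append("P")
        else:
            types.append("B")
    return types


def coding_order(types):
    """Reorder display indices so each anchor (I or P) is coded before the
    B-frames that are predicted from it."""
    order, pending_b = [], []
    for idx, t in enumerate(types):
        if t == "B":
            pending_b.append(idx)       # must wait for the next anchor
        else:
            order.append(idx)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)             # simplification for trailing B-frames
    return order


pattern = "".join(gop_pattern(12, 2))        # "IBBPBBPBBPBB"
order = coding_order(gop_pattern(7, 2))      # display order I B B P B B P
```

For the seven-frame example `order` is `[0, 3, 1, 2, 6, 4, 5]`: frame 3 (a P-frame) is sent before B-frames 1 and 2 because both need it as their future reference.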

Page 28: MPEG Structure

MPEG codes video in a hierarchy of layers. The sequence layer is not shown.

[Figure: GOP layer -> Picture layer -> Slice layer -> Macroblock layer -> Block layer]

Page 29: MPEG-2 Profiles and Levels

Goal: To enable more efficient implementations for different applications (interoperability points)
• Profile: Subset of the tools applicable for a family of applications
• Level: Bounds on the complexity for any profile

[Figure: grid of levels (Low, Main, High) against profiles (Simple, Main, High)]
• HDTV: Main Profile at High Level (MP@HL)
• DVD & SD Digital TV: Main Profile at Main Level (MP@ML)

Page 30: Goals of MPEG-4

Primary goals: New functionalities (not just better compression)
• Object-based or content-based representation
  • Separate coding of individual visual objects
  • Content-based access and manipulation
• Integration of natural and synthetic objects
• Interactivity
• Communication over error-prone environments

Includes frame-based coding techniques from earlier standards

Page 31: Comparing MPEG-1/2 and H.261/3 with MPEG-4

MPEG-1/2 and H.261/H.263: Algorithms for compression
• Basically describe a pipe for storage or transmission
• Frame-based
• Emphasis on hardware implementation

MPEG-4: Set of tools for a variety of applications
• Define tools and glue to put them together
• Object-based and frame-based
• Emphasis on software
• Downloadable algorithms (not encoders or decoders)

Page 32: Outline

1. Review of basics of image and video compression
2. Scalable video coding
3. Overview of current video compression standards
4. Object-based video coding (MPEG-4)

Page 33: Comments on Object-based Processing

Basic goal: Separate encoding/decoding of separate objects in a scene

Separate processing of each object enables:
• Identification and selective decoding and/or processing of the object of interest
• Facilitates interactivity and manipulation of content
• Processing of content in the compressed domain
  • Possible without decoding or segmentation at decoder

Used for many years in authoring/production
• Video: bluescreening, e.g. weather news
• Audio: individual processing of each voice

MPEG-4 also enables the end user to have object-based processing

Page 34: Different Parts of MPEG-4

Video
• Coding and expression of natural and synthetic video objects

Audio
• Coding and expression of natural and synthetic speech and audio objects

Systems
• Scene description: Composition of different audio and video objects in the scene
  • BIFS: Binary Format for Scene Description
• Buffering, multiplexing, timing
• Interaction

Delivery (Delivery Multimedia Integration Framework, DMIF)
• Setup of connection (broadcast, interactive)
• Network is transparent to application

Page 35: Scene Description

Scene description:
• Describes the spatio-temporal positioning of the individual audio & video (AV) objects to compose the scene
  • AV objects: audio, video, natural, synthetic, 2-D, 3-D
• Transmitted separately from object bitstreams
  • Scene description info is a property of the scene’s structure rather than of individual objects
• Enables scene modification without decoding objects
• Can be dynamically altered

Page 36: Example of MPEG-4 Scene

[Figure: example MPEG-4 scene. Credit: MPEG Committee]

Page 37: Scene Description (cont.)

Hierarchical, tree structure:
• Leaf nodes: individual AV objects
• Other nodes: meaningful grouping

[Figure: scene description tree. Credit: MPEG Committee]

Page 38: Example MPEG-4 Decoding Process

[Figure: example MPEG-4 decoding process. Credit: MPEG Committee]

Page 39: Object-based Processing in the Compressed Domain

Each video or audio object is coded into a separate bitstream
• Scene description contains all non-coded information

Possible operations:
• Add/delete an object: Add/discard bitstream, e.g. individual instruments in an orchestra
• Manipulate (e.g. move) object: Alter visual/audio scene composition

Many object-based operations can be performed without requiring decoding
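The separation between scene description and object bitstreams can be pictured with a small tree structure. This is a toy model and not the BIFS format: the `Node` class, its fields, and the object names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    stream_id: Optional[int] = None   # leaf nodes reference a coded object bitstream
    position: tuple = (0, 0)          # placement within the scene
    children: list = field(default_factory=list)


def leaves(node):
    """Return the AV-object leaves below `node`, in tree order."""
    found = [node] if node.stream_id is not None else []
    for child in node.children:
        found.extend(leaves(child))
    return found


def remove_object(node, name):
    """Delete an object by dropping its node; no bitstream is decoded."""
    node.children = [c for c in node.children if c.name != name]
    for child in node.children:
        remove_object(child, name)


def move_object(root, name, position):
    """Move an object by editing the scene description only."""
    for leaf in leaves(root):
        if leaf.name == name:
            leaf.position = position


# Coded object data stays opaque; the operations below never touch it.
bitstreams = {1: b"...", 2: b"...", 3: b"..."}
scene = Node("scene", children=[
    Node("person", children=[Node("voice", 1), Node("sprite", 2)]),
    Node("background", 3),
])

remove_object(scene, "voice")
move_object(scene, "sprite", (40, 10))
remaining = [n.name for n in leaves(scene)]
```

After the two edits `remaining` is `['sprite', 'background']`, and the contents of `bitstreams` are unchanged, which is the sense in which these operations happen in the compressed domain.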

Page 40: MPEG-4 Natural Video

MPEG-4 has two primary goals for natural video coding:
• High compression efficiency coding
  • Rectangular frames
  • High coding efficiency (64-384 kb/s), low latency, low complexity
  • Error resilience against packet loss, burst errors on wireless links
  • Applications include: video streaming over the Internet, video over 3G cellular systems
• Object-based coding
  • Content-based functionalities
  • Arbitrarily shaped visual objects
  • Separate encoding & decoding of each object
  • Greatly improved content creation capabilities, as well as interactivity with different objects at the client

Page 41: MPEG-4 Coding of Natural Video

Classes of video to represent:
• Rectangular images (frame-based coding)
  • Shape (rectangle) does not change with time
  • Code motion and amplitude information
  • Use conventional coding methods, e.g. MPEG-1/2
• Arbitrarily shaped (non-rectangular) image regions (object-based coding)
  • Shape usually changes with time
  • Must code motion, amplitude (texture), and shape
  • Arbitrary & time-varying shape complicates coding
  • Also describe how objects are composed to form the scene (scene description)
  • Separate encoding and decoding of each object

Page 42: MPEG-4 Natural Video Coding

Extension of MPEG-1/2-type algorithms to code arbitrarily shaped objects

Basic idea: Extend Block-DCT and Block-ME/MC-prediction to code arbitrarily shaped objects

[Figure: frame-based coding compared with object-based coding. Credit: MPEG Committee]

Page 43: Coding of Arbitrarily Shaped Video Objects

Following slides briefly discuss different aspects of coding arbitrarily shaped video objects:
• Coding of texture (amplitude) information
• MC-prediction
• I, P, B coding of objects
• Coding of shape information

Goal: To give a brief, conceptual overview (not covered on problem sets or quiz)

Key points to take away:
1. Different attributes to code for arbitrarily shaped video objects: texture, motion, & shape information
2. MPEG-4 extends block-based coding to code arbitrarily shaped objects (not an elegant solution, but it works)

Page 44: Example of Arbitrarily Shaped Object

Arbitrarily shaped 2-D object (image region): Video object plane (VOP) in MPEG-4

[Figure: example video object plane. Credit: MPEG Committee]

Page 45: Comments on Segmentation

Segmentation of video into objects is not standardized (it is part of the encoder)

Different segmentation scenarios:
• Sometimes segmentation is available, e.g. synthetically generated content
• Sometimes it is relatively easy, e.g. bluescreening or video-conferencing
• Usually it is very difficult

Page 46: Coding the Texture of an Arbitrarily Shaped Object

Texture (amplitude) coded by Block-DCT adapted for arbitrarily shaped support
1. Embed VOP in rectangle
2. Separate processing of each 8x8 block
   a) Interior -> Conventional Block-DCT
   b) Exterior -> Discard
   c) Boundary -> Extrapolate then Block-DCT

[Figure credit: MPEG Committee]
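The per-block decision can be sketched by classifying each block of a binary support mask. The mask layout, the function name, and the tiny 2x2 block size in the example are assumptions made for readability; the slide uses 8x8 blocks, and the image dimensions are assumed to be multiples of the block size.

```python
def classify_blocks(mask, size=8):
    """Label each size x size block of a binary mask (1 = inside the object)
    as 'interior', 'exterior', or 'boundary'.
    Keys are (block_col, block_row)."""
    h, w = len(mask), len(mask[0])
    labels = {}
    for by in range(0, h, size):
        for bx in range(0, w, size):
            pixels = [mask[y][x]
                      for y in range(by, by + size)
                      for x in range(bx, bx + size)]
            if all(pixels):
                label = "interior"     # conventional Block-DCT
            elif not any(pixels):
                label = "exterior"     # discard
            else:
                label = "boundary"     # extrapolate, then Block-DCT
            labels[(bx // size, by // size)] = label
    return labels


# Hypothetical 4x6 support mask, classified with 2x2 blocks for brevity.
mask = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
labels = classify_blocks(mask, size=2)
```

The left column of blocks is exterior, the middle column is boundary, and the right column is interior, so only the middle blocks need the extrapolation step.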

Page 47: MC-Prediction for Texture Coding of Arbitrarily Shaped Object

Block-based ME/MC-P adapted for arbitrarily shaped support:
1. Extrapolate arbitrarily shaped object to fill rectangle
2. Perform conventional block-based ME/MC-P
   • Error metric computed only over object’s support in current frame

Also: Parametric motion models (e.g. affine, perspective)

[Figure credit: MPEG Committee]

Page 48: MC-Prediction for Video Object Planes: I, P, and B VOPs

MC-prediction for VOPs:
• I-VOP: Intra-coded VOP (no prediction)
• P-VOP: Predicted VOP
• B-VOP: Bi-directionally predicted VOP

Page 49: Binary Shape Coding

Opaque objects: Each pixel is either inside or outside the support
• Shape given by binary alpha map (bitmap or binary mask)

Many possible approaches for lossless and lossy shape coding
• e.g. Describe shape by chain code, polynomials, splines, bitmap

MPEG-4: Block-based Context-based Arithmetic Encoding (CAE)
1. Embed support in rectangle
2. Separate processing of 16x16 blocks
   a) Interior (opaque) blocks (completely within object)
   b) Exterior (transparent) blocks (completely outside object)
   c) Boundary blocks -> CAE

Also motion-compensated CAE

Page 50: Binary Shape Coding: Block-based Shape Coding

Different 16x16 blocks: Interior, boundary, and exterior

Page 51: Binary Shape Coding: Block-based CAE (cont.)

Coding of boundary blocks using CAE:
• Intra-shape coding
  • Context defined by 10-pixel template
• Inter-shape coding
  • MC-shape using shape motion vector
  • Context defined by 9-pixel template from current and previous frames

[Figure: context templates in the previous frame and the current frame]

Page 52: Sprite Coding (Background Prediction)

Sprite: Large background image
• Hypothesis: Same background exists for many frames, with changes resulting from camera motion and occlusions

One possible coding strategy:
1. Code & transmit entire sprite once
2. Only transmit camera motion parameters for each subsequent frame

Significant coding gain for some scenes
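The strategy can be sketched with a deliberately simplified camera model. The sketch assumes the camera motion is a pure integer translation, so each frame's background is just a window into the sprite; the function name and data are hypothetical, and real sprite coding uses richer warping models.

```python
def frame_from_sprite(sprite, offset, width, height):
    """Reconstruct a frame's background as the window of the sprite that
    the camera sees at `offset` = (x, y)."""
    ox, oy = offset
    return [row[ox:ox + width] for row in sprite[oy:oy + height]]


# Step 1: the whole sprite is transmitted once (here a 3 x 8 test pattern).
sprite = [[10 * y + x for x in range(8)] for y in range(3)]

# Step 2: each later frame needs only its camera parameters.
camera_path = [(0, 0), (2, 0), (4, 1)]
frames = [frame_from_sprite(sprite, off, width=4, height=2) for off in camera_path]
```

Each 4 x 2 background is rebuilt from two numbers per frame, which is where the coding gain comes from when the same background persists across many frames.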

Page 53: Sprite Coding Example

[Figure: sprite (background), foreground object, and reconstructed frame. Credit: MPEG Committee]

Page 54: Related MPEG Standards (non-compression)

MPEG-7 “Multimedia Content Description Interface”
• Goal: A method for describing multimedia content to enable efficient searching and management of multimedia

MPEG-21 “Multimedia Framework”
• Goal: To enable the electronic commerce of digital media content

Page 55: References and Further Reading

General video compression references:
• J.G. Apostolopoulos and S.J. Wee, "Video Compression Standards", Wiley Encyclopedia of Electrical and Electronics Engineering, John Wiley & Sons, Inc., New York, 1999.
• V. Bhaskaran and K. Konstantinides, Image and Video Compression Standards: Algorithms and Architectures, Kluwer Academic Publishers, Boston, Massachusetts, 1997.
• J.L. Mitchell, W.B. Pennebaker, C.E. Fogg, and D.J. LeGall, MPEG Video Compression Standard, Chapman & Hall, New York, 1997.
• B.G. Haskell, A. Puri, A.N. Netravali, Digital Video: An Introduction to MPEG-2, Kluwer Academic Publishers, Boston, 1997.
• MPEG web site: http://drogo.cselt.stet.it/mpeg

Page 56: References and Further Reading (cont.)

Video compression standards documents:
• Video codec for audiovisual services at p x 64 kbits/s, ITU-T Recommendation H.261, International Telecommunication Union, 1990.
• Video coding for low bit rate communication, ITU-T Recommendation H.263, International Telecommunication Union, version 1, 1996; version 2, 1997.
• ISO/IEC 11172, Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbits/s, International Organization for Standardization (ISO), 1993.
• ISO/IEC 13818, Generic coding of moving pictures and associated audio information, International Organization for Standardization (ISO), 1996.
• ISO/IEC 14496, Coding of audio-visual objects, International Organization for Standardization (ISO), 1999.