Scalable Video Coding, Yao Wang, Polytechnic Institute of NYU, Brooklyn, NY 11201 (Modified from slides prepared by Amy Reibman)

Scalable Video Coding - Department of Electrical & Computer

Feb 03, 2022

Page 1: Scalable Video Coding - Department of Electrical & Computer

Scalable Video Coding

Yao Wang

Polytechnic Institute of NYU, Brooklyn, NY 11201

(Modified from slides prepared by Amy Reibman)

Page 2: Scalable Video Coding - Department of Electrical & Computer

Outline

• Heterogeneous clients
– Simulcast
– Transcoding
– Scalability

• Definition of scalability

• Four (or more) types of scalability

• Evolution of the standards

Page 3: Scalable Video Coding - Department of Electrical & Computer

Heterogeneity

• Many heterogeneous clients
– Different bandwidth requirements
– Different decoding complexity and power constraints
– Different screen sizes

• Heterogeneous networks
– Different rates on different networks
• Mobile phone
• Corporate LAN
– Dynamically varying rates
• Congestion in the network
• Distance to base station

ARReibman, 2011 Scalable video coding 3

Page 4: Scalable Video Coding - Department of Electrical & Computer

Simulcast and Transcoding

• Simulcast
– Compress video once for each client capability
– Supporting a range of possible clients requires storage/transmission at each possible rate

• Transcoding
– Compress video once; transcode to a lower bit-rate based on client capability
– Simplest scenario: decode and re-encode
– Also possible to reduce complexity by careful design; however, it almost always involves more than VLC
– Supporting a range of possible clients requires transcoding to each possible rate

Page 5: Scalable Video Coding - Department of Electrical & Computer

Illustration of Scalable Coding

[Figure: the same frame decoded at four rates: 6.5 kbps, 133.9 kbps, 21.6 kbps, 436.3 kbps. One axis: spatial scalability. Other axis: amplitude (SNR or quality) scalability.]

©Yao Wang, 2006

Page 6: Scalable Video Coding - Department of Electrical & Computer

Embedded Bit Stream


Page 7: Scalable Video Coding - Department of Electrical & Computer

Scalable Video Coding

• Definition
– Ability to recover acceptable image/video by decoding only parts of the bitstream

• Ideal goal is an embedded bitstream
– Truncate at any arbitrary rate

• Practical video coder
– Layered coder: base layer provides basic quality, successive layers refine the quality incrementally
– Fine granularity (FGS): each layer is very thin

• To be useful, a scalable solution needs to be more efficient than Simulcast or Transcoding

Page 8: Scalable Video Coding - Department of Electrical & Computer

Functionality Provided by Scalability

• Graceful degradation if the less important parts of the bitstream are not delivered or received or decoded (lost, discarded)
• Bit-rate adaptation at the sender or intermediate nodes to match the channel throughput
• Format adaptation for backwards compatible extensions
• Power adaptation for a trade-off between decoding time (power consumption) and quality
• Transport module can provide more protection against packet losses to lower layers (unequal error protection or UEP)
• Overall robustness to bandwidth fluctuation and packet losses

Page 9: Scalable Video Coding - Department of Electrical & Computer

Design Considerations for Scalability

• Compression efficiency
• Encoder and decoder complexity
• Resilience to losses
• Flexible partitioning for rate adaptation
– Range of rate partitioning (ratio of base rate to total rate)
– Number of partitions (finely granular, or a few discrete levels)
• Compatibility with standards
• Ease of prioritization

• Prediction structure controls most of these!

Page 10: Scalable Video Coding - Department of Electrical & Computer

Scalability methods

• Temporal scalability (frame rate)
• Spatial scalability (picture size)
• Amplitude (AKA SNR or Quality) scalability (quantization stepsize or QP)
• Frequency scalability (transform coefficients)
• Object-based or ROI scalability (content)

Page 11: Scalable Video Coding - Department of Electrical & Computer

MPEG-1,2,4, H.263 Temporal Scalability

[Figure: frame prediction structure, with rows labeled "Both layers" and "Base layer".]

Can also be considered three layers: Layer 0: black (I-frames), Layer 1: green (P-frames), Layer 2: brown (B-frames)

Page 12: Scalable Video Coding - Department of Electrical & Computer

H.264: Temporal Scalability with Hierarchical Prediction

Page 13: Scalable Video Coding - Department of Electrical & Computer

Temporal Scalability with Hierarchical B Pictures

Problem: encoding delay = number of frames in a GOP (between black frames)

OK for non-realtime applications: live streaming, video-on-demand
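To make the layering concrete, here is a minimal sketch of how frames in a hierarchical GOP could be assigned to temporal layers. It assumes a dyadic structure (GOP size a power of two, key frames at multiples of the GOP size); the function names are hypothetical and not taken from any standard or from the slides.

```python
def temporal_layer(frame_idx: int, gop_size: int) -> int:
    """Temporal layer of a frame in a dyadic hierarchical GOP (assumed structure)."""
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    levels = gop_size.bit_length() - 1        # log2(gop_size)
    pos = frame_idx % gop_size
    if pos == 0:
        return 0                              # key frame: lowest layer
    trailing_zeros = (pos & -pos).bit_length() - 1
    return levels - trailing_zeros


def frames_up_to_layer(num_frames: int, gop_size: int, max_layer: int) -> list:
    """Frame indices a decoder keeps when it drops all layers above max_layer."""
    return [i for i in range(num_frames)
            if temporal_layer(i, gop_size) <= max_layer]


# With a GOP of 8, dropping the top layer halves the frame rate.
print([temporal_layer(i, 8) for i in range(8)])
print(frames_up_to_layer(9, 8, 2))
```

Each dropped layer halves the frame rate, and the GOP size sets how many future frames the encoder must wait for, which is the delay problem noted above.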

Page 14: Scalable Video Coding - Department of Electrical & Computer

Temporal Scalability with Hierarchical Prediction and Zero Delay (Hierarchical P)

Good for realtime applications: chat or conferencing

Page 15: Scalable Video Coding - Department of Electrical & Computer

Comments about Temporal Scalability

• MPEG-1, MPEG-2, MPEG-4, and H.263+ all had capability for temporal scalability through B-frames
– These all require added delay at encoder/decoder

• H.264 added flexible temporal prediction, enabling more flexible temporal scalability
– This can be implemented with or without added delay
– Hierarchical B structure with large GOP size not only enables temporal scalability with many layers, but also generally improves coding efficiency over using the IPP... structure

Page 16: Scalable Video Coding - Department of Electrical & Computer

Efficiency of H.264 Temporal Scalability

Page 17: Scalable Video Coding - Department of Electrical & Computer

Spatial and Temporal Scalability

[Figure: prediction structure, with rows labeled "Both layers" and "Base layer".]

Page 18: Scalable Video Coding - Department of Electrical & Computer

Spatial Scalability Through Down/Up Sampling

[Figure: codec block diagram; only the block label "ME" survives in the transcript.]
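Since the diagram did not survive, the down/up sampling idea named in the slide title can be sketched on a 1-D signal. This is an illustrative toy under stated assumptions, not the codec in the figure: pair averaging stands in for the downsampling filter, sample repetition for the interpolator, and there is no motion compensation or transform. All function names and step sizes are hypothetical.

```python
def downsample2(x):
    """Average each pair of samples (1-D stand-in for 2-D decimation)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def upsample2(x):
    """Sample-repeat interpolation back to full resolution."""
    return [v for v in x for _ in range(2)]


def encode(signal, q_base, q_enh):
    """Base layer: quantized low-resolution signal.
    Enhancement layer: quantized error of the upsampled base reconstruction."""
    base_idx = [round(v / q_base) for v in downsample2(signal)]
    prediction = upsample2([i * q_base for i in base_idx])
    enh_idx = [round((s - p) / q_enh) for s, p in zip(signal, prediction)]
    return base_idx, enh_idx


def decode(base_idx, enh_idx, q_base, q_enh, use_enhancement=True):
    base_rec = [i * q_base for i in base_idx]
    if not use_enhancement:
        return base_rec                       # half-resolution output
    prediction = upsample2(base_rec)
    return [p + e * q_enh for p, e in zip(prediction, enh_idx)]


signal = [10, 12, 20, 24, 30, 28, 40, 44]     # even length assumed
base_idx, enh_idx = encode(signal, q_base=8, q_enh=2)
print(decode(base_idx, enh_idx, 8, 2, use_enhancement=False))
print(decode(base_idx, enh_idx, 8, 2))
```

A base-only decoder gets a coarse half-resolution signal; a decoder that also receives the enhancement layer recovers the full-resolution signal to within half the enhancement step size.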

Page 19: Scalable Video Coding - Department of Electrical & Computer

Amplitude Scalability

• Quality in each layer differs because of the quantization level
• Only the base layer can do intra-coding
• Enhancement layer(s) code the residual (between original and lower layer)

Page 20: Scalable Video Coding - Department of Electrical & Computer

Amplitude (SNR) Scalability by Multi-Stage Quantization

[Figure: encoder and decoder block diagrams; surviving labels: "Prediction error", "Larger Q", "Smaller Q".]

Page 21: Scalable Video Coding - Department of Electrical & Computer

Multi-Stage Quantization


Page 22: Scalable Video Coding - Department of Electrical & Computer

Bitplane coding

• Special case of multistage quantization, where successive step sizes differ by a factor of 2
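Multi-stage quantization, including the factor-of-2 special case, can be sketched for a single coefficient as follows. This is a minimal illustration that assumes uniform rounding quantizers at every stage; the function names and step sizes are hypothetical.

```python
def multistage_quantize(x, steps):
    """Quantize x in stages; each stage quantizes the error left by the previous one."""
    indices, residual = [], x
    for q in steps:
        idx = round(residual / q)
        indices.append(idx)
        residual -= idx * q
    return indices


def reconstruct(indices, steps, num_layers):
    """Reconstruction when only the first num_layers stages are received."""
    return sum(i * q for i, q in zip(indices[:num_layers], steps))


steps = [16, 8, 4, 2, 1]          # successive step sizes differ by a factor of 2
indices = multistage_quantize(37, steps)
for k in range(1, len(steps) + 1):
    print(k, reconstruct(indices, steps, k))
```

After k stages the reconstruction error is at most half of the k-th step size, so every extra layer received tightens the approximation.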

Page 23: Scalable Video Coding - Department of Electrical & Computer

Prediction strategies

• Predict from the base layer only (Option 1):
– Can be implemented with bit plane coding (MPEG-4 FGS)
– No mismatch at decoder
– Low prediction accuracy if the base layer uses a large Q

• Predict from the highest layer (Option 2):
– Mismatch at decoder receiving only lower layers!
– When the prediction requires unavailable information, this is called "drift"
– High prediction accuracy
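The difference between the two options can be seen in a scalar toy model, sketched below. It assumes one sample per frame, previous-frame prediction with no motion, and two quantization stages; the step sizes and the test signal are made up for illustration.

```python
def base_only_errors(frames, q_base, q_enh, predict_from_enh):
    """Error seen by a decoder that receives only the base layer.

    predict_from_enh=False: encoder predicts from its base reconstruction (Option 1).
    predict_from_enh=True:  encoder predicts from its enhancement reconstruction (Option 2).
    """
    enc_base = enc_enh = 0.0      # encoder reconstructions of the previous frame
    dec_base = 0.0                # base-only decoder reconstruction of the previous frame
    errors = []
    for x in frames:
        pred = enc_enh if predict_from_enh else enc_base
        residual = x - pred
        base_part = round(residual / q_base) * q_base
        enh_part = round((residual - base_part) / q_enh) * q_enh
        enc_base = pred + base_part
        enc_enh = pred + base_part + enh_part
        # The base-only decoder never sees enh_part, so it can only add
        # the base part to its own previous reconstruction.
        dec_base = dec_base + base_part
        errors.append(abs(x - dec_base))
    return errors


frames = [100 + 3 * t for t in range(20)]     # slowly rising toy signal
print(max(base_only_errors(frames, 8, 1, predict_from_enh=False)))
print(max(base_only_errors(frames, 8, 1, predict_from_enh=True)))
```

With Option 1 the base-only error stays within half the base step size. With Option 2 the decoder's prediction diverges from the encoder's and the error accumulates from frame to frame, which is the drift described above.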

Page 24: Scalable Video Coding - Department of Electrical & Computer

Prediction structures for scalability (Options 1 and 2)

Option 1: Enhancement layer is predicted only from same frame in base layer
– MPEG-2 Spatial Scalability (1), MPEG-4 FGS
– VERY INEFFICIENT!!
– No drift in base layer

Option 2: Enhancement layer is used to predict base layer
– MPEG-2 SNR scalability
– Errors propagate into base layer
– More efficient

Page 25: Scalable Video Coding - Department of Electrical & Computer

More Efficient Prediction Structures (Options 3 and 4)

• Base layer predicts from base layer; higher layer predicts from either the higher layer or the base layer (two-loop control) (Option 3)
• Allow the base layer to be predicted from the enhancement layer; enhancement layer predicts from enhancement layer (Option 4)

Page 26: Scalable Video Coding - Department of Electrical & Computer

Prediction structures for scalability (Options 3 and 4)

2-loop control:
– Both base and enhancement layers use their own prediction loop
– MPEG-2 Spatial Scalability (2), H.264 CGS
– No drift in base layer
– Reasonably efficient

H.264 MGS:
– Base: non-key frames predict using enhancement; key frames predict from base layer key frames
– Enhancement: predict from enhancement
– Tradeoff between efficiency and robustness

Page 27: Scalable Video Coding - Department of Electrical & Computer

Allow both intra-layer and inter-layer prediction

• Inter-layer prediction
– Predict from the same frame of the lower layer (higher Q), quantize the error using lower Q

• Intra-layer prediction
– Predict from previous frame (or previous blocks of the current frame) of the current layer (lower Q), quantize the error using the same lower Q

• Choose whichever is better in the RD sense (H.264/SVC quality scalability)
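One common way to make a choice "in the RD sense" is to minimize a Lagrangian cost J = D + λR. The slides do not state which criterion the encoder uses, so the sketch below is an assumption; the distortion and rate numbers are invented for illustration.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def choose_prediction(candidates, lam):
    """candidates maps a mode name to (distortion, rate_bits); returns the cheapest mode."""
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))


# Hypothetical per-block measurements for the two prediction modes.
candidates = {
    "inter_layer": (40.0, 120),   # lower distortion, more bits
    "intra_layer": (55.0, 60),    # higher distortion, fewer bits
}
print(choose_prediction(candidates, lam=0.1))
print(choose_prediction(candidates, lam=0.5))
```

A small λ favors the mode with lower distortion, while a large λ favors the mode that costs fewer bits.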

Page 28: Scalable Video Coding - Department of Electrical & Computer

Frequency scalability (AKA Data Partitioning)

• Base layer: low frequencies of DCT
• Enhancement layer: remaining high frequencies of DCT
• Standardized in MPEG-2
• A breakpoint included in the bitstream made it very easy to partition
• With a single encoder prediction loop, a decoder missing the high frequencies suffers strong drift
– (Prediction assumes all coefficients are available in the previous frame)
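The partitioning step itself is simple, as the following sketch suggests. It assumes the coefficients of a block are already in zigzag (low to high frequency) order and treats the breakpoint as a plain index; the real MPEG-2 bitstream syntax is not modeled, and the names are hypothetical.

```python
def partition(coeffs_zigzag, breakpoint):
    """Split zigzag-ordered coefficients into base (low) and enhancement (high) parts."""
    return coeffs_zigzag[:breakpoint], coeffs_zigzag[breakpoint:]


def merge(base, enh=None, block_len=64):
    """Rebuild a block; missing high frequencies are replaced by zeros."""
    if enh is None:
        enh = [0] * (block_len - len(base))
    return base + enh


coeffs = [52, -7, 3, 0, 1, 0, 0, -1]          # toy 8-coefficient block
base, enh = partition(coeffs, breakpoint=3)
print(merge(base, enh, block_len=8))           # both partitions received
print(merge(base, None, block_len=8))          # base partition only
```

When only the base partition arrives, the decoder's reference frame lacks the high frequencies the encoder used for prediction, which is the source of the drift noted above.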

Page 29: Scalable Video Coding - Department of Electrical & Computer

Frequency scalability: Effect of lost information

[Figure: two blocks at encoder; two blocks at decoder.]

• Errors from previous frame propagate into current frame
• Motion causes error to spread, not just spatially, but in frequency
• Prediction method affects degree of propagation

Page 30: Scalable Video Coding - Department of Electrical & Computer

MPEG-2 Scalability: First standard that offers scalability

• Data partition
– All headers, MVs, first few DCT coefficients in the base layer
– Can be implemented at the bit stream level
– Simple

• SNR scalability
– Base layer includes coarsely quantized DCT coefficients
– Enhancement layer further quantizes the base layer quantization error
– Relatively simple
– Predict from enhancement layer of previous frame

• Spatial scalability
– Complex
– Predict from previous frame of the same layer, or upsampled frame from lower layer

• Temporal scalability
– Simple; two layers only

• Drift problem:
– Occurs if the encoder's base layer information for a current frame depends on the enhancement layer information for a previous frame
– Exists in the data partition and SNR scalability modes

Page 31: Scalable Video Coding - Department of Electrical & Computer

MPEG-2 SNR Scalability Encoder


Page 32: Scalable Video Coding - Department of Electrical & Computer

MPEG-2 Spatial Scalability Codec


Page 33: Scalable Video Coding - Department of Electrical & Computer

Fine Granularity Scalability (FGS) in MPEG-4

• MPEG-4 achieves fine granularity quality scalability through bit-plane coding
– Base layer coded using a large QP on DCT coefficients
– Quantization errors for DCT coefficients are represented losslessly in binary bits
– The bit planes are coded successively, from the most significant bit to the least.
– The bit plane within each block is coded using run-length coding.
– The same bit plane from all blocks forms one layer
– Temporal prediction from base layer frames
– Efficiency depends on base layer QP (or base layer rate)
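The bit-plane idea can be sketched on the magnitudes of a few quantization errors. This is a simplified illustration: signs are ignored, at least one magnitude is assumed to be nonzero, and the run-length step only counts zeros before each 1. The actual MPEG-4 FGS symbol alphabet and VLC tables are not reproduced, and the function names are hypothetical.

```python
def bit_planes(magnitudes):
    """Split non-negative integers into bit planes, most significant plane first."""
    num_planes = max(magnitudes).bit_length()
    return [[(m >> p) & 1 for m in magnitudes]
            for p in range(num_planes - 1, -1, -1)]


def run_lengths(plane):
    """Number of zeros preceding each 1 in a plane (trailing zeros are dropped)."""
    runs, run = [], 0
    for bit in plane:
        if bit:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs


def reconstruct(planes, num_received):
    """Magnitudes recovered from only the first num_received planes."""
    total = len(planes)
    values = [0] * len(planes[0])
    for k in range(num_received):
        shift = total - 1 - k
        values = [v | (b << shift) for v, b in zip(values, planes[k])]
    return values


planes = bit_planes([5, 0, 3, 1])
print(planes)
print([run_lengths(p) for p in planes])
print(reconstruct(planes, 2))                 # two most significant planes only
```

Truncating the enhancement stream after any plane still gives a usable, coarser reconstruction, which is what makes the layers "fine-grained".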

Page 34: Scalable Video Coding - Department of Electrical & Computer

Fine-Grained Scalability encoder

[Figure: FGS encoder block diagram. Base layer blocks: Input Video, DCT, Q, VLC, Base Layer Bitstream, Q⁻¹, IDCT, Frame Memory, Motion Estimation, Motion Compensation. FGS Enhancement Encoding blocks: Find Reference, Frame Memory, Find Maximum, Bit-plane VLC, Enhancement Bitstream.]

Encode once, decode to any bandwidth

Page 35: Scalable Video Coding - Department of Electrical & Computer

Inefficiency of predicting only from the base layer (MPEG-4 FGS)

Each blue curve is obtained with MPEG-4 FGS using a different base-layer rate

Page 36: Scalable Video Coding - Department of Electrical & Computer

Example: Simulcast vs. FG Scalability

• Assume minimum sustainable throughput
– 128 kbps
• Assume known maximum possible throughput
– 1024 kbps
• Assume equally probable rates between min and max
• Choose 3 rates for storing simulcast one-layer video
– Switch between different one-layer videos depending on channel rate
– Rate of all 3 videos must sum to 1024 kbps
• Compare average video quality of one-layer videos to average video quality of Fine-Grained Scalability
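The switching rule in this experiment can be sketched as below. The three stored rates are hypothetical values chosen only to satisfy the 1024 kbps sum constraint, and the sketch compares delivered bit-rate, not PSNR; mapping rate to quality would need rate-distortion curves that the slides give only graphically.

```python
def pick_stream(stored_rates, channel_kbps):
    """Highest stored one-layer rate that fits the current channel rate."""
    feasible = [r for r in stored_rates if r <= channel_kbps]
    if not feasible:
        raise ValueError("channel rate is below the lowest stored rate")
    return max(feasible)


def mean_delivered_rate(stored_rates, lo, hi, step=1):
    """Average delivered rate when channel rates lo..hi are equally likely."""
    channel_rates = range(lo, hi + 1, step)
    return sum(pick_stream(stored_rates, c) for c in channel_rates) / len(channel_rates)


stored = [128, 320, 576]                      # assumed rates; they sum to 1024 kbps
print(pick_stream(stored, 400))
print(mean_delivered_rate(stored, 128, 1024))
# An ideal embedded (FGS) stream can be truncated to match the channel exactly.
print(sum(range(128, 1025)) / len(range(128, 1025)))
```

Switched simulcast wastes whatever channel capacity lies between the chosen stored rate and the actual rate, while FGS uses all of it but pays a coding-efficiency penalty at every rate; the experiment measures which effect dominates.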

Page 37: Scalable Video Coding - Department of Electrical & Computer

Simulcast vs. FG Scalability

[Figure: PSNR (dB) versus sustainable bandwidth (kbps), with three curves: one-layer (upper bound), switched one-layer, and fine-grained scalability.]

Average PSNR for switched one-layer is more than 1 dB better than average PSNR for FG Scalability (due to prediction inefficiencies of FGS)

Page 38: Scalable Video Coding - Department of Electrical & Computer

Temporal and Spatial Scalability of MPEG-4

• Temporal scalability is accomplished by combining I, B, and P-frames
• Spatial scalability is achieved by spatial down/up sampling

Page 39: Scalable Video Coding - Department of Electrical & Computer

H.264 SVC (Scalable Video Coding)

• An optimized H.264/SVC encoder has an average overhead bit-rate of about 11% compared to the non-scalable version (H.264/AVC)
• A good trade-off between efficiency and error-propagation/drift
• Decoding complexity is similar to single-layer H.264 decoding
– Uses only a single motion-compensation loop at the decoder
• Predicts not only residual (DCT) information, but also motion information and macroblock modes

Page 40: Scalable Video Coding - Department of Electrical & Computer

SVC scalability modes

• Temporal scalability: using hierarchical B or hierarchical P structure.
– No loss of coding efficiency when using hierarchical B
• Spatial scalability:
– Using down/up sampling combined with switching between intra-layer and inter-layer prediction (CGS and MGS)
• Amplitude (quality) scalability
– Same as spatial scalability where each layer has the same spatial resolution, but different QP
• QP cascading:
– Using lower QP for lower spatial/temporal layers, increasing QP for higher spatial/temporal layers incrementally
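QP cascading can be sketched as a per-layer offset. The increment of 1 per layer and the starting QP are assumptions made for illustration, since the slides do not give values; the cap of 51 is the largest QP in H.264 (8-bit video).

```python
def cascaded_qp(base_qp, layer, delta=1, qp_max=51):
    """QP for a given layer: the lowest layer gets base_qp, each higher layer adds delta."""
    return min(base_qp + delta * layer, qp_max)


# Four temporal layers starting from an assumed base QP of 28.
print([cascaded_qp(28, layer) for layer in range(4)])
```

Lower layers are referenced by more frames, so spending more bits on them (lower QP) tends to pay off across the whole GOP.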

Page 41: Scalable Video Coding - Department of Electrical & Computer

Prediction structures for scalability (Options 3 and 4)

2-loop control:
– Both base and enhancement layers use their own prediction loop
– MPEG-2 Spatial Scalability (2), H.264 CGS
– No drift in base layer
– Reasonably efficient

H.264 MGS:
– Base: non-key frames predict using enhancement; key frames predict from base layer key frames
– Enhancement: predict from enhancement
– Tradeoff between efficiency and robustness

Page 42: Scalable Video Coding - Department of Electrical & Computer

Efficiency of H.264 Temporal Scalability

Page 43: Scalable Video Coding - Department of Electrical & Computer

SNR scalability: Before H.264 SVC


Page 44: Scalable Video Coding - Department of Electrical & Computer

SNR scalability: with H.264 SVC


Page 45: Scalable Video Coding - Department of Electrical & Computer

Scalable Video Coding Using Wavelet Transforms

• Wavelet-based image coding:
– Full frame image transform (as opposed to block-based transform)
– Bit plane coding of the transform coefficients can lead to embedded bitstreams
– EZW, SPIHT, JPEG2000

• Wavelet-based video coding
– Temporal filtering with and without motion compensation
• Using MC limits the range of scalability
– Can achieve temporal, spatial, and quality scalability simultaneously
– So far has not outperformed block-based approach!
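Temporal filtering without motion compensation can be illustrated with a one-level Haar transform along the time axis. This is a toy sketch under stated assumptions: frames are short lists of pixel values, the frame count is even, and unnormalized average/difference filters are used. The function names are hypothetical.

```python
def haar_analysis(frames):
    """One temporal Haar level: pairwise averages (low band) and half-differences (high band)."""
    pairs = list(zip(frames[0::2], frames[1::2]))
    low = [[(a + b) / 2 for a, b in zip(f0, f1)] for f0, f1 in pairs]
    high = [[(a - b) / 2 for a, b in zip(f0, f1)] for f0, f1 in pairs]
    return low, high


def haar_synthesis(low, high):
    """Invert haar_analysis and return the original frame sequence."""
    frames = []
    for l, h in zip(low, high):
        frames.append([a + b for a, b in zip(l, h)])
        frames.append([a - b for a, b in zip(l, h)])
    return frames


frames = [[10, 20], [14, 16], [8, 8], [12, 4]]
low, high = haar_analysis(frames)
print(low)                                     # half-frame-rate approximation
print(haar_synthesis(low, high) == frames)
```

Decoding only the low band gives a half-frame-rate video, so temporal scalability falls out of the transform itself; spatial and quality scalability then come from the spatial wavelet transform and bit-plane coding of its coefficients.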

Page 46: Scalable Video Coding - Department of Electrical & Computer

Homework and References

• Reading assignment: Sec. 11.1, 11.2, 11.3
• Written assignment
– Prob. 11.3, 11.4

• Additional information:
• H. Schwarz, D. Marpe, T. Wiegand, “Overview of the Scalable Video Coding Extension of the H.264/AVC Standard”, IEEE Trans. CSVT, September 2007
• http://iphome.hhi.de/wiegand/assets/pdfs/DIC_SVC_07.pdf