Scalable Video Coding - Çankaya Ü Video Coding20.pdf

Jul 23, 2018




  • Scalable Video Coding

    ROYA CHOUPANI

  • Video, Video Coding, Video Coding Standards

    Video media is a sequence of images called frames.

    Each frame consists of hundreds of rows and columns of pixels.

    Hence, the main challenge in using video is its huge size.

    Video Coding refers to the methods used for compressing video.

    These methods try to reduce or eliminate redundancy in video.

    The most important video coding standards are:

    MPEG-x (MPEG-4 is the latest version)

    H.26x (H.264 is the latest commercialized version; at the time of writing, H.265 has been approved but is not yet commercially available)

  • Redundancy Elimination: Psycho-Visual Redundancy

    Redundancy appears in different forms, but in general it refers to unnecessary bits used for representing information.

    For instance, the human visual system is very sensitive to brightness differences but less sensitive to color changes.

    Therefore, storing and transmitting a large number of different color shades is redundant and wastes space and bandwidth.

    This is called psycho-visual redundancy.
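
    One common way to exploit psycho-visual redundancy is to separate brightness from color and store the color planes at a lower resolution. The sketch below is only an illustration under assumptions not stated on the slide: it uses the BT.601 full-range conversion weights and 4:2:0-style subsampling (one chroma sample per 2x2 block), and the function names are made up for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr) with BT.601 full-range weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (height and width assumed even)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]


def sample_counts(height, width):
    """Return (samples at full color resolution, samples after 4:2:0 subsampling)."""
    full = 3 * height * width
    subsampled = height * width + 2 * (height // 2) * (width // 2)
    return full, subsampled
```

    Under these assumptions the brightness plane keeps every sample, while the two color planes shrink to a quarter of their size each, so the total number of samples is halved before any further compression.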

  • Redundancy Elimination: Temporal Redundancy

    Frames captured within a short time interval are very similar in content.

    Encoding each of these frames independently is therefore redundant.

    We can encode one of the frames (as the reference frame) and, for the rest, encode their difference from the reference frame.

    To compensate for differences between frames caused by moving objects, each frame is divided into blocks, and for each block the most similar area in the reference frame is found. Only the difference between the block and that area is then encoded.
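
    The block search described above can be sketched as an exhaustive block-matching loop. This is a minimal illustration, not the search used by any particular standard: the similarity measure (sum of absolute differences), the block size, the search range and all function names are assumptions chosen for the example. Frames are plain lists of rows of brightness values.

```python
def sad(ref, cur, ry, rx, cy, cx, n):
    """Sum of absolute differences between an n x n block of ref and one of cur."""
    return sum(
        abs(ref[ry + i][rx + j] - cur[cy + i][cx + j])
        for i in range(n)
        for j in range(n)
    )


def best_match(ref, cur, cy, cx, n=4, search=2):
    """Find the displacement (dy, dx) within +/-search that minimizes the SAD.

    Returns (cost, dy, dx) for the block of cur whose top-left corner is (cy, cx).
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            # Skip candidate areas that fall outside the reference frame.
            if 0 <= ry <= height - n and 0 <= rx <= width - n:
                cost = sad(ref, cur, ry, rx, cy, cx, n)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best


def block_residual(ref, cur, cy, cx, dy, dx, n=4):
    """Difference between the current block and its matched reference area."""
    return [
        [cur[cy + i][cx + j] - ref[cy + dy + i][cx + dx + j] for j in range(n)]
        for i in range(n)
    ]
```

    When the content of a block has simply shifted between frames, the residual after matching is all zeros, so only the displacement needs to be transmitted.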

  • Redundancy Elimination: Spatial Redundancy

    Adjacent or nearby pixels generally have similar colors.

    This spatial similarity causes redundancy in the data.

    To eliminate spatial redundancy, transforms such as the Discrete Cosine Transform (DCT) are used.

    In these transforms, low-frequency coefficients capture the spatial similarity, and high-frequency coefficients capture differences such as object edges.
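
    To illustrate how the transform concentrates the information of similar pixels in a few low-frequency coefficients, here is a small sketch of a one-dimensional orthonormal DCT-II and its inverse. Video codecs apply a two-dimensional transform to blocks; the one-dimensional version and the function names are simplifications assumed for this example.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)
        )
        coeffs.append(scale * total)
    return coeffs


def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        samples.append(total)
    return samples
```

    For a perfectly flat row of eight pixels, only the first (DC) coefficient is non-zero. For a smooth ramp such as 100, 102, ..., 114, more than 99% of the signal energy lands in the DC coefficient, which is why the remaining coefficients can be represented with very few bits.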

  • Redundancy Elimination: Stochastic Redundancy

    In the encoded data, some code symbols occur more frequently than others.

    To eliminate stochastic redundancy, more frequent symbols are assigned a smaller number of bits.
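
    One well-known way to assign shorter codes to more frequent symbols is Huffman coding. The slides do not say which variable-length code is meant, so the sketch below is only one possible instance; it computes the code length of each symbol rather than the bit patterns, and the function name is made up for this example.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over `symbols`."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, idx, {sym: 0}) for idx, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        count1, _, depths1 = heapq.heappop(heap)
        count2, _, depths2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (count1 + count2, next_id, merged))
        next_id += 1
    return heap[0][2]
```

    For the input "aaaaaaaabbbbccd" the lengths are 1 bit for a, 2 bits for b and 3 bits each for c and d, a total of 25 bits compared with 30 bits for a fixed 2-bit code.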

  • Video Coding Diagram

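    [Block diagram of the encoder. Stage labels recovered from the slide: RGB to (target color space not recoverable), sampling, ME/MC, DCT, Quantization, VLC]
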
  • Scalable Video Coding

    Video streaming requires large network bandwidths.

    If the required bandwidth is not available, if the bandwidth fluctuates, or if the receiving device has limited capabilities, we have to scale down the video.

    Video scaling should be fast and require minimal processing.

    Scalable video coding encodes a video so that it can be scaled down simply by discarding some parts of the bitstream.

    Videos are encoded in a multi-layer format: the first layer (the base layer) carries the video at its lowest quality, and the other layers (the enhancement layers) add to the quality of the base layer video.

  • Spatial Scalable Video Coding

    It is possible to put a low-resolution version of the video in the base layer and increase the resolution by adding enhancement layer(s).

    In the slide's figure, base layer pixels are shown in red.

    Enhancement layer pixels are encoded as their difference from the up-sampled base layer.
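
    The relationship between the two layers can be sketched as follows. This is a simplified illustration: the factor-of-two downsampling, the nearest-neighbour up-sampling and the function names are assumptions made for the example, and no transform or quantization is applied to either layer.

```python
def downsample2(frame):
    """Base layer: keep every second pixel in each direction."""
    return [row[::2] for row in frame[::2]]


def upsample2(base, height, width):
    """Nearest-neighbour up-sampling of the base layer to height x width."""
    return [[base[i // 2][j // 2] for j in range(width)] for i in range(height)]


def encode_spatial_layers(frame):
    """Return (base layer, enhancement layer) for one frame."""
    height, width = len(frame), len(frame[0])
    base = downsample2(frame)
    predicted = upsample2(base, height, width)
    enhancement = [
        [frame[i][j] - predicted[i][j] for j in range(width)] for i in range(height)
    ]
    return base, enhancement


def decode_spatial_layers(base, enhancement=None):
    """Decode the low-resolution base alone, or the full frame if enhancement is present."""
    if enhancement is None:
        return base
    height, width = len(enhancement), len(enhancement[0])
    predicted = upsample2(base, height, width)
    return [
        [predicted[i][j] + enhancement[i][j] for j in range(width)]
        for i in range(height)
    ]
```

    A receiver with limited bandwidth calls decode_spatial_layers(base) and gets the low-resolution frame; a receiver that also gets the enhancement layer recovers the full-resolution frame.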

  • Temporal Scalable Video Coding

    In temporal scalability, some frames are sent in the base layer and the others in the enhancement layer(s).

    Base layer frames can be used as reference frames for enhancement layer frames, but enhancement layer frames cannot be used as references for base layer frames.
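
    The referencing rule can be made concrete with a small sketch. The layer assignment (every second frame in the base layer) and the prediction rule (each frame predicts from the most recent earlier base layer frame) are hypothetical choices for illustration; real encoders use richer prediction structures.

```python
def split_temporal_layers(num_frames, base_period=2):
    """Assign frame indices to the base layer and to one enhancement layer."""
    base = [i for i in range(num_frames) if i % base_period == 0]
    enhancement = [i for i in range(num_frames) if i % base_period != 0]
    return base, enhancement


def reference_for(frame_index, base_period=2):
    """Index of the frame used as reference, or None for the first (intra) frame.

    Hypothetical rule: every frame predicts from the latest earlier base layer frame,
    so no base layer frame ever depends on an enhancement layer frame.
    """
    if frame_index == 0:
        return None
    if frame_index % base_period == 0:
        return frame_index - base_period
    return frame_index - frame_index % base_period
```

    With eight frames, the base layer holds frames 0, 2, 4 and 6. Dropping the enhancement layer halves the frame rate, and every remaining frame still has its reference available.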

  • Signal-to-Noise Scalable Video Coding

    If the required bandwidth is not available, we can send a lower-quality version of each frame with the same resolution and the same frame rate.

    Lower quality reduces the bits-per-pixel rate, so the stream fits within the available bandwidth.

    Each quality level is produced by encoding with a different quantization step size; a larger step size gives lower quality and fewer bits.
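
    A minimal sketch of this idea, assuming a two-layer arrangement that the slide does not spell out: the base layer quantizes transform coefficients with a coarse step, and the enhancement layer quantizes the remaining error with a finer step. Function names and step sizes are illustrative only.

```python
def quantize(values, step):
    """Map each value to the index of its nearest quantization level."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Reconstruct approximate values from quantization indices."""
    return [level * step for level in levels]


def encode_snr_layers(coeffs, base_step, enh_step):
    """Coarse base layer plus a finer refinement of the base layer's error."""
    base_levels = quantize(coeffs, base_step)
    base_recon = dequantize(base_levels, base_step)
    residual = [c - r for c, r in zip(coeffs, base_recon)]
    enh_levels = quantize(residual, enh_step)
    return base_levels, enh_levels


def decode_snr_layers(base_levels, base_step, enh_levels=None, enh_step=None):
    """Decode the base layer alone, or refine it with the enhancement layer."""
    recon = dequantize(base_levels, base_step)
    if enh_levels is not None:
        refinement = dequantize(enh_levels, enh_step)
        recon = [r + e for r, e in zip(recon, refinement)]
    return recon
```

    With a base step of 16 and an enhancement step of 2, the base layer alone reconstructs each coefficient to within 8 of its true value, and adding the enhancement layer tightens that to within 1.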

  • Challenges in Scalable Video Coding

    Multilayer scalable video coding is very sensitive to frame losses.

    If the base layer is lost, the enhancement layers are useless.

    In addition, if motion compensation at the encoder uses the original video as its reference, a decoder that reconstructs the video from the base layer alone predicts from different reference frames, and the resulting mismatch accumulates as drift error.

    The main reason for these problems is that the video layers are not independent.

  • Multiple Description Coding

    Multiple description coding (MDC) is a video coding method that divides the video into independent parts called descriptions (as opposed to the layers in SVC).

    Descriptions are encoded, transmitted and decoded independently. If all descriptions are present, the video is decoded at its highest quality.

    If a description is lost, the other descriptions are used to estimate or interpolate it.

    Therefore, there should be some correlation between the descriptions.

    However, this correlation increases the data redundancy and reduces the coding efficiency.

  • Spatial Multiple Description Coding

    Pixels are distributed between the descriptions. Each description can be considered a lower-resolution version of the video.

    [Figure: pixels assigned to Description 1, Description 2, Description 3 and Description 4]

    If a description is lost, the spatial correlation between pixels is used to interpolate the missing data.
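
    A possible sketch of this scheme with four descriptions: pixels are split by row and column parity, and the pixels of a lost description are filled with the average of their received neighbours. The parity-based split, the 8-neighbour averaging and the function names are assumptions for illustration; the sketch expects a frame of at least 2x2 pixels and at least one received description.

```python
def split_descriptions(frame):
    """Polyphase split: description (di, dj) gets pixels with row parity di, column parity dj."""
    return {
        (di, dj): [row[dj::2] for row in frame[di::2]]
        for di in (0, 1)
        for dj in (0, 1)
    }


def merge_descriptions(received, height, width):
    """Rebuild a frame from the received descriptions, interpolating lost pixels."""
    grid = [[None] * width for _ in range(height)]
    for (di, dj), desc in received.items():
        for i, row in enumerate(desc):
            for j, value in enumerate(row):
                grid[2 * i + di][2 * j + dj] = value
    out = [row[:] for row in grid]
    for y in range(height):
        for x in range(width):
            if grid[y][x] is None:
                # Average the received pixels among the (up to) 8 neighbours.
                near = [
                    grid[ny][nx]
                    for ny in range(max(y - 1, 0), min(y + 2, height))
                    for nx in range(max(x - 1, 0), min(x + 2, width))
                    if grid[ny][nx] is not None
                ]
                out[y][x] = sum(near) / len(near)
    return out
```

    When all four descriptions arrive, the frame is rebuilt exactly. When one is lost, received pixels are untouched and only the missing quarter of the pixels is approximated.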

  • Temporal Multiple Description Coding

    In temporal multiple description coding, the video is decomposed into two independent sets of frames. Each set represents the video at a lower frames-per-second rate.

    One option is simply to put the odd frames in one description and the even frames in the other.
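
    The odd/even split and a simple concealment step can be sketched as below. The concealment rule (estimating each lost frame as the average of the two received frames around it, and repeating the last received frame at the end) is an assumption for illustration. Frames are flat lists of pixel values, frame 0 goes into the first description, and the sketch handles the case where the second description is lost.

```python
def split_alternate(frames):
    """Put frames 0, 2, 4, ... in one description and frames 1, 3, 5, ... in the other."""
    return frames[0::2], frames[1::2]


def interleave(first, second):
    """Merge two descriptions back into display order."""
    out = []
    for i, frame in enumerate(first):
        out.append(frame)
        if i < len(second):
            out.append(second[i])
    return out


def estimate_lost(received):
    """Estimate the frames that followed each received frame in the lost description."""
    estimates = []
    for i, frame in enumerate(received):
        if i + 1 < len(received):
            nxt = received[i + 1]
            estimates.append([(a + b) / 2 for a, b in zip(frame, nxt)])
        else:
            # No later frame to average with: repeat the last received frame.
            estimates.append(list(frame))
    return estimates
```

    A decoder that receives both descriptions calls interleave(first, second). If the second description is lost, interleave(first, estimate_lost(first)) still yields a sequence at the full frame rate, with every second frame approximated.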

  • SNR Multiple Description Coding

    After applying the DCT, a few low-frequency coefficients are repeated in all descriptions.

    The other coefficients are distributed among the descriptions.


    In P frames, where only a few coefficients are non-zero, this method is very inefficient.

    In I frames, missing coefficients cannot be estimated because DCT coefficients are largely uncorrelated with one another.
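
    The coefficient distribution can be sketched as follows, assuming two descriptions, two shared low-frequency coefficients and a round-robin assignment of the rest. These choices and the function names are illustrative, not taken from the slides. Coefficients that a description does not carry are stored as zero.

```python
def split_coefficients(coeffs, shared=2, num_desc=2):
    """Copy the first `shared` coefficients to every description; deal the rest round-robin."""
    descs = [[0] * len(coeffs) for _ in range(num_desc)]
    for k, value in enumerate(coeffs):
        if k < shared:
            for desc in descs:
                desc[k] = value
        else:
            descs[(k - shared) % num_desc][k] = value
    return descs


def merge_coefficients(received):
    """Combine received descriptions; coefficients carried by none of them stay zero."""
    length = len(received[0])
    return [
        next((desc[k] for desc in received if desc[k] != 0), 0)
        for k in range(length)
    ]
```

    For the coefficients 80, -12, 5, 3, -2, 1 the two descriptions together carry eight values instead of six, which is the redundancy cost of protecting the low-frequency part. If one description is lost, the coefficients it carried alone are simply missing from the reconstruction.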

  • Challenges in Multiple Description Coding methods

    Descriptions in MDC are correlated. This correlation reduces coding efficiency because the data redundancy is not completely eliminated.

    MDC methods are not suitable for real-time video streaming applications because they cannot adapt to bandwidth changes.
