Video Coding Standards

Jan 03, 2016

  • Video Coding Standards
    Heejune AHN
    Embedded Communications Laboratory
    Seoul National Univ. of Technology
    Fall 2011
    Last updated 2011. 5. 13

    Heejune AHN: Image and Video Compression

    Agenda
    - History and Concepts
    - JPEG and JPEG-2000
    - MPEG-1 and MPEG-2
    - MPEG-4
    - H.261 and H.263
    - H.264
    - Beyond H.264

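    1. Standards and Standards Bodies
    - VCEG (Video Coding Experts Group) in ITU-T (formerly CCITT)
      - Focus on real-time, two-way video communication
    - MPEG (Moving Picture Experts Group) and JPEG (Joint Photographic Experts Group) in ISO/IEC
      - Focus on multimedia storage and distribution for entertainment
    - Some standards overlap: MPEG-2 Video is also H.262, and H.264 is also MPEG-4/AVC
    - ISO side: JPEG, JPEG-2000, MPEG-1, MPEG-2, MPEG-4, MPEG-7, MPEG-21
    - ITU side: H.261, H.263, H.264
    [Figure residue from a JPEG entropy coding slide: sample range [-128, 127]; RRRRSSSS symbol followed by an SSSS-value]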

    Lossless JPEG
    - DPCM is used, with prediction from 3 neighboring pixels
    Optional modes
    - Progressive encoding: stores image data in the order DC only, low-frequency AC, high-frequency AC
    - Hierarchical encoding: stores image data from low resolution to high resolution
    Motion-JPEG
    - Just a sequence of JPEG still images
    - Low complexity, error tolerance, market awareness
    - Used for video conferencing and surveillance before cheap MPEG-1/2/4 solutions became widely available
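
The DPCM prediction from three neighbours can be sketched as follows. This is a minimal illustration, assuming the predictor A + B - C (left + above - upper-left), which is only one of the predictors lossless JPEG offers, and assuming 8-bit samples so that the very first pixel is predicted as 128. The function names are hypothetical.

```python
import numpy as np

def predict(img, y, x):
    """Predict pixel (y, x) from already-known neighbours (assumed A + B - C)."""
    if y == 0 and x == 0:
        return 128                      # assumed mid-grey start for 8-bit data
    if y == 0:
        return img[0, x - 1]            # first row: left neighbour only
    if x == 0:
        return img[y - 1, 0]            # first column: neighbour above only
    return img[y, x - 1] + img[y - 1, x] - img[y - 1, x - 1]

def dpcm_residual(img):
    """Encoder side: residual = pixel - prediction."""
    img = img.astype(np.int32)
    res = np.zeros_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            res[y, x] = img[y, x] - predict(img, y, x)
    return res

def dpcm_reconstruct(res):
    """Decoder side: rebuild pixels in raster order from the residuals."""
    rec = np.zeros_like(res)
    for y in range(res.shape[0]):
        for x in range(res.shape[1]):
            rec[y, x] = res[y, x] + predict(rec, y, x)
    return rec

# A smooth ramp: residuals away from the borders are all zero.
ramp = np.add.outer(np.arange(4), np.arange(4)) * 3 + 100
residual = dpcm_residual(ramp)
```

Because the decoder repeats the same prediction from already reconstructed pixels, the scheme is lossless; the compression gain comes from entropy coding the small residuals.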

    JPEG-2000
    Features
    - Better compression performance than JPEG
      - At high compression ratios, no blocking artifacts
      - Good compression for both continuous-tone and bi-level (text) images
    - Both lossless and lossy compression in one framework
    - ROI (region of interest) support
    - Error resilience support (data partitioning)
    - Rather slow on current embedded systems due to complexity
    Encoding process
    - image -> (Tiling) -> Wavelet Transform -> Quantizer -> Arithmetic Encoder -> bits
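
To make the wavelet transform stage concrete, here is a small sketch of one level of a reversible 5/3 lifting transform on a 1-D signal, the kind of integer wavelet used for lossless coding in JPEG-2000. It assumes an even-length input and mirror extension at the borders; tiling, 2-D filtering, quantization and arithmetic coding are left out. The function names are hypothetical.

```python
def lift53_forward(x):
    """One level of 5/3 lifting. x: list of ints, even length. Returns (low, high)."""
    n = len(x)
    assert n >= 2 and n % 2 == 0
    half = n // 2
    high = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]  # mirror at the end
        # Predict step: odd sample minus the average of its even neighbours.
        high.append(x[2 * i + 1] - (left + right) // 2)
    low = []
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]             # mirror at the start
        # Update step: even sample plus a quarter of the neighbouring details.
        low.append(x[2 * i] + (prev + high[i] + 2) // 4)
    return low, high

def lift53_inverse(low, high):
    """Undo the lifting steps in reverse order; exact for integer input."""
    half = len(low)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]
        x[2 * i] = low[i] - (prev + high[i] + 2) // 4
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = high[i] + (left + right) // 2
    return x

signal = [10, 12, 14, 13, 9, 7, 7, 8]
low, high = lift53_forward(signal)
```

The inverse subtracts exactly what the forward pass added, so integer rounding never causes loss. This is why the same framework can serve lossless and lossy coding.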

    Comparison of JPEG and JPEG-2000
    - Lenna, 256x256 RGB, baseline JPEG: 4572 bytes
    - Lenna, 256x256 RGB, JPEG-2000: 4572 bytes

    MPEG-1/2: MC-DCT Hybrid Coding

    MPEG-1
    - Targeted VHS quality (352x288 at 25 fps or 352x240 at 30 fps, YCbCr 4:2:0) on VCD (600 MB)
      - 1.4 Mbps (1.2 Mbps video + 0.2 Mbps audio)
      - VCD, 70 minutes
    - Three parts: Part 1 Systems, Part 2 Video, Part 3 Audio
    Technology
    - MC-DCT hybrid coding
      - Macroblock (16x16 pixels): motion estimation unit
      - Block (8x8 pixels): DCT and quantization unit
    - GOP structure
      - I, P, B pictures
      - Trade-off between random access and coding efficiency
    - Asymmetric complexity
      - Larger memory and more computation required at the encoder
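
The asymmetric complexity comes largely from motion estimation, which only the encoder performs. A minimal sketch of full-search block matching for one 16x16 macroblock is given below. The standard does not prescribe a search method; the SAD criterion, the search range, and the function name are assumptions made for illustration.

```python
import numpy as np

def full_search(cur, ref, top, left, block=16, search=4):
    """Find the (dy, dx) within +/-search that minimises SAD for one macroblock."""
    target = cur[top:top + block, left:left + block].astype(np.int32)
    best_mv, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference picture.
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            cand = ref[y:y + block, x:x + block].astype(np.int32)
            sad = int(np.abs(target - cand).sum())   # sum of absolute differences
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad

# Synthetic test: the current picture is the reference shifted by (2, -3).
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(48, 48))
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))
mv, sad = full_search(cur, ref, top=16, left=16)
```

The decoder only has to copy the block that the motion vector points to, while the encoder here evaluates 81 candidate positions per macroblock.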

    MPEG-1 Structure
    Syntax hierarchy
    - Sequence layer: SH GOP SH GOP SH GOP ... (SH: sequence header, GOP: group of pictures)
    - GOP layer: e.g. I B B P B B P ... P
    - Picture layer
    - Slice layer: a run of macroblocks (MB MB MB MB ...)
    - MB layer: 16x16 luma Y (blocks 1-4), 8x8 Cb (block 5) and 8x8 Cr (block 6), 4:2:0
    - Block layer: 8x8 samples

    Picture Coding
    - I picture: no interframe prediction
    - P picture: interframe prediction from one causal reference picture
    - B picture: interframe prediction from one previous and one future picture
    GOP and picture order
    - Display order (input at encoder): I1 B1 B2 P1 B4 B5 P2 B6 B7 I2
    - Transmission order (encoding/decoding order): I1 P1 B1 B2 P2 B4 B5 I2 B6 B7
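
The reordering between display order and transmission order can be sketched as a small function. It assumes picture labels whose first character is the picture type, as on the slide, and that every B picture depends on the next I or P picture in display order. The function name is hypothetical.

```python
def coding_order(display):
    """Reorder pictures so each B picture follows both of its references."""
    out, pending_b = [], []
    for pic in display:
        if pic[0] == "B":
            pending_b.append(pic)      # must wait for its future reference
        else:
            out.append(pic)            # I or P picture is sent first
            out.extend(pending_b)      # then the B pictures that needed it
            pending_b = []
    out.extend(pending_b)              # trailing B pictures, if any
    return out

display = ["I1", "B1", "B2", "P1", "B4", "B5", "P2", "B6", "B7", "I2"]
transmission = coding_order(display)
# transmission == ["I1", "P1", "B1", "B2", "P2", "B4", "B5", "I2", "B6", "B7"]
```

The decoder applies the inverse reordering, which is why B pictures add delay and need extra picture memory.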

    MPEG-2
    - Major target application: digital television quality (720x576 at 25 fps or 720x480 at 30 fps) at 3 ~ 4 Mbps
    - Interlaced video support
      - Frame picture vs. field picture: motion compensation unit
      - Frame DCT vs. field DCT in a frame picture
    [Figure: two field pictures vs. one frame picture; field DCT vs. frame DCT]
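
The difference between frame DCT and field DCT in a frame picture is only a matter of which lines are grouped before the 8x8 transform. The sketch below shows the line reordering for one 16x16 luma macroblock, assuming even lines belong to the top field and odd lines to the bottom field. The function name is hypothetical and the DCT itself is omitted.

```python
import numpy as np

def field_order(mb):
    """Regroup a 16x16 frame macroblock: top-field lines first, then bottom-field lines."""
    return np.vstack([mb[0::2, :], mb[1::2, :]])

# Interlaced content with motion: the two fields look very different.
mb = np.zeros((16, 16), dtype=np.int32)
mb[1::2, :] = 100                      # bottom field is bright, top field is dark

frame_blocks = mb                      # frame DCT: 8x8 blocks mix both fields
field_blocks = field_order(mb)         # field DCT: each 8x8 block holds one field

# Vertical line-to-line differences, a rough proxy for high vertical frequencies.
frame_activity = int(np.abs(np.diff(frame_blocks[:8], axis=0)).sum())
field_activity = int(np.abs(np.diff(field_blocks[:8], axis=0)).sum())
```

For this extreme example the field-ordered upper half is perfectly flat, while the frame-ordered upper half alternates on every line, so field DCT would compact the energy far better.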

    Scalability Support
    - Spatial scalability
      - Low resolution at the base layer (BL) and high resolution at the enhancement layer (EL)
      - BL is used for prediction of EL
      - E.g. SD resolution at BL, HD resolution at EL
    - Temporal scalability
      - E.g. 30 fps at BL, 60 fps at EL
    - SNR scalability
      - Same resolution but different quality
    - Data partitioning
      - Coded data is packed into different streams
    [Figure: the input video feeds the EL encoder directly and the BL encoder after downsampling; the BL bit stream alone decodes to lower quality, the BL and EL bit streams together decode to higher quality]

    Profile & Level
    - MPEG-2 has many options; not every implementation needs all of them
    Profiles
    - Simple: 4:2:0 input, I and P pictures only, low complexity and low performance
    - Main: 4:2:0 input, I, P, B pictures, interlaced
    - 4:2:2: 4:2:2 input (chroma has the same vertical resolution as luma)
    - SNR: SNR scalable
    - Spatial: spatially scalable
    - High: spatial scalability and 4:2:2
    Levels
    - Low (352x288), Main (720x576), High-1440 (1440x1152), High (1920x1152)
    E.g.
    - MPEG-1: Main profile & Low level
    - SD DTV, DVD: Main profile & Main level
    - HDTV: Main profile & High level (historically MPEG-3's target application)

    MPEG-4
    Features
    - Support for low bit rates (from 20 kbps)
    - Support for object-based coding
      - Reuse of components, composition, and interactivity support
      - In practice, object-based coding is not widely used
    Object-based coding
    - Video object (VO)
    - Shape coding: transparent/opaque region, binary or grey scale
    - Texture coding with arbitrary shape
      - DCT after zero filling in inter blocks and extrapolation in intra blocks
    [Figure: a scene composed of video objects VO1, VO2, VO3]

    Visual data structure
    - VS (visual sequence / video session)
      - VO (video object): VO1, VO2, ...
        - VOL (video object layer): VOL1, VOL2, ...
          - GOV (group of VOPs): GOV1, GOV2, ...
            - VOP (video object plane): VOP1, VOP2, ...
              - MB (macroblock)
    [Figure label: 2/3 (synthetic object)]

    H.261
    - ITU: mostly focused on real-time communication
    - H.261: first video coding standard (1990)
    - N-ISDN (1990s)
      - p x 64 kbps (p = 1, ..., 30), typically 64 ~ 384 kbps
      - Circuit-switched network: low delay, reliable
    H.261 key features
    - YCbCr 4:2:0 CIF and QCIF input
    - MC-DCT
    - Integer-pel motion compensation
    - Optional loop filter (for deblocking)
      - Filtering at 8x8 block boundaries
    - FEC used

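    H.261 syntax structure
    H.261 bit structure
    - Picture layer: PSC, TR, PTYPE, PEI, PSPARE, then GOB data
    - GOB (group of blocks) layer: GBSC, GN, GQUANT, GEI, GSPARE, then MB data
    - MB (macroblock) layer: MBA, MTYPE, MQUANT, MVD, CBP, then block data
    - Block layer: TCOEFF, EOB
    Picture partitioning
    - CIF (352x288): 12 GOBs, numbered 1-12 in two columns (1 2 / 3 4 / ... / 11 12)
    - QCIF (176x144): 3 GOBs, numbered 1, 3, 5
    - GOB: 33 macroblocks, numbered 1-33 in three rows of 11
    - Macroblock: 16x16 Y (four 8x8 blocks) plus one 8x8 Cb and one 8x8 Cr block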
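
The macroblock and GOB counts follow directly from the picture sizes, and the same arithmetic shows how much compression a p x 64 kbps channel demands. The sketch below assumes 8-bit 4:2:0 sampling (12 bits per pixel on average) and a nominal 30 frames per second; the function names are hypothetical.

```python
def h261_layout(width, height):
    """Return (macroblocks, GOBs) for a picture, with 33 macroblocks per GOB."""
    macroblocks = (width // 16) * (height // 16)
    return macroblocks, macroblocks // 33

def raw_bitrate(width, height, fps=30, bits_per_pixel=12):
    """Uncompressed bit rate in bits per second for 4:2:0 video."""
    return width * height * bits_per_pixel * fps

cif = h261_layout(352, 288)      # (396, 12)
qcif = h261_layout(176, 144)     # (99, 3)

# Compression ratio needed to fit CIF into p = 6 channels (384 kbps).
ratio = raw_bitrate(352, 288) / (6 * 64_000)
```

At 384 kbps, CIF video needs a compression ratio of roughly 95:1, which is why motion-compensated prediction is required on top of the DCT.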

    H.263

    H.263 Versions
    - Version 1 (1995): improvement on H.261
      - 4 optional modes
    - Version 2 (1998, H.263+)
      - 12 additional optional modes
    - Version 3 (2000, H.263++)
      - 3 more optional modes (19 in total)
    Key features
    - Targets bit rates down to 20 kbps, and packet-based networks as well
    - Half-pel prediction
    - Redesigned 3-D VLC code

    H.263 Optional Modes
    - Annex D: Unrestricted motion vectors
    - Annex E: Syntax-based arithmetic coding
    - Annex F: Advanced prediction
    - Annex G: PB frames
    - Annex I: Advanced intra coding
    - Annex J: Deblocking filter
    - Annex K: Slice structured mode
    - Annex L: Supplemental enhancement information
    - Annex M: Improved PB frames
    - Annex N: Reference picture selection
    - Annex O: Scalability
    - Annex P: Reference picture resampling

    H.263 Optional Modes (continued)
    - Annex Q: Reduced resolution update
    - Annex R: Independent segment decoding
    - Annex S: Alternative inter VLC
    - Annex T: Modified quantization
    - Annex U: Enhanced reference picture selection
    - Annex V: Data partitioned slice
    - Annex W: Additional supplemental enhancement information

    Performance

    H.264
    Name
    - ITU-T H.264 = ISO/IEC MPEG-4 Part 10/AVC
    - H.26L: long-term enhancement, not compatible with H.263
    - Now adopted in DMB-T/S and IPTV, replacing many MPEG-2 solutions
    - Target: 50% coding gain over H.263+

    Key features
    - Smaller processing units (down to 4x4 pixel blocks)
    - Intra prediction
    - Inter prediction
      - Macroblock-based interframe prediction selection
      - 1/4-pixel motion vector support
      - Motion vector options for sub-blocks
    - 4x4 integer DCT
    - Deblocking filter
    - Universal VLC
    - CABAC (context-based adaptive binary arithmetic coding)

    Intra-frame Prediction
    - Luma
      - 4x4: 9 modes
      - 16x16: 4 modes
    - Chroma
      - 8x8: 4 modes
      - The same prediction mode is always applied to both chroma blocks
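
As an illustration of 4x4 luma intra prediction, the sketch below implements three of the nine modes (vertical, horizontal and DC) and picks the one with the smallest SAD. It assumes the neighbouring samples above and to the left are both available; the six directional diagonal modes and the rate term an encoder would normally include in the decision are omitted. The function names are hypothetical.

```python
import numpy as np

def intra4x4(top, left, mode):
    """Predict a 4x4 block from the 4 samples above (top) and the 4 to the left."""
    top, left = np.asarray(top), np.asarray(left)
    if mode == 0:                                   # vertical: copy the row above
        return np.tile(top, (4, 1))
    if mode == 1:                                   # horizontal: copy the left column
        return np.tile(left.reshape(4, 1), (1, 4))
    if mode == 2:                                   # DC: rounded mean of the 8 neighbours
        dc = (int(top.sum()) + int(left.sum()) + 4) >> 3
        return np.full((4, 4), dc)
    raise ValueError("only modes 0-2 are sketched here")

def best_mode(block, top, left):
    """Return (mode, SAD) of the best of the three sketched modes."""
    costs = [int(np.abs(block - intra4x4(top, left, m)).sum()) for m in range(3)]
    mode = int(np.argmin(costs))
    return mode, costs[mode]

top = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = np.tile(np.array(top), (4, 1))              # vertical stripes
mode, cost = best_mode(block, top, left)
```

Only the mode number and the prediction residual are transmitted, so a good match such as the vertical mode here leaves almost nothing to code.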

    Inter-frame Prediction

    Comparison: H.264 vs. MPEG-1/2/4, H.261/3
    References
    - H.264
      - Permits up to 16 (2 mostly used) reference pictures
      - Bi-predictive B-slices
      - A P-slice may reference a picture that has B-slices
      - Supports explicit weighting coefficients and (a+b)/2 type prediction
    - MPEG-1/2/4, H.261/3
      - A P-picture references only one preceding I- or P-picture
      - Bi-directional B-pictures
      - Only permit (a+b)/2 type prediction weighting
    Block sizes
    - H.264: tree-structured (16x16, 16x8, 8x16, 8x8; then 8x4, 4x8, 4x4)
    - MPEG-1/2/4, H.261/3: either 16x16 or 8x8
    Motion estimation
    - H.264
      - Half- or 1/4-pixel accuracy
      - 6-tap interpolation for half-pixel and 2-point linear interpolation for 1/4-pixel positions
    - MPEG-1/2/4, H.261/3
      - MPEG-2 permits half-pixel accuracy and MPEG-4 permits 1/4-pixel accuracy
      - 2-point linear interpolation
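
The sub-pixel interpolation used for H.264 luma can be sketched in one dimension. The example assumes the 6-tap filter with taps (1, -5, 20, 20, -5, 1)/32 for half-sample positions and a rounded 2-point average for quarter-sample positions, with border samples replicated and results clipped to 8 bits. It ignores the 2-D case where intermediate values are kept at higher precision, and the function names are hypothetical.

```python
TAPS = (1, -5, 20, 20, -5, 1)

def half_pel(row, i):
    """Half-sample value between row[i] and row[i + 1] using the 6-tap filter."""
    n = len(row)
    acc = 0
    for k, tap in enumerate(TAPS):
        j = min(max(i - 2 + k, 0), n - 1)    # replicate border samples
        acc += tap * row[j]
    return min(max((acc + 16) >> 5, 0), 255)  # round, divide by 32, clip to 8 bits

def quarter_pel(a, b):
    """Quarter-sample value as the rounded average of two neighbouring positions."""
    return (a + b + 1) >> 1

ramp = [0, 10, 20, 30, 40, 50]
h = half_pel(ramp, 2)             # between 20 and 30
q = quarter_pel(ramp[2], h)       # between 20 and the half-sample value
```

On a linear ramp the filter lands on the midpoint, as a 2-point average would; the longer filter pays off on textured content, where it preserves more high-frequency detail.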

    Transform and Quantization
    - Integer DCT
      - No encoder/decoder mismatch
    - Three types of transform, each followed by quantization
      - Type 1: for the 4x4 array of luma DC coefficients in intra MBs predicted in 16x16 mode (block -1)
      - Type 2: for the 2x2 arrays of chroma DC coefficients (blocks 16-17)
      - Type 3: for all other 4x4 blocks (blocks 0-15, 18-25)
    [Figure: blocks numbered -1 (16x16 intra mode only), 0-15, 16-17 and 18-25, each 4x4 pixels; data is transmitted in the numbered order]

    Transform and Quantization
    - 4x4 DCT (X: input, Y: output)
    - 4x4 integer transform
      - forward
      - backward
    - W; post-scaling factor (PF)
    [The equations on this slide were not recovered]
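
Since the slide's equations are missing, the following is only a sketch of the idea, assuming the widely published H.264 forward core matrix with entries 1 and 2. The core transform uses integer arithmetic only, and the scaling that makes it a true DCT approximation is applied separately. The inverse shown here is the exact mathematical inverse in floating point, not the standard's integer inverse transform, and the function names are hypothetical.

```python
import numpy as np

# Assumed forward core matrix of the 4x4 integer transform.
CF = np.array([[1,  1,  1,  1],
               [2,  1, -1, -2],
               [1, -1, -1,  1],
               [1, -2,  2, -1]])

def forward_core(x):
    """Core transform CF * X * CF^T using integer arithmetic only."""
    return CF @ x @ CF.T

def inverse_exact(w):
    """Exact inverse: the rows of CF are orthogonal with squared norms (4, 10, 4, 10)."""
    norms = np.diag(CF @ CF.T).astype(float)
    scaled = w / np.outer(norms, norms)       # plays the role of the scaling factors
    return np.rint(CF.T @ scaled @ CF).astype(int)

x = np.arange(16).reshape(4, 4)
w = forward_core(x)
flat = forward_core(np.full((4, 4), 7))        # a flat block has only a DC term
```

Because every encoder and decoder performs the same integer operations, there is no drift between them, which is the "no encoder/decoder mismatch" property noted on the previous slide.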

    Entropy Coding

    Parameters to be coded, by entropy coding mode
    - entropy_coding_mode=0
      - Macroblock type (intra/inter), coded block pattern, quantizer parameter, reference frame index, motion vector: Exponential Golomb codes (Exp-Golomb), a form of variable length coding (VLC)
      - Residual data: context-adaptive variable length coding (CAVLC)
    - entropy_coding_mode=1
      - All of the above parameters: context-based adaptive binary arithmetic coding (CABAC)
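
Exponential Golomb codes are simple enough to show in full. The sketch below builds the unsigned code and the signed mapping as bit strings and decodes one unsigned codeword from the front of a string. Real bitstreams pack bits rather than characters, and the function names are hypothetical.

```python
def ue(k):
    """Unsigned Exp-Golomb code: leading zeros, then k + 1 in binary."""
    bits = bin(k + 1)[2:]
    return "0" * (len(bits) - 1) + bits

def se(v):
    """Signed Exp-Golomb code: map 0, 1, -1, 2, -2, ... to 0, 1, 2, 3, 4, ..."""
    return ue(2 * v - 1 if v > 0 else -2 * v)

def decode_ue(bitstring):
    """Decode one unsigned codeword; return (value, remaining bits)."""
    zeros = 0
    while bitstring[zeros] == "0":
        zeros += 1
    end = 2 * zeros + 1
    return int(bitstring[zeros:end], 2) - 1, bitstring[end:]

codes = [ue(k) for k in range(4)]     # ["1", "010", "011", "00100"]
stream = ue(3) + ue(0) + ue(5)
first, rest = decode_ue(stream)
```

Small values get short codewords, which suits syntax elements such as motion vector differences that are usually close to zero.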

    Deblocking Filters
    - A boundary-strength (BS) parameter is assigned to every 4x4 block
      - BS = 0: no filtering
      - BS = 1-3: slight filtering
      - BS = 4: strong filtering
    - Filters only when |P0-Q0| < α, |P1-P0| < β, and |Q1-Q0| < β
    - Thresholds α and β depend on the average quantization parameter (QP)
    - The deblocking filter accounts for about 1/3 of the computational complexity of a decoder

    Block modes and conditions -> boundary-strength parameter (BS)
    - One of the blocks is intra-coded and the edge is a MB edge: 4
    - One of the blocks is intra-coded: 3
    - One of the blocks has coded residuals: 2
    - Difference of block motion >= one luma sample distance: 1
    - Motion compensation from different reference frames: 1
    - Else: 0

    Samples across the edge: P3 P2 P1 P0 | Q0 Q1 Q2 Q3
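
The boundary-strength rules and the filtering condition can be written down as two small functions. This is a sketch under stated assumptions: each block is described by a hypothetical dictionary with the keys `intra`, `coded`, `ref` and `mv`, motion vectors are in quarter-sample units so that a difference of 4 equals one luma sample, and the two thresholds (called alpha and beta here) are passed in rather than looked up from the QP-dependent tables.

```python
def boundary_strength(p, q, mb_edge):
    """Boundary strength for the edge between neighbouring 4x4 blocks p and q."""
    if p["intra"] or q["intra"]:
        return 4 if mb_edge else 3
    if p["coded"] or q["coded"]:                   # coded residual coefficients
        return 2
    if p["ref"] != q["ref"]:                       # different reference frames
        return 1
    if abs(p["mv"][0] - q["mv"][0]) >= 4 or abs(p["mv"][1] - q["mv"][1]) >= 4:
        return 1                                   # motion differs by >= 1 luma sample
    return 0

def should_filter(bs, p1, p0, q0, q1, alpha, beta):
    """Filter only if the step looks like a coding artifact, not a real edge."""
    return (bs > 0 and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta and abs(q1 - q0) < beta)

inter = {"intra": False, "coded": False, "ref": 0, "mv": (0, 0)}
moved = {"intra": False, "coded": False, "ref": 0, "mv": (4, 0)}
intra = {"intra": True, "coded": True, "ref": 0, "mv": (0, 0)}

bs_values = (boundary_strength(intra, inter, True),
             boundary_strength(intra, inter, False),
             boundary_strength(inter, moved, False),
             boundary_strength(inter, inter, False))
```

The threshold test is what keeps genuine image edges sharp: a large step across the boundary is assumed to be real content and is left untouched.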

    Network Adaptation
    VCL & NAL
    - VCL (video coding layer)
    - NAL (network abstraction layer)

    Error Resilience Tools
    - Flexible macroblock ordering (FMO)
      - Allows MBs to be assigned to slices in an order other than scan order
    - Arbitrary slice ordering (ASO)
      - Improves end-to-end delay in real-time applications
    - Redundant slices (RS)
      - Redundant representations are coded using different coding parameters
    [Figure: FMO example with Slice Group #0 and Slice Group #1]

    Profile & Level
    Main applications
    - Baseline: video telephony
    - Main: DTV and storage
    - Extended: streaming
    [Figure: profiles and tools]

    Performance comparison

    Contributions of the VCL Tools

    - Spatial prediction for intra-coded macroblocks: saves 6-9% bits
    - Temporal prediction: saves around 50% bits
    - Transforms: PSNR difference of less than 0.02 dB
    - Logarithmic quantization: a 12% increase in step size also saves about 12% bits
    - CAVLC: saves 5-8% bits
    - CABAC: saves 5-15% bits over CAVLC
    - Picture-adaptive frame/field (PAFF) coding: saves 16-20% bits
    - MB-adaptive frame/field (MBAFF) coding: saves 14-16% bits over PAFF
    - Deblocking filter: saves 5-10% bits
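
The 12% figure for logarithmic quantization follows from how the H.264 quantization parameter maps to the step size: the step doubles for every increase of 6 in QP. The sketch below assumes the commonly tabulated base step sizes for QP 0 to 5; the function name is hypothetical.

```python
# Assumed base step sizes for QP = 0..5; the pattern repeats, doubling every 6.
BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)

def qstep(qp):
    """Quantizer step size for a QP in the range 0..51."""
    return BASE_QSTEP[qp % 6] * 2 ** (qp // 6)

steps = [qstep(qp) for qp in (0, 6, 28, 51)]     # [0.625, 1.25, 16.0, 224.0]

# Average growth per QP step is the sixth root of 2, about 12%.
average_growth = 2 ** (1 / 6)
```

One QP step therefore changes the step size by about 12% on average, which gives the encoder fine rate control over a wide range with a single small integer.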

    Conclusion
    - Many video coding standards exist
    - Standards reflect both coding technology and implementation technology
    - Coding performance has improved more than 4 times since H.261 (1990)

    What's next
    - SVC (Scalable Video Coding) in H.264 (done)
    - H.264ext (further improvement of H.264)
    - 3-D and MVC (Multi-View Coding) are ongoing
    - UDTV (Ultra Definition TV: 3840x2160)
    - And what's next?
