EE569 Digital Video Processing

Existing Video Coding Standards
ISO
– MPEG-1 (1992): 1.5 Mbps, VCD
– MPEG-2/H.262 (1996): 2-10 Mbps, DVD
– MPEG-4 (2000): 8-1024 kbps
ITU
– H.120 (1984)
– H.261 (1990): p×64 kbps
– H.263: 8-512 kbps
– H.263+ (1998): Windows Media Player or RealPlayer
H.264/AVC coding standard

H.261 Coding Standard
Background:
– Facilitate video conferencing and videophone service over ISDN
– p×64 kbps (p=1: videophone; p>5: videoconference; p=30: VHS quality)
– Basis of MPEG-1 and MPEG-2
Features:
– Maximum coding delay of 150 ms
– Amenable to low-cost VLSI implementation
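The p×64 kbps rule can be tabulated with a few lines of Python. This is only an illustrative sketch: the function name is made up here, and the valid range 1 to 30 for p is taken from the H.261 recommendation rather than from the slide.

```python
def h261_bit_rate_kbps(p: int) -> int:
    """Channel rate in kbps for an H.261 stream using p ISDN B-channels."""
    # Assumption: p ranges over 1..30 (the slide only mentions p=1, p>5, p=30).
    if not 1 <= p <= 30:
        raise ValueError("p must be between 1 and 30")
    return p * 64


for p in (1, 6, 30):
    print(p, h261_bit_rate_kbps(p), "kbps")
```

With p=30 this gives 1920 kbps, the rate the slide associates with VHS quality.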

Video Multiplex
It defines a data structure so that a decoder can interpret the received bit stream without any ambiguity
Hierarchical data structure
– Picture layer
– Group of blocks (GOB) layer
– Macroblock (MB) layer
– Block layer
Each layer has a distinct header

Picture and GOB Layers
Picture layer consists of a picture header followed by the data for GOBs
– Picture header contains data such as the picture format (CIF or QCIF)
GOB layer is always composed of 33 macroblocks
– Each MB header contains a MB address and compression mode, followed by the data for the blocks
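Since every GOB holds 33 macroblocks, the number of GOBs per picture follows from the picture format. The sketch below assumes the usual luminance sizes of 352×288 for CIF and 176×144 for QCIF, which the slide does not state; the names are illustrative.

```python
# Assumed luminance resolutions (not given on the slide).
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}

MB_SIZE = 16       # a macroblock covers 16x16 luminance samples
MBS_PER_GOB = 33   # from the slide: a GOB always has 33 macroblocks


def gob_count(fmt: str) -> int:
    """Number of GOBs needed to cover one picture of the given format."""
    width, height = FORMATS[fmt]
    macroblocks = (width // MB_SIZE) * (height // MB_SIZE)
    return macroblocks // MBS_PER_GOB


for name in FORMATS:
    print(name, gob_count(name), "GOBs")
```

Under these assumptions a CIF picture has 396 macroblocks in 12 GOBs and a QCIF picture has 99 macroblocks in 3 GOBs.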

Macroblock and Block Layers
Macroblock: the smallest unit to select the compression mode
[Figure: luminance blocks Y1 Y2 / Y3 Y4 and chrominance blocks Cr, Cb]
A MB always consists of 6 blocks (Y1 – Y4, Cr, Cb)
MB layer: MBA | MTYPE | MQUANT | MVD | CBP | Block Data

Compression Modes
Intra Mode
– Similar to JPEG coding
– Support two compression modes
Inter Mode
– ME is not specified (MC is optional)
– Usually, 16-by-16 BMA, integer-pel accuracy, search range [-15,15]
– Support various compression modes
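Because H.261 leaves motion estimation to the encoder, the following is only one possible sketch of the "usual" choice named above: exhaustive block matching with integer-pel accuracy over a [-15,15] window. The sum of absolute differences (SAD) cost and all function names are assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the current block and a displaced reference block."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def block_match(cur, ref, bx, by, n=16, search=15):
    """Full-search BMA: return ((dx, dy), cost) for the block whose top-left corner is (bx, by)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would read outside the reference frame.
            if by + dy < 0 or bx + dx < 0 or by + dy + n > height or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

Frames are plain lists of rows here. A real encoder would typically use a faster search or an array library, but the exhaustive loop makes the search range explicit.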

Selecting a Compression Mode
Should a MV be transmitted?
Should we use intra or inter compression mode?
Should the quantizer stepsize be changed?
We can choose the optimal compression mode based on the variance of the original MB, the MB difference (bd), the displaced MB difference (dbd) and the best MV estimate

Selection Method
If the variance of dbd is smaller than that of bd, then we select Inter mode and MC is needed
– Need to transmit MVD
– The transmission of DCT coefficients is optional
Otherwise, no MV will be transmitted
– If the original MB has a smaller variance, select Intra mode; otherwise select Inter mode (but with a zero MV)
For MC blocks, prediction errors can be modified by a 2D spatial filter (the prototype of deblocking filter)
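The decision rule above can be sketched as a small function. The slide does not say what the original MB's variance is compared against; this sketch assumes it is compared with the variance of bd. The mode labels and function names are illustrative only.

```python
def variance(samples):
    """Population variance of a flat list of sample values."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def select_mode(orig, bd, dbd):
    """Pick a compression mode from the original MB, the MB difference (bd)
    and the displaced MB difference (dbd), each given as a flat list."""
    if variance(dbd) < variance(bd):
        return "inter+mc"          # motion compensation pays off: send MVD
    # No MV is transmitted from here on.
    if variance(orig) < variance(bd):  # assumption: compared against bd
        return "intra"
    return "inter (zero MV)"


print(select_mode([0, 10, 0, 10], [5, -5, 5, -5], [1, -1, 1, -1]))
```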

Prediction      MQUANT  MVD  CBP  TCOEFF  VLC
Inter+MC          x      x    x     x     0000 0000 01
Inter+MC+FIL             x                001
Inter+MC+FIL             x    x     x     01
Inter+MC+FIL      x      x    x     x     0000 01

Interpretation
MQUANT: when it is on, a new value of quantizer stepsize will be transmitted;
MVD: when it is on, the motion vector difference will be transmitted;
CBP: when it is on, it means at least one transform coefficient in the MB will be transmitted;
TCOEFF: when it is on, transform coefficients will be transmitted

Motivation: to increase the number of zero coefficients

Example
Coeff      50   0   0   0  33  34   0  40  33
T          32  32  33  34  35  36  37  38  32
Q[Coeff]   48   0   0   0   0   0   0  48  48
Coeff > T is quantized to a nonzero value; Coeff < T is set to zero
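The numbers in the example are consistent with a variable-threshold rule that is inferred here from the table, not stated on the slide: the threshold T starts at 32, grows by one after every coefficient that is zeroed, and resets to 32 after a coefficient that survives. Surviving coefficients appear to be quantized with step 32 and reconstructed at the interval midpoint. A sketch under those assumptions:

```python
def variable_threshold_quantize(coeffs, step=32):
    """Return (thresholds, quantized) for a scan of coefficients.

    Assumed rule, inferred from the example table:
    - T starts at `step`; a coefficient with |c| > T is kept, otherwise zeroed
    - after a zeroed coefficient T increases by 1; after a kept one T resets
    - kept coefficients are reconstructed at the midpoint of their interval
    """
    threshold = step
    thresholds, quantized = [], []
    for c in coeffs:
        thresholds.append(threshold)
        if abs(c) > threshold:
            level = abs(c) // step
            value = level * step + step // 2
            quantized.append(value if c > 0 else -value)
            threshold = step
        else:
            quantized.append(0)
            threshold += 1
    return thresholds, quantized


t, q = variable_threshold_quantize([50, 0, 0, 0, 33, 34, 0, 40, 33])
print(t)  # [32, 32, 33, 34, 35, 36, 37, 38, 32]
print(q)  # [48, 0, 0, 0, 0, 0, 0, 48, 48]
```

The rising threshold zeroes the small values 33 and 34 in the middle of a run, which lengthens the run of zeros and serves the stated motivation.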

Run-Length Coding
Zigzag Scan
[Figure: 8×8 block of quantized coefficients, all zero except the values 3, 2 and 1 near the top-left corner, traversed in zigzag order]
(run,level)
(0,3) (1,2) (7,1) EOB
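The (run,level) pairs can be reproduced with a short zigzag scanner. The block used below is hypothetical: its three nonzero values are placed at zigzag positions 0, 2 and 10 so that the scan yields the pairs listed above. The zigzag order is the conventional one that starts by moving right from the top-left corner.

```python
def zigzag_order(n=8):
    """Return the (row, col) visiting order of the conventional zigzag scan."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even anti-diagonals are walked upwards, odd ones downwards.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def run_level(block):
    """Encode a square block as (run, level) pairs; trailing zeros are implied by EOB."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


block = [[0] * 8 for _ in range(8)]   # hypothetical example block
block[0][0], block[1][0], block[4][0] = 3, 2, 1
print(run_level(block), "EOB")        # [(0, 3), (1, 2), (7, 1)] EOB
```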

H.261 Rate/Buffer Control
The coded video data rate is controlled by
– Pre-processing
– Quantization step-size
– Block significance criterion (CBP flag)
– Temporal sampling ratio
The fullness of buffer is controlled by
– Quantization step-size
– Maximum allowable coding delay (150 ms)

MPEG-I Standard
• Features
- Syntax based: no specific algorithm is standardized; the parameters defining the encoded bit stream and decoder are contained in the bit stream itself.
- Random access: allow independent access points (I-frame) to the bitstream.
- Fast forward and reverse search
- Reasonable coding/decoding delay

Input Video Format
• Progressive video (interlaced video is handled by MPEG2)
• Input video is first converted into the MPEG standard input format (SIF).
SIF format: Y - 352×240, Cr/Cb - 176×120, 30 frames/sec
[Figure: Y, Cr and Cb planes]

MPEG-I Constrained Parameter Set
- maximum number of pixels/line: 720
- maximum number of lines/picture: 576
- maximum number of pictures/sec: 30
- maximum number of macro-blocks/picture: 396
- maximum number of macro-blocks/sec: 9900
- maximum bit rate: 1.86 Mbps
- maximum decoder buffer size: 376,832 bits
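A quick way to see how these limits interact is to test a few picture formats against them. The sketch below uses the limits exactly as listed on the slide; the function name is illustrative and macroblocks are assumed to be 16×16.

```python
def within_constrained_parameters(width, height, fps):
    """Check a picture format against the constrained parameter set listed above."""
    mbs_per_picture = ((width + 15) // 16) * ((height + 15) // 16)
    return (
        width <= 720
        and height <= 576
        and fps <= 30
        and mbs_per_picture <= 396
        and mbs_per_picture * fps <= 9900
    )


print(within_constrained_parameters(352, 240, 30))  # SIF, 30 frames/sec
print(within_constrained_parameters(352, 288, 25))
print(within_constrained_parameters(720, 576, 25))
```

SIF at 30 frames/sec has 330 macroblocks per picture and exactly 9900 per second, and 352×288 at 25 frames/sec has 396 per picture and also 9900 per second, so both sit right at the limits. A 720×576 picture satisfies the line and pixel limits individually but far exceeds the macroblock count.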

Perspective Video Formats
format | resolution | bit rate

Hierarchical Data Structure (I)
• Sequences are formed by Groups Of Pictures (GOP)
• GOPs are made up of pictures
• Pictures consist of slices
• Slices are made up of macro-blocks
• Macro-blocks (MB) consist of blocks
• Blocks are 8×8 pixel arrays

Hierarchical Data Structure (II)
[Figure: sequence of GOPs → frames → slices → MBs → blocks]

Four Compression Modes
• I frame: Intra-frame JPEG-like coding
• P frame: forward prediction from previous frames
• B frame: forward, backward or bi-directional prediction
• D frame: contains only the DC component of each block
GOP:
I B B B P B B B P
0 1 2 3 4 5 6 7 8

GOP Reordering
GOP:
I B B B P B B B P
0 1 2 3 4 5 6 7 8
Processing order: 0,4,1,2,3,8,5,6,7
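The processing order follows from one rule: a B-frame can only be coded after both of its reference frames, so each I- or P-frame is moved ahead of the B-frames that precede it in display order. A minimal sketch (function name illustrative), assuming the GOP pattern I B B B P B B B P implied by the processing order above:

```python
def processing_order(frame_types):
    """Map display order to coding order for a string such as 'IBBBPBBBP'."""
    order, waiting_b = [], []
    for index, frame_type in enumerate(frame_types):
        if frame_type == "B":
            waiting_b.append(index)      # needs the next I/P frame first
        else:
            order.append(index)          # reference frame is coded first
            order.extend(waiting_b)      # then the B-frames that depend on it
            waiting_b = []
    order.extend(waiting_b)              # trailing B-frames, if any
    return order


print(processing_order("IBBBPBBBP"))  # [0, 4, 1, 2, 3, 8, 5, 6, 7]
```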

MB Types in MPEG-I
I-pictures   P-pictures   B-pictures
Intra        Intra        Intra
Intra-A      Intra-A      Intra-A
             Inter-D      Inter-F
             Inter-DA     Inter-FD
             Inter-F      Inter-FDA
             Inter-FD     Inter-B
             Inter-FDA    Inter-BD
             Skipped      Inter-BDA
                          Inter-I
                          Inter-ID
                          Inter-IDA
                          Skipped
A – adaptive quantization
F – forward prediction with MC
D – DCT of prediction error will be coded
B – backward prediction with MC
I – interpolated prediction with MC

Intra-frame Compression Mode
JPEG-like coder: 8×8 DCT → Quantization → Run-length coding
Default quantization matrix Q0
 8 16 19 22 26 27 29 34
16 16 22 24 27 29 34 37
19 22 26 27 29 34 34 38
22 22 26 27 29 34 37 40
22 26 27 29 32 35 40 48
26 27 29 32 35 40 48 58
26 27 29 34 38 46 56 69
27 29 35 38 46 56 69 83
Spatially adaptive quantization: the MQUANT parameter scales Q0 to give Q
• MB types
- Intra: Q0
- Intra-A: Q

Inter-frame Compression Mode (P)
• MB types
- Intra
- Intra-A
- Inter-D
- Inter-DA: a new MQUANT value and DCT of prediction error will be coded
- Inter-F
- Inter-FD: we need to transmit MV and DCT of prediction error
- Inter-FDA: we need to transmit MV, DCT of prediction error and a new MQUANT
- skipped: directly copy from the block at the same position in the previous frame

- allow efficient handling of problems associated with covered/uncovered background
- MC averaging over two frames suppresses noise better than prediction from just one frame
- Since B-frames are not used in predicting future frames, they can be coded with fewer bits without causing error propagation

- Two frame buffers are needed
- Longer coding delay

Why does it improve coding efficiency?
– Multi-hypothesis motion compensation (MHMC)
– B frame is one of the simplest MHMC (two hypotheses: forward and backward)
Why does it facilitate scalable coding?
– Temporal scalability
– We can skip B-frames without affecting the decoding of other frames

MPEG-I Encoder and Decoder
• Encoder modules
motion estimation, selection of compression mode (MTYPE) per MB, setting MQUANT value, MCP, quantizer and dequantizer, DCT and IDCT, VLC, multiplexer, buffer and buffer regulator
• Decoder modules
Demultiplexer, VLC decoder, MCP, dequantizer and IDCT
- The relative number of I, P, B pictures in a GOP is application dependent. The use of B-pictures is optional. There is at least one I picture every 132 pictures.
- Half-pixel accuracy in motion estimation
- Motion vectors that refer to pixels outside of the picture are not allowed

Berkeley version
– toe.cs.berkeley.edu (128.32.149.117)
– /pub/multimedia/mpeg/mpeg-2.0.tar.Z

MPEG-2 Standard
• Features
- it allows for interlaced input, higher-definition inputs and alternative subsampling of chrominance channels
- it offers scalable bit stream
- it provides improved quantization and coding options
• Profiles
- simple profile, main profile, SNR scalable profile, spatially scalable profile and high profile

Chrominance Subsampling
• 4:2:0 (same as MPEG-I)
• 4:2:2 (chroma subsampled in the horizontal direction only)
• 4:4:4 (no chroma subsampling)
[Figure: luminance and chrominance sample positions]
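The effect of the three formats on the size of each chrominance plane can be computed directly. In this sketch the 720×480 luminance size is only an example value and the function name is illustrative.

```python
# Horizontal and vertical subsampling factors for each chroma format.
SUBSAMPLING = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}


def chroma_size(width, height, chroma_format):
    """Size of one chrominance plane (Cb or Cr) for a given luminance size."""
    h_factor, v_factor = SUBSAMPLING[chroma_format]
    return width // h_factor, height // v_factor


for fmt in SUBSAMPLING:
    print(fmt, chroma_size(720, 480, fmt))
```

For a 720×480 luminance plane this gives 360×240 for 4:2:0, 360×480 for 4:2:2 and 720×480 for 4:4:4.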

Interlaced Video Coding
• Frame pictures
Interleave lines of even and odd fields to form composite frames
• Field pictures
Even and odd fields are treated as separate pictures
[Figure: odd-field and even-field lines and the resulting 8×8 blocks]
Q: For video containing significant motion, which format is preferred?

Frame and Field Pictures
GOP can be composed of a mixture of frame and field pictures
– Field pictures always appear in pairs (top field and bottom field)
– If the top field is a P-/B- picture, then the bottom field must also be a P-/B- picture
– If the top field is an I-picture, then the bottom field can be an I- or P- picture
– A pair of field pictures are encoded in the order in which they should appear at the output

Frame and Field DCT
[Figure: block organization for Frame DCT and Field DCT]
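The difference between the two block organizations can be sketched on the 16×16 luminance part of a macroblock. With frame DCT the four 8×8 blocks are cut from the interleaved lines; with field DCT the lines are first regrouped by field, so the upper two blocks hold one field and the lower two the other. The function name and the toy macroblock are illustrative.

```python
def luma_blocks(macroblock, field_dct):
    """Split a 16x16 luminance macroblock (list of 16 rows) into four 8x8 blocks."""
    if field_dct:
        # Regroup lines by field: even lines first, then odd lines.
        rows = macroblock[0::2] + macroblock[1::2]
    else:
        rows = macroblock
    return [
        [row[c:c + 8] for row in rows[r:r + 8]]
        for r in (0, 8)
        for c in (0, 8)
    ]


# Toy macroblock with strong inter-field motion: the two fields differ completely.
mb = [[line % 2] * 16 for line in range(16)]
frame_blocks = luma_blocks(mb, field_dct=False)
field_blocks = luma_blocks(mb, field_dct=True)
print(frame_blocks[0][:2])  # alternating lines inside one block
print(field_blocks[0][:2])  # uniform lines inside one block
```

In the toy case each field-DCT block is flat, whereas each frame-DCT block alternates line by line and would produce large vertical high-frequency coefficients.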

Frame and Field Prediction
MC Prediction Modes
– Simple field prediction
– Simple frame prediction
Within a field picture, only simple field prediction is used
Within a frame picture, either simple field prediction or simple frame prediction can be employed on a MB-by-MB basis

Frame and Field Prediction (cont’d)
In the presence of motion, frame prediction suffers from strong motion artifacts; in the absence of motion, field prediction does not utilize all the available information
16×8 MC mode: only used in the field pictures, two MVs are used for top and bottom fields respectively
Dual-prime mode: used only for P-pictures, one MV and a small differential MV are encoded

Spatial, Temporal and SNR Scalability in MPEG-2
• Spatial (resolution) scalability
- base layer is a low spatial resolution version of the video
- enhancement layers successively enhance the spatial resolution
• SNR (rate, quality) scalability
- base layer uses a coarse quantizer for DCT coefficients
- enhancement layer uses a fine quantizer for DCT coefficients
• Temporal scalability
- allows decodability at different frame rates
Note: the scalability feature provided by MPEG-2 is ad hoc in the sense of significantly sacrificing coding efficiency
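SNR scalability can be illustrated on a single DCT coefficient: the base layer codes it with a coarse step, and the enhancement layer codes the remaining error with a finer step. The step sizes 16 and 4, the uniform rounding quantizer and the function names are assumptions chosen for illustration, not values from MPEG-2.

```python
import math


def quantize(value, step):
    """Uniform quantizer: index of the nearest multiple of `step`."""
    return math.floor(value / step + 0.5)


def snr_scalable_code(coeff, coarse_step=16, fine_step=4):
    """Return (base_reconstruction, enhanced_reconstruction) for one coefficient."""
    base_level = quantize(coeff, coarse_step)
    base_rec = base_level * coarse_step
    # The enhancement layer refines the base layer's quantization error.
    enh_level = quantize(coeff - base_rec, fine_step)
    return base_rec, base_rec + enh_level * fine_step


print(snr_scalable_code(37))   # (32, 36)
print(snr_scalable_code(-21))  # (-16, -20)
```

A decoder that receives only the base layer reconstructs 32 for the coefficient 37. With the enhancement layer the reconstruction improves to 36.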

Other Improvements (I)
optional alternate scan (said to fit interlaced video better)

Other Improvements (II)
Finer Quantization of the DCT Coefficients
                      MPEG-I        MPEG-II
Intra MB DC Coeff.    8 bits        11 bits
Intra MB AC Coeff.    [-256,255]    [-2048,2047]
Non-intra MB Coeff.   [-256,255]    [-2048,2047]

Other Improvements (III)
Finer Adjustment of MQUANT
MQUANT in MPEG-I
 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0
 9.0 10.0 11.0 12.0 13.0 14.0 15.0 16.0
17.0 18.0 19.0 20.0 21.0 22.0 23.0 24.0
25.0 26.0 27.0 28.0 29.0 30.0 31.0
MQUANT in MPEG-II
 0.5  1.0  1.5  2.0  2.5  3.0  3.5  4.0
 5.0  6.0  7.0  8.0  9.0 10.0 11.0 12.0
14.0 16.0 18.0 20.0 22.0 24.0 26.0 28.0
32.0 36.0 40.0 44.0 48.0 52.0 56.0
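The MPEG-II values follow a simple pattern: the spacing doubles every eight entries (0.5, 1, 2, then 4), which gives fine control at small step sizes and a wider overall range. The sketch below regenerates both 31-entry tables as they appear on the slide; the function names are illustrative.

```python
def mquant_mpeg1():
    """Linear table: 31 values from 1.0 to 31.0."""
    return [float(i) for i in range(1, 32)]


def mquant_mpeg2():
    """Non-linear table from the slide: spacing doubles every eight entries."""
    values, current = [], 0.0
    for spacing in (0.5, 1.0, 2.0, 4.0):
        for _ in range(8):
            current += spacing
            values.append(current)
    return values[:31]  # only 31 codes are used, so the last row has seven entries


print(mquant_mpeg1()[-1], mquant_mpeg2()[0], mquant_mpeg2()[-1])
```

The smallest MPEG-II value is 0.5 against 1.0 in MPEG-I, and the largest is 56.0 against 31.0.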

Five profiles defined by MPEG-II
Simple: Does not allow B-frames and only supports Main level
Main: Does not support scalability. Supports all four levels with upper bounds of 4, 15, 60 and 80 Mbps respectively
SNR scalable: Supports Low and Main levels with maximum bit rates 4(3) and 15(10) Mbps
Spatially scalable: Supports only High-1440 level with a maximum bitrate of 60(15) Mbps
High: Supports Main, High-1440 and High levels with maximum bit rates of 20(4), 80(20) and 100(25) Mbps respectively

SGS-Thomson
– STi3400: single-chip, MPEG-I, SIF rates
– STi3500: the first MPEG-II chip on the market