Transcoding from H.264/AVC to HEVC
Shantanu Kulkarni
UTA ID: 1000789943
Introduction to Transcoding
Video transcoding is the operation of converting video from one format to another. A format is defined by characteristics such as bit-rate and spatial resolution.
Transcoding is a promising technology that provides video adaptation in terms of bit-rate reduction, resolution reduction and format conversion to meet various requirements.
The most basic transcoding architecture is as follows:
Fig. 1 Basic architecture for transcoding [8]
Need for Transcoding
Most video coding standards are designed primarily for high coding efficiency, which is the ability to encode video at the lowest possible bitrate while maintaining a given level of video quality.
HEVC, a recently emerged video coding standard, aims at high coding efficiency while retaining video quality.
With its hybrid coding architecture, motion-compensated prediction and transform coding, it can be seen as an improved version of the previous standard, H.264 [6].
Need for Transcoding contd.
Transcoding from H.264 to HEVC lowers the bitrate, resulting in more efficient compression [1].
AVC and HEVC share a similar prediction, transform, quantization and entropy coding architecture [1].
Overview of HEVC
The HEVC standard is based on the well-known block-based hybrid coding architecture, combining motion-compensated prediction and transform coding with high-efficiency entropy coding.
It employs a flexible quad-tree coding block partitioning structure that enables the efficient use of large and multiple sizes of coding, prediction and transform blocks.
It also employs improved intra prediction and coding, adaptive motion parameter prediction and coding, a new loop filter and an enhanced version of context-adaptive binary arithmetic coding (CABAC) entropy coding.
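The quad-tree partitioning mentioned above can be illustrated with a minimal Python sketch. The `should_split` callback is a hypothetical stand-in for the encoder's rate-distortion decision, and the 64x64 root and 8x8 minimum CU size are assumed settings, not values taken from these slides.

```python
def partition_cu(x, y, size, should_split, min_size=8):
    """Recursively split a square coding unit into four quadrants.

    Returns the leaf CUs as (x, y, size) tuples. `should_split` is a
    hypothetical callback standing in for the encoder's RD decision.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += partition_cu(x + dx, y + dy, half, should_split, min_size)
        return leaves
    return [(x, y, size)]


def demo_rule(x, y, size):
    # Illustrative rule: split the 64x64 root, then only its top-left 32x32 quadrant.
    return size == 64 or (size == 32 and (x, y) == (0, 0))


leaves = partition_cu(0, 0, 64, demo_rule)
for leaf in leaves:
    print(leaf)
```

With this rule the 64x64 block ends up as four 16x16 CUs in the top-left quadrant plus three 32x32 CUs, and the leaves always tile the original block exactly.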
Overview of HEVC contd.
Fig. 2 HEVC encoder block diagram
Overview of HEVC contd.
Fig. 3 Block based architecture HEVC – Intra-Prediction [17]
Overview of H.264/AVC [6]
Directional spatial prediction for intra coding (9 directional prediction modes)
Variable block-size motion compensation with small block size
Quarter-sample-accurate motion compensation
Motion vectors over picture boundaries
Multiple reference picture motion compensation
Decoupling of referencing order from display order
In-the-loop deblocking filtering
H.264/AVC Encoder-Decoder Block Diagram
Fig. 4 H.264/AVC Encoder [2]
Fig. 5 H.264/AVC Decoder [2]
Comparison of AVC & HEVC
Larger block structure in HEVC, up to a maximum of 64x64 pixels per block
Up to 35 intra prediction modes in HEVC (33 directional modes + DC + Planar), while H.264 has 9 directional modes of intra prediction
Adaptive motion vector prediction, which allows the codec to find more inter-frame redundancies
Superior parallelization tools, including wavefront parallel processing, for more efficient coding in a multi-core environment
Entropy coding using CABAC only; CAVLC is no longer available
Improvements to the de-blocking filter and addition of one more filter, called Sample Adaptive Offset (SAO), that further reduces the artifacts remaining after de-blocking
HEVC Transcoder
The transcoding schemes discussed here avoid high computational complexity by reducing the number of RDO evaluations, motion compensation operations and fractional-pixel interpolation operations.
Fig. 6 Pixel domain AVC-HEVC transcoder [1]
[Block diagram. Blocks: AVC Decoder, Simplified Mode Selection, HEVC Re-encoder. Input: AVC bitstream. Output: HEVC bitstream. Signal labels: "Residual, modes and MVs" and "CU, PU partitions and MVs".]
HEVC Transcoder contd.
The LCU is initially split according to the input MB modes in AVC.
The initial CU partitions are then merged into larger sizes according to the prediction directions of four adjacent sub-CUs. For example, if the prediction directions of four adjacent 8x8 CUs are the same, they are merged into a 16x16 CU. Similar merge operations are also performed on CUs larger than 8x8.
The merge process is applied from the smallest 4x4 blocks up to blocks of size 32x32.
The input information from AVC can also be used to reduce the candidate prediction directions evaluated with SATD, that is, to shorten the SATD candidate list.
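The merge rule on this slide can be sketched in Python as a bottom-up pass over a grid of 4x4 blocks. This is only an illustration under stated assumptions: prediction directions are represented as plain integers, the function and parameter names are hypothetical, and "up to blocks of size 32x32" is read as "blocks of up to 32x32 may be merged into their parent".

```python
def merge_cus(dirs, base=4, max_merge_from=32):
    """Bottom-up merge of square blocks that share a prediction direction.

    dirs: square grid (list of lists) with one direction per base x base block.
    Returns a sorted list of (x, y, size, direction) blocks after merging.
    """
    n = len(dirs)
    extent = n * base
    # Current blocks, keyed by top-left corner: (x, y) -> (size, direction).
    blocks = {(c * base, r * base): (base, dirs[r][c])
              for r in range(n) for c in range(n)}
    size = base
    while size <= max_merge_from and size * 2 <= extent:
        parent = size * 2
        for y in range(0, extent, parent):
            for x in range(0, extent, parent):
                kids = [(x, y), (x + size, y), (x, y + size), (x + size, y + size)]
                vals = [blocks.get(k) for k in kids]
                # Merge only if all four sub-blocks exist at this size
                # and carry the same prediction direction.
                same_size = all(v is not None and v[0] == size for v in vals)
                if same_size and len({v[1] for v in vals}) == 1:
                    for k in kids:
                        del blocks[k]
                    blocks[(x, y)] = (parent, vals[0][1])
        size = parent
    return sorted((x, y, s, d) for (x, y), (s, d) in blocks.items())


# A 16x16 area described by a 4x4 grid of 4x4 blocks (direction values are made up).
grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 2, 0],
    [3, 3, 2, 2],
]
merged = merge_cus(grid)
for block in merged:
    print(block)
```

In this example three quadrants merge into 8x8 blocks, while the bottom-right quadrant stays as four 4x4 blocks because one of its directions differs; that also prevents the final merge into a single 16x16 block.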
HEVC Transcoder contd.
The major complexity of inter-picture coding comes from the motion estimation (ME) and motion compensation (MC) operations performed when testing every set of possible coding parameters: CU size, PU mode and TU mode.
Thus, it is proposed to reduce these operations with the help of the input AVC information, e.g. residuals, modes and MVs.
Since the largest CU (LCU) in HEVC covers 16 MBs in AVC (a 64x64 LCU spans a 4x4 grid of 16x16 MBs), the information of these MBs is passed to the mode selection module after AVC decoding.
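The relation between one LCU and its 16 MBs can be made concrete with a short Python sketch. It assumes a 64x64 LCU, 16x16 MBs and plain raster-order MB addressing (no flexible MB ordering); the function name is hypothetical.

```python
MB_SIZE = 16
LCU_SIZE = 64  # assumed LCU size


def mbs_in_lcu(lcu_x, lcu_y, width, height):
    """Raster-order addresses of the MBs covered by the LCU at (lcu_x, lcu_y).

    lcu_x and lcu_y are given in LCU units; width and height are in pixels.
    LCUs on the right or bottom picture edge may cover fewer than 16 MBs.
    """
    mbs_per_row = (width + MB_SIZE - 1) // MB_SIZE
    mbs_per_col = (height + MB_SIZE - 1) // MB_SIZE
    ratio = LCU_SIZE // MB_SIZE  # 4 MBs per LCU side
    addrs = []
    for my in range(lcu_y * ratio, min((lcu_y + 1) * ratio, mbs_per_col)):
        for mx in range(lcu_x * ratio, min((lcu_x + 1) * ratio, mbs_per_row)):
            addrs.append(my * mbs_per_row + mx)
    return addrs


# CIF (352x288) is 22 MBs wide, so the sixth LCU column is only partly filled.
first_lcu = mbs_in_lcu(0, 0, 352, 288)
edge_lcu = mbs_in_lcu(5, 0, 352, 288)
print(len(first_lcu), first_lcu)
print(len(edge_lcu), edge_lcu)
```

A transcoder could use such a mapping to gather the modes and MVs of the co-located MBs before deciding the CU partition of each LCU.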
HEVC Transcoder contd.
Since the input AVC bitstream already contains useful information about the MB partitions and prediction directions, this information is extracted and reused by the HEVC encoder without additional computation.
A key technique in AVC to HEVC transcoding is merging smaller blocks into a larger CU, especially for bit-rate reduction transcoding. Since a large CU may consist of several 4x4 blocks that may have different MVs, merging these blocks becomes a matter of measuring how the RD cost changes when the MV changes.
Cascaded encoder decoder transcoder architecture
Includes complete decoding and re-encoding
High complexity
Error due to the lossy re-encoding of the already decoded sequence
Fig. 7 Cascaded encoder-decoder transcoder
[Block diagram. Blocks: H.264 Encoder, H.264 Decoder, HEVC Encoder, HEVC Decoder; the H.264 Decoder and HEVC Encoder form the cascaded decoder and encoder. Signal labels: Input bit stream, H.264 bit stream, Transcoded HEVC bit stream, Reconstructed bit stream, Output bit stream.]
Simulation results
Sequence: akiyo_qcif

Component  Metric     Encoded by H.264  Encoded by HEVC  Transcoded w.r.t. original  Transcoded w.r.t. H.264
Y          MSE        7.9453            16.3527          14.9827                     13.0089
Y          PSNR (dB)  39.14             35.9949          36.3749                     36.9884
U          MSE        4.89645           73.6801          7.6814                      5.4491
U          PSNR (dB)  41.234            39.4573          39.2764                     40.7675
V          MSE        4.05427           44.4938          5.099                       3.8215
V          PSNR (dB)  42.054            41.6478          41.056                      42.3084
Color      PSNR (dB)  39.766            37.13431         37.322725                   38.1257875

Metric                  Encoded by H.264  Encoded by HEVC  Transcoded output
Bitrate (kbps)          15.53             12.4128          11.64
Computation time (sec)  149.847           504.537          494.878

Table 1. MSE and PSNR of akiyo_qcif.yuv video sequence for 100 frames
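As a cross-check on how the MSE and PSNR entries relate, the following Python sketch applies the standard PSNR definition for 8-bit samples (peak value 255). The 6:1:1 weighting used for the color PSNR is not stated on the slides; it is inferred here because it reproduces the tabulated color PSNR values, so treat it as an assumption.

```python
import math


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (8-bit samples assumed)."""
    return 10.0 * math.log10(peak ** 2 / mse)


def color_psnr(psnr_y, psnr_u, psnr_v):
    """Weighted 6:1:1 combination of Y, U and V PSNR.

    The weighting is inferred from the table values, not stated in the slides.
    """
    return (6.0 * psnr_y + psnr_u + psnr_v) / 8.0


# Values from the "Encoded by H.264" column of Table 1.
y_psnr = psnr_from_mse(7.9453)
combined = color_psnr(39.14, 41.234, 42.054)
print(f"Y PSNR from MSE: {y_psnr:.2f} dB")
print(f"Color PSNR: {combined:.3f} dB")
```

For the H.264 column of Table 1 this gives a Y PSNR of roughly 39.13 dB from the listed MSE, close to the tabulated 39.14 dB, and a combined value of 39.766 dB, matching the color PSNR entry.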
Simulation results continued…
Sequence: foreman_qcif

Component  Metric     Encoded by H.264  Encoded by HEVC  Transcoded w.r.t. original  Transcoded w.r.t. H.264
Y          MSE        16.03446          29.9875          16.0345                     23.6761
Y          PSNR (dB)  36.134            33.3614          36.0803                     34.3877
U          MSE        5.56395           7.3461           5.5639                      3.5923
U          PSNR (dB)  40.689            39.4702          40.677                      42.577
V          MSE        4.00861           5.8822           4.0086                      3.9526
V          PSNR (dB)  42.124            40.4354          42.1009                     42.1619
Color      PSNR (dB)  37.452125         35.00925         37.4074625                  36.3831375

Metric                  Encoded by H.264  Encoded by HEVC  Transcoded output
Bitrate (kbps)          90.6              45.8904          43.4808
Computation time (sec)  198.281           851.182          839.835

Table 2. MSE and PSNR of foreman_qcif.yuv video sequence for 100 frames
Simulation results continued…
Sequence: mobile_cif

Component  Metric     Encoded by H.264  Encoded by HEVC  Transcoded w.r.t. original  Transcoded w.r.t. H.264
Y          MSE        33.74791          60.16589         29.5597                     48.6284
Y          PSNR (dB)  33.021            30.3373          33.4238                     31.2619
U          MSE        16.33582          19.1911          24.0076                     9.5459
U          PSNR (dB)  36.033            35.2998          34.3273                     38.3326
V          MSE        17.23845          22.2633          26.3195                     12.7897
V          PSNR (dB)  35.806            34.6549          33.928                      37.0622
Color      PSNR (dB)  33.745625         31.49731         33.5997625                  32.870775

Metric                  Encoded by H.264  Encoded by HEVC  Transcoded output
Bitrate (kbps)          851.1             361.4736         337.5528
Computation time (sec)  605.525           4053.218         3957.333

Table 3. MSE and PSNR of mobile_cif.yuv video sequence for 100 frames
Simulation results continued…
Sequence: coastguard_cif

Component  Metric     Encoded by H.264  Encoded by HEVC  Transcoded w.r.t. original  Transcoded w.r.t. H.264
Y          MSE        43.65598          54.3388          33.7838                     32.0597
Y          PSNR (dB)  31.797            30.7797          32.8437                     33.0712
U          MSE        3.745             3.7246           21.7963                     1.75509
U          PSNR (dB)  42.499            42.42            34.747                      45.6878
V          MSE        3.11601           2.9498           24.5536                     1.4232
V          PSNR (dB)  43.26             43.4328          34.2296                     46.5979
Color      PSNR (dB)  34.567625         33.81638         33.25485                    36.3391125

Metric                  Encoded by H.264  Encoded by HEVC  Transcoded output
Bitrate (kbps)          428.1             295.2936         233.5488
Computation time (sec)  846.241           4012.575         3791.309

Table 4. MSE and PSNR of coastguard_cif.yuv video sequence for 100 frames
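The bitrate and computation-time rows of Tables 1 to 4 can be summarized with a short Python sketch. It assumes that the third value in each of those rows belongs to the transcoded output, which matches the three series shown in the bitrate and computation-time charts; the helper name is hypothetical.

```python
# (H.264 kbps, transcoded kbps, H.264 encoding sec, transcoding sec), from Tables 1 to 4.
RESULTS = {
    "akiyo_qcif": (15.53, 11.64, 149.847, 494.878),
    "foreman_qcif": (90.6, 43.4808, 198.281, 839.835),
    "mobile_cif": (851.1, 337.5528, 605.525, 3957.333),
    "coastguard_cif": (428.1, 233.5488, 846.241, 3791.309),
}


def bitrate_saving_percent(original_kbps, new_kbps):
    """Relative bitrate reduction of the new stream, in percent."""
    return 100.0 * (original_kbps - new_kbps) / original_kbps


savings = {name: bitrate_saving_percent(h, t)
           for name, (h, t, _, _) in RESULTS.items()}
time_ratio = {name: t_sec / h_sec
              for name, (_, _, h_sec, t_sec) in RESULTS.items()}

for name in RESULTS:
    print(f"{name}: bitrate saving {savings[name]:.1f}%, "
          f"time ratio {time_ratio[name]:.2f}x")
```

Under that assumption the transcoded streams use roughly 25% to 60% less bitrate than the H.264 input, at the cost of several times the H.264 encoding time.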
Comparison of PSNR for colored images between ‘qcif’ video sequences akiyo and foreman
Fig 8. PSNR of video sequences akiyo_qcif and foreman_qcif
[Chart. x-axis: video sequences (PSNR_akiyo_qcif, PSNR_foreman_qcif); y-axis: PSNR (dB), 34 to 40. Series: Encoded by H.264; Encoded by HEVC; Transcoded output with respect to original; Transcoded output with respect to H.264 reconstructed frames.]
Comparison of PSNR for colored images between ‘cif’ video sequences mobile and coastguard
Fig 9. PSNR of video sequences mobile_cif and coastguard_cif
[Chart. x-axis: video sequences (PSNR_mobile_cif, PSNR_coastguard_cif); y-axis: PSNR (dB), 30 to 37. Series: Encoded by H.264; Encoded by HEVC; Transcoded output with respect to original; Transcoded output with respect to H.264 reconstructed frames.]
Comparison of bitrate for akiyo_qcif video sequence
Fig 10. Bitrate comparison between H.264 encoded, HEVC encoded and transcoded output using akiyo_qcif.yuv sequence. (100 frames)
[Chart. x-axis: video sequences (Bitrate_akiyo_qcif); y-axis: bitrate (kbps), 10 to 16. Series: Encoded by H.264; Encoded by HEVC; Transcoded output with respect to H.264 reconstructed frames.]
Comparison of bitrate for foreman_qcif video sequence
Fig 10. Bitrate comparison between H.264 encoded, HEVC encoded and transcoded output using foreman_qcif.yuv sequence. (100 frames)
[Chart. x-axis: video sequences (Bitrate_foreman_qcif); y-axis: bitrate (kbps), 40 to 100. Series: Encoded by H.264; Encoded by HEVC; Transcoded output with respect to H.264 reconstructed frames.]
Comparison of bitrate for mobile_cif video sequence
Fig 11. Bitrate comparison between H.264 encoded, HEVC encoded and transcoded output using mobile_cif.yuv sequence. (100 frames)
[Chart. x-axis: video sequences (Bitrate_mobile_cif); y-axis: bitrate (kbps), 200 to 900. Series: Encoded by H.264; Encoded by HEVC; Transcoded output with respect to H.264 reconstructed frames.]
Comparison of bitrate for coastguard_cif video sequence
Fig 12. Bitrate comparison between H.264 encoded, HEVC encoded and transcoded output using coastguard_cif.yuv sequence. (100 frames)
[Chart. x-axis: video sequences (Bitrate_coastguard_cif); y-axis: bitrate (kbps), 200 to 450. Series: Encoded by H.264; Encoded by HEVC; Transcoded output with respect to H.264 reconstructed frames.]
Comparison of computation time for akiyo_qcif video sequence
Fig 13. Computation time comparison between H.264 encoded, HEVC encoded and transcoded output using akiyo_qcif.yuv sequence. (100 frames)
[Chart. x-axis: video sequences (Computation Time Akiyo); y-axis: time (sec), 100 to 550. Series: Encoded by H.264; Encoded by HEVC; Transcoded output with respect to H.264 reconstructed frames.]
Comparison of computation time for foreman_qcif video sequence
Fig 14. Computation time comparison between H.264 encoded, HEVC encoded and transcoded output using foreman_qcif.yuv sequence. (100 frames)
[Chart. x-axis: video sequences (Computation Time Foreman); y-axis: time (sec), 100 to 900. Series: Encoded by H.264; Encoded by HEVC; Transcoded output with respect to H.264 reconstructed frames.]
Comparison of computation time for mobile_cif video sequence
Fig 15. Computation time comparison between H.264 encoded, HEVC encoded and transcoded output using mobile_cif.yuv sequence. (100 frames)
[Chart. x-axis: video sequences (Computation Time Mobile); y-axis: time (sec), 500 to 4000. Series: Encoded by H.264; Encoded by HEVC; Transcoded output with respect to H.264 reconstructed frames.]
Comparison of computation time for coastguard_cif video sequence
Fig 16. Computation time comparison between H.264 encoded, HEVC encoded and transcoded output using coastguard_cif.yuv sequence. (100 frames)
[Chart. x-axis: video sequences (Computation Time Coastguard); y-axis: time (sec), 500 to 4250. Series: Encoded by H.264; Encoded by HEVC; Transcoded output with respect to H.264 reconstructed frames.]
Images Akiyo_qcif and Foreman_qcif
Fig 17. Akiyo_qcif video sequence: 17a) H.264 encoded, 17b) transcoded and 17c) HEVC encoded and reconstructed
Fig 18. Foreman_qcif video sequence: 18a) H.264 encoded, 18b) transcoded and 18c) HEVC encoded and reconstructed
Images Mobile_cif
Fig 19a. Mobile_cif H.264 encoded and reconstructed
Fig 19b. Mobile_cif Transcoded
Fig 19c. Mobile_cif HEVC Encoded and reconstructed
Images coastguard_cif
Fig 20a. Coastguard_cif H.264 encoded and reconstructed
Fig 20b. Coastguard_cif Transcoded
Fig 20c. Coastguard_cif HEVC Encoded and reconstructed
References
1. D. Zhang, B. Li, J. Xu, and H. Li, “Fast Transcoding from H.264/AVC to High Efficiency Video Coding”, IEEE International Conference on Multimedia and Expo, pp. 651-656, July 2012.
2. T. Wiegand et al, “Overview of the H.264/AVC video coding standard”, IEEE Trans. CSVT, Vol. 13, pp. 560-576, July 2003.
J. Xin, C.W. Lin and M.T. Sun, “Digital video transcoding”, Proceedings of the IEEE, Vol. 93, pp. 84-97, Jan 2005.
3. T. Wiegand, G. J. Sullivan, G. Bjøntegaard, and A. Luthra, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560-576, July 2003.
4. I. Kim, J. Min, T. Lee et al, “Block Partitioning Structure in the HEVC Standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1697-1706, December 2012.
5. G. Sullivan, P. Topiwala and A. Luthra, “The H.264/AVC video coding standard: overview and introduction to the fidelity range extensions”, SPIE Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp. 53-74, Aug 2004.
6. T. D. Nguyen et al, “Efficient MPEG-4 to H.264/AVC transcoding with spatial downscaling”, ETRI Journal, vol. 29, no. 6, pp. 826-828, Dec. 2007.
7. G.J. Sullivan, J. Ohm, W. Han et al, “Overview of High Efficiency Video Coding (HEVC) Standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 22, No. 12, Dec 2012.
8. A. Vetro, C. Christopoulos and H. Sun, “Video transcoding architectures and techniques: An overview”, IEEE Signal Processing Magazine, Vol. 20, pp. 18-29, March 2003.
9. HEVC open source software (encoder/decoder) https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-6.0
10. JM Reference Software - http://iphome.hhi.de/suehring/tml/ - H.264 reference software
11. Eduardo Peixoto Fernandes da Silva, “Advanced Heterogeneous Video Transcoding”, Queen Mary, University of London, PhD Thesis.
12. J. Padia, “Complexity Reduction For Vp6 To H.264 Transcoder Using Motion Vector Reuse”, MPL, University of Texas at Arlington, May 2012.
References Contd.
Reference Books
9. K. Sayood, “Introduction to Data compression”, III edition, Morgan Kaufmann publishers, 2006.
10. I.E.G. Richardson, “H.264 and MPEG-4 video compression: video coding for next-generation multimedia”, Edition II, Wiley, 2010.
Websites
11. http://en.wikipedia.org/wiki/ : Website for Wikipedia, Encyclopedia
12. http://www-ee.uta.edu/Dip/Courses/EE5359/index.html: Course website
13. http://ieeexplore.ieee.org/: Website archive for IEEE papers online
14. http://www.v-net.tv/hevc-is-game-changer-for-multi-screen-and-iptv/: Impact of HEVC standard on digital media market like cell phones, TVs etc
15. http://www.streamingmedia.com/Articles/Editorial/What-Is-.../What-Is-HEVC-(H.265)-87765.aspx: Summary about HEVC, information site.
16. http://mrutyunjayahiremath.blogspot.com/2010/09/h264-video-codec_22.html: Diagram for H.264 prediction direction modes
17. http://opticalengineering.spiedigitallibrary.org/article.aspx?articleid=1352660: Diagram for Intra prediction block structure for HEVC
THANK YOU