X. He et al. (Eds.): MMM 2015, Part II, LNCS 8936, pp. 307–310, 2015. © Springer International Publishing Switzerland 2015

Software Solution for HEVC Encoding and Decoding

Shengbin Meng, Jun Sun, and Zongming Guo

Institute of Computer Science & Technology, Peking University No.5 Yiheyuan Road, Beijing, China

{shengbin,sunjun,guozongming}@pku.edu.cn

Abstract. In this demonstration, we showcase a complete software encoding and decoding solution for the new High Efficiency Video Coding (HEVC) standard. The encoder is optimized for x86 processors using the SSE instruction set extension and multi-threading, and achieves high coding efficiency at a significantly reduced computational load. We have integrated the encoder library into the widely used media framework FFmpeg and developed transcoding and recording applications for HEVC. The decoder is highly optimized for both x86 and ARM architectures. With novel single-instruction-multiple-data (SIMD) algorithms and a frame-based parallel framework for multi-core CPUs, decoding speeds of 46 FPS for 1080p videos on an ARM Cortex-A9 1.5 GHz dual-core processor and 75 FPS for 4K (3840x2160) videos on an Intel i7-2600 3.4 GHz quad-core processor can be achieved. We have also integrated the decoder library into FFmpeg and built an Android video player on top of it. The software solution meets the demand for producing and watching HEVC videos on existing devices, showing a promising future for HEVC applications.

Keywords: HEVC, codec, software implementation, SIMD, optimization.

1 Introduction

The new video coding standard High Efficiency Video Coding (HEVC) [1] introduces enhanced coding tools and saves about 50% of the bit-rate at the same video quality compared with its predecessor, H.264/AVC. However, the computational complexity has also increased and has become a major obstacle to HEVC's adoption.

To increase the amount of HEVC video content, efficient transcoding and recording (in short, encoding) for HEVC is in practical demand and is a prerequisite for HEVC industrialization. On the other side, for HEVC to reach a large number of users, real-time UHD/HD video decoding must be achieved within the limited capacity of existing personal computers and mobile devices. Until hardware HEVC encoding/decoding chips are produced and dominate the market, fast software HEVC encoders and decoders are essential, challenging to implement, and will remain in demand for a long time. To the best of our knowledge, existing published work on improving software HEVC encoding or decoding speed is mostly based on the reference software HM [2]. Apart from general performance evaluation, these works provide neither technical details nor source code for reference, and are thus less beneficial for practical HEVC applications.

In this demo paper, we present a complete software solution for HEVC, including optimized encoder and decoder libraries, their integration into a well-known open-source media framework, and several applications.

2 Implementation, Optimization and Performance

2.1 The Encoder

The reference software HM [2] contains an encoder implementation. However, this implementation is not suitable for practical applications due to its redundant structure and inefficient code. We therefore start from scratch and implement an entirely new HEVC encoder, which is written in C and then highly optimized for x86 processors. The optimization mainly consists of novel data-level and task-level methods.

On the data level, SIMD algorithms are designed for the enhanced coding tools. We rewrite the time-consuming modules (e.g., motion compensation, integer transform, deblocking) using Streaming SIMD Extensions (SSE) instructions [3], which makes the encoder about 2 to 3 times faster than the C version. On the task level, an Inter-Frame Wavefront (IFW) method based on HEVC's native Wavefront Parallel Processing (WPP) [4] design is introduced to parallelize the encoding of Coding Tree Blocks (CTBs); it achieves a corresponding encoding speedup when running with multiple threads on multi-core CPUs.
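The wavefront dependency that WPP relies on can be illustrated with a small scheduling model. In WPP, a CTB can be processed once its left neighbour and the top-right CTB of the row above are finished. The sketch below computes the earliest finishing step of every CTB under that rule. It is only an illustration and not the authors' IFW implementation: the unit cost per CTB, the 64x64 CTB size, the unlimited number of threads and the function name `wpp_schedule` are all assumptions, and the inter-frame part of IFW is not modelled.

```python
import math


def wpp_schedule(width_ctbs, height_ctbs):
    """Earliest finishing step of each CTB under the WPP dependency rule.

    Assumes every CTB costs one time step and enough threads are available
    (one per CTB row). finish[r][c] is the step at which CTB (r, c) is done.
    """
    finish = [[0] * width_ctbs for _ in range(height_ctbs)]
    for r in range(height_ctbs):
        for c in range(width_ctbs):
            ready = 0
            if c > 0:
                # Left neighbour in the same row must be finished.
                ready = max(ready, finish[r][c - 1])
            if r > 0:
                # Top-right CTB of the row above (clamped at the last column).
                ready = max(ready, finish[r - 1][min(c + 1, width_ctbs - 1)])
            finish[r][c] = ready + 1
    return finish


# Class C resolution (832x480), assuming 64x64 CTBs (hypothetical choice).
cols = math.ceil(832 / 64)
rows = math.ceil(480 / 64)
finish = wpp_schedule(cols, rows)
wavefront_steps = finish[-1][-1]
serial_steps = cols * rows
print(f"{cols}x{rows} CTBs: {wavefront_steps} wavefront steps "
      f"vs {serial_steps} serial steps")
```

In this idealized model each row starts two steps after the row above, so the ratio of serial to wavefront steps is an upper bound on what row-level parallelism alone can deliver within a single frame.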

The performance evaluation is conducted on an Intel Xeon E5620 2.40 GHz processor under the Windows Server 2008 operating system. We choose the x265 encoder [5], successor of the best-in-class H.264/AVC encoder x264, as the baseline for the performance comparison. The best quality preset (with the slowest encoding) and the 8-bit HEVC standard test sequences from Class C (832x480 resolution) [6] are used to test the encoding quality and speed. The data in Table 1 show that, compared with x265, the proposed encoder (named Lentoid in the table) not only achieves a BD-rate [7] of about -20% (a coding gain) or better, but also runs about 7 to 8 times faster.
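For readers unfamiliar with the BD-rate metric, the following sketch shows one common formulation: fit a cubic to log10(bitrate) as a function of PSNR for each encoder, integrate both curves over the overlapping PSNR range, and convert the average difference back to a percentage. The paper does not state which fitting variant was used, so this is an assumption; the function names are hypothetical, and the cubic is obtained by Lagrange interpolation through the four rate points with Simpson integration.

```python
import math


def _cubic(xs, ys, x):
    """Lagrange polynomial through the points (xs, ys), evaluated at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _integral(xs, ys, lo, hi, steps=200):
    """Composite Simpson rule (steps must be even); exact for cubics."""
    h = (hi - lo) / steps
    acc = _cubic(xs, ys, lo) + _cubic(xs, ys, hi)
    for k in range(1, steps):
        acc += (4 if k % 2 else 2) * _cubic(xs, ys, lo + k * h)
    return acc * h / 3


def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bitrate difference of test vs. reference at equal PSNR.

    Returns a fraction: -0.20 means the test encoder needs 20% fewer bits.
    """
    log_ref = [math.log10(r) for r in rate_ref]
    log_test = [math.log10(r) for r in rate_test]
    # Only the PSNR range covered by both curves is compared.
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    avg_diff = (_integral(psnr_test, log_test, lo, hi)
                - _integral(psnr_ref, log_ref, lo, hi)) / (hi - lo)
    return 10 ** avg_diff - 1


# BasketballDrill rows of Table 1 (QP 30, 33, 36, 39).
x265_rate, x265_psnr = [1443.73, 932.61, 582.48, 511.12], [35.20, 33.51, 31.46, 30.41]
lent_rate, lent_psnr = [1428.51, 951.60, 650.77, 452.69], [35.99, 34.37, 32.84, 31.37]
print(f"BD-rate: {100 * bd_rate(x265_rate, x265_psnr, lent_rate, lent_psnr):.2f}%")
```

Because the fitting method is an assumption, the printed value need not match the BD-rate column of Table 1 exactly. A negative result means the tested encoder needs fewer bits than the reference for the same PSNR.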

2.2 The Decoder

In our HEVC decoder solution, to guarantee the effectiveness of the optimization, an efficient decoder prototype independent of HM is designed. For the most time-consuming decoding modules, novel SIMD algorithms are designed and implemented for both x86 processors, using the SSE instruction set, and ARM processors, using the NEON instruction set [8]. We then use a frame-based parallel framework with multi-threading to speed up decoding several times on multi-core CPUs.

For the x86 architecture, the performance evaluation is conducted on an Intel i7-2600 3.4 GHz quad-core processor with 8 GB memory and the Microsoft Windows 7 operating system. For ARM, the experiments are conducted on a Xiaomi Mi2 [9] smartphone with a Qualcomm Snapdragon S4 Pro APQ8064 quad-core 1.5 GHz Cortex-A9 processor, 2 GB memory and the Android 4.1 operating system. Considering the trending application scenarios, we present the test data for 4K (3840x2160) video decoding on the x86 device and 1080p (1920x1080) video decoding on the ARM device in Table 2 and Table 3, respectively. The 1080p videos are selected from the HEVC standard test sequences, and the 4K videos are downloaded from [10]. It can be concluded that, with the highly optimized decoder, real-time (24 FPS) playback is available for 4K videos on the x86 PC and for 1080p videos on the quad-core smartphone.

Table 1. Performance evaluation of the proposed encoder, comparing with x265 (both running at the best quality preset); 832x480 sequences

                          Bitrate (kbps)       Y-PSNR (dB)       Encoding FPS
Sequence            QP    x265      Lentoid    x265    Lentoid   x265    Lentoid   BD-rate    Speedup
BasketballDrill     30    1443.73   1428.51    35.20   35.99     0.208   1.548     -18.03%    7.11
(500 frames)        33     932.61    951.60    33.51   34.37     0.259   1.797
                    36     582.48    650.77    31.46   32.84     0.347   2.375
                    39     511.12    452.69    30.41   31.37     0.385   2.770
PartyScene          30    2860.43   2686.94    32.20   33.45     0.161   1.431     -33.06%    7.85
(500 frames)        33    1586.59   1678.95    29.68   31.49     0.226   1.706
                    36     810.82   1034.41    26.93   29.57     0.305   2.393
                    39     583.29    624.60    25.88   27.69     0.409   2.914
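The paper does not state how the Speedup column of Table 1 is derived. One reading that approximately reproduces it, offered here as an assumption, is the mean of the per-QP ratios of Lentoid encoding FPS to x265 encoding FPS. The sketch below performs that arithmetic on the Encoding FPS values from Table 1; the function name `mean_speedup` is hypothetical.

```python
# Encoding FPS from Table 1 as (x265, Lentoid) pairs for QP 30, 33, 36, 39.
fps = {
    "BasketballDrill": [(0.208, 1.548), (0.259, 1.797), (0.347, 2.375), (0.385, 2.770)],
    "PartyScene": [(0.161, 1.431), (0.226, 1.706), (0.305, 2.393), (0.409, 2.914)],
}


def mean_speedup(pairs):
    """Average of the per-QP FPS ratios Lentoid / x265."""
    ratios = [lentoid / x265 for x265, lentoid in pairs]
    return sum(ratios) / len(ratios)


for name, pairs in fps.items():
    print(f"{name}: {mean_speedup(pairs):.2f}x")
```

Under this reading both sequences come out within about 0.01 of the tabulated 7.11 and 7.85; a small residual is expected because the FPS values in the table are themselves rounded.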

Table 2. Performance evaluation of the optimized HEVC decoder over x86 processor for 4K videos

3840x2160 sequence    Bit-rate (kbps)    FPS
Coastguard            10000              44.40
                       5000              48.97
                       7500              56.94
Foreman               10000              46.96
                       5000              54.53
                       7500              65.00
Mobile                10000              40.46
                       5000              46.75
                       7500              56.23

Table 3. Performance evaluation of the optimized HEVC decoder over ARM processor for 1080p videos

1920x1080 sequence    Bit-rate (kbps)    FPS
Cactus                5661               29.20
                      2634               37.59
                      1347               45.36
BQTerrace             8083               21.35
                      2209               34.33
                       856               46.18
BasketBallDrive       6349               25.98
                      3028               32.25
                      1629               39.83

3 System Integration and Demonstration

For ease of use, we have integrated the optimized encoder and decoder into the well-known media framework FFmpeg [11], which is widely used in the open-source community and in industry. We provide patches to FFmpeg that enable it to encode and decode HEVC videos using our optimized codec as external libraries.

The demonstration of the encoder includes applications such as transcoding and recording. For transcoding, the command line tool ffmpeg, which comes with FFmpeg, can be used directly. For recording, we provide an Android application which achieves 15 FPS HEVC recording at CIF (352x288) resolution on a tablet with an Intel Atom quad-core 1.5 GHz processor and a digital camera.

To demonstrate the decoder for x86, we use the simple player ffplay, which comes with FFmpeg, to play back HEVC videos on an Intel PC. To demonstrate the decoder for ARM, we have made an Android application to play HEVC videos on Android phones and tablets with ARM CPUs.

The proposed implementation of the HEVC encoder and decoder is evolving towards a commercial product and can be downloaded at [12]. More test results on the codec performance and the source code of some applications are also available at the website www.xhevc.com.

4 Conclusion

In this paper, we demonstrate a complete solution for software HEVC encoding and decoding. The encoder and decoder are highly optimized and achieve a significant performance improvement over existing implementations. The codec libraries are also integrated into the widely used media framework FFmpeg and are ready for HEVC application development.

Acknowledgments. This work was supported by National Natural Science Foundation of China under contract No. 61271020, National High-tech Technology R&D Program (863 Program) of China under Grant 2014AA015205 and Beijing Natural Science Foundation under contract No. 4142021. Jun Sun is the corresponding author.

References

1. Sullivan, G.J., Ohm, J.-R., Han, W.-J., Wiegand, T.: Overview of the High Efficiency Video Coding (HEVC) standard. IEEE Trans. Circuits Syst. Video Technol. 22(12), 1649–1668 (2012)
2. Joint Collaborative Team on Video Coding (JCT-VC) Reference Software, svn://hevc.kw.bbc.co.uk/svn/jctvc-hm/
3. Intel Corp.: Intel® 64 and IA-32 Architectures Software Developers Manual
4. Henry, F., Pateux, S.: Wavefront Parallel Processing, document JCTVC-E196, JCT-VC, Geneva, Switzerland (March 2011)
5. http://www.videolan.org/developers/x265.html
6. Bossen, F.: Common test conditions and software reference configurations, document JCTVC-L1100, JCTVC, Geneva (January 2013)
7. Pateux, S.: Tools for proposal evaluations. ISO/IEC JTC1/SC29/WG11, JCTVC-A031 (April 2010)
8. The ARM Architecture, ARM Co. Ltd., Cambridge, UK, http://www.arm.com/files/pdf/ARM_Arch_A8.pdf
9. http://www.phonearena.com/phones/Xiaomi-Mi-Two_id7427
10. http://www.elementaltechnologies.com/resources/4k-test-sequences
11. FFmpeg, http://ffmpeg.org
12. http://www.xhevc.com/en/downloads/downloadCenter.jsp