Towards Efficient Wavefront Parallel Encoding of HEVC: Parallelism Analysis and Improvement
Keji Chen, Yizhou Duan, Jun Sun, Zongming Guo
2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP)

Dec 14, 2015


Outline

• Introduction
• Parallelism Evaluation of HEVC Encoding
• Proposed Method
• Experimental Results
• Conclusion


Introduction

The enhanced coding tools of HEVC greatly increase computational complexity, which makes the encoder difficult to deploy in practice.

By exploiting the parallelism among encoding tasks, the encoding speed can be significantly improved.


Compared with slices, Wavefront Parallel Processing (WPP) can achieve similar parallelism with less loss of coding efficiency.

In [11], Chi et al. proposed an Overlapped WaveFront (OWF) method based on WPP.

• [11] C. C. Chi, M. Alvarez-Mesa, B. Juurlink, G. Clare, F. Henry, S. Pateux, and T. Schierl, "Parallel Scalability and Efficiency of HEVC Parallelization Approaches," IEEE Trans. Circuits Syst. Video Technol., vol. 22, pp. 1827-1838, Dec. 2012.


Parallelism Evaluation of HEVC Encoding (1/3)

T_{i,j,k} : Self Encoding Complexity (SEC) of C_{i,j,k}.
• SEC can be evaluated by the encoding time.
• It is determined by the frame content and the RDO design, and does not change with the parallel method.

ET_F(C_{i,j,k}) : Required Encoding Complexity (REC) to encode C_{i,j,k} using parallel method F.
• REC can be regarded as the earliest ending time.
• It is affected by the data dependence.


Parallelism Evaluation of HEVC Encoding (2/3)

[Equations (1) and (2): not recoverable from the transcript; only "max{}" survives.]

• i, j, k : order of frame, line, and CTU.
• DEP_{F,inter}(C_{i,j,k}) : CTBs that C_{i,j,k} depends on when using parallel encoding method F.
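The definitions of SEC and REC suggest a simple recurrence over the dependence graph. The sketch below assumes, since equations (1) and (2) are not legible here, that the REC of a CTU is its own SEC plus the largest REC among the CTUs it depends on. The function name, the SEC values, and the dependence sets are all hypothetical and chosen only for illustration.

```python
from functools import lru_cache

def required_encoding_complexity(sec, deps):
    """Return {ctu: REC}, assuming REC(c) = SEC(c) + max REC over deps of c (0 if none)."""
    @lru_cache(maxsize=None)
    def rec(ctu):
        return sec[ctu] + max((rec(d) for d in deps.get(ctu, ())), default=0)
    return {ctu: rec(ctu) for ctu in sec}

# Hypothetical SEC values (arbitrary time units) for four CTUs keyed by (i, j, k).
sec = {(0, 0, 0): 3, (0, 0, 1): 1, (0, 1, 0): 2, (1, 0, 0): 4}

# Hypothetical dependence sets: each CTU lists the CTUs it must wait for.
deps = {
    (0, 0, 1): [(0, 0, 0)],
    (0, 1, 0): [(0, 0, 1)],
    (1, 0, 0): [(0, 0, 0), (0, 1, 0)],
}

rec = required_encoding_complexity(sec, deps)
total = max(rec.values())  # earliest time at which every CTU has finished
print(rec, total)
```

Under this assumed recurrence the total encoding time is bounded below by the longest dependence chain, not by the sum of all SEC values.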


Parallelism Evaluation of HEVC Encoding (3/3)

From (1) and (2), the parallelism of different parallel methods can be compared:

[Equations (3) and (4): not recoverable from the transcript.]

This criterion is easily proved from (1) and (2). Put simply, the less data dependence a method imposes on HEVC encoding, the higher the parallelism it can obtain.
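The "less dependence, higher parallelism" statement can be illustrated numerically. This is a minimal sketch, again assuming that a CTU finishes at its own SEC plus the latest finish time among its dependences; the two methods, their dependence sets, and the SEC values are hypothetical. The relaxed method's dependence sets are subsets of the strict method's.

```python
def finish_time(sec, deps):
    """Earliest time at which all CTUs are done under the assumed recurrence."""
    done = {}

    def rec(c):
        if c not in done:
            done[c] = sec[c] + max((rec(d) for d in deps.get(c, ())), default=0)
        return done[c]

    return max(rec(c) for c in sec)

# Hypothetical SEC values for four CTUs.
sec = {"a": 2, "b": 5, "c": 1, "d": 3}

# Strict method: c waits for both a and b.
strict = {"b": ["a"], "c": ["a", "b"], "d": ["c"]}
# Relaxed method: same graph, but c no longer waits for b.
relaxed = {"b": ["a"], "c": ["a"], "d": ["c"]}

t_strict = finish_time(sec, strict)
t_relaxed = finish_time(sec, relaxed)
print(t_strict, t_relaxed)
```

Removing a dependence can only shorten or preserve the longest chain, so the relaxed method never finishes later than the strict one on the same SEC values.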


Data Dependence Analysis of WPP and OWF Method (1/4)

For intra frames:

[Equation (5): not recoverable from the transcript.]
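Equation (5) is not legible here, so the sketch below falls back on the commonly described WPP rule as an assumption: a CTU waits for its left neighbour and for the top-right CTU of the line above (the top CTU at the last column). With a uniform SEC of one unit per CTU, it shows how the wavefront advances. Function names are hypothetical.

```python
def wpp_intra_deps(j, k, width):
    """CTUs of the same frame that CTU (line j, column k) waits for (assumed WPP rule)."""
    deps = []
    if k > 0:
        deps.append((j, k - 1))                      # left neighbour
    if j > 0:
        deps.append((j - 1, min(k + 1, width - 1)))  # top-right, or top at the last column
    return deps

def wpp_finish_times(height, width, sec=1):
    """Finish time of every CTU when each costs `sec` and enough threads are available."""
    finish = {}
    for j in range(height):          # raster order visits every dependence first
        for k in range(width):
            start = max((finish[d] for d in wpp_intra_deps(j, k, width)), default=0)
            finish[(j, k)] = start + sec
    return finish

finish = wpp_finish_times(height=3, width=4)
frame_time = max(finish.values())    # compare with 3 * 4 = 12 units for serial encoding
print(frame_time)
```

With uniform SEC each line starts two CTUs behind the line above it. When SEC varies between CTUs, a slow CTU stalls every line below it, which is the imbalance discussed on the following slides.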


Data Dependence Analysis of WPP and OWF Method (2/4)

• The SEC differs significantly from CTB to CTB.
• The variance of the SEC in inter frames is much greater than in intra frames.
• Under a given encoding algorithm, the unbalanced SEC is fixed, which makes it the bottleneck of intra-frame parallelism.


Data Dependence Analysis of WPP and OWF Method(3/4)


Data Dependence Analysis of WPP and OWF Method (4/4)

For inter frames:

[Equations (6) and (7): not recoverable from the transcript.]

• i, j, k : order of frame, line, and CTU.
• W : the width of a frame measured in CTBs.
• L_OWF : a positive integer parameter denoting the safe range.
• In [11], L_OWF is roughly set to one quarter of the frame height measured in CTBs, rounded up.
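The rule for L_OWF quoted from [11] is easy to work through for a concrete frame size. The sketch below assumes 64x64 CTBs and a 1080-pixel-high frame; both are illustrative choices. The helper `assumed_reference_line` is a hypothetical reading of the safe range (line j of the current frame waits for line j + L_OWF of the reference frame), not a reproduction of equations (6) and (7).

```python
import math

def ctb_rows(frame_height_px, ctb_size=64):
    """Number of CTB lines in a frame (ctb_size=64 is an assumed CTB size)."""
    return math.ceil(frame_height_px / ctb_size)

def owf_safe_range(height_ctb):
    """L_OWF as described for [11]: a quarter of the frame height in CTBs, rounded up."""
    return math.ceil(height_ctb / 4)

def assumed_reference_line(j, l_owf, height_ctb):
    """Assumption: line j waits for reference-frame line j + L_OWF, clamped to the last line."""
    return min(j + l_owf, height_ctb - 1)

rows = ctb_rows(1080)                                  # CTB lines in a 1080p frame
l_owf = owf_safe_range(rows)                           # safe range for that frame
first_wait = assumed_reference_line(0, l_owf, rows)    # what line 0 would wait for
print(rows, l_owf, first_wait)
```

Under these assumptions, a smaller safe range lets the next frame start earlier, which is why the size of this parameter matters for inter-frame parallelism.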


Proposed Method(1/5)

To best exploit the inter-frame parallelism, we designed a new Inter-frame Wavefront (IFW) coding order.


Proposed Method (2/5)

Dependence sets for intra frames and for inter frames:

[Equations (8) and (9): not recoverable from the transcript.]


Proposed Method(3/5)

A Frame Thread (FT) is assigned to each frame to exploit inter-frame parallelism.

A Wavefront Thread (WT) is assigned to each frame to exploit intra-frame parallelism.


Proposed Method (4/5)

If L_IFW is no greater than L_OWF, then for any i, j, k we can deduce that:

[Equations (12) and (13): not recoverable from the transcript.]


Proposed Method(5/5)

It is also confirmed that the unbalanced SEC is a bottleneck for intra-frame parallelism.

The parallelism of IFW increases significantly as the number of B-frames increases, because the reduced inter-frame dependence then contributes much more to the overall parallelism.


Experimental Results

Experiments follow the common test conditions and software reference configurations [12].

The hardware platform is a shared memory system with two AMD Opteron 6272 processors.


Experimental Results(2/)


Experimental Results

Frame Thread = 9, Wavefront Thread = 8


x265


Conclusion

A parallelism evaluation criterion and an IFW method are proposed to improve the encoding speed of HEVC.

The IFW method achieves significant speedup on various sequences, making it a promising technology for large-scale HEVC video applications.