Page 1:

Image Segmentation Approach For Realizing Zoomable Streaming HEVC Video

Zarna Patel

Department of Electrical Engineering

University of Texas at Arlington

Advisor: Dr. K. R. Rao

Page 2:

Outline

• Need for Video Compression

• Evolution of Video Coding Standards

• Context and Emerging Problem

• Introduction of HEVC

• Partial Decoding

• Tiled Encoding

• Simulation Results

• Conclusions and Future Work

• Acknowledgements

• Acronyms

• References

Page 3:

Need for Video Compression [34]

• Uncompressed video data are huge. In HDTV, the bit rate easily exceeds 1 Gbps -- a big problem for storage & network communications. For example,

• Typical HDTV video -- 1920 × 1080 pixels per frame, 30 frames per second, full color depth of 24 bits per pixel (8 bits each for red, green and blue)

• Total bit rate for transmitting this video -- about 1.5 Gb/sec (worked out below).

• Nowadays, more than 50% of the current network traffic is video.
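The HDTV example above multiplies out as follows (raw rate, before any chroma subsampling or compression):

```latex
\[
1920 \times 1080 \,\tfrac{\text{pixels}}{\text{frame}}
\times 24 \,\tfrac{\text{bits}}{\text{pixel}}
\times 30 \,\tfrac{\text{frames}}{\text{s}}
\approx 1.49 \times 10^{9} \,\tfrac{\text{bits}}{\text{s}}
\approx 1.5 \ \text{Gb/s}
\]
```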

Page 4:

Evolution of Video Coding Standards [3]

Page 5:

Context and Emerging Problem [11-13]

• High-resolution video is expected to become widely available in the near future. E.g., 4k, 8k, 10k

• Mobile devices – unable to display such videos well. Many of the captured details are lost or unclear because of the small screen sizes available on such devices.

• A solution to this problem – cropping and zooming.

• The use of cropping enables increased freedom in video editing and the ability to view other parts of the same content.

• Cropping & Zooming applications – Google Maps, Sports, Surveillance and Education

Page 6:

Context and Emerging Problem

Example of Cropping & Zooming [13]:

Page 7:

Context and Emerging Problem [11-13]

• One of the problems involved in displaying the region of interest (ROI) of a high-resolution video is the decoding load – this is a challenging issue on mobile devices because of their low-speed CPUs.

• Another problem -- the amount of data and the communication bandwidth required.

• Two different techniques are evaluated – to optimize bitrate, decoding calculation cost and bandwidth efficiency.

Page 8:

Context and Emerging Problem

Ultra-HD Zoom App [48]

• This research was demonstrated on a tablet at the CeBIT computer expo in Hanover, Germany, on March 16-20, 2015.

Page 9:

Introduction of HEVC

• HEVC is the most recent international standard for video compression; a successor to the H.264/MPEG-4 AVC (Advanced Video Coding) standard.

• It was developed by the Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T VCEG and ISO/IEC MPEG.

• It achieves about a 50% bitrate reduction over the H.264 standard at the same perceptual quality.

Block diagram of HEVC Encoder and Decoder [3]

Page 10:

Simplified block diagram of HEVC [25]

[Block diagram: Video Encoder -- Video Source → Partitioning → Prediction → Transform & Quantization → Entropy Encode → Compressed Syntax; Video Decoder -- Compressed Syntax → Entropy Decode → Inverse Transform & Quantization → Reconstruct → In-Loop Filters → Video Output]

Page 11:

Sampled Representation of Pictures [20]

(a) 4:2:0 (b) 4:2:2 (c) 4:4:4

Page 12:

Picture Partitioning [1]

CTU and CTB

• In HEVC, each picture is divided into Coding Tree Units (CTUs), also called Largest Coding Units (LCUs). Possible sizes -- 64×64, 32×32 and 16×16. All the CTUs in a video stream have the same size.

• CTUs are composed of one luma (Y) and two chroma (Cb and Cr) Coding Tree Blocks (CTBs).

• A CTB is generally too big a unit for deciding between intra and inter prediction.

Page 13:

Picture Partitioning [1]

• CTBs are further divided into Coding Blocks (CBs). The CB size can be as small as 8×8. A luma CB and the corresponding chroma CBs form a Coding Unit (CU). The decision about the prediction type (intra, inter) is made for each CU, so the CU is the basic unit of prediction in HEVC.

CTB split into CBs

Three CBs form a CU
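A minimal sketch of the recursive quadtree splitting described above; the split decision here is a hypothetical placeholder for the encoder's rate-distortion comparison, not the HM reference logic.

```python
# Illustrative quadtree split of a CTB into CBs (not the HM reference logic).
# split_is_better() is a hypothetical stand-in for a rate-distortion decision.

MIN_CB = 8   # smallest coding block size in HEVC
MAX_CB = 64  # CTB / largest coding unit size used in this sketch

def partition_ctb(x, y, size, split_is_better):
    """Return a list of (x, y, size) coding blocks covering one CTB."""
    if size > MIN_CB and split_is_better(x, y, size):
        half = size // 2
        blocks = []
        for dy in (0, half):
            for dx in (0, half):
                blocks += partition_ctb(x + dx, y + dy, half, split_is_better)
        return blocks
    return [(x, y, size)]  # keep this block as a single CB / CU

# Example: split everything larger than 32x32, keep the rest.
cbs = partition_ctb(0, 0, MAX_CB, lambda x, y, s: s > 32)
print(cbs)  # four 32x32 CBs covering the 64x64 CTB
```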

Page 14:

Prediction [1]

• The luma and chroma CBs can be further divided into PBs.

• PBs can be symmetric or asymmetric.

• PB sizes – from 64×64 to 4×4 samples.

• Two types: Intra Prediction and Inter Prediction

CB split into PB

Page 15:

Intra Prediction

• Spatial Redundancy or Intra-frame Correlation – pixels in an image are often similar to their adjacent neighbor pixels. This can be reduced using Intra-frame Prediction.

• 35 luma intra prediction modes, including DC & planar modes.

Spatial (intra-frame) correlation in a video sequence [25]

Intra prediction modes in HEVC [1]
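As a small illustration of one of these modes, DC prediction fills a block with the average of the reconstructed reference samples above and to the left. The sketch below is simplified: it omits HEVC's reference-sample substitution and the boundary smoothing applied to some block edges.

```python
import numpy as np

def dc_intra_prediction(top, left):
    """Predict an NxN block as the mean of the top and left reference samples.

    top:  N reconstructed samples from the row above the block
    left: N reconstructed samples from the column to the left
    (Simplified: real HEVC also filters some block edges after DC prediction.)
    """
    n = len(top)
    dc = int(round((np.sum(top) + np.sum(left)) / (2 * n)))
    return np.full((n, n), dc, dtype=np.int32)

pred = dc_intra_prediction(top=[100, 102, 101, 99], left=[98, 100, 103, 101])
print(pred)
```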

Page 16:

Inter Prediction

• Temporal Redundancy or Inter-frame Correlation – successive frames in time order are usually highly correlated. Parts of the scene are repeated in time with little or no change.

• Inter-frame Prediction is used to code only the changes in the video content, rather than coding each entire picture repeatedly.

Temporal (inter-frame) correlation in a video sequence [25]

Page 17:

Inter Prediction [44-47]

1. Block based Motion Estimation and Compensation:

To obtain the motion vector and the motion compensation, the following procedure is carried out for each MxN block in the current frame, where M and N are the block height and width respectively:

Motion Estimation [25]

1st step, Motion Estimation: Each possible MxN block from a previously encoded reference frame is compared with the current MxN block in terms of a certain matching criterion (e.g., residual energy or the sum of absolute differences). The block at the displacement that minimizes the matching criterion is chosen as the best match. This process of finding the best match is known as motion estimation. The spatial displacement between the position of the candidate block (the block extracted from the reference frame) and the current block is the motion vector (MV).

Page 18:

Inter Prediction [44-47]

2nd step, Motion Compensation: The chosen candidate block is subtracted from the current block to form a residual block.

3rd step: The residual block is encoded and transmitted together with the motion vector.

On the decoder side, the received motion vector is used to recreate the candidate region, which is added to the decoded residual block to reconstruct a version of the original block.
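A minimal full-search sketch of the block-matching procedure described above, using SAD as the matching criterion. The HM encoder's actual motion search (fractional-pel interpolation, fast search patterns) is far more involved, so this is only illustrative.

```python
import numpy as np

def full_search_me(cur_block, ref_frame, bx, by, search_range=8):
    """Find the motion vector minimizing SAD for an MxN block.

    cur_block: MxN block from the current frame, located at (bx, by)
    ref_frame: previously reconstructed reference frame
    Returns (mvx, mvy) and the best-matching reference block.
    """
    m, n = cur_block.shape
    h, w = ref_frame.shape
    best_sad, best_mv, best_blk = np.inf, (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + n > w or y + m > h:
                continue  # candidate block falls outside the reference frame
            cand = ref_frame[y:y + m, x:x + n]
            sad = np.sum(np.abs(cur_block.astype(np.int32) - cand.astype(np.int32)))
            if sad < best_sad:
                best_sad, best_mv, best_blk = sad, (dx, dy), cand
    return best_mv, best_blk

# Motion compensation: the residual is the current block minus the best match,
# e.g. residual = cur_block.astype(np.int32) - best_blk.astype(np.int32)
```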

Page 19:

Transform, Scaling, Quantization and Entropy Coding [1]

• The residual signal of the intra or inter prediction, which is the difference between the original block and its prediction, is transformed using a block transform based on the Discrete Cosine Transform (DCT) or Discrete Sine Transform (DST).

• By means of transform, the residual signal is converted to the frequency domain in order to decorrelate and compact the information. HEVC supports four transform sizes: 4x4, 8x8, 16x16 and 32x32.

• After obtaining the transform coefficients, they are then scaled and quantized.

• Once the quantized transform coefficients are obtained, they are combined with prediction information such as prediction modes, motion vectors, partitioning information and other header data, and then coded in order to obtain an HEVC bit-stream. All of these elements are coded using Context Adaptive Binary Arithmetic Coding (CABAC).

CB split into TB
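HEVC itself uses finite-precision integer transforms derived from the DCT/DST and QP-driven scaling; the sketch below instead uses a floating-point orthonormal DCT-II and a plain uniform quantizer, purely to illustrate the transform-then-quantize idea on a 4×4 residual block.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def transform_and_quantize(residual, qstep):
    """2-D DCT of a square residual block followed by uniform quantization."""
    c = dct_matrix(residual.shape[0])
    coeffs = c @ residual @ c.T          # forward 2-D transform
    return np.round(coeffs / qstep)      # uniform quantization -> levels

def dequantize_and_inverse(levels, qstep):
    c = dct_matrix(levels.shape[0])
    coeffs = levels * qstep              # inverse quantization (scaling)
    return c.T @ coeffs @ c              # inverse 2-D transform

residual = np.array([[5, 3, 1, 0],
                     [4, 2, 0, -1],
                     [2, 1, -1, -2],
                     [1, 0, -2, -3]], dtype=float)
levels = transform_and_quantize(residual, qstep=2.0)
recon = dequantize_and_inverse(levels, qstep=2.0)
```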

Page 20:

In-Loop Filters [1]

• In HEVC, the two loop filters are deblocking filter (DBF) followed by a sample adaptive offset (SAO). The DBF is intended to reduce the blocking artefacts around the block boundaries that may be introduced by the lossy encoding process.

• After deblocking is performed, a second filter optionally processes the picture. The SAO classifies reconstructed pixels into categories and reduces the distortion, improving the appearance of smooth regions and edges of objects, by adding an offset to pixels of each category in the current region. The SAO filter is a non-linear filter that makes use of look-up tables transmitted by the encoder.
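As a rough, simplified illustration of the band-offset flavour of SAO, the sketch below classifies reconstructed samples into equal-width intensity bands and adds an encoder-supplied offset to a few selected bands. The function and parameter names are hypothetical, and the exact HEVC signaling (32 bands, four consecutive signaled offsets, plus the separate edge-offset classes) is not reproduced.

```python
import numpy as np

def sao_band_offset(recon, offsets, start_band, num_bands=32, bit_depth=8):
    """Add a per-band offset to reconstructed samples (simplified band-offset SAO).

    offsets:    offsets for consecutive bands starting at start_band
    start_band: index of the first band that receives an offset
    """
    band_width = (1 << bit_depth) // num_bands      # 8 for 8-bit samples, 32 bands
    band_idx = recon // band_width                  # band of each sample
    out = recon.astype(np.int32)
    for i, off in enumerate(offsets):
        out[band_idx == start_band + i] += off
    return np.clip(out, 0, (1 << bit_depth) - 1)

recon = np.array([[16, 40, 70, 200]], dtype=np.int32)
print(sao_band_offset(recon, offsets=[2, -1, 0, 3], start_band=2))
```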

Page 21:

HEVC- Slices, Tiles and Wavefronts [1][21]

Subdivision of a picture into (a) slices and (b) tiles; (c) wavefront parallel processing


Page 22:

Partial Decoding [11][12]

Buffered area decoding

• When dealing with a high-resolution video on a mobile phone, it is necessary to reduce the decoding calculation cost.

• This method decides the decoded partial area (DPA) by extending N number of luma samples around the ROI in each direction.

• Pros and cons: if the reference ranges of the ROI change rapidly, the video quality deteriorates; deterioration could be prevented by using a very large buffer, but this in turn would decrease the benefit of the reduced calculation cost.
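A minimal sketch of how the decoded partial area could be derived from the ROI, assuming the ROI is given as (x, y, width, height) in luma samples and the buffer is N samples in each direction, clamped to the frame boundaries. The function name and coordinate convention are illustrative, not taken from the thesis software.

```python
def decoded_partial_area(roi, buffer_n, frame_w, frame_h):
    """Extend the ROI by buffer_n luma samples on every side, clamped to the frame.

    roi: (x, y, w, h) of the region of interest in luma samples
    Returns (x, y, w, h) of the decoded partial area (DPA).
    """
    x, y, w, h = roi
    x0 = max(0, x - buffer_n)
    y0 = max(0, y - buffer_n)
    x1 = min(frame_w, x + w + buffer_n)
    y1 = min(frame_h, y + h + buffer_n)
    return (x0, y0, x1 - x0, y1 - y0)

# Example: a 480x280 ROI at (320, 200) in a 1280x720 frame, 32-sample buffer.
print(decoded_partial_area((320, 200, 480, 280), 32, 1280, 720))
# -> (288, 168, 544, 344)
```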

Page 23:

Tiled Encoding [11][13][17]

• This method was evaluated in terms of bandwidth efficiency and storage requirements for ROI-based streaming.

• Tiled encoding partitions the video frames into a grid of tiles and encodes each tile as an independently decodable stream.

• For convenience, the tiles have a 1:1 aspect ratio in order to match the CTU size of HEVC.

• These streams are indexed by the spatial region they cover. For a given ROI, a minimal set of tiled streams covering the ROI is streamed by looking up the index. New tiles may be included into the stream or tiles may be dropped when the ROI changes.

Tiled streams

Page 24:

Tiled Encoding

• Here is a brief explanation of how the requested ROI tiles are obtained (a code sketch of these steps follows the example below):

• Consider an original video (uncompressed video in YUV format) with a frame size of 3x3 units.

• Now, create tiled streams of size 1x1. That means the total number of tiles is 3x3 / 1x1 = 9 tiles.

• From the uncompressed YUV video, crop 9 smaller uncompressed YUV sequences of size 1x1, at the coordinates (0,0), (0,1), … (2,2).

• Encode each of the 9 tiled YUV sequences of size 1x1 from YUV into a compressed tiled stream. After this step, 9 compressed tiled streams of size 1x1 are obtained in total. Each of these tiled streams is independently decodable.

• Now, request an ROI X of size 2x2 at coordinate (0,0) —> the server needs to send 2x2 / 1x1 = 4 tiled streams, at coordinates (0,0), (0,1), (1,0), (1,1).

• Since the tiled streams of size 1x1 at (0,0), (0,1), (1,0), (1,1) have already been obtained, the compressed file sizes of those streams are simply summed up; the sum is the amount of data to send for ROI X – the average data rate or transported data size.

Example of Tiled Encoding
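The steps above map directly onto a short sketch. The tile size, ROI and per-tile compressed sizes below are hypothetical example values in the spirit of this slide, not numbers from the thesis.

```python
def tiles_for_roi(roi_x, roi_y, roi_w, roi_h, tile_size):
    """Return the (row, col) indices of all tiles overlapping the ROI."""
    first_col, first_row = roi_x // tile_size, roi_y // tile_size
    last_col = (roi_x + roi_w - 1) // tile_size
    last_row = (roi_y + roi_h - 1) // tile_size
    return [(r, c) for r in range(first_row, last_row + 1)
                   for c in range(first_col, last_col + 1)]

# Example from the slide: 3x3 frame, 1x1 tiles, 2x2 ROI at (0, 0).
tile_size = 1
compressed_size = {(r, c): 1000 + 10 * r + c        # hypothetical per-tile file sizes
                   for r in range(3) for c in range(3)}
needed = tiles_for_roi(0, 0, 2, 2, tile_size)        # -> [(0,0), (0,1), (1,0), (1,1)]
transported = sum(compressed_size[t] for t in needed)  # transported data size for the ROI
print(needed, transported)
```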

Page 25:

Tiled Encoding [13]

Slice Structure Dependency on Tiled Encoding:

• This was introduced to further improve bandwidth efficiency in tiled encoding.

• Slices can be encoded and decoded independently.

• In tiled encoding, any CTU on the border of the ROI is sent in its entirety, which can lead to the transmission of redundant bits to the clients.

• Thus, the slice concept was applied with the slice size specified in bytes (a slice could be defined by a number of CTUs, bytes or tiles).

Example of Tiled Encoding

Page 26:

Tiled Encoding

• For the tiled encoding method: given the ROI size, the ROI position, and the entire video frame size (W*H), the set of tiles which overlap with this ROI and are necessary to send/decode it can be easily determined.

• The number of slices in these tiles is the same as the number of slices in each of these compressed tiled streams.

• There are different ways to obtain this information.

• Each slice always starts with a slice header. One option is to modify the decoder: while the decoder is parsing the bitstream, look for slice headers in each video frame and increase a counter accordingly (a stand-alone sketch of this counting is shown below).
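One possible stand-alone way to approximate this count, instead of modifying the decoder, is to scan the Annex B byte stream directly. The sketch below counts VCL (slice-segment) NAL units per picture by looking for start codes, reading the 2-byte NAL unit header, and checking the first_slice_segment_in_pic_flag bit. It ignores emulation-prevention bytes and assumes a conformant single-layer HEVC stream, so treat it as illustrative rather than a complete parser; the file name is hypothetical.

```python
def count_slices_per_picture(bitstream):
    """Count HEVC slice segments per picture in an Annex B byte stream (sketch).

    Assumptions: start codes 00 00 01 / 00 00 00 01, VCL NAL unit types 0..31,
    first_slice_segment_in_pic_flag is the first bit after the 2-byte NAL header.
    Emulation-prevention bytes (00 00 03) are not removed here.
    """
    counts, current, i = [], 0, 0
    while i + 6 <= len(bitstream):
        if bitstream[i:i + 3] == b"\x00\x00\x01":
            nal_type = (bitstream[i + 3] >> 1) & 0x3F
            if nal_type < 32:                        # VCL NAL unit = slice segment
                first_in_pic = (bitstream[i + 5] >> 7) & 1
                if first_in_pic and current:
                    counts.append(current)           # previous picture finished
                    current = 0
                current += 1
            i += 3
        else:
            i += 1
    if current:
        counts.append(current)
    return counts

# with open("tiled_stream.bin", "rb") as f:          # hypothetical file name
#     print(count_slices_per_picture(f.read()))
```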

Page 27:

Simulation Results

• In this experiment, test sequences are encoded for a combination of three different tile sizes chosen from {16*16 LCU, 32*32 LCU and 64*64 LCU} and slice sizes (in bytes) chosen from {64, 512, 1460}. For partial decoding, three buffer sizes (in luma samples) are evaluated: 16, 32 and 64.

• The QP value (32) and the video crop size (480×280) are fixed, but the cropping position is different in each sequence.

• The random access profile was used for coding with a GOP (group of pictures) size of 8, and 25 frames were encoded for each sequence.

• The results compare PSNR, file size, decoding time and transported data size. The transported data size (average data rate) is computed as the number of bits that would be transferred for a specific ROI dimension (a minimal PSNR computation is sketched below).
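For reference, PSNR follows the usual definition for 8-bit video; reported values typically average over frames and weight the Y, Cb and Cr components, which is simplified away in the single-component sketch below.

```python
import numpy as np

def psnr(original, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio between two 8-bit frames (or components)."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")            # identical frames
    return 10.0 * np.log10(max_val ** 2 / mse)

orig = np.random.randint(0, 256, (280, 480), dtype=np.uint8)   # e.g. a cropped luma frame
recon = np.clip(orig.astype(np.int32) + np.random.randint(-2, 3, orig.shape), 0, 255)
print(round(psnr(orig, recon), 2))
```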

No.  Sequence Name    Resolution  Type  No. of frames
1.   park_joy         1280×720    HD    25
2.   shields          1280×720    HD    25
3.   KristenAndSara   1280×720    HD    25

Test Sequences Used [7][33]

Page 28:

Simulation Results [7][33]

1st frame of original sequence (Resolution: 1280×720)

1st frame of cropped sequence (Resolution: 480×280)

3rd frame of original sequence (Resolution: 1280×720)

3rd frame of cropped sequence (Resolution: 480×280)

1st frame of original sequence (Resolution: 1280×720)

1st frame of cropped sequence (Resolution: 480×280)

Page 29:

Simulation Results

[Charts: Encoded File Size (bytes), Encoded Files' PSNR (dB) and Transported Data Size (bytes) for the park_joy, shields and KristenAndSara sequences, for tile sizes 16*16, 32*32 and 64*64]

Page 30:

Simulation Results

[Charts: Decoding time (sec.) of the park_joy, shields and KristenAndSara sequences for tile sizes 16*16, 32*32 and 64*64 (tiled encoding), and decoding time of each sequence for 16, 32 and 64 buffered luma samples (buffered area decoding)]

Page 31:

Simulation Results

[Charts: Encoded file size (bytes) for three different slice sizes (64, 512 and 1460 bytes) of the park_joy, shields and KristenAndSara sequences, for tile sizes 16*16, 32*32 and 64*64]

Page 32:

Simulation Results

[Charts: Transported data size (bytes) for 64-byte and 1460-byte slices (tile size 16*16) for the park_joy, shields and KristenAndSara sequences, compared with the previous transported data size results]

Page 33:

Conclusions

• In this thesis, two methods for ROI-based video transmission to support cropping and zooming were implemented and evaluated.

• The first, tiled encoding, divides the frames of a raw video stream into tiles and encodes each tile using a standard encoder. A requested ROI is served by sending the tile streams that overlap with the ROI. The results show that the bandwidth efficiency of the tiled streaming system is best when the tile size is 16*16, despite a slight increase in encoded file size.

• Second, partial decoding using buffered area decoding based on the DPA was performed to reduce the decoding calculation cost. The results demonstrate that a buffer of 32 luma samples around the ROI gives a 40-55% reduction in the time needed to deliver the requested ROI.

• Finally, the slice structure dependency of tiled encoding was evaluated to further improve bandwidth efficiency, highlighting how the slice structure influences the bandwidth efficiency of ROI transmission. The results show that a larger slice size significantly reduces the average data rate; thus, in terms of bandwidth efficiency, the 1460-byte slice structure is better than the 64-byte slice for ROI-based decoding.

Page 34:

Future Work

• Among the many possible future directions for this research, the next step is to study the motion vector dependency of tiled encoding, which could lead to better bandwidth efficiency.

• In tiled encoding, any CTU on the border of the ROI is sent in its entirety, which can lead to the transmission of redundant bits to the clients – bits that do not contribute to decoding the pixels within the ROI at all. To overcome this issue, the monolithic stream method can be used; this method transmits only the bits that are required for decoding the ROI.

Page 35:

Acknowledgements

• Dr. K. R. Rao – for being my mentor

• Dr. W. Dillon & Dr. J. Bredow – committee members

• Khiem NGO – Ph.D Student at USC

• Karsten Suehring -- Project Manager at Fraunhofer HHI

• Tuan Ho and Srikanth Vasireddy -- MPL lab mates

• My family and friends

Page 36:

Acronyms

AVC: Advanced Video Coding

CABAC: Context Adaptive Binary Arithmetic Coding

CB: Coding Block

CPU: Central Processing Unit

CTB: Coding Tree Block

CTU: Coding Tree Unit

CU: Coding Unit

DBF: Deblocking Filter

DCT: Discrete Cosine Transform

DPA: Decoded Partial Area

DST: Discrete Sine Transform

GOP: Group of Pictures

HEVC: High Efficiency Video Coding

IEC: International Electrotechnical Commission

ISO: International Organization for Standardization

ITU: International Telecommunication Union

LCU: Largest Coding Unit

LTE: Long Term Evolution

MC: Motion compensation

ME: Motion Estimation

Page 37:

Acronyms

MPEG: Moving Picture Experts Group

MSE: Mean Square Error

MV: Motion Vector

NAL: Network Abstraction Layer

PB: Prediction Block

PSNR: Peak Signal to Noise Ratio

PU: Prediction Unit

QP: Quantization Parameter

ROI: Region of Interest

SAD: Sum of Absolute Difference

SPS: Sequence Parameter Set

SVC: Scalable video coding

TB: Transform Block

TU: Transform Unit

UHD: Ultra High Definition

URQ: Uniform Reconstruction Quantization

VCEG: Video Coding Experts Group

VCL: Video Coding Layer

VGA: Video Graphics Array

WPP: Wavefront Parallel Processing

Page 38:

References

[1] G. J. Sullivan et al, "Overview of the High Efficiency Video Coding (HEVC) Standard", IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.

[2] G.J. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1001-1016, Dec. 2013.

[3] K.R. Rao, D. N. Kim and J. J. Hwang, “Video Coding standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.

[4] M. Wien, “High Efficiency Video Coding: Coding Tools and Specification”, Springer, 2014.

[5] ITU-T: "H.265 : High efficiency video coding", April 2013.

http://www.itu.int/rec/T-REC-H.265-201304-I/en

[6] Special issues on HEVC:

1. Special issue on emerging research and standards in next generation video coding, IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, pp. 1646-1909, Dec. 2012.

2. IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS)

Special Issue on Screen Content Video Coding and Applications: Final papers are due July 2016.

3. IEEE Journal of Selected Topics in Signal Processing, vol. 7, pp. 931-1151, Dec. 2013.

[7] Test sequences:

http://basakoztas.net/hevc-test-sequences/

[8] HEVC Reference Software HM15.0.

https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM-15.0-dev/

[9] Discussion on “Multi-Frame Motion-Compensated Prediction” by Fraunhofer HHI

http://www.hhi.fraunhofer.de/en/fields-of-competence/image-processing/research-groups/image-communication/video-coding/multi-frame-motion-compensated-prediction.html

[10] NTT DOCOMO Technical Journal, vol. 14, no. 4 https://www.nttdocomo.co.jp/english/binary/pdf/corporate/technology/rd/technical_journal/bn/vol14_4/vol14_4_043en.pdf

Page 39:

References

[11] Y. Umezaki and S. Goto, “Image Segmentation Approach for Realizing Zoomable Streaming HEVC Video”, 9th International Conference on Information, Communication and Signal Processing (ICICS), pp. 1-4, Dec. 2013.

[12] C. Liu et al, “Encoder-unconstrained user interactive partial decoding scheme”, IEICE Trans. on Fundamentals of Electronics, Communications and Computer Sciences, vol. E95-A, no. 8, pp. 1288-1296, Aug. 2012.

[13] N. Quang et al, “Supporting zoomable video streams with dynamic region-of-interest cropping”, Proceedings of the 18th ACM International Conference on Multimedia, pp. 259-270, Feb. 2010.

[14] A. Mavlankar et al, “Region-of-interest prediction for interactively streaming regions of high resolution video”, Proceedings International Packet Video Workshop, Nov. 2007.

[15] K. B. Shimoga, “Region-of-interest based video image transcoding for heterogeneous client displays”, Proceedings International Packet Video Workshop, Apr. 2002.

[16] X. Fan et al, “Looking into video frames on small displays”, Proceedings of the 11th ACM International Conference on Multimedia, pp. 247-250, Nov. 2003.

[17] W. Feng et al, “Supporting region-of-interest cropping through constrained compression”, Proceedings of the 16th ACM International Conference on Multimedia, pp. 745-748, Oct. 2008.

[18] A. Saxena et al, “Jointly optimal intra prediction and adaptive primary transform”, JCTVC-C108, Guangzhou, CN, Oct. 2010.

To access it, go to this link:

http://phenix.int-evry.fr/jct/doc_end_user/current_meeting.php and then give number JCTVC-C108 in Number field or type title of this document.

[19] I. E. G. Richardson, “H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia”, Wiley, 2003.

[20] T. Wiegand et al, “WD2: Working Draft 2 of High-Efficiency Video Coding”, JCT-VC document, JCTVC-D503, Daegu, KR, Jan. 2011.

To access it, go to this link:

http://phenix.int-evry.fr/jct/doc_end_user/current_meeting.php and then give number JCTVC-D503 in Number field or type title of this document.

[21] G.J. Sullivan et al, “High efficiency video coding: the next frontier in video compression [Standards in a Nutshell]”, IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 152-158, Jan. 2013.

[22] M.T. Pourazad et al, "HEVC: The New Gold Standard for Video Compression: How Does HEVC Compare with H.264/AVC", IEEE Consumer Electronics Magazine, vol. 1, no. 3, pp.36-46, July 2012.

Page 40:

References

[23] J. Chen et al, “Planar intra prediction improvement”, JCT-VC document, JCTVC-F483, Torino, Italy, July 2011.

To access it, go to this link:

http://phenix.int-evry.fr/jct/doc_end_user/current_meeting.php and then give number JCTVC-F483 in Number field or type title of this document.

[24] G. J. Sullivan and T. Wiegand, “Rate Distortion Optimization for Video Compression”, IEEE Signal Processing Magazine, vol. 15, no. 6, pp. 74-90, Nov. 1998.

[25] I. E. Richardson, “The H.264 Advanced Video Compression Standard”, Wiley, 2010.

[26] J. Sole et al, “Transform Coefficient Coding in HEVC”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1765-1777, Dec. 2012.

[27] D. Marpe et al, “Context-Based Adaptive Binary Arithmetic Coding in the H.264/AVC Video Compression Standard” , IEEE Trans. on Circuits and System for Video Technology, vol. 13, no. 7, pp. 620-636, July 2003.

[28] V. Sze et al, “High Throughput CABAC Entropy Coding in HEVC “, IEEE Trans. on Circuits and System for Video Technology, vol. 22, no. 12, pp. 1778-1791, Dec. 2012.

[29] V. Sze, M. Budagavi and G. J. Sullivan, “High Efficiency Video Coding (HEVC): Algorithms and Architectures”, Springer, 2014.

[30] M. Budagavi and V. Sze, “Design and Implementation of Next Generation Video Coding Systems”, IEEE International Symposium on Circuits and Systems Tutorial, Melbourne, Australia, June 2014:

http://www.rle.mit.edu/eems/wp-content/uploads/2014/06/H.265-HEVC-Tutorial-2014-ISCAS.pdf

[31] M. Budagavi, “Design and Implementation of Next Generation Video Coding Systems HEVC/H.265 Tutorial”, seminar presented in the EE Department, UTA, 21st Nov. 2014. http://iscas2014.org/

[32] HM15.0 Software Manual:

https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM-15.0-dev/doc/software-manual.pdf

[33] Test sequences:

https://media.xiph.org/video/derf/

[34] K. R. Rao and P. Yip, “Discrete Cosine Transform: Algorithms, Advantages, Applications”, Academic Press, 1990.

[35] Z. Shi, X. Sun and F. Wu, “Spatially scalable video coding for HEVC”, IEEE Trans. on Circuits and System for Video Technology, vol. 22, no. 12, pp.1813-1826, Dec 2012.

Page 41:

References

[36] D.-K. Kwon, M. Budagavi and M. Zhou, “Multi-loop scalable video codec based on high efficiency video coding (HEVC)”, IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 1749-1753, June 2013.

[37] Y. Ye and P. Andrivon, “The scalable extensions of HEVC for ultra-high definition video delivery”, IEEE Multimedia magazine, vol. 21, no. 3, pp. 58-64, July 2014.

[38] H. Schwarz, “Extension of high efficiency video coding (HEVC) for multiview video and depth data”, IEEE International Conference on Image Processing, pp. 205-208, Oct. 2012.

[39] J. Stankowski et al, “Extensions of the HEVC technology for efficient multiview video coding", IEEE International Conference on Image Processing, pp. 225-228, Oct. 2012.

[40] M. Budagavi and D.-Y. Kwon, “Intra motion compensation and entropy coding improvements for HEVC screen content coding”, IEEE Picture Coding Symposium, pp. 365-368, Dec. 2013.

[41] M. Naccari et al, “Improving inter prediction in HEVC with residual DPCM for lossless screen content coding”, IEEE Picture Coding Symposium, pp. 361-364, Dec. 2013.

[42] I.E. Richardson, “Coding Video: A Practical guide to HEVC and beyond”, Wiley, May 2015.

[43] x265 HEVC Video Encoder:

http://x265.org/

[44] X. Jing and L.-P. Chau, “An efficient three-step search algorithm for block motion estimation”, IEEE Trans. on Multimedia, vol. 6, no. 3, pp. 435-438, June 2004.

[45] H.A. Choudhury and M. Saikia, “Survey on block matching algorithms for motion estimation”, IEEE International Conference on Communications and Signal Processing, pp. 36-40, Apr. 2014.

[46] S.M. Arora and N. Rajpal, “Survey of fast block motion estimation algorithms”, IEEE International Conference on Advances in Computing, Communications and Informatics, pp. 2022-2026, Sept. 2014.

[47] M.J. Jakubowski and G. Pastuszak, “Block-based motion estimation algorithms – a survey”, Opto-Electronics Review, vol. 21, pp. 86-102, March 2013.

[48] Fraunhofer HHI – The Institute:

http://www.hhi.fraunhofer.de/start-page.html

Page 42:

Questions and Comments

Thank you very much.