MULTIPLEXING AND DEMULTIPLEXING HEVC VIDEO BITSTREAM WITH AAC AUDIO BITSTREAM, ACHIEVING LIP SYNCHRONIZATION

BY MRUDULA WARRIER

UNDER THE GUIDANCE OF DR. K. R. RAO

Table of contents

• Use of multiplexing

• Need for compression

• History of compression

• Video and audio coding standard

• Multiplexing process

• Demultiplexing process

• Lip sync

• Test conditions, results, conclusions and future work

Use of Multiplexing

• Video, audio and other data contain many frames and are therefore very large.

• Large size requires large bandwidth.

• Compression standards such as HEVC for video and AAC for audio are chosen.

• Multiplexing is needed to optimize the use of expensive resources.

NEED FOR COMPRESSION

• HDTV: resolution 1920x1080, 8 bits for each of the 3 color components.

• Requires a lot of storage space.

• Heavy for network communications.

• Compression reduces the large bandwidth requirement for transmission at similar quality.
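To make the bandwidth argument concrete, the following is a minimal sketch that computes the uncompressed bit rate for the HDTV figures above. The frame rate of 30 fps and the function name are assumptions for illustration; the slide states only the resolution and bit depth.

```python
def raw_video_bitrate(width, height, bits_per_component, components, fps):
    """Return the uncompressed video bit rate in bits per second."""
    bits_per_frame = width * height * bits_per_component * components
    return bits_per_frame * fps


# Resolution and bit depth are from the slide; 30 fps is an assumed frame rate.
bits_per_second = raw_video_bitrate(1920, 1080, 8, 3, fps=30)
print(f"{bits_per_second / 1e9:.2f} Gbit/s uncompressed")
```

Under these assumptions the raw stream needs roughly 1.5 Gbit/s, which is why compression comes before multiplexing.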

Applications

History of Compression Standards

Fig 1: History of compression standards [3]

High Efficiency Video Coding(HEVC)

• Newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group, working together in a partnership known as the Joint Collaborative Team on Video Coding (JCT-VC).

• About 50% bit-rate reduction for equal perceptual video quality compared to H.264/AVC. [1]

Fig 2: Typical HEVC video encoder (with decoder modeling elements shaded in light gray). [1]

Fig 3: NAL Unit Header [7]

• 2-byte header

• VCL NAL units contain coded pictures.

• Non-VCL NAL units contain other associated data.

Table 1: NAL UNIT Types, meanings and type classes. [1]
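The 2-byte NAL unit header can be unpacked as in the sketch below. The field widths (1, 6, 6 and 3 bits) and the VCL range 0 to 31 follow the HEVC specification; the class and function names are hypothetical and chosen only for this illustration.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    forbidden_zero_bit: int
    nal_unit_type: int
    nuh_layer_id: int
    temporal_id: int


def parse_nal_header(data: bytes) -> NalHeader:
    """Unpack the 2-byte HEVC NAL unit header (start code already removed)."""
    if len(data) < 2:
        raise ValueError("HEVC NAL unit header is 2 bytes long")
    value = (data[0] << 8) | data[1]
    return NalHeader(
        forbidden_zero_bit=value >> 15,
        nal_unit_type=(value >> 9) & 0x3F,
        nuh_layer_id=(value >> 3) & 0x3F,
        temporal_id=(value & 0x07) - 1,  # stored as nuh_temporal_id_plus1
    )


def is_vcl(header: NalHeader) -> bool:
    """NAL unit types 0..31 are VCL; 32..63 are non-VCL."""
    return header.nal_unit_type < 32


def is_idr(header: NalHeader) -> bool:
    """IDR_W_RADL (19) and IDR_N_LP (20) mark IDR pictures."""
    return header.nal_unit_type in (19, 20)
```

For example, the header bytes 0x26 0x01 decode to NAL unit type 19 (an IDR picture), while 0x40 0x01 decode to type 32 (a video parameter set, non-VCL).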

NAL BITSTREAM STRUCTURE

CODING TREE UNIT

Fig 4: Subdivision of a CTB into CBs [and transform blocks (TBs)]. Solid lines indicate CB boundaries and dotted lines indicate TB boundaries. (a) CTB with its partitioning. (b) Corresponding quadtree. [1]

Fig 5: Picture showing CTU division and motion vector modes.

Advanced Audio Coding(AAC)

• AAC supports inclusion of 48 full-bandwidth (up to 96 kHz) audio channels [2]

• AAC is the default or standard audio format for YouTube, iPhone, iPod, iPad, Nintendo DSi, Nintendo 3DS, iTunes, DivX Plus Web Player and PlayStation 3.

• AAC defines three profiles: Low-Complexity profile (AAC-LC / LC-AAC), Main profile (AAC Main) and Scalable Sampling Rate profile (AAC-SSR). [2]


AAC encoder

Figure 6 : Block diagram of AAC encoder [2]

AAC Bitstream format

• Bitstream formats:

• ADIF (Audio Data Interchange Format): only one header at the beginning of the file, followed by raw data blocks.

• ADTS (Audio Data Transport Stream): a separate header for each frame, enabling decoding from any frame. [2]

ADTS format

Table 2: ADTS format information [2]
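Because every ADTS frame carries its own header, a demultiplexer can locate frame boundaries by reading the sync word and the frame length. The sketch below reads a few fields of the fixed 7-byte header (no CRC); the function name and the returned dictionary keys are hypothetical, while the bit layout follows the ADTS definition.

```python
ADTS_SAMPLING_FREQUENCIES = [
    96000, 88200, 64000, 48000, 44100, 32000,
    24000, 22050, 16000, 12000, 11025, 8000, 7350,
]


def parse_adts_header(data: bytes) -> dict:
    """Read selected fields from a 7-byte ADTS header."""
    if len(data) < 7:
        raise ValueError("ADTS header needs at least 7 bytes")
    # The 12-bit sync word is all ones.
    if data[0] != 0xFF or (data[1] & 0xF0) != 0xF0:
        raise ValueError("ADTS sync word 0xFFF not found")
    freq_index = (data[2] >> 2) & 0x0F
    if freq_index >= len(ADTS_SAMPLING_FREQUENCIES):
        raise ValueError("reserved sampling frequency index")
    return {
        "profile": data[2] >> 6,  # 1 corresponds to AAC-LC
        "sampling_frequency": ADTS_SAMPLING_FREQUENCIES[freq_index],
        "channel_configuration": ((data[2] & 0x01) << 2) | (data[3] >> 6),
        # 13-bit frame length, header included, spread over three bytes.
        "frame_length": ((data[3] & 0x03) << 11) | (data[4] << 3) | (data[5] >> 5),
    }
```

A stream can then be walked frame by frame by advancing `frame_length` bytes from each header.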

MPEG-2 PART 1 systems layer

• Elementary stream (ES)

• 2 layers of packetization

• PES and TS

Fig 7: MPEG -2 layers [28]

PES

• 3-byte start code, followed by a 1-byte stream ID and a 2-byte length field.

• The PES header distinguishes between audio and video PES packets.

• Stream ID 11000000: audio

• Stream ID 11100000: video

• A time stamp field contains the playback time information.

• The size of the PES packets is variable.

Fig 8 : Packetized elementary stream [15]
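The PES header fields listed above can be sketched as follows. The start code, stream IDs and 2-byte length come from the slide; storing the time stamp as a 4-byte frame number and letting the length field cover time stamp plus payload are assumptions made only for this illustration.

```python
import struct

START_CODE = b"\x00\x00\x01"
AUDIO_STREAM_ID = 0b11000000
VIDEO_STREAM_ID = 0b11100000


def build_pes_packet(stream_id: int, frame_number: int, payload: bytes) -> bytes:
    """Wrap one coded frame in a simplified PES packet."""
    # Assumption: time stamp = 4-byte frame number, counted in the length field.
    body = struct.pack(">I", frame_number) + payload
    if len(body) > 0xFFFF:
        raise ValueError("body does not fit the 2-byte length field")
    return START_CODE + struct.pack(">BH", stream_id, len(body)) + body


def parse_pes_header(packet: bytes):
    """Return (kind, length, frame_number) from a simplified PES packet."""
    if packet[:3] != START_CODE:
        raise ValueError("PES start code not found")
    stream_id, length = struct.unpack(">BH", packet[3:6])
    (frame_number,) = struct.unpack(">I", packet[6:10])
    kind = {AUDIO_STREAM_ID: "audio", VIDEO_STREAM_ID: "video"}.get(stream_id, "other")
    return kind, length, frame_number
```

Carrying the frame number in the header is what later lets the demultiplexer match an IDR frame to its audio frame.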

MPEG-2 TRANSPORT STREAM

• MPEG Transport Streams (MPEG-TS) use a fixed packet size of 188 bytes, including header and payload data.

• The PID identifies whether the packet carries audio or video.

• 0000001110 (14): audio stream

• 0000001111 (15): video stream

• Header fields: PUSI (payload unit start indicator), AFC (adaptation field control), CC (continuity counter), PID (packet identifier).

Fig 9: MPEG 2-TS format[15]
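A PES packet can be cut into fixed 188-byte TS packets roughly as sketched here. Note the assumptions: the standard MPEG-2 TS header is 4 bytes with a 13-bit PID, whereas this sketch uses a 3-byte header (sync byte, then PUSI, AFC, a 4-bit CC and a 10-bit PID), which is inferred from the 10-bit PID values above and the 185-byte payload used in the playback time calculation. Using AFC to flag a short final payload and padding with 0xFF are also illustrative choices.

```python
TS_PACKET_SIZE = 188
PAYLOAD_SIZE = 185  # assumption: 188 bytes minus a 3-byte header
SYNC_BYTE = 0x47
AUDIO_PID = 14
VIDEO_PID = 15


def packetize(pes: bytes, pid: int, cc_start: int = 0) -> list:
    """Split one PES packet into fixed-size TS packets."""
    packets = []
    for index, offset in enumerate(range(0, len(pes), PAYLOAD_SIZE)):
        chunk = pes[offset:offset + PAYLOAD_SIZE]
        pusi = 1 if offset == 0 else 0                # first packet of this PES
        afc = 1 if len(chunk) < PAYLOAD_SIZE else 0   # short final payload
        cc = (cc_start + index) & 0x0F                # 4-bit continuity counter
        flags = (pusi << 15) | (afc << 14) | (cc << 10) | (pid & 0x3FF)
        header = bytes([SYNC_BYTE]) + flags.to_bytes(2, "big")
        # Pad so every packet is exactly 188 bytes; the PES length field
        # lets the receiver discard the padding.
        packets.append(header + chunk.ljust(PAYLOAD_SIZE, b"\xff"))
    return packets
```

A 400-byte PES packet, for instance, yields three TS packets carrying 185, 185 and 30 payload bytes.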

Fig 10: Multiplexing process [21]

Multiplexing Process

• Multiplexing plays an important role in avoiding buffer overflow or underflow at the demultiplexing end.

• Video and audio playback timings are used to ensure effective multiplexing of the TS packets.

• Timing counters are incremented according to the playback time of each TS packet.

• The packet with the least timing counter value is always given preference during packet allocation.
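The allocation rule above can be sketched as a two-counter interleaver. This is a simplified illustration, not the presented implementation: it assumes a constant playback time per TS packet for each stream (in practice it varies per frame), and it breaks ties in favor of video, which is an arbitrary choice.

```python
def interleave(video_packets, audio_packets, video_dt, audio_dt):
    """Order TS packets so the stream with the smaller timing counter goes next."""
    output = []
    video_clock = audio_clock = 0.0
    vi = ai = 0
    while vi < len(video_packets) or ai < len(audio_packets):
        video_left = vi < len(video_packets)
        audio_left = ai < len(audio_packets)
        # Prefer the stream whose counter is lower (ties go to video).
        take_video = video_left and (not audio_left or video_clock <= audio_clock)
        if take_video:
            output.append(("video", video_packets[vi]))
            vi += 1
            video_clock += video_dt  # advance by this packet's playback time
        else:
            output.append(("audio", audio_packets[ai]))
            ai += 1
            audio_clock += audio_dt
    return output
```

Because each stream advances its own counter by the playback time it has just sent, neither buffer at the receiver runs far ahead of the other.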

Calculation of playback time

• Frames per second = n

• Hence, duration of 1 frame (f) = 1/n seconds

• Number of TS packets (N) = length of PES / 185

• Playback time of one TS packet = f/N

• where f_video = 1/fps and f_audio = 1024/sampling frequency
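The calculation above translates directly into a few helper functions. Rounding N up to a whole number of packets is an assumption (the slide gives only the division), and the function names and example sizes are hypothetical.

```python
import math

TS_PAYLOAD_BYTES = 185
AAC_SAMPLES_PER_FRAME = 1024


def ts_packets_per_pes(pes_length: int) -> int:
    """N: number of TS packets for one PES packet (rounded up, an assumption)."""
    return math.ceil(pes_length / TS_PAYLOAD_BYTES)


def video_frame_duration(fps: float) -> float:
    """f_video = 1 / fps, in seconds."""
    return 1.0 / fps


def audio_frame_duration(sampling_frequency: float) -> float:
    """f_audio = 1024 / sampling frequency, in seconds."""
    return AAC_SAMPLES_PER_FRAME / sampling_frequency


def playback_time_per_ts(frame_duration: float, pes_length: int) -> float:
    """Playback time of one TS packet = f / N."""
    return frame_duration / ts_packets_per_pes(pes_length)


# Example: a 3700-byte video PES packet at 25 fps spans 20 TS packets,
# so each TS packet accounts for 0.04 / 20 seconds of playback.
print(playback_time_per_ts(video_frame_duration(25), 3700))
```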

Flowchart of multiplexing process

Demultiplexing process

• The transport stream (TS) input to a receiver is demultiplexed into a video elementary stream and audio elementary stream.

• These ES are initially written into video and audio buffers respectively.

• Once one of the buffers is full, the elementary stream is reconstructed from the point of synchronization.
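The first two steps above can be sketched as a PID-based splitter. It assumes the same simplified 3-byte TS header (sync byte, then flags and a 10-bit PID) and the PID values 14 and 15 given earlier; the buffer-full check and the removal of padding are left out to keep the example short.

```python
TS_PACKET_SIZE = 188
HEADER_SIZE = 3  # assumption: simplified 3-byte TS header
SYNC_BYTE = 0x47
AUDIO_PID = 14
VIDEO_PID = 15


def demultiplex(ts_stream: bytes):
    """Split a TS byte stream into video and audio payload buffers."""
    video_buffer = bytearray()
    audio_buffer = bytearray()
    last_start = len(ts_stream) - TS_PACKET_SIZE
    for offset in range(0, last_start + 1, TS_PACKET_SIZE):
        packet = ts_stream[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            continue  # skip packets that lost synchronization
        pid = int.from_bytes(packet[1:3], "big") & 0x3FF
        payload = packet[HEADER_SIZE:]
        if pid == VIDEO_PID:
            video_buffer += payload
        elif pid == AUDIO_PID:
            audio_buffer += payload
    return bytes(video_buffer), bytes(audio_buffer)
```

The PES headers inside each buffer then provide the frame numbers needed to find the synchronization point.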

Flow chart of demux process

Lip Synchronization

• The data is loaded from the buffer during playback.

• An IDR frame is searched for from the start of the video buffer.

• The frame number of the IDR frame is extracted.

• The playback time of the current IDR frame is calculated as: Video playback time = IDR frame number / fps

• The corresponding audio frame number is calculated as: Audio frame number = (Video playback time * sampling frequency) / 1024

Lip sync cont…

• If the result is a non-integer value, the audio frame number is rounded off and the corresponding audio frame is searched for in the audio buffer.

• The audio and video contents from the corresponding frame numbers are decoded and played back.

• The audio and video buffers are then refreshed, a new set of data is loaded into the buffers, and the process continues.

• If the corresponding audio frame is not found in the buffer, the next IDR frame is searched for and the same process is repeated.
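The synchronization search described above can be sketched as follows. The function names and the way buffers are represented (plain lists of frame numbers) are hypothetical. Python's `round` rounds exact halves to the nearest even integer, which may differ from the rounding used in the presented work.

```python
AAC_SAMPLES_PER_FRAME = 1024


def matching_audio_frame(idr_frame_number, fps, sampling_frequency):
    """Audio frame number that plays at the same time as the given IDR frame."""
    video_playback_time = idr_frame_number / fps
    return round(video_playback_time * sampling_frequency / AAC_SAMPLES_PER_FRAME)


def find_sync_point(idr_frame_numbers, audio_frames_in_buffer, fps, sampling_frequency):
    """Return (idr_frame, audio_frame) for the first IDR frame whose audio is buffered."""
    available = set(audio_frames_in_buffer)
    for idr in idr_frame_numbers:
        audio = matching_audio_frame(idr, fps, sampling_frequency)
        if audio in available:
            return idr, audio
    return None  # no match: refill the buffers and search again


# Example: at 25 fps and 48 kHz, IDR frame 25 plays at 1.0 s,
# which corresponds to audio frame 46.875, rounded to 47.
print(matching_audio_frame(25, 25, 48000))
```

If audio frame 47 is not in the audio buffer, the search moves on to the next IDR frame, as in the last bullet above.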

Test conditions

Figure 6.1: Test conditions. The source .avi/.mov file provides the video (YUV file, encoded to a .hevc file) and the audio (WAVE file, encoded to a .aac file).

RESULTS


CONCLUSION

• Audio-video synchronization is achieved even when the demultiplexer starts from an arbitrary TS packet.

• Visually there is no lag between the video and audio.

• The buffer fullness at the demultiplexer end is continuously monitored and buffer overflow or underflow is prevented using the adopted multiplexing method.

FUTURE WORK

• The proposed technique can be extended to support multiple elementary streams, for example to include subtitles during playback.

• The proposed technique can also be modified to support elementary streams from different video and audio codecs, depending on their NAL and ADTS formats respectively.

• The adopted method can also be extended to support error-resilient codes for transmission of multimedia programs over error-prone networks.

REFERENCES

• [1] G. Sullivan et al., “Overview of the high efficiency video coding (HEVC) standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.

• [2] Multimedia Processing Lab website: http://www.uta.edu/faculty/krrao/dip

• [3] MPEG-2 advanced audio coding, AAC. International Standard IS 13818-7, ISO/IEC JTC1/SC29 WG11, 1997.

• [4] MPEG: Information technology – Generic coding of moving pictures and associated audio information, Part 3: Audio. International Standard IS 13818-3, ISO/IEC JTC1/SC29 WG11, 1994.

• [5] MPEG: Information technology – Generic coding of moving pictures and associated audio information, Part 4: Conformance testing. International Standard IS 13818-4, ISO/IEC JTC1/SC29 WG11, 1998.

• [6] Information technology – Generic coding of moving pictures and associated audio – Part 1: Systems, ISO/IEC 13818-1:2005, International Telecommunications Union.

• [7] MPEG-4: ISO/IEC JTC1/SC29 14496-10: Information technology – Coding of audio-visual objects – Part 10: Advanced Video Coding, ISO/IEC, 2005.

• [8] M. Bosi and M. Goldberg, “Introduction to digital audio coding and standards”, Boston: Kluwer Academic Publishers, 2003.

• [9] R. Linneman, “Advanced audio coding on FPGA”, BS honors thesis, October 2002, School of Information Technology, Brisbane, Australia.

• [10] Y. Kubo et al., “Improved high-quality MPEG-2/4 advanced audio coding encoder”, The Acoustical Society of Japan, 2008.

• [11] K. Brandenburg, “MP3 and AAC Explained”, AES 17th International Conference, Florence, Italy, September 1999.

• [12] J. Nightingale, Q. Wang and C. Grecos, “HEVStream: A framework for streaming and evaluation of High Efficiency Video Coding (HEVC) content in loss-prone networks”, IEEE Transactions on Consumer Electronics, vol. 59, pp. 404-412, May 2012.

• [13] G. Sullivan, P. Topiwala and A. Luthra, “The H.264/AVC video coding standard: overview and introduction to the fidelity range extensions”, SPIE Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp. 53-74, August 2004.

• [14] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) document JCTVC-J0292r1, July 2012.

• [15] K. R. Rao, D. Kim and J. J. Hwang, “Video coding standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.

• [16] I. E. Richardson, “The H.264 advanced video compression standard”, 2nd Edition, Wiley, 2010.

• [17] ISO/MP4 information: http://en.wikipedia.org/wiki/MPEG4_Part_14

• [18] T. Schierl et al., “RTP Payload Format for High Efficiency Video Coding”, Nokia, February 27, 2012.

• [19] HEVC tutorial: http://www.vcodex.com/h265.html

• [20] G. Sullivan et al., “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, pp. 1001-1016, Dec. 2013.

• [21] T. Wiegand et al., “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, pp. 560-576, July 2003.

• [22] MPEG-4: ISO/IEC JTC1/SC29 14496-10: Information technology – Coding of audio-visual objects – Part 10: Advanced Video Coding, ISO/IEC, 2005.

• [23] J. Herre and H. Purnhagen, “General audio coding”, in The MPEG-4 Book (Prentice Hall IMSC Multimedia Series), F. Pereira and T. Ebrahimi, Eds. Englewood Cliffs, NJ: Prentice-Hall, 2002.

• [24] V. Sze, M. Budagavi and G. J. Sullivan, “High efficiency video coding (HEVC): Algorithms and architectures”, Springer, 2014.

• [25] Website for AC-3: http://www.digitalpreservation.gov/formats/fdd/fdd000209.shtml

• [26] Basics of video: http://lea.hamradio.si/~s51kq/V-BAS.HTM

• [27] The HEVC website: http://hevc.hhi.fraunhofer.de/

• [28] HEVC open source software (encoder/decoder): https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM14.0dev/

• [29] JCT-VC documents are publicly available at http://ftp3.itu.ch/avarch/jctvcsite and http://phenix.itsudparis.eu/jct/.

• [30] HEVC software manual: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM-9.2-dev/doc/software-manual.pdf



Special issues on HEVC:

• [31] Special issue on emerging research and standards in next generation video coding, IEEE Transactions on Circuits and Systems for Video Technology (CSVT), vol. 22, pp. 1646-1909, Dec. 2012.

• [32] “Introduction to the issue on video coding: HEVC and beyond”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, pp. 931-1151, Dec. 2013.

• [33] D. K. Fibush, “Timing and Synchronization Using MPEG-2 Transport Streams”, SMPTE Journal, pp. 395-400, July 1996.

• [34] Z. Cai et al., “A RISC Implementation of MPEG-2 TS Packetization”, in the proceedings of IEEE HPC conference, pp. 688-691, May 2000.

• [35] P. A. Sarginson, “MPEG-2: Overview of systems layer”, BBC RD 1996/2.

• [36] MPEG-2 TS: http://www.erg.abdn.ac.uk/future-net/digital-video/mpeg2-trans.html

• [37] VLC software and source code website: www.videolan.org

• [38] FFmpeg software and official website: http://ffmpeg.mplayerhq.hu/

• [39] “FAAC and FAAD AAC software”: www.audiocoding.com

• [40] DivX player: www.divx.com

• [41] MKVToolNix GUI preview

• [42] T. Ogunfunmi and M. Narasimha, “Principles of speech coding”, Boca Raton, FL: CRC Press, 2010.

• JVT reflector: queries, questions and clarifications regarding H.264/H.265: [email protected]; on behalf of Karsten Suehring [[email protected]]

THANK YOU
