
Introduction to Video Encoding

Preben N. Olsen

University of Oslo and Simula Research Laboratory


August 26, 2013

1 / 37

Agenda

1 Introduction
• Repetition
• History
• Quality Assessment
• Containers

2 Video Encoding Fundamentals
• Macroblocks
• Frames
• Prediction Modes
• Motion Compensation
• Parallel Encoding


Repetition

From the first lecture ...

• Media Compression
  • Raw data is inconvenient, very large file sizes
  • Compression reduces bandwidth and storage costs

• Image Representation
  • Number of pixels, e.g., 1920 × 1080
  • Color representation per pixel

• YUV (YCbCr) Color Space
  • Y is the luma component (light intensity)
  • U is a chroma component (color)
  • V is a chroma component (color)
  • Reduce file size by chroma sub-sampling
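As a rough illustration of chroma sub-sampling, the sketch below reduces a chroma plane to one sample per 2 × 2 pixel block (the 4:2:0 layout) by averaging. The function names and the averaging filter are assumptions made for this example; real encoders may use other filters and sample positions.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (list of rows, even dimensions)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            # +2 rounds to nearest before the integer division by 4
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def bits_per_pixel_420(bit_depth=8):
    """One luma sample per pixel plus two chroma samples shared by four pixels."""
    return bit_depth + 2 * bit_depth / 4


# Hypothetical 2x4 U plane used only as sample data.
u_plane = [
    [100, 102, 50, 54],
    [104, 106, 52, 56],
]
u_small = subsample_420(u_plane)
print(u_small, bits_per_pixel_420())
```

With 8-bit samples this brings the average cost down from 24 to 12 bits per pixel, while the luma plane keeps full resolution.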


Repetition

Figure : RGB and CMYK [1]


Repetition

Figure : YUV Dissected, original [2]


Repetition

Full HD YUV frame size ...

1920 × 1080 × 24 bits ≈ 5.9 MB


Repetition

Why not (buy and) download The Hobbit in full YUV format?

5.9 MB × 48 FPS × (162 × 60) seconds ≈ 2.6 TB
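The arithmetic on the last two slides can be checked with a few lines of Python. This is only a sketch; the function names are made up for the example. Note that the figures 5.9 and 2.6 come out when sizes are expressed in binary units (MiB and TiB); in decimal units the same byte counts are roughly 6.2 MB and 2.9 TB.

```python
def raw_frame_bytes(width, height, bits_per_pixel=24):
    """Size of one uncompressed frame in bytes."""
    return width * height * bits_per_pixel // 8


def raw_video_bytes(width, height, fps, seconds, bits_per_pixel=24):
    """Size of an uncompressed video in bytes."""
    return raw_frame_bytes(width, height, bits_per_pixel) * fps * seconds


frame = raw_frame_bytes(1920, 1080)
movie = raw_video_bytes(1920, 1080, fps=48, seconds=162 * 60)

print(f"{frame / 2**20:.1f} MiB per frame")
print(f"{movie / 2**40:.1f} TiB for the whole film")
```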


Repetition

Figure : YUV Data Layout, original [3]


Repetition

Figure : JPG Block Diagram [1]


Repetition

• JPEG is short for Joint Photographic Experts Group

• There's a trade-off between size and quality in JPEG images

• A compression ratio of 1:10 gives a reasonable result

• Lossless JPEG encoding yields a compression ratio of approximately 1:1.6


History

• MPEG is short for Moving Picture Experts Group

• Industry together with ISO and ITU develops standards

• MPEG-1 started in 1988, released in 1993

• MPEG-2 started in 1990, released in 1996


History

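• MPEG-3 was intended to add support for HDTV (1080p), but was abandoned and its goals were folded into MPEG-2

• MPEG-4 started in 1998, with parts released from 1999 onward

• Part 2 of MPEG-4 (MPEG-4 Visual, related to H.263) defines the Advanced Simple Profile (ASP)

• Often known through its implementations, DivX and Xvid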


History

• Part 10 of MPEG-4 defines H.264, introduced in 2003

• Twice the compression of H.263 (MPEG-4 ASP)

• Used by Blu-ray, RiksTV, YouTube, and many others

• Sometimes referred to as x264, which is strictly the name of an open-source encoder implementation

• This codec has 17 different profiles


History

• High Efficiency Video Coding or H.265 started in 2004

• HEVC has better compression at the same level of quality

• Released to the public on June 7th, 2013 [4]

• Supports Ultra High Definition TV (UHDTV), 7680 × 4320


Money and Politics

• Patent pool created by MPEG LA

• About 1,500 patents related to H.264

• Incentive for large, global companies


History

• Google bought On2, which initially developed VP8

• VP8 spec. released with open-source implementation in 2010

• Supported by many browsers and mobile platforms

• Ongoing development on VP9, an HEVC competitor


Quality Assessment

• Assessing video quality is difficult

• A group of people rate which version is best

• People have different opinions on quality

• Objective measurements can give an estimate

• Peak Signal-to-Noise Ratio (PSNR)

• A shell script for computing PSNR can be found in the MPlayer source tree
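PSNR is derived from the mean squared error between the original and the distorted image: PSNR = 10 · log10(MAX² / MSE), where MAX is the largest possible sample value (255 for 8-bit video). The sketch below computes it for a single plane stored as a list of rows; it is an illustration written for these notes, not the MPlayer script mentioned above, and the sample values are invented.

```python
import math


def mse(original, distorted):
    """Mean squared error between two equally sized planes (lists of rows)."""
    squared = [
        (a - b) ** 2
        for row_a, row_b in zip(original, distorted)
        for a, b in zip(row_a, row_b)
    ]
    return sum(squared) / len(squared)


def psnr(original, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical planes."""
    error = mse(original, distorted)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


# Invented 2x2 luma samples, before and after lossy coding.
original = [[52, 55], [61, 59]]
distorted = [[54, 55], [60, 59]]
print(f"MSE = {mse(original, distorted)}, PSNR = {psnr(original, distorted):.2f} dB")
```

Higher values mean less distortion, but PSNR only estimates perceived quality, which is why subjective tests are still used.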


Containers

• File containers are not codecs

• Video codecs are used for encoding and decoding bitstreams

• Containers are used for packaging bitstreams

• Examples include Audio Video Interleave (AVI), Matroska (MKV), Video Object (VOB), and Ogg


Video Encoding Fundamentals


Figure : Overview of H.264/VP8


Macroblocks

Figure : Missing macroblocks [5]


Macroblocks

• Different macroblock types and sizes

• Typically 16 × 16 pixels, which can be subdivided down to 4 × 4 blocks

• Intra-predicted, predicted, and bi-directionally predicted macroblocks
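To get a feel for the numbers, the following sketch lays a grid of 16 × 16 macroblocks over a frame. The helper names are made up for this example. When a dimension is not a multiple of 16, the grid is rounded up, so a 1080-line frame is covered by 68 rows of macroblocks (1088 lines).

```python
def macroblock_grid(width, height, size=16):
    """Number of macroblock columns and rows needed to cover the frame."""
    columns = -(-width // size)  # ceiling division
    rows = -(-height // size)
    return columns, rows


def macroblock_origins(width, height, size=16):
    """Top-left pixel coordinate of every macroblock, in raster order."""
    columns, rows = macroblock_grid(width, height, size)
    return [(col * size, row * size) for row in range(rows) for col in range(columns)]


columns, rows = macroblock_grid(1920, 1080)
origins = macroblock_origins(1920, 1080)
print(columns, rows, len(origins))
```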


Macroblocks

Figure : Foreman and macroblocks


Frames

• Also different frame types

• Usually intra-predicted frames, predicted frames, and bi-directional predicted frames

• VP8 does not have bi-directional frames, but has alt-ref and golden frames

Figure : Different frames [6]


Frames

Prediction type

1 Intra-prediction (this slide)

2 Inter-prediction

3 Bi-directional

Predict the pixels of a macroblock using information available within a single frame.

Typically predicts from the left, top, and top-left macroblocks by interpolating or extrapolating the border pixels' values.

Different prediction modes are available, e.g., horizontal, vertical, and average.
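A minimal sketch of the three modes named above, for a small square block. `top` is the row of already decoded pixels just above the block and `left` the column just to its left. The function names, and the use of the sum of absolute differences to pick a mode, are assumptions for this example; actual codecs define more modes and exact rounding rules.

```python
def predict_vertical(top, size):
    """Copy the row above the block downwards."""
    return [list(top) for _ in range(size)]


def predict_horizontal(left, size):
    """Copy the column left of the block to the right."""
    return [[left[row]] * size for row in range(size)]


def predict_average(top, left, size):
    """Fill the block with the rounded mean of all border pixels."""
    mean = (sum(top) + sum(left) + size) // (2 * size)
    return [[mean] * size for _ in range(size)]


def sad(block, prediction):
    """Sum of absolute differences between a block and its prediction."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block, prediction)
        for a, b in zip(row_a, row_b)
    )


def best_intra_mode(block, top, left):
    """Try every mode and return (name, cost) of the cheapest one."""
    size = len(block)
    predictions = {
        "vertical": predict_vertical(top, size),
        "horizontal": predict_horizontal(left, size),
        "average": predict_average(top, left, size),
    }
    return min(
        ((name, sad(block, pred)) for name, pred in predictions.items()),
        key=lambda item: item[1],
    )


# Invented 4x4 block with vertical stripes that continue from the row above.
top = [10, 20, 30, 40]
left = [12, 12, 12, 12]
block = [[10, 20, 30, 40] for _ in range(4)]
print(best_intra_mode(block, top, left))
```

For this striped block the vertical mode reproduces the pixels exactly, so only the mode choice needs to be signalled and the residual is zero.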


Frames

Prediction type

1 Intra-prediction

2 Inter-prediction (this slide)

3 Bi-directional

Predict a macroblock by reusing pixels from another frame. Objects tend to move around in a video, and motion vectors are used to compensate for this.

H.264 allows up to 16 reference frames, while VP8 only supports 3 frames.
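The idea can be sketched as follows: the motion vector points from the block's position in the current frame to the matching pixels in the reference frame, and only the difference (the residual) has to be coded. The helper names and the tiny frames are invented for this example, and the sketch does no bounds checking or sub-pixel interpolation.

```python
def fetch_block(frame, x, y, size):
    """Cut a size x size block out of a frame (list of rows)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def predict_inter(reference, x, y, motion_vector, size):
    """Prediction for the block at (x, y), taken from the displaced position."""
    dx, dy = motion_vector
    return fetch_block(reference, x + dx, y + dy, size)


def residual(block, prediction):
    """Pixel-wise difference that remains after prediction."""
    return [
        [a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(block, prediction)
    ]


# A bright 2x2 object moves two pixels to the right between two frames.
reference = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
current = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 9, 0],
    [0, 0, 0, 9, 9, 0],
    [0, 0, 0, 0, 0, 0],
]

block = fetch_block(current, 3, 1, 2)
prediction = predict_inter(reference, 3, 1, motion_vector=(-2, 0), size=2)
print(residual(block, prediction))
```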


Frames

Prediction type

1 Intra-prediction

2 Inter-prediction

3 Bi-directional (this slide)

Predict the pixels of a macroblock using information available in other frames, both previous and upcoming frames; that is, going back and forward in time.

Can reference every type of frame, including other bi-directional predicted frames.
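One simple way to combine a past and a future reference is to average the two motion-compensated blocks. The sketch below assumes plain averaging with rounding and uses invented sample values; real codecs may also weight the two references differently.

```python
def average_prediction(past_block, future_block):
    """Average two equally sized blocks, rounding to nearest."""
    return [
        [(a + b + 1) // 2 for a, b in zip(past_row, future_row)]
        for past_row, future_row in zip(past_block, future_block)
    ]


def residual(block, prediction):
    """Pixel-wise difference that remains after prediction."""
    return [
        [a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(block, prediction)
    ]


# A slow fade: the current frame lies between the past and the future frame.
past_block = [[100, 100], [100, 100]]
future_block = [[120, 120], [120, 120]]
current_block = [[110, 110], [110, 110]]

prediction = average_prediction(past_block, future_block)
print(residual(current_block, prediction))   # bi-directional prediction
print(residual(current_block, past_block))   # prediction from the past only
```

In this fade the averaged prediction matches the current block exactly, while predicting from the past frame alone leaves a residual of 10 in every pixel.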


Determining Prediction Modes

• The motion estimator tries many modes

• Different blocks are evaluated

• Two-step process, initial and refinement
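The slide does not spell out the two steps, so the sketch below is one possible reading: a coarse search over every second displacement, followed by a refinement of ±1 pixel around the best coarse candidate. All names, the search radius, and the test frames are assumptions for this example, and the cost used is the sum of absolute differences.

```python
def make_frame(width, height, obj_x, obj_y):
    """Build a black frame with a small 2x2 textured object (test data only)."""
    frame = [[0] * width for _ in range(height)]
    pattern = [[10, 20], [30, 40]]
    for row in range(2):
        for col in range(2):
            frame[obj_y + row][obj_x + col] = pattern[row][col]
    return frame


def sad_at(current, reference, x, y, motion_vector, size):
    """SAD between the current block and the displaced reference block."""
    dx, dy = motion_vector
    return sum(
        abs(current[y + row][x + col] - reference[y + dy + row][x + dx + col])
        for row in range(size)
        for col in range(size)
    )


def candidates(center, radius, step, x, y, size, width, height):
    """Displacements around center that keep the block inside the frame."""
    cx, cy = center
    for dy in range(cy - radius, cy + radius + 1, step):
        for dx in range(cx - radius, cx + radius + 1, step):
            if 0 <= x + dx <= width - size and 0 <= y + dy <= height - size:
                yield dx, dy


def two_step_search(current, reference, x, y, size=2, radius=4):
    """Coarse search with step 2, then a +/-1 refinement around the winner."""
    height, width = len(reference), len(reference[0])

    def cost(motion_vector):
        return sad_at(current, reference, x, y, motion_vector, size)

    coarse = min(candidates((0, 0), radius, 2, x, y, size, width, height), key=cost)
    fine = min(candidates(coarse, 1, 1, x, y, size, width, height), key=cost)
    return fine, cost(fine)


reference = make_frame(8, 8, obj_x=1, obj_y=2)
current = make_frame(8, 8, obj_x=4, obj_y=3)
motion_vector, best_cost = two_step_search(current, reference, x=4, y=3)
print(motion_vector, best_cost)
```

The coarse step evaluates far fewer positions than an exhaustive search, at the risk of missing the true match when the cost surface is not smooth.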


Some Cost Functions

• Mean square error (MSE)

• Sum of Absolute Differences (SAD)

• Sum of Absolute Transformed Differences (SATD)

• SATD estimates the cost of coding a block more accurately than SAD, but is more expensive to compute
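The three cost functions can be compared on a 4 × 4 block. SATD applies a transform to the difference block before summing absolute values; the sketch below assumes a 4 × 4 Hadamard transform and leaves the result unnormalized, whereas encoders often scale it. The names and the sample blocks are invented for this example.

```python
HADAMARD_4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def difference(block, prediction):
    """Pixel-wise difference between a block and its prediction."""
    return [
        [a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(block, prediction)
    ]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mse(block, prediction):
    values = [v for row in difference(block, prediction) for v in row]
    return sum(v * v for v in values) / len(values)


def sad(block, prediction):
    return sum(abs(v) for row in difference(block, prediction) for v in row)


def satd(block, prediction):
    """Sum of absolute values of the Hadamard-transformed difference."""
    diff = difference(block, prediction)
    transposed = [list(col) for col in zip(*HADAMARD_4)]
    transformed = matmul(matmul(HADAMARD_4, diff), transposed)
    return sum(abs(v) for row in transformed for v in row)


zero = [[0] * 4 for _ in range(4)]
flat = [[1] * 4 for _ in range(4)]  # small error in every pixel
impulse = [[4, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]  # one large error

for name, block in (("flat", flat), ("impulse", impulse)):
    print(name, mse(block, zero), sad(block, zero), satd(block, zero))
```

Both test blocks have the same MSE, and SAD prefers the impulse, yet SATD ranks the flat error as much cheaper: after the transform it collapses into a single coefficient, while the impulse spreads over all sixteen.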


Motion Compensation

• With the best motion vector, a predicted block is generated

• The original reference frame cannot be used directly as input to the motion compensator, as the decoder never sees the original image

• The decoder "sees" a reconstructed image, i.e., an image with loss

• A reconstructed reference image must be used as input
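Why the reference must be the reconstructed image can be shown with a one-dimensional toy codec: each sample is predicted from the previous one, and the difference is quantized. Everything in this sketch (names, step size, sample values) is invented for illustration. If the encoder predicts from the original samples, the decoder's errors accumulate; if it predicts from what the decoder will reconstruct, the error stays within half a quantization step.

```python
def quantize(value, step):
    """Round an integer to the nearest multiple of step (the lossy part)."""
    return step * ((value + step // 2) // step)


def encode(samples, step, use_reconstructed):
    """Return quantized residuals; the reference is either original or reconstructed."""
    residuals = []
    reference = 0
    for sample in samples:
        coded = quantize(sample - reference, step)
        residuals.append(coded)
        # The decoder can only rebuild reference + coded, never the original sample.
        reference = reference + coded if use_reconstructed else sample
    return residuals


def decode(residuals):
    """Rebuild samples by adding each residual to the previous reconstruction."""
    output = []
    reference = 0
    for coded in residuals:
        reference += coded
        output.append(reference)
    return output


def max_error(samples, decoded):
    return max(abs(a - b) for a, b in zip(samples, decoded))


samples = [3, 6, 9, 12, 15, 18]
drifting = decode(encode(samples, step=4, use_reconstructed=False))
stable = decode(encode(samples, step=4, use_reconstructed=True))
print(drifting, max_error(samples, drifting))
print(stable, max_error(samples, stable))
```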


Parallel Encoding

• Approaches available both for intra- and inter-prediction

• Some give up compression efficiency for increased parallelism

• A pipeline approach shouldn't be combined with real-time requirements, since pipelining adds latency


Parallel Encoding

What should be optimized?

Figure : VP8 profiling


Parallel Encoding

Figure : Group of Pictures [6]


Parallel Encoding

Figure : Slice-based approach
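The structure of a slice-based approach can be sketched as follows: the macroblock rows of a frame are divided into slices, and each slice is handed to its own worker. `encode_slice` is a hypothetical placeholder that only sums pixel values, and the thread pool shows the structure rather than a real speed-up, since threads in standard CPython do not run pure Python code in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(row_count, slice_count):
    """Divide row indices into contiguous slices of near-equal size."""
    base, extra = divmod(row_count, slice_count)
    slices, start = [], 0
    for index in range(slice_count):
        rows = base + (1 if index < extra else 0)
        slices.append(range(start, start + rows))
        start += rows
    return slices


def encode_slice(frame, rows):
    """Hypothetical stand-in for real slice encoding: sums the slice's pixels."""
    return sum(sum(frame[row]) for row in rows)


def encode_frame_in_slices(frame, slice_count):
    """Encode every slice independently and return results in slice order."""
    slices = split_into_slices(len(frame), slice_count)
    with ThreadPoolExecutor(max_workers=slice_count) as pool:
        return list(pool.map(lambda rows: encode_slice(frame, rows), slices))


# Ten rows of four pixels each, where every pixel in row r has value r.
frame = [[row] * 4 for row in range(10)]
print([len(rows) for rows in split_into_slices(68, 4)])
print(encode_frame_in_slices(frame, 3))
```

Because slices are coded independently, prediction cannot cross slice borders, which is one way parallelism is traded for compression efficiency.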


Conclusion

• Video encoding is mainly about trying (and failing) different prediction modes, limited by user-defined restrictions (resource usage)

• The "actual" encoding of the video when the parameters are known usually accounts for a small percentage of the running time

• Any (reasonable) codec can produce the desired video quality; what differs between them is the size of the output bitstream they produce


The End


References

[1] Video & Image Compression Techniques: Image Coding Fundamentals, http://goo.gl/6fCK7N

[2] Wikipedia: YUV, http://en.wikipedia.org/wiki/Yuv

[3] Any To YUV: Documentation, http://any2yuv.sourceforge.net/Docs

[4] H.265: High efficiency video coding, http://www.itu.int/rec/T-REC-H.265

[5] BitBlit.Org, http://www.bitblit.org/gsoc/g3dvl/

[6] GOP (Group of Pictures), http://goo.gl/83D7Hz

