International Journal of Computer Science and Applications, Technomathematics Research Foundation Vol. 8, No. 1, pp. 14 – 35, 2011

A LOSSLESS RE-ENCODING OF MPEG-2 CODED FILE BY INTEGRATING FOUR MOTION VECTORS

Kazuo Ohzeki

Shibaura Institute of Technology Toyosu, Koutouku, Tokyo, 135-8548 Japan ohzeki sic.shibaura-it.ac.jp

Hiroki Okumura

Shibaura Institute of Technology Toyosu, Koutouku, Tokyo, 135-8548 Japan l07034 shibaura-it.ac.jp

Yuǎn yù Wei

Shibaura Institute of Technology Toyosu, Koutouku, Tokyo, 135-8548 Japan m710101 shibaura-it.ac.jp

Eizaburo Iwata

Universal Robot Inc. Toyosu, Aoumi, Kotoku Tokyo Japan eiza urobot.co.jp

Ulrich Speidel

University of Auckland Tamaki – 331, Morrin Road, Glen Inne, Auckland New Zealand

ulrich cs.auckland.ac.nz

Re-encoding of once-compressed files is one of the difficult challenges in measuring the efficiency of coding methods. Variable-length coding with a variable source delimiting scheme is a promising method for improving re-encoding efficiency. Analyses of coded files with fixed-length delimiting and with variable-length delimiting are reviewed. Motion vector codes of MPEG-2 encoded files are modified from a variable-to-variable coding point of view. The effects of video length, bit-rate, and variety of videos are examined. One of the largest files is 16 seconds of full-size D1 video at 720×480, and the total length of the videos used in the experiments is more than 750 seconds. By coding with Huffman codes and arithmetic codes, an improvement of more than 20% in coding efficiency over conventional MPEG-2 is obtained. Because the proposed method is a lossless re-encoding, the video quality is unchanged.

Keywords: Lossless; Re-encoding; Variable Length Coding; Huffman.

1. Introduction

In this paper, the redundancy of coded files, such as MPEG-2 encoded ones, is evaluated to provide information for re-encoding. There are four frameworks of encoding, according to whether the input event length and the output bit length are fixed or variable. One example is fixed-length input with fixed-length output, which is the F-F type. In the same manner, there are three other types: F-V, V-F, and V-V. Among these types, V-V is the most general and has the greatest potential for realizing the most efficient encoding method [Ziv (1990), Han et al (2001)]. However, the V-V type has the difficult problem of how to delimit the input events into the most efficient lengths. Universal coding and arithmetic coding are flexible methods, but they are basically still F-V types. Based on several examples of V-V trials, we show an example of V-V re-encoding. The method improves the compression ratio by 20% compared to the MPEG-2 standard method. Examples of V-V coding methods are reported in several papers. Ziv showed that V-F codes are better than F-V codes for finite-alphabet K-th order ergodic Markov sources [Ziv (1990)]. Yamamoto et al noted that a VF code is called proper if the set of parse strings of the code satisfies the prefix condition [Yamamoto et al (2001)]. They also stated that non-proper VF codes should be considered in order to realize efficient VF coding. This implies that variable parsing without the prefix condition has considerable potential to enhance coding efficiency. Abrahams presented a survey of the theoretical literature on fixed-to-variable-length lossless source code trees, called code trees, and on variable-length-to-fixed lossless source code trees, called parse trees [Abrahams (1997)]. The survey focused on Huffman coding and Tunstall V-F coding methods for parsing and tree construction, with a large bibliography for further investigation of algorithmic and performance perspectives. Matsui et al examined the compression ratio of the Tunstall-Huffman code, which is a variable-to-variable code. They obtained better results with Tunstall-Huffman V-V coding than with standalone coding methods such as Tunstall V-F coding or Huffman F-V coding [Matsui et al (2009)].

All of these V-V coding methods are theoretical and are not aimed at video files. MPEG-2 coding of video files has not been proved to be optimal. In fact, one of the authors has shown the redundancy of coded MPEG-2 files using FV codes. In this paper, we try to extend that work by using VF codes for MPEG-2 coded files. Although the application is restricted to the motion vector parts, the coding efficiency of the VV codes is much higher than that of the FV case. The proposed method is an improvement on MPEG-2 coding, so it is not compatible with the standard methods. There have been few papers on V-V coding. The reason may be that it is difficult to parse sequences at optimal delimiting points. To cope with this problem, a review of example trials is presented in this paper. A fixed-length analysis of sequences was implemented first, and the length of the fixed parsing was examined incrementally. Many heuristic trials were carried out, and the problems with this analysis are listed. To conduct variable parsing of sequences, the T-code generating method [Gunther et al (1997)] was adopted. T-code was introduced by Titchener in 1984 [Titchener (1984)]. Though the generating process is only a parsing process, it may be an influential tool for optimal variable-length coding. It generates codes by copying previously generated shorter codes. T-code generating experiments were carried out for more than 50 different video sequences. Efficient parsing is itself efficient coding. How to combine efficiency conditions with a parsing algorithm remains a further problem.

Regarding the coding of motion vectors, Yu et al presented two-dimensional motion vector coding for low bit-rate video phones. However, their Huffman coding was generated by the JPEG procedure, and the number of motion vector bits was equivalent to only about one frame of full D1 digital video size, 720×480 [Yu et al (1995)]. Shimizu et al proposed a method using a norm-and-angle representation of motion vectors. The method was complicated, and two separate codes were still used [Shimizu et al (2001)]. Matsuda et al proposed a lossless re-encoding scheme for MPEG-1 video. They used an arithmetic coder for both the DCT and the motion vector data of the MPEG-1 coder [Matsuda et al (2009)]. In the following sections, an exemplified review of the fixed parsing method and a variable parsing method with T-code are presented for long-term investigation. Then a practical variable-length coding is presented with several variations of construction parameters.

2. Re-Encoding of MPEG-2 coded files

2.1. Fixed Length Analysis

Before considering V-V coding, we review F-V coding to evaluate the actual entropy. There is redundancy in the MPEG coded bit-stream. There are two problems in evaluating the redundancy of coded bit-streams as statistical data over a long interval: quantity and quality. How much coded data should be prepared? How many kinds of pictures should be prepared? For this problem, we introduce a new inverted distribution model. Based on the model, we analyze the entropy space and show that the distribution is symmetrically uniform on a distorted circle in the isentropic space cut out by a plane of constant entropy. Though the model is uniformly distributed, actual random samples of coded bit-stream statistics are not. According to our experimental results, they are located in a small region. We conclude that the MPEG coded bit-stream is not random but highly correlated. Based on these results, we evaluate the entropy of coded bit-streams. For an interval of 20 bits, the entropy value is 0.9. If the encoder were ideal, its output would be a perfect random number sequence, which could not be re-compressed into a smaller file at all. For evaluating ideal encoder performance, random number analysis is one of the most effective methods. Among the several methods of random number analysis, entropy evaluation is used here. How many bits of data are needed to get reliable results? A relation between the length of the total bit stream and the length of the delimited interval was obtained as formula (1),

$$ n = k \, L \, 2^{L}, \qquad (1) $$

where L is the interval length of the delimited pattern, k is a safety coefficient for the occurrence of the delimited pattern, and n is the required length of the total bit stream to be analyzed. Using formula (1), given the interval length L, we can ensure a sufficient total length of the stream. To consider how many kinds of pictures and coding parameters we should prepare for valid results, we introduced the entropy space and cut it with an equi-entropy plane. The entropy per bit of a random sequence is Ent(R) = 1.0. The probability distributions of non-random sequences give specific entropy values in the range 0 < Entropy < 1. Usually, we assume the stream to be random, and k=10 has been taken in our experiments. Fig. 1 shows isentropic curves, plotted through probability points with the same entropy value. This figure is a low-dimensional case. We can use the same idea in higher-dimensional cases, because we cannot analyze all cases exactly but only some samples among them. If we take two points at random on one of these isentropic curves, the two points are not always close together; in many cases they lie symmetrically opposite each other. Two points in completely symmetrical positions have inverted probability distributions. If we form the union of the distributions of these two symmetric points with respect to their occurrence frequencies, the distribution of the union is almost flat and its entropy is close to 1.0. To observe this behavior, we choose many patterns on the isentropic curves. Then, taking two samples from this pattern set at random, we put them together to form a new point. The new point lies midway between the two points in Fig. 1, and its entropy increases because the new point moves inward in the closed entropy space. This method is formalized as evaluating Method 1.

[Figure 1 plot. Horizontal axis: Probability P1 (1/100), 0 to 70. Vertical axis: Probability P2 (1/100), 0 to 70. Contours labeled 0.85, 0.9, and 0.95.]

If there is no difference between the entropy values of S2 and S4, then the original files are strongly correlated and located in a narrow region. If there is a difference, the maximum value indicates the distribution of the original files. If the maximum value is not 1.0, then there must be redundancy in the set of original files.

2.2. Variable Length Analysis

To construct an optimal V-V code, analysis and parsing of the input sequence are important. Savari et al presented an analysis of variable-to-variable length codes [Savari et al (2002)]. The analysis found that the Tunstall V-F codes can be considered to have an equiprobable code set, so that combining the Tunstall V-F code with the Huffman F-V code provides higher efficiency. Matsui experimentally confirmed the concept using text files. In this section, the T-code analysis carried out in [Ohzeki et al (2010)] is introduced. Table 1 (a) shows an example in which the V-V encoding scheme performs better than the F-V encoding scheme. Table 1 (a) lists a 44-bit sequence. Parsing this sequence with a fixed length of two bits gives Table 1 (b).

Method 1:
S1. Evaluate the entropy of files F1, F2, ….
S2. Select files Fs1, Fs2, …, Fsn with the same entropy value E, e.g. E=0.9.
S3. Make combined files Cp=U(Fsi,Fsj), i≠j.
S4. Evaluate the entropy of Cp.
S5. Evaluate the increase of entropy from S2 to S4.
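Steps S1 to S5 of Method 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the helper names (`pattern_counts`, `entropy_per_bit`, `combine`) and the two toy bit strings are hypothetical, and a short interval of L=2 is used so that the tiny inputs still give meaningful statistics.

```python
import math
from collections import Counter

def pattern_counts(bits: str, L: int) -> Counter:
    """Count non-overlapping L-bit patterns (a trailing partial pattern is dropped)."""
    usable = len(bits) - len(bits) % L
    return Counter(bits[i:i + L] for i in range(0, usable, L))

def entropy_per_bit(counts: Counter, L: int) -> float:
    """Shannon entropy of the pattern distribution, normalized to one input bit."""
    total = sum(counts.values())
    h = -sum(c / total * math.log2(c / total) for c in counts.values())
    return h / L

def combine(a: Counter, b: Counter) -> Counter:
    """S3: statistics of the union of two files (pattern counts are added)."""
    return a + b

# Hypothetical toy "files" with inverted pattern distributions.
L = 2
counts_a = pattern_counts("00" * 6 + "01" + "10", L)   # mostly 00
counts_b = pattern_counts("11" * 6 + "10" + "01", L)   # mostly 11

e_a = entropy_per_bit(counts_a, L)                      # S1/S2: same entropy ...
e_b = entropy_per_bit(counts_b, L)                      # ... for both files
e_combined = entropy_per_bit(combine(counts_a, counts_b), L)  # S4
increase = e_combined - e_a                             # S5

print(f"file entropies {e_a:.3f}, {e_b:.3f}; combined {e_combined:.3f}; increase {increase:.3f}")
```

In this toy case the two files have equal entropy but inverted distributions, so the combined entropy rises. Combining a file with a statistically identical file would give no increase, which corresponds to the "strongly correlated" case described in the text.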

Fig. 1 Isentropic curves for all probabilistic data.

Table 1 (a) Input sequence and three code cases

Bits No. 1 to 44   10001000100100010001001010001000100100010001
2bit               10 00 10 00 10 01 00 01 00 01 00 10 10 00 10 00 10 01 00 01 00 01
3bit               100 010 001 001 000 100 010 010 100 010 001 001 000 100 01
V                  1000 1000 100 1000 1000 100 10 1000 1000 100 1000 1000 1

Table 1 (b) 2 bit code case

2bit pattern   freq   code length   bits
00             9      1             9
01             6      2             12
10             7      2             14
11             0      -             0
total                               35+

Table 1 (c) 3 bit code case

3bit pattern   freq   code length   bits
000            2      2             4
001            4      2             8
010            4      2             8
011            0      -
100            4      2             8
101            0      -
110            0      -
111            0      -
total                               28+

Table 1 (d) Variable bit case

V-bit pattern   freq   code length   bits
1000            8      1             8
100             3      2             6
10              1      3             3
1               1      3             3
total                                20


For the three events in Table 1 (b), allocating one-bit and two-bit codes gives 35 coded bits in total. Next, parsing the sequence with a fixed length of three bits gives Table 1 (c). For four events, allocating two-bit codes gives 28 coded bits in total. In the variable coding case, Table 1 (d) shows an example of a variable code, which gives 20 coded bits in total. Table 1 (e) shows a further fixed code case with four bits, where the total number of bits is either 21 or 22. The variable case still has the smallest number of bits.

This example shows that there exists a variable code that performs better than every fixed code within a designated code length.
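The bit totals of Tables 1 (b) to 1 (e) can be re-derived with a short script. The sketch below is an illustration only: the function names are hypothetical, the variable parse simply cuts the sequence before every "1" (which yields the patterns 1000, 100, 10, and 1 of Table 1 (d)), and the total coded length is computed as the cost of a Huffman code built on the observed pattern frequencies. Bits left over after a fixed parse are ignored.

```python
import heapq
import re
from collections import Counter

# The 44-bit sequence of Table 1 (a), written in groups of four for readability.
SEQ = "1000 1000 1001 0001 0001 0010 1000 1000 1001 0001 0001".replace(" ", "")

def fixed_parse(bits: str, L: int) -> list:
    """Cut the sequence into L-bit patterns; a trailing remainder is ignored."""
    usable = len(bits) - len(bits) % L
    return [bits[i:i + L] for i in range(0, usable, L)]

def variable_parse(bits: str) -> list:
    """Cut before every '1': each token is a '1' followed by its run of zeros."""
    return re.findall(r"10*", bits)

def huffman_total_bits(tokens: list) -> int:
    """Total coded length of the tokens under a Huffman code on their frequencies.

    The cost of a Huffman tree equals the sum of the weights of all merged nodes.
    """
    heap = list(Counter(tokens).values())
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

totals = {f"{L}bit": huffman_total_bits(fixed_parse(SEQ, L)) for L in (2, 3, 4)}
totals["V"] = huffman_total_bits(variable_parse(SEQ))
print(totals)
```

With these assumptions the script gives 35, 28, and 21 bits for the 2-bit, 3-bit, and 4-bit fixed parses and 20 bits for the variable parse, in agreement with the tables.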

An example of T-code generation is described in Fig. 2. The result of the parsing does not necessarily satisfy the prefix condition. However, the parsed codes represent the original sequence and may provide an influential tool for designing a variable parsing code set. According to the T-code generation rule [Gunther et al (1997)], after detaching the rightmost bit “1”, the second bit from the right, “1”, is coded first. In this case, “1” appears twice, so the process proceeds two bits to the left. Next, for the “0”, a new code “0” is defined. This “0” appears three times, giving a count of three for “0”. Then another “0” appears, which is the fourth. In this case, however, because another “1” follows, “10” becomes a newly defined code, extended from the previously generated codes. This “10” code appears four times. The detailed generation rule is described using a recursive formulation.

Table 1 (e) 4 bit code case.

4bit pattern   freq   code length   bits   code length   bits
0001           4      1             4      2             8
0010           1      3             3      2             2
1000           4      2             8      2             8
1001           2      3             6      2             4
total                               21                   22

Fig. 2 T-code generation example

3. Experiment 1

3.1. Fixed Length Coding

The authors analyzed MPEG-2 coded files in order to re-encode them. The length of the fixed delimiting interval is 20 bits or more. The length required for a sufficiently reliable result, $n = k L 2^{L} = 10 \times 20 \times 2^{20}$ bits, is given by formula (1) with L=20 and k=10. There are about one million different 20-bit patterns. The file length in which each of these 20-bit patterns can appear once is about 2.5 Mbytes, which means that an input file needs to be about 25MB. Table 2 shows the file sizes and the corresponding original video lengths needed for each bit length, assuming that each pattern occurs ten times.
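The sizes in Table 2 can be checked numerically. The sketch below assumes that formula (1) reads n = k·L·2^L bits, which is the reading that reproduces the table; the function names and the decimal unit convention (1 MB = 10^6 bytes) are assumptions made for this illustration.

```python
def required_bits(L: int, k: int = 10) -> int:
    """Formula (1) as read here: n = k * L * 2**L bits."""
    return k * L * 2 ** L

def video_seconds(n_bits: int, bitrate_bps: float = 6e6) -> float:
    """Duration of video that produces n_bits at the given bit-rate (6 Mbps in Table 2)."""
    return n_bits / bitrate_bps

for L in (20, 30, 40):
    n = required_bits(L)
    size_bytes = n / 8
    seconds = video_seconds(n)
    print(f"L={L}: patterns={2 ** L}, size={size_bytes:.3e} bytes, "
          f"video={seconds:.0f} s ({seconds / 3600:.1f} h, {seconds / 86400:.1f} days)")
```

For L=20 this gives roughly 26 MB and 35 seconds, for L=30 roughly 40 GB and 15 hours, and for L=40 roughly 55 TB and 848 days.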

3.2. T-code Analysis

Here, as a first step toward obtaining optimal delimiting methods for unknown bit streams in general, T-code analysis is carried out. This is only half of the total coding design, but as the first step we analyze bit streams by T-code and investigate the behavior of the resulting entropy. Table 3(a) shows the entropies of the generated T-codes for 50 different videos. The values are about one half, which implies that the T-code generates codes in a balanced manner. Table 3(b) shows how the entropy increases when two files are combined. Table 3(c) shows further results for combining three files and five files. These values are all normalized to a single input bit, so an entropy value of 0.5 means that the compressed size is 1/2 of the original size. This entropy is not the so-called T-entropy of [Gunther et al (1997)]. Fig. 3 shows the increasing tendency of the T-code entropy as the number of combined files grows. At five files, a tendency toward saturation can be seen. T-code analysis takes a long time to compute, which limits evaluation on longer files.

3.3. Re-encoding of MPEG-2 Coded Files

Table 2 Necessary sizes of file and corresponding original video lengths needed for measuring bit length by assuming ten time occurrence for the files.

Bit length of delimited interval   Number of patterns (2^L)   Size of files   Video length (6Mbps)
20                                 1048576                    26MB            35 sec
30                                 1073741824                 40GB            15 hours
40                                 1099511627776              55TB            848 days

Based on the concepts above, the coding efficiency of MPEG-2 coded files is examined within a V-V coding paradigm. Fig. 4 shows the V-V coding design and the coding execution. Regarding the coding of motion vectors, Yu et al presented two-dimensional motion vector coding for low bit-rate video phones. However, their Huffman coding was generated by a JPEG procedure, and the number of motion vector bits was equivalent to only about one frame of full D1 digital video size, 720×480 [Yu et al (1995)]. Shimizu et al proposed a method using a norm-and-angle representation of motion vectors. The method was complicated, and two separate codes were still used. Matsuda et al proposed a lossless re-encoding scheme for MPEG-1 video. They used an arithmetic coder for both the DCT and the motion vector data of the MPEG-1 coder. To realize the whole system, we propose, as the first step, a re-encoding method that integrates two existing variable-length codes into a single code. According to information theory, blocking source inputs brings the coding efficiency closer to the lower bound given by the entropy. For a Markov source, blocking gives even greater gains. In actual MPEG-2 encoding, 2D-VLC, which blocks the run-length of zeros together with the amplitude in DCT coefficient coding, is the only example of blocking source events. In this paper, we construct block coding for motion vector coding. Motion vector codes of MPEG-2 consist of a set of 16 symmetrical variable-length codes from 1 bit to 10 bits and a one-bit code for the zero vector. In the rest of this section, several experiments are carried out to improve the coding algorithm.
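The gain from blocking correlated source events can be illustrated with synthetic data. The following sketch is not taken from the paper: the source model (a symbol repeats its predecessor with probability 0.8, otherwise it is redrawn uniformly from four values), the seed, and the function names are all assumptions chosen for illustration. It compares the entropy of single symbols with the per-symbol entropy of non-overlapping pairs.

```python
import math
import random
from collections import Counter

def entropy(counts: Counter) -> float:
    """Shannon entropy in bits of an empirical distribution given as counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def per_symbol_entropies(symbols: list) -> tuple:
    """Return (H1, H2/2): entropy of single symbols and of pairs, both per symbol."""
    assert len(symbols) % 2 == 0
    h1 = entropy(Counter(symbols))
    pairs = zip(symbols[0::2], symbols[1::2])   # non-overlapping pairs
    h2 = entropy(Counter(pairs))
    return h1, h2 / 2

# Hypothetical correlated (Markov-like) source, for illustration only.
rng = random.Random(0)
symbols = [0]
for _ in range(9999):
    symbols.append(symbols[-1] if rng.random() < 0.8 else rng.randrange(4))

h1, h2_half = per_symbol_entropies(symbols)
print(f"single-symbol entropy {h1:.3f} bit, pair entropy per symbol {h2_half:.3f} bit")
```

For empirical distributions the pair entropy per symbol can never exceed the single-symbol entropy, and for a correlated source such as this one it is clearly smaller, which is the effect that the integration of motion vectors tries to exploit.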

Table 3(a) Entropies of T-codes for videos.

video    entropy    video    entropy
01.mpg   0.534372   26.mpg   0.524720
02.mpg   0.539007   27.mpg   0.523816
03.mpg   0.529123   28.mpg   0.526155
04.mpg   0.532101   29.mpg   0.539802
05.mpg   0.539262   30.mpg   0.535905
06.mpg   0.538966   31.mpg   0.540882
07.mpg   0.537607   32.mpg   0.536436
08.mpg   0.532928   33.mpg   0.534316
09.mpg   0.529504   34.mpg   0.537676
10.mpg   0.535544   35.MPG   0.534722
11.mpg   0.527797   36.mpg   0.537318
12.mpg   0.524526   37.mpg   0.533381
13.mpg   0.525225   38.mpg   0.536857
14.mpg   0.528234   39.mpg   0.540498
15.mpg   0.533929   40.mpg   0.540714
16.mpg   0.531385   41.mpg   0.539724
17.mpg   0.535190   42.mpg   0.538251
18.mpg   0.527166   43.mpg   0.539320
19.mpg   0.522915   44.mpg   0.531534
20.mpg   0.538911   45.mpg   0.542534
21.mpg   0.530039   46.mpg   0.528326
22.mpg   0.538085   47.mpg   0.526850
23.mpg   0.520653   48.mpg   0.532450
24.mpg   0.537217   49.mpg   0.534462
25.mpg   0.525208   50.mpg   0.533274

Table 3(b) T-code entropy for combined files.

videos      entropy
01+02.mpg   0.537676
11+12.mpg   0.528685
21+22.mpg   0.537929
31+32.mpg   0.543184
41+42.mpg   0.545325

Table 3(c) T-code entropy for combined files. Three files and five files case.

videos               entropy
01+02+03.mpg         0.541919
11+12+13.mpg         0.527247
21+22+23.mpg         0.538656
31+32+33.mpg         0.543422
41+42+43.mpg         0.547081
01+02+03+04+05.mpg   0.538796

Fig.3 Increasing tendency of T-code’s entropy vs. the number of connected files.

Fig.4 Design of V-V coding and re-encoding.


3.4. Symmetrical coding

In this subsection, 16 different non-zero events and a zero event are coded as a pair of two consecutive original VLCs. There is an additional sign bit to represent positive or negative for non-zero motion vectors. In this experiment, the sign bit is a fixed single bit and is excluded from the calculation of efficiency. This scheme is called “symmetrical coding” because positive and negative events are coded by the same codes. The motion vectors appear as a pair of horizontal and vertical vectors, as shown in Fig. 5. With this format, we can block the horizontal vector and the vertical one into a single code. Table 4 shows the frequency of motion vectors in a video encoded by the MPEG-2 encoder TM-5. The video size is full D1 (720×480) and the length is half a second. Table 5 shows the entropy of coded bits with the improvement ratios of the compression rates. The entropy of blocked motion vectors is reduced by at most 27% from the original MPEG-2 coded bits. Table 6 shows a part of the constructed codes. The Huffman codes were generated using free software by Marcus Geelnard, available at http://bcl.comli.eu/.

Fig.5 MPEG-2 macroblock layer structure. Ext=Macroblock_extension_code, DT=DCT_type, Q=macroblock quantization step, mvH=motion vector for horizontal direction, mvV=for vertical direction. The numerical values in the bottom sections of boxes are allocated bits for the codes.

[Figure 5 boxes, with allocated bits in parentheses: Ext (11) | Address (1-11) | Type (1-9) | DT, Q (1, 5) | mvH (1-11+) | mvV (1-11+)]

Table 4 Frequency of motion vectors of a video. (without sign bit consideration, video ‘car’ 0.5 sec)

mv     1       2      3      4     5     6
freq   50276   7784   1748   840   501   389
mv     7       8      9      10    11    12
freq   470     623    267    138   187   135
mv     13      14     15     16    17
freq   253     126    230    257   50

Table 5 V-V re-encoding of motion vectors. MPEG-2 TM-5. (without sign bit consideration)

Measuring scheme           Bits              Improvement ratio
Coded bits of mv           1.62 (bit/mv)     1.0
Entropy of mv              1.30 (bit/mv)     0.80
Entropy of 2mv             2.38 (bit/2mv)    0.73
Coded bits of 2mv          2.47 (bit/2mv)    0.77
Coded bits of 2mv per mv   1.24 (bit/1mv)    0.77
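The single-vector entropy in Table 5 can be reproduced from the frequencies of Table 4. The sketch below is an illustrative check, with hypothetical variable and function names; it takes the 17 frequencies from Table 4 and the coded bit count of 1.62 bit/mv from Table 5. The pair statistics needed for the "2mv" rows are not given in the paper, so only the first two rows are recomputed.

```python
import math

# Frequencies of motion vector events 1..17 from Table 4 (video 'car', 0.5 sec).
FREQ = [50276, 7784, 1748, 840, 501, 389, 470, 623, 267,
        138, 187, 135, 253, 126, 230, 257, 50]
CODED_BITS_PER_MV = 1.62   # "Coded bits of mv" from Table 5

def entropy_bits(freqs: list) -> float:
    """Shannon entropy in bits per event of the empirical distribution."""
    total = sum(freqs)
    return -sum(f / total * math.log2(f / total) for f in freqs if f > 0)

h_mv = entropy_bits(FREQ)
ratio = h_mv / CODED_BITS_PER_MV
print(f"entropy of mv = {h_mv:.2f} bit/mv, ratio to coded bits = {ratio:.2f}")
```

The result agrees with the "Entropy of mv" row of Table 5 (about 1.30 bit/mv, ratio about 0.80).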

3.5. Influence of Video Length

In this subsection, we examine the length of input video needed for these experiments. To examine performance precisely, it is better to use video test sequences that are as long as possible. On the other hand, it is better to keep time and data volume to a minimum. New re-encoding results for several partial sequences cut out from a single video are listed in Table 8. Fig. 6 plots these results. Starting from 0.5 seconds, the amount of information decreases gradually toward 16 seconds. The least-squares regression lines on the graph decrease gradually and nearly converge. The videos used in this paper are listed in Table 7. The video used to examine the influence of video length is No.4 “autobahn”. The size of the motion picture is 720x480. The bit-rate of encoding is 8Mbps. Sixteen different non-zero events and a zero event, including the sign bit that represents positive or negative for non-zero motion vectors (MV), are coded as a pair of two consecutive original VLCs. In this experiment, the sign bit is included in the newly generated VLCs and in the calculation of efficiency. In view of these results, we choose a video length of 4 seconds for the following experiments.

Table 6 V-V codes for a pair of motion codes. (part) (without sign bit consideration)

Index   Code value   Bit pattern        Bit
1       1            1                  1
2       15           01111              5
3       33           0100001            7
4       79           01001111           8
5       436          0110110100         10
6       291          100100011          10
7       851          1101010011         11
8       616          1001101000         11
9       2743         0101010110111      13
10      3539         0110111010011      13
11      2688         0101010000000      13
12      2742         0101010110110      13
13      5378         01010100000010     14
14      2905         0101101011001      13
15      10759        010101000000111    15
16      10758        010101000000110    15
17      0            00                 2
18      29           011101             6
19      219          011011011          9


3.6. Influence of bit-rate

In this subsection, the influence of bit-rate is examined. There are many choices of bit-rate in MPEG-2 encoding, so it is important to check the effect of bit-rate on the design of the re-encoding code. Fig. 7 shows the bit-rate characteristics, with a log scale on the horizontal axis. “Bits” means the coded bits of the original MPEG-2 motion vectors. “Ent1” means the entropy of the motion vectors of MPEG-2. “Ent2/2” means the entropy of the motion vectors obtained by the proposed re-encoding method. In general, the values decrease as the bit-rate increases, but the improvement ratio from MPEG-2 to re-encoding appears to stay the same. The entropy is larger at low bit-rates, which means that the role of motion vectors is large and that more bits may be required for them to describe the video. At high bit-rates, the smaller entropy means that there may be redundancy and that not all motion vectors are necessarily required. These observations agree with the earlier comments in [Yu et al (1995)] and [Shimizu et al (2001)] that improving motion vector coding is effective at low bit-rates.

Table 7 Video sequences used in this paper.

No.   name        content                           original size
1     car         a taxi left to right              SD: 720x480
2     giraffe     jiggle by hand movement           SD: 720x480
3     cherry      swinging cherry blossom in wind   HD: 1920x1080
4     autobahn    highway driving                   HD: 1920x1080
5     rugby(75)   rugby game in television          SD: 720x480

Fig.6 Video length influence of MPEG-2 re-encoding for the video No.4, at 8Mbps.

Table 8 Video length influence of MPEG-2 re-encoding. Video is No.4, 8Mbps.

Length of video [Sec]   bits   Ent1   Ent2/2
0.5                     3.95   3.57   3.37
1.0                     3.90   3.53   3.34
4.0                     3.75   3.43   3.27
16.0                    3.84   3.50   3.34

3.7. Evaluation of a variety of videos

Table 8 and Fig. 8 show an overall comparison of re-encoding efficiency. The first column of Table 8 gives the number of coded bits of motion vectors for the original MPEG-2. Ent1 means the entropy of the motion vectors of MPEG-2. Ent2/2 means the entropy of the motion vectors under the re-encoding method. In Fig. 8, the improvement ratios from MPEG-2 to re-encoding are large for videos No.1, No.4 and No.5, but small for videos No.2 and No.3. Videos No.2 and No.3 are characterized by relatively small motion, whereas videos No.1, No.4 and No.5 have scenes with large motion. “Ave” in Fig. 8 means the average of the five results.

Fig. 7. Bit rate characteristics for the video No.4 with a duration of four seconds.

Fig. 8. Overall comparison of re-encoding efficiency. Video length is four seconds for No. 1-5. Bit-rate is 8Mbps.

3.8. Comparison of quantity of the number of MVs

Table 9 shows the increase in the number of motion vector bits used in the experiments of this paper. A large amount of video data is used to analyze the methods in detail and to improve the reliability of the experiments. The amount is about 30 times that of the conventional experiments by Yu et al. Fig. 9 (a)-(d) are sample pictures of the videos used in these experiments; (e), which is a rugby game from a television broadcast, is not shown. The former four videos are presented on the author’s homepage [Ohzeki, K., (2010_b)].

Fig. 9(a) Video 1 car

Fig. 9(b) Video 2 giraffe

Table 9 Increased bits of motion vectors in this paper compared to the conventional paper [Yu et al (1995)].

Yu’s experiments                             Our experiments
video sequences     Bits of motion vectors   video sequences   Bits of motion vectors
Miss Am             6987                     1. car            110464
Mother & Daughter   9135                     2. giraffe        397424
Salesman            5377                     3. cherry         412304
Car Phone           14961                    4. autobahn       246924
Foreman             22105                    5. rugby          615714
total               58565                    total             1782830

Fig. 9(c) Video 3 cherry

Fig. 9(d) Video 4 autobahn


4. Experiment 2

Based on the experimental results in the previous section, further experiments with four motion vectors are carried out. The efficiency increases as the number of integrated motion vectors increases. However, the size of the Huffman code database also grows, which requires us to verify, during the generation of the new code, that all possible events occur a sufficient number of times. To satisfy this condition, we must prepare a large amount of video. In practice, the integration of motion vectors is limited to about 4 or 6. Keeping these conditions in mind, we increase the number of integrated motion vectors in this section.

4.1. Number of Integration

As the number of integrated motion vectors increases, we must take care that code patterns do not fail to occur. First, the numbers of patterns of integrated motion vectors are listed in Table 10. For the second row of Table 10 (integration of two motion vectors), we can observe and confirm all 1089 patterns in experiments. For the third row, we must prepare videos long enough to construct all the Huffman codes. To investigate how many minutes of video are needed to construct a correct set of Huffman codes, the authors counted the patterns obtained from videos of various lengths. Fig. 10 shows the occurrence ratio of code patterns against video length for the three integration cases. For the first case, which uses the original MPEG-2 motion vector codes without integration, a video of 15 seconds is sufficient to provide all 33 code patterns. For the second case, in which a pair of motion vectors is integrated, 15 seconds of video provides only 95% of the 1089 patterns, while 600 seconds is sufficient to provide all of them. For the third case, in which four motion vectors are integrated, even 1800 seconds is not sufficient. Extrapolating the graph to the right, we estimate that 3600 seconds of video may be sufficient to provide all patterns.
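The growth of the code table with the number of integrated motion vectors can be sketched as follows. The decomposition of the 33 single-vector patterns into 16 non-zero events with a sign plus one zero event is an assumption based on the description in Section 3.5, and the names used are hypothetical.

```python
# Assumed decomposition of the 33 patterns of one motion vector code:
# 16 non-zero events, each positive or negative, plus the zero vector.
NONZERO_EVENTS = 16
PATTERNS_PER_MV = 2 * NONZERO_EVENTS + 1   # = 33

def integrated_patterns(m: int) -> int:
    """Number of possible code patterns when m motion vectors are integrated."""
    return PATTERNS_PER_MV ** m

for m in (1, 2, 4):
    print(m, integrated_patterns(m))
```

The pattern count grows exponentially with the number of integrated vectors, which is why much longer videos are needed before every pattern of the four-vector case has been observed.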

Table 10 The number of integration of motion vectors and the number of integrated motion vector patterns.

The number of integration   The number of MV patterns
1                           33
2                           1089
4                           1185921
6                           1406408618241

In preparing video materials, another problem is that a proper variety of all kinds of videos should be collected. If we simply increase the number of videos without considering the variety of their contents, the resulting codes may not cover all patterns. Table 11 shows the occurrence ratios of code patterns with respect to video contents. “Live-video” means live-action or real-scenery videos taken by video cameras. “Animation” means videos generated artificially by computer graphics technology. The length of these videos is 150 seconds. The number of code patterns of the animation is smaller than that of the live videos. Another test, for Huffman code generation, was also carried out. Test files that substitute for long videos were made from redundant numbers. Their lengths range from 100 to 10,000,000. Treating these files as motion vectors, a Huffman code set and an arithmetic code set were generated. Coding tests of these files with the Huffman codes and the arithmetic codes were carried out to measure compression ratios and coding times. Table 12 shows the coding efficiency for these test files, and Table 13 shows the coding speed in terms of required time.

4.2. Large Scale Encoding

Based on the considerations above, we decided to code the motion vectors of video files with 50 different varieties of content. Each is 15 seconds long, so the total amount of video is 750 seconds. According to Fig. 10, this is not sufficient for the case of integrating four motion vectors. By linear interpolation between 600 and 1800 seconds in Fig. 10, the occurrence ratio of code patterns at 750 seconds is 40%.

Table 11 Occurrence ratios of code patterns with respect to video contents.

integration   live-video   animation
1             100%         100%
2             94.9%        89.0%
4             5.1%         2.2%

Fig.10 Occurrence ratio of code patterns vs. video length for three integration cases. “1” uses the MPEG-2 original motion vectors. “2” is the case of the integration of a pair of motion vectors. “4” is the case of the integration of four motion vectors.


This ratio of 40% means that the complete Huffman code set requires the remaining 60% of the codes. Assuming that the remaining 60% of the code set follows the same distribution, we can add bits to the result obtained with the 40% of the total Huffman code set. The number of additional bits is

$$ \log_2 \frac{1}{0.4} = 1.321928. \qquad (2) $$
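The value in formula (2) can be checked directly. The sketch below reflects one reading of the compensation, stated here as an assumption: if only a fraction r of the code set has been observed and the unobserved part follows the same distribution, the complete code set is 1/r times larger, which costs log2(1/r) additional bits. The function name is hypothetical.

```python
import math

def compensation_bits(occurrence_ratio: float) -> float:
    """Additional bits, log2(1/r), when only a fraction r of the code set was observed."""
    if not 0.0 < occurrence_ratio <= 1.0:
        raise ValueError("occurrence ratio must be in (0, 1]")
    return math.log2(1.0 / occurrence_ratio)

extra = compensation_bits(0.4)   # 40% occurrence ratio at 750 seconds
print(f"additional bits: {extra:.6f}")
```

With an occurrence ratio of 1.0 the compensation is zero, which matches the cases in which all patterns are observed.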

Table 14 shows the compression results for the three integration cases with Huffman coding and arithmetic coding. The numbers are slightly different from the results of Section 3, because the video materials were renewed to obtain long and reliable results. The difference in coding efficiency between the Huffman code and the arithmetic code is small in this case. For cases 1 and 2, the results can be used as they are; for the case of integrating four motion vectors, the result should be compensated by the value given by formula (2).

Table 12 Coding efficiency for long test sequence.

Files   Length     Huffman coding   arithmetic coding
01      100        40.3%            56.0%
02      1000       33.3%            37.1%
03      10000      30.9%            31.1%
04      100000     31.5%            31.0%
05      1000000    32.9%            32.1%
06      1000000    31.5%            30.9%
07      1000000    32.3%            31.6%
08      1000000    31.9%            31.0%
09      1000000    33.1%            32.3%
10      10000000   31.7%            31.1%

Table 13 Coding time for long test sequence.

Files   Length     Huffman coding   arithmetic coding [Bodden, E]
01      100        0.03ms           0.2ms
02      1000       0.1ms            1.1ms
03      10000      0.8ms            8.3ms
04      100000     9.0ms            77ms
05      1000000    157ms            808ms
06      1000000    98ms             716ms
07      1000000    138ms            746ms
08      1000000    127ms            721ms
09      1000000    131ms            738ms
10      10000000   704ms            6312ms


The values compensated using formula (2) are listed in Table 15, together with the other data, which are the same as in Table 14. The reducing ratios of the amount of coded bits, from the original MPEG-2 to each integration of motion vectors, are listed in the right column. For these video files, a reduction of about 14% is seen for case 2, and a reduction of 21-22% is seen for case 4.
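The reducing ratios for four-vector integration can be reproduced with a short sketch. It assumes that every ratio in Table 15 is taken relative to the Huffman result without integration, an inference from the value 1.0 in that row rather than something the text states. The names are illustrative.

```python
BASELINE_BITS = 6.575236  # Table 15, Huffman, no integration (assumed baseline)

# Compensated coded bits per motion vector for four-vector integration (Table 15).
compensated_bits = {
    "Huffman": 5.1957401,
    "arithmetic": 5.097328,
}

def reducing_ratio(bits, baseline=BASELINE_BITS):
    """Coded bits relative to the assumed baseline."""
    return bits / baseline

def reduction_percent(bits, baseline=BASELINE_BITS):
    """Reduction of coded bits in percent relative to the assumed baseline."""
    return 100.0 * (1.0 - reducing_ratio(bits, baseline))

for coder, bits in compensated_bits.items():
    print(coder, round(reducing_ratio(bits), 7), round(reduction_percent(bits), 1))
```

Under this assumption, the two ratios match the four-vector rows of Table 15 and give reductions of roughly 21% and 22%.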

4.3. Validity of Results

Finally, we verify the results by an incremental observing method. First, by the law of large numbers, the result becomes more reliable as the amount of data increases. Then, by the central limit theorem, we can derive the behavior in a small region. The central limit theorem states that, even if the original distribution is not known, the error between the average of the real distribution and the average of the sampled values approaches a Gaussian distribution as the number of samples increases. The incremental observing method is the measuring step described in Fig.11. The function is an entropy function or an actual calculation of the number of bits for coding. The real average is a constant that is not yet known; because it is a constant, we can neglect its value when deciding convergence.
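One possible reading of the incremental observing method is sketched below. It assumes that the measured function is the empirical entropy of the symbols seen so far, and that convergence is declared when two consecutive estimates differ by less than a tolerance, so the unknown true average never enters the decision. The step size, the tolerance, and all names are hypothetical; the procedure in Fig.11 may differ in detail.

```python
import math
from collections import Counter

def empirical_entropy(counts, total):
    """Entropy in bits per symbol of the empirical distribution."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def incremental_observation(symbols, step, tolerance):
    """Add data in chunks of `step` symbols and re-estimate after each chunk.

    Returns (symbols seen, last estimate, converged flag). Convergence is
    decided from the change between consecutive estimates only.
    """
    counts = Counter()
    previous = None
    seen = 0
    estimate = 0.0
    for start in range(0, len(symbols), step):
        chunk = symbols[start:start + step]
        counts.update(chunk)
        seen += len(chunk)
        estimate = empirical_entropy(counts, seen)
        if previous is not None and abs(estimate - previous) < tolerance:
            return seen, estimate, True
        previous = estimate
    return seen, estimate, False

# Hypothetical data: a perfectly balanced binary sequence.
seen, estimate, converged = incremental_observation([0, 1] * 500, 100, 1e-9)
print(seen, estimate, converged)
```

A stricter rule could require several consecutive small changes before stopping, which guards against a single chunk that happens to leave the estimate unchanged.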

Table 14 Coding efficiency for long videos with respect to the number of integration of motion vectors before compensation.

coding       The Number of Integration   Coded bits
Huffman      1                           6.575236 (bit/mv)
Huffman      2                           5.337251 (bit/1mv)
Huffman      4                           3.873812 (bit/1mv)
arithmetic   1                           6.431969 (bit/mv)
arithmetic   2                           5.216413 (bit/1mv)
arithmetic   4                           3.775428 (bit/1mv)

Table 15 Complete coding efficiency for long videos with respect to the number of integration (NoI) of motion vectors, and reducing ratio.

coding       NoI   Coded bits            Reducing ratio
Huffman      1     6.575236 (bit/mv)     1.0
Huffman      2     5.337251 (bit/1mv)    0.8606808
Huffman      4     5.1957401 (bit/1mv)   0.7901983
arithmetic   1     6.431969 (bit/mv)     0.9782111
arithmetic   2     5.216413 (bit/1mv)    0.859594
arithmetic   4     5.097328 (bit/1mv)    0.7752312



According to this investigation, we prepare videos of five lengths: 15 seconds, one minute, three minutes, ten minutes, and thirty minutes. The occurrence ratios of motion vector patterns versus the lengths of the videos are shown in Fig. 12. For a number of integration (NoI) of 1, which is the original case with no integration, a short video of 15 seconds is enough to collect all events for making Huffman codes with the videos used in this paper. For NoI=2, 10 minutes of video are needed to collect all patterns, as far as these video samples are concerned. Finally, for NoI=4, 30 minutes are not sufficient: a 30-minute video yields only 50% of all patterns. The trend of the graph suggests that a one-hour video may yield all patterns. However, reaching 100% is not always sufficient to show that the generated codes are reliable. All patterns should occur not just once but enough times to give a reliable probability distribution of the code patterns.
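The occurrence ratio and the requirement that every pattern occur often enough can be sketched as follows. The sketch assumes that integration groups consecutive, non-overlapping motion vector symbols into tuples of length NoI, and that the set of all patterns is every possible tuple over a known alphabet. Both are assumptions for illustration, as are the names and the toy data.

```python
from collections import Counter

def integrate(vectors, noi):
    """Group consecutive symbols into non-overlapping tuples of length noi."""
    usable = len(vectors) - len(vectors) % noi
    return [tuple(vectors[i:i + noi]) for i in range(0, usable, noi)]

def occurrence_ratio(vectors, noi, alphabet_size):
    """Fraction of all possible integrated patterns that were observed."""
    patterns = set(integrate(vectors, noi))
    return len(patterns) / alphabet_size ** noi

def all_patterns_seen_often(vectors, noi, alphabet_size, min_count):
    """True if every possible pattern occurs at least min_count times."""
    counts = Counter(integrate(vectors, noi))
    return (len(counts) == alphabet_size ** noi
            and min(counts.values()) >= min_count)

# Hypothetical toy sequence over a two-symbol alphabet.
toy = [0, 1, 0, 1, 1, 1, 0, 0]
for noi in (1, 2, 4):
    print(noi, occurrence_ratio(toy, noi, 2))
```

Even in this toy case the ratio falls quickly as NoI grows, because the number of possible patterns grows exponentially with NoI while the number of observed tuples shrinks.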

Fig. 11 The incremental observing method.


5. Conclusion

A new re-encoding paradigm is reviewed, and two-dimensional and four-dimensional semi-optimization is examined. A large amount of video data is used to analyze the methods in detail and to improve the reliability of the experiments. Two-dimensional Huffman re-encoding with V-V codes gives good coding efficiency. By integrating four motion vectors into a single code, a reduction of 21-22% is obtained for longer and more varied videos. The convergence characteristics of the results are verified by an incremental observing method. Although the result is restricted to the motion vector part, the coding efficiency improves by 21-22% compared with the conventional MPEG-2 method using F-V codes. The results appear stable with respect to video length and bit-rate, and they may converge with respect to the variety of video contents.

Acknowledgments

The author thanks Dr. Mark Titchener who provided tcalc software to calculate T-code in this paper. He also thanks Mr. T. Kato who carried out experiments of T-code analysis.

Fig. 12 Ratios of occurrence of motion vectors vs. lengths of videos for the three cases of the Number of Integration (NoI) of motion vectors.

References

Abrahams, J. (1997). Code and parse trees for lossless source encoding, in Proceedings of Compression and Complexity of Sequences, pp. 145-171, Jun.

Bodden, E., Clasen, M., and Kneis, J. Arithmetic Coding, http://www.bodden.de/legacy/arithmetic-coding/

Gunther, Ulrich et al. (1997). Representing Variable-Length Codes in Fixed-Length T-Depletion Format in Encoders and Decoders, in J. of Universal Computer Science, Vol. 3, No. 11, pp. 1207-1225, Springer.

Han, Te Sun and Kobayashi, Kingo. (2001). Mathematics of Information and Coding, American Mathematical Society, Boston, MA, USA.

Matsuda, I., Wakabayashi, K., Ikeda, Y., and Itoh, S. (2009). A Lossless Re-encoding Scheme for MPEG-1 Video, in Proceedings of 17th European Signal Processing Conference (EUSIPCO-2009), pp. 1834-1838.

Matsui, Y., and Kida, T. (2009). Study on Efficiency of Tunstall-Huffman Code, in Proc. of Data Engineering and Information Management DEIM Forum i1-28 (in Japanese). http://db-event.jpn.org/deim2009/proceedings/files/i1-28.pdf

Ohzeki, K., Kato, T., and Gi, E. (2010). Basic consideration for lossless re-encoding of MPEG coded files using V-V codes, in IEICE Technical Report, IE2010-6, pp. 31-36, April (in Japanese).

Ohzeki, K. (2010_b). http://www.sic.shibaura-it.ac.jp/~ohzeki/oz4c/mmap/videos/index.html

Savari, S.A., and Szpankowski, W. (2002). On the analysis of variable-to-variable length codes, in Proceedings. IEEE International Symposium on Information Theory, page 176.

Shimizu, A., Sagata, A., Kamikura, K., and Kobayashi, N. (2001). Motion Vector Coding by Using Representation of Norm and Angle Components, in IEICE Trans. J84-D-II(11), 2379-2386 (in Japanese).

Titchener, Mark. (1984). Digital encoding by means of new T-codes to provide improved data synchronization and message integrity, in Technical Note, IEE Proceedings, Volume 131, Pt. E, Number 4, July, pp. 51-53.

Yu, Guo Yao and Chen, Cheng-Tie. (1995). Two-dimensional motion vector coding for low bit rate videophone applications, in Proc. ICIP, vol. 2, pp. 2414.

Yamamoto, H. and Yokoo, H. (2001). Average-Sense Optimality and Competitive Optimality for Almost Instantaneous VF Codes, in IEEE Trans. IT, 47(6): pp. 2174-2184, Sept.

Ziv, Jacob. (1990). Variable-to-fixed length codes are better than fixed-to-variable length codes for Markov sources, in IEEE Trans. IT, 36(4): pp. 861-863, July.
