
Fault Tolerance Technique in Huffman Coding applies to Baseline JPEG

Cung Nguyen and Robert G. Redinbo
Department of Electrical and Computer Engineering
University of California, Davis, CA
email: cunguyen, [email protected]

Abstract

Faults due to incorrect functioning of the computation system, or transmission errors in the internal data, can corrupt the output code stream of the Huffman encoder. In this paper, a fault detection method is proposed for the Huffman encoding system as implemented in the JPEG image compression standard [1]. The detection method, based on the information input and the code stream output, is described. Practical test results show an improvement in the quality of the reconstructed image.

Index Terms: Huffman coding, zigzag sequence, runlength coding.

I. INTRODUCTION

Huffman coding [2] is a very effective technique for compressing data; savings of 20% to 90% are typical, depending on the characteristics of the data being compressed. Huffman coding assigns shorter codewords to more probable symbols and longer codewords to less probable symbols. This coding technique is adopted by the JPEG image compression standard. In the JPEG standard, two statistical models are used in the Huffman DCT-based sequential mode. First, the DCT basis functions are almost completely decorrelated [3] (pp. 150-57), so the coefficients can be compressed independently without concern about correlation between them. Second, the 63 AC coefficients arranged in zigzag order have decreasing probability of occurrence. Based on these statistical models, code tables for the DC and AC coefficients are generated and stored in memory before the bit stream is produced. These codes are retrieved as the compression progresses.

It has long been recognized that errors can seriously degrade data that have been compressed, whether by lossless or lossy methods [4], [5]. Combined source-channel coding techniques have been developed that integrate the compression steps with transmission error-correcting coding [6]. Such techniques are useful for modern video compression methods [7]. However, there are other sources of errors resulting from failures in the computing resources that compress and transmit the data.

The Huffman coder runs on computing resources that are potentially affected by failures: unexpected changes of data bits due to physical faults in the system, and transient faults due to noise, such as absorption of alpha particles in space [8], electromagnetic interference, X-rays from nearby equipment, and overheating. These sources of errors can corrupt the output code stream.

This research was supported in part by the NSF through grant CCR-0104851.


For these reasons, fault tolerance becomes the main issue to be solved. The fault tolerance technique for the Huffman coder is based on extra hardware components and extra time combined with the original tasks. In this work, redundancy check bits are generated and compared with parity bits generated from the output code stream. When an error is detected, the system requests repeat encoding of the defective data block.

The rest of this paper is divided into three sections. The first section overviews the Huffman encoding algorithm and the data structures that support the coding process. In the second section, a redundancy subsystem is proposed to detect errors occurring in the Huffman code stream output. The third section provides the experimental results and the improvements in terms of image quality and image reconstruction error.

II. OVERVIEW OF JPEG HUFFMAN CODING

To simplify the problem, only the fault-tolerant JPEG Huffman entropy coder implementation for baseline mode operation is discussed in this paper; however, the design principle can be applied to the extended systems [1]. Baseline sequential coding is for images with 8-bit samples, uses Huffman coding only, and its decoder can store only one AC and one DC Huffman table. After quantization and prior to entropy coding, there are usually a few nonzero and many zero-valued coefficients. The task of entropy coding is to encode these few coefficients efficiently. The description of baseline sequential entropy coding is given in two steps: conversion of the quantized DCT coefficients into an intermediate sequence of symbols, and assignment of variable-length codes to the symbols.

In the intermediate symbol sequence, each nonzero AC coefficient is represented in combination with the "runlength" of zero-valued AC coefficients which precede it in the zigzag sequence. Each such runlength/nonzero-coefficient combination is represented by a pair of symbols:

symbol-1 ⇔ (RUNLENGTH, SIZE) (1)

symbol-2 ⇔ (AMPLITUDE) (2)

symbol-1 represents two pieces of information, RUNLENGTH and SIZE; symbol-2 represents the single piece of information designated AMPLITUDE, which is simply the amplitude of the nonzero AC coefficient. RUNLENGTH is the number of consecutive zero-valued AC coefficients in the zigzag sequence preceding the nonzero AC coefficient being represented. SIZE is the number of bits used to encode AMPLITUDE. RUNLENGTH represents zero runs of length 0 to 15. Actual zero runs in the zigzag sequence can be longer than 15, so the symbol-1 value (15, 0) is interpreted as an extension symbol with runlength 16. There can be up to three consecutive (15, 0) extensions before the terminating symbol-1, whose RUNLENGTH value completes the actual runlength. The terminating symbol-1 is always followed by a single symbol-2, except for the case in which the last run of zeros includes the last (63rd) AC coefficient. In this frequent case, the special symbol-1 value (0, 0), meaning EOB (end of block), is used to terminate the 8×8 sample block.
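As a hedged illustration of the conversion rules above (a sketch in Python, not the paper's implementation; the function name is my own), the intermediate symbol sequence can be produced as follows:

```python
def ac_to_symbols(ac_coeffs):
    """Convert 63 zigzag-ordered AC coefficients into
    (RUNLENGTH, SIZE) / AMPLITUDE symbol pairs."""
    symbols = []
    run = 0
    for amp in ac_coeffs:
        if amp == 0:
            run += 1
            continue
        # A zero run longer than 15 is split with (15, 0) extension symbols.
        while run > 15:
            symbols.append(((15, 0), None))
            run -= 16
        size = abs(amp).bit_length()  # SIZE = bits needed for AMPLITUDE
        symbols.append(((run, size), amp))
        run = 0
    if run > 0:  # a trailing run of zeros is terminated by EOB = (0, 0)
        symbols.append(((0, 0), None))
    return symbols
```

Applied to the AC part of the example block in Figure 2 (0, -2, -1, -1, -1, 0, 0, -1 followed by 55 zeros), this yields (1,2)(-2), three (0,1)(-1), (2,1)(-1), and EOB.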

A DC difference Huffman code is the concatenation of two codes. The first code, dDC, the so-called difference category, defines the SIZE of the difference value. The second code, vDC, carries the additional bits of the DC difference. Let ΔDC denote the DC difference, k the SIZE of the DC difference, and B the equivalent decimal value of the code vDC. If B ≥ 2^(k−1), then ΔDC = B; otherwise, ΔDC = B − 2^k + 1.
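The mapping between ΔDC, k, and B can be sketched as follows (my illustration; the function names are assumptions, not from the paper):

```python
def encode_dc_difference(diff):
    """Return (k, additional bits) for a DC difference:
    k = SIZE of |diff|; negative values are offset by 2**k - 1."""
    k = abs(diff).bit_length()
    B = diff if diff >= 0 else diff + 2 ** k - 1
    return k, format(B, '0{}b'.format(k)) if k else ''

def decode_dc_difference(bits, k):
    """Invert the rule above: if B >= 2**(k-1) then diff = B,
    otherwise diff = B - 2**k + 1."""
    B = int(bits, 2) if bits else 0
    return B if B >= 2 ** (k - 1) else B - 2 ** k + 1
```

For example, diff = 3 gives (k, bits) = (2, '11'), and diff = −3 gives (2, '00').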

The possible range of quantized AC coefficients determines the range of values of both the AMPLITUDE and the SIZE information. If the input data are N-bit integers, the non-fractional part of the DCT coefficients can grow by at most 3 bits. Baseline sequential coding has 8-bit integer source samples in the range [−2^7, 2^7−1], so quantized AC coefficient amplitudes are covered by integers in the range [−2^10, 2^10−1]. The signed-integer encoding uses symbol-2 AMPLITUDE codes of 1 to 10 bits in length; hence SIZE represents values from 1 to 10, and RUNLENGTH represents values from 0 to 15 as discussed previously.

Fig. 1. General Structure for a Baseline JPEG Huffman Coder. The AC Huffman encoder (subject to errors) maps (RUNLENGTH-SIZE) symbols through an address decoder into the (RUNLENGTH-SIZE) Huffman code table and appends the AMPLITUDE code, producing the code stream for AC coefficients; the DC Huffman encoder (subject to errors) similarly maps the difference DC category and value through the difference DC value Huffman code table, producing the code stream for the DC differences.

Figure 1 shows a general structure of the baseline Huffman encoder for the DC difference and AC coefficients. The AC

encoder is represented as a block that accepts the intermediate symbols and performs all computations, including generating the addresses for accessing codes from the code tables. As a result, the encoder produces an appropriate compressed code stream representing the AC coefficients. The input of the encoder is a pair of intermediate symbols generated by the zigzag coder, one pair at a time. If the zero run is longer than 15 or the EOB (end of block) is reached, the symbol-2 input is omitted. When the zero run is longer than 15, the symbol-1 value (15,0) is interpreted as an extension symbol with runlength 16. In the case of EOB, the symbol-1 value is (0,0). The AC encoder shares the DC difference code table with the DC coder, because the AC encoder uses this table to get the code for the AMPLITUDE symbol. The encoding of DC coefficients is similar to that of the AC coefficients, except that the input for the DC part contains the DC difference category and the DC difference value. There are 11 DC difference categories (excluding the 0 category), and therefore 2^11 DC difference values. Figure 2 illustrates the procedure for encoding an 8x8 block of DCT

coefficients. The top array shows the 8x8 block of DCT coefficients arranged in zigzag order. The second array is the result of converting the block into a sequence of symbols. The last line in Figure 2 shows the output bit stream for the 8x8 DCT coefficients. In the presence of hardware or instruction faults, the code stream output may contain bit errors. These errors may prevent an 8x8 code block from being decoded, or degrade the reconstructed image for that block. These errors are due to the encoder itself, and channel

Fig. 2. Encoding process for a sample 8x8 DCT block. Zigzag sequence (DC value of the previous block: 12): 15 0 -2 -1 -1 -1 0 0 -1 0 ... 0 (55 trailing zeros). Intermediate symbols: (2)(3) (1,2)(-2) (0,1)(-1) (0,1)(-1) (0,1)(-1) (2,1)(-1) (0,0), where the DC symbol (2)(3) encodes the difference 15 − 12 = 3. Code stream: 110 11 11011 01 00 0 00 0 00 0 11100 0 1010.

control coding cannot correct them. To increase the encoder's reliability and improve the image quality, a fault-tolerant Huffman encoding system is proposed in the next section.

III. FAULT TOLERANCE MODEL FOR BASELINE JPEG ENCODER

Fig. 3. Fault tolerance model for the AC Huffman encoder. In parallel with the AC Huffman encoder (subject to errors) and its (RUNLENGTH-SIZE) Huffman code table, three memory storages are attached through address decoders: (I) a parity table for the (RUNLENGTH-SIZE) code, (II) a table for the length of the combined (RUNLENGTH-SIZE) + AMPLITUDE code, and (III) a parity table for the AMPLITUDE code. A parity generator on the code stream output for an 8x8 block feeds a comparator; the length acts as the control signal for the comparison, and a mismatch raises the error flag.

The method of inserting error-detecting features into the AC Huffman encoder is shown in Figure 3. Failures are modelled as corrupting one or more variable-length code vectors s1s2 of the stream output. These corrupted binary code vectors can be expressed as the correct bits flipped at random positions. The method provides fault detection throughout the data path, so that a momentary failure of any subsystem cannot allow contaminated data to go undetected. The error detection for the code stream output is performed by generating the parity symbols via a spare subsystem connected in parallel with the main encoder. These parity symbols are ready for checking as soon as the output code stream emerges. The goal of this fault tolerance design is to protect against at most one subsystem producing erroneous data. In the context of this paper, error detection of such situations is the only requirement, and no correction methods will be addressed.

Let s1 = b1b2...bn1 denote the code of the (RUNLENGTH,SIZE) symbol and s2 = c1c2...cn2 the code of the AMPLITUDE symbol. To reduce the delay in computing the parity symbols, the parity table for the code s1 is created and maintained in memory throughout the encoding process. Assume the code table for s1 was created using the statistical model of [1] (pp. 510-16). The corresponding parity for s1 is shown in Table I. A parity symbol can be retrieved quickly when a (RUNLENGTH,SIZE) symbol is valid at the address decoder for memory storage (I). The details of how to implement the address decoder are not discussed in this paper; however, based on the (RUNLENGTH,SIZE) code table, the address is known to be unique.
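A sketch of how such a parity table might be pre-computed from a (RUNLENGTH, SIZE) Huffman code table (the structure is my assumption, not the paper's exact implementation; the sample codes are a few entries of the standard JPEG luminance AC table):

```python
def parity(code):
    """Parity bit of a Huffman code given as a '0'/'1' string."""
    return code.count('1') & 1

def build_parity_table(huffman_table):
    """Storage (I) sketch: map each (RUNLENGTH, SIZE) symbol
    to the parity of its Huffman code."""
    return {symbol: parity(code) for symbol, code in huffman_table.items()}

# A few entries of the standard JPEG luminance AC table, for illustration.
sample_table = {(0, 0): '1010',           # EOB
                (0, 1): '00',
                (1, 2): '11011',
                (15, 0): '11111111001'}   # ZRL, the 16-zero extension
```

Using the table, build_parity_table(sample_table)[(1, 2)] returns the parity of '11011', i.e. 0.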


TABLE I
PARITY SYMBOL TABLE OF THE CODES OF (RUNLENGTH, SIZE) SYMBOLS
(rows: SIZE 1 to 10; columns: RUNLENGTH 0 to 15)

SIZE  1:  0 0 1 0 1 1 0 0 0 1 1 0 0 0 0 0
SIZE  2:  1 0 0 0 1 0 0 1 1 0 1 1 1 0 1 0
SIZE  3:  1 1 1 0 0 1 0 1 1 1 1 0 1 1 0 1
SIZE  4:  1 1 1 1 1 0 1 0 0 0 0 0 0 0 0 1
SIZE  5:  1 1 1 0 1 0 1 1 0 1 0 1 1 1 1 0
SIZE  6:  0 0 1 1 0 1 0 0 1 1 1 0 0 1 0 0
SIZE  7:  1 1 0 1 0 1 0 0 1 0 0 1 0 0 1 1
SIZE  8:  0 1 1 0 1 0 1 1 0 1 1 1 1 0 1 0
SIZE  9:  0 0 0 1 0 1 0 0 1 0 1 0 1 1 0 1
SIZE 10:  1 0 0 0 1 0 1 1 0 0 0 0 0 1 1 1

The memory storage (II) shares the same address decoder with storage (I), because memory storage (II) contains the length of the concatenated code s1s2. Both tables have the same number of entries and are in one-to-one correspondence. Once the (RUNLENGTH,SIZE) symbol is known, the length of the s1s2 code is determined by

length(s1s2) = length(s1) + length(s2) = length(s1) + SIZE (3)

In this context, SIZE represents the code length of the nonzero AC coefficient. The length parameter is used to determine the parity check points for the code stream output. As the code stream for an 8x8 block continuously comes out, it is virtually partitioned into code segments following a fixed rule. The length memory storage is used to determine the endpoints of these variable-length codes.
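Storage (II) can be derived from the same code table via Eq. (3); a minimal sketch, under the same assumptions as the parity-table sketch:

```python
def build_length_table(huffman_table):
    """Storage (II) sketch: total length n of the combined s1s2 code,
    n = length(s1) + SIZE, indexed by the (RUNLENGTH, SIZE) symbol."""
    return {(run, size): len(code) + size
            for (run, size), code in huffman_table.items()}
```

With the code '11011' for the symbol (1, 2), the table returns n = 5 + 2 = 7, the bit position at which the parity comparison fires.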

The storage (III) holds the parities of the s2 codes. There are 2047 different codes for s2, representing the AC values in the range [-1023, 1023] [1] (p. 445). The code s2 is computed similarly to the code vDC [1] (p. 191), and the computed parities for the s2 codes are stored in storage (III). Figure 4 shows the construction of the code s2. The zero AC value is unused in this construction because of the zigzag encoding result; therefore the memory cell at address 1023 is empty. However, the existence of this cell saves some operations for the address decoder.

Fig. 4. The parity bit structure of the nonzero AC coefficient codes: for each AC value in [-1023, 1023], the code length (1 bit for ±1, up to 10 bits for ±1023), the code value, and the parity bit. These pre-computed parity bits are stored in memory and will be retrieved uniquely by the AMPLITUDE symbol.

The parity of the code

stream representing a (RUNLENGTH,SIZE)+AMPLITUDE symbol pair is determined by combining the parities retrieved from the memory storages (I) and (III). This parity is compared with the parity of the code stream generated by the Huffman encoder. The check point is determined by the length parameter retrieved from the memory storage (II). The AC Huffman encoder computes the code stream for the (RUNLENGTH,SIZE)+AMPLITUDE pair using the Huffman code table and the built-in arithmetic


processor. The operation principle of the AC Huffman encoder is known from the JPEG compression

standard.
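The s2 (AMPLITUDE) code and its parity in storage (III) follow the same additional-bits rule as vDC; a hedged sketch (the function names are mine):

```python
def amplitude_code(amp):
    """Code value of a nonzero AC amplitude in [-1023, 1023]:
    SIZE bits, with negatives offset by 2**SIZE - 1 (as for vDC)."""
    size = abs(amp).bit_length()
    value = amp if amp > 0 else amp + 2 ** size - 1
    return format(value, '0{}b'.format(size))

def amplitude_parity(amp):
    """Parity bit stored in storage (III) for this amplitude."""
    return amplitude_code(amp).count('1') & 1
```

For example, amplitude_code(-2) is '01' (parity 1) and amplitude_code(3) is '11' (parity 0), consistent with the construction sketched in Figure 4.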

Let s1 = b1b2...bn1 be the code of a (RUNLENGTH,SIZE) symbol and s2 = c1c2...cn2 the code of an AMPLITUDE symbol. The parity symbols of these codes are pre-computed and stored in memories (I) and (III) as shown below:

p1 = b1 ⊕ b2 ⊕ ... ⊕ bn1 (4a)

p2 = c1 ⊕ c2 ⊕ ... ⊕ cn2 (5a)

where 2 ≤ n1 ≤ 16 and 1 ≤ n2 ≤ 10. Thus five bits are required to store the total code length n (n = n1 + n2). The overall parity of the code sequence b1b2...bn1c1c2...cn2 is determined by p = p1 ⊕ p2.

Consider a specific pair of symbols (RUNLENGTH,SIZE) and AMPLITUDE resulting from the zero runlength coding of a zigzag sequence. As soon as this symbol pair is available at the input of the Huffman encoder, the parity bit p and the length n are retrieved from the memory storages, ready for parity checking. If error free, the parity of the code stream b1b2...bn1c1c2...cn2 generated by the encoder must be identical to the stored parity p. As this code stream is sequentially shifted out, the parity bit for that stream is computed dynamically. The control signal uses the length n to set the comparison point for the code stream output, where the comparison takes place. The error flag is active only at these check points. The structure of the error checking for the output bit stream is shown in Figure 5, where the parity

Fig. 5. The logic structure of the parity checking for the code stream output. The parity p1 from the (RUNLENGTH-SIZE) parity table (I) and p2 from the AMPLITUDE parity table (III) are combined into p; the length of the s1s2 code from table (II) drives a timing signal generator; the stored parity p is compared with the parity computed from the Huffman code stream output, and a mismatch raises the error flag.

p is referenced only at the nth bit of the sequence b1b2...bn1c1c2...cn2. To clarify the generation of the parity p of this code sequence, p can be expanded as shown below:

p1 = b1 (6a)
p2 = p1 ⊕ b2 (6b)
... (6c)
pn1 = pn1−1 ⊕ bn1 (6d)
pn1+1 = pn1 ⊕ c1 (6e)
... (6f)
p = pn−1 ⊕ cn2 (6g)


When the length of the stream matches the pre-stored length, the comparison is conducted. Assuming the synchronization is correct, the two parity bits will match if no error has occurred in the encoding process. Otherwise, the error flag is turned on and a re-encode repeat request is sent to the control center.
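The running comparison of Eqs. (6a)-(6g) can be sketched as follows (my illustration of the checking logic, not the paper's hardware):

```python
def check_code_stream(bits, stored_parity, n):
    """Running parity check per Eq. (6): XOR the emitted bits one at a
    time and compare with the stored parity exactly at the n-th bit.
    Returns True if the check passes (error flag off)."""
    p = 0
    for i, b in enumerate(bits, start=1):
        p ^= b
        if i == n:
            return p == stored_parity
    return False  # stream ended before the stored length: raise the flag
```

For s1s2 = '11011' + '01' (symbol (1,2) with amplitude −2), n = 7 and the stored parity is p1 ⊕ p2 = 0 ⊕ 1 = 1; any single bit flip in the seven emitted bits makes the comparison fail.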

IV. EXPERIMENTAL RESULTS

The protection designed for the Huffman encoder was implemented in software. The computer program first performs image reading, color component transformation, DCT, and Huffman coding for both encoding and decoding. The error detection was restricted to the encoder side. The error source corrupts the system by changing data bits of the code stream at random locations. The error rate varies over a typical range from 10^-4 to 10^-1. Since error correction is not implemented in this paper, the alternative used to improve image quality in the presence of errors is repeat encoding of the corrupted blocks. The error correction performance is evaluated by the Mean Square Error (MSE) between the reconstructed and the original images. Figure 6 shows the quality performance curves of the Huffman encoder with and without error correction. The dashed and solid curves present the reconstruction performance of the image with and without the repeat encoding request, respectively, versus the injected error rate. As a matter of fact, the fidelity is much improved

Fig. 6. Quality performance curves (1/MSE versus injected error rate, 10^-4 to 10^-1) with and without the repeat encoding request.

when the error correction system is implemented. Figure 7 shows reconstructed color images under the effects of corrupted blocks. Four images with coding errors at error rates 10^-4 (top-left), 10^-3 (top-right), 5×10^-3 (bottom-left), and 3×10^-2 (bottom-right) are reconstructed. Each colored square is the result of discarding a bad code block. In contrast with the above results, Figure 8 shows images reconstructed at the same error rates, but with the repeat encoding request. Each corrupted block is discarded and replaced by a re-encoded block. This method reduces the number of error spots at the cost of delay.


Fig. 7. Reconstructed images without the repeat encoding request (error rates 0.0001, 0.001, 0.005, 0.03).

Fig. 8. Reconstructed images with the repeat encoding request (error rates 0.0001, 0.001, 0.005, 0.03).

V. CONCLUSIONS

In this paper, error detection and correction for JPEG Huffman entropy coding was implemented successfully in software. The computer program takes color input images and performs the color transform, DCT, scalar quantization, and Huffman coding. Although the implementation only detects a single error occurring in a code word, the design principle can be expanded to solve more complicated problems. In future research, error control coding techniques will be employed in order to generate more sophisticated coding techniques for detecting and correcting different types of errors. The simulation results show that when the repeat encoding process is performed on the corrupted part of the code stream, a significant visual quality improvement is perceived in the decoded image.

REFERENCES

[1] W. Pennebaker and J. Mitchell, JPEG Still Image Data Compression Standard. New York, NY: International Thomson Publishing, 1993.
[2] D. Huffman, "A method for the construction of minimum-redundancy codes," Proc. IRE, vol. 40, pp. 1098-1101, 1952.
[3] A. K. Jain, Fundamentals of Digital Image Processing. Englewood Cliffs, NJ: Prentice Hall Information and System Sciences Series, 1989.
[4] D. Salomon, Data Compression: The Complete Reference. New York: Springer-Verlag, 1998.
[5] N. Demir and K. Sayood, "Joint source/channel coding for variable length codes," in Proceedings Data Compression Conference (DCC'98), (Snowbird, UT), pp. 139-148, 1998.
[6] G. Elmasry, "Arithmetic coding algorithm with embedded channel coding," Electronics Letters, vol. 33, pp. 1687-1688, Sep. 1997.
[7] Y. Wang and Q. F. Zhu, "Error control and concealment for video communication: A review," Proc. IEEE, vol. 86, pp. 974-997, May 1998.
[8] M. Lovellette, K. Wood, D. Wood, J. Beall, P. Shirvani, N. Oh, and E. McCluskey, "Strategies for fault-tolerant, space-based computing: Lessons learned from the ARGOS testbed," in Proc. IEEE Aerospace Conference, vol. 5, pp. 2109-2119, 2002.