
Simple Fast and Adaptive Lossless Image Compression Algorithm

Roman Starosolski∗

December 20, 2006

This is a preprint of an article published in Software—Practice and Experience, 2007, 37(1):65-91, DOI: 10.1002/spe.746

Copyright © 2006 John Wiley & Sons, Ltd. http://www.interscience.wiley.com

Abstract

In this paper we present a new lossless image compression algorithm. To achieve a high compression speed we use linear prediction, a modified Golomb–Rice code family, and a very fast prediction error modeling method. We compare the algorithm experimentally with others for medical and natural continuous tone grayscale images of depths of up to 16 bits. Its results are especially good for big images, for natural images of high bit depths, and for noisy images. The average compression speed on an Intel Xeon 3.06 GHz CPU is 47 MB/s. For big images the speed is over 60 MB/s, i.e., the algorithm needs less than 50 CPU cycles per byte of image.

KEY WORDS: lossless image compression; predictive coding; adaptive modeling; medical imaging; Golomb–Rice codes

1 Introduction

Lossless image compression algorithms are generally used for images that are documents and when lossy compression is not applicable. Lossless algorithms are especially important for systems transmitting and archiving medical data, because lossy compression of medical images used for diagnostic purposes is, in many countries, forbidden by law. Furthermore, we have to use lossless image compression when we are unsure whether discarding information contained in the image is acceptable or not. The latter case happens frequently when images are transmitted by a system that is not aware of their intended use, e.g., when they are transmitted directly from the acquisition device or sent over the network for further processing. The use of image compression algorithms could improve the transmission throughput, provided that the compression algorithm complexities are low enough for the specific system.

∗ Silesian University of Technology, Institute of Computer Science, Akademicka 16, 44–100 Gliwice, Poland (e-mail: [email protected])

Some systems, such as medical CT scanner systems, require rapid access to large sets of images or to volumetric data that are further processed, analyzed, or just displayed. In such a system, the images or volume slices are stored in memory, since mass storage turns out to be too slow—here, a fast lossless image compression algorithm could virtually increase the memory capacity, allowing larger sets of data to be processed.

An image may be defined as a rectangular array of pixels. A pixel of a grayscale image is a nonnegative integer interpreted as the intensity (brightness, luminosity) of the image. When image pixel intensities are in the range [0, 2^N − 1], we say that the image is of N-bit depth, or that it is an N-bit image. Typical grayscale images are of bit depths from 8 to 16 bits.

Grayscale image compression algorithms are used as a basis for color image compression algorithms and for algorithms compressing 2-dimensional data other than images that is characterized by a specific smoothness. These algorithms are also used for volumetric 3-dimensional data. Sometimes such data, as a set of 2-dimensional images, is compressed using regular image compression algorithms. Other possibilities include preprocessing volumetric data before compressing it as a set of 2-dimensional images, or using algorithms designed exclusively for volumetric data—the latter are usually derived from regular image compression algorithms.

We could use a universal algorithm to compress images, i.e., we could simply encode a sequence of image pixels extracted from an image in the raster scan order. For a universal algorithm such a sequence is hard to compress. Universal algorithms are usually designed for alphabet sizes not exceeding 2^8 and do not directly exploit the following features of image data: images are 2-dimensional data, intensities of neighboring pixels are highly correlated, and images contain noise added during the acquisition process—the latter feature makes dictionary compression algorithms perform worse than statistical ones for image data [1]. Modern grayscale image compression algorithms employ techniques used in universal statistical compression algorithms. However, prior to statistical modeling and entropy coding, the image data is transformed to make it easier to compress.

Many image compression algorithms, including CALIC [2, 3], JPEG-LS [4], and SZIP [5], are predictive, as is the algorithm introduced in this paper. In a predictive algorithm, we use a predictor function to guess the pixel intensities and then we calculate the prediction errors, i.e., the differences between the actual and predicted pixel intensities. Next, we encode the sequence of prediction errors, which is called the residuum. To calculate the predictor for a specific pixel we usually use the intensities of a small number of already processed pixels neighboring it. Even using extremely simple predictors, such as one that predicts that a pixel's intensity is identical to that of its left-hand neighbor, results in a much better compression ratio than without prediction. For typical grayscale images, the pixel intensity distribution is close to uniform, whereas the prediction error distribution is close to Laplacian, i.e., symmetrically exponential [6, 7, 8]. Therefore the entropy of prediction errors is significantly smaller than the entropy of pixel intensities, making prediction errors easier to compress.

The probability distribution of symbols to be encoded is estimated by the data model. There are two-pass compression algorithms that read the data to be compressed twice. During the first pass the data is analyzed and the data model is built. During the second pass the data is encoded using information stored in the model. In a two-pass algorithm we have to include, along with the encoded data, the data model itself or information allowing the decompression algorithm to reconstruct the model.

In adaptive modeling we do not transmit the model; instead, it is built on-line. Using a model built for all the already processed symbols, we encode a specific symbol immediately after reading it. After encoding the symbol we update the data model. If the model estimates conditional probabilities, i.e., if the specific symbol's context is considered in determining the symbol's probability, then it is a context model; otherwise the model is memoryless. As opposed to universal algorithms, which use the symbols directly preceding the current one as contexts, contexts in some of the image compression algorithms are formed by the pixel's 2-dimensional neighborhood. In context determination we use pixel intensities, prediction errors of neighboring pixels, or some other function of pixel intensities. A high number of intensity levels, especially if the context is formed of several pixels, could result in a vast number of possible contexts—too high a number considering the model memory complexity and the cost of adapting the model to the actual image characteristics for all contexts, or of transmitting the model to the decompression algorithm. Therefore, in image compression algorithms we group contexts in collective context buckets and estimate the probability distribution jointly for all the contexts contained in a bucket. Context buckets were first used in the Sunset algorithm, which evolved into Lossless JPEG, the former JPEG committee standard for lossless image compression [9].

After the probability distribution for the symbol's context is determined by the data model, the symbol is encoded using an entropy coder. In order to encode the symbol s optimally, we should use − log2(prob(s)) bits, where prob(s) is the probability assigned to s by the data model [10]. Employing an arithmetic entropy coder we may get arbitrarily close to the above optimum, but practical implementations of arithmetic coding are relatively slow and not as perfect as theoretically possible [11]. For entropy coding we also use prefix codes, such as Huffman codes [12], which are much faster in practice. In this case we encode symbols with binary codewords of integer lengths. The use of prefix codes may lead to coding results noticeably worse than the above optimum when the probability assigned by the data model to the actual symbol is high. In image compression, as in universal compression algorithms, we use both methods of entropy coding; however, knowing the probability distribution of symbols allows some improvements. The relatively fast Huffman coding may be replaced by a faster entropy coder using a parametric family of prefix codes, i.e., the Golomb or Golomb–Rice family [13, 14].

The algorithms used for comparisons in this paper employ two more methods to improve the compression ratio for images. Some images contain highly compressible smooth (or ‘flat’ [7]) regions. It appears that modeling algorithms and entropy coders tuned for typical image characteristics do not obtain the best results when applied to such regions. Furthermore, if we encode pixels from such a region using prefix codes, then the resulting code length cannot be less than 1 bit per pixel, even if the probability estimated for a symbol is close to 1. For the above reasons some compression algorithms detect smooth regions and encode them in a special way. For example, in the JPEG-LS algorithm, instead of encoding each pixel separately, we encode, with a single codeword, the number of consecutive pixels of equal intensity. In the CALIC algorithm, we encode in a special way sequences of pixels that are of at most two intensity levels—a method aimed not only at smooth regions, but also at bilevel images encoded as grayscale.

The other method, probably first introduced in the CALIC algorithm, actually employs modeling to improve prediction. This method is called bias cancellation. The prediction error distribution for the whole image is usually close to Laplacian with 0 mean. The mean of the distribution for a specific context, however, may vary locally with location within the image.

To make the distribution centered at 0, we estimate a local mean of the distribution and subtract it from the prediction error. The contexts, or context buckets, used for modeling the distribution mean may differ from the contexts used for modeling the distribution of prediction errors after bias cancellation.

The performance of a predictive algorithm depends on the predictor function used. The predictors in CALIC and JPEG-LS are nonlinear and can be considered as switching, based on local image gradients, among a few simple linear predictors. More sophisticated schemes are used in some recent algorithms to further improve the compression ratios. For example, in the APC algorithm the predictor is a linear combination of a set of simple predictors, where the combination coefficients are adaptively calculated based on the least mean square prediction error [15]. Another interesting approach is used in the EDP algorithm, where the predictor is a linear combination of neighboring pixels and the pixel coefficients are determined adaptively based on least-squares optimization [16]. To reduce the time complexity of the EDP algorithm the optimization is performed only for pixels around the edges. Compared to the CALIC algorithm, which because of its compression speed is considered by some authors to be of research use rather than of practical use, the two latter algorithms obtain significantly lower speeds.

Another method of making the image data easier to compress, different from prediction, is to use 2-dimensional image transforms, such as the DCT or a wavelet transform. In transform algorithms, instead of pixel intensities, we encode a matrix of transform coefficients. The transform is applied to the whole image, or to an image split into fragments. We use transforms for both lossless and lossy compression. Transform algorithms are more popular in lossy compression, since for a lossy algorithm we do not need the inverse transform to be capable of losslessly reconstructing the original image from transform coefficients encoded with finite precision. The new JPEG committee standard for lossy and lossless image compression, JPEG2000, is a transform algorithm employing a wavelet transform [17, 18]. Apart from lossy and lossless compression and decompression of whole images, transform algorithms deliver many interesting features (progressive transmission, region of interest coding, etc.); however, with respect to lossless compression speed and ratio, better results are obtained by predictive algorithms.

In this paper, we introduce a simple, fast, and adaptive lossless grayscale image compression algorithm. The algorithm, designed primarily to achieve a high compression speed, is based on linear prediction, a modified Golomb–Rice code family, and a very fast prediction error modeling method. The operation of updating the data model, which is based on the data model known from the FELICS algorithm [19], although fast as compared to many other modeling methods, would be the most complex element of the algorithm. Therefore we apply the reduced model update frequency method, which increases the overall compression speed by a couple of hundred percent at the cost of worsening the compression ratio by a fraction of a percent. The algorithm is capable of compressing images of high bit depths; the actual implementation is for images of bit depths up to 16 bits per pixel. The algorithm originates from an algorithm designed for 8-bit images only [20]. We analyze the algorithm and compare it with other algorithms for many classes of images. In the experiments, we use natural continuous tone grayscale images of various depths (up to 16 bits) and various sizes (up to about 4 million pixels), and various classes of medical images (modalities: CR, CT, MR, and US). Nowadays, consumer acquisition devices, such as cameras or scanners, produce images of ever growing sizes and high nominal depths, often attaining 16 bits. The quality of the acquisition process seems to fall behind the growth of acquisition resolution and bit depth—typical high bit depth images are noisy.

Table 1: Predictors used in the research.

Pred0(X) = 0
Pred1(X) = A
Pred2(X) = B
Pred3(X) = C
Pred4(X) = A + B − C
Pred5(X) = A + (B − C)/2
Pred6(X) = B + (A − C)/2
Pred7(X) = (A + B)/2
Pred8(X) = (3A + 3B − 2C)/4

The natural images used in this research were acquired using a high quality film scanner. To analyze the algorithm's performance on noisy data, special images with added noise were prepared. We also generated non-typical, easily compressible and incompressible pseudo-images to estimate the best-case and the worst-case performance of compression algorithms.

2 Method description

2.1 Overview

Our algorithm is predictive and adaptive; it compresses continuous tone grayscale images. The image is processed in raster-scan order. First, we perform prediction using a predictor selected from a fixed set of 9 simple linear predictors. Prediction errors are reordered to obtain the probability distribution expected by the data model and the entropy coder, and then output as a sequence of residuum symbols. For encoding residuum symbols we use a family of prefix codes based on the Golomb–Rice family. For fast and adaptive modeling we use a simple context data model based on the model of the FELICS algorithm [19] and the method of reduced model update frequency [20]. The algorithm was designed to be simple and fast. We do not employ methods such as detection of smooth regions or bias cancellation. Decompression is a simple reversal of the compression process. With respect to both time and memory complexity the algorithm is symmetric.

The algorithm described herein originates from an algorithm designed for images of 8-bit depth, which obtained a high compression speed but could not simply be extended to higher bit depths [20]. The most significant differences between these algorithms are reported in Section 2.6.

2.2 Prediction

To predict the intensity of a specific pixel X, we employ fast linear predictors that use up to 3 neighboring pixels: the left-hand neighbor (A), the upper neighbor (B), and the upper-left neighbor (C). We use the 8 predictors of the Lossless JPEG algorithm (Table 1, Pred0–Pred7) [9], and one slightly more complex predictor, Pred8, which actually returns the average of Pred4 and Pred7. Predictors are calculated using integer arithmetic. We select a single predictor for the whole image; however, for pixels of the first row and the first column some predictors cannot be calculated—in this case we use simpler predictors (e.g., Pred2 for the first column).
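The predictors of Table 1 can be expressed directly in code. The sketch below is only an illustration, in C, of the integer arithmetic involved, assuming the neighbors A, B, and C are already available as ints; the function names and the dispatch table are not taken from the actual implementation.

    /* Table 1 predictors; A = left, B = upper, C = upper-left neighbor. */
    static int pred0(int A, int B, int C) { return 0; }
    static int pred1(int A, int B, int C) { return A; }
    static int pred2(int A, int B, int C) { return B; }
    static int pred3(int A, int B, int C) { return C; }
    static int pred4(int A, int B, int C) { return A + B - C; }
    static int pred5(int A, int B, int C) { return A + (B - C) / 2; }
    static int pred6(int A, int B, int C) { return B + (A - C) / 2; }
    static int pred7(int A, int B, int C) { return (A + B) / 2; }
    static int pred8(int A, int B, int C) { return (3 * A + 3 * B - 2 * C) / 4; }

    typedef int (*predictor_fn)(int A, int B, int C);
    static const predictor_fn predictors[9] = {
        pred0, pred1, pred2, pred3, pred4,
        pred5, pred6, pred7, pred8
    };

Note that pred8 is the average of pred4 and pred7 up to integer truncation: ((A + B − C) + (A + B)/2)/2 = (3A + 3B − 2C)/4.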

If there is a subtraction operation in the calculation of the predictor, then its value may fall outside the nominal range of pixel intensities [0, 2^N − 1], where N denotes the image bit depth. In such a case, we take the closest value from the above range.

Figure 1: Probability distribution of prediction errors: a) before modulo reduction, b) after modulo reduction, and c) after reordering.

We compress the residuum symbol, i.e., the difference between the actual pixel intensity X and the predicted one: Rp = X − Pred(X). Since both X and Pred(X) are in the range [0, 2^N − 1], Rp is in the range [−2^N + 1, 2^N − 1]. To encode such a symbol directly, using the natural binary code, we would need N + 1 bits, i.e., we would expand the N-bit image data before the actual compression. Fortunately, we may use N-bit symbols to encode prediction errors, since for a specific pixel there are only 2^N possible values of Rp (the range [−Pred(X), 2^N − 1 − Pred(X)]). Pred(X) may be calculated by both the compression and the decompression algorithm prior to processing the pixel X. Instead of the above-mentioned formula we use Rm = (X − Pred(X)) mod 2^N. For decompression we use X = (Rm + Pred(X)) mod 2^N.

The code family we use to encode the residuum requires residual values to be ordered in descending order of probability. For typical images, before the modulo reduction the distribution is close to symmetrically exponential (Laplacian); however, after that reduction it is no longer descending (Fig. 1). We reorder residual values to get a probability distribution close to exponential by simply picking symbols: first, last, second, last but one, and so on:

    R = 2Rm                 for Rm < 2^(N−1)
    R = 2(2^N − Rm) − 1     for Rm ≥ 2^(N−1)
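As an illustration, the clamping of the predictor value and the mapping from the prediction error to the residuum symbol R (together with its inverse used by the decoder) could be written in C as follows; this is only a sketch under the definitions above, not the actual implementation:

    /* N is the image bit depth (N <= 16), x the actual pixel intensity,
       p the raw predictor value. */
    static int clamp_pred(int p, int N)           /* keep Pred(X) in [0, 2^N - 1] */
    {
        int max = (1 << N) - 1;
        if (p < 0)   return 0;
        if (p > max) return max;
        return p;
    }

    static int residual(int x, int p, int N)      /* Rm = (X - Pred(X)) mod 2^N */
    {
        return (x - p) & ((1 << N) - 1);
    }

    static int reorder(int rm, int N)             /* errors 0, -1, +1, -2, ... -> 0, 1, 2, 3, ... */
    {
        return rm < (1 << (N - 1)) ? 2 * rm : 2 * ((1 << N) - rm) - 1;
    }

    static int unreorder(int r, int N)            /* inverse of reorder() */
    {
        return (r & 1) ? (1 << N) - (r + 1) / 2 : r / 2;
    }

    static int reconstruct(int rm, int p, int N)  /* X = (Rm + Pred(X)) mod 2^N */
    {
        return (rm + p) & ((1 << N) - 1);
    }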

2.3 The code family

The code family used is based on the Golomb–Rice (GR) family, i.e., on the infinite family of prefix codes that is a subset of the family described by Golomb [13] (the Golomb family), rediscovered independently by Rice [14]. GR codes are optimal for encoding symbols from an infinite alphabet with an exponential symbol probability distribution. Each code in the GR family is characterized by a nonnegative integer rank k. In order to encode the nonnegative integer i using the GR code of rank k, we first encode the codeword prefix, ⌊i/2^k⌋, using the unary code, and then the suffix, i mod 2^k, using a fixed-length k-bit natural binary code.
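As an illustration, a GR encoder of rank k may be sketched in C as below; put_bit() stands for a bit-level output routine and is not a function of the actual implementation:

    static void put_bit(int b)
    {
        /* placeholder: write a single bit to the output stream */
        (void)b;
    }

    /* Encode a nonnegative integer i with the GR code of rank k. */
    static void gr_encode(unsigned i, int k)
    {
        unsigned q = i >> k;              /* prefix value: floor(i / 2^k)           */
        int j;
        while (q-- > 0) put_bit(1);       /* unary prefix: q ones ...               */
        put_bit(0);                       /* ... terminated by a single zero        */
        for (j = k - 1; j >= 0; j--)      /* suffix: i mod 2^k on k bits, MSB first */
            put_bit((int)((i >> j) & 1));
    }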

Prediction errors are symbols from a finite alphabet, and their probability distribution is only close to exponential. To encode the prediction errors we use a limited codeword length variant of the GR family [21]. For encoding residuum symbols of an image of N-bit depth, that is, for an alphabet of size 2^N, we use a family of N codes and limit the codeword length to lmax > N.

Table 2: The code family for integers in the range [0, 15], codeword length limited to 8 bits.

Integer    k = 0        k = 1       k = 2     k = 3
 0         0•           0•0         0•00      0•000
 1         10•          0•1         0•01      0•001
 2         110•         10•0        0•10      0•010
 3         1110•        10•1        0•11      0•011
 4         1111•0000    110•0       10•00     0•100
 5         1111•0001    110•1       10•01     0•101
 6         1111•0010    1110•0      10•10     0•110
 7         1111•0011    1110•1      10•11     0•111
 8         1111•0100    1111•000    110•00    1•000
 9         1111•0101    1111•001    110•01    1•001
10         1111•0110    1111•010    110•10    1•010
11         1111•0111    1111•011    110•11    1•011
12         1111•1000    1111•100    111•00    1•100
13         1111•1001    1111•101    111•01    1•101
14         1111•1010    1111•110    111•10    1•110
15         1111•1011    1111•111    111•11    1•111

For each code rank 0 ≤ k < N we define the threshold πk = min((lmax − N)2^k, 2^N − 2^k). We encode a nonnegative integer 0 ≤ i < 2^N in the following way: if i < πk, we use the GR code of rank k; in the opposite case we output a fixed prefix of πk/2^k ones and then the suffix, i − πk, encoded using a fixed-length ⌈log2(2^N − πk)⌉-bit natural binary code.

Sample codewords are presented in Table 2. The separator is inserted between the prefix and the suffix of each codeword for legibility only. For a specific code, the codewords of integers smaller than πk are identical to their equivalents in the GR code of the same rank.
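A sketch of an encoder for this limited codeword length family is given below (in C, building on gr_encode() and put_bit() from the previous sketch); it reproduces the codewords of Table 2 for N = 4 and lmax = 8, but it is an illustration rather than the actual implementation:

    /* Number of bits of the fixed-length suffix: ceil(log2(range)). */
    static int suffix_bits(unsigned range)
    {
        int b = 0;
        while ((1u << b) < range) b++;
        return b;
    }

    /* Encode 0 <= i < 2^N with the code of rank k, codeword length limited
       to lmax bits (lmax > N, 0 <= k < N). */
    static void encode_limited(unsigned i, int k, int N, int lmax)
    {
        unsigned pk  = (unsigned)(lmax - N) << k;   /* candidate (lmax - N) * 2^k   */
        unsigned cap = (1u << N) - (1u << k);       /* candidate 2^N - 2^k          */
        int j;
        if (pk > cap) pk = cap;                     /* pk is now the threshold pi_k */
        if (i < pk) {
            gr_encode(i, k);                        /* ordinary GR codeword         */
        } else {
            unsigned ones = pk >> k;                /* fixed prefix: pi_k / 2^k ones */
            while (ones-- > 0) put_bit(1);
            i -= pk;                                /* escape suffix                 */
            for (j = suffix_bits((1u << N) - pk) - 1; j >= 0; j--)
                put_bit((int)((i >> j) & 1));
        }
    }

For k = N − 1 the threshold equals 2^(N−1), and the resulting codewords are simply the N-bit natural binary representations of i, which is the property discussed below.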

Use of the code family significantly simplifies the compression algorithm. To encode a certain residuum symbol we just select a code rank, based on the information stored in the data model, and simply output the codeword assigned to this symbol by the code of the selected rank.

Limiting the codeword length is a method used in several other algorithms, including JPEG-LS. It is introduced to reduce data expansion in case the data model selects a code of improper (too small) rank to encode a symbol of high value—when coding images we deal with alphabet sizes up to 2^16, and the code of rank k = 0 assigns an (i + 1)-bit codeword to symbol i. Apart from limiting the codeword length, the advantage of the presented family over the original GR codes is that it contains, as the code of rank k = N − 1, the N-bit natural binary code. Using the natural binary code we may avoid data expansion even when coding incompressible data.

2.4 The data model

We use a modified version of the data model known from the FELICS algorithm invented by Howard and Vitter [19]. For prediction errors of pixels in the first column of an image, the prediction error of the pixel above is used as the context; for prediction errors of the remaining pixels, the preceding residuum symbol, i.e., the prediction error of the pixel's left-hand neighbor, is used as the context.

The method of selecting the code rank in the data model of the FELICS algorithm is fast and simple. For each context we maintain an array of N counters, one counter for each code rank k. Each counter stores the code length we would have obtained if we had used the code of rank k to encode all symbols encountered in this context so far. To encode a specific symbol in a specific context we simply use the rank that would have given the shortest code in that context so far. After coding the symbol we update the counters in the current context: for each code rank we increase its counter by the length of the codeword assigned to the encoded symbol by the code of this rank. Periodically, when the smallest counter in a specific context reaches a certain threshold, all the counters in this context are halved, causing the model to assign more importance to the more recently processed data.

Although only one symbol is used to determine the context, we use collective context buckets. In the FELICS algorithm data model, for each context, at least one symbol has to be encoded before we are able to estimate the code rank based on actual image data. The first symbol in a given context, or the first few symbols, may be encoded using an improper code rank. Since we deal with alphabet sizes up to 2^16, the number of pixels encoded in a non-optimal way may worsen the overall compression ratio. Furthermore, due to the exponential prediction error probability distribution, some contexts may appear only a couple of times in the whole image. For the above reasons we group contexts of higher values in collective context buckets. In our case we maintain a single array of counters for all the contexts contained in a bucket. The number of contexts contained in a bucket grows exponentially with the bucket number, starting with a bucket containing a single context. This way we reduce the FELICS model memory complexity of O(2^(N+1)) to O(N^2).
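For illustration, assuming bucket sizes that double with the bucket number (1, 2, 4, 8, ..., i.e., the model structure denoted 'b' in Section 3.1), mapping a context to its bucket might look as follows in C; this is a sketch, not the actual implementation:

    /* Map a context value (a residuum symbol) to its collective bucket.
       Bucket 0 holds context 0, bucket 1 holds contexts 1-2, bucket 2
       holds contexts 3-6, bucket 3 holds 7-14, and so on, so an N-bit
       alphabet needs only about N buckets of N counters each. */
    static int bucket_of(unsigned context)
    {
        int b = 0;
        while (context >= (1u << b)) {
            context -= 1u << b;
            b++;
        }
        return b;
    }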

If several codes are equally good for encoding a specific symbol according to the criterion of the FELICS data model, then the code of the smallest rank is selected, which may cause an improper selection of small code ranks and lead to data expansion. To reduce the effects of improper rank selection at the beginning of the coding, Howard and Vitter suggest assigning a small initial penalty to the counters for small ranks. We used a simple method that works not only at the beginning of the coding, but also when the image data characteristics change during the compression: we avoid the risk of data expansion by selecting, from among all the equally good codes, the one of the highest rank [22].
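The rank selection and counter update described above may be sketched in C as follows. The block includes a small code_len() helper that recomputes the codeword length of the rank-k code of Section 2.3; the counter type, the halving threshold value, and the function names are illustrative assumptions rather than the actual implementation's parameters.

    #define HALVE_THRESHOLD 1024          /* assumed value, for illustration only */

    /* Codeword length of symbol i under the rank-k code of Section 2.3. */
    static unsigned code_len(unsigned i, int k, int N, int lmax)
    {
        unsigned pk  = (unsigned)(lmax - N) << k;
        unsigned cap = (1u << N) - (1u << k);
        unsigned b = 0;
        if (pk > cap) pk = cap;
        if (i < pk)
            return (i >> k) + 1 + (unsigned)k;   /* unary prefix + terminator + k suffix bits */
        while ((1u << b) < (1u << N) - pk) b++;  /* ceil(log2(2^N - pi_k)) suffix bits        */
        return (pk >> k) + b;                    /* fixed prefix ones + binary suffix         */
    }

    /* Pick the rank whose counter is smallest; on a tie the highest rank wins. */
    static int select_rank(const unsigned long cnt[], int N)
    {
        int k, best = N - 1;
        for (k = N - 2; k >= 0; k--)
            if (cnt[k] < cnt[best])
                best = k;
        return best;
    }

    /* After coding a symbol, charge every rank with its codeword length and
       periodically halve the counters to age the statistics. */
    static void update_counters(unsigned long cnt[], int N, int lmax, unsigned symbol)
    {
        int k, min_k = 0;
        for (k = 0; k < N; k++) {
            cnt[k] += code_len(symbol, k, N, lmax);
            if (cnt[k] < cnt[min_k]) min_k = k;
        }
        if (cnt[min_k] >= HALVE_THRESHOLD)
            for (k = 0; k < N; k++) cnt[k] /= 2;
    }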

2.5 The reduced model update frequency method

The motivation for introducing the reduced model update frequency method is the observation that typical image characteristics change gradually over almost all of the image area, or are even invariable. In order to adapt to gradual changes, we may sample the image, i.e., update the data model, less frequently than every time a pixel gets coded. Instead, we update the model after the coding of selected pixels only. We could simply pick every i-th symbol to update the model, but such a constant period could interfere with the image structure.

    delay := 0
    while not EOF
        read symbol
        compress symbol
        if delay = 0 then
            update model
            delay := random(range)
        else
            delay := delay - 1
        endif
    endwhile

Figure 2: The reduced model update frequency method.

Therefore, each time after updating the model with some symbol, we randomly select the number of symbols to skip before the next update of the model (Fig. 2). The number of symbols to skip is selected regardless of the actual value of the symbol used to update the model as well as of its context (the delay variable in Fig. 2 is a global one). In order to permit the decoder to select the same number, we use a pseudo-random number generator. Merely to avoid interference with the image structure, even the simplest pseudo-random number generator should suffice. We use a fixed pseudo-random number generator seed—this way we avoid storing the seed along with the compressed image and make the compression process deterministic.
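A possible rendering of Fig. 2 in C is sketched below. The particular generator (a simple linear congruential one) and its fixed seed are illustrative assumptions (the paper does not specify which generator is used), and read_symbol(), compress_symbol(), and update_model() are placeholders for the actual coding and modeling routines.

    extern int  read_symbol(void);        /* returns the next residuum symbol, or EOF_SYMBOL */
    extern void compress_symbol(int s);   /* entropy-code the symbol                         */
    extern void update_model(int s);      /* update the rank counters of the data model      */
    #define EOF_SYMBOL (-1)

    static unsigned long prng_state = 1u; /* fixed seed: the decoder repeats the same choices */

    static unsigned prng_next(void)       /* simple LCG; any generator breaking periodicity works */
    {
        prng_state = prng_state * 1103515245UL + 12345UL;
        return (unsigned)(prng_state >> 16);
    }

    /* range_mask = 2^m - 1 selects the update frequency p = 2/(2^m + 1). */
    static void compress_stream(unsigned range_mask)
    {
        int symbol;
        unsigned delay = 0;
        while ((symbol = read_symbol()) != EOF_SYMBOL) {
            compress_symbol(symbol);
            if (delay == 0) {
                update_model(symbol);
                delay = prng_next() & range_mask;  /* skip a random number of symbols */
            } else {
                delay--;
            }
        }
    }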

By selecting the range of the pseudo-random numbers we may change the model update frequency, i.e., the probability of updating the model after coding a symbol. This way we control the speed of adapting the model and the speed of the compression process. At the beginning of compression, the data model should adapt to the image data characteristics as quickly as possible. We start compression using all symbols in data modeling and then we gradually decrease the model update frequency, until it reaches some minimal value.

The method is expected to vastly improve the compression speed without significantly worsening the compression ratio. In the case of the algorithm from which the described algorithm originates, the method allowed the compression speed to be increased by about 250% at the cost of worsening the ratio by about 1% [20]. A similar, to a certain extent, approach was used in the EDP algorithm, where the predictor coefficients are determined in an adaptive way by means of a relatively complex LS optimization. As reported in [16], performing the above optimization only for pixels around the edges allows a decrease of the time complexity by an order of magnitude at the cost of a negligible worsening of the compression ratio. Note that, as opposed to the reduced update frequency method, in EDP the location of the pixels for which the time-consuming operation is performed (or skipped) depends on the image contents.

2.6 Differences from the predecessor algorithm

The most significant differences between the described algorithm and the one from which it originates [20] are the code family and the data model. Actually, the described algorithm is simpler than its predecessor.

The code family used in the previous algorithm was based on the Golomb codes. It was a limited codeword length family, it contained the natural binary code, and the ordering of codes in the family was altered compared to the original Golomb family. Generating codewords from that family was not as simple as in the case of the family presented in Section 2.3; however, this was not a problem for the algorithm of 8-bit image compression. As opposed to 16-bit depth, for 8-bit depth the whole family may be precomputed and stored in an array of a reasonable size.

The data model of the predecessor algorithm was also more sophisticated. To aid fast adaptation to the characteristics of the image data at the beginning of the compression, the model used a variable number of collective context buckets. The compression started with a single bucket (containing all the contexts), which was subsequently divided into smaller ones. By using this method the compression ratio for the smallest images was improved by over 1%. For the described algorithm, the use of a variable number of buckets resulted in worsening the average compression ratio (by about 0.2%); only in the case of some small images were the ratios negligibly improved (by less than 0.1%). As opposed to its predecessor, in the data model of the described algorithm we select, from among all the equally good codes, the one of the greatest rank. The above feature, along with the use of a different code family, seems to be simpler and more efficient than using a variable number of buckets. Furthermore, by giving up the variable number of buckets we reduce the modeling memory and time complexity.

2.7 Complexity analysis

Time complexity

Fast adaptive compression algorithms are of linear time complexity; in our case:

    T(n) = n(cp + cc + cm) = n(cp + cc + p·cu) = O(n),

where n is the number of pixels in the image, cp denotes the prediction complexity (per pixel), cc the coding complexity, cm the modeling complexity, cu the complexity of a single model update, and p the update frequency. Prediction and coding are implemented as short sequences of simple integer operations. The model update is more complex, since to update the model we have to compute the lengths of N codewords, where N is the image bit depth. By updating the model less frequently we accelerate the slowest part of the compression process.

Memory complexity

The data model requires O(N^2) bytes, where N is the image bit depth, for storing N counters for each of cN buckets, c ≈ 1. To perform prediction we need O(w) bytes, where w is the image width, since for some of the predictor functions we need the pixel's upper-left neighbor, i.e., memory for storing at least w + 1 pixels.

The actual implementation is aimed at maximizing the compression speed rather than at reducing the memory complexity. Depending on the image bit depth and the endianness of the CPU, it requires from about 7w + 2000 to about 12w + 4000 bytes.

3 Experimental results

3.1 Algorithm implementation

The algorithm was implemented in the ANSI C language; the implementation may be downloaded from http://sun.iinf.polsl.gliwice.pl/~rstaros/sfalic/. The algorithm processes the image row by row. After a row has been input, prediction for the whole row is performed and the resulting row of residuum symbols is stored in a memory buffer. The row of residuum symbols is compressed to another memory buffer and then output. After updating the model, the number of symbols to be skipped before the next update is selected by picking a pseudo-random number and reducing it modulo 2^m, where m is a nonnegative integer. Since the delay is then uniformly distributed on [0, 2^m − 1], with mean (2^m − 1)/2, the model is updated on average once per (2^m + 1)/2 coded symbols. Therefore the following frequencies p = 2/(2^m + 1) may be used: 100% (the full update frequency), 66.6%, 40%, 22.2%, 11.8%, 6.06%, 3.08%, 1.55%, 0.778%, 0.390%, 0.195%, . . . . We start coding the image with the full update frequency. Then, each time d pixels are coded, we decrease the frequency, until we reach the target update frequency. The number of buckets in the model, as a function of image bit depth, may also be selected. We tested the following model structures (the numbers below are the numbers of contexts in the consecutive buckets):

a) 1, 1, 1, 2, 2, 4, 4, 8, 8, . . . ;

b) 1, 2, 4, 8, 16, . . . ;

c) 1, 4, 16, 64, . . . .

In the actual implementation, special variants of some functions were prepared for images of depths up to 8 bits. Optimizations are possible when the alphabet size is small. For example, reordering of prediction errors or finding the bucket for a specific context may be done using a single table lookup to increase the compression speed; buffers for image rows may be allocated for 8-bit pixels instead of 16-bit ones to reduce the implementation's memory requirements.
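As an illustration of the table-lookup idea for depths of up to 8 bits, the reordering of Section 2.2 can be precomputed once per image, so that the per-pixel work becomes a single array access; the sketch below is an assumption about how this might be done, not the actual implementation:

    /* Precomputed reordering table for N <= 8 (at most 256 residuum values). */
    static unsigned char reorder_tab[256];

    static void init_reorder_tab(int N)
    {
        int rm, half = 1 << (N - 1), size = 1 << N;
        for (rm = 0; rm < size; rm++)
            reorder_tab[rm] = (unsigned char)
                (rm < half ? 2 * rm : 2 * (size - rm) - 1);
    }

After initialization, R = reorder_tab[Rm] replaces the arithmetic of Section 2.2 in the inner loop.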

3.2 Procedure

An HP Proliant ML350G3 computer equipped with two Intel Xeon 3.06 GHz (512 KB cache memory) processors and the Windows 2003 operating system was used to measure the performance of the algorithm implementations. Single-threaded application executables of the described algorithm, and of the other algorithms used for comparisons, were compiled using the Intel C++ Compiler, version 8.1. To minimize the effects of the system load and of the input-output subsystem performance on the obtained results, each executable was run several times. The time of the first run was ignored and the collective time of the other runs (executed for at least one second, and at least 5 times) was measured and then averaged. The time measured was the sum of the time spent by the processor in the application code and in kernel functions called by the application, as reported by the operating system after application execution. Since we measure the execution time externally, we actually include the time of initializing the program by the operating system in our calculations; this time may be significant for the smallest images.

The compression speed is reported in megabytes per second [MB/s], where 1 MB = 2^20 bytes. Since we used the PGM P5 image representation, the pixel size is 2 bytes for image depths over 8 bits and 1 byte otherwise.

The compression ratio is reported in bits per pixel [bpp]: 8e/n, where e is the size in bytes of the compressed image including the header and n is the number of pixels in the image.

3.3 Test image set

A new, diverse set of medical and natural continuous tone grayscale test images was prepared to evaluate the performance of lossless image compression algorithms. The main reason for preparing the new set was that, to the best of our knowledge, there was no publicly available set of test images containing big, high quality images originally acquired with an actual 16-bit depth. The set contains natural continuous tone grayscale images of various bit depths (up to 16 bits) and various sizes (up to about 4 million pixels), and medical images of various modalities (CR, CT, MR, and US). In the set, image groups were defined to permit performance analysis based on average results for a whole group, rather than on results for single images.

The biggest group, normal, is for evaluating algorithms' performance in a typical case. A collection of smaller groups permits us to analyze or compare results with respect to images' bit depths, sizes, or medical image modality. The set also contains non-typical images, which do not belong to the normal group. To analyze the algorithms' performance on noisy data, special images with added noise were prepared. To estimate the best-case and the worst-case performance of algorithms, easily compressible and incompressible pseudo-images were also generated. Below, we describe the image groups; details of individual images are reported in [23]. The set contains about one hundred images. It is not as large as, for example, the set used by Clunie in an extensive study on lossless compression of medical images [24], which contained over 3600 images but, on the other hand, the moderate size of the set allowed making it publicly available—it may be downloaded from http://sun.iinf.polsl.gliwice.pl/~rstaros/mednat/.

The group of natural continuous tone images, i.e., images acquired from scenes available to the human eye (photographic images), was constructed as follows. Four images were acquired from 36 mm high quality diapositive film (Fuji Provia/Velvia) using a Minolta Dimage 5400 scanner (Fig. 3). In order to minimize the noise, the acquisition was first done at the device's maximum depth of 16 bits, optical resolution of 5400 dpi, and using multiple sampling of each pixel (16 times, or 4 times in the case of the branches image). One image (flower) was softened by setting the scanner focus too close. Then the images' resolution was reduced 3 times. These images formed the group of 16-bit big images, and were then subject to further resolution reduction (3 and 9 times) and to bit depth reduction (to 12 and to 8 bits). The set contains the following groups of natural images:

• natural—36 natural images of various sizes and bit depths,

• big—12 natural images of various bit depths and size approximately 4000000 pixels,

• medium—12 natural images of various bit depths and size approx. 440000 pixels,

• small—12 natural images of various bit depths and size approx. 49000 pixels,

• 16bpp—12 natural images of various sizes and 16-bit depth,

• 12bpp—12 natural images of various sizes and 12-bit depth,

• 8bpp—12 natural images of various sizes and 8-bit depth.

Figure 3: Sample natural images (branches, flower, kid, and town).

Groups of medical images were composed of CR, CT, MR, and US images of various anatomical regions, acquired from devices of several vendors. Most of the medical images are from collections of medical images available on the Internet; the origin of individual images is reported in [23]. In the case of medical CR, CT, and MR images we report the nominal bit depth. The actual number of intensity levels may be smaller than implied by the bit depth, by an order of magnitude or even more. The set contains the following groups of medical images:

• medical—48 medical CR, CT, MR, and US images,

• cr—12 medical CR images, nominal depth: 10 to 16 bits, average size approximately 3500000 pixels,

• ct—12 medical CT images, nominal depth: 12 to 16 bits, average size approximately 260000 pixels,

• mr—12 medical MR images, nominal depth of 16 bits, average size approximately 200000 pixels,

• us—12 medical US images, 8-bit depth, average size approximately 300000 pixels.

To evaluate algorithms' performance in a typical case, the normal group was defined. The normal group contains all 84 natural and medical images. The average results of compressing images from the normal group are used as a measure of algorithms' performance for continuous tone grayscale images. Unless indicated otherwise, we report the average results for this group.

The set also contains the following groups of non-typical images:

• noise—9 images with added noise, created using the branches image of various bit depths (8, 12, and 16 bits) and medium size (approximately 440000 pixels). Noise was added using v1 = v0(1 − a) + ra, where v0 denotes the original pixel intensity, v1 the intensity after adding noise, r a random value of uniform distribution (range [0, 2^N − 1], where N is the image bit depth), and a the amount of noise; a small sketch of this formula is given after the list. We prepared images using a = 0.1, 0.2, and 0.5,

• empty—3 pseudo-images, intensity of all pixels equal to 0, nominal depth of 8, 12, and 16 bits, size approximately 440000 pixels,

• random—3 pseudo-images, random intensities of pixels (uniform distribution), bit depth of 8, 12, and 16 bits, size approximately 440000 pixels.
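For reference, the noise formula of the noise group could be applied per pixel as in the C sketch below; the use of rand() and the rounding to the nearest integer are assumptions for illustration, since the paper does not specify the generator or the rounding used.

    #include <stdlib.h>

    /* v1 = v0 * (1 - a) + r * a, with r uniform in [0, 2^N - 1]. */
    static int add_noise(int v0, double a, int N)
    {
        int max = (1 << N) - 1;
        int r = (int)(((double)rand() / ((double)RAND_MAX + 1.0)) * (max + 1));
        return (int)(v0 * (1.0 - a) + r * a + 0.5);  /* rounding is an assumption */
    }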

The set described above contains no images traditionally used for comparisons of image compression algorithms. To verify the observations made using the above set on traditional 8-bit test images, and to make comparisons to results reported in other studies possible, additional experiments were performed using the popular Waterloo BragZone GreySet2 set of test images (downloaded from http://links.uwaterloo.ca/BragZone/GreySet2/).

3.4 Parameter selection for the algorithm

The parameter selection was based on the average compression speed and the average compression ratio for images from the normal group. The threshold that triggers dividing all the counters in a certain context when the smallest counter reaches it was selected, for each update frequency and for each number of buckets individually, as the average best value for predictors 1 to 8. This way we do not favor any specific update frequency, predictor, or model structure. Knowing these parameters, however, we could simply use a fixed threshold for all the update frequencies. Using, for all the update frequencies, the threshold selected for the update frequency and the number of buckets of the default parameter set listed below would simplify the algorithm and change the compression ratio, for some image groups only, by less than 0.1%. The remaining algorithm parameters were selected by examining the results of compression using combinations of all p values, all numbers of buckets, all predictors, some values of d, and some code length limits lmax. Parameter combinations which, compared to some other combination, resulted in worsening of both the compression speed and the compression ratio were rejected. One of the remaining combinations was selected as the default parameter set; its use results in a compression speed 20% lower than the fastest one obtained and a compression ratio worse by about 0.5% compared to the best ratio. The default parameters are:

• model update frequency p = 3.08%,

• predictor Pred8 ((3A + 3B − 2C)/4),

• decreasing the update frequency each d = 2048 pixels,

• code length limit lmax = 26,

• doubling the model bucket size with each bucket (model structure b).

Below we describe how these parameters, considered individually, influence the compression results. Fig. 4 presents the compression speed and the compression ratio obtained using various update frequencies.

Figure 4: Compression results for various update frequencies, from 100% down to 0.195% (normal images); the compression ratio [bpp] is plotted against the compression speed [MB/s].

Using the reduced update frequency we may get a couple of hundred percent speed improvement at the cost of worsening the compression ratio by about 0.5%. Note that decreasing the update frequency below some point stops improving the modeling speed. The reduced model update frequency method could probably be applied to other adaptive algorithms in which modeling is a considerable factor in the algorithm's overall time complexity. It could be used as a means of adjusting the algorithm speed versus the quality of modeling or, as in our case, it could be used to improve the speed, to a certain extent, without worsening the modeling quality significantly.

The selected Pred8 predictor, although the most complex, gives the best average compression ratio. The use of this predictor and the selected update frequency proves to be better than the use of any simpler predictor and a greater frequency, since the compression ratio improvement is obtained at a relatively small cost. For a few specific image groups, the use of other predictors results in a compression ratio improvement (and in a speed improvement by a few percent):

• for the cr and noise images Pred7 improves the ratio by 1.0% and 1.6% respectively,

• for the us images the use of Pred4 gives a 3.6% improvement of the compression ratio.

Gradually decreasing the update frequency every 2048 pixels, compared to compressing the whole image using the same frequency, only affects the results for the smallest images. This way, for the small group, we get a 1.0% compression ratio improvement and a speed lower by a few percent.

The code length limit was selected for the normal group. Except for the us group, selecting the limit for a specific group may improve the compression ratio by less than 0.1%. For the us images we may obtain a 2.1% compression ratio improvement by limiting the codeword length to 14 bits.

Results for the data model structures described earlier (a, b, and c) are almost identical. By selecting other than the default model structure we may improve the compression ratio for some groups by less than 0.1%. We also compared the data model that uses collective context buckets to two other model structures. Using collective context buckets proves to be superior to compression with a model of 2^N individual contexts, which results in an expansion of 16-bit images, and also to compression with a memoryless model, which, for the normal images, results in worsening the average compression ratio by 6.2%.

3.5 Comparison to other techniques

The algorithm described in this paper, denoted here as ‘SFALIC’, was compared to several other image compression algorithms. In Tables 3 and 4 we report the average compression speeds and average ratios obtained by the algorithms described below for normal images. Due to the number of images contained in the set, results for individual images are not included in this paper; they may be downloaded from http://sun.iinf.polsl.gliwice.pl/~rstaros/sfalic/. After discussing the results for the new set, we report results obtained for the well-known images of the University of Waterloo. The results are reported for the following algorithms:

• CALIC-A—a relatively complex predictive and adaptive image compression algorithm using an arithmetic entropy coder, which because of its very good compression ratios is commonly used as a reference for other image compression algorithms [2, 3]. In the CALIC algorithm we use 7 neighboring pixels, both to determine the context and as arguments of the nonlinear predictor function. When the pixels in the neighborhood are of at most 2 intensity levels, CALIC enters the so-called binary mode. In the binary mode, for consecutive pixels, we encode information on whether the pixel intensity is equal to the brighter neighbors, the darker neighbors, or neither of them—in this case we leave the binary mode. CALIC utilizes the bias cancellation method. We used the implementation by Wu and Memon [25]. Since this implementation is a binary executable for UltraSparc processors, the compression speed of the CALIC algorithm is estimated based on the relative speed of this implementation compared to the SFALIC speed on a different computer system (a Sun Fire V440 running Solaris 9, equipped with 1.06 GHz UltraSparc IIIi processors; both implementations were single-threaded).

• CALIC-H—the variant of the CALIC algorithm using Huffman codes (compression speed estimated as in the case of CALIC-A).

• JPEG-LS—the standard of the JPEG committee for lossless and near-lossless compression of still images [4]. The standard describes a low-complexity predictive and adaptive image compression algorithm with entropy coding using a modified Golomb–Rice family. The algorithm is based on the LOCO-I algorithm [26, 27]. In the JPEG-LS algorithm, we use 3 neighboring pixels for nonlinear prediction, and 4 pixels for modeling. JPEG-LS utilizes the bias cancellation method, and it also detects and encodes in a special way smooth image regions. If a smooth region is detected we enter the so-called run mode and, instead of encoding each pixel separately, we encode, with a single codeword, the number of consecutive pixels of equal intensity. We used the SPMG/UBC implementation [28]. In this implementation, some code parts are implemented in 2 variants: one for images of depths up to 8 bits and the other for image depths of 9–16 bits.

• CCSDS SZIP—the standard of the Consultative Committee for Space Data Systems used by space agencies for compressing scientific data transmitted from satellites and other space instruments [5]. CCSDS SZIP is a very fast predictive compression algorithm based on the extended-Rice algorithm; it uses Golomb–Rice codes for entropy coding and was primarily developed by Rice. CCSDS SZIP is often confused with a general-purpose compression utility by Schindler, which is also called ‘SZIP’. CCSDS SZIP does not employ an adaptive data model.

Table 3: The compression speed, normal images [MB/s].

image group   CALIC-A   CALIC-H   JPEG-LS   CCSDS SZIP   SFALIC
natural           2.6       7.2      15.0         40.3     43.3
big               3.4       9.7      18.1         56.0     61.8
medium            2.8       8.0      16.5         44.8     48.1
small             1.7       3.7      10.3         20.0     20.1
16bpp             1.9       6.9      15.9         40.0     41.3
12bpp             3.0       7.5      17.8         48.4     47.9
8bpp              3.0       7.0      11.1         32.4     40.7
medical           3.6       9.6      20.7         50.6     50.1
cr                4.9      11.5      24.8         73.5     70.5
ct                3.4       9.1      21.6         50.1     49.1
mr                2.7       8.4      20.9         41.6     39.9
us                3.7       9.3      15.4         37.1     40.8
normal            3.2       8.5      18.2         46.1     47.2

The sequence of prediction errors is divided into blocks. Each block is compressed using a two-pass algorithm. In the first pass, we determine the best coding method for the whole block. In the second pass, we output the marker of the selected coding method as side information along with the prediction errors encoded using this method. The coding methods include: Golomb–Rice codes of a chosen rank; the unary code for transformed pairs of prediction errors; a fixed-length natural binary code if the block is found to be incompressible; and signaling an empty block to the decoder if all prediction errors are zeroes. We used the UNM implementation [29]. It was optimized for the default block size of 16 symbols. Since the biggest images (big and cr) required a greater block size, we used a block size of 20 symbols for all the images. For smaller images, compared to the reported results, by using the default block size we get a compression speed higher by about 10% to 20%, and a compression ratio from 0.5% worse to 1.2% better, depending on the image group. A higher compression speed for all the images, by an average of 8.5% for the normal group, may be obtained using a block size of 32 symbols, however, at the cost of worsening the compression ratio by 0.9%.

JPEG2000, Lossless JPEG Huffman, PNG [30], and FELICS were also examined [23]. We do not report these results because, for all the image groups, the speeds and ratios of these algorithms are worse than those obtained by JPEG-LS. For other test image sets the JPEG2000 algorithm is reported to obtain ratios, depending on the image or image class, a little better or a little worse than JPEG-LS [24, 31]. Compared to SFALIC, the algorithms Lossless JPEG Huffman, PNG, and FELICS obtain worse compression ratios for almost all the groups (except for the us group for PNG and FELICS, and the empty group for PNG) and lower compression speeds for all the groups. JPEG2000 obtains an average ratio 2.3% better than SFALIC and a compression speed about 15 times lower. SFALIC was also compared to the algorithm from which it originates. For 8bpp images SFALIC's predecessor obtained a compression ratio worse by 1.3% and a compression speed lower by 34%.

Table 4: The compression ratio, normal images [bpp].

image group   CALIC-A   CALIC-H   JPEG-LS   CCSDS SZIP   SFALIC
natural         7.617     7.661     7.687        8.432    7.953
big             6.962     7.059     7.083        7.773    7.274
medium          7.623     7.699     7.710        8.403    8.009
small           8.267     8.227     8.269        9.121    8.576
16bpp          11.748    11.622    11.776       12.458   11.867
12bpp           7.491     7.565     7.571        8.407    7.869
8bpp            3.613     3.797     3.715        4.431    4.123
medical         6.651     6.761     6.734        7.396    7.165
cr              6.229     6.324     6.343        6.883    6.662
ct              7.759     7.840     7.838        8.806    8.266
mr              9.975     9.895    10.009       10.599   10.235
us              2.641     2.985     2.748        3.298    3.497
normal          7.065     7.147     7.143        7.840    7.503

The SFALIC algorithm is clearly the fastest among the algorithms that use an adaptive data model. The compression speed of the SFALIC algorithm for normal images (Fig. 5) is over 2.5 times higher than the speed of the second fastest adaptive model algorithm (JPEG-LS) and about 12 times higher than the speed of the algorithm obtaining the best compression ratios (CALIC-A). The compression speed of SFALIC is almost the same as the speed of the CCSDS SZIP algorithm, which does not employ adaptive modeling. Actually, SFALIC obtained a slightly higher speed than CCSDS SZIP; however, the relative speed difference is negligible. Probably both algorithms could be optimized to improve the speed a little—CCSDS SZIP by optimizing it for a block size of 20 symbols, or by optimizing it for low image bit depths; SFALIC by integrating prediction into the coding and modeling loop. The highest compression speed is achieved for the biggest images (big and cr). The compression speed for these groups is over 60 MB/s, i.e., for the biggest images we need less than 50 CPU cycles per byte of image. A compression speed significantly lower than the average was obtained for small images. For these images, the time of initializing the compression executable by the operating system becomes a significant factor in the overall speed of the compression algorithm. To some extent, similar behavior may be observed for all the examined algorithms. The SFALIC compression speed depends on the image size rather than on the image bit depth. Since for depths over 8 bits an image pixel is stored using 2 bytes, the compression speed for 12bpp images is greater than the speed for 8bpp and 16bpp images. We also notice that, for individual images of similar bit depth and similar size, there are no significant differences in SFALIC compression speed. The compression speed of some other algorithms, such as JPEG-LS, depends to a larger extent on the image contents. Here the greater differences are probably due to the much lower complexity of the run mode employed by JPEG-LS for the smooth areas found in some images, compared to the complexity of the regular mode used for non-smooth regions.

The average compression ratio of the CALIC-A, CALIC-H, and JPEG-LS algorithms is better than the ratio of SFALIC by 5.8%, 4.7%, and 4.8% respectively.

[Plot: compression ratio [bpp] versus compression speed [MB/s]; data points for CALIC-A, CALIC-H, JPEG-LS, SFALIC, and CCSDS SZIP.]

Figure 5: The compression results for various algorithms (normal images).

Figure 6: Sample medical us images.

Such a cost of improving the compression speed is not important in many practical image processing systems, especially when we compress images in order to transmit them or to store them temporarily. Compared to CCSDS SZIP, the only algorithm that obtains a speed close to SFALIC's, the compression ratio of SFALIC is better by 4.5%.

For some images other algorithms are much better; e.g., the us images are compressed 24.5% better by CALIC-A. The predictor function and the codeword length limit were selected for the normal group and are not well suited for the us images, but the main reason for the worse compression ratio is that SFALIC does not employ any special method of processing smooth image regions: the us images contain large uniform-intensity areas, i.e., a black background around the actual image (Fig. 6).

In Fig. 7 we compare the ratios of SFALIC and JPEG-LS obtained for individual images. The absolute differences of the ratios are moderate; the greatest one is about 1 bpp. Note that the bigger differences occur for smaller compression ratios, so for highly compressible images the relative differences of ratios may be practically important. Therefore, in Fig. 8, instead of an absolute ratio we present the relative compression ratio of JPEG-LS, expressed as a percentage of the ratio that SFALIC obtained for a specific image. We also mark, with a gray background, images that contain a significant amount of smooth areas. Here, an image is considered to contain a significant amount of smooth areas if at least 15% of its pixels are encoded by JPEG-LS using the run mode (actually it is at least 17.6% for these images and at most 5.6% for the remaining ones). We can see that the relative ratio differences in favor of JPEG-LS get much greater as the image compression ratio decreases.

[Plot: JPEG-LS ratio [bpp] versus SFALIC compression ratio [bpp]; separate series for 16bpp, 12bpp, 8bpp, cr, ct, mr, and us images.]

Figure 7: SFALIC and JPEG-LS ratios for individual images.

[Plot: relative JPEG-LS ratio [%] versus SFALIC compression ratio [bpp]; separate series for 16bpp, 12bpp, 8bpp, cr, ct, mr, and us images.]

Figure 8: SFALIC ratio and relative JPEG-LS ratio for individual images.

It can also be seen that the JPEG-LS ratio is better than SFALIC's by up to almost 40% for images containing a significant amount of smooth areas, whereas for the other images JPEG-LS is better by up to about 16%. The above observation confirms the importance of equipping a compression algorithm with a method of efficiently encoding smooth image areas.
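
To make the quantities used in Fig. 8 concrete, the short sketch below computes the relative JPEG-LS ratio and applies the 15% run-mode criterion described above; the function and parameter names are hypothetical and the fragment is purely illustrative.

    /* Illustrative helpers (hypothetical names, not part of any tested
       implementation) for the quantities plotted in Fig. 8. */
    static double relative_ratio_percent(double jpegls_bpp, double sfalic_bpp)
    {
        /* e.g., for the us group averages of Table 4:
           100 * 2.748 / 3.497 = approximately 78.6% */
        return 100.0 * jpegls_bpp / sfalic_bpp;
    }

    static int has_significant_smooth_areas(double run_mode_pixel_fraction)
    {
        return run_mode_pixel_fraction >= 0.15;  /* the 15% run-mode threshold */
    }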

For natural images, the relative compression ratio of the algorithms that obtain better ratios than SFALIC does not depend on the image size but depends significantly on the image bit depth. For 8bpp images the compression ratio of CALIC-A is 12.4% better; for 16bpp images it is better by only 1.0%. Generally, the SFALIC algorithm obtains good compression ratios when the actual number of intensity levels is high. The medical cr, ct, and mr images, which are of 16-bit nominal depth, actually use far fewer than 2^16 levels. Among the 24 such images, 21 use fewer than 4000 levels and only 3 cr images use about 25000 levels. For these 3 images the ratio of the CALIC-A algorithm is better than SFALIC's ratio by 1.4%.

Most ct and mr images and some cr images have sparse histograms. Not only is the actual number of levels found in these images much smaller than the nominal one, but the levels are also distributed throughout almost the entire nominal intensity range. Such characteristics are clearly different from what a lossless image compression algorithm expects, both in the case of predictive and of transform coding. In [32] we reported efficient methods of compressing these images.

[Plot: compression ratio [bpp] versus compression speed [MB/s]; data points for CALIC-A, CALIC-H, JPEG-LS, SFALIC, and CCSDS SZIP.]

Figure 9: The compression results for various algorithms (16bpp images).

Table 5: The compression speed for non-typical images [MB/s].

image group    CALIC-A    CALIC-H    JPEG-LS    CCSDS SZIP    SFALIC

empty             13.2       39.9      156.1       118.2        79.3
noise              2.0        7.4       14.9        41.4        45.6
random             1.3        6.5       14.7        41.5        37.3

Employing the so-called histogram packing technique, we may vastly improve the compression ratios of sparse-histogram images. This way the CALIC-A average compression ratios were improved to 4.485 bpp for ct and to 4.811 bpp for mr images (that is, by about 42% and 52% respectively). The improvement of the average compression ratio for the cr group was about 15%.
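
The general idea of histogram packing can be sketched in a few lines; the fragment below is a simplified illustration under assumed conventions (16-bit samples, in-place remapping), not the exact procedure of [32] or [34]. The same histogram pass also yields the actual number of intensity levels discussed above.

    #include <stdint.h>
    #include <stddef.h>

    /* Simplified illustration of histogram packing for a 16-bit image.
       Pixels are remapped in place onto the consecutive codes 0..levels-1;
       'unpack' receives the inverse table needed to restore the original
       intensities after decompression.  Not the exact procedure of [32]/[34]. */
    static size_t pack_histogram(uint16_t *pixels, size_t n, uint16_t unpack[65536])
    {
        static uint16_t forward[65536];
        static uint8_t  used[65536];
        size_t i, levels = 0;

        for (i = 0; i < 65536; i++) used[i] = 0;
        for (i = 0; i < n; i++) used[pixels[i]] = 1;     /* mark occurring levels    */

        for (i = 0; i < 65536; i++)                      /* assign consecutive codes */
            if (used[i]) {
                forward[i] = (uint16_t)levels;
                unpack[levels] = (uint16_t)i;
                levels++;
            }

        for (i = 0; i < n; i++)                          /* remap the image          */
            pixels[i] = forward[pixels[i]];

        return levels;   /* the actual number of intensity levels */
    }

After packing, the image uses a dense alphabet of consecutive codes, which matches the assumptions made by predictive coders far better than a sparse subset of the 16-bit range; the inverse table must be stored with the compressed data to keep the transform reversible.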

Interesting results were obtained for 16bpp images (Fig. 9). For this group the compression ratio of the arithmetic coding version of CALIC is 1.1% worse than the ratio of the Huffman-coder version. CALIC-H obtains ratios better than CALIC-A also for small images (i.e., the group containing 16-bit images) and for mr images (of 16-bit nominal depth). The above observations suggest that the small difference in compression ratio between SFALIC and CALIC for 16bpp images should rather be attributed to imperfections of the other algorithms than to especially good performance of SFALIC. Probably there is still a possibility of improving the compression ratio of CALIC for high bit depth images.

The empty pseudo-images (Tables 5 and 6) are the most easily compressible data for an image compression algorithm. As one could expect, for empty images the ratio of the algorithms that employ a method of efficiently encoding smooth image regions is close to 0 bpp. For all the algorithms, the compression speed for the empty group is higher than for any other group; the greatest speedup is observed for JPEG-LS.

For the non-typical noisy images the compression speed of all algorithms is similar to the average speed of the medium group, which contains images of similar size. Compressing these images, in the case of some algorithms even the individual images with 50% noise added, still results in compression ratios smaller than the image bit depth, although not by much.

Table 6: The compression ratio for non-typical images [bpp].

image group    CALIC-A    CALIC-H    JPEG-LS    CCSDS SZIP    SFALIC

empty            0.001      0.045      0.002       0.027       1.000
noise           10.478     10.690     10.693      11.101      10.842
random          12.375     13.008     12.516      12.370      12.009

[Image panels: france, library, washsat.]

Figure 10: BragZone GreySet2 images with smooth areas.

The random pseudo-images are incompressible and may be used to estimate the worst-case compression ratio of an algorithm; however, for a specific image compression algorithm we can prepare data even harder to compress, i.e., a pseudo-image with characteristics opposite to what is expected in prediction or modeling. The best method of processing incompressible data is to copy it verbatim, i.e., to encode pixel intensities using the N-bit natural binary code, where N denotes the image bit depth; for the random group we would then get a compression ratio of 12 bpp. The SFALIC algorithm actually acts this way; its code family contains the fixed-length natural binary code, which is used when processing the random images. All the remaining algorithms cause noticeable data expansion. In the case of the CCSDS SZIP algorithm the natural binary code is also used, but SZIP is a two-pass scheme; its data expansion is solely due to including, in the compressed data, side information along with each block.
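
The exact construction of the modified Golomb–Rice family used here is described earlier in the paper; the fragment below is only a generic sketch of the underlying idea, i.e., a family of length-limited Rice-style codes whose highest-rank member is simply the n-bit natural binary code, so that incompressible data is copied rather than expanded. The bit-writer interface and the limit parameter are assumptions made for the sketch.

    #include <stdint.h>

    /* Hypothetical bit writer: assumed to emit the 'count' low-order bits of
       'value', most significant of those bits first. */
    void put_bits(uint32_t value, int count);

    /* Generic sketch of a length-limited Rice-style code family whose highest
       rank is the plain n-bit natural binary code (an assumed construction
       illustrating the idea, not SFALIC's actual modified Golomb-Rice family).
         v     : non-negative value to encode (e.g., a mapped prediction error)
         k     : code rank selected by the data model, 0 <= k <= n
         n     : image bit depth
         limit : maximal allowed unary-prefix length (assumed parameter)       */
    void encode_family_member(uint32_t v, int k, int n, int limit)
    {
        if (k >= n) {                  /* top-rank code: copy the sample in    */
            put_bits(v, n);            /* n bits, so random data does not      */
            return;                    /* expand at all                        */
        }
        uint32_t q = v >> k;
        if ((int)q < limit) {          /* regular Rice codeword                */
            put_bits(1u, (int)q + 1);  /* q zeros terminated by a one          */
            put_bits(v & ((1u << k) - 1u), k);       /* k-bit remainder        */
        } else {                       /* codeword-length-limiting escape      */
            put_bits(1u, limit + 1);   /* 'limit' zeros terminated by a one    */
            put_bits(v, n);            /* then the value in natural binary     */
        }
    }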

To verify the observations made using the new set, additional experiments were performed with the popular Waterloo BragZone GreySet2 set of 8-bit test images (Tables 7 and 8). In the tables we report results for individual images, average results for the whole BragZone GreySet2 set, and average results for the 8-bit medium-size natural images from the new set, i.e., for images belonging to the intersection of the 8bpp and medium groups (rows labeled 'Average medium8bpp'). Not all the BragZone GreySet2 images are typical photographic continuous-tone ones. The france image is computer generated and the library is a compound image. The washsat is an aerial photograph. Compared to the other GreySet2 images, the latter 3 images contain a significantly greater amount of smooth areas (Fig. 10). JPEG-LS encodes 46.9%, 22.4%, and 10.7% of the pixels of france, library, and washsat respectively using the run mode, whereas for the remaining images it is at most 2.9%. In some images dithering-like patterns are visible; these images are probably dithered palette color images converted to grayscale. The patterns are most noticeable in the library and frog images; mandrill, mountain, and peppers also seem to be dithered. We also note that the images washsat, frog, and mountain have sparse histograms: the numbers of pixel intensity levels actually used in these images are 35, 102, and 110 respectively.

Table 7: The compression speed, BragZone GreySet2 images [MB/s].

image                 pixels    CALIC-A    CALIC-H    JPEG-LS    CCSDS SZIP    SFALIC

barb                  262144        1.8        4.2       10.3         28.7       34.3
boat                  262144        1.9        4.4       10.6         28.7       36.3
france                333312        4.3       10.7       23.1         45.5       40.0
frog                  309258        2.0        5.3       10.3         30.4       35.8
goldhill              262144        1.7        4.1       10.3         29.0       35.4
lena                  262144        1.9        4.4       10.5         28.8       36.6
library               163328        1.6        4.2       11.0         24.1       27.4
mandrill              262144        1.6        4.1        9.5         27.9       33.7
mountain              307200        1.9        5.1       10.1         28.8       33.6
peppers               262144        1.8        4.1       10.4         28.5       34.8
washsat               262144        2.1        4.8       11.5         30.2       36.3
zelda                 262144        2.1        4.6       11.0         29.5       37.1

Average GreySet2      267521        2.1        5.0       11.5         30.0       35.1
Average medium8bpp    440746        3.1        7.7       12.1         36.0       44.1

Table 8: The compression ratio, BragZone GreySet2 images [bpp].

image                 pixels    CALIC-A    CALIC-H    JPEG-LS    CCSDS SZIP    SFALIC

barb                  262144      4.453      4.569      4.733        5.775      5.315
boat                  262144      4.151      4.233      4.250        5.153      4.632
france                333312      0.823      1.684      1.411        2.425      3.736
frog                  309258      5.853      6.232      6.049        6.657      6.536
goldhill              262144      4.629      4.719      4.712        5.280      4.870
lena                  262144      4.110      4.184      4.244        5.046      4.567
library               163328      5.012      5.228      5.101        5.858      6.025
mandrill              262144      5.875      6.031      6.036        6.374      6.256
mountain              307200      6.265      6.538      6.422        6.717      6.840
peppers               262144      4.378      4.488      4.489        5.167      4.933
washsat               262144      3.670      4.107      4.129        4.825      4.526
zelda                 262144      3.862      3.973      4.005        4.838      4.289

Average GreySet2      267521      4.424      4.665      4.632        5.343      5.210
Average medium8bpp    440746      3.630      3.826      3.701        4.400      4.153

Generally, these results adhere to the results obtained for the new set. The average compression speed for GreySet2 images is a little lower than the speed obtained for medium8bpp images, which are of slightly greater size. The compression speed of a specific algorithm does not vary significantly over the photographic images. Increased speed is observed for the france image, which contains smooth regions and is compressed by all the algorithms faster than other images of similar size; in the case of CALIC and JPEG-LS the speed is increased over 2 times compared to other images of similar size. The library image also contains smooth regions, but it is smaller than the others; as could be expected, SFALIC compresses this image more slowly than the bigger images. For some other algorithms the presence of smooth regions seems to have a greater impact on compression speed than the smaller size of the image. The smooth regions in washsat do not noticeably influence the compression speed: in this image, observed in raster scan order, the runs of pixels (or prediction errors) are much shorter than in france or library (and the overall number of pixels in smooth regions is smaller). There are no significant differences in compression speed between photographic images with and without dithering patterns.

The average SFALIC compression ratio for the GreySet2 images, compared to the ratios obtained by the other algorithms, is slightly worse than in the case of the medium8bpp images: for GreySet2, CALIC-A obtains a ratio better than SFALIC's by 15.1%, whereas for medium8bpp it is better by 12.6%. The greater differences are due to the ratios obtained for the two images (france and library) containing a significant amount of smooth areas. If we exclude these images from the comparison, the CALIC-A ratio is smaller than SFALIC's by only 10.5%. For the france image CALIC-A obtains a ratio of 0.823 bpp, whereas the ratios of the other algorithms range from 1.411 bpp (JPEG-LS) to 3.736 bpp (SFALIC). Such large differences cannot be attributed to smooth areas alone. The probable reasons for the large differences between CALIC-A and the other algorithms are: CALIC's binary mode (capable of encoding sequences of symbols of 2 intensity levels); the arithmetic coder used (which, as opposed to Golomb–Rice codes, is capable of efficiently encoding any probability distribution); the predictors used (CALIC and JPEG-LS actually switch between simple predictors in order to detect edges); the more sophisticated data models used (SFALIC uses only the prediction error of a single neighbor of the current pixel to determine the pixel's context); and the algorithms' ability to adapt quickly to rapid changes of the image characteristics (SFALIC uses the reduced model update frequency method; SZIP selects the coding method for a whole block of pixels). Note, however, that france is definitely not a typical continuous tone image and that special algorithms exist for such images. In [33] a reversible image preprocessing method is proposed that, in the case of the JPEG-LS algorithm, is reported to improve the compression ratio for the france image from 1.411 bpp to 0.556 bpp. For images with sparse histograms we may get significantly better ratios by applying the histogram packing technique; in this way the CALIC compression ratios for washsat, frog, and mountain may be improved by 44.5%, 16.8%, and 18.9% respectively (similar ratio improvements are reported for the JPEG-LS algorithm) [34]. We also notice, for all the algorithms, that the average ratios for the photographic images containing visible dithering patterns are noticeably worse than the average ratios for the remaining photographic images.

4 Conclusions

The presented predictive and adaptive lossless image compression algorithm was designed to achieve high compression speed. The prediction errors obtained using a simple linear predictor are encoded using codes adaptively selected from the modified Golomb–Rice code family. As opposed to the unmodified Golomb–Rice codes, this family limits the codeword length and allows coding of incompressible data without expansion. Code selection is performed using a simple data model based on the model known from the FELICS algorithm. Since updating the data model, although fast compared to many other modeling methods, is the most complex element of the algorithm, we apply the reduced model update frequency method, which increases the compression speed by a couple of hundred percent at the cost of worsening the compression ratio by about 0.5%. This method could probably be used to improve the speed of other algorithms in which data modeling is a considerable factor in the overall time complexity. The memory complexity is low: the algorithm's data structures fit into the cache memory of contemporary CPUs.
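
As an illustration of the reduced model update frequency idea, the sketch below accumulates error statistics for every pixel but performs the costlier re-selection of the code rank only once per fixed number of pixels; the context structure, the period, and the selection rule are simplifying assumptions and do not reproduce the method actually used in SFALIC.

    #include <stdint.h>

    /* Generic sketch of reducing the model update frequency; the structure,
       the period, and the selection rule are assumptions, not SFALIC's model. */
    typedef struct {
        uint32_t sum;          /* accumulated magnitudes of prediction errors   */
        uint32_t count;        /* number of accumulated errors                  */
        uint32_t since_update; /* pixels coded since the last code re-selection */
        int      k;            /* currently selected code rank                  */
    } context_t;

    enum { UPDATE_PERIOD = 16 };   /* assumed period */

    static int select_rank(uint32_t mean)
    {
        int k = 0;
        while ((1u << k) < mean && k < 16) k++;  /* smallest k with 2^k >= mean */
        return k;
    }

    /* Called once per coded pixel of the given context. */
    static void model_update(context_t *ctx, uint32_t error_magnitude)
    {
        ctx->sum   += error_magnitude;
        ctx->count += 1;
        if (++ctx->since_update == UPDATE_PERIOD) {      /* the costlier step   */
            ctx->k = select_rank(ctx->sum / ctx->count); /* runs infrequently   */
            ctx->since_update = 0;
            if (ctx->count >= 256) {     /* periodic aging so the model can     */
                ctx->sum   >>= 1;        /* track changing image statistics     */
                ctx->count >>= 1;
            }
        }
    }

The encoder then uses the stored rank to pick the code for the following pixels of the context, so the per-pixel modeling cost is reduced to two additions and a counter check.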

The presented algorithm was compared experimentally to several others. For continuous-tone natural and medical images its compression ratio is, on average, 5.8% worse than the best ratio, obtained by CALIC. Its compression speed is over 2.5 times higher than the speed of JPEG-LS. Compared to CCSDS SZIP, i.e., to the algorithm that does not employ an adaptive data model, the presented algorithm obtains a similar compression speed and a 4.5% better compression ratio.

For some images SFALIC compression ratios are significantly worse than the ratios of certain other schemes. Ratios worse than CALIC's by up to about 1 bpp were obtained for images that contain a significant amount of highly compressible smooth areas, such as medical US images. For compound and computer-generated images more sophisticated algorithms may obtain ratios better by even more. For images having sparse histograms, such as MR and CT medical images, significant ratio improvement is possible both for SFALIC and for the remaining algorithms used for comparisons in this paper. Finding a fast and efficient method of processing the above types of data is a potential field of future algorithm improvement.

Another type of data requiring huge amounts of storage, for which a fast algorithmcould be practically useful, is volumetric data. A simple method of extending the de-scribed algorithm to exploit the 3-dimensional characteristics of the data, which is aninteresting field of further research, might be the use of 3-dimensional prediction func-tions.
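
One simple hypothetical form of such a 3-dimensional predictor is sketched below; it combines two neighbors from the current slice with the co-located sample of the previous slice, and the coefficients are an arbitrary illustrative choice that would have to be verified experimentally.

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical 3-D linear predictor for volumetric data: combines the west
       and north neighbors of the current slice with the co-located sample of
       the previous slice.  The weights are an arbitrary illustrative choice.
         vol    : volume stored slice by slice, row by row
         sx, sy : slice width and height
       Borders would need special handling; assume x > 0, y > 0, z > 0 here.   */
    static uint16_t predict3d(const uint16_t *vol, int sx, int sy,
                              int x, int y, int z)
    {
        uint32_t w  = vol[((size_t)z * sy + y) * sx + (x - 1)];       /* west           */
        uint32_t n  = vol[((size_t)z * sy + (y - 1)) * sx + x];       /* north          */
        uint32_t pz = vol[((size_t)(z - 1) * sy + y) * sx + x];       /* previous slice */
        return (uint16_t)((w + n + 2u * pz) / 4u);   /* simple weighted average */
    }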

The described algorithm is especially good for:

• big images, since it compresses them at a very high speed (over 60 MB/s on a 3.06 GHz CPU, i.e., it needs less than 50 CPU cycles per byte of image),

• natural images of 16-bit depth, since it obtains very good compression ratios for them (its ratio differs by a couple of percent from the ratio of the CALIC algorithm),

• noisy images, since, as opposed to the other algorithms, it causes almost no data expansion even if the image contains nothing but noise.

Due to the above advantages it is ideally suited for the lossless compression of data to be transmitted from modern medical and general-purpose image acquisition devices, which produce images of high bit depths and big sizes, usually containing a certain amount of noise.

The presented algorithm is an alternative for compressing images to be transmitted over the network: it may improve the transmission throughput when most other algorithms are too slow. The algorithm could also be used for compressing and decompressing, on the fly, large sets of images that are stored in memory for rapid access.

Acknowledgements

The research was fully supported by Grant No. 4 T11C 032 24 of the Ministry of Education and Science of the Republic of Poland. It was carried out at the Institute of Computer Science, Silesian University of Technology, in the years 2003 and 2004. The author would like to thank Sebastian Deorowicz and the anonymous referees for reviewing the manuscript and suggesting significant improvements.

References

[1] Memon, N. D.; Sayood, K.: Lossless image compression: A comparative study. Proceedings of the SPIE, Still-Image Compression, San Jose, California, 1995, Vol. 2418, pp. 8–20.

[2] Wu, X.; Memon, N.: Context-based, Adaptive, Lossless Image Codec. IEEE Transactions on Communications, April 1997, Vol. 45(4), pp. 437–44.

[3] Wu, X.: Efficient Lossless Compression of Continuous-tone Images via Context Selection and Quantization. IEEE Transactions on Image Processing, May 1997, Vol. IP-6, pp. 656–64.

[4] ITU-T; ISO/IEC: Information technology—Lossless and near-lossless compression of continuous-tone still images—Baseline. ITU-T Recommendation T.87 and ISO/IEC International Standard 14495-1, June 1999.

[5] Consultative Committee for Space Data Systems: Lossless Data Compression. CCSDS Recommendation for Space System Data Standards, CCSDS 121.0-B-1, Blue Book, May 1997.

[6] Memon, N.; Wu, X.: Recent developments in Context-Based Predictive Techniques for Lossless Image Compression. The Computer Journal, 1997, Vol. 40(2–3), pp. 127–36.

[7] Carpentieri, B.; Weinberger, M. J.; Seroussi, G.: Lossless compression of Continuous-Tone Images. Proceedings of the IEEE, November 2000, Vol. 88(11), pp. 1797–809.

[8] Merhav, N.; Seroussi, G.; Weinberger, M. J.: Optimal prefix codes for sources with two-sided geometric distributions. IEEE Transactions on Information Theory, Vol. IT-46(1), January 2000, pp. 121–35.

[9] Langdon, G.; Gulati, A.; Seiler, E.: On the JPEG model for lossless image compression. Proceedings DCC '92, Data Compression Conference, IEEE Comput. Soc. Press, Los Alamitos, California, 1992, pp. 172–80.

[10] Shannon, C. E.: A Mathematical Theory of Communication. Bell System Technical Journal, 1948, Vol. 27, pp. 379–423, 623–56.

[11] Moffat, A.; Neal, R. M.; Witten, I. H.: Arithmetic Coding Revisited. ACM Transactions on Information Systems, 1998, Vol. 16(3), pp. 256–94.

[12] Huffman, D. A.: A method for the construction of minimum-redundancy codes. Proceedings of the Institute of Radio Engineers 40(9), 1952, pp. 1098–101.

[13] Golomb, S. W.: Run-Length Encodings. IEEE Transactions on Information Theory, July 1966, IT-12, pp. 399–401.

[14] Rice, R. F.: Some practical universal noiseless coding techniques—part III. Jet Propulsion Laboratory tech. report JPL-79-22, 1979.

[15] Deng, G.; Ye, H.: Lossless image compression using adaptive predictor combination symbol mapping and context filtering. Proceedings of the IEEE International Conference on Image Processing, Kobe, Japan, Oct. 1999, Vol. 4, pp. 63–7.

[16] Li, X.; Orchard, M. T.: Edge-Directed Prediction for Lossless Compression of Natural Images. IEEE Transactions on Image Processing, June 2001, Vol. 10(6), pp. 813–17.

[17] ITU-T; ISO/IEC: Information technology—JPEG 2000 image coding system: Core coding system. ITU-T Recommendation T.800 and ISO/IEC International Standard 15444-1, August 2002.

[18] Christopoulos, C.; Skodras, A.; Ebrahimi, T.: The JPEG2000 Still Image Coding System: An Overview. IEEE Transactions on Consumer Electronics, November 2000, Vol. 46(4), pp. 1103–27.

[19] Howard, P. G.; Vitter, J. S.: Fast and efficient lossless image compression. Proceedings DCC '93, Data Compression Conference, IEEE Comput. Soc. Press, Los Alamitos, California, 1993, pp. 351–60.

[20] Starosolski, R.: Fast, robust and adaptive lossless image compression. Machine Graphics and Vision, 1999, Vol. 8, No. 1, pp. 95–116.

[21] Starosolski, R.; Skarbek, W.: Modified Golomb–Rice Codes for Lossless Compression of Medical Images. Proceedings of International Conference on E-health in Common Europe, Cracow, Poland, June 2003, pp. 423–37.

[22] Starosolski, R.: Reversing the Order of Codes in the Rice Family. Studia Informatica, 2002, Vol. 23, No. 4(51), pp. 7–16.

[23] Starosolski, R.: Performance evaluation of lossless medical and natural continuous tone image compression algorithms. Proceedings of the SPIE, Medical Imaging, Warsaw, Poland, September 2005, Vol. 5959, CID 59590L (pp. 116–27).

[24] Clunie, D. A.: Lossless compression of grayscale medical images—effectiveness of traditional and state of the art approaches. Proceedings of the SPIE, Medical Imaging 2000: PACS Design and Evaluation: Engineering and Clinical Issues, San Diego, California, 2000, Vol. 3980, pp. 74–84.

[25] Wu, X.; Memon, N.: Implementation of Context-based, Adaptive, Lossless Image Coder (CALIC), 1995, ftp://ftp.csd.uwo.ca/pub/from_wu/ (downloaded 20 Nov. 1998).

[26] Weinberger, M. J.; Seroussi, G.; Sapiro, G.: LOCO-I: A low complexity, context-based, lossless image compression algorithm. Proceedings DCC '96, IEEE Comput. Soc. Press, Los Alamitos, California, 1996, pp. 140–9.

[27] Weinberger, M. J.; Seroussi, G.; Sapiro, G.: The LOCO-I lossless image compression algorithm: Principles and standardization into JPEG-LS. IEEE Transactions on Image Processing, August 2000, Vol. 9(8), pp. 1309–24.

[28] Signal Processing and Multimedia Group, University of British Columbia: SPMG/JPEG-LS Implementation, version 2.2, 1999, ftp://ftp.se.netbsd.org/pub/NetBSD/packages/distfiles/jpeg_ls_v2.2.tar.gz (downloaded 1 Nov. 2004).

[29] University of New Mexico, Microelectronics Research Center: SZIP science data lossless compression program, combined version: 1.5, 2002, ftp://ftp.ncsa.uiuc.edu/HDF/lib-external/szip/ (downloaded 22 Sept. 2004).

[30] W3C Recommendation: PNG (Portable Network Graphics) Specification, Version 1.0, 1996, http://www.w3.org/TR/REC-png.html.

[31] Santa-Cruz, D.; Ebrahimi, T.: A Study of JPEG2000 Still Image Coding Versus Other Standards. Proceedings of X European Signal Processing Conference EUSIPCO, Tampere, Finland, Sept. 2000, Vol. 2, pp. 673–76.

[32] Starosolski, R.: Compressing images of sparse histograms. Proceedings of the SPIE, Medical Imaging, Warsaw, Poland, September 2005, Vol. 5959, CID 595912 (pp. 209–17).

[33] Pinho, A. J.: Preprocessing techniques for improving the lossless compression of images with quasi-sparse and locally sparse histograms. Proceedings of the IEEE International Conference on Multimedia and Expo, ICME-2002, Lausanne, Switzerland, August 2002.

[34] Pinho, A. J.: A comparison of methods for improving the lossless compression of images with sparse histograms. Proceedings of the IEEE International Conference on Image Processing, ICIP-2002, Rochester, NY, September 2002, Vol. 2, pp. 673–6.