Ultraspectral Sounder Data Compression Bormin Huang, Allen Huang, Alok Ahuja 5 th MURI Workshop Madison, WI, June 7-9, 2005 Cooperative Institute for Meteorological.

Ultraspectral Sounder Data Compression

Bormin Huang, Allen Huang, Alok Ahuja

5th MURI Workshop, Madison, WI, June 7-9, 2005

Cooperative Institute for Meteorological Satellite Studies (CIMSS), University of Wisconsin–Madison

Outline

Ultraspectral Sounder Data Compression

• Current state-of-the-art lossless compression schemes: 2D JPEG2000, 3D SPIHT (Set Partitioning In Hierarchical Trees), 2D CALIC (Context-based Adaptive Lossless Image Codec), 2D JPEG-LS

• CIMSS’s data preprocessing technique: Bias-Adjusted Reordering (BAR, 2004)

• CIMSS’s new lossless compression schemes: Predictive Partitioned Vector Quantization (PPVQ, 2004), Fast Precomputed Vector Quantization (FPVQ, 2005)

• Summary

Ultraspectral sounder data vs. hyperspectral imager data

• Imager data is used in classification, target detection and pattern recognition. Significant loss in imager data is usually tolerated by the human visual system.

• The main criterion for sounder data loss is retrieval quality. Retrieval of geophysical parameters from observed radiance is a mathematically ill-posed problem that is sensitive to errors in the data.

• Hence, there is a need for lossless or near-lossless compression of hyperspectral sounder data!

Ultraspectral sounder data for compression studies

AIRS: 2378 infrared channels, 135 scan lines x 90 cross-track footprints per granule

JPEG2000

• A new ISO/IEC (International Organization for Standardization/International Electrotechnical Commission) compression standard.

• Successor to the DCT (discrete cosine transform)-based JPEG algorithm.

Wavelet-Based Schemes

3D SPIHT

It exploits the 3D spatial hierarchical tree relationships among the wavelet transform coefficients for efficient compression (Huang et al. 2003).

Parent-child interband relationship and locations for 3D SPIHT coding

Examples of allowable parent-child relations for 2D irregular data

[Figure: CALIC encoder block diagram with the blocks Two-line Buffer, Binary Mode test (yes/no), Context Formation, Context Quantization, Gradient-Adjusted Prediction, Error Modeling, Conditional Probabilities Estimation, Coding Histogram Sharpening, Ternary Entropy Coder and Entropy Coder; output: codestream]

2D CALIC (Context-based Adaptive Lossless Image Codec)

• Among the nine proposals in the initial ISO/JPEG evaluation in July 1995, CALIC was ranked first.

• It is considered the benchmark for lossless compression of continuous-tone images.

Predictor-Based Schemes

[Figure: neighboring pixels n, nn, ne, nne, nw, w and ww of the current pixel (marked ?), indexed by i and j]

Schematic description of the CALIC encoder

Neighboring pixels used in prediction (Wu et al. 1997)

[Figure: JPEG-LS encoder block diagram. Image pixels pass a Flat Region? test: if No, Regular Mode (Prediction, Context Modeling, Error Encoding); if Yes, Run Mode (Run-length Coder); output: compressed bitstream]

2D JPEG-LS

• Published in 1999 as a lossless compression standard of the ISO/IEC.

[Figure: causal neighborhood of the current sample x: a (left), b (above), c (above-left), d (above-right)]

Schematic description of the JPEG-LS encoder

Neighborhood of JPEG-LS used in prediction
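As an illustration of how the neighborhood a, b, c enters the prediction, the following is a minimal sketch of the median edge detector used as the fixed predictor in JPEG-LS. The slide itself only names the neighborhood; the function name `med_predict` is hypothetical, and the sample d is left out because it serves context modeling rather than this prediction step.

```python
def med_predict(a, b, c):
    """Median edge detector: a = left, b = above, c = above-left neighbor."""
    if c >= max(a, b):
        # c is the largest: an edge is likely, predict the smaller neighbor
        return min(a, b)
    if c <= min(a, b):
        # c is the smallest: predict the larger neighbor
        return max(a, b)
    # smooth region: planar prediction
    return a + b - c


def prediction_error(x, a, b, c):
    """Residual that the regular mode would pass on to error encoding."""
    return x - med_predict(a, b, c)
```

For example, `med_predict(10, 20, 15)` falls in the smooth-region branch and returns 10 + 20 - 15 = 15.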

The Bias-Adjusted Reordering (BAR) Scheme (Huang et al., 2004)

A preprocessing technique for exploring the correlation among remote disjoint channels to improve the compression performance of the existing state-of-the-art schemes.

Given the $i$-th reordered vector $V^i$, we seek $V^*$ and $b^*$ to minimize

$$f(V, b) = \|V^i - V - b\|^2 = \sum_{k=1}^{n_s} (v^i_k - v_k - b)^2.$$

Then the $(i+1)$-th reordered vector is simply $V^{i+1} = V^* + b^*$.

The optimal value of $b^*$ is obtained by

$$\left.\frac{\partial f(V, b)}{\partial b}\right|_{b = b^*} = 0,$$

which yields

$$b^* = \frac{1}{n_s} \sum_{k=1}^{n_s} (v^i_k - v_k),$$

that is, the mean of the difference $V^i - V$.

For lossless compression, $b^*$ is rounded to the nearest integer $\lfloor b^* \rceil$, and the $(i+1)$-th reordered vector becomes $V^{i+1} = V^* + \lfloor b^* \rceil$.
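The BAR selection rule can be sketched in a few lines of Python. This is an illustrative reading, not the authors' implementation: it assumes each vector is one channel with $n_s$ integer samples, starts from an arbitrarily chosen vector (`start`, a hypothetical parameter), picks the next vector greedily using the unrounded optimal bias, and keeps the rounded biases so that the reordering can be undone.

```python
import math


def optimal_bias(current, candidate):
    """b* = mean of (current - candidate), the least-squares scalar bias."""
    return sum(c - v for c, v in zip(current, candidate)) / len(current)


def cost(current, candidate, bias):
    """f(V, b) = sum_k (v^i_k - v_k - b)^2."""
    return sum((c - v - bias) ** 2 for c, v in zip(current, candidate))


def bar_reorder(vectors, start=0):
    """Greedy bias-adjusted reordering; returns (order, biases, adjusted vectors)."""
    remaining = [j for j in range(len(vectors)) if j != start]
    order, biases = [start], [0]
    adjusted = [list(vectors[start])]
    while remaining:
        current = adjusted[-1]
        # candidate whose bias-adjusted version is closest to the current vector
        best = min(
            remaining,
            key=lambda j: cost(current, vectors[j], optimal_bias(current, vectors[j])),
        )
        # round to the nearest integer so the step stays lossless
        bias = math.floor(optimal_bias(current, vectors[best]) + 0.5)
        remaining.remove(best)
        order.append(best)
        biases.append(bias)
        adjusted.append([v + bias for v in vectors[best]])
    return order, biases, adjusted


channels = [[10, 12, 14], [100, 102, 105], [11, 13, 15], [50, 40, 30]]
order, biases, adjusted = bar_reorder(channels)
```

Subtracting `biases[k]` from `adjusted[k]` recovers the original vector `channels[order[k]]`, so the order and the integer biases are the only side information this sketch needs.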

PPVQ (Predictive Partitioned Vector Quantization)

Linear Prediction: Each spatial frame is estimated from a linear combination of neighboring frames.

Channel Partitioning:

Vector Quantization:

Entropy Coding

More results from Serra-Sagrista et al. (IGARSS 2005)

Fast Precomputed Vector Quantization (FPVQ) (Huang et al. 2005)

• Linear Prediction: Each channel value is estimated from the linear combination of neighboring spectral channels

• Bit-depth Partitioning: Channels with the same bit depth assigned to the same partition

• Vector Quantization with Precomputed Codebooks: Normalized Gaussian codebooks are used for each partition.

• Optimal Bit Allocation: An algorithm is presented to reduce the expected total number of bits for quantization errors.

• Entropy Coding: Quantization indices and quantization errors are encoded using arithmetic coding.

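Linear Prediction

Each channel is estimated from a linear combination of $n_p$ neighboring channels:

$$\hat{X}_i = \sum_{k=1}^{n_p} c_k X_{i-k} \quad \text{or} \quad \hat{X}_i = X_p C.$$

The prediction coefficients are given by

$$C = (X_p^T X_p)^{\dagger} (X_p^T X_i),$$

where $X_p$ collects the $n_p$ neighboring channels as columns and $\dagger$ denotes the pseudo-inverse.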

Prediction error of each channel is close to a Gaussian distribution with a different standard deviation.
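A small worked example of the least-squares predictor, as a sketch under stated assumptions: it uses exact rational arithmetic from the standard library, and it assumes $X_p^T X_p$ is invertible so that the pseudo-inverse reduces to an ordinary inverse. The function names are hypothetical.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix * c = rhs by Gauss-Jordan elimination (matrix assumed invertible)."""
    n = len(matrix)
    rows = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [vr - factor * vc for vr, vc in zip(rows[r], rows[col])]
    return [row[n] for row in rows]


def prediction_coefficients(neighbors, target):
    """C = (Xp^T Xp)^-1 (Xp^T X_i); neighbors is a list of n_p channels."""
    gram = [[sum(p * q for p, q in zip(u, v)) for v in neighbors] for u in neighbors]
    cross = [sum(p * t for p, t in zip(u, target)) for u in neighbors]
    return solve(gram, cross)


def predict(neighbors, coeffs):
    """X_hat_i = sum_k c_k X_{i-k}, evaluated sample by sample."""
    return [sum(c * ch[s] for c, ch in zip(coeffs, neighbors))
            for s in range(len(neighbors[0]))]


neighbors = [[1, 2, 3, 4], [1, 0, 1, 0]]   # two neighboring channels (toy data)
target = [5, 4, 9, 8]                      # equals 2 * first + 3 * second
coeffs = prediction_coefficients(neighbors, target)
errors = [t - p for t, p in zip(target, predict(neighbors, coeffs))]
```

In this toy case the target is an exact linear combination of its neighbors, so the coefficients come out as 2 and 3 and the prediction errors are all zero; on real data the errors are what the later stages quantize and encode.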

Examples of Gaussian-like distributions of linear prediction errors

Vector Quantization with Precomputed Codebooks

• Prediction errors of each channel are close to a Gaussian distribution with a different standard deviation.

• The number of channels in each partition is represented as a linear combination of powers of 2. All $2^k$ channels within a partition form a sub-partition.

• Only codebooks with $2^m$ codewords for $2^k$-dimensional normalized Gaussian distributions are precomputed via the LBG algorithm.

• The actual, data-specific Gaussian codebook is the precomputed normalized Gaussian codebook scaled by the standard deviation spectrum.
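The bullets above can be illustrated with a short sketch. Everything named here is hypothetical: the four-codeword, two-dimensional codebook is a toy stand-in for an LBG-trained normalized Gaussian codebook, and rounding the scaled codeword to integers before taking the quantization error is an assumption made to keep the example lossless.

```python
import math


def power_of_two_blocks(n_channels):
    """Split a partition of n_channels into sub-partition sizes that are powers of 2."""
    return [1 << k for k in range(n_channels.bit_length() - 1, -1, -1)
            if (n_channels >> k) & 1]


def scale_codebook(normalized_codebook, sigmas):
    """Scale each normalized codeword by the per-channel standard deviations."""
    return [[s * c for s, c in zip(sigmas, codeword)] for codeword in normalized_codebook]


def quantize(vector, codebook):
    """Return (index of the nearest codeword, integer quantization error)."""
    def distance(codeword):
        return sum((v - q) ** 2 for v, q in zip(vector, codeword))

    index = min(range(len(codebook)), key=lambda i: distance(codebook[i]))
    rounded = [math.floor(q + 0.5) for q in codebook[index]]
    error = [v - r for v, r in zip(vector, rounded)]
    return index, error


# Toy normalized codebook: 2^2 codewords for 2^1-dimensional vectors.
toy_codebook = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
codebook = scale_codebook(toy_codebook, sigmas=[10.0, 2.0])
index, error = quantize([8, -3], codebook)
```

Here a partition of 13 channels would split into sub-partitions of 8, 4 and 1 channels, and the vector `[8, -3]` maps to codeword index 2 with error `[-2, -1]`. The index and the error together reproduce the input exactly.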

Optimal Bit Allocation (Huang et al. 2005)

• Bit allocation algorithms based on marginal analysis have been proposed in the literature (Riskin 1991, Cuperman 1993).

• These algorithms may not guarantee an optimal solution because they terminate as soon as the constraints of their respective minimization problems are met, and thus have no chance to move further along the hyperplane of the constraint to reach a minimum solution.

Bit Allocation Minimization Problem for Lossless Ultraspectral Compression

$$f(b^*) = \min_{b_{ij}} \sum_{i} \sum_{j} L_{ij}(b_{ij})$$

subject to

$$\sum_{i} \sum_{j} b_{ij} = n_b,$$

where $L_{ij}(b_{ij})$ is the expected total bits for the quantization errors and the quantization indices.

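New Optimal Bit Allocation Algorithm

• Step 1) Set $b_{ij} = 1, \ \forall i, j$.

• Step 2) Compute the marginal decrement $\Delta L_{ij} = L_{ij}(2) - L_{ij}(1), \ \forall i, j$.

• Step 3) Find indices $\alpha, \beta$ for which $\Delta L_{\alpha\beta}(b_{\alpha\beta})$ is minimum.

• Step 4) Set $b_{\alpha\beta} = b_{\alpha\beta} + 1$.

• Step 5) Update $\Delta L_{\alpha\beta}$ for the new value of $b_{\alpha\beta}$.

• Step 6) Repeat Steps 3-5 until $\sum_{i} \sum_{j} b_{ij} = n_b$.

• Step 7) Compute the next marginal decrement $\Delta L^*_{ij} = L_{ij}(b_{ij} + 1) - L_{ij}(b_{ij}), \ \forall i, j$.

• Step 8) Find $(\alpha, \beta) = \arg\min_{(i,j)} \Delta L^*_{ij}(b_{ij})$ and $(\gamma, \delta) = \arg\max_{(i,j)} \Delta L_{ij}(b_{ij})$.

• Step 9) If $\Delta L^*_{\alpha\beta} < \Delta L_{\gamma\delta}$, set $b_{\alpha\beta} = b_{\alpha\beta} + 1$ and $b_{\gamma\delta} = b_{\gamma\delta} - 1$, update the marginal decrements of the two affected index pairs, and go to Step 8; else, STOP.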
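The two phases of the algorithm can be sketched as follows. This is one possible reading rather than the authors' code, and it rests on several assumptions: the index pairs $(i, j)$ are flattened to a single index `p`; `tables[p][b - 1]` is a hypothetical lookup table holding $L_p(b)$; in the second phase the change from adding the next bit to one pair is compared with the change contributed by the last bit of a different pair; and the requirement that the two pairs differ is an added guard.

```python
def next_change(tables, bits, p):
    """L_p(b_p + 1) - L_p(b_p); +inf when no further bit is tabulated."""
    b = bits[p]
    return tables[p][b] - tables[p][b - 1] if b < len(tables[p]) else float("inf")


def last_change(tables, bits, p):
    """L_p(b_p) - L_p(b_p - 1); -inf when b_p is already 1 (nothing to give back)."""
    b = bits[p]
    return tables[p][b - 1] - tables[p][b - 2] if b > 1 else float("-inf")


def greedy_allocate(tables, total_bits):
    """Steps 1-6: start at one bit each, then add bits by marginal analysis."""
    n = len(tables)
    if not n <= total_bits <= sum(len(t) for t in tables):
        raise ValueError("total_bits is outside the range covered by the tables")
    bits = [1] * n
    while sum(bits) < total_bits:
        p = min(range(n), key=lambda q: next_change(tables, bits, q))
        bits[p] += 1
    return bits


def refine(tables, bits):
    """Steps 7-9: move single bits between pairs while the total cost decreases."""
    bits = list(bits)
    n = len(bits)
    while n > 1:
        add = min(range(n), key=lambda q: next_change(tables, bits, q))
        drop = max((q for q in range(n) if q != add),
                   key=lambda q: last_change(tables, bits, q))
        if next_change(tables, bits, add) < last_change(tables, bits, drop):
            bits[add] += 1      # the sum of bits stays on the constraint
            bits[drop] -= 1
        else:
            break
    return bits


def total_cost(tables, bits):
    return sum(tables[p][b - 1] for p, b in enumerate(bits))


tables = [[20, 18, 4], [20, 16, 15]]   # toy cost tables L_p(1), L_p(2), L_p(3)
greedy = greedy_allocate(tables, total_bits=4)
refined = refine(tables, greedy)
```

With these toy tables the greedy phase stops at `[2, 2]` with a total cost of 34, while the refinement moves one bit along the constraint to reach `[3, 1]` with a total cost of 24. Each accepted move strictly lowers the total cost, so the loop terminates.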

Example of Optimal Bit Allocation Algorithm

Lossless Compression Ratios for AIRS data

Acknowledgement: This research is supported by NOAA NESDIS OSD under grant NA07EC0676.

Summary

• In support of the NOAA/NESDIS GOES-R HES data processing studies, we investigated and developed lossless compression of 3D hyperspectral sounder data using wavelet-based schemes (3D SPIHT, JPEG2000), predictor-based schemes (CALIC, JPEG-LS), and clustering-based schemes (PPVQ, FPVQ).

• The performance rank in terms of compression ratios before our BAR scheme: JPEG-LS > 3D SPIHT > JPEG2000 > CALIC.

• After our BAR scheme, the compression ratios of JPEG-LS, 3D SPIHT, JPEG2000 and CALIC are significantly improved, and they all perform almost equally well!

• Our FPVQ and PPVQ schemes provide significantly higher compression ratios than existing state-of-the-art schemes on ultraspectral sounder data.
