Lecture07 JPEG2000

JPEG2000

Presented by: Eddie Zaslavsky

Image Processing seminar (2003)

The next generation still image-compression standard

2

Contents:

1. Why another standard?
2. JPEG2000
3. Examples
4. Conclusions
5. Add-on: EZW algorithm

3

Why another standard?

• Low bit-rate compression: At low bit-rates (e.g. below 0.25 bpp for highly detailed gray-level images) the distortion in JPEG becomes unacceptable.

• Lossless and lossy compression: There is a need for a standard that provides both lossless and lossy compression in a single codestream.

• Large images: JPEG cannot compress images larger than 64K x 64K pixels without tiling.

4

Why another standard? (cont'd)

• Single decompression architecture: JPEG has 44 modes, many of which are application-specific and not used by the majority of JPEG decoders.

• Transmission in noisy environments: In JPEG, quality suffers dramatically when bit errors are encountered.

• Computer-generated imagery: JPEG is optimized for natural images and performs poorly on computer-generated images.

• Compound documents: JPEG fails to compress bi-level (text) imagery.

5

JPEG2000 - Targets

• Coding standard for:
  * different types of still images (gray-level, color, ...)
  * different characteristics (natural, scientific, ...)
  * different imaging models (client/server, real-time, ...)
  within a unified and integrated system.

• This coding system is intended for: low bit-rate applications, exhibiting rate-distortion and subjective image quality performance superior to existing standards.

6

JPEG2000 (encoder-decoder scheme)

7

JPEG2000 - Overview

• The source image is decomposed into components (the standard allows up to 16384).

• The image components are (optionally) decomposed into rectangular tiles. The tile-component is the basic unit of the original or reconstructed image.

• A wavelet transform is applied on each tile. The tile is decomposed into different resolution levels.

• The decomposition levels are made up of subbands of coefficients that describe the frequency characteristics of local areas of the tile components, rather than across the entire image component.

• The sub-bands of coefficients are quantized and collected into rectangular arrays of code blocks.

8

JPEG2000 - Overview (cont'd)

• The bit planes of the coefficients in a code block (i.e. the bits of equal significance across the coefficients in a code block) are entropy coded.

• The encoding can be done in such a way that certain regions of interest (ROI) can be coded at a higher quality than the background.

• Markers are added to the bit stream to allow for error resilience.

• The code stream has a main header at the beginning that describes the original image and the various decomposition and coding styles that are used to locate, extract, decode and reconstruct the image with the desired resolution, fidelity, region of interest or other characteristics.

9

Pre-Processing

1. Image tiling:
• The image may be quite large in comparison to the amount of memory available to the codec.
• The original image is partitioned into rectangular non-overlapping blocks (tiles), which are compressed independently.

2. DC-level shifting:
• The codec expects its input sample data to have a nominal dynamic range that is approximately centered about zero (e.g. 0..255 -> -128..127).
• If the sample values are unsigned, the nominal dynamic range of the samples is adjusted by subtracting a bias of 2^(P-1) from each sample value, where P is the component's precision.
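The DC-level shift can be sketched in a few lines of Python. The function names are illustrative only (they are not part of any codec API); the bias 2^(P-1) is the one described on this slide.

```python
def dc_level_shift(samples, precision):
    """Center unsigned samples about zero by subtracting 2**(precision - 1)."""
    bias = 1 << (precision - 1)
    return [s - bias for s in samples]


def inverse_dc_level_shift(samples, precision):
    """Undo the shift at the decoder by adding the same bias back."""
    bias = 1 << (precision - 1)
    return [s + bias for s in samples]


shifted = dc_level_shift([0, 128, 255], precision=8)
restored = inverse_dc_level_shift(shifted, precision=8)
```

For 8-bit data the bias is 128, so the range 0..255 maps to -128..127.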

10

Pre-Processing - Tiling

• All operations, including component mixing, wavelet transform, quantization and entropy coding are performed independently on the image tiles.

• Tiling affects the image quality both subjectively and objectively.
  * Smaller tiles create more tiling artifacts.

11

Pre-Processing (cont'd)

3. Component transformation:
• Maps data from RGB to YCrCb. The Y, Cr and Cb components are less statistically dependent than R, G and B and therefore compress better: the transform reduces the correlation between components, which improves coding efficiency. There are reversible and irreversible transforms.

Forward reversible component transform

Inverse reversible component transform
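The equations for these two slides did not survive extraction. As a hedged illustration, the sketch below uses the integer reversible transform as it is commonly stated for JPEG2000 (luminance as a floor-averaged sum, chrominance as differences against G). The function names are assumptions made for this example.

```python
def rct_forward(r, g, b):
    """Forward reversible component transform (integer arithmetic only)."""
    y = (r + 2 * g + b) >> 2   # floor((R + 2G + B) / 4)
    cb = b - g
    cr = r - g
    return y, cb, cr


def rct_inverse(y, cb, cr):
    """Inverse transform: recovers R, G, B exactly."""
    g = y - ((cb + cr) >> 2)   # >> floors, also for negative sums
    r = cr + g
    b = cb + g
    return r, g, b


triple = rct_forward(255, 0, 0)
```

Because only integer additions and floor shifts are involved, the inverse reproduces the input exactly, which is what makes this transform usable for lossless coding.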

12

Pre-Processing - Component Transformations

• Component transformations improve compression and allow visually relevant quantization:

• Irreversible component transformation (ICT): floating point, for use with the irreversible (floating point 9/7) wavelet.

• Reversible component transformation (RCT): integer approximation, for use with the reversible (integer 5/3) wavelet.

13

ICT (example)

14

Wavelet Transform

• Floating point 9/7 wavelet filter for lossy compression:
  * Best performance at low bit rates
  * High implementation complexity, especially for hardware

• Integer 5/3 wavelet filter for lossless coding:
  * Integer arithmetic, low implementation complexity

• Each row and column is filtered with a high-pass and a low-pass filter, followed by downsampling by 2 (so that the total number of samples stays the same).

• The tile is now divided into sub-bands. The compressed data of one precinct (a group of code-blocks covering the same spatial region at one resolution level), together with the information needed to locate it, is put together in a contiguous segment of the code-stream called a packet.

15

Code-blocks, precincts and packets

16

Wavelet Transform

Two filtering modes:
• Convolution based: performing a series of dot products between the two filter masks and the extended 1-D signal.
• Lifting based: a sequence of very simple filtering operations in which, alternately, the odd sample values of the signal are updated with a weighted sum of even sample values, and vice versa.

Lossless 1D DWT (figure)

Lossy 1D DWT (figure), lifting coefficient magnitudes: α = 1.586, β = 0.052, γ = 0.882, δ = 0.443, K = 1.230

P and U stand for Prediction and Update.
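The lifting idea can be sketched for the lossless (integer 5/3) case. This is a minimal illustration rather than a codec implementation: it uses the commonly stated 5/3 lifting steps (a prediction step on the odd samples, then an update step on the even samples) with mirrored boundaries, and assumes a 1-D integer signal with at least two samples. All function names are made up for this example.

```python
def _mirror(i, n):
    """Index into a length-n signal with whole-sample symmetric extension."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def dwt53_forward(x):
    """One level of the reversible 5/3 transform; returns (low, high)."""
    n = len(x)
    y = list(x)
    for i in range(1, n, 2):   # P: predict odd samples from even neighbours
        y[i] = x[i] - ((x[_mirror(i - 1, n)] + x[_mirror(i + 1, n)]) >> 1)
    for i in range(0, n, 2):   # U: update even samples from the new odd ones
        y[i] = x[i] + ((y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) >> 2)
    return y[0::2], y[1::2]


def dwt53_inverse(low, high):
    """Undo the lifting steps in reverse order."""
    n = len(low) + len(high)
    y = [0] * n
    y[0::2] = low
    y[1::2] = high
    x = list(y)
    for i in range(0, n, 2):   # undo U
        x[i] = y[i] - ((y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) >> 2)
    for i in range(1, n, 2):   # undo P
        x[i] = y[i] + ((x[_mirror(i - 1, n)] + x[_mirror(i + 1, n)]) >> 1)
    return x


signal = [10, 12, 14, 200, 3, 7, 9, 1]
low, high = dwt53_forward(signal)
reconstructed = dwt53_inverse(low, high)
```

Each lifting step is inverted by subtracting exactly what was added, so the integer rounding cancels and reconstruction is exact.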

17

Wavelet Transform

• Symmetric extension: the signal is mirrored at both boundaries so that, for the filtering operations that take place there, a signal sample exists for, and spatially corresponds to, each coefficient of the filter mask.

18

DWT (example)

In JPEG2000, multiple stages of the DWT are performed. JPEG2000 supports from 0 to 32 stages. For natural images, between 4 and 8 stages are typically used.

19

Quantization

• The wavelet coefficients are quantized using a uniform quantizer with deadzone. For each subband b, a basic quantizer step size Δ_b is used to quantize all the coefficients a_b(u,v) in that subband according to:

  q_b(u,v) = sign(a_b(u,v)) * floor(|a_b(u,v)| / Δ_b)

• Example: given a quantizer step of 10 and an encoder input value of 21.82, the quantizer index is determined as:

  q = sign(21.82) * floor(21.82 / 10) = 2
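The deadzone quantizer on this slide is easy to try out in Python. The sketch below is illustrative: the function names are made up, and the reconstruction offset of 0.5 in `dequantize` is an assumption (a decoder is free to choose where inside the interval it reconstructs).

```python
import math


def deadzone_quantize(value, step):
    """Uniform quantizer with deadzone: sign(value) * floor(|value| / step)."""
    sign = -1 if value < 0 else 1
    return sign * math.floor(abs(value) / step)


def dequantize(index, step, offset=0.5):
    """Reconstruct a value inside the quantization interval (offset is an assumption)."""
    if index == 0:
        return 0.0
    sign = -1 if index < 0 else 1
    return sign * (abs(index) + offset) * step


index = deadzone_quantize(21.82, 10)
approx = dequantize(index, 10)
```

All inputs with magnitude below the step size map to index 0, so the zero bin is twice as wide as the others. This is the "deadzone".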

20

Coefficient Bit Modeling

• Wavelet coefficients are associated with different sub-bands arising from the 2D separable transform applied.

• These coefficients are then arranged into rectangular blocks within each sub-band, called code-blocks.

21

Coefficient Bit Modeling (cont'd)

• Code-blocks are then coded one bit-plane at a time, starting from the Most Significant Bit-Plane and ending with the Least Significant Bit-Plane. If some of the top bit-planes contain no 1s, the MSB-plane is taken to be the topmost bit-plane with at least one 1, and the number of skipped bit-planes is encoded in a header.

Example (figure): the 2x2 block [3 6; 5 1] split into its bit-planes, from MSB-plane to LSB-plane.
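The bit-plane split of the 2x2 example block can be reproduced with a short Python sketch. The helper name `bit_planes` is hypothetical; it works on coefficient magnitudes and starts at the topmost plane that contains at least one 1.

```python
def bit_planes(block):
    """Split the magnitudes of a 2-D block into bit-planes, MSB-plane first."""
    largest = max(abs(v) for row in block for v in row)
    n_planes = largest.bit_length()   # empty top planes are skipped
    planes = []
    for p in range(n_planes - 1, -1, -1):
        planes.append([[(abs(v) >> p) & 1 for v in row] for row in block])
    return planes


planes = bit_planes([[3, 6], [5, 1]])
# planes[0] is the MSB-plane, planes[-1] the LSB-plane
```

For this block the largest magnitude is 6, so three planes are produced.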

22

Coefficient Bit Modeling (cont'd)

• For each bit-plane in a code-block, a special code-block scan pattern is used for each of three coding passes.

23

3 Passes Scanning

• Each coefficient bit in the bit-plane is coded in only one of the Three Coding Passes:

1. Significance Propagation 2. Magnitude Refinement 3. Clean-up

24

3 Passes Scanning

1. Significance Propagation Pass:
• If a coefficient is still insignificant (significance flag = 0) but at least one of its eight neighbors is significant (flag = 1), its bit in the current bit-plane is encoded.
• If that bit is a 1, the coefficient's significance flag is set to 1 and the sign of the coefficient is encoded.

The encoding is done by the MQ-coder, a low-complexity entropy coder.

2. Magnitude Refinement Pass:
• Codes the bits of samples that are already significant and were not coded in the significance propagation pass.

3. Clean-up Pass:
• Codes all bits that were passed over by the previous two coding passes (insignificant bits). It is the first pass for the MSB-plane.
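How a bit-plane is divided among the three passes can be illustrated with a simplified Python sketch. It only decides which pass each bit belongs to. It does no entropy coding, and it scans in plain raster order instead of the special code-block scan pattern. Both simplifications are assumptions made for brevity, and the names are made up.

```python
def classify_passes(bits, significant):
    """Assign each bit of one bit-plane to a pass: "SP", "MR" or "CU".

    bits:        2-D list of 0/1 for the current bit-plane
    significant: 2-D list of bools, significance flags from higher planes
    Returns (passes, updated_flags).
    """
    h, w = len(bits), len(bits[0])
    sig = [row[:] for row in significant]
    passes = [[None] * w for _ in range(h)]

    def has_significant_neighbor(r, c):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < h and 0 <= cc < w and sig[rr][cc]:
                    return True
        return False

    for r in range(h):          # 1. significance propagation
        for c in range(w):
            if not sig[r][c] and has_significant_neighbor(r, c):
                passes[r][c] = "SP"
                if bits[r][c]:
                    sig[r][c] = True
    for r in range(h):          # 2. magnitude refinement
        for c in range(w):
            if significant[r][c]:
                passes[r][c] = "MR"
    for r in range(h):          # 3. clean-up: everything not yet coded
        for c in range(w):
            if passes[r][c] is None:
                passes[r][c] = "CU"
                if bits[r][c]:
                    sig[r][c] = True
    return passes, sig


none_yet = [[False, False], [False, False]]
msb_passes, flags = classify_passes([[0, 1], [1, 0]], none_yet)
next_passes, flags2 = classify_passes([[1, 1], [0, 0]], flags)
```

On the MSB-plane nothing is significant yet, so every bit falls into the clean-up pass, which matches the remark that clean-up is the first pass for that plane.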

25

Quality layers organization

• The resulting bit streams for each code-block are organized into quality layers. A quality layer is a collection of some consecutive bit-plane coding passes from each tile. Each code-block can contribute an arbitrary number of bit-plane coding passes to a layer, but not all coding passes must be assigned to a quality layer. Every additional layer increases the image quality.

26

Rate Control

• Rate control is the process by which the code-stream is altered so that a target bit rate can be reached.

• Once the entire image has been compressed, a post-processing operation passes over all the compressed blocks and determines the extent to which each block's embedded bit stream should be truncated in order to achieve the target bit rate.

• The ideal truncation strategy is one that minimizes distortion while still reaching the target bit-rate.

• The code-blocks are compressed independently, so any bit stream truncation policy can be used.
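One possible truncation policy is sketched below in Python. It is a simple greedy heuristic chosen for illustration, not a method prescribed by this lecture: each block offers its coding passes in order, each pass has a size in bytes and a distortion reduction (both hypothetical inputs), and the pass with the best distortion reduction per byte is taken next until the budget is used up.

```python
import heapq


def truncate_blocks(blocks, budget):
    """Greedy truncation of embedded block bit streams.

    blocks: one list per code-block of (size_bytes, distortion_drop) tuples,
            in coding order; sizes must be positive.
    Returns (passes_kept_per_block, bytes_used).
    """
    kept = [0] * len(blocks)
    heap = []
    for b, passes in enumerate(blocks):
        if passes:
            size, gain = passes[0]
            heapq.heappush(heap, (-gain / size, b))
    used = 0
    while heap:
        _, b = heapq.heappop(heap)
        size, _gain = blocks[b][kept[b]]
        if used + size > budget:
            continue            # this block's stream is truncated here
        used += size
        kept[b] += 1
        if kept[b] < len(blocks[b]):
            size, gain = blocks[b][kept[b]]
            heapq.heappush(heap, (-gain / size, b))
    return kept, used


example_blocks = [
    [(10, 100), (10, 40), (10, 10)],   # block 0: three coding passes
    [(20, 120), (20, 20)],             # block 1: two coding passes
]
kept, used = truncate_blocks(example_blocks, budget=40)
```

Because the blocks are coded independently, each stream can be cut at any pass boundary without affecting the others. The greedy choice is only near-optimal when each block's distortion gain per byte decreases from pass to pass.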

27

Bit stream organization

• In bit stream organization, the compressed data from the bit-plane coding passes are separated into packets.

• Then, the packets are multiplexed together in an ordered manner to form one code-stream.

• Each precinct generates one packet, even if the packet is empty. A packet is composed of a header and the compressed data.

28

Bit stream organization

29

Bit stream organization (cont'd)

• There are 5 ways to order the packets, called progressions, where position refers to the precinct number:

  * Quality: layer, resolution, component, position
  * Resolution 1: resolution, layer, component, position
  * Resolution 2: resolution, position, component, layer
  * Position: position, component, resolution, layer
  * Component: component, position, resolution, layer

• The sort keys of each progression are listed from most significant to least significant. It is also possible for the progression order to change arbitrarily within the code-stream.
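The five progressions amount to nested loops with different loop orders. The Python sketch below enumerates packet indices for a given progression. It assumes, for simplicity, that every resolution has the same number of precincts, and the names `PROGRESSIONS` and `packet_order` are made up for this example.

```python
from itertools import product

# Sort keys from most significant (outermost loop) to least significant.
PROGRESSIONS = {
    "Quality":      ("layer", "resolution", "component", "position"),
    "Resolution 1": ("resolution", "layer", "component", "position"),
    "Resolution 2": ("resolution", "position", "component", "layer"),
    "Position":     ("position", "component", "resolution", "layer"),
    "Component":    ("component", "position", "resolution", "layer"),
}


def packet_order(progression, counts):
    """List packet indices in code-stream order for the chosen progression."""
    keys = PROGRESSIONS[progression]
    ranges = [range(counts[k]) for k in keys]
    return [dict(zip(keys, idx)) for idx in product(*ranges)]


counts = {"layer": 2, "resolution": 2, "component": 1, "position": 1}
by_quality = packet_order("Quality", counts)
by_resolution = packet_order("Resolution 1", counts)
```

With the quality progression, all resolutions of layer 0 arrive before any packet of layer 1. With the first resolution progression, all layers of the lowest resolution arrive first.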

30

Code stream organization (diagram)

31

Decoding

• The decoder basically performs the opposite of the encoder:

• The code-stream is received by the decoder according to the progression order stated in the header. The coefficients in the packets are then decoded and dequantized, the inverse wavelet transform is applied, and the inverse component transform (e.g. the inverse ICT) is performed.

• In the case of irreversible compression, the decompression results in loss of data: the resulting image is not exactly identical to the original.

32

Characteristics:

So, what is new in JPEG2000 compared to previous encoding standards?

1. Compress once, decompress many ways
2. Region-of-interest encoding
3. Progression
4. Error resilience

33

JPEG2000 - Markets & Applications

34

Compress once, decompress many ways

• In JPEG2000, the compressor decides the maximum resolution and maximum image quality to be used.

• It is also possible to perform random access by decompressing only a certain region of the image or a specific component of the image (e.g. the grayscale component of a color image). Both can be performed with varying qualities and resolutions.

• In each case it is possible to locate, extract, and decode the bytes required for the desired image product without decoding the entire code-stream.

35

Region-of-interest (ROI)

• A ROI is a part of an image that is encoded with higher quality than the rest of the image (the background). The encoding is done in such a way that the information associated with the ROI precedes the information associated with the background.

• 2 methods: Scaling based and Maxshift

36

Region-of-interest (ROI) - Scaling based

1. The wavelet transform is calculated.

2. An ROI mask is derived, indicating the set of coefficients that are required for up to lossless ROI reconstruction.

3. The wavelet coefficients are quantized.

4. The coefficients that lie outside the ROI are downscaled by a specified scaling value.

5. The resulting coefficients are progressively entropy encoded (with the most significant bit-planes first).

6. The ROI's scaling value and coordinates are added to the bit stream.

37

Region-of-interest (ROI) - Maxshift method

• ROI mask (a bit map) is created describing which quantized transform coefficients must be encoded with better quality.

• The quantized transform coefficients outside the ROI mask (background coefficients) are scaled down so that the bits associated with the ROI are placed in highest bit-planes and coded before the background.

• Selection of the scaling value S: S ≥ max(M_b), where M_b is the largest number of magnitude bit-planes for any background coefficient in any code-block in the current component. After the background coefficients are scaled, the LSB of every shifted ROI coefficient lies above the (non-zero) MSB of every background coefficient.

• Advantage: arbitrary shaped ROIs without the need for shape information at the decoder.
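The Maxshift idea can be sketched on a flat list of quantized integer coefficients. Scaling the background down by S bit-planes is equivalent to shifting the ROI coefficients up by S, which is what this illustrative Python code does. The function names are assumptions, and the smallest admissible S is used.

```python
def maxshift_encode(coeffs, roi_mask):
    """Shift ROI coefficients above every background bit-plane.

    Returns (shifted_coeffs, s), where s is the scaling value.
    """
    background = [abs(c) for c, in_roi in zip(coeffs, roi_mask) if not in_roi]
    s = max(background, default=0).bit_length()   # S >= max(M_b)
    shifted = [c << s if in_roi else c for c, in_roi in zip(coeffs, roi_mask)]
    return shifted, s


def maxshift_decode(shifted, s):
    """Recover coefficients using only s: no ROI shape information is needed."""
    out = []
    for v in shifted:
        if abs(v) >= (1 << s):          # magnitude above all background planes
            sign = -1 if v < 0 else 1
            out.append(sign * (abs(v) >> s))
        else:                            # background (or a zero-valued ROI sample)
            out.append(v)
    return out


coeffs = [5, -3, 12, 0, 7]
mask = [False, True, True, True, False]
shifted, s = maxshift_encode(coeffs, mask)
decoded = maxshift_decode(shifted, s)
```

The decoder separates ROI from background purely by magnitude, which is why arbitrarily shaped regions need no mask in the code-stream.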

38

ROI - example

Original Image

with ROI Defined

Decoded Image

with ROI Intact

39

Scalability and bit-stream parsing

• 2 important modes of scalability:
  * Resolution / spatial
  * Quality (SNR)

• Bit-stream parsing: a combination of spatial and quality scalability. It is possible to progress by spatial scalability to a given (resolution) level and then change the progression to SNR at a higher level.

40

Resolution scalability

44

Quality scalability

47

Error resilience

Error effects:

1. In a packet body: corrupted arithmetically coded data for some code-block => severe distortion.

2. In a packet head: a wrong body length can be decoded and code-block data can be assigned to the wrong code-blocks => total synchronization loss.

3. Bytes missing (i.e. network packet loss): combined effects of errors in the packet head and body.

48

Protecting code-block data

1. Segmentation symbols: a special symbol sequence is coded at the end of each bit-plane. If the wrong sequence is decoded, an error has occurred and (at least) the last bit-plane is corrupted.

2. Regular predictable termination: the arithmetic coder is terminated at the end of each coding pass using a special algorithm (predictable termination). The decoder reproduces the termination, and if it does not find the same unused bits at the end, an error has occurred in (at least) the last coding pass.

3. Both mechanisms can be freely mixed, but they slightly decrease the compression efficiency.

49

Protecting packet head

1. SOP resynchronization marker: every packet can be preceded by an SOP marker with a sequence index. If an SOP marker with the correct sequence index is not found just before the packet head, an error has occurred. In that case the decoder searches the codestream for the next unaffected packet, and decoding proceeds from there.

2. PPM/PPT markers: the packet head content can be moved to the main or tile headers in the codestream and transmitted through a channel with a much lower error rate.

3. Precincts: they limit packet head errors to a small image area.

50

Error resilience (cont'd)

51

Examples

Reconstructed images compressed at 0.25 bpp by means of (a) JPEG and (b) JPEG2000

52

Examples

Reconstructed images compressed at 0.125 bpp by means of (a) JPEG and (b) JPEG2000

53

Examples

JPEG 2000 (1.83 KB)

Original (979 KB)

JPEG (6.21 KB)

54

Conclusion:

• Benefits: lossless and lossy compression, higher image quality and compression ratios, the ability to view the file at multiple resolutions, and the ability to examine one area of the image more closely using the Region-Of-Interest capability.

• JPEG2000 uses wavelet technology to compress images more efficiently. Currently, JPEG2000 uses one wavelet for lossy compression and another wavelet for lossless compression, but in the future other wavelets may be used as the need arises.

• Many applications, including the Internet, medical imaging, digital photography, ...

• Overall, JPEG 2000 is a huge upgrade over current compression methods and looks set to become the next image compression standard in the near future.

55

EZW algorithm (intro)

• The Embedded Zerotree Wavelet algorithm (EZW) is a simple, yet remarkably effective, image compression algorithm, having the property that the bits in the bit stream are generated in order of importance, giving a fully embedded (progressive) code.

• The compressed data stream can have any desired bit rate. An arbitrary bit rate is only possible if information is lost somewhere, so the compressor is lossy. However, lossless compression is also possible, with less spectacular results.

56

EZW - 2 observations

The EZW encoder is based on two important observations:

1. Natural images in general have a low pass spectrum, so the wavelet coefficients will, on average, be smaller in the higher subbands than in the lower subbands. This shows that progressive encoding is a very natural choice for compressing wavelet transformed images, since the higher subbands only add detail.

2. Large wavelet coefficients are more important than smaller wavelet coefficients.

  631  544   86   10   -7   29   55  -54
  730  655  -13   30  -12   44   41   32
   19   23   37   17   -4  -13  -13   39
   25  -49   32   -4    9  -23  -17  -35
   32  -10   56  -22   -7  -25   40  -10
    6   34  -44    4   13  -12   21   24
  -12   -2   -8  -24  -42    9  -21   45
   13   -3  -16  -15   31  -11  -10  -17

Typical wavelet coefficients for an 8x8 block in a real image

57

Motivation

• Transform coding needs a "significance map" to be sent: at low bit rates a large number of the transform coefficients are quantized to zero (insignificant coefficients). We would like not to send any bits to code these, but the decoder must somehow be told which coefficients are insignificant. JPEG does this using run-length coding.

58

Motivation

• Here is a two-stage wavelet decomposition of an image. Notice the large number of zeros (black):

59

Shapiro's Idea for Solving the Significance Map Problem

• "Zerotree": a quad-tree of which all nodes are equal to or smaller than the root. The tree is coded with a single symbol and reconstructed by the decoder as a quad-tree filled with zeroes. The root has to be smaller than the threshold against which the wavelet coefficients are currently being measured.

• Idea: an insignificant coefficient is VERY likely to have all of its "descendants" on its quad-tree also be insignificant (wavelet coefficients DECREASE with scale). Such a coefficient is called a "Zerotree Root".

60

EZW Algorithm

• First step: The DWT of the entire 2-D image will be computed

• Second step: Progressively EZW encodes the coefficients by decreasing the threshold

• Third step: Arithmetic coding is used to entropy code the symbols

61

EZW Algorithm (second step)

• Sequence of decreasing thresholds: t_0, t_1, ..., t_(n-1), with t_i = t_(i-1)/2 and t_0 = 2^floor(log2(MAX(|γ(x,y)|))), where MAX() is the maximum coefficient value in the image and γ(x,y) denotes the coefficient at position (x,y).

• Maintain two separate lists:
  * Dominant List: coordinates of the coefficients not yet found significant
  * Subordinate List: magnitudes of the coefficients already found to be significant

• For each threshold, perform two passes: a Dominant Pass followed by a Subordinate Pass:

    threshold = initial_threshold;
    do {
        dominant_pass(image);
        subordinate_pass(image);
        threshold = threshold/2;
    } while (threshold > minimum_threshold);
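The threshold schedule can be computed as in the following Python sketch. It assumes integer coefficients and takes the initial threshold to be the largest power of two not exceeding the largest coefficient magnitude. The loop mirrors the do/while structure of the pseudocode on this slide, and the function names are made up.

```python
def initial_threshold(coeffs):
    """Largest power of two <= the largest coefficient magnitude."""
    largest = max(abs(c) for row in coeffs for c in row)
    if largest == 0:
        raise ValueError("all coefficients are zero")
    return 1 << (largest.bit_length() - 1)


def threshold_schedule(coeffs, minimum_threshold):
    """Thresholds visited by the do/while loop, halving each time."""
    t = initial_threshold(coeffs)
    schedule = []
    while True:
        schedule.append(t)        # dominant and subordinate pass run here
        t //= 2
        if t <= minimum_threshold:
            break
    return schedule


top_left = [[631, 544], [730, 655]]   # largest values of the example block
t0 = initial_threshold(top_left)
schedule = threshold_schedule(top_left, minimum_threshold=64)
```

For the example coefficients the largest magnitude is 730, so coding starts at a threshold of 512.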

62

EZW Algorithm - Dominant Pass

• Dominant Pass:
  * All the coefficients are scanned in a special order.
  * If the coefficient is a zerotree root, it is encoded as ZTR. Its descendants do not need to be encoded: they will be reconstructed as zero at this threshold level.
  * If the coefficient is insignificant but one of its descendants is significant, it is encoded as IZ (isolated zero).
  * If the coefficient is significant, it is encoded as POS (positive) or NEG (negative), depending on its sign.

At the end, all the coefficients whose absolute value is larger than the current threshold are extracted and placed, without their sign, on the subordinate list, and their positions in the image are filled with zeroes to prevent them from being coded again.
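A single dominant pass can be sketched in Python as follows. Several simplifying assumptions are made for illustration: the coefficient array is square and fully decomposed (the lowest band is the single value at position (0,0)), a coefficient counts as significant when its magnitude is at least the threshold, subbands are scanned from coarse to fine in raster order, and insignificant coefficients in the finest subbands (which have no descendants) are coded as ZTR. The names are made up.

```python
def children(r, c, n):
    """Quad-tree children of (r, c) in an n x n coefficient array."""
    if (r, c) == (0, 0):
        kids = [(0, 1), (1, 0), (1, 1)]
    else:
        kids = [(2 * r, 2 * c), (2 * r, 2 * c + 1),
                (2 * r + 1, 2 * c), (2 * r + 1, 2 * c + 1)]
    return [(i, j) for i, j in kids if i < n and j < n]


def descendants(r, c, n):
    """All positions below (r, c) in the quad-tree."""
    out = []
    for kid in children(r, c, n):
        out.append(kid)
        out.extend(descendants(kid[0], kid[1], n))
    return out


def dominant_pass(coeffs, threshold):
    """Return [((row, col), symbol)] with symbols POS, NEG, IZ or ZTR."""
    n = len(coeffs)
    # Coarse-to-fine scan: parents are always visited before their children.
    order = sorted(((r, c) for r in range(n) for c in range(n)),
                   key=lambda p: (max(p).bit_length(), p))
    skipped = set()
    symbols = []
    for r, c in order:
        if (r, c) in skipped:
            continue                      # covered by an earlier zerotree
        value = coeffs[r][c]
        below = descendants(r, c, n)
        if abs(value) >= threshold:
            symbols.append(((r, c), "POS" if value > 0 else "NEG"))
        elif all(abs(coeffs[i][j]) < threshold for i, j in below):
            symbols.append(((r, c), "ZTR"))
            skipped.update(below)
        else:
            symbols.append(((r, c), "IZ"))
    return symbols


example = [[57, -37, 3, 2],
           [-29, 30, 1, -4],
           [5, 40, 5, 6],
           [2, -3, 0, 3]]
symbols = dominant_pass(example, threshold=32)
```

In this made-up 4x4 example, the coefficient 30 at (1,1) becomes a zerotree root, so its four children are never coded, while -29 at (1,0) is an isolated zero because its child 40 is significant.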

63

Dominant pass (scheme)

64

Scanning order

The wavelet coefficients are scanned in one of the following two orders. The scan order seems to have some influence on the final compression result.

65

EZW Algorithm - Subordinate Pass

• Subordinate Pass (Refinement Pass):
  * Check whether each value in the Subordinate list is larger or smaller than the current threshold:
    If larger: a 1 is sent to the entropy encoder and the current threshold is subtracted from the coefficient.
    If smaller: a 0 is sent to the entropy encoder.
  * Sort the Subordinate list to place the larger (more important) coefficients at the front (this also helps the entropy encoder).
  * Repeat with the next lower threshold until the total bit budget is exhausted. The encoded stream is an embedded stream.
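The refinement rule can be sketched in Python. The code follows the description on this slide (compare with the threshold, emit a bit, subtract on a 1). Treating a value equal to the threshold as "larger" is an assumption, sorting and entropy coding are left out, and the function name is made up.

```python
def subordinate_pass(magnitudes, threshold):
    """Emit one refinement bit per magnitude and return the residuals."""
    bits = []
    residuals = []
    for m in magnitudes:
        if m >= threshold:
            bits.append(1)
            residuals.append(m - threshold)
        else:
            bits.append(0)
            residuals.append(m)
    return bits, residuals


# Refine two magnitudes through a sequence of halving thresholds.
values = [57, 40]
history = []
threshold = 32
while threshold >= 1:
    bits, values = subordinate_pass(values, threshold)
    history.append(bits)
    threshold //= 2
```

Read column-wise, the emitted bits are the binary expansions of the magnitudes (57 = 111001, 40 = 101000), most significant bit first. Stopping early simply leaves the low-order bits unsent, which is what makes the stream embedded.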

66

EZW - example

67
