
Post on 30-Dec-2015


UNIT IV

Image Compression

Outline

• Goal of Image Compression

• Lossless and lossy compression

• Types of redundancy

• General image compression model

• Transform coding

• JPEG and its different modes

• JBIG standard

• EZW and SPIHT coding

Goal of Image Compression

• The goal of image compression is to reduce the amount of data required to represent a digital image.

Approaches

• Lossless

– Information preserving

– Low compression ratios

• Lossy

– Not information preserving

– High compression ratios

• Trade-off: image quality vs compression ratio

Data ≠ Information

• Data and information are not synonymous terms!

• Data is the means by which information is conveyed.

• Data compression aims to reduce the amount of data required to represent a given quantity of information while preserving as much information as possible.

Data vs Information (cont’d)

• The same amount of information can be represented by various amounts of data, e.g.:

Ex1: Your wife, Helen, will meet you at Logan Airport in Boston at 5 minutes past 6:00 pm tomorrow night

Ex2: Your wife will meet you at Logan Airport at 5 minutes past 6:00 pm tomorrow night

Ex3: Helen will meet you at Logan at 6:00 pm tomorrow night

Definitions: Compression Ratio

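• Compression ratio: $C_R = n_1 / n_2$, where $n_1$ and $n_2$ are the amounts of data (e.g., in bits) before and after compression.

Definitions: Data Redundancy

• Relative data redundancy: $R_D = 1 - 1/C_R$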

Example:
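A minimal sketch of the two definitions, assuming the usual textbook forms C_R = n1/n2 and R_D = 1 - 1/C_R. The function names are illustrative; the byte counts are the ones quoted for the fingerprint image later in these slides.

```python
def compression_ratio(n1, n2):
    """C_R = n1 / n2: original size over compressed size (same units)."""
    return n1 / n2

def relative_redundancy(n1, n2):
    """R_D = 1 - 1/C_R: fraction of the original data that is redundant."""
    return 1.0 - 1.0 / compression_ratio(n1, n2)

original_bytes = 768 * 768   # 8-bit fingerprint image: 589,824 bytes
compressed_bytes = 45853     # JPEG file size quoted in the slides

cr = compression_ratio(original_bytes, compressed_bytes)
rd = relative_redundancy(original_bytes, compressed_bytes)
print(f"C_R = {cr:.1f}:1, R_D = {rd:.3f}")
```

A ratio near 12.9:1 means that roughly 92% of the original data is redundant with respect to the compressed representation.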

Types of Data Redundancy

(1) Coding Redundancy

(2) Interpixel Redundancy

(3) Psychovisual Redundancy

• Compression attempts to reduce one or more of these redundancy types.

Coding Redundancy

• Code: a list of symbols (letters, numbers, bits etc.)

• Code word: a sequence of symbols used to represent a piece of information or an event (e.g., gray levels).

• Code word length: number of symbols in each code word

Coding Redundancy (cont’d)

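• N x M image

• $r_k$: k-th gray level

• $P(r_k)$: probability of $r_k$

• $l(r_k)$: # of bits for $r_k$

• Expected value: $E(X) = \sum_x x\,P(X = x)$

• Average # of bits per pixel: $L_{avg} = E(l(r_k)) = \sum_{k=0}^{L-1} l(r_k)\,P(r_k)$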

Coding Redundancy (cont’d)

• Case 1: l(rk) = constant length

Example:

Coding Redundancy (cont’d)

• Case 2: l(rk) = variable length

variable length
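To make the two cases concrete, the sketch below computes the average code length, L_avg = sum of l(r_k) P(r_k), for a fixed-length and a variable-length code. The four gray-level probabilities and both code tables are hypothetical, chosen only for illustration.

```python
# Hypothetical gray-level probabilities P(r_k) for a 4-level image.
probs = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}

# Case 1: fixed-length code, 2 bits for every gray level.
fixed_code = {0: "00", 1: "01", 2: "10", 3: "11"}
# Case 2: variable-length prefix code, shorter words for likelier levels.
variable_code = {0: "0", 1: "10", 2: "110", 3: "111"}

def average_length(code, probs):
    """L_avg = sum over k of l(r_k) * P(r_k), in bits per pixel."""
    return sum(len(code[k]) * p for k, p in probs.items())

l_fixed = average_length(fixed_code, probs)
l_var = average_length(variable_code, probs)
print(l_fixed, l_var)
```

For an N x M image the total number of bits is N x M x L_avg, so under these assumed probabilities the variable-length code needs 12.5% fewer bits than the fixed-length one.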

Interpixel redundancy

• Interpixel redundancy implies that pixel values are correlated (i.e., a pixel value can be reasonably predicted by its neighbors).

• Correlation: $f(x) \circ g(x) = \int_{-\infty}^{\infty} f(a)\, g(x + a)\, da$

• Autocorrelation: $f(x) = g(x)$

(Figure: examples A and B.)

Interpixel redundancy (cont’d)

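• To reduce interpixel redundancy, the data must be transformed into another format (i.e., using a transformation)

– e.g., thresholding, DFT, DWT, etc.

• Example: thresholding

(Figure: original image, thresholded image, and the intensity profile along line 100 with the threshold marked. The thresholded line can be run-length coded using (1+10) bits/pair.)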

Psychovisual redundancy

• The human eye does not respond with equal sensitivity to all visual information.

• It is more sensitive to the lower frequencies than to the higher frequencies in the visual spectrum.

• Idea: discard data that is perceptually insignificant!

Psychovisual redundancy (cont’d)

• Example: quantization

(Figure: 256 gray levels; 16 gray levels; 16 gray levels with random noise, i.e., add to each pixel a small pseudo-random number prior to quantization.)

• C = 8/4 = 2:1

Measuring Information

• What is the minimum amount of data that is sufficient to describe completely an image without loss of information?

• How do we measure the information content of a message/image?

Modeling Information

• We assume that information generation is a probabilistic process.

• Idea: associate information with probability!

• A random event E with probability P(E) contains $I(E) = \log \frac{1}{P(E)} = -\log P(E)$ units of information.

Note: I(E)=0 when P(E)=1

How much information does a pixel contain?

• Suppose that gray level values are generated by a random process; then $r_k$ contains $I(r_k) = -\log P(r_k)$ units of information!

• Average information content of an image: $E = \sum_{k=0}^{L-1} I(r_k)\, P(r_k)$ units/pixel (e.g., bits/pixel)

How much information does an image contain?

• Entropy: $H = -\sum_{k=0}^{L-1} P(r_k) \log_2 P(r_k)$ units/pixel (assumes statistically independent random events)

Redundancy (revisited)

• Redundancy: $R = L_{avg} - H$, where $L_{avg} = E(l(r_k)) = \sum_{k=0}^{L-1} l(r_k)\, P(r_k)$

Note: if Lavg= H, then R=0 (no redundancy)

Entropy Estimation

• It is not easy to estimate H reliably!

(Figure: sample image.)

Entropy Estimation (cont’d)

• First order estimate of H:

Estimating Entropy (cont’d)

• Second order estimate of H:

– Use relative frequencies of pixel blocks

(Figure: sample image.)

Estimating Entropy (cont’d)

• The first-order estimate provides only a lower bound on the average code length that can be achieved by variable-length coding alone (i.e., without exploiting interpixel redundancy).

• Differences between higher-order estimates of entropy and the first-order estimate indicate the presence of interpixel redundancy!

Need to apply some transformations to deal with interpixel redundancy!

Estimating Entropy (cont’d)

• Example: consider differences:

Estimating Entropy (cont’d)

• Entropy of difference image:

• An even better transformation could be found since:

• Better than before (i.e., H=1.81 for original image)
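The estimates discussed above can be sketched as follows. The 4 x 8 sample image is an assumption (the image on the slide did not survive); it was picked because its first-order estimate reproduces the H = 1.81 quoted above. The helper name `entropy` is illustrative.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Entropy estimate in bits per symbol from relative frequencies."""
    counts = Counter(values)
    n = sum(counts.values())
    return -sum(c / n * log2(c / n) for c in counts.values())

# Assumed 4 x 8 sample image: every row is the same gray-level ramp.
row = [21, 21, 21, 95, 169, 243, 243, 243]
image = [row[:] for _ in range(4)]
pixels = [p for r in image for p in r]

# First-order estimate: relative frequencies of single gray levels.
h_first = entropy(pixels)

# Second-order estimate: relative frequencies of adjacent pixel pairs
# (wrapping around at the end), divided by 2 to get bits per pixel.
pairs = [(pixels[i], pixels[(i + 1) % len(pixels)]) for i in range(len(pixels))]
h_second = entropy(pairs) / 2

# Difference image: keep the first pixel of each row, then store the
# differences between horizontally adjacent pixels.
diff = [d for r in image for d in [r[0]] + [b - a for a, b in zip(r, r[1:])]]
h_diff = entropy(diff)

print(round(h_first, 2), round(h_second, 2), round(h_diff, 2))
```

For this assumed image the difference mapping lowers the first-order estimate, yet it stays above the second-order estimate of the original, which suggests that a better transformation exists.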

Image Compression Model

Image Compression Model (cont’d)

• Mapper: transforms input data in a way that facilitates reduction of interpixel redundancies.

Image Compression Model (cont’d)

• Quantizer: reduces the accuracy of the mapper’s output in accordance with some pre-established fidelity criteria.

Image Compression Model (cont’d)

• Symbol encoder: assigns the shortest code to the most frequently occurring output values.

Image Compression Models (cont’d)

• Inverse steps are performed.

• Note that quantization is irreversible in general.

Fidelity Criteria

• How close is $f(x,y)$ to $\hat{f}(x,y)$, i.e., the original image to its reconstruction?

• Criteria

– Subjective: based on human observers

– Objective: mathematically defined criteria

Subjective Fidelity Criteria

Objective Fidelity Criteria

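• Root mean square error (RMS): $e_{rms} = \sqrt{\frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[\hat{f}(x,y) - f(x,y)\right]^2}$

• Mean-square signal-to-noise ratio (SNR): $SNR_{ms} = \frac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x,y)^2}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[\hat{f}(x,y) - f(x,y)\right]^2}$

(Figure: three reconstructions with RMSE = 5.17, RMSE = 15.67 and RMSE = 14.17.)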
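Both objective criteria are easy to compute. The sketch below assumes the usual definitions: the RMS error is the square root of the mean squared difference between the reconstruction and the original, and the mean-square SNR is the energy of the reconstruction divided by the energy of the error. The 2 x 2 images are hypothetical.

```python
from math import sqrt

def rms_error(f, f_hat):
    """Root-mean-square error between original f and reconstruction f_hat."""
    m, n = len(f), len(f[0])
    sq = sum((f_hat[x][y] - f[x][y]) ** 2 for x in range(m) for y in range(n))
    return sqrt(sq / (m * n))

def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the reconstruction f_hat."""
    m, n = len(f), len(f[0])
    signal = sum(f_hat[x][y] ** 2 for x in range(m) for y in range(n))
    noise = sum((f_hat[x][y] - f[x][y]) ** 2 for x in range(m) for y in range(n))
    return signal / noise  # raises ZeroDivisionError for a perfect reconstruction

f = [[10, 20], [30, 40]]       # hypothetical 2 x 2 original
f_hat = [[12, 18], [30, 44]]   # hypothetical reconstruction

print(rms_error(f, f_hat), snr_ms(f, f_hat))
```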

Objective Fidelity Criteria (cont’d)

Lossless Compression

Lossless Methods - Taxonomy

Huffman Coding (coding redundancy)

• A variable-length coding technique.

• Symbols are encoded one at a time!

– There is a one-to-one correspondence between source symbols and code words

• Optimal code (i.e., minimizes the average code word length per source symbol, given that symbols are encoded one at a time).

Huffman Coding (cont’d)

• Forward Pass

1. Sort probabilities per symbol

2. Combine the lowest two probabilities

3. Repeat Step 2 until only two probabilities remain.

Huffman Coding (cont’d)

• Backward Pass

Assign code symbols going backwards

Huffman Coding (cont’d)

• Lavg assuming Huffman coding:

• Lavg assuming binary codes:

Huffman Coding/Decoding

• Coding/decoding can be implemented using a look-up table.

• Decoding can be done unambiguously.
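The forward and backward passes can be sketched with a priority queue. This is one possible implementation, not necessarily the one behind the slides; the six source probabilities are an assumed example, and the exact code words depend on how ties are broken (the average length does not).

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a binary Huffman code; returns {symbol: bit string}."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Forward pass: combine the two lowest probabilities.
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        # Backward pass folded in: prepend one bit to every word in each group.
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

# Assumed source probabilities (illustrative only).
probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
code = huffman_code(probs)
l_avg = sum(len(code[s]) * p for s, p in probs.items())
print(code, l_avg)
```

With six symbols a fixed-length binary code needs 3 bits/symbol, while the Huffman code for these probabilities averages 2.2 bits/symbol. Because no code word is a prefix of another, decoding is unambiguous.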

Arithmetic (or Range) Coding (coding redundancy)

• Instead of encoding source symbols one at a time, sequences of source symbols are encoded together.

– There is no one-to-one correspondence between source symbols and code words.

• Slower than Huffman coding but typically achieves better compression.

Arithmetic Coding (cont’d)

• A sequence of source symbols is assigned to a sub-interval in [0,1) which corresponds to an arithmetic code, e.g., α1 α2 α3 α3 α4 → [0.06752, 0.0688) → arithmetic code 0.068

• We start with the interval [0, 1); as the number of symbols in the message increases, the interval used to represent the message becomes smaller.

Arithmetic Coding (cont’d)

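Encode message: α1 α2 α3 α3 α4

1) Start with interval [0, 1)

2) Subdivide [0, 1) based on the probabilities of αi: α1 → [0, 0.2), α2 → [0.2, 0.4), α3 → [0.4, 0.8), α4 → [0.8, 1.0)

3) Update interval by processing source symbols

Example

Encode α1 α2 α3 α3 α4: [0, 1) → [0, 0.2) → [0.04, 0.08) → [0.056, 0.072) → [0.0624, 0.0688) → [0.06752, 0.0688)

Result: [0.06752, 0.0688) or 0.068 (must be inside interval)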

Example (cont’d)

• The message is encoded using 3 decimal digits or 3/5 = 0.6 decimal digits per source symbol.

• The entropy of this message is: -(3 x 0.2log10(0.2)+0.4log10(0.4))=0.5786 digits/symbol

Note: finite precision arithmetic might cause problems due to truncations!

Arithmetic Decoding

• Decode 0.572 (figure: the current interval is subdivided into α1, α2, α3, α4 at every step):

[0, 1) → [0.4, 0.8) → [0.56, 0.72) → [0.56, 0.592) → [0.5664, 0.5728) → [0.57152, 0.5728)

• Decoded message: α3 α3 α1 α2 α4
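The interval subdivision used in both examples can be sketched as below. Exact fractions are used to sidestep the finite-precision issue mentioned in the note; a practical coder would use scaled integer arithmetic instead. The function names are illustrative, and the decoder assumes the message length is known.

```python
from fractions import Fraction

# Source model from the example: P(a1) = P(a2) = P(a4) = 0.2, P(a3) = 0.4.
PROBS = [("a1", Fraction(2, 10)), ("a2", Fraction(2, 10)),
         ("a3", Fraction(4, 10)), ("a4", Fraction(2, 10))]

def sub_intervals(low, high):
    """Split [low, high) in proportion to the symbol probabilities."""
    width, start, out = high - low, low, {}
    for sym, p in PROBS:
        out[sym] = (start, start + p * width)
        start += p * width
    return out

def encode(message):
    low, high = Fraction(0), Fraction(1)
    for sym in message:
        low, high = sub_intervals(low, high)[sym]
    return low, high  # any number in [low, high) encodes the message

def decode(value, n_symbols):
    low, high, message = Fraction(0), Fraction(1), []
    for _ in range(n_symbols):
        for sym, (lo, hi) in sub_intervals(low, high).items():
            if lo <= value < hi:
                message.append(sym)
                low, high = lo, hi
                break
    return message

low, high = encode(["a1", "a2", "a3", "a3", "a4"])
print(float(low), float(high))
print(decode(Fraction("0.572"), 5))
```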

LZW Coding (interpixel redundancy)

• Requires no a priori knowledge of the pixel probability distribution.

• Assigns fixed length code words to variable length symbol sequences.

• Included in GIF, TIFF and PDF file formats

LZW Coding

• A codebook (or dictionary) needs to be constructed.

• Initially, the first 256 entries of the dictionary are assigned to the gray levels 0,1,2,..,255 (i.e., assuming 8 bits/pixel)

Consider a 4x4, 8 bit image:

39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126

Initial Dictionary

Dictionary Location    Entry
0                      0
1                      1
.                      .
255                    255
256                    -
.                      .
511                    -

LZW Coding (cont’d)

- Is 39 in the dictionary? Yes

- What about 39-39? No

* Add 39-39 at location 256

Dictionary Location    Entry
0                      0
1                      1
.                      .
255                    255
256                    39-39
.                      .
511                    -

As the encoder examines image pixels, gray level sequences (i.e., blocks) that are not in the dictionary are assigned to a new entry.

Example

39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126

CR = empty (CR: currently recognized sequence, P: pixel being processed, D: dictionary)

For each pixel P:

Concatenated Sequence: CS = CR + P

If CS is found: (1) No Output (2) CR=CS

else: (1) Output D(CR) (2) Add CS to D (3) CR=P

Decoding LZW

• Use the dictionary for decoding the “encoded output” sequence.

• The dictionary need not be sent with the encoded output.

• Can be built on the “fly” by the decoder as it reads the received code words.
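The CR/P/CS procedure and its decoder can be sketched as follows, run on the 4x4 example image. This is a minimal sketch: it ignores the 512-entry limit of the dictionary shown in the slides, and the function names are illustrative.

```python
def lzw_encode(pixels):
    d = {(i,): i for i in range(256)}   # entries 0..255: single gray levels
    next_code, cr, out = 256, (), []
    for p in pixels:
        cs = cr + (p,)                  # CS = CR + P
        if cs in d:
            cr = cs                     # found: no output, CR = CS
        else:
            out.append(d[cr])           # output D(CR)
            d[cs] = next_code           # add CS to D
            next_code += 1
            cr = (p,)                   # CR = P
    if cr:
        out.append(d[cr])               # flush the last recognized sequence
    return out

def lzw_decode(codes):
    d = {i: (i,) for i in range(256)}
    next_code = 256
    prev = d[codes[0]]
    out = list(prev)
    for code in codes[1:]:
        if code in d:
            entry = d[code]
        elif code == next_code:         # code refers to the entry being built
            entry = prev + (prev[0],)
        else:
            raise ValueError("invalid LZW code")
        out.extend(entry)
        d[next_code] = prev + (entry[0],)  # rebuild the encoder's dictionary
        next_code += 1
        prev = entry
    return out

image = [39, 39, 126, 126] * 4          # the 4x4 image, row by row
codes = lzw_encode(image)
print(codes)
```

The decoder never receives the dictionary: each new entry is the previously decoded sequence plus the first pixel of the current one, which is exactly what the encoder added at the same step.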

Run-length coding (RLC) (interpixel redundancy)

• Used to reduce the size of a repeating string of symbols (i.e., runs):

1 1 1 1 1 0 0 0 0 0 0 1 → (1,5) (0,6) (1,1)

a a a b b b b b b c c → (a,3) (b,6) (c,2)

• Encodes a run of symbols into two bytes: (symbol, count)

• Can compress any type of data but cannot achieve high compression ratios compared to other compression methods.
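A minimal sketch of run-length coding as (symbol, count) pairs, checked against the two examples above. It uses `itertools.groupby` from the standard library; the function names are illustrative.

```python
from itertools import groupby

def rle_encode(symbols):
    """Collapse each run of equal symbols into a (symbol, count) pair."""
    return [(s, len(list(run))) for s, run in groupby(symbols)]

def rle_decode(pairs):
    """Expand (symbol, count) pairs back into the original sequence."""
    return [s for s, n in pairs for _ in range(n)]

bits = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
text = list("aaabbbbbbcc")
print(rle_encode(bits))
print(rle_encode(text))
```

Note that data without runs grows under this scheme, since every single symbol still costs a full (symbol, count) pair.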

Combining Huffman Coding with Run-length Coding

• Assuming that a message has been encoded using Huffman coding, additional compression can be achieved using run-length coding.

e.g., (0,1)(1,1)(0,1)(1,0)(0,2)(1,4)(0,2)

Bit-plane coding (interpixel redundancy)

• Process each bit plane individually.

(1) Decompose an image into a series of binary images.

(2) Compress each binary image (e.g., using run-length coding)
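Step (1) can be sketched with shifts and masks. The example reuses the gray levels 39 and 126 from the LZW example; the function names are illustrative.

```python
def bit_planes(image, bits=8):
    """Return `bits` binary images; plane k holds bit k of every pixel."""
    return [[[(p >> k) & 1 for p in row] for row in image] for k in range(bits)]

def from_bit_planes(planes):
    """Reassemble the gray-level image from its bit planes."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[sum(planes[k][r][c] << k for k in range(len(planes)))
             for c in range(cols)] for r in range(rows)]

image = [[39, 39, 126, 126] for _ in range(4)]   # 39 = 00100111, 126 = 01111110
planes = bit_planes(image)
print(planes[0][0], planes[7][0])
```

Each plane is a binary image, so it can be handed to a binary coder such as run-length coding for step (2).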

Lossy Methods - Taxonomy

Lossy Compression

• Transform the image into a domain where compression can be performed more efficiently (i.e., reduce interpixel redundancies).

• The N x N image is divided into ~ $(N/n)^2$ subimages of size n x n, and each subimage is transformed separately.

Example: Fourier Transform

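The magnitude of the FT decreases as u, v increase, so the image can be approximated from only the K x K lowest-frequency coefficients (u, v = 0, ..., K-1), with K << N.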

Transform Selection

• T(u,v) can be computed using various transformations, for example:

– DFT

– DCT (Discrete Cosine Transform)

– KLT (Karhunen-Loeve Transformation)

DCT (Discrete Cosine Transform)

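Forward: $T(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right]$

Inverse: $f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \alpha(u)\,\alpha(v)\, T(u,v) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right]$

where $\alpha(u) = \sqrt{1/N}$ if u=0 and $\alpha(u) = \sqrt{2/N}$ if u>0; likewise $\alpha(v) = \sqrt{1/N}$ if v=0 and $\alpha(v) = \sqrt{2/N}$ if v>0.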
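A direct, unoptimized sketch of the forward and inverse 2-D DCT of an N x N block, assuming the standard orthonormal definition with scale factors sqrt(1/N) for index 0 and sqrt(2/N) otherwise. Real codecs use fast factorizations; the function names are illustrative.

```python
from math import cos, pi, sqrt

def alpha(u, n):
    """Scale factor: sqrt(1/N) for u = 0, sqrt(2/N) for u > 0."""
    return sqrt(1 / n) if u == 0 else sqrt(2 / n)

def dct2(f):
    """Forward 2-D DCT of a square block f[x][y]."""
    n = len(f)
    return [[alpha(u, n) * alpha(v, n) * sum(
                 f[x][y] * cos((2 * x + 1) * u * pi / (2 * n))
                         * cos((2 * y + 1) * v * pi / (2 * n))
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

def idct2(t):
    """Inverse 2-D DCT of a square coefficient block t[u][v]."""
    n = len(t)
    return [[sum(alpha(u, n) * alpha(v, n) * t[u][v]
                 * cos((2 * x + 1) * u * pi / (2 * n))
                 * cos((2 * y + 1) * v * pi / (2 * n))
                 for u in range(n) for v in range(n))
             for y in range(n)] for x in range(n)]

flat = [[100] * 4 for _ in range(4)]            # homogeneous 4 x 4 block
block = [[39, 39, 126, 126] for _ in range(4)]  # block with a vertical edge
t_flat = dct2(flat)
restored = idct2(dct2(block))
print(round(t_flat[0][0], 6))
```

For a homogeneous block all the energy lands in the (0,0) coefficient, which is why smooth regions compress so well after the transform.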

DCT (cont’d)

• Set of basis functions for a 4x4 image (i.e., cosines of different frequencies).

DCT (cont’d)

(Figure: reconstructions using the DFT, WHT and DCT with 8 x 8 subimages, 64 coefficients per subimage, 50% of the coefficients truncated. RMS error: DFT 2.32, WHT 1.78, DCT 1.13.)

DCT (cont’d)

• DCT minimizes "blocking artifacts" (i.e., boundaries between subimages do not become very visible).

– DFT: i.e., n-point periodicity gives rise to discontinuities!

– DCT: i.e., 2n-point periodicity prevents discontinuities!

DCT (cont’d)

• Subimage size selection:

(Figure: original; reconstructions using 2 x 2, 4 x 4 and 8 x 8 subimages.)

JPEG Compression

• Accepted as an international image compression standard in 1992.

• It uses DCT for handling interpixel redundancy.

• Modes of operation:

(1) Sequential DCT-based encoding

(2) Progressive DCT-based encoding

(3) Lossless encoding

(4) Hierarchical encoding

JPEG Compression (Sequential DCT-based encoding)

(Figure: block diagram of the encoder, ending in an entropy encoder, and of the decoder, starting with an entropy decoder.)

JPEG Steps

1. Divide the image into 8x8 subimages;

For each subimage do:

2. Shift the gray-levels to the range [-128, 127] (the DCT requires the range to be centered around 0)

3. Apply DCT → 64 coefficients

– 1 DC coefficient: F(0,0)

– 63 AC coefficients: F(u,v)

Example

(Figure: an 8 x 8 subimage shifted to [-128, 127] and its DCT, shown as a non-centered spectrum.)

JPEG Steps

4. Quantize the coefficients (i.e., reduce the amplitude of coefficients that do not contribute a lot): $F_q(u,v) = \mathrm{round}\left(F(u,v) / Q(u,v)\right)$

Q(u,v): quantization table

Example

• Quantization Table Q[i][j]

Example (cont’d)

Quantization

JPEG Steps (cont’d)

5. Order the coefficients using zig-zag ordering

- Places non-zero coefficients first

- Creates long runs of zeros (i.e., ideal for run-length encoding)
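Steps 4 and 5 can be sketched as below, assuming quantization means dividing each coefficient by its table entry and rounding. The table is the example luminance table from Annex K of the JPEG standard; the coefficient block F is hypothetical and the function names are illustrative.

```python
# Example luminance quantization table (JPEG standard, Annex K).
Q = [[16, 11, 10, 16, 24, 40, 51, 61],
     [12, 12, 14, 19, 26, 58, 60, 55],
     [14, 13, 16, 24, 40, 57, 69, 56],
     [14, 17, 22, 29, 51, 87, 80, 62],
     [18, 22, 37, 56, 68, 109, 103, 77],
     [24, 35, 55, 64, 81, 104, 113, 92],
     [49, 64, 78, 87, 103, 121, 120, 101],
     [72, 92, 95, 98, 112, 100, 103, 99]]

def quantize(F, Q):
    """Step 4: divide each coefficient by its table entry and round."""
    return [[round(F[u][v] / Q[u][v]) for v in range(8)] for u in range(8)]

def zigzag_order(n=8):
    """Step 5: visit anti-diagonals in turn, alternating direction."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))

def zigzag(block):
    return [block[i][j] for i, j in zigzag_order(len(block))]

# Hypothetical DCT coefficients: energy concentrated at low frequencies.
F = [[0.0] * 8 for _ in range(8)]
F[0][0], F[0][1], F[1][0], F[1][1] = -415.0, -30.0, 4.0, -22.0

zz = zigzag(quantize(F, Q))
print(zz[:8])
```

The small coefficient at (1,0) is quantized to zero, and everything after the fifth zig-zag position is one long run of zeros.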

Example

JPEG Steps (cont’d)

6. Encode coefficients:

6.1 Form “intermediate” symbol sequence

6.2 Encode “intermediate” symbol sequences into a binary sequence.

Intermediate Symbol Sequence

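DC coefficient: symbol_1 (SIZE), symbol_2 (AMPLITUDE)

Example: DC (6) (61)

SIZE: # bits for encoding the amplitude of coefficient

DC Coefficients Encoding

• Predictive coding: the difference between the DC coefficient of the current block and that of the previous block is encoded as symbol_1 (SIZE), symbol_2 (AMPLITUDE).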

Intermediate Symbol Sequence (cont’d)

AC coefficients: symbol_1 (RUN-LENGTH, SIZE), symbol_2 (AMPLITUDE), terminated by an end of block symbol

SIZE: # bits for encoding the amplitude of coefficient

RUN-LENGTH: run of zeros preceding coefficient

If RUN-LENGTH > 15, use symbol (15,0), which stands for a run of 16 zeros

Example: AC (0, 2) (-3)

Example: AC Coefficients Encoding

• Symbol_1: Variable Length Code (VLC)

• Symbol_2: Variable Length Integer (VLI)

• (1,4) (12) → (111110110 1100): the first group of bits is the VLC, the second the VLI
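The intermediate symbols can be sketched as follows. SIZE is taken to be the bit length of the amplitude's magnitude, which agrees with the examples above (61 → 6, -3 → 2, 12 → 4). The function names and the use of None as amplitude for marker symbols are illustrative; the final VLC/VLI bit strings are not produced here.

```python
def size(amplitude):
    """SIZE: number of bits needed for the amplitude (0 for amplitude 0)."""
    return abs(amplitude).bit_length()

EOB = (0, 0)    # end of block
ZRL = (15, 0)   # stands for a run of 16 zeros

def dc_symbol(dc, previous_dc):
    """Predictive coding: encode the difference to the previous block's DC."""
    diff = dc - previous_dc
    return (size(diff), diff)

def ac_symbols(ac):
    """Turn the 63 zig-zag ordered AC coefficients into a list of
    ((RUN-LENGTH, SIZE), AMPLITUDE) pairs; marker symbols carry None."""
    out, run = [], 0
    for a in ac:
        if a == 0:
            run += 1
            continue
        while run > 15:             # long run: emit (15,0) per 16 zeros
            out.append((ZRL, None))
            run -= 16
        out.append(((run, size(a)), a))
        run = 0
    if run:                         # trailing zeros collapse into end of block
        out.append((EOB, None))
    return out

ac = [-3, 0, 0, -2] + [0] * 59      # hypothetical AC coefficients
print(dc_symbol(61, 0), ac_symbols(ac))
```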

Final Symbol Sequence

Effect of “Quality”

(Figure: the same image at three quality settings, from lower compression to higher compression: 58k bytes, 21k bytes, 8k bytes.)

Effect of “Quality” (cont’d)

Effect of Quantization: homogeneous 8 x 8 block

(Figures: original block; quantized and de-quantized coefficients; reconstructed block and error.)

Effect of Quantization: non-homogeneous 8 x 8 block

(Figures: original block; quantized and de-quantized coefficients; reconstructed block and error.)

WSQ Fingerprint Compression

• An image coding standard for digitized fingerprints employing the discrete wavelet transform (Wavelet/Scalar Quantization or WSQ).

• Developed and maintained by:

– FBI

– Los Alamos National Lab (LANL)

– National Institute of Standards and Technology (NIST)

Memory Requirements

• FBI is digitizing fingerprints at 500 dots per inch with 8 bits of grayscale resolution.

• A single fingerprint card turns into about 10 MB of data!

A sample fingerprint image

768 x 768 pixels =589,824 bytes

Preserving Fingerprint Details

The "white" spots in the middle of the black ridges are sweat pores

They’re admissible points of identification in court, as are the little black flesh ‘‘islands’’ in the grooves between the ridges

These details are just a couple pixels wide!

What compression scheme should be used?

• Better use a lossless method to preserve every pixel perfectly.

• Unfortunately, in practice lossless methods haven’t done better than 2:1 on fingerprints!

• Does JPEG work well for fingerprint compression?

Results using JPEG compression: file size 45853 bytes, compression ratio: 12.9

The fine details are pretty much history, and the whole image has this artificial ‘‘blocky’’ pattern superimposed on it.

The blocking artifacts affect the performance of manual or automated systems!

Results using WSQ compression: file size 45621 bytes, compression ratio: 12.9

The fine details are preserved better than they are with JPEG.

NO blocking artifacts!

WSQ Algorithm

Varying compression ratio

• FBI’s target bit rate is around 0.75 bits per pixel (bpp)

– i.e., corresponds to a target compression ratio of 10.7 (assuming 8-bit images)

• This target bit rate is set via a ‘‘knob’’ on the WSQ algorithm.

– i.e., similar to the "quality" parameter in many JPEG implementations.

Varying compression ratio (cont’d)

Original image 768 x 768 pixels (589824 bytes)

Varying compression ratio (cont’d): 0.9 bpp compression

WSQ image, file size 47619 bytes, compression ratio 12.4

JPEG image, file size 49658 bytes, compression ratio 11.9

Varying compression ratio (cont’d): 0.75 bpp compression

WSQ image, file size 39270 bytes, compression ratio 15.0

JPEG image, file size 40780 bytes, compression ratio 14.5

Varying compression ratio (cont’d): 0.6 bpp compression

WSQ image, file size 30987 bytes, compression ratio 19.0

JPEG image, file size 30081 bytes, compression ratio 19.6

JPEG Modes

• JPEG supports several different modes

– Sequential Mode

– Progressive Mode

– Hierarchical Mode

– Lossless Mode

• Sequential is the default mode

– Each image component is encoded in a single left-to-right, top-to-bottom scan.

– This is the mode we have just described!

Progressive JPEG

• The image is encoded in multiple scans, in order to produce a quick, rough decoded image when transmission time is long.

Sequential

Progressive

Progressive JPEG (cont’d)

• Each scan encodes a subset of DCT coefficients.

(1) Progressive spectral selection algorithm

(2) Progressive successive approximation algorithm

(3) Combined progressive algorithm

Progressive JPEG (cont’d)

(1) Progressive spectral selection algorithm

– Group DCT coefficients into several spectral bands

– Send low-frequency DCT coefficients first

– Send higher-frequency DCT coefficients next

Example

Progressive JPEG (cont’d)

(2) Progressive successive approximation algorithm

– Send all DCT coefficients but with lower precision.

– Refine DCT coefficients in later scans.

Example

(Figure: progressive decoding of an example image after 0.9s, 1.6s, 3.6s and 7.0s.)

Progressive JPEG (cont’d)

(3) Combined progressive algorithm

– Combines spectral selection and successive approximation.

Hierarchical JPEG

• Hierarchical mode encodes the image at different resolutions.

• Image is transmitted in multiple passes with increased resolution at each pass.

Hierarchical JPEG (cont’d)

(Figure: resolution pyramid with levels N x N, N/2 x N/2 and N/4 x N/4.)

MPEG VIDEO COMPRESSION

Embedded Zero Tree Wavelet Coding

Compare the two matrices:

576 704 1152 1280 1344 1472 1536 1536

704 640 1156 1088 1344 1408 1536 1600

768 832 1216 1472 1472 1536 1600 1600

832 832 960 1344 1536 1536 1600 1536

832 832 960 1216 1536 1600 1536 1536

960 896 896 1088 1600 1600 1600 1536

768 768 832 832 1280 1472 1600 1600

448 768 704 640 1280 1408 1600 1600

1212 -306 -146 -54 -24 -68 -40 4

30 36 90 2 8 -20 8 -4

-50 -10 -20 -24 0 72 -16 -16

82 38 -24 68 48 -64 32 8

8 8 -32 16 -48 -48 -16 16

20 20 -56 -16 -16 32 -16 -16

-8 8 -48 0 -16 -16 -16 -16

44 36 0 8 80 -16 -16 0

A Multi-resolution Analysis Example

Wavelet Transform

Discrete Wavelet Transform

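– Sub bands arise from separable application of filters

(Figure: the first stage splits the image into the sub bands LL1, LH1, HL1 and HH1; the second stage keeps LH1, HL1 and HH1 and splits LL1 further into LL2, LH2, HL2 and HH2.)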

Embedded Zero tree Wavelet algorithm (EZW)

• A simple, yet remarkably effective, image compression algorithm, having the property that the bits in the bit stream are generated in order of importance, giving a fully embedded (progressive) code.

• The compressed data stream can have any bit rate desired. An arbitrary bit rate is only possible if information is discarded somewhere, so in that case the compressor is lossy. Lossless compression is also possible, but with less spectacular results.


EZW - observations

1. Natural images in general have a low pass spectrum, so the wavelet coefficients will, on average, be smaller in the higher subbands than in the lower subbands. This shows that progressive encoding is a very natural choice for compressing wavelet transformed images, since the higher subbands only add detail.

2. Large wavelet coefficients are more important than smaller wavelet coefficients.

631 544 86 10 -7 29 55 -54
730 655 -13 30 -12 44 41 32
19 23 37 17 -4 -13 -13 39
25 -49 32 -4 9 -23 -17 -35
32 -10 56 -22 -7 -25 40 -10
6 34 -44 4 13 -12 21 24
-12 -2 -8 -24 -42 9 -21 45
13 -3 -16 -15 31 -11 -10 -17

Typical wavelet coefficients for an 8 x 8 block in a real image
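Both observations can be checked on the coefficient block above. The sketch assumes the usual EZW convention that the initial threshold is the largest power of two not exceeding the largest coefficient magnitude; the slides' own description of the passes did not survive, so treat this as an illustration only. Variable names are illustrative.

```python
from math import floor, log2

coeffs = [
    [631, 544, 86, 10, -7, 29, 55, -54],
    [730, 655, -13, 30, -12, 44, 41, 32],
    [19, 23, 37, 17, -4, -13, -13, 39],
    [25, -49, 32, -4, 9, -23, -17, -35],
    [32, -10, 56, -22, -7, -25, 40, -10],
    [6, 34, -44, 4, 13, -12, 21, 24],
    [-12, -2, -8, -24, -42, 9, -21, 45],
    [13, -3, -16, -15, 31, -11, -10, -17],
]

# Assumed convention: initial threshold T0 = 2^floor(log2(max |c|)).
c_max = max(abs(c) for row in coeffs for c in row)
t0 = 2 ** floor(log2(c_max))

# Coefficients that are significant with respect to T0.
significant = [(r, c) for r in range(8) for c in range(8)
               if abs(coeffs[r][c]) >= t0]
print(c_max, t0, significant)
```

Only the four coefficients in the top-left corner (the lowest subband) are significant at the first threshold; everything in the higher subbands is small by comparison.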

Zero Tree Coding

Parent – Child relationship

Coefficients that are in the same spatial location across sub bands form a quad-tree.

EZW Algorithm

EZW Algorithm contd..

Scanning order of sub bands

EZW - example

EZW – Example contd..


Set Partitioning in Hierarchical Trees (SPIHT) Algorithm

SPIHT

Bi-Level Image Compression


Overview

• Introduction to Bi-Level Image Compression

• Existing Facsimile Standards:

– G3 (MR)

– G4 (MMR)

– JBIG[1]

• New Bi-Level Standards: JBIG2


Definition: Bi-Level

• Multi-Level (Gray Scale)

• Bi-Level (Black & White)


Properties of Bi-Level Images

• Mostly High Frequency

• Often Very High Resolutions:

– Computer Monitor: 96dpi

– Fax Machine: 200dpi (1 page fax, 8.5” x 11” x 200dpi, ~= .5 Meg)

– Laser Printer: 600dpi (1 page = 4.2 Megs)

– High-End Printing Press: 1600dpi (30 Megs!)

• Will often contain text, halftoned images and line-art (graphs, equations, logos, etc.)


Existing Fax Standards

• T.4 (Group 3)

– MH - Modified Huffman (and RLE)

– MR - Modified Read: uses information from previous line; uses MH mode every k lines for error correction

• T.6 (Group 4)

– MMR - Modified Modified Read: uses information from previous line; assumes error-free environment


Existing Fax Standards

• JBIG[1] (T.82 -- March, 1993)

• Joint Bi-Level Image Experts Group Committee with Academic & Industrial members:

• ISO (International organization of National Bodies)

• ITU-T (Regulatory body of the United Nations)

• Arithmetic Coding (QM Coder)

• Context-based prediction

• Progressive Compression (Display)


Existing Fax Standards

• Arithmetic Q Coder

– Numerous variations: Q, QM, MQ

• Used by JBIG[1], JPEG, JBIG2 & JPEG 2000 (J2K)

• Different probability tables, byte markers, etc.

• Adaptive Coder

• 16-bit Precision (32-bit C register)

• Uses numerous approximations:

– Fixed Probability Table

– No Multiplication


New Standards

• JBIG2 (T.88 -- February 2000)

• First “lossy” bi-level standard

• Supports three basic coding modes:

– Generic (MMR or JBIG[1]-like arithmetic)

– Halftone

– Text

• Image can be segmented into regions

– Each region can be coded with a different method


JBIG2 - Generic Coding

• The core coding method of JBIG2 has not changed that much from previous methods

• There are two methods available in generic coding:

– MMR (Group 4)

– MQ Arithmetic Coding (similar to JBIG[1]); larger contexts are available

(Figure: context template around the pixel being coded, marked "?", with adaptive pixels marked "A".)


JBIG2 - Halftone Coding

• A halftone is coded as a multi-level image, along with a pattern and grid parameters

• The decoder constructs the halftone from the multi-level image and the pattern

• The multi-level image is coded as bi-level bit-planes, with the generic coder


JBIG2 - Text Coding

• Each symbol is encoded in a dictionary with generic coding.

• And then, the image is constructed by adding images from the dictionary.

• The symbol ID and the (relative) co-ordinates (x, y) are coded.


JBIG2 - Text Coding

• In actual documents, many symbols are very similar, often due to scanning or spatial quantization errors

• Lossy Coding: Hard Pattern Matching

• Lossless Coding: Soft Pattern Matching


JBIG2 - Soft Pattern Matching

• Soft Pattern Matching (refinement coding) is when a symbol is coded using a similar, previously coded symbol to provide additional context information.

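(Figure: an already coded symbol next to the symbol to be coded; pixel "x" of the already coded symbol provides context for pixel "?" of the new one.)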
