Image Compression Fundamentals


CS804B, M3_1, Lecture Notes


Resmi N.G.
Reference: Digital Image Processing, 2nd Edition, Rafael C. Gonzalez and Richard E. Woods

Overview

Introduction

Fundamentals
  Coding Redundancy
  Interpixel Redundancy
  Psychovisual Redundancy
  Fidelity Criteria

Image Compression Models
  Source Encoder and Decoder
  Channel Encoder and Decoder

Elements of Information Theory
  Measuring Information
  The Information Channel
  Fundamental Coding Theorems
    Noiseless Coding Theorem
    Noisy Coding Theorem
    Source Coding Theorem

Error-Free Compression
  Variable-Length Coding
    Huffman Coding
    Other Near Optimal Variable-Length Codes
    Arithmetic Coding
  LZW Coding
  Bit-Plane Coding
    Bit-Plane Decomposition
    Constant Area Coding
    One-Dimensional Run-Length Coding
    Two-Dimensional Run-Length Coding
  Lossless Predictive Coding

Lossy Compression
  Lossy Predictive Coding
  Transform Coding
    Transform Selection
    Subimage Size Selection
    Bit Allocation
    Zonal Coding Implementation
    Threshold Coding Implementation
  Wavelet Coding
    Wavelet Selection
    Decomposition Level Selection
    Quantizer Design

Image Compression Standards
  Binary Image Compression Standards
    One-Dimensional Compression
    Two-Dimensional Compression
  Continuous Tone Still Image Compression Standards
    JPEG
      Lossy Baseline Coding System
      Extended Coding System
      Lossless Independent Coding System
    JPEG 2000
  Video Compression Standards

Introduction

Need for compression
  Huge amounts of digital data are difficult to store and transmit.

Solution
  Reduce the amount of data required to represent a digital image: remove redundant data, and transform the data prior to storage and transmission.

Categories
  Information-preserving compression
  Lossy compression

Fundamentals

Data compression
Difference between data and information

Data Redundancy

If $n_1$ and $n_2$ denote the number of information-carrying units in two datasets that represent the same information, the relative data redundancy $R_D$ of the first dataset is defined as

$$R_D = 1 - \frac{1}{C_R}$$

where $C_R = \dfrac{n_1}{n_2}$ is called the compression ratio.

Case 1: $n_2 = n_1$, so $C_R = 1$ and $R_D = 0$: no redundant data.

Case 2: $n_2 \ll n_1$, so $C_R \to \infty$ and $R_D \to 1$: highly redundant data, significant compression.

Case 3: $n_2 \gg n_1$, so $C_R \to 0$ and $R_D \to -\infty$: the second dataset contains more data than the original.
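As a quick numerical illustration (a hypothetical example, not taken from the notes), the ratio and redundancy can be computed directly from the two dataset sizes:

```python
def compression_stats(n1, n2):
    """Return compression ratio C_R and relative redundancy R_D
    for an original size n1 and a compressed size n2."""
    c_r = n1 / n2
    r_d = 1 - 1 / c_r
    return c_r, r_d

# Hypothetical example: an 8-bit 512x512 image stored in 65,536 bytes.
c_r, r_d = compression_stats(512 * 512, 65_536)
print(c_r, r_d)  # 4.0, 0.75 -> 75% of the original data is redundant
```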


Coding Redundancy

Let a discrete random variable $r_k$ in $[0,1]$ represent the gray levels of an image, and let $p_r(r_k)$ denote the probability of occurrence of $r_k$:

$$p_r(r_k) = \frac{n_k}{n}, \qquad k = 0, 1, 2, \ldots, L-1$$

If the number of bits used to represent each value of $r_k$ is $l(r_k)$, then the average number of bits required to represent each pixel is

$$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)$$

Hence, the total number of bits required to code an $M \times N$ image is $M N L_{avg}$. For representing an image using a natural $m$-bit binary code, $L_{avg} = m$.

How is data compression achieved? Through variable-length coding: assign fewer bits to the more probable gray levels than to the less probable ones, then find $L_{avg}$, the compression ratio, and the redundancy. A small worked sketch follows.
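As a sketch (the gray-level probabilities and code lengths below are hypothetical, standing in for the slide's example table), $L_{avg}$, $C_R$, and $R_D$ for a variable-length code can be computed as:

```python
# Hypothetical 8-level source: probabilities and variable-length code lengths.
probs   = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
lengths = [2, 2, 2, 3, 4, 5, 6, 6]                   # l(r_k) for each gray level

l_avg = sum(l * p for l, p in zip(lengths, probs))   # average bits per pixel
c_r   = 3 / l_avg                                    # vs. a fixed 3-bit code
r_d   = 1 - 1 / c_r

print(f"L_avg = {l_avg:.2f} bits, C_R = {c_r:.2f}, R_D = {r_d:.2f}")
```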



Interpixel Redundancy

Related to the interpixel correlation within an image: the value of a pixel can be reasonably predicted from the values of its neighbours. Because the gray levels of neighbouring pixels are roughly the same, knowing the gray level of one pixel in a neighbourhood conveys a great deal about the gray levels of the others, so the information carried by individual pixels is relatively small. These dependencies between pixel values are called interpixel redundancy.

Autocorrelation


The autocorrelation coefficients along a single line of the image are computed as

$$\gamma(\Delta n) = \frac{A(\Delta n)}{A(0)}, \qquad A(\Delta n) = \frac{1}{N - \Delta n} \sum_{y=0}^{N-1-\Delta n} f(x, y)\, f(x, y + \Delta n)$$

For the entire image, the coefficients can be computed line by line and averaged.
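A direct transcription of this formula (a minimal sketch; the image line below is synthetic) might look like:

```python
import numpy as np

def autocorr_coeffs(row, max_shift):
    """Normalized autocorrelation gamma(dn) = A(dn)/A(0) along one image line."""
    row = np.asarray(row, dtype=float)
    N = row.size
    def A(dn):
        return (row[: N - dn] * row[dn:]).sum() / (N - dn)
    a0 = A(0)
    return [A(dn) / a0 for dn in range(max_shift + 1)]

# Synthetic line with strong neighbour correlation: coefficients stay near 1.
line = np.repeat(np.arange(0, 256, 16), 8)
print(np.round(autocorr_coeffs(line, 5), 3))
```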

To reduce interpixel redundancy, the image is transformed into a more efficient format. For example, the differences between adjacent pixels can be used to represent the image. Transformations that remove interpixel redundancies are termed mappings; if the original image can be reconstructed from the transformed dataset, the mappings are called reversible mappings.
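As an illustration (a minimal sketch, not the notes' own example), a reversible difference mapping stores the first pixel of a row and then only pixel-to-pixel differences:

```python
import numpy as np

def forward_map(row):
    """Map a row of pixels to (first pixel, adjacent differences)."""
    row = np.asarray(row, dtype=int)
    return row[0], np.diff(row)

def inverse_map(first, diffs):
    """Reconstruct the original row exactly from the mapped data."""
    return np.concatenate(([first], first + np.cumsum(diffs)))

row = np.array([100, 102, 101, 101, 105, 110, 110])
first, diffs = forward_map(row)
assert np.array_equal(inverse_map(first, diffs), row)  # reversible
print(diffs)  # small values cluster near zero and are cheaper to code
```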



Psychovisual Redundancy

Based on human perception and associated with real or quantifiable visual information. Elimination of psychovisual redundancy results in a loss of quantitative information; this is referred to as quantization. Quantization is the mapping of a broad range of input values to a limited number of output values, and it results in lossy data compression.


Fidelity Criteria

Objective fidelity criteria

Used when the level of information loss can be expressed as a function of the original (input) image and of the compressed and subsequently decompressed output image. Example: the root-mean-square (RMS) error between the input and output images.

The error between the input image $f(x,y)$ and the decompressed image $\hat{f}(x,y)$ at position $(x,y)$ is

$$e(x, y) = \hat{f}(x, y) - f(x, y)$$

and the root-mean-square error over the whole image is

$$e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x, y) - f(x, y) \right]^2 \right]^{1/2}$$

Mean-Square Signal-to-Noise Ratio

$$SNR_{ms} = \frac{\displaystyle\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x, y)^2}{\displaystyle\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x, y) - f(x, y) \right]^2}$$
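A minimal sketch of both objective measures (the arrays below are synthetic stand-ins for an original and a lossy reconstruction):

```python
import numpy as np

def rms_error(f, f_hat):
    """Root-mean-square error between the original f and the decompressed f_hat."""
    return np.sqrt(np.mean((f_hat.astype(float) - f.astype(float)) ** 2))

def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the decompressed image."""
    f, f_hat = f.astype(float), f_hat.astype(float)
    return np.sum(f_hat ** 2) / np.sum((f_hat - f) ** 2)

rng = np.random.default_rng(0)
f = rng.integers(0, 256, size=(64, 64))          # stand-in original image
f_hat = f + rng.integers(-2, 3, size=f.shape)    # stand-in lossy reconstruction
print(rms_error(f, f_hat), snr_ms(f, f_hat))
```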

Subjective fidelity criteria

Measure image quality by the subjective evaluations of human observers.


Image Compression Models


Encoder = source encoder + channel encoder

The source encoder removes coding, interpixel, and psychovisual redundancies in the input image and outputs a set of symbols. The channel encoder increases the noise immunity of the source encoder's output.

Decoder = channel decoder + source decoder

Source Encoder

Mapper
  Transforms the input data into a format designed to reduce interpixel redundancies in the input image.
  Generally a reversible process.
  May or may not directly reduce the amount of data required to represent the image.
  Examples: run-length coding (directly results in data compression), transform coding.

Quantizer
  Reduces the accuracy of the mapper's output in accordance with some pre-established fidelity criterion.
  Reduces the psychovisual redundancies of the input image.
  Irreversible process (irreversible information loss).
  Must be omitted when error-free compression is desired.

Symbol encoder
  Creates a fixed- or variable-length code to represent the quantizer output and maps the output in accordance with the code.
  Usually, a variable-length code is used to represent the mapped and quantized output.
  Assigns the shortest codewords to the most frequently occurring output values, thereby reducing coding redundancy.
  Reversible process.

Source decoder
  Symbol decoder
  Inverse mapper
  The inverse operations are performed in the reverse order.

Channel Encoder and Decoder
  Essential when the channel is noisy or error-prone.
  Source-encoded data are highly sensitive to channel noise.
  The channel encoder reduces the impact of channel noise by inserting a controlled form of redundancy into the source-encoded data.
  Example: a Hamming code appends enough bits to the data being encoded to ensure that any two valid codewords differ by a minimum number of bits.

7-bit Hamming(7,4) code
  7-bit codewords: a 4-bit word plus 3 bits of redundancy.
  The distance between any two valid codewords (the minimum number of bit changes required to change one codeword into another) is 3, so all single-bit errors can be detected and corrected.
  The Hamming distance between two codewords is the number of places in which they differ.
  The minimum distance of a code is the minimum number of bit changes between any two codewords.
  The Hamming weight of a codeword is the number of non-zero elements (1s) in the codeword.

Binary data b3b2b1b0    Hamming codeword h1h2h3h4h5h6h7
0000                    0000000
0001                    1101001
0010                    0101010
0011                    1000011
0100                    1001100
0101                    0100101
0110                    1100110
0111                    0001111
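The parity structure consistent with this table (inferred from the listed codewords; the notes do not spell out the parity equations) places the data bits at h3, h5, h6, h7 and parity bits at h1, h2, h4:

```python
def hamming74_encode(b3, b2, b1, b0):
    """Encode 4 data bits into a 7-bit Hamming codeword h1..h7.
    Parity assignments inferred from the codeword table above."""
    h1 = b3 ^ b2 ^ b0          # parity over positions 3, 5, 7
    h2 = b3 ^ b1 ^ b0          # parity over positions 3, 6, 7
    h4 = b2 ^ b1 ^ b0          # parity over positions 5, 6, 7
    return (h1, h2, b3, h4, b2, b1, b0)

# Reproduces the table rows, e.g. 0001 -> 1101001 and 0110 -> 1100110.
print(''.join(map(str, hamming74_encode(0, 0, 0, 1))))
print(''.join(map(str, hamming74_encode(0, 1, 1, 0))))
```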


Basics of Probability

Ref: http://en.wikipedia.org/wiki/Probability

Elements of Information Theory

Measuring Information

A random event E occurring with probability P(E) is said to contain

$$I(E) = \log \frac{1}{P(E)} = -\log P(E)$$

units of information. I(E) is called the self-information of E. The amount of self-information of an event is inversely related to its probability.

If P(E) = 1, then I(E) = 0: there is no uncertainty associated with the event, and no information is conveyed because it is certain that the event will occur.

If a base-m logarithm is used, the measurement is in m-ary units. If the base is 2, the measurement is in binary units and the unit of information is called a bit.

If P(E) = 1/2, then I(E) = -log2(1/2) = 1 bit. That is, 1 bit of information is conveyed when one of two equally likely outcomes occurs.


The Information Channel

The information channel is the physical medium that connects the information source to the user of the information. Self-information is transferred between an information source and a user of the information through the information channel.

The information source generates a random sequence of symbols from a finite or countably infinite set of possible symbols; the output of the source is a discrete random variable.

The set of source symbols (letters) {a1, a2, ..., aJ} is referred to as the source alphabet A. The probability of the event that the source will produce symbol aj is P(aj), with

$$\sum_{j=1}^{J} P(a_j) = 1$$

The J x 1 vector

$$\mathbf{z} = [P(a_1), P(a_2), \ldots, P(a_J)]^T$$

is used to represent the set of all source symbol probabilities. The finite ensemble (A, z) describes the information source completely.

The probability that the discrete source will emit symbol aj is P(aj). Therefore, the self-information generated by the production of a single source symbol is

$$I(a_j) = -\log P(a_j)$$

If k source symbols are generated, symbol aj will on average appear k P(aj) times for large k, so the average self-information obtained from k outputs is

$$-k P(a_1) \log P(a_1) - k P(a_2) \log P(a_2) - \cdots - k P(a_J) \log P(a_J) = -k \sum_{j=1}^{J} P(a_j) \log P(a_j)$$

The average information per source output, denoted H(z), is

$$H(\mathbf{z}) = E[I(\mathbf{z})] = \sum_{j=1}^{J} P(a_j) I(a_j) = \sum_{j=1}^{J} P(a_j) \log \frac{1}{P(a_j)} = -\sum_{j=1}^{J} P(a_j) \log P(a_j)$$

This is called the uncertainty or entropy of the source. It is the average amount of information (in m-ary units per symbol) obtained by observing a single source output. If the source symbols are equally probable, the entropy is maximized and the source provides the maximum possible average information per source symbol.
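A small sketch computing self-information and entropy for a hypothetical source distribution:

```python
import math

def self_information(p, base=2):
    """Self-information -log(p) of an event with probability p."""
    return -math.log(p, base)

def entropy(probs, base=2):
    """Entropy H(z) = -sum_j P(a_j) log P(a_j) of a discrete source."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

z = [0.5, 0.25, 0.125, 0.125]           # hypothetical source probabilities
print(self_information(0.5))            # 1.0 bit
print(entropy(z))                       # 1.75 bits/symbol
print(entropy([0.25] * 4))              # 2.0 bits/symbol: maximal for J = 4
```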

A simple information system

The output of the channel is also a discrete random variable, which takes on values from a finite or countably infinite set of symbols {b1, b2, ..., bK} called the channel alphabet B. The finite ensemble (B, v), where

$$\mathbf{v} = [P(b_1), P(b_2), \ldots, P(b_K)]^T$$

describes the channel output completely and thus the information received by the user.

The probability P(bk) of a given channel output and the probability distribution of the source z are related as

$$P(b_k) = \sum_{j=1}^{J} P(b_k \mid a_j) P(a_j)$$

where P(bk | aj) is the conditional probability that the output symbol bk is received given that the source symbol aj was generated.

Forward Channel Transition Matrix (Channel Matrix)

$$\mathbf{Q} = \begin{bmatrix} P(b_1 \mid a_1) & P(b_1 \mid a_2) & \cdots & P(b_1 \mid a_J) \\ P(b_2 \mid a_1) & P(b_2 \mid a_2) & \cdots & P(b_2 \mid a_J) \\ \vdots & \vdots & & \vdots \\ P(b_K \mid a_1) & P(b_K \mid a_2) & \cdots & P(b_K \mid a_J) \end{bmatrix}$$

with matrix element $q_{kj} = P(b_k \mid a_j)$. The probability distribution of the output alphabet can be computed from

$$\mathbf{v} = \mathbf{Q}\mathbf{z}$$
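A sketch of the matrix-vector relation v = Qz for a hypothetical two-symbol source and channel:

```python
import numpy as np

# Hypothetical channel matrix Q: q_kj = P(b_k | a_j), each column sums to 1.
Q = np.array([[0.9, 0.2],
              [0.1, 0.8]])
z = np.array([0.6, 0.4])      # source probabilities P(a_1), P(a_2)

v = Q @ z                     # output probabilities P(b_1), P(b_2)
print(v, v.sum())             # [0.62 0.38] 1.0
```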

Conditional entropy function

The entropy of the source is

$$H(\mathbf{z}) = E[I(\mathbf{z})] = \sum_{j=1}^{J} P(a_j) I(a_j) = -\sum_{j=1}^{J} P(a_j) \log P(a_j)$$

Analogously, the conditional entropy function given a particular output bk is

$$H(\mathbf{z} \mid b_k) = E[I(\mathbf{z} \mid b_k)] = \sum_{j=1}^{J} P(a_j \mid b_k) I(a_j \mid b_k) = -\sum_{j=1}^{J} P(a_j \mid b_k) \log P(a_j \mid b_k)$$

where P(aj | bk) is the probability that symbol aj was transmitted by the source given that the user receives bk.

The expected (average) value over all bk is

$$H(\mathbf{z} \mid \mathbf{v}) = \sum_{k=1}^{K} H(\mathbf{z} \mid b_k) P(b_k) = -\sum_{k=1}^{K} \sum_{j=1}^{J} P(a_j \mid b_k) \log P(a_j \mid b_k)\, P(b_k)$$

Using the conditional probability $P(a_j \mid b_k) = \dfrac{P(a_j, b_k)}{P(b_k)}$, this becomes

$$H(\mathbf{z} \mid \mathbf{v}) = -\sum_{k=1}^{K} \sum_{j=1}^{J} P(a_j, b_k) \log P(a_j \mid b_k)$$

P(aj, bk) is the joint probability of aj and bk, that is, the probability that aj is transmitted and bk is received.

Mutual information

H(z) is the average information per source symbol, assuming no knowledge of the output symbol. H(z | v) is the average information per source symbol, assuming observation of the output symbol. The difference between H(z) and H(z | v) is the average information received upon observing a single output symbol, and is called the mutual information of z and v:

$$I(\mathbf{z}, \mathbf{v}) = H(\mathbf{z}) - H(\mathbf{z} \mid \mathbf{v})$$

Substituting the expressions for H(z) and H(z | v),

$$I(\mathbf{z}, \mathbf{v}) = -\sum_{j=1}^{J} P(a_j) \log P(a_j) + \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log P(a_j \mid b_k)$$

and since $P(a_j) = P(a_j, b_1) + P(a_j, b_2) + \cdots + P(a_j, b_K) = \sum_{k=1}^{K} P(a_j, b_k)$,

$$I(\mathbf{z}, \mathbf{v}) = -\sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log P(a_j) + \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log P(a_j \mid b_k)$$

$$= \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log \frac{P(a_j \mid b_k)}{P(a_j)} = \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log \frac{P(a_j, b_k)}{P(a_j)\, P(b_k)}$$

Since $P(a_j, b_k) = P(a_j \mid b_k)\, P(b_k) = P(b_k \mid a_j)\, P(a_j) = q_{kj}\, P(a_j)$,

$$I(\mathbf{z}, \mathbf{v}) = \sum_{j=1}^{J} \sum_{k=1}^{K} P(b_k \mid a_j) P(a_j) \log \frac{P(b_k \mid a_j) P(a_j)}{P(a_j) P(b_k)} = \sum_{j=1}^{J} \sum_{k=1}^{K} q_{kj} P(a_j) \log \frac{q_{kj}}{P(b_k)}$$

Finally, substituting $P(b_k) = \sum_{i=1}^{J} P(b_k \mid a_i) P(a_i) = \sum_{i=1}^{J} q_{ki} P(a_i)$,

$$I(\mathbf{z}, \mathbf{v}) = \sum_{j=1}^{J} \sum_{k=1}^{K} q_{kj} P(a_j) \log \frac{q_{kj}}{\sum_{i=1}^{J} q_{ki} P(a_i)}$$

The minimum possible value of I(z, v) is zero, and it occurs when the input and output symbols are statistically independent, that is, when P(aj, bk) = P(aj) P(bk). In that case

$$I(\mathbf{z}, \mathbf{v}) = \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log \frac{P(a_j, b_k)}{P(a_j) P(b_k)} = \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log \frac{P(a_j) P(b_k)}{P(a_j) P(b_k)} = \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log 1 = 0$$
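A sketch that evaluates I(z, v) numerically from z and Q and checks the independent case (the numbers are hypothetical):

```python
import numpy as np

def mutual_information(z, Q):
    """I(z, v) = sum_j sum_k q_kj P(a_j) log2( q_kj / P(b_k) )."""
    z, Q = np.asarray(z, float), np.asarray(Q, float)
    v = Q @ z                              # output distribution P(b_k)
    I = 0.0
    for j, p_aj in enumerate(z):
        for k in range(Q.shape[0]):
            q_kj = Q[k, j]
            if q_kj > 0:
                I += q_kj * p_aj * np.log2(q_kj / v[k])
    return I

z = [0.6, 0.4]
Q_noisy = [[0.9, 0.2], [0.1, 0.8]]         # informative channel
Q_indep = [[0.5, 0.5], [0.5, 0.5]]         # output independent of input
print(mutual_information(z, Q_noisy))      # ~0.39 bits > 0
print(mutual_information(z, Q_indep))      # 0.0
```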

Channel Capacity

The maximum value of I(z, v) over all possible choices of source probabilities in the vector z is called the capacity C of the channel described by channel matrix Q:

$$C = \max_{\mathbf{z}} \left[ I(\mathbf{z}, \mathbf{v}) \right]$$

Channel capacity is the maximum rate at which information can be transmitted reliably through the channel. Two cases are considered next: a binary information source and the binary symmetric channel (BSC).

Binary Information Source

Source alphabet: A = {a1, a2} = {0, 1}, with $P(a_1) = p_{bs}$ and $P(a_2) = 1 - p_{bs} = \bar{p}_{bs}$, so that $\mathbf{z} = [p_{bs}, \bar{p}_{bs}]^T$. The entropy of the source is

$$H(\mathbf{z}) = -p_{bs} \log_2 p_{bs} - \bar{p}_{bs} \log_2 \bar{p}_{bs}$$

The function $H_{bs}(t) = -t \log_2 t - (1-t) \log_2 (1-t)$ is called the binary entropy function, so that $H(\mathbf{z}) = H_{bs}(p_{bs})$.

Binary Symmetric Channel (noisy binary information channel)

Let the probability of error during transmission of any symbol be $p_e$, and write $\bar{p}_e = 1 - p_e$. The channel matrix for the BSC is

$$\mathbf{Q} = \begin{bmatrix} P(b_1 \mid a_1) & P(b_1 \mid a_2) \\ P(b_2 \mid a_1) & P(b_2 \mid a_2) \end{bmatrix} = \begin{bmatrix} P(0 \mid 0) & P(0 \mid 1) \\ P(1 \mid 0) & P(1 \mid 1) \end{bmatrix} = \begin{bmatrix} 1 - p_e & p_e \\ p_e & 1 - p_e \end{bmatrix} = \begin{bmatrix} \bar{p}_e & p_e \\ p_e & \bar{p}_e \end{bmatrix}$$

Output alphabet: B = {b1, b2} = {0, 1}. The probabilities of receiving the output symbols b1 and b2 can be determined from v = Qz:

$$\mathbf{v} = \begin{bmatrix} \bar{p}_e & p_e \\ p_e & \bar{p}_e \end{bmatrix} \begin{bmatrix} p_{bs} \\ \bar{p}_{bs} \end{bmatrix}$$

so that

$$P(b_1) = P(0) = \bar{p}_e\, p_{bs} + p_e\, \bar{p}_{bs}, \qquad P(b_2) = P(1) = p_e\, p_{bs} + \bar{p}_e\, \bar{p}_{bs}$$

The mutual information of the BSC can be computed by substituting the elements of Q and z into

$$I(\mathbf{z}, \mathbf{v}) = \sum_{j=1}^{2} \sum_{k=1}^{2} q_{kj} P(a_j) \log_2 \frac{q_{kj}}{\sum_{i=1}^{2} q_{ki} P(a_i)}$$

which expands into four terms:

$$I(\mathbf{z}, \mathbf{v}) = q_{11} P(a_1) \log_2 \frac{q_{11}}{q_{11}P(a_1)+q_{12}P(a_2)} + q_{21} P(a_1) \log_2 \frac{q_{21}}{q_{21}P(a_1)+q_{22}P(a_2)} + q_{12} P(a_2) \log_2 \frac{q_{12}}{q_{11}P(a_1)+q_{12}P(a_2)} + q_{22} P(a_2) \log_2 \frac{q_{22}}{q_{21}P(a_1)+q_{22}P(a_2)}$$

With $q_{11} = q_{22} = \bar{p}_e$, $q_{12} = q_{21} = p_e$, $P(a_1) = p_{bs}$, and $P(a_2) = \bar{p}_{bs}$, collecting the terms gives

$$I(\mathbf{z}, \mathbf{v}) = H_{bs}(p_{bs}\,\bar{p}_e + \bar{p}_{bs}\,p_e) - H_{bs}(p_e)$$

where $H_{bs}(\cdot)$ is the binary entropy function defined above.

Capacity of the BSC

The capacity is the maximum of the mutual information over all source distributions. I(z, v) is maximum when $p_{bs} = 1/2$, which corresponds to $\mathbf{z} = [1/2,\ 1/2]^T$. Then

$$I(\mathbf{z}, \mathbf{v}) = H_{bs}\!\left(\tfrac{1}{2}\bar{p}_e + \tfrac{1}{2}p_e\right) - H_{bs}(p_e) = H_{bs}\!\left(\tfrac{1}{2}\right) - H_{bs}(p_e) = \left[-\tfrac{1}{2}\log_2 \tfrac{1}{2} - \tfrac{1}{2}\log_2 \tfrac{1}{2}\right] - H_{bs}(p_e) = 1 - H_{bs}(p_e)$$

so the capacity of the BSC is $C = 1 - H_{bs}(p_e)$ bits per symbol.
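A short sketch of this closed-form capacity (the values chosen for p_e are arbitrary illustrations):

```python
import numpy as np

def binary_entropy(t):
    """H_bs(t) = -t*log2(t) - (1-t)*log2(1-t), with H_bs(0) = H_bs(1) = 0."""
    if t in (0.0, 1.0):
        return 0.0
    return -t * np.log2(t) - (1 - t) * np.log2(1 - t)

def bsc_capacity(p_e):
    """Capacity of a binary symmetric channel with error probability p_e."""
    return 1.0 - binary_entropy(p_e)

for p_e in (0.0, 0.1, 0.5):
    print(p_e, bsc_capacity(p_e))   # 1.0, ~0.531, 0.0 bits/symbol
```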



Fundamental Coding Theorems


The Noiseless Coding Theorem (Shannon's First Theorem, or Shannon's source coding theorem for lossless data compression)

Applies when both the information channel and the communication system are error-free. It defines the minimum average codeword length per source symbol that can be achieved; the aim is to represent the source as compactly as possible.

Let the information source (A, z), with statistically independent source symbols, output an n-tuple of symbols from the source alphabet A. The source output then takes on one of $J^n$ possible values, denoted $\alpha_i$, from

$$A' = \{\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_{J^n}\}$$

The probability of a given $\alpha_i$, $P(\alpha_i)$, is related to the single-symbol probabilities as

$$P(\alpha_i) = P(a_{j1})\, P(a_{j2}) \cdots P(a_{jn})$$

and the ensemble of these probabilities is $\mathbf{z}' = \{P(\alpha_1), P(\alpha_2), \ldots, P(\alpha_{J^n})\}$. The entropy of this source is given by

$$H(\mathbf{z}') = -\sum_{i=1}^{J^n} P(\alpha_i) \log P(\alpha_i) = -\sum_{i=1}^{J^n} P(a_{j1}) P(a_{j2}) \cdots P(a_{jn}) \log \left[ P(a_{j1}) P(a_{j2}) \cdots P(a_{jn}) \right] = n H(\mathbf{z})$$

Hence, the entropy of this zero-memory source is n times the entropy of the corresponding single-symbol source; such a source is called the nth extension of the single-symbol source.

The self-information of $\alpha_i$ is $\log \dfrac{1}{P(\alpha_i)}$. Since $\alpha_i$ is represented by a codeword whose length $l(\alpha_i)$ is the smallest integer exceeding its self-information,

$$\log \frac{1}{P(\alpha_i)} \le l(\alpha_i) < \log \frac{1}{P(\alpha_i)} + 1$$

Multiplying by $P(\alpha_i)$ and summing over all i,

$$\sum_{i=1}^{J^n} P(\alpha_i) \log \frac{1}{P(\alpha_i)} \le \sum_{i=1}^{J^n} P(\alpha_i)\, l(\alpha_i) < \sum_{i=1}^{J^n} P(\alpha_i) \left[ \log \frac{1}{P(\alpha_i)} + 1 \right]$$

that is,

$$H(\mathbf{z}') \le L'_{avg} < H(\mathbf{z}') + 1, \qquad \text{where } L'_{avg} = \sum_{i=1}^{J^n} P(\alpha_i)\, l(\alpha_i)$$

Dividing by n and using $H(\mathbf{z}') = n H(\mathbf{z})$,

$$H(\mathbf{z}) \le \frac{L'_{avg}}{n} < H(\mathbf{z}) + \frac{1}{n}$$

and in the limit

$$\lim_{n \to \infty} \left[ \frac{L'_{avg}}{n} \right] = H(\mathbf{z})$$
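A sketch that checks these bounds numerically for a hypothetical source, using codeword lengths $l(\alpha_i) = \lceil \log_2 1/P(\alpha_i) \rceil$ on the n-th extension:

```python
import math
from itertools import product

def extension_bounds(probs, n):
    """Return (H(z), L'_avg / n) for the n-th extension of a memoryless source,
    using codeword lengths l = ceil(log2(1/P))."""
    H = -sum(p * math.log2(p) for p in probs)
    l_avg = 0.0
    for combo in product(probs, repeat=n):       # all J^n block probabilities
        p = math.prod(combo)
        l_avg += p * math.ceil(math.log2(1 / p))
    return H, l_avg / n

z = [0.7, 0.2, 0.1]                              # hypothetical source
for n in (1, 2, 4):
    H, rate = extension_bounds(z, n)
    print(n, round(rate, 4), "within", round(H, 4), "and", round(H + 1 / n, 4))
```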

Shannon's source coding theorem for lossless data compression therefore states that, for any code used to represent the symbols from a source, the minimum number of bits required to represent the source symbols on average must be at least equal to the entropy of the source.

The efficiency of any encoding strategy can be defined as

$$\eta = \frac{n H(\mathbf{z})}{L'_{avg}} = \frac{H(\mathbf{z}')}{L'_{avg}}, \qquad \text{with } H(\mathbf{z}) \le \frac{L'_{avg}}{n} < H(\mathbf{z}) + \frac{1}{n}$$

The Noisy Coding Theorem (Shannon's Second Theorem)

Applies when the channel is noisy or prone to error. The aim is to encode information so that communication is made reliable and the error is minimized, for example by a repetitive coding scheme. Encode the nth extension of the source using K-ary code sequences of length r, with $K^r \le J^n$, and select only $\varphi$ of the $K^r$ possible code sequences as valid codewords.

A zero-memory information source generates information at a rate equal to its entropy, and the nth extension of the source provides information at a rate of

$$\frac{H(\mathbf{z}')}{n}$$

information units per symbol. If the information is coded, the maximum rate of coded information is $(\log \varphi)/r$, and it occurs when the $\varphi$ valid codewords used to code the source are equally probable. Hence, a code of size $\varphi$ and block length r is said to have a rate of

$$R = \frac{\log \varphi}{r}$$

information units per symbol.

The noisy coding theorem states that, for any R < C, where C is the capacity of the zero-memory channel with matrix Q, there exists an integer r and a code of block length r and rate R such that the probability of a block decoding error is less than or equal to ε for any ε > 0. That is, the probability of error can be made arbitrarily small so long as the coded message rate is less than the capacity of the channel.

The Source Coding Theorem for Lossy Data Compression

Applies when the channel is error-free but the communication process is lossy. The aim is information compression: to determine the smallest rate at which information about the source can be conveyed to the user, and to encode the source so that the average distortion is less than a maximum allowable level D.

Let the information source and the decoder output be defined by (A, z) and (B, v), respectively. A nonnegative cost function ρ(aj, bk), called a distortion measure, defines the penalty associated with reproducing source output aj with decoder output bk.

The average value of the distortion is given by

$$d(\mathbf{Q}) = \sum_{j=1}^{J} \sum_{k=1}^{K} \rho(a_j, b_k)\, P(a_j, b_k) = \sum_{j=1}^{J} \sum_{k=1}^{K} \rho(a_j, b_k)\, P(a_j)\, q_{kj}$$

where Q is the channel matrix. The rate distortion function R(D) is defined as

$$R(D) = \min_{\mathbf{Q} \in \mathbf{Q}_D} I(\mathbf{z}, \mathbf{v})$$

where $\mathbf{Q}_D = \{\mathbf{Q} \mid d(\mathbf{Q}) \le D\}$ is the set of all D-admissible encoding-decoding procedures.

If D = 0, R(D) is less than or equal to the entropy of the source, R(0) ≤ H(z).

$$R(D) = \min_{\mathbf{Q} \in \mathbf{Q}_D} I(\mathbf{z}, \mathbf{v})$$

defines the minimum rate at which information can be conveyed to the user subject to the constraint that the average distortion be less than or equal to D. Here I(z, v) is minimized subject to the constraints

$$q_{kj} \ge 0, \qquad \sum_{k=1}^{K} q_{kj} = 1, \qquad d(\mathbf{Q}) = D$$

The condition d(Q) = D indicates that the minimum information rate occurs when the maximum possible distortion is allowed.
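A sketch of the average-distortion computation d(Q) for a hypothetical source, channel, and Hamming (0/1) distortion measure:

```python
import numpy as np

def average_distortion(z, Q, rho):
    """d(Q) = sum_j sum_k rho(a_j, b_k) * P(a_j) * q_kj."""
    z, Q, rho = np.asarray(z, float), np.asarray(Q, float), np.asarray(rho, float)
    # Q[k, j] = P(b_k | a_j); rho[j, k] = distortion of reproducing a_j as b_k.
    return sum(rho[j, k] * z[j] * Q[k, j]
               for j in range(len(z)) for k in range(Q.shape[0]))

z   = [0.6, 0.4]                      # source probabilities
Q   = [[0.9, 0.2], [0.1, 0.8]]        # hypothetical encoding/decoding channel
rho = [[0.0, 1.0], [1.0, 0.0]]        # Hamming distortion: 1 if a_j != b_k
print(average_distortion(z, Q, rho))  # 0.6*0.1 + 0.4*0.2 = 0.14
```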

Shannon's source coding theorem for lossy data compression states that, for a given source (with all of its statistical properties known) and a given distortion measure, there is a function R(D), called the rate-distortion function, such that if D is the tolerable amount of distortion, then R(D) is the best possible compression rate.

The theory of lossy data compression is also known as rate distortion theory. The lossless and lossy data compression theories are collectively known as source coding theory.

Thank You
