A Review of Data Compression Techniques. Technical Seminar Presentation 2005, National Institute of Science and Technology (NIST), Berhampur. Presented by Sudeepta Mishra (Roll# CS200117052) under the guidance of Mr. Rowdra Ghatak.
Transcript
Page 1: Data compression tech cs

National Institute of Science and Technology

Technical Seminar Presentation 2005

Sudeepta Mishra CS200117052

A Review of Data Compression Techniques

Presented by

Sudeepta Mishra Roll# CS200117052

At NIST, Berhampur

Under the guidance of Mr. Rowdra Ghatak

Page 2: Data compression tech cs

Introduction

• Data compression is the process of encoding data so that it takes less storage space or less transmission time than it would if it were not compressed.

• Compression is possible because most real-world data is highly redundant.

Page 3: Data compression tech cs

Different Compression Techniques

• There are two main types of data compression techniques.

– Lossless compression.

Useful for spreadsheets, text, and executable programs, where every bit must be recovered exactly.

– Lossy compression.

Used for images, movies and sounds, where some loss of detail is acceptable.

Page 4: Data compression tech cs

Types of Lossless Data Compression

• Dictionary coders.

– Zip (file format).

– Lempel-Ziv.

• Entropy encoding.

– Huffman coding (simple entropy coding).

• Run-length encoding.

Page 5: Data compression tech cs

Dictionary-Based Compression

• Dictionary-based algorithms do not encode single symbols as variable-length bit strings; they encode variable-length strings of symbols as single tokens.

• The tokens form an index into a phrase dictionary.

• If the tokens are smaller than the phrases they replace, compression occurs.
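The idea of replacing whole phrases by tokens can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the phrase list, the helper names `build_index`, `encode` and `decode`, and the choice to pass unknown words through as literals are all hypothetical and not taken from the slides.

```python
def build_index(phrases):
    """Map each phrase to its position in the phrase dictionary."""
    return {phrase: i for i, phrase in enumerate(phrases)}


def encode(words, index):
    """Replace every known word by its integer token; keep unknown words as literals."""
    return [index.get(word, word) for word in words]


def decode(tokens, phrases):
    """Look integer tokens up in the phrase dictionary; literals pass through unchanged."""
    return [phrases[t] if isinstance(t, int) else t for t in tokens]


# Hypothetical phrase dictionary, chosen only for this illustration.
phrases = ["compression", "dictionary", "example"]
words = "a good example of how dictionary based compression works".split()

tokens = encode(words, build_index(phrases))
restored = decode(tokens, phrases)
```

Compression occurs only when a token takes fewer bits than the phrase it stands for, which is why long, frequent phrases are the useful dictionary entries.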

Page 6: Data compression tech cs

Types of Dictionary

• Static Dictionary.

• Semi-Adaptive Dictionary.

• Adaptive Dictionary.

– Lempel-Ziv algorithms belong to this category of dictionary coders. The dictionary is built in a single pass, at the same time as the data is encoded.

– The decoder can build up the same dictionary as the encoder while decompressing the data, so the dictionary never has to be transmitted.

Page 7: Data compression tech cs

Dictionary-Based Compression: Example

• Using an English dictionary, the string:

“A good example of how dictionary based compression works”

• Gives: 1/1 822/3 674/4 1343/60 928/75 550/32 173/46 421/2

• Using the dictionary as a lookup table, each word is coded as x/y, where x gives the page number and y gives the number of the word on that page. If the dictionary has 2,200 pages with fewer than 256 entries per page, x requires 12 bits and y requires 8 bits, i.e., 20 bits per word (2.5 bytes per word). Using ASCII coding, the letters of the above string require 48 bytes (spaces not counted), whereas the eight codes listed require only 20 bytes (2.5 × 8): a saving of more than 50%.
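The arithmetic of this example can be checked with a short Python sketch. The variable names are illustrative, and the sketch assumes, as the 48-byte figure implies, that only the letters of the sentence are counted and not the spaces.

```python
sentence = "A good example of how dictionary based compression works"
codes = "1/1 822/3 674/4 1343/60 928/75 550/32 173/46 421/2".split()

max_page = 2200    # largest page number x that must be representable
max_entry = 255    # fewer than 256 entries per page

bits_x = max_page.bit_length()     # bits needed for the page number
bits_y = max_entry.bit_length()    # bits needed for the word number on the page
bits_per_word = bits_x + bits_y

# ASCII size of the sentence, counting letters only (one byte each).
ascii_bytes = sum(len(word) for word in sentence.split())

# Size of the coded form: one x/y pair per listed code.
coded_bytes = len(codes) * bits_per_word // 8

saving = 1 - coded_bytes / ascii_bytes
print(bits_x, bits_y, bits_per_word, ascii_bytes, coded_bytes, round(saving, 2))
```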

Page 8: Data compression tech cs

Lempel Ziv

• It is a family of algorithms, stemming from the two algorithms proposed by Jacob Ziv and Abraham Lempel in their landmark papers in 1977 and 1978.

[Figure: family tree of Lempel-Ziv variants descending from LZ77 and LZ78. Variants shown: LZR, LZH, LZSS, LZB, LZFG, LZC, LZT, LZMW, LZW, LZJ]

Page 9: Data compression tech cs

LZW Algorithm

• It is an improved version of the LZ78 algorithm.

• Published by Terry Welch in 1984.

• A dictionary that is indexed by “codes” is used. The dictionary is assumed to be initialized with 256 entries (indexed 0 through 255), one for each possible byte value, i.e. the extended ASCII table.

Page 10: Data compression tech cs

The LZW Algorithm (Compression)

W = NIL;
while (there is input) {
    K = next symbol from input;
    if (WK exists in the dictionary) {
        W = WK;
    } else {
        output (index(W));
        add WK to the dictionary;
        W = K;
    }
}
output (index(W));    /* no input left: flush the code for the pending W */
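The compression loop can be sketched in Python as follows. This is a minimal sketch, not a production encoder: the function name `lzw_compress` and the `alphabet` parameter are illustrative, and the initial dictionary numbers the symbols from 1 (as in the slide example) instead of using the 256 byte values.

```python
def lzw_compress(text, alphabet):
    """Return (codes, dictionary) for text, starting from a dictionary of single symbols."""
    # Assumption: initial symbols are numbered 1, 2, 3, ... in alphabet order.
    dictionary = {symbol: i for i, symbol in enumerate(alphabet, start=1)}
    next_code = len(dictionary) + 1
    w = ""          # W: longest string matched so far
    codes = []
    for k in text:  # K: next symbol from the input
        if k not in dictionary:
            raise ValueError(f"symbol {k!r} is not in the initial alphabet")
        wk = w + k
        if wk in dictionary:
            w = wk                          # keep extending the match
        else:
            codes.append(dictionary[w])     # output index(W)
            dictionary[wk] = next_code      # add WK to the dictionary
            next_code += 1
            w = k
    if w:
        codes.append(dictionary[w])         # no input left: flush the pending W
    return codes, dictionary


codes, dictionary = lzw_compress("abdabdac", "abcd")
print(codes)
```

With the alphabet a, b, c, d and the input abdabdac, the sketch yields the codes 1 2 4 5 7 3 and the new entries ab = 5, bd = 6, da = 7, abd = 8, dac = 9.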

Page 11: Data compression tech cs

The LZW Algorithm (Compression) Flow Chart

[Flow chart: START; W = NULL; IS EOF? (YES: STOP; NO: K = NEXT INPUT); IS WK FOUND? (YES: W = WK; NO: OUTPUT INDEX OF W, ADD WK TO DICTIONARY, W = K); loop back to IS EOF?]

Page 12: Data compression tech cs

The LZW Algorithm (Compression) Example

• Input string is: a b d a b d a c

• The initial dictionary contains the symbols a, b, c, d with index values 1, 2, 3, 4 respectively.

• Now the input string is read from left to right, starting from a.

Dictionary: a = 1, b = 2, c = 3, d = 4

Page 13: Data compression tech cs

The LZW Algorithm (Compression) Example

• W = Null

• K = a

• WK = a is in the dictionary, so W = a.

Input: a b d a b d a c (K is the 1st symbol)

Dictionary: a = 1, b = 2, c = 3, d = 4

Page 14: Data compression tech cs

The LZW Algorithm (Compression) Example

• K = b

• WK = ab is not in the dictionary.

• Add WK to the dictionary (ab = 5).

• Output the code for W (a = 1).

• Set W = b

Input: a b d a b d a c (K is the 2nd symbol)

Output so far: 1

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5

Page 15: Data compression tech cs

The LZW Algorithm (Compression) Example

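• K = d

• WK = bd is not in the dictionary.

• Add bd to the dictionary (bd = 6).

• Output the code for W (b = 2).

• Set W = d

Input: a b d a b d a c (K is the 3rd symbol)

Output so far: 1 2

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6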

Page 16: Data compression tech cs

The LZW Algorithm (Compression) Example

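• K = a

• WK = da is not in the dictionary.

• Add it to the dictionary (da = 7).

• Output the code for W (d = 4).

• Set W = a

Input: a b d a b d a c (K is the 4th symbol)

Output so far: 1 2 4

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6, da = 7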

Page 17: Data compression tech cs

The LZW Algorithm (Compression) Example

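• K = b

• WK = ab is in the dictionary, so W = ab and nothing is output.

Input: a b d a b d a c (K is the 5th symbol)

Output so far: 1 2 4

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6, da = 7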

Page 18: Data compression tech cs

The LZW Algorithm (Compression) Example

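• K = d

• WK = abd is not in the dictionary.

• Add WK to the dictionary (abd = 8).

• Output the code for W (ab = 5).

• Set W = d

Input: a b d a b d a c (K is the 6th symbol)

Output so far: 1 2 4 5

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6, da = 7, abd = 8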

Page 19: Data compression tech cs

The LZW Algorithm (Compression) Example

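• K = a

• WK = da is in the dictionary, so W = da and nothing is output.

Input: a b d a b d a c (K is the 7th symbol)

Output so far: 1 2 4 5

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6, da = 7, abd = 8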

Page 20: Data compression tech cs

The LZW Algorithm (Compression) Example

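• K = c

• WK = dac is not in the dictionary.

• Add WK to the dictionary (dac = 9).

• Output the code for W (da = 7).

• Set W = c

• No input is left, so output the code for W (c = 3).

Input: a b d a b d a c (K is the 8th symbol)

Output so far: 1 2 4 5 7

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6, da = 7, abd = 8, dac = 9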

Page 21: Data compression tech cs

The LZW Algorithm (Compression) Example

• The final output string is: 1 2 4 5 7 3

• Stop.

Input: a b d a b d a c

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6, da = 7, abd = 8, dac = 9

Page 22: Data compression tech cs

LZW Decompression Algorithm

read a code k;
entry = dictionary entry for k;
output entry;
w = entry;
while ( read a code k )
{
    if (k is in the dictionary)
        entry = dictionary entry for k;
    else
        entry = w + w[0];    /* k is the code the encoder has only just created */
    output entry;
    add w + entry[0] to dictionary;
    w = entry;
}
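A matching decoder can be sketched in Python. As before this is a minimal sketch under stated assumptions: the name `lzw_decompress` and the `alphabet` parameter are illustrative, and the initial dictionary numbers the symbols from 1 as in the slide example. The sketch also handles the case where a code arrives before the decoder has defined it, which happens when the encoder uses an entry immediately after creating it.

```python
def lzw_decompress(codes, alphabet):
    """Rebuild the original text from LZW codes, recreating the dictionary on the fly."""
    # Assumption: same initial numbering as the encoder (1, 2, 3, ... in alphabet order).
    dictionary = {i: symbol for i, symbol in enumerate(alphabet, start=1)}
    next_code = len(dictionary) + 1
    if not codes:
        return ""
    w = dictionary[codes[0]]       # the first code is always a single symbol
    out = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == next_code:
            entry = w + w[0]       # code created by the encoder in its previous step
        else:
            raise ValueError(f"invalid LZW code {k}")
        out.append(entry)
        dictionary[next_code] = w + entry[0]   # same entry the encoder added
        next_code += 1
        w = entry                  # W becomes the whole entry, not just its first symbol
    return "".join(out)


print(lzw_decompress([1, 2, 4, 5, 7, 3], "abcd"))
```

Decoding 1 2 4 5 7 3 with the alphabet a, b, c, d gives back abdabdac, and the decoder ends with the same entries 5 to 9 that the encoder built.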

Page 23: Data compression tech cs

LZW Decompression Algorithm Flow Chart

[Flow chart: START; K = INPUT; Output K; W = K; IS EOF? (YES: STOP; NO: K = NEXT INPUT); ENTRY = DICTIONARY INDEX (K); Output ENTRY; ADD W + ENTRY[0] TO DICTIONARY; W = ENTRY; loop back to IS EOF?]

Page 24: Data compression tech cs

The LZW Algorithm (Decompression) Example

• K = 1

• Output the dictionary entry for K (i.e. a)

• W = a

Codes: 1 2 4 5 7 3 (K is the 1st code)

Output so far: a

Dictionary: a = 1, b = 2, c = 3, d = 4

Page 25: Data compression tech cs

The LZW Algorithm (Decompression) Example

• K = 2

• entry = b

• Output entry

• Add W + entry[0] to the dictionary (ab = 5)

• W = entry (i.e. b)

Codes: 1 2 4 5 7 3 (K is the 2nd code)

Output so far: a b

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5

Page 26: Data compression tech cs

The LZW Algorithm (Decompression) Example

• K = 4

• entry = d

• Output entry

• Add W + entry[0] to the dictionary (bd = 6)

• W = entry (i.e. d)

Codes: 1 2 4 5 7 3 (K is the 3rd code)

Output so far: a b d

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6

Page 27: Data compression tech cs

The LZW Algorithm (Decompression) Example

• K = 5

• entry = ab

• Output entry

• Add W + entry[0] to the dictionary (da = 7)

• W = entry (i.e. ab)

Codes: 1 2 4 5 7 3 (K is the 4th code)

Output so far: a b d a b

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6, da = 7

Page 28: Data compression tech cs

The LZW Algorithm (Decompression) Example

• K = 7

• entry = da

• Output entry

• Add W + entry[0] to the dictionary (abd = 8)

• W = entry (i.e. da)

Codes: 1 2 4 5 7 3 (K is the 5th code)

Output so far: a b d a b d a

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6, da = 7, abd = 8

Page 29: Data compression tech cs

The LZW Algorithm (Decompression) Example

• K = 3

• entry = c

• Output entry

• Add W + entry[0] to the dictionary (dac = 9)

• W = entry (i.e. c)

Codes: 1 2 4 5 7 3 (K is the 6th code)

Output so far: a b d a b d a c

Dictionary: a = 1, b = 2, c = 3, d = 4, ab = 5, bd = 6, da = 7, abd = 8, dac = 9

Page 30: Data compression tech cs

Advantages

• As LZW is an adaptive dictionary coder, there is no need to transfer the dictionary explicitly.

• It is rebuilt at the decoder side.

• LZW can be made very fast: with fixed-width codes the decoder grabs a fixed number of bits from the input at a time, so bit parsing is easy and table lookup is a direct index.
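The "fixed number of bits" point can be illustrated with a small Python sketch that packs LZW codes into fixed-width fields and reads them back. The 12-bit width, the function names `pack_codes` and `unpack_codes`, and the big-endian bit order are assumptions made for this illustration; real formats differ in these details.

```python
def pack_codes(codes, width=12):
    """Pack integer codes into bytes, each code using exactly `width` bits."""
    acc = 0
    for code in codes:
        if not 0 <= code < (1 << width):
            raise ValueError(f"code {code} does not fit in {width} bits")
        acc = (acc << width) | code
    nbits = len(codes) * width
    pad = (-nbits) % 8               # zero bits appended to reach a whole byte
    acc <<= pad
    return acc.to_bytes((nbits + pad) // 8, "big")


def unpack_codes(data, count, width=12):
    """Read `count` codes of `width` bits each from the start of `data`."""
    total = len(data) * 8
    acc = int.from_bytes(data, "big")
    mask = (1 << width) - 1
    # Every code sits at a fixed offset, so no bit-by-bit parsing is needed.
    return [(acc >> (total - (i + 1) * width)) & mask for i in range(count)]


packed = pack_codes([1, 2, 4, 5, 7, 3])
restored = unpack_codes(packed, 6)
```

Because every code has the same width, the decoder can compute where each code starts without inspecting the preceding bits, and each code can be used directly as a table index.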

Page 31: Data compression tech cs

Problems with the encoder

• What if we run out of space?

– Keep track of how recently each entry was used and replace the least recently used (LRU) entry.

– Monitor compression performance and flush dictionary when performance is poor.
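One simple way to handle a full dictionary is sketched below in Python. Note the substitution: instead of monitoring compression performance, this sketch flushes the dictionary as soon as it is full, and tells the decoder by emitting a reserved code. The reserved value `CLEAR = 256`, the `max_codes` parameter and the function names are assumptions made for illustration, not something the slides specify.

```python
CLEAR = 256  # hypothetical reserved code: "the dictionary was flushed here"


def compress(data, max_codes=4096):
    """Byte-oriented LZW that resets its dictionary when no free code is left."""
    table = {bytes([i]): i for i in range(256)}
    next_code = CLEAR + 1
    w = b""
    out = []
    for byte in data:
        k = bytes([byte])
        if w + k in table:
            w += k
            continue
        out.append(table[w])
        if next_code < max_codes:
            table[w + k] = next_code      # normal case: add WK
            next_code += 1
        else:
            out.append(CLEAR)             # dictionary full: flush and start over
            table = {bytes([i]): i for i in range(256)}
            next_code = CLEAR + 1
        w = k
    if w:
        out.append(table[w])
    return out


def decompress(codes):
    """Inverse of compress(); resets its own dictionary whenever CLEAR is read."""
    table = {i: bytes([i]) for i in range(256)}
    next_code = CLEAR + 1
    w = None                              # None means "no previous entry yet"
    out = []
    for k in codes:
        if k == CLEAR:
            table = {i: bytes([i]) for i in range(256)}
            next_code = CLEAR + 1
            w = None
            continue
        if k in table:
            entry = table[k]
        elif w is not None and k == next_code:
            entry = w + w[:1]             # code the encoder has only just created
        else:
            raise ValueError(f"invalid code {k}")
        if w is not None:
            table[next_code] = w + entry[:1]
            next_code += 1
        out.append(entry)
        w = entry
    return b"".join(out)


data = b"abababababababab" * 20
codes = compress(data, max_codes=260)     # tiny limit so that flushes actually occur
roundtrip = decompress(codes)
```

A performance-based policy would keep the same decoder and change only the condition under which the encoder emits `CLEAR`.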

Page 32: Data compression tech cs

Conclusion

• LZW has opened new directions for the development of compression techniques.

• It has been implemented in well-known formats such as Acrobat PDF and in many other compression packages.

• Combined with other compression techniques, it has led to many further techniques, such as LZMS.

Page 34: Data compression tech cs

Thank You