Image Compression Techniques

  • INSTITUTE OF SCIENCE & TECHNOLOGY

  • Dissertation Approval Sheet

    THIS IS TO CERTIFY THAT THE DISSERTATION TITLED

    IMAGE COMPRESSION TECHNIQUES By

    Raju Kumar

    Rabbul Hussain

    Md. Minhaj Alam

    is approved for the degree of Bachelor of Technology (B.Tech)

    in COMPUTER SCIENCE AND ENGINEERING (C.S.E)

    Ms. Pubali Das

    Project Guide Faculty

  • Signature of Project Coordinator

    Signature of Project Guide

    Ms. Pubali Das

    Signature of HOD

    Mr. Gouranga Mondal

    HOD, CSE&IT, IST

    CERTIFICATE

    This is to certify that the project entitled Image Compression Techniques, submitted by Raju Kumar (Roll No: 18200111025), Rabbul Hussain (Roll No: 18200111024) and Md. Minhaj Alam (Roll No: 18200111022) for the partial fulfillment of the degree of Bachelor of Technology in Computer Science under West Bengal University of Technology, is actually based upon their group work under the supervision of Ms. Pubali Das, Department of Computer Science and Information Technology, Institute of Science & Technology, West Medinipur, West Bengal. Neither this project nor any part of it has been submitted for any degree, diploma or any other academic award anywhere before.

  • Raju Kumar Roll no-18200111025

    Rabbul Hussain Roll no-18200111024

    Md.Minhaj Alam Roll no-18200111022

    ACKNOWLEDGEMENT

    We take this opportunity to express our profound gratitude and

    deep regards to our guide Mr. Gouranga Mondal, HOD(Head of

    the Department), department of CSE & IT of Institute of

    Science & Technology, for his exemplary guidance, monitoring

    and constant encouragement throughout the course of this

    thesis. The blessing, help and guidance given by him from time to time shall carry us a long way in the journey of our life on which we are about to embark. We would also like to thank our project coordinator and our dear friends for helping us in this project. Without their help we would not have been able to complete it.

    Thanks once again.

  • Raju kumar Roll no-18200111025

    Rabbul Hussain Roll no-18200111024

    Md.Minhaj Alam Roll no-18200111022

    DECLARATION

    This is to certify that the project work entitled Image Compression Techniques has been performed by us in partial fulfillment of the Bachelor's degree in Computer Science from the Institute of Science & Technology, and comprises our original work.

  • ABSTRACT

    Digital image compression technology is of special interest for

    the fast transmission and real-time processing of digital image

    information on the internet. Although still-image compression has been developed over a long time, and several approaches already reduce the compressed data rate and accelerate computation, there is still a long way to go to improve the efficacy of compression. In this tutorial, several important image compression algorithms in use at present are examined, including the DCT and the tools derived from it such as JPEG and JPEG 2000, fractal image compression, and the wavelet transform. These exploit different aspects of an image to make image processing smoother and faster.

    In the following tutorial, I would like to talk about the background of image compression first, including when we need image compression, the categories of techniques, and their properties. Secondly, I briefly introduce some common image compression methods in use nowadays, such as JPEG, JPEG 2000, wavelet-based and fractal-based techniques, several other techniques, and neural networks. When they can be applied, how to implement the algorithms, their advantages and disadvantages, how they differ, and their development prospects will be described as well.

  • CONTENT

    Abstract

    I. Introduction

    1. Introduction to Image Compression
       1.1 Image
           Pixel, RGB, Grayscale, YUV
       1.2 Image Compression Techniques
           Advantages of Image Compression, Techniques, Categories

    2. Lossy Compression Techniques
       2.1 Introduction
       2.2 Techniques
           Transform Coding, Vector Quantization, Fractal Image Compression, Compression That Fractals Can Achieve, Encoding Images

    3. Lossless Compression Techniques
       3.1 Introduction
       3.2 Techniques
           Run Length Encoding, Huffman Encoding, Area Coding

    II. Commonly Used Techniques

    4. JPEG
       4.1 JPEG
           JPEG Encoder, The 2D 8x8 DCT, Quantization, Differential Coding of DC Coefficients

    Summary

  • 1. Introduction to Image Compression Digital multimedia is popular nowadays because of its strong perceptual effect and the advanced development of its corresponding technology. However, it often requires a large amount of data to store multimedia content because of the complex information it may contain. Moreover, the required resolution is much higher than before, so the data size of an image can be surprisingly large.

    In other words, a still image is a sensory signal that contains a significant amount of redundant information in its canonical form. Image data compression is the technique of reducing the redundancies in the image data required to maintain a given quantity of information. Therefore, how to improve image compression becomes an important question. Great progress has been made in applying digital signal processing and wavelet transform techniques in this area.

    There are two different groups of techniques, lossy compression and lossless compression, depending on whether the original information can be exactly recovered after compression. I would like to briefly introduce the different methods included in these two groups. As for JPEG and JPEG 2000, they are very popular compression tools, which will be described in detail in later chapters.

    1.1 Image An image is essentially a 2-D signal processed by the human visual

    system. The signals representing images are usually in analog form.

    However, for image processing, storage and transmission, they are

    converted from analog to digital form. A digital image is basically a 2-

    D array of pixels.

    Images form a significant part of the data, particularly in remote sensing, biomedical and video conferencing applications. As the use of and dependence on information and computers continues to grow, so does our need for efficient ways of storing and transmitting large amounts of data.

  • Pixel In a digital image, a pixel is a single point in a raster image. It is the smallest unit of a picture that can be controlled, and the smallest addressable screen element, as shown in the figure. Each pixel has its own

    address. The address of a pixel corresponds to its coordinates. They are

    usually arranged in a 2-D grid, and are often represented with dots or

    squares.

    Each pixel is a sample of an original image. More samples typically

    provide more accurate representations of the original. The intensity of

    each pixel is variable. In color image systems, a color is typically

    represented by three or four component intensities such as red, green,

    and blue.

    A pixel is the smallest element of an image

    RGB When the eye perceives an image on a computer monitor, it is actually perceiving a large collection of finite color elements, or pixels [1]. Each of these pixels is in and of itself composed of three dots of light: a green dot, a blue dot, and a red dot. The color the eye perceives at each pixel is a result of varying intensities of green, red, and blue light emanating from that location. A color image can therefore be represented as three matrices of values, each corresponding to the brightness of a particular color in each pixel, and a full color image can be reconstructed by superimposing these three matrices of R, G and B.

  • A color image is made of three matrices
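
    As a small illustration (not part of the original report), the three matrices can be separated and superimposed again with NumPy; the tiny 4x4 image and its pure-red fill are arbitrary example values:

    import numpy as np

    rgb = np.zeros((4, 4, 3), dtype=np.uint8)          # a tiny 4x4 color image
    rgb[..., 0] = 255                                  # make every pixel pure red
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]    # the three brightness matrices
    reconstructed = np.dstack([R, G, B])               # superimpose them again
    assert (reconstructed == rgb).all()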

    Grayscale If an image is measured by an intensity matrix with the relative

    intensity being represented as a color between black and white, it

    would appear to be a grayscale image.

    A grayscale image

    The intensity of a pixel is expressed within a given range between

    a minimum and a maximum. This range is represented in a range

    from 0 to 1 with incremental steps according to the bit depth of the image. The greater the number of steps, the larger the bit depth. If each intensity value is represented as an 8-bit number, then there are 256 variations. If the intensity values are represented as 16-bit numbers, there are 65,536 variations between absolute black and pure white. Fig. 4 demonstrates a black-to-white gradient in 4 bits of intensity.

  • 4-bit black to white gradient
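
    As a quick illustration (not from the original text) of how the number of intensity steps follows from the bit depth, and of the 4-bit gradient shown above:

    import numpy as np

    for bit_depth in (4, 8, 16):
        print(bit_depth, "bits ->", 2 ** bit_depth, "intensity levels")

    # A 4-bit black-to-white gradient: 16 evenly spaced intensities in [0, 1].
    gradient = np.linspace(0.0, 1.0, 2 ** 4)
    print(gradient)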

    YUV In the case of a color RGB picture a point-wise transform is made

    to the YUV (luminance, blue chrominance, red chrominance)

    color space. This space in some sense is more efficient than the

    RGB space and allows better quantization. The transform is given

    by

    $$
    \begin{bmatrix} Y \\ U \\ V \end{bmatrix}
    =
    \begin{bmatrix}
    0.299 & 0.587 & 0.114 \\
    -0.1687 & -0.3313 & 0.5 \\
    0.5 & -0.4187 & -0.0813
    \end{bmatrix}
    \begin{bmatrix} R \\ G \\ B \end{bmatrix}
    +
    \begin{bmatrix} 0 \\ 0.5 \\ 0.5 \end{bmatrix},
    \qquad (1)
    $$

    and the inverse transform is

    $$
    \begin{bmatrix} R \\ G \\ B \end{bmatrix}
    =
    \begin{bmatrix}
    1 & 0 & 1.402 \\
    1 & -0.34414 & -0.71414 \\
    1 & 1.772 & 0
    \end{bmatrix}
    \begin{bmatrix} Y \\ U - 0.5 \\ V - 0.5 \end{bmatrix}.
    \qquad (2)
    $$
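
    A minimal NumPy sketch of this point-wise transform, assuming R, G, B values normalized to [0, 1] and the 0.5 offsets on the chrominance channels as in equations (1) and (2) (an illustration, not a reference implementation):

    import numpy as np

    # Rows of the forward matrix in equation (1): Y, U, V in terms of R, G, B.
    RGB2YUV = np.array([[ 0.299,   0.587,   0.114 ],
                        [-0.1687, -0.3313,  0.5   ],
                        [ 0.5,    -0.4187, -0.0813]])
    OFFSET = np.array([0.0, 0.5, 0.5])

    def rgb_to_yuv(img):
        """img: H x W x 3 array of R, G, B values in [0, 1]."""
        return img @ RGB2YUV.T + OFFSET

    def yuv_to_rgb(img):
        """Inverse of equation (1); matches equation (2) up to rounding of the constants."""
        return (img - OFFSET) @ np.linalg.inv(RGB2YUV).T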

    1.2 Image Compression Techniques Image compression is an application of data compaction that reduces the quantity of data. The block diagram of an image coding system is shown in the figure below.

  • [Figure: Camera → R-G-B coordinates → transform to Y-Cb-Cr coordinates → downsample chrominance → encoder → storage (HDD) → decoder → upsample chrominance → transform back to R-G-B coordinates → monitor, with a rate-distortion comparison between source and reconstruction.]

    The block diagram of the general image storage system.

    The camera captures the reflected light from the surface of the

    object, and the received light will be converted into three primary

    color components R, G and B. These three primary color

    components are processed by coding algorithms afterward.

    Image compression addresses the problem of reducing the amount

    of data required to represent a digital image. It is a process

    intended to yield a compact representation of an image, thereby

    reducing the image storage/transmission requirements.

    Compression is achieved by the removal of one or more of the

    following three basic data redundancies:

    1. Coding Redundancy

    2. Inter-pixel Redundancy

    3. Perceptual Redundancy

    Coding redundancy occurs when the codes assigned to a set of

    events such as the pixel values of an image have not been selected

    to take full advantage of the probabilities of the events.

    Inter-pixel redundancy usually results from correlations between

    the pixels. Due to the high correlation between the pixels, any

    given pixel can be predicted from its neighboring pixels.

    Perceptual redundancy is due to data that is ignored by the human

    visual system. In other words, all the neighboring pixels in the

    smooth region of a natural image have a high degree of similarity

  • and this insignificant variation in the values of the neighboring

    pixels is not noticeable to the human eye.

    Techniques Image compression techniques reduce the number of bits required

    to represent an image by taking advantage of these redundancies.

    An inverse process called decoding is applied to the compressed

    data to get the reconstructed image. The objective of compression

    is to reduce the number of bits as much as possible, while keeping

    the resolution and the quality of the reconstructed image as close

    to the original image as possible.

    Image compression systems are composed of two distinct

    structural blocks: an encoder and a decoder, as shown in Fig.

    Image compression system

    Image f(x, y) is fed into the encoder, which creates a set of symbols from the input data and uses them to represent the image. If we let n1 and n2 denote the number of information-carrying units in the original and encoded images respectively, the compression that is achieved can be quantified numerically via the compression ratio, CR = n1/n2.
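
    For example (with made-up numbers), for a 512x512 8-bit grayscale image encoded into roughly 210,000 bits:

    n1 = 512 * 512 * 8      # information-carrying units (bits) in the original image
    n2 = 210_000            # bits in the encoded image (hypothetical)
    CR = n1 / n2
    print(round(CR, 1))     # about 10.0, i.e. a 10:1 compression ratio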

    As shown in the figure, the encoder is responsible for reducing the coding, inter-pixel and perceptual redundancies of the input image. In the first stage, the mapper transforms the input image into a format designed to reduce inter-pixel redundancies.

  • In the second stage, the quantizer block reduces the accuracy of the mapper's output in accordance with a predefined criterion. In the third and final stage, a symbol encoder creates a code for the quantizer output and maps the output in accordance with that code. The decoder's blocks perform, in reverse order, the inverse operations of the encoder's symbol coder and mapper blocks. As quantization is irreversible, an inverse quantization block is not included.

    Advantages of Image Compression The benefits of image compression can be listed as follows:

    1. It provides potential cost savings when sending less data over a switched telephone network, where the cost of a call is usually based on its duration.

    2. It reduces not only storage requirements but also overall execution time.

    3. It reduces transmission errors, since fewer bits are transferred.

    4. It also provides a level of security against illicit monitoring.

    Categories Image compression techniques are broadly classified into two categories, depending on whether or not an exact replica of the original image can be reconstructed from the compressed image. These are:

    1. Lossy techniques

    2. Lossless techniques

  • 2. Lossy Compression Techniques

    2.1 Introduction Lossy schemes provide much higher compression ratios than

    lossless schemes. Lossy schemes are widely used since the

    quality of the reconstructed images is adequate for most

    applications. By this scheme, the decompressed image is not

    identical to the original image, but reasonably close to it.

    Lossy image compression

    As shown in the figure, this prediction/transformation/decomposition process is completely reversible. The quantization

    process results in loss of information. The entropy coding after

    the quantization step, however, is lossless. The decoding is a

    reverse process. Firstly, entropy decoding is applied to

    compressed data to get the quantized data. Secondly, de-

    quantization is applied to it and finally the inverse transformation

    to get the reconstructed image.

    Major performance considerations of a lossy compression scheme

    include:

    1. Compression ratio

    2. Signal-to-noise ratio

    3. Speed of encoding and decoding.

  • Lossy compression techniques include the following schemes:

    1. Transform coding

    2. Vector quantization

    3. Fractal Image Compression

    2.2 Techniques

    Transform Coding In this coding scheme, transforms such as DFT (Discrete Fourier

    Transform) and DCT (Discrete Cosine Transform) are used to

    change the pixels in the original image into frequency domain

    coefficients. These coefficients have several desirable properties.

    One is the energy compaction property that results in most of the

    energy of the original data being concentrated in only a few of the

    significant transform coefficients. Only those few significant

    coefficients are selected and the remaining ones are discarded. The

    selected coefficients are considered for further quantization and

    entropy encoding. DCT coding has been the most common

    approach to transform coding.

    DCT
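
    The energy-compaction idea can be sketched in a few lines of Python (an illustration only, assuming SciPy's scipy.fft.dctn/idctn with orthonormal scaling): take the 2-D DCT of an 8x8 block, keep only the few largest-magnitude coefficients, and reconstruct.

    import numpy as np
    from scipy.fft import dctn, idctn

    def transform_code_block(block, keep=6):
        """Keep only the `keep` largest-magnitude DCT coefficients of a block."""
        coeffs = dctn(block, norm='ortho')        # energy compacts into few coefficients
        cutoff = np.sort(np.abs(coeffs).ravel())[-keep]
        coeffs[np.abs(coeffs) < cutoff] = 0.0     # discard the insignificant ones
        return idctn(coeffs, norm='ortho')        # approximate reconstruction

    block = np.tile(np.linspace(0, 255, 8), (8, 1))   # a smooth 8x8 test block
    approx = transform_code_block(block)
    print(np.abs(block - approx).max())               # small reconstruction error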

  • Vector Quantization The basic idea in this technique is to develop a dictionary of fixed-size vectors, called code vectors. As shown in the figure, a vector is usually a block of pixel values. A given image is then partitioned into non-overlapping blocks (vectors) called image vectors. For each image vector, the closest matching code vector in the dictionary is determined, and its index in the dictionary is used as the encoding of the original image vector. Thus, each image is represented by a sequence of indices that can be further entropy coded.


  • (a) Vector quantization coding procedure (b) decoding procedure
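
    A compact NumPy sketch of this procedure (an illustration, not the authors' implementation): it assumes a grayscale image whose dimensions are multiples of the block size, and trains the dictionary with a few k-means iterations.

    import numpy as np

    def make_codebook(vectors, K=16, iters=10, seed=0):
        """Train a codebook of K code vectors with a few k-means iterations."""
        rng = np.random.default_rng(seed)
        codebook = vectors[rng.choice(len(vectors), K, replace=False)].astype(float)
        for _ in range(iters):
            # Assign every training vector to its nearest code vector.
            d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            for k in range(K):
                members = vectors[labels == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
        return codebook

    def vq_encode(image, codebook, B=4):
        """Split a grayscale image into B x B blocks and store one index per block."""
        H, W = image.shape
        blocks = (image.reshape(H // B, B, W // B, B)
                       .swapaxes(1, 2).reshape(-1, B * B).astype(float))
        d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)                  # sequence of dictionary indices

    def vq_decode(indices, codebook, shape, B=4):
        H, W = shape
        blocks = codebook[indices].reshape(H // B, W // B, B, B)
        return blocks.swapaxes(1, 2).reshape(H, W)

    In practice the codebook would be trained offline (for example with the LBG algorithm) and shared between the encoder and the decoder.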

    Fractal Image Compression The essential idea here is to decompose the image into segments

    by using standard image processing techniques such as color

    separation, edge detection, and spectrum and texture analysis.

    Then each segment is looked up in a library of fractals. The

    library actually contains codes called iterated function system

    codes, which are compact sets of numbers. Using a systematic

    procedure, a set of codes for a given image is determined such that, when the IFS codes are applied to a suitable set of image blocks, they yield an image that is a very close approximation of the

    original. This scheme is highly effective for compressing images

    that have good regularity and self-similarity.

    Original image and self-similar portions of image

    Now, we want to find a map W which takes an input image and

    yields an output image. If we want to know when W is

  • contractive, we will have to define a distance between two

    images. The distance can be defined as

    $$\delta(f, g) = \sup_{(x, y) \in P} |f(x, y) - g(x, y)|,$$

    where f and g are the grey-level values of the pixels (for a grayscale image), P is the space of the image, and x and y are the coordinates of any pixel. This distance finds the position (x, y) where the images f and g differ the most.

    Natural images are not exactly self-similar. The Lena image, a typical image of a face, does not contain the type of self-similarity that can be found in the Sierpinski triangle. But the next image shows that

    we can find self-similar portions of the image. A part of her hat is

    similar to a portion of the reflection of the hat in the mirror.

    Compression That Fractals Can Achieve

    The compression ratio for the fractal scheme is hard to measure, since the image can be decoded at any scale. For example, the decoded image in Figure 3 is a portion of a 5.7-to-1 compression of the whole Lena image. It is decoded at 4 times its original size, so the full decoded image contains 16 times as many pixels and hence the compression ratio is 91.2 to 1. This may seem like cheating, but since the 4-times-larger image has detail at every scale, it really is not.

  • Encoding Images

    The previous theorems tell us that transformation W will have a

    unique fixed point in the space of all images. That is, whatever

    image (or set) we start with, we can repeatedly apply W to it and we will converge to a fixed image. Suppose we

    are given an image f that we wish to encode. This means we want

    to find a collection of transformations w1, w2, ..., wN and want f to be the fixed point of the map W (see the Fixed Point Theorem). In other words, we want to partition f into pieces to which we apply the transformations wi and get back the original image f. A typical image of a face does not contain the type of self-similarity of a fern, but it does contain other types of self-similarity: one region of the Lena image is nearly identical to another, and a portion of the reflection of the hat in the mirror is similar to the hat itself. The distinction is that here the image is formed by copies of properly transformed parts of itself rather than of its whole self. These transformed parts do not fit together, in general, to form an exact copy of the original image, and so we must allow some error in our representation of an image as a set of transformations.
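
    The partition-match-iterate idea described above can be sketched as follows. This is a heavily simplified illustration (grayscale only, block sizes assumed to divide the image dimensions, no rotations or flips, brute-force search), not the full scheme:

    import numpy as np

    def average_pool(block):
        # Downsample a 2R x 2R domain block to R x R by 2x2 averaging.
        return 0.25 * (block[0::2, 0::2] + block[1::2, 0::2] +
                       block[0::2, 1::2] + block[1::2, 1::2])

    def encode(img, R=8):
        """For each R x R range block, find the 2R x 2R domain block whose
        downsampled version best matches it under an affine map r ~ s*d + o."""
        H, W = img.shape
        domains = []
        for y in range(0, H - 2 * R + 1, R):
            for x in range(0, W - 2 * R + 1, R):
                domains.append(((y, x), average_pool(img[y:y+2*R, x:x+2*R])))
        code = []
        for y in range(0, H, R):
            for x in range(0, W, R):
                r = img[y:y+R, x:x+R].astype(float)
                best = None
                for (dy, dx), d in domains:
                    dm, rm = d - d.mean(), r - r.mean()
                    s = (dm * rm).sum() / ((dm ** 2).sum() + 1e-9)
                    s = np.clip(s, -0.95, 0.95)       # keep the map contractive
                    o = r.mean() - s * d.mean()
                    err = ((s * d + o - r) ** 2).sum()
                    if best is None or err < best[0]:
                        best = (err, (dy, dx), s, o)
                code.append(((y, x), best[1], best[2], best[3]))
        return code

    def decode(code, shape, R=8, iters=10):
        """Start from any image and repeatedly apply the stored block maps."""
        img = np.zeros(shape)
        for _ in range(iters):
            out = np.empty(shape)
            for (y, x), (dy, dx), s, o in code:
                d = average_pool(img[dy:dy+2*R, dx:dx+2*R])
                out[y:y+R, x:x+R] = s * d + o
            img = out
        return img

    Decoding starts from an arbitrary image and, by the fixed-point argument above, the iterates converge to an approximation of the encoded image.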

    3. Lossless Compression Techniques

    3.1 Introduction In lossless compression techniques, the original image can be

    perfectly recovered from the compressed image. These are also

    called noiseless, since they do not add noise to the signal. They are also known as entropy coding, since they use decomposition techniques to minimize redundancy.

    The following techniques are included in lossless compression:

    1. Run length encoding

    2. Huffman encoding

    3. Area coding

  • 3.2 Techniques

    Run Length Encoding This is a very simple compression method used for sequential data, and it is very useful for repetitive data. This technique replaces sequences of identical pixels, called runs, by shorter symbols. The run-length code for a grayscale image is represented by a sequence {Vi, Ri}, where Vi is the intensity of a pixel and Ri refers to the number of consecutive pixels with the intensity Vi, as shown in the figure. If both Vi and Ri are represented by one byte, this span of 12 pixels is coded using eight bytes, yielding a compression ratio of 1.5:1.

    Run-length encoding
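
    A minimal Python sketch of this {Vi, Ri} coding (illustrative values, not the figure's exact data):

    def rle_encode(pixels):
        """Encode a 1-D sequence of pixel intensities as (value, run-length) pairs."""
        code = []
        for v in pixels:
            if code and code[-1][0] == v:
                code[-1][1] += 1                  # extend the current run
            else:
                code.append([v, 1])               # start a new run
        return [(v, r) for v, r in code]

    def rle_decode(code):
        return [v for v, r in code for _ in range(r)]

    pixels = [82, 82, 82, 82, 161, 161, 161, 3, 3, 3, 3, 3]   # 12 pixels
    code = rle_encode(pixels)
    assert rle_decode(code) == pixels
    print(code)   # [(82, 4), (161, 3), (3, 5)]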

    Huffman Encoding This is a general technique for coding symbols based on their

    statistical occurrence frequencies.

  • Huffman encoding

    The pixels in the image are treated as symbols. The symbols that

    occur more frequently are assigned a smaller number of bits,

    while the symbols that occur less frequently are assigned a

    relatively larger number of bits. Huffman code is a prefix code.

    The binary code of any symbol is not the prefix of the code of any

    other symbol. Most image coding standards use lossy techniques

    in earlier stages of compression and use Huffman coding as the

    final step.
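
    A short standard-library sketch of building such a prefix code (illustrative only, not the Huffman tables used by any image coding standard):

    import heapq
    from collections import Counter

    def huffman_code(symbols):
        """Build a prefix code: frequent symbols get shorter bit strings."""
        freq = Counter(symbols)
        # Each heap entry: (frequency, tie-breaker, {symbol: code-so-far}).
        heap = [(f, i, {s: ''}) for i, (s, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        if len(heap) == 1:                       # degenerate single-symbol case
            return {next(iter(freq)): '0'}
        while len(heap) > 1:
            f1, _, c1 = heapq.heappop(heap)      # two least frequent subtrees
            f2, i, c2 = heapq.heappop(heap)
            merged = {s: '0' + c for s, c in c1.items()}
            merged.update({s: '1' + c for s, c in c2.items()})
            heapq.heappush(heap, (f1 + f2, i, merged))
        return heap[0][2]

    pixels = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3]
    code = huffman_code(pixels)
    print(code)                                  # the most frequent value (0) gets the shortest code
    encoded = ''.join(code[p] for p in pixels)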

    Area Coding Area coding is an enhanced form of run length coding, reflecting

    the two dimensional character of images. This is a significant

    advance over the other lossless methods. For coding an image it

    does not make too much sense to interpret it as a sequential

    stream, as it is in fact an array of sequences, building up a two

    dimensional object. The algorithms for area coding try to find

    rectangular regions with the same characteristics. These regions

    are coded in a descriptive form as an element with two points and

    a certain structure. This type of coding can be highly effective but

    it bears the problem of a nonlinear method, which cannot be

    implemented in hardware. Thus, the performance in terms of

  • compression time is not competitive, although the compression

    ratio is.

    4. JPEG

    4.1 JPEG The JPEG (Joint Photographic Experts Group) standard will be briefly described in this section, and its flow chart and algorithm will be introduced.

    JPEG Encoder The block diagram of the JPEG standard is shown below. The YCbCr color transform and the chrominance subsampling format are not defined in the JPEG standard, but most JPEG software performs this processing because it lets the JPEG encoder reduce the data quantity more efficiently. However, we don't discuss the basic concept of the subsampling format here. We will focus on the JPEG encoder as follows.

    [Figure: Source image → 8x8 DCT → quantization (using a quantization table) → DC coefficients to differential coding, AC coefficients to zero-run-length coding → entropy encoder with separate Huffman tables for DC and AC → bitstream.]

    The flow chart of the JPEG standard

  • 2D 8x8 DCT As we have learned, the energy of natural images is concentrated in the low frequencies, so we can use the DCT to separate the low-frequency and high-frequency components, preserve the low-frequency components as far as possible, and discard the high-frequency components to reduce the amount of data. The encoder performs the DCT on these 8x8 blocks to achieve decorrelation and frequency analysis. Because the transform is performed on 8x8 blocks, the forward 2-D DCT formula is defined in (3), where x[m,n] and X[u,v] represent the input signal and the DCT coefficients, respectively.

    $$X[u,v] = \frac{1}{4}\, C(u)\, C(v) \sum_{m=0}^{7} \sum_{n=0}^{7} x[m,n]\, \cos\frac{(2m+1)u\pi}{16}\, \cos\frac{(2n+1)v\pi}{16}, \qquad u, v = 0, \ldots, 7, \qquad (3)$$

    where

    $$C(k) = \begin{cases} \dfrac{1}{\sqrt{2}}, & k = 0 \\ 1, & \text{otherwise.} \end{cases}$$
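
    A direct (unoptimized) NumPy implementation of equation (3), for illustration:

    import numpy as np

    def dct2_8x8(x):
        """Forward 2-D 8x8 DCT computed directly from equation (3)."""
        C = lambda k: 1.0 / np.sqrt(2.0) if k == 0 else 1.0
        X = np.zeros((8, 8))
        for u in range(8):
            for v in range(8):
                s = 0.0
                for m in range(8):
                    for n in range(8):
                        s += x[m, n] * np.cos((2 * m + 1) * u * np.pi / 16) \
                                     * np.cos((2 * n + 1) * v * np.pi / 16)
                X[u, v] = 0.25 * C(u) * C(v) * s
        return X

    block = np.full((8, 8), 128.0)      # a flat block: all energy goes to the DC term
    print(dct2_8x8(block)[0, 0])        # approximately 1024; every AC coefficient is ~0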

    The discrete cosine transform shown is closely related to the

    Discrete Fourier Transform (DFT). Both take a set of points from

    the spatial domain and transform them into an equivalent

    representation in the frequency domain. The difference is that, while the DFT takes a discrete signal in one spatial dimension and transforms it into a set of points in one frequency dimension, the Discrete Cosine Transform (for an 8x8 block of values) takes a 64-point discrete signal, which can be thought of as a function of two spatial dimensions x and y, and turns it into 64 DCT coefficients expressed in terms of 64 unique orthogonal 2-D spectral patterns.

  • The DCT coefficient values are the relative amounts of the 64

    spatial frequencies present in the original 64-point input. The

    element in the upper-left corner, corresponding to zero frequency in both directions, is the DC coefficient and the rest are called AC

    coefficients.

    64 two-dimensional spatial frequencies

    Because pixel values typically vary slowly from point to point across an image, the FDCT processing step lays the

    foundation for achieving data compression by concentrating most

    of the signal in the lower spatial frequencies. For a typical 8x8

    sample block from a typical source image, most of the spatial

    frequencies have zero or near-zero amplitude and need not be

    encoded.

    At the decoder the IDCT reverses this processing step. It takes the

    64 DCT coefficients and reconstructs a 64-point output image

    signal by summing the basis signals. Mathematically, the DCT is

    a one-to-one mapping for 64-point vectors between the image and

    the frequency domains. In principle, the DCT introduces no loss

    to the source image samples; it merely transforms them to a

    domain in which they can be more efficiently encoded.

  • Quantization After the DCT, the encoder performs quantization to reduce the precision of the data and discard the less important high-frequency coefficients. As mentioned above, the human eye is more sensitive to the low-frequency components than to the high-frequency components, so JPEG quantization assigns a large quantization step size to the high-frequency components to discard the redundant information, and a small quantization step size to the low-frequency components to preserve the significant information. Fig. 18 shows the two quantization tables defined in JPEG, where QY is the luminance quantization table and QC is the chrominance quantization table.

    QY (luminance):

    16  11  10  16  24  40  51  61
    12  12  14  19  26  58  60  55
    14  13  16  24  40  57  69  56
    14  17  22  29  51  87  80  62
    18  22  37  56  68 109 103  77
    24  35  55  64  81 104 113  92
    49  64  78  87 103 121 120 101
    72  92  95  98 112 100 103  99

    QC (chrominance):

    17  18  24  47  99  99  99  99
    18  21  26  66  99  99  99  99
    24  26  56  99  99  99  99  99
    47  66  99  99  99  99  99  99
    99  99  99  99  99  99  99  99
    99  99  99  99  99  99  99  99
    99  99  99  99  99  99  99  99
    99  99  99  99  99  99  99  99

    Quantization tables

  • The quantization step size is smaller in the upper left region to

    preserve the low frequency components. On the other hand, the

    quantization step is larger in the lower right region to reduce the

    less important high-frequency components to zero. Since the human eye is less sensitive to distortion of the high-frequency features of the image, it is not easy for us to observe the difference between the original image and the quantized image.
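
    For illustration, a minimal sketch of this step on one 8x8 block of DCT coefficients, using the luminance table QY above (the block could come from the dct2_8x8 sketch earlier; this is not the complete JPEG pipeline):

    import numpy as np

    # Luminance quantization table QY from the figure above.
    QY = np.array([[16, 11, 10, 16, 24, 40, 51, 61],
                   [12, 12, 14, 19, 26, 58, 60, 55],
                   [14, 13, 16, 24, 40, 57, 69, 56],
                   [14, 17, 22, 29, 51, 87, 80, 62],
                   [18, 22, 37, 56, 68, 109, 103, 77],
                   [24, 35, 55, 64, 81, 104, 113, 92],
                   [49, 64, 78, 87, 103, 121, 120, 101],
                   [72, 92, 95, 98, 112, 100, 103, 99]])

    def quantize(dct_block, table=QY):
        """Divide each coefficient by its step size and round; most become zero."""
        return np.round(dct_block / table).astype(int)

    def dequantize(q_block, table=QY):
        """What the decoder does; the rounding loss cannot be undone."""
        return q_block * table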

    Differential Coding of DC Coefficients

    After the 2-D DCT and quantization, we find that the AC coefficients in an 8x8 block consist of many zeros, as in Fig. 19. The DC coefficient represents the mean value of the corresponding block, and the current DC coefficient is usually very similar to the DC coefficients of its neighboring blocks. Thus, the JPEG encoder performs predictive coding on the DC coefficients to reduce this redundancy, as shown in Fig. 20. The differential coding of the DC coefficient is denoted as the difference between DCi and DCi-1.

  • [Figure: An 8x8 block after 2-D DCT and quantization — apart from the DC coefficient and a few low-frequency AC coefficients (10, -6, 2, 4, -5, -1), all entries are zero.]

    8x8 block after 2-D DCT and quantization

    [Figure: For consecutive blocks Blocki-1 and Blocki, only Diffi = DCi - DCi-1 is coded.]

    Differential coding of DC coefficients
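
    A tiny sketch of this differential coding, with made-up DC values:

    import numpy as np

    def dc_differences(dc_values):
        """Diff_i = DC_i - DC_(i-1); the first DC value is kept as-is."""
        dc = np.asarray(dc_values)
        return np.concatenate(([dc[0]], np.diff(dc)))

    def dc_reconstruct(diffs):
        """Invert the differential coding with a running sum."""
        return np.cumsum(diffs)

    print(dc_differences([52, 55, 54, 60]))   # [52  3 -1  6]
    print(dc_reconstruct([52, 3, -1, 6]))     # [52 55 54 60]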

  • SUMMARY

    This tutorial starts from an introduction to the background: why we need image compression, how an image is formed, what kinds of image processing tools there are, and what we can do with image compression techniques, and then moves to how each method is implemented.

    I didn't talk much about how to implement JPEG and JPEG 2000, but focused more on introducing different kinds of compression techniques. The drawback is that each cannot be discussed in detail, but since the topic is the development of compression techniques, I would like to talk more broadly. Therefore, different methods are covered.

    There are still many other analysis tools with more complex

    algorithms that have not been introduced here. Readers who are

    interested can find more information in the references.