
Compression and Decompression techniques

Apr 14, 2018


Varun Jain
  • 7/27/2019 Compression and Decompression techniques


    By: Shubhra Goyal


    - Definition
    - Why Compression
    - Types of compression
    - Binary image compression scheme
    - Video image compression
    - Audio image compression
    - Fractal image compression


    Compression: the process of coding that will effectively reduce the total number of bits needed to represent certain information.

    Input data -> Encoder (compression) -> Storage -> Decoder (decompression) -> Output data


    Compression is possible due to:
    - Redundancy in digital audio, image, and video data (silence removal, spatial redundancy, temporal redundancy).
    - Properties of human perception:
      - The compressed version of digital audio, image, or video need not represent the original information exactly.
      - Perceptual sensitivity differs for different signal patterns.
      - The human eye is less sensitive to the higher spatial frequency components than to the lower frequencies (transform coding).


    Video and audio have much higher storage requirements than text.

    Data transmission rates (in terms of bandwidth requirements) for sending continuous media are considerably higher than for text.

    Efficient compression of audio and video data, including some compression standards, will be considered in this lesson.


    Data compression is about storing and sending a smaller number of bits. There are two major categories of methods to compress data: lossless and lossy methods.

    Data compression methods
    - Lossless methods: Run-length, LZW, CCITT Group 3 1D, CCITT Group 3 2D, CCITT Group 4
    - Lossy methods: JPEG, CCITT H.261, fractals, Intel DVI, MPEG


    In lossless methods, the original data and the data after compression and decompression are exactly the same.

    Redundant data is removed during compression and added back during decompression.

    For suitable data, these methods can reduce the size to roughly 1/10 to 1/50 of the original uncompressed size.

    Lossless methods are used when we cannot afford to lose any data, e.g. text compression for legal and medical documents, and computer programs.
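
    A minimal sketch of the lossless property described above. It assumes Python's standard zlib module (DEFLATE, which is not one of the methods covered in these slides) as a stand-in encoder and decoder; the sample input and the helper name roundtrip are hypothetical.

```python
import zlib

def roundtrip(data: bytes) -> bytes:
    """Compress the data, then decompress it, and return the restored bytes."""
    compressed = zlib.compress(data)
    return zlib.decompress(compressed)

# Highly redundant sample input (illustrative only).
original = b"legal text " * 100
restored = roundtrip(original)
compressed_size = len(zlib.compress(original))
```

    For a lossless method, restored must equal original byte for byte, while compressed_size is smaller than len(original) whenever the input is redundant.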


    There are five common lossless methods:

    Run length encoding

    CCITT Group 3 1D

    CCITT Group 3 2D

    CCITT Group 4

    Lempel-Ziv-Welch (LZW) algorithm


    Run-length encoding is the simplest method of compression.

    It can be used to compress data made of any combination of symbols.

    The general idea behind this method is to replace consecutive repeating occurrences of a symbol with one occurrence of the symbol followed by the number of occurrences.

    The method can be even more efficient if the data uses only two symbols (for example 0 and 1) in its bit pattern and one symbol is more frequent than the other.
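
    The idea can be sketched in a few lines of Python. This is an illustrative version only: the function names rle_encode and rle_decode are hypothetical, and the (symbol, count) pair representation is an assumption rather than any particular file format.

```python
from itertools import groupby

def rle_encode(data):
    """Replace each run of a repeated symbol with a (symbol, run length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]

def rle_decode(pairs):
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)

encoded = rle_encode("AAAABBBCCD")
decoded = rle_decode(encoded)
```

    Here encoded holds one pair per run (A four times, B three times, C twice, D once), and decoding restores the input exactly.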


    Run-length encoding is disadvantageous for a busy image. In a busy image, adjacent pixels or groups of adjacent pixels change rapidly, which leads to shorter run lengths of black or white pixels.

    In this case, the codes that represent the run lengths can take more bits than the pixels they replace, so the encoded image occupies more bytes than the original image.

    This effect is called reverse compression or negative compression.

    The scheme was designed for black-and-white images only, not for grayscale or color images.
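
    A small sketch of negative compression, under two stated assumptions that are not from the slides: the raw scan line is stored at 1 bit per pixel, and every run length is stored in one byte. The helper names are hypothetical.

```python
from itertools import groupby

def run_lengths(row):
    """Lengths of the consecutive runs of identical pixels in one scan line."""
    return [len(list(run)) for _, run in groupby(row)]

def sizes_in_bytes(row):
    """Return (raw size, run-length size) under the assumptions stated above."""
    raw = (len(row) + 7) // 8        # 1 bit per pixel, packed into bytes
    encoded = len(run_lengths(row))  # 1 byte per run (runs assumed <= 255 pixels)
    return raw, encoded

calm_row = [0] * 60 + [1] * 4  # one long white run, one short black run
busy_row = [0, 1] * 32         # the pixel value changes at every position

calm_sizes = sizes_in_bytes(calm_row)
busy_sizes = sizes_in_bytes(busy_row)
```

    Both rows are 64 pixels (8 raw bytes). The calm row needs 2 bytes of run lengths, while the busy row needs 64, which is larger than the data it was meant to compress.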


    Many facsimile and document imaging file formats support a form of lossless data compression often described as CCITT encoding.

    The CCITT (International Telegraph and Telephone Consultative Committee) is a standards organization that has developed a series of communications protocols for the facsimile transmission of black-and-white images over telephone lines and data networks.

    The CCITT defines three algorithms for the encoding of image data:
    - Group 3 One-Dimensional (G31D)
    - Group 3 Two-Dimensional (G32D)
    - Group 4 (G4)


    Huffman encoding is used for encoding the pixel run lengths in CCITT Group 3 and Group 4.

    It is a variable-length encoding scheme that generates the shortest codes for frequently occurring run lengths and longer codes for less frequently occurring run lengths.

    Algorithm:
    (1) Make a leaf node for each code symbol.
        - Add the generation probability of each symbol to its leaf node.
    (2) Take the two nodes with the smallest probabilities and connect them into a new node.
        - Add 1 or 0 to each of the two branches.
        - The probability of the new node is the sum of the probabilities of the two connected nodes.
    (3) If there is only one node left, the code construction is complete. If not, go back to (2).
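
    The construction above can be sketched with a priority queue. This is a minimal illustration, not the fixed code tables that the CCITT standards actually ship; the function name huffman_codes and the sample probabilities are hypothetical.

```python
import heapq
from itertools import count

def huffman_codes(probabilities):
    """Build a prefix code from a {symbol: probability} mapping."""
    tie_breaker = count()  # keeps heap entries comparable when probabilities tie
    # One leaf node per symbol, labelled with its probability.
    heap = [(p, next(tie_breaker), {symbol: ""}) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single symbol still needs a one-bit code
        return {symbol: "0" for symbol in probabilities}
    # Merge the two least probable nodes until only one node remains.
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}        # branch labelled 0
        merged.update({s: "1" + c for s, c in codes1.items()})  # branch labelled 1
        heapq.heappush(heap, (p0 + p1, next(tie_breaker), merged))
    return heap[0][2]

codes = huffman_codes({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
lengths = {symbol: len(code) for symbol, code in codes.items()}
```

    The most probable symbol receives a 1-bit code and the two least probable symbols receive 3-bit codes, which is the variable-length behaviour described above.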


    Advantages
    - It is simple to implement in both hardware and software.

    Disadvantages
    - It is one-dimensional, as it encodes each row or line separately.
    - It assumes a reliable communication link and does not provide any protection mechanism.


    The CCITT Group 3 2D compression scheme is also known as modified run-length encoding. This scheme is more commonly used for software-based document imaging systems.

    While the CCITT Group 3 2D scheme provides fairly good compression, it is easier to compress in software than the CCITT Group 4 standard. The compression ratio averages somewhere between 10 and 25, which falls between CCITT Group 3 1D and CCITT Group 4.

    The compression scheme is based on the statistical nature of images. For example, the image data across adjacent scan lines is often redundant: a black-and-white transition on one line usually recurs within plus or minus 3 pixels on the next line. Depending on the scan resolution, one line of text may consist of 20-30 scan lines.


    2-dimensional coding
    - Images are divided into several groups of K lines.
    - The first line of each group is encoded using the CCITT Group 3 1D method.
    - The rest of the lines are encoded using "differential schemes".
    - Typical compression ratio: 10 to 20.
    - The "K-factor" allows more error-free transmission.
    - Worldwide facsimile standard.

    The 2D scheme uses a combination of additional codes called vertical code, pass code, and horizontal code.
    - There is only one pass code (0001) and one horizontal code (001).
    - If neither the vertical code nor the pass code applies, then the horizontal code is applied.
    - Horizontal code + Group 3 1D code = 001 + makeup code + terminating code


    Parse the coding line and look for a change in the pixel value. The change is found at the a1 location (a1 marks the position where the pixel changes from binary 0 to binary 1).

    Parse the reference line and look for a change in the pixel value. The change is found at the b1 location.

    Find the difference in location between b1 and a1: Delta = b1 - a1.
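
    A simplified sketch of the three steps above. It only finds the first change on each line and assumes an imaginary white (0) pixel before the start of the line; the full CCITT procedure tracks further reference positions that are not covered here. The helper name first_change, the sample lines, and the plus or minus 3 pixel threshold for choosing a vertical code are assumptions made for illustration.

```python
def first_change(line):
    """Index of the first pixel that differs from the pixel before it."""
    previous = 0  # imaginary white pixel before the start of the line
    for i, pixel in enumerate(line):
        if pixel != previous:
            return i
    return len(line)  # no change found: point just past the end of the line

reference_line = [0, 0, 0, 1, 1, 1, 0, 0]
coding_line = [0, 0, 1, 1, 1, 1, 0, 0]

a1 = first_change(coding_line)     # change position on the coding line
b1 = first_change(reference_line)  # change position on the reference line
delta = b1 - a1

# Assumption: a small offset between the two lines is coded with a vertical code.
uses_vertical_code = abs(delta) <= 3
```

    For these two lines the transitions are one pixel apart, so the coding line can be described relative to the reference line instead of being coded from scratch.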


    Advantages
    - The implementation of the K-factor allows error-free transmission.
    - It is a worldwide facsimile standard.
    - Due to its 2-dimensional nature, the compression ratios achieved with this scheme are better than with CCITT Group 3 1D.

    Disadvantages
    - It does not provide dense compression.
    - It is complex and relatively difficult to implement in software.


    The compression ratio of the Group 3 schemes was not sufficient for serious, high-resolution document imaging.

    CCITT Group 4 is a 2D coding scheme without the K-factor; in effect, the K-factor in this scheme is the entire page of lines.

    Here, the first reference line is an imaginary all-white line above the top of the image.

    The first group of pixels is encoded using the imaginary white line as the reference line.

    The newly coded line becomes the reference line for the next scan line. Each successive line is coded relative to the previous line.

    This provides a very high level of compression.


    - There are no EOL markers before the start of the compressed data.
    - Fillers are not used for the scan lines.
    - There is an EOP (End-Of-Page) mark consisting of two concatenated EOLs.
    - Padding bits are added immediately after the end of the compressed data.


    Advantages:
    - Better resolution

    Disadvantages:
    - Slow
    - Complex
    - As there is no periodically re-encoded reference line, a single error can result in the rest of the page being skewed.


    LZW is a dictionary-based encoding algorithm.

    It creates a dictionary (a table) of strings used during the communication session.

    If both the sender and the receiver have a copy of the dictionary, then previously encountered strings can be substituted by their index in the dictionary to reduce the amount of information transmitted.


    In the compression phase there are two concurrent events:
    - building an indexed dictionary, and
    - compressing a string of symbols.

    The algorithm extracts the smallest substring that cannot be found in the dictionary from the remaining uncompressed string.

    It then stores a copy of this substring in the dictionary as a new entry and assigns it an index value.

    Compression occurs when the substring, except for the last character, is replaced with the index found in the dictionary.

    The process then inserts the index and the last character of the substring into the compressed string.
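
    The procedure above can be sketched as follows. Note that emitting an (index, last character) pair for each new substring is the LZ78-style variant of Lempel-Ziv coding; LZW proper pre-loads the dictionary with single characters and emits indexes only. The function name compress and the use of index 0 for the empty prefix are assumptions made for illustration.

```python
def compress(text):
    """Encode text as (dictionary index, next character) pairs."""
    dictionary = {"": 0}  # index 0 stands for the empty prefix
    output = []
    current = ""
    for ch in text:
        if current + ch in dictionary:
            current += ch  # keep extending a substring that is already known
        else:
            # Smallest substring not yet in the dictionary: emit the index of
            # its known prefix plus its last character, then store it.
            output.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)
            current = ""
    if current:  # leftover substring that is already in the dictionary
        output.append((dictionary[current[:-1]], current[-1]))
    return output

pairs = compress("ABABABA")
```

    For the input ABABABA the dictionary grows to A, B, AB, ABA, and the seven characters are emitted as four pairs.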


    Decompression is the inverse of the compression process.

    The process extracts the substrings from the compressed string and tries to replace the indexes with the corresponding entries in the dictionary, which is empty at first and built up gradually.

    The idea is that when an index is received, there is already an entry in the dictionary corresponding to that index.
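
    A matching sketch of the decoder, assuming the compressed string is a list of (index, character) pairs in which index 0 means the empty prefix. The pair format and the function name decompress are illustrative assumptions, not taken from the slides.

```python
def decompress(pairs):
    """Rebuild the text from (dictionary index, next character) pairs."""
    dictionary = [""]  # index 0 is the empty prefix; the rest is built as we go
    pieces = []
    for index, ch in pairs:
        entry = dictionary[index] + ch  # the index always refers to an earlier entry
        dictionary.append(entry)        # mirror the entry the encoder created
        pieces.append(entry)
    return "".join(pieces)

text = decompress([(0, "A"), (0, "B"), (1, "B"), (3, "A")])
```

    The decoder never receives the dictionary itself: it rebuilds the entries A, B, AB, ABA in the same order as the encoder and recovers ABABABA.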


    Lossy methods are used for compressing images and video files (our eyes cannot distinguish subtle changes, so some loss of data is acceptable).

    These methods are cheaper: they take less time and space.

    Several methods:
    - JPEG: compresses pictures and graphics
    - MPEG: compresses video
    - MP3: compresses audio


    Color characteristics:
    - Luminance: the measure of the light emitted or reflected by an object.
    - Hue: the color sensation produced in an observer due to the presence of certain wavelengths of color.
    - Saturation: the depth of a color, e.g. the difference between red and pink.


    A color model is an orderly system for creating a whole range of colors from a small set of primary colors.

    There are two types of color models:
    - Subtractive
    - Additive

    Additive color models use light to display color, while subtractive models use printing inks.

    Colors perceived in additive models are the result of transmitted light, the typical technique on color displays.

    Colors perceived in subtractive models are the result of reflected light, the typical technique in printers and plotters.


    CMYK model:
    - The Cyan, Magenta, Yellow and Black (CMYK) model is used in color printing devices.
    - It is a subtractive color model.

    HSI model (HSB model):
    - The Hue, Saturation and Intensity model represents tint, shade and tone.
    - This model is used in image processing for filtering and smoothing images.
    - It requires a high level of computation.


    YUV model:
    - It is a 3-D and subtractive model.
    - Y is the luminance component (black-and-white information).
    - U and V are the chrominance components (coloured information): U = red-cyan, V = magenta-green.
    - Used in full-motion video.


    RGB model:
    - This model is additive in nature: intensities of Red, Green and Blue are added to generate various colors.
    - Used in the design of image capture devices, televisions, and color monitors.

    No color model is better than the others; the choice depends on the application.


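    Color component conversion (RGB to YUV):
    \[
    \begin{aligned}
    Y &= 0.299R + 0.587G + 0.114B \\
    U &= 0.596R - 0.274G - 0.322B \\
    V &= 0.211R - 0.523G + 0.312B
    \end{aligned}
    \]

    Color component conversion (YUV to RGB):
    \[
    \begin{aligned}
    R &= 1.0Y + 0.956U + 0.621V \\
    G &= 1.0Y - 0.272U - 0.647V \\
    B &= 1.0Y - 1.106U + 1.703V
    \end{aligned}
    \]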
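
    A minimal sketch of the two conversions in Python. The minus signs and the values 0.274 and 1.106 are assumptions, chosen so that a gray pixel (R = G = B) has zero chrominance and so that the two conversions invert each other; these coefficients match the ones commonly quoted for the NTSC YIQ model. The function names are hypothetical, and components are assumed to lie in the range 0 to 1.

```python
def rgb_to_yuv(r, g, b):
    """Convert RGB components to luminance (y) and chrominance (u, v)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.596 * r - 0.274 * g - 0.322 * b
    v = 0.211 * r - 0.523 * g + 0.312 * b
    return y, u, v

def yuv_to_rgb(y, u, v):
    """Convert luminance and chrominance back to RGB components."""
    r = y + 0.956 * u + 0.621 * v
    g = y - 0.272 * u - 0.647 * v
    b = y - 1.106 * u + 1.703 * v
    return r, g, b

gray = rgb_to_yuv(0.5, 0.5, 0.5)                    # chrominance should vanish
restored = yuv_to_rgb(*rgb_to_yuv(0.2, 0.6, 0.9))   # round trip of a sample color
```

    Because the coefficients are rounded to three decimals, the round trip is only approximate, with errors on the order of 0.001 per component.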


    The JPEG standard is a collaboration among:
    - International Telecommunication Union (ITU)
    - International Organization for Standardization (ISO)
    - International Electrotechnical Commission (IEC)

    The official names of JPEG:
    - Joint Photographic Experts Group
    - ISO/IEC 10918-1: Digital compression and coding of continuous-tone still images
    - ITU-T Recommendation T.81


    - JPEG should address image quality where visual fidelity is very high, and the encoder should be parameterizable so that the user can set the compression or quality level.
    - It should compress any kind of continuous-tone digital source image, without restrictions on dimensions, color, aspect ratio, etc.
    - It should be scalable from completely lossless to lossy.


    Four operation modes:
    - Sequential encoding: each component is encoded in a left-to-right, top-to-bottom scan.
    - Progressive encoding: the image is decompressed so that a coarser image is displayed first and is then filled in with more components as it is decompressed to a finer version of the image.
    - Hierarchical encoding: the image is compressed to multiple resolution levels.
    - Lossless encoding: the image is guaranteed to provide full detail at the selected resolution when decompressed.


    A codec is a device or computer program capable of encoding and decoding a digital stream of data or signal.

    Codecs differ within an operation mode according to the precision of the source image they can handle or the entropy coding method they use.


    JPEG has three levels of definition:
    - Baseline system: it decompresses color images, maintains a high compression ratio, and handles from 4 bits/pixel to 16 bits/pixel. It ensures that software implementations are cost-effective.
    - Extended system: it covers various encoding aspects such as variable-length, progressive and hierarchical modes of encoding.
    - Special lossless function: it ensures there is no loss of detail in the compression and decompression process, although there is some loss in the scanning process.


    - Baseline sequential codec: it consists of three steps: formation of DCT coefficients, quantization, and entropy encoding. It is a rich compression scheme.
    - DCT progressive mode: the key steps of forming DCT coefficients and quantization are the same as above, with the difference that each component is coded in multiple scans instead of a single scan.
    - Predictive lossless encoding: defines a means of approaching lossless continuous-tone compression. A predictor combines sample areas and predicts neighboring areas.
    - Hierarchical mode: provides a means of carrying multiple resolutions. Each successive encoding is reduced by a factor of two, in either the horizontal or the vertical dimension.


    The main steps in JPEG encoding are the following:
    - Transform RGB to YUV or YIQ and subsample color
    - DCT on 8x8 image blocks
    - Quantization
    - Zig-zag ordering and run-length encoding
    - Entropy coding


    - The image is divided up into 8x8 blocks.
    - A 2D DCT is performed on each block.
    - The DCT is performed independently for each block.
    - This is why, when a high degree of compression is requested, JPEG gives a blocky image result.


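    Forward DCT:
    \[
    F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}
    \quad \text{for } u = 0,\ldots,7 \text{ and } v = 0,\ldots,7
    \]
    \[
    \text{where } C(k) = \begin{cases} 1/\sqrt{2} & \text{for } k = 0 \\ 1 & \text{otherwise} \end{cases}
    \]

    Inverse DCT:
    \[
    f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}
    \quad \text{for } x = 0,\ldots,7 \text{ and } y = 0,\ldots,7
    \]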
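
    A direct, unoptimized sketch of the forward and inverse 8x8 DCT, assuming the usual JPEG normalization (a factor of 1/4, with C(0) = 1/sqrt(2) and C(k) = 1 otherwise). Real codecs use fast factorizations; the function names and the choice to index the block as f[x][y] are assumptions. The sample block is the 8x8 luminance example used in these slides.

```python
import math

def c(k):
    """Normalization factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def basis(i, k):
    """cos((2i + 1) k pi / 16), the cosine term shared by both transforms."""
    return math.cos((2 * i + 1) * k * math.pi / 16)

def forward_dct(f):
    """F[u][v] for an 8x8 block indexed as f[x][y]."""
    return [[0.25 * c(u) * c(v) * sum(f[x][y] * basis(x, u) * basis(y, v)
                                      for x in range(8) for y in range(8))
             for v in range(8)] for u in range(8)]

def inverse_dct(F):
    """f[x][y] recovered from a coefficient block indexed as F[u][v]."""
    return [[0.25 * sum(c(u) * c(v) * F[u][v] * basis(x, u) * basis(y, v)
                        for u in range(8) for v in range(8))
             for y in range(8)] for x in range(8)]

block = [
    [48, 39, 40, 68, 60, 38, 50, 121],
    [149, 82, 79, 101, 113, 106, 27, 62],
    [58, 63, 77, 69, 124, 107, 74, 125],
    [80, 97, 74, 54, 59, 71, 91, 66],
    [18, 34, 33, 46, 64, 61, 32, 37],
    [149, 108, 80, 106, 116, 61, 73, 92],
    [211, 233, 159, 88, 107, 158, 161, 109],
    [212, 104, 40, 44, 71, 136, 113, 66],
]
coefficients = forward_dct(block)
restored = inverse_dct(coefficients)
```

    The DC coefficient coefficients[0][0] equals the sum of all 64 samples divided by 8, which is 699.25 for this block, and the inverse transform returns the original samples up to floating-point error.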


    [Figure: 8x8 grid with axes u and v, each running from 0 to 7]


    Y: the luminance of an image (width W, height H)

    8x8 values of luminance:
    48 39 40 68 60 38 50 121
    149 82 79 101 113 106 27 62
    58 63 77 69 124 107 74 125
    80 97 74 54 59 71 91 66
    18 34 33 46 64 61 32 37
    149 108 80 106 116 61 73 92
    211 233 159 88 107 158 161 109
    212 104 40 44 71 136 113 66

    DCT:
    699.25 43.18 55.25 72.11 24.00 -25.51 11.21 -4.14
    -129.78 -71.50 -70.26 -73.35 59.43 -24.02 22.61 -2.05
    85.71 30.32 61.78 44.87 14.84 17.35 15.51 -13.19
    -40.81 10.17 -17.53 -55.81 30.50 -2.28 -21.00 -1.26
    -157.50 -49.39 13.27 -1.78 -8.75 22.47 -8.47 -9.23
    92.49 -9.03 45.72 -48.13 -58.51 -9.01 -28.54 10.38
    -53.09 -62.97 -3.49 -19.62 56.09 -2.25 -3.28 11.91
    -20.54 -55.90 -20.59 -18.19 -26.58 -27.07 8.47 0.31


    Quantization in JPEG aims at reducing the total number of bits in the compressed image.
    - Divide each entry in the frequency space block by an integer, then round.
    - Use a quantization matrix Q(u, v).


    Use larger entries in Q for the higher spatial frequencies.
    - These are the entries toward the lower right part of the matrix.

    The following slide shows the default Q(u, v) values for luminance and chrominance.
    - They are based on psychophysical studies intended to maximize compression ratios while minimizing perceptual distortion.
    - Since the entries are smaller after division, we can use fewer bits to encode them.


    F(u,v): 8x8 DCT coefficients
    699.25 43.18 55.25 72.11 24.00 -25.51 11.21 -4.14
    -129.78 -71.50 -70.26 -73.35 59.43 -24.02 22.61 -2.05
    85.71 30.32 61.78 44.87 14.84 17.35 15.51 -13.19
    -40.81 10.17 -17.53 -55.81 30.50 -2.28 -21.00 -1.26
    -157.50 -49.39 13.27 -1.78 -8.75 22.47 -8.47 -9.23
    92.49 -9.03 45.72 -48.13 -58.51 -9.01 -28.54 10.38
    -53.09 -62.97 -3.49 -19.62 56.09 -2.25 -3.28 11.91
    -20.54 -55.90 -20.59 -18.19 -26.58 -27.07 8.47 0.31

    Q(u,v): quantization matrix
    16 11 10 16 24 40 51 61
    12 12 14 19 26 58 60 55
    14 13 16 24 40 57 69 56
    14 17 22 29 51 87 80 62
    18 22 37 56 68 109 103 77
    24 35 55 64 81 104 113 92
    49 64 78 87 103 121 120 101
    72 92 95 98 112 100 103 99


    F(u,v) / Q(u,v):
    43.70 3.93 5.52 4.51 1.00 -0.64 0.22 -0.07
    -10.82 -5.96 -5.02 -3.86 2.29 -0.41 0.38 -0.04
    6.12 2.33 3.86 1.87 0.37 0.30 0.22 -0.24
    -2.91 0.60 -0.80 -1.92 0.60 -0.03 -0.26 -0.02
    -8.75 -2.25 0.36 -0.03 -0.13 0.21 -0.08 -0.12
    3.85 -0.26 0.83 -0.75 -0.72 -0.09 -0.25 0.11
    -1.08 -0.98 -0.04 -0.23 0.54 -0.02 -0.03 0.12
    -0.29 -0.61 -0.22 -0.19 -0.24 -0.27 0.08 0.00

    \[ F_q(u,v) = \mathrm{Round}\left( \frac{F(u,v)}{Q(u,v)} \right) \]

    F_q(u,v):
    44 4 6 5 1 -1 0 0
    -11 -6 -5 -4 2 0 0 0
    6 2 4 2 0 0 0 0
    -3 1 -1 -2 1 0 0 0
    -9 -2 0 0 0 0 0 0
    4 0 1 -1 -1 0 0 0
    -1 -1 0 0 1 0 0 0
    0 -1 0 0 0 0 0 0
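
    The quantization step can be sketched as an element-wise divide and round, using the coefficient block and the quantization matrix listed in these slides. The function names are hypothetical, and dequantize is added only to illustrate what a decoder would recover; Python's built-in round is assumed to be an acceptable rounding rule here (it rounds exact halves to the nearest even integer, which does not occur in this example).

```python
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]
F = [
    [699.25, 43.18, 55.25, 72.11, 24.00, -25.51, 11.21, -4.14],
    [-129.78, -71.50, -70.26, -73.35, 59.43, -24.02, 22.61, -2.05],
    [85.71, 30.32, 61.78, 44.87, 14.84, 17.35, 15.51, -13.19],
    [-40.81, 10.17, -17.53, -55.81, 30.50, -2.28, -21.00, -1.26],
    [-157.50, -49.39, 13.27, -1.78, -8.75, 22.47, -8.47, -9.23],
    [92.49, -9.03, 45.72, -48.13, -58.51, -9.01, -28.54, 10.38],
    [-53.09, -62.97, -3.49, -19.62, 56.09, -2.25, -3.28, 11.91],
    [-20.54, -55.90, -20.59, -18.19, -26.58, -27.07, 8.47, 0.31],
]

def quantize(F, Q):
    """Divide each coefficient by its matrix entry and round to an integer."""
    return [[round(f / q) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(F, Q)]

def dequantize(Fq, Q):
    """Decoder side: multiply back; the rounding error is not recoverable."""
    return [[fq * q for fq, q in zip(fq_row, q_row)]
            for fq_row, q_row in zip(Fq, Q)]

Fq = quantize(F, Q)
approx = dequantize(Fq, Q)
```

    The DC coefficient 699.25 becomes 44 and is reconstructed as 704, which shows where the loss in lossy JPEG comes from; most high-frequency entries become 0.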


    0 1 5 6 14 15 27 28

    2 4 7 13 16 26 29 42

    3 8 12 17 25 30 41 43

    9 11 18 24 31 40 44 53

    10 19 23 32 39 45 52 54

    20 22 33 38 46 51 55 60

    21 34 37 47 50 56 59 61

    35 36 48 49 57 58 62 63

    The zigzag sequence starts at the DC coefficient. It is designed to facilitate entropy coding by placing the low-frequency coefficients (which are more likely to be non-zero) before the high-frequency coefficients.


    F_q(u,v):
    44 4 6 5 1 -1 0 0
    -11 -6 -5 -4 2 0 0 0
    6 2 4 2 0 0 0 0
    -3 1 -1 -2 1 0 0 0
    -9 -2 0 0 0 0 0 0
    4 0 1 -1 -1 0 0 0
    -1 -1 0 0 1 0 0 0
    0 -1 0 0 0 0 0 0

    Zig-zag reordering (one diagonal per line):
    44,
    4, -11,
    6, -6, 6,
    5, -5, 2, -3,
    -9, 1, 4, -4, 1,
    -1, 2, 2, -1, -2, 4,
    -1, 0, 0, -2, 0, 0, 0,
    0, 0, 0, 1, 0, 1, -1, 0,
    -1, 0, -1, 0, 0, 0, 0,
    0, 0, 0, -1, 0, 0,
    0, 1, 0, 0, 0,
    0, 0, 0, 0,
    0, 0, 0,
    0, 0,
    0
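
    A short sketch of the zig-zag scan, applied to the quantized block from the slides. Generating the order by sorting positions along anti-diagonals is one possible approach (codecs typically use a fixed lookup table); the function names are hypothetical.

```python
def zigzag_order(n=8):
    """Positions (row, col) of an n x n block in zig-zag scan order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Anti-diagonals (row + col) are visited in turn. Odd diagonals run with the
    # row index increasing, even diagonals with the row index decreasing.
    return sorted(positions,
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))

def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

Fq = [
    [44, 4, 6, 5, 1, -1, 0, 0],
    [-11, -6, -5, -4, 2, 0, 0, 0],
    [6, 2, 4, 2, 0, 0, 0, 0],
    [-3, 1, -1, -2, 1, 0, 0, 0],
    [-9, -2, 0, 0, 0, 0, 0, 0],
    [4, 0, 1, -1, -1, 0, 0, 0],
    [-1, -1, 0, 0, 1, 0, 0, 0],
    [0, -1, 0, 0, 0, 0, 0, 0],
]
order = zigzag_order()
sequence = zigzag_scan(Fq)
```

    The sequence begins 44, 4, -11, 6, -6, 6 and ends in a long run of zeros, which is what makes the subsequent run-length and entropy coding effective.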


    Entropy is used in thermodynamics for the study of heat and work.

    In data compression, it is a measure of the information content of a message in number of bits.

    Entropy in number of bits = -log2(probability of object)

    The object can be a character.

    E.g., if the probability of the character T being present in a string is 1/8, its entropy is 3 bits; i.e., if there are 7 Ts in a text string, they can be represented by 21 bits.
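
    The arithmetic of this example can be checked with a few lines of Python. The function name information_bits is hypothetical; the formula is the negative base-2 logarithm of the probability.

```python
import math

def information_bits(probability):
    """Information content, in bits, of an object with the given probability."""
    return -math.log2(probability)

bits_per_t = information_bits(1 / 8)  # probability 1/8 gives 3 bits
total_bits = 7 * bits_per_t           # seven occurrences of T give 21 bits
```

    Less probable objects carry more bits: halving the probability adds exactly one bit.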

    JPEG uses two entropy coding schemes: Huffman and arithmetic coding.

    Huffman coding requires one or more sets of Huffman code tables for coding as well as decoding.

    Arithmetic coding uses DC and AC coefficients. The coefficient at position (0,0) in the matrix is called the DC coefficient and the other 63 are called AC coefficients.
