ROYAL HOLLOWAY UNIVERSITY OF LONDON
JPEG compression
How images are generally compressed using JPEG
Candidate Number: 1600085
Contents
Compression using JPEG
Down Sampling
Questions not answered in this Project
The Huffman algorithm gives the optimal codeword length for each symbol according to its frequency. However, if many
symbols occur, we still have to write the codeword for each symbol as it appears.
Applying the Huffman algorithm to the data, we obtain the following codewords:
Symbol  Frequency  Codeword
 0      44         1
-1      4          010
 1      5          001
 2      3          0111
-3      3          0110
-6      1          00011
-2      1          00010
 5      1          00001
-4      1          00000
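The codeword lengths in the table can be cross-checked with a short sketch. The Python below is only an illustration, not the calculator used for this project: the names huffman_code_lengths and is_prefix_free and the tie-breaking counter are assumptions introduced here, and only the frequencies and codewords come from the table. It builds the Huffman tree with a priority queue and reports the depth of each symbol, which is its codeword length.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: codeword length} for a Huffman code built from freqs."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    # Each heap entry is (total weight, tiebreak id, {symbol: depth so far}).
    heap = [(w, next(tiebreak), {sym: 0}) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    words = list(codewords)
    return not any(a != b and b.startswith(a) for a in words for b in words)


# Frequencies and codewords copied from the table.
freqs = {0: 44, -1: 4, 1: 5, 2: 3, -3: 3, -6: 1, -2: 1, 5: 1, -4: 1}
table = {0: "1", -1: "010", 1: "001", 2: "0111", -3: "0110",
         -6: "00011", -2: "00010", 5: "00001", -4: "00000"}

lengths = huffman_code_lengths(freqs)
for sym in freqs:
    print(sym, freqs[sym], table[sym], lengths[sym])
print("prefix-free:", is_prefix_free(table.values()))
```

For these frequencies the computed lengths agree with the lengths of the codewords listed in the table, and no listed codeword is a prefix of another. The exact bit patterns can differ between implementations, since swapping the 0 and 1 branches at any node gives an equally good code.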
The encoded string is: 011010110000100001101110000000100100001001011101000101001111111101001000000000000000000000000000000000000000.
Our encoded string is 108 bits long. The Huffman algorithm gives the optimal codeword length for each symbol according to
its frequency. However, this is not very efficient, in the sense that our original string is 64 characters long and we must
write a codeword for every character as it appears in the string. We can be more efficient by applying a simple lossless
data compression technique called run-length encoding (RLE) before we apply Huffman coding, in order to reduce the
number of characters to be encoded.
Definition (Runs): An element appearing more than once consecutively in a string is called a run, e.g. 0 appears five
times consecutively after the first 1 in the string 010000010, hence we call it a run of 0s.
Definition (Run-length encoding):
A lossless data compression method in which a run of data is stored as a single data value and its count, e.g. 010000010
is stored as 01(0,5)10.
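The definition can be sketched in a few lines of Python. This is a minimal illustration rather than the exact procedure used in this project: the function names rle_encode and rle_decode and the min_run parameter are assumptions made here, and a run is written as a (value, count) pair only when it has at least two elements.

```python
from itertools import groupby


def rle_encode(symbols, min_run=2):
    """Replace each run of at least min_run equal symbols by (value, count)."""
    out = []
    for value, group in groupby(symbols):
        n = len(list(group))
        if n >= min_run:
            out.append((value, n))       # store the run as value and count
        else:
            out.extend([value] * n)      # short runs are kept as they are
    return out


def rle_decode(items):
    """Invert rle_encode: expand every (value, count) pair back into a run."""
    out = []
    for item in items:
        if isinstance(item, tuple):
            value, n = item
            out.extend([value] * n)
        else:
            out.append(item)
    return out


bits = [int(c) for c in "010000010"]
encoded = rle_encode(bits)
print(encoded)                      # [0, 1, (0, 5), 1, 0]
print(rle_decode(encoded) == bits)  # True
```

The output matches the example in the definition, 01(0,5)10, and decoding recovers the original string, which is what makes the method lossless.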
We apply run-length encoding to our original string, obtained from matrix B using the zigzag pattern.
Note that we only use RLE for elements appearing twice or more consecutively. We can now use Huffman encoding to
encode our string.
Figure 16
Using the codewords, our encoded string is: 1101110101000111111100011111010101011111011010 0001110001100011001000000
Note that this encoded string is only 72 bits long, much shorter than the 108 bits obtained without RLE. This is the encoded string that we store for the luminance component of our 8x8 pixel block. If the image were divided into n blocks, we would send 3n different encoded strings, since each 8x8 pixel block has three 8x8 matrices, i.e. Y, Cb and Cr.
Conclusion
This completes the general procedure for JPEG compression. Different software may use variations at each stage, e.g. a higher down-sampling ratio for the chrominance components, a different quantization matrix or a different lossless method for entropy encoding, along with other minor changes to achieve the required size or quality. However, the general idea remains the same. Each stage is run in reverse to reconstruct the image, but the data discarded by down sampling and quantization is lost permanently, so the quality of the reconstructed image may be lower. In most cases, though, the human eye cannot distinguish the JPEG from the original image.
Questions not answered in this Project:
1. How is the 3x3 matrix for the RGB to YCbCr conversion derived, and why are there different variations of this matrix?
2. How is the quantization matrix derived? What is the optimal quantizer?
3. How are the DCT formulae derived?
4. There are many other types of transform, such as the Karhunen–Loève transform, the discrete Fourier transform etc.
Why use DCT-II?
The Karhunen–Loève transform (KLT) minimizes the total mean square error for the pixels, so it is optimal in that
sense. However, the KLT is not used in practice, since its coefficient matrix is not constant but depends on the image.
Computing it for every image is too costly and slow. For certain types of images, namely those whose neighbouring
pixels are highly correlated, the DCT closely approximates the KLT. The DCT also assumes that pixels next to each
other are similar, which is a reasonable assumption since natural images are smooth and their pixels are highly
correlated. The discrete cosine transform is suboptimal, but it is very fast and efficient. However, more research is
needed to answer this question in more depth.
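The claim that the DCT suits smooth, highly correlated pixels can be illustrated with a small experiment. The sketch below is an assumption-laden illustration, not part of the JPEG procedure itself: it uses a one-dimensional orthonormal DCT-II on a single row of eight values, the function names dct2 and energy_share are chosen here, and both sample rows are made-up numbers. It measures how much of the signal energy lands in the first two coefficients.

```python
import math


def dct2(x):
    """Orthonormal one-dimensional DCT-II of the sequence x."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n))
        out.append(scale * total)
    return out


def energy_share(coeffs, keep):
    """Fraction of the total energy held by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total


# Hypothetical rows of pixel values: one smooth, one with unrelated neighbours.
smooth = [100, 102, 104, 106, 108, 110, 112, 114]
rough = [100, 14, 200, 37, 251, 3, 180, 90]

print("smooth:", energy_share(dct2(smooth), 2))
print("rough: ", energy_share(dct2(rough), 2))
```

For the smooth row almost all of the energy sits in the first two coefficients, so the remaining ones can be quantized heavily with little visible loss. For the rough row the energy is spread over many more coefficients, which is why the transform only pays off when neighbouring pixels are similar.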