The JPEG Standard


Dec 24, 2015


Laurel Griffith
Transcript
  • Slide 1
  • Slide 2
  • 1. Introduction
    The JPEG standard is a collaboration among:
      • International Telecommunication Union (ITU)
      • International Organization for Standardization (ISO)
      • International Electrotechnical Commission (IEC)
    The official names of JPEG:
      • Joint Photographic Experts Group
      • ISO/IEC 10918-1, Digital compression and coding of continuous-tone still images
      • ITU-T Recommendation T.81
    Graduate Institute of Communication Engineering, National Taiwan University, Taipei, Taiwan, ROC
  • Slide 3
  • 1. Introduction
    JPEG has the following modes:
      • Lossless mode: predictive coding
      • Sequential mode: DCT-based coding
      • Progressive mode: DCT-based coding
      • Hierarchical mode
    Baseline system: sequential mode with DCT-based coding and Huffman coding for entropy encoding. It is the most widely used mode in practice.
  • Slide 4
  • General Image Storage System
  • Slide 5
  • The three most popular color models are RGB (used in computer graphics); YIQ, YUV, or YCbCr (used in video systems); and CMYK (used in color printing). All of these color spaces can be derived from the RGB information supplied by devices such as cameras and scanners.
    [Figure: the red, green, and blue planes of an image]
  • Slide 6
  • [Table: RGB values for 100% amplitude, 100% saturated color bars, a common video test signal]
    The RGB color space is the most prevalent choice for computer graphics because color displays use red, green, and blue to create the desired color. The choice of the RGB color space therefore simplifies the architecture and design of the system. A system designed around the RGB color space can also take advantage of a large number of existing software routines, since this color space has been around for a number of years.
  • Slide 7
  • RGB cons: RGB is not very efficient when dealing with real-world images. All three RGB components need to be of equal bandwidth to generate any color within the RGB color cube. (The result is a frame buffer that has the same pixel depth and display resolution for each RGB component.) Processing an image in the RGB color space is also usually not the most efficient method. For example, to modify the intensity or color of a given pixel, the three RGB values must be read from the frame buffer, the intensity or color calculated, the desired modifications performed, and the new RGB values calculated and written back to the frame buffer. If the system had access to an image stored directly in an intensity and color format, some processing steps would be faster.
  • Slide 8
  • YUV Color Space: In many applications it is desirable to describe a color in terms of its luminance and chrominance content separately, to enable more efficient processing and transmission of color signals. One such coordinate system is the YUV color space: Y is the luminance component, and U and V (Cb and Cr in the digital YCbCr variant) are the chrominance components. The values in the YUV coordinate system are related to the values in the RGB coordinate system by a linear transform. [Equation not preserved in the transcript]
  • Slide 9
  • YUV Color Space: The YUV color space is used by the PAL (Phase Alternation Line), NTSC (National Television System Committee), and SECAM (Séquentiel Couleur Avec Mémoire, or Sequential Color with Memory) composite color video standards. The black-and-white system used only luma (Y) information; color information (U and V) was added in such a way that a black-and-white receiver would still display a normal black-and-white picture. Color receivers decoded the additional color information to display a color picture.
  • Slide 10
  • Image Preparation
    Image component/plane: A gray-scale image consists of a single component, i.e. intensity. A YUV image consists of three planes:
      • C1: luminance plane (Y)
      • C2: chrominance plane U
      • C3: chrominance plane V
    [Figure: the three planes C1, C2, and C3; each small rectangle represents a pixel]
  • Slide 11
  • Color Specification
      • Luminance: the perceived brightness of the light, which is proportional to the total energy in the visible band.
      • Chrominance: describes the perceived color tone of a light, which depends on the wavelength composition of the light. Chrominance is in turn characterized by two attributes:
          • Hue: specifies the color tone, which depends on the peak wavelength of the light.
          • Saturation: describes how pure the color is, which depends on the spread or bandwidth of the light spectrum.
  • Slide 12
  • [Figure: three versions of the same picture: luminance only, chrominance only, and the full color image]
    Luma represents the achromatic image, while the chroma components represent the color information. Converting R'G'B' sources (such as the output of a camera) into luma and chroma allows for chroma subsampling: because human vision has finer spatial sensitivity to luminance ("black and white") differences than to chromatic differences, video systems can store chromatic information at lower resolution, optimizing perceived detail at a particular bandwidth.
  • Slide 13
  • 2. Color Space Conversion
    [Figure: (a) translation from RGB to YCbCr; (b) translation from YCbCr to RGB]
  • Slide 14
  • The basic equations to convert between 8-bit digital RGB data with a nominal range and YCbCr are:
    $$Y = 0.299R + 0.587G + 0.114B$$
    $$C_b = -0.172R - 0.339G + 0.511B + 128$$
    $$C_r = 0.511R - 0.428G - 0.083B + 128$$
    and, in the opposite direction,
    $$R = Y + 1.371(C_r - 128)$$
    $$G = Y - 0.698(C_r - 128) - 0.336(C_b - 128)$$
    $$B = Y + 1.732(C_b - 128)$$
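The conversion equations can be tried out directly. The following is a minimal sketch that uses the coefficients from this slide; the function names `rgb_to_ycbcr` and `ycbcr_to_rgb` are hypothetical, and the results are left as unrounded, unclamped floats so that the round-trip error stays visible.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to YCbCr with the slide's coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.172 * r - 0.339 * g + 0.511 * b + 128
    cr = 0.511 * r - 0.428 * g - 0.083 * b + 128
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Convert one YCbCr pixel back to RGB with the slide's coefficients."""
    r = y + 1.371 * (cr - 128)
    g = y - 0.698 * (cr - 128) - 0.336 * (cb - 128)
    b = y + 1.732 * (cb - 128)
    return r, g, b


# A gray pixel carries no color information, so Cb and Cr sit at the 128 offset.
gray = rgb_to_ycbcr(100, 100, 100)

# The coefficients are rounded to three decimals, so a round trip is only
# approximately the identity.
restored = ycbcr_to_rgb(*rgb_to_ycbcr(12, 200, 77))
```

A real codec would additionally round and clamp the results to the 8-bit range; that step is omitted here.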
  • Slide 15
  • Image Preparation
    Data unit/block: Each block is made of 8x8 pixels. This definition of a block (data unit) comes from the DCT. The blocks of 8x8 pixels are transferred to the next step as units for processing. There are two ways in which data units are passed to the next step.
    Non-interleaved data ordering: Data units are passed component by component. All the data units of the first component are passed, then those of the second component, and so on. Within each component, data units are passed in left-to-right, top-to-bottom order.
  • Slide 16
  • Non-interleaved data ordering: Using this mode for an RGB-encoded image with very high resolution, the display would initially present only the red component, then the green, and then the blue, finally resulting in the original image.
    [Figure: data units of 8x8 pixels passed to the image processing step]
  • Slide 17
  • Interleaved data ordering: In this approach, data units from different components are combined into Minimum Coded Units (MCUs). If all components have the same resolution, an MCU consists of exactly one data unit from each component.
    [Figure: MCU1, MCU2, MCU3, MCU4, MCU5, and so on, passed to the image processing step]
  • Slide 18
  • Data Unit/Block, interleaved data ordering: If the components do not all have the same resolution, an MCU comprises a different number of data units from each component. The number of data units from each component is calculated from the relative horizontal and vertical sampling ratios. Data units from each component are taken in left-to-right, top-to-bottom order, and MCUs are constructed in the same order.
    Example 1: Let the three planes of an image have the following resolutions:
      • Plane 0: X0 = 512, Y0 = 256
      • Plane 1: X1 = 256, Y1 = 512
      • Plane 2: X2 = 128, Y2 = 256
  • Slide 19
  • Data Unit/Block, Example 1 (continued):
    [Figure: the three planes drawn to scale: plane 0 (512x256 pixels), plane 1 (256x512 pixels), plane 2 (128x256 pixels)]
    Counting the data units of each component gives 64 x 32 for plane 0, 32 x 64 for plane 1, and 16 x 32 for plane 2. Each * in the figure represents a data unit, i.e. 8x8 pixels.
  • Slide 20
  • Data Unit/Block, Example 1 (continued): The relative sampling ratios $H_i$ and $V_i$ are calculated for each plane as
    $$H_i = \frac{X_i}{X_{\min}}, \qquad V_i = \frac{Y_i}{Y_{\min}}$$
    With $X_{\min} = 128$ and $Y_{\min} = 256$ this gives:
      • Plane 0: $H_0 = 4$, $V_0 = 1$
      • Plane 1: $H_1 = 2$, $V_1 = 2$
      • Plane 2: $H_2 = 1$, $V_2 = 1$
    $H_i$ and $V_i$ must be integers between 1 and 4. An MCU is then built by taking $H_0 \times V_0$ data units from plane 0, $H_1 \times V_1$ data units from plane 1, and $H_2 \times V_2$ data units from plane 2.
  • Slide 21
  • Data Unit/Block, Example 1 (continued):
    [Figure: one MCU, made of 4 data units from plane 0, 4 data units from plane 1, and 1 data unit from plane 2]
    Example 2: Let an image have four planes with the following dimensions:
      • Plane 0: X0 = 48 pixels, Y0 = 32 pixels
      • Plane 1: X1 = 48 pixels, Y1 = 16 pixels
      • Plane 2: X2 = 24 pixels, Y2 = 32 pixels
      • Plane 3: X3 = 24 pixels, Y3 = 16 pixels
  • Slide 22
  • Data Unit/Block, Example 2 (continued): Calculating $H_i$ and $V_i$ as in the previous example gives:
      • $H_0 = 2$, $V_0 = 2$
      • $H_1 = 2$, $V_1 = 1$
      • $H_2 = 1$, $V_2 = 2$
      • $H_3 = 1$, $V_3 = 1$
    [Figure: the four planes divided into MCU1 to MCU6; each small rectangle represents a data unit or block, i.e. 8x8 pixels]
    The blocks of these MCUs are transferred to the next step.
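The sampling-ratio arithmetic of Examples 1 and 2 can be checked with a few lines of code. This is a minimal sketch: the helper names `sampling_factors` and `mcu_summary` are hypothetical, and it assumes every plane dimension is an integer multiple of the smallest one, as in both examples.

```python
import math


def sampling_factors(planes):
    """Return (H_i, V_i) for each plane given as (width, height) in pixels."""
    x_min = min(width for width, _ in planes)
    y_min = min(height for _, height in planes)
    # Assumes each dimension is an integer multiple of the minimum.
    return [(width // x_min, height // y_min) for width, height in planes]


def mcu_summary(planes, block=8):
    """Return the factors, the data units per MCU per plane, and the MCU count."""
    factors = sampling_factors(planes)
    units_per_mcu = [h * v for h, v in factors]
    # One MCU covers one data unit of a plane with H = V = 1, so the MCU grid
    # has the size of the smallest dimensions measured in blocks.
    x_min = min(width for width, _ in planes)
    y_min = min(height for _, height in planes)
    mcu_count = math.ceil(x_min / block) * math.ceil(y_min / block)
    return factors, units_per_mcu, mcu_count


example_1 = mcu_summary([(512, 256), (256, 512), (128, 256)])
example_2 = mcu_summary([(48, 32), (48, 16), (24, 32), (24, 16)])
```

For Example 2 the count comes out as 6, which matches the six MCUs named on the slide.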
  • Slide 23
  • 3. Downsampling
    [Figure 2. Three color formats in the baseline system: (a) 4:4:4, where Y, Cb, and Cr all have size W x H; (b) 4:2:2, where Cb and Cr have size W/2 x H; (c) 4:2:0, where Cb and Cr have size W/2 x H/2]
    4:2:0 YCbCr format: Rather than the horizontal-only 2:1 reduction of Cb and Cr used by 4:2:2, 4:2:0 YCbCr implements a 2:1 reduction of Cb and Cr in both the vertical and horizontal directions. It is commonly used for video compression.
  • Slide 24
  • 4:1:1 YCbCr format: For every four horizontal Y samples, there is one Cb and one Cr value. Each component is typically 8 bits. Each sample therefore requires 12 bits.
    4:2:2 YCbCr format: For every two horizontal Y samples, there is one Cb and one Cr sample. Each sample is typically 8 bits (consumer applications) or 10 bits (pro-video applications) per component. Each sample therefore requires 16 bits (or 20 bits for pro-video applications).
    4:4:4 YCbCr format: Each sample has a Y, a Cb, and a Cr value. Each sample is typically 8 bits (consumer applications) or 10 bits (pro-video applications) per component. Each sample therefore requires 24 bits (or 30 bits for pro-video applications).
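The bit counts quoted for each format follow from counting samples in a reference region. The sketch below assumes the usual J:a:b reading (a region J pixels wide and 2 rows high, with a chroma samples in the first row and b in the second); the names `bits_per_pixel` and `downsample_420` are hypothetical, and averaging each 2x2 neighborhood is only one possible way to reduce chroma, since the slides do not say which filter is used.

```python
def bits_per_pixel(j, a, b, bit_depth=8):
    """Average bits per pixel for J:a:b subsampling at the given bit depth."""
    luma_samples = 2 * j            # J pixels wide, 2 rows high
    chroma_samples = 2 * (a + b)    # Cb and Cr each contribute a + b samples
    return bit_depth * (luma_samples + chroma_samples) / luma_samples


def downsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 neighborhoods.

    Assumes the plane has even width and height.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


bpp_444 = bits_per_pixel(4, 4, 4)
bpp_422 = bits_per_pixel(4, 2, 2)
bpp_411 = bits_per_pixel(4, 1, 1)
bpp_420 = bits_per_pixel(4, 2, 0)
small = downsample_420([[1, 3], [5, 7]])
```

Note that 4:2:0 and 4:1:1 need the same number of bits per pixel; they differ only in where the chroma samples are taken.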
  • Slide 25
  • 4. Discrete Cosine Transform
    The DCT is a technique for converting a signal into elementary frequency components, that is, for transforming an image from the spatial domain to the frequency domain. Transforming an 8x8 pixel block through the FDCT yields 64 coefficients, which can be regarded as the amplitudes of two-dimensional frequencies.
    Coding approaches:
      • Predictive coding: information already sent or available is used to predict future values, and the difference is coded.
      • Transform coding: the image is transformed from its spatial-domain representation to a different representation using a well-known transform, and the transformed values (coefficients) are then coded.
    Transform coding relies on the premise that the pixels in an image are correlated with their neighboring pixels. These correlations can be exploited to predict the value of a pixel from its neighbors.
  • Slide 26
  • 4. Discrete Cosine Transform
    Forward DCT and inverse DCT. [Equations not preserved in the transcript]
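For reference, the 8x8 transform pair as commonly stated for JPEG is written out below. The symbol names $f(x,y)$ for the level-shifted samples and $F(u,v)$ for the coefficients are assumptions chosen to match the $F(u,v)$ used in the quantization section; the slide's own notation may have differed.

```latex
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7}
         f(x,y)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}

f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7}
         C(u)\, C(v)\, F(u,v)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}

C(k) = \begin{cases} \dfrac{1}{\sqrt{2}} & k = 0 \\ 1 & \text{otherwise} \end{cases}
```

With this normalization, a block of constant value $c$ has $F(0,0) = 8c$ and all other coefficients equal to zero.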
  • Slide 27
  • 4. Discrete Cosine Transform
    FDCT: Each cosine-product term of the transform formula is called a basis function. The 64 basis functions can be illustrated by the following image.
    [Figure: the 8x8 DCT basis functions, indexed by u and v from 0 to 7]
    Steps of FDCT: At the beginning of the FDCT, the 8-bit pixel values are shifted from [0, 255] into the range [-128, 127], with zero as the center.
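The level shift and the transform can be sketched in a direct, unoptimized form. The function name `fdct_8x8` is hypothetical, the normalization (a factor of 1/4 and 1/sqrt(2) for the zero frequency) is the one commonly stated for JPEG, and real encoders use fast factorizations instead of this four-fold loop.

```python
import math


def fdct_8x8(block):
    """Forward DCT of one 8x8 block of 8-bit samples (values 0..255)."""
    # Level shift: move the samples from [0, 255] to [-128, 127].
    shifted = [[sample - 128 for sample in row] for row in block]

    def c(k):
        # Normalization factor: 1/sqrt(2) for the zero frequency, 1 otherwise.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (shifted[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs


# A flat block has no spatial variation, so only the DC coefficient survives.
flat_white = fdct_8x8([[255] * 8 for _ in range(8)])
flat_mid = fdct_8x8([[128] * 8 for _ in range(8)])
```

For the all-255 block the DC term is 8 x 127 = 1016, while the all-128 block transforms to all zeros because the level shift removes its mean.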
  • Slide 28
  • 4. Discrete Cosine Transform
    Example: [Figure: an 8x8 block of values taken from Y, the W x H luminance plane of an image, is transformed by the DCT into 8x8 coefficients]
  • Slide 29
  • Image Processing, steps of FDCT, Example 3: After the level shift, taking the DCT and rounding to the nearest integer gives the coefficient matrix. [Matrices not preserved in the transcript]
  • Slide 30
  • Lossy Sequential DCT-based Mode, Step: Quantization
    The human eye is fairly good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high-frequency brightness variation. This allows the amount of information in the high-frequency components to be reduced, which is done by dividing each coefficient by a constant for that component and then rounding to the nearest integer.
  • Slide 31
  • Quantization: This is the main lossy operation in the whole process. As a result, many of the higher-frequency components are typically rounded to zero. A common quantization matrix (i.e. the numbers by which the coefficients are divided) is given under "5. Quantization".
  • Slide 32
  • 5. Quantization
    Example: the 8x8 DCT coefficients F(u,v) are divided by the entries of the quantization matrix Q(u,v):
        16   11   10   16   24   40   51   61
        12   12   14   19   26   58   60   55
        14   13   16   24   40   57   69   56
        14   17   22   29   51   87   80   62
        18   22   37   56   68  109  103   77
        24   35   55   64   81  104  113   92
        49   64   78   87  103  121  120  101
        72   92   95   98  112  100  103   99
    [Figure: F(u,v), the 8x8 DCT coefficients, not preserved in the transcript]
  • Slide 33
  • Quantization: Using this quantization matrix with the DCT coefficient matrix of Example 3 gives the quantized matrix. [Matrix not preserved in the transcript] Most of the high-frequency components are zero. This matrix is sent to the next step for entropy encoding.
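The divide-and-round step can be sketched as follows, using the quantization matrix from the "5. Quantization" slide. The names `Q_LUMA`, `quantize`, and `dequantize` are hypothetical, and the input values are made up for illustration because the Example 3 matrix is not available. Python's built-in `round` sends exact halves to the nearest even integer, which may differ from a given encoder's rounding rule.

```python
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=Q_LUMA):
    """Divide each DCT coefficient by its table entry and round."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)]
            for u in range(8)]


def dequantize(levels, table=Q_LUMA):
    """Decoder side: multiply back; the rounding loss is not recovered."""
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]


# Illustrative input: every coefficient equal to 10.0.
levels = quantize([[10.0] * 8 for _ in range(8)])
```

Even for this uniform input, only the coefficients with small divisors in the top-left corner survive; the high-frequency positions with large divisors all round to zero.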
  • Slide 34
  • Lossy Sequential DCT-based Mode Entropy Encoding
  • Slide 35
  • Lossy Sequential DCT-based Mode, Step: Entropy Encoding
    Encoding of DC-coefficient: During the initial step of entropy encoding, each DC-coefficient is encoded as the difference between the current one and the previous one. Only the differences are subsequently processed.
    [Figure: block i-1 with DC_{i-1} and block i with DC_i]
    $$\mathrm{Diff}_i = DC_i - DC_{i-1}$$
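The differencing step is short enough to write out. In this sketch the function names are hypothetical, the DC values are made up for illustration, and the predictor for the first block is assumed to start at 0.

```python
def dc_differences(dc_values, initial_prediction=0):
    """Replace each DC coefficient by its difference from the previous one."""
    diffs = []
    previous = initial_prediction  # assumed starting predictor
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dc_from_differences(diffs, initial_prediction=0):
    """Decoder side: rebuild the DC coefficients by accumulating differences."""
    values = []
    previous = initial_prediction
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


dc_stream = [5, 6, 8, 7, 7, 9, 12, 9]   # illustrative DC values, one per block
diffs = dc_differences(dc_stream)
```

Because neighboring blocks tend to have similar averages, the differences cluster around zero even when the DC values themselves are large.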
  • Slide 36
  • Entropy Encoding
    Encoding of DC-coefficient: The DC coefficients are usually highly correlated, so taking differences reduces the entropy of the DC data stream. The result is a series of numbers.
    Example 4: This may result in something like 1, 2, -1, 0, 2, 3, -3, ..., one number for each block in the image, which is to be compressed with a lossless entropy encoding method. These numbers are encoded according to the following table, in which each value is located by its row and column:
    Row  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7
      0                                   0
      1                              -1   1
      2                          -3  -2   2   3
      3                  -7  -6  -5  -4   4   5   6   7
      4 -15 -14 -13 -12 -11 -10  -9  -8   8   9  10  11  12  13  14  15
      5 ...
  • Slide 37
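  • Entropy Encoding
    Encoding of DC-coefficient: Each difference is described by its (row, column) position in the table. The row is written as a 4-bit number, and only as many low-order bits of the column as the row number are kept:
      • 1 is at (1, 0): in binary (0001, 00000000000), represented as (0001, 0)
      • 3 is at (2, 1): in binary (0010, 00000000001), represented as (0010, 01)
      • -3 is at (2, -2): in binary (0010, 11111111110), represented as (0010, 10)
    Under the baseline mode, the first number (e.g. 0001 in (0001, 0)) is additionally compressed using either a default Huffman table or, optionally, one provided with the image.
    ITU-T T.81 itself writes the appended bits differently: a positive value is sent as its binary representation and a negative value as the one's complement of its magnitude, which is the convention of the category table in the Huffman coding section.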
  • Slide 38
  • Encoding of AC-coefficient
    Zig-zag ordering of AC-coefficients: The AC-coefficients are taken in zig-zag order. [Figure: zig-zag scan path over the 8x8 block] The zig-zag sequence collects the low-frequency coefficients before the high-frequency coefficients, thereby grouping the large numbers at the beginning of the sequence. The zig-zag sequence for Example 3 after quantization is 3, 0, 3, 2, 6, 2, 4, 1, 4, 1, 1, 5, 1, 2, 1, 1, 1, 2, 0, 0, 0, 0, 0, 1, 1, with all remaining coefficients equal to 0.
  • Slide 39
  • 6. Zig-Zag Reordering
    Example: quantized block
         44    4    6    5    1   -1    0    0
        -11   -6   -5   -4    2    0    0    0
          6    2    4    2    0    0    0    0
         -3    1   -1   -2    1    0    0    0
         -9   -2    0    0    0    0    0    0
          4    0    1   -1   -1    0    0    0
         -1   -1    0    0    1    0    0    0
          0   -1    0    0    0    0    0    0
    Zig-zag reordering: 44, 4, -11, 6, -6, 6, 5, -5, 2, -3, -9, 1, 4, -4, 1, -1, 2, 2, -1, -2, 4, -1, 0, 0, -2, 0, 0, 0, 0, 0, 0, 1, 0, 1, -1, 0, -1, 0, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
    Issue: Quantization generates error by its nature. Fault-tolerant systems therefore need to distinguish between quantization errors and computer failure errors. This is one of the challenges in the research project.
    To improve the compression ratio, the quantized block is rearranged into zig-zag order, and run-length coding is then applied to convert the sequence into intermediate symbols.
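The zig-zag scan can be generated by walking the anti-diagonals of the block and alternating direction. This is a minimal sketch with hypothetical names (`zigzag_indices`, `zigzag_scan`); the block is the one from the example on this slide, reconstructed so that it agrees with the listed zig-zag sequence.

```python
def zigzag_indices(n=8):
    """Return the (row, column) pairs of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                  # s = row + column
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)               # even diagonals run upwards
        order.extend((row, s - row) for row in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block into its zig-zag sequence."""
    return [block[row][col] for row, col in zigzag_indices(len(block))]


BLOCK = [
    [44, 4, 6, 5, 1, -1, 0, 0],
    [-11, -6, -5, -4, 2, 0, 0, 0],
    [6, 2, 4, 2, 0, 0, 0, 0],
    [-3, 1, -1, -2, 1, 0, 0, 0],
    [-9, -2, 0, 0, 0, 0, 0, 0],
    [4, 0, 1, -1, -1, 0, 0, 0],
    [-1, -1, 0, 0, 1, 0, 0, 0],
    [0, -1, 0, 0, 0, 0, 0, 0],
]

sequence = zigzag_scan(BLOCK)
```

The first entry of the sequence is the DC coefficient; the remaining 63 entries are the AC coefficients that go on to run-length coding.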
  • Slide 40
  • 9. Huffman Coding
    Category   Values                            Bits for the value
    1          -1, 1                             0, 1
    2          -3, -2, 2, 3                      00, 01, 10, 11
    3          -7, -6, -5, -4, 4, 5, 6, 7        000, 001, 010, 011, 100, 101, 110, 111
    4          -15, ..., -8, 8, ..., 15          0000, ..., 0111, 1000, ..., 1111
    5          -31, ..., -16, 16, ..., 31        00000, ..., 01111, 10000, ..., 11111
    6          -63, ..., -32, 32, ..., 63        000000, ..., 011111, 100000, ..., 111111
    7          -127, ..., -64, 64, ..., 127      0000000, ..., 0111111, 1000000, ..., 1111111
    8          -255, ..., -128, 128, ..., 255    ...
    9          -511, ..., -256, 256, ..., 511    ...
    10         -1023, ..., -512, 512, ..., 1023  ...
    11         -2047, ..., -1024, 1024, ..., 2047  ...
    Figure 6. Table of values and bits for the value
  • Slide 41
  • 9. Huffman Coding
    Example: run-length coding of the 63 AC coefficients gives
    (0,57) ; (0,45) ; (4,23) ; (1,-30) ; (0,-8) ; (2,1) ; (0,0)
    Encode the right-hand value of each pair as a category plus the bits for the value, except for special markers such as (0,0) or (15,0):
    (0,6,111001) ; (0,6,101101) ; (4,5,10111) ; (1,5,00001) ; (0,4,0111) ; (2,1,1) ; (0,0)
    The difference of the DC coefficient: -511
    Encode the value as a category plus the bits for the value: 9, 000000000
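The two steps of this example, run-length coding and the category/bits split, can be sketched as below. The function names are hypothetical, and the positions of the zeros in the input list are inferred from the run lengths in the example. A positive value is written in plain binary, and a negative value as the one's complement of its magnitude, as in the table of Figure 6.

```python
def run_length_ac(ac):
    """Turn the 63 AC coefficients into (zero_run, value) pairs.

    (15, 0) stands for a run of 16 zeros; (0, 0) marks the end of the block.
    """
    pairs = []
    last = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    run = 0
    for value in ac[:last + 1]:
        if value == 0:
            run += 1
            if run == 16:
                pairs.append((15, 0))
                run = 0
        else:
            pairs.append((run, value))
            run = 0
    if last < len(ac) - 1:          # trailing zeros collapse into end-of-block
        pairs.append((0, 0))
    return pairs


def category_and_bits(value):
    """Return (category, bit string) for one non-zero coefficient value."""
    category = abs(value).bit_length()
    if value < 0:
        value += (1 << category) - 1    # one's complement of the magnitude
    return category, format(value, "b").zfill(category) if category else ""


ac = [57, 45, 0, 0, 0, 0, 23, 0, -30, -8, 0, 0, 1] + [0] * 50
pairs = run_length_ac(ac)
symbols = [(run,) + category_and_bits(v) if v != 0 else (run, 0)
           for run, v in pairs]
dc_symbol = category_and_bits(-511)
```

The resulting `symbols` list has the same shape as the triples in the example, with the end-of-block marker left as the plain pair (0, 0).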
  • Slide 42
  • 9. Huffman Coding
    Run/Category   Code length   Code word
    0/0            4             1010
    ...
    0/6            7             1111000
    ...
    0/10           16            1111111110000011
    1/1            4             1100
    ...
    1/5            11            11111110110
    ...
    4/5            16            1111111110011000
    ...
    15/10          16            1111111111111110
    Figure 7. Huffman table of luminance AC coefficients
  • Slide 43
  • 9. Huffman Coding
    Category   Code length   Code word
    0          2             00
    1          3             010
    2          3             011
    3          3             100
    4          3             101
    5          3             110
    6          4             1110
    7          5             11110
    8          6             111110
    9          7             1111110
    10         8             11111110
    11         9             111111110
    Figure 8. Huffman table of luminance DC coefficients
  • Slide 44
  • 9. Huffman Coding
    Example: the AC coefficients
    (0,6,111001) ; (0,6,101101) ; (4,5,10111) ; (1,5,00001) ; (0,4,0111) ; (2,1,1) ; (0,0)
    Encode the left two values in each () using Huffman encoding:
    1111000 111001, 1111000 101101, 1111111110011000 10111, 11111110110 00001, 1011 0111, 11100 1, 1010
    The DC coefficient: 9, 000000000
    Encode the category using Huffman encoding: 1111110 000000000
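The final lookup step can be sketched with plain dictionaries. The AC dictionary below holds only the run/category entries that this example needs, taken from Figure 7 and from the encoded result above; it is not the full table. The DC dictionary follows Figure 8. The names `AC_LUMA_CODES`, `DC_LUMA_CODES`, `encode_ac`, and `encode_dc` are hypothetical.

```python
# Partial luminance AC table: only the entries used in this example.
AC_LUMA_CODES = {
    (0, 0): "1010",                 # end of block
    (0, 4): "1011",
    (0, 6): "1111000",
    (1, 5): "11111110110",
    (2, 1): "11100",
    (4, 5): "1111111110011000",
}

# Luminance DC table, category -> code word (Figure 8).
DC_LUMA_CODES = {
    0: "00", 1: "010", 2: "011", 3: "100", 4: "101", 5: "110",
    6: "1110", 7: "11110", 8: "111110", 9: "1111110",
    10: "11111110", 11: "111111110",
}


def encode_ac(symbols):
    """Map (run, category, bits) triples to 'code word + value bits' strings."""
    encoded = []
    for run, category, bits in symbols:
        code = AC_LUMA_CODES[(run, category)]
        encoded.append((code + " " + bits).strip())
    return encoded


def encode_dc(category, bits):
    """Huffman-code the DC category and append the value bits."""
    return (DC_LUMA_CODES[category] + " " + bits).strip()


ac_symbols = [(0, 6, "111001"), (0, 6, "101101"), (4, 5, "10111"),
              (1, 5, "00001"), (0, 4, "0111"), (2, 1, "1"), (0, 0, "")]
ac_encoded = encode_ac(ac_symbols)
dc_encoded = encode_dc(9, "000000000")
```

The space between code word and value bits is kept only for readability; in the actual bit stream the two are concatenated.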
  • Slide 45
  • End