Post on 13-Jan-2016
CMPT365 Multimedia Systems 1
Lossy Compression
Spring 2015
CMPT 365 Multimedia Systems
Lossless vs Lossy Compression
If the compression and decompression processes induce no information loss, then the compression scheme is lossless; otherwise, it is lossy.
Why is lossy compression possible?
[Figure: an original image alongside three compressed versions, labeled Compression Ratio: 12.3, Compression Ratio: 7.7, and Compression Ratio: 33.9]
Outline
Quantization
  Uniform
  Non-uniform
Transform coding
  DCT
Quantization
The process of representing a large (possibly infinite) set of values with a much smaller set.
  Example: A/D conversion
An efficient tool for lossy compression
Review:
  Encoder: Transform -> Quantization -> Entropy coding -> channel
  Decoder: channel -> Entropy decoding -> Inverse quantization -> Inverse transform
Review: Basic Idea
[Figure: the quantizer maps each input value x to one of Bin 0 to Bin 4; the bin index is entropy coded and entropy decoded; the dequantizer (inverse quantizer) maps the index to a reconstruction value $\hat{x}$]
Quantization is a function that maps an input interval to one integer.
  Can reduce the bits required to represent the source.
  The reconstructed result is generally not the original input.
Terminology:
  Decision boundaries $b_i$: bin boundaries
  Reconstruction levels $y_i$: output value of each bin by the dequantizer
Uniform Quantizer
All bins have the same size, except possibly the two outer intervals:
  $b_i$ and $y_i$ are evenly spaced.
  The spacing of both $b_i$ and $y_i$ is ∆ (step size).
  $y_i = \frac{1}{2}(b_{i-1} + b_i)$ for inner intervals.
Uniform Midrise Quantizer
  [Figure: staircase plot of reconstruction versus input; input axis marked at ±∆, ±2∆, ±3∆; reconstruction levels at ±0.5∆, ±1.5∆, ±2.5∆, ±3.5∆]
  Even number of reconstruction levels
  0 is not a reconstruction level
Uniform Midtread Quantizer
  [Figure: staircase plot of reconstruction versus input; input axis marked at ±0.5∆, ±1.5∆, ±2.5∆; reconstruction levels at ±∆, ±2∆, ±3∆]
  Odd number of reconstruction levels
  0 is a reconstruction level
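Midtread Quantizer
Quantization mapping: $q = A(x) = \mathrm{sign}(x) \left\lfloor \frac{|x|}{\Delta} + 0.5 \right\rfloor$
  Output is an index
De-quantization mapping: $\hat{x} = B(q) = q\Delta$
Example: x = -1.8∆, q = -2, so $\hat{x}$ = -2∆.
[Figure: midtread staircase plot of reconstruction versus input; input axis marked at ±0.5∆, ±1.5∆, ±2.5∆; reconstruction levels at ±∆, ±2∆, ±3∆]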
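The two mappings on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the step size of 1.0 are assumptions, not part of the original slides.

```python
import math

def quantize_midtread(x, step):
    """A(x): map a real input to an integer bin index."""
    if x == 0:
        return 0
    sign = 1 if x > 0 else -1
    return sign * math.floor(abs(x) / step + 0.5)

def dequantize_midtread(q, step):
    """B(q): map a bin index back to its reconstruction level."""
    return q * step

step = 1.0                                   # assumed step size (Delta)
q = quantize_midtread(-1.8 * step, step)     # the slide's example input
x_hat = dequantize_midtread(q, step)
print(q, x_hat)                              # index -2, reconstruction -2.0
```

The reconstruction differs from the input by 0.2∆ here, which is the quantization error for this sample.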
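Model of Quantization
Quantization: $q = A(x)$
Inverse quantization: $\hat{x} = B(q) = B(A(x)) = Q(x)$
B is not exactly the inverse function of A, because $\hat{x} \neq x$.
Quantization error: $e(x) = x - \hat{x}$
Combining quantizer and de-quantizer:
[Figure: block diagrams x -> A -> q -> B -> $\hat{x}$, equivalently x -> Q -> $\hat{x}$, or an additive model in which e(x) is subtracted from x to give $\hat{x}$]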
Rate-Distortion Tradeoff
Things to be determined:
  Number of bins
  Bin boundaries
  Reconstruction levels
A tradeoff between rate and distortion:
  To reduce the size of the encoded bits, we need to reduce the number of bins.
  Fewer bins -> more reconstruction error
[Figure: plot of rate against distortion with two operating points, A and B]
Measure of Distortion
Quantization error: $e(x) = x - \hat{x}$
Mean Squared Error (MSE) for quantization
  Average quantization error over all input values
  Need to know the probability distribution of the input
Number of bins: M
Decision boundaries: $b_i$, i = 0, …, M
Reconstruction levels: $y_i$, i = 1, …, M
Reconstruction: $\hat{x} = y_i$ iff $b_{i-1} < x \le b_i$
MSE: $\mathrm{MSE}_q = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{i=1}^{M} \int_{b_{i-1}}^{b_i} (x - y_i)^2 f(x)\,dx$
Same as the variance of e(x) if μ = E{e(x)} = 0 (zero mean).
Definition of variance: $\sigma_e^2 = \int_{-\infty}^{\infty} (e - \mu_e)^2 f_e(e)\,de$
Rate-Distortion Optimization
Two scenarios:
  Given M, find $b_i$ and $y_i$ that minimize the MSE.
  Given a distortion constraint D, find M, $b_i$ and $y_i$ such that MSE ≤ D.
Outline
Quantization
  Uniform
  Non-uniform
  Vector quantization
Transform coding
  DCT
Uniform Quantization of a Uniformly Distributed Source
Input X: uniformly distributed in [-Xmax, Xmax]: f(x) = 1 / (2Xmax)
Number of bins: M (even for midrise quantizer)
Step size is easy to get: ∆ = 2Xmax / M
Decision boundaries: $b_i = (i - M/2)\Delta$
Example with M = 8 (b0 = -Xmax, b8 = Xmax):
  i      0      1      2      3      4      5      6      7      8
  b_i   -4∆    -3∆    -2∆    -∆      0      ∆      2∆     3∆     4∆
  y_i          -3.5∆  -2.5∆  -1.5∆  -0.5∆   0.5∆   1.5∆   2.5∆   3.5∆
e(x) is uniformly distributed in [-∆/2, ∆/2].
[Figure: sawtooth plot of e(x) against x over [-4∆, 4∆], ranging between -0.5∆ and 0.5∆]
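As a sketch of how the boundaries and levels on this slide follow from Xmax and M, the snippet below builds them directly from ∆ = 2Xmax / M and b_i = (i - M/2)∆. The function name and the choice Xmax = 4 (so that ∆ = 1) are illustrative assumptions.

```python
def uniform_midrise(x_max, m):
    """Return (step, boundaries b_0..b_M, levels y_1..y_M) for an M-bin midrise quantizer."""
    step = 2.0 * x_max / m
    b = [(i - m / 2) * step for i in range(m + 1)]          # b_i = (i - M/2) * step
    y = [(b[i - 1] + b[i]) / 2 for i in range(1, m + 1)]    # midpoint of each bin
    return step, b, y

step, b, y = uniform_midrise(x_max=4.0, m=8)   # assumed values giving step = 1
print(step)
print(b)
print(y)
```

With these values the lists reproduce the M = 8 layout of the slide, in units of ∆.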
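Uniform Quantization of a Uniformly Distributed Source
MSE:
  $\mathrm{MSE}_q = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{i=1}^{M} \int_{b_{i-1}}^{b_i} (x - y_i)^2 f(x)\,dx$
  $= M \cdot \frac{1}{2X_{max}} \int_{0}^{\Delta} \left(x - \frac{\Delta}{2}\right)^2 dx = \frac{M}{2X_{max}} \cdot \frac{1}{12}\Delta^3 = \frac{1}{12}\Delta^2$
M increases, ∆ decreases, MSE decreases
Variance of a random variable uniformly distributed in [-∆/2, ∆/2]:
  $\sigma_q^2 = \int_{-\Delta/2}^{\Delta/2} (x - 0)^2 \frac{1}{\Delta}\,dx = \frac{1}{12}\Delta^2$
Optimization: Find M such that MSE ≤ D
  $\frac{1}{12}\Delta^2 \le D \;\Rightarrow\; \frac{1}{12}\left(\frac{2X_{max}}{M}\right)^2 \le D \;\Rightarrow\; M \ge X_{max}\sqrt{\frac{1}{3D}}$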
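The closed-form result ∆²/12 can be checked numerically. The sketch below integrates the per-bin squared error with a midpoint rule and also evaluates the bound M ≥ Xmax·sqrt(1/(3D)). The function names, the sample count, and the test values are assumptions chosen for illustration; a midrise design would additionally round M up to an even number.

```python
import math

def uniform_quantizer_mse(x_max, m, samples_per_bin=2000):
    """Midpoint-rule estimate of sum_i int (x - y_i)^2 f(x) dx for a uniform source."""
    step = 2.0 * x_max / m
    pdf = 1.0 / (2.0 * x_max)
    dx = step / samples_per_bin
    total = 0.0
    for i in range(1, m + 1):
        lo = -x_max + (i - 1) * step      # b_{i-1}
        y_i = lo + step / 2               # reconstruction level of bin i
        for k in range(samples_per_bin):
            x = lo + (k + 0.5) * dx
            total += (x - y_i) ** 2 * pdf * dx
    return total

def min_bins(x_max, d):
    """Smallest integer M satisfying M >= x_max * sqrt(1 / (3 D))."""
    return math.ceil(x_max * math.sqrt(1.0 / (3.0 * d)))

x_max, m = 1.0, 8                         # assumed example values
step = 2.0 * x_max / m
print(uniform_quantizer_mse(x_max, m), step ** 2 / 12)
print(min_bins(1.0, 0.001))
```

The two printed MSE values should agree to several decimal places, and doubling M should reduce both by a factor of about four.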
Signal to Noise Ratio (SNR)
Variance is a measure of signal energy.
Let $M = 2^n$
  Each bin index is represented by n bits
$\mathrm{SNR(dB)} = 10\log_{10}\frac{\text{Signal Energy}}{\text{Noise Energy}} = 10\log_{10}\frac{\frac{1}{12}(2X_{max})^2}{\frac{1}{12}\Delta^2} = 10\log_{10}\frac{(2X_{max})^2}{(2X_{max}/M)^2}$
$= 10\log_{10} M^2 = 10\log_{10} 2^{2n} = (20\log_{10} 2)\,n = 6.02\,n$ dB
If n -> n+1, ∆ is halved, noise variance reduces to 1/4, and SNR increases by 6 dB.
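A short numerical check of the 6.02n dB rule, assuming a uniform source on [-Xmax, Xmax] as in the derivation. The function name and the default Xmax = 1 are illustrative assumptions.

```python
import math

def snr_db(n_bits, x_max=1.0):
    """SNR in dB of an n-bit uniform quantizer on a uniform source in [-x_max, x_max]."""
    m = 2 ** n_bits
    step = 2.0 * x_max / m
    signal = (2.0 * x_max) ** 2 / 12.0    # variance of the source
    noise = step ** 2 / 12.0              # variance of the quantization error
    return 10.0 * math.log10(signal / noise)

for n in (1, 2, 8):
    print(n, round(snr_db(n), 2))
```

Each extra bit adds 20·log10(2), about 6.02 dB, regardless of Xmax.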
Outline
Quantization
  Uniform
  Non-uniform
Transform coding
  DCT
Non-uniform Quantization
A uniform quantizer is not optimal if the source is not uniformly distributed.
For a given M, to reduce the MSE we want narrow bins where f(x) is high and wide bins where f(x) is low.
[Figure: pdf f(x) plotted against x, with 0 marked on the x axis]
$\sigma_q^2 = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx$
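Lloyd-Max Quantizer
$\sigma_q^2 = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx$
Also known as the pdf-optimized quantizer
Given M, the optimal $b_i$ and $y_i$ that minimize the MSE satisfy the Lagrangian condition:
  $\frac{\partial \sigma_q^2}{\partial y_i} = 0, \quad \frac{\partial \sigma_q^2}{\partial b_i} = 0$
$\frac{\partial \sigma_q^2}{\partial y_i} = 0 \;\Rightarrow\; y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx}$
$y_i$ is the centroid of interval $[b_{i-1}, b_i]$.
[Figure: pdf f(x) against x, with the interval from b_{i-1} to b_i and its centroid y_i marked]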
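Lloyd-Max Quantizer
If f(x) = c (uniformly distributed source):
  $y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx} = \frac{c \int_{b_{i-1}}^{b_i} x\,dx}{c\,(b_i - b_{i-1})} = \frac{\frac{1}{2}(b_i^2 - b_{i-1}^2)}{b_i - b_{i-1}} = \frac{1}{2}(b_{i-1} + b_i)$
$\frac{\partial \sigma_q^2}{\partial b_i} = 0 \;\Rightarrow\; b_i = \frac{y_i + y_{i+1}}{2}$
$b_i$ is the midpoint of $y_i$ and $y_{i+1}$.
[Figure: pdf f(x) against x, with b_{i-1}, b_i, b_{i+1} and the levels y_i, y_{i+1} marked]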
Lloyd-Max Quantizer
Summary of conditions for optimal quantizer:
  $y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx}$
    Given $b_i$, can find the corresponding optimal $y_i$
  $b_i = \frac{y_i + y_{i+1}}{2}$
    Given $y_i$, can find the corresponding optimal $b_i$
How to find optimal $b_i$ and $y_i$ simultaneously?
A deadlock:
  • Reconstruction levels depend on decision levels
  • Decision levels depend on reconstruction levels
Solution: iterative method!
Lloyd Algorithm (Sayood p. 267)
1. Start from an initial set of reconstruction values $y_i$.
2. Find all decision levels: $b_i = \frac{y_i + y_{i+1}}{2}$
3. Compute the MSE: $\sigma_q^2 = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx$
4. Stop if the MSE changes little from the last iteration.
5. Otherwise, update $y_i$: $y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx}$, then go to step 2.
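The five steps can be sketched as follows. This is a minimal illustration rather than Sayood's implementation: it assumes the pdf is supplied as a Python function on a finite interval [lo, hi], replaces the integrals with midpoint-rule sums on a fixed grid, and uses hypothetical parameter names (grid, tol, max_iter).

```python
import math

def lloyd_max(pdf, lo, hi, m, grid=4000, tol=1e-9, max_iter=500):
    """Iterate the midpoint and centroid conditions for a pdf sampled over [lo, hi]."""
    dx = (hi - lo) / grid
    xs = [lo + (k + 0.5) * dx for k in range(grid)]
    ws = [pdf(x) * dx for x in xs]                  # probability mass of each grid cell
    # Step 1: initial reconstruction levels, spread evenly over the interval.
    y = [lo + (i + 0.5) * (hi - lo) / m for i in range(m)]
    prev_mse = float("inf")
    for _ in range(max_iter):
        # Step 2: decision levels are midpoints of neighbouring reconstruction levels.
        b = [lo] + [(y[i] + y[i + 1]) / 2 for i in range(m - 1)] + [hi]
        # Step 3: accumulate the MSE and the centroid sums bin by bin.
        mse, num, den = 0.0, [0.0] * m, [0.0] * m
        i = 0
        for x, w in zip(xs, ws):
            while i < m - 1 and x > b[i + 1]:
                i += 1
            mse += (x - y[i]) ** 2 * w
            num[i] += x * w
            den[i] += w
        # Step 4: stop when the MSE has settled.
        if abs(prev_mse - mse) < tol:
            break
        prev_mse = mse
        # Step 5: move each level to the centroid of its bin.
        y = [num[k] / den[k] if den[k] > 0 else y[k] for k in range(m)]
    return b, y, mse

def gaussian(x):
    """Standard normal pdf, used here as an assumed example source."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

b, y, mse = lloyd_max(gaussian, -5.0, 5.0, m=4)
print(b)
print(y)
print(mse)
```

For a peaked pdf such as the Gaussian, the resulting bins are narrower near 0 and wider in the tails, which matches the intuition on the Non-uniform Quantization slide.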
Outline
Quantization
  Uniform quantization
  Non-uniform quantization
Transform coding
  Discrete Cosine Transform (DCT)
Why Transform Coding?
Transform
  From one domain/space to another space
    Time -> Frequency
    Spatial/Pixel -> Frequency
Purpose of transform
  Remove correlation between input samples
  Transform most energy of an input block into a few coefficients
  Small coefficients can be discarded by quantization without too much impact on reconstruction quality
Encoder: Transform -> Quantization -> Entropy coding
1-D Example
Fourier Transform
1-D Example
Application (besides compression)
  Boost bass/audio equalizer
  Noise cancellation
1-D Example
http://www.mathdemos.org/mathdemos/trigsounddemo/trigsounddemo.html
  Sine wave/sound/piano
www.sagebrush.com/mousing.htm
  An electronic instrument that allows direct control of pitch and amplitude
1-D Example
Smooth signals have strong DC (direct current, or zero frequency) and low-frequency components, and weak high-frequency components.
[Figure: three panels plotted against sample index 1 to 8: Original Input, DFT Magnitudes, and DCT Coefficients, with the DC and high-frequency ends marked]
2-D Example
Apply transform to each 8x8 block
[Figure: Original Image, with histograms of source and DCT coefficients. 2-D DCT Coefficients: min = -465.37, max = 1789.00]
Most transform coefficients are around 0, which is desirable for compression.
Rationale behind Transform
If Y is the result of a linear transform T of the input vector X in such a way that the components of Y are much less correlated, then Y can be coded more efficiently than X.
If most information is accurately described by the first few components of a transformed vector, then the remaining components can be coarsely quantized, or even set to zero, with little signal distortion.
Matrix Representation of Transform
Linear transform is an N x N matrix: $y_{N \times 1} = T_{N \times N}\, x_{N \times 1}$
Inverse transform: $x = T^{-1} y$
Unitary transform (aka orthonormal): $T^{-1} = T^T$, so $x = T^T y$
For a unitary transform, rows/cols have unit norm and are orthogonal to each other:
  $T T^T = I \;\Rightarrow\; t_i t_j^T = \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}$
Discrete Cosine Transform (DCT)
The DCT is close to the optimal transform (known as the KL Transform) but much simpler and faster.
Definition:
  $C_{i,j} = a_i \cos\frac{(2j+1)\,i\,\pi}{2N}, \quad i, j = 0, \ldots, N-1$
  $a_i = \sqrt{1/N}$ for $i = 0$
  $a_i = \sqrt{2/N}$ for $i = 1, \ldots, N-1$
Matlab function: dct(eye(N));
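DCT
Definition:
  $C_{i,j} = a_i \cos\frac{(2j+1)\,i\,\pi}{2N}, \quad i, j = 0, \ldots, N-1$
  $a_i = \sqrt{1/N}$ for $i = 0$
  $a_i = \sqrt{2/N}$ for $i = 1, \ldots, N-1$
N = 2 (Haar Transform):
  $C_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$
  $\begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = C_2 \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} x_0 + x_1 \\ x_0 - x_1 \end{bmatrix}$
y0 captures the mean of x0 and x1 (low-pass)
  x0 = x1 = 1 -> y0 = sqrt(2) (DC), y1 = 0
y1 captures the difference of x0 and x1 (high-pass)
  x0 = 1, x1 = -1 -> y0 = 0 (DC), y1 = sqrt(2)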
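The definition can be turned into a small matrix builder. The sketch below is plain Python standing in for Matlab's dct(eye(N)); the helper names dct_matrix and transform are assumptions made for this example.

```python
import math

def dct_matrix(n):
    """Rows are the N-point DCT basis vectors: C[i][j] = a_i * cos((2j + 1) * i * pi / (2N))."""
    c = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        c.append([a * math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)])
    return c

def transform(c, x):
    """y = C x for a vector x given as a list."""
    return [sum(cij * xj for cij, xj in zip(row, x)) for row in c]

c2 = dct_matrix(2)
print(c2)                        # 1/sqrt(2) * [[1, 1], [1, -1]] up to rounding
print(transform(c2, [1, 1]))     # low-pass case: energy goes to y0
print(transform(c2, [1, -1]))    # high-pass case: energy goes to y1
```

The same function with n = 4 or n = 8 gives the larger DCT matrices used later in the deck.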
DCT
Magnitude frequency responses of the 2-point DCT:
  $C_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$
  Can be obtained by freqz( ) in Matlab.
[Figure: Magnitude Response (dB), from -40 to 5, against Normalized Frequency (x 2π), from 0 to 0.5, with the low-pass and high-pass responses and DC marked. Plot title: DC Att. 403.0103, Mirr Att. 324.2604, Stopband 50, Coding Gain = 5.055 dB]
4-point DCT
Four subbands
   0.5000   0.5000   0.5000   0.5000
   0.6533   0.2706  -0.2706  -0.6533
   0.5000  -0.5000  -0.5000   0.5000
   0.2706  -0.6533   0.6533  -0.2706
[Figure: Magnitude Response (dB), from -40 to 5, against Normalized Frequency (x 2π), from 0 to 0.5. Plot title: DC Att. 406.0206, Mirr Att. 324.2604, Stopband 8.3456, Coding Gain = 7.5701 dB]
8-point DCT
Eight subbands
[Figure: Magnitude Response (dB), from -40 to 5, against Normalized Frequency (x 2π), from 0 to 0.5. Plot title: DC Att. 409.0309, Mirr Att. 320.1639, Stopband 9.9559, Coding Gain = 8.8259 dB]
Example
x = [100 110 120 130 140 150 160 170]^T
8-point DCT: [381.8377, -64.4232, 0.0, -6.7345, 0.0, -2.0090, 0.0, -0.5070]
Most of the energy is in the first 2 coefficients.
[Figure: two plots against index 1 to 8: the input x and its DCT coefficients]
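The coefficients above can be reproduced from the DCT definition. This sketch rebuilds the 8-point matrix in plain Python (dct_matrix is an assumed helper name) and also reports the fraction of signal energy held by the first two coefficients.

```python
import math

def dct_matrix(n):
    """C[i][j] = a_i * cos((2j + 1) * i * pi / (2N)), a_0 = sqrt(1/N), a_i = sqrt(2/N)."""
    c = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        c.append([a * math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)])
    return c

x = [100, 110, 120, 130, 140, 150, 160, 170]
c8 = dct_matrix(8)
y = [sum(cij * xj for cij, xj in zip(row, x)) for row in c8]
print([round(v, 4) for v in y])

total_energy = sum(v * v for v in y)                # equals sum of x^2 for an orthonormal transform
first_two = (y[0] ** 2 + y[1] ** 2) / total_energy  # share of energy in y0 and y1
print(first_two)
```

Because the transform is orthonormal, the total energy of y equals that of x, so the energy share is a fair measure of compaction.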
Block Transform
Divide input data into blocks (2D)
Encode each block separately (sometimes with information from neighboring blocks)
Examples:
  Most DCT-based image/video coding standards
2-D DCT Basis
[Figures: 2-D basis for the 2-point DCT and for the 4-point DCT]
2-D Separable DCT
X: N x N input block
T: N x N transform
A = TX: apply T to each column of X
B = XT^T: apply T to each row of X
2-D separable transform:
  Apply T to each row
  Then apply T to each column
  $Y = T X T^T$
Inverse transform:
  $X = T^T Y T$
2-D 8-point DCT Example
Original Data:
   89   78   76   75   70   82   81   82
  122   95   86   80   80   76   74   81
  184  153  126  106   85   76   71   75
  221  205  180  146   97   71   68   67
  225  222  217  194  144   95   78   82
  228  225  227  220  193  146  110  108
  223  224  225  224  220  197  156  120
  217  219  219  224  230  220  197  151
2-D DCT Coefficients (after rounding to integers):
  1155  259  -23    6   11    7    3    0
  -377  -50   85  -10   10    4    7   -3
    -4 -158  -24   42  -15    1    0    1
    -2    3  -34  -19    9   -5    4   -1
     1    9    6  -15  -10    6   -5   -1
     3   13    3    6   -9    2    0   -3
     8   -2    4   -1    3   -1    0   -2
     2    0   -3    2   -2    0    0   -1
Most energy is in the upper-left corner
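The separable transform Y = T X T^T and its inverse X = T^T Y T can be tried on this block. The sketch below is a plain-Python illustration; the helper names (dct_matrix, matmul, transpose, dct2, idct2) are assumptions, and only the first two entries of the first coefficient column are printed.

```python
import math

def dct_matrix(n):
    """C[i][j] = a_i * cos((2j + 1) * i * pi / (2N)), a_0 = sqrt(1/N), a_i = sqrt(2/N)."""
    c = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        c.append([a * math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)])
    return c

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2(x):
    """Forward 2-D separable transform: Y = T X T^T."""
    t = dct_matrix(len(x))
    return matmul(matmul(t, x), transpose(t))

def idct2(y):
    """Inverse 2-D transform: X = T^T Y T."""
    t = dct_matrix(len(y))
    return matmul(matmul(transpose(t), y), t)

X = [
    [89, 78, 76, 75, 70, 82, 81, 82],
    [122, 95, 86, 80, 80, 76, 74, 81],
    [184, 153, 126, 106, 85, 76, 71, 75],
    [221, 205, 180, 146, 97, 71, 68, 67],
    [225, 222, 217, 194, 144, 95, 78, 82],
    [228, 225, 227, 220, 193, 146, 110, 108],
    [223, 224, 225, 224, 220, 197, 156, 120],
    [217, 219, 219, 224, 230, 220, 197, 151],
]
Y = dct2(X)
X_back = idct2(Y)
print(round(Y[0][0]), round(Y[1][0]))   # compare with the first column of the coefficient table
```

Without quantization the inverse recovers X up to floating-point rounding; loss enters only when the coefficients are quantized.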
Interpretation of Transform
Forward transform: y = Tx (x is an N x 1 vector)
  Let $t_i$ be the i-th row of T
  $y_i = t_i x = \langle t_i^T, x \rangle$ (inner product)
  $y_i$ measures the similarity between x and $t_i$
  Higher similarity -> larger transform coefficient
Inverse transform:
  $x = T^T y = \begin{bmatrix} t_0^T & t_1^T & \cdots & t_{N-1}^T \end{bmatrix} y = \sum_{i=0}^{N-1} y_i\, t_i^T$
  x is the weighted combination of the $t_i$.
  Rows of T are called basis vectors.
Interpretation of 2-D Transform
$Y = T X T^T \;\Rightarrow\; X = T^T Y T$
2-D basis matrices: outer products of basis vectors
  $t_i^T t_j, \quad i, j = 0, \ldots, N-1$
X is the weighted combination of basis matrices.
Proof:
  Define $S_{i,j}$: the (i, j)-th entry is Y(i, j), all others are 0. Then
  $X = T^T Y T = T^T \left( \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} S_{i,j} \right) T = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} Y(i,j)\, t_i^T t_j$
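The expansion into basis matrices can be checked numerically. The sketch below rebuilds X as the sum of Y(i, j) times the outer product of rows i and j of T, and compares it with T^T Y T computed entry by entry. The helper names and the small test matrix are assumptions chosen for illustration.

```python
import math

def dct_matrix(n):
    """C[i][j] = a_i * cos((2j + 1) * i * pi / (2N)), a_0 = sqrt(1/N), a_i = sqrt(2/N)."""
    c = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        c.append([a * math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)])
    return c

def outer(u, v):
    """Outer product u^T v of two row vectors given as lists."""
    return [[ui * vj for vj in v] for ui in u]

def reconstruct_from_basis(y):
    """X = sum_i sum_j Y(i, j) * (t_i^T t_j), where t_i is row i of T."""
    n = len(y)
    t = dct_matrix(n)
    x = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            basis = outer(t[i], t[j])
            for r in range(n):
                for c in range(n):
                    x[r][c] += y[i][j] * basis[r][c]
    return x

def inverse_by_matrices(y):
    """X = T^T Y T, written out entry by entry."""
    n = len(y)
    t = dct_matrix(n)
    return [[sum(t[i][r] * y[i][j] * t[j][c] for i in range(n) for j in range(n))
             for c in range(n)] for r in range(n)]

Y = [[2.0, 0.0], [0.0, 0.0]]            # only the DC coefficient is non-zero
print(reconstruct_from_basis(Y))        # a flat block, since the DC basis matrix is constant
```

A block with only the DC coefficient set reconstructs to a constant image, which is why the first basis matrix appears as a uniform patch in the basis figures.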
2-D DCT Basis Matrices
[Figures: basis matrices for the 2-point DCT and for the 4-point DCT]
2-D DCT Basis Matrices: 8-point DCT
Further Exploration
Textbook 8.1-8.5
Other sources
  Introduction to Data Compression by Khalid Sayood
  Vector Quantization and Signal Compression by Allen Gersho and Robert M. Gray
  Digital Image Processing by Rafael C. Gonzalez and Richard E. Woods
  Probability and Random Processes with Applications to Signal Processing by Henry Stark and John W. Woods
  A Wavelet Tour of Signal Processing by Stephane G. Mallat