exercise in the previous class
binary Huffman code? average codeword length?

1

symbol    A      B      C      D      E      F      G      H
prob.     0.363  0.174  0.143  0.098  0.087  0.069  0.045  0.021
codeword  0      100    110    1010   1011   1110   11110  11111

(intermediate node probabilities in the code tree: 0.066, 0.135, 0.185, 0.278, 0.359, 0.637, 1.000)

ACL = 0.363×1 + 0.174×3 + ... + 0.021×5 = 2.660 bit
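The construction above can be checked mechanically. The sketch below (Python; an editor's sketch, not part of the original slides) joins the two least probable trees repeatedly, as in the exercise, and counts how many joins each symbol sits under; that count is its codeword length.

```python
import heapq

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code: each join of the two
    least probable trees pushes every symbol below it one bit deeper."""
    # heap entries: (probability, tie-breaker, symbols in this tree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # every symbol under a join gets one bit deeper
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths

probs = [0.363, 0.174, 0.143, 0.098, 0.087, 0.069, 0.045, 0.021]
lengths = huffman_lengths(probs)
acl = sum(p * l for p, l in zip(probs, lengths))
print(lengths)        # [1, 3, 3, 4, 4, 4, 5, 5] — matches 0, 100, 110, ...
print(round(acl, 3))  # 2.66
```

The sums that appear during the joins (0.066, 0.135, ..., 1.000) are exactly the intermediate node probabilities listed above.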
exercise in the previous class
4-ary Huffman code? [basic idea] join four trees at a time

We may have fewer than 4 trees left in the final round:
one "join" makes 4 − 1 = 3 trees disappear, so add dummy nodes and start with 3k + 1 nodes.

2

symbol    A      B      C      D      E      F      G      H      (dummy)  (dummy)
prob.     0.363  0.174  0.143  0.098  0.087  0.069  0.045  0.021  0*       0*
codeword  a      b      c      da     db     dc     dda    ddb

(code alphabet {a, b, c, d}; intermediate node probabilities: 0.066, 0.320, 1.000)
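The same sketch adapts to the 4-ary case (again an editor's sketch, not from the slides): pad with zero-probability dummy nodes until the count is 3k + 1, then join four trees per step.

```python
import heapq

def huffman_lengths_4ary(probs, q=4):
    """q-ary Huffman codeword lengths (counted in code symbols, e.g. {a,b,c,d}).
    Dummy zero-probability nodes bring the count up to (q-1)k + 1."""
    n = len(probs)
    n_dummy = (1 - n) % (q - 1)            # 2 dummies for n = 8, q = 4
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heap += [(0.0, n + j, []) for j in range(n_dummy)]  # no real symbols below a dummy
    heapq.heapify(heap)
    lengths = [0] * n
    tie = n + n_dummy
    while len(heap) > 1:
        merged_p, merged_syms = 0.0, []
        for _ in range(q):                 # one "join" removes q - 1 trees
            p, _, syms = heapq.heappop(heap)
            merged_p += p
            merged_syms += syms
        for s in merged_syms:
            lengths[s] += 1
        heapq.heappush(heap, (merged_p, tie, merged_syms))
        tie += 1
    return lengths

probs = [0.363, 0.174, 0.143, 0.098, 0.087, 0.069, 0.045, 0.021]
lengths4 = huffman_lengths_4ary(probs)
print(lengths4)   # [1, 1, 1, 2, 2, 2, 3, 3] — matches a, b, c, da, ..., ddb
```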
today’s class
basic properties needed for source coding: uniquely decodable, immediately decodable
Huffman code: construction of Huffman code
extensions of Huffman code: theoretical limit of the "compression", related topics
3
today’s class (detail)
Huffman codes are good, but how good are they?
Huffman codes for extended information sources: possible means to improve the efficiency
Shannon's source coding theorem: the theoretical limit of efficiency
some more variations of Huffman codes: blocks of symbols with variable block length
4
how should we evaluate Huffman codes?
good code: immediately decodable ... "use code trees"; small average codeword length (ACL)
It seems that Huffman’s algorithm gives a good solution.
To see that Huffman codes are really good, we discuss a mathematical limit of the ACL
... under a certain assumption (up to slide 11)
... in the general case (Shannon's theorem)
5
theoretical limit under an assumption
assumption: the encoding is done in a symbol-by-symbol manner,
i.e., we define one codeword for each symbol of the source S.
S produces M symbols with probabilities p1, ..., pM.

Lemma (restricted Shannon's theorem):
1. for any code, the ACL ≥ H1(S)
2. a code with ACL ≤ H1(S) + 1 is constructible
H1(S) is the borderline of “possible” and “impossible”.
6
Shannon’s lemma (bad naming...)
To prove the restricted Shannon's theorem, a small technical lemma (Shannon's lemma) is needed.

Shannon's lemma: for any non-negative numbers q1, ..., qM with q1 + ... + qM ≤ 1,
7
∑_{i=1}^{M} −p_i log2 q_i ≥ ∑_{i=1}^{M} −p_i log2 p_i  ( = H1(S) ),

with equality if and only if p_i = q_i for all i.

reminder: p1, ..., pM are the symbol probabilities, and p1 + ... + pM = 1.
proof (sketch)
8

left hand side − right hand side
= ∑_{i=1}^{M} −p_i log2 q_i + ∑_{i=1}^{M} p_i log2 p_i
= ∑_{i=1}^{M} −p_i log2 (q_i / p_i)
= ∑_{i=1}^{M} (p_i / loge 2) (−loge (q_i / p_i))
≥ ∑_{i=1}^{M} (p_i / loge 2) (1 − q_i / p_i)
= (1 / loge 2) ∑_{i=1}^{M} (p_i − q_i)
= (1 / loge 2) (∑_{i=1}^{M} p_i − ∑_{i=1}^{M} q_i)
= (1 / loge 2) (1 − ∑_{i=1}^{M} q_i)
≥ 0

The inequality uses −loge x ≥ 1 − x (the line y = 1 − x lies below y = −loge x, touching it at x = 1); equality holds iff q_i / p_i = 1 for all i.
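A quick numerical check of the lemma (an editor's sketch, not part of the slides): for the distribution from the exercise, the sum ∑ −p_i log2 q_i never drops below H1(S), whatever valid q we draw.

```python
import math
import random

def cross_sum(p, q):
    """Computes sum of -p_i * log2(q_i); equals H1(S) when q = p."""
    return sum(-pi * math.log2(qi) for pi, qi in zip(p, q))

p = [0.363, 0.174, 0.143, 0.098, 0.087, 0.069, 0.045, 0.021]
h1 = cross_sum(p, p)                     # H1(S), about 2.590 bit
random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in p]
    q = [wi / sum(w) for wi in w]        # non-negative, sums to 1
    assert cross_sum(p, q) >= h1 - 1e-12   # Shannon's lemma holds
print(round(h1, 3))
```

Note that the Huffman ACL from slide 1, 2.660 bit, indeed sits between H1(S) ≈ 2.590 and H1(S) + 1, as the restricted theorem promises.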
proof of the restricted Shannon’s theorem: 1
for any code, the average codeword length ≥ H1(S)
Let l1, ..., lM be the lengths of the codewords, and define q_i = 2^{−l_i} (equivalently, l_i = −log2 q_i).

9

Kraft's inequality gives ∑_{i=1}^{M} q_i = ∑_{i=1}^{M} 2^{−l_i} ≤ 1, so Shannon's lemma applies to q1, ..., qM:

L = ∑_{i=1}^{M} p_i l_i = ∑_{i=1}^{M} −p_i log2 q_i ≥ ∑_{i=1}^{M} −p_i log2 p_i = H1(S).

We have shown that the ACL L ≥ H1(S).
proof of the restricted Shannon’s theorem: 2
a code with average codeword length ≤ H1(S)+1 is constructible
Choose integers l1, ..., lM so that −log2 p_i ≤ l_i < −log2 p_i + 1, i.e., l_i = ⌈−log2 p_i⌉.

10

The choice makes 2^{−l_i} ≤ p_i, so ∑_{i=1}^{M} 2^{−l_i} ≤ ∑_{i=1}^{M} p_i = 1 ... Kraft's inequality.

We can construct a code with codeword lengths l1, ..., lM, and its ACL is

L = ∑_{i=1}^{M} p_i l_i < ∑_{i=1}^{M} p_i (−log2 p_i + 1) = H1(S) + 1.
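The construction in the proof (l_i = ⌈−log2 p_i⌉, the Shannon-style length choice) can be sketched for the distribution from slide 1; this is an editor's sketch, not part of the slides.

```python
import math

p = [0.363, 0.174, 0.143, 0.098, 0.087, 0.069, 0.045, 0.021]
lengths = [math.ceil(-math.log2(pi)) for pi in p]   # integer l_i >= -log2 p_i
kraft = sum(2.0 ** -l for l in lengths)
h1 = sum(-pi * math.log2(pi) for pi in p)
acl = sum(pi * li for pi, li in zip(p, lengths))
print(lengths)               # [2, 3, 3, 4, 4, 4, 5, 6]
assert kraft <= 1.0          # Kraft: a prefix code with these lengths exists
assert h1 <= acl < h1 + 1    # ACL < H1(S) + 1, as the theorem promises
```

For comparison, the ACL here is about 3.04 bit: below H1(S) + 1 ≈ 3.59 as guaranteed, but worse than the 2.660 bit of the Huffman code on slide 1.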
Lemma (restricted Shannon’s theorem):1. for any code, the ACL ≥ H1(S)
2. a code with ACL ≤ H1(S)+1 is constructible
We can show that, for a Huffman code, L ≤ H1(S) + 1
there is no symbol-by-symbol code whose ACL is smaller than L
(proof ... by induction on the size of code trees).
A Huffman code is said to be a compact code.
11
coding for extended information sources
The Huffman code is the best symbol-by-symbol code, but...
the ACL is always ≥ 1: not good for encoding binary information sources.

12

(figure: symbol-by-symbol encoding of a stream, e.g. A B A C C A → 0 10 0 11 11 0)

symbol   A    B
prob.    0.8  0.2
C1       0    1     average 1.0
C2       1    0     average 1.0

If we encode several symbols in a block, then...
the ACL per symbol can be < 1: good for binary sources also
(figure: block encoding, e.g. AB → 10, ACC → 110, A → 01)
block Huffman coding
13
block types: fixed-length (equal-, constant-length) and variable-length (unequal-length); for variable-length blocks there are block partition and run-length approaches.

message → "block" operation → blocked message → Huffman encoding → codewords

example: ABCBCBBCAA... → AB CBC BB CAA... → 01 10 001 1101...
fixed-length block Huffman coding
one symbol at a time:

symbol    A    B    C
prob.     0.6  0.3  0.1
codeword  0    10   11

ACL: 0.6×1 + 0.3×2 + 0.1×2 = 1.4 bit for one symbol

14

blocks of two symbols:

block     AA    AB    AC    BA    BB    BC     CA    CB      CC
prob.     0.36  0.18  0.06  0.18  0.09  0.03   0.06  0.03    0.01
codeword  0     100   1100  101   1110  11110  1101  111110  111111

ACL: 0.36×1 + ... + 0.01×6 = 2.67 bit, but this is for two symbols
2.67 / 2 = 1.335 bit for one symbol ... improved!
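The 1.335-bit figure can be reproduced with a small sketch (Python; an editor's addition). It uses the fact that each join adds one bit to every symbol below it, so the ACL of a Huffman code equals the sum of the merged probabilities over all joins.

```python
import heapq
from itertools import product

def huffman_acl(probs):
    """ACL of a binary Huffman code: every join of two subtrees adds one
    bit to each symbol below it, so the ACL is the sum of the merged
    probabilities over all joins."""
    heap = list(probs)
    heapq.heapify(heap)
    acl = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        acl += a + b                  # this join's contribution to the ACL
        heapq.heappush(heap, a + b)
    return acl

single = [0.6, 0.3, 0.1]
pairs = [pa * pb for pa, pb in product(single, repeat=2)]  # i.i.d. two-symbol blocks
print(round(huffman_acl(single), 3))      # 1.4 bit per symbol
print(round(huffman_acl(pairs) / 2, 3))   # 1.335 bit per symbol
```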
block coding for binary sources
one symbol at a time:

symbol    A    B
prob.     0.8  0.2
codeword  0    1

ACL: 0.8×1 + 0.2×1 = 1.0 bit for one symbol

15

blocks of two symbols:

block     AA    AB    BA    BB
prob.     0.64  0.16  0.16  0.04
codeword  0     10    110   111

ACL: 0.64×1 + ... + 0.04×3 = 1.56 bit for two symbols
1.56 / 2 = 0.78 bit for one symbol ... improved!
the block length
blocks of three symbols:

block     AAA    AAB    ABA    ABB    BAA    BAB    BBA    BBB
prob.     0.512  0.128  0.128  0.032  0.128  0.032  0.032  0.008
codeword  0      100    101    11100  110    11101  11110  11111

ACL: 0.512×1 + ... + 0.008×5 = 2.184 bit for three symbols
2.184 / 3 = 0.728 bit for one symbol

16

block size      1    2     3     ...
ACL per symbol  1.0  0.78  0.728 ...

larger block size → more compact
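The table of per-symbol ACLs can be checked with the same merge-sum trick (an editor's sketch, not from the slides):

```python
import heapq
import math
from itertools import product

def huffman_acl(probs):
    """ACL of a binary Huffman code = sum of merged probabilities over all joins."""
    heap = list(probs)
    heapq.heapify(heap)
    acl = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        acl += a + b
        heapq.heappush(heap, a + b)
    return acl

p = [0.8, 0.2]
per_symbol = {}
for n in (1, 2, 3):
    # block probabilities of an i.i.d. source: products over n-tuples
    blocks = [math.prod(c) for c in product(p, repeat=n)]
    per_symbol[n] = huffman_acl(blocks) / n
print({n: round(v, 3) for n, v in per_symbol.items()})
# {1: 1.0, 2: 0.78, 3: 0.728}
```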
block code and extension of information source
What happens if we increase the block length further?
Observe that...
a block code defines a codeword for each block pattern;
one block = a sequence of n symbols of S = one symbol of S^n, the n-th order extension of S.

The restricted Shannon's theorem is applicable:

H1(S^n) ≤ Ln < H1(S^n) + 1,   where Ln = the ACL for n symbols.

17

For one symbol of S, divide by n:

H1(S^n) / n ≤ Ln / n < H1(S^n) / n + 1 / n
Shannon’s source coding theorem
H1(S^n) / n ... the n-th order entropy of S (→ Apr. 12)

If n goes to infinity, the n-th order entropy converges to H(S), the entropy of the source, and the 1/n slack vanishes.

18

Shannon's source coding theorem:
1. for any code, the ACL ≥ H(S)
2. for any ε > 0, a code with ACL ≤ H(S) + ε is constructible
what the theorem means
Shannon's source coding theorem:
1. for any code, the ACL ≥ H(S)
2. a code with ACL ≤ H(S) + ε is constructible

Use block Huffman codes, and you can approach the limit.
You can never beat the limit, however.
19
symbol  A    B
prob.   0.8  0.2

block size      1    2     3     ...
ACL per symbol  1.0  0.78  0.728 ... → H(S) + ε

H(S) = −0.8 log2 0.8 − 0.2 log2 0.2 ≈ 0.722
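The convergence can be observed directly (an editor's sketch). For a memoryless source, H1(S^n) = n·H(S), so the theorem sandwiches the per-symbol ACL between H(S) and H(S) + 1/n.

```python
import heapq
import math
from itertools import product

def huffman_acl(probs):
    """ACL of a binary Huffman code = sum of merged probabilities over all joins."""
    heap = list(probs)
    heapq.heapify(heap)
    acl = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        acl += a + b
        heapq.heappush(heap, a + b)
    return acl

p = [0.8, 0.2]
H = sum(-pi * math.log2(pi) for pi in p)
print(round(H, 4))                        # 0.7219
for n in (1, 2, 4, 8):
    blocks = [math.prod(c) for c in product(p, repeat=n)]
    per_sym = huffman_acl(blocks) / n
    # sandwiched by the theorem: H <= per-symbol ACL < H + 1/n
    assert H - 1e-9 <= per_sym < H + 1 / n
    print(n, round(per_sym, 4))
```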
remark 1
Why do block codes give a smaller ACL?

20

fact 1: the ACL is minimized by a real-number solution.
If P(A) = 0.8 and P(B) = 0.2, then we want l1 and l2 that
minimize 0.8 l1 + 0.2 l2  subject to  2^{−l1} + 2^{−l2} ≤ 1;
the real-number optimum is l1 = −log2 0.8 ≈ 0.32 and l2 = −log2 0.2 ≈ 2.32.

fact 2: the length of a codeword must be an integer:
minimize 0.8 l1 + 0.2 l2  subject to  2^{−l1} + 2^{−l2} ≤ 1 and l1, l2 integers,
which gives l1 = l2 = 1: the frequent symbol A is rounded up (0.32 → 1 ... loss!),
the rare symbol B is rounded down (2.32 → 1 ... gain!).

frequent loss, seldom gain ... on average the code gets longer.
remark 1 (cont'd)

the gap between the ideal and the real codeword lengths:
l_i is an integer approximation of −log2 p_i,
and the gap is weighted by the probability p_i.

21

(figure: the probability-weighted gap plotted against p from 0 to 1)

long block → many block patterns → small probabilities → small weighted gaps → close to the ideal ACL
today’s class (detail)
Huffman codes are good, but how good are they?
Huffman codes for extended information sources: possible means to improve the efficiency
Shannon's source coding theorem: the theoretical limit of efficiency
some more variations of Huffman codes: blocks of symbols with variable block length
22
practical issues of block coding

Theoretically speaking, block Huffman codes are the best.
From a practical viewpoint, there are several problems:

We need to know the probability distribution in advance
(this will be discussed in the next class).

We need a large table for the encoding/decoding;
if one byte is needed to record one entry of the table...
– 256-byte table, if block length = 8
– 64-Kbyte table, if block length = 16
– 4-Gbyte table, if block length = 32
23
use variable-length blocks

If we define blocks so that they have the same length, then...
some blocks have small probabilities, and those blocks also need codewords.

If we define blocks so that they have similar probabilities, then...
the length differs from block to block, and the table has few useless entries.

24

fixed-length blocks:
block     AAA    AAB    ABA    ABB    BAA    BAB    BBA    BBB
prob.     0.512  0.128  0.128  0.032  0.128  0.032  0.032  0.008
codeword  0      100    101    11100  110    11101  11110  11111

variable-length blocks:
block     AAA    AAB    AB    B
prob.     0.512  0.128  0.16  0.2
codeword  0      100    101   11
definition of block patterns
Block patterns must be defined so that...
the patterns can represent (almost) all symbol sequences.

25

bad example: block patterns = {AAA, AAB, AB}
AABABAAB → AAB AB AAB ... OK
AABBBAAB → AAB ? ... stuck: no pattern starts with B, so BB cannot be represented

Two different approaches are well known:
block partition approach
run-length approach
define patterns with block partition approach
1. prepare all blocks with length one
2. partition the block with the largest probability by appending one more symbol
3. go to 2

Example: P(A) = 0.8, P(B) = 0.2

26

A (0.8), B (0.2)
→ AA (0.64), AB (0.16), B (0.2)
→ AAA (0.512), AAB (0.128), AB (0.16), B (0.2)

codewords: AAA → 0, AAB → 100, AB → 101, B → 11
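The partition procedure above can be sketched in a few lines (an editor's sketch; the function name is mine):

```python
def build_blocks(probs, n_blocks=4):
    """Start from the length-1 blocks; repeatedly split the most probable
    block by appending each source symbol, until n_blocks patterns exist.
    Each split replaces one pattern by |alphabet| longer ones."""
    blocks = dict(probs)                   # pattern -> probability
    while len(blocks) < n_blocks:
        top = max(blocks, key=blocks.get)  # the most probable pattern
        p = blocks.pop(top)
        for s, ps in probs.items():
            blocks[top + s] = p * ps
    return blocks

print({b: round(p, 3) for b, p in build_blocks({"A": 0.8, "B": 0.2}).items()})
# {'B': 0.2, 'AB': 0.16, 'AAA': 0.512, 'AAB': 0.128}
```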
how good is this?
To determine the average codeword length, assume that n blocks are produced from S:

27

block     AAA    AAB    AB    B
prob.     0.512  0.128  0.16  0.2
codeword  0      100    101   11

S → AAA AB AAA B AB ... → encode → 0 101 0 11 101 ...

symbols: 0.512n×3 + 0.128n×3 + 0.16n×2 + 0.2n×1 = 2.44n symbols
bits:    0.512n×1 + 0.128n×3 + 0.16n×3 + 0.2n×2 = 1.776n bits

2.44n symbols are encoded into 1.776n bits, so the average codeword length is 1.776n / 2.44n ≈ 0.728 bit
(almost the same as the fixed-length block codes of p. 16, but with a much smaller table)
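The arithmetic above is easy to check (an editor's sketch; the dict layout is mine, the numbers are from the table):

```python
# block pattern -> (probability, codeword), from the table above
blocks = {"AAA": (0.512, "0"),
          "AAB": (0.128, "100"),
          "AB":  (0.16,  "101"),
          "B":   (0.2,   "11")}

# expected source symbols and code bits contributed by one block
symbols_per_block = sum(p * len(pat) for pat, (p, _) in blocks.items())
bits_per_block = sum(p * len(code) for (p, code) in blocks.values())
print(round(symbols_per_block, 2))                    # 2.44
print(round(bits_per_block, 3))                       # 1.776
print(round(bits_per_block / symbols_per_block, 3))   # 0.728
```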
define patterns with run-length approach
run = a sequence of consecutive identical symbols

28

Example: divide a message into runs of "A" (each run terminated by a "B"):

A B B A A A A A B A A A B
run of length 1, run of length 0, run of length 5, run of length 3

The message is reconstructible if the lengths of the runs are given.
→ define blocks as runs of various lengths
upper-bound the run-length
small problem? ... there can be a very long run
→ put an upper bound on the length: run-length limited (RLL) coding

29

upper bound = 3:
run length      0   1   2   3     4     5     6       7       ...
representation  0   1   2   3+0   3+1   3+2   3+3+0   3+3+1   ...

ABBAAAAABAAAB is represented as:
one "A" followed by B
zero "A"s followed by B
three or more "A"s
two "A"s followed by B          (the run of length 5 = 3 + 2)
three or more "A"s
zero "A"s followed by B          (the run of length 3 = 3 + 0)
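The run-length-limited representation can be sketched as follows (an editor's sketch; the function name is mine). Each run of "A" is emitted as zero or more bound-markers plus a remainder smaller than the bound:

```python
def run_lengths(message, symbol="A", bound=3):
    """RLL representation of the runs of `symbol`: every time a run
    reaches `bound`, emit the marker `bound` and keep counting; the
    remainder (< bound) is emitted when the other symbol arrives.
    Assumes the message ends with the terminating symbol."""
    out = []
    run = 0
    for ch in message:
        if ch == symbol:
            run += 1
            if run == bound:     # "three or more" marker, keep counting
                out.append(bound)
                run = 0
        else:
            out.append(run)      # remainder, terminated by the other symbol
            run = 0
    return out

print(run_lengths("ABBAAAAABAAAB"))  # [1, 0, 3, 2, 3, 0]
```

The output matches the listing above: runs of length 1, 0, 5 (= 3 + 2), and 3 (= 3 + 0).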
run-length Huffman code
a Huffman code defined to encode the lengths of runs;
effective when there is a strong bias in the symbol probabilities