Error-Correcting Codes
Fall 2013, Mao-Chao Lin
References
1. Shu Lin and Daniel J. Costello, "Error Control Coding: Fundamentals and Applications," Pearson/Prentice Hall, second edition, 2004.
2. Richard E. Blahut, "Theory and Practice of Error Control Codes," Addison-Wesley, 1983.
3. William E. Ryan and Shu Lin, "Channel Codes: Classical and Modern," Cambridge, 2009.
Grading
(a) Quiz: 20%
(b) Homework: 10%
(c) Midterm: 35%
(d) Final exam: 35%
Contents
1. Fundamentals
2. Introduction to Algebra
3. Linear Block Codes
4. Important Linear Block Codes
5. Cyclic Codes
6. BCH Codes and Reed-Solomon Codes
7. Convolutional Codes
8. Coded Modulation
9. Turbo Codes
10. Low Density Parity-Check Codes
11. Soft Decoding of Linear Block Codes
Chapter 1 Fundamentals
Basic Coded Communication System:
⋆ Source encoder/decoder: For reducing redundancy without distortion or with distortion.
⋆ Channel encoder/decoder: For increasing transmission reliability by adding redundancy.
Binary Symmetric Channels (BSC)
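[Figure: BSC transition diagram. Input v ∈ {0, 1}, output r ∈ {0, 1}; each input symbol is received unchanged with probability 1 − p and flipped with probability p.]
⋆ Transition probabilities:
P(r = 1 | v = 0) = P(r = 0 | v = 1) = p
P(r = 0 | v = 0) = P(r = 1 | v = 1) = 1 − p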
Additive Gaussian Noise Channel :
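[Figure: additive noise channel. The noise N is added to the input v to give the output v + N.]
$$P(N = n) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-n^2/2\sigma^2}$$
⋆ Both the binary symmetric channel and the additive Gaussian noise channel are classified as discrete-time memoryless channels.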
Additive White Gaussian Noise Channel (AWGN Channel):
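[Figure: additive noise channel. The noise N is added to the input v to give the output v + N.]
$$P(N = n) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-n^2/2\sigma^2}$$
$$\sigma^2 = N_0 B$$
$N_0$: One-sided power spectral density of the AWGN channel
$B$: Bandwidth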
MODULO OPERATION
⋆ a = cb + d ⇒ a ≡ d (mod b),
where a, c ∈ Z, d ∈ N, b ∈ N0, and Z, N, N0 are the sets of integers, natural numbers and positive integers, respectively.
Example 1.1:
71 ≡ 1 (mod 7)
71 ≡ 1 (mod 10)
BLOCK CODE
⋆ An (n, k) block code C uniquely maps a block of information symbols of length k, i.e., $u_0, u_1, \cdots, u_{k-1}$, to a codeword of length n, i.e., $v_0, v_1, \cdots, v_{n-1}$.
⋆ The number of redundancy symbols is n − k.
⋆ The ratio R = k/n is the code rate.
SINGLE PARITY CHECK CODE
⋆ Let $n = k + 1$, $v = (v_0, v_1, \cdots, v_{n-1})$, $u = (u_0, u_1, \cdots, u_{k-1})$. Suppose $v_0 = u_0, v_1 = u_1, \cdots, v_{k-1} = u_{k-1}$ and $v_{n-1} = v_0 + v_1 + \cdots + v_{k-1} \pmod 2$. Then, we have a single parity check code.
⋆ Let k = 2 and n = 3. Then,
v0 = u0   v1 = u1   v2 = u0 + u1
   0         0          0
   0         1          1
   1         0          1
   1         1          0
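The encoding rule above can be sketched in a few lines of Python. This is only an illustration; the function name `encode_spc` is a hypothetical helper, not part of the course material. It appends the modulo-2 sum of the information bits and lists the four codewords for k = 2.

```python
from itertools import product

def encode_spc(u):
    """Append one parity bit (modulo-2 sum of u) to the information bits."""
    parity = sum(u) % 2
    return list(u) + [parity]

# Enumerate the (3, 2) single parity check code.
codebook = {bits: encode_spc(bits) for bits in product((0, 1), repeat=2)}
for u, v in codebook.items():
    print(u, "->", v)
```

Every codeword produced this way has even weight, which is what the parity bit enforces.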
REPETITION CODE
⋆ Suppose that
$u_0 = v_0 = v_1 = \cdots = v_{n-1} = 0$
or
$u_0 = v_0 = v_1 = \cdots = v_{n-1} = 1$.
We have an (n, 1) repetition code.
Componentwise Vector Addition:
⋆ Let $c = (c_0, c_1, \cdots, c_{n-1})$ and $a = (a_0, a_1, \cdots, a_{n-1})$, where $c_j, a_j \in F_2 = \{0, 1\}$. The addition of c and a is
$$c + a = (c_0 + a_0, \cdots, c_{n-1} + a_{n-1}).$$
Linear code:
⋆ A code C is linear if all linear combinations of codewords are again codewords.
Scalar product:
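⋆ The scalar (inner) product of $a, b \in F_2^n$ is
$$\langle a, b \rangle = \sum_{i=0}^{n-1} a_i b_i \pmod 2$$
Hamming Weight:
⋆ The (Hamming) weight of a vector v is
$$\mathrm{wt}(v) = \sum_{j=0}^{n-1} \mathrm{wt}(v_j),$$
where
$$\mathrm{wt}(v_j) = \begin{cases} 0 & v_j = 0 \\ 1 & v_j \neq 0 \end{cases}$$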
Hamming Distance:
⋆ The (Hamming) distance between two vectors a and c is
$$\mathrm{dist}(a, c) = \mathrm{wt}(a - c) = \sum_{j=0}^{n-1} \mathrm{wt}(a_j - c_j).$$
⋆ Example 1.2:
Let c = (0011010) and a = (1010000). Then, wt(c) = 3, wt(a) = 2, and
a + c = (1001010), wt(a + c) = dist(a, c) = 3.
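As a quick check of Example 1.2, the weight and distance definitions can be written directly in Python. The function names below are illustrative choices, not notation from the course.

```python
def hamming_weight(v):
    """Number of nonzero components of v."""
    return sum(1 for x in v if x != 0)

def hamming_distance(a, c):
    """Number of positions in which a and c differ."""
    if len(a) != len(c):
        raise ValueError("vectors must have the same length")
    return sum(1 for x, y in zip(a, c) if x != y)

c = [0, 0, 1, 1, 0, 1, 0]
a = [1, 0, 1, 0, 0, 0, 0]
a_plus_c = [(x + y) % 2 for x, y in zip(a, c)]  # componentwise addition over F_2

print(hamming_weight(c), hamming_weight(a))
print(a_plus_c, hamming_weight(a_plus_c), hamming_distance(a, c))
```

Over F_2, subtraction and addition coincide, so wt(a + c) and dist(a, c) agree.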
Weight Enumerator :
⋆ Let $w_j$ be the number of codewords of a code C with weight j. The weight enumerator is $W(x) = \sum_{j=0}^{n} w_j x^j$.
⋆ For the (3, 2) single parity check code given earlier, we have $w_0 = 1$, $w_1 = 0$, $w_2 = 3$, $w_3 = 0$, i.e., $W(x) = 1 + 3x^2$.
Minimum Distance:
⋆ The minimum distance d of a code C is the minimum distance between any two different codewords, i.e.,
$$d = \min_{a, c \in C,\; a \neq c} \{\mathrm{dist}(a, c)\}$$
⋆ If C is linear, then
$$d = \min_{a, c \in C,\; a \neq c} \{\mathrm{dist}(a, c)\} = \min_{c \in C,\; c \neq 0} \{\mathrm{wt}(c)\}$$
Example 1.3:
Let C = {(000), (111)}. Then, d = 3 and R = 1/3. Intuitively, the set Z of all possible received vectors can be divided into two disjoint decoding regions
X = {(000), (001), (010), (100)}
and
Y = {(011), (101), (110), (111)}.
⋆ Any received vector r ∈ X will be decoded into the codeword (000) ∈ C and any received vector r ∈ Y will be decoded into the codeword (111) ∈ C.
⋆ The error-correcting capability of C is $t = \lfloor (d-1)/2 \rfloor = 1$.
Example 1.4:
Let C = {(00000), (11111)}. Then d = 5, R = 1/5 and $t = \lfloor (d-1)/2 \rfloor = 2$.
Example 1.5:
Let $C = \{(0, 0, \cdots, 0), (1, 1, \cdots, 1)\} \subset F_2^n$ and n = 2t + 1. Then, d = 2t + 1 and R = 1/(2t + 1).
⋆ Apparently, R → 0 as t → ∞. Must the code rate go to zero in order to achieve zero error probability?
Example 1.6:
Consider a (7, 4) Hamming code, where n = 7, k = 4, d = 3, R = 4/7.
u v u v
0000 0000000 0001 1010001
1000 1101000 1001 0111001
0100 0110100 0101 1100101
1100 1011100 1101 0001101
0010 1110010 0011 0100011
1010 0011010 1011 1001011
0110 1000110 0111 0010111
1110 0101110 1111 1111111
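The table above can be reproduced, and its minimum distance checked, with a short Python sketch. It assumes the code is linear and takes as generator rows the four codewords listed for u = 1000, 0100, 0010 and 0001; the names `G` and `encode` are illustrative.

```python
from itertools import product
from collections import Counter

# Rows are the codewords listed for u = 1000, 0100, 0010, 0001.
G = [
    [1, 1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 0, 1],
]

def encode(u):
    """Codeword v = u G with arithmetic modulo 2."""
    return tuple(sum(u[i] * G[i][j] for i in range(4)) % 2 for j in range(7))

codewords = [encode(u) for u in product((0, 1), repeat=4)]
weights = Counter(sum(v) for v in codewords)      # weight distribution w_j
d_min = min(w for w in weights if w > 0)          # linear code: d = min nonzero weight
print(sorted(weights.items()), d_min)
```

Because the code is linear, the minimum distance is found from the minimum nonzero weight rather than from all pairs of codewords.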
Nearest-Neighbor Decoding:
⋆ Let $v_1$ be the transmitted codeword, r be the received vector and e be the error vector. The codeword $v_1$ can be correctly recovered if $\mathrm{wt}(e) = \mathrm{wt}(r + v_1) \le t$, or equivalently $\mathrm{wt}(e) \le \lfloor (d-1)/2 \rfloor$.
⋆ For any (n, k, d) code C, up to $t = \lfloor (d-1)/2 \rfloor$ errors can always be corrected, or up to d − 1 errors can always be detected.
⋆ An (n, k, d) code C is a subset of $\{0, 1\}^n$ with size $|C| = 2^k$, for which the minimum distance between any two codewords is d.
⋆ For a given d, it is desired to find a subset C of $\{0, 1\}^n$ such that |C| is as large as possible. For a given k, it is desired to find a subset C of $\{0, 1\}^n$ such that $|C| = 2^k$ and d is as large as possible.
Shannon’s Channel Coding Theorem:
Given a channel and a source that generates information at a rate less than the channel capacity, it is possible to transmit the information through the channel such that the error rate is as close to zero as we like.
⋆ The capacity of a BSC with transition probability p is $C_p = 1 - H(p)$ bits per transmission, where $H(p) = -p \log_2 p - (1-p) \log_2 (1-p)$.
Example:
Let p = 0.1. Then, $C_p = 0.531$.
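The capacity value in the example can be reproduced numerically. The sketch below is a direct transcription of the formula; the function names are illustrative, and the convention 0 · log 0 = 0 at p = 0 and p = 1 is an assumption made explicit in the code.

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a BSC with transition probability p, in bits per transmission."""
    return 1.0 - binary_entropy(p)

print(round(bsc_capacity(0.1), 3))
```

At p = 0.5 the output is independent of the input and the capacity drops to zero.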
⋆ The capacity of an AWGN channel is $C_p = B \log_2\left(1 + \frac{S}{N}\right)$ bits per second, where B is the channel bandwidth in Hertz and S/N is the signal-to-noise ratio. Let $E_b = S T_b = S/C_p$ and $N = N_0 B$, where $T_b$ is the bit interval. Then,
$$\frac{C_p}{B} = \log_2\left(1 + \frac{E_b}{N_0} \frac{C_p}{B}\right)$$
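Writing η = C_p/B for the spectral efficiency, the relation above can be solved for the required E_b/N_0, giving E_b/N_0 = (2^η − 1)/η. The following sketch evaluates this in decibels for a few values of η; the symbol η and the function name are introduced here for illustration only.

```python
import math

def min_ebn0_db(eta):
    """Eb/N0 in dB required at spectral efficiency eta = C_p / B > 0."""
    if eta <= 0:
        raise ValueError("eta must be positive")
    return 10 * math.log10((2 ** eta - 1) / eta)

for eta in (2.0, 1.0, 0.5, 0.01):
    print(eta, round(min_ebn0_db(eta), 2))

# As eta -> 0 the ratio (2^eta - 1) / eta tends to ln 2.
print(round(10 * math.log10(math.log(2)), 2))
```

As η decreases toward zero the required E_b/N_0 approaches 10 log10(ln 2), roughly −1.59 dB.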
Decoding Regions:
Let v ∈ C be the transmitted codeword. Let $r = v + e \in \{0, 1\}^n$ be the received vector, where e is the error vector (or error pattern). The decoder divides the space $\{0, 1\}^n$ into $2^k$ disjoint decoding regions, $R_1, R_2, \cdots, R_{2^k}$. If $r \in R_i$, then r is decoded into the codeword $v_i$, $i = 1, 2, \cdots, 2^k$.
⋆ Usually, all the decoding regions are of the same size. Among the $2^n$ possible error patterns, only $2^{n-k}$ of them yield the correct decoding.
⋆ Consider the (7, 4) Hamming code given in Example 1.6. Suppose that this code is applied over a BSC with transition probability p. The $2^{n-k} = 8$ correctable error patterns are
{(0000000), (1000000), (0100000), (0010000), (0001000), (0000100), (0000010), (0000001)}.
The probability of erroneous decoding is
$$E = 1 - (1-p)^7 - 7p(1-p)^6.$$
For $p = 10^{-3}$, we have $E = 2.1 \times 10^{-5}$.
For $p = 10^{-4}$, we have $E = 2.1 \times 10^{-7}$.
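The two numerical values can be checked with a direct evaluation of E. The sketch below assumes, as in the text, that exactly the all-zero pattern and the n single-error patterns are decoded correctly; the function name is illustrative.

```python
def decoding_error_probability(p, n=7):
    """Probability of two or more bit errors in n uses of a BSC."""
    return 1 - (1 - p) ** n - n * p * (1 - p) ** (n - 1)

for p in (1e-3, 1e-4):
    print(p, f"{decoding_error_probability(p):.2e}")
```

For small p the result is dominated by the double-error term, about 21 p^2, which explains why E falls by a factor of 100 when p falls by a factor of 10.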
Convolutional Codes:
For an (n, k) block code, at each time, the output of the encoder is an n-tuple which depends only on the current k-tuple input.
For an (n, k) convolutional code, at each time, the n-tuple output depends not only on the current k-tuple input but also on the previous m input blocks.
Example: a (2, 1) convolutional code, with rate R = k/n = 1/2 and memory order m = 2.
Hard Decision and Soft Decision
Consider a Gaussian channel with binary inputs A and −A. If a hard decision is made on the output, then the Gaussian channel is reduced to a BSC.
Example 1.7:
Let C = {(−1, −1), (1, 1)}. Suppose that r = (0.1, −1.7). If hard decoding is used, the hard-decision vector (1, −1) is at Hamming distance 1 from both codewords, so no decision on the decoded codeword can be made. If soft decoding is used, r is closer in Euclidean distance to (−1, −1), so (−1, −1) can be decoded.
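Example 1.7 can be worked through numerically. The sketch below assumes a sign-based hard decision (nonnegative values map to +1) and minimum squared Euclidean distance as the soft-decision rule; both helper names are illustrative.

```python
def squared_euclidean(r, v):
    """Squared Euclidean distance between received vector r and codeword v."""
    return sum((ri - vi) ** 2 for ri, vi in zip(r, v))

def hard_decision(r):
    """Quantize each received value to +1 or -1 by its sign."""
    return tuple(1 if ri >= 0 else -1 for ri in r)

def hamming(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)

C = [(-1, -1), (1, 1)]
r = (0.1, -1.7)

h = hard_decision(r)
hard_metrics = [hamming(h, v) for v in C]            # a tie: both codewords at distance 1
soft_metrics = [squared_euclidean(r, v) for v in C]  # about 1.7 versus 8.1
soft_choice = min(C, key=lambda v: squared_euclidean(r, v))
print(h, hard_metrics, soft_metrics, soft_choice)
```

The hard decision discards the fact that 0.1 is an unreliable sample while −1.7 is a reliable one; the soft metric keeps that information.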
Coding over the Gaussian Channel
$r_i = v_i + n_i$
$n_i$: zero-mean Gaussian noise with variance $\sigma^2$.
$$P(r_i | v_i) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(v_i - r_i)^2}{2\sigma^2}\right)$$
⋆ The squared Euclidean distance between two sequences v and v′ is
$$[d_E(v, v')]^2 = \sum_{i=0}^{N-1} |v_i - v'_i|^2$$
⋆ Coding with large Euclidean distance is desired.
⋆ For a rate R code, the asymptotic coding gain is
$$G = 10 \log_{10}\left(\frac{d^2_{E,\mathrm{coded}}}{d^2_{E,\mathrm{uncoded}}}\, R\right)$$
Soft-in/Soft-out Decoder
$L_c y$: channel values for all code bits
$L_e(\hat{u})$: extrinsic values for all information bits
$L(\hat{u})$: a posteriori values for all information bits
$L(u)$: a priori values for all information bits
⋆ For a systematic code, the soft output for an information bit u will be
$$L(\hat{u}) = L(u|y) = \log \frac{P(u = +1 | y)}{P(u = -1 | y)} = L_c y + L(u) + L_e(\hat{u})$$
Turbo Encoder
⋆ Let $R_1$ and $R_2$ be the code rates of RSC code 1 and RSC code 2, respectively.
⋆ Suppose $R_1 = R_2 = 1/2$. The overall code rate will be R = 1/3.
⋆ By puncturing RSC code 1 and RSC code 2, we can have $R_1 = R_2 = 2/3$ and the overall code rate is R = 1/2.
⋆ Nonuniform interleaving is preferred.
⋆ The size M of the interleaver is critical.
Iterative Decoding
8PSK Signals
Each signal point in the 8PSK signal set (constellation) is labelled by 3 bits (a, b, c).
TCM (Trellis Coded Modulation) Encoder
4-state case
8-state case
Low Density Parity-Check Codes
⋆ A (t, s)-regular low density parity-check (LDPC) code is a binary block code whose parity-check matrix H has the following properties:
1. All column vectors of H have the same weight t.
2. All row vectors of H have the same weight s.
⋆ For an irregular Tanner graph, denote the symbol degree sequence and the parity-check degree sequence respectively by
$$D_s = \{d_{s_0}, d_{s_1}, \cdots, d_{s_{n-1}}\}$$
and
$$D_c = \{d_{c_0}, d_{c_1}, \cdots, d_{c_{m-1}}\}$$
where $d_{s_j}$ is the degree of the symbol node $s_j$ and $d_{c_i}$ is the degree of the check node $c_i$.
Types of Errors:
1. Random errors;
2. Burst errors.
Error Control Strategies:
1. ARQ (automatic repeat request)
2. FEC (forward error correction)
3. Hybrid.
Chapter 2 Introduction to Algebra
⋆ Let G be a set. A binary operation ∗ on G is a rule that assigns to each pair of elements a and b in G a uniquely defined third element c = a ∗ b in G.
⋆ Let I be the set of integers. "+", "−" and "×" are binary operations on I and "÷" is not.
Definition 2.1: A set G on which a binary operation ∗ is defined is called a group if
1. a ∗ (b ∗ c) = (a ∗ b) ∗ c for all a, b, c ∈ G.
2. There exists e ∈ G such that a ∗ e = e ∗ a = a for all a ∈ G, where e is the identity of G.
3. For any a ∈ G, there exists a′ ∈ G such that a ∗ a′ = a′ ∗ a = e, where a′ is the inverse of a.
⋆ A group G is commutative (abelian) if a ∗ b = b ∗ a for all a, b ∈ G.
Theorem 2.1: The identity in a group is unique.
Proof: Let e and e′ be identities in G. Then, e = e ∗ e′ = e′ ∗ e = e′.
Theorem 2.2: The inverse of a group element is unique.
Proof: Suppose a′ and a′′ are both inverses of a. Then,
a′ = a′ ∗ e = a′ ∗ (a ∗ a′′) = (a′ ∗ a) ∗ a′′ = e ∗ a′′ = a′′.
Example 2.2:
Let G = {0, 1}. Let "+" be the modulo-2 addition operation and "·" be the modulo-2 multiplication operation on G.
+ | 0 1        · | 0 1
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1
Then G is a group under "+" and G \ {0} is a group under "·".
⋆ The number of elements in a group G is called the order of the group G, denoted |G|.
⋆ A set H is a subgroup of a group G if H ⊆ G and H is a group under the same operation.
Theorem 2.3: Let G be a group under the binary operation ∗ and let H be a nonempty subset of G. Then H is a subgroup of G if the following conditions hold.
(i) H is closed under the binary operation ∗.
(ii) For any element a in H, the inverse of a is also in H.
Definition 2.2: Let H be a subgroup of a group G with binary operation ∗. For a ∈ G, the set a ∗ H ≡ {a ∗ h : h ∈ H} is called a left coset of H, and the set H ∗ a ≡ {h ∗ a : h ∈ H} is called a right coset of H.
⋆ If G is commutative, then a ∗ H = H ∗ a, which is called a coset of H.
Theorem 2.4: Let H be a subgroup of G and a ∈ G. Then, |a ∗ H| = |H|.
Proof: Let h ≠ h′ for h and h′ in H. Suppose a ∗ h = a ∗ h′. Then, $a^{-1} * (a * h) = a^{-1} * (a * h')$, which implies h = h′, a contradiction. Hence distinct elements of H give distinct elements of a ∗ H, and |a ∗ H| = |H|.
Example: Let G = {0, 1, 2, 3, 4, 5} and let "+" be the modulo-6 addition operation. Then, G is a group under the operation "+". We can prove that H = {0, 3} is a subgroup of G under the operation "+". Then, 1 + H = {1, 4} and 2 + H = {2, 5} are both cosets of H.
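The cosets in this example can be enumerated mechanically. The sketch below handles only the additive group of integers modulo n, which is all the example needs; the function name `cosets` is illustrative.

```python
def cosets(n, H):
    """Distinct cosets a + H of a subgroup H of the integers modulo n."""
    seen = set()
    result = []
    for a in range(n):
        coset = frozenset((a + h) % n for h in H)
        if coset not in seen:          # a + H may repeat an earlier coset
            seen.add(coset)
            result.append(sorted(coset))
    return result

print(cosets(6, [0, 3]))
```

The output lists three cosets of size 2, so every element of G lies in exactly one of them.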
Theorem 2.5: No two elements in two different cosets of a subgroup H of a group G are identical.
Proof: Let a ∗ H and b ∗ H be two distinct cosets of H, where a, b ∈ G. Suppose a ∗ h = b ∗ h′ for some h and h′ ∈ H. Then $(a * h) * h^{-1} = (b * h') * h^{-1}$. That means $a = b * (h' * h^{-1})$, where $h' * h^{-1} \in H$. Hence a ∈ b ∗ H. Then,
$$\begin{aligned} a * H &= (b * h' * h^{-1}) * H \\ &= \{b * h' * h^{-1} * h'' : h'' \in H\} \\ &= \{b * h''' : h''' \in H\} \\ &= b * H, \end{aligned}$$
which contradicts the assumption that a ∗ H and b ∗ H are distinct.
⋆ Each element of G appears in one and only one coset of H.
⋆ All the distinct cosets of H are disjoint.
⋆ G is the union of all the distinct cosets of H.
Theorem 2.6 (Lagrange's Theorem): Let G be a group of order n and H be a subgroup of order m. Then m | n, i.e., |H| divides |G|, and the partition G/H consists of n/m cosets of H.
Definition 2.3: Let F be a set with two binary operations, "+" (addition) and "·" (multiplication). The set F is a field if
1. F is a commutative group under "+". The identity with respect to "+" is called the zero element and is denoted "0".
2. The set F \ {0} is a commutative group under "·". The identity with respect to "·" is called the unit element and is denoted "1".
3. The operation "·" is distributive over "+", i.e., a · (b + c) = a · b + a · c for all a, b, c ∈ F.
⋆ Let a, b, c be elements of a field. Then
1. a · 0 = 0 · a = 0.
2. a · b ≠ 0 if a ≠ 0 and b ≠ 0.
3. a · b = 0 implies a = 0 or b = 0.
4. −(a · b) = (−a) · b = a · (−b).
5. ab = ac implies b = c if a ≠ 0.
Example 2.3:
F = {0, 1} with "+" and "·" given in Example 2.2 is a field (the binary field).
Example 2.4:
S = {0, 1, 2, 3} with "+" and "·" defined as modulo-4 addition and multiplication, respectively, is not a field. A clear inconsistency is 2 · 2 = 4 ≡ 0 (mod 4), which violates property 2 above.
Example 2.5:
F = {0, 1, 2, ⋯, p − 1} with "+" and "·" defined as modulo-p addition and multiplication, respectively, is a field (a prime field), denoted GF(p), if p is a prime.
Proof: Let x ≠ 0 and x ∈ F. Since gcd(x, p) = 1, there exist a and b ∈ I such that ax + bp = 1, i.e., ax ≡ 1 (mod p). Thus, the multiplicative inverse of x exists for every x ∈ F \ {0}. Hence, F \ {0} is a multiplicative group with identity 1. It can be checked that conditions 1 and 3 of Definition 2.3 are satisfied. Hence, F is a field.
⋆ Every finite field is of the form $GF(p^m)$, where p is a prime. That is, every finite field has an order which is a power of a prime. Finite fields are called Galois fields.
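The existence argument in the proof of Example 2.5 (integers a, b with ax + bp = 1) is constructive. A minimal sketch using the extended Euclidean algorithm follows; the function names are illustrative, and the algorithm is a standard choice rather than something specified in these notes.

```python
def extended_gcd(x, p):
    """Return (g, a, b) with a*x + b*p == g == gcd(x, p)."""
    old_r, r = x, p
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    return old_r, old_a, old_b

def inverse_mod(x, p):
    """Multiplicative inverse of x modulo p, if gcd(x, p) = 1."""
    g, a, _ = extended_gcd(x % p, p)
    if g != 1:
        raise ValueError("x has no inverse modulo p")
    return a % p

print([inverse_mod(x, 7) for x in range(1, 7)])
```

For a composite modulus such as 4, the call `inverse_mod(2, 4)` raises an error, which matches Example 2.4.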
Example 2.6: The "+" and "·" operations of GF(4) are
+ | 0 1 2 3        · | 0 1 2 3
0 | 0 1 2 3        0 | 0 0 0 0
1 | 1 0 3 2        1 | 0 1 2 3
2 | 2 3 0 1        2 | 0 2 3 1
3 | 3 2 1 0        3 | 0 3 1 2
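These tables agree with polynomial arithmetic over GF(2) modulo x^2 + x + 1, under the assumption (not stated in the example) that the labels 2 and 3 stand for x and x + 1, i.e. each label is read as a 2-bit coefficient vector. The sketch below checks every entry.

```python
ADD = [
    [0, 1, 2, 3],
    [1, 0, 3, 2],
    [2, 3, 0, 1],
    [3, 2, 1, 0],
]
MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]

def gf4_mul(a, b):
    """Multiply two binary polynomials of degree < 2 modulo x^2 + x + 1."""
    prod = 0
    for shift in range(2):
        if (b >> shift) & 1:
            prod ^= a << shift
    if prod & 0b100:      # reduce x^2 to x + 1
        prod ^= 0b111
    return prod

# Addition of coefficient vectors over GF(2) is a bitwise XOR.
add_ok = all(ADD[a][b] == a ^ b for a in range(4) for b in range(4))
mul_ok = all(MUL[a][b] == gf4_mul(a, b) for a in range(4) for b in range(4))
print(add_ok, mul_ok)
```

Note that the addition table is not modulo-4 addition: 1 + 1 = 0 here, consistent with a field of characteristic 2.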
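⋆ For GF(q), let λ be the smallest positive integer such that $\sum_{i=1}^{\lambda} 1 = 0$. Then, λ is called the characteristic of GF(q).
Theorem 2.7: The characteristic λ of a finite field is a prime.
Proof: Let λ = mℓ for positive integers m and ℓ. Thus, $\sum_{i=1}^{m\ell} 1 = \left(\sum_{i=1}^{m} 1\right)\left(\sum_{i=1}^{\ell} 1\right) = 0$. This implies $\sum_{i=1}^{m} 1 = 0$ or $\sum_{i=1}^{\ell} 1 = 0$. Since λ is the smallest positive integer with this property, m = λ or ℓ = λ. Hence either ℓ = 1 or m = 1.
⋆ Let a be in a field of characteristic λ. Then, $\sum_{i=1}^{\lambda} a = 0$.
⋆ GF(λ) is a subfield of GF(q).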
⋆ Let a ∈ GF(q) and a ≠ 0. Then, $\{1, a, a^2, \cdots\}$ forms a multiplicative subgroup of GF(q) \ {0}.
⋆ The order of a ∈ GF(q), where a ≠ 0, is the smallest positive integer n such that $a^n = 1$.
⋆ A group is said to be cyclic if there exists an element in the group whose powers constitute the whole group.
Theorem 2.8: Let a be a nonzero element of GF(q). Then, $a^{q-1} = 1$.
Proof: Let $b_1, b_2, \cdots, b_{q-1}$ be the q − 1 nonzero elements of GF(q). Then, $ab_1, ab_2, \cdots, ab_{q-1}$ are nonzero and distinct, so they are the same q − 1 elements in some order. Thus, $(ab_1)(ab_2) \cdots (ab_{q-1}) = b_1 b_2 \cdots b_{q-1}$. Hence, $a^{q-1} = 1$.
Theorem 2.9: Let a ∈ GF(q) and a ≠ 0. Let n be the order of a. Then, n | (q − 1).
Proof: Let q − 1 = kn + r for some 0 ≤ r < n. Then, $1 = a^{q-1} = a^{kn+r} = a^{kn} \cdot a^r = a^r$. Since n is the smallest positive integer with $a^n = 1$, we have r = 0.
⋆ A nonzero element a ∈ GF(q) is called a primitive element if the order of a is q − 1.
Exercise: Show that every finite field has a primitive element.
⋆ Let V be a set of elements on which a binary operation "+" is defined. Let F be a field. A multiplicative operation denoted "·" between elements in F and elements in V is also defined. The set V is called a vector space over the field F if
1. V is a commutative group under addition.
2. For any a ∈ F and v ∈ V, we have a · v is in V.
3. For a, b ∈ F and u, v ∈ V, we have
a · (u + v) = a · u + a · v
(a + b) · v = a · v + b · v
4. For a, b ∈ F, v ∈ V, we have
(a · b) · v = a · (b · v)
5. Let 1 be the unit of F. We have 1 · v = v for any v ∈ V.
⋆ The set of polynomials of degree less than m over F is a vector space over F under conventional polynomial addition and scalar multiplication.
⋆ A polynomial p(x) over GF(q) of degree m is said to be irreducible over GF(q) if p(x) is not divisible by any polynomial over GF(q) of degree less than m but greater than zero.
Example 2.7: $x^2 + 1 = (x + 1)^2$ is reducible over GF(2). $x^3 + x + 1$ is irreducible over GF(2).
⋆ For any m ≥ 1, there exists an irreducible polynomial of degree m.
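The definition of irreducibility suggests a brute-force test over GF(2): try every divisor of degree between 1 and m − 1. In the sketch below a binary polynomial is stored as an integer whose bit i is the coefficient of x^i; this encoding and the function names are assumptions made for illustration, and trial division is only practical for small degrees.

```python
def poly_mod(a, b):
    """Remainder of a(x) divided by b(x) over GF(2)."""
    deg_b = b.bit_length() - 1
    while a.bit_length() - 1 >= deg_b:
        a ^= b << (a.bit_length() - 1 - deg_b)   # cancel the leading term of a
    return a

def is_irreducible(p):
    """Trial division by every polynomial of degree 1 .. deg(p) - 1."""
    deg_p = p.bit_length() - 1
    if deg_p < 1:
        return False
    return all(poly_mod(p, d) != 0 for d in range(2, 1 << deg_p))

print(is_irreducible(0b101))    # x^2 + 1
print(is_irreducible(0b1011))   # x^3 + x + 1
```

The integers 2 through 2^m − 1 encode exactly the binary polynomials of degree 1 through m − 1, which is why the divisor loop uses that range.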
Theorem 2.10: Any irreducible polynomial q(x) over GF(2) of degree m divides $x^{2^m-1} + 1$.
⋆ If q(x) is irreducible over GF(2) of degree m and is not a factor of $x^n + 1$ for all $n < 2^m - 1$, then q(x) is primitive.
Theorem 2.11: Let f(x) be a binary polynomial. Then, $[f(x)]^{2^\ell} = f(x^{2^\ell})$.
Proof: Let $f(x) = f_0 + f_1 x + \cdots + f_n x^n$, $f_i \in GF(2)$. Then,
$$\begin{aligned} f^2(x) &= [f_0 + (f_1 x + \cdots + f_n x^n)]^2 \\ &= f_0^2 + (f_1 x + \cdots + f_n x^n)^2 \\ &= f_0^2 + f_1^2 x^2 + \cdots + f_n^2 (x^2)^n \\ &= f_0 + f_1 x^2 + \cdots + f_n (x^2)^n = f(x^2) \end{aligned}$$
Repeating the same process, we have
$$[f(x)]^{2^\ell} = f(x^{2^\ell}).$$
⋆ Let f(x) be a polynomial over GF(p), where p is a prime. Then, $[f(x)]^p = f(x^p)$.
⋆ Let f(x) be a polynomial over GF(q), where $q = p^m$ is a prime power. Then, $[f(x)]^q = f(x^q)$.
⋆ The field formed by taking polynomials over GF(q) modulo an irreducible polynomial Q(x) of degree k over GF(q) is called an extension field of degree k over GF(q).
⋆ The above field is a vector space over GF(q) of dimension k. An apparent basis is $\{1, \alpha, \alpha^2, \cdots, \alpha^{k-1}\}$, where α is a root of Q(x). The field is isomorphic to $GF(q^k)$.
⋆ Let p(x) be an irreducible polynomial of degree m over GF(q). If p(x) has a primitive element of $GF(q^m)$ as a root, then p(x) is called a primitive polynomial.
Example 2.7:
Let $p(x) = 1 + x + x^4$ be a primitive polynomial over GF(2) and let α in $GF(2^4)$ be a root of p(x).
Power   Polynomial            Power   Polynomial
0       0                     α^8     1 + α^2
1       1                     α^9     α + α^3
α       α                     α^10    1 + α + α^2
α^2     α^2                   α^11    α + α^2 + α^3
α^3     α^3                   α^12    1 + α + α^2 + α^3
α^4     1 + α                 α^13    1 + α^2 + α^3
α^5     α + α^2               α^14    1 + α^3
α^6     α^2 + α^3             α^15    1
α^7     1 + α + α^3
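The table can be generated by repeatedly multiplying by α and substituting α^4 = 1 + α. In this sketch a field element is stored as a 4-bit integer whose bit i is the coefficient of α^i; that encoding and the function names are illustrative assumptions.

```python
def gf16_powers(poly=0b10011):
    """Return [alpha^0, ..., alpha^14] for p(x) = 1 + x + x^4."""
    powers = []
    value = 1
    for _ in range(15):
        powers.append(value)
        value <<= 1              # multiply by alpha
        if value & 0b10000:      # replace alpha^4 by 1 + alpha
            value ^= poly
    return powers

def as_polynomial(value):
    """Render a 4-bit element as a sum of powers of alpha."""
    terms = ["1", "α", "α^2", "α^3"]
    return " + ".join(t for i, t in enumerate(terms) if (value >> i) & 1) or "0"

table = gf16_powers()
for i, value in enumerate(table):
    print(f"α^{i} = {as_polynomial(value)}")
```

All 15 powers come out distinct, which is what it means for α to be primitive.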
⋆ Every factor of $x^{2^m-1} + 1$ which is irreducible over GF(2) has degree m or less.
Proof: Suppose q(x) is irreducible and is a factor of $x^{2^m-1} + 1$ and deg(q(x)) = ℓ > m. Let α be a root of q(x). Then, $1, \alpha, \alpha^2, \cdots, \alpha^{\ell-1}$ are linearly independent. However, $\alpha^i \in GF(2^m)$. In $GF(2^m)$, the number of linearly independent elements cannot be greater than m.
⋆ Factorization of $x^{15} + 1$
Let α be a primitive element in $GF(2^4)$. Then, $x^{15} + 1 = (x + 1)(x + \alpha)(x + \alpha^2) \cdots (x + \alpha^{14})$, since $1, \alpha, \alpha^2, \cdots, \alpha^{14}$ are distinct and $(\alpha^i)^{15} = 1$.
Note that $x^3 + 1 = (x + 1)(x^2 + x + 1)$ is a factor of $x^{15} + 1$, and $x^5 + 1 = (x + 1)(x^4 + x^3 + x^2 + x + 1)$ is a factor of $x^{15} + 1$. It can be checked that the primitive polynomials $x^4 + x + 1$ and $x^4 + x^3 + 1$ are factors of $x^{15} + 1$. Hence,
$$x^{15} + 1 = (x + 1)(x^2 + x + 1)(x^4 + x^3 + x^2 + x + 1)(x^4 + x + 1)(x^4 + x^3 + 1)$$
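The factorization can be confirmed by multiplying the five factors over GF(2). As a sketch, each binary polynomial is stored as an integer whose bit i is the coefficient of x^i (an encoding chosen here for convenience), and `poly_mul` is an illustrative carry-less multiplication.

```python
def poly_mul(a, b):
    """Product of two binary polynomials over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # add a shifted copy of a (addition is XOR)
        a <<= 1
        b >>= 1
    return result

factors = [
    0b11,      # x + 1
    0b111,     # x^2 + x + 1
    0b11111,   # x^4 + x^3 + x^2 + x + 1
    0b10011,   # x^4 + x + 1
    0b11001,   # x^4 + x^3 + 1
]
product = 1
for f in factors:
    product = poly_mul(product, f)

print(bin(product))
```

The degrees of the factors add up to 1 + 2 + 4 + 4 + 4 = 15, as they must.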
⋆ Since every root of $x^2 + x + 1$ has order 3, the only candidates for such roots are $\alpha^5$ and $\alpha^{10}$.
⋆ Since every root of $x^4 + x^3 + x^2 + x + 1$ has order 5, the only candidates for such roots are $\alpha^3$, $\alpha^6$, $\alpha^9$ and $\alpha^{12}$.
⋆ Every root of $x^4 + x + 1$ and $x^4 + x^3 + 1$ has order 15. If α is a root of $x^4 + x + 1$, then $\alpha^2, \alpha^4, \alpha^8$ are also roots of $x^4 + x + 1$ and $\alpha^7, \alpha^{14}, \alpha^{13}, \alpha^{11}$ are roots of $x^4 + x^3 + 1$.
Theorem 2.12: Let f(x) be a binary polynomial. If β is in some extension field of GF(2) and is a root of f(x), then for any ℓ > 0, $\beta^{2^\ell}$ is also a root of f(x).
Proof: Since $f(x^{2^\ell}) = [f(x)]^{2^\ell}$, we have $f(\beta^{2^\ell}) = [f(\beta)]^{2^\ell} = 0$.
⋆ $\beta^{2^\ell}$ is called a conjugate of β.
Theorem 2.13: The $2^m - 1$ nonzero elements of $GF(2^m)$ form all the roots of $x^{2^m-1} + 1$.
⋆ Let β be in $GF(2^m)$. A minimal polynomial of β is the binary polynomial ϕ(x) of smallest degree such that ϕ(β) = 0.
Theorem 2.14: The minimal polynomial ϕ(x) of a field element β is irreducible.
Proof: Let $\phi(x) = \phi_1(x)\phi_2(x)$, where each $\phi_i(x)$ is a binary polynomial. Thus, $\phi(\beta) = \phi_1(\beta)\phi_2(\beta) = 0$. This implies either $\phi_1(\beta) = 0$ or $\phi_2(\beta) = 0$. If both $\deg(\phi_1(x)) > 0$ and $\deg(\phi_2(x)) > 0$, this contradicts the assumption that ϕ(x) has the smallest degree.
Theorem 2.15: Let f(x) be a binary polynomial. Let ϕ(x) be the minimal polynomial of a field element β. If f(β) = 0, then f(x) is divisible by ϕ(x).
Proof: Let f(x) = ϕ(x)q(x) + r(x), where deg(r(x)) < deg(ϕ(x)). The fact that f(β) = 0 implies r(β) = 0. Since ϕ(x) has the smallest degree among nonzero binary polynomials with root β, r(x) = 0.
Theorem 2.16: The minimal polynomial ϕ(x) of an element β in $GF(2^m)$ divides $x^{2^m} + x$.
Proof: Since β is a root of $x^{2^m} + x$, i.e., $\beta^{2^m} + \beta = 0$, Theorem 2.15 implies that ϕ(x) is a factor of $x^{2^m} + x$.
Theorem 2.17: Let f(x) be an irreducible polynomial over GF(2). Let β be in $GF(2^m)$. Let ϕ(x) be the minimal polynomial of β. If f(β) = 0, then ϕ(x) = f(x).
Theorem 2.18: Let β be in $GF(2^m)$. Let e be the smallest positive integer such that $\beta^{2^e} = \beta$. Then, $f(x) = \prod_{i=0}^{e-1} (x + \beta^{2^i})$ is an irreducible polynomial over GF(2).
Proof: $f^2(x) = \left[\prod_{i=0}^{e-1} (x + \beta^{2^i})\right]^2 = \prod_{i=0}^{e-1} (x + \beta^{2^i})^2$. Since $(x + \beta^{2^i})^2 = x^2 + \beta^{2^{i+1}}$, we have
$$f^2(x) = \left[\prod_{i=1}^{e-1} (x^2 + \beta^{2^i})\right](x^2 + \beta^{2^e}) = f(x^2),$$
where the last step uses $\beta^{2^e} = \beta$.
Let $f(x) = \sum f_i x^i$. Then, $f^2(x) = \sum f_i^2 x^{2i} = f(x^2) = \sum f_i x^{2i}$. Hence $f_i^2 = f_i$ for every i, so f(x) is over GF(2). Let f(x) = a(x)b(x), where a(x) and b(x) are over GF(2). The fact f(β) = 0 implies either a(β) = 0 or b(β) = 0. Suppose a(β) = 0. Then, a(x) contains $\beta, \beta^2, \beta^{2^2}, \cdots$ as roots. Hence, a(x) is a multiple of f(x). This means f(x) is irreducible over GF(2).
Theorem 2.19: Let ϕ(x) be the minimal polynomial of β in $GF(2^m)$. Let e be the smallest positive integer such that $\beta^{2^e} = \beta$. Then,
$$\phi(x) = \prod_{i=0}^{e-1} (x + \beta^{2^i}).$$
Theorem 2.20: Let ϕ(x) be the minimal polynomial of an element β in $GF(2^m)$. Let e be the degree of ϕ(x). Then, e is the smallest positive integer such that $\beta^{2^e} = \beta$.
Theorem 2.21: If β is a primitive element of $GF(2^m)$, all its conjugates $\beta^{2^i}$ are also primitive elements of $GF(2^m)$.
Proof: Let the order of $\beta^{2^i}$ be n. Then, $(\beta^{2^i})^n = \beta^{n \cdot 2^i} = 1$. Thus, $n \cdot 2^i = \ell(2^m - 1)$ for some integer ℓ. Since $\gcd(2^i, 2^m - 1) = 1$, we have $(2^m - 1) \mid n$. Since $n \le 2^m - 1$, we have $n = 2^m - 1$.
Theorem 2.22: If β is an element of order n in $GF(2^m)$, all its conjugates have the same order n.
Proof: Exercise.
⋆ (Cyclotomic Cosets)
The cyclotomic cosets $K_i$ corresponding to a number $n = q^m - 1$ are $K_i = \{i q^j \bmod n,\; j = 0, 1, \cdots, m-1\}$, where i is the smallest element of $K_i$.
⋆ This definition is also applicable to n which is a factor of $q^m - 1$.
⋆ Some related properties:
1. $|K_i| \le m$;
2. $K_i \cap K_j = \emptyset$ if $K_i \neq K_j$;
3. $K_0 = \{0\}$;
4. $\cup_i K_i = \{0, 1, \cdots, n-1\}$.
Example 2.8: Let $n = 15 = 2^4 - 1$. Then, $K_0 = \{0\}$, $K_1 = \{1, 2, 4, 8\}$, $K_3 = \{3, 6, 9, 12\}$, $K_5 = \{5, 10\}$, $K_7 = \{7, 11, 13, 14\}$.
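The cosets in Example 2.8 can be generated by repeated multiplication by q modulo n. The sketch below assumes gcd(q, n) = 1, which holds whenever n divides q^m − 1; the function name is illustrative.

```python
def cyclotomic_cosets(q, n):
    """Map each coset leader i to K_i = {i * q^j mod n}, assuming gcd(q, n) = 1."""
    cosets = {}
    seen = set()
    for i in range(n):          # increasing order, so i is the smallest member of K_i
        if i in seen:
            continue
        coset = []
        x = i
        while x not in coset:
            coset.append(x)
            x = (x * q) % n
        cosets[i] = sorted(coset)
        seen.update(coset)
    return cosets

for leader, members in cyclotomic_cosets(2, 15).items():
    print(f"K_{leader} = {members}")
```

The coset sizes 1, 4, 4, 2, 4 match the degrees of the irreducible factors of x^15 + 1 over GF(2).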
⋆ (Trace function) The trace function is a mapping of elements $\beta \in GF(q^m)$ to GF(q), where $q = p^\ell$, ℓ ≥ 1 and p is a prime number. It is defined by
$$\mathrm{tr}(\beta) = \beta + \beta^q + \beta^{q^2} + \cdots + \beta^{q^{m-1}} = a \in GF(q).$$
⋆ Since $a^q = a$ (i.e., $a^{q-1} = 1$ when a ≠ 0), we have a ∈ GF(q).
⋆ Let α be a primitive element of $GF(q^m)$ and $\beta = \alpha^i$. Then
$$\mathrm{tr}(\beta) = \sum_{j=0}^{m-1} \beta^{q^j} = \sum_{j \in K_i} \alpha^j.$$
⋆ tr(β + r) = tr(β) + tr(r)