CSCE 5760: Design For Fault Tolerance
September 24, 2019

HW #1. Take a look at the solutions. I will quickly summarize how to approach the problems.

2.1. We are told that the system failed between 4 and 8 years. We need to find the probability that it failed before 5 years. That is, we need the conditional probability P(T < 5 | 4 ≤ T ≤ 8) → failed between 4 and 5 years. With $F(t) = 1 - e^{-0.5t}$:

$$
\frac{\mathrm{Prob}\{[T < 5] \cap [4 \le T \le 8]\}}{\mathrm{Prob}\{4 \le T \le 8\}}
= \frac{\mathrm{Prob}\{4 \le T < 5\}}{\mathrm{Prob}\{4 \le T \le 8\}}
= \frac{F(5) - F(4)}{F(8) - F(4)}
$$

$$
= \frac{(1 - e^{-0.5 \cdot 5}) - (1 - e^{-0.5 \cdot 4})}{(1 - e^{-0.5 \cdot 8}) - (1 - e^{-0.5 \cdot 4})} = 0.455
$$
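As a quick numerical check of problem 2.1, the following sketch evaluates the conditional probability directly. It assumes an exponential lifetime with rate 0.5 per year, as in the expression above; the function names are illustrative only.

```python
import math

def exp_cdf(t, lam=0.5):
    """F(t) = 1 - exp(-lam * t) for an exponentially distributed lifetime."""
    return 1.0 - math.exp(-lam * t)

def conditional_failure_prob(t, lo, hi, lam=0.5):
    """P(T < t | lo <= T <= hi) for lo <= t <= hi."""
    return (exp_cdf(t, lam) - exp_cdf(lo, lam)) / (exp_cdf(hi, lam) - exp_cdf(lo, lam))

p = conditional_failure_prob(5, 4, 8)
print(round(p, 3))  # 0.455
```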
Second problem is straightforward --> use Weibull distribution
Third problem involves converting the system into series-parallel combinations
Let us consider the parallel set in the top middle (I will label the units as C and D):
$$R_{CD} = 1 - (1 - R_C)(1 - R_D)$$
Now let us add the unit in front (say A):
$$R_{ACD} = R_A R_{CD} = R_A [1 - (1 - R_C)(1 - R_D)]$$
Now we have two parallel paths (with the bottom left, say B):
$$R_{BACD} = 1 - (1 - R_B)(1 - R_{ACD}) = 1 - (1 - R_B)\{1 - R_A [1 - (1 - R_C)(1 - R_D)]\}$$
Finally we add the unit on the right (say E):
$$R_{ABCDE} = R_{BACD} R_E$$
Solutions to Chapter 2 Exercises 3
Figure 2.2: A 5-module series-parallel system.
4 blocks and the second unit with the rightmost block. If the reliability of the leftmost 4 blocks is $R_A(t)$, the system reliability is $R_A(t)R(t)$.
Now, we calculate $R_A(t)$. This subsystem consists of a parallel arrangement of one unit consisting of the bottom block and another consisting of the other 3 blocks. If $R_B(t)$ is the reliability of the top 3 blocks, $R_A(t) = 1 - (1 - R_B(t))(1 - R(t))$.
Next, we calculate $R_B(t)$: this subsystem consists of a series arrangement of one block with another consisting of two blocks in parallel. Hence, we have
$$R_B(t) = R(t)\,(1 - (1 - R(t))^2)$$
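The series-parallel reduction for the 5-module system can be sketched with two small helpers. This is a minimal illustration; the labels A to E follow the labeling used in the summary of this problem, and the value 0.9 is an arbitrary test value, not from the homework.

```python
def series(*reliabilities):
    """All units must work: multiply the reliabilities."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result

def parallel(*reliabilities):
    """At least one unit must work: 1 minus the product of unreliabilities."""
    all_fail = 1.0
    for r in reliabilities:
        all_fail *= (1.0 - r)
    return 1.0 - all_fail

def five_module_reliability(r_a, r_b, r_c, r_d, r_e):
    r_cd = parallel(r_c, r_d)      # top-middle parallel pair
    r_acd = series(r_a, r_cd)      # unit A in front of the pair
    r_bacd = parallel(r_b, r_acd)  # bottom-left unit B in parallel
    return series(r_bacd, r_e)     # rightmost unit E in series

r = 0.9  # arbitrary per-block reliability
system = five_module_reliability(r, r, r, r, r)
print(system)
```

With identical blocks this agrees with $R_A(t)R(t)$, where $R_A = 1 - (1 - R_B)(1 - R)$ and $R_B = R(1 - (1 - R)^2)$.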
5. The lifetime of each of the seven blocks in Figure 2.3 is exponentially distributed with parameter λ. Derive an expression for the reliability function of the system, $R_{system}(t)$, and plot it over the range t = [0, 100] for λ = 0.02.
Figure 2.3: A 7-module series-parallel system.
Solution:
As before, we decompose this structure into two substructures connected in series. The left substructure is four blocks in parallel; the right substructure consists of a series arrangement of two blocks, in parallel with one block.
The reliability of this system is thus given by:
$$R_{system}(t) = \left[1 - (1 - R(t))^4\right]\left[1 - (1 - R(t))(1 - R^2(t))\right]$$
where $R(t) = e^{-\lambda t}$.
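A minimal sketch for evaluating this expression over t = [0, 100] with λ = 0.02 follows. It tabulates the curve instead of plotting it, so it needs only the standard library; the sampling step of 10 is an arbitrary choice.

```python
import math

def r_system(t, lam=0.02):
    """Reliability of the 7-module system with R(t) = exp(-lam * t)."""
    r = math.exp(-lam * t)
    left = 1.0 - (1.0 - r) ** 4              # four blocks in parallel
    right = 1.0 - (1.0 - r) * (1.0 - r ** 2)  # two in series, in parallel with one
    return left * right

curve = [(t, r_system(t)) for t in range(0, 101, 10)]
for t, value in curve:
    print(f"t={t:3d}  R_system={value:.4f}")
```

The resulting pairs can be passed to any plotting tool to produce the requested plot.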
Solutions for Fault-Tolerant Systems, by Koren and Krishna c⃝2007 Elsevier Science (USA) – Do not copy
Another problem → related to HW #2
Suppose that the reliability of a system consisting of 4 blocks, two of which are identical, is given by the following equation:
$$R_{system} = R_1 R_2 R_3 + R_1^2 - R_1^2 R_2 R_3$$
Draw the reliability block diagram representing the system.
Factoring out $R_1$:
$$R_{system} = R_1 [R_2 R_3 + R_1 - R_1 R_2 R_3] = R_1 [1 - (1 - R_2 R_3)(1 - R_1)]$$
This means module M1 is in series with some other network. In that network, we have M2 and M3 in series, and together they are in parallel with M1 (the second, identical copy).
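The factorization can be checked numerically. The sketch below compares the given polynomial with the reliability of the block diagram (M1 in series with the parallel combination of a second M1 and the series pair M2, M3) at random points; the function names are illustrative.

```python
import random

def r_given(r1, r2, r3):
    """The reliability expression as stated in the problem."""
    return r1 * r2 * r3 + r1 ** 2 - r1 ** 2 * r2 * r3

def r_block_diagram(r1, r2, r3):
    """M1 in series with [second M1 in parallel with (M2 in series with M3)]."""
    inner = 1.0 - (1.0 - r2 * r3) * (1.0 - r1)
    return r1 * inner

random.seed(0)
max_gap = 0.0
for _ in range(1000):
    r1, r2, r3 = (random.random() for _ in range(3))
    max_gap = max(max_gap, abs(r_given(r1, r2, r3) - r_block_diagram(r1, r2, r3)))
print(max_gap)  # differs from zero only by floating-point rounding
```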
4.3. Draw a Markov chain for reliability evaluation of the TMR with three voters shown in Fig. 4.6. Assume that the failure rate of each module is $\lambda_m$, and the failure rate of each voter is $\lambda_v$. No repairs are allowed.
Review:
Markov Chains and Markov Processes
Deriving state probabilities
Steady state
Absorbing states
Deriving differential equations
An example: Duplex with repair
Our general equation:
$$\frac{dP_i(t)}{dt} = -\sum_{j \ne i} \lambda_{ij} P_i(t) + \sum_{j \ne i} \lambda_{ji} P_j(t)$$
Solving the differential equations we get
$$P_2(t) = \frac{\mu^2}{(\lambda+\mu)^2} + \frac{2\lambda\mu}{(\lambda+\mu)^2}\, e^{-(\lambda+\mu)t} + \frac{\lambda^2}{(\lambda+\mu)^2}\, e^{-2(\lambda+\mu)t}$$
$$P_1(t) = \frac{2\lambda\mu}{(\lambda+\mu)^2} + \frac{2\lambda(\lambda-\mu)}{(\lambda+\mu)^2}\, e^{-(\lambda+\mu)t} - \frac{2\lambda^2}{(\lambda+\mu)^2}\, e^{-2(\lambda+\mu)t}$$
$$P_0(t) = 1 - P_1(t) - P_2(t)$$
Note that this Markov process is irreducible → only recurrent states and no absorbing states.
If we have an irreducible process, we can derive the steady-state behavior, that is, the behavior as time t → ∞.
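A small sketch can make the t → ∞ behavior concrete for the duplex-with-repair example. It assumes the standard closed-form state probabilities for a duplex with per-unit failure rate λ and repair rate μ; the rates 0.01 and 0.1 are arbitrary illustration values, not from the lecture.

```python
import math

def duplex_state_probs(t, lam, mu):
    """Return (P2, P1, P0) at time t, starting with both units working."""
    s = lam + mu
    e1 = math.exp(-s * t)
    e2 = math.exp(-2.0 * s * t)
    p2 = (mu ** 2 + 2 * lam * mu * e1 + lam ** 2 * e2) / s ** 2
    p1 = (2 * lam * mu + 2 * lam * (lam - mu) * e1 - 2 * lam ** 2 * e2) / s ** 2
    p0 = 1.0 - p1 - p2
    return p2, p1, p0

lam, mu = 0.01, 0.1  # arbitrary illustration values
start = duplex_state_probs(0.0, lam, mu)
late = duplex_state_probs(1000.0, lam, mu)
steady = (mu ** 2 / (lam + mu) ** 2,
          2 * lam * mu / (lam + mu) ** 2,
          lam ** 2 / (lam + mu) ** 2)
print(start, late, steady)
```

At t = 0 the system is in state 2 with probability 1, and for large t the exponential terms vanish, leaving the constant terms as the steady-state probabilities.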
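$$\frac{dP_2(t)}{dt} = -2\lambda P_2(t) + \mu P_1(t)$$
$$\frac{dP_1(t)}{dt} = 2\lambda P_2(t) + 2\mu P_0(t) - (\lambda + \mu) P_1(t)$$
$$\frac{dP_0(t)}{dt} = \lambda P_1(t) - 2\mu P_0(t)$$
Setting $dP_i(t)/dt = 0$ we get
$$-2\lambda P_2(t) + \mu P_1(t) = 0$$
$$2\lambda P_2(t) + 2\mu P_0(t) - (\lambda + \mu) P_1(t) = 0$$
$$\lambda P_1(t) - 2\mu P_0(t) = 0$$
$$P_0(t) + P_1(t) + P_2(t) = 1$$
And solving these linear equations we get (we can drop t since we are looking at steady state)
$$P_2 = \frac{\mu^2}{(\lambda+\mu)^2}, \qquad P_1 = \frac{2\lambda\mu}{(\lambda+\mu)^2}, \qquad P_0 = \frac{\lambda^2}{(\lambda+\mu)^2}$$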
So, the long term availability is given by $1 - P_0 = P_1 + P_2$:
$$P_1 + P_2 = \frac{\mu(\mu + 2\lambda)}{(\lambda+\mu)^2} = 1 - \frac{\lambda^2}{(\lambda+\mu)^2}$$
Another way of looking at modeling continuous time Markov processes.
Let us construct a transition matrix (instantaneous transition probability).
Consider a system with two states, for example the following system:
$$M = \begin{bmatrix} m_{11} & m_{21} \\ m_{12} & m_{22} \end{bmatrix}$$
p. 27 - Design of Fault Tolerant Systems - Elena Dubrova, ESDlab
Single-component system, no repair
• Only two states – one operational (state 1) and one failed (state 2)
– if no repair is allowed, there is a single, non-reversible transition between the states (used in availability analysis)
– label λ corresponds to the failure rate of the component
[State diagram: state 1 → state 2, transition labeled λ]
Note how the rows and columns are numbered: columns should add to zero.
If we have absorbing states, diagonal elements of those states will be zero.
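Using our example with 3 states we have
$$M = \begin{bmatrix} -2\lambda & \mu & 0 \\ 2\lambda & -\lambda - \mu & 2\mu \\ 0 & \lambda & -2\mu \end{bmatrix}$$
$$\frac{d}{dt}\begin{bmatrix} P_2(t) \\ P_1(t) \\ P_0(t) \end{bmatrix} = \begin{bmatrix} -2\lambda & \mu & 0 \\ 2\lambda & -\lambda - \mu & 2\mu \\ 0 & \lambda & -2\mu \end{bmatrix} \begin{bmatrix} P_2(t) \\ P_1(t) \\ P_0(t) \end{bmatrix}$$
$$\frac{dP_2(t)}{dt} = -2\lambda P_2(t) + \mu P_1(t)$$
$$\frac{dP_1(t)}{dt} = 2\lambda P_2(t) + 2\mu P_0(t) - (\lambda + \mu) P_1(t)$$
$$\frac{dP_0(t)}{dt} = \lambda P_1(t) - 2\mu P_0(t)$$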
These are the same equations we had before
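The matrix form lends itself to a direct numerical treatment. The sketch below builds the 3-state matrix, checks that each column sums to zero, and integrates dP/dt = M P with a simple forward-Euler step. The rates, step size, and end time are arbitrary illustration values, and forward Euler is used only for brevity.

```python
def transition_matrix(lam, mu):
    """Rows and columns ordered as states (2, 1, 0); column j holds rates out of state j."""
    return [
        [-2 * lam, mu, 0.0],
        [2 * lam, -(lam + mu), 2 * mu],
        [0.0, lam, -2 * mu],
    ]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def integrate(m, p, t_end, dt=0.01):
    """Forward-Euler integration of dP/dt = M P from t = 0 to t_end."""
    for _ in range(int(round(t_end / dt))):
        dp = mat_vec(m, p)
        p = [pi + dt * dpi for pi, dpi in zip(p, dp)]
    return p

lam, mu = 0.01, 0.1  # arbitrary illustration values
M = transition_matrix(lam, mu)
column_sums = [sum(M[i][j] for i in range(3)) for j in range(3)]
p_late = integrate(M, [1.0, 0.0, 0.0], t_end=200.0)
print(column_sums, p_late)
```

For large t the integrated vector approaches the steady-state values μ²/(λ+μ)², 2λμ/(λ+μ)², and λ²/(λ+μ)².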
What are watchdog processors? (There are also watchdog timers.)
Watchdog processors only check for correct execution flow: they check the control flow of a program.
Signatures are computed over the instructions executed by each basic block.
So if some instructions are incorrectly sequenced, the signature differs (computed vs. assigned signature).
Can we use a similar idea for detecting security violations?
Malicious failures → not benign
The approach to solving such problems is called the Byzantine algorithm (related to the Byzantine generals stories).
Byz(N, m) – N nodes with up to m failed nodes
Step 1: The original source sends data to each of the N-1 receivers.
Step 2: If m > 0, each of the N-1 receivers becomes a source. They distribute the values received in the previous step to the other nodes. In other words, each of the N-1 nodes applies the Byz(N-1, m-1) algorithm; step 2 is recursive.
Step 3: At the end of communications, each node has a "vector" of values. Each node looks for a majority value in this vector. If there is no majority, it uses a default value.
To work correctly, N >= 3m+1.
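The three steps can be sketched recursively. This is a simplified illustration, not a full implementation: the fault model in `transmit` (a faulty sender sends a value that depends on the receiver) and the default value 0 are assumptions made for the example, and missing messages or time-outs are not modeled.

```python
from collections import Counter

DEFAULT = 0  # assumed default value when no majority exists

def transmit(sender, receiver, value, faulty):
    """Hypothetical fault model: a faulty sender sends receiver-dependent values."""
    if sender in faulty:
        return receiver % 2
    return value

def majority(values):
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else DEFAULT

def byz(m, source, receivers, value, faulty):
    """Return the value each receiver settles on for what `source` sent."""
    # Step 1: the source sends its value to every receiver.
    received = {r: transmit(source, r, value, faulty) for r in receivers}
    if m == 0:
        return received
    # Step 2: each receiver relays what it got, recursively with m - 1.
    relayed = {r: [] for r in receivers}
    for j in receivers:
        others = [r for r in receivers if r != j]
        sub = byz(m - 1, j, others, received[j], faulty)
        for r in others:
            relayed[r].append(sub[r])
    # Step 3: each receiver takes the majority of its vector of values.
    return {r: majority([received[r]] + relayed[r]) for r in receivers}

# N = 4, m = 1: node 0 is the source, nodes 1..3 are receivers.
faulty_receiver = byz(1, 0, [1, 2, 3], 1, faulty={3})
faulty_source = byz(1, 0, [1, 2, 3], 1, faulty={0})
print(faulty_receiver, faulty_source)
```

With a faulty receiver, the two non-faulty receivers still settle on the source's value; with a faulty source, all receivers settle on the same value, which is the agreement the N >= 3m+1 condition is meant to secure.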
Some assumptions
A non-faulty unit is truthful about all its messages.
A faulty unit may send contradictory messages or even no messages.
A time-out mechanism is available to detect "no message".
When there is no message, assume a default value.
Chapter 3. Information redundancy → coding theory
Vector space – an N-dimensional space over q elements (that is, each dimension can take q values) can have $q^N$ vectors.
For a binary space, q = 2.
If we select a subspace, with fewer than the maximum number of vectors, we have defined a code.
The idea is: the only legal vectors are those in the code space. If you see a vector outside the code space, it is an error.
pp. 7-10 - Design of Fault Tolerant Systems - Elena Dubrova, ESDlab
Error detection
• We can define a code so that errors introduced in a codeword force it to lie outside the range of codewords
– basic principle of error detection
[Figure: the set of code words shown inside the set of all possible words]
Error correction
• We can define a code so that it is possible to determine the correct code word from the erroneous codeword
– basic principle of error correction
[Figure: the set of code words shown inside the set of all possible words]
Simple parity (only for binary data)
one extra bit: p = 1, so c = d+1
– only half of all possible values are used
Distance (Hamming distance) between two codewords - the number of bit positions in which the two words differ
Minimum distance is the minimum of all distances between pairs of codewords.
For simple parity, minimum distance = 2.
Distance can be defined for other bases (say decimal).
Simple introduction first:
To detect k bit errors, minimum distance must be >= k+1
To detect 1 bit error, minimum distance must be at least 2
Simple parity works
To correct (implies detect first) k errors, minimum distance must be >= 2k+1
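These distance rules can be tried out on the simple parity code. The sketch below assumes even parity over 3 data bits (an arbitrary small size chosen for illustration) and computes the minimum distance by brute force.

```python
from itertools import combinations, product

def hamming_distance(u, v):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))

def even_parity_codeword(data_bits):
    """Append one parity bit so the total number of 1s is even."""
    return tuple(data_bits) + (sum(data_bits) % 2,)

def minimum_distance(code):
    return min(hamming_distance(u, v) for u, v in combinations(code, 2))

code = [even_parity_codeword(bits) for bits in product((0, 1), repeat=3)]
d_min = minimum_distance(code)
detectable = d_min - 1          # d_min >= k + 1
correctable = (d_min - 1) // 2  # d_min >= 2k + 1
print(len(code), d_min, detectable, correctable)  # 8 2 1 0
```

Only 8 of the 16 possible 4-bit words are codewords, and the minimum distance of 2 allows detecting one bit error but correcting none.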
Consider a minimum distance 3 Hamming code with 4-bit binary numbers. We need to add 3 parity bits. So we have 7-bit codewords for a 4-bit data word.
P1 P2 D1 P3 D2 D3 D4
P1 is a parity on D1, D2 and D4
P2 is a parity on D1, D3 and D4
P3 is a parity on D2, D3 and D4
Consider data value of 0010:
P1 = 0; P2 = 1 and P3 = 1
Codeword = 0101010
How do we detect and correct errors?
Compute parities from the received data. If all parities check, there is no error.
If parity Pi does not match the correct value, we assign Si = 1.
So, we have a 3-bit syndrome S1S2S3 which ranges between 000 and 111.
This number locates the error bit.
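A minimal sketch of this 7-bit Hamming code follows, using the bit layout P1 P2 D1 P3 D2 D3 D4 and even parity. One assumption is made explicit in the code: the syndrome is converted to a bit position by treating S1 as the least significant bit, which matches placing the parity bits at positions 1, 2, and 4.

```python
def encode(data):
    """Encode [D1, D2, D3, D4] as [P1, P2, D1, P3, D2, D3, D4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def syndrome(word):
    """Recompute each parity over the received word; Si = 1 on mismatch."""
    s1 = word[0] ^ word[2] ^ word[4] ^ word[6]  # P1, D1, D2, D4
    s2 = word[1] ^ word[2] ^ word[5] ^ word[6]  # P2, D1, D3, D4
    s3 = word[3] ^ word[4] ^ word[5] ^ word[6]  # P3, D2, D3, D4
    return s1, s2, s3

def correct(word):
    """Return (corrected word, error position); position 0 means no error."""
    s1, s2, s3 = syndrome(word)
    position = s1 + 2 * s2 + 4 * s3  # assumption: S1 is the least significant bit
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed, position

codeword = encode([0, 0, 1, 0])
received = list(codeword)
received[5] ^= 1  # flip bit position 6 (D3)
print(codeword, correct(received))
```

For data 0010 the encoder reproduces the codeword 0101010, and any single flipped bit is located and repaired.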
In general, if we have $N = 2^n$ data bits, we need n+1 parity bits (distance 3 code).
For 32-bit data we need 6 parity bits.
Parity bits are located at bit positions corresponding to powers of 2.
Pi is at bit position $2^{i-1}$
P1 at bit 1; P2 at bit 2; P3 at bit 4; P4 at bit 8; P5 at bit 16; P6 at bit 32
P1 P2 D1 P3 D2 D3 D4 P4 D5 D6 D7 D8 D9 D10 D11 P5
Parity Pi is a parity on alternating groups of $2^{i-1}$ bits, starting at Pi:
P1 is parity on P1, D1, D2, ...
P2 is parity on P2, D1, D3, D4, D6, D7, ...
P3 is parity on P3, D2, D3, D4, D8, D9, D10, D11, ...
P4 is parity on P4, D5, D6, D7, D8, D9, D10, D11, ...
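The coverage pattern can be generated mechanically: parity Pi covers every bit position whose binary representation has bit i-1 set. The sketch below labels positions 1 to 15 and lists the group covered by each parity; the helper names are illustrative.

```python
def covered_positions(i, n):
    """Bit positions (1-based) covered by parity Pi in an n-bit word."""
    mask = 1 << (i - 1)
    return [pos for pos in range(1, n + 1) if pos & mask]

def position_labels(n):
    """Label positions: powers of 2 hold parity bits, the rest hold data bits."""
    labels, parity_count, data_count = {}, 0, 0
    for pos in range(1, n + 1):
        if pos & (pos - 1) == 0:  # power of two
            parity_count += 1
            labels[pos] = f"P{parity_count}"
        else:
            data_count += 1
            labels[pos] = f"D{data_count}"
    return labels

labels = position_labels(15)
groups = {i: [labels[pos] for pos in covered_positions(i, 15)] for i in range(1, 5)}
for i, group in groups.items():
    print(f"P{i}:", ", ".join(group))
```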
Separable codes
data and parity bits are separate
Hamming code is a separable code
Non-separable codes
data and parity cannot be separated
more complex to decode
Groups, Rings and Fields
A group [S, +] is a non-empty set with a binary operation (operation on two values):
closure (a+b is also a member of the group)
associative: (a+b)+c = a+(b+c)
an identity element exists for the operation (a+0 = a)
each element has an inverse under the operation (a+(-a) = 0)
commutative group if the operation is commutative (a+b = b+a)
Examples: Set of all integers with addition (not just positive integers).
Polynomials in x. Addition and multiplication of polynomials.
Modulo addition and multiplication.
Ring. [R, +, *] is a ring on a set of elements R defined with two operations, + and *.
[R, +] must be a commutative group,
and the * operation must satisfy: closure, associativity and distributivity (over +).
Distributive: a*(b+c) = a*b + a*c → no need for an inverse
Examples: The set of integers (both positive and negative) with addition and multiplication is a ring.
Polynomials over the reals (or integers) form a ring under addition and multiplication.
Another example:
Consider the set of n * n matrices over integers. Define matrix addition and matrix multiplication
→ not regular matrix multiplication but multiplication of respective elements
We have a ring here.
A yet more complex structure is called a field. [F, +, *] is a field if [F, +, *] is a ring,
that is, the following hold:
Commutative: a+b = b+a and a*b = b*a
Associative: (a+b)+c = a+(b+c) and (a*b)*c = a*(b*c)
Distributive: a*(b+c) = a*b + a*c
and
a field has both additive and multiplicative identity elements:
a+0 = a; a*1 = a (0 and 1 are identity elements)
Fields have additive and multiplicative inverses (0 has no multiplicative inverse):
if a+b = 0 then b is the inverse of a under addition
if a*b = 1 then b is the inverse of a under multiplication
Examples: The set of real numbers under addition and multiplication is a field
(note the set of integers is not – why not?)
Consider modulo addition and multiplication, [Z5, +5, *5]. Is this a field?
Important: In a field, a*b = 0 if and only if either a or b is 0 (0 is the additive identity).
In Z4 (modulo 4), what is the multiplicative inverse of 2? 2 has no inverse. So this is not a field.
Why is Z5 a field but not Z4? Zp is a field if p is a prime number.
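The Z5 versus Z4 question can be checked by brute force. The sketch below searches for a multiplicative inverse of every non-zero element under modulo-n multiplication; the function names are illustrative.

```python
def multiplicative_inverse(a, n):
    """Return b with (a * b) % n == 1, or None if no such b exists."""
    for b in range(1, n):
        if (a * b) % n == 1:
            return b
    return None

def is_field_mod(n):
    """Integers modulo n form a field only if every non-zero element has an inverse."""
    return all(multiplicative_inverse(a, n) is not None for a in range(1, n))

print(multiplicative_inverse(2, 4))      # None: 2 has no inverse modulo 4
print(multiplicative_inverse(2, 5))      # 3, since 2 * 3 = 6 = 1 modulo 5
print(is_field_mod(5), is_field_mod(4))  # True False
```

In Z4, 2*2 = 0 with neither factor equal to 0, which also violates the rule that a*b = 0 only when a or b is 0.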
Now we have a field. This is because we are not looking at Z4; we are looking at $Z_x$ where $x = 2^2$.
Or in other words, a field over $x = p^y$ where p is a prime number.
We need to be careful in defining these tables for + and * to make sure we have a field.
Such fields are called Galois fields.
We will write them as $GF_q$ or GF(q) where $q = p^h$ and p is a prime number.
Another interesting fact. For any prime power field, there is a non-zero element, say g, that generates all the other non-zero members: that is, all the other non-zero elements can be written as $g^i$. Such an element is called the primitive element (or generator element).
In the above example (with 4 elements 0, 1, a, b), a can be used as the primitive element since
$a^1 = a$, $a^2 = a*a = b$, $a^3 = b*a = 1$ (and $a^0 = 1$).
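One concrete way to realize a 4-element field is sketched below. It is an assumption made for illustration, not necessarily the tables used in the lecture: elements are 2-bit polynomials over GF(2), reduced modulo x^2 + x + 1, with a = x (binary 10) and b = x + 1 (binary 11).

```python
A, B = 0b10, 0b11  # a = x, b = x + 1 in the assumed representation

def gf4_add(u, v):
    """Addition is coefficient-wise modulo 2, i.e. XOR."""
    return u ^ v

def gf4_mul(u, v):
    """Polynomial multiplication over GF(2), reduced modulo x^2 + x + 1."""
    product = 0
    for shift in range(2):
        if (v >> shift) & 1:
            product ^= u << shift
    if product & 0b100:   # an x^2 term is present
        product ^= 0b111  # subtract (XOR) x^2 + x + 1
    return product

powers = []
x = 1
for _ in range(3):
    x = gf4_mul(x, A)
    powers.append(x)
print(powers)  # [2, 3, 1], i.e. a, b, 1
```

Successive powers of a run through a, b, 1, so a generates every non-zero element under this representation.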
More complex structures. Vector Spaces. We have to first start with a field, that is, we will start with say [F, +, *] that gives us a field (the set F contains values taken by field elements).
We will define a vector of dimension n as an n-tuple.
$v = (v_1, v_2, \ldots, v_n)$ → all $v_i$ elements belong to F
We can now define vector addition as a binary operation.
$u + v = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n)$
Is this operation closed? Since the tuple elements are from the field and addition is closed on these elements, vector addition is also closed.
Likewise this operation is associative.
The vector 0 = (0, 0, ..., 0), the zero vector, is the identity for the vector addition operation.
What about the inverse of a vector?
$V^{-1} = (v_1^{-1}, v_2^{-1}, \ldots, v_n^{-1})$ → this vector is over the same field since each $v_i^{-1}$ (the additive inverse of $v_i$) belongs to F
We can also define a scalar multiplication on vectors.
$a*V = (a*v_1, a*v_2, \ldots, a*v_n)$
Since $a*v_i$ is defined on the underlying field elements, this scalar multiplication is closed and associative.
1 (the multiplicative identity of the underlying field) is the identity for scalar multiplication.
Scalar multiplication is associative: (c*d)*V = c*(d*V)
Distributive Laws
c*(U+V) = c*U + c*V
(c+d)*V = c*V + d*V
For scalar multiplication we cannot define inverses, since the identity is not a vector.
We can sometimes define vector multiplication (this is different from the dot product):
$U*V = (u_1*v_1, u_2*v_2, \ldots, u_n*v_n)$
Now we can define an identity vector (1, 1, ..., 1).
If this vector multiplication is closed, associative, and satisfies
U*(c*V + d*W) = c*(U*V) + d*(U*W)
then we have a linear associative algebra.
Linear combination of vectors.
V = a*A + b*B + ... where A and B are vectors themselves.
In other words, we can express one vector as a combination of other vectors.
If we start with say n vectors (V1, V2, ..., Vn), we can define a vector space by taking all possible linear combinations of these vectors.
The set of vectors (V1, V2, ..., Vk) is known as linearly independent if we cannot express any one of the vectors as a linear combination of the other vectors → basis vectors
Just as we can talk about subgroups, we can talk about subspaces, subfields, etc.
Subgroup. [H, +] is a subgroup of [G, +] if H is a subset of G and H is a group under +.
That is, + must be closed within H, must be associative, must have an identity and must define inverses for the elements of H. But we need only check for closure and inverses (the other properties follow).
Examples. Consider the set {0, 1, a, b} with the addition defined – here we have $q = 2^2$.
{0,1} forms a subgroup under the same operations. But {a,b} does not, since it does not contain the additive identity.
In general, for a subgroup $H = \{h_0, h_1, \ldots, h_{m-1}\}$, m is the cardinality of the set (also known as the rank or order of the group).
The identity element of G is also the identity element of H.
What is the cardinality of group G? It must be an integer multiple of m. Why?
Consider writing the elements as follows (each $g_i$ is a member of the group but not in H):
$$\begin{array}{cccc} h_0 & h_1 & \cdots & h_{m-1} \\ g_1 + h_0 & g_1 + h_1 & \cdots & g_1 + h_{m-1} \\ g_2 + h_0 & g_2 + h_1 & \cdots & g_2 + h_{m-1} \\ \vdots & \vdots & & \vdots \end{array}$$
Now we have rows, all of whose elements are members of the group but not of the subgroup H (except the first row).
So, the cardinality of G is an integral multiple of the cardinality of H.
Each row of elements $\{g_j + h_i \mid h_i \in H\}$ is called a co-set of H, and $g_j$ is called the co-set leader.
Co-sets are useful in detecting and correcting errors.
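The row construction can be sketched for a small binary example. The group here is all 3-bit vectors under bitwise XOR and the subgroup is {000, 111}; both are arbitrary choices for illustration. Each new row starts from the lowest-weight vector not yet placed, which is one common way to choose co-set leaders.

```python
from itertools import product

def xor(u, v):
    """Vector addition over GF(2)."""
    return tuple(a ^ b for a, b in zip(u, v))

def cosets(group, subgroup):
    """Return a list of (leader, row) pairs that partition the group."""
    candidates = sorted(group, key=lambda v: (sum(v), v))  # lowest weight first
    placed = set()
    rows = []
    for g in candidates:
        if g in placed:
            continue
        row = [xor(g, h) for h in subgroup]
        rows.append((g, row))
        placed.update(row)
    return rows

G = list(product((0, 1), repeat=3))
H = [(0, 0, 0), (1, 1, 1)]
rows = cosets(G, H)
for leader, row in rows:
    print(leader, row)
```

The 8 group elements split into 4 rows of 2 elements each, so the cardinality of G is 4 times the cardinality of H.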
How do we know if two elements g and g' are in the same co-set? If they are, then we can use the co-set leader, say $g_j$, and express g and g' as
$g = g_j + h_i$ and $g' = g_j + h_k$
Consider the element
$$g^{-1} + g' = (g_j + h_i)^{-1} + (g_j + h_k) = (h_i^{-1} + g_j^{-1}) + (g_j + h_k) = h_i^{-1} + h_k$$
Since $h_i$ and $h_k$ are elements of H, $h_i^{-1} + h_k$ must also be an element of H.
In other words, if $g^{-1} + g'$ belongs to H then g and g' belong to the same co-set.
Normal Subgroup. H is a normal subgroup of G if and only if for every element g of G and any element h of H, $g^{-1} + h + g$ is in H.
Vector Subspaces. The set of all linear combinations of vectors, say V1, V2, ..., Vn, forms a subspace.
If we can obtain the same vector space with k linearly independent vectors U1, U2, ..., Uk, then we call these k vectors the basis, and the vector subspace describes a k-dimensional space.
We can represent any vector in the k-dimensional space as a linear combination of the basis vectors.
$V = a_1*U_1 + a_2*U_2 + \cdots + a_k*U_k$
We can also represent this linear combination as $(a_1, a_2, \ldots, a_k)\,[U]$, where [U] is a matrix written as
$$[U] = \begin{bmatrix} U_1 \\ U_2 \\ U_3 \\ \vdots \\ U_k \end{bmatrix}$$
We should know that two matrices M and N are equivalent if one can be obtained from the other by using 1) row transposition, 2) row arithmetic, 3) column transposition and 4) column arithmetic.
For example, consider the following matrices in the binary field.
These matrices are equivalent (add row 1 to row 2; add row 1 and row 2 to row 3).
For any k-dimensional vector space, we can use the following basis:
(1,0,0,...,0); (0,1,...,0); ...; (0,0,...,1).
In matrix representation this looks like the identity matrix.
However, we can also represent these vectors as matrices. Consider the matrices above.
We have 3 independent rows and thus 3 independent vectors.
We can generate vectors in a 3-dimensional space (a subspace of 4 dimensions).
Introduction to Linear Codes.
A linear code is nothing but a subspace. Every element (or codeword) of a linear code can be expressed as a linear combination of the basis vectors or basis codewords.
The basis codewords can be written as a matrix.
Consider the following Hamming code: p1 p2 m1 p4 m2 m3 m4 → as we have seen before
Either of these matrices forms the basis for the 4-dimensional vector subspace in a 7-dimensional space → and gives the code we are looking for.
The left matrix looks like {I | P} where I is the identity matrix.
In order to construct a linear code, we need to consider a vector space (actually a subspace) over a finite Galois field (a field over a set containing a prime number of elements or a power of a prime).
If we are looking at a k-dimensional subspace of n-dimensional vectors, we will call the code an [n, k] code over $GF_q$.
Sometimes we would also describe the "minimum distance" d: an [n, k, d] code.
Note that since the linear code C is a subspace, for any two vectors U and V in C, U+V is also in C (since the vector addition must be closed).
And for any scalar a in $GF_q$, a*V is also in C (scalar multiplication is closed).
The zero vector (0,0,...,0) always belongs to any linear code; it is the identity for vector addition.
Thus the codewords (elements of C) are n-dimensional vectors forming a k-dimensional vector subspace.
In terms of codes, n-k is the number of parity bits (the redundancy) and k is the number of data bits.
The weight of a vector or a codeword is the number of non-zero elements of the vector (represented as a tuple).
For example, consider 4-dimensional vectors over $GF_3$:
(1, 0, 2, 0): weight is 2
(2, 2, 2, 1): weight is 4
The distance between two vectors is the weight of the difference vector.
(1,0,2,0) - (2,2,2,1) = (1,0,2,0) + (-2,-2,-2,-1) = (1,0,2,0) + (1,1,1,2) = (2,1,0,2)
Distance = weight of (2,1,0,2) = 3.
Note. The weight of a binary vector is the number of 1's in the vector.
The distance between 2 binary vectors is the number of places in which they differ.
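Weight and distance over $GF_q$ for prime q can be sketched with modular arithmetic. The example reuses the two $GF_3$ vectors from above; the function names are illustrative.

```python
def weight(v):
    """Number of non-zero elements in the vector."""
    return sum(1 for x in v if x != 0)

def subtract(u, v, q):
    """Element-wise difference modulo a prime q."""
    return tuple((a - b) % q for a, b in zip(u, v))

def distance(u, v, q):
    """Distance is the weight of the difference vector."""
    return weight(subtract(u, v, q))

u, v = (1, 0, 2, 0), (2, 2, 2, 1)
diff = subtract(u, v, 3)
print(weight(u), weight(v), diff, distance(u, v, 3))  # 2 4 (2, 1, 0, 2) 3
```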
Minimum distance of a vector space (or a code, that is, a vector subspace):
the minimum of all distances between any pair of vectors or codewords.
Note that the difference between any two codewords is a codeword, since vector addition/subtraction is closed.
So, the minimum distance of a code is the minimum weight of a non-zero codeword in the code.
How to generate codewords. A code is a vector space. We can think of basis vectors, which can be written as a generator matrix.
For an [n,k] code (a k-dimensional subspace of an n-dimensional space), we need k basis vectors; each vector is an n-tuple. That is, we have a k * n matrix.
Since we are dealing with a k-dimensional subspace, we can have only $q^k$ codewords (vectors) in our code.
Since we can manipulate matrices and obtain a new equivalent matrix, it will be useful to convert a generator matrix of a code to look like an $[I_k \mid A_{k \times (n-k)}]$ matrix ($I_k$ is the identity matrix).
Consider the Hamming code example
p1 p2 m1 p4 m2 m3 m4 → this is what we want to generate
We can use either of these generator matrices. The one on the right is what I showed previously when we generated parity bits for the Hamming code.
This is a [7, 4] code. We can have only $2^4 = 16$ codewords. We can encode 4-bit numbers, since there are 16 different possible 4-bit numbers, to get 7-bit codewords with 3 parity bits. To do this, all we have to do is multiply the 4-bit number by the generator matrix.
The same applies for any [n,k] code over any $GF_q$ field.
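Encoding by multiplying with a generator matrix can be sketched as follows. The matrix `G` below is a reconstruction for the layout p1 p2 m1 p4 m2 m3 m4, built from the parity rules given earlier (p1 on m1, m2, m4; p2 on m1, m3, m4; p4 on m2, m3, m4); it is an assumption and may differ from the matrices shown on the slide.

```python
from itertools import product

# Assumed generator matrix; one row per message bit, columns p1 p2 m1 p4 m2 m3 m4.
G = [
    [1, 1, 1, 0, 0, 0, 0],  # m1 contributes to p1, p2
    [1, 0, 0, 1, 1, 0, 0],  # m2 contributes to p1, p4
    [0, 1, 0, 1, 0, 1, 0],  # m3 contributes to p2, p4
    [1, 1, 0, 1, 0, 0, 1],  # m4 contributes to p1, p2, p4
]

def encode(message, generator, q=2):
    """Codeword = message (row vector) times the generator matrix, modulo q."""
    n = len(generator[0])
    return [sum(m * row[j] for m, row in zip(message, generator)) % q
            for j in range(n)]

codewords = [encode(m, G) for m in product((0, 1), repeat=4)]
min_weight = min(sum(c) for c in codewords if any(c))
print(encode([0, 0, 1, 0], G), len(codewords), min_weight)
```

The 16 messages give 16 distinct codewords, and the smallest weight among the non-zero codewords equals the minimum distance of 3.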
How to decode?
Note that a vector space forms a group under vector addition. Since a code is a subspace, it forms a subgroup under vector addition.
It means we can construct co-sets using the elements of the code and the vectors which are not part of the code.
These vectors which are not part of the code indicate errors.
The co-set leader (the vector with the smallest weight in a co-set) indicates the error or syndrome.
We could create an array.
For example, with 7-bit binary numbers and 16 codewords, we would have 7 co-sets in addition to the code. Note that the 7 co-sets indicate the 7 possible error states we discussed.
So, we can construct an 8 x 16 array (7 co-sets and the codewords).
We can then locate where the received word (codeword with error) falls and decode it appropriately.
In general, we should compute the co-set leader for each syndrome.
Note that the syndrome indicates the error state as represented by the parities, and the co-set leader indicates which bit is in error (or just subtract this co-set leader from the received value).
Consider the correct codeword [1010101], but we received y = [1010111]. Then $y*H^T = [001]$ gives us the syndrome.
Remember co-sets and co-set leaders? We can think of creating a matrix as we did before.
Now we can use this matrix to find the co-set leader based on the syndrome. Then we can subtract the co-set leader from the received value to get the original codeword.
Let us see how we can construct the H matrix for any number of data bits. Note that if we can construct the H matrix, we can get the G matrix for any code.
Suppose you want to create a code for d data bits. Note that we need r parity bits where $2^r \ge d + r + 1$ → number of states, and n = d + r.
Here we are looking at distance 3 (only one bit is in error, hence the number of states is d+r+1).
H will have r rows and n columns.
We can start with H by using all non-zero values for columns. Then rearrange H to look like $[A^T \mid I]$ and then obtain G.
Consider r = 3 (n = 7 and d = 4).
Let us start H as
$$H = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$
We can rewrite this into
$$H = \begin{bmatrix} 0 & 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}$$
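Going from H in the form [A^T | I] to G in the form [I | A] can be sketched for the binary case as follows. The sketch assumes a binary code (over a larger field the sign of A would matter) and checks that every row of G gives a zero syndrome.

```python
# H in standard form [A^T | I] for r = 3, n = 7, d = 4.
H = [
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
]

def generator_from_parity_check(h):
    """Build G = [I | A] from a binary H = [A^T | I]."""
    r, n = len(h), len(h[0])
    k = n - r
    a = [[h[j][i] for j in range(r)] for i in range(k)]  # transpose the left block
    identity = [[1 if c == i else 0 for c in range(k)] for i in range(k)]
    return [identity[i] + a[i] for i in range(k)]

def syndrome(word, h):
    """Compute word * H^T over GF(2)."""
    return [sum(w * x for w, x in zip(word, row)) % 2 for row in h]

G = generator_from_parity_check(H)
for row in G:
    print(row, syndrome(row, H))
```

Each basis codeword has syndrome [0, 0, 0], and a codeword with a single flipped bit yields the corresponding column of H as its syndrome.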
For any field over q, we need to create H using the possible non-zero values as columns. For example, consider the field over (0, 1, 2), or q = 3.
Let us use 2 parity digits (r = 2).
So $3^2 \ge d + r + 1$. Now we can have 6 data digits.
One example H matrix can be (2 * 8):
$$H = \begin{bmatrix} 0 & 0 & 1 & 1 & 1 & 2 & 2 & 2 \\ 1 & 2 & 0 & 1 & 2 & 0 & 1 & 2 \end{bmatrix}$$
We can rewrite H to be in the standard form $[A^T \mid I]$ and then get the G matrix.