Lecture 21:

Picture of nearest-nbr. decoding of a generic code with exhaustive search: codewords marked c, all other words marked ?.

For a linear code, straighten the picture out; line up the cosets, row by row.

Standard array (schematic):

  coset leader
  coset 1 = code C :  c   c   c   ...
  coset 2          :  ?   ?   ?   ...
  coset 3          :  ?   ?   ?   ...
  ...

Recall syndromes and syndrome decoding for a linear code.

Complete Syndrome Decoding Algorithm:
1. Given received vector x, compute S(x) = Hx^T, where H is a parity check matrix.
2. Find the coset leader y corresponding to x in the syndrome table, i.e., S(y) = S(x).
3. Decode x to c := x − y.

Main idea: the coset leader is the best guess of the error vector in transmission. Recall that syndrome decoding is a version of nearest nbr. decoding.

Proposition: Let C be a linear code in V(n, q). Let e ∈ V(n, q). If wt(e) ≤ ⌊(d(C) − 1)/2⌋, then e is the unique coset leader of its coset (i.e., the unique word of minimal weight in its coset).

Example: Look at the first six rows in the standard array for C3.

Proof 1: If c is transmitted and x = c + e is received, then nearest nbr. decoding will correctly decode to c = x − e (since wt(e) ≤ ⌊(d(C) − 1)/2⌋). Syndrome decoding will decode to x − y, where y is the coset leader of the coset x + C. Since syndrome decoding is a version of nearest nbr. decoding, we have c = x − e = x − y. Thus, e = y must be a coset leader of its coset. It must be the unique coset leader, since otherwise a different coset leader could have been chosen, leading to a decoding error.

Proof 2: Observe that for any words u, v of the same length, wt(u ± v) ≤ wt(u) + wt(v). Now e belongs to the coset e + C. Let c ∈ C with c ≠ 0. We need to show:

  wt(e) < wt(e + c).

We have wt(e) ≤ ⌊(d(C) − 1)/2⌋, and wt(c) ≥ d(C), since for a linear code the minimum distance equals the minimum weight of a nonzero codeword. Since c = (e + c) − e, we have wt(c) ≤ wt(e + c) + wt(e). Thus,

  wt(e + c) ≥ wt(c) − wt(e) ≥ d(C) − ⌊(d(C) − 1)/2⌋ > ⌊(d(C) − 1)/2⌋ ≥ wt(e).

Incomplete Syndrome Decoding Algorithm (hedge your bets):
– If S(x) corresponds to a coset with exactly one coset leader, then decode as above.
– Otherwise, declare an error.

This is a way of implementing incomplete nearest neighbour decoding because: there is a tie, i.e., there are c, c′ ∈ C with c ≠ c′ such that d(x, c) = d(x, c′) both achieve the minimum distance to x, iff x − c and x − c′ are both words of minimal weight in the coset x + C.

Parity check matrix (in standard form) for C3:

      [ 1 1 1 0 0 ]
  H = [ 1 0 0 1 0 ]
      [ 0 1 0 0 1 ]

Incomplete Syndrome Table for C3:

  coset leader   syndrome
  C3: 00000      000
      10000      110
      01000      101
      00100      100
      00010      010
      00001      001
  ------------   --------
      11000      011
      10001      111

Let x be the received vector.
– If S(x) is above the line, then decode x to c := x − y, where y is the corresponding coset leader.
– Else declare an error.

By the Proposition above, the top part of the incomplete syndrome table always contains the coset leaders corresponding to all error vectors within the nominal error-correcting capability of the code.

Revisit the examples of decoding for C3 from last lecture, using incomplete syndrome decoding (a short code sketch of this decoder appears below):

Example 1: transmitted vector c = 11011; error vector e = 01000; received vector x = c + e = 10011. Syndrome: 101, above the line: correct decoding to 11011.

Example 2: transmitted vector c = 11011; error vector e = 01010; received vector x = c + e = 10001. Syndrome: 111, below the line: declare an error (instead of decoding incorrectly).

Recall the Hamming code C, defined by parity check matrix:

      [ 1 1 1 0 1 0 0 ]
  H = [ 1 1 0 1 0 1 0 ]
      [ 1 0 1 1 0 0 1 ]

H is an (n − k) × n matrix. C is a [7, 4, 3]-code. |C| = 2^4 = 16 and there are 2^7/2^4 = 2^3 = 8 cosets, each containing 16 words. Note that each nonzero vector of length 3 appears exactly once as a column of H (there are seven such vectors).
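The following is a minimal Python sketch of the incomplete syndrome decoder for C3, as used in Examples 1 and 2 above. It is an illustration, not part of the original notes; the function names are made up, and only the top part of the syndrome table (the weight-≤ 1 coset leaders) is stored.

```python
# Sketch: incomplete syndrome decoding for C3 (illustration only).
# H is the parity check matrix of C3 given above.
H = [[1, 1, 1, 0, 0],
     [1, 0, 0, 1, 0],
     [0, 1, 0, 0, 1]]

def syndrome(x):
    """S(x) = H x^T over GF(2), returned as a tuple of bits."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in H)

# Top part of the syndrome table: the zero word and the five weight-1 words,
# each the unique leader of its coset (by the Proposition).
leaders = [[0, 0, 0, 0, 0]] + [[int(i == j) for i in range(5)] for j in range(5)]
top_table = {syndrome(e): e for e in leaders}

def decode(x):
    """Incomplete syndrome decoding: return a codeword, or None to declare an error."""
    s = syndrome(x)
    if s not in top_table:
        return None                                   # syndrome below the line
    y = top_table[s]                                  # best guess of the error vector
    return [(xi - yi) % 2 for xi, yi in zip(x, y)]    # c := x - y

print(decode([1, 0, 0, 1, 1]))   # Example 1: [1, 1, 0, 1, 1], i.e. 11011
print(decode([1, 0, 0, 0, 1]))   # Example 2: None, i.e. declare an error
```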
Permute columns to obtain an equivalent Hamming code where the i-th column is the number i written in binary:

       [ 0 0 0 1 1 1 1 ]
  H′ = [ 0 1 1 0 0 1 1 ]
       [ 1 0 1 0 1 0 1 ]

Coset leaders: Since d(C) = 3, all words of weight at most 1 are coset leaders; there are 8 such words and so these exhaust all the cosets. The syndrome corresponding to the coset leader with a 1 in the i-th position is the i-th column of H′ (written as a row vector), which represents the number i in binary. Syndromes: 000, followed by each of the columns of H′.

  coset leader   syndrome
  C: 0000000     000
     1000000     001
     0100000     010
     0010000     011
     0001000     100
     0000100     101
     0000010     110
     0000001     111

There is no bottom part of the table.

An equivalent syndrome decoding algorithm (which does not need the syndrome table) is:
1. Given received vector x, compute S(x).
2. Let i be the decimal representation of the binary string S(x).
3. Flip the i-th bit, i.e., c = x − e_i (if i = 0, don't flip any bits).

Example of decoding the (permuted) Hamming code: received vector x = 1001010, syndrome S(x) = 011 = 3 in decimal; flip the 3rd bit, i.e., decode to 1011010 (check that this is indeed a codeword). A code sketch of this rule is given further below.

Another Example: a linear code C over GF(5). Parity check matrix:

  H = [ 1 1 1 1 ]
      [ 1 2 3 4 ]

H is an (n − k) × n matrix. So n = 4, k = 2. And d = 3 since no column is zero, no pair of columns is linearly dependent (λ(1, i)^T ≠ (1, j)^T for i ≠ j and any nonzero scalar λ), and the first three columns are linearly dependent:

  2K_1 + K_2 + 2K_3 = 0,

where K_i denotes the i-th column of H. Alternatively, observe that the row rank of H is 2, and since the column rank of H is the same as the row rank (for any matrix), any set of three columns must be linearly dependent.

So, C is a [4, 2, 3]_5-code. |C| = 5^2 = 25 and there are 5^4/5^2 = 25 cosets, each containing 25 words. So, there are 25 coset leaders and corresponding syndromes.

Let's do Incomplete Syndrome Decoding to correct single errors. The coset leaders include all words of weight at most 1; call this set L:

  L = {y_1 y_2 y_3 y_4 : at most one y_i ≠ 0}

Note that |L| = 17, and we can write each element of L as k·e_j, 1 ≤ j ≤ 4, 0 ≤ k ≤ 4. The syndrome table will contain 17 rows above the line and 8 rows below the line.

What are the syndromes of the 17 coset leaders above the line?

  S(k·e_j) = H(k·e_j)^T = (k, kj)

  coset leader   syndrome
  C: 0000        00
     1000        11
     2000        22
     ...         ...
     0300        31
     ...         ...

We don't really need a syndrome decoding table for this. Instead, define (A, B) := S(x).
1. If A = 0 and B = 0, decode x to c := x.
2. If A ≠ 0 and B ≠ 0, let k = A and j = A^(−1)·B; decode x to c := x − k·e_j, i.e., subtract k from the entry in the j-th position.
3. If exactly one of A or B is 0, declare an error (the syndrome must be one of the other 8 syndromes).

Case 1 corresponds to no error: the syndrome is (A, B) = (0, 0). Case 2 corresponds to exactly one error: the syndrome is (A, B) = S(k·e_j) = (k, kj), and so k and j are recovered from A and B by k = A and j = A^(−1)·B. In this way, we have found the error location (j) and the error value (k). Many practical codes use this approach. Case 3 corresponds to at least two errors.

Example: x = 2123; S(x) = (3, 2) = (A, B), so k = 3 and j = 3^(−1)·2 = 2·2 = 4. Decode to c = 2123 − 0003 = 2120 (check that this is a codeword!).

Lecture 22:

Review of the previous example of the code over Z_5.
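As part of this review, here is a hedged Python sketch of the three-case single-error decoder for the [4, 2, 3]_5-code above. It is an illustration, not part of the original notes; the function name decode_gf5 is made up, and the inverse A^(−1) is computed via Fermat's little theorem.

```python
# Sketch: single-error decoding of the [4,2,3] code over GF(5) with
# H = [[1,1,1,1], [1,2,3,4]] (illustration only).
Q = 5

def decode_gf5(x):
    """Return the decoded codeword, or None to declare an error (case 3)."""
    A = sum(x) % Q                                    # first row of H: 1 1 1 1
    B = sum(j * xj for j, xj in enumerate(x, 1)) % Q  # second row of H: 1 2 3 4
    if A == 0 and B == 0:
        return list(x)                    # case 1: no error
    if A != 0 and B != 0:
        k = A                             # case 2: error value k = A
        j = (pow(A, Q - 2, Q) * B) % Q    # error location j = A^(-1) B
        c = list(x)
        c[j - 1] = (c[j - 1] - k) % Q     # subtract k from the j-th position
        return c
    return None                           # case 3: exactly one of A, B is zero

print(decode_gf5([2, 1, 2, 3]))           # -> [2, 1, 2, 0], i.e. the codeword 2120
```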
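For comparison, here is a similar sketch (again an illustration, not from the notes) of the table-free rule for the permuted [7, 4, 3] Hamming code from earlier on this page: the syndrome, read as a binary number, is the error location.

```python
# Sketch: table-free decoding of the permuted [7,4,3] Hamming code, where
# column i of H' is the number i written in binary (illustration only).
H_PRIME = [[(i >> 2) & 1 for i in range(1, 8)],   # most significant syndrome bit
           [(i >> 1) & 1 for i in range(1, 8)],
           [i & 1 for i in range(1, 8)]]          # least significant syndrome bit

def hamming_decode(x):
    """Compute S(x), read it as a decimal number i, and flip the i-th bit (i = 0: no flip)."""
    s = [sum(a * b for a, b in zip(row, x)) % 2 for row in H_PRIME]
    i = 4 * s[0] + 2 * s[1] + s[2]         # binary string S(x) as a decimal number
    c = list(x)
    if i > 0:
        c[i - 1] ^= 1                      # positions are 1-based in the notes
    return c

print(hamming_decode([1, 0, 0, 1, 0, 1, 0]))   # -> 1011010, as in the example above
```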
Defn: A code is perfect if it achieves the Hamming bound with equality, i.e., it is an (n, M, 2t + 1)_q-code C such that

  M = |C| = q^n / ( Σ_{m=0}^{t} (n choose m) (q − 1)^m ).

This holds iff |C|·|B_t| = |F_q^n| = q^n, where B_t denotes a Hamming ball of radius t, iff the Hamming balls of radius t centered at the codewords of C form a partition of F_q^n, i.e., they are pairwise disjoint and their union is all of F_q^n (recall that F_q denotes an alphabet of size q).

Note that if a perfect (n, M, 2t + 1)_q-code exists, then A_q(n, 2t + 1) equals the right-hand side above.

Perfect codes are rare.

Examples: Assume q = 2 and t = 1 (so d = 3). Then a code C is perfect iff

  |C| = 2^n / (1 + n).

A necessary condition for this is that the right-hand side must be an integer; equivalently, n + 1 is a power of 2; equivalently, n = 2^ℓ − 1 for some ℓ. So, the only possibilities are n = 3, 7, 15, 31, ... (n = 1 is too small). These are all achievable. Examples (a small computational check of the integrality condition appears at the end of these notes):

– n = 3: the RHS is 2^n/(1 + n) = 2: the binary 3-repetition code. So, A_2(3, 3) = 2. Can see the Hamming balls on the 3-diml. cube.
– n = 7: the RHS is 2^n/(1 + n) = 16: the Hamming code. So, A_2(7, 3) = 16.

There exists a [23, 12, 7] binary code (called the Golay code, p. 90); it is also perfect (with t = 3).

Construction of fields GF(q), q = p^k:

In the original defn. of field (p. 31), it is possible that 1 = 0. But then for any x ∈ F, x = x · 1 = x · 0 = 0, and so F = {0}. This is the trivial field. From now on, we only consider non-trivial fields.

Defn: Let F be a field. Let a ∈ F and let m be a positive integer. Then ma := a + ... + a (i.e., add m copies of a).

Defn: The characteristic of a field F is the smallest positive integer m such that m · 1 = 0, or 0 if no such m exists. Note that the characteristic is never 1 (because 1 ≠ 0).

Examples: If F = R, then m = 0. For prime p and F = GF(p) = Z_p = {0, 1, ..., p − 1}, we have m = p. For F = GF(4) = {0, 1, a, b}, we have m = 2 since 2 · 1 = 1 + 1 = 0.

Prop: The characteristic m of a field is either 0 or prime.

Proof: Suppose that m ≠ 0 and m is not prime. Then m ≥ 2 (recall the characteristic is never 1), and since m is not prime it is composite, so there is a factorization m = ij with 1 < i, j < m. Then in F,

  0 = m · 1 = (ij) · 1 = (i · 1)(j · 1)

(using the distributive law). But neither i · 1 nor j · 1 is 0, since m is the smallest positive integer with m · 1 = 0 and i, j < m. This contradicts the fact that F is a field (a field has no zero divisors).

Prop: A finite field has prime characteristic.

Proof: We must show that for some positive integer m, m · 1 = 0. Since there are infinitely many positive integers and the field is finite, there exist j > i such that j · 1 = i · 1. Thus, (j − i) · 1 = 0, so the characteristic is nonzero; by the previous Prop, it is prime.
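A tiny Python sketch (illustration only, with a made-up helper name) of the characteristic of GF(p) = Z_p, computed directly from the definition m · 1 = 0: keep adding copies of 1 until the sum is 0.

```python
# Sketch: the characteristic of Z_p, found by adding 1 to itself until 0 (illustration only).
def characteristic_of_Zp(p):
    """Smallest positive m with m * 1 = 0 in Z_p."""
    total, m = 0, 0
    while True:
        total = (total + 1) % p   # add one more copy of 1
        m += 1
        if total == 0:
            return m

print([characteristic_of_Zp(p) for p in (2, 3, 5, 7)])   # -> [2, 3, 5, 7], i.e. m = p
```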
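Finally, looking back at the perfect-code condition from earlier in this lecture (q = 2, t = 1), here is a small Python sketch, not part of the notes, that checks for which n ≤ 15 the bound 2^n/(1 + n) is an integer; it reports exactly n = 3, 7, 15, matching n = 2^ℓ − 1.

```python
# Sketch: check when the Hamming bound 2^n / (1 + n) is an integer, i.e. when a
# perfect binary code correcting one error could exist (illustration only).
from math import comb

def ball_size(n, q, t):
    """Number of words in a Hamming ball of radius t in F_q^n."""
    return sum(comb(n, m) * (q - 1) ** m for m in range(t + 1))

for n in range(2, 16):
    total, ball = 2 ** n, ball_size(n, 2, 1)    # ball_size(n, 2, 1) == 1 + n
    if total % ball == 0:
        print(n, total // ball)   # prints: 3 2, 7 16, 15 2048
```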