Lecture 19: Midterm on Thursday. Office hours this week: Tuesday, Wednesday 1:30- 3:30 Practice midterm: We have been using the following useful interpretations of matrix multiplication: Ax = y — means that y is a linear combination of columns of A where the coefficients are the entries of x — and it also means that the entries of y are the dot products of x with the rows of A. Recall: For a linear code C, generator matrix G: C = {uG : u ∈ V (k, q)} parity check matrix H: C = {x ∈ V (n, q) : HxT = 0} Fact (see Lecture 18 notes): if for a linear code C, H = [B | In−k ] is a parity check matrix in standard form, then a generator matrix for C in standard form is: G = [Ik | − B T ] Recall: The Hamming code is C = {x : x1+x2+x3+x5 = 0, x1+x2+x4+x6 = 0, x1+x3+x4+x7 = 0} Cluster picture: Clusters are {x1, x2, x3, x5}, {x1, x2, x4, x6}, {x1, x3, x4, x7} 1 Each cluster has even parity. Equivalently, the Hamming code is defined by the parity check matrix 1 1 1 0 1 0 0 H= 1 1 0 1 0 1 0 1 0 1 1 0 0 1 So n = 7 and n − k = 3 (the number of rows of H) and so k = 4. And d(C) = 3 because there is no zero column, no two columns are identical and K 1 + K 4 + K 5 = 0. So, the Hamming code is a [7, 4, 3]-binary code and |C| = 24 = 16. Note that H = [B | I3] is a parity check matrix in standard form. So, a generator matrix in standard form is: 1 0 G = [I4 | − B T ] = 0 0 0 1 0 0 0 0 1 0 0 0 0 1 1 1 1 0 1 1 0 1 1 0 1 1 equivalently, C = {x1x2x3x4x5x6x7 : x5 = x1+x2+x3, x6 = x1+x2+x4, x7 = x1+x3+x4} Note that the variables x1, x2, x3, x4 are free (i.e., we can choose any values for them). For each such choice, the other variables x5, x6, x7 are determined by the equations above. Decoding of linear codes: Recall: cosets of a subgroup: G is a group, K is a subgroup, a+K is a coset (a ∈ G); here, a + K = {a + k : k ∈ K} 2 Also, given x, y ∈ G, x and y belong to the same coset of K iff x − y ∈ K. Let C be a linear code in V (n, q). Then C is an (additive) subgroup of V (n, q), and thus V (n, q) is the disjoint union of the cosets, a + C, of C, each of which has the same size as |C| = q k . How many cosets are there?: q n/q k = q n−k . Note: the cosets have to do with the (additive) group structure of V (n, q) but nothing to do with scalar multiplication. Defn: A coset leader of a coset is an element of the coset with minimum weight. Note: C is a coset of itself and its coset leader is 0. Standard Array of a code: a list of all elements of V (n, q) with the rows being the cosets, the first row being the code and the first column being the coset leaders. Example: C3 handout C3 : {00000, 10110, 01101, 11011} Standard Array for C3: coset leader C3 : 00000 10000 01000 00100 00010 00001 11000 10001 01101 11101 00101 01001 01111 01100 10101 11100 3 10110 00110 11110 10010 10100 10111 01110 00111 11011 01011 10011 11111 11001 11010 00011 01010 Note: Coset leaders need not be unique (see the last two rows of the standard array above). Defn: Let C be a linear code in V (n, q) with parity check matrix H. The syndrome of x ∈ V (n, q) is defined S(x) = SH (x) = HxT Note: computation of the syndromes depends on the field structure (both addition and multiplication). Note: a syndrome is a column vector in V (n − k, q) but often we write it as a row vector. Proposition: Let C be a linear code in V (n, q). Let x, y ∈ V (n, q). Then 1. S(x + y) = S(x) + S(y) 2. S(x) = 0 iff x ∈ C. 3. Let x, y ∈ V (n, q). Then x, y belong to the same coset of C iff S(x) = S(y) Proof: 1. H(x + y)T = H(xT + y T ) = HxT + Hy T . 2. S(x) = 0 iff HxT = 0 iff x ∈ C. 3. S(x) = S(y) iff H(x − y)T = 0 iff x − y ∈ C. Note: meaning of “syndrome” (symptom of disease). By Part 2, a nonzero syndrome is the “symptom” that that the received word x is not a codeword. Part 3 says that the syndromes are in 1-1 correspondence with the cosets. Since the syndromes are in V (n − k, q) and there are q n−k cosets, we have 4 Proposition: The syndromes exhaust all of V (n − k, q) Alternative Proof: The set S of all syndromes is the set of all linear combinations, over GF (q), of columns of H. So, S is the column space of the matrix H. dim(S) = dim( column space of H) = dim( row space of H) = dim(C ⊥) = n − k. But S ⊆ V (n − k, q). Thus, S = V (n − k, q). Return to Example C3: [5, 2, 3]-code over GF (2) (a 1-error-correcting code) Generator matrix (in standard form): 1 0 1 1 0 G= 0 1 1 0 1 Parity Check matrix (in standard form): 1 1 1 0 0 H=1 0 0 1 0 0 1 0 0 1 Syndrome Table for C3: coset leader syndrome C3 : 00000 000 10000 110 01000 101 00100 100 00010 010 00001 001 11000 011 10001 111 Proposition: Let x ∈ V (n, q). 5 1. There is a coset leader y such that S(y) = S(x). 2. Let c := x − y. Then c ∈ C and c is a nearest nbr. of x, i.e., d(x, c) = minc0∈C d(x, c0). Proof: 1. y is the coset leader of the coset with syndrome S(x). 2. S(c) = S(x − y) = S(x) − S(y) = 0. So, c ∈ C. d(x, c) = wt(x − c) = wt(y). for all c0 ∈ C, d(x, c0) = wt(x − c0). Both y and x − c0 belong to x + C. Since y is a coset leader, wt(y) ≤ wt(x − c0). Thus, d(x, c) ≤ d(x, c0). Complete Syndrome Decoding Algorithm: 1. Given received vector x, compute S(x). 2. Find coset leader, y, corresponding to y in the syndrome table. 3. Decode x to c := x − y. Corollary: Complete syndrome decoding is a specific way to implement complete nearest neighbour decoding. Proof: This follows from previous proposition. Idea: y is the best guess for the “error vector”. Example for C3: Transmit c = 11011; error vector e = 01000; received vector x = c + e = 10011. Syndrome: 101 Corresponding coset leader: y = 01000. 6 Decode to c = x − y = 11011. Correct! Another Example for C3: Transmit c = 11011; error vector e = 01010; received vector x = c + e = 10001. Syndrome: 111 Corresponding coset leader: y = 10001. Decode to c = x − y = 00000. Incorrect! Note: in first example, one error was made and code was able to correct; in second example, two errors were made. Advantage 4 of linear codes: “easy” to decode. Lecture 20: Midterm 2 7