CS1260 Mathematics for Computer Science
Unit 8: Some Applications of Matrix Algebra

Cryptography (Secret Codes)

As before, we wish to send some sensitive information to another person over an insecure communication channel (such as the Internet). There is a danger that a third person might intercept the message and make use of the information to our disadvantage. To avoid this, messages are often encrypted before transmission using a method known only to the sender and receiver. Thus only the intended recipient knows how to decrypt the message received and thereby retrieve the original message. The original message is known as the plain-text and the encrypted message is known as the cipher-text (or encrypted text).

One of the simplest and oldest encryption methods is the substitution cipher. Suppose we have a message constructed out of an 'alphabet' of N characters. Each character is assigned a number in the range 0 to N−1 and we encode the text using arithmetic in ZN. For example, working in Z31 allows an alphabet of 31 characters: sufficient for the 26 letters plus a few punctuation characters (space, full-stop, comma, apostrophe and question mark), with 'a' --> 1, 'b' --> 2, ..., 'z' --> 26, blank --> 0 and so on.

Beforehand the sender of the message chooses at random two numbers (a and b, say) in ZN, with the number a being invertible in ZN. The two numbers a and b (and N) are made known to the intended recipient, but to no one else. Knowing a and N, the recipient can easily calculate a⁻¹. Each character c in the plain-text is encoded to produce the character e (say) in the cipher-text as follows:

e = (a c + b) mod N

The cipher-text is then transmitted, and when it is received each character e can be decoded by the recipient as follows:

c = a⁻¹ (e − b) mod N

An eavesdropper, however, cannot easily decode the message as (s)he does not know a and b.
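To make the scheme concrete, here is a minimal sketch in Python (not part of the original notes); the 31-character alphabet follows the text, while the key values a = 7 and b = 12 are arbitrary illustrative choices:

```python
# Affine substitution cipher over Z_31, as described above.
# Alphabet mapping follows the text: blank -> 0, 'a' -> 1, ..., 'z' -> 26.
N = 31
ALPHABET = " abcdefghijklmnopqrstuvwxyz.,'?"   # 31 characters, codes 0..30

def encrypt(plain, a, b):
    # e = (a*c + b) mod N for each character code c
    return "".join(ALPHABET[(a * ALPHABET.index(ch) + b) % N] for ch in plain)

def decrypt(cipher, a, b):
    a_inv = pow(a, -1, N)   # a^(-1) mod N; a must be invertible mod 31
    return "".join(ALPHABET[a_inv * (ALPHABET.index(ch) - b) % N] for ch in cipher)

msg = "attack at dawn"
assert decrypt(encrypt(msg, 7, 12), 7, 12) == msg
```

Note that `pow(a, -1, N)` (Python 3.8+) computes the modular inverse a⁻¹ directly, failing if a is not invertible modulo N.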
Unfortunately such single-character substitution ciphers are easy to break, either by a brute-force trial-and-error approach or by statistical means. In the latter, the average frequencies of the characters in the language are tabulated (by analysing a range of documents written in the language). For example, in English the most frequent letter is 'e' (with numerical code 5, say) followed by 't' (with numerical code 20, say). A code-breaker simply counts the number of occurrences of each character in the cipher-text. Suppose (s)he finds that the two most frequent encrypted letters are 'z' followed by 'q'; then there is a high probability that 'z' (code 26) and 'q' (code 17) in the cipher-text correspond to 'e' and 't' respectively in the plain-text. The numbers a and b can now be determined by solving two simultaneous equations (working modulo 31):

5a + b = 26 mod 31     (since a 'e' + b --> 'z')
20a + b = 17 mod 31    (since a 't' + b --> 'q')

© A Barnes 2006    CS1260/L8

Once a and b are known, a⁻¹ can easily be calculated and the whole message decoded. In practice a little trial and error might be needed (as the actual frequencies of letters in the plain-text may differ from the averages, and the modulus N may not be known), but basically such ciphers can be easily broken by experienced code-breakers.

Substitution ciphers can be made somewhat more secure by splitting the plain-text into groups of characters and encoding groups of characters rather than single characters. To see how this might work, let us work modulo 31 and split the message into pairs of characters. Beforehand we choose at random an invertible 2×2 matrix A and a column vector B of order 2, both over Z31. We reveal A and B only to the intended recipient of the message. The recipient can easily calculate A⁻¹ as (s)he knows A. Alternatively we could secretly tell the recipient the inverse matrix A⁻¹ and the column vector B.
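Before developing the matrix version, it is worth seeing that the code-breaker's final step above really is mechanical. Here is a Python sketch (not part of the original notes) solving the two congruences 5a + b ≡ 26 and 20a + b ≡ 17 (mod 31) by elimination:

```python
# Eliminate b by subtracting the first congruence from the second:
#   15a ≡ 17 - 26 ≡ -9 (mod 31), so a = -9 * 15^(-1) mod 31.
N = 31
a = (-9 * pow(15, -1, N)) % N   # pow(15, -1, 31) is the inverse of 15 mod 31
b = (26 - 5 * a) % N
assert (5 * a + b) % N == 26 and (20 * a + b) % N == 17
# a = 18, b = 29: the eavesdropper has recovered the key
```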
We split the plain-text into pairs of characters, and each pair of characters corresponds to a column vector of order 2. We encode each column vector X by multiplying by A and then adding B to produce an encrypted vector Y:

Y = A X + B

and transmit the encrypted vectors. Our recipient splits the received message into pairs of characters and decodes each pair as follows:

X = A⁻¹ (Y − B)

For definiteness suppose that we have chosen

A = [ 9   3 ]      B = [ 19 ]
    [ 10 15 ]          [ 11 ]

so that

A⁻¹ = [ 9  23 ]
      [ 25 24 ]

(see the example above), and that we wish to send the message 'attack'. Splitting into pairs:

'at' --> [ 1 ]     'ta' --> [ 20 ]     'ck' --> [ 3 ]
         [ 20 ]             [ 1 ]               [ 11 ]

We send the encrypted message 'zkpiqt' since

[ 9   3 ] [ 1 ]  + [ 19 ] = [ 26 ] = 'zk'
[ 10 15 ] [ 20 ]   [ 11 ]   [ 11 ]

[ 9   3 ] [ 20 ] + [ 19 ] = [ 16 ] = 'pi'
[ 10 15 ] [ 1 ]    [ 11 ]   [ 9 ]

[ 9   3 ] [ 3 ]  + [ 19 ] = [ 17 ] = 'qt'
[ 10 15 ] [ 11 ]   [ 11 ]   [ 20 ]

The recipient subtracts B from each vector received and then multiplies by A⁻¹ to recover the original message.

Such codes may still be broken by statistical means. The frequencies of pairs of characters are tabulated by analysing a range of documents; in an alphabet of 31 characters there are 31² = 961 such pairs. A code-breaker can compare the frequencies of pairs of characters in the cipher-text with these average frequencies to identify the commonest encrypted pairs. It is highly likely that these correspond to the commonest pairs in the plain-text. The entries in A and B can then be found by solving a set of 6 simultaneous linear equations.

However, as we increase the size of the character groupings the code becomes increasingly difficult to break by these methods. In a practical code the message would be broken into groups of (say) 10 or more characters and the matrix A and vector B would be of order 10 or larger. Provided A and B are kept secret, the code would be difficult to break even if considerable computing resources were available. However such substitution ciphers are still not ideal. They belong to a class of ciphers called symmetric ciphers.
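Before discussing the weaknesses of such ciphers further, the worked example above can be verified with a short Python sketch (not part of the original notes; the function names are illustrative), using the A, A⁻¹ and B given above:

```python
# 2x2 matrix substitution cipher over Z_31, with the A and B from the example.
N = 31
A = [[9, 3], [10, 15]]
A_inv = [[9, 23], [25, 24]]   # inverse of A modulo 31
B = [19, 11]

def enc_pair(x):
    # Y = A X + B (mod 31)
    return [(A[i][0] * x[0] + A[i][1] * x[1] + B[i]) % N for i in range(2)]

def dec_pair(y):
    # X = A^{-1} (Y - B) (mod 31)
    d = [(y[i] - B[i]) % N for i in range(2)]
    return [(A_inv[i][0] * d[0] + A_inv[i][1] * d[1]) % N for i in range(2)]

# 'at' -> [1, 20] encrypts to [26, 11] = 'zk' and decrypts back again
assert enc_pair([1, 20]) == [26, 11]
assert dec_pair(enc_pair([1, 20])) == [1, 20]
```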
This means that if a person knows how to encipher a message, he or she can also decipher messages without too much effort. In substitution ciphers, if A and B become known to the eavesdropper, it is a simple matter for him/her to calculate A⁻¹ and so decode the cipher-text.

As we saw in a previous unit, in the last thirty years or so a class of asymmetric (or public key) cipher systems, such as RSA, has been developed. In these, knowledge of how to encipher a message does not enable a code-breaker to decipher the message. In public key systems there are two keys: the public key used to encipher messages and the private key used to decipher them. Knowledge of the public key and the cipher-text is of no use to a code-breaker in determining the private key and/or the plain-text. The private key is only known by the recipient¹ of the message whereas the public key can be made public. Since only one person need know the private key, the system is far more secure than a symmetric system where both sender and receiver need to know the private key.

However, as we saw earlier, public key systems (such as RSA) do require a considerable amount of computing (raising integers with over 100 digits to some high power modulo some very large number). Some cryptosystems (such as PGP) use a symmetric cipher (such as the matrix substitution cipher discussed above with matrices of order > 10) and only use RSA to encode and send the symmetric key (that is, the modulus N and the matrices A and B) to the recipient. The message itself is then encoded using the symmetric cipher, which is considerably faster than using RSA.

Error-Detecting and Error-Correcting Codes

Imagine that binary information is being transmitted between computers over a noisy communication channel (for example a public telephone line). Owing to noise on the channel (that is, random electrical disturbances) a bit which is transmitted as 0 might be received as a 1 (or vice-versa); a transmission error has occurred.
A similar problem occurs when binary data is being stored on magnetic media (such as a hard-disk, floppy disk or magnetic tape). As time passes, the information on the media may be corrupted by exposure to stray magnetic fields, heat or severe mechanical vibration; a storage error has occurred.

An error-detecting code is a way of encoding extra bits in the signal before transmission so that any error in transmission can be detected by the receiver. Then, depending on the context, the receiver can either ask the transmitter to re-send that part of the message or it can ignore the erroneous part of the message. An error-correcting code is a way of encoding extra bits in the signal before transmission so that any error in transmission can be detected and corrected by the receiver. Error-correcting codes are useful when storing data on magnetic media as the original data can be reconstructed even after it has become corrupted. Error-correcting codes are also useful when it is not feasible to ask the transmitter to re-send the erroneous part of the message, for example when television signals are being sent back to Earth from a spacecraft in orbit around a distant planet and the communication delay could be several hours.

¹ It does not need to be known by the sender of the message (and in fact should not be known by the sender).

A code is said to be single error detecting if it can detect a single flipped bit in the received data, whereas if k altered bits can be detected it is said to be k-error detecting. Similarly, a code is said to be single error correcting if it can correct a single flipped bit in the received data, whereas if k altered bits can be corrected it is said to be k-error correcting. Obviously if a code is k-error-correcting, it is also at least k-error-detecting². Matrices over Z2 (the integers modulo 2) are useful for constructing error-detecting and error-correcting codes for binary data.
The simplest form of single error-detecting code is to add a single parity bit to the transmitted message: if the number of 1's in the message is even the parity bit is 0, whereas if the number of 1's is odd the parity bit is 1. Thus the transmitted message including the parity bit always has an even number of 1's.

Example
For simplicity we consider the transmission of 4 information bits (that is, a 'nybble' or half a 'byte') plus a parity bit:

Nybble   Parity bit   Transmitted
1011     1            10111    (odd number of 1's in the nybble)
1111     0            11110    (even number of 1's in the nybble)

If the message 10011 is received we know that an error has occurred: there are two 1's in the information bits, so the parity bit should be 0, but it is 1. If the parity check is correct we can be reasonably confident that no errors have occurred in transmission. Strictly speaking we only know that an even number of errors (zero, we hope!) has occurred, whereas if the parity is incorrect we know an odd number of errors has occurred. Of course, if the probability of an error occurring is low (say, 1 in 1000), then the probability of multiple errors occurring is small (assuming errors are not correlated, the probabilities are roughly 1 in a million for two errors and 1 in a billion for three errors). So in practice the possibility of multiple errors occurring is often ignored.

We can regard a message plus its parity bit as a row vector of order 5 over Z2:

11000 --> [1 1 0 0 0]
11101 --> [1 1 1 0 1]

and so on. One way of checking the parity is to multiply the transpose of the received vector Y (say) by the parity check vector P = [1 1 1 1 1] (working in Z2). This is equivalent to determining whether there is an even or odd number of 1's in the received vector Y. If P Yᵀ = [0], the received vector has the correct parity, whereas if P Yᵀ = [1] then we know an error has occurred (or, more accurately, that an odd number of errors has occurred).
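Over Z2 the product P Yᵀ amounts to summing the bits modulo 2, as this Python sketch shows (illustrative, not from the notes):

```python
# Even-parity check over Z_2: P Y^T with P = [1 1 1 1 1] is just the sum of
# the bits mod 2. Result 0 means correct parity; 1 means an odd number of errors.
def parity_check(y):
    return sum(y) % 2

assert parity_check([1, 0, 1, 1, 1]) == 0   # 1011 plus parity bit 1: parity OK
assert parity_check([1, 0, 0, 1, 1]) == 1   # the corrupted message 10011
```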
Although matrix multiplication here may seem an overly complicated way of distinguishing an even number of 1's from an odd number, this method generalizes naturally to more complicated error-correcting codes. Single error-correcting codes can be constructed by having several parity bits, each computed from a subset of the information bits. One of the best known means of doing this is a Hamming code.

² Unless the errors could be detected, how could they be corrected?

For a Hamming code of length N, the parity check matrix over Z2 has N columns which are, in order, the binary representations of the numbers 1, 2, ..., N. The parity bits in the message are situated at positions that are powers of 2, that is 1, 2, 4, ... and so on. For example, for a code of length 7 there are 3 parity bits occurring at positions 1, 2 and 4, plus 4 information bits (at positions 3, 5, 6 and 7). The parity check matrix is

P = [ 0 0 0 1 1 1 1 ]
    [ 0 1 1 0 0 1 1 ]
    [ 1 0 1 0 1 0 1 ]

The three parity bits in any code word X are chosen so that P Xᵀ = 0, where 0 is a zero column vector with the same number of rows as P. This leads to a set of simultaneous equations for the parity bits. For example, for the Hamming code of length 7, if the parity bits are denoted by p, q and r and the information bits by w, x, y and z, then the equation P Xᵀ = 0 with

X = [p, q, w, r, x, y, z]

leads to the equations

r = x + y + z
q = w + y + z
p = w + x + z

Remember that −1 = 1 in Z2, so that the equation r + x + y + z = 0 rearranges to the first of the above equations, and so on. Thus given the information bits we can very easily calculate the parity bits. For example, if the 4 information bits are I = [w, x, y, z] = [1, 0, 0, 1], we have r = 1, q = 0 and p = 0, so that the corresponding code vector is X = [0, 0, 1, 1, 0, 0, 1]. This is transmitted, and the receiver checks the received vector Y (say) by calculating P Yᵀ. If this is zero then no errors³ have occurred in transmission.
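The construction above can be sketched in Python (illustrative; the function names are mine, not the notes'):

```python
# Hamming code of length 7 over Z_2. Codeword layout X = [p, q, w, r, x, y, z]
# with parity bits p, q, r at positions 1, 2, 4 and information bits w, x, y, z.
P = [[0, 0, 0, 1, 1, 1, 1],   # columns are the binary representations of 1..7
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

def encode(w, x, y, z):
    r = (x + y + z) % 2
    q = (w + y + z) % 2
    p = (w + x + z) % 2
    return [p, q, w, r, x, y, z]

def syndrome(Y):
    # P Y^T over Z_2, read off as a binary number (0 means all checks pass)
    bits = [sum(P[i][j] * Y[j] for j in range(7)) % 2 for i in range(3)]
    return 4 * bits[0] + 2 * bits[1] + bits[2]

X = encode(1, 0, 0, 1)              # I = [1, 0, 0, 1] as in the text
assert X == [0, 0, 1, 1, 0, 0, 1]
assert syndrome(X) == 0             # valid codeword: zero syndrome

Y = X.copy(); Y[4] ^= 1             # corrupt bit 5 (index 4)
assert syndrome(Y) == 5             # the syndrome points at the flipped bit
```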
If it is non-zero, then we know a transmission error has occurred; moreover, if just one error has occurred, the column vector P Yᵀ gives (in binary) the position at which the error occurred. For example, if when we transmit X = [0, 0, 1, 1, 0, 0, 1] an error occurs at position 5, the vector Y = [0, 0, 1, 1, 1, 0, 1] will be received. On doing the parity check we find

P Yᵀ = [ 1 ]
       [ 0 ]
       [ 1 ]

Since 101 is the binary representation of 5, we know that the error occurred in position 5. We flip bit 5 in the received message to get the vector [0, 0, 1, 1, 0, 0, 1] that was transmitted; then extracting the information bits we have I = [1, 0, 0, 1].

³ Or at least 3 transmission errors have occurred.

Extended Hamming Codes

What happens if two errors actually occur? For example, suppose we have I = [1, 1, 0, 0], with corresponding code vector [0, 1, 1, 1, 1, 0, 0]. Now suppose errors occur in positions 2 and 7, so that the vector received is Y = [0, 0, 1, 1, 1, 0, 1]. This is the same received vector as in the previous paragraph, and thus the Hamming code above cannot distinguish between single and double errors. Fortunately it is easy to rectify this shortcoming for any Hamming code by adding an overall parity bit to the code. We obtain a code which is single error-correcting and double error-detecting.

For the example above the code vectors now have length 8, with 4 information bits and 4 parity bits (in positions 1, 2, 4 and 8). We add an 8th column consisting of 3 zeros to the parity check matrix and then add a 4th row consisting of all 1's. The parity check matrix is

P = [ 0 0 0 1 1 1 1 0 ]
    [ 0 1 1 0 0 1 1 0 ]
    [ 1 0 1 0 1 0 1 0 ]
    [ 1 1 1 1 1 1 1 1 ]

The four parity bits in any code word X are chosen so that P Xᵀ = 0, where 0 is a zero column vector with 4 rows. This leads to a set of 4 simultaneous equations for the parity bits.
If the parity bits are denoted by p, q, r and t and the information bits by w, x, y and z, then the equation P Xᵀ = 0 with

X = [p, q, w, r, x, y, z, t]

leads to the equations

r = x + y + z
q = w + y + z
p = w + x + z
t = p + q + w + r + x + y + z

For example, when I = [1, 0, 0, 1] we find r = 1, q = 0, p = 0 and t = 1. Thus we transmit X = [0, 0, 1, 1, 0, 0, 1, 1] and if an error occurs at position 5 the vector Y = [0, 0, 1, 1, 1, 0, 1, 1] is received. On doing the parity check we find

P Yᵀ = [ 1 ]
       [ 0 ]
       [ 1 ]
       [ 1 ]

As the overall parity check bit is 1 (in row 4) we know an odd number of errors has occurred and, assuming that one error occurred, its position is given (in binary) by the first 3 rows. Thus we can be reasonably confident that one error occurred in position 5, and so flipping bit 5 we find that the transmitted vector was [0, 0, 1, 1, 0, 0, 1, 1], corresponding to information bits I = [1, 0, 0, 1].

We now reconsider the previous example with I = [1, 1, 0, 0] where two errors occur in positions 2 and 7. The transmitted code vector is [0, 1, 1, 1, 1, 0, 0, 0] and the received vector is Y = [0, 0, 1, 1, 1, 0, 1, 0]. On doing the parity check we find

P Yᵀ = [ 1 ]
       [ 0 ]
       [ 1 ]
       [ 0 ]

Since the overall parity check bit is 0 (in row 4) we know an even number of errors has occurred, and since rows 1-3 are not all zero we know at least one error has occurred. Hence two (or more) errors must have occurred. Depending on the context we can ask for the vector to be re-sent or we can discard it.

Multiple Error Detecting or Error Correcting Codes

It is possible to construct codes that can correct more than one error. For example, the Mariner space probes to Mars transmitted back television pictures using a Reed-Muller code of length 32 with 6 information bits.
Thus the transmitted message, including the extra bits for error detection and correction, was over 5 times as long as the original message. These Reed-Muller codes allowed up to 7 errors to be corrected (and 8 errors to be detected) in each 32-bit word. Using such a highly redundant 7-error-correcting code was superior to the alternative, which would have been to store a copy of each transmitted picture on the space-probe until the receiving station signalled back that the picture had been received correctly or requested its retransmission. Owing to the large distances involved (15 light-minutes or so), the round-trip transmission time was 30 minutes or so, and pictures would have needed to be stored on the space-probe for at least that long. At the time this would have required excessively large amounts of (heavy and expensive) on-board computer memory, whereas transmission bandwidth was not a significant problem. Similar codes were used on the Voyager space probes to the outer planets. However, the mathematics of such codes (which uses modular arithmetic and the theory of finite fields) is beyond the scope of this course.
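As a final illustration, the two parity checks from the extended Hamming section above can be reproduced with a small Python sketch (illustrative, not from the notes):

```python
# Extended Hamming code of length 8: the length-7 parity-check matrix with an
# all-zero 8th column in rows 1-3 and a fourth all-ones row (overall parity).
P = [[0, 0, 0, 1, 1, 1, 1, 0],
     [0, 1, 1, 0, 0, 1, 1, 0],
     [1, 0, 1, 0, 1, 0, 1, 0],
     [1, 1, 1, 1, 1, 1, 1, 1]]

def check(Y):
    # P Y^T over Z_2, returned as the column [row1, row2, row3, row4]
    return [sum(P[i][j] * Y[j] for j in range(8)) % 2 for i in range(4)]

assert check([0, 0, 1, 1, 0, 0, 1, 1]) == [0, 0, 0, 0]   # valid codeword
# Single error in position 5: overall parity (row 4) is 1, rows 1-3 give 101 = 5
assert check([0, 0, 1, 1, 1, 0, 1, 1]) == [1, 0, 1, 1]
# Double error: overall parity 0 but rows 1-3 non-zero, so two or more errors
assert check([0, 0, 1, 1, 1, 0, 1, 0]) == [1, 0, 1, 0]
```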