Chapter 17: Information Science

Lesson Plan
  Binary Codes
  Encoding with Parity-Check Sums
  Cryptography
  Web Searches and Mathematical Logic

© 2006, W.H. Freeman and Company
For All Practical Purposes: Mathematical Literacy in Today’s World, 7th ed.

Binary Codes

Mathematical Challenges in the Digital Revolution
  How to correct errors in data transmission
  How to electronically send and store information economically
  How to ensure security of transmitted data
  How to improve Web search efficiency

Binary Code – A system for coding data made up of two states (or symbols).

Binary Codes
  Binary codes are the hidden language of computers, made up of two states, 0 and 1.
  Examples of binary codes: Postnet code, UPC code, Morse code, Braille, etc.
  Other recent advancements, such as CD players, fax machines, digital TVs, and cell phones, use binary coding, with data represented as strings of 0’s and 1’s rather than the usual digits 0 through 9 and letters A through Z.

Encoding with Parity-Check Sums

Binary Coding
  Strings of 0’s and 1’s with extra digits for error correction can be used to send full-text messages.
  Example: Assign the letter a the string 00001, b the string 00010, c the string 00100, and so on, until all letters and characters are assigned a binary string of length 5.
  For this example we can have 2^5 = 2 × 2 × 2 × 2 × 2 = 32 possible binary strings.

Error Detection and Correction via Binary Coding
  By translating words into binary code, error detection can be devised so that errors in the transmission of the code can be corrected.
  The messages are encoded by appending extra digits, determined by the parity (even or odd sums) of certain portions of the messages.

Parity-Check Sums – Sums of digits whose parities determine the check digits.
  Even Parity – Even integers are said to have even parity.
  Odd Parity – Odd integers are said to have odd parity.
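The idea of a check digit determined by parity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a single parity-check sum over the whole message (the slides do not say which sums the chapter's code uses), and the function names append_parity_digit and has_even_parity are hypothetical.

```python
def append_parity_digit(message: str) -> str:
    """Append one check digit so the code word has an even number of 1's.

    Assumption: a single parity-check sum over all message digits.
    """
    if set(message) - {"0", "1"}:
        raise ValueError("message must contain only 0's and 1's")
    check_digit = sum(int(d) for d in message) % 2  # 1 if the sum is odd
    return message + str(check_digit)


def has_even_parity(word: str) -> bool:
    """Return True if the number of 1's in the word is even."""
    return sum(int(d) for d in word) % 2 == 0


code_word = append_parity_digit("00010")   # the string assigned to b
print(code_word)                           # 000101
print(has_even_parity(code_word))          # True: no error detected
print(has_even_parity("010101"))           # False: one digit was flipped
```

A single check digit like this can only detect that an odd number of digits changed; correcting the error requires several parity-check sums over different portions of the message.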
Decoding – The process of translating received data into code words.
  Example: Suppose the parity-check sums detect an error. The received message is compared to each of the possible correct messages (the code words).
  Decoding works by computing the distance between two strings of equal length, that is, the number of positions in which the strings differ.
  The code word that differs from the received message in the fewest positions is chosen to replace it.
  In other words, the computer is programmed to automatically correct the error by choosing the “closest” permissible answer.

Nearest-Neighbor Decoding Method – A method that decodes a received message as the code word that agrees with the message in the most positions.
  Assuming that errors occur independently, the nearest-neighbor method decodes each received message as the code word it most likely represents.

Binary Linear Code – A binary linear code consists of words composed of 0’s and 1’s obtained from all possible messages of a given length by using parity-check sums to append check digits to the messages.
  The resulting strings are called code words.
  Think of the binary linear code as a set of n-digit strings in which each string is composed of two parts: the message part and the remaining check-digit part.

Weight of a Binary Code – The minimum number of 1’s that occur among all nonzero code words of that code.

Variable-Length Code – A code in which the number of symbols for each code word may vary.
  As in Morse code, the letters that occur most frequently have the shortest codes, similar to data compression.

Data Compression – The process of encoding data so that the fewest symbols represent the most frequently occurring data.

Delta Encoding – A simple method of compression for sets of numbers that fluctuate little from one number to the next.
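Delta encoding can be illustrated with a short Python sketch. It assumes the common form of the method, in which the first number is stored as is and every later number is replaced by its difference from the previous one; the function names and the sample readings are made up for the example.

```python
from itertools import accumulate


def delta_encode(values):
    """Keep the first value, then store each change from the previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values by taking running sums of the deltas."""
    return list(accumulate(deltas))


readings = [1000, 1002, 1001, 1005, 1005]   # hypothetical data
encoded = delta_encode(readings)
print(encoded)                   # [1000, 2, -1, 4, 0]
print(delta_decode(encoded))     # [1000, 1002, 1001, 1005, 1005]
```

Because the numbers fluctuate little, the differences are small and need fewer symbols than the original values.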
Huffman Coding – A widely used scheme for data compression created in 1951 by a graduate student, David Huffman.
  The code is built from a so-called code tree, made by arranging the characters from top to bottom according to increasing probabilities.

Cryptography

Cryptology – The study of how to make and break secret codes.
  In many situations, there is a desire for security against unauthorized interpretation of coded data (a desire for secrecy). Hence came cryptology.

Encryption – The process of encoding data (or simply disguising the data) to protect against unauthorized interpretation.
  In the past, encryption was primarily used for military and diplomatic transmission.
  Today, encryption is essential for securing electronic transactions of all kinds. Here are some examples:
    Web sites allowed to receive/transfer encrypted credit-card numbers
    Schemes to prevent hackers from charging calls to your cell phone
    Various schemes used to authenticate electronic transactions

Three Types of Cryptosystems:
  Caesar Cipher – A cryptosystem used by Julius Caesar whereby each letter is shifted the same amount. It takes little effort to “crack” this code!
  Modular Arithmetic – A more sophisticated scheme for transmitting messages secretly. This method of encrypting data is based on addition and multiplication modulo n. For any positive integers a and n, we define a mod n (read “a modulo n”) to be the remainder when a is divided by n.
  Vigenère Cipher – A cryptosystem that uses a key word to determine how much each letter is shifted.
    Key word – A word used to determine the amount of shifting for each letter while encoding a message.

RSA Public Key Encryption Scheme – A method of encoding that permits each person to announce publicly the means by which secret messages are to be sent to him or her.
The scheme is named in honor of Rivest, Shamir, and Adleman, who discovered it.
The method is practical and secure because no one knows an efficient algorithm for factoring large integers (about 200 digits long).

Cryptogram – A sentence (or message) that has been encrypted.
  Cryptography is the basis for popular word puzzles, called cryptograms, found in newspapers, puzzle books, and Web sites.

Cryptogram Tips
  Knowing the frequency of letters may help: a widely used frequency table gives the frequency of letters in normal English usage.
  Here are some other helpful tips to know when solving cryptograms:
    A word consisting of a single letter must be the word a or I.
    Most common two-letter words in order of frequency: of, to, in, it, is, be, as, at, so, we, he, by, or, on, do, if, me, my, up, an, go, no, us, am.
    Most common three-letter words in order of frequency: the, and, for, are, but, not, you, all, any, can, had, her, was, one, our, out, day, get.
    Most common four-letter words in order of frequency: that, with, have, this, will, your, from, they, know, want, been, good, much, some, time.
    The most commonly used words in the English language in order of frequency: the, of, and, to, in, a, is, that, be, it, by, are, for, was, as, he, with, on, his, at, which, but, from, has, this, will, one, have, not, were, or.
    The most common double letters in order of frequency: ss, ee, tt, ff, ll, mm, oo.

Web Searches and Mathematical Logic

Web Searches
  In 2004, the number of Web pages indexed by large Internet search engines, such as Google, exceeded 8 billion.
  The algorithm used by the Google search engine, for instance, ranks all pages on the Web using interrelations to determine their relevance to the user’s search.
  Factors such as the frequency of key words, their location near the top of the page, font size, and number of links are taken into account.
Boolean Logic – A branch of mathematics that uses operations to connect statements, such as the connectives AND, OR, and NOT.
  Boolean logic was named after George Boole (1815–1864), a nineteenth-century mathematician.
  Boolean logic is used to make search engine queries more efficient.

Expression – In Boolean logic, an expression is simply a statement that is either true or false.
  Complex expressions can be constructed by connecting individual expressions with the connectives AND, OR, and NOT.
  Connectives, their math notations, and meanings:
    AND (conjunction, ∧) means to find the results with all of the words.
    OR (disjunction, ∨) means to find the results with at least one of the words.
    NOT (negation, ¬) means to find the results without the words.
  Two expressions are said to be logically equivalent if they have the same value, true or false, for every combination of values of their variables.

Truth Tables – Tabular representations of an expression in which the variables and the intermediate expressions appear in columns, and the last column contains the expression being evaluated.
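A truth table can be generated mechanically by trying every combination of true and false. The Python sketch below is only an illustration: the helper names truth_table and logically_equivalent are hypothetical, and the two expressions compared, NOT (P AND Q) and (NOT P) OR (NOT Q), are an example chosen here rather than one taken from the slides.

```python
from itertools import product


def truth_table(expression, num_variables):
    """Return one row (inputs..., result) per combination of True/False."""
    return [values + (expression(*values),)
            for values in product([True, False], repeat=num_variables)]


def logically_equivalent(expr_a, expr_b, num_variables):
    """Two expressions are equivalent if they agree on every row."""
    return all(expr_a(*values) == expr_b(*values)
               for values in product([True, False], repeat=num_variables))


def left(p, q):
    return not (p and q)        # NOT (P AND Q)


def right(p, q):
    return (not p) or (not q)   # (NOT P) OR (NOT Q)


for p, q, result in truth_table(left, 2):
    print(p, q, result)

print(logically_equivalent(left, right, 2))   # True
```

The last column of each printed row is the value of the expression being evaluated, matching the layout described for truth tables.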