Modular Arithmetic - Cryptographer's Mathematics http://www.antilles.k12.vi.us/math/cryptotut/home.htm http://www.antilles.k12.vi.us/math/cryptotut/mod_arithmetic.htm Introduction Mod-arithmetic is the central mathematical concept in cryptography. Almost any cipher from the Caesar Cipher to the RSA Cipher use it. Thus, I will show you here how to perform Mod addition, Mod subtraction, Mod multiplication, Mod Division and Mod Exponentiation. It is a very easy concept to understand as you will see. Use this page as a reference page and open it whenever you encounter any mod-calculations or mod-terminology that leave questions behind. So let's start: What is meant by Mod, Modulus and Modular Arithmetic? “Modulus” (abbreviated as "mod") is the Latin word for “remainder, residue” or more in “what is left after parts of the whole are taken”. Thus, "modular" or "mod arithmetic" is really "remainder arithmetic". More precise: We are looking for the integer that occurs as a remainder (or the "left-over") when one integers is divided by another integer. Let's do three examples: Example 1: When 7 is divided by 3 it leaves a remainder of 1. Think of $1 as a left over after $7 are equally split among 3 people. Surely, there is a mathematical notation for mod arithmetic: Instead of writing 7 = 3*2 + 1 where 1 is the integer remainder we will write: 7 mod 3 = 1 which reads as: "7 modulo 3 is 1" and 3 is called the "modulus". Sure, this notation does not reveal the $2 that every person gets as his share. True, however, we are solely interested in the left over part, the remainder of $1 in our example. Example 2: When 8 is divided by 3 it leaves a remainder of 2. Thus, we write: 8 mod 3 = 2. Example 3: When 9 is divided by 3 it leaves no remainder. Thus, we write: 9 mod 3 = 0. 1 Figure 1: Arithmetic MOD 3 can be performed on a clock with 3 different times: 0, 1 and 2. Computations involving the modulus to determine remainders are called “Modular Arithmetic”. It was first studied by the German Mathematician Karl Friedrich Gauss (1777-1855) in 1801. You may have heard this anecdote about Gauss when he went to school: His Mathematics teacher tried to keep the bored genius busy, so he asked him to add up the first 100 integers hoping that he would keep him quiet for a little while. However, young Karl promptly responded “5050 and the formula for the sum of the first n integers is n*(n+1) / 2”. Do you know why? Great, we have the principle of Mod Arithmetic straight: To find the remainder simply divide the larger integer by the smaller integer. This surely works for large numbers as well: I.e. 365 MOD 7 = 1 (since 365 = 52*7 +1) . What is the usage of Mod arithmetic? 365 MOD 7 = 1 tells us that if Christmas will fall on Thursday and we don't have a leap year it will fall on a Friday next year. The same for your birthday and any other day as well: every week day will fall on the following weekday the next year. Notice again that we only care about the remainder 1 and not the completed 52 weeks in a year. In fact if a year would consist of only 358 or 351 or 15 or 8 days, we would still have the same "shift by 1" effect. Apparently, solely the length of each week (called the modulus) determines the "shift by 1". What does 366 MOD 7 = 2 explain for leap years? Shift by 2 days. Congruent numbers Integers that leave the same remainder when divided by the modulus m are somehow similar, however, not identical. Such numbers are called "congruent" . For instance, 1 and 13 and 25 and 37 are congruent mod 12 since they all leave the same remainder when divided by 12. We write this as 1 = 13 = 25 = 37 mod 12. However, they are not congruent mod 13. Why not? Yield a different remainder when divided by 13. Find 5 numbers that are congruent to 1) 7 mod 5 2,12,17,-3,-10 2) 7 mod 25 32,57,82,-18,-43 3) 17 mod 25. 42,67,92,-8,-33 Modular Arithmetic is also called Clock Arithmetic 2 The classical example for mod arithmetic is clock arithmetic: Look at the 12-hour clock in your room. You see 12 numbers on the clock. Here, the modulus is 12 with the twelve remainders 0,1,2,..11. So, when you give the time you actually give a remainder between 0 and 11. Again, the modulus m=12 is in charge of these reminders. What time is it 50 hours after midnight? It is 2 (a.m.) since 50 hours equal 2 full days and 2 hours. Reduce the following: 1) 2) 3) 4) 5) 6) 40 mod 12 50 mod 12 50 mod 24 40 mod 24 100 mod 33 1000 mod 33 4 2 2 16 1 10 In Modular Arithmetic, we add, subtract, multiply, divide and exponentiate as follows: A) Mod Addition Let's start simple: What time is it 10 hours after 11:00? It is 11+10 = 21 o'clock, and 21 minus the modulus 12 leaves a remainder of 9, thus 9 o'clock. What time is it 22 hours after 11:00? It is 11+22 = 33 and subtracting the modulus 12 repeatedly (which is also called "dividing") yields again 9. Ignoring a.m. and p.m., we are performing mod arithmetic on the clock. Let's write the two examples in mod notation: 11+10 = 21 mod 12 = 9 and 11 + 22 = 33 mod 12 = 9. How to perform Mod Addition: First add the two numbers, Secondly, divide the sum by the modulus to compute the remainder. In "8-hour-land" where a day lasts only 8 hours, we would add 12 and 9 as follows: First, 12+9 = 21, secondly 21 divided by the modulus 8 leaves a remainder of 5 since 21=2*8+5. How many different remainders does "8-hour-land" have? Click here for a modular clock. B) Mod Subtraction 3 Subtraction is performed in a similar fashion: First subtract, secondly compute the remainder. Example 1: 25 - 8 = 17 MOD 12 = 5 Example 2: 50 - 11 = 39 MOD 12 = 3 What if we obtain a negative answer? Say it is 2 o'clock in New York, what time is it in L.A.? Turning the hand on a clock 3 hours backwards shows that it is 11 o'clock: 2 - 3 = -1 MOD 12 = 11 Thus, if the answer is negative, add the modulus you get a positive number. That number must be between 0 and the modulus. Example 3: 3 - 50 = -47 MOD 12 = 1 since - 1 + 12 =11. Example 4: 14 - 77 = -63 MOD 12 = 9 since -63 + 12 + 12 + 12 + 12 + 12 + 12 = 9. Example 5: 50 - 11 = -39 MOD 15 = 6 since -39 + 15 + 15 + 15 = 6 C) Mod Multiplication Since multiplication of positive numbers is repeated addition it can be reduced to the above mod addition. How do we compute 5 * 8 MOD 12? First we multiply: 5 * 8 = 40 secondly we find the remainder: 40 MOD 12 = 4. A useful shortcut: A mod expert would find the answer to 123 * 62 mod 12 immediately. It is 6. How does she know? Without being a Gauss genius, she computes 123 mod 12 = 3 and 62 mod 12 = 2 and multiplies those two answers. To verify this: 123*62 mod 12 = 7626 mod 12 = 6. This computation aid is true for addition and subtraction as well. Therefore, we may write them as: 3 Computation Rules for Mod Arithmetic 1) a + b mod m = (a mod m) + (b mod m) 2) a - b mod m = (a mod m) - (b mod m) 3) a * b mod m = (a mod m) * (b mod m) Compute the following: 1) 2) 3) 4) 5) 6) 73 + 58 = x mod 12 1411 - 285 = x mod 141 74 * 93 = x mod 13 33 * 266 = x mod 26 2590 * 5253= x mod 26 133 * 5202 = x mod 26 11 139 5 16 16 6 4 D) Mod Division Division is the inverse operation of multiplication. This means that every division question can be answered by answering a "find the missing number" multiplication question. I.e. Since 5*8 = 4 MOD 12 dividing by 5 yields 8 = 4/5 MOD 12. Thus, if I had asked you: Compute 4/5 MOD 12, the answer is apparently 8. Example 1: To compute 5 / 7 mod 12, we introduce an x x = 5 / 7 mod 12 to multiply both sides by 7. 7x = 5 mod 12. We find x by testing the 12 different remainders 0, 1, ...11. Trial and error yields x=11 since 7 * 11 mod 12 = 77 mod 12 = 5. Do these problems: 1) 2) 3) 4) 5) 6) 7 / 5 = x mod 12 7 / 11 = x mod 12 29 / 7 = x mod 12 x * 7 = 8 mod 12 7 * x = 9 mod 12 x * 7 = 5 mod 12 11 5 11 8 3 11 A Better Division Method If the modulus is small as above, trial and error will find the answer. But what if the modulus is large, i.e. x * 8 = 19 mod 131 ? Testing each remainder would take a long time. Surely, we could have a computer do the testing for us, and we would confirm the found answer by performing the Mod multiplication. Luckily, all this is not necessary as there exists a straightforward method to perform Mod Division: To divide i.e. 5 by 7 mod 12, we first rewrite it as we did above by multiplying both sides by 7: x * 7 = 5 mod 12 To isolate x, we simply multiply both sides by the inverse of 7 mod 12, which is by chance 7 itself since 7 * 7 mod 12 = 49 mod 12 = 1. Now, we multiply both sides by 7 which yields x on the left and 7 * 5 mod 12 = 35 mod 12 = 11 on the right. Therefore, x = 11 mod 12 or 5/7 = 11 mod 12. Done we are. Looking back at this method, the only tricky part is how to find the inverse. If we had to do trial and error, 5 we would not gain anything in comparison to our previous method. It is the extended version of the Euclidean Algorithm that allows us to find the inverse mod m in an efficient manner. Click here for an explanation of the Extended Euclidean Algorithm. I also created a Javascript-demonstration of the Extended Euclidean Algorithm here which allows you to understand the mechanics involved quickly. Of course, you don't have to practice this Algorithm, a computer will do this for us. You do the trial and error method to find multiplicative inverses. Afterwards, check your answers here: 1) 2) 3) 4) 5) 6) 5 mod 12 5 mod 11 5 mod 13 7 mod 26 11 mod 26 15 mod 26 5 9 8 15 19 7 Try to solve the following 4 challenging problems. Hint: Let a be 2, 3, 4, etc., compute the inverse a-1 in each case and find a pattern. 7) Find a-1 mod 2a-1. 8) Find a-1 mod 2a+1. 9) Find a-1 mod a2-1. 10) Find a-1 mod a2+1. 2 2a -1 a a2+1-a Using the found inverses, now perform the following Mod divisions yourself. First rewrite the equations as we did in example 1, then compute. 11) 12) 13) 14) 15) 16) 7 / 5 = x mod 12 7 / 5 = x mod 11 7 / 5 = x mod 13 3 / 7 = x mod 26 9 / 11 = x mod 26 11 / 15 = x mod 26 11 8 4 19 15 25 Some Mod Divisions have no Solution Unlike the division of real numbers, mod division does not always yield an answer. The reason for that is that the mod-multiplication does not always yield all possible remainders less than the modulus. Let's investigate this fact. For example: Using a modulus of m=6, we set up a multiplication table that displays the multiplications of the remainders 0,1,2,3,4 and 5. * mod 6 0 1 2 3 4 5 0 0 0 0 0 0 0 1 0 1 2 3 4 5 2 0 2 4 0 2 4 3 0 3 0 3 0 3 4 0 4 2 0 4 2 6 4 0 4 2 0 4 2 5 0 5 4 3 2 1 Two Observations: 1) Some divisions have no answer The rows created by the remainders 0,2,3,4 do not contain all six remainders. Consequently, some divisions have no answer. I.e. consider division by 2: 4 / 2 = 2 mod 6 since 2 * 2 = 4 mod 6. 2 / 2 = 1 mod 6 since 1 * 2 = 2 mod 6. However, 3 / 2 = x mod 6 has no answer x since there exists no remainder x such that 2 * x yields 3 mod 6. Also, 5 / 2 mod 6 has no answer. Why not? In fact no odd integer could possibly be divided by 2 mod. If, however, we use a modulus of 7 any odd (and any even) integer less than 7 can be divided by 2. Explain why by using the multiplication for mod 7 below. 2) Some divisions have many answers I.e. 4 / 2 = 2 mod 6 since 2 * 2 = 4 mod 6, also 4 / 2 = 5 mod 6 since 2 * 5 = 4 mod 6. Even seemingly odd divisions like 0/3 or even worse 0/0 are legal mod 6. Both have the answer 2. Explain this. If we don't limit us to the six remainders as answers, we actually find an infinite number of answers. Notice that also 2*8, 2*11, 2*14, 2*17, ... yield 4 mod 6. Consequently, when dividing 4 by 2 mod 6 8,11,14,17,...are correct answers as well. Thus, don't be surprised if your partner finds a different answer than you. It might be just as correct as yours. Exercise: Give five answers to 3 / 3 mod 6. Is the obvious answer "1" correct? Exercise 1: Create mod multiplication table using modulus 6,7,8 on paper. Afterwards, verify their correctness by creating these tables at the right. Exercise 2: For encryption purposes, we prefer to have unique answers. By choosing the modulus in a certain way, can we assure unique answers? How shall we choose the modulus? Come up with a guess why division by 1 and 5 yields unique answers mod 6 (when restricting us to the six remainders 0,1,2,3,4 and 5). Observe from your tables created in exercise 1 the following facts: a) Division by 1,3,5 and 7 yields unique answers mod 8, b) division by 1,2,4,5,7 and 8 yields unique answers mod 9, c) division by 1,2,3,4,5 and 6 yields unique answers mod 7. Before you continue reading write down your guess when mod division yields unique answers and when not. Answer: If the number we are dividing by is relative prime to the modulus (that means their only common divisor is 1) the divisions yield unique answers. Consequently, if the modulus is a prime number (hence all integers less than the prime number are relative prime) all divisions yield unique answers. 7 Example: If the modulus is m=7, the divisions yield unique solutions. * mod 7 0 1 2 3 4 5 6 0 0 0 0 0 0 0 0 1 0 1 2 3 4 5 6 2 0 2 4 6 1 3 5 3 0 3 6 2 5 1 4 4 0 4 1 5 2 6 3 5 0 5 3 1 6 4 2 6 0 6 5 4 3 2 1 Perform the following divisions. Which one has multiple answers between 0 and the modulus? Find them if they exist. Which division has a unique answer? Which division has no answer? 1) 2) 3) 4) 5) 6) 7 / 4 = x mod 12 6 / 9 = x mod 12 7 / 5 = x mod 13 3 / 13 = x mod 26 4 / 10 = x mod 26 12 / 10 = x mod 29 (or 7 = 4*x mod 12) No answer. (or 6 = 9*x mod 12) x=2. (or 7 = 5*x mod 13) x=4. (or 3 = 13*x mod 26) No answer. (or 4 = 10*x mod 26) x=3. (or 12 = 10*x mod 29) x=7. E) Mod Exponentiation In the encryption process of the RSA Cipher the plain message is raised to the power of e mod m, where e and m (commonly a 200 digit number) make up the public key. To account for huge numbers, one of our goals in this section is to learn some shortcuts when performing mod exponentiation. Since mod exponentiation is repeated multiplication, it can be reduced to the above mod multiplication. How do we compute 34 MOD 12? First we multiply: 3 * 3 * 3 * 3 = 81, secondly we find the remainder: 81 mod 12 = 9. Compute the following: 1) 2) 3) 4) 5) 43 mod 12 44 mod 12 53 mod 12 27 mod 20 115 mod 10 4 4 5 8 1 Don't worry if you did not get the last one. However, if you did get 1, congratulations. You figured out already the shortcut that can be used to compute large powers. To compute 115 mod 10, we compute (11 mod 10) = 1 and multiply that answer 5 times by itself which yields the answer 1. Using this shortcut, the answer to 125 mod 10 is 2 since 12 mod 10 = 2 and 25 mod 10 = 32 mod 10 = 2. 8 Shortcut 1: Instead of first computing the (large) power and secondly finding the remainder, it is easier to find the remainders of smaller powers and mod multiply them to get the final answer. Compute the following: 1) 2) 3) 4) 5) 143 mod 12 244 mod 12 3716 mod 12 827 mod 20 1635 mod 10 8 0 1 8 3 There is a fast way to compute 2377 mod 24 . Since 23 = -1 mod 24, we may write (-1)77 mod 24 which simplifies to -1 mod 24. To get a positive answer, we add 24 to get 23 as the final answer. Shortcut 2: If the base is a little less than the modulus, then rewrite the base as a small negative number which when exponentiated yields a smaller answer then the original power. Compute the following: 1) 113 mod 12 11 16 2) 154 mod 17 1 3) 5416 mod 55 40 4) 827 mod 84 4 5) 165 mod 19 6) Powers such as 12345676 would yield an overflow on your calculator. However, performing modular arithmetic using the modulus m=1234569 we are able to compute the answer 64. Why? Answer: 1234567 = -2 mod 1234569. Thus, (-2)6 = 64 MOD 1234569 We conclude the Mod Exponentiation with one last shortcut. There is a fast way to compute 211 mod 15. Since 211 = ((22)2)2 * 23 , we compute 211 mod 15 as (24)2 * 23 mod 15 =((24)2 mod 15 ) * (23 mod 15 ) = 1 * 8 mod 15 = 8. Check: Dividing 2048 by 15 leaves a remainder of 8. Correct. Shortcut 3: (Repeated Squaring) To compute an, divide the exponent n by the greatest power of 2 that is less than n. This yields an exponent e and a remainder r. Finally, compute ae * ar. To compute ae, use shortcuts if necessary. ar is a fairly small number. Compute the following: 1) 311 mod 12 3. Since 32 = 9, 34=(9)2 = 81 = 9 mod 12. Also: 38=(34)2=(9)2 = 81 = 9 mod 12. 3 Since, 3 = 27 = 3 mod 12, 311 = 38 * 33 = 9 * 3 = 27 = 3 mod 12. 4. Since 42 =4, 44=48=4. 48 * 43 = 4. 2) 411 mod 12 17 3) 4 mod 17 4) 513 mod 17 3. Since 52=8, 54 = 13 = -4, 58 = 16 =-1, 513 = -1 * -4 * 5 =20 3. Since 32 = 9, 34 = -4, 38 = -1, 316 =1, 332 = 1 5) 333 mod 17 5 6) 16 mod 19 4. Since 16 = -3, 162 = 9, 164 = 81 = 5. A computation challenge: 7) 2130 mod 17 8) 2269 mod 19 9) 3333 mod 15 4. Since 24 =-1, 28=216=...=2128 =1. 10. Since 24 = -3, 28 =9, 216 = 5,... 5 3. Since 3 = 3, ... 9 10 Substitution Ciphers (March 11, 2004) About the Ciphers The name substitution cipher comes from the fact that each letter that you want to encipher is substituted by another letter or symbol, but the order in which these appear is kept the same. Another way to say this is that the message you want to keep secret (called the plaintext) is transformed into the enciphered message (called the ciphertext) by using a different alphabet. It is useful to keep track of the different alphabets by always writing your plaintext in lowercase and your ciphertext in uppercase. You have probably seen this type of cipher before; often in the newspapers there is a "cryptoquip" which challenges you to solve just such a cipher. Examples Try to decipher the encrypted text below. Each cipher below uses some systematic way of replacing the plaintext letters with the ciphertext letters. In addition to figuring out what each says, try to figure out how the message was enciphered. Use the word breaks to give you clues as to what is being said, and look for patterns you recognize. All of these messages are in English! You can also look at some hints. 1. Caesar shift: WKLV PHVVDJH ZDV HQFUBSWHG XVLQJ D FDHVDU VKLIW FLSKHU. Answer 2. Atbash: GL WVXLWV GSRH, BLF HLOEVW ZM ZGYZHS XRKSVI. VCXVOOVMG! Answer 3. Keyword: TCDR ROJTOJIO WAR OJIDMCOQOS WDTC TCO FOYWKQS AGDRKJ. Answer Challenge Problems After you have tried the examples above, try the ciphers on the challenge sheet. Questions for Thought 1. What allowed you to break these? 2. What if you didn't have any word breaks? 11 Answer 3. What else would make this harder? Answer 4. Often the ciphertext alphabet is called the key to this sort of cipher (since knowing the alphabet allows you to instantly decrypt the ciphertext). How many possible keys are there? Answer Other Links* • Code Breaking Software Polyalphabetic Substitution Ciphers (March 18, 2004) About the Ciphers Last week we worked on monoalphabetic substitution ciphers -- ones which were encoded using only one fixed alphabet (hence the Greek root "mono" meaning "one"). As you saw, especially when the spaces between words are still there, these are fairly easy to break. Given a few minutes and several people working on a message, the secret contents are revealed. So, how can you make this harder? Well, one way is to use more than one alphabet, switching between them systematically. This type of cipher is called a polyalphabetic substitution cipher ("poly" is the Greek root for "many"). The difference, as you will see, is that frequency analysis no longer works the same way to break these. One such cipher is the famous Vigenere cipher, which was thought to be unbreakable for almost 300 years! The Vigenere cipher uses the power of 26 possible shift ciphers (which we met last week). How this Cipher Works 1. Pick a keyword (for our example, the keyword will be "MEC"). 2. Write your keyword across the top of the text you want to encipher, repeating it as many times as necessary. 3. For each letter, look at the letter of the keyword above it (if it was 'M', then you would go to the row that starts with an 'M'), and find that row in the Vigenere table. 4. Then find the column of your plaintext letter (for example, 'w', so the twenty-third column). 5. Finally, trace down that column until you reach the row you found before and write down the letter in the cell where they intersect (in this case, you find an 'I' there). Example: 12 Keyword: M E C M E C M E C M E C M E C M E C M E C M Plaintext: w e n e e d m o r e s u p p l i e s f a s t Ciphertext: I I P Q I F Y S T Q W W B T N U I U R E U F Thus, the urgent message "We need more supplies fast!" comes out: I I P Q I F Y S T Q W W B T N U I U R E U F So, as you can see, the letter 'e' is enciphered sometimes as an 'I' and sometimes as a 'Q'. Not only that, but 'I' represents two different letters, sometimes a 'w' and sometimes an 'e'. This renders our favorite tool, frequency analysis, nearly useless. Even though 'e' is used very often in the plaintext, the letters that replace it ('I' and 'Q') don't show up as frequently. Also, now if we check doubled letters in the ciphertext (say 'II' or 'WW'), these are not doubled letters in the plaintext. You may, then, ask yourself "is there any hope?" Fortunately, there is! Given a long enough piece of ciphertext, certain words or parts of words (like "the") will line up with the keyword several times, giving rise to a repeated string of letters in the ciphertext ("the" may be enciphered as "KPQ" more than once). This can give us a clue as to the length of the keyword. After that, we can use frequency analysis on each piece that was enciphered with the same letter to crack the code. Consequently, cracking these ciphers hinges on finding repeated strings of letters in the ciphertext. How to crack this cipher: 1. Search the ciphertext for repeated strings of letters; the longer strings you find the better (say you find the string "KPQ" four times). Note where they are by circling them or highlighting them in some manner. 2. For each occurrence of a repeated string, count how many letters are between the first letters in the string and add one (for example, if our ciphertext contains KPQRE IIJKO KPQAE, we count that there are nine letters between the first 'K' in the first "KPQ" and the first 'K' in the second "KPQ"; adding one yields ten). 3. Factor the number you got in the above computation (2 and 5 are factors of 10) 4. Repeat this process with each repeated string you find and make a table of common factors. The most common factor is probably the length of the keyword that was used to encipher the ciphertext (in our case, assume it was five). Call this number 'n'. 5. Do a frequency count on the ciphertext, on every nth letter. You should end up with n different frequency counts. 6. Compare these counts to standard frequency tables to figure out how much each letter was shifted by. 7. Undo the shifts and read off the message! Examples 1. Encipher the following message using the Vigenere cipher and the keyword "IHS": 13 there is a secret passage behind the picture frame Answer 2. There is an easier way to use the Vigenere cipher, using modular arithmetic. Discuss how you might do this (hint: represent each letter by a number, starting with 'a' = 0). Using this method, encipher the above message using the key 19, 15, 22 Answer 3. Decipher the following message (work as a team!): I C J E V A Q I P W B C I J R Q F V I F A Z C P Q Y M J A H N G F Y D H W E Q R N A R E L K B R Y G P C S P K W B U P G K B K Z W D S Z X S A F Z L O I W E T V P S I T Q I S O T F K K V T Q P S E O W K P V R L J I E C H O H I T F P S U D X X A R C L J S N L U B O I P R J H Y P I E F J E R B T V M U Q O I J Z A G Y L O H S E O H W J F C L J G G T W A C W E K E G K Z N A S G E K A I E T W A R J E D P S J Y H Q H I L O E B K S H A J V Y W K T K S L O B F E V Q Q T P H Z W E R Z A A R V H I S O T F K O G C R L C J L O K T R Y D H Z Z L Q Y S F Y W D S W Z O H C N T Q C P R D L O A R V H S O I E R C S K S H N A R V H L S R N H P C X P W D S I L P L Z V Q L J O E N L W Z J F S L C I E D J R R Y X J R V C V P O E O L J U F Y R Q F G L U P H Y L W I S O T F K W J E R N S T Z Q M I V C W D S C Z V P H V C U E H F C B E B K P A W G E P Z I S O T F K O E O D N W Q Z Q W H Y P V A H K W H I S E E G A H R T O E G C P I P H F J R Q Answer Challenge Problems After you have tried the examples above, try the ciphers on the challenge sheet. Other Links* 14 • Vigenere Cipher Applet Transposition Ciphers (March 25, 2004) About the Ciphers The last two weeks we have been working on substitution ciphers (monoalphabetic and polyalphabetic). Recall that substitution ciphers are ones in which each letter is replaced by another letter (or symbol) in some systematic way. However, the order in which the letters appear stays the same. This week, we're going to work on a few transposition ciphers; ones for which the letters stay the same, but the order is all mixed up! One of the oldest ways to do this was created by the ancient Egyptians and Greeks. It uses a stick called scytale . They would have used wooden sticks and parchment, but we're going to use poster tubes and adding machine tape! How the scytale cipher works 1. Get a scytale and a strip of parchment. 2. Wrap your parchment around your scytale until the stick is covered. Try to avoid overlapping and gaps. 3. Write your message along the length of the stick, one character per pass of the paper. If you need more space, rotate the stick away from you and keep writing. 4. Unwrap the scytale and send the scrambled message to a friend with the same-diameter stick. 5. The friend then wraps his scytale with the encoded parchment. Since the diameters are the same, the message is clearly legible! This technique was very useful in ancient battles; the Spartans are known to have used this rather extensively. Each general was given a stick of uniform diameter so that he could quickly encipher and decipher any message sent from other generals. Notice how quick and easy this is to use! However, it is also rather easy to crack. In a battle situation, the most likely way to crack this would be to steal a general's scytale. Then, each message could be read easily. However, it can be cracked even without stooping to theivery. As it ends up, the scytale is just a very old (and rather simple) version of a greater class of ciphers called matrix transposition ciphers. The way the simplest of these works is by picking a matrix of a fixed size (say, 6x10) and then writing your message across the rows. The encipherment step consists of writing down the letters in the matrix by following the columns. Here's a simple 6x10 example: T R OO P S HE AD I N GWE S T NE E D MOR E S UP P L 15 I E S S E NDGE N E R AL DUB OI S ME NT OAI D Where we've written the message: troops heading west need more supplies. send general dubois' men to aid row by row into the matrix. Then, to encipher this, we simply read off the columns to get: TIDIE MRNME REOGO SANOW RSLTP EEEDO SSSNU AHTUD BIENP GODAE PEIDE LNS The scytale cipher is just like one of these. Note that the number of "rows" in your message is determined by the diameter of your stick and the size of your writing. Cracking them, as you may guess, is just a matter of systematic guess-and-check. How to crack the simple matrix transposition ciphers: 1. Count how many letters are in the ciphertext (for this example, assume the ciphertext is 99 letters long) 2. Make all of the matrices that would fit such a length (e.g. 2x50, 3x33, 4x25, 5x20, 6x17, 7x15, 8x13, 9x11, 10x10). Use TWO of each size. 3. For each size matrix, write out the ciphertext across the rows on one copy. On the other copy, write out the ciphertext down the columns. 4. At each stage, see if you can find anything legible, reading perpendicular to how you put the ciphertext in. A harder version of the matrix transposition cipher is the column-scrambled matrix transposition cipher. Just like the ones above, you find a matrix of suitable dimensions and write your text in rowby-row. If there are blank cells left, fill them in with a dummy character (sometimes an 'X'). However, before writing down the ciphertext from the columns, you first scramble the columns. This generates a new matrix of the same size. Now read off the text down the columns, as before. This is a harder cipher, but there is a systematic way to crack it. How to crack the column-scrambled matrix transposition ciphers: 1. Count how many letters are in your ciphertext (for example, 75) and factor that number (75 =5*5*3). 2. Create all of the possible matrices to fit this ciphertext (in our case, 3x25, 5x15, 15x5, 25x3). 3. Write the ciphertext into these matrices down the columns. 4. For each of your matrices, consider all of the possible permutations of the columns (for n columns, there are n! possible rearrangements). In our case, we hope that the message was enciphered using one of the last two matrices (the 15x5 and the 25x3), since in those cases, 16 we have only 6 and 120 possibilites to check (3! = 6, 5! = 120, 15! ~ 1.31x10^12, 25! ~ 1.55x10^25). 5. Rearrange each matrix to see if you get anything intelligible. Read the message off row-byrow. Note that this is much more easily done by a computer than by hand, but it is doable (for small matrices). Examples 1. Play around with the scytales; write messages using each different-diameter tube and try to read them by using the other tubes. Try to stump your friends with your ciphers. 2. Decipher the following message (it was enciphered using a simple matrix transposition cipher note that the letters are in groups of five): DEEFY HSTEH TFIEO IVOUI ARSOO IAYSI WULLU CEULN NLBRU LFLBN OSLPW HRWDD GIEEW EEIED OWHYH EYOF Answer 3. Decipher the following message (it was enciphered with matrixtransposition cipher - the letters are again in groups of five). a column-scrambled NNRTA NNFTO IONEL IEKSD MRTLE LRTNE EGRTA NTAEI HXOIO EMOIO DMRTI HLDHR SNEEG UHLEG IHNNB GMAND NBTX Answer 4. Think of how you might program a computer to decipher some column-scrambled matrix transposition ciphertext. Answer Challenge Problems After you have tried the examples above, try the ciphers on the challenge sheet. Other Links* • • The Double Transposition Cipher Transposition cipher encryption over the internet * These links are for informational purposes only and are from sources outside MEC and Cornell University 17 Primes, Modular Arithmetic, and Public Key Cryptography (April 15, 2004) Introduction Every cipher we have worked with up to this point has been what is called a symmetric key cipher, in that the key with which you encipher a plaintext message is the same as the key with which you decipher a ciphertext message. As we have discussed from time to time, this leads to several problems. One of these is that, somehow, two people who want to use such a system must privately and secretly agree on a secret key. This is quite difficult if they are a long distance apart (it requires either a trusted courier or an expensive trip), and is wholly impractical if there is a whole network of people (for example, an army) who need to communicate. Even the sophisticated Enigma machine required secret keys. In fact, it was exactly the key distribution problem that led to the initial successful attacks on the Enigma machine. However, in the late 1970's, several people came up with a remarkable new way to solve the keydistribution problem. This allows two people to publicly exchange information that leads to a shared secret without anyone else being able to figure out the secret. The Diffie-Hellman key exchange is based on some math that you may not have seen before. Thus, before we get to the code, we discuss the necessary mathematical background. Prime Numbers and Modular Arithmetic Recall that a prime number is an integer (a whole number) that has as its only factors 1 and itself (for example, 2, 17, 23, and 127 are prime). We'll be working a lot with prime numbers, since they have some special properties associated with them. Modular arithmetic is basically doing addition (and other operations) not on a line, as you usually do, but on a circle -- the values "wrap around", always staying less than a fixed number called the modulus. To find, for example, 39 modulo 7, you simply calculate 39/7 (= 5 4/7) and take the remainder. In this case, 7 divides into 39 with a remainder of 4. Thus, 39 modulo 7 = 4. Note that the remainder (when dividing by 7) is always less than 7. Thus, the values "wrap around," as you can see below: 0 mod 7=0 6 mod 7=6 1 mod 7=1 7 mod 7=0 2 mod 7=2 8 mod 7=1 3 mod 7=3 9 mod 7=2 4 mod 7=4 10 mod 7=3 5 mod 7=5 etc. To do modular addition, you first add the two numbers normally, then divide by the modulus and take the remainder. Thus, (17+20) mod 7 = (37) mod 7 = 2. 18 Modular arithmetic is not unfamiliar to you; you've used it before when you want to calculate, for example, when you would have to get up in the morning if you want to get a certain number of hours of sleep. Say you're planning to go to bed at 10 PM and want to get 8 hours of sleep. To figure out when to set your alarm for, you count, starting at 10, the hours until midnight (in this case, two). At midnight (12), you reset to zero (you "wrap around" to 0) and keep counting until your total is 8. The result is 6 AM. What you just did is to solve (10+8) mod 12. As long as you don't want to sleep for more than 12 hours, you'll get the right answer using this technique. What happens if you slept more than 12 hours? Examples Here are some exercises for you to practice modular arithmetic on. 1. 12+18(mod 9) Answer 2. 3*7(mod 11) Answer 3. (103 (mod 17))*(42 (mod 17)) (mod 17) Answer 4. 103*42 (mod 17) Answer 5. 72 (mod 13) Answer 6. 73 (mod 13) Answer 7. 74 (mod 13) Answer 8. 75 (mod 13) Answer 9. 76 (mod 13) Answer Did you notice something funny about the last 5 exercises? While, usually, when we take powers of numbers, the answer gets systematically bigger and bigger, using modular arithmetic has the effect of scrambling the answers. This is, as you may guess, useful for cryptography! Diffie-Hellman Key Exchange The premise of the Diffie-Hellman key exchange is that two people, Alice and Bob, want to come up with a shared secret number. However, they're limited to using an insecure telephone line that their adversary, Eve (an eavesdropper), is sure to be listening to. Alice and Bob may use this secret number as their key to a Vigenere cipher, or as their key to some other cipher. If Eve gets the key, then she'll be able to read all of Alice and Bob's correspondence effortlessly. So, what are Alice and Bob to do? The amazing thing is that, using prime numbers and modular arithmetic, Alice and Bob can share their secret, right under Eve's nose! Here's how the key exchange works. 1. Alice and Bob agree, publicly, on a prime number P, and a base number N. Eve will know these two numbers, and it won't matter! 2. Alice chooses a number A, which we'll call her "secret exponent." She keeps A secret from everyone, including Bob. Bob, likewise, chooses his "secret exponent" B, which he keeps 19 secret from everyone, including Alice (for subtle reasons, both A and B should be relatively prime to N; that is, A should have no common factors with N, and neither should B). 3. Then, Alice computes the number J = NA (mod P) and sends J to Bob. Similarly, Bob computes the number K = NB (mod P) and sends K to Alice. Note that Eve now has both J and K in her possession. 4. The final mathematical trick is that Alice now takes K, the number she got from Bob, and computes KA(mod P). Bob does the same step in his own way, computing JB (mod P). The number they get is the same! Why is this so? Well, remember that K = NB (mod P) and Alice computed KA (mod P) = (NB)A (mod P) = NBA (mod P). Also, Bob used J = NA (mod P), and computed JB (mod P) = (NA)B (mod P) = NAB (mod P). Thus, without ever knowing Bob's secret exponent, B, Alice was able to compute NAB (mod P). With this number as a key, Alice and Bob can now start communicating privately using some other cipher. Why Diffie-Hellman Works At this point, you may be asking, "Why can't Eve break this?" This is indeed, a good question. Eve knows N, P, J, and K. Why can't she find A, B, or, most importantly, NAB(mod P)? Isn't there some sort of inverse process by which Eve can recover A from NA(mod P)? Well, the thing Eve would most like to do, that is, take the logarithm (base N) of J, to get A, is confounded by the fact that Alice and Bob have done all of their math modulo P. The problem of finding A, given N, P, and NA (mod P) is called the discrete logarithm problem. As of now, there is no fast way known to do this, especially as P gets really large. One way for Eve to solve this is to make a table of all of the powers of N modulo P. However, Eve's table will have (P-1) entries in it. Thus, if P is enormous (say 100 digits long), the table Eve would have to make would have more entries in it than the number of atoms in the universe! Simply storing that table would be impossible, not to mention searching through it to find a particular number. Thus, if P is sufficiently large, Eve doesn't have a good way to recover Alice and Bob's secret. "That's fine," you counter, "but if P is so huge, how in the world are Alice and Bob going to compute powers of numbers that big?" This is, again, a valid question. Certainly, raising a 100digit-long number to the power 138239, for example, will produce a ridiculously large number. 20 This is true, but since Alice and Bob are working modulo P, there is a shortcut called the repeated squaring method. To illustrate the method, we'll use small numbers (it works the same for larger numbers, but it requires more paper to print!). Say we want to compute 729 (mod 17). It's actually possible to do this on a simple four-function calculator. Certainly, 729 is too large for the calculator to handle by itself, so we need to break the problem down into more manageable chunks. First, break the exponent (29) into a sum of powers of two. That is, 29 = 16 + 8 + 4 + 1 = 24 + 23 + 22 + 20 (all we're doing here is writing 29 in binary: 11101). Now, make a list of the repeated squares of the base (7) modulo 17: 71 (mod 17) = 7 (mod 17) = 49 (mod 17) = 15 72 74 (mod 17) = 72 * 72 (mod 17) = 15 * 15 (mod 17) = 4 78 (mod 17) = 74 * 74 (mod 17) = 4*4 (mod 17) = 16 16 8 8 7 (mod 17) = 7 * 7 (mod 17) = 16*16 (mod 17) = 1 Then, 729 (mod 17) = 716 * 78 * 74 * 71 (mod 17) = 1 * 16 * 4 * 7 (mod 17) = 448 (mod 17) = 6. The neat thing is that the numbers in this whole process never got bigger than 162 = 256 (except for the last step; if those numbers get too big, you can reduce mod 17 again before multiplying). Thus, even though P may be huge, Alice's and Bob's computers don't need to deal with numbers bigger than (P-1)2 at any point. Key Exchange Worksheet Here's a worksheet to try this whole process out on. It may help to work with a few friends! Questions to Consider 1. Will this method work if Alice and Bob don't know each other? What could Eve do if she were impersonating one of them? What if she were "in the middle", that is, what if Bob thought Eve was Alice and Alice thought Eve was Bob? Answer 2. How can Alice and Bob know that Eve hasn't jumped "in the middle"? That is, how can Alice really know that Bob is Bob? (This is called authentication.) Is there any way for Alice to know with absolute certainty? Other Links* o o More on modular arithmetic Overview of Diffie-Hellman • These links are for informational purposes on 21 Primes, Modular Arithmetic, and Public Key Cryptography II (April 22, 2004) Introduction Rounding out our study of cryptology, we'll finish with the most-used cipher today. The RSA cipher (named after its creators, Rivest, Shamir, and Adleman) is a public key cipher -- an amazing type of cipher invented in the late 1970’s that doesn’t require the sender and the receiver to have previously shared a secret key. Public key ciphers seem to defy all logic. In a public key cryptosystem, the encipherment step and the decipherment step are two rather different things. Consequently, the "key" to such a system is split in half, into a public key and a private key. A person, let’s call her Alice, who wants people to be able to send messages to her, distributes her public key widely, possibly in a format like a phone book. She carefully guards the other half of her key, her private key, to ensure that it is kept secret. Her public key allows her correspondents to encipher a message so that only she can read it. If Bob wants to send a message to Alice, Bob looks up Alice’s public key in the book, writes Alice a message, enciphers it and sends it to her. Once Bob has enciphered the message, only Alice can decipher it. Bob can’t even undo what he has done! When Alice gets the message, she uses her private key to decipher it. Here are two diagrams to show the difference between a public-key system and a symmetric-key system: Figure 1: In the traditional model, Bob and Alice must agree secretly to the same key K before they can use their cipher. Eve, an eavesdropper, must not know K for the cipher to work. [The letters above each person’s box indicate what he or she knows. M represents a message being sent from Bob to Alice, and K(M) represents the enciphered message. To decipher the message, Alice simply applies K to K(M).] 22 Figure 2: With the public-key model, Alice freely distributes her public key PA, so both Bob and Eve know it. However she keeps her private key, PrA, secret (Bob doesn’t even know it). Even though Eve knows PA and PA(M), Eve can’t recover the message without Alice’s private key. [Again, letters near each box indicate what each person knows. M is the message Bob sends to Alice (he knows it since he wrote it, and Alice knows it since she has deciphered it). PA(M) represents the message enciphered with Alice’s public key. To get M from PA(M), Alice applies PrA to PA(M).] It’s helpful to think of the following analogy for a public key cryptosystem. Alice’s public key is like an open (unlocked) padlock, to which only she has the key (her private key). Alice hands these open padlocks out to anyone who wants one. Then, when Bob wants to send Alice a message, he puts the message in a box, puts Alice’s lock (her public key) on there, and locks it. Once he has done that, only Alice can open the box, since she is the only one with the right key (her private key). With that in mind, let’s turn to one such public-key system. The RSA Cipher The RSA cipher, like the Diffie-Hellman key exchange we have already worked with, is based on properties of prime numbers and modular arithmetic. 1. Alice chooses two different prime numbers, P and Q, which she keeps secret (in practice, P and Q are enormous — usually about 100 digits long). 2. She calculates her modulus N by multiplying P and Q together (N = P*Q). 3. She calculates her Z by multiplying (P-1) and (Q-1) together (Z = (P-1)*(Q-1)). Once she has done this, she can forget what P and Q are. 23 4. Alice then picks her encryption exponent E (1 < E < Z) so that E and Z have no factors in common. 5. Then, she finds her decryption exponent D by solving for the number D such that the product of E and D is 1 modulo Z (E*D (mod Z) = 1). This can be done using the Euclidean algorithm on a computer (or by hand, if the numbers are small enough). 6. Alice then publicizes the pair of numbers E and N (these numbers make up her public key), and she keeps D secret (this is her private key). 7. When Bob wants to send a message to Alice, Bob turns his message into a number M (for example, he might use the ASCII code). This is not part of the security of the system, so he can publicize how he does this. In particular, Alice needs to know. 8. Then, Bob computes C = ME (mod N) and sends C to Alice. 9. When Alice receives C, she computes CD (mod N) = MED (mod N). By the special properties that E and D have, this yields M back. Thus, Alice gets Bob’s message! Why this Works Eve, our eavesdropper, seems rather well-equipped in this scenario. She knows E, N, and C (remember C= ME (mod N)). So, you might think that it would be easy for her to discover M from what she knows. However, this is really quite difficult. The only way to recover M from C is to know D, Alice’s private-key. In order to compute D, Eve needs to know what Z is, and, to do that, she needs to know what P and Q are. That is, Eve needs to be able to factor N (remember, N = P*Q). It’s important to know that the whole system breaks down if Eve can factor N. What’s so hard about factoring? Well, the numbers used in RSA are typically huge (N is usually about 200 digits long), and there is no easy or fast way known to factor such large numbers. In fact, the fastest ways known are not too much faster than checking all of the prime numbers less than the square root of N, which would certainly take a while! Factoring quickly is a problem that mathematicians have worked on for thousands of years, and it seems like there is no easy answer. In fact, there are large cash prizes (up to $200,000) for people who can factor specific numbers (visit http://www.rsasecurity.com/rsalabs/challenges/factoring/index.html for details). The beauty of this system is that, while it’s hard for Eve to factor N into P and Q, it’s not hard for Alice to compute N from P and Q. Basically, multiplying is easy, but factoring is hard. For a simple example of this, multiply 7 and 13 in your head. Now try to factor 1001 in your head. The factors of 1001 are fairly small numbers, but I bet it took you a while to find them. 24 Finally, we’ll address the issue of why MED (mod N) = M. This is obviously important for the cipher to work! However, the math behind this is a bit advanced, so if you had rather skip it, just go on to the next section. There is a theorem called Fermat’s Little Theorem that states: For any prime number p and any number a with a < p, ap (mod p) = a. To convince yourself of this, you can try it out on a few different numbers with a few different primes (however, that won’t be a proof!). Now, to extend this to two primes (remember, in our case, N = P*Q), we’ll use an extension of this theorem that Euler proved: If n = p*q is the product of two different primes, and a is any number such that a < n, then a(p —1)*(q —1)+ 1 (mod n) = a. Notice that the exponent in the conclusion of Euler’s theorem is (p -1)*(q -1) + 1. This number is equal to 1 modulo (p -1)*(q -1). Also, remember that Z = (P -1)*(Q -1) in our above discussion. It ends up that for any exponent equal to 1 modulo (p -1)*(q -1), a to that power is a modulo n. By the way we constructed D from E, E*D = 1 (mod (P -1)*(Q -1)), so MED (mod N) = M. Try working with this cipher yourself, using the RSA secret-sharing worksheet. Final Thoughts Now, if you ever hear anything in the news about a large number being factored, you’ll know why it’s important. Nearly everything that is transmitted over the Internet securely is sent using the RSA cipher. This includes all of your credit card transactions, your banking information, and whatever else you send back and forth online. Since, as you have experienced, the RSA cipher takes some rather hefty computation, it’s fairly slow to send large messages back and forth enciphered with this cipher. Thus, in practice what is done is that two entities (for example, amazon.com and your computer) will exchange public keys and will use RSA to send each other a new secret key which they will use for another, much faster cipher called DES (or, sometimes Triple-DES). Once they’ve agreed on that secret key, they abandon RSA for the much faster cipher. In addition, RSA is used to solve the problems of authentication, non-repudiation, and many other Internet security issues. (To find out more about Internet security issues, helpful websites include • • http://developer.netscape.com/docs/manuals/security/pkin/index.html http://www.rsasecurity.com/rsalabs/faq/index.html 25