2008

RSA Cryptography
The Science of Keeping Secrets

Math 430: Number Theory
My Viet Nguyen
Reference #: 32
5/1/2008

Tuohtiw egaugnal terces a etaerc nac uoy taht nam ecnassianer taerg a hcus gnieb enigami dednah-tfel erew ouy esuaceb ecneinevnoc erup fo tuo metsysotpyrc siht detaerc ouy. Gniyrt neve gnigdums tuoba hcum oot yrrow ot evah ton did uoy, dednah-thgir erew uoy fi. Nep lliuq a gnisu ylbaborp Odranoel. Dednah-tfel erew uoy fi dluow uoy ekil mlap ruoy fo edis eht htiw krow ruoy S’icniv ad tuo erugif to sraey ynam koot ti hgouhtla. Tfel ot thgir morf etorw eh suht, oot os thguoht ot neht od ot sgniht retteb dah tsuj eh, saedi sih gnilaets srehto tuoba suovren ton saw eh, sterces Nep lliuq sih htiw thgif

In other words…

Imagine being such a great Renaissance man that you can create a secret language without even trying. You created this cryptosystem out of pure convenience because you were left-handed using a quill pen. If you were right-handed, you did not have to worry too much about smudging your work with the side of your palm like you would if you were left-handed. Leonardo probably thought so too, thus he wrote from right to left. Although it took many years to figure out da Vinci’s secrets, he was not nervous about others stealing his ideas; he just had better things to do than to fight with his quill pen.

Today, keeping secrets is a necessity. Allowing only the right people access to your private information is the key ingredient in preventing identity fraud. In the following pages we will discuss RSA cryptography by first examining where it came from and where it is today. We will first describe Caesar’s Cipher, then the Diffie-Hellman Key Exchange and the discrete logarithm problem behind it, and finally how the same kind of modular exponentiation leads to RSA. Cryptography, as my professor has said, is the science of keeping secrets.
Before we can discuss the exchange of secrets and messages, we need to find a way to convert letters, numbers, and common symbols into integers that can easily be processed in a cryptosystem. For the purpose of this paper we will use the following standard two-digit integer assignment:

Table 1: Digital Alphabet

    A = 00   B = 01   C = 02   D = 03   E = 04   F = 05   G = 06   H = 07   I = 08   J = 09
    K = 10   L = 11   M = 12   N = 13   O = 14   P = 15   Q = 16   R = 17   S = 18   T = 19
    U = 20   V = 21   W = 22   X = 23   Y = 24   Z = 25   , = 26   . = 27   ? = 28   0 = 29
    1 = 30   2 = 31   3 = 32   4 = 33   5 = 34   6 = 35   7 = 36   8 = 37   9 = 38   ! = 39
    (space) = 40 or 99

Then the number 1200190740171402101839 is the message MATH ROCKS!, which can also be digitalized as 1200190799171402101839. Likewise, the message "Don’t try that one on me!" can be digitalized to 031413199919172499190700199914130499141399120439 or 031413194019172440190700194014130440141340120439.

As you can see, lowercase letters and the apostrophe ( ’ ) are not included in our integer assignment, so our messages cannot be case sensitive. As for the apostrophe, we simply omit it, since DONT can easily be understood to be Don’t. It is possible to extend the integer assignment to include both lowercase letters and other symbols. One example is the ASCII character assignment, which can be found at http://www.petefreitag.com/cheatsheets/ascii-codes/.

Once the original message, called the plaintext, is digitalized, we can use a cryptosystem to turn it into a ciphertext which is unreadable by others. In general, all cryptosystems can be divided into symmetric and asymmetric cryptography, also known as private-key and public-key cryptography, respectively. In a private-key cryptosystem, the plaintext is encrypted into a ciphertext using a single secret, called the private key. Since this private key is needed both to encrypt and to decrypt a message, it is important that only trusted parties know it. Some examples include Caesar’s Cipher, Vigenere’s Cipher, and Hill’s Cipher.
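The two-digit assignment of Table 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the names `SYMBOLS`, `digitize`, and `undigitize` are hypothetical helpers introduced here, and dropping characters that are not in the table (such as the apostrophe) follows the convention described above.

```python
# Table 1 as a string: the position of each symbol is its two-digit code.
# (Hypothetical helper names; the space is handled separately as 40 or 99.)
SYMBOLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ,.?0123456789!"

def digitize(message, space_code="40"):
    """Encode a message as a digit string.

    Lowercase letters are folded to uppercase; symbols missing from
    Table 1 (for example the apostrophe) are simply omitted.
    """
    out = []
    for ch in message.upper():
        if ch == " ":
            out.append(space_code)
        elif ch in SYMBOLS:
            out.append(f"{SYMBOLS.index(ch):02d}")
    return "".join(out)

def undigitize(digits):
    """Decode a digit string back to text; 40 and 99 both mean a space."""
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    return "".join(" " if p in ("40", "99") else SYMBOLS[int(p)] for p in pairs)

print(digitize("MATH ROCKS!"))                 # 1200190740171402101839
print(undigitize("1200190799171402101839"))    # MATH ROCKS!
```

Passing `space_code="99"` gives the alternative digitalization that uses 99 for the space.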
Of the three mentioned, Caesar’s Cipher is the simplest to understand, and the other two ciphers are derived from it. It is said that Caesar himself used this cryptosystem to get messages out to his soldiers who were out building his empire. With Caesar’s Cipher, he would take his message and shift it three letters over. So his cipher alphabet would be:

Table 2: Caesar’s Cipher

    Plaintext:   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    Ciphertext:  D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

For example, the plaintext MATH ROX would be converted to the ciphertext PDWK URA. Even though Caesar used a three-letter shift, this general method can be used with any number of shifts. Using a digital alphabet, we can create an equation from congruence theory that allows us to easily convert our message from plaintext, M, to ciphertext, C, one letter at a time. This equation is

$$C \equiv M + k \pmod{A},$$

where k is the private key, or number of shifts, and A is the number of characters in the digital alphabet. Then using Caesar’s Cipher, we would have the equation

$$C \equiv M + 3 \pmod{26}.$$

Using the same example, our message can be digitalized and encrypted using just the first 26 letters from the digital alphabet in Table 1:

    Plaintext, M      M    A    T    H    R    O    X
    Digitalized M     12   00   19   07   17   14   23
    M + 3             15   03   22   10   20   17   26
    (mod 26)          15   03   22   10   20   17   00
    Ciphertext, C     P    D    W    K    U    R    A

Note that we can use any number k provided k is between 1 and A-1. If k were bigger than A-1, our ciphertext would be the same as the ciphertext generated by k-A, since A is 0 modulo A.

To decrypt our message we use a similar process: we subtract our private key, k, from the ciphertext, C, to get our original message, M:

$$M \equiv C - k \pmod{A}.$$

For example, suppose we had the private key k = 27 over the full 41-character digital alphabet, so that $M \equiv C - 27 \pmod{41}$, and C = ?GD_6_4,F52,!61_6F2,A69Z (the underscore character, _, is used to make spaces easy to read). Then we have

    Ciphertext, C     ?    G    D    _    6    _    4    ,
    Digitalized C     28   06   03   40   35   40   33   26
    C - 27            01  -21  -24   13   08   13   06  -01
    (mod 41)          01   20   17   13   08   13   06   40
    Plaintext, M      B    U    R    N    I    N    G    _

    Ciphertext, C     F    5    2    ,    !    6    1    _
    Digitalized C     05   34   31   26   39   35   30   40
    C - 27           -22   07   04  -01   12   08   03   13
    (mod 41)          19   07   04   40   12   08   03   13
    Plaintext, M      T    H    E    _    M    I    D    N

    Ciphertext, C     6    F    2    ,    A    6    9    Z
    Digitalized C     35   05   31   26   00   35   38   25
    C - 27            08  -22   04  -01  -27   08   11  -02
    (mod 41)          08   19   04   40   14   08   11   39
    Plaintext, M      I    T    E    _    O    I    L    !

Thus our secret message is M = Burning the midnite oil!

Although Caesar’s cipher and other private-key cryptosystems can be used with some effectiveness, they do have some flaws. One problem occurs when two people or entities, such as Wescom Credit Union and SchoolsFirst Federal Credit Union (SFFCU), are trying to establish their mutual private key. If they are relatively close by, they could send a representative in person to exchange the private key. Otherwise they would have to find a secure way to exchange this information, or have a trusted source deliver the key, without the knowledge of another person or company like Bank of America (BofA) who might want access to these credit unions’ secrets. If that delivery were intercepted, the private key would be compromised.

One way to counter this attack is to use the Diffie-Hellman Key Exchange (DH KEy). This is a method which allows Wescom and SFFCU to communicate openly about their private key without BofA finding out. DH KEy is an asymmetric cipher that is usually used in conjunction with a private-key cryptosystem, as an easy way to exchange a private key between two parties.

With asymmetric cryptography we have two separate keys. One is made public for anyone to know and see, while the other is kept private. Wescom could release a public key to others so they can send an encrypted message back to Wescom. Even if BofA intercepted the message, the message could not be decrypted without the private key.
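Looking back at the shift cipher worked through earlier, the encryption rule $C \equiv M + k \pmod{A}$ and its inverse can be sketched as a single function. This is a minimal sketch under stated assumptions: `ALPHABET` and `shift` are hypothetical names, and the space is stored at position 40 so that the string index matches the two-digit code.

```python
# 41-symbol digital alphabet; index = two-digit code, space stored as 40.
# (Hypothetical names, introduced only for this illustration.)
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ,.?0123456789! "

def shift(text, k, alphabet=ALPHABET):
    """Shift every symbol by k places: C = (M + k) mod A.

    Calling shift(ciphertext, -k) undoes shift(plaintext, k).
    """
    size = len(alphabet)  # A, the number of characters in the alphabet
    return "".join(alphabet[(alphabet.index(ch) + k) % size] for ch in text)

# Classic Caesar shift on the first 26 letters only.
print(shift("MATHROX", 3, ALPHABET[:26]))          # PDWKURA

# Decrypting the k = 27 example over all 41 symbols.
print(shift("?GD 6 4,F52,!61 6F2,A69Z", -27))      # BURNING THE MIDNITE OIL!
```

Because Python's `%` always returns a non-negative remainder, negative values such as 06 - 27 = -21 are reduced to 20 automatically.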
Public-key encryption is possible because decrypting the ciphertext is much harder to perform without the missing information known as the private key. On the other hand, encrypting a plaintext message with the given public key is easy to compute.

In 1976, Whitfield Diffie and Martin Hellman first used a discrete logarithm problem to create their asymmetric cryptosystem, DH KEy. The discrete logarithm problem involves three numbers, namely g, p, and x: knowing the values of g, p, and $g^x \pmod{p}$, we need to find x. There are special properties that are necessary to ensure that there are no easy shortcuts to finding x. First, it is important that p is a very large prime number, usually 200 or more digits. A prime is a number that has only two positive divisors, 1 and itself. Some examples include 2, 3, 7, 17, 37, and 101. This p needs to be large so that it is computationally hard to find x.

The second requirement is that g must be a primitive root of p. A primitive root of a prime, p, is an integer, g, between 1 and p-1 such that $g^{p-1} \equiv 1 \pmod{p}$, and there is no smaller power, x, with $1 \le x < p-1$, where $g^{x} \equiv 1 \pmod{p}$. In other words, p-1 is the smallest power of g that will give us 1 modulo p. For our example, we will use smaller primes so that we can easily see how the system works. Given the prime number 13, a primitive root of 13 is 2, since

$$2^{1} \equiv 2,\ 2^{2} \equiv 4,\ 2^{3} \equiv 8,\ 2^{4} \equiv 3,\ 2^{5} \equiv 6,\ 2^{6} \equiv 12,$$
$$2^{7} \equiv 11,\ 2^{8} \equiv 9,\ 2^{9} \equiv 5,\ 2^{10} \equiv 10,\ 2^{11} \equiv 7,\ 2^{12} \equiv 1 \pmod{13}.$$

Unfortunately there are no easy calculations that will give us primitive roots, but there are a few tricks we can use to help the process go a little faster. First we need Fermat’s Little Theorem (F l T). It states that for every prime p and every integer g such that p does not divide g, $g^{p-1} \equiv 1 \pmod{p}$.

Let’s make sure this is true. Suppose $g, p \in \mathbb{Z}$ are arbitrary, such that p is prime and p does not divide g. First consider the first p-1 multiples of g, namely

$$g,\ 2g,\ 3g,\ \ldots,\ (p-1)g.$$

We can say that these are mutually incongruent modulo p. If there existed r and s such that $1 \le r < s \le p-1$ and $rg \equiv sg \pmod{p}$, then, because p is prime and does not divide g, we could cancel g, which would imply that $r \equiv s \pmod{p}$. This is not possible, since r and s are distinct and both lie between 1 and p-1.
Also, since none of the multiples $g, 2g, \ldots, (p-1)g$ are congruent to 0 modulo p, they must be congruent, in some order, to $1, 2, \ldots, p-1$. Multiplying them together gives

$$g \cdot 2g \cdot 3g \cdots (p-1)g \equiv 1 \cdot 2 \cdot 3 \cdots (p-1) \pmod{p}, \quad\text{that is,}\quad g^{p-1}(p-1)! \equiv (p-1)! \pmod{p}.$$

Since p does not divide $(p-1)!$, we can cancel $(p-1)!$ from both sides, which gives us what we are trying to prove, $g^{p-1} \equiv 1 \pmod{p}$.

We can now find primitive roots much faster, because if $g^{x} \equiv 1 \pmod{p}$ and x is the smallest positive power with this property, then x must divide p-1. Thus when looking for a primitive root, we only need to check the factors of p-1 and not all integers between 1 and p-1. The proof of this is omitted; however, consider p = 31. We can check that 17 is a primitive root by checking that the powers 1, 2, 3, 5, 6, 10, 15 of 17 are not 1 modulo 31: they are 17, 10, 15, 26, 8, 25, and 30, respectively.

Now that we have a prime number p and a primitive root g, we can begin the process of DH KEy. Wescom and SFFCU both agree on a prime, p, and a primitive root, g. Then Wescom and SFFCU each pick or randomly generate their own secret number, $w$ and $s$ respectively. Wescom then computes $W \equiv g^{w} \pmod{p}$, and SFFCU computes $S \equiv g^{s} \pmod{p}$. Then Wescom and SFFCU trade W and S with each other. Once Wescom receives S, they compute

$$k \equiv S^{w} \equiv (g^{s})^{w} \pmod{p}.$$

And when SFFCU receives W, they can compute

$$k \equiv W^{s} \equiv (g^{w})^{s} \pmod{p}.$$

Since $(g^{s})^{w} = g^{sw} = (g^{w})^{s}$, both Wescom and SFFCU can now use this as their private key k. This k can be used in conjunction with one of the private-key ciphers. So BofA may be able to intercept g, p, S, and W, but will not be able to find k without $w$ and $s$.

Suppose Wescom and SFFCU agree on the prime 31 and the primitive root 17. Wescom might choose $w = 4$, and SFFCU might select $s = 6$. Then Wescom performs the following operation,

$$W \equiv 17^{4} \equiv 7 \pmod{31},$$

and SFFCU calculates

$$S \equiv 17^{6} \equiv 8 \pmod{31}.$$

Wescom and SFFCU then share the numbers 7 and 8 with each other. Wescom then computes

$$k \equiv 8^{4} \equiv 4 \pmod{31},$$

while SFFCU calculates

$$k \equiv 7^{6} \equiv 4 \pmod{31}.$$

Now that they have this private key, k, they can use it along with a symmetric cipher to exchange messages.

Even with DH KEy, over time people outside of those entrusted with the private key could systematically find the key. With today’s technology, frequency analysis of the ciphertext can reveal the private key of a simple symmetric cipher quite easily, since the encryption scheme and decryption scheme are basically the same.
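The primitive-root check and the key exchange described above can be sketched as follows. This is a minimal illustration, not a secure implementation. The function names are hypothetical, and the secret exponents 4 and 6 are used because they reproduce the exchanged values 7 and 8 for p = 31 and g = 17. As a labeled shortcut, the check below tests only the exponents (p-1)/q for each prime factor q of p-1, rather than every factor of p-1. This is sufficient, because every proper divisor of p-1 divides one of those exponents.

```python
def prime_factors(n):
    """Return the set of prime factors of n (trial division, small n only)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def is_primitive_root(g, p):
    """True if g has order exactly p-1 modulo the prime p."""
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))

def dh_exchange(p, g, w, s):
    """Return (W, S, k) for secret exponents w (Wescom) and s (SFFCU)."""
    W = pow(g, w, p)          # value Wescom sends in the open
    S = pow(g, s, p)          # value SFFCU sends in the open
    k_wescom = pow(S, w, p)   # Wescom combines S with its own secret
    k_sffcu = pow(W, s, p)    # SFFCU combines W with its own secret
    assert k_wescom == k_sffcu
    return W, S, k_wescom

print(is_primitive_root(17, 31))   # True
print(dh_exchange(31, 17, 4, 6))   # (7, 8, 4)
```

The built-in three-argument `pow(base, exp, mod)` performs modular exponentiation without ever forming the full power, which is what makes the same computation feasible for 200-digit primes.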
One way to prevent that from happening is to constantly change the private key. Wescom and SFFCU could add an extra line to their encrypted message with a new prime and/or primitive root. This would defeat a frequency analysis of the ciphertext. However, primitive roots can be computationally tedious to find as we select bigger and bigger prime numbers, and there has to be a minimum of three contacts between the two parties just to send or receive the first set of ciphertext. When Wescom and SFFCU are sharing time-sensitive information, keeping the number of contacts as small as possible can be crucial.

An alternative is RSA cryptography. RSA is one of the most widely used public-key cryptosystems, and many other public-key schemes build on the same ideas. RSA stands for the initials of the MIT professors R. Rivest, A. Shamir, and L. Adleman, who created this cipher. Like DH KEy, it relies on modular exponentiation; however, its security rests on the difficulty of factoring a large number rather than on the discrete logarithm problem, and instead of only being used to create a private key, RSA is used to encrypt the plaintext itself.

If SFFCU and Wescom wanted to use RSA, SFFCU would first find two large primes, p and q, usually over 200 digits each, and multiply them together to get n. Then SFFCU carefully selects an encryption exponent, e. It is important that $\gcd(e, (p-1)(q-1)) = 1$. Then SFFCU would publish the integer pair (n, e). This pair is the public key, while the pair of primes, (p, q), is kept private. Then if Wescom wanted to send the message, M, to SFFCU, it would take the public key pair, compute

$$C \equiv M^{e} \pmod{n},$$

and send the ciphertext, C, to SFFCU.

For example, say SFFCU selected the prime pair (157, 163) and published the public key (25591, 7), and Wescom wanted to let SFFCU know that “Banks are evil!” Wescom would first digitize the message using a table similar to Table 1 and translate it to

010013101899001704990421081139

Wescom would then separate this long strand of digits into smaller strings of digits, called bit-strings, so that each string is smaller than n.
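This is necessary because encryption and decryption only determine M modulo n: if the whole message were treated as one number larger than n, decryption would return only its remainder modulo n, a much shorter number that decodes to a string such as BHA, which is not the same as our original message. So let’s break 010013101899001704990421081139 into 010013/10189/9001/7049/9042/10811/39, where / signifies the end of one bit-string and the beginning of the next. Since the last number is much smaller, we can add in a filler such as a space or a random symbol to make it a bit bigger. Then for each bit-string $M_i$ we calculate

$$C_i \equiv M_i^{7} \pmod{25591}, \quad\text{for example}\quad 10013^{7} \equiv 3659 \pmod{25591}.$$

Then Wescom would send SFFCU C = 3659/16489/7322/16175/1166/9913/10926. With RSA we do not convert the ciphertext back into letters, because the output numbers are sometimes bigger than the largest value in the digital alphabet. Doing so would require reducing them, which could change the original message, just as it would if we did not first split the message, M, into bit-strings smaller than n.

So now that SFFCU has this ciphertext, how do they decrypt it back to the original message? In order to do this, SFFCU needs the decryption exponent, d. To get d, we need to solve the congruence

$$ed \equiv 1 \pmod{(p-1)(q-1)}.$$

Since p and q are both large primes, p-1 and q-1 will both be large even numbers. So it would be hard to know what $(p-1)(q-1)$ is without knowing what p and q were originally. To solve this congruence, we use the Euclidean Algorithm. From our example above, $(p-1)(q-1) = 156 \cdot 162 = 25272$, thus we need to solve

$$7d \equiv 1 \pmod{25272}.$$

It is equivalent to say that $7d - 25272y = 1$ for some integer y. Then by the Euclidean Algorithm we get

$$25272 = 7 \cdot 3610 + 2, \qquad 7 = 2 \cdot 3 + 1.$$

Combining these we get

$$1 = 7 - 3 \cdot 2 = 7 - 3(25272 - 7 \cdot 3610) = 7 \cdot 10831 - 3 \cdot 25272.$$

Thus d = 10831. Once we have d, we can calculate $M \equiv C^{d} \pmod{n}$ to get the original message again.

Let’s check to make sure this works. From above we have $ed \equiv 1 \pmod{(p-1)(q-1)}$. Then it is also true that $ed = 1 + j(p-1)(q-1)$ for some integer j. Then

$$C^{d} \equiv (M^{e})^{d} = M^{ed} = M^{1 + j(p-1)(q-1)} \pmod{n}.$$

Since $n = pq$ with $\gcd(p, q) = 1$, we can use the Chinese Remainder Theorem (CRT)¹ to break this into a system of equations,

$$M^{1 + j(p-1)(q-1)} \equiv M \cdot (M^{p-1})^{j(q-1)} \pmod{p},$$
$$M^{1 + j(p-1)(q-1)} \equiv M \cdot (M^{q-1})^{j(p-1)} \pmod{q}.$$

Recall that by F l T, $M^{p-1} \equiv 1 \pmod{p}$ and $M^{q-1} \equiv 1 \pmod{q}$ whenever p and q do not divide M (if one of them does divide M, both sides of that congruence are simply 0). So $C^{d} \equiv M$ modulo both p and q, and by the uniqueness of CRT, $C^{d} \equiv M \pmod{n}$.

¹ For a proof of the Chinese Remainder Theorem, please see Appendix A.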
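The key setup, encryption, and decryption steps for the small example with p = 157, q = 163, and e = 7 can be sketched as below. This is an illustrative sketch only: `egcd` and `decryption_exponent` are hypothetical helper names, the recursion is one common way to write the extended Euclidean Algorithm, and primes this small offer no real security.

```python
def egcd(a, b):
    """Extended Euclidean Algorithm: return (g, x, y) with a*x + b*y == g."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def decryption_exponent(e, p, q):
    """Solve e*d = 1 (mod (p-1)(q-1)) for d."""
    phi = (p - 1) * (q - 1)
    g, x, _ = egcd(e, phi)
    if g != 1:
        raise ValueError("e must be relatively prime to (p-1)(q-1)")
    return x % phi

p, q, e = 157, 163, 7
n = p * q                            # public modulus, 25591
d = decryption_exponent(e, p, q)     # private exponent, 10831

block = 10013                        # the first bit-string, 010013
cipher = pow(block, e, n)            # C = M^e mod n
recovered = pow(cipher, d, n)        # M = C^d mod n

print(n, d, cipher, recovered)       # 25591 10831 3659 10013
```

Only `n` and `e` would be published; anyone who could factor `n` back into 157 and 163 could rerun `decryption_exponent` and read the messages, which is why real keys use primes hundreds of digits long.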
So if Wescom sent another message to SFFCU, such as 21509 / 17445 / 5624 / 4093 / 5624 / 11835 / 14368 / 25277 / 9078 / 25277 / 4951 / 19352, SFFCU would decrypt it by computing, for each bit-string $C_i$,

$$M_i \equiv C_i^{10831} \pmod{25591}.$$

Putting these strings all down, each written as four digits, we get

    Digitalized M     03   14   13   19   99   19   17   24   99   19   07   00
    Plaintext, M      D    O    N    T    _    T    R    Y    _    T    H    A

    Digitalized M     19   99   14   13   04   99   14   13   99   12   04   39
    Plaintext, M      T    _    O    N    E    _    O    N    _    M    E    !

RSA cryptography also relies on large primes, similar to DH KEy, although a primitive root is not necessary in RSA. Many of the public-key schemes in use today are RSA or some variation of it. So rest assured that Wescom Credit Union and SchoolsFirst Federal Credit Union can easily exchange information without Bank of America or anyone else being able to steal the private information.

So even though this seems complicated, I would like to leave you with one last ciphertext, a quote from S. Gudder,

SGNIHT ELPMIS EKAM OT TON SI SCITAMEHTAM FO ECNESSE EHT ELPMIS SGNIHT DETACILPMOC EKAM OT TUB, DETACILPMOC

Works Cited

Annin, Scott. “Math 430: Number Theory.” Class notes. 2008.
Burton, David M. Elementary Number Theory. New Delhi: Tata McGraw-Hill, 2007.
Freitag, Pete. “ASCII Character Codes & Cheat Sheet.” 2005-2008. http://www.petefreitag.com/cheatsheets/ascii-codes/
Lu, Mark. “Large Mod Calculator.” 2008. http://www.excelex.net/powermod.php
Marx, Kyle. “TI-83 RSA Program.” 2008.
McCurley, Kevin. “Diffie-Hellman Key Exchange.” 1/23/1998. http://www.swcp.com/~mccurley/talks/msri2/node14.html
Sequib, Al. “Diffie-Hellman Key Exchange.” http://www.xml-dev.com/blog/index.php?action=viewtopic&id=196

Appendix A

The Chinese Remainder Theorem, CRT. Let $n_1, n_2, \ldots, n_r \in \mathbb{Z}^{+}$ be such that $\gcd(n_i, n_j) = 1$ for $i \neq j$. Then the system of equations

$$x \equiv a_1 \pmod{n_1}, \quad x \equiv a_2 \pmod{n_2}, \quad \ldots, \quad x \equiv a_r \pmod{n_r}$$

has a unique solution modulo $N = n_1 n_2 \cdots n_r$.

Proof: To show this is true, first we define $N = n_1 n_2 \cdots n_r$, and then $N_i = N / n_i$, so that $\gcd(N_i, n_i) = 1$. Then the congruence $N_i x_i \equiv 1 \pmod{n_i}$ has a unique solution modulo $n_i$, where $x_i$ is the unknown. We claim that

$$x = a_1 N_1 x_1 + a_2 N_2 x_2 + \cdots + a_r N_r x_r$$

is a solution to the system. Remember that $N_j \equiv 0 \pmod{n_i}$ for $j \neq i$, and thus $x \equiv a_i N_i x_i \pmod{n_i}$, and $N_i x_i \equiv 1 \pmod{n_i}$. This process can be checked for each $a_i$ and $n_i$ pair to show that $x \equiv a_i \pmod{n_i}$ for all i = 1, 2, …, r.
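The construction used in this part of the proof translates almost directly into code. The sketch below is illustrative only: the function name `crt` is hypothetical, the variable names mirror the $N$, $N_i$, and $x_i$ of the proof, and the sample residues are values chosen here for illustration rather than taken from the appendix example. It assumes Python 3.8 or later, where `pow(a, -1, m)` returns a modular inverse and `math.prod` is available.

```python
from math import prod

def crt(residues, moduli):
    """Solve x = a_i (mod n_i) for pairwise relatively prime moduli n_i.

    Follows the construction in the proof:
    x = a_1*N_1*x_1 + ... + a_r*N_r*x_r, reduced modulo N.
    """
    N = prod(moduli)
    x = 0
    for a_i, n_i in zip(residues, moduli):
        N_i = N // n_i
        x_i = pow(N_i, -1, n_i)   # solves N_i * x_i = 1 (mod n_i)
        x += a_i * N_i * x_i
    return x % N

# Illustrative values: a number that is 48 mod 157 and 73 mod 163.
print(crt([48, 73], [157, 163]))   # 3659
```

Here 3659 is the first ciphertext block from the RSA example, recovered from its remainders modulo the two primes 157 and 163.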
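Last we need to check for uniqueness. Suppose y is also a solution to the system $x \equiv a_i \pmod{n_i}$, i = 1, 2, …, r. Then we need to show that $x \equiv y \pmod{N}$, where $N = n_1 n_2 \cdots n_r$. Since y is a solution to the system, $x \equiv a_i \equiv y \pmod{n_i}$, and so $n_i \mid (x - y)$ for every i. We also know that $\gcd(n_i, n_j) = 1$ for $i \neq j$, so $n_1 n_2 \cdots n_r \mid (x - y)$, which implies that $x \equiv y \pmod{N}$. Thus the solution to the CRT is unique.

Let’s see an example to see how this works: compute […]. By the uniqueness of the CRT, it is easier to compute the system […]. Note that […]. Thus we have […], […]. We need to find […] such that […] or […], and […] or […]. Then we can see that […]. Thus we need to compute […],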