RSA Public-key Cryptosystem

advertisement
Farley 1
Chris Farley
Research Paper
RSA Public-key Cryptosystem
Suppose Alice and Bob want to communicate with each other, but they do not
want others to be able to ‘ease drop’ on their conversations. For example, Alice and Bob
may be working on top-secret research or they may be allies in a war. They would need
to develop their own system for communication like their own language or alphabet.
However, this would not necessarily ensure that they could protect the translation from an
attacker. They need a way of converting information and then decoding the converted
information so that it is only understood by each other. The process by which they will
achieve this is cryptography, also known as cryptology. The particular scheme that they
employ is a cryptosystem.
Cryptosystems are made up of three basic parts: the encryption algorithm, the
decryption algorithm, and the key(s). The encryption algorithm is the algorithm used to
encode an original, or a plaintext message. The decryption algorithm is the reverse
process of the encryption algorithm. With the decryption algorithm, the user converts the
encoded message back to its original plaintext message. The key system is used during
the process of encrypting and decrypting messages. Generally, an encryption key is used
to encrypt messages, whereas a decryption key is used decrypt messages.
The RSA algorithm is used in cryptography as a public-key cryptosystem. This
algorithm was the first known to be suitable for signing as well as encryption. Because
of this, it was one of the first great advancements in public-key cryptology. RSA is still
Farley 2
widely used and is believed to be secure given sufficiently long keys. This is because
RSA is based on the difficulty of factoring large prime numbers.
Ron Rivest, Adi Shamir, and Len Adleman described the RSA algorithm at MIT
in 1977. The algorithm’s name, “RSA”, was derived from the initials of their last names.
The algorithm was patented by MIT in the United States in 1983. It was given the U.S.
Patent number 4,405,829, expiring on September 21, 2000 (Wikipedia, RSA).
A British mathematician, Clifford Cocks, supposedly had written an equivalent
system to RSA in 1973. However, the research was not pursued due to the lack of and
the cost of computers that were needed to execute the algorithm. This discovery was not
revealed until 1997 because of the top-secret classification given by GCHQ (UK
Government Communications Head Quarters), of which, Cocks was affiliated.
Now that we have a little background about RSA, it is time to answer how Alice
can securely communicate with Bob. How do we go about generating the keys that we
need to securely send the message? If Bob wants to have Alice send him a secure
message over an insecure medium, then he needs to take the following steps to generate
the public and private keys:
1. Choose two large primes, p and q, randomly and independently of each other;
however, be sure that they are very large so that their product will be of sufficient
size. Also, the primes are to be kept secret.
2. Compute the product N = pq. The product, N, is to be made public and is known
as the modulus.
3. Compute the totient  ( N )  ( p  1)( q  1). The totient function is Euler’s phi
function (see notes section).
Farley 3
4. Choose an integer e (which is to be made public), where 1  e   ( N ), such that
gcd( ( N ), e)  1, or equivalently, e is coprime or relatively prime to  (N ). In
general, if gcd( a, b)  1, then a and b are relatively prime, and a  b (mod  ( N )) is
a congruence. Note that this can be found using the Euclidean Algorithm.
5. Compute the private key, d, which is the multiplicative inverse of
e (mod  ( N )), i.e., find an integer d with de  1 (mod  ( N )). In general, solve for
d in the equation 1  ed mod( p  1)( q  1). The existence of d follows from the
fact that if given two integers, a and b, where the gcd( a, b)  1, then b is invertible
(mod a) and b has a multiplicative inverse in  a . This can also be found with the
Euclidean Algorithm.
That is how we generate our keys. The public key consists of the modulus, N, and
the public exponent, also referred to as the encryption exponent, e. The private key
consists of, again, the public modulus, N, and the private exponent, d, which is also called
the decryption exponent. The value of d must be kept secret, as well as the values of p, q,
and  (N ).
Now that we have our keys, we can start sending messages. How do we use our
keys to send encrypted messages and then decrypt them? To explain, we will return to
our hypothetical ‘guinea pigs’, Alice and Bob. Suppose Alice wishes to send a message
M to Bob. Alice converts the message M to an integer, m < N, by using whatever method
she chooses (I will briefly discuss different types of methods that are generally used to do
this in other sections of the paper). Note that if m  N , then Alice would need to break it
up into groups, where each group is a number mi  N , i  0,1,2... and the encryption
Farley 4
procedure would be done on all groups. However, for this example, we will assume that
m < N. Alice has m, and given the public key N and e, she then converts m to the
ciphertext, y, by using the following equation:
y  m e mod N
This can be done quickly using the method of exponentiation by squaring (see
notes section), but is usually done with a computer. After Alice converts the message to
cyphertext, she sends it to Bob. When Bob receives the message, he uses his private key,
d, to recover m from y by doing the inverse modular exponentiation process:
m  y d mod N
That was easy enough, right? However, you may be asking how we can be sure
that when Bob receives the ciphered message, and then deciphers it, that he will arrive at
the original message sent by Alice? In other words, we need to verify that
m
ed
 m (mod N ) (this notation is described in the notes section, briefly; we say m
ed
is
congruent to m modulo N, where N is the modulus of the congruence, also written
m
ed
mod N  m , Michon, G. P.) For a quick explanation, the decryption procedure
works because d mod N is the multiplicative inverse of e. Thus, y d is the inverse of m e .
To elaborate, y d mod N  (m e ) d mod N  m ed mod N , also, ed  1 (mod p  1) and
ed  1 (mod q  1). Using Fermat’s little theorem (see notes section) we conclude that
m
ed
 m (mod p) and m
ed
 m (mod q ). Given that p and q are distinct primes, the
Chinese remainder theorem (see notes section) says that these two congruences can be
rewritten as m
ed
 m (mod pq) , and finally, m
ed
 m (mod N ). ■ (Wikipedia, RSA)
Farley 5
In order to exemplify the RSA encryption/decryption process with a working
example, we will use very small numbers that are not recommended to be used for real
RSA processes. Suppose Alice wants to send the message “Hello Bob” to Bob. Also,
assume that Alice and Bob have decided to convert letters to numbers by using each
letter’s corresponding two-digit representational position in the alphabet. So our number
m is to represent the message “Hello Bob”. Before we go further, we need to generate
our keys. First, we will pick our two primes which are to be kept secret. Since we are
using relatively small numbers, we let p = 53 and q = 41. Next, we compute N = pq
= 53  41  2173, which is the modulus, a part of the public key. Then, we choose e such
that it is coprime to ( p  1)( q  1)  52  40  2080   ( N ). For this example, we will
choose e = 7, verified by gcd( ( N ), e)  gcd( 2080,7)  1. Remember that e, the
encryption exponent, is also part of the public key. From here, we can find d by solving
for it in the equation 1 = 7d mod 2080. In this case, d = 1783 will work because
7  1783  12481  6(2080)  1  1 (mod N ). (CITE)
Now that we have generated our public key, (N, e), and our private key, (N, d), we
are ready to proceed. We will assume that Bob made the public key and sent it to Alice
so that she could transfer encrypted messages to him. We will also assume that Bob has
our private key, which he will use to decrypt the message. Alice transforms “Hello Bob”
to m = 080512121500021502 (for our example any blank space is represented by 00).
Since the number m is larger than the modulus N, Alice must split it up. To do so, she
splits m into six different numbers, each containing three digits, ensuring that
each mi  N :
Farley 6
m1  080
m4  500
m2  512
m5  021
m3  121
m6  502
Next, for each mi , Alice will convert the cyphertext y, where y is also made up of six
numbers corresponding to each of the mi ' s. To do this, Alice computes yi  mie mod N :
y1  080 7 mod 2173  200
y 2  512 7 mod 2173 1730
y3  1217 mod 2173 1102
y 4  500 7 mod 2173 1355
y5  0217 mod 2173  1837
y6  502 7 mod 2173  592
Alice then transmits all six of these messages to Bob. When Bob gets around to checking
his messages, he realizes that Alice has sent him one. Excited, he quickly applies his
private key, d = 1783 to decipher the cyphertext. To do so, he computes
mi  yid mod N , for each yi :
m1  2001783 mod 2173  080
m2  17301783 mod 2173  512
m3  11021783 mod 2173 121
m4  13551783 mod 2173  500
m5  18371783 mod 2173  021
m6  5921783 mod 2173  502
I will explain how one can compute these large modular exponentiations later (see notes
section), for now, I will continue with the decrypting process.
Bob now has all of the pieces that he needs to decipher the message. Bob groups
the pieces together and gets the number 080512121500021502. He then splits the
number up again by grouping in pairs: 08 05 12 12 15 00 02 12 02. Since Bob knows
that the numbers correspond to positions of letters in the alphabet, he can convert the
message one step further, deciphering the message to discover that Alice says, “Hello
Bob.”
Farley 7
As mentioned earlier, the security of relies on the fact it is difficult to factor
sufficiently large numbers. No polynomial-time method for factoring large integers on a
classical computer has been found yet. However, it has not been proven that none exists.
As of November 8, 2005, the largest number factored by general purpose methods was
640 bits long (known as the RSA-640 number) which was a 193 digit number. This
factorization was done on 80 CPU’s working together and took approximately four and a
half months, total. The prize that the German BSI team received for factoring this
number was $20,000. Because of the different methods of factorization that are being
used with the help of linked computers or super-computers, the current recommendation
for the product of primes, N, is for it to be is at least 2048 bits long. However, if
quantum computing reaches a sufficient level, as it is predicted to be no earlier than 2015,
it potentially can perform factorization in polynomial time, rendering RSA and related
algorithms obsolete.
How do we know that the message we receive is the sent by our correspondent?
For this, RSA provides for message “signing” which entails the sender producing a hash
value of the message to be sent, raising that value to the power of d mod N (as done when
decrypting), and attaching it as a “signature” to the message. When the receiver gets the
signed message, he raises the signature to the power of e mod N (as done when
encrypting), and compares the resulting hash value with the message’s actual hash value,
making sure they agree, therefore verifying the signature. (Wikipedia, RSA)
There are various different methods that are used to pad numbers to messages to
enhance the security of the messages. These methods are commonly referred to as
padding schemes. Although, we will not fully discuss these schemes, I think that it is
Farley 8
pertinent to know the value of applying these techniques for a practical application. For
example, if a sender did not use padding schemes, then the security of their message may
be flawed by the following things:
1. Both of the values of m = 0 or m = 1 will produce the same ciphertext due to the
properties of exponentiation.
2. If the encryption is done with small encryption exponents and small message
values, then the result of m e could be strictly less than the value of the modulus n.
If this is the case, then the ciphertext is easily discovered by taking the eth root of
it (without using the modulus).
3. Because RSA uses public keys, then an attacker could develop a ciphertext library
of resultant ciphers of encryptions using the public key. If the ciphertext that are
intercepted match any of the library’s ciphers, then the attacker would know the
resultant plaintext.
Again, there are many different techniques that are employed to securely pad plaintext.
This paper has some references which describe some of these schemes in further detail.
The computation of RSA renders it to be slower than other cryptosystems.
Because of this, typical applications of RSA would involve sending credit card numbers,
very short messages, map coordinates, or other relatively small numbers. If someone
wished to send a large message, they usually would encrypt the message with a different
cryptography method such as DES. Then they would encrypt a key to decrypt the
message with RSA. Of course, the key would be relatively small. After which, the
sender would transmit both the RSA encrypted key to the message and the encrypted
message itself.
Farley 9
There are lots of other security issues and general considerations regarding RSA.
For more information, consult the references listed at the end of this paper. For now, we
will elaborate on the mathematical ‘backbone’ of RSA.
Notes Section: How It All Works
This section presents some principles of number theory and definitions that are
used throughout this paper. First, for congruences, we have several forms of notation.
For example, “a is congruent to b modulo N” is a congruence where N is the modulus of
the congruence. Mathematically, this congruence can be expressed with the following
notations, of which, the first two are used in this paper:
a  b (mod N )
a mod N  b
a  b [N ]
[a]  [b]
10
10
Next, these ideas are the foundational tools that make the RSA cryptosystem work.

Prime numbers are those that are greater than one, which only has factors of one
and itself.

Two numbers are said to be relatively prime or coprime if the two numbers have
no common factors other than one.

The largest integer that divides two integers, say, a and b, is known as the
greatest common divisor of a and b, denoted gcd( a, b).

If two numbers, a and b, are coprime, then gcd( a, b)  1 . The converse is also
true. A corollary to this idea is that there exist integers x and y such that
ax  by  1 (Bézout’s identity).
Farley 10

The Euler phi function or totient function, denoted  (N ), is defined as the
number of integers from 1 to N that are coprime to N, where such numbers are
totatives of N. Also, for any prime p,  ( p )  p  1.

When we say N divides a, we use the notation N | a. Also, there exists an integer
k such that a = Nk.

The common notation, a  b (mod N ), means a and b have the same remainder
when divided by N, or N | a-b.

If a  b (mod N ) and c  d (mod N ) then a  c  b  d (mod N ) and
ac  bd (mod N ).

If given that p and q are coprime, and p | a and q | a, then pq | a. Also, if pq | a
then p | a and q | a, for all p and q.

If p and q are coprime, then  ( p) (q)   ( pq).
The following section describes the theorems that RSA cryptography is based on:

Taken from the Chinese Remainder Theorem: Let p and q be integers, not
necessarily prime, such that they are coprime. If a  b (mod p ) and
a  b (mod q) then we have a  b (mod pq).
Proof: Since a  b (mod p ), then p | a-b. Also, since a  b (mod q), then
q | a-b. But p and q are coprime, thus pq | a-b. Therefore, a  b (mod pq). ■
(Davis, T.)

Fermat’s Little Theorem: For any integer a and any prime number p,
a
p
 a (mod p ). If a and p are coprime, then a
equivalently, p | (a
p 1
p 1
 1 (mod p ), or,
 1). This Theorem has many proofs, for ours, we will
Farley 11
prove it by using group theory. For this we will need to recognize that
G  1, 2, ..., p  1forms a group under the operation of multiplication modulo p
and also, we will need to use Lagrange’s Theorem: The order of any subgroup
divides the order of the whole group.
Proof: Let G  1, 2, ..., p  1form a group under the operation of multiplication
modulo p. Choose a  G. Let k be the order of a, so that a k  1 (mod p). By
Lagrange’s theorem, k divides the order of G, which is p  1. So, p 1  km for
some positive integer m. Thus, a
p1
 a km  (a k ) m  1m  1 (mod p). ■
(Wikipedia, Proofs of Fermat’s Little Theorem)

Fermat-Euler Theorem or Euler’s Generalization of Fermat’s Little
Theorem: For any number a coprime to n, we have a
 ( n)
 1 (mod n), where
 (n) is Euler’s phi function (as previously described). Since this is a
generalization of Fermat’s Little Theorem, we will not display a formal proof.
From here, we have all of the tools needed to prove the main result of the RSA process.
The following is a proof that will use the abovementioned number theory and theorems to
prove the main result of the RSA scheme. Choose two large primes, p and q, randomly
and independently of each other. Let N  pq, also, define the totient function
 ( N )  ( p  1)( q  1). Next, let e be an integer such that gcd( ( N ), e)  1 (i.e. e is
coprime to  (N ) ). Then, let d be a number such that ed  1 (mod  ( N )) (A further
elaboration of how to generate d is given following our proof). Given the message m, an
integer such that 0  m  N , define the encoded message, y, as y  m e mod N . We need
to show that the decoded message is given by m  y d mod N .
Farley 12
Proof of main result: Since we are given that ed  1 (mod  ( N )), we can express this
congruence as  ( N ) | (ed  1), that is  (N ) divides ed  1. Since p and q are coprime, we
have  ( N )   ( pq)   ( p) (q). Therefore,  ( p) (q) | (ed  1), and we can also express
this as  ( p) | (ed  1) and  (q) | (ed  1). Since  ( p) | (ed  1), then  an integer k such
that ed  1  k ( p) or, because p is prime, ed  1  k ( p  1).
Now, looking at the general case of a ed mod p, a  . We can form the congruence
a ed  a ed 11 (mod p), or equivalently, a ed  a ed 1  a (mod p). As shown above,
k ( p1)
 a (mod p).
we can substitute for ed  1 and we have, a ed  a
Since p is prime, then our integer, a, will be either coprime to p, or it will be a multiple of
p (i.e. gcd( a, p)  p).
Case 1: In the case where a is coprime to p, from Fermat’s Little Theorem, we
have a
p1
getting a
a ed  a
 1 (mod p). We can raise this to the power of an integer k
k ( p1)
k ( p1)
 1 (mod p). Thus, our congruence can be rewritten as
 a  1  a  a (mod p).
Case 2: Considering the other case where a is a multiple of p, we recognize that if
p | a, then p | a ed . From this fact, we have a ed  0 (mod p). But, since
p | a, we have a  0 (mod p). Therefore, because p | a  0, the
congruence a ed  a (mod p) still holds true.
By the same argument, we can show that because  (q ) | (ed  1), we have
a ed  a (mod q).
Farley 13
We have shown that a ed  a (mod p) and a ed  a (mod q). Thus, because p and q are
coprime, by the Chinese Remainder Theorem, a ed  a (mod pq)  a (mod N ).
We can now relate this to our original equation, as given, y  m e mod N . Now we have
y d  (m e ) d  m ed (mod N ). But, as a result shown above, m ed  m (mod N ).
Hence, y d  m (mod N ) and m  y d (mod N ).
At this point, it is important to mention that expressed as a congruence, this relationship
does not yield a one-to-one correspondence between our message m and our ciphertext y.
But, because the calculation is limited to  N , we are limiting the value to be less than our
modulus, N. Therefore, there is only one integer m, satisfying m  y d (mod N ). Thus,
m  y d mod N , 0  m  N . ■ (DI Management)
Further elaboration on the generation of d, with terms defined as above, is as
follows: Given the integer e, we will need to find d such that ed  1 (mod  ( N )). To
find this d, we will use Euler’s Generalization of Fermat’s Little Theorem. By this
 ( ( N ))
 ( ( N ))1
mod  ( N )  1, so e
mod  ( N ) is a suitable value for
theorem we get e
d (Davis, T.).
The following section describes, as promised, the computation of large modular
exponentiations:
Look at the following computation, as seen previously.
m1  2001783 mod 2173  080
Although this calculation looks tough, it can be done by taking the binary expansion of
the exponent, d, in order to calculate partial results modulo N, otherwise known as the
Farley 14
method of modular exponentiation by squaring. To elaborate, consider our d = 1783.
This can also be written as d =1024 + 512 + 128 + 64+ 32 + 16 + 4 + 2 + 1 = 1783.
Therefore, if we want to calculate 2001783 , we do the following:
1024  512 128  64  32 16  4  2 1
2001783  200
 2001024 200512    2001.
Now we need to calculate the partial results modulo N, and following the repeated
squaring pattern of 200, we will get all the exponents that have a power of 2. As we
obtain these, we will exclude the ones that are not used in the binary expansion of 1783
(we will not need 256 or 8):
2001 mod 2173  200
200 2 mod 2173  886
200 4 mod 2173  543
20016 mod 2173  365
20032 mod 2173  672
200 64 mod 2173 1773
200128 mod 2173 1371
200512 mod 2173 16
2001024 mod 2173  256
The result we will generate from this is:
2001783 mod 2173  (256  16  1371  1773  672  365  543  886  200 ) mod 2173  80  m1
That is exactly what we were looking for. In this case, the computation of 2001024 is
also too challenging for your calculator, but you can repeat the preceding process on it
and then use its result to finish the computation (Davis, T.).
Farley 15
Random quote:
“Mathematics is the Queen of sciences, and arithmetic the Queen of mathematics.”
–Carl Friedrich Gauss (1777-1855).
Reference:
Davis, T., “RSA Encryption.” Geometer, Math Circles. Retrieved on November 11,
2005 from the World Wide Web: http://www.geometer.org/mathcircles
DI Management, “RSA Theory.” DI Management Services, © 2000-2004. Retrieved on
December 4, 2005 from the World Wide Web:
http://www.di-mgt.com.au/rsa_theory.html
Farley 16
DI Management, “RSA Algorithm.” DI Management Services, © 2000-2004. Retrieved
on December 4, 2005 from the World Wide Web:
http://www.di-mgt.com.au/rsa_theory.html
Gilbert, J., Gilbert, L., “Elements of Modern Algebra.” Brooks/Cole, Thompson
Learning, © 2000. ISBN 0-534-37351-8.
Gottesman, D., Lo Hoi-Kwong, “Figure 2 from ‘From Quantum Cheating to Quantum
Security’.” Physics Today Online, © 2001 American Institute of Physics.
Retrieved on November 8, 2005 from the World Wide Web:
http://www.physicstoday.org/pt/vol-53/iss-11/captions/p22cap2.html
Johnsonbaugh, R., “Discrete Mathematics.” Prentice Hall © 2001. ISBN 0-13-089008-1.
Lewand, R. E., “Cryptological Mathematics.” The Mathematical Association of America
Inc., © 2000. Library of Congress Catalog Card Number 00-105256.
ISBN 0-88385-719-7.
Michon, G. P., “Final Answers: Modular Arithmetic.” © 2000-2005 Gerald P. Michon,
Ph.D. Retrieved on December 4, 2005 from:
http://home.att.net/~numericana/answer/modular.htm
Weisstein, E. W., “RSA-640 Factored.” MathWorld Headline News. Nov. 8, 2005.
Retrieved on November 11, 2005 from MathWorld:
http://mathworld.wolfram.com/news/2005-11-08/rsa-640/
Weisstein, E. W., “RSA Encryption.” From MathWorld—A Wolfram Web Resource.
Retrieved on October 21, 2005 from MathWorld:
http://mathworld.wolfram.com/RSAEncryption.html
Farley 17
Wikipedia, “Proofs of Fermat’s little theorem.” Wikipedia, the free encyclopedia.
Retrieved on November 10, 2005 from Wikipedia:
http://en.wikipedia.org/wiki/Proofs _of_Fermat%27s_little_theorem
Wikipedia, “RSA.” Wikipedia, the free encyclopedia. Retrieved on October 21, 2005
from Wikipedia: http://en.wikipedia.org/wiki/RSA
Download