RSA Cryptography - Full

2008
RSA Cryptography
The Science of Keeping Secrets
Math 430: Number Theory
My Viet Nguyen
Reference #: 32
5/1/2008
Tuohtiw egaugnal terces a etaerc nac uoy taht nam ecnassianer taerg a hcus gnieb enigami
dednah-tfel erew ouy esuaceb ecneinevnoc erup fo tuo metsysotpyrc siht detaerc ouy. Gniyrt neve
gnigdums tuoba hcum oot yrrow ot evah ton did uoy, dednah-thgir erew uoy fi. Nep lliuq a gnisu
ylbaborp Odranoel. Dednah-tfel erew uoy fi dluow uoy ekil mlap ruoy fo edis eht htiw krow ruoy
S’icniv ad tuo erugif to sraey ynam koot ti hgouhtla. Tfel ot thgir morf etorw eh suht, oot os thguoht
ot neht od ot sgniht retteb dah tsuj eh, saedi sih gnilaets srehto tuoba suovren ton saw eh, sterces
Nep lliuq sih htiw thgif
In other words…
Imagine being such a great renaissance man that you can create a secret language without
even trying. You created this cryptosystem out of pure convenience because you were left-handed
using a quill pen. If you were right-handed, you did not have to worry too much about smudging
your work with the side of your palm like you would if you were left-handed. Leonardo probably
thought so too, thus he wrote from right to left. Although it took many years to figure out da Vinci’s
secrets, he was not nervous about others stealing his ideas; he just had better things to do than to
fight with his quill pen.
Today, keeping secrets is a necessity. Allowing only the right people access to your private
information is the key ingredient in preventing identity fraud. In the following pages we will
discuss RSA cryptography by first examining where it came from and where it is today. We will
describe Caesar’s Cipher and the Diffie-Hellman Key Exchange, with the discrete logarithm problem
behind it, and then show how these ideas lead to the hard problem that is the focal point of RSA.
Cryptography, as my professor has said, is the science of keeping secrets. Before we can
discuss the exchange of secrets and messages, we need to find a way to convert letters, numbers,
and common symbols into integers that can easily be processed in a cryptosystem. For the purpose
of this paper we will use the following standard two-digit integer assignment:
Digital Alphabet
A = 00    K = 10    U = 20    1 = 30
B = 01    L = 11    V = 21    2 = 31
C = 02    M = 12    W = 22    3 = 32
D = 03    N = 13    X = 23    4 = 33
E = 04    O = 14    Y = 24    5 = 34
F = 05    P = 15    Z = 25    6 = 35
G = 06    Q = 16    , = 26    7 = 36
H = 07    R = 17    . = 27    8 = 37
I = 08    S = 18    ? = 28    9 = 38
J = 09    T = 19    0 = 29    ! = 39
(space) = 40 or 99
Table 1
Then the number
1200190740171402101839
is the message MATH ROCKS!, which can also be digitalized as
1200190799171402101839.
Likewise, the message
Don’t try that one on me!
can be digitalized to
031413199919172499190700199914130499141399120439
or 031413194019172440190700194014130440141340120439.
As you can see, lowercase letters and the apostrophe ( ’ ) are not included in our integer assignment,
so our messages cannot be case sensitive. As for the apostrophe, we would just omit it, since DONT
can easily be understood to be Don’t. It is possible to extend the integer assignment to include both
lowercase letters and other symbols. One example is the ASCII character assignment, which can be
found at http://www.petefreitag.com/cheatsheets/ascii-codes/. Once the original message, called
the plaintext, is digitalized, we can use a cryptosystem to turn it into a ciphertext, which is
unreadable by others.
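As a rough sketch, the digital alphabet of Table 1 could be coded in Python as follows. The names SYMBOLS and digitize are our own choices for illustration, and the space code is left as a parameter because Table 1 allows either 40 or 99.

```python
# Table 1 in code order 00..39; the space is handled separately.
SYMBOLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ,.?0123456789!"

def digitize(message, space_code="99"):
    """Turn a message into the string of two-digit codes used in this paper."""
    digits = []
    for ch in message.upper():
        if ch == " ":
            digits.append(space_code)
        elif ch in SYMBOLS:
            digits.append(f"{SYMBOLS.index(ch):02d}")
        # any other character (such as an apostrophe) is simply omitted
    return "".join(digits)

print(digitize("MATH ROCKS!", space_code="40"))  # 1200190740171402101839
print(digitize("Don't try that one on me!"))
```

Converting to uppercase first mirrors the fact that our messages cannot be case sensitive.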
In general all cryptosystems can be broken down into symmetric and asymmetric
cryptography, also known as private-key and public-key cryptography, respectively.
In a private-key cryptosystem, the plaintext is encrypted into a ciphertext using a single
secret, called the private key. Since this private key is necessary to both encrypt and decrypt a
message, it is important that only trusted parties know it. Some examples include Caesar’s
Cipher, Vigenère’s Cipher, and Hill’s Cipher.
Of the three mentioned, Caesar’s Cipher is the simplest to understand and the other two
ciphers are derived from it. It is said that Caesar himself used this cryptosystem to get messages
out to his soldiers who were out building his empire. With Caesar’s Cipher, he would take his
message and shift it three letters over. So his cipher alphabet would be:
Caesar’s Cipher
A = D    J = M    S = V
B = E    K = N    T = W
C = F    L = O    U = X
D = G    M = P    V = Y
E = H    N = Q    W = Z
F = I    O = R    X = A
G = J    P = S    Y = B
H = K    Q = T    Z = C
I = L    R = U
Table 2
For example, the plaintext
MATH ROX,
would be converted to the ciphertext
PDWK URA.
Even though Caesar used a three-letter shift, this general method can be used with any number of
shifts. Using a digital alphabet, we can create an equation from congruence theory that lets us
easily convert our message from plaintext, M, to ciphertext, C, one letter at a time. This
equation is
$$C \equiv M + k \pmod{A},$$
where k is the private key, or number of shifts, and A is the number of characters in the digital
alphabet. Then using Caesar’s Cipher, we would have the equation
$$C \equiv M + 3 \pmod{26}.$$
Using the same example, our message digitalized and encrypted can be found by utilizing just the
first 26 letters from the digital alphabet in Table 1, and it would be

Plaintext, M      M   A   T   H   R   O   X
Digitalized M     12  00  19  07  17  14  23
M + 3             15  03  22  10  20  17  26
(mod 26)          15  03  22  10  20  17  00
Ciphertext, C     P   D   W   K   U   R   A
Note that we can use any number k provided k is between 1 and A-1. If k were bigger than A-1, our
ciphertext would be the same as the ciphertext generated by k-A, since A is 0 modulo A.
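The shift equation can be sketched in a few lines of Python. The function name caesar_encrypt and its default 26-letter alphabet are our own assumptions for illustration; symbols outside the alphabet, such as the space, are passed through unchanged.

```python
def caesar_encrypt(plaintext, k, alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """Apply C = (M + k) mod A to every symbol found in `alphabet`."""
    size = len(alphabet)  # A in the text
    out = []
    for ch in plaintext:
        if ch in alphabet:
            out.append(alphabet[(alphabet.index(ch) + k) % size])
        else:
            out.append(ch)  # leave symbols outside the alphabet untouched
    return "".join(out)

print(caesar_encrypt("MATH ROX", 3))       # PDWK URA
print(caesar_encrypt("MATH ROX", 3 + 26))  # PDWK URA again, since 29 = 3 (mod 26)
```

The second call illustrates why a key larger than A-1 adds nothing new.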
Now to decrypt our message we use a similar process. To get our original message, M, we
subtract our private key, k, from the ciphertext, C:
$$M \equiv C - k \pmod{A}.$$
For example, if we had the private key k = 27, the full 41-character digital alphabet of Table 1
(with 40 for the space), and
C = ?GD_6_4,F52,!61_6F2,A69Z
(the underscore character, _, is used to make spaces easy to read), we have

Ciphertext, C     ?   G   D   _   6   _   4   ,   F   5   2   ,
Digitalized C     28  06  03  40  35  40  33  26  05  34  31  26
C - 27            01 -21 -24  13  08  13  06 -01 -22  07  04 -01
(mod 41)          01  20  17  13  08  13  06  40  19  07  04  40
Plaintext, M      B   U   R   N   I   N   G   _   T   H   E   _

Ciphertext, C     !   6   1   _   6   F   2   ,   A   6   9   Z
Digitalized C     39  35  30  40  35  05  31  26  00  35  38  25
C - 27            12  08  03  13  08 -22  04 -01 -27  08  11 -02
(mod 41)          12  08  03  13  08  19  04  40  14  08  11  39
Plaintext, M      M   I   D   N   I   T   E   _   O   I   L   !

Thus our secret message is M = Burning the midnite oil!
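The decryption above can be checked with a short Python sketch. The string ALPHABET and the function caesar_decrypt are our own names; the string lists Table 1 in code order, with the underscore standing in for the space at code 40.

```python
# Table 1 in code order 00..40, with "_" standing for the space (code 40).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ,.?0123456789!_"

def caesar_decrypt(ciphertext, k, alphabet=ALPHABET):
    """Apply M = (C - k) mod A to each symbol; Python's % always returns 0..A-1."""
    size = len(alphabet)  # A = 41 here
    return "".join(alphabet[(alphabet.index(ch) - k) % size] for ch in ciphertext)

print(caesar_decrypt("?GD_6_4,F52,!61_6F2,A69Z", 27))  # BURNING_THE_MIDNITE_OIL!
```

Because Python's % operator never returns a negative result for a positive modulus, the negative intermediate values in the table are reduced automatically.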
Although Caesar’s Cipher and other private-key cryptosystems can be used with some
effectiveness, they do have some flaws. One problem occurs when two people or entities, such as
Wescom Credit Union and SchoolsFirst Federal Credit Union (SFFCU), are trying to establish their
mutual private key. If they are relatively close by, they could send a representative in person to
exchange the private key. Otherwise they would have to find a secure way to exchange this
information, or have a trusted source deliver the key, without the knowledge of another person or
company, like Bank of America (BofA), who might want access to these credit unions’
secrets. If that exchange is intercepted, the private key is compromised. One way to counter this
attack is to use the Diffie-Hellman Key Exchange (DH KEy). This is a method that allows Wescom and
SFFCU to communicate openly while establishing their private key, without BofA finding it out.
DH KEy is an asymmetric scheme that is usually used in conjunction with a private-key
cryptosystem, as a convenient way to exchange a private key between two parties.
With asymmetric cryptography we have two separate keys. One is made public for anyone
to know and see, while the other is kept private. Wescom could release a public key to others so
they can send an encrypted message back to Wescom. Even if BofA intercepted the message, it
could not be decrypted without the private key. This is possible because the computation that
turns ciphertext back into plaintext is much harder to perform without the missing
information known as the private key. On the other hand, encrypting a plaintext message with
the given public key is easy to compute.
In 1976, Whitfield Diffie and Martin Hellman first used a discrete logarithm problem to
create their asymmetric cryptosystem, DH KEy. The discrete logarithm problem involves three
numbers, namely g, p, and x: knowing the values of
$$g, \quad p, \quad \text{and} \quad g^{x} \bmod p,$$
we need to find x.
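To get a feel for the problem, here is a minimal brute-force sketch in Python. The function name and the toy values g = 17 and p = 31 are assumptions chosen small enough to try by hand; with a prime of 200 or more digits, a loop like this would never finish.

```python
def discrete_log_brute_force(g, target, p):
    """Return the smallest x >= 1 with g**x % p == target, or None if there is none."""
    value = 1
    for x in range(1, p):
        value = (value * g) % p  # value now holds g**x mod p
        if value == target:
            return x
    return None

print(discrete_log_brute_force(17, 7, 31))  # 4, since 17**4 = 7 (mod 31)
```

The loop tries every exponent in turn, which is exactly the kind of search a large p is meant to make hopeless.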
There are special properties that are necessary to ensure that there are no easy shortcuts to
finding x. First, it is important that p is a very large prime number, usually 200 or more digits
long. A prime is a number that has only two positive divisors, 1 and itself. Some examples include
2, 3, 7, 17, 37, and 101. This p needs to be large so that it is computationally hard to find x. The
second requirement is that g must be a primitive root of p. A primitive root of a prime, p, is an
integer, g, between 1 and p-1 such that
$$g^{p-1} \equiv 1 \pmod{p},$$
and there is no smaller power, x, with $1 \le x < p-1$, where
$$g^{x} \equiv 1 \pmod{p}.$$
In other words, p-1 is the smallest power of g that will give us 1 modulo p. For our example, we will
use smaller primes so that we can easily see how the system works. Given the prime number 13, a
primitive root of 13 is 2 since
$$2^{1} \equiv 2,\ 2^{2} \equiv 4,\ 2^{3} \equiv 8,\ 2^{4} \equiv 3,\ 2^{5} \equiv 6,\ 2^{6} \equiv 12,$$
$$2^{7} \equiv 11,\ 2^{8} \equiv 9,\ 2^{9} \equiv 5,\ 2^{10} \equiv 10,\ 2^{11} \equiv 7,\ 2^{12} \equiv 1 \pmod{13}.$$
Unfortunately there are no easy calculations that will give us primitive roots, but there are a few
tricks we can use to help the process go a little faster. First we need Fermat’s Little Theorem (F l
T). It states that for every prime p and every integer g such that p does not divide g,
$$g^{p-1} \equiv 1 \pmod{p}.$$
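Let’s just make sure this is true. Suppose $g, p \in \mathbb{Z}$ are arbitrary, such that p is
prime and p does not divide g. First consider the first p-1 multiples of g, namely
$$g,\ 2g,\ 3g,\ \ldots,\ (p-1)g.$$
We can say that these are mutually incongruent modulo p. If there existed r and s
such that
$$1 \le r < s \le p-1$$
and
$$rg \equiv sg \pmod{p},$$
then, because p does not divide g, this would imply that
$$r \equiv s \pmod{p},$$
which is not possible since r and s are distinct and both smaller than the prime p. Also, since none
of the multiples of g are congruent to 0 modulo p, they must be congruent to 1, 2, ..., p-1 in some
order, so
$$g \cdot 2g \cdot 3g \cdots (p-1)g \equiv 1 \cdot 2 \cdot 3 \cdots (p-1) \pmod{p},$$
that is,
$$g^{p-1}(p-1)! \equiv (p-1)! \pmod{p}.$$
Since p does not divide $(p-1)!$, we can cancel $(p-1)!$ from both sides, which then gives
us what we are trying to prove,
$$g^{p-1} \equiv 1 \pmod{p}.$$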
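The two facts used in this argument can be spot-checked numerically. The helper below is only a sanity check on small values, not a proof, and its name is our own.

```python
def check_fermat(g, p):
    """For a prime p and g not divisible by p, check both steps of the argument."""
    multiples = sorted((i * g) % p for i in range(1, p))
    # g, 2g, ..., (p-1)g should land on 1, 2, ..., p-1 exactly once each
    is_permutation = multiples == list(range(1, p))
    return is_permutation and pow(g, p - 1, p) == 1

print(check_fermat(2, 13))   # True
print(check_fermat(17, 31))  # True
```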
We can now find primitive roots much faster, because if
$$g^{p-1} \equiv 1 \pmod{p}$$
and x is the smallest positive power with
$$g^{x} \equiv 1 \pmod{p},$$
then x must divide p-1. Thus when looking for a primitive root, we only need to check the factors of
p-1 and not all integers between 1 and p-1. The proof is omitted; however, consider p = 31. We can
check that 17 is a primitive root by checking that the powers 1, 2, 3, 5, 6, 10, and 15 of 17 are
not 1 modulo 31.
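This shortcut can be sketched in Python as follows. The function name is our own, p is assumed to be prime, and the built-in pow(g, x, p) computes a power modulo p without forming the huge intermediate number.

```python
def is_primitive_root(g, p):
    """True if no proper divisor x of p - 1 gives g**x % p == 1 (p assumed prime)."""
    if g % p == 0:
        return False
    for x in range(1, p - 1):
        # only the proper divisors of p - 1 need to be tested
        if (p - 1) % x == 0 and pow(g, x, p) == 1:
            return False
    return True

print(is_primitive_root(2, 13))   # True
print(is_primitive_root(17, 31))  # True
print(is_primitive_root(3, 13))   # False, since 3**3 = 27 = 1 (mod 13)
```

For p = 31 the loop only evaluates the powers 1, 2, 3, 5, 6, 10, and 15, the same list checked by hand above.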
Now that we have a prime number p and a primitive root g, we can begin the process of DH
KEy. Wescom and SFFCU both agree on a prime, p, and a primitive root, g. Then Wescom and
SFFCU each pick or randomly generate their own secret number, w and s respectively.
Wescom then computes
$$W \equiv g^{w} \pmod{p},$$
and SFFCU computes
$$S \equiv g^{s} \pmod{p}.$$
Then Wescom and SFFCU send W and S to each other. Once Wescom receives S, they compute
$$k \equiv S^{w} \pmod{p}.$$
And when SFFCU receives W, they can compute
$$k \equiv W^{s} \pmod{p}.$$
Since
$$S^{w} \equiv (g^{s})^{w} \equiv g^{sw} \equiv (g^{w})^{s} \equiv W^{s} \pmod{p},$$
both Wescom and SFFCU can now use this as their private key k. This k can be used in
conjunction with one of the private-key ciphers. So BofA may be able to intercept g, p, S, and W, but
will not be able to find k without w or s. Suppose Wescom and SFFCU agree on the prime, 31,
and the primitive root, 17. Wescom might choose w = 4, and SFFCU might select s = 6. Then
Wescom performs the following operation,
$$W \equiv 17^{4} \equiv 7 \pmod{31},$$
and SFFCU calculates
$$S \equiv 17^{6} \equiv 8 \pmod{31}.$$
Wescom and SFFCU then share the numbers 7 and 8 with each other.
Wescom then computes
$$k \equiv 8^{4} \equiv 4 \pmod{31},$$
while SFFCU calculates
$$k \equiv 7^{6} \equiv 4 \pmod{31}.$$
Now that they have this private key, k, they can use it along with a symmetric cipher to exchange
messages.
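The whole exchange fits in a few lines of Python. The prime 31 and primitive root 17 come from the example; the secret exponents w = 4 and s = 6 are assumptions, picked because they produce the shared numbers 7 and 8 quoted in the text.

```python
p, g = 31, 17             # public prime and primitive root from the example
w, s = 4, 6               # secret exponents (assumed; they give W = 7 and S = 8)

W = pow(g, w, p)          # Wescom's public value
S = pow(g, s, p)          # SFFCU's public value

k_wescom = pow(S, w, p)   # Wescom combines S with its own secret
k_sffcu = pow(W, s, p)    # SFFCU combines W with its own secret

print(W, S)               # 7 8
print(k_wescom, k_sffcu)  # 4 4
```

Both sides arrive at the same k even though neither secret exponent was ever sent.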
Even with DH KEy, over time people outside of those entrusted with the private key could
systematically find the key. With today’s technology, frequency analysis can recover the key of a
simple symmetric cipher quite easily, since the encryption scheme and decryption scheme are
basically the same. One way to prevent that from happening is to constantly change the private key.
Wescom and SFFCU could add an extra line to their encrypted message with a new prime and/or
primitive root. This would hinder frequency analysis of the ciphertext. However, primitive roots,
although simple to test for, can be computationally tedious to calculate as we select bigger and
bigger prime numbers, and there has to be a minimum of three contacts between the two parties just
to send or receive the first set of ciphertext. When Wescom and SFFCU are sharing time-sensitive
information, keeping the number of contacts to a minimum can be crucial.
An alternative is RSA cryptography. RSA is one of the most widely used cryptosystems
and is the basis for many other public-key cryptosystems. RSA stands for the initials of the MIT
professors R. Rivest, A. Shamir, and L. Adleman, who created this cipher. Like DH KEy, it is built on
modular exponentiation, but its security rests on the difficulty of factoring a large number, a
different hard problem from the discrete logarithm problem. Also, instead of only being used to
create a private key, RSA is used to encrypt the plaintext itself.
If SFFCU wanted to receive messages using RSA, they would first find two large primes, p and q,
usually over 200 digits each, and multiply them together to get n. Then SFFCU carefully selects an
encryption exponent, e. It is important that
$$\gcd(e, \varphi(n)) = 1, \quad \text{where } \varphi(n) = (p-1)(q-1).$$
Then SFFCU would publish the integer pair (n, e). This would be the public key, while the pair of
primes, (p, q), is kept private. Then if Wescom wanted to send the message, M, to SFFCU, it would
take the public key pair and compute
$$C \equiv M^{e} \pmod{n},$$
and send the ciphertext, C, to SFFCU. For example, say SFFCU selected the prime pair (157, 163) and
published the public key (25591, 7), and Wescom wanted to let SFFCU know that
“Banks are evil!”
Wescom would first digitize it using a table similar to Table 1 and translate the message to
010013101899001704990421081139
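Then Wescom would separate this long strand of digits into smaller strings of digits, called
bit-strings, so that each string is smaller than n. This is necessary because the arithmetic is done
modulo n, so a number larger than n does not survive encryption and decryption. In our example the
whole strand came back as a number that can be represented as BHA, which is not the same as our
original message. Then let’s break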
010013101899001704990421081139
into
010013/10189/9001/7049/9042/10811/39,
where / signifies the end of one bit-string and the beginning of the next. Since the last number is
much smaller, we can add in a filler number like a space or a random symbol to make it a bit bigger.
Then we calculate
$$C_i \equiv M_i^{7} \pmod{25591}$$
for each bit-string $M_i$; for instance, $10013^{7} \equiv 3659 \pmod{25591}$.
Then Wescom would send SFFCU
C = 3659/16489/7322/16175/1166/9913/10926.
With RSA we would not convert the ciphertext back into letters, because the output numbers are
sometimes bigger than the largest value in the digital alphabet. Forcing them back into the
alphabet would require reducing them, which could change the message, just as we saw earlier
when the message, M, was not split into bit-strings smaller than n.
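The block-by-block encryption can be sketched in Python as below. The function name is our own, and the final padded block is left out because the filler that was appended to 39 is not stated.

```python
n, e = 25591, 7   # SFFCU's public key from the example (n = 157 * 163)

def rsa_encrypt_blocks(blocks, e, n):
    """Encrypt each block M_i as C_i = M_i**e mod n; every block must be below n."""
    for m in blocks:
        if not 0 <= m < n:
            raise ValueError(f"block {m} is not smaller than n = {n}")
    return [pow(m, e, n) for m in blocks]

# First six bit-strings of the message, read as integers (010013 becomes 10013).
blocks = [10013, 10189, 9001, 7049, 9042, 10811]
cipher = rsa_encrypt_blocks(blocks, e, n)
print(cipher[0])  # 3659
```

The range check is there to enforce the rule that every bit-string must be smaller than n.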
So now that SFFCU has this ciphertext, how do they decrypt it back to the original message?
In order to do this, SFFCU needs the decryption exponent, d. To get d, we need to solve the
congruence
$$ed \equiv 1 \pmod{\varphi(n)}, \quad \text{where } \varphi(n) = (p-1)(q-1).$$
Since p and q are both large primes, p-1 and q-1 will both be large even numbers. So it
would be hard to know what $\varphi(n)$ is without knowing what p and q were originally.
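To solve this congruence, we use the Euclidean Algorithm. From our example above,
$$\varphi(25591) = (157-1)(163-1) = 156 \cdot 162 = 25272,$$
thus we need to solve
$$7d \equiv 1 \pmod{25272}.$$
Then it is equivalent to say
$$7d + 25272y = 1 \quad \text{for some integer } y.$$
Then by the Euclidean Algorithm we get
$$25272 = 3610 \cdot 7 + 2, \qquad 7 = 3 \cdot 2 + 1.$$
Then combining these we get
$$1 = 7 - 3 \cdot 2 = 7 - 3(25272 - 3610 \cdot 7) = 10831 \cdot 7 - 3 \cdot 25272.$$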
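The back-substitution in the Euclidean Algorithm can be automated with the extended form of the algorithm. The sketch below is one common way to write it; the function names are our own.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def decryption_exponent(e, phi):
    """Solve e*d = 1 (mod phi) for d in the range 0..phi-1."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e and phi must be relatively prime")
    return x % phi

phi = (157 - 1) * (163 - 1)         # 25272
print(decryption_exponent(7, phi))  # 10831
```

The error case corresponds to the requirement that e be chosen relatively prime to (p-1)(q-1).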
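Thus d = 10831. Once we have d, we can calculate
$$M \equiv C^{d} \pmod{n}$$
to get the original message again. Let’s check to make sure this works.
From above we have
$$ed \equiv 1 \pmod{\varphi(n)}.$$
Then it is also true that
$$ed = 1 + t\,\varphi(n) = 1 + t(p-1)(q-1) \quad \text{for some integer } t.$$
Then
$$C^{d} \equiv (M^{e})^{d} \equiv M^{ed} \equiv M \cdot M^{t(p-1)(q-1)} \pmod{n}.$$
Since n = pq, we can use the Chinese Remainder Theorem¹ (CRT) to break this into a system of
congruences,
$$C^{d} \equiv M \cdot (M^{p-1})^{t(q-1)} \pmod{p}, \qquad C^{d} \equiv M \cdot (M^{q-1})^{t(p-1)} \pmod{q}.$$
Recall that by F l T, when p and q do not divide M,
$$M^{p-1} \equiv 1 \pmod{p} \quad \text{and} \quad M^{q-1} \equiv 1 \pmod{q},$$
so $C^{d} \equiv M$ modulo both p and q (this also holds when p or q divides M, since both sides
are then 0), and so by the uniqueness of CRT,
$$C^{d} \equiv M \pmod{n}.$$
¹ For a proof of the Chinese Remainder Theorem, please see Appendix A.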
So if Wescom sent another message to SFFCU, such as
21509 / 17445 / 5624 / 4093 / 5624 / 11835 /
14368 / 25277 / 9078 / 25277 / 4951 / 19352,
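SFFCU would decrypt it by computing
$$M_i \equiv C_i^{10831} \pmod{25591}$$
for each bit-string; for instance, $21509^{10831} \equiv 314 \pmod{25591}$, which is written as 0314.
Putting all of these strings down we get

Digitalized M     03 14 13 19 99 19 17 24 99 19 07 00 19 99 14 13 04 99 14 13 99 12 04 39
Plaintext, M      D  O  N  T  _  T  R  Y  _  T  H  A  T  _  O  N  E  _  O  N  _  M  E  !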
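A small Python sketch of this decryption is given below. The helper names are our own, and the four-digit block width is an assumption inferred from the table (12 ciphertext blocks for 24 two-digit codes). Codes that do not appear in Table 1 are shown as # rather than guessed.

```python
SYMBOLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ,.?0123456789!"   # codes 00..39 from Table 1
CODES = {f"{i:02d}": ch for i, ch in enumerate(SYMBOLS)}
CODES["40"] = CODES["99"] = " "                        # both codes mean a space

def rsa_decrypt_blocks(cipher_blocks, d, n, width=4):
    """Compute M_i = C_i**d mod n and pad each block back to `width` digits."""
    return "".join(f"{pow(c, d, n):0{width}d}" for c in cipher_blocks)

def undigitize(digits):
    """Map two-digit codes back to symbols; unknown codes become '#'."""
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    return "".join(CODES.get(pair, "#") for pair in pairs)

n, d = 25591, 10831
cipher = [21509, 17445, 5624, 4093, 5624, 11835,
          14368, 25277, 9078, 25277, 4951, 19352]
digits = rsa_decrypt_blocks(cipher, d, n)
print(digits)
print(undigitize(digits))
```

Padding with leading zeros matters here: the first block decrypts to 314, which must be read as 0314 to give the codes 03 and 14.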
RSA cryptography also relies on large primes, similar to DH KEy, although a primitive
root is not necessary in RSA. Many of the cryptosystems in use today are RSA or some
variation of it. So rest assured that Wescom Credit Union and SchoolsFirst Federal Credit Union can
easily exchange information without Bank of America or anyone else being able to steal the private
information. So even though this seems complicated, I would like to leave you with one last
ciphertext, a quote from S. Gudder,
SGNIHT ELPMIS EKAM OT TON SI SCITAMEHTAM FO ECNESSE EHT
ELPMIS SGNIHT DETACILPMOC EKAM OT TUB, DETACILPMOC
Works Cited
Annin, Scott. “Math 430: Number Theory.” Class notes. 2008.
Burton, David M. Elementary Number Theory. New Delhi: Tata McGraw-Hill, 2007.
Freitag, Pete. “ASCII Character Codes & Cheat Sheet.” 2005-2008.
http://www.petefreitag.com/cheatsheets/ascii-codes/
Lu, Mark. “Large Mod Calculator.” 2008.
http://www.excelex.net/powermod.php
Marx, Kyle. “TI-83 RSA Program.” 2008.
McCurley, Kevin. “Diffie-Hellman Key Exchange.” 1/23/1998.
http://www.swcp.com/~mccurley/talks/msri2/node14.html
Sequib, Al. “Diffie-Hellman Key Exchange.”
http://www.xml-dev.com/blog/index.php?action=viewtopic&id=196
Appendix A
The Chinese Remainder Theorem, CRT.
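Let
$$n_1, n_2, \ldots, n_r$$
be positive integers such that
$$\gcd(n_i, n_j) = 1 \quad \text{whenever } i \neq j,$$
then the system of equations:
$$x \equiv a_1 \pmod{n_1}, \quad x \equiv a_2 \pmod{n_2}, \quad \ldots, \quad x \equiv a_r \pmod{n_r}$$
has a unique solution for x modulo $n = n_1 n_2 \cdots n_r$.
Proof:
To show this is true, first we define
$$n = n_1 n_2 \cdots n_r,$$
then
$$N_i = \frac{n}{n_i}$$
such that
$$\gcd(N_i, n_i) = 1.$$
Then the congruence
$$N_i x_i \equiv 1 \pmod{n_i}$$
has a unique solution where $x_i$ is unknown. We claim that
$$\bar{x} = a_1 N_1 x_1 + a_2 N_2 x_2 + \cdots + a_r N_r x_r$$
is a unique solution to the system. Remember that
$$N_j \equiv 0 \pmod{n_i} \quad \text{for } j \neq i$$
and
$$N_i x_i \equiv 1 \pmod{n_i},$$
thus
$$\bar{x} \equiv a_i N_i x_i \pmod{n_i}$$
and
$$a_i N_i x_i \equiv a_i \pmod{n_i}.$$
This process can be checked for each $N_i$ and $n_i$ pair to show that
$$\bar{x} \equiv a_i \pmod{n_i}$$
for all i = 1, 2, …, r. Last we need to check for uniqueness. Suppose y is also a solution to the
system. Then we need to show that
$$\bar{x} \equiv y \pmod{n}.$$
Since y is a solution to the system, then
$$\bar{x} \equiv a_i \equiv y \pmod{n_i}$$
and
$$n_i \mid (\bar{x} - y) \quad \text{for each } i.$$
We also know
$$\gcd(n_i, n_j) = 1$$
whenever $i \neq j$, then
$$n_1 n_2 \cdots n_r \mid (\bar{x} - y),$$
which implies that
$$\bar{x} \equiv y \pmod{n}.$$
Thus the solution to the CRT is unique.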
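The construction used in the proof translates almost line for line into Python. In this sketch the function name and the small sample system are hypothetical, and pow(N_i, -1, n_i), which computes a modular inverse, needs Python 3.8 or later.

```python
def crt(residues, moduli):
    """Solve x = a_i (mod n_i) for pairwise relatively prime n_i, following the proof."""
    n = 1
    for n_i in moduli:
        n *= n_i
    x_bar = 0
    for a_i, n_i in zip(residues, moduli):
        N_i = n // n_i
        x_i = pow(N_i, -1, n_i)   # solves N_i * x_i = 1 (mod n_i)
        x_bar += a_i * N_i * x_i
    return x_bar % n

# Hypothetical system: x = 2 (mod 3), x = 3 (mod 5), x = 2 (mod 7).
print(crt([2, 3, 2], [3, 5, 7]))  # 23
```

Applied to the RSA primes, crt([M % 157, M % 163], [157, 163]) returns M again for any M below 25591, which is the uniqueness property used in the decryption check.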
Let’s see an example to see how this works, compute
By the uniqueness of the CRT, it is easier to compute the system
Note that
Thus we have
,
We need to find
such that
or
and
or
Then we can see that
. Thus we need to compute
,