DNA Cryptography

DNA CRYPTOGRAPHY IN COMPUTER APPLICATIONS
I
MAILAM ENGINEERING COLLEGE
J.KARTHICK
[email protected]
9566091883, 8124989452.
ABSTRACT
DNA cryptography is a new and very promising direction in cryptography research. DNA can be used in cryptography for storing and transmitting information, as well as for computation. Although still in a primitive stage, DNA cryptography has been shown to be very effective. Currently, several DNA computing algorithms have been proposed for a number of cryptography, cryptanalysis and steganography problems, and they are very powerful in these areas. However, the use of DNA as a means of cryptography has so far carried high-tech laboratory requirements and computational limitations, as well as labor-intensive extrapolation. In this project, we do not intend to use real DNA to perform the cryptographic process; rather, we introduce a new cryptography method based on the central dogma of molecular biology. Since this method simulates some critical processes of the central dogma, it is a pseudo-DNA cryptography method. Theoretical analysis and experiments show the method to be efficient in computation, storage and transmission, and it is very powerful against certain attacks. Thus, the method has many uses in cryptography, such as enhancing the security and speed of other cryptography methods. There are also extensions and variations of this method that offer enhanced security, effectiveness and applicability. The RSA algorithm and ECC (elliptic curve cryptography) are used in this project in connection with the DNA computing technique to encrypt messages. Finally, we compare the performance of DNA-computing-based elliptic curve cryptography with the RSA algorithm.

INTRODUCTION

With the rapid development of information technology, the security of information and communication becomes more and more important. Cryptographic techniques are the kernel of information security technology as a whole. Recent research has considered DNA as a medium for secure data transmission and for ultra-compact information storage. DNA was proposed for computation by Adleman in 1994, and many approaches have been investigated since. It has been shown that DNA computing is suitable for some problems that are currently hard to solve (intractable) and could work faster than an electronic computer. After Adleman solved the Hamiltonian Path Problem using a molecular method, many hard combinatorial computational problems were tackled with DNA computers. DNA computing could address many problems, such as NP-complete problems, the 0-1 planning problem, the SAT problem, the integer planning problem, optimization problems, graph theory, cryptography, databases, etc. For DNA computing, an important problem is how to reduce the probability of mistakes during the reaction, to make the code
work better. Many experiments have shown that the reliability of these new methods can be improved significantly through a suitable encoding strategy.

A single strand of DNA consists of four different base nucleotides: adenine (A), thymine (T), cytosine (C) and guanine (G). Once attached to deoxyribose, these nucleotides can be strung together to generate long sequences. Each single strand pairs up with a complementary strand to form a double helix. Pairing occurs only under the Watson-Crick complement rule: A pairs only with T, and G pairs only with C. The two strands can also be separated by various chemical and physical means. Dissociation separates the strands from each other, but it does not break the chemical bonds that hold the nucleotides of each strand together. We usually use different DNA molecule structures to solve different problems.
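
The Watson-Crick complement rule can be illustrated with a short Python sketch. The function names `complement` and `reverse_complement` are illustrative and do not come from the original text. Reading the paired strand in the opposite direction reflects the usual convention that the two strands of a double helix run antiparallel, which is background knowledge rather than something stated above.

```python
# Watson-Crick pairing: A pairs only with T, and G pairs only with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIR[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complement read in the opposite direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(complement("ATGC"))          # TACG
    print(reverse_complement("ATGC"))  # GCAT
```

Applying `complement` twice returns the original strand, which mirrors the fact that each strand of the helix determines the other.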

The majority of DNA molecules used in DNA computing are single strands and double strands, but some DNA computing work uses hairpin DNA molecules and plasmid DNA molecules.

Data security and cryptography are critical aspects of conventional computing and may also be important to possible DNA database applications. The goal is to transmit a message between a sender and a receiver such that a hacker is unable to understand it. Encryption is the process of scrambling the plaintext using a known algorithm and a secret key. The output is a sequence of characters known as the ciphertext. Decryption is the reverse process, which transforms the encrypted message back to its original form using a key. The goal of encryption is to prevent decryption by an adversary who does not know the secret key. An unbreakable cryptosystem is one for which successful cryptanalysis is not possible. Such a system is the one-time-pad cipher. It gets its name from the fact that the sender and receiver each possess identical notepads filled with random data. Each piece of data is used once to encrypt a message by the sender and to decrypt it by the receiver, after which it is destroyed.

DNA Cryptosystems Using Random One-Time-Pads

One-time-pad encryption uses a codebook of random data to convert plaintext to ciphertext. Since the codebook serves as the key, if it were predictable (i.e. not random), then an adversary could guess the algorithm that generates the codebook, allowing decryption of the message. No piece of data from the codebook should ever be used more than once.
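
As a point of comparison, a conventional (non-DNA) one-time pad can be sketched in Python. This sketch assumes a byte-wise XOR pad, which is the usual binary formulation and not the DNA codebook described in this section. The names `make_pad` and `otp_xor` are hypothetical.

```python
import secrets


def make_pad(length: int) -> bytes:
    """Generate a random pad; it must stay secret and be used only once."""
    return secrets.token_bytes(length)


def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR data with the pad. The same call encrypts and decrypts."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the message")
    return bytes(d ^ p for d, p in zip(data, pad))


if __name__ == "__main__":
    message = b"think"
    pad = make_pad(len(message))          # shared in advance by both parties
    ciphertext = otp_xor(message, pad)    # sender side
    recovered = otp_xor(ciphertext, pad)  # receiver side
    assert recovered == message
```

Both parties would discard the pad after this single use.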

If any piece of the codebook were reused, it would leak information about the probability distribution of the plaintext, increasing the efficiency of an attempt to guess the message. Cryptosystems that use a secret random one-time-pad are the only cryptosystems known to be absolutely unbreakable.

We will first assemble a large one-time-pad in the form of a DNA strand, which is randomly assembled from short oligonucleotide sequences, then isolated and cloned. These one-time-pads are assumed to be constructed in secret, and we further assume that specific one-time-pads are shared in advance by both the sender and receiver of the secret message. This assumption requires initial communication of the one-time-pad between sender and receiver, which is facilitated by the compact nature of DNA. Decryption is done by similar methods.

DNA Cryptosystem Using Substitution

A substitution one-time-pad system uses a plaintext binary message and a table defining a random mapping to ciphertext. The input strand is of length n and is partitioned into plaintext words of fixed length. The table maps all possible plaintext strings of a fixed length to corresponding ciphertext strings, such that there is a unique reverse mapping.

Encryption occurs by substituting each plaintext DNA word with a corresponding DNA cipher word. The mapping is implemented using a long DNA pad consisting of many segments, each of which specifies a single plaintext word to cipher word mapping. The plaintext word acts as a hybridization site for the binding of a primer, which is then elongated. This results in the formation of a plaintext-ciphertext word-pair. An ideal one-time-pad library would contain a huge number of pads, and each would provide a perfectly unique, random mapping from plaintext words to cipher words.

3.3 Mapping the plaintext:

The plaintext is mapped to nucleotides. The mapping is given in Table 3.1 and is used to encode the original message. For example, if the plaintext is 'think', each character is encoded with the corresponding nucleotide triplet given in Table 3.1.

Table 3.1: Nucleotides

A = CGA    K = AAG    U = CTG          0 = ACT
B = CCA    L = TGC    V = CCT          1 = ACC
C = GTT    M = TCC    W = CCG          2 = TAG
D = TTG    N = TCT    X = CTA          3 = GAC
E = GGC    O = GGA    Y = AAA          4 = GAG
F = GGT    P = GTG    Z = CTT          5 = AGA
G = TTT    Q = AAC    (space) = ATA    6 = TTA
H = CGC    R = TCA    , = TCG          7 = ACA
I = ATG    S = ACG    . = GAT          8 = AGG
J = AGT    T = TTC    : = GCT          9 = GCG
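
The mapping of Table 3.1 can be sketched in Python as a lookup table. The entries below are transcribed from Table 3.1. Reading the ATA entry as the space character and upper-casing the input before lookup are assumptions, and the function names `encode` and `decode` are illustrative.

```python
# Character-to-triplet mapping transcribed from Table 3.1.
# Assumption: the ATA entry stands for the space character.
TABLE = {
    "A": "CGA", "B": "CCA", "C": "GTT", "D": "TTG", "E": "GGC",
    "F": "GGT", "G": "TTT", "H": "CGC", "I": "ATG", "J": "AGT",
    "K": "AAG", "L": "TGC", "M": "TCC", "N": "TCT", "O": "GGA",
    "P": "GTG", "Q": "AAC", "R": "TCA", "S": "ACG", "T": "TTC",
    "U": "CTG", "V": "CCT", "W": "CCG", "X": "CTA", "Y": "AAA",
    "Z": "CTT", " ": "ATA", ",": "TCG", ".": "GAT", ":": "GCT",
    "0": "ACT", "1": "ACC", "2": "TAG", "3": "GAC", "4": "GAG",
    "5": "AGA", "6": "TTA", "7": "ACA", "8": "AGG", "9": "GCG",
}
# Every triplet is unique, so the mapping can be inverted for decoding.
REVERSE = {codon: char for char, codon in TABLE.items()}


def encode(plaintext: str) -> str:
    """Replace each character with its nucleotide triplet."""
    return "".join(TABLE[char] for char in plaintext.upper())


def decode(dna: str) -> str:
    """Split the strand into triplets and map each back to a character."""
    if len(dna) % 3 != 0:
        raise ValueError("strand length must be a multiple of 3")
    return "".join(REVERSE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print(encode("think"))            # TTCCGCATGTCTAAG
    print(decode("TTCCGCATGTCTAAG"))  # THINK
```

Because the table is fixed and public, this step only encodes the message; the secrecy comes from the encryption applied afterwards.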

The original message is the plaintext. The plaintext is mapped to nucleotides. After the mapping process is done, the result is converted to numbers. A public-key cryptography algorithm is then used to encrypt and decrypt the message. Here the public-key cryptography algorithms RSA and ECC are used to recover the original plaintext.

In its early period, cryptography was carried out using manual techniques. The basic framework of cryptography has remained much the same since then. Most importantly, computers now carry out these cryptographic algorithms and functions, making the process much faster and more secure. For telecommunication and data, cryptography is necessary over any untrusted medium, which includes any network and, in particular, any system. Cryptography not only protects data from being stolen or altered, but can also be used for user authentication.

Public-key cryptography has been considered the most significant development in cryptography in recent years. It is also called asymmetric cryptography: two different keys are used in different ways. One key is used for encryption and the corresponding key must be used for decryption. The RSA algorithm is the most common and proven asymmetric-key cryptographic algorithm. RSA is based on the mathematical fact that it is easy to find and multiply large prime numbers together, but extremely difficult to factor their product. The private and public keys in RSA are based on extremely large prime numbers.

P and Q (two large prime numbers are chosen)
N = P*Q (calculate N)
Public key: E and private key: D
Encryption: CT = PT^E mod N
CT is sent to the receiver
Decryption: PT = CT^D mod N
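
The RSA steps (choose P and Q, compute N, derive E and D, then encrypt and decrypt) can be traced with a toy Python example. The primes P = 61 and Q = 53, the exponent E = 17 and the plaintext number 65 are illustrative values chosen for readability, not values from this project. Computing D as the inverse of E modulo (P-1)(Q-1) is the textbook construction and is an addition to what the text states. Real keys use primes hundreds of digits long together with a padding scheme.

```python
from math import gcd

P, Q = 61, 53              # two (toy) prime numbers
N = P * Q                  # modulus N = P*Q
PHI = (P - 1) * (Q - 1)    # Euler's totient of N

E = 17                     # public exponent, must be coprime with PHI
assert gcd(E, PHI) == 1
D = pow(E, -1, PHI)        # private exponent (needs Python 3.8+)


def encrypt(pt: int) -> int:
    """CT = PT^E mod N"""
    return pow(pt, E, N)


def decrypt(ct: int) -> int:
    """PT = CT^D mod N"""
    return pow(ct, D, N)


if __name__ == "__main__":
    PT = 65                # plaintext number, must be smaller than N
    CT = encrypt(PT)
    print(N, D, CT)        # 3233 2753 2790
    assert decrypt(CT) == PT
```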

Elliptic curve cryptography (ECC) is an approach to public-key cryptography based on the algebraic structure of elliptic curves over finite fields. Public-key cryptography is based on the intractability of certain mathematical problems. Early public-key systems, such as the RSA algorithm, are secure assuming that it is difficult to factor a large integer composed of two or more large prime factors. For elliptic-curve-based protocols, it is assumed that finding the discrete logarithm of a random elliptic curve element with respect to a publicly known base point is infeasible. The size of the elliptic curve determines the difficulty of the problem. It is believed that the same level of security afforded by an RSA-based system with a large modulus can be achieved with a much smaller elliptic curve group. Using a small group reduces storage and transmission requirements.

For current cryptographic purposes, an elliptic curve is a plane curve which consists of the points satisfying the equation

y^2 = x^3 + ax + b

along with a distinguished point at infinity, denoted O. (The coordinates here are to be chosen from a fixed finite field of characteristic not equal to 2 or 3, or the curve equation will be somewhat more complicated.) This set, together with the group operation of elliptic group theory, forms an Abelian group, with the point at infinity as identity element. The structure of the group is inherited from the divisor group of the underlying algebraic variety.

The entire security of ECC depends on the ability to compute a point multiplication and the inability to compute the multiplicand given the original and product points.

Sender key generation:
Private key: nA; Public key: PA = nA*G
Receiver key generation:
Private key: nB; Public key: PB = nB*G

Elliptic Curve Encryption

Elliptic curve cryptography can be used to encrypt plaintext messages, M, into ciphertexts. The plaintext message M is encoded into a point PM from the finite set of points in the elliptic group Ep(a, b). The first step consists in choosing a generator point G ∈ Ep(a, b) such that the smallest value of n for which nG = O is a very large prime number. The elliptic group Ep(a, b) and the generator point G are made public.

Each user selects a private key nA < n and computes the public key PA as PA = nA G. To encrypt the message point PM for Bob (B), Alice (A) chooses a random integer k and computes the ciphertext pair of points PC using Bob's public key PB:

PC = [(kG), (PM + kPB)]

After receiving the ciphertext pair of points PC, Bob multiplies the first point, (kG), by his private key nB, and then subtracts the result from the second point in the ciphertext pair, (PM + kPB):

(PM + kPB) - nB(kG) = PM + k(nB G) - nB(kG) = PM

which is the plaintext point, corresponding to the plaintext message M.
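
The key generation, encryption and decryption steps can be sketched in Python on a toy curve. The curve y^2 = x^3 + x + 6 over the integers modulo 11, the generator G = (2, 7) of prime order 13, and the message point (10, 9) are illustrative assumptions, far too small for real use. The helper names `add`, `mul`, `neg`, `keygen`, `encrypt` and `decrypt` are hypothetical. How a message M is encoded into a point PM is not covered here.

```python
import secrets

# Toy curve y^2 = x^3 + A*x + B over GF(P_MOD); illustrative values only.
P_MOD, A, B = 11, 1, 6
G = (2, 7)       # public generator point
N_ORDER = 13     # order of G (prime): N_ORDER * G = O
O = None         # point at infinity, the group identity


def add(p1, p2):
    """Add two points using the chord-and-tangent rule."""
    if p1 is O:
        return p2
    if p2 is O:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O  # p2 is the inverse of p1
    if p1 == p2:  # tangent slope for doubling
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:         # chord slope for distinct points
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def neg(point):
    """Return the inverse point -P."""
    return O if point is O else (point[0], -point[1] % P_MOD)


def mul(k, point):
    """Compute k*point by double-and-add."""
    result, addend = O, point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


def keygen():
    """Private key n with 1 <= n < N_ORDER, public key n*G."""
    n_priv = secrets.randbelow(N_ORDER - 1) + 1
    return n_priv, mul(n_priv, G)


def encrypt(pm, pub_b):
    """PC = [(kG), (PM + k*PB)] for a fresh random k."""
    k = secrets.randbelow(N_ORDER - 1) + 1
    return mul(k, G), add(pm, mul(k, pub_b))


def decrypt(pc, priv_b):
    """PM = (PM + k*PB) - nB*(kG)."""
    kg, masked = pc
    return add(masked, neg(mul(priv_b, kg)))


if __name__ == "__main__":
    n_b, p_b = keygen()          # Bob's key pair
    pm = (10, 9)                 # message point on the curve
    pc = encrypt(pm, p_b)        # Alice encrypts for Bob
    assert decrypt(pc, n_b) == pm
```

Decryption works because nB(kG) and k(nB G) are the same point, so subtracting one cancels the mask added with the other.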

Security of ECC

The cryptographic strength of elliptic curve encryption lies in the difficulty for a cryptanalyst of determining the secret random number k from kP and P itself. The fastest method to solve this problem (known as the elliptic curve logarithm problem) is the Pollard ρ method [Sta99].

The computational complexity of breaking the elliptic curve cryptosystem using the Pollard ρ method is 3.8 × 10^10 MIPS-years (i.e. millions of instructions per second times the required number of years) for an elliptic curve key size of only 150 bits [Sta99]. For comparison, the fastest method to break RSA, using the General Number Field Sieve to factor the composite integer n into the two primes p and q, requires 2 × 10^8 MIPS-years for a 768-bit RSA key and 3 × 10^11 MIPS-years for an RSA key of length 1024.

If the RSA key length is increased to 2048 bits, the General Number Field Sieve will need 3 × 10^20 MIPS-years to factor n, whereas increasing the elliptic curve key length to only 234 bits will impose a computational complexity of 1.6 × 10^28 MIPS-years (still with the Pollard ρ method).

CONCLUSION

This method simulates some critical processes of the central dogma, so it is a pseudo-DNA cryptography method. Theoretical analysis and experiments show the method to be efficient in computation, storage and transmission, and it is very powerful against certain attacks. Thus, the method has many uses in cryptography, such as enhancing the security and speed of other cryptography methods. There are also extensions and variations of this method that offer enhanced security, effectiveness and applicability.

References

[1] L.M. Adleman, Molecular computation of solutions to combinatorial problems. Science, (1994), 266(4), pp. 1021-1025.
[2] R.J. Lipton, Using DNA to solve NP-complete problems. Science, (1995), 268(4), pp. 542-545.
[3] R.S. Braich, N. Chelyapov, and C. Johnson, Solution of a 20-Variable 3-SAT Problem on a DNA Computer. Science, (2002), 296, pp. 499-502.
[4] D. Boneh, C. Dunworth, and R. Lipton, Breaking DES using a molecular computer. In Proceedings of DIMACS workshop on DNA computing, (1995), pp. 37-65.
[5] L.M. Adleman, P. Rothemund, and S. Roweis, On applying molecular computation to the Data Encryption Standard. In Proceedings of 2nd DIMACS Workshop on DNA Based Computers, (1996), pp. 28-48.
[6] C.T. Clelland, V. Risca, and C. Bancroft, Hiding messages in DNA microdots. Nature, (1999), 399, pp. 533-534.
[7] J.P.L. Cox, Long-term data storage in DNA. Trends Biotechnol., (2001), 19, pp. 247-250.
[8] G.Z. Cui, Y.L. Liu, and X.C. Zhang, New Direction of Data Storage: DNA Molecular Storage Technology. Computer Engineering and Applications, (2006), 42(26), pp. 29-32.
[9] J. Chen, A DNA-based, biomolecular cryptography design. Circuits and Systems, ISCAS '03, (2003), pp. 822-825.
[10] M. Amos, G. Paun, and G. Rozenberg, Topics in the theory of DNA computing. Theoretical Computer Science, (2002), 287, pp. 3-38.