Homework Assignment for Phys 498 Biological Information and Complexity - Submitted by Anoush Aghajani-Talesh
Computing with DNA
In the following I will discuss the paper "Breaking DES Using a Molecular Computer" by Dan Boneh,
Christopher Dunworth and Richard Lipton [2], which describes an algorithm using DNA and
standardized biotechnological procedures for breaking the Data Encryption Standard (DES). This
paper describes a thought experiment rather than the design of an actual DNA computer. To my
knowledge, no successful experiments based on the described DNA computer have been carried out so far.
The DES algorithm encrypts a 64-bit block of data (plain text) into a 64-bit block (encoded message)
using a 56-bit key. To understand the ideas presented in the paper of Boneh et al., a
detailed description of DES is not required. All we have to know is that DES repeatedly uses rather
simple binary operations to mix the bits of the plain text, much like shuffling cards in a game. Given the
key for an encrypted text, it is possible to reverse the encryption and to obtain the plain text from the
encrypted message. Therefore the key must be kept secret, and it is the objective of a code-breaker to find the key.
Given a pair consisting of a 64-bit plain text block and its encryption, it is possible to find the key by a
brute force method. One can calculate, for every possible key (there are 2^56), the encrypted message from
the given plain text and compare it with the known encrypted message. At the time the discussed
paper was published, a brute force attack on DES required very expensive computers and was
presumably only carried out by secret services. Notice that the 56-bit DES was invented in the 1970s
and is no longer state of the art. DES was replaced by the 112-bit 3DES, which is going to be followed
by the Advanced Encryption Standard (AES). The molecular computer described by Boneh et al.
breaks DES by brute force, but with the amazing difference that it computes and compares all possible
keys and encryptions in parallel at the same time.
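The brute force method described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: toy_encrypt is a made-up 16-bit cipher (it is not DES), and the names brute_force, secret_key, plaintext and ciphertext are hypothetical. The point is only the loop over all keys and the comparison with the known encrypted message.

```python
# Toy illustration only: a made-up 16-bit "cipher", NOT DES.
def toy_encrypt(plaintext, key):
    """Hypothetical 16-bit block cipher used only for illustration."""
    state = plaintext & 0xFFFF
    for round_no in range(4):
        state ^= key                                       # mix in the key
        state = ((state << 5) | (state >> 11)) & 0xFFFF    # rotate left by 5 bits
        state = (state + round_no * 0x9E37) & 0xFFFF       # add a round constant
    return state


def brute_force(plaintext, ciphertext, key_bits=16):
    """Try every key; return all keys that map plaintext to ciphertext."""
    return [k for k in range(2 ** key_bits)
            if toy_encrypt(plaintext, k) == ciphertext]


secret_key = 0xBEEF          # pretend this is unknown to the attacker
plaintext = 0x1234           # known plain text block
ciphertext = toy_encrypt(plaintext, secret_key)

candidates = brute_force(plaintext, ciphertext)
print(secret_key in candidates, len(candidates))
```

With a 16-bit key the loop finishes almost instantly; the same loop over 2^56 keys is what made a sequential attack on DES so expensive.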
The attack on DES can be summarized as follows:
Step 1: Let the keys be represented in a suitable way by DNA strands. Generate randomly all possible
keys. The authors estimated that all necessary 2^56 strands would fit into a 1-liter test tube.
Step 2: By applying biomolecular procedures, run the DES encryption algorithm on the DNA strands
in such a way that for each key the corresponding encrypted text is appended to the DNA strand.
Step 3: The test tube contains all pairs of keys and encrypted messages. Extract the DNA strands that
contain the sequence belonging to the known encrypted message. The desired key can be
obtained by sequencing this strand.
In order to run this algorithm it is necessary to find a representation of binary numbers as DNA strands
which allows performing biomolecular operations on them. The idea is to let bit #i of an n-bit binary
number be represented by two unique DNA strands Bi(0) and Bi(1), one for each possible value.
That means that the representation of any n-bit number requires an alphabet consisting of 2n unique
strands (one for each value 0 or 1 at each position). In addition, for technical reasons it is necessary
that each of the sequences is separated by a specific separator sequence Si. Boneh et al. proposed to use
strands with a length of 30 base pairs. A strand should be as short as possible, to avoid breakage and
to make the biological operations faster and less expensive. On the other hand, uniqueness of the
oligomers is desired: they should differ from each other even in fairly short subsequences.
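The encoding described above can be illustrated with a short Python sketch. It rests on assumptions of my own: the oligomers are drawn at random, the layout S0 B1 S1 ... Bn Sn is one possible arrangement, and the helper names (build_alphabet, encode) are hypothetical. The sketch only guarantees that whole oligomers are distinct; a real design would also have to control similarity between subsequences.

```python
import random

BASES = "ACGT"
OLIGO_LENGTH = 30  # strand length mentioned in the text


def build_alphabet(n_bits, seed=0):
    """Return (bit_oligos, separators).

    bit_oligos[i][v] plays the role of Bi(v); separators[i] plays the role of Si.
    Sequences are random and only checked for being pairwise distinct.
    """
    rng = random.Random(seed)
    used = set()

    def fresh():
        # Draw random 30-mers until one is found that was not used before.
        while True:
            oligo = "".join(rng.choice(BASES) for _ in range(OLIGO_LENGTH))
            if oligo not in used:
                used.add(oligo)
                return oligo

    separators = [fresh() for _ in range(n_bits + 1)]            # S0 .. Sn
    bit_oligos = {i: (fresh(), fresh()) for i in range(1, n_bits + 1)}
    return bit_oligos, separators


def encode(bits, bit_oligos, separators):
    """Encode bits b1..bn as the strand S0 B1(b1) S1 B2(b2) ... Bn(bn) Sn."""
    strand = separators[0]
    for i, b in enumerate(bits, start=1):
        strand += bit_oligos[i][b] + separators[i]
    return strand


bit_oligos, separators = build_alphabet(4)
strand = encode([1, 0, 1, 1], bit_oligos, separators)
print(len(strand))  # 4 bit oligomers + 5 separators, 30 bases each
```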
It remains to clarify how binary operations can be executed on DNA strands. Boneh et al. proposed to
use the following methods:
Extraction: With a method called the streptavidin bead separation technique it is possible to extract from
the test tube all DNA strands that contain a short specific nucleotide sequence. This is done by using
many copies of the complementary sequence, which are bound to the surface of tiny magnetic beads. In
the test tube, DNA containing this specific sequence will anneal to its complementary sequence. It
can then be extracted with a magnetic field.
Amplification via PCR: The polymerase chain reaction allows making duplicates of DNA in the test tube.
It requires the subsequences at the beginning and at the end of the sequence to be replicated (usually
20 bp long), which are called primers. Copies of the primers anneal to the DNA strand. The enzyme
polymerase then rebuilds the complementary part of the sequence between the two primers. In a process
called melting, the original DNA and the complementary copy of the sequence between the primers can be
separated. Both strands can then be copied again and used for further replication.
Tagging: This appends a new short sequence to the end of every DNA strand in a tube. It is done by
annealing a short strand to a longer strand so that the short strand extends off the end of the longer
strand. The complement of the extending part can then be added to the longer strand using polymerase. By
melting, the short strand and the now extended strand can be separated.
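To make the three operations concrete, here is a minimal and highly idealized Python model. It assumes that a test tube can be represented as a list of strings, that every operation works without errors, and that PCR exactly doubles the contents per cycle. The function names extract, amplify and tag are my own and do not come from the paper.

```python
def extract(tube, pattern):
    """Idealized extraction: split the tube into the strands that contain
    `pattern` and the strands that do not."""
    hits = [s for s in tube if pattern in s]
    rest = [s for s in tube if pattern not in s]
    return hits, rest


def amplify(tube, cycles=1):
    """Idealized PCR: every cycle doubles the number of copies of each strand."""
    return tube * (2 ** cycles)


def tag(tube, suffix):
    """Idealized tagging: append `suffix` to the end of every strand."""
    return [s + suffix for s in tube]


tube = ["AACGT", "GGCCA", "TTACG"]
hits, rest = extract(tube, "ACG")     # strands containing ACG, and the others
tagged = tag(hits, "TT")              # append TT to the extracted strands
copies = amplify(rest, cycles=3)      # 2^3 = 8 copies of the remaining strand
print(hits, rest, tagged, len(copies))
```

In the laboratory each of these calls corresponds to a procedure that takes hours and can fail for individual strands, which the model deliberately ignores.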
Given these operations one can describe breaking DES in terms of manipulations on DNA:
Step 1: (Generating keys)
Using PCR it is possible to obtain numerous copies of the oligomers Bi(0), Bi(1) and Si and their
complementary sequences. Starting with S0 and appending randomly B1(0) or B1(1), followed by S1,
etc., one can successively generate all possible keys, provided that one uses a sufficient number of
oligomers. Notice that getting all the oligomers appended in the right order requires several
substeps that I left out for brevity.
Step 2: (Encryption of the plain text)
For the execution of the DES algorithm it is necessary to be able to perform an exclusive-OR
operation (XOR) and a table lookup in which a 6-bit value is mapped to a 4-bit value. The authors
claimed that it is possible to run DES encryption with only these two operations. DES consists of 16
levels or rounds. Each round maps a 64-bit value to a 64-bit value by evaluating 48 XORs, 8 table lookups
and another 32 XORs. Evaluating the XOR of bit #i and bit #j means appending the value of bit #i XOR
bit #j to each strand. First one extracts all strands that contain the sequence Bi(1). Out of these one
extracts those strands with Bj(0). Analogously one extracts all strands with Bi(0) and Bj(1). Then, using
the tagging technique, one appends an Sx and a Bx(1) oligomer to these strands. To all other strands one
appends Sx and Bx(0). Here x denotes the position of the appended bit. The 48 bits appended are then
grouped into 8 groups of 6 bits each and used for the 8 table lookups. A table lookup is a function that
maps a 6-bit value to a 4-bit value, which means that it appends to each 6-bit sequence a corresponding
4-bit sequence.
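The XOR procedure described above can be sketched in Python using only extraction and tagging. The sketch makes simplifying assumptions: strands are tuples of symbolic tokens such as "B1(0)" instead of real nucleotide sequences, separators are omitted for brevity, and all operations are error-free. The helper names make_tube, extract and xor_append are hypothetical.

```python
from itertools import product


def make_tube(n_bits):
    """All 2^n strands B1(b1) B2(b2) ... Bn(bn) as tuples of symbolic tokens."""
    return [tuple(f"B{i}({b})" for i, b in enumerate(bits, start=1))
            for bits in product((0, 1), repeat=n_bits)]


def extract(tube, token):
    """Split the tube into strands that carry `token` and those that do not."""
    hits = [s for s in tube if token in s]
    rest = [s for s in tube if token not in s]
    return hits, rest


def xor_append(tube, i, j, x):
    """Append Bx(bit i XOR bit j) to every strand using extract and tag only."""
    has_i1, has_i0 = extract(tube, f"B{i}(1)")
    i1_j0, i1_j1 = extract(has_i1, f"B{j}(0)")   # bit i = 1, split by bit j
    i0_j1, i0_j0 = extract(has_i0, f"B{j}(1)")   # bit i = 0, split by bit j
    ones = i1_j0 + i0_j1                         # exactly one of the bits is 1
    zeros = i1_j1 + i0_j0                        # both bits are equal
    return ([s + (f"B{x}(1)",) for s in ones] +
            [s + (f"B{x}(0)",) for s in zeros])


tube = make_tube(3)
out = xor_append(tube, 1, 2, 4)   # new bit 4 = bit 1 XOR bit 2
for strand in out:
    print(" ".join(strand))
```

Note that the number of extractions does not depend on how many strands are in the tube, which is where the parallelism of the method comes from.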
To simplify the table lookup, the authors claimed that it is possible to append the bits in
such a way that for the table lookup only 6 consecutive bits have to be evaluated. Thus by using 64
different magnetic beads it is possible to separate the strands by sequences of 6 consecutive bits. The
corresponding 4-bit value is then appended by using the tagging technique. This step has to be repeated
8 times. After the table lookup another 32 XOR operations are executed. At the end of each round the
strands contain redundant bits that have to be removed using restriction enzymes. Having completed
all 16 DES rounds, the DNA strands in the test tube hopefully consist of the 56-bit key and the
encrypted message.
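A single table lookup of this kind can be sketched as follows. The sketch is only an illustration under my own assumptions: strands are tuples of symbolic bit tokens without separators, toy_sbox is a made-up 6-bit to 4-bit table (it is not one of the real DES tables), and the function names are hypothetical. For each of the 64 possible 6-bit patterns, the matching strands are extracted and tagged with the corresponding 4 output bits.

```python
from itertools import product


def toy_sbox(value6):
    """Hypothetical 6-bit -> 4-bit table; NOT one of the real DES tables."""
    return (value6 * 7 + 3) % 16


def contains_run(strand, run):
    """True if the tokens in `run` occur consecutively in `strand`."""
    n = len(run)
    return any(strand[k:k + n] == run for k in range(len(strand) - n + 1))


def table_lookup(tube, first_pos, out_pos):
    """For each of the 64 patterns at bit positions first_pos..first_pos+5,
    extract the matching strands and tag them with 4 output bits at
    positions out_pos..out_pos+3."""
    result = []
    for value in range(64):
        in_bits = [(value >> (5 - k)) & 1 for k in range(6)]
        run = tuple(f"B{first_pos + k}({b})" for k, b in enumerate(in_bits))
        hits = [s for s in tube if contains_run(s, run)]        # one "bead" type
        out = toy_sbox(value)
        suffix = tuple(f"B{out_pos + k}({(out >> (3 - k)) & 1})" for k in range(4))
        result += [s + suffix for s in hits]                    # tagging
    return result


# A small tube holding every possible value of bits 1..6.
tube = [tuple(f"B{i}({b})" for i, b in enumerate(bits, start=1))
        for bits in product((0, 1), repeat=6)]
result = table_lookup(tube, first_pos=1, out_pos=7)
print(len(result))
```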
Step 3: (Isolation of the key)
By extraction, the DNA strand carrying the known encryption and its corresponding key can be isolated
from the test tube. By amplifying and sequencing this strand, the key can be read.
Boneh et al. roughly estimated that four months of lab time are required to break DES by this method.
However, the solution created in this way, which matches the keys to the encrypted texts, can be reused to
find other keys in an estimated one day, which is a remarkably short time. Another amazing fact is
that DNA apparently allows storing the complete table consisting of about 10^17 entries (10^6 terabytes
of data) in a volume of 1 liter. In addition, all steps of the algorithm are performed massively in parallel
(2^56 operations at the same time), which might in certain applications outweigh the long time (several
hours, compared to conventional computers) needed to perform a single step.
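The orders of magnitude quoted above can be checked with a few lines of Python. The calculation assumes that one table entry consists of a 56-bit key and a 64-bit encrypted block, i.e. 15 bytes; this layout is my assumption and is not taken from the paper.

```python
KEY_BITS = 56
BLOCK_BITS = 64

num_keys = 2 ** KEY_BITS                           # number of table entries
bytes_per_entry = (KEY_BITS + BLOCK_BITS) // 8     # assumed: key + encrypted block
total_bytes = num_keys * bytes_per_entry
total_terabytes = total_bytes / 10 ** 12           # 1 TB taken as 10^12 bytes

print(f"{num_keys:.2e} entries, {total_terabytes:.2e} TB")
```

Under this assumption the table has roughly 7.2 x 10^16 entries and occupies about 10^6 terabytes, in line with the figures given above.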
The discussed paper was supposed to demonstrate the possibilities of DNA computation, but it also
demonstrated its limitations. The admittedly astonishing construction of a code-breaking algorithm was
in many ways only possible due to some special properties of DES. Firstly, DES can be attacked by a
brute force method; other encryption algorithms cannot be broken in such a simple way. Secondly, DES
has a comparatively short key length of 56 bits. 2^56 DNA strands in a test tube appear to be an upper
limit (breaking 112-bit 3DES would require a 10^17-liter test tube). Finally, it was of great importance
that DES uses few and rather simple operations. Even a slightly more demanding algorithm might require
a far more complex implementation.
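The volume quoted for 3DES follows from simple scaling, as the short Python check below shows. It assumes that the density of 2^56 strands per liter stays the same and that one strand per key is enough; both are simplifications.

```python
strands_per_liter = 2 ** 56     # density estimated for the DES attack
keys_3des = 2 ** 112            # number of 112-bit keys

liters_needed = keys_3des // strands_per_liter   # equals 2^56
print(f"{liters_needed:.1e} liters")
```

The result, 2^56 liters or about 7 x 10^16 liters, is of the order of the 10^17 liters mentioned above.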
One has to notice that everything above was said under the assumption that no errors occur during the
whole procedure. In fact the technical difficulties are immense and far from being under control. A
detailed discussion of the technical problems and of the control of errors in molecular computing
can be found in [3] and [1].
References:
[1] Adleman, L. M., Rothemund, P. W. K., Roweis, S., and Winfree, E. (1996). On applying molecular
methods to the Data Encryption Standard. Proceedings of the Second Annual Meeting on DNA Based
Computers, held at Princeton University, June 10-12, 1996.
[2] Boneh, D., Dunworth, C., and Lipton, R. J. (1995). Breaking DES using a molecular computer.
Technical Report CS-TR-489-95, Princeton University, Princeton, N.J.
[3] Maley, C. C. (1998). DNA computation: Theory, practice, and prospects. Evolutionary Computation
6(3), 201-229.