DNA AS A NANO COMPUTATIONAL TOOL

ABSTRACT
Computational machines are predominantly silicon based, and silicon chips have
become indispensable for computation involving vast and complex data. Our work explores
the idea of using DNA, the genetic material of all living forms, as a nano-computational
tool to solve complex computational problems. The massive parallelism inherent in DNA
interactions motivates its use as a computational tool for parallel processing. The
challenge we took up is to find the shortest route between two cities such that the
other three cities are each visited exactly once (a typical Hamiltonian path problem).
While supercomputers are expected to take a great deal of time to arrive at the answer
to problems of this kind, we discuss the ability of DNA to provide the answer almost
instantaneously.
This paper examines the efficacy of DNA as a nano-computational tool. We take five
cities: Chennai, Pondicherry, Tiruchirapalli, Ooty and Tuticorin. The problem is to
travel from Chennai to Tuticorin, passing through each of the other cities exactly once,
by the shortest pathway. We exploit the inherent properties of DNA and standard
molecular biology techniques to arrive at the answer. For convenience we name the
cities A, B, C, D and E respectively. The entire process is designed to be carried out
in the wet lab (a molecular biology laboratory with its standard instruments). Nine
roads link the cities, connecting every pair except the start and end cities A and E. We
represent these nine roads by unique DNA sequences (synthesized using a DNA synthesizer).
The novelty is that the length of each DNA segment is directly proportional to the
length of the road it represents.
The idea is to generate all possible pathways connecting the cities and then, using PCR,
select the pathways that start at the correct start city (A) and end at the correct end
city (E).
The next step is to select the pathways that pass through every city at least once. This
is done by the affinity separation technique.
The length of each pathway, made up of the roads taken, is directly proportional to the
length of the DNA sequence that encodes it. Obtaining the shortest route therefore
amounts to obtaining the shortest DNA sequence, which is possible with gel
electrophoresis, a technique that separates DNA molecules by length.
By isolating that DNA and sequencing it, we learn the sequence that encodes the
shortest correct pathway. Thus the answer to a supposedly complex problem is arrived at
in a test tube.
*Corresponding author: achuthz@yahoo.com
INTRODUCTION
In this era, computational processes come to the rescue of biological
conundrums, which is the underlying premise of bioinformatics; this paper aspires to
explore the reverse. Its prime contention is that DNA, the genetic material
of all living organisms, can be exploited as a nano-computational tool. The efficacy of
DNA in solving computational problems is illustrated in this paper by taking a
problem of the Hamiltonian type. We have five cities, namely Chennai, Pondicherry,
Tiruchirapalli, Ooty and Tuticorin. We have to travel from Chennai to Tuticorin,
passing through each of the other cities exactly once, by the shortest pathway. Since the
process is carried out entirely in the wet lab (a molecular biology laboratory with
its standard instruments), it is necessary to know some basic facts about
DNA and some terminology and processes of molecular biology.
DNA (DEOXYRIBONUCLEIC ACID)
The biomolecule that carries all the information about a living organism is a
polynucleotide. A nucleotide is a chemical compound consisting of a sugar, a phosphate
group and a nitrogenous base. The DNA molecule is mostly double stranded and exists in a
double helical structure. The weak forces acting between the nitrogenous bases of the
two strands contribute to the stability of the double helical structure.
COMPLEMENTARITY AND BASE PAIRING
The four nitrogenous bases are Adenine (A), Thymine (T), Guanine (G) and
Cytosine (C). The sequence of a DNA molecule is determined by the sequence of the
nitrogenous bases in the polynucleotide chain.
Adenine pairs only with Thymine, and Guanine pairs only with
Cytosine. This is called the Watson and Crick base pairing rule. Thus if we know one
strand of DNA we can predict the other strand.
For example, if one strand of DNA has the sequence
AGTCAT
the other must have the sequence
TCAGTA
i.e.
A G T C A T
| | | | | |
T C A G T A
because A always pairs with T, and G with C.
This is called the complementarity of DNA. One strand is always the complement of
the other strand. Thus if two complementary single-stranded DNA molecules come
together, they bind to form a double-stranded helical molecule.
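The pairing rule can be expressed as a short sketch. It follows the convention used in this paper of writing both strands left to right, so strand orientation is ignored; the names PAIRS and complement are illustrative only.

```python
# Watson-Crick pairing applied position by position.
# Assumption: both strands are written left to right, as in the text,
# so no reversal of the strand is performed.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the complementary strand of an upper-case DNA string."""
    return "".join(PAIRS[base] for base in strand)


print(complement("AGTCAT"))  # TCAGTA
```

Applying the function twice returns the original strand, which reflects the fact that each strand is the complement of the other.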
POLYMERASE
It is an enzyme that is vital for DNA replication. It attaches to single-stranded
DNA and slides along it, creating a complementary DNA strand and thus producing
double-stranded DNA.
LIGASE
It is an enzyme that joins DNA molecules end to end when they come into
close proximity in a linear fashion.
PCR (POLYMERASE CHAIN REACTION)
PCR allows one to produce many copies of a specific sequence of DNA. It is an
iterative process that cycles through a series of copying events using the enzyme
polymerase. Polymerase copies a section of single-stranded DNA starting at the
position of a 'primer', a short piece of DNA complementary to one end of the section of
DNA that we are interested in. By selecting primers that flank the section of DNA to be
amplified, the polymerase preferentially amplifies the DNA between the primers,
doubling the amount of DNA containing this segment in each cycle. After many iterations
of PCR, the DNA is amplified exponentially.
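The exponential growth described above can be illustrated with a small sketch. It assumes ideal doubling in every cycle, which real reactions only approximate; the function name copies_after is hypothetical.

```python
def copies_after(cycles: int, initial: int = 1) -> int:
    """Idealised PCR: each cycle doubles the number of target molecules."""
    return initial * 2 ** cycles


# Under ideal doubling, about 20 cycles already give more than a million copies.
for n in (1, 10, 20, 30):
    print(n, copies_after(n))
```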
DNA SYNTHESIZER
It is a device that synthesizes short DNA molecules chemically.
AFFINITY PURIFICATION
DNA containing a specific sequence can be purified from a sample of
mixed DNA by a technique called affinity separation. This is accomplished by attaching
the complement of the sequence in question to a substrate such as a magnetic bead (about
1 micron in size). The beads are then mixed with the DNA. DNA that contains the
sequence of interest hybridizes with the complementary sequence on the beads. These beads
can then be retrieved and the DNA isolated.
GEL ELECTROPHORESIS
It is a common procedure used to resolve DNA by size. The basic
principle behind gel electrophoresis is to force the DNA through a gel matrix using an
electric field. The gel is made up of a polymer that forms a meshwork. DNA, being
negatively charged, moves towards the anode and is forced to thread its way
through the tiny spaces in the meshwork. Shorter molecules thread through more easily
and therefore move faster than longer ones.
THE PROBLEM:
Consider the five cities Chennai (A), Pondicherry (B), Tiruchirapalli (C), Ooty (D)
and Tuticorin (E). We have to travel from Chennai (A) to Tuticorin (E), passing
through each of the other cities exactly once. The distances between the cities are given below.
ROAD        DISTANCE (km)    ROUNDED DISTANCE (km)
1. A – B    168              160
2. B – C    196              200
3. C – D    301              300
4. D – E    661              660
5. B – D    418              420
6. C – E    263              260
7. A – C    316              320
8. B – E    459              460
9. A – D    543              540
Here there are nine roads connecting the cities; every pair of cities except A and E is
linked directly. We have to start at A, end at E and pass through each of the other
cities exactly once.
For convenience, we round off the distances to the nearest multiple of 20.
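The rounding can be reproduced with a few lines of code. This is only a sketch: the helper round_to_multiple and the dictionary names are illustrative, and the input values are the road distances listed above.

```python
def round_to_multiple(value: int, step: int = 20) -> int:
    """Round value to the nearest multiple of step (ties round up)."""
    return ((value + step // 2) // step) * step


# Road distances in km as listed in the table.
distances_km = {"AB": 168, "BC": 196, "CD": 301, "DE": 661, "BD": 418,
                "CE": 263, "AC": 316, "BE": 459, "AD": 543}

rounded_km = {road: round_to_multiple(d) for road, d in distances_km.items()}
print(rounded_km)
```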
With these roads, only 6 correct pathways can exist. Using the rounded distances, they are
1. AB-BC-CD-DE (1320 km)
2. AB-BD-DC-CE (1140 km) <the shortest correct pathway>
3. AC-CD-DB-BE (1500 km)
4. AC-CB-BD-DE (1600 km)
5. AD-DC-CB-BE (1500 km)
6. AD-DB-BC-CE (1420 km)
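As a cross-check on the list of pathways, the conventional (silicon) way of enumerating them can be sketched as follows. The code is an illustration only, with hypothetical function names; it uses the rounded distances and treats every road as usable in both directions.

```python
from itertools import permutations

# Rounded road lengths in km.
ROUNDED_KM = {"AB": 160, "BC": 200, "CD": 300, "DE": 660, "BD": 420,
              "CE": 260, "AC": 320, "BE": 460, "AD": 540}


def road_length(x: str, y: str):
    """Length of the road between cities x and y, or None if there is no road."""
    return ROUNDED_KM.get(x + y, ROUNDED_KM.get(y + x))


def hamiltonian_paths(start="A", end="E", others="BCD"):
    """Yield (city order, total km) for each path visiting every city exactly once."""
    for middle in permutations(others):
        order = (start, *middle, end)
        legs = [road_length(a, b) for a, b in zip(order, order[1:])]
        if None not in legs:  # skip orders that would need a missing road
            yield "".join(order), sum(legs)


paths = sorted(hamiltonian_paths(), key=lambda p: p[1])
for order, km in paths:
    print(order, km)
```

The first entry of the sorted list is the order A-B-D-C-E with 1140 km, in agreement with the pathway marked as shortest.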
Solving the problem with DNA
THE PROCEDURE:
The problem can be solved using DNA in the wet lab in the following manner.
1. We represent each road by a DNA sequence whose length is directly
proportional to the length of the road.
ROAD    DISTANCE (km)    DNA SEQUENCE
AB      160              ACCG CATG
BC      200              TCGG TA CTGC
CD      300              TTGG ACGCATG TGAT
DE      660              ATAT CATG GATAT CGTC GAGG GAAT CAGT TAGG
BD      420              GAAA GCTAA TAAT GGGA TCAT
CE      260              ACGC TTGGA TAGG
AC      320              ACCGATAG CCGT AGGT
BE      460              CCGT GCAA GGA GCTA GAAA TAGG
AD      540              ACCG TTAC GCAA TGCA GCGG GGA GGGC
We represent every 20 km by one nucleotide; hence road AB, which is 160 km
long, is represented by the 8-nucleotide sequence ACCG CATG. Note that for all roads
that start from city A the first 4 nucleotides of the sequence are the same (ACCG), and
for all roads that end at city E the last 4 nucleotides are the same (TAGG).
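The encoding rule can be checked mechanically. The sketch below copies the nine road sequences from the table with the spacing removed; the dictionary layout and variable names are illustrative assumptions.

```python
KM_PER_NUCLEOTIDE = 20

# road: (rounded km, DNA sequence from the table, spaces removed)
ROADS = {
    "AB": (160, "ACCGCATG"),
    "BC": (200, "TCGGTACTGC"),
    "CD": (300, "TTGGACGCATGTGAT"),
    "DE": (660, "ATATCATGGATATCGTCGAGGGAATCAGTTAGG"),
    "BD": (420, "GAAAGCTAATAATGGGATCAT"),
    "CE": (260, "ACGCTTGGATAGG"),
    "AC": (320, "ACCGATAGCCGTAGGT"),
    "BE": (460, "CCGTGCAAGGAGCTAGAAATAGG"),
    "AD": (540, "ACCGTTACGCAATGCAGCGGGGAGGGC"),
}

# Every sequence must have one nucleotide per 20 km of road.
for road, (km, seq) in ROADS.items():
    assert len(seq) * KM_PER_NUCLEOTIDE == km, road

# Roads leaving A share their first four nucleotides,
# and roads entering E share their last four.
starts_at_a = {seq[:4] for road, (_, seq) in ROADS.items() if road[0] == "A"}
ends_at_e = {seq[-4:] for road, (_, seq) in ROADS.items() if road[1] == "E"}
print(starts_at_a, ends_at_e)  # {'ACCG'} {'TAGG'}
```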
The complement of the road AB is obtained as follows.
Sequence of the road:      ACCG CATG
Sequence of complement:    TGGC GTAC
Next we assign a complementary adhering sequence to every pair of consecutive
roads. This DNA sequence is not assigned randomly but is constructed in
the following way.
For the road pair BC – CD, we take the last four nucleotides of road BC (CTGC), concatenate
them with the first four nucleotides of road CD (TTGG), and take the complement of the result.
Hence the complementary adhering sequence for the road pair BC – CD will be
GACG AACC
ACCGT ACGC ATTA CCGCTAT CTGC TTGG ACGCA TTGC.
                        GACG AACC
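The construction of a complementary adhering sequence can be sketched as below. The function name adhering_sequence and the overlap parameter are illustrative; the two road sequences are those of BC and CD from the table, and the complement is taken position by position as in the rest of this paper.

```python
# Translation table for position-by-position complementation (A<->T, G<->C).
PAIRS = str.maketrans("ATGC", "TACG")


def adhering_sequence(first_road: str, second_road: str, overlap: int = 4) -> str:
    """Complement of the end of the first road joined to the start of the second."""
    junction = first_road[-overlap:] + second_road[:overlap]
    return junction.translate(PAIRS)


bc = "TCGGTACTGC"       # road BC
cd = "TTGGACGCATGTGAT"  # road CD
print(adhering_sequence(bc, cd))  # GACGAACC
```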
All the nucleotide sequences are obtained with the help of a DNA synthesizer, a device
that chemically synthesizes short strands of DNA from a given nucleotide sequence.
2. Put all the DNA sequences that correspond to the roads and the complementary
adhering sequences for the road pairs into a test tube, along with DNA ligase and
buffer salts. A massively parallel reaction takes place, and the solution to our
problem is present in the test tube in less than a second. Reading out the answer,
however, takes a little more time.
As complementary bases bond with each other, whenever two roads
meet the complementary adhering sequence of their road pair, they bind to it. For
example, roads BC and CD both bind to the adhering sequence GACG AACC.
As the two roads are bound to the complementary adhering sequence of the
road pair, they come extremely close and are separated only by a nick. The DNA ligase
then forms the bond between the nucleotides on either side of the nick. Hence
we get a single linear strand.
3. The test tube now contains the DNA sequence that corresponds to the shortest
path, all the other correct pathways and copious junk sequences. We now have to remove
the junk sequences. We transfer the contents of the test tube to a PCR machine. A
PCR machine produces multiple copies of only those DNA sequences whose
beginning and ending nucleotides are complementary to the forward primer and the
reverse primer of the PCR.
As mentioned earlier, the first 4 nucleotides are the same for all roads that
start from A, and the last 4 nucleotides are the same for all roads that end at E. So the
forward primer is the complement of the start sequence of A and the reverse primer is the
complement of the end sequence of E. As a
result, the pathways that begin at the correct start city A and end at the
correct end city E are amplified several million fold.
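The effect of this PCR step can be sketched in code. Here amplification is modelled, as a simplifying assumption, by a filter that keeps only strands with the shared start sequence of A and the shared end sequence of E; the small pool of strands and the name pcr_select are made up for illustration.

```python
START_A = "ACCG"  # shared first four nucleotides of roads leaving A
END_E = "TAGG"    # shared last four nucleotides of roads entering E


def pcr_select(pool):
    """Keep only strands that begin at city A and end at city E."""
    return [s for s in pool if s.startswith(START_A) and s.endswith(END_E)]


# Road sequences taken from the table (spaces removed).
AB, BC, BD = "ACCGCATG", "TCGGTACTGC", "GAAAGCTAATAATGGGATCAT"
CD, CE, AC = "TTGGACGCATGTGAT", "ACGCTTGGATAGG", "ACCGATAGCCGTAGGT"

pool = [
    AB + BD + CD + CE,  # A-B-D-C-E: starts at A and ends at E
    AC + CE,            # A-C-E: starts at A and ends at E, but skips B and D
    AB + BC,            # A-B-C: does not end at E
    BC + CD,            # B-C-D: neither starts at A nor ends at E
]

selected = pcr_select(pool)
print(len(selected))  # 2
```

Note that the strand for A-C-E survives this step even though it skips two cities, which is why the affinity separation step is still needed.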
The correct pathways, which start at A, end at E and pass through all the
other cities exactly once, are

ROADS          ORDER OF CONNECTING THE CITIES
AB-BC-CD-DE    A--B--C--D--E
AB-BD-DC-CE    A--B--D--C--E
AC-CD-DB-BE    A--C--D--B--E
AC-CB-BD-DE    A--C--B--D--E
AD-DC-CB-BE    A--D--C--B--E
AD-DB-BC-CE    A--D--B--C--E
The strands that encode the correct pathways listed above can be separated from the
rest using affinity separation.
THE MAIN IDEA IN EFFECTING THE AFFINITY SEPARATION TECHNIQUE
IS THAT WE OBTAIN PATHWAYS IN WHICH ALL THE CITIES ARE VISITED
AT LEAST ONCE.
This process is explained in detail with the diagram shown below.
SOLVING THE PROBLEM WITH DNA
(AFFINITY SEPARATION TECHNIQUE)
From Chennai (A), we can go to Pondicherry (B), Tiruchirapalli (C) or Ooty (D)
through roads AB, AC and AD, which we shall call the first, second and third
arm respectively. From B, we can go to C or D via roads BC and BD respectively. If
the roads taken are AB and BC, then D can be reached by taking the road CD and E
can be reached by road DE. Now we have a sequence in the test tube that is one of
the 6 correct pathways. If in the FIRST ARM the roads taken are AB and BD, then C
can be reached by taking the road DC and E can be reached by road CE. Again we have
a sequence in the test tube that is one of the 6 correct pathways.
Similarly, the SECOND ARM and the THIRD ARM will also yield 2 correct
pathways each. Hence the three arms together give the 6 correct pathways, out of
which one is the shortest.
To effect the affinity separation process, we use probes that are complementary
to the roads in the pathways. The probes are attached to minute iron beads so that when
the probes bind to their complementary sequences, they can be separated from the
rest by applying a magnetic field. After retrieving a probe bound to its
complementary sequence, the probe DNA molecule can be separated from its
complement by mild heating.
The probes are used in the order in which the roads appear in the pathway, so that
each pathway is fished out progressively, road by road. At the
end of the first arm, the discarded solution is transferred back to the mother test tube,
and then the SECOND ARM is carried out. The discarded solution of the second arm
is transferred to the mother test tube and then the THIRD ARM is effected. The arms
are thus effected sequentially, the first arm first and the third arm last. The discarded
solution from the THIRD ARM contains the sequences of the pathways that do not pass
through all the cities. The contents of the end test tubes of each arm, i.e. the
ones that contain the pathways in which all cities are visited at least once, are
transferred to a new test tube.
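The probe-by-probe selection can be sketched in code under some simplifying assumptions: a probe captures a strand whenever the strand contains the road sequence, each correct pathway is treated as one pass of four probes rather than as a branch of an arm, and roads DC, DB and CB are represented by the CD, BD and BC strands. All function names and the junk strands in the pool are illustrative.

```python
ROADS = {
    "AB": "ACCGCATG", "BC": "TCGGTACTGC", "CD": "TTGGACGCATGTGAT",
    "DE": "ATATCATGGATATCGTCGAGGGAATCAGTTAGG", "BD": "GAAAGCTAATAATGGGATCAT",
    "CE": "ACGCTTGGATAGG", "AC": "ACCGATAGCCGTAGGT",
    "BE": "CCGTGCAAGGAGCTAGAAATAGG", "AD": "ACCGTTACGCAATGCAGCGGGGAGGGC",
}

# Probe order for each of the six correct pathways.
PATHWAYS = [
    ["AB", "BC", "CD", "DE"], ["AB", "BD", "CD", "CE"],
    ["AC", "CD", "BD", "BE"], ["AC", "BC", "BD", "DE"],
    ["AD", "CD", "BC", "BE"], ["AD", "BD", "BC", "CE"],
]


def strand(roads):
    """Concatenate road sequences into a single ligated strand."""
    return "".join(ROADS[r] for r in roads)


def capture(pool, road):
    """Strands retained by the probe for one road (those containing its sequence)."""
    return [s for s in pool if ROADS[road] in s]


def run_arms(pool):
    """Fish out each pathway road by road; uncaptured strands return to the pool."""
    kept = []
    for pathway in PATHWAYS:
        captured = pool
        for road in pathway:
            captured = capture(captured, road)
        kept.extend(captured)
        pool = [s for s in pool if s not in captured]
    return kept


correct = [strand(p) for p in PATHWAYS]
junk = [strand(["AC", "CE"]), strand(["AB", "BE"]), strand(["AD", "DE"])]
kept = run_arms(correct + junk)
print(len(kept))  # 6
```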
So now in the test tube we have the pathways that start at Chennai (A),
end at Tuticorin (E) and visit all other cities at least once. As mentioned earlier, the
road length is directly proportional to the nucleotide sequence length. So by separating
the nucleotide sequences by length, we can easily get the answer.
This is done with the help of gel electrophoresis.
Here the shortest DNA sequence moves the fastest and is easily separated from the
rest. That DNA is isolated and sequenced. The sequence will be the one
corresponding to the shortest pathway.
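The expected band order on the gel follows directly from the pathway lengths. The sketch below converts each pathway length to a strand length at 20 km per nucleotide and sorts the strands from fastest to slowest; the variable names are illustrative.

```python
KM_PER_NUCLEOTIDE = 20

# Total rounded length of each correct pathway in km.
PATHWAY_KM = {
    "AB-BC-CD-DE": 1320,
    "AB-BD-DC-CE": 1140,
    "AC-CD-DB-BE": 1500,
    "AC-CB-BD-DE": 1600,
    "AD-DC-CB-BE": 1500,
    "AD-DB-BC-CE": 1420,
}

# Strand length in nucleotides for each pathway.
strand_length = {p: km // KM_PER_NUCLEOTIDE for p, km in PATHWAY_KM.items()}

# Shorter strands migrate faster, so sorting by length gives the band
# order from the fastest-moving band to the slowest.
bands = sorted(strand_length.items(), key=lambda item: item[1])
fastest_pathway, fastest_length = bands[0]
print(fastest_pathway, fastest_length)  # AB-BD-DC-CE 57
```

Two pathways have the same length of 75 nucleotides and would appear in the same band, but this does not affect the identification of the shortest one.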
The sequence obtained will be
ACCGCATGGAAAGCTAATAATGGGATCATTTGGACGCATGTGATACGCTTGGATAGG
corresponding to the pathway AB-BD-DC-CE, passing through the cities
A-B-D-C-E, i.e. Chennai - Pondicherry - Ooty - Tiruchirapalli - Tuticorin.
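Reading the pathway back from the sequenced strand can be sketched as a simple decoding step. The function decode is hypothetical; it relies on the fact that, in the table of road sequences, no road sequence is a prefix of another, so matching from the left is unambiguous. The road DC appears as CD because the same strand represents both directions.

```python
ROADS = {
    "AB": "ACCGCATG", "BC": "TCGGTACTGC", "CD": "TTGGACGCATGTGAT",
    "DE": "ATATCATGGATATCGTCGAGGGAATCAGTTAGG", "BD": "GAAAGCTAATAATGGGATCAT",
    "CE": "ACGCTTGGATAGG", "AC": "ACCGATAGCCGTAGGT",
    "BE": "CCGTGCAAGGAGCTAGAAATAGG", "AD": "ACCGTTACGCAATGCAGCGGGGAGGGC",
}


def decode(strand: str):
    """Split a strand into road names by matching road sequences from the left."""
    roads = []
    while strand:
        match = next((name for name, seq in ROADS.items()
                      if strand.startswith(seq)), None)
        if match is None:
            raise ValueError("strand does not decompose into known roads")
        roads.append(match)
        strand = strand[len(ROADS[match]):]
    return roads


shortest = "ACCGCATGGAAAGCTAATAATGGGATCATTTGGACGCATGTGATACGCTTGGATAGG"
print(decode(shortest))    # ['AB', 'BD', 'CD', 'CE']
print(len(shortest) * 20)  # 1140
```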
Thus, this demonstrates step by step that wet lab analysis (or
rather computation) with DNA can indeed be used to solve problems with greater
efficiency and speed than the “silicon chips” in computers.
The advantages of DNA computation
• DNA provides dense information storage. For example, one gram of DNA, when
dry, occupies a volume of 1 cubic centimeter but can store information
equivalent to one trillion CDs.
• The DNA-DNA interactions provide enormous parallelism. The reaction that
generates the answer to the Hamiltonian path problem here completes in less than
a second, a degree of parallelism that even the most powerful supercomputers of
today cannot match.
• DNA computation has extraordinary energy efficiency. One joule is sufficient for
approximately 2 × 10^19 ligation operations, which is within a factor of about 17
of the theoretical maximum of 34 × 10^19 operations per joule dictated
by the second law of thermodynamics. This is the fascinating power of DNA
computation, considering that even the fastest supercomputers of
this era perform only about 10^9 operations per joule.
Conclusion
The effectiveness of DNA as a nano-computational tool
is remarkable, largely due to its massive parallelism,
complementarity and information storage capacity. DNA computers have tremendous
potential to compete with electronic computers, which boast superior speeds in
computation. This paper has shown that the answer to challenging computational
problems can lie coiled in something as unexpected as a DNA molecule. A new face in
the field of computation is introduced, and the possibility of using DNA as a
nano-computational tool is highlighted. We have described how even a molecular biology
laboratory can be made to perform computational operations just like the dry lab or the
computer lab, broadening the horizon of computational sciences. This paper can be
viewed as a manifestation of an emerging new area of science made possible by our
rapidly developing ability to control the molecular world.
References
• Leonard M. Adleman, "Computing with DNA", Scientific American, August 1998.
• David K. Gifford, "On the Path to Computation with DNA", Science, November 1994.
• Leonard M. Adleman, "Molecular Computation of Solutions to Combinatorial Problems",
Science, November 1994.
• "A Road Guide to Tamil Nadu", TTK Discover India Series, 2002.