Biomolecular Computation – A Survey Report

C. Arunkumar, H. S. Negi, S. Garg, V. Rastogi, W. Zhi Guo

1. Introduction

Biomolecular computing is a fast-growing research area concerned with the use of (bio)molecules and biochemical processes for the purpose of computing. Although it is centered on computer science, it is highly interdisciplinary, with researchers from computer science, mathematics, molecular biology, crystallography, biochemistry, physics and other fields participating in it.

Biomolecular computing has the potential to resolve two well-recognized obstacles of silicon-based computer technology: miniaturization and massive parallelism. Molecular computing 'descends' to the nano-scale, which addresses the miniaturization problem. Parallelism follows from the sheer number of molecules involved: a single drop of solution can contain trillions of DNA molecules, and an operation performed on a tube of DNA is performed on every molecule in the tube at once.

One may say that the main thrust of current research in molecular computing is the assessment of its full potential. The results obtained to date are cautiously optimistic. In particular, the conceptual understanding and experimental testing of basic principles achieved so far is already quite impressive.

Biomolecular computations can be carried out using biotechnological techniques such as cellular processing, computing with RNA, and computing with DNA, the last more popularly known as DNA computing. Effective use of any of these techniques depends on the construction of a powerful interface to the molecular world. The first two techniques are not well developed and are not discussed further. A great deal of work is under way in DNA computing, however, and some promising results have been obtained.

2. DNA Computing

All computers in existence today make use of binary code: 1's and 0's, or on's and off's in the circuits of a computer chip, form the basis of every calculation a computer performs, from simple addition to the solution of the most complex differential equations. The DNA molecule is also a code, but one made up of a sequence of four bases that pair up in a predictable manner, and Adleman [1] saw the possibility of using it as a molecular computer. Rather than relying on the position of electronic switches on a microchip, Adleman relied on the reactions of DNA nucleotides binding with their complements, which take place in enormous numbers simultaneously: a brute-force method, but one that did work. This was the start of DNA computing. Since then researchers have developed several different models to solve other mathematical and computational problems using molecular techniques.

DNA computing (the area of molecular computing that uses DNA molecules) offers a number of features in addition to miniaturization and massive parallelism that make it an attractive alternative (or supplementary) technology to modern silicon computing. These features include very impressive energy efficiency and information density.

The power of molecular computers is worth comparing with that of today's silicon-based systems. In speed, DNA clearly wins the race, performing about 1,000 times as many operations per second as the fastest supercomputer (which executes about 10^12 operations per second). To put this in perspective, a typical desktop computer, at a mere 10^6 operations per second, is 10^9 times slower than DNA. In energy efficiency, a biological system such as a cell can perform 2 x 10^19 operations using one joule of energy (the energy a 100-watt light bulb consumes in one hundredth of a second), while a supercomputer manages only 10^10 operations per joule, making it roughly 2 x 10^9 times less energy efficient.
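The speed and energy comparisons above are simple ratios, and a few lines of arithmetic make them easy to check. The sketch below reads the quoted figures as powers of ten (for example, 10^12 operations per second for the supercomputer); the constant and function names are illustrative assumptions, not part of any cited work.

```python
# Order-of-magnitude figures as quoted in the text (assumed to be powers of ten).
SUPERCOMPUTER_OPS_PER_SEC = 10**12
DESKTOP_OPS_PER_SEC = 10**6
DNA_SPEEDUP_OVER_SUPERCOMPUTER = 1_000

CELL_OPS_PER_JOULE = 2 * 10**19
SUPERCOMPUTER_OPS_PER_JOULE = 10**10


def dna_ops_per_sec() -> int:
    """Operations per second implied for DNA by the quoted 1,000x speedup."""
    return DNA_SPEEDUP_OVER_SUPERCOMPUTER * SUPERCOMPUTER_OPS_PER_SEC


def speed_ratio_vs_desktop() -> int:
    """How many times faster DNA is than the quoted desktop figure."""
    return dna_ops_per_sec() // DESKTOP_OPS_PER_SEC


def energy_ratio_vs_supercomputer() -> int:
    """How many times more operations per joule the cell performs."""
    return CELL_OPS_PER_JOULE // SUPERCOMPUTER_OPS_PER_JOULE


if __name__ == "__main__":
    print("DNA ops/sec:", dna_ops_per_sec())
    print("DNA vs desktop:", speed_ratio_vs_desktop())
    print("cell vs supercomputer (ops/J):", energy_ratio_vs_supercomputer())
```

With these inputs the speed ratio against the desktop comes out at 10^9, and the energy ratio at 2 x 10^9.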
Just as the cell approaches the limit set by the second law of thermodynamics, which predicts that one joule can fuel a maximum of 34 x 10^19 irreversible operations, the energy a DNA computer consumes in DNA strand synthesis and PCR should also be small compared with that used by a supercomputer.

The potential for information storage in molecular computers follows the same trend as speed and efficiency. While today's storage media, such as videotapes, store information at a density of one bit per 10^12 cubic nanometers, DNA dwarfs this figure with an information storage density of 1 bit per cubic nanometer: a trillion times less space.

A present-day molecular or DNA computer would have a hard time multiplying two 100-digit integers, an easy task for any of today's electronic computers, yet its potential for solving complex search problems is unparalleled. As work continues in this area, molecular computers may impress us once again and challenge the dominance of electronic systems in solving even more types of problems. After all, the DNA-based system of computation has had millions of years to evolve and perfect itself, while man-made systems have existed for only a small fraction of this span. It is an impressive computer indeed that can spend eons producing new and varied organisms through trial and error until it finally finds a solution: the intelligent species we call human beings.

Although the massive parallelism of DNA in solution is impressive (more than 10^20 bytes of active memory per liter) and the energy consumption is very low, the ultimate attraction of DNA computers is their potential to design new hardware solutions to problems. Unlike conventional computers, DNA computers can construct new hardware during operation.
Thus, the closest point of contact with electronic computing is hardware design, in particular reconfigurable hardware design, rather than conventional parallel algorithms or languages. Molecular computers can be constructed reversibly in flow systems, where an exchange of DNA populations is possible. Rapid hardware redesign opens the door to evolving computer systems, so configurable DNA computing also aims at harnessing evolution for design and problem solving. Because of the huge information storage potential of aqueous solutions containing DNA, comparatively low flow rates suffice for massively parallel processing, so that synthetic DNA can be treated as an affordable, easily degradable resource.

3. Potential Applications of DNA computing

With the power of DNA computing, or biomolecular computation, we can consider tackling many problems that are impractical to solve with conventional methods.

One class of such problems is the NP problems, such as the SAT problem and the Hamiltonian graph problem. Several papers have proposed biomolecular computation techniques for solving them.

Cryptanalysis is another problem that requires a large amount of computation. Breaking a system such as DES is computationally very costly on modern silicon computers; it has been proposed that DNA computing could make breaking such a system far easier.

In molecular biology, there is a class of problems, including sequencing, fingerprinting, and mutation detection, that are harder to handle by biomolecular computation, because the computation has to process natural DNA rather than artificially synthesized DNA. Problems of this sort are called DNA2DNA computations.

DNA computing has the potential to provide huge memories. Each individual strand of DNA can encode binary information. A small volume can contain a vast number of molecules.
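To make the idea of a strand carrying binary information concrete, here is a minimal sketch that packs two bits into each base. The particular bit-to-base mapping (A=00, C=01, G=10, T=11) and the function names are assumptions chosen for illustration; practical encodings add constraints and error correction that are omitted here. The complement table reflects standard Watson-Crick pairing (A with T, C with G).

```python
# Hypothetical 2-bits-per-base code; the mapping is an illustrative assumption.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def encode(data: bytes) -> str:
    """Turn bytes into a base sequence, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Invert encode(): read two bits per base and regroup into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def reverse_complement(strand: str) -> str:
    """Sequence of the strand that would hybridize with the given one."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


if __name__ == "__main__":
    strand = encode(b"DNA")
    print(strand, decode(strand), reverse_complement(strand))
```

Under this mapping a strand of n bases carries 2n bits, which is why even short strands, present in vast numbers, add up to a large memory.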
DNA in a weak solution of 1 liter of water can encode 10^7 to 10^8 terabytes, and massively parallel associative searches can be performed on these memories.

DNA computing also has the potential to supply massive computational power. The general approach is to construct parallel machines in which each processor's state is encoded by a DNA strand. Such a machine performs massively parallel computations by executing recombinant DNA operations that act on all the DNA molecules at the same time. These operations can implement massively parallel local memory reads and writes, logical operations, and further basic operations on words such as parallel arithmetic. DNA in a weak solution of 1 liter of water can encode the state of about 10^18 processors. Even though certain recombinant DNA operations can take many minutes (on the order of 1000 seconds), the overall potential of a massively parallel BMC machine is therefore about 1000 teraops (10^15 operations per second).

4. Open Problems

Some of the problems that can potentially be solved by DNA computing are:

4.1. Breaking DES using a molecular computer

DES is a widely used encryption procedure. It encrypts 64-bit messages using a 56-bit key. By breaking DES we mean that, given one (plain-text, cipher-text) pair, we can find a key that maps the plain-text to the cipher-text. A brute-force attack needs up to 2^56 steps to break DES. A non-deterministic Turing Machine can break any crypto-system, including DES, by guessing the correct key. Results showing that molecular computers can simulate such machines are very important theoretically, but they are not useful in practice: breaking DES by simulating a Turing Machine would require millions of biological operations. Conventional cryptanalytic methods such as differential and linear cryptanalysis have proven very useful against DES, but they need a very large number of (plain-text, cipher-text) pairs (about 2^43 known pairs for linear cryptanalysis), while a molecular computer requires only one such pair.

4.2. DNA Implementation of Non-Determinism

DNA recombination and separation can perform computations meaningful to human endeavors, as shown by Adleman in 1994 [1].
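Adleman's 1994 experiment [1] attacked a small directed Hamiltonian path instance by letting DNA strands assemble into candidate paths and then filtering out the ones that did not qualify. The following is a minimal silicon sketch of that generate-and-filter idea, assuming an idealized test tube that contains every walk of up to n vertices. The five-vertex graph and all function names are hypothetical illustrations and are not taken from the experiment.

```python
def all_walks(vertices, edges, max_vertices):
    """Idealized 'test tube': every walk that visits at most max_vertices vertices."""
    frontier = [(v,) for v in vertices]
    walks = list(frontier)
    for _ in range(max_vertices - 1):
        # Extend each walk by one edge, as strands would extend by ligation.
        frontier = [w + (v,) for w in frontier for (u, v) in edges if u == w[-1]]
        walks.extend(frontier)
    return walks


def hamiltonian_paths(vertices, edges, start, end):
    """Filter the pool step by step, mirroring the separation stages."""
    n = len(vertices)
    pool = all_walks(vertices, edges, n)
    # Keep walks that begin at `start` and finish at `end`.
    pool = [w for w in pool if w[0] == start and w[-1] == end]
    # Keep walks with exactly n vertices.
    pool = [w for w in pool if len(w) == n]
    # Keep walks that visit every vertex.
    pool = [w for w in pool if set(w) == set(vertices)]
    return set(pool)


if __name__ == "__main__":
    # Hypothetical example graph.
    vertices = [0, 1, 2, 3, 4]
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2), (1, 3), (2, 4), (3, 1)]
    print(hamiltonian_paths(vertices, edges, start=0, end=4))
```

On silicon the pool is built sequentially and grows exponentially with the graph size; the appeal of the molecular version is that the pool forms in a single parallel step.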
There are techniques available for the programmable, fault-tolerant implementation of nondeterministic finite-state machines that enforce the basic conditions of the subset construction which permit efficient computation. The implementation can be extended in practice to arbitrary nondeterministic Turing machines of moderate size, since these are essentially finite-state machines with a large state set.

4.3. Arithmetic and Logic Operations with DNA

Logic operations such as AND, OR and NOT, and arithmetic operations such as addition and subtraction, are necessary for DNA computing to be applicable to a wider range of problems. Unlike combinatorial search problems, which can be solved by generating all possible combinations and extracting the correct output, these operations require that specific inputs generate one unique output.

4.4. DNA2DNA Computations

Tasks such as DNA sequencing, DNA fingerprinting, DNA mutation detection and population screening cannot be solved directly by current or even future electronic machines: the inputs are physical DNA rather than digital data, so an electronic machine cannot operate on them as they stand. DNA computation could be a good alternative. The idea is to use DNA computations to operate on unknown pieces of DNA, which avoids the expensive step of sequencing the given unknown DNA strands.

4.5. Massively Parallel DNA computation

Many approaches to DNA computation are impractical for large problems because they require processing vast quantities of DNA in steps that are prone to large error propagation. However, an approach that produces the solution and reads out the answer using reliable, automated PCR steps could solve large problems by processing 10^15 or more distinct strands of DNA in parallel.

4.6. Dynamic Programming Algorithms on a DNA computer

DNA computers are especially useful for running algorithms that are based on dynamic programming. This class of algorithms takes advantage of the large memory capacity of a DNA computer.
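As a reminder of why dynamic programming is memory-hungry, here is a minimal conventional sketch of a subset-sum style table of the kind used for knapsack-like problems. It is not a DNA protocol; the function names and sample weights are illustrative assumptions. The table of reachable sums is the part that, on a DNA computer, would presumably be held as a population of strands.

```python
def reachable_sums(weights, capacity):
    """table[s] is True when some subset of `weights` sums exactly to s.

    Assumes positive integer weights and a non-negative integer capacity.
    """
    table = [True] + [False] * capacity
    for w in weights:
        # Go downwards so that each weight is used at most once.
        for s in range(capacity, w - 1, -1):
            if table[s - w]:
                table[s] = True
    return table


def best_fill(weights, capacity):
    """Largest total weight not exceeding `capacity` (a simple knapsack variant)."""
    table = reachable_sums(weights, capacity)
    return max(s for s, ok in enumerate(table) if ok)


if __name__ == "__main__":
    print(best_fill([3, 5, 9], capacity=13))
```

The table has capacity + 1 entries and each item triggers a sweep over it, so storage rather than cleverness is the limiting resource, which is the property the text attributes to a DNA computer's large memory.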
Unlike other algorithms for DNA computers, which are brute force, dynamic programming is the same algorithm one would use to solve smaller problems on a conventional computer. For example, the graph reachability problem and the knapsack problem can be solved efficiently on a DNA computer.

5. Conclusion

The first DNA computers are unlikely to feature word processing, e-mailing and solitaire programs. Instead, their computing power is more likely to be used by national governments for cracking secret codes, or by airlines wanting to map more efficient routes. Studying DNA computers may also lead us to a better understanding of a more complex computer, the human brain.

DNA computer components such as logic gates and biochips will take years to develop into a practical, workable DNA computer. If such a computer is ever built, scientists say that it will be more compact, accurate and efficient than conventional computers.

DNA computing is a nascent technology and has a long way to go before it becomes feasible. The most important step towards feasibility is the construction of a powerful interface to the molecular world. In our work we would like to explore the field and try to contribute a further step towards making the DNA computer feasible.

References

1. Adleman L., “Molecular Computation of Solutions to Combinatorial Problems”, Science, 266, November 1994, pp 1021-1024.
2. Fredman Y., “Simple Guide to DNA Based Computers”, http://dna2z.com/dnacpu/dne.html
3. Wisz M., “DNA Computing”, http://www.englib.cornell.edu/scitech/w96/DNA.html
4. Berger B., Singh M., “Introduction to Computational Molecular Biology”, http://theory.lcs.mit.edu/~mona/18.417-home.html
5. Reif J. H., “Paradigms for Biomolecular Computation”, Unconventional Models of Computation, 1998, pp 72-93.
6. Leete T. H., Schwartz M. D., Williams R. M., Wood D. H., “Massively parallel DNA computation: Expansion of symbolic determinants”, DNA based computers III: DIMACS Workshop, June 23-25, 1997
7. Adleman L. M., Rothemund P. W. K., Roweis S., Winfree E., “On applying molecular computation to the data encryption standard”, DNA based computers III: DIMACS Workshop, June 23-25, 1997
8. Baum E. B., Boneh D., “Running dynamic programming algorithms on a DNA computer”, DNA based computers III: DIMACS Workshop, June 23-25, 1997
9. Guarnieri F., Orlian M., Bancroft C., “Parallel operations in DNA-based computation”, DNA based computers III: DIMACS Workshop, June 23-25, 1997
10. Gao Y., Garzon M., Murphy R. C., Rose J. A., Deaton R., Stevens S. E., “DNA implementation of non-determinism”, DNA based computers II: DIMACS Workshop, June 10-12, 1996
11. Landweber L. F., Lipton R. J., Rabin M. O., “DNA2DNA computations: A potential ‘Killer App’?”, DNA based computers II: DIMACS Workshop, June 10-12, 1996
12. Gupta V., Parthasarathy S., Zaki J., “Arithmetic and logic operations with DNA”, Proceedings of a DIMACS Workshop, April 4, 1995, Princeton University
13. Boneh D., Dunworth C., Lipton R. J., “Breaking DES using a molecular computer”, Proceedings of a DIMACS Workshop, April 4, 1995, Princeton University