TABLE OF CONTENTS

CHAPTER  TITLE  PAGE

DECLARATION  ii
DEDICATION  iii
ACKNOWLEDGEMENTS  iv
ABSTRACT  v
ABSTRAK  vi
TABLE OF CONTENTS  vii
LIST OF TABLES  xi
LIST OF FIGURES  xvi
LIST OF SYMBOLS  xviii
LIST OF ABBREVIATIONS  xx
LIST OF APPENDICES  xxii

1  INTRODUCTION  1
   1.1 Deoxyribonucleic Acid (DNA)  1
   1.2 Basic Biotechnology  4
      1.2.1 Synthesizing DNA  4
      1.2.2 Hybridization and Denaturation  4
      1.2.3 Ligation  6
      1.2.4 Polymerization  6
      1.2.5 Polymerase Chain Reaction (PCR)  7
      1.2.6 Gel Electrophoresis  9
      1.2.7 DNA Extraction  10
   1.3 DNA Computing Paradigm  11
      1.3.1 Hamiltonian Path Problem  11
      1.3.2 From Turing Machine to DNA Computing  12
   1.4 Emergence of DNA Computing  14
   1.5 Reviews of Output Visualization Technologies in DNA Computing  17
      1.5.1 Polymerase Chain Reaction  17
      1.5.2 DNA Sequencing  18
      1.5.3 Biochip  19
      1.5.4 Fluorescence Detection  20
      1.5.5 Atomic Force Microscope  20
   1.6 Problem Statement  21
   1.7 Objective  22
   1.8 Scope of Work  23
   1.9 Contribution  25
   1.10 Publication List  26
   1.11 Thesis Organization  27

2  DNA COMPUTING READOUT METHOD BASED ON REAL-TIME POLYMERASE CHAIN REACTION  29
   2.1 Introduction  29
   2.2 Real-Time PCR  31
   2.3 Basic Notation  34
   2.4 Readout Approach  36
   2.5 Experiment  39
      2.5.1 Preparation of Input Molecules  39
      2.5.2 Real-Time PCR Experiments  44
   2.6 Results  49
   2.7 Discussion  53
   2.8 Chapter Summary  54

3  CLUSTERING IMPLEMENTATION ON DNA COMPUTING READOUT METHOD BASED ON LIGHTCYCLER SYSTEM  56
   3.1 Introduction  56
   3.2 Data Clustering  58
   3.3 K-Means Algorithm  60
   3.4 Fuzzy C-Means  62
   3.5 Classification of TaqMan Reactions Using FCM Clustering Algorithm  63
   3.6 Results  68
   3.7 Discussion  71
   3.8 Chapter Summary  74

4  CLUSTERING IMPLEMENTATION ON DNA COMPUTING READOUT METHOD BASED ON DNA ENGINE OPTICON 2 SYSTEM  75
   4.1 Introduction  75
   4.2 Fuzzy C-Means Implementations  77
      4.2.1 Methodology  77
      4.2.2 Results  78
      4.2.3 Discussion  82
   4.3 Alternative Fuzzy C-Means Implementation  83
      4.3.1 Methodology  83
      4.3.2 Results  85
      4.3.3 Discussion  88
   4.4 Chapter Summary  89

5  CONCLUSIONS  90
   5.1 Thesis Summary  90
   5.2 Conclusions  91
   5.3 Future Research  92

REFERENCES  93
Appendix A  103-105

LIST OF TABLES

TABLE NO.  TITLE  PAGE

2.1  11 ssDNAs used for generation of input molecules for readout of V0→V2→V4→V1→V3→V5  40
2.2  Transformation index from V0→V2→V4→V1→V3→V5 to V0→V4→V1→V2→V3→V5  41
2.3  The required 13 ssDNAs for readout of Hamiltonian Path V0→V1→V4→V2→V5→V3→V6  42
2.4  Transformation index for another seven-node HPP  43
2.5  Sequences for forward and reverse primers for V0→V2→V4→V1→V3→V5.  44
2.6  Sequences for TaqMan dual-labeled probes for V0→V2→V4→V1→V3→V5.  45
2.7  Sequences for forward and reverse primers for V0→V4→V1→V2→V3→V5.  45
2.8  Sequences for TaqMan dual-labeled probes for V0→V4→V1→V2→V3→V5.  45
2.9  Sequences for forward and reverse primers for V0→V1→V4→V2→V5→V3→V6.  47
2.10  Sequences for TaqMan dual-labeled probes for V0→V1→V4→V2→V5→V3→V6.  47
2.11  Sequences for forward and reverse primers for V0→V1→V3→V5→V4→V2→V6.  47
2.12  Sequences for TaqMan dual-labeled probes for V0→V1→V3→V5→V4→V2→V6.  48
2.13  Sequences for forward and reverse primers for V0→V1→V5→V3→V4→V2→V6  48
2.14  Sequences for TaqMan dual-labeled probes for V0→V1→V5→V3→V4→V2→V6  48
2.15  Summary of the results obtained from both LightCycler and DNA Engine Opticon 2 System.  52
2.16  Comparison of two different outputs by using standard and modified in silico algorithm  53
3.1  Partition matrix values for each real-time PCR reaction calculated based on FCM clustering algorithm for test data.  65
3.2  Partition matrix values for each real-time PCR reaction calculated based on FCM clustering algorithm for test data.  66
3.3  Partition matrix values for each TaqMan reaction based on K-means clustering algorithm for data1  69
3.4  Partition matrix values for each TaqMan reaction based on K-means clustering algorithm for data2.  69
3.5  Partition matrix values for each TaqMan reaction based on FCM clustering algorithm for data1  70
3.6  Partition matrix values for each TaqMan reaction based on FCM clustering algorithm for data2.  71
3.7  Comparison of K-means and FCM for data1 (100 iteration runs).  73
3.8  Comparison of K-means and FCM for data2 (100 iteration runs).  73
4.1  Partition matrix value for each TaqMan reaction based on FCM clustering algorithm for data3.  79
4.2  Partition matrix value for each TaqMan reaction based on FCM clustering algorithm for data4.  80
4.3  Partition matrix value for each TaqMan reaction based on FCM clustering algorithm for data5.  81
4.4  Outliers classification in DNA Engine Opticon 2 data set.  82
4.5  Partition matrix value for each TaqMan reaction based on AFCM clustering algorithm for data3.  86
4.6  Partition matrix value for each TaqMan reaction based on AFCM clustering algorithm for data4.  87
4.7  Partition matrix value for each TaqMan reaction based on AFCM clustering algorithm for data5.  88
4.8  100 independent runs of AFCM clustering algorithm  89

LIST OF FIGURES

FIGURE NO.  TITLE  PAGE

1.1  A nucleotide  1
1.2  A single-stranded DNA  2
1.3  Double helix structure of DNA  3
1.4  Bi-molecular hybridization and denaturation of DNA  5
1.5  An example of hairpin formation of DNA  5
1.6  Ligation  6
1.7  DNA polymerization  7
1.8  Polymerase chain reaction  8
1.9  Gel electrophoresis  9
1.10  Example of a gel image  9
1.11  An example of DNA extraction by using streptavidin-coated magnetic bead.  10
1.12(a)  A directed graph for Hamiltonian path problem  11
1.12(b)  The answer of Hamiltonian path problem.  11
1.13  The overall procedure of Adleman HPP-based DNA computing.  14
1.14  Scope of work and contribution  24
1.15  The whole process of readout method based on real-time PCR  24
2.1  Overview of the research. The in vitro part is highlighted as the main work in this chapter. The improvement of in silico algorithm is depicted as a small contribution in this chapter.  30
2.2  Illustration of the structure of a TaqMan DNA probe. Here, R and Q denote the reporter and quencher fluorophores, respectively  32
2.3  Mechanism of real-time PCR based on TaqMan probe  33
2.4  An example of amplification plots corresponding to TaqMan(v0,vk,vl) = YES (first condition) and TaqMan(v0,vk,vl) = NO (second condition) implemented on LightCycler System.  35
2.5  An example of amplification plots corresponding to TaqMan(v0,vk,vl) = YES (first condition) and TaqMan(v0,vk,vl) = NO (second condition) implemented on DNA Engine Opticon 2 System.  35
2.6  Gel image for the preparation of 120-bp input molecules. Lane M denotes a 20-bp molecular marker, lane 1 is the product of initial pool generation based on parallel overlap assembly, and lane 2 is the amplified PCR product.  42
2.7  Gel image for the preparation of 140-bp input molecules.  43
2.8  Output of real-time PCR for readout of V0→V2→V4→V1→V3→V5 implemented on LightCycler System. Reactions 1 to 6 indicate the TaqMan(v0,vk,vl) reactions.  49
2.9  Output of real-time PCR for readout of V0→V4→V1→V2→V3→V5 implemented on LightCycler System. Reactions 1 to 6 indicate the TaqMan(v0,vk,vl) reactions.  50
2.10  Output of real-time PCR for readout of V0→V1→V4→V2→V5→V3→V6 implemented on DNA Engine Opticon 2 System. Reactions 1 to 10 indicate the TaqMan(v0,vk,vl) reactions.  50
2.11  Output of real-time PCR for readout of V0→V1→V3→V5→V4→V2→V6 implemented on DNA Engine Opticon 2 System. Reactions 1 to 10 indicate the TaqMan(v0,vk,vl) reactions.  51
2.12  Output of real-time PCR for readout of V0→V1→V5→V3→V4→V2→V6 implemented on DNA Engine Opticon 2 System. Reactions 1 to 10 indicate the TaqMan(v0,vk,vl) reactions.  51
3.1  Scope of work and contribution of this thesis. The implementation of clustering on LightCycler System is highlighted as the main contribution in this chapter.  57
3.2  A simple example of cluster.  59
3.3  Graphical representation of hard and soft partitioning cluster.  60
3.4  The K-means algorithm.  61
3.5  The FCM algorithm.  63
3.6  Test data obtained from LightCycler System.  64
3.7  Output of real-time PCR with y1 and y2 centers, calculated using FCM clustering algorithm.  64
3.8  Output of real-time PCR with y1 and y2 centers, calculated using FCM clustering algorithm.  66
3.9  Classification procedure of TaqMan reactions using K-means algorithm  67
3.10  Classification procedure of TaqMan reactions using FCM algorithm  68
3.11  Output of real-time PCR with “YES” and “NO” centers, implemented based on K-means clustering algorithm for data1.  68
3.12  Output of real-time PCR with “YES” and “NO” centers implemented based on K-means clustering algorithm for data2.  69
3.13  Output of real-time PCR with “YES” and “NO” centers implemented based on FCM clustering algorithm for data1 with y1(45) > y2(45).  70
3.14  Output of real-time PCR with “YES” and “NO” centers implemented based on FCM clustering algorithm for data2.  71
3.15  The comparison of convergence behaviors for the K-means and FCM clustering algorithms implemented on data1.  72
3.16  The comparison of convergence behaviors for the K-means and FCM clustering algorithms implemented on data2.  72
4.1  Scope of work and contribution of this thesis. The implementation of clustering on DNA Engine Opticon 2 System is highlighted as the main contribution in this chapter.  76
4.2  Classification of TaqMan reaction using FCM algorithm  78
4.3  Output of real-time PCR with “YES” and “NO” centers, implemented by FCM clustering algorithm for data3 with y1(40) > y2(40).  79
4.4  Output of real-time PCR with “YES” and “NO” centers, implemented by FCM clustering algorithm for data4 with y1(40) > y2(40).  80
4.5  Output of real-time PCR with “YES” and “NO” centers, implemented by FCM clustering algorithm for data5 with y1(40) > y2(40).  81
4.6  Classification of TaqMan reaction using AFCM algorithm  84
4.7  Output of real-time PCR with “YES” and “NO” centers implemented by AFCM clustering algorithm for data3 with y1(40) > y2(40).  85
4.8  Output of real-time PCR with “YES” and “NO” centers implemented by AFCM clustering algorithm for data4 with y1(40) > y2(40).  86
4.9  Output of real-time PCR with “YES” and “NO” centers implemented by AFCM clustering algorithm for data5 with y1(40) > y2(40).  87
4.10  Convergence behaviors for the AFCM clustering algorithms implemented on data3, data4, and data5.  89

LIST OF SYMBOLS

°C - degree Celsius
Ts - DNA strand
S - DNA strand
S* - DNA complement of S
F - DNA strand
G - directed graph
V - set of vertices
eij - edges
Vin - start node
Vout - end node
nm - nanometer
kg - kilogram
vi - double-stranded DNA
Vi - node
|V| - number of nodes
L - array of location of nodes
A - array of aggregation values
N - array of Hamiltonian path node
µl - microliter
v̅i - reverse primer
µM - micromolar
rpm - revolutions per minute
s - second
J - cost function
U - partition matrix
Y - set of cluster centers
X - set of data
C - number of clusters
N - number of data
m - fuzziness value index
x - data point
y - cluster center
µ - membership value
d(x,y) - distance
ε - error
t - iteration step
GHz - gigahertz
GB - gigabyte
η - scale parameter
β - positive constant

LIST OF ABBREVIATIONS

DNA - Deoxyribonucleic acid
PCR - Polymerase Chain Reaction
HPP - Hamiltonian Path Problem
A - Adenine
C - Cytosine
G - Guanine
T - Thymine
ssDNA - single-stranded DNA
dsDNA - double-stranded DNA
ATP - Adenosine-5'-triphosphate
NAD - Nicotinamide adenine dinucleotide
PO4− - phosphate
dNTP - deoxynucleotide triphosphate
NP - Nondeterministic polynomial
RNA - Ribonucleic acid
PAGE - Polyacrylamide Gel Electrophoresis
UV - ultraviolet
SAT - satisfiability problem
SA - simulated annealing
EA - Evolutionary Algorithm
ACO - Ant Colony Optimization
PSO - Particle Swarm Optimization
AFM - Atomic Force Microscope
DHP - Directed Hamiltonian Path
FCM - Fuzzy C-Means
AFCM - Alternative Fuzzy C-Means
EtBr - ethidium bromide
FAM - 6-carboxyfluorescein
TAMRA - tetramethylrhodamine
FRET - fluorescence resonance energy transfer
R - reporter dye
Q - quencher dye
Taq - Thermus aquaticus
bp - base pairs
POA - Parallel Overlap Assembly
ddH2O - double distilled water
MgCl2 - magnesium chloride
dUTP - 2'-deoxyuridine 5'-triphosphate
dTTP - deoxythymidine triphosphate
EM - Expectation Maximization
PCA - Principal Component Analysis
PCM - Possibilistic C-Means
TSP - Travelling Salesman Problem
SPP - Shortest Path Problem

LIST OF APPENDICES

APPENDIX  TITLE  PAGE

A  List of publications  103