Molecular Computational Engines of Intelligence
The Second Joint Symposium on Computational Intelligence (JSCI), Jan. 19, 2006, KAIST, Korea
Byoung-Tak Zhang
Biointelligence Laboratory, School of Computer Science and Engineering
Brain Science, Cognitive Science, and Bioinformatics Programs
Seoul National University, Seoul 151-742, Korea
btzhang@cse.snu.ac.kr, http://bi.snu.ac.kr/

Da Vinci's Dream of Flying Machines

Engines of Flight
- Piston engine
- Jet engine
- Rocket engine

Turing's Dream of Intelligent Machines
- Alan Turing (1912-1954)
- "Computing Machinery and Intelligence" (1950)

Computers and Intelligence

Humans and Computers
(Figure: the entire problem space, with the regions covered by humans and by current computers; what kind of computers could cover the rest?)

Computational Engines of Intelligence
- Symbolic: rule-based systems
- Connectionist: neural networks
- Evolutionary: genetic algorithms
- ?

Brain as a Molecular Computer
(Figure: the mind-brain-cell-molecule hierarchy of memory; 10^11 cells, 10^10 molecules.)

Molecular Mechanisms of Memory in the Brain

Two Faces of the Brain: Electrical Waves or Chemical Particles? [Zhang, 2005]
The brain as a network of neurons and synapses:
(a) the neuron-oriented cellular view ("electrical" waves)
(b) the synapse-oriented molecular view ("chemical" particles)

Principles of Information Processing in the Brain [Zhang, 2005]
- The principle of uncertainty: precision vs. prediction
- The principle of nonseparability ("UN-IBM"): processor vs. memory
- The principle of infinity: limited matter vs. unbounded memory
- The principle of "big numbers count": hyperinteraction of 10^11 neurons (or more than 10^17 molecules)
- The principle of "matter matters": the material basis of "consciousness"

Unconventional Computing
- Quantum computing: atoms; superposition and quantum entanglement
- Chemical computing: chemicals; reaction-diffusion computing
- Molecular computing: molecules; "self-organizing hardware"
Molecular Computers vs. Silicon Computers

                   Molecular computers               Silicon computers
  Processing       Ballistic                         Hardwired
  Medium           Liquid (wet) or gaseous (dry)     Solid (dry)
  Communication    3D collision                      2D switching
  Configuration    Amorphous (asynchronous)          Fixed (synchronous)
  Parallelism      Massively parallel                Sequential
  Speed            Fast (milliseconds)               Ultra-fast (nanoseconds)
  Reliability      Low                               High
  Density          Ultrahigh                         Very high
  Reproducibility  Probabilistic                     Deterministic

The Quest for the "Right" Molecules
- Protein: versatile structures, but the structure is unpredictable and chemically unstable.
- DNA: versatile sequences (synthesizable), predictable structure (can be designed), chemically stable and durable.
- RNA: shares properties of both proteins and DNA, but is difficult to handle.

DNA as "Programmable Matter"

DNA Computation of Hamiltonian Paths
[Adleman, Science 1994; Scientific American 1998]

Molecular Operators
- Variation: ligation, restriction, mutation (PCR)
- Selection: gel electrophoresis, affinity separation (beads), capillary electrophoresis
- Amplification: polymerase chain reaction (PCR), rolling circle amplification (RCA)
- Hybridization and ligation: repeated heat/cool cycles anneal and join the polymers

Why Molecular/DNA Computers?
- Massively parallel search: 6.022 x 10^23 molecules per mole; a desktop performs ~10^9 operations/sec and a supercomputer ~10^12 operations/sec, while 1 mmol of DNA supports ~10^26 reactions.
- Favorable energetics: Gibbs free energy of about 8 kcal mol^-1, roughly 1 J for 2 x 10^19 operations.
- Storage capacity: 1 bit per cubic nanometer.
- The fastest supercomputer vs. a DNA computer: 10^6 op/sec vs. 10^14 op/sec; 10^9 op/J vs. 10^19 op/J (in the ligation step); 1 bit per 10^12 nm^3 vs. 1 bit per 1 nm^3 (video tape vs. molecules).

Solving a 20-Variable 3-CNF (3-SAT) Problem
[Braich et al., Science 2002]

DNA self-assembly: [Winfree et al., Nature 1998; LaBean et al., Nature 2002]

DNA-Linked Nanoparticles
[Mirkin et al.]

Self-Assembly Computing by DNA-Linked Nanoparticles
[Park, J.-Y. et al.]

The Hypernetwork Model: A Molecular Computational Engine of Intelligence

Hypergraphs
- A hypergraph is an (undirected) graph G whose edges each connect a non-null number of vertices, i.e. G = (V, E), where V = {v1, v2, ..., vn}, E = {E1, E2, ..., En}, and Ei = {vi1, vi2, ..., vim}.
- An m-hypergraph consists of a set V of vertices and a subset E of V[m], i.e. G = (V, V[m]), where V[m] is a set of subsets of V whose elements have precisely m members.
- A hypergraph G is said to be k-uniform if every edge Ei in E has cardinality k.
- A hypergraph G is k-regular if every vertex has degree k.
- Remark: an ordinary graph is a 2-uniform hypergraph.

An Example Hypergraph
G = (V, E), V = {v1, v2, v3, ..., v7}, E = {E1, E2, E3, E4, E5}, with
E1 = {v1, v3, v4}, E2 = {v1, v4}, E3 = {v2, v3, v6}, E4 = {v3, v4, v6, v7}, E5 = {v4, v5, v7}.

Hypernetworks [Zhang, 2006, in preparation]
- A hypernetwork is a hypergraph with weighted edges. It is defined as a triple H = (V, E, W), where V = {v1, v2, ..., vn}, E = {E1, E2, ..., En}, and W = {w1, w2, ..., wn}.
- An m-hypernetwork consists of a set V of vertices and a subset E of V[m], i.e. H = (V, V[m], W), where V[m] is a set of subsets of V whose elements have precisely m members and W is the set of weights associated with the hyperedges.
- A hypernetwork H is said to be k-uniform if every edge Ei in E has cardinality k.
- A hypernetwork H is k-regular if every vertex has degree k.
- Remark: an ordinary graph is a 2-uniform hypernetwork with wi = 1.
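Read as data structures, these definitions are easy to prototype in software. The sketch below is a minimal Python illustration (the class and method names, such as Hypernetwork and add_hyperedge, are mine, not from the talk): it stores weighted hyperedges as vertex sets, accumulates weight for repeated edges, and checks the k-uniformity and degree properties on the example hypergraph above.

```python
from collections import defaultdict

class Hypernetwork:
    """A hypernetwork H = (V, E, W): weighted hyperedges over a vertex set."""

    def __init__(self, vertices):
        self.vertices = set(vertices)
        self.weights = defaultdict(float)   # frozenset of vertices -> weight

    def add_hyperedge(self, vertices, weight=1.0):
        edge = frozenset(vertices)
        assert edge and edge <= self.vertices, "hyperedge must be a non-empty subset of V"
        self.weights[edge] += weight        # repeated edges accumulate weight (copy count)

    def is_k_uniform(self, k):
        """True if every hyperedge has cardinality k."""
        return all(len(e) == k for e in self.weights)

    def degree(self, v):
        """Number of hyperedges containing vertex v."""
        return sum(1 for e in self.weights if v in e)

# The example hypergraph from the slides, with unit weights.
H = Hypernetwork(["v%d" % i for i in range(1, 8)])
for edge in [("v1", "v3", "v4"), ("v1", "v4"), ("v2", "v3", "v6"),
             ("v3", "v4", "v6", "v7"), ("v4", "v5", "v7")]:
    H.add_hyperedge(edge)
print(H.is_k_uniform(3), H.degree("v4"))    # False 4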
A Hypernetwork
(Figure: a hypernetwork over the vertices x1, x2, ..., x15, with hyperedges of various orders.)

The Hypernetwork Model of Learning [Zhang, 2006, in preparation]
The hypernetwork is defined as $H = (X, S, W)$, with variables $X = (x_1, x_2, \ldots, x_I)$, hyperedges $S = \{S_i\}$, $S_i \subseteq X$ with $k = |S_i|$, and weights $W = (W^{(2)}, W^{(3)}, \ldots, W^{(K)})$. Given a training set $D = \{x^{(n)}\}_{n=1}^{N}$, the energy of the hypernetwork is

$E(x^{(n)}; W) = -\frac{1}{2}\sum_{i_1,i_2} w^{(2)}_{i_1 i_2} x^{(n)}_{i_1} x^{(n)}_{i_2} - \frac{1}{6}\sum_{i_1,i_2,i_3} w^{(3)}_{i_1 i_2 i_3} x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} - \cdots$

and the probability distribution is

$P(x^{(n)} \mid W) = \frac{1}{Z(W)} \exp\!\left[-E(x^{(n)}; W)\right] = \frac{1}{Z(W)} \exp\!\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k}\right],$

where the partition function is

$Z(W) = \sum_{x^{(m)}} \exp\!\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(m)}_{i_1} x^{(m)}_{i_2} \cdots x^{(m)}_{i_k}\right].$

Deriving the Learning Rule
The likelihood of the training set factorizes as

$P(\{x^{(n)}\}_{n=1}^{N} \mid W) = \prod_{n=1}^{N} P(x^{(n)} \mid W),$

so that

$\ln P(\{x^{(n)}\}_{n=1}^{N} \mid W) = \sum_{n=1}^{N} \ln P(x^{(n)} \mid W^{(2)}, W^{(3)}, \ldots, W^{(K)}) = \sum_{n=1}^{N} \left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \ln Z(W)\right],$

and the weights are updated along the gradient $\partial \ln P(\{x^{(n)}\}_{n=1}^{N} \mid W) / \partial w^{(s)}_{i_1 i_2 \ldots i_s}$.

Derivation of the Learning Rule
Differentiating the log-likelihood with respect to a weight of order $s$ gives

$\frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln P(\{x^{(n)}\}_{n=1}^{N} \mid W) = N\left(\langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{\mathrm{Data}} - \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(x \mid W)}\right),$

where

$\langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{\mathrm{Data}} = \frac{1}{N} \sum_{n=1}^{N} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s}, \qquad \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(x \mid W)} = \sum_{x} x_{i_1} x_{i_2} \cdots x_{i_s} \, P(x \mid W).$
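The gradient has the familiar "data average minus model average" form, which can be checked numerically on a tiny instance. The following Python sketch (illustrative names of my own; the 1/c(k) normalizing constants are omitted, and the model average is computed by brute-force enumeration, so it only scales to a handful of variables) evaluates the gradient for one hyperedge weight.

```python
import itertools
import math

def energy(x, weights):
    """E(x; W) = -sum over hyperedges of w times the product of the participating variables."""
    return -sum(w * math.prod(x[i] for i in idx) for idx, w in weights.items())

def model_expectation(idx, weights, n):
    """<x_{i1}...x_{is}> under P(x|W) proportional to exp(-E), by summing all 2^n states."""
    states = list(itertools.product([0, 1], repeat=n))
    unnorm = [math.exp(-energy(x, weights)) for x in states]
    Z = sum(unnorm)
    return sum(u / Z * math.prod(x[i] for i in idx) for x, u in zip(states, unnorm))

def gradient(idx, data, weights, n):
    """N * (<x_{i1}...x_{is}>_Data - <x_{i1}...x_{is}>_{P(x|W)}) for the weight of hyperedge idx."""
    data_avg = sum(math.prod(x[i] for i in idx) for x in data) / len(data)
    return len(data) * (data_avg - model_expectation(idx, weights, n))

# Toy model: 3 binary variables, one order-2 and one order-3 hyperedge.
weights = {(0, 1): 0.5, (0, 1, 2): 0.2}
data = [(1, 1, 0), (1, 1, 1), (0, 1, 0), (1, 0, 0)]
print(gradient((0, 1), data, weights, n=3))
```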
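For reference, the codeword scheme of the "Encoding a Hypernetwork with DNA" slide above can be mimicked in software. This Python sketch (function names are mine) maps labeled hyperedges to DNA strings and back using the slide's 4-mer variable codes and 2-mer value codes.

```python
# Codewords from the slide: 4-mers for variables, 2-mers for the binary values.
VAR_CODE = {"x1": "AAAA", "x2": "AATT", "x3": "AAGG", "x4": "CCTT", "x5": "CCAA", "y": "ATGC"}
VAL_CODE = {0: "CC", 1: "GG"}

def encode_hyperedge(assignment):
    """Encode an ordered list of (variable, value) pairs as a DNA string."""
    return "".join(VAR_CODE[var] + VAL_CODE[val] for var, val in assignment)

def decode_hyperedge(dna):
    """Decode a DNA string back into (variable, value) pairs (6 nt per pair)."""
    rev_var = {v: k for k, v in VAR_CODE.items()}
    rev_val = {v: k for k, v in VAL_CODE.items()}
    return [(rev_var[dna[i:i+4]], rev_val[dna[i+4:i+6]]) for i in range(0, len(dna), 6)]

z1 = [("x1", 0), ("x2", 1), ("x3", 0), ("y", 1)]
assert encode_hyperedge(z1) == "AAAACCAATTGGAAGGCCATGCGG"   # z1 from the slide
print(decode_hyperedge("AATTGGCCTTGGATGCGG"))                # z3: x2=1, x4=1, y=1
```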
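The MP procedure above can also be simulated in silico. The sketch below is a minimal Python rendering under simplifying assumptions of my own (a hyperedge "matches" an example when all of its variable-value pairs agree, one copy is added or removed per update, and an example with no matching molecules gets a random guess); it illustrates the control flow of steps 2-5 rather than the wet-lab protocol.

```python
import random
from collections import Counter

def matches(hyperedge, x):
    """A hyperedge (a frozenset of (index, value) pairs) matches x if every pair agrees."""
    return all(x[i] == v for i, v in hyperedge)

def classify(library, x):
    """Step 3: majority vote of the matching molecules, weighted by copy number."""
    votes = Counter()
    for (hyperedge, label), copies in library.items():
        if matches(hyperedge, x):
            votes[label] += copies
    return votes.most_common(1)[0][0] if votes else random.choice([0, 1])

def mp_learn(library, examples, epochs=10):
    """Steps 2-5: present examples, classify, and reinforce or penalize matching molecules."""
    for _ in range(epochs):
        for x, y in examples:
            y_star = classify(library, x)
            for hyperedge, label in list(library):
                if not matches(hyperedge, x):
                    continue
                if y_star == y and label == y:
                    library[(hyperedge, label)] += 1      # amplify matched, correctly labeled copies
                elif y_star != y and label != y:
                    library[(hyperedge, label)] -= 1      # discard matched, wrongly labeled copies
                    if library[(hyperedge, label)] <= 0:
                        del library[(hyperedge, label)]
    return library

# Initial library: random order-2 hyperedges sampled from the training examples.
examples = [((1, 0, 1, 1), 1), ((0, 1, 0, 1), 0), ((1, 1, 1, 0), 1)]
library = Counter()
for x, y in examples:
    for _ in range(5):
        idx = random.sample(range(len(x)), 2)
        library[(frozenset((i, x[i]) for i in idx), y)] += 1

mp_learn(library, examples)
print(classify(library, (1, 0, 1, 0)))
```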
Step 1: Probability Distribution in the Library
The training data are $D = \{(\mathbf{x}_i, y_i)\}_{i=1}^{K}$ with $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{in}) \in \{0,1\}^n$ and $y_i \in \{0,1\}$. The library $L$ of hyperedge molecules represents the distribution

$P(X, Y) = \frac{1}{|L|} \sum_{i=1}^{|L|} f_i(X_1, X_2, \ldots, X_n, Y).$

Step 2: Presentation of an Example (or Query)
When a query $(\mathbf{x}_q, y_q)$ is presented, each library element hybridizes with a probability determined by its binding free energy:

$P(\mathbf{x}_i, y_i \mid \mathbf{x}_q, y_q) = \frac{\exp\!\left(-\Delta G(\mathbf{x}_i, y_i \mid \mathbf{x}_q, y_q)\right)}{\sum_j \exp\!\left(-\Delta G(\mathbf{x}_j, y_j \mid \mathbf{x}_q, y_q)\right)}.$

Step 3: Classify the Example (Decision Making)
$y^* = \arg\max_{Y \in \{0,1\}} P(Y \mid \mathbf{x}) = \arg\max_{Y \in \{0,1\}} P(Y, \mathbf{x}) / P(\mathbf{x}).$
With the library counts, $P(\mathbf{x}) \approx c(\mathbf{x})/|L| = |M|/|L|$ and $P(Y \mid \mathbf{x}) \approx c(Y \mid \mathbf{x})/|M| = |M_Y|/|M|$, so
$y^* = \arg\max_{Y \in \{0,1\}} c(Y \mid \mathbf{x})/|M| = \arg\max_{Y \in \{0,1\}} c(Y \mid \mathbf{x}).$

Step 4: Update the Library (Learning)
$L \leftarrow L + \{(u, v)\}$ or $L \leftarrow L - \{(u, v)\}$: matching molecules are added on a correct classification and removed on an error, shifting the library distribution toward the observed example,

$P_n(X, Y \mid \mathbf{x}, y) = (1 - \lambda)\, P_{n-1}(X, Y \mid \mathbf{x}, y) + \lambda\, P(\mathbf{x}, y \mid X, Y),$

with $P(\mathbf{x}, y)$ estimated from the molecule counts $c_n(\mathbf{x}, y)$ and $c_{n-1}(\mathbf{x}, y)$.

Nano Self-Replication

Benchmark Problem: Digit Images
- 8x8 = 64-bit binary images (made from 64x64 scanned gray images)
- Training set: 3,823 images
- Test set: 1,797 images

Pattern Classification
(Figure: the probabilistic library model as a hyperinteraction network, with an input layer x1, ..., xn, a hidden layer of hyperedges with weights w1, ..., wm, and an output layer; each weight w_i is given by the number of molecular copies of the corresponding hyperedge.)

Pattern Classification: Learning Curve
(Figure: classification rate vs. training epoch for classes 0-9 with random sampling of low-order features; curves for order-1 and order-2 hyperedges.)

Pattern Completion Task I: classes 0-9, random sampling of features of order 5.
Pattern Completion Task II: classes 0-9, subsampled features.
Pattern Completion Task III: subsampled features for two classes.

Biological Application: ALL/AML Diagnosis
- 120 samples from 60 leukemia patients [Cheok et al., Nature Genetics, 2003]
- Gene expression data; class: ALL/AML diagnosis
- Training with 6-fold validation

Simulation Results
- Fitness evolution of the population of wDNF terms.
- Fitness curves for runs with fixed-size wDNF terms (fixed order 1, 4, 7, and 10).
- Distribution of the size of wDNF terms at epochs 0, 5, and 10 (from left to right).
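The digit-image experiments above populate the library by randomly sampling low-order feature combinations from each 8x8 binary image together with its class label. A rough Python sketch of that sampling step (the array layout, function names, and the samples-per-image count are assumptions of mine) is:

```python
import random
from collections import Counter

def sample_hyperedges(image_bits, label, order=2, samples_per_image=20):
    """Randomly sample order-k pixel combinations from one 64-bit binary image,
    pairing each (pixel index, pixel value) combination with the image's class label."""
    edges = Counter()
    for _ in range(samples_per_image):
        idx = random.sample(range(len(image_bits)), order)
        edge = frozenset((i, image_bits[i]) for i in idx)
        edges[(edge, label)] += 1       # the copy count plays the role of the hyperedge weight
    return edges

def build_library(images, labels, order=2, samples_per_image=20):
    """Pool the hyperedges sampled from all training images into one library."""
    library = Counter()
    for bits, y in zip(images, labels):
        library.update(sample_hyperedges(bits, y, order, samples_per_image))
    return library

# Two toy 64-bit "images" (all zeros and all ones) standing in for the 8x8 digit bitmaps.
images = [[0] * 64, [1] * 64]
library = build_library(images, labels=[0, 1], order=2)
print(len(library), sum(library.values()))   # distinct hyperedges and total molecule copies
```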
Future Technology Enablers
(Technology roadmap figure, from now to +12 years; source: Motorola, Inc., 2000. Milestones include: metal gates, hi-k metal oxides, lo-k with Cu, SOI; full-motion mobile video/office; vertical/3D CMOS, micro-wireless nets, integrated optics; wearable communications, wireless remote medicine, "hardware over the internet"; pervasive voice recognition, "smart" transportation; smart lab-on-chip, plastic/printed ICs, self-assembly; quantum computers and molecular electronics; bio-electric computers with 10^6-10^7 times lower power for lifetime batteries; true neural computing.)

Da Vinci's Dream of Flying Machines

Horsepower per Pound for Flying
(Chart, 1850-2050: steam engines, then gas/combustion piston engines, then gas turbines and jet engines, then liquid-fuel rockets.)

Interaction Horsepower per Pound for Computing
(Chart, 1850-2050: mechanical engines, then electrical/electronic engines, then molecular engines.)

Conclusion
- Hyperinteraction is a "fundamental information processing principle" underlying brain functions.
- Molecular computing is an "unconventional computing paradigm" that, at the moment, can best realize the hyperinteractionistic principle.
- DNA molecules are among the most versatile and reliable "programmable matter" found so far for engineering molecular computers in practice.
- The hyperinteraction network is a probabilistic molecular computer that "evolutionarily organizes" its random network architecture based on observed data.
- The capability of learning molecular hypernetworks to perform "hyperinteractionistic, associative, and fault-tolerant" pattern processing seems promising for realizing large-scale computational engines of intelligence.

Acknowledgements
Collaborating Labs
- Biointelligence Laboratory, Seoul National University
- Biochemistry Lab, Seoul National Univ. Medical School
- Cell and Microbiology Lab, Seoul National University
- Advanced Proteomics Lab, Hanyang University
- DigitalGenomics, Inc.
- GenoProt, Inc.
Supported by
- National Research Lab Program of Min. of Sci. & Tech. (2002-2007)
- Next Generation Tech. Program of Min. of Ind. & Comm. (2000-2010)
More Information at
- http://bi.snu.ac.kr/MEC/
- http://cbit.snu.ac.kr/