Molecular Computational Engines of Intelligence
The Second Joint Symposium on Computational Intelligence (JSCI)
Jan. 19, 2006, KAIST, Korea
Byoung-Tak Zhang
Biointelligence Laboratory
School of Computer Science and Engineering
Brain Science, Cognitive Science, Bioinformatics Programs
Seoul National University
Seoul 151-742, Korea
btzhang@cse.snu.ac.kr
http://bi.snu.ac.kr/
Da Vinci’s Dream of Flying Machines
Engines of Flight
Piston Engine
Jet Engine
Rocket Engine
Turing’s Dream of Intelligent Machines
Alan Turing (1912-1954)
"Computing Machinery and Intelligence" (1950)
Computers and Intelligence
Humans and Computers
[Diagram: the entire problem space, with the regions covered by humans and by current computers, and the question: what kind of computers could cover the rest?]
Computational Engines of Intelligence
- Symbolic: rule-based systems
- Connectionist: neural networks
- Evolutionary: genetic algorithms
- ?
Brain as a Molecular Computer
[Diagram: memory across levels, from mind to brain (10^11 cells) to cell (10^10 molecules) to molecule]
Molecular Mechanisms of Memory in the Brain
Two Faces of the Brain: Electrical Waves or Chemical Particles?
The brain as a network of neurons and synapses:
(a) Neuron-oriented cellular view ("electrical" waves)
(b) Synapse-oriented molecular view ("chemical" particles)
[Zhang, 2005]
Principles of Information Processing in the Brain

- The Principle of Uncertainty: precision vs. prediction
- The Principle of Nonseparability ("UN-IBM"): processor vs. memory
- The Principle of Infinity: limited matter vs. unbounded memory
- The Principle of "Big Numbers Count": hyperinteraction of 10^11 neurons (or > 10^17 molecules)
- The Principle of "Matter Matters": material basis of "consciousness"
[Zhang, 2005]
Unconventional Computing
- Quantum computing: atoms; superposition, quantum entanglement
- Chemical computing: chemicals; reaction-diffusion computing
- Molecular computing: molecules; "self-organizing hardware"
Molecular Computers vs. Silicon Computers
Property          Molecular Computers              Silicon Computers
Processing        Ballistic                        Hardwired
Medium            Liquid (wet) or gaseous (dry)    Solid (dry)
Communication     3D collision                     2D switching
Configuration     Amorphous (asynchronous)         Fixed (synchronous)
Parallelism       Massively parallel               Sequential
Speed             Fast (milliseconds)              Ultra-fast (nanoseconds)
Reliability       Low                              High
Density           Ultrahigh                        Very high
Reproducibility   Probabilistic                    Deterministic
The Quest for the “Right” Molecules
- Protein: versatile structures, but unpredictable structure and chemically unstable
- DNA: versatile sequences (synthesizable), predictable structure (can be designed), chemically stable and durable
- RNA: both the properties of proteins and of DNA, but difficult to handle
- …
DNA as “Programmable Matter”
DNA Computation of Hamiltonian Paths
[Adleman, Science 1994; Scientific American 1998]
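Adleman encoded vertices and edges as oligonucleotides, generated random paths by hybridization and ligation, and filtered out non-solutions biochemically. A minimal in-silico sketch of that generate-and-test strategy (the 4-vertex graph below is illustrative, not Adleman's original 7-vertex instance):

```python
import itertools

def hamiltonian_paths(vertices, edges, start, end):
    """In-silico analogue of Adleman's protocol: generate all candidate
    orderings (random path formation), then keep those with the right
    endpoints, legal edges, and every vertex visited exactly once."""
    edge_set = set(edges)
    solutions = []
    for path in itertools.permutations(vertices):
        if path[0] != start or path[-1] != end:
            continue  # wet-lab analogue: select strands with correct ends
        if all((a, b) in edge_set for a, b in zip(path, path[1:])):
            solutions.append(path)  # keep paths that use only existing edges
    return solutions

# Illustrative directed graph on 4 vertices
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (0, 2), (2, 3), (1, 3)]
print(hamiltonian_paths(V, E, start=0, end=3))  # [(0, 1, 2, 3)]
```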
Molecular Operators
- Variation
  - Ligation
  - Restriction
  - Mutation (PCR)
- Selection
  - Gel electrophoresis
  - Affinity separation (beads)
  - Capillary electrophoresis
- Amplification
  - Polymerase chain reaction (PCR)
  - Rolling circle amplification (RCA)

[Diagram: the reaction cycle: heat, cool, hybridization, ligation, polymerization; repeat]
Why Molecular/DNA Computers?
- 6.022 × 10^23 molecules per mole
- Massively parallel search
  - Desktop: 10^9 operations/sec
  - Supercomputer: 10^12 operations/sec
  - 1 mmol of DNA: 10^26 reactions
- Favorable energetics: Gibbs free energy ΔG ≈ −8 kcal mol^-1, i.e. 1 J for 2 × 10^19 operations
- Storage capacity: 1 bit per cubic nanometer
- The fastest supercomputer vs. a DNA computer
  - 10^6 op/sec vs. 10^14 op/sec
  - 10^9 op/J vs. 10^19 op/J (in the ligation step)
  - 1 bit per 10^12 nm^3 vs. 1 bit per 1 nm^3 (video tape vs. molecules)
Solving a 20-var 3-CNF Problem
[Braich et al., Science 2002]
[Figures: Winfree et al., Nature 1998; LaBean et al., Nature 2002]
DNA-Linked Nanoparticles
[Mirkin et al.]
Self-Assembly Computing by DNA-Linked Nanoparticles

[Figure panels I and II]
[Park, J.-Y. et al.]
The Hypernetwork Model: A Molecular Computational Engine of Intelligence
Hypergraphs
- A hypergraph is an (undirected) graph G whose edges connect a non-null number of vertices, i.e. G = (V, E), where V = {v1, v2, …, vn}, E = {E1, E2, …, En}, and Ei = {vi1, vi2, …, vim}.
- An m-hypergraph consists of a set V of vertices and a subset E of V[m], i.e. G = (V, V[m]), where V[m] is a set of subsets of V whose elements have precisely m members.
- A hypergraph G is said to be k-uniform if every edge Ei in E has cardinality k.
- A hypergraph G is k-regular if every vertex has degree k.
- Remark: an ordinary graph is a 2-uniform hypergraph.
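For readers who prefer code to set notation, a minimal sketch of these definitions in Python (the class and method names are ours, not from the slides); the example hypergraph on the next slide is instantiated below it.

```python
class Hypergraph:
    """G = (V, E): each hyperedge is a non-null subset of the vertex set V."""

    def __init__(self, vertices, edges):
        self.V = set(vertices)
        self.E = [frozenset(e) for e in edges]
        assert all(e and e <= self.V for e in self.E), \
            "every hyperedge must be a non-null subset of V"

    def degree(self, v):
        """Number of hyperedges that contain vertex v."""
        return sum(v in e for e in self.E)

    def is_k_uniform(self, k):
        """Every hyperedge has cardinality k (an ordinary graph is 2-uniform)."""
        return all(len(e) == k for e in self.E)

    def is_k_regular(self, k):
        """Every vertex has degree k."""
        return all(self.degree(v) == k for v in self.V)
```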
An Example Hypergraph
G = (V, E)
V = {v1, v2, v3, …, v7}
E = {E1, E2, E3, E4, E5}

E1 = {v1, v3, v4}
E2 = {v1, v4}
E3 = {v2, v3, v6}
E4 = {v3, v4, v6, v7}
E5 = {v4, v5, v7}

[Figure: the hypergraph drawn over vertices v1-v7]
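The example above, written against the Hypergraph sketch from the previous slide:

```python
G = Hypergraph(
    vertices=["v1", "v2", "v3", "v4", "v5", "v6", "v7"],
    edges=[{"v1", "v3", "v4"},        # E1
           {"v1", "v4"},              # E2
           {"v2", "v3", "v6"},        # E3
           {"v3", "v4", "v6", "v7"},  # E4
           {"v4", "v5", "v7"}],       # E5
)
print(G.degree("v4"))      # 4: v4 lies in E1, E2, E4, E5
print(G.is_k_uniform(3))   # False: edge sizes are 3, 2, 3, 4, 3
```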
Hypernetworks
[Zhang, 2006, in preparation]

- A hypernetwork is a hypergraph of weighted edges. It is defined as a triple H = (V, E, W), where V = {v1, v2, …, vn}, E = {E1, E2, …, En}, and W = {w1, w2, …, wn}.
- An m-hypernetwork consists of a set V of vertices and a subset E of V[m], i.e. H = (V, V[m], W), where V[m] is a set of subsets of V whose elements have precisely m members and W is the set of weights associated with the hyperedges.
- A hypernetwork H is said to be k-uniform if every edge Ei in E has cardinality k.
- A hypernetwork H is k-regular if every vertex has degree k.
- Remark: an ordinary graph is a 2-uniform hypernetwork with wi = 1.
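Extending the earlier Hypergraph sketch with the weight set W (again, the naming is ours; in the molecular setting a weight is realized as the number of copies of a hyperedge's DNA strand):

```python
class Hypernetwork(Hypergraph):
    """H = (V, E, W): a hypergraph whose hyperedges carry weights w_i."""

    def __init__(self, vertices, edges, weights):
        super().__init__(vertices, edges)
        assert len(weights) == len(self.E), "one weight per hyperedge"
        self.W = list(weights)  # e.g. copy counts of the molecular strands
```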
A Hypernetwork
[Figure: a hypernetwork over 15 vertices x1, …, x15 with weighted hyperedges]
The Hypernetwork Model of Learning
The hypernetwork is defined as
$$H = (X, S, W), \qquad X = (x_1, x_2, \ldots, x_I), \qquad S = \bigcup_i S_i \ \ (S_i \subseteq X,\ |S_i| = k), \qquad W = (W^{(2)}, W^{(3)}, \ldots, W^{(K)})$$

The energy of the hypernetwork:
$$E(\mathbf{x}^{(n)}; W) = -\frac{1}{2} \sum_{i_1, i_2} w^{(2)}_{i_1 i_2} x^{(n)}_{i_1} x^{(n)}_{i_2} - \frac{1}{6} \sum_{i_1, i_2, i_3} w^{(3)}_{i_1 i_2 i_3} x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} - \ldots$$

The probability distribution over the training set $D = \{\mathbf{x}^{(n)}\}_{n=1}^{N}$:
$$P(\mathbf{x}^{(n)} \mid W) = \frac{1}{Z(W)} \exp\!\left[-E(\mathbf{x}^{(n)}; W)\right] = \frac{1}{Z(W)} \exp\!\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k}\right]$$

where the partition function is
$$Z(W) = \sum_{\mathbf{x}^{(m)}} \exp\!\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(m)}_{i_1} x^{(m)}_{i_2} \cdots x^{(m)}_{i_k}\right]$$
[Zhang, 2006, in preparation]
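A numerical sketch of these formulas for small n, truncating the expansion at K = 3 (the weight dictionaries and values are illustrative; sums run over all ordered index tuples, which is what the 1/2! and 1/3! factors assume):

```python
import itertools
import math

def energy(x, w2, w3):
    """E(x; W) = -(1/2) sum w2 x x - (1/6) sum w3 x x x, truncated at K = 3.
    Symmetric weights are stored once under sorted index keys; iterating over
    all ordered tuples then matches the slide's 1/2! and 1/3! factors."""
    n = len(x)
    e = 0.0
    for t in itertools.product(range(n), repeat=2):
        e -= 0.5 * w2.get(tuple(sorted(t)), 0.0) * x[t[0]] * x[t[1]]
    for t in itertools.product(range(n), repeat=3):
        e -= (1 / 6) * w3.get(tuple(sorted(t)), 0.0) * x[t[0]] * x[t[1]] * x[t[2]]
    return e

def probability(x, w2, w3, n):
    """P(x | W) = exp(-E(x; W)) / Z(W), with Z summed over all 2^n states."""
    z = sum(math.exp(-energy(xm, w2, w3))
            for xm in itertools.product([0, 1], repeat=n))
    return math.exp(-energy(x, w2, w3)) / z

# Illustrative weights over n = 3 binary variables
w2 = {(0, 1): 1.0, (1, 2): -0.5}
w3 = {(0, 1, 2): 2.0}
print(probability((1, 1, 0), w2, w3, n=3))
```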
Deriving the Learning Rule
$$P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W) = \prod_{n=1}^{N} P(\mathbf{x}^{(n)} \mid W)$$

$$\ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W) = \ln \prod_{n=1}^{N} P(\mathbf{x}^{(n)} \mid W^{(2)}, W^{(3)}, \ldots, W^{(K)}) = \sum_{n=1}^{N} \left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \ln Z(W)\right]$$

The learning rule follows by taking the gradient $\displaystyle \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W)$, computed on the next slide.
Derivation of the Learning Rule
$$\frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W) = \sum_{n=1}^{N} \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \ln Z(W)\right]$$

$$= \sum_{n=1}^{N} \left[x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s} - \left\langle x_{i_1} x_{i_2} \cdots x_{i_s} \right\rangle_{P(\mathbf{x} \mid W)}\right] = N \left[\left\langle x_{i_1} x_{i_2} \cdots x_{i_s} \right\rangle_{\text{Data}} - \left\langle x_{i_1} x_{i_2} \cdots x_{i_s} \right\rangle_{P(\mathbf{x} \mid W)}\right]$$

where
$$\left\langle x_{i_1} x_{i_2} \cdots x_{i_s} \right\rangle_{\text{Data}} = \frac{1}{N} \sum_{n=1}^{N} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s}, \qquad \left\langle x_{i_1} x_{i_2} \cdots x_{i_s} \right\rangle_{P(\mathbf{x} \mid W)} = \sum_{\mathbf{x}} x_{i_1} x_{i_2} \cdots x_{i_s}\, P(\mathbf{x} \mid W)$$
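The last line is the whole learning rule: the gradient for a hyperedge's weight is the difference between the data average and the model average of that hyperedge's product term. A sketch reusing energy, probability, w2, and w3 from the earlier block (the model average is exhaustive, so this is feasible only for small n):

```python
import itertools
import math

def gradient(data, idx, w2, w3):
    """dlnP/dw for hyperedge idx: <x_i1...x_is>_Data - <x_i1...x_is>_P(x|W)."""
    n = len(data[0])
    term = lambda x: math.prod(x[i] for i in idx)
    data_avg = sum(term(x) for x in data) / len(data)
    model_avg = sum(term(x) * probability(x, w2, w3, n)
                    for x in itertools.product([0, 1], repeat=n))
    return data_avg - model_avg

# Gradient for the weight on hyperedge {x0, x1} under the illustrative W
D = [(1, 1, 0), (1, 0, 1), (1, 1, 1)]
print(gradient(D, idx=(0, 1), w2=w2, w3=w3))
```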
[Example: building a 3-uniform hypernetwork library from four training examples (round 3)]

Four 15-bit training examples over x1-x15 (variables not listed are 0):
1: x1=1, x4=1, x10=1, x12=1; y=1
2: x2=1, x3=1, x9=1, x14=1; y=0
3: x3=1, x6=1, x8=1, x13=1; y=1
4: x8=1, x11=1, x15=1; y=1

3-hyperedges sampled from the examples:
- From example 1: (x1, x4, x10 | y=1), (x1, x4, x12 | y=1), (x4, x10, x12 | y=1)
- From example 2: (x2, x3, x9 | y=0), (x2, x3, x14 | y=0), (x3, x9, x14 | y=0)
- From example 3: (x3, x6, x8 | y=1), (x3, x6, x13 | y=1), (x6, x8, x13 | y=1)
- From example 4: (x8, x11, x15 | y=1)

[Figure: the sampled hyperedges drawn as a hypernetwork over the vertices x1-x15]
Self-Assembling Hypernetworks

[Figure: molecular encoding maps each labeled hyperedge (xi, xj, y) to a DNA strand; the library of such strands constitutes the hypernetwork representation, a network of hyperedges over the variables x1, …, xn, each carrying a Class label]
Encoding a Hypernetwork with DNA
a) Collection of (labeled) hyperedges:
   z1: (x1=0, x2=1, x3=0, y=1)
   z2: (x1=0, x2=0, x3=1, x4=0, x5=0, y=0)
   z3: (x2=1, x4=1, y=1)
   z4: (x2=1, x3=0, x4=1, y=0)

b) Library of DNA molecules corresponding to (a):
   z1: AAAACCAATTGGAAGGCCATGCGG
   z2: AAAACCAATTCCAAGGGGCCTTCCCCAACCATGCCC
   z3: AATTGGCCTTGGATGCGG
   z4: AATTGGAAGGCCCCTTGGATGCCC

where the codewords are x1 = AAAA, x2 = AATT, x3 = AAGG, x4 = CCTT, x5 = CCAA, y = ATGC, and the values are 0 = CC, 1 = GG.
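The codeword table fully determines the encoding, so it can be checked mechanically; a sketch with DNA as plain strings (ignoring strand chemistry and complementarity):

```python
# Codewords from the slide: a 4-nt site per variable, a 2-nt value code
SITE = {"x1": "AAAA", "x2": "AATT", "x3": "AAGG",
        "x4": "CCTT", "x5": "CCAA", "y": "ATGC"}
VALUE = {0: "CC", 1: "GG"}
ORDER = ["x1", "x2", "x3", "x4", "x5", "y"]

def encode(hyperedge):
    """Concatenate site+value codewords for the variables present in a
    labeled hyperedge, e.g. z3 = {"x2": 1, "x4": 1, "y": 1}."""
    return "".join(SITE[v] + VALUE[hyperedge[v]] for v in ORDER if v in hyperedge)

print(encode({"x2": 1, "x4": 1, "y": 1}))  # AATTGGCCTTGGATGCGG = z3 above
```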
Learning the Hypernetwork (by Evolution)
[Diagram: one round of molecular evolutionary learning]
1. Start with a library of combinatorial molecules.
2. Hybridize the library with a training example.
3. Select the library elements matching the example.
4. Amplify the matched library elements by PCR.
5. The amplified pool forms the next-generation library.

[Zhang, DNA11]
The Theory of Bayesian Evolution
- Evolution as a Bayesian inference process
- Evolutionary computation (EC) is viewed as an iterative process of generating individuals of ever higher posterior probability from the priors and the observed data.

[Diagram: from generation 0, prior P0(Ai) and posterior P0(Ai | D), to generation g, prior Pg(Ai) and posterior Pg(Ai | D)]

[Zhang, CEC-99]
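Spelled out (our notational choices, consistent with the diagram): each generation applies Bayes' rule to its current prior and promotes the posterior to the next generation's prior,

$$P_g(A_i \mid D) = \frac{P(D \mid A_i)\, P_g(A_i)}{\sum_j P(D \mid A_j)\, P_g(A_j)}, \qquad P_{g+1}(A_i) \leftarrow P_g(A_i \mid D).$$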
Animation for Molecular Evolutionary Learning

[Video: MP4.avi]
Molecular Programming (MP): The Evolutionary Learning Algorithm
1. Let the library L represent the current distribution P(X, Y).
2. Get a training example (x, y).
3. Classify x using L as follows:
   3.1 Extract all molecules matching x into M.
   3.2 From M, separate the molecules into classes: extract the molecules with label Y=0 into M0 and those with label Y=1 into M1.
   3.3 Compute y* = argmax_{Y∈{0,1}} |M_Y| / |M|.
4. Update L:
   If y* = y, then L_n ← L_{n-1} + {c(u, v)} for u = x and v = y, for (u, v) ∈ L_{n-1}.
   If y* ≠ y, then L_n ← L_{n-1} − {c(u, v)} for u = x and v ≠ y, for (u, v) ∈ L_{n-1}.
5. Go to step 2 if not terminated.
[Zhang, GECCO-2005]
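A sketch of this loop under simplifying assumptions (ours, not the slides'): a molecule is a pair (frozenset of (index, value) conditions, label), "matching x" means all conditions agree, and reward/penalty change copy counts by one.

```python
import random
from collections import Counter

def matches(conditions, x):
    """A molecule matches x if every (index, value) condition agrees."""
    return all(x[i] == v for i, v in conditions)

def mp_learn(library, examples, epochs=10):
    """Molecular programming: steps 2-5 of the algorithm above.
    library: Counter mapping (conditions, label) -> copy count."""
    for _ in range(epochs):
        for x, y in examples:
            # Step 3: majority vote of matched molecules, weighted by copies
            votes = Counter()
            for (conds, label), count in library.items():
                if matches(conds, x):
                    votes[label] += count
            y_star = votes.most_common(1)[0][0] if votes else random.choice([0, 1])
            # Step 4: reward correctly labeled matches or penalize wrong ones
            for (conds, label), count in list(library.items()):
                if not matches(conds, x):
                    continue
                if y_star == y and label == y:
                    library[(conds, label)] += 1                 # amplify
                elif y_star != y and label != y:
                    library[(conds, label)] = max(0, count - 1)  # remove a copy
    return library
```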
Step 1: Probability Distribution in the Library
D  {( x i , yi )}
K
i 1
xi  ( xi1 , xi2 ,  , xin ) {0,1}n
yi {0,1}
1
P( X , Y ) 
L
L
(n)
f
 i ( X1, X 2 ,..., X n , Y )
i 1
Step 2: Presentation of an Example (or Query)
P(xi , yi | x q , yq ) 
exp( G (xi , yi | x q , yq ))
 exp( G(x , y | x
j
i
i
q
, yq ))
Step 3: Classify the Example (Decision Making)
y*  arg max P (Y | x)
Y {0 ,1}
P (Y , x)
 arg max
Y {0 ,1} P ( x )
c(x) L  M L  P(x)
y *  arg max c(Y | x) / M
Y {0 ,1}
 arg max c(Y | x)
Y {0 ,1}
c(Y | x) M  M Y
M  P(Y | x)
 arg max P(Y | x)
Y {0 ,1}
Step 4: Update the Library (Learning)
L  L  {( u, v)}
L  L  {( u, v)}
Pn ( X , Y | x, y)  (1   ) Pn1 ( X , Y | x, y)
P(x, y | X , Y )  P(x, y )

P(x, y )
c ( x, y )

cn 1 (x, y )
Nano Self-Replication
Benchmark Problem: Digit Images
• 8x8=64 bit images (made from 64x64 scanned gray images)
• Training set: 3823 images
• Test set: 1797 images
Pattern Classification
[Figure: the hypernetwork drawn as a layered architecture: an input layer of variables x1, …, xn; a hidden layer of hyperedges (x1x2, x1x3, …, x1…xn) with weights w1, w2, …, wm, forming the probabilistic library model, a hyperinteraction network; and an output layer producing the Class label]

$$m = \sum_{k=1}^{n} \binom{n}{k}, \qquad W = \left\{\, w_i \ \middle|\ 1 \le i \le \sum_{k=1}^{n} \binom{n}{k},\ w_i = \text{number of copies} \,\right\}$$
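By the binomial theorem the hyperedge count collapses to $2^n - 1$; for the 64-bit digit images of the benchmark this is why the library samples hyperedges instead of enumerating them:

$$m = \sum_{k=1}^{n} \binom{n}{k} = 2^{n} - 1, \qquad n = 64 \;\Rightarrow\; m = 2^{64} - 1 \approx 1.8 \times 10^{19}.$$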
Pattern Classification: Learning Curve
Classes 0-9, Random Sampling of Low-Order Features
[Plot: classification rate (0-1) vs. epoch (1-97), comparing order-1 and order-2 features]
Pattern Completion Task I
Classes 0-9, Random Sampling of Features of Order 5
Pattern Completion Task II
Classes 0-9, Subsampled Features
Pattern Completion Task III
Subsampled Features for Two Classes
Biological Application
- Gene expression data: 120 samples from 60 leukemia patients
- Class: ALL/AML diagnosis
- Training with 6-fold validation

[Cheok et al., Nature Genetics, 2003]
Simulation Results
Fitness evolution of the population of wDNF terms
Simulation Results
Fitness curves for runs with fixed-size wDNF terms (fixed order 1, 4, 7, and 10)
Simulation Results
Distribution of the size of wDNF terms
From left to right, the epoch numbers are 0, 5, and 10.
Future Technology Enablers
[Roadmap: future technology enablers, from now to +12 years (Source: Motorola, Inc., 2000).
Near term: metal gates, hi-k/metal oxides, lo-k with Cu, SOI; full-motion mobile video/office.
Mid term: vertical/3D CMOS, micro-wireless nets, integrated optics; wearable communications, wireless remote medicine, "hardware over internet"; smart lab-on-chip, plastic/printed ICs, self-assembly; pervasive voice recognition, "smart" transportation.
Long term: quantum computers, molecular electronics; bio-electric computers with 10^6-10^7 x lower power for lifetime batteries; true neural computing.]
Da Vinci’s Dream of Flying Machines
Horsepower Per Pound for Flying
[Plot, 1850-2050: horsepower per pound rising from steam engines through combustion engines, gas piston engines, jet engines, and gas turbines to liquid-fuel rockets]
Interaction Horsepower Per Pound for Computing

[Plot, 1850-2050: "interaction horsepower per pound" rising from mechanical engines through electrical and electronic engines to molecular engines]
Conclusion
- Hyperinteraction is a "fundamental information processing principle" underlying brain function.
- Molecular computing is an "unconventional computing paradigm" that can, at the moment, best realize the hyperinteractionistic principle.
- DNA molecules are among the most versatile and reliable "programmable matter" found so far for engineering molecular computers in practice.
- The hyperinteraction network is a probabilistic molecular computer that "evolutionarily organizes" its random network architecture based on observed data.
- The capability of molecular hypernetworks to learn "hyperinteractionistic, associative, and fault-tolerant" pattern processing seems promising for realizing large-scale computational engines of intelligence.
Acknowledgements
Collaborating Labs
- Biointelligence Laboratory, Seoul National University
- Biochemistry Lab, Seoul National Univ. Medical School
- Cell and Microbiology Lab, Seoul National University
- Advanced Proteomics Lab, Hanyang University
- DigitalGenomics, Inc.
- GenoProt, Inc.
Supported by
- National Research Lab Program of Min. of Sci. & Tech. (2002-2007)
- Next Generation Tech. Program of Min. of Ind. & Comm. (2000-2010)
More Information at
- http://bi.snu.ac.kr/MEC/
- http://cbit.snu.ac.kr/