Lecture Note - Seoul National University: Biointelligence Lab

Learning with Hypergraphs:
Discovery of Higher-Order Interaction Patterns from
High-Dimensional Data
Moscow State University, Faculty of Computational Mathematics and
Cybernetics, Feb. 22, 2007, Moscow, Russia
Byoung-Tak Zhang
Biointelligence Laboratory
School of Computer Science and Engineering
Brain Science, Cognitive Science, Bioinformatics Programs
Seoul National University
Seoul 151-742, Korea
btzhang@cse.snu.ac.kr
http://bi.snu.ac.kr/
Probabilistic Graphical Models (PGMs)
• Represent the joint probability distribution on some random variables in graphical form.
  - Undirected PGMs
  - Directed PGMs
• Generative: The probability distribution for some variables given values of other variables can be obtained.
  - Probabilistic inference
[Figure: directed graph over the nodes A, B, C, D, E]
• C and D are independent given B.
• C asserts dependency between A and B.
• B and E are independent given C.
$$\begin{aligned}
P(A,B,C,D,E) &= P(A)\,P(B \mid A)\,P(C \mid A,B)\,P(D \mid A,B,C)\,P(E \mid A,B,C,D) \\
&= P(A)\,P(B)\,P(C \mid A,B)\,P(D \mid B)\,P(E \mid C)
\end{aligned}$$
© 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/
Kinds of Graphical Models
Graphical Models
Undirected
- Boltzmann Machines
- Markov Random Fields
Directed
- Bayesian Networks
- Latent Variable Models
- Hidden Markov Models
- Generative Topographic Mapping
- Non-negative Matrix Factorization
Bayesian Networks
• BN = (S, P) consists of a network structure S and a set of local probability distributions P:
$$p(\mathbf{x}) = \prod_{i=1}^{n} p(x_i \mid \mathbf{pa}_i)$$
<BN for detecting credit card fraud>
• Structure can be found by relying on the prior knowledge of causal relationships
From Bayes Nets to High-Order PGMs
(1) Naïve Bayes
$$P(F \mid J,G,S,A) = \frac{P(J,G,S,A \mid F)\,P(F)}{P(J,G,S,A)}$$
$$P(J,G,S,A \mid F) = P(J \mid F)\,P(G \mid F)\,P(S \mid F)\,P(A \mid F) = \prod_{x \in \{J,G,S,A\}} P(x \mid F)$$
[Figure: graph over the nodes F, J, G, S, A]
(2) Bayesian Net
$$P(F,J,G,S,A) = P(F)\,P(A)\,P(S)\,P(G \mid F)\,P(J \mid F,A,S) = \prod_{x \in \{F,J,G,S,A\}} P(x \mid pa(x))$$
[Figure: graph over the nodes F, J, G, S, A]
(3) High-Order PGM
$$P(F,J,G,S,A) = P(J,G \mid F)\,P(J,S \mid F)\,P(J,A \mid F)\,P(G,S \mid F)\,P(G,A \mid F)\,P(S,A \mid F) = \prod_{he(x,y)} P(he(x,y) \mid F)$$
where the product runs over $he(x,y) \in \{(x,y) \mid x,y \in \{J,G,S,A\} \text{ and } x \neq y\}$.
[Figure: graph over the nodes F, J, G, S, A]
The Hypernetworks
Hypergraphs
• A hypergraph is an (undirected) graph G whose edges connect a non-null number of vertices, i.e. G = (V, E), where V = {v1, v2, …, vn}, E = {E1, E2, …, En}, and Ei = {vi1, vi2, …, vim}.
• An m-hypergraph consists of a set V of vertices and a subset E of V[m], i.e. G = (V, V[m]), where V[m] is the set of subsets of V that have precisely m members.
• A hypergraph G is said to be k-uniform if every edge Ei in E has cardinality k.
• A hypergraph G is k-regular if every vertex has degree k.
• Rem.: An ordinary graph is a 2-uniform hypergraph.
An Example Hypergraph
G = (V, E)
V = {v1, v2, v3, …, v7}
E = {E1, E2, E3, E4, E5}
E1 = {v1, v3, v4}
E2 = {v1, v4}
E3 = {v2, v3, v6}
E4 = {v3, v4, v6, v7}
E5 = {v4, v5, v7}
[Figure: drawing of the hypergraph G with vertices v1 to v7 and hyperedges E1 to E5]
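The definitions of degree, k-uniform, and k-regular can be checked directly on the example hypergraph. The following is a minimal sketch; the helper names `degrees`, `is_k_uniform`, and `is_k_regular` are illustrative and do not come from the lecture.

```python
from collections import Counter

# Vertices and hyperedges of the example hypergraph G = (V, E).
V = {f"v{i}" for i in range(1, 8)}
E = {
    "E1": {"v1", "v3", "v4"},
    "E2": {"v1", "v4"},
    "E3": {"v2", "v3", "v6"},
    "E4": {"v3", "v4", "v6", "v7"},
    "E5": {"v4", "v5", "v7"},
}

def degrees(vertices, edges):
    """Degree of a vertex = number of hyperedges that contain it."""
    deg = Counter({v: 0 for v in vertices})
    for members in edges.values():
        deg.update(members)
    return dict(deg)

def is_k_uniform(edges, k):
    """True if every hyperedge has exactly k members."""
    return all(len(members) == k for members in edges.values())

def is_k_regular(vertices, edges, k):
    """True if every vertex has degree k."""
    return all(d == k for d in degrees(vertices, edges).values())

deg = degrees(V, E)
```

Because the edge cardinalities in this example are 3, 2, 3, 4, and 3, the hypergraph is neither k-uniform nor k-regular for any k.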
Hypernetworks [Zhang, DNA-2006]
• A hypernetwork is a hypergraph of weighted edges. It is defined as a triple H = (V, E, W), where V = {v1, v2, …, vn}, E = {E1, E2, …, En}, and W = {w1, w2, …, wn}.
• An m-hypernetwork consists of a set V of vertices and a subset E of V[m], i.e. H = (V, V[m], W), where V[m] is the set of subsets of V that have precisely m members and W is the set of weights associated with the hyperedges.
• A hypernetwork H is said to be k-uniform if every edge Ei in E has cardinality k.
• A hypernetwork H is k-regular if every vertex has degree k.
• Rem.: An ordinary graph is a 2-uniform hypernetwork with wi = 1.
A Hypernetwork
[Figure: a hypernetwork over the vertices x1, x2, …, x15]
Learning with Hypernetworks
The Hypernetwork Model of Learning
The hypernetwork is defined as
$$H = (X, S, W), \quad X = (x_1, x_2, \ldots, x_I), \quad S = \bigcup_i S_i,\ S_i \subseteq X,\ k = |S_i|, \quad W = (W^{(2)}, W^{(3)}, \ldots, W^{(K)})$$
Training set:
$$D = \{\mathbf{x}^{(n)}\}_{n=1}^{N}$$
The energy of the hypernetwork
$$E(\mathbf{x}^{(n)}; W) = -\frac{1}{2}\sum_{i_1,i_2} w^{(2)}_{i_1 i_2}\, x^{(n)}_{i_1} x^{(n)}_{i_2} - \frac{1}{6}\sum_{i_1,i_2,i_3} w^{(3)}_{i_1 i_2 i_3}\, x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} - \cdots$$
The probability distribution
$$\begin{aligned}
P(\mathbf{x}^{(n)} \mid W) &= \frac{1}{Z(W)} \exp\left[-E(\mathbf{x}^{(n)}; W)\right] \\
&= \frac{1}{Z(W)} \exp\left[\frac{1}{2}\sum_{i_1,i_2} w^{(2)}_{i_1 i_2}\, x^{(n)}_{i_1} x^{(n)}_{i_2} + \frac{1}{6}\sum_{i_1,i_2,i_3} w^{(3)}_{i_1 i_2 i_3}\, x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} + \cdots\right] \\
&= \frac{1}{Z(W)} \exp\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1,i_2,\ldots,i_k} w^{(k)}_{i_1 i_2 \ldots i_k}\, x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k}\right]
\end{aligned}$$
where the partition function is
$$Z(W) = \sum_{\mathbf{x}^{(m)}} \exp\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1,i_2,\ldots,i_k} w^{(k)}_{i_1 i_2 \ldots i_k}\, x^{(m)}_{i_1} x^{(m)}_{i_2} \cdots x^{(m)}_{i_k}\right]$$
[Zhang, 2006]
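The energy and probability above can be evaluated by brute force for a handful of binary variables. This is a minimal sketch under two assumptions that are not in the slides: variables take values in {0, 1}, and each unordered hyperedge carries a single weight (which absorbs the 1/c(k) factor of the symmetric sum). The weights in the example are hypothetical.

```python
import itertools
import math

def energy(x, weights):
    """E(x; W) = -(sum over hyperedges of weight * product of member variables)."""
    return -sum(w * math.prod(x[i] for i in edge) for edge, w in weights.items())

def partition_function(n_vars, weights):
    """Z(W): sum of exp(-E) over all 2^n binary states (feasible for small n only)."""
    return sum(math.exp(-energy(x, weights))
               for x in itertools.product((0, 1), repeat=n_vars))

def probability(x, weights):
    """P(x | W) = exp(-E(x; W)) / Z(W)."""
    return math.exp(-energy(x, weights)) / partition_function(len(x), weights)

# Hypothetical weights: one order-2 and one order-3 hyperedge over x0, x1, x2.
weights = {(0, 1): 1.0, (0, 1, 2): 0.5}
p_all_on = probability((1, 1, 1), weights)
p_all_off = probability((0, 0, 0), weights)
```

States that activate heavily weighted hyperedges have lower energy and therefore higher probability. The exhaustive sum in `partition_function` grows as 2^n, which is why it is only usable for toy sizes.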
Deriving the Learning Rule
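$$P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W) = \prod_{n=1}^{N} P(\mathbf{x}^{(n)} \mid W)$$
$$\begin{aligned}
\ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W) &= \ln \prod_{n=1}^{N} P(\mathbf{x}^{(n)} \mid W^{(2)}, W^{(3)}, \ldots, W^{(K)}) \\
&= \sum_{n=1}^{N} \left\{ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1,i_2,\ldots,i_k} w^{(k)}_{i_1 i_2 \ldots i_k}\, x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \ln Z(W) \right\}
\end{aligned}$$
The learning rule is obtained from the gradient of the log-likelihood with respect to each weight:
$$\frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W)$$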
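Derivation of the Learning Rule
$$\begin{aligned}
\frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W)
&= \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \sum_{n=1}^{N} \left\{ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1,i_2,\ldots,i_k} w^{(k)}_{i_1 i_2 \ldots i_k}\, x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \ln Z(W) \right\} \\
&= \sum_{n=1}^{N} \left\{ \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \left[ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1,i_2,\ldots,i_k} w^{(k)}_{i_1 i_2 \ldots i_k}\, x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} \right] - \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln Z(W) \right\} \\
&= \sum_{n=1}^{N} \left\{ x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s} - \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(\mathbf{x} \mid W)} \right\} \\
&= N \left\{ \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{Data} - \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(\mathbf{x} \mid W)} \right\}
\end{aligned}$$
where
$$\langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{Data} = \frac{1}{N} \sum_{n=1}^{N} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s}, \qquad \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(\mathbf{x} \mid W)} = \sum_{\mathbf{x}} x_{i_1} x_{i_2} \cdots x_{i_s}\, P(\mathbf{x} \mid W)$$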
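The gradient derived above is the difference between a data average and a model average. A minimal numerical sketch follows, assuming binary variables, one weight per unordered hyperedge, and exhaustive enumeration for the model average; the data set and learning rate are hypothetical.

```python
import itertools
import math

def model_expectation(edge, weights, n_vars):
    """<x_i1 ... x_is> under P(x | W), by enumerating all binary states."""
    z = 0.0
    acc = 0.0
    for x in itertools.product((0, 1), repeat=n_vars):
        unnorm = math.exp(sum(w * math.prod(x[i] for i in e) for e, w in weights.items()))
        z += unnorm
        acc += unnorm * math.prod(x[i] for i in edge)
    return acc / z

def data_expectation(edge, data):
    """<x_i1 ... x_is> averaged over the training set."""
    return sum(math.prod(x[i] for i in edge) for x in data) / len(data)

def gradient(edge, weights, data, n_vars):
    """N * (<...>_Data - <...>_P(x|W)) for the weight of `edge`."""
    return len(data) * (data_expectation(edge, data)
                        - model_expectation(edge, weights, n_vars))

# Hypothetical toy problem: three variables, one order-2 hyperedge (x0, x1).
weights = {(0, 1): 0.0}
data = [(1, 1, 0), (1, 1, 1)]
g0 = gradient((0, 1), weights, data, n_vars=3)
weights[(0, 1)] += 0.1 * g0          # one gradient-ascent step (step size assumed)
g1 = gradient((0, 1), weights, data, n_vars=3)
```

After the step the model assigns more probability to states with x0 = x1 = 1, so the gap between the data and model averages narrows and `g1` is smaller than `g0`.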
4 examples (variables x1 to x15, label y):

Ex.  x1  x2  x3  x4  x5  x6  x7  x8  x9  x10 x11 x12 x13 x14 x15  y
1    1   0   0   1   0   0   0   0   0   1   0   1   0   0   0    1
2    0   1   1   0   0   0   0   0   1   0   0   0   0   1   0    0
3    0   0   1   0   0   1   0   1   0   0   0   0   1   0   0    1
4    0   0   0   0   0   0   0   1   0   0   1   0   0   0   1    1

Hyperedges sampled from the examples (Round 3):
From example 1: (x1, x4, x10; y=1), (x1, x4, x12; y=1), (x4, x10, x12; y=1)
From example 2: (x2, x3, x9; y=0), (x2, x3, x14; y=0), (x3, x9, x14; y=0)
From example 3: (x3, x6, x8; y=1), (x3, x6, x13; y=1), (x6, x8, x13; y=1)
From example 4: (x8, x11, x15; y=0)

[Figure: the resulting hypernetwork over the vertices x1 to x15]
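The way labeled hyperedges are drawn from a training example can be sketched as follows. This assumes, as the hyperedges in the figure suggest, that order-3 hyperedges are random subsets of the variables that are set to 1; the function names and the fixed random seed are illustrative.

```python
import itertools
import random

def active_variables(example):
    """Indices of the variables whose value is 1."""
    return sorted(i for i, v in example.items() if v == 1)

def sample_hyperedges(example, label, order, n_samples, rng):
    """Draw up to n_samples distinct hyperedges of the given order, each tagged with the label."""
    candidates = list(itertools.combinations(active_variables(example), order))
    chosen = rng.sample(candidates, min(n_samples, len(candidates)))
    return [(edge, label) for edge in chosen]

# Example 1 from the table: x1, x4, x10, x12 are 1, all others 0, y = 1.
example_1 = {i: 0 for i in range(1, 16)}
example_1.update({1: 1, 4: 1, 10: 1, 12: 1})
edges = sample_hyperedges(example_1, label=1, order=3, n_samples=3,
                          rng=random.Random(0))
```

Each returned item is a pair such as `((1, 4, 10), 1)`, i.e. a variable subset plus the class label of the example it came from.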
Molecular Self-Assembly of Hypernetworks
[Figure: two panels. Molecular Encoding: hyperedges (xi, xj, y) as molecules over the variables x1, …, xn with a Class label. Hypernetwork Representation: a network over the nodes X1 to X8.]
Encoding a Hypernetwork with DNA
a) Collection of (labeled) hyperedges
z1 : (x1=0, x2=1, x3=0, y=1)
z2 : (x1=0, x2=0, x3=1, x4=0, x5=0, y=0)
z3 : (x2=1, x4=1, y=1)
z4 : (x2=1, x3=0, x4=1, y=0)
b) Library of DNA molecules corresponding to (a)
z1 : AAAACCAATTGGAAGGCCATGCGG
z2 : AAAACCAATTCCAAGGGGCCTTCCCCAACCATGCCC
z3 : AATTGGCCTTGGATGCGG
z4 : AATTGGAAGGCCCCTTGGATGCCC
where
AAAA = x1, AATT = x2, AAGG = x3, CCTT = x4, CCAA = x5, ATGC = y
CC = 0, GG = 1
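The code table above maps each (variable, value) pair to six nucleotides, so encoding and decoding are simple string operations. A minimal sketch, with `encode` and `decode` as illustrative helper names:

```python
# Code tables taken from the slide.
VARIABLE_CODES = {"x1": "AAAA", "x2": "AATT", "x3": "AAGG",
                  "x4": "CCTT", "x5": "CCAA", "y": "ATGC"}
VALUE_CODES = {0: "CC", 1: "GG"}

def encode(hyperedge):
    """Concatenate the variable code and value code of each (variable, value) pair."""
    return "".join(VARIABLE_CODES[var] + VALUE_CODES[val] for var, val in hyperedge)

def decode(strand):
    """Invert encode(): each pair occupies 4 + 2 = 6 nucleotides."""
    var_of = {code: var for var, code in VARIABLE_CODES.items()}
    val_of = {code: val for val, code in VALUE_CODES.items()}
    return [(var_of[strand[i:i + 4]], val_of[strand[i + 4:i + 6]])
            for i in range(0, len(strand), 6)]

z1 = [("x1", 0), ("x2", 1), ("x3", 0), ("y", 1)]
z3 = [("x2", 1), ("x4", 1), ("y", 1)]
strand_z1 = encode(z1)
strand_z3 = encode(z3)
```

Running `encode` on z1 and z3 reproduces the strands listed in (b), and `decode` recovers the hyperedges from them.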
DNA Molecular Computing
[Figure: DNA molecular computing. Labels: molecular recognition, self-assembly, self-replication, polymer, nanostructure; cycle: heat, cool, repeat.]
Learning the Hypernetwork (by Molecular
Evolution)
[Zhang, DNA11]
[Figure: learning cycle. Library of combinatorial molecules + example → hybridize → select the library elements matching the example → amplify the matched library elements by PCR → next generation.]
Molecular Information Processing
[Video: MP4.avi]
The Theory of Bayesian Evolution
• Evolution as a Bayesian inference process
• Evolutionary computation (EC) is viewed as an iterative process of generating the individuals of ever higher posterior probabilities from the priors and the observed data.
[Figure: from generation 0 with prior P0(Ai) to generation g with prior Pg(Ai) and posterior Pg(Ai | D)]
[Zhang, CEC-99]
Evolutionary Learning Algorithm for
Hypernetwork Classifiers
1. Let the hypernetwork H represent the current distribution P(X, Y).
2. Get a training example (x, y).
3. Classify x using H as follows:
   3.1 Extract all molecules matching x into M.
   3.2 From M separate the molecules into classes:
       Extract the molecules with label Y=0 into M0.
       Extract the molecules with label Y=1 into M1.
   3.3 Compute y* = argmax_{Y ∈ {0,1}} |M_Y| / |M|.
4. Update H:
   If y* = y, then H_n ← H_{n-1} + {c(u, v)} for u = x and v = y for (u, v) ∈ H_{n-1}.
   If y* ≠ y, then H_n ← H_{n-1} − {c(u, v)} for u = x and v ≠ y for (u, v) ∈ H_{n-1}.
5. Go to step 2 if not terminated.
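Steps 3 and 4 of the algorithm can be sketched in a few lines. The representation below is an assumption made for illustration: the library is a dictionary from (hyperedge, label) to a copy count, a hyperedge is a tuple of (index, value) pairs, c copies are added or removed per update, and ties are broken toward the smaller label.

```python
from collections import Counter

def matches(edge, x):
    """A hyperedge matches x if every (index, value) pair agrees with x."""
    return all(x[i] == v for i, v in edge)

def classify(library, x):
    """Step 3: vote with the copy counts of all matching molecules."""
    votes = Counter()
    for (edge, label), copies in library.items():
        if copies > 0 and matches(edge, x):
            votes[label] += copies
    if not votes:
        return None
    return max(sorted(votes), key=lambda label: votes[label])

def update(library, x, y, c=1):
    """Step 4: amplify matching correct molecules, or remove matching wrong ones."""
    y_star = classify(library, x)
    for (edge, label), copies in list(library.items()):
        if not matches(edge, x):
            continue
        if y_star == y and label == y:
            library[(edge, label)] = copies + c
        elif y_star != y and label != y:
            library[(edge, label)] = max(copies - c, 0)
    return y_star

# Hypothetical library: (x0=1, x1=0 -> y=1) with 2 copies, (x0=1 -> y=0) with 3 copies.
library = {
    (((0, 1), (1, 0)), 1): 2,
    (((0, 1),), 0): 3,
}
predictions = [update(library, (1, 0, 1), 1) for _ in range(3)]
```

Presenting the same example three times first removes copies of the wrongly voting molecule and then, once the prediction is correct, amplifies the correct one.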
Learning with Hypergraphs:
Application Results
Biological Applications
• DNA-Based Molecular Diagnosis
• MicroRNA-Based Diagnosis
• Aptamer-Based Diagnosis
DNA-Based Diagnosis
• Data: gene expression data, 120 samples from 60 leukemia patients [Cheok et al., Nature Genetics, 2003]
• Class: ALL/AML
• Training hypernets with 6-fold validation
[Figure: gene expression data → hypernet training → diagnosis]
Learning Curve
Fitness evolution of
the population of hyperedges
Order Effects on Learning
Fitness curves for runs with fixed-cardinality hyperedges
(card = 1, 4, 7, 10)
Aptamer-Based Cardiovascular Disease
Diagnosis
Training Data
▷ Disease: Cardiovascular Disease (CVD)
▷ Classes: 4 classes [Normal / 1st / 2nd / 3rd stages]
▷ Number of samples: 135 [N: 40 / 1st: 38 / 2nd: 19 / 3rd: 18]
▷ Preprocessing: 3K aptamer array → convert to real values → 3K real-value data → feature selection using gain ratio → 150 real-value data → binarization using MDL → 150 Boolean data
▷ Simulation parameter values
1) Order: 2 ~ 70
2) Sampling rate: 50
3) Each case is repeated 10 times and averaged
▷ Classification: majority voting with the sum of library element weights
▷ Training / test size: training 108 (80%) / test 27 (20%)
Learning & Classification by Hypernetworks
[Figure: learning and classification pipeline. Source data → binarization → data set (training data, test data) → library → learning loop (evolution stage: weight update, adjust learning rate) → test. Two plots show training and test accuracy over the learning loop.]

Binarized data (one row per sample, class C last):
X0=1 X1=1 X2=0 X3=0 X4=1 X5=1 X6=1 X7=0 … X149=1 C=1
X0=0 X1=0 X2=0 X3=1 X4=1 X5=1 X6=0 X7=0 … X149=1 C=0
X0=0 X1=0 X2=1 X3=1 X4=0 X5=1 X6=0 X7=1 … X149=1 C=1
X0=0 X1=1 X2=1 X3=1 X4=0 X5=0 X6=0 X7=1 … X149=1 C=1
X0=1 X1=0 X2=1 X3=1 X4=0 X5=0 X6=0 X7=1 … X149=1 C=0

Library before (W) and after (W') learning:
Library element              Class  W     W'
X0=1 X1=1 X2=0 X3=0          C=1    1000  1
X0=1 X4=1 X6=1 X7=0          C=1    1000  45
X18=1 X35=0 X68=1 X82=0      C=1    1000  4000
X6=0 X7=0 X8=0 X9=1          C=0    1000  12
X14=0 X4=1 X5=1 X7=0         C=0    1000  8530
X22=0 X4=1 X6=0 X149=1       C=0    1000  500
X1=0 X33=1 X4=0 X9=1         C=1    1000  1300
X3=1 X6=0 X52=1 X8=0         C=1    1000  4
X0=0 X2=1 X4=0 X5=1          C=1    1000  14

Weight Update Rule (Learning): Error Correction
When all index-value pairs of a library element match the example:
  if the class is correct, w = w * 1.0001
  else w = w * 0.95
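The weight update rule and the weighted majority vote can be sketched as follows. The dictionary layout of a library element and the values of X8 and X9 in the sample are assumptions made for illustration; the multipliers 1.0001 and 0.95 and the initial weight 1000 are the ones given on the slide.

```python
def matches(element, x):
    """True if every index-value pair of the library element agrees with x."""
    return all(x[i] == v for i, v in element["pairs"].items())

def update_weights(library, x, c, up=1.0001, down=0.95):
    """Error correction: reward matching elements with the right class, penalize the rest."""
    for element in library:
        if matches(element, x):
            element["w"] *= up if element["cls"] == c else down

def classify(library, x):
    """Majority vote using the sum of the weights of the matching elements."""
    totals = {}
    for element in library:
        if matches(element, x):
            totals[element["cls"]] = totals.get(element["cls"], 0.0) + element["w"]
    return max(totals, key=totals.get) if totals else None

library = [
    {"pairs": {0: 1, 1: 1, 2: 0, 3: 0}, "cls": 1, "w": 1000.0},
    {"pairs": {6: 0, 7: 0, 8: 0, 9: 1}, "cls": 0, "w": 1000.0},
]
# First binarized sample, truncated to X0..X9 (X8 and X9 are hypothetical values).
x = {0: 1, 1: 1, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 0, 8: 0, 9: 1}
update_weights(library, x, c=1)
predicted = classify(library, x)
```

The asymmetric multipliers mean that a wrong match costs an element far more than a correct match earns it, so unreliable elements lose influence quickly.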
Simulation Result (1/3)
▷ Training & test accuracy as learning goes on (order k=12)
[Figure: training and test accuracy (%, 75 to 100) vs. epoch (0 to 500)]
Simulation Result (2/3)
▷ Accuracy on test data as learning goes on (order k=12)
[Figure: test accuracy (%, 64 to 84) vs. epoch (0 to 200); the legend lists orders 2, 4, 8, 12, 16, 20, 30, 40, 50, 60, 70]
Simulation Result (3/3)
▷ The effect of learning
[Figure: accuracy (%, 64 to 84) vs. order (0 to 80), comparing learning with sampling only]
Mining Cancer-Related MicroRNA
Modules from miRNA Expression
Profiles
Gene Regulation by microRNAs
• MicroRNAs
  - MicroRNAs (miRNAs) are endogenous RNAs of about 22 nt that can play important regulatory roles in animals, plants, and viruses.
• Post-transcriptional gene regulation
  - Binding target genes for degradation or translational repression
• Recently, miRNAs have been reported to be related to cancer development and progression.
Dataset
• The miRNA expression microarray data
  - The expression profiles of human miRNAs across 11 tumor types: bladder, breast, colon, kidney, lung, pancreas, prostate, uterus, melanoma, mesothelioma, and ovary tissue (Lu et al., 2005).
  - This dataset consists of an expression matrix of 151 miRNAs (rows) and 89 samples (columns).

Tissue type    Normal  Cancer
Bladder        1       6
Breast         3       6
Colon          4       7
Kidney         3       4
Lung           2       5
Pancreas       1       8
Prostate       6       6
Uterus         1       10
Melanoma       0       3
Mesothelioma   0       8
Ovary          0       5
All tissues    21      68
Representing a Hypernetwork
from miRNA Expression Data
Data items: 151 miRNAs, 89 samples. Each sample is a binary vector over X1, …, X151 with a class label (cancer or normal).
Library (normal or cancer classification rules): hyperedges over miRNA variables with a class label, e.g. (X10, X20; normal), (X1, X45; cancer), (X10, X31; cancer), (X1, X80; normal), (X31, X20; normal), (X1, X2; cancer).
A hypernetwork H = (X, E, W) of DNA Molecules
[Figure: samples 1 to 89, the library of classification rules sampled from them, and the resulting hypernetwork]
Performance
• Leave-one-out cross-validation

Algorithms                               Correct classification rate
Bayesian Network                         79.77 %
Naïve Bayes                              83.15 %
ID3                                      88.76 %
Hypernetworks                            90.00 %
Sequential Minimal Optimization (SMO)    91.01 %
Multi-layer perceptron (MLP)             92.13 %
Accuracy vs. Order for Test Data
(sampling only)
[Figure: classification ratio (0.2 to 1) vs. order (20 to 140)]
Learning Curves for Training Data
[Figure: classification ratio (0.8 to 1) vs. epoch (0 to 60), one curve per order 2, 3, 4, 5, 6, 7]
miRNA Data Mining
• miRNA modules related to cancer

Weight         miRNA a (value)       miRNA b (value)
7919.249184    hsa-miR-215 = 1       hsa-miR-7 = 1
6787.927872    hsa-miR-194 = 1       hsa-miR-30d = 0
6787.927872    hsa-miR-214 = 1       hsa-miR-30e = 0
6084.600896    hsa-miR-21 = 1        hsa-miR-321 = 1
5656.60656     hsa-miR-142-3p = 1    hsa-miR-34b = 0
5656.60656     hsa-miR-142-3p = 1    hsa-miR-96 = 0
5656.60656     hsa-miR-126 = 1       hsa-miR-30c = 0
5324.025784    hsa-miR-26b = 1       hsa-miR-29b = 1
5324.025784    hsa-let-7f = 1        hsa-miR-9* = 1
5324.025784    hsa-miR-224 = 1       hsa-miR-301 = 0

• miRNAs related to cancer

miRNA            Weight
hsa-miR-155      295972.7
hsa-miR-105      283034.8
hsa-miR-223      280371.4
hsa-miR-21       277609.9
hsa-let-7c       270764.7
hsa-miR-142-3p   266700.1
hsa-miR-29b      263159
hsa-miR-224      260877.3
hsa-miR-183      260877.3
hsa-miR-184      260116.7
hsa-let-7a       256313.8
Non-Biological Applications
• Digit Recognition
• Face Classification
• Text Classification
• Movie Title Prediction
Digit Recognition: Dataset
• Original Data
  - Handwritten digits (0 ~ 9)
  - Training data: 2,630 (263 examples for each class)
  - Test data: 1,130 (113 examples for each class)
• Preprocessing
  - Each example is an 8x8 binary matrix.
  - Each pixel is 0 or 1.
Pattern Classification
[Figure: a "layered" hypernetwork. Input layer: x1, x2, x3, …, xn. Hidden layer: probabilistic library (DNA representation) of hyperedges over the input variables with Class labels, weighted by w1, w2, …, wj, …, wm. Output layer: Class.]
$$m = \sum_{k=1}^{n} \binom{n}{k}$$
$$W = \left\{ w_i \;\middle|\; 1 \le i \le \sum_{k=1}^{n} \binom{n}{k},\ w_i = \#\text{ of copies} \right\}$$
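The bound m on the number of library elements grows very quickly with the number of input variables n, since by the binomial theorem the sum equals 2^n − 1. A short sketch makes this concrete; the function name is illustrative, and n = 64 corresponds to the 8x8 binary digit images.

```python
import math

def max_library_size(n):
    """m = sum_{k=1}^{n} C(n, k): the number of non-empty subsets of n variables."""
    return sum(math.comb(n, k) for k in range(1, n + 1))

m_small = max_library_size(4)    # 4 + 6 + 4 + 1
m_digits = max_library_size(64)  # one variable per pixel of an 8x8 image
```

For n = 64 the bound is far beyond what can be enumerated, which is why the library is sampled rather than built exhaustively.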
Simulation Results – without Error Correction
• |Train set| = 3760, |Test set| = 1797.
Performance Comparison
Methods                            Accuracy
MLP with 37 hidden nodes           0.941
MLP with no hidden nodes           0.901
SVM with polynomial kernel         0.926
SVM with RBF kernel                0.934
Decision Tree                      0.859
Naïve Bayes                        0.885
kNN (k=1)                          0.936
kNN (k=3)                          0.951
Hypernet with learning (k = 10)    0.923
Hypernet with sampling (k = 33)    0.949
Error Correction Algorithm
1. Initialize the library as before.
2. maxChangeCnt := librarySize.
3. For i := 0 to iteration_limit
   1. trainCorrectCnt := 0.
   2. Run classification for all training patterns. For each correctly classified pattern, increase trainCorrectCnt.
   3. For each library element
      1. Initialize fitness value to 0.
      2. For each misclassified training pattern, if the library element is matched to that example
         1. if classified correctly, then the fitness of the library element gains 2 points.
         2. Else it loses 1 point.
   4. changeCnt := max{ librarySize * (1.5 * (trainSetSize - trainCorrectCnt) / trainSetSize + 0.01), maxChangeCnt * 0.9 }.
   5. maxChangeCnt := changeCnt.
   6. Delete changeCnt library elements of lowest fitness and resample library elements whose classes are those of the deleted ones.
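One pass of step 3 can be sketched as a single function. Everything about the interface is an assumption for illustration: `classify`, `matches`, and `resample` are caller-supplied hypothetical helpers, library elements are dictionaries with a `cls` key, and changeCnt is truncated to an integer before deletion. The `max` in the changeCnt formula is kept exactly as written on the slide.

```python
def error_correction_step(library, train, classify, matches, resample, max_change_cnt):
    """Run steps 3.1 to 3.6 once; returns (changeCnt, trainCorrectCnt)."""
    # 3.1-3.2: classify every training pattern and collect the misclassified ones.
    predictions = [classify(library, x) for x, _ in train]
    wrong = [(x, y) for (x, y), p in zip(train, predictions) if p != y]
    train_correct_cnt = len(train) - len(wrong)

    # 3.3: fitness is +2 per misclassified pattern the element labels correctly, -1 otherwise.
    fitness = [0] * len(library)
    for idx, element in enumerate(library):
        for x, y in wrong:
            if matches(element, x):
                fitness[idx] += 2 if element["cls"] == y else -1

    # 3.4-3.5: number of elements to replace, as given on the slide.
    error_rate = (len(train) - train_correct_cnt) / len(train)
    change_cnt = max(len(library) * (1.5 * error_rate + 0.01), max_change_cnt * 0.9)

    # 3.6: replace the lowest-fitness elements with new ones of the same class.
    n_replace = min(int(change_cnt), len(library))
    worst = sorted(range(len(library)), key=lambda i: fitness[i])[:n_replace]
    for i in worst:
        library[i] = resample(library[i]["cls"])
    return change_cnt, train_correct_cnt
```

The caller would loop this function up to the iteration limit, passing the returned changeCnt back in as `max_change_cnt`.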
Simulation Results – with Error Correction
• iterationLimit = 37, librarySize = 382,300
[Figure: two panels (Train, Test) of classification ratio vs. iteration (0 to 35), one curve per order; the legends list orders 6, 10, 14, 18, 22, 27 and 6, 10, 14, 18, 22, 26]
Performance Comparison
Algorithms                      Correct classification rate
Random Forest (f=10, t=50)      94.10 %
KNN (k=4)                       93.49 %
Hypernetwork (Order=26)         92.99 %
AdaBoost (Weak Learner: J48)    91.93 %
SVM (Gaussian Kernel, SMO)      91.37 %
MLP                             90.53 %
Naïve Bayes                     87.26 %
J48                             84.86 %
Face Classification Experiments
Face Data Set
• Yale dataset
  - 15 people
  - 11 images per person
  - Total 165 images
Training Images of a Person
• 10 for training
• The remaining 1 for test
[Figure: bitmaps for training data (dimensionality = 480)]
[Figure: classification rate by leave-one-out]
[Figure: classification rate (dimensionality = 64 by PCA)]
Text Classification Experiments
Text Classification
1. Documents
2. Bag-of-words representation
3. Term vectors (term counts for the documents d1, d2, d3, …, dn over terms such as baseball, specs, graphics, hockey, unix, space)
4. Binary term-document matrix
5. DNA encoded kernel functions (labeled hyperedges such as (x1=0, x2=1, x3=1, y=1) and (x1=0, x2=0, x3=1, y=0))
[Figure: the five stages from documents to the library of DNA-encoded hyperedges]
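Steps 2 to 4 of this pipeline amount to turning each document into a 0/1 vector over a fixed vocabulary. A minimal sketch follows; the vocabulary uses the terms shown on the slide, while the two documents and the whitespace tokenization are hypothetical simplifications.

```python
def binary_term_vectors(documents, vocabulary):
    """One 0/1 vector per document: 1 if the term occurs at least once."""
    vectors = []
    for text in documents:
        words = set(text.lower().split())
        vectors.append([1 if term in words else 0 for term in vocabulary])
    return vectors

vocabulary = ["baseball", "specs", "graphics", "hockey", "unix", "space"]
documents = [
    "Hockey and baseball scores",
    "unix graphics specs for unix",
]
matrix = binary_term_vectors(documents, vocabulary)
```

Repeated terms, such as "unix" in the second document, still map to a single 1, which is what distinguishes the binary matrix of step 4 from the count vectors of step 3.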
Text Classification
• Data from Reuters-21578 (‘ACQ’ and ‘EARN’)
• Learning curves: average for 10 runs
Performance Comparison
• ‘ACQ’ data (4,724 documents)
• ‘EARN’ data (7,888 documents)
• Higher-dimensional kernel functions can improve the performance further.
Learning from Movie Captions
Experiments
Learning Hypernets from Movie Captions
• Order
  - Sequential
  - Range: 2~3
• Corpus
  - Friends
  - Prison Break
  - 24
Learning Hypernets from Movie Captions
• Classification
  - Query generation
    I intend to marry her:
      I ? to marry her
      I intend ? marry her
      I intend to ? her
      I intend to marry ?
  - Matching
    I ? to marry her
      order 2: I intend, I am, intend to, …
      order 3: I intend to, intend to marry, …
  - Count the number of max-perfect-matching hyperedges
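Query generation and completion with sequential hyperedges of order 2 and 3 can be sketched as follows. The scoring is a simplified assumption: each stored hyperedge that fits a window around the "?" contributes its count to the word it places there, which may differ from the max-perfect-matching count used in the lecture. The two-sentence corpus is hypothetical.

```python
from collections import Counter

def sequential_hyperedges(sentences, orders=(2, 3)):
    """Count every run of k consecutive words (k in orders) in the corpus."""
    edges = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for k in orders:
            for i in range(len(words) - k + 1):
                edges[tuple(words[i:i + k])] += 1
    return edges

def generate_queries(sentence):
    """Mask each word position in turn with '?'."""
    words = sentence.split()
    return [" ".join(words[:i] + ["?"] + words[i + 1:]) for i in range(len(words))]

def complete(query, edges):
    """Fill the single '?' with the word supported by the most matching hyperedges."""
    words = query.lower().split()
    pos = words.index("?")
    scores = Counter()
    for edge, count in edges.items():
        k = len(edge)
        # Slide a window of length k over every position that covers the '?'.
        for start in range(max(0, pos - k + 1), min(pos, len(words) - k) + 1):
            window = words[start:start + k]
            if all(w == "?" or w == e for w, e in zip(window, edge)):
                scores[edge[pos - start]] += count
    return scores.most_common(1)[0][0] if scores else None

edges = sequential_hyperedges(["i intend to marry her", "i am here"])
queries = generate_queries("I intend to marry her")
answer = complete("I ? to marry her", edges)
```

With this toy corpus, "intend" is supported by four hyperedges and "am" by one, so the query is completed with "intend".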
Learning Hypernets from Movie Captions
• Completion & Classification Examples

Original: who are you (Corpus: Friends, 24, Prison Break)
Query                  Completion             Classification
? are you              what are you           Friends
who ? you              who are you            Friends
who are ?              who are you            Friends

Original: you need to wear it (Corpus: 24, Prison Break, House)
Query                  Completion             Classification
? need to wear it      i need to wear it      24
you ? to wear it       you want to wear it    24
you need ? wear it     you need to wear it    24
you need to ? it       you need to do it      House
you need to wear ?     you need to wear a     24
Conclusion
• Hypernetworks are a graphical model that employs higher-order nodes explicitly and allows a more natural representation for learning higher-order graphical models.
• We introduce an evolutionary learning algorithm that makes use of the high information density and massive parallelism of molecular computing to tackle combinatorial explosion problems.
• Applied to pattern recognition (and completion) problems in IT and BT.
• Obtained performance competitive with conventional ML classifiers.
• Why does this work?
  - Exploits the huge population size available in DNA computing to build an ensemble machine, i.e. a hypernetwork, of simple random hyperedges.
  - A new kind of evolutionary algorithm where very simple “molecular” operators are applied to a “huge” population of individuals in a “massively parallel” way.
• Another potential of hypernetworks is application to biological problems where data are given as “wet” DNA or RNA molecules.
Acknowledgements
Simulation Experiments
Joo-Kyoung Kim, Sun Kim, Soo-Jin Kim,
Jung-Woo Ha, Chan-Hoon Park, Ha-Young Jang
Collaborating Labs
- Biointelligence Laboratory, Seoul National University
- RNomics Lab, Seoul National University
- DigitalGenomics, Inc.
- GenoProt, Inc.
Supported by
- National Research Lab Program of Min. of Sci. & Tech. (2002-2007)
- Next Generation Tech. Program of Min. of Ind. & Comm. (2000-2010)
More Information at
- http://bi.snu.ac.kr/MEC/
- http://cbit.snu.ac.kr/