Lecture Note - Seoul National University: Biointelligence Lab

Hypernetwork Models of Memory
Byoung-Tak Zhang
Biointelligence Laboratory
School of Computer Science and Engineering
Brain Science, Cognitive Science, Bioinformatics Programs
Seoul National University
Seoul 151-742, Korea
btzhang@cse.snu.ac.kr
http://bi.snu.ac.kr/
Signaling Networks
Protein-Protein Interaction Networks
[Jeong et al., Nature 2001]
Regulatory Networks
Transcription Factor
[Basso et al., Nature Genetics 37, 382–390, 2005]
Molecular Networks
[Figure: a molecular interaction network over variables x1–x15]
The “Hyperinteraction” Model of Biomolecules
[Zhang, DNA-2006] [Zhang, FOCI-2007]

[Figure: hyperinteractions among biomolecules x1–x15]

• Vertex:
  • Genes
  • Proteins
  • Neurotransmitters
  • Neuromodulators
  • Hormones
• Edge: interactions
  • Genetic
  • Signaling
  • Metabolic
  • Synaptic
The Hypernetworks
Hypergraphs

• A hypergraph is an (undirected) graph G whose edges each connect a nonzero number of vertices, i.e. G = (V, E), where
  V = {v1, v2, ..., vn},
  E = {E1, E2, ..., En},
  and Ei = {vi1, vi2, ..., vim}.
• An m-hypergraph consists of a set V of vertices and a subset E of V[m], i.e. G = (V, V[m]), where V[m] is a set of subsets of V whose elements have precisely m members.
• A hypergraph G is said to be k-uniform if every edge Ei in E has cardinality k.
• A hypergraph G is k-regular if every vertex has degree k.
• Remark: an ordinary graph is a 2-uniform hypergraph.
An Example Hypergraph
G = (V, E)
V = {v1, v2, v3, ..., v7}
E = {E1, E2, E3, E4, E5}

E1 = {v1, v3, v4}
E2 = {v1, v4}
E3 = {v2, v3, v6}
E4 = {v3, v4, v6, v7}
E5 = {v4, v5, v7}

[Figure: the hypergraph drawn over vertices v1–v7]
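As a concrete illustration (my own sketch, not from the slides), the example hypergraph above maps directly onto Python sets, and the k-uniformity and degree definitions become one-line checks; the helper names are my own:

```python
# The example hypergraph G = (V, E) from the slide above.
V = {f"v{i}" for i in range(1, 8)}
E = {
    "E1": {"v1", "v3", "v4"},
    "E2": {"v1", "v4"},
    "E3": {"v2", "v3", "v6"},
    "E4": {"v3", "v4", "v6", "v7"},
    "E5": {"v4", "v5", "v7"},
}

def is_k_uniform(edges, k):
    """True if every hyperedge has cardinality k."""
    return all(len(e) == k for e in edges.values())

def degree(edges, v):
    """Number of hyperedges containing vertex v."""
    return sum(v in e for e in edges.values())

print(is_k_uniform(E, 3))  # False: |E2| = 2 and |E4| = 4
print(degree(E, "v4"))     # 4: v4 lies in E1, E2, E4, E5
```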
Hypernetworks

[Zhang, DNA-2006]

• A hypernetwork is a hypergraph with weighted edges. It is defined as a triple H = (V, E, W), where
  V = {v1, v2, ..., vn},
  E = {E1, E2, ..., En},
  and W = {w1, w2, ..., wn}.
• An m-hypernetwork consists of a set V of vertices and a subset E of V[m], i.e. H = (V, V[m], W), where V[m] is a set of subsets of V whose elements have precisely m members and W is the set of weights associated with the hyperedges.
• A hypernetwork H is said to be k-uniform if every edge Ei in E has cardinality k.
• A hypernetwork H is k-regular if every vertex has degree k.
• Remark: an ordinary graph is a 2-uniform hypernetwork with wi = 1.
The Hypernetwork Memory

[Zhang, DNA12-2006]

[Figure: a hypernetwork memory over variables x1–x15]

The hypernetwork is defined as
$$H = (X, S, W), \quad X = (x_1, x_2, \ldots, x_I), \quad S = \bigcup_i S_i, \ S_i \subseteq X, \ k = |S_i|, \quad W = (W^{(2)}, W^{(3)}, \ldots, W^{(K)})$$

Training set:
$$D = \{\mathbf{x}^{(n)}\}_{n=1}^{N}$$

The energy of the hypernetwork:
$$E(\mathbf{x}^{(n)}; W) = -\frac{1}{2}\sum_{i_1,i_2} w^{(2)}_{i_1 i_2} x^{(n)}_{i_1} x^{(n)}_{i_2} - \frac{1}{6}\sum_{i_1,i_2,i_3} w^{(3)}_{i_1 i_2 i_3} x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} - \cdots$$

The probability distribution:
$$P(\mathbf{x}^{(n)} \mid W) = \frac{1}{Z(W)} \exp\!\left[-E(\mathbf{x}^{(n)}; W)\right] = \frac{1}{Z(W)} \exp\!\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k}\right],$$

where the partition function is
$$Z(W) = \sum_{\mathbf{x}^{(m)}} \exp\!\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(m)}_{i_1} x^{(m)}_{i_2} \cdots x^{(m)}_{i_k}\right]$$
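To make the energy and probability concrete, here is a small self-contained sketch (my own, not from the slides); I take c(k) = k!, which reproduces the 1/2 and 1/6 factors in the energy above, and the variable names are assumptions:

```python
import math
from itertools import product

# Weighted hyperedges: (variable indices, weight). c(k) is taken to be k!.
hyperedges = [
    ((0, 3), 0.8),      # a w^(2) weight on (x1, x4)
    ((0, 3, 9), 1.2),   # a w^(3) weight on (x1, x4, x10)
]

def energy(x, hyperedges):
    """E(x; W) = -sum over hyperedges of w * x_i1 ... x_ik / k!."""
    e = 0.0
    for idx, w in hyperedges:
        term = w
        for i in idx:
            term *= x[i]
        e -= term / math.factorial(len(idx))
    return e

def prob(x, hyperedges, patterns):
    """P(x | W): normalize exp(-E) over an explicit pattern space."""
    z = sum(math.exp(-energy(p, hyperedges)) for p in patterns)
    return math.exp(-energy(x, hyperedges)) / z

x = (1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0)  # pattern 1 from the next slide
space = list(product((0, 1), repeat=15))            # exhaustive only for small I
print(energy(x, hyperedges))                        # -0.6
print(prob(x, hyperedges, space))
```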
[Figure: storing four labeled training patterns as sampled hyperedges (Round 3)]

4 sentences (with labels), as binary patterns over x1, ..., x15 (unlisted variables are 0):

1: x1=1, x4=1, x10=1, x12=1; y=1
2: x2=1, x3=1, x9=1, x14=1; y=0
3: x3=1, x6=1, x8=1, x13=1; y=1
4: x8=1, x11=1, x15=1; y=1

Sampled 3-uniform hyperedges (each carries the label of its source pattern):

From 1: {x1, x4, x10} y=1, {x1, x4, x12} y=1, {x4, x10, x12} y=1
From 2: {x2, x3, x9} y=0, {x2, x3, x14} y=0, {x3, x9, x14} y=0
From 3: {x3, x6, x8} y=1, {x3, x6, x13} y=1, {x6, x8, x13} y=1
From 4: {x8, x11, x15} y=1
An Example: Hypernetwork Model of Linguistic Memory

The Language Game Platform
A Language Game
We ? ? a lot ? gifts.
→ We don't have a lot of gifts.

? still ? believe ? did this.
→ I still can't believe you did this.
Text Corpus: TV Drama Series
Friends, 24, House, Grey's Anatomy, Gilmore Girls, Sex and the City

Training data: 289,468 sentences, e.g.
  I don't know what happened.
  Take a look at this.
  ...

Test data: 700 sentences with blanks, e.g.
  What ? ? ? here.
  ? have ? visit the ? room.
  ...
Step 1: Learning a Linguistic Memory
Computer network is rapidly increased.
The price of computer network installing is very cheap.
The price of monitor display is on decreasing.
Nowadays, color monitor display is so common, the price is not so high.
This is a system adopting text mode color display.
This is an animation news networks.
...

Word hyperedges of order k are sampled from each sentence and stored in the hypernetwork memory, e.g.
  k=2: {Computer, Network}, {Computer, Price}, {Monitor, Display}, {Color, Monitor}, {Color, Text}, ...
  k=3: {Computer, Network, Price}, {Computer, Monitor, Display}, {Color, Text, Display}, ...
  k=4: {Display, Monitor, Color, Price}, {Text, Network, News, Price}, ...
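The sampling can be done with a sliding window, as described in the "Experimental Setup" slide below; this minimal sketch (my own, with assumed names) builds a counted hyperedge memory:

```python
def sample_hyperedges(sentence, k):
    """Slide a window of size k over the words to form order-k hyperedges."""
    words = sentence.lower().split()
    return [tuple(words[i:i + k]) for i in range(len(words) - k + 1)]

memory = {}  # hyperedge -> count, i.e. its weight in the memory
corpus = [
    "Computer network is rapidly increased.",
    "The price of monitor display is on decreasing.",
]
for sentence in corpus:
    for k in (2, 3, 4):
        for edge in sample_hyperedges(sentence, k):
            memory[edge] = memory.get(edge, 0) + 1

print(memory[("computer", "network")])  # 1
```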
Step 2: Recalling from the Memory
Storage: the sentences "He is my best friend", "He is a strong boy", and "Strong friend likes pretty girl" are stored as word-pair hyperedges, e.g. (He, is), (is, my), (my, best), (best, friend), (is, a), (a, strong), (strong, boy), (strong, friend), (friend, likes), (likes, pretty), (pretty, girl).

Recall: a cue with a blank, "He is a ? friend", is completed by self-assembly of the matching hyperedges, yielding "He is a strong friend" (the competing fragment "best friend" also matches, but loses).
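A minimal recall sketch (my own, bigram-only, not the authors' implementation) that completes a blank by letting the stored hyperedges vote:

```python
def recall(query, memory):
    """Fill each '?' by letting stored bigram hyperedges vote on the blank."""
    words = query.split()
    for pos, token in enumerate(words):
        if token != "?":
            continue
        votes = {}
        for (a, b), w in memory.items():
            if pos > 0 and a == words[pos - 1]:
                votes[b] = votes.get(b, 0) + w   # left context matches
            if pos + 1 < len(words) and b == words[pos + 1]:
                votes[a] = votes.get(a, 0) + w   # right context matches
        if votes:
            words[pos] = max(votes, key=votes.get)
    return " ".join(words)

# A few bigrams stored from the three sentences above.
memory = {("a", "strong"): 1, ("strong", "friend"): 1,
          ("best", "friend"): 1, ("strong", "boy"): 1}
print(recall("He is a ? friend", memory))  # He is a strong friend
```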
The Language Game: Results

? gonna ? upstairs ? ? a shower → I'm gonna go upstairs and take a shower
We ? ? a lot ? gifts → We don't have a lot of gifts
? have ? visit the ? room → I have to visit the ladies' room
? ? don't need your ? → If I don't need your help
? still ? believe ? did this → I still can't believe you did this
? ? a dream about ? in ? → I had a dream about you in Copenhagen
? ? ? decision → to make a decision
What ? ? ? here → What are you doing here
? appreciate it if ? call her by ? ? → I appreciate it if you call her by the way
? you ? first ? of medical school → Are you go first day of medical school
Would you ? to meet ? ? Tuesday ? → Would you nice to meet you in Tuesday and
I'm standing ? the ? ? ? cafeteria → I'm standing in the one of the cafeteria
Why ? you ? come ? down ? → Why are you go come on down here
? think ? I ? met ? somewhere before → I think but I am met him somewhere before
Experimental Setup

• The order (k) of a hyperedge
  - Range: 2~4
  - Fixed order for each experiment
• The method of creating hyperedges from training data
  - Sliding-window method
  - Sequential sampling from the first word
• The number of blanks (question marks) in test data
  - Range: 1~4
  - Maximum: k - 1
Learning Behavior Analysis (1/3)
Completion of Sentences with One Missing Word

[Chart: completion performance (0–800) vs. corpus size (40K–290K sentences) for orders 2, 3, 4]

• The performance increases monotonically as the learning corpus grows.
• The low-order memory performs best for the one-missing-word problem.
Learning Behavior Analysis (2/3)
Completion of Sentences with Two Missing Words

[Chart: completion performance (0–800) vs. corpus size (40K–290K sentences) for orders 2, 3, 4]

• The medium-order (k=3) memory performs best for the two-missing-words problem.
Learning Behavior Analysis (3/3)
Completion of Sentences with Three Missing Words

[Chart: completion performance (0–800) vs. corpus size (40K–290K sentences) for orders 2, 3, 4]

• The high-order (k=4) memory performs best for the three-missing-words problem.
Learning with Hypernetworks

[Figure: the same four labeled training patterns and sampled 3-uniform hyperedges shown earlier in "The Hypernetwork Memory"]
Random Graph Process (RGP)

[Figure: five snapshots of a random graph process]
Collectively Autocatalytic Sets

"A contemporary cell is a collectively autocatalytic whole in which DNA, RNA, the code, proteins, and metabolism linking the synthesis of some molecular species to the breakdown of other 'high-energy' molecular species all weave together and conspire to catalyze the entire set of reactions required for the whole cell to reproduce." (Kauffman)
Reaction Graph
The Hypernetwork Model of Memory

[Zhang, 2006]

The hypernetwork is defined as
$$H = (X, S, W), \quad X = (x_1, x_2, \ldots, x_I), \quad S = \bigcup_i S_i, \ S_i \subseteq X, \ k = |S_i|, \quad W = (W^{(2)}, W^{(3)}, \ldots, W^{(K)})$$

Training set:
$$D = \{\mathbf{x}^{(n)}\}_{n=1}^{N}$$

The energy of the hypernetwork:
$$E(\mathbf{x}^{(n)}; W) = -\sum_{i_1} w^{(1)}_{i_1} x^{(n)}_{i_1} - \frac{1}{2}\sum_{i_1,i_2} w^{(2)}_{i_1 i_2} x^{(n)}_{i_1} x^{(n)}_{i_2} - \frac{1}{6}\sum_{i_1,i_2,i_3} w^{(3)}_{i_1 i_2 i_3} x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} - \cdots$$

The probability distribution:
$$P(\mathbf{x}^{(n)} \mid W) = \frac{1}{Z(W)} \exp\!\left[-E(\mathbf{x}^{(n)}; W)\right] = \frac{1}{Z(W)} \exp\!\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k}\right],$$

where the partition function is
$$Z(W) = \sum_{\mathbf{x}^{(m)}} \exp\!\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(m)}_{i_1} x^{(m)}_{i_2} \cdots x^{(m)}_{i_k}\right]$$
Deriving the Learning Rule

$$P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W) = \prod_{n=1}^{N} P(\mathbf{x}^{(n)} \mid W)$$

$$\ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W) = \ln \prod_{n=1}^{N} P(\mathbf{x}^{(n)} \mid W^{(2)}, W^{(3)}, \ldots, W^{(K)}) = \sum_{n=1}^{N} \left\{ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \ln Z(W) \right\}$$

The learning rule follows from the gradient $\partial \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W) / \partial w^{(s)}_{i_1 i_2 \ldots i_s}$, computed on the next slide.
Derivation of the Learning Rule

$$\frac{\partial \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W)}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}}
= \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \sum_{n=1}^{N} \left\{ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \ln Z(W) \right\}$$
$$= \sum_{n=1}^{N} \left\{ x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s} - \left\langle x_{i_1} x_{i_2} \cdots x_{i_s} \right\rangle_{P(\mathbf{x} \mid W)} \right\}
= N \left\{ \left\langle x_{i_1} x_{i_2} \cdots x_{i_s} \right\rangle_{\text{Data}} - \left\langle x_{i_1} x_{i_2} \cdots x_{i_s} \right\rangle_{P(\mathbf{x} \mid W)} \right\},$$

where
$$\left\langle x_{i_1} x_{i_2} \cdots x_{i_s} \right\rangle_{\text{Data}} = \frac{1}{N} \sum_{n=1}^{N} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s}, \qquad
\left\langle x_{i_1} x_{i_2} \cdots x_{i_s} \right\rangle_{P(\mathbf{x} \mid W)} = \sum_{\mathbf{x}} x_{i_1} x_{i_2} \cdots x_{i_s} \, P(\mathbf{x} \mid W)$$
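The final line is the classic "data average minus model average" gradient. A tiny illustrative implementation (mine, not from the slides; exhaustive enumeration, pairwise weights only):

```python
import math
from itertools import product

def unnorm(x, w):
    """exp(-E) for pairwise weights w[(i, j)], with the 1/c(2) = 1/2 factor."""
    return math.exp(sum(wij * x[i] * x[j] for (i, j), wij in w.items()) / 2)

def grad(data, w, i, j):
    """(1/N) d lnP / d w_ij = <x_i x_j>_Data - <x_i x_j>_P(x|W)."""
    data_avg = sum(x[i] * x[j] for x in data) / len(data)
    space = list(product((0, 1), repeat=len(data[0])))
    z = sum(unnorm(x, w) for x in space)
    model_avg = sum(x[i] * x[j] * unnorm(x, w) for x in space) / z
    return data_avg - model_avg

data = [(1, 0, 1), (1, 1, 1), (0, 1, 0)]
w = {(0, 2): 0.0, (1, 2): 0.0}
eta = 0.5
for _ in range(100):  # gradient ascent on the log-likelihood
    w = {ij: wij + eta * grad(data, w, *ij) for ij, wij in w.items()}
print(w)  # the (0, 2) weight grows: x0 and x2 co-occur in the data
```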
Comparison to Other Machine Learning Methods

Probabilistic Graphical Models (PGMs)

• Represent the joint probability distribution over some random variables in graphical form.
  - Undirected PGMs
  - Directed PGMs
• Generative: the probability distribution for some variables given values of other variables can be obtained.
  - Probabilistic inference

[Figure: a directed PGM over A, B, C, D, E, with annotations: C and D are independent given B; C asserts a dependency between A and B; B and E are independent given C]

$$P(A, B, C, D, E) = P(A)\,P(B \mid A)\,P(C \mid A, B)\,P(D \mid A, B, C)\,P(E \mid A, B, C, D) = P(A)\,P(B)\,P(C \mid A, B)\,P(D \mid B)\,P(E \mid C)$$
Kinds of Graphical Models
Graphical Models

Undirected
- Boltzmann Machines
- Markov Random Fields

Directed
- Bayesian Networks
- Latent Variable Models
- Hidden Markov Models
- Generative Topographic Mapping
- Non-negative Matrix Factorization
Bayesian Networks

• A BN = (S, P) consists of a network structure S and a set of local probability distributions P:
$$p(\mathbf{x}) = \prod_{i=1}^{n} p(x_i \mid \mathbf{pa}_i)$$

[Figure: a BN for detecting credit card fraud]

• The structure can be found by relying on prior knowledge of causal relationships.
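To illustrate the factorization, here is a toy sketch of mine (a two-node net loosely echoing the fraud example; the probabilities are made up, not the slide's):

```python
# p(x) = prod_i p(x_i | pa_i), on a toy two-node net Fraud -> Gas.
p_fraud = {True: 0.01, False: 0.99}
p_gas_given_fraud = {True: {True: 0.20, False: 0.80},
                     False: {True: 0.01, False: 0.99}}

def joint(fraud, gas):
    """Joint probability as the product of local conditionals."""
    return p_fraud[fraud] * p_gas_given_fraud[fraud][gas]

# Inference by enumeration: P(Fraud = true | Gas = true).
gas = True
posterior = joint(True, gas) / (joint(True, gas) + joint(False, gas))
print(round(posterior, 4))  # ~0.1681
```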
From Bayes Nets to High-Order PGMs

(1) Naïve Bayes:
$$P(F \mid J, G, S, A) = \frac{P(J, G, S, A \mid F)\,P(F)}{P(J, G, S, A)} \propto P(J, G, S, A \mid F) = P(J \mid F)\,P(G \mid F)\,P(S \mid F)\,P(A \mid F) = \prod_{x \in \{J, G, S, A\}} P(x \mid F)$$

(2) Bayesian Net:
$$P(F, J, G, S, A) = P(G \mid F)\,P(J \mid F)\,P(J \mid A)\,P(J \mid S) = \prod_{x \in \{F, J, G, S, A\}} P(x \mid \mathrm{pa}(x))$$

(3) High-Order PGM:
$$P(F, J, G, S, A) = P(J, G \mid F)\,P(J, S \mid F)\,P(J, A \mid F)\,P(G, S \mid F)\,P(G, A \mid F)\,P(S, A \mid F) = \prod_{he(x,y) \in \{(x,y) \,\mid\, x, y \in \{J, G, S, A\},\ x \neq y\}} P(he(x, y) \mid F)$$
Visual Memories

• Digit Recognition
• Face Classification
• Text Classification
• Movie Title Prediction
Digit Recognition: Dataset

• Original data
  - Handwritten digits (0~9)
  - Training data: 2,630 (263 examples per class)
  - Test data: 1,130 (113 examples per class)
• Preprocessing
  - Each example is an 8x8 binary matrix.
  - Each pixel is 0 or 1.
Pattern Classification

[Figure: a "layered" hypernetwork for pattern classification, with an input layer of variables x1, ..., xn, a hidden layer of weighted hyperedges (w1, w2, ..., wm) pairing variable combinations with a Class label, and an output layer; the hyperedges form a probabilistic library (DNA representation)]

The library size and weight set are
$$m = \sum_{k=1}^{n} \binom{n}{k}, \qquad W = \left\{ w_i \;\middle|\; 1 \le i \le \sum_{k=1}^{n} \binom{n}{k},\ w_i = \text{number of copies} \right\}$$
Simulation Results – without Error Correction

|Train set| = 3760, |Test set| = 1797.
Performance Comparison

Methods                            Accuracy
MLP with 37 hidden nodes           0.941
MLP with no hidden nodes           0.901
SVM with polynomial kernel         0.926
SVM with RBF kernel                0.934
Decision Tree                      0.859
Naïve Bayes                        0.885
kNN (k=1)                          0.936
kNN (k=3)                          0.951
Hypernet with learning (k = 10)    0.923
Hypernet with sampling (k = 33)    0.949
Error Correction Algorithm

1. Initialize the library as before.
2. maxChangeCnt := librarySize.
3. For i := 0 to iteration_limit:
   1. trainCorrectCnt := 0.
   2. Run classification for all training patterns; for each correctly classified pattern, increment trainCorrectCnt.
   3. For each library element:
      1. Initialize its fitness value to 0.
      2. For each misclassified training pattern, if the library element matches that example: if the element classified it correctly, its fitness gains 2 points; else it loses 1 point.
   4. changeCnt := max{librarySize * (1.5 * (trainSetSize - trainCorrectCnt) / trainSetSize + 0.01), maxChangeCnt * 0.9}.
   5. maxChangeCnt := changeCnt.
   6. Delete the changeCnt library elements with lowest fitness and resample library elements whose classes are those of the deleted ones.
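A compact sketch of this loop (my own; `classify` and `matches` are assumed helpers, and the resampling strategy is simplified to copying surviving elements of the same class):

```python
import random

def error_correct(library, train, iteration_limit, classify, matches):
    """Sketch of the error-correction loop above.

    library: list of (hyperedge, cls); train: list of (pattern, label).
    """
    max_change = len(library)
    for _ in range(iteration_limit):
        results = [(x, y, classify(library, x)) for x, y in train]
        correct = sum(1 for _, y, pred in results if pred == y)
        fitness = [0] * len(library)
        for x, y, pred in results:
            if pred == y:
                continue  # fitness is updated on misclassified patterns only
            for j, (edge, cls) in enumerate(library):
                if matches(edge, x):
                    fitness[j] += 2 if cls == y else -1
        change = int(max(len(library) * (1.5 * (len(train) - correct)
                                         / len(train) + 0.01),
                         max_change * 0.9))
        max_change = change
        # Drop the `change` lowest-fitness elements; resample same classes.
        order = sorted(range(len(library)), key=lambda j: fitness[j])
        doomed = set(order[:change])
        survivors = [e for j, e in enumerate(library) if j not in doomed]
        for j in order[:change]:
            cls = library[j][1]
            pool = [e for e in survivors if e[1] == cls] or survivors
            survivors.append(random.choice(pool))
        library[:] = survivors
    return library
```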
Simulation Results – with Error Correction

iterationLimit = 37, librarySize = 382,300

[Charts: train and test classification ratios vs. iteration (0–35) for hyperedge orders 6 through 27]
Performance Comparison

Algorithms                      Correct classification rate
Random Forest (f=10, t=50)      94.10 %
KNN (k=4)                       93.49 %
Hypernetwork (Order=26)         92.99 %
AdaBoost (Weak Learner: J48)    91.93 %
SVM (Gaussian Kernel, SMO)      91.37 %
MLP                             90.53 %
Naïve Bayes                     87.26 %
J48                             84.86 %
Face Classification Experiments

Face Data Set
• Yale dataset
• 15 people
• 11 images per person
• 165 images in total
Training Images of a Person

• 10 for training
• The remaining 1 for test
Bitmaps for Training Data
(Dimensionality = 480)
Classification Rate by Leave-One-Out
Classification Rate
(Dimensionality = 64 by PCA)
Learning Hypernets from Movie Captions

• Order
  - Sequential
  - Range: 2~3
• Corpus
  - Friends
  - Prison Break
  - 24
Learning Hypernets from Movie Captions

• Classification
• Query generation from "I intend to marry her":
  I ? to marry her
  I intend ? marry her
  I intend to ? her
  I intend to marry ?
• Matching, e.g. for "I ? to marry her":
  order 2: I intend, I am, intend to, ...
  order 3: I intend to, intend to marry, ...
• Count the number of max-perfect-matching hyperedges.
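The query-generation step above is mechanical; a one-function sketch (mine, not the authors' code):

```python
def generate_queries(sentence):
    """Blank each word in turn: 'I intend to marry her' -> 'I ? to marry her', ..."""
    words = sentence.split()
    return [" ".join(words[:i] + ["?"] + words[i + 1:]) for i in range(len(words))]

for q in generate_queries("I intend to marry her"):
    print(q)
```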
Learning Hypernets from Movie Captions

Completion & Classification Examples

"who are you" (Corpus: Friends, 24, Prison Break)

Query       | Completion    | Classification
? are you   | what are you  | Friends
who ? you   | who are you   | Friends
who are ?   | who are you   | Friends

"you need to wear it" (Corpus: 24, Prison Break, House)

Query               | Completion          | Classification
? need to wear it   | i need to wear it   | 24
you ? to wear it    | you want to wear it | 24
you need ? wear it  | you need to wear it | 24
you need to ? it    | you need to do it   | House
you need to wear ?  | you need to wear a  | 24