Markov Logic Networks

Speaker: Benedict Fehringer
Seminar: Probabilistic Models for Information Extraction
by Dr. Martin Theobald and Maximilian Dylla
Based on Richardson, M., and Domingos, P. (2006)
Outline
• Part 1: Why we need Markov Logic Networks (MLNs)
  • Markov Networks
  • First-Order Logic
  • Conclusion and Motivation
• Part 2: How do MLNs work?
• Part 3: Are they better than other methods?
Markov Networks
• Set of variables: $X = (X_1, X_2, \ldots, X_n)$
• The distribution is given by:

$$P(X = x) = \frac{1}{Z} \prod_k \phi_k(x_{\{k\}})$$

with $Z = \sum_x \prod_k \phi_k(x_{\{k\}})$ as the normalization factor and $\phi_k$ as the potential function of clique $k$.
Markov Networks
• Representation as a log-linear model:

$$P(X = x) = \frac{1}{Z} \prod_k \phi_k(x_{\{k\}}) = \frac{1}{Z} \exp\Big(\sum_j w_j f_j(x)\Big)$$

• In our case there will be only binary features, $f_j(x) \in \{0, 1\}$:
  • Each feature corresponds to one possible state $x_{\{k\}}$.
  • The weight is equal to the log of the potential: $w_j = \log \phi_k(x_{\{k\}})$.
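To make the log-linear equivalence concrete, here is a minimal Python sketch (illustrative code, not from the paper): the product of potentials and the exponentiated weighted feature sum yield the same unnormalized value.

```python
import math

# A potential table phi over one clique of two binary variables.
phi = {(True, True): 4.0, (True, False): 1.0,
       (False, True): 1.0, (False, False): 4.0}

# Log-linear form: one binary feature per clique state, weight = log(potential).
w = {state: math.log(p) for state, p in phi.items()}

x = (True, False)  # the world, restricted to this clique

# f_j(x) is 1 exactly for the feature whose state matches x.
product_form = phi[x]
loglinear_form = math.exp(sum(wj for state, wj in w.items() if state == x))

print(product_form, loglinear_form)  # 1.0 1.0, identical by construction
```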
Little example scenario
There is a company which has a PlayStation, and every employee has the right to play with it.
If two employees are friends, then the probability is high (ω = 3) that either both play all day long or both do not.
If someone plays all day long, then the chance is high (ω = 2) that he or she gets fired.
Markov Networks
[Figure: Markov network with nodes "Has playing Friend", "Plays", and "Fired".]

$$P(X = x) = \frac{1}{Z} \prod_k \phi_k(x_{\{k\}}) = \frac{1}{Z} \exp\Big(\sum_j w_j f_j(x)\Big)$$
| Plays | Fired | ω |
|-------|-------|---|
| True  | True  | 2 |
| True  | False | 0 |
| False | True  | 2 |
| False | False | 2 |

| Friends | Plays | ω |
|---------|-------|---|
| True    | True  | 3 |
| True    | False | 0 |
| False   | True  | 0 |
| False   | False | 3 |

One possibility…
Markov Networks
And another one:

[Figure: a second possible Markov network structure, with nodes "Is Friend with", "Plays", "Some playing Employee", and "Fired".]
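Reading the table weights log-linearly, the unnormalized probability of one concrete world can be computed as in this short sketch (my own illustration, assuming each satisfied table entry contributes its weight ω to the exponent):

```python
import math

# Weight tables from the slides above.
w_friend_plays = {(True, True): 3, (True, False): 0,
                  (False, True): 0, (False, False): 3}
w_plays_fired = {(True, True): 2, (True, False): 0,
                 (False, True): 2, (False, False): 2}

# One possible world: has a playing friend, plays, is not fired.
friend, plays, fired = True, True, False

score = w_friend_plays[(friend, plays)] + w_plays_fired[(plays, fired)]
print(math.exp(score))  # exp(3 + 0); still unnormalized (divide by Z)
```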
Little example scenario
There is a company which has a PlayStation, and every employee has the right to play with it.
If two employees are friends, then the probability is high (ω = 3) that either both play all day long or both do not.
If someone plays all day long, then the chance is high (ω = 2) that he or she gets fired.
Whether an employee A can convince another employee B to play depends on B's lability (how easily B is influenced): for a highly labile B the probability is higher (ω = 4) than for a less labile B (ω = 2).
Markov Networks
[Figure: a possible Markov network for the extended scenario, with nodes "Is labile", "Is Friend with", "Playing Person", "Plays", and "Fired".]

Could be.
Markov Networks
Advantages:
• Handles uncertainty efficiently
• Tolerant of imperfect and contradictory knowledge

Disadvantages:
• Very complex networks are needed to represent a wide variety of knowledge
• Difficult to incorporate a wide range of domain knowledge
(Outline recap: Part 1 continues with First-Order Logic)
First-Order Logic
Four types of symbols:
• Constants: concrete objects in the domain (e.g., people: Anna, Bob)
• Variables: range over the objects in the domain
• Functions: mappings from tuples of objects to objects (e.g., GrandpaOf)
• Predicates: relations among objects in the domain (e.g., Friends) or attributes of objects (e.g., Fired)

• Term: any expression representing an object in the domain; a constant, a variable, or a function applied to a tuple of terms.
• Atomic formula (atom): a predicate applied to a tuple of terms.
• Logical connectives and quantifiers: $\neg, \land, \lor, \Rightarrow, \Leftrightarrow, \forall, \exists$
Translation into First-Order Logic

If two employees are friends, then the probability is high that both play all day long or both do not.

$$\forall x \forall y:\; Friends(x,y) \Rightarrow (Plays(x) \Leftrightarrow Plays(y))$$

In clausal form:
$$\neg Friends(x,y) \lor Plays(x) \lor \neg Plays(y),$$
$$\neg Friends(x,y) \lor \neg Plays(x) \lor Plays(y)$$

If someone plays all day long, then the chance is high that he or she gets fired.

$$\forall x:\; Plays(x) \Rightarrow Fired(x)$$

In clausal form:
$$\neg Plays(x) \lor Fired(x)$$
First-Order Logic
Advantages:
• Compact representation of a wide variety of knowledge
• Flexible and modular incorporation of a wide range of domain knowledge

Disadvantages:
• No way to handle uncertainty
• No handling of imperfect or contradictory knowledge
(Outline recap: Part 1 concludes with Conclusion and Motivation)
Conclusion and Motivation
Markov Networks:
• Handle uncertainty efficiently
• Tolerant of imperfect and contradictory knowledge

First-Order Logic:
• Compact representation of a wide variety of knowledge
• Flexible and modular incorporation of a wide range of domain knowledge

→ Combine Markov networks and first-order logic to exploit the advantages of both.
(Outline recap: Part 2, How do MLNs work?)
Markov Logic Network
The MLN workflow:
1. Description of the problem
2. Translation into First-Order Logic
3. Construction of an MLN "template"
4. Derivation of a concrete MLN for a given set of constants
5. Compute whatever you want
(Pipeline, step 2: Translation into First-Order Logic)
Markov Logic Network
Translation into First-Order Logic (as derived in Part 1; see the clauses above).
(Pipeline, step 3: Construction of an MLN "template")
Markov Logic Network
• Each formula corresponds to one clique.
• Each formula has a weight that reflects the importance of this formula.
• If a world violates a formula, it becomes less probable but not impossible.
  • Concretely: the violated formula's weight is simply not added (it contributes 0).
24
Markov Logic Networks
Markov Logic Network
$$P(X = x) = \frac{1}{Z} \prod_i \phi_i(x_{\{i\}})^{n_i(x)} = \frac{1}{Z} \exp\Big(\sum_i w_i\, n_i(x)\Big)$$

where $n_i(x)$ is the number of true groundings of formula $i$ in world $x$.

To compare, the Markov network distribution:

$$P(X = x) = \frac{1}{Z} \prod_k \phi_k(x_{\{k\}}) = \frac{1}{Z} \exp\Big(\sum_j w_j f_j(x)\Big)$$

Three assumptions:
1. Unique names
2. Domain closure
3. Known functions
(Pipeline, step 4: Derive a concrete MLN for a given set of constants)
Markov Logic Network
Grounding (with constants c1 and c2) proceeds in three steps: elimination of the existential quantifiers, elimination of the universal quantifiers, and elimination of the functions.

Elimination of the existential quantifier:

$$\forall x \exists y:\; Plays(x) \Rightarrow Fired(y) \;\;\Longrightarrow\;\; \forall x:\; Plays(x) \Rightarrow Fired(c_1) \lor Fired(c_2)$$

Elimination of the universal quantifier:

$$\forall x:\; Plays(x) \Rightarrow Fired(c_1) \lor Fired(c_2) \;\;\Longrightarrow\;\; Plays(c_1) \Rightarrow Fired(c_1) \lor Fired(c_2), \quad Plays(c_2) \Rightarrow Fired(c_1) \lor Fired(c_2)$$

For example, with Plays(c1) = True, Plays(c2) = False, Fired(c1) = True, Fired(c2) = False, the two ground formulas become:

$$True \Rightarrow True \lor False, \qquad False \Rightarrow True \lor False$$
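A minimal grounding sketch in Python (representation and names are my own, not the authors'): a clause is a list of literals, and grounding substitutes every combination of constants for the universally quantified variables.

```python
from itertools import product

def ground_clause(clause, variables, constants):
    """Yield one ground clause per assignment of constants to variables.

    A literal is (negated, predicate, args); args may contain variables.
    """
    for binding in product(constants, repeat=len(variables)):
        env = dict(zip(variables, binding))
        yield [(neg, pred, tuple(env.get(a, a) for a in args))
               for neg, pred, args in clause]

# not Plays(x) or Fired(c1) or Fired(c2), after eliminating the existential quantifier
clause = [(True, "Plays", ("x",)),
          (False, "Fired", ("c1",)),
          (False, "Fired", ("c2",))]

for ground in ground_clause(clause, ["x"], ["c1", "c2"]):
    print(ground)
# [(True, 'Plays', ('c1',)), (False, 'Fired', ('c1',)), (False, 'Fired', ('c2',))]
# [(True, 'Plays', ('c2',)), (False, 'Fired', ('c1',)), (False, 'Fired', ('c2',))]
```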
Markov Networks
[Figure: the ground Markov network over the atoms Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B), Plays(A), and Plays(B), built from the clauses:]

$$\neg Friends(x,y) \lor Plays(x) \lor \neg Plays(y),$$
$$\neg Friends(x,y) \lor \neg Plays(x) \lor Plays(y)$$

Constants: Alice (A) and Bob (B)
Markov Logic Network
[Figure: the ground network extended with Fired(A) and Fired(B) via the clause:]

$$\neg Plays(x) \lor Fired(x)$$

Constants: Alice (A) and Bob (B)
Markov Logic Network
Each state of a ground clique receives the weight of its formula if the formula is satisfied in that state, and 0 otherwise:

| Friends(x,y) | Plays(x) | Plays(y) | ω |
|---|---|---|---|
| True  | True  | True  | 3 |
| True  | True  | False | 0 |
| True  | False | True  | 0 |
| True  | False | False | 3 |
| False | True  | True  | 3 |
| False | True  | False | 3 |
| False | False | True  | 3 |
| False | False | False | 3 |

| Plays(x) | Fired(x) | ω |
|---|---|---|
| True  | True  | 2 |
| True  | False | 0 |
| False | True  | 2 |
| False | False | 2 |

[Figure: the ground network over Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B), Plays(A), Plays(B), Fired(A), and Fired(B), annotated with these tables, for the clauses:]

$$\neg Friends(x,y) \lor Plays(x) \lor \neg Plays(y), \qquad \neg Friends(x,y) \lor \neg Plays(x) \lor Plays(y), \qquad \neg Plays(x) \lor Fired(x)$$
(Pipeline, step 5: Compute whatever you want)
Markov Logic Network
What is the probability that Alice and Bob are friends and both play PlayStation all day long, but neither gets fired?

The queried world: Friends(A,A) = Friends(A,B) = Friends(B,A) = Friends(B,B) = 1, Plays(A) = Plays(B) = 1, Fired(A) = Fired(B) = 0.

$$P(X = x_{(Alice,Bob)}) = \frac{1}{Z} \exp\Big(\sum_{i=1}^{3} \omega_i\, n_i(x_{(Alice,Bob)})\Big) = \frac{1}{Z} \exp(3 \cdot 4 + 3 \cdot 4 + 2 \cdot 0) = \frac{1}{Z} \exp(24)$$

Each of the two friendship clauses (weight 3) has 4 true groundings, one per pair $(x,y) \in \{A,B\}^2$; the clause $\neg Plays(x) \lor Fired(x)$ (weight 2) has 0 true groundings, since both play and neither is fired.
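The same number can be checked by brute-force enumeration, as a sketch (my own code, not the authors'): score every world over the eight ground atoms, normalize, and read off the probability of the queried world.

```python
import math
from itertools import product

consts = ["A", "B"]
atoms = ([("Friends", (a, b)) for a in consts for b in consts]
         + [("Plays", (a,)) for a in consts]
         + [("Fired", (a,)) for a in consts])

def score(world):
    """Weighted count of true ground clauses: 3*n1 + 3*n2 + 2*n3."""
    n1 = n2 = n3 = 0
    for x, y in product(consts, repeat=2):
        f = world[("Friends", (x, y))]
        px, py = world[("Plays", (x,))], world[("Plays", (y,))]
        n1 += (not f) or px or (not py)   # ~Friends(x,y) v Plays(x) v ~Plays(y)
        n2 += (not f) or (not px) or py   # ~Friends(x,y) v ~Plays(x) v Plays(y)
    for x in consts:
        n3 += (not world[("Plays", (x,))]) or world[("Fired", (x,))]  # ~Plays(x) v Fired(x)
    return 3 * n1 + 3 * n2 + 2 * n3

worlds = [dict(zip(atoms, values))
          for values in product([False, True], repeat=len(atoms))]
Z = sum(math.exp(score(w)) for w in worlds)

query = {a: a[0] != "Fired" for a in atoms}  # all True except Fired(A), Fired(B)
print(score(query), math.exp(score(query)) / Z)  # 24, then the probability
```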
Markov Logic Network
What happens in the limit ω → ∞?

$$P(X = x) = \frac{1}{Z} \exp\Big(\sum_{i=1}^{3} \omega\, n_i(x)\Big)$$

If all formulas are fulfilled:
$$\frac{1}{Z} \exp(\omega \cdot 4 + \omega \cdot 4 + \omega \cdot 2) = \frac{1}{Z} \exp(\omega \cdot 10)$$

If not all formulas are fulfilled:
$$\frac{1}{Z} \exp(\omega \cdot k), \quad k < 10$$

$$\Rightarrow\; Z = n_1 \exp(\omega \cdot 10) + n_2 \exp(\omega \cdot 9) + \ldots + n_{11}$$

where $n_j$ is the number of worlds with exactly $11 - j$ true groundings.
Markov Logic Network
What happens in the limit ω → ∞, if all formulas are fulfilled?

$$\frac{\exp(\omega \cdot 10)}{n_1 \exp(\omega \cdot 10) + n_2 \exp(\omega \cdot 9) + \ldots + n_{11}} = \frac{1}{n_1 + n_2 \exp(-\omega) + \ldots + n_{11} \exp(-10\,\omega)} \;\longrightarrow\; \frac{1}{n_1}$$
Markov Logic Network
What happens in the limit ω → ∞, if not all formulas are fulfilled ($k < 10$)?

$$\frac{\exp(\omega \cdot k)}{n_1 \exp(\omega \cdot 10) + n_2 \exp(\omega \cdot 9) + \ldots + n_{11}} = \frac{1}{n_1 \exp(\omega (10 - k)) + n_2 \exp(\omega (9 - k)) + \ldots + n_{11} \exp(-\omega k)} \;\longrightarrow\; 0$$
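Both limits can be checked numerically, as a quick sketch (the counts n1, ..., n11 of worlds per number of true groundings are made up here; any positive counts show the same behavior):

```python
import math

counts = [2, 5, 9, 14, 20, 25, 20, 14, 9, 5, 2]  # hypothetical n1..n11

def p(w, k):
    """Probability of one world with k true groundings when every weight is w."""
    Z = sum(n * math.exp(w * (10 - i)) for i, n in enumerate(counts))
    return math.exp(w * k) / Z

for w in (1, 5, 20):
    print(w, p(w, 10), p(w, 7))
# p(w, 10) approaches 1/n1 = 0.5 and p(w, 7) approaches 0 as w grows.
```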
Markov Logic Network
What is the probability that a formula $F_1$ holds given that formula $F_2$ does?

$$P(F_1 \mid F_2, L, C) = P(F_1 \mid F_2, M_{L,C}) = \frac{P(F_1 \land F_2, M_{L,C})}{P(F_2, M_{L,C})} = \frac{\sum_{x \in \mathcal{X}_{F_1} \cap\, \mathcal{X}_{F_2}} P(X = x, M_{L,C})}{\sum_{x \in \mathcal{X}_{F_2}} P(X = x, M_{L,C})}$$

where $L$ is the MLN, $C$ the set of constants, $M_{L,C}$ the induced ground Markov network, and $\mathcal{X}_F$ the set of worlds in which $F$ holds.
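In code, this definition is just a ratio of two filtered sums over possible worlds. A sketch, where `worlds`, a probability function `prob`, and a formula evaluator `holds` are assumed to exist (e.g., by enumeration as above):

```python
def conditional(f1, f2, worlds, prob, holds):
    """P(F1 | F2, M_{L,C}) as a ratio of two sums over possible worlds."""
    numerator = sum(prob(w) for w in worlds if holds(f1, w) and holds(f2, w))
    denominator = sum(prob(w) for w in worlds if holds(f2, w))
    return numerator / denominator
```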
Markov Logic Network
Learning the weights:

Counting the number of true groundings of a formula is #P-complete
⇒ approximation is necessary
⇒ use the pseudo-likelihood:

$$P^{*}_{w}(X = x) = \prod_{l=1}^{n} P_{w}\big(X_l = x_l \mid MB_x(X_l)\big)$$

where $MB_x(X_l)$ is the state of the Markov blanket of $X_l$ in world $x$.
(Outline recap: Part 3, Are they better than other methods?)
Experiment I
Setting:
• A database describing the Department of Computer Science and Engineering at the University of Washington
• 12 predicates (e.g., Professor, Student, Area, AdvisedBy, …)
• 2707 constants
• 96 formulas (the knowledge base was provided by four volunteers who did not know the database but were members of the department)
• The whole database was divided into five subsets, one per area (AI, graphics, programming languages, systems, theory)
⇒ in the end, 521 true ground atoms out of 58,457 possible ones
Experiment II
Testing:
• Leave-one-out over the areas
• Prediction of AdvisedBy(x,y)
• Either with all information or with partial information (everything except Student(x) and Professor(x))
• Drawing the precision/recall curves
• Computing the area under the curve (AUC)
Experiment III
The MLN was compared with:
• Logic (only the logical KB, without probabilities)
• Probability (only probabilistic relations, without special knowledge representation)
• Naïve Bayes (NB) and Bayesian Networks (BN)
• Inductive logic programming (automatic construction of the KB)
• CLAUDIEN (clausal discovery engine)
Results

[Figure: precision/recall curves. Results I: all areas and the AI area; Results II: the graphics area and the programming-languages area; Results III: the systems area and the theory area.]
Sample applications
• Link Prediction
• Link-Based Clustering
• Social Network Modeling
• …
Conclusion
• MLNs are a simple way to combine first-order logic and probability
• They can be seen as templates for constructing ordinary Markov networks
• Clauses can be learned with CLAUDIEN
• Empirical tests with real-world data and knowledge are promising for the use of MLNs
Literature
Richardson, M., & Domingos, P. (2006). Markov logic networks. Machine Learning, 62(1-2), 107-136.