Markov Logic

Overview
• Introduction
– Statistical Relational Learning
– Applications
– First-Order Logic
• Markov Networks
– What is it?
– Potential Functions
– Log-Linear Model
– Markov Networks vs. Bayes Networks
– Computing Probabilities
Overview
• Markov Logic
– Intuition
– Definition
– Example
– Markov Logic Networks
– MAP Inference
– Computing Probabilities
– Optimization
Introduction
Statistical Relational Learning
Goals:
• Combine (subsets of) logic and
probability into a single language
• Develop efficient inference algorithms
• Develop efficient learning algorithms
• Apply to real-world problems
L. Getoor & B. Taskar (eds.), Introduction to Statistical
Relational Learning, MIT Press, 2007.
Applications
• Professor Kautz’s GPS tracking project
– Determine people’s activities and thoughts about
activities based on their own actions as well as their
interactions with the world around them
Applications
• Collective classification
– Determine labels for a set of objects (such as Web pages)
based on their attributes as well as their relations to one
another
• Social network analysis and link prediction
– Predict relations between people based on attributes,
attributes based on relations, cluster entities based on
relations, etc. (smoker example)
• Entity resolution
– Determine which observations refer to the same real-world object
(e.g., deduplicating a database)
• etc.
First-Order Logic
• Constants, variables, functions, predicates
E.g.: Anna, x, MotherOf(x), Friends(x, y)
• Literal: Predicate or its negation
• Clause: Disjunction of literals
• Grounding: Replace all variables by
constants
E.g.: Friends(Anna, Bob)
• World (model, interpretation):
Assignment of truth values to all ground
predicates
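To make these notions concrete, here is a minimal Python sketch that enumerates the ground atoms for the constants Anna and Bob and builds one possible world as a truth assignment; the tuple/dict representation is an illustrative choice, not standard notation.

from itertools import product

constants = ["Anna", "Bob"]
predicates = {"Smokes": 1, "Cancer": 1, "Friends": 2}

# Grounding: replace variables by constants in every possible way.
ground_atoms = [(p,) + args
                for p, arity in predicates.items()
                for args in product(constants, repeat=arity)]

# A world (model, interpretation) assigns a truth value to every ground atom.
example_world = {atom: False for atom in ground_atoms}
example_world[("Friends", "Anna", "Bob")] = True
print(len(ground_atoms))   # 2 + 2 + 4 = 8 ground atoms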
Markov Networks
What is a Markov Network?
• Represents a joint distribution of
variables X
• Undirected graph
• Nodes = variables
• Clique = potential function (weight)
Markov Networks
• Undirected graphical models
[Figure: undirected graph over Smoking, Cancer, Asthma, Cough]
Potential functions defined over cliques
P(x) = (1/Z) ∏_c φ_c(x_c)

Z = Σ_x ∏_c φ_c(x_c)

Example potential φ(Smoking, Cancer):

Smoking   Cancer   φ(S,C)
False     False    4.5
False     True     4.5
True      False    2.7
True      True     4.5
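As a quick check of these formulas, here is a minimal sketch that treats (Smoking, Cancer) as a single clique with the potential table above; with only one clique, Z is just the sum of the four table entries. Purely illustrative.

phi = {(False, False): 4.5, (False, True): 4.5,
       (True, False): 2.7, (True, True): 4.5}

Z = sum(phi.values())                      # Z = sum_x prod_c phi_c(x_c)
P = {x: v / Z for x, v in phi.items()}     # P(x) = phi(x) / Z
print(P[(True, False)])                    # P(Smoking=True, Cancer=False)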
Markov Networks
• Undirected graphical models
[Figure: undirected graph over Smoking, Cancer, Asthma, Cough]
Log-linear model:

P(x) = (1/Z) exp( Σ_i w_i f_i(x) )

where w_i is the weight of feature i and f_i is feature i, e.g.:

f_1(Smoking, Cancer) = 1 if ¬Smoking ∨ Cancer, 0 otherwise
w_1 = 1.5
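Any strictly positive potential table can be rewritten in this log-linear form. A minimal sketch under that assumption, using one indicator feature per clique configuration with weight log(φ); names are illustrative.

import math

phi = {(False, False): 4.5, (False, True): 4.5,
       (True, False): 2.7, (True, True): 4.5}

weights = {x: math.log(v) for x, v in phi.items()}   # w_x = log phi(x)

def unnormalized(x):
    # exp(sum_i w_i f_i(x)) where only the indicator feature matching x fires.
    return math.exp(weights[x])

Z = sum(unnormalized(x) for x in phi)
print(unnormalized((True, False)) / Z)   # same P(x) as with the potential table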
Markov Nets vs. Bayes Nets
Property          Markov Nets         Bayes Nets
Form              Prod. potentials    Prod. potentials
Potentials        Arbitrary           Cond. probabilities
Cycles            Allowed             Forbidden
Partition func.   Z = ?               Z = 1
Indep. check      Graph separation    D-separation
Inference         MCMC, BP, etc.      Convert to Markov
Computing Probabilities
• Goal: Compute marginals & conditionals of

P(X) = (1/Z) exp( Σ_i w_i f_i(X) )

Z = Σ_X exp( Σ_i w_i f_i(X) )
• Exact inference is #P-complete
• Approximate inference
– Monte Carlo methods
– Belief propagation
– Variational approximations
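For intuition about why approximation is needed, here is a minimal brute-force sketch over a four-variable log-linear model; the 2^n enumeration it performs is exactly what becomes infeasible at scale. The second feature and its weight 0.8 are made up for illustration.

import itertools
import math

variables = ["Smoking", "Cancer", "Asthma", "Cough"]
# Each feature: (weight, function mapping a state dict to 0/1).
features = [
    (1.5, lambda s: 1 if (not s["Smoking"]) or s["Cancer"] else 0),
    (0.8, lambda s: 1 if (not s["Asthma"]) or s["Cough"] else 0),   # assumed
]

def unnorm(state):
    return math.exp(sum(w * f(state) for w, f in features))

states = [dict(zip(variables, bits))
          for bits in itertools.product([False, True], repeat=len(variables))]
Z = sum(unnorm(s) for s in states)
p_cancer = sum(unnorm(s) for s in states if s["Cancer"]) / Z   # marginal P(Cancer)
print(Z, p_cancer)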
Markov Logic
Markov Logic: Intuition
• A logical KB is a set of hard constraints
on the set of possible worlds
• Let’s make them soft constraints:
When a world violates a formula,
it becomes less probable, not impossible
• Give each formula a weight
(Higher weight ⇒ Stronger constraint)

P(world) ∝ exp( Σ weights of formulas it satisfies )
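As a small illustrative calculation with the weights used in the example later (1.5 and 1.1): a world that satisfies one more grounding of the weight-1.1 formula than an otherwise identical world is e^1.1 ≈ 3 times more probable, but never infinitely more probable as it would be under a hard constraint.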
Markov Logic: Definition
• A Markov Logic Network (MLN) is a set of
pairs (F, w) where
– F is a formula in first-order logic
– w is a real number
• Together with a set of constants,
it defines a Markov network with
– One node for each grounding of each
predicate in the MLN
– One feature for each grounding of each
formula F in the MLN, with the
corresponding weight w
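A minimal sketch of this grounding step, using the Friends & Smokers formulas from the next slides; the encoding (lambdas over a world dict, the helper implies) is an illustrative assumption, not any particular MLN package.

from itertools import product

constants = ["Anna", "Bob"]
predicates = {"Smokes": 1, "Cancer": 1, "Friends": 2}

# One node per grounding of each predicate.
nodes = [(p,) + args
         for p, arity in predicates.items()
         for args in product(constants, repeat=arity)]

def implies(a, b):
    return (not a) or b

# One feature per grounding of each weighted formula.
ground_features = []
for x in constants:                       # 1.5  Smokes(x) => Cancer(x)
    ground_features.append(
        (1.5, lambda w, x=x: implies(w[("Smokes", x)], w[("Cancer", x)])))
for x in constants:                       # 1.1  Friends(x,y) => (Smokes(x) <=> Smokes(y))
    for y in constants:
        ground_features.append(
            (1.1, lambda w, x=x, y=y: implies(w[("Friends", x, y)],
                                              w[("Smokes", x)] == w[("Smokes", y)])))

print(len(nodes), len(ground_features))   # 8 nodes, 6 ground features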
Example: Friends & Smokers
Smoking causes cancer.
Friends have similar smoking habits.
Example: Friends & Smokers
∀x Smokes(x) ⇒ Cancer(x)
∀x,y Friends(x,y) ⇒ ( Smokes(x) ⇔ Smokes(y) )
Example: Friends & Smokers
1.5   ∀x Smokes(x) ⇒ Cancer(x)
1.1   ∀x,y Friends(x,y) ⇒ ( Smokes(x) ⇔ Smokes(y) )
Example: Friends & Smokers
1.5   ∀x Smokes(x) ⇒ Cancer(x)
1.1   ∀x,y Friends(x,y) ⇒ ( Smokes(x) ⇔ Smokes(y) )
Two constants: Anna (A) and Bob (B)
Example: Friends & Smokers
1.5   ∀x Smokes(x) ⇒ Cancer(x)
1.1   ∀x,y Friends(x,y) ⇒ ( Smokes(x) ⇔ Smokes(y) )
Two constants: Anna (A) and Bob (B)
[Ground network nodes: Smokes(A), Smokes(B), Cancer(A), Cancer(B)]
Example: Friends & Smokers
1.5   ∀x Smokes(x) ⇒ Cancer(x)
1.1   ∀x,y Friends(x,y) ⇒ ( Smokes(x) ⇔ Smokes(y) )
Two constants: Anna (A) and Bob (B)
[Ground network nodes: Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B), Smokes(A), Smokes(B), Cancer(A), Cancer(B)]
Markov Logic Networks
• MLN is template for ground Markov nets
• Typed variables and constants greatly
reduce size of ground Markov net
• Probability of a world x:
P(x) = (1/Z) exp( Σ_i w_i n_i(x) )

where w_i is the weight of formula i and n_i(x) is the number of true groundings of formula i in x.
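For the two-constant Friends & Smokers network above, this distribution can still be computed exactly by enumeration. A minimal sketch (names are illustrative, and the brute force only works for tiny domains):

import itertools
import math

consts = ["A", "B"]
# Ground atoms: Smokes(x), Cancer(x), Friends(x, y)
atoms = ([("Smokes", c) for c in consts] +
         [("Cancer", c) for c in consts] +
         [("Friends", x, y) for x in consts for y in consts])

def n_true_groundings(world):
    # Count true groundings of each weighted formula in a world.
    n1 = sum(1 for x in consts
             if (not world[("Smokes", x)]) or world[("Cancer", x)])   # Smokes(x) => Cancer(x)
    n2 = sum(1 for x in consts for y in consts
             if (not world[("Friends", x, y)])
             or (world[("Smokes", x)] == world[("Smokes", y)]))       # Friends(x,y) => (Smokes(x) <=> Smokes(y))
    return n1, n2

w1, w2 = 1.5, 1.1
worlds, scores = [], []
for bits in itertools.product([False, True], repeat=len(atoms)):
    world = dict(zip(atoms, bits))
    n1, n2 = n_true_groundings(world)
    worlds.append(world)
    scores.append(math.exp(w1 * n1 + w2 * n2))   # unnormalized weight of this world

Z = sum(scores)
# e.g. the marginal P(Cancer(A)):
p_cancer_a = sum(s for w, s in zip(worlds, scores) if w[("Cancer", "A")]) / Z
print(p_cancer_a)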
Compare with Markov networks:

P(x) = (1/Z) ∏_c φ_c(x_c) = (1/Z) exp( Σ_i w_i f_i(x) ),   Z = Σ_x ∏_c φ_c(x_c)
MAP Inference
• Problem: Find most likely state of
world given evidence
arg max_y P(y | x)

(y = query, x = evidence)
MAP Inference
• Problem: Find most likely state of
world given evidence
arg max_y (1/Z_x) exp( Σ_i w_i n_i(x, y) )
MAP Inference
• Problem: Find most likely state of
world given evidence
arg max_y Σ_i w_i n_i(x, y)

(Z_x does not depend on y and exp is monotonic, so maximizing the weighted sum of true groundings is equivalent.)
MAP Inference
• Problem: Find most likely state of
world given evidence
arg max_y Σ_i w_i n_i(x, y)
• This is just the weighted MaxSAT
problem
• Use weighted SAT solver
(e.g., MaxWalkSAT [Kautz et al., 1997] )
The MaxWalkSAT Algorithm
for i := 1 to max-tries do
    solution = random truth assignment
    for j := 1 to max-flips do
        if weights(sat. clauses) > threshold then
            return solution
        c := random unsatisfied clause
        with probability p
            flip a random variable in c
        else
            flip variable in c that maximizes weights(sat. clauses)
return failure, best solution found
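A minimal Python sketch of this procedure, under simple assumptions about the encoding (a clause is a weight plus a list of (variable, required value) literals); it is illustrative, not the reference implementation from Kautz et al.

import random

def sat_weight(clauses, assign):
    # Total weight of satisfied clauses.
    return sum(w for w, lits in clauses
               if any(assign[v] == val for v, val in lits))

def flipped(assign, v):
    a = dict(assign)
    a[v] = not a[v]
    return a

def maxwalksat(clauses, variables, max_tries=10, max_flips=1000,
               target=None, p=0.5):
    best = None
    for _ in range(max_tries):
        assign = {v: random.choice([True, False]) for v in variables}
        for _ in range(max_flips):
            score = sat_weight(clauses, assign)
            if best is None or score > best[0]:
                best = (score, dict(assign))
            if target is not None and score >= target:
                return assign                      # weight above threshold: return solution
            unsat = [lits for w, lits in clauses
                     if not any(assign[v] == val for v, val in lits)]
            if not unsat:
                return assign                      # everything satisfied
            lits = random.choice(unsat)            # random unsatisfied clause
            if random.random() < p:
                v = random.choice(lits)[0]         # random walk: flip a random variable in it
            else:                                  # greedy: flip the variable that helps most
                v = max((var for var, _ in lits),
                        key=lambda var: sat_weight(clauses, flipped(assign, var)))
            assign[v] = not assign[v]
    return best[1]                                 # failure: return best assignment found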
Computing Probabilities
• P(Formula|MLN,C) = ?
• Brute force: Sum the probabilities of worlds where the formula holds
• MCMC: Sample worlds, check whether the formula holds
• P(Formula1 | Formula2, MLN, C) = ?
– Discard worlds where Formula2 does not hold
– Slow! Can use Gibbs sampling instead
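A minimal Gibbs-sampling sketch for estimating P(query atom | evidence) in an MLN whose unnormalized world weight is given by a score(world) function, e.g. exp(Σ_i w_i n_i(x)) from the earlier sketch; the interface is an illustrative assumption. With Formula2 represented as fixed ground literals (the evidence) and Formula1 as a ground atom, this estimates the conditional above without discarding samples.

import random

def gibbs_estimate(atoms, score, query, evidence, n_samples=10000, burn_in=1000):
    # Start from a world consistent with the evidence; other atoms random.
    world = {a: evidence.get(a, random.choice([True, False])) for a in atoms}
    free = [a for a in atoms if a not in evidence]
    hits = 0
    for t in range(burn_in + n_samples):
        for a in free:
            # Resample atom a from its conditional given all the other atoms.
            world[a] = True
            w_true = score(world)
            world[a] = False
            w_false = score(world)
            world[a] = random.random() < w_true / (w_true + w_false)
        if t >= burn_in and world[query]:
            hits += 1
    return hits / n_samples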
Weight Learning
• Given formulas without weights, we can learn the weights from data
• Given a training set of labeled instances, find the w_i that maximize
the (log-)likelihood of the data
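The gradient of the log-likelihood with respect to w_i is the observed count n_i(x) minus the expected count E_w[n_i(X)] under the current weights. A minimal gradient-ascent sketch, with the expectation computed by brute-force enumeration (only feasible for tiny domains; in practice sampling or pseudo-likelihood is used) and all function names illustrative:

import math

def learn_weights(worlds, n_counts, observed, n_formulas, lr=0.1, steps=200):
    # worlds: all possible worlds; n_counts(x) -> tuple of n_i(x) per formula;
    # observed: the single training world (generative weight learning).
    w = [0.0] * n_formulas
    n_obs = n_counts(observed)
    for _ in range(steps):
        # Unnormalized weight of every world under the current w.
        scores = [math.exp(sum(wi * ni for wi, ni in zip(w, n_counts(x))))
                  for x in worlds]
        Z = sum(scores)
        # Expected counts E_w[n_i] under the current model.
        expected = [sum(s * n_counts(x)[i] for x, s in zip(worlds, scores)) / Z
                    for i in range(n_formulas)]
        # Gradient of the log-likelihood: observed minus expected counts.
        w = [wi + lr * (n_obs[i] - expected[i]) for i, wi in enumerate(w)]
    return w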
References
• P. Domingos & D. Lowd, Markov Logic: An Interface Layer for
Artificial Intelligence, Synthesis Lectures on Artificial Intelligence
and Machine Learning, Morgan & Claypool, 2009.
• Most of the slides were taken from P. Domingos’ course website:
http://www.cs.washington.edu/homes/pedrod/803/
Thank You!