Markov Logic Networks
Speaker: Benedict Fehringer
Seminar: Probabilistic Models for Information Extraction, by Dr. Martin Theobald and Maximilian Dylla
Based on Richardson, M., and Domingos, P. (2006)

Outline
• Part 1: Why do we need Markov Logic Networks (MLNs)?
  • Markov Networks
  • First-Order Logic
  • Conclusion and Motivation
• Part 2: How do MLNs work?
• Part 3: Are they better than other methods?

Part 1: Why do we need Markov Logic Networks?

Markov Networks
Set of variables: X = (X_1, X_2, \ldots, X_n)
The distribution is given by
P(X = x) = \frac{1}{Z} \prod_k \phi_k(x_{\{k\}})
with the normalization factor Z = \sum_x \prod_k \phi_k(x_{\{k\}}) and the potential functions \phi_k.

Markov Networks
Representation as a log-linear model:
P(X = x) = \frac{1}{Z} \prod_k \phi_k(x_{\{k\}}) = \frac{1}{Z} \exp\Big(\sum_j w_j f_j(x)\Big)
In our case there are only binary features, f_j(x) \in \{0, 1\}: each feature corresponds to one possible state x_{\{k\}}, and its weight is the log of the potential, w_j = \log \phi_k(x_{\{k\}}).

Little example scenario
There is a company which has a Playstation, and every employee of the company has the right to play with it. If two employees are friends, then the probability is high (ω = 3) that either both play all day long or neither does. If someone plays all day long, then the chance is high (ω = 2) that he or she gets fired.
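As a minimal sketch of the product-of-potentials definition above (not from the slides; the variable names and potential values are illustrative), the joint distribution of a tiny Markov network can be computed by brute force:

```python
import itertools

# Toy Markov network over three binary variables (illustrative names).
# Each potential phi_k scores a joint state of its clique.
def phi_friend_plays(friend, plays):      # clique {Friend, Plays}
    return 3.0 if friend == plays else 1.0

def phi_plays_fired(plays, fired):        # clique {Plays, Fired}
    return 2.0 if (not plays or fired) else 1.0

def unnormalized(friend, plays, fired):
    # product of the clique potentials
    return phi_friend_plays(friend, plays) * phi_plays_fired(plays, fired)

# The normalization factor Z sums the product of potentials over all states.
states = list(itertools.product([False, True], repeat=3))
Z = sum(unnormalized(*s) for s in states)

def p(friend, plays, fired):
    return unnormalized(friend, plays, fired) / Z
```

Summing `p` over all eight states yields 1, as the definition of Z guarantees.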
Markov Networks
One possible network for the scenario: nodes for "Is friend with", "Plays", and "Fired", with
P(X = x) = \frac{1}{Z} \prod_k \phi_k(x_{\{k\}}) = \frac{1}{Z} \exp\Big(\sum_j w_j f_j(x)\Big)
and the potential tables

Plays | Fired | ω
True  | True  | 2
True  | False | 0
False | True  | 2
False | False | 2

Friends | Plays | ω
True  | True  | 3
True  | False | 0
False | True  | 0
False | False | 3

[Figure: an alternative network structure linking "Is friend with", "Plays", and "Fired" per employee.] And another one.

Little example scenario (extended)
There is a company which has a Playstation, and every employee of the company has the right to play with it. If two employees are friends, then the probability is high (ω = 3) that either both play all day long or neither does. If someone plays all day long, then the chance is high (ω = 2) that he or she gets fired. Whether an employee A can convince another employee B to play depends on the lability of B: for a high lability of B the probability is higher (ω = 4) than for a low lability (ω = 2).

Markov Networks
[Figure: a possible network adding an "Is labile" node connected to "Is friend with", "Plays", and "Fired".] Could be.

Markov Networks
Advantages:
• Efficient handling of uncertainty
• Tolerant of imperfect and contradictory knowledge
Disadvantages:
• Very complex networks for a wide variety of knowledge
• Difficult to incorporate a wide range of domain knowledge

First-Order Logic
Four types of symbols:
• Constants: concrete objects in the domain (e.g., people: Anna, Bob)
• Variables: range over the objects in the domain
• Functions: mappings from tuples of objects to objects (e.g., GrandpaOf)
• Predicates: relations among objects in the domain (e.g., Friends) or attributes of objects (e.g., Fired)
Term: any expression representing an object in the domain.
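The potential tables above connect to the log-linear view via w_j = log φ_k. A small sketch (illustrative values; the table entries are the ω weights from the slide, stored as potentials φ = e^ω):

```python
import math

# The Plays/Fired potential table, with phi = exp(omega); recovering the
# log-linear weight of each state feature is then just a log.
potential = {  # (Plays, Fired) -> phi value
    (True,  True):  math.exp(2.0),
    (True,  False): math.exp(0.0),
    (False, True):  math.exp(2.0),
    (False, False): math.exp(2.0),
}

# w_j = log(phi_k), one binary feature per table row
weights = {state: math.log(phi) for state, phi in potential.items()}
```

The recovered weights are exactly the ω column of the table (2, 0, 2, 2), illustrating why the two representations are interchangeable.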
A term consists of a constant, a variable, or a function applied to a tuple of terms.
Atomic formula (atom): a predicate applied to a tuple of terms.
Logical connectives and quantifiers: \neg, \land, \lor, \Rightarrow, \Leftrightarrow, \forall, \exists

Translation into First-Order Logic
If two employees are friends, then the probability is high that either both play all day long or neither does:
\forall x \forall y: Friends(x, y) \Rightarrow (Plays(x) \Leftrightarrow Plays(y))
In clausal form:
\neg Friends(x, y) \lor Plays(x) \lor \neg Plays(y)
\neg Friends(x, y) \lor \neg Plays(x) \lor Plays(y)
If someone plays all day long, then the chance is high that he or she gets fired:
\forall x: Plays(x) \Rightarrow Fired(x)
In clausal form:
\neg Plays(x) \lor Fired(x)

First-Order Logic
Advantages:
• Compact representation of a wide variety of knowledge
• Flexible and modular incorporation of a wide range of domain knowledge
Disadvantages:
• No way to handle uncertainty
• No handling of imperfect and contradictory knowledge

Conclusion and Motivation
Markov Networks: efficient handling of uncertainty; tolerant of imperfect and contradictory knowledge.
First-Order Logic: compact representation of a wide variety of knowledge; flexible and modular incorporation of a wide range of domain knowledge.
→ Combine Markov Networks and First-Order Logic to use the advantages of both.

Part 2: How do MLNs work?
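The clausal-form translation of Friends(x, y) ⇒ (Plays(x) ⇔ Plays(y)) can be sanity-checked mechanically by enumerating all truth assignments (a small sketch, nothing beyond propositional logic):

```python
import itertools

def implies(a, b):
    return (not a) or b

# Over all 8 assignments, the original implication must agree with the
# conjunction of the two clauses given in the clausal form.
for f, px, py in itertools.product([False, True], repeat=3):
    formula = implies(f, px == py)
    clause1 = (not f) or px or (not py)   # ¬Friends ∨ Plays(x) ∨ ¬Plays(y)
    clause2 = (not f) or (not px) or py   # ¬Friends ∨ ¬Plays(x) ∨ Plays(y)
    assert formula == (clause1 and clause2)
```

The loop passes for every assignment, confirming the equivalence of the two representations.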
Markov Logic Network
Pipeline: description of the problem → translation into First-Order Logic → construction of an MLN "template" → derivation of a concrete MLN for a given set of constants → compute whatever you want.

Markov Logic Network
• Each formula corresponds to one clique.
• Each formula has a weight that reflects the importance of this formula.
• If a world violates a formula, it becomes less probable, but not impossible. Concretely, the weight of that formula is simply not added (i.e., it contributes 0).

Markov Logic Network
P(X = x) = \frac{1}{Z} \prod_i \phi_i(x_{\{i\}})^{n_i(x)} = \frac{1}{Z} \exp\Big(\sum_i w_i n_i(x)\Big)
where n_i(x) is the number of true groundings of formula F_i in the world x.
Compare with the log-linear Markov network:
P(X = x) = \frac{1}{Z} \prod_k \phi_k(x_{\{k\}}) = \frac{1}{Z} \exp\Big(\sum_j w_j f_j(x)\Big)
Three assumptions:
1. Unique names
2. Domain closure
3.
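The scoring rule P(X = x) ∝ exp(Σ_i w_i n_i(x)) can be sketched directly in code. This is an illustrative representation (a world as a dict of ground atoms; the formula and constants are taken from the running example, the weight w = 2 from the scenario):

```python
import math

# n_i(x): number of true groundings of ¬Plays(c) ∨ Fired(c) in world x
def n_plays_implies_fired(world, constants):
    return sum(1 for c in constants
               if (not world[("Plays", c)]) or world[("Fired", c)])

# Unnormalized MLN probability: exp(w * n(x)) for this single formula
def score(world, constants, w=2.0):
    return math.exp(w * n_plays_implies_fired(world, constants))

world = {("Plays", "A"): True,  ("Fired", "A"): False,
         ("Plays", "B"): False, ("Fired", "B"): False}
# The grounding for A is violated, the one for B is satisfied,
# so n = 1 and the score is exp(2).
```

Note that the violated grounding is not fatal: it just fails to contribute its weight, exactly as the slide describes.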
Known functions

Markov Logic Network
Grounding (with constants c1 and c2) proceeds by eliminating the existential quantifiers, then the universal quantifiers, then the functions:
\forall x \exists y: Plays(x) \Rightarrow Fired(y)
=> \forall x: Plays(x) \Rightarrow (Fired(c1) \lor Fired(c2))
=> (Plays(c1) \Rightarrow Fired(c1) \lor Fired(c2)) \land (Plays(c2) \Rightarrow Fired(c1) \lor Fired(c2))

Markov Networks (constants: Alice (A) and Bob (B))
Ground network for
\neg Friends(x, y) \lor Plays(x) \lor \neg Plays(y)
\neg Friends(x, y) \lor \neg Plays(x) \lor Plays(y)
[Figure: nodes Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B), Plays(A), Plays(B).]

Markov Logic Network (constants: Alice (A) and Bob (B))
Adding \neg Plays(x) \lor Fired(x) introduces the ground atoms Fired(A) and Fired(B).

Markov Logic Network
Weight tables of the ground cliques:

Friends(x,y) | Plays(x) | Plays(y) | ω
True  | True  | True  | 3
True  | True  | False | 0
True  | False | True  | 0
True  | False | False | 3
False | True  | True  | 3
False | True  | False | 3
False | False | True  | 3
False | False | False | 3

Plays(x) | Fired(x) | ω
True  | True  | 2
True  | False | 0
False | True  | 2
False | False | 2

Markov Logic Network
What is the probability that Alice and Bob are friends, both
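Grounding a universally quantified clause for a finite set of constants is mechanical substitution; a short sketch (the tuple encoding of literals is an illustrative choice, not from the slides):

```python
import itertools

# Ground ¬Friends(x,y) ∨ Plays(x) ∨ ¬Plays(y) for a finite constant set:
# one ground clause per (x, y) pair of constants.
def ground_clauses(constants):
    clauses = []
    for x, y in itertools.product(constants, repeat=2):
        clauses.append((("not", ("Friends", x, y)),
                        ("Plays", x),
                        ("not", ("Plays", y))))
    return clauses

grounded = ground_clauses(["A", "B"])
# 2 constants -> 2 * 2 = 4 ground clauses
```

With constants Alice (A) and Bob (B), this yields the four ground clauses whose atoms are exactly the nodes Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B), Plays(A), Plays(B) shown in the ground network.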
play Playstation all day long, but neither of them gets fired?
World: Friends(A,A) = Friends(A,B) = Friends(B,A) = Friends(B,B) = 1, Plays(A) = Plays(B) = 1, Fired(A) = Fired(B) = 0.
P(X = x_{(Alice,Bob)}) = \frac{1}{Z} \exp\Big(\sum_i w_i n_i(x_{(Alice,Bob)})\Big) = \frac{1}{Z} \exp(3 \cdot 4 + 3 \cdot 4 + 2 \cdot 0) = \frac{1}{Z} \exp(24)

Markov Logic Network
What happens in the limit ω → ∞ (all weights set to ω)?
P(X = x) = \frac{1}{Z} \exp\Big(\sum_i \omega \, n_i(x)\Big)
If all ground formulas are fulfilled: \frac{1}{Z} \exp(\omega \cdot 4 + \omega \cdot 4 + \omega \cdot 2) = \frac{1}{Z} \exp(10\,\omega)
If not all are fulfilled: \frac{1}{Z} \exp(\omega \cdot k) with k < 10
=> Z = n_1 \exp(10\,\omega) + n_2 \exp(9\,\omega) + \ldots + n_{11}, where n_j is the number of worlds in which 11 - j groundings are fulfilled.

If all formulas are fulfilled:
\frac{\exp(10\,\omega)}{n_1 \exp(10\,\omega) + n_2 \exp(9\,\omega) + \ldots + n_{11}} = \frac{1}{n_1 + n_2 \exp(-\omega) + \ldots + n_{11} \exp(-10\,\omega)} \to \frac{1}{n_1}

If not all formulas are fulfilled (k < 10):
\frac{\exp(k\,\omega)}{n_1 \exp(10\,\omega) + n_2 \exp(9\,\omega) + \ldots + n_{11}} = \frac{1}{n_1 \exp((10 - k)\,\omega) + n_2 \exp((9 - k)\,\omega) + \ldots + n_{11} \exp(-k\,\omega)} \to 0

In the limit, the MLN thus behaves like pure first-order logic: all worlds that satisfy every formula become equally probable, and all other worlds become impossible.

Markov Logic Network
What is the probability that a formula F_1 holds given that formula F_2 does?
P(F_1 \mid F_2, L, C) = P(F_1 \mid F_2, M_{L,C}) = \frac{P(F_1 \land F_2 \mid M_{L,C})}{P(F_2 \mid M_{L,C})} = \frac{\sum_{x \in X_{F_1} \cap X_{F_2}} P(X = x \mid M_{L,C})}{\sum_{x \in X_{F_2}} P(X = x \mid M_{L,C})}

Markov Logic Network
Learning the weights: counting the number of true groundings is #P-complete, so an approximation is necessary; the pseudo-likelihood is used instead:
P^*_w(X = x) = \prod_{l=1}^{n} P_w(X_l = x_l \mid MB_x(X_l))
where MB_x(X_l) is the Markov blanket of X_l in x.

Part 3: Are they better than other methods?
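The query about Alice and Bob is small enough to verify by brute force: with 8 ground atoms there are only 2^8 = 256 worlds, so Z can be summed exactly. A sketch under the slides' weights (3 for each Friends/Plays clause, 2 for the Plays/Fired clause; the dict-of-atoms world encoding is an illustrative choice):

```python
import itertools
import math

CONSTS = ["A", "B"]
ATOMS = ([("Friends", x, y) for x in CONSTS for y in CONSTS]
         + [("Plays", c) for c in CONSTS]
         + [("Fired", c) for c in CONSTS])

def exponent(world):
    # sum_i w_i * n_i(world) over the three ground clause templates
    s = 0.0
    for x, y in itertools.product(CONSTS, repeat=2):
        f = world[("Friends", x, y)]
        px, py = world[("Plays", x)], world[("Plays", y)]
        s += 3.0 * ((not f) or px or (not py))   # ¬F(x,y) ∨ P(x) ∨ ¬P(y)
        s += 3.0 * ((not f) or (not px) or py)   # ¬F(x,y) ∨ ¬P(x) ∨ P(y)
    for c in CONSTS:
        s += 2.0 * ((not world[("Plays", c)]) or world[("Fired", c)])
    return s

# Z sums exp(exponent) over all 2^8 possible worlds.
Z = sum(math.exp(exponent(dict(zip(ATOMS, vals))))
        for vals in itertools.product([False, True], repeat=len(ATOMS)))

# The query world: everyone friends, both play, nobody fired.
query = {a: True for a in ATOMS}
query[("Fired", "A")] = query[("Fired", "B")] = False
# exponent(query) = 3*4 + 3*4 + 2*0 = 24, matching the slide
p = math.exp(exponent(query)) / Z
```

The exponent of the query world comes out as 24, reproducing the (1/Z)·exp(24) on the slide; dividing by the brute-force Z turns it into an actual probability.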
Experiment I
Setting:
• A database describing the Department of Computer Science and Engineering at the University of Washington
• 12 predicates (e.g., Professor, Student, Area, AdvisedBy, …)
• 2707 constants
• 96 formulas (the knowledge base was provided by four volunteers who did not know the database but were members of the department)
• The whole database was divided into five subsets, one per area (AI, graphics, programming languages, systems, theory)
=> in the end, 521 true ground atoms out of 58,457 possible

Experiment II
Testing:
• Leave-one-out over the areas
• Prediction of AdvisedBy(x, y)
• Either with all information or with only partial information (without Student(x) and Professor(x))
• Drawing the precision/recall curves
• Computing the area under the curve (AUC)

Experiment III
The MLN was compared with:
• Logic (a purely logical KB, without probability)
• Probability (purely probabilistic relations, without a special knowledge representation): Naive Bayes (NB) and a Bayesian network (BN)
• Inductive logic programming (automatic development of the KB): CLAUDIEN (clausal discovery engine)

Results
[Figures: precision/recall curves for all areas and the AI area; for the graphics and programming-languages areas; and for the systems and theory areas.]

Sample applications
• Link prediction
• Link-based clustering
• Social network modeling
• …

Conclusion
• MLNs are a simple way to combine first-order logic and probability.
• They can be seen as templates for constructing ordinary Markov networks.
• Clauses can be learned by CLAUDIEN.
• Empirical tests with real-world data and knowledge are promising for the use of MLNs.

Literature
Richardson, M., & Domingos, P. (2006). Markov logic networks. Machine Learning, 62, 107–136.