BLPs

Bayesian Logic Programs
Summer School on Relational Data Mining
17 and 18 August 2002, Helsinki, Finland
Kristian Kersting, Luc De Raedt
Albert-Ludwigs University
Freiburg, Germany
Summer School on Relational Data Mining, 17 and 18 August, Helsinki, Finland
K. Kersting, Luc De Raedt, Machine Learning Lab, Albert-Ludwigs-University, Freiburg, Germany
Bayesian Logic Programs
Examples and Language
Semantics and Support Networks
Learning
Data Cases
Parameter Estimation
Structural Learning
Context
Real-world applications involve
• uncertainty → probability theory (discrete, continuous) → Bayesian networks
• complex, structured domains → logic (objects, relations, functors) → Logic Programming (Prolog)
Bayesian networks + Logic Programming (Prolog) = Bayesian logic programs
Outline
• Bayesian Logic Programs
• Examples and Language
• Semantics and Support Networks
• Learning Bayesian Logic Programs
• Data Cases
• Parameter Estimation
• Structural Learning
Bayesian Logic Programs
• Probabilistic models structured using logic
• Extend Bayesian networks with notions of objects and relations
• Probability density over (countably) infinitely many random variables
• Flexible discrete-time stochastic processes
• Generalize pure Prolog, Bayesian networks, dynamic Bayesian networks, dynamic Bayesian multinets, hidden Markov models, ...
Bayesian Networks
• One of the successes of AI
• State of the art for modeling uncertainty, in particular degrees of belief
• Advantage [Russell, Norvig 96]: "strict separation of qualitative and quantitative aspects of the world"
• Disadvantage [Breese, Ngo, Haddawy, Koller, ...]: propositional character, no notion of objects and relations among them
Stud farm (Jensen ´96)
• The colt John has recently been born on a stud farm.
• John suffers from a life-threatening hereditary disease carried by a recessive gene. The disease is so serious that John is displaced instantly and, since the stud farm wants the gene out of production, his parents are taken out of breeding.
• What are the probabilities that the remaining horses are carriers of the unwanted gene?
Bayesian networks
[Pearl ´88]
Based on the stud farm example [Jensen ´96]
[Figure: Bayesian network over the nodes bt_unknown1, bt_unknown2, bt_ann, bt_brian, bt_cecily, bt_fred, bt_dorothy, bt_eric, bt_gwenn, bt_henry, bt_irene and bt_john; the parent set Pa(bt_john) is marked.]
Bayesian networks
(Conditional) probability distribution of bt_john given its parents Pa(bt_john), i.e. bt_henry and bt_irene:

bt_henry   bt_irene   P(bt_john)
aa         aa         (1.0,0.0,0.0)
aa         aA         (0.5,0.5,0.0)
aa         AA         (0.0,1.0,0.0)
...
AA         AA         (0.33,0.33,0.33)

Example queries:
P(bt_cecily=aA|bt_john=aA)=0.1499
P(bt_john=AA|bt_ann=aA)=0.6906
P(bt_john=AA)=0.9909
Bayesian networks (contd.)
• acyclic graphs
• probability distribution over a finite set $X_1, \ldots, X_n$ of random variables:

$$
\begin{aligned}
P(X_1, \ldots, X_n) &= P(X_1 \mid X_2, \ldots, X_n) \cdot P(X_2 \mid X_3, \ldots, X_n) \cdots P(X_n) \\
&= P(X_1 \mid Pa(X_1)) \cdot P(X_2 \mid Pa(X_2)) \cdots P(X_n \mid Pa(X_n)) \\
&= \prod_{i=1}^{n} P(X_i \mid Pa(X_i))
\end{aligned}
$$
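The factorization of the joint distribution into local conditional distributions can be illustrated with a minimal Python sketch. The three-node network, its variable names and all probabilities below are hypothetical and chosen only for illustration; they are not taken from the stud farm example.

```python
from itertools import product

# Hypothetical three-node network: X3 -> X2, and (X2, X3) -> X1.
parents = {"X1": ("X2", "X3"), "X2": ("X3",), "X3": ()}

# cpd[var][parent_values] = P(var = True | parents = parent_values).
# All numbers are made up for illustration.
cpd = {
    "X3": {(): 0.2},
    "X2": {(True,): 0.7, (False,): 0.1},
    "X1": {(True, True): 0.9, (True, False): 0.6,
           (False, True): 0.5, (False, False): 0.05},
}


def joint(assignment):
    """P(X1, ..., Xn) as the product over i of P(Xi | Pa(Xi))."""
    p = 1.0
    for var, pa in parents.items():
        p_true = cpd[var][tuple(assignment[q] for q in pa)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


def total_probability():
    """Sum the joint over all assignments; should be 1 for a valid network."""
    names = list(parents)
    return sum(joint(dict(zip(names, values)))
               for values in product([True, False], repeat=len(names)))
```

Summing `joint` over all eight assignments gives 1 (up to rounding), which is a quick sanity check that the local tables define a proper joint distribution.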
From Bayesian Networks to
Bayesian Logic Programs
[Figure: the stud farm Bayesian network over bt_unknown1, bt_unknown2, bt_ann, bt_brian, bt_cecily, bt_fred, bt_dorothy, bt_eric, bt_gwenn, bt_henry, bt_irene and bt_john.]
From Bayesian Networks to
Bayesian Logic Programs
Pa(bt_fred) is the body of the clause for bt_fred:

bt_unknown1.
bt_unknown2.
bt_ann.
bt_brian.
bt_cecily.
bt_fred    | bt_unknown1, bt_ann.
bt_dorothy | bt_ann, bt_brian.
bt_eric    | bt_brian, bt_cecily.
bt_gwenn   | bt_ann, bt_unknown2.
bt_henry   | bt_fred, bt_dorothy.
bt_irene   | bt_eric, bt_gwenn.
bt_john    | bt_henry, bt_irene.

bt_henry   bt_irene   P(bt_john)
aa         aa         (1.0,0.0,0.0)
aa         aA         (0.5,0.5,0.0)
aa         AA         (0.0,1.0,0.0)
...
AA         AA         (0.33,0.33,0.33)
From Bayesian Networks to
Bayesian Logic Programs
Each node has a domain, e.g. finite, discrete, continuous.

% apriori nodes
bt_ann. bt_brian. bt_cecily. bt_unknown1. bt_unknown2.
% aposteriori nodes
bt_henry   | bt_fred, bt_dorothy.
bt_irene   | bt_eric, bt_gwenn.
bt_fred    | bt_unknown1, bt_ann.
bt_dorothy | bt_brian, bt_ann.
bt_eric    | bt_brian, bt_cecily.
bt_gwenn   | bt_unknown2, bt_ann.
bt_john    | bt_henry, bt_irene.

(conditional) probability distribution
bt_henry   bt_irene   P(bt_john)
aa         aa         (1.0,0.0,0.0)
aa         aA         (0.5,0.5,0.0)
aa         AA         (0.0,1.0,0.0)
...
AA         AA         (0.33,0.33,0.33)
From Bayesian Networks to
Bayesian Logic Programs
% apriori nodes
bt(ann). bt(brian). bt(cecily). bt(unknown1). bt(unknown2).
% aposteriori nodes
bt(henry)   | bt(fred), bt(dorothy).
bt(irene)   | bt(eric), bt(gwenn).
bt(fred)    | bt(unknown1), bt(ann).
bt(dorothy) | bt(brian), bt(ann).
bt(eric)    | bt(brian), bt(cecily).
bt(gwenn)   | bt(unknown2), bt(ann).
bt(john)    | bt(henry), bt(irene).

(conditional) probability distribution
bt(henry)   bt(irene)   P(bt(john))
aa          aa          (1.0,0.0,0.0)
aa          aA          (0.5,0.5,0.0)
aa          AA          (0.0,1.0,0.0)
...
AA          AA          (0.33,0.33,0.33)
From Bayesian Networks to
Bayesian Logic Programs
% ground facts / apriori
bt(ann). bt(brian). bt(cecily). bt(unknown1). bt(unknown2).
father(unknown1,fred).   mother(ann,fred).
father(brian,dorothy).   mother(ann,dorothy).
father(brian,eric).      mother(cecily,eric).
father(unknown2,gwenn).  mother(ann,gwenn).
father(fred,henry).      mother(dorothy,henry).
father(eric,irene).      mother(gwenn,irene).
father(henry,john).      mother(irene,john).
% rules / aposteriori
bt(X) | father(F,X), bt(F), mother(M,X), bt(M).

(conditional) probability distribution
father(F,X)   bt(F)   mother(M,X)   bt(M)   P(bt(X))
true          aA      true          aa      (1.0,0.0,0.0)
...
false         AA      false         AA      (0.33,0.33,0.33)
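How the single rule expands into one ground clause per horse can be sketched in Python. This is only an illustration, not the authors' implementation: the dictionaries `father`, `mother` and `founders` and the function `ground_dependency_graph` are hypothetical names, and the facts are those of the slide with the spelling "unknown" used consistently.

```python
# Ground facts of the stud farm program, stored as child -> parent.
father = {"fred": "unknown1", "dorothy": "brian", "eric": "brian",
          "gwenn": "unknown2", "henry": "fred", "irene": "eric",
          "john": "henry"}
mother = {"fred": "ann", "dorothy": "ann", "eric": "cecily",
          "gwenn": "ann", "henry": "dorothy", "irene": "gwenn",
          "john": "irene"}
founders = ["ann", "brian", "cecily", "unknown1", "unknown2"]


def ground_dependency_graph():
    """Map every ground atom to the tuple of atoms it directly depends on."""
    graph = {f"bt({h})": () for h in founders}
    for child in father:
        f, m = father[child], mother[child]
        # Ground facts have empty bodies, hence no parents.
        graph[f"father({f},{child})"] = ()
        graph[f"mother({m},{child})"] = ()
        # One ground instance of:
        #   bt(X) | father(F,X), bt(F), mother(M,X), bt(M).
        graph[f"bt({child})"] = (f"father({f},{child})", f"bt({f})",
                                 f"mother({m},{child})", f"bt({m})")
    return graph
```

Each key of the returned dictionary is a random variable, and each value is its parent set, so the dictionary is a direct encoding of the ground dependency graph.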
Dependency graph
= Bayesian network
[Figure: dependency graph over the ground atoms bt(·), father(·,·) and mother(·,·) of the stud farm program; each node bt(X) of a horse with known parents has the parents father(F,X), bt(F), mother(M,X) and bt(M).]
Dependency graph
= Bayesian network
[Figure: the corresponding Bayesian network over bt_unknown1, bt_unknown2, bt_ann, bt_brian, bt_cecily, bt_fred, bt_dorothy, bt_eric, bt_gwenn, bt_henry, bt_irene and bt_john.]
Bayesian Logic Programs
- a first definition
A BLP B consists of
• a finite set of Bayesian clauses.
• To each clause c in B a conditional probability distribution cpd(c) is associated:
  cpd(c) = P(head(c) | body(c))

• Proper random variables ~ LH(B), the least Herbrand model of B
• Graphical structure ~ dependency graph
• Quantitative information ~ CPDs
Bayesian Logic Programs
- Examples

% apriori nodes
nat(0).
% aposteriori nodes
nat(s(X)) | nat(X).
[Dependency graph: nat(0) → nat(s(0)) → nat(s(s(0))) → ...]

% apriori nodes
state(0).
% aposteriori nodes
state(s(Time)) | state(Time).
output(Time)   | state(Time).
[Dependency graph: state(0) → state(s(0)) → ..., with state(0) → output(0), state(s(0)) → output(s(0)), ...]

% apriori nodes
n1(0).
% aposteriori nodes
n1(s(TimeSlice)) | n2(TimeSlice).
n2(TimeSlice)    | n1(TimeSlice).
n3(TimeSlice)    | n1(TimeSlice), n2(TimeSlice).
[Dependency graph over n1(0), n2(0), n3(0), n1(s(0)), n2(s(0)), n3(s(0)), ...]
Associated CPDs
• represent generically the CPD for each ground instance of the corresponding Bayesian clause.

father(F,X)   bt(F)   mother(M,X)   bt(M)   P(bt(X))
true          aA      true          aa      (1.0,0.0,0.0)
...
false         AA      false         AA      (0.33,0.33,0.33)

• Multiple ground instances of clauses having the same head atom?
Combining Rules
Multiple ground instances of clauses having the
same head atom?
% ground facts as before
% rules
bt(X) | father(F,X), bt(F).
bt(X) | mother(M,X), bt(M).
cpd(bt(john)|father(henry,john), bt(henry)) and
cpd(bt(john)|mother(irene,john), bt(irene))
But we need:
cpd(bt(john)|father(henry,john),bt(henry),mother(irene,john),bt(irene))
Combining Rules (contd.)
A combining rule CR is any algorithm which
• combines a set of PDFs
$$\{\, cpd(A \mid A_{i1}, \ldots, A_{in_i}) \mid 1 \le i \le m \,\}$$
  into the (combined) PDF
$$cpd(A \mid B_1, \ldots, B_k), \quad \text{where } \{B_1, \ldots, B_k\} \subseteq \bigcup_{i=1}^{m} \{A_{i1}, \ldots, A_{in_i}\}$$
• has an empty output if and only if the input is empty
• E.g. noisy-or, regression, ...
Example: P(A|B) and P(A|C) → CR → P(A|B,C)
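As one concrete instance of a combining rule, noisy-or can be sketched in Python as follows. The sketch assumes Boolean variables, no leak term, and that each input distribution is summarized by a single number P(A=true | cause=true); the function name `noisy_or` and this input format are assumptions made for illustration and do not come from the slides.

```python
from itertools import product


def noisy_or(cause_strengths):
    """Combine P(A=true | cause_i=true) for several causes into one CPD.

    cause_strengths: dict mapping a cause name to P(A=true | that cause).
    Returns a dict mapping each truth assignment of the causes (ordered by
    sorted cause name) to P(A=true | assignment).
    """
    if not cause_strengths:
        # Empty input gives empty output, as a combining rule requires.
        return {}
    names = sorted(cause_strengths)
    combined = {}
    for values in product([False, True], repeat=len(names)):
        # A stays false only if every active cause independently fails.
        p_all_fail = 1.0
        for name, active in zip(names, values):
            if active:
                p_all_fail *= 1.0 - cause_strengths[name]
        combined[values] = 1.0 - p_all_fail
    return combined
```

For example, `noisy_or({"B": 0.8, "C": 0.5})` turns P(A|B) and P(A|C) into a table for P(A|B,C) with one entry per truth assignment of (B, C).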
Bayesian Logic Programs
- a definition
A BLP B consists of
• a finite set of Bayesian clauses.
• To each clause c in B a conditional probability distribution cpd(c) is associated:
  cpd(c) = P(head(c) | body(c))
• To each Bayesian predicate p a combining rule cr(p) is associated to combine CPDs of multiple ground instances of clauses having the same head

• Proper random variables ~ LH(B)
• Graphical structure ~ dependency graph
• Quantitative information ~ CPDs and CRs
Outline
• Bayesian Logic Programs
• Examples and Language
• Semantics and Support Networks
• Learning Bayesian Logic Programs
• Data Cases
• Parameter Estimation
• Structural Learning
Discrete-Time Stochastic Process
• Family $\{X_t, t \in J\}$ of random variables $X_t$ over a domain $X$, where $J \subseteq \{0, 1, 2, \ldots\}$
• For each linearization of the partial order induced by the dependency graph, a Bayesian logic program specifies a discrete-time stochastic process
Theorem of Kolmogorov
Existence and uniqueness of probability measure
• $X$ : a Polish space
• $H(J)$ : set of all non-empty, finite subsets of $J$
• $P_I$ : the probability measure over $X^I$, $I \in H(J)$
• If the projective family $(P_I)_{I \in H(J)}$ exists then there exists a unique probability measure
Consistency Conditions
• Probability measure $P_I$, $I \in H(J)$, is represented by a finite Bayesian network which is a subnetwork of the dependency graph over LH(B): Support Network
• (Elimination Order): All stochastic processes represented by a Bayesian logic program B specify the same probability measure over LH(B).
Support network
[Figure: dependency graph over the ground atoms bt(·), father(·,·) and mother(·,·) of the stud farm program, used to illustrate support networks.]
Support network
• Support network $N(x)$ of $x \in LH(B)$ is the induced subnetwork of
$$S = \{x\} \cup \{\, y \in LH(B) \mid y \text{ is influencing } x \,\}$$
• Support network $N(\mathbf{x})$ of a finite set $\mathbf{x} \subseteq LH(B)$ is defined as
$$N(\mathbf{x}) = \bigcup_{x \in \mathbf{x}} N(x)$$
• Computation utilizes And/Or trees
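The definition of a support network amounts to collecting a node together with everything that influences it. A minimal Python sketch is given below; it walks a finite parent map rather than building And/Or trees, so it is a simplification, and the names `parents`, `support_network` and `support_network_of_set` are hypothetical. For brevity the map contains only the bt(·) atoms of the stud farm example and omits the father/mother atoms.

```python
# Direct dependencies between the bt(.) atoms of the stud farm example.
parents = {
    "bt(fred)": ("bt(unknown1)", "bt(ann)"),
    "bt(dorothy)": ("bt(ann)", "bt(brian)"),
    "bt(eric)": ("bt(brian)", "bt(cecily)"),
    "bt(gwenn)": ("bt(ann)", "bt(unknown2)"),
    "bt(henry)": ("bt(fred)", "bt(dorothy)"),
    "bt(irene)": ("bt(eric)", "bt(gwenn)"),
    "bt(john)": ("bt(henry)", "bt(irene)"),
}


def support_network(parents, query):
    """Sub-network induced by `query` and every node influencing it."""
    seen, stack = set(), [query]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            # Nodes without an entry are roots and have no parents.
            stack.extend(parents.get(node, ()))
    return {node: tuple(parents.get(node, ())) for node in seen}


def support_network_of_set(parents, queries):
    """Union of the support networks of all query nodes."""
    network = {}
    for query in queries:
        network.update(support_network(parents, query))
    return network
```

Answering a query about bt(eric) then only needs the three nodes bt(eric), bt(brian) and bt(cecily), while a query about bt(john) needs all twelve bt(·) nodes.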
Queries using And/Or trees
• A probabilistic query
  ?- Q1,...,Qn | E1=e1,...,Em=em.
  asks for the distribution
  P(Q1, ..., Qn | E1=e1, ..., Em=em).
• Or node is proven if at least one of its successors is provable.
• And node is proven if all of its successors are provable.
• Example: ?- bt(eric).
[Figure: And/Or tree for ?- bt(eric). The Or node bt(eric) has as successor the And node "father(brian,eric), bt(brian), mother(cecily,eric), bt(cecily)", annotated with its cpd; the successors of the And node are bt(brian), bt(cecily), father(brian,eric) and mother(cecily,eric). The resulting support network over bt(eric), bt(brian), bt(cecily), father(brian,eric) and mother(cecily,eric) carries the combined cpd.]
Consistency Condition (contd.)
Projective family $(P_I)_{I \in H(J)}$ exists if
• the dependency graph is acyclic, and
• every random variable is influenced by a finite set of random variables only
⇒ well-defined Bayesian logic program
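On a finite fragment of the ground dependency graph, the first condition can be checked mechanically. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the function name `is_acyclic` and the parent-map input format are assumptions for illustration. The second condition (finite influence) concerns the possibly infinite graph and is not checked here.

```python
from graphlib import CycleError, TopologicalSorter


def is_acyclic(parents):
    """True if the finite dependency graph {node: parents} has no cycle."""
    try:
        # TopologicalSorter expects {node: predecessors}; consuming
        # static_order() raises CycleError if a directed cycle exists.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False
```

For instance, the first time slices of the n1/n2 example, `{"n2(0)": ["n1(0)"], "n1(s(0))": ["n2(0)"]}`, pass the check, whereas a pair of atoms that depend on each other does not.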
Relational Character
The same rule and CPD apply to different sets of ground facts:

% ground facts
bt(ann). bt(brian). bt(cecily). bt(unknown1).
father(unknown1,fred).   mother(ann,fred).
father(brian,dorothy).   mother(ann,dorothy).
father(brian,eric).      mother(cecily,eric).
father(unknown2,gwenn).  mother(ann,gwenn).
father(fred,henry).      mother(dorothy,henry).
father(eric,irene).      mother(gwenn,irene).
father(henry,john).      mother(irene,john).

% ground facts
bt(petra). bt(silvester). bt(anne). bt(wilhelm). bt(beate).
father(silvester,claudien). mother(beate,claudien).
father(wilhelm,marta).      mother(anne,marthe).
...

% ground facts
bt(susanne). bt(ralf). bt(peter). bt(uta).
father(ralf,luca). mother(susanne,luca).
...

% rules
bt(X) | father(F,X), bt(F), mother(M,X), bt(M).

father(F,X)   bt(F)   mother(M,X)   bt(M)   P(bt(X))
true          aA      true          aa      (1.0,0.0,0.0)
...
false         AA      false         AA      (0.33,0.33,0.33)
Bayesian Logic Programs
- Summary
• First order logic extension of Bayesian networks
• constants, relations, functors
• discrete and continuous random variables
• ground atoms = random variables
• CPDs associated to clauses
• Dependency graph = (possibly) infinite Bayesian network
• Generalize dynamic Bayesian networks and definite clause logic (range-restricted)
Applications
• Probabilistic, logical
  • Description and prediction
  • Regression
  • Classification
  • Clustering
• APrIL IST-2001-33053
  • Computational Biology
  • Web Mining
  • Query approximation
  • Planning, ...
Other frameworks
• Probabilistic Horn Abduction [Poole 93]
• Distributional Semantics (PRISM) [Sato 95]
• Stochastic Logic Programs [Muggleton 96; Cussens 99]
• Relational Bayesian Nets [Jaeger 97]
• Probabilistic Logic Programs [Ngo, Haddawy 97]
• Object-Oriented Bayesian Nets [Koller, Pfeffer 97]
• Probabilistic Frame-Based Systems [Koller, Pfeffer 98]
• Probabilistic Relational Models [Koller 99]
Outline
• Bayesian Logic Programs
• Examples and Language
• Semantics and Support Networks
• Learning Bayesian Logic Programs
• Data Cases
• Parameter Estimation
• Structural Learning
Learning
Bayesian Logic Programs
[Figure: Data + Background Knowledge → learning algorithm → Bayesian logic program, here the clause A | E, B. with its CPD]

E    B    P(A)
e    b    .9    .1
e    ¬b   .7    .3
¬e   b    .8    .2
¬e   ¬b   .99   .01
Why Learning
Bayesian Logic Programs ?
Learning within Bayesian networks + Inductive Logic Programming = Learning within Bayesian Logic Programs
Of interest to different communities ?
• scoring functions, pruning techniques, theoretical insights, ...
What is the data about ?
D = {
  { m(ann,dorothy)=true, f(brian,dorothy)=true, pc(brian)=b, bt(ann)=a, bt(brian)=?, bt(dorothy)=a },
  { m(cecily,fred)=true, f(henry,fred)=true, bt(cecily)=ab, bt(henry)=b, bt(fred)=?, m(kim,bob)=true, f(fred,bob)=true, bt(kim)=?, bt(bob)=b }
}
A data case $D_i \in D$ is a partially observed joint state of a finite, nonempty subset $\mathbf{x} \subseteq LH(B)$; "?" marks an unobserved value.
Learning Task
Given:
• set $D = \{D_1, \ldots, D_n\}$ of data cases
• a Bayesian logic program B
Goal: for each $c \in B$ the parameters
$$\lambda(c) = \left(\lambda(c)_1, \ldots, \lambda(c)_{e(c)}\right)$$
of cpd(c) that best fit the given data
Parameter Estimation (contd.)
• "best fit" ~ ML-Estimation
$$\lambda^* = \arg\max_{\lambda \in H} P_{B(\lambda)}(D)$$
where the hypothesis space $H$ is spanned by the product space over the possible values of $\lambda(c)$:
$$\lambda := \times_{c \in B}\, \lambda(c)$$
Parameter Estimation (contd.)
$$
\begin{aligned}
\lambda^* &= \arg\max_{\lambda \in H} P_{B(\lambda)}(D) \\
&= \arg\max_{\lambda \in H} \ln P_{B(\lambda)}(D)
\end{aligned}
$$
Assumption: $D_1, \ldots, D_n$ are independently sampled from identical distributions (e.g. totally separated families), so
$$
\begin{aligned}
\lambda^* &= \arg\max_{\lambda \in H} \ln \prod_i P_{B(\lambda)}(D_i) \\
&= \arg\max_{\lambda \in H} \sum_i \ln P_{B(\lambda)}(D_i)
\end{aligned}
$$
Parameter Estimation (contd.)
$$
\begin{aligned}
\lambda^* &= \arg\max_{\lambda \in H} \sum_i \ln P_{B(\lambda)}(D_i) \\
&= \arg\max_{\lambda \in H} \sum_i \ln P_{N_\lambda(var(D_i))}(D_i) \\
&= \arg\max_{\lambda \in H} \sum_i \ln P_{N(\lambda)}(D_i)
\end{aligned}
$$
• $N_\lambda(var(D_i))$ is a marginal of $N(\lambda)$
• $N(\lambda)$ is an ordinary BN
Parameter Estimation (contd.)
• Reduced to a problem within Bayesian networks: given structure, partially observed random variables
• EM [Dempster, Laird, Rubin, ´77], [Lauritzen, ´91]
• Gradient Ascent [Binder, Koller, Russell, Kanazawa, ´97], [Jensen, ´99]
Decomposable CRs
• Parameters of the clauses and not of the support network.
[Figure: for each i = 1, ..., m the nodes A_i1, ..., A_in_i are the parents of A_i (a single ground instance of a Bayesian clause); A_1, ..., A_m stem from multiple ground instances of the same Bayesian clause and are the parents of A, whose CPD is the CPD for the combining rule.]
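Gradient Ascent
Bayesian ground clause $c\theta$:
$$\frac{\partial \ln P_{N(\lambda)}(D)}{\partial\, cpd(c\theta)_{jk}}$$
Bayesian clause $c$, summing over its ground instances (substitutions $\theta$):
$$
\frac{\partial \ln P_{N(\lambda)}(D)}{\partial\, cpd(c)_{jk}}
= \sum_{\text{subst. } \theta} \sum_{j',k'} \frac{\partial \ln P_{N(\lambda)}(D)}{\partial\, cpd(c\theta)_{j'k'}} \cdot \frac{\partial\, cpd(c\theta)_{j'k'}}{\partial\, cpd(c)_{jk}}
= \sum_{\text{subst. } \theta} \frac{\partial \ln P_{N(\lambda)}(D)}{\partial\, cpd(c\theta)_{jk}}
$$
where
$$
\frac{\partial\, cpd(c\theta)_{j'k'}}{\partial\, cpd(c)_{jk}} =
\begin{cases}
1, & \text{if } j = j',\ k = k' \\
0, & \text{otherwise}
\end{cases}
$$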
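Gradient Ascent
\[ \frac{\partial \ln P_{N(\lambda)}(D)}{\partial\, \mathrm{cpd}(c)_{jk}} = \sum_{\text{subst. } \theta} \frac{\partial \ln P_{N(\lambda)}(D)}{\partial\, \mathrm{cpd}(c\theta)_{jk}} = \sum_{\text{subst. } \theta} \sum_{i=1}^{n} \frac{P_{N(\lambda)}\big(\mathrm{head}(c\theta) = u_j,\ \mathrm{body}(c\theta) = u_k \mid D_i\big)}{\mathrm{cpd}(c)_{jk}} \]
The posterior probabilities in the numerator are computed by a Bayesian network inference engine.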
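The gradient formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the posteriors P(head(cθ) = u_j, body(cθ) = u_k | D_i) are assumed to have been computed already by some external inference engine, and the function name `clause_gradient` and the array layout are hypothetical, not part of the slides.

```python
import numpy as np

def clause_gradient(cpd, posteriors):
    """Gradient of ln P(D) with respect to the entries cpd(c)_jk of one clause c.

    cpd:        array (J, K), cpd[j, k] = P(head = u_j | body = u_k).
    posteriors: array (S, N, J, K), posteriors[s, i, j, k] is the posterior
                P(head(c theta_s) = u_j, body(c theta_s) = u_k | D_i);
                assumed to come from an external inference engine.
    """
    cpd = np.asarray(cpd, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)
    # Sum over all ground instances (axis 0) and all data cases (axis 1),
    # then divide entry-wise by the shared clause parameter cpd(c)_jk.
    return posteriors.sum(axis=(0, 1)) / cpd

# Toy input (made-up numbers): 2 ground instances, 1 data case, binary head and body.
cpd = np.array([[0.8, 0.4],
                [0.2, 0.6]])
posteriors = np.zeros((2, 1, 2, 2))
posteriors[0, 0] = [[0.4, 0.1], [0.1, 0.4]]
posteriors[1, 0] = [[0.4, 0.1], [0.1, 0.4]]
grad = clause_gradient(cpd, posteriors)
```

An actual gradient step would also have to keep every column of `cpd` a valid probability distribution; that normalization is left out of the sketch.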
Algorithm
Expectation-Maximization
1. Initialize parameters
2. E-Step and M-Step, i.e. compute expected counts for each clause and treat the expected counts as counts:
\[ \mathrm{cpd}(c)_{jk} \leftarrow \frac{\sum_{\text{subst. } \theta} \sum_{i=1}^{n} P_{N(\lambda)}\big(\mathrm{head}(c\theta) = u_j,\ \mathrm{body}(c\theta) = u_k \mid D_i\big)}{\sum_{\text{subst. } \theta} \sum_{i=1}^{n} P_{N(\lambda)}\big(\mathrm{body}(c\theta) = u_k \mid D_i\big)} \]
3. If not converged, iterate to 2
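The update in step 2 can be sketched as follows. A minimal sketch, assuming the joint posteriors for every ground instance and data case are already available from an inference engine; the name `em_update` and the array layout are hypothetical.

```python
import numpy as np

def em_update(joint_posteriors):
    """One EM update of cpd(c) from expected counts.

    joint_posteriors: array (S, N, J, K); entry [s, i, j, k] is
        P(head(c theta_s) = u_j, body(c theta_s) = u_k | D_i),
        assumed to be supplied by an external inference engine.
    Returns an array (J, K) with the new cpd(c)_jk.
    """
    joint = np.asarray(joint_posteriors, dtype=float)
    # Expected counts of (head = u_j, body = u_k), summed over
    # ground instances and data cases: the numerator.
    expected_joint = joint.sum(axis=(0, 1))
    # Expected counts of body = u_k: marginalize the head out (the denominator).
    # Note: a body configuration with zero expected count would divide by zero.
    expected_body = expected_joint.sum(axis=0)
    return expected_joint / expected_body

# Toy input (made-up): one ground instance, four fully observed data cases.
joint = np.zeros((1, 4, 2, 2))
joint[0, 0, 0, 0] = 1.0
joint[0, 1, 0, 0] = 1.0
joint[0, 2, 1, 0] = 1.0
joint[0, 3, 1, 1] = 1.0
new_cpd = em_update(joint)
```

With fully observed data, as in the toy input, the expected counts are plain counts and the update reduces to relative frequencies.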
Experimental Evidence
• [Koller, Pfeffer ´97] support network is a good approximation
• [Binder et al. ´97] equality constraints speed up learning

m(M,X)   f(F,X)   pdf(c)(h(X)|h(M),h(F))
true     true     N(0.5*h(M)+0.5*h(F),s)
true     false    N(165,s)
false    true     N(165,s)
false    false    N(165,s)

• 100 data cases
• constant step-size
• Estimation of means
• 13 iterations
• Estimation of the weights
• sum = 1.0
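The conditional density in the table can be written out as a small Python function. This is a sketch under assumptions: `s` is treated as a standard deviation (the slide does not say whether it is a standard deviation or a variance), and the function name and the keyword parameters `w_m`, `w_f`, `default_mean` are hypothetical names for the weights and the mean shown in the table.

```python
import math

def height_density(h_x, m_true, f_true, h_m, h_f, s,
                   w_m=0.5, w_f=0.5, default_mean=165.0):
    """Density of h(X) given h(M), h(F) and the truth values of m(M,X), f(F,X).

    If both m(M,X) and f(F,X) are true, the mean is the weighted sum of the
    parents' heights; in every other row of the table the mean is 165.
    Assumption: s is the standard deviation of the Gaussian.
    """
    if m_true and f_true:
        mean = w_m * h_m + w_f * h_f
    else:
        mean = default_mean
    z = (h_x - mean) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))

# Both parents known: the mean is 0.5 * 160 + 0.5 * 180 = 170.
peak_both = height_density(170.0, True, True, 160.0, 180.0, 10.0)
# Father relation false: the mean falls back to 165.
peak_default = height_density(165.0, True, False, 160.0, 180.0, 10.0)
```

The weights `w_m` and `w_f` and the means are the quantities estimated in the experiment; the defaults above are just the values printed in the table.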
Outline
• Bayesian Logic Programs
• Examples and Language
• Semantics and Support Networks
• Learning Bayesian Logic Programs
• Data Cases
• Parameter Estimation
• Structural Learning
Structural Learning
• Combination of Inductive Logic Programming and Bayesian network learning
• Datalog fragment of Bayesian logic programs (no functors)
• intensional Bayesian clauses
Idea - CLAUDIEN
learning from interpretations
• all data cases are Herbrand interpretations
• a hypothesis should reflect what is in the data
probabilistic extension
• all data cases are partially observed joint states of Herbrand interpretations
• all hypotheses have to be (logically) true in all data cases
What is the data about?
{m(ann,dorothy)=true, f(brian,dorothy)=true,
pc(brian)=b, bt(ann)=a, bt(brian)=?, bt(dorothy)=a}
...
{m(cecily,fred)=true, f(henry,fred)=true,
bt(cecily)=ab, bt(henry)=b, bt(fred)=?,
m(kim,bob)=true, f(fred,bob)=true,
bt(kim)=?, bt(bob)=b}
Claudien: Learning From Interpretations
• $D$: set of data cases
• $C$: set of all clauses that can be part of hypotheses
$H \subseteq C$ is (logically) valid iff $\forall D_i \in D$: $H$ is logically true in $D_i$
$H \subseteq C$ is a logical solution iff $H$ is a logically maximally general valid hypothesis
$H \subseteq C$ is a probabilistic solution iff $H$ is (logically) valid and the Bayesian network induced by $H$ on $D$ is acyclic
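The acyclicity condition can be checked with a standard topological-sort test. The sketch below assumes the induced network is already given as a mapping from each ground atom to its parents; the grounding step itself is not shown, and the name `is_acyclic` is hypothetical.

```python
from collections import deque

def is_acyclic(parents):
    """Return True iff the directed graph given by `parents` has no cycle.

    parents: dict mapping a ground atom (str) to an iterable of its parent atoms.
    Uses Kahn's algorithm: repeatedly remove nodes without incoming edges.
    """
    nodes = set(parents)
    for ps in parents.values():
        nodes.update(ps)
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for child, ps in parents.items():
        for p in set(ps):
            children[p].append(child)
            indegree[child] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for c in children[n]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    # Every node can be removed exactly when there is no cycle.
    return removed == len(nodes)

# Ground instances of the blood type clauses for john (hand-grounded).
acyclic_net = {
    "mc(john)": ["m(ann,john)", "mc(ann)", "pc(ann)"],
    "pc(john)": ["f(eric,john)", "mc(eric)", "pc(eric)"],
    "bt(john)": ["mc(john)", "pc(john)"],
}
# A clause such as mc(X) | m(M,X), mc(X) makes mc(john) its own parent.
cyclic_net = {"mc(john)": ["m(ann,john)", "mc(john)"]}
ok = is_acyclic(acyclic_net)
bad = is_acyclic(cyclic_net)
```

Here `ok` is True and `bad` is False, so a hypothesis inducing the second network would not qualify as a probabilistic solution.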
Learning Task
Given:
• a set $D = \{D_1, \ldots, D_n\}$ of data cases
• a set $H$ of Bayesian logic programs
• a scoring function $\mathrm{score}_D : H \to \mathbb{R}$
Goal: probabilistic solution $H^* \in H$
• that matches the data best according to $\mathrm{score}_D$
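One generic way to approach this task is greedy hill-climbing over refinements. The sketch below is a plain hill-climber and is not taken from the slides: the names `hill_climb`, `refinements`, `is_solution`, and `toy_score` are hypothetical, and the toy score is made up for illustration (it is not a log-likelihood).

```python
def hill_climb(initial, refinements, is_solution, score):
    """Greedy search: move to the best-scoring admissible refinement
    until no refinement improves the score."""
    current, current_score = initial, score(initial)
    while True:
        candidates = [h for h in refinements(current) if is_solution(h)]
        if not candidates:
            return current
        best = max(candidates, key=score)
        best_score = score(best)
        if best_score <= current_score:
            return current
        current, current_score = best, best_score

# Toy setting: a hypothesis is the set of body atoms of the bt(X) clause.
ATOMS = ["m(M,X)", "mc(X)", "pc(X)"]
TARGET = frozenset({"mc(X)", "pc(X)"})

def refinements(body):
    # Refinement step: add one atom that is not yet in the body.
    return [body | {a} for a in ATOMS if a not in body]

def toy_score(body):
    # Made-up score: reward target atoms, penalize all others.
    return len(body & TARGET) - len(body - TARGET)

learned = hill_climb(frozenset({"mc(X)"}), refinements,
                     lambda h: True, toy_score)
```

In a real setting `is_solution` would test whether a candidate is a probabilistic solution and `score` would be a Bayesian network score evaluated on the data cases.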
Algorithm
Example
Original Bayesian logic program
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).
[Figure: network over the ground atoms m(ann,john), f(eric,john), mc(ann), pc(ann), mc(eric), pc(eric), mc(john), pc(john), bt(john)]
Data cases
{m(ann,john)=true, pc(ann)=a, mc(ann)=?,
f(eric,john)=true, pc(eric)=b, mc(eric)=a,
mc(john)=ab, pc(john)=a, bt(john) = ? }
...
Example
Original Bayesian logic program
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).
Initial hypothesis
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X).
[Figure: network over the ground atoms m(ann,john), f(eric,john), mc(ann), pc(ann), mc(eric), pc(eric), mc(john), pc(john), bt(john)]
Example
Original Bayesian logic program
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).
Initial hypothesis
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X).
Refinement
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).
[Figure: network over the ground atoms m(ann,john), f(eric,john), mc(ann), pc(ann), mc(eric), pc(eric), mc(john), pc(john), bt(john)]
Example
Original Bayesian logic program
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).
Initial hypothesis
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X).
Refinement
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).
Refinement
mc(X) | m(M,X), mc(X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).
[Figure: network over the ground atoms m(ann,john), f(eric,john), mc(ann), pc(ann), mc(eric), pc(eric), mc(john), pc(john), bt(john)]
Example
Original Bayesian logic program
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).
Initial hypothesis
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X).
Refinement
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).
Refinement
mc(X) | m(M,X), pc(X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).
[Figure: network over the ground atoms m(ann,john), f(eric,john), mc(ann), pc(ann), mc(eric), pc(eric), mc(john), pc(john), bt(john)]
...
Properties
• All relevant random variables are known
• First order equivalent of Bayesian network setting
• Hypothesis postulates true regularities in the data
• Logical solutions as initial hypotheses
• Highlights Background Knowledge
Example Experiments
Bloodtype
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).
Data: sampling from 2 families, each 1000 samples
Score: log-likelihood
Goal: learn the definition of bt
CLAUDIEN
mc(X) | m(M,X).
pc(X) | f(F,X).
Highest score
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).
Conclusion
• EM-based and gradient-based methods for ML parameter estimation
• Link between ILP and learning Bayesian networks
• CLAUDIEN setting used to define and to traverse the search space
• Bayesian network scores used to evaluate hypotheses
Thanks !