Document 13479958

advertisement
Note to other teachers and users of these slides.
Andrew would be delighted if you found this source
material useful in giving your own lectures. Feel free
to use these slides verbatim, or to modify them to fit
your own needs. PowerPoint originals are available. If
you make use of a significant portion of these slides in
your own lecture, please include this message, or the
following link to the source repository of Andrew’s
tutorials: http://www.cs.cmu.edu/~awm/tutorials .
Comments and corrections gratefully received.
Bayes Net Structure
Learning
Andrew W. Moore
Associate Professor
School of Computer Science
Carnegie Mellon University
www.cs.cmu.edu/~awm
awm@cs.cmu.edu
412-268-7599
Copyright © 2001, Andrew W. Moore
Oct 29th, 2001
Reminder: A Bayes Net
Copyright © 2001, Andrew W. Moore
Bayes Net Structure: Slide 2
1
Estimating
Probability
Tables
Copyright © 2001, Andrew W. Moore
Bayes Net Structure: Slide 3
Estimating
Probability
Tables
Copyright © 2001, Andrew W. Moore
Bayes Net Structure: Slide 4
2
Scoring a
structure
(Which of these fits
the data best?)
N. Friedman and Z. Yakhini, On the sample
complexity of learning Bayesian networks,
Proceedings of the 12th conference on
Uncertainty in Artificial Intelligence, Morgan
Kaufmann, 1996
Score =
N
− params log R
2
 num combinations 


m  of parent values  (arityof X j )
+ R∑
j =1
∑
∑ P(V ) P( X
k =1
k
v =1
j
= v | Vk ) log P ( X j = v | Vk )
Copyright © 2001, Andrew W. Moore
Bayes Net Structure: Slide 5
Scoring a
structure
Number
of nonredundant
parameters defining
the net
#Attributes
Sums over all the
rows in the probability table for X j
#Records
Score =
N
− params log R
2
The parent values
in the k’th row of
X j ’s probability
table
 num combinations 


m  of parent values  (arityof X j )
+ R∑
j =1
∑
k =1
∑ P(V ) P( X
v =1
k
j
= v | Vk ) log P ( X j = v | Vk )
All these values estimated from data
Copyright © 2001, Andrew W. Moore
Bayes Net Structure: Slide 6
3
Scoring a
structure
This is called a BIC (Bayes Information
Criterion) estimate
This part is a penalty for too many
parameters
This part is the training set loglikelihood
Score =
N
− params log R
2
BIC asymptotically tries to get the
structure right. (There’s a lot of heavy emotional debate
about whether this is the best scoring criterion)
 num combinations 


m  of parent values  (arityof X j )
+ R∑
j =1
∑
k =1
∑ P(V ) P( X
v =1
k
j
= v | Vk ) log P ( X j = v | Vk )
All these values estimated from data
Copyright © 2001, Andrew W. Moore
Bayes Net Structure: Slide 7
Searching
for structure
with best
score
Copyright © 2001, Andrew W. Moore
Bayes Net Structure: Slide 8
4
Inputs Inputs
Inputs
Learning Methods until today
Dec Tree, Sigmoid Perceptron, Sigmoid N.Net,
Classifier
Predict Gauss/Joint BC, Gauss Naïve BC, N.Neigh
category
Density
Estimator
Probability
Regressor
Predict
real no.
Copyright © 2001, Andrew W. Moore
Joint DE, Naïve DE, Gauss/Joint DE, Gauss Naïve
DE
Linear Regression, Quadratic Regression,
Perceptron, Neural Net, N.Neigh, Kernel, LWR
Bayes Net Structure: Slide 9
Inputs Inputs
Inputs
Learning Methods added today
Dec Tree, Sigmoid Perceptron, Sigmoid N.Net,
Classifier
Predict Gauss/Joint BC, Gauss Naïve BC, N.Neigh
category
Density
Estimator
Probability
Regressor
Predict
real no.
Copyright © 2001, Andrew W. Moore
Joint DE, Naïve DE, Gauss/Joint DE, Gauss Naïve
DE, Bayes Net Structure Learning (Note, can be
extended to permit mixed categorical/real values)
Linear Regression, Quadratic Regression,
Perceptron, Neural Net, N.Neigh, Kernel, LWR
Bayes Net Structure: Slide 10
5
Inputs Inputs
Inputs
But also, for free…
Dec Tree, Sigmoid Perceptron, Sigmoid N.Net,
Classifier
Predict Gauss/Joint BC, Gauss Naïve BC, N.Neigh, Bayes
category Net Based BC
Density
Estimator
Probability
Regressor
Predict
real no.
Copyright © 2001, Andrew W. Moore
Joint DE, Naïve DE, Gauss/Joint DE, Gauss Naïve
DE, Bayes Net Structure Learning
Linear Regression, Quadratic Regression,
Perceptron, Neural Net, N.Neigh, Kernel, LWR
Bayes Net Structure: Slide 11
Inputs Inputs
Inputs
Inputs
And a new operation…
Inference
P(E1|E2) Joint DE, Bayes Net Structure Learning
Engine Learn
Dec Tree, Sigmoid Perceptron, Sigmoid N.Net,
Classifier
Predict Gauss/Joint BC, Gauss Naïve BC, N.Neigh, Bayes
category Net Based BC
Density
Estimator
Probability
Regressor
Predict
real no.
Copyright © 2001, Andrew W. Moore
Joint DE, Naïve DE, Gauss/Joint DE, Gauss Naïve
DE, Bayes Net Structure Learning
Linear Regression, Quadratic Regression,
Perceptron, Neural Net, N.Neigh, Kernel, LWR
Bayes Net Structure: Slide 12
6
Download