Uploaded by Sudhanshu Keshaowar

Probabilistic Reasoning: Bayesian Networks

Probabilistic Reasoning
• Representing knowledge in an uncertain
domain
The full joint distribution can answer any
question about the domain, but it can become
intractably large as the number of variables
grows.
Independence and conditional independence
relationships among variables can greatly reduce
the number of probabilities that need to be
specified in order to define the full joint
distribution.
We introduce a data structure called a Bayesian
network to represent the dependencies among
variables.
A Bayesian network can represent essentially any
full joint probability distribution.
How to define a Bayesian network?
A Bayesian network is a directed acyclic graph.
The full specification is as follows:
1. Each node corresponds to a random variable,
which may be discrete or continuous.
2. A set of directed edges (arrows) connects
pairs of nodes.
If there is a directed edge from node X to node
Y, X is said to be a parent of Y.
The graph has no directed cycles.
3. Each node Xi has a conditional probability
distribution P(Xi | Parents(Xi)) that quantifies
the effect of the parents on the node.
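The three-part specification above can be sketched as a small data structure. This is only an illustrative sketch: the names `Node`, `is_acyclic`, and `cpt_is_complete` are hypothetical, all variables are assumed to be Boolean, and the numbers in the usage example are made up.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Node:
    """One random variable with its parents and conditional probability table."""
    name: str
    parents: tuple = ()
    # Maps a tuple of parent values to P(node = True | parent values).
    cpt: dict = field(default_factory=dict)


def is_acyclic(nodes):
    """Return True if the parent links contain no directed cycle (depth-first search)."""
    state = {}  # name -> "visiting" or "done"

    def visit(name):
        if state.get(name) == "done":
            return True
        if state.get(name) == "visiting":
            return False  # reached a node that is already on the current path
        state[name] = "visiting"
        ok = all(visit(parent) for parent in nodes[name].parents)
        state[name] = "done"
        return ok

    return all(visit(name) for name in nodes)


def cpt_is_complete(node):
    """Check that the CPT has one entry per combination of Boolean parent values."""
    expected = set(product([True, False], repeat=len(node.parents)))
    return set(node.cpt) == expected


# Usage: a two-node network X -> Y with made-up probabilities.
example = {
    "X": Node("X", (), {(): 0.3}),
    "Y": Node("Y", ("X",), {(True,): 0.8, (False,): 0.1}),
}
print(is_acyclic(example), all(cpt_is_complete(n) for n in example.values()))
```

Storing only P(node = True | parents) is enough for Boolean variables, since the probability of False is one minus that entry.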
How to define the topology of the network?
The topology specifies the conditional independence
relationships that hold in the domain.
It is usually easy for a domain expert to decide
what direct influences exist in the domain.
Can you give an example of finding the
topology of some world?
Consider a simple world consisting of the random
variables
Toothache
Cavity
Catch
Weather
Which of these random variables are independent,
and which are conditionally independent?
Weather is independent of the other variables.
Toothache and Catch are conditionally
independent given Cavity.
How do we represent this as a directed graph?
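Topology of Network
[Figure: directed graph with edges Cavity → Toothache and Cavity → Catch; Weather is an isolated node.]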
Once we have defined the topology of the
Bayesian network, what is left?
We need to specify a conditional probability
distribution for each variable given its parents.
Formally, the conditional independence of
Toothache and Catch given Cavity is indicated by
the absence of a link between Toothache and
Catch.
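To see how much the topology saves, the number of independent probabilities can be counted for the full joint distribution and for the network. The sketch below rests on assumptions that the slides do not state: Toothache, Cavity, and Catch are taken to be Boolean, Weather is taken to have four values, and the edges are taken to run from Cavity to Toothache and from Cavity to Catch. The function names are hypothetical.

```python
from math import prod

# Assumed domain sizes (not given in the slides).
domain_size = {"Toothache": 2, "Cavity": 2, "Catch": 2, "Weather": 4}

# Assumed edge directions: Cavity is the only parent of Toothache and of Catch,
# and Weather has no links at all.
parents = {
    "Weather": [],
    "Cavity": [],
    "Toothache": ["Cavity"],
    "Catch": ["Cavity"],
}


def full_joint_parameters(sizes):
    """Independent entries in the full joint table (entries must sum to 1)."""
    return prod(sizes.values()) - 1


def network_parameters(sizes, parents):
    """Independent entries summed over every conditional probability table."""
    total = 0
    for var, pars in parents.items():
        rows = prod(sizes[p] for p in pars)  # one row per parent combination
        total += rows * (sizes[var] - 1)     # each row sums to 1
    return total


print(full_joint_parameters(domain_size))        # 31
print(network_parameters(domain_size, parents))  # 8
```

Under these assumptions the network needs 8 numbers instead of 31, and the gap widens quickly as more variables are added.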
Bayesian belief network
Assume your house has an alarm system against
burglary. You live in a seismically active area,
and the alarm can occasionally be set off by an
earthquake. You have two neighbors, Mary and
John, who do not know each other. If they hear
the alarm they call you, but this is not
guaranteed.
What is the probability that the alarm has sounded,
but neither a burglary nor an earthquake has
occurred, and both John and Mary call?
P(a, ¬b, ¬e, j, m) = ?
P(j, m, a, ¬b, ¬e) = P(j | m, a, ¬b, ¬e) × P(m, a, ¬b, ¬e)
= P(j | a) × P(m | a, ¬b, ¬e) × P(a, ¬b, ¬e)
= P(j | a) × P(m | a) × P(a | ¬b, ¬e) × P(¬b, ¬e)
= P(j | a) × P(m | a) × P(a | ¬b, ¬e) × P(¬b | ¬e) × P(¬e)
= P(j | a) × P(m | a) × P(a | ¬b, ¬e) × P(¬b) × P(¬e)
= 0.9 × 0.7 × 0.001 × 0.999 × 0.998 = 0.000628
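The product above can be checked with a few lines of Python. This is a minimal sketch: the probabilities are the ones used in the arithmetic of these slides, except P(a | ¬b, e) = 0.29, which never appears in the worked examples and is an assumption here (the usual textbook value for this alarm example). The names `prob` and `joint` are hypothetical.

```python
# Priors and conditional probabilities for the alarm network.
P_B = 0.001                      # P(burglary)
P_E = 0.002                      # P(earthquake)
P_A = {                          # P(alarm | burglary, earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,         # assumption: not shown in the slides
    (False, False): 0.001,
}
P_J = {True: 0.9, False: 0.05}   # P(JohnCalls | alarm)
P_M = {True: 0.7, False: 0.01}   # P(MaryCalls | alarm)


def prob(p_true, value):
    """Probability of a Boolean value, given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """P(b, e, a, j, m) as the product of each node's entry given its parents."""
    return (prob(P_B, b) * prob(P_E, e) * prob(P_A[(b, e)], a)
            * prob(P_J[a], j) * prob(P_M[a], m))


p = joint(b=False, e=False, a=True, j=True, m=True)
print(f"{p:.6f}")  # 0.000628
```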
What is the probability that John has called,
Mary has called, the alarm has sounded, a
burglary has occurred, and an earthquake has
also occurred?
Calculate P(j, m, a, b, e)
P(j, m, a, b, e) = P(j | m, a, b, e) × P(m, a, b, e)
= P(j | m, a, b, e) × P(m | a, b, e) × P(a, b, e)
= P(j | m, a, b, e) × P(m | a, b, e) × P(a | b, e) × P(b, e)
= P(j | m, a, b, e) × P(m | a, b, e) × P(a | b, e) × P(b | e) × P(e)
= P(j | a) × P(m | a) × P(a | b, e) × P(b) × P(e)
= 0.9 × 0.7 × 0.95 × 0.001 × 0.002 = 0.000001197 = 1.197 × 10^-6
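Both joint-probability derivations follow one pattern: apply the chain rule, then drop every conditioning variable that is not a parent of the node in question. A sketch of the general form, assuming the variables X_1, ..., X_n are numbered so that every parent comes before its children (the indexing is our notation, not the slides'):

```latex
P(x_1, \ldots, x_n)
  = \prod_{i=1}^{n} P(x_i \mid x_{i-1}, \ldots, x_1)
  = \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{parents}(X_i)\bigr)
```

The first equality is the chain rule and holds for any distribution; the second uses the conditional independence assertions encoded by the network topology.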
What is the probability that a burglary has
occurred and both John and Mary have called?
P(b, j, m) = ?
= Σ_e Σ_a P(j, m, a, b, e)
= Σ_e Σ_a P(j | a) × P(m | a) × P(a | b, e) × P(b) × P(e)
= Σ_e [P(j | a) × P(m | a) × P(a | b, e) × P(b) × P(e) +
P(j | ¬a) × P(m | ¬a) × P(¬a | b, e) × P(b) × P(e)]
= [P(j | a) × P(m | a) × P(a | b, e) × P(b) × P(e) +
P(j | a) × P(m | a) × P(a | b, ¬e) × P(b) × P(¬e)]
+ [P(j | ¬a) × P(m | ¬a) × P(¬a | b, e) × P(b) × P(e) +
P(j | ¬a) × P(m | ¬a) × P(¬a | b, ¬e) × P(b) × P(¬e)]
= [0.9 × 0.7 × 0.95 × 0.001 × 0.002 +
0.9 × 0.7 × 0.94 × 0.001 × 0.998]
+ [0.05 × 0.01 × 0.05 × 0.001 × 0.002 +
0.05 × 0.01 × 0.06 × 0.001 × 0.998]
= 0.000001197 + 0.0005910156 + 5 × 10^-11 + 2.99 × 10^-8
= 0.0005922426
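The double summation over the unobserved variables can also be written as a loop. The sketch below is self-contained and uses the probabilities from the arithmetic in these slides; P(a | ¬b, e) = 0.29 is an assumption (it is not used in the P(b, j, m) calculation, only when the same function is called for ¬b). The function names are hypothetical.

```python
from itertools import product

P_B = 0.001                      # P(burglary)
P_E = 0.002                      # P(earthquake)
P_A = {                          # P(alarm | burglary, earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,         # assumption: not shown in the slides
    (False, False): 0.001,
}
P_J = {True: 0.9, False: 0.05}   # P(JohnCalls | alarm)
P_M = {True: 0.7, False: 0.01}   # P(MaryCalls | alarm)


def prob(p_true, value):
    """Probability of a Boolean value, given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """Product of each node's conditional probability given its parents."""
    return (prob(P_B, b) * prob(P_E, e) * prob(P_A[(b, e)], a)
            * prob(P_J[a], j) * prob(P_M[a], m))


def p_burglary_and_calls(b):
    """P(B = b, j, m): sum the joint over Earthquake and Alarm."""
    return sum(joint(b, e, a, True, True)
               for e, a in product([True, False], repeat=2))


print(f"{p_burglary_and_calls(True):.10f}")  # 0.0005922426
```

Each of the four terms in the hand calculation corresponds to one (e, a) pair produced by `product`.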
Some Terminology
Consider the query
P(b | j, m)
Here Burglary is the query variable.
JohnCalls and MaryCalls are the evidence variables.
What are the remaining variables, which are
neither evidence nor query variables, called?
They are called hidden variables (here, Alarm
and Earthquake).
P(b | j, m) = P(b, j, m) / P(j, m) = alpha × P(b, j, m), where alpha = 1 / P(j, m)
Calculating the denominator P(j, m) directly
requires three summations (over b, e, and a).
To reduce the calculation, we use normalization.
P(B | j, m) = <P(b | j, m), P(¬b | j, m)>   (1)
P(b | j, m) = alpha × P(b, j, m)   (2)
P(¬b | j, m) = alpha × P(¬b, j, m)   (3)
To find alpha, add equations (2) and (3) and set the sum equal to 1.
P(¬b, j, m) = 0.0014919, so P(¬b | j, m) = alpha × 0.0014919
P(B | j, m) = alpha × <0.0005922426, 0.0014919>
alpha = 1 / (0.0005922426 + 0.0014919) ≈ 479.8
P(B | j, m) = <0.284, 0.716>
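The normalization step is short enough to check directly. This sketch takes the two unnormalized values from the slides as given; the function name `normalize` and the dictionary keys are hypothetical.

```python
def normalize(unnormalized):
    """Scale the values so they sum to 1; returns (alpha, normalized dict)."""
    alpha = 1.0 / sum(unnormalized.values())
    return alpha, {key: alpha * value for key, value in unnormalized.items()}


# P(b, j, m) and P(not b, j, m) as computed in the slides.
alpha, posterior = normalize({"b": 0.0005922426, "not_b": 0.0014919})
print({key: round(value, 3) for key, value in posterior.items()})
# {'b': 0.284, 'not_b': 0.716}
```

Because alpha is recovered from the two numerators, P(j, m) never has to be computed as a separate triple summation.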
