Probabilistic Reasoning
Representing knowledge in an uncertain domain

The full joint distribution can answer any question about the domain, but it can become intractably large as the number of variables grows.

Independence and conditional independence relationships among variables can greatly reduce the number of probabilities that need to be specified in order to define the full joint distribution.

We introduce a data structure called a Bayesian network to represent the dependencies among variables. A Bayesian network can represent essentially any full joint probability distribution.

How to define a Bayesian network?
A Bayesian network is a directed graph. The full specification is as follows:
1. Each node corresponds to a random variable, which may be discrete or continuous.
2. A set of directed edges (arrows) connects pairs of nodes. If there is a directed edge from node X to node Y, X is said to be a parent of Y. The graph has no directed cycles.
3. Each node $X_i$ has a conditional probability distribution $P(X_i \mid \mathrm{Parents}(X_i))$ that quantifies the effect of the parents on the node.

How to define the topology of the network?
The topology specifies the conditional independence relationships that hold in the domain. It is usually easy for a domain expert to decide what direct influences exist in the domain.

Can you give an example of finding the topology for some world?
Consider a simple world consisting of the random variables Toothache, Cavity, Catch and Weather.

Which of these random variables are independent, and which are conditionally independent?
Weather is independent of the other variables. Toothache and Catch are conditionally independent given Cavity.

How to represent this as a directed graph?
Topology of the network (figure): Weather stands alone with no edges; Cavity is the parent of both Toothache and Catch.

Once we have defined the topology of the Bayesian network, what is left?
We need to specify a conditional probability distribution for each variable given its parents. Formally, the conditional independence of Toothache and Catch given Cavity is indicated by the absence of a link between Toothache and Catch.

Bayesian belief network
Assume your house has an alarm system against burglary. You live in a seismically active area, and the alarm system can occasionally be set off by an earthquake. You have two neighbors, Mary and John, who do not know each other. If they hear the alarm they call you, but this is not guaranteed.

What is the probability that the alarm has sounded, but neither a burglary nor an earthquake has occurred, and both John and Mary call?

$$P(a, \lnot b, \lnot e, j, m) = ?$$

Applying the chain rule, and then the conditional independences encoded in the network:

$$
\begin{aligned}
P(j, m, a, \lnot b, \lnot e) &= P(j \mid m, a, \lnot b, \lnot e)\, P(m, a, \lnot b, \lnot e) \\
&= P(j \mid a)\, P(m \mid a, \lnot b, \lnot e)\, P(a, \lnot b, \lnot e) \\
&= P(j \mid a)\, P(m \mid a)\, P(a \mid \lnot b, \lnot e)\, P(\lnot b, \lnot e) \\
&= P(j \mid a)\, P(m \mid a)\, P(a \mid \lnot b, \lnot e)\, P(\lnot b \mid \lnot e)\, P(\lnot e) \\
&= P(j \mid a)\, P(m \mid a)\, P(a \mid \lnot b, \lnot e)\, P(\lnot b)\, P(\lnot e)
\end{aligned}
$$

Substituting the values from the conditional probability tables:

$$
\begin{aligned}
P(a, \lnot b, \lnot e, j, m) &= P(a \mid \lnot b, \lnot e) \times P(\lnot b) \times P(\lnot e) \times P(j \mid a) \times P(m \mid a) \\
&= 0.001 \times 0.999 \times 0.998 \times 0.9 \times 0.7 \approx 0.000628
\end{aligned}
$$

What is the probability that John has called, Mary has called, the alarm has sounded, a burglary has occurred and an earthquake has also occurred?
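The factorisation derived above for the alarm network reduces the probability of any full assignment of the five variables to a product of five table lookups. The following is a minimal Python sketch of that idea, not a definitive implementation. The names `P_A`, `P_J`, `P_M`, `bern` and `joint` are illustrative. The table entries are the ones used in the worked calculations of these slides, except P(a | ¬b, e) = 0.29, which does not appear in the slides and is assumed here from the usual textbook version of this network.

```python
# CPTs of the alarm network. Each table stores the probability that the
# variable is True given its parents; the False case is 1 minus that value.
P_B = 0.001   # P(b)
P_E = 0.002   # P(e)
P_A = {       # P(a | B, E), keyed by (burglary, earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,    # assumption: not given in the slides
    (False, False): 0.001,
}
P_J = {True: 0.90, False: 0.05}   # P(j | A), keyed by alarm
P_M = {True: 0.70, False: 0.01}   # P(m | A), keyed by alarm


def bern(p_true, value):
    """Probability that a Boolean variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) as the product of the five local CPT entries."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))


# Alarm sounded, no burglary, no earthquake, both neighbors call.
p = joint(b=False, e=False, a=True, j=True, m=True)
print(f"{p:.6f}")  # 0.000628
```

Because every entry of the full joint distribution is obtained the same way, calling `joint(True, True, True, True, True)` evaluates the all-true assignment asked about next.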
Calculate P(j, m, a, b, e)

$$
\begin{aligned}
P(j, m, a, b, e) &= P(j \mid m, a, b, e)\, P(m, a, b, e) \\
&= P(j \mid m, a, b, e)\, P(m \mid a, b, e)\, P(a, b, e) \\
&= P(j \mid m, a, b, e)\, P(m \mid a, b, e)\, P(a \mid b, e)\, P(b, e) \\
&= P(j \mid m, a, b, e)\, P(m \mid a, b, e)\, P(a \mid b, e)\, P(b \mid e)\, P(e) \\
&= P(j \mid a)\, P(m \mid a)\, P(a \mid b, e)\, P(b)\, P(e) \\
&= 0.9 \times 0.7 \times 0.95 \times 0.001 \times 0.002 \\
&= 0.000001197 = 1.197 \times 10^{-6}
\end{aligned}
$$

What is the probability that a burglary has occurred and John has called and Mary has called?

P(b, j, m) = ?
Summing the joint distribution over all values of Alarm and Earthquake:

$$
\begin{aligned}
P(b, j, m) &= \sum_{e} \sum_{a} P(j, m, a, b, e) \\
&= \sum_{e} \sum_{a} P(j \mid a)\, P(m \mid a)\, P(a \mid b, e)\, P(b)\, P(e) \\
&= \sum_{e} \big[ P(j \mid a)\, P(m \mid a)\, P(a \mid b, e)\, P(b)\, P(e) + P(j \mid \lnot a)\, P(m \mid \lnot a)\, P(\lnot a \mid b, e)\, P(b)\, P(e) \big] \\
&= \big[ P(j \mid a)\, P(m \mid a)\, P(a \mid b, e)\, P(b)\, P(e) + P(j \mid a)\, P(m \mid a)\, P(a \mid b, \lnot e)\, P(b)\, P(\lnot e) \big] \\
&\quad + \big[ P(j \mid \lnot a)\, P(m \mid \lnot a)\, P(\lnot a \mid b, e)\, P(b)\, P(e) + P(j \mid \lnot a)\, P(m \mid \lnot a)\, P(\lnot a \mid b, \lnot e)\, P(b)\, P(\lnot e) \big] \\
&= \big[ 0.9 \times 0.7 \times 0.95 \times 0.001 \times 0.002 + 0.9 \times 0.7 \times 0.94 \times 0.001 \times 0.998 \big] \\
&\quad + \big[ 0.05 \times 0.01 \times 0.05 \times 0.001 \times 0.002 + 0.05 \times 0.01 \times 0.06 \times 0.001 \times 0.998 \big] \\
&= 0.000001197 + 0.0005910156 + 5 \times 10^{-11} + 2.99 \times 10^{-8} \\
&= 0.0005922426
\end{aligned}
$$

Some terminology
Consider the query P(b | j, m). Here Burglary is the query variable, and JohnCalls and MaryCalls are the evidence variables.
What are the other variables, which are neither evidence nor query variables, called? They are called hidden variables.

$$P(b \mid j, m) = \frac{P(b, j, m)}{P(j, m)} = \alpha\, P(b, j, m)$$

Computing the denominator P(j, m) directly requires three summations (over Burglary, Earthquake and Alarm). To reduce the calculation we use normalization.

$$
\begin{aligned}
P(B \mid j, m) &= \langle P(b \mid j, m),\; P(\lnot b \mid j, m) \rangle && (1) \\
P(b \mid j, m) &= \alpha\, P(b, j, m) && (2) \\
P(\lnot b \mid j, m) &= \alpha\, P(\lnot b, j, m) && (3)
\end{aligned}
$$

To find α, add equations (2) and (3) and set the sum equal to 1.

Computing P(¬b, j, m) in the same way as P(b, j, m) gives 0.0014919, so

$$
\begin{aligned}
P(\lnot b \mid j, m) &= \alpha \times 0.0014919 \\
\alpha &= \frac{1}{0.0005922426 + 0.0014919} \approx 479.8 \\
P(B \mid j, m) &= \alpha\, \langle 0.0005922426,\; 0.0014919 \rangle = \langle 0.284,\; 0.716 \rangle
\end{aligned}
$$
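The enumeration over hidden variables and the normalization step above can be reproduced mechanically. The sketch below is one possible Python rendering under stated assumptions, not a definitive implementation. The helper names `bern`, `joint` and `marginal_bjm` are hypothetical. The conditional probability values are those used in the calculations of these slides, except P(a | ¬b, e) = 0.29, which is not stated in the slides and is assumed from the usual textbook version of the alarm network (it is consistent with the value 0.0014919 quoted for P(¬b, j, m)).

```python
from itertools import product

# Probability that each variable is True given its parents.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29,            # assumption: not given in the slides
       (False, False): 0.001}          # keyed by (burglary, earthquake)
P_J = {True: 0.90, False: 0.05}        # keyed by alarm
P_M = {True: 0.70, False: 0.01}        # keyed by alarm


def bern(p_true, value):
    """Probability that a Boolean variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """One entry of the full joint: the product of the five local CPT entries."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))


def marginal_bjm(b):
    """P(B=b, j, m): sum the joint over the hidden variables E and A."""
    return sum(joint(b, e, a, True, True)
               for e, a in product([True, False], repeat=2))


# Unnormalized vector <P(b, j, m), P(not b, j, m)>.
unnormalized = [marginal_bjm(True), marginal_bjm(False)]

# alpha makes the two entries sum to 1, so P(j, m) is never computed separately.
alpha = 1.0 / sum(unnormalized)
posterior = [alpha * p for p in unnormalized]

print([round(p, 3) for p in posterior])  # [0.284, 0.716]
```

Normalizing the two-element vector is equivalent to dividing by P(j, m), since the sum of the two unnormalized entries is exactly that denominator.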