CSC242: Intro to AI — Lecture 18
Quiz stop time: 2:15

Exact Inference in Bayesian Networks

P(X | e) = α P(X, e) = α Σ_y P(X, e, y)

where each entry of the joint factors according to the network:

P(x1, ..., xn) = Π_{i=1..n} P(xi | parents(Xi))

• Exact inference in BNs is NP-hard
• It can be shown to be as hard as counting the satisfying assignments of a propositional logic formula => #P-hard

Approximate Inference in Bayesian Networks

The Goal
• Query variable X
• Evidence variables E1, ..., Em
• Observed values: e = <e1, ..., em>
• Non-evidence, non-query ("hidden") variables: Y
• Approximate: P(X | e)

A Simpler Goal
• Same setup, but approximate the unconditional P(X)

A Really Simple Goal
• Estimate P(Heads), i.e., P(Heads = true), i.e., P(heads), for a coin flip
• Flip the coin N times; let Nheads be the number of heads:

P(heads) ≈ Nheads / N

• The estimate becomes exact in the limit:

P(heads) = lim_{N→∞} Nheads / N

Sampling
• Generate events (possible worlds) from a distribution
• Estimate probabilities as the ratio of observed events to total events
• Consistent estimate: becomes exact in the large-sample limit

The Running Example
[Figure: the sprinkler network — Cloudy is the parent of Sprinkler and Rain; Sprinkler and Rain are the parents of WetGrass.]

  P(c) = 0.5
  P(s | c) = 0.10     P(s | ¬c) = 0.50
  P(r | c) = 0.80     P(r | ¬c) = 0.20
  P(w | s, r) = 0.99  P(w | s, ¬r) = 0.90
  P(w | ¬s, r) = 0.90 P(w | ¬s, ¬r) = 0.00

• First query: P(Rain = true)

Sampling
• Generate assignments of values to the random variables
• ... consistent with the full joint distribution encoded in the network
• ... in the sense that, in the limit, the probability of any event equals the frequency of its occurrence

Generating Samples
• Sample each variable in topological order (a child appears after its parents)
• Choose the value for each variable conditioned on the values already chosen for its parents

Example:
• P(Cloudy) = ⟨0.5, 0.5⟩ → sample Cloudy = true
• P(Sprinkler | Cloudy = true) = ⟨0.1, 0.9⟩ → sample Sprinkler = false
• P(Rain | Cloudy = true) = ⟨0.8, 0.2⟩ → sample Rain = true
• P(WetGrass | Sprinkler = false, Rain = true) = ⟨0.9, 0.1⟩ → sample WetGrass = true
• Result: ⟨Cloudy = true, Sprinkler = false, Rain = true, WetGrass = true⟩
• Guaranteed to be a consistent estimate (becomes exact in the large-sample limit)

The Goal, Again
• Approximate P(Rain | Sprinkler = true)
• But the sample ⟨Cloudy = true, Sprinkler = false, Rain = true, WetGrass = true⟩ is inconsistent with the evidence Sprinkler = true

Rejection Sampling
• Generate samples from the prior distribution specified by the network
• Reject any sample inconsistent with the evidence
• Use the remaining samples to estimate the probability of the event

Example: estimate P(Rain | Sprinkler = true) from 100 samples
• Sprinkler = false: 73 (rejected); Sprinkler = true: 27 (kept)
• Of the 27 kept: Rain = true: 8, Rain = false: 19
• Estimate: ⟨8/27, 19/27⟩ = ⟨0.296, 0.704⟩

• Problem: the fraction of samples consistent with the evidence drops exponentially with the number of evidence variables

Likelihood Weighting
• Generate only samples consistent with the evidence
• i.e., fix the values of the evidence variables
• Instead of counting 1 for each sample, weight each sample by the likelihood (probability) of the evidence given the sampled values

Example: estimate P(Rain | Cloudy = true, WetGrass = true)
• Start with weight w = 1.0
• Cloudy is evidence: fix Cloudy = true and update w ← w × P(c) = 0.5
• Sprinkler is not evidence: sample from P(Sprinkler | Cloudy = true) → false (w unchanged)
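Prior sampling and rejection sampling, as described above, can be sketched as follows (a minimal sketch, not the lecture's code: the function names and the CPT encoding are illustrative choices):

```python
import random

def prior_sample(rng):
    """One sample from the sprinkler network, drawn in topological order."""
    c = rng.random() < 0.5                           # P(c) = 0.5
    s = rng.random() < (0.10 if c else 0.50)         # P(s | C)
    r = rng.random() < (0.80 if c else 0.20)         # P(r | C)
    p_w = {(True, True): 0.99, (True, False): 0.90,
           (False, True): 0.90, (False, False): 0.00}[(s, r)]
    w = rng.random() < p_w                           # P(w | S, R)
    return {"Cloudy": c, "Sprinkler": s, "Rain": r, "WetGrass": w}

def rejection_sampling(query, evidence, n, seed=0):
    """Estimate P(query = true | evidence) by discarding inconsistent samples."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n):
        sample = prior_sample(rng)
        if all(sample[v] == val for v, val in evidence.items()):
            kept += 1                 # sample agrees with the evidence
            hits += sample[query]
    return hits / kept if kept else 0.0

# The lecture's query; the exact answer works out to 0.30
est = rejection_sampling("Rain", {"Sprinkler": True}, 100_000)
print(est)   # close to 0.30; note only ~30% of samples survive rejection
```

The wasted 70% of samples here is mild; with many evidence variables almost every sample would be rejected, which is the motivation for likelihood weighting.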
• Rain is not evidence: sample from P(Rain | Cloudy = true) → true (w still 0.5)
• WetGrass is evidence: fix WetGrass = true and update w ← 0.5 × P(w | ¬s, r) = 0.5 × 0.9 = 0.45

Likelihood Weighting
• Generate each sample using topological order
• Evidence variable: fix its value to the evidence value and multiply the sample's weight by that value's probability in the network (given its parents)
• Non-evidence variable: sample a value using the probabilities in the network (given its parents)

Likelihood Weighting
• Pros:
  • Doesn't reject any samples
• Cons:
  • More evidence => lower weights
  • Affected by the order of evidence variables in the topological sort (later = worse)

Approximate Inference in Bayesian Networks
• Rejection Sampling
• Likelihood Weighting
• Next query: P(Rain | Sprinkler = true, WetGrass = true), starting from the state ⟨Cloudy = true, Sprinkler = true, Rain = false, WetGrass = true⟩

Markov Chain Monte Carlo Simulation
• To approximate: P(X | e)
• Generate a sequence of states
• The values of the evidence variables are fixed
• The values of the other variables appear in the right proportion given the distribution encoded by the network

Markov Blanket
[Figure: a node X with parents U1, ..., Um, children Y1, ..., Yn, and the children's other parents Zij — together, X's Markov blanket.]
• The Markov blanket of a node is its parents, its children, and its children's parents.
• A node is conditionally independent of all other nodes in the network given its Markov blanket

Gibbs Sampling (an MCMC algorithm)
• To approximate: P(X | e)
• Start in a state with the evidence variables set to their evidence values (the others set arbitrarily)
• On each step, sample the non-evidence variables conditioned on the current values of the variables in their Markov blankets
• Order irrelevant

Example: estimate P(Rain | Sprinkler = true, WetGrass = true)
• Start: ⟨Cloudy = true, Sprinkler = true, Rain = false, WetGrass = true⟩
• Sample Cloudy from P(Cloudy | Sprinkler = true, Rain = false) → false
• Sample Rain from P(Rain | Cloudy = false, Sprinkler = true, WetGrass = true) → true
• Repeat, counting the fraction of visited states with Rain = true
• A form of local search!
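A hedged sketch of this Gibbs sampler for the lecture's query: each non-evidence variable is resampled by normalizing the product of CPT entries over its Markov blanket. The helper `pr` and the table names are illustrative, not from the lecture.

```python
import random

# CPTs of the sprinkler network, keyed by parent values
P_S = {True: 0.10, False: 0.50}                     # P(s | C)
P_R = {True: 0.80, False: 0.20}                     # P(r | C)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}   # P(w | S, R)

def pr(table, value, *given):
    """P(variable = value | given parent values) from a CPT table."""
    key = given[0] if len(given) == 1 else tuple(given)
    p = table[key]
    return p if value else 1.0 - p

def gibbs(n, seed=0):
    """Estimate P(Rain = true | Sprinkler = true, WetGrass = true)."""
    rng = random.Random(seed)
    s, w = True, True                    # evidence stays clamped
    c = rng.random() < 0.5               # other variables start arbitrarily
    r = rng.random() < 0.5
    hits = 0
    for _ in range(n):
        # Resample Cloudy given its Markov blanket: ∝ P(C) P(s|C) P(r|C)
        pt = 0.5 * pr(P_S, s, True) * pr(P_R, r, True)
        pf = 0.5 * pr(P_S, s, False) * pr(P_R, r, False)
        c = rng.random() < pt / (pt + pf)
        # Resample Rain given its Markov blanket: ∝ P(R|c) P(w|s,R)
        pt = pr(P_R, True, c) * pr(P_W, w, s, True)
        pf = pr(P_R, False, c) * pr(P_W, w, s, False)
        r = rng.random() < pt / (pt + pf)
        hits += r
    return hits / n

est = gibbs(100_000)
print(est)   # close to the exact value, about 0.32
```

Unlike rejection sampling, no work is discarded, and unlike likelihood weighting the evidence influences every variable on every step via the Markov-blanket conditionals.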
Exact Inference in Bayesian Networks
• #P-hard, even for a distribution described compactly as a Bayesian network

Approximate Inference in Bayesian Networks
• Sampling consistent with a distribution
• Rejection Sampling: rejects too many samples
• Likelihood Weighting: weights get too small
• Gibbs Sampling: an MCMC algorithm (like local search)
• All generate consistent estimates (equal to the exact probability in the large-sample limit)

Project 3: Inference in Bayesian Networks
• Implement and evaluate
• Full description: http://www.cs.rochester.edu/courses/242/spring2013/projects/project-03.pdf
• Due: Fri 5 Apr 23:00

For Next Time: AIMA 15.0-15.3
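As a closing illustration, the likelihood-weighting scheme summarized above can be sketched for the lecture's worked example P(Rain | Cloudy = true, WetGrass = true). This is a minimal sketch under assumed names and CPT encoding, not the lecture's implementation:

```python
import random

P_S = {True: 0.10, False: 0.50}                     # P(s | C)
P_R = {True: 0.80, False: 0.20}                     # P(r | C)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}   # P(w | S, R)

def weighted_sample(rng):
    """One sample consistent with Cloudy = true, WetGrass = true, plus its weight."""
    weight = 1.0
    c = True                          # evidence: weight *= P(c) = 0.5
    weight *= 0.5
    s = rng.random() < P_S[c]         # non-evidence: sample given parents
    r = rng.random() < P_R[c]
    weight *= P_W[(s, r)]             # evidence: weight *= P(w | s, r)
    return r, weight

def likelihood_weighting(n, seed=0):
    """Estimate P(Rain = true | Cloudy = true, WetGrass = true)."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        r, weight = weighted_sample(rng)
        den += weight
        if r:
            num += weight
    return num / den

est = likelihood_weighting(100_000)
print(est)   # close to the exact value, about 0.98
```

Every sample is kept, but each contributes only its weight; with many downstream evidence variables most weights shrink toward zero, which is exactly the "weights get too small" con noted in the summary.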