The Wumpus World!
ACM Class of 2012 · 金汶功

Hunt the wumpus!

Description
• Performance measure
• Environment
• Actuators
• Sensors: Stench, Breeze, Glitter, Bump, Scream

An Example
[Example wumpus-world grids shown on the slides.]

Reasoning via logic

Semantics
• Semantics: the relationship between the logic and the real world
• Model: M(α), the set of worlds in which α is true
• Entailment: α ⊨ β iff M(α) ⊆ M(β)

Models
• KB: the sentences known to be true so far
• α1: "There is no pit in [1,2]"
• α2: "There is no pit in [2,2]"
• M(KB) ⊆ M(α1), so KB ⊨ α1; but KB does not entail α2

Knowledge-based agent
[Diagram: Sensors --Tell--> knowledge base (axioms + current state); the agent Asks the KB, the answer is computed by model checking, and the chosen action is Told back and sent to the Actuators.]

Efficient Model Checking
• DPLL (a sketch appears in the appendix below)
• Early termination
• Pure symbol heuristic
• Unit clause heuristic
• Component analysis
• …

Drawbacks
• Model checking (propositional satisfiability) is NP-complete
• The knowledge base may tell us nothing: logic alone cannot say which square is probably safe

Probabilistic Reasoning

Full joint probability distribution
• P(X, Y) = P(X | Y) P(Y)
• X: {1, 2, 3, 4} → {0.1, 0.2, 0.3, 0.4}
• Y: {a, b} → {0.4, 0.6}
• P(X = 2, Y = a) = P(X = 2 | Y = a) P(Y = a)
• The full joint distribution assigns a probability to every combination of values

Normalization
• P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache)
• P(toothache) is a constant
• P(cavity | toothache) = α P(cavity ∧ toothache)
• P(Cavity | toothache) = α <0.12, 0.08> = <0.6, 0.4>  (worked out in the appendix below)

The Wumpus World
• Aim: calculate the probability that each of the three fringe squares ([1,3], [2,2], [3,1]) contains a pit.

Full joint distribution
• P(P1,1, …, P4,4, B1,1, B1,2, B2,1) = P(B1,1, B1,2, B2,1 | P1,1, …, P4,4) P(P1,1, …, P4,4)
• P(P1,1, …, P4,4) = ∏i,j P(Pi,j)
• Each square contains a pit with probability 0.2, independently of the others
• P(P1,1, …, P4,4) = 0.2^n × 0.8^(16−n) for a configuration with n pits

How likely is it that [1,3] has a pit?
• Given observations:
  b = ¬b1,1 ∧ b1,2 ∧ b2,1
  known = ¬p1,1 ∧ ¬p1,2 ∧ ¬p2,1
• P(P1,3 | known, b) = α Σ_unknown P(P1,3, unknown, known, b)
• 2^12 = 4096 terms

Using independence
[Diagram: the breeze observations are conditionally independent of the squares beyond the frontier, given the frontier squares.]

Simplification
• P(P1,3 | known, b) = α Σ_unknown P(P1,3, known, b, unknown)
• = α Σ_unknown P(b | P1,3, known, unknown) P(P1,3, known, unknown)
• = α Σ_frontier Σ_other P(b | P1,3, known, frontier, other) P(P1,3, known, frontier, other)
• = …
• = α P(known) P(P1,3) Σ_frontier P(b | P1,3, known, frontier) P(frontier)
• Now there are only 4 frontier terms, cheers!

Finally
• P(P1,3 | known, b) = <0.31, 0.69>
• [2,2] contains a pit with probability 0.86!
• (Both numbers are reproduced by the enumeration sketch in the appendix below.)
• Data structures that exploit independence: Bayesian networks

Bayesian Network

Simple Example
• Burglary:   P(B) = .001
• Earthquake: P(E) = .002
• Alarm (Bark): P(A | B, E)
      B      E      P(A)
      true   true   .95
      true   false  .94
      false  true   .29
      false  false  .001
• JohnCalls: P(J | A)
      A (Bark)   P(J)
      true       .90
      false      .05
• MaryCalls: P(M | A)
      A (Bark)   P(M)
      true       .70
      false      .01

Specification
• Each node corresponds to a random variable
• The graph is acyclic (a DAG)
• Each node Xi has a conditional probability distribution P(Xi | Parents(Xi))

Conditional Independence
[Diagram illustrating the conditional-independence structure of the network.]

Exact Inference
• P(b | j, m) = α P(b, j, m)
• = α Σe Σa P(b, j, m, e, a)
• = α Σe Σa P(b) P(e) P(a | b, e) P(j | a) P(m | a)
• P(B | j, m) = <0.284, 0.716>  (reproduced by the enumeration sketch in the appendix below)

[Figure: the wumpus query as a Bayesian network. Pit variables P1,3, P2,2, P3,1 each have prior 0.2; known and b are deterministic children of the pit variables.]
• CPT for b given P1,3, P2,2, P3,1:
      P1,3   P2,2   P3,1   P(b)
      true   true   true   1
      true   true   false  1
      true   false  true   1
      true   false  false  0
      false  true   true   1
      false  true   false  1
      false  false  true   0
      false  false  false  0

Approximate Inference
• Markov chain Monte Carlo (MCMC)
• Gibbs sampling (a sketch appears in the appendix below)
• Idea: the long-run fraction of time spent in each state is exactly proportional to its posterior probability.
• P(xi′ | MarkovBlanket(Xi)) = α P(xi′ | Parents(Xi)) × ∏Yj∈Children(Xi) P(yj | parents(Yj))

Reference
• http://zh.wikipedia.org/wiki/Hunt_the_Wumpus
• http://zh.wikipedia.org/wiki/%E8%B4%9D%E5%8F%B6%E6%96%AF%E7%BD%91%E7%BB%9C
• Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd edition, 2010.
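Appendix: code sketches

The "Efficient Model Checking" slide lists DPLL together with early termination, the pure-symbol heuristic, and the unit-clause heuristic. Below is a minimal Python sketch of that procedure; the clause representation (sets of (symbol, polarity) literals), the helper clause_value, and the tiny usage example are my own choices, not from the slides. Entailment KB ⊨ α is then checked by asking whether KB ∧ ¬α is unsatisfiable.

```python
# Minimal DPLL sketch: a clause is a frozenset of (symbol, polarity) literals,
# e.g. {("P11", False), ("B21", True)} encodes (~P11 v B21).

def clause_value(clause, model):
    """True if some literal is satisfied, False if all are falsified, else None."""
    undecided = False
    for sym, pol in clause:
        if sym not in model:
            undecided = True
        elif model[sym] == pol:
            return True
    return None if undecided else False

def dpll(clauses, symbols, model):
    # Early termination: stop as soon as every clause is satisfied
    # or some clause is already falsified by the partial model.
    unknown = []
    for clause in clauses:
        value = clause_value(clause, model)
        if value is False:
            return False
        if value is not True:
            unknown.append(clause)
    if not unknown:
        return True

    # Pure symbol heuristic: a symbol appearing with only one polarity
    # in the remaining clauses can be fixed to that polarity.
    for s in symbols:
        polarities = {pol for c in unknown for (sym, pol) in c if sym == s}
        if len(polarities) == 1:
            rest = [x for x in symbols if x != s]
            return dpll(clauses, rest, {**model, s: polarities.pop()})

    # Unit clause heuristic: a clause with a single unassigned literal
    # forces that literal's value.
    for clause in unknown:
        free = [(sym, pol) for (sym, pol) in clause if sym not in model]
        if len(free) == 1:
            sym, pol = free[0]
            rest = [x for x in symbols if x != sym]
            return dpll(clauses, rest, {**model, sym: pol})

    # Otherwise branch on the first unassigned symbol.
    s, rest = symbols[0], symbols[1:]
    return (dpll(clauses, rest, {**model, s: True}) or
            dpll(clauses, rest, {**model, s: False}))

# Tiny usage example (hypothetical symbols): KB = {P v Q, ~P} entails Q,
# so KB ^ ~Q must be unsatisfiable.
KB = [frozenset({("P", True), ("Q", True)}), frozenset({("P", False)})]
not_q = [frozenset({("Q", False)})]
print(dpll(KB + not_q, ["P", "Q"], {}))   # False, i.e. KB |= Q
```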
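The "Full joint probability distribution" and "Normalization" slides can be reproduced in a few lines. The joint values 0.12 and 0.08 for the toothache example are the ones on the slide; for the X/Y example the slide only gives marginals, so the code below assumes X and Y are independent (my assumption) just to show how a joint entry is formed.

```python
# Joint entry from the X/Y example, assuming independence (not stated on the slide).
P_Y = {"a": 0.4, "b": 0.6}
P_X_given_Y = {"a": {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4},
               "b": {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}}  # assumed equal for both y

P_XY = {(x, y): P_X_given_Y[y][x] * P_Y[y] for y in P_Y for x in P_X_given_Y[y]}
print(P_XY[(2, "a")])   # P(X=2, Y=a) = P(X=2 | Y=a) P(Y=a) = 0.2 * 0.4 = 0.08

# Normalization: P(Cavity | toothache) = alpha * P(Cavity ^ toothache).
joint = {True: 0.12, False: 0.08}        # P(cavity ^ toothache), P(~cavity ^ toothache)
alpha = 1.0 / sum(joint.values())        # alpha = 1 / P(toothache)
posterior = {c: alpha * p for c, p in joint.items()}
print(posterior)                         # {True: 0.6, False: 0.4} up to float rounding
```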
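The "Simplification" and "Finally" slides reduce the pit query to a sum over the frontier squares. The brute-force check below simply enumerates the three fringe squares [1,3], [2,2], [3,1] with prior 0.2, keeps only configurations consistent with the deterministic breeze CPT reconstructed above, and normalizes. Function and variable names are mine; the numbers reproduce <0.31, 0.69> for [1,3] and about 0.86 for [2,2].

```python
from itertools import product

PRIOR = 0.2   # P(pit) for each unvisited square

def consistent_with_b(p13, p22, p31):
    """Deterministic CPT for b = ~b11 ^ b12 ^ b21 given known (no pits in
    [1,1], [1,2], [2,1]): the breeze in [1,2] needs a pit in [1,3] or [2,2];
    the breeze in [2,1] needs a pit in [2,2] or [3,1]."""
    return (p13 or p22) and (p22 or p31)

def posterior(query):
    """P(query square has a pit | known, b) for query in {'P13', 'P22', 'P31'}."""
    weights = {True: 0.0, False: 0.0}
    for p13, p22, p31 in product([True, False], repeat=3):
        if not consistent_with_b(p13, p22, p31):
            continue                      # P(b | pits) = 0 for this configuration
        prob = 1.0
        for pit in (p13, p22, p31):
            prob *= PRIOR if pit else (1 - PRIOR)
        value = {"P13": p13, "P22": p22, "P31": p31}[query]
        weights[value] += prob
    alpha = 1.0 / (weights[True] + weights[False])
    return weights[True] * alpha, weights[False] * alpha

print(posterior("P13"))   # approximately (0.31, 0.69)
print(posterior("P22"))   # approximately (0.86, 0.14)
```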
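The "Exact Inference" slide evaluates P(B | j, m) by summing out E and A over the product of CPT entries. The enumeration below uses the numbers from the "Simple Example" table; the dictionary layout and function name are my own choices. It reproduces <0.284, 0.716>.

```python
from itertools import product

# CPTs from the burglary network on the slides.
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=true | B, E)
P_J = {True: 0.90, False: 0.05}                       # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                       # P(M=true | A)

def p_b_given_jm():
    """P(B | j, m): sum the full joint over E and A, then normalize."""
    weights = {}
    for b in (True, False):
        total = 0.0
        for e, a in product((True, False), repeat=2):
            p_a = P_A[(b, e)] if a else 1 - P_A[(b, e)]
            total += P_B[b] * P_E[e] * p_a * P_J[a] * P_M[a]
        weights[b] = total
    alpha = 1.0 / sum(weights.values())
    return {b: alpha * w for b, w in weights.items()}

print(p_b_given_jm())   # {True: ~0.284, False: ~0.716}
```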
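The "Approximate Inference" slide gives the Gibbs update P(xi′ | MarkovBlanket(Xi)) ∝ P(xi′ | Parents(Xi)) × ∏ P(yj | parents(Yj)). The sketch below runs Gibbs sampling on the same burglary network with evidence j and m, resampling B, E, A in turn; since the network is tiny, each conditional is read off from the full joint rather than from the Markov blanket product explicitly. CPT numbers are from the slides; everything else (function names, chain length, starting state) is an assumption, and burn-in is omitted for brevity.

```python
import random

# CPTs (same numbers as the enumeration sketch above).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def bern(p, value):
    """Probability of a boolean value under a Bernoulli(p) for 'true'."""
    return p if value else 1 - p

def joint(b, e, a, j, m):
    """Full joint of the network; used to read off each Gibbs conditional."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))

def gibbs_burglary(n_samples=100_000, seed=0):
    """Estimate P(b | j, m) by Gibbs sampling over the non-evidence variables."""
    rng = random.Random(seed)
    state = {"B": False, "E": False, "A": True}   # arbitrary starting state
    j = m = True                                  # evidence
    count_b = 0
    for _ in range(n_samples):
        for var in ("B", "E", "A"):
            # P(var = true | everything else), proportional to the joint.
            w = {}
            for value in (True, False):
                s = {**state, var: value}
                w[value] = joint(s["B"], s["E"], s["A"], j, m)
            state[var] = rng.random() < w[True] / (w[True] + w[False])
        count_b += state["B"]
    return count_b / n_samples

print(gibbs_burglary())   # should be close to 0.284
```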