CSC242: Introduction to Artificial Intelligence
Homework 5: Learning and All Course Review

1. Short-answer questions about reinforcement learning.

   (a) What is reinforcement learning?
   (b) How does reinforcement learning differ from supervised classification learning?
   (c) How does reinforcement learning differ from planning?
   (d) What is the role of function approximation in reinforcement learning?

2. Neural networks. Consider an artificial neuron with two inputs, where w0 = 0.5, w1 = 0.6, w2 = −0.4. (A hedged code sketch of parts (e) and (f) appears after problem 5.)

   (a) Write the formula for the output o of the neuron if it is a perceptron, using the specific weight values above.
   (b) Write the formula for the output o of the neuron if it is a sigmoid unit (Mitchell Sec. 4.5.1), using the specific weight values above.
   (c) The training rule for a perceptron is

           wi ← wi + ∆wi    where    ∆wi = η(t − o)xi

       Explain what each of η, t, o, and xi is in this formula.
   (d) What is the formula for ∆wi for a sigmoid unit? (See slide 17 of my lecture Neural Networks II.)
   (e) Suppose we are doing incremental learning, adjusting the weights after each training example. Let the training example be

           [x1 = 1, x2 = 1, t = −1]

       Assuming the unit is a perceptron, calculate each of ∆w0, ∆w1, and ∆w2 for the specific initial weight values above. Note that by convention x0 is always 1.
   (f) Repeat the calculation assuming the unit is a sigmoid unit and the example is

           [x1 = 1, x2 = 1, t = 0]

3. Short-answer questions about classification.

   (a) What does it mean for a data set to be linearly separable?
   (b) What does it mean for a classifier to overfit a training set?
   (c) How is a validation set used to reduce overfitting?
   (d) What is a hypothesis space for a learning system?

4. Decision-tree learning. (See slides 39-59 of my lecture on Learning from Examples and Sec. 18.3 of Russell & Norvig. A hedged entropy/information-gain sketch appears after problem 5.)

   (a) What are the roles of entropy and information gain in constructing a decision tree?
   (b) Consider a set S of 10 training instances, of which 3 are positive examples and 7 are negative examples. Calculate the entropy of S.
   (c) Consider the following data set about edible and non-edible berries. Construct the decision tree that the greedy algorithm would build. Calculate and state the information gain at each node.

           color   size    firmness   edible
           Red     Small   Hard       No
           Red     Large   Soft       Yes
           Red     Small   Soft       Yes
           Green   Small   Hard       No
           Green   Small   Soft       No
           Green   Large   Soft       No

5. A* search. (See my slides on Search Strategies. A* is the heap-based version of graph-search (slide 57) using the cost function described on slide 107. In Russell & Norvig, A* is identical to the UNIFORM-COST-SEARCH algorithm (Fig. 3.14) but with PATH-COST replaced by the function described in Sec. 3.5.2. A hedged code sketch follows this problem.)

   (a) Simulate the execution of graph-search A* for finding a path from S to G for the following graph. Fill out the table so that each line indicates the operation, as well as the node letter and cost:

       • expand – remove the lowest-cost node from the heap and process it
       • insert – add a node to the heap using the indicated cost
       • change – change the cost of a node that is in the heap

       You might not need to use all the lines in the table.

       [Graph figure: nodes S, A, B, C, and G with labeled edge costs; the drawing did not survive into this text. Refer to the figure in the original handout.]

       Heuristic distance to goal:

           node   h(n)
           S      7
           A      6
           B      2
           C      1
           G      0

           step   operation   node   cost
           1
           2
           3
           4
           5
           6
           7
           8
           9
           10
           11
           12
           13
           14
           15
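Study aid for problem 2, parts (e) and (f): a minimal Python sketch of the two weight updates, useful for checking a hand calculation. The learning rate η is not given in the problem, so eta = 0.1 below is an assumption for illustration only; the sigmoid rule used here is the standard incremental rule ∆wi = η(t − o)o(1 − o)xi from Mitchell Ch. 4, which you should verify against slide 17.

    import math

    eta = 0.1             # assumed learning rate; the problem leaves eta unspecified
    w = [0.5, 0.6, -0.4]  # initial weights w0, w1, w2 from problem 2
    x = [1, 1, 1]         # x0 = 1 by convention, then x1 = 1, x2 = 1
    net = sum(wi * xi for wi, xi in zip(w, x))

    # Part (e): perceptron (threshold) unit, target t = -1.
    t = -1
    o = 1 if net > 0 else -1
    print([eta * (t - o) * xi for xi in x])   # [dw0, dw1, dw2]

    # Part (f): sigmoid unit, target t = 0.
    t = 0
    o = 1 / (1 + math.exp(-net))
    print([eta * (t - o) * o * (1 - o) * xi for xi in x])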
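Study aid for problem 4: a minimal sketch of entropy and information gain using the standard definitions from Sec. 18.3. The dictionary encoding of the berry examples is my own, not part of the handout.

    import math

    def entropy(pos, neg):
        """Entropy of a set with pos positive and neg negative examples."""
        total = pos + neg
        return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c)

    def info_gain(examples, attr):
        """Gain of splitting a list of (attributes, label) pairs on attr."""
        def H(exs):
            pos = sum(1 for _, label in exs if label)
            return entropy(pos, len(exs) - pos)
        gain = H(examples)
        for v in {a[attr] for a, _ in examples}:
            subset = [(a, l) for a, l in examples if a[attr] == v]
            gain -= len(subset) / len(examples) * H(subset)
        return gain

    print(entropy(3, 7))   # part (b): 3 positive and 7 negative examples

    berries = [            # part (c), encoded as (attributes, edible) pairs
        ({"color": "Red",   "size": "Small", "firmness": "Hard"}, False),
        ({"color": "Red",   "size": "Large", "firmness": "Soft"}, True),
        ({"color": "Red",   "size": "Small", "firmness": "Soft"}, True),
        ({"color": "Green", "size": "Small", "firmness": "Hard"}, False),
        ({"color": "Green", "size": "Small", "firmness": "Soft"}, False),
        ({"color": "Green", "size": "Large", "firmness": "Soft"}, False),
    ]
    for attr in ("color", "size", "firmness"):
        print(attr, info_gain(berries, attr))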
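Study aid for problem 5: a minimal sketch of heap-based graph-search A* with cost f(n) = g(n) + h(n). The heuristic values are the ones given above, but the edge list is hypothetical, because the graph drawing did not survive into this text; substitute the actual edges from the figure. Instead of a decrease-key "change" operation, this sketch lazily re-inserts nodes and skips stale heap entries, which yields the same expansions.

    import heapq

    def astar(edges, h, start, goal):
        """edges: {node: [(neighbor, step_cost), ...]}; h: heuristic dict.
        Returns the cost of a cheapest path from start to goal, or None."""
        frontier = [(h[start], start, 0)]         # heap entries: (f = g + h, node, g)
        best_g = {start: 0}
        while frontier:
            f, node, g = heapq.heappop(frontier)  # "expand" the lowest-cost node
            if g > best_g.get(node, float("inf")):
                continue                          # stale entry: its cost was "changed"
            if node == goal:
                return g
            for nbr, cost in edges.get(node, []):
                g2 = g + cost
                if g2 < best_g.get(nbr, float("inf")):
                    best_g[nbr] = g2              # "insert" or "change"
                    heapq.heappush(frontier, (g2 + h[nbr], nbr, g2))
        return None

    h = {"S": 7, "A": 6, "B": 2, "C": 1, "G": 0}  # heuristic from the problem
    # Hypothetical edges for illustration only -- NOT the handout's graph:
    edges = {"S": [("A", 1), ("B", 4)], "A": [("G", 12)],
             "B": [("C", 2)], "C": [("G", 3)]}
    print(astar(edges, h, "S", "G"))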
6. Short-answer questions on propositional logic.

   (a) What is a model (or possible world) in propositional logic?
   (b) Suppose α and β are sentences in propositional logic. How can you determine whether α |= β using an oracle for satisfiability? (A hedged self-check sketch appears at the end of this handout.)
   (c) What does it mean for a system of deduction to be sound?

7. Consider the following set of propositional sentences.

       ¬(P ∧ ¬Q) ∨ ¬(¬S ∧ ¬T)
       ¬(T ∨ Q)
       U → (¬T → (¬S ∧ P))

   (a) Convert the sentences into a set of clauses. (A clause is a disjunction of literals, where a literal is a propositional letter or its negation.)
   (b) Create a resolution refutation proof that ¬U follows from these sentences. Hint: there is a proof in which, at each resolution step, one of the resolvents is a unit clause.

8. Translate the following sentences into first-order logic. Follow the syntax of the sentences as closely as possible. Do not convert any quantifier from existential to universal or vice versa. A sentence that includes "no" or "nobody" should have a corresponding negation ¬ in its translation.

   (a) No one in the Systems class is smarter than everyone in the AI class.
   (b) Henry likes all the students who do not like themselves.
   (c) Susan's sister's mother loves John's brother's father. (Think carefully about whether to use predicates or functions for each relation.)

9. Consider the Bayesian network below. The variables are all Boolean. (A hedged self-check sketch appears at the end of this handout.)

   [Network figure: smokes → heartDisease ← overweight, heartDisease → chestPain, heartDisease → tiredness; structure recovered from the conditional probability tables below.]

   Suppose network (A) includes the following conditional probability tables:

       P(smokes) = 0.2
       P(overweight) = 0.6

       P(heartDisease | overweight, smokes)   = 0.5
       P(heartDisease | overweight, ¬smokes)  = 0.3
       P(heartDisease | ¬overweight, smokes)  = 0.2
       P(heartDisease | ¬overweight, ¬smokes) = 0.1

       P(chestPain | heartDisease)  = 0.3
       P(chestPain | ¬heartDisease) = 0.1

       P(tiredness | heartDisease)  = 0.5
       P(tiredness | ¬heartDisease) = 0.4

   Show your work for each of the following calculations.

   (a) Calculate P(heartDisease).
   (b) Calculate P(chestPain).
   (c) Calculate P(heartDisease | chestPain).
   (d) Calculate P(heartDisease | chestPain, ¬overweight).

Thank you for an enjoyable semester. Be sure to solve all of these problems before the exam. You may work with other students on solving them if you like.
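Two optional self-check sketches follow. The first is for problem 6(b): α |= β exactly when α ∧ ¬β is unsatisfiable, so a satisfiability oracle answers entailment queries directly. With a real SAT solver you would hand it that conjunction; this sketch brute-forces the truth table instead, with sentences encoded as Python predicates over an assignment dict (an encoding of my own, for illustration).

    from itertools import product

    def satisfiable(sentence, atoms):
        """True iff some truth assignment to atoms makes sentence true."""
        return any(sentence(dict(zip(atoms, vals)))
                   for vals in product([True, False], repeat=len(atoms)))

    def entails(alpha, beta, atoms):
        """alpha |= beta iff (alpha and not beta) has no model."""
        return not satisfiable(lambda m: alpha(m) and not beta(m), atoms)

    # (P and Q) |= P holds; P |= (P and Q) does not.
    print(entails(lambda m: m["P"] and m["Q"], lambda m: m["P"], ["P", "Q"]))
    print(entails(lambda m: m["P"], lambda m: m["P"] and m["Q"], ["P", "Q"]))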
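The second sketch is for problem 9: inference by enumeration over the five Boolean variables, using exactly the tables above. It sums the joint probability of the rows matching the query and normalizes by the rows matching the evidence; that is fine at this scale, though exponential in general.

    from itertools import product

    P_s = 0.2   # P(smokes)
    P_o = 0.6   # P(overweight)
    P_hd = {(True, True): 0.5, (True, False): 0.3,    # keyed by (overweight, smokes)
            (False, True): 0.2, (False, False): 0.1}
    P_cp = {True: 0.3, False: 0.1}                    # keyed by heartDisease
    P_t  = {True: 0.5, False: 0.4}                    # keyed by heartDisease

    def joint(s, o, hd, cp, t):
        """P(smokes=s, overweight=o, heartDisease=hd, chestPain=cp, tiredness=t)."""
        p = (P_s if s else 1 - P_s) * (P_o if o else 1 - P_o)
        p *= P_hd[(o, s)] if hd else 1 - P_hd[(o, s)]
        p *= P_cp[hd] if cp else 1 - P_cp[hd]
        p *= P_t[hd] if t else 1 - P_t[hd]
        return p

    def prob(query, evidence=lambda s, o, hd, cp, t: True):
        """P(query | evidence); both are predicates over a full assignment."""
        num = den = 0.0
        for s, o, hd, cp, t in product([True, False], repeat=5):
            if evidence(s, o, hd, cp, t):
                p = joint(s, o, hd, cp, t)
                den += p
                if query(s, o, hd, cp, t):
                    num += p
        return num / den

    print(prob(lambda s, o, hd, cp, t: hd))                               # (a)
    print(prob(lambda s, o, hd, cp, t: cp))                               # (b)
    print(prob(lambda s, o, hd, cp, t: hd, lambda s, o, hd, cp, t: cp))   # (c)
    print(prob(lambda s, o, hd, cp, t: hd,
               lambda s, o, hd, cp, t: cp and not o))                     # (d)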