COGS 300 — Fall 2013
Assignment 2
Due: 11:00am, Tuesday 26 November 2013 (hardcopy in class and email to cogs300@gmail.com)

This can be done in groups of size 1, 2 or 3. Working alone is not recommended. All members of the group need to be able to explain the group’s answer. Please look at all of the questions, as the final exam will assume that you have thought about all of the questions. Every group should do questions 1 and 7, and the group should do as many of questions 2 to 6 as there are members in the group. The group should hand in one copy of the assignment.

You are encouraged to discuss this assignment and collaborate with other classmates, as long as (a) you list the people with whom you discussed the assignment and (b) you give your own answers and explanations. Please post questions to the Connect web site.

Question 1

One of the ethical decisions students have to make is whether they cheat. A rational model would specify that the decision of whether to cheat depends on the costs and the benefits. Here we will develop and critique such a model.

Consider the following decision network:

[Figure: decision network with nodes Watched, Punishment, Caught 1, Cheat 1, Caught 2, Utility, Cheat 2, Grade 1, Grade 2 and Final Grade]

This diagram models a student’s decisions about whether to cheat at two different times. If students cheat they can be caught cheating, but they can also get higher grades. The punishment (either suspension, cheating recorded on the transcript, or none) depends on whether they get caught at either or both opportunities. Whether they get caught depends on whether they are being watched and whether they cheat. The utility depends on their final grades and their punishment.

Consider the example http://www.cs.ubc.ca/~poole/cogs300/2013/cheat_decision.xml for the AIspace belief network tool at http://www.aispace.org/bayes/. (Do a “Load from URL” in the “File” menu.) A group of size n needs to answer 2n + 2 of the following parts:

(a) What is an optimal strategy (policy)?
Give a description in English of an optimal strategy that could be followed by a member of the general public. What is the value of an optimal strategy?

(b) What happens to the optimal strategy when the probability of being watched goes up? [Modify the probability of “Watched” in create mode.] Try a number of values. Explain what happens and why.

(c) What is an optimal strategy when the rewards for cheating are reduced? Try a number of different parametrizations.

(d) Change the model so that once a student has been caught cheating, they will be watched more carefully. [Hint: whether they are watched at the first opportunity needs to be a different variable than whether they are watched at the second opportunity.] Show the resulting model (both the structure and any new parameters), and give the policies and expected utilities for various settings of the parameters.

(e) What does the current model imply about how cheating affects future grades? Change the model so that cheating affects subsequent grades. Explain how the new model achieves this.

(f) How could this model be changed to be more realistic (but still be simple)? [E.g., are the probabilities reasonable? Are the utilities reasonable? Is the structure reasonable?]

(g) Suppose the university decided to set up an honour system so that instructors do not actively check for cheating, but there are severe punishments for first offences if cheating is discovered. How could this be modelled? Specify a model for this and explain which decision it is rational to make (for a few different parameter settings).

(h) Should students and instructors be encouraged to think of the cheating problem as a rational decision in a game? Explain why or why not in a single paragraph.

Question 2

Students have to make decisions about how much to study for each course. The aim of this question is to investigate how to use decision networks to help them make such decisions.
Suppose students first have to decide how much to study for the midterm. They can study a lot, study a little, or not study at all. Whether they pass the midterm depends on how much they study and on the difficulty of the course. As a first-order approximation, they pass if they study hard or if the course is easy and they study a bit. After receiving their midterm grade, they have to decide how much to study for the final exam. Again, the final exam result depends on how much they study and on the difficulty of the course. Their final grade depends on which exams they pass; generally they get an A if they pass both exams, a B if they only pass the final, a C if they only pass the midterm, or an F if they fail both. Of course, there is a great deal of noise in these general estimates.

Suppose that their final utility depends on their total effort and their final grade. Suppose the total effort is obtained by combining the effort in studying for the midterm and the final. (For simplicity, assume that total effort is a binary variable that measures their subjective effort.)

(a) Draw a decision network for a student decision based on the preceding story. (Show the random, decision and value nodes, but you don’t need to specify the ranges of the variables, or any of the probabilities or utilities.)

(b) Give an appropriate utility function for a student who is lazy and just wants to pass (not get an F). The total effort here measures whether they (thought they) worked a lot or a little overall. (Use 100 for the best outcome and 0 for the worst outcome.) You can fill in the following table:

Grade   Total Effort   Utility
A       Lot
A       Little
B       Lot
B       Little
C       Lot
C       Little
F       Lot
F       Little

(c) Give an appropriate utility function for a student who doesn’t mind working hard and really wants to get an A, and would be very disappointed with a B or lower. (Use 100 for the best outcome and 0 for the worst outcome. Fill in a table similar to part (b).)
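
One way to see how the pieces of the Question 2 story fit together is to write them as a small program. The following Python sketch encodes the first-order pass rule and the grade rule stated above, and computes the expected utility of a study policy. It is not a solution to any part of the question: the utility numbers, the prior probability that the course is easy (p_easy), the rule for combining effort, and all function names are assumptions made up for illustration, and the noise mentioned in the story is ignored.

```python
from itertools import product

STUDY_LEVELS = ("none", "little", "lot")

# Hypothetical utilities for (grade, total effort); NOT an answer to (b) or (c).
EXAMPLE_UTILITY = {
    ("A", "Little"): 100, ("A", "Lot"): 80,
    ("B", "Little"): 70,  ("B", "Lot"): 50,
    ("C", "Little"): 40,  ("C", "Lot"): 20,
    ("F", "Little"): 10,  ("F", "Lot"): 0,
}


def passes(study, course_easy):
    """First-order rule from the story: pass if the student studies a lot,
    or if the course is easy and they study a little."""
    return study == "lot" or (course_easy and study == "little")


def final_grade(pass_midterm, pass_final):
    """A if both passed, B if only the final, C if only the midterm, else F."""
    if pass_midterm and pass_final:
        return "A"
    if pass_final:
        return "B"
    if pass_midterm:
        return "C"
    return "F"


def total_effort(study_midterm, study_final):
    """Assumed combination rule: 'Lot' if the student studied a lot for
    either exam, otherwise 'Little'."""
    return "Lot" if "lot" in (study_midterm, study_final) else "Little"


def expected_utility(policy, utility, p_easy):
    """Expected utility of a policy (midterm study level, final study level
    if the midterm was passed, final study level if it was failed).
    The student never observes the course difficulty directly."""
    study_mid, final_if_pass, final_if_fail = policy
    total = 0.0
    for easy, prob in ((True, p_easy), (False, 1.0 - p_easy)):
        pass_mid = passes(study_mid, easy)
        study_fin = final_if_pass if pass_mid else final_if_fail
        pass_fin = passes(study_fin, easy)
        grade = final_grade(pass_mid, pass_fin)
        effort = total_effort(study_mid, study_fin)
        total += prob * utility[(grade, effort)]
    return total


def best_policy(utility, p_easy):
    """Enumerate all 27 policies and return one with maximal expected utility."""
    policies = product(STUDY_LEVELS, repeat=3)
    return max(policies, key=lambda pol: expected_utility(pol, utility, p_easy))


if __name__ == "__main__":
    policy = best_policy(EXAMPLE_UTILITY, p_easy=0.5)
    print(policy, expected_utility(policy, EXAMPLE_UTILITY, p_easy=0.5))
```

Because the second decision is made after the midterm result is known, a policy here has three components rather than two. Changing the utility table or p_easy and rerunning shows how the preferred policy shifts with the student’s preferences.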

Question 3

Chris and Sam were playing with a mock-up help system based on the example given in class. Their example for the AIspace belief network tool is at http://artint.info/tutorials/helpsystem.xml

For the query “Cannot find file”, Sam was conditioning on “cannot”, “find” and “file” being true, and was querying the “HelpPage” variable. Chris claimed that this was not correct because we also need to condition on the other words being false. Who was right? Explain to the other person why one of them is right. Use an example in your explanation.

Question 4

(a) What are the independence assumptions made in the naive Bayesian classifier for the help system? (Slide number 10 of http://www.cs.ubc.ca/~poole/cogs300/2013/cogsys-m09b-learning.pdf)

(b) Are these independence assumptions reasonable? Explain why or why not.

(c) What are the independence assumptions made in the topic-model network on slide number 18 of those slides?

(d) Give an example of where the topics would not be independent.

Question 5

Give a possible exam question (perhaps with sub-parts) to test students about control, hierarchical control, probability, belief networks and/or causality. It should be worth 15 marks, and take students approximately 15 minutes to complete in an exam setting. It must be clear what the question is asking for and must be self-contained. Give a solution and a marking scheme (how much each part is worth).

Question 6

On the wiki http://wiki.ubc.ca/Course:COGS300, create some online learning resources. You should create pedagogical or real-world examples that are useful for COGS 300 students to learn about utility, decision making or learning. Please add references for any external resources used. You will need to log in with your CWL to edit. This is intended to be an open-ended creative question. This is a cooperative question, as anyone can edit other people’s questions. It is possible to gain credit by improving others’ contributions.
Please help to build a useful resource. Explain clearly what your contribution was. It can be worth multiple questions; justify any claim of how many questions your contribution is worth. It is even possible to do the whole assignment just by creating useful resources.

Question 7

How long did this assignment take? What did you learn? Was it reasonable?