SEG4560 Computational Intelligence for
Decision Making (2008-09 Second Term)
Assignment 3
(100 points)
Due time and date: 7pm, April 20 Monday
Submit to assignment box D03, 5/F ERB
Submission Requirements


The hand-in version must be ordered correctly and stapled in the top left
corner.
The hand-in version must include a header page (or with sufficient space)
indicating: student name, student ID and assignment number.
Question 1. Suppose you are given a bag containing n unbiased coins. You are
told that n-1 of these coins are normal, with heads on one side and tails on the
other, whereas one coin is a fake, with heads on both sides. [20 pts]
a. Suppose you reach into the bag, pick out a coin uniformly at random, flip
it, and get a head. What is the conditional probability that the coin you
chose is the fake coin?
b. Suppose you continue flipping the coin for a total of k times after picking
it and see k heads. What is the conditional probability that you picked the
fake coin?
c. Suppose you wanted to decide whether the chosen coin was fake by
flipping it k times. The decision procedure returns FAKE if all k flips
come up heads, otherwise it returns NORMAL. What is the
(unconditional) probability that this procedure makes an error?
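A closed-form answer to parts a to c can be sanity-checked numerically. The following is a minimal Monte Carlo sketch, not a solution: the function name `simulate`, the trial count, and the fixed seed are illustrative assumptions that do not come from the assignment.

```python
import random


def simulate(n, k, trials=200_000, seed=0):
    """Estimate P(fake | k heads) and the error rate of the all-heads rule.

    n: number of coins in the bag (exactly one is two-headed)
    k: number of flips of the chosen coin
    """
    rng = random.Random(seed)
    all_heads = 0           # trials in which every flip came up heads
    fake_and_all_heads = 0  # ... and the chosen coin was the fake
    errors = 0              # trials in which the decision rule was wrong
    for _ in range(trials):
        is_fake = rng.randrange(n) == 0  # coin 0 plays the role of the fake
        # The fake always shows heads; a normal coin shows heads with prob. 1/2.
        heads = is_fake or all(rng.random() < 0.5 for _ in range(k))
        if heads:
            all_heads += 1
            fake_and_all_heads += is_fake
        # The rule answers FAKE exactly when all k flips are heads.
        if heads != is_fake:
            errors += 1
    return fake_and_all_heads / all_heads, errors / trials
```

For example, `simulate(n=10, k=3)` returns a pair of estimates (posterior probability of the fake coin given k heads, error rate of the decision procedure) that can be compared against the expressions derived by hand.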
Question 2. The Markov blanket of a variable Xi, written MB(Xi), consists of
Parents(Xi), the children Y1, Y2, …, Yn of Xi, and the Zj, which are the parents
of Yj other than Xi. Prove the following equation, where Parents(Yj) includes Zj
and Xi. [15 pts]
$$P(X_i \mid MB(X_i)) = P(X_i \mid \mathrm{Parents}(X_i), Y_1, \ldots, Y_n, Z_1, \ldots, Z_n) \propto P(X_i \mid \mathrm{Parents}(X_i)) \prod_{Y_j \in \mathrm{Children}(X_i)} P(Y_j \mid \mathrm{Parents}(Y_j))$$
Question 3. In a nuclear power station, there is an alarm that senses when a
temperature gauge exceeds a given threshold. The gauge measures the
temperature of the core. Consider the Boolean variables A (alarm sounds), FA
(alarm is faulty), and FG (gauge is faulty) and the multivalued nodes G (gauge
reading) and T (actual core temperature). [35 pts]
a. Draw a Bayesian network for this domain, given that the gauge is more
likely to fail when the core temperature gets too high; and the gauge
reading which exceeds a threshold will cause the alarm to sound.
b. Is your network a polytree?
c. Suppose for actual temperature (T) and measured temperature (G), there
are two possible values: normal and high; the probability that the gauge
gives the “correct” temperature is x when it is working, but y when it is
faulty. Fill in the conditional probability table associated with G, i.e.,
P(G|T, FG). (Note that “correct” here means when T=normal, G is also
normal, or when T=high, G is also high.)
            | T=Normal      | T=High
            | FG    | ¬FG   | FG    | ¬FG
G=Normal    |       |       |       |
G=High      |       |       |       |
d. Suppose the alarm works correctly unless it is faulty, in which case it
never sounds. Fill in the conditional probability table associated with A,
i.e., P(A|G, FA).
            | G=Normal      | G=High
            | FA    | ¬FA   | FA    | ¬FA
A           |       |       |       |
¬A          |       |       |       |
e. Suppose the alarm and gauge are working and the alarm sounds.
Calculate an expression for the probability that the temperature of the
core is too high, i.e., P(T=High | ¬FG, ¬FA, A). [Hint: Because the
alarm's behavior is deterministic, when the alarm is working and
sounds, we can reason that G must be High. In addition, given G, T is
conditionally independent of FA and A. So you can simplify the
calculation to P(T=High | ¬FG, G=High).] You can assume P(T=High)=p,
P(FG | T=High)=g, and P(FG | T=Normal)=h. Express the conditional
probability in terms of p, g, h and the probabilities from the above CPTs.
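An expression derived for part e can be checked against brute-force enumeration of the joint distribution. The sketch below is one way to do that. It assumes p = P(T=High), g = P(FG | T=High), h = P(FG | T=Normal), and that x and y are the probabilities of a correct reading for a working and a faulty gauge respectively. The function name `posterior_high` is hypothetical.

```python
def posterior_high(p, g, h, x, y):
    """Return P(T=High | gauge working, G=High) by enumerating over T."""

    def joint(t_high, fg, g_high):
        # P(T) * P(FG | T) * P(G | T, FG) for one assignment of the variables.
        p_t = p if t_high else 1 - p
        p_fault = g if t_high else h
        p_fg = p_fault if fg else 1 - p_fault
        p_correct = y if fg else x
        p_g = p_correct if g_high == t_high else 1 - p_correct
        return p_t * p_fg * p_g

    numerator = joint(True, False, True)
    denominator = sum(joint(t, False, True) for t in (True, False))
    return numerator / denominator
```

Plugging a few numeric values of p, g, h, x and y into both this function and the hand-derived expression is a quick way to catch algebra slips.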
Question 4. Consider the query P(Rain| Sprinkler=true, WetGrass=true ) in
slide # 15 of Chapter 14b, and how MCMC can answer it. [30pts]
a. How many states does the Markov chain have?
b. Calculate the transition matrix Q containing q(y → y') for all y, y'.
c. What does Q^2, the square of the transition matrix, represent?
d. What about Q^n, as n → ∞?
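The construction of a Gibbs-sampling transition matrix can be sketched in a few lines. Everything numeric below is an assumption: the CPT values are those of the common textbook Cloudy/Sprinkler/Rain/WetGrass network and may differ from the slide, and each step is assumed to pick one of the two nonevidence variables (Cloudy, Rain) uniformly at random and resample it given the rest. The names `weight`, `transition_matrix` and `matmul` are illustrative.

```python
from itertools import product

# Assumed CPTs; replace with the values given on the slide.
P_C = 0.5
P_S_GIVEN_C = {True: 0.1, False: 0.5}
P_R_GIVEN_C = {True: 0.8, False: 0.2}
P_W_GIVEN_R = {True: 0.99, False: 0.90}  # P(WetGrass | Sprinkler=true, Rain)


def weight(c, r):
    """Unnormalized P(Cloudy=c, Rain=r, Sprinkler=true, WetGrass=true)."""
    p_c = P_C if c else 1 - P_C
    p_r = P_R_GIVEN_C[c] if r else 1 - P_R_GIVEN_C[c]
    return p_c * P_S_GIVEN_C[c] * p_r * P_W_GIVEN_R[r]


# One state per joint assignment (cloudy, rain) of the nonevidence variables.
STATES = list(product([True, False], repeat=2))


def transition_matrix():
    """Build Q with Q[i][j] = q(STATES[i] -> STATES[j])."""
    q = [[0.0] * len(STATES) for _ in STATES]
    for i, (c, r) in enumerate(STATES):
        for j, (c2, r2) in enumerate(STATES):
            if r2 == r:  # Cloudy was chosen and resampled given Rain
                q[i][j] += 0.5 * weight(c2, r) / (weight(True, r) + weight(False, r))
            if c2 == c:  # Rain was chosen and resampled given Cloudy
                q[i][j] += 0.5 * weight(c, r2) / (weight(c, True) + weight(c, False))
    return q


def matmul(a, b):
    """Plain-Python product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Calling `matmul(Q, Q)` on the result of `transition_matrix()` gives Q^2, and repeating the multiplication lets you inspect the rows of higher powers numerically.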