Week 7 - Probability&Bayes Net - 's solution

CS 188 Sp07 Discussion Note Week 7 – Discrete Probability and Bayesian Network
by Nuttapong Chentanez
Probability
Sample Space – Set of all possible outcomes of some experiment
Random Variable – Function that assigns a value to each outcome in a sample space
e.g. If sample space S is the set of all students in this class, one could define a random variable A
measuring age. If p is a person, A(p) is his/her age.
Event – a set of outcomes that share a property you are interested in. e.g. For sample space S, J may
be the set of juniors. Randomly picking a person may or may not result in the event that he/she is
a junior. P(J) denotes the probability that the event J occurs.
Events can be combined by union, intersection, and complement to define new events. Particular conditions on random
variables, such as H = 6’1” or H < 7’ for a random variable H measuring height, can also be considered events.
Conditional Probability – P(X|Y) = P(X ∩ Y) / P(Y) is the probability that event X occurs given that
event Y occurs
Joint Distribution – P(A = a, B = b) denotes the probability that A = a and B = b
Marginal Distribution – P(A = a) = Σ_b P(A = a, B = b), where the sum ranges over all values b of B. This summation is called
“marginalization”
Conditional Distribution – P(A = a | B = b) gives the conditional probability of A = a given B = b
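The definitions above can be tried on a small table. The sketch below uses a made-up joint distribution over two variables; the value names and the numbers are hypothetical and only illustrate marginalization and conditioning.

```python
# Hypothetical joint distribution P(A = a, B = b); the values and numbers
# are invented for illustration and sum to 1.
joint = {
    ("under_21", "junior"): 0.30,
    ("under_21", "senior"): 0.10,
    ("21_plus", "junior"): 0.20,
    ("21_plus", "senior"): 0.40,
}


def marginal_a(joint, a):
    """P(A = a): sum the joint over every value b of B (marginalization)."""
    return sum(p for (a_val, _b), p in joint.items() if a_val == a)


def marginal_b(joint, b):
    """P(B = b): sum the joint over every value a of A."""
    return sum(p for (_a, b_val), p in joint.items() if b_val == b)


def conditional_a_given_b(joint, a, b):
    """P(A = a | B = b) = P(A = a, B = b) / P(B = b)."""
    return joint[(a, b)] / marginal_b(joint, b)


print(marginal_a(joint, "under_21"))
print(conditional_a_given_b(joint, "under_21", "junior"))
```

With these numbers P(A = under_21) is 0.30 + 0.10 and P(A = under_21 | B = junior) is 0.30 / 0.50.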
Important Rules:
Chain Rule: P(X, Y) = P(X|Y) P(Y)
Bayes’s Rule: P(X|Y) = P(Y|X) P(X) / P(Y), very useful in AI
Axioms of probability
1. 0 <= P(a) <= 1, for any proposition a
2. P(true) = 1, P(false) = 0
3. P(a ∨ b) = P(a) + P(b) - P(a ∧ b)
Exercise:
1. Show P(a | a ∧ b) = 1
2. Consider the problem of dealing 5-card poker hands from a standard deck of 52 cards,
assuming that the dealer is fair.
a. How many atomic events are there in the joint probability distribution (how many
5-card hands are there)?
b. What is the probability of each atomic event?
c. What is the probability of being dealt a royal straight flush? Four of a kind?
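One way to check the counting for the poker exercise is sketched below. It assumes every 5-card hand is equally likely (the fair dealer), counts a royal straight flush as one hand per suit, and counts four of a kind as 13 ranks times 48 choices for the fifth card. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Number of distinct 5-card hands from a 52-card deck (order ignored).
n_hands = comb(52, 5)

# With a fair dealer every hand (atomic event) is equally likely.
p_each_hand = Fraction(1, n_hands)

# Royal straight flush: exactly one such hand per suit.
n_royal_flush = 4
p_royal_flush = Fraction(n_royal_flush, n_hands)

# Four of a kind: pick the rank (13 ways), then any of the 48 other cards.
n_four_of_a_kind = 13 * 48
p_four_of_a_kind = Fraction(n_four_of_a_kind, n_hands)

print(n_hands, p_each_hand, p_royal_flush, p_four_of_a_kind)
```

Using Fraction keeps the probabilities exact instead of rounding them to floats.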
3. From this table:

              Toothache            ~Toothache
            Catch    ~Catch     Catch    ~Catch
Cavity      0.108    0.012      0.072    0.008
~Cavity     0.016    0.064      0.144    0.576

Compute
a. P(toothache)
b. P(Cavity)
c. P(Toothache | cavity)
d. P(Cavity | toothache ∨ catch)
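The queries in this exercise can be answered by summing entries of the full joint distribution. The sketch below stores the eight table entries keyed by (cavity, toothache, catch). The helper names `prob` and `cond` are illustrative, and part (d) is read as conditioning on "toothache or catch".

```python
# Full joint distribution from the table, keyed by (cavity, toothache, catch).
joint = {
    (True, True, True): 0.108,
    (True, True, False): 0.012,
    (True, False, True): 0.072,
    (True, False, False): 0.008,
    (False, True, True): 0.016,
    (False, True, False): 0.064,
    (False, False, True): 0.144,
    (False, False, False): 0.576,
}


def prob(event):
    """P(event): add up the atomic events where the predicate holds."""
    return sum(p for world, p in joint.items() if event(*world))


def cond(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    both = prob(lambda c, t, k: event(c, t, k) and given(c, t, k))
    return both / prob(given)


p_toothache = prob(lambda c, t, k: t)
p_cavity = prob(lambda c, t, k: c)
p_toothache_given_cavity = cond(lambda c, t, k: t, lambda c, t, k: c)
p_cavity_given_toothache_or_catch = cond(lambda c, t, k: c,
                                         lambda c, t, k: t or k)

print(p_toothache, p_cavity, p_toothache_given_cavity,
      p_cavity_given_toothache_or_catch)
```

Parts (a) and (b) each come to 0.2, part (c) to 0.12 / 0.2 = 0.6, and part (d) to 0.192 / 0.416, roughly 0.46.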
4. After your yearly checkup, the doctor has bad news and good news. The bad news is that you
tested positive for a serious disease and the test is 99% accurate (i.e. the probability of testing
positive when you have the disease is 0.99). The good news is that this is a rare disease, striking
only 1 in 100,000 people of your age. Why is it good news that the disease is rare? What is the
chance that you actually have the disease?
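A numeric sketch of Bayes’s rule for this exercise follows. The problem only states P(positive | disease) = 0.99. The code additionally assumes that "99% accurate" also means a 1% false-positive rate, P(positive | no disease) = 0.01. That second number is an assumption, not something stated above.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive) via Bayes's rule.

    The denominator P(positive) is obtained by marginalizing over
    disease / no disease.
    """
    p_positive = (sensitivity * prior
                  + false_positive_rate * (1.0 - prior))
    return sensitivity * prior / p_positive


prior = 1 / 100_000            # disease strikes 1 in 100,000
sensitivity = 0.99             # P(positive | disease), from the problem
false_positive_rate = 0.01     # ASSUMED: P(positive | no disease)

p_disease_given_positive = posterior(prior, sensitivity, false_positive_rate)
print(p_disease_given_positive)
```

Under that assumption the posterior is about 0.001, because the tiny prior outweighs the strong test result. This is why the rarity of the disease is good news.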
Independence
X and Y are independent if P(X,Y) = P(X)P(Y), equivalently P(X) = P(X|Y)
Conditional independence
X and Y are conditionally independent given Z if P(X,Y|Z) = P(X|Z) P(Y|Z)
Exercise: Show that the above is equivalent to P(X|Y,Z) = P(X|Z)
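One possible derivation for this exercise is sketched below. It assumes P(Y|Z) > 0 so that the division is defined. The first three lines go from the definition to P(X|Y,Z) = P(X|Z), and the last two go back the other way.

```latex
\begin{align*}
P(X \mid Y, Z) &= \frac{P(X, Y \mid Z)}{P(Y \mid Z)}
  && \text{definition of conditional probability} \\
&= \frac{P(X \mid Z)\, P(Y \mid Z)}{P(Y \mid Z)}
  && \text{conditional independence of } X, Y \text{ given } Z \\
&= P(X \mid Z) \\[4pt]
P(X, Y \mid Z) &= P(X \mid Y, Z)\, P(Y \mid Z)
  && \text{chain rule, conditioned on } Z \\
&= P(X \mid Z)\, P(Y \mid Z)
  && \text{using } P(X \mid Y, Z) = P(X \mid Z)
\end{align*}
```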
Chain rule
P(x1, x2, …, xn) = P(xn | xn-1, …, x1) P(xn-1, …, x1)
                 = Π_{i=1..n} P(xi | xi-1, …, x1)
Bayes’s Nets
A set of random variables as nodes (discrete or continuous)
A set of directed links or arrows connecting pairs of nodes.
Each node Xi has a conditional probability distribution P(Xi | Parents(Xi))
The graph has no directed cycles, i.e. it is a directed acyclic graph (DAG)
The topology implies certain conditional independencies:
P(Xi | Xi-1, …, X1) = P(Xi | Parents(Xi)), given that Parents(Xi) ⊆ {Xi-1, …, X1}
Combined with the chain rule, we can write the full joint probability as:
P(X1, …, Xn) = Π_{i=1..n} P(Xi | Parents(Xi)),
using an ordering consistent with the partial ordering implied by the DAG
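The factorization above can be sketched in a few lines. The three-node Boolean network and all CPT numbers below are hypothetical. The code only shows that multiplying one CPT entry per node gives a joint distribution that sums to 1.

```python
from itertools import product

# Hypothetical Boolean network X -> Y -> Z; all numbers are made up.
parents = {"X": [], "Y": ["X"], "Z": ["Y"]}

# cpt[node][parent values] = P(node = True | parents = parent values)
cpt = {
    "X": {(): 0.3},
    "Y": {(True,): 0.8, (False,): 0.1},
    "Z": {(True,): 0.9, (False,): 0.2},
}


def joint_probability(assignment):
    """Product over all nodes of P(node | Parents(node))."""
    result = 1.0
    for node, node_parents in parents.items():
        key = tuple(assignment[p] for p in node_parents)
        p_true = cpt[node][key]
        result *= p_true if assignment[node] else 1.0 - p_true
    return result


# Enumerate every complete assignment and check the joint is normalized.
worlds = [dict(zip(parents, values))
          for values in product([True, False], repeat=len(parents))]
total_mass = sum(joint_probability(w) for w in worlds)

print(joint_probability({"X": True, "Y": True, "Z": True}), total_mass)
```

Here the network needs 1 + 2 + 2 = 5 numbers instead of the 7 needed for an unrestricted joint over three Boolean variables.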
Given a set of random variables, a good order for constructing a Bayes’s net is to add “root
causes” first, then the variables they influence, and so on.
This is because the parents of a node are the “direct influencers” of the node.
Directed edges need not represent causality, but constructing the graph with causality in mind tends to
give a graph with fewer edges. For example, adding the nodes in the order M, J, A, B, E produces more edges, and the
resulting conditional probabilities are also more difficult to collect data for.
Independence in BN: Are two nodes conditionally independent given certain evidence?
Causal Chain: X → Y → Z
Are X and Z always independent? No. e.g. X = Low air pressure, Y = Rain, Z = Traffic
They could be independent, by crafting the CPTs so that P(Z|X) = P(Z)
Are X and Z independent given Y? Yes: P(Z|X,Y) = P(X,Y,Z) / P(X,Y) = [P(X)P(Y|X)P(Z|Y)] /
[P(X)P(Y|X)] = P(Z|Y)
Common Cause: X ← Y → Z
X = Newsgroup busy, Y = Project due, Z = Lab full
Are X and Z independent given Y? Yes: P(Z|X,Y) = P(X,Y,Z) / P(X,Y) =
[P(Y)P(X|Y)P(Z|Y)] / [P(Y)P(X|Y)] = P(Z|Y)
Common Effect: X → Y ← Z
X: CS188 project due, Z: CS184 project due, Y: Lack of sleep
Are X and Z independent? Yes
Are X and Z independent given Y? No: if you lack sleep and the CS188 project is not due, it’s more
likely that the CS184 project is due :P
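The common-effect case can be checked numerically. In the sketch below every probability is invented for illustration: each project is due with probability 0.3, and lack of sleep is more likely when more projects are due. The code builds the joint for X → Y ← Z and compares P(Z), P(Z | X), P(Z | Y) and P(Z | Y, not X).

```python
from itertools import product

# Hypothetical numbers for X -> Y <- Z.
p_x = 0.3   # P(CS188 project due)
p_z = 0.3   # P(CS184 project due)
# P(lack of sleep | X, Z), keyed by (x, z)
p_y_given = {(True, True): 0.95, (True, False): 0.8,
             (False, True): 0.8, (False, False): 0.1}


def joint(x, y, z):
    """P(x, y, z) = P(x) P(z) P(y | x, z)."""
    px = p_x if x else 1.0 - p_x
    pz = p_z if z else 1.0 - p_z
    py = p_y_given[(x, z)] if y else 1.0 - p_y_given[(x, z)]
    return px * pz * py


def prob(event):
    return sum(joint(x, y, z)
               for x, y, z in product([True, False], repeat=3)
               if event(x, y, z))


def cond(event, given):
    return prob(lambda x, y, z: event(x, y, z) and given(x, y, z)) / prob(given)


def z_due(x, y, z):
    return z


p_z_given_x = cond(z_due, lambda x, y, z: x)
p_z_given_y = cond(z_due, lambda x, y, z: y)
p_z_given_y_not_x = cond(z_due, lambda x, y, z: y and not x)

print(p_z, p_z_given_x, p_z_given_y, p_z_given_y_not_x)
```

With these numbers P(Z | X) equals P(Z), so X and Z are marginally independent. Once Y is observed, ruling out X raises the probability of Z.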
General Case: Bayes Ball Algorithm
• Correct algorithm:
  • Shade in evidence
  • Start at source node
  • Try to reach target by search
• States: pair of (node X, previous state S)
• Successor function:
  • X unobserved:
    • To any child
    • To any parent if coming from a child
  • X observed:
    • From parent to parent
• If you can’t reach a node, it’s conditionally independent of the start node given the
  evidence
[Figure: diagrams of the allowed moves between previous state S and node X]
Example
[Figure: Bayes net with nodes L, R, D, B, T, T’]
Yes, Yes, No, No (L->R->T->T’->T->B), Yes
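The search described in the Bayes Ball rules can be sketched as a breadth-first search over (node, arrival direction) states. The edge list below is an assumption, since the example figure is not reproduced in this text: L → R, R → D, R → T, B → T, T → T’. It was chosen to be consistent with the path quoted in the answers. The function name `d_separated` is illustrative.

```python
from collections import deque


def d_separated(parents, source, target, evidence):
    """Return True if no active path links source to target given evidence.

    A state is (node, how we arrived): "from_child" or "from_parent".
    """
    children = {n: [] for n in parents}
    for child, ps in parents.items():
        for p in ps:
            children[p].append(child)
    evidence = set(evidence)

    start = (source, "from_child")   # lets the ball leave in any direction
    frontier = deque([start])
    visited = {start}
    while frontier:
        node, came_from = frontier.popleft()
        if node == target:
            return False             # reached the target: not independent
        moves = []
        if node not in evidence:
            # Unobserved: to any child; to any parent if coming from a child.
            moves += [(c, "from_parent") for c in children[node]]
            if came_from == "from_child":
                moves += [(p, "from_child") for p in parents[node]]
        elif came_from == "from_parent":
            # Observed: only parent-to-parent bounces are allowed.
            moves += [(p, "from_child") for p in parents[node]]
        for move in moves:
            if move not in visited:
                visited.add(move)
                frontier.append(move)
    return True


# ASSUMED edges for the example network (the figure is not in the text).
example = {"L": [], "R": ["L"], "D": ["R"], "B": [],
           "T": ["R", "B"], "T'": ["T"]}

print(d_separated(example, "L", "B", []))       # L, B with no evidence
print(d_separated(example, "L", "B", ["T'"]))   # L, B given T'
```

Under the assumed edges, observing T’ activates the path L → R → T → T’ → T → B. The ball bounces off the observed child T’ back to T and then continues up to B.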
Exercise:
1. The Surprise Candy Company makes candy in two flavors: 70% are strawberry and 30% are anchovy.
Each new piece of candy starts out with a round shape; as it moves along the production line, a
machine randomly selects a certain percentage to be trimmed into a square; then, each piece is
wrapped in a wrapper whose color is chosen randomly to be red or brown. 80% of the strawberry
candies are round and 80% have a red wrapper, while 90% of the anchovy candies are square and
90% have a brown wrapper. All candies are sold individually in sealed, identical, black boxes.
Now you, the customer, have just bought a Surprise candy at the store but have not yet opened the
box.
Consider these three Bayes nets:
a. Which network(s) can correctly represent P(Flavor, Wrapper, Shape)?
b. Which network is the best representation for this problem?
c. True/False: Network (i) asserts that P(Wrapper | Shape) = P(Wrapper).
d. What is the probability that your candy has a red wrapper?
(i) 0.8 (ii) 0.56 (iii) 0.59
e. In the box is a round candy with a red wrapper. The probability that its flavor is strawberry is:
(i) 0.7 (ii) Between 0.7 and 0.99 (iii) > 0.99
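Parts (d) and (e) can be checked with a short calculation. The sketch assumes, as the production-line description suggests, that shape and wrapper are chosen independently once the flavor is fixed. The complements (10% of anchovy candies are round, 10% have a red wrapper) follow from the stated percentages.

```python
p_flavor = {"strawberry": 0.7, "anchovy": 0.3}
p_round = {"strawberry": 0.8, "anchovy": 0.1}   # 90% of anchovy are square
p_red = {"strawberry": 0.8, "anchovy": 0.1}     # 90% of anchovy are brown

# (d) P(red) by marginalizing over flavor.
p_red_wrapper = sum(p_flavor[f] * p_red[f] for f in p_flavor)

# (e) P(strawberry | round, red), ASSUMING shape and wrapper are
# conditionally independent given flavor.
weights = {f: p_flavor[f] * p_round[f] * p_red[f] for f in p_flavor}
p_strawberry = weights["strawberry"] / sum(weights.values())

print(p_red_wrapper, p_strawberry)
```

This gives 0.56 + 0.03 = 0.59 for (d) and 0.448 / 0.451 for (e), which is above 0.99.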
2. A simple Bayes net with Boolean variables I = Intelligent, H = Honest, P = Popular,
L = LotsOfCampaignFunds, E = Elected.
a. Which of the following are asserted by the network (ignoring the CPTs)?
b. Calculate P(i, h, ~l, p, ~e)
c. Calculate the probability that someone is intelligent given that they are honest, have few
campaign funds, and are elected.
d. True/False: If there are two candidates in the race, then making two copies of the network will
correctly represent the joint distribution over the two sets of variables.
3. Is X2 ⊥ X3 | {X1, X6}? How about X1 ⊥ X6 | {X2, X3}?
No, Yes