PPE 110

Introduction to Decision Theory
1
1. Quantifying uncertainty
• The class will try to translate much of the
natural language we use to reason about
uncertainty into a more formal language.
• The cost of this translation is learning
the new language. The benefit is a more
precise way of expressing uncertainty and
of dealing with it.
2
• The basic building block of this new language
will be the use of probability.
• Probability is a quantification of uncertainty.
When you say something is likely, or moderately
likely, or very likely, you only have a small and
finite number of phrases to express the
uncertainty.
• Numbers give you a richer spectrum of options.
• Probability is a way of using numbers to capture
intuitive notions of uncertainty, richer than what
would be otherwise possible with words. It
allows for finer gradations.
3
• Let us revisit some basic probability.
• Example: 3 flips of a fair coin. What are all the possibilities? It often
helps to draw a picture to visualize the information.
• Here it is cut and dried: there are 2 × 2 × 2 = 8.
• Write down each of these 8 possibilities as a point in a rectangle. Let
1=HHH, 2=HHT, 3=HTH, 4=THH, 5=HTT, 6=THT, 7=TTH, 8=TTT
• Each is equally likely, so each has probability 1/8
1 2 3 4
5 6 7 8
• These are the elementary atoms or propositions which are the
building blocks of our investigation. The set of all propositions we are
interested in will form the Universal Set. It is shown above by the numbers
enclosed within the rectangle.
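The enumeration above can be reproduced in a few lines of Python. This is only an illustrative sketch: the names `outcomes` and `p` are not from the slides, and `itertools.product` lists the sequences in a slightly different order from the 1 to 8 numbering used here.

```python
from fractions import Fraction
from itertools import product

# All sequences of three flips; each flip is "H" or "T".
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# A fair coin makes every sequence (atom) equally likely.
p = {outcome: Fraction(1, len(outcomes)) for outcome in outcomes}

print(len(outcomes))    # number of atoms in the universal set
print(p["HHT"])         # probability of one atom
print(sum(p.values()))  # total probability over the universal set
```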
4
• A description of the universal set, and the probability of each
atom within the universal set is known as a probability
distribution.
• For example, the probability distribution of the set of rolls of a die
is described below:
x      1    2    3    4    5    6
p(x)  1/6  1/6  1/6  1/6  1/6  1/6
5
2. Formalizing all this.
• Let us make all this more formal – which will be useful
later. This is the grammar of our new language.
• Given a set X, a probability system is a function p which
has the following properties
• For all x∈X (for all x an element of X, where each
element is the same thing as an atom),
(1) p(x) ≥ 0
(2) ∑ p(x) = 1 (the sum of the probabilities of all the atoms
equals 1)
(3) p(φ) = 0 where φ is the null set (what is the null set? It is
the set of all impossible events. An example of an
impossible event is a proposition like: a student in PPE
110 will get both an A and a B for the class).
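As a rough illustration, properties (1) and (2) can be checked mechanically once a distribution is written as a table. The helper name `is_distribution` and the second example table are made up for this sketch.

```python
from fractions import Fraction

def is_distribution(p):
    """Check properties (1) and (2) for a dict mapping atoms to probabilities."""
    nonnegative = all(prob >= 0 for prob in p.values())  # property (1)
    sums_to_one = sum(p.values()) == 1                   # property (2)
    return nonnegative and sums_to_one

# The fair die: six atoms, each with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# A made-up table whose probabilities add to 5/6, so it fails property (2).
not_a_distribution = {"a": Fraction(1, 2), "b": Fraction(1, 3)}

print(is_distribution(die))
print(is_distribution(not_a_distribution))
```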
6
3. From atoms to events
• When we try to reason about uncertainty, we are often interested in
not just the likelihood of the most basic of the propositions, but also
about collections of such propositions (George AND Ringo are nice
men; The weather in Philadelphia can be very hot OR very cold).
• Think of equating an elementary proposition to an atom, and
now wanting to be able to represent more complex sentences
in the new language. We will use the words atom and proposition
interchangeably.
• The compound sentence (often called ‘event’ in probability
theory) will always boil down to a grouping of atoms, where the
group is formed by the particular concept we are trying to express.
In other words, we want to be able to express our uncertainty not
just about individual atoms or fundamental situations, but also about
collective, aggregate situations that may involve more than one of
our atomistic propositions.
7
• Loosely speaking, compound sentences are just
elementary propositions joined by connectors and
qualifiers: and, or, not, etc. This means that they
refer to one or more atoms, that is, a collection of
atoms.
• This collection of atoms (event) will be called a subset of
the universal set we encountered earlier.
• A is a subset of X means that A is part of X (possibly, but
not necessarily, all of it). It is written as A ⊆ X. A ⊂ X means
that A is a part of X but not all of it.
• The next slide shows an event, and the slides after that
use visual representations of subsets to illustrate a
problem.
8
• Depiction of an event A={2,3,6,7}. Refer to slide 4 for the meaning of
this event.
1 2 3 4
5 6 7 8
9
• We continue with trying to think of a problem in visual terms. Here is
an example of using events, or a collection of elementary facts.
• A town has 100 taxis: 85 green taxis owned by the Green Cab
Company and 15 blue taxis owned by the Blue Cab Company.
• On March 1, 1990, Alice was struck by a speeding cab, and the only
witness testified that the cab was blue rather than green. Alice sued
the Blue Cab Company.
• The judge instructed the jury and the lawyers at the start of the case
that the reliability of a witness must be assumed to be 80% in a case
of this sort, and that liability requires that the “preponderance of the
evidence,” meaning at least a 50% probability, be on the side of the
plaintiff.
• Conditional on seeing a blue cab, the witness either correctly
identifies it as blue (with probability 0.8) or incorrectly identifies it
as green (with probability 0.2). The situation is the same for green cabs.
• There are 15 blue cabs, of which 12 would be correctly identified as
blue, and 85 green cabs, of which 17 would be incorrectly identified as blue.
• Assume that a priori, each cab is equally likely to be the culprit.
10
• We show this situation schematically. We will return to this problem later. For now, just see the
visualization. Each atom here is a particular cab (not shown) and each circle is an event, as are
the shaded parts of each circle.
11
4. From collections of atoms to collections of
events
• In much the same way we want to talk about events (as
an enlargement of our ability to talk about elementary
propositions), we also want to talk about collections of
events. For this, we need to introduce some new terms.
• A∩B (A intersection B) will mean the set of situations
where both A and B occur, AUB (A union B) will mean
the set of situations where A or B (or both) occur, and Ac (A
complement) will mean the set of situations in which A
does not occur. These are to be shown in the
class.
• Using these constructors, we can talk about various
events, such as: the set of blue cabs which are never
mistaken for green.
• Let us check some facts visually:
• 1: (AUB)c = Ac∩Bc
• 2: (A∩B)c = AcUBc
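Besides the pictures, the two facts can be spot-checked on a small example with Python sets. This is a check on one example, not a proof. The universal set is the 8 coin-flip atoms from earlier, A is the event depicted on slide 9, and B is a hypothetical second event chosen only for illustration.

```python
# Universal set: the 8 atoms of the three-coin-flip example.
X = set(range(1, 9))

A = {2, 3, 6, 7}  # the event depicted earlier
B = {1, 2, 5, 6}  # hypothetical second event, chosen for illustration

def complement(event, universe=X):
    """Atoms of the universal set that are not in the event."""
    return universe - event

# Fact 1: (AUB)c = Ac∩Bc
law_1 = complement(A | B) == complement(A) & complement(B)

# Fact 2: (A∩B)c = AcUBc
law_2 = complement(A & B) == complement(A) | complement(B)

print(law_1, law_2)
```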
12
• Here is a visual proof that (AUB)c = Ac∩Bc
• (AUB)c is in black lines, Ac is in green and Bc is in pink. The intersection of the latter two should
be the area in black.
13
5. Probabilities of events.
• An earlier slide talks about the probability of each atom
and the fact that such probabilities must add to 1.
• Thus the probability of each atom must be known. If
we know that, then we may extend the analysis to find
the probability of the larger events.
• For instance, for any event A, the probability of A, or
p(A), is the sum of the probabilities of all the atoms in A.
• Thus, the probability of being hit by a blue cab was
1/100+1/100+...=15/100
• It is possible that whenever one event occurs, the
other is impossible. If this is true, then we call the two
events mutually exclusive or disjoint.
14
• This means that if we have two sets that do not
intersect, say A and B, then P(A U
B)=P(A)+P(B). Why? Let Z= A U B. Now, the
probability of Z is the sum of the probability of all
the atoms in Z. Now atoms in Z can be either in
A or in B, which means that the probability of Z is
the sum of the probability of all the atoms in A
and the atoms in B.
• By the way, recall that P(A U B) is the probability
of A or B. What is this probability if A and B may
or may not intersect?
15
• For any two sets A and B, we will show that
P(AUB)=P(A)+P(B)-P(A∩B), by using this
information and breaking up AUB into disjoint
parts.
• P(AUB)=P(A∩Bc)+P(B∩Ac)+P(A∩B), since the three
sets on the right are disjoint and together make up AUB
(draw a diagram and check).
• Let us write the expression on the right-hand
side as [P(A∩Bc)+P(A∩B)]+[P(B∩Ac)+P(A∩B)]-P(A∩B)
• But note that
• P(A∩Bc)+P(A∩B)=P(A)
• P(B∩Ac)+P(A∩B)=P(B)
• Hence P(AUB)=P(A)+P(B)-P(A∩B), and we are done
16
• Thus, when A and B have no outcomes in
common, the probability of AUB is
P(A)+P(B)
• When A and B are not disjoint, and are just
two arbitrary sets, then
P(AUB)=P(A)+P(B)-P(A∩B)
• Note that the disjoint case is a special
version of the arbitrary case (why?)
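Both rules can be checked numerically on the three-coin-flip atoms, where the probability of an event is the sum of the probabilities of its atoms. The helper `prob` and the events B and C are assumptions made for this sketch. With the numbering 1=HHH, ..., 8=TTT, B is "the first flip is heads" and C is "all tails".

```python
from fractions import Fraction

# Eight equally likely atoms, numbered as in the coin-flip example.
X = set(range(1, 9))
p = {atom: Fraction(1, 8) for atom in X}

def prob(event):
    """Probability of an event = sum of the probabilities of its atoms."""
    return sum((p[atom] for atom in event), Fraction(0))

A = {2, 3, 6, 7}  # the event depicted earlier
B = {1, 2, 3, 5}  # hypothetical: first flip is heads (HHH, HHT, HTH, HTT)
C = {8}           # hypothetical: all tails, disjoint from A

# General rule: P(AUB) = P(A) + P(B) - P(A∩B)
general_ok = prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Disjoint rule: P(AUC) = P(A) + P(C), because A∩C is empty
disjoint_ok = prob(A | C) == prob(A) + prob(C)

print(prob(A | B), general_ok, disjoint_ok)
```

Since `prob(A & C)` is the sum over no atoms, it is 0, which is one way to see why the disjoint case is a special version of the general one.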
17
6. An example.
• Using our new tools, let us attempt problem 5, page 232 of the statistics text.
Suppose the percentages refer to a total of 100 men and women. F is the set of
freshmen and S is the set of sophomores. The probability of each set is shown in the
aggregate. The individual men and women within each set (the individual atoms) are
the dots. Not all the dots or atoms are shown, and they will be omitted after this slide.
[Figure: a box split into men (M, probability 0.8) and women (W, probability 0.2), with the sets F (0.15) and S (0.85) drawn inside; dots mark individual atoms.]
18
• (a) The minimum of P(S∩W) is 0.05: P(S)+P(W)=1.05 exceeds 1 by 0.05,
so S and W must overlap by at least that much (shown)
[Figure: M = 0.8, W = 0.2, F = 0.15, S = 0.85.]
19
• (b) The maximum of P(S∩W) is 0.2, reached when all of W lies inside S
[Figure: M = 0.8, W = 0.2, F = 0.15, S = 0.85, with a region labeled 0.2.]
20
• What is the max probability of (M∩S)c? We know this is
equal to Mc U Sc=W U F. The max probability of this is
when W ∩F= φ and is equal to 0.35
[Figure: M = 0.8, W = 0.2; regions labeled F 0.15, S 0.65 and 0.15.]
21
• What are the minimum and maximum probabilities of Fc U Wc?
Fc is S and Wc is M, so the event is S U M. The minimum
probability is therefore 0.85 (when M lies inside S, so that
P(S∩W)=0.05) and the max is 1 (shown below).
[Figure: M = 0.8, W = 0.2, F = 0.15, S = 0.85.]
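The minimum and maximum values in this example all follow from two facts: an intersection can be no larger than the smaller set, and P(AUB)=P(A)+P(B)-P(A∩B) cannot exceed 1. The sketch below packages that reasoning. The function names are made up for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def intersection_bounds(p_a, p_b):
    """Smallest and largest possible P(A∩B) given only P(A) and P(B)."""
    lowest = max(Fraction(0), p_a + p_b - 1)  # overlap forced when P(A)+P(B) > 1
    highest = min(p_a, p_b)                   # one set inside the other
    return lowest, highest

def union_bounds(p_a, p_b):
    """Smallest and largest possible P(AUB), via P(A)+P(B)-P(A∩B)."""
    lowest_int, highest_int = intersection_bounds(p_a, p_b)
    return p_a + p_b - highest_int, p_a + p_b - lowest_int

P_M, P_W = Fraction(80, 100), Fraction(20, 100)
P_F, P_S = Fraction(15, 100), Fraction(85, 100)

s_and_w = intersection_bounds(P_S, P_W)  # parts (a) and (b)
w_or_f = union_bounds(P_W, P_F)          # (M∩S)c = W U F
s_or_m = union_bounds(P_S, P_M)          # Fc U Wc = S U M

print(s_and_w, w_or_f, s_or_m)
```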
22
7. Conditional probability.
• Now, suppose we want to know if knowing one event has occurred
adds information about another event. For instance, if there is a
breeze, is there an increased possibility of rain? Given the budget
crisis in Philadelphia, will it take longer to rebuild the South Street
Bridge? What is the probability of the 2nd toss of a fair coin landing
heads if the first has been already seen to be heads? Always, we
are interested in the probability of a 2nd event given a first.
• In general, we write P(A|B) to mean the probability of event A given
the knowledge that B has occurred. This is also said to be the
conditional probability of A given B.
• We define P(A|B)=P(A∩B)/P(B).
• Let us come back to the cab problem. Let A be the event that Alice
was struck by a blue cab. Let B be the event that the witness sees a
blue cab. P(B) is the probability that the witness saw a blue cab.
• Let us convince ourselves that what we need is P(A|B)
• What is P(B)? It is 0.8[15/100]+0.2[85/100]=29/100
• What is P(A∩B)? It is the probability that Alice was struck by a
blue cab and it was correctly identified as blue. It is the part of the
blue circle not shaded green. This is 0.8[15/100]=12/100
23
• Thus, the chance that a cab seen to be blue actually caused the accident is
P(A|B)=(12/100)/(29/100)=12/29 < 0.5
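The cab calculation can be retraced step by step with exact fractions. This is a sketch under the slide's assumptions (80% witness reliability for either color, each cab equally likely a priori). The variable names are illustrative only.

```python
from fractions import Fraction

n_blue, n_green = 15, 85
reliability = Fraction(80, 100)  # chance the witness names the true color

# Prior probability that the cab was blue or green.
p_blue = Fraction(n_blue, n_blue + n_green)
p_green = Fraction(n_green, n_blue + n_green)

# P(A∩B): the cab was blue and the witness says blue.
p_a_and_b = reliability * p_blue

# The cab was green but the witness says blue.
p_green_seen_blue = (1 - reliability) * p_green

# P(B): the witness says blue, whichever color the cab was.
p_b = p_a_and_b + p_green_seen_blue

# P(A|B) = P(A∩B) / P(B)
p_a_given_b = p_a_and_b / p_b

print(p_b, p_a_given_b, p_a_given_b < Fraction(1, 2))
```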
24
8. Independence
• When knowing that one event occurred gives us no information
about the probability of the other event, we will say that the two
events are independent.
• Another way of saying this: given the information about one
event, the assessment of the likelihood of the other is
unchanged. This is what we mean by independence.
• We are going to make precise the nature of independence between
two events by using probabilities, and in particular, conditional
probabilities.
• For instance, what is the probability of the 2nd toss of a fair coin
landing heads if the first has been already seen to be heads?
Clearly, the first coin toss does not affect the probabilities of the
second, so the probability is 0.5
• In our formula, note that P(A|B)=P(A∩B)/P(B)=P(A) if B gives no
information about A. This in turn means that P(A∩B)=P(A) × P(B)
25
• Let us return to the example we have been
working. Suppose that P(F ∩ M)=0.12
• This is shown below
        M (0.8)   W (0.2)
F        0.12      0.03
S        0.68      0.17
26
• Now, suppose that you know a person you have
picked at random is a male. Does this change
the probability of this person being a freshman?
• The prior probability of encountering a freshman
was 0.15. Now, you know this person is a male.
Is this useful information in terms of changing
our probabilities?
• Well, of the 80 men, 12 are freshmen, so in fact,
the probability of encountering a freshman
=12/80=0.15 remains the same as before. The
events “encountering a freshman” and
“encountering a male” are independent.
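The same check can be written out from the four joint probabilities. The assignment of 0.12, 0.03, 0.68 and 0.17 to the cells below is read off the figure for this example and should be treated as an assumption of the sketch, as should the helper name `marginal`.

```python
from fractions import Fraction

# Joint probabilities for (class year, sex), as read from the figure.
joint = {
    ("F", "M"): Fraction(12, 100),
    ("F", "W"): Fraction(3, 100),
    ("S", "M"): Fraction(68, 100),
    ("S", "W"): Fraction(17, 100),
}

def marginal(label):
    """Add up every cell whose key contains the given label."""
    return sum(prob for cell, prob in joint.items() if label in cell)

p_f, p_m = marginal("F"), marginal("M")

# P(F|M) = P(F∩M) / P(M)
p_f_given_m = joint[("F", "M")] / p_m

# Independence: P(F∩M) = P(F) × P(M)
independent = joint[("F", "M")] == p_f * p_m

print(p_f, p_f_given_m, independent)
```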
27
• Let us assume that the areas of the sets are proportional to their
probabilities. This means we cannot leave empty spaces as before
and must draw the F and S sets in a different way. This is done
below.
[Figure: the box redrawn with areas proportional to probabilities: M = 0.8, W = 0.2; F∩M = 0.12, F∩W = 0.03, S∩M = 0.68, S∩W = 0.17.]
28
• Here is a visual representation of independence. What
independence means is that conditional on being in the blue region,
the probability of being in the upper left quadrant is equal to the
probability of being in the F region. Or, the ratio of the shaded region
to the region of M is equal to the ratio of F as a fraction of the whole
box.
[Figure: the same area-proportional box (M = 0.8, W = 0.2; F∩M = 0.12, F∩W = 0.03, S∩M = 0.68, S∩W = 0.17), with the regions discussed above highlighted.]
29
• Some new but related questions. If M
gives no further information about F, does
F give any further information about M?
• If F is known to have not occurred, then
does our probability of M not occurring
change?
• These questions we can answer generally.
In fact, let us prove the following:
30
• For any two sets A and B, is it true that if
P(A|B)=P(A), then P(B|A)=P(B)?
• Suppose any two sets A and B are
independent. Are Ac and Bc independent
as well?
• The answer to both is yes. The former is
very easy. The latter is done in the next
slides.
31
• For this statement to be true, we need P(Ac∩Bc)=P(Ac)×P(Bc)
• First note again that Ac∩Bc=(AUB)c
• Thus, if we can prove that
• P(Ac)×P(Bc)=P((AUB)c), then we are done.
• Note that the left hand side
• = [1-P(A)][1-P(B)]
• = 1-P(A)-P(B)+P(A)P(B)
• = 1-[P(A)+P(B)-P(A∩B)] (since A and B are independent, P(A)P(B)=P(A∩B))
• = 1-P(AUB)
• = P((AUB)c) (hence proved)
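Both claims can be spot-checked on the three-coin-flip atoms. The events below are chosen for illustration: with the numbering 1=HHH, ..., 8=TTT, A is "first flip heads" and B is "second flip heads". One example does not replace the proof, but it shows each step with concrete numbers.

```python
from fractions import Fraction

# Eight equally likely atoms, numbered as in the coin-flip example.
X = set(range(1, 9))

def prob(event):
    """With equally likely atoms, P(event) = (atoms in event) / (all atoms)."""
    return Fraction(len(event), len(X))

A = {1, 2, 3, 5}  # first flip heads: HHH, HHT, HTH, HTT
B = {1, 2, 4, 6}  # second flip heads: HHH, HHT, THH, THT
Ac, Bc = X - A, X - B

# A and B are independent: P(A∩B) = P(A) × P(B)
a_b_independent = prob(A & B) == prob(A) * prob(B)

# First claim: P(B|A) = P(B) as well
symmetric = prob(A & B) / prob(A) == prob(B)

# Second claim: the complements are independent too
complements_independent = prob(Ac & Bc) == prob(Ac) * prob(Bc)

print(a_b_independent, symmetric, complements_independent)
```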
32
• We will do parts of some problems from the book that
illustrate how simple probability problems may be
conceptualized and solved using sets.
• Let us look at problem 3 from page 247
• What is the set of outcomes or the universal set? It is the
set of all arrangements of the deck. Thus, there are 52!
outcomes (52 factorial) in the set.
• We will briefly define factorials in class, but since they
are coming up already, here is a very simple explanation:
http://www.purplemath.com/modules/factorial.htm
33
• Each of those points has equal
probability, since the deck is shuffled.
• Thus, we can find the probability of any
situation that is described by a collection
of outcomes in this universal set.
• In general then, if you can write down the
set of possibilities and assign a probability
to all of its outcomes, a lot of the hard
work is done.
34
• (a) Let A = set of all those outcomes which
have the Jack of Clubs in the top place. What
is the probability of this set?
• There are 51! elements in this set (check),
and there are a total of 52! outcomes.
Thus, the likelihood of A happening is
51!/52! which boils down to 1/52
• (c) Let B be the set of outcomes which
have the bottom card = Jack of Diamonds
35
• We need P(AUB) which we know equals
P(A)+P(B)-P(A∩B). The first two terms each equal
1/52, using logic just encountered.
• The last term is that set, call it C, which lies in
both A and B. This means that the first and last
cards are fixed, and there are 50 cards to be
arranged in 50 possible spaces. This can clearly
be done in 50! ways.
• Which means that the likelihood of C occurring is
50!/52! or 1/(52 ×51).
• Thus, the answer is 1/52+ 1/52- 1/(52 ×51).
• You are also encouraged to do Problems 4 and 5
(same section)
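The counting argument can be checked in two ways: by evaluating the factorial formula exactly, and by brute-force enumeration on a deck small enough to list every arrangement. The function name and the 5-card deck are assumptions made for this sketch. In the small deck, card 0 plays the role of the Jack of Clubs and card 1 the Jack of Diamonds.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def p_top_or_bottom(n):
    """P(AUB) for an n-card deck: one named card on top or another on the bottom."""
    p_a = Fraction(factorial(n - 1), factorial(n))  # top card fixed
    p_b = Fraction(factorial(n - 1), factorial(n))  # bottom card fixed
    p_c = Fraction(factorial(n - 2), factorial(n))  # both fixed (A∩B)
    return p_a + p_b - p_c

# The 52-card answer: 1/52 + 1/52 - 1/(52 × 51)
answer = p_top_or_bottom(52)

# Brute force on a hypothetical 5-card deck: list all 5! orderings and
# count those with card 0 on top or card 1 on the bottom.
orders = list(permutations(range(5)))
hits = sum(1 for order in orders if order[0] == 0 or order[-1] == 1)
brute_force = Fraction(hits, len(orders))

print(answer, brute_force, brute_force == p_top_or_bottom(5))
```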
36