COR-STAT1202 Introductory Statistics
Seminar 3 – Probability
Readings: Statistics for Business and Economics (Chapter 3), Handouts
Please Print Double-Sided To Save Paper!
Any content in a dotted box is for your understanding only and will not be included in the
examinations.
Introduction
To recap, knowledge of probability is required when performing inferential statistics as it helps us
understand the relationship between a population and a sample drawn from it.
Events, Sample Spaces and Probability
Experiment: A process of observation leading to a single outcome that cannot be predicted with
certainty
Sample point or simple event: The most basic outcome of an experiment
(see also definition of event below)
Event: A specific collection of sample points
A simple event contains only a single sample point
A compound event contains two or more sample points
Sample space, 𝑺: Collection of all of an experiment’s sample points
Probability of an event 𝑬: Calculated by summing the probabilities of all the sample points for
event 𝑬 in the sample space 𝑺
Example 3.1: Die Throw
Consider the outcome of a single throw of a 6-sided fair die.
The experiment is the single throw of the die.
One possible sample point or simple event is the throw of 1.
One possible compound event is the throw of an odd number, i.e. 1, 3 or 5.
The sample space 𝑆 is {1,2,3,4,5,6}.
The probability of the event of throwing 1 is 1/6.
The probability of the event of throwing an odd number is 3/6 = 1/2.
The three probability axioms upon which all probability theory is based are:
1) The probability of an event 𝐸 is a non-negative real number, i.e.
𝑷(𝑬) ≥ 𝟎, 𝑷(𝑬) ∈ ℝ, ∀𝑬 ∈ 𝑺
2) The probabilities of all the sample points in a sample space 𝑆 must sum up to 1, i.e.
∑all 𝑖 𝑷(π‘¬π’Š ) = 𝟏, where 𝐸𝑖 is a sample point in sample space 𝑆
3) The probability of a union of any countable sequence of disjoint events (or mutually exclusive events) 𝐸1 , 𝐸2 , … is the sum of the individual probabilities, i.e.
𝑷(⋃all 𝑖 π‘¬π’Š ) = ∑all 𝑖 𝑷(π‘¬π’Š )
The three main consequences of the three probability axioms are:
1) The probability of an empty set is 0, i.e.
𝑷(∅) = 𝟎
2) If event 𝐴 is a subset of event 𝐡, then the probability of event 𝐴 is less than or equal to the
probability of event 𝐡, i.e. if 𝑨 ⊆ 𝑩, then 𝑷(𝑨) ≤ 𝑷(𝑩)
3) The probability of any event is between 0 and 1 inclusive, i.e.
𝟎 ≤ 𝑷(𝑬) ≤ 𝟏 , ∀𝑬 ∈ 𝑺
Example 3.2: Die Throw (continued)
Continuing with the setup in Example 3.1, we note the following with respect to the three probability axioms:
Probability of one of the simple events = 1/6 ≥ 0
Probability of all the sample points or simple events = 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 1
Noting that the events are disjoint events, e.g. it is not possible to throw a 1 and a 6 at the same time,
𝑃(odd) = 𝑃(1) + 𝑃(3) + 𝑃(5) = 1/6 + 1/6 + 1/6 = 3/6 = 1/2
With respect to the three main consequences, we note the following:
𝑃(∅) = 0 and 0 ≤ 𝑃(𝐸) ≤ 1 , ∀𝐸 ∈ 𝑆 can be observed easily
Event of throwing 1 ⊆ event of throwing an odd number, and 𝑃(1) = 1/6 ≤ 𝑃(odd) = 1/2
Two tools useful for presenting an experiment’s sample space and its events are:
• Tree diagram: Events are shown chronologically using lines and nodes
• Venn diagram: Events are shown as shapes inside a big rectangle that represents the sample space 𝑆
Example 3.3: Coin Throws
Consider the outcomes of two throws of a fair coin.
Denoting 𝐻 for a throw of head and 𝑇 for a throw of tail, the sample points are 𝐻𝐻, 𝐻𝑇, 𝑇𝐻 and 𝑇𝑇.
Note that the order of the throws matters, i.e. 𝐻𝑇 is a different sample point from 𝑇𝐻!
[Tree diagram: the 1st throw branches into 𝐻 and 𝑇; each branch then splits into 𝐻 and 𝑇 for the 2nd throw, giving the sample points 𝐻𝐻, 𝐻𝑇, 𝑇𝐻 and 𝑇𝑇]
[Venn diagram: the four sample points 𝐻𝐻, 𝐻𝑇, 𝑇𝐻 and 𝑇𝑇 shown inside the rectangle representing the sample space 𝑆]
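For your understanding only: Python’s itertools.product generates ordered outcomes directly, which is a quick way to build such a sample space (a small sketch):

```python
from itertools import product

# Ordered outcomes of two coin throws: HT and TH are distinct sample points
sample_space = ["".join(pair) for pair in product("HT", repeat=2)]
print(sample_space)  # ['HH', 'HT', 'TH', 'TT']
```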
Counting Rules
Three counting rules are available for counting events in probability calculations:
• Multiplicative rule: The number of ways to arrange 𝒏 distinct items is 𝒏!,
where 𝒏! = 𝒏 × (𝒏 − 𝟏) × … × 𝟐 × 𝟏
and 0! is defined to be equal to 1
• Combinations rule: The number of ways to arrange 𝒏 items into 𝟐 groups of size 𝒓 and (𝒏 − 𝒓) respectively, where the order within both groups is ignored, is 𝒏π‘ͺ𝒓 or (𝒏𝒓):
𝒏π‘ͺ𝒓 = (𝒏𝒓) = 𝒏!/(𝒓!(𝒏 − 𝒓)!)
The combinations rule can be expanded to a more general rule for π‘˜ > 2 groups of sizes π‘˜1 , π‘˜2 , π‘˜3 , … to give 𝑛!/(π‘˜1 !π‘˜2 !π‘˜3 ! …)
• Permutations rule: The number of ways to arrange 𝒏 items into 𝟐 groups of size 𝒓 and (𝒏 − 𝒓) respectively, where the order within the group of size (𝒏 − 𝒓) is ignored, is 𝒏𝑷𝒓:
𝒏𝑷𝒓 = 𝒏!/(𝒏 − 𝒓)!
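For your understanding only: all three rules are built into Python’s math module (Python 3.8+), so the formulas can be confirmed in a few lines (a minimal sketch):

```python
import math

n, r = 7, 3

# Multiplicative rule: n! ways to arrange n distinct items
assert math.factorial(4) == 24

# Combinations rule: nCr = n! / (r!(n - r)!)
assert math.comb(n, r) == math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# Permutations rule: nPr = n! / (n - r)!
assert math.perm(n, r) == math.factorial(n) // math.factorial(n - r)
```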
Example 3.4: Your Ah-Gong and Ah-Ma’s Favourite Hobby?
Consider the numbers 1, 2, 3, 4.
The number of ways to arrange 1, 2, 3, 4 is 4! = 24.
This is an example of the application of the multiplicative rule.
Now consider the numbers 1, 2, 3, 3.
The number of ways to arrange 1, 2, 3, 3 is 4!/2! = 12.
This is an example of the application of the permutations rule.
Now consider the numbers 1, 1, 3, 3.
The number of ways to arrange 1, 1, 3, 3 is 4!/(2!2!) = 6.
This is an example of the application of the combinations rule.
Now consider the numbers 1, 1, 1, 3.
The number of ways to arrange 1, 1, 1, 3 is 4!/3! = 4.
Now consider the numbers 1, 1, 1, 1.
The number of ways to arrange 1, 1, 1, 1 is 4!/4! = 1.
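For your understanding only: these counts can be brute-force verified by generating every ordering and keeping only the distinct ones (an illustrative sketch; the helper name is ours):

```python
from itertools import permutations

def distinct_arrangements(items):
    """Count the distinct orderings of a sequence, repeated items included."""
    return len(set(permutations(items)))

assert distinct_arrangements([1, 2, 3, 4]) == 24  # 4!
assert distinct_arrangements([1, 2, 3, 3]) == 12  # 4!/2!
assert distinct_arrangements([1, 1, 3, 3]) == 6   # 4!/(2!2!)
assert distinct_arrangements([1, 1, 1, 3]) == 4   # 4!/3!
assert distinct_arrangements([1, 1, 1, 1]) == 1   # 4!/4!
```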
Exercise 3.1: Students’ Seating Arrangement
Consider 7 students, namely Alice, Brandon, Carol, Dan, Elaine, Fred and Gigi. They are seated in
a row of 7 chairs.
How many different ways can we arrange the students’ seating, assuming we treat all of them as
individuals?
7! = 5040
How many different ways can we arrange the students’ seating, assuming we treat Alice, Carol,
Elaine and Gigi homogeneously as girls and treat Brandon, Dan and Fred homogeneously as boys?
7!/(4!3!) = 35
How many different ways can we arrange the students’ seating, assuming we treat the girls as
individuals and treat Brandon, Dan and Fred homogeneously as boys?
7!/3! = 840
Unions, Intersections, Complementary Events
A number of compound events are frequently used in probability calculations:
• Intersection: The intersection of two events 𝑨 and 𝑩, denoted by 𝑨 ∩ 𝑩, is the event that occurs if both 𝑨 and 𝑩 occur in a single experiment
Event 𝐴 ∩ 𝐡 shown pictorially, with a Venn diagram:
[Venn diagram: two overlapping circles 𝐴 and 𝐡 inside the rectangle 𝑆, with the overlap labelled 𝐴 ∩ 𝐡]
Intersection can also be defined with more than two events, e.g. 𝐴 ∩ 𝐡 ∩ 𝐢
• Union: The union of two events 𝑨 and 𝑩, denoted by 𝑨 ∪ 𝑩, is the event that occurs if either 𝑨 or 𝑩 or both 𝑨 and 𝑩 occur in a single experiment
𝑷(𝑨 ∪ 𝑩) = 𝑷(𝑨) + 𝑷(𝑩) − 𝑷(𝑨 ∩ 𝑩)
Event 𝐴 ∪ 𝐡 shown pictorially, with a Venn diagram:
[Venn diagram: two overlapping circles 𝐴 and 𝐡 inside the rectangle 𝑆, with both circles shaded as 𝐴 ∪ 𝐡]
Union can also be defined with more than two events, e.g. 𝐴 ∪ 𝐡 ∪ 𝐢
• Complement: The complement of an event 𝑨, denoted by 𝑨𝒄 or 𝑨′, is the event that 𝑨 does not occur
𝑷(𝑨′ ) = 𝟏 − 𝑷(𝑨) or 𝑷(𝑨) + 𝑷(𝑨′ ) = 𝟏
Event 𝐴′ shown pictorially, with a Venn diagram:
[Venn diagram: a circle 𝐴 inside the rectangle 𝑆, with everything outside the circle shaded as 𝐴′]
Example 3.5: Die Throw (continued)
Continuing with the setup in Examples 3.1 and 3.2, define event 𝐴 as the throw of ≤ 3 and event 𝐡
as the throw of an odd number:
Event 𝐴 ∩ 𝐡 is the event of throwing 1 or 3.
𝑃(𝐴 ∩ 𝐡) = 2/6 = 1/3
Event 𝐴 ∪ 𝐡 is the event of throwing 1, 2, 3 or 5.
𝑃(𝐴 ∪ 𝐡) = 4/6 = 2/3 or 𝑃(𝐴 ∪ 𝐡) = 𝑃(𝐴) + 𝑃(𝐡) − 𝑃(𝐴 ∩ 𝐡) = 3/6 + 3/6 − 1/3 = 2/3
Event 𝐴′ is the event of throwing 4, 5 or 6.
𝑃(𝐴′ ) = 3/6 = 1/2
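For your understanding only: Python’s built-in set operations mirror these event operations, so Example 3.5 can be checked in a few lines (a small sketch; the lambda P assumes equally likely outcomes):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event), len(S))  # equally likely outcomes

A = {1, 2, 3}  # throw of <= 3
B = {1, 3, 5}  # throw of an odd number

assert P(A & B) == Fraction(1, 3)                            # intersection
assert P(A | B) == P(A) + P(B) - P(A & B) == Fraction(2, 3)  # union
assert P(S - A) == 1 - P(A) == Fraction(1, 2)                # complement
```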
Exercise 3.2: Formula for Union of 3 Events
Derive a general formula for the probability of the union of 3 events, i.e. 𝑃(𝐴 ∪ 𝐡 ∪ 𝐢). Assume
𝑃(𝐴 ∩ 𝐡) > 0, 𝑃(𝐴 ∩ 𝐢) > 0, 𝑃(𝐡 ∩ 𝐢) > 0 and 𝑃(𝐴 ∩ 𝐡 ∩ 𝐢) > 0.
[Venn diagram: three overlapping circles 𝐴, 𝐡 and 𝐢 inside the rectangle 𝑆, with 𝐴 ∪ 𝐡 ∪ 𝐢 shaded]
If we sum up 𝑃(𝐴), 𝑃(𝐡) and 𝑃(𝐢), we will be double-counting 𝑃(𝐴 ∩ 𝐡), 𝑃(𝐴 ∩ 𝐢) and 𝑃(𝐡 ∩ 𝐢) and
triple-counting 𝑃(𝐴 ∩ 𝐡 ∩ 𝐢).
If we then subtract 𝑃(𝐴 ∩ 𝐡), 𝑃(𝐴 ∩ 𝐢) and 𝑃(𝐡 ∩ 𝐢) from the sum of 𝑃(𝐴), 𝑃(𝐡) and 𝑃(𝐢), we will
be removing 𝑃(𝐴 ∩ 𝐡 ∩ 𝐢) completely.
If we then add 𝑃(𝐴 ∩ 𝐡 ∩ 𝐢) back, then we will have counted every area exactly once.
Thus, 𝑃(𝐴 ∪ 𝐡 ∪ 𝐢) = 𝑃(𝐴) + 𝑃(𝐡) + 𝑃(𝐢) − 𝑃(𝐴 ∩ 𝐡) − 𝑃(𝐴 ∩ 𝐢) − 𝑃(𝐡 ∩ 𝐢) + 𝑃(𝐴 ∩ 𝐡 ∩ 𝐢)
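For your understanding only: the derived formula can be brute-force checked by drawing random events over a small sample space and comparing both sides (an illustrative sketch, not part of the exercise):

```python
import random
from fractions import Fraction

S = set(range(1, 7))
P = lambda event: Fraction(len(event), len(S))

random.seed(0)
for _ in range(1000):
    # Three random events over the die's sample space
    A = {s for s in S if random.random() < 0.5}
    B = {s for s in S if random.random() < 0.5}
    C = {s for s in S if random.random() < 0.5}
    lhs = P(A | B | C)
    rhs = (P(A) + P(B) + P(C)
           - P(A & B) - P(A & C) - P(B & C)
           + P(A & B & C))
    assert lhs == rhs
```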
Conditional Probability
The conditional probability that event 𝑨 occurs given that event 𝑩 has occurred is
𝑷(𝑨|𝑩) = 𝑷(𝑨 ∩ 𝑩)/𝑷(𝑩), assuming 𝑷(𝑩) > 0
Let us understand 𝑃(𝐴|𝐡) with the help of a Venn diagram:
[Venn diagram: two overlapping circles 𝐴 and 𝐡 inside the rectangle 𝑆, with the overlap labelled 𝐴 ∩ 𝐡]
Since we are told event 𝐡 has occurred, we only need to consider inside the circle representing event
𝐡. Anything outside of event 𝐡, i.e. event 𝐡′ , is ignored. Another perspective is we can treat event 𝐡 as
a “reduced sample space”. We then observe that the only way for event 𝐴 to occur within this “reduced
sample space” is via the event 𝐴 ∩ 𝐡. As such, we arrive at the ratio 𝑃(𝐴 ∩ 𝐡)/𝑃(𝐡).
Example 3.6: Die Throw (continued)
Continuing with the setup in Examples 3.1, 3.2 and 3.5, define event 𝐴 as the throw of 1 and event
𝐡 as the throw of an odd number:
Event 𝐴 ∩ 𝐡 is the event of throwing 1.
𝑃(𝐴 ∩ 𝐡) = 1/6
Event 𝐡 is the event of throwing 1, 3 or 5.
𝑃(𝐡) = 3/6 = 1/2
Event 𝐴|𝐡 is the event of a throw of 1 given that the same throw is an odd number.
Substituting, 𝑃(𝐴|𝐡) = 𝑃(𝐴 ∩ 𝐡)/𝑃(𝐡) = (1/6)/(1/2) = 1/3
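For your understanding only: with equally likely outcomes, conditioning on 𝐡 reduces to counting within the “reduced sample space”, which a short sketch makes explicit:

```python
from fractions import Fraction

A = {1}        # throw of 1
B = {1, 3, 5}  # throw of an odd number

# P(A|B) = P(A ∩ B)/P(B) = |A ∩ B| / |B| for equally likely outcomes
P_A_given_B = Fraction(len(A & B), len(B))
assert P_A_given_B == Fraction(1, 3)
```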
Exercise 3.3: Students’ Seating Arrangement (continued)
Continuing with the setup in Exercise 3.1, given that Alice is seated on the leftmost seat, what is the
probability that Alice is sitting next to another girl? Show full working.
Let event 𝐴 be the event of Alice sitting next to another girl and event 𝐡 be the event that Alice is
seated on the leftmost seat.
We are therefore solving for 𝑃(𝐴|𝐡).
Number of ways that the 7 students can be seated = 7! = 5040
Number of ways that the 7 students can be seated but with Alice on the leftmost seat = 1 × 6! = 720
Thus, 𝑃(𝐡) = 720/5040 = 1/7
Number of ways that the 7 students can be seated but with Alice on the leftmost seat and another
girl on the 2nd leftmost seat = 1 × 3 × 5! = 360
Thus, 𝑃(𝐴 ∩ 𝐡) = 360/5040 = 1/14
Substituting, 𝑃(𝐴|𝐡) = 𝑃(𝐴 ∩ 𝐡)/𝑃(𝐡) = (1/14)/(1/7) = 1/2
It is also possible to get to the answer via logical reasoning. After removing Alice from consideration,
there are 6 possible choices for the 2nd leftmost seat. Since 3 of these choices are girls, the
probability of picking a girl for the 2nd leftmost seat must be 3/6 = 1/2.
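For your understanding only: since 7! is only 5040, the conditional probability can also be verified by full enumeration (a brute-force sketch; the single-letter names are the students’ initials):

```python
from itertools import permutations
from fractions import Fraction

girls = {"A", "C", "E", "G"}        # Alice, Carol, Elaine, Gigi
students = girls | {"B", "D", "F"}  # plus Brandon, Dan, Fred

count_B = count_A_and_B = 0
for seating in permutations(students):  # all 7! = 5040 arrangements
    if seating[0] == "A":               # Alice on the leftmost seat
        count_B += 1
        if seating[1] in girls:         # another girl next to Alice
            count_A_and_B += 1

assert Fraction(count_A_and_B, count_B) == Fraction(1, 2)  # P(A|B) = 1/2
```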
Mutually Exclusive Events and Independent Events
Two properties relating to events are frequently mentioned and/or used in probability calculations:
• Mutually exclusive events: Two events 𝑨 and 𝑩 are mutually exclusive if both 𝑨 and 𝑩 cannot occur at the same time in a single experiment
𝑷(𝑨 ∩ 𝑩) = 𝟎
Mutually exclusive events 𝐴 and 𝐡 shown pictorially, with a Venn diagram:
[Venn diagram: two non-overlapping circles 𝐴 and 𝐡 inside the rectangle 𝑆]
It follows that for mutually exclusive events, 𝑃(𝐴 ∪ 𝐡) = 𝑃(𝐴) + 𝑃(𝐡)
• Independent events: Two events 𝑨 and 𝑩 are independent of each other if the probability of 𝑩 occurring is not affected by whether 𝑨 has occurred
𝑷(𝑨 ∩ 𝑩) = 𝑷(𝑨) × π‘·(𝑩) or 𝑷(𝑨|𝑩) = 𝑷(𝑨) or 𝑷(𝑩|𝑨) = 𝑷(𝑩)
The above three tests for independence are equivalent to one another, so only one of them needs to be carried out when checking for independence between events
Example 3.7: More Die Throws
Continuing with the setup in Examples 3.1, 3.2, 3.5 and 3.6, define event 𝐴 as the throw of 1 and
event 𝐡 as the throw of 2:
Events 𝐴 and 𝐡 are mutually exclusive events because we can only obtain a single outcome on a
single throw of the die. It is not possible to obtain 1 and 2 on the same throw.
𝑃(𝐴 ∩ 𝐡) = 0
Suppose the die used above is blue in colour. A new red die is introduced. Define event 𝐴 as the
throw of 1 on the blue die and event 𝐢 as the throw of 3 on the red die.
Events 𝐴 and 𝐢 are independent events because the outcome of the throw with the blue die does
not affect the outcome of the throw with the red die.
𝑃(𝐴 ∩ 𝐢) = 𝑃(𝐴) × 𝑃(𝐢) = 1/6 × 1/6 = 1/36
Exercise 3.4: First Throw and Sum of Two Throws
Consider the outcome of two throws of one 6-sided fair die. Three events are defined as follows:
• Event 𝐴: First throw is 1
• Event 𝐡: Sum of the two throws is 6
• Event 𝐢: Sum of the two throws is 7
Are events 𝐴 and 𝐡 independent of each other? Are events 𝐴 and 𝐢 independent of each other?
Are events 𝐴 and 𝐡 mutually exclusive? Are events 𝐴 and 𝐢 mutually exclusive?
𝑃(𝐴) = 1/6
𝑃(𝐡) = 5/36 (cases of 1 + 5, 2 + 4, 3 + 3, 4 + 2, 5 + 1)
𝑃(𝐴 ∩ 𝐡) = 1/36 (case of 1 + 5) ≠ 𝑃(𝐴) × 𝑃(𝐡) = 1/6 × 5/36 = 5/216
Thus, events 𝐴 and 𝐡 are not independent of each other.
Similarly, we have 𝑃(𝐴|𝐡) = 𝑃(𝐴 ∩ 𝐡)/𝑃(𝐡) = (1/36)/(5/36) = 1/5 ≠ 𝑃(𝐴) = 1/6
We also have 𝑃(𝐡|𝐴) = 𝑃(𝐴 ∩ 𝐡)/𝑃(𝐴) = (1/36)/(1/6) = 1/6 ≠ 𝑃(𝐡) = 5/36
𝑃(𝐢) = 6/36 = 1/6 (cases of 1 + 6, 2 + 5, 3 + 4, 4 + 3, 5 + 2, 6 + 1)
𝑃(𝐴 ∩ 𝐢) = 1/36 (case of 1 + 6) = 𝑃(𝐴) × 𝑃(𝐢) = 1/6 × 1/6 = 1/36
Thus, events 𝐴 and 𝐢 are independent of each other.
Similarly, we have 𝑃(𝐴|𝐢) = 𝑃(𝐴 ∩ 𝐢)/𝑃(𝐢) = (1/36)/(1/6) = 1/6 = 𝑃(𝐴)
Similarly, we also have 𝑃(𝐢|𝐴) = 𝑃(𝐴 ∩ 𝐢)/𝑃(𝐴) = (1/36)/(1/6) = 1/6 = 𝑃(𝐢)
Some independent events are not immediately obvious upon observation, so the best way to find out
is to carry out one of the checks, 𝑃(𝐴 ∩ 𝐡) = 𝑃(𝐴) × π‘ƒ(𝐡) or 𝑃(𝐴|𝐡) = 𝑃(𝐴) or 𝑃(𝐡|𝐴) = 𝑃(𝐡).
Since 𝑃(𝐴 ∩ 𝐡) = 1/36 ≠ 0, events 𝐴 and 𝐡 are not mutually exclusive.
Since 𝑃(𝐴 ∩ 𝐢) = 1/36 ≠ 0, events 𝐴 and 𝐢 are not mutually exclusive.
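For your understanding only: the 36 equally likely outcomes of two throws are easy to enumerate, so all of the checks above can be reproduced directly (a minimal sketch; the helper P is ours):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely pairs
P = lambda pred: Fraction(sum(pred(o) for o in outcomes), len(outcomes))

P_A = P(lambda o: o[0] == 1)    # first throw is 1
P_B = P(lambda o: sum(o) == 6)  # sum of the two throws is 6
P_C = P(lambda o: sum(o) == 7)  # sum of the two throws is 7
P_AB = P(lambda o: o[0] == 1 and sum(o) == 6)
P_AC = P(lambda o: o[0] == 1 and sum(o) == 7)

assert P_AB != P_A * P_B        # A and B are not independent
assert P_AC == P_A * P_C        # A and C are independent
assert P_AB != 0 and P_AC != 0  # neither pair is mutually exclusive
```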
Bayes’ Theorem
Given π‘˜ mutually exclusive and exhaustive events, 𝐡1 , 𝐡2 , … , π΅π‘˜ , i.e. 𝑃(𝐡1 ) + 𝑃(𝐡2 ) + β‹― + 𝑃(π΅π‘˜ ) = 1,
and an observed event 𝐴,
𝑷(π‘©π’Š |𝑨) =
𝑷(π‘©π’Š )𝑷(𝑨|π‘©π’Š )
𝑷(π‘©πŸ )𝑷(𝑨|π‘©πŸ )+𝑷(π‘©πŸ )𝑷(𝑨|π‘©πŸ )+β‹―+𝑷(π‘©π’Œ )𝑷(𝑨|π‘©π’Œ )
We are primarily interested in the probability of one of the events 𝐡1 , 𝐡2 , … , π΅π‘˜ .
Event 𝐴 contains some information about events 𝐡1 , 𝐡2 , … , π΅π‘˜ .
𝑃(𝐴|𝐡𝑖 ) is the probability of observing 𝐴 when 𝐡𝑖 holds, estimated from past experience; it captures how accurately 𝐴 predicts each of 𝐡1 , 𝐡2 , … , π΅π‘˜ .
𝑃(𝐡𝑖 ) , known as the prior probability, is the estimate of the probability of 𝐡𝑖 before any external
information is received and taken into consideration. Hence, “prior”.
𝑃(𝐡𝑖 |𝐴), known as the posterior probability, is the estimate of the probability of 𝐡𝑖 after some external
information in the form of event 𝐴 is received and taken into consideration. Hence, “posterior”.
Example 3.8: Your Lecturer’s Morning Umbrella Dilemma
I am a lazy lecturer who dislikes carrying an umbrella. Consider, therefore, the question I face when I wake up each morning: will it rain later today?
Based on my past experience of living in Singapore, I have my own estimates of the probability of
various weather types on a typical September day. These are my prior probabilities. For example,
𝐡1 is sunny, 𝐡2 is cloudy, 𝐡3 is rainy and so on, and I have 𝑃(𝐡3 ) based on my past experience.
I then switch on the radio and the DJ reads a weather report stating it will rain later today. This is the
external information 𝐴. This new information will affect my estimate of the probability of rain later
today, so my objective is now to calculate 𝑃(𝐡3 |𝐴), to take into account external information 𝐴.
For many days in the past, I have recorded down the weather forecasts and noted what the weather
was like later on those same days. This allows me to calculate 𝑃(𝐴|𝐡𝑖 ) for all the 𝐡𝑖 s. For example,
𝑃(𝐴|𝐡3 ) is the probability that the weather report stated it would rain given that it did rain that day.
We can then apply Bayes’ Theorem to arrive at what I need,
𝑃(𝐡3 |𝐴) = 𝑃(𝐡3 )𝑃(𝐴|𝐡3 ) / [𝑃(𝐡1 )𝑃(𝐴|𝐡1 ) + 𝑃(𝐡2 )𝑃(𝐴|𝐡2 ) + β‹― + 𝑃(π΅π‘˜ )𝑃(𝐴|π΅π‘˜ )]
Note that the LHS is the probability we are interested in and the RHS contains all the information
(past and current) we have on hand.
Proof of Bayes’ Theorem: 𝑃(𝐡𝑖 |𝐴) = 𝑃(𝐡𝑖 )𝑃(𝐴|𝐡𝑖 ) / [𝑃(𝐡1 )𝑃(𝐴|𝐡1 ) + 𝑃(𝐡2 )𝑃(𝐴|𝐡2 ) + β‹― + 𝑃(π΅π‘˜ )𝑃(𝐴|π΅π‘˜ )].
We start from the LHS and apply the definition of conditional probability, 𝑃(𝐡𝑖 |𝐴) = 𝑃(𝐡𝑖 ∩ 𝐴)/𝑃(𝐴).
The proof proceeds by showing
• The numerators are equal: 𝑃(𝐡𝑖 ∩ 𝐴) = 𝑃(𝐡𝑖 )𝑃(𝐴|𝐡𝑖 ), and
• The denominators are equal: 𝑃(𝐴) = 𝑃(𝐡1 )𝑃(𝐴|𝐡1 ) + 𝑃(𝐡2 )𝑃(𝐴|𝐡2 ) + β‹― + 𝑃(π΅π‘˜ )𝑃(𝐴|π΅π‘˜ )
Numerator
Applying the definition of conditional probability, 𝑃(𝐴|𝐡𝑖 ) = 𝑃(𝐴 ∩ 𝐡𝑖 )/𝑃(𝐡𝑖 ).
Since 𝑃(𝐴 ∩ 𝐡𝑖 ) = 𝑃(𝐡𝑖 ∩ 𝐴), we have 𝑃(𝐴|𝐡𝑖 ) = 𝑃(𝐡𝑖 ∩ 𝐴)/𝑃(𝐡𝑖 ).
Rearranging, 𝑃(𝐡𝑖 ∩ 𝐴) = 𝑃(𝐡𝑖 )𝑃(𝐴|𝐡𝑖 ).
Denominator
Since the events 𝐡1 , 𝐡2 , … , π΅π‘˜ are mutually exclusive and exhaustive, we have
𝑃(𝐴) = 𝑃(𝐴 ∩ 𝐡1 ) + 𝑃(𝐴 ∩ 𝐡2 ) + β‹― + 𝑃(𝐴 ∩ π΅π‘˜ ).
Applying the definition of conditional probability again, 𝑃(𝐴|𝐡𝑖 ) = 𝑃(𝐴 ∩ 𝐡𝑖 )/𝑃(𝐡𝑖 ).
Since 𝑃(𝐴 ∩ 𝐡1 ) = 𝑃(𝐡1 ∩ 𝐴), we have 𝑃(𝐴|𝐡1 ) = 𝑃(𝐡1 ∩ 𝐴)/𝑃(𝐡1 ).
Rearranging, 𝑃(𝐡1 ∩ 𝐴) = 𝑃(𝐡1 )𝑃(𝐴|𝐡1 ).
Similarly, we have 𝑃(𝐴 ∩ 𝐡2 ) = 𝑃(𝐡2 )𝑃(𝐴|𝐡2 ) and so on.
Substituting back, 𝑃(𝐴) = 𝑃(𝐡1 )𝑃(𝐴|𝐡1 ) + 𝑃(𝐡2 )𝑃(𝐴|𝐡2 ) + β‹― + 𝑃(π΅π‘˜ )𝑃(𝐴|π΅π‘˜ ).
Combining, 𝑃(𝐡𝑖 |𝐴) = 𝑃(𝐡𝑖 )𝑃(𝐴|𝐡𝑖 ) / [𝑃(𝐡1 )𝑃(𝐴|𝐡1 ) + 𝑃(𝐡2 )𝑃(𝐴|𝐡2 ) + β‹― + 𝑃(π΅π‘˜ )𝑃(𝐴|π΅π‘˜ )].
Example 3.9: Your Lecturer’s Morning Umbrella Dilemma (cont.)
Continuing with the setup in Example 3.8, suppose there are only three weather types, namely sunny,
cloudy and rainy.
I believe the probabilities of sunny, cloudy and rainy weather on a typical September day are 0.25,
0.35 and 0.4 respectively. From past experience, 15% of the weather forecasts predicted rainy
weather when it turned out sunny, 30% of the weather forecasts predicted rainy weather when it
turned out cloudy, and 85% of the weather forecasts predicted rainy weather when it turned out rainy.
Given that the weather forecast predicted rainy weather later today, what is the probability that it will
be rainy later in the day?
Let 𝐹𝑅 be the event that the weather forecast predicts rainy weather, π‘Šπ‘† be the event of sunny
weather, π‘ŠπΆ be the event of cloudy weather and π‘Šπ‘… be the event of rainy weather.
We want to calculate 𝑃(π‘Šπ‘… |𝐹𝑅 ).
Applying Bayes’ Theorem, 𝑃(π‘Šπ‘… |𝐹𝑅 ) = 𝑃(π‘Šπ‘… )𝑃(𝐹𝑅 |π‘Šπ‘… ) / [𝑃(π‘Šπ‘† )𝑃(𝐹𝑅 |π‘Šπ‘† ) + 𝑃(π‘ŠπΆ )𝑃(𝐹𝑅 |π‘ŠπΆ ) + 𝑃(π‘Šπ‘… )𝑃(𝐹𝑅 |π‘Šπ‘… )]
From the question,
𝑃(π‘Šπ‘† ) = 0.25, 𝑃(π‘ŠπΆ ) = 0.35 and 𝑃(π‘Šπ‘… ) = 0.4
𝑃(𝐹𝑅 |π‘Šπ‘† ) = 0.15, 𝑃(𝐹𝑅 |π‘ŠπΆ ) = 0.3 and 𝑃(𝐹𝑅 |π‘Šπ‘… ) = 0.85
Substituting,
𝑃(π‘Šπ‘… |𝐹𝑅 ) = (0.4 × 0.85) / (0.25 × 0.15 + 0.35 × 0.3 + 0.4 × 0.85) = 0.704663
Given that 𝑃(π‘Šπ‘… |𝐹𝑅 ) is quite high, I better bring an umbrella today!
Note that the prior probability of rain was 0.4 and it has increased to more than 0.7 after receiving
external information that predicted it will rain. This is because the weather forecast has been
reasonably accurate in the past. We can see this by noting that the proportions of incorrect predictions
of rainy weather are low, i.e. 𝑃(𝐹𝑅 |π‘Šπ‘† ) and 𝑃(𝐹𝑅 |π‘ŠπΆ ) are low, and that the proportion of correct
predictions of rainy weather is high, i.e. 𝑃(𝐹𝑅 |π‘Šπ‘… ) is high.
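For your understanding only: the formula method translates directly into a few lines of Python. The sketch below (the function name bayes_posterior is ours) reproduces the result above:

```python
def bayes_posterior(priors, likelihoods, i):
    """P(B_i | A) from the priors P(B_j) and the likelihoods P(A | B_j)."""
    numerator = priors[i] * likelihoods[i]
    denominator = sum(p * l for p, l in zip(priors, likelihoods))
    return numerator / denominator

priors = [0.25, 0.35, 0.4]       # P(sunny), P(cloudy), P(rainy)
likelihoods = [0.15, 0.3, 0.85]  # P(forecast rain | each weather type)

print(bayes_posterior(priors, likelihoods, 2))  # ≈ 0.704663
```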
There are three main methods when applying Bayes’ Theorem:
• Formula method: Apply 𝑃(𝐡𝑖 |𝐴) = 𝑃(𝐡𝑖 )𝑃(𝐴|𝐡𝑖 ) / [𝑃(𝐡1 )𝑃(𝐴|𝐡1 ) + 𝑃(𝐡2 )𝑃(𝐴|𝐡2 ) + β‹― + 𝑃(π΅π‘˜ )𝑃(𝐴|π΅π‘˜ )] directly
• Tree diagram method: Draw a tree diagram, calculate each branch’s probability, then consider the branches of interest
• Tableau method: Set out the probabilities in a table and calculate the probabilities systematically
Example 3.10: Your Lecturer’s Morning Umbrella Dilemma (cont.)
Apply the three methods for Bayes’ Theorem to solve Example 3.9.
The formula method has already been covered under Example 3.9.
To apply the tree diagram and tableau methods, we need to define two more events. Let 𝐹𝑆 be the
event that the weather forecast predicts sunny weather and 𝐹𝐢 be the event that the weather forecast
predicts cloudy weather.
Tree Diagram
[Tree diagram: the first stage branches on the weather π‘Šπ‘– with probabilities 𝑃(π‘Šπ‘– ): π‘Šπ‘† (0.25), π‘ŠπΆ (0.35) and π‘Šπ‘… (0.4). Each weather branch then splits into the forecasts 𝐹𝑆 , 𝐹𝐢 and 𝐹𝑅 with probabilities 𝑃(𝐹𝑗 |π‘Šπ‘– ); the 𝐹𝑅 branches carry 0.15, 0.3 and 0.85 respectively. Each complete path ends at a sample point π‘Šπ‘– ∩ 𝐹𝑗 .]
Recall that we want to calculate 𝑃(π‘Šπ‘… |𝐹𝑅 ).
𝑃(π‘Šπ‘… |𝐹𝑅 ) =
𝑃(π‘Šπ‘… ∩𝐹𝑅 )
𝑃(𝐹𝑅 )
From the tree diagram, we observe that
𝑃(π‘Šπ‘… ∩ 𝐹𝑅 ) = 𝑃(π‘Šπ‘… ) × π‘ƒ(𝐹𝑅 |π‘Šπ‘… ) = 0.4 × 0.85 = 0.34
From the tree diagram, we also observe that
𝑃(𝐹𝑅 ) = 𝑃(π‘Šπ‘† ∩ 𝐹𝑅 ) + 𝑃(π‘ŠπΆ ∩ 𝐹𝑅 ) + 𝑃(π‘Šπ‘… ∩ 𝐹𝑅 )
From the tree diagram, we again observe
𝑃(π‘Šπ‘† ∩ 𝐹𝑅 ) = 𝑃(π‘Šπ‘† ) × π‘ƒ(𝐹𝑅 |π‘Šπ‘† ) = 0.25 × 0.15 = 0.0375
𝑃(π‘ŠπΆ ∩ 𝐹𝑅 ) = 𝑃(π‘ŠπΆ ) × π‘ƒ(𝐹𝑅 |π‘ŠπΆ ) = 0.35 × 0.3 = 0.105
Substituting,
𝑃(𝐹𝑅 ) = 𝑃(π‘Šπ‘† ∩ 𝐹𝑅 ) + 𝑃(π‘ŠπΆ ∩ 𝐹𝑅 ) + 𝑃(π‘Šπ‘… ∩ 𝐹𝑅 ) = 0.0375 + 0.105 + 0.34 = 0.4825
Substituting,
𝑃(π‘Šπ‘… |𝐹𝑅 ) = 𝑃(π‘Šπ‘… ∩ 𝐹𝑅 )/𝑃(𝐹𝑅 ) = 0.34/0.4825 = 0.704663
Note that the branches for 𝐹𝑆 and 𝐹𝐢 are drawn, but the probabilities are not required in solving for
𝑃(π‘Šπ‘… |𝐹𝑅 ).
Tableau Method

        | 𝑃(π‘Šπ‘– ) | 𝑃(𝐹𝑗 |π‘Šπ‘– )       | 𝑃(𝐹𝑗 ∩ π‘Šπ‘– )        | 𝑃(π‘Šπ‘– |𝐹𝑗 )
        |        | 𝐹𝑆   𝐹𝐢   𝐹𝑅    | 𝐹𝑆   𝐹𝐢   𝐹𝑅      | 𝐹𝑆   𝐹𝐢   𝐹𝑅
π‘Šπ‘†     | 0.25   | −    −    0.15  | −    −    0.0375  | −    −    0.0778
π‘ŠπΆ     | 0.35   | −    −    0.3   | −    −    0.105   | −    −    0.2176
π‘Šπ‘…     | 0.4    | −    −    0.85  | −    −    0.34    | −    −    0.7047
𝑃(𝐹𝑗 ) |        |                 | −    −    0.4825  |
𝑃(π‘Šπ‘– ) and 𝑃(𝐹𝑗 |π‘Šπ‘– ) are given in the question.
𝑃(𝐹𝑗 ∩ π‘Šπ‘– ) = 𝑃(π‘Šπ‘– ) × π‘ƒ(𝐹𝑗 |π‘Šπ‘– )
𝑃(𝐹𝑗 ) = ∑all 𝑖 𝑃(𝐹𝑗 ∩ π‘Šπ‘– )
𝑃(π‘Šπ‘– |𝐹𝑗 ) =
𝑃(𝐹𝑗 ∩π‘Šπ‘– )
𝑃(𝐹𝑗 )
Reading off the table,
𝑃(π‘Šπ‘… |𝐹𝑅 ) =
𝑃(π‘Šπ‘… ∩𝐹𝑅 )
𝑃(𝐹𝑅 )
=
0.34
0.4825
= 0.704663
Note that the cells for 𝐹𝑆 and 𝐹𝐢 are included, but the probabilities are not required in solving for
𝑃(π‘Šπ‘… |𝐹𝑅 ).
Exercise 3.5: Lie Detector
A lie detector is known to be accurate, giving 5% false positives and 1% false negatives. A police
detective administers the lie detector test on a suspect, whom he believes has a 75% probability of
lying. The lie detector test gives a “lying” result and the police detective accuses the suspect of lying
based on it. What is the probability that the police detective’s accusation is incorrect?
Let 𝐷𝐿 be the event that the lie detector gives a “lying” result, 𝐷𝑁 be the event that the lie detector
gives a “not lying” result, 𝐿 be the event that the suspect is lying and 𝑁 be the event that the suspect
is not lying, i.e. telling the truth.
𝑃(𝐿) = 0.75, so that 𝑃(𝑁) = 1 − 0.75 = 0.25
𝑃(𝐷𝐿|𝑁) = 0.05, so that 𝑃(𝐷𝑁|𝑁) = 1 − 0.05 = 0.95
𝑃(𝐷𝑁|𝐿) = 0.01, so that 𝑃(𝐷𝐿|𝐿) = 1 − 0.01 = 0.99
We want to calculate 𝑃(𝑁|𝐷𝐿).
Applying Bayes’ Theorem,
𝑃(𝑁|𝐷𝐿) = 𝑃(𝑁)𝑃(𝐷𝐿|𝑁) / [𝑃(𝑁)𝑃(𝐷𝐿|𝑁) + 𝑃(𝐿)𝑃(𝐷𝐿|𝐿)] = (0.25 × 0.05) / (0.25 × 0.05 + 0.75 × 0.99) = 0.016556
Thus, there is a probability of 1.66% that the police detective’s accusation is incorrect.
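For your understanding only: a quick numerical check of this exercise under the stated probabilities (a minimal sketch):

```python
P_L, P_N = 0.75, 0.25  # prior: suspect lying / not lying
P_DL_given_N = 0.05    # false positive rate
P_DL_given_L = 0.99    # 1 - false negative rate of 0.01

# P(N | DL) by Bayes' Theorem
P_N_given_DL = (P_N * P_DL_given_N) / (P_N * P_DL_given_N + P_L * P_DL_given_L)
print(round(P_N_given_DL, 6))  # 0.016556
```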
Food for Thought Question 3
There are 45 students in G22, COR-STAT1202 class. What is the probability that at least two
students share the same birthday in G22? Ignore 29th February, i.e. there are 365 days in a year.