
Probability Theory Tutorial: Basic Rules & Random Variables

STAT3210 Tutorial 1
2.4 Basic probability rules
Let E be an event and let S be the sample space containing E.
1. 0 ≤ P(E) ≤ 1, i.e. P(E) ≥ 0 and P(E) ≤ 1
2. P(E) = 0 if E is impossible, i.e. no outcome in S belongs to E
3. P(E) = 1 if E is certain, i.e. every outcome in S belongs to E
4. The sum of the probabilities of all the outcomes in S is 1
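These rules can be checked mechanically on a small example. The following minimal Python sketch assumes a fair six-sided die (the uniform model is an assumption for illustration, not part of the rules themselves):

# A minimal sketch checking the basic rules on a fair six-sided die.
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}                      # sample space
p = {s: Fraction(1, 6) for s in S}          # assumed fair-die model

def P(event):
    """Probability of an event (a subset of S)."""
    return sum(p[s] for s in event)

assert all(0 <= P({s}) <= 1 for s in S)     # rule 1
assert P(set()) == 0                        # rule 2: impossible event
assert P(S) == 1                            # rule 3: certain event
assert sum(p.values()) == 1                 # rule 4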
2.5 Properties derived from basic probability rules
1. P(E) = 1 − P(Eᶜ), where Eᶜ is the complementary event of E.
Example: Complementary event
From section 2.1, we know that the event of getting a sum of 7 can actually be expressed as the
set
E = {(1, 6) , (2, 5) , (3, 4) , (4, 3) , (5, 2) , (6, 1)}
The corresponding complementary event is the set of all outcomes in the sample space excluding those in E. Note that
E c = { (1, 1) , (1, 2) , (1, 3) , (1, 4) , (1, 5) ,
(2, 1) , (2, 2) , (2, 3) , (2, 4) , (2, 6) ,
(3, 1) , (3, 2) , (3, 3) , (3, 5) , (3, 6) ,
(4, 1) , (4, 2) , (4, 4) , (4, 5) , (4, 6) ,
(5, 1) , (5, 3) , (5, 4) , (5, 5) , (5, 6) ,
(6, 2) , (6, 3) , (6, 4) , (6, 5) , (6, 6) }
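Counting the outcomes above, n(Eᶜ) = 36 − 6 = 30, so P(Eᶜ) = 30/36 = 5/6, which agrees with P(Eᶜ) = 1 − P(E) = 1 − 1/6 = 5/6.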
Example: Find the probability in section 2.1
Denote E as the event that the sum of the upper faces of 2 simultaneously rolled dice equals 7. Let S be the sample space. The event E is equivalent to observing
(1, 6), (2, 5), (3, 4), (4, 3), (5, 2) or (6, 1)
in a new roll. The sample space is shown in the example ‘Sample space of specific events’. The required probability is
P(E) = n(E)/n(S) = 6/36 ≈ 0.167
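The same count can be reproduced by brute-force enumeration. A minimal Python sketch, assuming two fair six-sided dice:

# Enumerate the 36 equally likely ordered pairs and count those summing to 7.
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))    # 36 ordered outcomes
E = [(a, b) for (a, b) in S if a + b == 7]  # sum of upper faces is 7

print(E)                          # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
print(Fraction(len(E), len(S)))   # 1/6, i.e. about 0.167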
2.6 Addition rules of probability
The probability that at least one of two events A and B occurs is given by
P(A or B) = P(A) + P(B) − P(A and B)
which simplifies to the following form
P(A or B) = P(A) + P(B)
if and only if events A and B are mutually exclusive, which is equivalent to
P(A and B) = 0
By saying that two events are mutually exclusive, we mean that these two events have no overlapping outcomes.
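A minimal sketch of the addition rule, again assuming two fair dice. The events A = ‘sum is 7’ and B = ‘first die shows 1’ (chosen here purely for illustration) overlap in the single outcome (1, 6), so the subtracted term matters:

from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))
A = {s for s in S if sum(s) == 7}       # sum is 7
B = {s for s in S if s[0] == 1}         # first die shows 1

def P(event):
    return Fraction(len(event), len(S))

lhs = P(A | B)                          # P(A or B) directly
rhs = P(A) + P(B) - P(A & B)            # inclusion-exclusion
assert lhs == rhs == Fraction(11, 36)   # A and B are not mutually exclusive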
Example: Tossing three coins (Hard)
There are 3 fair coins for you to toss. You will observe the upward faces of all 3 coins after a toss.
1. Identify the sample space S of this coming toss.
2. Construct a sample space S2 of possible number of heads in the coming toss.
3. *Hence, are the outcomes in the sample space S2 equally likely to happen?
4. Find the probability of having at least 1 tail in the next toss.
5. Find the probability of having at least 1 tail and at least 1 head. Determine whether these
two events are mutually exclusive.
Solution.
1. Let H and T denote the head and tail of a coin respectively. Each outcome is an ordered triple of the upward faces of these 3 coins after a toss. The sample space S is
{ (H, H, H) , (H, H, T ) , (H, T, H) , (H, T, T ) ,
(T, H, H) , (T, H, T ) , (T, T, H) , (T, T, T )}
or simply
{HHH, HHT, HTH, HTT,
THH, THT, TTH, TTT}
2. Let S2 be the sample space of all possible head counts. Based on sample space S, we
know that there should only be either no heads, 1 head, 2 heads, or 3 heads. Hence, the
sample space S2 is
{0 head, 1 head, 2 heads, 3 heads}
14
3. Note that the outcomes in sample space S are equally likely to happen due to the fairness of these 3 coins.ᵃ We then have
P(0 head) = P({(T, T, T)}) = 1/8
We also have
P(1 head) = P({(H, T, T), (T, H, T), (T, T, H)})
= P({(H, T, T)}) + P({(T, H, T)}) + P({(T, T, H)})   (addition rule)
= (1/8) · 3
= 3/8 ≠ 1/8
The second equality comes from the fact that these outcomes can never occur at the same time: it is impossible for the upper face of coin 1 to be a head and a tail simultaneously. So these outcomes are mutually exclusive.
Since P(0 head) ≠ P(1 head), without computing the probabilities of the other 2 outcomes, we conclude that the outcomes of sample space S2 are not equiprobable, i.e. not equally likely to occur.
4. Note that the complement of ‘having at least 1 tail’ is ‘having no tail’. Hence,
P(at least 1 tail) = 1 − P(0 tail) = 1 − 1/8 = 7/8
5. We can rephrase the event ‘having at least 1 tail’ as ‘having at most 2 heads’. Therefore, the probability of having at least 1 tail and at least 1 head equals the probability of having one or two heads. Then,
P(at least 1 tail and at least 1 head) = P(at least 1 but at most 2 heads)
= P(1 head) + P(2 heads)   (addition rule)
= (3/8) · 2
= 3/4
Since P(at least 1 tail and at least 1 head) ≠ 0, we conclude that the events ‘having at least 1 tail’ and ‘having at least 1 head’ are not mutually exclusive.
ᵃ The proof of the equiprobability makes use of the multiplication rule of probability.
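All three computations can be verified by enumerating the 8 equally likely outcomes. A minimal Python sketch:

from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=3))   # 8 ordered triples, e.g. ('H', 'T', 'H')

def P(event):
    return Fraction(len(event), len(S))

for k in range(4):                  # head counts 0..3
    print(k, P([o for o in S if o.count("H") == k]))
# prints 1/8, 3/8, 3/8, 1/8 -- so S2 is not equiprobable

print(P([o for o in S if "T" in o]))               # 7/8: at least 1 tail
print(P([o for o in S if "T" in o and "H" in o]))  # 3/4: nonzero, not mutually exclusive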
1 The Multiplication Rules and Conditional Probability
When two events are independent, the probability of both occurring is
P(A and B) = P(A) · P(B)
where two events A and B are independent events if the fact that A occurs does not affect the probability
of B occurring.
When two events are dependent, the probability of both occurring is
P(A and B) = P(A) · P(B|A)
When the outcome or occurrence of the first event affects the outcome or occurrence of the second event in
such a way that the probability is changed, the events are said to be dependent events.
Conditional probability: the probability that the second event B occurs given that the first event A has
occurred can be found by dividing the probability that both events occurred by the probability that the first
event has occurred.
P(B|A) = P(A and B) / P(A)
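A minimal sketch of this formula as a function; the numbers P(A) = 0.5 and P(A and B) = 0.2 are hypothetical, for illustration only:

def cond_prob(p_a_and_b, p_a):
    """P(B|A) = P(A and B) / P(A); defined only when P(A) > 0."""
    if p_a <= 0:
        raise ValueError("P(A) must be positive")
    return p_a_and_b / p_a

print(cond_prob(0.2, 0.5))   # 0.4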
Exercise 1.1
The probability that Sam parks in a no-parking zone and gets a parking ticket is 0.06, and the probability
that Sam cannot find a legal parking space and has to park in the no-parking zone is 0.20. On Tuesday, Sam
arrives at school and has to park in a no-parking zone. Find the probability that he will get a parking ticket.
2 Law of Total Probability
2.1 Partition
A collection of non-empty sets A1, A2, . . . , An is called a partition of a set S if they are pairwise disjoint and their union is S, i.e. they satisfy
• Ai ∩ Aj = ∅ for any i ≠ j
• A1 ∪ A2 ∪ A3 ∪ · · · ∪ An = S
Remark: A partition can be a collection of infinitely many sets A1 , A2 , A3 , . . .
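The two conditions can be checked directly. A minimal sketch with an assumed example, S = {1, . . . , 6} partitioned by residue mod 3:

from itertools import combinations

S = set(range(1, 7))
parts = [{3, 6}, {1, 4}, {2, 5}]    # A1, A2, A3: grouped by residue mod 3

assert all(A for A in parts)                                   # non-empty
assert all(A & B == set() for A, B in combinations(parts, 2))  # pairwise disjoint
assert set().union(*parts) == S                                # union is S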
2.2 Law of total probability
If A1, A2, A3, . . . , An is a partition of the sample space S, then for any event E we have
P(E) = Σ_{i=1}^{n} P(E ∩ Ai) = Σ_{i=1}^{n} P(E|Ai)P(Ai)
Remark: For n = 2, we have P(E) = P(E ∩ A) + P(E ∩ Aᶜ) = P(E|A)P(A) + P(E|Aᶜ)P(Aᶜ)
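A minimal sketch of the law, written for a general partition; the priors and conditional probabilities below are hypothetical numbers for illustration only:

def total_probability(priors, likelihoods):
    """P(E) = sum of P(E|Ai) P(Ai) over a partition A1, ..., An."""
    assert abs(sum(priors) - 1) < 1e-12, "P(A1), ..., P(An) must sum to 1"
    return sum(p * l for p, l in zip(priors, likelihoods))

# Hypothetical two-set partition: P(A) = 0.6, P(A^c) = 0.4,
# with P(E|A) = 0.2 and P(E|A^c) = 0.5.
print(total_probability([0.6, 0.4], [0.2, 0.5]))   # 0.32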
2.3 Bayes’s Theorem
Let A1, A2, . . . , An be a partition of the sample space S. Then for any event E with P(E) > 0, the probability of Ai given E is
P(Ai|E) = P(Ai ∩ E)/P(E) = P(E|Ai)P(Ai)/P(E) = P(E|Ai)P(Ai) / Σ_{j=1}^{n} P(E|Aj)P(Aj)
Remark: For n = 2, we have
P(A|E) = P(E|A)P(A) / [P(E|A)P(A) + P(E|Aᶜ)P(Aᶜ)]
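A minimal sketch of the n = 2 form of Bayes’s theorem; the prior P(A) = 0.01 and the conditional probabilities are hypothetical numbers, not taken from any exercise here:

def bayes(prior_a, p_e_given_a, p_e_given_ac):
    """P(A|E) = P(E|A)P(A) / [P(E|A)P(A) + P(E|A^c)P(A^c)]."""
    num = p_e_given_a * prior_a
    den = num + p_e_given_ac * (1 - prior_a)
    return num / den

print(bayes(0.01, 0.9, 0.08))   # ~0.102: even after observing E, A remains unlikely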
Exercise 2.1
Urn A contains 2 white balls and 1 black ball, whereas urn B contains 1 white ball and 5 black balls. A ball
is drawn at random from urn A and then placed in urn B. A ball is then drawn from urn B. It happens to be
white. What is the probability that the ball transferred was white?
Exercise 2.2
You ask your neighbor to water a sickly plant while you are on vacation. Without water, it will die with
probability 0.8; with water, it will die with probability 0.15. You are 90% certain that your neighbor will
remember to water the plant.
(a) What is the probability that the plant will be alive when you return?
(b) If the plant is dead upon your return, what is the probability that your neighbor forgot to water it?
Exercise 2.3 “False positive”
Suppose there is a rare disease that can be randomly found in 0.5% of the general population. A certain
clinical blood test is 99% effective in detecting the presence of this disease; that is, it will yield an accurate
positive result in 99% of the cases where the disease is actually present. But it also yields false-positive
results in 5% of the cases where the disease is not present. Suppose one person gets tested and the result is positive. What is the probability that he actually has the disease?
3 Discrete Random Variable
3.1 Random variable
• A random variable is a variable whose values are determined by chance.
• Let X be a random variable, and S be the set of all possible values that X can assume.
– If S contains a finite or countably infinite number of elements, then X is called a discrete random variable.
– If S is made up of “intervals”, then X is a continuous random variable.
– There exist random variables that are neither discrete nor continuous. (Mixed random variables)
3.2 Probability Distribution
• Each random variable is associated with a probability distribution.
• For a discrete random variable X, its discrete probability distribution can be characterized by its probability mass function (pmf), which assigns a probability to each value that X can take:
p(x) = P(X = x) = the probability that X takes the value x
for each x ∈ S.
• p(x) must satisfy
0 ⩽ p(x) ⩽ 1 for all x ∈ S and Σ_{x∈S} p(x) = 1
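A minimal sketch checking these two conditions for a pmf stored as a Python dict; the particular three-value distribution is an assumption for illustration:

from fractions import Fraction

pmf = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}

assert all(0 <= p <= 1 for p in pmf.values())   # 0 <= p(x) <= 1 for all x
assert sum(pmf.values()) == 1                   # probabilities sum to 1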
3.3 Mean and variance of discrete random variable
• The mean, or expectation, or expected value, of a discrete random variable X, denoted by E(X) or µ, is defined by
E(X) = µ = Σ_{x∈S} x p(x)
• The variance of X, denoted by Var(X) or σ², is defined by
Var(X) = σ² = Σ_{x∈S} (x − µ)² p(x) = Σ_{x∈S} x² p(x) − µ²
= E[(X − µ)²] = E(X²) − [E(X)]²
• The standard deviation, σ, of X is the square root of the variance.
Remark: The mean of a random variable is a constant, not a random variable. The same holds for the variance and the standard deviation.
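A minimal sketch computing E(X) and Var(X), and checking that the two variance formulas agree, using the same hypothetical pmf as above:

from fractions import Fraction

pmf = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}

mu = sum(x * p for x, p in pmf.items())                 # E(X)
var = sum((x - mu) ** 2 * p for x, p in pmf.items())    # E[(X - mu)^2]
var_shortcut = sum(x * x * p for x, p in pmf.items()) - mu ** 2

print(mu)                     # 2/3
assert var == var_shortcut    # the two variance formulas agree
print(var)                    # 5/9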
Exercise 3.1
For each of the following random variables, tell whether it is discrete or continuous.
(a) Random variable X: The result (from 1 to 6) of rolling a die.
(b) Random variable Y: The number of atoms in a randomly selected region of air.
(c) Random variable Z: The exact waiting time for the next bus to come.
Exercise 3.2
Suppose that the random variable X is equal to the number of hits obtained by a certain baseball player in
his next 3 at bats.
(a) Find S , the set of all possible values that X can take.
(b) If P(X = 1) = 0.3, P(X = 2) = 0.2, and P(X = 0) = 3P(X = 3), find the probability distribution p(x)
for the random variable X. Then find E(X) and Var(X).
Exercise 3.3
Let X be a discrete random variable and S be the set of values that X can take. Let p(x) be the probability mass function of X and µ be the mean of X. Given the definition Var(X) = Σ_{x∈S} (x − µ)² p(x), try to verify that
Var(X) = Σ_{x∈S} x² p(x) − µ²