Chapter 6 Probability

6.1 Relative frequency and its properties
In chapter 3, we introduced frequency distributions for both categorical data and
numerical data. Let us recall them first.
The frequency for a particular category – the number of times the category appears in
the data set.
The relative frequency for a particular category – the proportion of the time that the
category appears in the data set.
Relative frequency = Frequency / Total number of observations.
Example 6.1 A veterinarian kept track of the types of the 40 animals treated in her clinic
during a typical day. The following relative frequency distribution summarizes her data.
Type of animal    Frequency    Relative Frequency
Cat               15           0.375
Dog               12           0.300
Horse             4            0.100
Others            9            0.225
Total             40           ?
It is easy to see that relative frequency has the following properties.

Properties of relative frequency
1. 0 ≤ relative frequency ≤ 1.
2. The cumulative relative frequency of two different categories is the sum of their
relative frequencies.
3. Relative frequencies of all categories should total 1.
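The arithmetic behind Example 6.1 and the three properties can be checked with a short Python sketch. It is an illustration only, not part of the original notes; the variable names are chosen here for convenience, and exact fractions are used so that the total comes out as exactly 1.

```python
from fractions import Fraction

# Frequencies from Example 6.1 (40 animals in total).
frequencies = {"Cat": 15, "Dog": 12, "Horse": 4, "Others": 9}

total = sum(frequencies.values())

# Relative frequency = frequency / total number of observations.
# Fraction keeps the arithmetic exact, so no rounding error creeps in.
relative = {animal: Fraction(count, total) for animal, count in frequencies.items()}

for animal, rf in relative.items():
    print(f"{animal:<7} {frequencies[animal]:>3} {float(rf):.3f}")

# Property 1: every relative frequency lies between 0 and 1.
all_in_range = all(0 <= rf <= 1 for rf in relative.values())

# Property 2: the cumulative relative frequency of two different categories
# is the sum of their relative frequencies, e.g. cats and dogs together.
cat_or_dog = relative["Cat"] + relative["Dog"]

# Property 3: the relative frequencies of all categories total 1.
total_relative = sum(relative.values())

print(all_in_range, float(cat_or_dog), total_relative)
```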
6.2 Probability and its basic properties
A probability is a number between 0 and 1 that reflects the likelihood of occurrence of
some outcome. A probability of 1 corresponds to an outcome that occurs 100% of the
time, that is, a certain outcome. A probability of 0 corresponds to an outcome that occurs
0% of the time, that is, an impossible outcome.
The probability of an outcome, denoted by P(outcome), is interpreted as the long-run
relative frequency of the outcome when the experiment is performed repeatedly under
identical conditions.
Example 6.2 Suppose that we track tosses of a fair coin. The results for the first twenty
tosses might be as follows (H – head, T – tail):
Toss number    Outcome    Relative frequency of H
1              H          1.000
2              H          1.000
3              T          0.667
4              H          0.750
5              T          0.600
6              T          0.500
7              H          0.571
8              H          0.625
9              H          0.667
10             T          0.600
11             T          0.545
12             T          0.500
13             H          0.538
14             T          0.500
15             T          0.467
16             T          0.438
17             H          0.471
18             T          0.444
19             H          0.474
20             H          0.500
Figure 6.1 The fluctuation of relative frequency (vertical axis: relative frequency of H;
horizontal axis: number of tosses, 1 to 20)
As the number of tosses increases, the relative frequency does not continue to fluctuate
wildly, but instead stabilizes and approaches some fixed number, called the limiting
value. This limiting value is the true probability.
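The running relative frequencies in Example 6.2 can be recomputed from the recorded outcomes. The following Python sketch is an illustration added here, not part of the original notes; the string `outcomes` simply transcribes the twenty tosses of the example.

```python
# Outcomes of the twenty tosses recorded in Example 6.2 (H = head, T = tail).
outcomes = "HHTHTTHHHTTTHTTTHTHH"

heads = 0
running = []  # relative frequency of H after each toss
for toss_number, outcome in enumerate(outcomes, start=1):
    if outcome == "H":
        heads += 1
    running.append(heads / toss_number)

for toss_number, rf in enumerate(running, start=1):
    print(f"{toss_number:>2} {outcomes[toss_number - 1]} {rf:.3f}")
```

Twenty tosses are too few to show the stabilization clearly; replacing `outcomes` with a much longer simulated sequence would show the relative frequency settling near 0.5 for a fair coin.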

Basic properties of probability
1) 0 ≤ P(any outcome) ≤ 1, since probability is interpreted as the long-run relative
frequency and a relative frequency cannot be less than 0 or greater than 1.
2) (Addition rule) If two outcomes A1, A2 cannot occur simultaneously, then
P(A1 or A2) = P(A1) + P(A2).
Generally, if no two of the outcomes A1, A2, ..., Ak can occur simultaneously, then
P(A1 or A2 or ... or Ak) = P(A1) + P(A2) + ... + P(Ak).
3) (Complement rule) The probability that an outcome A will not occur is equal to 1
minus the probability that the outcome will occur, that is,
P(not A) = 1 – P(A).

Three ways to find probability
a) Determining probabilities analytically.
Let S = {all possible outcomes} and let E = {some outcomes} be an event.
Generally, probabilities can be determined analytically, by employing mathematical rules
and probability properties. For example, if all outcomes are equally likely to occur, then
P(E) = (the number of outcomes in E) / (the number of total outcomes).
Example 6.3
1) E – Get a head in a toss of a fair coin. Here S = {head, tail}. Since the coin is fair,
“head” and “tail” are equally likely to occur. Thus, P(E) = 1/2.
2) Suppose that a multiple-choice problem has 4 choices, exactly one of which is correct,
and a student answers the problem by guessing. E – The student gets the correct answer to
the problem. Since the student guesses, the 4 choices are equally likely to be chosen.
Thus, P(E) = 1/4.
3) E – Get an ace from a well-shuffled deck of 52 cards. Since the deck is well shuffled,
each card has the same chance of being chosen. Thus, P(E) = 4/52 = 1/13.
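The counting rule for equally likely outcomes can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the original notes: the helper `probability`, the choice labels a to d (with "a" taken as the correct one), and the rank and suit names are all introduced here for the example.

```python
from fractions import Fraction
from itertools import product

def probability(event, sample_space):
    """P(E) = (number of outcomes in E) / (number of total outcomes).

    Valid only when every outcome in sample_space is equally likely.
    """
    favorable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favorable, len(sample_space))

# 1) One toss of a fair coin.
coin = ["head", "tail"]
p_head = probability(lambda o: o == "head", coin)

# 2) Guessing on a multiple-choice problem with 4 choices.
# Labelling the choices a-d with "a" correct is an assumption for illustration.
choices = ["a", "b", "c", "d"]
p_correct = probability(lambda o: o == "a", choices)

# 3) Drawing one card from a well-shuffled standard 52-card deck.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))
p_ace = probability(lambda card: card[0] == "A", deck)

print(p_head, p_correct, p_ace)  # prints: 1/2 1/4 1/13
```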
b) Estimating probabilities empirically
When an analytical approach is impossible or impractical, we can estimate probabilities
empirically, that is, by observed long-run proportions. For example, the probability of
getting a head when tossing an unfair coin can be estimated by the observed proportion of
heads in 200 tosses. Intuitively, the more times we toss, the more accurate this estimate
will be. If the proportion were based on 20 tosses rather than 200, we would be more
hesitant to use it as an estimate of the corresponding probability.
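The effect of the number of tosses can be explored with a small Python sketch. Since no real unfair coin is available here, the tosses are generated by a random number generator, and `TRUE_P = 0.3` is a hypothetical bias chosen only for illustration; in a real experiment this value would be unknown and only the observed proportion would be available.

```python
import random

def estimate_head_probability(n_tosses, p_head, rng):
    """Toss a coin n_tosses times and return the observed proportion of heads."""
    heads = sum(1 for _ in range(n_tosses) if rng.random() < p_head)
    return heads / n_tosses

# TRUE_P is a hypothetical value standing in for the unknown bias of the coin.
TRUE_P = 0.3
rng = random.Random(2024)  # fixed seed so the run is repeatable

estimates = {n: estimate_head_probability(n, TRUE_P, rng) for n in (20, 200, 20000)}
for n, p_hat in estimates.items():
    print(f"{n:>6} tosses: estimated P(head) = {p_hat:.3f}")
```

Rerunning with different seeds typically shows the 20-toss estimate varying much more from run to run than the estimates based on more tosses.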
c) Estimating probabilities by simulation
When we are unable to determine probabilities analytically and it is impractical to
estimate them empirically, we can estimate probabilities by simulation.
1) Design a method that uses a random mechanism (such as a random number generator,
the selection of a ball from a box, the toss of a coin, etc.) to represent an observation.
Be sure that the important characteristics of the actual process are preserved.
2) Generate an observation using the method in step 1, and determine whether the
outcome of interest has occurred.
3) Repeat step 2 a large number of times.
4) Estimate the probability by the relative frequency of the outcome of interest, that is,
(the number of observations for which the outcome of interest occurred) / (the total
number of observations generated).
Example 6.4 Suppose that couples who wanted children were to continue having children
until a boy is born. If we assume that each newborn child is equally likely to be a boy or a
girl, would this behavior change the proportion of boys in the population?
We can simulate sibling groups by tossing a fair coin with “head” representing a boy and
“tail” representing a girl.
Sibling group    Coin tosses    Children
1                TH             girl, boy
2                H              boy
3                H              boy
4                TTTTTH         girl, girl, girl, girl, girl, boy
Continuing the simulation to obtain a large number of observations suggests that the
long-run proportion of boys in the population would still be .5. We can also simulate
sibling groups by drawing a ball at random from a box that contains a red ball and a blue
ball with replacement, with the blue ball representing a boy and the red ball representing
a girl. Generally, we use statistical software packages to carry out simulations.
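The four simulation steps applied to Example 6.4 might look like this in Python. It is a minimal sketch, not the notes' own procedure: a random number generator replaces the coin, and the function name, the seed, and the number of simulated couples are arbitrary choices made here.

```python
import random

def simulate_sibling_group(rng):
    """Simulate one couple: children are born until the first boy arrives.

    Each birth is a boy with probability 1/2, as assumed in Example 6.4.
    Returns the sibling group as a list, e.g. ["girl", "girl", "boy"].
    """
    children = []
    while True:
        child = "boy" if rng.random() < 0.5 else "girl"
        children.append(child)
        if child == "boy":
            return children

rng = random.Random(6)  # fixed seed so the run is repeatable
n_groups = 100_000      # number of simulated couples (an arbitrary choice)

boys = 0
children_total = 0
for _ in range(n_groups):
    group = simulate_sibling_group(rng)
    boys += group.count("boy")
    children_total += len(group)

# Step 4: estimate the probability by the relative frequency of boys.
proportion_boys = boys / children_total
print(f"Proportion of boys: {proportion_boys:.3f}")
```

Every sibling group contains exactly one boy, so the result depends only on how many girls precede him on average.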

Independence
Independent outcomes: Two outcomes are said to be independent if the probability that
one outcome occurs is not affected by knowledge of whether the other has occurred.
More than two outcomes are said to be independent if knowledge that some of the
outcomes have occurred does not change the probabilities that any of the other outcomes
occur.
Dependent outcomes: If the occurrence of one outcome changes the probability that the
other outcome occurs, the outcomes are dependent.
Example 6.5
(1) Toss a coin twice. Let
A1 – get a head in the first toss,
A2 – get a head in the second toss.
Then A1 and A2 are ?.
(2) Choose two cards sequentially, without replacement, from a well-shuffled deck of cards. Let
B1 – get an ace on the first draw,
B2 – get an ace on the second draw.
Then B1 and B2 are ? since P(B2 if B1 occurs) = ? is different from P(B2 if B1 does not
occur) = ?.

The multiplication rule for independent outcomes
If two outcomes, A1 and A2, are independent, the probability that both outcomes occur is
the product of the individual outcome probabilities, that is,
P(A1 and A2 ) = P(A1)P(A2).
More generally, if k outcomes, A1, ..., Ak, are mutually independent, then
P(A1 and A2 and ... and Ak) = P(A1)P(A2)...P(Ak).
Example 6.6
A student guesses the answers to two multiple-choice problems (each with 4 choices, one of
which is correct). Find
(1) P(the student gets two correct answers),
(2) P(the student gets two wrong answers),
(3) P(the student gets exactly one correct answer),
(4) P(the student gets at least one correct answer).
Let A1 – the student gets the correct answer to the first problem,
A1^c – the student gets the wrong answer to the first problem,
A2 – the student gets the correct answer to the second problem,
A2^c – the student gets the wrong answer to the second problem.
Since the student guesses, the answers to the two problems are independent, with
P(A1) = P(A2) = 1/4 and P(A1^c) = P(A2^c) = 3/4. Then
(1) P(the student gets two correct answers) = P(A1 and A2)
= P(A1)P(A2) = (1/4) × (1/4) = 1/16.
(2) P(the student gets two wrong answers) = P(A1^c and A2^c)
= P(A1^c)P(A2^c) = (3/4) × (3/4) = 9/16.
(3) P(the student gets exactly one correct answer) = P((A1 and A2^c) or (A1^c and A2))
= P(A1 and A2^c) + P(A1^c and A2)
= P(A1)P(A2^c) + P(A1^c)P(A2)
= (1/4)(3/4) + (3/4)(1/4) = 6/16 = 3/8.
(4) P(the student gets at least one correct answer)
= 1 - P(the student gets two wrong answers) = 1 - 9/16 = 7/16.
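The four answers of Example 6.6 can be cross-checked by listing every pair of results and applying the multiplication and addition rules. The Python sketch below is an illustration added here; the dictionary layout and the helper `prob_correct_count` are hypothetical names, not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# For one guessed problem with 4 choices: correct with probability 1/4,
# wrong with probability 3/4.
p_single = {"correct": Fraction(1, 4), "wrong": Fraction(3, 4)}

# The two guesses are independent, so the probability of each pair of results
# is the product of the individual probabilities (multiplication rule).
p_pair = {
    (first, second): p_single[first] * p_single[second]
    for first, second in product(p_single, repeat=2)
}

def prob_correct_count(condition):
    """Add up the probabilities of the pairs whose number of correct answers
    satisfies condition; the pairs cannot occur simultaneously (addition rule)."""
    return sum(p for pair, p in p_pair.items() if condition(pair.count("correct")))

p_two_correct = prob_correct_count(lambda k: k == 2)
p_two_wrong = prob_correct_count(lambda k: k == 0)
p_exactly_one = prob_correct_count(lambda k: k == 1)
p_at_least_one = prob_correct_count(lambda k: k >= 1)

print(p_two_correct, p_two_wrong, p_exactly_one, p_at_least_one)
# prints: 1/16 9/16 3/8 7/16
```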
Example 6.7 At the opening of a new hit movie, the age and sex of 400 movie-goers were
recorded and are given below.
Sex \ Age    0 - < 13    13 - < 20    20 - < 30    Over 30
Female       14          72           86           22
Male         10          110          62           24
1) Estimate the probability of the event A that a movie-goer is female.
2) Estimate the probability of the event B that a movie-goer is younger than 20.
3) Are the two events A and B independent?
4) If two movie-goers are selected independently of each other, estimate the probability
that both are female.
1) P(A) = (?+?+?+?)/? = 0.485.
2) P(B) = (?+?+?+?)/? = 0.515.
3) Since P(A and B) = P(a movie-goer is female and is younger than 20) = (?+?)/400 =
0.215 ≠ 0.2498 ≈ 0.485 × 0.515 = P(A)P(B), A and B are not independent.
4) Since the two movie-goers are selected independently, P(both are female) = ? × ? ≈
0.2352.
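The estimates in Example 6.7 can be reproduced from the table of counts. The Python sketch below is an illustration added here, not part of the original notes; the nested dictionary and the age-group labels are just one convenient way to hold the table.

```python
# Counts from the table in Example 6.7, keyed by sex and then by age group.
counts = {
    "Female": {"0-<13": 14, "13-<20": 72, "20-<30": 86, "Over 30": 22},
    "Male": {"0-<13": 10, "13-<20": 110, "20-<30": 62, "Over 30": 24},
}
under_20 = ("0-<13", "13-<20")

total = sum(sum(by_age.values()) for by_age in counts.values())

# 1) A: the movie-goer is female.
p_a = sum(counts["Female"].values()) / total

# 2) B: the movie-goer is younger than 20.
p_b = sum(by_age[age] for by_age in counts.values() for age in under_20) / total

# 3) A and B: female and younger than 20. Independence would require
#    P(A and B) = P(A)P(B).
p_a_and_b = sum(counts["Female"][age] for age in under_20) / total
product_ab = p_a * p_b

# 4) Two movie-goers selected independently: both female.
p_both_female = p_a * p_a

print(f"P(A) = {p_a:.3f}, P(B) = {p_b:.3f}")
print(f"P(A and B) = {p_a_and_b:.3f}, P(A)P(B) = {product_ab:.4f}")
print(f"P(both female) = {p_both_female:.4f}")
```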

Making decisions based on probability
We can use probability information to make a decision.
Example 6.8
(1) When we go to a casino, we may decide to play a game or not based on P(we win the
game).
(2) We may decide to invest in a stock or not based on P(we earn money from the
investment).
(3) We may assess the risk of a nuclear power plant by P(reactor core melts).