Chapter 14: From Randomness to Probability

Unit 4
CHAPTER 17: PROBABILITY MODELS
AP Statistics
BERNOULLI TRIALS
 The basis for the probability models we will examine
in this chapter is the Bernoulli (Ber-Noo-Lee) trial.
 We have Bernoulli trials if:
 there are two possible outcomes (success and
failure).
 the probability of success, p, is constant.
 the trials are independent.
THE GEOMETRIC MODEL
 A single Bernoulli trial is usually not all that interesting.
 A Geometric probability model tells us the probability for
a random variable that counts the number of Bernoulli
trials until the first success.
 Geometric models are completely specified by one
parameter, p, the probability of success, and are denoted
Geom(p).
THE GEOMETRIC SETTING
 Each observation is in one of two categories: success
or failure.
 The probability is the same for each observation.
 Observations are independent. (Knowing the result of
one observation tells you nothing about the other
observations.)
 The variable of interest is the number of trials
required to obtain the first success.
THE GEOMETRIC MODEL EXAMPLE
 A new sales gimmick is to sell bags of candy that
have 30% of M&M’s covered with speckles. These
“groovy” candies are mixed randomly with the normal
candies as they are put into the bags for distribution
and sale. You buy a bag and remove candies one at a
time looking for the speckles.
 Is a geometric probability model appropriate here?
THE GEOMETRIC MODEL EXAMPLE
(CONT.)
 A new sales gimmick is to sell bags of candy that
have 30% of M&M’s covered with speckles. These
“groovy” candies are mixed randomly with the normal
candies as they are put into the bags for distribution
and sale. You buy a bag and remove candies one at a
time looking for the speckles.
 What’s the probability that the first speckled one we see is the
fourth candy we get? Note that the skills to answer this
question come from the very first day of the probability unit.
THE GEOMETRIC MODEL EXAMPLE
(CONT.)
 What’s the probability that the first speckled one is
the tenth one? Write a general formula.
 What’s the probability that the first speckled candy
is one of the first three we look at?
 How many do we expect to have to check, on
average, to find a speckled one?
THE GEOMETRIC MODEL FORMULAS
Geometric probability model for Bernoulli trials: Geom(p)
p = probability of success
q = 1 – p = probability of failure
X = number of trials until the first success occurs
$P(X = x) = q^{x-1} p$
$E(X) = \mu = \frac{1}{p}$
$\sigma = \sqrt{\frac{q}{p^2}}$
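The candy questions above can be checked numerically with the Geom(p) formulas. The following is a minimal Python sketch, assuming p = 0.30 from the example; the function names geom_pmf and geom_cdf are illustrative, not part of any calculator or library.

```python
def geom_pmf(p, x):
    """P(X = x): the first success occurs on trial x, for Geom(p)."""
    q = 1 - p
    return q ** (x - 1) * p


def geom_cdf(p, x):
    """P(X <= x): the first success occurs on or before trial x."""
    return 1 - (1 - p) ** x


p = 0.30  # probability a candy is speckled (taken from the example)

p_fourth = geom_pmf(p, 4)        # first speckled candy is the 4th one
p_tenth = geom_pmf(p, 10)        # first speckled candy is the 10th one
p_first_three = geom_cdf(p, 3)   # first speckled candy is among the first 3
expected_checks = 1 / p          # E(X) = 1/p

print(round(p_fourth, 4), round(p_tenth, 4),
      round(p_first_three, 3), round(expected_checks, 2))
```

Under these assumptions the probabilities come out to about 0.1029, 0.0121, and 0.657, and we expect to check about 3.33 candies on average.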
THE GEOMETRIC MODEL EXAMPLE 2
Postini is a global company specializing in communications
security. The company monitors over 1 billion Internet messages
per day and recently reported that 91% of emails are spam.
Let’s assume that your emails are typical (91% spam). We’ll also
assume that you aren’t using a spam filter, so every message
goes to your inbox. And, since spam comes from many different
sources, we’ll consider your messages to be independent.
 Overnight your inbox collects email. When you first check your email the
next day, about how many spam emails should you expect to have to wade
through and discard before you find a real message? What’s the
probability that the 4th message in your inbox is the first one that isn’t
spam?
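One way to set this up, sketched under the assumption that a "success" is a real (non-spam) message, so p = 0.09 and q = 0.91:

```latex
\begin{align*}
p &= 0.09, \qquad q = 1 - p = 0.91 \\
E(X) &= \frac{1}{p} = \frac{1}{0.09} \approx 11.1 \\
P(X = 4) &= q^{3}\, p = (0.91)^{3}(0.09) \approx 0.0678
\end{align*}
```

Here X counts the messages checked up to and including the first real one, so on average about 11 messages are examined, roughly 10 of them spam.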
INDEPENDENCE
 One of the important requirements for Bernoulli
trials is that the trials be independent.
 When we sample without replacement from a finite population, the trials
are not strictly independent. But there is a rule that allows
us to pretend we have independent trials:
 The 10% condition: Bernoulli trials must be
independent. If that assumption is violated, it is still
okay to proceed as long as the sample is smaller
than 10% of the population.
THE GEOMETRIC MODEL EXAMPLE 3
 People with O-negative blood are “universal donors.”
Only about 6% of people have O-negative blood.
1.
If donors line up at random for a blood drive, how many do you
expect to examine before you find someone who has O-negative
blood?
2.
What’s the probability that the first O-negative donor found is
one of the first four people in line?
GEOMETRIC PROBABILITIES USING
CALCULATOR
 2nd → DISTR → geometpdf(
 Note the pdf for Probability Density Function
 Used to find the probability of any individual outcome
 Format: geometpdf(p,x)
 2nd → DISTR → geometcdf(
 Note the cdf for Cumulative Distribution Function
 Used to find the probability of the first success on or before the xth trial
 Format: geometcdf(p,x)
 Try the last example using the calculator! Much easier…
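For readers without the calculator at hand, the two commands can be imitated in Python. This is only a sketch: the functions below are hypothetical stand-ins that borrow the calculator's command names and argument order, applied to the O-negative example with p = 0.06.

```python
def geometpdf(p, x):
    """Hypothetical stand-in for the calculator command: P(X = x)."""
    return (1 - p) ** (x - 1) * p


def geometcdf(p, x):
    """Hypothetical stand-in for the calculator command: P(X <= x),
    found by adding the individual probabilities for trials 1..x."""
    return sum(geometpdf(p, k) for k in range(1, x + 1))


p = 0.06  # probability a donor is O-negative (from the example)

expected_donors = 1 / p            # question 1: E(X) = 1/p
p_within_four = geometcdf(p, 4)    # question 2: first success by the 4th donor

print(round(expected_donors, 2), round(p_within_four, 4))
```

With these inputs the expected number of donors examined is about 16.7, and the probability of finding an O-negative donor among the first four is about 0.2193.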
PERMUTATIONS VS. COMBINATIONS
 Permutations: the number of ordered arrangements when r items are
selected from n available items (without replacement).
 Here, the order matters.
$_nP_r = \frac{n!}{(n - r)!}$
 Calculate the following permutations:
$_{10}P_3$, $_7P_2$, $_{15}P_3$, $_{24}P_3$
PERMUTATIONS VS. COMBINATIONS
(CONT.)
 Example:
 Forty-three sprinters race in a 5K. How many ways
can they finish first, second, and third?
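The sprinter example is a permutation count, since first, second, and third place are distinct. A short Python sketch applying the factorial formula (the helper name permutations_count is illustrative; math.perm requires Python 3.8 or later):

```python
import math


def permutations_count(n, r):
    """Number of ordered selections of r items from n: n! / (n - r)!"""
    return math.factorial(n) // math.factorial(n - r)


# 43 sprinters, 3 ordered finishing places
podium_orders = permutations_count(43, 3)

# The standard library gives the same count directly
assert podium_orders == math.perm(43, 3)

print(podium_orders)
```

This is the same as multiplying 43 × 42 × 41, which gives 74,046 possible top-three finishes.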
PERMUTATIONS VS. COMBINATIONS
(CONT.)
 You are picking 3 different flavors to put on your banana
split. You can choose from 25 different flavors. How
many ways can this be done?
 Does the order matter here?
PERMUTATIONS VS. COMBINATIONS
(CONT.)
 Combination Rule:
 When order does not matter, this counts the number of
ways (combinations) r items can be selected from n different items.
$_nC_r = \frac{n!}{(n - r)!\, r!}$
 RECAP: When different orderings of the same items are counted
separately, we have a permutation problem, but when different
orderings of the same items are not counted separately, we have a
combination problem.
 Calculate the following combinations:
$_{16}C_4$, $_{12}C_2$, $_{10}C_3$, $_{25}C_5$
PERMUTATIONS VS. COMBINATIONS
(CONT.)
 Example: You are picking 3 different flavors to put on
your banana split. You can choose from 25 different
flavors. How many ways can this be done?
 Example: You want to buy three different CDs from a
selection of 5 CDs. How many ways can you make
your selection?
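Both examples ignore order, so they are combination counts. A minimal Python sketch using the combination rule (the helper name combinations_count is illustrative; math.comb and math.perm require Python 3.8 or later):

```python
import math


def combinations_count(n, r):
    """Number of unordered selections of r items from n: n! / ((n - r)! r!)"""
    return math.factorial(n) // (math.factorial(n - r) * math.factorial(r))


banana_split_choices = combinations_count(25, 3)  # 3 flavors out of 25
cd_choices = combinations_count(5, 3)             # 3 CDs out of 5

# Each unordered choice of 3 items corresponds to 3! = 6 orderings
assert math.perm(25, 3) == banana_split_choices * math.factorial(3)

print(banana_split_choices, cd_choices)
```

There are 2,300 ways to choose the flavors and 10 ways to choose the CDs. The check inside the block shows why the combination count is the permutation count divided by r!.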
THE BINOMIAL MODEL
Day 2
 The Geometric model counts the number of trials until
the first success.
 A Binomial model tells us the probability for a random
variable that counts the number of successes in a fixed
number of Bernoulli trials.
 Two parameters define the Binomial model: n, the
number of trials; and, p, the probability of success. We
denote this Binom(n, p).
THE BINOMIAL MODEL (CONT.)
 In n trials, there are
$_nC_k = \frac{n!}{k!\,(n - k)!}$
ways to have k successes.
 Read $_nC_k$ as “n choose k.”
 Note: n! = n × (n – 1) × … × 2 × 1, and we’re not overly
excited about n; n! is read as “n factorial.”
THE BINOMIAL MODEL (CONT.)
Binomial probability model for Bernoulli trials: Binom(n,p)
n = number of trials
p = probability of success
q = 1 – p = probability of failure
X = # of successes in n trials
$P(X = x) = {}_nC_x \, p^x q^{n-x}$
$\mu = np$
$\sigma = \sqrt{npq}$
BINOMIAL MODEL EXAMPLE
 Recap: The communications monitoring company has reported
that 91% of email messages are spam. Suppose your inbox
contains 25 messages.
 What are the mean and standard deviation of the number of
real messages you should expect to find in your inbox?
 What is the probability that you will find only 1 or 2 real
messages?
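These two questions can be worked directly from the Binom(n, p) formulas. A Python sketch, assuming a "success" is a real message so that n = 25 and p = 0.09; the function name binom_pmf is illustrative:

```python
import math


def binom_pmf(n, p, x):
    """P(X = x) = nCx * p^x * q^(n - x) for Binom(n, p)."""
    q = 1 - p
    return math.comb(n, x) * p ** x * q ** (n - x)


n = 25    # messages in the inbox
p = 0.09  # probability a message is real (1 - 0.91)
q = 1 - p

mean_real = n * p                 # mu = np
sd_real = math.sqrt(n * p * q)    # sigma = sqrt(npq)

# "Only 1 or 2 real messages" means X = 1 or X = 2
p_one_or_two = binom_pmf(n, p, 1) + binom_pmf(n, p, 2)

print(round(mean_real, 2), round(sd_real, 2), round(p_one_or_two, 4))
```

Under these assumptions the mean is 2.25 real messages with a standard deviation of about 1.43, and the probability of finding only 1 or 2 real messages is about 0.51.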
BINOMIAL PROBABILITY ON CALCULATOR
 2nd → DISTR → binompdf(
 Note the pdf for Probability Density Function
 Used to find the probability of any individual outcome
 Format: binompdf(n,p,x)
 2nd → DISTR → binomcdf(
 Note the cdf for Cumulative Distribution Function
 Used for getting x or fewer successes among n trials
 Format: binomcdf(n,p,x)
 Note: if you want the probability of at least some number of
successes, use the complement rule. All possible probabilities
in the model will add up to 1.
BINOMIAL MODEL EXAMPLE 2
20 donors come to a blood drive. Recall that 6% of people are
“universal donors.”
 What are the mean and standard deviation of the number of
universal donors among them?
 What is the probability that there are 2 or 3 universal donors?
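A "2 or 3" question can be answered by subtracting two cumulative probabilities. The sketch below imitates the calculator commands in Python; binompdf and binomcdf here are hypothetical stand-ins that borrow the calculator's names and argument order, with n = 20 and p = 0.06 from the example.

```python
import math


def binompdf(n, p, x):
    """Hypothetical stand-in for the calculator command: P(X = x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomcdf(n, p, x):
    """Hypothetical stand-in for the calculator command: P(X <= x)."""
    return sum(binompdf(n, p, k) for k in range(0, x + 1))


n = 20    # donors at the blood drive
p = 0.06  # probability a donor is a universal donor

mean_donors = n * p
sd_donors = math.sqrt(n * p * (1 - p))

# P(2 <= X <= 3) = P(X <= 3) - P(X <= 1)
p_two_or_three = binomcdf(n, p, 3) - binomcdf(n, p, 1)

# Complement rule: P(at least 1) = 1 - P(X <= 0)
p_at_least_one = 1 - binomcdf(n, p, 0)

print(round(mean_donors, 2), round(sd_donors, 2),
      round(p_two_or_three, 4), round(p_at_least_one, 4))
```

With these inputs the mean is 1.2 universal donors, the standard deviation is about 1.06, and the probability of 2 or 3 universal donors is about 0.31.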
THE NORMAL MODEL TO THE RESCUE!
 When dealing with a large number of trials in a
Binomial situation, making direct calculations of the
probabilities becomes tedious (or outright
impossible).
 Fortunately, the Normal model comes to the rescue…
THE NORMAL MODEL TO THE RESCUE!
(CONT.)
 As long as the Success/Failure Condition holds, we
can use the Normal model to approximate Binomial
probabilities.
 Success/failure condition: A Binomial model is
approximately Normal if we expect at least 10
successes and 10 failures:
np ≥ 10 and nq ≥ 10
NORMAL MODEL EXAMPLE
Recall the communications monitoring company Postini has
reported that 91% of email messages are spam. Recently, you
installed a spam filter. You observe that over the past week it
okayed only 151 of 1422 emails you received, classifying the
rest as junk. Should you worry the filtering is too aggressive?
 What’s the probability that no more than 151 of 1422 emails
are real messages?
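A sketch of the Normal approximation for this question in Python, assuming a "success" is a real message (p = 0.09) and using no continuity correction, since the slides do not mention one. The helper normal_cdf is an illustrative name built on the standard library's math.erf.

```python
import math


def normal_cdf(x, mu, sigma):
    """P(Y <= x) for a Normal(mu, sigma) model, via the error function."""
    z = (x - mu) / sigma
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


n = 1422  # emails received
p = 0.09  # probability an email is real (1 - 0.91)
q = 1 - p

# Success/Failure Condition: np >= 10 and nq >= 10
condition_ok = n * p >= 10 and n * q >= 10

mu = n * p                    # mean of the Binomial model
sigma = math.sqrt(n * p * q)  # standard deviation of the Binomial model

# Approximate P(X <= 151) with the Normal model (assumption: no continuity correction)
p_at_most_151 = normal_cdf(151, mu, sigma)

print(condition_ok, round(mu, 2), round(sigma, 2), round(p_at_most_151, 3))
```

Here the condition holds, the mean is 127.98 with a standard deviation of about 10.79, and 151 is a little over 2 standard deviations above the mean, so the approximate probability is about 0.98.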
CONTINUOUS RANDOM VARIABLES
 When we use the Normal model to approximate the
Binomial model, we are using a continuous random
variable to approximate a discrete random variable.
 So, when we use the Normal model, we no longer
calculate the probability that the random variable
equals a particular value, but only that it lies
between two values.
WHAT CAN GO WRONG?
 Be sure you have Bernoulli trials.
 You need two outcomes per trial, a constant probability
of success, and independence.
 Remember that the 10% Condition provides a
reasonable substitute for independence.
 Don’t confuse Geometric and Binomial models.
 Don’t use the Normal approximation with small n.
 You need at least 10 successes and 10 failures to use
the Normal approximation.
RECAP
 Bernoulli trials show up in lots of places.
 Depending on the random variable of interest, we
might be dealing with a
 Geometric model
 Binomial model
 Normal model
RECAP (CONT.)
 Geometric model
 When we’re interested in the number of Bernoulli
trials until the next success.
 Binomial model
 When we’re interested in the number of
successes in a certain number of Bernoulli trials.
 Normal model
 To approximate a Binomial model when we
expect at least 10 successes and 10 failures.
ASSIGNMENTS: PP. 401 – 404
 Day 1: # 1, 3, 9 – 15 ODD
 Day 2: # 2, 5, 10, 12, 17, 19, 21, 29, 32
 Day 3: # 14 – 22 EVEN, 23, 25, 27, 37