From Randomness to Probability - math-b

From Randomness to
Probability
Randomness
• We have learned to
understand randomness.
• The surprising fact is that
in the long run even
truly random
phenomena settle down
in a way that is
consistent and
predictable.
Dealing with Random Phenomena
• Every day you drive
through the intersection at
College and Main
• It seems like the light is
never green when you get
there.
• But this can’t really be
true.
• In fact, if you try really
hard, you can recall just
sailing through the green
light once in a while.
Random Phenomena
• What is random here?
• The light itself is governed
by a timer
• The light may be red at
precisely the same times
each day.
• It is the pattern of your
driving that is random.
• At the precision level of the 30
seconds or so that the light
spends being red or green,
the time you arrive at the
light is random.
• Even if you try to leave
your house at exactly the
same time every day,
whether the light is red or
green as you reach the
intersection is a random
phenomenon.
Random Phenomena
• Is the color of the light
completely
unpredictable?
• You expect some kind of
regularity in the long-run
experience.
• Some fraction of the
time the light will be
green as you get to the
intersection.
• How can you figure out
what that fraction is?
Random Phenomena
• You might record what happens at the intersection
each day and graph the accumulated percentage of
green lights like this:
Random Phenomena
• The first day you
recorded the light, it was
green. Then on the next
five days: red, then
green again, then green,
red, and red.
• If you plot the
percentage of green
lights against days, the
graph would start at
100%
• The next day it was red
so the accumulated
percentage of green
dropped to 50% (1 out of
2)
• The third day it was
green again (2 out of 3
or 67%)
• Then green again (3 out
of 4 or 75%)
Random Phenomena
Day   Light   % Green
1     Green   100
2     Red     50
3     Green   66.7
4     Green   75
5     Red     60
6     Red     50
…     …       …
• As you collect a new data
value for each day, each
new outcome becomes a
smaller and smaller
fraction of the
accumulated experience
• In the long run, the graph
settles down.
• As it settles down you can
see that, in fact, the light is
green about 35% of the
time.
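The accumulated percentage for the six recorded days can be reproduced with a short Python sketch. The function name `running_percent_green` is made up for this illustration; the observations are the ones listed on the slide.

```python
# Running (accumulated) percentage of green lights, day by day.
# The six observations are the ones recorded on the slide.
observations = ["green", "red", "green", "green", "red", "red"]

def running_percent_green(lights):
    """Return the accumulated % of green lights after each day."""
    percents = []
    greens = 0
    for day, light in enumerate(lights, start=1):
        if light == "green":
            greens += 1
        # greens so far divided by days so far, as a percentage
        percents.append(round(100 * greens / day, 1))
    return percents

print(running_percent_green(observations))
# [100.0, 50.0, 66.7, 75.0, 60.0, 50.0]
```

Each new day changes the percentage by less than the day before it could, which is why the graph settles down.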
Random Phenomena
• Let us define some terms:
• Each occasion upon which we observe a random phenomenon is
called a trial
• At each trial, we note the value of the random phenomenon, and
call that the trial’s outcome.
Random Phenomena
• For the traffic light, there
are really three possible
outcomes: red, yellow,
green
• Often we are interested in
a combination of outcomes
rather than in the
individual ones
• When you see a light turn
yellow, what do you do?
• Do you gun it and try to
catch it before it turns red?
• Or do you slow down and
wait to safely catch the
next green?
• No matter what you
choose you may want to
group the yellow with one
or the other.
Random Phenomena
• Whether you treat the
yellow light more like a red
or a green, you may want
to combine outcomes.
• Such a combination is
called an event
• We sometimes talk about
the collection of all
possible outcomes and call
that event the sample
space
Random Phenomena - Terms
• Observing a random
phenomenon is called a
trial
• At each trial we note the
value of the random
phenomenon and call
that the trial’s outcome
• We combine outcomes
to create an event
• The collection of all
possible outcomes is
called the sample space.
• We denote the sample
space S
• For our example, S =
{red, green, yellow}
The Law of Large Numbers
• What is the probability
of a green light at the
intersection?
• It looks like the relative
frequency settles down
to about 35% so saying
that the probability is 0.35
seems a reasonable
answer.
• But do all phenomena
behave well enough for
this to make sense?
• Perhaps the relative
frequency of an event
can bounce back and
forth between two
values forever, never
settling on just one
number…
The Law of Large Numbers
• Fortunately, a principle called
the Law of Large Numbers
(LLN) gives us the guarantee we
need.
• It simplifies things if we assume
that the events are
independent: the outcome of
one trial doesn’t affect the
outcomes of the others.
• The LLN says that as the
number of independent trials
increases the long run relative
frequency of repeated events
gets closer and closer to a
single value.
• The LLN wasn’t proven until the
18th century
• “For even the most stupid of
men…is convinced that the
more observations have been
made, the less danger there is
of wandering from one’s goal.”
- Jacob Bernoulli, 1713,
discoverer of the LLN
The Law of Large Numbers
• Because the LLN
guarantees that relative
frequencies settle down
in the long run, we can
now officially give a
name to the value that
they approach.
• We call it the probability
of an event.
• If the relative frequency
of green lights at the
intersection settles down
to 35% in the long run,
we say that the
probability of
encountering a green
light is 0.35 and we
write:
• P(green) = 0.35
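The settling-down behavior can be illustrated with a small simulation. This is only a sketch under an assumed model: each arrival at the light is treated as an independent trial that is green with probability 0.35. The function name `relative_frequency` and the seed are choices made for this example.

```python
import random

def relative_frequency(p, n_trials, seed=0):
    """Simulate n_trials independent trials, each a success with
    probability p, and return the fraction of successes."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if rng.random() < p)
    return hits / n_trials

# Assumed model: green with probability 0.35 on every arrival.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(0.35, n))
```

With few trials the relative frequency can be far from 0.35; with many trials it should land close to it.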
Probability
• The value that the relative
frequency of an event
approaches in the long run
is called the probability
• Because the definition is
based on repeatedly
observing the event’s
outcome, this definition
of probability is often
called empirical
probability.
• For an event A,
$P(A) = \frac{\text{\# times } A \text{ occurs}}{\text{total \# trials}}$
in the long run.
Non-Existent Law of Averages!
• The LLN says nothing
about short term
behavior
• The gambler’s fallacy,
that a certain outcome
which hasn’t happened in a
long time is now
“due,” is completely
wrong!
• The LLN speaks only
about long-term
behavior.
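A quick simulation makes the point about short-term behavior concrete. The sketch below assumes a fair coin with independent flips and looks only at flips that come right after five tails in a row. If heads were "due," its rate there would be well above one half. The function name and the streak length are made up for this example.

```python
import random

def heads_rate_after_tail_streak(n_flips, streak=5, seed=1):
    """Fraction of heads among flips that immediately follow
    at least `streak` tails in a row (fair, independent coin assumed)."""
    rng = random.Random(seed)
    tails_in_a_row = 0
    followers = 0        # flips that follow a long tail streak
    heads_followers = 0  # how many of those flips came up heads
    for _ in range(n_flips):
        is_heads = rng.random() < 0.5
        if tails_in_a_row >= streak:
            followers += 1
            heads_followers += is_heads
        tails_in_a_row = 0 if is_heads else tails_in_a_row + 1
    if followers == 0:
        raise ValueError("no tail streak of that length occurred")
    return heads_followers / followers

print(heads_rate_after_tail_streak(400_000))
```

The printed rate should stay near 0.5: the coin has no memory of the streak.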
Modeling Probability
• It is easy to find
probabilities for events
that are made up of several
equally likely outcomes.
• We just count all the
outcomes that the event
contains. The probability of
the event is the number of
outcomes in the event
divided by the total
number of possible
outcomes.
• $P(A) = \frac{\text{\# outcomes in } A}{\text{\# of possible outcomes}}$
• For example, the probability of
drawing a face card (Jack, Queen,
King) from a deck of cards is
$P(\text{face card}) = \frac{\text{\# face cards}}{\text{\# cards}} = \frac{12}{52} = \frac{3}{13}$
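The face-card count can be checked by listing the deck and counting. This sketch assumes a standard 52-card deck; the rank and suit labels are just illustrative strings.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]

# Sample space: every (rank, suit) pair, all equally likely.
deck = list(product(ranks, suits))

# Event: the card drawn is a Jack, Queen, or King.
face_cards = [card for card in deck if card[0] in {"J", "Q", "K"}]

p_face = Fraction(len(face_cards), len(deck))
print(len(face_cards), len(deck), p_face)  # 12 52 3/13
```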
Modeling Probability
• Finding the probability
of an event when the
outcomes are equally
likely is straightforward,
but it is not necessarily
easy.
• It gets hard when the
number of outcomes in
the event (and the
sample space) gets big.
Modeling Probability
• Think about flipping two
coins.
• The sample space is:
S = {HH, HT, TH, TT}
• Each outcome is equally
likely so what is the
probability of getting
exactly one head and
one tail?
• Let’s call that event A
• There are two outcomes
in the event A = {HT, TH}
• This is out of four
equally likely outcomes
in S
• So $P(A) = \frac{2}{4} = \frac{1}{2}$
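For a sample space this small, the counting can be done by brute force. The sketch below enumerates the two-coin outcomes and counts those with exactly one head; the variable names are chosen for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space for two coin flips: HH, HT, TH, TT.
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

# Event A: exactly one head (and therefore exactly one tail).
event_a = [outcome for outcome in sample_space if outcome.count("H") == 1]

p_a = Fraction(len(event_a), len(sample_space))
print(sample_space, event_a, p_a)
# ['HH', 'HT', 'TH', 'TT'] ['HT', 'TH'] 1/2
```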
Modeling Probability
• Now flip 100 coins.
• What is the chance of
getting exactly 67 heads?
• There are
1,267,650,600,228,229,401,496,703,205,376
different outcomes
possible when flipping
100 coins.
• No way this is going to
be easy!
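Listing that many outcomes is out of the question, but a computer can still count them. The sketch below uses a counting shortcut that these slides have not introduced: the number of ways to choose which 67 of the 100 coins land heads, computed with Python's `math.comb`. Treat it as a preview rather than part of this lesson.

```python
from math import comb

# Each of the 100 coins has 2 equally likely results.
n_outcomes = 2 ** 100
print(f"{n_outcomes:,}")

# Outcomes in the event "exactly 67 heads": choose which 67 coins are heads.
ways_67_heads = comb(100, 67)

# Equally likely outcomes, so divide the two counts.
p_67_heads = ways_67_heads / n_outcomes
print(p_67_heads)
```

The result is a very small probability, which matches the intuition that 67 heads in 100 fair flips would be surprising.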
The First Three Rules of Working With
Probability
• 1) Make a Picture
• 2) Make a Picture
• 3) Make a Picture
• The most common type
of picture is called a
Venn diagram. We will
use these a lot. Even
experienced statisticians
make Venn diagrams to
help them think about
probabilities of
compound or
overlapping events.
Formal Probability
• We need to be precise!
Using “50/50” in
conversation can mean
“I don’t know” or
“whatever”
• We will use it to mean
equally likely.
Formal Probability
• 1) If the probability is 0,
the event can’t occur
• If the probability is 1, the
event always occurs.
• A probability is a number
between 0 and 1.
For any event A,
$0 \le P(A) \le 1$
Formal Probability
• 2) If a random
phenomenon has only
one possible outcome, it
is not very interesting or
random.
• We need to distribute
the probabilities among
all the outcomes a trial
can have.
• Probability Assignment
Rule:
The set of all possible
outcomes of a trial must
have a probability 1
P(S) = 1
Formal Probability
• 3) Suppose the probability
you get to class on time is
0.8
• What is the probability that
you do not get to class on
time? Yes, it is 0.2
• Complement Rule:
The probability of an event
occurring is 1 minus the
probability that it does not
occur:
$P(A) = 1 - P(A^c)$
• The set of outcomes that
are not in the event A is
called the complement of
A and is denoted $A^c$
For Example
• Our traffic light is green
35% of the time.
• If P(green) = 0.35, what is
the probability that the
light is not green when you
get to the intersection?
• P(not green) = 1 – P(green)
= 1 – 0.35 = 0.65
• There is a 65% chance that
I won’t have a green light.
Formal Probability
• 4) Suppose the
probability that (A) a
randomly selected
student is a sophomore
is 0.20
• The probability that (B)
he or she is a junior is
0.30
• What is the probability
that the student is either
a sophomore or a junior,
written $P(A \cup B)$?
• The Addition Rule says
that you can add
probabilities of events
that are disjoint.
• To see whether events
are disjoint we take
them apart into their
component outcomes
and check whether they
have any outcomes in
common.
Formal Probability
• 4 Continued:
• Disjoint (or mutually
exclusive) events have
no outcomes in
common. The Addition
Rule states:
• For two disjoint events A and
B, the probability that one or
the other occurs is the sum
of the probabilities of the
two events.
• $P(A \cup B) = P(A) + P(B)$,
provided A and B are
disjoint
For Example
• P(green) = 0.35
• Suppose we find out that
P(yellow) is about 0.04
• $P(\text{green} \cup \text{yellow}) = 0.35 + 0.04 = 0.39$
• What is the probability
that the light is red?
• Red is the only remaining
alternative and the probabilities
must add up to 1 so:
$P(\text{red}) = P(\text{not}\,(\text{green} \cup \text{yellow}))$
$= 1 - P(\text{green} \cup \text{yellow})$
$= 1 - 0.39 = 0.61$
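The traffic-light arithmetic can be written out as a short Python sketch. It uses the Addition Rule for the disjoint events green and yellow, then the Complement Rule for red. The dictionary layout and variable names are choices made for this example.

```python
# Probabilities given on the slides.
p = {"green": 0.35, "yellow": 0.04}

# Addition Rule: green and yellow are disjoint (the light shows one color).
p_green_or_yellow = p["green"] + p["yellow"]

# Complement Rule: red is everything that is not green or yellow.
p_red = 1 - p_green_or_yellow

# Complement Rule again: not green.
p_not_green = 1 - p["green"]

print(round(p_green_or_yellow, 2), round(p_red, 2), round(p_not_green, 2))
# 0.39 0.61 0.65
```

The values are rounded for printing only because decimal fractions are stored approximately in floating point.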
Formal Probability
• Be careful! The Addition
Rule does not work for
events that are not
disjoint.
• If the probability of
owning an MP3 player is
0.50 and the probability
of owning a computer is
0.90, the probability of
owning either one is not 1.40!
• The Addition Rule does
not work here. You cannot
add these probabilities
because the events are
not disjoint: you can own
both!
Formal Probability
• 5) Suppose your job
requires you to fly from
Atlanta to Houston every
Monday morning. The
airline’s website reports
that this flight is on time
85% of the time.
• What is the chance that
it will be on time for two
weeks in a row?
• That is, what is the
probability that it is on
time this week and on
time next week?
• For independent events
the answer is very
simple.
Formal Probability
• For two independent
events A and B, the
probability that both A and
B occur is the product of
the probabilities of the two
events.
• $P(A \cap B) = P(A) \times P(B)$,
provided that A and B are
independent
• This rule can be
extended to more than
two independent events.
What is the chance of
your flight being on time
for a month – four
Mondays in a row?
• We multiply the
probabilities of it
happening each week:
0.85 × 0.85 × 0.85 × 0.85 ≈ 0.522
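The flight example, including the two-weeks-in-a-row question, can be sketched in Python. The code assumes, as the slides do, that each Monday's flight is independent of the others; `p_all_on_time` is a name invented for this illustration.

```python
from math import prod

P_ON_TIME = 0.85  # on-time rate reported by the airline's website

def p_all_on_time(n_weeks):
    """Multiplication Rule for independent events:
    multiply the on-time probability once per week."""
    return prod([P_ON_TIME] * n_weeks)

print(round(p_all_on_time(2), 4))  # 0.7225
print(round(p_all_on_time(4), 3))  # 0.522
```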
Formal Probability Example
• We have determined
that the probability that
we encounter a green
light at the intersection
is 0.35, a yellow light is
0.04, and a red light is
0.61
• Let us think about your
morning commute in the
week ahead.
Formal Probability Example
• P(green) = 0.35
P(yellow)= 0.04
P(red) = 0.61
• What is the probability
that you find the light
red on both Monday and
Tuesday?
Answer: Because the color
of the light you see on
Monday is independent of
the color you see on
Tuesday we can use the
Multiplication Rule:
$P(\text{red Mon.} \cap \text{red Tues.})$
$= P(\text{red}) \times P(\text{red})$
$= 0.61 \times 0.61$
$= 0.3721$
There is about a 37%
chance it will be red on
both Monday and Tuesday.
Formal Probability Example
• What is the probability
that you do not
encounter a red light
until Wednesday?
• For that to happen you
would have to see
Green or Yellow on
Monday and Tuesday
and then a red on
Wednesday.
• Simplify by thinking of it as not red on
Monday and Tuesday then red on
Wednesday:
$P(\text{not red}) = 1 - P(\text{red}) = 1 - 0.61 = 0.39$
Formal Probability Example
$P(\text{not red Mon.} \cap \text{not red Tues.} \cap \text{red Wed.})$
$= P(\text{not red}) \times P(\text{not red}) \times P(\text{red})$
$= (0.39)(0.39)(0.61)$
$= 0.092781$
• There is about a 9% chance that this week I will hit my
first red light on a Wednesday morning.
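The same pattern works for any "first time on day k" question: k − 1 days without the event, then one day with it. A minimal sketch, assuming the days are independent; `p_first_on_day` is a hypothetical helper name.

```python
def p_first_on_day(k, p):
    """Probability that an event with per-day chance p happens
    for the first time on day k, assuming independent days."""
    return (1 - p) ** (k - 1) * p

P_RED = 0.61

# First red light on Wednesday = day 3 of the week.
print(round(p_first_on_day(3, P_RED), 6))  # 0.092781
```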
Formal Probability Example
• What is the probability that
you will have to stop at
least once during the
week?
• This means I will have to
stop for the light either 1,
2, 3, 4, or 5 times next
week.
• It is easier to think about
the complement: never
having to stop at a red
light.
Formal Probability Example
• $P(\text{having to stop at the light at least once in five days})$
$= 1 - P(\text{no red light for 5 days in a row})$
• $= 1 - P(\text{not red} \cap \text{not red} \cap \text{not red} \cap \text{not red} \cap \text{not red})$
• $= 1 - (0.39)(0.39)(0.39)(0.39)(0.39)$
• $= 1 - 0.0090$
• $= 0.991$
• There is about a 99% chance that I will hit at least one red light
some time this week.
• Hint: the phrase “at least” is often a tip-off to think about the
complement.
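The "at least once" shortcut is easy to express in code. This sketch assumes independent days, as in the example above; the function name `p_at_least_once` is made up for illustration.

```python
def p_at_least_once(p, n):
    """Complement Rule plus Multiplication Rule:
    1 minus the chance the event never happens in n independent trials."""
    return 1 - (1 - p) ** n

P_RED = 0.61

# At least one red light in a five-day week.
print(round(p_at_least_once(P_RED, 5), 3))  # 0.991
```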
Just Checking
• Opinion polling
organizations contact
respondents by
telephone.
• Random telephone
numbers are generated
and interviewers try to
contact those
households.
• In the 1990s this
method could reach
about 69% of US
households
• According to the Pew
Research Center for the
People and the Press, by
2003 the contact rate
had risen to about 76%.
We can reasonably
assume each household’s
response to be
independent of the
others. What is the
probability that…
Just Checking
• P(contact in ‘90) = 0.69
P(contact in ‘03) = 0.76
• A) The interviewer
successfully contacts the
next household on her
list?
• Answer: 0.76
• B) The interviewer successfully
contacts both of the next two
households on her list?
• Answer:
$P(\text{contact next two})$
$= P(\text{contact next}) \times P(\text{contact next})$
$= 0.76 \times 0.76 = 0.5776$
Just Checking
• The interviewer’s first
successful contact is the
third household on the
list?
• For this one, it is the
probability of not
contacting the first two
then contacting the
third.
• P(not contacting) = 1 –
P(contacting)
= 1 – 0.76
= 0.24
• $P(\text{not} \cap \text{not} \cap \text{contact}) = P(\text{not}) \times P(\text{not}) \times P(\text{contact})$
• $= 0.24 \times 0.24 \times 0.76 = 0.043776$
• There is about a 4.4% chance
of first contacting someone
on the third call.
Just Checking
• What is the probability
that the interviewer
makes at least one
successful contact
among the next five
households on the list?
• There is that “at least”
again – time to start
thinking about the
complement.
• This is the probability of
the caller making a
successful call either on
the 1st, 2nd, 3rd, 4th, or 5th
call.
• This could be thought of
as the probability of not
making 5 unsuccessful
calls.
Just Checking
• P(at least one successful call out of 5) = 1 – P(5 unsuccessful calls)
• $= 1 - (1 - 0.76)(1 - 0.76)(1 - 0.76)(1 - 0.76)(1 - 0.76)$
• $= 1 - (1 - 0.76)^5$
• $= 0.9992$
• There is a 99.9% chance that the caller will make at least one successful
call in the first five attempts.
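The polling answers can also be checked empirically, in the spirit of the Law of Large Numbers. The sketch below simulates many lists of five calls, assuming each call succeeds independently with probability 0.76, and reports how often each event happens. The function name, run count, and seed are choices made for this example.

```python
import random

P_CONTACT = 0.76  # 2003 contact rate from the slides

def simulate_calls(n_runs=200_000, seed=2):
    """Return the simulated relative frequencies of
    (first contact on the third call, at least one contact in five)."""
    rng = random.Random(seed)
    first_on_third = 0
    at_least_one = 0
    for _ in range(n_runs):
        calls = [rng.random() < P_CONTACT for _ in range(5)]
        if not calls[0] and not calls[1] and calls[2]:
            first_on_third += 1
        if any(calls):
            at_least_one += 1
    return first_on_third / n_runs, at_least_one / n_runs

print(simulate_calls())
```

The two relative frequencies should come out close to the computed values 0.043776 and 0.9992.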
Another Step-By-Step Example
• In 2001 the maker of
M&Ms decided to add
another color to the
standard lineup of
brown, yellow, red,
orange, blue, and green.
To decide which color to
add they surveyed
people in nearly every
country of the world and
asked them to vote
among purple, pink, and
teal.
• The global winner was
purple!
• In the US 42% of voters
said purple, 37% said
teal, and only 19% said
pink
• In Japan the percentages
were 38% pink, 36% teal,
and only 16% purple
M&Ms
• Japan: P(pink) = 0.38
P(teal) = 0.36
P(purple) = 0.16
• What is the probability
that a Japanese M&Ms
survey respondent
selected at random
preferred either pink or
teal?
• Answer:
We must first make sure
the probabilities are
legitimate. To be
legitimate, all the
probabilities must add up to 1.
• Here they only add up to
0.9, so the remaining
respondents must have
expressed no preference
or written in another
color.
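The legitimacy check is a one-line sum, sketched below with the Japanese survey figures. The helper `is_legitimate` and the label "no preference" are names chosen for this illustration.

```python
from math import isclose

japan = {"pink": 0.38, "teal": 0.36, "purple": 0.16}

def is_legitimate(probs):
    """Every probability is between 0 and 1 and together they sum to 1."""
    in_range = all(0 <= value <= 1 for value in probs.values())
    return in_range and isclose(sum(probs.values()), 1.0, abs_tol=1e-9)

print(is_legitimate(japan))  # False: the three colors sum to only 0.90

# Assign the missing probability to everything else a respondent could say.
leftover = 1 - sum(japan.values())
japan_complete = dict(japan)
japan_complete["no preference"] = leftover

print(round(leftover, 2), is_legitimate(japan_complete))  # 0.1 True
```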
M&Ms
• Japan: P(pink) = 0.38
P(teal) = 0.36
P(purple) = 0.16
P(no pref.) = 0.10
• Question: What is the
probability that a Japanese
M&M survey respondent
selected at random
preferred either pink or
teal?
• Answer: The events “pink”
and “teal” are individual
outcomes (a respondent could not
choose both colors), so
they are disjoint and we
may apply the Addition Rule
• $P(\text{pink} \cup \text{teal})$
$= P(\text{pink}) + P(\text{teal})$
$= 0.38 + 0.36 = 0.74$
• The probability a
respondent picked either
pink or teal is 0.74
M&Ms
• Japan: P(pink) = 0.38
P(teal) = 0.36
P(purple) = 0.16
P(no pref.) = 0.10
• If we pick two
respondents at random,
what is the probability
they both said purple?
• Answer: The word
“both” suggests we want
$P(A \cap B)$, which calls for
the multiplication rule.
• P(both pick purple) =
$P(\text{purple} \cap \text{purple})$
= 0.16 × 0.16
= 0.0256
M&Ms
• If we pick three respondents at random, what is the probability
that at least one preferred purple?
• Here we again see the term “at least” and this clues us in to the
fact that we may need to use the complement of a set, $A^c$
• P(at least one picked purple) = 1 – P(none picked purple)
$= 1 - P(\text{not purple} \cap \text{not purple} \cap \text{not purple})$
$= 1 - (0.84)(0.84)(0.84)$
$= 1 - 0.5927$
$= 0.4073$
• There is about a 40.73% chance that at least
one of the respondents picked purple
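All three M&Ms answers can be reproduced in a few lines of Python. This is a sketch that assumes the randomly chosen respondents answer independently of one another; the dictionary and variable names are made up for the example.

```python
# Japanese survey probabilities from the slides.
japan = {"pink": 0.38, "teal": 0.36, "purple": 0.16, "no preference": 0.10}

# Addition Rule: one respondent cannot pick both pink and teal (disjoint).
p_pink_or_teal = japan["pink"] + japan["teal"]

# Multiplication Rule: two independent respondents both say purple.
p_both_purple = japan["purple"] * japan["purple"]

# Complement Rule: at least one of three says purple
# = 1 minus the chance that none of the three does.
p_not_purple = 1 - japan["purple"]
p_at_least_one_purple = 1 - p_not_purple ** 3

print(round(p_pink_or_teal, 2))         # 0.74
print(round(p_both_purple, 4))          # 0.0256
print(round(p_at_least_one_purple, 4))  # 0.4073
```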
Homework
• Page 338, # 1, 5, 9, 11
Page 339, #19, 21, 23, 25, 27