When Intuition Differs

When Intuition Differs from
Relative Frequency
Coincidences, Gamblers’ Fallacy,
Confusion of the Inverse, Expected Values,
and Simpson’s Paradox
A few questions to test your intuition:
What is the probability that at least
two people in this class have the
same birthday?
Closer to 50% or 5%?
You test positive for a rare disease. Your original
chances of having the disease are 1 in 100. The test is
80% accurate.
Given that you tested positive, what do you think
is the probability that you actually have the
disease?
Higher or lower than 50%?
If you were to flip a fair coin six times,
which of the following sequences do
you think would be most likely:
HHHHHH or HHTHTH or HHHTTT?
Which one would you choose in each
set? (Choose either A or B and either C
or D.)
A. A gift of $240, guaranteed
B. A 25% chance to win $1000 and
a 75% chance of getting nothing
C. A sure loss of $740
D. A 75% chance to lose $1000 and
a 25% chance to lose nothing
Is it possible that a cause of death could
rank at or near the top of the list for almost
all age groups, but not near the top of the
list for the entire population?
Sharing the Same Birthday
What is the probability that at least two people in
this class have the same birthday?
Most people think that the probability is small, but it is
actually close to 50%. Most are thinking about the
probability that someone will share their own birthday, which is
much less likely.
Sharing the Same Birthday
What is the probability that at least two people in this
class have the same birthday?
First find the probability that no two people in the class share
a birthday, then subtract from 1.
Probability that none of the 27 people share a
birthday:
(365)(364)(363) · · · (341)(340)(339) / 365^27 = 0.37314
Probability at least 2 people share a birthday:
1 – 0.37314 = 0.62686
So the probability that at least 2 people in the class share the
same birthday is actually close to 63%!
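The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name prob_shared_birthday is made up here, and it assumes 365 equally likely birthdays (no leap years), as the slide's arithmetic does.

```python
def prob_shared_birthday(n_people, days=365):
    """P(at least two of n_people share a birthday), assuming `days`
    equally likely birthdays (an assumption; leap years are ignored)."""
    p_all_different = 1.0
    for i in range(n_people):
        # Person i must avoid the i birthdays already taken.
        p_all_different *= (days - i) / days
    return 1.0 - p_all_different


p_class = prob_shared_birthday(27)  # class of 27, as on the slide
print(round(p_class, 5))
```

Calling the function with other class sizes shows how quickly the probability grows; with 23 people it already passes 50%.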
Most Coincidences Only Seem Improbable
• Coincidences seem improbable only if we
consider the probability of that specific
event occurring at that specific time to
us.
• If we consider the probability of it occurring
some time, to someone, the probability can
become quite large.
• Since there are a multitude of experiences
we have each day, it is not surprising that
some may appear improbable.
More Likely Coin Flip Outcome
If you were to flip a fair coin six times,
which sequence do you think would be most
likely:
HHHHHH or HHTHTH or HHHTTT?
People tend to regard a sequence such as HTHTTH as more likely than
the sequence HHHTTT, which does not appear to
be random, and also as more likely than HHHHTH, which does
not seem to represent the fairness of the coin.
However, every specific sequence of six flips, including each of the
above, is equally likely. What is the probability of each sequence?
Each has a probability of (0.5)^6, which is 0.015625.
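A quick way to check this is to list all possible sequences of six flips. The sketch below assumes a fair coin with independent flips; the helper name sequence_probability is made up for illustration. It also counts how many sequences have exactly three heads, which may help explain why "mixed-looking" sequences feel more likely: there are many of them as a group, even though each one individually is no more likely than HHHHHH.

```python
from itertools import product


def sequence_probability(seq, p_heads=0.5):
    """Probability of one specific H/T sequence for independent flips."""
    prob = 1.0
    for flip in seq:
        prob *= p_heads if flip == "H" else 1.0 - p_heads
    return prob


# All 2^6 = 64 possible sequences of six flips.
all_sequences = ["".join(flips) for flips in product("HT", repeat=6)]

for seq in ("HHHHHH", "HHTHTH", "HHHTTT"):
    print(seq, sequence_probability(seq))

# Number of sequences with exactly three heads (as a group, not individually).
three_heads = sum(1 for s in all_sequences if s.count("H") == 3)
print(len(all_sequences), three_heads)
```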
The Gambler’s Fallacy
Gambler’s Fallacy is the mistaken notion that the chances of
something with a fixed probability increase or decrease
depending upon recent occurrences. People think the long-run
frequency of an event should apply even in the short run.
Remember: Independent Chance Events Have No “Memory”
People tend to believe that a string of good luck will follow a
string of bad luck in a casino.
or
People tend to believe that a “streak” will continue.
However, winning or losing ten gambles in a row doesn’t change
the probability that the next gamble will be a win or a loss.
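As an illustration, the exact enumeration below looks at every possible run of 11 gambles and keeps only those that start with a streak of ten. It assumes a hypothetical fair 50/50 gamble with independent outcomes (real casino games are not 50/50); the function name is made up for this sketch.

```python
from fractions import Fraction
from itertools import product


def win_chance_after_streak(streak_len, streak_outcome):
    """Exact P(next gamble is a win | the previous `streak_len` gambles
    were all `streak_outcome`), for a fair, independent W/L gamble
    (an assumption made for this sketch)."""
    matching = 0
    matching_then_win = 0
    # Every sequence of streak_len + 1 gambles is equally likely.
    for seq in product("WL", repeat=streak_len + 1):
        if all(outcome == streak_outcome for outcome in seq[:streak_len]):
            matching += 1
            if seq[-1] == "W":
                matching_then_win += 1
    return Fraction(matching_then_win, matching)


after_ten_losses = win_chance_after_streak(10, "L")
after_ten_wins = win_chance_after_streak(10, "W")
print(after_ten_losses, after_ten_wins)
```

Both values come out the same: the streak carries no information about the next gamble.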
The Gambler’s Fallacy
When It May Not Apply
Gambler’s fallacy applies to independent events (events
in which the outcome of one does not affect the
next).
It may not apply to situations where knowledge of
one outcome affects probabilities of the next.
Example:
In card games using a single deck, knowledge of
what cards have already been played provides
information about what cards are likely to be
played next.
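A small sketch of the card example, assuming a standard well-shuffled 52-card deck with 4 aces and cards dealt without replacement (the function name and the specific card counts are made up for illustration):

```python
from fractions import Fraction


def chance_next_is_ace(cards_seen, aces_seen, deck_size=52, aces_in_deck=4):
    """P(next card is an ace) given how many cards, and how many aces,
    have already been played from a single shuffled deck."""
    remaining_cards = deck_size - cards_seen
    remaining_aces = aces_in_deck - aces_seen
    return Fraction(remaining_aces, remaining_cards)


fresh_deck = chance_next_is_ace(0, 0)      # nothing played yet
no_aces_yet = chance_next_is_ace(12, 0)    # 12 cards played, none were aces
all_aces_gone = chance_next_is_ace(12, 4)  # 12 cards played, all 4 aces among them
print(fresh_deck, no_aces_yet, all_aces_gone)
```

Here the past does matter: the probability changes with what has been played, so expecting it to change is not a fallacy.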
Confusion of the Inverse
Malignant or Benign?
• Patient has a lump. In about 1% of cases, the lump is
malignant.
• Mammograms are 80% accurate for malignant lumps and
90% accurate for benign lumps.
• Mammogram indicates lump is malignant.
What are the chances that someone with a lump that tests
positive for malignancy really has a malignant lump?
In a study, most physicians said about 75%. Do you agree?
Create a table in Excel in order to calculate the chance that a
patient with a positive test result does actually have a
malignant tumor.
Confusion of the Inverse
• Let’s consider a study in which mammograms are given to 10,000
women with breast tumors.
• Recall that in 1% of cases the tumor is malignant.
(.01)*(10,000) = 100 women with cancer
• Mammogram screening correctly identifies 80% of the 100 malignant
tumors as malignant. The other 20% get negative test results even
though they have cancer.
• Mammogram screening correctly identifies 90% of the 9900 benign
tumors as benign. The other 10% get positive test results in which
the mammogram incorrectly suggests their tumors are malignant.

                     Tumor is Malignant    Tumor is Benign        Totals
Positive Mammogram   80 True positives     990 False positives
Negative Mammogram   20 False negatives    8910 True negatives
Total                100                   9900                   10,000
Confusion of the Inverse
Now we compute the row totals.
                     Tumor is Malignant    Tumor is Benign        Totals
Positive Mammogram   80 True positives     990 False positives    1070
Negative Mammogram   20 False negatives    8910 True negatives    8930
Total                100                   9900                   10,000
Confusion of the Inverse
According to the numbers in the table, the proportion of women with a
positive test whose tumor was actually malignant is: 80/1,070 ≈ 0.075.
In the study, most physicians said about 75%, but it is only about 7.5%!
The physicians were off by a factor of 10!
Confusion of the Inverse: Physicians were confusing the
probability of getting a positive test if you do have cancer
with the probability of having cancer if you get a positive
test.
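The table can also be built in code rather than in Excel. The sketch below is one possible way to do it, using the numbers from the example (10,000 women, 1% malignant, 80% accuracy on malignant lumps, 90% on benign lumps); the function and key names are made up for illustration.

```python
def mammogram_table(n_women=10_000, base_rate=0.01,
                    sensitivity=0.80, specificity=0.90):
    """Counts for the 2x2 table in the example.

    sensitivity = accuracy on malignant tumors,
    specificity = accuracy on benign tumors.
    """
    malignant = round(n_women * base_rate)
    benign = n_women - malignant
    true_pos = round(malignant * sensitivity)
    false_neg = malignant - true_pos
    true_neg = round(benign * specificity)
    false_pos = benign - true_neg
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "positive_total": true_pos + false_pos,
        "negative_total": false_neg + true_neg,
    }


table = mammogram_table()
# P(malignant | positive test): true positives out of all positive tests.
p_malignant_given_positive = table["true_pos"] / table["positive_total"]
print(table)
print(round(p_malignant_given_positive, 3))
```

Compare this with the 80% chance of a positive test given a malignant tumor: the two conditional probabilities answer different questions.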
Confusion of the Inverse
The Probability of a False Positive Test
If the base rate for a disease is very low and the test for the disease
is less than perfect, there will be a relatively high
probability that a positive test result is a false positive.
In our example, the proportion of positive tests that are false
positives is 990/1,070, or 92.5%.
To determine probability of a positive test result being accurate, you need:
1. Base rate or probability that you are likely to have disease, without
any knowledge of your test results.
2. Sensitivity of the test – the proportion of people who correctly
test positive when they actually have the disease
3. Specificity of the test – the proportion of people who correctly
test negative when they don’t have the disease
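The three ingredients listed above combine as follows. This is a minimal sketch; the function name is made up, and the base rates of 10% and 50% are hypothetical values added only to show how strongly the answer depends on the base rate. The 1% base rate, 80% sensitivity, and 90% specificity come from the mammogram example.

```python
def prob_disease_given_positive(base_rate, sensitivity, specificity):
    """P(disease | positive test) from base rate, sensitivity, specificity."""
    # Fraction of everyone tested who has the disease and tests positive.
    true_pos = base_rate * sensitivity
    # Fraction of everyone tested who is disease-free but tests positive.
    false_pos = (1 - base_rate) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


results = {}
for base_rate in (0.01, 0.10, 0.50):  # 0.10 and 0.50 are hypothetical
    results[base_rate] = prob_disease_given_positive(base_rate, 0.80, 0.90)
    print(f"base rate {base_rate:.2f}: "
          f"P(disease | positive) = {results[base_rate]:.3f}")
```

With the same test, the probability that a positive result is correct rises from about 7.5% at a 1% base rate to about 47% at 10% and about 89% at 50%.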
Using Expected Values To
Make Wise Decisions
Revisit the question from earlier:
If you were faced with the following alternatives, which would you
choose? Note that you can choose either A or B and either C or D.
A. A gift of $240, guaranteed
B. A 25% chance to win $1000 and a 75% chance of getting nothing
C. A sure loss of $740
D. A 75% chance to lose $1000 and a 25% chance to lose nothing
Using Expected Values To Make
Wise Decisions
A. A gift of $240, guaranteed
B. A 25% chance to win $1000 and a 75% chance of
getting nothing
A versus B: majority chose sure gain A.
Expected value under choice B is $250, higher
than sure gain of $240 in A, yet people prefer A.
To calculate the expected value, multiply each
probability by its amount, then add the results:
(.25)(1000) + (.75)(0) = $250
Using Expected Values To Make
Wise Decisions
C. A sure loss of $740
D. A 75% chance to lose $1000 and a 25% chance to lose
nothing
C versus D: majority chose the gamble D rather than the sure loss.
Expected value under D is -$750, a larger expected loss than
the $740 in C.
(.75)(-1000) + (.25)(0) = -750
People value a sure gain, but are willing to take a risk to
avoid a loss.
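The expected values for all four options can be computed the same way. A minimal sketch, with each option written as a list of (probability, amount) pairs; the function and variable names are made up for illustration:

```python
def expected_value(outcomes):
    """Sum of probability * amount over all (probability, amount) pairs."""
    return sum(prob * amount for prob, amount in outcomes)


options = {
    "A": [(1.0, 240)],                 # sure gain of $240
    "B": [(0.25, 1000), (0.75, 0)],    # 25% chance to win $1000
    "C": [(1.0, -740)],                # sure loss of $740
    "D": [(0.75, -1000), (0.25, 0)],   # 75% chance to lose $1000
}

evs = {name: expected_value(outcomes) for name, outcomes in options.items()}
for name, ev in evs.items():
    print(name, ev)
```

By expected value alone, B beats A and C beats D, the opposite of what the majority chose.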
Using Expected Values To Make Wise
Decisions
If you were faced with the following alternatives, which would you
choose? Note that you can choose either A or B and either C or D.
A: A 1 in 1000 chance of winning $5000
B: A sure gain of $5
C: A 1 in 1000 chance of losing $5000
D: A sure loss of $5
• Would you make the same decisions as you did in the previous
example? Why or why not?
• What is the Expected Value for each option?
Using Expected Values To Make Wise
Decisions
For A and B, the EV is $5
For C and D, the EV is -$5
• A versus B: 75% chose A (gamble). Similar to decision to
buy a lottery ticket, where sure gain is keeping $5 rather
than buying a ticket.
• C versus D: 80% chose D (sure loss). Similar to success
of insurance industry. Dollar amounts are important: sure
loss of $5 easy to absorb, while risk of losing $5000 may
be equivalent to risk of bankruptcy.
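For this second set of options the expected values tie within each pair, which can be checked exactly. The sketch below uses Python's Fraction type so the 1-in-1000 probabilities are not rounded; the names are made up for illustration.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Sum of probability * amount over all (probability, amount) pairs."""
    return sum(prob * amount for prob, amount in outcomes)


one_in_1000 = Fraction(1, 1000)
lottery_options = {
    "A": [(one_in_1000, 5000), (1 - one_in_1000, 0)],   # small chance to win $5000
    "B": [(Fraction(1), 5)],                            # sure gain of $5
    "C": [(one_in_1000, -5000), (1 - one_in_1000, 0)],  # small chance to lose $5000
    "D": [(Fraction(1), -5)],                           # sure loss of $5
}

lottery_evs = {name: expected_value(o) for name, o in lottery_options.items()}
print(lottery_evs)
```

Since the expected values are equal within each pair, the preferences reported above cannot be explained by expected value; the size of the dollar amounts is doing the work.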
Simpson’s Paradox
Is it possible that a cause of death could rank at
or near the top of the list for almost all age
groups, but not near the top of the list for the
entire population?
‘Simpson’s Paradox refers to the reversal of the direction
of a comparison or an association when data from several
groups are combined to form a single group’
Simpson’s Paradox
How can death from car accident be at or near the top of the
list for most age groups, but 5th for all ages?
The numbers don’t seem to “work” but they do.
Let’s take a look at the Excel file for Leading Causes of
Death.
Simpson’s Paradox
How can death from car accident be at or near the top of the
list for most age groups, but 5th for all ages?
The age groups are not equally represented in the overall
number of deaths. As expected, the number of deaths in the
older age groups is much higher than in the younger age
groups. Because MV Traffic Crashes was not even in the top 10
causes of death for ages 65 and over, that age group “pulls down”
the rank of MV Traffic Crashes when all age groups are combined.
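The effect can be reproduced with a toy data set. The counts below are entirely hypothetical, invented only to show the mechanism; they are not the figures from the Leading Causes of Death file, and the age-group labels and cause names are placeholders.

```python
# Hypothetical death counts by age group (made-up numbers for illustration).
deaths = {
    "ages 1-24": {"MV traffic crashes": 100, "cancer": 30,
                  "heart disease": 20, "stroke": 5},
    "ages 25-44": {"MV traffic crashes": 150, "cancer": 140,
                   "heart disease": 120, "stroke": 30},
    "ages 65+": {"MV traffic crashes": 60, "cancer": 5000,
                 "heart disease": 6000, "stroke": 1500},
}


def rank_of(cause, counts):
    """1-based rank of `cause` when causes are sorted by count, largest first."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return ordered.index(cause) + 1


# Combine all age groups into one population.
overall = {}
for group_counts in deaths.values():
    for cause, n in group_counts.items():
        overall[cause] = overall.get(cause, 0) + n

ranks_by_group = {group: rank_of("MV traffic crashes", counts)
                  for group, counts in deaths.items()}
overall_rank = rank_of("MV traffic crashes", overall)
print(ranks_by_group)
print("all ages:", overall_rank)
```

In this made-up example, crashes rank first in the two younger groups but last overall, because the oldest group contributes most of the deaths.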