Cognitive Biases 4
Fallacies Involving Probability
HOMEWORK 2
Assignment
Watch episode 1 of season 1 of “Ancient Aliens.”
Assignment
Find one thing that is said, shown, or presented
in the episode that is misleading. I want you to
describe it to me, then to explain why it is
misleading.
Assignment
What you describe to me can be misleading for
any reason, not just a reason we’ve talked about
in class. Just describe it, and tell me the reason
why it is misleading.
How to Get More Marks
I will award 1 mark to students who give me an
example of something misleading in the show
that is original—meaning that other students
did not give me the exact same example.
PROBABILITY BIASES
Conjunction Fallacy
Which of the following is most likely to happen?
a. There will not be a final exam in this class.
b. There will not be a final exam in this class,
because the instructor has to leave the country.
c. Lingnan University closes and there will not be
a final exam in this class.
The Killer
Ah Jong is a martial arts expert. He’s a
high-ranking member of the Triads, and he’s killed
hundreds of people. Friends describe him as
“dangerous.”
The Killer
What Is Most Likely?
(a) Ah Jong sews women’s dresses.
(b) Ah Jong sews women’s dresses so the police
won’t think he’s a gangster.
(c) Ah Jong sews women’s dresses as part of his
court-ordered rehabilitation.
Conjunction Fallacy
Linda is 31 years old, single, outspoken, and very
bright. She majored in philosophy. As a student,
she was deeply concerned with issues of
discrimination and social justice, and also
participated in anti-nuclear demonstrations.
Which is more probable?
a. Linda is a bank teller.
b. Linda is a bank teller and is active in the
feminist movement.
Conjunction Fallacy
The correct answer is (a):
a. There will not be a final exam.
a. Ah Jong sews women’s dresses.
a. Linda is a bank teller.
Conjunction Fallacy
Suppose: 20,000 people meet the description
(“31 years old, single, outspoken, and very
bright. She majored in philosophy…”).
How many of them are feminist bank tellers?
Pick any number: 5,000.
Conjunction Fallacy
How many of them are bank tellers (feminist or
not feminist)?
It has to be at least 5,000. That’s the number
of feminist bank tellers. You have to add in the
number of non-feminist bank tellers who meet
the description. Let’s say that’s only 1 person.
Conjunction Fallacy
Then the probability that Linda is a bank teller
AND is active in the feminist movement is 5,000
out of 20,000 or 25%.
And the probability that she is a bank teller
[maybe a feminist, maybe not] is 5,001 out of
20,000 > 25%.
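The counting argument can be checked directly. Here is a minimal Python sketch using the made-up numbers from the example (20,000 people who fit the description, 5,000 feminist bank tellers, 1 non-feminist bank teller). The variable names are only for illustration.

```python
# Hypothetical counts from the Linda example (not real data).
matching_description = 20_000
feminist_bank_tellers = 5_000
non_feminist_bank_tellers = 1

# Every feminist bank teller is also a bank teller, so the bank-teller
# count can never be smaller than the feminist-bank-teller count.
bank_tellers = feminist_bank_tellers + non_feminist_bank_tellers

p_teller_and_feminist = feminist_bank_tellers / matching_description
p_teller = bank_tellers / matching_description

print(f"P(bank teller and feminist) = {p_teller_and_feminist:.4%}")
print(f"P(bank teller)              = {p_teller:.4%}")
```

Whatever numbers you pick, the second figure cannot come out smaller than the first.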
Mathematics
The probability of two events both happening
(Linda being a bank teller AND Linda being a
feminist) is never greater than the probability of
just one of those events happening (for example,
Linda being a bank teller).
The illusion that the opposite is true especially
occurs in cases where one event explains the
other.
George
For example, suppose I tell you that there is a
man named “George.” George turned water into
wine, healed the sick, brought a dead person
back to life, and came back to life himself after
he died.
What is the probability that George did all these
things? How likely is it?
George
You can say whatever you like. 1%, 10%, 99%.
But suppose I add to the story. I say “George was
the son of God. That’s why he had all these
powers.”
George
Many people will say that it’s more likely that
George was the son of God AND did all these
things than it is that he did all these things. But
that can’t be true.
“A & B” is always less (or equally) probable than
A, or than B. For A & B to happen, A has to
happen and also B has to happen.
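One way to see why: the probability of A & B is the probability of A multiplied by the probability of B given A, and multiplying by a number between 0 and 1 can never make something bigger. The sketch below takes A = "George did all these things" and B = "George was the son of God". The values 1%, 10%, 99% are the guesses mentioned above, and the values for P(B given A) are made up for illustration.

```python
def prob_both(p_a: float, p_b_given_a: float) -> float:
    """P(A and B) = P(A) * P(B given A)."""
    return p_a * p_b_given_a

# Hypothetical probabilities, chosen only for illustration.
for p_a in (0.01, 0.10, 0.99):
    for p_b_given_a in (0.0, 0.5, 1.0):
        p_ab = prob_both(p_a, p_b_given_a)
        # Multiplying by a number between 0 and 1 can never increase p_a.
        assert p_ab <= p_a
        print(f"P(A)={p_a:.2f}  P(B|A)={p_b_given_a:.1f}  P(A and B)={p_ab:.3f}")
```

The two are equal only when B is certain given A, which is the "or equally" case.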
Debiasing
We can avoid this bias if we ask the question
differently:
There are 100 persons who fit the description
above (that is, Linda’s). How many of them are:
Bank tellers? ____ of 100
Bank tellers and active in the feminist
movement? ____ of 100
Frequencies
This shows that it’s good to translate
percentages and probabilities into frequencies
(number of X out of number of Y).
We are less susceptible to representativeness
bias when things are phrased in this way.
REPRESENTATIVENESS
Representativeness
Our (false) judgment that Linda is more likely to
be a feminist bank teller than to just be a bank
teller is an example of how we judge the truth of
claims based on how “representative” they are.
Representativeness
Consider again our case of coin flips that seem
non-random, due to clustering.
Since coins land 50% heads and 50% tails, “XO”
and “OX” are representative of this even split,
whereas “XX” and “OO” don’t represent it. So
sequences with clustering seem non-random,
even when they are in fact random.
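How common is clustering in truly random flips? A small Python sketch can count it exactly by listing every possible sequence of fair flips, all equally likely, and checking each for a run of identical outcomes. The function names and the choice of ten flips with runs of four are assumptions made for illustration.

```python
from itertools import product

def has_run(sequence, run_length):
    """Return True if some outcome repeats run_length times in a row."""
    if run_length <= 1:
        return len(sequence) > 0
    streak = 1
    for previous, current in zip(sequence, sequence[1:]):
        streak = streak + 1 if current == previous else 1
        if streak >= run_length:
            return True
    return False

def fraction_with_run(num_flips, run_length):
    """Fraction of all equally likely flip sequences containing such a run."""
    sequences = list(product("XO", repeat=num_flips))
    hits = sum(has_run(seq, run_length) for seq in sequences)
    return hits / len(sequences)

print(fraction_with_run(2, 2))   # XX and OO, out of XX, XO, OX, OO
print(fraction_with_run(10, 4))  # a run of four or more in ten fair flips
```

The second figure comes out close to one half, so a streak of four in ten flips is ordinary even though it does not look representative of a 50/50 coin.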
Representativeness
Representativeness influences our other
judgments as well.
It’s hard to accept that two very tall parents
tend to, on average, have less tall children (as
regression to the mean requires). Children who
are as tall as their parents are more
representative of their parents’ heights.
Heuristics and Biases
Representativeness is often a good heuristic.
A heuristic is a strategy that is easy to use in
problem solving but doesn’t always work when
applied.
There is often no good reason to distinguish
between heuristics and biases.
Representativeness is a good heuristic
(sometimes) because (sometimes) things are
representative.
• Sometimes small effects have small causes.
Burnt toast can be caused by leaving bread in
the toaster for too long.
• Sometimes complex effects have complex
causes. World War I (a complex effect) was
caused by a very complex set of factors, only
one of which was the assassination of
Archduke Franz Ferdinand.
However:
• Sometimes large effects have small causes. An
outbreak of a disease may be caused by a tiny
virus or bacterium.
• Sometimes complex effects have simple
causes. For instance, introducing a foreign
species into a new land may cause radical
changes in the ecosystem.
THE BASE RATE NEGLECT FALLACY
Base Rate Fallacy
• About ½ million
people in Russia are
affected by HIV/AIDS.
• There are 150 million
people in Russia.
Base Rate Fallacy
Imagine that the
government decides this is
bad and that they should
test everyone for
HIV/AIDS.
They develop a test with the following features:
• If someone has HIV/AIDS, then 95% of the
time the test will be positive (correct), and
only 5% of the time will it be negative
(incorrect).
• If someone does not have HIV/AIDS, then
95% of the time the test will be negative
(correct), and only 5% of the time will it be
positive (incorrect).
You Get Tested
Suppose you are a Russian who gets tested for
HIV/AIDS under the government program.
The test comes out positive. How likely are you
to have HIV/AIDS?
             HIV/AIDS = Yes     HIV/AIDS = No
Test = Yes   True Positives     False Positives
Test = No    False Negatives    True Negatives
True Positives
We know that there are 500,000 people in
Russia who have HIV/AIDS. How many will get a
positive test result?
True Positives
500,000 x 95% = 475,000
             HIV/AIDS = Yes     HIV/AIDS = No
Test = Yes   475,000            False Positives
Test = No    False Negatives    True Negatives
False Negatives?
How many people who have HIV/AIDS will test
negative?
500,000 – 475,000 = 25,000
500,000 x 5% = 25,000
             HIV/AIDS = Yes     HIV/AIDS = No
Test = Yes   475,000            False Positives
Test = No    25,000             True Negatives
True Negatives
We also know that there are 150 million –
500,000 people in Russia who do not have
HIV/AIDS.
How many of them will correctly test negative?
True Negatives
(150,000,000 – 500,000) x 95%
= 149,500,000 x 95%
= 142,025,000
             HIV/AIDS = Yes     HIV/AIDS = No
Test = Yes   475,000            False Positives
Test = No    25,000             142,025,000
False Positives
How many people who do not have HIV/AIDS
will falsely test positive for the disease?
149,500,000 – 142,025,000 = 7,475,000
149,500,000 x 5% = 7,475,000
             HIV/AIDS = Yes     HIV/AIDS = No
Test = Yes   475,000            7,475,000
Test = No    25,000             142,025,000
You tested positive. What percentage of people
who tested positive truly had HIV/AIDS?
             HIV/AIDS = Yes     HIV/AIDS = No     Totals
Test = Yes   475,000            7,475,000         7,950,000
Test = No    25,000             142,025,000
Your Chances
True Positives ÷ All Positives =
475,000 ÷ 7,950,000 ≈ 6%
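The whole table can be rebuilt in a few lines of Python. This is only a sketch using the numbers from the example. The names "sensitivity" and "specificity" are the standard statistical terms for the two 95% figures and do not appear on the slides.

```python
# Figures from the example above.
population = 150_000_000
infected = 500_000
sensitivity = 0.95   # P(test positive | has HIV/AIDS)
specificity = 0.95   # P(test negative | does not have HIV/AIDS)

not_infected = population - infected

# Fill in the four cells of the table.
true_positives = round(infected * sensitivity)
false_negatives = infected - true_positives
true_negatives = round(not_infected * specificity)
false_positives = not_infected - true_negatives

# Of everyone who tested positive, what share really has the condition?
all_positives = true_positives + false_positives
chance_infected_given_positive = true_positives / all_positives

print(f"True positives:  {true_positives:,}")
print(f"False negatives: {false_negatives:,}")
print(f"True negatives:  {true_negatives:,}")
print(f"False positives: {false_positives:,}")
print(f"Chance of HIV/AIDS given a positive test: {chance_infected_given_positive:.1%}")
```

Changing `infected` or the two accuracy figures shows how strongly the final percentage depends on how common the condition is.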
Whether a test is good or worth doing depends
not only on how accurate it is (95% true
positive, 95% true negative), but also on how
prevalent the condition being tested for is.
Very rare conditions require tests with very few
false positives, whereas very prevalent conditions
can get by with only moderately accurate tests.
Base Rates
The “base rate” is the percentage of people in
the population who have a certain property.
The base rate of HIV/AIDS cases is the
percentage of people who have HIV/AIDS in the
population. The base rate of bearded people is
the percentage of people who have beards in
the population, etc.
Base Rate Neglect
The “base rate neglect fallacy” is the fallacy of
ignoring the base rate when making a judgment.
For example, if I assumed you were a terrorist
just because you tested positive, I would be
committing the base rate neglect fallacy. I
should assume you’re still probably not a
terrorist.
Base Rates
As we have seen, base rates matter. If the base
rate of a condition is very low (small percentage
of AIDS cases), then even very accurate tests
(95% true positive, 95% true negative) can be
useless.
In our example only 6% of people who tested
positive for AIDS had the disease!
Base Rate Neglect
The base rate neglect fallacy has nothing at all to
do with the number of people there are in the
world. Nothing.
It has to do with the probability of a variable
taking on a certain value, for instance, the
probability that someone’s height = 1.5m, the
probability that someone has AIDS, etc.
Base Rates
This is the “base rate” of people who are 1.5m
tall, and the “base rate” of people with AIDS.
If 1 in 100 people are terrorists, then the rate of
terrorists is 1 in 100 and the probability that a
randomly selected person is a terrorist is 1 in
100.
Base Rates
We call this the base rate, because it is the
probability that someone is a terrorist when we
don’t know anything else about them.
It might be that the base rate of terrorists is 1 in
100, but the rate of terrorists among people
who are holding rocket launchers is 1 in 2, and
the rate of terrorists among retirees is 1 in 500.
Tests
The base rate neglect fallacy happens when we
have a test that is meant to detect the value of a
variable.
For example we might have a test that tells us
whether someone has AIDS or not, or whether
someone is driving over the speed limit, or
whether they are drunk.
Reliability of Tests
Here is the crucial fact. Please
learn this:
As the base rate of X = x decreases, the share of
positive results on tests for X = x that are false
positives increases.
Tests are less reliable when the condition we are
testing for becomes rare (low base rate).
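To see this effect, here is a short Python sketch that keeps the test fixed (95% true positive rate, 5% false positive rate, as in the HIV/AIDS example) and varies only the base rate. The function name and the particular base rates tried are assumptions for illustration; 1 in 300 is the rate from the Russia example.

```python
def chance_true_given_positive(base_rate, true_positive_rate=0.95,
                               false_positive_rate=0.05):
    """Share of positive results that are true positives (Bayes' rule)."""
    true_pos = base_rate * true_positive_rate
    false_pos = (1 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)

# Same test every time; only the base rate changes.
for base_rate in (0.5, 0.1, 0.01, 1 / 300, 0.0001):
    share = chance_true_given_positive(base_rate)
    print(f"base rate {base_rate:.4%}: {share:.1%} of positives are true")
```

The test itself never changes, yet a positive result means less and less as the condition gets rarer.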
Base Rate Neglect Fallacy
The base rate neglect fallacy happens when:
1. There is a low base rate of some condition.
2. We have a test for that condition.
3. Someone tests positive.
4. We assume that means they have the
condition, ignoring the unreliability of tests
for conditions with low base rates.
Base Rate Neglect
Humans have a tendency to ignore base rates.
For example, Kahneman and Tversky (1973)
conducted a study in which participants were
supposed to estimate the GPAs of certain
(fictional) students.
Kahneman & Tversky 1973
Some of the participants were given good
evidence that students had high (or low) GPAs.
In particular, they were given the students’
percentiles (95th percentile, for example).
Other participants were given only very weak
evidence: the scores that the students got on a
test of humor.
All the participants were given the base rate of
students with various GPAs. For example, 20% A,
40% B, 30% C, 10% D.
But all of the participants ignored the base rate.
A good test for a prevalent condition (like
having an A average, unlike being a
terrorist) gives you lots of information.
If someone is in the 99th percentile, for instance,
you can be sure that they got an A. If they’re in
the bottom quartile, you know that they did not
get an A.
But scoring high on a test of humor is not a good
indicator of your GPA. Maybe people with a
good sense of humor are a little bit more likely
to get better grades, but not much more likely.
Given only such information, your guess should
be very close to the base rate (for example, it’s
40% likely the student has a B GPA).
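A rough Python sketch shows why weak evidence should barely move you off the base rate. The base rates are the ones given above (20% A, 40% B, 30% C, 10% D). The chances of a high humor score for each grade are invented for illustration and are not from the study; they differ only slightly, which is what makes the evidence weak.

```python
base_rates = {"A": 0.20, "B": 0.40, "C": 0.30, "D": 0.10}

# ASSUMPTION: made-up chances of a high humor score for each GPA band.
# They differ only slightly, which is what "weak evidence" means here.
p_high_humor = {"A": 0.27, "B": 0.26, "C": 0.25, "D": 0.24}

def update(prior, likelihood):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = {grade: prior[grade] * likelihood[grade] for grade in prior}
    total = sum(unnormalized.values())
    return {grade: value / total for grade, value in unnormalized.items()}

posterior = update(base_rates, p_high_humor)
for grade in base_rates:
    print(f"{grade}: base rate {base_rates[grade]:.0%}, "
          f"after high humor score {posterior[grade]:.1%}")
```

Under these assumed numbers, B stays the most likely grade and every estimate stays within a couple of percentage points of its base rate.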
But, as I said, participants ignored the base rate.
They guessed that people who did very well on
the humor test had high GPAs, and people who
did poorly on the test had low GPAs.
Prosecutor’s Fallacy
The base rate neglect
fallacy is often called the
prosecutor’s fallacy, as I
shall explain.
Murder!
Let’s suppose that there
has been a murder.
There is almost no
evidence to go on except
that the police find one
hair at the crime scene.
You are the Suspect
If someone is the killer, there is a 100% chance
that their DNA will match the hair’s DNA.
The police have a database that contains the
DNA of everyone in Hong Kong.
They run the DNA in the hair through their
database and discover that you are a match!
Comprehension Question
If you have been following along you should be
able to answer this question:
What is the probability that you are the
murderer, given that you are a DNA match for
the hair?
Answer
If you said 100%, then you have just committed
the base rate neglect fallacy.
The correct answer is “Much lower, because the
base rate of people who committed this murder
out of the Hong Kong population as a whole is 1
in 7 million.”
Perfect Conditions for Fallacy
Here’s what we have:
1. A low base rate (only 1 person who
committed this murder in the world).
2. A test for whether someone is the murderer.
3. You, who’ve tested positive on this test.
4. And the police who think you did it!
Let’s Look at the Numbers
We know that if you are the murderer, then
there is a 100% chance of a DNA match.
But what is the false positive rate? How likely is
it that a randomly selected person will match the DNA?
False Results
Here’s a quote from “False result fear over DNA
tests,” Nick Paton Walsh, The Guardian:
“Researchers had asked the labs to match a
series of DNA samples. They knew which ones
were from the same person, but found that in
over 1 per cent of cases the labs falsely matched
samples, or failed to notice a match.”
Let’s assume that half of the cases where “labs
falsely matched samples, or failed to notice a
match” were cases where they falsely matched
samples.
So the probability of a false positive is ½ x 1% =
0.5%, or 5 in 1,000.
Since there are 7 million people in Hong Kong,
we expect about 0.5% x 7 million = 35,000 of
them to match the hair’s DNA.
Actually, it’s 35,000 + 1, because the true killer is
a match, and not by accident.
So we expect that there are 35,001 DNA
matches in all of Hong Kong.
And only one of them is the murderer. So what
is the probability that you are the murderer?
1 in 35,001. That’s way less than 100%.
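The same arithmetic as a short Python sketch. The 0.5% false positive rate is the assumption made above (half of the reported 1% error rate), and the variable names are only for illustration.

```python
population = 7_000_000        # people in the Hong Kong DNA database
false_positive_rate = 0.005   # assumed: half of the reported 1% error rate
murderers = 1

# People expected to match by accident, plus the one true killer.
expected_false_matches = round(false_positive_rate * population)
expected_matches = expected_false_matches + murderers

p_murderer_given_match = murderers / expected_matches

print(f"Expected matches: {expected_matches:,}")
print(f"Chance a given match is the murderer: 1 in {expected_matches:,} "
      f"({p_murderer_given_match:.4%})")
```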
Important Things to Remember
There are three important things to remember:
1. If the test is more accurate (fewer false
positives), then it’s more reliable.
2. If the base rate is higher, the test is more
reliable.
3. If the police have other reasons to suspect
you, the test is more reliable.
1. If the test is more accurate…
Theoretically, DNA tests only return a false
positive about 1 in 3 billion times.
In that case, we’d expect only about 0.002 false
positives in all of Hong Kong (7 million ÷ 3 billion).
So your chances of being guilty would be 1 in
1.002, or 99.8%. Still, that’s lower than 100%.
2. If the base rate is higher…
Maybe the person who died was stabbed 5,000
times, once each by 5,000 different people. So
there are 5,000 murderers.
Then with the previous false positive number at
35,000, you have a 5,000 in 40,000 chance of
being one of the killers, or 12.5%.
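Both variations can be checked with one small Python function. This is a sketch of the same "guilty matches ÷ all expected matches" reasoning used above; the function name and parameters are made up for illustration.

```python
def chance_guilty_given_match(population, false_positive_rate, murderers=1):
    """Guilty matches divided by all expected matches."""
    expected_false_matches = false_positive_rate * population
    return murderers / (murderers + expected_false_matches)

hong_kong = 7_000_000

# 1. A far more accurate test: a false positive about 1 time in 3 billion.
accurate_test = chance_guilty_given_match(hong_kong, 1 / 3_000_000_000)
print(f"More accurate test: {accurate_test:.1%}")

# 2. A higher base rate: 5,000 murderers, with the earlier 0.5% error rate.
many_murderers = chance_guilty_given_match(hong_kong, 0.005, murderers=5_000)
print(f"Higher base rate:   {many_murderers:.1%}")
```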
3. If the police have some other reason
to suspect you…
To figure out your chances of being guilty, we
looked at the probability that a randomly
selected person from HK would be a DNA match.
We were assuming you were randomly selected.
But what if you weren’t randomly selected?
What if the police tested you because you had a
reason to kill the victim?
Reason to Suspect You
Then we would have to look at not the
probability that a randomly selected person
would match, but the probability that a person
who had reason to kill the victim would match.
Suppose there are 5 people who had reasons to
kill the victim, and the killer is one of them.
Much Higher Chance
Then your chances are:
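Let K = you’re the killer and M = you’re a match
P(K | M)
= [P(K) x P(M | K)] ÷ P(M)
= [(1/5) x 100%] ÷ P(M)
= 0.2 ÷ [(1 + 4 x 0.005) ÷ 5]
= 0.2 ÷ 0.204
= 98.0%
Here P(M) counts the killer, who always matches,
plus the 4 other suspects, who each match by
accident 0.5% of the time.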
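The calculation above is Bayes’ rule, and it can be written as a short Python sketch. The function name and parameters are made up for illustration; the inputs are the 5 suspects, the 100% match rate for the killer, and the assumed 0.5% false positive rate. Because it counts accidental matches only among the four suspects who are not the killer, it gives about 98.0%. Treating all five as possible accidental matches, a rougher estimate, gives about 97.6%.

```python
def posterior_killer(prior, p_match_if_killer, p_match_if_innocent):
    """P(K | M) by Bayes' rule, with P(M) from the law of total probability."""
    p_match = prior * p_match_if_killer + (1 - prior) * p_match_if_innocent
    return prior * p_match_if_killer / p_match

suspects = 5
p_killer_given_match = posterior_killer(
    prior=1 / suspects,          # you are one of 5 people with a motive
    p_match_if_killer=1.0,       # the killer always matches
    p_match_if_innocent=0.005,   # assumed false positive rate
)
print(f"P(K | M) = {p_killer_given_match:.1%}")
```

Setting `prior` to 1 in 7 million instead reproduces the situation where you were picked at random from the whole database.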