Handout - amatyc

Rethinking Probability Activities and Problem Settings
Adapted from Statistics: Learning From Data (Peck, 2014)
Activity 1: Should You Paint the Nursery Pink?
Background: Ultrasound is a medical imaging technique that is routinely used to assess the health of a baby prior
to birth. It is also sometimes possible to determine gender of the baby during an ultrasound examination. Are
gender identifications made from ultrasound images during the first trimester (3 months) of pregnancy accurate?
The paper “The Use of Three-Dimensional Ultrasound for Fetal Gender Determination in the First Trimester” (The
British Journal of Radiology [2003]: 448 – 451) describes a study of the accuracy of ultrasound gender predictions.
In one part of this study, an experienced radiologist looked at 159 first trimester ultrasound images and made a
gender prediction based on each image. The ultrasound gender predictions were then compared to the actual
gender when the baby was born. The table below summarizes the data from this part of the study.
Radiologist 1       Baby is Male   Baby is Female   Total
Predicted Male            74              14           88
Predicted Female          12              59           71
Total                     86              73          159
Notice that the gender prediction based on the ultrasound image is not always correct.
1. What is the estimated probability that a gender prediction made by this radiologist is correct? (Hint:
Which cells in the table correspond to correct gender predictions?)
2. Use the given data to estimate the probability that the radiologist’s prediction is correct given that the
baby is male. (Hint: It will probably help to add row and column totals to the table.)
Instructor note: After students have figured this out, this is a good place to introduce the notation for
conditional probabilities (e.g. P(prediction is correct|baby is male)), but the formula for conditional
probability isn’t really necessary.
3. Based on the estimated probabilities from steps 1 and 2, do you think that the events “radiologist’s
prediction is correct” and “baby is male” are independent? Why or why not?
Instructor note: This is a good place to bring the class together to revisit the notion of independence and
what it means for two events to be independent in terms of conditional probabilities.
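For checking answers to steps 1 to 3, the arithmetic can be sketched in Python as follows. This is only a minimal sketch: the dictionary layout and variable names are illustrative choices, not part of the original activity, and the comparison in step 3 looks at the two estimates only, without assessing sampling variability.

```python
# Counts from the Radiologist 1 table, keyed by (prediction, actual gender).
counts = {
    ("male", "male"): 74,
    ("male", "female"): 14,
    ("female", "male"): 12,
    ("female", "female"): 59,
}

total = sum(counts.values())  # all images read by Radiologist 1

# Step 1: correct predictions are the cells where prediction == actual.
n_correct = sum(n for (pred, actual), n in counts.items() if pred == actual)
p_correct = n_correct / total

# Step 2: restrict attention to the "Baby is Male" column.
n_male = sum(n for (pred, actual), n in counts.items() if actual == "male")
p_correct_given_male = counts[("male", "male")] / n_male

# Step 3: if the events were independent, these two estimates would agree
# (up to sampling variability, which this sketch does not assess).
print(f"P(correct)             = {n_correct}/{total} = {p_correct:.4f}")
print(f"P(correct | baby male) = {counts[('male', 'male')]}/{n_male} "
      f"= {p_correct_given_male:.4f}")
```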
4. How does the estimated probability in step 2 compare to the estimated probability that the radiologist’s
prediction is correct given that the baby is female? Is this radiologist more likely to be correct when the
baby is male than when the baby is female?
5. (Investigative) If radiologist 1 predicts that gender is female and you then paint the nursery pink, what is
the estimated probability that you will have to repaint?
Instructor note: Let students struggle with this one for a bit—they can usually figure this out. You want to
make sure that they understand the difference between the probability computed in step 4 and the
probability of interest here—the difference between
probability that the prediction is female given that the baby is female
and
probability that the baby is female given that the prediction is female
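The contrast between the two conditional probabilities can be made concrete with a short Python sketch. The variable names are illustrative only; the counts come from the Radiologist 1 table. The point is that the same cell count (59) is divided by a row total in one case and by a column total in the other.

```python
# Radiologist 1 counts needed for step 5.
pred_female_baby_male = 12    # predicted female, baby is male (wrong)
pred_female_baby_female = 59  # predicted female, baby is female (right)
pred_male_baby_female = 14    # predicted male, baby is female (wrong)

# Condition on the ROW: all images where the prediction was female.
n_pred_female = pred_female_baby_male + pred_female_baby_female
p_female_given_pred_female = pred_female_baby_female / n_pred_female
p_repaint = pred_female_baby_male / n_pred_female  # pink nursery, male baby

# Condition on the COLUMN: all female babies (the step 4 quantity).
n_baby_female = pred_male_baby_female + pred_female_baby_female
p_pred_female_given_female = pred_female_baby_female / n_baby_female

print(f"P(baby female | predicted female) = 59/{n_pred_female} "
      f"= {p_female_given_pred_female:.3f}")
print(f"P(predicted female | baby female) = 59/{n_baby_female} "
      f"= {p_pred_female_given_female:.3f}")
print(f"P(repaint | predicted female)     = 12/{n_pred_female} = {p_repaint:.3f}")
```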
Taking it a bit further…
The paper also gave data for a second radiologist who made gender predictions based on 154 first trimester
ultrasound images.
Radiologist 2       Baby is Male   Baby is Female   Total
Predicted Male            81               7           88
Predicted Female           8              58           66
Total                     89              65          154
6. Does the chance of a correct gender prediction differ for the two radiologists?
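One way to compare the two radiologists is to compute each one's proportion of correct predictions from the tables. The helper below is a hypothetical illustration (the function name and list layout are assumptions, not something from the paper), and it compares the two estimates only, with no formal inference.

```python
def accuracy(table):
    """Proportion of correct predictions in a 2x2 table laid out as
    [[pred M & baby M, pred M & baby F],
     [pred F & baby M, pred F & baby F]]."""
    correct = table[0][0] + table[1][1]        # diagonal cells
    total = sum(sum(row) for row in table)     # all images
    return correct / total

radiologist_1 = [[74, 14], [12, 59]]
radiologist_2 = [[81, 7], [8, 58]]

acc_1 = accuracy(radiologist_1)   # 133 correct out of 159
acc_2 = accuracy(radiologist_2)   # 139 correct out of 154

print(f"Radiologist 1: {acc_1:.4f}")
print(f"Radiologist 2: {acc_2:.4f}")
```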
Suppose that these two radiologists both work in the same clinic and that the probability of a correct gender
prediction from a first trimester ultrasound image for each of the two radiologists is equal to the probability
previously computed: 0.837 for radiologist 1 and 0.903 for radiologist 2. Further, suppose that radiologist 1
works part-time and handles 30% of the ultrasounds for the clinic and that radiologist 2 handles the remaining
70%. Let’s investigate the following questions:
1. What is the probability that a gender prediction based on a first trimester ultrasound at this clinic is
correct?
2. If a first trimester ultrasound gender prediction is incorrect, what is the probability that the prediction
was made by radiologist 2?
To answer these questions, we can translate the given probability information into a “hypothetical 1000” table.
7. Use the information given about first trimester gender predictions at this clinic to complete the
following:
P(prediction is made by radiologist 1) = ____
P(prediction is made by radiologist 2) = ____
P(prediction is correct | prediction made by radiologist 1) = ____
P(prediction is correct | prediction made by radiologist 2) = ____
8. Think about a hypothetical set of 1000 gender predictions made at this clinic. How many of them
would you expect to have been made by Radiologist 1? How many by Radiologist 2? Enter these
counts into the total column of the table below.
                 Prediction correct   Prediction incorrect   Total
Radiologist 1         ____                  ____              ____
Radiologist 2         ____                  ____              ____
Total                 ____                  ____              1000
9. Now consider the predictions made by Radiologist 1. How many of them do you expect to be correct?
How many of them do you expect to be incorrect? Enter these counts into the Radiologist 1 row of the
table above.
10. Complete the table above by entering the counts for the Radiologist 2 row.
Now use the table above to answer the two questions posed earlier:
11. What is the probability that a gender prediction based on a first trimester ultrasound at this clinic is
correct?
12. If a first trimester ultrasound gender prediction is incorrect, what is the probability that the prediction
was made by radiologist 2?
Instructor note: Students who complete this activity have computed a simple probability and conditional probabilities,
and have used the approach of creating a hypothetical 1000 table to calculate a traditional “Bayes Rule” probability. No
formulas, and students find these calculations logical!
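The hypothetical 1000 table from steps 8 to 12 can also be built in a few lines of Python, which may help when checking student work. This is a sketch under the stated clinic assumptions (30%/70% workload, accuracies 0.837 and 0.903); the dictionary structure and names are illustrative, and the expected counts are left unrounded, so they can differ slightly from whole-number counts entered by hand.

```python
N = 1000  # size of the hypothetical set of predictions

share = {"Radiologist 1": 0.30, "Radiologist 2": 0.70}        # workload
p_correct = {"Radiologist 1": 0.837, "Radiologist 2": 0.903}  # accuracy

# Steps 8-10: fill in one row of the table per radiologist.
table = {}
for name in share:
    row_total = N * share[name]
    correct = row_total * p_correct[name]
    table[name] = {
        "correct": correct,
        "incorrect": row_total - correct,
        "total": row_total,
    }

# Column totals.
total_correct = sum(row["correct"] for row in table.values())
total_incorrect = sum(row["incorrect"] for row in table.values())

# Step 11: proportion of all predictions that are correct.
p_clinic_correct = total_correct / N

# Step 12: among incorrect predictions, the share made by Radiologist 2.
p_r2_given_incorrect = table["Radiologist 2"]["incorrect"] / total_incorrect

for name, row in table.items():
    print(f"{name}: correct {row['correct']:.1f}, "
          f"incorrect {row['incorrect']:.1f}, total {row['total']:.0f}")
print(f"P(correct)                   = {p_clinic_correct:.3f}")
print(f"P(radiologist 2 | incorrect) = {p_r2_given_incorrect:.3f}")
```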
Problem Settings
Using one of the following contexts, write a question involving conditional probability that you could assign in
your introductory statistics class and then use the hypothetical 1000 table approach to solve it.
Context 1 Soccer Penalty Kicks: The paper “Action Bias among Elite Soccer Goalkeepers: The Case of Penalty Kicks”
(Journal of Economic Psychology [2007]: 606–621) presents an interesting analysis of 286 penalty kicks in televised
championship soccer games from around the world. In a penalty kick, the only players involved are the kicker and
the goalkeeper from the opposing team. The kicker tries to kick a ball into the goal from a point located 11 meters
away. The goalkeeper tries to block the ball from reaching the goal. For each penalty kick analyzed, the researchers
recorded the direction that the goalkeeper moved (jumped to the left, stayed in the center, or jumped to the right)
and whether or not the penalty kick was successfully blocked. Consider the following events:
L = the event that the goalkeeper jumps to the left
C = the event that the goalkeeper stays in the center
R = the event that the goalkeeper jumps to the right
B = the event that the penalty kick is blocked
Based on their analysis of the penalty kicks, the authors of the paper gave the following probability estimates:
P(L) = .493     P(B | L) = .142
P(C) = .063     P(B | C) = .333
P(R) = .444     P(B | R) = .126
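As one illustration of the kind of question this context supports, consider: "Given that a penalty kick was blocked, what is the probability that the goalkeeper stayed in the center?" This question is a hypothetical example, not one posed in the paper. A minimal Python sketch of the hypothetical 1000 approach, with illustrative names and unrounded expected counts:

```python
N = 1000  # hypothetical number of penalty kicks

p_direction = {"L": 0.493, "C": 0.063, "R": 0.444}      # P(L), P(C), P(R)
p_blocked_given = {"L": 0.142, "C": 0.333, "R": 0.126}  # P(B | direction)

# Expected number of blocked kicks in each row of the hypothetical table.
blocked = {d: N * p_direction[d] * p_blocked_given[d] for d in p_direction}
not_blocked = {d: N * p_direction[d] - blocked[d] for d in p_direction}

total_blocked = sum(blocked.values())

p_blocked = total_blocked / N                         # P(B)
p_center_given_blocked = blocked["C"] / total_blocked  # P(C | B)

print(f"P(B)     = {p_blocked:.3f}")
print(f"P(C | B) = {p_center_given_blocked:.3f}")
```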
Context 2 Internet Addiction: Internet addiction has been defined by researchers as a disorder characterized by
excessive time spent on the Internet, impaired judgment and decision-making ability, social withdrawal, and
depression. The paper “The Association between Aggressive Behaviors and Internet Addiction and Online Activities
in Adolescents” (Journal of Adolescent Health [2009]: 598–605) reported on a study involving a large sample of
adolescents. Each participant in the study was assessed using the Chen Internet Addiction Scale to determine if he
or she suffered from Internet addiction. The following statements are based on the survey results:
1. 51.8% of the study participants were female and 48.2% were male.
2. 13.1% of the females suffered from Internet addiction.
3. 24.8% of the males suffered from Internet addiction.
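A sample question for this context might be: "If a randomly selected participant suffered from Internet addiction, what is the probability that the participant was female?" The question and the variable names below are hypothetical illustrations; the percentages are the three statements above. A brief Python sketch of the hypothetical 1000 table:

```python
N = 1000  # hypothetical number of study participants

n_female = N * 0.518
n_male = N * 0.482

# Expected counts in the "addicted" column of the hypothetical table.
addicted_female = n_female * 0.131
addicted_male = n_male * 0.248
total_addicted = addicted_female + addicted_male

p_addicted = total_addicted / N
p_female_given_addicted = addicted_female / total_addicted

print(f"P(addicted)          = {p_addicted:.3f}")
print(f"P(female | addicted) = {p_female_given_addicted:.3f}")
```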
Context 3 Correctness of Medical Diagnoses: The authors of the paper “Do Physicians Know when Their Diagnoses
Are Correct?” (Journal of General Internal Medicine [2005]: 334–339) presented detailed case studies to medical
students and to faculty at medical schools. Each participant was asked to provide a diagnosis in the case and also
to indicate whether his or her confidence in the correctness of the diagnosis was high or low. Define the events C,
I, and H as follows:
C = event that diagnosis is correct
I = event that diagnosis is incorrect
H = event that confidence in the correctness of the diagnosis is high
Data appearing in the paper were used to estimate the following probabilities for medical students:
P(C) = .261     P(I) = .739
P(H | C) = .739     P(H | I) = .073
Data from the paper were also used to estimate the following probabilities for medical school faculty:
P(C) = .495     P(I) = .505
P(H | C) = .537     P(H | I) = .252
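A natural question here is whether high confidence signals a correct diagnosis, that is, P(C | H), and whether this differs between the two groups. The function below is a hypothetical helper (its name and signature are assumptions for illustration) that reads P(C | H) off a hypothetical table, using the probabilities exactly as printed above.

```python
def p_correct_given_high(p_c, p_i, p_h_given_c, p_h_given_i, n=1000):
    """P(C | H) from a hypothetical table of n diagnoses."""
    high_and_correct = n * p_c * p_h_given_c     # row C, column H
    high_and_incorrect = n * p_i * p_h_given_i   # row I, column H
    return high_and_correct / (high_and_correct + high_and_incorrect)

# Probabilities as given in this handout for each group.
students = p_correct_given_high(0.261, 0.739, 0.739, 0.073)
faculty = p_correct_given_high(0.495, 0.505, 0.537, 0.252)

print(f"Medical students: P(C | H) = {students:.3f}")
print(f"Faculty:          P(C | H) = {faculty:.3f}")
```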
Context 4 Twitter Use: The report “Twitter in Higher Education: Usage Habits and Trends of Today’s College
Faculty” (Magna Publications, September 2009) describes results of a survey of nearly 2000 college faculty. The
report indicates the following:
• 30.7% reported that they use Twitter and 69.3% said that they did not use Twitter.
• Of those that use Twitter, 39.9% said they sometimes use Twitter to communicate with students.
• Of those that use Twitter, 27.5% said that they sometimes use Twitter as a learning tool in the classroom.
Context 5 Drug Testing: In an article that appears on the web site of the American Statistical Association
(www.amstat.org), Carlton Gunn, a public defender in Seattle, Washington, wrote about how he uses statistics in
his work as an attorney. He states:
I personally have used statistics in trying to challenge the reliability of drug testing results. Suppose the
chance of a mistake in the taking and processing of a urine sample for a drug test is just 1 in 100. And your
client has a “dirty” (i.e., positive) test result. Only a 1 in 100 chance that it could be wrong? Not
necessarily. If the vast majority of all tests given—say 99 in 100—are truly clean, then you get one false
dirty and one true dirty in every 100 tests, so that half of the dirty tests are false.
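Gunn's reasoning can be checked with a hypothetical table. The sketch below uses 10,000 tests rather than 1000 so that every count is a whole number, and it assumes (this is an assumption, not stated in the quote) that the 1 in 100 error rate applies equally to truly clean and truly dirty samples.

```python
N = 10_000  # hypothetical number of drug tests (chosen to give whole counts)

truly_clean = N * 99 // 100        # 99 in 100 samples are truly clean
truly_dirty = N - truly_clean

# Assumed: 1 in 100 tests gives the wrong result, whatever the true status.
false_dirty = truly_clean // 100            # clean samples that test dirty
true_dirty = truly_dirty - truly_dirty // 100  # dirty samples that test dirty

total_dirty_results = false_dirty + true_dirty
p_clean_given_dirty_result = false_dirty / total_dirty_results

print(f"False dirty results: {false_dirty}")
print(f"True dirty results:  {true_dirty}")
print(f"P(truly clean | dirty result) = {p_clean_given_dirty_result}")
```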