Simpson’s Paradox is the observation that the direction of a relationship can reverse between
looking at two data sets separately and looking at them combined. The lurking variable that
separates the two data sets is categorical (such as gender or good vs. poor condition).
A two-way table is how we compare two or more categorical variables. Typically these tables lead to
simple probability or conditional probability problems. Some samples are provided.
1) A bank offers both adjustable-rate and fixed-rate mortgage loans on residential properties, which are
classified into three categories: single-family houses, condominiums, and multifamily dwellings. Each
loan made in 2010 was classified according to type of mortgage and type of property, resulting in the
following table. Consider the chance experiment of selecting one of these 3,750 loans at random.
                 Adjustable   Fixed-Rate   Total
Single-Family         1,500          375   1,875
Condo                   788          377   1,165
Multifamily             337          373     710
Total                 2,625        1,125   3,750
a) Determine the probability that the selected loan will be for an adjustable-rate mortgage.
b) Determine the probability that the selected loan will be for a multifamily property.
c) Determine the probability that the selected loan will not be for a single-family property.
d) Determine the probability that the selected loan will be for a single-family property or a condo.
e) Determine the probability that the selected loan will be a fixed-rate loan for a condo.
f) Determine the probability that the selected loan will be a fixed-rate loan given that it is for a condo.
g) What is the big difference between questions e and f? We’ll explore this more when we get to
probability.
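As a minimal sketch of the arithmetic behind parts (a) through (f), the counts from the table can be stored in a dictionary and each probability computed as a count divided by the relevant total. The names `loans` and `prob` are illustrative assumptions, not part of the original problem.

```python
from fractions import Fraction

# Counts from the loan table: property type -> (adjustable, fixed-rate).
loans = {
    "single-family": (1500, 375),
    "condo": (788, 377),
    "multifamily": (337, 373),
}

# Grand total of all loans in the table.
total = sum(adj + fixed for adj, fixed in loans.values())


def prob(count, out_of=total):
    """Exact probability: favorable count over the relevant total."""
    return Fraction(count, out_of)


p_adjustable = prob(sum(adj for adj, _ in loans.values()))            # (a)
p_multifamily = prob(sum(loans["multifamily"]))                       # (b)
p_not_single = 1 - prob(sum(loans["single-family"]))                  # (c)
p_single_or_condo = prob(sum(loans["single-family"]) + sum(loans["condo"]))  # (d)
# (e) joint probability: fixed-rate condo loans out of ALL loans.
p_fixed_and_condo = prob(loans["condo"][1])
# (f) conditional probability: fixed-rate condo loans out of CONDO loans only.
p_fixed_given_condo = prob(loans["condo"][1], sum(loans["condo"]))

results = {
    "a": p_adjustable,
    "b": p_multifamily,
    "c": p_not_single,
    "d": p_single_or_condo,
    "e": p_fixed_and_condo,
    "f": p_fixed_given_condo,
}
for part, p in results.items():
    print(f"{part}) {p} = {float(p):.4f}")
```

Parts (e) and (f) use the same numerator, 377, and differ only in the denominator: all 3,750 loans for (e), but only the 1,165 condo loans for (f).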
2) The paper “Good for Women, Good for Men, Bad for People: Simpson’s Paradox and the Importance
of Sex-Specific Analysis in Observational Studies” (Journal of Women’s Health and Gender Based
Medicine [2001]: 867-872) described the results of a medical study in which one treatment was shown
to be better for men and better for women than a competing treatment. However, if the data for men
and women are combined, it appears as though the competing treatment is better. To see how this can
happen, consider the accompanying data tables constructed from information in the paper. Subjects in
the study were given either Treatment A or Treatment B, and their survival was noted.
The following table summarizes the data for men and women combined.
               Survived   Died   Total
Treatment A         215     85     300
Treatment B         241     59     300
Total               456    144     600
a) Determine the percentage of people who survived.
b) Determine the percentage of people who survived given that they took Treatment A.
c) Determine the percentage of people who survived given that they took Treatment B.
The following table summarizes the data for just men who participated in the study.
               Survived   Died   Total
Treatment A         120     80     200
Treatment B          20     20      40
Total               140    100     240
d) Determine the percentage of men who survived.
e) Determine the percentage of men who survived given that they took Treatment A.
f) Determine the percentage of men who survived given that they took Treatment B.
The following table summarizes the data for just women who participated in the study.
               Survived   Died   Total
Treatment A          95      5     100
Treatment B         221     39     260
Total               316     44     360
g) Determine the percentage of women who survived.
h) Determine the percentage of women who survived given that they took Treatment A.
i) Determine the percentage of women who survived given that they took Treatment B.
j) Explain which treatment is better for whom in this study, and identify any other features of the
study that stand out as possible explanations for these percentages.
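One way to check the percentages in parts (a) through (i) is to compute each survival rate from the counts in the men's and women's tables and then pool the two groups. This is a minimal sketch; the names `counts`, `survival_rate`, and `combined` are illustrative assumptions, not part of the original problem.

```python
# (survived, died) counts from the study tables, by sex and treatment.
counts = {
    "men": {"A": (120, 80), "B": (20, 20)},
    "women": {"A": (95, 5), "B": (221, 39)},
}


def survival_rate(survived, died):
    """Percentage of subjects who survived."""
    return 100 * survived / (survived + died)


def combined(treatment):
    """Pool men and women for one treatment; returns (survived, died)."""
    survived = sum(counts[group][treatment][0] for group in counts)
    died = sum(counts[group][treatment][1] for group in counts)
    return survived, died


# Rates within each group, then rates with the groups pooled.
for group, by_treatment in counts.items():
    for treatment, (survived, died) in by_treatment.items():
        rate = survival_rate(survived, died)
        print(f"{group}, Treatment {treatment}: {rate:.1f}% survived")

for treatment in ("A", "B"):
    rate = survival_rate(*combined(treatment))
    print(f"combined, Treatment {treatment}: {rate:.1f}% survived")
```

Within each group Treatment A has the higher survival rate, while in the pooled data Treatment B does. Comparing how many men and how many women received each treatment is one place to start when answering part (j).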