Assignment #3 - Winona State University

Assignment #3: Biometry - STAT 305 Spring 2014
Conditional Probabilities, Relative Risk, Odds Ratio, Mosaic Plots,
Correspondence Analysis, and Bayes' Rule & Medical Screening Tests.
1. Low Birth Weight Risk Factors (Lowbirth.JMP)
The purpose of this study was to identify potential risk factors for low birth weight. The
following categorical variables were measured: previous history of premature labor
(Prev?), hypertension during pregnancy (Hyper?), smoking, uterine irritability during
pregnancy (Uterine), and minority status (Minority).
a) For each of the FIVE potential risk factors, calculate P(Low|risk factor present) and
P(Low|risk factor absent). What do these tell you about each of the potential risk
factors? (8 pts.)
b) Use your answers in part (a) to calculate the relative risk (RR) associated with each
factor and interpret. (4 pts.)
c) Calculate the odds ratio (OR) associated with each characteristic. Discuss. (4 pts.)
d) Which factor do you think poses the greatest risk of having a child with low birth
weight? the least? Explain your answers. (3 pts.)
To answer part (d) complete the table shown below.

Risk Factor                            RR      OR      Rank
Smoked During Pregnancy                ____    ____    ____
History of Premature Labor             ____    ____    ____
Hypertensive During Pregnancy          ____    ____    ____
Mother is an Ethnic Minority           ____    ____    ____
Uterine Irritability During Pregnancy  ____    ____    ____
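
Parts (a) through (c) all start from a 2 X 2 table of risk factor (present/absent) by birth weight (low/normal). The arithmetic can be sketched in Python as follows. The counts are hypothetical placeholders, not values from Lowbirth.JMP, and the function name `risk_measures` is introduced here only for illustration.

```python
def risk_measures(low_present, normal_present, low_absent, normal_absent):
    """Conditional probabilities, relative risk, and odds ratio for a
    2 x 2 table of risk factor (present/absent) by birth weight (low/normal)."""
    p_present = low_present / (low_present + normal_present)  # P(Low | factor present)
    p_absent = low_absent / (low_absent + normal_absent)      # P(Low | factor absent)
    rr = p_present / p_absent                                 # relative risk
    odds_present = p_present / (1 - p_present)                # odds of Low, factor present
    odds_absent = p_absent / (1 - p_absent)                   # odds of Low, factor absent
    return {
        "p_present": p_present,
        "p_absent": p_absent,
        "rr": rr,
        "or": odds_present / odds_absent,
    }


# Hypothetical counts (NOT from the assignment data set)
result = risk_measures(low_present=30, normal_present=70,
                       low_absent=20, normal_absent=180)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The OR is always farther from 1 than the RR, and the two are close only when the outcome is rare in both groups, which is worth keeping in mind when comparing the RR and OR columns of the table.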
2. Myocardial Infarctions and Oral Contraceptive Use
(Case-Control Study)
CODING USED IN TABLES:
Case-Control Status
1 = Case (Myocardial Infarction (MI))
2 = Control
Oral Contraceptive Use?
1 = Yes
2 = No
Age Group
1 = 25 – 29 yrs., 2 = 30 – 34 yrs., 3 = 35 – 39 yrs., 4 = 40 – 44 yrs., 5 = 45 – 49 yrs.
Overall Table (aggregated across age group)
a) What is the OR for myocardial infarctions associated with oral contraceptive use for
all women in this study? Use the table above. Interpret the resulting OR. (3 pts.)
If we take age of the women into account using the ordinal age group variable
defined above, we obtain the following 2 X 2 tables relating MI status and OC
use status.
Age Specific Tables
[Five 2 X 2 tables of MI status by OC use, one each for Age Group = 1 through Age Group = 5, each with a blank for the estimated OR.]
b) For each of the age specific tables above calculate the OR for having a myocardial
infarction associated with being an oral contraceptive user. (5 pts.)
Age Group       1       2       3       4       5
Estimated OR    ____    ____    ____    ____    ____
c) When comparing these odds ratios to the overall odds ratio (ignoring age) from part (a)
what do we find? How can these results be explained and more importantly what does
this say about the risk for having an MI associated with oral contraceptive (OC) use?
Hint: Think about how age of the women would be related to oral contraceptive use
and with myocardial infarction status. Explain. (4 pts.)
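
The comparison asked for in part (c) can be explored numerically. The sketch below uses two made-up age strata (not the tables from this study) to show how the odds ratio from a table aggregated over strata can differ from every stratum-specific odds ratio when the stratifying variable is related to both exposure and outcome. The names `odds_ratio` and `strata` are illustrative assumptions.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2 x 2 case-control table:
    a = cases who used OC,    b = cases who did not,
    c = controls who used OC, d = controls who did not."""
    return (a * d) / (b * c)


# Hypothetical strata (NOT the study's data): in the younger stratum OC use
# is common and MI is rare; in the older stratum the reverse holds.
strata = {
    "younger": (10, 10, 100, 300),
    "older": (10, 90, 10, 270),
}

# Odds ratio within each stratum
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

# Aggregate the four cells across strata, then compute the overall OR
pooled_cells = tuple(sum(cells[i] for cells in strata.values()) for i in range(4))
pooled_or = odds_ratio(*pooled_cells)

for name, value in stratum_ors.items():
    print(f"{name}: OR = {value:.2f}")
print(f"pooled: OR = {pooled_or:.2f}")
```

With these invented counts both stratum-specific odds ratios are equal, yet the pooled odds ratio is much closer to 1, so the aggregated table understates the association seen within each stratum.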
3. HIV ELISA Tests
The enzyme-linked immunosorbent assay (ELISA) test was the main test used to screen
blood samples for antibodies to the HIV virus (rather than the virus itself) in 1985. It
gives a measured mean absorbance ratio for HIV (previously called HTLV) antibodies.
The table below gives the absorbance ratio values for 297 healthy blood
donors and 88 HIV patients. Healthy donors tend to give low ratios, but some are quite
high, partly because the test also responds to some other types of antibody, such as human
leucocyte antigen or HLA. HIV patients tend to give high ratios, but a few give lower
values because they have not been able to mount a strong immune reaction.
To use this test in practice we need a cutoff value so that those who fall below the value are
deemed to have tested negative and those above to have tested positive. Any such
cutoff will naturally involve misclassifying some people without HIV as having a
positive HIV test (which will be a huge emotional shock), and some people with HIV as
having a negative HIV test (with consequences to their own health, the health of people
around them, and the integrity of the blood bank, etc.).
MAR (mean absorb. ratio)    Healthy Donors    HIV Patients
<2                                     202               0
2 – 2.99                                73               2
3 – 3.99                                15               7
4 – 4.99                                 3               7
5 – 5.99                                 2              15
6 – 11.99                                2              36
12+                                      0              21
Total                                  297              88
a) If we regard a MAR value > 3 as a positive test result for having the HIV virus what
are the sensitivity, specificity, false-positive rate, and false-negative rate of the ELISA
test? (4 pts.)
Interesting fact: The Economist (July 4, 1992) told a story of a young American who
committed suicide on learning that he had tested positive for HIV.
b) In 1992 the number of Americans who were HIV positive was estimated to be
218,301 out of a population of 252.7 million. Assuming this estimate is correct, what
is the probability that a randomly selected American is HIV positive, i.e. P(D+)? (1 pt.)
c) Using the estimate from part (b) and provided with no other information about the
young American referred to above, what is the probability that they actually had HIV
given a positive ELISA test result, i.e. find the Positive Predictive Value P(D+|T+)?
(3 pts.)
d) If the ELISA test for a blood sample is negative what is the probability that the blood
sample is actually HIV free, i.e. find the Negative Predictive Value P(D-|T-)?
Again use your answer from part (b) in this calculation. (3 pts.)
e) If we changed the MAR cutoff value to > 4 what would happen to the following
probabilities in terms of an increase or decrease? You should calculate them and then
state whether they have increased or decreased. (6 pts.)
Sensitivity
Specificity
False-negative
False-positive
Positive-predictive value
Negative-predictive value
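
The calculations in parts (a) through (e) follow one pattern: pick a cutoff, count test positives in each group, then apply Bayes' rule with the prevalence from part (b). A minimal Python sketch is given here under two assumptions that are not stated in the problem: because the data are binned, "MAR > 3" is read as "the 3 – 3.99 bin and above" (and likewise for 4), and the function names `screening_rates` and `predictive_values` are illustrative.

```python
# Counts from the MAR table, in bin order: <2, 2-2.99, 3-3.99, 4-4.99, 5-5.99, 6-11.99, 12+
BIN_LOWER = [0, 2, 3, 4, 5, 6, 12]      # lower edge of each bin (0 stands in for "<2")
HEALTHY = [202, 73, 15, 3, 2, 2, 0]     # healthy donors, total 297
HIV = [0, 2, 7, 7, 15, 36, 21]          # HIV patients, total 88


def screening_rates(cutoff):
    """Sensitivity and specificity when every bin whose lower edge is
    >= cutoff counts as a positive test (assumed reading of 'MAR > cutoff')."""
    true_pos = sum(n for lo, n in zip(BIN_LOWER, HIV) if lo >= cutoff)
    false_pos = sum(n for lo, n in zip(BIN_LOWER, HEALTHY) if lo >= cutoff)
    sensitivity = true_pos / sum(HIV)             # P(T+ | D+)
    specificity = 1 - false_pos / sum(HEALTHY)    # P(T- | D-)
    return sensitivity, specificity


def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' rule."""
    ppv = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence))
    npv = (specificity * (1 - prevalence)) / (
        specificity * (1 - prevalence) + (1 - sensitivity) * prevalence)
    return ppv, npv


prevalence = 218_301 / 252.7e6  # P(D+) from the figures quoted in part (b)

for cutoff in (3, 4):
    sens, spec = screening_rates(cutoff)
    ppv, npv = predictive_values(sens, spec, prevalence)
    print(f"cutoff {cutoff}: sens={sens:.4f} spec={spec:.4f} "
          f"FN rate={1 - sens:.4f} FP rate={1 - spec:.4f} "
          f"PPV={ppv:.4f} NPV={npv:.6f}")
```

The false-negative rate is 1 minus sensitivity and the false-positive rate is 1 minus specificity, so only two rates need to be counted from the table for each cutoff.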
4. Pottery Fragments in an Archeological Dig
Data File: Arch-pottery.JMP in the Biometry JMP folder
Key Words: Bar graphs, Mosaic Plots, Conditional Probabilities, Correspondence
Analysis
Review the JMP tutorial Bivariate Displays for Categorical Data… before beginning
this problem.
The purpose is to understand the distribution of certain pottery types within seven
different archeological dig sites.
The variables in the data file are:
Site - dig site (P0, P1, ..., P6)
Pottery Type - A, B, C, or D
Freq - # of pottery type fragments found within each site
a) By using the Distribution of Y option examine univariate displays for both Site and
Pottery Type. Briefly summarize what is displayed in each. (2 pts.)
b) Use the Fit Y by X option to examine the relationship between Site (X) and Pottery
Type (Y). Use the mosaic plot and correspondence analysis to examine this relationship.
Discuss. (3 pts.)
c) Using JMP calculate all of the conditional probabilities of the form:
P( Pottery Type | Site)
These are easily obtained by simply looking at the Row %’s found in the contingency
table. Use these probabilities to compare and contrast the sites in terms of the pottery
types found in each. Discuss them briefly. How do these probabilities relate to the results
from your graphical analysis in part (b)? (3 pts.)
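
The Row %'s in part (c) are each cell frequency divided by its site total. As a cross-check outside JMP, the same quantity can be computed from records in the Site / Pottery Type / Freq layout. The sketch below uses invented frequencies for two sites only; none of the numbers come from Arch-pottery.JMP.

```python
from collections import defaultdict

# Hypothetical (Site, Pottery Type, Freq) records in the data file's layout
# (NOT the actual values in Arch-pottery.JMP)
records = [
    ("P0", "A", 30), ("P0", "B", 10), ("P0", "C", 10), ("P0", "D", 0),
    ("P1", "A", 5), ("P1", "B", 5), ("P1", "C", 20), ("P1", "D", 10),
]

# Total number of fragments found at each site
site_totals = defaultdict(int)
for site, _ptype, freq in records:
    site_totals[site] += freq

# P(Pottery Type | Site) = Freq / site total, i.e. the row proportion
row_prop = {
    (site, ptype): freq / site_totals[site]
    for site, ptype, freq in records
}

for (site, ptype), p in sorted(row_prop.items()):
    print(f"P({ptype} | {site}) = {p:.3f}")
```

Within each site the conditional probabilities sum to 1, which is a quick way to confirm that row percentages rather than column percentages are being read.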