Homework 5

PubH 6414 Fall 2011 Homework 5 (20 points)
We encourage you to work together in computing and discussing the problems.
However, each student is expected to independently write up the submitted
assignment using her or his own computing and giving explanations in her or his
own words. Identical or nearly identical homework submissions will not receive
credit.



Turn in this completed Word document in class by the homework due date.
You may use R Commander to do the calculations needed for each question. Paste in ONLY the
parts of the output needed to answer the question. (You may use another statistical software
package to do the calculations, if you prefer, but the instructor and TAs cannot provide
assistance with other packages.)
Data needed for this homework assignment are available at:
http://www.biostat.umn.edu/~susant/FALL11PH6414HMK.html
Problem 1: Multiple Choice Questions. (2 points)
Part A. Let's suppose that the body temperatures of healthy children are normally distributed with
mean = 98.6 F and standard deviation = 0.7 F. (Made up numbers.) Suppose further that a child will be
sent home from school if their temperature is greater than 100.0 F. Which statement below describes the
probability that a randomly chosen healthy child will have a body temperature above 100.0 F?
a. The area under the standard normal curve between z = 0 and z = 2.0.
b. The area under the standard normal curve to the left of z = + 2.0.
c. The area under the standard normal curve to the right of z = + 2.0.
d. The area under the standard normal curve in between z = -2.0 and z = +2.0.
e. The area under the standard normal curve to the left of z = - 2.0.
Part B. Answer the following questions with ‘Yes’, ‘No’, or ‘It Depends’.
B1. The probability of 'success' in the binomial distribution can vary from one trial to the next.
B2. All probability distributions are symmetric about the mean.
B3. The area under the curve of a normal distribution can be interpreted as probability.
B4. The normal approximation to the binomial distribution can be used when the number of
trials, N, is large.
B5. The area under a normal curve to the right of the mean is 0.5.
B6. The probability of being within 2 standard deviations of the mean in a normal distribution
is 0.68 or 68%.
B7. All normal distributions have mean = 0 and standard deviation = 1.
B8. The binomial distribution only applies if the events are independent.
B9. The total area under the curve of a probability distribution is 1.0.
Problem 2: Serum Sodium Levels. (4 points)
The values of serum sodium in normal healthy adults approximately follow a normal distribution with
a mean = 141 mEq/L and a known standard deviation (sigma) = 3 mEq/L.
A. Use the R Commander menu options to plot this distribution. Paste the plot here.
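For those who prefer typed commands to menus, a normal density can also be drawn in base R. The following is only a sketch: the mean and standard deviation are hypothetical values chosen for illustration, not the values from this problem, and the plot R Commander produces may be styled differently.

```r
# Hypothetical parameters for illustration only (not the homework values).
mu <- 100
sigma <- 15

# Grid covering four standard deviations on either side of the mean.
x <- seq(mu - 4 * sigma, mu + 4 * sigma, length.out = 401)

# Height of the normal density at each grid point.
dens <- dnorm(x, mean = mu, sd = sigma)

# Draw the density as a smooth line.
plot(x, dens, type = "l",
     xlab = "x", ylab = "Density",
     main = "Normal distribution (mean = 100, sd = 15)")
```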
Use the pnorm function (or R Commander menu options) to answer the following questions. (Assume
that probabilities refer to randomly selected healthy adults.) Please give the pnorm formula you used
(or the formula R Commander used) as well as the result.
B. What is the probability that a normal healthy adult will have a serum sodium value > 147
mEq/L?
C. What is the probability that a normal healthy adult will have a serum sodium value < 130
mEq/L?
D. What is the probability that a normal healthy adult will have a serum sodium value between
132 and 150 mEq/L?
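As a reminder of how pnorm handles the three kinds of question above (below a value, above a value, between two values), here is a minimal sketch. The distribution and cutoffs are hypothetical and are not the serum sodium values.

```r
# Hypothetical normal distribution for illustration only.
mu <- 100
sigma <- 15

# P(X < 85): pnorm returns the lower-tail (left) area by default.
p_below <- pnorm(85, mean = mu, sd = sigma)

# P(X > 115): request the upper-tail (right) area.
p_above <- pnorm(115, mean = mu, sd = sigma, lower.tail = FALSE)

# P(85 < X < 115): difference of two lower-tail areas.
p_between <- pnorm(115, mean = mu, sd = sigma) - pnorm(85, mean = mu, sd = sigma)

print(c(below = p_below, above = p_above, between = p_between))
```

Because 85 and 115 are each exactly one standard deviation from the mean in this example, the two tail areas are equal and the middle area is about 0.68.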
Use the qnorm function to answer the following questions. Please give the qnorm formula you used (or
the formula R Commander used) as well as the result.
E. What serum sodium level is necessary to put someone in the top 1% of the distribution?
(Hint: 1% of the population have serum sodium levels greater than or equal to this value.)
F. What serum sodium level is necessary to put someone in the bottom 10% of the distribution?
(Hint: 10% of the population have serum sodium levels less than or equal to this value.)
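qnorm is the inverse of pnorm: it takes an area and returns the cutoff value. The sketch below uses a hypothetical distribution and hypothetical percentages, not those asked about in this problem.

```r
# Hypothetical normal distribution for illustration only.
mu <- 100
sigma <- 15

# Cutoff for the top 5%: the value with 5% of the area to its right.
top_5_cutoff <- qnorm(0.05, mean = mu, sd = sigma, lower.tail = FALSE)

# Cutoff for the bottom 25%: the value with 25% of the area to its left.
bottom_25_cutoff <- qnorm(0.25, mean = mu, sd = sigma)

# Check: feeding a cutoff back into pnorm recovers the original area.
check_top <- pnorm(top_5_cutoff, mean = mu, sd = sigma, lower.tail = FALSE)

print(c(top_5 = top_5_cutoff, bottom_25 = bottom_25_cutoff, check = check_top))
```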
Problem 3: Birth Weights. (4 points)
A cohort study was carried out involving 18,665 Caucasian infants born at Montreal's Royal Victoria
Hospital from January 1978 to March 1990. The birth weights of those infants were normally
distributed with a mean =3.369 kg and standard deviation = 0.567 kg. (Reference: Shi Wu Wen,
Michael S. Kramer, and Robert H. Usher, Comparison of Birth Weight Distributions between Chinese
and Caucasian Infants, Am. J. Epidemiol. 1995; 141: 1177-1187.)
A. Use the R Commander menu options to plot this distribution. Paste the plot here.
Assume that probabilities refer to a randomly selected newborn child from the same population
sampled for this cohort study. Use the pnorm and qnorm functions (or the corresponding menu options
in R Commander) to answer the following questions. Please give the pnorm or qnorm formula you (or
R Commander) used as well as the result.
B. What is the probability that a randomly selected newborn child will weigh less than 3 kg?
C. What is the probability that a randomly selected newborn child will weigh more than 4 kg?
D. What is the probability that a randomly selected newborn child will weigh between 2.5 kg
and 3.5 kg?
E. How many of these 18,665 children would you expect to have had a birth weight between
4.5 kg and 5 kg? (Note: This is looking for a count, not a proportion.)
F. What is the necessary birth weight to put a randomly selected newborn child in the top 10%
of the distribution? (Hint: 10% of the population have birth weights greater than or equal to this
value.)
G. What is the necessary birth weight to put a randomly selected newborn child in the lower
20% of the distribution? (Hint: 20% of the population have birth weights less than or equal to
this value.)
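An expected count is the group size multiplied by the probability of falling in the interval. A minimal sketch of that step, using a hypothetical group size and a hypothetical distribution rather than the birth weight figures:

```r
# Hypothetical values for illustration only.
n_people <- 1000
mu <- 100
sigma <- 15

# Probability that one randomly chosen value lies between 115 and 130.
p_interval <- pnorm(130, mean = mu, sd = sigma) - pnorm(115, mean = mu, sd = sigma)

# Expected count = group size times probability.
expected_count <- n_people * p_interval

print(expected_count)
```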
Problem 4: Skin Grafts. (5 points)
A plastic surgeon wants to compare the number of successful skin grafts in her series of burn patients
with the number in other burn patients. A literature survey indicates that approximately 30% of the
grafts become infected but that 80% of the grafts survive. She has had 7 of 8 skin grafts survive in her
series of patients and has had one infection. (Note: This problem concerns the probabilities of two
distinct events: graft infection, and graft survival.)
A. Based on the literature survey and this surgeon's sample size, the number of graft infections is
distributed Binomial(8, 0.3). Use the R Commander menu options to tabulate and then to plot this
distribution. Paste both the table and the plot here.
B. Based on either the table or the plot, what is the most likely number of graft infections in the 8 skin
grafts performed by this surgeon?
C. How likely is this surgeon’s experience of having only 1 infection out of 8 skin grafts? (Phrased
differently, what is the probability of having only 1 infection out of 8 skin grafts?)
D. How likely is it that this surgeon would have experienced 2 or more infections in the 8 skin grafts
she performed?
E. How likely is it that all 8 of this surgeon’s skin grafts would become infected?
F. Based on the literature survey and this surgeon's sample size, how is the number of surviving
grafts distributed? (Note: No calculations are needed to answer this question.)
G. Use the R Commander menu options to tabulate and then to plot the distribution of graft survival.
Paste both the table and the plot here.
H. Using either the table or the plot, what is the most likely number of skin grafts that will survive out
of the 8 performed by this surgeon?
I. How likely is this surgeon’s experience of having exactly 7 graft survivals out of 8 skin grafts?
J. How likely is it that all 8 of this surgeon’s skin grafts would survive?
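The questions in this problem map onto two R functions: dbinom gives the probability of exactly k successes, and pbinom gives cumulative probabilities. The sketch below uses a hypothetical Binomial(10, 0.5) distribution, not the distributions in this problem, and the base R plot is only one possible equivalent of the R Commander menu output.

```r
# Hypothetical binomial distribution for illustration only.
n <- 10
p <- 0.5
k <- 0:n

# Probability of each possible number of successes.
pmf <- dbinom(k, size = n, prob = p)
binom_table <- data.frame(k = k, probability = round(pmf, 4))
print(binom_table)

# Vertical-line plot of the distribution.
plot(k, pmf, type = "h", xlab = "Number of successes", ylab = "Probability")

# Most likely number of successes (largest probability in the table).
most_likely <- k[which.max(pmf)]

# P(X = 3): exactly 3 successes.
p_exactly_3 <- dbinom(3, size = n, prob = p)

# P(X >= 3) = P(X > 2), so the cutoff passed to pbinom is 2, not 3.
p_at_least_3 <- pbinom(2, size = n, prob = p, lower.tail = FALSE)

# P(X = n): every trial is a success.
p_all <- dbinom(n, size = n, prob = p)
```

Note the cutoff in the "at least" calculation: with lower.tail = FALSE, pbinom returns P(X > q), so "3 or more" needs q = 2.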
Problem 5: Schizophrenia. (5 points)
According to the DSM-IV, approximately 1 percent of people in all cultures in a given year have
schizophrenia. Use this result and the normal approximation to the binomial distribution to answer
the following questions for a random sample of 3500 Americans.
A. Is the normal approximation to the binomial distribution appropriate? Give a reason for your
answer.
B. Calculate the mean and standard deviation of the normal approximation to this binomial
distribution. Please show your work.
C. Using the normal approximation, what is the probability that 40 or more of the 3500 people were
affected by schizophrenia, P(X ≥ 40)? Please show the R formula you (or R Commander) used as well
as the result.
D. Using the normal approximation, what is the probability that 29 or fewer of the 3500 people were
affected by schizophrenia, P(X ≤ 29)? Please show the R formula you (or R Commander) used as well
as the result.
E. Using the normal approximation, what is the probability that between 30 and 40 of the 3500 people
were affected by schizophrenia? Please show the R formula you (or R Commander) used as well as the
result.
F. Ignoring the normal approximation, how is the number of people with schizophrenia in this sample distributed?
G. Use the R Commander menu options to plot this binomial distribution. Paste the plot here. Does it
look approximately normal to you?
H. Using the pbinom function (or the R Commander menu options), what is the probability that 40 or
more of the 3500 people were affected by schizophrenia, P(X ≥ 40)? How does this compare to your
answer in Part C above?
I. Using the pbinom function (or the R Commander menu options), what is the probability that 29 or
fewer of the 3500 people were affected by schizophrenia, P(X ≤ 29)? How does this compare to your
answer in Part D above?
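The normal approximation replaces a Binomial(n, p) distribution with a normal distribution that has mean n*p and standard deviation sqrt(n*p*(1-p)). The sketch below compares the approximate and exact upper-tail probabilities for a hypothetical n and p, not the values in this problem. The adequacy check shown (n*p and n*(1-p) both at least 5) is one common rule of thumb and is an assumption here; use the criterion given in the course notes. No continuity correction is applied, since this assignment does not say whether one is expected.

```r
# Hypothetical values for illustration only.
n <- 400
p <- 0.1

# One common rule of thumb for the approximation (an assumption, see above).
rule_ok <- (n * p >= 5) && (n * (1 - p) >= 5)

# Mean and standard deviation of the approximating normal distribution.
approx_mean <- n * p
approx_sd <- sqrt(n * p * (1 - p))

# Normal approximation to P(X >= 50), without continuity correction.
p_approx <- pnorm(50, mean = approx_mean, sd = approx_sd, lower.tail = FALSE)

# Exact binomial P(X >= 50) = P(X > 49), so the cutoff passed to pbinom is 49.
p_exact <- pbinom(49, size = n, prob = p, lower.tail = FALSE)

print(c(mean = approx_mean, sd = approx_sd, approx = p_approx, exact = p_exact))
```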