
Stat 502
Homework 2
Assigned 10/11/07
Due 10/18/07
1. Sample size: A researcher needs to estimate the mean circumference of a
population of small trees in the Cascade mountains. The researcher would like to
construct a confidence interval for the mean, and wants this interval to contain the true
mean with probability at least .95 and would like to estimate the mean with a precision
of about 2 cm, i.e. the length of the confidence interval should be no more than 2 cm.
Studies from other regions suggest the standard deviation in circumference is about 6.5
cm. How many trees do you recommend the researcher sample? What assumptions are
you making for this recommendation? Describe mathematically the relation between the
number of samples required and the precision.
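One possible starting point for Problem 1, as a rough sketch rather than a model answer: if the measurements are treated as approximately normal and 6.5 cm is taken as a known standard deviation, the interval has length $2 z_{1-\alpha/2}\,\sigma/\sqrt{n}$, so a length of at most $L$ needs $n \ge (2 z_{1-\alpha/2}\,\sigma / L)^2$. The function name and arguments below are illustrative and not part of the assignment.

```python
import math
from statistics import NormalDist

def sample_size_for_ci_length(sigma, length, conf=0.95):
    """Smallest n for which a known-sigma normal CI has total length <= `length`.

    Assumes (hypothetically) that sigma is known and the sample mean is normal.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided critical value
    return math.ceil((2 * z * sigma / length) ** 2)

n_trees = sample_size_for_ci_length(sigma=6.5, length=2.0)
print(n_trees)

# Halving the target length roughly quadruples the required sample size.
print(sample_size_for_ci_length(sigma=6.5, length=1.0))
```

Because the standard deviation would in practice be estimated, a t-based interval would call for a somewhat larger sample than this approximation suggests.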
2(*). Null distribution of p-values: Recall that a p-value is a function of the data, and
so before the experiment is run it is a random variable.
(a) Consider the one-sample t-test for evaluating evidence against $H_0: \mu = \mu_0$. Derive
the distribution of the p-value under the null hypothesis. Using the result, show that the
type I error of a level-$\alpha$ test is $\alpha$.
(b**) Find the null distribution of the p-value for a generic hypothesis test.
(c*) You expect a small p-value when the null hypothesis is false. What does this tell
you about the shape of the distribution of the p-value when the null hypothesis is false,
in comparison with your result for part (a)?
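A quick simulation can serve as a sanity check on whatever distribution is derived in Problem 2. The sketch below is not the derivation itself, and it swaps the one-sample t-test for a two-sided z-test with known standard deviation, because the Python standard library has no t-distribution CDF. The sample size, number of replications and seed are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean

def simulate_null_pvalues(n=20, reps=5000, mu0=0.0, sigma=1.0, seed=1):
    """Two-sided z-test p-values for data generated with H0 true (known sigma)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pvals = []
    for _ in range(reps):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
        pvals.append(2 * (1 - std_normal.cdf(abs(z))))
    return pvals

pvals = simulate_null_pvalues()
alpha = 0.05
rejection_rate = sum(p <= alpha for p in pvals) / len(pvals)
print(round(rejection_rate, 3))   # empirical type I error rate
print(round(fmean(pvals), 3))     # average p-value under H0
```

Changing the mean used to generate the data while still testing against `mu0` gives an empirical look at part (c).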
3. Randomization test versus t-test: A chemist is testing the reaction time of a set of
cells to two different, but similar chemical compounds A and B. A set of 16 cell cultures
are randomized so that 8 receive A and 8 receive B. The reaction times in seconds are
recorded.
(a) Make a histogram and boxplots for each of the two groups. Comment on the
differences.
(b) Compute the means and medians of each group. Comment on the differences.
(c) Plot the density of the appropriate t-distribution if one were to use the ordinary
two-sample t-test to evaluate differences between the groups. Obtain the corresponding
p-value. Write down the assumptions which validate the use of this p-value, and
comment on whether or not they are met for these data.
(d) Make a histogram of the randomization distribution of the t-statistic, and compute the
corresponding p-value. Write down the assumptions which validate the use of this
p-value, and comment on whether or not they are met for these data.
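The randomization distribution in Problem 3(d) can be obtained by enumerating every way of assigning 8 of the 16 cultures to treatment A and recomputing the t-statistic each time. The sketch below shows one way to do that. The reaction times in it are hypothetical placeholders, not the assignment's data, and the helper names are illustrative.

```python
from itertools import combinations

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def pooled_t(a, b):
    """Two-sample t-statistic with pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * sample_var(a) + (nb - 1) * sample_var(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

def randomization_test(a, b):
    """Return (observed t, all re-randomized t values, two-sided p-value)."""
    pooled = list(a) + list(b)
    n = len(pooled)
    t_obs = pooled_t(a, b)
    t_null = []
    for idx in combinations(range(n), len(a)):  # every possible assignment to A
        in_a = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(n) if i not in in_a]
        t_null.append(pooled_t(group_a, group_b))
    # Small tolerance so the observed assignment always counts as "as extreme".
    extreme = sum(abs(t) >= abs(t_obs) - 1e-12 for t in t_null)
    return t_obs, t_null, extreme / len(t_null)

# HYPOTHETICAL placeholder reaction times (seconds); replace with the real data.
times_a = [3.1, 2.8, 3.6, 4.0, 2.5, 3.3, 3.9, 2.9]
times_b = [3.8, 4.4, 3.5, 5.1, 4.7, 3.6, 4.9, 4.2]

t_obs, t_null, p_value = randomization_test(times_a, times_b)
print(len(t_null), round(t_obs, 3), round(p_value, 4))
```

The list `t_null` holds the values to histogram. With 8 cultures per group the full enumeration covers 12870 assignments, so no Monte Carlo sampling is needed.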
4. Power calculation: Researchers are planning a clinical trial for testing a drug to
treat ALS (Amyotrophic Lateral Sclerosis). They will randomize $n_A$ ALS patients to the
standard treatment (A) and $n_B = n_A$ patients to the new drug (B). The response $Y_i$ will be
the change in muscle score from time of enrollment into the study to one year
post-enrollment. Previous studies suggest that $\mu_A = -1.3$ and $\sigma_A = 0.89$, indicating
that on average for these patients, muscle score declines.
(a) It is thought that the new drug will give a mean response ranging from −0.5 to 0.5, i.e. a
smaller decline in muscle score. Compute a power curve, as a function of $n_B = n_A$, for
each of $\mu_B \in \{-0.5, 0, 0.5\}$ using a type I error rate of $\alpha = 0.05$. Find the
sample size for each $\mu_B$ value so that the power is 0.75.
(b) Suppose that, unknown to the researchers, the variances are unequal, and
that $\sigma_B = 0.4$. Via simulation, normal approximation or otherwise, compute the actual
power of the test under the three sample sizes and values of $\mu_B$ you found in (a),
assuming that the researchers use the standard two-sample t-test.
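For Problem 4, the normal-approximation route might look like the sketch below. It reads the garbled symbols as $\mu_A = -1.3$, $\sigma_A = 0.89$ and $\sigma_B = 0.4$ (standard deviations, which is an assumption), and it uses a large-sample approximation rather than the exact noncentral t calculation. With equal group sizes $n$, the difference in sample means has standard error $\sqrt{(\sigma_A^2 + \sigma_B^2)/n}$, which is also what the pooled t-test's denominator estimates. Function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def approx_power(n, delta, sd_a, sd_b, alpha=0.05):
    """Normal-approximation power of a two-sided two-sample test, n per group."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    se = ((sd_a ** 2 + sd_b ** 2) / n) ** 0.5   # SE of the mean difference
    shift = abs(delta) / se
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)

def smallest_n(delta, sd_a, sd_b, target=0.75, alpha=0.05, n_max=10_000):
    """Smallest per-group n whose approximate power reaches `target`."""
    for n in range(2, n_max + 1):
        if approx_power(n, delta, sd_a, sd_b, alpha) >= target:
            return n
    return None

MU_A, SD_A = -1.3, 0.89   # values stated in the problem
SD_B_ACTUAL = 0.4         # part (b), read here as a standard deviation

for mu_b in (-0.5, 0.0, 0.5):
    delta = mu_b - MU_A
    n = smallest_n(delta, SD_A, SD_A)                      # part (a): equal SDs
    power_b = approx_power(n, delta, SD_A, SD_B_ACTUAL)    # part (b): actual SDs
    print(mu_b, n, round(power_b, 3))
```

Evaluating `approx_power` over a range of `n` gives the power curve. For the very small sample sizes that arise with large effects, the normal approximation is optimistic, so a t-based calculation or a simulation would be a useful cross-check.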