Stat 500 Midterm 1 – 8 October 2009 page 1 of 7

Please put your name on the back of your answer book.
Do NOT put it on the front. Thanks.
Do not start until I tell you to.
• The exam is closed book, closed notes. Use only the formula sheet and tables I provide today.
You may use a calculator.
• Write your answers in your blue book. Ask if you need a second (or third) blue book.
• You have 2 hours (120 minutes) to complete the exam.
Stop working when the end of the exam is announced.
• Points are indicated for each question. There are 130 total points.
• Important reminders:
– budget your time. Some parts of each question should be easy; others may be hard. Make
sure you do all parts you can.
– notice that some parts do not require any computations.
– show your work neatly so you can receive partial credit.
• Good luck!
1. 20 pts. The following problem is based on an engineering study comparing two methods (A or
B) for one step in making a component. The response is the length of time required for that
step. Thirty components were used in the study: 20 were randomly assigned to method A; the
remaining 10 used method B. Here are summary data for each group.
                 Time required
Method   n_i   Average   s.d.
A        20    14.65     3.20
B        10     7.34     1.62
In studies like these, the ratio between the two times is the most reasonable way to summarize
the treatment effect. The investigators expect that method A will take longer. They want to
know if the ratio of means (Method A / Method B) is larger than 1.5.
(a) 5 pts. Is this a survey, an observational study or an experimental study? Briefly explain.
(b) 5 pts. The investigators used a randomization test to test the null hypothesis that the
ratio of the two means equals 1.5. The test statistic is the ratio of the two sample averages
(i.e. Method A average / Method B average). Below are histograms of the randomization
distribution under H0: ratio = 1.5 and of the bootstrap distribution. Below each figure are
the largest 20 and smallest 20 observations in order. There are a total of 999 values in the
randomization distribution and 1000 values in the bootstrap distribution.
[Figure: two histograms, titled "Randomization Distribution" and "Bootstrap Distribution"; x-axis: Ratio, y-axis: Frequency]
              Randomization                   Bootstrap
Smallest 20   1.002 1.051 1.051 1.056 1.065   1.533 1.535 1.541 1.543 1.551
              1.080 1.081 1.095 1.098 1.098   1.552 1.558 1.564 1.568 1.576
              1.102 1.103 1.105 1.107 1.111   1.578 1.582 1.583 1.586 1.586
              1.128 1.133 1.136 1.138 1.141   1.589 1.594 1.600 1.605 1.605
Largest 20    2.415 2.129 2.129 2.121 2.079   2.338 2.308 2.265 2.265 2.247
              2.074 2.055 2.054 2.031 2.023   2.246 2.245 2.236 2.227 2.220
              2.019 2.018 2.017 2.012 1.995   2.212 2.203 2.203 2.197 2.192
              1.992 1.985 1.979 1.979 1.966   2.191 2.186 2.185 2.184 2.180
What is the p-value for a one-sided randomization test of ratio = 1.5? Note: the alternative
hypothesis is HA : ratio > 1.5.
(c) 5 pts. The standard error of the estimated ratio is 0.0946. Calculate the t-statistic to test
the hypothesis that the ratio = 1.5.
(d) 5 pts. Is it reasonable to use t-distributions on the raw data (e.g. what you did in part 1c)
to approximate design-based inferences about the ratio of Method A to Method B? Explain
why or why not.
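The counting step behind a one-sided randomization p-value such as the one in part 1b can be sketched as follows. This is a minimal Python sketch, not the grading key: it assumes the common convention of counting the observed statistic as one additional randomization, (count + 1) / (999 + 1), and the variable names are illustrative.

```python
# Observed test statistic: ratio of the two sample averages (Method A / Method B).
observed_ratio = 14.65 / 7.34

# Largest 20 of the 999 values in the randomization distribution, as listed above.
largest_randomization = [
    2.415, 2.129, 2.129, 2.121, 2.079,
    2.074, 2.055, 2.054, 2.031, 2.023,
    2.019, 2.018, 2.017, 2.012, 1.995,
    1.992, 1.985, 1.979, 1.979, 1.966,
]
n_randomizations = 999

# The smallest listed value lies below the observed ratio, so every
# randomization value at least as large as the observed one is in this list.
assert min(largest_randomization) < observed_ratio

n_as_extreme = sum(1 for r in largest_randomization if r >= observed_ratio)

# Assumed convention: the observed arrangement counts as one more randomization.
p_value = (n_as_extreme + 1) / (n_randomizations + 1)
print(f"observed ratio = {observed_ratio:.3f}, "
      f"values >= observed = {n_as_extreme}, p = {p_value:.3f}")
```

Only the upper tail is counted because the alternative is HA: ratio > 1.5.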
2. 40 pts. A study evaluated an experimental genetic treatment thought to increase litter size
(the number of offspring) of sows (female pigs). 30 sows were chosen from a pig research facility.
10 were randomly assigned to receive the genetic treatment; the remaining 20 served as controls.
The size of each sow’s first litter was recorded. The data are summarized below.
          Control   Experiment
average   10.3      10.7
s.d.       2.4       1.5
n         20        10
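Summary statistics of this kind are enough to form a pooled two-sample t statistic. The following is a minimal Python sketch, assuming equal population variances and independent observations (assumptions of the sketch, not facts stated about the study); the variable names are illustrative.

```python
import math

# Summary statistics from the table above.
n_control, mean_control, sd_control = 20, 10.3, 2.4
n_treated, mean_treated, sd_treated = 10, 10.7, 1.5

# Pooled variance (assumes the two populations share a common variance).
df = n_control + n_treated - 2
pooled_var = ((n_control - 1) * sd_control**2
              + (n_treated - 1) * sd_treated**2) / df

# Standard error of the difference in means, then the t statistic.
se_diff = math.sqrt(pooled_var * (1 / n_control + 1 / n_treated))
t_stat = (mean_treated - mean_control) / se_diff

print(f"df = {df}, pooled sd = {math.sqrt(pooled_var):.3f}, "
      f"se = {se_diff:.3f}, t = {t_stat:.2f}")
```

The statistic would then be compared with a t distribution on the printed degrees of freedom, using the tables provided.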
(a) 5 pts. Which average litter size (control or experimental group) is more precisely known?
Make as few assumptions as possible for this part. Explain.
(b) 10 pts. Use a two-tailed t-test to test the null hypothesis of no effect of the experimental
genetic treatment on the mean litter size. Please report your test statistic, an approximate
p-value, and a one-sentence conclusion.
(c) 10 pts. List the assumptions underlying the t-test used in question 2b. For each assumption,
list one reasonable method of assessing the validity of that assumption.
(d) 5 pts. There is evidence that litter size is partly controlled by genetic factors so that sows
with the same mother will tend to have similar litter sizes. Describe a modification of
the experimental design (leaving the sample sizes the same) that is likely to increase the
precision of the estimated difference between the population means. Explain.
(e) 5 pts. In a related study of the same 30 sows, another researcher measured 500 genetic
markers in each sow. For each marker, she did a t-test comparing the mean litter size of
sows with the marker and sows without. The smallest 15 p-values, and some additional
calculations, are shown in the table below.
Marker #    i   p-value     p * i/500    p * 500/i   p * 500
427         1   0.0000036   < 0.000001   0.0018       0.0018
157         2   0.0000059   < 0.000001   0.0015       0.0029
10          3   0.0000065   < 0.000001   0.0011       0.0032
495         4   0.00013     0.0000010    0.016        0.065
86          5   0.00020     0.0000020    0.020        0.10
259         6   0.00062     0.0000074    0.052        0.31
121         7   0.00064     0.0000089    0.046        0.32
208         8   0.0024      0.000038     0.15         1.29
332         9   0.0069      0.00012      0.38         3.42
466        10   0.0095      0.00019      0.47         4.81
463        11   0.050       0.0011       2.29        25.19
99         12   0.058       0.0014       2.43        29.16
110        13   0.073       0.0019       2.83        36.79
122        14   0.083       0.0023       2.98        41.72
254        15   0.086       0.0025       2.87        43.05
Which of these 15 markers would be considered significant at a false discovery rate of 5%
= 0.05? If you wish, you are welcome to say “all 15”, or “all except 466” instead of listing
most of the markers.
(f) 5 pts. Which of these 15 markers would be considered significant at an experiment-wise
error rate of 0.05 if you used a Bonferroni correction to adjust for testing 500 hypotheses?
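The two criteria in parts (e) and (f) can be applied to the tabulated p-values along these lines. This is a sketch only: it assumes the false discovery rate is controlled with the Benjamini-Hochberg step-up rule (the p * 500/i column suggests this, but the exam does not name the procedure), and it uses the rounded p-values as printed, so the adjusted values may differ slightly from the table.

```python
m = 500       # number of markers tested
alpha = 0.05  # target error rate for both procedures

# (marker number, p-value) for the 15 smallest p-values, in increasing order.
smallest = [
    (427, 0.0000036), (157, 0.0000059), (10, 0.0000065),
    (495, 0.00013), (86, 0.00020), (259, 0.00062),
    (121, 0.00064), (208, 0.0024), (332, 0.0069),
    (466, 0.0095), (463, 0.050), (99, 0.058),
    (110, 0.073), (122, 0.083), (254, 0.086),
]

# Bonferroni: reject when p * m <= alpha.
bonferroni = [marker for marker, p in smallest if p * m <= alpha]

# Benjamini-Hochberg step-up (assumed FDR procedure): find the largest rank i
# with p_(i) * m / i <= alpha, then reject every hypothesis of rank <= i,
# including any lower-ranked ones whose own adjusted value exceeds alpha.
largest_rank = 0
for rank, (marker, p) in enumerate(smallest, start=1):
    if p * m / rank <= alpha:
        largest_rank = rank
bh = [marker for marker, _ in smallest[:largest_rank]]

print("Bonferroni:", bonferroni)
print("Benjamini-Hochberg:", bh)
```

Restricting the search to the 15 listed markers loses nothing here: a marker of rank i above 15 would need p <= 0.05 * i / 500, which no p-value of 0.086 or more can satisfy when i is at most 500.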
3. 70 pts. The U.S. Government has very strict rules for food safety. These require sterilization
of most processed foods to reduce or eliminate unwanted bacteria. One common sterilization
method is to heat food to a specified temperature and hold it at that temperature for a specified
length of time. This kills any bacteria that are present in the food. Unfortunately, heating food
also affects its quality. The following problem is based on a study of canned peas. This study
compared five different combinations of time and temperature. Three (A, B, and D in the table
below) meet the requirements for sterilization and two (C and E), labeled ’super-sterile’, exceed
the requirements. This study examined the quality of the peas after sterilization and storage for
6 months.
Because of the time required to change temperatures and inherent variability in raw material
and plant conditions, the experiment was conducted over 6 days. Each day, five batches of peas
were randomly assigned to the five temperature and time combinations (one batch per
treatment per day). Treatments are labeled by the temperature / time, so 55/30 was heated to
55 degrees C for 30 minutes. Batches were then stored in a single warehouse for six months. The
response is an average quality score measured for each batch of canned peas. This score ranges
from 0 (very crisp = high quality) to 10 (very mushy = low quality). The data and treatment
means are shown below:
                                                   Day
Treatment   Type            Temp/Time   1     2     3     4     5     6     Mean
A           Sterile         50/60       3.2   3.8   1.2   0.6   0.6   1.5   1.82
B           Sterile         55/30       1.1   2.4   2.8   1.6   1.8   1.3   1.83
C           Super-sterile   55/45       4.9   4.5   2.2   3.0   2.0   1.1   2.95
D           Sterile         60/15       2.3   1.6   2.2   1.9   1.3   3.6   2.15
E           Super-sterile   60/20       4.6   6.5   5.6   3.8   4.1   4.3   4.82
Additional SAS output that may be useful is included at the end of the exam. The treatments
are listed in “SAS order”.
(a) 5 pts. What is the experimental unit in this study? Briefly explain.
What is the observational unit? Briefly explain.
(b) 10 pts. Complete the entries in the ANOVA table:
Source       df    SS      MS      F
Days         ??    11.68   ??
Treatments   ??    ??      ??      ??
Error        ??            0.968
Total        ??    69.29
(c) 5 pts. Write an appropriate statistical model (i.e. the equation relating observations to
parameters) that corresponds to the above ANOVA table.
(d) 5 pts. Compute the efficiency of blocking.
(e) 10 pts. The F statistic for Treatments in part 3b tests a specific null hypothesis (H0):
i. identify that null hypothesis
ii. approximate the p-value for that test
iii. write an appropriate one sentence conclusion.
(f) 5 pts. Which of the following is the best approach to answer the question ’which pairs of
treatments are significantly different?’. Briefly explain your choice.
T-tests (LSD), Protected LSD, Tukey-Kramer HSD, Bonferroni, Scheffe, Something else.
(g) 10 pts. The investigators are specifically interested in two questions:
How large is the mean difference between the two super-sterile treatments (C and E)?
How large is the difference between the mean quality of the three sterile treatments (A, B,
and D) and the mean quality of the two super-sterile treatments (C and E)?
i. Write out the coefficients to estimate each of these contrasts.
ii. Are these two contrasts orthogonal? Explain why or why not.
(h) 5 pts. Construct a 95% confidence interval for the difference between the two ’super-sterile’
treatments (C and E).
(i) 5 pts. The investigators are also interested in whether there are any differences in quality
among the three ’sterile’ treatments (A, B, and D). That is, they want to test H0: µA =
µB = µD. If this is possible without computing Sums-of-Squares from the raw data,
please test H0: no differences between treatments A, B, and D; report the F statistic and
approximate the p-value. If this test is not possible without additional information, say so
and tell me what additional information you need.
I am not expecting you to compute SS from the raw data.
(j) 5 pts. The investigators will repeat this study. It turns out to be very difficult to change
the temperature many times during the day. It would be easier to use only one treatment
(time/temperature combination) each day. That design requires 30 days to collect 6 replicates of these 5 treatments. The investigators ask for your opinion. Should they continue
using all five treatments in a day, or should they use just one treatment per day? Provide a
short justification for your recommendation. Note: this question has one more part, (k).
(k) 5 pts. The investigators are intrigued by the apparent slight decrease in quality in treatment
D (60/15), compared to treatments A and B. They want to conduct another study using
just these three treatments (A, B and D). Again, they will use a block design and want you
to suggest a sample size (number of days). They are concerned primarily about the contrast
between treatment D and the average of treatments A and B. You agree to use a standard
deviation of 1. They want an α = 5% test to have 80% power to detect the relatively small
difference of 0.35. How many days should they use?
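A sample-size calculation of the kind asked for in part (k) can be sketched with the usual normal-approximation formula for a contrast. This is a minimal Python sketch under stated assumptions: a two-sided test, one batch per treatment per day, the agreed standard deviation of 1 taken as the error standard deviation of a single batch, and normal rather than t quantiles (t quantiles would give a slightly larger answer). The variable names are illustrative.

```python
import math
from statistics import NormalDist

sigma = 1.0    # agreed standard deviation
delta = 0.35   # difference the test should detect
alpha = 0.05   # two-sided significance level (assumption: two-sided)
power = 0.80

# Contrast: treatment D minus the average of A and B, in the order (A, B, D).
coefficients = [-0.5, -0.5, 1.0]
sum_sq_coef = sum(c * c for c in coefficients)  # equals 1.5

z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
z_power = NormalDist().inv_cdf(power)

# With n days, Var(contrast estimate) = sigma^2 * sum(c^2) / n.
# Solve delta / sqrt(Var) = z_alpha + z_power for n, then round up.
n_exact = sum_sq_coef * sigma**2 * (z_alpha + z_power) ** 2 / delta**2
n_days = math.ceil(n_exact)

print(f"n (exact) = {n_exact:.1f}, days needed = {n_days}")
```

The factor sum(c^2) = 1.5 is what distinguishes this contrast from a simple pairwise difference, for which the factor would be 2.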
SAS code and part of the SAS output for problem 3 (quality of canned peas)
data food;
  infile 'foodqual.txt';
  input day trt quality;
run;

proc glm;
  class day trt;
  model quality = day trt;
  lsmeans trt / stderr;
  /* Coefficients follow the SAS order of trt: 50/60 55/30 55/45 60/15 60/20 (A B C D E). */
  /* ESTIMATE reports the estimate, s.e. and t test; CONTRAST reports the SS and F test.  */
  estimate 'ave of A,B,D - ave of C,E' trt 2 2 -3 2 -3 / divisor = 6;
  contrast 'ave of A,B,D - ave of C,E' trt 2 2 -3 2 -3;
  estimate 'ave of A,B,C - ave of D,E' trt 2 2 2 -3 -3 / divisor = 6;
  contrast 'ave of A,B,C - ave of D,E' trt 2 2 2 -3 -3;
  estimate 'difference C - E' trt 0 0 1 0 -1;
  contrast 'difference C - E' trt 0 0 1 0 -1;
  estimate 'difference D - E' trt 0 0 0 1 -1;
  contrast 'difference D - E' trt 0 0 0 1 -1;
run;
------------------------------------------------------------------------
Class Level Information
Class   Levels   Values
day     6        1 2 3 4 5 6
trt     5        50/60 55/30 55/45 60/15 60/20

Number of Observations Read   30
Number of Observations Used   30
Note: Some output deleted.
Continuation of the SAS output for problem 3 (quality of canned peas)
Least Squares Means
trt     quality LSMEAN   Standard Error   Pr > |t|
50/60   1.81666667       0.40163555       0.0002
55/30   1.83333333       0.40163555       0.0002
55/45   2.95000000       0.40163555       <.0001
60/15   2.15000000       0.40163555       <.0001
60/20   4.81666667       0.40163555       <.0001
Dependent Variable: quality
Contrast                    DF   Contrast SS   Mean Sq.   F Value   Pr > F
ave of A,B,D - ave of C,E    1   27.378        27.378     28.29     <.0001
ave of A,B,C - ave of D,E    1   11.858        11.858     12.25     0.0023
difference C - E             1   10.453        10.453     10.80     0.0037
difference D - E             1   21.333        21.333     22.04     0.0001

Parameter                   Estimate      Standard Error   t Value   Pr > |t|
ave of A,B,D - ave of C,E   -1.95000000   0.36664141       -5.32     <.0001
ave of A,B,C - ave of D,E   -1.28333333   0.36664141       -3.50     0.0023
difference C - E            -1.86666667   0.56799844       -3.29     0.0037
difference D - E            -2.66666667   0.56799844       -4.69     0.0001
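As a cross-check on how the ESTIMATE rows relate to the least squares means, the arithmetic can be reproduced outside SAS. This is a minimal Python sketch (Python, not SAS, is an editorial choice here): it assumes a balanced design with 6 days, so that the error mean square can be recovered from the LSMEAN standard error as 6 * s.e.^2, and the helper name contrast_estimate is hypothetical.

```python
import math

# Least squares means in SAS order: 50/60 55/30 55/45 60/15 60/20 (A B C D E).
lsmeans = [1.81666667, 1.83333333, 2.95, 2.15, 4.81666667]
n_days = 6

# Assumption: balanced blocks, so s.e.(LSMEAN) = sqrt(MSE / n_days).
mse = n_days * 0.40163555**2


def contrast_estimate(coefs, divisor=1):
    """Return (estimate, standard error, t value) for a treatment contrast."""
    c = [x / divisor for x in coefs]
    estimate = sum(ci * mean for ci, mean in zip(c, lsmeans))
    std_err = math.sqrt(mse * sum(ci**2 for ci in c) / n_days)
    return estimate, std_err, estimate / std_err


abd_vs_ce = contrast_estimate([2, 2, -3, 2, -3], divisor=6)
c_vs_e = contrast_estimate([0, 0, 1, 0, -1])

print("ave of A,B,D - ave of C,E: est=%.4f se=%.4f t=%.2f" % abd_vs_ce)
print("difference C - E:          est=%.4f se=%.4f t=%.2f" % c_vs_e)
```

Both rows agree with the SAS output above to the printed precision, which also confirms that the coefficients are applied in SAS level order rather than in order of the treatment means.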