Fall 2013 STATISTICS 479 Assignment #8 (40 points)

Instructions: Turn in the programs, any plots, and the handwritten answers for each problem.
1. A marketing consultant conducted an experiment to compare four different package designs for a new
breakfast cereal. Twenty-four stores with approximately similar sales volumes were selected, and each
store was required to carry only one of the package designs. Thus each package design was randomly
assigned to six stores. Sales, in number of cases, were observed for the study period. The data are given
below:
     Package Design
   1    2    3    4
  12   14   19   24
  18   12   17   30
  14   13   21   27
  15   10   23   28
  17   15   16   32
  15   12   20   30

Package Design   Description
      1          3 colors, with cartoons
      2          3 colors, without cartoons
      3          5 colors, with cartoons
      4          5 colors, without cartoons
Prepare and run a SAS program to obtain the output necessary to provide all of the following information. You must extract numbers from the SAS output and write your own answers on a separate sheet
of paper.
(a) Assuming the fixed effects one-way classification model for these data, give estimates of the
true mean sales volumes µ1, µ2, µ3, µ4 and the error variance σ². Write down the corresponding
analysis of variance table, including the p-value. State the hypothesis tested by the F-statistic and
your decision based on the p-value. Use α = .05.
(b) Use contrast statements to compute F-statistics for the following comparisons:
i. the average effect of the 3-color designs versus the average effect of the 5-color
designs,
ii. the average effect of the designs with cartoons versus the average effect of the
designs without cartoons,
iii. the effect of the 3-color design with cartoons versus the effect of the 3-color design
without cartoons, and
iv. the effect of the 5-color design with cartoons versus the effect of the 5-color design
without cartoons.
Add lines containing the corresponding sum of squares, degrees of freedom, and F-statistic to
an expanded anova table. Based on the p-values, what are your conclusions from each of these
tests? Use α = .05.
(c) Include an estimate statement for each of the four comparisons above. Use the results from these
statements to obtain a t-test of each comparison. State your decision based on the p-value. Use α = .05.
(d) Compute 95% confidence intervals for all pairwise differences in true mean sales volumes, i.e., the
(µp − µq)'s. Extract the 6 confidence intervals of interest from the output and report them separately.
(e) Use the confidence intervals in part (d) to find the differences that are significant by
checking those that do not include zero in the interval. What is your conclusion about the true
mean sales volumes corresponding to the four package designs?
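As a cross-check on the numbers extracted from the SAS output in parts (a) and (b), the sums of squares can be recomputed by hand. The following is a minimal sketch in Python (standard library only), not part of the required SAS program; the function names and contrast labels are illustrative assumptions, and p-values are left to the SAS output.

```python
# Sales (cases) per package design, transcribed from the data table above.
sales = {
    1: [12, 18, 14, 15, 17, 15],  # 3 colors, with cartoons
    2: [14, 12, 13, 10, 15, 12],  # 3 colors, without cartoons
    3: [19, 17, 21, 23, 16, 20],  # 5 colors, with cartoons
    4: [24, 30, 27, 28, 32, 30],  # 5 colors, without cartoons
}


def mean(values):
    return sum(values) / len(values)


def one_way_anova(groups):
    """Return (SS_treatment, SS_error, df_treatment, df_error, F)."""
    all_values = [y for g in groups.values() for y in g]
    grand = mean(all_values)
    ss_trt = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_err = sum((y - mean(g)) ** 2 for g in groups.values() for y in g)
    df_trt = len(groups) - 1
    df_err = len(all_values) - len(groups)
    f_stat = (ss_trt / df_trt) / (ss_err / df_err)
    return ss_trt, ss_err, df_trt, df_err, f_stat


def contrast_ss(groups, coeffs):
    """One-degree-of-freedom sum of squares for the contrast sum(c_i * mu_i)."""
    estimate = sum(c * mean(groups[k]) for k, c in coeffs.items())
    scale = sum(c ** 2 / len(groups[k]) for k, c in coeffs.items())
    return estimate ** 2 / scale


ss_trt, ss_err, df_trt, df_err, f_stat = one_way_anova(sales)
mse = ss_err / df_err  # estimate of the error variance

# Illustrative labels; the coefficients follow comparisons i-iv of part (b).
contrasts = {
    "3-color vs 5-color": {1: 1, 2: 1, 3: -1, 4: -1},
    "cartoons vs no cartoons": {1: 1, 2: -1, 3: 1, 4: -1},
    "3-color: cartoons vs none": {1: 1, 2: -1, 3: 0, 4: 0},
    "5-color: cartoons vs none": {1: 0, 2: 0, 3: 1, 4: -1},
}

print(f"SS_trt = {ss_trt:.4f}, SS_err = {ss_err:.4f}, F = {f_stat:.3f}")
for label, coeffs in contrasts.items():
    ss = contrast_ss(sales, coeffs)
    print(f"{label}: SS = {ss:.4f}, F = {ss / mse:.3f}")
```

Each contrast F-statistic here is the contrast sum of squares divided by the error mean square, which is the quantity a contrast statement is expected to report; the square root of that F should match the absolute t-value from the corresponding estimate statement.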
2. Six samples of each of four types of cereal grain grown in a certain region were randomly selected and
analyzed to determine the thiamin content (mcg/gm) in an experiment. The data are:
Cereal    Thiamin content (mcg/gm)
Wheat     5.2  4.5  6.0  6.1  6.7  5.8
Barley    6.5  7.0  6.1  7.5  5.9  5.7
Maize     5.8  4.7  6.4  4.9  6.0  5.2
Oats      8.3  6.7  7.8  7.0  5.9  7.2
Prepare and run a SAS program to obtain the output necessary to provide all of the following information. You must extract numbers from the SAS output and write your own answers on a separate sheet
of paper.
(a) Assuming the fixed effects one-way classification model for these data, give estimates of the true mean
thiamin content µ1, µ2, µ3, µ4 and the error variance σ². Write down the corresponding analysis
of variance table, including the p-value. State the hypothesis tested by the F-statistic and your
decision based on the p-value.
(b) Include a statement to compare true thiamin content means using the LSD procedure with
α = .05. Use the letters on the output to conduct the underlining method on your answer sheet,
showing the ordered sample means labelled with the corresponding treatment. Make a concluding
statement.
(c) Include a statement to compare true thiamin content means using the TUKEY procedure with
α = .05. Use the letters on the output to conduct the underlining method on your answer sheet,
showing the ordered sample means labelled with the corresponding treatment. Make a concluding
statement.
(d) In what way are the conclusions from parts (b) and (c) different?
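The contrast between the two procedures in parts (b) and (c) comes down to the size of the yardstick each one applies to a difference of two sample means. The sketch below, in Python rather than SAS, computes both yardsticks for the thiamin data. The critical values are assumptions taken from standard tables (t with 20 error df, and the studentized range for 4 means and 20 df), not values computed by the code, and the helper names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Thiamin content (mcg/gm), six samples per cereal, from the data table above.
thiamin = {
    "Wheat": [5.2, 4.5, 6.0, 6.1, 6.7, 5.8],
    "Barley": [6.5, 7.0, 6.1, 7.5, 5.9, 5.7],
    "Maize": [5.8, 4.7, 6.4, 4.9, 6.0, 5.2],
    "Oats": [8.3, 6.7, 7.8, 7.0, 5.9, 7.2],
}

# Assumed critical values read from standard tables (alpha = .05):
T_CRIT = 2.086  # upper .025 point of t with 20 df
Q_CRIT = 3.96   # studentized range, 4 means, 20 df


def mean(values):
    return sum(values) / len(values)


n = 6  # samples per cereal (balanced design)
means = {name: mean(vals) for name, vals in thiamin.items()}
df_err = sum(len(vals) - 1 for vals in thiamin.values())
ss_err = sum((y - means[name]) ** 2 for name, vals in thiamin.items() for y in vals)
mse = ss_err / df_err

lsd = T_CRIT * sqrt(2 * mse / n)  # least significant difference
hsd = Q_CRIT * sqrt(mse / n)      # Tukey's honestly significant difference


def significant_pairs(threshold):
    """Pairs of cereals whose sample means differ by more than the threshold."""
    return {
        frozenset((a, b))
        for a, b in combinations(means, 2)
        if abs(means[a] - means[b]) > threshold
    }


print(f"MSE = {mse:.4f}, LSD = {lsd:.3f}, HSD = {hsd:.3f}")
for name in sorted(means, key=means.get):
    print(f"{name}: {means[name]:.3f}")
print("LSD:", sorted(sorted(p) for p in significant_pairs(lsd)))
print("Tukey:", sorted(sorted(p) for p in significant_pairs(hsd)))
```

Because the Tukey yardstick controls the experimentwise error rate, it is wider than the LSD, so any pair declared different by Tukey is also declared different by the LSD, but not the other way round.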
3. The data displayed below are results from an experiment on the use of drugs in the treatment of leprosy.
The drugs were A and D, which were antibiotics, and F, an inert drug used as a control. The dependent
variable Y (PostScore) was a score of leprosy bacilli measured on each patient after several months of
treatment. The covariate X (PreScore) was a pretreatment score of leprosy bacilli.
             Drugs
     A          D          F
   X    Y     X    Y     X    Y
  11    6     6    0    16   13
   8    0     6    2    13   10
   5    2     7    3    11   18
  14    8     8    1     9    5
  19   11    18   18    21   23
   6    4     8    4    16   12
  10   13    19   14    12    5
   6    1     8    9    12   16
  11    8     5    1     7    1
   3    0    15    9    12   20
(a) Use proc glm and the one-way covariance (equal slopes) model to analyze these data. First write down
the model, identifying each term as in the textbook. Construct an adjusted anova table as on
page 311 of the text.
(b) Using the above anova table, test the hypothesis H0 : µ1 = µ2 = µ3 (use the p-value and state
decision).
(c) Construct 95% confidence intervals for all differences in pairs of means (e.g., µ1 − µ2 ) adjusted for
multiple testing using the Bonferroni method.
(d) What does the test of H0 : β = 0 tell you? Test this hypothesis using the above adjusted anova
table and state your conclusion.
(e) Construct an analysis of variance that is not adjusted for the pre-score. What conclusion can you
draw from this anova table?
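The equal-slopes covariance analysis rests on a handful of within-drug sums of squares and cross-products, so the key entries of the proc glm output can be checked by hand. The following Python sketch is one way to do that; it assumes the usual textbook formulas (pooled within-group slope, error sum of squares after fitting the covariate, and means adjusted to the overall mean pre-score), and the variable names are illustrative.

```python
# (PreScore X, PostScore Y) pairs per drug, transcribed from the data table.
data = {
    "A": [(11, 6), (8, 0), (5, 2), (14, 8), (19, 11),
          (6, 4), (10, 13), (6, 1), (11, 8), (3, 0)],
    "D": [(6, 0), (6, 2), (7, 3), (8, 1), (18, 18),
          (8, 4), (19, 14), (8, 9), (5, 1), (15, 9)],
    "F": [(16, 13), (13, 10), (11, 18), (9, 5), (21, 23),
          (16, 12), (12, 5), (12, 16), (7, 1), (12, 20)],
}


def centered_sums(pairs):
    """Return (Sxx, Sxy, Syy) about the group means."""
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n
    sxx = sum((x - xbar) ** 2 for x, _ in pairs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
    syy = sum((y - ybar) ** 2 for _, y in pairs)
    return sxx, sxy, syy


within = [centered_sums(pairs) for pairs in data.values()]
exx = sum(w[0] for w in within)
exy = sum(w[1] for w in within)
eyy = sum(w[2] for w in within)  # error SS of the unadjusted one-way anova

beta = exy / exx                     # pooled within-drug (common) slope
sse_adjusted = eyy - exy ** 2 / exx  # error SS after fitting the covariate
n_total = sum(len(pairs) for pairs in data.values())
df_error = n_total - len(data) - 1   # one extra df is spent on the slope

grand_xbar = sum(x for pairs in data.values() for x, _ in pairs) / n_total
adjusted_means = {}
for drug, pairs in data.items():
    xbar = sum(x for x, _ in pairs) / len(pairs)
    ybar = sum(y for _, y in pairs) / len(pairs)
    # Shift each drug's mean PostScore to the overall mean PreScore.
    adjusted_means[drug] = ybar - beta * (xbar - grand_xbar)

print(f"slope = {beta:.4f}, adjusted SSE = {sse_adjusted:.4f} on {df_error} df")
for drug, value in adjusted_means.items():
    print(f"adjusted mean, drug {drug}: {value:.4f}")
```

Comparing `eyy` with `sse_adjusted` shows how much of the unadjusted error variation the pre-score accounts for, which is the point of contrasting parts (a) and (e).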
4. A textile mill weaves a fabric on a large number of looms. To investigate whether there is an appreciable
variation among the output of cloth per minute by the looms, the process engineer selects 5 looms at
random and measures their output on 5 randomly chosen days. The following data are obtained:
Loom   Output (lb/min)
 1     14.0  14.1  14.2  14.0
 2     13.9  13.8  13.9  14.0
 3     14.1  14.2  14.1  14.0
 4     13.6  13.8  13.9  13.7
 5     13.8  13.6  13.9  13.8
Fifth observation (only three values recovered; loom assignment unclear): 14.0  13.9  14.0
(a) Write the one-way random model you will use to analyze these data, stating the assumptions about each
parameter in the model and what each parameter represents. Construct the corresponding
analysis of variance using the SAS MIXED procedure. Write the anova table, including a column
for Expected Mean Square (EMS).
(b) Express the hypothesis that there is no variability in output among the looms, in terms of the
model parameters. Perform a test of this hypothesis using the analysis in part (a).
(c) If the hypothesis in part (b) is rejected, estimates of the variance components associated with the
model in part (a) may be desired. Obtain these estimates using the results of parts (a) and (b).
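For part (c), the usual anova (method-of-moments) estimates follow from equating the mean squares to their expected values: in a balanced design with n observations per group, the error variance is estimated by the error mean square, and the between-group variance component by (MS between − MS error)/n. The Python sketch below illustrates the arithmetic on hypothetical toy data, not the loom measurements; the function name is an assumption made for illustration.

```python
def variance_components(groups):
    """Anova (method-of-moments) estimates for a balanced one-way random model.

    Returns (ms_between, ms_error, sigma2_between, sigma2_error).
    """
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    means = [sum(g) / n for g in groups]
    grand = sum(means) / a
    ms_between = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_error = ss_error / (a * (n - 1))
    sigma2_error = ms_error
    # E(MS between) = sigma2_error + n * sigma2_between, solved for sigma2_between.
    # The estimate can come out negative; it is then usually reported as zero.
    sigma2_between = (ms_between - ms_error) / n
    return ms_between, ms_error, sigma2_between, sigma2_error


# Hypothetical toy data (3 groups of 2 observations), NOT the loom data.
toy = [[10.0, 12.0], [14.0, 16.0], [9.0, 11.0]]
ms_between, ms_error, s2_between, s2_error = variance_components(toy)
print(f"MS between = {ms_between}, MS error = {ms_error}")
print(f"between-group component = {s2_between}, error variance = {s2_error}")
```

The ratio MS between / MS error is the F-statistic for the hypothesis in part (b) that the between-group variance component is zero.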
Notes:
• Along with the data, I have included pieces of SAS code that will create the SAS data sets necessary
to perform the analyses required for each problem. These are downloadable from the Homework
Assignments page as usual.
• When the levels of the factors are of character type, it is recommended that you use the order=data
option in the proc statements to preserve the ordering of the class levels found in the input
data.
Due Thursday, December 12, 2013