Stat 301 HW 6 Due: 23 / 26 October 2015

1. Text problem 4.40 (p. 207) describes a study of how the time to failure of a part of computer
chips depends on the temperature of one step in the manufacturing process. The data are
in wafer.txt. TEMP is the solder temperature; this was randomly assigned to the chip.
FAILTIME is the time to failure; this is commonly log transformed as part of the data analysis.
Ignore the book’s questions. Also, don’t worry about violations of assumptions for any parts
of this question.
(a) Estimate and report the slope, β1 , in the model Log Failtime = β0 + β1 TEMP.
(b) Estimate the effect on Failtime of increasing the solder temperature by 10 degrees.
Note: This question asks for the effect on failtime, not log transformed failtime.
Now suppose you consider log transforming TEMP as well as the failure time.
(c) Estimate and report the slope, β1 , for the model Log Failtime = β0 + β1 log TEMP.
(d) Estimate the effect on Failtime of increasing the solder temperature by 10%, i.e. multiplying solder temperature by 1.1.
Notes: In lecture, we’ve looked at doubling X. Calculating the estimated consequence
of doubling X involved the number 0.693, which is (approximately) the natural log of 2.
Doubling X is not relevant for these data, but the same ideas can be applied to calculating the effect of multiplying by 1.1.
Again, this question asks you to describe the effect on failtime, not log transformed
failtime.
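The back-transformation idea in the notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "Log" is taken to be the natural log, the function names are made up for this sketch, and the slope values are hypothetical placeholders, not estimates from wafer.txt.

```python
import math

def effect_of_adding(beta1, delta_x):
    """Model log(Y) = b0 + b1 * X.
    Adding delta_x to X multiplies Y by exp(b1 * delta_x)."""
    return math.exp(beta1 * delta_x)

def effect_of_multiplying(beta1, factor):
    """Model log(Y) = b0 + b1 * log(X).
    Multiplying X by `factor` multiplies Y by factor ** b1,
    which is the same as exp(b1 * log(factor))."""
    return factor ** beta1

# Hypothetical slopes, chosen only to show the arithmetic.
hypothetical_slope_linear_x = -0.01
hypothetical_slope_log_x = -2.0

# Multiplier on Y when X increases by 10 units (X untransformed).
print(effect_of_adding(hypothetical_slope_linear_x, 10))

# Multiplier on Y when X is multiplied by 1.1 (X log transformed).
print(effect_of_multiplying(hypothetical_slope_log_x, 1.1))

# Doubling X: the two expressions agree, and log(2) is about 0.693.
print(effect_of_multiplying(hypothetical_slope_log_x, 2),
      math.exp(hypothetical_slope_log_x * math.log(2)))
```

Both functions return a multiplier on the original scale of Y, so a value below 1 indicates a decrease and a value above 1 an increase.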
2. Text problem 4.14 (p. 189), with new questions
Ignore the book’s questions. Instead:
(a) Fit a multiple linear regression to predict VOLTAGE from the 7 variables. Assume that an
additive effects model (what the book calls a first order model) is appropriate, i.e., the
regression model is:
E Voltage = β0 + β1 Volume + β2 Salinity + β3 Temperature + β4 Delay
+β5 Surfactant + β6 SpanTriton + β7 SolidPart
Which variables have regression coefficients that are significantly different from 0, using
p < 0.05 as the criterion for significance? Report the variables and their associated
p-values.
Note: make sure to get the WATEROIL.txt data set from the class web site. The version
of the data set in the text and on the book CD has a serious typo. I have corrected that
on the web site.
(b) Test the null hypothesis that all slope coefficients in the multiple regression of part 2a
are equal to 0. Report the F statistic and p-value for that test, then write a one-sentence
conclusion.
(c) Calculate and report the correlation between each pair of X variables in the WATEROIL.txt
data set. If the correlation is the same value for all pairs, you don’t have to list all pairs;
just tell us ‘XX for all pairs of variables’.
(d) The combinations of the seven X variables were deliberately chosen by the experimenters
so the correlations between X variables were what you (should have) found in part 2c.
How does this simplify the interpretation of the multiple regression coefficients?
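For part 2c, the pairwise correlations can be computed with any statistics package. The following is a minimal Python sketch of the calculation, assuming the X variables are held as named lists of equal length. The variable names and values in `toy` are invented for illustration and are not the WATEROIL.txt data.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pairwise_correlations(columns):
    """Correlation for every unordered pair of columns in a dict of lists."""
    return {(a, b): pearson(columns[a], columns[b])
            for a, b in combinations(columns, 2)}

# Toy data (hypothetical), only to show how the functions are called.
toy = {
    "X1": [1, 2, 3, 4, 5],
    "X2": [2, 1, 4, 3, 5],
    "X3": [5, 3, 4, 1, 2],
}

for pair, r in pairwise_correlations(toy).items():
    print(pair, round(r, 3))
```

With seven X variables there are 21 pairs, so a loop over all pairs like this one is less error-prone than computing each correlation by hand.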
3. Problem 4.33 (page 200). Parts a-c as written. Clarification for part b: the “first order”
model from 4.14 includes all seven X variables. No formal test required. Additional part d.
(d) The interaction model used in this problem postulates that the slope for VOLUME
depends on the level of SALINITY and the level of SURFACTANT. Estimate the slope
for VOLUME for the following 3 conditions:
SALINITY    SURFACTANT
0           0
3           0
3           3
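The calculation in part d can be sketched as follows. The handout defines the interaction model only by reference to the textbook, so this sketch assumes the model contains a VOLUME main effect plus VOLUME x SALINITY and VOLUME x SURFACTANT product terms. The function name, coefficient values, and conditions below are hypothetical and are not estimates from the data.

```python
def volume_slope(b_volume, b_vol_sal, b_vol_surf, salinity, surfactant):
    """Slope for VOLUME at given SALINITY and SURFACTANT levels, assuming
    the mean of VOLTAGE includes the terms
        b_volume * VOLUME
        + b_vol_sal * VOLUME * SALINITY
        + b_vol_surf * VOLUME * SURFACTANT.
    Collecting the VOLUME terms gives the slope returned here."""
    return b_volume + b_vol_sal * salinity + b_vol_surf * surfactant

# Hypothetical coefficients, chosen only to show the arithmetic.
b_volume, b_vol_sal, b_vol_surf = 0.5, -0.1, 0.2

# Hypothetical conditions as (SALINITY, SURFACTANT) pairs.
for salinity, surfactant in [(0, 0), (1, 2)]:
    slope = volume_slope(b_volume, b_vol_sal, b_vol_surf, salinity, surfactant)
    print(salinity, surfactant, slope)
```

When both SALINITY and SURFACTANT are 0, the slope reduces to the VOLUME main-effect coefficient; at other levels the interaction coefficients shift it.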