Stat 231 Exam 3 Fall 2010

I have neither given nor received unauthorized assistance on this exam.
________________________________________________________
Name Signed
Date
_________________________________________________________
Name Printed
This exam concerns the analysis of some data of Prof. I-Cheng Yeh on compressive strength of
concrete specimens of different formulas at various curing times. Only part of Prof. Yeh's data is
being considered, consisting of $n = 749$ cases from a larger data set that can be found at
http://archive.ics.uci.edu/ml/datasets/Concrete+Compressive+Strength
The predictors
$x_1$ = cement content of mixture (kg/m$^3$)
$x_2$ = blast furnace slag content of mixture (kg/m$^3$)
$x_3$ = fly ash content of mixture (kg/m$^3$)
$x_4$ = water content of mixture (kg/m$^3$)
$x_5$ = superplasticizer content of mixture (kg/m$^3$)
$x_6$ = coarse aggregate content of mixture (kg/m$^3$)
$x_7$ = fine aggregate content of mixture (kg/m$^3$), and
$x_8$ = age of specimen (days)
are available for use in the modeling of
$y$ = concrete specimen compressive strength (MPa).
Use the JMP reports in this exam in answering the questions posed herein.
A summary of an "all possible models" MLR run with the data is:
9 pts
a) On the basis of the values in this table, what models seem most promising? (List the predictors.)
On the basis of $R^2$: ____________________________________________
On the basis of RMSE: _________________________________________
On the basis of $C_p$: ____________________________________________
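For readers checking the $C_p$ column by hand, here is a minimal sketch of the usual Mallows $C_p$ computation. It assumes the textbook definition with $p$ counting fitted coefficients including the intercept. The function name and the error sums of squares are hypothetical placeholders, not values from the JMP table.

```python
def mallows_cp(sse_sub, mse_full, n, p_sub):
    """Mallows C_p for a sub-model with p_sub coefficients (intercept included).

    mse_full is the mean square error of the model containing all candidate predictors.
    """
    return sse_sub / mse_full - n + 2 * p_sub


n = 749                # number of cases stated in the exam
p_full = 9             # intercept plus eight predictors
sse_full = 1000.0      # placeholder value, NOT from the JMP report
mse_full = sse_full / (n - p_full)

# For the full model itself, C_p reduces to its own coefficient count.
cp_full = mallows_cp(sse_full, mse_full, n, p_full)

# A hypothetical 5-coefficient sub-model with a slightly larger SSE (placeholder).
cp_sub = mallows_cp(1010.0, mse_full, n, 5)
print(f"C_p (full) = {cp_full:.2f}, C_p (sub-model) = {cp_sub:.2f}")
```

Under this definition, a sub-model looks promising when its $C_p$ is small and not much larger than its own coefficient count.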
Here are JMP reports for two fits of specimen strength to specimen age only.
9 pts
b) What is the value of a test statistic and a corresponding p-value for testing whether the
quadratic model provides a statistically detectable improvement in the ability to predict y over the
linear one?
test statistic ______________________
p-value _________________________
9 pts
c) The two $R^2$ values for the linear and quadratic fits of strength to age are respectively .20817
and .21279, and are not much different. If one judges the p-value in part b) above to be small,
how does one reconcile that with this very small increase in $R^2$? (What about the data set used here
allows for this possibility?)
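As a rough cross-check on the relationship between the two $R^2$ values and a test of the quadratic term, the sketch below computes a partial $F$ statistic directly from the quoted $R^2$ values and $n = 749$. It assumes both fits use all 749 cases. Because the $R^2$ values are rounded, the result is only approximate and is no substitute for the statistic in the JMP report. The p-value uses a normal approximation to the $t$ distribution with 746 degrees of freedom, and the function name is hypothetical.

```python
from statistics import NormalDist


def partial_f_from_r2(r2_reduced, r2_full, n, p_full, q):
    """Partial F statistic for adding q terms, computed from two R^2 values.

    p_full counts the coefficients of the larger model, intercept included.
    """
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - p_full)
    return numerator / denominator


n = 749
f_stat = partial_f_from_r2(0.20817, 0.21279, n, p_full=3, q=1)

# With one numerator degree of freedom, F = t^2. For 746 error degrees of
# freedom the t distribution is close to standard normal (an approximation).
p_approx = 2.0 * (1.0 - NormalDist().cdf(f_stat ** 0.5))
print(f"F = {f_stat:.2f} on (1, {n - 3}) df, approximate p-value = {p_approx:.3f}")
```

The error degrees of freedom, $n - 3 = 746$, multiply the small change in $R^2$, which is why a modest increase can still give an $F$ of about 4.4 here.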
Below is a JMP report for age = $x_8$. Use it and the first report on the previous page and consider
inference under the simple linear regression model $\text{strength} = \beta_0 + \beta_1 \cdot \text{age} + \epsilon$.
9 pts
d) Give 95% two-sided confidence limits for the increase in mean specimen strength over a 7 day
(one week) period for specimens like those in the study. (Plug in fully, but you need not simplify.)
9 pts
e) Give 95% two-sided prediction limits for the next strength measured at 20 days. (Plug in
completely, but you need not do the final arithmetic.)
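The two simple linear regression intervals asked for in parts d) and e) can be sketched as below. All numeric inputs are hypothetical placeholders standing in for quantities read off the JMP reports (estimated coefficients, standard error of the slope, RMSE, and the mean and standard deviation of age). The standard normal quantile is used as a large-sample stand-in for the $t$ quantile with $n - 2 = 747$ degrees of freedom, and the function names are not from the exam.

```python
from math import sqrt
from statistics import NormalDist


def slope_change_limits(b1, se_b1, delta_x, level=0.95):
    """Confidence limits for the change in mean response over delta_x units of x."""
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # large-sample stand-in for t
    return delta_x * (b1 - crit * se_b1), delta_x * (b1 + crit * se_b1)


def slr_prediction_limits(b0, b1, s, n, x_bar, s_x, x_new, level=0.95):
    """Prediction limits for one new response at x_new.

    s is the RMSE; s_x is the sample standard deviation of x, so that
    sum((x_i - x_bar)^2) = (n - 1) * s_x^2.
    """
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # large-sample stand-in for t
    sxx = (n - 1) * s_x ** 2
    y_hat = b0 + b1 * x_new
    half_width = crit * s * sqrt(1.0 + 1.0 / n + (x_new - x_bar) ** 2 / sxx)
    return y_hat - half_width, y_hat + half_width


# Placeholder inputs only, NOT values from the JMP reports.
d_lo, d_hi = slope_change_limits(b1=0.1, se_b1=0.01, delta_x=7)
p_lo, p_hi = slr_prediction_limits(b0=30.0, b1=0.1, s=15.0, n=749,
                                   x_bar=40.0, s_x=50.0, x_new=20.0)
print(f"7-day change in mean: ({d_lo:.3f}, {d_hi:.3f})")
print(f"prediction at 20 days: ({p_lo:.2f}, {p_hi:.2f})")
```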
Next are JMP reports for two models with respectively 2 and 4 predictors.
9 pts
f) Give the value of a statistic and an associated p-value for testing whether the variables
cement = $x_1$ and age = $x_8$ together provide statistically significant ability to predict or explain
$y$ = strength.
statistic ______________________
p-value ___________________________
10 pts
g) Give the value of an $F$ statistic and degrees of freedom for judging whether, after accounting for
cement = $x_1$ and age = $x_8$, the variables slag = $x_2$ and water = $x_4$ add (statistically) significantly to
one's ability to predict/explain strength = $y$.
$F$ = ______________________ d.f. = ________________________
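The comparison of nested models in part g) follows the usual partial $F$ pattern, sketched below. The error sums of squares are hypothetical placeholders for the values in the two JMP reports. Only $n = 749$ and the coefficient counts (3 for the cement-and-age model, 5 for the four-predictor model, intercepts included) come from the exam setup, and the function name is an assumption.

```python
def partial_f(sse_reduced, sse_full, n, p_reduced, p_full):
    """Partial F statistic and its degrees of freedom for nested regression models.

    p_reduced and p_full count fitted coefficients, intercept included.
    """
    q = p_full - p_reduced          # number of added predictors
    df_error = n - p_full           # error degrees of freedom of the larger model
    f_stat = ((sse_reduced - sse_full) / q) / (sse_full / df_error)
    return f_stat, (q, df_error)


# SSE values are placeholders, NOT taken from the JMP reports.
f_stat, df = partial_f(sse_reduced=1200.0, sse_full=1000.0,
                       n=749, p_reduced=3, p_full=5)
print(f"F = {f_stat:.2f} with d.f. = {df}")
```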
The remaining questions on this exam and the remaining JMP reports concern a model for y that
includes all eight predictor variables $x_1, x_2, \ldots, x_8$.
9 pts
h) What does the normal plot of standardized residuals suggest about the very largest of those
residuals (and therefore about the model fitting)? Does the single largest standardized residual
correspond to a set of predictors on an "edge" of the region where one has data? Explain.
9 pts
i) In this model, what are two-sided 95% confidence limits for the standard deviation of strengths
associated with any fixed concrete formula and specimen age? (For very large degrees of freedom,
$\nu$, the $\chi^2$ distributions are approximately normal with mean $\nu$ and standard deviation $\sqrt{2\nu}$ so that
upper and lower $\chi^2$ 2.5% points are $\nu \pm 1.96\sqrt{2\nu}$.) (Plug in completely, but you need not
simplify.)
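The normal approximation in the hint can be turned into limits for $\sigma$ as sketched below. The error degrees of freedom $\nu = 749 - 9 = 740$ follow from the eight-predictor model with an intercept. The value of `s` is a hypothetical placeholder for the RMSE in the JMP report, and the function name is not from the exam.

```python
from math import sqrt


def sigma_limits_normal_approx(s, nu, z=1.96):
    """Approximate confidence limits for sigma, using chi-square percentage
    points approximated by nu +/- z * sqrt(2 * nu)."""
    half = z * sqrt(2.0 * nu)
    chi2_upper = nu + half
    chi2_lower = nu - half
    # nu * s^2 / sigma^2 is chi-square with nu df, so dividing by the upper
    # percentage point gives the lower limit for sigma, and vice versa.
    return s * sqrt(nu / chi2_upper), s * sqrt(nu / chi2_lower)


nu = 749 - 9   # n minus the number of fitted coefficients
s_lo, s_hi = sigma_limits_normal_approx(s=10.0, nu=nu)  # s = 10.0 is a placeholder
print(f"approximate 95% limits for sigma: ({s_lo:.3f}, {s_hi:.3f})")
```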
9 pts
j) For a fixed concrete formula, give 95% two-sided confidence limits for the increase in mean
specimen strength over a 7 day (one week) period.
9 pts
k) 95% two-sided confidence limits for the mean response when
$x_1 = 260.9$, $x_2 = 100.5$, $x_3 = 78.3$, $x_4 = 200.6$, $x_5 = 8.6$, $x_6 = 864.5$, $x_7 = 761.5$, and $x_8 = 28$
are
33.76 MPa and 35.77 MPa
Give 95% two-sided prediction limits for the next compressive strength for a specimen of this type
and age.
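One way to move from the stated confidence limits for the mean response to prediction limits is sketched below. The half-width of the given interval equals $t \cdot SE(\hat{y})$, so the prediction half-width is $\sqrt{(t \cdot s)^2 + (\text{half-width})^2}$. The value of `s` is a hypothetical placeholder for the RMSE of the eight-predictor fit. The standard normal quantile stands in for the $t$ quantile with 740 degrees of freedom, and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def prediction_from_mean_limits(mean_lo, mean_hi, s, level=0.95):
    """Prediction limits built from confidence limits for the mean response
    at the same predictor values, plus the RMSE s."""
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # large-sample stand-in for t
    center = (mean_lo + mean_hi) / 2.0              # fitted value
    half_mean = (mean_hi - mean_lo) / 2.0           # equals t * SE(fitted value)
    half_pred = sqrt((crit * s) ** 2 + half_mean ** 2)
    return center - half_pred, center + half_pred


# 33.76 and 35.77 are the limits stated in the exam; s = 10.0 is a placeholder.
pred_lo, pred_hi = prediction_from_mean_limits(33.76, 35.77, s=10.0)
print(f"prediction limits: ({pred_lo:.2f}, {pred_hi:.2f})")
```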
Results for 8-Predictor Model