Regression

Activity 5: Regression
The Turbine Oil Oxidation Test (TOST) and the Rotating Bomb Oxidation Test (RBOT)
are different procedures for evaluating the oxidation stability of steam turbine oils. We
will assume that the regression function of Y=RBOT time (hr) on X=TOST time (min)
has the simple linear form, and will use the data in
www.stat.psu.edu/~mga/401/labs/05/lab5/regr.data.txt to estimate the model parameters.
1. Plot the Data:
Graph>Scatterplot>Simple>Input RBOT for Y and TOST for X>OK
The plot suggests that the assumption of a simple linear regression model is reasonable.
2. Fit the Simple Linear Regression Model:
Stat>Regression>Regression>Input RBOT for Response and TOST for
Predictor>OK
The output that Minitab gives, together with explanatory notes from me, follows:
Regression Analysis: RBOT: versus TOST:
The regression equation is
RBOT: = - 13.9 + 0.0902 TOST:
[NOTE: The regression equation is the estimated regression function, under the
assumption of a simple linear regression model. Thus, the estimate of the
intercept is -13.9, while the estimate of the slope is 0.0902.]
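The estimates in the regression equation are least-squares estimates. As a minimal sketch of that computation outside Minitab, the Python below uses the textbook formulas slope = Sxy/Sxx and intercept = ybar - slope*xbar. The function name fit_line and the small demo data set are made up for illustration; they are not the TOST/RBOT measurements.

```python
def fit_line(x, y):
    """Return least-squares (intercept, slope) for the line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sxy: sum of cross-products of deviations; Sxx: sum of squared x deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical demo data lying exactly on y = 2 + 3x (NOT the lab data).
demo_x = [1.0, 2.0, 3.0, 4.0]
demo_y = [5.0, 8.0, 11.0, 14.0]
b0, b1 = fit_line(demo_x, demo_y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

Applied to the RBOT and TOST columns of the lab data, the same function should reproduce the intercept and slope that Minitab reports, up to rounding.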
Predictor     Coef  SE Coef      T      P
Constant    -13.85    44.78  -0.31  0.763
TOST:      0.09024  0.01188   7.59  0.000
[NOTE: This table gives the estimates of the coefficients. The column headed SE
Coef gives the estimated standard errors of the estimators. The column headed T
gives the ratios, Coef/(SE Coef), which are the statistics for testing the null
hypothesis that the coefficient is zero. The last column gives the p-values.
Thus, the intercept coefficient is not significantly different from zero, but
there is strong evidence suggesting that the slope coefficient is not zero.]
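The T column can be checked by hand from the other two columns. The short Python sketch below recomputes Coef/(SE Coef) from the rounded values printed in the table; the variable names are my own.

```python
# Values copied from the Minitab coefficient table: (Coef, SE Coef).
coefficients = {
    "Constant": (-13.85, 44.78),
    "TOST": (0.09024, 0.01188),
}

# Each T ratio is the estimate divided by its estimated standard error.
t_ratios = {name: coef / se for name, (coef, se) in coefficients.items()}

for name, t in t_ratios.items():
    print(f"{name}: T = {t:.2f}")
```

Because the table entries are rounded, the ratio for the slope comes out at about 7.60 rather than the 7.59 that Minitab computes from unrounded values.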
S = 25.1546   R-Sq = 85.2%   R-Sq(adj) = 83.7%
[NOTE: S is the estimated intrinsic scatter, or the estimated standard deviation
of the error term. R-Sq is the proportion of the total variability explained by
the model. A large R-Sq suggests good fit of the model to the data.]
Analysis of Variance
Source          DF     SS     MS      F      P
Regression       1  36489  36489  57.67  0.000
Residual Error  10   6328    633
Total           11  42817
[NOTE: The Analysis of Variance table decomposes the total variability, or
total sum of squares (which is the numerator of the sample variance of the
response variable), into the variability explained by the regression model and
the variability due to the intrinsic scatter (or error variability). The
regression sum of squares (i.e. the variability explained by the model) divided
by the total sum of squares gives the R-Sq we saw above.]
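The arithmetic behind the ANOVA table can be reproduced directly. The sketch below, in Python with variable names of my choosing, recomputes R-Sq as described in the note, and also uses the standard identities MS = SS/DF, F = MS(Regression)/MS(Error), S = sqrt(MS(Error)), and R-Sq(adj) = 1 - MS(Error)/(SS(Total)/DF(Total)), which the note does not spell out.

```python
import math

# Values copied from the Analysis of Variance table.
ss_regression, ss_error, ss_total = 36489.0, 6328.0, 42817.0
df_regression, df_error, df_total = 1, 10, 11

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

r_sq = ss_regression / ss_total                  # proportion explained
r_sq_adj = 1 - ms_error / (ss_total / df_total)  # adjusted for DF
f_stat = ms_regression / ms_error                # F statistic
s = math.sqrt(ms_error)                          # estimated error SD

print(f"R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%")
print(f"F = {f_stat:.2f}, S = {s:.4f}")
```

The results agree with the Minitab output up to rounding of the table entries.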
Unusual Observations
Obs  TOST:   RBOT:     Fit  SE Fit  Residual  St Resid
  3   3750  375.00  324.56    7.27     50.44     2.09R
R denotes an observation with a large standardized residual.
[NOTE: Minitab, and other software packages, list observations for which the
response deviates considerably from the regression line. With our data, the
response of the third observation does so.]
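The entries for the third observation can be recomputed from numbers that appear elsewhere in the output. The Python sketch below does so under two assumptions of mine that the output does not state: the standardized residual is the residual divided by sqrt(S^2 - (SE Fit)^2), and Minitab's R flag corresponds to a standardized residual larger than 2 in absolute value.

```python
import math

intercept, slope = -13.85, 0.09024  # from the coefficient table
s = 25.1546                         # estimated error standard deviation
tost, rbot = 3750.0, 375.00         # observation 3
se_fit = 7.27                       # SE Fit for observation 3

fit = intercept + slope * tost      # fitted value on the regression line
residual = rbot - fit               # observed minus fitted
# Assumed formula: estimated residual SD is sqrt(S^2 - SE_Fit^2).
st_resid = residual / math.sqrt(s**2 - se_fit**2)
flagged = abs(st_resid) > 2         # assumed rule behind the "R" flag

print(f"Fit = {fit:.2f}, Residual = {residual:.2f}, St Resid = {st_resid:.2f}")
print("Flagged as unusual:", flagged)
```

Small differences from the Minitab values in the last digit are due to the rounded coefficients used here.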
FURTHER NOTES:
a. The estimates of the regression coefficients, together with their estimated
standard errors, which are given in the first table, can be used to construct
confidence intervals. Thus, a 95% CI for the slope, β1, is
(0.09024 - (2.228)(0.01188), 0.09024 + (2.228)(0.01188)) = (0.0638, 0.1167),
where 2.228 is the 97.5th percentile of the t-distribution with 10 degrees of
freedom and 0.01188 is the estimated standard error of the slope estimator.
b. The test statistics T, which are given in the first table, allow us to test one-sided
hypotheses regarding the parameters. For example, to test H0: β1=0 vs. Ha: β1>0 at
the 0.05 level of significance, the rejection rule is T>1.812, where 1.812 is the 95th
percentile of the t-distribution with 10 degrees of freedom. Since T=7.59>1.812, the
null hypothesis is rejected.
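Both calculations in notes a and b are short enough to check in a few lines. The Python sketch below takes the t percentiles 2.228 and 1.812 as given in the notes rather than computing them; the variable names are mine.

```python
slope, se_slope = 0.09024, 0.01188  # from the coefficient table
t_975, t_95 = 2.228, 1.812          # t percentiles, 10 df, as quoted above

# Note a: 95% confidence interval for the slope.
margin = t_975 * se_slope
ci = (slope - margin, slope + margin)
print(f"95% CI for slope: ({ci[0]:.4f}, {ci[1]:.4f})")

# Note b: one-sided test of H0: slope = 0 vs. Ha: slope > 0 at level 0.05.
t_stat = 7.59                       # T for the slope, from the table
reject_h0 = t_stat > t_95
print("Reject H0:", reject_h0)
```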
3. CI and PI:
Here we will see the commands needed to obtain a CI for the mean response at a given
value of X, as well as a PI for a future Y observation at a given value of X.
Stat>Regression>Regression>Input RBOT for Response and TOST for
Predictor>Click Options, then input 4000 into Prediction intervals for new
observations, and 95 into Confidence level.
The additional part of the Minitab output is:
Predicted Values for New Observations
New Obs     Fit  SE Fit          95% CI            95% PI
      1  347.12    8.00  (329.30, 364.94)  (288.31, 405.94)

Values of Predictors for New Observations
New Obs  TOST:
      1   4000
[NOTE: The PI is wider than the CI, as expected from a previous discussion.]
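To see why the PI is wider, both intervals can be rebuilt from Fit, SE Fit and S. The Python sketch below uses the standard formulas Fit ± t·(SE Fit) for the CI and Fit ± t·sqrt(S^2 + (SE Fit)^2) for the PI, with t = 2.228 (the 97.5th percentile with 10 degrees of freedom used earlier). These formulas are not printed in the Minitab output, so treat this as a check under that assumption.

```python
import math

fit, se_fit = 347.12, 8.00  # from "Predicted Values for New Observations"
s = 25.1546                 # estimated error standard deviation
t_975 = 2.228               # 97.5th percentile of t with 10 df

# CI for the mean response: only the uncertainty in the fitted line.
ci_margin = t_975 * se_fit
ci = (fit - ci_margin, fit + ci_margin)

# PI for a future observation: adds the intrinsic scatter S^2.
pi_margin = t_975 * math.sqrt(s**2 + se_fit**2)
pi = (fit - pi_margin, fit + pi_margin)

print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI: ({pi[0]:.2f}, {pi[1]:.2f})")
```

The extra S^2 term under the square root is what makes the PI wider than the CI; the endpoints match the Minitab output up to rounding.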